A Robust Fusion-Extraction Method Based on Summary Statistics from Each Data Source When Some Data Sources Are Biased (Wang Qihua)

2023-03-14

Information from multiple data sources is increasingly available. However, some data sources may produce biased estimates due to biased sampling, data corruption, or model misspecification. This calls for data combination methods that remain robust in the presence of biased sources. In this paper, a robust data fusion-extraction method is proposed. In contrast to existing methods, the proposed method can be applied to the important case where researchers have no knowledge of which data sources are unbiased. The proposed estimator is easy to compute and employs only summary statistics, and hence can be applied in many different fields, e.g., meta-analysis, Mendelian randomization, and distributed systems. The proposed estimator is consistent even if many data sources are biased, and it is asymptotically equivalent to the oracle estimator that uses only the unbiased data. Asymptotic normality of the proposed estimator is also established. In contrast to existing meta-analysis methods, the theoretical properties are guaranteed even if the number of data sources and the dimension of the parameter diverge as the sample size increases. Furthermore, the proposed method selects the unbiased data sources consistently, that is, with probability approaching one. Simulation studies demonstrate the efficiency and robustness of the proposed method empirically. The proposed method is applied to a meta-analysis data set to evaluate the surgical treatment for moderate periodontal disease and to a Mendelian randomization data set to study the risk factors of head and neck cancer.
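To make the idea of combining summary statistics under unknown bias concrete, here is a minimal toy sketch for a scalar parameter. It is not the fusion-extraction procedure from the paper, whose details are not given in this summary. The function name `toy_fuse`, the median-based starting value, and the `threshold` cutoff are all assumptions chosen for illustration. The sketch takes a robust initial estimate, keeps the sources that agree with it, and pools the kept sources with inverse-variance weights, as an oracle that knew the unbiased sources would. Unlike the result stated above, this toy is only sensible when fewer than half of the sources are biased.

```python
import statistics


def toy_fuse(estimates, std_errors, threshold=3.0):
    """Toy scalar illustration (NOT the procedure from the paper).

    estimates  : per-source point estimates of the same scalar parameter
    std_errors : per-source standard errors (all positive)
    threshold  : hypothetical cutoff, in standard-error units
    Returns (fused estimate, its standard error, indices of kept sources).
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need equally long, non-empty inputs")

    # Step 1: robust starting value. The median is only trustworthy here
    # under the toy assumption that fewer than half the sources are biased.
    initial = statistics.median(estimates)

    # Step 2: keep the sources whose estimate lies within `threshold`
    # standard errors of the starting value.
    selected = [
        k
        for k, (est, se) in enumerate(zip(estimates, std_errors))
        if abs(est - initial) <= threshold * se
    ]
    if not selected:
        raise ValueError("no source agrees with the initial estimate")

    # Step 3: inverse-variance weighted average over the kept sources,
    # i.e. the usual fixed-effect pooling restricted to that subset.
    weights = [1.0 / std_errors[k] ** 2 for k in selected]
    total = sum(weights)
    fused = sum(w * estimates[k] for w, k in zip(weights, selected)) / total
    fused_se = total ** -0.5
    return fused, fused_se, selected


# Made-up inputs: five sources near 1.0 and two clearly shifted ones.
fused, fused_se, kept = toy_fuse(
    [1.02, 0.97, 1.05, 0.99, 2.50, 1.01, -0.40], [0.05] * 7
)
print(kept, round(fused, 3), round(fused_se, 4))
```

Only the per-source estimates and standard errors enter the computation, which mirrors the summary-statistics setting described above; the numbers in the usage lines are invented for the example.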
Related paper:
Wang Ruoyu, Wang Qihua* and Miao Wang (2023), A robust fusion-extraction procedure with summary statistics in the presence of biased sources, Biometrika, 103, 1, 1-17.
Email: qhwang@amss.ac.cn 
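Biometrika is one of the four top journals in statistics. One reviewer considered that the paper solves a meaningful and important problem and that the theory it develops is deep; another reviewer considered the proposed method innovative and likely to be useful in practice.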
