Nonparametric estimation of genewise variance for microarray data. (English) Zbl 1200.62133

Summary: Estimation of genewise variance arises from two important applications in microarray data analysis: selecting significantly differentially expressed genes and validation tests for normalization of microarray data. We approach the problem by introducing a two-way nonparametric model, which is an extension of the famous J. Neyman and E. L. Scott model [Econometrica, Chicago 16, 1–32 (1948; Zbl 0034.07602)] and is applicable beyond microarray data. The problem itself poses interesting challenges because the number of nuisance parameters is proportional to the sample size and it is not obvious how the variance function can be estimated when the measurements are correlated.
In this high-dimensional nonparametric problem, we propose two novel nonparametric estimators for genewise variance functions and semiparametric estimators for measurement correlations, obtained by solving a system of nonlinear equations. Their asymptotic normality is established. Their finite-sample performance is demonstrated by simulation studies. The estimators also improve the power of the tests for detecting statistically differentially expressed genes. The methodology is illustrated with data from the microarray quality control (MAQC) project.
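The difficulty mentioned in the summary, that the number of nuisance parameters grows with the sample size, can be seen in the classical Neyman-Scott setting. The following is a minimal sketch of that classical problem only, not of the two-way nonparametric model or the estimators proposed in the paper. It assumes two replicates per gene, a common variance, and independent Gaussian noise; the function name `neyman_scott_demo` and all parameter values are hypothetical and chosen purely for illustration.

```python
import random


def neyman_scott_demo(n_genes=20000, sigma=1.5, seed=0):
    """Simulate two replicates per gene, each gene having its own mean.

    Returns the naive maximum-likelihood variance estimate and a
    degrees-of-freedom corrected version. (Illustrative assumption:
    constant variance and independent Gaussian errors.)
    """
    rng = random.Random(seed)
    naive_sum = 0.0
    for _ in range(n_genes):
        mu = rng.uniform(-5.0, 5.0)   # gene-specific nuisance mean
        y1 = rng.gauss(mu, sigma)     # first replicate
        y2 = rng.gauss(mu, sigma)     # second replicate
        ybar = 0.5 * (y1 + y2)        # per-gene estimate of the mean
        # Per-gene ML variance divides by 2 (number of replicates),
        # ignoring the degree of freedom spent on estimating the mean.
        naive_sum += ((y1 - ybar) ** 2 + (y2 - ybar) ** 2) / 2.0
    naive = naive_sum / n_genes       # tends to sigma^2 / 2, not sigma^2
    corrected = 2.0 * naive           # divide by (replicates - 1) instead
    return naive, corrected


naive, corrected = neyman_scott_demo()
print(f"true variance:       {1.5 ** 2:.3f}")
print(f"naive ML estimate:   {naive:.3f}")
print(f"corrected estimate:  {corrected:.3f}")
```

Adding more genes does not help the naive estimate, because every new gene brings a new unknown mean. In this simple setting a degrees-of-freedom correction suffices; the summary indicates that the paper's setting is harder because the variance is a function to be estimated and the measurements are correlated.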

MSC:

62P10 Applications of statistics to biology and medical sciences; meta analysis
62G05 Nonparametric estimation
92C40 Biochemistry, molecular biology
92D10 Genetics and epigenetics
62G20 Asymptotic properties of nonparametric inference
65C60 Computational problems in statistics (MSC2010)

Citations:

Zbl 0034.07602

References:

[1] Carroll, R. J. and Wang, Y. (2008). Nonparametric variance estimation in the analysis of microarray data: A measurement error approach. Biometrika 95 437-449. · Zbl 1437.62408 · doi:10.1093/biomet/asn017
[2] Cui, X., Hwang, J. T. and Qiu, J. (2005). Improved statistical tests for differential gene expression by shrinking variance components estimates. Biostatistics 6 59-75. · Zbl 1069.62090 · doi:10.1093/biostatistics/kxh018
[3] Fan, J. and Gijbels, I. (1996). Local Polynomial Modelling and Its Applications. Chapman and Hall, London. · Zbl 0873.62037
[4] Fan, J. and Niu, Y. (2007). Selection and validation of normalization methods for c-DNA microarrays using within-array replications. Bioinformatics 23 2391-2398. · Zbl 1279.92064
[5] Fan, J., Peng, H. and Huang, T. (2005). Semilinear high-dimensional model for normalization of microarray data: A theoretical analysis and partial consistency (with discussion). J. Amer. Statist. Assoc. 100 781-813. · Zbl 1117.62330 · doi:10.1198/016214504000001781
[6] Fan, J. and Ren, Y. (2007). Statistical analysis of DNA microarray data in cancer research. Clinical Cancer Research 12 4469-4473.
[7] Fan, J., Tam, P., Vande Woude, G. and Ren, Y. (2004). Normalization and analysis of cDNA microarrays using within-array replications applied to neuroblastoma cell response to a cytokine. Proc. Natl. Acad. Sci. USA 101 1135-1140.
[8] Huang, J., Wang, D. and Zhang, C. (2005). A two-way semi-linear model for normalization and significant analysis of cDNA microarray data. J. Amer. Statist. Assoc. 100 814-829. · Zbl 1117.62358 · doi:10.1198/016214504000002032
[9] Kamb, A. and Ramaswami, A. (2001). A simple method for statistical analysis of intensity differences in microarray-derived gene expression data. BMC Biotechnology 1 8.
[10] Neyman, J. and Scott, E. (1948). Consistent estimates based on partially consistent observations. Econometrica 16 1-32. · Zbl 0034.07602 · doi:10.2307/1914288
[11] Patterson, T. et al. (2006). Performance comparison of one-color and two-color platforms within the MicroArray Quality Control (MAQC) project. Nature Biotechnology 24 1140-1150.
[12] Ruppert, D., Wand, M. P., Holst, U. and Hössjer, O. (1997). Local polynomial variance function estimation. Technometrics 39 262-273. · Zbl 0891.62029 · doi:10.2307/1271131
[13] Smyth, G., Michaud, J. and Scott, H. (2005). Use of within-array replicate spots for assessing differential expression in microarray experiments. Bioinformatics 21 2067-2075.
[14] Storey, J. D. and Tibshirani, R. (2003). Statistical significance for genome-wide studies. Proc. Natl. Acad. Sci. USA 100 9440-9445. · Zbl 1130.62385 · doi:10.1073/pnas.1530509100
[15] Tong, T. and Wang, Y. (2007). Optimal shrinkage estimation of variances with applications to microarray data analysis. J. Amer. Statist. Assoc. 102 113-122. · Zbl 1284.62449 · doi:10.1198/016214506000001266
[16] Tusher, V. G., Tibshirani, R. and Chu, G. (2001). Significance analysis of microarrays applied to the ionizing radiation response. Proc. Natl. Acad. Sci. USA 98 5116-5121. · Zbl 1012.92014 · doi:10.1073/pnas.091062498
[17] Wang, Y., Ma, Y. and Carroll, R. J. (2009). Variance estimation in the analysis of microarray data. J. Roy. Statist. Soc. Ser. B 71 425-445. · Zbl 1248.62221