Estimation of false discovery rates in multiple testing: application to gene microarray data.

Biometrics 2003 Dec;59(4):1071-81

Division of Biometry and Risk Assessment, National Center for Toxicological Research, Food and Drug Administration, Jefferson, Arkansas, USA.

Testing for significance with gene expression data from DNA microarray experiments involves simultaneous comparisons of hundreds or thousands of genes. If R denotes the number of rejections (genes declared significant) and V denotes the number of false rejections, then V/R, if R > 0, is the proportion of falsely rejected hypotheses. This paper proposes a model for the distribution of the number of rejections, R, and the conditional distribution of V given R, V | R. Under the independence assumption, R is distributed as a convolution of two binomials and V | R follows a noncentral hypergeometric distribution. Under an equicorrelated model, the distributions are more complex and are also derived. Five false discovery rate probability error measures are considered: FDR = E(V/R), pFDR = E(V/R | R > 0) (positive FDR), cFDR = E(V/R | R = r) (conditional FDR), mFDR = E(V)/E(R) (marginal FDR), and eFDR = E(V)/r (empirical FDR). The pFDR, cFDR, and mFDR are shown to be equivalent under the Bayesian framework, in which the number of true null hypotheses is modeled as a random variable. We present a parametric and a bootstrap procedure to estimate the FDRs. Monte Carlo simulations were conducted to evaluate the performance of these two methods. The bootstrap procedure appears to perform reasonably well, even when the alternative hypotheses are correlated (rho = 0.25). An example from a toxicogenomic microarray experiment is presented for illustration.
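
The five error measures defined in the abstract can be illustrated with a small Monte Carlo sketch under the independence assumption. This is not the paper's parametric or bootstrap estimator. It assumes, for illustration only, that the false rejections V are Binomial(m0, alpha) and the true rejections S are Binomial(m1, power), so that R = V + S is a convolution of two binomials. The names m0, m1, alpha, power, and the function simulate_fdr_measures are hypothetical and do not come from the paper.

```python
import numpy as np


def simulate_fdr_measures(m0=900, m1=100, alpha=0.05, power=0.8,
                          r=None, n_sim=100_000, seed=0):
    """Monte Carlo estimates of FDR, pFDR, mFDR (and cFDR, eFDR at a given r).

    Assumed model (illustrative, independence case):
      V ~ Binomial(m0, alpha)   false rejections among m0 true nulls
      S ~ Binomial(m1, power)   true rejections among m1 true alternatives
      R = V + S                 total rejections
    """
    rng = np.random.default_rng(seed)
    v = rng.binomial(m0, alpha, size=n_sim)
    s = rng.binomial(m1, power, size=n_sim)
    total = v + s

    # V/R is taken to be 0 when R = 0, the usual convention for FDR = E(V/R).
    positive = total > 0
    ratio = np.zeros(n_sim)
    ratio[positive] = v[positive] / total[positive]

    nan = float("nan")
    out = {
        "FDR": float(ratio.mean()),
        # pFDR conditions on at least one rejection.
        "pFDR": float(ratio[positive].mean()) if positive.any() else nan,
        # mFDR is a ratio of expectations rather than an expected ratio.
        "mFDR": float(v.mean() / total.mean()) if total.mean() > 0 else nan,
    }
    if r is not None and r > 0:
        hit = total == r
        # cFDR conditions on exactly r rejections; eFDR divides E(V) by r.
        out["cFDR"] = float(ratio[hit].mean()) if hit.any() else nan
        out["eFDR"] = float(v.mean() / r)
    return out


if __name__ == "__main__":
    for name, value in simulate_fdr_measures(r=125).items():
        print(f"{name}: {value:.4f}")
```

With the default settings, E(V) = 45 and E(R) = 125, so the mFDR estimate should sit close to 45/125 = 0.36. Because R = 0 is extremely unlikely here, the FDR and pFDR estimates are nearly identical. They would separate when few rejections are expected.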

Source
http://dx.doi.org/10.1111/j.0006-341x.2003.00123.x
