Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection

C Li, WH Wong - Proceedings of the National Academy of …, 2001 - National Acad Sciences
Proceedings of the National Academy of Sciences, 2001
Recent advances in cDNA and oligonucleotide DNA arrays have made it possible to measure the abundance of mRNA transcripts for many genes simultaneously. The analysis of such experiments is nontrivial because of the large data size and the many levels of variation introduced at different stages of the experiments. The analysis is further complicated by the large differences that may exist among different probes used to interrogate the same gene. However, an attractive feature of high-density oligonucleotide arrays, such as those produced by photolithography and inkjet technology, is the standardization of the chip manufacturing and hybridization processes. As a result, probe-specific biases, although significant, are highly reproducible and predictable, and their adverse effects can be reduced by proper modeling and analysis methods. Here, we propose a statistical model for the probe-level data, and develop model-based estimates for gene expression indexes. We also present model-based methods for identifying and handling cross-hybridizing probes and contaminating array regions. Applications of these results will be presented elsewhere.
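
The abstract does not spell out the model, so the following is only an illustrative sketch of how a probe-level model of this kind might be fit. It assumes a multiplicative form in which the background-adjusted intensity of probe j on array i is approximately theta_i * phi_j, where theta_i plays the role of an expression index for the gene on array i and phi_j is a reproducible probe-specific sensitivity, constrained so that the sum of phi_j squared equals the number of probes. The function name, the alternating least-squares fitting scheme, and the convergence settings are assumptions made for illustration and are not taken from the text above.

```python
import numpy as np


def fit_multiplicative_model(y, n_iter=100, tol=1e-8):
    """Fit y[i, j] ~ theta[i] * phi[j] by alternating least squares.

    y is an (arrays x probes) matrix of background-adjusted intensities
    for a single probe set. Hypothetical helper, for illustration only.
    """
    y = np.asarray(y, dtype=float)
    n_arrays, n_probes = y.shape

    # Start with equal probe sensitivities; this satisfies sum(phi**2) == n_probes.
    phi = np.ones(n_probes)
    theta = y @ phi / (phi @ phi)

    for _ in range(n_iter):
        # Least-squares update of the probe effects given the expression indexes.
        phi = theta @ y / (theta @ theta)
        # Rescale so that sum(phi**2) == n_probes, which fixes the overall scale.
        phi = phi * np.sqrt(n_probes / (phi @ phi))

        # Least-squares update of the expression indexes given the probe effects.
        new_theta = y @ phi / (phi @ phi)
        converged = np.max(np.abs(new_theta - theta)) < tol
        theta = new_theta
        if converged:
            break

    # Residuals from the fitted rank-one surface; large values flag
    # array/probe cells that the model explains poorly.
    residuals = y - np.outer(theta, phi)
    return theta, phi, residuals
```

Under these assumptions, the fitted theta values serve as per-array expression estimates, and unusually large residuals for a particular probe or array are one plausible starting point for the kind of outlier screening the abstract mentions. The sketch assumes y is not all zeros, since the updates divide by the sums of squares of theta and phi.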