List of Figures | p. xiii |
List of Tables | p. xvi |
Preface | p. xvii |
Genome Probing Using Microarrays | |
Introduction | p. 3 |
DNA, RNA, Proteins, and Gene Expression | p. 7 |
The Molecules of Life | p. 7 |
Genes | p. 8 |
DNA | p. 9 |
RNA | p. 12 |
The Genetic Code | p. 13 |
Proteins | p. 14 |
Gene Expression and Microarrays | p. 15 |
Complementary DNA (cDNA) | p. 16 |
Nucleic Acid Hybridization | p. 16 |
Microarray Technology | p. 19 |
Transcriptional Profiling | p. 20 |
Sequencing-based Transcriptional Profiling | p. 20 |
Hybridization-based Transcriptional Profiling | p. 22 |
Microarray Technological Platforms | p. 23 |
Probe Selection and Synthesis | p. 24 |
Array Manufacturing | p. 30 |
Target Labeling | p. 31 |
Hybridization | p. 34 |
Scanning and Image Analysis | p. 35 |
Microarray Data | p. 36 |
Spotted Array Data | p. 36 |
In-situ Oligonucleotide Array Data | p. 37 |
So I Have My Microarray Data - What's Next? | p. 39 |
Confirming Microarray Results | p. 39 |
Northern Blot Analysis | p. 40 |
Reverse-transcription PCR and Quantitative Real-time RT-PCR | p. 40 |
Inherent Variability in Array Data | p. 45 |
Genetic Populations | p. 45 |
Variability in Gene Expression Levels | p. 47 |
Variability Due to Specimen Sampling | p. 47 |
Variability Due to Cell Cycle Regulation | p. 48 |
Experimental Variability | p. 48 |
Test the Variability by Replication | p. 50 |
Duplicated Spots | p. 50 |
Multiple Arrays and Biological Replications | p. 51 |
Background Noise | p. 53 |
Pixel-by-pixel Analysis of Individual Spots | p. 53 |
General Models for Background Noise | p. 56 |
Additive Background Noise | p. 57 |
Correction for Background Noise | p. 58 |
Example: Replication Test Data Set | p. 59 |
Noise Models for GeneChip Arrays | p. 62 |
Elusive Nature of Background Noise | p. 63 |
Transformation and Normalization | p. 67 |
Data Transformations | p. 67 |
Logarithmic Transformation | p. 67 |
Square Root Transformation | p. 68 |
Box-Cox Transformation Family | p. 69 |
Affine Transformation | p. 69 |
The Generalized-log Transformation | p. 71 |
Data Normalization | p. 72 |
Normalization Across G Genes | p. 74 |
Example: Mouse Juvenile Cystic Kidney Data Set | p. 75 |
Normalization Across G Genes and N Samples | p. 77 |
Color Effects and MA Plots | p. 78 |
Normalization Based on LOWESS Function | p. 80 |
Normalization Based on Rank-invariant Genes | p. 82 |
Normalization Based on a Sample Pool | p. 82 |
Global Normalization Using ANOVA Models | p. 82 |
Other Normalization Issues | p. 83 |
Missing Values in Array Data | p. 85 |
Missing Values in Array Data | p. 85 |
Sources of Problem | p. 85 |
Statistical Classification of Missing Data | p. 86 |
Missing Values in Replicated Designs | p. 88 |
Imputation of Missing Values | p. 89 |
Saturated Intensity Readings | p. 93 |
Saturated Intensity Readings | p. 93 |
Multiple Power-levels for Spotted Arrays | p. 93 |
Imputing Saturated Intensity Readings | p. 95 |
High Intensities in Oligonucleotide Arrays | p. 97 |
Statistical Models and Analysis | |
Experimental Design | p. 103 |
Factors Involved in Experiments | p. 103 |
Types of Design Structures | p. 106 |
Common Practice in Microarray Studies | p. 112 |
Reference Design | p. 112 |
Time-course Experiment | p. 114 |
Color Reversal | p. 115 |
Loop Design | p. 116 |
Example: Time-course Loop Design | p. 117 |
ANOVA Models for Microarray Data | p. 121 |
A Basic Log-linear Model | p. 121 |
ANOVA With Multiple Factors | p. 123 |
Main Effects | p. 123 |
Interaction Effects | p. 123 |
A Generic Fixed-Effects ANOVA Model | p. 124 |
Estimation for Interaction Effects | p. 126 |
Two-stage Estimation Procedures | p. 126 |
Example | p. 128 |
Identifying Differentially Expressed Genes | p. 130 |
Standard MSE-based Approach | p. 130 |
Other Approaches | p. 132 |
Modified MSE-based Approach | p. 132 |
Mixed-effects Models | p. 135 |
ANOVA for Split-plot Design | p. 136 |
Log Intensity Versus Log Ratio | p. 138 |
Multiple Testing in Microarray Studies | p. 143 |
Hypothesis Testing for Any Individual Gene | p. 143 |
Multiple Testing for the Entire Gene Set | p. 144 |
Framework for Multiple Testing | p. 144 |
Test Statistic for Each Gene | p. 145 |
Two Error Control Criteria in Multiple Testing | p. 146 |
Implementation Algorithms | p. 147 |
Example of Multiple Testing Algorithms | p. 152 |
Concluding Remarks | p. 153 |
Permutation Tests in Microarray Data | p. 157 |
Basic Concepts | p. 157 |
Permutation Tests in Microarray Studies | p. 160 |
Exchangeability in Microarray Designs | p. 160 |
Limitation of Having Few Permutations | p. 162 |
Pooling Test Results Across Genes | p. 162 |
Lipopolysaccharide-E. coli Data Set | p. 163 |
Statistical Model | p. 164 |
Permutation Testing and Results | p. 166 |
Bayesian Methods for Microarray Data | p. 171 |
Mixture Model for Gene Expression | p. 171 |
Variations on the Mixture Model | p. 173 |
Example of Gamma Models | p. 175 |
Mixture Model for Differential Expression | p. 176 |
Mixture Model for Color Ratio Data | p. 176 |
Relation of Mixture Model to ANOVA Model | p. 180 |
Bayes Interpretation of Mixture Model | p. 182 |
Empirical Bayes Methods | p. 183 |
Example of Empirical Bayes Fitting | p. 184 |
Hierarchical Bayes Models | p. 187 |
Example of Hierarchical Modeling | p. 189 |
Power and Sample Size Considerations | p. 193 |
Test Hypotheses in Microarray Studies | p. 194 |
Distributions of Estimated Differential Expression | p. 196 |
Summary Measures of Estimated Differential Expression | p. 196 |
Multiple Testing Framework | p. 197 |
Dependencies of Estimation Errors | p. 199 |
Familywise Type I Error Control | p. 200 |
Type I Error Control: the Sidak Approach | p. 201 |
Type I Error Control: the Bonferroni Approach | p. 203 |
Familywise Type II Error Control | p. 204 |
Type II Error Control: the Sidak Approach | p. 206 |
Type II Error Control: the Bonferroni Approach | p. 206 |
Contrast of Planning and Implementation in Multiple Testing | p. 207 |
Power Calculations for Different Summary Measures | p. 208 |
Designs with Linear Summary Measure | p. 208 |
Numerical Example for Linear Summary | p. 210 |
Designs with Quadratic Summary Measure | p. 211 |
Numerical Example for Quadratic Summary | p. 213 |
A Bayesian Perspective on Power and Sample Size | p. 214 |
Connection to Local Discovery Rates | p. 215 |
Representative Local True Discovery Rate | p. 215 |
Numerical Example for TDR and FDR | p. 216 |
Applications to Standard Designs | p. 216 |
Treatment-control Designs | p. 217 |
Sample Size for a Treatment-control Design | p. 218 |
Multiple-treatment Designs | p. 221 |
Power Table for a Multiple-treatment Design | p. 224 |
Time-course and Similar Multiple-treatment Designs | p. 227 |
Relation Between Power, Replication and Design | p. 228 |
Effects of Replication | p. 228 |
Controlling Sources of Variability | p. 229 |
Assessing Power from Microarray Pilot Studies | p. 230 |
Example 1: Juvenile Cystic Kidney Disease | p. 230 |
Example 2: Opioid Dependence | p. 231 |
Unsupervised Exploratory Analysis | |
Cluster Analysis | p. 237 |
Distance and Similarity Measures | p. 238 |
Distance Measures | p. 239 |
Properties of Distance Measures | p. 239 |
Minkowski Distance Measures | p. 240 |
Mahalanobis Distance | p. 241 |
Similarity Measures | p. 241 |
Inner Product | p. 241 |
Pearson Correlation Coefficient | p. 242 |
Spearman Rank Correlation Coefficient | p. 243 |
Inter-cluster Distance | p. 243 |
Mahalanobis Inter-cluster Distance | p. 244 |
Neighbor-based Inter-cluster Distance | p. 244 |
Hierarchical Clustering | p. 244 |
Single Linkage Method | p. 245 |
Complete Linkage Method | p. 245 |
Average Linkage Clustering | p. 245 |
Centroid Linkage Method | p. 246 |
Median Linkage Clustering | p. 246 |
Ward's Clustering Method | p. 246 |
Applications | p. 246 |
Comparisons of Clustering Algorithms | p. 247 |
K-means Clustering | p. 247 |
Bayesian Cluster Analysis | p. 248 |
Two-way Clustering Methods | p. 248 |
Reliability of Clustering Patterns for Microarray Data | p. 249 |
Principal Components and Singular Value Decomposition | p. 251 |
Principal Component Analysis | p. 251 |
Applications of Dominant Principal Components | p. 253 |
Singular-value Decomposition | p. 254 |
Computational Procedures for SVD | p. 255 |
Eigengenes and Eigenarrays | p. 256 |
Fraction of Eigenexpression | p. 256 |
Generalized Singular Value Decomposition | p. 257 |
Robust Singular Value Decomposition | p. 257 |
Self-Organizing Maps | p. 261 |
The Basic Logic of a SOM | p. 261 |
The SOM Updating Algorithm | p. 265 |
Program GENECLUSTER | p. 267 |
Supervised SOM | p. 268 |
Applications | p. 268 |
Using SOM to Cluster Genes | p. 268 |
Using SOM to Cluster Tumors | p. 269 |
Multiclass Cancer Diagnosis | p. 270 |
Supervised Learning Methods | |
Discrimination and Classification | p. 277 |
Fisher's Linear Discriminant Analysis | p. 278 |
Maximum Likelihood Discriminant Rules | p. 279 |
Bayesian Classification | p. 280 |
k-Nearest Neighbor Classifier | p. 281 |
Neighborhood Analysis | p. 282 |
A Gene-casting Weighted Voting Scheme | p. 283 |
Example: Classification of Leukemia Samples | p. 284 |
Artificial Neural Networks | p. 287 |
Single-layer Neural Network | p. 288 |
Separating Hyperplanes | p. 288 |
Class Labels | p. 289 |
Decision Rules | p. 290 |
Risk Functions | p. 290 |
Gradient Descent Procedures | p. 290 |
Rosenblatt's Perceptron Method | p. 291 |
General Structure of Multilayer Neural Networks | p. 292 |
Training a Multilayer Neural Network | p. 294 |
Sigmoid Functions | p. 294 |
Mathematical Formulation | p. 295 |
Training Algorithm | p. 296 |
Discussion | p. 298 |
Cancer Classification Using Neural Networks | p. 298 |
Support Vector Machines | p. 301 |
Geometric Margins for Linearly Separable Groups | p. 301 |
Convex Optimization in the Dual Space | p. 305 |
Support Vectors | p. 306 |
Linearly Nonseparable Groups | p. 307 |
Nonlinear Separating Boundary | p. 308 |
Kernel Functions | p. 309 |
Kernels Defined by Symmetric Functions | p. 309 |
Use of SVM for Classifying Genes | p. 310 |
Examples | p. 311 |
Functional Classification of Genes | p. 311 |
SVM and One-versus-All Classification Scheme | p. 313 |
Appendices | p. 316 |
Sample Size Table for Treatment-control Designs | p. 317 |
Power Table for Multiple-treatment Designs | p. 327 |
Glossary of Notation | p. 349 |
Author Index | p. 367 |
Topic Index | p. 373 |