| List of Figures | p. xiii |
| List of Tables | p. xvi |
| Preface | p. xvii |
| Genome Probing Using Microarrays | |
| Introduction | p. 3 |
| DNA, RNA, Proteins, and Gene Expression | p. 7 |
| The Molecules of Life | p. 7 |
| Genes | p. 8 |
| DNA | p. 9 |
| RNA | p. 12 |
| The Genetic Code | p. 13 |
| Proteins | p. 14 |
| Gene Expression and Microarrays | p. 15 |
| Complementary DNA (cDNA) | p. 16 |
| Nucleic Acid Hybridization | p. 16 |
| Microarray Technology | p. 19 |
| Transcriptional Profiling | p. 20 |
| Sequencing-based Transcriptional Profiling | p. 20 |
| Hybridization-based Transcriptional Profiling | p. 22 |
| Microarray Technological Platforms | p. 23 |
| Probe Selection and Synthesis | p. 24 |
| Array Manufacturing | p. 30 |
| Target Labeling | p. 31 |
| Hybridization | p. 34 |
| Scanning and Image Analysis | p. 35 |
| Microarray Data | p. 36 |
| Spotted Array Data | p. 36 |
| In-situ Oligonucleotide Array Data | p. 37 |
| So I Have My Microarray Data - What's Next? | p. 39 |
| Confirming Microarray Results | p. 39 |
| Northern Blot Analysis | p. 40 |
| Reverse-transcription PCR and Quantitative Real-time RT-PCR | p. 40 |
| Inherent Variability in Array Data | p. 45 |
| Genetic Populations | p. 45 |
| Variability in Gene Expression Levels | p. 47 |
| Variability Due to Specimen Sampling | p. 47 |
| Variability Due to Cell Cycle Regulation | p. 48 |
| Experimental Variability | p. 48 |
| Test the Variability by Replication | p. 50 |
| Duplicated Spots | p. 50 |
| Multiple Arrays and Biological Replications | p. 51 |
| Background Noise | p. 53 |
| Pixel-by-pixel Analysis of Individual Spots | p. 53 |
| General Models for Background Noise | p. 56 |
| Additive Background Noise | p. 57 |
| Correction for Background Noise | p. 58 |
| Example: Replication Test Data Set | p. 59 |
| Noise Models for GeneChip Arrays | p. 62 |
| Elusive Nature of Background Noise | p. 63 |
| Transformation and Normalization | p. 67 |
| Data Transformations | p. 67 |
| Logarithmic Transformation | p. 67 |
| Square Root Transformation | p. 68 |
| Box-Cox Transformation Family | p. 69 |
| Affine Transformation | p. 69 |
| The Generalized-log Transformation | p. 71 |
| Data Normalization | p. 72 |
| Normalization Across G Genes | p. 74 |
| Example: Mouse Juvenile Cystic Kidney Data Set | p. 75 |
| Normalization Across G Genes and N Samples | p. 77 |
| Color Effects and MA Plots | p. 78 |
| Normalization Based on LOWESS Function | p. 80 |
| Normalization Based on Rank-invariant Genes | p. 82 |
| Normalization Based on a Sample Pool | p. 82 |
| Global Normalization Using ANOVA Models | p. 82 |
| Other Normalization Issues | p. 83 |
| Missing Values in Array Data | p. 85 |
| Missing Values in Array Data | p. 85 |
| Sources of Problem | p. 85 |
| Statistical Classification of Missing Data | p. 86 |
| Missing Values in Replicated Designs | p. 88 |
| Imputation of Missing Values | p. 89 |
| Saturated Intensity Readings | p. 93 |
| Saturated Intensity Readings | p. 93 |
| Multiple Power-levels for Spotted Arrays | p. 93 |
| Imputing Saturated Intensity Readings | p. 95 |
| High Intensities in Oligonucleotide Arrays | p. 97 |
| Statistical Models and Analysis | |
| Experimental Design | p. 103 |
| Factors Involved in Experiments | p. 103 |
| Types of Design Structures | p. 106 |
| Common Practice in Microarray Studies | p. 112 |
| Reference Design | p. 112 |
| Time-course Experiment | p. 114 |
| Color Reversal | p. 115 |
| Loop Design | p. 116 |
| Example: Time-course Loop Design | p. 117 |
| ANOVA Models for Microarray Data | p. 121 |
| A Basic Log-linear Model | p. 121 |
| ANOVA With Multiple Factors | p. 123 |
| Main Effects | p. 123 |
| Interaction Effects | p. 123 |
| A Generic Fixed-Effects ANOVA Model | p. 124 |
| Estimation for Interaction Effects | p. 126 |
| Two-stage Estimation Procedures | p. 126 |
| Example | p. 128 |
| Identifying Differentially Expressed Genes | p. 130 |
| Standard MSE-based Approach | p. 130 |
| Other Approaches | p. 132 |
| Modified MSE-based Approach | p. 132 |
| Mixed-effects Models | p. 135 |
| ANOVA for Split-plot Design | p. 136 |
| Log Intensity Versus Log Ratio | p. 138 |
| Multiple Testing in Microarray Studies | p. 143 |
| Hypothesis Testing for Any Individual Gene | p. 143 |
| Multiple Testing for the Entire Gene Set | p. 144 |
| Framework for Multiple Testing | p. 144 |
| Test Statistic for Each Gene | p. 145 |
| Two Error Control Criteria in Multiple Testing | p. 146 |
| Implementation Algorithms | p. 147 |
| Example of Multiple Testing Algorithms | p. 152 |
| Concluding Remarks | p. 153 |
| Permutation Tests in Microarray Data | p. 157 |
| Basic Concepts | p. 157 |
| Permutation Tests in Microarray Studies | p. 160 |
| Exchangeability in Microarray Designs | p. 160 |
| Limitation of Having Few Permutations | p. 162 |
| Pooling Test Results Across Genes | p. 162 |
| Lipopolysaccharide-E. coli Data Set | p. 163 |
| Statistical Model | p. 164 |
| Permutation Testing and Results | p. 166 |
| Bayesian Methods for Microarray Data | p. 171 |
| Mixture Model for Gene Expression | p. 171 |
| Variations on the Mixture Model | p. 173 |
| Example of Gamma Models | p. 175 |
| Mixture Model for Differential Expression | p. 176 |
| Mixture Model for Color Ratio Data | p. 176 |
| Relation of Mixture Model to ANOVA Model | p. 180 |
| Bayes Interpretation of Mixture Model | p. 182 |
| Empirical Bayes Methods | p. 183 |
| Example of Empirical Bayes Fitting | p. 184 |
| Hierarchical Bayes Models | p. 187 |
| Example of Hierarchical Modeling | p. 189 |
| Power and Sample Size Considerations | p. 193 |
| Test Hypotheses in Microarray Studies | p. 194 |
| Distributions of Estimated Differential Expression | p. 196 |
| Summary Measures of Estimated Differential Expression | p. 196 |
| Multiple Testing Framework | p. 197 |
| Dependencies of Estimation Errors | p. 199 |
| Familywise Type I Error Control | p. 200 |
| Type I Error Control: the Sidak Approach | p. 201 |
| Type I Error Control: the Bonferroni Approach | p. 203 |
| Familywise Type II Error Control | p. 204 |
| Type II Error Control: the Sidak Approach | p. 206 |
| Type II Error Control: the Bonferroni Approach | p. 206 |
| Contrast of Planning and Implementation in Multiple Testing | p. 207 |
| Power Calculations for Different Summary Measures | p. 208 |
| Designs with Linear Summary Measure | p. 208 |
| Numerical Example for Linear Summary | p. 210 |
| Designs with Quadratic Summary Measure | p. 211 |
| Numerical Example for Quadratic Summary | p. 213 |
| A Bayesian Perspective on Power and Sample Size | p. 214 |
| Connection to Local Discovery Rates | p. 215 |
| Representative Local True Discovery Rate | p. 215 |
| Numerical Example for TDR and FDR | p. 216 |
| Applications to Standard Designs | p. 216 |
| Treatment-control Designs | p. 217 |
| Sample Size for a Treatment-control Design | p. 218 |
| Multiple-treatment Designs | p. 221 |
| Power Table for a Multiple-treatment Design | p. 224 |
| Time-course and Similar Multiple-treatment Designs | p. 227 |
| Relation Between Power, Replication and Design | p. 228 |
| Effects of Replication | p. 228 |
| Controlling Sources of Variability | p. 229 |
| Assessing Power from Microarray Pilot Studies | p. 230 |
| Example 1: Juvenile Cystic Kidney Disease | p. 230 |
| Example 2: Opioid Dependence | p. 231 |
| Unsupervised Exploratory Analysis | |
| Cluster Analysis | p. 237 |
| Distance and Similarity Measures | p. 238 |
| Distance Measures | p. 239 |
| Properties of Distance Measures | p. 239 |
| Minkowski Distance Measures | p. 240 |
| Mahalanobis Distance | p. 241 |
| Similarity Measures | p. 241 |
| Inner Product | p. 241 |
| Pearson Correlation Coefficient | p. 242 |
| Spearman Rank Correlation Coefficient | p. 243 |
| Inter-cluster Distance | p. 243 |
| Mahalanobis Inter-cluster Distance | p. 244 |
| Neighbor-based Inter-cluster Distance | p. 244 |
| Hierarchical Clustering | p. 244 |
| Single Linkage Method | p. 245 |
| Complete Linkage Method | p. 245 |
| Average Linkage Clustering | p. 245 |
| Centroid Linkage Method | p. 246 |
| Median Linkage Clustering | p. 246 |
| Ward's Clustering Method | p. 246 |
| Applications | p. 246 |
| Comparisons of Clustering Algorithms | p. 247 |
| K-means Clustering | p. 247 |
| Bayesian Cluster Analysis | p. 248 |
| Two-way Clustering Methods | p. 248 |
| Reliability of Clustering Patterns for Microarray Data | p. 249 |
| Principal Components and Singular Value Decomposition | p. 251 |
| Principal Component Analysis | p. 251 |
| Applications of Dominant Principal Components | p. 253 |
| Singular-value Decomposition | p. 254 |
| Computational Procedures for SVD | p. 255 |
| Eigengenes and Eigenarrays | p. 256 |
| Fraction of Eigenexpression | p. 256 |
| Generalized Singular Value Decomposition | p. 257 |
| Robust Singular Value Decomposition | p. 257 |
| Self-Organizing Maps | p. 261 |
| The Basic Logic of a SOM | p. 261 |
| The SOM Updating Algorithm | p. 265 |
| Program GENECLUSTER | p. 267 |
| Supervised SOM | p. 268 |
| Applications | p. 268 |
| Using SOM to Cluster Genes | p. 268 |
| Using SOM to Cluster Tumors | p. 269 |
| Multiclass Cancer Diagnosis | p. 270 |
| Supervised Learning Methods | |
| Discrimination and Classification | p. 277 |
| Fisher's Linear Discriminant Analysis | p. 278 |
| Maximum Likelihood Discriminant Rules | p. 279 |
| Bayesian Classification | p. 280 |
| k-Nearest Neighbor Classifier | p. 281 |
| Neighborhood Analysis | p. 282 |
| A Gene-casting Weighted Voting Scheme | p. 283 |
| Example: Classification of Leukemia Samples | p. 284 |
| Artificial Neural Networks | p. 287 |
| Single-layer Neural Network | p. 288 |
| Separating Hyperplanes | p. 288 |
| Class Labels | p. 289 |
| Decision Rules | p. 290 |
| Risk Functions | p. 290 |
| Gradient Descent Procedures | p. 290 |
| Rosenblatt's Perceptron Method | p. 291 |
| General Structure of Multilayer Neural Networks | p. 292 |
| Training a Multilayer Neural Network | p. 294 |
| Sigmoid Functions | p. 294 |
| Mathematical Formulation | p. 295 |
| Training Algorithm | p. 296 |
| Discussion | p. 298 |
| Cancer Classification Using Neural Networks | p. 298 |
| Support Vector Machines | p. 301 |
| Geometric Margins for Linearly Separable Groups | p. 301 |
| Convex Optimization in the Dual Space | p. 305 |
| Support Vectors | p. 306 |
| Linearly Nonseparable Groups | p. 307 |
| Nonlinear Separating Boundary | p. 308 |
| Kernel Functions | p. 309 |
| Kernels Defined by Symmetric Functions | p. 309 |
| Use of SVM for Classifying Genes | p. 310 |
| Examples | p. 311 |
| Functional Classification of Genes | p. 311 |
| SVM and One-versus-All Classification Scheme | p. 313 |
| Appendices | p. 316 |
| Sample Size Table for Treatment-control Designs | p. 317 |
| Power Table for Multiple-treatment Designs | p. 327 |
| Glossary of Notation | p. 349 |
| Author Index | p. 367 |
| Topic Index | p. 373 |