Preface | p. xvii |
Acknowledgments | p. xxi |
About the Author | p. xxiii |
From Data to Models: Complexity and Challenges in Understanding Biological, Ecological, and Natural Systems | p. 1 |
Introduction | p. 1 |
Layout of the Book | p. 4 |
References | p. 7 |
Fundamentals of Neural Networks and Models for Linear Data Analysis | p. 11 |
Introduction and Overview | p. 11 |
Neural Networks and Their Capabilities | p. 12 |
Inspirations from Biology | p. 16 |
Modeling Information Processing in Neurons | p. 18 |
Neuron Models and Learning Strategies | p. 19 |
Threshold Neuron as a Simple Classifier | p. 20 |
Learning Models for Neurons and Neural Assemblies | p. 23 |
Hebbian Learning | p. 23 |
Unsupervised or Competitive Learning | p. 26 |
Supervised Learning | p. 26 |
Perceptron with Supervised Learning as a Classifier | p. 27 |
Perceptron Learning Algorithm | p. 28 |
A Practical Example of Perceptron on a Larger Realistic Data Set: Identifying the Origin of Fish from the Growth-Ring Diameter of Scales | p. 35 |
Comparison of Perceptron with Linear Discriminant Function Analysis in Statistics | p. 38 |
Multi-Output Perceptron for Multicategory Classification | p. 40 |
Higher-Dimensional Classification Using Perceptron | p. 45 |
Perceptron Summary | p. 45 |
Linear Neuron for Linear Classification and Prediction | p. 46 |
Learning with the Delta Rule | p. 47 |
Linear Neuron as a Classifier | p. 51 |
Classification Properties of a Linear Neuron as a Subset of Predictive Capabilities | p. 53 |
Example: Linear Neuron as a Predictor | p. 54 |
A Practical Example of Linear Prediction: Predicting the Heat Influx in a Home | p. 61 |
Comparison of Linear Neuron Model with Linear Regression | p. 62 |
Example: Multiple Input Linear Neuron Model-Improving the Prediction Accuracy of Heat Influx in a Home | p. 63 |
Comparison of a Multiple-Input Linear Neuron with Multiple Linear Regression | p. 63 |
Multiple Linear Neuron Models | p. 64 |
Comparison of a Multiple Linear Neuron Network with Canonical Correlation Analysis | p. 65 |
Linear Neuron and Linear Network Summary | p. 65 |
Summary | p. 66 |
Problems | p. 66 |
References | p. 67 |
Neural Networks for Nonlinear Pattern Recognition | p. 69 |
Overview and Introduction | p. 69 |
Multilayer Perceptron | p. 71 |
Nonlinear Neurons | p. 72 |
Neuron Activation Functions | p. 73 |
Sigmoid Functions | p. 74 |
Gaussian Functions | p. 76 |
Example: Population Growth Modeling Using a Nonlinear Neuron | p. 77 |
Comparison of Nonlinear Neuron with Nonlinear Regression Analysis | p. 80 |
One-Input Multilayer Nonlinear Networks | p. 80 |
Processing with a Single Nonlinear Hidden Neuron | p. 80 |
Examples: Modeling Cyclical Phenomena with Multiple Nonlinear Neurons | p. 86 |
Example 1: Approximating a Square Wave | p. 86 |
Example 2: Modeling Seasonal Species Migration | p. 94 |
Two-Input Multilayer Perceptron Network | p. 98 |
Processing of Two-Dimensional Inputs by Nonlinear Neurons | p. 98 |
Network Output | p. 102 |
Examples: Two-Dimensional Prediction and Classification | p. 103 |
Example 1: Two-Dimensional Nonlinear Function Approximation | p. 103 |
Example 2: Two-Dimensional Nonlinear Classification Model | p. 105 |
Multidimensional Data Modeling with Nonlinear Multilayer Perceptron Networks | p. 109 |
Summary | p. 110 |
Problems | p. 110 |
References | p. 112 |
Learning of Nonlinear Patterns by Neural Networks | p. 113 |
Introduction and Overview | p. 113 |
Supervised Training of Networks for Nonlinear Pattern Recognition | p. 114 |
Gradient Descent and Error Minimization | p. 115 |
Backpropagation Learning | p. 116 |
Example: Backpropagation Training-A Hand Computation | p. 117 |
Error Gradient with Respect to Output Neuron Weights | p. 120 |
The Error Gradient with Respect to the Hidden-Neuron Weights | p. 123 |
Application of Gradient Descent in Backpropagation Learning | p. 127 |
Batch Learning | p. 128 |
Learning Rate and Weight Update | p. 130 |
Example-by-Example (Online) Learning | p. 134 |
Momentum | p. 134 |
Example: Backpropagation Learning Computer Experiment | p. 138 |
Single-Input Single-Output Network with Multiple Hidden Neurons | p. 141 |
Multiple-Input, Multiple-Hidden Neuron, and Single-Output Network | p. 142 |
Multiple-Input, Multiple-Hidden Neuron, Multiple-Output Network | p. 143 |
Example: Backpropagation Learning Case Study-Solving a Complex Classification Problem | p. 145 |
Delta-Bar-Delta Learning (Adaptive Learning Rate) Method | p. 152 |
Example: Network Training with Delta-Bar-Delta-A Hand Computation | p. 154 |
Example: Delta-Bar-Delta with Momentum-A Hand Computation | p. 157 |
Network Training with Delta-Bar-Delta-A Computer Experiment | p. 158 |
Comparison of Delta-Bar-Delta Method with Backpropagation | p. 159 |
Example: Network Training with Delta-Bar-Delta-A Case Study | p. 160 |
Steepest Descent Method | p. 163 |
Example: Network Training with Steepest Descent-Hand Computation | p. 163 |
Example: Network Training with Steepest Descent-A Computer Experiment | p. 164 |
Second-Order Methods of Error Minimization and Weight Optimization | p. 166 |
QuickProp | p. 167 |
Example: Network Training with QuickProp-A Hand Computation | p. 168 |
Example: Network Training with QuickProp-A Computer Experiment | p. 170 |
Comparison of QuickProp with Steepest Descent, Delta-Bar-Delta, and Backpropagation | p. 170 |
General Concept of Second-Order Methods of Error Minimization | p. 172 |
Gauss-Newton Method | p. 174 |
Network Training with the Gauss-Newton Method-A Hand Computation | p. 176 |
Example: Network Training with Gauss-Newton Method-A Computer Experiment | p. 178 |
The Levenberg-Marquardt Method | p. 180 |
Example: Network Training with LM Method-A Hand Computation | p. 182 |
Network Training with the LM Method-A Computer Experiment | p. 183 |
Comparison of the Efficiency of the First-Order and Second-Order Methods in Minimizing Error | p. 184 |
Comparison of the Convergence Characteristics of First-Order and Second-Order Learning Methods | p. 185 |
Backpropagation | p. 187 |
Steepest Descent Method | p. 188 |
Gauss-Newton Method | p. 189 |
Levenberg-Marquardt Method | p. 190 |
Summary | p. 192 |
Problems | p. 192 |
References | p. 193 |
Implementation of Neural Network Models for Extracting Reliable Patterns from Data | p. 195 |
Introduction and Overview | p. 195 |
Bias-Variance Tradeoff | p. 196 |
Improving Generalization of Neural Networks | p. 197 |
Illustration of Early Stopping | p. 199 |
Effect of Initial Random Weights | p. 203 |
Weight Structure of the Trained Networks | p. 206 |
Effect of Random Sampling | p. 207 |
Effect of Model Complexity: Number of Hidden Neurons | p. 212 |
Summary on Early Stopping | p. 213 |
Regularization | p. 215 |
Reducing Structural Complexity of Networks by Pruning | p. 221 |
Optimal Brain Damage | p. 222 |
Example of Network Pruning with Optimal Brain Damage | p. 223 |
Network Pruning Based on Variance of Network Sensitivity | p. 229 |
Illustration of Application of Variance Nullity in Pruning Weights | p. 232 |
Pruning Hidden Neurons Based on Variance Nullity of Sensitivity | p. 235 |
Robustness of a Network to Perturbation of Weights | p. 237 |
Confidence Intervals for Weights | p. 239 |
Summary | p. 241 |
Problems | p. 242 |
References | p. 243 |
Data Exploration, Dimensionality Reduction, and Feature Extraction | p. 245 |
Introduction and Overview | p. 245 |
Example: Thermal Conductivity of Wood in Relation to Correlated Input Data | p. 247 |
Data Visualization | p. 248 |
Correlation Scatter Plots and Histograms | p. 248 |
Parallel Visualization | p. 249 |
Projecting Multidimensional Data onto Two-Dimensional Plane | p. 250 |
Correlation and Covariance between Variables | p. 251 |
Normalization of Data | p. 253 |
Standardization | p. 253 |
Simple Range Scaling | p. 254 |
Whitening-Normalization of Correlated Multivariate Data | p. 255 |
Selecting Relevant Inputs | p. 259 |
Statistical Tools for Variable Selection | p. 260 |
Partial Correlation | p. 260 |
Multiple Regression and Best-Subsets Regression | p. 261 |
Dimensionality Reduction and Feature Extraction | p. 262 |
Multicollinearity | p. 262 |
Principal Component Analysis (PCA) | p. 263 |
Partial Least-Squares Regression | p. 267 |
Outlier Detection | p. 268 |
Noise | p. 270 |
Case Study: Illustrating Input Selection and Dimensionality Reduction for a Practical Problem | p. 270 |
Data Preprocessing and Preliminary Modeling | p. 271 |
PCA-Based Neural Network Modeling | p. 275 |
Effect of Hidden Neurons for Non-PCA- and PCA-Based Approaches | p. 278 |
Case Study Summary | p. 279 |
Summary | p. 280 |
Problems | p. 281 |
References | p. 281 |
Assessment of Uncertainty of Neural Network Models Using Bayesian Statistics | p. 283 |
Introduction and Overview | p. 283 |
Estimating Weight Uncertainty Using Bayesian Statistics | p. 285 |
Quality Criterion | p. 285 |
Incorporating Bayesian Statistics to Estimate Weight Uncertainty | p. 288 |
Square Error | p. 289 |
Intrinsic Uncertainty of Targets for Multivariate Output | p. 292 |
Probability Density Function of Weights | p. 293 |
Example Illustrating Generation of Probability Distribution of Weights | p. 295 |
Estimation of Geophysical Parameters from Remote Sensing: A Case Study | p. 295 |
Assessing Uncertainty of Neural Network Outputs Using Bayesian Statistics | p. 300 |
Example Illustrating Uncertainty Assessment of Output Errors | p. 301 |
Total Network Output Errors | p. 301 |
Error Correlation and Covariance Matrices | p. 302 |
Statistical Analysis of Error Covariance | p. 302 |
Decomposition of Total Output Error into Model Error and Intrinsic Noise | p. 304 |
Assessing the Sensitivity of Network Outputs to Inputs | p. 311 |
Approaches to Determine the Influence of Inputs on Outputs in Feedforward Networks | p. 311 |
Methods Based on Magnitude of Weights | p. 311 |
Sensitivity Analysis | p. 312 |
Example: Comparison of Methods to Assess the Influence of Inputs on Outputs | p. 313 |
Uncertainty of Sensitivities | p. 314 |
Example Illustrating Uncertainty Assessment of Network Sensitivity to Inputs | p. 315 |
PCA Decomposition of Inputs and Outputs | p. 315 |
PCA-Based Neural Network Regression | p. 320 |
Neural Network Sensitivities | p. 323 |
Uncertainty of Input Sensitivity | p. 325 |
PCA-Regularized Jacobians | p. 328 |
Case Study Summary | p. 333 |
Summary | p. 333 |
Problems | p. 334 |
References | p. 335 |
Discovering Unknown Clusters in Data with Self-Organizing Maps | p. 337 |
Introduction and Overview | p. 337 |
Structure of Unsupervised Networks | p. 338 |
Learning in Unsupervised Networks | p. 339 |
Implementation of Competitive Learning | p. 340 |
Winner Selection Based on Neuron Activation | p. 340 |
Winner Selection Based on Distance to Input Vector | p. 341 |
Other Distance Measures | p. 342 |
Competitive Learning Example | p. 343 |
Recursive Versus Batch Learning | p. 344 |
Illustration of the Calculations Involved in Winner Selection | p. 344 |
Network Training | p. 346 |
Self-Organizing Feature Maps | p. 349 |
Learning in Self-Organizing Map Networks | p. 349 |
Selection of Neighborhood Geometry | p. 349 |
Training of Self-Organizing Maps | p. 350 |
Neighbor Strength | p. 350 |
Example: Training Self-Organizing Networks with a Neighbor Feature | p. 351 |
Neighbor Matrix and Distance to Neighbors from the Winner | p. 354 |
Shrinking Neighborhood Size with Iterations | p. 357 |
Learning Rate Decay | p. 358 |
Weight Update Incorporating Learning Rate and Neighborhood Decay | p. 359 |
Recursive and Batch Training and Relation to K-Means Clustering | p. 360 |
Two Phases of Self-Organizing Map Training | p. 360 |
Example: Illustrating Self-Organizing Map Learning with a Hand Calculation | p. 361 |
SOM Case Study: Determination of Mastitis Health Status of Dairy Herd from Combined Milk Traits | p. 368 |
Example of Two-Dimensional Self-Organizing Maps: Clustering Canadian and Alaskan Salmon Based on the Diameter of Growth Rings of the Scales | p. 371 |
Map Structure and Initialization | p. 372 |
Map Training | p. 373 |
U-Matrix | p. 380 |
Map Initialization | p. 382 |
Example: Training Two-Dimensional Maps on Multidimensional Data | p. 382 |
Data Visualization | p. 383 |
Map Structure and Training | p. 383 |
U-Matrix | p. 389 |
Point Estimates of Probability Density of Inputs Captured by the Map | p. 390 |
Quantization Error | p. 391 |
Accuracy of Retrieval of Input Data from the Map | p. 393 |
Forming Clusters on the Map | p. 395 |
Approaches to Clustering | p. 396 |
Example Illustrating Clustering on a Trained Map | p. 397 |
Finding Optimum Clusters on the Map with the Ward Method | p. 401 |
Finding Optimum Clusters by K-Means Clustering | p. 403 |
Validation of a Trained Map | p. 406 |
n-Fold Cross Validation | p. 406 |
Evolving Self-Organizing Maps | p. 411 |
Growing Cell Structure of Map | p. 413 |
Centroid Method for Mapping Input Data onto Positions between Neurons on the Map | p. 416 |
Dynamic Self-Organizing Maps with Controlled Growth (GSOM) | p. 419 |
Example: Application of Dynamic Self-Organizing Maps | p. 422 |
Evolving Tree | p. 427 |
Summary | p. 431 |
Problems | p. 432 |
References | p. 434 |
Neural Networks for Time-Series Forecasting | p. 437 |
Introduction and Overview | p. 437 |
Linear Forecasting of Time-Series with Statistical and Neural Network Models | p. 440 |
Example Case Study: Regulating Temperature of a Furnace | p. 442 |
Multistep-Ahead Linear Forecasting | p. 444 |
Neural Networks for Nonlinear Time-Series Forecasting | p. 446 |
Focused Time-Lagged and Dynamically Driven Recurrent Networks | p. 446 |
Focused Time-Lagged Feedforward Networks | p. 448 |
Spatio-Temporal Time-Lagged Networks | p. 450 |
Example: Spatio-Temporal Time-Lagged Network-Regulating Temperature in a Furnace | p. 452 |
Single-Step Forecasting with Neural NARx Model | p. 454 |
Multistep Forecasting with Neural NARx Model | p. 455 |
Case Study: River Flow Forecasting | p. 457 |
Linear Model for River Flow Forecasting | p. 460 |
Nonlinear Neural (NARx) Model for River Flow Forecasting | p. 463 |
Input Sensitivity | p. 467 |
Hybrid Linear (ARIMA) and Nonlinear Neural Network Models | p. 468 |
Case Study: Forecasting the Annual Number of Sunspots | p. 470 |
Automatic Generation of Network Structure Using Simplest Structure Concept | p. 471 |
Case Study: Forecasting Air Pollution with Automatic Neural Network Model Generation | p. 473 |
Generalized Neuron Network | p. 475 |
Case Study: Short-Term Load Forecasting with a Generalized Neuron Network | p. 482 |
Dynamically Driven Recurrent Networks | p. 485 |
Recurrent Networks with Hidden Neuron Feedback | p. 485 |
Encapsulating Long-Term Memory | p. 485 |
Structure and Operation of the Elman Network | p. 488 |
Training Recurrent Networks | p. 490 |
Network Training Example: Hand Calculation | p. 495 |
Recurrent Learning Network Application Case Study: Rainfall Runoff Modeling | p. 500 |
Two-Step-Ahead Forecasting with Recurrent Networks | p. 503 |
Real-Time Recurrent Learning Case Study: Two-Step-Ahead Stream Flow Forecasting | p. 505 |
Recurrent Networks with Output Feedback | p. 508 |
Encapsulating Long-Term Memory in Recurrent Networks with Output Feedback | p. 508 |
Application of a Recurrent Net with Output and Error Feedback and Exogenous Inputs: (NARIMAx) Case Study: Short-Term Temperature Forecasting | p. 510 |
Training of Recurrent Nets with Output Feedback | p. 513 |
Fully Recurrent Network | p. 515 |
Fully Recurrent Network Practical Application Case Study: Short-Term Electricity Load Forecasting | p. 517 |
Bias and Variance in Time-Series Forecasting | p. 519 |
Decomposition of Total Error into Bias and Variance Components | p. 521 |
Example Illustrating Bias-Variance Decomposition | p. 522 |
Long-Term Forecasting | p. 528 |
Case Study: Long-Term Forecasting with Multiple Neural Networks (MNNs) | p. 531 |
Input Selection for Time-Series Forecasting | p. 533 |
Input Selection from Nonlinearly Dependent Variables | p. 535 |
Partial Mutual Information Method | p. 535 |
Generalized Regression Neural Network | p. 538 |
Self-Organizing Maps for Input Selection | p. 539 |
Genetic Algorithms for Input Selection | p. 541 |
Practical Application of Input Selection Methods for Time-Series Forecasting | p. 543 |
Input Selection Case Study: Selecting Inputs for Forecasting River Salinity | p. 546 |
Summary | p. 549 |
Problems | p. 551 |
References | p. 552 |
Appendix | p. 555 |
Index | p. 561 |