
Advances in Learning Theory: Methods, Models, and Applications
By: Johan A. K. Suykens (Editor), G. Horvath (Editor), S. Basu (Editor), Charles A. Micchelli (Editor), Joos Vandewalle (Editor)
Hardcover | 1 May 2003
At a Glance
440 Pages
23.4 x 15.6 x 2.5 cm
Hardcover
$563.95
or 4 interest-free payments of $140.99
Aims to ship in 10 to 15 business days
Preface | p. v |
Organizing committee | p. ix |
List of chapter contributors | p. xi |
An Overview of Statistical Learning Theory | p. 1 |
Setting of the Learning Problem | p. 2 |
Function estimation model | p. 2 |
Problem of risk minimization | p. 2 |
Three main learning problems | p. 2 |
Empirical risk minimization induction principle | p. 4 |
Empirical risk minimization principle and the classical methods | p. 4 |
Four parts of learning theory | p. 5 |
The Theory of Consistency of Learning Processes | p. 6 |
The key theorem of the learning theory | p. 6 |
The necessary and sufficient conditions for uniform convergence | p. 7 |
Three milestones in learning theory | p. 9 |
Bounds on the Rate of Convergence of the Learning Processes | p. 10 |
The structure of the growth function | p. 11 |
Equivalent definition of the VC dimension | p. 11 |
Two important examples | p. 12 |
Distribution independent bounds for the rate of convergence of learning processes | p. 13 |
Problem of constructing rigorous (distribution dependent) bounds | p. 14 |
Theory for Controlling the Generalization of Learning Machines | p. 15 |
Structural risk minimization induction principle | p. 15 |
Theory of Constructing Learning Algorithms | p. 17 |
Methods of separating hyperplanes and their generalization | p. 17 |
Sigmoid approximation of indicator functions and neural nets | p. 18 |
The optimal separating hyperplanes | p. 19 |
The support vector network | p. 21 |
Why can neural networks and support vector networks generalize? | p. 23 |
Conclusion | p. 24 |
Best Choices for Regularization Parameters in Learning Theory: On the Bias-Variance Problem | p. 29 |
Introduction | p. 30 |
RKHS and Regularization Parameters | p. 30 |
Estimating the Confidence | p. 32 |
Estimating the Sample Error | p. 38 |
Choosing the optimal [gamma] | p. 40 |
Final Remarks | p. 41 |
Cucker-Smale Learning Theory in Besov Spaces | p. 47 |
Introduction | p. 48 |
Cucker-Smale Functional and the Peetre K-Functional | p. 48 |
Estimates for the CS-Functional in Anisotropic Besov Spaces | p. 52 |
High-dimensional Approximation by Neural Networks | p. 69 |
Introduction | p. 70 |
Variable-basis Approximation and Optimization | p. 71 |
Maurey-Jones-Barron's Theorem | p. 73 |
Variation with respect to a Set of Functions | p. 75 |
Rates of Approximate Optimization over Variable Basis Functions | p. 77 |
Comparison with Linear Approximation | p. 79 |
Upper Bounds on Variation | p. 80 |
Lower Bounds on Variation | p. 82 |
Rates of Approximation of Real-valued Boolean Functions | p. 83 |
Functional Learning through Kernels | p. 89 |
Some Questions Regarding Machine Learning | p. 90 |
r.k.h.s Perspective | p. 91 |
Positive kernels | p. 91 |
r.k.h.s and learning in the literature | p. 91 |
Three Principles on the Nature of the Hypothesis Set | p. 92 |
The learning problem | p. 92 |
The evaluation functional | p. 93 |
Continuity of the evaluation functional | p. 93 |
Important consequence | p. 94 |
IR[superscript X], the set of the pointwise defined functions on X | p. 94 |
Reproducing Kernel Hilbert Space (r.k.h.s) | p. 95 |
Kernel and Kernel Operator | p. 97 |
How to build r.k.h.s.? | p. 97 |
Carleman operator and the regularization operator | p. 98 |
Generalization | p. 99 |
Reproducing Kernel Spaces (r.k.h.s) | p. 99 |
Evaluation spaces | p. 99 |
Reproducing kernels | p. 100 |
Representer Theorem | p. 104 |
Examples | p. 105 |
Examples in Hilbert space | p. 105 |
Other examples | p. 107 |
Conclusion | p. 107 |
Leave-one-out Error and Stability of Learning Algorithms with Applications | p. 111 |
Introduction | p. 112 |
General Observations about the Leave-one-out Error | p. 113 |
Theoretical Attempts to Justify the Use of the Leave-one-out Error | p. 116 |
Early work in non-parametric statistics | p. 116 |
Relation to VC-theory | p. 117 |
Stability | p. 118 |
Stability of averaging techniques | p. 119 |
Kernel Machines | p. 119 |
Background on kernel machines | p. 120 |
Leave-one-out error for the square loss | p. 121 |
Bounds on the leave-one-out error and stability | p. 122 |
The Use of the Leave-one-out Error in Other Learning Problems | p. 123 |
Transduction | p. 123 |
Feature selection and rescaling | p. 123 |
Discussion | p. 124 |
Sensitivity analysis, stability, and learning | p. 124 |
Open problems | p. 124 |
Regularized Least-Squares Classification | p. 131 |
Introduction | p. 132 |
The RLSC Algorithm | p. 134 |
Previous Work | p. 135 |
RLSC vs. SVM | p. 136 |
Empirical Performance of RLSC | p. 137 |
Approximations to the RLSC Algorithm | p. 139 |
Low-rank approximations for RLSC | p. 141 |
Nonlinear RLSC application: image classification | p. 142 |
Leave-one-out Bounds for RLSC | p. 146 |
Support Vector Machines: Least Squares Approaches and Extensions | p. 155 |
Introduction | p. 156 |
Least Squares SVMs for Classification and Function Estimation | p. 158 |
LS-SVM classifiers and link with kernel FDA | p. 158 |
Function estimation case and equivalence to a regularization network solution | p. 161 |
Issues of sparseness and robustness | p. 161 |
Bayesian inference of LS-SVMs and Gaussian processes | p. 163 |
Primal-dual Formulations to Kernel PCA and CCA | p. 163 |
Kernel PCA as a one-class modelling problem and a primal-dual derivation | p. 163 |
A support vector machine formulation to Kernel CCA | p. 166 |
Large Scale Methods and On-line Learning | p. 168 |
Nyström method | p. 168 |
Basis construction in the feature space using fixed size LS-SVM | p. 169 |
Recurrent Networks and Control | p. 172 |
Conclusions | p. 173 |
Extension of the [nu]-SVM Range for Classification | p. 179 |
Introduction | p. 180 |
[nu] Support Vector Classifiers | p. 181 |
Limitation in the Range of [nu] | p. 185 |
Negative Margin Minimization | p. 186 |
Extended [nu]-SVM | p. 188 |
Kernelization in the dual | p. 189 |
Kernelization in the primal | p. 191 |
Experiments | p. 191 |
Conclusions and Further Work | p. 194 |
Kernel Methods for Text Processing | p. 197 |
Introduction | p. 198 |
Overview of Kernel Methods | p. 198 |
From Bag of Words to Semantic Space | p. 199 |
Vector Space Representations | p. 201 |
Basic vector space model | p. 203 |
Generalised vector space model | p. 204 |
Semantic smoothing for vector space models | p. 204 |
Latent semantic kernels | p. 205 |
Semantic diffusion kernels | p. 207 |
Learning Semantics from Cross Language Correlations | p. 211 |
Hypertext | p. 215 |
String Matching Kernels | p. 216 |
Efficient computation of SSK | p. 219 |
n-grams: a language-independent approach | p. 220 |
Conclusions | p. 220 |
An Optimization Perspective on Kernel Partial Least Squares Regression | p. 227 |
Introduction | p. 228 |
PLS Derivation | p. 229 |
PCA regression review | p. 229 |
PLS analysis | p. 231 |
Linear PLS | p. 232 |
Final regression components | p. 234 |
Nonlinear PLS via Kernels | p. 236 |
Feature space K-PLS | p. 236 |
Direct kernel partial least squares | p. 237 |
Computational Issues in K-PLS | p. 238 |
Comparison of Kernel Regression Methods | p. 239 |
Methods | p. 239 |
Benchmark cases | p. 240 |
Data preparation and parameter tuning | p. 240 |
Results and discussion | p. 241 |
Case Study for Classification with Uneven Classes | p. 243 |
Feature Selection with K-PLS | p. 243 |
Thoughts and Conclusions | p. 245 |
Multiclass Learning with Output Codes | p. 251 |
Introduction | p. 252 |
Margin-based Learning Algorithms | p. 253 |
Output Coding for Multiclass Problems | p. 257 |
Training Error Bounds | p. 260 |
Finding Good Output Codes | p. 262 |
Conclusions | p. 263 |
Bayesian Regression and Classification | p. 267 |
Introduction | p. 268 |
Least squares regression | p. 268 |
Regularization | p. 269 |
Probabilistic models | p. 269 |
Bayesian regression | p. 271 |
Support Vector Machines | p. 272 |
The Relevance Vector Machine | p. 273 |
Model specification | p. 273 |
The effective prior | p. 275 |
Inference | p. 276 |
Making predictions | p. 277 |
Properties of the marginal likelihood | p. 278 |
Hyperparameter optimization | p. 279 |
Relevance vector machines for classification | p. 280 |
The Relevance Vector Machine in Action | p. 281 |
Illustrative synthetic data: regression | p. 281 |
Illustrative synthetic data: classification | p. 283 |
Benchmark results | p. 284 |
Discussion | p. 285 |
Bayesian Field Theory: from Likelihood Fields to Hyperfields | p. 289 |
Introduction | p. 290 |
The Bayesian framework | p. 290 |
The basic probabilistic model | p. 290 |
Bayesian decision theory and predictive density | p. 291 |
Bayes' theorem: from prior and likelihood to the posterior | p. 293 |
Likelihood models | p. 295 |
Log-probabilities, energies, and density estimation | p. 295 |
Regression | p. 297 |
Inverse quantum theory | p. 298 |
Prior models | p. 299 |
Gaussian prior factors and approximate symmetries | p. 299 |
Hyperparameters and hyperfields | p. 303 |
Hyperpriors for hyperfields | p. 308 |
Auxiliary fields | p. 309 |
Summary | p. 312 |
Bayesian Smoothing and Information Geometry | p. 319 |
Introduction | p. 320 |
Problem Statement | p. 321 |
Probability-Based Inference | p. 322 |
Information-Based Inference | p. 324 |
Single-Case Geometry | p. 327 |
Average-Case Geometry | p. 331 |
Similar-Case Modeling | p. 332 |
Locally Weighted Geometry | p. 336 |
Concluding Remarks | p. 337 |
Nonparametric Prediction | p. 341 |
Introduction | p. 342 |
Prediction for Squared Error | p. 342 |
Prediction for 0-1 Loss: Pattern Recognition | p. 346 |
Prediction for Log Utility: Portfolio Selection | p. 348 |
Recent Advances in Statistical Learning Theory | p. 357 |
Introduction | p. 358 |
Problem Formulations | p. 358 |
Uniform convergence of empirical means | p. 358 |
Probably approximately correct learning | p. 360 |
Summary of "Classical" Results | p. 362 |
Fixed distribution case | p. 362 |
Distribution-free case | p. 364 |
Recent Advances | p. 365 |
Intermediate families of probability measures | p. 365 |
Learning with prior information | p. 366 |
Learning with Dependent Inputs | p. 367 |
Problem formulations | p. 367 |
Definition of [beta]-mixing | p. 368 |
UCEM and PAC learning with [beta]-mixing inputs | p. 369 |
Applications to Learning with Inputs Generated by a Markov Chain | p. 371 |
Conclusions | p. 372 |
Neural Networks in Measurement Systems (an engineering view) | p. 375 |
Introduction | p. 376 |
Measurement and Modeling | p. 377 |
Neural Networks | p. 383 |
Support Vector Machines | p. 389 |
The Nature of Knowledge, Prior Information | p. 393 |
Questions Concerning Implementation | p. 394 |
Conclusions | p. 396 |
List of participants | p. 403 |
Subject Index | p. 411 |
Author Index | p. 415 |
Table of Contents provided by Rittenhouse. All Rights Reserved.
ISBN: 9781586033415
ISBN-10: 1586033417
Series: NATO Science Series: Computer & Systems Sciences
Published: 1st May 2003
Format: Hardcover
Language: English
Number of Pages: 440
Audience: Professional and Scholarly
Publisher: IOS Press
Country of Publication: US
Dimensions (cm): 23.4 x 15.6 x 2.5
Weight (kg): 0.79
Shipping
| Postcode type | Standard Shipping | Express Shipping |
|---|---|---|
| Metro postcodes | $9.99 | $14.95 |
| Regional postcodes | $9.99 | $14.95 |
| Rural postcodes | $9.99 | $14.95 |
How to return your order
At Booktopia, we offer hassle-free returns in accordance with our returns policy. If you wish to return an item, please get in touch with Booktopia Customer Care.
Additional postage charges may be applicable.
Defective items
If there is a problem with any of the items received for your order then the Booktopia Customer Care team is ready to assist you.
For more info please visit our Help Centre.
You Can Find This Book In

Data Governance: The Definitive Guide
People, Processes, and Tools to Operationalize Data Trustworthiness
Paperback
RRP $152.00
$73.75

Machine Learning with Python Cookbook
2nd Edition - Practical Solutions from Preprocessing to Deep Learning
Paperback
RRP $152.00
$73.75

Architecting Data and Machine Learning Platforms
Enable Analytics and Ai-Driven Innovation in the Cloud
Paperback
RRP $125.50
$60.90

Predictive Analytics for the Modern Enterprise
A Practitioner's Guide to Designing and Implementing Solutions
Paperback
RRP $125.50
$60.90

Graph-Powered Analytics and Machine Learning with TigerGraph
Driving Business Outcomes with Connected Data
Paperback
RRP $125.50
$60.90
This product is categorised by
- Non-Fiction > Computing & I.T. > Computer Science > Mathematical Theory of Computation
- Non-Fiction > Computing & I.T. > Computer Science > Artificial Intelligence > Machine Learning
- Non-Fiction > Computing & I.T. > Computer Science > Artificial Intelligence > Neural Networks & Fuzzy Systems
- Non-Fiction > Education > Education Equipment & Technology
- Non-Fiction > Education > Teaching Skills & Techniques