
Lecture Notes In Data Mining
By: Michael W Berry (Editor), Murray Browne (Editor)
Hardcover | 1 September 2006
At a Glance
236 Pages
23.5 x 15.88 x 1.27
Hardcover
RRP $212.99
$191.75
10% OFF
Ships in 15 to 25 business days
The continual explosion of information technology and the need for better data collection and management methods have made data mining an even more relevant topic of study. Books on data mining tend to be either broad and introductory or focused on some very specific technical aspect of the field.

This book is a series of seventeen edited "student-authored lectures" that explore in depth the core of data mining (classification, clustering and association rules) by offering overviews that include both analysis and insight.

The initial chapters lay a framework of data mining techniques by explaining some of the basics, such as applications of Bayes Theorem, similarity measures, and decision trees. Before focusing on the pillars of classification, clustering and association rules, the book also considers alternative candidates such as point estimation and genetic algorithms.

The book's discussion of classification includes an introduction to decision tree algorithms, rule-based algorithms (a popular alternative to decision trees) and distance-based algorithms. Five of the lecture-chapters are devoted to the concept of clustering, or unsupervised classification. The functionality of hierarchical and partitional clustering algorithms is covered, as are the efficient and scalable clustering algorithms used in large databases. Association rules are discussed in terms of basic algorithms, parallel and distributed algorithms, and advanced measures that help determine the value of association rules. The final chapter discusses algorithms for spatial data mining.
| Preface | p. V |
| Point Estimation Algorithms | p. 1 |
| Introduction | p. 1 |
| Motivation | p. 2 |
| Methods of Point Estimation | p. 2 |
| The Method of Moments | p. 2 |
| Maximum Likelihood Estimation | p. 4 |
| The Expectation-Maximization Algorithm | p. 6 |
| Measures of Performance | p. 8 |
| Bias | p. 9 |
| Mean Squared Error | p. 9 |
| Standard Error | p. 10 |
| Efficiency | p. 10 |
| Consistency | p. 11 |
| The Jackknife Method | p. 11 |
| Summary | p. 13 |
| Applications of Bayes Theorem | p. 15 |
| Introduction | p. 15 |
| Motivation | p. 16 |
| The Bayes Approach for Classification | p. 17 |
| Statistical Framework for Classification | p. 17 |
| Bayesian Methodology | p. 20 |
| Examples | p. 22 |
| Example 1: Numerical Methods | p. 22 |
| Example 2: Bayesian Networks | p. 24 |
| Summary | p. 25 |
| Similarity Measures | p. 27 |
| Introduction | p. 27 |
| Motivation | p. 28 |
| Classic Similarity Measures | p. 28 |
| Dice | p. 30 |
| Overlap | p. 30 |
| Jaccard | p. 31 |
| Asymmetric | p. 31 |
| Cosine | p. 31 |
| Other Measures | p. 32 |
| Dissimilarity | p. 32 |
| Example | p. 33 |
| Current Applications | p. 35 |
| Multi-Dimensional Modeling | p. 35 |
| Hierarchical Clustering | p. 36 |
| Bioinformatics | p. 37 |
| Summary | p. 38 |
| Decision Trees | p. 39 |
| Introduction | p. 39 |
| Motivation | p. 41 |
| Decision Tree Algorithms | p. 42 |
| ID3 Algorithm | p. 43 |
| Evaluating Tests | p. 43 |
| Selection of Splitting Variable | p. 46 |
| Stopping Criteria | p. 46 |
| Tree Pruning | p. 47 |
| Stability of Decision Trees | p. 47 |
| Example: Classification of University Students | p. 48 |
| Applications of Decision Tree Algorithms | p. 49 |
| Summary | p. 50 |
| Genetic Algorithms | p. 53 |
| Introduction | p. 53 |
| Motivation | p. 54 |
| Fundamentals | p. 55 |
| Encoding Schema and Initialization | p. 56 |
| Fitness Evaluation | p. 57 |
| Selection | p. 58 |
| Crossover | p. 59 |
| Mutation | p. 61 |
| Iterative Evolution | p. 62 |
| Example: The Traveling-Salesman | p. 63 |
| Current and Future Applications | p. 65 |
| Summary | p. 66 |
| Classification: Distance-based Algorithms | p. 67 |
| Introduction | p. 67 |
| Motivation | p. 68 |
| Distance Functions | p. 68 |
| City Block Distance | p. 69 |
| Euclidean Distance | p. 70 |
| Tangent Distance | p. 70 |
| Other Distances | p. 71 |
| Classification Algorithms | p. 72 |
| A Simple Approach Using Mean Vector | p. 72 |
| K-Nearest Neighbors | p. 74 |
| Current Applications | p. 76 |
| Summary | p. 77 |
| Decision Tree-based Algorithms | p. 79 |
| Introduction | p. 79 |
| Motivation | p. 80 |
| ID3 | p. 80 |
| C4.5 | p. 82 |
| C5.0 | p. 83 |
| CART | p. 84 |
| Summary | p. 85 |
| Covering (Rule-based) Algorithms | p. 87 |
| Introduction | p. 87 |
| Motivation | p. 88 |
| Classification Rules | p. 88 |
| Covering (Rule-based) Algorithms | p. 90 |
| 1R Algorithm | p. 91 |
| PRISM Algorithm | p. 94 |
| Other Algorithms | p. 96 |
| Applications of Covering Algorithms | p. 97 |
| Summary | p. 97 |
| Clustering: An Overview | p. 99 |
| Introduction | p. 99 |
| Motivation | p. 100 |
| The Clustering Process | p. 100 |
| Pattern Representation | p. 101 |
| Pattern Proximity Measures | p. 102 |
| Clustering Algorithms | p. 103 |
| Hierarchical Algorithms | p. 103 |
| Partitional Algorithms | p. 105 |
| Data Abstraction | p. 105 |
| Cluster Assessment | p. 105 |
| Current Applications | p. 107 |
| Summary | p. 107 |
| Clustering: Hierarchical Algorithms | p. 109 |
| Introduction | p. 109 |
| Motivation | p. 110 |
| Agglomerative Hierarchical Algorithms | p. 111 |
| The Single Linkage Method | p. 112 |
| The Complete Linkage Method | p. 114 |
| The Average Linkage Method | p. 116 |
| The Centroid Method | p. 116 |
| The Ward Method | p. 117 |
| Divisive Hierarchical Algorithms | p. 118 |
| Summary | p. 120 |
| Clustering: Partitional Algorithms | p. 121 |
| Introduction | p. 121 |
| Motivation | p. 122 |
| Partitional Clustering Algorithms | p. 122 |
| Squared Error Clustering | p. 122 |
| Nearest Neighbor Clustering | p. 126 |
| Partitioning Around Medoids | p. 127 |
| Self-Organizing Maps | p. 131 |
| Current Applications | p. 132 |
| Summary | p. 132 |
| Clustering: Large Databases | p. 133 |
| Introduction | p. 133 |
| Motivation | p. 134 |
| Requirements for Scalable Clustering | p. 134 |
| Major Approaches to Scalable Clustering | p. 135 |
| The Divide-and-Conquer Approach | p. 135 |
| Incremental Clustering Approach | p. 135 |
| Parallel Approach to Clustering | p. 136 |
| BIRCH | p. 137 |
| DBSCAN | p. 139 |
| CURE | p. 140 |
| Summary | p. 141 |
| Clustering: Categorical Attributes | p. 143 |
| Introduction | p. 143 |
| Motivation | p. 144 |
| ROCK Clustering Algorithm | p. 145 |
| Computation of Links | p. 146 |
| Goodness Measure | p. 147 |
| Miscellaneous Issues | p. 148 |
| Example | p. 148 |
| COOLCAT Clustering Algorithm | p. 149 |
| CACTUS Clustering Algorithm | p. 151 |
| Summary | p. 152 |
| Association Rules: An Overview | p. 153 |
| Introduction | p. 153 |
| Motivation | p. 154 |
| Association Rule Process | p. 154 |
| Terminology and Notation | p. 154 |
| From Data to Association Rules | p. 157 |
| Large Itemset Discovery Algorithms | p. 158 |
| Apriori | p. 158 |
| Sampling | p. 160 |
| Partitioning | p. 162 |
| Summary | p. 163 |
| Association Rules: Parallel and Distributed Algorithms | p. 169 |
| Introduction | p. 169 |
| Motivation | p. 170 |
| Parallel and Distributed Algorithms | p. 171 |
| Data Parallel Algorithms on Distributed Memory Systems | p. 172 |
| Count Distribution (CD) | p. 172 |
| Task Parallel Algorithms on Distributed Memory Systems | p. 174 |
| Data Distribution (DD) | p. 174 |
| Candidate Distribution (CaD) | p. 174 |
| Intelligent Data Distribution (IDD) | p. 175 |
| Data Parallel Algorithms on Shared Memory Systems | p. 176 |
| Common Candidate Partitioned Database (CCPD) | p. 176 |
| Task Parallel Algorithms on Shared Memory Systems | p. 177 |
| Asynchronous Parallel Mining (APM) | p. 177 |
| Discussion of Parallel Algorithms | p. 177 |
| Summary | p. 179 |
| Association Rules: Advanced Techniques and Measures | p. 183 |
| Introduction | p. 183 |
| Motivation | p. 184 |
| Incremental Rules | p. 184 |
| Generalized Association Rules | p. 185 |
| Quantitative Association Rules | p. 187 |
| Correlation Rules | p. 188 |
| Measuring the Quality of Association Rules | p. 189 |
| Lift | p. 189 |
| Conviction | p. 189 |
| Chi-Squared Test | p. 190 |
| Summary | p. 191 |
| Spatial Mining: Techniques and Algorithms | p. 193 |
| Introduction and Motivation | p. 193 |
| Concept Hierarchies and Generalization | p. 194 |
| Spatial Rules | p. 196 |
| STING | p. 197 |
| Spatial Classification | p. 199 |
| ID3 Extension | p. 200 |
| Two-Step Method | p. 201 |
| Spatial Clustering | p. 202 |
| CLARANS | p. 202 |
| GDBSCAN | p. 203 |
| DBCLASD | p. 204 |
| Summary | p. 204 |
| References | p. 207 |
| Index | p. 219 |
| Table of Contents provided by Ingram. All Rights Reserved. |
ISBN: 9789812568021
ISBN-10: 9812568026
Published: 1st September 2006
Format: Hardcover
Number of Pages: 236
Audience: Professional and Scholarly
Publisher: World Scientific Publishing Co Pte Ltd
Country of Publication: GB
Dimensions (cm): 23.5 x 15.88 x 1.27
Weight (kg): 0.5