
An Introduction to Information Retrieval
By: Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze
Hardcover | 25 August 2008
At a Glance
496 Pages
25.4 x 17.78 x 2.87 cm
Hardcover
RRP $113.95
$100.75
12% OFF
or 4 interest-free payments of $25.19
Ships in 5 to 7 business days
Written from a computer science perspective by three leading experts in the field, this book gives an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections.
All the important ideas are explained using examples and figures, making it perfect for introductory courses in information retrieval for advanced undergraduates and graduate students in computer science.
Based on feedback from extensive classroom experience, the book has been carefully structured to make teaching more natural and effective. Although originally designed as the primary text for a graduate or advanced undergraduate course in information retrieval, the book will also interest researchers and professionals.
About the Authors
Christopher Manning is an Associate Professor of Computer Science and Linguistics at Stanford University. His research concentrates on probabilistic models of language and statistical natural language processing, information extraction, text understanding and text mining.
Dr Prabhakar Raghavan is Head of Yahoo! Research and a Consulting Professor of Computer Science at Stanford University.
Dr Hinrich Schütze holds the Chair of Theoretical Computational Linguistics at the Institute for Natural Language Processing, University of Stuttgart.
Table of Contents
| Table of Notation | p. xi |
| Preface | p. xv |
| Boolean retrieval | p. 1 |
| An example information retrieval problem | p. 3 |
| A first take at building an inverted index | p. 6 |
| Processing Boolean queries | p. 9 |
| The extended Boolean model versus ranked retrieval | p. 13 |
| References and further reading | p. 16 |
| The term vocabulary and postings lists | p. 18 |
| Document delineation and character sequence decoding | p. 18 |
| Determining the vocabulary of terms | p. 21 |
| Faster postings list intersection via skip pointers | p. 33 |
| Positional postings and phrase queries | p. 36 |
| References and further reading | p. 43 |
| Dictionaries and tolerant retrieval | p. 45 |
| Search structures for dictionaries | p. 45 |
| Wildcard queries | p. 48 |
| Spelling correction | p. 52 |
| Phonetic correction | p. 58 |
| References and further reading | p. 59 |
| Index construction | p. 61 |
| Hardware basics | p. 62 |
| Blocked sort-based indexing | p. 63 |
| Single-pass in-memory indexing | p. 66 |
| Distributed indexing | p. 68 |
| Dynamic indexing | p. 71 |
| Other types of indexes | p. 73 |
| References and further reading | p. 76 |
| Index compression | p. 78 |
| Statistical properties of terms in information retrieval | p. 79 |
| Dictionary compression | p. 82 |
| Postings file compression | p. 87 |
| References and further reading | p. 97 |
| Scoring, term weighting, and the vector space model | p. 100 |
| Parametric and zone indexes | p. 101 |
| Term frequency and weighting | p. 107 |
| The vector space model for scoring | p. 110 |
| Variant tf-idf functions | p. 116 |
| References and further reading | p. 122 |
| Computing scores in a complete search system | p. 124 |
| Efficient scoring and ranking | p. 124 |
| Components of an information retrieval system | p. 132 |
| Vector space scoring and query operator interaction | p. 136 |
| References and further reading | p. 137 |
| Evaluation in information retrieval | p. 139 |
| Information retrieval system evaluation | p. 140 |
| Standard test collections | p. 141 |
| Evaluation of unranked retrieval sets | p. 142 |
| Evaluation of ranked retrieval results | p. 145 |
| Assessing relevance | p. 151 |
| A broader perspective: System quality and user utility | p. 154 |
| Results snippets | p. 157 |
| References and further reading | p. 159 |
| Relevance feedback and query expansion | p. 162 |
| Relevance feedback and pseudo relevance feedback | p. 163 |
| Global methods for query reformulation | p. 173 |
| References and further reading | p. 177 |
| XML retrieval | p. 178 |
| Basic XML concepts | p. 180 |
| Challenges in XML retrieval | p. 183 |
| A vector space model for XML retrieval | p. 188 |
| Evaluation of XML retrieval | p. 192 |
| Text-centric versus data-centric XML retrieval | p. 196 |
| References and further reading | p. 198 |
| Probabilistic information retrieval | p. 201 |
| Review of basic probability theory | p. 202 |
| The probability ranking principle | p. 203 |
| The binary independence model | p. 204 |
| An appraisal and some extensions | p. 212 |
| References and further reading | p. 216 |
| Language models for information retrieval | p. 218 |
| Language models | p. 218 |
| The query likelihood model | p. 223 |
| Language modeling versus other approaches in information retrieval | p. 229 |
| Extended language modeling approaches | p. 230 |
| References and further reading | p. 232 |
| Text classification and Naive Bayes | p. 234 |
| The text classification problem | p. 237 |
| Naive Bayes text classification | p. 238 |
| The Bernoulli model | p. 243 |
| Properties of Naive Bayes | p. 245 |
| Feature selection | p. 251 |
| Evaluation of text classification | p. 258 |
| References and further reading | p. 264 |
| Vector space classification | p. 266 |
| Document representations and measures of relatedness in vector spaces | p. 267 |
| Rocchio classification | p. 269 |
| k nearest neighbor | p. 273 |
| Linear versus nonlinear classifiers | p. 277 |
| Classification with more than two classes | p. 281 |
| The bias-variance tradeoff | p. 284 |
| References and further reading | p. 291 |
| Support vector machines and machine learning on documents | p. 293 |
| Support vector machines: The linearly separable case | p. 294 |
| Extensions to the support vector machine model | p. 300 |
| Issues in the classification of text documents | p. 307 |
| Machine-learning methods in ad hoc information retrieval | p. 314 |
| References and further reading | p. 318 |
| Flat clustering | p. 321 |
| Clustering in information retrieval | p. 322 |
| Problem statement | p. 326 |
| Evaluation of clustering | p. 327 |
| K-means | p. 331 |
| Model-based clustering | p. 338 |
| References and further reading | p. 343 |
| Hierarchical clustering | p. 346 |
| Hierarchical agglomerative clustering | p. 347 |
| Single-link and complete-link clustering | p. 350 |
| Group-average agglomerative clustering | p. 356 |
| Centroid clustering | p. 358 |
| Optimality of hierarchical agglomerative clustering | p. 360 |
| Divisive clustering | p. 362 |
| Cluster labeling | p. 363 |
| Implementation notes | p. 365 |
| References and further reading | p. 367 |
| Matrix decompositions and latent semantic indexing | p. 369 |
| Linear algebra review | p. 369 |
| Term-document matrices and singular value decompositions | p. 373 |
| Low-rank approximations | p. 376 |
| Latent semantic indexing | p. 378 |
| References and further reading | p. 383 |
| Web search basics | p. 385 |
| Background and history | p. 385 |
| Web characteristics | p. 387 |
| Advertising as the economic model | p. 392 |
| The search user experience | p. 395 |
| Index size and estimation | p. 396 |
| Near-duplicates and shingling | p. 400 |
| References and further reading | p. 404 |
| Web crawling and indexes | p. 405 |
| Overview | p. 405 |
| Crawling | p. 406 |
| Distributing indexes | p. 415 |
| Connectivity servers | p. 416 |
| References and further reading | p. 419 |
| Link analysis | p. 421 |
| The Web as a graph | p. 422 |
| PageRank | p. 424 |
| Hubs and authorities | p. 433 |
| References and further reading | p. 439 |
| Bibliography | p. 441 |
| Index | p. 469 |
| Table of Contents provided by Ingram. All Rights Reserved. |
ISBN: 9780521865715
ISBN-10: 0521865719
Published: 25th August 2008
Format: Hardcover
Language: English
Number of Pages: 496
Audience: Professional and Scholarly
Publisher: Cambridge University Press
Country of Publication: GB
Dimensions (cm): 25.4 x 17.78 x 2.87
Weight (kg): 1.09
Shipping
| | Standard Shipping | Express Shipping |
|---|---|---|
| Metro postcodes: | $9.99 | $14.95 |
| Regional postcodes: | $9.99 | $14.95 |
| Rural postcodes: | $9.99 | $14.95 |
Orders over $79.00 qualify for free shipping.
How to return your order
At Booktopia, we offer hassle-free returns in accordance with our returns policy. If you wish to return an item, please get in touch with Booktopia Customer Care.
Additional postage charges may be applicable.
Defective items
If there is a problem with any of the items received for your order then the Booktopia Customer Care team is ready to assist you.
For more info please visit our Help Centre.