Overview

Authors:

Sholom M. Weiss,
Nitin Indurkhya,
Tong Zhang

Sholom M. Weiss
T.J. Watson Research Center, IBM Corporation, Yorktown Heights, USA
Nitin Indurkhya
School of Computer Science & Engineering, University of New South Wales, Sydney, Australia
Tong Zhang
Dept. Statistics, Rutgers University, Piscataway, USA

Presents a comprehensive, practical and easy-to-read introduction to text mining
Includes chapter summaries, useful historical and bibliographic remarks, and classroom-tested exercises for each chapter
Provides several descriptive case studies that take readers from problem description to systems deployment in the real world
Includes supplementary material: sn.pub/extras

Part of the book series: Texts in Computer Science (TCS)

Table of contents (9 chapters)

Front Matter (pages I-XIII)
Overview of Text Mining (pages 1-12)
From Textual Information to Numerical Vectors (pages 13-38)
Using Text for Prediction (pages 39-73)
Information Retrieval and Text Mining (pages 75-90)
Finding Structure in a Document Collection (pages 91-112)
Looking for Information in Documents (pages 113-139)
Data Sources for Prediction: Databases, Hybrid Data and the Web (pages 141-155)
Case Studies (pages 157-188)
Emerging Directions (pages 189-205)
Back Matter (pages 207-226)

All chapters are by Sholom M. Weiss, Nitin Indurkhya and Tong Zhang.

About this book

One consequence of the pervasive use of computers is that most documents originate in digital form. Widespread use of the Internet makes them readily available. Text mining, the process of analyzing unstructured natural-language text, is concerned with how to extract information from these documents.

Developed from the authors’ highly successful Springer reference on text mining, Fundamentals of Predictive Text Mining is an introductory textbook and guide to this rapidly evolving field. Integrating topics spanning the varied disciplines of data mining, machine learning, databases, and computational linguistics, the book also provides practical advice for text mining. In-depth discussions are presented on issues of document classification, information retrieval, clustering and organizing documents, information extraction, web-based data-sourcing, and prediction and evaluation. Background on data mining is beneficial, but not essential. Where advanced concepts are discussed that require mathematical maturity for a proper understanding, intuitive explanations are also provided for less advanced readers.

Topics and features:
- Presents a comprehensive, practical and easy-to-read introduction to text mining
- Includes chapter summaries, useful historical and bibliographic remarks, and classroom-tested exercises for each chapter
- Explores the application and utility of each method, as well as the optimum techniques for specific scenarios
- Provides several descriptive case studies that take readers from problem description to systems deployment in the real world
- Includes access to industrial-strength text-mining software that runs on any computer
- Describes methods that rely on basic statistical techniques, thus allowing for relevance to all languages (not just English)
- Contains links to free downloadable software and other supplementary instruction material

Fundamentals of Predictive Text Mining is an essential resource for IT professionals and managers, as well as a key text for advanced undergraduate computer science students and beginning graduate students.

Dr. Sholom M. Weiss is a Research Staff Member with the IBM Predictive Modeling group, in Yorktown Heights, New York, and Professor Emeritus of Computer Science at Rutgers University. Dr. Nitin Indurkhya is Professor at the School of Computer Science and Engineering, University of New South Wales, Australia, as well as founder and president of data-mining consulting company Data-Miner Pty Ltd. Dr. Tong Zhang is Associate Professor at the Department of Statistics and Biostatistics at Rutgers University, New Jersey.

Reviews

From the reviews:

"This is a practical, up-to-date account of the various techniques for dealing intelligently with free text. It would be an invaluable resource to any advanced undergraduate student interested in information retrieval." (Patrick Oladimeji, Times Higher Education, 26 May 2011)

“This is a well-written and interesting text for information technology (IT) professionals and computer science students. It seems to address all of the topics related to the fields that, when integrated, are known as knowledge engineering. … Without a doubt, the authors’ experience in the field makes this book a successful contribution to the literature that targets the interests of the IT community and beyond.” (Jolanta Mizera-Pietraszko, ACM Computing Reviews, June, 2011)

“This well-written work, which offers a unifying view of text mining through a systematic introduction to solving real-world problems. … The uniqueness of this book is the recourse to the prediction problem, which, by providing practical advice, allows for the integration of related topics. … The book is accompanied by a software implementation of the main algorithmic practices introduced. This is the icing on the cake for both beginners and expert readers … . This is the book … I have always wanted to read.” (Ernesto D’Avenzo, ACM Computing Reviews, August, 2012)

Bibliographic Information

Book Title: Fundamentals of Predictive Text Mining
Authors: Sholom M. Weiss, Nitin Indurkhya, Tong Zhang
Series Title: Texts in Computer Science
DOI: https://doi.org/10.1007/978-1-84996-226-1
Publisher: Springer London
eBook Packages: Computer Science, Computer Science (R0)
Copyright Information: Springer-Verlag London Limited 2010
Softcover ISBN: 978-1-4471-2565-5 (published 05 September 2012)
eBook ISBN: 978-1-84996-226-1 (published 14 June 2010)
Series ISSN: 1868-0941
Series E-ISSN: 1868-095X
Edition Number: 1
Number of Pages: XIV, 226
Topics: Data Mining and Knowledge Discovery, Natural Language Processing (NLP), Computer Appl. in Administrative Data Processing, Information Storage and Retrieval, Database Management
