Overview

Authors:

Sholom M. Weiss, Nitin Indurkhya, Tong Zhang, Fred J. Damerau

Sholom M. Weiss
TJ Watson Labs, IBM Research, Yorktown Heights, USA

Nitin Indurkhya
School of Computer Science and Engineering, University of New South Wales, Sydney, Australia

Tong Zhang
TJ Watson Labs, IBM Research, Yorktown Heights, USA

Fred J. Damerau
TJ Watson Labs, IBM Research, Yorktown Heights, USA

Provides an authoritative, comprehensive survey of the concepts, principles, and methods of text mining (the search and retrieval of nonnumeric data), which is becoming increasingly critical at companies and organizations as they attempt to fully utilize their document/textual databases
Includes supplementary material: sn.pub/extras

29k Accesses
117 Citations
5 Altmetric

Table of contents (8 chapters)

Front Matter: pages i-xii
Overview of Text Mining: pages 1-13
From Textual Information to Numerical Vectors: pages 15-46
Using Text for Prediction: pages 47-84
Information Retrieval and Text Mining: pages 85-102
Finding Structure in a Document Collection: pages 103-128
Looking for Information in Documents: pages 129-156
Case Studies: pages 157-195
Emerging Directions: pages 197-211
Back Matter: pages 213-237

About this book

Data mining is a mature technology. The prediction problem, looking for predictive patterns in data, has been widely studied. Strong methods are available to the practitioner. These methods process structured numerical information, where uniform measurements are taken over a sample of data. Text is often described as unstructured information. So, it would seem, text and numerical data are different, requiring different methods. Or are they? In our view, a prediction problem can be solved by the same methods, whether the data are structured numerical measurements or unstructured text. Text and documents can be transformed into measured values, such as the presence or absence of words, and the same methods that have proven successful for predictive data mining can be applied to text. Yet, there are key differences. Evaluation techniques must be adapted to the chronological order of publication and to alternative measures of error. Because the data are documents, more specialized analytical methods may be preferred for text. Moreover, the methods must be modified to accommodate very high dimensions: tens of thousands of words and documents. Still, the central themes are similar.
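
As a rough illustration of the idea described above, the following Python sketch turns documents into presence/absence word vectors and evaluates a simple learner on the most recently published documents. It is not taken from the book: the toy corpus, the function names, and the choice of a nearest-neighbour rule as the stand-in prediction method are all assumptions made for this example.

```python
# Toy corpus, invented for illustration: (publication order, text, label).
DOCS = [
    (1, "stocks rise as markets rally", "finance"),
    (2, "team wins the final match", "sports"),
    (3, "markets fall on weak earnings", "finance"),
    (4, "coach praises team after match", "sports"),
    (5, "earnings lift stocks", "finance"),
    (6, "team loses match", "sports"),
]


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def build_vocabulary(texts):
    """Return the sorted list of distinct words in the given texts."""
    return sorted({word for text in texts for word in tokenize(text)})


def to_binary_vector(text, vocabulary):
    """Encode a text as 1 (word present) or 0 (word absent) per vocabulary word."""
    present = set(tokenize(text))
    return [1 if word in present else 0 for word in vocabulary]


def chronological_split(docs, n_train):
    """Train on the n_train earliest documents, test on the later ones."""
    ordered = sorted(docs, key=lambda doc: doc[0])
    return ordered[:n_train], ordered[n_train:]


def predict(vector, train_vectors, train_labels):
    """Stand-in learner: label of the training vector sharing the most present words."""
    overlaps = [sum(a & b for a, b in zip(vector, tv)) for tv in train_vectors]
    best = max(range(len(overlaps)), key=overlaps.__getitem__)
    return train_labels[best]


train, test = chronological_split(DOCS, n_train=4)

# The vocabulary comes from the training documents only, so words that
# first appear in later documents are simply ignored at prediction time.
vocabulary = build_vocabulary(text for _, text, _ in train)
train_vectors = [to_binary_vector(text, vocabulary) for _, text, _ in train]
train_labels = [label for _, _, label in train]

predictions = [
    predict(to_binary_vector(text, vocabulary), train_vectors, train_labels)
    for _, text, _ in test
]
errors = sum(p != label for p, (_, _, label) in zip(predictions, test))
print(predictions, "error rate:", errors / len(test))
```

Once the documents are vectors, any standard prediction method could replace the nearest-neighbour rule. Splitting by publication order rather than at random mimics predicting on documents that appear after the training data was collected. With a realistic corpus the vocabulary would run to tens of thousands of words, so a sparse representation would likely be preferable to the dense lists used here.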

Bibliographic Information

Book Title: Text Mining
Book Subtitle: Predictive Methods for Analyzing Unstructured Information
Authors: Sholom M. Weiss, Nitin Indurkhya, Tong Zhang, Fred J. Damerau
DOI: https://doi.org/10.1007/978-0-387-34555-0
Publisher: Springer New York, NY
eBook Packages: Computer Science, Computer Science (R0)
Copyright Information: Springer Science+Business Media, LLC, part of Springer Nature 2005
Hardcover ISBN: 978-0-387-95433-2
Softcover ISBN: 978-1-4419-2996-9
eBook ISBN: 978-0-387-34555-0
Edition Number: 1
Number of Pages: XII, 237
Topics: Data Mining and Knowledge Discovery, Information Systems and Communication Service, Information Storage and Retrieval, Natural Language Processing (NLP), Computer Appl. in Administrative Data Processing, Database Management
