- Full Description
Data Mining, the automatic extraction of implicit and potentially useful information from data, is increasingly used in commercial, scientific and other application areas. This book explains and explores the principal techniques of Data Mining: for classification, generation of association rules and clustering. It is written for readers without a strong background in mathematics or statistics and focuses on detailed examples and explanations of the algorithms given. This should prove of value to readers of all kinds, from those whose only use of data mining techniques will be via commercial packages right through to academic researchers. This book aims to help the general reader develop the necessary understanding to use commercial data mining packages discriminatingly, as well as enabling the advanced reader to understand or contribute to future technical advances in the field. Each chapter has practical exercises to enable readers to check their progress. A full glossary of technical terms used is included.
- Table of Contents
- Introduction to Data Mining.
- Data for Data Mining.
- Introduction to Classification: Naive Bayes and Nearest Neighbour.
- Using Decision Trees for Classification.
- Decision Tree Induction: Using Entropy for Attribute Selection.
- Decision Tree Induction: Using Frequency Tables for Attribute Selection.
- Estimating the Predictive Accuracy of a Classifier.
- Continuous Attributes.
- Avoiding Overfitting of Decision Trees.
- More about Entropy.
- Inducing Modular Rules for Classification.
- Measuring the Performance of a Classifier.
- Association Rule Mining I.
- Association Rule Mining II.
- Text Mining.
- Appendix A: Essential Mathematics.
- Appendix B: Datasets.
- Appendix C: Sources of Further Information.
- Appendix D: Glossary and Notation.
- Appendix E: Solutions to Self-assessment Exercises.
If you think you have found an error in this book, please let us know by emailing email@example.com. Any confirmed errata are listed below, so you can check whether your concern has already been addressed. No errata are currently published.