Name: Machine Learning with PySpark
ISBN: 978-1-4842-4131-8

Authors:

Pramod Singh (Bangalore, India)

Covers all PySpark machine learning models including PySpark advanced methods
Contains practical applications of machine learning algorithms
Presents advanced feature engineering techniques for machine learning models

Table of contents (9 chapters)

Front Matter (pages i-xviii)
Evolution of Data (pages 1-10)
Introduction to Machine Learning (pages 11-21)
Data Processing (pages 23-42)
Linear Regression (pages 43-64)
Logistic Regression (pages 65-98)
Random Forests (pages 99-122)
Recommender Systems (pages 123-157)
Clustering (pages 159-190)
Natural Language Processing (pages 191-218)
Back Matter (pages 219-223)

About this book

Build machine learning models, natural language processing applications, and recommender systems with PySpark to solve various business challenges. This book starts with the fundamentals of Spark and its evolution and then covers the entire spectrum of traditional machine learning algorithms along with natural language processing and recommender systems using PySpark.

Machine Learning with PySpark shows you how to build supervised machine learning models such as linear regression, logistic regression, decision trees, and random forest. You’ll also see unsupervised machine learning models such as K-means and hierarchical clustering. A major portion of the book focuses on feature engineering to create useful features with PySpark to train the machine learning models. The natural language processing section covers text processing, text mining, and embedding for classification.

After reading this book, you will understand how to use PySpark’s machine learning library to build and train various machine learning models. Additionally, you’ll become comfortable with related PySpark components, such as data ingestion, data processing, and data analysis, that you can use to develop data-driven intelligent applications.

What You Will Learn

Build a spectrum of supervised and unsupervised machine learning algorithms
Implement machine learning algorithms with Spark MLlib libraries
Develop a recommender system with Spark MLlib libraries
Handle issues related to feature engineering, class balance, bias and variance, and cross-validation to build an optimally fitted model

Who This Book Is For

Data science and machine learning professionals.

Authors and Affiliations

Bangalore, India

Pramod Singh

About the author

Pramod Singh is an established data scientist with over eight years of experience in data and in solving business challenges. He has worked at organizations such as Infosys, Tally, and SapientRazorfish. He is also the president of a data science meet-up group and a regular speaker at various webinars. He recently spoke at a major conference, GIDS 2018, where he presented a well-received session on “Sequence Embedding in Spark”. He also has an online Udemy course on machine learning.

Bibliographic Information

Book Title: Machine Learning with PySpark
Book Subtitle: With Natural Language Processing and Recommender Systems
Authors: Pramod Singh
DOI: https://doi.org/10.1007/978-1-4842-4131-8
Publisher: Apress, Berkeley, CA
eBook Packages: Professional and Applied Computing, Apress Access Books, Professional and Applied Computing (R0)
eBook ISBN: 978-1-4842-4131-8
Published: 14 December 2018
Edition Number: 1
Number of Pages: XVIII, 223
Number of Illustrations: 149 b/w illustrations, 1 illustration in colour
Topics: Artificial Intelligence, Python, Big Data/Analytics, Open Source
