Authors: Raju Kumar Mishra
- Presents advanced features of PySpark and code optimization techniques
- Covers SparkSQL, Spark Streaming, Spark MLlib, and GraphFrames
- Discusses and demonstrates Data Science and Big Data processing with PySpark MLlib
Table of contents (9 chapters)
- Front Matter
- Back Matter
About this book
PySpark Recipes covers Hadoop and its shortcomings and presents the architecture of Spark, PySpark, and the RDD (resilient distributed dataset). You will learn to apply RDDs to day-to-day big data problems. Python and NumPy are also covered, which makes it easier for newcomers to PySpark to understand and adopt the model.
What You Will Learn
- Understand the advanced features of PySpark2 and SparkSQL
- Optimize your code
- Program SparkSQL with Python
- Use Spark Streaming and Spark MLlib with Python
- Perform graph analysis with GraphFrames
Who This Book Is For
Data analysts, Python programmers, big data enthusiasts
Authors and Affiliations
- Raju Kumar Mishra, Bangalore, India
Bibliographic Information
Book Title: PySpark Recipes
Book Subtitle: A Problem-Solution Approach with PySpark2
Authors: Raju Kumar Mishra
DOI: https://doi.org/10.1007/978-1-4842-3141-8
Publisher: Apress, Berkeley, CA
eBook Packages: Professional and Applied Computing, Apress Access Books, Professional and Applied Computing (R0)
Copyright Information: Raju Kumar Mishra 2018
Softcover ISBN: 978-1-4842-3140-1
Softcover Published: 10 December 2017
eBook ISBN: 978-1-4842-3141-8
eBook Published: 09 December 2017
Edition Number: 1
Number of Pages: XXIII, 265
Number of Illustrations: 35 b/w illustrations, 12 illustrations in colour
Topics: Big Data, Programming Techniques, Programming Languages, Compilers, Interpreters, Data Mining and Knowledge Discovery