Overview

Authors:

Michael Frampton


Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset is an introduction to using the Apache Hadoop toolset for developers, architects, and anyone else interested in big data.
It includes a description of all tool capabilities as well as in-depth instructions to build and test a working system.

Table of contents (11 chapters)

Front Matter (pages i-xxii)
The Problem with Data (pages 1-10)
Storing and Configuring Data with Hadoop, YARN, and ZooKeeper (pages 11-56)
Collecting Data with Nutch and Solr (pages 57-83)
Processing Data with Map Reduce (pages 85-120)
Scheduling and Workflow (pages 121-154)
Moving Data (pages 155-189)
Monitoring Data (pages 191-223)
Cluster Management (pages 225-256)
Analytics with Hadoop (pages 257-290)
ETL with Hadoop (pages 291-323)
Reporting with Hadoop (pages 325-359)
Back Matter (pages 361-368)

About this book

Many corporations are finding that their data sets are outgrowing the capacity of their systems to store and process them. The data is becoming too big to manage and use with traditional tools. The solution: implementing a big data system.

As Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset shows, Apache Hadoop offers a scalable, fault-tolerant system for storing and processing data in parallel. It has a very rich toolset that allows for storage (Hadoop), configuration (YARN and ZooKeeper), collection (Nutch and Solr), processing (Storm, Pig, and Map Reduce), scheduling (Oozie), moving (Sqoop and Avro), monitoring (Chukwa, Ambari, and Hue), testing (Big Top), and analysis (Hive).

The problem is that the Internet offers IT pros wading into big data many versions of the truth and some outright falsehoods born of ignorance. What is needed is a book just like this one: a wide-ranging but easily understood set of instructions to explain where to get Hadoop tools, what they can do, how to install them, how to configure them, how to integrate them, and how to use them successfully. And you need an expert who has worked in this area for a decade—someone just like author and big data expert Mike Frampton.

Big Data Made Easy approaches the problem of managing massive data sets from a systems perspective. It explains the roles on each project (architect and tester, for example) and shows how the Hadoop toolset can be used at each system stage. Through numerous examples, it explains in plain terms how to use each tool. The book also describes the sliding scale of tools available for different data sizes, and when and how to use them. Big Data Made Easy shows developers and architects, as well as testers and project managers, how to:

Store big data
Configure big data
Process big data
Schedule processes
Move data among SQL and NoSQL systems
Monitor data
Perform big data analytics
Report on big data processes and projects
Test big data systems

Big Data Made Easy also explains the best part, which is that this toolset is free. Anyone can download it and—with the help of this book—start to use it within a day. With the skills this book will teach you under your belt, you will add value to your company or client immediately, not to mention your career.

About the author

Mike Frampton has been in the IT industry since 1990, working in many roles (tester, developer, support, QA) and in many sectors (telecoms, banking, energy, insurance). He has worked for major corporations and banks, including IBM, HP, and JPMorgan Chase. The owner of Semtech Solutions, an IT/Big Data consultancy, Mike currently lives by the beach in Paraparaumu, New Zealand, with his wife and son.

Bibliographic Information

Book Title: Big Data Made Easy
Book Subtitle: A Working Guide to the Complete Hadoop Toolset
Authors: Michael Frampton
DOI: https://doi.org/10.1007/978-1-4842-0094-0
Publisher: Apress, Berkeley, CA
eBook Packages: Professional and Applied Computing, Apress Access Books, Professional and Applied Computing (R0)
Softcover ISBN: 978-1-4842-0095-7 (published 24 December 2014)
eBook ISBN: 978-1-4842-0094-0 (published 31 December 2014)
Edition Number: 1
Number of Pages: XVI, 392
Number of Illustrations: 168 b/w illustrations
Topics: Big Data, Database Management, Information Systems and Communication Service
