- Full Description
You've heard the hype about Hadoop: it runs petabyte-scale data mining tasks insanely fast, it runs gigantic tasks on clouds absurdly cheaply, it's been heavily committed to by tech giants like IBM, Yahoo!, and the Apache Project, and it's completely open source (thus free). But what exactly is it, and more importantly, how do you even get a Hadoop cluster up and running?
From Apress, the name you've come to trust for hands-on technical knowledge, Pro Hadoop brings you up to speed on Hadoop. You learn the ins and outs of MapReduce; how to structure a cluster; how to design and implement the Hadoop file system; and how to build your first cloud-computing tasks using Hadoop. Learn how to let Hadoop take care of distributing and parallelizing your software: you just focus on the code, and Hadoop takes care of the rest.
Best of all, you'll learn from a tech professional who's been in the Hadoop scene since day one. Written from the perspective of a principal engineer with down-in-the-trenches knowledge of what can go wrong with Hadoop, the book shows you how to avoid the common, expensive first errors that everyone makes when creating their own Hadoop system or inheriting someone else's.
Skip the novice stage and the expensive, hard-to-fix mistakes, and go straight to seasoned pro on the hottest cloud-computing framework with Pro Hadoop. Your productivity will blow your managers away.
What you'll learn
- Set up a standalone Hadoop cluster the smart way, laid out simply and step by step, so you can get up and running quickly to build your next data center; collaborative, data-intensive Internet services application; Software as a Service (SaaS) offering; and more.
- Optimize your Hadoop production tasks like an experienced pro.
- Work with time-proven, bulletproof standard patterns that have been tested and debugged in high-volume production.
- Understand just enough theoretical knowledge to know why something works in Hadoop, without getting bogged down in abstruse walls of theory.
- Get detailed explanations of not only how to do something with Hadoop, but also why, from a frontline coder with years in the Hadoop game.
- Turn someone else's expensive cluster-wide wrong into an orderly, productive "right" with professional-level debugging and testing.
Who this book is for
IT professionals interested in investigating Hadoop and implementing it in their organizations, and existing Hadoop users who want to deepen their professional toolkits.
- Table of Contents
- Getting Started with Hadoop Core
- The Basics of a MapReduce Job
- The Basics of Multimachine Clusters
- HDFS Details for Multimachine Clusters
- MapReduce Details for Multimachine Clusters
- Tuning Your MapReduce Jobs
- Unit Testing and Debugging
- Advanced and Alternate MapReduce Techniques
- Solving Problems with Hadoop
- Projects Based On Hadoop and Future Directions
- Source Code/Downloads