Overview
- First book to provide an end-to-end solution approach
- Includes data capture strategies for time series and relational data
- Covers data processing using Hive and Spark
Table of contents (8 chapters)
About this book
When designing an enterprise data lake, you often hit a roadblock when you must leave the comfort of the relational world and learn the nuances of handling non-relational data. Starting from sourcing data into the Hadoop ecosystem, you will go through stages that raise tough questions about data processing, data querying, and security. Concepts such as change data capture and data streaming are covered. The book takes an end-to-end solution approach in a data lake environment that includes data security, high availability, data processing, data streaming, and more.
Each chapter includes an application of a concept, code snippets, and use case demonstrations to give you a practical approach. For each topic, you will learn the concept, its scope, its application, and a starting point.
What You'll Learn
- Get to know data lake architecture and design principles
- Implement data capture and streaming strategies
- Implement data processing strategies in Hadoop
- Understand the data lake security framework and availability model
Who This Book Is For
Big data architects and solution architects
Authors and Affiliations
About the authors
Venkata Giri currently works with GE Digital and has been involved in building resilient distributed services at massive scale. He has worked on the big data tech stack, relational databases, high availability, and performance tuning. With over 20 years of experience in data technologies, he has in-depth knowledge of big data ecosystems, complex data ingestion pipelines, data engineering, data processing, and operations. Prior to working at GE, he worked with the data teams at LinkedIn and Yahoo.
Bibliographic Information
Book Title: Practical Enterprise Data Lake Insights
Book Subtitle: Handle Data-Driven Challenges in an Enterprise Big Data Lake
Authors: Saurabh Gupta, Venkata Giri
DOI: https://doi.org/10.1007/978-1-4842-3522-5
Publisher: Apress Berkeley, CA
eBook Packages: Professional and Applied Computing, Apress Access Books, Professional and Applied Computing (R0)
Copyright Information: Saurabh Gupta, Venkata Giri 2018
Softcover ISBN: 978-1-4842-3521-8, Published: 28 June 2018
eBook ISBN: 978-1-4842-3522-5, Published: 27 June 2018
Edition Number: 1
Number of Pages: XVIII, 327
Number of Illustrations: 90 b/w illustrations
Topics: Big Data, Computer Applications, Big Data/Analytics