Name: Practical Enterprise Data Lake Insights
ISBN: 978-1-4842-3522-5

Overview

Authors:

Saurabh Gupta,
Venkata Giri

Saurabh Gupta
1. Bangalore, India
Venkata Giri
1. Bangalore, India

First book to provide an end-to-end solution approach
Includes data capture strategies for time series and relational data
Covers data processing using Hive and Spark

Table of contents (8 chapters)

Front Matter

Pages i-xviii

Introduction to Enterprise Data Lakes
- Saurabh Gupta, Venkata Giri
Pages 1-31
Data Lake Ingestion Strategies
- Saurabh Gupta, Venkata Giri
Pages 33-85
Capture Streaming Data with Change-Data-Capture
- Saurabh Gupta, Venkata Giri
Pages 87-123
Data Processing Strategies in Data Lakes
- Saurabh Gupta, Venkata Giri
Pages 125-199
Data Archiving Strategies in Data Lakes
- Saurabh Gupta, Venkata Giri
Pages 201-223
Data Security in Data Lakes
- Saurabh Gupta, Venkata Giri
Pages 225-259
Ensure High Availability of Data Lake
- Saurabh Gupta, Venkata Giri
Pages 261-295
Managing Data Lake Operations
- Saurabh Gupta, Venkata Giri
Pages 297-315
Back Matter

Pages 317-327

About this book

Use this practical guide to successfully handle the challenges encountered when designing an enterprise data lake and learn industry best practices to resolve issues.

When designing an enterprise data lake, you often hit a roadblock once you must leave the comfort of the relational world and learn the nuances of handling non-relational data. Starting from sourcing data into the Hadoop ecosystem, you will go through stages that raise tough questions about data processing, data querying, and security. Concepts such as change data capture and data streaming are covered. The book takes an end-to-end solution approach in a data lake environment that includes data security, high availability, data processing, data streaming, and more.

Each chapter includes an application of a concept, code snippets, and use case demonstrations to provide you with a practical approach. For each topic, you will learn the concept, its scope, its application, and a starting point.

What You'll Learn

Get to know data lake architecture and design principles
Implement data capture and streaming strategies
Implement data processing strategies in Hadoop
Understand the data lake security framework and availability model

Who This Book Is For

Big data architects and solution architects

Authors and Affiliations

Bangalore, India

Saurabh Gupta, Venkata Giri

About the authors

Saurabh K. Gupta is a technology leader, published author, and database enthusiast with more than 11 years of industry experience in data architecture, engineering, development, and administration. As Manager, Data & Analytics at GE Transportation, he focuses on data lake analytics programs that build digital solutions for business stakeholders. In the past, he has worked extensively with Oracle database design and development, PaaS and IaaS cloud service models, consolidation, and in-memory technologies. He has authored two books on advanced PL/SQL for Oracle versions 11g and 12c. He is a frequent speaker at conferences organized by the user community and technical institutions. He tweets at @saurabhkg and blogs at sbhoracle.wordpress.com.

Venkata Giri currently works with GE Digital and has been involved with building resilient distributed services at a massive scale. He has worked on the big data tech stack, relational databases, high availability, and performance tuning. With over 20 years of experience in data technologies, he has in-depth knowledge of big data ecosystems, complex data ingestion pipelines, data engineering, data processing, and operations. Prior to working at GE, he worked with the data teams at LinkedIn and Yahoo.

Bibliographic Information

Book Title: Practical Enterprise Data Lake Insights
Book Subtitle: Handle Data-Driven Challenges in an Enterprise Big Data Lake
Authors: Saurabh Gupta, Venkata Giri
DOI: https://doi.org/10.1007/978-1-4842-3522-5
Publisher: Apress, Berkeley, CA
eBook Packages: Professional and Applied Computing, Apress Access Books, Professional and Applied Computing (R0)
Copyright Information: Saurabh Gupta, Venkata Giri 2018
Softcover ISBN: 978-1-4842-3521-8 (published 28 June 2018)
eBook ISBN: 978-1-4842-3522-5 (published 27 June 2018)
Edition Number: 1
Number of Pages: XVIII, 327
Number of Illustrations: 90 b/w illustrations
Topics: Big Data, Computer Applications, Big Data/Analytics
