Master Big Data Ingestion and Analytics with Flume, Sqoop, Hive and Spark

By: Navdeep Kaur

eText | 17 July 2019 | Edition Number 1


In this course, you will start by learning about the Hadoop Distributed File System (HDFS) and the most common Hadoop commands required to work with it. You will then be introduced to Sqoop Import, where you will learn the lifecycle of a Sqoop command and how to use the import command to migrate data from MySQL to HDFS and from MySQL to Hive, among other topics.

In addition, you will learn about Sqoop Export to migrate data effectively, and about Apache Flume to ingest data. The Apache Hive section introduces Hive, external and managed tables, and working with different file formats, including Parquet and Avro. The final sections cover Spark DataFrames, Spark SQL, and more.

All the code and supporting files are available at: https://github.com/PacktPublishing/Master-Big-Data-Ingestion-and-Analytics-with-Flume-Sqoop-Hive-and-Spark