CommonLounge Archive

The Hands-On Guide to Hadoop and Big Data

February 28, 2018

This 18-part hands-on course introduces you to the basics of Hadoop and Big Data through code samples and walkthroughs.

The primary objectives of this course are:

  1. Install and develop on a real Hadoop installation using Docker, either on your local machine or on the DigitalOcean cloud.
  2. Set up the backbone of your own big data cluster using HDFS and MapReduce.
  3. Analyze large data sets by writing programs in Pig and Spark.
  4. Store and query your data using Sqoop, Hive, HBase, Cassandra, MongoDB, Drill, Phoenix, and Presto.
  5. Manage your cluster and workflows using Oozie, YARN, Mesos, Zookeeper, and Hue.
  6. Stream real-time data using Kafka, Flume, Spark Streaming, Flink, and Storm.
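To give a feel for the MapReduce model at the heart of objectives 2 and 3, here is a minimal single-machine sketch of its three phases in plain Python. This is illustrative only: a real Hadoop job distributes these phases across a cluster using HDFS and YARN, and the function names below are our own, not part of any Hadoop API.

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.split():
            yield (word.lower(), 1)

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as Hadoop does
    between the map and reduce phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce: combine the grouped values for each key -- here, sum
    the counts for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}

docs = ["big data is big", "hadoop processes big data"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
print(counts["big"])   # 3
print(counts["data"])  # 2
```

The same map/shuffle/reduce structure underlies the word-count examples you will write later in the course; only the execution engine changes.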

Everything in this guide is 100% free. You can think of this guide as a “Free Online Nano Book”. The tutorials are easy to follow and are designed to make you productive with Big Data concepts quickly.

Pro-tip: If you already know some of the concepts below, you can skip them by marking them as completed.


Introduction to Big Data

Introduction to Hadoop

Writing Programs on Hadoop

Storing and Querying Data

Cluster and Workflow Management

Streaming Data

Real World Systems


© 2016-2022. All rights reserved.