This 18-part hands-on course introduces you to the basics of Hadoop and Big Data through code samples and walkthroughs.
The primary objectives of this course are:
- Install and develop against a real Hadoop installation using Docker, either on your local machine or in the DigitalOcean cloud.
- Set up the backbone of your own big data cluster using HDFS and MapReduce.
- Analyze large data sets by writing programs with Pig and Spark.
- Store and query your data using Sqoop, Hive, HBase, Cassandra, MongoDB, Drill, Phoenix, and Presto.
- Manage your cluster and workflows using Oozie, YARN, Mesos, Zookeeper, and Hue.
- Stream real-time data using Kafka, Flume, Spark Streaming, Flink, and Storm.
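Before diving into the tooling above, it helps to have the MapReduce programming model clear in your head. The sketch below is a toy, plain-Python illustration of the map, shuffle, and reduce phases applied to a word count (it is not real Hadoop code; the function names and sample input are illustrative only):

```python
from collections import defaultdict

# Toy illustration of the MapReduce model (not actual Hadoop code).
# Map: emit (word, 1) for every word in the input.
def map_phase(lines):
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

# Shuffle: group all emitted values by their key.
def shuffle(pairs):
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

# Reduce: combine each key's values into a final result.
def reduce_phase(grouped):
    return {word: sum(counts) for word, counts in grouped.items()}

lines = ["big data big cluster", "data pipelines"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts)  # {'big': 2, 'data': 2, 'cluster': 1, 'pipelines': 1}
```

Hadoop runs these same three phases distributed across a cluster, with HDFS supplying the input splits and collecting the output.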
Everything in this guide is 100% free. You can think of this guide as a “Free Online Nano Book”. The tutorials are easy to follow and are designed to get you productive with Big Data concepts quickly.
Pro-tip: If you already know some of the concepts below, you can skip them by marking them as completed.
- Hive Tutorial
- Sqoop Tutorial
- HBase Tutorial — Hadoop and NoSQL Part 1
- Cassandra Tutorial — Hadoop and NoSQL Part 2
- MongoDB Tutorial — Hadoop and NoSQL Part 3
- Data Querying Tools Tutorial — Zeppelin, Drill, Phoenix, and Presto
- Oozie Tutorial — Workflow Management
- Cluster Management Tools Tutorial — YARN, Tez, Mesos, Zookeeper, and Hue
- Kafka, Flume, and Flafka Tutorial
- Streaming Tools Tutorial — Spark Streaming, Apache Flink, and Storm