This 32-part course consists of tutorials, quizzes, hands-on assignments, and real-world projects for learning data science, along with advanced Python tools for data science. You can think of this list as a "Free Online Nano Book".

# What is Data Science? Why Data Science?

Trillions of gigabytes of data are produced every year, and the volume is still growing exponentially. It is estimated that, by 2020, 1.7 megabytes of data will be produced every second for every person.

Our society is becoming increasingly data-dependent. Data is only a raw material, and extracting information from it requires further work. Data Science helps us make sense of data.

**Growth of Data.** Source: Patrick Cheesman

# Primary Objectives of the Course

- Understand what data science is and its key components: programming, data, statistics and probability, machine learning, and big data technologies.
- Understand linear algebra, statistics and probability concepts for data science.
- Learn advanced Python data science libraries such as Pandas, NumPy, Matplotlib, etc.
- Understand the end-to-end workflow of a typical data science project (using Pandas, NumPy, Matplotlib, etc.).
- Understand how to perform hypothesis testing.
- Understand techniques for performing analysis on networks (graphs), and implement the famous PageRank algorithm.
- Understand what machine learning is and learn some popular machine learning algorithms.
- Implement some machine learning algorithms and apply them to problems such as image classification, sentiment classification, and movie recommendation.
- Understand databases and systems used to store, manage and retrieve data.
- Understand and implement the map-reduce framework used to perform computation in a large-scale distributed setting.
- Implement a real-world project using data science techniques.

**Subscribe** to add this course to the top of your Home Page. **Get started** with the first article below.