CommonLounge Archive

Learn Data Science with Python

March 23, 2018

This 45-part course consists of tutorials, quizzes, hands-on assignments and real-world projects for learning data science, including advanced Python tools for data science.

Key Features of the Course

  1. 10+ portfolio projects and 150+ exercises to give you a lot of practice and build fluency.
  2. Most of the tutorials are available in three different formats — video, long article and bite-sized cards — so you can learn the way that works best for you.
  3. Articles and videos have code execution built-in. You can play the instructor’s code right inside the video!
  4. Exceptional content quality. We teach you the real thing, no dumbing things down or only talking about the easiest use case.
  5. We collect ratings on every tutorial and project. Anything with an average rating below 4.5 is sent back for revision.

What is Data Science? Why Data Science?

Trillions of gigabytes of data are produced every year, and the volume is still growing exponentially. However, data is only a raw material, and Data Science is the field that enables us to extract information from it.

Data Scientists are in demand at virtually every company to drive strategic decisions and power the business. Glassdoor ranked Data Scientist the #1 job, with an average salary of over $120,000.

Growth of Data. Source: Patrick Cheesman

Primary Objectives of the Course

  1. Become familiar with all the key components of data science — programming, statistics and probability, data analysis and exploration, and machine learning.
  2. Learn advanced Python data science libraries such as NumPy, Pandas, Matplotlib, Seaborn and Scikit-learn.
  3. Understand machine learning concepts and algorithms. Apply them to problems in image classification, sentiment classification, movie recommendations, etc.
  4. Become comfortable with the end-to-end workflow of a typical data science project, starting from data cleaning and analysis and going all the way to interpreting results and machine learning models.
  5. Attain fluency and build a portfolio by implementing many real-world projects using the above techniques.

Prerequisites: Python and Linear Algebra, Statistics and Probability Review.

Related course: Machine Learning.

Get started with the first tutorial below.

Introduction to Data Science

An introduction to data science: what it is, with examples of data science around us. The introductory article also covers the key components of data science, namely programming, data, statistics, machine learning and big data.

NumPy Library

In the next few sections, you’ll learn about various Python libraries for data science. This section teaches you NumPy (Numerical Python), which provides vector and matrix primitives in Python.
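
To give a feel for what "vector and matrix primitives" means, here is a minimal sketch. The array values are arbitrary and chosen only for illustration; they are not taken from the course material.

```python
import numpy as np

# A vector (1-D array) and a matrix (2-D array); the values are arbitrary.
v = np.array([1.0, 2.0, 3.0])
M = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0]])

# Element-wise arithmetic applies to every entry without an explicit loop.
scaled = 2 * v + 1            # array([3., 5., 7.])

# Matrix-vector product: a (2 x 3) matrix times a length-3 vector.
product = M @ v               # array([7., 5.])

# Reduction along an axis: sum each column of M.
column_sums = M.sum(axis=0)   # array([1., 1., 3.])
```

Working on whole arrays at once, rather than looping over elements, is the habit the NumPy tutorials aim to build.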

Pandas Library

The Pandas library introduces a DataFrame, which is basically a table (like a database table or an Excel sheet, but in Python). Pandas is the go-to library for working with structured tabular datasets.
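
As a rough illustration of the "table in Python" idea, the sketch below builds a small DataFrame and queries it. The column names and rows are hypothetical, made up for this example.

```python
import pandas as pd

# Hypothetical data: each dict key becomes a column, each list entry a row.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Chen"],
    "age": [34, 27, 45],
    "city": ["Lima", "Oslo", "Lima"],
})

# Filter rows with a boolean condition, much like a WHERE clause in SQL.
over_30 = df[df["age"] > 30]

# Pick one column and compute a summary statistic on it.
mean_age = df["age"].mean()
```

Here `over_30` keeps only the rows for Ana and Chen, and `mean_age` is the average of the three ages.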

Data Visualization with Matplotlib

In this section, we will learn about data visualization. We will mostly be using the Matplotlib library.

Data Cleaning and Analysis

In this section, we will learn about some advanced data cleaning and analysis methods — including combining multiple datasets, data transformations, and handling duplicate data, missing values, and outliers. The section ends with a detailed project for end-to-end data cleaning and analysis.
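
The cleaning steps named above can be sketched with pandas roughly as follows. The two tables, the median fill, and the 1.5 × IQR outlier rule are assumptions chosen for illustration; the course project may use different data and different rules.

```python
import pandas as pd

# Hypothetical tables, invented for this example.
orders = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4],
    "amount": [20.0, 35.0, 35.0, None, 5000.0],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "country": ["PE", "NO", "PE", "IN"],
})

# 1. Duplicates: drop rows that are exact copies of an earlier row.
deduped = orders.drop_duplicates()

# 2. Missing values: fill missing amounts with the median observed amount.
median_amount = deduped["amount"].median()
filled = deduped.fillna({"amount": median_amount})

# 3. Outliers: drop rows outside 1.5 * IQR of the quartiles (one common rule).
q1, q3 = filled["amount"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (filled["amount"] < q1 - 1.5 * iqr) | (filled["amount"] > q3 + 1.5 * iqr)
cleaned = filled[~is_outlier]

# 4. Combining datasets: attach each customer's country with a left join.
merged = cleaned.merge(customers, on="customer_id", how="left")
```

On a table this small the IQR rule is fragile; it is used here only to show the shape of the step. Whether to drop, cap, or keep outliers is a judgment call that depends on the dataset.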

Advanced Data Analysis

In this section, we will learn about some advanced methods for data analysis using the pandas and seaborn libraries. This includes getting aggregate statistics after grouping the data based on some variable, analyzing time series data, and performing multivariate analysis.
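
Grouped aggregation and a basic time series operation might look like the sketch below. It uses pandas only (no seaborn plotting), and the `sales` table is hypothetical.

```python
import pandas as pd

# Hypothetical sales records, invented for this example.
sales = pd.DataFrame({
    "date": pd.to_datetime(["2018-01-05", "2018-01-20", "2018-02-03", "2018-02-17"]),
    "region": ["north", "south", "north", "south"],
    "revenue": [100.0, 150.0, 120.0, 180.0],
})

# Aggregate statistics after grouping by a variable (here, region).
by_region = sales.groupby("region")["revenue"].agg(["mean", "sum"])

# Time series: index by date, then total the revenue per calendar month.
# "MS" labels each bucket with the first day of its month.
monthly = sales.set_index("date")["revenue"].resample("MS").sum()
```

`by_region` has one row per region with its mean and total revenue, and `monthly` has one entry per month.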

Machine Learning

In this section, we’ll introduce machine learning and take a look at a number of machine learning algorithms. Algorithms will include regression, classification and clustering algorithms commonly used in data science projects. After learning each algorithm, we’ll do hands-on projects and apply them to various applications such as handwritten digit recognition and diabetes diagnosis.
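
To make "regression" concrete, here is a minimal sketch that fits a straight line by least squares using NumPy alone. This is a stand-in chosen to keep the example dependency-light; it is not the Scikit-learn workflow the tutorials themselves teach, and the data is synthetic.

```python
import numpy as np

# Synthetic data generated exactly from y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Design matrix: one column for x, one column of ones for the intercept.
X = np.column_stack([x, np.ones_like(x)])

# Solve the least-squares problem min ||X @ coef - y||^2.
coef, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)
slope, intercept = coef


def predict(new_x):
    """Predict y for a new x using the fitted line."""
    return slope * new_x + intercept
```

Because the data has no noise, the fit recovers a slope close to 2 and an intercept close to 1. Real datasets are noisy, which is why the tutorials also cover how to evaluate a model.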

End-to-End Data Science Projects

This section consists of two end-to-end data science projects. The last tutorial contains a list of 10 project ideas (including datasets and suggested algorithms). It is recommended that you do at least one end-to-end project as part of the course.

Natural Language Processing

Natural language processing comprises a set of computational techniques to understand natural languages such as English, Spanish, Chinese, etc. In this section, you’ll see many popular NLP applications, such as search engines, finding related articles, sentiment classification and text classification, and topic modeling.
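
One of these applications, finding related articles, can be sketched with a bag-of-words model and cosine similarity. This is a simplified illustration using only the standard library; the document names and texts are hypothetical, and the tutorials may use other representations such as TF-IDF.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical mini-corpus, invented for this example.
docs = {
    "pandas_intro": "Pandas makes tabular data analysis easy",
    "numpy_intro": "NumPy provides fast arrays for numerical data",
    "cooking": "Simmer the tomato sauce slowly",
}
vectors = {name: Counter(tokenize(text)) for name, text in docs.items()}


def most_similar(query):
    """Return the name of the document closest to the query text."""
    q = Counter(tokenize(query))
    return max(vectors, key=lambda name: cosine_similarity(q, vectors[name]))
```

For example, `most_similar("tabular data analysis")` picks `"pandas_intro"`, since that document shares the most words with the query relative to its length.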

Other Topics in Data Science

This section introduces databases and SQL, which are used for storing and managing data in computer systems. We’ll also look at MapReduce, a programming model that allows us to perform parallel processing on large data sets in a distributed environment. Again, our tutorials will be interleaved with quizzes and hands-on assignments.
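
The map-reduce programming model can be illustrated with the classic word-count example. The sketch below runs all three phases in a single process; in a real distributed system the map and reduce calls would run in parallel on different machines. The function names are our own, not those of any particular framework.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word, 1) for word in document.lower().split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data big ideas", "data science"])
```

Each call to `map_phase` depends only on its own document, and each call to `reduce_phase` depends only on its own key. That independence is what lets a framework spread the work across many machines.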

© 2016-2022. All rights reserved.