In this hands-on assignment, we'll apply the NumPy Python library to explore a dataset. The dataset is a medical one that records patient metrics such as glucose and insulin levels, along with other measurements related to diabetes. The assignment serves two primary objectives: (a) practice NumPy on a realistic task, and (b) learn how to get a feel for a large dataset (also known as data cleaning and data exploration).
We'll be using the following dataset: diabetes.csv. Open the file in your favorite text editor and have a look.
The following are the column names: Pregnancies, Glucose, BloodPressure, SkinThickness, Insulin, BMI, DiabetesPedigreeFunction, Age, Outcome.
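As a first look beyond the text editor, the file can be loaded into a NumPy array. The snippet below is a minimal sketch, not the assignment's official solution: it assumes the file is comma-separated with a single header row, and it uses a small in-memory stand-in (`sample_csv`, with two made-up rows) so that it runs without the real file. Passing the path to `diabetes.csv` in place of `sample_csv` should work the same way if those assumptions hold.

```python
import io
import numpy as np

# Hypothetical stand-in for diabetes.csv: the header matches the column names
# listed above, but the two data rows are made up for illustration only.
sample_csv = io.StringIO(
    "Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,"
    "DiabetesPedigreeFunction,Age,Outcome\n"
    "2,120,70,30,100,28.5,0.4,35,0\n"
    "5,160,80,0,0,33.1,0.6,50,1\n"
)

# skip_header=1 drops the header row so every remaining value parses as a float.
data = np.genfromtxt(sample_csv, delimiter=",", skip_header=1)

print(data.shape)     # (number of rows, number of columns)

# Glucose is the second column in the list above, so its index is 1.
glucose = data[:, 1]
print(glucose.mean(), glucose.min(), glucose.max())
```

Summary statistics like these are a quick way to spot suspicious values, such as a minimum of zero in a column where zero is not physically plausible.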
The quiz will guide you through the rest of the assi...
Exploring Vowpal Wabbit with the Avazu Clickthrough Prediction Challenge
In online advertising, click-through rate (CTR) is a very important metric for evaluating ad performance. As a result, click prediction systems are essential and widely used for sponsored search and real-time bidding.
For this competition, we have provided 11 days' worth of Avazu data to build and test prediction models. Can you find a strategy that beats standard classification algorithms? The winning models from this competition will be released under an open-source license.
id: ad identifier
click: 0/1 for non-click/click
hour: format is YYMMDDHH, so 14091123 means 23:00 on Sept. 11, 2014 UTC.
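The hour field packs a date and an hour into one number, so it usually needs to be decoded before features such as hour of day or day of week can be derived. Here is a minimal sketch using only the Python standard library; the helper name `parse_hour` is hypothetical and is not part of the competition materials or of Vowpal Wabbit.

```python
from datetime import datetime, timezone

def parse_hour(value):
    """Parse a YYMMDDHH value such as 14091123 into a UTC datetime."""
    # %y = two-digit year, %m = month, %d = day, %H = hour (24-hour clock).
    parsed = datetime.strptime(str(value), "%y%m%d%H")
    # The field description states the times are in UTC.
    return parsed.replace(tzinfo=timezone.utc)

ts = parse_hour(14091123)
print(ts.isoformat())   # 2014-09-11T23:00:00+00:00
print(ts.hour)          # hour of day, usable as a categorical feature
print(ts.weekday())     # day of week, Monday = 0
```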
One of the most prominent fields of application for machine learning is sports, and a lot of people love sports statistics. It is an excellent domain for practicing data exploration and visualization. In fact, most machine learning work that performs well on sports data is 90% data exploration and 10% model building.
Cricsheet has a bunch of cricket data available for download.
We provide ball-by-ball data for Men’s and Women’s T...
I am attempting to make a ground-based robot traverse from Point A to Point B through an obstacle course with fixed and moving obstacles (imagine a marketplace or a railway station), using input from a camera feed (a group of CCTV cameras or a live feed from a drone over the area). What kind of ML algorithm would enable the system to plan the shortest path(s) for the ground robot to follow under varying circumstances (obstacles moving randomly over the task plane at certain speeds)?