Statistics: Central Tendency metrics, Dispersion and Correlation (Quick Review)
Statistics is a broad branch of mathematics that deals with everything related to data, from its collection and organization to its analysis, interpretation, and presentation. With the ever-increasing amount of data, statistics has become an indispensable tool in every field that works with data.
When the amount of data is fairly small, it may be possible to discuss every data item individually. However, when we are dealing with large quantities of data, which is almost always the case in real-world situations, we need characteristic values that can represent the data.
In this tutorial, we'll introduce such measures, first for a single variable, for example the weight of students in a particular school. These measures will include measures of central tendency and measures of dispersion. Then, we'll look at measures for und...
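As a rough illustration of the single-variable measures mentioned above, here is a minimal sketch using Python's standard `statistics` module. The list `weights_kg` is made-up sample data, not taken from the tutorial, and the choice of population (rather than sample) variance is an assumption made only for this example.

```python
import statistics

# Hypothetical weights (in kg) of seven students; values are invented for illustration.
weights_kg = [48.0, 52.5, 55.0, 55.0, 61.0, 64.5, 70.0]

# Measures of central tendency: a single "typical" value for the data.
mean_weight = statistics.mean(weights_kg)      # arithmetic average
median_weight = statistics.median(weights_kg)  # middle value of the sorted data
mode_weight = statistics.mode(weights_kg)      # most frequent value

# Measures of dispersion: how spread out the data is around that value.
value_range = max(weights_kg) - min(weights_kg)
variance = statistics.pvariance(weights_kg)    # population variance (assumption)
std_dev = statistics.pstdev(weights_kg)        # population standard deviation

print("mean:", mean_weight, "median:", median_weight, "mode:", mode_weight)
print("range:", value_range, "variance:", variance, "std dev:", std_dev)
```

If the data were a sample drawn from a larger population, `statistics.variance` and `statistics.stdev` would be the usual replacements for the population versions used here.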
If real-world data sets contain numeric, text, alphanumeric, timestamp, and various other unstructured data types, how does one store, retrieve, and easily manipulate these multidimensional data sets? One answer is a data science library like Pandas. Pandas is a powerful data analysis toolkit with high-performance, easy-to-use data structures. It carries a host of useful tools, methods, and other functionality for row-wise and column-wise data manipulation that set it apart from tools such as Excel and SQL. We will visit this functionality in this tutorial.
A Support Vector Machine (SVM) is a classification algorithm, typically used for binary classification. An SVM with a Gaussian kernel has often been reported to be among the best-performing machine learning models, achieving high accuracy on a large variety of datasets, especially datasets that do not involve images or audio.
Understanding the various hyper-parameters of an SVM is central to achieving high accuracy. In this tutorial, we'll learn what an SVM is and what the purpose of each of its hyper-parameters is.
Given a binary classification dataset, an SVM aims to find a separating hyperplane between positive and negative instances. For example, in the figure below, each of the blue lines is a hyperplane that separates the positive and negative data points. All blue points lie on one side, and all red points lie on the other side. New unseen samples are categorized based on the side of the hyperplane on which they fall.
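The idea of categorizing a new sample by the side of the hyperplane on which it falls can be sketched in a few lines of plain Python. This is only the decision rule, not SVM training: the weight vector `w`, the offset `b`, and the sample points are hypothetical values chosen for illustration, and the helper name `side_of_hyperplane` is not from any library.

```python
def side_of_hyperplane(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane w.x + b = 0 the point x lies."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    # Points exactly on the hyperplane are assigned to the positive side (arbitrary choice).
    return 1 if score >= 0 else -1


# Hypothetical hyperplane in 2D: the line x1 - x2 = 0.
w = [1.0, -1.0]
b = 0.0

# Hypothetical unseen samples.
points = [[2.0, 1.0], [0.5, 3.0]]
labels = [side_of_hyperplane(w, b, p) for p in points]
print(labels)
```

A trained SVM would supply `w` and `b` itself; the sketch only shows how they are used once they are known.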
The answer is that, although the calculations are abstracted away by libraries, gradient descent on neural networks is susceptible to various issues such as vanishing gradients and dead neurons. When we train a neural network and face these problems, we will not be able to resolve them unless we understand what's going on behind the scenes.
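To give a feel for the vanishing gradient issue, here is a simplified sketch, assuming a chain of sigmoid activations and ignoring the weights entirely. During backpropagation the gradient is multiplied by one activation derivative per layer, and the sigmoid derivative is at most 0.25, so the product shrinks quickly with depth. The function names are hypothetical and exist only for this example.

```python
import math


def sigmoid(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_derivative(z):
    """Derivative of the sigmoid; its maximum value is 0.25, reached at z = 0."""
    s = sigmoid(z)
    return s * (1.0 - s)


def gradient_scale(num_layers, z=0.0):
    """Product of sigmoid derivatives along a chain of num_layers layers.

    Simplifying assumption: every layer sees the same pre-activation z
    and the weights are ignored, so this is an illustration only.
    """
    return sigmoid_derivative(z) ** num_layers


for n in (1, 5, 10, 20):
    print(n, "layers ->", gradient_scale(n))
```

Even in this best case for the sigmoid (`z = 0`), the factor reaching the earliest layers becomes tiny after a handful of layers, which is why those layers learn very slowly.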
Part of Speech tagging: Understanding Text Syntax and Structures, Part 1
Language Syntax and Structure
The syntax and structure of a natural language such as English are governed by a set of specific rules, conventions, and principles that dictate how words are combined into phrases, phrases into clauses, and clauses into sentences. All these constituents exist together in any sentence and are related to each other in a hierarchical structure.
Let’s consider a very basic example of language structure, viewed in the light of the subject and predicate relationship. Consider a simple sentence:
Harry is playing football
This sentence is talking about two subjects - Harry and football. To find the subject of the sentence, it is easier to fir...