Overfitting is one of the most important problems (and concepts) in machine learning.
It's not a good idea to test a machine learning model on the same dataset we used to train it, since that gives no indication of how well the model performs on unseen data. The ability to perform well on unseen data is called generalization, and it is the key characteristic we want in a model.
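As a minimal sketch of this idea (assuming scikit-learn and its built-in diabetes dataset; the 25% split is an arbitrary choice), we can hold out part of the data and score the model only on examples it never saw during training:

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

X, y = load_diabetes(return_X_y=True)

# Hold out 25% of the data; the model never sees it during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

model = LinearRegression().fit(X_train, y_train)

# The training score is flattering; the test score estimates generalization.
print("train R^2:", model.score(X_train, y_train))
print("test  R^2:", model.score(X_test, y_test))
```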
When a model performs well on training data (the data the algorithm was trained on) but poorly on test data (new, unseen data), we say it has overfit the training data, or that the model is overfitting. This happens because the model learns the noise in the training data as if it were a reliable pattern.
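One way to see this in code (a sketch, assuming scikit-learn; the sine curve, noise level, and polynomial degree are arbitrary choices for illustration) is to fit a high-degree polynomial to a handful of noisy points:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.RandomState(0)

# 20 noisy samples of a sine curve.
X = np.sort(rng.uniform(0, 1, 20)).reshape(-1, 1)
y = np.sin(2 * np.pi * X.ravel()) + rng.normal(scale=0.2, size=20)

# A clean test set drawn from the same underlying curve.
X_test = np.linspace(0, 1, 200).reshape(-1, 1)
y_test = np.sin(2 * np.pi * X_test.ravel())

# A degree-15 polynomial has enough capacity to memorize the noise.
model = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
model.fit(X, y)

# Near-zero training error but large test error: the signature of overfitting.
print("train MSE:", mean_squared_error(y, model.predict(X)))
print("test  MSE:", mean_squared_error(y_test, model.predict(X_test)))
```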
Conversely, when a model performs poorly on both the training data and unseen data, it is said to be underfitting: the model is too simple to capture the patterns present in the training data.
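The opposite failure can be sketched the same way (again assuming scikit-learn, with the same illustrative sine data): a straight line has too little capacity to follow the curve, so it scores poorly even on the data it was trained on:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.RandomState(0)

# The same 20 noisy samples of a sine curve.
X = np.sort(rng.uniform(0, 1, 20)).reshape(-1, 1)
y = np.sin(2 * np.pi * X.ravel()) + rng.normal(scale=0.2, size=20)

# A straight line is too simple for a sine-shaped relationship.
model = LinearRegression().fit(X, y)

# High error on the training data itself: the signature of underfitting.
print("train MSE:", mean_squared_error(y, model.predict(X)))
```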
A smaller dataset significantly increases the chance of overfitting, because it is much harder to separate reliable patterns from noise when there are only a few examples to learn from.
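A rough way to see the size effect (a sketch, assuming scikit-learn; the tree model and the sample sizes are arbitrary choices) is to fit the same flexible model on a small and a large training set and compare the gap between test and training error:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.RandomState(0)

def train_test_gap(n_train):
    # Noisy sine data again; an unconstrained tree memorizes the noise.
    X = rng.uniform(0, 1, (n_train, 1))
    y = np.sin(2 * np.pi * X.ravel()) + rng.normal(scale=0.3, size=n_train)
    X_test = rng.uniform(0, 1, (1000, 1))
    y_test = np.sin(2 * np.pi * X_test.ravel())
    model = DecisionTreeRegressor(random_state=0).fit(X, y)
    return (mean_squared_error(y_test, model.predict(X_test))
            - mean_squared_error(y, model.predict(X)))

# The gap between test and train error shrinks as the dataset grows.
print("gap with   20 samples:", train_test_gap(20))
print("gap with 2000 samples:", train_test_gap(2000))
```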
Bonus note: ensemble methods tend to be strong learners on small datasets. In ensemble methods, instead of training one model we train multiple models and average their predictions.
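For instance (a sketch, assuming scikit-learn; bagging is just one ensemble method among several), bagging trains many trees on bootstrap resamples of the small dataset and averages their predictions, which usually smooths out the noise a single tree would memorize:

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.RandomState(0)

# A small, noisy dataset: easy to overfit.
X = rng.uniform(0, 1, (30, 1))
y = np.sin(2 * np.pi * X.ravel()) + rng.normal(scale=0.3, size=30)
X_test = rng.uniform(0, 1, (1000, 1))
y_test = np.sin(2 * np.pi * X_test.ravel())

# One fully grown tree memorizes the noise in the 30 samples.
single = DecisionTreeRegressor(random_state=0).fit(X, y)

# 100 trees (the default base estimator is a decision tree), each trained
# on a bootstrap resample; their predictions are averaged.
ensemble = BaggingRegressor(n_estimators=100, random_state=0).fit(X, y)

print("single tree test MSE:", mean_squared_error(y_test, single.predict(X_test)))
print("ensemble    test MSE:", mean_squared_error(y_test, ensemble.predict(X_test)))
```

Averaging works here because each tree overfits the noise differently, so their individual errors partly cancel out while the shared signal survives.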