Trillions of gigabytes of data are produced every year, and the volume is still growing exponentially. By 2020, an estimated 1.7 megabytes of data will be produced every second for every person, and accumulated digital data will reach about 44 zettabytes (44 trillion gigabytes). This explosion of data is also shown in the graph below.
Data is only raw material; extracting information from it requires further work. Our society is becoming increasingly data-dependent, and data science is the field that helps us make sense of this huge quantity of data.
Data science is an interdisciplinary field that draws on methods and technologies from computer science, databases, mathematics, statistics, and machine learning. It is concerned with the collection, preparation, analysis, visualization, management, and preservation of data. This data is often available in very large quantities and covers a variety of types.