Introduction to Hadoop Distributed File System
The Hadoop Distributed File System (HDFS) is the one of the two components that makes up the backbone of Hadoop. As the name suggests, HDFS was implemented based on the distributed file system architecture with a few extra features.
A Distributed File System (DFS) is a file system that uses network protocols to store and manage files on a server or set of servers. The goal of a DFS is to allow clients to access data as if it were on their local machine. A DFS also allows data to be stored and shared in a secure and convenient way for multiple users. The servers that store the data have full control over the data and give access control to the clients of the DFS. This is why access to servers that are part of a distributed file system are limited — the data must be retrieved through an API.
HDFS was designed based on the distributed file syst...