The Apache Hadoop software library is a framework that allows for the distributed
processing of large data sets across clusters of computers using a simple programming
model. It is designed to scale up from single servers to thousands of machines, each offering
local computation and storage. Rather than rely on hardware to deliver high-avaiability, the
library itself is designed to detect and handle failures at the application layer, so delivering a
highly-availabile service on top of a cluster of computers, each of which may be prone to
failures.