
Monday, May 30, 2016

What is Hadoop?

The rise of big data as an essential component of business strategy, used to modernise operations and better serve customers, has been driven in part by the emergence of Hadoop.

#Hadoop, a free, Java-based programming framework, is designed to support the processing of large data sets in a distributed computing environment, typically one built from commodity hardware.

Hadoop is a project of the #Apache Software Foundation, which supports the development of open-source software.

At its core, Hadoop consists of a storage part called the Hadoop Distributed File System (HDFS), and a processing part called #MapReduce.
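To make the MapReduce side concrete, here is a minimal sketch of the classic word-count example written against the Hadoop MapReduce Java API: the mapper emits each word with a count of one, and the reducer sums the counts per word. The class names are illustrative, not part of Hadoop itself.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Mapper: for each line of input, emit (word, 1) pairs.
public class WordCountMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer tokens = new StringTokenizer(value.toString());
        while (tokens.hasMoreTokens()) {
            word.set(tokens.nextToken());
            context.write(word, ONE);
        }
    }
}

// Reducer: sum all counts emitted for the same word.
class WordCountReducer
        extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable count : values) {
            sum += count.get();
        }
        context.write(key, new IntWritable(sum));
    }
}
```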

Basically, Hadoop works by splitting large files into blocks, which are then distributed across the nodes of a cluster and processed in parallel.
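As an illustration of that block placement, the sketch below uses the HDFS Java client to ask where the blocks of a file ended up. The path /data/logs.txt is a hypothetical example, and the code assumes a running cluster whose address is picked up from the default configuration files.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockReport {
    public static void main(String[] args) throws Exception {
        // Connect to the cluster described in core-site.xml / hdfs-site.xml.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical file; replace with a path that exists on your cluster.
        Path file = new Path("/data/logs.txt");
        FileStatus status = fs.getFileStatus(file);

        // One BlockLocation per block, listing the nodes holding replicas.
        BlockLocation[] blocks =
                fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation block : blocks) {
            System.out.printf("offset=%d length=%d hosts=%s%n",
                    block.getOffset(), block.getLength(),
                    String.join(",", block.getHosts()));
        }
        fs.close();
    }
}
```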

The base framework is made up of Hadoop Common, which contains the libraries and utilities used by the other Hadoop modules; #HDFS, a distributed file system that stores data on commodity machines; #YARN, which works as a resource management platform; and MapReduce, which is for large-scale data processing.
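Tying those modules together, a small driver class configures a job and submits it; on a cluster running YARN, the same code is handed to YARN's resource manager without change. This sketch assumes the hypothetical WordCountMapper and WordCountReducer classes from above and takes input and output paths as command-line arguments.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");

        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(WordCountMapper.class);
        job.setCombinerClass(WordCountReducer.class); // local pre-aggregation
        job.setReducerClass(WordCountReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // Blocks until the job finishes; the exit code reflects success.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Packaged into a jar, a job like this would typically be launched with something along the lines of `hadoop jar wordcount.jar WordCountDriver /input /output`, with the paths again being illustrative.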

http://www.cbronline.com/news/big-data/software/what-is-hadoop-4904909
