A Brief History of the Hadoop Ecosystem

In 2002, internet researchers wanted a better search engine, preferably an open-source one. Doug Cutting and Mike Cafarella set out to build it, and they called their project “Nutch.” Hadoop was originally designed as part of the Nutch infrastructure and was presented in 2005. The […]

Benchmarking Hadoop Performance: On-Premises S3-Compatible Storage Keeps Pace with HDFS

By Gary Ogasawara and Tatsuya Kawano. When deploying Hadoop, scaling storage can be difficult and costly because storage and compute are co-located on the same hardware nodes. By implementing the storage layer using S3-compatible storage software and using an S3 connector instead of HDFS, it’s possible to separate storage […]

Hadoop Overview: A Big Data Toolkit

Big Data isn’t new. Forbes traces the origins back to the “information explosion” concept first identified in 1941. The challenge has been to develop practical methods for dealing with the 3Vs: Volume, Variety, and Velocity. Without tools to support and simplify the manipulation and analysis of large data sets, the ability to use that data […]