A Brief History of the Hadoop Ecosystem

In 2002, internet researchers just wanted a better search engine, and preferably one that was open-sourced. That was when Doug Cutting and Mike Cafarella decided to give them what they wanted, and they called their project “Nutch.” Hadoop was originally designed as part of the Nutch infrastructure, and was presented in the year 2005. The […]

Data Lakes: What They are and How to Use Them

By Jaya Shankar Byrraju. For most companies, having data means having access to wealth. And the key to fully leveraging the wealth that data represents lies in how effectively companies harness, manage, parse, and interpret it. But first, the data must exist somewhere. Enter data lakes. These are central repositories […]

Unifying Big Data Workloads

Try querying Big Data sets and computing results through high volumes and variety across multiple independent storage systems – you’ll find a tangled web in the Tower of Babel, where platforms communicate in different languages. Then ask for speedy manipulations with that data set and it seems almost impossible. This describes the challenge faced by […]

Benchmarking Hadoop Performance: On-Premises S3-Compatible Storage Keeps Pace with HDFS

By Gary Ogasawara and Tatsuya Kawano. When deploying Hadoop, scaling storage can be difficult and costly because the storage and compute are co-located on the same hardware nodes. By implementing the storage layer using S3-compatible storage software and using an S3 connector instead of HDFS, it’s possible to separate storage […]

The Power of Crunching Big Data Effectively

By Lex Boost. Not embracing the Big Data trend can cost your company. According to an Accenture study, 79 percent of enterprise executives agree that companies not embracing Big Data will lose their competitive edge. With data creation on track to grow tenfold by 2025, it is extremely important for […]

Hadoop Overview: A Big Data Toolkit

Big Data isn’t new. Forbes traces the origins back to the “information explosion” concept first identified in 1941. The challenge has been to develop practical methods for dealing with the 3Vs: Volume, Variety, and Velocity. Without tools to support and simplify the manipulation and analysis of large data sets, the ability to use that data […]

8 Big Data Trends to Watch For

By Angela Guess. Tom Phelan, Chief Architect of BlueData, recently wrote in InsideBigData, “Over the next year, a growing number of customers will realize the vast business benefits of Big Data and will deploy Big Data solutions across their organization. Technical innovations, the rise of BDaaS, a shifting approach to data locality, platform convergence and […]