This presentation was given on Tuesday, August 23, 2011, at the NoSQL Now! 2011 Conference in San Jose, California.
About the Presentation
This tutorial session discussed the motivation, real-world use cases, and choices to make when deploying two related and popular open-source “NoSQL” systems: Apache Hadoop and Apache HBase. These systems enable enterprises to store, analyze, and profit from all of their data. To handle petabyte-scale volumes of raw data, these distributed systems make different assumptions and design decisions than traditional relational databases. These designs enable new applications and potentially enable cloud deployments. We will also discuss some of the tradeoffs when choosing between public cloud deployments and private cluster deployments.
During the session we will discuss several topics:
- History of the Apache Hadoop and Apache HBase projects.
- Tell-tale signs that it is time to seriously consider Hadoop and HBase.
- A high-level introduction to HBase’s data access interfaces and abstractions.
- Examples of real-world HBase and Hadoop use cases and application architectures.
- How the high-level architecture of HBase and Hadoop enables on-the-fly scaling to deal with increased workloads.
- Using tools like Apache Whirr to deploy Hadoop and HBase in public clouds.
- Hadoop and HBase tradeoffs: Public cloud vs private cluster.