by Angela Guess
Hovhannes Avoyan is continuing his series of articles on the best NoSQL databases. His latest installment looks at Apache HBase, which he describes as “originally created for use with Apache’s Hadoop, a software framework that supports data-intensive distributed applications under a free license.”
Avoyan writes, “HBase is really a clone (or a very close relative) of Google’s Bigtable, and, like I said, it was originally created for use with Hadoop. Actually, HBase is a subproject of the Apache Hadoop project. HBase offers database capabilities for Hadoop, which means you can use it as a source or sink for MapReduce jobs. HBase is a column-oriented database, and it is built to provide low latency requests on top of Hadoop HDFS. Unlike some other columnar databases that provide eventual consistency, HBase is very consistent.”
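For readers unfamiliar with the Bigtable-style layout Avoyan refers to, the following is a minimal in-memory sketch, not HBase code. It assumes the commonly described model in which a row key maps to "family:qualifier" columns, each holding timestamped versions. The class name `ToyTable` and its methods are hypothetical and exist only for illustration.

```python
from collections import defaultdict


class ToyTable:
    """Hypothetical in-memory toy of a Bigtable-style table (not the HBase API)."""

    def __init__(self):
        # row key -> "family:qualifier" -> list of (timestamp, value), newest first
        self._rows = defaultdict(lambda: defaultdict(list))

    def put(self, row, column, value, timestamp):
        """Store one version of a cell and keep versions sorted newest first."""
        cells = self._rows[row][column]
        cells.append((timestamp, value))
        cells.sort(key=lambda cell: cell[0], reverse=True)

    def get(self, row, column):
        """Return the newest value of a cell, or None if the cell is absent."""
        cells = self._rows.get(row, {}).get(column, [])
        return cells[0][1] if cells else None

    def scan(self, start_row, stop_row):
        """Yield (row, {column: newest value}) for start_row <= row < stop_row, in key order."""
        for row in sorted(self._rows):
            if start_row <= row < stop_row:
                newest = {col: cells[0][1] for col, cells in self._rows[row].items()}
                yield row, newest


table = ToyTable()
table.put("user#1", "info:name", "Ada", timestamp=1)
table.put("user#1", "info:name", "Ada L.", timestamp=2)  # newer version of the same cell
table.put("user#2", "info:name", "Grace", timestamp=1)
```

In this toy, a read returns the newest version of a cell, and a scan walks rows in sorted key order, which is the property that makes range scans over row keys cheap in this family of databases.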
He goes on, “An HBase cluster uses several kinds of servers. For one, HDFS needs at least one namenode and several datanodes. Plus, HBase needs a ZooKeeper cluster, a master and several region servers. Requests must be made to the master(s). On the HDFS level, existing data are not sharded automatically. However, new data is sharded. On the HBase level, data is divided into regions that are sharded automatically across region servers.”
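The idea of regions spread across region servers can be sketched in a few lines. This is a simplified illustration, not how HBase is implemented: it assumes each region covers a contiguous range of row keys identified by its start key, and that a lookup finds the region whose range contains the key. The class `ToyRegionMap` and the server names are hypothetical.

```python
import bisect


class ToyRegionMap:
    """Hypothetical map from row-key ranges (regions) to region servers."""

    def __init__(self, regions):
        # regions: iterable of (start_key, server); each region runs from its
        # start key up to, but not including, the next region's start key.
        self._regions = sorted(regions)
        self._starts = [start for start, _ in self._regions]
        if not self._starts or self._starts[0] != "":
            raise ValueError("the first region must start at the empty key")

    def locate(self, row_key):
        """Return the server holding the region whose key range contains row_key."""
        index = bisect.bisect_right(self._starts, row_key) - 1
        return self._regions[index][1]

    def split(self, split_key, new_server):
        """Split the region containing split_key; keys >= split_key go to new_server."""
        if split_key in self._starts:
            raise ValueError("a region already starts at this key")
        bisect.insort(self._regions, (split_key, new_server))
        self._starts = [start for start, _ in self._regions]


region_map = ToyRegionMap([("", "rs1"), ("m", "rs2")])
before_split = region_map.locate("zebra")  # served by rs2 before the split
region_map.split("t", "rs3")               # upper part of the second region moves
after_split = region_map.locate("zebra")   # now served by rs3
```

The split step stands in for the automatic sharding the quote mentions: when a region grows, part of its key range can be handed to another server without touching the other regions.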
photo credit: HBase