by Angela Guess
Rob Gonzalez recently discussed what he sees as the two main Big Data problems. Gonzalez writes, “With all the hullabaloo around Big Data, I’ve been a little surprised that there hasn’t been more talk about how to consume the vast petabytes that people are talking about…until I realized that there are really two Big Data problems out there!”
He continues, “Roughly speaking, the two primary ways in which data scales is by adding depth and by adding breadth. The first is what most people mean when they refer to Big Data. Want to run analytics on every single transaction that Wal*Mart has done over 10 years to analyze trends? THAT is vertical scale. Technically, you can characterize it as having lots and lots of similarly structured data. That is where technologies like Hadoop and column-based data storage make a big difference.”
Gonzalez adds, “Horizontal Big Data, on the other hand, is like the Linked Data Cloud. It has all kinds of random information that ranges from highly structured and numeric to highly unstructured. Significantly, it tends to change quite a bit over time with increasing heterogeneity. That’s a completely different kind of scale, and one that is not well solved by using highly structured, vertically scaling technologies.”
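To make the contrast concrete, here is a minimal sketch of the two shapes of data Gonzalez describes. It is an illustration only, not his method: the sample records, the triple layout, and the helper names (total_by_year, describe, predicates) are all hypothetical. The first half treats "vertical" data as many rows sharing one fixed schema, where a single aggregation handles any number of rows; the second half treats "horizontal" data as loosely structured (subject, predicate, object) facts, in the spirit of Linked Data, where new kinds of attributes can appear without any schema change.

```python
from collections import defaultdict

# Vertical scale: many records that all share one fixed schema.
# These rows are made-up sample data, not real sales figures.
transactions = [
    {"store": "A", "year": 2010, "amount": 20.0},
    {"store": "A", "year": 2011, "amount": 35.0},
    {"store": "B", "year": 2010, "amount": 15.0},
]


def total_by_year(rows):
    """Sum the 'amount' field per year; works the same for 3 or 3 billion rows."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["year"]] += row["amount"]
    return dict(totals)


# Horizontal scale: heterogeneous facts stored as (subject, predicate, object)
# triples. Values mix strings, numbers, and free text, and new predicates
# can be added at any time without changing a schema.
triples = [
    ("store:A", "locatedIn", "Ohio"),
    ("store:A", "openedYear", 1998),
    ("product:42", "label", "garden hose"),
    ("product:42", "reviewText", "Works fine, a bit stiff in cold weather."),
]


def describe(subject, facts):
    """Collect every known attribute of one subject, whatever its type."""
    return {p: o for s, p, o in facts if s == subject}


def predicates(facts):
    """List the distinct kinds of attributes present; this set grows as breadth grows."""
    return sorted({p for _, p, _ in facts})


if __name__ == "__main__":
    print(total_by_year(transactions))
    print(describe("store:A", triples))
    print(predicates(triples))
```

In the first case the work grows with the number of rows while the structure stays fixed, which is the situation that batch and column-oriented tools are built for. In the second, the structure itself keeps growing, so the code has to discover the attributes rather than assume them.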
photo credit: Lauren Manning