by Angela Guess
Josh Constine of TechCrunch reports, “Facebook revealed some big, big stats on big data to a few reporters at its HQ today, including that its system processes 2.5 billion pieces of content and 500+ terabytes of data each day. It’s pulling in 2.7 billion Like actions and 300 million photos per day, and it scans roughly 105 terabytes of data each half hour. Plus it gave the first details on its new ‘Project Prism’.”
Constine continues, “VP of Engineering Jay Parikh explained why this is so important to Facebook: ‘Big data really is about having insights and making an impact on your business. If you aren’t taking advantage of the data you’re collecting, then you just have a pile of data, you don’t have big data.’ By processing data within minutes, Facebook can roll out new products, understand user reactions, and modify designs in near real-time.”
He goes on, “Another stat Facebook revealed was that over 100 petabytes of data are stored in a single Hadoop disk cluster, and Parikh noted ‘We think we operate the single largest Hadoop system in the world.’ In a hilarious moment, when asked ‘Is your Hadoop cluster bigger than Yahoo’s?’, Parikh proudly stated ‘Yes’ with a wink.”
photo credit: TechCrunch

















