By: Robert Greene
At the recent Supercomputing conference in Seattle, the major theme was Big Data and the technologies that enable Big Data management. This was an interesting departure from the computational issues these events typically focus on, such as CPU versus GPU, high-speed network interconnects, and storage. But the focus on Big Data makes sense given that many attendees are feeling the strain of the data explosion, taking on issues involving time- and space-dependent data sets in the terabyte and even petabyte range. As a result, much of the discussion centered on software rather than just the bits and bytes of the next evolution in the hardware stack.
Through these discussions a clear theme emerged: software systems will need to be rewritten over the next decade. In particular, existing software solutions are not designed to leverage multi-core processors or solid-state storage, both of which are undeniable trends that will dominate the hardware architectures of the data center and the enterprise within the next five years.
This is not to say that existing software systems were poorly designed; rather, the software design decisions of the past were based on bottlenecks that simply do not exist in the newer hardware. For example, databases with heavy I/O components were designed for block-based I/O, which meant moving data into memory in big chunks because spinning-disk seek time was the limiting factor in throughput. Given that constraint, it was better to seek a small number of times and read more data on each seek.
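The seek-versus-transfer trade-off can be made concrete with a back-of-the-envelope cost model. The sketch below is illustrative only: the function name `read_time_ms` and the figures plugged into it (a 10 ms seek and 100 MB/s of sequential bandwidth for a spinning disk) are assumptions chosen for the example, not measurements from any particular drive or database.

```python
def read_time_ms(num_requests, bytes_per_request, seek_ms, bandwidth_mb_per_s):
    """Estimate total read time as one seek per request plus transfer time.

    This is a deliberately simple model: it ignores caching, queuing,
    and rotational latency beyond what is folded into seek_ms.
    """
    total_bytes = num_requests * bytes_per_request
    transfer_ms = total_bytes / (bandwidth_mb_per_s * 1_000_000) * 1000.0
    return num_requests * seek_ms + transfer_ms


# Assumed spinning-disk figures (illustrative, not measured).
HDD_SEEK_MS = 10.0
HDD_BANDWIDTH_MB_PER_S = 100.0

# Fetch 1,000 small records one at a time: 1,000 seeks, little data moved.
fine_grained = read_time_ms(1000, 4096, HDD_SEEK_MS, HDD_BANDWIDTH_MB_PER_S)

# Fetch the same records by reading 10 blocks of 1 MiB: 10 seeks, more data moved.
block_based = read_time_ms(10, 1024 * 1024, HDD_SEEK_MS, HDD_BANDWIDTH_MB_PER_S)

print(f"fine-grained: {fine_grained:.0f} ms, block-based: {block_based:.0f} ms")
```

Under these assumed numbers, the fine-grained approach takes roughly 10 seconds and the block-based approach roughly 0.2 seconds, even though the block-based reads move more than twice as much data. Passing a much smaller `seek_ms` to the same function shows how the penalty for issuing many small requests shrinks as seek time falls.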
However, with flash- or DRAM-based SSDs, seek time is negligible. Software can therefore become more granular in how it deals with storage, and by becoming more granular it can accommodate higher levels of concurrency and throughput. Similarly, many software systems were designed around largely single-threaded, exclusive-access data structures; those systems must be redesigned with multi-threading and concurrent-access data structures in mind to achieve the parallelism and concurrency demanded by the next generation of real-time business solutions.
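As one illustration of what moving from an exclusive-access structure to a concurrent-access one can look like, the sketch below uses lock striping: instead of a single lock guarding the whole map, keys are spread across several independently locked shards, so threads working on different shards do not block one another. The class `StripedCounterMap` and the helper `hammer` are hypothetical names invented for this example, and lock striping is only one of several possible techniques. Note also that in standard CPython the global interpreter lock limits true parallelism, so this shows the structure of the design rather than a measured speedup.

```python
import threading


class StripedCounterMap:
    """Hypothetical map of counters guarded by several locks instead of one."""

    def __init__(self, num_stripes=16):
        # One lock per shard: contention only occurs within a shard.
        self._locks = [threading.Lock() for _ in range(num_stripes)]
        self._shards = [{} for _ in range(num_stripes)]

    def _stripe(self, key):
        # Map a key to a shard index; % on a positive modulus is never negative.
        return hash(key) % len(self._locks)

    def increment(self, key, amount=1):
        i = self._stripe(key)
        with self._locks[i]:
            shard = self._shards[i]
            shard[key] = shard.get(key, 0) + amount

    def get(self, key):
        i = self._stripe(key)
        with self._locks[i]:
            return self._shards[i].get(key, 0)


def hammer(counter_map, keys, repeats):
    """Increment every key `repeats` times; intended to run in many threads."""
    for _ in range(repeats):
        for key in keys:
            counter_map.increment(key)


if __name__ == "__main__":
    counters = StripedCounterMap()
    workers = [
        threading.Thread(target=hammer, args=(counters, ["a", "b", "c"], 1000))
        for _ in range(4)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(counters.get("a"))
```

With a single global lock, every update would wait on every other update; with striping, only updates that land in the same shard serialize, which is the kind of finer-grained access that multi-core hardware rewards.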
The task of redesigning software components built on decades of code for this new hardware paradigm, while maintaining backwards compatibility with solutions already deployed in the enterprise, is an impossible effort. Change will therefore come to all: out of necessity, organizations will re-evaluate and choose new software components to carry them into the next decade and beyond. Otherwise, they will fall behind as competitors move to the new technologies that enable real-time business in an increasingly competitive landscape.