Of all the technologies radically altering the landscape of Data Management today (Big Data, mobile, analytics, and others), Cloud Computing is arguably one of the most important, as it enables these technologies to work with one another with an efficacy that would otherwise not be possible.
While various Cloud Computing applications regularly make headlines for predictive analytics, the Internet of Things, Cognitive Computing and more, it’s relatively easy to forget that the insight and action they deliver hinge on the highly scalable, ubiquitously accessible, elastic resource provisioning environment of the Cloud.
As a reminder of this fact, an entire session was dedicated to the necessity of the Cloud in modern Data Management practices at IBM’s Insight 2014 conference. Hosted by IBM General Manager of Cloud Strategy Don Rippert, “The Power of Data in the Cloud” emphasized Cloud Computing’s utility for data integration, access, and replication, while examining critical aspects of Cloud infrastructure and security through a number of unique use cases that highlight its burgeoning applicability across vertical industries.
Arguably the most essential aspect of the Cloud is its ability to integrate a nearly limitless number of data sources involving structured, semi-structured, and unstructured data. Such integration spans geographic locations, includes both on-premise and Cloud sources, and is frequently typified by access in real time or close to real time. The degree of scalability required to do so, and the levels of performance that the Cloud enables while doing so, would be extremely costly (if not impossible) to achieve in traditional server-based, on-premise environments. The seamless integration of proprietary data in a physical location with myriad public Big Data sources, together with the exposure of Cloud APIs in products such as IBM Bluemix, creates some extreme use cases, such as:
- Speedboat Racing: Speedboat racing is a data-intensive, specialized sport in which competitors regularly race across oceans at speeds in excess of 100 miles an hour. Its unique data demands include tracking a plethora of competitors at different geographic locations, broadcasting those locations in a comprehensive fashion for a television and commentator audience, and disseminating data to judges and officials so they can accurately award accolades for various facets of performance. Additionally, racers need up-to-the-minute information about the many different facets of the craft they are piloting and the activities of their competitors. By utilizing the Cloud to funnel upwards of a hundred disparate data sources at once for analytics, advanced analytics company DataSkill was able to partner with visualization specialist Virtual Eye to not only present an attractive broadcast package to audiences, but also significantly improve the performance and awareness of competitors.
- Grocery Store/Retail Management: Grocery store retailer Delhaize America has utilized different IBM Cloud technologies to gain increased knowledge regarding customer behavior and weather patterns. By combining on-premise point-of-sale data with Cloud-based, Big Data weather sources, the chain was able to ascertain that warm weather influences sales of beer, lamb, and veal, that people tend not to shop when the weather is too hot, and numerous other “proprietary trade secrets” that Director of Business Intelligence and Data Management Nik Greene decided not to share with Rippert.
- Yacht Racing: By sending data to the Cloud for analytics in a fashion similar to the one it uses for speedboat racing with DataSkill, Virtual Eye CEO Ian Taylor was able to track the progress of seven different boats during the Volvo Ocean Race while giving a presentation on Cloud visualizations from the comfort of a smartphone.
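The retail use case above rests on joining on-premise sales records with an external weather feed. The following is only a minimal sketch of that kind of join, not Delhaize America's or IBM's actual method: every record, the field layout, the function name `average_units_by_weather`, and the 75-degree threshold are invented for illustration.

```python
from collections import defaultdict

# Hypothetical on-premise point-of-sale rows: (date, product, units_sold).
# These numbers are made up purely to exercise the code.
pos_sales = [
    ("2014-07-01", "beer", 120),
    ("2014-07-02", "beer", 80),
    ("2014-07-03", "beer", 130),
    ("2014-07-04", "beer", 70),
]

# Hypothetical Cloud weather feed: date -> daily high in degrees Fahrenheit.
weather = {
    "2014-07-01": 85,
    "2014-07-02": 68,
    "2014-07-03": 88,
    "2014-07-04": 65,
}


def average_units_by_weather(sales, temps, warm_threshold=75):
    """Join sales to weather on date and average units per weather bucket."""
    totals = defaultdict(lambda: [0, 0])  # bucket -> [total units, day count]
    for date, _product, units in sales:
        if date not in temps:
            continue  # skip sales days with no matching weather reading
        bucket = "warm" if temps[date] >= warm_threshold else "cool"
        totals[bucket][0] += units
        totals[bucket][1] += 1
    return {bucket: units / days for bucket, (units, days) in totals.items()}


averages = average_units_by_weather(pos_sales, weather)
print(averages)
```

In practice the weather side would arrive from a Cloud service rather than a hard-coded dictionary, and the comparison would cover many products and stores; the join on a shared key (here, the date) is the part the sketch is meant to show.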
The ubiquitous access that Cloud Computing is known for is realized primarily through geographic location, in which users in different physical locations in different parts of the country or world are all able to access the same data. Database-as-a-Service (DBaaS) offerings and other Service Oriented Architecture (SOA) platforms enable users to utilize the same data and share it through different applications regardless of their location. IBM’s Cloudant is a NoSQL DBaaS that scales with the best of databases and utilizes basic analytics functionality such as search and indexing. Its access potential is substantially enhanced by its ability to replicate data, which is one of the core points of interest in the Cloud and enables users to transmit data between it and physical locations.
Data replication is primarily done for the purposes of analytics as well as to move data back and forth from the Cloud. It is also a vital process in organizations that have multiple data centers. Due to the replication characteristics of Cloud platforms such as Cloudant, organizations are able to utilize several key functions that are necessary for relying on data. Replication enables continuous availability of data, which is critical when maintenance and upgrades are being performed on software or hardware while data continues to arrive or stream in.
Replication is also vital for the prevention of failures, helps organizations to have more than one copy of data, and is an invaluable aspect of the ubiquitous, simultaneous user access the Cloud provides. Certain platforms (such as Cloudant) can effectively save copies of changes that were made to data, so that regardless of who is accessing it via the Cloud, users are assured of utilizing the most recent copies of data:
“One of the great things about Cloudant is all of the replicas of the data can not only be read, but can be written to as well,” IBM Fellow, Chief Technology Officer Cloud and PureApplication Jason McGee said. “So, you’re not bottlenecking on one point that’s able to update the information. You can do that from anywhere you want.”
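The idea McGee describes, in which every replica accepts writes and changes flow in both directions, can be illustrated with a toy model. This is a simplified sketch and not Cloudant's actual replication protocol: the `Replica` class, the `replicate` and `sync` functions, the integer revision counter, and the sample documents are all assumptions made for illustration, and the sketch does not attempt to resolve conflicting writes made to the same document on two replicas at once.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A toy data store that accepts both reads and writes."""
    name: str
    docs: dict = field(default_factory=dict)  # doc_id -> (revision, value)

    def write(self, doc_id, value):
        # Each local write bumps the document's revision counter.
        revision, _ = self.docs.get(doc_id, (0, None))
        self.docs[doc_id] = (revision + 1, value)

    def read(self, doc_id):
        return self.docs[doc_id][1]


def replicate(source, target):
    """Copy each document whose revision on source is newer than on target."""
    for doc_id, (revision, value) in source.docs.items():
        target_revision, _ = target.docs.get(doc_id, (0, None))
        if revision > target_revision:
            target.docs[doc_id] = (revision, value)


def sync(a, b):
    """Two-way replication: push changes in both directions."""
    replicate(a, b)
    replicate(b, a)


# Hypothetical replicas in two regions, each taking a write locally.
us = Replica("us-east")
eu = Replica("eu-west")
us.write("order-1", {"status": "placed"})
eu.write("order-2", {"status": "placed"})
sync(us, eu)

# A later update on one replica reaches the other after the next sync.
eu.write("order-1", {"status": "shipped"})
sync(us, eu)
print(us.read("order-1"))
```

Because neither replica is a designated primary, a write can land on whichever one is closest or currently available, which is the bottleneck-avoidance point made in the quote.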
Because the Cloud is so widely accessible and available, questions of security have frequently accompanied its deployments. However, the unique infrastructure of the Cloud actually presents an opportunity to refresh security procedures and to make them as flexible and agile as certain applications. Instead of taking a reactive approach to security and implementing protective measures only after breaches have occurred, Cloud-based security largely evolves as applications and business processes do. The difference lies between the manual approach of the on-premise environment and the automated approaches of the Cloud, which present greater standardization for security and are less static. Pivotal differences between Cloud and on-premise security typically include:
- A greater focus on access: Most on-premise security is focused on the perimeter and entrance points. Security that restricts Cloud access, however, is focused on all the different layers of an application, not just its entry points.
- Visibility: A number of modern security measures and tools specifically for the Cloud provide visibility layers in which IT personnel can see who is utilizing what data for what purpose, and regulate access accordingly.
- Protection: Certain online security platforms can help to reduce instances of phishing and phone fraud by analyzing data and determining the patterns of attackers to provide more proactive security that can ultimately function as a deterrent.
“We worked with an automaker who was able to simplify how their customers were able to log into their applications by connecting their existing social identities (on places like Facebook and Twitter and Google and other places) to access their applications. So they were able to both protect and manage access to the data that they cared about and provide a simplified experience for their clients.”
The power of Cloud Computing is formidable and is regularly used to enhance most other facets of Data Management. The sort of ubiquitous analytics it provides for sporting events and retail can easily apply to other industries. It effectively functions as a platform to integrate and improve some of the most relevant technologies in the data landscape today. Finally, its impact on data replication and security hints at the way in which it is actually transforming how organizations are able to conduct business, by provisioning continuous access to data and shifting security paradigms to better align with overall agile processes.