
Capitalizing on Elastic Computing


by Jelani Harper

Aside from its potential to integrate and aggregate a host of resources and technologies, Cloud Computing can most prominently benefit the enterprise by enabling Elastic Computing.

Elastic Computing is the use of the Cloud’s nearly limitless scalability to provision resources on demand and at scale, without the architectural concerns that doing so on-premise conventionally raises, such as:

  • Scaling Up and Down: In physical environments, substantially adding or subtracting resources requires reconfiguring one’s architecture, which is time-consuming. Moreover, after reconfiguring architecture to scale up, scaling back down can take even longer, if it is possible at all.
  • Storage: Storage in on-premise environments is significantly more costly than Cloud-based storage, and was the principal reason for the philosophy of minimizing the data kept in a physical warehouse in order to reduce storage costs.
  • Network and Bandwidth: Scaling in physical environments requires acquiring greater or lesser amounts of bandwidth and places a degree of strain on conventional architecture that is virtually non-existent when utilizing Cloud capabilities.

In comparison, the myriad positives associated with Elastic Computing include:

  • Increased Speed: By provisioning resources when one needs them for specific tasks, the end user can greatly speed up time-consuming processes such as querying and performing analytics.
  • Resource Allocation: Choosing how resources are used lets users dedicate, for example, one percentage to computing and another to storage, which improves efficiency and makes it possible to load a data warehouse while querying it.
  • Accommodating Various Workloads: As an extension of resource allocation, Elastic Computing can accommodate various workloads through a single Cloud platform.
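
To make the idea of provisioning on demand concrete, here is a minimal sketch of an elastic sizing rule. It is illustrative only: the function name, the capacity of ten queries per node, and the demand figures are all assumptions made for this example, not details of any particular Cloud provider or of Snowflake.

```python
import math


def nodes_needed(pending_queries, queries_per_node, min_nodes=1, max_nodes=64):
    """Return how many compute nodes to provision for the current demand.

    All parameters are hypothetical; a real service would use its own
    capacity model and limits.
    """
    if queries_per_node <= 0:
        raise ValueError("queries_per_node must be positive")
    wanted = math.ceil(pending_queries / queries_per_node)
    # Clamp to the allowed range so the pool never drops to zero
    # and never grows beyond the configured ceiling.
    return max(min_nodes, min(max_nodes, wanted))


# Assumed demand at four points in a day: quiet, busy, peak, quiet again.
demand = [3, 40, 250, 12]
plan = [nodes_needed(d, queries_per_node=10) for d in demand]
print(plan)  # [1, 4, 25, 2]
```

The same rule that scales the pool up for the peak scales it back down afterward, which is the step that tends to be slow or impossible in a physical environment.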

Snowflake Elastic Data Warehouse

“Probably the most important and relevant thing here is that at this point we’re sort of at that transitioning and inflection point where more data is coming from the Cloud than is being built on premises,” Snowflake Computing CEO Bob Muglia stated. “That’s a trend that will continue…As we talk to customers, more and more of them have a great deal of interest in doing data analysis in the Cloud. When the data is born in the Cloud it makes it a lot easier.”

The Snowflake Elastic Data Warehouse is a Data Warehouse-as-a-Service (DWaaS) offering that utilizes Elastic Computing to provide analytics on what the company terms business data, meaning both structured transactional data and semi-structured data provided by machines (including sensor or web-based Big Data). Its ability to analyze the latter is all the more notable because it is a fully relational SQL database that accommodates Big Data analytics in the Cloud as well as it does typical transactional data.

Leveraging SQL

There are several ways in which Snowflake’s Elastic Computing benefits are buttressed by its reliance on SQL and this query language’s continued relevance in the era of Cloud-based Big Data sets. Those advantages include:

  • Native Skills and Tools: Relational technologies dominated the database landscape up until Big Data’s popularity emerged a couple of years ago. Because SQL can readily analyze Big Data, organizations can use skills and tools that they already possess while reducing the need for scarce Data Scientists.
  • Integration: Snowflake’s DWaaS enables organizations to easily integrate Big Data with conventional transactional and historic data by allowing them to perform analytics on both without any need for replication or virtualization technologies. The result is a more thorough overview of an organization’s data assets and better insight into business problems.
  • Expedience: Relational databases typically maintain a bevy of metadata that can decrease the time it takes to run queries and return their results. For example, whereas batch-processing systems (such as Hadoop) would need to scan an entire dataset to find relevant information for a specific set of values, SQL technologies can do so in a fraction of the time by utilizing metadata to only query data with germane attributes.
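
The metadata shortcut described under Expedience can be sketched in a few lines. The example below is a simplified illustration, not Snowflake’s implementation: it assumes data is stored in fixed-size blocks and that each block records the minimum and maximum value it contains, so a range query can skip any block whose range cannot match. The `Block` class and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    min_value: int  # smallest value stored in this block (metadata)
    max_value: int  # largest value stored in this block (metadata)
    rows: list      # the values themselves


def build_blocks(values, block_size):
    """Split values into blocks and record min/max metadata for each."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        blocks.append(Block(min(chunk), max(chunk), chunk))
    return blocks


def query_range(blocks, low, high):
    """Return rows in [low, high] and the number of blocks actually scanned."""
    hits = []
    scanned = 0
    for block in blocks:
        # The metadata alone proves that no row in this block can match.
        if block.max_value < low or block.min_value > high:
            continue
        scanned += 1
        hits.extend(v for v in block.rows if low <= v <= high)
    return hits, scanned


values = list(range(1000))           # assumed sample data, already ordered
blocks = build_blocks(values, 100)   # ten blocks of one hundred rows
hits, scanned = query_range(blocks, 250, 260)
print(len(blocks), scanned, len(hits))  # 10 1 11
```

Here only one of ten blocks is read. A system without such metadata would have to scan all ten to answer the same question.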

Elastic Optimization and Warehouse Management

According to Snowflake Vice-President of Products and Marketing Jon Bock:

“We provided a system that allows you to load…semi-structured data in its native form—which you can’t do with most databases. Most databases you have to convert it first. And then once it’s in the system, we automatically optimize that using the optimizations that a relational database is really, really good at. That is something that Hadoop…doesn’t do because it doesn’t have all those optimizations that a relational database does. We give someone the ability to take structured data—which is business data—and combine it with machine generated data in one system or even one query, without having to do all these other steps.”
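
As a rough illustration of what Bock describes, the sketch below stores machine-generated JSON in its native form and joins it to a structured table in a single SQL query. It uses Python’s built-in SQLite module as a stand-in, not Snowflake, and the table names, the sample records, and the `json_get` helper are all assumptions made for this example.

```python
import json
import sqlite3


def json_get(document, key):
    """Return a top-level scalar field from a JSON document, or None if absent."""
    return json.loads(document).get(key)


conn = sqlite3.connect(":memory:")
# Register the hypothetical helper so SQL can read fields from raw JSON text.
conn.create_function("json_get", 2, json_get)

conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE events (raw TEXT)")  # JSON kept as-is, no conversion

conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)
raw_events = [
    '{"customer_id": 1, "sensor": "temp", "reading": 21.5}',
    '{"customer_id": 1, "sensor": "temp", "reading": 23.5}',
    '{"customer_id": 2, "sensor": "temp", "reading": 19.0}',
]
conn.executemany("INSERT INTO events VALUES (?)", [(e,) for e in raw_events])

# One query combines structured business data with semi-structured machine data.
rows = conn.execute(
    """
    SELECT c.name, AVG(json_get(e.raw, 'reading')) AS avg_reading
    FROM events AS e
    JOIN customers AS c ON c.id = json_get(e.raw, 'customer_id')
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
print(rows)  # [('Acme', 22.5), ('Globex', 19.0)]
```

This stand-in parses the JSON on every call and applies none of the automatic optimization the quote mentions; it only shows the shape of the one-query workflow.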

Additional efficiencies of the Snowflake Elastic Data Warehouse that make it extremely viable for analyzing structured and semi-structured Big Data sets while using the Cloud’s advantages include:

  • Dynamic Optimization: The database considers the state of an enterprise’s resources (which are subject to change over time due to Elastic Computing) and optimizes its processes around those resources dynamically. Conventional databases optimize around resources at a set point in time and require substantial modification to re-optimize once resources change.
  • Partitioning Resources: Users can adjust the resources that power the database with enough specificity to dedicate particular amounts to computing and storage, which allows them to issue queries while loading the database.
  • Automation: Snowflake Elastic Data Warehouse can automate various aspects of data warehouse management (copying data, tuning performance, or spreading data across the system) and assist end users in these processes as they go, instead of involving database administrators or other IT personnel.

Versus Hadoop

The initial explosion of Big Data largely involved Hadoop as a means of accessing semi-structured and unstructured data, which led organizations to attempt to use it as an integration hub for both their proprietary, on-premise data and data from other sources. Unlike Hadoop, Snowflake Elastic Data Warehouse is a native relational environment. This accounts for its expedience and ease of integration, and for querying and optimization that require less effort and fewer resources than Hadoop does. Hadoop’s non-relational design likewise limits its efficacy with transactional data, although Hadoop is open source and much more cost effective than a traditional database. Still, it is not a data warehouse.

Final Thoughts

Snowflake Elastic Data Warehouse runs on Amazon Web Services and utilizes Cloud-based storage options that are much less expensive than on-premise ones. Users have the physical security provided by Amazon at the site where the data is actually stored, and can utilize additional security options such as Amazon’s private Clouds, as well as conventional database security from Snowflake that governs who can access certain types of data and how.

The overall significance of Snowflake’s offering, however, is multifaceted. On the one hand, it presents a viable means for combining machine-generated Big Data with traditional structured data in a native SQL environment that enables quick analytics and a seamless integration of these data types. More importantly, perhaps, it points to Elastic Computing as one of the most vital capabilities that the Cloud engenders. The dynamic nature of Elastic Computing is responsible for myriad options in Data Management that would not be possible without it. Only time will reveal the various applications and uses it can provide for the enterprise.

 
