“Our recent research shows that companies now agree that data is the most important asset in an organization today – sometimes even more than human resources,” said Mathias Golombek, CTO of Exasol, in a recent DATAVERSITY® interview. Yet evolving technology and organizational issues have led to separated systems, data silos, and lost opportunities. “The question is, how can you solve that issue?”
A History of Data Silos
In most organizations, data is not centralized at any one point, said Golombek. There are several reasons for this. Legacy systems have certain limitations regarding the number of users accessing the system and the amount of data volume they can manage. In large organizations with many distributed departments, it’s difficult to have the comprehensive aggregated view of all data required to do proper Data Governance. “On the other hand, if you could aggregate the data all together, there’s a huge potential,” he commented.
The Data Lake Problem
Golombek said that many early Big Data adopters got started with data lakes when there was a proliferation of new Hadoop systems and Big Data initiatives.
“Oddly enough, 70 percent to 80 percent of these projects failed. There was not a clear outcome with these initiatives because they were mainly about the technology and not about the use case,” he said.
Companies created big teams and big clusters of servers, and loaded all kinds of data into these systems without any clear strategy. Another factor with Hadoop systems, he said, is that they offer low-cost, scalable pure storage, but big performance problems emerge when complex data analytics requires access to large data volumes. “That’s why data lakes turn often into data ‘reservoirs,’ which are not accessible anymore,” he said.
Golombek said that historically, organizations created separated teams with different tools and focuses, creating silos. The Data Science team tended toward the more mathematical: statisticians, Python experts, and so on. The Business Intelligence team came from the database world: SQL experts, data warehouse experts, DBAs and so on. For a long time, these teams worked in completely separate environments. They had separate systems, they had separate data and they didn’t work closely together. “A positive trend we see with our clients is that these two groups are now working together, which is very important.”
Data as Part of Overall Strategy
Because of the growing recognition of the value of data as an asset, Golombek said that many of his clients now want to transform themselves into data-driven companies. As part of this strategy, organizations create competence centers and new roles like Chief Data Officer, or Chief Data Science Officer. Rather than having a data warehouse managed by the IT department, the data and how it is managed becomes a strategic part of the entire company and its business chain. “And that is the key for every other following step. Otherwise Data Management ends up being focused on small discussions about IT projects and technology.”
The focus on data as a strategic, company-wide asset has to start at the top. Once top management is on board, the next step is finding the right solution.
“You must have the right technology. You need technology which enables you and is not a burden,” he remarked. Tools should allow the storage and integration of as much data as possible without creating additional bottlenecks. Because different kinds of data can affect performance and access times, Golombek warns against aggregating everything “in one single box.” He suggests instead a smart architecture with a layered system and federated access to all the data stored in different systems.
According to Golombek, there are three capabilities that can enable companies to become more data-driven: performance, federation, and self-tuning.
The Exasol Advantage
A powerful engine to integrate directly with the database can allow more people to access that data, no matter where it is, said Golombek. With Exasol’s new addition of AI, data can be used in a multidimensional way, with the ability to accelerate data analytics both in the cloud and on-prem for organizations of all sizes. “Besides the pure speed, you also gain the ability to create more complex questions throughout the data,” he said. Other aspects of the Exasol platform include:
- Federation: Allows access to data across multiple systems, silos, and places. Exasol allows federation through an open framework using the programming language of your choice.
“Once you install the adaptor, which is just a very small clip, you can create virtual schemas, which actually link to the underlying technology. You could create a schema to a MongoDB or to an Oracle. Afterwards the data, which is sitting in Oracle, is virtually available in Exasol without copying the data completely into the database.”
- Self-tuning: Automated tuning removes the need for DBA time spent optimizing. “If you come from an Oracle zone, you normally have a couple of DBAs doing nothing other than optimizing and tuning the database because they are constantly hitting walls,” added Golombek. “We don’t have that. The Exasol database was designed to completely run on its own.”
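The federation idea described above, querying data where it lives through a registered link rather than copying it into one place, can be illustrated with a small sketch. This is not Exasol's virtual schema API: the `FederatedCatalog` class, its `create_virtual_schema` and `query` methods, and the `sales` and `crm` sources are all hypothetical names invented for illustration, and two in-memory SQLite databases stand in for the separate systems.

```python
import sqlite3


class FederatedCatalog:
    """Toy federation layer (hypothetical): maps schema names to live connections."""

    def __init__(self):
        self._sources = {}

    def create_virtual_schema(self, name, connection):
        # Register a link to the source system; no rows are copied.
        self._sources[name] = connection

    def query(self, schema, sql, params=()):
        # Forward the query to the system that owns the data.
        return self._sources[schema].execute(sql, params).fetchall()


def make_source(table_sql, insert_sql, rows):
    # Build an independent in-memory database that plays the role of one silo.
    conn = sqlite3.connect(":memory:")
    conn.execute(table_sql)
    conn.executemany(insert_sql, rows)
    conn.commit()
    return conn


# Two separate "silos", each standing in for a different system.
orders_db = make_source(
    "CREATE TABLE orders (customer_id INTEGER, amount REAL)",
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 20.0), (1, 5.0), (2, 12.5)],
)
crm_db = make_source(
    "CREATE TABLE customers (customer_id INTEGER, name TEXT)",
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)

catalog = FederatedCatalog()
catalog.create_virtual_schema("sales", orders_db)
catalog.create_virtual_schema("crm", crm_db)


def revenue_by_customer(catalog):
    # Each part of the question is answered by the system that holds the data;
    # only the small result sets are combined here.
    totals = dict(catalog.query(
        "sales",
        "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id",
    ))
    names = dict(catalog.query("crm", "SELECT customer_id, name FROM customers"))
    return {names[cid]: total for cid, total in totals.items()}


print(revenue_by_customer(catalog))
```

Because the catalog holds links rather than copies, a row added to a source system is visible to the next query without any reload step. A real federation layer would also push filters and joins down to the sources, which this sketch leaves out.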
Exasol is a database and analytics vendor, with cross-platform parallels to Oracle, IBM Db2, SQL Server, Snowflake, or Redshift. The chief difference is that other than SAP HANA, Exasol is the only pure in-memory database. “Our founders had the vision 20 years ago that memory would become cheap one day, so they decided to create a memory database.” Golombek said that they ended up creating a very scalable database that is completely focused on memory computing. “Besides HANA, Exasol is actually the only software in the market completely optimized for memory computing.”
Once Golombek’s customers realize the new opportunities they have, they have the freedom to become innovative. “With Exasol, they can analyze and store more data, and put more back into the history of their customers. They can do Data Science on a far more granular level.”
Use Case: A Data-Driven Company
Golombek sees Exasol as an enabling tool and many of his users say that once they brought in Exasol, they were able to forget about technology and concentrate on use cases. “We are an empowerment technology.”
As an example of what his customers have been able to do with this new level of empowerment, he talked about a large European e-commerce customer that used Exasol to optimize their e-commerce shop. They brought in all their research data, customer profiling, logistics and supply chain, financial data, and their inventory. “Now they know exactly what’s in their system, what can be sold, what’s working and what’s not,” said Golombek.
By having access to all their data, they can reduce fraud, better manage stock levels, and implement things like dynamic pricing. Front-end performance is optimized while all the number crunching is done on the backend, using artificial intelligence to present the right product to each individual customer. This reduces the rate of returns.
“They are a data-driven company through and through. They claim to have not a single decision made by humans—only based on data and only automated.” Rather than looking at reports and then having people make decisions, everything is coded into algorithms based on data analytics, with all the analytics done using Exasol. “We’re very proud of this because we were able to help this company become a really, really big player in the market.”
Golombek notes the change that his clients experience after working with Exasol for a while. “Once they realize the opportunities, they really start to become very creative.” When working with adidas, one of their employees told Golombek that if he’d run a recent query against their old Oracle system, he would have been fired, because the system would have been blown away by it. “That is a game changer, because it opens you up to completely new thinking.”