Streaming Analytics: Predictions of the Future


Streaming Analytics provides real-time optimization of the customer experience, improves quality of service and network performance, and creates new revenue streams for the enterprise. Netflix uses streaming analytics to operate more efficiently, forecast changes more effectively, and save money by implementing new processes in real time. Netflix has over 75 million customers streaming data worldwide. It can track how many users watched all seasons of a given program. The data gathered can reveal user trends, such as the day of the week when users watched a movie, the time of day, the zip code, and other metrics.

Auto insurance companies use Streaming Analytics to develop policies based on driver data. Companies can determine drivers’ habits, assess auto risks more accurately, create customized pricing based on those habits, and create incentive programs that encourage drivers to improve them.

According to Steve Wilkes, Chief Executive Officer and Founder of Striim, his company wanted to do more than just capture, store, process, and analyze data. They wanted to do it in real time, with a platform that could also visualize the data and set alerts for issues. “We’ve built that platform with Striim,” he said. “We can now source data from lots of different sources, not just change data capture from databases, but also log files, message buses, and IoT devices.” Machine data captured from log files, networks, services, and devices can be collected and analyzed in real time without having to store the data. Wilkes says:

“People assume log files are streams. If you think of databases, most people don’t think of those as streaming. That [changing] event stream is the database activity, so by using change data capture as part of our platform, being able to look at the change, the activity that’s happening in an Oracle database or a MySQL database, even the HP Nonstop – even in those databases we can do change data capture. You wait until you have a file and wait until you put it into sources of streaming.”

All of those files can be captured as streaming data. The real challenge is taking all the different sources, whether they sit within the many enterprise databases, at the edge of the network, or in any of the file types an enterprise may need for its analytics, and processing them all in real time as streams.
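
As a rough illustration of that idea, the following Python sketch merges two time-ordered sources (database change events and log lines) into one stream and raises an alert on a burst of errors, holding only a short sliding window in memory rather than storing the data. It is not Striim's implementation; the `Event` type, the function names, and the sample data are all hypothetical.

```python
import heapq
from collections import deque
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    """Hypothetical event record shared by every source."""
    timestamp: float  # seconds since an arbitrary start
    source: str       # e.g. "cdc" (change data capture) or "log"
    kind: str         # e.g. "UPDATE", "INSERT", "ERROR"


def merge_streams(*streams: Iterable[Event]) -> Iterator[Event]:
    """Lazily interleave several time-ordered streams by timestamp."""
    return heapq.merge(*streams, key=lambda event: event.timestamp)


def alert_on_bursts(events: Iterable[Event], kind: str,
                    window_seconds: float, threshold: int) -> Iterator[float]:
    """Yield a timestamp each time `threshold` or more events of `kind`
    fall inside the sliding window ending at the current event."""
    recent = deque()  # timestamps of matching events still inside the window
    for event in events:
        if event.kind != kind:
            continue
        recent.append(event.timestamp)
        # Drop timestamps that have aged out of the window.
        while recent[0] <= event.timestamp - window_seconds:
            recent.popleft()
        if len(recent) >= threshold:
            yield event.timestamp


# Made-up sample data standing in for live sources.
cdc_events = [Event(1.0, "cdc", "UPDATE"), Event(4.0, "cdc", "INSERT")]
log_events = [Event(2.0, "log", "ERROR"), Event(3.0, "log", "ERROR"),
              Event(3.5, "log", "ERROR"), Event(20.0, "log", "ERROR")]

merged = merge_streams(cdc_events, log_events)
alerts = list(alert_on_bursts(merged, "ERROR", window_seconds=5.0, threshold=3))
print(alerts)
```

Because both functions are generators, each event is examined once as it arrives and then discarded; only the timestamps inside the current window are kept.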

Combining In-Memory and Streaming Analytics

Some people ask whether moving to in-memory and Streaming Analytics means that their Big Data Analytics and Data Lakes are going to go away. According to Wilkes:

“I don’t feel that is the case. There is still a space that requires large amounts of historical data. It may be that the historical data is built, processed, and is used on a streaming analytical platform but that data is there for a purpose.”

Historical data is used by Data Scientists who run statistical models to understand long-term patterns and trends. From a streaming perspective, an enterprise can look at its customers and what they are doing right now. From a Big Data analysis perspective, it can understand their previous behavior. Wilkes described the combination as the sweet spot: “The thing that is the big game changer is where you combine those two together.” It’s where an enterprise can bring together historical context, meaning customers’ previous behavior, with their streaming real-time activity, so that it can make more informed decisions across the board.
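
One way to picture that combination is a stream of live events checked against a baseline computed earlier from historical data. The sketch below is only an assumption about how such a check might look: the `Purchase` type, the `flag_unusual` function, the threshold factor, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Purchase:
    """Hypothetical real-time event."""
    customer_id: str
    amount: float


def flag_unusual(purchases: Iterable[Purchase],
                 historical_mean: Dict[str, float],
                 factor: float = 3.0) -> Iterator[Tuple[str, float, float]]:
    """Yield (customer, amount, baseline) for live purchases that exceed
    `factor` times the customer's historical average spend."""
    for purchase in purchases:
        baseline = historical_mean.get(purchase.customer_id)
        if baseline is None:
            continue  # no history for this customer, so nothing to compare
        if purchase.amount > factor * baseline:
            yield purchase.customer_id, purchase.amount, baseline


# Historical side: averages assumed to be precomputed from stored data.
historical_mean = {"alice": 20.0, "bob": 100.0}

# Streaming side: made-up events arriving "right now".
live = [Purchase("alice", 25.0), Purchase("alice", 90.0),
        Purchase("carol", 500.0), Purchase("bob", 250.0)]

flagged = list(flag_unusual(live, historical_mean))
print(flagged)
```

The historical averages answer "what is normal for this customer?", while the live stream answers "what is happening now?"; the decision needs both.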

The Cloud

Progress in Cloud data and Cloud Analytics has been slow. However, the Cloud will become the central repository for data and analytics. The Cloud offers scalability and faster analysis of data. Companies not utilizing the Cloud are already lagging behind their competitors, especially in terms of analytics.

“We are hearing more and more from customers of their desire to move into a hybrid cloud model,” commented Wilkes. “If you are a provider of SaaS or if your application is already running in the Cloud then that’s similar to a hybrid cloud model.” Many companies are already working to scale out, though. While increasing their use of private clouds, they want elastic scalability without the enormous expense of a larger infrastructure investment, said Wilkes. “We’re looking at hybrid cloud models where we can utilize having a cloud in a private fashion without using external public IPs. We would use internal IPs through a VPN in a public cloud with a high bandwidth.”

Internet of Things

Real-time predictions depend on data from the Internet of Things (IoT). IoT will be the number one driver of data growth for many years to come. The variety of devices producing real-time data will continue to grow: cities, governments, car manufacturers, healthcare, factories, and utilities will be the major drivers. The demand for analytics on real-time data streams is increasing rapidly as more devices connect to the Internet. IDC predicts that the worldwide installed base of IoT endpoints will grow at a rate of 21.4% through 2019, reaching 25.6 billion endpoints, and expects approximately 30 billion connections in 2020.
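
The arithmetic behind growth figures like these can be checked in a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, and treating the 2019 endpoint figure and the 2020 connection figure as points on the same series is an assumption that IDC's own definitions may not support.

```python
def project(start: float, annual_rate: float, years: int) -> float:
    """Compound `start` forward for `years` years.
    An `annual_rate` of 0.214 means 21.4% growth per year."""
    return start * (1.0 + annual_rate) ** years


def implied_annual_rate(start: float, end: float, years: int) -> float:
    """Annual growth rate that takes `start` to `end` over `years` years."""
    return (end / start) ** (1.0 / years) - 1.0


# Figures quoted in the text, in billions.
endpoints_2019 = 25.6
connections_2020 = 30.0

# ASSUMPTION: endpoints and connections are comparable quantities.
rate_2019_to_2020 = implied_annual_rate(endpoints_2019, connections_2020, years=1)
print(f"Implied growth from 2019 to 2020: {rate_2019_to_2020:.1%}")
```

Under that assumption, the step from 25.6 billion to 30 billion works out to roughly 17% growth in a single year, somewhat below the 21.4% rate quoted for the period through 2019.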

“IoT is one of those unfortunate trends that uses huge amount of different things in all different directions,” said Wilkes. “It is certainly the case that some of those things provide their data through internet protocols and are able to deliver data through HTTP.” Where internet or similar protocols are used, data collectors can essentially run their collection services from anywhere, on premises or in the Cloud. Where other kinds of devices are involved, such as sensors in cars and large factories, or credit cards used at ATMs, large amounts of data are still being created, but the collection points are harder to access and control. “All that data is still considered IoT and those companies still need to collect it,” said Wilkes. “You have to find the right data to analyze, the right tools to analyze the data, and the right people to interpret the data into useful and meaningful decisions.”

Open Source

Open source is becoming more popular for infrastructure as well as for Streaming Analytics. It is essential to Streaming Analytics because it enables much of the rapid advancement occurring in the sector. The open source market will continue to develop and diversify across nearly all areas related to analytics and data processing. This may cause confusion and frustration among customers, who will need to invest more time assessing and trying to incorporate various open source solutions. Open source may sound appealing to some enterprises because of its low initial costs, but according to Wilkes, it is not a viable option for many:

“Open source on the surface sounds like a great thing, but we’ve heard from our customers that there are a lot of different options in every category of open source. Prior to selecting an open source you will need to go through an evaluation of the different technologies which can be quite complex. It may be underestimated the actual cost of open source. There are always these fragmentations and spin-offs. That is always the danger with open source, even if you have a popular open source platform if something better comes along there is nothing stopping all of the contributors to that platform moving on and doing something else or wanting to start the next great best thing so they can have them come to their new open source based technology. So don’t always rely that it’s always going to be there.”
