NoSQL and Real-Time Analytics: What You Need to Know

By Jim Scott  /  January 11, 2016

Learn more about Jim Scott.

If you’re in business, you need to know what’s going on both inside and outside your company operations. You need some kind of analytics to understand what your employees and your customers are doing. Analytics has been part of business for a while, but with the 24/7 nature of modern business you need real-time analytics. And if you’re looking for real-time analytics, an in-Hadoop NoSQL database will help support your data needs.

Why NoSQL?

Old database and analytics systems were designed for a different era of computing, when programmers submitted batch jobs to run overnight on a mainframe. While the relational model behind SQL was a revolution in data storage when E. F. Codd introduced it at IBM in 1970, SQL is still very much tied to that era.

Modern real-time analytics is much quicker, offering response times of under a minute, even for complex queries. Much of the performance power comes from the different kinds of data models that NoSQL databases can support. SQL databases offer only the familiar tables tied together with foreign keys. But with NoSQL, you can use whatever data model makes the most sense for the job at hand. You can explore relationships between data points with graph databases, or you can use key-value databases to represent data as simpler key-value pairs. You can also use wide tables for a more traditional tabular view, but with the ability to have an arbitrarily large number of columns. You can also scale out more easily by adding more nodes to your cluster instead of having to move everything to bigger machines.
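To make the three data models above a little more concrete, here is a minimal sketch that uses plain Python structures as stand-ins for real database engines. The customer names, keys, and the friends_of_friends helper are all hypothetical and chosen only for illustration; an actual NoSQL database would expose its own API.

```python
# Hypothetical data in three NoSQL-style data models.
# Plain Python structures stand in for real database engines.

# Key-value: a value looked up by a single key.
key_value = {
    "customer:1": {"name": "Ana", "city": "Chicago"},
    "customer:2": {"name": "Raj", "city": "Denver"},
}

# Wide table: each row key has its own set of columns,
# so one row can hold many more columns than another.
wide_table = {
    "customer:1": {"profile:name": "Ana", "purchase:2016-01-03": 42.50},
    "customer:2": {
        "profile:name": "Raj",
        "purchase:2016-01-04": 19.99,
        "purchase:2016-01-09": 5.00,
    },
}

# Graph: each person maps to the set of people they are connected to.
graph = {
    "Ana": {"Raj"},
    "Raj": {"Ana", "Lee"},
    "Lee": {"Raj"},
}


def friends_of_friends(graph, person):
    """Return people two hops away who are not already direct connections."""
    direct = graph.get(person, set())
    two_hops = set()
    for friend in direct:
        two_hops |= graph.get(friend, set())
    return two_hops - direct - {person}


print(key_value["customer:1"]["city"])      # single-key lookup
print(sorted(wide_table["customer:2"]))     # this row has three columns
print(friends_of_friends(graph, "Ana"))     # relationship query
```

The point of the sketch is that each model makes a different question cheap: a single lookup, a row with a varying set of columns, or a walk across relationships.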

The relatively lightweight nature of NoSQL also allows for more advanced applications. Machine learning programs, such as those built with Apache Spark, can run against other data stores, but the greater performance that NoSQL offers will make machine learning tasks that much faster. If you’re running credit card fraud detection, you can monitor thousands of transactions every second and flag potentially fraudulent transactions before the credit card company is on the hook for them.

Kinds of Analytics

So what kinds of analytics are available?

  • Decisive analytics help you make a decision.
  • Descriptive analytics show you what’s happening.
  • Predictive analytics try to forecast what’s going to happen in the future.
  • Prescriptive analytics make recommendations.

The key to real-time analytics is that you can see what’s going on and make decisions quickly.

Applications

Fraud detection is a good example of an application that benefits from real-time analytics, and one that many companies are going to need in the future, given the large number of high-profile data breaches. Credit card companies can sift through hundreds of thousands of transactions and look for patterns. Transactions that seem out of character, such as sudden big-ticket items purchased far from where the cardholder lives, can be flagged as potentially fraudulent.
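As a rough illustration of the "big-ticket purchase far from home" pattern described above, the following sketch flags a transaction only when it is both unusually large and unusually distant. The Transaction class, the is_suspicious function, and the thresholds (five times the typical amount, 500 km) are assumptions made up for this example; a production system would more likely learn such patterns from data than hard-code them.

```python
import math
from dataclasses import dataclass


@dataclass
class Transaction:
    card_id: str
    amount: float
    lat: float
    lon: float


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def is_suspicious(tx, home_lat, home_lon, typical_amount,
                  amount_factor=5.0, max_km=500.0):
    """Flag a transaction that is both far from home and unusually large.

    The thresholds are illustrative assumptions, not industry values.
    """
    far = distance_km(home_lat, home_lon, tx.lat, tx.lon) > max_km
    big = tx.amount > amount_factor * typical_amount
    return far and big


# Hypothetical cardholder who lives in Chicago and usually spends about $60.
home = (41.88, -87.63)
big_and_far = Transaction("card-1", 2500.00, 34.05, -118.24)  # Los Angeles
big_at_home = Transaction("card-1", 2500.00, 41.88, -87.63)

print(is_suspicious(big_and_far, *home, typical_amount=60.0))
print(is_suspicious(big_at_home, *home, typical_amount=60.0))
```

Requiring both conditions keeps the rule from flagging every vacation purchase or every large purchase made close to home.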

Have you sat through a marathon of your favorite shows on Netflix or Hulu? Chances are you aren’t the only one watching. Streaming media services want to keep you glued to the screen and paying that $7.99 a month. They take your viewing habits and guess what else you’ll want to watch by comparing them with the habits of viewers who have similar tastes. Building customer profiles this way is an example of predictive analytics.
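One simple way to sketch "viewers with similar tastes" is to score how much two viewing histories overlap and then suggest shows that similar viewers watched and you have not. The example below uses Jaccard similarity over sets of show titles. This is an illustrative technique chosen for brevity, not a description of how Netflix or Hulu actually build recommendations, and the viewer names and show titles are invented.

```python
def jaccard(a, b):
    """Overlap between two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, histories, top_n=3):
    """Suggest unseen shows, weighted by how similar each other viewer is."""
    mine = histories[target]
    scores = {}
    for user, shows in histories.items():
        if user == target:
            continue
        similarity = jaccard(mine, shows)
        for show in shows - mine:
            scores[show] = scores.get(show, 0.0) + similarity
    # Drop shows backed only by viewers with no overlap, then rank:
    # highest score first, ties broken alphabetically.
    candidates = [(show, s) for show, s in scores.items() if s > 0]
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    return [show for show, _ in candidates[:top_n]]


# Hypothetical viewing histories.
histories = {
    "you": {"Show A", "Show B"},
    "viewer1": {"Show A", "Show B", "Show C"},
    "viewer2": {"Show B", "Show D"},
    "viewer3": {"Show E"},
}

print(recommend("you", histories))  # ['Show C', 'Show D']
```

Here "Show C" ranks first because viewer1 shares two shows with you, while viewer3 shares none, so "Show E" is never suggested.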

As at most companies, your data center is your business, and you should know what’s going on in there. As with fraud detection, you can monitor incoming connections for anything out of the ordinary to avoid becoming the next costly and embarrassing data breach story. You can also look at the software your employees are using. Do they spend most of their time videoconferencing or sending instant messages? Do they have trouble using a particular program?
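Spotting connections that are "out of the ordinary" can be sketched as comparing the latest measurement with recent history. The example below flags a per-minute connection count that sits more than three standard deviations from the recent mean. The is_anomalous function, the threshold, and the sample counts are assumptions for illustration only; real monitoring would typically account for daily cycles and many more signals.

```python
import statistics


def is_anomalous(history, current, threshold=3.0):
    """Return True if `current` is far from the recent mean.

    `history` is a list of recent per-minute connection counts.
    The threshold of 3 standard deviations is an illustrative assumption.
    """
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return current != mean
    return abs(current - mean) / stdev > threshold


# Hypothetical connection counts for the last six minutes.
recent_counts = [100, 104, 98, 101, 97, 103]

print(is_anomalous(recent_counts, 102))  # False: within the normal range
print(is_anomalous(recent_counts, 400))  # True: a sudden spike
```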

With the changing media landscape, it’s more challenging for marketers and advertisers to reach prospects. By some measures, Netflix viewership is outpacing network TV. Netflix doesn’t have commercials, so marketers will have to find ways to maximize their return on investment. With real-time analytics, they can see what’s influencing purchase decisions and meet potential customers there. Marketers can sift through support records to see where customers are struggling and suggest possible changes to product design.

Conclusion

The world is changing, and real-time analytics will help you keep up. In-Hadoop NoSQL lets you quickly deploy a reliable, distributed database that can scale with your needs.

About the author

James A. Scott (prefers to go by Jim) is Director, Enterprise Strategy & Architecture at MapR Technologies and is very active in the Hadoop community. Jim helped build the Hadoop community in Chicago as cofounder of the Chicago Hadoop Users Group. He has implemented Hadoop at three different companies, supporting a variety of enterprise use cases from managing Points of Interest for mapping applications, to Online Transactional Processing in advertising, as well as full data center monitoring and general data processing. Jim also was the SVP of Information Technology and Operations at SPINS, the leading provider of retail consumer insights, analytics reporting and consulting services for the Natural and Organic Products industry. Additionally, Jim served as Lead Engineer/Architect for Conversant (formerly Dotomi), one of the world's largest and most diversified digital marketing companies, and also held software architect positions at several companies including Aircell, NAVTEQ, and Dow Chemical. Jim speaks at many industry events around the world on big data technologies and enterprise architecture. When he's not solving business problems with technology, Jim enjoys cooking, watching-and-quoting movies and spending time with his wife and kids. Jim is on Twitter as @kingmesal.
