Data Architecture and the Need to Choose the Right Data Platform

By Michelle Knight  /  February 22, 2018

Making do with a less than optimal Data Architecture and Data Platform is like reaching to pay for groceries and finding no wallet or cash. As William McKnight, President of McKnight Consulting Group, asked during his Keynote Address for the DATAVERSITY® Database Now! Online 2017 Conference, “We may be overwhelmed by data, should we not still choose the right platform?”

McKnight is an experienced Information Management Strategist and author of Information Management: Strategies for Gaining a Competitive Advantage with Data. During his presentation, he emphasized that:

“Our economy is entirely dependent on the natural resource of data. We are sitting on a gold asset of our organization. How our organization is going to compete and gain competitive advantage over the next decade entirely depends on how we use data.”

As an expert in recommending Data Platforms, McKnight has done a number of maturity studies over his career. He noted that the industries, and those companies within those industries, that are doing more with their data do much better than those that are not. McKnight observed that “top performers are expanding their Big Data implementations.”

So, why think about Data Architecture now? McKnight said we need to move past the mindset of “just give me some data fast” and “give me good data, but do it efficiently” to “give them all data, fast, and effectively.” To meet this demand, “it’s time to do something outside of the box and differently.” McKnight acknowledged that this is:

“Hard when you are underwater and [have more requests] than you can deliver. But we got to get the platforming correct for the work load and make it work together with Data Integration [and] Data Virtualization. The Data Warehouse is no longer the center of the universe. We have these non-relational platform possibilities that actually have a value proposition.”

What to Consider When Selecting a Data Platform?

Selecting the right data store type is essential to building a more effective Data Platform within the entire Data Architecture of an organization. “It used to be everything was a database,” reflected McKnight. But there are now many other options, such as file-based scale-out systems, which are “not technically, down at the bit and byte level, databases,” he said. File-based scale-out systems don’t have the same framework around the data. He recommended such systems especially for unstructured or semi-structured data. Other necessary considerations include:

  • Data Store Placement: McKnight said that “we have a real viable possibility to not necessarily put the data store in our data center.” Numerous viable, cost-effective Cloud options are available now, including private, public, and many hybrid Cloud possibilities.
  • Workload Architecture: “Distinguish between operational or analytical workloads,” advised McKnight. “Short transactional request and more complex (often longer) analytics requests demand a different architecture.” Breaking down the requirements of workloads and properly designing a Data Platform around those workloads is essential.
  • Memory: McKnight observed, “A lot of people out there are still hooked to their HDDs [Hard Disk Drives].” He urged organizations to “open your mind a little bit.” There are many possibilities, such as SSDs (Solid State Drives), In-Memory, and other lower cost memory options available on the market today.

He used the example of an In-Memory data store that provides super-fast performance. “For selective workloads it has a high special functionality, opening more opportunities on ROI. We are starting to exploit more In-Memory these days.”

He compared memory selection to “putting the wind at my sails,” which makes a sailboat go faster and provides an edge over the other boat. He remarked that In-Memory may “give a little more room for error as we go through the design process.”
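The Workload Architecture consideration above can be sketched in a few lines of Python. This is a minimal illustration only, not McKnight's method: the `Workload` fields, the `SCAN_THRESHOLD` cutoff, and the example workload names are all hypothetical and would need tuning against real request patterns.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    rows_scanned_per_request: int  # typical rows read by a single request
    uses_aggregation: bool         # e.g. GROUP BY or large multi-table joins


# Hypothetical cutoff separating short transactional requests from
# longer analytical ones; it does not come from the presentation.
SCAN_THRESHOLD = 10_000


def classify(w: Workload) -> str:
    """Return 'analytical' for complex or large-scan requests, else 'operational'."""
    if w.uses_aggregation or w.rows_scanned_per_request > SCAN_THRESHOLD:
        return "analytical"
    return "operational"


if __name__ == "__main__":
    workloads = [
        Workload("order_checkout", 5, False),
        Workload("quarterly_revenue_report", 2_000_000, True),
    ]
    for w in workloads:
        print(w.name, "->", classify(w))
```

Tagging each workload this way is one simple route to deciding which requests belong on a transactional store and which belong on an analytical platform.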

Don’t Forget the Data Profile

Data maturity is about “creating an efficient environment that we can add onto without starting all over again, every time.” To do this, organizations need to look at the data profile. “Many of us are upside down in terms of where our priorities should be.” McKnight said:

“I can get a lot out of the data profile. [Tell me] the size and type of the data in terms of if it is structured or unstructured, and what some sample records look like, [in addition] to how frequently is the data coming in. Where is it coming in from? How frequently does it need to be accessed, what is the quality of the data, etc.?”
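As a rough illustration of the questions in that quote, the sketch below summarizes a handful of sample records into a simple profile: size, structure, arrival frequency, and a crude quality signal. The `DataProfile` fields and the `build_profile` helper are hypothetical names chosen for this example, and counting missing values is only one assumed proxy for data quality.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class DataProfile:
    record_count: int
    avg_record_bytes: float            # size of the data
    fields: List[str]                  # union of field names seen in the sample
    uniform_structure: bool            # True if every record has the same fields
    records_per_hour: Optional[float]  # arrival frequency; None if not measurable
    missing_value_ratio: float         # share of absent or None cells


def build_profile(records: List[Dict], arrivals: List[datetime]) -> DataProfile:
    """Summarize sample records; `arrivals` holds one timestamp per record."""
    if not records or len(records) != len(arrivals):
        raise ValueError("need at least one record and one timestamp per record")
    sizes = [len(json.dumps(r).encode("utf-8")) for r in records]
    field_sets = [frozenset(r) for r in records]
    all_fields = sorted(set().union(*field_sets))
    # A cell counts as missing when the field is absent or explicitly None.
    missing = sum(1 for r in records for f in all_fields if r.get(f) is None)
    cells = len(records) * len(all_fields)
    span_hours = (max(arrivals) - min(arrivals)).total_seconds() / 3600
    return DataProfile(
        record_count=len(records),
        avg_record_bytes=sum(sizes) / len(sizes),
        fields=all_fields,
        uniform_structure=len(set(field_sets)) == 1,
        records_per_hour=len(records) / span_hours if span_hours > 0 else None,
        missing_value_ratio=missing / cells if cells else 0.0,
    )


if __name__ == "__main__":
    sample = [
        {"id": 1, "name": "Ada", "email": "ada@example.com"},
        {"id": 2, "name": "Grace", "email": None},
        {"id": 3, "name": "Linus"},
    ]
    times = [
        datetime(2018, 2, 22, 9, 0),
        datetime(2018, 2, 22, 9, 30),
        datetime(2018, 2, 22, 11, 0),
    ]
    print(build_profile(sample, times))
```

A sample whose records do not all share the same fields (`uniform_structure` is False) hints at semi-structured data, which ties back to the choice of data store type.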

The Cloud Now Offers an Attractive Option

McKnight said that when he gets into financing a Data Platform option with clients, “many companies don’t want to deal with capitalizing expenses. They would much rather operationalize them and this is the Cloud model, right?” When thinking about the Cloud, tight integration is imperative.

McKnight offered the following example:

“You might put your Data Warehouse in the Cloud. Well what about your BI, might you put them in the Cloud? What about Data Integration? What about MDM, can that be in the Cloud? Yes, to all the above. Start thinking hard about the data and things will follow.”

He stated that a mature Data Architecture “not only has some Cloud, but a lot of Cloud in it today.” There are different Cloud models, and McKnight emphasized, “it is pretty important to get the right one for you.”

New Selection Vectors

In addition to the factors mentioned above, it’s necessary to weigh new selection vectors for a Data Platform. He listed:

  • Robustness of SQL: “There are some newfound capabilities within SQL that makes a lot of sense [and are] very important.”
  • Built-in Optimization: Consider how well optimization works across the Cloud and across Data Virtualization. The optimizers have more work to do today.
  • On-the-Fly Elasticity: Ask, do you really have it? Do you really need it?
  • Dynamic Environment Adaptation: Evaluate the ability to take on concurrent usage and different patterns of usage at the same time.
  • Separation of Compute from Storage: This is very important in the Cloud to scale those two things separately.
  • Support for Diverse Data: There is JSON, XML, and various forms of unstructured data flowing into the enterprise data environment. They need to be considered.
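One way to weigh these selection vectors side by side is a weighted scoring matrix. The sketch below is an assumption-laden illustration, not something presented in the talk: the weights, the 1 to 5 rating scale, and the candidate names `platform_a` and `platform_b` are hypothetical and should be replaced with an organization's own priorities and evaluations.

```python
from typing import Dict

# The six selection vectors listed above. The weights are hypothetical
# and chosen to sum to 1.0; adjust them to reflect real priorities.
WEIGHTS: Dict[str, float] = {
    "sql_robustness": 0.25,
    "built_in_optimization": 0.20,
    "on_the_fly_elasticity": 0.10,
    "dynamic_adaptation": 0.15,
    "compute_storage_separation": 0.15,
    "diverse_data_support": 0.15,
}


def score(ratings: Dict[str, int]) -> float:
    """Weighted sum of 1-5 ratings, one rating per selection vector."""
    missing = set(WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    for vector in WEIGHTS:
        if not 1 <= ratings[vector] <= 5:
            raise ValueError(f"rating for {vector} must be between 1 and 5")
    return sum(WEIGHTS[v] * ratings[v] for v in WEIGHTS)


if __name__ == "__main__":
    # Hypothetical ratings for two made-up candidate platforms.
    candidates = {
        "platform_a": dict(zip(WEIGHTS, [5, 4, 2, 3, 2, 3])),
        "platform_b": dict(zip(WEIGHTS, [3, 3, 5, 4, 5, 4])),
    }
    ranked = sorted(candidates, key=lambda n: score(candidates[n]), reverse=True)
    for name in ranked:
        print(f"{name}: {score(candidates[name]):.2f}")
```

The ranking only reflects the weights fed into it, so the questions McKnight poses for elasticity ("do you really have it? Do you really need it?") are a good prompt for setting a weight low when a capability is not actually required.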

Succeeding with a Data Platform

Based on McKnight’s portfolio of clients from the last couple of years, he noted that, “the requirements have gone up tremendously, in terms of the number of users, the performance expectations, the amount of data, the complexity of the analytics, and so on.” So, succeeding with a Data Platform is crucial and can be determined by the following:

  • Performance: McKnight puts performance as the number one point. He advocated:

“We can give our users better performance out of our platforming decisions. They can grow with their capabilities in the data [and] are not going to be limited because [each query] is going to take 5 minutes. They will get to the deeper levels if those queries are popping. That is not going to happen if you haven’t thought about [Data Platforming] for a while.”

  • Provisioning: McKnight described provisioning as “How quickly can you get the Data Platform up and running? How Agile is it?”
  • Scale: He recommended asking “Can I start small and grow?”
  • Cost: Don’t overdo the cost part of the equation. Keep costs relative to what the organization can afford.

Data Platform Conclusions

McKnight provided seven final take-aways for succeeding with a Data Platform:

  1. Many Data Platforms are viable today in enterprises of all sizes.
  2. Get the platforming right and follow a plan.
  3. Start with data store type, placement, and workload architecture.
  4. Use the Data Profile as a strong determinant of correct platform.
  5. Make sure the Data Platform will perform, now and for unspecified requirements.
  6. Analytic platforms should be either staging, Operational Data Stores (ODS), Data Warehouse (DW), Data Mart (fed from the DW or specialized) or Hadoop.
  7. The Cloud now offers attractive options with better economics.

 

Check out Database Now! Online at http://databasenow.com/

Photo Credit: alexdndz/Shutterstock.com

 

About the author

Michelle Knight enjoys putting her information specialist background to use by writing technical articles on enhancing Data Quality, leading to more useful information. Michelle has written articles on the W3C validator for SiteProNews, SEO competitive analysis for the SLA (Special Libraries Association), search engine alternatives to Google for the Business Information Alert, and introductions to the Semantic Web, HTML 5, and Agile, Seabourne INC LLC, through AboutUs.com. She has worked as a software tester, a researcher, and a librarian. She has over five years of experience contracting as a quality assurance engineer at a variety of organizations, including Intel, Cigna, and Umpqua Bank. During that time, Michelle used HTML, XML, and SQL to verify software behavior through databases. Michelle graduated from Simmons College with a Masters in Library and Information with an Outstanding Information Science Student Award from the ASIST (The American Society for Information Science and Technology) and has a Bachelor of Arts in Psychology from Smith College. Michelle has a talent for digging into data, a natural eye for detail, and an abounding curiosity about finding and using data effectively.
