
Controlling SAP HANA Data Sprawl

By Eamonn O’Neill

Enterprises running large SAP HANA instances in the cloud face a new challenge as their databases continue to grow. Because SAP HANA has a simpler data layout and structure than a legacy database, it was assumed that it would produce less data sprawl and duplication. But does the data stay small?

A key principle of cloud architecture is that services should be both atomic and scalable, allowing capacity to grow and shrink with demand. That capacity is generally delivered by scaling out across many mid-sized compute instances. SAP HANA, however, generally works on the principle of scale-up, moving to ever-larger compute instances as the data stored within it grows. SAP HANA's scale-out capability is of limited help in solving this problem.

As such, large SAP HANA instances are something of an anti-pattern on hyperscale cloud.

Hyperscalers have met this challenge by delivering colossal instances of up to 24 terabytes, a size that today is needed only by the very largest enterprises. Of course, when those enterprises grow beyond that limit, they ask for even larger instances. This cannot go on indefinitely. Larger instances are much harder to provide at scale in the cloud, and SAP customers who push the limits face greater availability risks.

While the availability risk of SAP HANA is a cause for concern, the infrastructure bills for large SAP HANA instances are enough to compel customers to get their estate under control. The cost of a single 24-terabyte system in the cloud is around $800K a year. With most enterprises running multiple production-sized instances, it is easy to see how cloud spend can get out of control with the unfettered growth of SAP HANA. Some organizations running large SAP HANA instances in the cloud can already see material costs on the horizon, and they are struggling to justify them.
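
To make the arithmetic concrete, here is a minimal sketch in Python. The $800K figure is the approximate yearly cost cited above; the instance counts are hypothetical examples, and the function name is an illustration rather than part of any SAP or cloud-provider tool.

```python
# Rough annual infrastructure cost for a landscape of large SAP HANA instances.
# The per-instance figure is the approximate number cited in the text; the
# instance counts used below are hypothetical, not data from a real customer.
COST_PER_24TB_INSTANCE_USD = 800_000


def annual_hana_cost(instance_count: int,
                     cost_per_instance: int = COST_PER_24TB_INSTANCE_USD) -> int:
    """Return the estimated yearly spend for `instance_count` instances."""
    if instance_count < 0:
        raise ValueError("instance_count must be non-negative")
    return instance_count * cost_per_instance


if __name__ == "__main__":
    # Production plus a few production-sized copies (hypothetical counts).
    for count in (1, 3, 5):
        print(f"{count} instance(s): ${annual_hana_cost(count):,} per year")
```

Even this linear estimate ignores storage, backup, and networking charges, so it is best read as a lower bound on how quickly production-sized copies multiply the bill.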

So, what options are available to SAP customers who want to avoid the inevitable SAP HANA sprawl? There must be a better answer than “just keep growing larger instances.”

Deploy a Data Management Strategy for Large SAP HANA

To deal with SAP HANA data sprawl and the infrastructure costs that come with running large instances in the cloud, enterprises should have a clearly defined Data Management strategy. The earlier organizations put this strategy in place, the better.

The four-point Data Management strategy outlined below can help organizations effectively manage the growth of their SAP HANA data and successfully keep data sprawl at bay.

1. Put Things in Order

The first order of business in halting the massive growth of SAP HANA data is to put your data in order. Not all the data your organization generates needs to stay in SAP for the long term. For example, active data within the current period needs to be available for audit purposes. But once the data and reporting periods have been closed, the question you and other stakeholders should ask is “Why does this need to stay in SAP?” Going through this evaluation process will help you put proper rules in place to check the growth of your SAP HANA database.
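
A rule of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Record type, its fields, and the single "last closed period" cutoff are hypothetical, and real retention rules also depend on legal, tax, and audit requirements that are not modelled here.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    """Hypothetical business document with the date it was posted."""
    doc_id: str
    posting_date: date


def is_review_candidate(record: Record, last_closed_period_end: date) -> bool:
    """True if the record belongs to a period that has already been closed.

    Such records are candidates for the "why does this need to stay in SAP?"
    review; the function does not decide that they must be removed.
    """
    return record.posting_date <= last_closed_period_end


def split_by_rule(records, last_closed_period_end):
    """Split records into (keep_in_sap, review_candidates)."""
    keep, candidates = [], []
    for record in records:
        if is_review_candidate(record, last_closed_period_end):
            candidates.append(record)
        else:
            keep.append(record)
    return keep, candidates


if __name__ == "__main__":
    sample = [
        Record("A-1", date(2022, 3, 15)),
        Record("A-2", date(2023, 12, 31)),
        Record("A-3", date(2024, 2, 1)),
    ]
    keep, candidates = split_by_rule(sample, date(2023, 12, 31))
    print("keep:", [r.doc_id for r in keep])
    print("review:", [r.doc_id for r in candidates])
```

Keeping the rule as a small, explicit function makes it easy for stakeholders to read and challenge before it is applied to production data.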

2. Trim the Fat by Archiving Data

After figuring out which data needs to go, the next order of business is data archiving. Data archiving is the process of moving data that is no longer actively used out of production systems and into a separate storage system for long-term retention. By archiving your data, you can trim the fat in your SAP HANA database and ensure your overall cloud environment is lean.

It is worth noting that while data archiving is a technical exercise that has been around for a while, organizations have always struggled to get it going. For that reason, key stakeholders need to fully support the data archiving initiative for it to yield the desired results.

3. Make HANA Smarter with SAP NSE

One of SAP HANA’s strengths is that it keeps everything in memory. On its own, that design is not conducive to an effective Data Management strategy, because memory is the most expensive place to hold data that is rarely read. To effectively manage the growth of SAP HANA, organizations need to make it smarter with SAP HANA Native Storage Extension (NSE).

SAP HANA NSE is a general-purpose, built-in warm data-tiered store in SAP HANA that lets organizations manage less frequently accessed data without fully loading it into memory. The solution integrates disk-based database technology with SAP HANA in-memory to intelligently put into memory only what it thinks you’re going to use.

SAP HANA NSE configuration is based on understanding usage patterns, the age of the data, and the relevance of the data being evaluated. When properly configured, NSE will keep the growth of SAP HANA in check and improve its price-performance ratio.
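
The three inputs named above (usage pattern, data age, relevance) can be turned into a simple recommendation, sketched here in Python. Everything in this block is an assumption made for illustration: the TableStats fields, the thresholds, and the decision logic are hypothetical and are not SAP's sizing guidance. The generated statement follows the ALTER TABLE ... PAGE LOADABLE form used to change a table's load unit for NSE, but the exact syntax should be checked against the SQL reference for your SAP HANA revision before use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TableStats:
    """Hypothetical per-table inputs to an NSE decision."""
    schema: str
    name: str
    reads_last_90_days: int   # usage pattern
    newest_row_age_days: int  # age of the data
    business_critical: bool   # relevance


def recommend_load_unit(stats: TableStats,
                        max_reads: int = 1000,
                        min_age_days: int = 365) -> str:
    """Return "PAGE" for warm data, otherwise "COLUMN" (fully in memory).

    The thresholds are illustrative defaults, not recommended values.
    """
    if stats.business_critical:
        return "COLUMN"
    is_rarely_read = stats.reads_last_90_days <= max_reads
    is_old = stats.newest_row_age_days >= min_age_days
    return "PAGE" if (is_rarely_read and is_old) else "COLUMN"


def nse_statement(stats: TableStats) -> Optional[str]:
    """Build the SQL that would mark a warm table as page loadable, or None."""
    if recommend_load_unit(stats) != "PAGE":
        return None
    return f'ALTER TABLE "{stats.schema}"."{stats.name}" PAGE LOADABLE CASCADE'


if __name__ == "__main__":
    old_logs = TableStats("ERP", "CHANGE_LOG_2019", 12, 1800, False)
    open_items = TableStats("ERP", "OPEN_ITEMS", 250_000, 1, True)
    print(nse_statement(old_logs))
    print(nse_statement(open_items))
```

The sketch only prints statements rather than executing them, so a recommendation can be reviewed by the people who own the data before anything changes in the database.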

4. SAP HANA Data Partitioning

With rules set around the data that stays in SAP, data archiving to trim the size of your data, and SAP HANA NSE to keep the growth of your SAP HANA database in check, the final piece of an effective SAP HANA Data Management strategy is data partitioning. Data partitioning is the process of dividing large tables (in SAP HANA, column-store tables) into smaller pieces so the data can be accessed and managed more easily.

When it comes to managing SAP HANA, data partitioning starts with looking at the biggest tables within the database and deciding how to split them, for example by period, so that the data your organization rarely needs sits in its own partitions. By partitioning large tables in SAP HANA, you can reduce the amount of table data being loaded into memory and ease the demand on memory. This translates to smaller, more manageable, and more cost-effective SAP HANA systems.
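
The effect of partitioning on memory demand can be illustrated with a deliberately simplified Python model. The yearly split, the sizes, and the idea that only "hot" partitions occupy memory are assumptions made for the example; actual behavior in SAP HANA depends on the partitioning scheme, load units, and access patterns.

```python
from collections import defaultdict


def partition_by_year(rows):
    """Group (year, size_gb) entries into range partitions keyed by year."""
    partitions = defaultdict(int)
    for year, size_gb in rows:
        partitions[year] += size_gb
    return dict(partitions)


def in_memory_footprint_gb(partitions, hot_years):
    """Sum the size of the partitions assumed to be held in memory."""
    return sum(size for year, size in partitions.items() if year in hot_years)


if __name__ == "__main__":
    # Hypothetical chunks of one large table, tagged with the year they cover.
    rows = [(2021, 400), (2022, 500), (2023, 600), (2023, 100)]
    partitions = partition_by_year(rows)

    total = sum(partitions.values())
    hot = in_memory_footprint_gb(partitions, hot_years={2023})
    print(f"whole table: {total} GB, current-year partition only: {hot} GB")
```

In this toy model the unpartitioned table would need all of its data available together, while the partitioned version lets older years be handled separately, for instance by archiving them or moving them to warm storage.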

Bringing It All Together

As SAP data growth accelerates, data sprawl is becoming more of a challenge for enterprises running SAP HANA in the cloud. The four-point strategy above can help prevent and deal with SAP HANA data sprawl and its risks. A critical component in making this strategy work, however, is an SAP managed services provider with the skills and expertise to incorporate archiving, SAP HANA NSE, and partitioning into your overall SAP-HANA-on-cloud Data Management strategy.