Click to learn more about author Rob Mandeville.
Databases are at the heart of every organization’s data center: they’re arguably one of the most important components of a business’s success and surely one of the most complex and critical when it comes to resource consumption and performance management. However, when troubleshooting an issue or trying to find its root cause, many database administrators (DBAs) treat database health and database performance as synonymous. They are not.
The Difference Between Database Health and Database Performance
Database health and performance are certainly closely related, but assuming they are interchangeable is a common misconception, and the distinction is an important one. By focusing on health rather than performance, you run the risk of overlooking critical elements, because the two sets of metrics answer different fundamental questions and take drastically different vantage points:
- Health is inclusive of data points such as CPU utilization, I/O statistics, memory pressure, etc., which collectively indicate whether a database is capable of performing properly, but these metrics alone cannot confirm if satisfactory performance has been achieved.
- Performance is used to explain how your database queries are being executed by integrating an element of time measurement.
More specifically, as defined by Craig Mullins, a fellow database and Data Management expert, performance is “the optimization of resource use to increase throughput and minimize contention, enabling the largest possible workload to be processed.” But time is essential to understanding and leveraging performance data, as the immediate satisfaction of end-users is a DBA’s primary goal.
Perhaps the best way to illustrate the difference between these two methods of measurement is to imagine a fellow employee in a car driving down a road. This road stretches between point A and point B. We can calculate how many vehicles should be able to go from A to B in an hour (potential capacity). We can also count the number of cars that drive past in an hour while utilizing that road (actual consumption). What we can’t predict is whether your colleague made it to work in a timely manner; they could have been stuck behind a broken-down vehicle, for example. Without the performance metrics of any one car, we’re only able to report how many cars went by and that theoretically, your colleague should have arrived in a timely fashion.
For complete insight into the performance of databases and queries, health and performance data must be taken together. While health is a very mature topic and nearly all database monitoring solutions offer visibility into these metrics, performance is another story. To create sufficient visibility, here are a few must-haves when monitoring and managing database performance. You should have:
- A way to measure the time component that end-users can relate to
- A way to determine if resource use is optimized, not simply utilized
- A way to measure throughput (TPS is a common metric, although often too simplistic as a standalone)
- A way to measure contention and its impact on end-users
- A way to measure workload and what’s driving it
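To make the gap between throughput and the time end-users actually experience concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `QuerySample` record, its field names, and the sample numbers are hypothetical and do not come from any particular monitoring tool.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass
class QuerySample:
    # Hypothetical record; the field names are illustrative assumptions.
    sql_id: str
    duration_ms: float  # total elapsed time the end-user experienced
    wait_ms: float      # part of that time spent waiting (locks, I/O, ...)


def throughput_tps(samples, window_seconds):
    """Completed executions per second over the observation window."""
    return len(samples) / window_seconds


def p95_latency_ms(samples):
    """95th-percentile elapsed time, a time measure end-users can relate to."""
    durations = [s.duration_ms for s in samples]
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    return quantiles(durations, n=20, method="inclusive")[-1]


def wait_fraction(samples):
    """Share of total elapsed time spent waiting rather than working."""
    total = sum(s.duration_ms for s in samples)
    waited = sum(s.wait_ms for s in samples)
    return waited / total if total else 0.0


# Made-up workload: 90 fast executions and 10 that were blocked for ~2 seconds.
samples = [QuerySample("q1", 10.0, 1.0) for _ in range(90)]
samples += [QuerySample("q2", 2000.0, 1900.0) for _ in range(10)]

print("TPS:", throughput_tps(samples, window_seconds=10))
print("p95 latency (ms):", p95_latency_ms(samples))
print("fraction of time waiting:", round(wait_fraction(samples), 2))
```

In this made-up workload the throughput figure looks unremarkable, while the 95th-percentile latency and the wait fraction show that some users waited about two seconds, mostly on contention. That is the sense in which TPS is too simplistic as a standalone metric.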
Database Performance Management and Troubleshooting Best Practices
With these performance management must-have elements in mind, you should look to leverage the following best practices to stop resolving only symptoms of a larger problem, and ultimately identify the true root cause of database performance issues:
- Ensure you have the necessary data and metrics. If not, you risk having gaps that impact your ability to troubleshoot. If you miss more than one piece of critical data, those gaps widen. The ability to drill down into granular metrics like resource contention (e.g., maxed out CPU, memory, network, etc.) and a database’s workload (e.g., execution counts, reads and writes, replication, parsing statements, etc.) is key in identifying the true root cause of a performance problem. Measuring wait times across multiple dimensions is also important, allowing you to triangulate performance problems quickly and easily, but even more important, accurately.
- Arrange data in a meaningful way. The way in which data is arranged is unique to each individual IT professional, but the overall concept should help you quickly identify and resolve the root cause. You need to be able to conduct whichever pattern analysis works best for you in terms of detecting anomalies and understanding when you’re deviating from the norm. These patterns should be easy to pick out.
- Identify a way to triangulate issues across dimensions. This involves the ability to answer the who, what, when, where, and why of a performance problem. You need to know things like the SQL, application, user, database, etc. to understand who and what was impacted by poor performance and what was causing the impact.
- Capture up-to-date execution plans. Your job will be twice as hard if you’re attempting to troubleshoot a problem without insight into what the optimizer was thinking at the time of execution. Unless you understand which path it chose to satisfy whatever was being requested of the database, it’s impossible to tell where it became inefficient or where it caused the performance problem. Optimizers are a bit of a black box across vendors, so attempting to understand how they resolve issues is a moving target.
- Retain a historical perspective of data and performance to help establish a baseline normal. It’s nearly impossible to tell when a database, or any other system for that matter, is underperforming if you don’t have a day-to-day baseline “normal” to measure against. You should use comprehensive management and monitoring tools that provide a single dashboard of performance and the ability to drill down across database technologies and across deployment methods, including cloud, as well as establish a historical record of performance metrics. This will ensure your organization isn’t wasting valuable IT funds by addressing a database performance problem with the wrong solution.
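Two of the practices above, triangulating across dimensions and comparing against a historical baseline, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a recommended implementation: the dictionary keys (`sql`, `application`, `user`, `wait_ms`), the sample values, and the three-standard-deviation threshold are all hypothetical choices made for the example.

```python
from collections import defaultdict
from statistics import mean, stdev


def wait_by_dimension(samples, dimension):
    """Sum wait time per value of one dimension, largest contributor first."""
    totals = defaultdict(float)
    for sample in samples:
        totals[sample[dimension]] += sample["wait_ms"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def deviates_from_baseline(history, current, n_sigmas=3.0):
    """True when `current` is more than n_sigmas standard deviations
    away from the mean of the historical values (an assumed, simple rule)."""
    baseline = mean(history)
    spread = stdev(history)
    return abs(current - baseline) > n_sigmas * spread


# Hypothetical wait-time samples tagged with several dimensions.
samples = [
    {"sql": "q1", "application": "billing", "user": "app_svc", "wait_ms": 40},
    {"sql": "q2", "application": "billing", "user": "app_svc", "wait_ms": 900},
    {"sql": "q2", "application": "reports", "user": "analyst", "wait_ms": 700},
    {"sql": "q3", "application": "reports", "user": "analyst", "wait_ms": 10},
]

# The same data sliced two ways: which statement, and which application.
print(wait_by_dimension(samples, "sql"))
print(wait_by_dimension(samples, "application"))

# Hypothetical daily average response times (ms) forming the "normal" baseline.
history = [100, 105, 98, 102, 95, 101, 99]
print(deviates_from_baseline(history, current=180))  # far outside the baseline
print(deviates_from_baseline(history, current=104))  # within normal variation
```

Slicing the same wait data by statement and then by application points at one statement as the main contributor across both applications, and the baseline check distinguishes an ordinary day from a real deviation. A real monitoring tool would use richer statistics, but the idea is the same.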
At the end of the day, understanding the performance of your organization’s database systems requires an appreciation for the difference between database health and performance, though both are key to a well-running database environment. By leveraging the above best practices, including tools that provide insight into health, consumption, and performance, you can increase overall visibility and identify and resolve database performance problems more quickly and accurately. The result is a more efficient, streamlined end-user experience and a better outcome for the business.