How to Handle Throttling in Cloud Databases

By Niels Brinch


Yes, retry is the answer and the conclusion of this article, but let’s spend a little time looking at the problem at a higher level: what solutions there are and what the consequences of those solutions are.

What is Throttling?

If you are used to having your own database servers, you may not be used to being throttled. But if you work with cloud databases, being throttled is business as usual, and that is the first and most important realization.

Throttling is not an error; it just means you are using more than the capacity you purchased. For example, Cosmos DB sells provisioned throughput in increments of 100 Request Units per second (RU/s), and if your requests consume more than you provisioned, you are throttled.

Cloud databases do this in order to offer you predictable and fast performance. Cosmos DB is always fast, and if you exceed the capacity you reserved, it will also be fast to tell you that.

In short, throttling means you have exceeded your capacity and should call back later.

Why Throttling is Necessary

When you are used to an old-school database server, you are also used to incredible robustness. A well-managed database will always respond. Always.

So why, when using a cloud database, do you have to get used to being rejected every now and then?

Imagine the alternative. For both Cosmos DB and our own Gyxi database, calls consistently take less than 50 milliseconds.

Imagine if, instead of 50 milliseconds, a call took 10 seconds to complete because the database was busy. That call would tie up the same network connections and memory for 200 times as long.

Time is money and using 1 KB of memory for 10 seconds is 10 times more expensive than using 1 KB of memory for 1 second.

So, another way to look at this is that, in this example, throttling makes your cloud database up to 200 times cheaper to run. Cloud databases commonly use shared infrastructure, so if you use up significant resources, fewer resources are available for others, and their calls would hang as well, creating a snowball effect.
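The arithmetic behind these numbers can be sketched in a few lines. The 50 millisecond and 10 second figures come from the text; the load of 1,000 calls per second and the function names are assumptions chosen only for illustration.

```python
# Latency figures quoted in the text; the traffic level is a hypothetical example.
FAST_CALL_MS = 50          # typical call latency
SLOW_CALL_MS = 10_000      # a call that hangs for 10 seconds instead
CALLS_PER_SECOND = 1_000   # assumed load, for illustration only


def resource_time(memory_kb: float, duration_ms: int) -> float:
    """Memory held by one call, expressed in KB-seconds."""
    return memory_kb * duration_ms / 1000


def calls_in_flight(calls_per_second: int, duration_ms: int) -> float:
    """Average number of concurrent calls (Little's law: rate x duration)."""
    return calls_per_second * duration_ms / 1000


slowdown = SLOW_CALL_MS // FAST_CALL_MS
print(slowdown)                                             # 200
print(resource_time(1, 10_000) / resource_time(1, 1_000))   # 10.0
print(calls_in_flight(CALLS_PER_SECOND, FAST_CALL_MS))      # 50.0
print(calls_in_flight(CALLS_PER_SECOND, SLOW_CALL_MS))      # 10000.0
```

At the assumed load, the fast database holds about 50 calls open at any moment, while the slow one would have to hold 10,000, each with its own connection and memory.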

Why Retry is Not a Problem

We have already established that time is money and that the cloud database provider does not want to spend that time and money on their end, so instead, they will respond immediately and tell you to call later.

As throttling is “normal,” you need to be ready for it to happen, and you can make a conscious choice about how it happens:

1. In your server-side application, catch the throttling status code (usually 429 or 503), wait for a second or two, and then call again. Cosmos DB will even respond with a header that tells you how long to wait before you should try again.

Using this option means that your application will spend that time and those resources waiting for the cloud database. Clever how the cloud database passed their problems on to you, right?

Well, you have someone to pass the cost on to as well: the client application.

2. Your client application can also be built to recognize that it has been throttled and to retry. This puts the wait time on your client application, and whether that is a browser or another client, it will not be impacted greatly by having to wait a bit longer.

It’s different for the cloud database, which may have millions of calls per second and cannot handle using 200 times too many resources. It’s also different for your server-side application, which might receive thousands of calls per second and also cannot handle the extra resource consumption.
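Whichever side does the waiting, the retry loop looks much the same. Below is a minimal sketch, not a definitive implementation: `Response`, `wait_seconds`, and `call_with_retry` are hypothetical names, and `send` stands in for whatever function actually performs the call. It assumes the wait hint arrives either in the standard HTTP `Retry-After` header (in seconds) or in Cosmos DB's `x-ms-retry-after-ms` header (in milliseconds), and falls back to one second otherwise.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict

THROTTLE_STATUS_CODES = {429, 503}


@dataclass
class Response:
    """Hypothetical stand-in for an HTTP response."""
    status: int
    headers: Dict[str, str] = field(default_factory=dict)
    body: str = ""


def wait_seconds(response: Response, default: float = 1.0) -> float:
    """Read the wait hint from the response, or fall back to a default."""
    headers = {name.lower(): value for name, value in response.headers.items()}
    try:
        if "x-ms-retry-after-ms" in headers:   # assumed Cosmos DB hint, in ms
            return max(0.0, float(headers["x-ms-retry-after-ms"]) / 1000.0)
        if "retry-after" in headers:           # standard HTTP hint, in seconds
            return max(0.0, float(headers["retry-after"]))
    except ValueError:
        pass  # e.g. Retry-After given as an HTTP date; use the default
    return default


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it is not throttled or the attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        if response.status not in THROTTLE_STATUS_CODES:
            return response
        if attempt < max_attempts - 1:
            sleep(wait_seconds(response))
    return response  # still throttled after the last attempt
```

Passing `sleep` as a parameter keeps the loop easy to test, and capping the attempts means a caller eventually sees the throttling response instead of waiting forever.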

Advantage of Letting the Client Handle Throttling

It’s virtually free for your client application to wait — not only because it does not need to scale up (it is only itself) but also because it is usually stateful.

It knows what page it is on and what it is trying to do, whereas your server-side application is likely (and hopefully) stateless, which means that for every method call, it simply has to do its job as fast as possible and get out. Throttling makes this hundreds of times cheaper and thus hundreds of times more scalable.
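One way a stateless server can "get out" quickly is to forward the throttling signal to the client instead of waiting on the database itself. The sketch below makes several assumptions: `Response` and `handle_request` are hypothetical names, `query_database` stands in for the real database call, and the database is assumed to report its wait hint in an `x-ms-retry-after-ms` header as Cosmos DB does.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Dict

THROTTLE_STATUS_CODES = {429, 503}


@dataclass
class Response:
    """Hypothetical stand-in for an HTTP response."""
    status: int
    headers: Dict[str, str] = field(default_factory=dict)
    body: str = ""


def handle_request(query_database: Callable[[], Response]) -> Response:
    """Call the database once; if throttled, tell the client to retry later."""
    db_response = query_database()
    if db_response.status not in THROTTLE_STATUS_CODES:
        return db_response

    # Translate the database's millisecond hint into whole seconds, because
    # the standard Retry-After header takes a non-negative integer.
    hint_ms = db_response.headers.get("x-ms-retry-after-ms", "1000")
    try:
        retry_after = max(1, math.ceil(float(hint_ms) / 1000))
    except (ValueError, OverflowError):
        retry_after = 1  # unreadable hint: fall back to one second

    # Return immediately; the server holds no connection or memory while
    # the client waits.
    return Response(status=429, headers={"Retry-After": str(retry_after)})
```

The client then waits for the number of seconds in `Retry-After` and repeats the request, so the wait time is spent where it is cheapest.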



Throttling happens, and it is OK. It is one of the trade-offs you have to accept in order to get the extreme scalability you can get from a cloud database. You just have to build retry logic into your applications.
