Advertisement

Foreign Keys and the Delete Performance Issue

By on

Eventually you’ll run into a simple delete instruction that takes minutes (I mean, years!) to get executed. It’s just a hundred-row table, and, still, it takes a lifetime to get the rows deleted. Here’s a small tip: you’re probably missing some indexes.

Foreign Keys and their dirty secrets 

There is a very basic, yet very common misunderstanding set by developers and beginner modelers: a foreign key doesn’t represent any sort of index whatsoever. You might get confused because it has “key” in its name, but make no mistake: foreign keys are no indexes.

This is extremely important to keep in mind, and there are plenty of articles telling you some reasons why you should care about it. One of those is right here at DATAVERSITY®, and I truly recommend you take 3 minutes to read Micheal Blaha’s post.

After reading it, I’m certain you’ll be already tempted to index all of your foreign keys. So, please do! But if you’re not convinced yet, I’m here to share the dirtiest secret of it: the delete performance issue involving foreign keys. Why is this the dirtiest one? Because it’s hidden inside the delete instruction, and not the select one. And nobody would normally think about “deletion optimization”, right?! (It doesn’t even sound right).

The scenario 

First of all, let’s a very simple scenario. Suppose you have an Event table, a Person table, and a table representing People’s Event Reports, connected to both tables.

HNetzka-blogimage

Now, let’s imagine that you have 30,000 rows in your Event table, and 50,000 rows in your Person table. Those are not big numbers, but the EventReport table could have up to 1,500,000 on it. Big enough, isn’t it?!

The performance issue 

Perhaps you analyzed your Execution plan and noticed that the joins weren’t having any performance problems (some databases are really clever about it), and you decide not to create the indexes. Well, try to delete an Event from the table then. It doesn’t need to be related to any Report, it could be – perhaps – a duplicated Event that you want to delete. You execute a simple query, trying to delete the event by it’s ID (Primary Key, super fast), and – BAM – it takes 30 seconds to get executed!

That happens because it’ll need to check every EventReport’s row to see if that event isn’t related to it (it’ll perform a complete table scan). After all, that’s what foreign keys are about: they create a

constraint between primary keys and other tables’ columns, just to make sure there’ll be integrity between them.

Now imagine that your events are related to 10 other tables, including large tables like Reviews, Comments and so on… It could take minutes to delete one single row in your table – even though you’re trying to delete it by its ID.

The rule-of-thumb: please index your foreign keys! 

There’s no better way to finish this article than using Blaha’s words: please, index your foreign keys! If you don’t want to index all of them – perhaps because you wouldn’t be comfortable with thousands of indexes in your database, you should at least index the foreign keys belonging to your bigger tables (10k+ rows). It’ll definitely improve a lot the performance of your application – but, please, let me know if it doesn’t!

Leave a Reply