Fixing Locking and Blocking Issues in SQL Server - Part 4 - Deadlocks

This is part 4 in a series of posts:

Part 1 covered being sure there is a locking and blocking issue
Part 2 covered the impact of RCSI
Part 3 looked at the impacts of indexing on locking and blocking

Today, though, I want to look at the next potential issue, and that’s deadlocks.

Deadlocks

I often hear people interchangeably using the words lock, block, and deadlock.

Locks are a standard feature of systems that are designed for concurrency. Nothing nasty to see here. They only become a problem in two situations:

They are held too long and cause a blocking problem
Incompatible locks are held and cause deadlocks

The simplest example of a lock is when I have an item and you want it, but you must wait until I finish with it. That’s just normal with sharing something.

If I finish using it quickly, you might barely even notice. But if I hang onto it for a very long time, you experience blocking. Apart from slowing you down (which might well seem like a problem to you), blocking isn’t actually a fault. When I finish, you’ll still get the item.

As soon as we have more than one item though, things can get messy. If I have item A and I want item B, but you are holding item B and you want item A, then we have a problem. Neither of us can finish, so neither of us will ever let go. That’s a deadlock.

It also doesn’t have to just be two people. I could have item A and want item B. Terry might have item B and want item C. Nga might have item C and want item D. Paula might have item D and want item A. A larger deadlock has occurred.
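One way to picture the four-person example is as a "who is waiting for whom" chain that loops back on itself. The following is only a minimal Python sketch of that idea, not how SQL Server implements its detection. The names `holds`, `wants`, `build_waits_for`, and `find_deadlock` are made up for illustration, and it assumes each person waits for at most one item.

```python
# Illustrative sketch only: all names here are hypothetical.
holds = {"A": "Me", "B": "Terry", "C": "Nga", "D": "Paula"}  # item -> who holds it
wants = {"Me": "B", "Terry": "C", "Nga": "D", "Paula": "A"}  # person -> item wanted


def build_waits_for(holds, wants):
    """Map each waiting person to the person holding the item they want."""
    return {
        person: holds[item]
        for person, item in wants.items()
        if item in holds and holds[item] != person
    }


def find_deadlock(waits_for):
    """Return the people forming a waiting cycle, or None if there is none."""
    for start in waits_for:
        seen = []
        current = start
        # Follow the chain until it ends or revisits someone.
        while current in waits_for and current not in seen:
            seen.append(current)
            current = waits_for[current]
        if current in seen:
            # The chain looped back: everyone from that point on is stuck.
            return seen[seen.index(current):]
    return None


deadlocked = find_deadlock(build_waits_for(holds, wants))
blocked_only = find_deadlock({"You": "Me"})  # plain blocking, no cycle
```

With the data above, `deadlocked` contains all four people, while the simple "you wait for me" case returns `None` because the chain ends with someone who is not waiting on anyone.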

Resolving Deadlocks - the Old Way

In the past, I worked with systems that didn’t resolve this automatically. What would typically happen is that two users would suddenly have their applications freeze. Then a little while later, another user would need something that one of the first two had, and her application would freeze too.

This would go on and on across the building. We used to call it creeping death syndrome.

Eventually, someone would get sick of waiting, and go to find an admin. The admin would work out which user caused the first problem and kill their process. Everyone would then come back to life.

Resolving Deadlocks - the SQL Server Way

In SQL Server, there was a conscious decision to not let this happen and to resolve it automatically. For the word “resolve” though, you need to read “kill one of the users’ processes”.

SQL Server has a process that looks for these situations and cleans them up. The only question, of course, was which process to kill.

In early versions, they used to kill whichever process had just put the last link in the deadlock chain. That process would be rolled back and would receive an error explaining that it had been chosen as a deadlock victim.

(Note: users don’t take well to being told they’re victims).

Over time though, this caused issues as long-running processes could collide with short-running processes and get rolled back. They could then have the same issue happen again and again.

So a decision was made to look at the amount of data that each process had written to the transaction log, and kill the one that had written the least, as it had done the least work and was the cheapest to roll back. That worked way better.
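To make the difference between the two rules concrete, here is a rough Python sketch. It is not SQL Server's actual code. The `Session` class, its `log_bytes_used` field, and both function names are assumptions made up for the illustration, and the sketch ignores any other factor (such as a session priority setting) that might influence the choice.

```python
from dataclasses import dataclass


@dataclass
class Session:
    # Hypothetical stand-in for a process caught in a deadlock.
    session_id: int
    log_bytes_used: int  # assumed measure of work already written to the log


def choose_victim_last_link(cycle):
    """Old rule: kill whoever added the last link to the chain."""
    return cycle[-1]


def choose_victim_least_log(cycle):
    """Later rule: kill whoever has written the least to the transaction log."""
    return min(cycle, key=lambda s: s.log_bytes_used)


# A long-running batch collides with a short update and closes the chain.
short_update = Session(session_id=52, log_bytes_used=2_000)
long_batch = Session(session_id=51, log_bytes_used=50_000_000)
cycle = [short_update, long_batch]  # the batch added the last link

old_victim = choose_victim_last_link(cycle)   # the long batch loses its work
new_victim = choose_victim_least_log(cycle)   # the short update is rolled back
```

Under the old rule the long-running batch is rolled back, and it could lose the same way every time it is retried. Under the log-based rule the short update is chosen instead, which throws away far less work.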

That’s how SQL Server currently handles deadlocks. In the next post, I’ll talk about how you should plan to work around deadlocks.

2019-05-09