SQL: When Performance Tuning SQL Server, Fix the Problem, not the Symptom

I spend a lot of time working with teams of DBAs and developers who need to fix performance problems. One of the real issues that I see time and again though, is people fixing the symptom, not the problem.
Many DBAs spend their time tracing query activity, working out which queries the system is spending a lot of time on, and then working out how to make those important queries run faster.
Automated tuning systems like those being put into SQL Server currently achieve much the same outcome.
But I’d always contend that if that’s all you can consider, you are often working on the wrong problem.
I was at a site some years back, where I found a single query that was loading a list of countries, but was being executed 12,000 times per minute and the system seemed constantly busy executing the query.
Some DBAs that I have worked with, would have just ignored the query. Each individual execution of the query took very few resources, and it wouldn’t have made it through their filter of queries that need looking at, because it ran fast.
Other (better) DBAs might identify it as an important query because they aggregated the impacts of the queries. Even better if they aggregated the normalized version of the query. (See our free course on doing this if you aren’t familiar with it). They would then spend their time working out if they could improve the performance of the query by indexing, looking at wait stats, reducing blocking, and other techniques.
This is all good stuff, but they’re all still missing the real problem which is why is the application asking the same question 12,000 times a minute?
That’s the actual problem that needs solving.
[caption id=“attachment_4105” align=“alignnone” width=“549”] Awesome image by Annie Spratt[/caption]
To really solve performance problems, you need to be prepared to step back, look at the bigger picture, and be prepared to go back into the application code with the developers. It’s worth noting that many 3rd party application vendors won’t be keen on you doing that, but it’s the only way to actually solve the problems.
Now, in this case, it’s pretty clear the list of countries isn’t changing. (It’s a few years since we got a new one in South Sudan).
[caption id=“attachment_4107” align=“alignnone” width=“501”] Awesome image by Kyle Glenn[/caption]
So why on earth (pun intended) is the application constantly asking for this? Surely it should be caching that data.
And of course, when I checked with the development team, they had been caching the data. What went wrong though, is they had a typo in the caching code. Every time they checked the cache, they thought the values weren’t there, so they reloaded them.
That was the real problem that needed to be solved. And no amount of fiddling with indexes, wait stats, locks, etc. was going to have a noticeable impact.
2018-11-12