SQL: How many indexes per table is too many when you're using SQL Server?

I wish query tuning was easy. Today, it's not. One day machines will do it better than us, but today, doing it well is still somewhat of an artistic pursuit. You can teach most people the basics, but it takes a long time to get a really good "feel" for what needs to be done. However, something that I seem to differ with a lot of people on, is about how many indexes is too many?

Is there a formula?

Whenever work is hard to get right, I see people endlessly trying to provide simple formulae as guidelines. I understand why they feel a need to do that. But is there a magic number, like 5, or 8, or 10 indexes per table, where you decide there are too many?

Sorry, but the answer is no.

Do indexes help reads but hurt writes/updates?

Sorry, again there's no simple answer here, no matter what you've heard.

Clearly, having an index that supports finding the data that you're looking for is a really helpful thing. Agreed.

But when you need to update data, the first thing you need to do is to find the data that has to be updated. Almost every time I see this discussion, people are updating, and deleting data based only upon a primary key and they have a clustered index on that key. So they argue that other indexes are just extra work during those updates and deletes.

But not all updates are like that. Imagine I need to change every occurrence of "Lucky Goldstar" to "LG" in a column. Without an index on that value,  I'm back to reading the whole table, or at least another entire index, just to find the values to change.

You might say "but how often would you do that?" and that's the key issue. It's all about how often you need to perform these operations.

Should I be tuning for reads or writes?

It's important to understand what you're optimizing. Whenever I ask a developer what proportion of the time their system would spend writing vs reading, so often I'm told numbers like 20% or 30%. I even had an argument with a guy from the SQL Server product group who told me he thought 40% to 60% was common.

Yet, for almost all typical business applications that I work on, the number that I see is closer to 1%.

So, on most systems (at least the ones that I see, and I see a lot of them), optimizing for writes is a big call. Do you really want to be optimizing for 1% of the workload, or for the 99% of the workload?

Now there are situations where the timing of those writes is critical. Yes that happens. But guess what? If you make the 99% of the work dramatically (i.e. reads) faster, your writes get faster too. The system is doing far less work, and there will be way less impact from pesky things like locking and blocking issues.

Find the right queries to tune

You can't optimize the system for all the queries that are run on it. The most important step is to work out where to put your effort. If you don't know how to do that, spend some time on our free online 4 Steps to Faster SQL Server Applications course. It will show you how to do basic tracing and normalizing and aggregation of query templates, so you know where to spend your effort.

You need to fix the queries that are having the biggest impact on your system. Then everything gets better.

Useless indexes really are just load

While I'm not the index police on how many indexes you can have on a table, I also don't want to see useless indexes. They really are just overhead for zero benefit.

As a hint, the vast majority of single column indexes are pretty useless. Not 100% true but close.

If you don't know how indexes really work, and how to design useful ones, we have an online SQL Server Indexing for Developers course.

Rather than spending hours messing around trying to work it out, it'll be worth some hours of your time taking the course.

 

4 thoughts on “SQL: How many indexes per table is too many when you're using SQL Server?”

  1. Thank you, Greg, for making this topic clear one more time!

    Far too often, developers are on the road with wild ideas about what is right or wrong. Often I notice a kind of over-index anxiety. In the end, it is always about managing risks like "too slow queries" or "not to harm the workload with blocking issues." There is no free lunch. If you need this particular order of data, then you have to create and maintain them. And sometimes big players have to have an index for the 1% query because this query could stop the whole system.

  2. this data is from about 15 years ago. Each nonclustered index increases in the insert/update burden by about 25% of the base table write cost. Example: insert/update to a table with 4 nonclustered indexes costs about 2X more than to the same table w/o nonclustered indexes.
    One question is whether you can support the insert/update burden of so many indexes. Another question is: what is the implication of not having a particular index. will the plan use another index in a less efficient plan or does the plan become a scan?
    Also. the query plan grossly over-weighs the cost of the key lookup if your data is in memory. So a plan that shows many key lookup executes as very expensive really is not that expensive. And the query optimized switches from key lookup to scan far too soon (again, when your data is in memory).
    So a single column key that is selective is not that bad.
    And: consider leaving out frequently updated columns from the include list of indexes on the understanding that the key lookup really is not that expensive.

    1. Hi Joe, whenever I see that sort of overhead stated, the assumption is as I mentioned in the blog post, that the row is being updated with a lookup by the primary key. That's far from always being the case. As for single column indexes, I see people create many of those, and they're almost always unused. In that case, they really are just overhead. For example, while it might be helpful to have an index on CustomerID in an Orders table (and I'd always start with one there to support the foreign key), invariably that's not a great index. People won't just want the CustomerID and the OrderID, they'll want the OrderDate, etc. and invariably that removes the usefulness of the index. And yes, I agree that the cost of lookups can be greatly overstated in query plans. It counts multiple lookups to the same page as though they are separate lookups. I wish it gave us better info than that. But I do not share the concern that I often hear about the overhead of most index updates. The vast majority of systems that I work on, aren't spending their time doing that. They're reading a lot of data that they don't need to, because of a lack of indexes.

Leave a Reply

Your email address will not be published. Required fields are marked *