SQL: Are bit columns useless in SQL Server indexes?

Image by Aziz Acharki

If you are aware of Betteridge’s law of headlines, you already know the answer, but let me explain.

There are a lot of odd myths that surround SQL Server. One of the more persistent ones is related to indexes on columns that hold bit values.

A SQL Server column that’s a bit can be in three states. It can have the value zero; it can have the value one; and it can be NULL (ie: have no value at all).

Back in SQL Server 6.0 and 6.5 days, the Microsoft courseware used to tell you that if you had a column that was just a bit (ie: 0 or 1), then there was no point creating an index on it. The same advice continued onto SQL Server 7 materials, and if I recall correctly, SQL Server 2000 courseware said something similar.

In SQL Server 7 and 2000, the product used to actively fight with people who were trying to create these types of indexes. Enterprise Manager (one of the predecessors to Management Studio), had a GUI for creating indexes but didn’t list any bit columns for you to choose.

Little wonder that there was a persistent message that bit columns were pointless and that myth continues to this day in many people’s minds.

The logic was that if there were only two values, there would be nothing selective enough about a bit to make it useful. I’m sure they were thinking about proportions like male and female (not a good option for today’s world) and they were imagining fairly evenly distributed values. If half of the values were zero and half were one, then what was the point of the column as an index key?

Problem is that was only partly true, and only up to a point.

Image by Mikhail Rakityanskiy
Image by Mikhail Rakityanskiy

Indexing has never been about the range of potential values, it’s about the selectivity of the values.

So I might not want to use a bit column in an index if it’s a 50/50 proposition, but imagine an IsFinalized column on a transaction table. If there are 2 billion transactions and 200 that aren’t finalized, I’d never use the index to find the finalized ones, but I’d sure be hoping there was an index to let me find the unfinalized ones!

Even in SQL Server 2000, where it tried to dissuade you in the tools, you could use bit columns in indexes when creating the index by using T-SQL. We’ve used them very successfully over the years.

So if you hear people immediately dismissing bit columns from indexes (or see this in the output from development tools), don’t buy into it. Always look at the selectivity of columns in indexes, not the ranges of possible values.

2 thoughts on “SQL: Are bit columns useless in SQL Server indexes?”

  1. I tell people not to use BIT datatypes at all. You already mentioned one problem that was unique to SQL Server; three value logic. Originally the BIT was a true bit and had only {1,0} as its domain. But when it became a numeric data type it had to follow the basic rules of all datatypes in SQL; it had to be nullable. This really screwed up a lot of systems when it came time to upgrade. Nobody thought to add a NOT NULL constraint when they ported their code over and it was really hard to find out what was going on.

    My objection to assembly language style flags is that, SQL is a “predicate language”, that is, you discover status of something by using logic instead of setting flags. Your example of completed transactions is pretty typical. It gives us no clue as to how the transaction was completed (cancellation, successful, timed out, etc.) so the flag hides information from us. We also have no idea what the flag is set correctly, without actually testing the predicate that that determines this status.

    But I strongly agree with your statement that an an index should have high selectivity.

  2. Hi Joe, yep, understand the thoughts. The next post in fact, is about whether or not having many bit columns makes sense, so that’s good timing 🙂

Leave a Reply

Your email address will not be published. Required fields are marked *