SQL: Is indexing bit columns useful in SQL Server?

Awesome image by Jon Tyson

One data type that's often been quite controversial is the bit data type. So a bit is where we have an individual value, which is either a zero or one. They're the only values allowed unless it's NULL.

Always keep in mind when you're working with bits in SQL server that they're aren't just 2 potential states, there are three. It can be zero, or one, or it can be NULL. That is, it can have no value at all.

A common misconception

Bit values can be indexed and contrary to many misconceptions that have been around over the years, indexing bits can be very, very useful.

Now I used to see a lot of Microsoft material, that said there is absolutely no point indexing bit columns because they can only have two values. But the issue is not the number of values that a column can have, the issue is how selective the value is.

Why did they think that?

Let me explain the thinking that led to this. Imagine I had a table that's holding the people who are attending a conference and we have an indication of whether or not they've previously attended the same conference. Let's say that 50% of them have been to the conference before. Now having an index on that column probably isn't going to be very useful, as that's not at all selective.

To be selective, you need something that's quite tiny in terms of the number of rows returned compared to the total number of rows in the table.

But let's consider a different situation. If I have a transaction table and it contains billions of transactions, 500 of which are unfinalized. I could have a column in the table called IsFinalized that is a bit data type. (My purist friends would argue that's not normalized properly but run with me on this for now). I would never use the index on that column to find all the ones that are finalized, because that's most of the table.

However, if I ever went looking for the unfinalized transactions, I most certainly would hope there's an index there to let me find those 500 unfinalized ones.

It would also make a good candidate for a filtered index, but that's a topic for another day.

Why is the idea so common?

The idea that indexing bits isn't useful started back in early versions of SQL Server. In SQL Server 7, you couldn't build an index on a bit column. In SQL Server 2000, you could do it via T-SQL but the tools like Enterprise Manager didn't give you the option. Even the SQL Server training materials from Microsoft said it wasn't useful.

So people got the idea that it wasn't useful. But that was never true.

Indexing bit columns can be very, very powerful where the data is selective.

Learn about indexing

Want to learn more about indexing? Take our online on-demand course now:
https://training.sqldownunder.com/p/sql-server-indexing-for-developers

 

Leave a Reply

Your email address will not be published.