SQL: SELECT TOP 100 PERCENT is a code smell for SQL Server

SQL: SELECT TOP 100 PERCENT is a code smell for SQL Server

We help a lot of customers upgrade their systems. Part of that job involves looking for code smells before starting the upgrades. If you aren’t familiar with code smells, this Wikipedia article explains it pretty well: https://en.wikipedia.org/wiki/Code_smell.

Specifically, they describe a code smell as any characteristic in the source code of a program that possibly indicates a deeper problem.

One of these is the presence of TOP 100 PERCENT or perhaps written as TOP (100) PERCENT in SELECT queries. Almost always, we see this in the definition of views.

In SQL Server, a view is basically just a SELECT statement that’s been given a name.

Tables don’t have a natural order, and neither do views. If you want to output rows in a specific order, you need to use an ORDER BY clause in the queries that reference the tables or views.

What usually leads to this code smell is that the developers have attempted to create an ordered view. Let’s see an example.

In the WideWorldImporters database, I could write a query to return all the USB food items:

Note that I haven’t included an ORDER BY clause. The data happens to be in StockItemName order but that’s not guaranteed at all.

Now imagine that I want to create a view, but I want that view to return the rows in StockItemID order. I could try to create it like this:

That will return this error:

You are correctly told that the ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions. That’s because those objects don’t have any natural order.

But it’s the final part of the message that leads developers astray: “unless TOP, OFFSET or FOR XML is also specified”.

So they change the query as follows:

And the error goes away.

Unfortunately, in SQL Server 2000, because of a qwerk in how that version was implemented, it also worked as expected.

The problem here is that the TOP statement wasn’t implemented well, back in SQL Server 2000. It used the ORDER BY clause both for determining the TOP rows, and for the output order of the query. It really should have had a separate option on TOP instead, something like an OVER clause:

SELECT TOP(10) OVER (ORDER BY ProductID) …. ORDER BY CustomerID;

So this led to people thinking they could create ordered views by including a TOP statement.

In SQL Server 2005, this stopped working as expected. The ORDER BY in the view was used for implementing TOP but it wasn’t used for setting the order of the view itself. Customers started complaining that it was now broken. It wasn’t broken at all. It used the TOP statement to work out which rows to return (ie: all of them in this case), and then returned the rows in any order it felt like.

Views don’t have any natural order.

So if we see TOP (100) PERCENT in code, we suspect that the developer has attempted to create an ordered view and we check if that’s important before going any further. Chances are that the query that uses that view (or the client application) need to be modified instead.

2018-06-11