T-SQL 101

I’ve been talking about the basic window functions in T-SQL. One that’s not well known but is surprisingly useful is NTILE.

I’m not sure about the name, but it’s probably modeled on percentile. I don’t know why they didn’t give it a slightly more meaningful name, but what it does is take the output and break it up into bands or chunks of data.

So if I say NTILE(10), that’ll give me a tenth of the rows with a value of 1, another tenth of the rows with a value of 2, and so on.
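As a minimal sketch of that banding, the query below uses made-up data: the numbers 1 to 20 stand in for the rows of a real query, and the names Numbers, n and Band are hypothetical.

```sql
-- Hypothetical sample data: 20 rows, so NTILE(10) puts 2 rows in each band.
WITH Numbers AS
(
    SELECT v.n
    FROM (VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10),
                 (11), (12), (13), (14), (15), (16), (17), (18), (19), (20)) AS v(n)
)
SELECT n,
       NTILE(10) OVER (ORDER BY n) AS Band   -- rows 1-2 get 1, rows 3-4 get 2, ... rows 19-20 get 10
FROM Numbers
ORDER BY n;
```

When the row count doesn't divide evenly by the number of bands, the bands differ in size by at most one row, and the larger bands come first.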

2021-03-29

In my last T-SQL 101 post, I mentioned ROW_NUMBER. It let you put a row number or position beside each row that was returned. Sometimes though, you want a rank instead. A rank is similar but it’s like a position in a race.

In the example above, I’ve used RANK to produce an ordering, based on an alphabetical listing of city names. Notice there’s Abercorn, Abercorn, Abercorn and then Aberdeen. So, like in a race, if three people came first, they all get the value 1. The next person is fourth. Three people came fourth, so then the next one is 7th, and so on.
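The race-style numbering can be reproduced with a small sketch. The city list here is inline sample data (Aberfoyle is added purely for illustration), and the alias CityRank is hypothetical.

```sql
-- Inline sample data: three Abercorns, three Aberdeens, and one more city.
SELECT c.CityName,
       RANK() OVER (ORDER BY c.CityName) AS CityRank
FROM (VALUES ('Abercorn'), ('Abercorn'), ('Abercorn'),
             ('Aberdeen'), ('Aberdeen'), ('Aberdeen'),
             ('Aberfoyle')) AS c(CityName)
ORDER BY CityRank;
-- Abercorn rows get 1, Aberdeen rows get 4, Aberfoyle gets 7.
```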

2021-03-22

In SQL Server 2000 and earlier versions, I often heard people ask “How do I output a row number beside each row that’s output in my query?”

I remember some people arguing that it wasn’t a valid request, as it didn’t feel “set-based”, but it was an appropriate request, and it could be dealt with in a set-based manner. Sometimes it’s very, very useful to be able to do that.
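In SQL Server 2005 and later, one way to do this is the ROW_NUMBER window function. The sketch below uses inline sample city names; the aliases are hypothetical.

```sql
-- Number the rows in alphabetical order of city name.
SELECT ROW_NUMBER() OVER (ORDER BY c.CityName) AS RowNumber,
       c.CityName
FROM (VALUES ('Aberdeen'), ('Abercorn'), ('Aberfoyle')) AS c(CityName)
ORDER BY RowNumber;
-- 1 Abercorn, 2 Aberdeen, 3 Aberfoyle
```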

2021-03-15

It’s unfortunate that the SELECT query in SQL isn’t written in the order that operations logically occur (if not physically). I suspect that’s one of the things that makes learning SQL a bit harder than it needs to be.

Without getting into really complex queries, you need to understand the logical order of the operations.

FROM

The starting point is to determine where the data is coming from. This is normally a table but it could be other sets of rows like views or table expressions.
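As a rough guide, the logical order usually described for a simple query is FROM, WHERE, GROUP BY, HAVING, SELECT, then ORDER BY. The sketch below annotates each clause with its step; the inline data and column names are made up for illustration.

```sql
SELECT p.Size,                                   -- 5. SELECT: choose the output columns
       COUNT(*) AS NumberOfProducts
FROM (VALUES ('Small', 1), ('Small', 1),
             ('Large', 1), ('Large', 0))
     AS p(Size, IsShownOnPriceList)              -- 1. FROM: where the rows come from
WHERE p.IsShownOnPriceList <> 0                  -- 2. WHERE: filter individual rows
GROUP BY p.Size                                  -- 3. GROUP BY: form the groups
HAVING COUNT(*) >= 1                             -- 4. HAVING: filter the groups
ORDER BY NumberOfProducts DESC;                  -- 6. ORDER BY: sort the final output
```

Because ORDER BY is logically last, it can refer to the NumberOfProducts alias, whereas WHERE cannot.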

2021-03-09

I’ve previously talked about how the WHERE clause is used to limit the rows that are included in a query. If we’re using a GROUP BY, then WHERE is determining what goes into the grouping. But what if you want to apply a limit that’s based on the outcome of the grouping? That’s what HAVING does.
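A minimal sketch of the difference, using a hypothetical inline Products set rather than a real table: WHERE removes rows before grouping, and HAVING removes groups afterwards.

```sql
-- Hypothetical sample data standing in for a products table.
WITH Products AS
(
    SELECT v.ProductID, v.Size, v.IsShownOnPriceList
    FROM (VALUES (1, 'Small', 1), (2, 'Small', 1), (3, 'Small', 0),
                 (4, 'Large', 1),
                 (5, 'Medium', 1), (6, 'Medium', 1), (7, 'Medium', 1))
         AS v(ProductID, Size, IsShownOnPriceList)
)
SELECT Size,
       COUNT(ProductID) AS NumberOfProducts
FROM Products
WHERE IsShownOnPriceList <> 0        -- decides which rows go into the groups
GROUP BY Size
HAVING COUNT(ProductID) > 1;         -- keeps only groups with more than one product
-- Returns Small (2) and Medium (3); Large has only one product and is dropped.
```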

If I execute the following query:

SELECT Size, 
       COUNT(ProductID) AS NumberOfProducts
FROM dbo.Products 
WHERE IsShownOnPriceList <> 0
GROUP BY Size;

I see this output:

2021-03-08

When you calculate an aggregate, the default is that it applies to the entire table, but you might not want that.

For example, I might want to calculate the longest shelf life for products. But I want to calculate that for each size of product.

If we look at the first example, we have added a GROUP BY clause to our query, and it returns exactly that. The output is on the left-hand side below the query. For each size (determined by the GROUP BY), the maximum of the shelf life is returned.
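The same idea can be sketched with inline sample data. The column names Size and ShelfLife and the values are assumptions for illustration.

```sql
-- One output row per size, each with the longest shelf life for that size.
SELECT p.Size,
       MAX(p.ShelfLife) AS LongestShelfLife
FROM (VALUES ('Small', 30), ('Small', 90),
             ('Large', 60), ('Large', 45)) AS p(Size, ShelfLife)
GROUP BY p.Size;
-- Small -> 90, Large -> 60
```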

2021-03-01

I mentioned in a previous post that COUNT was an aggregate. The other common aggregates are shown in this table, and no surprise what they do.

SUM adds up or totals the values.

AVG calculates the average of the values.

MIN works out the minimum value.

MAX works out the maximum value.

But if you’ve started to think about how SQL Server works, you might be wondering about what happens with NULLs.
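As a quick sketch of the usual behavior, the aggregates skip NULL values, and only COUNT(*) counts every row. The inline data and the Amount column are made up.

```sql
-- Three rows, one of which has a NULL Amount.
SELECT COUNT(*)        AS AllRows,          -- 3: counts rows, NULL or not
       COUNT(v.Amount) AS NonNullAmounts,   -- 2: the NULL is skipped
       SUM(v.Amount)   AS Total,            -- 30
       AVG(v.Amount)   AS Average,          -- 15: 30 divided by 2, not by 3
       MIN(v.Amount)   AS Smallest,         -- 10
       MAX(v.Amount)   AS Largest           -- 20
FROM (VALUES (10), (NULL), (20)) AS v(Amount);
```

SQL Server may also show a warning that a NULL value was eliminated by an aggregate.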

2021-02-22

In previous posts, I looked at how to read data from a table. Now, we need to look at how we do calculations on the data in the table.

The most basic calculation we might do is to count the number of rows in the table. The first example above does that.

What about the asterisk?

But also notice that it has an asterisk in the query. Some people worry about the asterisk being in their queries, as having an asterisk usually isn’t a good idea. In fact, some customers have automated systems for checking code, and the automated system might complain about the asterisk.
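For reference, a row count of this kind looks like the sketch below, run against inline sample data rather than a real table. Inside COUNT, the asterisk means "count the rows"; it doesn't expand to a list of columns the way it does in SELECT *.

```sql
-- Count the rows in a small made-up set of city names.
SELECT COUNT(*) AS NumberOfRows
FROM (VALUES ('Abercorn'), ('Aberdeen'), ('Aberfoyle')) AS c(CityName);
-- NumberOfRows = 3
```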

2021-02-15

In a previous post, I showed how to use CAST and CONVERT. What I didn’t mention before, though, is what happens when the conversion fails. If I try to convert the string ‘hello’ to an int, that just isn’t going to work. What happens, of course, is that the statement returns an error. The same thing happens if I try to convert the 30th of February 2016 to a date. There aren’t 30 days in February. Again, an error will be returned.
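One common way to avoid those errors, in SQL Server 2012 and later, is TRY_CAST and TRY_CONVERT, which return NULL when the conversion can't be done. This is offered as a sketch of one option, not necessarily the approach the rest of the post takes; the column aliases are made up.

```sql
-- These two statements raise conversion errors, so they are left commented out:
-- SELECT CAST('hello' AS int);
-- SELECT CONVERT(date, '20160230');

-- The TRY_ versions return NULL instead of failing.
SELECT TRY_CAST('hello' AS int)      AS NotANumber,   -- NULL
       TRY_CONVERT(date, '20160230') AS NotADate,     -- NULL: no 30th of February
       TRY_CAST('42' AS int)         AS ValidNumber;  -- 42
```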

2021-02-10

Sometimes we need to determine whether a string is a date or whether it is a number.

In the first example above, I’m asking if the string ‘20190229’ is a valid date. You can see from the response (0) that it isn’t. That’s because even though it’s a valid date format, February in 2019 doesn’t have a 29th day. It’s not a leap year.

The value returned from the ISDATE function is a zero or a 1. Curiously, the return value is of data type int. You’d think that a function that starts with Is and tests something would return a bit data type instead. But that’s just one of the curiosities of T-SQL.
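A short sketch of these checks follows. ISNUMERIC is included as the usual companion function for the "is it a number" case, which is an assumption about what the post goes on to show; the aliases are hypothetical.

```sql
SELECT ISDATE('20190229')   AS NotALeapYear,   -- 0: 2019 has no 29th of February
       ISDATE('20200229')   AS LeapYear,       -- 1: 2020 is a leap year
       ISNUMERIC('123')     AS LooksNumeric,   -- 1
       ISNUMERIC('hello')   AS NotNumeric,     -- 0
       SQL_VARIANT_PROPERTY(ISDATE('20200229'), 'BaseType') AS ReturnType;  -- int
```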

2021-02-01