T-SQL 101: #93 Restarting row numbering by using PARTITION BY

Imagine that I've used ROW_NUMBER to number all the cinemas in my database. I'll get values from 1 to the maximum number of cinemas. But what if I want to number the cinemas within each city? i.e. Aberdeen has three cinemas, so number them 1, 2, and 3. But when we get to the next city, start the numbering again. We can do this by adding PARTITION BY to the OVER clause.

PARTITION BY and OVER can do many, many things and I don't want to get too far into complexity on this in today's post, but I do want to mention that when you have a window function like ROW_NUMBER, RANK, DENSE_RANK, and NTILE, instead of just returning one large set of values, you can partition the values by using PARTITION BY.

As another simple example, imagine I need a row number for each order line in an order. I can just partition by the order number.

It's not just row numbering though. With NTILE, I could allocate the results of exams into bands, but partition the entire set by which class the students are in, or by which exam they took.

Partitioning the data into groups can be very useful.

Learning T-SQL

It's worth your while becoming proficient in SQL. If you'd like to learn a lot about T-SQL in a hurry, our Writing T-SQL Queries for SQL Server course is online, on-demand, and low cost.

T-SQL 101: #92 Categorizing output rows by using NTILE

I've been talking about the basic window functions in T-SQL and one that's not well known but is surprisingly useful is NTILE.

I'm not sure on the name but it's probably short for percentile. I don't know why they didn't call it a slightly more meaningful name, but what it says is take the output and break it up into bands or chunks of data.

So if I say NTILE(10), that'll give me a tenth of the rows with a value of 1, another tenth of the rows with a value of 2, and so on.

This is actually really, really useful if you're trying to do things like populate a graph with the bottom 10%, the next 10% and so on. This function would return you the required tiling for that graph i.e. which band each value needs to go into. In the example, I've called it customer band, but I've just gone through CinemaID order and so they start all being number one. Then there would have been a tenth of them with number 2. A tenth of them with number 3 and so on, and the end of the list is shown on the right. There the the last ones will have the value 10.

This can be very useful.

Learning T-SQL

It's worth your while becoming proficient in SQL. If you'd like to learn a lot about T-SQL in a hurry, our Writing T-SQL Queries for SQL Server course is online, on-demand, and low cost.

T-SQL 101: #91 Determining positions in a set by using RANK and DENSE_RANK

In my last T-SQL 101 post, I mentioned ROW_NUMBER. It let you put a row number or position beside each row that was returned. Sometimes though, you want a rank instead. A rank is similar but it's like a position in a race.

In the example above, I've used RANK to produce an ordering, based on an alphabetical listing of city names. Notice there's Abercorn Abercorn Abercorn and then Aberdeen. So, like in a race, if three people came first they all get the value 1. The next person is fourth. Three people came forth, so then the next one is 7th, and so on.

Dense Rank

That's the most common form of output but occasionally, people want the same thing, but without gaps. DENSE_RANK is like RANK, but it doesn't have gaps.

This is almost like getting to come second in a race, even though three people came first. So the idea here is that Abercorn there's three of them, and they get the value 1. Then there's Aberdeen. There's three of those, so they all get the value 2. Abminga comes next, so it gets 3. And so on.

Learning T-SQL

It's worth your while becoming proficient in SQL. If you'd like to learn a lot about T-SQL in a hurry, our Writing T-SQL Queries for SQL Server course is online, on-demand, and low cost.

SQL Interview: Starting a new series of blog posts

Many clients hire me to carry out job interviews on their behalf. If they're hiring someone for a data-related role, they want to make sure they're hiring the right person. That usually means two things:

  • The person will fit well within the culture of the organization
  • The person is technically competent.

I can't help with the first one. If I've worked with the client for a long time, I might have a pretty good idea but ultimately, that's something the client needs to decide.

But I'm well-placed to help them with the second part, and that's the part that the client often doesn't feel they can do well themselves.

Note: It's not the point of this post, but if you'd like to get me involved in your interviews, please just reach out.

Technical interviews

For the technical interviews, it's not just a basic knowledge test. I aim to work out two things:

  • Does the candidate actually understand the technology that they claim to, and at which level of expertise?
  • How well do they approach problems that they don't immediately know the answer to?

For their knowledge understanding, I'll start with fairly easy questions so they get comfortable, then use increasing depth to probe different areas, to judge their real understanding in those areas.

For the problem solving, I'll give them scenarios, ask if they have any immediate ideas on potential solutions, then dive into how they'd try to solve it. They can tell me what they'd check, and I tell them what they would have found if they did.

Problem Solving Ability

I'm keen to see how logically they think and how they approach problem solving. I wish I could say that the ability to think logically can easily be learned, but I'm afraid my experience tells me that's just not true.

Back when I worked for HP in the mid 80's, they sent us on a Kepner-Tregoe course on logical troubleshooting. I'm sure it seemed like a great idea, and the course was well constructed. But the people in my class who could logically troubleshoot at the start, were the same ones who could do it at the end. It's great to have a core methodology. But to me, it's all about the ability to think logically, and that takes longer than a week to learn. However, it's a critical skill when getting to the bottom of problems.

Blog Series

I decided to make a blog post series that covers a lot of questions that check core understanding of SQL Server topics. I'll include questions at different levels, and on a variety of SQL Server related topics.

I'm not talking about tricky or gotchya questions, or anything like exam prep questions. Just checks on core understanding.

BTW: I've added a new category called SQL Interview. So you can easily find any/all of them amongst my other posts by using the category filter on the blog. I've also got a series of Business Intelligence questions (focussing on Power BI, Tabular data models, Data Factory, etc.) being prepared.

I hope you enjoy reading these. Nothing is ever absolute in this industry, so if you disagree with any of them, feel free to comment and say why.

 

Book Review: The Lean Startup by Eric Ries

A few weeks back, I finished reading The Lean Startup: How Constant Innovation Creates Radically Successful Businesses by Eric Ries.

There have been a lot of books on Lean methodology over the years, so I didn't know what to expect with this one. But it was published back in 2001 (i.e. the turn of the century) and that's when the Lean methodology was still fairly new in entrepreneur circles.

I've seen others comment that this was the "single most informative book on the iterative entrepreneural process".

Ries looks a using the approach to consider how companies are created and evolve, and how their new products are launched into the market. He focusses on how to create a startup that's a long lasting startup. So many businesses crash and burn each year, mostly within their early stages. Ries looks closely at how you can create something new, but right in amongst very uncertain times.

It's a fascinating book. I just wish it was a little more lively. It does plod along a bit, but the content is excellent.

The Verdict

An excellent book. 8 out of 10. The excellent content could have been delivered a bit more lively though.

 

 

Opinion: Do your job advertisements still show a location?

I've been amazed as the pandemic has continued, how many companies that claim to understand remote work, still really don't get it. I've decided that a good test of that, is whether job advertisements still show a city or location for the work.

Now, not everyone can work remotely, and not everyone has anywhere suitable to work from remotely, but the majority of knowledge workers can do just that. That's who I'm talking about here.

Old School

David Solomon, the CEO of investment bank Goldman Sachs hit the news last week, when he described remote work as an aberration. He said he wants his people back in the office while noting that "a business like ours, which has an innovative, collaborative apprenticeship culture" can't work remotely. He added that "It's an aberration that we're going to correct as soon as possible".

Turns out though, that he's already now walking back much of what he said.

To agile innovative businesses that haven't been around all that many years, embracing remote work seems easy. For businesses like banks that have been around for more than a century, I'm sure they do see the change as a blip in their timelines. Turning these organizations around will be like trying to turn around the Ever Given within the Suez Canal.

Pre-pandemic

Before the pandemic, I already saw many businesses that were hamstrung by their inability to utilize remote talent. I remember consulting at a development company in Melbourne that had nearly 500 developers. They were endlessly searching for new developers. But only if they could live in Melbourne and come to their offices.

What's the outcome of that?

Well over time, they'd lose people and couldn't replace them. They'd already exhausted the local market, and so they started lowering the standards of who they'd hire.

Yet if they'd embraced remote workers, they could have had such a pool of bright developers they could have used. Instead of focussing on endless job searches, they should have been focussing on enabling a remote workforce. Instead of buying more office space and equipment, they should have been buying better connectivity.

Again, there are exceptions, like requirements for particular citizenships in certain jobs, but they are the minority, and even then, they don't need to be located in a particular city.

Pandemic Time

So many businesses that argued they couldn't work remotely suddenly found they could when the pandemic hit.

Prior to the pandemic, I found it particularly ironic that companies like Microsoft and even newcomers like Slack, really didn't seem to get remote work, even though they were building the tools to enable it. Teams is improving rapidly now, but the early versions of it showed all the hallmarks of being designed by someone who lives in a corporate cubicle, always within a single domain and email address.

Both companies have now changed their thinking but I can only wonder what sort of tools we might have had by now, if they'd actually been using their own tooling to truly enable remote working.

In particular, I've been really pleased to see the number of Microsoft job advertisements that now don't have a location, and make it clear they want to find the right person no matter where they are. This is great to see.

Future Work Combinations

I've been closely watching the surveys within Australia on where knowledge-worker staff want to be able to work. Most of the results break down like this:

  • 20% want to be in the office all the time
  • 20% want to be remote all the time
  • 60% want flexibility to choose between the two as needs arise

And just as importantly, they want flexibility not only on where they work, they want flexibility on when they work. 9-5 Monday to Friday just isn't going to cut it any more.

Worse, the lack of timing flexibility affects far more women than men.

Attracting and Retaining Staff

Any company that thinks the world is going back the way it was, is going to have a serious difficulty attracting great staff, particularly if they're chasing world-class talent.

I can already imagine the companies that could lure away key staff from the Goldman Sachs of this world, by offering them a different way to work, that is employee-centric not just company-centric.

The best staff aren't likely to choose to spend mind-numbing hours commuting to/from cubicles in city centres, if there's an alternative.

 

 

 

 

 

T-SQL 101: #90 Numbering output rows by using ROW_NUMBER

In SQL Server 2000 and earlier versions, I often heard people ask "How do I output a row number beside each row that's output in my query?"

I remember some people arguing that it wasn't a valid request, as it didn't feel "set-based" but it was an appropriate request, and it could be dealt with in a set-based manner. Sometimes it's very, very useful to be able to do that.

In SQL Server 2005, we got the ROW_NUMBER function, and it did just what people were asking for. You got your normal query output, but also got a column with 1 for the first row, 2 for the second, and so on.

What's really good about this function though, is that it has an OVER clause that lets you specify the order for the numbering, separate to the order for the query. In the example shown above, I've ordered the output rows by CityName, and I've also ordered the row numbers by CityName. But I didn't have to. I could have said this:

That would have created row numbers based on the order of the CinemaID but the query would have output rows still in CityName order.

Note also that there was nothing special about the column alias (RowNumber) that I used. That could have been any name.

ROW_NUMBER was a powerful addition to T-SQL, but it was only the first of a set of window functions that were added to the language. We'll see more in later posts.

Learning T-SQL

It's worth your while becoming proficient in SQL. If you'd like to learn a lot about T-SQL in a hurry, our Writing T-SQL Queries for SQL Server course is online, on-demand, and low cost.

Learning Mandarin: Measure words for people

I often hear people saying that a difficult part of learning Mandarin is getting used to all the measure words. There are a bunch of them, but I actually think it's easier than in English.

Does any English-speaking person really know all the collective nouns for things i.e. a flock of geese, but a murder of crows, etc.

Chinese has one standard measure word:

(Gè). It's pronounced a bit like the "ge" in "gert", and has a sort of guttural sound, and definitely not like the "ge" in "german".

It can be used for people. So I can say:

一个人 (Yīgè rén) which is literally like "one unit/measure person"

It can also be used for specific types of person:

一个男人 (Yīgè nánrén) for "one man"
一个女人 (Yīgè nǚrén) for "one woman"

If I'm referring to family members, I can use:

(kǒu) which is pronounced like rhyming with "row" as in "row a boat"

I can then say:

三口人 (Sānkǒu rén) for "three family members"

A more polite measure word for other people though would be:

(Wèi) which is pronounced like "way".

I can then say:

一位老女人 (Yī wèi lǎo nǚrén) for "an old woman"

Many of these types of measure words also have different meanings when not used as measure words.

For example: (Wèi) when used as a noun, can mean seat or place.

If I need to show status to someone, not just the respect that (Wèi)  implies, I can use:

(Míng) which is pronounced pretty much as you'd expect in English

I can then say:

一名法官 (Yī míng fǎguān) for "a judge"

Learning Mandarin

I'll write more soon on the best methods for learning Mandarin. If you want to get a taste for it in the meantime though, my current favorite site is iTalki, and my favorite teacher by far is Amy He. If you decide to try it, click here and it's cheaper for both you and me.

 

 

Opinion: Does your organization have a memory problem?

There's a café in Melbourne that I've liked going to for a while. It's not far from where I live when I'm in Melbourne, and it has all the makings of a really nice cosy suburban café. But there's something missing.

When I sat there the other day, my meal came as ordered but it was just a shadow of what it normally looks like. I put that down perhaps to a different chef that day. It's not great that they haven't all agreed on what those meals are usually made like, but that could have been a one off thing.

Another couple sat down near me though, and one of them put a finger on what the issue is with the café. He said "this place has no memory". He wasn't talking about the food changing; he was talking about each time you come in, it's like you've never been there before. What I think is the basic problem in that café, is that the only people who interact with the customers are the wait staff; there are usually only one or two, and they seem to change these staff regularly.

The old Cheers TV show had that aspect right. The theme song talked about people wanting to go where everyone knows your name.

IT Companies

A large IT company that I deal with has a partner program. I remember talking to one of their staff who had called me about an upcoming event and how they were running it. I asked why on earth they'd changed one of the core concepts. She said to me "It's always been that way". I said "No, it definitely was different". She said "Ever since I've worked here, that's how it has been". I asked how long she'd worked there, and she told me "Six Months".

Customer Feedback

Another large IT company has a way for customers to provide feedback, and for other customers to comment on and discuss the feedback.

But every few years, they seem to manage to have a need to purge all the feedback, either as some ill-conceived way of "tidying up" the feedback, or by losing info during migrations between feedback systems.  Either way, the information and rich discussions they have lost are amazing. Wonderful insights that were provided by people who've now moved out of the industry are all gone.

Now add into that mix a constantly changing set of staff within the relevant teams in the company. Again, you've created a lack of organizational memory. And time and again, you see the same mistakes made by new staff that previous staff had made.

Does your organization have a memory problem?

 

SQL: Use CREATE OR ALTER to deploy procedures

In the last few weeks, I've seen a surprising number of database scripts that deploy procedures and functions by dropping the object if it exists, then recreating it. Please don't write scripts like this.

When you drop a procedure, you not only drop the procedure, but you drop all permissions associated with it. If you then create the procedure again, you'll have the procedure, but you won't have the permissions.

That's the fundamental reason why ALTER PROCEDURE is useful. It changes the procedure without affecting permissions.

CREATE OR ALTER

Because there wasn't a clean way to create a procedure (prior to SQL Server 2016 Service Pack 1) if it wasn't already there, and to just alter it if it was, CREATE OR ALTER PROCEDURE was introduced.

This should now be the basis of your deployment code. It creates the procedure if it's not already there, but if it is already there, it alters it, and this keeps the permissions.

I now almost never write either CREATE PROCEDURE or ALTER PROCEDURE. I always just write CREATE OR ALTER PROCEDURE.

Digitally Signed Procedures

Note that this only applies to normal SQL permissions applied to the procedure. If you have permissions assigned indirectly through digital signatures on the procedure, they'll be gone when ALTER PROCEDURE is used, and that's as expected. If you change the procedure, you lose the signature, and that's the desired outcome.