I get pretty excited about new T-SQL enhancements. Back in 2016, I was so pleased to see Microsoft finally add a string split option to T-SQL, but my enthusiasm was limited when I saw how it was implemented. Now that's mostly fixed !
While it's possible to build functions that did string splitting just like I wanted, the problem is that no matter how you implemented them, they were really just too slow. And most importantly, much slower than a built-in function would be.
SQL Server 2016 allowed us to write code like this:
FROM STRING_SPLIT('Hello,there,Greg', ',') AS s;
That code would return this output:
STRING_SPLIT is a table-valued function. It took two parameters:
- The string to be split (Unicode or not)
- The separator
It returned a set of rows with a single column called value, and it was fast.
So what were the problems?
- The separator could only be a single character
- Often you need to trim the values being returned. (That can obviously be done outside this function using TRIM but it'd be nice if there was an option to do it in the function).
- But the biggest issue of all was that because a table-valued function doesn't have an inbuilt order for returning rows, you didn't know which token was which, that had been extracted from the string.
Now a TVF returns a table, and tables don't have built-in orders either. Nor do views. But what was needed was another column that told you which rows was which, and you could then use that for ordering when needed.
Azure SQL Database
STRING_SPLIT in Azure SQL Database now has an optional parameter to enable an ordinal (i.e. a position).
That extra parameter is a bit or an int that says if an ordinal value should be added to the output.
So this code:
SELECT [value], ordinal
FROM STRING_SPLIT('Hello,there,Greg', ',', 1) AS s;
I love working with Azure SQL Database for many reasons, but one of those is that you get new language extensions before other parts of SQL Server. I've been using this feature since the day that I first heard it existed, and it's working great. Now the updated documentation is online as well.
It also means that I can obtain any specific token that I want. For example, get me the third token in the string:
FROM STRING_SPLIT('Hello,there,Greg', ',', 1) AS s
WHERE ordinal = 3;
And that returns:
This is a much-needed enhancement.
6 thoughts on “Azure SQL Database now has an improved STRING_SPLIT !”
There is no question about it, this is an improvement but one is very disappointing for me! You can read what I recommended and still think that can be much better on my blog: New-Azure-SQL-supports-STRING_SPLIT-with-order-guaranteed-the-awesome-the-bad-and-the-expected
Hi Ronen, I'm not as concerned about different orders. If they were going to do that, I'd prefer an OVER clause applied to it somehow.
Erik Darling already noticed this change (in SSMS 18.10) https://www.erikdarlingdata.com/sql-server/changes-coming-to-string_split-optional-ordinal-position/
Yes, the MVPs have been aware of it for some time, but we were asked to hold off publicizing it until the code was rolled out across all regions, and docs went public. I've been using it very successfully for weeks. The docs went live on the 5th.
The Intellisense is actually broken, even in 18.10. The third parameter is optional but you get red squigglies if you only have two parameters.
This is great and all, but I have to write code that runs against both a local SQL Server (2019 and up) as well as AzureSQL. It's sometimes challenging. Any word on if/when this will be added to stand-alone SQL Server? With the news of SQL2022, I've been trying to find any reference for T-SQL improvements/enhancements/changes/deprecations, and have been unable to find anything solid. And could they put this in a SQL2019 service pack (PLEASE!) like they did with the DROP IF EXISTS syntax in SQL2016 SP2?
Hi Paul, it'll be in SQL Server 2022 for sure. I think it's highly unlikely that it will turn up in a Cumulative Update for 2019. (Note: there are no service packs any more)