Cosmos Down Under podcast 6 with guest Vincent-Philippe Lauzon is now published!

I published a new Cosmos Down Under podcast today.

In the podcast, I've discussed the new integration between Azure Cosmos DB and Azure Data Explorer with the program manager for the feature: Vincent-Philippe Lauzon.

It's a great new capability and worth half an hour of your time to find out about it.

I hope you enjoy it. You'll find it, along with all Cosmos Down Under podcasts at: https://podcast.cosmosdownunder.com

SDU Podcast #83 with guest Davide Mauri released

I recorded the first of a series of podcasts for the SQL Server 2022 timeframe recently. Once again, it was great to chat with another old friend Davide Mauri. Davide is a Principal Product Manager for Azure SQL Database.

Note that the topic I chose for today was the new  sp_invoke_external_rest_endpoint system stored procedure, and it's currently only available for Azure SQL Database but I expect it will get ported to other flavours of SQL Server over time. Davide has been deeply involved with the development of this procedure.

Developers have wanted a better way to interact with external services from within Azure SQL Database and this procedure really helps with that. I feel it lets the database now integrate much more closely into the Azure ecosystem.

You'll find the podcast with Davide (and all our other podcasts) here: https://podcast.sqldownunder.com

Azure SQL Database now has an improved STRING_SPLIT !

I get pretty excited about new T-SQL enhancements. Back in 2016, I was so pleased to see Microsoft finally add a string split option to T-SQL, but my enthusiasm was limited when I saw how it was implemented. Now that's mostly fixed !

While it's possible to build functions that did string splitting just like I wanted, the problem is that no matter how you implemented them, they were really just too slow. And most importantly, much slower than a built-in function would be.

The Issues

SQL Server 2016 allowed us to write code like this:

That code would return this output:

STRING_SPLIT is a table-valued function. It took two parameters:

  • The string to be split (Unicode or not)
  • The separator

It returned a set of rows with a single column called value, and it was fast.

So what were the problems?

  • The separator could only be a single character
  • Often you need to trim the values being returned. (That can obviously be done outside this function using TRIM but it'd be nice if there was an option to do it in the function).
  • But the biggest issue of all was that because a table-valued function doesn't have an inbuilt order for returning rows, you didn't know which token was which, that had been extracted from the string.

Now a TVF returns a table, and tables don't have built-in orders either. Nor do views. But what was needed was another column that told you which rows was which, and you could then use that for ordering when needed.

Azure SQL Database

STRING_SPLIT in Azure SQL Database now has an optional parameter to enable an ordinal (i.e. a position).

That extra parameter is a bit or an int that says if an ordinal value should be added to the output.

So this code:

Now returns:

I love working with Azure SQL Database for many reasons, but one of those is that you get new language extensions before other parts of SQL Server. I've been using this feature since the day that I first heard it existed, and it's working great. Now the updated documentation is online as well.

It also means that I can obtain any specific token that I want. For example, get me the third token in the string:

And that returns:

This is a much-needed enhancement.

ADF: Use MSIs not SQL Logins whenever possible

Azure Data Factory (ADF) is great for moving data around. It often needs to connect to SQL databases. If you're creating linked services for this, please try to avoid using SQL logins. Don't use usernames and passwords for authentication if you can avoid them.
 
Azure SQL Database lets you can create a user from a data factory's managed service identity (MSI). MSIs are a special type of Service Principal (SP). They provide an identity where you don't need to manage the ID or any sort of password.  Azure does that for you.
 
Each ADF exposes an Azure-based identity on its property page. Here is the main menu for my BeanPerfection data factory:
Data factory properties page
You might have guessed you should go to the Managed identities menu option, but that is for User Managed identities. I'll write more about them another day. The identity we're interested in is a System Managed Identity.
On this properties page, it's actually called the Managed Identity Application ID. Like all other types of Service Principal, Managed Service Identities have an ApplicationID.

Creating a user in Azure SQL Database

I create a user in an Azure SQL Database with the FROM EXTERNAL PROVIDER option. This says that Azure SQL isn't performing the authentication.

The user name is the full name of the ADF. In Azure SQL Database, we can just use the name of the ADF instead of the identity. It will do that for us.

I always make this data factory name lower-case. Once the user is created, I add it to whatever role in the database makes sense. In my example above, I used the role name datafactory_access but there's nothing special about that name.

You need to decide which role to add it to based upon the tasks that the data factory needs to perform. While it's temptingly easy to just add it to db_owner, try to resist that. If in doubt, create a role that's used for data factory access and grant it permissions as they are needed.

Must execute using an AAD-Based Connection

If you just try to execute the statements above, you might find that you get an error message saying:

Msg 33159, Level 16, State 1, Line 8
Principal 'beanperfectiondatafactory' could not be created. Only connections established with Active Directory accounts can create other Active Directory users.

You cannot create any type of Azure AD-based user in an Azure SQL Database, if your connection was authenticated as a SQL login. You must use a connection that was itself made using Azure-AD authentication.

I find that the easiest way to do that, is to make sure I have an Azure Activity Directory Admin assigned for my Azure SQL Server, and then just execute the code right in the Azure Portal. I use the Query Editor tab in the main menu for the Azure SQL Database and connect as that administrator.

 

SQL: Fix: The parameters supplied for the procedure "sp_set_firewall_rule" are not valid.

I often use the procedure sp_set_firewall_rule to set firewall rules for Azure SQL Server. (There's a similar call to set the firewall for databases). The other day though, I got an error that had me puzzled:

I also tried it with named parameters and got the same error.

When I looked at my previous scripts, I realised that I had used a Unicode string for the first parameter previously.

Solution

I changed 'TestRule' to N'TestRule' and it worked fine.

I've not seen a procedure before that wants a Unicode string that won't accept an ASCII string. For example, this works just fine:

Apparently strict data type checking has always been a feature of extended stored procedures and this one checks specifically for a Unicode string. I really don't use extended stored procedures much anyway.

What threw me as well is that I couldn't find it in the list of system stored procedures. It's not there because it's actually an extended stored procedure. These used to mostly have xp prefixes, not sp prefixes, and are a good example of why I don't love using prefixes like that.

I really wish though, that this procedure had a better error message. While the current one is strictly correct, it is not actually all that helpful.

 

SQL: Password complexity rules for Azure SQL

Azure SQL (both Azure SQL Database and Azure SQL Managed Instance) both have different password complexity rules to SQL Server. I was reading an email discussion list and a poster asked where he could find the list of password complexity rules for Azure SQL. I said I'd never seen a list.

Well it turns out that there is a list, but not where you might have thought to look. They're spelled out in this article:

Identify the right Azure SQL Database SKU for your on-premises database (Data Migration Assistant) – SQL Server | Microsoft Docs

To avoid you reading the whole article, at the time of writing, these were the rules for Azure SQL Database:

  • Your password must be at least 8 characters in length and no more than 128 characters in length.
  • Your password must contain characters from three of the following categories – English uppercase letters, English lowercase letters, numbers (0-9), and non-alphanumeric characters (!, $, #, %, etc.).
  • Your password cannot contain all or part of the login name. (Part of a login name is defined as three or more consecutive alphanumeric characters.)

Slightly more confusing is that the article says it's talking about the server admin password, but it also appears that this same list of restrictions applies to all logon/user passwords as well.

Azure SQL Managed Instance is exactly the same except it requires the password to be at least 16 characters in length.

I hope that helps someone, including myself when I next go looking for this.

 

SQL: UNMASK is now granular in Azure SQL

Dynamic Data Masking was added to SQL Server back in 2016. I've almost never used it since. Why? There were two reasons:

#1: The first reason was that it didn't work in conjunction with other security features. In particular, you couldn't use dynamic data masking in conjunction with Always Encrypted. From the start, I've been telling the product group that this is a misstep. There's very little data that I'd want to hide from my own staff, that I'm happy still being visible to a hosting provider.

This is still an issue today. Some users might just want to use Dynamic Data Masking though, without Always Encrypted, so there might be a use case there, particularly with on-premises systems.

#2: (And this was the big one) UNMASK wasn't granular enough. There was a single UNMASK permission for the whole database. That means it worked nothing like other permissions in the database. We can usually set permissions on schemas (my favourite), tables, and columns. And I didn't see UNMASK as any different. Again, a number of us made that very clear to the product team.

However, this has now changed !

I wanted to call attention to a recent blog post that just mentioned in passing that UNMASK was now granular. In fact, you can apply it like before, to the whole DB, but you also apply it at the schema, table, and/or column levels.

Well at least you can in Azure SQL Database (and Synapse Analytics). One of the things I love about working with Azure SQL Database is being at the front of the queue when T-SQL and DB enhancements are made.

This is great news. Thank you to the product team !

Azure: Changing Azure SQL Database Service Objective from T-SQL Commands

When I'm building Azure Data Factory pipelines, I often want to rescale, i.e. change the Service Level Objective (SLO) of Azure SQL Databases before and after processing. For example: I might have a database that sits at S0 or S1 all day long when not being processed (to allows for the odd adhoc query), but I want it at S6 to do overnight ingestion of data and loading of analytic data models. Then I want it to go back to what it was before.

This can make a huge difference to the cost of these databases, when they are used intensely but intermittently.

While it's possible to do this from PowerShell, etc. I've found that the easiest (by far) is to just execute the required commands from T-SQL.

Stored Procedure Approach?

What I really wanted to do, was to put a stored procedure in each database, that can be used to scale that same database. I could have then included that procedure in our SDU Tools.

But you can't do that. While the scaling would work, the stored procedure would always return an error because it would be terminated when the rescaling occurs.

The way to avoid being terminated, is to query the master database instead. The annoying part there, is that with Azure SQL Database, you aren't allowed to put any user objects in the master database. So while it would be a perfect location to hold a stored procedure for changing the SLO of a database.

Just a SQL Batch

In the end, the easiest is to just send the required SQL batch to the master database. In ADF, I can do that with a single Lookup activity. (That's the way you just execute a SQL statement).

It's annoying that the Lookup activity only supports parameters when you're calling a stored procedure. It doesn't support parameters for SQL query batches. So I have to just put the values right into the code.

Regardless, if you want to do this, below is an example of T-SQL code that will do that. You'll need to execute it against the master database, and as a server administrator.

(Note: code might wrap)

 

Making a cloud transformation, not just a migration – free course

In the dying days of PASS, one of the attempts at keeping the organization afloat was the introduction of the PASS Pro membership. One benefit of that was access to a series of Learning Experiences. Since the demise of PASS, our friends at Red-Gate have purchased the assets, and that includes those courses.

I was pleased to hear from the Red-Gate team that the courses have now been made available for free, and without any type of paywall/loginwall. That's just awesome.

The course that I built for them was on making a cloud transformation, instead of just a migration. So many times, I see CIOs/CTOs going on and on about the cloud transformation they are embarking on, yet I don't see them transforming anything. The cloud vendors love it, as it's an apparently quick way "to the cloud" for an organization. But for the organization, it often leads to lousy outcomes.

You can now see that course here.

The course is about 3 1/2 hours in length.

You'll find the other courses in that series here.

And did I mention free ?

 

SQL: GREATEST and LEAST – awesome new additions to Azure SQL Database

I still get really excited when T-SQL has new language elements added. I'm not talking about new data definition or management statements. I'm talking about new functions and logic. So, as expected, I was really happy when I saw some posts saying that GREATEST and LEAST now seemed to be working in Azure SQL Database.

They've been in other database engines before but weren't part of T-SQL.

I tried them, and sure enough, they work as expected. NOTE: Intellisense hasn't caught up yet, so they're still flagged as syntax errors.

GREATEST and LEAST

GREATEST takes a list of values and returns the largest. LEAST takes a list of values and returns the smallest.

Note that both ignore NULL values, and unlike aggregates, don't produce any message about the NULL value being ignored either:

That's how you'd hope it would work.

This is really good news. It's so common to want to choose values on this basis. For example, if I want to offer a customer the lowest price i.e. either their own special price, a sale price, or the normal price:

The Old Way

Before these functions existed, doing the above either required a convoluted bunch of CASE statements or a potentially even less obvious OUTER APPLY statement:

And that would get messier and messier if you needed to do it for multiple columns.

This is really good news. Thank you SQL Server team. Hope it comes to an on-premises version soon.