ADF: Use MSIs not SQL Logins whenever possible

ADF: Use MSIs not SQL Logins whenever possible

Azure Data Factory (ADF) is great for moving data around. It often needs to connect to SQL databases. If you’re creating linked services for this, please try to avoid using SQL logins. Don’t use usernames and passwords for authentication if you can avoid them.

Azure SQL Database lets you can create a user from a data factory’s managed service identity (MSI). MSIs are a special type of Service Principal (SP). They provide an identity where you don’t need to manage the ID or any sort of password.  Azure does that for you.

Each ADF exposes an Azure-based identity on its property page. Here is the main menu for my BeanPerfection data factory:

Data factory properties page

You might have guessed you should go to the Managed identities menu option, but that is for User Managed identities. I’ll write more about them another day. The identity we’re interested in is a System Managed Identity.

On this properties page, it’s actually called the Managed Identity Application ID. Like all other types of Service Principal, Managed Service Identities have an ApplicationID.

Creating a user in Azure SQL Database

I create a user in an Azure SQL Database with the FROM EXTERNAL PROVIDER option. This says that Azure SQL isn’t performing the authentication.

IF NOT EXISTS (SELECT 1 
               FROM sys.sysusers 
               WHERE [name] = N'beanperfectiondatafactory') 
BEGIN 
    CREATE USER [beanperfectiondatafactory] 
    FROM EXTERNAL PROVIDER; 

    ALTER ROLE datafactory_access 
    ADD MEMBER [beanperfectiondatafactory]; 
END;

The user name is the full name of the ADF. In Azure SQL Database, we can just use the name of the ADF instead of the identity. It will do that for us.

I always make this data factory name lower-case. Once the user is created, I add it to whatever role in the database makes sense. In my example above, I used the role name datafactory_access but there’s nothing special about that name.

You need to decide which role to add it to based upon the tasks that the data factory needs to perform. While it’s temptingly easy to just add it to db_owner, try to resist that. If in doubt, create a role that’s used for data factory access and grant it permissions as they are needed.

Must execute using an AAD-Based Connection

If you just try to execute the statements above, you might find that you get an error message saying:

Msg 33159, Level 16, State 1, Line 8 Principal ‘beanperfectiondatafactory’ could not be created. Only connections established with Active Directory accounts can create other Active Directory users.

You cannot create any type of Azure AD-based user in an Azure SQL Database, if your connection was authenticated as a SQL login. You must use a connection that was itself made using Azure-AD authentication.

I find that the easiest way to do that, is to make sure I have an Azure Activity Directory Admin assigned for my Azure SQL Server, and then just execute the code right in the Azure Portal. I use the Query Editor tab in the main menu for the Azure SQL Database and connect as that administrator.

2021-09-26