Recently I’ve been working on a system where I had to create a test script that reliably recreated a database each time it was executed. I’ve done this many times before and thought I knew how but I was wrong. It turns out that reliably dropping a database in a script that you run from SQL Server Management Studio (SSMS) is harder than it looks. (It’s also harder than it should be).
In Books Online (BOL), the article for DROP DATABASE says :
“You cannot drop a database currently being used. This means open for reading or writing by any user. To remove users from the database, use ALTER DATABASE to set the database to SINGLE_USER.”
Based on this, I had been using the following style of script to drop and recreate a database:
USE master;
GO
IF EXISTS(SELECT 1 FROM sys.databases WHERE name = N'Blah')
BEGIN
ALTER DATABASE Blah SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
DROP DATABASE Blah;
END;
GO
CREATE DATABASE Blah ON
…
However, what I’ve found is that when I execute this from SSMS, it doesn’t always work. Fairly randomly (and hard to reproduce), it fails with an error message telling me that the DROP failed because the database was in use. But isn’t that the whole purpose of setting it to single user?
The problem seems to be that although the database gets set to single user, another connection could be made to it before the DROP DATABASE statement occurs. Setting the database to SINGLE_USER isn’t enough as the current database context for the script is the master database, not the database in question.
A number of users and fellow MVPs have told me that they’ve experienced the same issue. What we suspect is causing this is the Intellisense system within SSMS connecting to the database to check the syntax of objects later in the same script. The problem only really seems to happen in SSMS but it could happen elsewhere.
A great suggestion from Rob Farley was to set the database OFFLINE instead of SINGLE_USER. While that would stop the other connection, the problem is that when you DROP a database that is OFFLINE, the files for the database don’t get deleted. You’d then need to have some additional script (nasty) to delete the files as well after the database was dropped.
What is really needed to get around this race condition is:
DROP DATABASE WITH ROLLBACK IMMEDIATE;
If you agree, I’ve created a Connect item here that you can vote on:
(Update 8 Sep 2019): As all the Connect items got nuked again, that link doesn't work anymore. I did just find it in the new UserVoice system under a different name here: https://feedback.azure.com/forums/908035-sql-server/suggestions/32906317-cannot-reliably-drop-a-database-in-t-sql-script
In SMO there's a Server object method called KillAllProcesses('AdventureWorks'), where you specify the database name inside the parentheses. If you capture the T-SQL executed when that method is called it returns the following query:
SELECT DISTINCT request_session_id FROM master.sys.dm_tran_locks WHERE resource_type = 'DATABASE' AND resource_database_id = db_id(N'AdventureWorks')
If any results are returned – say that Session ID 51 is using the database – SMO executes a Kill statement for each session returned.
Perhaps you could implement a similar set of logic in your script.
Hi Allen,
Yes, one option is to hunt down and kill all spids that are connected to the database but even that has the potential for a race condition. One could connect as soon as you kill another.
There just should be an atomic way to say "make this database go away".
If it really is the IntelliSense which is the culprit one solution would be to swap to SQLCMD mode – as Intellisense is turned off then.
Hello there.
I implemented the DROP DATABASE WITH ROLLBACK IMMEDIATE a while ago, only to see that this particular condition is not gone. It's been happening for me for the past 3+ months; the solution that worked for me was to create two different steps in a job, one to put the database in single user, and the other one to actually drop the database. Both steps are using ROLLBACK IMMEDIATE.
Hi Colin,
The problem is that DROP DATABASE doesn't have a rollback option, and having it in two separate statements doesn't work reliably.
Hi Stephen, true but not all tools support SQLCMD mode. There should be a way to do this reliably in a T-SQL script.
If I recall correctly, you can do the alter to single user in the context of the db youre altering, then switch to master for the drop, and your session retains the single user lock.
Hi Tony,
No, note that the database might not exist when the script is first run. Also, as soon as you change to another DB context, another session could connect before your drop executes.
It simply needs to be an atomic operation, not two separate operations.
I agree it needs to be an atomic operation, but currently the approach i describe works, as the lock is held by your session after the zswitch to master. This script does what you describe, and shows the lock still held:
USE master
GO
IF DB_ID('Test') IS NOT NULL
BEGIN
USE Test
ALTER DATABASE Test SET SINGLE_USER WITH ROLLBACK IMMEDIATE
USE master
SELECT DB_NAME(resource_database_id), * FROM sys.dm_tran_locks WHERE request_session_id = @@SPID
DROP DATABASE Test
END
Greg, I like the idea of rollback immediate – simple and elegant.
Would emergency mode work? I'm not sure what impact dropping a database in that mode has, but at least no one (except an admin — you, presumably) would be able to connect…
Strictly speaking (and annoying as the behavior is), I am afraid this connect request is not a bug, in that I can find no Microsoft promise that DROP DATABASE _must_ drop a database (they actually suggest the opposite can happen). As such, this connect request is more akin to a Design Change Request (DCR). For many development shops, DCRs are assigned a lower priority than bugs. DCRs invariably need a solid business justification (in order to get them considered in the current release, or even for the next release).
For DCRs and bugs, it is important to explain why workarounds are not viable. To determine whether a workaround is viable, I am afraid one must first know the cause of a problem. Without knowing a cause, it is possible (even likely) for many of the solutions (or suggested workarounds) to be worthless. I am afraid this connect request does not identify a cause (e.g. the title mentions a "T-SQL script" that has not been attached, and the text does not mention SSMS), and I am afraid it only partially identifies the effect (e.g. it does not explicitly identify the error or errors being raised by the T-SQL script).
A cause could be SSMS' Intellisense, AUTO_UPDATE_STATISTICS_ASYNC, or something else. A timely trace (keep in mind that ASYNC stats could have kicked off minutes to hours before SINGLE_USER was issued) can help uncover a cause. Or some TRY CATCH error handling might also prove useful. Without a cause identified, I have to assume the problem Microsoft will face (when reading this request) is: What code needs to be redesigned? For any large RDBMS product, I have to assume the team that handle connectivity is different from the team that handles security, is different from the team that handles DDL, and is different from the team that handles system metadata. And yet for such a problem, a solution is likely to involve all of those teams.
For an Intellisense cause, I believe Microsoft's intent was to design SSMS for DBA use (i.e. SSMS was not designed for use by users). So if the cause is rooted in Intellisense, Microsoft might suggest SQLCMD to be used by DBAs (instead of SSMS) as a workaround. As an added bonus, SQLCMD can be automated. But Microsoft will still need to consider: Is the design needing to be changed within SSMS or within the engine (note that the engine is designed to serve whatever its clients, such as SSMS, demand)?
Or, if the cause of the failure is rooted in AUTO_UPDATE_STATISTICS_ASYNC, the way to address this cause has already been documented in BOL (http://technet.microsoft.com/en-us/library/bb522682%28v=sql.110%29.aspx, under the SINGLE_USER section). Note that it would be useless to suggest SQLCMD as a workaround for this cause. Note that this cause is by design (hence why how to address it has already been documented in BOL).
And I am imagining that RESTRICTED_USER is potentially another workaround (for an unknown cause).
Because of the above, I am afraid I see a connect request that needs more work (such as scripted repro steps, the complete error or errors SSMS is raising). Also was is needed is the expected outcome, which is above (but not in the request) "What is really needed to get around this race condition is:
DROP DATABASE WITH ROLLBACK IMMEDIATE;"
By stating the expected outcome (attaching the script, and mention SSMS) in the connect request, I have to assume Microsoft will be able to more precisely offer a solution or workaround,
Without such details I have to assume Microsoft will be wondering where this request should be routed. Even if routed correctly, I assume developers without a business justification are going to have a hard time justifying the work that will be needed to address this connect request :).
Instead of simply voting, I think it would more useful for voters to explain (as a Comment under the connect request) how their inability to reliably drop a database impacts or has impacted their business.
I am not claiming current behavior is a _good_ design :). I am not claiming the connect request is unworthy of a solution (or workaround). What I am claiming is that the request is likely to go nowhere (no matter how many vote for it), because I see insufficient data to compute :). I am claiming Microsoft has a top-heavy bureaucracy whose wheels need to be heavily greased (far more than you and I imagine :).
While handing a need to Microsoft upon a silver platter might seem rather onerous, I assume it will result in a more pragmatically useful Microsoft response:).
Hi Adam,
EMERGENCY still allows multiple connections by sysadmin members.
Hi Bill,
I see it as either an engine bug or (more likely) a documentation bug. If the doco says that you can drop a database this way and it doesn't always work, either the engine or the doco need changing.
For Tony, I'm not getting the same result. If I go up to a system without a Test database and execute the code you have above:
USE master
GO
IF DB_ID('Test') IS NOT NULL
BEGIN
USE Test
ALTER DATABASE Test SET SINGLE_USER WITH ROLLBACK IMMEDIATE
USE master
SELECT DB_NAME(resource_database_id), * FROM sys.dm_tran_locks WHERE request_session_id = @@SPID
DROP DATABASE Test
END
What I get is:
Msg 911, Level 16, State 1, Line 4
Database 'Test' does not exist. Make sure that the name is entered correctly.
It seems to check that the database that's referenced by the USE actually exists before executing the batch.
Fixed:
USE master
GO
IF DB_ID('Test') IS NOT NULL
BEGIN
EXEC('
USE Test
ALTER DATABASE Test SET SINGLE_USER WITH ROLLBACK IMMEDIATE
USE master
SELECT DB_NAME(resource_database_id), * FROM sys.dm_tran_locks WHERE request_session_id = @@SPID
DROP DATABASE Test
')
END
Hi Tony,
I like the idea of using dynamic SQL here and think that might work much more reliably. Even though it shows the resource being held, I suspect there still would be a chance of a race condition between when the database context changes back to master and the drop occurs but it's obviously very unlikely to fail.
I just wish there was a statement that did it cleanly. You should be able to just say "drop that database and disconnect anyone that was connected to it".
Have you tried with NO_WAIT?
(ALTER DATABASE Blah SET SINGLE_USER WITH NO_WAIT)
I wonder if the quicker alter database will yield you more reliable dropping.
Hi Jon,
NO_WAIT doesn't work faster. It just fails if it can't get the required locks.
Its not just DROPing a database that is affected, ive had similar issues rolling back databases to snapshot.
IMO what is need is
"ALTER DATABASE X Set SINGLE_USER WITH ROLLBACK IMMEDIATE ( OPTION = OBTAIN_LOCK)"
Or in english , set to single user and make that user me.
We have the same issue here at Stack Overflow, though more often with read-only/read-write changes. The scenario isn't SSMS taking connections, but 11 high traffic web servers fighting for connections. These operations don't have a small chance of failure – they will almost certainly fail. Being stuck in single-user mode when the same 11 servers fighting for it is a bad place to be, so the fail case is pretty harsh.
I would LOVE an atomic operation for any operation that requires SINGLE_USER mode, using it online without other mechanisms is nearly impossible for us – and it has such a slim chance of working, it's not worth risking the fail. Currently we have application level switches to disable databases and connections from the app tier – it's the only way to get things done.
Would it be possible to do a re-try, if the drop fails?
Something like:
create procedure DropTestDB as
begin
IF DB_ID('hsptest') IS NOT NULL BEGIN
begin try
ALTER DATABASE hsptest SET SINGLE_USER WITH ROLLBACK IMMEDIATE
DROP DATABASE hsptest
End try
Begin Catch
exec DropTestDB
End Catch
End
End
go
exec DropTestDB
I just created a CLR SP to drop the db (except for master database) as I put it here http://www.sqlservercentral.com/blogs/jeffrey_yao/2014/01/20/reliably-drop-a-database-with-clr-stored-procedure/
Hi Jeff,
That's cool but many people don't have CLR integration enabled. And you shouldn't need to drop out to CLR code just to drop a database reliably. So far, I think the dynamic SQL option that Tony suggested is the best option for straightforward T-SQL code. It's a pity to need to jump into dynamic code though.
Hi Greg,
Actually Tony's dynamic sql option is almost the same as my t-sql method mentioned in my http://dbaphilosophy.wordpress.com/2014/01/07/reliably-drop-a-database-in-t-sql-script/, however, from a PURE theory perspective, as soon as "use master" is executed, another session can grab the target db, so from theory perspective, the target db should be switched to offline first and then drop it and then delete each data/log file used by this target db. That's why I think only CLR SP can accomplish this task easily.
But you are absolutely right, not every environment allow CLR enabled. 🙂
Unless your connect bug is implemented, the best way to do this is to write a CLR proc that will look in sys.database_files, put the database offline, drop it, and delete the files. You would need "external_access" permission and the proc would have to run with the same credentials as the sql service.
Yes its annoying to write all that C#, but I can't think of a better way.