I keep doing work at sites where none of the database code is stored in version control (source control) systems. I keep wondering why that is.
At a recent site, all the source code was in individual files just sitting in a single folder. That's just not sensible.
I'm left wondering why it is that almost every team that I see working with higher-level languages just assumes that some form of source control would be used, yet it's almost the opposite when I'm working with data teams.
Having decent source control makes such a difference:
- No more overwriting changes and losing them.
- No more wondering what changed between versions, or who changed them.
- And so on and so on.
There seems to have never been a culture of source control among DBAs; and database developers are somewhere in between these two worlds.
One aspect of this is tooling.
Vendors like Red-Gate do a reasonable job with their source control offerings for T-SQL but some clients want a "pure-Microsoft" solution for some reason.
In earlier versions of SQL Server Management Studio (SSMS), there was support for an SCCI (Source Code Control Interface) provider add-on. That would let you connect SQL Server script projects to source control. Sadly, that disappeared in recent versions of SSMS. I gather that there might be a way to attach the Visual Studio Team Explorer to it but I haven't pursued that and I really hope that a standard interface will return soon. I feel that SSMS should interface directly with both TFS and Git as part of a default install. Having tools like this without source code interfaces built in, helps to push an inappropriate direction.
If however, you are using SQL Server Database Tools (SSDT) to build your databases, then both TFS and Git are standard connections from Team Explorer.
I just find that I can't do my database development work directly in SSDT. I find very few people do that. Most use SSMS for development.
I'd love to hear others' thoughts on why this culture has evolved this way, and how to change it.
My own theory is that it relates back to the artificial barriers between DBAs and developers; all things data related "need" a DBA, even if they consider themselves to be "development DBAs". In other words, if you work with data, there's a tendency to see yourself in terms of the particular product you use and not part of the general family of "developers"
Developers should use source control. As long as database developers see themselves as DBAs and not developers, source control isn't part of their language.
In our development shop we have a developer database that developers can access. And much of it (stored procedures, functions, indexes, views) are stored in TFS source control. We have an in-house tool that compares what is in source control with what is in the developer database, and if there are any differences, the build will fail. The tool will also sync the database to source control or vice-versa. Sadly, other database artifacts are not yet part of this system (foreign keys, table schema) but maybe some day. Let's just say this system, home-brew and flawed though it is, is far better than nothing, and has saved our bacon more than a few times.
Interesting…. I have yet to work at a place where db code (and schema) is not in source control. It was always in Visual Source Safe, svn, cvs, git at every place I worked since 1999. Current place everything is built upon checkin from Jenkins in dev, twice a day in test. Maybe I got lucky 🙂
Denis
@pmbAustin – I love the idea of breaking the build if something in dev doesn't match production. I would exclude any object starting with "dba_" so things needed for dev work can be excluded.
@Denis – I'd like to hear more about how you handle structure changes.
Having worked in lots of environments, I can say the issue is complex, and that alone is a deterrent, especially in shops with little IT/DevOps chops. Stored code (sprocs/triggers/functions/types) should be in source control for sure. But the issue is tricky for referential integrity and schema. Once the database is put in production, the nature of the game changes forever, and I've only had custom processes to deal with rolling DDL to production.
Current and longstanding practice where I've worked is to package all db changes into an update package (set of scripts or some other tech) in two parts: 1) Get the stored code changes deployed (create and alter procedure statements); and 2) Other DDL SQL.
#1 is straightforward- Package all create/alter statements into one or many files, that the deploy uses .
#2 is via an idempotent update.sql that is only ever appended to after initial creation (when is initial creation? See below). This script checks for existing first when adding/removing columns, altering referential integrity, adding tables/views, dealing with sequences, look up table hydration or revisioning (which can be much more tricky) etc.
Also handled would be changes that need complex orchestration to handle data evolution related to the change. There is a reason Entity Framework (Microsoft's answer to Hibernate) doesn't automate schema changes. It's just too complex of an issue for any tool to automate. That's why data engineers and DBA's make the big bucks!
So when is initial creation of the update.sql? The first time you deploy changes after the database was put into production!
Yes, that means the update.sql can have way ancient stuff in it. But considering the complexity of all the issues that could occur in a schema rollback and/or database restore scenarios, this works well. It also works for revisioning to V current different databases which may of different versions (say you have a copy of production used for analysis that isn't revisioned at the same rate as production and you don't want to recreate which would require a data reload).
In my experience, an update.sql statement spanning years of development is no problem if written with great care. It runs fast, and ensures consistent state every time. It does however, require a culture of development where the team works under a DBA or data developer SPA to hand craft the update.sql file for every rollout containing database code or schema changes. But that level of organization pays huge dividends in a crisis.
Sean
You aren´t a DBA and can´t undestand our routines.
Source control isn´t used by DBA because our main tasks are absolutely different of developers.
We not develop codes, so simple.
Database developers write database instructions and in some cases can be used a source control tool, but in most cases is useless. If database should be back to specific point in time there is features of Database management systems that provide this.
There is no code control for DBA because is a logic question.
Devops for DBA is a joke, just a role for non DBA´s get a job.
Thanks for the feedback. Can't say I agree. And "you aren't a DBA" is an interesting claim 🙂
Nice little bit of ESL trolling there, thanks.
I've been a DBA, as well as analyst, designer, and developer for 20 years, and it's all about the workflow, baby. My Sprocs have headers and additional information and commands attached, multiple versions, tests, and separate troubleshooting/analysis DML code pages, which isn't something you can do easily without some kind of external storage, and source control is hard to beat for that. VS DB projects are very difficult to work with by comparison to integrated source control in SSMS. I'm trying to find a replacement for working directly with TFS from SSMS right now because I just can't to it all in Visual Studio, and the world has moved on from SQL Server 2014.
It might be worth taking a look at Azure Data Studio. It's based on VSCode and has SCC integration baked in. Its focus claims to be primarily on query editing.
I've had a database in source control for years using SSDT/Studio/TFS. One of the legitimate problems or differences between code deployment and database deployment is the database is stateful. Any data in the database changes the state of the database and all subsequent operations must account for that state.
The main adaptation required is to make database changes in a certain way: add a field and migrate to it rather than renaming a field. Add data to system/control/config tables only if it does not exist. You do have to put on your DBA hat when developing the database, but then again, if you don't have that hat you should probably ask someone who does before you go checking in database code.
It also means you can't just "cowboy" in your changes because the database reverts to what's in source code with every build. You can test changes and work with data and experiment "live" in SSMS (or in SQL query windows in studio) and some of that is useful. But the changes have to be made in source to make them stick.
The gains in source control far outweigh the differences in approach. Any code-schema-data coordinated changes are promoted together through the dev-test-prod environments. No more "oh I forgot" moments or untested changes just because the database was not updated with a release. And you can create a new environment, including the database, with the ease of creating a branch. There's really no excuse for not doing it.
Agreed. A change in discipline is required.
Best option is to implement timestamped up/down migrations. Tools I've used include:
– http://www.mybatis.org/migrations/ – SQL scripted migrations (works with most JDBC drivers. I used it with MySQL)
– https://fluentmigrator.github.io/ – C# migrations for Single or Multi-vendor schema (my project was SQL Server and dear old MS Access). I implemented a SQL Server => C# Fluent Migrator code generator that assisted in sequencing the migration operations.
– ORMs: Entity Framework, NHibernate and Hibernate
Hi Tony, I just like to see the DB code in projects, and in the same source control as any other part of the applications. Nowadays, no reason not to. And there are many good diff tools for SQL, etc. as well for when they're needed.
I'm not a fan of ORMs at all. I live in a world where performance and code quality matters. And I just see disaster after disaster with them. As an example, was at a site two years ago, where they estimate they lost 10 months work for 210 developers, ripping NHibernate back out.
A new tool/option to try – https://dbversioncontrol.com/ – SSMS plugin that generates migration scripts for schema/data changes. To be used with any version control system (GIT/TFS/SVN/etc.)
DB Version Control looks to be a beta at the moment though?
What about VersionSQL? I've not used it yet but it looks like it runs in SSMS and hooks into git.
Hi Twebb, what looks beta about it? See the response I had to Paul's question. SSMS is not their target for working with source control.
I'm having a lot of trouble with this. I want to be able share the SSMS solutions with the rest of my team. That is not currently possible within SSMS, I have to drop out to the commandline or use a separate tool to sync the changes to our GitLab server. It used to be possible as you can see from this video, https://www.youtube.com/watch?v=1iJMuUDB5MU
But these it's about syncing up at the database level and not about sharing code.
Hi Paul, the Microsoft method for working with DB source is now to use Visual Studio or VSCode (both free) to work with database projects, rather than doing this from SSMS. And yes, SSMS used to have an SCCI add-in that let it talk to TFS, etc. You were able to use that with script projects in SSMS. I still wish it had that. When you talk about SSMS "solutions", is it script projects that you're working with, rather than database projects? The team explained what was going on here (and provided one possible workaround): https://cloudblogs.microsoft.com/sqlserver/2016/11/21/source-control-in-sql-server-management-studio-ssms/. A better experience would be to use one of the many available Git add-ins for SSMS if that's what you're trying to do. The general direction from Microsoft is towards database projects, not just sets of scripts. However, I have bunches of script projects, and yes, I wish that SSMS would let me work directly with Git for them.
Hi,
I'm talking about the "script projects" that you access using Solution Explorer as a convenient way to use one pane of glass rather having drop in to the commandline or use another external tool to manage commits.
Thanks for your response. We are testing if that hack will work on SSMS 18+, I've a suspicion it probably won't.
We have been working with RedGate source control but it by bypasses Solution Explorer and hooks into the database instead. And from what we can tell all the current SSMS plugins work the same way.
Yes, today you have to use a Git add-in for SSMS, or given they're just files, an external Git client, or use VS to manage script projects. VSCode (using Azure Data Studio) might also be a good direction.
Hi Paul,
I really like the cut of you jib. I have made it a policy of mine to treat T-SQL like any other programming language such as C# and use tools such as Visual Studio and Git source control in the same manner. It really worries me when I hear the only copy of the database we have is a "backup". We should have a code copy also to be truly compliant of 'infrastructure as code'.