DevOps: Without PaaS, a Cloud Transformation is just Expensive Hosting 2.0

Damon Edwards gave a session recently where he claimed that without self-service operations, the cloud is just Expensive Hosting 2.0. There is much in his session that I completely agree with, and that I have been concerned about for quite a while. I see it more in terms of the adoption of Platform as a Service (PaaS) offerings.

I spend most of my consulting/mentoring time in larger organizations; many are large financial organizations. In every one now, there is a person heading up a "Cloud Transformation" project, but none of these companies mean the same thing when they talk about these projects.

In so many companies, all they are doing is taking the virtual machines and networks that are hosted with some existing hosting provider, and moving them into a public cloud.

Let's be clear: those companies are not making a cloud transformation.

They might be replacing existing infrastructure that's difficult to work with, with dynamic and configurable cloud infrastructure. They might also be outsourcing many of the functions of their difficult-to-work-with IT support teams to the public cloud providers. This particularly applies to many offshore or offsite IT management providers.

But there's no real transformation going on. In so many cases, they are missing out on the real beauty that can be offered by cloud-based services.

Companies like Microsoft pushed PaaS services heavily at first, but then realized that many companies just aren't agile enough to start using them. They now seem to focus on getting customers into the cloud as the first step (aka lift and shift), and only later on getting them to use it properly.

If you aren't putting PaaS services in place, I don't think you're really making a cloud transformation. You may just be upgrading to the Expensive Hosting 2.0 that Damon mentioned. 

That may be better than where you were, but you shouldn't confuse it with a real transformation.

 

SDU Tools: Get SQL Server Table Schema Comparison

In a recent post, I mentioned that I often need to do a quick check to see if the schemas of two SQL Server databases are the same, and how our GetDBSchemaCoreComparison procedure can make that easy. In a similar vein, I often need a detailed comparison of two tables.

In our free SDU Tools for developers and DBAs, there is another stored procedure called GetTableSchemaComparison to make that easy as well. It takes the following parameters and also returns a rowset that's easy to consume programmatically (or by just looking at it):

@Table1DatabaseName sysname -> name of the database containing the first table
@Table1SchemaName sysname -> schema name for the first table
@Table1TableName sysname -> table name for the first table
@Table2DatabaseName sysname -> name of the database containing the second table
@Table2SchemaName sysname -> schema name for the second table
@Table2TableName sysname -> table name for the second table
@IgnoreColumnID bit -> set to 1 if tables with the same columns but in different order are considered equivalent, otherwise set to 0
@IgnoreFillFactor bit -> set to 1 if index fillfactors are to be ignored, otherwise set to 0

Note that the two tables can have different names, and can be in the same database or in two different databases.
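To give a rough idea of how a call looks, here's a minimal sketch. The database, schema, and table names below are just made-up examples, and you may need to prefix the procedure name with whichever database and schema you installed SDU Tools into.

EXEC GetTableSchemaComparison
    @Table1DatabaseName = N'SalesDev',   -- hypothetical first database
    @Table1SchemaName = N'dbo',
    @Table1TableName = N'Customers',
    @Table2DatabaseName = N'SalesProd',  -- hypothetical second database
    @Table2SchemaName = N'dbo',
    @Table2TableName = N'Customers',
    @IgnoreColumnID = 1,                 -- treat different column order as equivalent
    @IgnoreFillFactor = 1;               -- ignore index fillfactor differences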

You can see the outcome in the main image above.

You can see it in action here:

To become an SDU Insider and to get our free tools and eBooks, please just visit here:

http://sdutools.sqldownunder.com

SDU Tools: Get SQL Server Database Schema Core Comparison

I often need to do a quick check to see if the schema of two SQL Server databases is the same.

In our free SDU Tools for developers and DBAs, there is a stored procedure called GetDBSchemaCoreComparison to make that easy. It takes the following parameters and returns a rowset that's easy to consume programmatically (or by just looking at it):

@Database1 sysname -> name of the first database to check
@Database2 sysname -> name of the second database to compare
@IgnoreColumnID bit -> set to 1 if tables with the same columns but in different order are considered equivalent, otherwise set to 0
@IgnoreFillFactor bit -> set to 1 if index fillfactors are to be ignored, otherwise set to 0
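As a rough sketch of a call (the database names here are just examples, and you may need to prefix the procedure with wherever you installed SDU Tools):

EXEC GetDBSchemaCoreComparison
    @Database1 = N'SalesDev',    -- hypothetical first database
    @Database2 = N'SalesProd',   -- hypothetical second database
    @IgnoreColumnID = 1,         -- treat different column order as equivalent
    @IgnoreFillFactor = 1;       -- ignore index fillfactor differences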

You can see the outcome in the main image above.

You can see it in action here:

To become an SDU Insider and to get our free tools and eBooks, please just visit here:

http://sdutools.sqldownunder.com

Opinion: Don't buy hardware before a Proof of Concept

Just a short post today to call out something that I'm seeing again and again: organizations purchasing all their hardware and software platforms before they start to carry out a proof of concept. This is a very poor approach.

I was reading the data strategy for a global company that I was doing consulting work for. They were proudly introducing the new strategy, yet I was sitting there looking at it, trying to work out what they were thinking. The first step of their plan was to buy everything they needed. The second step was to carry out a proof of concept to see how it would all work (presuming it would work suitably at all).

This is ridiculous.

In that case, I think what's happening is that IT management wants to seem proactive: buying hardware and software platforms is what they are experienced at, and it makes them look like they are "doing something".

Image by RawPixel

Yet, invariably, this locks them into decisions that aren't in the best interests of the organization. Instead of making sensible decisions, they end up making decisions based on what they have already committed to. And the more expensive that purchase was, the harder they will try, for years, to justify the expenditure decision that they made. Every later choice will be made based upon how well it fits with their existing purchase.

Don't do this.

Do everything you can to carry out the proof of concept without buying anything that locks you into a decision path.

SQL: Backup SQL Server Databases to Nowhere Immediately After Creation (suggestion)

Most companies have some sort of ongoing maintenance processes that perform periodic backups of databases. They also have log backups scheduled for any databases that are online and in full recovery model. However, when you first create a database, it might be in full recovery model yet never have had a full backup performed. If your scheduled jobs then try to create a log backup prior to the first full backup, the backup will fail, and that might make your job fail.

One option to avoid this is to always create a backup of a database immediately, as part of the database creation script. You don't actually need the backup; you just want to avoid the next log backup failing if it runs before the first full backup.

The easiest way to do that is a backup to nowhere, i.e. the NUL device. You can do that via a command like this:

BACKUP DATABASE [MyNewDB] TO DISK = 'NUL';

It might be worth adding that to your database creation scripts for full recovery databases, to avoid throwing errors in your log backup jobs.
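As a rough sketch, a creation script for a full recovery database might then end up looking something like this (the database name is just the example from above):

CREATE DATABASE MyNewDB;
GO

ALTER DATABASE MyNewDB SET RECOVERY FULL;
GO

-- Throwaway backup to the NUL device so the log backup chain can start cleanly
BACKUP DATABASE MyNewDB TO DISK = 'NUL';
GO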

DevOps: Thoughts on Microsoft’s Acquisition of Github

I have many friends who would have checked the calendar when they first heard that Microsoft was buying Github. They would have guessed it was April 1st.

I think it’s another pretty bold strategic move by Satya Nadella.

It’s been interesting to see all the naysayers coming out of the woodwork to beat up on the idea of Microsoft owning Github, as though it was going to be the death of Github. Almost every single time I hear a justification though, it is based on their opinion of Microsoft, or on something the company did, often decades ago.

People outside the Microsoft camp seem genuinely unaware of the seismic changes that have happened within Microsoft in recent years. As someone who has worked with them and followed them closely for decades, I can tell you that it is a very, very different company now. If your opinion of them is based on anything more than a few years old, it’s time to re-evaluate your stance.

Microsoft is already a heavy user of Github, and is the largest contributor to open source software on the planet.

But more importantly, the acquisition puts Github into a solid financial position that it did not have before. Github was pretty much at a crossroads; it had a few suitors, but in any rational evaluation, Microsoft was the best positioned of them.

From Microsoft’s point of view, I can see how it will beautifully mesh with many ongoing changes within the company, particularly as things like Azure Functions take hold. It also provides more certainty for existing Github enterprise customers, and will introduce a whole new raft of enterprise level customers to Github.

The one big concern that I have is around identity. This is an area that Microsoft hasn’t yet sorted out. There are still too many issues with Microsoft Accounts vs Organizational Accounts and so on. There needs to be a good plan to integrate the Github identity system.

DevOps: Deleting stored procedures and views in target database when using database projects

We've been using SQL Server database projects ever since they were called "Visual Studio Team Edition for Database Professionals". That was back in 2005 and the name was quite a mouthful.

Until recently though, we'd only seen fairly patchy adoption of them among our customers. Over the last year or so, this has really changed: we're seeing so many customers starting to use them. Why?

I'm guessing that it's the increased interest in using DevOps.

One question that I hear again and again though relates to deployment. Developers assume that if you delete a stored procedure or view from the database project, it will be deleted from the target database when you publish the project.

By default, that won't happen, but it's easy to change.

When you are in the Publish Database window, click Advanced.

In the Advanced Publish Settings window, select the Drop tab:

The option that you're looking for is Drop objects in target but not in source:

You can then choose which types of objects this applies to.
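If you publish from the command line with SqlPackage rather than from Visual Studio, the same behaviour is (as far as I'm aware) controlled by the DropObjectsNotInSource publish property. A rough example, with placeholder server, database, and dacpac names, would be:

SqlPackage.exe /Action:Publish /SourceFile:MyDatabase.dacpac /TargetServerName:SomeServer /TargetDatabaseName:MyDatabase /p:DropObjectsNotInSource=True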

 

DevOps: To branch or not to branch

One of the discussions that comes up from time to time when working with DevOps is branching strategy.

One of the most commonly claimed strengths of Git is that it's so good at branching and merging. And indeed, it's pretty good at that. But the bigger question is whether lots of branching is desirable in the first place.

One argument says that if you are using branches (let's say to build features), you really aren't doing Continuous Integration (CI). The downside of having features in separate branches is that at some point you'll have to merge the code back in, and there's probably going to be nothing automated about that. One software house that I've been mentoring has a very large number of active live branches.

Each time a merge to a central trunk (i.e. the master branch) is attempted, it's beyond painful, often takes months, and introduces enormous numbers of bugs that need to be fixed. If you then combine that scenario with a lack of decent testing, you have a problem. Merging becomes really, really scary.

Image by Priscilla Du Preez

There is nothing agile or continuous about that.

The alternative approach is to always work in the master branch. While you can end up colliding with what other people are doing, at least you deal with that right then and there. You can end up with a frustrating series of collisions, so that you're endlessly working against an unstable base, but at least you know about it straight away.

A bigger issue is that it's easy to accidentally release unfinished features (or at least parts of them) into the wild. Feature flags are often used to try to hide this, but that can go wrong too.

So in the end, to branch or not to branch? My take on it is that branches are fine when they let you try something out in a scratchpad area, and when they are really short-lived. But long-term branches aren't something that I'm keen to see.
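If you do use a short-lived branch, the sort of lifecycle I have in mind is roughly this (the branch name is just an example):

git checkout -b try-new-idea      # scratchpad branch off master
# ...make a small, focused change and commit it...
git checkout master
git pull                          # stay current with the trunk
git merge try-new-idea            # merge back within days, not months
git branch -d try-new-idea        # then delete the branch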

DevOps: Scripting SQL Server objects and data from the command line

The other day I posted a shortcut about how to use SQL Server Management Studio (SSMS) to generate INSERT statements for data.

In one of the comments, Tom Corrigan asked if there was a way to do that from the command line.

The answer is yes. Apart from a variety of 3rd party tools, Microsoft released a tool to do just that in May last year.

The new tool is mssql-scripter and you'll find an intro to it here:

https://blogs.technet.microsoft.com/dataplatforminsider/2017/05/17/try-new-sql-server-command-line-tools-to-generate-t-sql-scripts-and-monitor-dynamic-management-views/

They describe it as "the multiplatform command line equivalent of the widely used Generate Scripts Wizard experience in SSMS".

Importantly, note the fact that it's multiplatform. It's built in Python and works on Linux, macOS, and Windows, and it can create both DDL and DML scripts that target SQL Server, including Azure SQL Database and Azure SQL Data Warehouse.

You choose where the output goes. Normally you'll send it to .sql files, but like any other Unix-style utility, you can pipe its stdout to other commands.
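For example, a simple run that scripts out an entire database to a file might look like this (the server name and credentials are just placeholders, using the same connection options as the example further below):

mssql-scripter -S somesqlserver -d WideWorldImporters -U someuser -P somepassword > WideWorldImporters.sql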

The source code is up at github here:

https://github.com/Microsoft/mssql-scripter

I love the fact that Microsoft teams are now putting source code like this up on github: continuing evidence of a "new Microsoft".

That means that you can:

  • Use it as a great learning resource
  • And if you're keen, submit pull requests to improve it

So to generate INSERT statements, you can use it like this:

mssql-scripter -S somesqlserver -d WideWorldImporters -U someuser -P somepassword --include-objects Customers --data-only

Note that it would all be on a single line.

Opinion: Which SQL Server columns should be nullable #2: Avoid placeholder values

In my previous opinion post, I was discussing why magic values were a problem. They are often added by developers when they are trying to avoid having any NULL data in a database.

While discussing this, I think it's worth also highlighting the overall dangers of placeholder values. These are values that aren't magic values, but are intended to be replaced at a later time.

If you use any of these, you need to have a solid process in place to:

  • Make them consistent and easily identifiable
  • Replace them in any database that's heading towards production
  • Avoid them like crazy in production databases (and if you can't do that, have a great process for finding any that leak out)

Here are some examples of what I'm talking about:

This one is quite nasty because it really has nothing to make it stand out from the containing text.

The main image above with the TV show is a more obvious one (or you'd hope so). No doubt it's a little harder for them to detect the use of that one, but it can't be that hard; otherwise, they need to use another type of image that doesn't require human detection.

One of my favorites is where people have used text generators, but then forgotten to replace the text. Here's an example on a wine bottle:

This is the classic Lorem Ipsum text.

The problem here is that they wanted something that really made the mock-up of the bottle look like the real thing. Unfortunately, it really did. I am surprised that whoever printed the labels didn't notice it though.

Ideally, if you have a good DevOps process, you'll have a testing step that checks for the presence of any placeholder values before any data is published.
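As a rough sketch of what such a test step might look like in T-SQL (the table and column names here are purely hypothetical), it could simply raise an error whenever a known placeholder value is found:

-- Fail the deployment step if placeholder text has leaked into the data
IF EXISTS (SELECT 1 FROM dbo.ProductDescriptions
           WHERE Description LIKE '%lorem ipsum%'
              OR Description LIKE '%PLACEHOLDER%'
              OR Description LIKE '%TODO%')
BEGIN
    RAISERROR('Placeholder values found in dbo.ProductDescriptions', 16, 1);
END;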

Before I leave this topic, for a little fun, Shopify published a link to some cute lorem ipsum style generators.

https://www.shopify.com.au/partners/blog/79940998-15-funny-lorem-ipsum-generators-to-shake-up-your-design-mockups

They did fortunately include my oldest favorite: Bacon Ipsum.

A blog reader, Erica Velásquez, mentioned a "standard" lorem ipsum style generator that she likes:
https://www.websiteplanet.com/webtools/lorem-ipsum/

Next week, back to how NULL does/doesn't fit with database design.