SQL Server 2014 Hybrid: Storing data files in Azure storage – Bizarre or not?

In the sessions that I was attending in Redmond yesterday, I was interested to see the presenter discussing the idea of having individual database files in the cloud as some sort of bizarre option that’s now available in SQL Server 2014. I don’t see it as bizarre at all. In fact, I see two distinct use cases that make complete sense to me:

Overcoming Drive Limitations in Azure-based SQL Virtual Machines

One of the current limitations of running SQL Server in Azure virtual machines is that you can only attach two additional data drives per CPU core, and each of those drives can be up to 1 TB in size. For example, if you spin up a two-core VM, you can attach four data drives, for a maximum of 4 TB of additional drive storage.

By allocating URLs for database files instead of attached drives, you can overcome this limitation. In a recent whitepaper that I was a reviewer on, we discussed the IOPS and size limitations of Azure storage based drives, both at the individual drive level and at the storage account level. Those characteristics remain unchanged, so please read that whitepaper for more detail on them. A core takeaway, though, is that you shouldn’t mix storage accounts for the files that make up a single database, to make sure that you maintain write ordering. You could, however, use different storage accounts (to overcome single-account limits) for different databases.
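
To make that concrete, here’s a minimal sketch of what the syntax looks like in SQL Server 2014. The storage account, container, database name, and SAS token below are all placeholders that you’d replace with your own:

-- Create a credential named for the container URL, holding a shared access
-- signature for that container (the SECRET value is a placeholder)
CREATE CREDENTIAL [https://myaccount.blob.core.windows.net/data]
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = 'sv=2013-08-15&sr=c&sig=<your SAS token>';

-- Create a database whose files live directly in Azure storage
CREATE DATABASE SalesData
ON ( NAME = SalesData_dat,
     FILENAME = 'https://myaccount.blob.core.windows.net/data/SalesData.mdf' )
LOG ON
   ( NAME = SalesData_log,
     FILENAME = 'https://myaccount.blob.core.windows.net/data/SalesData.ldf' );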

Tiered Storage for Partitioned Tables in On-Premises Databases

I commonly implement partitioned tables for sliding window scenarios. For example, a customer might want to keep 7 years of data online at any time. If table partitioning isn’t familiar to you, please read another whitepaper that I was a reviewer on. Even though it was created for SQL Server 2008, it’s pretty much still spot on.
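
If it helps, here’s a minimal sketch of the kind of monthly scheme I’m describing. The names, boundary dates, and filegroups are illustrative, and the filegroups are assumed to already exist in the database:

-- RANGE RIGHT: each boundary date starts a new partition
CREATE PARTITION FUNCTION pfSalesDate (date)
AS RANGE RIGHT FOR VALUES ('2013-09-01', '2013-10-01', '2013-11-01');

-- Three boundaries yield four partitions, so four filegroups are mapped
CREATE PARTITION SCHEME psSalesDate
AS PARTITION pfSalesDate
TO (fgArchive, fgSep2013, fgOct2013, fgCurrent);

-- The table is created on the scheme, partitioned by the date column
CREATE TABLE dbo.Sales
( SalesID int IDENTITY(1,1) NOT NULL,
  SaleDate date NOT NULL,
  Amount decimal(18,2) NOT NULL
) ON psSalesDate (SaleDate);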

One way that I use partitioned tables is to keep active data in the current partition, and older data in separate partitions, often one per month. This usually allows me to have the older data in read-only filegroups. You end up in a great situation where your application just sees a normal table, but most of the table is read-only. There are several advantages to this, some of which are listed here (with a short sketch of the relevant commands after the list):

– While locks are still acquired on the read-only data, hints like NOLOCK are then safe to use on that data.

– You won’t have a query accidentally overwrite your older data.

– The older data could be stored on slower media.

– The older data could have a higher compression rate. For example, you could use ROW compression for the active data, and PAGE compression for the older data.

– SQL Server doesn’t need to run recovery on read-only filegroups.
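
As a sketch of how the read-only and compression points play out, using the illustrative names from the earlier example and assuming the table lives in a database called Sales:

-- Compress an older partition with PAGE compression before freezing it
ALTER TABLE dbo.Sales REBUILD PARTITION = 2
WITH (DATA_COMPRESSION = PAGE);

-- Then mark that partition's filegroup read-only
-- (setting a filegroup read-only requires exclusive access to the database)
ALTER DATABASE Sales MODIFY FILEGROUP fgSep2013 READ_ONLY;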

Using data files in Azure storage could make really good sense for some of this older data. You could store the old read-only data in storage files and keep it available: your current data stays fast to access, while the older data remains online, just slower to reach.
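
Here’s a hedged sketch of what that could look like, assuming the container credential from the first example is already in place (the names and URL are again placeholders):

-- Add an archive filegroup whose data file lives in Azure storage
ALTER DATABASE Sales ADD FILEGROUP fgArchiveCloud;

ALTER DATABASE Sales ADD FILE
( NAME = Sales_Archive2007,
  FILENAME = 'https://myaccount.blob.core.windows.net/data/Sales_Archive2007.ndf'
) TO FILEGROUP fgArchiveCloud;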

I could see a situation where you might never need to remove your old data.

One other question that comes up in this situation relates to how backups work. Clearly you don’t want to be pulling all of your data out of the cloud when you perform a backup, particularly if you are also going to back up to the cloud. However, piecemeal backups in SQL Server are perfect for this. If you store only your read-only filegroups and archive data in Azure storage files, you can safely perform a partial backup using the READ_WRITE_FILEGROUPS option.
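
For example (again with illustrative names), a partial backup like this skips the read-only archive filegroups entirely; each read-only filegroup only needs to be backed up once, after it’s marked read-only:

-- Back up only the read-write filegroups, leaving read-only data untouched
BACKUP DATABASE Sales
READ_WRITE_FILEGROUPS
TO DISK = 'D:\Backups\Sales_ReadWrite.bak'
WITH CHECKSUM, INIT;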

All in all, I see that some of these new hybrid scenarios could be quite compelling, and not bizarre at all.

More updates to Azure: Which changes relate to SQL people?

The Azure team keep rolling out improvements at an amazing pace. Scott Guthrie posted recently about the latest set of changes. Here are the ones that I think are most important for SQL people:

Import/Export Hard Drives

Even though Azure storage accounts can hold a very large amount of data, one of the big questions has always been about how to get that data into the storage account in the first place. That question is now answered because the new Windows Azure Import/Export service lets you ship hard drives directly to the Azure team for uploading. The reverse is also available: if you need to export a large amount of data from a storage account, it can be copied onto a hard drive and the drive sent to you.

Currently the service uses FedEx for transport. If you are concerned about the security of data in transit, BitLocker encryption is also supported in both directions.

HDInsight Out of Preview Mode

HDInsight has made its way out of preview and into general availability. It allows you to process large volumes of data (in Azure Storage) using Apache Hadoop tools (including Pig and Hive).

HDInsight gives you a way of rapidly (and economically) spinning up a Hadoop cluster when you need one, and shutting it down again when you are done.

Virtual Machine Gallery

As the number of virtual machine images has continued to grow, it’s been getting harder and harder to find the image that you want when creating a new virtual machine. UI changes have been made to make this easier.

One of the discussions that the team has been having lately is around how long images should be kept. For example, once SQL Server 2014 ships, should images of SQL Server 2008 R2 and SQL Server 2008 also be kept? I’d like to see them all kept, but I’m not sure how much ongoing maintenance that creates for the team. Also, if they are all kept, should only the version paired with the operating system that was most current at the time be kept? If anyone has strong opinions on this, I’d love to hear them and I’ll pass them on.

General Changes

You can now configure endpoint security (access control lists on virtual machine endpoints) using PowerShell.

You can now configure alerts related to billing. You can get an email when your bill exceeds a given threshold. One thing I’d really like to see (which isn’t there yet) is the ability to simply have your bill emailed to you each month, even when it’s prepaid. Does anyone else feel the need for that?

For further details on areas that I haven’t mentioned, you’ll find Scott’s post here: http://weblogs.asp.net/scottgu/archive/2013/11/04/windows-azure-import-export-hard-drives-vm-acls-web-sockets-remote-debugging-continuous-delivery-new-relic-billing-alerts-and-more.aspx

TechNet Live: What’s new in SQL Server 2014

Hi Folks,

If you want to get your head around the core improvements coming in the SQL Server 2014 wave, I’m presenting a TechNet Live session on November 29th (Australian time).

Registration is here: https://msevents.microsoft.com/CUI/EventDetail.aspx?EventID=1032570709&Culture=en-AU&community=0

I’d love to see you online. 

SDU Podcast #61: Dr David DeWitt

I had the distinct honour (honor) this week of recording a new SQL Down Under podcast with Dr David DeWitt from the Jim Gray Systems Lab at Microsoft. Anyone who has been to a PASS Summit over the last few years would know that David seems to have single-handedly converted morning keynotes into must-attend events.

In this podcast, David explains his role, gives a philosophical view on where we’re at in the database industry and discusses core enhancements in SQL Server 2014.

You’ll find it here: http://www.sqldownunder.com/Podcasts

Enjoy!