SQL: Maintaining Online Website Data during Full Data Refreshes – Part 2 Code Example

I had a number of responses to my blog post about maintaining an online website's data during full data refreshes by using partitioning. Lots of people asked for a code sample, so I've provided it here.

It's really common to want to keep a website online as much as possible, but need to rebuild the data that it's displaying periodically. It might be a website for members of a superannuation fund, and you need to let them see balances as of last night. The problem is that you can't have the data missing for long or the website isn't going to keep working. You can't just truncate the tables, and start repopulating them again. You need to make the changeover as quick as possible.

Let's look at an example involving stock pricing.

Live Website

I'll start with two schemas: one that holds tables for the website, and one that's my staging tables:

Now I'll add the StockItems table that will be used by the live site:

And while I'm at it, I'll put some data in it from the WideWorldImporters database:

So now you can imagine that we have a live website hitting this data. Here's a sample of the data:

Note that there's a Steel Gray item shown.

Updated Stock Items

Now imagine that a number of changes have happened to the stock items, and in addition, the company has decided that they no longer want to sell Steel Gray colored items.

We need to reload the stock items table with new data. If that was a super-fast option, we might just get away with making the changes directly to the table. But in many cases, we need to truncate the table, then repopulate it from source data, and that repopulation can take a long time.

So here's what we can do instead:

Staging Table

Let's create another StockItems table. I'll put in a separate schema but there's nothing special about using schemas here. It could just have been another table. But it needs to be identical to the live table.

This is the table that we can then truncate and repopulate, doing whatever we need to until it has the correct data in it.

For demo purposes, let's just load it with stock items that aren't Steel Gray.

This is our background reload process.

Switch the Staging Data to the Live Data

Now when it's time to move the staged data into production, we do that by truncating the live data and switching in the staged data. This is likely to be a very fast operation, as the data being switched in does not move at all. It's just a metadata operation.

And now the live site has the new data. (Note that we could do a partition switch, even though we weren't using table partitioning directly. Every table has, by default, a single partition).

 

 

Leave a Reply

Your email address will not be published. Required fields are marked *