SQL: Implicit vs Explicit Transaction Handling – JDBC Driver I’m looking at you

In a relational database like SQL Server, transactions are the mechanism used to ensure that entire operations either complete or are rolled back. The obvious example used to be that if you transfer funds from one place to another, that both the debit and the credit need to occur, or neither occurs.

Fair enough and straightforward enough.

Computers also try to give you the illusion that you are the only one using them. Concurrent transactions are a place where that illusion breaks. While you are working in a transaction, you are potentially affecting other users of the system. So we try to manage how long transactions are held open for. The aim is to always have transactions protect what’s needed but be as short as possible.

If you want applications to scale, you need to clearly manage the lifetime of your transactions.

In SQL Server, if I just execute an UPDATE statement, the statement either completes or doesn’t but no other transaction is involved. This is the default behavior.

But SQL Server also has an option to enable implicit transaction handling. This used to often be called “chained mode”. I can enable it like this:

With this enabled, whenever I execute a statement that modifies data, SQL Server automagically starts a transaction for me. Importantly, the data is not committed until I explicitly later say COMMIT.

I really dislike this option because it breaks my desire to manage transaction lifetimes, and breaks the rules about keeping transactions as short as possible.

I’ve noticed that some applications do this instead of managing transactions. JDBC-based apps seem to be key offenders. What they do is this:

  • enable implicit transaction mode
  • do whatever work they need to do, making changes when required
  • later run statements to commit any uncommitted work
  • and yet again execute yet another commit if there is still any uncommitted work

I’m sorry, but this is not transaction management, and it’s certainly not a formula for an application that will scale.

 

 

DevOps: To branch or not to branch

One of the discussions that comes up from time to time when working with DevOps is branching strategy.

One of the main features of Git that’s often claimed is that it’s so good at branching and merging. And indeed, it’s pretty good at that. But the bigger question is whether lots of branching is desirable in the first place.

One argument says that if you are using branches (let’s say to build features), that you really aren’t doing Continuous Integration (CI). The downside of being features in separate branches is that at some point, you’ll have to merge the code back in, and there’s probably going to be nothing automated about that. One software house that I’ve been mentoring in has a very large number of active live branches.

Each time a merge to a central trunk (ie: master branch) is attempted, it’s beyond painful, often takes months, and introduces enormous numbers of bugs that need to be fixed. If you then combine that scenario with a lack of decent testing, you have a problem. Merging becomes really, really scary.

Image by Priscilla Du Preez
Image by Priscilla Du Preez

There is nothing agile or continuous about that.

The alternative approach is to always work in the master branch. While you can end up colliding with what other people are doing, at least you deal with that right then and there. You can end either with a frustrating set of collisions so that you’re endlessly working with an unstable base, but at least you know about it straight away.

A bigger issue is that it’s easy to accidentally release unreleased features into the wild, or at least parts of them. Feature flags are often used to try to hide this but that can go wrong too.

So in the end, to branch or not to branch? My take on it is that branches are fine if they are letting you try something out in a scratchpad area, and when they are really short-lived. But long term branches aren’t something that I’m keen to see.

Shortcut: Fix Intellisense and Printer Colors in SQL Server Management Studio

SQL Server Management Studio (SSMS) is a highly configurable tool. One of the areas that’s often ignored but which can be quite important is color configuration.

SSMS color codes SQL scripts (and other types of files that it understands) as you type.

This is really useful but I’ve found on some systems that some of the color selections aren’t great. Here’s an example:

On many systems that I work with, the color for sys.tables in the query above is quite a fluoro green and almost unreadable. But if you don’t like this, you can change it.

In Tools, then Options, then Fonts and Colors, select Text Editor, then look in the list of Display items:

Note that when Text Editor is selected, several SQL options appear in the Display items list. They are not there when you select other settings such as Printer.

I could then change the nasty SQL System Table color option to something easier to work with:

I noticed that Comment was a standard Green so I’ve chosen Olive here, and then on my screen, they look much better:

If they aren’t dark enough, I could also Bold them.

It’s worth noting that this can help for people with different visual challenges (or color blindness in general).

While there is a separate set of colors for Printer, up to v17.6 it unfortunately doesn’t include the list of SQL language elements. (That does seem odd as it has other language elements).

SDU Tools: Update Statistics on SQL Server Tables

Having up to date statistics is critical for SQL Server choosing appropriate query plans. Poor statistics can lead to poor query plan outcomes.

Generally, SQL Server manages this well by itself. As a rough rule, most versions auto-update statistics when the number of changes is about twenty percent of a count of the number of rows in the table. (Some recent changes have slightly altered how this works but the ball park is pretty good).

There are situations though where this self-management isn’t enough.

In T-SQL, you can use the UPDATE STATISTICS statement to force a recalculation, and you can specify a sample percentage to determine how many rows SQL Server looks at while calculating the statistics. But the command is a bit painful to use when you want to do many tables, or all the tables in a schema, etc.

One of our free DBA and developer SDU Tools is designed to make this easier. In the image above, you can see it used with a selection of tables included. In this case, it’s looking in the WideWorldImporters database, and processing all tables called Cities or People in all schemas. It is using a sample size of 30 percent.

Here’s another example, where all tables in a database are being processed:

Note that the tool outputs the commands that it is executing into the Messages tab so you can see what it did.

You can see it in action here:

For more information on joining SDU Insiders to get our free tools and other resources, please visit here:

http://sdutools.sqldownunder.com

 

Free eBook: SQL Server Management Studio Tips and Tricks

I’m so pleased to now have this book out. You’ll find it here:

http://ssmsbook.sqldownunder.com

I’ve worked with SQL Server for decades and have compiled this list of tips and tricks over that time. This eBook is a compilation of a series of blog posts that I have either made or are scheduled to be made as part of my Thursday “Shortcut” series, for shortcuts that apply to SSMS.

There are a lot of tips and tricks. It’s now not far from 200 pages.

While I’d love to remember who first showed me each and every one of these, the original source of any particular tip or trick is now too difficult to determine, and often may have come from multiple sources. Though, I need to thank the Data Platform MVP community for endless inspiration, and for no doubt being the ones who showed me many of these items over the years. For that reason, I consider myself the editor more than the author.

We intend to keep enhancing and upgrading this book. If you have feedback for it, please send that to ssmsbook@sqldownunder.com.

I really hope you enjoy it.

Opinion: Should you use bit columns?

In an earlier post, I discussed why bit columns can be useful in SQL Server Indexes.

I used an example of a transaction table that had an IsFinalized column. And if it did, then it’s very likely that I’d want to have it indexed. But what I didn’t discuss was whether that bit column made sense in the first place.

An IsFinalized column in a transaction table seems simple enough. You might think of it as “yes it’s complete”. But what does that actually mean? Chances are that something has occurred to make it finalized. Perhaps it has been posted, or it’s an invoice that has been paid, or it’s a payment that has been applied against another transaction.

The point is that it probably represents a lack of normalization, at least to some degree.

Even if it’s just a column that a user updates to say it’s finalized, chances are that you’d want to know when that happened. So a column that held the finalization date and time might be better, and would be NULL if that hadn’t happened. (Whether or not these NULLable columns are desirable is the topic of a future post).

The other time that I regularly see these types of flags is when they represent a characteristic of an entity, but one that changes its nature. For example, I might have a People table and might have an IsEmployee column.

The concern with that type of design is that it often hides further normalization issues. I remember a post that Scott Hanselman made years ago where he mentioned that if you are designing a class in a high-level language, the presence of boolean attributes for the class usually means that you’ve messed up the class design.

And the same thing applies here. Rather than a People table with an IsEmployee column, there probably should be an Employees table that refers to that People table instead.

I don’t take an overly purist view on this. For example, yes the presence of a related row in an Employees table would indicate the same thing, but if there are no other attributes about the person being an employee, I might not want to create that other table.

 

SQL: Does having more indexes always make data modifications slower?

As I’ve pointed out in other posts, Betteridge’s Law of Headlines says you already know that the answer to this is no.

There is a persistent myth among SQL Server users that having more indexes always slows things down, particularly anything that needs to modify data (INSERT, UPDATE, DELETE, MERGE).

However, there are two aspects of this to consider:

The first is that you need to decide how much the modifications actually matter in the overall performance of the system. In an earlier blog post, I noted that when I trace typical transactional systems, I see reads making up about 99% of page I/O’s. Writes barely even appear. But I’m not saying they are irrelevant, just that you have to focus on what the system is spending most of its time doing.

Having faster reads (and indexes help a lot here), can make your writes much faster anyway, if that’s the main thing the system is doing.

I understand that some times the write performance is critical, but don’t be confused about write times that are slow because they’re being blocked by read operations either.

The second aspect is that to update or delete something, you have to find it first. (Or for INSERT operations, work out where it needs to go). Whenever I see people showing examples of indexes slowing down updates, they show updates where the primary key of the table is provided. So yes, if all you are doing is updating a row by its primary key, other indexes are just baggage and overhead for that operation.

Updates by primary keys though, are not the only ways that tables get updated. Note the query in the main image above. We’re correcting city names. Anything that was spelled “Smytheville” now needs to be “Smithville”.

You can see the missing index hint here:

No surprise that SQL Server is complaining that to perform this update efficiently, that you need another index. The problem is that to update the city names, we need to be able to find the wrong ones first.

Worse, without that index, imagine which rows SQL Server is going to lock while performing that operation.

This is just a simple example, but I just wanted to make the point that more indexes does not always equal slower updates. It can be quite the opposite.

DevOps: Scripting SQL Server objects and data from the command line

The other day I posted a shortcut about how to use SQL Server Management Studio (SSMS) to generate INSERT statements for data.

In one of the comments, Tom Corrigan asked if there was a way to do that from the command line.

The answer is yes. Apart from a variety of 3rd party tools, in May last year, Microsoft released tools to do just that.

The new tool is mssql-scripter and you’ll find an intro to it here:

https://blogs.technet.microsoft.com/dataplatforminsider/2017/05/17/try-new-sql-server-command-line-tools-to-generate-t-sql-scripts-and-monitor-dynamic-management-views/

They describe it as “the multiplatform command line equivalent of the widely used Generate Scripts Wizard experience in SSMS”.

Importantly, note the fact that it’s multiplatform. It’s built in python and works on Linux, macOS, and Windows and can create both DDL and DML scripts that target SQL Server, and that also includes Azure SQL Database and Azure SQL Data Warehouse.

You choose where the output goes. Normally you’ll send it to .sql files but like any other Unix style utility, you can pipe its stdout to other commands.

The source code is up at github here:

https://github.com/Microsoft/mssql-scripter

I love the fact that Microsoft teams are now putting source code like this up on github: continuing evidence of a “new Microsoft”.

That means that you can:

  • Use it as a great learning resource
  • And if you’re keen, submit pull requests to improve it

So to generate INSERT statements, you can use it like this:

mssql-scripter -S somesqlserver -d WideWorldImporters -U someuser -P somepassword –include-objects Customers –data-only

Note that it would all be on a single line.

Shortcut: Presentation Mode in SQL Server Management Studio

I spend a lot of time delivering presentations of various types. Many of those presentations involve showing code in either SQL Server Management Studio (SSMS) or Visual Studio (VS).

I’ve become quite fast at taking a default setup of SSMS and changing it to the fonts, etc. that I want to use for a presentation. Given how large these fonts are, I don’t want to use them for day to day work.

The best solution that I’ve found for this is to create another user (let’s call it DemoUser) on my laptop, and then configuring fonts, etc. for presentations for that user, quite separate to my normal work fonts.

In recent builds of SSMS, the team realized the importance of this and added a Presentation mode. In this image, you can see that I’ve typed font into the Quick Launch bar. (Note that this is separate to the Find box that I’ve also shown. I’ve seen people try to type “font” into that box and wonder why nothing relevant came up).

In the drop-down list that appears, you can see a list of options that contain the word Font. It checks both the option name, and the description. While you could type Present as I’ve seen many people suggest, you’ll note that it wouldn’t bring up the option to RestoreDefaultFonts. The word Font is present in all the options that we want.

Let’s start by choosing PresentEdit from that list.

You’ll note that an XML file appears. It would be helpful if this was a GUI instead of an XML file because you’ll need to know what the names of the other settings are, if you want to change them.

For now, let’s just change the TextEditorFontSize and the EnvironmentFontSize using the perfectly-fine XML editor in SSMS. The first entry changes the size of the text when you’re editing queries. The second one affects the size of text in Object Explorer, menus, etc.

If I click File, then Save As, note where this is saved:

It’s under your AppData folder. Once this is saved though, we can test it.
Type Font in the Quick Launch bar again, then select the Present On option. You’ll notice that things have changed. Here is my menu, etc. Note the increase in size.

If you change output grid or text sizes, you’ll still need to restart SSMS to see the outcome. The only difference is that you won’t get the warning that you normally do.

And once we’re finished presenting, we want to go back to the sizes we had before. And this is where I’m not at all happy with this presentation option. The only choice you have is RestoreDefaultFonts. Once you choose that, you’ll see it wasn’t kidding. You’re right back to defaults, not to the fonts you had previously been using.

So while this had the potential to be a good feature, I can’t give it a pass. It feels ill-conceived. Surely when you click Present On, they could save your current settings, and then have a Present Off option instead of, or in addition to, the RestoreDefaultFonts option.

Once again, I think there aren’t enough grumpy old guys in the team.

Image by Alexandre Debiève

SDU Tools: Table of Numbers in SQL Server T-SQL

For anyone who’s worked with SQL Server for a long time, you’ll know that one of the common requests is that a table of numbers would be useful. Yes, just a table with 1, 2, 3, 4 and so on. Often in a query, you need just a certain set of values.

In our free SDU Tools, we’ve added a function to do that in a reasonably efficient way.

In the main screenshot, you can see it in use. We made it a little more flexible in that you can specify the starting number, and then specify the number of numbers.

Because we’ve done this with a recursive function, if you need to output more than 100 values, it’s important to add a MAXRECURSION option into the query like this:

We hope you find it useful. You can see it in action here:

For more information on joining our insiders team to get all our free tools and resources, follow the link here:

http://sdutools.sqldownunder.com