SQL Server 2008: Interesting Full-Text Dynamic Management Function

I’ve been playing around with full text search in SQL Server 2008 and was intrigued by the DMF sys.dm_fts_parser(). It allows you to see the result of the word-breaking occurring within full text search. If you execute the query:

select * from sys.dm_fts_parser(‘”Hello Greg. How are you? I haven”t seen you for a while.”‘,1033,0,0);

It returns the following data (along with some other columns):

occurrence  special_term     display_term
——————————————
1           Exact Match      hello
2           Exact Match      greg
10          End Of Sentence  END OF FILE
11          Noise Word       how
12          Noise Word       are
13          Noise Word       you
21          End Of Sentence  END OF FILE
22          Noise Word       i
23          Exact Match      haven’t
24          Exact Match      seen
25          Noise Word       you
26          Noise Word       for
27          Noise Word       a
28          Noise Word       while
36          End Of Sentence  END OF FILE

(15 row(s) affected)

That’s really sweet as it lets you parse text into words. The 1033 was the locale I chose and the other two parameters were a stop word list (formerly called a noise word list) and whether or not it should be accent sensitive.

Arthur C Clarke made his final orbit of the Sun today

I was really sad to hear that Arthur C Clarke died today. For those that enjoyed his work, it’s worth watching his “goodbye” video he made on the occasion of his birthday in December 2007. He knew at that time that the end was near. I particularly liked his choice of a Rudyard Kipling poem to read at the end. You’ll find the video here: http://www.youtube.com/watch?v=eLXQ7rNgWwg

SQL Server 2008 – DMF: “On change – prevent” is the wrong name

One of the things I really don’t like in the Declarative Management Framework is the term “On change – prevent” for the execution mode. Given we don’t have INSTEAD OF DDL triggers and only AFTER DDL triggers, we can’t *prevent* these things at all. All we can do is roll them back after they happen. I think the name should be changed to “On change – rollback” to reflect what’s actually happening.
 
Also, if we ever get INSTEAD OF DDL triggers, we might then be able to prevent the actions. What then would the option be called?
 
 
And for some pressure on getting INSTEAD OF triggers for DDL, my previous item on that is here: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=299328
 
You know the drill: vote once, vote often 🙂

SQL Reporting Services: be very careful with using “c” formatting for currency

While on site this week, another common problem that I see everywhere arose again.

When you need to format currency, you use the “c” format right? It’s in nearly every set of course materials I’ve ever seen.

In many cases, that’s wrong!

Take a long hard think about whether you really want the monetary values that you display automatically converting to the client system’s currency. If you are reporting financial figures, for example, is it really appropriate that they should mysteriously suddenly change from dollars to yen because the client is based in Zama City rather than in New York City? Not likely. A million dollars in sales might be good but a million yen in sales not quite so good 🙂

SQL Down Under show 35 – Roger Doherty – SQL Server 2008 for Database Developers

Hi Folks,

 

SQL Down Under show 35 with guest SQL Server DPE Evangelist Roger Doherty is now available for download from www.sqldownunder.com.

 

In the show, Roger discusses what developers are looking forward to in SQL Server 2008.

 

Enjoy!

 

Regards,

 

Greg

SQL Server Reporting Services: Avoid T-SQL in Reports

I spent time working with Reporting Services again today on a client site. The complexity of the reports we were working on reminded me that I really, really don’t like seeing T-SQL scripts (or really any business logic) embedded in reports.

DBAs tend to be considered a conservative bunch. One thing they’re usually conservative about is refactoring their databases. In many cases, this is because they have little idea what they will break when they make database changes. I’ve seen countless organisations that have allowed direct access to tables from client Access-based applications and have then lost control of the ability to ever change anything in the database because of the hundreds of little applications and reports that they might then break. I spend a lot of time talking to them about how to regain control of the situation.

Reporting Services is another area where this can happen. Fortunately, the reports are typically located in a single location. When a database change is needed, at least you don’t then have to search the enterprise for reports that might break. However, life is much simpler if all the report does is call a stored procedure instead of having T-SQL embedded in it. I see the advantages of this approach as:

1. Refactoring the database is easier.

2. Unit testing of the code is much easier. You can easily build tests to just call the stored procedures. While possible via the web service interface, it’s much harder to test the reports directly and requires a different skill set.

3. It allows the UI to be built by one person and the stored procedures to be built by another.

4. It’s easy to deal with report permissions. Typically I create a Reports schema in the database and grant EXECUTE permission on that schema to the reporting users groups. That way, I don’t have to manage individual permissions on the stored procedures and I don’t have to grant permissions on the tables (or perhaps views).

5. I may well get benefits on procedure cache efficiency.

6. I can use the same stored procedure on multiple reports. I quite often find I have to write each report twice: once to look nice, the other to just dump the data into Excel format. Typically, when clients ask for Excel output, they just want the data, not the pretty formatting. I can use the same procedure for both versions. (Fortunately in SQL Server 2008, it looks like I’ll have more flexible Excel rendering options).

SQL Server Data Services in Cloud for the WideOpen Web

SQL Server Data Services was announced today at MIX. This new offering from Microsoft provides Internet-based data services using RESTful interfaces like those provided by ADO.NET Data Services.

This is going to enable some outstanding new applications.

Apply for the beta here now: http://www.microsoft.com/sql/dataservices/default.mspx

SQL Cache Dependencies

A little while back I wrote an article on SQL Cache Dependencies. I had a colleague ask about these today so I think it’s time to blog it. (For more information, I also did a podcast some time back with my old mates Wally McClure and Paul Glavich from the ASP.NET Podcast. It’s at: http://weblogs.asp.net/pglavich/archive/2005/08/21/423160.aspx)

SQL Dependencies in ASP.NET v2

 

Traditional ASP Applications

In traditional ASP applications, every time the user would request a page, the application would retrieve the data from the database to render the page. Clearly this doesn’t scale very well. It works for a small number of users but not a large number. Often though, this data access can be completely avoided for certain types of data.

Consider the Production.ProductCategory table in the AdventureWorks database. Realistically, how often would it change? If we are building an application that displays the list of product categories, it doesn’t make sense to retrieve it from the database on every request. We could improve on this situation if the same user was retrieving this list regularly by holding it in the user’s Session state but that also wasn’t that helpful if there were a large number of users as many copies of the same data would be held.

ASP.NETCache Object

A key advance in ASP.NET was the introduction of the Cache object. It provided a server-wide shared cache that could be used to hold relatively static data. We could read the list of product categories into a dataset and hold it in the Cache.

The issue with any cached data though is when to refresh the data in the cache. If we choose to refresh it once per day, we then could be showing an incorrect list for up to an entire day. If we base the refresh policy on elapsed time, we have a trade-off. The shorter the refresh time, the fresher the data but the less useful the cache is. The longer the refresh interval, the more efficient the cache becomes but the staler the data potentially is.

The ASP.NET Cache object in v1 gave us two options for cache invalidation. One was time based, the other was based on a file dependency. This was originally designed to allow for situations where the data was coming from an XML file stored in the filesystem. A filesystem watcher was set up to monitor the file. If the file had changed, the Cache would be invalidated and refreshed.

But the invalidation option that we really wanted was “when the data changes”.

Cache Invalidation Prior To ASP.NET v2

Prior to ASP.NET v2, there were a number of attempts to provide cache invalidation when the data changed. The most popular version was:

1.       Create an extended stored procedure in SQL Server that simply touched a nominated file in the filesystem.

2.       Add a trigger (for INSERT, UPDATE and DELETE) on the table to be monitored that called fired the trigger.

3.       Set the cache to have a dependency on that file.

This option was miles ahead of where we started but it had a few downsides. First, writing extended stored procedures for SQL Server was not trivial. It couldn’t be done in managed code but a C++ template was available to use as a starting point. Second, the trigger would fire for any changes to the table, even if our query was only for certain rows in the table. Third, the mechanism was table-based and didn’t deal with queries that involved joins across multiple tables. Last, the mechanism was tied to a specific database engine and often to a specific version of a database engine.

ADO.NETv2 SqlDependency And SQL Server 2005

SQL Server 2005 added a wonderful new feature called Query Notifications. (NOTE: nothing to do with Notification Services – they often get confused). The idea here is that you could say to SQL Server “Here is a query I want you to execute. However, when you finish, I don’t want you to forget the query, I want you to remember it and, if the underlying data that it was based on changes in such a way that *if* I was to re-execute the query I’d get a different response, I want you to tell me about it”.

This is exactly what we need to support our ideal Cache invalidation scenario. We don’t want to re-execute the query, unless the data has changed. You might be thinking “that sounds like a lot of hard work for the database engine”. Well it is a bit of work but it’s in fact utilising the same mechanism that was already in place for detecting data changes underneath Indexed Views in SQL Server 2000. What it does mean is that we have some limitations on the types of query that this will work with:

·         Use two part names for tables (SQL Server needs to know which schema so it knows exactly which table, rather than having to resolve that at query time)

·         Reasonable complexity (not all queries can be implemented)

·         Specific column lists (ie: No SELECT * FROM somewhere)

Normally we SQL Server, we send a query and get a response. We don’t expect unannounced data to arrive sometime later. So the next issue is how SQL Server can send us the notification. To do this it uses another new feature called Service Broker. Service Broker is a transacted queue that lives within the database server. Communication occurs between “services” attached to “endpoints”. We then need to set up an endpoint ready to receive the notification.

The ADO.NET team did a lot of this for us. They created a new object called the SqlDependency. It is a simple programmable object that wraps most of this functionality for us. Basically, it’s used by:

·         Create a SqlCommand you wish to execute

·         Wrap a new SqlDependency around the command (you pass the command to the constructor of the dependency)

·         An event on the SqlDependency object fires when the data changes.

It’s that easy. The one remaining requirement is to set up the queue to listen for the notifications. You do this by calling the Start() method on the SqlDependency object once before you get going.

This new object can be used in any ADO.NET v2 application, including in WinForms applications. However, its primary use-case is in ASP.NET Cache invalidation. You can use it in code like this:

using (SqlConnection conn = new SqlConnection(“some connection string”))

{

      SqlCommand cmd = new

         SqlCommand(“SELECT EmployeeID, EmployeeName FROM Employees”,conn);

      SqlCacheDependency dependency = new SqlCacheDependency(cmd);

      SqlDataAdapter employeesDA = new SqlDataAdapter(cmd);

      DataSet employessDS = new DataSet(“EmployeesDS”);

      employeessDA.Fill(employeesDS,”Employees”);

      Cache.Insert(“emp”, employeesDS, dependency);

}

ASP.NET v2 SqlCacheDependency

Even though the ADO.NET v2 team had done most of the work, the ASP.NET team decided to make it even easier to use with the Cache object. In v2, the Cache object can now have a SqlCacheDependency in addition to the previous options of file and time dependencies. This can be applied to the Cache object and also to other cached objects like user controls. You can specify a directive like:

<%@ OutputCache Duration=”2400” VaryByParam=”none”

               SqlDependency=”northwind:Employees” %>

You can also directly use the SqlDependencies in code:

Cache.Insert(“emp”,obj,

        new SqlCacheDependency(“northwind”,”Employees”));

The code shown is a general form of the SqlCacheDependency settings. In ASP.NET v2, when using SQL Server 2005’s special mechanisms, we use the specific string “CommandNotification” rather than database and table names.

What If I Don’t Have SQL Server 2005?

Rather than tying their implementation directly to SQL Server 2005, the ASP.NET team also decided to create a pluggable architecture so other forms of SqlCacheDependency could be provided. For example, they have provided an implementation for SQL Server 2000. As SQL Server 2000 doesn’t offer query notifications or service broker, the implementation isn’t as good as the one for 2005. It works like this:

·         A table is created to hold details of change notifications. (A command has been provided to help with this: aspnet_regsql.exe –d databasename –ed –E)

·         Triggers are set up on the required tables. (Again a command has been provided to assist: aspnet_regsql.exe –d databasename –et –t tablename –E)

·         ASP.NET polls the change notification table to look for changes.

Obviously polling is not as efficient as being notified but it’s still a whole lot more efficient that having the application constantly looking up the data.

This mechanism can then be extended to other database engines.