SQL Server 2008 R2: StreamInsight Development Models

One thing that seems confusing to people when they first look at StreamInsight is that there are several development models:

1. Implicit Server

2. Explicit Server

3. IObservable/IObserver 

Implicit server is fairly straightforward. You define your input stream and create a query to consume your output stream. StreamInsight does all of the heavy lifting associated with creating the server, binding your adapters, etc.

Explicit server is the most flexible but the most coding work. You specifically spin up (create and instantiate) a server object, from it spin up an application, register your input/output adapters and create them via a factory and spin up a query binding object. You use the query binding object to bind together your standing query (via a query template) and your input and output adapters. You use this query binding object to then create a query object (an instance of a query) which you then start/stop as required. With the explicit server model, you can get reuse of query templates, adapters, etc. and you can reuse an existing event stream for multiple queries. This model is also how you'd work with a server that's not hosted in-process if that's required.

IObservable/IObserver makes the creation of an input stream quite easy. It is based on the RX extensions (Reactive Extensions) to the .NET Framework. You take an enumerable class (ie: one that implements IEnumerable) and you create an Observable class from it. You then hook up subscribers to the observable class.

Picking the right model is something you'll need to do fairly early on in your StreamInsight project.

Stored Procedures – Time for a real contract?

Increasingly, developers are using tools that try to automate code generation when dealing with databases. Stored procedures have been a thorn in the side of this. Mostly that's because it's difficult to obtain the metadata that is really needed.


Most automated tools try to work out what rowsets can come back from the sprocs. The Entity Framework, LINQ to SQL and many others use SET FMTONLY ON to try to determine what might come back from the sproc. This is a flawed mechanism as it returns one set of metadata for every potential code path through the sproc. It really only works for the most trivial sprocs ie: single statements that perform basic CRUD operations.

The first thing that is needed is a way to express the sets of rows that might be returned. This should be part of the definition of the sproc. It should include the ability to express multiple sets of rows. Some tools only work with the first rowset but that isn't sufficient because it has become quite common for people to build sprocs that return multiple rowsets to avoid network round trips. I should be able to say something like:

CREATE OR ALTER PROCEDURE SomeSchema.SomeSproc (someparameters)
WITH ROWS OrderHeaders(SomeColumn INT, SomeOtherColumn NVARCHAR(50) NULL),
OrderDetails(AnotherColumn INT, YetAnotherColumn INT,
EvenYetAnotherColumn GEOGRAPHY),
EXECUTE AS (execution options here if needed)

Note that I think the rowsets should be able to be named and I also threw in the need for CREATE OR ALTER. Please, please, please can we have this !!!


For a client to work effectively with a server, it should have knowledge of the potential exceptions that might be thrown by that code. This should also be available in the metadata of the sproc. I think it could be included like this:

CREATE OR ALTER PROCEDURE SomeSchema.SomeSproc (someparameters)
WITH ROWS OrderHeaders(SomeColumn INT, SomeOtherColumn NVARCHAR(50) NULL),
OrderDetails(AnotherColumn INT, YetAnotherColumn INT,
EvenYetAnotherColumn GEOGRAPHY),
EXCEPTIONS NoSuchCustomer(50020,'No such Customer'),
DuplicateOrder(50022,'That order already exists'),
EXECUTE AS (execution options here if needed)

Clearly, other exceptions could occur in an unplanned way but client code should be able to be configured automatically to deal with potentially expected errors. For example, a code generation tool could automatically build a skeleton error-handling routine for errors that it already knows could exist.


Clearly, this is all then heading towards having a contract for a sproc. When you have multiple people (or worse multiple teams of people) building parts of an application, it is really important to have a contract at the interface point. Perhaps the contract itself should have a name ie: something like:

CREATE OR ALTER PROCEDURE SomeSchema.SomeSproc (someparameters)
WITH CONTRACT SalesOrderHeaderAndDetails
(ROWS OrderHeaders(SomeColumn INT, SomeOtherColumn NVARCHAR(50) NULL),
OrderDetails(AnotherColumn INT, YetAnotherColumn INT,
EvenYetAnotherColumn GEOGRAPHY),
EXCEPTIONS NoSuchCustomer(50020,'No such Customer'),
DuplicateOrder(50022,'That order already exists')),
EXECUTE AS (execution options here if needed)

Enforcement in TSQL

I'd love to have a situation where some of this is enforced as much as possible within T-SQL. For example, given the declarations above, I'd love to see a situation where a SELECT statement that doesn't match one of the rowsets in the metadata or a RAISERROR with an error number not in the list of declared exceptions was enough to make the creation or alteration of the sproc fail. Perhaps any sproc with a declared CONTRACT could have these sorts of restrictions or we could have a word like ENFORCED:

CREATE OR ALTER PROCEDURE SomeSchema.SomeSproc (someparameters)
(ROWS OrderHeaders(SomeColumn INT, SomeOtherColumn NVARCHAR(50) NULL),
OrderDetails(AnotherColumn INT, YetAnotherColumn INT,
EvenYetAnotherColumn GEOGRAPHY),
EXCEPTIONS NoSuchCustomer(50020,'No such Customer'),
DuplicateOrder(50022,'That order already exists')),
EXECUTE AS (execution options here if needed)

In the feedback, I was asked for a Connect item. Here it is: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=525653

SQL Server 2008 R2 – StreamInsight – Event Payloads

One of the key decisions you'll make when working with StreamInsight is the payload that will be carried by each event. Events contain and EventKind (which is related to whether they're inserting a new event or modifying an existing one), some temporal information (depending upon the EventShape -> Point, Interval or Edge) and a payload that is user-defined.

A payload is defined via a .NET class or struct. In general, a class will be a better option as it ensures field ordering which is likely to be important for generic (vs typed) adapters. StreamInsight ignores anything except public fields and properties and there are limitations on the data types. For example, basic .NET types are permitted but not any of the nested types.

A payload will typically look like:

public class TollPayload


    public int TollBoothID { get; set; }

    public int LaneID { get; set; }

    public int VehicleType { get; set; }

    public String TagID { get; set; }


In addition to the fields and properties, I'd suggest a few other items are helpful.

First, make sure you override the ToString() method. It's helpful to be able to just show the contents of an event as a string without needing to couple the consumer's code to the payload fields. This is particularly useful with logging.

Second, I find it useful to provide public static methods that convert types (or enums) to much more meaningful names. In particular, this can be useful while projecting the event fields in the LINQ standing queries.

SQL Server 2008 R2: Reporting Services – Pagination

Another item I had on the Connect site was about pagination. I wanted to be able to restart page numbering within a report. In my case, this was because I wanted to output a set of invoices in a single report but I wanted each invoice to have its own set of page numbers.

This was another item on Connect that came back saying "Done!"

And again, the team went further.

You can disable pagination if it doesn't make sense for your rendering target. You can restart page numbers at the group level based on an expression. You can also set the page name that will be used as a tab name (worksheet name) in Excel rendering.

These sorts of items might seem small additions to the product but they make life so much simpler for those of us designing reports.

Again, thank you Reporting Services team!

SQL Server 2008 R2: Reporting Services – RenderFormat

I'm really happy with the Reporting Services team. I've had a few items on Connect where I've asked for features and they've come back and said "Done!".

One of these is RenderFormat. I wanted the ability to change my formatting based upon where I was rendering to. For me, this came from a need to have quite different output when rendered to Excel than when rendered to HTML or PDF.

Now, there is a global variable called RenderFormat that allows me to use RenderFormatName in expressions. Woohoo.

But the good part is they went further. The team realized that some decisions are simply based on whether or not the target is an interactive one or not ie: perhaps I want a very different page length for HTML than for PDF. They added RenderFormat.IsInteractive to allow us to test for this.

Thank you Reporting Services team.

SQL Server 2008 R2: StreamInsight Event Models – EventShapes

Continuing on the theme of describing StreamInsight, the next major concept is the Event Model. Events in StreamInsight are made up of two sets of data. One set is the basic information required by the StreamInsight engine such as when an event occurred. The other set is the user data contained within the events, called the "payload".

You can define what is contained in the payload. It is effectively a .NET class or struct that exposes a set of public members. We'll talk more about the payload in another post.

I mentioned before that StreamInsight has details of when an event occurs. It's actually more flexible than just storing a date and time. The temporal information that is stored is determined by the Eventshape. Events have three potential shapes.

EventShape.Interval provides a start and stop time for an event and is used for events that have a duration. The times are stored as UTC.

EventShape.Point deals with events that occur at a single point in time. The start time is all that matters. An end time is available but it is defined as the start time plus one cronon (or the smallest granularity of time storage). We mostly tend to ignore the end time.

EventShape.Edge deals with events that occur over an interval but at the time an event is recorded, we only have the start time. Later the event is updated when we find out the end time.

SQL Server 2008 R2: What is StreamInsight used for

Since I posted some StreamInsight info the other day, I've had a bunch of people asking me what StreamInsight is used for.

StreamInsight is Microsoft's implementation of Complex Event Processing. This is not a new market but it is new territory for Microsoft.

Complex Event Processing (CEP) is all about querying data while it's still in flight. Traditionally, we obtain data from a source, put it into a database and then query the database. When using CEP, we query the data *before* it hits a database and derive information that helps us make rapid business decisions, potentially also including automated business decisions.

I liked the way that one of our new colleagues (Sharon Bjeletich) put it to me: "It's about throwing the data at the query, rather than throwing the query at the data". 

There are lots of places that this makes sense but they all involve relatively high data rates. Good examples of these are automated trading in capital markets, fraud detection in networks or in casino operations, battefield control systems for military use, outbreak management for public health, etc.

While StreamInsight may combine the data with reference data stored in SQL Server, the primary development skills needed for working with it are .NET development skills.

The Region: Sofware Industry Predictions for 2010: iPhone General-Purpose Applications

Our Microsoft RD lead Kevin Schuler has asked us to post predictions for 2010 that will appear in a special edition of TheRegion. (Check out www.theregion.com for any interesting blog if you haven't already). Here's mine:

Against all perceived wisdom, I suspect that the interest in developing general applications for the iPhone store will peak this year, unless Apple comes out with a more innovative platform. At present, Apple have completely won the mindshare in relation to phone applications, not just the hardware game. All major websites I deal with are starting to create iPhone friendly versions. Early on, we heard amazing stories of how developers had made a fortune through the appstore. I see a few problems becoming more apparent this year:

1. The price of applications. Even super-sophisticated applications are considered over-priced now at $8. While there's some truth that it's "just a numbers game", it's getting much harder to justify the effort required to build the next generation of apps as the price drops lower and lower.

2. Political control of the appstore. Having a developer story that says that you can spend six months building an app, make it beautiful and functional and then at a whim Apple could decide to not let you sell it, and you have no other way to sell it, isn't a good story. That's particularly the case when the reasons might seem unreasonable to you eg: not competing with built-in functionality or not providing a service that their "partners" already provide.

3. Most serious applications being built now seem to be front-ends for standard business sites. There's nothing wrong with that but it's the interest in building general purpose applications that I'm suggesting will peak.

4. You can't find things in the appstore any more. The beauty of the appstore has become it's ugly side too. How do you efficiently find apps that are worthwhile amongst the load of rubbish that's in there. And the volume is increasing daily.

What do you think will happen in the software industry this year?

SQL Server 2008 R2: StreamInsight AdvanceTimePolicy.Adjust

While building content for the upcoming Metro training for SQL Server 2008 R2, Bill Chesnut and I were puzzled about the Adjust option for AdvanceTimePolicy in the AdvanceTimeSettings for a stream. It was described in Books Online as causing the timestamp for the event to be adjusted forward to the time of the latest CTI (current time increment). No matter what we tried though, we couldn't seem to get it to do anything.

After discussing it with Roman from the product group, we worked out what our issue was.

We were using EventShape.Point. That means that the starttime and endtime are 1 cronon apart (smallest unit of time for the system). Our events had times that were prior to the latest CTI timestamp but we weren't seeing them be adjusted.

Turns out that the adjustment only occurs when your event interval (ie: from starttime to endtime) overlaps the CTI. Then, the starttime of the event is adjusted to match the CTI. This means the event has been adjusted to start at the CTI timestamp and still end when it was recorded as ending before adjustment.

Because we were using EventShape.Point, no adjustment was occuring as our event didn't overlap the CTI. Had we been using EventShape.Interval, and had a starttime before the CTI and and endtime at or after the CTI, we would have seen the adjustment working.