SQL Server 2008 R2: What is StreamInsight used for?

Since I posted some StreamInsight info the other day, I've had a bunch of people asking me what StreamInsight is used for.

StreamInsight is Microsoft's implementation of Complex Event Processing. This is not a new market but it is new territory for Microsoft.

Complex Event Processing (CEP) is all about querying data while it's still in flight. Traditionally, we obtain data from a source, put it into a database and then query the database. When using CEP, we query the data *before* it hits a database and derive information that helps us make rapid business decisions, potentially also including automated business decisions.
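To make the "query the data in flight" idea concrete, here is a minimal sketch in plain Python. It is not StreamInsight code: the function name, the two-second window and the sample prices are all made up for illustration. The standing query sees each event as it arrives and keeps only the events still inside its time window; nothing is written to a database first.

```python
from collections import deque


def rolling_average_alerts(events, window_seconds, threshold):
    """Yield (timestamp, average) whenever the rolling average exceeds threshold.

    `events` is any iterable of (timestamp_seconds, value) pairs in time order.
    """
    window = deque()  # only the events still inside the time window
    total = 0.0
    for timestamp, value in events:
        window.append((timestamp, value))
        total += value
        # Expire events that have fallen out of the window.
        while window[0][0] <= timestamp - window_seconds:
            _, old_value = window.popleft()
            total -= old_value
        average = total / len(window)
        if average > threshold:
            yield timestamp, average


# Hypothetical prices arriving one per second (illustrative data only).
ticks = [(0, 10.0), (1, 10.0), (2, 16.0), (3, 18.0), (4, 9.0)]
alerts = list(rolling_average_alerts(ticks, window_seconds=2, threshold=12.0))
```

Because the function is a generator, each alert is produced as soon as the event that triggers it arrives, which is the point of the approach.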

I liked the way that one of our new colleagues (Sharon Bjeletich) put it to me: "It's about throwing the data at the query, rather than throwing the query at the data". 

There are lots of places where this makes sense but they all involve relatively high data rates. Good examples are automated trading in capital markets, fraud detection in networks or in casino operations, battlefield control systems for military use, outbreak management for public health, etc.

While StreamInsight may combine the data with reference data stored in SQL Server, the primary development skills needed for working with it are .NET development skills.

The Region: Software Industry Predictions for 2010: iPhone General-Purpose Applications

Our Microsoft RD lead Kevin Schuler has asked us to post predictions for 2010 that will appear in a special edition of TheRegion. (Check out www.theregion.com for an interesting blog if you haven't already.) Here's mine:

Against all received wisdom, I suspect that the interest in developing general applications for the iPhone store will peak this year, unless Apple comes out with a more innovative platform. At present, Apple have completely won the mindshare in relation to phone applications, not just the hardware game. All major websites I deal with are starting to create iPhone-friendly versions. Early on, we heard amazing stories of how developers had made a fortune through the appstore. I see a few problems becoming more apparent this year:

1. The price of applications. Even super-sophisticated applications are considered over-priced now at $8. While there's some truth that it's "just a numbers game", it's getting much harder to justify the effort required to build the next generation of apps as the price drops lower and lower.

2. Political control of the appstore. A developer story that says you can spend six months building an app, make it beautiful and functional, and then have Apple decide on a whim not to let you sell it, with no other way for you to sell it, isn't a good story. That's particularly the case when the reasons might seem unreasonable to you, e.g. not competing with built-in functionality or not providing a service that their "partners" already provide.

3. Most serious applications being built now seem to be front-ends for standard business sites. There's nothing wrong with that but it's the interest in building general purpose applications that I'm suggesting will peak.

4. You can't find things in the appstore any more. The beauty of the appstore has become its ugly side too. How do you efficiently find apps that are worthwhile amongst the load of rubbish that's in there? And the volume is increasing daily.

What do you think will happen in the software industry this year?

SQL Server 2008 R2: StreamInsight AdvanceTimePolicy.Adjust

While building content for the upcoming Metro training for SQL Server 2008 R2, Bill Chesnut and I were puzzled about the Adjust option for AdvanceTimePolicy in the AdvanceTimeSettings for a stream. It was described in Books Online as causing the timestamp for the event to be adjusted forward to the time of the latest CTI (current time increment). No matter what we tried though, we couldn't seem to get it to do anything.

After discussing it with Roman from the product group, we worked out what our issue was.

We were using EventShape.Point. That means that the starttime and endtime are 1 chronon apart (the smallest unit of time for the system). Our events had times that were prior to the latest CTI timestamp but we weren't seeing them being adjusted.

It turns out that the adjustment only occurs when your event interval (i.e. from starttime to endtime) overlaps the CTI. In that case, the starttime of the event is moved forward to match the CTI, while the endtime stays as it was recorded before adjustment.

Because we were using EventShape.Point, no adjustment was occurring as our event didn't overlap the CTI. Had we been using EventShape.Interval, with a starttime before the CTI and an endtime at or after the CTI, we would have seen the adjustment working.
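The rule can be modelled in a few lines of plain Python. This is only a sketch of the behaviour described above, not the StreamInsight API: the Event class and the function names are hypothetical, and times are integer ticks. It makes two assumptions that go beyond the post. The endtime is treated as exclusive, so an event counts as overlapping only when it ends strictly after the CTI. Late events that don't overlap are dropped, which is my understanding of the product's behaviour rather than something stated here.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Event:
    start: int  # inclusive start time, in ticks
    end: int    # exclusive end time, in ticks (an assumption of this sketch)


def point_event(time: int) -> Event:
    """A point event lasts exactly one tick, the smallest unit of time."""
    return Event(start=time, end=time + 1)


def apply_adjust_policy(event: Event, latest_cti: int) -> Optional[Event]:
    """Model what Adjust does to an event that arrives after `latest_cti`."""
    if event.start >= latest_cti:
        return event  # not late, so it passes through unchanged
    if event.end > latest_cti:
        # Late but overlapping the CTI: move the start forward, keep the end.
        return replace(event, start=latest_cti)
    return None  # entirely before the CTI: assumed to be dropped


late_point = apply_adjust_policy(point_event(5), latest_cti=10)
late_interval = apply_adjust_policy(Event(start=5, end=15), latest_cti=10)
```

Here late_point comes back as None, because a point event that starts before the CTI also ends at or before it and so can never overlap. late_interval comes back with its start moved to 10 and its end still at 15.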

SQL Server 2008 R2 – Departmental applications?

One of the new items coming with SQL Server 2008 R2 and Visual Studio 2010 is the Data-Tier Application. It is designed for (what are described as) departmental applications.

What a "departmental" application is deserves some thought. Mostly it relates to the size of the application. What percentage of your databases (count of databases, not their volume) would be under, say, 2GB? What about 10GB? The argument is that for most sites, it's a surprisingly high percentage. Even most sites I see at the Enterprise level have one or two very large databases and the rest are fairly small. Does that apply to your sites?

StreamInsight and Reactive Extensions to .NET

I've been doing a lot of work lately with StreamInsight, coming in SQL Server 2008 R2.

There are three development models you can use with StreamInsight: Implicit Server, Explicit Server and IObservable/IObserver.

When I was working through material on the IObservable/IObserver pattern, it wasn't immediately apparent to me where it had come from. It's based on the Rx Framework for .NET (Reactive Extensions). I finally got to watch the PDC Online session from Erik Meijer on the Rx Framework a few days ago and so many things suddenly fell into place for me.
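For anyone who hasn't met the pattern, here is a rough sketch of its push-based shape in plain Python. It is not Rx or StreamInsight code. The method names simply mirror OnNext, OnError, OnCompleted and Subscribe from the .NET interfaces, and ListObserver and FilteredSource are made-up names for illustration.

```python
class ListObserver:
    """Collects whatever is pushed to it; shaped like .NET's IObserver<T>."""

    def __init__(self):
        self.received = []
        self.completed = False

    def on_next(self, value):
        self.received.append(value)

    def on_error(self, error):
        raise error

    def on_completed(self):
        self.completed = True


class FilteredSource:
    """Pushes each item that satisfies `predicate` to a subscriber.

    Shaped like .NET's IObservable<T>: the source calls the observer,
    rather than the consumer pulling items from the source.
    """

    def __init__(self, items, predicate):
        self._items = items
        self._predicate = predicate

    def subscribe(self, observer):
        for item in self._items:
            if self._predicate(item):
                observer.on_next(item)
        observer.on_completed()


sink = ListObserver()
FilteredSource([3, 12, 7, 20], lambda reading: reading > 10).subscribe(sink)
```

After the subscribe call, sink.received holds only the readings above 10, and sink.completed is set once the source has run out of items.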

If you have an interest in working with StreamInsight, I'd recommend watching Erik Meijer's session on the Reactive Extensions here: http://www.microsoftpdc.com.

 

Is The Paid-Article Website Dead?

I was doing some varied reading this morning and stumbled across this article by Paul Graham. I want to highlight this passage:

"We now have several examples to prove that amateurs can surpass professionals, when they have the right kind of system to channel their efforts. Wikipedia may be the most famous. Experts have given Wikipedia middling reviews, but they miss the critical point: it's good enough. And it's free, which means people actually read it. On the web, articles you have to pay for might as well not exist. Even if you were willing to pay to read them yourself, you can't link to them. They're not part of the conversation."

It pretty much sums up what I've been thinking for some time about sites with paid-for articles. Do they have any future at all? I was interested to see Rupert Murdoch placing his hopes on a paid-for future. He's arguing that free news sites are dead. Can't say I agree with that. I'm sure they'll be different to what we've been used to in the past.

When I'm searching for technical topics, I have to say that every time I see a link to a site I know is paid, I don't think "I must join that site some time", I simply skip over their content automatically. A good indication on Google is page caching. Google will happily turn off page caching for paid-for sites. I wish they had an option to simply leave them out of my result set. When I'm searching for results, any page I see that doesn't have a cached page available is probably no longer of interest to me.

I think Paul's last sentence is the most telling: "They're not part of the conversation". You can't build a buzz or discussion around something that people have to pay to see.

What this does raise is the question of how technical content will be generated in future. Is our future one that's full of "good enough" technical articles too? Or is advertising the only way forward, much as we might wish it wasn't?

OT: Green Science and Bogus Mathematics

With the climate summit in Copenhagen now finished, I wanted to make a few comments about a trend that really annoys me. I'm fairly "green oriented" in my outlook but amongst "green" scientists and advocates, there is an endless desire to make each cause sound much stronger than the facts permit. I think this does their support more harm than good. The recent expose on modified emails bore that out only too well but I want to show a few simpler examples.

Solar Heating

In Australia, we're encouraged to reduce our power consumption. This seems a great goal. One way of doing this is to install solar power heating. For a country like Australia that's not short on sunshine, you'd think that's a no-brainer. What annoys me though is how the message is pushed. We're endlessly told it will "save us money". This is based on logic like:

1. You buy a solar hot water system

2. You use $30 less electricity every quarter

3. Therefore the unit pays for itself

This logic can only appeal to people that don't get that money costs money. If I pay $3000 to have a unit installed, I've lost an opportunity of more than $30 every quarter. Worse, if I borrow money to purchase it (on a credit card), I'll be paying at least $150 per quarter in interest on that $3000.
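The arithmetic behind those figures can be checked with a short Python sketch. The $3000 price and the $30 saving come from the text above. The 20% annual credit card rate is an assumption, chosen because it reproduces the $150 figure, and interest is treated as simple rather than compounding.

```python
UNIT_COST = 3000.0        # installed price from the post, in dollars
QUARTERLY_SAVING = 30.0   # electricity saved per quarter, from the post


def quarterly_cost_of_money(principal, annual_rate):
    """Simple (non-compounding) interest or forgone return for one quarter."""
    return principal * annual_rate / 4


# Annual rate at which the cost of the money exactly cancels the saving.
break_even_rate = QUARTERLY_SAVING * 4 / UNIT_COST

# Assumed 20% p.a. credit card rate (not stated in the post).
credit_card_interest = quarterly_cost_of_money(UNIT_COST, 0.20)
net_per_quarter = QUARTERLY_SAVING - credit_card_interest
```

The break-even rate works out to 4% a year, so any return or borrowing cost above that wipes out the $30 saving. At the assumed credit card rate, the unit costs a net $120 per quarter.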

I'm keen to see solar units installed across the country. But please don't apply bogus mathematics to justify it. Much better to just tell us: "it'll cost you money but you'll be doing your part to help".

Water Saving Devices

Another annoying one is water-saving shower heads. The logic works something like:

1. Four people in the house take a 10 minute shower each day.

2. A 5 star shower head reduces water flow to 54%

3. Fitting one will save 46% of the water used for showers.

Seems simple logic???

What is completely ignored is that showers with water-saving shower heads are often longer than with standard shower heads. I did some testing at my house. I actually use *less* water with a zero-star shower head than with a five-star one. Why? Because the ultra-green heating unit feeding it doesn't get working properly until the water has been flowing for some time, so I have to turn it on for a while before I can get into it. Then, the low flow makes it harder to use for cleaning (most people have experienced the need to run around in a five star shower just to get wet). Washing hair, etc. takes much longer with a "green" shower head and so on.

All the five star shower head does is cause me to use more water, to have a lousy shower and waste a bunch of time. It's not as simple as the bogus mathematics used to support it.
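A quick Python sketch shows where the simple logic breaks down. Only the 54% flow figure and the 10 minute shower come from the list above. The 20 minute duration is a hypothetical value, not a measurement, and water is counted in "standard-head minutes" so that no actual flow rate has to be invented.

```python
LOW_FLOW_FRACTION = 0.54  # five-star head's flow relative to a standard head


def water_used(flow_fraction, minutes_running):
    """Water used, measured in minutes of standard-head flow."""
    return flow_fraction * minutes_running


# How much longer the low-flow shower can run before it uses more water.
break_even_ratio = 1 / LOW_FLOW_FRACTION

standard = water_used(1.0, 10)
low_flow_same_length = water_used(LOW_FLOW_FRACTION, 10)
low_flow_twice_as_long = water_used(LOW_FLOW_FRACTION, 20)  # hypothetical duration
```

The 46% saving only holds if the tap runs for exactly the same time. Once warm-up and slower rinsing push the running time past about 1.85 times the original, the low-flow head uses more water than the standard one.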

Odd that you can't create a filtered index on a deterministic persisted calculated column

On a client site the other day, I came across a situation (unfortunately too common) where a column in a table was being used for two purposes. It could either hold an integer value or a string. Only about 100 rows out of many millions had the integer value. Some of the client code needed to calculate the maximum value when it was an integer. First step I tried was to add a persisted calculated column like so:

CREATE TABLE dbo.LousyTable
( ColumnWithMixedValues varchar(20),
  SomeOtherColumn varchar(10),
  -- Note: ISNUMERIC also returns 1 for strings such as '$' or '1e4',
  -- which cannot be cast to int.
  MixedValueColumnAsInt AS
    CASE WHEN ISNUMERIC(ColumnWithMixedValues) = 1
         THEN CAST(ColumnWithMixedValues AS int)
         ELSE NULL
    END PERSISTED
);

After indexing the calculated column, all was good. But I then thought I should create a filtered index instead:

CREATE INDEX IndexAttempt1 ON dbo.LousyTable (MixedValueColumnAsInt)
WHERE MixedValueColumnAsInt IS NOT NULL;

but this fails with:

Msg 10609, Level 16, State 1, Line 1

Filtered index 'IX_LousyTable' cannot be created on table 'dbo.LousyTable' because the column 'MixedValueColumnAsInt' in the filter expression is a computed column. Rewrite the filter expression so that it does not include this column.

I was discussing this with fellow MVP Rob Farley and we tried some other options such as:

CREATE INDEX IndexAttempt2 ON dbo.LousyTable (MixedValueColumnAsInt)
WHERE ISNUMERIC(ColumnWithMixedValues) = 1;

CREATE INDEX IndexAttempt3 ON dbo.LousyTable (MixedValueColumnAsInt)
WHERE CASE WHEN ISNUMERIC(ColumnWithMixedValues) = 1
           THEN CAST(ColumnWithMixedValues AS int)
           ELSE NULL
      END IS NOT NULL;

Regardless, there's no option to do this. I really think there should be. It's hard to imagine why it isn't permitted.

If you think so too, here's the connect item to vote on:

https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=518328