Part of the value in the ongoing evolution of the T-SQL language is that it is moving further and further towards being declarative rather than prescriptive, i.e. we can tell SQL Server what we want, rather than how to do it. Over time, that gives the optimizer more and more opportunities to work with us to achieve a better outcome.
For example, note the following query against the AdventureWorksDW database:
SELECT rs.ProductKey, rs.OrderDateKey, rs.SalesOrderNumber,
       rs.OrderDateKey - (SELECT TOP(1) prev.OrderDateKey
                          FROM dbo.FactResellerSales AS prev
                          WHERE rs.ProductKey = prev.ProductKey
                          AND prev.OrderDateKey <= rs.OrderDateKey
                          AND prev.SalesOrderNumber < rs.SalesOrderNumber
                          ORDER BY prev.OrderDateKey DESC,
                                   prev.SalesOrderNumber DESC)
       AS DaysSincePrevOrder
FROM dbo.FactResellerSales AS rs
ORDER BY rs.ProductKey, rs.OrderDateKey, rs.SalesOrderNumber;
In this query, I'm trying to include details of how long it was since the previous order, alongside the details of the current order. Note the option that the new LAG function provides:
SELECT ProductKey, OrderDateKey, SalesOrderNumber,
       OrderDateKey - LAG(OrderDateKey)
                      OVER (PARTITION BY ProductKey
                            ORDER BY OrderDateKey, SalesOrderNumber)
       AS DaysSincePrevOrder
FROM dbo.FactResellerSales AS rs
ORDER BY ProductKey, OrderDateKey, SalesOrderNumber;
Note how much more elegant the code is, but more importantly, look at the difference in optimization:
This is great work from the T-SQL and engine teams. I encourage you to get out and try the new Windowing functions in Denali CTP3.
Hi Greg,
I inherently distrust the "Query cost (relative to the batch)" figure as being anywhere near accurate, especially where scalar functions (user-defined or not) are involved.
What is the estimated cost of the LAG function within the plan? I would bet that it is a lot less than its real cost.
Do the IO counters in Profiler show the same ratio of costs?
I agree; only back-to-back Profiler figures are reliable enough, in my experience.
Hi Dave/John,
I totally agree, but in the case of these window functions, we've seen a real difference in the plans generated, and they are much, much cleaner in how they work.
The real sweet spot seems to be any sort of running total, which is otherwise a problem that degrades quadratically as the number of rows increases (the correlated self-join has to re-read ever more preceding rows for each row). With the window functions, they just perform consistently.
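For example, the running total that would otherwise need a triangular self-join can be expressed directly with SUM() OVER and a ROWS frame. This is a sketch against the same AdventureWorksDW fact table; the choice of SalesAmount and the partitioning columns is illustrative, not from the original post:

```sql
-- Running total of reseller sales per product, in order-date order.
-- ROWS UNBOUNDED PRECEDING asks for a cumulative frame, which the
-- engine can satisfy with a single ordered pass rather than a
-- correlated self-join per row.
SELECT rs.ProductKey, rs.OrderDateKey, rs.SalesOrderNumber,
       SUM(rs.SalesAmount)
           OVER (PARTITION BY rs.ProductKey
                 ORDER BY rs.OrderDateKey, rs.SalesOrderNumber
                 ROWS UNBOUNDED PRECEDING)
       AS RunningSalesAmount
FROM dbo.FactResellerSales AS rs
ORDER BY rs.ProductKey, rs.OrderDateKey, rs.SalesOrderNumber;
```

Specifying ROWS rather than the default RANGE frame also avoids the extra work the engine does to handle ties in the ordering columns.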
Regards,
Greg
Posting elapsed times would be a much better (more trustworthy) way to compare.