Book: The Microsoft Data Warehouse Toolkit : Joy Mundy and Warren Thornthwaite

There are a number of key books that I've missed reading over the years, in areas that interest me. Recently, I've been fixing that. One that is always discussed is The Microsoft Data Warehouse Toolkit by Joy Mundy and Warren Thornthwaite from the Kimball Group.

I would have to say I enjoyed reading it. It is a large book at over 700 pages and a couple of inches thick so it took a while to get through.

I found the chapter "Designing the Business Process Dimensional Model" to be the most compelling part of the book. I can't say I totally agreed with all the advice in there but it does touch the key topics that need to be considered.

I did find the constant references throughout the book that provided mappings from the Microsoft technologies to the Kimball Method terminology quite irritating. The assumption is that you've already bought into the Kimball Method and are now moving to the Microsoft BI stack. While I'm sure that's valid for many, it's certainly not the case for many others that would be the target audience for the book. For those who aren't into the Kimball Method, the dual terminology adds an extra (and unnecessary) burden while reading the material.

The parts of the book I most struggled with were the areas where advice was given on relational database aspects of SQL Server and on hardware and system configuration. While I'm sure they felt it important to include information on this, it clearly isn't an area of expertise for the authors. I suspect it would have been better to have left this material out and referred instead to more targeted books on the topics.

While the "not-totally-Microsoft-oriented" approach of the book might be seen as a benefit, it's also a bit of a downside. I find that with quite a few books that I'm reading at present. I'm not sure if the authors would have written the same ideas and recommendations if their opinions weren't somewhat colored by their experience with other toolsets beforehand, i.e. if they were coming to the Microsoft BI toolset with fresh eyes.

Regardless, it is a classic book that's worth a look by anyone working in this area. The section on dimensional modelling and its terminology would make a good starting point for many wanting to get a handle on the most common concepts.

A SQL Server Meme

Well I was called out by Tibor Karaszi, so here goes:

How old were you when you first started programming?

I'd say I was about 19 when I started. I remember in 1976 that I was at the University of Queensland. I was doing an honours degree in physics and maths and didn't have the slightest interest in those computing people that spent their lunch hours looking at great piles of 15×11 listings. By the next year, I was one of them.

How did you get started in programming?

I was doing RPG work on some mini-computers, some work on micros (first TRS-80 model) and some mainframe work (Fujitsu X8, IBM assembler and COBOL and JCL, etc.). I was completely fascinated by what you could do with each type of machine. I loved the interactivity of the micro. By 1978, I'd bought a Cromemco multi-user Z-80 system running MPM and with the old bank-switched memory. The biggest hassle was finding a hard drive (around 5 and 10 meg at the time) that would work reliably for any length of time.

What was your first language?

RPG

What was the first real program you wrote?

Must have been some RPG code for a client of the consultant I was working with/learning from.

What languages have you used since you started programming?

I'm guessing now (and am sure I'll miss some) but the ones I've used in any substantial amount would be:

RPG, COBOL, Assembler, Pascal, Modula 2, C, C#, C++, Basic (many variations), SQL, SPL, Algol, Simula 

What was your first professional programming gig?

It'd be RPG coding for a local consultancy.

If you knew then what you know now, would you have started programming?

Yes

If there is one thing you learned along the way that you would tell new developers, what would it be?

Have something passionate that you're working on all the time, even if it isn't what you do for a living.

What’s the most fun you’ve ever had … programming?

As Tibor mentioned, it would have been the voyage of discovery in the early days. However, the playing around I did with operating system internals while working on MPE for HP was really fascinating. 

Who are you calling out?

Peter DeBetta, Kevin Kline, Craig Utley, Fernando Guerrero 

 

SQL Server 2008 Whitepapers starting to appear

Our team have been working on a number of whitepapers for SQL Server 2008. One of the first of these out the door is Itzik Ben-Gan's new paper on the T-SQL enhancements. It's great reading and can be found at:

http://msdn.microsoft.com/en-gb/library/cc721270(SQL.100).aspx

I got to work with Ron Talmage on the new partitioning whitepaper. Watch for it soon too.

PASS Summit Sessions Appearing

I had a note from Bill Graziano this morning telling me that our spotlight sessions for the PASS Summit in Seattle in November have been posted, along with details of some of the other sessions. I'm really looking forward to the summit this year as I had to miss TechEd at the last minute. Details are here:

http://summit2008.sqlpass.org/spotlight-sessions.html

http://summit2008.sqlpass.org/program-sessions.html

 

BI Databases and Table Prefixes

I know this post has the potential for religious-level debate but it's time to make it anyway.

The more I've been working with Analysis Services lately, the more it grates on me that the BI community still seem to be the last ones hanging onto table prefixes. They're not doing "tblSomeTable" but they are using "dim", "fact", etc.

Hasn't the time for this long gone now?

Most of the argument seems to be about finding tables in a list of tables. You could do that via schemas if you really wanted to. But as Adam Machanic pointed out recently, from 2005 onwards many-to-many dimensions blur these lines anyway.
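As a rough illustration of the schema idea, something like the following T-SQL would group the tables without any prefixes. The schema names (Dimension, Fact) and the table and column names are hypothetical, made up purely for this sketch, and it isn't meant as a recommended design:

```sql
-- Hypothetical schema and table names, for illustration only.
-- CREATE SCHEMA must be the first statement in its batch, hence the GO separators.
CREATE SCHEMA Dimension AUTHORIZATION dbo;
GO
CREATE SCHEMA Fact AUTHORIZATION dbo;
GO

-- The schema name now carries the "what kind of table is this" information.
CREATE TABLE Dimension.Customer
(
    CustomerKey int NOT NULL PRIMARY KEY,
    CustomerName nvarchar(100) NOT NULL
);

CREATE TABLE Fact.Sales
(
    SalesKey int NOT NULL PRIMARY KEY,
    CustomerKey int NOT NULL
        REFERENCES Dimension.Customer (CustomerKey),
    SalesAmount decimal(18,2) NOT NULL
);
```

In a tool that lists tables by schema and then name, the tables in each schema still sort together, which is most of what the prefixes were giving you.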

Is it time for the prefixes to go?

OT: Crocodiles know much more than we think

A few weeks ago I managed to catch the tail end of the reptiles series that Sir David Attenborough created. If you have a spare 3 1/2 minutes, take a look at this video: http://www.bbc.co.uk/sn/tvradio/programmes/lifeincoldblood/video.shtml?licbtt08

People seem to think crocodiles are cold, unintelligent eating machines. This video clearly shows them doing something that I'd suggest that more than 99% of humans couldn't do, even with pen, paper and a calculator with weeks of notice and a library at their disposal. What fascinates me is how they sense when to do this, given the combination of events happens so infrequently. Yet they arrive and set aside their territorial squabbles for just a day or two at exactly the right time.

Crocodiles have always intrigued me. I grew up not too far from Steve Irwin's place at Beerwah. Whatever anyone ever thought of him, having watched him in action in person, I have never seen anyone more mesmerising.

Clearly, there's a lot more to this world that we don't understand yet. That's what I love about science. It's not the answers that are the best part, it's the questions. I sense that we know so very little as yet.

Book: Database Refactoring: Evolutionary Database Design

I had heard a lot of praise for Scott Ambler's book: Database Refactoring: Evolutionary Database Design over the past few years. It's another relatively classic book that I've been slow to read.

I often mentioned to people that when I was at a software design review meeting for Microsoft around the DataDude product (Visual Studio Team Edition for Database Professionals), I noticed that Sachin Rekhi from the team was walking around with a copy of this book under his arm. As Sachin was responsible for the refactorings to go into the product and there was only one (rename) at the time, I thought that was a good sign for where the product might head. I wasn't aware that he had been a contributor to the book. Sachin wrote some of the opening details.

Now that I've read it, I'd have to say I was underwhelmed by it. I really liked the idea that someone would tackle this topic as it's sorely needed in the database community where I endlessly see DBAs who feel like they can never change anything in their schemas. I spend a lot of time with DBAs discussing how they might "regain control" of their databases to avoid this.

The biggest problem I see with the book for a SQL Server DBA's use is that Scott has (understandably) focussed on lowest-common-denominator approaches, mostly using triggers to achieve everything he does. The code samples are all from Oracle and I'm sure that wouldn't help many although they wouldn't be that hard to translate. But in the end, it's often just the wrong approach. For example, he talks about how to introduce calculated columns as a refactoring, again using triggers to maintain them. But calculated columns have been part of SQL Server since 2000 and 2005 introduced persisted calculated columns. Each type of these is automatically maintained by SQL Server and each has a different use case.
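To make the contrast with the trigger approach concrete, here is a minimal sketch of the two kinds of column in T-SQL. The table and column names (dbo.OrderLine, LineTotal and so on) are invented for the example and don't come from the book:

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE dbo.OrderLine
(
    OrderLineID int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Quantity int NOT NULL,
    UnitPrice decimal(18,2) NOT NULL,

    -- Non-persisted: not stored, evaluated when the column is read.
    LineTotal AS (Quantity * UnitPrice),

    -- Persisted (SQL Server 2005 onwards): stored in the row and
    -- updated by the engine whenever Quantity or UnitPrice changes.
    LineTotalStored AS (Quantity * UnitPrice) PERSISTED
);

INSERT dbo.OrderLine (Quantity, UnitPrice) VALUES (3, 19.95);

-- No trigger is needed: both columns reflect the new quantity.
UPDATE dbo.OrderLine SET Quantity = 4 WHERE OrderLineID = 1;

SELECT OrderLineID, Quantity, UnitPrice, LineTotal, LineTotalStored
FROM dbo.OrderLine;
```

Roughly speaking, the non-persisted form costs nothing to store but is recalculated on every read, while the persisted form trades storage and write cost for cheaper reads.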

That's really the problem with the book. While the concepts are great, most of the book is filled with the "how" and the "why". The "how" is often far from what you'd want to do when working with SQL Server, and the "why" is also often off the mark. Another example is his splitting of tables horizontally which, since SQL Server 2005, would have been better done via table partitioning.
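For anyone who hasn't seen it, a bare-bones sketch of table partitioning looks something like this. All of the object names and the boundary dates are hypothetical, everything is mapped to the PRIMARY filegroup just to keep the example short, and in SQL Server 2005 and 2008 the feature needs Enterprise (or Developer) edition:

```sql
-- Hypothetical names and arbitrary boundary dates, for illustration only.
-- The function defines three ranges: before 2007, 2007, and 2008 onwards.
CREATE PARTITION FUNCTION pfOrderYear (datetime)
AS RANGE RIGHT FOR VALUES ('20070101', '20080101');
GO

-- The scheme maps each range to a filegroup (all to PRIMARY here).
CREATE PARTITION SCHEME psOrderYear
AS PARTITION pfOrderYear ALL TO ([PRIMARY]);
GO

-- The table is created on the scheme, partitioned by OrderDate.
-- The partitioning column must be part of the unique clustered key.
CREATE TABLE dbo.Orders
(
    OrderID int NOT NULL,
    OrderDate datetime NOT NULL,
    CustomerID int NOT NULL,
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderID, OrderDate)
) ON psOrderYear (OrderDate);
```

The application still sees a single dbo.Orders table, so there are no separate tables to split out and keep in step.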

So, in the end, I was left with very mixed feelings on the book. For DBAs who might not have been exposed to unit testing, test-driven development, agile methodologies, etc. this might provide a reasonable introduction in a database context. But I wouldn't want to see SQL Server DBAs following the advice on exactly what to do.

Book: Screw it, Let's do it – Lessons in Life

This is another one of those books I picked up in an airport when I ran out of reading material while travelling. One of the key criteria I applied when choosing the book was whether or not I'd finish reading it by the time we landed. We had a bit of a delay boarding so I'd definitely finished it by the time we'd landed. It's in the "quick reads" series and it is just that.

I'm quite a fan of Richard Branson. There's something about his larrikin nature that I admire. Perhaps it's that I don't do these things enough myself. In the book Screw It, Let's Do It – Lessons in Life, Richard describes his outlook on life and business. I really was hoping for more insights into the guy. I did mildly enjoy it and found many of the stories interesting, particularly the section on his criminal past. I was unaware of it.

It's ok for passing an hour or two.

Book: Inside Microsoft SQL Server 2005 Query Tuning and Optimization

I haven't posted up any book reviews recently so it's time to catch up a bit. For some reason, it had taken me ages to get to read Kalen's latest book in the Inside SQL Server series: Inside Microsoft SQL Server 2005 – Query Tuning and Optimization.

As expected, it's a great piece of work. I very much enjoyed the chapters written by other authors as well, particularly those from Adam Machanic and Craig Freedman. Ron Talmage, Sunil Agarwal and Lubor Kollar have also made strong contributions to this work.

I had wondered what outcome the book would aim for. I found it to be more of a good coverage of many of the things you need to know and understand well when doing performance tuning, more so than one that provides a pragmatic approach to what you need to do. Perhaps that's another book. This book certainly provides the background knowledge of much of the internals involved.

Highly recommended reading!