SQL Server 2008 Full Text Indexing – Working out how many documents still need to be processed

I had a query from an attendee of my full-text indexing session at TechEd US. He asked how he can find out which documents (or how many) still need to be processed. I did a little investigation on this and here's my best guess:

<WARNING: Undocumented and potentially just a guess!>

1. Query for the object_id of your full-text index (this is the object_id of the table that the index is built on). You can do this with:

select * from sys.fulltext_indexes

2. Open a dedicated administrator connection (DAC) to your system, i.e. connect to admin:SERVER instead of SERVER.

3. Query as follows:

select * from sys.fulltext_index_docidstatus_2105058535

(the number on the end needs to be your full-text index's object_id, not mine :-))

From what I can see, this table seems to hold details of documents not yet processed and it gets cleaned up as documents are processed. This is an internal table that you can see via:

select * from sys.internal_tables
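If the guess above holds, steps 1 to 3 can be combined into a single batch that builds the internal table name from the object_id and counts the rows left in it. This is only a sketch: dbo.Documents is a hypothetical table name, the naming pattern is simply the one observed above, and the batch still has to be run over the admin connection.

```sql
-- Sketch only: run this over the dedicated admin connection (admin:SERVER).
-- dbo.Documents is a hypothetical name; use the table your full-text index is on.
DECLARE @objectid int = OBJECT_ID(N'dbo.Documents');

-- Build the internal table name using the (undocumented) pattern seen above.
DECLARE @sql nvarchar(400) =
      N'SELECT COUNT(*) AS rows_remaining '
    + N'FROM sys.fulltext_index_docidstatus_'
    + CAST(@objectid AS nvarchar(20)) + N';';

-- Assumption: each row is a document still waiting to be processed.
EXEC sys.sp_executesql @sql;
```

Running it repeatedly while a population is in progress should show whether the count falls as documents are processed.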

</WARNING: Undocumented and potentially just a guess!>

Hope that helps someone.

Jacob Sebastian is a writing machine: free eBook on XML Schemas in SQL Server

One of the members of our Asian regional development team for PASS is Jacob Sebastian. A week or so back he told me he'd written an eBook for the Red-Gate folk on XML Schemas in SQL Server. I downloaded it expecting it to be fifty to a hundred pages. It was 483 pages. What can I say: Jacob is a writing machine. You can download it here:

http://beyondrelational.com/blogs/jacob/archive/2009/04/26/my-latest-book-the-art-of-xsd-sql-server-xml-schema-collections-available-for-free-download.aspx

Farewell Steve and thanks for all the fish

Microsoft lost a lot of good people this week. I have to say this change has me dumbfounded. Steve has become a friend over many years of presenting at the same events. I've usually found him to be one of the most interesting people at any of these events. He's also usually one of the top presenters (if not the top) at most of these events.

Good luck Steve.

Why don't the headlines say "Developer glitch" or "Design glitch" instead of "Database glitch"

Most people are aware that a "database" glitch caused the download servers for Windows 7 RC to fail the other day. What annoys me though is that the headlines always say "Database glitch" or "SQL Server glitch". Based on what Paul Randal was posting today, it seems like a pretty simple "Design glitch" or a "Developer glitch".

Every month, I find myself at sites with issues caused by the lack of database-related skills in developer teams. SQL Server does such a good job and is so easy to work with that it seems like many developer teams think they don't need database-related skills, particularly at the design stage. How can that message get changed? Or is that a lost cause and the product needs to simply become:

  • even easier to use or
  • more accommodating of design issues or
  • better at clearly identifying design issues?

Perhaps the headlines should say "Project Management Glitch".

Timely reminder to avoid early filtering on resource usage when profiling SQL

I'm back in Melbourne doing some performance-tuning work this week.

Yesterday's issue ended up being a caching problem in middle-tier code. These issues are surprisingly common.

The symptom was hundreds of thousands of calls to a particular stored proc over a period of half an hour. It's a timely reminder that when you're tracing using SQL Trace calls or Profiler, it's important to avoid filtering out calls that aren't using many resources until you've looked at the bigger picture. In this case, the logical reads, CPU, duration, etc. on each call were close to zero. No call on its own was a problem but the overall effect of the calls was staggering.
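One way to get that bigger picture is to load an unfiltered trace file and total the calls per procedure before looking at any single call. The query below is a rough sketch rather than the exact one I used: the file path is hypothetical, and it assumes the trace captured the RPC:Completed event with the ObjectName, Duration, CPU and Reads columns.

```sql
-- Sketch only: C:\Traces\app.trc is a hypothetical path to a saved trace file.
SELECT  ObjectName,
        COUNT(*)      AS calls,
        SUM(Duration) AS total_duration,   -- microseconds in trace files
        SUM(CPU)      AS total_cpu,
        SUM(Reads)    AS total_reads
FROM    sys.fn_trace_gettable(N'C:\Traces\app.trc', DEFAULT)
WHERE   EventClass = 10                    -- RPC:Completed
GROUP BY ObjectName
ORDER BY calls DESC;                       -- most frequently called procs first
```

Sorting by call count rather than by duration or reads is the point: a proc that costs almost nothing per call still rises to the top if it's being called hundreds of thousands of times.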

In the end, the problem was a cache timeout value set to 60 instead of 3600. The cache was meant to be flushed each hour, not each minute, but the developer responsible thought the value was in minutes, not seconds.

How important is extensibility for SQL Server?

One of the things that has always surprised me with SQL Server is the lack of extensibility points. In fact, the team seems to go out of their way to remove or avoid them. SQL Server Management Studio is an obvious example but I see it as a much deeper problem.

Taking SQL Server 2008 as a recent example, there is a fixed list of facets. Why? Surely there must be a well-defined interface that all the supplied ones adhere to. Why isn't that interface exposed?

I find that every time I'm in a software review or similar meeting, I'm the one in the room saying "how do I build one of those?".

I think the product would be so much richer if an ecosystem were permitted to be created around it. For example, I saw Klaus Aschenbrenner demonstrating a nice plug-in for SSMS a while back that provided a class-model-style view of Service Broker objects. Why do such things have to be hacked into the product rather than integrated via supported interfaces?

The product team isn't the only source of ideas for extending the product and it also doesn't have limitless funds available for development. Why should the growth of the product be stunted by an inability to let other people expand it?

How important do you feel extensibility is for the product?

No Microsoft BI Conference this year, SQL PASS Summit is the place to be, Call for speakers extended

Microsoft have announced that they won't be running a BI conference as a separate event this year and that they will be supporting the SQL PASS summit as one of their key BI events for the year.

Because of that, PASS has extended the call for speakers (which was originally to close on April 10th) to allow for those who might now want to consider speaking at the summit.

Regardless, this means that the SQL PASS Summit will truly be the place to be for SQL Server and BI professionals this year. I hope to see you there.

Details are at: http://summit2009.sqlpass.org/

SQL Server 2008 SP1 is out: Now another adoption blocker is gone

There seems to be a belief that no-one should install a new version of SQL Server until at least one service pack has been released for it. I've never subscribed to that thinking. I find it amusing that customers who would not install the RTM version of SQL Server 2008 (which was fully regression tested) would happily run SQL Server 2005 SP2 plus about 10 cumulative updates (which weren't).

Most of the issues I've seen with new versions tend to relate to the newer functionality. You do occasionally find exceptions to this but that's also why you need to do reasonable testing on your own applications.

Regardless, that adoption blocker is now gone with the release today of SP1 for SQL Server 2008. You can find it here: http://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=66ab3dbb-bf3e-4f46-9559-ccc6a4f9dc19