If I create a primary key (let's say a CustomerID in a Customers table) in SQL Server, an index is automatically created to support that key. Primary keys must be unique and not null. SQL Server uses that index to ensure that the key is unique. Keep in mind that when I say "key", I'm referring to one or more columns, not necessarily just one.
The same happens for unique keys. Again, it makes it easy for SQL Server to ensure the uniqueness.
But if I create a foreign key (let's say a CustomerID in an Orders table), SQL Server doesn't create any index to support that. When I have the foreign key in place, and I perform an INSERT or UPDATE, it can use the primary key of the Customers table to check whether or not the customer exists.
Where the problems start
That's all well and good. But ask yourself: what's the chance that I need to come the other way across the relationship ie: find me the orders for a particular customer? At that point, you'd want an index to help your query.
DELETE statements can be even nastier. If I execute a command to just delete one customer, without any index in place, that's going to cause SQL Server to read the entire Orders table first. It has to make sure there are no orders for that customer. So a simple DELETE operation that should take no time at all, suddenly becomes a nasty mess that involves reading millions or billions of rows. I'd also end up with all sorts of locks involved.
An index on the foreign key would have avoided that.
The big question though is if SQL Server auto-created the index for you, would it be useful? For the DELETE operation, it most certainly would be, and for that reason alone, I'd like to see it there.
But what about the query? Well it's likely that if I was looking for all the orders for a customer, I'd want more than the OrderID. I'd probably want an OrderDate, a DueDate, a PurchaseOrderNumber, etc. And this is where the controversy starts. That auto-created index would be OK for this, as long as the query was highly selective. It would end up doing lookups to get the other columns once it found the orders.
But if the outcome wasn't highly selective, the index would be ignored. SQL Server would decide that the lookups would be too expensive.
So what is needed? I'd need an index where the CustomerID was the first component of the index, and the other columns were either index key columns, or included columns. So perhaps an index like this:
CREATE INDEX FK_dbo_Orders_dbo_Customers
ON dbo.Orders (CustomerID, OrderDate)
INCLUDE (DueDate, PurchaseOrderNumber);
Importantly, that index would work for the DELETE operation as well.
I believe that SQL Server would be overall better if it auto-created indexes to support foreign keys. I just see too many issues happen because they're missing.
Ideally, it would do that only unless it detected that there was another suitable index (best bet). Alternately, it could offer an I_KNOW_WHAT_I_AM_DOING clause when creating the foreign key.
OK, perhaps I'd settle for a clause like WITH NO_INDEX.
What's that smell?
When I'm reviewing databases, I do look for this. I consider a declared foreign key, where there is no index with the foreign key columns as its left-most components, to be a code smell.
If you want to check for these, our ListUnindexedForeignKeys procedure in our free developer and DBA toolkit (SDU Tools) offers a simple way to find out.