Cons & Pros for Foreign Key

16. September 2009 22:10 by Eric in SQL Server  //   Comments (0)

Reasons to use Foreign Keys:

* You won't get Orphaned Rows (data integrity issue)

* You can get nice "on delete cascade" behavior, automatically cleaning up tables

* Knowing about the relationships between tables in the database helps the Optimizer plan your queries for most efficient execution, since it is able to get better estimates on join cardinality.

* FKs give a pretty big hint on what statistics are most important to collect on the database, which in turn leads to better performance

* They enable all kinds of auto-generated support -- ORMs can generate themselves, visualization tools will be able to create nice schema layouts for you, etc

* Someone new to the project will get into the flow of things faster since otherwise implicit relationships are explicitly documented

Reasons not to use Foreign Keys:

* You are making the DB work extra on every CRUD operation because it has to check FK consistency. This can be a big cost if you have a lot of churn

* By enforcing relationships, FKs specify an order in which you have to add/delete things, which can lead to refusal by the DB to do what you want. (Granted, in such cases, what you are trying to do is create an Orphaned Row, and that's not usually a good thing). This is especially painful when you are doing large batch updates, and you load up one table before another, with the second table creating consistent state (but should you be doing that sort of thing if there is a possibility that the second load fails and your database is now inconsistent?).

* Sometimes you know beforehand your data is going to be dirty, you accept that, and you want the DB to accept it

* You are just being lazy

DB integrity is paramount. FK's define relational structure in a relational DB, without FKs there's no point you might as well use Excel. If FK's are causing you a headache, it's because your approach is wrong -> Your data access patterns and business logic layer MUST be defined by the requirements of your database, not the other way around. Stuart's input:

Performance is NOT a valid reason to leave off a foreign key. The following solutions should be considered:

  • If performance is an issue in scenarios such as separate user databases or logging scenarios, then a suitable architectural change (usually clear and planned denormalisation) is a better solution. For example, a user action log can use a name and date rather than an ID as we only need a denormalised record of the actions that is never joined back.
  • For large batch scenarios, there are other solutions for running batches that don't need to use row-by-row check constraints. Using a transaction, check constraints can be blocked, then batch performed and then check constraints done on the full table in one hit with a rollback on failure.

Let your schema and application speak for itself, so if the relationship is required, then enforce it on a db level because the next cowboy to work on it may not.

In my opinion, the only reason to leave these off unless it is by design (in which case identity fields should not be used or the field is nullable) is the 4th point Laziness

As always, small temporary apps are not subject to strict architectural rules, however good habits are better practiced on the smaller jobs too.