Thursday, June 29, 2006

Using Guids or Integers as database identifiers

Globally unique identifiers (Guids) are those strings that you find in COM clsid’s or in the windows registry and that look like: {ED256ABD-5BE7-4E46-BCDA-1E26B0364EBB}. You need 128 bits to store a Guid.

Integers are simply numbers that identify database rows in a table in an ascending order. They generally start at a seed of 1 and are automatically incremented by the database itself with an incremental step of 1, so you get 1, 2, 3, … Note that the database lets you specify the seed and the increment as required. You need 32 bits to store an integer.

The detractors of Guids claim that Guids require a lot more storage space, so they dramatically affect performances. If you want to store contact details with name, address, telephone and fax number in a table, you will easily get a dozen fields which will at least require 1000 bytes to store with a double-byte character set (required for working in any country). In this example, the difference on storage between using a Guid and an Integer as an identity is below 1%.

There are several advantages to using Guids:
  • Guids are well-typed in Java and in the .Net framework and I personally like to have a specific type for identifiers in the business layer;

  • They can be generated in the application, so you do not have to query the database after an insert to know which identifier has been generated for you;

  • They are guaranteed to be globally unique by using a combination of the network card MAC address and time at instant of generation, which is a reasonable guarantee. This is required in replication scenarios and SQL Server replication relies on Guids anyway, so using Guids is more future proof.

No comments: