What are lookup tables in a database?

Proper use of lookup tables

I'm having trouble figuring out exactly how to set good boundaries on when and where to use lookup tables in a database. Most of the sources I've looked at say I can never have too many, but at some point it seems like the database gets broken into so many pieces that while it's efficient, it's no longer manageable. Here is an example I'm working with:

Let's say I have a table called Employees:

Imagine for a moment that the data is more complex and has hundreds of rows. The most obvious thing I see that could be moved to a look-up table would be position. I could create a table called Positions and put the foreign keys from the Positions table into the Position column of the Employees table.
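A minimal sketch of that idea, using Python's built-in sqlite3 module so it can be run as is. The column names and sample rows are hypothetical, since the original table is not shown:

```python
import sqlite3

# Hypothetical schema: table/column names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
    CREATE TABLE Positions (
        PositionID INTEGER PRIMARY KEY,
        Title      TEXT NOT NULL UNIQUE
    );
    CREATE TABLE Employees (
        EmployeeID INTEGER PRIMARY KEY,
        FName      TEXT NOT NULL,
        LName      TEXT NOT NULL,
        PositionID INTEGER NOT NULL REFERENCES Positions (PositionID)
    );
    INSERT INTO Positions (PositionID, Title) VALUES (1, 'Manager'), (2, 'Clerk');
    INSERT INTO Employees (FName, LName, PositionID) VALUES
        ('John', 'Smith', 1),
        ('Jane', 'Doe', 2);
""")

# A join is needed to turn the foreign key back into a readable title.
rows = conn.execute("""
    SELECT e.FName, e.LName, p.Title
    FROM Employees AS e
    JOIN Positions AS p ON p.PositionID = e.PositionID
    ORDER BY e.EmployeeID
""").fetchall()
print(rows)
```

The Employees table itself now holds only the number in PositionID, which is the readability cost discussed next.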

But how far can I still break the information down into smaller look-up tables before it becomes unmanageable? I could create a gender table and put a 1 for "male" and a 2 for "female" in a separate look-up table. I could even put LNames and FNames in tables. All "John" entries are replaced with a foreign key of 1 that points to the FName table, which indicates that an ID of 1 corresponds to John. However, if you go too far down that rabbit hole, your Employees table will be reduced to a jumble of foreign keys:

While this may or may not be more efficient for a server, it is certainly unreadable for a normal person trying to maintain it, and it becomes more difficult for an application developer trying to access it. So my real question is how far is it too far? Are there "best practices" for this kind of thing, or a good set of guidelines somewhere? I can't find any information online that really provides useful guidelines for this particular problem. Database design is old hat to me, but GOOD database design is very new so overly technical answers can be over my head. Any help would be appreciated!


But how far can I keep breaking the information down into smaller look-up tables before it becomes unmanageable? I could create a gender table and put a 1 for "male" and a 2 for "female" in a separate look-up table.

You're mixing two different issues. One is the use of a "look-up table"; the other is the use of surrogate keys (ID numbers).

Start with this table.

You can create a look-up table for such items.

Your original table looks exactly the same as it did before you created the lookup table. The employees table needs no additional joins to get useful, human-readable data.

Using a "look-up table" boils down to this: does your application need the control over input values that a foreign key reference provides? If so, you can always use a look-up table. (Regardless of whether it uses a surrogate key.)
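As a rough illustration, here is a look-up table keyed on the position name itself rather than on an ID number. This is only a sketch with Python's sqlite3 module; the table names, column names, and sample values are assumptions, not the original poster's schema:

```python
import sqlite3

# Hypothetical schema: the look-up table uses the position name as its (natural) key.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
    CREATE TABLE Positions (Position TEXT PRIMARY KEY);
    CREATE TABLE Employees (
        EmployeeID INTEGER PRIMARY KEY,
        FName      TEXT NOT NULL,
        Position   TEXT NOT NULL REFERENCES Positions (Position)
    );
    INSERT INTO Positions (Position) VALUES ('Manager'), ('Clerk');
    INSERT INTO Employees (FName, Position) VALUES ('John', 'Manager');
""")

# No join is needed: the Employees row is readable on its own.
readable = conn.execute("SELECT FName, Position FROM Employees").fetchall()

# The foreign key still controls input: a misspelled position is rejected.
try:
    conn.execute("INSERT INTO Employees (FName, Position) VALUES ('Jane', 'Mangler')")
    typo_rejected = False
except sqlite3.IntegrityError:
    typo_rejected = True
```

The foreign key does the validating here; whether the key is a surrogate number or the value itself is a separate decision.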

In some cases, you can populate this table entirely at design time. In other cases, users need to be able to add rows to this table at runtime. (And you will probably need some administrative process to review new data.) Gender, for which there is actually an ISO standard, can be fully populated at design time. Street names for international online product orders will likely need to be added at runtime.

In your Employees table, I would only use a look-up table for "position", because it is a limited set of data that can expand.

  • Gender is self-describing, limited to 2 values, and can be enforced with a CHECK constraint. You won't add new genders (political correctness aside)
  • The first name "John" is not part of a finite, restricted set of data: the potential set is so large that it is practically unlimited, so it should not be a look-up
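A CHECK constraint of the kind mentioned for gender could look like the sketch below (Python's sqlite3 module). The codes 'M' and 'F' and the column names are assumptions chosen for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Employees (
        EmployeeID INTEGER PRIMARY KEY,
        FName      TEXT NOT NULL,
        -- 'M' and 'F' are assumed codes; no look-up table is involved
        Gender     TEXT NOT NULL CHECK (Gender IN ('M', 'F'))
    )
""")
conn.execute("INSERT INTO Employees (FName, Gender) VALUES ('John', 'M')")

# Any value outside the fixed set violates the CHECK constraint.
try:
    conn.execute("INSERT INTO Employees (FName, Gender) VALUES ('Jane', 'X')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Employees").fetchone()[0]
```

Changing the allowed set means altering the table definition, which is why this only suits values that are not expected to grow.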

If you want to add a new position, just add a row to the look-up table. This also removes data modification anomalies, which is one of the points of normalization.
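A small sketch of both points, again with Python's sqlite3 module and a hypothetical Positions/Employees schema: adding a position is one INSERT, and renaming one is a single-row UPDATE no matter how many employees hold it.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Positions (
        PositionID INTEGER PRIMARY KEY,
        Title      TEXT NOT NULL UNIQUE
    );
    CREATE TABLE Employees (
        EmployeeID INTEGER PRIMARY KEY,
        FName      TEXT NOT NULL,
        PositionID INTEGER NOT NULL REFERENCES Positions (PositionID)
    );
    INSERT INTO Positions (PositionID, Title) VALUES (1, 'Manager'), (2, 'Clerk');
    INSERT INTO Employees (FName, PositionID) VALUES ('Jane', 2), ('Bob', 2);
""")

# Adding a new position touches only the look-up table.
conn.execute("INSERT INTO Positions (PositionID, Title) VALUES (3, 'Analyst')")

# Renaming a position changes exactly one row; no employee rows are rewritten,
# so two employees can never end up with different spellings of the same title.
cur = conn.execute("UPDATE Positions SET Title = 'Office Administrator' WHERE PositionID = 2")
rows_changed = cur.rowcount

titles = conn.execute("""
    SELECT e.FName, p.Title
    FROM Employees AS e
    JOIN Positions AS p ON p.PositionID = e.PositionID
    ORDER BY e.EmployeeID
""").fetchall()
```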

Once you have a million employees, it's more efficient to store a tinyint PositionID than a varchar.

Let's add a new column, "Salary Currency". I would use a look-up table here with a key of CHF, GBP, EUR, USD, etc.; I would not use a surrogate key. This could be restricted with a CHECK constraint like gender, but it's a limited yet extensible set like position. I give this example because I would use the natural key even if it appears in a million rows of employee data, despite it being char(3) rather than tinyint.

So, in summary, you use lookup tables

  1. where you have a finite but expandable amount of data in one column
  2. where it is not self-describing
  3. to avoid data change anomalies

The answer is "it depends". Not very satisfying, but there are plenty of influences that push and pull the design. If you have app programmers designing the database, a structure like the one you describe will work for them because the ORM hides the complexity. You'll be pulling your hair out when you write reports and have to join ten tables to get an address.

Design for use, intended use, and likely future use. This is where your knowledge of the business process comes in. When designing a database for a veterinary company, there are reasonable assumptions about its size, use, and how it works, all of which are vastly different from those of a high-tech start-up.

To reuse a favorite quote:

A wise man once told me, "normalize until it hurts, denormalize until it works".

Somewhere in there is the sweet spot. In my experience, having a key id in more than one table is not as serious a crime as some believe if you never change the primary key.

Take this abbreviated example of highly normalized tables from a real system

These tables contain a linked list of individual properties and high-level child properties that are used here

This looks great: get all cases with a property_id in a selection

Let's make a list

Now try to select all the properties of a case if it has property_types 3 and 4 and 5 or not ...

It just hurts ... even if you handle it more elegantly. However, add a bit of de-normalization by splitting properties for which a case only has one property_id, and this could be a lot better.
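To give a feel for why the "has property types 3 and 4 and 5" question is awkward, here is a sketch in Python's sqlite3 module. The original tables are not shown, so the three tables below (cases, properties, case_properties) and all their columns and rows are hypothetical stand-ins:

```python
import sqlite3

# Hypothetical, heavily simplified stand-in for the normalized case/property tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cases (case_id INTEGER PRIMARY KEY);
    CREATE TABLE properties (
        property_id      INTEGER PRIMARY KEY,
        property_type_id INTEGER NOT NULL
    );
    CREATE TABLE case_properties (
        case_id     INTEGER NOT NULL REFERENCES cases (case_id),
        property_id INTEGER NOT NULL REFERENCES properties (property_id),
        PRIMARY KEY (case_id, property_id)
    );
    INSERT INTO cases (case_id) VALUES (1), (2);
    INSERT INTO properties (property_id, property_type_id) VALUES (10, 3), (11, 4), (12, 5);
    -- case 1 has types 3, 4 and 5; case 2 has only types 3 and 4
    INSERT INTO case_properties (case_id, property_id) VALUES
        (1, 10), (1, 11), (1, 12),
        (2, 10), (2, 11);
""")

# "Has all of types 3, 4 and 5" cannot be a plain WHERE filter on one row;
# it needs grouping and counting (or one join per required type).
matching_cases = conn.execute("""
    SELECT cp.case_id
    FROM case_properties AS cp
    JOIN properties AS p ON p.property_id = cp.property_id
    WHERE p.property_type_id IN (3, 4, 5)
    GROUP BY cp.case_id
    HAVING COUNT(DISTINCT p.property_type_id) = 3
    ORDER BY cp.case_id
""").fetchall()
```

If a case could only ever have one property of a given type, storing that property_id as a column on the case row would turn this into an ordinary WHERE clause, which is the kind of de-normalization suggested above.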

To find out whether you have too many tables or not enough, try querying the database with the questions that the application, a report, and year-to-year analysis will ask.
