What are lookup tables in a database?
Proper use of lookup tables
I'm having trouble figuring out exactly how to set good boundaries on when and where to use lookup tables in a database. Most of the sources I've looked at say I can never have too many, but at some point it seems like the database gets broken into so many pieces that while it's efficient, it's no longer manageable. Here is an example I'm working with:
Let's say I have a table called Employees:
Imagine for a moment that the data is more complex and has hundreds of rows. The most obvious thing I see that could be moved to a look-up table is Position. I could create a table called Positions and put the foreign keys from the Positions table into the Position column of the Employees table.
But how far can I still break the information down into smaller look-up tables before it becomes unmanageable? I could create a gender table and put a 1 for "male" and a 2 for "female" in a separate look-up table. I could even put LNames and FNames in tables. All "John" entries are replaced with a foreign key of 1 that points to the FName table, which indicates that an ID of 1 corresponds to John. However, if you go too far down that rabbit hole, your Employees table will be reduced to a jumble of foreign keys:
While this may or may not be more efficient for a server, it is certainly unreadable for a normal person trying to maintain it, and it makes things harder for an application developer trying to access it. So my real question is: how far is too far? Are there "best practices" for this kind of thing, or a good set of guidelines somewhere? I can't find any information online that really provides useful guidelines for this particular problem. Database design is old hat to me, but GOOD database design is very new, so overly technical answers may be over my head. Any help would be appreciated!
But how far can I keep breaking the information down into smaller look-up tables before it becomes unmanageable? I could create a gender table and put a 1 for "male" and a 2 for "female" in a separate look-up table.
You are mixing two different issues. One is the use of a "look-up table"; the other is the use of surrogate keys (ID numbers).
Start with this table.
You can create a look-up table for such items.
Your original table looks exactly the same as it did before you created the look-up table. The Employees table needs no additional joins to get useful, human-readable data.
Using a "look-up table" boils down to this: does your application need the control over input values that a foreign key reference provides? If so, you can always use a look-up table. (Regardless of whether it uses a surrogate key.)
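To make that concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, column names and sample rows are assumptions made up for illustration; they are not the tables from the question. The look-up table is keyed on the position name itself (no surrogate key), so Employees stays readable without a join, while the foreign key still rejects values that are not in the look-up table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    -- Hypothetical look-up table keyed on the natural value.
    CREATE TABLE positions (
        position_name TEXT PRIMARY KEY
    );
    -- Hypothetical employees table; the position column holds readable text.
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        full_name   TEXT NOT NULL,
        position    TEXT NOT NULL REFERENCES positions (position_name)
    );
    INSERT INTO positions VALUES ('Manager'), ('Sales Associate');
    INSERT INTO employees VALUES (1, 'John Smith', 'Manager');
""")

# No join is needed to read the position.
rows = conn.execute("SELECT full_name, position FROM employees").fetchall()

# A position that is missing from the look-up table is rejected.
try:
    conn.execute("INSERT INTO employees VALUES (2, 'Jane Doe', 'Wizard')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Whether to key the look-up table on the name or on an ID number is a separate decision from whether to have the look-up table at all.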
In some cases, you can populate this table entirely at design time. In other cases, users need to be able to add rows to this table at runtime. (And you will likely need some administrative process to vet new data.) Gender, for which there is actually an ISO standard (ISO/IEC 5218), can be fully populated at design time. Street names for international online product orders will likely need to be added at runtime.
In your Employees table, I would only have a look-up for "Position", because it is a limited set of data that can expand.
- Gender is self-describing (e.g. M or F), limited to 2 values, and can be enforced with a CHECK constraint. You will not add new genders (setting political correctness aside)
- The first name "John" is not part of a finite, restricted set of data: the potential set of values is so large that it is practically unlimited, so it should not be a look-up
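As a rough sketch of the CHECK approach for gender, again with Python's sqlite3 module: the column names and the codes 'M' and 'F' are assumptions chosen for illustration. No second table is involved; the constraint alone restricts the values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical employees table: gender is restricted by a CHECK constraint,
# not by a foreign key to a look-up table.
conn.execute("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        first_name  TEXT NOT NULL,
        gender      TEXT NOT NULL CHECK (gender IN ('M', 'F'))
    )
""")
conn.execute("INSERT INTO employees VALUES (1, 'John', 'M')")

# Any value outside the allowed set violates the constraint.
try:
    conn.execute("INSERT INTO employees VALUES (2, 'Jane', 'X')")
    check_rejected = False
except sqlite3.IntegrityError:
    check_rejected = True
```

The trade-off is that extending the allowed set means altering the table definition, which is acceptable only for sets that are not expected to grow.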
If you want to add a new position, just add a row to the look-up table. This also removes data modification anomalies, which is one of the points of normalization.
Once you have a million employees, it is more efficient to store a tinyint PositionID than a varchar.
Let's add a new column, "Salary Currency". I would use a look-up table here with a key of CHF, GBP, EUR, USD etc. I would not use a surrogate key. This could be restricted with a CHECK constraint like gender, but it is a limited yet extensible set like Position. I give this example because I would use the natural key even if it appears in a million rows of employee data, even though it is char(3) and not tinyint.
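The two key styles can sit side by side. The sketch below (Python with sqlite3) uses assumed table and column names and made-up rows: Position gets a surrogate integer key, while Salary Currency uses the three-letter code itself as the key. SQLite has no tinyint type, so INTEGER stands in for it here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable foreign key enforcement

conn.executescript("""
    -- Surrogate key: a small integer stands in for the position name.
    CREATE TABLE positions (
        position_id   INTEGER PRIMARY KEY,
        position_name TEXT NOT NULL UNIQUE
    );
    -- Natural key: the currency code is its own key.
    CREATE TABLE currencies (
        currency_code CHAR(3) PRIMARY KEY
    );
    CREATE TABLE employees (
        employee_id     INTEGER PRIMARY KEY,
        first_name      TEXT NOT NULL,
        position_id     INTEGER NOT NULL REFERENCES positions (position_id),
        salary_currency CHAR(3) NOT NULL REFERENCES currencies (currency_code)
    );
    INSERT INTO positions VALUES (1, 'Manager'), (2, 'Clerk');
    INSERT INTO currencies VALUES ('CHF'), ('GBP'), ('EUR'), ('USD');
    INSERT INTO employees VALUES (1, 'John', 1, 'CHF'), (2, 'Jane', 1, 'USD');
""")

# Renaming a position changes one row in the look-up table,
# not every employee row that refers to it.
conn.execute("UPDATE positions SET position_name = 'Team Lead' WHERE position_id = 1")

# The position needs a join to be readable; the currency does not.
rows = conn.execute("""
    SELECT e.first_name, p.position_name, e.salary_currency
    FROM employees AS e
    JOIN positions AS p ON p.position_id = e.position_id
    ORDER BY e.employee_id
""").fetchall()
```

The rename touching a single row is the data modification anomaly that the look-up table avoids.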
So, in summary, you use lookup tables:
- where you have a finite but expandable set of data in one column
- where it is not self-describing
- to avoid data modification anomalies
The answer is "it depends". Not very satisfying, but there are plenty of influences that push and pull the design. If you have app programmers designing the database, a structure like the one you describe will work for them because the ORM hides the complexity. You will tear your hair out writing reports and joining ten tables together to get an address.
Design for use, intended use, and likely future use. This is where your knowledge of the business process comes in. When designing a database for a veterinary company, there are reasonable assumptions about its size, use, and how it works, all of which are vastly different from those of a high-tech start-up.
To reuse a favorite quote:
A wise man once told me, "Normalize until it hurts, denormalize until it works."
Somewhere in there is the sweet spot. In my experience, having a key id in more than one table is not as serious a crime as some believe, provided you never change primary keys.
Take this abbreviated example of highly normalized tables from a real system
These tables store a linked list of individual properties and parent/child properties, and they are used here.
This looks fine: getting all cases with a given property_id takes a single select.
Let's make a list
Now try to select all the properties of a case if it has property_types 3 and 4 and 5, or not ...
It just hurts ... even when you use more elegant ways to handle it. However, add a bit of de-normalization by breaking out the properties for which a case only ever has one property_id, and this could be a lot better.
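The original tables are not reproduced here, so the following sketch (Python with sqlite3) uses an invented two-table layout purely to show the shape of the problem. The table names and rows are hypothetical; only property_id and the idea of property types come from the description above. The first query is the easy one, the second is one common way to write the "has types 3 and 4 and 5" question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for the normalized tables.
    CREATE TABLE properties (
        property_id      INTEGER PRIMARY KEY,
        property_type_id INTEGER NOT NULL
    );
    CREATE TABLE case_properties (
        case_id     INTEGER NOT NULL,
        property_id INTEGER NOT NULL REFERENCES properties (property_id),
        PRIMARY KEY (case_id, property_id)
    );
    INSERT INTO properties VALUES (10, 3), (11, 4), (12, 5), (13, 6);
    INSERT INTO case_properties VALUES
        (1, 10), (1, 11), (1, 12),   -- case 1 has types 3, 4 and 5
        (2, 10), (2, 13);            -- case 2 has types 3 and 6 only
""")

# Easy: every case that has a given property_id, in one select.
easy = conn.execute(
    "SELECT case_id FROM case_properties WHERE property_id = 10 ORDER BY case_id"
).fetchall()

# Harder: all properties of the cases that have types 3 AND 4 AND 5.
# The subquery counts how many of the wanted types each case covers.
hard = conn.execute("""
    SELECT cp.case_id, cp.property_id
    FROM case_properties AS cp
    WHERE cp.case_id IN (
        SELECT cp2.case_id
        FROM case_properties AS cp2
        JOIN properties AS p ON p.property_id = cp2.property_id
        WHERE p.property_type_id IN (3, 4, 5)
        GROUP BY cp2.case_id
        HAVING COUNT(DISTINCT p.property_type_id) = 3
    )
    ORDER BY cp.case_id, cp.property_id
""").fetchall()
```

Even in this tiny layout the second query needs a join, a grouping and a nested select, and adding "or not" conditions makes it grow further.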
To find out whether you have too many tables or not enough, try querying the database with the questions that the application, a report, and year-to-year analysis will ask.