In Part 1 of this series about database keys, we looked at the difference between natural and surrogate keys, using a table with data about friends. Here’s that table again:
SELECT * FROM friends; first_name | last_name | hair_color | id ------------+-----------+------------+---- Monica | Geller | | 1 Ross | Geller | | 2 Chandler | Bing | | 3 Joey | Tribbiani | | 4 Rachel | Green | | 5 Phoebe | Buffet | blonde | 6 Monica | Lewinsky | | 7 Al | Green | | 8 Joey | Kangaroo | orange | 9 (9 rows)
The table contains columns for data about first name, last name, and hair color, plus an
id column that was created to uniquely identify each row, regardless of whether or not the values in the other columns were unique.
id column of the
friends table is an example of a primary key. A primary key is used to uniquely identify a row in the same table. Columns can be explicitly specified as primary keys in PostgreSQL using
PRIMARY KEY , as was done when the column was created in the previous article:
ALTER TABLE friends ADD COLUMN id serial PRIMARY KEY;
A primary key is usually, though not always, an automatically-incrementing integer. But no matter the data type, a primary key must be unique, and so surrogate keys make great primary keys.
While a primary key identifies a row in its own table, a foreign key identifies a row in another table in the database.
Let’s say we want to store data about the city where each of our friends lives. Since many of our friends live in the same city, we’ll just store cities in a separate table. This way, we only have to enter each city once, preventing duplication. Each city will have its own id, which is its primary key. Let’s create the table and add some data:
CREATE TABLE cities( id serial PRIMARY KEY, name text ); INSERT INTO cities (name) VALUES ('New York'), ('Memphis'), ('Sydney'), ('Washington, D.C.');
cities table looks like this:
SELECT * FROM cities; id | name ----+------------------ 1 | New York 2 | Memphis 3 | Sydney 4 | Washington, D.C. (4 rows)
Now, to link the city data to the friends data, we can add another column in the
friends table to store each the id of the appropriate city. We’ll name this column
city_id . The city id, which was the primary key for the
cities table, becomes a foreign key in the
ALTER TABLE friends ADD COLUMN city_id integer REFERENCES cities(id);
When creating this new column, the
REFERENCES keyword specifies that it is a foreign key column, and
cities(id) specifies which primary key the column is referencing.
Now we can update the
friends table with the city data:
UPDATE friends SET city_id = 1 WHERE id BETWEEN 1 AND 6; UPDATE friends SET city_id = 2 WHERE id = 8; UPDATE friends SET city_id = 3 WHERE id = 9; UPDATE friends SET city_id = 4 WHERE id = 7;
Here is our
friends table with both primary and foreign key columns populated:
SELECT * FROM friends; first_name | last_name | hair_color | id | city_id ------------+-----------+------------+----+--------- Monica | Geller | | 1 | 1 Ross | Geller | | 2 | 1 Chandler | Bing | | 3 | 1 Joey | Tribbiani | | 4 | 1 Rachel | Green | | 5 | 1 Phoebe | Buffet | blonde | 6 | 1 Al | Green | | 8 | 2 Joey | Kangaroo | orange | 9 | 3 Monica | Lewinsky | | 7 | 4 (9 rows)
In this table, while the primary key must be unique, the foreign key does not have this restriction. After all, it makes sense – six of our friends live in New York, and therefore share the same city id.
So, to summarize:
- Primary keys uniquely identify data within a table.
- Foreign keys identify data in a separate table.
So hopefully these two posts have been useful in clearing up any confusion about the types of keys in a database, whether natural or surrogate, primary or foreign.