A Bit About Database Keys, Part 2: Primary and Foreign Keys

In Part 1 of this series about database keys, we looked at the difference between natural and surrogate keys, using a table with data about friends.  Here’s that table again:

SELECT * FROM friends;
 first_name | last_name | hair_color | id 
------------+-----------+------------+----
 Monica     | Geller    |            |  1
 Ross       | Geller    |            |  2
 Chandler   | Bing      |            |  3
 Joey       | Tribbiani |            |  4
 Rachel     | Green     |            |  5
 Phoebe     | Buffet    | blonde     |  6
 Monica     | Lewinsky  |            |  7
 Al         | Green     |            |  8
 Joey       | Kangaroo  | orange     |  9
(9 rows)

The table contains columns for data about first name, last name, and hair color, plus an  id column that was created to uniquely identify each row, regardless of whether or not the values in the other columns were unique.

Primary Keys

The  id column of the  friends table is an example of a primary key.  A primary key is used to uniquely identify a row in the same table.  Columns can be explicitly specified as primary keys in PostgreSQL using  PRIMARY KEY , as was done when the column was created in the previous article:

ALTER TABLE friends
ADD COLUMN id serial PRIMARY KEY;

A primary key is usually, though not always, an automatically-incrementing integer.  But no matter the data type, a primary key must be unique, and so surrogate keys make great primary keys.

Foreign Keys

While a primary key identifies a row in its own table, a foreign key identifies a row in another table in the database.

Let’s say we want to store data about the city where each of our friends lives. Since many of our friends live in the same city, we’ll just store cities in a separate table.  This way, we only have to enter each city once, preventing duplication.  Each city will have its own id, which is its primary key.  Let’s create the table and add some data:

CREATE TABLE cities(
id serial PRIMARY KEY,
name text
);

INSERT INTO cities (name)
VALUES ('New York'), ('Memphis'), ('Sydney'), ('Washington, D.C.');

So the  cities table looks like this:

SELECT * FROM cities;
 id |       name
----+------------------
  1 | New York
  2 | Memphis
  3 | Sydney
  4 | Washington, D.C.
(4 rows)

Now, to link the city data to the friends data, we can add another column in the  friends table to store each the id of the appropriate city.  We’ll name this column  city_id . The city id, which was the primary key for the  cities table, becomes a foreign key in the  friends table.

ALTER TABLE friends
ADD COLUMN city_id integer REFERENCES cities(id);

When creating this new column, the  REFERENCES keyword specifies that it is a foreign key column, and  cities(id) specifies which primary key the column is referencing.

Now we can update the  friends table with the city data:

UPDATE friends SET city_id = 1
WHERE id BETWEEN 1 AND 6;

UPDATE friends SET city_id = 2
WHERE id = 8;

UPDATE friends SET city_id = 3
WHERE id = 9;

UPDATE friends SET city_id = 4
WHERE id = 7;

Here is our  friends table with both primary and foreign key columns populated:

SELECT * FROM friends;

 first_name | last_name | hair_color | id | city_id 
------------+-----------+------------+----+---------
 Monica     | Geller    |            |  1 |       1
 Ross       | Geller    |            |  2 |       1
 Chandler   | Bing      |            |  3 |       1
 Joey       | Tribbiani |            |  4 |       1
 Rachel     | Green     |            |  5 |       1
 Phoebe     | Buffet    | blonde     |  6 |       1
 Al         | Green     |            |  8 |       2
 Joey       | Kangaroo  | orange     |  9 |       3
 Monica     | Lewinsky  |            |  7 |       4
(9 rows)

In this table, while the primary key must be unique, the foreign key does not have this restriction.  After all, it makes sense – six of our friends live in New York, and therefore share the same city id.

So, to summarize:

  • Primary keys uniquely identify data within a table.
  • Foreign keys identify data in a separate table.

So hopefully these two posts have been useful in clearing up any confusion about the types of keys in a database, whether natural or surrogate, primary or foreign.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.