Humans identify everything using names. For example, you, I, and everyone have unique names as our identity. There are several situations where two persons have similar first names. In those cases, we have to use their full name ( first and last names together) to identify them as separate persons. If one or more persons have similar full names, we use some other property, such as phone number, to acknowledge the persons individually. So, we always try to devise a way to identify every individual separately.
In the world of SQL databases, the primary key is like a digital fingerprint for each row in a table. It ensures that every piece of data is unique and easily identifiable. In this article, we’ll explore what a primary key is, why it’s crucial, and how it makes database management more organized and reliable. Whether you’re new to SQL or a seasoned pro, understanding the primary key is key to efficient data management. Let’s get started on this journey into the heart of SQL databases.
Table of Content
- What is Primary Key?
- Why use Primary Key?
- How to choose a strong Primary Key in SQL?
- Examples of Strong Primary Key in SQL
- Difference between Primary Key, Composite Key, and Foreign Key
- Primary Key Operations
- Multiple Column Primary Key
What is Primary Key?
A primary key is a column (or group of columns) in a relational database table that uniquely identifies each row in the table. It is a special kind of key that has the following properties:
- Unique: No two rows in the table should have the same primary key value.
- Non-null: The primary key value should not be null.
- Stable: The primary key value should not change over time.
- Easy to use: The primary key value should be easy to remember and use in queries.
A simple example of a primary key in SQL is the id column in a table of customer records. Each customer record has a unique id value, and no two customer records can have the same id value.
Why use Primary Key in SQL?
|Why it is important
|Uniquely identifies each row of a table
|Allows you to easily identify and access any row in the table.
|Customer ID, Product ID, Employee ID, etc. If you want to find the record for a specific customer, you can simply look up the customer’s ID. This is especially important for large tables with many rows.
|Prevents duplication of records in a table
|Ensures that the data in your table is accurate and consistent. If you allow duplicate records, it can be difficult to know which record is the correct one. This can lead to errors and inconsistencies in your data.
|No two customers can have the same Customer ID, no two products can have the same Product ID, and no two employees can have the same Employee ID
|Helps with updating or deleting only specific records
|Primary Key in SQL helps you to make changes to the data in your table without affecting other records.
|If you want to delete the record for a specific customer, you can simply specify the Customer ID in your query. This will prevent you from accidentally deleting the records for other customers. The same is true for updating records.
|Helps ensure that fields aren’t null
|Ensures that the data in your table is complete. If a field is null, it means that the value for that field is unknown. This can make it difficult to use the data in your table.
|A primary key column cannot have null values. This helps to ensure that the data in your table is accurate and consistent.
|Helps set up relationships between tables
|Creates a relationship between the two tables, which allows you to query data from both tables at the same time. For example, you could query a table of customers and a table of orders to find all the orders for a specific customer.
|A primary key in one table can be used as a foreign key in another table.
How to choose a strong Primary Key in SQL?
- Unique: The primary key value must be unique for each row in the table. This means that no two rows can have the same primary key value. For example, a column for the customer’s email address is not unique, as two customers could have the same email address.
- Not null: The primary key value cannot be null. This means that every row in the table must have a value for the primary key.
- Unlikely to change: The primary key value should be unlikely to change over time. This is because changing the primary key value can be difficult and error-prone. For example, a person’s name or address is likely to change over time, so these columns should not be used as primary keys.
- Small and used in indexes: The primary key should be a small, fixed-size data type. This will improve the performance of queries that use the primary key. The primary key should also be used in indexes. Indexes can improve the performance of queries that search for specific records in the table.
- Not changed once assigned: The primary key value should not change once it is assigned. This is because changing the primary key value can break relationships between tables.
Examples of Strong Primary Key in SQL
- Customer ID: It can be treated as a primary key as it is unique for each customer. It is unlikely to change over time, as it is assigned by the company when the customer opens an account.
- Employee number: This is another primary key that is unique for each employee. It is unlikely to change over time, as it is assigned by the company when the employee is hired.
- Product ID: The data in this column is a natural key since it is unique for each product. It is unlikely to change over time, as it is assigned by the company when the product is created.
- Order number: This is a surrogate key that is unique for each order. It is likely to change over time, as new orders are created. However, it is still a strong primary key, as it is unlikely that two orders will have the same order number.
- Invoice number: This is another surrogate key that is unique for each invoice. It is likely to change over time, as new invoices are created. However, it is still a strong primary key, as it is unlikely that two invoices will have the same invoice number.
- Universally unique identifier (UUID): A UUID is a 128-bit number that is guaranteed to be unique. It is a good choice for a primary key when there is no natural key that is unique.
Difference between Primary Key, Composite Key, and Foreign Key
|Uniquely identify each row in a table
|Uniquely identify each row in a table when a single column is insufficient
|Establish relationships between tables
|Student ID in a school
|(Social Security Number, Account Number)
|Book ID in checkout records referencing Book ID in books table
|Each student has a unique ID used for identification
|Requires a combination of two or more columns to guarantee uniqueness
|Links records in one table to records in another table
|Each value in the column must be unique
|The combination of values in the columns must be unique together
|Values must exist in the referenced table’s column
|Ensures that each record has a unique identifier in the table
|Ensures that combinations of columns have unique values
|Ensures that values in the column exist in another table
|Example in Use
|Ensures that no two students have the same ID
|Used when both your social security number and account number are needed for verification
|Allows tracking which student checked out which book
|In Real Life
|Social Security Number, Passport Number
|Combination lock with multiple numbers
|Barcode on a product linking to its manufacturer
|Each student’s ID card
|A combination lock
|Barcode scanner linking to a product database
Primary Key Operations
The primary Key is always associated with the column or columns in a table. We can create, alter or even drop the primary key from the columns in a table. Let’s go through each of these using an example.
Single Column Primary Key
Suppose you have a table named students having columns of name, roll, age, and marks. These are the data for the students of a particular class. Now every student from a similar class will have a unique roll number thus, we can make the roll number column our primary key.
Let’s create the table of students with roll number as the primary key –
CREATE TABLE Students ( Name VARCHAR(100), Roll INT, Age INT, Marks FLOAT, PRIMARY KEY (Roll) );
Thus, we place the column name inside the PRIMARY KEY() brackets. This indicates that the column Roll is our primary key, and it should satisfy all the conditions to be a primary key to identify the records uniquely.
Now suppose you have created a table like this –
CREATE TABLE Students ( Name VARCHAR(100), Roll INT, Age INT, Marks FLOAT );
That is, without defining any column as a primary column. Now you may need to define a column as a primary column. In that case, we have to alter the table and add the primary key constraint with the required column as such –
ALTER TABLE Students ADD PRIMARY KEY (Roll);
This will make the Roll column a Primary field.
Now you may want to remove the primary constraint from any column. In that case, what you have to do is –
ALTER TABLE Students DROP PRIMARY KEY
This will remove the single Primary Key you are having in your Students table for the Roll column.
Multiple Column Primary Key
Suppose you have a table of students with columns of name, class, age, marks, and roll number. Now all the students from a specific class will have a unique roll number associated with them, that is there will be a single person from class 6 with roll 10, and a single person from class 7 with roll 10, but no two students from these classes will have the same roll number. Thus, we have to take two columns named class and roll them together as a primary key. We cannot take class and roll numbers individually, as the class number and roll number will repeat in the whole dataset.
Let’s create the table of students with class and roll number as primary keys –
CREATE TABLE Students ( Name VARCHAR(100), Class INT, Roll INT, Age INT, Marks FLOAT, CONSTRAINT Class_Roll PRIMARY KEY (Class, Roll) );
This Class and Roll column together will make up the primary key, and Class_Roll is the name given to the new primary key created out of two columns. In short, Class_Roll is the name given to the primary key constraint, which is basically the two columns together. This naming is necessary for identifying the primary key in the table, which is then used for identifying unique records. So, in this case, when we talk about Class_Roll, we are talking about the primary fields made from the columns of Class and Roll together.
Now, if we want to delete the primary key constraint in the case of multiple column primary key, we have to do this –
ALTER TABLE Students DROP CONSTRAINT Class_Roll
Thus, we can remove the primary key constraint from more than one column at once by dropping it using the name we have given to it while creating.
Now, if we require to create the constraints once again on both the columns, then we have to alter the table in this way –
ALTER TABLE Students ADD CONSTRAINT Class_Roll PRIMARY KEY (Class, Roll);
While using alter table statement to add the primary key constraints to the columns in a table, those columns must follow the primary key conditions. The specific columns must not have any duplicate value or null value in them.
Primary Key is an important concept and is helpful in fetching unique records from the table. The primary key helps in preventing redundancy or duplicated values in the dataset. The primary field plays a crucial role in joining two or more tables. We use the primary field while joining two or more tables because this field has unique values to identify each record separately.
If you do not have any column suitable as a primary field, you should create another column having unique values and mark it as a primary field. If you have a single or multiple column(s) suitable to be the primary key, that field(s) should be defined as the primary key. The primary field can be of integer or character data type; it just needs to have all unique and non-null values.