Intro to Database Systems - Part 4 & 5 : Relational Algebra

Let’s start by understanding what relational algebra is:

A query language allows manipulation and retrieval of data from a database. It is useful to explain how a SQL query is executed internally.

Understanding relational algebra is basically understanding SQL. Some basic operations from relational algebra are:

The select operation is an unary operation noted σc( R ). It returns a subset of rows from the input relation R, satisfying condition C. This constraint is expressed as the predicate. The predicate is a boolean expression whose operators are:


Considering a table of employees E, this is an example of the select operation in SQL:



Under the relation algebra form, we’ll have:

It is important to notice that:

The project operation is an unary operation noted ∏L ( R ), where L is a list of attributes (A1, A2, A3, …, An) and R is the input relation. It extracts a subset of columns (the attributes) in a relation and discards the rest. If the user is interested in selecting the values of a few attributes, rather than selecting all attributes of the relation/table, one should go for the project operation.

This is an example of the project operation in SQL:



Under the relational algebra form, we’ll have:

Note that again, the process is simple:

We can also use operator composition: embed the result of one operation as an input to another relation. Example:


The rename operator is a unary operator noted ρ( X(x1,x2,…,xn), Y(y1, y2, …, yn)) where X is identical to the initial relation Y, but with a new relation name X and new attribute names (x1 to xn). Example:

The union operator takes as input two relations R and S.

As an example,


The intersection operator is applied when only rows shared by both A and B are included in the result. It has the same preconditions as the union operator (no duplicates, same number of attributes, domain compatibility).


The set difference operator between two relations A - B results in a relation containing rows that are in A but not in B.


The cross product operator between two relations A X B is the set of pairs that can be formed by pairing each row of A with each row of B.


If it happens that attributes have the same name in the resulting relation, we should prefix the similar attributes with the initial relation name (example : id is an attribute of relation school, but also of relation library. In the resulting relation, there should be an attribute for school id and another one for library id).

An easy way to implement the cross product operator is a nested loop. For each row from table A, pair it with a row from table B.

The condition join is just a cross product satisfying a condition. We apply the nested loop method in order to create the pairs, but only in the case where the condition is satisfied. It’s the selection operation applied to a cross product.


In SQL we’ll have : SELECT * FROM Employee, Department WHERE dept = depNr


The equi-join is simply a condition join where the condition is an equality.

The natural-join is an equi-join if values from columns with the same name are equal. If relation A has attribute (column) studID and relation B has attribute studID, rows will be paired only if their value are identical (relation A has a student with studID 99 and relation B has a student with stutID 99). In other words, we need 2 conditions to apply the paring:

As a conclusion, here are some properties of relational algebra:







Now read this

Intro to Database Systems : Indexing Part 2 - B+ trees

In the previous section, Indexing Part 1, we’ve seen that building an index for frequently used attributes considerably increases the efficiency of a query. In this section we’ll discuss the most widely used index implementation: the B+... Continue →