Intro to Database Systems - Part 11 to 13 : Integrity Constraints
Constraints are there to impose restrictions on what data is allowed in the database. The basic E/R model introduces us to 2 types of constraints:
- KEYS
- Multiplicities
The relational model allows us to be more flexible. It imposes bounds on attributes. For now, we are able to force an attribute to be:
- non null
- of a specific domain (integer, string, etc)
But what if we need to express the situation of a student that can only take some courses if his GPA > 3.0? What if, in order to be a teacher assistant, you need to have successfully passed that course with an A or you need to have a 3.0 of cumulative GPA? By introducing the SQL keyword CHECK, we introduce 2 new notions:
- attribute-based checks : fancy name to mention that we are imposing a constraint on a column
- tuple-based checks: imposing constraints on rows.
We are introducing checks (and in general, constraints), for the following main reasons:
- to catch data-entry errors (we can only input a GPA between 0.0 and 4.0)
- as correctness criteria for database updates (GPA can only be an int)
- to enforce consistency across data (by setting primary keys)
Here are a few examples of an attribute-based check:
- CREATE TABLE tableName (column columnType CHECK (anyPossibleCondition)
- CREATE TABLE Student (GPA float CHECK (GPA <= 40 AND GPA > 0));
- CREATE TABLE Sells ( bar CHAR(20), beer CHAR(20) CHECK ( beer IN (SELECT name FROM Beers)), price REAL CHECK ( price <= 5.00 ) );
- CREATE TABLE Student ( major char(10) CHECK(major NOT IN (SELECT dept-name FROM Dept WHERE status = “full”) );
It is important to notice that:
- checks are performed on update or insert, not on delete
- if condition is violated, the system will reject the update/edit
- conditions may include subqueries that evaluate as conditions
For the cases where we play with more than one one column, we need to use tuple-based checks (basically row checks). But aren’t rows composed by columns? If we want to check on a row, why can’t we simply do the following:
- CREATE table….. column1 CHECK(column1 >= 1,000), column2 CHECK(column2 >12000)
We can, only that the previous command checks column1 AND column2, not column1 OR column2. What if we want to validate a check on one OR another attribute? That’s where tuple-based checks are useful. The syntax is slightly different:
- CREATE table….. column1 …, column2…., CHECK(column1 >= 1,000 or column2 >12000)
What if we want to modify our constraints, or remove them? What if for this year you don’t need to have a 3.0 GPA to be accepted to an MBA, but a 2.6? In that case, we need to use a naming constraint convention.
In the following example, we have 2 constraints:
- skaterID is an independent key
- rating should be between 1 and 10
If we want to name our constraints or to edit them, we’ll do the following:
- CREATE TABLE Skaters ( skaterID INT NOT NULL, skaterName VARCHAR(20), rating INT CONTRAINT rat CHECK (rating > 0 and rating < 11) age INT, CONTRAINT primKey PRIMARY KEY (sid), CONTRAINT rateAndAge CHECK (rating <= 4 OR age >5) );
If we want to remove a constraint:
- ALTER TABLE Skaters DROP CONSTRAINT rateAndAge;
If we want to add a constraint:
- ALTER TABLE Skaters ADD CONSTRAINT rateAngAge CHECK (rating <= 5 OR age >5);