Intro to Database Systems - Part 1 : A quick intro to databases
Let’s start with some new vocabulary and acronyms:
- DBMS : Database Management System
- Data model : concepts describing a collection of data (example: relational data model, object-oriented data model).
- Schema : characteristics and properties of that collection of data (the columns in a relational database and their type - int, string, etc.).
- Instance : a particular collection of data (the rows in a relational database).
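To make the schema/instance distinction concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `students` table, its columns and the sample rows are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical in-memory database; table and column names are illustrative.
conn = sqlite3.connect(":memory:")

# Schema: the columns and their types.
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)

# Instance: the rows stored at a given moment.
conn.executemany(
    "INSERT INTO students (id, first_name, last_name) VALUES (?, ?, ?)",
    [(1, "Ada", "Lovelace"), (2, "Alan", "Turing")],
)

# PRAGMA table_info returns one row per column: (cid, name, type, notnull, default, pk).
schema = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(students)")]
instance = conn.execute(
    "SELECT id, first_name, last_name FROM students ORDER BY id"
).fetchall()

print(schema)
print(instance)
```

Inserting or deleting a row changes the instance, while the schema stays the same until the table definition itself is altered.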
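Common data models include :
- Relational : the most popular one; data is organized in tables with rows and columns (MySQL);
- Semi-structured : self-describing data, such as XML or JSON documents, whose structure can vary from one record to the next (document-oriented NoSQL databases: CouchDB, MongoDB);
- Object-oriented (OO) : data is stored as objects, so the database matches the objects used in an OO programming language;
- Key-value : a simple interface (store and fetch a value by its key) that scales well (Cassandra and HBase build on this idea).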
The main advantages of the relational model are :
- simplicity and power
- a strong mathematical foundation (relational algebra)
- declarative queries, which leave the DBMS free to optimize how the data is accessed
More complex data models, like the pre-relational models (network/hierarchical) or the post-relational ones (OO/XML), offer :
- more power
- more flexibility
- a more structured representation
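The advantage of the key-value model is its simplicity, but that is also its main drawback: the store can only look up a value by its key, so any richer query has to be written in the application.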
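The trade-off can be sketched with a toy key-value store built on a plain Python dict. The `put`, `get` and `find_by_last_name` helpers and the sample keys are hypothetical; real key-value stores expose a similar but richer interface.

```python
# A toy key-value store: a dict mapping keys to values (all names are made up).
store = {}

def put(key, value):
    """Store a value under a key, overwriting any previous value."""
    store[key] = value

def get(key):
    """Return the value for a key, or None if the key is absent."""
    return store.get(key)

put("student:1", {"first_name": "Ada", "last_name": "Lovelace"})
put("student:2", {"first_name": "Alan", "last_name": "Turing"})

# Simplicity: a lookup by key is a single call.
ada = get("student:1")

# Drawback: a question about the values means scanning every entry by hand.
def find_by_last_name(last_name):
    return [key for key, value in store.items() if value["last_name"] == last_name]

print(ada)
print(find_by_last_name("Turing"))
```

In a relational database the second question would be a one-line query that the DBMS could answer with an index; here the application has to do the work itself.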
An Entity-Relationship (ER) model describes the data requirements of the database, usually as a diagram. It contains:
- entity types : sets of objects we wish to store data about (each one is shown only once: one box for all the students, one for all the customers, etc.);
- relationships : associations between entity types (the entity type “person” is related to the entity type “house” by the relationship “occupies”);
- attributes : data characteristic of each entity type. Every entity within an entity type has the same attributes (meaning that if our entity type is “students”, all the students in our database will have a field for an ID, a first name and a last name).
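One common way to turn such a model into a relational database is to give each entity type its own table and to store the relationship in a linking table. The sketch below does this for the “person occupies house” example with Python's sqlite3 module; the column names and sample data are assumptions, and other mappings are possible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce the references

conn.executescript("""
-- Entity types become tables; attributes become columns.
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE house  (id INTEGER PRIMARY KEY, address TEXT NOT NULL);

-- The "occupies" relationship becomes a table linking the two entity types.
CREATE TABLE occupies (
    person_id INTEGER NOT NULL REFERENCES person(id),
    house_id  INTEGER NOT NULL REFERENCES house(id),
    PRIMARY KEY (person_id, house_id)
);

INSERT INTO person VALUES (1, 'Ada'), (2, 'Alan');
INSERT INTO house VALUES (10, '12 Example Street');
INSERT INTO occupies VALUES (1, 10), (2, 10);
""")

# Who occupies the house at this (made-up) address?
occupants = conn.execute("""
    SELECT person.name
    FROM person
    JOIN occupies ON occupies.person_id = person.id
    JOIN house ON house.id = occupies.house_id
    WHERE house.address = '12 Example Street'
    ORDER BY person.name
""").fetchall()

print(occupants)
```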
A transaction is a single logical unit of work on the data, which may group several reads and writes. The ACID model defines 4 properties that guarantee a database transaction is processed reliably:
- Atomicity : “all or nothing”; if part of the transaction fails, the whole transaction fails and the database is left unchanged.
- Consistency: only valid data will be written to the database; a transaction takes the database from one valid state to another.
- Isolation: multiple transactions occurring at the same time do not impact each other’s execution. This doesn’t determine which transaction occurs first, only that transactions will not interfere with each other.
- Durability: once a transaction is committed to the db, its changes won’t be lost, even after a crash.
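Atomicity and consistency can be seen in a small sketch with Python's sqlite3 module. The `accounts` table, the `transfer` helper and the balances are hypothetical. The `CHECK` constraint stands in for a validity rule, and `with conn:` commits if the block succeeds and rolls back if it raises.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"  # validity rule: no negative balance
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True if it committed."""
    try:
        with conn:  # commit on success, roll back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The debit broke the CHECK constraint, so the credit was rolled back too.
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))

ok = transfer(conn, "alice", "bob", 30)       # valid transfer
failed = transfer(conn, "alice", "bob", 500)  # would leave alice below zero

print(ok, failed, balances(conn))
```

The second transfer credits bob before the debit fails, yet bob's balance ends up unchanged by it: the partial work is undone as a whole. Isolation and durability are harder to show in a single-connection, in-memory example, so they are left out of this sketch.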