Intro to Database Systems : Basic Perspectives on Disk and Buffer Management

The Database Management System stores information at 3 levels of the memory hierarchy:

Now why can’t we store everything in the main memory, if it’s the fastest way? Because …

Why can’t we store everything on tape?

Disk blocks or pages are the main units for measuring retrieved data. They have a fixed usable size, usually being 512 bytes. We can read (from disk to RAM) or write (from RAM to disk) pages.

The seek time is the most time consuming operation when accessing data on disk (from 1 to 20 msec). To compare, accessing data from the main memory is in the order of nanoseconds.

Lowest layers of the Database Management System are in charge of how the place is used on the disk. Higher levels depend on a buffer (the lowest layer of DBMS) to:

In other words, the buffer manager is doing 3 things:

When data has to be loaded from the disk:

Replacement frames have a pin counter of 0. Once the page from the frame is loaded, the pin counter becomes 1, the equivalent of an empty frame. If there is an empty frame, the buffer manager picks an empty frame.

Once the data contained in the replacement frame is requested, its pin counter is incremented. We can state this as a general rule: “when requesting a page that is already in the buffer, its pin counter is incremented”. After the operation is finished, we decrement it. If the page (the disk block) is modified, a dirty bit is set and the frame is immediately written to the disk (update).

If there are no empty frames, only unpinned pages (pin counter = 0) can be chosen to accept loaded pages from the disk: this is the replacement policy. (This makes sense: if we reached the point where we have only non-empty frames, it means that we will have to wait for all the transactions to finish. We will have to wait for all the pins to be decremented. Hence, the pin counter has to get back to 0, or become 1 and unlock a frame.)

DBMS maintain their own buffer rather than use that of the OS so that they control when to let out pages from it through the implementation of pin counters and replacement policies.



Now read this

Intro to Database Systems : Indexing

We learned in the last lecture that when data is stored on disks, it is sorted as a set of blocks of data (also called pages). A block is accessed as a whole, in its entirety. On the disk, blocks are structured as link lists: they both... Continue →