Intro to Database Systems - Part 8 to 10 : Intermediate SQL
Using the next common keywords will return a column with only one row. They are called aggregate functions:
- 1) COUNT
- 2) AVG
- 3) SUM
- 4) MIN/MAX
1) COUNT : returns the number of rows matching a specific criteria.
- SELECT COUNT * FROM Skaters
- SELECT COUNT(CustomerID) AS OrdersFromCustomerID7 FROM Orders WHERE CustomerID=7;
- SELECT COUNT DISTINCT(CustomerID) FROM Orders : selects the number of distinct customers.
2) AVG : returns the average value of a column
- SELECT AVG(Price) FROM Products: the average price of the products column.
3) SUM : returns the sum value of a column
- SELECT SUM(Quantity) FROM OrderDetails : sum of all quantity fields for the “OrderDetails” table.
4) MAX/MIN : returns the largest/lowest value of the selected column.
- SELECT MAX(Price) AS HighestPrice FROM Products;
- SELECT MIN(Price) AS SmallestOrderPrice FROM Products;
Let’s investigate on some other common keywords now:
5) GROUP BY : a keyword to group elements by columns.
- SELECT Name SUM(Salary) FROM Customers GROUP BY Name;
- Another example is grouping all the grades of the same student under the same average: SELECT StudentName AVG(Grades) FROM Students GROUP BY (StudentID)
It’s important to keep it mind that the general form of GROUP BY is :
- SELECT column_name FROM table_name WHERE [ conditions ] GROUP BY column_name
6) HAVING : the keyword HAVING is to groups what WHERE is to columns. It’s a filter. It restricts the groups of returned rows to only those whose the condition is TRUE.
- SELECT Department, SUM(sales) AS “Total sales” FROM order_details GROUP BY Department HAVING SUM(sales) > 1000 : it filters in a way that only departments with sales greater than $1000 are returned.
- SELECT Department, COUNT( * ) AS “Number of employees” FROM Employees WHERE salary > 25000 GROUP BY Department HAVING COUNT ( * ) > 10: only departments with more than 10 employees will be returned.
For now we’ve covered a pretty good intro to SQL tables. Let’s talk about 2 features of SQL that are a little more abstract now :
- VIEWS : virtual SQL tables
- NULL values : missing unknown data
1) VIEWS
The view is just like a table. Here is the general syntax:
- CREATE VIEW view_name AS SELECT column_name(s) FROM table_name WHERE condition
Now why should you use a VIEW instead of a normal table? For 3 reasons:
- 1) Hide complexity: if you want to join tables or have complex calculations to do, you can code it into a view and then select from a view as you would do with a table
- 2) Security: permissions can be set on views instead of underlying tables. We can then surface only on the data the user needs to see.
- 3) Simplify code applied to the db : during refactoring, replace the table with a view of the same name. The original schema is preserved, we are altering only the copy of it.
We can CREATE views and then manipulate them:
- CREATE VIEW “Category Sales For 1997” AS SELECT DISTINCT CategoryName, SUM(ProductSales) AS CategorySales FROM “Product Sales for 1997” GROUP BY CategoryName : calculates the total sale for each category in 1997.
- SELECT * FROM “Category Sales For 1997”
- SELECT * FROM “Category Sales For 1997” WHERE CategoryName=‘Beverages’ : we add a condition in order to see the total sale only for the category “Beverages”.
To DROP views :
- DROP VIEW view_name
But what if we want to drop the table, and the table has a view associated it with? That’s when we use DROP TABLE CASCADE.
2) NULL VALUES
To select records that are null in our db we :
- SELECT LastName, FirstName, Address FROM Persons WHERE Address IS NULL
We can NOT compare to null. We get an “unknown” answer. We can evaluate this result either to “true”, “false” or again “unknown”. If we
NOT unknown we’ll obtain unknown once more.
Also * A OR B = unknown* when one of the following situations occur:
- (A = false and B = unknown)
- (A = unknown and B = false)
- (A = unknown and B = unknown)
Same for A AND B = unknown.
As a last note on NULL, please be aware that COUNT ( * ) actually counts the null values. The other aggregate operations simply ignore the NULL values.