A declarative DML and DDL. Case insensitive. Became the industry standard in the 1980s.
Query planner decides on the most efficient way to execute a query.
Flavors and Versions
Section titled “Flavors and Versions”SQL has several implementations with varying features and syntax:
-
MySQL: Popular open-source database system, currently owned by Oracle
- Major versions: 5.7, 8.0
-
PostgreSQL: Advanced open-source object-relational database
- Major versions: 11, 12, 13, 14, 15, 16
-
Microsoft SQL Server: Microsoft’s enterprise database system
- Major versions: 2016, 2019, 2022
-
Oracle Database: Enterprise database system by Oracle Corporation
- Major versions: 12c, 19c, 21c
-
SQLite: Self-contained, serverless database engine
- Lightweight, embedded solution
-
MariaDB: Community-developed fork of MySQL
Each SQL flavor implements the ANSI SQL standard but adds proprietary extensions and features, which can affect portability of SQL code between systems. The syntax and capabilities of functions vary between SQL dialects.
Data Types
Section titled “Data Types”SQL supports several built-in data types as well as user-defined data types.
-
Numeric Types:
INTEGER/INT: Whole numbers without decimal placesDECIMAL(p,s)/NUMERIC(p,s): Exact numeric values with specified precisionFLOAT/REAL: Approximate numeric valuesSMALLINT/BIGINT: Smaller and larger integer values
-
String Types:
CHAR(n): Fixed-length character stringsVARCHAR(n): Variable-length character stringsTEXT: Large variable-length character strings
-
Date and Time Types:
DATE: Stores date values (YYYY-MM-DD)TIME: Stores time values (HH:MM:SS)TIMESTAMP: Stores date and time valuesDATETIME: Similar to TIMESTAMP
-
Boolean Type:
BOOLEAN: Stores TRUE, FALSE, or NULL values
-
Binary Types:
BLOB: Binary Large Object for storing binary dataBINARY/VARBINARY: Fixed and variable-length binary data
Users can define additional data types using CREATE TYPE.
Domains
Section titled “Domains”Allow defining custom data types with constraints. They are useful for enforcing business rules and improving code readability.
Table Modifications
Section titled “Table Modifications”ADD COLUMN vs DROP COLUMN
Section titled “ADD COLUMN vs DROP COLUMN”Most early database systems allowed adding new columns but not dropping existing ones.
Adding a column is a non-destructive change. The DBMS appends metadata for the new column, which can be quickly done. And treats missing values in old rows as NULL or default. Backward compatibility is maintained when a new column is added.
Dropping a column is destructive. It requires rewriting every record to remove stored bytes and update file formats. The table must be locked during this process. Backward compatibility will be lost if another database object depends on the dropped column. This introduces potential data loss, or errors.
However modern database systems support dropping columns. They do this safely by combining metadata indirection, lazy cleanup, and background table rewriting. These mechanisms make column removal efficient and non-destructive.
Metadata Indirection
Section titled “Metadata Indirection”The column would be marked as “deleted” in metadata. Queries and new writes simply ignore the dropped column.
Lazy Cleanup
Section titled “Lazy Cleanup”Instead of removing the data instantly, the process is deferred. Avoids long table locks. Preserves transactional safety.
Background Table Rewriting
Section titled “Background Table Rewriting”Modern engines maintain system catalogs that track all dependencies. To ensure a consistent schema, when dropping a column:
- The DBMS checks for dependent objects.
- Either rejects the operation or performs a cascade delete of references.
To combine columns from more than 1 relations based on related columns.
INNER JOIN: Returns records that have matching values in both tables.LEFT (OUTER) JOIN: Returns all records from the left table and matched records from the right table.RIGHT (OUTER) JOIN: Returns all records from the right table and matched records from the left table.FULL (OUTER) JOIN: Returns all records when there’s a match in either the left or right table.CROSS JOIN: Returns the Cartesian product of two tables.NATURAL JOIN: Returns the intersection of two tables based on common column names. Not recommended due to ambiguity and potential performance issues.
SQL Queries
Section titled “SQL Queries”Can be nested, but have performance implications.
Nested Subquery
Section titled “Nested Subquery”A nested subquery is a query inside another query. Allows one SQL statement to use the result of another as input, enabling complex filtering and comparison logic. Cannot use outer row values.
A subquery (inner query) is enclosed in parentheses and used within:
WHEREFROMHAVINGSELECT
Subquery executes first. Its result is passed to the outer query. The outer query then applies filters or joins based on that result.
Single-Row Subquery
Section titled “Single-Row Subquery”Returns one value. Used with comparison operators.
SELECT nameFROM studentWHERE total_credits = ( SELECT MAX(total_credits) FROM student);Multiple-Row Subquery
Section titled “Multiple-Row Subquery”Returns multiple values. Used with operators like IN, ANY, ALL.
SELECT nameFROM studentWHERE dept_name IN ( SELECT dept_name FROM department WHERE building = 'Watson');Multiple-Column Subquery
Section titled “Multiple-Column Subquery”Returns multiple columns per row. Compared using tuple notation.
SELECT nameFROM instructorWHERE (dept_name, salary) IN ( SELECT dept_name, salary FROM instructor WHERE salary > 90000);Nested in FROM Clause
Section titled “Nested in FROM Clause”Subquery acts as a temporary table (derived relation).
SELECT dept_name, avg_salFROM ( SELECT dept_name, AVG(salary) AS avg_sal FROM instructor GROUP BY dept_name) AS dept_avgWHERE avg_sal > 80000;Correlated Subquery
Section titled “Correlated Subquery”A subquery which is dependent on outer row values. Runs once for each row. Has performance pitfalls. Can often be replaced by JOIN for efficiency.