Skip to content
Sahithyan's S3
1
Sahithyan's S3 — Database Systems

Query Processing

The process of converting a high-level query into an efficient execution plan that retrieves the required data from storage.

Annotated expression specifying detailed evaluation strategy. Defines exactly what algorithm is used for each operation, and how the execution of the operations is coordinated.

SQL query is parsed into an internal representation. Then converted into relational algebra form. Syntax and schema (table/attribute names) are verified.

Multiple equivalent algebraic expressions exist. DBMS finds the most efficient strategy for executing a query among many logically equivalent alternatives.

Example:

σbalance<2500(πbalance(account))πbalance(σbalance<2500(account))σ_{\text{balance}<2500}(π_\text{balance}(\text{account})) ≡ π_{\text{balance}}(σ_{\text{balance}<2500}(\text{account}))

Cost estimation is done using statistics from the database catalog such as:

  • Number of tuples.
  • Size of tuples.
  • Distinct values in attributes.
  • Expected size of intermediate results.

Generally the total time to answer query.

Factors:

  • Disk I/O (dominant factor)
    Write cost is higher than read cost. Because of verification after write.
  • CPU
  • Network delay (in distributed systems)

Estimated as:

Cost=(Seeks×SeekCost)+(BlocksRead×ReadCost)+(BlocksWritten×WriteCost)\text{Cost} = (\text{Seeks} × \text{SeekCost}) + (\text{BlocksRead} × \text{ReadCost}) + (\text{BlocksWritten} × \text{WriteCost})

The query execution engine runs the chosen plan. Produces the result to the user or application.