Query Processing

1 minute read

Relational Operators Notes

Logical vs Physical Operators

Logical Operators: What they do

  • Examples: Union, Selection, Join, Grouping

Physical Operators How they do it (How they perfrom a Selection, which would be using a B-tree)

  • Example: Nest Loop join, sort-merge, hash join, index join

Relational Operators that we will Review

  • Select
  • Project
  • Join
  • Aggregation
  • Set Operators

Relational Model

  • Quick Review

Artist(name, year, country)

name year country
Wu Tang Clan 1992 USA
Notorious B.I.G. 1992 USA
Tupac 1994 USA

A relation is an unordered set that contain the relationship of attributes that represent entites

A tuple is a set of attribute values (also know as its domain) in the relation

  • Values are normally atomic/scalar
  • The special value NULL is a member of every domain

Relational Algebra is procedural query language,which takes Relation as input and generate relation as output

Select and Project

What is Select? -> Access path = a way to retrieve tuples from a table

The SELECT operation is used for selecting a subset of the tuples according to a given syntax of selection

Sigma(σ)Symbol denotes it in relational algebra

  • File scan
    • Scan the “Entire” File
    • I/O cost: O(n), where n = # of pages
    • EXPENSIVE $$$
  • Index Scan:
    • Uses an index available, think hash or B+ Tree
    • I/O cost: varies on the index we pick
      • Hash Index: O(1)
      • B+ Tree Index: O(logN) + X

How do we choose the right index?

  • Selectivity of an access path = fraction of data pages that need to be retrieved

Joins

  • Review Relational Algebra

  • A Regular Join returns the cross product

https://www.youtube.com/watch?v=pJWCwfv983Q&t=38s

Nested Loop Join

Block Nested Loop Join

Index Nested Loop Join

Joins and Memory

Sort-Merge Join (SMJ)

Hash Join (HJ)

SMJ Vs. HJ

Sort

WHy do we need to sort?

  • Becuase Tuples in a table of relational mode have no specific order

Sorting Algorithms

  • If Data fits in memory, then we can use standard sorting algorithm like quick sort

  • If Data does not fit in memory, we can use external sorting

Updated: