Title
- Micro-topic: A1 (linear search) (Page 695)
- Micro-topic: A2 (clustering index, equality on key) (Page 697)
- Micro-topic: A3 (clustering index, equality on non-key) (Page 697)
- Micro-topic: A4 (secondary index, equality) (Page 697)
- Micro-topic: A5 (clustering index, comparison) (Page 698)
- Micro-topic: A6 (secondary index, comparison) (Page 698)
H4: A1 (Linear Search)
-
Concept: The most basic approach. The system scans each block of the relation’s file and tests every record against the selection condition.
-
Applicability: Always applicable, regardless of file ordering, index availability, or selection condition.
-
Cost:
- Block Transfers: (where is the number of blocks in the relation’s file)
- Seeks: 1 (for the initial seek to start the file scan; additional seeks may be required if the file blocks are not stored contiguously, cost are negligible ).
- If the selection is equality on a key.
- Average cost:
- Block Transfers:
- Because the search can end when that tuple is found.
- Average cost:
-
Notes: Simple but can be inefficient for large relations, especially if only a few tuples satisfy the selection condition.
H4: A2 (Clustering Index, Equality on Key)
-
Concept: Uses a B+-tree clustering (primary) index on a key attribute with an equality condition (e.g.,
WHERE ID = '12345'). -
Applicability: Requires a clustering index on the attribute involved in the equality condition.
-
Assumption:
- To model the common situation that the internal nodes of the index are in the in-memory buffer, can be set to 1.
-
Cost:
- Block Transfers: (where is the height of the B+-tree)
- transfers to traverse the index to the leaf.
- 1 transfer to fetch the record.
- Seeks: (one seek per level of the tree, plus one for the data block).
- Block Transfers: (where is the height of the B+-tree)
-
Notes: Very efficient for retrieving a single record based on a key.
H4: A3 (Clustering Index, Equality on Non-Key)
- Concept: Uses a clustering B+-tree index on a non-key attribute with an equality condition (e.g.,
WHERE dept_name = 'Music'). - Applicability: Requires a clustering index on the attribute. Because the file is sorted by the non-key clustering index, all tuples with the same indexed attribute value are stored consecutively.
- Cost:
- Block Transfers:
- transfers to traverse the index to the first leaf with a matching value.
- transfers to read the contiguous blocks containing all matching records (b is number of block containing records).
- Seeks: seeks for each level of the B+-tree, 1 seek for the first block.
- Block Transfers:
- Notes: Efficient if many records satisfy the condition.
H4: A4 (Secondary Index, Equality)
- Concept: Uses a secondary (non-clustering) B+-tree index on any attribute (key or non-key) with an equality condition.
- Applicability: Requires a secondary index on the attribute.
- Cases:
- Equality on Key:
- Cost: . Identical to A2.
- Equality on Non-Key:
- Worst-Case Cost: (where n is the number of records retrieved).
- I/O operations to traverse the index.
- Up to I/O operations to fetch records (one per record, as records may reside on different blocks with no guaranteed order).
- Best-Case Cost: will be much less than worst-case in large in-memory buffer.
- Equality on Key:
- Notes: Can be very expensive for non-key attributes if many records match, because each record retrieval may require a separate disk I/O. If secondary index is used on a B+-tree file organization or other file organizations that may require relocation of records, accessing a record through such a secondary index is more expensive.
H4: A5 (Clustering Index, Comparison)
- Concept: Uses a clustering B+-tree index with a comparison condition (>, >=, <, ⇐).
- Applicability: Requires a clustering index on the attribute in the comparison.
- Cases:
- or :
- Use the index to find the first tuple where .
- Scan the file forward from that point.
- or :
- Scan the file from the beginning until a tuple is found where or , respectively. Do not require index lookup.
- Index is not useful.
- Cost:
- Identical to A3, equality on non-key.
- or :
H4: A6 (Secondary Index, Comparison)
- Concept: Uses a secondary B+-tree index with a comparison condition.
- Applicability: Requires a secondary index on the attribute in the comparison.
- Process: Scan index leaf nodes, retrieving records, from the smallest value up to v (for < and ⇐), or from v up to the maximum value (for > and >=).
- Cost:
* Worst-Case Cost: (where n is the number of records fetched)
- Notes: Can be very inefficient if many records match, because each record may require a separate disk I/O. Should only be used if very few records match.