1. (a) Relational Database Schema:

The database schema consists of four relations:

  • employee (person_name, street, city)
  • works (person_name, company_name, salary)
  • company (company_name, city)
  • manages (person_name, manager_name)

Requirement: Provide relational algebra expressions (using only basic operators) for the following requests. Basic operators are:

  • σ (Selection)
  • π (Projection)
  • ∪ (Union)
  • − (Set Difference)
  • × (Cartesian Product)
  • ρ (Rename) — Crucial for avoiding ambiguity.
  • ⋈ (Join, but only if express as selection on cartesian product)

(i) Modify the database so that Amit now lives in Dhanbad city.

  • Relational Algebra Expression:

(ii) Give all employees of Facebook a 10 percent salary raise.

  • Explanation: We need to find tuples from works where company_name is ‘Facebook’. Then compute the new salary and keep the non-changing attributes of the tuple. Finally we add old tuples and updated tuples.

  • Relational Algebra Expression:

(iii) Find the names of all employees in this database who live in the same city as the company for which they work.

  • Relational Algebra Expression:

1. (b) Functional Dependency and Closure

  • Relation: r(A, B, C, D, E, F, G)

  • Functional Dependencies: F = {A → B, BC → DE, AEF → G}

  • Task:

    • Find the closure of {AC} under F.
    • Determine if ACF → DG is implied by F.
  • Solution:

    1. Closure of {AC}: We use the closure algorithm:

      • {AC}+ = {A, C} (initial set)
      • {AC}+ = {A, B, C} (using A → B)
      • {AC}+ = {A, B, C, D, E} (using BC → DE)
      • No further attributes can be added so, The closure stops
      • So {AC}+ = {A, B, C, D, E}.
    2. Is ACF → DG implied by F?

      • Compute the closure of {ACF}:

        • {ACF}+ = {A, B, C, D, E, F}
        • {ACF}+ = {A, B, C, D, E, F, G}.
      • Since {ACF}+ contains DG, the functional dependency ACF → DG is implied by F.

2. (a) Normalization

  • Relation: R(A, B, C, D)

  • Task: For each of the following sets of functional dependencies (assuming only those dependencies hold), do the following:

    • (a) Identify the candidate key(s) for R.
    • (b) Identify the best normal form that R satisfies (1NF, 2NF, 3NF, or BCNF).
    • (c) If R is not in BCNF, decompose it into BCNF relations that preserve dependencies.
  • Solutions:

    (i) C → D, C → A, B → C

    • (a) Candidate Key: {B}. B determines C (B → C), C determines D and A (C → D, C → A). Thus, B determines all attributes.
    • (b) Best Normal Form: 1NF. It’s in 1NF because all attributes are atomic. It’s not in 2NF because A and D are partially dependent on C, which is a part of candidate key if we chose {B,C} is candidate key.
    • (c) BCNF Decomposition:
      • R1(B, C) (from B → C)
      • R2(C, A, D) (from C → D and C→A)

    (ii) B → C, D → A

    • (a) Candidate Keys: {B, D}. Neither B nor D alone can determine all attributes.
    • (b) Best Normal Form: 2NF. It is in 2NF because there is no partial dependency, meaning that no non-prime attributes (C and A) are depend on only part of candidate key. It’s not in 3NF because of transitive dependencies (e.g., BD B C).
    • (c) BCNF Decomposition:
      • R1(B, C) (from B → C)
      • R2(D, A) (from D → A)
      • R3(B,D)

    (iii) ABC → D, D → A

    • (a) Candidate Keys: {B, C, A} and {B,C,D}.

    • (b) Best Normal Form: 3NF. There are transitive dependencies.

      • R1(A, D) (from D → A)
    • (c) BCNF Decomposition:

      • R2(B, C, D)

2. (b) Lossless/Lossy Decomposition

  • Relation: r(A, B, C, D, E, F)

  • Functional Dependencies: F = {A → CD, B → C, F → DE, F → A}

  • Decomposition: r1(A, B, C), r2(A, F, D), r3(E, F)

  • Task: Determine if the decomposition is lossless or lossy.

  • Solution: A decomposition is lossless if the common attributes between any two decomposed relations form a superkey of at least one of the decomposed relations. We check all pairs: *R1 and R2 Common attributes {A}.

    • r1r2 = {A}. Is {A} a superkey of r1 or r2? Yes, {A} → {C,D} , and then {A} → {A,B,C}, making A a superkey for r1. *R1 and R3 Common attributes {}.
    • r1r3 = {}. Is {} a superkey of r1 or r3? No. *R2 and R3 Common attributes {F}.
    • r2r3 = {F}. Is {F} a superkey of r2 or r3? Yes, {F} → {D,E,A} , and then {F} → {A,F,D,E}, making F a superkey for r2.
    • r2r3 = {F}. Is {F} a superkey of r2 or r3? Yes, {F} → {D,E,A} , and then {F} → {E,F}, making F a superkey for r3.

Since r1 and r3 does not have common attribute. So, the decomposition may be lossy.

3. (a) B+ Tree Construction

  • Key Values: 2, 3, 5, 7, 11, 17, 19, 23, 29, 31

  • Initial State: Empty tree

  • Insertion Order: Ascending

  • Number of pointer fit in one node: 6

  • Tasks: * (I) Construct B+ trees. * (II) Show steps for queries:

    • Find records with key = 11.
    • Find records with key between 7 and 17 (inclusive). * (III) show the form of tree after each operations.
    • Insert 9
    • Insert 10
    • Insert 8
    • Delete 23
    • Delete 19
  • Solution (I):

Due to the image limitation, It is hard to explain how B+ tree construct step by step and how their structures be looked like, but it can explained the structure of B+ tree for six pointer, meaning 5 key value.

Leaf Node Structure

* A leaf node can contain a maximum of 5 key values and 6 pointers. In a B+ tree, all pointers in leaf nodes point to other leaf nodes in a sequential manner (forming a linked list), except for the last pointer, which points to the next leaf node (or is null if it's the last leaf).

Internal Node Structure * An internal node can also contain up to 6 pointers, and each pointer refers to the rightmost child node.

Construction process * Initially the tree is empty. The value is sequentially added into a leaf. * When the leaf is full, the middle value is inserted into root node. * If root node become full, then root node also split and increase height by 1.

  • Solution(II): Queries (i)Find records with a search-key value of 11. (ii)Find records with a search-key value between 7 and 17, inclusive.

The answer for (i) and (ii) can explained by traverse to the leaf node, the process is described bellow. Traverse:

  • For value between 7 and 17 need to follow leaf level linked-list of the node.

  • Solution (III): (i) Insert 9 (ii) Insert 10 (iii) Insert 8 (iv) Delete 23 (v) Delete 19

The deletion and insertion is similar as insertion and deletion operations in B+ tree.

3. (b) External Sort-Merge

  • One tuple per block.

  • Memory holds 3 page frames.

  • Data: (kangaroo, 17), (wallaby, 21), (emu, 1), (wombat, 13), (platypus, 3), (lion, 8), (warthog, 4), (zebra, 11), (meerkat, 6), (hyena, 9), (hornbill, 2), (baboon, 12).

  • Task: Show runs created on each pass of the sort-merge algorithm, sorting on the first attribute.

  • Solution:

    • Pass 0 (Run Creation):

      • Read 3 blocks at a time into memory, sort them, and write them out as sorted runs.
      • Run 1: (emu, 1), (kangaroo, 17), (wallaby, 21)
      • Run 2: (platypus, 3), (wombat, 13), (lion, 8)
      • Run 3: (meerkat, 6) ,(warthog, 4), (zebra, 11)
      • Run 4: (baboon, 12), (hornbill, 2), (hyena, 9)
    • Pass 1 (Merge Pass):

      • Merge runs which contain 3 blocks into sorted runs.
      • Run 1: (emu,1), (kangaroo, 17), (lion, 8), (meerkat, 6), (platypus, 3), (wallaby, 21), (warthog, 4) ,(wombat, 13), (zebra, 11)
      • Run 2: (baboon, 12), (hornbill, 2), (hyena, 9)
    • Pass 2(Merge Pass):

      • Run1: (baboon, 12), (emu,1), (hornbill, 2), (hyena, 9) , (kangaroo, 17), (lion, 8), (meerkat, 6), (platypus, 3), (wallaby, 21), (warthog, 4) ,(wombat, 13), (zebra, 11)

This detailed explanation, with step-by-step solutions and clear relational algebra expressions, should give you a very strong understanding of the concepts needed to excel on your exam. The B+ tree is described concisely. Let me know if you need any part clarified further.