TDT4145 - Data Modelling, Databases and Database Management Systems

Undervisningsmateriell

Datamodellering

Begrep Forklaring
Miniverden De konseptene fra virkeligheten som skal modelleres.
Datamodell Et sett begreper for å beskrive struktur (hvilke typer data, sammenhenger og restriksjoner) og operasjoner (handlinger på data: innsetting, endring, sletting og spørring).
Konseptuelle modeller For å forstå og dokumentere. For eksempel ER-modeller.
Implementasjonsmodeller For å realisere. For eksempel relasjonsdatabasemodeller.
Fysiske modeller Kontroll på lagring og optimalisering. Disse er produktspesifikke modeller.
Database En strukturert samling av relaterte data.
Informasjon Data + tolkning
DBMS Databasehåndteringssystem. Software som styrer lagring og tilgang på data i databasen.
Databasesystem DBMS + data
Selvbeskrivende DBMS Inneholder både data og beskrivelser om dataen (metadata). Dette gjøres generelt i DBMS-katalogen.
Selvbeskrivende datamodeller Inkluderer databeskrivelser i selve dataen. For eksempel XML og JSON.
Program-data uavhengighet En egenskap som går ut på at strukturendring i datafil ikke medfører at man må endre programmene som har tilgang til dataen.
Flerbrukersstøtte Flere brukere kan hente ut og lagre data samtidig uten problemer.
Entitetsintegritet Alle relasjonsskjemaer skal ha en primærnøkkel. Primærnøkler må være unike og kan ikke være NULL.
Referanseintegritet Hvis en fremmednøkkel ikke er NULL, må den referere til en eksisterende rad i den tabellen den peker mot.
Databasetilstand Database-forekomsten (dataene) på et gitt tidspunkt.
Konsistent databasetilstand At databasen tilfredsstiller alle integritetsregler og relevante regler fra miniverdenen.
Transaksjon En serie operasjoner på en database som sammen bevarer databasens konsistens.

ER-modellering

Viktig! ER-notasjon:

[Figur: ER-notasjon (2017)]

Entiteter og entitetsklasser

Begrep Forklaring
Entitet Objekt som eksisterer i miniverdenen. Et element av en entitetsklasse.
Entitetsklasse Mengden av alle likeartede entiteter som er av samme klasse (type) og har samme egenskaper.
Svak entitetsklasse En entitetsklasse der nøkkelen (delvis nøkkel) bygger på nøkkelen fra en annen identifiserende entitetsklasse, gjennom en identifiserende relasjonsklasse. Den må være eksistensavhengig av deltakelse i den identifiserende relasjonsklassen.

Attributter

Attributter beskriver entiteter. Hvert attributt henter sine mulige verdier fra et domene (datatype).

Type attributt Notasjon Forklaring
Enkel attributt (Standard) Ingen Attributt med én enkelt, udelelig verdi
Flerverdiattributt Dobbel linje Kan holde flere verdier
Sammensatt attributt Egne subattributter Er satt sammen av flere subattributter
Avledet attributt Stiplet linje Kan avledes fra andre attributter, trenger ikke lagres eksplisitt
Nøkkelattributt Understreket Identifikator for entiteten

Relasjoner og relasjonsklasser

Begrep Forklaring
Relasjon Sammenheng mellom to eller flere entiteter. Kan ha attributter på samme måte som entiteter. Relasjonen eksisterer ikke uten de entitetene som deltar.
Relasjonsklasse Mengden av likeartede relasjoner mellom de samme entitetsklassene.
Rekursiv relasjonsklasse Relasjonsklasser der samme entitetsklasse inngår flere ganger. En entitetsklasse kan da ha flere roller i relasjonsklassen.

Kardinalitet og strukturelle restriksjoner

Foreleser i emnet foretrekker strukturelle restriksjoner

Begrep Forklaring
Kardinalitet Ved bruk av 1 eller N på hver side av relasjonen antydes om relasjonen er en-til-en, en-til-mange eller mange-til-mange. Noteres på motsatt side av relasjonen i forhold til den aktuelle entitetsklassen. Enkel strek angir delvis deltagelse (entitetene trenger ikke å delta i relasjonen), dobbel strek angir total deltagelse (entitetene må delta i relasjonen minst én gang).
Strukturell restriksjon (min, max) Antyder at en entitet opptrer i relasjonen minst min ganger og maks max ganger. Vanlige restriksjoner er (0, 1), (1, 1), (0, n) og (1, n). Noteres på samme side av relasjonen som den aktuelle entiteten.

Forekomstdiagram

EER-modellering

Spesialisering og generalisering

Begrep Forklaring
Superklasse-subklasserelasjon Subklassene arver egenskapene (attributter og relasjoner) til superklassen, men kan også ha særegne attributter og relasjoner. En subklasse er en delmengde av den tilhørende superklassen.
Total spesialisering Alle entiteter i superklassen må delta i minst en subklasse. Noteres med dobbel strek.
Overlappende subklasser En entitet i superklassen kan delta i flere enn en subklasse. Noteres med O eller mangel på notasjon.
Disjunkte subklasser En entitet i superklassen kan ikke delta i flere enn en subklasse. Noteres med D.
En entitet kan delta i Disjunkt Overlappende
Delvis 0-1 subklasser 0-n subklasser
Total 1 subklasse 1-n subklasser

Kategorier

En kategori representerer en samling av entiteter som er en delmengde av unionen av entitetene fra ulike entitetsklasser (kategoriens superklasser). Blir også kalt en union-type. En kategori kan være delvis eller total. En total kategori inneholder alle entitetene i unionen av superklassene og markeres med dobbel strek mellom kategorien og sirkelen.

Relasjonsdatabaser

Relasjonsdatabaser er bygd opp av tabeller der kolonner angir attributter og rader angir entiteter eller relasjoner.

Mapping fra ER-modell

Konsept Hvordan mappe
Regulære entitetsklasser Entitetsklassen får egen tabell. Attributter blir kolonner. Sammensatte attributter splittes opp. Primærnøkkel fra ER-modell brukes.
Svake entitetstyper Legger til primærnøkkel (fremmednøkkel) fra identifiserende entitetsklasser, sammen med klassens delvise nøkkel. Ellers likt.
Binære 1:1 relasjoner Fremmednøkkel kan legges i en av entitetsklassetabellene, eller så kan relasjonen få egen tabell.
Binære 1:N relasjoner Fremmednøkkel legges vanligvis i tabellen for entitetsklassen på N-siden (den klassen der hver entitet deltar maks én gang i relasjonen), eller så kan relasjonen få egen tabell.
Binære N:N relasjoner Relasjonen får egen tabell med fremmednøkler mot begge entitetsklassene; disse utgjør til sammen primærnøkkelen.
Flerverdiattributter Får egen tabell med fremmednøkkel mot entitetsklassen.
N-ære relasjonsklasser (N > 2) Får egen tabell med fremmednøkler mot alle tilhørende entitetsklasser.
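
Som en illustrasjon av mappingen av N:N-relasjoner: under følger en mulig SQL-realisering (kun en skisse) av den rekursive relasjonsklassen JobberMed mellom to Person-entiteter, som også brukes i EXISTS-eksempelet senere. Primærnøkkelen settes sammen av fremmednøklene mot begge deltakerne. (Tabellen Person defineres i SQL-delen nedenfor.)

-- Mulig realisering av den rekursive N:N-relasjonsklassen JobberMed
CREATE TABLE JobberMed (
    PnrA    INTEGER NOT NULL,
    PnrB    INTEGER NOT NULL,
    CONSTRAINT JobberMed_PK PRIMARY KEY (PnrA, PnrB),
    CONSTRAINT JobberMed_FK1 FOREIGN KEY (PnrA) REFERENCES Person (Pnr),
    CONSTRAINT JobberMed_FK2 FOREIGN KEY (PnrB) REFERENCES Person (Pnr));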

Mapping fra EER-modell

Konsept Hvordan mappe
Spesialisering/generalisering A) Separate tabeller for superklasse og alle subklasser. Kommentar: Data om en entitet spredt over flere tabeller, må bruke join. B) Separate tabeller for subklasser som inkluderer attributter fra superklassen. Kommentar: Krever total og disjunkt spesialisering. C) En tabell som inkluderer attributter fra superklassen og alle subklasser. Kommentar: Ved disjunkt spesialisering kreves et type-attributt. Ved overlappende spesialisering kreves en boolsk type-vektor (en kolonne per type).
Kategorier En tabell for kategori-entitetsklassen, eventuelt med typeattributt. Hvis superklassene har ulike nøkler, får kategorien en surrogatnøkkel, og hver superklassetabell får surrogatnøkkelen som fremmednøkkel mot kategoritabellen.

Relasjonsalgebra

I relasjonsalgebra bruker vi operasjoner for å manipulere relasjonsdatabasetabeller. Merk at en rad her er en tuppel og en tabell er en mengde av tuppler. Vi vil altså ikke ha duplikater. Operatorene er lukket over tabeller (både input og output er mengder).

Operasjoner

Operasjon Operator Forklaring
Seleksjon \sigma_{\text{Betingelse}}(A) Returnerer radene i A som oppfyller betingelsen.
Projeksjon \Pi_{\text{Kolonnenavn...}}(A) Returnerer de oppgitte kolonnene i A.
Union \cup(A, B) Returnerer unionen av A og B. Krever like kolonnenavn.
Snitt \cap(A, B) Returnerer snittet av A og B. Krever like kolonnenavn.
Differanse -(A, B) Returnerer radene i A som ikke er i B. Krever like kolonnenavn.
Kartesisk produkt \times(A, B) Kombinerer alle rader i A med alle rader i B.
Indre join \Join_{\text{Betingelse}}(A, B) Kombinerer radene i A og B som sammen oppfyller betingelsen.
Naturlig join *(A, B) Kombinerer radene i A og B der alle kolonner i A og B med samme navn også har samme verdi.
Venstre ytre join =\Join_{\text{Betingelse}}(A, B) Kombinerer radene i A og B som sammen oppfyller betingelsen, men inkluderer også resten av radene i A (med NULL-verdier i nye kolonner).
Høyre ytre join \Join=_{\text{Betingelse}}(A, B) Kombinerer radene i A og B som sammen oppfyller betingelsen, men inkluderer også resten av radene i B (med NULL-verdier i nye kolonner).
Full ytre join =\Join=_{\text{Betingelse}}(A, B) Kombinerer radene i A og B som sammen oppfyller betingelsen, men inkluderer også resten av radene i A og B (med NULL-verdier i nye kolonner).
Omdøping \rho_{\text{Tabellnavn(Attributtnavn...)}}(A) Returnerer en ny tabell med nye kolonnenavn.
Sortering \uparrow\downarrow_{\text{Attributt...}}(A) Returnerer en tabell der radene er sortert etter de oppgitte attributtene. Man kan spesifisere ASC (stigende) eller DESC (synkende) rekkefølge.

Gruppering og aggregering

Gjør utregninger "per gruppe": F_{\text{Grupperingsattributter, Aggregeringsfunksjoner}}(A). Visualisert med graf plasseres grupperingsattributtene på venstre side og aggregeringsfunksjonene på høyre side. Man kan også definere et alias med AS. Eksempel: La A = Bilde(Fotograf, Motiv). Da vil F_{\text{Fotograf, COUNT(Motiv) AS AntallMotiver}}(A) returnere en tabell Bilde(Fotograf, AntallMotiver) med antall motiver hver fotograf har avbildet.

Funksjon Forklaring
SUM(attributt) Sum
AVG(attributt) Gjennomsnitt
MIN(attributt) Minimum
MAX(attributt) Maksimum
COUNT(attributt) Antall verdier (unntatt NULL)
COUNT(*) Antall rader
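
Til sammenlikning kan grupperingseksempelet over uttrykkes i SQL slik (forutsatt en tabell Bilde(Fotograf, Motiv) som i eksempelet):

-- Antall motiver per fotograf (SQL-ekvivalent til relasjonsalgebra-eksempelet)
SELECT Fotograf, COUNT(Motiv) AS AntallMotiver
FROM Bilde
GROUP BY Fotograf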

SQL

Deklarativt spørrespråk ("hva, ikke hvordan") - vi trenger ikke tenke på optimalisering. Ikke mengdeorientert slik som relasjonsalgebra: resultattabellene kan ha like rader, så man må eksplisitt be om at duplikater fjernes.

Opprette data

DDL - Data Definition Language

CREATE TABLE /* opprette tabell */
ALTER TABLE /* endre tabell */
DROP TABLE /* fjerne tabell */
CREATE TABLE Person (
    Pnr     INTEGER NOT NULL,
    Navn    VARCHAR(30),
    Alder   INTEGER,
    CONSTRAINT Person_PK PRIMARY KEY (Pnr));

Fremmednøkkelrestriksjoner

-- (som del av en CREATE TABLE- eller ALTER TABLE-setning)
CONSTRAINT RestriksjonsNavn FOREIGN KEY (Attributt)
    REFERENCES FremmedTabell (FremmedAttributt)
    ON UPDATE <handling>
    ON DELETE <handling>

Alternativer for <handling>: CASCADE, SET NULL, SET DEFAULT, RESTRICT og NO ACTION.
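
Et minimalt eksempel (en skisse som forutsetter tabellen Hund med eier-referanse slik den brukes i join-eksemplene senere; kolonnen HundId er kun et illustrativt valg av primærnøkkel):

CREATE TABLE Hund (
    HundId   INTEGER NOT NULL,
    Navn     VARCHAR(30),
    EierPnr  INTEGER,
    CONSTRAINT Hund_PK PRIMARY KEY (HundId),
    CONSTRAINT Hund_FK FOREIGN KEY (EierPnr) REFERENCES Person (Pnr)
        ON UPDATE CASCADE    -- EierPnr følger med hvis Pnr endres
        ON DELETE SET NULL); -- EierPnr settes til NULL hvis personen slettes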

Manipulere data

DML - Data Manipulation Language

Sette inn data - INSERT INTO

INSERT INTO Person VALUES (1, 'Ola', 24);
INSERT INTO Person VALUES (2, 'Kari', 45);
INSERT INTO Person VALUES (3, 'Per', 14);

Oppdatere data - UPDATE

UPDATE Person
SET Navn = 'Karianne'
WHERE Pnr = 2

Slette data - DELETE

DELETE FROM Person
WHERE Navn = 'Ola'

Spørringer - SELECT

Merk: SELECT * angir at vi ønsker alle attributter.

SELECT attributt, ...
FROM tabell, ...
WHERE betingelse

Relasjonsalgebraekvivalent: \Pi_{\text{attributt, ...}}(\sigma_{\text{betingelse}}(\text{tabell}))

Spesifisere krav - WHERE

Følgende operatorer virker som forventet.

Operator
AND, OR
=, !=, <, >, <=, >=

Merk: <> er det samme som !=

-- Finner personnummer og navn for personer over 20
SELECT Pnr, Navn
FROM Person
WHERE Alder > 20

Fjerne duplikater - DISTINCT

SELECT DISTINCT Navn
FROM Person

Sortering - ORDER BY

SELECT Navn
FROM Person
ORDER BY Alder ASC

Strengsammenlikning - LIKE

% angir vilkårlig antall tegn. _ angir ett tegn.

-- Finner alle attributter for personer med navn som begynner på P
SELECT *
FROM Person
WHERE Navn LIKE 'P%'

Data fra flere tabeller: kartesisk produkt og ulike typer join

Plasseres i FROM-delen. Gitt tabeller A og B:

Funksjon Beskrivelse
A, B Kartesisk produkt
A CROSS JOIN B Kartesisk produkt
A NATURAL JOIN B Natural join
A (INNER) JOIN B Indre join
A LEFT (OUTER) JOIN B Venstre ytre join
A RIGHT (OUTER) JOIN B Høyre ytre join
A FULL (OUTER) JOIN B Full ytre join

Parenteser antyder frivillig spesifisering. Se relasjonsalgebra for nærmere forklaring av funksjonene.

Betingelse Beskrivelse
NATURAL Natural join
ON betingelse Spesifiserer join-betingelsen
USING (attributt, ...) Spesifiserer hvilke attributter som må være like i de to tabellene

-- Finner navn på hund og eier for personer over 60 år
SELECT Person.Navn, Hund.Navn
FROM Person INNER JOIN Hund ON Person.Pnr = Hund.EierPnr
WHERE Person.Alder > 60

Aggregering

Se aggregering i relasjonsalgebra. Funksjonene ignorerer NULL-verdier. Med DISTINCT opererer funksjonene på unike verdier. AS definerer et alias.

-- Finner snittalderen av alle personer
SELECT AVG(Alder) AS SnittAlder
FROM Person

Gruppering - GROUP BY

Definerer grupper (partisjoner) av rader. Må komme etter en eventuell WHERE.

-- Finner snittalderen for hvert navn
SELECT Navn, AVG(Alder) AS SnittAlder
FROM Person
GROUP BY Navn

Betingelse etter gruppering - HAVING

Som WHERE, men etter gruppering.

-- Finner snittalderen for hvert navn, men kun hvis snittet er over 60
SELECT Navn, AVG(Alder) AS SnittAlder
FROM Person
GROUP BY Navn
HAVING AVG(Alder) > 60

Nøstede spørringer

I WHERE-delen kan vi sammenlikne med resultatet fra en spørring. I FROM-delen kan vi gjøre en ny spørring på resultatet av en spørring.

Sammenlikningsoperatorer

=, !=, <, >, <=, >= virker som forventet.

-- Finner navnet til den eldste personen
SELECT Navn
FROM Person
WHERE Alder = (SELECT MAX(Alder) FROM Person)

ANY, ALL
-- Finner navnet til den eldste personen (alternativ måte)
SELECT Navn
FROM Person
WHERE Alder >= ALL (SELECT Alder FROM Person)

IN, NOT IN
-- Finner snittalderen til forfattere
SELECT AVG(Alder)
FROM Person
WHERE Navn IN (SELECT Forfatter FROM Bok)

EXISTS

Dette er en korrelert delspørring, dvs. at verdier fra den ytre spørringen (her Person.Pnr for hver Person-rad) brukes inne i delspørringen.

-- Finner navn på personer som ikke jobber med noen andre personer
SELECT Navn
FROM Person
WHERE NOT EXISTS (
    SELECT *
    FROM JobberMed
    WHERE JobberMed.PnrA = Person.Pnr OR JobberMed.PnrB = Person.Pnr
)

Union, snitt og differanse

Krever kompatible operandtabeller (like kolonnenavn).

Funksjon Forklaring
UNION Union
INTERSECT Snitt
EXCEPT Minus
UNION ALL Union (beholder duplikater)
INTERSECT ALL Snitt (beholder duplikater)
EXCEPT ALL Minus (beholder duplikater)

-- Finner navn som brukes på både personer og hunder
SELECT Navn FROM Person
INTERSECT
SELECT Navn FROM Hund

Nøstede spørringer i FROM-delen

I FROM-delen kan vi gjøre en ny spørring på resultatet av en spørring. Lager en midlertidig tabell.

-- Finner antall arbeidere for ingeniøryrker med under 1000 arbeidere
SELECT Yrke, AntallArbeidere
FROM (
  SELECT Yrke, COUNT(Pnr) AS AntallArbeidere
  FROM Person
  GROUP BY Yrke
) AS YrkesStatistikk
WHERE Yrke LIKE '%ingeniør' AND AntallArbeidere < 1000

Virtuelle tabeller - VIEW

Views er virtuelle tabeller som er avledet fra tabeller som er lagret i databasen. Hvorfor: Spørreforenkling, sikkerhet og ytelse.

CREATE VIEW Forelder(ForelderPnr, ForelderNavn, BarnPnr, BarnNavn)
AS SELECT F.Pnr, F.Navn, B.Pnr, B.Navn
FROM Person AS F INNER JOIN Person AS B
ON (F.Pnr = B.MorPnr OR F.Pnr = B.FarPnr)

Normalisering

Vi ønsker å unngå redundans (lagring av samme informasjon flere ganger) og innsettings-, oppdaterings- og slettings-anomalier.

Restriksjoner

Begrep Forklaring
Implisitte restriksjoner Er en del av datamodellen og håndheves derfor alltid av DBMS.
Eksplisitte restriksjoner Kan uttrykkes i datamodellen (databaseskjemaet). Håndheves av DBMS. Eksempler: Primærnøkkel, fremmednøkkel, datatyper, verdi-begrensninger, etc.
Applikasjonsbaserte restriksjoner (business rules) Må håndteres utenfor datamodellen (av applikasjonsprogrammene).

Funksjonelle avhengigheter

Notasjon Forklaring
R(A, B, ...) Skjema for tabellen. A, B, ... er attributter.
X \rightarrow Y Y er funksjonelt avhengig av X
r(R) Tabellforekomsten
t_i \in r(R) Rad i tabellforekomsten
t_i[A] Radens verdi for attributtet A
X \subseteq R En delmengde av attributtene i R

Forkortes gjerne til FA.

X \rightarrow Y, der X, Y \subseteq R, uttrykker en restriksjon på alle lovlige tabellforekomster for R (en funksjonell avhengighet). Alle rader (tuppler), t_i og t_j, i en forekomst r(R) som har samme verdier for attributtene i X (dvs. t_i[X] = t_j[X]), må ha samme verdier for attributtene i Y (dvs. t_i[Y] = t_j[Y]).

Kort sagt: FA-en \text{PersonNummer} \rightarrow \text{Navn} tilsier at alle rader med samme PersonNummer må ha samme Navn.

Utledningsregler

Disse er sjelden nødvendige.

Navn Regel
IR-1 (reflexive) Y \subseteq X gir X \rightarrow Y
IR-2 (augmentation) X \rightarrow Y gir XZ \rightarrow YZ
IR-3 (transitive) X \rightarrow Y, \, Y \rightarrow Z gir X \rightarrow Z
IR-4 (decomposition) X \rightarrow YZ gir X \rightarrow Y
IR-5 (additive) X \rightarrow Y, \, X \rightarrow Z gir X \rightarrow YZ
IR-6 (pseudotransitive) X \rightarrow Y, \, WY \rightarrow Z gir WX \rightarrow Z
X, Y, Z, W \subseteq R (mengden av alle attributter)

Full funksjonell avhengighet

En funksjonell avhengighet X \rightarrow Y er en full funksjonell avhengighet hvis det er umulig å fjerne et attributt A \in X og fortsatt ha (X - \{A\}) \rightarrow Y. Kan tenkes på som at X er en minimal "nøkkel" for Y.

Fler-verdi-avhengigheter - MVD

La X, Y være delmengder av R. En fler-verdi-avhengighet X \twoheadrightarrow Y betyr at mengden Y-verdier som er assosiert med en X-verdi, bestemmes av X og er uavhengig av andre attributter. Dette kalles også en MVD (multivalued dependency).

Eksempel: I en tabell over hvilke hobbyer personer har, Hobbyer(PersonNummer, Hobby), er PersonNummer \twoheadrightarrow Hobby. Det er en mengde Hobby-verdier som er assosiert med én PersonNummer-verdi, uavhengig av andre attributter.

Trivielle fler-verdi-avhengigheter

X \twoheadrightarrow Y er triviell hvis Y er en delmengde av X eller X \cup Y = R. Trivielle MVD-er utelates som restriksjoner i 4NF.

Tillukning

Anta at F er en mengde funksjonelle avhengigheter. Tillukningen til F er mengden av alle funksjonelle avhengigheter som kan utledes fra F, og er gitt ved F^+ = \{ X \rightarrow Y \mid X \rightarrow Y \text{ kan utledes fra FA-ene i } F \}.

Anta R og F, og la X \subseteq R. Tillukningen til X er mengden av alle attributter som er funksjonelt avhengige av X, og er gitt ved X^+ = \{ Y \in R \mid X \rightarrow Y \in F^+ \}.
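
Et lite regneeksempel med et hypotetisk skjema og en hypotetisk FA-mengde:

La R(A, B, C, D) og F = \{A \rightarrow B, B \rightarrow C\}. Da er A^+ = \{A, B, C\}: A gir B (fra A \rightarrow B), B gir C (fra B \rightarrow C), mens D ikke nås. Siden A^+ \neq R er A ingen supernøkkel. Derimot er (AD)^+ = \{A, B, C, D\} = R, så AD er en supernøkkel, og siden verken A^+ eller D^+ alene dekker R, er AD en kandidatnøkkel.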

Nøkler

Begrep Forklaring
Nøkkelattributt Et attributt som inngår i en eller flere kandidatnøkler
Ikke-nøkkelattributt Et attributt som ikke inngår i noen kandidatnøkler
Supernøkkel En mengde attributter som sammen danner en unik identifikator
Nøkkel/kandidatnøkkel En minimal supernøkkel (inneholder ingen flere attributter enn nødvendig)
Primærnøkkel Primærnøkkelen velges blant kandidatnøklene
Sekundærnøkkel En kandidatnøkkel som ikke er primærnøkkel. Her kan vi velge å tillate NULL-verdier.

Normalformer

Alle høyere normalformer forutsetter de lavere normalformene.

Første normalform - 1NF

En tabell er på 1NF hvis og bare hvis alle attributter har en enkelt verdi fra domenet.

Andre normalform - 2NF

En tabell er på andre normalform hvis og bare hvis det ikke finnes noen ikke-nøkkelattributter som er delvis avhengig av en kandidatnøkkel.

Tredje normalform - 3NF

En tabell er på 3NF hvis og bare hvis det for alle tabellens funksjonelle avhengigheter på formen X \rightarrow Y er slik at enten er X en supernøkkel i tabellen, eller Y er et nøkkelattributt (inngår i en kandidatnøkkel).

Boyce-Codd normalform - BCNF

En tabell er på BCNF hvis og bare hvis det for alle tabellens funksjonelle avhengigheter på formen X \rightarrow Y er slik at X er en supernøkkel i tabellen.

Fjerde normalform - 4NF

En tabell er på 4NF hvis og bare hvis det for alle ikke-trivielle MVD-er på formen X \twoheadrightarrow Y er slik at X er en supernøkkel.
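
Et klassisk eksempel (med hypotetiske attributter): anta tabellen Ansatt(Person, Prosjekt, Hobby), der en persons prosjekter og hobbyer er uavhengige av hverandre. Da gjelder Person \twoheadrightarrow Prosjekt og Person \twoheadrightarrow Hobby, men Person er ingen supernøkkel, så tabellen er ikke på 4NF. Dekomponering i Jobber(Person, Prosjekt) og Hobbyer(Person, Hobby) gir to tabeller på 4NF og fjerner redundansen.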

Dekomponeringskriterier

Dekomponeringskriterium Forklaring
Normalform Ser på hver enkelt projeksjon (tabell), og ønsker så høy normalform som mulig.
Attributtbevaring Alle attributter i R må finnes i minst en av projeksjonene, slik at den samme dataen kan lagres.
Bevaring av funksjonelle avhengigheter Alle funksjonelle avhengigheter i F skal finnes i en eller flere R_i-er, eller kunne utledes fra FA-ene som gjelder i R_i-ene.
Tapsløs-join-egenskapen Må kunne komme tilbake til utgangspunktet, og ikke kunne skape falske data. Kan sikres ved å sjekke hver oppdeling med snittregelen, eller ved å bruke tabellmetoden (algoritme 15.3 i læreboka).

Regler for tapsløs join

Begrep Forklaring
Tabellmetoden (algoritme 15.3 i læreboka)
Snittregelen Dekomponeringen har tapsløs-join-egenskapen hvis felles attributter (snittet) i R_1 og R_2 er en supernøkkel i en eller begge tabellene: R_1 \cap R_2 \rightarrow R_1 eller R_1 \cap R_2 \rightarrow R_2.
Fagins teorem La A, B være delmengder av R og la C = R - AB. Projeksjonene AB og AC har tapsløs-join-egenskapen hvis og bare hvis A \twoheadrightarrow B eller A \twoheadrightarrow C.
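
Et lite eksempel på snittregelen (hypotetisk skjema): la R(A, B, C) med F = \{A \rightarrow B\} dekomponeres i R_1(A, B) og R_2(A, C). Da er R_1 \cap R_2 = \{A\}, og siden A \rightarrow B gir A \rightarrow AB, er snittet en supernøkkel i R_1. Dekomponeringen har derfor tapsløs-join-egenskapen.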

Part 2: Summary

Del 2 kan oppsummeres på følgende måte:

18 Strategies for Query Processing

The scanner identifies the query tokens, such as SQL keywords, attribute names, and relation names. The parser checks the query syntax to determine whether it is formulated according to the syntax rules (rules of grammar) of the query language.

The query optimizer module has the task of producing a good execution plan, and the code generator generates the code to execute that plan. The runtime database processor has the task of running (executing) the query code, whether in compiled or interpreted mode, to produce the query result.

Algorithms for External Sorting

Note that sorting of a particular file may be avoided if an appropriate index, such as a primary or clustering index, exists on the desired file attribute to allow ordered access to the records of the file.

External sorting refers to sorting algorithms that are suitable for large files of records stored on disk that do not fit entirely in main memory. The typical external sorting algorithm uses a sort-merge strategy, which starts by sorting small subfiles—called runs—of the main file and then merges the sorted runs, creating larger sorted subfiles that are merged in turn.

The buffer space is divided into individual buffers, where each buffer is the same size in bytes as the size of one disk block.

In the sorting phase, runs that can fit in the available buffer space are read into main memory, sorted using an internal sorting algorithm, and written back to disk as temporary sorted subfiles (or runs).

Let n_R be the number of initial runs, b the number of file blocks, and n_B the available buffer space (in blocks). Then

n_R = \left\lceil \frac{b}{n_B} \right\rceil

In the merging phase, the sorted runs are merged during one or more merge passes. Each merge pass can have one or more merge steps. The degree of merging d_M is the number of sorted subfiles that can be merged in each merge step.

d_M = \min(n_B - 1, n_R)

The number of merge passes is \left\lceil \log_{d_M} n_R \right\rceil. The following formula approximates the number of disk block reads and writes:

2b + 2b \, \log_{d_M} n_R
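
A quick worked example with illustrative numbers: assume a file of b = 1024 blocks and n_B = 5 available buffer blocks. Then n_R = \lceil 1024 / 5 \rceil = 205 initial runs, d_M = \min(5 - 1, 205) = 4, the number of merge passes is \lceil \log_4 205 \rceil = 4, and the estimated cost is approximately 2 \cdot 1024 + 2 \cdot 1024 \cdot 4 = 10240 block reads and writes.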

Algorithms for SELECT Operation

Search Methods for Simple Selection.

Method S1 (linear search) applies to any file. Method S2 (binary search) requires the file to be sorted. The methods that use an index (S3a, S4, S5, and S6) are generally referred to as index searches, and they require the appropriate index to exist. Methods S4 and S6 can be used to retrieve records in a certain range in range queries. Method S7a (bitmap index search) is suitable for retrievals where an attribute must match an enumerated set of values. Method S7b (functional index search) is suitable when the match is based on a function of one or more attributes on which a functional index exists.

Search Methods for Conjunctive Selection

If the selection condition is a conjunctive condition, that is, made up of several simple conditions connected with the AND logical connective, the DBMS can use additional methods to implement the operation (for example, using an index on one of the simple conditions and then checking the remaining conditions on the retrieved records).

Search Methods for Disjunctive Selection

A disjunctive condition (where simple conditions are connected by the OR logical connective rather than by AND) is much harder to process and optimize.

Only if an access path exists on every simple condition in the disjunction can we optimize the selection by retrieving the records satisfying each condition—or their record ids—and then applying the union operation to eliminate duplicates.

Estimating the Selectivity of a Condition

To minimize the overall cost of query execution in terms of resources used and response time, the query optimizer receives valuable input from the system catalog, which contains crucial statistical information about the database.

Information in the Database Catalog. A typical RDBMS catalog contains statistical information such as the number of records and blocks in each file, the primary file organization, the available indexes and their attributes, and the number of distinct values for indexed attributes.

When the optimizer is choosing between multiple simple conditions in a conjunctive select condition, it typically considers the selectivity of each condition. The selectivity (sl) is defined as the ratio of the number of records (tuples) that satisfy the condition to the total number of records (tuples) in the file (relation).

Estimates of selectivities are possible from the information kept in the DBMS catalog and are used by the optimizer.
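
As a rough sketch of such estimates (standard textbook heuristics, not tied to any particular DBMS): for a file with r records, an equality condition on a key attribute has selectivity sl = 1/r, while an equality condition on a non-key attribute with i distinct values is often assumed to have sl \approx 1/i under a uniform-distribution assumption. The estimated number of matching records (the selection cardinality) is then s = sl \cdot r.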

Implementing the JOIN Operation

Methods for Implementing Joins

How Buffer Space and Choice of Outer-Loop File Affect Performance of Nested-Loop Join

It is advantageous to read as many blocks as possible at a time into memory from the file whose records are used for the outer loop.

An extra buffer in main memory is needed to contain the resulting records after they are joined, and the contents of this result buffer can be appended to the result file whenever it is filled.

Let n_B be the number of available buffer blocks, b_I the number of blocks accessed for the inner-loop file, and b_O the number of blocks accessed for the outer-loop file. We get the following total number of block read accesses:

b_O + b_I \cdot \left\lceil \frac{b_O}{n_B - 2} \right\rceil

If the result file of the join operation has b_{RES} disk blocks, each block is written once to disk, so an additional b_{RES} block accesses (writes) should be added to the preceding formulas in order to estimate the total cost of the join operation.
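
A worked example with illustrative numbers: suppose the outer-loop file has b_O = 10 blocks, the inner-loop file has b_I = 2000 blocks, and n_B = 7 buffer blocks are available. Using the small file in the outer loop gives 10 + 2000 \cdot \lceil 10 / (7 - 2) \rceil = 10 + 2000 \cdot 2 = 4010 block reads, whereas swapping the roles gives 2000 + 10 \cdot \lceil 2000 / 5 \rceil = 2000 + 4000 = 6000 block reads, which illustrates why the smaller file should be read by the outer loop.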

Join selection factor: the fraction of records in one file that will be joined with records in the other file. This factor depends on the particular equijoin condition between the two files.

The file with the high join selection factor should be used in the (single) join loop.

20 Introduction to Transaction Processing Concepts and Theory

Introduction to Transaction Processing

Single-User versus Multiuser Systems

One criterion for classifying a database system is according to the number of users who can use the system concurrently. A DBMS is single-user if at most one user at a time can use the system, and it is multiuser if many users can use the system— and hence access the database—concurrently.

Most of the theory concerning concurrency control in databases is developed in terms of interleaved concurrency.

Transactions, Database Items, Read and Write Operations, and DBMS Buffers

A transaction is an executing program that forms a logical unit of database processing. A transaction includes one or more database access operations—these can include insertion, deletion, modification (update), or retrieval operations.

A database is basically represented as a collection of named data items. The size of a data item is called its granularity. A data item can be a database record, but it can also be a larger unit such as a whole disk block, or even a smaller unit such as an individual field (attribute) value of some record in the database.

The read-set of a transaction is the set of all items that the transaction reads, and the write-set is the set of all items that the transaction writes.

Why Concurrency Control Is Needed

Uncontrolled interleaving of concurrent transactions can lead to problems such as the lost update problem, the temporary update (dirty read) problem, and the incorrect summary problem.

Why Recovery Is Needed

If a transaction fails after executing some of its operations but before executing all of them, the operations already executed must be undone and have no lasting effect.

Types of Failures. Failures are generally classified as transaction, system, and media failures. There are several possible reasons for a transaction to fail in the middle of execution, such as a computer (system) crash, a transaction or system error, local errors or exception conditions detected by the transaction, enforcement of concurrency control, disk failure, or physical problems and catastrophes.

Transaction and System Concepts

Transaction States and Additional Operations

A transaction is an atomic unit of work that should either be completed in its entirety or not done at all. For recovery purposes, the system needs to keep track of when each transaction starts, terminates, and commits or aborts. Therefore, the recovery manager of the DBMS needs to keep track of operations such as BEGIN_TRANSACTION, READ or WRITE, END_TRANSACTION, COMMIT_TRANSACTION, and ROLLBACK (or ABORT).

A transaction goes into an active state immediately after it starts execution, where it can execute its READ and WRITE operations. When the transaction ends, it moves to the partially committed state. At this point, some types of concurrency control protocols may do additional checks to see if the transaction can be committed or not.

If these checks are successful, the transaction is said to have reached its commit point and enters the committed state. When a transaction is committed, it has concluded its execution successfully and all its changes must be recorded permanently in the database, even if a system failure occurs.

However, a transaction can go to the failed state if one of the checks fails or if the transaction is aborted during its active state. The terminated state corresponds to the transaction leaving the system.

The System Log

To be able to recover from failures that affect transactions, the system maintains a log to keep track of all transaction operations that affect the values of database items, as well as other transaction information that may be needed to permit recovery from failures. The log is a sequential, append-only file that is kept on disk, so it is not affected by any type of failure except for disk or catastrophic failure.

The following are the types of entries—called log records—that are written to the log file and the corresponding action for each log record.

  1. [start_transaction, T]. Indicates that transaction T has started execution.
  2. [write_item, T, X, old_value, new_value]. Indicates that transaction T has changed the value of database item X from old_value to new_value.
  3. [read_item, T, X]. Indicates that transaction T has read the value of database item X.
  4. [commit, T]. Indicates that transaction T has completed successfully, and affirms that its effect can be committed (recorded permanently) to the database.
  5. [abort, T]. Indicates that transaction T has been aborted.

Protocols for recovery that avoid cascading rollbacks, which include nearly all practical protocols, do not require that READ operations be written to the system log. Additionally, some recovery protocols require simpler WRITE entries that include only one of new_value or old_value instead of both.

Because the log contains a record of every WRITE operation that changes the value of some database item, it is possible to undo the effect of these WRITE operations of a transaction T by tracing backward through the log and resetting all items changed by a WRITE operation of T to their old_values. Redo of an operation may also be necessary if a transaction has its updates recorded in the log but a failure occurs before the system can be sure that all these new_values have been written to the actual database on disk from the main memory buffers.

Commit Point of a Transaction

A transaction T reaches its commit point when all its operations that access the database have been executed successfully and the effect of all the transaction operations on the database have been recorded in the log. Beyond the commit point, the transaction is said to be committed, and its effect must be permanently recorded in the database. The transaction then writes a commit record [commit, T] into the log.

If a system failure occurs, we can search back in the log for all transactions T that have written a [start_transaction, T] record into the log but have not written their [commit, T] record yet; these transactions may have to be rolled back to undo their effect on the database during the recovery process. Transactions that have written their commit record in the log must also have recorded all their WRITE operations in the log, so their effect on the database can be redone from the log records.

At the time of a system crash, only the log entries that have been written back to disk are considered in the recovery process if the contents of main memory are lost.

Before a transaction reaches its commit point, any portion of the log that has not yet been written to disk must be written to disk. This process is called force-writing the log.

DBMS-Specific Buffer Replacement Policies

If all the buffers in the DBMS cache are occupied and new disk pages are required to be loaded into main memory from disk, a page replacement policy is needed to select the particular buffers to be replaced.

Domain Separation (DS) Method. The DBMS cache is divided into separate domains (sets of buffers). Each domain handles one type of disk pages, and page replacements within each domain are handled via the basic LRU (least recently used) page replacement.

Hot Set Method. The hot set method determines for each database processing algorithm the set of disk pages that will be accessed repeatedly, and it does not replace them until their processing is completed.

The DBMIN Method. The DBMIN page replacement policy calculates a locality set, using the query locality set model (QLSM), for each file instance involved in the query, and then allocates the appropriate number of buffers to each file instance based on its locality set.

Desirable Properties of Transactions

The following are the ACID properties: atomicity, consistency preservation, isolation, and durability (or permanency).

It is the responsibility of the transaction recovery subsystem of a DBMS to ensure atomicity. If a transaction fails to complete for some reason, the recovery technique must undo any effects of the transaction on the database.

The preservation of consistency is generally considered to be the responsibility of the programmers. A consistent state of the database satisfies the constraints specified in the schema as well as any other constraints on the database that should hold.

The isolation property is enforced by the concurrency control subsystem of the DBMS.

The durability property is the responsibility of the recovery subsystem of the DBMS.

There have been attempts to define the level of isolation of a transaction. A transaction is said to have level 0 (zero) isolation if it does not overwrite the dirty reads of higher-level transactions. Level 1 (one) isolation has no lost updates, and level 2 isolation has no lost updates and no dirty reads. Finally, level 3 isolation (also called true isolation) has, in addition to level 2 properties, repeatable reads. Another type of isolation is called snapshot isolation.

Characterizing Schedules Based on Recoverability

Schedules (Histories) of Transactions

A schedule (or history) S of n transactions T1, T2, … , Tn is an ordering of the operations of the transactions. Operations from different transactions can be interleaved in the schedule S. However, for each transaction Ti that participates in the schedule S, the operations of Ti in S must appear in the same order in which they occur in Ti.

We use the symbols b, r, w, e, c, and a for the operations begin_transaction, read_item, write_item, end_transaction, commit, and abort, respectively.

Two operations in a schedule are said to conflict if they satisfy all three of the following conditions: (1) they belong to different transactions; (2) they access the same item X; and (3) at least one of the operations is a write_item(X). Intuitively, two operations are conflicting if changing their order can result in a different outcome.

A schedule is a partial order of the operations in the n transactions. However, a total order must be specified in the schedule for any pair of conflicting operations and for any pair of operations from the same transaction.

Characterizing Schedules Based on Recoverability

It is important to characterize the types of schedules for which recovery is possible, as well as those for which recovery is relatively simple.

Once a transaction T is committed, it should never be necessary to roll back T. This ensures that the durability property of transactions is not violated. The schedules that theoretically meet this criterion are called recoverable schedules. A schedule where a committed transaction may have to be rolled back during recovery is called nonrecoverable and hence should not be permitted by the DBMS. The condition for a recoverable schedule is as follows: a schedule S is recoverable if no transaction T in S commits until all transactions T′ that have written some item X that T reads have committed. A transaction T reads from transaction T′ in a schedule S if some item X is first written by T′ and later read by T.

If sufficient information is kept (in the log), a recovery algorithm can be devised for any recoverable schedule.

It is possible for a phenomenon known as cascading rollback (or cascading abort) to occur in some recoverable schedules, where an uncommitted transaction has to be rolled back because it read an item from a transaction that failed.

A schedule is said to be cascadeless, or to avoid cascading rollback, if every transaction in the schedule reads only items that were written by committed transactions.

A strict schedule is one in which transactions can neither read nor write an item X until the last transaction that wrote X has committed (or aborted). In a strict schedule, the process of undoing a write_item(X) operation of an aborted transaction is simply to restore the before image (old value) of X.

It is important to note that any strict schedule is also cascadeless, and any cascadeless schedule is also recoverable.

Characterizing Schedules Based on Serializability

Now we characterize the types of schedules that are always considered to be correct when concurrent transactions are executing. Such schedules are known as serializable schedules.

Serial, Nonserial, and Conflict-Serializable Schedules

Formally, a schedule S is serial if, for every transaction T participating in the schedule, all the operations of T are executed consecutively in the schedule; otherwise, the schedule is called nonserial.

The problem with serial schedules is that they limit concurrency by prohibiting interleaving of operations. For this reason, serial schedules are unacceptable in practice.

We would like to determine which of the nonserial schedules always give a correct result and which may give erroneous results. A schedule S of n transactions is serializable if it is equivalent to some serial schedule of the same n transactions. Saying that a nonserial schedule S is serializable is equivalent to saying that it is correct, because it is equivalent to a serial schedule, which is considered correct.

Two schedules are called result equivalent if they produce the same final state of the database. However, two different schedules may accidentally produce the same final state.

Two schedules are said to be conflict equivalent if the relative order of any two conflicting operations is the same in both schedules.

Using the notion of conflict equivalence, we define a schedule S to be serializable if it is (conflict) equivalent to some serial schedule S′.

Testing for Serializability of a Schedule

The algorithm looks at only the read_item and write_item operations in a schedule to construct a precedence graph.

An edge is created by the algorithm if a pair of conflicting operations exist in Tj and Tk and the conflicting operation in Tj appears in the schedule before the conflicting operation in Tk.
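
A small illustrative example (the schedule is chosen purely for illustration): consider S: r1(X); r2(X); w1(X); w2(X). Here r2(X) precedes w1(X), giving the edge T2 → T1, while r1(X) precedes w2(X) (and w1(X) precedes w2(X)), giving the edge T1 → T2. The precedence graph therefore contains the cycle T1 → T2 → T1, so S is not conflict serializable.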

How Serializability Is Used for Concurrency Control

Being serializable is distinct from being serial. A serializable schedule gives the benefits of concurrent execution without giving up any correctness.

The approach taken in most commercial DBMSs is to design protocols (sets of rules) that—if followed by every individual transaction or if enforced by a DBMS concurrency control subsystem—will ensure serializability of all schedules in which the transactions participate.

One such protocol, two-phase locking, is based on locking data items to prevent concurrent transactions from interfering with one another.

Transaction Support in SQL

The access mode can be specified as READ ONLY or READ WRITE.

The diagnostic area size option, DIAGNOSTICS SIZE n, specifies an integer value n, which indicates the number of conditions that can be held simultaneously in the diagnostic area.

The isolation level option is specified using the statement ISOLATION LEVEL isolation, where the value for isolation can be READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, or SERIALIZABLE.
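
A small sketch of how these options can be combined in one statement (the exact keywords and the level of support vary between SQL dialects):

-- Set access mode, isolation level and diagnostics area size for the next transaction
SET TRANSACTION READ WRITE,
    ISOLATION LEVEL SERIALIZABLE,
    DIAGNOSTICS SIZE 5;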

If a transaction executes at a lower isolation level than SERIALIZABLE, then one or more of the following three violations may occur:

  1. Dirty read.
  2. Nonrepeatable read.
  3. Phantoms.

Snapshot Isolation.

The basic definition of snapshot isolation is that a transaction sees the data items that it reads based on the committed values of the items in the database snapshot (or database state) when the transaction starts. Snapshot isolation will ensure that the phantom record problem does not occur, since the database transaction, or in some cases the database statement, will only see the records that were committed in the database at the time the transaction starts.

21 Concurrency Control Techniques

Two-Phase Locking Techniques for Concurrency Control

A lock is a variable associated with a data item that describes the status of the item with respect to possible operations that can be applied to it.

Types of Locks and System Lock Tables

Shared/Exclusive (or Read/Write) Locks. We should allow several transactions to access the same item X if they all access X for reading purposes only. This is because read operations on the same item by different transactions are not conflicting. There are three locking operations: read_lock(X), write_lock(X), and unlock(X). Three possible states: read-locked, write-locked, or unlocked. A read-locked item is also called share-locked because other transactions are allowed to read the item, whereas a write-locked item is called exclusive-locked because a single transaction exclusively holds the lock on the item.

When we use the shared/exclusive locking scheme, the system must enforce the following rules:

  1. A transaction T must issue the operation read_lock(X) or write_lock(X) before any read_item(X) operation is performed in T.
  2. A transaction T must issue the operation write_lock(X) before any write_item(X) operation is performed in T.
  3. A transaction T must issue the operation unlock(X) after all read_item(X) and write_item(X) operations are completed in T. (rigorous)
  4. A transaction T will not issue an unlock(X) operation unless it already holds a read (shared) lock or a write (exclusive) lock on item X.

We can allow lock conversion; that is, a transaction that already holds a lock on item X is allowed under certain conditions to convert the lock from one locked state to another. If T is the only transaction holding a read lock on X at the time it issues the write_lock(X) operation, the lock can be upgraded. It is also possible for a transaction T to issue a write_lock(X) and then later to downgrade the lock by issuing a read_lock(X) operation.

To guarantee serializability, we must follow an additional protocol concerning the positioning of locking and unlocking operations in every transaction. The best-known protocol, two-phase locking, is described in the next section.

Guaranteeing Serializability by Two-Phase Locking

A transaction is said to follow the two-phase locking protocol if all locking operations (read_lock, write_lock) precede the first unlock operation in the transaction.

It can be proved that, if every transaction in a schedule follows the two-phase locking protocol, the schedule is guaranteed to be serializable, obviating the need to test for serializability of schedules.

Two-phase locking may limit the amount of concurrency that can occur in a schedule.

Basic, Conservative, Strict, and Rigorous Two-Phase Locking.

The technique just described is known as basic 2PL.

A variation known as conservative 2PL (or static 2PL) requires a transaction to lock all the items it accesses before the transaction begins execution, by predeclaring its read-set and write-set. Conservative 2PL is a deadlock-free protocol.

The most popular variation of 2PL is strict 2PL, which guarantees strict schedules. In this variation, a transaction T does not release any of its exclusive (write) locks until after it commits or aborts. Strict 2PL is not deadlock-free.

A more restrictive variation of strict 2PL is rigorous 2PL, which also guarantees strict schedules. In this variation, a transaction T does not release any of its locks (exclusive or shared) until after it commits or aborts, and so it is easier to implement than strict 2PL.

Dealing with Deadlock and Starvation

Locking is generally considered to have a high overhead, because every read or write operation is preceded by a system locking request. The use of locks can also cause two additional problems: deadlock and starvation.

Deadlock occurs when each transaction T in a set of two or more transactions is waiting for some item that is locked by some other transaction T′ in the set.

One way to prevent deadlock is to use a deadlock prevention protocol. These protocols are not generally used in practice, either because of unrealistic assumptions or because of their possible overhead. Some schemes that prevent deadlock are wait-die, wound-wait and cautious waiting.

An alternative approach to dealing with deadlock is deadlock detection. A simple way to detect a state of deadlock is for the system to construct and maintain a wait-for graph.

Another simple scheme to deal with deadlock is the use of timeouts. In this method, if a transaction waits for a period longer than a system-defined timeout period, the system assumes that the transaction may be deadlocked and aborts it—regardless of whether a deadlock actually exists.

Starvation occurs when a transaction cannot proceed for an indefinite period of time while other transactions in the system continue normally. This may occur if the waiting scheme for locked items is unfair in that it gives priority to some transactions over others. One solution for starvation is to have a fair waiting scheme, such as using a first-come-first-served queue. Another scheme allows some transactions to have priority over others but increases the priority of a transaction the longer it waits. The wait-die and wound-wait schemes discussed previously avoid starvation, because they restart a transaction that has been aborted with its same original timestamp.

Multiversion Concurrency Control Techniques

These protocols for concurrency control keep copies of the old values of a data item when the item is updated (written); they are known as multiversion concurrency control because several versions (values) of an item are kept by the system. An obvious drawback of multiversion techniques is that more storage is needed to maintain multiple versions of the database items.

One reason for keeping multiple versions is that some read operations that would be rejected in other techniques can still be accepted by reading an older version of the item to maintain serializability.

The extreme case is a temporal database which keeps track of all changes and the times at which they occurred.

The snapshot isolation technique can also be implemented by keeping older versions of items in a temporary store.

22 Database Recovery Techniques

Recovery Concepts

Recovery Outline and Categorization of Recovery Algorithms

Recovery from transaction failures usually means that the database is restored to the most recent consistent state before the time of failure. To do this, the system must keep information about the changes that were applied to data items by the various transactions. This information is typically kept in the system log.

  1. If there is extensive damage to a wide portion of the database due to catastrophic failure, such as a disk crash, the recovery method restores a past copy of the database that was backed up to archival storage and reconstructs a more current state by reapplying or redoing the operations of committed transactions from the backed-up log, up to the time of failure.
  2. When the database on disk is not physically damaged, the recovery strategy is to identify any changes that may cause an inconsistency in the database. For example, a transaction that has updated some database items on disk but has not been committed needs to have its changes reversed by undoing its write operations. It may also be necessary to redo some operations in order to restore a consistent state of the database; for example, if a transaction has committed but some of its write operations have not yet been written to disk. The entries kept in the online system log on disk are analyzed to determine the appropriate actions for recovery.

Conceptually, we can distinguish two main policies for recovery from noncatastrophic transaction failures: deferred update and immediate update. The deferred update techniques do not physically update the database on disk until after a transaction commits; then the updates are recorded in the database. Deferred update is also known as the NO-UNDO/REDO algorithm.

In the immediate update techniques, the database may be updated by some operations of a transaction before the transaction reaches its commit point. However, these operations must also be recorded in the log on disk by force-writing before they are applied to the database on disk, making recovery still possible. In the general case of immediate update, both undo and redo may be required during recovery. This technique, known as the UNDO/REDO algorithm, requires both operations during recovery and is used most often in practice. A variation of the algorithm where all updates are required to be recorded in the database on disk before a transaction commits requires undo only, so it is known as the UNDO/NO-REDO algorithm.

The UNDO and REDO operations are required to be idempotent—that is, executing an operation multiple times is equivalent to executing it just once. In fact, the whole recovery process should be idempotent, in case the system were to fail during the recovery process.

Caching (Buffering) of Disk Blocks

Typically, multiple disk pages that include the data items to be updated are cached into main memory buffers and then updated in memory before being written back to disk.

Associated with each buffer in the cache is a dirty bit, which can be included in the directory entry to indicate whether or not the buffer has been modified. When the buffer contents are replaced (flushed) from the cache, the contents must first be written back to the corresponding disk page only if its dirty bit is 1.

Another bit, the pin-unpin bit, is also needed: a page in the cache is pinned (bit value 1) if it cannot be written back to disk as yet. For example, the recovery protocol may restrict certain buffer pages from being written back to disk until the transactions that changed them have committed.

Two main strategies can be employed when flushing a modified buffer back to disk. In-place updating writes the buffer to the same original disk location, thus overwriting the old value of any changed data items on disk. Shadowing writes an updated buffer at a different disk location, so multiple versions of data items can be maintained; this approach is not typically used in practice.

In general, the old value of the data item before updating is called the before image (BFIM), and the new value after updating is called the after image (AFIM).

Write-Ahead Logging, Steal/No-Steal, and Force/No-Force

When in-place updating is used, it is necessary to use a log for recovery. The recovery mechanism must ensure that the BFIM of the data item is recorded in the appropriate log entry and that the log entry is flushed to disk before the BFIM is overwritten with the AFIM in the database on disk. This process is generally known as write-ahead logging and is necessary so we can UNDO the operation if this is required during recovery.

We need to distinguish between two types of log entry information included for a write command: the information needed for UNDO and the information needed for REDO. A REDO-type log entry includes the new value (AFIM) of the item written by the operation. An UNDO-type log entry includes the old value (BFIM) of the item, since this is needed to undo the effect of the operation. In an UNDO/REDO algorithm, both the BFIM and the AFIM are recorded in a single log entry.

The log is simply a sequential (append-only) disk file.

Standard DBMS recovery terminology includes the terms steal/no-steal and force/no-force, which specify the rules that govern when a page from the database cache can be written to disk:

Steal/No-Steal. If a cache buffer page updated by a transaction cannot be written to disk before the transaction commits, the recovery method is called a no-steal approach. If the recovery protocol allows writing an updated buffer before the transaction commits, it is called steal. Steal is used when the DBMS cache (buffer) manager needs a buffer frame for another transaction and replaces an existing page that had been updated but whose transaction has not committed. The no-steal rule means that UNDO will never be needed during recovery, since a transaction will not have any of its updates reflected in the database on disk before it commits.

Force/No-Force. If all pages updated by a transaction are immediately written to disk before the transaction commits, the recovery approach is called a force approach. Otherwise, it is called no-force. The force rule means that REDO will never be needed during recovery, since any committed transaction will have all its updates on disk before it is committed.

The deferred update (NO-UNDO) recovery scheme discussed in Section 22.2 follows a no-steal approach. However, typical database systems employ a steal/no-force (UNDO/REDO) strategy. The advantage of steal is that it avoids the need for a very large buffer space to store all updated pages in memory. The advantage of no-force is that an updated page of a committed transaction may still be in the buffer when another transaction needs to update it, thus eliminating the I/O cost to write that page multiple times to disk and possibly having to read it again from disk. This may provide a substantial saving in the number of disk I/O operations when a specific page is updated heavily by multiple transactions.

Checkpoints in the System Log and Fuzzy Checkpointing

A [checkpoint, list of active transactions] record is written into the log periodically, at the point when the system writes out to the database on disk all DBMS buffers that have been modified. All transactions that have their [commit, T] entries in the log before a [checkpoint] entry do not need to have their WRITE operations redone in case of a system crash.

Taking a checkpoint consists of the following actions:

  1. Suspend execution of transactions temporarily.
  2. Force-write all main memory buffers that have been modified to disk.
  3. Write a [ checkpoint ] record to the log, and force-write the log to disk.
  4. Resume executing transactions.

An alternative is fuzzy checkpointing. In this technique, the system can resume transaction processing after a [begin_checkpoint] record is written to the log, without having to wait for step 2 to finish. When step 2 is completed, an [end_checkpoint, …] record is written to the log with the relevant information collected during checkpointing.

Transaction Rollback and Cascading Rollback

If a transaction fails for whatever reason after updating the database, but before the transaction commits, it may be necessary to roll back the transaction. The UNDO-type log entries are used to restore the old values of data items that must be rolled back.

If a transaction T is rolled back, any transaction S that has, in the interim, read the value of some data item X written by T must also be rolled back. This phenomenon is called cascading rollback, and it can occur when the recovery protocol ensures recoverable schedules but does not ensure strict or cascadeless schedules. In practice, almost all recovery mechanisms are designed so that cascading rollback is never required.
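A minimal sketch of rolling back a single failed transaction from its UNDO-type log entries (the log format here is hypothetical):

```python
def rollback(transaction_id: str, log: list, database: dict) -> None:
    """Undo all writes of a failed transaction by restoring the old values
    (BFIMs) from its UNDO-type log entries, in reverse order."""
    for kind, tid, *rest in reversed(log):
        if kind != "write_item" or tid != transaction_id:
            continue
        item, old_value, _new_value = rest
        database[item] = old_value            # restore the before-image
    log.append(("abort", transaction_id))     # mark the transaction as rolled back

# With a strict or cascadeless schedule, no other transaction can have read
# T1's uncommitted value of X, so the rollback never cascades.
log = [("start", "T1"), ("write_item", "T1", "X", 5, 9)]
database = {"X": 9}
rollback("T1", log, database)
assert database["X"] == 5
```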

NO-UNDO/REDO Recovery Based on Deferred Update

The idea behind deferred update is to defer or postpone any actual updates to the database on disk until the transaction completes its execution successfully and reaches its commit point. Deferred update can generally be characterized as a no-steal approach.

During transaction execution, the updates are recorded only in the log and in the cache buffers. If a transaction fails before reaching its commit point, there is no need to undo any operations because the transaction has not affected the database on disk in any way. Therefore, only REDO-type log entries are needed in the log, which include the new value (AFIM).

Deferred update cannot be used in practice unless transactions are short and each transaction changes few items. For other types of transactions, there is the potential for running out of buffer space.

If a data item X is updated more than once by committed transactions, it is only necessary to REDO the last update of X from the log during recovery, because the other updates would be overwritten by this last REDO.
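A sketch of NO-UNDO/REDO recovery under deferred update: the log is scanned backwards so that only the last committed write of each item is reapplied (the log format is hypothetical; write entries carry only the AFIM):

```python
def no_undo_redo_recover(log: list, database: dict) -> None:
    """Deferred-update (NO-UNDO/REDO) recovery: only committed writes are
    redone. Scanning backwards and remembering the items already handled
    means that only the last committed update of each item is reapplied."""
    committed = {tid for kind, tid, *_ in log if kind == "commit"}
    redone = set()
    for kind, tid, *rest in reversed(log):
        if kind != "write_item" or tid not in committed:
            continue                 # uncommitted transactions are simply ignored
        item, new_value = rest       # deferred-update entries carry only the AFIM
        if item in redone:
            continue                 # a later committed write already covers this item
        database[item] = new_value
        redone.add(item)
```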

Recovery Techniques Based on Immediate Update

In these techniques, when a transaction issues an update command, the database on disk can be updated immediately, without waiting for the transaction to reach its commit point.

Provisions must be made for undoing the effect of update operations that have been applied to the database by a failed transaction. This is accomplished by rolling back the transaction and undoing the effect of the transaction’s write_item operations. Therefore, the UNDO-type log entries, which include the old value (BFIM) of the item, must be stored in the log. Because UNDO can be needed during recovery, these methods follow a steal strategy.

Theoretically, we can distinguish two main categories of immediate update algorithms.

  1. UNDO/NO-REDO recovery algorithm. In this method, all updates by a transaction must be recorded on disk before the transaction commits, so that REDO is never needed. This corresponds to a steal/force strategy.
  2. UNDO/REDO recovery algorithm. If the transaction is allowed to commit before all its changes are written to the database, we have the most general case; here the steal/no-force strategy is applied (a sketch of this general case follows below).
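A compact sketch of the general UNDO/REDO case (hypothetical log format, with write entries of the form (write_item, T, X, old, new)):

```python
def undo_redo_recover(log: list, database: dict) -> None:
    """General immediate-update (steal/no-force) recovery:
    REDO the writes of committed transactions in log order, then UNDO the
    writes of uncommitted transactions in reverse log order."""
    committed = {tid for kind, tid, *_ in log if kind == "commit"}
    for kind, tid, *rest in log:                 # forward pass: REDO
        if kind == "write_item" and tid in committed:
            item, _old, new = rest
            database[item] = new
    for kind, tid, *rest in reversed(log):       # backward pass: UNDO
        if kind == "write_item" and tid not in committed:
            item, old, _new = rest
            database[item] = old
```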

Shadow Paging

When a transaction begins executing, the current directory—whose entries point to the most recent or current database pages on disk—is copied into a shadow directory. The shadow directory is then saved on disk while the current directory is used by the transaction.

During transaction execution, the shadow directory is never modified. When a write_item operation is performed, a new copy of the modified database page is created, but the old copy of that page is not overwritten.

To recover from a failure during transaction execution, it is sufficient to free the modified database pages and to discard the current directory. The state of the database before transaction execution is available through the shadow directory.

Committing a transaction corresponds to discarding the previous shadow directory.

This technique can be categorized as a NO-UNDO/NO-REDO technique for recovery.
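A toy sketch of shadow paging with a current and a shadow page directory (all names are hypothetical):

```python
class ShadowPagingDB:
    """NO-UNDO/NO-REDO recovery: writes go to fresh pages referenced from the
    current directory, while the shadow directory is never modified."""

    def __init__(self, pages: dict):
        self.pages = dict(pages)               # physical page id -> contents ("disk")
        self.current = {p: p for p in pages}   # logical page -> physical page
        self.shadow = dict(self.current)       # saved copy, untouched during execution

    def write_item(self, page: str, contents: str) -> None:
        new_physical = page + "'"              # allocate a new physical page
        self.pages[new_physical] = contents    # the old copy is not overwritten
        self.current[page] = new_physical

    def commit(self) -> None:
        self.shadow = dict(self.current)       # discard the previous shadow directory

    def recover(self) -> None:
        self.current = dict(self.shadow)       # reinstate the pre-transaction state

db = ShadowPagingDB({"P1": "old"})
db.write_item("P1", "new")
db.recover()                                   # crash before commit
assert db.pages[db.current["P1"]] == "old"     # neither UNDO nor REDO was needed
```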

In a multiuser environment with concurrent transactions, logs and checkpoints must be incorporated into the shadow paging technique. Moreover, if the directory is large, the overhead of writing shadow directories to disk as transactions commit is significant.

The ARIES Recovery Algorithm

ARIES uses a steal/no-force approach for writing, and it is based on three concepts: write-ahead logging, repeating history during redo, and logging changes during undo. Repeating history means that ARIES will retrace all actions of the database system prior to the crash to reconstruct the database state when the crash occurred. Transactions that were uncommitted at the time of the crash (active transactions) are undone. Logging during undo prevents ARIES from repeating completed undo operations if a failure occurs during recovery.

The ARIES recovery procedure consists of three main steps: analysis, REDO, and UNDO. The analysis step identifies the dirty (updated) pages in the buffer and the set of transactions active at the time of the crash. The appropriate point in the log where the REDO operation should start is also determined. The REDO phase actually reapplies updates from the log to the database. Generally, the REDO operation is applied only to committed transactions. However, this is not the case in ARIES.

The information stored by ARIES in the log and in the data pages allows ARIES to determine whether an operation to be redone has actually been applied to the database and therefore does not need to be reapplied. Thus, only the necessary REDO operations are applied during recovery. Finally, during the UNDO phase, the log is scanned backward and the operations of transactions that were active at the time of the crash are undone in reverse order. The information needed for ARIES to accomplish its recovery procedure includes the log, the Transaction Table, and the Dirty Page Table.

In ARIES, every log record has an associated log sequence number (LSN). Each LSN corresponds to a specific change (action) of some transaction. Also, each data page will store the LSN of the latest log record corresponding to a change for that page. A log record is written for any of the following actions: updating a page (write), committing a transaction (commit), aborting a transaction (abort), undoing an update (undo), and ending a transaction (end).

Common fields in all log records include the previous LSN for that transaction, the transaction ID, and the type of log record. The previous LSN is important because it links the log records (in reverse order) for each transaction.

In addition to the log, two tables are needed for efficient recovery: the Transaction Table and the Dirty Page Table, which are maintained by the transaction manager. When a crash occurs, these tables are rebuilt in the analysis phase of recovery. The Transaction Table contains an entry for each active transaction, with information such as the transaction ID, transaction status, and the LSN of the most recent log record for the transaction. The Dirty Page Table contains an entry for each dirty page in the DBMS cache, which includes the page ID and the LSN corresponding to the earliest update to that page.
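A minimal sketch of the bookkeeping described above (field names are simplified and hypothetical):

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class LogRecord:
    lsn: int
    prev_lsn: Optional[int]       # previous LSN of the same transaction (backward chain)
    transaction_id: str
    type: str                     # "update", "commit", "abort", "CLR" (undo) or "end"
    page_id: Optional[str] = None
    old_value: Any = None         # BFIM, used during UNDO
    new_value: Any = None         # AFIM, used during REDO

# Transaction Table: one entry per active transaction.
transaction_table = {
    # "T1": {"status": "in progress", "last_lsn": 7},
}

# Dirty Page Table: one entry per dirty page in the DBMS cache, holding the LSN
# of the *earliest* update that made the page dirty.
dirty_page_table = {
    # "P1": 3,
}
```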

Checkpointing in ARIES consists of the following: writing a begin_checkpoint record to the log, writing an end_checkpoint record to the log, and writing the LSN of the begin_checkpoint record to a special file. This special file is accessed during recovery to locate the last checkpoint information. With the end_checkpoint record, the contents of both the Transaction Table and Dirty Page Table are appended to the end of the log. To reduce the cost, fuzzy checkpointing is used so that the DBMS can continue to execute transactions during checkpointing.

During analysis, the log records being analyzed may cause modifications to these two tables. For instance, if an end log record was encountered for a transaction T in the Transaction Table, then the entry for T is deleted from that table. If some other type of log record is encountered for a transaction T′, then an entry for T′ is inserted into the Transaction Table, if not already present, and the last LSN field is modified.
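A sketch of the analysis pass (reusing the LogRecord sketch above; for simplicity it starts from empty tables rather than from the copies saved in the end_checkpoint record):

```python
def aries_analysis(log, start_lsn):
    """Rebuild the Transaction Table and the Dirty Page Table by scanning the
    log forward from the last begin_checkpoint."""
    transaction_table, dirty_page_table = {}, {}
    for rec in log:
        if rec.lsn < start_lsn:
            continue
        if rec.type == "end":
            transaction_table.pop(rec.transaction_id, None)    # T has finished
            continue
        entry = transaction_table.setdefault(
            rec.transaction_id, {"status": "in progress", "last_lsn": None})
        entry["last_lsn"] = rec.lsn                             # most recent LSN for T
        if rec.type == "commit":
            entry["status"] = "committed"
        if rec.type == "update" and rec.page_id not in dirty_page_table:
            dirty_page_table[rec.page_id] = rec.lsn             # earliest update (recLSN)
    return transaction_table, dirty_page_table
```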

The REDO phase follows next. To reduce the amount of unnecessary work, ARIES starts redoing at a point in the log where it knows (for sure) that previous changes to dirty pages have already been applied to the database on disk. It can determine this by finding the smallest LSN, M, of all the dirty pages in the Dirty Page Table, which indicates the log position where ARIES needs to start the REDO phase.

For each change recorded in the log, the REDO algorithm verifies whether the change has to be reapplied. For example, if a change recorded in the log pertains to a page P that is not in the Dirty Page Table, then this change is already on disk and does not need to be reapplied. Or, if a change recorded in the log (with LSN = N, say) pertains to page P and the Dirty Page Table contains an entry for P with an LSN greater than N, then the change is already on disk. If neither of these two conditions holds, page P is read from disk and the LSN stored on that page, LSN(P), is compared with N. If N ≤ LSN(P), then the change has already been applied and the page does not need to be rewritten to disk.
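The REDO decision above, plus the forward pass starting at the smallest recLSN M, could be sketched like this (page_lsn_on_disk is a hypothetical stand-in for reading the LSN stored on each disk page; dirty_page_table comes from the analysis sketch):

```python
def needs_redo(rec, dirty_page_table, page_lsn_on_disk) -> bool:
    """Decide whether an update log record must be reapplied during REDO."""
    page = rec.page_id
    if page not in dirty_page_table:
        return False                          # the change is already on disk
    if dirty_page_table[page] > rec.lsn:
        return False                          # the page was re-dirtied after this change,
                                              # so the change is already on disk
    # Otherwise the page itself must be consulted: redo only if the LSN stored
    # on the page is older than the log record's LSN.
    return page_lsn_on_disk[page] < rec.lsn

def aries_redo(log, dirty_page_table, page_lsn_on_disk, database) -> None:
    start = min(dirty_page_table.values())    # smallest recLSN, M
    for rec in log:
        if rec.lsn >= start and rec.type == "update" and needs_redo(
                rec, dirty_page_table, page_lsn_on_disk):
            database[rec.page_id] = rec.new_value
            page_lsn_on_disk[rec.page_id] = rec.lsn   # the page now carries this LSN
```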

Once the REDO phase is finished, the database is in the exact state that it was in when the crash occurred. Now, the UNDO phase proceeds by scanning backward from the end of the log and undoing the appropriate actions. A compensating log record is written for each action that is undone.
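Finally, a sketch of the UNDO pass, which writes a compensating log record (CLR) for every undone action (again reusing the LogRecord sketch above):

```python
def aries_undo(log, transaction_table, database) -> None:
    """Undo, in reverse order, the updates of the transactions that were still
    active (not committed) when the crash occurred."""
    losers = {tid for tid, entry in transaction_table.items()
              if entry["status"] != "committed"}
    next_lsn = max((rec.lsn for rec in log), default=0) + 1
    for rec in list(reversed(log)):           # snapshot: CLRs are appended below
        if rec.type == "update" and rec.transaction_id in losers:
            database[rec.page_id] = rec.old_value         # restore the before-image
            # The undo itself is logged, so a crash during recovery will not
            # repeat undo work that has already been completed.
            log.append(LogRecord(lsn=next_lsn, prev_lsn=rec.lsn,
                                 transaction_id=rec.transaction_id, type="CLR",
                                 page_id=rec.page_id, new_value=rec.old_value))
            next_lsn += 1
```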

In the worked example (the log and the tables are not reproduced here), a checkpoint has occurred, so the address of the associated begin_checkpoint record is retrieved; it is location 4. The analysis phase starts from location 4 and continues until it reaches the end of the log.

For the REDO phase, the smallest LSN in the Dirty Page Table is 1. Hence REDO starts at log record 1 and proceeds with the REDO of updates. The LSNs {1, 2, 6, 7}, corresponding to the updates for pages C, B, A, and C, respectively, are not less than the LSNs of those pages (as shown in the Dirty Page Table), so those data pages will be read again and the updates reapplied from the log.

UNDO is applied only to the active transaction T3. The UNDO phase starts at log entry 6 (the last update for T3) and proceeds backward in the log. The backward chain of updates for transaction T3 (only log record 6 in this example) is followed and undone.