Operational semantics of transactions

更新时间:2023-08-13 11:17:01 阅读量: IT计算机 文档下载

说明:文章内容仅供预览,部分内容可能不全。下载后的文档,内容与下面显示的完全一致。下载之前请确认下面内容是否您想要的,是否完整无缺。

Mathematics is forcing towards a consistent framework of theory development. Computer Science is an engineering discipline and sometimes suffers from ad-hoc definitions. Transactions are a concept that is commonly used in the database area. It is often def

Operational Semantics of Transactions

Andreas Prinz Bernhard Thalheim

DResearch Digital Media GmbH,Computer Science Institute Otto-Schmirgal-Str.3,Brandenburg University of Technology at Cottbus D-10319Berlin,Germany PostBox101344,D-03013Cottbus Email prinz@DResearch.de Email thalheim@informatik.tu-cottbus.de

Abstract

Mathematics is forcing towards a consistent frame-work of theory puter Science is an engineering discipline and sometimes su?ers from ad-hoc de?nitions.Transactions are a concept that is commonly used in the database area.It is often de?ned in the form:given a syntactic construct in an abstract form and declare a number of properties an engine should support which is not speci?ed and invisible.

This paper aims in providing an operational se-mantics for transactions.A DBMS implementation is then considered to be a faithful re?nement of the operational semantics.

1Introduction

Transactions are one of the fundamental frameworks in the information systems area.It is necessary to de?ne the notion of“transaction”that is robust ac-cording to the following requirements:

?Logical semantics must coincide with operational semantics for transactions.

?Parallel execution of transactions must be de?n-able inside the operational semantics used for transactions.

?One re?nement of the transaction model is the implementation of transaction execution by a DBMS.

?Arbitrary order of execution:Transactions can be executed in any order as long as they are not competing for resources.

?Rigid punch:Transaction execution leaves traces in the database whenever the e?ect of the trans-action does not contradict the database.

1.1Variety of De?nitions

“De?nitions are a matter of luck”is a humorous statement often made by A.N.Kolmogoro?and H. Thiele.The transaction de?nition made in a variety of books and papers seems to be a good example of this claim:

TA as obligation(Embley1998):“A transaction is a program unit that accesses the database;it re-trieves and may update data.A database system has the responsibility of executing a transaction Copyright c 2003,Australian Computer Society,Inc.This pa-per appeared in the Fourteenth Australasian Database Con-ference(ADC2003),Adelaide,Australia.Conferences in Re-search and Practice in Information Technology,Vol.17.Xi-aofang Zhou and Klaus-Dieter Schewe,Ed.Reproduction for academic,not-for pro?t purposes permitted provided this text is included.

so that it is both atomic and correct....A trans-action is a program unit that preserves correct-ness and atomicity.”

TA as an agent(Garcia-Molina et al.2000):“A trans-action,like any program,executes a number of steps in sequence;often several of these steps will modify the database.Each transaction has

a state,which represents what has happened so

far in the transaction.The state includes the current place in the transaction’s code being exe-cuted and the values of any local variables of the transaction that will be needed later on.”

TA as special program(Codd1990):“A transaction is a collection of activities involving changes to the database,all of which must be executed suc-cessfully if the changes are to be committed to the database,and none of which may be committed if any one or more of the activities fail.Normally, such a collection of activities is represented by a sequence of relational commands.The beginning of the sequence is signaled by a command such as BEGIN or BEGIN TRANSACTION.Its termination is signaled by a command such as END or COMMIT-or,if it is necessary to abort the transaction,ABORT.”

Two views on TA’s(Hsu1998):“An end user com-municates with a database through a mechanism called a transaction.A transaction can be de-?ned from the user viewpoint and from the sys-tem viewpoint.The end user(the operator,the system administrator,etc.)sees a transaction as a request/reply unit expressed in the form of

a source program.The system sees a transac-

tion as a sequence of operations(reads,writes, etc.)on the data elements.The user conveys a change to the DBMS via a transaction and awaits

a reply from the system.The DBMS then imple-

ments the set of operations(de?ned in the trans-action)on a subset of data elements by execut-ing the transaction under a set of the changes through a“successful”execution of the transac-tion.The DBMS guarantees the incorporation of the changes through a“successful”execution of the transaction.We will refer to such execution of a transaction as“commit”.

A transaction T must possess a set of well-

de?ned properties to be able to correctly re?ect in the database the changes to the part of the real world.In executing a transaction,the sys-tem guarantees that all the changes proposed in the transaction,not only a part of them,are in-corporated correctly in the database.”

TA as concurrent operation(Elmasri/Navathe2000):“The execution of a program that accesses or changes the contents of the database is called

a transaction.The transaction submitted by

various users may execute concurrently and may

Mathematics is forcing towards a consistent framework of theory development. Computer Science is an engineering discipline and sometimes suffers from ad-hoc definitions. Transactions are a concept that is commonly used in the database area. It is often def

access and update the same database records.If this concurrency is uncontrolled,it may lead to problems such as an inconsistent database.”

TA as speci?c application programs(Lewis et al.2002):“A transaction is a program that can perform the following functions:

1.It can update a database to re?ect the

occurrence of a real-world event that e?ects the state of the enterprise the database is modeling.

An example...

2.It can ensure that one or more real-world

events occur....

3.It can return information derived from the

database.”

The problems of the variety becomes worse if in the same source a variety of de?nitions is used.Beside the de?nition of(Hsu1998)in(Hsu/Kumar1998)an-other de?nition(H¨a rder/Reuter1998)is used which is repeated by most of the books in the database area (Vossen1991):

The concept of a transaction,which includes all database interactions between BeginOfTransaction and EndOfTransaction in the preceding example,re-quires that all of its actions be executed indivisibil-ity:Either all actions are properly re?ected in the database or nothing has happened.No changes are re?ected in the database if at any point in time before reaching the CT,the user enters the ERROR clause containing the RestoreTransaction.To achieve this kind of indivisibility,a transaction must have four properties:

Atomicity.It must be of all-or-nothing type described before,and the user must,whatever happens, know which state he or she is in. Consistency.A transaction reaching its normal end (EOT,end of transaction),thereby commit-ting its results,preserves the consistency of the database.In other words,each successful trans-action by de?nition commits only legal results.

This condition is necessary for the fourth prop-erty,durability.

Isolation.Events within a transaction must be hidden from other transactions running concurrently.If this were not the case,a transaction could not be reset to its beginning for the reasons sketched earlier.The techniques that achieve isolation are known as synchronization,and since Gray et al.

..there have been numerous contributions to this topic of database research..

Durability.Once a transaction has been completed and has committed its results to the database, the system must guarantee that these results survive any subsequent malfunctions.Since there is no sphere of control constituting a set of transactions,the database management system (DBMS)has no control beyond transaction boundaries.Therefore,the user must guarantee that things the system says have happened have actually happened.Since,by de?nition,each transaction is correct,the e?ects of an inevitable incorrect transaction(i.e.,the transaction con-taining faulty data)can only be removed using countertransactions.

Another requirement used is the serializability re-quirement:

Running two transactions in parallel should have the same e?ect as running them one after the other. Transaction order is important for the e?ect.Con-sider one transaction changing the value for x to2x and another transaction changing x to x?2.There-fore,order of execution matters.Execution of the sec-ond after the?rst gives2x?2.Execution of the?rst after the second gives2x?4.Therefore,serializabil-ity means that running a number of transactions in parallel should have the same e?ect as running them sequentially in a certain order.

These de?nitions are taught in database courses. Therefore,the database community de?ned in brief that a transaction is nothing else as a sequence of database operations that preserve the ACID properties.

1.2State Models Used in Transaction De?-

nitions

There are very few papers and books proposing a state model of transaction execution.Let us summa-rize and extend the models proposed so far.We notice that the description below is not explicitly proposed in the literature but can be extracted on the basis of the intentional,narrative descriptions.

State model:The transaction engine has?ve states (Gray/Reuter1993):

BeginOfTransaction(BOT):The transactions marked by‘not?nalized’are in the BOT

state and wait for execution.

Run:The transaction engine runs the transac-tion and executes read,write and compute

statements.

Abort:The transaction is in an abort state.The resources occupied are freed.After com-

pletion the transaction returns to the BOT

state.

Commit:The transaction engine has completed the statements of the transaction and checks

now the correctness of the integrity con-

straints.If the constraints are valid the

next state is the EOT state.Otherwise,the

engine directs the transaction to the abort

state.

EndOfTransaction(EOT):The transaction en-gine completes the execution of the trans-

action and marks the transaction by‘?nal-

ized’.

This state model is displayed in Figure1.

r r

r r

r r j¨¨

¨¨

¨¨B

r r

r r

r r j

T

'

'

BOT

Run

Abort

Commit

EOT

Figure1:The States of a Transaction in the State Chart Approach

The model is often used in the literature in the form displayed in Figure1without the return from abort to BOT.However,transactions are rerun if they fail or abort.

Event model:The event model(Elmasri/Navathe 2000)is based on the events the recovery man-ager may use.

BeginOfTransaction:The label BOT marks the beginning of a transaction.

Read or Write:The transaction engine exe-cutes elementary operations for the given

transaction.

EndOfTransaction:The read/write sequence has ended.Now integrity is to be checked.

CommitTransaction:The CommitTransaction event signals the successful completion of

the transaction.

Additionally,the recovery manager uses events:

Mathematics is forcing towards a consistent framework of theory development. Computer Science is an engineering discipline and sometimes suffers from ad-hoc definitions. Transactions are a concept that is commonly used in the database area. It is often def

Rollback:The transaction has not been success-ful.E?ects to the database must be undone.

Redo:The redo event causes the manager to re-peat the operation.

Undo:The undo event forces the manager to re-pair the e?ects of applying a singleton op-eration to the database.The ovals used in Figure 2show system activities,i.e.,state transitions:Active:The transaction sequence is currently ex-ecuted.

Partially committed:The sequence has been exe-cuted and the concurrency controller checks

whether there is an interference with other transactions.Furthermore,integrity con-straints are checked by the transaction en-gine.

Committed:The transaction has been success-fully completed.The auxiliary logs are re-moved.Failed:The e?ects of the transaction on the database are compensated.Terminated:The transaction has been success-fully completed or has failed.In the case that the transaction failed no e?ect on the database can be observed.The event model state transition diagram is pic-tured in Figure 2.

E BeginOf Transaction

Active c E EndOf Transaction Partially

Committed Commit Committed

c Failed

r r r

r r

r Terminated c r r r

j ¨

¨¨¨¨¨

¨¨¨%c Read Write Figure 2:The Event Model of Transactions Statechart model:The transaction engine starts

(Weikum 2002)at the Begin state.After calling

the transaction the transaction state is changed

to Active .The transaction is either running or

blocked by the engine due to the database state

or the state of other transactions.An active

transaction is either committed or aborted.

The statechart of transactions is displayed in Fig-ure 3.The three models use a transaction engine (or sched-uler or recovery engine)for the explanation what is

considered to be a transaction.It seems,however,that transactions should be de?ned without referring

to an engine or an implementation.

2General De?nition of Transactions 2.1Basic De?nitions

Transactions are de?ned over databases schemata.Let (S ,Σ)be a database schema and O B S the set of basic modi?cation and retrieval operations de?ned on S .

Begin

Committed Aborted Active

c T Running Blocked

Resume

Block ££

£#t E

T ''Commit Reject

Restart ¢¢¢ t Figure 3:The Statechart Model of Transactions Typically,the elementary modi?cation operation is the write operation de?ned on locations loc =(R,o )of an object o in a class R C de?ned on R .The elemen-tary retrieval operation is the read operation de?ned

on locations (R,o )of an object o in a class R C de?ned on R .

Basic modi?cation operations are the insertion,deletion and the updating operations de?ned for an

object o in a class R C de?ned on R or a group of objects.These operations are typically bound by an identi?cation predicate for the object or the group

of objects.In object-relational databases we assume that the identi?cation predicate is value-based .Basic retrieval operations are the select expres-sions de?ned by structural recursion on the structur-ing S .Classical SQL expressions are expressions of the form map (filter (join (...),ψ),S )

where the ?lter predicate is again an expression,the target structure for the mapping (or construction)is S .The static constraints in the schema (S ,Σ)can be transformed to transition constraints (Thalheim 2000).A transition constraint (Ψpre ,Ψpost )de?nes the preconditions and postconditions for state transi-tions of databases de?ned over S .Given a transition τconverting the database S C 1to the database S C 2=τ(S C 1).The transition constraint (Ψpre ,Ψpost )is valid for the transition (S C 1,τ(S C 1))if S C 1|=Ψpre

entails S C 2|=Ψpost .

Static constraints Σare therefore converted to a

transition constraint (Σ,Σ).

2.2Syntax of Transactions Transactions are de?ned on the basis of elementary operations.Following (Levene/Loizou 1999),we de-?ne a transaction T over (S ,Σ)as a ?nite sequence o 1;o 2;o 3;...;o m of basic modi?cation and retrieval op-erations over (S ,Σ).

Transactions may be applied to the database state

S C

sequentially and form a transition

T (S C )=o m (...(o 2(o 1(S C )))).2.3Functional semantics of transactions Logical semantics is based on the validity of transi-tion constraints.The transaction is considered to be a singleton transition.Given a transaction T over

(S ,Σ)and a database S C .

The result of applying the transaction T to S C is the database T (S C ).

The e?ect of application of T to S C is de?ned as

Mathematics is forcing towards a consistent framework of theory development. Computer Science is an engineering discipline and sometimes suffers from ad-hoc definitions. Transactions are a concept that is commonly used in the database area. It is often def

a transition constraint preserving transformation

T (S C )=

T (S C )if T (S C )|=Σ

S C if T (S C )|=ΣThe transaction can be thus understood as an invari-ant state transition.

Transactions T 1and T 2are competing if read (T 1)∩write (T 2)=?or read (T 2)∩write (T 1)=?or write (T 2)∩write (T 1)=?.The sets read (T i )and write (T i )consists of the locations of objects which are read or written by the transaction T i .

Parallel execution of transactions T 1 T 2is cor-rect if either the transactions are not competing or the e?ect of T 1 T 2(S C )is equivalent to T 1(T 2(S C ))or to T 2(T 1(S C ))for any database S C .If paral-lel execution is correct transaction execution can be scheduled in parallel.

Logical semantics of transactions is de?ned by consistency (each transaction preserves transition

constraints)and

parallelization (transaction can be executed in par-allel).

We observe that atomicity is not considered.Atomic-ity is declared by granularity of transitions.Further-more,we are not concerned with durability.Durabil-ity is not a logical property but rather a property of storage.2.4

Operational semantics of transactions

Instead of using an abstract interpretation and a set of models,an abstract Moore-automaton M =(Z,f,I,Z final )is used with the set Z of states,its subset Z final of ?nal states,the state transition func-tion f ,and the initialization function I which assigns a starting state I (p,d )=z p,d ∈Z to each program p and each input d .The interpretation of the pro-gram p and the input d is the sequence of states z p,d =z 0,z 1,...,z i ,...with z i =f (z i ?1)for i ≥1.If the sequence is ?nite for n p,d and z n p,d ∈Z final then the program p is correct for d .If f (z i )is un-de?ned for some i and z i ∈Z final then the program does not have a meaning for d .If the sequence is in-?nite with z i ∈Z final for all i ∈N then the program is not terminating for d .

A variety of approaches has been developed for de?nition of operational semantics:

Scheduling,access and recovery models:Transactions

are executed in parallel and independent from each other.In order to support this require-ment,access,scheduling and recovery models are developed (Biskup 1995).

Given a set of transactions T 1,...,T n .A sched-ule S (T 1,...,T n )of T i =o i,1,...,o i,n i ,1≤i ≤n is assignment of moments S (o i,j )of time to the operations of the transactions which preserves the order of operations within each transaction.Now an access plan can be speci?ed for the ob-jects to be used in transactions.An access plan is roughly called con?ict-free if no transaction reads a value which is under change by another transaction.

This approach has the advantage that the trans-action machine is constructed.The disadvantage is the complexity of the constructive approach.Any change in transaction policy or constraint enforcement policy imposes a severe number of changes in the de?nitions.

Abstract automata models are widely used for pro-gramming languages.Such abstract models have the advantage that re?nement of requirements is re?ected by re?nement of abstract automata.For this reason,this approach is preferable if we are able to to de?ne such abstract automata.Operational semantics for transactions must be based on parallel execution of processes.Therefore,we need a machine that allows to model parallel exe-cution.3De?ning Operational Semantics Through Abstract State Machines (ASM)

3.1

Functional Semantics for Transactions using ASM

Abstract state machines (ASM)M (Gurevich 1997,Gurevich 2000,St¨a rk,Schmid &B¨o rger 1996)provide a framework for speci?cation of parallel execution.Abstract state machines are de?ned by a number of rules

IF condition DO actions

which can be applied to a state space in parallel .The condition or guard is an arbitrary ?rst-order for-mula.

The notion of the state space is the classical no-tion of mathematical structures (?)where data are abstract objects collected in relations or characteris-tic functions.The state space consists of static func-tions (which never change during any run)and dy-namic functions.Controlled functions are dynamic functions which are updateable only by ASM rules.Monitored functions are only updated by the envi-ronment.Interaction functions are updatable both by the ASM and the environment.Derived functions are neither updatable by the ASM nor by the envi-ronment but are de?ned in terms of other functions.Actions are updates of the state space,i.e.,func-tions of the form

f (t 1,...,t n ):=t

whose execution is changing the value of the lo-cation loc represented by the function f at the given parameters.The notion of the ASM run is the same as the computation notion of transition systems.An ASM computation step in a given state is the simultaneous execution of all rules whose guard is true in the state if these updates are consistent.An update set U is consistent if there are no pairs f (t 1,...,t n ):=t and f (t 1,...,t n ):=t in U with t =t .The ?ring a consistent update set U in a state A results in a new state B in which controlled and interaction functions are changed by the update set.

Simultaneous execution allows to abstract from ir-relevant details of sequential execution and to use syn-chronous parallelism.We observe that,therefore,the model ?ts very well to transaction execution.

ASM semantics has been been used for de?nition of semantics of object-oriented models in (Gottlob et al.1991)and for de?ning database recovery in (Gurevich et al.1997).

Distributed ASM use a set A of agents.In our setting,each transaction is an agent.The set of all operations is denoted by M .The operations x ∈M may belong to di?erent transactions T at the same time.The following requirements are applied to runs ρ:

?The set M is a partially ordered set with the predicate <.The set {y |y ≤x }is ?nite for each x .

?For each transaction T ∈A the set {x |x ∈T }is linearly ordered.

Mathematics is forcing towards a consistent framework of theory development. Computer Science is an engineering discipline and sometimes suffers from ad-hoc definitions. Transactions are a concept that is commonly used in the database area. It is often def

?The application functionΞof application of M assigns a state of M to the empty set of oper-ations.The functionΞassigns a state to each initial segment of M.The stateΞ(X)is the re-sult of performing all operations in X.

The application function is restricted by the co-herence condition:

If x is a maximal operation in a?nite initial seg-ment X of M and Y=X\{x},then the trans-action T with x∈T is an active transaction in Ξ(Y)andΞ(X)is obtained fromΞ(Y)by per-forming x atΞ(Y).

3.2Models of Modi?cation of Database Sys-

tems.

Modi?cation in-place:Any modi?cation caused by an operation of the transaction to the data in the secondary memory is directly written to the lo-cation of those data.In order to be able to undo

a modi?cation in the case of failure of the trans-

action the changes are recorded.The undo thus applies to the case that the transaction fails or aborts.

Modi?cation in-private:The transaction keep their lo-cal spaces.The local space can be abandoned if the transaction fails or aborts.If the transaction commits then the data of local space which are associated to the data in the secondary memory are?ushed to the secondary memory.

Shadow modi?cation:With the start of the transac-tion,a shadow page which is identical to the cur-rent page is allocated.This shadow page shows the state of data before the transaction.The transaction works on the original page.If the transaction fails the shadow is used for recovery.

If the transaction commits the shadow page is removed.

3.3Abstract Operational Semantics for

Transactions using ASM

The concurrency control can be solved by a large variety of approaches.Database books usually dis-cuss one or some of the approaches and introduce the transaction concept on the basis of those approaches. It is,however,not necessary,to introduce transac-tions on the basis of a speci?c solution.Furthermore, transaction machines also implement one or some of possible solutions.The transaction model is,however, far more general than the speci?c solution provided by a given DBMS.In the sequel we shall discuss two of the approaches.

We require that each of the solutions is a re?ne-ment of the relation?Ξ

?→.The re?nement of the transaction relation should be conservative,i.e.,all properties valid for?Ξ

?→must be preserved.

Conceptually,each transaction is working over a local copy of the database and writing the results of its work into the global database at the end.In order to cover all approaches for databases in the abstract presentation,we introduce the following abstract op-erations:

CreateOwnDB for creating a local copy of the global database,

PrepareMergeDB for preparing the merging into the global database,

MergeOwnDB for merging the results into the global database,

FreeOwnDB for housekeeping at the end of the lo-cal database,ReadOwnDB for reading values from the local database,and

WriteOwnDB for writing values to the local database.

These operation are re?ned in the sequel for sev-eral approaches of transaction implementation.

The State Space for Transactions.We distinguish between

Status of the transaction:The status of the transac-tion Status(transaction)is either undef(denot-ing inactivity of transactions in the queue),ac-tive(denoting that a transaction is currently run-ning),commit(denoting that a transaction has been committed),ready2commit(denoting that a transaction is ready for committing),failed(de-noting that the transaction has failed),or done (denoting that a transaction has been success-fully completed).See also the state transition diagram below.

Content of the transaction:Syntactically,transac-tions are sequences of basic computations, retrieval and modi?cation operations.The content Content(transaction)of the transaction is used for recharging the queue of operations to be performed by the database system.

The content of a transaction is given by a queue Queue(transaction)of operations to be performed.

Persistent database space:The database is kept per-sistently.Any change to the database must pre-serve transition constraints.We use the Sta-bleDB for representing the persistent database.

The General State Transition Diagram for Transac-tions..The diagrams discussed above cannot cap-ture the entire picture of transaction processing.We use additional states for the de?nition of transactions:

Ready2Commit:The e?ect of application of the transaction is checked against the database. ICtrue:If data under change are kept within the lo-cal space of the transaction the space is prepared for writing to the database.

Failed:The local space is used recovering from fail-ure of the transaction.

Inactive:The transaction is ready for execution but not yet scheduled.

The state model of the transaction displayed in Figure 1is extended to the model shown in Figure4.

r r

r r

r r j

r r

r r

r j

T

T

'

'

T

'

©

E

s

BOT

Run

Failed

Aborted

Inactive

Ready2Commit

ICtrue

Committed

EOT

Figure4:The States of a Transaction

We use a Z-like notation for speci?cation of the rule

ruleName:IF condition DO actions,i.e.,

Mathematics is forcing towards a consistent framework of theory development. Computer Science is an engineering discipline and sometimes suffers from ad-hoc definitions. Transactions are a concept that is commonly used in the database area. It is often def

ruleName:

condition

actions

The abstract setting for transactions can be now de?ned by the following rules:

BOT:

Status(Self)=undef

Status(Self):=active

Queue(Self):=Content(Self)

CreateOwnDB()

This rule speci?es the instantiation of the local space of the transaction after it has been scheduled. RUN:

Status(Self)=active

IF Queue(Self)=?

THEN

CASE Top(Queue(Self))OF

read(loc):ReadOwnDB(loc)

write(loc,val):WriteOwnDB(loc,val)

compute(expr,loc):

WriteOwnDB(loc,compute(expr,loc)) ENDCASE

Queue(Self):=Pop(Queue(Self))

ELSE Status(Self):=ready2commit

ENDIF

The RUN state rule for transactions speci?es the run of the transaction.This run is modeled without consideration of concurrent writes to the stable database.Concurrent writes appear in the state ICtrue.We shall show later how we can cope with this situation by introducing a central location log space for transactions.

READY2COMMIT:

Status(Self)=ready2commit

IF OwnDB|=Σ

THEN Status(Self):=ICtrue

ELSE Status(Self):=ICfalse

ENDIF

The READY2COMMIT state rule checks whether a consistent state can be reached if the local mod-i?cation data of the transaction are written to the stable database.

FAILED:

(Status(Self)=failed∨Status(Self)=ICfalse)

Status(Self):=aborted

In the abstract setting,the FAILED rule simply transfers the status of the transaction to aborted.We will discuss in the next subchapters the re?nement of this rule.

ABORTED:

Status(Self)=aborted

IF CanReschedule(Self)THEN

Status(Self):=undef

FreeOwnDB()

The ABORTED rule transfers the state of the transaction to the inactive state.From there it becomes active again.The activation is triggered by the monitored function CanReschedule. ICTRUE:

Status(Self)=ICtrue

PrepareMergeDB()

Status(Self):=committed

If the transaction is completed and does not invalidate the integrity constraints we can prepare the states for writing data to locations in the stable database.This state will be modi?ed due to problems of isolation.

COMMITTED:

Status(Self)=committed

MergeOwnDB()

FreeOwnDB()

Status(Self):=done

Program(Self):=undef

The COMMITTED rule checks out the transac-tion.The data which are in the queue for writing are now written in parallel to the stable database.

For completeness we do now de?ne an abstract version of the interface operations. CreateOwnDB:

OwnDB(Self):=globalDB

initOwnDB(Self):=globalDB

MergeOwnDB:

for all x∈Location:

OwnDB(Self,x)=initOwnDB(Self,x)

globalDB(x):=OwnDB(Self,x) FreeOwnDB:

//do nothing here

ReadOwnDB(where:Location):

return OwnDB(Self,where)

WriteOwnDB(where:Location,val:Value):

OwnDB(Self,where):=val

3.4Operational Semantics for Transactions

in the In-Private Setting

Operational semantics of transaction can be re?ned in a large variety of ways.We shall demonstrate this va-riety for transactions in the in-private setting.These variations demonstrate the advantage of the abstract de?nition discussed in section3.3.The transactions which are enabled to run are supported by:

Current database:The current database Cur-rentDB(transaction,location)keeps all those data that are used in the working space and that are related to data on the same location at the secondary storage device.

Flash data:The moment of writing may depend on the actual conditions of the machine.There-fore,we write data which may be transferred to the database in the secondary storage device to Ready2Write(location).

Mathematics is forcing towards a consistent framework of theory development. Computer Science is an engineering discipline and sometimes suffers from ad-hoc definitions. Transactions are a concept that is commonly used in the database area. It is often def

Log data for constraint checking:The data which are generated by the transaction must be checked.

We write all these data to the space Trans-fer4Write(transaction,location)).

Change space:The database system may also write data directly to the stable database.In this case, we record overwriting of data by the transaction using the space LogDB(transaction,location).

This space allows to undo the e?ects the trans-action caused in the secondary storage. Variables:The transaction may use additional vari-ables for computation.These variables belong to a transaction and have a location.We use the space CurrentDB(transaction,location)for recording.

Concurrency control data:The transaction machine may use also data for concurrency control of sets of competing transactions.Typical concurrency control data are lock data,i.e.the lock space Lock(location)partially assigning a location to

a transaction or the read-write-lock space Read-

Lock(location)which partially assigns a location to a set of transactions and WriteLock(location) which partially assigns a location to one transac-tion.

Most of the literature on TA uses only one or a small number of the variety.

It su?ces to describe the meaning of the abstract operations in order to get the in-private setting: CreateOwnDB:

CurrentDB(Self,*):=undef

Transfer4Write(Self,*):=undef PrepareMergeDB:

FOR ALL x∈

(σT A=t(T ransfer4W rite))[Location]

∩StableDB[Location]

DO Ready2WriteDB(x):=

Transfer4Write(Self,x)

ENDDO

MergeOwnDB:

FOR ALL x∈Ready2W riteDB[Location] DO StableDB(x):=Ready2WriteDB(x)

ENDDO

FreeOwnDB:

//still do nothing here

CurrentDB(Self,loc):=StableDB(loc) WriteOwnDB(where:Location,val:Value):

Transfer4Write(t,loc):=CurrentDB(t,loc)

We observe now

OwnDB=StableDB∪

((σT A=t(T ransfer4W rite))[Location]

∩StableDB[Location]) Proposition1If the run of transactions t1,...,t k is applicable then the transition by the transactions t1,...,t k in the in-private setting is atomic and con-sistent.

Proposition2Durability is preserved for transac-tions in the in-private setting.

We observe however that isolation is not pre-served by this set of rules.There are many isolation de?ciencies for parallel execution.The following list is not exhaustive:

(LU):Transactions may read data from a location before another transaction uses this location,com-pute new values and write to the location after an-other transaction has written to this location.This behavior is called lost update.

(DR):A transaction may abort and may have writ-ten to a location.Another transaction may read data from this location before the state of the location is changed to the one before the?rst transaction has been performed.This behavior is called dirty read be-fore abort.The same problem occurs if a transaction writes several times to the location and another trans-action reads from this location between the writes. This situation is called dirty intermediate read. (DW).A transaction that is aborting later changes a location.The location is later(but before the abort of the?rst transaction)used by another transaction for computation of data which are written to another location or direct change of the same location.This behavior is called dirty write.

(NRR).A transaction may read several times from a location.The location has been changed by another transaction between the read operations of the?rst transaction.This behavior is called non-repeatable read.

(PR):A transaction inserts or deletes data to a loca-tion between the execution of aggregation functions in another transaction which predicates operate on the location of the?rst transaction.This behavior is called phantom read.

We observe that the transaction execution in the in-private setting does not have any dirty read or dirty write.

We shall demonstrate now that a number of re?ne-ments of the rule set above exists such that isolation can be reached:

Lock space solutions:We use the Lock(location) space or the ReadLock(location)and Write-Lock(location)spaces.

Let us consider the case of utilization of the Lock(location)space.The assignment of locks can follow di?erent strategies:

Obtain all locks at the beginning:The transac-tion obtains all locks necessary during

initialization of the transaction.Two

settings are applicable:

Optimistic setting:Locks are only obtained

for write operations.The controller

causes an abort of the transaction or an

assignment of a wait state whenever an-

other transaction is competing for this

location.

Pessimistic setting:All locks are obtained

for all read and write operations.

Obtain locks just in time:Whenever a write op-eration(or in the pessimistic setting a read

operation)is performed by the transaction

the corresponding lock is obtained.

The release of the locks can be based on two strategies:

Mathematics is forcing towards a consistent framework of theory development. Computer Science is an engineering discipline and sometimes suffers from ad-hoc definitions. Transactions are a concept that is commonly used in the database area. It is often def

Release as early as possible:Whenever the trans-action does not need the data from loca-

tion the lock is released.If the transaction

aborts a speci?c recovery technique is nec-

essary.

Release at the end of the transaction:All locks obtained by the transaction are released at

the end.

The utilization of the ReadLock(location)and WriteLock(location)is similar.Read locks are collected as long as there is no write attempt.If there is a write attempt to loc by t then the set of read lock to the location is singleton ReadLock= {t}or empty.Otherwise the transaction either must be aborted or must cause an abort of the other transactions using this location.

In the case of occupied locks two options can be applied:

Aborting the transaction:The transaction cannot be continued and is aborted.

Delaying the transaction:The transaction is transferred to a wait state.The state is

changed whenever all locks can be obtained.

The invocation of aborted or not completed transactions may be ordered by a number of ap-proaches,e.g.,

Kill-wait approach:If a transaction t makes a re-quest that competes with an operation of

another transaction t then the order of the

transaction request the?rst transaction to

wait if the order of t is less than that of t .

In the other case t is aborted.

Wait-die approach:If the transaction t is com-peting with another transaction t and the

order of the?rst transaction is less than

that of the second the?rst transaction is

aborted.In the other case,the?rst trans-

action waits.

Typical orders are based on time stamps.

The locking may be performed on locations or on clusters of locations,e.g.,pages.In this case we use predicate locks which specify the granularity of locking.

Intermediate storage solutions:We may execute all transactions?rst and use an intermediate storage UsedDB(loc)for the data that have already been obtained by other transactions.In this case,the ReadOwnDB rule is changed by a locking rule:

ReadOwnDB(where:Location):

IF UsedDB(loc)=undef∨

UsedDB(loc)=Self

THEN

CurrentDB(Self,loc):=StableDB(loc)

UsedDB(loc):=t

ELSE Status(Self):=aborted

ENDIF

In a similar form the FreeOwnDB rule is changed by adding actions for release of locations in UsedDB.

The variety of solutions discussed for lock space solutions is also applicable to the case of inter-mediate storage.

Monitor-based solutions:Shared data environments can be enhanced by monitors,i.e.,extensions of the objects by modules acting similar to an access and write guard.Any transaction accessing or

writing to objects performs its operation through the monitor associated with that object.Moni-tors are therefore locking programs on the stor-age level.If an object is unlocked then a trans-action may access the object and write to the object.A transaction may obtain object locks and release them in the FreeOwnDB rule.

We may distinguish,therefore,a number of ap-proaches for implementing monitors similar to the implementation of lock space solutions.The transaction is performing in the BOT a message exchange with the StableDB.Similarly to lock space solutions,optimistic and pessimistic strate-gies and release strategies can be applied.

Let us change the rule set used above for the pes-simistic locking at the beginning and abortion of the transaction if the locks cannot be obtained: CreateOwnDB:

CurrentDB(Self,*):=undef

Transfer4Write(Self,*):=undef

FOR ALL x∈

{loc|read(loc)∈Content(Self)

∨write(loc)∈Content(Self)} DO

IF Lock(loc)=undef

THEN Lock(loc):=t

ELSE Status(Self):=failed

ENDDO

This locking technique is combined with release at the end by a change of the following rule: FreeOwnDB:

FOR ALL loc:(Lock(loc)=t)

DO Lock(loc):=undef

ENDDO

The other rules remain unchanged.

Now we can prove the following property: Proposition3Transaction execution in the in-private setting based

·on pessimistic locking at the beginning

·with the aborting option if locks cannot be obtained ·with release at the end

is a conservative re?nement of the relation?Ξ

?→. Similar propositions can be derived for the other iso-lation solutions.Most DBMS use locking techniques. Older DBMS have used also techniques based on in-termediate storage.These solutions have shown sim-ilar performance parameters as locking techniques.

Monitor-based techniques are not used in classi-cal relational DBMS.These solutions are currently experienced to real-time and distributed databases. As far as we know,however,these techniques are not well-documented.

3.5Operational Semantics for Transactions

in the In-Place Setting

Let us now discuss in brief the in-place setting for transactions.We follow the approach used in section 3.4.We introduce the general state description.We can derive similar properties.The in-place setting is an optimistic strategy.If the transactions does not abort the treatment is much simpler than in the case of the in-private setting.This advantage is,however,

Mathematics is forcing towards a consistent framework of theory development. Computer Science is an engineering discipline and sometimes suffers from ad-hoc definitions. Transactions are a concept that is commonly used in the database area. It is often def

only observable in centralized environments.In dis-tributed environments,the in place setting requires sophisticated monitoring.

The states of transactions pictured in Figure4are speci?ed as follows:

CreateOwnDB:

CurrentDB(Self,*):=undef

Transfer4Write(Self,*):=undef

LogDB(Self,*):=undef

This rule uses an explicit LogDB for recording the actions of the transaction.

PrepareMergeDB:

//do nothing here

MergeOwnDB:

//do nothing here

FreeOwnDB:

FOR ALL x∈(σT A=t(LogDB))[Location] DO StableDB(x):=LogDB(Self,x)

LogDB(Self,x):=undef

ENDDO

CurrentDB(Self,loc):=StableDB(loc) WriteOwnDB(where:Location,val:Value): WriteOwnDB()

IF LogDB(Self,loc)=undef

THEN LogDB(Self,loc):=StableDB(loc)

ENDIF

StableDB(loc):=CurrentDB(Self,loc)

Transfer4Write(t,loc):=CurrentDB(t,loc)

We obtain the equality:OwnDB=StableDB.

All activities performed are directly written to the StableDB.If the transactions abort these writes must be compensated by an undo action.All writes must be recorded in the LogDB.We use the LogDB only for a write performed by one transaction.If history of writes must be recorded the treatment of LogDB must be more sophisticated.

The integrity constraint control becomes simpler since the StableDB must be checked without consid-eration of other ASM states.

The undo action is simple as long as only one write to the LogDB is allowed.If we must record a history then monitoring or locking approaches can be used.

The next three rules are very simple since they change only the state of the transaction. Proposition4If the run of transactions t1,...,t k is applicable then the transition by the transactions t1,...,t k in the in-place setting is atomic and consis-tent.

Proposition5Durability is preserved for transac-tions in the in-place setting.

Isolation can be based on similar techniques that have already been discussed in section2.4.4Towards an Understanding of Constraint Enforcement Used in SQL’99

Integrity enforcement becomes more complex if trans-actions are considered.SQL’99(SQL1999)has intro-duced a rather confusing treatment.

SQL’99constraints can be coupled with integrity en-forcement policy in a large variety:

Checking mode:The integrity enforcement can be deferred until the end of transactions

or can be forced to an immediate control

directly at the moment a change in the

database appears.

Choice of statement or row level:The granularity of enforcement can be at the row level or at

the statement level.If row level is chosen

each modi?cation(insert,delete or update)

of object forces application of integrity con-

trol.

Constraints may be pre-or post-conditions:

Constraints may be checked before or after

a statement or a modi?cation is executed.

Scope conditions for reference columns allow to disable or enable the application of referen-

tial integrity constraints.

Matching conditions soften the satisfaction of ref-erential integrity constraints.The equalities

to be checked can be partially ful?lled.

Reference types are based on tuple(or object) identi?ers and can be used instead of for-

eign key values.

Triggers are based on the event-condition-action paradigm of rules depending on events performed

on a class.Events are modi?cation actions exe-

cuted on the database.Triggers have a number

of variations:

Number of triggers per events and events per trigger: Triggers can be based on exactly one or

several events.Events can be attached to

one or several triggers.

Activation time of triggers can be before or af-ter appearance of the event speci?ed for the

trigger.Further,activation conditions may

be de?ned.

Con?ict resolution of execution order of triggers may be based on di?erent policies.

Order of constraint check di?ers.DB2checks?rst key and unique constraints,second referential con-

straints and then check constraints.Sybase,Or-

acle and Informix control?rst check constraints,

then key constraints and last referential con-

straints.Ingres and MS SQL check?rst keys

and uniqueness,then check constraints and last

referential constraints.

SQL’92declarative constraints are not null conditions, key conditions,check conditions,foreign key

constraints,uniqueness conditions,domain con-

straints and assertions.Although they have not

yet been completely implemented by DBMS they

are kept in the standard for SQL’99.It should

be noted,however,that the SQL’99standard

did not restrict to one semantics for these con-

straints.

This large variety must be understandable with trans-action models.We feel the urgent need for sophisti-cated transaction models with an operational seman-tics which helps in understanding which model has to be chosen in which case for which integrity constraint enforcement policy.The models proposed above en-able us in de?ning an operational semantics and in reasoning on the e?ects of the choices made by the application programmer.

Mathematics is forcing towards a consistent framework of theory development. Computer Science is an engineering discipline and sometimes suffers from ad-hoc definitions. Transactions are a concept that is commonly used in the database area. It is often def

This large variety becomes completely confus-ing if rule triggering is considered.Except (Schewe/Thalheim1998),rule triggering is not well-understood in the database community.The SQL standard allows one event per trigger and an arbi-trary number of triggers per event.This approach is used in DB2,MS SQL,Sybase SQL Anywhere and partially in Informix.Ingres and Oracle do not limit the number of events per trigger.Sybase uses the op-posite approach:only one trigger per event is allowed but the number of events per trigger is not limited. It can be shown that the Sybase approach leads to better programming.

The SQL’99approach su?ers from a number of pitfalls:

Local de?nition without global understanding.

Trigger avalanches.

Unknown implications.

Swinging transaction systems.

Constraint modi?cation anomaly.

A better approach is the derivation of such a specialization of a given basic operation which pre-serves the integrity constraint.We may use the great-est consistent specialization(Schewe/Thalheim1999) as such specialization.(Link2002)provides a nice framework for applying this approach to tailored re-?nements.

The transaction models discussed above at the conceptual level can be re?ned by combining the trig-ger execution frame with the state transition diagram of a transaction.

The state model uses additional structures of the working space:

Immediate constraints are either checked at the row or at the statement level.There are two activation modes:

Check before execution for constraints mode is ‘immediate’and whose activation time is

‘before’.

Check after execution for constraints which are checked after execution of a statement or

after modi?cation of a tuple.

Deferred constraints are checked at the end.They re-place the setΣin the READY2COMMIT state.

Using queues instead of sets of constraints models the order of constraint check which is di?erent in current DBMS.

The state model in Figure4is extended by the states Prepare4RunNext,ApplicationOfImmedi-ateTrigger,and Ready4Next.Due to paper length we do not elaborate the application of these options.

5Conclusion

Transactions are speci?ed at the logical level as atomic operations which preserve consistency of a database.They can be potentially executed in par-allel if they are not competing for resources or if the competition can be resolved.The logical level does not consider speci?c details of implementation op-tions.Implementation options depend on the support of the computation and main-memory engine,on the solutions for the isolation of competing transactions and on the consistency enforcement mode.

This paper proposes both a logical semantics and an operational semantics for transactions which can be re?ned in dependence on the options.References

Biskup,J.(1995),Foundations of information systems,Vieweg, Braunschweig,(in German).

Codd, E.F.(1990),The relational model for database manage-ment-Version2Addison-Weslay.

Elmasri R.&Navathe,S.B.(2000),Fundamentals of database systems,Benjamin/Cummings Publ.

Embley,D.(1998),Object database development,Addison-Wesley.

Gottlob,G.,Kappel,G.&Schre?,M.(1982),‘Semantics of object-oriented data models:The evolving algebra approach’,in LNCS504,Springer,144-160.

Garcia-Molina,H.,Ullman,J.D.&Widom,J.(2000),Database system implementation,Prentice-Hall.

Gurevich,J.,Soparkar,N.&Wallace, C.(1997),Formalizing database recovery,Journal of Universal Computer Science, 3,4,320-340.

Gray J.&Reuter,A.(1993),Transaction processing:Concepts and techniques,Morgan-Kaufman.

Gurevich,Y.(May1997),Draft of the ASM Guide.Technical Report,Univ.of Michigan EECS Department,CSE-TR-336-

97.

Available from the ASM website via http://www.eecs.umich.edu/gasm/

Gurevich,Y.(2000),Sequential abstract-state machines capture sequential algorithms.ACM Transactions on Computa-tional Logic,1,1,77-111.

Haerder T.&Reuter,A.(1998),Principles of transaction-oriented database recovery,in(Hsu/Kumar1998),16-55.

Hsu M.&Kumar,V.(1998),Introduction to database recovery,in (Hsu/Kumar1998),6-15.

ISO International Standard:Database language SQL-Part2: Foundation(SQL Foundation).(1999),International Orga-nization for Standardization&American National Standard Institut,ANSI/ISO/IEC9075-2:99,Sept.1999.

Kumar V.&Hsu M.(eds.),(1998),Recovery mechnisms in database systems,Prentice-Hall.

Lewis,P.M.,Bernstein, A.&Kifer,M.(2002),Databases and transaction processing:An application-oriented approach, Addison-Wesley.

Levene M.&Loizou,G.(1999),A guided tour to relational databases and beyond,Springer.

Link,S.(2002),‘Towards a tailored theory of consistency enforce-ment in databases’,in Proc.FoIKS’02(eds.T.Eiter,K.-D.

Schewe),LNCS2284,Springer,160-177.

Malzew,A.I.(1970),Algebraic systems,Nauka,(in Russian).

Schewe K.-D.&Thalheim,B.(1998),‘Limitations of rule triggering systems for integrity maintenance in the context of transition speci?cation’,Acta Cybernetica,13,277-304.

Schewe K.-D.&Thalheim,B.(1999),‘Towards a theory of consis-tency enforcement’,Acta Informatica,36,2,97-141.

St¨a rk,R.,Schmid,J.&B¨o rger, E.(2001),Java and the Java virtual machine:De?nition,veri?cation and validation, Springer.

Thalheim,B.(2000),Entity-relationship modeling–Foundations of database technology,Springer.

Thalheim,B.(2001),Abstraction layers in database structuring: The star,snow?ake and hierarchical structuring,Preprint I-13-2001,Computer Science Institute,Brandenburg Univer-sity of Technology at Cottbus.

Vossen,G.(1991),Data Models,database languages and database management systems,Addison Wesley1991.

Weikum,G.&Vossen,G.(2002),Transactional information sys-tems:Theory,algorithms,and the practice of concurrency control and recovery,Morgan-Kaufman.

本文来源:https://www.bwwdw.com/article/f7ij.html

Top