Advanced
We recommend Three Tier Software Architectures as prerequisite reading for this technology description.
Since the 1980s, two phase commit technology has been used to automatically control and monitor commit and rollback activities for transactions in a distributed database system. Two phase commit is used when data updates need to occur simultaneously at multiple databases within a distributed system. It maintains data integrity and accuracy across the distributed databases through synchronized locking of all pieces of a transaction. Two phase commit is a proven solution when data integrity in a distributed system is a requirement; it is used for hotel and airline reservations, stock market transactions, banking applications, and credit card systems. For more details on two phase commit, see the ORACLE7 Server Concept Manual and The Performance of Two-Phase Commit Protocols in the Presence of Site Failures [ORACLE7 92, UCSB 94].
As shown in Figure 7, applying two phase commit protocols ensures that the execution of data transactions is synchronized: the updates are either all committed or all rolled back (not committed) at each of the distributed databases.

Figure 7: Distributed Databases When Two Phase Commit Happens Simultaneously Through the Network
When dealing with distributed databases, such as in the client/server architecture, distributed transactions need to be coordinated throughout the network to ensure data integrity for the users. Distributed databases using the two phase commit technique update all participating databases simultaneously.
Unlike non-distributed databases (see Figure 8), where a single change is or is not made locally, distributed databases require that all participating databases either all commit or all roll back, even if there is a system or network failure at any node. This is how the two phase commit process maintains system data integrity.

Figure 8: Non-Distributed Databases Make Only Local Updates
Two phase commit has two distinct phases that are accomplished in a fraction of a second:
- The Prepare Phase, where the global coordinator (initiating database) requests that all participants (distributed databases) promise to commit or roll back the transaction. (Note: Any database could serve as the global coordinator, depending on the transaction.)
- The Commit Phase, where all participants respond to the coordinator that they are prepared, and the coordinator then asks all nodes to commit the transaction. If any participant cannot prepare or there is a system component failure, the coordinator asks all databases to roll back the transaction.
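The two phases can be illustrated with a minimal in-memory sketch. It is not any vendor's implementation: the `Participant` class, the `Vote` values, and the `two_phase_commit` function are hypothetical names chosen for illustration, and real products also write to durable logs and handle timeouts, which this sketch leaves out.

```python
from dataclasses import dataclass, field
from enum import Enum


class Vote(Enum):
    PREPARED = "prepared"
    ABORT = "abort"


@dataclass
class Participant:
    """Hypothetical in-memory stand-in for one distributed database."""
    name: str
    can_prepare: bool = True          # set to False to simulate a failure
    data: dict = field(default_factory=dict)
    staged: dict = field(default_factory=dict)

    def prepare(self, updates):
        # Prepare Phase: stage the updates and promise to commit or roll back.
        if not self.can_prepare:
            return Vote.ABORT
        self.staged = dict(updates)
        return Vote.PREPARED

    def commit(self):
        # Commit Phase (success): make the staged updates permanent.
        self.data.update(self.staged)
        self.staged = {}

    def rollback(self):
        # Commit Phase (failure): discard the staged updates.
        self.staged = {}


def two_phase_commit(participants, updates):
    """Return True if every participant committed, False if all rolled back."""
    # Prepare Phase: the coordinator collects a vote from every participant.
    votes = [p.prepare(updates) for p in participants]
    # Commit Phase: commit only if every participant is prepared.
    if all(v is Vote.PREPARED for v in votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

For example, calling `two_phase_commit([Participant("a"), Participant("b", can_prepare=False)], {"seat": "12A"})` returns `False` and leaves both participants' `data` unchanged, which is the all-or-nothing behavior described above.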
Should there be a machine, network, or software failure during the two phase commit process, the two phase commit protocols will automatically and transparently complete the recovery with no work from the database administrator. This is done through the use of pending transaction tables in each database, where information about distributed transactions is maintained as they proceed through the two phase commit. Information in the pending transaction table is used by the recovery process to resolve any transaction of questionable status. This information can also be used by the database administrator to override automated recovery procedures by forcing a commit or a rollback to available participating databases.
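One way to picture how a pending transaction table might drive recovery is the sketch below. The table layout, the state names, and the `resolve_pending` function are assumptions made for illustration; the actual structures are product specific.

```python
def resolve_pending(pending, coordinator_decisions, forced=None):
    """Resolve in-doubt entries in one database's pending transaction table.

    pending: dict of transaction id -> "prepared", "committed", or "rolled_back"
    coordinator_decisions: dict of transaction id -> "commit" or "rollback"
    forced: optional dict of administrator overrides, same values as above
    (All three structures are hypothetical.)
    """
    forced = forced or {}
    outcome = {"commit": "committed", "rollback": "rolled_back"}
    resolved = {}
    for txn_id, state in pending.items():
        if state != "prepared":
            # Already finished; nothing to recover.
            resolved[txn_id] = state
            continue
        # An administrator override takes precedence over automated recovery.
        decision = forced.get(txn_id) or coordinator_decisions.get(txn_id)
        # With no decision available, the transaction stays in doubt.
        resolved[txn_id] = outcome[decision] if decision else "prepared"
    return resolved
```

A transaction that is still `"prepared"` after this step is one whose coordinator could not be reached; in this sketch, that is the case where an administrator would supply an entry in `forced`.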
Two phase commit protocols are offered in all modern distributed database products. However, the methods for implementing two phase commits may vary in the degree of automation provided. Some vendors provide a two phase commit implementation that is transparent to the application. Other vendors require the calls to be programmed explicitly into the application, with additional programming needed should rollback be a requirement; this situation would most likely increase program cost and schedule.
The two phase commit protocol has been used successfully since the 1980s for hotel and airline reservations, stock market transactions, banking applications, and credit card systems [Citron 93].
There have been two performance issues with two phase commit:
- If one database server is unavailable, none of the servers gets the updates. This is correctable if the software administrator forces the commit to the available participants, but if this is a recurring problem the administrator may not be able to keep up, causing system and network performance to deteriorate.
- There is significant demand on network resources as the number of database servers to which data must be distributed increases. This is correctable through network tuning and correctly building the data distribution through database optimization techniques.
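The growth in network demand can be estimated with a rough count. The sketch below assumes a basic, unoptimized exchange in which each participant receives a prepare request and a decision and sends back a vote and, optionally, an acknowledgment; the function name and the `with_acks` parameter are hypothetical, and optimized protocols send fewer messages.

```python
def messages_per_transaction(participants, with_acks=True):
    """Estimate messages exchanged for one transaction (assumed basic protocol)."""
    # Per participant: prepare request, vote, decision, and optional acknowledgment.
    rounds = 4 if with_acks else 3
    return rounds * participants
```

Under these assumptions the message count grows linearly with the number of participating database servers, so doubling the number of servers doubles the traffic generated by each distributed transaction.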
Currently, two phase commit procedures are vendor proprietary. There are no standards on how they should be implemented. X/Open has developed a standard that is being implemented in several transaction processing monitors (see Transaction Processing Monitor Technology), but it has not been adopted by the database vendors [X/Open 96]. Proprietary two phase commit protocols have been published by several vendors.
An alternative to updating distributed databases with a two phase commit mechanism is to update multiple servers using a transaction queuing approach, where transactions are distributed sequentially. Distributing transactions sequentially raises the problem of users working with different versions of the data. In military usage, this could result in planning sorties for targets that have already been eliminated.
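The version problem with sequential distribution can be shown with a small sketch. Each server is modeled as a plain dictionary, and `distribute_sequentially` is a hypothetical helper that delivers an update to one server at a time and can be stopped partway to show what readers would see in the meantime.

```python
from collections import deque


def distribute_sequentially(servers, update, steps):
    """Apply `update` to servers one at a time, stopping after `steps` deliveries.

    Returns a snapshot of every server's contents at the moment delivery stops.
    """
    queue = deque(servers)
    delivered = 0
    while queue and delivered < steps:
        server = queue.popleft()
        server.update(update)   # only this server sees the new version so far
        delivered += 1
    return [dict(s) for s in servers]
```

If three servers all start with `{"target": "active"}` and only one delivery of `{"target": "eliminated"}` has completed, the snapshot shows the first server as updated while the other two still report the old value, which is the inconsistency that two phase commit is meant to prevent.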
This technology is classified under the following categories.

Name of technology: Database Two Phase Commit
Application category: Client-Server (AP.2.1.2.1); Data Management (AP.2.6.1)
Quality measures category: Accuracy (QM.2.1.2.1)
Computing reviews category: Distributed Systems (C.2.4)

[Citron 93] Citron, A., et al. "Two-Phase Commit Optimization and Tradeoffs in the Commercial Environment," 520-529. Proceedings of the Ninth International Conference on Data Engineering. Vienna, Austria, April 19-23, 1993. Los Alamitos, CA: IEEE Computer Society Press, 1993.

[ORACLE7 92] "Two-Phase Commit," 22-1-22-21. ORACLE7 Server Concept Manual (6693-70-1292). Redwood City, CA: Oracle, 1992.

[Schussel 96] Schussel, G. Replication, The Next Generation of Distributed Database Technology [online]. Available WWW <URL: http://www.dciexpo.com/geos/replica.htm> (1996).

[UCSB 94] The Performance of Two-Phase Commit Protocols in the Presence of Site Failures (TRCS94-09). Santa Barbara, CA: University of California, Computer Science Department, April 1994.

[X/Open 96] X/Open Web Site [online]. Available WWW <URL: http://www.rdg.opengroup.org/> (1996).

Darleen Sadoski, GTE
David Altieri, GTE
20 June 97: updated URLs for [Schussel 96] and [X/Open 96]; changed label for [UCSB 94]
10 Jan 97 (original)
The Software Engineering Institute (SEI) is a federally funded research and development center sponsored by the U.S. Department of Defense and operated by Carnegie Mellon University.
Copyright 2007 by Carnegie Mellon University
URL: http://www.sei.cmu.edu/str/descriptions/dtpc_body.html
Last Modified: 11 January 2007