JDBC 3.0 - Part II: : Transaction Fundamentals

Saikat Goswami's picture
articles: 

This article explores the fundamentals of a transaction, in the context of JDBC, Enterprise Java Beans and a database.

Introduction ---- What is a Transaction?

A transaction is a set of operations that should be completed as one unit. When I say 'one unit', I mean, 'as a whole'. If it is okay for one operation to fail in the set, then the set cannot be called a 'transaction'. You go to a store, choose a book, take it to the counter, pay for it, the cashier puts the book in a bag, and you bring it home. This set of operations is dependent on each other. You can think of similar 'grouped' operations in real life (wonder why they call it 'real life', since life is always real).

ACID makes it a Transaction

In order for a set of operations to be a transaction, there are four characteristics that is a standard way to measure the transactional nature of a set of operations. They are called ACID (yet another acronym in the software world). A for Atomicity, C for Consistency, I for Isolation, D for Durability.

Atomicity: All the individual operations must go through. If any one in the group fails, the transaction fails. So, if you want to bring the book home without paying it, it is not a transaction, because all operations are not performed as a unit.

Consistency: This means that the database must be updated correctly, no matter what: the set of operations succeeds or fails. The programmer who is programming the business logic does not worry about the database being persisted or not. The design of a 'transaction' has to be that the database is updated appropriately.

Isolation: The operations are executed without any external process being able to disturb its sequence. If any other process wants to do a task in the transactional set, it has to wait. This prevents data from being corrupted by two different sources. For instance, if husband and wife have a joint account, and husband wants to deduct money online from his checking, and coincidentally, wife wants to deduct the money too, one of them has to wait.

Durable: This refers to the condition that once a transaction is complete, the database is and definitely is updated. The programmer responsible for programming business logic should not worry about the database not being updated.

To summarize, a transaction has a beginning and an end (just like everything else). These operations are all or none. Changes made in one transaction are not affected by changes made by another transaction. In short, transactions ensure data integrity.

Auto-Commit ---- A Simple Way to Manage a Transaction

One thing of mention immediately after you have read what you read is about the auto-commit options available to you by the java.sql.Connection class.
For example:

try{
	Connection conn = dataSource.getConnection();
	// set autoCommit to false
	conn.setAutoCommit(false);

CheckingAccount ca = new CheckingAccount();
SavingsAccount sa = new SavingsAccount();

// let's say this method checks if 'amount' is available in an
// account
checkIfAmountAvailable(amount, sa);

// let's say this method transfers 'amount' from savings to
// checking account
transfer( sa, amount, ca);

conn.commit();
}
catch( Exception e){
try{
conn.rollback();
}catch( Exception ne){}
}
finally{
try{
if( conn != null) {
conn.close();
}
} catch( Exception e) {}
}

In the above code, there are two steps in this transaction. The first step is to see if the amount exists in the savings account. The second step is the actual transfer. If and only if both the operations succeed, the results of the transaction will be committed in the database. If the first one succeeds and the second one fails, an exception will be thrown. This way, in the code, we are controlling the persistence in database. Had the auto-commit option being set to true (which is the default scenario), we would not be able to have control over the situation. If the first one succeeded and the second one was not, the changes rendered by the first method will already have been stored in the database (which is not good).

If the databases to which we are committing are different (i. e. the first method talks to a database, and the second method makes changes to a different database), then we can no longer use the Connection class's auto-commit feature. In that case, it becomes a 'distributed transaction', and things like 'two-phase commit' come into the picture.

Isolation Levels (I of the ACID)
While auto-commit takes care of the atomicity, 'isolation levels' take responsibility of the 'isolation' characteristic that is required of a transaction. As explained earlier, isolation is the level of visibility during a transaction. There are three standard levels of transaction isolation:

  • Dirty Read: Changes made during a transaction is 'visible' to other parties. If method checkIfAmountAvailable(..) is called, and right then the wife logs in online and wishes to view the savings account (remember, husband and wife have a joint account; so either has the same rights), she sees any changes made by the method. In this case, our first method checkIfAmountAvailable(..) is not making any changes. But if did, she would be able to see it, had the isolation level been set for 'dirty reads'. As you might have guessed, for banking scenarios like this, we will not have the isolation level to be set for dirty reads. An example could be viewing online catalogs. It would not make really that much of a difference if someone saw a catalog of, say, used floppies and missed a couple because it read dirty data.

  • Non-repeatable Read: Transaction 1 reads a row; transaction 2 changes the same row transaction1 read just now; transaction 1 re-reads the eponymous row. If transaction 1 sees the changes made by transaction2, we have just seen an example of a 'non-repeatable read'. Non-repeatable reads can corrupt data beyond repair. You have read a row the first time in your EJB, and have already started making changes to your variables. Now, the second time the data is different. Holy Moly!!

  • Phantom Read: Transaction 1 retrieves a bunch of rows, transaction 2 inserts a row, transaction 1 re-reads the query (you are right, since this query returned a bunch of rows, it does have a WHERE clause). If transaction 1 sees the additional row, it is the ghost row, the 'phantom read'.

A database has its own way of implementing these isolation levels. JDBC 3.0 offers five isolation levels that we can set through the Connection interface. It is very simple. The Connection interface has two methods:

public void setTransactionIsolation(int level) throws SQLException;
public int getTransactionIsolation() throws SQLException;

The integer passed in the set method is one of the following (these are the five isolation levels that JDBC 3.0 offers). These are static integers available in the Connection object:

  • TRANSACTION_READ_UNCOMMITTED: Does not prevent any of the dirty read, non-repeatable read or phantom read. One transaction can see anything anywhere. It does not care of any other transaction is in the middle of something.
  • TRANSACTION_READ_COMMITTED: This one prevents only dirty reads.
  • TRANSACTION_REPEATABLE_READ: Prevents dirty reads and non-repeatable reads.
  • TRANSACTION_SERIALIZABLE: Prevents all the bad things: dirty reads, non-repeatable reads and phantom reads.
  • TRANSACTION_NONE: This lonestar level means transactions are not supported.

These four isolation levels (we can ignore the last one, because it is just a 'not applicable' case) gives the Java developer a degree of control to manage transactions according to business rule sensitivity.

Summary

Dealing with transactions is very common practice. Setting restrictive isolation levels are expensive, as they involve a lot of database overhead, and affects performance (access to database becomes slow).

Prepared by Saikat Goswami, Boston, Massachussets, sai_nyc@hotmail.com

Comments

Hello Saikat,
I understood the different situations you have explaied, but if you can elaborate more on isolation level with some more eaxamples
,It would be very benificial for all of us.
Regards...

Problems:

# Dirty Read.

# Non-repeatable Read.

# Phantom Read.

Isolation Levels:

* TRANSACTION_SERIALIZABLE.
* TRANSACTION_REPEATABLE_READ.
* TRANSACTION_READ_COMMITTED.
* TRANSACTION_READ_UNCOMMITTED.
* TRANSACTION_NONE.

Now stop smoking pot.

Good stuff, Very good for beginners about transactions.

Thx.

Great description of the isolation levels. Just what I needed.

Was looking for it..thanks

Hello Saikat,
I understood the different situations you have explaied, but if you can elaborate more on transaction_none isolation level,I would be very happy.
Regards...