Transaction isolation in Hibernate

Working with databases implies respecting the ACID principles. One of the "tools" used to ensure this is the transaction.

In this article we'll discover the world of transactions in Hibernate. The first part defines the concept of a transaction and covers the different transaction isolation levels. The next part briefly introduces how transactions are handled in Hibernate. The third, and at the same time last, part shows samples of transactions in use.

What are transactions?

Before approaching transaction management at Hibernate's level, we should recall the definition of a transaction. A transaction is a unit of work that helps to respect the ACID principles. This unit of work can only be executed entirely: the database will see either none of the operations placed inside the transaction (if the transaction fails, an operation called rollback) or all of them (if the transaction succeeds, an operation called commit).
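This all-or-nothing behaviour can be illustrated with a toy model in plain Java (no real database or Hibernate involved; the class and method names are purely illustrative): writes are buffered and become visible only on commit, while a rollback simply discards them.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a transaction: writes are staged in a buffer and become
// visible in the shared state only on commit(); rollback() discards them.
class ToyTransaction {
    private final Map<String, String> committedState; // data visible to everyone
    private final Map<String, String> buffer = new HashMap<>(); // uncommitted writes

    ToyTransaction(Map<String, String> committedState) {
        this.committedState = committedState;
    }

    void put(String key, String value) {
        buffer.put(key, value); // staged, not yet visible
    }

    void commit() {
        committedState.putAll(buffer); // all writes become visible at once
        buffer.clear();
    }

    void rollback() {
        buffer.clear(); // none of the writes ever become visible
    }
}

public class ToyTransactionDemo {
    public static void main(String[] args) {
        Map<String, String> db = new HashMap<>();

        ToyTransaction t1 = new ToyTransaction(db);
        t1.put("1", "cherry");
        t1.rollback();
        System.out.println("after rollback: " + db.size()); // prints 0

        ToyTransaction t2 = new ToyTransaction(db);
        t2.put("1", "cherry");
        t2.commit();
        System.out.println("after commit: " + db.size()); // prints 1
    }
}
```

A real database does much more (durability, locking, concurrency), but the zero-or-all visibility of the buffered writes is the essence of the commit/rollback contract described above.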

An important property of every transaction is its isolation. Isolation determines the visibility of the changes made by one transaction to other transactions. For example, it specifies when (before the transaction's commit, or only after it) one transaction should see the data changes made by another transaction. There are 4 main transaction isolation levels:

  1. read uncommitted - imagine two transactions, 'A' and 'B'. First, 'A' writes data into a table without committing the transaction. Then 'B' reads the uncommitted data and works on it. But an error occurs when committing the 'A' transaction and all its changes are rolled back. In this case, 'B' continues to work on data that 'A' never committed. This mode is very fast but can introduce a lot of data consistency problems.
  2. read committed - we keep the same scenario as for read uncommitted, but the uncommitted data is locked. It means that 'B' can't see the uncommitted data from the 'A' transaction. 'B' can see it only once 'A' has committed its transaction.
  3. repeatable read - this isolation level guarantees the same data is read, even if the data was changed in the meantime. We continue to work with our 'A' and 'B' transactions. First, 'B' makes a SELECT query and locks the selected rows. Then 'A' makes an INSERT query. 'B' executes a new SELECT query with the same conditions as the first one and sees the same results as previously (provided the second SELECT is made inside the same transaction as the first one).
  4. serializable - at this level, when our 'B' transaction reads the data, it locks the whole table. It means that another transaction can't modify the data in this table. Unlike read uncommitted, this is the most secure mode. But on the other hand, it's also the slowest one.

All of these levels except serializable can lead to inconsistent data reads. We can distinguish 3 types of such reads:

  1. dirty read - caused by the read uncommitted level. With this inconsistency, the data is different for two or more participating transactions. For example, the first transaction can read data written but not yet committed by the second transaction.
  2. phantom read - occurs when two identical queries, executed inside the same transaction, return different sets of rows. So it consists of reading two different sets of values inside the same transaction. But unlike a dirty read, all data in a phantom read is committed. It's called phantom because the first and the second SELECT queries contain some phantom rows (rows absent from the first SELECT results or from the second one).
  3. non-repeatable read - this problem looks almost like a phantom read, but unlike a phantom read, it applies to the same rows. Imagine that 'B' reads one row for the first time (for example, the row with id = 30) and 'A' changes one of this row's values (for example, the name). After the commit of the 'A' transaction, 'B' reads the same row again inside the same transaction. But compared to the first SELECT, it retrieves the row with different values. This problem can occur when the read committed isolation level is applied to the transactions. It's called non-repeatable because the result of 'B''s first read can't be repeated in the second read.
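The difference between a live read and a snapshot-based read can be sketched with another toy simulation (plain Java, no database; a simplified model of how a repeatable read level could be implemented, not how any particular engine actually does it): a "read committed" transaction re-reads the live state and observes a non-repeatable read, while a "repeatable read" transaction works on a snapshot taken at its first read.

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration: transaction B snapshots the table at its first read
// (repeatable read), versus re-reading the live state (read committed).
public class ReadAnomaliesDemo {
    public static void main(String[] args) {
        Map<Integer, String> table = new HashMap<>();
        table.put(30, "cherry");

        // Transaction B starts: a repeatable-read view copies the data once
        Map<Integer, String> snapshot = new HashMap<>(table);
        String firstRead = table.get(30); // live (read committed) first read

        // Transaction A commits an update in between
        table.put(30, "strawberry");

        String secondRead = table.get(30); // live read sees the change
        System.out.println("non-repeatable: " + !firstRead.equals(secondRead));
        System.out.println("snapshot still sees: " + snapshot.get(30));
    }
}
```

The live reader gets two different values for the same row inside one "transaction" (a non-repeatable read), while the snapshot reader keeps seeing "cherry", exactly the behaviour the repeatable read test later in this article demonstrates against MySQL.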

Transactions in Hibernate

Hibernate starts transactions through the getTransaction() (JPA's EntityManager) or beginTransaction() (Hibernate's Session) methods. Depending on the persistence mechanism used, a transaction is an instance of javax.persistence.EntityTransaction (for JPA) or org.hibernate.Transaction (for Hibernate's Session). Both kinds of transactions are begun with the begin() method, rolled back thanks to rollback() and committed through a commit() invocation.

To configure the transaction isolation level in Hibernate, we need to change the property called hibernate.connection.isolation. This property can take one of the following values:
- 1 (read uncommitted)
- 2 (read committed)
- 4 (repeatable read)
- 8 (serializable)

Normally, the isolation level is set at the java.sql.Connection level, through the setTransactionIsolation(int level) method. The level passed as parameter should be one of Connection's constants: Connection.TRANSACTION_READ_UNCOMMITTED, Connection.TRANSACTION_READ_COMMITTED, Connection.TRANSACTION_REPEATABLE_READ or Connection.TRANSACTION_SERIALIZABLE.
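The numeric values accepted by hibernate.connection.isolation are in fact exactly the values of these java.sql.Connection constants, which can be verified with plain JDK code (no driver or database needed):

```java
import java.sql.Connection;

// The hibernate.connection.isolation values 1, 2, 4 and 8 correspond to
// the standard JDBC isolation constants defined on java.sql.Connection.
public class IsolationConstants {
    public static void main(String[] args) {
        System.out.println("read uncommitted = " + Connection.TRANSACTION_READ_UNCOMMITTED);
        System.out.println("read committed = " + Connection.TRANSACTION_READ_COMMITTED);
        System.out.println("repeatable read = " + Connection.TRANSACTION_REPEATABLE_READ);
        System.out.println("serializable = " + Connection.TRANSACTION_SERIALIZABLE);
    }
}
```

So setting hibernate.connection.isolation to 2 makes Hibernate call connection.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED) on the connections it obtains.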

Transactions in JPA - code samples

In our examples the transaction isolation level will be set directly at the database level. The tests consist of:
- setting the transaction isolation level in the MySQL database
- beginning a transaction in the MySQL database and inserting a new row into the products table; the product's name will be "cherry"
- making a select query in Hibernate and checking the number of rows whose name is equal to "cherry"

First, let's take a look at our database, which already contains some "cherries":

mysql> SELECT * FROM products WHERE name = "cherry";
+----+--------+--------------+
| id | name   | lock_version |
+----+--------+--------------+
|  1 | cherry |            1 |
|  5 | cherry |            1 |
|  6 | cherry |            1 |
|  7 | cherry |            1 |
|  8 | cherry |            1 |
|  9 | cherry |            1 |
| 10 | cherry |            1 |
| 11 | cherry |            1 |
| 12 | cherry |            1 |
| 13 | cherry |            1 |
| 16 | cherry |            1 |
| 17 | cherry |            1 |
+----+--------+--------------+
12 rows in set (0.00 sec)

Our testing code looks like:

EntityManagerFactory emFactory = Persistence.createEntityManagerFactory("sampleUnit");
EntityManager em1 = emFactory.createEntityManager();
EntityTransaction t1 = em1.getTransaction();
t1.begin();
try {
  // count the rows visible to this transaction
  Query query = em1.createQuery("SELECT p FROM Product p WHERE p.name = :name");
  query.setParameter("name", "cherry");
  List<Product> products = query.getResultList();
  System.out.println("Got product size: " + products.size());
} finally {
  // the test only reads data, so the transaction is always rolled back
  t1.rollback();
}

This code simply queries for the list of rows whose name is "cherry". Its execution gives the output:

Got product size: 12

So the results from the database and from Hibernate are the same. The test setup is well prepared.

To test how our Hibernate code behaves under read uncommitted, we need to execute the following MySQL statements in the database:

mysql> SET GLOBAL TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
Query OK, 0 rows affected (0.00 sec)

mysql> START TRANSACTION; INSERT INTO products VALUES (NULL, 'cherry', 1);
Query OK, 0 rows affected (0.00 sec)

Query OK, 1 row affected (0.00 sec)

Here, we set the isolation level for all transactions to read uncommitted. Then we begin a transaction and insert a new row with the name "cherry". What happens on the Hibernate side?

Got product size: 13

Our expectations are correct. Transactions can read uncommitted changes, so Hibernate sees 13 rows instead of 12, even though the database transaction wasn't really committed. Now imagine that Hibernate starts to process all 13 found rows and the database transaction rolls back one nanosecond later. Data consistency is violated. If this data isn't sensitive, it's only a little bit problematic. But if the manipulated data concerns some payment process, the problem is much bigger, because a client could receive a mail asking him to pay a phantom invoice (an invoice read from the system but never actually inserted by the database).

After discovering the read uncommitted problems, let's take a look at read committed:

mysql> SET GLOBAL TRANSACTION ISOLATION LEVEL READ COMMITTED;
Query OK, 0 rows affected (0.00 sec)

mysql> START TRANSACTION; INSERT INTO products VALUES (NULL, 'cherry', 1);
Query OK, 0 rows affected (0.00 sec)

Query OK, 1 row affected (0.00 sec)

Hibernate's test will display:

Got product size: 12

Everything works well. The database transaction wasn't committed, so Hibernate can't see its changes. But as we've seen, this mode causes data locks and can slow down the whole application. Still, a small slow-down is almost always better than unhappy clients who have to pay fictive invoices.

We can also observe an interesting behavior with the serializable isolation level:

mysql> SET GLOBAL TRANSACTION ISOLATION LEVEL SERIALIZABLE;
Query OK, 0 rows affected (0.00 sec)

mysql> START TRANSACTION; INSERT INTO products VALUES (NULL, 'cherry', 1);
Query OK, 0 rows affected (0.00 sec)

Query OK, 1 row affected (0.00 sec)

When we launch the test code, we see that the transaction can't access the data at all because of the table lock applied by the serializable isolation level. Instead of a number of rows, we receive a timeout exception:

javax.persistence.LockTimeoutException: could not extract ResultSet
        at org.hibernate.ejb.AbstractEntityManagerImpl.wrapLockException(AbstractEntityManagerImpl.java:1440)
        at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1339)
        at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1310)
        at org.hibernate.ejb.QueryImpl.getResultList(QueryImpl.java:277)
        at com.waitingforcode.hibernate.locking.ReadTests.readUncommit(ReadTests.java:86)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
...
Caused by: org.hibernate.exception.LockTimeoutException: could not extract ResultSet
        at org.hibernate.dialect.MySQLDialect$1.convert(MySQLDialect.java:408)
        at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:49)
...
Caused by: java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction
  at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1055)
  at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:956)
  at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3491)
...

To test repeatable read isolation level, we need to modify our test code a little bit:

EntityManager em1 = emFactory.createEntityManager();
EntityTransaction t1 = em1.getTransaction();
t1.begin();
Query query = em1.createQuery("SELECT p FROM Product p WHERE p.name = :name");
query.setParameter("name", "cherry");
// execute the same query 5 times inside the same transaction
for (int i = 0; i < 5; i++) {
  Thread.sleep(2000);
  List<Product> products = query.getResultList();
  System.out.println("Got product size: " + products.size());
}

As you can see, we make 5 repeated reads with the same query (Query query), inside the same transaction (EntityTransaction t1). Now we need to launch the test, switch to the database and execute the following queries:

mysql> SET GLOBAL TRANSACTION ISOLATION LEVEL REPEATABLE READ;
Query OK, 0 rows affected (0.00 sec)

mysql> START TRANSACTION; INSERT INTO products VALUES (NULL, 'cherry', 1);
Query OK, 0 rows affected (0.00 sec)

Query OK, 1 row affected (1.89 sec)

mysql> COMMIT;
Query OK, 0 rows affected (0.07 sec)

mysql> SELECT COUNT(*) FROM products WHERE name = "cherry";
+----------+
| COUNT(*) |
+----------+
|       13 |
+----------+
1 row in set (0.00 sec)

A new "cherry" was added into the database and Hibernate should see 13 rows. But it doesn't:

Got product size: 12
Got product size: 12
Got product size: 12
Got product size: 12
Got product size: 12

Because of the repeatable read isolation level, Hibernate sees exactly the same number of rows for every SELECT query executed inside the begun transaction. This level acts like a cache, so we can say that the reads are faster than usual. But on the other hand, once again we are working with data that doesn't reflect reality. If we take the invoices example again, in this case we could charge our client for fewer invoices than we should. One person with one missing invoice is less problematic than one thousand such cases.

You can see that an isolation level unsuited to our environment can produce a lot of damage, and the company can lose customers. To prevent that, remember that only the serializable level is fully safe: it doesn't allow phantom, dirty or non-repeatable reads. But on the other hand, it's slower because of the lock put on the whole table, and that can be problematic in applications based on concurrent database access. The other levels (read committed, read uncommitted, repeatable read) are more responsive, but they can all provoke inconsistent reads.
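The trade-offs discussed in this article can be summarized in a small matrix of which anomalies each level permits (this reflects the ANSI SQL behaviour described above; a given database engine may be stricter than required):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Which read anomalies each ANSI isolation level allows, as covered in
// the article: {dirty read, non-repeatable read, phantom read}.
public class IsolationSummary {
    public static void main(String[] args) {
        Map<String, boolean[]> anomalies = new LinkedHashMap<>();
        anomalies.put("read uncommitted", new boolean[]{true, true, true});
        anomalies.put("read committed", new boolean[]{false, true, true});
        anomalies.put("repeatable read", new boolean[]{false, false, true});
        anomalies.put("serializable", new boolean[]{false, false, false});

        for (Map.Entry<String, boolean[]> e : anomalies.entrySet()) {
            boolean[] a = e.getValue();
            System.out.println(e.getKey() + ": dirty=" + a[0]
                + " nonRepeatable=" + a[1] + " phantom=" + a[2]);
        }
    }
}
```

Reading the matrix top to bottom, each level removes one anomaly at the price of more locking, which matches the speed-versus-consistency trade-off observed in the tests.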
