We almost always use auto-increment keys (or serial columns in PostgreSQL). But Hibernate offers three more ways to generate the primary keys of entity objects. This article presents them.
The first part describes a pattern used to represent identities in the object world: surrogate keys. After that, we'll explore all 4 primary key strategies.
What is a surrogate key?
To better understand this concept, we should start by explaining the word "surrogate". According to several online dictionaries, surrogate means "one that takes place of another". Surrogate keys are therefore substitute keys without business meaning: a user who reads one, for example an auto-generated number, learns nothing from it. They are just simple values generated by the database or the application.
The opposite of surrogate keys are natural keys. They exist and are understood in the real world, so they are much easier to read than surrogate keys. Good examples of natural keys are social security numbers or parcel tracking numbers. But using them as primary keys in applications is risky. Imagine that somebody enters someone else's social security number and tries to update the data with the clause "WHERE security_number = 'XXX'". Yes, they'll be able to change rows that don't belong to them. As for parcel tracking, you should know that some companies "reuse" old tracking numbers. So by using them as primary keys, you could show wrong information to your visitor.
Surrogate keys can be generated automatically by the database system, for example as AUTO_INCREMENT or serial columns. They guarantee no conflicts and don't change while the row exists.
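The tracking number problem can be reproduced without any database. The following is only a sketch in plain Java: the class name KeyChoiceSketch, the two methods and the sample tracking number "TN-100" are made up for illustration, and a map stands in for a table.

```java
import java.util.HashMap;
import java.util.Map;

final class KeyChoiceSketch {

    // Natural key: rows are stored under the tracking number itself.
    static Map<String, String> storeByTrackingNumber(String[][] parcels) {
        Map<String, String> rows = new HashMap<>();
        for (String[] parcel : parcels) {
            rows.put(parcel[0], parcel[1]); // a reused number replaces the older row
        }
        return rows;
    }

    // Surrogate key: rows are stored under a generated number without business meaning.
    static Map<Long, String> storeBySurrogateId(String[][] parcels) {
        Map<Long, String> rows = new HashMap<>();
        long nextId = 1;
        for (String[] parcel : parcels) {
            rows.put(nextId++, parcel[1]);
        }
        return rows;
    }

    public static void main(String[] args) {
        // The carrier reused the (hypothetical) number TN-100 for a second parcel.
        String[][] parcels = { {"TN-100", "book"}, {"TN-100", "lamp"} };
        System.out.println("rows by tracking number: " + storeByTrackingNumber(parcels).size());
        System.out.println("rows by surrogate id: " + storeBySurrogateId(parcels).size());
    }
}
```

With the natural key only one row survives, while the surrogate key keeps both parcels.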
Primary key strategies in Hibernate
In reality, Hibernate's key strategies come from the javax.persistence.GenerationType enum. In our examples, we'll use the following table:
CREATE TABLE IF NOT EXISTS products (
  id INT(5) NOT NULL,
  name VARCHAR(20) NOT NULL,
  PRIMARY KEY(id)
);
The definition of the id column will change from one strategy to another. The test method used to illustrate primary key generation is:
Session session = sessionFactory.openSession();
try {
    Product product1 = new Product();
    product1.setName("Product#1");
    session.save(product1);

    Product product2 = new Product();
    product2.setName("Product#2");
    session.save(product2);

    session.flush();
} catch (Exception e) {
    LOGGER.error("An error occurred on saving products", e);
} finally {
    session.close();
}
And now the entity, without any generation strategy defined:
@Entity
@Table(name="products")
public class Product {

    private int id;
    private String name;

    public void setId(int id) {
        this.id = id;
    }

    @Id
    @Column(name="id")
    public int getId() {
        return this.id;
    }

    public void setName(String name) {
        this.name = name;
    }

    @Column(name="name")
    @NotEmpty
    public String getName() {
        return this.name;
    }

    @Override
    public String toString() {
        return "Product {"+this.name+"}";
    }
}
In each example, only the getId() part of this class will change.
AUTO
The simplest strategy: Hibernate picks the generation mechanism suited to the database in use, which here relies on the configuration of the given column. AUTO is also the default value of @GeneratedValue's strategy attribute. The new getId() part looks like this:
//...
@Id
@GeneratedValue(strategy = GenerationType.AUTO)
@Column(name="id")
public int getId() {
    return this.id;
}
//...
We also define the key generation at the database level:
mysql> ALTER TABLE products CHANGE COLUMN id id INT(5) NOT NULL AUTO_INCREMENT;
The primary key is now auto-incremented. Let's execute the test method and look at the content of the products table:
mysql> select * from products;
+----+-----------+
| id | name      |
+----+-----------+
|  1 | Product#1 |
|  2 | Product#2 |
+----+-----------+
2 rows in set (0.00 sec)
Hibernate uses the strategy defined at the database level. The entity field annotated with @Id gets the auto-incremented value.
IDENTITY
This strategy indicates that Hibernate will take the value generated by the database and assign it to the primary key field of the entity. To illustrate this case, we only need to change the entity class:
//...
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
@Column(name="id")
public int getId() {
//...
After executing the test method, you should have the following entries in the database (the first two rows remain from the AUTO example):
mysql> select * from products;
+----+-----------+
| id | name      |
+----+-----------+
|  1 | Product#1 |
|  2 | Product#2 |
|  3 | Product#1 |
|  4 | Product#2 |
+----+-----------+
4 rows in set (0.00 sec)
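The flow described for IDENTITY (the database chooses the key during the insert, then the value is copied into the entity) can be imitated in plain Java. This is only a sketch under simplifying assumptions: ProductsTable, Product and save() below are hypothetical stand-ins, not Hibernate or JDBC classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class IdentityColumnSketch {

    // Stand-in for the products table with an AUTO_INCREMENT id column.
    static final class ProductsTable {
        private final Map<Integer, String> rows = new LinkedHashMap<>();
        private int nextAutoIncrement = 1;

        // The "database" chooses the id during the insert and reports it back.
        int insert(String name) {
            int generatedId = nextAutoIncrement++;
            rows.put(generatedId, name);
            return generatedId;
        }
    }

    // Simplified entity: the id is unknown until the row has been inserted.
    static final class Product {
        int id;
        final String name;

        Product(String name) {
            this.name = name;
        }
    }

    // Saving in this sketch: insert first, then copy the generated key into the entity.
    static void save(ProductsTable table, Product product) {
        product.id = table.insert(product.name);
    }

    public static void main(String[] args) {
        ProductsTable table = new ProductsTable();
        table.insert("Product#1"); // rows left by the previous run
        table.insert("Product#2");

        Product product1 = new Product("Product#1");
        Product product2 = new Product("Product#2");
        save(table, product1);
        save(table, product2);
        System.out.println(product1.id + ", " + product2.id);
    }
}
```

Because two rows already exist, the new entities receive ids 3 and 4, as in the table above.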
SEQUENCE
The next generation strategy is called SEQUENCE. It uses database sequences to produce entity primary keys. What does it mean in practice? A sequence looks almost like an auto-increment key. The most important difference for us is that sequences don't exist in MySQL. That's why, for this case, we'll use a PostgreSQL database. Let's first create the table and the sequence with the following queries:
springsandbox=> CREATE SEQUENCE seq_products START WITH 1 INCREMENT BY 2;
springsandbox=> CREATE TABLE products (id INTEGER NOT NULL DEFAULT nextval('seq_products'), name VARCHAR(20) NOT NULL);
CREATE TABLE
springsandbox=> \d+ products;
                                       Table "public.products"
 Column |         Type          |                     Modifiers                      | Storage  | Description
--------+-----------------------+----------------------------------------------------+----------+-------------
 id     | integer               | not null default nextval('seq_products'::regclass) | plain    |
 name   | character varying(20) | not null                                           | extended |
Has OIDs: no

springsandbox=> ALTER SEQUENCE seq_products OWNED BY products.id;
The database configuration is very explicit. In the CREATE SEQUENCE query we say that the sequence starts with 1 and is incremented by 2. After that, it's important to associate the created sequence with the products table, which the DEFAULT nextval('seq_products') clause does. The last ALTER SEQUENCE query is also important. It marks the sequence as owned by the id column of the products table, so that the sequence is dropped automatically when this column or table is dropped.
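To make the START WITH and INCREMENT BY settings concrete, here is a minimal plain-Java imitation of such a sequence. It is only a sketch: SequenceSketch and its nextval() method are hypothetical names that mimic the behaviour of nextval('seq_products'), not PostgreSQL or Hibernate code.

```java
final class SequenceSketch {

    private final long incrementBy;
    private long nextValue;

    SequenceSketch(long startWith, long incrementBy) {
        this.nextValue = startWith;
        this.incrementBy = incrementBy;
    }

    // Rough equivalent of nextval('seq_products'): hand out a value, then advance.
    synchronized long nextval() {
        long current = nextValue;
        nextValue += incrementBy;
        return current;
    }

    public static void main(String[] args) {
        // Same settings as CREATE SEQUENCE seq_products START WITH 1 INCREMENT BY 2.
        SequenceSketch seqProducts = new SequenceSketch(1, 2);
        long first = seqProducts.nextval();
        long second = seqProducts.nextval();
        long third = seqProducts.nextval();
        System.out.println(first + ", " + second + ", " + third); // prints 1, 3, 5
    }
}
```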
After this database setup, we can move on to the entity modifications:
@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "productsSequence")
@SequenceGenerator(name = "productsSequence", sequenceName = "seq_products", allocationSize=2)
@Column(name="id")
public int getId() {
    return this.id;
}
A new attribute appears in the @GeneratedValue annotation: generator. It tells Hibernate which generator must be used to produce the identifier value. In our case, the generator is declared by the @SequenceGenerator annotation, which points to the sequence created previously in the database (seq_products). Note that the "INCREMENT BY" value is represented here by allocationSize. Both should be the same to avoid some surprising numbers. @GeneratedValue finds the right generator thanks to @SequenceGenerator's name attribute.
Executing the test method should produce the following entries:
springsandbox=> select * from products;
 id |   name
----+-----------
  2 | Product#1
  3 | Product#2
(2 rows)
Something doesn't work here. Logically, consecutive ids should differ by 2, the sequence's INCREMENT BY value, yet here they differ by 1. That's because Hibernate's sequence generator does not simply return the values of the database sequence. Hibernate uses org.hibernate.id.enhanced.SequenceStyleGenerator to compute the final identifier. We can see how it works by comparing the logs when allocationSize equals the database INCREMENT BY value and when allocationSize is equal to 1. The equality case first:
26-04-2014 15:11:15;912 : TRACE: org.hibernate.internal.SessionImpl - Opened session at timestamp: 13985178758
26-04-2014 15:11:15;924 : TRACE: org.hibernate.event.internal.DefaultSaveOrUpdateEventListener - Saving transient instance
26-04-2014 15:11:15;934 : DEBUG: org.hibernate.SQL - select nextval ('seq_products')
// ...
26-04-2014 15:11:16;038 : DEBUG: org.hibernate.id.enhanced.SequenceStructure - Sequence value obtained: 3
// ...
26-04-2014 15:11:16;043 : DEBUG: org.hibernate.event.internal.AbstractSaveEventListener - Generated identifier: 1, using strategy: org.hibernate.id.enhanced.SequenceStyleGenerator
// ...
26-04-2014 15:11:16;075 : TRACE: org.hibernate.event.internal.DefaultSaveOrUpdateEventListener - Saving transient instance
26-04-2014 15:11:16;075 : DEBUG: org.hibernate.event.internal.AbstractSaveEventListener - Generated identifier: 2, using strategy: org.hibernate.id.enhanced.SequenceStyleGenerator
// ...
26-04-2014 15:11:16;085 : DEBUG: org.hibernate.event.internal.AbstractFlushingEventListener - Flushed: 2 insertions, 0 updates, 0 deletions to 2 objects
26-04-2014 15:11:16;086 : DEBUG: org.hibernate.event.internal.AbstractFlushingEventListener - Flushed: 0 (re)creations, 0 updates, 0 removals to 0 collections
26-04-2014 15:11:16;088 : DEBUG: org.hibernate.internal.util.EntityPrinter - Listing entities:
26-04-2014 15:11:16;089 : DEBUG: org.hibernate.internal.util.EntityPrinter - com.waitingforcode.db.entity.Product{id=1, name=Product#1}
26-04-2014 15:11:16;089 : DEBUG: org.hibernate.internal.util.EntityPrinter - com.waitingforcode.db.entity.Product{id=2, name=Product#2}
26-04-2014 15:11:16;089 : TRACE: org.hibernate.event.internal.AbstractFlushingEventListener - Executing flush
// ...
26-04-2014 15:11:16;191 : TRACE: org.hibernate.persister.entity.AbstractEntityPersister - Dehydrating entity: [com.waitingforcode.db.entity.Product#1]
26-04-2014 15:11:16;198 : TRACE: org.hibernate.type.descriptor.sql.BasicBinder - binding parameter [1] as [VARCHAR] - Product#1
26-04-2014 15:11:16;200 : TRACE: org.hibernate.type.descriptor.sql.BasicBinder - binding parameter [2] as [INTEGER] - 1
// ...
26-04-2014 15:11:16;219 : TRACE: org.hibernate.persister.entity.AbstractEntityPersister - Dehydrating entity: [com.waitingforcode.db.entity.Product#2]
26-04-2014 15:11:16;220 : TRACE: org.hibernate.type.descriptor.sql.BasicBinder - binding parameter [1] as [VARCHAR] - Product#2
26-04-2014 15:11:16;220 : TRACE: org.hibernate.type.descriptor.sql.BasicBinder - binding parameter [2] as [INTEGER] - 2
And now allocationSize equals 1 while INCREMENT BY is equal to 2:
26-04-2014 14:36:45;813 : TRACE: org.hibernate.internal.SessionImpl - Opened session at timestamp: 13985158057
26-04-2014 14:36:45;823 : TRACE: org.hibernate.event.internal.DefaultSaveOrUpdateEventListener - Saving transient instance
26-04-2014 14:36:45;834 : DEBUG: org.hibernate.SQL - select nextval ('seq_products')
// ...
26-04-2014 14:36:45;880 : DEBUG: org.hibernate.id.enhanced.SequenceStructure - Sequence value obtained: 3
// ...
26-04-2014 14:36:45;883 : DEBUG: org.hibernate.event.internal.AbstractSaveEventListener - Generated identifier: 3, using strategy: org.hibernate.id.enhanced.SequenceStyleGenerator
// ...
26-04-2014 14:36:45;912 : TRACE: org.hibernate.event.internal.DefaultSaveOrUpdateEventListener - Saving transient instance
26-04-2014 14:36:45;912 : DEBUG: org.hibernate.SQL - select nextval ('seq_products')
// ...
26-04-2014 14:36:45;914 : DEBUG: org.hibernate.id.enhanced.SequenceStructure - Sequence value obtained: 5
// ...
26-04-2014 14:36:45;915 : DEBUG: org.hibernate.event.internal.AbstractSaveEventListener - Generated identifier: 5, using strategy: org.hibernate.id.enhanced.SequenceStyleGenerator
// ...
26-04-2014 14:36:45;942 : DEBUG: org.hibernate.event.internal.AbstractFlushingEventListener - Flushed: 2 insertions, 0 updates, 0 deletions to 2 objects
26-04-2014 14:36:45;942 : DEBUG: org.hibernate.event.internal.AbstractFlushingEventListener - Flushed: 0 (re)creations, 0 updates, 0 removals to 0 collections
26-04-2014 14:36:45;951 : DEBUG: org.hibernate.internal.util.EntityPrinter - Listing entities:
26-04-2014 14:36:45;952 : DEBUG: org.hibernate.internal.util.EntityPrinter - com.waitingforcode.db.entity.Product{id=5, name=Product#2}
26-04-2014 14:36:45;952 : DEBUG: org.hibernate.internal.util.EntityPrinter - com.waitingforcode.db.entity.Product{id=3, name=Product#1}
// ...
26-04-2014 14:36:46;073 : TRACE: org.hibernate.persister.entity.AbstractEntityPersister - Dehydrating entity: [com.waitingforcode.db.entity.Product#3]
26-04-2014 14:36:46;080 : TRACE: org.hibernate.type.descriptor.sql.BasicBinder - binding parameter [1] as [VARCHAR] - Product#1
26-04-2014 14:36:46;082 : TRACE: org.hibernate.type.descriptor.sql.BasicBinder - binding parameter [2] as [INTEGER] - 3
// ...
26-04-2014 14:36:46;097 : TRACE: org.hibernate.persister.entity.AbstractEntityPersister - Dehydrating entity: [com.waitingforcode.db.entity.Product#5]
26-04-2014 14:36:46;097 : TRACE: org.hibernate.type.descriptor.sql.BasicBinder - binding parameter [1] as [VARCHAR] - Product#2
26-04-2014 14:36:46;098 : TRACE: org.hibernate.type.descriptor.sql.BasicBinder - binding parameter [2] as [INTEGER] - 5
The difference lies in the "Generated identifier (...) using strategy: org.hibernate.id.enhanced.SequenceStyleGenerator" entries. Inside this class we see that Hibernate uses an optimizer to improve identifier generation performance. Take a look at the code (version 4.2.7 of Hibernate):
protected String determineOptimizationStrategy(Properties params, int incrementSize) {
    // if the increment size is greater than one, we prefer pooled optimization; but we first
    // need to see if the user prefers POOL or POOL_LO...
    String defaultPooledOptimizerStrategy = ConfigurationHelper.getBoolean( Environment.PREFER_POOLED_VALUES_LO, params, false )
            ? OptimizerFactory.StandardOptimizerDescriptor.POOLED_LO.getExternalName()
            : OptimizerFactory.StandardOptimizerDescriptor.POOLED.getExternalName();
    String defaultOptimizerStrategy = incrementSize <= 1
            ? OptimizerFactory.StandardOptimizerDescriptor.NONE.getExternalName()
            : defaultPooledOptimizerStrategy;
    return ConfigurationHelper.getString( OPT_PARAM, params, defaultOptimizerStrategy );
}
The first comment explains a lot of the mystery. If the increment size (our allocationSize attribute) is greater than 1, an optimizer is used to derive identifiers from the database sequence number. In our case, it was org.hibernate.id.enhanced.OptimizerFactory.HiLoOptimizer. (Per the method above, the default for an increment size greater than 1 is a pooled optimizer, so another optimizer is used only when the optimizer parameter selects it.) As explained in the JavaDoc of HiLoOptimizer, it's based on the HiLo algorithm. This algorithm works with two values: upper and lower. The upper one is calculated with the following formula: (database next sequence value * allocationSize) + 1. The lower one is: upper value - allocationSize.
In our case, the upper value was 3 ((1*2) + 1) and the lower one was 1 (3 - 2). That explains why the ids start with 1. But why is the second product's id incremented by only 1? The documentation note makes it clear: "Note, 'value' always (after init) holds the next value to return". So in our case the optimizer had already generated one value (1) and then simply returned the next value (+1).
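The arithmetic can be checked with a few lines of plain Java. This sketch only encodes the two formulas quoted in the text and the "+1" stepping inside one block. It is not the code of Hibernate's optimizer, and the names HiLoArithmeticSketch, upper, lower and idsInBlock are invented for the example.

```java
import java.util.Arrays;

final class HiLoArithmeticSketch {

    // Upper value as given in the text: (sequence value * allocationSize) + 1.
    static long upper(long sequenceValue, int allocationSize) {
        return sequenceValue * allocationSize + 1;
    }

    // Lower value as given in the text: upper value - allocationSize.
    static long lower(long sequenceValue, int allocationSize) {
        return upper(sequenceValue, allocationSize) - allocationSize;
    }

    // Ids handed out from one block without asking the database again:
    // lower, lower + 1, ... up to (but excluding) upper.
    static long[] idsInBlock(long sequenceValue, int allocationSize) {
        long first = lower(sequenceValue, allocationSize);
        long[] ids = new long[allocationSize];
        for (int i = 0; i < allocationSize; i++) {
            ids[i] = first + i;
        }
        return ids;
    }

    public static void main(String[] args) {
        // Values used in the text: sequence value 1, allocationSize 2.
        System.out.println("upper = " + upper(1, 2));                   // upper = 3
        System.out.println("lower = " + lower(1, 2));                   // lower = 1
        System.out.println(Arrays.toString(idsInBlock(1, 2)));          // [1, 2]
    }
}
```

One database call therefore covers allocationSize consecutive ids, which is why the second product gets 2 and not 3.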
To sum up, if you want the ids to follow the sequence generated by the database for some reason (for example, because writes come not only from Hibernate), you should set allocationSize to 1. Remember not to leave this attribute out, because Hibernate will then take the default value, which is 50.
TABLE
After these complications with sequence-generated keys, we can move on to the last strategy, based on values stored in another table. To activate it, we first need to create a table holding the id sequences:
mysql> CREATE TABLE id_sequences (next_val int(5) NOT NULL DEFAULT 1, sequence_name CHAR(4) NOT NULL);
Query OK, 0 rows affected (0.17 sec)

mysql> desc id_sequences;
+---------------+---------+------+-----+---------+-------+
| Field         | Type    | Null | Key | Default | Extra |
+---------------+---------+------+-----+---------+-------+
| next_val      | int(5)  | NO   |     | 1       |       |
| sequence_name | char(4) | NO   |     | NULL    |       |
+---------------+---------+------+-----+---------+-------+
2 rows in set (0.00 sec)
The next_val column tells Hibernate which value must be used next as the entity identifier. The sequence_name column indicates which entity the row belongs to. We find this value again in the persistence annotation:
@Id
@GeneratedValue(strategy = GenerationType.TABLE, generator = "productsTableGen")
@TableGenerator(name = "productsTableGen", pkColumnValue = "prod", table="id_sequences", allocationSize=1 )
@Column(name="id")
public int getId() {
    return this.id;
}
Exactly as for sequence keys, there is also a generator attribute in the @GeneratedValue annotation. Its meaning is the same. The difference lies at the generator level. Here we use a table as the generator, hence the @TableGenerator annotation. We give it a name and another attribute we've already met: allocationSize. Two new things are table and pkColumnValue. The first one tells Hibernate which table holds the identifier values. The second one serves to retrieve the row with the identifier value reserved for Product's id field. After executing the test, we should get:
mysql> SELECT * FROM id_sequences;
+----------+---------------+
| next_val | sequence_name |
+----------+---------------+
|        3 | prod          |
+----------+---------------+
1 row in set (0.01 sec)

mysql> select * from products;
+----+-----------+
| id | name      |
+----+-----------+
|  1 | Product#1 |
|  2 | Product#2 |
+----+-----------+
2 rows in set (0.00 sec)
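The result above (ids 1 and 2 handed out, next_val left at 3) can be imitated with a map playing the role of id_sequences. This is a simplified sketch for allocationSize = 1 only. TableGeneratorSketch and its methods are hypothetical, and Hibernate's own table generator does more than this (locking, transactions, blocks of ids).

```java
import java.util.HashMap;
import java.util.Map;

final class TableGeneratorSketch {

    // Stand-in for the id_sequences table: sequence_name -> next_val.
    private final Map<String, Integer> idSequences = new HashMap<>();

    // Read next_val of the given row, store next_val + 1, return the value read.
    // A missing row starts at 1, like the DEFAULT 1 of the next_val column.
    synchronized int nextId(String sequenceName) {
        int id = idSequences.getOrDefault(sequenceName, 1);
        idSequences.put(sequenceName, id + 1);
        return id;
    }

    // Value currently stored in the next_val column for the given row.
    synchronized int storedNextVal(String sequenceName) {
        return idSequences.getOrDefault(sequenceName, 1);
    }

    public static void main(String[] args) {
        TableGeneratorSketch generator = new TableGeneratorSketch();
        int firstId = generator.nextId("prod");
        int secondId = generator.nextId("prod");
        System.out.println("ids: " + firstId + ", " + secondId);
        System.out.println("next_val: " + generator.storedNextVal("prod"));
    }
}
```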
In Hibernate's documentation all samples seem very simple to implement. But sometimes, as in the case of the sequence generator, complications arise. So remember the allocationSize pitfall and the fact that, for the TABLE and SEQUENCE strategies, the ids are computed inside Hibernate.