Data repositories in Spring Data JPA on waitingforcode.com

Thanks to the JPA module of the Spring Data project, we can simplify database querying. This article focuses on it.


Bartosz Konieczny

First, we'll describe the concept of repositories in Spring Data. Then we'll explain how they work under the hood, starting with what repositories really are and how they are constructed. After that, we'll focus on the @Query annotation and its execution. Finally, we'll examine the "magic" findBy query methods. A practical use case will be covered in one of the next articles.

What are Spring Data JPA repositories?

The Data Access Object (DAO) is a very frequently used pattern that consists in defining access to the underlying persistence storage. Thanks to it, we can query a database and get the result back as a managed entity (simplified definition). It is an appropriate solution for a lot of problems, but it requires a lot of hand-written code (querying, result handling, etc.). The Spring Data project brings a fresh point of view on the data access layer.

This fresh point of view takes the form of repositories, which rely on an interface-based programming model. They are interfaces extending org.springframework.data.repository.Repository<T, ID extends Serializable> (or one of its subinterfaces, such as CrudRepository or PagingAndSortingRepository). These interfaces must be typed to the entity and to the class of its primary key. For example, if our entity is ShoppingCart and its primary key is a Long, ShoppingCartRepository should be typed to the <ShoppingCart, Long> pair.

In the most common cases, a repository interface has no hand-written implementation. In some special situations it's possible to provide one; we'll see that in one of the next articles. If the repository extends CrudRepository, it automatically inherits the main CRUD methods: findAll() (retrieves all entities), findOne(id) (retrieves one entity by its id), save (saves a new entity (INSERT) or only the changes of an existing one (UPDATE)) and delete(id) (removes one entity by the given id).
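
The INSERT-or-UPDATE behaviour of save can be illustrated without any database. The snippet below is only a minimal sketch under stated assumptions: Cart and InMemoryCartStore are hypothetical names invented for this example, the storage is a plain map, and "new" simply means "the id is still null". It is not the code of Spring Data itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical entity used only for this sketch.
final class Cart {
  Long id;        // stays null until the cart is stored for the first time
  String owner;

  Cart(Long id, String owner) {
    this.id = id;
    this.owner = owner;
  }
}

// Hypothetical in-memory stand-in for a CRUD repository.
final class InMemoryCartStore {
  private final Map<Long, Cart> rows = new LinkedHashMap<>();
  private long nextId = 1;

  // Returns "INSERT" when the entity has no id yet, "UPDATE" otherwise.
  String save(Cart cart) {
    if (cart.id == null) {
      cart.id = nextId++;
      rows.put(cart.id, cart);
      return "INSERT";
    }
    rows.put(cart.id, cart);
    return "UPDATE";
  }

  // Returns the stored entity, or null when the id is unknown.
  Cart findOne(Long id) {
    return rows.get(id);
  }

  int count() {
    return rows.size();
  }
}
```

Saving the same Cart instance twice stores a single row: the first call assigns an id, the second one only overwrites the existing entry.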

A repository can query the persistence storage in two ways:
- through String queries: queries are defined inside the @Query annotation, either without the nativeQuery flag (JPA queries written in JPQL) or, in the same annotation, with the nativeQuery flag set to true (SQL queries).
- through dynamic "findBy" method queries: repositories natively support methods prefixed with findBy and followed by the WHERE clause encoded in the method name. To make this clearer, here are some queries that can be expressed as findBy interface methods:

// creates the query like SELECT p FROM Product p WHERE p.name = ?1
public List<Product> findByName(String name);

// creates the query like SELECT p FROM Product p WHERE p.name = ?1 AND p.color = ?2
public List<Product> findByNameAndColor(String name, String color);

// creates the query like SELECT p FROM Product p WHERE p.name = ?1 OR p.color = ?2
public List<Product> findByNameOrColor(String name, String color);

How do Spring Data JPA repositories work?

It seems very clear, but what mechanism handles it under the hood? The repositories can't remain simple interfaces. In fact, repositories defined as interfaces are dynamically converted into normal Spring beans. You can check it by invoking ApplicationContext's getBeanDefinitionNames method. Among controllers, services and other beans, it will print the repositories too.

The repositories can be configured in an XML file with the following entry:

<jpa:repositories base-package="com.waitingforcode.repository"
        entity-manager-factory-ref="emf"
        transaction-manager-ref="transactionManager" />

The last two attributes need little explanation: entity-manager-factory-ref names the EntityManagerFactory to use, while transaction-manager-ref points to the transaction manager bean. The first attribute, base-package, will help us understand how this entry is handled. A simple grep -r "base-package" on the Spring Data source repository is enough to find the class handling the configuration. It's org.springframework.data.repository.config.XmlRepositoryConfigurationSource, which extends RepositoryConfigurationSourceSupport from the same package.

This class reads the configuration and passes its attributes to a RepositoryConfigurationDelegate instance from the same package. This delegate detects the repositories defined through XML configuration or Java annotations. Thanks to the specified base-package, Spring Data can start to analyze the package with the help of the RepositoryComponentProvider class, which extends org.springframework.context.annotation.ClassPathScanningCandidateComponentProvider. It's a simple filter that looks in the classpath for classes matching the specified criteria. If we look inside this class, we'll see that two types of classes are considered repositories:
- the interfaces extending Repository
- the classes annotated with @RepositoryDefinition

If a class is annotated with @NoRepositoryBean, it isn't considered a repository and RepositoryComponentProvider skips it. All the remaining classes found are candidates to become repositories. RepositoryConfigurationDelegate analyzes them and tries to create, for each one, an instance of org.springframework.beans.factory.support.AbstractBeanDefinition. This instance represents a bean definition, exactly as if it had been declared with an XML tag.
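
The selection rules described above can be sketched in plain Java. This is a simplified illustration, not the code of RepositoryComponentProvider: the Repository interface and the two annotations below are local stand-ins declared only so the example runs without Spring on the classpath, and RepositoryCandidateFilter, like the sample interfaces, is a hypothetical name.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.ArrayList;
import java.util.List;

// Local stand-ins for Spring Data's marker interface and annotations (assumption).
interface Repository<T, ID> {}

@Retention(RetentionPolicy.RUNTIME)
@interface RepositoryDefinition {}

@Retention(RetentionPolicy.RUNTIME)
@interface NoRepositoryBean {}

// Candidate: an interface extending the marker interface.
interface ProductRepository extends Repository<String, Long> {}

// Candidate: annotated instead of extending the marker interface.
@RepositoryDefinition
interface OrderRepository {}

// Not a candidate: explicitly excluded.
@NoRepositoryBean
interface BaseRepository<T, ID> extends Repository<T, ID> {}

// Not a candidate: unrelated interface.
interface PriceCalculator {}

final class RepositoryCandidateFilter {

  // Applies the two inclusion rules and the @NoRepositoryBean exclusion.
  static boolean isCandidate(Class<?> type) {
    if (type.isAnnotationPresent(NoRepositoryBean.class)) {
      return false;
    }
    boolean extendsMarker =
        type.isInterface() && type != Repository.class && Repository.class.isAssignableFrom(type);
    boolean annotated = type.isAnnotationPresent(RepositoryDefinition.class);
    return extendsMarker || annotated;
  }

  // Returns the simple names of the scanned types that qualify as repositories.
  static List<String> candidateNames(List<Class<?>> scanned) {
    List<String> names = new ArrayList<>();
    for (Class<?> type : scanned) {
      if (isCandidate(type)) {
        names.add(type.getSimpleName());
      }
    }
    return names;
  }
}
```

Among the four sample interfaces, only ProductRepository and OrderRepository pass the filter.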

When we inject a repository through a Spring-specific mechanism, for example the @Autowired annotation, Spring invokes under the hood the getObject method of the org.springframework.data.repository.core.support.RepositoryFactoryBeanSupport class. We can see that by creating a fake repository inside the repositories package, for example this one:

public interface ShoppingCartRepository extends CrudRepository<ShoppingCart> {

}

If you try to autowire it, a BeanCreationException will be thrown. Its stack trace shows how Spring tries to inject the dependency:

Caused by: java.lang.IllegalArgumentException: [Assertion failed] - this argument is required; it must not be null
  at org.springframework.util.Assert.notNull(Assert.java:112)
  at org.springframework.util.Assert.notNull(Assert.java:123)
  at org.springframework.data.jpa.repository.support.JpaEntityInformationSupport.getMetadata(JpaEntityInformationSupport.java:57)
  at org.springframework.data.jpa.repository.support.JpaRepositoryFactory.getEntityInformation(JpaRepositoryFactory.java:146)
  at org.springframework.data.jpa.repository.support.JpaRepositoryFactory.getTargetRepository(JpaRepositoryFactory.java:84)
  at org.springframework.data.jpa.repository.support.JpaRepositoryFactory.getTargetRepository(JpaRepositoryFactory.java:67)
  at org.springframework.data.repository.core.support.RepositoryFactorySupport.getRepository(RepositoryFactorySupport.java:136)
  at org.springframework.data.repository.core.support.RepositoryFactoryBeanSupport.getObject(RepositoryFactoryBeanSupport.java:153)
  at org.springframework.data.repository.core.support.RepositoryFactoryBeanSupport.getObject(RepositoryFactoryBeanSupport.java:43)
  at org.springframework.beans.factory.support.FactoryBeanRegistrySupport.doGetObjectFromFactoryBean(FactoryBeanRegistrySupport.java:144)
  ... 36 more

The getObject method returns the repository proxy. If the proxy doesn't exist yet, it's initialized through the getRepository method of the RepositoryFactorySupport class. Because we work with JPA, Spring Data looks for an EntityManager. It uses this instance to construct an instance of org.springframework.data.jpa.repository.support.SimpleJpaRepository, or of QueryDslJpaRepository if the given repository interface requires a QueryDsl implementation. For the sake of simplicity, let's focus on SimpleJpaRepository. This class defines the implementations of all the CRUD operations already mentioned: findOne, delete, findAll... All of them use JPA queries. For example, the read queries are based on the protected getQuery method. This method constructs the requested query (findOne, findAll...) thanks to the JPA Criteria API, already covered in the article introducing the JPA Criteria API:

protected TypedQuery<T> getQuery(Specification<T> spec, Sort sort) {

  CriteriaBuilder builder = 
    em.getCriteriaBuilder();
  CriteriaQuery<T> query = 
    builder.createQuery(getDomainClass());

  Root<T> root = 
    applySpecificationToCriteria(spec, query);
  query.select(root);

  if (sort != null) {
    query.orderBy(toOrders(sort, root, builder));
  }

  return applyRepositoryMethodMetadata(em.createQuery(query));
}
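
The idea of a proxy standing in for an interface that has no hand-written implementation can be sketched with a JDK dynamic proxy. This is only an illustration under stated assumptions: CartRepository, InMemoryCrudTarget and RepositoryProxySketch are hypothetical names, the target plays the role that SimpleJpaRepository has in the text, and the real Spring Data wiring is considerably richer.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical repository interface; nobody implements it by hand.
interface CartRepository {
  String save(Long id, String cart);
  String findOne(Long id);
  String findByOwner(String owner);
}

// Plays the role of a ready-made CRUD implementation shared by all repositories.
final class InMemoryCrudTarget {
  private final Map<Long, String> rows = new LinkedHashMap<>();

  String save(Long id, String cart) {
    rows.put(id, cart);
    return cart;
  }

  String findOne(Long id) {
    return rows.get(id);
  }
}

final class RepositoryProxySketch {

  static <R> R createRepository(Class<R> repositoryInterface, InMemoryCrudTarget target) {
    InvocationHandler handler = (proxy, method, args) -> {
      // toString, equals and hashCode are answered by the target object.
      if (method.getDeclaringClass() == Object.class) {
        return method.invoke(target, args);
      }
      switch (method.getName()) {
        case "save":
          return target.save((Long) args[0], (String) args[1]);
        case "findOne":
          return target.findOne((Long) args[0]);
        default:
          // A real implementation would execute a query here; the sketch
          // only reports which method was intercepted (String results only).
          return "query derived from " + method.getName();
      }
    };
    Object instance = Proxy.newProxyInstance(
        repositoryInterface.getClassLoader(), new Class<?>[] {repositoryInterface}, handler);
    return repositoryInterface.cast(instance);
  }
}
```

Calling createRepository(CartRepository.class, new InMemoryCrudTarget()) gives back an object that can be used as a CartRepository: CRUD calls end up in the shared target, while any other method is intercepted by the handler.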

How do repositories execute @Query methods?

A very interesting feature of the Spring Data JPA project is the @Query annotation, located in the org.springframework.data.jpa.repository package. Thanks to it, we can write our own query (native SQL or JPQL) and get the result in the type declared by the annotated method. For example:

@Query("SELECT p FROM Product p WHERE p.id = :id")
public Product getProductById(@Param("id") int id);

This method executes the given query and returns an instance of the Product entity, or null if the entity doesn't exist in the persistent storage. But how is it executed by repositories that are defined as interfaces? Which object represents the string-based query? To find out, we'll apply a "break something" strategy by writing an incorrect JPQL query:

public interface ProductRepository extends CrudRepository<Product, Integer> {

  @Query("SELECT x FROM Product p WHERE id = 4")
  public Product getByName(String name);
}

When we launch the code, we get an IllegalArgumentException with the following trace:

Caused by: java.lang.IllegalArgumentException: java.lang.IllegalStateException: No data type for node: org.hibernate.hql.internal.ast.tree.IdentNode
\-[IDENT] IdentNode: 'x' {originalText=x}
  at org.springframework.data.jpa.repository.query.SimpleJpaQuery.<init>(SimpleJpaQuery.java:71)
  at org.springframework.data.jpa.repository.query.SimpleJpaQuery.fromQueryAnnotation(SimpleJpaQuery.java:138)
  at org.springframework.data.jpa.repository.query.JpaQueryLookupStrategy$DeclaredQueryLookupStrategy.resolveQuery(JpaQueryLookupStrategy.java:114)
  at org.springframework.data.jpa.repository.query.JpaQueryLookupStrategy$CreateIfNotFoundQueryLookupStrategy.resolveQuery(JpaQueryLookupStrategy.java:160)
  at org.springframework.data.jpa.repository.query.JpaQueryLookupStrategy$AbstractQueryLookupStrategy.resolveQuery(JpaQueryLookupStrategy.java:68)
  at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.<init>(RepositoryFactorySupport.java:279)
  at org.springframework.data.repository.core.support.RepositoryFactorySupport.getRepository(RepositoryFactorySupport.java:147)
  at org.springframework.data.repository.core.support.RepositoryFactoryBeanSupport.getObject(RepositoryFactoryBeanSupport.java:153)
  at org.springframework.data.repository.core.support.RepositoryFactoryBeanSupport.getObject(RepositoryFactoryBeanSupport.java:43)
  at org.springframework.beans.factory.support.FactoryBeanRegistrySupport.doGetObjectFromFactoryBean(FactoryBeanRegistrySupport.java:144)
  ... 36 more

As you can see, the trace contains what we are looking for: org.springframework.data.jpa.repository.query.SimpleJpaQuery. In fact, every query written within the @Query annotation is translated into a SimpleJpaQuery instance. But this class contains only a constructor and a validateQuery method, which checks whether the given query is correct. The more interesting part, query extraction, happens inside the AbstractStringBasedJpaQuery constructor. Because SimpleJpaQuery extends this abstract class, this constructor is invoked too, through the super() call. Inside, the extraction is made with an instance of ExpressionBasedStringQuery. The methods of this class come into play only when the query contains SpEL expressions. Otherwise, StringQuery from the same package does the job.

The StringQuery class represents the query which is then passed directly to EntityManager's createQuery() method. This happens inside the public Query doCreateQuery method of the AbstractStringBasedJpaQuery class. Inside this method, Spring also invokes a parameter binder which puts all available query parameters (annotated with @Param) into the generated javax.persistence.Query instance.

The binding isn't complicated because it uses the same mechanisms as the standard JPA query creation process. Parameters are bound through Query's setParameter method and, depending on the scenario, the binding is applied to named parameters (:name, :id etc.) or to positional parameters (?1, ?2 etc.).
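
To visualize the two binding styles, here is a minimal sketch that pairs each placeholder of a query string with a method argument. It is an assumption-laden illustration, not Spring Data's binder: ParameterBindingSketch and resolveBindings are hypothetical names, the placeholder detection is a naive regular expression, and the result is a plain map instead of calls to setParameter.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ParameterBindingSketch {

  // Matches either a named placeholder (:id) or a positional one (?1).
  private static final Pattern PLACEHOLDER = Pattern.compile(":([A-Za-z_]\\w*)|\\?(\\d+)");

  /**
   * Pairs each placeholder found in the query with a value.
   * namedArgs maps @Param names to values; positionalArgs holds the method
   * arguments in declaration order (position 1 is index 0).
   */
  static Map<String, Object> resolveBindings(
      String query, Map<String, Object> namedArgs, Object[] positionalArgs) {
    Map<String, Object> bindings = new LinkedHashMap<>();
    Matcher matcher = PLACEHOLDER.matcher(query);
    while (matcher.find()) {
      if (matcher.group(1) != null) {
        String name = matcher.group(1);
        if (!namedArgs.containsKey(name)) {
          throw new IllegalArgumentException("No value for named parameter :" + name);
        }
        // JPA counterpart: query.setParameter(name, value)
        bindings.put(":" + name, namedArgs.get(name));
      } else {
        int position = Integer.parseInt(matcher.group(2));
        if (position < 1 || position > positionalArgs.length) {
          throw new IllegalArgumentException("No value for positional parameter ?" + position);
        }
        // JPA counterpart: query.setParameter(position, value)
        bindings.put("?" + position, positionalArgs[position - 1]);
      }
    }
    return bindings;
  }
}
```

The sketch ignores corner cases such as colons inside string literals; it only shows that a named placeholder is resolved by name while a positional one is resolved by the argument order.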

How do findBy methods work in Spring Data JPA repositories?

So far, Spring has done a lot of magic to save developers' time. But even more magic happens for methods starting with the findBy prefix. To understand how they work, we'll use the "break something" strategy again. This time, our method name starts with the reserved findBy prefix and is followed by a nonexistent property:

public interface ProductRepository extends CrudRepository<Product, Integer> {
  public Product findByInexistentAttribute(String fiction);
}

When you try to launch the code, an org.springframework.data.mapping.PropertyReferenceException is thrown with the following stack trace:

Caused by: org.springframework.data.mapping.PropertyReferenceException: No property inexistent found for type com.waitingforcode.data.Product
  at org.springframework.data.mapping.PropertyPath.<init>(PropertyPath.java:75)
  at org.springframework.data.mapping.PropertyPath.create(PropertyPath.java:327)
  at org.springframework.data.mapping.PropertyPath.create(PropertyPath.java:353)
  at org.springframework.data.mapping.PropertyPath.create(PropertyPath.java:307)
  at org.springframework.data.mapping.PropertyPath.from(PropertyPath.java:271)
  at org.springframework.data.mapping.PropertyPath.from(PropertyPath.java:245)
  at org.springframework.data.repository.query.parser.Part.<init>(Part.java:72)
  at org.springframework.data.repository.query.parser.PartTree$OrPart.<init>(PartTree.java:180)
  at org.springframework.data.repository.query.parser.PartTree$Predicate.buildTree(PartTree.java:260)
  at org.springframework.data.repository.query.parser.PartTree$Predicate.<init>(PartTree.java:240)
  at org.springframework.data.repository.query.parser.PartTree.<init>(PartTree.java:71)
  at org.springframework.data.jpa.repository.query.PartTreeJpaQuery.<init>(PartTreeJpaQuery.java:57)
  at org.springframework.data.jpa.repository.query.JpaQueryLookupStrategy$CreateQueryLookupStrategy.resolveQuery(JpaQueryLookupStrategy.java:90)
  at org.springframework.data.jpa.repository.query.JpaQueryLookupStrategy$CreateIfNotFoundQueryLookupStrategy.resolveQuery(JpaQueryLookupStrategy.java:162)
  at org.springframework.data.jpa.repository.query.JpaQueryLookupStrategy$AbstractQueryLookupStrategy.resolveQuery(JpaQueryLookupStrategy.java:68)
  at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.<init>(RepositoryFactorySupport.java:279)
  at org.springframework.data.repository.core.support.RepositoryFactorySupport.getRepository(RepositoryFactorySupport.java:147)
  at org.springframework.data.repository.core.support.RepositoryFactoryBeanSupport.getObject(RepositoryFactoryBeanSupport.java:153)
  at org.springframework.data.repository.core.support.RepositoryFactoryBeanSupport.getObject(RepositoryFactoryBeanSupport.java:43)
  at org.springframework.beans.factory.support.FactoryBeanRegistrySupport.doGetObjectFromFactoryBean(FactoryBeanRegistrySupport.java:144)
  ... 36 more

The class we're looking for is org.springframework.data.repository.query.parser.PartTree. It holds two private fields, called subject and predicate. Both are used to translate a "findBy" method into a PartTreeJpaQuery instance. The first one, subject, is an instance of a private static class called Subject. This class represents the subject part of the query, i.e. the entity concerned by the query. For example, the subject of findUserByName is the User entity. The other field held by PartTree is predicate, an instance of the private static Predicate class. It represents the criteria of the query. For example, the method findUserByName has the predicate "Name". To simplify, we can say that the predicate is everything that appears after the "SELECT e FROM Entity e" clause (WHERE, ORDER BY etc.).

Predicate contains a list of nodes that are instances of the inner OrPart class, and it's the key to understanding how Spring knows the SQL meaning of reserved query keywords such as IsNull, Null, IsBefore etc. OrPart implements the Iterable interface typed to the Part class from the same package. Part contains an enum, Type, which holds the definitions of all the reserved keywords.

All these elements are then used by a QueryPreparer instance in PartTreeJpaQuery. It creates an instance of JpaQueryCreator, passing it the generated PartTree object. After the instance of javax.persistence.TypedQuery is created, parameter binding is invoked. The query generated in this way is then sent to the database.
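
The translation of a method name into a query can be made more tangible with a deliberately small sketch. It is not the PartTree parser: DerivedQuerySketch and toJpql are hypothetical names, only the And and Or keywords with simple equality are handled (keywords such as IsNull or IsBefore are ignored), and the output is a JPQL string rather than a Criteria query.

```java
import java.util.ArrayList;
import java.util.List;

final class DerivedQuerySketch {

  /** Builds a JPQL string from a method name such as findByNameAndColor. */
  static String toJpql(String methodName, String entityName) {
    String prefix = "findBy";
    if (!methodName.startsWith(prefix) || methodName.length() == prefix.length()) {
      throw new IllegalArgumentException("Not a findBy method: " + methodName);
    }
    String predicate = methodName.substring(prefix.length());
    int position = 1;
    List<String> orGroups = new ArrayList<>();
    // Top level: groups separated by Or; inside each group: properties joined by And.
    // The lookahead keeps properties like "Order" or "Android" in one piece.
    for (String orGroup : predicate.split("Or(?=[A-Z])")) {
      List<String> conditions = new ArrayList<>();
      for (String property : orGroup.split("And(?=[A-Z])")) {
        conditions.add("e." + decapitalize(property) + " = ?" + position++);
      }
      orGroups.add(String.join(" AND ", conditions));
    }
    return "SELECT e FROM " + entityName + " e WHERE " + String.join(" OR ", orGroups);
  }

  // Turns "Name" into "name" so it matches the entity property.
  private static String decapitalize(String property) {
    if (property.isEmpty()) {
      throw new IllegalArgumentException("Empty property in method name");
    }
    return Character.toLowerCase(property.charAt(0)) + property.substring(1);
  }
}
```

With this sketch, toJpql("findByNameOrColor", "Product") returns SELECT e FROM Product e WHERE e.name = ?1 OR e.color = ?2. Unlike Spring Data, it never checks that the property really exists on the entity, which is exactly the check that produced the PropertyReferenceException shown earlier.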

This article introduced us to the Spring Data JPA world. In the first part, we saw the repositories, presented here as a fresh alternative to a verbose DAO layer. The next parts explained some of the magic associated with repositories. First, we discovered that they're all detected through classpath scanning and that, in fact, they're backed by SimpleJpaRepository instances that handle most of the methods. The next part showed how the @Query annotation is interpreted. We saw that this annotation contains a JPQL (or SQL) query and is translated into a SimpleJpaQuery. In the last part, we focused on the magic "findBy" methods that are, in fact, turned into instances of the PartTreeJpaQuery class.
