Getting the Persistence Context Picture (Part III)
Part 3 of this series deals with more advanced topics, requiring knowledge about persistence patterns and Hibernate APIs.
-  Getting the Persistence Context Picture (Part I)
-  Getting the Persistence Context Picture (Part II)
Conversational State Management
One advanced use case for persistence frameworks is the implementation of conversations. A conversation spans multiple user interactions and, most of the time, implements a well-defined process. The best way to picture a conversation is as a kind of wizard, e.g. a newsletter registration wizard. Such a wizard typically spans multiple user interactions, each of which requires user input and validation before the user can move on:
- a user needs to provide basic data, e.g. firstname, lastname, birthdate, etc.
- a user needs to register for several newsletter categories
- a user gets a summary and needs to confirm that information
A Small Intro to Database Transactions
Whenever a database transaction is started, all data modifications are tracked by the database. In MySQL's InnoDB engine, for example, pages (think of a special data structure) are modified in a buffer pool, and the modifications are recorded in a redo log that is kept in sync with the disk. When a transaction is committed, its changes are made durable and the dirty pages are eventually flushed out to the filesystem. If the transaction is rolled back instead, its changes are undone.
Whether a transaction sees changes made by transactions running in parallel depends on its isolation level (more details on MySQL transactions can be found in the MySQL reference manual). MySQL's default isolation level is "repeatable read": all reads within the same transaction return the same results, even if another transaction has changed the data in the meantime. InnoDB (a transactional MySQL storage engine, integrated into the MySQL server) achieves this behavior by creating a snapshot when the first read is executed within the transaction. The isolation levels defined by the SQL-92 standard are, from weakest to strictest: "read uncommitted", "read committed", "repeatable read" and "serializable". The order roughly reflects the amount of locking needed to implement the respective level.
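The four isolation levels map directly onto constants of the standard JDBC Connection interface. The following is a minimal sketch in plain Java; the class and enum names (IsolationDemo, Isolation) are made up for illustration, only the java.sql.Connection constants and setTransactionIsolation are part of the JDK.

```java
import java.sql.Connection;

class IsolationDemo {

    /** SQL-92 isolation levels, declared from weakest to strictest. */
    enum Isolation {
        READ_UNCOMMITTED(Connection.TRANSACTION_READ_UNCOMMITTED),
        READ_COMMITTED(Connection.TRANSACTION_READ_COMMITTED),
        REPEATABLE_READ(Connection.TRANSACTION_REPEATABLE_READ),
        SERIALIZABLE(Connection.TRANSACTION_SERIALIZABLE);

        /** The value one would pass to Connection.setTransactionIsolation(int). */
        final int jdbcLevel;

        Isolation(int jdbcLevel) {
            this.jdbcLevel = jdbcLevel;
        }

        /** Compares by declaration order, not by the numeric JDBC value. */
        boolean isStricterThan(Isolation other) {
            return compareTo(other) > 0;
        }
    }

    public static void main(String[] args) {
        for (Isolation level : Isolation.values()) {
            System.out.println(level + " -> JDBC constant " + level.jdbcLevel);
        }
        // With a real connection (not created in this sketch) one would write:
        // connection.setTransactionIsolation(Isolation.REPEATABLE_READ.jdbcLevel);
    }
}
```

Whether a given level is actually honoured depends on the database and the storage engine, so the enum only documents intent.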
A Naive Approach
Back to conversational state management: a naive approach would be to use a single database transaction for the whole conversation. This approach has several obvious problems:
- if data is modified and DML statements are generated, locks are usually acquired, preventing other transactions from changing the same data.
- databases are designed for short transactions: a transaction is meant to be an atomic unit of work, not a long-lived session, and long-running transactions are often timed out or aborted by the database management system.
- especially in web applications, it is hard for an application to detect that a conversation was aborted: when the user closes the browser window in the middle of a transaction, or kills the browser process, there is hardly a chance for the application to notice.
- a transaction is typically bound to a database connection, and the number of database connections available to the application is typically limited.
Extended Persistence Context Pattern
As we have already seen in the second part of this series, Grails uses the so-called session-per-request pattern. Whenever a controller method is called, a new Hibernate session spans the method call and, with the flush mode set to manual, the view rendering. When the view rendering is done, the session is closed. This pattern is not an option when implementing conversations, since changes made in a controller method are committed when the method returns. One could bypass the Grails standard behavior by using detached objects, but let me tell you: life only gets more complicated when detaching modified objects, especially in advanced domain models.
What we need to implement a conversation is a mechanism that spans the persistence context over several user requests. That pattern is called the extended persistence context pattern. An extended persistence context reuses the persistence context for all interactions within the same conversation. In Hibernate speak: we need to find a way to (re)use a single org.hibernate.Session instance for the lifetime of the conversation. Fortunately, there is a Grails plugin that serves this purpose perfectly: the web flow plugin.
Conversational Management with Web Flows
The Grails web flow plugin is based on Spring Web Flow. Whereas Spring Web Flow uses XML configuration to specify web flows, Grails uses its own DSL, implemented in org.codehaus.groovy.grails.webflow.engine.builder.FlowBuilder. This approach has the advantage of being tightly integrated into the Grails controller concept: a flow is declared as a closure property, here newsletterRegistrationFlow, which is placed in a dedicated controller class and is automatically recognized by the web flow plugin. The plugin is responsible for instantiating a Grails web flow builder object, which takes the closure as one of its input parameters.
Leaving the DSL aside, the best thing about web flows is that they implement the extended persistence context, also known as the flow managed persistence context (FMPC). The HibernateFlowExecutionListener is the place where the Hibernate session is created and then reused over multiple user interactions. It implements the FlowExecutionListener interface, which provides callbacks for various states in the lifecycle of a conversation. The Grails HibernateFlowExecutionListener uses these callbacks to implement the extended persistence context pattern:
- on conversation start, it creates a new Hibernate session.
- whenever the flow is paused, in between separate user requests, the session is disconnected from the current database connection.
- whenever the web flow is resumed, the session is connected to a database connection again.
- when the web flow has completed its last step, the session is resumed and all changes are flushed in a single transaction.
In that last step, a call to sessionFactory.getCurrentSession() causes the current session to be bound to the transaction, and all changes are committed at the end of the transaction template. All changes that have only been tracked in memory up to that point are then synchronized with the database state.
The price to be paid for conversations is higher memory consumption. In order to estimate the cost involved, we need to take a closer look at how Hibernate implements loading and caching of entities. Besides conversations, memory consumption is especially important in Hibernate-based batch jobs.
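To make the listener lifecycle from the web flow section more tangible, here is a minimal sketch in plain Java. It is not the Grails implementation: FakeDatabase, FakeSession and ConversationListener are hypothetical stand-ins, and the callback names are only loosely modelled on those of a flow execution listener. The sketch assumes that changes are merely recorded in memory until the conversation ends.

```java
import java.util.ArrayList;
import java.util.List;

class FlowManagedContextDemo {

    /** Hypothetical stand-in for a database that only sees committed changes. */
    static class FakeDatabase {
        final List<String> committed = new ArrayList<>();
        int transactions = 0;
    }

    /** Hypothetical stand-in for a Hibernate session in manual flush mode. */
    static class FakeSession {
        final List<String> pendingChanges = new ArrayList<>();
        boolean connected = true;

        void recordChange(String change) {
            pendingChanges.add(change); // tracked in memory only
        }
    }

    /** Mimics the four lifecycle steps; names are assumptions for this sketch. */
    static class ConversationListener {
        private final FakeDatabase database;
        private FakeSession session;

        ConversationListener(FakeDatabase database) {
            this.database = database;
        }

        FakeSession sessionStarting() {   // conversation start: create the session
            session = new FakeSession();
            return session;
        }

        void paused() {                   // between requests: release the connection
            session.connected = false;
        }

        void resuming() {                 // next request: reconnect the same session
            session.connected = true;
        }

        void sessionEnded() {             // last step: flush everything in one transaction
            session.connected = true;
            database.transactions++;
            database.committed.addAll(session.pendingChanges);
            session.pendingChanges.clear();
            session = null;
        }
    }

    /** Runs the three wizard steps from the newsletter example. */
    static FakeDatabase runWizard() {
        FakeDatabase database = new FakeDatabase();
        ConversationListener listener = new ConversationListener(database);

        FakeSession session = listener.sessionStarting();
        session.recordChange("basic data");
        listener.paused();

        listener.resuming();
        session.recordChange("newsletter categories");
        listener.paused();

        listener.resuming();
        session.recordChange("confirmation");
        listener.sessionEnded();
        return database;
    }

    public static void main(String[] args) {
        FakeDatabase database = runWizard();
        System.out.println(database.transactions + " transaction(s), changes: " + database.committed);
    }
}
```

The point of the sketch is that the database sees a single short transaction at the very end, while the session object itself lives as long as the conversation does, which is exactly where the additional memory consumption comes from.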
Using Hibernate in Batch Jobs
The most important thing to remember when working with Hibernate is this: the persistence context references all persistent entities that have been loaded, but the entities know nothing about it. As long as the persistence context is alive, it does not discard these references automatically. This is particularly important in batch jobs. When executing queries with large result sets, you have to clear the Hibernate session manually, otherwise the program will eventually run out of memory.
The Session interface provides a clear method that detaches all persistent objects tracked by the session instance. Invoked on specific object instances, evict allows you to remove selected persistent objects from a particular session. In this context, it might be worth taking a look at Hibernate's StatefulPersistenceContext class. This is the piece of code that actually implements the persistence context pattern, and invoking clear on it removes all references to all tracked objects.
Another thing to keep in mind when processing large result sets and keeping persistence contexts in memory is that Hibernate uses state snapshots to detect modifications of persistent objects (remember how InnoDB implements repeatable-read transaction isolation ;-)). Whenever a persistent object is loaded, Hibernate creates a snapshot of its current state and keeps that snapshot in internal data structures. If you don't want Hibernate to create snapshots, you have to use read-only queries or objects. Marking a query as read-only is as easy as calling setReadOnly(true) on it. In read-only mode, no snapshots are created and modified persistent objects are not marked as dirty. If your batch only reads from the persistence context, there is another way to optimize DB access: using a stateless session.
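The clear, evict and snapshot mechanics described in this section can be pictured with a toy persistence context. This is a simplified sketch in plain Java, not Hibernate's StatefulPersistenceContext: the classes Entity and ToyPersistenceContext and all of their methods are hypothetical, and the sketch assumes that an entity's state is just an array of field values.

```java
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.Map;

class SnapshotContextDemo {

    /** Hypothetical entity; it holds no reference to any persistence context. */
    static class Entity {
        final Object[] state;

        Entity(Object... state) {
            this.state = state;
        }
    }

    /** Toy persistence context: it references the entities, not the other way round. */
    static class ToyPersistenceContext {
        // entity -> snapshot taken at load time (null for read-only entities)
        private final Map<Entity, Object[]> tracked = new IdentityHashMap<>();

        void register(Entity entity, boolean readOnly) {
            tracked.put(entity, readOnly ? null : entity.state.clone());
        }

        /** Dirty means: tracked with a snapshot, and the state differs from it. */
        boolean isDirty(Entity entity) {
            Object[] snapshot = tracked.get(entity);
            return snapshot != null && !Arrays.equals(snapshot, entity.state);
        }

        void evict(Entity entity) {
            tracked.remove(entity);
        }

        void clear() {
            tracked.clear();
        }

        int size() {
            return tracked.size();
        }
    }

    /** Loads 'total' entities, clearing the context every 'chunkSize' entities. */
    static int peakSizeOfBatch(int total, int chunkSize) {
        ToyPersistenceContext context = new ToyPersistenceContext();
        int peak = 0;
        for (int i = 1; i <= total; i++) {
            context.register(new Entity("row-" + i), false);
            peak = Math.max(peak, context.size());
            if (i % chunkSize == 0) {
                context.clear(); // drop all references so they can be garbage collected
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        ToyPersistenceContext context = new ToyPersistenceContext();
        Entity tracked = new Entity("Ada");
        Entity readOnly = new Entity("Grace");
        context.register(tracked, false);
        context.register(readOnly, true);

        tracked.state[0] = "Ada L.";
        readOnly.state[0] = "Grace H.";

        System.out.println("tracked dirty:   " + context.isDirty(tracked));
        System.out.println("read-only dirty: " + context.isDirty(readOnly));
        System.out.println("peak context size: " + peakSizeOfBatch(1000, 100));
    }
}
```

Two things are worth noting: a tracked entity costs roughly twice its state in memory because of the snapshot, and without the periodic clear the context would keep growing with the size of the result set.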
The SessionFactory provides an openStatelessSession method that creates a fully stateless session, without caching, modification tracking etc. In Grails, obtaining a stateless session is nothing more than injecting the current sessionFactory bean and calling openStatelessSession on it. It is worth mentioning that if you do want to modify data, there is an interface for that even when working with stateless sessions: the Work interface. Work declares a single method, execute, which receives a reference to the current Connection. In the case of JDBC connections, this reference can be used to issue raw SQL statements.
If your batch is processing large chunks of data, paging might be interesting too. Again, this can be done by setting the appropriate properties of Hibernate's Query class. In this context it makes sense to explicitly set the flush mode to "manual", since flushing is pointless when all retrieved objects are read-only. A similar API can be found in the Criteria class, which Grails supports through its own Criteria Builder DSL.
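The paging loop itself is independent of Hibernate and can be sketched in plain Java. In the sketch below, fetchPage is a hypothetical stand-in for a query that is restricted with Hibernate's Query.setFirstResult and Query.setMaxResults; the in-memory list plays the role of the table, and the class name PagingDemo is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class PagingDemo {

    /**
     * Hypothetical stand-in for a paged query: returns at most maxResults rows,
     * starting at offset firstResult.
     */
    static List<Integer> fetchPage(List<Integer> table, int firstResult, int maxResults) {
        int from = Math.min(firstResult, table.size());
        int to = Math.min(from + maxResults, table.size());
        return table.subList(from, to);
    }

    /** Walks through the table page by page and records the size of each page. */
    static List<Integer> pageSizes(List<Integer> table, int pageSize) {
        List<Integer> sizes = new ArrayList<>();
        for (int first = 0; ; first += pageSize) {
            List<Integer> page = fetchPage(table, first, pageSize);
            if (page.isEmpty()) {
                break; // no more rows
            }
            sizes.add(page.size());
            // A real batch would process the page here and then clear the session.
        }
        return sizes;
    }

    public static void main(String[] args) {
        List<Integer> table = IntStream.range(0, 25).boxed().collect(Collectors.toList());
        System.out.println(pageSizes(table, 10));
    }
}
```

Offset-based paging like this only gives stable pages if the underlying query has a deterministic order, so a real query would normally include an order by clause.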