I'm trying to understand how Spring Batch does transaction management. This is not a technical question but more of a conceptual one: what approach does Spring Batch use, and what are the consequences of that approach?

Let me try to clarify this question a bit. For instance, looking at the TaskletStep, I see that generally a step execution looks something like this:

  1. several JobRepository transactions to prepare the step metadata
  2. a business transaction for every chunk to process
  3. more JobRepository transactions to update the step metadata with the results of chunk processing

This seems to make sense. But what about a failure between 2 and 3? This would mean the business transaction was committed but Spring Batch was unable to record that fact in its internal metadata. So a restart would reprocess the same items again even though they have already been committed. Right?

I'm looking for an explanation of these details and the consequences of the design decisions made in Spring Batch. Is this documented somewhere? The Spring Batch reference guide has very few details on this. It simply explains things from the application developer's point of view.

1 Answer

There are two fundamental types of steps in Spring Batch: a tasklet step and a chunk-based step. Each has its own transaction details. Let's look at each:

Tasklet Based Step
When a developer implements their own tasklet, the transactionality is pretty straightforward. Each call to the Tasklet#execute method is executed within a transaction. You are correct that there are updates before and after a step's logic is executed. They are not technically wrapped in a transaction, since rollback isn't something we'd want to support for the job repository updates.

Chunk Based Step
When a developer uses a chunk-based step, there is a bit more complexity involved due to the added abilities for skip/retry. At a simple level, however, each chunk is processed in its own transaction. You still have the same updates before and after a chunk-based step, and they are non-transactional for the same reasons mentioned above.
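
To make the "one transaction per chunk" idea concrete, here is a minimal sketch in plain Java. It is a toy model, not Spring Batch code: ToyStore, run, and the poison parameter are hypothetical names invented for illustration, and skip/retry is deliberately left out. It assumes the simple case where a failure rolls back the current chunk and stops the step.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types are Spring Batch classes.
class ChunkTransactionSketch {

    /** Stands in for a transactional store: staged writes become visible only on commit. */
    static class ToyStore {
        final List<String> committed = new ArrayList<>();
        private final List<String> staged = new ArrayList<>();

        void write(String item) { staged.add(item); }
        void commit() { committed.addAll(staged); staged.clear(); }
        void rollback() { staged.clear(); }
    }

    /**
     * Processes items in chunks of chunkSize, one toy transaction per chunk.
     * An item equal to poison simulates a processing failure: its whole chunk
     * is rolled back and the loop stops (assumed behavior, no skip/retry).
     * Returns the number of chunks that were committed.
     */
    static int run(List<String> items, int chunkSize, String poison, ToyStore store) {
        int committedChunks = 0;
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            try {
                for (String item : items.subList(start, end)) {
                    if (item.equals(poison)) {
                        throw new IllegalStateException("cannot process " + item);
                    }
                    store.write(item.toUpperCase());
                }
                store.commit();      // the chunk becomes visible as a unit
                committedChunks++;
            } catch (IllegalStateException e) {
                store.rollback();    // nothing from the failed chunk survives
                break;
            }
        }
        return committedChunks;
    }
}
```

With five items, a chunk size of 2, and a failure on the fourth item, only the first chunk is committed; the third item is discarded along with the rest of its chunk.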

The "What if" scenario
In your question, you ask what would happen if the business logic completed but the updates to the job repository failed for some reason: would the previously updated items be reprocessed on a restart? As with most things, that depends. If you are using stateful readers/writers like the FlatFileItemReader, then with each commit of the business transaction the job repository is updated with the current state of what has been processed (within the same transaction). So in that case, a restart of the job would pick up where it left off, in this case at the end, and process no additional records.

If you are not using stateful readers/writers or have save state turned off, then it is a bit of buyer beware and you may end up with the situation you describe. The default behavior in the framework is to save state so that restartability is preserved.
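
The two cases above can be sketched with another toy model in plain Java. Again, this is not the Spring Batch API: ToyDatabase, runStep, and the failBeforeCompletionMarker flag are hypothetical, and a single in-memory object stands in for a database where the reader state and the business rows are committed together.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these classes are illustrative and not part of Spring Batch.
class RestartSketch {

    /** Stands in for one database that holds business rows and step metadata. */
    static class ToyDatabase {
        final List<String> businessRows = new ArrayList<>();
        int savedReadCount = 0;        // reader state, persisted with each chunk
        boolean stepCompleted = false; // marker written outside the chunk transaction
    }

    /**
     * Runs (or restarts) the step. If saveState is true, the read position is
     * stored in the same toy commit as the chunk's business rows. If
     * failBeforeCompletionMarker is true, the run dies after the last chunk
     * commit but before the step is marked as completed.
     */
    static void runStep(List<String> input, int chunkSize, boolean saveState,
                        boolean failBeforeCompletionMarker, ToyDatabase db) {
        int position = saveState ? db.savedReadCount : 0; // resume point on restart
        while (position < input.size()) {
            int end = Math.min(position + chunkSize, input.size());
            // One toy "commit": rows and (optionally) reader state change together.
            db.businessRows.addAll(input.subList(position, end));
            if (saveState) {
                db.savedReadCount = end;
            }
            position = end;
        }
        if (failBeforeCompletionMarker) {
            return; // simulated crash: the completion marker is never written
        }
        db.stepCompleted = true;
    }
}
```

With state saved, a restart after the simulated crash finds the read position already at the end, writes nothing more, and only marks the step completed. With state not saved, the restart begins at position 0 and writes every row a second time, which is the situation described in the question.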

  • 1
    Thanks for providing these insights. Regarding the "what if" scenario: we're using the StaxEventItemReader, which is stateful. However, even in that case it could of course be that the update to the step state fails to commit while the business transaction committed, or vice versa. How are those situations dealt with? Or am I just misunderstanding something here? My point is: the updates to the JobRepository happen in transactions separate from the business transactions. Consequently, it is possible that one commits while the other fails, and as a result things get out of sync.
    – klr8
    Mar 30, 2015 at 11:56
  • 2
    The state updates (aka the state of the reader) and the business transactions are one and the same. The only updates that occur outside of the business transactions (and therefore are susceptible to being out of sync) are the ones marking the step as started and completed. The number of records read is persisted as part of the business transaction. Mar 30, 2015 at 14:13
