How can we share data between the different steps of a Job in Spring Batch?

Question

Digging into Spring Batch, I'd like to know: how can we share data between the different steps of a Job?

Can we use JobRepository for this? If yes, how can we do that?

Is there any other way of doing/achieving the same?

Pierre Henry · Accepted Answer · 2017-10-03 14:38:05Z

55

From a step, you can put data into the StepExecutionContext. Then, with a listener, you can promote data from StepExecutionContext to JobExecutionContext.

This JobExecutionContext is available in all the following steps.

Be careful: data must be short. These contexts are saved in the JobRepository by serialization, and the length is limited (2500 chars, if I remember correctly).

So these contexts are good to share strings or simple values, but not for sharing collections or huge amounts of data.
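
One way to act on this advice is to measure a value before putting it into a context. The sketch below is plain Java with no Spring Batch dependency. It uses standard Java serialization only as a rough proxy for what the repository stores (the real serializer depends on the Spring Batch version and configuration), and the 2500 threshold is simply the figure quoted above. `ContextSizeCheck` and its methods are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;

// Hypothetical helper: not part of Spring Batch.
final class ContextSizeCheck {

    // Assumed limit, taken from the "2500 chars" figure quoted in the answer.
    static final int ASSUMED_LIMIT = 2500;

    // Size in bytes of the value under standard Java serialization.
    static int serializedSize(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    // True when the value looks small enough to share through a context.
    static boolean isSmallEnough(Serializable value) {
        return serializedSize(value) <= ASSUMED_LIMIT;
    }

    public static void main(String[] args) {
        ArrayList<String> big = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            big.add("record-" + i);
        }
        System.out.println("short string ok: " + isSmallEnough("2017-10-03"));
        System.out.println("1000-item list ok: " + isSmallEnough(big));
    }
}
```

A check like this could run in a test or just before the value is stored, so that an oversized collection is caught early rather than when the context is persisted.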

Sharing huge amounts of data is not the philosophy of Spring Batch. Spring Batch is a set of distinct actions, not a huge Business processing unit.

edited Oct 3, 2017 at 14:38

Pierre Henry


answered Mar 30, 2010 at 11:25

Jean-Philippe Briend


7

How would you share potentially large data, like in a collection? My itemProcessor generates a list (records to delete) and I need to pass that list down the flow for a tasklet to process (do the actual delete of records). Thx
– Micho Rizo
Oct 3, 2018 at 21:16
Could job scope somehow help in this case?
– gstackoverflow
Aug 16, 2019 at 12:47
@MichoRizo I would recommend using a cache like Redis/Ehcache if the list is huge. I like to keep the objects in the context relatively small.
– vijayakumarpsg587
Aug 16, 2019 at 17:16


WineSoaked · Accepted Answer · 2010-05-07 20:58:17Z

51

The job repository is used indirectly for passing data between steps. Jean-Philippe is right that the best way to do that is to put data into the StepExecutionContext and then use the verbosely named ExecutionContextPromotionListener to promote the step execution context keys to the JobExecutionContext.

It's helpful to note that there is a listener for promoting JobParameter keys to a StepExecutionContext as well (the even more verbosely named JobParameterExecutionContextCopyListener); you will find that you use these a lot if your job steps aren't completely independent of one another.

Otherwise you're left passing data between steps using even more elaborate schemes, like JMS queues or (heaven forbid) hard-coded file locations.
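
To make the promotion step concrete, here is a simplified plain-Java model of it: selected keys are copied from a step-level map to a job-level map once the step has finished. This is an illustration and not Spring Batch code. The class name `PromotionModel`, the `strict` flag handling and the exception type are assumptions made for the sketch, and the real listener may apply further conditions before copying.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for what a promotion listener does after a step ends.
// This is a model for illustration, not Spring Batch code.
final class PromotionModel {

    // Copies the listed keys from the step-level map to the job-level map.
    // With strict = true, a missing key is treated as an error.
    static void promote(Map<String, Object> stepContext,
                        Map<String, Object> jobContext,
                        List<String> keys,
                        boolean strict) {
        for (String key : keys) {
            if (stepContext.containsKey(key)) {
                jobContext.put(key, stepContext.get(key));
            } else if (strict) {
                throw new IllegalArgumentException("Key not found in step context: " + key);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> stepContext = new HashMap<>();
        Map<String, Object> jobContext = new HashMap<>();
        stepContext.put("entityRef", "A-42");
        stepContext.put("scratch", "stays in the step context");

        promote(stepContext, jobContext, Collections.singletonList("entityRef"), true);
        System.out.println(jobContext.keySet()); // only the listed key was copied
    }
}
```

The point of the model is that nothing is shared automatically: only the keys named in the listener configuration reach the job-level context, and everything else stays private to the step.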

As to the size of data that is passed in the context, I would also suggest that you keep it small (but I haven't any specifics on the

answered May 7, 2010 at 20:58

WineSoaked


4

This is confirmed by the documentation + example here : docs.spring.io/spring-batch/trunk/reference/html/…
– Sébastien Nussbaumer
Aug 4, 2015 at 13:43
4

Damn, five years later and this question still has traction. Way to go Spring Batch :)
– WineSoaked
Aug 6, 2015 at 6:57


Nenad Bozic · Accepted Answer · 2015-03-14 11:32:21Z

31

I would say you have 3 options:

1. Use the StepContext and promote it to the JobContext so that you can access it from every step; as noted, you must respect the size limit.
2. Create a @JobScope bean, add data to that bean, inject it with @Autowired where needed and use it (the drawback is that it is an in-memory structure, so if the job fails the data is lost, which might cause problems with restartability).
3. We had larger datasets that needed to be processed across steps (read each line of a CSV and write to the DB; read from the DB, aggregate and send to an API), so we decided to model the data in a new table in the same DB as the Spring Batch meta tables, keep the ids in the JobContext, access the data when needed, and delete that temporary table when the job finishes successfully.
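
The third option boils down to "store the data elsewhere and pass only a small id through the context". The sketch below shows that shape in plain Java. A `HashMap` stands in for the temporary table and another for the job context, so the names `StagingStore`, `stage`, `load`, `drop` and the key `stagedBatchId` are all hypothetical; a real job would use a database table instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Illustrative stand-in: a map plays the role of the temporary table.
final class StagingStore {

    private final Map<String, List<String>> table = new HashMap<>();

    // First step: park the large data set and hand back a small id.
    String stage(List<String> rows) {
        String batchId = UUID.randomUUID().toString();
        table.put(batchId, new ArrayList<>(rows));
        return batchId;
    }

    // Later step: fetch the rows again using only the id.
    List<String> load(String batchId) {
        List<String> rows = table.get(batchId);
        if (rows == null) {
            throw new IllegalStateException("No staged rows for id " + batchId);
        }
        return rows;
    }

    // After a successful job: remove the temporary data.
    void drop(String batchId) {
        table.remove(batchId);
    }

    public static void main(String[] args) {
        StagingStore store = new StagingStore();
        Map<String, Object> jobContext = new HashMap<>(); // stand-in for the job context

        // Only the id travels through the context
        jobContext.put("stagedBatchId", store.stage(Arrays.asList("row-1", "row-2", "row-3")));

        List<String> rows = store.load((String) jobContext.get("stagedBatchId"));
        System.out.println(rows.size() + " rows loaded in the later step");

        store.drop((String) jobContext.get("stagedBatchId"));
    }
}
```

Because only a short string goes into the context, the size limit stops being a concern, and with a real table the staged rows would survive a failed run, which matters for restartability.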

answered Mar 14, 2015 at 11:32

Nenad Bozic


3

Regarding your 2 option . Can I access a bean set from reader class from writer class in this way ?
– Bill Goldberg
Mar 30, 2016 at 7:12
What do you mean by "set from reader"? We created the bean outside, in the configuration, and injected it where needed. You can try to see how to promote something from the reader to job scope, but it seems to me an odd solution to define something with job scope in the reader.
– Nenad Bozic
Mar 31, 2016 at 5:44
Would really appreciate if you could provide an example for how to use the @JobScoped bean. suggestion 2 Getting the following error when trying it. Method threw 'org.springframework.beans.factory.support.ScopeNotActiveException' exception. Cannot evaluate com.nordea.omega.reporting.job.ReportResponseJobScope$$SpringCGLIB$$0.toString()
– PlickPlick
Jan 25 at 11:24


Jin Kwon · Accepted Answer · 2014-07-03 11:13:33Z

Here is what I did to save an object that is accessible throughout the steps.

Created a listener for setting the object in job context

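@Component("myJobListener")
public class MyJobListener implements JobExecutionListener {

    @Autowired
    private SomeService someService; // placeholder for whatever supplies the value

    @Override
    public void beforeJob(JobExecution jobExecution) {
        // Runs once before the first step, so every step can read MY_VALUE
        String myValue = someService.getValue();
        jobExecution.getExecutionContext().putString("MY_VALUE", myValue);
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        // Nothing to clean up
    }
}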

Defined the listener in the job context

<listeners>
         <listener ref="myJobListener"/>
</listeners>

Consumed the value in step using BeforeStep annotation

@BeforeStep
public void initializeValues(StepExecution stepExecution) {
    String value = stepExecution.getJobExecution().getExecutionContext().getString("MY_VALUE");
}

LoveAndCoding · Accepted Answer · 2012-02-03 19:09:44Z

9

You can use a Java Bean Object

Execute one step
Store the result in the Java object
Next step will refer the same java object to get the result stored by step 1

In this way you can store a huge collection of data if you want
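
A shared object along these lines could look like the sketch below. It is plain Java so that it runs on its own. In a Spring application the holder would be a bean injected into the classes of both steps, which is an assumption here, and every class name is hypothetical. Keep in mind that such a holder lives only in memory, so its contents do not survive a failed or restarted job.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical holder; in a Spring application this would be a bean
// injected into the classes of both steps.
final class RecordsToDelete {

    // Thread-safe in case a step writes from several threads
    private final List<Long> ids = new CopyOnWriteArrayList<>();

    void add(long id) {
        ids.add(id);
    }

    // Read-only copy, so a reading step cannot modify the shared state
    List<Long> snapshot() {
        return Collections.unmodifiableList(new ArrayList<>(ids));
    }
}

// Stand-in for step 1: collects the ids
final class CollectingStep {
    private final RecordsToDelete holder;

    CollectingStep(RecordsToDelete holder) {
        this.holder = holder;
    }

    void run() {
        for (long id = 1; id <= 3; id++) {
            holder.add(id);
        }
    }
}

// Stand-in for step 2: consumes what step 1 stored
final class DeletingStep {
    private final RecordsToDelete holder;

    DeletingStep(RecordsToDelete holder) {
        this.holder = holder;
    }

    int run() {
        return holder.snapshot().size();
    }
}

final class SharedHolderDemo {
    public static void main(String[] args) {
        RecordsToDelete holder = new RecordsToDelete(); // same instance for both steps
        new CollectingStep(holder).run();
        System.out.println(new DeletingStep(holder).run() + " records would be deleted");
    }
}
```

The important detail is that both steps receive the same instance; if each step created its own holder, the second step would see an empty list.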

edited Feb 3, 2012 at 19:09

LoveAndCoding


answered Feb 3, 2012 at 18:56

jaykishan


24

In the next step how am i gonna get the object from 1st step. Whole point of the question is that
– Elbek
Feb 11, 2013 at 19:33
2

@Elbek Autowire it. Your class in step one has the POJO autowired and sets the data, and your class in step two also has the same object autowired (should be the same instance unless you're doing remote partitioning) and uses the getter.
– IceBox13
Feb 23, 2015 at 16:06
1

How did you autowire a newly created instance from step 1 in step 2? How do you attach the new instance to the Spring context?
– Chandru
Jun 30, 2015 at 18:43
2

@ Component for the POJO, @ Autowired + Setters in the first step, @ Autowired + Getters in the subsequent. Use also the JobScope annotation in the Tasklets.
– Robert Kirsten
Apr 7, 2016 at 12:47


Adriaan · Accepted Answer · 2022-12-08 15:12:10Z

You can store data in a simple object, like this:

AnyObject yourObject = new AnyObject();

public Job build(Step step1, Step step2) {
    return jobBuilderFactory.get("jobName")
            .incrementer(new RunIdIncrementer())
            .start(step1)
            .next(step2)
            .build();
}

public Step step1() {
    return stepBuilderFactory.get("step1Name")
            .<Some, Any> chunk(someInteger1)
            .reader(itemReader1())
            .processor(itemProcessor1())
            .writer(itemWriter1(yourObject))
            .build();
}

public Step step2() {
    return stepBuilderFactory.get("step2Name")
            .<Some, Any> chunk(someInteger2)
            .reader(itemReader2())
            .processor(itemProcessor2(yourObject))
            .writer(itemWriter2())
            .build();
}

Just add data to the object in the writer (or any other method) of the first step, and read it at any stage of the next step.

Koray Tugay · Accepted Answer · 2023-09-13 11:57:59Z

Another very simple approach, leaving it here for future reference:

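class MyTasklet implements Tasklet {
    @Override
    public RepeatStatus execute(StepContribution stepContribution, ChunkContext chunkContext) {
        // Write to the job-level context so later steps can read the value
        getExecutionContext(chunkContext).put("foo", "bar");
        return RepeatStatus.FINISHED;
    }
}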

and

class MyOtherTasklet implements Tasklet {
    @Override
    public RepeatStatus execute(StepContribution stepContribution, ChunkContext chunkContext) {
        // Read the value stored by MyTasklet
        Object foo = getExecutionContext(chunkContext).get("foo");
        return RepeatStatus.FINISHED;
    }
}

getExecutionContext here is:

ExecutionContext getExecutionContext(ChunkContext chunkContext) {
    return chunkContext.getStepContext()
                       .getStepExecution()
                       .getJobExecution()
                       .getExecutionContext();
}

Put it in a super class, in an interface as a default method, or simply paste in your Tasklets.

And what if you use a Partitioner and run multiple threads? Then the very minimum is unique keys.
– PlickPlick
Jan 25 at 11:14
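
The unique-keys point can be illustrated with a small plain-Java model in which several worker threads write into one shared map. Each worker derives its key from a base name plus its partition name, so no worker overwrites another. The naming scheme, the class `PartitionKeys` and the `ConcurrentHashMap` standing in for the shared context are all assumptions made for this sketch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Plain-Java model of partitioned workers sharing one map.
final class PartitionKeys {

    // Hypothetical naming scheme: base key plus partition name.
    static String keyFor(String baseKey, String partitionName) {
        return baseKey + "." + partitionName;
    }

    // Runs one worker per partition; each writes under its own key.
    static Map<String, Object> runPartitions(int partitions) {
        Map<String, Object> shared = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(partitions);
        for (int i = 0; i < partitions; i++) {
            final String partitionName = "partition" + i;
            pool.execute(() -> {
                shared.put(keyFor("foo", partitionName), "bar-from-" + partitionName);
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("Workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for workers", e);
        }
        return shared;
    }

    public static void main(String[] args) {
        System.out.println(runPartitions(4).size() + " distinct keys written");
    }
}
```

With a single fixed key such as "foo", the four workers would race and only one value would remain; the per-partition suffix is what keeps all four.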

Paulo Merson · Accepted Answer · 2020-10-06 21:04:10Z

Use ExecutionContextPromotionListener:

public class YourItemWriter implements ItemWriter<Object> {
    private StepExecution stepExecution;
    public void write(List<? extends Object> items) throws Exception {
        // Some Business Logic

        // put your data into stepexecution context
        ExecutionContext stepContext = this.stepExecution.getExecutionContext();
        stepContext.put("someKey", someObject);
    }
    @BeforeStep
    public void saveStepExecution(final StepExecution stepExecution) {
        this.stepExecution = stepExecution;
    }
}

Now register the promotionListener on the step that writes the data:

@Bean
public Step step1() {
    return stepBuilder
            .get("step1")
            .<Company, Company>chunk(10)
            .reader(reader())
            .processor(processor())
            .writer(writer())
            .listener(promotionListener())
            .build();
}

@Bean
public ExecutionContextPromotionListener promotionListener() {
    ExecutionContextPromotionListener listener = new ExecutionContextPromotionListener();
    listener.setKeys(new String[] {"someKey"});
    listener.setStrict(true);
    return listener;
}

Now, in step2 get your data from job ExecutionContext

public class RetrievingItemWriter implements ItemWriter<Object> {
    private Object someObject;
    public void write(List<? extends Object> items) throws Exception {
        // ...
    }
    @BeforeStep
    public void retrieveInterstepData(StepExecution stepExecution) {
        JobExecution jobExecution = stepExecution.getJobExecution();
        ExecutionContext jobContext = jobExecution.getExecutionContext();
        this.someObject = jobContext.get("someKey");
    }
}

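If you are working with tasklets, you can read from the job ExecutionContext as shown below. The map returned here is a read-only view; to put values, go through getStepExecution().getJobExecution().getExecutionContext() instead.

List<YourObject> yourObjects = (List<YourObject>) chunkContext.getStepContext().getJobExecutionContext().get("someKey");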

It's easy to copy and paste the code from official documentation. Why You don't provide Your own implementation? Everybody know that its written in doc. — Eddy Bayonne, Sep 9, 2018 at 20:33
That’s what I did. I provided easy to understand part of code. And, is the same available on documentation? I didn’t know that. — Nikhil Pareek, Sep 10, 2018 at 2:12

reevesy · Accepted Answer · 2016-02-18 18:17:46Z

I was given the task of invoking batch jobs one by one. Each job depends on the previous one: the result of the first job is needed to run the next job. I was searching for a way to pass the data along after job execution, and I found that ExecutionContextPromotionListener comes in handy.

1) I have added a bean for "ExecutionContextPromotionListener" like below

@Bean
public ExecutionContextPromotionListener promotionListener()
{
    ExecutionContextPromotionListener listener = new ExecutionContextPromotionListener();
    listener.setKeys( new String[] { "entityRef" } );
    return listener;
}

2) Then I attached the listener to my step

Step step = builder.faultTolerant()
            .skipPolicy( policy )
            .listener( writer )
            .listener( promotionListener() )
            .listener( skiplistener )
            .stream( skiplistener )
            .build();

3) I added stepExecution as a field in my writer implementation and populated it in the @BeforeStep method

@BeforeStep
public void saveStepExecution( StepExecution stepExecution )
{
    this.stepExecution = stepExecution;
}

4) At the end of my writer step, I put the values into the step execution context under the promoted key, like below

lStepContext.put( "entityRef", lMap );

5) After the job execution, I retrieved the values from the lExecution.getExecutionContext() and populated as job response.

6) from the job response object, I will get the values and populate the required values in the rest of the jobs.

The above code promotes data from a step to the job ExecutionContext using ExecutionContextPromotionListener. It can be done in any step.

gehbiszumeis · Accepted Answer · 2019-08-09 11:23:25Z

Spring Batch creates metadata tables for itself (like batch_job_execution, batch_job_execution_context, batch_step_execution, etc.).

And I have tested (using a Postgres DB) that you can have at least 51,428 chars worth of data in one column (batch_job_execution_context.serialized_context). It could be more; that is just how much I tested.

When you are using Tasklets for your step (like class MyTasklet implements Tasklet) and override the execute method (which returns a RepeatStatus), you have immediate access to the ChunkContext.

class MyTasklet implements Tasklet {

    @Override
    public RepeatStatus execute(@NonNull StepContribution contribution,
                                @NonNull ChunkContext chunkContext) {
        List<MyObject> myObjects = getObjectsFromSomewhereAndUseThemInNextStep();
        // Store the list in the job-level context so the next step can read it
        chunkContext.getStepContext().getStepExecution()
                .getJobExecution()
                .getExecutionContext()
                .put("mydatakey", myObjects);
        return RepeatStatus.FINISHED;
    }
}

And now you have another step with a different Tasklet where you can access those objects

class MyOtherTasklet implements Tasklet {

    @Override
    public RepeatStatus execute(@NonNull StepContribution contribution,
                                @NonNull ChunkContext chunkContext) {
        // Read back the list stored by MyTasklet
        List<MyObject> myObjects = (List<MyObject>)
                chunkContext.getStepContext().getStepExecution()
                        .getJobExecution()
                        .getExecutionContext()
                        .get("mydatakey");
        return RepeatStatus.FINISHED;
    }
}

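Or, if you don't have a Tasklet but a Reader/Writer/Processor instead, then

class MyReader implements ItemReader<MyObject> {

    // Late binding like this only resolves when the reader bean is step-scoped (@StepScope)
    @Value("#{jobExecutionContext['mydatakey']}")
    List<MyObject> myObjects;
    // And now myObjects are available in here

    @Override
    public MyObject read() throws Exception {
        return null; // placeholder: returning null signals the end of input
    }
}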

Paulo Merson · Accepted Answer · 2020-10-20 19:59:52Z

Simple solution using Tasklets. No need to access the execution context. I used a map as the data element to move around. (Kotlin code.)

Tasklet

class MyTasklet : Tasklet {

    lateinit var myMap: MutableMap<String, String>

    override fun execute(contribution: StepContribution, chunkContext: ChunkContext): RepeatStatus? {
        myMap.put("key", "some value")
        return RepeatStatus.FINISHED
    }

}

Batch configuration

@Configuration
@EnableBatchProcessing
class BatchConfiguration {

    @Autowired
    lateinit var jobBuilderFactory: JobBuilderFactory

    @Autowired
    lateinit var stepBuilderFactory: StepBuilderFactory

    var myMap: MutableMap<String, String> = mutableMapOf()

    @Bean
    fun jobSincAdUsuario(): Job {
        return jobBuilderFactory
                .get("my-SO-job")
                .incrementer(RunIdIncrementer())
                .start(stepMyStep())    
                .next(stepMyOtherStep())        
                .build()
    }

    @Bean
    fun stepMyStep() = stepBuilderFactory.get("MyTaskletStep")        
        .tasklet(myTaskletAsBean())
        .build()

    @Bean
    fun myTaskletAsBean(): MyTasklet {
        val tasklet = MyTasklet()
        tasklet.myMap = myMap      // collection gets visible in the tasklet
        return tasklet
    }
}

Then in MyOtherStep you can replicate the same idiom seen in MyStep. This other Tasklet will see the data created in MyStep.

Important:

tasklets are created via a @Bean fun so that they can use @Autowired (full explanation).
for a more robust implementation, the tasklet should implement InitializingBean with

    override fun afterPropertiesSet() {
        Assert.notNull(myMap, "myMap must be set before calling the tasklet")
    }

vsr · Accepted Answer · 2018-06-13 10:58:07Z

0

As Nenad Bozic said in his 3rd option, use temp tables to share the data between steps. Sharing through the context does the same thing (it writes to a table and loads the data back in the next step), but if you write into temp tables you can clean up at the end of the job.

answered Jun 13, 2018 at 10:58

vsr
