I have a Spring Boot app with a scheduled job that runs every 1.5 seconds. Its goal is to fetch data from a third-party API, update the database with the results (if needed), and repeat. The next API call should not start before the previous updates are finished.
For simplicity, let's say the code looks like this:
@Scheduled(initialDelay = 3000, fixedDelay = 1500)
public void loadUpdates() {
    List<Item> recentItems = apiClient.getUpdatesAfter(lastUpdateAt);
    List<CompletableFuture<Void>> tasks = new ArrayList<>();
    for (Item item : recentItems) {
        tasks.add(CompletableFuture.runAsync(
            () -> handleItemUpdates(item),
            itemUpdateExecutor
        ));
    }
    // need to wait for all updates to finish
    CompletableFuture.allOf(tasks.toArray(CompletableFuture[]::new)).join();
    doSomethingElseHere();
    saveTheLastUpdateTime();
}

public void handleItemUpdates(Item item) {
    // one extra API call per item
    ItemDetails details = this.apiClient.getDetails(item.getId());
    switch (item.getStatus()) {
        case 1:
            doOneTypeOfDbUpdates(item, details);
            break;
        case 2:
            doOtherTypeOfDbUpdates(item, details);
            break;
    }
}

@Transactional
public void doOneTypeOfDbUpdates(Item item, ItemDetails itemDetails) {
    updateOneTable();
    insertManyRecordsToAnotherTable();
    deleteSomethingFromThirdTable();
}
It worked fine in the beginning, but now that the amount of data is growing, this approach takes too much time.
Let's assume that the code of handling works fine and the db queries are optimal.
Edit: One item can trigger various actions, such as another API call and inserts/updates into different tables (in a transaction) and/or different databases, so doing bulk inserts of all the items from the received list is not really an option.
The question is how can this be done in a better way?
I have thought of replacing the threads with a queue (RabbitMQ) and processing with multiple instances, but how can I make sure that the next iteration does not start until all of the jobs are finished?
Any suggestions are welcome - Spring Integration, Apache Camel or any other solutions/frameworks/libraries/queues/etc.
Thank you in advance.
Look at the items you get and figure out whether you can split them into other lists with common actions and execute those in bulk/batch. That would really improve performance.
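As a rough illustration of the splitting suggestion above, here is a minimal sketch in plain Java. It assumes the status code is what decides the "common action", which may not match the real rules. The names BatchByStatus, groupByStatus, chunk and the batch size of 2 are made up for this example and are not part of the question's code. The per-item API call from the question would still happen per item; only the database work for each group would be batched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class BatchByStatus {

    // Hypothetical stand-in for the question's Item class.
    public static final class Item {
        public final long id;
        public final int status;

        public Item(long id, int status) {
            this.id = id;
            this.status = status;
        }
    }

    // Split the fetched items into one list per status, so each list
    // can be handled by a single bulk/batch operation.
    public static Map<Integer, List<Item>> groupByStatus(List<Item> items) {
        return items.stream().collect(Collectors.groupingBy(item -> item.status));
    }

    // Cut a list into chunks of at most `size` elements, e.g. to keep
    // each batch (and its transaction) reasonably small.
    public static <T> List<List<T>> chunk(List<T> list, int size) {
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < list.size(); i += size) {
            chunks.add(list.subList(i, Math.min(i + size, list.size())));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Item> recentItems = List.of(
            new Item(1, 1), new Item(2, 2), new Item(3, 1),
            new Item(4, 1), new Item(5, 2));

        groupByStatus(recentItems).forEach((status, group) -> {
            for (List<Item> batch : chunk(group, 2)) {
                // In the real job, one batched DB operation per chunk
                // would go here instead of one operation per item.
                System.out.println("status " + status + ": batch of " + batch.size());
            }
        });
    }
}
```

Whether this helps depends on how much of the per-item work really is identical within a group. If every item needs its own transaction across several tables, only parts of the work (for example the inserts into one table) may be batchable.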