Camunda optimization for many process instances


#1

Hi guys,

What is the fastest way to start process instances? I tried a multi-instance call activity, a multi-instance service task whose delegate calls runtimeService.startProcessInstanceByKey, and a single service task with a for loop that calls runtimeService.startProcessInstanceByKey for every variable. With 30 000 processes to start, which is the fastest?
In my experience, all of the above take multiple hours, which is unusable.
Camunda version 7.8.

Thanks in advance,

JuLog


#2

Have you considered marking the start event for the process that gets started as asyncBefore? Then the speed is pretty much only down to the number of API calls you can push through per second and some (minimal) database interaction. Otherwise, you are waiting on execution of (part of) the process, which is blocking the throughput.


#3

Hi tiesebarrell,

I have now tried setting asyncBefore on the start event, but it did not help. I still get approx. 70 instances per minute, which is unusable. This is with a multi-instance call activity that has multi-instance asyncBefore set, and with asyncBefore on the start event of the called process.

Any other suggestions?

Thanks in advance
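
For scale, the 70 instances per minute quoted above can be turned into a total run time with a quick calculation. This is a minimal sketch that assumes the rate stays constant over the whole run (an assumption; nothing in the thread confirms it). The class and method names are made up for illustration.

```java
// Back-of-the-envelope estimate using the figures quoted in this thread.
class StartTimeEstimate {

    // Minutes needed to start `instances` at a steady rate of `perMinute`.
    static double minutesToStart(int instances, double perMinute) {
        return instances / perMinute;
    }

    public static void main(String[] args) {
        double minutes = minutesToStart(30_000, 70.0);
        System.out.println(Math.round(minutes) + " minutes, about "
                + Math.round(minutes / 60.0) + " hours");
    }
}
```

At that rate, 30 000 instances take roughly 429 minutes, a little over 7 hours, which matches the "multiple hours" reported in the first post.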


#4

Then it’s not the processing in the downstream process you’re waiting for. It might be the execution of the multi instance. Is there any logic involved there?

If you’re looking for pure speed in starting process instances, you can do it all on the same execution thread in a service task and fire off as many API calls to start the processes as the machine can handle.
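
The single-thread approach could be sketched roughly like this. To keep the example runnable without an engine, `ProcessStarter` is a hypothetical stand-in for the real call; in an actual delegate, the body of the loop would call runtimeService.startProcessInstanceByKey(processKey, variables), which returns a ProcessInstance rather than a plain id. The process key "childProcess" and the variable name "element" are also made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class BulkStartSketch {

    // Hypothetical stand-in for the engine call (not a Camunda type).
    // In a real delegate: runtimeService.startProcessInstanceByKey(processKey, variables).
    interface ProcessStarter {
        String start(String processKey, Map<String, Object> variables);
    }

    // Starts one instance per element on the calling thread and collects the returned ids.
    static List<String> startAll(ProcessStarter starter, String processKey, List<String> elements) {
        List<String> ids = new ArrayList<>(elements.size());
        for (String element : elements) {
            ids.add(starter.start(processKey, Collections.singletonMap("element", element)));
        }
        return ids;
    }

    public static void main(String[] args) {
        // Fake starter that only counts calls, so the sketch runs without an engine.
        int[] counter = {0};
        ProcessStarter fake = (key, vars) -> key + "-" + (++counter[0]);
        List<String> ids = startAll(fake, "childProcess", Arrays.asList("a", "b", "c"));
        System.out.println(ids.size() + " instances started");
    }
}
```

If the start event of the started process is marked asyncBefore, as suggested earlier in the thread, each call should only have to create the instance and a job instead of executing part of the process.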

Alternatively, if you require some sort of processing for each process started, you might need to look at the settings for the JobExecutor and the exclusiveness of the multi instance. The JobExecutor will only pick up so many jobs at once and execute them on the available threads, so there’s the potential for a limitation there too.


#5

Thanks for replying, tiesebarrell.

There is some logic involved, but it is not what is causing the issue (even when I leave only one service task with no code to execute, the result is the same).

I tried changing JobExecutor settings to these:
core-pool-size: 10
max-jobs-per-acquisition: 10
max-pool-size: 10
queue-capacity: 10

and the multi-instance exclusiveness to false and to true; neither option brought any improvement.

If my configuration is wrong and you can see the issue, please suggest what you think works best.

As I mentioned when I first described the issue, I had no success changing the call activity to a service task. Is that supposed to make any difference in performance?


#6

Well, there’s a big difference between a call activity and a service task, in that the first one starts another process and the other does a direct service invocation. If you’re using the service task to start a new process, the difference might not be so great.

I’m no expert on JobExecutor configuration, but I’d think the issue might be more in the polling frequency than in the settings you have there. On the other hand, since you are starting the processes from within a process itself, you might be running into a situation where you are creating jobs so fast that the set of jobs you’re trying to process gets swamped by all the new jobs created by executing each one of them. You could have a look at the job count at any given time to see if that’s the case. You could maybe use job priorities to deal with that kind of issue; see https://docs.camunda.org/manual/7.9/user-guide/process-engine/the-job-executor/#specify-job-priorities for more info.


#7

The problem with job priorities is that I am starting a new process instance for each element of a collection with approx. 30 000 elements, and all elements have the same priority.
How can I change polling frequency?


#8

All elements have the same priority, but the jobs that start each of those 30 000 instances could be given a higher priority than the jobs inside them: first start them all as quickly as possible, making sure that is done with the higher priority, and then start working on each of them.
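
To illustrate the idea, here is a toy model of priority-aware job acquisition. It is not Camunda's implementation, just a plain priority queue; the job names and priority values are invented. In Camunda itself, priorities would be set in the BPMN model (as far as I know via the camunda:jobPriority attribute, with jobExecutorAcquireByPriority enabled in the engine configuration; please check both against the docs linked above).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class PriorityAcquisitionSketch {

    // Toy job: a name plus a priority; a higher priority is picked up first.
    static final class Job {
        final String name;
        final long priority;

        Job(String name, long priority) {
            this.name = name;
            this.priority = priority;
        }
    }

    // Drains the jobs in descending priority order and returns their names.
    // The order among jobs with equal priority is not guaranteed.
    static List<String> executionOrder(List<Job> jobs) {
        PriorityQueue<Job> queue = new PriorityQueue<>(
                Comparator.comparingLong((Job j) -> j.priority).reversed());
        queue.addAll(jobs);
        List<String> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            order.add(queue.poll().name);
        }
        return order;
    }

    public static void main(String[] args) {
        // "start-*" jobs start downstream instances; "work-*" jobs run inside them.
        List<Job> jobs = Arrays.asList(
                new Job("work-1", 0), new Job("start-1", 10),
                new Job("work-2", 0), new Job("start-2", 10));
        System.out.println(executionOrder(jobs));
    }
}
```

In this model, both "start" jobs are always drained before any "work" job, which is the effect the higher priority is meant to have.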

The frequency is configured in the process engine configuration’s JobExecutor: it has a property waitTimeInMillis you can set if you create your own instance. There are some other properties there as well to deal with backoff, etc.


#9

I will try adding waitTimeInMillis and setting it to 1000 or less.
If I understood correctly, you are saying I should set the job priority of the main process, the one starting the 30 000 new processes, higher than the job priority of the new processes?


#10

Yes, that’s what I mean. It will only have an effect, though, if your problem is caused by the upstream process, which starts the downstream ones, doing the starting from a large number of jobs.