How to change the camunda default retry to 0 (ZERO)


#1

Hello Team,

We are using Camunda version 7.6.2 and want to reduce the default number of retries.

I went through some blog posts and tried to modify the Camunda process engine configuration (camunda-oracle-weblogic-ear-7.6.2-ee.ear\camunda-oracle-weblogic-service.jar\META-INF\bpm-platform.xml).

Here I just added one more property. I want the change to apply to “Call Activity”, “Service Activity” and “Script Activity”.

<properties>
     <property name="defaultNumberOfRetries">0</property>
</properties>

After that, I am getting the exception below:
<Sep 25, 2017 10:33:27 AM EDT> <Enqueued request belonging to Work Manager wm/camunda-bpm-workmanager, application Camunda Engine is cancelled as the Work Manager is shutdown.>
Sep 25, 2017 10:33:27 AM org.camunda.commons.logging.BaseLogger logError
SEVERE: ENGINE-16004 Exception while closing command context: Cannot find execution with id ‘e2a0e24f-9448-11e7-bbc8-005056b11dbe’ referenced from job 'MessageEntity[repeat=null, id=e8f27824-9448-11e7-bbc8-005056b11dbe, revision=4515, duedate=null, lockOwner=d0c42db6-460a-4ef9-bc9c-7b7a9e34839a, lockExpirationTime=Mon Sep 25 10:38:27 EDT 2017, executionId=e2a0e24f-9448-11e7-bbc8-005056b11dbe, processInstanceId=e0ca26a3-9448-11e7-bbc8-005056b11dbe, isExclusive=false, retries=1, jobHandlerType=async-continuation, jobHandlerConfiguration=transition-create-scope, exceptionByteArray=null, exceptionByteArrayId=e915deae-9448-11e7-bbc8-005056b11dbe, exceptionMessage=Cannot find execution with id ‘e2a0e24f-9448-11e7-bbc8-005056b11dbe’ referenced from job 'MessageEntity[repeat=null, id=e8f27824-9448-11e7-bbc8-005056b11dbe, revision=4, duedate=null, lockOwner=11f6fdbe-116b-49a0-8680-12df4472b605, lockExpirationTime=Thu Sep 07 23:55:52 EDT 2017, executionId=e2a0e24f-9448-11e7-bbc8-005056b11dbe, processInstanceId=e0ca26a3-9448-11e7-bbc8-005056b11dbe, isExclusive=false, retries=2, jobHandlerType=async-continuation, jobHandlerConfiguration=transition-create-scope, exceptionByteArray=null, exceptionByteArrayId=e915deae-9448-11e7-bbc8-005056b11dbe, exceptionMessage=Cannot find execution with id ‘e2a0e24f-9448-11e7-bbc8-005056b11, deploymentId=89032437-9448-11e7-bbc8-005056b11dbe]’: execution is null


Please let me know how to overcome this issue.

Thanking you,
Sudhanshu


#2

Hi @Sudhanshu_Sekhar,

What is your goal in changing the retries to 0? What do you want to achieve?
Could you please share the bpm-platform.xml after the change, or have you already reverted it?

Best regards,
Yana


#3

Hi again,

There was some problem with the formatting and I didn’t see this before:

After this change, a job that fails will not be retried. Keep in mind that the setting applies to all jobs, not just the ones related to “Call Activity”, “Service Activity” and “Script Activity”.
My question still stands: what is the reasoning behind this?
Also, the part of the exception pasted above does not seem to be related to the change. Please attach the complete stack trace.
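
For context, here is a minimal sketch of where such a property would sit in bpm-platform.xml. The engine name, configuration class, datasource and the history property are assumptions based on a typical distribution layout and may differ in your file; only the defaultNumberOfRetries line comes from the first post. As far as I know, the property goes into the existing properties block of the process-engine element, next to whatever is already there, and it is an engine-wide setting.

```xml
<!-- Sketch only: everything except defaultNumberOfRetries is an assumption -->
<bpm-platform xmlns="http://www.camunda.org/schema/1.0/BpmPlatform">
  <job-executor>
    <job-acquisition name="default" />
  </job-executor>

  <process-engine name="default">
    <job-acquisition>default</job-acquisition>
    <configuration>org.camunda.bpm.engine.impl.cfg.JtaProcessEngineConfiguration</configuration>
    <datasource>jdbc/ProcessEngine</datasource>

    <properties>
      <!-- properties already present in the file stay here -->
      <property name="history">full</property>
      <!-- engine-wide: affects every job, not only selected activity types -->
      <property name="defaultNumberOfRetries">0</property>
    </properties>
  </process-engine>
</bpm-platform>
```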

Best regards,
Yana


#4

Hi @yana.vasileva,

In my case, I am sending requests to a Camunda workflow. When a request fails, Camunda retries the same request multiple times (3 times), which fills up my log.

There are so many requests coming in that, because of these repeated retries, I am facing space issues. So I want to stop the retries.

I have reverted my changes, because with this modification I was not able to place a single order. For every request I got a stuck thread in my managed server (CamundaServer_1). FYI:

<">ipaglvp35.snt.bst.bls.com>; <CamundaServer_1> <[ACTIVE] ExecuteThread: ‘3’ for queue: ‘weblogic.kernel.Default (self-tuning)’> <> <> <> <1506346431302> <[STUCK] ExecuteThread: ‘5’ for queue: ‘weblogic.kernel.Default (self-tuning)’ has been busy for “659” seconds working on the request "Workmanager: default, Version: 0, Scheduled=true, Started=true, Started time: 659934 ms

", which is more than the configured time (StuckThreadMaxTime) of “600” seconds in “server-failure-trigger”. Stack trace: java.lang.Object.wait(Native Method)

If you need any further information please let me know.

Thanking you,
Sudhanshu


#5

Hi Sudhanshu,

What is this request? Could you give us an example?

The number of retries is related to the failed jobs rather than failed requests. (more info here)
The scenario is not clear. Could you please attach the complete stack trace?

Best regards,
Yana


#6

Hi @yana.vasileva,

Please find the attached screenshot. It shows the job being tried multiple times.

When it fails at “Supp Port Assignment”, the job is retried three times.

Thanks,
Sudhanshu


#7

Hi Sudhanshu,

Sorry for the delay.

You can configure the failedJobRetryTimeCycle property for the “Supp Port Assignment” service task only.
Is this helpful for you?
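
A minimal sketch of what that could look like in the BPMN XML, assuming the task is asynchronous (so that a job exists at all) and that the camunda prefix is bound to http://camunda.org/schema/1.0/bpmn in the definitions element. The task id and the delegate class are hypothetical; only the task name comes from this thread. As far as I understand the engine, the R value is the total number of attempts, so R1 should result in a single attempt and no retry, but please verify this on your version.

```xml
<!-- Fragment of a process definition; id and class are hypothetical -->
<bpmn:serviceTask id="suppPortAssignment"
                  name="Supp Port Assignment"
                  camunda:asyncBefore="true"
                  camunda:class="com.example.SuppPortAssignmentDelegate">
  <bpmn:extensionElements>
    <!-- ISO 8601 repeating interval: R<count>/<delay between attempts> -->
    <!-- R1 is assumed to mean one attempt in total, i.e. no retry -->
    <camunda:failedJobRetryTimeCycle>R1/PT1M</camunda:failedJobRetryTimeCycle>
  </bpmn:extensionElements>
</bpmn:serviceTask>
```

All other tasks keep the engine default, so only this task stops retrying.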

I also want to mention a new 7.8.0-alpha feature that would be helpful in such a situation: global configuration of the failed job retry time cycle.
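
A sketch of how this global setting might look in the properties block of bpm-platform.xml, assuming an engine version that already includes the feature (so not 7.6.2) and assuming the property is named failedJobRetryTimeCycle like the task-level element. The value is only an illustration.

```xml
<properties>
  <!-- existing properties stay as they are -->
  <!-- assumed property name; applies to all jobs without their own retry time cycle -->
  <property name="failedJobRetryTimeCycle">R1/PT1M</property>
</properties>
```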

If these options do not work for you, please reproduce the issue once again and attach the log file with the complete stack trace.

Best regards,
Yana