Execution Id cannot be found


#1

Hello,
I am using Camunda 7.9 with a Postgres DB.
I am using the following approach:
[https://blog.bernd-ruecker.com/use-camunda-without-touching-java-and-get-an-easy-to-use-rest-based-orchestration-and-workflow-7bdf25ac198e]
After some stress testing I began to receive the following error, and processes began to pile up at a specific task without moving forward.
FetchAndLock requests began to return the same kind of error as well.

18-Oct-2018 11:52:01.966 SEVERE [http-nio-8089-exec-41] org.camunda.commons.logging.BaseLogger.logError ENGINE-16004 Exception while closing command context: Cannot find execution with id 19e41972-d247-11e8-affa-00155d185227 for external task 19e41975-d247-11e8-affa-00155d185227: execution is null
org.camunda.bpm.engine.exception.NullValueException: Cannot find execution with id 19e41972-d247-11e8-affa-00155d185227 for external task 19e41975-d247-11e8-affa-00155d185227: execution is null
                at sun.reflect.GeneratedConstructorAccessor117.newInstance(Unknown Source)
                at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
                at java.lang.reflect.Constructor.newInstance(Unknown Source)
                at org.camunda.bpm.engine.impl.util.EnsureUtil.generateException(EnsureUtil.java:344)
                at org.camunda.bpm.engine.impl.util.EnsureUtil.ensureNotNull(EnsureUtil.java:49)
                at org.camunda.bpm.engine.impl.util.EnsureUtil.ensureNotNull(EnsureUtil.java:44)
                at org.camunda.bpm.engine.impl.persistence.entity.ExternalTaskEntity.ensureExecutionInitialized(ExternalTaskEntity.java:416)
                at org.camunda.bpm.engine.impl.persistence.entity.ExternalTaskEntity.getExecution(ExternalTaskEntity.java:405)
                at org.camunda.bpm.engine.impl.externaltask.LockedExternalTaskImpl.fromEntity(LockedExternalTaskImpl.java:153)
                at org.camunda.bpm.engine.impl.cmd.FetchExternalTasksCmd.execute(FetchExternalTasksCmd.java:85)
                at org.camunda.bpm.engine.impl.cmd.FetchExternalTasksCmd.execute(FetchExternalTasksCmd.java:36)
                at org.camunda.bpm.engine.impl.interceptor.CommandExecutorImpl.execute(CommandExecutorImpl.java:24)
                at org.camunda.bpm.engine.impl.interceptor.CommandContextInterceptor.execute(CommandContextInterceptor.java:104)
                at org.camunda.bpm.engine.impl.interceptor.ProcessApplicationContextInterceptor.execute(ProcessApplicationContextInterceptor.java:66)
                at org.camunda.bpm.engine.impl.interceptor.LogInterceptor.execute(LogInterceptor.java:30)
                at org.camunda.bpm.engine.impl.externaltask.ExternalTaskQueryTopicBuilderImpl.execute(ExternalTaskQueryTopicBuilderImpl.java:59)
                at org.camunda.bpm.engine.rest.impl.FetchAndLockHandlerImpl.executeFetchAndLock(FetchAndLockHandlerImpl.java:210)
                at org.camunda.bpm.engine.rest.impl.FetchAndLockHandlerImpl.tryFetchAndLock(FetchAndLockHandlerImpl.java:193)
                at org.camunda.bpm.engine.rest.impl.FetchAndLockHandlerImpl.addPendingRequest(FetchAndLockHandlerImpl.java:262)
                at org.camunda.bpm.engine.rest.impl.FetchAndLockRestServiceImpl.fetchAndLock(FetchAndLockRestServiceImpl.java:34)
                at sun.reflect.GeneratedMethodAccessor77.invoke(Unknown Source)
                at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
                at java.lang.reflect.Method.invoke(Unknown Source)
                at org.jboss.resteasy.core.MethodInjectorImpl.invoke(MethodInjectorImpl.java:137)
                at org.jboss.resteasy.core.ResourceMethodInvoker.invokeOnTarget(ResourceMethodInvoker.java:296)
                at org.jboss.resteasy.core.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:250)
                at org.jboss.resteasy.core.ResourceLocatorInvoker.invokeOnTargetObject(ResourceLocatorInvoker.java:140)
                at org.jboss.resteasy.core.ResourceLocatorInvoker.invoke(ResourceLocatorInvoker.java:103)
                at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:377)
                at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:200)
                at org.jboss.resteasy.plugins.server.servlet.ServletContainerDispatcher.service(ServletContainerDispatcher.java:220)
                at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.service(HttpServletDispatcher.java:56)
                at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.service(HttpServletDispatcher.java:51)
                at javax.servlet.http.HttpServlet.service(HttpServlet.java:741)
                at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
                at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
                at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
                at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
                at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
                at org.camunda.bpm.engine.rest.filter.CacheControlFilter.doFilter(CacheControlFilter.java:41)
                at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
                at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
                at org.camunda.bpm.engine.rest.filter.EmptyBodyFilter.doFilter(EmptyBodyFilter.java:95)
                at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
                at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
                at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:199)
                at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
                at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:494)
                at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:139)
                at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
                at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:651)
                at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)
                at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:343)
                at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:412)
                at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
                at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:754)
                at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1385)
                at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
                at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
                at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
                at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
                at java.lang.Thread.run(Unknown Source)

After some further investigation I found the “bug” but not its cause.
Two of Camunda's runtime tables are involved in execution here:
act_ru_ext_task
act_ru_execution
As far as I can tell, Camunda uses these tables as part of process execution; rows in them are created, updated and removed all the time.
At some point during my stress test they began to fill up, and I ended up with 390+ processes stuck at the same external task.

I am fairly sure the reason I got this exception about a null execution is that the specific row in the act_ru_ext_task table was damaged.

I actually found the row, but it was tricky: a simple select with where execution_id = YYYY did not find it (which is probably also why I got the null exception in the Camunda log).
It seems the value was stored badly in that one row only.
When I used ‘like’ in the query I found it, and when I changed the execution_id column type from character varying(64) to character varying(63), the simple where query began to work on that row as well (not sure if that means anything). In any case, I will run another stress test next week, and it will be interesting to see whether it happens again after this weird column change.
Important note: the simple ‘select where’ queries worked fine on the other 390+ rows that had piled up in that table. Only the damaged row was not selected, even though it clearly existed in the table. I tried copying its execution_id into the query in all sorts of ways, but the query just could not find that row, even though it was clearly visible in the “view all data” representation of the table.
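
A value that looks correct on screen but fails an exact-match query can be checked for invisible or unexpected characters once it has been read into a program. The following is only a minimal sketch in Python: the helper name `inspect_id` is hypothetical, and the zero-width space in the example is an invented stand-in, since the thread never establishes what was actually wrong with the stored value.

```python
import re

# Ids in the log above look like standard 36-character lowercase UUIDs.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def inspect_id(value):
    """Report the length of an id and any characters that do not belong in a UUID."""
    unexpected = [
        (index, hex(ord(ch)))
        for index, ch in enumerate(value)
        if ch not in "0123456789abcdef-"
    ]
    return {
        "length": len(value),
        "is_clean_uuid": UUID_RE.fullmatch(value) is not None,
        "unexpected_chars": unexpected,
    }


clean = "19e41972-d247-11e8-affa-00155d185227"
# Hypothetical damage: a zero-width space appended to an otherwise valid id.
damaged = clean + "\u200b"

print(inspect_id(clean))
print(inspect_id(damaged))
```

A length other than 36 or a non-empty `unexpected_chars` list would explain why an equality comparison fails while a ‘like’ pattern still matches.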

Once I removed the damaged row and restarted the services, things began to work again and all the processes finished (except for one process instance, which was very likely tied to the damaged row I deleted, the row that had been blocking all the others from progressing).
After that, the other table (act_ru_execution) had a problem too: two rows were no longer being updated or removed, very likely as a result of my manual removal of the damaged row from the other table. My only option was to remove those rows and lose that damaged process instance.
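
Rows in act_ru_ext_task whose execution id has no matching row in act_ru_execution can be listed with a left join, which is one way to spot such a row before deleting anything by hand. This is a sketch under assumptions: the column names `id_` and `execution_id_` follow the usual Camunda schema naming rather than anything quoted in this thread, the function name is hypothetical, and an in-memory SQLite database with made-up rows stands in for Postgres so the example runs on its own.

```python
import sqlite3

# Assumed column names (id_, execution_id_); verify them against your schema.
ORPHAN_QUERY = """
SELECT t.id_, t.execution_id_, length(t.execution_id_)
FROM act_ru_ext_task AS t
LEFT JOIN act_ru_execution AS e ON e.id_ = t.execution_id_
WHERE e.id_ IS NULL
"""


def find_orphaned_external_tasks(connection):
    """Return external task rows whose execution id matches no execution row."""
    cursor = connection.cursor()
    cursor.execute(ORPHAN_QUERY)
    return cursor.fetchall()


# In-memory stand-in for the two runtime tables, reduced to the columns used.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE act_ru_execution (id_ TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE act_ru_ext_task (id_ TEXT PRIMARY KEY, execution_id_ TEXT)"
)
conn.execute("INSERT INTO act_ru_execution VALUES ('exec-1')")
conn.executemany(
    "INSERT INTO act_ru_ext_task VALUES (?, ?)",
    [("task-1", "exec-1"), ("task-2", "exec-2")],  # exec-2 has no execution row
)

print(find_orphaned_external_tasks(conn))
```

The query uses only standard SQL, so the same text should also run against Postgres through any DB-API connection. The reported length makes it easy to see whether a stored id is longer or shorter than expected.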

I hope I managed to explain the situation. I will continue to test and investigate, but it seems
there is a bug of some sort in Camunda, Postgres, or maybe both.
A side note: the server I am testing this on is rather weak, so that may also be part of the problem.


#2

Did you ever face this problem again? I am facing the same problem with Camunda 11 and a Postgres DB.


#3

Hey,
For us, the issue I described was related to the Postgres automatic vacuum (autovacuum) feature.
We disabled this feature for three specific tables that continuously receive insert and delete commands:
act_ru_ext_task
act_ru_execution
act_ru_variable
We have not faced this problem since then.
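
The post does not say how autovacuum was disabled. One common way in Postgres is the per-table storage parameter `autovacuum_enabled`, so the sketch below assumes that mechanism and simply generates the statements for the three tables named above. The function name is hypothetical.

```python
# Tables named in the post above.
TABLES = ("act_ru_ext_task", "act_ru_execution", "act_ru_variable")


def autovacuum_statements(tables, enabled=False):
    """Build one ALTER TABLE statement per table to toggle autovacuum."""
    flag = "true" if enabled else "false"
    return [
        f"ALTER TABLE {table} SET (autovacuum_enabled = {flag});"
        for table in tables
    ]


for statement in autovacuum_statements(TABLES):
    print(statement)
```

Calling `autovacuum_statements(TABLES, enabled=True)` produces the statements that turn it back on. With autovacuum off, these tables are only vacuumed when VACUUM is run manually.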


#4

I tried this solution, but it is not working for me.
I am using only Postgres and have disabled the vacuum feature for these tables.