Trying to isolate a problem in an engine instance:
- Process #1: a short-running process (a few seconds) that starts with a Message Start Event.
- Process #2: a short-running process (a few minutes at most).
There were ~300 messages delivered in a short period of time to the Message Start Event of process #1. Process #1 sends messages to process #2.
Process #1 processed about half of the messages, then process instances started to stack up on the Message Start Event, and the job executor appeared to be stuck/stalled/hanging.
Both processes #1 and #2 have a large number of async tasks and multiple parallel branches.
When I restarted the engine, all of the instances waiting at the Message Start Event in process #1 and all waiting tokens in process #2 were processed/executed almost instantly.
There were no message-related errors that I could see in the logs, but if there is a keyword I should search for, I can look back through the logs.
Anyone have ideas on what would cause the executor to hang/stall/get stuck?
Note that if I look up the job and manually execute it through the REST API, it executes successfully, but then gets stuck at the next wait state.
The stuck job has:
Additional Server Details:
Running on Tomcat. Camunda 7.6. Shared Engine configuration.
MS SQL Server database.
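One thing I plan to double-check is the job executor acquisition and thread pool settings in bpm-platform.xml, since a saturated execution queue can make acquisition back off and look stalled. A sketch of the relevant section for a Tomcat shared engine (property names as I understand them from the Camunda 7 docs; the values below are illustrative, not my actual settings):

```xml
<job-executor>
  <job-acquisition name="default">
    <properties>
      <!-- jobs locked per acquisition cycle -->
      <property name="maxJobsPerAcquisition">3</property>
      <!-- idle wait between acquisition cycles (ms) -->
      <property name="waitTimeInMillis">5000</property>
      <!-- how long an acquired job stays locked (ms) -->
      <property name="lockTimeInMillis">300000</property>
    </properties>
  </job-acquisition>
  <properties>
    <!-- execution thread pool: if queueSize fills up with
         long-running async work, acquisition slows down -->
    <property name="queueSize">3</property>
    <property name="corePoolSize">3</property>
    <property name="maxPoolSize">10</property>
  </properties>
</job-executor>
```

If anyone knows which of these settings (if any) would explain the stall-then-burst behavior after a restart, that would help.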