Intermittent stalling on event with listener


#1

Hi all,
I’ve noticed intermittent token stalling (without an incident or error) on an event/task that includes a start listener with JavaScript code. I tested the listener code attached to a start event and to a script task, with the same result. Sometimes the process executes this step successfully, and sometimes the token just stops there. When it stalls, I can see that the JavaScript did not execute properly, as the process instance variables are not updated (the variables referenced by the script are initially set with default values when the instance is started via API). Rebooting the Camunda instance doesn’t move the token forward. The problem seems to go away for a little while if I make a change to the process, redeploy it to Camunda, and start a new instance, but eventually it comes back in instances started later.

Does anyone have any ideas about what might be causing this?

Here’s my bpmn code:

<?xml version="1.0" encoding="UTF-8"?>
<bpmn:definitions xmlns:bpmn="http://www.omg.org/spec/BPMN/20100524/MODEL" xmlns:bpmndi="http://www.omg.org/spec/BPMN/20100524/DI" xmlns:di="http://www.omg.org/spec/DD/20100524/DI" xmlns:dc="http://www.omg.org/spec/DD/20100524/DC" xmlns:camunda="http://camunda.org/schema/1.0/bpmn" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" id="Definitions_1p6p4gp" targetNamespace="http://bpmn.io/schema/bpmn" exporter="Camunda Modeler" exporterVersion="1.11.3">
  <bpmn:collaboration id="Collaboration_0ovrvy3">
    <bpmn:participant id="Participant_0k1ov6l" name="test" processRef="test" />
  </bpmn:collaboration>
  <bpmn:process id="test" name="test" isExecutable="true">
    <bpmn:laneSet />
    <bpmn:sequenceFlow id="SequenceFlow_0ma955x" sourceRef="StartEvent_1ss3z1s" targetRef="Task_1yuyqu7" />
    <bpmn:startEvent id="StartEvent_1ss3z1s">
      <bpmn:outgoing>SequenceFlow_0ma955x</bpmn:outgoing>
    </bpmn:startEvent>
    <bpmn:scriptTask id="Task_1yuyqu7" name="Step1" camunda:asyncBefore="true" scriptFormat="javascript">
      <bpmn:extensionElements>
        <camunda:executionListener event="start">
          <camunda:script scriptFormat="javascript"><![CDATA[// set bpm_action variable
// always set variables at the start, in case they're needed by this step
execution.setVariable("bpm_action", "01");

// set bpm_process_instance variable
var bpm_process_instance = execution.getProcessInstanceId();
execution.setVariable('bpm_process_instance', bpm_process_instance);

// set bpm_process_execution variable
var bpm_process_execution = execution.getId();
execution.setVariable('bpm_process_execution', bpm_process_execution);]]></camunda:script>
        </camunda:executionListener>
      </bpmn:extensionElements>
      <bpmn:incoming>SequenceFlow_0ma955x</bpmn:incoming>
      <bpmn:outgoing>SequenceFlow_1q56ni8</bpmn:outgoing>
      <bpmn:script>// PLACEHOLDER</bpmn:script>
    </bpmn:scriptTask>
    <bpmn:endEvent id="EndEvent_1nk8npt">
      <bpmn:incoming>SequenceFlow_1q56ni8</bpmn:incoming>
    </bpmn:endEvent>
    <bpmn:sequenceFlow id="SequenceFlow_1q56ni8" sourceRef="Task_1yuyqu7" targetRef="EndEvent_1nk8npt" />
  </bpmn:process>
  <bpmn:signal id="Signal_0uwtf92" name="GetData-8598932356" />
  <bpmn:signal id="Signal_0rxxh4x" name="UpdateDatabase58269" />
  <bpmndi:BPMNDiagram id="BPMNDiagram_1">
    <bpmndi:BPMNPlane id="BPMNPlane_1" bpmnElement="Collaboration_0ovrvy3">
      <bpmndi:BPMNShape id="Participant_0k1ov6l_di" bpmnElement="Participant_0k1ov6l">
        <dc:Bounds x="-21" y="113" width="1182" height="682" />
      </bpmndi:BPMNShape>
      <bpmndi:BPMNShape id="StartEvent_1ss3z1s_di" bpmnElement="StartEvent_1ss3z1s">
        <dc:Bounds x="40" y="518" width="36" height="36" />
        <bpmndi:BPMNLabel>
          <dc:Bounds x="58" y="557" width="0" height="14" />
        </bpmndi:BPMNLabel>
      </bpmndi:BPMNShape>
      <bpmndi:BPMNEdge id="SequenceFlow_0ma955x_di" bpmnElement="SequenceFlow_0ma955x">
        <di:waypoint xsi:type="dc:Point" x="58" y="518" />
        <di:waypoint xsi:type="dc:Point" x="58" y="471" />
        <di:waypoint xsi:type="dc:Point" x="127" y="471" />
        <di:waypoint xsi:type="dc:Point" x="127" y="437" />
        <bpmndi:BPMNLabel>
          <dc:Bounds x="47.5" y="449" width="90" height="14" />
        </bpmndi:BPMNLabel>
      </bpmndi:BPMNEdge>
      <bpmndi:BPMNShape id="ScriptTask_0swsuqk_di" bpmnElement="Task_1yuyqu7">
        <dc:Bounds x="77" y="357" width="100" height="80" />
      </bpmndi:BPMNShape>
      <bpmndi:BPMNShape id="EndEvent_1nk8npt_di" bpmnElement="EndEvent_1nk8npt">
        <dc:Bounds x="255" y="379" width="36" height="36" />
        <bpmndi:BPMNLabel>
          <dc:Bounds x="273" y="418" width="0" height="14" />
        </bpmndi:BPMNLabel>
      </bpmndi:BPMNShape>
      <bpmndi:BPMNEdge id="SequenceFlow_1q56ni8_di" bpmnElement="SequenceFlow_1q56ni8">
        <di:waypoint xsi:type="dc:Point" x="177" y="397" />
        <di:waypoint xsi:type="dc:Point" x="255" y="397" />
        <bpmndi:BPMNLabel>
          <dc:Bounds x="216" y="375" width="0" height="14" />
        </bpmndi:BPMNLabel>
      </bpmndi:BPMNEdge>
    </bpmndi:BPMNPlane>
  </bpmndi:BPMNDiagram>
</bpmn:definitions>

Thanks!


#2

Update: I put a 3-second timer delay after the start event and now the token seems to not be stalling. I figure maybe the instance needs some time to start up before variables can be set, but that’s just a wild guess. Anyways, I hope this helps anyone who runs into this issue.

Does anyone have a better idea? I’d love to avoid adding an unnecessary delay into the process.


#3

What is your JavaScript doing?


#4

It’s setting the value of 3 instance variables. For the bpm_action variable, it sets it to the string “01”. For the bpm_process_instance variable, it gets the process instance Id of the instance and sets it to that value. For the bpm_process_execution variable, it gets the execution Id and sets it to that value. These variables are needed later in the process for an httpconnector event.


#5

Can you remove that three-second timer and add async before to each activity in the bpmn (events and tasks)? Then retest and see if it stalls.

If it stalls, provide specific detail about where it stalled (which task it was getting hung up on).


#6

It’s already set up like that (excluding the start event). I will add async before to the start event and test it. It will take a day or two to know if it works, since each time I update the bpmn file and deploy it, the problem goes away and then a day or two later it returns. I’ll report back.


#7

Ok, so I’ve tested adding ‘async before’ to each event/task, including the start event, and it’s stalling on the start event. It didn’t stall for a few days after deploying the process, and then it started stalling (which seems to be the pattern).

Here’s a screenshot. The 13 instances on the placeholder task are instances I started during the first couple days after deploying the process, and the 5 instances on the start event are from today. From what I’ve seen, once it starts stalling, it stays that way. This occurs on camunda v7.8 and v7.9.

screenshot

Here’s the xml:

<?xml version="1.0" encoding="UTF-8"?>
<bpmn:definitions xmlns:bpmn="http://www.omg.org/spec/BPMN/20100524/MODEL" xmlns:bpmndi="http://www.omg.org/spec/BPMN/20100524/DI" xmlns:di="http://www.omg.org/spec/DD/20100524/DI" xmlns:dc="http://www.omg.org/spec/DD/20100524/DC" xmlns:camunda="http://camunda.org/schema/1.0/bpmn" id="Definitions_0wtl26t" targetNamespace="http://bpmn.io/schema/bpmn" exporter="Camunda Modeler" exporterVersion="1.16.2">
  <bpmn:process id="F001_forum_test" isExecutable="true">
<bpmn:startEvent id="StartEvent_1" camunda:asyncBefore="true">
  <bpmn:outgoing>SequenceFlow_0g0c4lo</bpmn:outgoing>
</bpmn:startEvent>
<bpmn:scriptTask id="ScriptTask_0eqatyq" name="Step1" camunda:asyncBefore="true" scriptFormat="javascript">
  <bpmn:extensionElements>
    <camunda:executionListener event="start">
      <camunda:script scriptFormat="javascript">// set bpm_action variable
// always set variables at the start, in case they're needed by this step
execution.setVariable("bpm_action", "01");

// set bpm_process_instance variable
var bpm_process_instance = execution.getProcessInstanceId();
execution.setVariable('bpm_process_instance', bpm_process_instance);

// set bpm_process_execution variable
var bpm_process_execution = execution.getId();
execution.setVariable('bpm_process_execution', bpm_process_execution);</camunda:script>
    </camunda:executionListener>
  </bpmn:extensionElements>
  <bpmn:incoming>SequenceFlow_0g0c4lo</bpmn:incoming>
  <bpmn:outgoing>SequenceFlow_0vwt4lo</bpmn:outgoing>
  <bpmn:script>// PLACEHOLDER</bpmn:script>
</bpmn:scriptTask>
<bpmn:endEvent id="EndEvent_1rtztk7" camunda:asyncBefore="true">
  <bpmn:incoming>SequenceFlow_1u1j1ah</bpmn:incoming>
</bpmn:endEvent>
<bpmn:sequenceFlow id="SequenceFlow_0vwt4lo" sourceRef="ScriptTask_0eqatyq" targetRef="Task_0e6i56q" />
<bpmn:sequenceFlow id="SequenceFlow_0g0c4lo" sourceRef="StartEvent_1" targetRef="ScriptTask_0eqatyq" />
<bpmn:sequenceFlow id="SequenceFlow_1u1j1ah" sourceRef="Task_0e6i56q" targetRef="EndEvent_1rtztk7" />
<bpmn:userTask id="Task_0e6i56q" name="Placeholder">
  <bpmn:incoming>SequenceFlow_0vwt4lo</bpmn:incoming>
  <bpmn:outgoing>SequenceFlow_1u1j1ah</bpmn:outgoing>
</bpmn:userTask>
  </bpmn:process>
  <bpmndi:BPMNDiagram id="BPMNDiagram_1">
<bpmndi:BPMNPlane id="BPMNPlane_1" bpmnElement="F001_forum_test">
  <bpmndi:BPMNShape id="_BPMNShape_StartEvent_2" bpmnElement="StartEvent_1">
    <dc:Bounds x="173" y="102" width="36" height="36" />
  </bpmndi:BPMNShape>
  <bpmndi:BPMNShape id="ScriptTask_0eqatyq_di" bpmnElement="ScriptTask_0eqatyq">
    <dc:Bounds x="281" y="80" width="100" height="80" />
  </bpmndi:BPMNShape>
  <bpmndi:BPMNShape id="EndEvent_1rtztk7_di" bpmnElement="EndEvent_1rtztk7">
    <dc:Bounds x="657" y="102" width="36" height="36" />
  </bpmndi:BPMNShape>
  <bpmndi:BPMNEdge id="SequenceFlow_0vwt4lo_di" bpmnElement="SequenceFlow_0vwt4lo">
    <di:waypoint x="381" y="120" />
    <di:waypoint x="454" y="120" />
  </bpmndi:BPMNEdge>
  <bpmndi:BPMNEdge id="SequenceFlow_0g0c4lo_di" bpmnElement="SequenceFlow_0g0c4lo">
    <di:waypoint x="209" y="120" />
    <di:waypoint x="281" y="120" />
  </bpmndi:BPMNEdge>
  <bpmndi:BPMNEdge id="SequenceFlow_1u1j1ah_di" bpmnElement="SequenceFlow_1u1j1ah">
    <di:waypoint x="554" y="120" />
    <di:waypoint x="657" y="120" />
  </bpmndi:BPMNEdge>
  <bpmndi:BPMNShape id="UserTask_0tsq3gz_di" bpmnElement="Task_0e6i56q">
    <dc:Bounds x="454" y="80" width="100" height="80" />
  </bpmndi:BPMNShape>
</bpmndi:BPMNPlane>
  </bpmndi:BPMNDiagram>
</bpmn:definitions>

Is there anything else I can test to resolve this issue?

Also, the 3-second timer idea is no longer working on the other bpmn process I’ve been testing, as it’s now stalling too.


#8

Is there any other process defs running on your server?

The type of symptom you are describing is similar to what I have faced when the job executor becomes stuck/locked by tasks that never complete and never time out. In my case it was a bug/lack of a timeout feature in HTTP-Connector, so if a network connection stalled and never timed out, the job/task would stay in the job executor forever.

Can you provide your actual BPMN file? The file you provided is stripped of the actual content/scripts that are being executed.


#9

Yes, there is another process def running. It is the full version of the process def I’m working on here with you. That process def does have an http-connector and mail-connector.

I PM’d you a link to the xml doc for the full process def (didn’t want to post here, in case I wasn’t thorough enough in removing internal data).

Note that during my testing, the token didn’t pass into the subprocess labeled ‘PS03’, or the end event.


#10

So the first thing I notice is that you are using my code snippets from pre-7.8 (I believe, maybe 7.9) for generating Incidents.

In 7.9 you do not need to manually generate incident entities. You can do so with execution.createIncident() https://docs.camunda.org/javadoc/camunda-bpm-platform/7.9/org/camunda/bpm/engine/delegate/DelegateExecution.html#createIncident(java.lang.String,%20java.lang.String)
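
As a rough sketch of what that could look like inside a script: the helper name `runWithIncident` and the incident type `scriptFailure` are made up for illustration, and passing the error text as the configuration string is just an assumption on my part, not a convention.

```javascript
// Sketch: raise an incident from a script through the DelegateExecution API.
// runWithIncident and 'scriptFailure' are hypothetical names for illustration.
function runWithIncident(execution, work) {
  try {
    work(execution);
    return true;
  } catch (e) {
    // Two-argument overload linked above: createIncident(incidentType, configuration).
    // Using the error text as the configuration value is an assumption.
    execution.createIncident('scriptFailure', String(e.message || e));
    return false;
  }
}
```

In a listener you would call it with the `execution` object the engine provides and a function containing the work that might fail.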


#11

Second, as your next test, replace your http-connector usage with Jsoup: Replacing Http-Connector with Jsoup usage

and retest. Make sure to add the timeout method in your connections. What we are testing is whether you have network connections that stay open forever, so that the job executing your http-connector remains an active job and uses up the worker pool.
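
A minimal sketch of a Jsoup call with a timeout, assuming the Jsoup jar is on the engine's classpath. The function name `fetchBody` is hypothetical, and Jsoup is passed in as an argument only so the function can be exercised outside the engine; in a Camunda script you would pass `Java.type('org.jsoup.Jsoup')`.

```javascript
// Sketch: HTTP GET through Jsoup with an explicit timeout (ES5, Nashorn-friendly).
// fetchBody is a hypothetical helper; in a Camunda script call it as
//   fetchBody(Java.type('org.jsoup.Jsoup'), 'http://example.com/api', 5000);
function fetchBody(Jsoup, url, timeoutMs) {
  return Jsoup.connect(url)
    .timeout(timeoutMs)        // request timeout in milliseconds
    .ignoreContentType(true)   // allow JSON and other non-HTML responses
    .execute()                 // Connection defaults to GET
    .body();                   // response body as a string
}
```

With the timeout set, a stalled connection should fail with an exception instead of holding the job open indefinitely.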


#12

I would also recommend that you move your JavaScript into external files, so IDEs can inspect your JS rather than it being within the modeler bpmn code. That makes it easier to find issues when you have this many scripts.


#13

Another item for you: generally you should not be sending signals or messages to yourself, as in do not send a signal/message from ProcessA to ProcessA. In your case you are doing this with your user task End script. There are weird effects and it does not work as one would think. I have found that if you think you need to message “yourself”, you usually have a BPMN design issue. Consider abstracting into a different process and using a message event or Call Activity.


#14

Thanks for all the feedback! I will test those changes.

A couple of questions:
-The reason I was using signals is that I had a task that was used repeatedly, so I was attempting to re-use the same code across multiple End Listeners. Is abstracting into a different process (as you mentioned) the best approach, or is there a better way that keeps everything inside one file?
-If I move JavaScript into external files, how do I deploy that easily? I’m currently deploying via a curl API request, and it deploys the BPMN file only (as far as I can see).


#15

Use Postman (getpostman.com) and deploy your other JS files along with bpmn file. They are just other files in your list of files

It depends on what you want to do. Looking at your example, it’s just a script you are executing, so it does not look like there is a lot of value in having it as a BPMN process or even a task. It’s the equivalent of calling a method in each of your other scripts. So I would likely add your JS to the classpath, then load it and call it from each of your other scripts that need to execute it.

take a look at: https://github.com/DigitalState/camunda-variations/tree/master/lib-js-file

In your case, you will write a .js file that has the specific function you want to re-use.
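
Something along these lines, as a sketch only. The file name `helpers.js` and the function name `setBpmVariables` are made up, and the `load('classpath:...')` call in the comment assumes the Nashorn script engine with the file available on the classpath.

```javascript
// helpers.js (hypothetical file name): one function shared by every listener.
function setBpmVariables(execution, action) {
  // Same three variables the inline listener sets today.
  execution.setVariable('bpm_action', action);
  execution.setVariable('bpm_process_instance', execution.getProcessInstanceId());
  execution.setVariable('bpm_process_execution', execution.getId());
}

// In each listener script, assuming Nashorn and helpers.js on the classpath:
//   load('classpath:helpers.js');
//   setBpmVariables(execution, '01');
```

Each listener then shrinks to two lines, and the shared logic lives in a file an IDE can inspect.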


#16

I would also look at setting up a unit test, as this will simplify your testing and figuring out how things work:

This will show you how to unit test with Spock Framework, using Scripts and also unit test your javascript

See the whole repo for other examples like Adding mocks and adding mocks for your web servers

Take a look at using https://github.com/DigitalState/camunda-unit-test-helpers-groovy to simplify the boilerplate even further

and if you want some visual output of whats going on in the process, take a look at:


#17

Thanks @StephenOTT. I’m working on implementing these ideas; they are very helpful. Should I replace the email-connector as well, or is this issue potentially limited to the http-connector?


#18

I am personally a fan of using an http connection for sending email. But that’s up to you.

The issue with http connector is only with connector code as far as I am aware


#19

My challenge with sending via http is that I’m including process variables in the email body, so I’m not sure how to set up a re-usable script that processes all variable values to create the final email body output (assuming I’m placing the email body content in an input parameter and extracting its contents via JavaScript). It seems like I’d have to create a new script for each email, whereas with email-connector I can just place the variables directly into the input field. Is there a better way?
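
To make the question concrete, the kind of re-usable script I have in mind would look roughly like this. It is only a sketch: `renderBody` and the `{{name}}` placeholder syntax are made up for illustration (chosen so the placeholders are not confused with `${...}` expressions).

```javascript
// Sketch: fill {{name}} placeholders in an email body from process variables.
// renderBody and the {{...}} syntax are hypothetical, for illustration only.
function renderBody(template, execution) {
  return template.replace(/\{\{(\w+)\}\}/g, function (match, name) {
    var value = execution.getVariable(name);
    // Leave unknown placeholders untouched so missing variables stay visible.
    return (value === null || value === undefined) ? match : String(value);
  });
}
```

The same function would then serve every email, with only the template text differing per task.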

So you’re saying the issue likely doesn’t affect the email connector? They’re both ‘connectors’, so if the code is related to all connectors, then maybe it has the same problem?


#20

Ok, so I’ve followed your recommendations and tested for a few days now (the test process uses Jsoup for http requests and the mail connector for email sending). The problem has not returned, so it seems like the issue has been resolved. Thank you so much! Also, thanks for all of your additional tips; those were extremely helpful, and things are running much more efficiently now.