External Tasks Long Polling: Where are the resources actually being saved?

I have been reviewing the long polling REST code and trying to get into a bit more detail about where the resources are actually saved…

From the looks of the code, for each actively connected client there is a Request object that is added to an endless loop, and it performs the queries repeatedly. If there are 5 clients connected, then there are 5 queries on the server occurring repeatedly using the ExternalTaskService.

Is this correct? Other than the minor overhead of new HTTP requests/responses, where are the savings coming from? On the client side, it does remove the need to write some looping code, so there is a saving there, but it's still an open, active connection on the network.

Thanks

cc @thorben

Hi Stephen,

Your understanding is correct. The benefits of this long-polling approach are less network traffic (due to fewer polling requests) and reduced latency when a task becomes available.
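For reference, long polling is requested by setting `asyncResponseTimeout` in the `fetchAndLock` request body; the engine then holds the connection open until a task appears or the timeout elapses, instead of returning an empty result immediately. A minimal sketch of such a body (worker id, topic name, and timeout values are just examples):

```json
{
  "workerId": "worker-1",
  "maxTasks": 1,
  "asyncResponseTimeout": 30000,
  "topics": [
    { "topicName": "my-topic", "lockDuration": 10000 }
  ]
}
```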

Cheers,
Thorben

@thorben are there clients or tests that have scale-tested this setup? It seems like query growth explodes as more and more external task workers each fetch one task at a time, as is commonly suggested. If you have workers with multiple topics, you quickly end up with a vast number of queries being looped through.

What is the expected external task client implementation that would scale to a vast number of worker clients and even more topics?

Hi @StephenOTT,

The answer is: It depends.

You can listen to multiple topics on one request. The external clients (Java and JavaScript) can handle different implementations in one client.

One extreme approach is to create a new client for every topic. The other extreme is to create a single client to handle all topics. If you have to balance between implementation teams and the number of topics, you would choose something in between.
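To illustrate the multi-topic option: a single `fetchAndLock` request body can list several topics, so one polling loop serves them all. A sketch with made-up topic names:

```json
{
  "workerId": "worker-1",
  "maxTasks": 5,
  "asyncResponseTimeout": 30000,
  "topics": [
    { "topicName": "invoice", "lockDuration": 10000 },
    { "topicName": "shipping", "lockDuration": 10000 },
    { "topicName": "notification", "lockDuration": 10000 }
  ]
}
```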

Hope this helps, Ingo

Yeah, I am seeing the "it depends" aspect at scale, and redistributing tasks is even messier.

It really seems like a third-party queue needs to be added to solve anything at scale. Otherwise you get weird issues at each extreme. If you have many workers, each with their own topics, you have huge numbers of queries. If you have workers consuming multiple topics, you end up with task starvation, or weird scenarios of tasks being locked and having to manage lock expirations as a common occurrence rather than as exception handling.

Hi @StephenOTT,

don’t introduce another message queue. I had a discussion about this with a prospect recently, and I was able to convince them that the external task list in the database is already a message queue.

I tried the external task client during our hackdays; there is a video about the results: https://www.youtube.com/watch?v=7HWeWCKgTJM.

I started new process instances every two seconds, and the external tasks took 15 and 10 seconds to complete. So the tasks piled up in the engine, but every fetchAndLock and complete REST request only took about 40-60 ms.

I think that the crucial point is to set maxTasks to 1. Then you can scale easily with the help of the operating system.
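A minimal sketch of the request such a maxTasks = 1 worker sends, using only the JDK's `HttpRequest` (the base URL, worker id, and topic name are placeholders; the actual Java external task client wraps this fetch loop for you):

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch only: builds the long-polling fetchAndLock request that a
// maxTasks = 1 worker would send. Base URL, worker id, and topic name
// are placeholders; actually sending the request is left out.
public class FetchAndLock {

    static HttpRequest buildRequest() {
        String body = """
            {"workerId": "worker-1",
             "maxTasks": 1,
             "asyncResponseTimeout": 30000,
             "topics": [{"topicName": "my-topic", "lockDuration": 10000}]}""";
        return HttpRequest.newBuilder()
            .uri(URI.create("http://localhost:8080/engine-rest/external-task/fetchAndLock"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest();
        // The engine holds this request open until a task is available or
        // asyncResponseTimeout elapses. With maxTasks = 1, you scale by
        // simply starting more such worker processes at the OS level.
        System.out.println(request.method() + " " + request.uri());
    }
}
```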

Cheers, Ingo


@Ingo_Richtsmeier with this setup, are you not exponentially increasing the number of queries against the DB as the topic count increases? For every topic, you start to have a huge list of topics in the long-polling loop that need to be continually queried. The point about the bus is that it's a tool already designed for high query throughput and pub/sub, whereas long polling, and fetch-and-lock in general, is not designed for this type of throughput.

Hi @StephenOTT,

no, as I start my clients with maxTasks = 1, I can control the polling by starting as many clients as I need. The purpose of my test was to let the tasks pile up in the database. Then long polling has no effect, as the next task is already there.

If you really want a publish-subscribe pattern, my first choice in BPMN would be send and receive tasks.

Cheers, Ingo