Authorization performance with larger number of Tasks

Lukas_Horak · February 4, 2017, 8:42am

We met a performance issue with filters and authorization, after around 1000 tasks in the response, the filter execution get significantly slower. Around 5000 tasks, it takes 30 seconds.

The root cause of the issue seems to be that MySQL is not able to efficiently use index inside JOIN ON IN statement and have to do full table scan.

SELECT Count( DISTINCT res.id_ )
FROM ACT_RU_TASK res
inner join ACT_RU_IDENTITYLINK i
ON i.task_id_ = res.id_
left join ACT_RE_PROCDEF procdef
ON res.proc_def_id_ = procdef.id_
left join ( SELECT a.*
FROM ACT_RU_AUTHORIZATION a
WHERE a.type_ < 2 AND
( a.user_id_ IN ( ‘demo’, ‘’ ) OR
a.group_id_ IN ( ‘camunda-admin’, ‘management’, ‘accounting’, ‘sales’ ) ) AND
(( a.resource_type_ = 7 AND
a.perms_ & 2 = 2 OR
a.resource_type_ = 6 AND
a.perms_ & 64 = 64 )) ) auth
ON ( auth.resource_id_ IN ( res.id_, procdef.key_, '’ ) ) #<-- this is the problematic part
WHERE res.assignee_ IS NULL AND
res.assignee_ IS NULL AND
i.type_ = ‘candidate’ AND
( i.group_id_ IN ( ‘accounting’, ‘camunda-admin’, ‘management’, ‘sales’ ) ) AND
res.suspension_state_ = 1 AND
( ( res.case_execution_id_ IS NOT NULL ) OR
( auth.resource_id_ IS NOT NULL ) );

“table”: {
“table_name”: “a”,
“access_type”: “ALL”,
“possible_keys”: [
“ACT_UNIQ_AUTH_USER”,
“ACT_UNIQ_AUTH_GROUP”,
“ACT_IDX_AUTH_GROUP_ID”,
“ACT_IDX_AUTH_RESOURCE_ID”
],
“rows_examined_per_scan”: 18421,
“rows_produced_per_join”: 3610516,
“filtered”: “100.00”,
“range_checked_for_each_record”: “index map: 0x1E”,
“cost_info”: {
“read_cost”: “426.85”,
“eval_cost”: “722103.22”,
“prefix_cost”: “729195.28”,
“data_read_per_join”: “6G”
},
“used_columns”: [
“TYPE_”,
“GROUP_ID_”,
“USER_ID_”,
“RESOURCE_TYPE_”,
“RESOURCE_ID_”,
“PERMS_”
]
}

It happens on 7.5.0 and 7.6.0 using MariaDB 10.1, MySQL 5.6, MySQL 5.7, PostgreSQL 9.4. On H2 it is significantly faster. Using DB2 or Oracle is not really an option for us.
Version 7.4.0 use different approach and even though the queries also get slower, it is not as significant, but the SQL is very huge (over 600 lines for just few users, groups) and it gets slower with larger number of groups.

Steps to reproduce:
1. Start with clean Camunda
2. Start 5000 processes by calling POST engine-rest/process-definition/invoice:2:${PROCESS_ID}/submit-form
a. Randomly generated values for variables
3. Execute filter Accounting GET camunda/api/engine/engine/default/filter/{$ID_ACCOUNTING_FILTER}/count
4. Execute similar query without authentication (use engine-rest) and add the criteria
POST engine-rest/filter/{$ID_ACCOUNTING_FILTER}/count
{“candidateGroups”:[“accounting”],“processVariables”:[],“taskVariables”:[],“caseInstanceVariables”:[],“firstResult”:0,“maxResults”:15,“sorting”:[{“sortBy”:“created”,“sortOrder”:“desc”}],“active”:true}

Result:
- Camunda 7.6.0 API with authentication - 30s
- Camunda 7.4.0 API with authentication - 1s
- Camunda (both versions) API without authentication - 100ms

From reading the documentation and the forum, it isn’t really clear what is the recommendation and if there is some way how to force the DB to use the key? Or is there some other ideas how to handle larger number tasks in the filters?

Zelldon · February 5, 2017, 1:41pm

Hey Lukas,

thanks for reporting this.

Do you have customized the create scripts or do you have some custom indices on the Camunda tables? Please post which are the currently existing indices on the Task, Process definition and Authorization table.
Do you use InnoDB as an engine on MySQL?

Thanks and best regards,
Chris

mppfor_manu · February 5, 2017, 7:01pm

We also experienced “lesser” performance in other areas of Camunda (I don’t honestly remember which) when using MySQL databases. We created indexes as needed in those cases and got significant performance improvements.

However, because Camunda allows for broad support of a wide variety of databases, they bear the burden of maintaining configurations that work for the individual servers. In other words, the most efficient way to provide broad support is to keep the configurations as generic as possible, which means optimization for certain database types might not have been done. I’m not speaking from direct experience, but I can tell you that MySQL operations that could have been expressed in stored procedures would run much faster than they do today, but that would mean having to maintain similar optimizations across diverse platforms where the syntax couldn’t be shared.

The biggest concern I have with making any modifications to the schema provided “out-of-the-box” by Camunda are installation and maintenance. If you make modifications, you must carry all of those forward in both new installations as well as account for them during upgrades. I work in a very large company where we might potentially have hundreds of Camunda server instances. If we start modifying the database schema, we would have to ensure that those changes were made consistently to all installation and upgrade resources.

One final thought, I’m always a bit leery of modify a vendors configuration unless specifically instructed to do so (or at least they’ve reviewed and approved your modifications). Unintended consequences can result and support for modified environments could be difficult.

That all said, adding proper indexing to several tables can make order of magnitude differences in performance.

Lukas_Horak · February 6, 2017, 2:37am

Hi Zelldon,

we also run production instance with quite large volume on 7.4.0 and I would like to avoid as much in house changes as possible, especially before fully understanding all the non-obvious consequences. We did some customization before, but to eliminate all the factors, I run these performance tests on clean installation of Camunda 7.6.0 using the provided scripts for MySQL, MariaDB and PostreSQL. Originally, I though it could be because we use MySQL 5.7.12, but it behaves same on all these DBMS MySQL 5.7.12, 5.7.17, MySQL 5.6.35, MariaDB 10.1.21.

I run the scripts:

camunda-bpm-tomcat-7.6.0.zip\sql\create$DB_engine_7.6.0.sql
camunda-bpm-tomcat-7.6.0.zip\sql\create$DB_identity_7.6.0.sql

ACT_RU_TASK
PRIMARY KEY (ID_),
KEY ACT_IDX_TASK_CREATE (CREATE_TIME_),
KEY ACT_IDX_TASK_ASSIGNEE (ASSIGNEE_),
KEY ACT_IDX_TASK_TENANT_ID (TENANT_ID_),
KEY ACT_FK_TASK_EXE (EXECUTION_ID_),
KEY ACT_FK_TASK_PROCINST (PROC_INST_ID_),
KEY ACT_FK_TASK_PROCDEF (PROC_DEF_ID_),
KEY ACT_FK_TASK_CASE_EXE (CASE_EXECUTION_ID_),
KEY ACT_FK_TASK_CASE_DEF (CASE_DEF_ID_),

ACT_RE_PROCDEF
PRIMARY KEY (ID_),
KEY ACT_IDX_PROCDEF_DEPLOYMENT_ID (DEPLOYMENT_ID_),
KEY ACT_IDX_PROCDEF_TENANT_ID (TENANT_ID_),
KEY ACT_IDX_PROCDEF_VER_TAG (VERSION_TAG_)

ACT_RU_AUTHORIZATION
PRIMARY KEY (ID_),
UNIQUE KEY ACT_UNIQ_AUTH_USER (USER_ID_,TYPE_,RESOURCE_TYPE_,RESOURCE_ID_),
UNIQUE KEY ACT_UNIQ_AUTH_GROUP (GROUP_ID_,TYPE_,RESOURCE_TYPE_,RESOURCE_ID_),
KEY ACT_IDX_AUTH_GROUP_ID (GROUP_ID_),
KEY ACT_IDX_AUTH_RESOURCE_ID (RESOURCE_ID_)

ACT_HI_IDENTITYLINK
PRIMARY KEY (ID_),
KEY ACT_IDX_HI_IDENT_LNK_USER (USER_ID_),
KEY ACT_IDX_HI_IDENT_LNK_GROUP (GROUP_ID_),
KEY ACT_IDX_HI_IDENT_LNK_TENANT_ID (TENANT_ID_),
KEY ACT_IDX_HI_IDENT_LNK_PROC_DEF_KEY (PROC_DEF_KEY_)

From what I tried, it doesn’t looks like there is missing index. But the index is not being used in the query, even though it is in the suggested index link. Changing the query to enforce using the index, didn’t help.

The critical part of the query is this:
left join ( SELECT a.*
FROM ACT_RU_AUTHORIZATION a
WHERE … ) auth
ON ( auth.resource_id_ IN ( res.id_, procdef.key_, ‘*’ ) )

if I run the query with only 1 column inside the IN () then it use the index for AUTH.RESOURCE_ID (ACT_IDX_AUTH_RESOURCE_ID)

760_slow_query.txt (734 Bytes)
explain_slow_1.json (3.8 KB)
explain_slow_2_without_in.json (4.7 KB)
740_slow_query.txt (67.1 KB)

Zelldon · February 6, 2017, 2:43pm

Hey @mppfor_manu,

is it possible that you share the indices which increased the performance?
So perhaps we can add them into the Platform scripts.

Greets,
Chris

Zelldon · February 6, 2017, 2:49pm

Hey @Lukas_Horak,

I created a Bug ticket for that CAM-7441.
Since we test the authorization queries in a separate performance test suite i assume that the problem lies in the combination with the identity link table.

Do you have some more insight on this? Do you have an idea on improving the query or for some new indices to fix this problem? If so, please share it so we can test and improve the performance on platform as well. Maybe you can create a pull request for this.

Best regards,
Chris

mppfor_manu · February 6, 2017, 4:24pm

I think this was on version 7.4.X, so it may not be applicable to later versions. In fact, we don’t even have those databases any longer, so I can’t provide precise DDL statements. However, the following is how we determined where we needed new indices.

We looked at areas where we getting slow performance. For example, if we check the GUI for process history and it was taking a couple of minutes to respond, we might turn on MySQL logging to see what queries were being run. We would typically turn on logging in separate Camunda/MySQL server that had very little activity on it. Otherwise, it is difficult to tell what statements are being executed in MySQL.

Once you know which tables and columns are being accessed, you can examine the current database schema to determine if an appropriate index has been created for that. If no index exists, then you can create on the database. You must make sure that the index is carried forward through any updates that may take place.

Our friends at Camunda have done a great job optimizing performance, but we cannot expect them to anticipate every requirement.

eldar.zakiryanov · October 7, 2019, 4:08pm

We had the same issue with task count performance. There was around 20000 tasks and it took 30-60 seconds to give the result. In our case we have PostgreSQL as db. The problem is that PostgreSQL query planner used Nested loop in JOIN ON IN statement and have to do full table scan. To avoid this problem you have to write separate joins for every variable in “IN” statement. In this case there is three: “res.id_”, “procdef.key_”, “*”. I made some changes in two XML files where MyBatis generates sql:

Camunda 7.9.0

Original files:

camunda-bpm-platform/engine/src/main/resources/org/camunda/bpm/engine/impl/mapping/entity/Task.xml at 7.9.0 · camunda/camunda-bpm-platform · GitHub

camunda-bpm-platform/engine/src/main/resources/org/camunda/bpm/engine/impl/mapping/entity/Authorization.xml at 7.9.0 · camunda/camunda-bpm-platform · GitHub

Modified files(see “FIX SLOW COUNT” comment in files):

Authorization.xml (32.0 KB) Task.xml (29.0 KB)

In our project camunda used as lib in Spring based project. So I had to just put these files into directory:

src/main/resources/org/camunda/bpm/engine/impl/mapping/entity/Task.xml
src/main/resources/org/camunda/bpm/engine/impl/mapping/entity/Authorization.xml

and class loading hierarchy did the rest.

zambrovski · October 8, 2019, 8:14am

If you are going to manage much more tasks, the index tuning won’t help anymore. We had problems with Oracle by vast amounts of tasks in which no further index performance tuning was possible.

Rethink if the primary storage of Camunda in a RDBMS is the best approach, if every query joins at least three tables. In particular, we used to develop a solution with authorization switched off and manual check for authorizations from Java code (which performs better, if authorizations are stable over time and can be cached).

A completely different approach is to have a read-optimized projection of tasks like it is e.g. implemented in Camunda BPM Taskpool.

https://www.holunda.io/camunda-bpm-taskpool/

omprakashm · November 1, 2019, 10:15am

Perfect solution for Camunda 7.11.0 too.

Working fine for us. With 1000 task in list, initial it takes 1120 ms for API response which is now reduced to 400 ms. We are using Mysql 5.6.13 and Camunda Distributed tomcat 7.11.0.

Thanks.

tmetzke · November 4, 2019, 9:17am

Hi @eldar.zakiryanov,

thank you for providing those changed mappings to the community!

Do you think that those changes should be incorporated into the mappings in general?
If you do, would you mind opening a PR or JIRA ticket for it describing the benefits of your solution?
That way, the provided fix could be of even greater help to the community.

Best,
Tobias

hemesh · August 20, 2021, 9:47pm

How do we switch off authorizations?

Niall · August 23, 2021, 8:39am

hi @hemesh
This topic is probably not the right place for this question.
Can you create a new topic with a lot more details on your question and someone can hopfully understand what you want and help you out.

Niall · August 23, 2021, 8:39am