How to process a huge DMN file?

Dear All,
We came across a situation where we need to process a DMN file (2.5 GB) in Camunda Modeler and then push it to the server.
I tried to open a 25 MB DMN file and Camunda Modeler hung.
Is there any other way to process huge DMN files with Camunda Modeler/Server? Please advise.
@Philipp_Ossler, I need your advice.

Thanks
Deepak Srivastava


Hi @Deepak_Srivastava,

regarding your question:

No, I don’t see a way to process huge DMN files. The modeler has improved in the past, but there are limits :sweat_smile: I’m not sure about the limits of the DMN model API.

However, I would challenge that this is a valid use case for DMN. The purpose of DMN is to model and visualize your business rules for the stakeholders.

If your DMN is huge, then I doubt that it is readable and maintainable for humans :laughing:

You should think about how you model your DMN. If you want to make decisions over a large dataset, then the dataset should be used as input for the decisions.
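To illustrate the idea, here is a minimal sketch with the standalone DMN engine (the decision key "pricing", the file name, and the record fields are made up): the decision table itself stays small, and each record of the large dataset is passed in as input.

  import org.camunda.bpm.dmn.engine.DmnDecision;
  import org.camunda.bpm.dmn.engine.DmnEngine;
  import org.camunda.bpm.dmn.engine.DmnEngineConfiguration;
  import org.camunda.bpm.engine.variable.VariableMap;
  import org.camunda.bpm.engine.variable.Variables;

  import java.io.InputStream;
  import java.util.List;

  public class DatasetAsInput {

    // hypothetical record type standing in for one row of your dataset
    record DataRow(String segment, long revenue) {}

    public static void main(String[] args) {
      // standalone DMN engine, no server or database needed
      DmnEngine dmnEngine = DmnEngineConfiguration
          .createDefaultDmnEngineConfiguration()
          .buildEngine();

      // parse a small decision table once; "pricing.dmn" is a placeholder
      InputStream dmnFile = DatasetAsInput.class.getResourceAsStream("/pricing.dmn");
      DmnDecision decision = dmnEngine.parseDecision("pricing", dmnFile);

      // the large dataset stays outside the DMN and is evaluated row by row
      List<DataRow> dataset = List.of(new DataRow("premium", 100_000));
      for (DataRow row : dataset) {
        VariableMap variables = Variables.createVariables()
            .putValue("segment", row.segment())
            .putValue("revenue", row.revenue());
        System.out.println(
            dmnEngine.evaluateDecisionTable(decision, variables).getFirstResult());
      }
    }
  }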

Or, don’t use DMN if it doesn’t fit the use case :slight_smile:

Best regards,
Philipp


Thanks Philipp.
We had an Excel sheet with half a million rows, with 29 input columns and 35 output columns.
The first challenge we faced was converting the Excel file to DMN, and the existing converters were not sufficient for this. So I wrote a Python script (xls2DMN) to do the conversion, which worked.

Now we have a big DMN file (>10 million lines) to push to the Camunda server, and here we failed.

As you mentioned, DMN is not the best fit for such a large dataset. Is there any way
==> to upload the DMN directly to the server (using the API), or
==> to push these half million records directly into the DB (I am using Postgres)? [Because even after pushing the DMN to the server, it logically just inserts records into a table. So if we push the half million records into the DB in a systematic way, will there be any issue during rule execution?]

Please advise.

Thanks
Deepak Srivastava

To give you good advice, I need more information about your use case.

Why do you have so many lines in your DMN?
What is the decision about?

Please share a part of your DMN.

Hi @Deepak_Srivastava,

interesting question.

A point from my side: how did you maintain this rule set in the past?

Cheers, Ingo

Hi @Ingo_Richtsmeier,
We are maintaining the data set in DB tables.

Thanks @Philipp_Ossler,
With 64 columns (29 input + 35 output) in Excel, converting even one row of the data set creates ~5 KB of DMN. We have half a million rows in Excel, which means
the size of the DMN will be 500,000 × 5 KB = 2,500,000 KB ≈ 2.5 GB.

Please correct me if my calculation is wrong.
I will share our sample DMN soon.

Thanks
Deepak Srivastava

Hi @Deepak_Srivastava,

I assume that a relational database is the right tool to maintain such an amount of rules (or call them data sets). There you can search easily.

The interesting question is whether you can split up the huge single table into many smaller tables and combine them with a decision requirements graph: https://docs.camunda.org/manual/7.14/reference/dmn/drg/

This could lead to smaller, simpler, and more maintainable decision tables.

But you cannot generate them from the overall data in a single step. Maybe you can design the structure of the tables in the modeler, generate only the rules, and add them into the XML file, as sketched below.
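For the generation part, a rough sketch with the DMN model API (camunda-dmn-model); the file names, the idea of starting from a hand-modeled skeleton table, and the example values are placeholders:

  import org.camunda.bpm.model.dmn.Dmn;
  import org.camunda.bpm.model.dmn.DmnModelInstance;
  import org.camunda.bpm.model.dmn.instance.DecisionTable;
  import org.camunda.bpm.model.dmn.instance.InputEntry;
  import org.camunda.bpm.model.dmn.instance.OutputEntry;
  import org.camunda.bpm.model.dmn.instance.Rule;
  import org.camunda.bpm.model.dmn.instance.Text;

  import java.io.File;

  public class RuleGenerator {

    public static void main(String[] args) {
      // start from a hand-modeled skeleton table; "skeleton.dmn" is a placeholder
      DmnModelInstance model = Dmn.readModelFromFile(new File("skeleton.dmn"));
      DecisionTable table = model.getModelElementsByType(DecisionTable.class)
          .iterator().next();

      // one generated rule per data row; the values below are made up
      Rule rule = model.newInstance(Rule.class);

      InputEntry inputEntry = model.newInstance(InputEntry.class);
      Text inputText = model.newInstance(Text.class);
      inputText.setTextContent("\"premium\""); // FEEL unary test for one input column
      inputEntry.setText(inputText);
      rule.getInputEntries().add(inputEntry);

      OutputEntry outputEntry = model.newInstance(OutputEntry.class);
      Text outputText = model.newInstance(Text.class);
      outputText.setTextContent("0.15"); // value for one output column
      outputEntry.setText(outputText);
      rule.getOutputEntries().add(outputEntry);

      table.getRules().add(rule);
      Dmn.writeModelToFile(new File("generated.dmn"), model);
    }
  }

In a real run you would loop over your data rows and create one input/output entry per column.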

If you want to test your originally generated DMN table with the engine, you can write a JUnit test that deploys the file without opening it in the modeler: https://docs.camunda.org/manual/7.14/user-guide/dmn-engine/testing/

I’m curious about your result.

Hope this helps, Ingo


Thanks @Ingo_Richtsmeier.
The first solution does not look feasible, so I checked the second one.
Regarding the second solution, where you suggested writing a JUnit test: I think this (https://docs.camunda.org/manual/7.14/user-guide/dmn-engine/testing/) is for testing the DMN, but the question is how to push a large DMN to the server without opening it in Camunda Modeler. Only once I have pushed my large DMN to the server can I write a JUnit test to check it.

Thanks and Regards
Deepak Srivastava

Hi @Deepak_Srivastava,

The JUnit test will start a Camunda engine with an in-memory database as part of the test. There is no further installation required.

Instead of the documented version, you can write it a bit shorter with:

  // withVariables(), decisionService() and assertThat() come in via static
  // imports from camunda-bpm-assert's ProcessEngineTests and AssertJ's Assertions
  @Test
  @Deployment(resources = {"tweetpruefung.dmn"}) // deploys the DMN to the test engine
  public void testPasswortTweeten() {
    Map<String, Object> variables = withVariables("content", "passwort: 123456", "email", "egal@email.de");

    // evaluate the deployed decision table by its key
    DmnDecisionTableResult decisionTableResult = decisionService().evaluateDecisionTableByKey("tweetpruefung", variables);

    assertThat(decisionTableResult.getFirstResult()).containsEntry("approved", true);
  }

Just adjust the input parameters and expected values to your needs.

A project template with all Maven dependencies for this JUnit test is available on GitHub: https://github.com/camunda/camunda-engine-unittest

If you want to deploy the file directly into a running Camunda engine, you can use the REST API: https://docs.camunda.org/manual/7.14/reference/rest/deployment/post-deployment/
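For completeness, a rough sketch of such a REST deployment with the plain JDK HttpClient (the engine URL, file name, and deployment name are placeholders; note that this reads the whole file into memory, so for multi-GB files you would need a streaming multipart body instead):

  import java.io.ByteArrayOutputStream;
  import java.io.IOException;
  import java.net.URI;
  import java.net.http.HttpClient;
  import java.net.http.HttpRequest;
  import java.net.http.HttpResponse;
  import java.nio.charset.StandardCharsets;
  import java.nio.file.Files;
  import java.nio.file.Path;

  public class RestDeploy {

    public static void main(String[] args) throws IOException, InterruptedException {
      Path dmn = Path.of("huge-decision.dmn"); // placeholder file name
      String boundary = "----dmn-deployment";

      // build the multipart/form-data body by hand: a deployment-name part
      // plus the DMN file itself (the JDK HttpClient has no multipart helper)
      ByteArrayOutputStream body = new ByteArrayOutputStream();
      body.write(part(boundary,
          "Content-Disposition: form-data; name=\"deployment-name\"\r\n\r\nhuge-dmn\r\n"));
      body.write(part(boundary,
          "Content-Disposition: form-data; name=\"data\"; filename=\"" + dmn.getFileName()
              + "\"\r\nContent-Type: application/octet-stream\r\n\r\n"));
      body.write(Files.readAllBytes(dmn)); // caution: loads the whole file into memory
      body.write(("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8));

      HttpRequest request = HttpRequest.newBuilder()
          .uri(URI.create("http://localhost:8080/engine-rest/deployment/create"))
          .header("Content-Type", "multipart/form-data; boundary=" + boundary)
          .POST(HttpRequest.BodyPublishers.ofByteArray(body.toByteArray()))
          .build();

      HttpResponse<String> response = HttpClient.newHttpClient()
          .send(request, HttpResponse.BodyHandlers.ofString());
      System.out.println(response.statusCode() + " " + response.body());
    }

    static byte[] part(String boundary, String headersAndValue) {
      return ("--" + boundary + "\r\n" + headersAndValue).getBytes(StandardCharsets.UTF_8);
    }
  }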

Why not? From my experience, a decision table with more than 10 input parameters and more than 10 output parameters doesn’t serve just a single decision. It has grown over time, and from the business point of view it covers many special decisions where most of the parameters are not needed or null.

If this is the case, I would recommend going the extra mile: have a look at common parameter combinations and try to extract them into separate tables.

You can find a simple example in our dmn simulator: Camunda DMN-Simulator.

Hope this helps, Ingo


Thanks @Ingo_Richtsmeier.
I tried to push the DMN using the REST service, but it just kept running and no response came.
I checked the CPU/memory usage, which was normal, but no response came even after 3.6 hours.
I tried to push a 25 MB DMN using the service (https://docs.camunda.org/manual/7.14/reference/rest/deployment/post-deployment/) on a machine with 8 cores and 32 GB of memory.
I will check the other option, the JUnit test.
Regards
Deepak Srivastava