Camunda 7.8-ee: Camunda cannot reconnect to restarted AWS RDS


#1

Hi guys…

Have you guys ever experienced a reconnection issue with an AWS RDS PostgreSQL database?

My Camunda app servers (2 servers) run behind an ALB load balancer (I use sticky sessions).

Both servers run under the same DNS name (for high availability) and both connect to the same database (AWS RDS, PostgreSQL). The problem appeared when AWS RDS went through maintenance and was restarted: after that, Camunda hung, and we had to restart the service on both servers to get it running again.

Question: how can we make it reconnect automatically without restarting the service?

-- error log ---
Mar  3 15:52:15 prod-camunda-1b-0 start-camunda-live.sh: 15:52:15,650 ERROR [org.camunda.bpm.engine.context] (default task-30) ENGINE-16004 Exception while closing command context:
Mar  3 15:52:15 prod-camunda-1b-0 start-camunda-live.sh: ### Error querying database.  Cause: org.postgresql.util.PSQLException: FATAL: terminating connection due to administrator command
Mar  3 15:52:15 prod-camunda-1b-0 start-camunda-live.sh: ### The error may exist in org/camunda/bpm/engine/impl/mapping/entity/ProcessDefinition.xml
Mar  3 15:52:15 prod-camunda-1b-0 start-camunda-live.sh: ### The error may involve org.camunda.bpm.engine.impl.persistence.entity.ProcessDefinitionEntity.selectProcessDefinitionsByQueryCriteria-Inline
Mar  3 15:52:15 prod-camunda-1b-0 start-camunda-live.sh: ### The error occurred while setting parameters
Mar  3 15:52:15 prod-camunda-1b-0 start-camunda-live.sh: ### SQL: select distinct RES.*                      from ACT_RE_PROCDEF RES                                        order by RES.ID_ asc     LIMIT ? OFFSET ?
Mar  3 15:52:15 prod-camunda-1b-0 start-camunda-live.sh: ### Cause: org.postgresql.util.PSQLException: FATAL: terminating connection due to administrator command: org.apache.ibatis.exceptions.PersistenceException:
Mar  3 15:52:15 prod-camunda-1b-0 start-camunda-live.sh: ### Error querying database.  Cause: org.postgresql.util.PSQLException: FATAL: terminating connection due to administrator command
Mar  3 15:52:15 prod-camunda-1b-0 start-camunda-live.sh: ### The error may exist in org/camunda/bpm/engine/impl/mapping/entity/ProcessDefinition.xml
Mar  3 15:52:15 prod-camunda-1b-0 start-camunda-live.sh: ### The error may involve org.camunda.bpm.engine.impl.persistence.entity.ProcessDefinitionEntity.selectProcessDefinitionsByQueryCriteria-Inline
Mar  3 15:52:15 prod-camunda-1b-0 start-camunda-live.sh: ### The error occurred while setting parameters
Mar  3 15:52:15 prod-camunda-1b-0 start-camunda-live.sh: ### SQL: select distinct RES.*                      from ACT_RE_PROCDEF RES                                        order by RES.ID_ asc     LIMIT ? OFFSET ?
Mar  3 15:52:15 prod-camunda-1b-0 start-camunda-live.sh: ### Cause: org.postgresql.util.PSQLException: FATAL: terminating connection due to administrator command
Mar  3 15:52:15 prod-camunda-1b-0 start-camunda-live.sh: at org.apache.ibatis.exceptions.ExceptionFactory.wrapException(ExceptionFactory.java:30)
Mar  3 15:52:15 prod-camunda-1b-0 start-camunda-live.sh: at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:150)
Mar  3 15:52:15 prod-camunda-1b-0 start-camunda-live.sh: at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:141)
-----------------

#2

Hi,

What application server do you use and, more importantly, what do the DB connection pool settings look like? In other words, do you test on borrow, discard after error, etc.?

In addition, what is the DB URL? I understand that RDS may use DNS to fail over, so do you reference the DNS address?
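
For reference, on WildFly these pool settings live in the `<validation>` element of the datasource. The fragment below is only a rough sketch of what "test on borrow" and "discard after error" could look like for a PostgreSQL datasource. It assumes the IronJacamar PostgreSQL helper classes that ship with WildFly, and the chosen values are illustrative, so check them against the documentation for your WildFly version before using them.

```xml
<!-- Sketch only: place inside the <datasource> element that the process engine uses -->
<validation>
    <!-- Validates a connection before it is handed out (roughly "test on borrow") -->
    <valid-connection-checker class-name="org.jboss.jca.adapters.jdbc.extensions.postgres.PostgreSQLValidConnectionChecker"/>
    <validate-on-match>true</validate-on-match>
    <!-- validate-on-match and background validation are usually treated as alternatives -->
    <background-validation>false</background-validation>
    <!-- Marks fatal PostgreSQL errors so the broken connection is dropped from the pool -->
    <exception-sorter class-name="org.jboss.jca.adapters.jdbc.extensions.postgres.PostgreSQLExceptionSorter"/>
</validation>
```

Validating on every checkout costs an extra round trip per borrow. Background validation is the cheaper alternative, but it only detects dead connections at its configured interval.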

regards

Rob


#3

Thanks for the reply.

You mean the OS version? We use CentOS 7. Here is the configuration (I use WildFly):

        <subsystem xmlns="urn:jboss:domain:datasources:4.0">
            <datasources>
                <datasource jndi-name="java:jboss/datasources/ExampleDS" pool-name="ExampleDS" enabled="true" use-java-context="true">
                    <connection-url>jdbc:postgresql://[DNS_of_DB]:5432</connection-url>
                    <driver>postgresql</driver>
                    <security>
                        <user-name>user</user-name>
                        <password>password</password>
                    </security>
                </datasource>
                <datasource jta="true" jndi-name="java:jboss/datasources/ProcessEngine" pool-name="ProcessEngine" enabled="true" use-java-context="true" use-ccm="true">
                    <connection-url>jdbc:postgresql://[DNS_of_DB]:5432/db</connection-url>
                    <driver>postgresql</driver>
                    <security>
                        <user-name>user</user-name>
                        <password>password</password>
                    </security>
                </datasource>
                <datasource jndi-name="java:/mydatabaseDS" pool-name="mydatabaseDS" enabled="true" use-java-context="true" use-ccm="true">
                    <connection-url>jdbc:postgresql://[DNS_of_DB]:5432/db</connection-url>
                    <driver>postgresql</driver>
                    <pool>
                        <min-pool-size>10</min-pool-size>
                        <max-pool-size>50</max-pool-size>
                        <prefill>false</prefill>
                        <flush-strategy>IdleConnections</flush-strategy>
                    </pool>
                    <security>
                        <user-name>user</user-name>
                        <password>password</password>
                    </security>
                    <validation>
                        <check-valid-connection-sql>SELECT 1</check-valid-connection-sql>
                        <background-validation>true</background-validation>
                        <background-validation-millis>60000</background-validation-millis>
                    </validation>
                </datasource>
                <drivers>
                    <driver name="h2" module="com.h2database.h2">
                        <xa-datasource-class>org.h2.jdbcx.JdbcDataSource</xa-datasource-class>
                    </driver>
                    <driver name="postgresql" module="org.postgresql.jdbc">
                        <xa-datasource-class>org.postgresql.xa.PGXADataSource</xa-datasource-class>
                    </driver>
                </drivers>
            </datasources>
        </subsystem>

I use the DNS name to connect to RDS. I thought it should fail over too, but it did not in this case; I will leave that to AWS. Anyway, from the Camunda perspective, is there any configuration that would help with reconnection when RDS is restarted? :)


#4

Hi,

Some brief detail on RDS failover:

https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.MultiAZ.html#Concepts.MultiAZ.Failover

In particular, given that the JVM may cache DNS names, you need to consider the DNS TTL:

https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/java-dg-jvm-ttl.html

regards

Rob


#5

@Webcyberrob

Thanks for your reply. I have been reading through these links. For the WildFly TTL configuration, can you advise where I can make the change? I normally work on the standalone.xml file only. Thanks in advance.