OPENDJ-1172 Deadlock between replication threads during shutdown
Review of the approach: Matthew Swift
The problem is caused by code deep in the call stack that calls ReplicationServer.shutdown() while still holding locks.
Thread 1 holds a lock on MessageHandler.msgQueue. An exception happens during processing, and the thread then calls ReplicationServer.shutdown(), which tries to grab the lock on JEReplicaDB.msgQueue.
Thread 2 holds a lock on JEReplicaDB.msgQueue and then tries to grab the lock on MessageHandler.msgQueue.
The proper fix is to let the exceptions bubble up to the Thread.run() method, releasing all locks in the process, and call ReplicationServer.shutdown() from there.
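A minimal sketch of the approach described above, not the actual OpenDJ code. All class, field, and method names here (BubbleUpShutdownSketch, SketchServer, Worker, queueLock, dbLock, SketchChangelogException) are hypothetical stand-ins: queueLock plays the role of MessageHandler.msgQueue and dbLock the role of JEReplicaDB.msgQueue. The exception leaves the synchronized block before run() handles it, so shutdown() is invoked with no queue lock held.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class BubbleUpShutdownSketch {

    /** Hypothetical stand-in for ChangelogException. */
    static class SketchChangelogException extends Exception {
        SketchChangelogException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for ReplicationServer. */
    static class SketchServer {
        // Plays the role of the lock on JEReplicaDB.msgQueue.
        final Object dbLock = new Object();
        final AtomicBoolean shutDown = new AtomicBoolean(false);

        void shutdown() {
            // Shutdown needs the DB-side lock, as in the scenario above.
            synchronized (dbLock) {
                shutDown.set(true);
            }
        }
    }

    /** Hypothetical worker whose run() is the only place that calls shutdown(). */
    static class Worker implements Runnable {
        // Plays the role of the lock on MessageHandler.msgQueue.
        final Object queueLock = new Object();
        final SketchServer server;
        // Records whether this thread still held queueLock when it shut down.
        volatile boolean heldQueueLockAtShutdown = true;

        Worker(SketchServer server) {
            this.server = server;
        }

        @Override
        public void run() {
            try {
                processOne();
            } catch (SketchChangelogException e) {
                // The synchronized block in processOne() has already exited,
                // so queueLock was released while the exception propagated.
                heldQueueLockAtShutdown = Thread.holdsLock(queueLock);
                server.shutdown();
            }
        }

        // Does not catch the exception and does not call shutdown() itself.
        private void processOne() throws SketchChangelogException {
            synchronized (queueLock) {
                throw new SketchChangelogException("simulated failure");
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SketchServer server = new SketchServer();
        Worker worker = new Worker(server);
        Thread thread = new Thread(worker, "sketch-worker");
        thread.start();
        thread.join();
        System.out.println("shut down: " + server.shutDown.get());
        System.out.println("held queue lock at shutdown: " + worker.heldQueueLockAtShutdown);
    }
}
```

Because the worker never holds queueLock while waiting for dbLock, the lock-order inversion between the two threads cannot occur on this path.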
replication.properties:
Added stack traces to error messages.
ReplicationServerDomain.java:
As a consequence of the change to the error messages, removed the use of MessageBuilder.
JEUtils.java: ADDED
Factored out all the code closing JE Transactions.
DraftCNDB.java, JEChangeNumberIndexDB.java, ReplicationDB.java:
Let ChangelogExceptions propagate up.
Used JEUtils.abort().
As a consequence of the change to the error messages, removed the use of MessageBuilder.
JEReplicaDB.java:
Handled ChangelogException bubbling up here.
Extracted stop(Exception) method.
ReplicationDbEnv.java:
Removed one shutdownOnException() method.
Inlined innerShutdownOnException().
As a consequence of the change to the error messages, removed the use of MessageBuilder.