OPENDJ-1262 NPE in ChangeNumberIndex during server startup
Code review: Matthew Swift
Caused by r10049.
The problem comes from the initialization sequence:
1. thread 1 - MultimasterReplication.initializeSynchronizationProvider()
1.1. it creates the ReplicationServerListener
1.1.1. the ReplicationServerListener in turn creates the ReplicationServer
1.1.1.1. the ReplicationServer in turn creates the ChangelogDB
1.1.1.1.1. the ChangelogDB in turn creates the ChangeNumberIndexer thread
1.1.1.1.2. the ChangelogDB starts the ChangeNumberIndexer thread
1.2. it proceeds with creating the LDAPReplicationDomain objects one by one
2. thread 2 - ChangeNumberIndexer.run()
2.1. it calls ChangeNumberIndexer.initialize()
2.1.1. ChangeNumberIndexer.initialize() calls MultimasterReplication.isECLEnabledDomain(baseDN)
Steps 1.2. and 2.1.1. are running concurrently.
If 2.1.1. runs before 1.2. has completed, then in ChangeNumberIndexer.initialize():
1) MultimasterReplication.isECLEnabledDomain(baseDN) returns false, hence a cursor to the relevant replica DBs is not created
2) the call to nextChangeForInsertDBCursor.getRecord() then returns null, and a NullPointerException is thrown later on: the ChangeNumberIndexer thread is in an illegal state, because it expected to find an UpdateMsg with the correct CSN stamped on it.
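
The ordering problem above can be sketched in isolation. This is a simplified, single-threaded illustration and not the OpenDJ code: the class StartupRaceSketch, the firstRecord() helper and the String record are hypothetical stand-ins, and the two calls in main() simply replay the two possible orderings of steps 1.2. and 2.1.1.

```java
import java.util.Collections;
import java.util.Set;

final class StartupRaceSketch {
    /** Hypothetical stand-in for MultimasterReplication.isECLEnabledDomain(baseDN). */
    static boolean isECLEnabledDomain(Set<String> registeredDomains, String baseDN) {
        return registeredDomains.contains(baseDN);
    }

    /**
     * Hypothetical stand-in for ChangeNumberIndexer.initialize(): returns the record
     * the indexer would read first, or null when no cursor was created.
     */
    static String firstRecord(Set<String> registeredDomains, String baseDN) {
        if (!isECLEnabledDomain(registeredDomains, baseDN)) {
            return null; // domain not registered yet: no cursor, hence no record
        }
        return "UpdateMsg for " + baseDN;
    }

    public static void main(String[] args) {
        String baseDN = "dc=example,dc=com";
        // Step 2.1.1. runs before step 1.2. has registered the domain.
        String early = firstRecord(Collections.<String>emptySet(), baseDN);
        // Step 2.1.1. runs after step 1.2. has completed.
        String late = firstRecord(Collections.singleton(baseDN), baseDN);
        System.out.println("early = " + early);
        System.out.println("late = " + late);
    }
}
```

Any caller that dereferences the "early" result without a null check fails with a NullPointerException, which mirrors the failure described above.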
MultimasterReplication.java:
Added State enum + state instance member to tell whether MultimasterReplication is ready for work.
Removed the isRegistered instance member, superseded by the state instance member.
In isECLEnabledDomain(), completeSynchronizationProvider() and finalizeSynchronizationProvider(), deal with thread waits.
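
A minimal sketch of how such a state gate could work, assuming a plain wait()/notifyAll() scheme on a private lock. The class name ReplicationStateSketch, the State values (STARTING, RUNNING, STOPPING), the registerDomain() method and the demo() helper are assumptions made for illustration; the actual MultimasterReplication change may differ.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicBoolean;

final class ReplicationStateSketch {
    /** Hypothetical lifecycle states; the real enum values may differ. */
    enum State { STARTING, RUNNING, STOPPING }

    private final Object stateLock = new Object();
    private State state = State.STARTING;
    private final Set<String> eclEnabledDomains = new HashSet<>();

    /** Stand-in for the creation of one replication domain (step 1.2.). */
    void registerDomain(String baseDN) {
        synchronized (stateLock) {
            eclEnabledDomains.add(baseDN);
        }
    }

    /** All domains are created: release any thread blocked in isECLEnabledDomain(). */
    void completeSynchronizationProvider() {
        synchronized (stateLock) {
            state = State.RUNNING;
            stateLock.notifyAll();
        }
    }

    /** Shutdown: wake up waiters so they do not block forever. */
    void finalizeSynchronizationProvider() {
        synchronized (stateLock) {
            state = State.STOPPING;
            stateLock.notifyAll();
        }
    }

    /** Blocks while initialization is in progress, then answers from the registered domains. */
    boolean isECLEnabledDomain(String baseDN) throws InterruptedException {
        synchronized (stateLock) {
            while (state == State.STARTING) {
                stateLock.wait();
            }
            return state == State.RUNNING && eclEnabledDomains.contains(baseDN);
        }
    }

    /** Replays the startup sequence with an indexer-like thread started early. */
    static boolean demo() {
        final ReplicationStateSketch repl = new ReplicationStateSketch();
        final AtomicBoolean seen = new AtomicBoolean(false);
        Thread indexer = new Thread(() -> {
            try {
                seen.set(repl.isECLEnabledDomain("dc=example,dc=com"));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        indexer.start();                          // like step 1.1.1.1.2.
        repl.registerDomain("dc=example,dc=com"); // like step 1.2.
        repl.completeSynchronizationProvider();
        try {
            indexer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println("domain seen as ECL enabled: " + demo());
    }
}
```

Whichever thread runs first, the lookup in this sketch only answers once the state has left STARTING, so the indexer-like thread cannot observe a partially registered set of domains.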
DomainFakeCfg.java:
Implemented toString().