Pulsar Broker 2.6.1 enters an infinite loop between these stack frames:
org.apache.bookkeeper.mledger.impl.OpReadEntry.checkReadCompletion
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.internalReadFromLedger
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.asyncReadEntries
org.apache.bookkeeper.mledger.impl.OpReadEntry.lambda$checkReadCompletion$1
To Reproduce
There aren't currently known steps to reproduce the issue in an isolated way. This happens in a load test on a 3 broker, 3 bookie, 3 zk cluster running in Openshift/k8s. The load test covers 7500 Pulsar topics which are read with short-living readers, there aren't any subscriptions. The topics are created in a namespace with retention policy of 10080 minutes. Each topic contains up to 5 messages.
Additional context
Pulsar Slack thread with lots of details at https://apache-pulsar.slack.com/archives/C5Z4T36F7/p1602159831227900
Possible reason
This is the first few lines of code in OpRead.checkReadCompletion method:
void checkReadCompletion() {
if (entries.size() < count && cursor.hasMoreEntries()) {
// We still have more entries to read from the next ledger, schedule a new async operation
if (nextReadPosition.getLedgerId() != readPosition.getLedgerId()) {
cursor.ledger.startReadOperationOnLedger(nextReadPosition, OpReadEntry.this);
}
The usage of cursor.hasMoreEntries() seems invalid here.
In the table of data extracted from the heap dump by using Eclipse MAT OQL, I can see that OpRead.readPosition.ledgerId != OpRead.cursor.readPosition.ledgerId and OpRead.nextReadPosition.ledgerId != OpRead.cursor.ledger.lastConfirmedEntry.ledgerId (cursor write position) .

Therefore, when cursor.hasMoreEntries() is called, it should take OpRead.readPosition.getLedgerId() into account. Currently the ledgerId is ignored and in the case of when cursor really has more entries in a different ledgerId, it will lead to an infinite loop.
The code in checkReadCompletion has another problem. The call to cursor.ledger.startReadOperationOnLedger might call opReadEntry.readEntriesFailed.
PositionImpl startReadOperationOnLedger(PositionImpl position, OpReadEntry opReadEntry) {
Long ledgerId = ledgers.ceilingKey(position.getLedgerId());
if (null == ledgerId) {
opReadEntry.readEntriesFailed(new ManagedLedgerException.NoMoreEntriesToReadException("The ceilingKey(K key) method is used to return the " +
"least key greater than or equal to the given key, or null if there is no such key"), null);
}
The problem here is that there isn't currently a way to signal this back to checkReadCompletion method and the execution will continue calling startReadOperationOnLedger again later on in the same checkReadCompletion method call.
Possibly related issue
I will try to locate this issue.
I've been trying to understand the code more and analyze it. It's not so clear that the problem is in cursor.hasMoreEntries() call.
I added some more columns to the table that I extracted from one of the heap dumps: https://gist.github.com/lhotari/c34db2664b15fc32c9f0e7ae8b37dbda
The goal of this is that when I have the state of the variables in the table, I can manually read the code and debug it like I would have done if I would have had the change to attach a debugger when the broker was in the problematic state.
For example, I added cursor.ledger.currentLedger.lastAddConfirmed which is relevant state to understand how the execution keeps on looping. That is read on this line https://github.com/apache/pulsar/blob/e2fbf31ff34f2aa94276ab6103c73cec23849c1b/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java#L1566. All rows in the extracted table have -1 as cursor.ledger.currentLedger.lastAddConfirmed. therefore, the execution will always go to the branch https://github.com/apache/pulsar/blob/e2fbf31ff34f2aa94276ab6103c73cec23849c1b/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java#L1569
Within that branch, opReadEntry.checkReadCompletion() gets called again and the infinite loop keeps on looping.
Most helpful comment
I will try to locate this issue.