@zilm13
Contributor

@zilm13 zilm13 commented Dec 15, 2025

PR Description

Got this, fixing:

2025-12-15 20:15:13.305 WARN  - UNKNOWN ERROR
java.util.ConcurrentModificationException: null
at java.base/java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:1024) ~[?:?]
at java.base/java.util.LinkedHashMap$LinkedKeyIterator.next(LinkedHashMap.java:1047) ~[?:?]
at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133) ~[?:?]
at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1939) ~[?:?]
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:570) ~[?:?]
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:560) ~[?:?]
at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151) ~[?:?]
at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174) ~[?:?]
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:265) ~[?:?]
at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:636) ~[?:?]
at tech.pegasys.teku.statetransition.datacolumns.DataColumnSidecarRecoveringCustodyImpl.lambda$onSlot$2(DataColumnSidecarRecoveringCustodyImpl.java:139) ~[teku-ethereum-statetransition-develop.jar:25.11.1+47-gfb86d63ec4]
at tech.pegasys.teku.infrastructure.async.SafeFuture.fromRunnable(SafeFuture.java:157) ~[teku-infrastructure-async-develop.jar:25.11.1+47-gfb86d63ec4]
at tech.pegasys.teku.infrastructure.async.AsyncRunner.lambda$runAfterDelay$1(AsyncRunner.java:32) ~[teku-infrastructure-async-develop.jar:25.11.1+47-gfb86d63ec4]
at tech.pegasys.teku.infrastructure.async.SafeFuture.of(SafeFuture.java:74) ~[teku-infrastructure-async-develop.jar:25.11.1+47-gfb86d63ec4]
at tech.pegasys.teku.infrastructure.async.ScheduledExecutorAsyncRunner.lambda$createRunnableForAction$1(ScheduledExecutorAsyncRunner.java:124) ~[teku-infrastructure-async-develop.jar:25.11.1+47-gfb86d63ec4]
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
at java.base/java.lang.Thread.run(Thread.java:1575) [?:?]

Fixed Issue(s)

Documentation

  • I thought about documentation and added the doc-change-required label to this PR if updates are required.

Changelog

  • I thought about adding a changelog entry, and added one if I deemed necessary.

Note

Synchronizes access to recoveryTasks and iterates over a snapshot to avoid ConcurrentModificationException when scheduling recoveries.

  • Concurrency fix in DataColumnSidecarRecoveringCustodyImpl:
    • Synchronize recoveryTasks access when creating/updating tasks (computeIfAbsent).
    • In onSlot, iterate over a synchronized snapshot (new HashSet<>(recoveryTasks.keySet())) instead of the live key set.
    • Safely map each key to its task via Optional.ofNullable(...), then flatten the stream with flatMap(Optional::stream), so that keys removed concurrently are skipped.
    • Add HashSet import for snapshot creation.

Written by Cursor Bugbot for commit ff4e6d9.
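
The snapshot approach described in the note can be sketched in isolation. This is a minimal illustration, not the PR's code: the class SnapshotIterationSketch, its methods, and the Integer/String map are hypothetical stand-ins for recoveryTasks, and the map here is a plain Collections.synchronizedMap, whose documentation says to lock on the returned map while iterating its views. Whether that lock is the right one for the real recoveryTasks type is a separate question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

class SnapshotIterationSketch {
  // Hypothetical stand-in for recoveryTasks: slot number -> task description.
  static Map<Integer, String> newTasks() {
    final Map<Integer, String> tasks = Collections.synchronizedMap(new LinkedHashMap<>());
    tasks.put(1, "task-1");
    tasks.put(2, "task-2");
    tasks.put(3, "task-3");
    return tasks;
  }

  // Streams over the live key set while the map gains an entry: fails fast.
  static boolean liveIterationThrows() {
    final Map<Integer, String> tasks = newTasks();
    try {
      tasks.keySet().stream().forEach(key -> tasks.put(key + 100, "added"));
      return false;
    } catch (final ConcurrentModificationException e) {
      return true;
    }
  }

  // Copies the keys under the map's lock, then iterates over the copy.
  static List<String> snapshotIteration() {
    final Map<Integer, String> tasks = newTasks();
    final Set<Integer> keys;
    synchronized (tasks) {
      keys = new HashSet<>(tasks.keySet());
    }
    tasks.remove(2); // simulates a removal that happens after the snapshot
    final List<String> seen = new ArrayList<>();
    keys.stream()
        .map(key -> Optional.ofNullable(tasks.get(key))) // empty for removed keys
        .flatMap(Optional::stream)
        .forEach(
            task -> {
              seen.add(task);
              tasks.put(seen.size() + 100, "added"); // mutating the map is now harmless
            });
    return seen;
  }
}
```

In this sketch, liveIterationThrows() reproduces the failure mode from the stack trace on a single thread, while snapshotIteration() visits only the tasks that still exist when each key is looked up.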

final Set<SlotAndBlockRoot> recoveryTaskKeys;
synchronized (recoveryTasks) {
  recoveryTaskKeys = new HashSet<>(recoveryTasks.keySet());
}

Bug: Synchronization on wrong object fails to prevent ConcurrentModificationException

The fix synchronizes on recoveryTasks (the outer SynchronizedLimitedMap instance), but LimitedMap.createSynchronizedNatural() wraps its operations with Collections.synchronizedMap(), which locks on an internal delegate object. When createOrUpdateRecoveryTaskForDataColumnSidecar() calls recoveryTasks.computeIfAbsent() on another thread, that call locks on delegate, not on recoveryTasks, so this synchronized block does not block it. The race condition and the ConcurrentModificationException can therefore still occur. SynchronizedLimitedMap.copy() demonstrates the correct pattern: it synchronizes on delegate rather than on this.
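
The lock mismatch described above can be shown with a small, self-contained sketch. Everything here is hypothetical: WrappedMap only mimics the shape the comment describes (an outer object forwarding to a Collections.synchronizedMap delegate) and is not the real SynchronizedLimitedMap, and the 500 ms timeout is an arbitrary choice for the demonstration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

class LockMismatchSketch {
  // Hypothetical wrapper: its operations lock on the delegate, not on `this`.
  static final class WrappedMap<K, V> {
    final Map<K, V> delegate = Collections.synchronizedMap(new LinkedHashMap<>());

    V put(final K key, final V value) {
      return delegate.put(key, value); // the delegate locks on itself internally
    }
  }

  // Returns true if a writer thread managed to finish while the caller held `lock`.
  static boolean writerProceedsWhileHolding(
      final WrappedMap<Integer, String> map, final Object lock) {
    final Thread writer = new Thread(() -> map.put(1, "task"));
    final boolean finished;
    try {
      synchronized (lock) {
        writer.start();
        writer.join(500); // give the writer up to 500 ms while we hold the lock
        finished = !writer.isAlive();
      }
      writer.join(); // the lock is released, so the writer can always finish now
    } catch (final InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return finished;
  }
}
```

Holding the outer object (writerProceedsWhileHolding(map, map)) does not stop the writer, whereas holding map.delegate does, which is why a snapshot taken under the outer lock can still race with computeIfAbsent in a wrapper of this shape.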

@zilm13
Contributor Author

zilm13 commented Dec 17, 2025

I really don't like this way of solving the issue. Maybe use a ConcurrentHashMap without a size limit and prune everything older than finalizedCheckpoint?
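
A rough sketch of what that alternative might look like, under loud assumptions: PrunedTaskMapSketch, SlotAndRoot, addTask, and pruneBefore are hypothetical names, the task is reduced to a String, and the finalized checkpoint is reduced to a single finalizedSlot number. It only illustrates the idea of an unbounded ConcurrentHashMap pruned by slot and says nothing about how the real class would wire this in.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class PrunedTaskMapSketch {
  // Hypothetical key type standing in for SlotAndBlockRoot (records need Java 16+).
  record SlotAndRoot(long slot, String root) {}

  // Unbounded map; ConcurrentHashMap views are weakly consistent and do not
  // throw ConcurrentModificationException when the map changes during iteration.
  private final Map<SlotAndRoot, String> tasks = new ConcurrentHashMap<>();

  void addTask(final long slot, final String root) {
    tasks.computeIfAbsent(new SlotAndRoot(slot, root), key -> "recovery for slot " + key.slot());
  }

  // Drops every task whose slot is strictly before the finalized slot.
  void pruneBefore(final long finalizedSlot) {
    tasks.keySet().removeIf(key -> key.slot() < finalizedSlot);
  }

  int size() {
    return tasks.size();
  }
}
```

With this shape no external locking is needed for iteration, at the cost of the map growing until the next prune, so pruning would have to run whenever finalization advances.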
