Disk buffering iterator explicit removal config option#2560
Disk buffering iterator explicit removal config option#2560jaydeluca merged 16 commits intoopen-telemetry:mainfrom
Conversation
jaydeluca
left a comment
There was a problem hiding this comment.
i dont have a lot of experience with how this module is used in the real world, but do we have concerns about overflowing disks if someone doesn't know this change is happening? just trying to think of how to get ahead of that
...ng/src/main/java/io/opentelemetry/contrib/disk/buffering/internal/storage/FolderManager.java
Outdated
Show resolved
Hide resolved
...ng/src/main/java/io/opentelemetry/contrib/disk/buffering/internal/storage/FolderManager.java
Outdated
Show resolved
Hide resolved
...c/main/java/io/opentelemetry/contrib/disk/buffering/internal/storage/files/ReadableFile.java
Outdated
Show resolved
Hide resolved
...ng/src/main/java/io/opentelemetry/contrib/disk/buffering/internal/storage/FolderManager.java
Outdated
Show resolved
Hide resolved
...uffering/src/main/java/io/opentelemetry/contrib/disk/buffering/internal/storage/Storage.java
Outdated
Show resolved
Hide resolved
…fering/internal/storage/FolderManager.java Co-authored-by: Jay DeLuca <jaydeluca4@gmail.com>
…fering/internal/storage/FolderManager.java Co-authored-by: Jay DeLuca <jaydeluca4@gmail.com>
…fering/internal/storage/Storage.java Co-authored-by: Jay DeLuca <jaydeluca4@gmail.com>
There was a problem hiding this comment.
Pull request overview
This PR changes disk-buffering iteration semantics so that reading no longer implicitly removes items/files from disk; instead, consumers must explicitly delete via Iterator.remove() (or ReadableResult.delete()), aligning with standard Iterator expectations and addressing issue #2540.
Changes:
- Stop implicitly deleting stored batches/files during iteration; require explicit deletion (
Iterator.remove()). - Add readable-file selection filtering to skip already-processed/invalid files and improve corrupted-data handling.
- Update tests and README examples to reflect the new explicit-deletion behavior.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| disk-buffering/src/main/java/io/opentelemetry/contrib/disk/buffering/internal/storage/StorageIterator.java | Removes implicit delete-on-advance; keeps deletion behind Iterator.remove(). |
| disk-buffering/src/main/java/io/opentelemetry/contrib/disk/buffering/internal/storage/Storage.java | Introduces exclusion predicate to move past finished/corrupt files without deleting them implicitly. |
| disk-buffering/src/main/java/io/opentelemetry/contrib/disk/buffering/internal/storage/FolderManager.java | Adds cache-file parsing/selection with predicate-based exclusion and validation of file names. |
| disk-buffering/src/main/java/io/opentelemetry/contrib/disk/buffering/internal/storage/files/ReadableFile.java | Stops deleting the underlying file automatically at EOF; adds created-time tracking. |
| disk-buffering/src/main/java/io/opentelemetry/contrib/disk/buffering/internal/storage/files/reader/DelimitedProtoStreamReader.java | Treats partial reads as IO corruption by throwing IOException instead of returning null. |
| disk-buffering/src/test/java/io/opentelemetry/contrib/disk/buffering/internal/storage/files/ReadableFileTest.java | Updates expectations: empty file is no longer deleted merely by reading. |
| disk-buffering/src/test/java/io/opentelemetry/contrib/disk/buffering/internal/storage/StorageTest.java | Updates expectations: storage directory retains file when items aren’t explicitly deleted. |
| disk-buffering/src/test/java/io/opentelemetry/contrib/disk/buffering/internal/storage/FolderManagerTest.java | Refactors to new getReadableFile(Predicate) API and adds custom-filter coverage. |
| disk-buffering/src/test/java/io/opentelemetry/contrib/disk/buffering/IntegrationTest.java | Adds explicit-removal integration scenario; asserts on-disk retention vs removal. |
| disk-buffering/README.md | Updates usage example and documents explicit deletion options. |
...ng/src/main/java/io/opentelemetry/contrib/disk/buffering/internal/storage/FolderManager.java
Show resolved
Hide resolved
...uffering/src/main/java/io/opentelemetry/contrib/disk/buffering/internal/storage/Storage.java
Outdated
Show resolved
Hide resolved
...rc/test/java/io/opentelemetry/contrib/disk/buffering/internal/storage/FolderManagerTest.java
Outdated
Show resolved
Hide resolved
…fering/internal/storage/files/ReadableFile.java Co-authored-by: Jay DeLuca <jaydeluca4@gmail.com>
…e, to keep the current behavior
A disk overflow shouldn't happen because there's a safeguard mechanism that checks the max folder size and max file age for read config options and purges old data if needed. However, I do understand the concern about introducing a silent behavior change. After giving it some thought, I decided to keep the existing behavior as the default and instead allow for an "explicit removal" feature as a configurable, opt-in option, as it seems to me that the existing behavior of automatically deleting items suits most cases; otherwise, we would've gotten more people complaining about it by now. I've updated the readme to better explain all the options that users have after these changes. |
jaydeluca
left a comment
There was a problem hiding this comment.
LGTM! will just wait for one more set of eyes cc @zeitlinger @breedx-splk @bencehornak
| ``` | ||
|
|
||
| Note that even with explicit deletion, disk usage is still bounded by the configured max folder size and max file | ||
| age, so stale files are automatically purged when there's not enough space available before new data is written. |
bencehornak
left a comment
There was a problem hiding this comment.
Thanks for this, LGTM!
| fileExclusion = file -> file.getCreatedTimeMillis() <= currentFileCreatedTime; | ||
| readableFile.close(); | ||
| readableFileRef.set(null); | ||
| return doReadNext(deserializer, ++attemptNumber); |
There was a problem hiding this comment.
Not related, but loops are better than recursion as they don't pollute the stack.
There was a problem hiding this comment.
It's been ages since I wrote this part. I honestly don't remember why I chose recursion, but I think it's because on a happy path, it should only be executed once, so the recursion would only be present on exceptions, hence an iteration was not the expected, common behavior. I think I understand what you're saying, but so far it doesn't seem to cause an issue in practice, and it also doesn't seem to be a trivial change, so it wouldn't fall under my current set of priorities. Still, if you're interested in finding an alternative solution, feel free to open a PR, and I'll take a look 👍
...uffering/src/main/java/io/opentelemetry/contrib/disk/buffering/internal/storage/Storage.java
Show resolved
Hide resolved
4b01654
Addresses #2540 by making item deletion during iteration configurable.
A new
deleteItemsOnIterationoption has been added toFileStorageConfiguration, defaulting totrueto preserve backward compatibility. When set tofalse, items are no longer implicitly removed from disk during iteration, so users must callIterator.remove()explicitly to delete items in that scenario.Example usage with explicit deletion: