[9.1](backport #47247) [Filebeat/Filestream] Fix missing last few lines of a file #47620
Filestream could miss ingesting the last few lines of a file when the following happened:
- The Harvester reaches EOF and stops on its backoff.
- The inactive check, which runs on its own goroutine, marks the file as inactive and cancels the reader/harvester context.
- The file watcher, which runs on its own goroutine, detects a change in file size and tries to start a new harvester. The file watcher updates its internal state of the file to its current size.
- The harvester fails to start because there is one already running (the one blocked on backoff wait).
- The backoff expires and the Harvester resumes running and exits right away.
- The file watcher has a state (size) for the file that is different from what was actually ingested, so it does not try to start a new harvester until there is another change in the file. This makes Filebeat miss the last few lines added to the file.

This commit fixes this problem by making the harvester notify the file watcher when it stops, along with the amount of data it has read. During the scan, the file watcher can replace its internal state with the one reported by the harvester, allowing it to start a new harvester if necessary.
---------
Co-authored-by: Orestis Floros <[email protected]>
Co-authored-by: Emilio Alvarez Piñeiro <[email protected]>
Co-authored-by: Copilot <[email protected]>
(cherry picked from commit 3fa1a5e)
# Conflicts:
# filebeat/input/filestream/environment_test.go
# filebeat/input/filestream/filestream.go
# filebeat/input/filestream/fswatch.go
# filebeat/input/filestream/fswatch_test.go
# filebeat/input/filestream/input.go
# filebeat/input/filestream/input_delete_integration_test.go
# filebeat/input/filestream/internal/input-logfile/fswatch.go
# filebeat/input/filestream/internal/input-logfile/harvester.go
# filebeat/input/filestream/internal/input-logfile/harvester_test.go
# filebeat/input/filestream/prospector_creator.go
# filebeat/input/filestream/prospector_creator_test.go
# filebeat/tests/integration/filestream_truncation_test.go
# libbeat/tests/integration/datagenerator.go
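The race and the fix described in the commit message can be illustrated with a minimal Go sketch. This is not the Filebeat implementation: the `watcher` type, `notifyClosed`, and `scan` are hypothetical names invented for illustration, and the real file watcher tracks far more than a size per path. The sketch only shows why resetting the watcher's remembered size to the harvester's final offset lets the next scan see the file as changed again.

```go
package main

import (
	"fmt"
	"sync"
)

// harvesterClosed is a hypothetical notification sent by a harvester when it
// stops. offset is the number of bytes it actually ingested.
type harvesterClosed struct {
	path   string
	offset int64
}

// watcher is a hypothetical, heavily simplified file watcher that remembers
// the last size it saw for each path.
type watcher struct {
	mu     sync.Mutex
	sizes  map[string]int64
	closed []harvesterClosed
}

func newWatcher() *watcher {
	return &watcher{sizes: map[string]int64{}}
}

// notifyClosed records that a harvester stopped after reading offset bytes.
func (w *watcher) notifyClosed(path string, offset int64) {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.closed = append(w.closed, harvesterClosed{path: path, offset: offset})
}

// scan reports whether a change event (and thus a harvester start attempt)
// should be issued for path, given its current size on disk.
func (w *watcher) scan(path string, sizeOnDisk int64) bool {
	w.mu.Lock()
	defer w.mu.Unlock()

	// Replace the remembered size with what closed harvesters really read.
	for _, c := range w.closed {
		w.sizes[c.path] = c.offset
	}
	w.closed = w.closed[:0]

	prev, seen := w.sizes[path]
	w.sizes[path] = sizeOnDisk
	return !seen || prev != sizeOnDisk
}

func main() {
	const path = "/tmp/flog.log"
	w := newWatcher()

	fmt.Println("first scan:", w.scan(path, 1000))
	// The file grows while the old harvester is still blocked on backoff, so
	// the start attempt triggered by this event fails.
	fmt.Println("after append:", w.scan(path, 1200))
	// The old harvester exits having read only the first 1000 bytes.
	w.notifyClosed(path, 1000)
	// The remembered size is reset to 1000, so the change is detected again.
	fmt.Println("after harvester closed:", w.scan(path, 1200))
}
```

Without the `notifyClosed` call, the last `scan` would compare 1200 against 1200 and report no change, which corresponds to the missed lines described above.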
Cherry-pick of 3fa1a5e has failed: To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
Closing this for now as it is bringing features that are not part of 9.1
Proposed commit message
Checklist
- I have made corresponding changes to the documentation
- I have made corresponding change to the default configuration files
- ./changelog/fragments using the changelog tool.
## Disruptive User Impact
Author's Checklist
How to test this PR locally
Run the tests
Manual test
Testing this fix manually is possible, but requires you to monitor the
logs and add data to the file being ingested at a specific time.
At a very high level, the steps are:
harvester
If you run this test without the fix from this PR, after
step 4 Filestream will not try to start any more harvesters for the file,
effectively missing the last few lines.
The best way to manually test this PR is to have two terminals open,
one running Filebeat and another ready to append data to the file
Filebeat is ingesting.
- Create a file with at least 1kb of data and write down its size:
  `flog -n 20 > /tmp/flog.log`
  `wc -c /tmp/flog.log`
- Start Filebeat with the following config:
  filebeat.yml
  To make the logs easier to read, you can send the logs to stdout
  and pipe them through jq:
- Wait for the log entry: `'/tmp/flog.log' is inactive`
- Add data to the file: `flog -n 2 >> /tmp/flog.log`
- Wait for the log entry: `File /tmp/flog.log has been updated`
- Wait for the log entry: `Harvester already running`
- Wait for the log entry: `File is inactive. Closing. Path='/tmp/flog.log'`
- Wait for the log entry: `Stopped harvester for file`
- Wait for the log entry: `Updating previous state because harvester was closed. '/tmp/flog.log': xxx`, where `xxx` is the original file size.
- Wait for the log entry: `File /tmp/flog.log has been updated`
- Wait for the log entry: `Starting harvester for file`
- Wait for the log entry: `End of file reached: /tmp/flog.log; Backoff now.`
- Ensure all events have been read: `wc -l output*.ndjson`.
Related issues
## Use cases
## Screenshots
## Logs
Benchmarks
Go Benchmark
This is likely not very relevant to the final form of this PR, but I ran some benchmarks comparing the different strategies to prevent the race condition when accessing
`offset` and `lastTimeRead` in the harvester. Below are the results and the code.
filebeat/input/filestream/filestream_test.go
Benchbuilder
Latest release: v9.2.1
9.2.1  2m43.075351941s  12264.000000  175.3162
9.2.1  48.46343038s  41269.000000  183.1183
9.2.1  2m47.897040994s  11912.000000  176.5148
9.2.1  4m51.107096736s  6870.000000  178.5985
PR version
9.3.0  2m41.103916351s  12414.000000  175.3734
9.3.0  47.520195331s  42088.000000  182.5625
9.3.0  2m44.102216849s  12188.000000  175.8389
9.3.0  4m56.598482898s  6743.000000  179.3721