@mergify mergify bot commented Dec 3, 2025

Proposed commit message

The two log messages are written by different goroutines that race:

  • "Closing reader of filestream" is logged from a background goroutine spawned by ctxtool.WithFunc when streamCancel() triggers cancellation
  • "Stopped harvester for file" is logged from the main harvester goroutine via a defer in harvester.go

When streamCancel() executes, it closes a channel that wakes the background goroutine to log "Closing reader", while the main goroutine continues and logs "Stopped harvester". These two goroutines race to write to the log file, making the order non-deterministic.
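The race described above can be reproduced in isolation. The following is a minimal sketch, not the Filebeat code: it uses the standard library `context` package instead of `ctxtool.WithFunc`, and `runOnce` and `logf` are hypothetical names introduced for illustration. Only the two log messages and the `streamCancel` name come from this PR.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// runOnce simulates one harvester shutdown and returns the log lines
// in the order they were written. Hypothetical helper, not Filebeat code.
func runOnce() []string {
	var (
		mu   sync.Mutex
		logs []string
		wg   sync.WaitGroup
	)
	// logf stands in for the logger: both goroutines append to one shared sink.
	logf := func(msg string) {
		mu.Lock()
		defer mu.Unlock()
		logs = append(logs, msg)
	}

	ctx, streamCancel := context.WithCancel(context.Background())

	// Background goroutine: wakes up on cancellation and logs.
	wg.Add(1)
	go func() {
		defer wg.Done()
		<-ctx.Done()
		logf("Closing reader of filestream")
	}()

	// "Main harvester" path: cancels, then logs from a defer.
	func() {
		defer logf("Stopped harvester for file")
		streamCancel()
	}()

	wg.Wait()
	return logs
}

func main() {
	// Count which message was written first across many runs.
	first := map[string]int{}
	for i := 0; i < 1000; i++ {
		first[runOnce()[0]]++
	}
	fmt.Println(first)
}
```

Both messages are always present, but nothing orders the two `logf` calls relative to each other, so which one lands first depends on the scheduler.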

The original test used sequential WaitLogsContains calls, which track the file offset. When the messages appeared in the "wrong" order, the first check advanced the offset past both messages, so the second check could no longer find its message and failed.

The test now uses WaitLogsContainsAnyOrder, which checks for both messages without relying on their order.
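The offset problem can be sketched as follows. `logWatcher`, `contains`, and `containsAnyOrder` are hypothetical stand-ins written for this illustration; they are not the real test framework helpers and make no claim about their signatures. The sketch also assumes the sequential test looked for "Closing reader" first.

```go
package main

import (
	"fmt"
	"strings"
)

// logWatcher is a hypothetical, simplified model of a log checker
// that remembers how far into the log it has already read.
type logWatcher struct {
	data   string
	offset int
}

// contains searches from the current offset and, on a match,
// advances the offset past the matched text.
func (w *logWatcher) contains(s string) bool {
	idx := strings.Index(w.data[w.offset:], s)
	if idx < 0 {
		return false
	}
	w.offset += idx + len(s)
	return true
}

// containsAnyOrder reports whether every message appears somewhere
// after the current offset, regardless of relative order.
func (w *logWatcher) containsAnyOrder(msgs ...string) bool {
	rest := w.data[w.offset:]
	for _, m := range msgs {
		if !strings.Contains(rest, m) {
			return false
		}
	}
	return true
}

func main() {
	const closing = "Closing reader of filestream"
	const stopped = "Stopped harvester for file"

	// The "wrong" order: the harvester message is written first.
	swapped := stopped + "\n" + closing + "\n"

	sequential := &logWatcher{data: swapped}
	fmt.Println(sequential.contains(closing), sequential.contains(stopped)) // true false

	unordered := &logWatcher{data: swapped}
	fmt.Println(unordered.containsAnyOrder(closing, stopped)) // true
}
```

In the sequential case the first search skips over the harvester message to reach the reader message, so the second search starts after both and finds nothing.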

Closes #47784

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works. Where relevant, I have used the stresstest.sh script to run them under stress conditions and race detector to verify their stability.
  • I have added an entry in ./changelog/fragments using the changelog tool.

How to test this PR locally

  1. Start the integration test containers:

    cd filebeat && mage docker:composeUp
  2. Build the test binary:

    mage buildSystemTestBinary
  3. Run the specific test with FIPS mode (reproduces original failure without fix):

    GODEBUG=fips140=only \
    ES_HOST=localhost ES_USER=beats ES_PASS=testing \
    ES_SUPERUSER_USER=admin ES_SUPERUSER_PASS=testing \
    go test -v -failfast -tags "integration,requirefips" \
      -run "TestFilestreamDelete/Inactive_resource_not_finished_and_data_added_during_grace_period" \
      ./tests/integration/ -count=20
  4. Clean up:

    mage docker:composeDown

(cherry picked from commit 5ce630a)
@mergify mergify bot requested a review from a team as a code owner December 3, 2025 18:07
@mergify mergify bot added the backport label Dec 3, 2025
@mergify mergify bot requested review from VihasMakwana and belimawr and removed request for a team December 3, 2025 18:07
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Dec 3, 2025
@mergify mergify bot mentioned this pull request Dec 3, 2025
github-actions bot commented Dec 3, 2025

🤖 GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

@github-actions github-actions bot added bug flaky-test Unstable or unreliable test cases. Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team skip-changelog labels Dec 3, 2025
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Dec 3, 2025