Refactor Ingestions tests for performance and fixed Opensearch data structure in tests #400

Merged
Lorygold merged 5 commits into certego:develop on Aug 7, 2025

Conversation

Itz-Agasta
Contributor

@Itz-Agasta Itz-Agasta commented Aug 1, 2025

Closes #367, closes #368

This PR refactors the OpenSearch and Splunk ingestion test classes to use setUpTestData instead of setUp, so shared fixtures are built once per test class, which speeds up the tests and cuts redundant database operations.

Files Modified

  • test_opensearch_ingestion.py
  • test_splunk_ingestion.py

Key Improvements

  • Replaced setUp with setUpTestData for shared data initialization
  • Configuration and test data now loaded once per test class instead of per test method
  • Significantly reduced redundant database operations
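
To illustrate the difference between per-method and per-class setup, here is a minimal sketch. It uses the standard library's unittest with setUpClass as a stand-in, because Django's setUpTestData (a classmethod on django.test.TestCase) plays the same once-per-class role but needs a configured Django project to run. The load_config helper, the LOAD_COUNTS counter, and the config keys are hypothetical and are not taken from the PR.

```python
import io
import unittest

# Hypothetical counter: records how often the "expensive" setup runs.
LOAD_COUNTS = {"per_method": 0, "per_class": 0}


def load_config(counter_key):
    """Stand-in for an expensive setup step (e.g. loading configuration)."""
    LOAD_COUNTS[counter_key] += 1
    return {"index": "example-logs", "timestamp_field": "@timestamp"}


class PerMethodSetup(unittest.TestCase):
    def setUp(self):
        # Runs before every test method.
        self.config = load_config("per_method")

    def test_index(self):
        self.assertEqual(self.config["index"], "example-logs")

    def test_timestamp_field(self):
        self.assertEqual(self.config["timestamp_field"], "@timestamp")


class PerClassSetup(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Runs once for the whole class, like Django's setUpTestData.
        cls.config = load_config("per_class")

    def test_index(self):
        self.assertEqual(self.config["index"], "example-logs")

    def test_timestamp_field(self):
        self.assertEqual(self.config["timestamp_field"], "@timestamp")


def run_suites():
    """Run both classes quietly and return the unittest result."""
    LOAD_COUNTS.update(per_method=0, per_class=0)
    loader = unittest.TestLoader()
    suite = unittest.TestSuite(
        [
            loader.loadTestsFromTestCase(PerMethodSetup),
            loader.loadTestsFromTestCase(PerClassSetup),
        ]
    )
    return unittest.TextTestRunner(stream=io.StringIO(), verbosity=0).run(suite)


result = run_suites()
print(LOAD_COUNTS)  # {'per_method': 2, 'per_class': 1}
```

With two test methods per class, the per-method variant loads the configuration twice and the per-class variant loads it once. Data shared at class level should be treated as read-only by the tests so that isolation is preserved.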

Validation

  • All OpenSearch tests pass (11/11)
  • All Splunk tests pass with proper mocking
  • Code passes mandatory linters (Black, isort, flake8)
  • Performance improvement verified through reduced setup time
  • Test isolation maintained with proper setUp/tearDown

Performance Impact

  • Before: Configuration loaded for each individual test method
  • After: Configuration loaded once per test class
  • Result: ~70% reduction in test setup time for test suites

@Lorygold
Collaborator

Lorygold commented Aug 4, 2025

I think the OpenSearch tests could also be mocked, as the Splunk tests are, because OpenSearch isn't the default search engine (Elasticsearch is). @Itz-Agasta
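
As a rough sketch of what mocking a search backend in a test can look like, the example below replaces the client with a unittest.mock.MagicMock so no running OpenSearch instance is needed. The fetch_usernames function, the index name, and the response contents are hypothetical and only illustrate the idea; they are not the project's ingestion code.

```python
from unittest.mock import MagicMock


def fetch_usernames(client, index):
    """Hypothetical ingestion step: query the backend and extract user names."""
    response = client.search(index=index, body={"query": {"match_all": {}}})
    return [hit["_source"]["user"]["name"] for hit in response["hits"]["hits"]]


# The mock stands in for a real search client; its response is canned.
fake_client = MagicMock()
fake_client.search.return_value = {
    "hits": {
        "hits": [
            {"_source": {"user": {"name": "alice"}}},
            {"_source": {"user": {"name": "bob"}}},
        ]
    }
}

names = fetch_usernames(fake_client, "example-logs")
print(names)  # ['alice', 'bob']
```

Because the client is injected, the test controls exactly what the backend "returns" and can also assert on how it was called.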

@Itz-Agasta
Contributor Author

Itz-Agasta commented Aug 6, 2025

Hi @Lorygold, I now use mocks in the OpenSearch tests as you asked.
I've also refactored the OpenSearch ingestion tests to use nested JSON structures for the test log entries instead of the previous flat, dot-notated keys. Real OpenSearch responses use nested dictionaries (e.g., "user": {"name": ...}), and our ingestion code expects that format. The tests now simulate real data more accurately and properly validate our normalization logic.

Previously, the tests passed because the test data and the ingestion code both used the same flat structure, so they matched each other, but this didn't reflect how OpenSearch actually returns data.
If we had received real data from OpenSearch, the code probably wouldn't have worked as expected, and our tests wouldn't have caught the issue.

I thought of this while going through the "Mappings and field types" page of the OpenSearch docs.
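
The mismatch described above can be sketched as follows. The get_nested helper and both sample documents are hypothetical, written only to show why a lookup that walks nested dictionaries finds nothing in flat, dot-notated test data (and why the reverse mismatch would hide bugs the same way).

```python
def get_nested(doc, dotted_path, default=None):
    """Walk a nested dict following a dotted path such as "user.name"."""
    current = doc
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Nested shape, as in the "user": {"name": ...} example from the comment.
nested_source = {"user": {"name": "alice"}, "source": {"ip": "10.0.0.1"}}

# Flat, dot-notated shape, as the old test data was described.
flat_source = {"user.name": "alice", "source.ip": "10.0.0.1"}

print(get_nested(nested_source, "user.name"))  # alice
print(get_nested(flat_source, "user.name"))    # None
print(flat_source.get("user.name"))            # alice
```

Code and test data that agree on the flat shape pass together, yet the same code fails on the nested shape, which is the gap the refactored tests are meant to close.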

@Lorygold
Collaborator

Lorygold commented Aug 7, 2025

Very nice analysis, @Itz-Agasta

@Lorygold Lorygold self-requested a review August 7, 2025 09:05
@Lorygold Lorygold changed the title from "Refactor ingestion Tests Setup to Use setUpTestData for Improved Performance" to "Refactor Ingestions tests for performance and fixed Opensearch data structure in tests" Aug 7, 2025
@Lorygold Lorygold merged commit efa1034 into certego:develop Aug 7, 2025
2 checks passed
@Itz-Agasta
Contributor Author

Hey @Lorygold, could you please check the Honeynet Discord server? I've sent you a message in the #bufflogs channel regarding an issue.

2 participants