CI flakiness with HWP files #968

Open
apyrgio opened this issue Oct 23, 2024 · 4 comments
Labels
bug Something isn't working github_actions Pull requests that update GitHub Actions code

Comments

@apyrgio
Contributor

apyrgio commented Oct 23, 2024

Our CI tests fail from time to time when converting our sample .hwp/.hwpx files. For example, see this CI test run: https://github.com/freedomofpress/dangerzone/actions/runs/11450930480/job/31859301008#step:10:991. These failures are intermittent, since the tests pass if we re-run the CI job.

The failed jobs don't report any helpful error message, so we need to dig deeper.
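
One way to dig in is to pull the whole job log locally (a sketch, assuming the GitHub CLI is installed; the run ID comes from the URL above):

$ gh run view 11450930480 --repo freedomofpress/dangerzone --log > /tmp/ci-run.log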

@apyrgio apyrgio added bug Something isn't working github_actions Pull requests that update GitHub Actions code labels Oct 23, 2024
@apyrgio
Contributor Author

apyrgio commented Oct 23, 2024

We can run the HWP conversion locally in a tight loop:

$ while poetry run ./dev_scripts/dangerzone-cli <(base64 -d tests/test_docs_external/sample-hwp.hwp.b64) --output-filename /tmp/test-safe.pdf ; do sleep 1; done

I've left it running for a few minutes, and I haven't encountered an error yet. The logs I get from the container are:

----- DOC TO PIXELS LOG START -----
Installing LibreOffice extension 'h2orestart.oxt'
Converting to PDF using LibreOffice
Converting page 1/6 to pixels
Converting page 2/6 to pixels
Converting page 3/6 to pixels
Converting page 4/6 to pixels
Converting page 5/6 to pixels
Converting page 6/6 to pixels
Converted document to pixels
[COMMAND] unzip -d /usr/lib/libreoffice/share/extensions/h2orestart.oxt/ /libreoffice_ext/h2orestart.oxt
Archive:  /libreoffice_ext/h2orestart.oxt
  inflating: icon/H2Orestart.png
  inflating: description.xml
  inflating: description/desc_en.txt
  inflating: H2Orestart.jar
  inflating: description/desc_ko.txt
  inflating: registry/H2Orestart_types.xcu
  inflating: logger.properties
  inflating: registry/H2Orestart_filters.xcu
  inflating: registry/TypeDetection.xcu
  inflating: META-INF/manifest.xml
[COMMAND] libreoffice --headless --safe-mode --convert-to pdf --outdir /tmp /tmp/input_file
OpenJDK 64-Bit Server VM warning: Can't detect primordial thread stack location - find_vma failed
OpenJDK 64-Bit Server VM warning: Can't detect primordial thread stack location - find_vma failed
OpenJDK 64-Bit Server VM warning: Can't detect primordial thread stack location - find_vma failed
convert /tmp/input_file as a Writer document -> /tmp/input_file.pdf using filter : writer_pdf_Export
----- DOC TO PIXELS LOG END -----

This line may point to a race condition that only surfaces when the conversion runs hundreds of times, but I'm not sure:

OpenJDK 64-Bit Server VM warning: Can't detect primordial thread stack location - find_vma failed
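
In the meantime, to get a sense of how rare the failure is locally, the loop above could keep a counter and stop on the first failure (a sketch, reusing the exact same invocation):

$ i=0; while poetry run ./dev_scripts/dangerzone-cli <(base64 -d tests/test_docs_external/sample-hwp.hwp.b64) --output-filename /tmp/test-safe.pdf; do i=$((i+1)); echo "OK run #$i"; sleep 1; done; echo "failed after $i successful runs"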

@almet
Contributor

almet commented Oct 23, 2024

If some tests are known to be unstable, one way to avoid worrying too much about them is to rerun them automatically with pytest-rerun-failures (yeah, this is a day where I propose pytest-* related tools! 🍵 while waiting for CI to run).

(Of course, it's better to investigate a bit first, but in my experience it's sometimes not easy to find the culprit.)
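
For reference, a sketch of what that could look like here, assuming the plugin is added as a dev dependency and that the HWP conversions are covered by the existing test suite:

$ poetry run pytest --reruns 3 --reruns-delay 1 tests/

The plugin also supports marking individual tests as flaky, which would limit the reruns to just the HWP cases instead of the whole suite.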

@apyrgio
Contributor Author

apyrgio commented Oct 23, 2024

Oh, I didn't know about this tool. It's interesting, but I'm afraid it may mask race conditions that we need to know about :/

@almet
Contributor

almet commented Oct 23, 2024

Yup, I think this should be used only if we don't have the time to investigate, can't find the culprit, or can't fix the race condition 👍

almet added a commit that referenced this issue Nov 6, 2024
It seems that these tests are flaky, and as a result our CI pipeline is failing from time to time. This will rerun them automatically when there is an error.

See #968 for more information
almet added a commit that referenced this issue Nov 10, 2024
It seems that these tests are flaky, and as a result our CI pipeline is failing from time to time. This will rerun them automatically when there is an error.

See #968 for more information