Fix test_annotation_pipeline fails with half-precision-model=True
#477
Conversation
Local tests have shown that:
Also not relevant for the current bug:
I could not find any issue or release note for Torch 2.6 directly related to what we experience here.
This may also be a good step:
UPD: Calling …
Unrelated: …
force-pushed from d2f3dc8 to 6725e89
test_annotation_pipeline fails with half-precision-model=True
Calling `use_deterministic_algorithms()` removed the differences in results for `half_precision_ops`, both between versions and between different machines (local and CI tests).
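For reference, a minimal sketch of how such a deterministic run could be wrapped (the helper name and the save/restore pattern are illustrative, not code from this repository):

```python
import torch


def run_deterministically(fn, *args, **kwargs):
    # Enable deterministic torch kernels for the duration of the call,
    # then restore the previous setting so other tests are unaffected.
    previous = torch.are_deterministic_algorithms_enabled()
    torch.use_deterministic_algorithms(True)
    try:
        return fn(*args, **kwargs)
    finally:
        torch.use_deterministic_algorithms(previous)
```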
Nice! But is there a min torch version that this requires?
Our minimal torch 1.10 already supports it, so it should be ok. Anyways, bfloat16 has only 8 significant precision bits; I think we should not try to get it 10e-6 exact, but around 10e-3.
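As a rough illustration of that precision limit (the constant below is an arbitrary example value, not one from the tests): bfloat16 keeps only 8 significant bits, so merely storing a number already introduces an error far above 1e-6.

```python
import torch

x = torch.tensor(0.123456789, dtype=torch.bfloat16)
print(x.item())                      # ~0.1235, already off in the 4th decimal place
print(abs(x.item() - 0.123456789))   # ~7.8e-5, orders of magnitude above 1e-6
```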
What is the reason for using …?
Hmm, strange. With my recent change, only … EDIT: I was totally assuming the tests on the CI are running with torch=2.7... is this not the case?
Btw, do you have a GPU on your local machine? (I don't)
It is the default in PyTorch autocast for CPU, so I think we just took it.
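A small sketch of what that default looks like (module and shapes are arbitrary; the behavior follows the documented CPU autocast defaults):

```python
import torch

# On CPU, torch.autocast defaults to bfloat16 (on CUDA the default is float16),
# so matmul-heavy ops like Linear produce bfloat16 outputs inside the context.
with torch.autocast(device_type="cpu"):
    out = torch.nn.Linear(4, 4)(torch.randn(2, 4))

print(out.dtype)  # torch.bfloat16
```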
Yes, I do! But I double-checked that the tests run on CPU. I also tried the tests on GPU; the results are also different from CPU, but I think they were stable across versions. I could rerun them to make sure.
@RainbowRivey Can you check one more time which tests pass now locally on your machine (with and w/o GPU)? For me, everything is green now (locally), for both versions (from poetry.lock and from pyproject.toml).
Here are the results of the local tests:
torch=2.3.0 (from poetry.lock): (test output collapsed)
or shortly: …
and with torch==2.7.1 (all latest from pyproject.toml): (test output collapsed)
================================================================ FAILURES ================================================================
…lf precision model
force-pushed from b27e0f9 to b4c5714
Codecov Report
❌ Patch coverage is …
Additional details and impacted files
@@ Coverage Diff @@
## main #477 +/- ##
==========================================
- Coverage 72.20% 72.10% -0.10%
==========================================
Files 32 32
Lines 2173 2176 +3
Branches 316 318 +2
==========================================
Hits 1569 1569
- Misses 523 526 +3
Partials 81 81

☔ View full report in Codecov by Sentry.
…rch version (from pyproject.toml), see #480
This is the last non-breaking piece of the `pie-core` refactor. See [this](ArneBinder/pie-core#17) for context.

Requires:
- [x] #477
- [x] [pie-core 0.2.1](https://github.com/ArneBinder/pie-core/releases/tag/v0.2.1) because of ArneBinder/pie-core#80
This PR adjusts `test_annotation_pipeline` in such a way that the current behavior is made transparent: what parameter setting / dependency versions / device causes which result scores? Later (probably breaking) PRs will address anything that needs to be fixed.

In detail, this PR does:
- in `test_annotation_pipeline`:
  - … (`resolve()` etc.)
  - … `half_precision_ops` and `half_precision_model`
  - … `1e-6` … `10e-2` as absolute tolerance when `half_precision_model` (reasoning: using `half_precision_model` on CPU results in using `dtype=torch.bfloat16`, which has only 8 significant precision bits, so we use `10e-2` as absolute tolerance)
  - … `torch.use_deterministic_algorithms` to make sure results are as reproducible as possible.

In addition, this PR changes the following:
- … `half_precision_ops` is used in combination with `half_precision_model`, because of a recommendation in the PyTorch documentation; check just that warning in the respective test case (not the scores anymore!), see the sketch below
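A hedged sketch of what such a warning check can look like; `configure` below is only a stand-in for the real pipeline setup, and the warning text is illustrative:

```python
import warnings

import pytest


def configure(half_precision_model: bool, half_precision_ops: bool) -> None:
    # Stand-in for the real pipeline construction: only the warning behavior matters here.
    if half_precision_model and half_precision_ops:
        warnings.warn(
            "half_precision_ops is not recommended together with half_precision_model",
            UserWarning,
        )


def test_half_precision_combination_warns():
    with pytest.warns(UserWarning, match="half_precision_ops"):
        configure(half_precision_model=True, half_precision_ops=True)
```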
**Important**
Though the current half-precision model/ops tests are passing now with the lowered tolerance, the difference may still exceed this limit when upgrading `poetry.lock`. We may need to adjust the scores when updating!
The reason for that is that the `bfloat16` type (the default for half precision model/ops on CPU) is quite unstable: it has only 8 precision bits, and results of calculations with it may depend strongly on the order of operations, which leads to different results on different Torch and/or backend versions, and even on different machines.
Also, these tests may still fail locally on some machines because of this.
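A tiny demonstration of that order dependence (toy values, not from the test suite): summing the same numbers in a different order in bfloat16 can change the result by far more than 1e-6.

```python
import torch

vals = [1.0, 1e-2, 1e-2, 1e-2, 1e-2]

a = torch.tensor(0.0, dtype=torch.bfloat16)
for v in vals:                      # add the large value first
    a = a + torch.tensor(v, dtype=torch.bfloat16)

b = torch.tensor(0.0, dtype=torch.bfloat16)
for v in reversed(vals):            # add the small values first
    b = b + torch.tensor(v, dtype=torch.bfloat16)

print(a.item(), b.item())           # e.g. 1.03125 vs. 1.0390625, a difference of ~8e-3
```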
Requires:
- … nox #479

TODO:
- … (without lowering the tolerance), locally and at CI
- … (abs=1e-6) for full precision
- … (abs=1e-2) for half precision (both cases: `half_precision_model` and `half_precision_ops`)

Related:
Follow-up:
- … main #478
- … `PyTorchIEPipeline` from `pie_core.AnnotationPipeline` #475