System.IO.Net5Compat.Tests and System.IO.Tests suddenly exiting with error 137 #100558
Comments
Tagging subscribers to this area: @dotnet/area-system-io |
@dotnet/area-system-io there are a lot of hits on this, and they are relatively recent. It seems to be happening across many configurations. I think it's worth having a look. |
Exit code 137 means the process was killed with SIGKILL (128 + 9), which on Linux usually points to running out of memory. We have not made any changes to 6.0 in System.IO, so I expect that either there was some infra change (like less memory available) or a bug was introduced in the product itself. The bug would be specific to Linux. @carlossanlop is it possible to perform some kind of binary search based on the merged PRs and when it started to fail? |
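For readers unfamiliar with the convention: POSIX shells report a process terminated by signal N as exit code 128 + N, so 137 corresponds to signal 9. The sketch below only decodes the number; the helper names `decode_exit_code` and `signal_name` are made up for illustration, and a SIGKILL on its own does not prove the OOM killer was the sender.

```python
import signal


def decode_exit_code(code: int):
    """Split a shell-style exit code into (killed_by_signal, signal_number)."""
    # Shell convention: "terminated by signal N" is reported as 128 + N.
    if code > 128:
        return True, code - 128
    return False, None


def signal_name(number: int) -> str:
    """Best-effort lookup of a signal name on the current platform."""
    try:
        return signal.Signals(number).name
    except ValueError:
        # The platform does not define this signal number.
        return f"signal {number}"


killed, number = decode_exit_code(137)
if killed:
    print(f"exit code 137 -> killed by {signal_name(number)} ({number})")
```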
@adamsitnik @jozkee This is one of the most impactful failures in servicing. It only affects System.IO.Tests and System.IO.Net5Compat.Tests. Any chance you can take a look soon? |
@carlossanlop sure, but could you please answer the question I've asked in #100558 (comment) ? |
Sorry, I missed that question. Yes, you can use Kusto. David has used it many times in the past. |
This is the super basic Kusto query you can execute if looking via issue:
TestKnownIssues
| union KnownIssues
| where IssueId == ""
This database stores data from the last 4 months, so hopefully there's still info from April. This is the cluster where you would look for that info: https://dataexplorer.azure.com/clusters/dotnetperf.westus/databases/PerformanceData
Unfortunately, it seems that failure data is not stored if it's not linked to an issue. Thanks @AlitzelMendez for the above info. |
This test is failing a lot with 33 hits over the past 24 hours. We need to bring this back into 9.0.0, get it resolved, and plan to backport whatever change we make to the release/9.0 branch to clean up the failures there. |
I suspect it is this test runtime/src/libraries/System.Runtime/tests/System.IO.Tests/MemoryStream/MemoryStreamTests.cs Lines 103 to 107 in 2694613
It is already disabled and noted to be problematic in certain environments. I don't know how much memory the ADO containers have, but this test does a couple of 2GB allocations. I suspect you are just hitting the CoreCLR version of this Mono failure. #100225 |
That is the only test in |
It's most likely one of the tests that causes the OOM 👍 But I am not sure that it's the only one:
|
This test failure looks like the Linux OOM killer. The .NET process was able to allocate memory, but shortly afterwards Linux ran out of memory. When that happens, Linux runs the OOM killer to start killing processes. See https://www.kernel.org/doc/gorman/html/understand/understand016.html for more information. The OOM killer decided that the .NET process was the right one to take down. |
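As a rough illustration of how the kernel picks a victim: Linux exposes a per-process "badness" value in `/proc/<pid>/oom_score`, and higher values make a process a more likely target. The sketch below just reads that value; the helper name `read_oom_score` is hypothetical, and it returns `None` on systems where the file is not available.

```python
from pathlib import Path


def read_oom_score(pid="self"):
    """Return the kernel's OOM badness score for a process, or None if unavailable."""
    path = Path("/proc") / str(pid) / "oom_score"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # Not Linux, no procfs, or the process has already exited.
        return None


score = read_oom_score()
print("oom_score for this process:", score)
```

Logging this value from inside a test container would not prevent the kill, but it may help compare which process the kernel considers the most expensive at a given moment.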
@vcsjones thanks, I was not aware of that! (BTW it sucks as in a way it hides quite important information like stacktrace of the method that caused OOM) |
I think that test was contributing to the problem. The issue is still occurring in the |
The System.IO.Net5Compat.Tests and the System.IO.Tests test processes are intermittently getting killed on Linux shortly after starting, and the exit code is 137.
Build Information
Build: https://dev.azure.com/dnceng-public/public/_build/results?buildId=627407
Build error leg or test failing: System.IO.Net5Compat.Tests
Error Message
System.IO.Net5Compat.Tests example
Libraries Test Run release coreclr Linux_musl x64 Debug
System.IO.Tests example
Libraries Test Run release mono Linux x64 Debug
Known issue validation
Build: 🔎 https://dev.azure.com/dnceng-public/public/_build/results?buildId=627407
Error message validated:
[Starting: System\.IO\.(Net5Compat\.)?Tests exit code 137]
Result validation: ✅ Known issue matched with the provided build.
Validation performed at: 4/2/2024 11:08:28 PM UTC
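A minimal sketch of what the error message pattern above matches, assuming it is treated as two regular expressions that must be found in order on separate log lines (that ordering rule is an assumption here, not something this issue states). The sample log lines are invented for illustration.

```python
import re

# The two parts of the error message pattern from the known issue.
PATTERNS = [
    r"Starting: System\.IO\.(Net5Compat\.)?Tests",
    r"exit code 137",
]


def matches_known_issue(log_lines):
    """Return True if every pattern is found, in order, on successive lines."""
    index = 0
    for line in log_lines:
        if re.search(PATTERNS[index], line):
            index += 1
            if index == len(PATTERNS):
                return True
    return False


# Hypothetical console log excerpt, not copied from a real build.
sample_log = [
    "Starting: System.IO.Net5Compat.Tests (parallel test collections = on)",
    "Killed",
    "XUnit run finished with exit code 137",
]
print(matches_known_issue(sample_log))
```

The optional group `(Net5Compat\.)?` is what lets a single pattern cover both test assemblies while leaving other `System.IO.*` test projects unmatched.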