slow macOS - "##[error]The job running on agent Azure Pipelines 9 ran longer than the maximum time of 60 minutes." #1883

Open
3 tasks
jeffschwMSFT opened this issue Jan 24, 2024 · 7 comments

Comments

@jeffschwMSFT
Member

jeffschwMSFT commented Jan 24, 2024

Build

https://dnceng.visualstudio.com/internal/_build/results?buildId=2360768&view=results

Error

##[error]The job running on agent Azure Pipelines 9 ran longer than the maximum time of 60 minutes. For more information, see https://go.microsoft.com/fwlink/?linkid=2077134

Build leg reported

vsos

Pull Request

No response

Known issue core information

Fill out the known issue JSON section by following the step-by-step documentation on how to create a known issue.

 {
    "ErrorMessage" : "",
    "BuildRetry": false,
    "ErrorPattern": "The job running on agent Azure Pipelines .+ ran longer than the maximum time of .+ minutes.",
    "ExcludeConsoleLog": false
 }
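
As a rough illustration of what the `ErrorPattern` above is meant to catch, the sketch below applies the same pattern to the error line reported in this issue. It assumes Python's `re` module behaves like the regex engine used by the known-issue tooling for this simple pattern (the tooling's actual engine is not stated here), and the helper name `matches_known_issue` is hypothetical.

```python
import re

# Pattern copied verbatim from the "ErrorPattern" field of the JSON above.
# Note: the trailing "." is unescaped, so it matches any single character.
ERROR_PATTERN = (
    r"The job running on agent Azure Pipelines .+ "
    r"ran longer than the maximum time of .+ minutes."
)


def matches_known_issue(log_line: str) -> bool:
    """Return True if the line contains text matching the known-issue pattern.

    Hypothetical helper for illustration; not part of any dnceng tooling.
    """
    return re.search(ERROR_PATTERN, log_line) is not None


# The error line quoted in the "Error" section of this issue.
sample = (
    "##[error]The job running on agent Azure Pipelines 9 ran longer than "
    "the maximum time of 60 minutes. For more information, see "
    "https://go.microsoft.com/fwlink/?linkid=2077134"
)

print(matches_known_issue(sample))
print(matches_known_issue("Build succeeded."))
```

Because both the agent number and the time limit are matched with `.+`, the pattern is not specific to macOS or to the 60-minute limit, which may explain why Linux legs also show up in the report below.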

@dotnet/dnceng

Release Note Category

  • Feature changes/additions
  • Bug fixes
  • Internal Infrastructure Improvements

Release Note Description

Additional information about the issue reported

No response

Known issue validation

Build: 🔎 https://dev.azure.com/dnceng/internal/_build/results?buildId=2360768
Error message validated: [The job running on agent Azure Pipelines .+ ran longer than the maximum time of .+ minutes.]
Result validation: ✅ Known issue matched with the provided build.
Validation performed at: 2/7/2024 1:03:13 AM UTC

Report

Build Definition Step Name Console log Pull Request
982086 dotnet/runtime Libraries Test Run checked coreclr osx x64 Debug Log dotnet/runtime#113564
982070 dotnet/runtime osx-arm64 Release CoreCLR_Release Log
982067 dotnet/runtime osx-arm64 Release CoreCLR_Release Log
982065 dotnet/runtime osx-x64 Release CoreCLR_Release Log
982060 dotnet/runtime osx-x64 Release CoreCLR_Release Log
2664523 dotnet-sdk Darwin_AoT_Tests Log #48509
2664522 dotnet-sdk Darwin_AoT_Tests Log #48510
2664521 dotnet-sdk Darwin_AoT_Tests Log #48508
2664344 dotnet-sdk Darwin_AoT_Tests Log #48501
2664345 dotnet-sdk Darwin_AoT_Tests Log #48502
2664078 dotnet-sdk Darwin Log #48477
2663766 dotnet-sdk Darwin Log #48479
2663765 dotnet-sdk Darwin_AoT_Tests Log #48477
2663764 dotnet-sdk Darwin_AoT_Tests Log #48478
2663664 dotnet-sdk TestBuild: macOS (x64) Log #48482
979726 dotnet/sdk OSX_x64 Log dotnet/sdk#47338
2663552 dotnet-sdk Darwin Log #48479
2662865 dotnet-performance Performance scenarios ubuntu 2204 x64 Tiger main Log
2662645 dotnet-sdk Darwin Log #48427
2662650 dotnet-sdk TestBuild: macOS (x64) Log #48424
2662651 dotnet-sdk TestBuild: macOS (x64) Log #48432
2662644 dotnet-sdk Darwin_AoT_Tests Log #48428
2662643 dotnet-sdk Darwin_AoT_Tests Log #48426
2662417 dotnet-runtime Performance ios_scenarios iOSMono JIT iOSLlvmBuild iOSStripSymbols osx x64 perfiphone12mini net10.0 Log
2662050 dotnet-sdk Darwin_AoT_Tests Log #48373
2662049 dotnet-sdk Darwin Log #48374
2662048 dotnet-sdk Darwin Log #48375
2661996 dotnet-dotnet-monitor Test MacOS x64 Release Log
2661878 dotnet-dotnet-monitor Test MacOS x64 Release Log
2661853 dotnet-sdk Darwin Log #48375
2661852 dotnet-sdk Darwin_AoT_Tests Log #48374
2661851 dotnet-sdk Darwin_AoT_Tests Log #48373
977947 dotnet/arcade Linux Build_Debug Log dotnet/arcade#15511
2661696 dotnet-sdk AoT: macOS (x64) Log #48367
978087 dotnet/runtime osx-arm64 Debug PALTests Log
2661789 dotnet-dotnet OSX_arm64 Log
977965 dotnet/roslyn Test_Linux_Debug Log dotnet/roslyn#77567
977914 dotnet/xharness Helix Tests Build_Debug Log dotnet/xharness#1369
977777 dotnet/runtime maccatalyst-x64 Release AllSubsets_Mono Log
977774 dotnet/runtime maccatalyst-arm64 Release AllSubsets_Mono Log
977666 dotnet/sdk OSX_arm64 Log dotnet/sdk#47489
2660746 dotnet-runtime Performance ios_scenarios iOSMono JIT osx x64 perfiphone12mini net10.0 Log
2660291 dotnet-sdk TestBuild: macOS (x64) Log #48211
2660290 dotnet-sdk AoT: macOS (x64) Log #47785
2660148 dotnet-sdk TestBuild: macOS (x64) Log #47786
2660147 dotnet-sdk AoT: macOS (x64) Log #47785
2660068 dotnet-dotnet-monitor Test MacOS x64 Release Log
2660026 dotnet-sdk Darwin Log #48181
2660025 dotnet-sdk Darwin_AoT_Tests Log #48183
2659859 dotnet-sdk AoT: macOS (x64) Log #47786
2660013 dotnet-sdk TestBuild: macOS (x64) Log #48184
2660012 dotnet-sdk AoT: macOS (x64) Log #47785
975077 dotnet/aspnetcore Test: Blazor E2E tests on Linux Log
974616 dotnet/aspnetcore Test: Blazor E2E tests on Linux Log dotnet/aspnetcore#60410
2659855 dotnet-dotnet-monitor Test MacOS x64 Release Log
974871 dotnet/aspnetcore Test: Blazor E2E tests on Linux Log
2659849 dotnet-sdk TestBuild: macOS (x64) Log #47785
974563 dotnet/aspnetcore Test: Blazor E2E tests on Linux Log
974302 dotnet/aspnetcore Test: Blazor E2E tests on Linux Log
974298 dotnet/aspnetcore Test: Blazor E2E tests on Linux Log dotnet/aspnetcore#60356
974292 dotnet/aspnetcore Test: Blazor E2E tests on Linux Log dotnet/aspnetcore#60678
973846 dotnet/runtime osx-arm64 Release CoreCLR_Release Log
973794 dotnet/runtime maccatalyst-arm64 Release AllSubsets_Mono Log
973759 dotnet/runtime tvossimulator-x64 Release AllSubsets_Mono_RuntimeTests Log
973586 dotnet/aspnetcore Test: Blazor E2E tests on Linux Log dotnet/aspnetcore#60445
2658866 dotnet-runtime coreclr Pri1 Runtime Tests Run osx arm64 checked Log
972957 dotnet/aspnetcore Test: Blazor E2E tests on Linux Log dotnet/aspnetcore#60751
972861 dotnet/runtime maccatalyst-arm64 Release AllSubsets_Mono Log
972742 dotnet/performance Performance fsharpmicro ubuntu 2204 x64 Open main Log dotnet/performance#4748
2658179 dotnet-sdk Darwin Log #48181
2658102 dotnet-sdk AoT: macOS (x64) Log #47786
2658101 dotnet-sdk TestBuild: macOS (x64) Log #47785
972492 dotnet/aspnetcore Test: Blazor E2E tests on Linux Log dotnet/aspnetcore#60410
2657881 dotnet-sdk Darwin Log #48182
972197 dotnet/runtime maccatalyst-arm64 Release AllSubsets_Mono Log
972189 dotnet/runtime osx-arm64 Debug PALTests Log
972051 dotnet/runtime iossimulator-x64 Release AllSubsets_Mono Log
971565 dotnet/aspnetcore Test: Blazor E2E tests on Linux Log dotnet/aspnetcore#60410
2657334 dotnet-sdk AoT: macOS (x64) Log #47786
2657333 dotnet-sdk AoT: macOS (x64) Log #47785
2657361 dotnet-dotnet-monitor Test MacOS x64 Release Log
2657363 dotnet-sdk Darwin_AoT_Tests Log #48182
2657364 dotnet-sdk Darwin Log #48181
2657362 dotnet-sdk Darwin Log #48183
2657322 dotnet-sdk TestBuild: macOS (x64) Log #48184
971364 dotnet/aspnetcore Test: Blazor E2E tests on Linux Log
971108 dotnet/runtime maccatalyst-x64 Release AllSubsets_Mono Log
2657012 dotnet-runtime Performance ios_scenarios iOSMono JIT iOSStripSymbols osx x64 perfiphone12mini net10.0 Log
970994 dotnet/aspnetcore Test: Blazor E2E tests on Linux Log dotnet/aspnetcore#60785
970636 dotnet/runtime osx-arm64 Release CoreCLR_Release Log
969909 dotnet/aspnetcore Test: Blazor E2E tests on Linux Log dotnet/aspnetcore#60410
969557 dotnet/aspnetcore Test: Blazor E2E tests on Linux Log dotnet/aspnetcore#60751
2655673 dotnet-runtime Performance ios_scenarios iOSMono JIT iOSLlvmBuild osx x64 perfiphone12mini net10.0 Log
968876 dotnet/runtime iossimulator-x64 Release AllSubsets_Mono Log
2655114 dotnet-performance Performance fsharpmicro ubuntu 2204 x64 Tiger 9.0 Log
2655106 dotnet-performance ubuntu 2204 x64 scenarios Tiger release/6.0 Log
2654969 dotnet-runtime Performance ios_scenarios iOSMono JIT osx x64 perfiphone12mini net10.0 Log
968292 dotnet/aspnetcore Test: Blazor E2E tests on Linux Log dotnet/aspnetcore#60673
967420 dotnet/runtime maccatalyst-x64 Release AllSubsets_Mono Log dotnet/runtime#111666
967026 dotnet/aspnetcore Test: Blazor E2E tests on Linux Log dotnet/aspnetcore#60445
Displaying 100 of 562 results

Summary

24-Hour Hit Count | 7-Day Hit Count | 1-Month Count
10 | 65 | 542
@lewing
Member

lewing commented Feb 9, 2024

🤔

@nagilson
Member

nagilson commented Oct 3, 2024

@dotnet/dnceng @dougbu This has impacted a lot of PRs recently. Could you please take a look at expanding the Mac resources?

@ivanpovazan
Member

@dotnet/dnceng We are hitting this again, and there seems to be a problem communicating with the Helix machines.

More context on the timeouts, which are happening in https://dev.azure.com/dnceng-public/public/_build/results?buildId=930014&view=results :

  • On success: Run tests in Helix step reports:
Waiting for completion of job fbec1164-55c5-4fa8-b492-e1ba1b413119 on osx.1200.amd64.open (Details: https://helix.dot.net/api/jobs/fbec1164-55c5-4fa8-b492-e1ba1b413119/details?api-version=2019-06-17 )
Job 7fdc2720-3d98-4bf3-8113-274dacd69c91 on osx.1200.arm64.open is completed with 6 finished work items.
  Job fbec1164-55c5-4fa8-b492-e1ba1b413119 on osx.1200.amd64.open is completed with 6 finished work items.
  Stopping Azure Pipelines Test Run Helix Tests Build_Debugosx.1200.amd64.open (Results: https://dev.azure.com/dnceng-public/public/_build/results?buildId=923987&view=ms.vss-test-web.build-test-results-tab )
  Stopping Azure Pipelines Test Run Helix Tests Build_Debugosx.1200.arm64.open (Results: https://dev.azure.com/dnceng-public/public/_build/results?buildId=923987&view=ms.vss-test-web.build-test-results-tab )

Build succeeded.

SENDHELIXJOB : warning : Helix queue osx.1200.amd64.open was set for estimated removal date of 2025-01-01. In most cases the queue will be removed permanently due to end-of-life; please contact dnceng for any questions or concerns, and we can help you decide how to proceed and discuss other options. [/home/vsts/work/1/s/tests/integration-tests/Apple/Simulator.Tests.proj]
SENDHELIXJOB : warning : Helix queue osx.1200.arm64.open was set for estimated removal date of 2025-01-01. In most cases the queue will be removed permanently due to end-of-life; please contact dnceng for any questions or concerns, and we can help you decide how to proceed and discuss other options. [/home/vsts/work/1/s/tests/integration-tests/Apple/Simulator.Tests.proj]
    2 Warning(s)
    0 Error(s)

Time Elapsed 00:03:14.97
Killing running build processes...

Finishing: Run tests in Helix

ref: https://dev.azure.com/dnceng-public/public/_build/results?buildId=923987&view=logs&j=ccc97bb6-1a23-5e71-fdfa-3cdca4a74749&t=27fc7eb2-ead9-59e1-6679-a637855d40c5

  • On failure (timeout), the same step gets stuck at:
  Waiting for completion of job 967d92a2-ec10-4332-927f-d28a6563f367 on osx.1200.arm64.open (Details: https://helix.dot.net/api/jobs/967d92a2-ec10-4332-927f-d28a6563f367/details?api-version=2019-06-17 )
  Job 5427159b-500d-49f1-aac0-ec148a492bbe on osx.1200.amd64.open is completed with 6 finished work items.

ref: https://dev.azure.com/dnceng-public/public/_build/results?buildId=930014&view=logs&s=c58bc33c-b825-5bca-90ca-50f6e9293dd8&j=e6966639-fe40-5068-d9ae-681cccecafdf

NOTE: All the tests passed on Helix, but communication with the waiting pipeline step appears to have been lost.
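
The difference between the success and failure logs can be checked mechanically: a job that was waited on but never reported as completed is the one the step is stuck on. The following is a minimal sketch, assuming the two log line shapes quoted in this comment are stable; the function name `unfinished_jobs` and both regular expressions are hypothetical and not part of the Helix SDK.

```python
import re

# Line shapes taken from the "Run tests in Helix" logs quoted in this comment.
WAITING = re.compile(
    r"Waiting for completion of job (?P<job>[0-9a-f-]{36}) on (?P<queue>\S+)"
)
COMPLETED = re.compile(
    r"Job (?P<job>[0-9a-f-]{36}) on (?P<queue>\S+) is completed"
)


def unfinished_jobs(log_text: str) -> dict:
    """Map job id -> queue for jobs that were waited on but never completed."""
    waiting = {}
    done = set()
    for line in log_text.splitlines():
        m = WAITING.search(line)
        if m:
            waiting[m["job"]] = m["queue"]
        m = COMPLETED.search(line)
        if m:
            done.add(m["job"])
    return {job: queue for job, queue in waiting.items() if job not in done}


# The two lines from the failing build quoted above.
failure_log = """\
Waiting for completion of job 967d92a2-ec10-4332-927f-d28a6563f367 on osx.1200.arm64.open (Details: https://helix.dot.net/api/jobs/967d92a2-ec10-4332-927f-d28a6563f367/details?api-version=2019-06-17 )
  Job 5427159b-500d-49f1-aac0-ec148a492bbe on osx.1200.amd64.open is completed with 6 finished work items.
"""

for job, queue in unfinished_jobs(failure_log).items():
    print(f"still waiting on {job} ({queue})")
```

Run against the failing log, this reports only the osx.1200.arm64.open job, which matches the observation that the amd64 job finished while the arm64 job never reported completion to the step.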

@garath
Member

garath commented Jan 27, 2025

Looks like the timeout happened because the osx.1200.arm64.open queue was very busy while the job was running. Right now, I do not think there are any problems with the infrastructure. I will investigate a bit more to see what caused such a back-up.

@garath garath self-assigned this Jan 27, 2025
@garath
Member

garath commented Jan 28, 2025

Ah, the queue was consumed with updates and patching. The patching jobs did run longer than necessary and we've communicated with our partner team about the issue. Future jobs will be much shorter and should not overly impact jobs.

@garath garath removed their assignment Jan 29, 2025
@ivanpovazan
Member

ivanpovazan commented Feb 13, 2025

> Looks like the timeout happened because the osx.1200.arm64.open queue was very busy while the job was running. Right now, I do not think there are any problems with the infrastructure. I will investigate a bit more to see what caused such a back-up.

We are still experiencing the problem.

Should we try to change the queue to osx.13.arm64?

@dotnet/dnceng

@ilyas1974
Contributor

Looking at the failing builds, I notice that the jobs timing out in the hosted pool (Azure Pipelines) are still using the older hardware. I would recommend moving the workloads to the mac-latest-internal or mac-14-arm64 agent specifications. These have the latest Mac hardware associated with them.
