[email protected] / CPE PrgEnv-intel/8.6.0 stack #2847

rickgrubin-noaa · 2025-08-01T20:36:38Z

Commit Queue Requirements:

Fill out all sections of this template.
All sub component pull requests have been reviewed by their code managers.
Run the full Intel+GNU RT suite (compared to current baselines) on either Hera/Derecho/Hercules
Commit 'test_changes.list' from previous step

Description:

This PR updates the gaeac6 Intel modulefiles for spack-stack [email protected] / CPE PrgEnv-intel/8.6.0

Commit Message:

* UFSWM - update gaeac6 Intel modulefiles for [email protected] / CPE PrgEnv-intel/8.6.0 stack

Priority:

Normal

Git Tracking

UFSWM:

Closes Update gaea-c6 Intel modulefiles to support [email protected] [email protected] / CPE PrgEnv-intel/8.6.0 #2846

Sub component Pull Requests:

None

UFSWM Blocking Dependencies:

None

Documentation:

No documentation update is required for this PR (please explain).

No documentation change is necessary, as the documentation does not specifically reference making changes to host-specific modulefiles, only how to load them, e.g. 3.5.1. Loading the Required Modules

Changes

Regression Test Changes (Please commit test_changes.list):

No Baseline Changes.

See attached file RegressionTests_weekly_gaeac6.log generated via ./rt.sh -a epic -r -w

Note that file test_changes.list was length=0 for ./rt.sh -a epic -r -c and ./rt.sh -a epic -r -w

./rt.sh -a epic -r -c followed by ./rt.sh -a epic -r -m generated 100% successful comparisons.

RegressionTests_weekly_gaeac6.log

Input data Changes:

None.

Library Changes/Upgrades:

Required
- Git Stack PR: Update gaea-c6 config for new CPE PrgEnv-intel/8.6.0 and [email protected] compilers #1713

Testing Log:

gspetro-NOAA · 2025-09-18T00:36:24Z

@rickgrubin-noaa Is this PR ready for review, or is there further work to do?

rickgrubin-noaa · 2025-09-26T15:37:36Z

It's been ready since August 1 (initial filing).

Branch is synced with HEAD of develop.

ulmononian · 2025-10-03T20:09:54Z

what is the timeline to merge this? the upgrade from the intel classic to oneapi stack resolves issues for high-resolution tests (@JessicaMeixner-NOAA), among some other issues on c6.

JessicaMeixner-NOAA · 2025-10-03T20:12:43Z

I was actually able to run with the old version of spack-stack after changing an environment variable.

gspetro-NOAA · 2025-10-06T01:55:52Z

@ulmononian I was going to test on Ursa, but then it looked like @rickgrubin-noaa had added several more commits, and the control_c48 I ran failed, likely because he was in the midst of updating. If the PR is complete, then I will get back to testing it, and we can move it to "Schedule" if everything passes. We have four PRs lined up already for this week, so it would go in on Friday at the earliest unless it can be combined with another PR. Let me know your thoughts on that.

rickgrubin-noaa · 2025-10-06T13:46:50Z

@gspetro-NOAA the changes are strictly for gaea-c6 -- should be zero impact for anything on ursa.

gspetro-NOAA · 2025-10-06T13:50:13Z

@rickgrubin-noaa Sorry, that's what I meant. I did try to test on Gaea C6, and the initial test I tried failed, but you were suddenly pushing a bunch of changes. Are you done now?

rickgrubin-noaa · 2025-10-06T14:28:59Z

Yes; done and successfully tested last week.

ulmononian · 2025-10-06T22:17:54Z

would be great to merge friday or shortly after. it's fine to combine this with another PR if that helps expedite the process and reduce resource usage. thank you!

DusanJovic-NOAA · 2025-10-07T14:39:20Z

The default compiler on Gaea C6 is now intel/2025.2, which is supposed to fix the bug causing MOM6 to fail to compile with ifx. Would it be possible to recompile the spack-stack with this compiler so that we can finally start testing the model using both Fortran and C/C++ LLVM based compilers?

rickgrubin-noaa · 2025-10-07T15:43:06Z

@DusanJovic-NOAA there are some known bugs in [email protected] that will be fixed in the next release; however, creating host-specific stack configurations for [email protected] is underway.

gspetro-NOAA · 2025-10-07T23:54:06Z

@rickgrubin-noaa When I run the control_c48 test, I get a failure. From what I'm seeing, all your testing ran with the -c flag, which creates new baselines, so wouldn't this be a baseline changing PR? Also, what was the reason for running with -w? Just resource conservation?

If this is a baseline changing PR, we normally need you to run the full RT suite (./rt.sh -a epic -e) and push the test_changes.list file and the log for the system you ran on. If you expect the PR to change baselines for every test, then let us know that, and I'll confer with the other CMs to see if you should do the full test or if we'll just regenerate the baselines. The main issue there is just assessing whether the particular baseline changes are reasonable.

ulmononian · 2025-10-08T15:45:01Z

@gspetro-NOAA did control_c48 fail in the baseline comparison step or elsewhere? this is really only a compiler/lib change, but baselines could be altered. we can run the full suite without -c and share the logs if that will help.

RatkoVasic-NOAA · 2025-10-08T15:47:06Z

@gspetro-NOAA

- Yes, this is a baseline-changing PR (different compiler, so different results are expected).
- Please use test_changes.list as a whole (replace every baseline).
- I ran ./rt.sh -c followed by ./rt.sh -m, and it passed ALL regression tests. I believe the assigned CM will do the same.
- You can disregard the -w option; it is used when you don't want to compare results. The -m option did the comparison.

gspetro-NOAA · 2025-10-08T16:26:49Z

@ulmononian Yes, it failed in the comparison stage, which is expected for a compiler change, as @RatkoVasic-NOAA said. However, there was some confusion because this is listed as a non-baseline changing PR. It doesn't matter the reason the baselines change; if they change for any reason, it's a baseline changing PR.

On the CM side, Ratko's right that before merging, we would run with the -c command to regenerate baselines. Then there are a few other steps we take. However, this only occurs after the developer has run the full RT suite (./rt.sh -a <account> usually w/-e or -r options, too) on a relevant RDHPCS (usually Ursa, but here, Gaea) and pushed the log and the test_changes.list file.

- test_changes.list gives us a record of which baselines were changed in the PR, but it will only contain the full list of changed tests if the full RT suite is run.
- Pushing the failed log lets us see what the results of the developer's testing were (and the commands used). For example, here, we would expect failures in the log, but we would only expect comparison failures, not failures for other reasons, and it is important to verify that.

In short, what we need is for @rickgrubin-noaa to run the full RT suite without -c and push the resulting test_changes.list file and RegressionTest_gaea.log file. Then we can schedule it for the commit queue.
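As a rough illustration of why the pushed test_changes.list matters, the sketch below reads the file's contents and reports whether the PR should be treated as baseline-changing. It assumes the file holds one changed test per non-blank line; that layout, the function names, and the sample test entries are assumptions for illustration, not something confirmed in this thread.

```python
from pathlib import Path


def summarize_changes(test_changes_text):
    """Summarize the contents of a test_changes.list file.

    Assumption: each non-blank line names one test whose baseline changed.
    """
    changed = [ln.strip() for ln in test_changes_text.splitlines() if ln.strip()]
    return {
        "changed_tests": changed,
        # An empty (length=0) file means no baselines changed.
        "baseline_changing": bool(changed),
    }


def summarize_changes_file(path="test_changes.list"):
    """Thin wrapper that reads the file from disk (hypothetical helper)."""
    return summarize_changes(Path(path).read_text())


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    sample = "control_c48 intel\ncpld_control_gfsv17 intel\n"
    print(summarize_changes(sample))
```

An empty file, as reported in the original PR description, would come back with `baseline_changing` set to `False`, which is why a full-suite run without -c is needed before the list can be trusted as a complete record.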

gspetro-NOAA · 2025-10-13T16:54:36Z

I ran the RTs on Gaea C6, and the tests that fail are expected failures. Failures are either:

- UNABLE TO COMPLETE COMPARISON, which is expected with compiler/baseline changes
- UNABLE TO START TEST, which is expected for restart tests when the control failed due to a comparison error.

Note that cpld_control_gfsv17_iau_intel failed to start, but it is a control test that depends on another control (cpld_control_gfsv17_intel), so this is also expected, even though it is not called a restart test.
Given that the failures are expected, we should be able to proceed with this PR.
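The check described above (every failure falls into one of the two expected categories) could be scripted along these lines. This is only a sketch: it assumes failure lines in the log contain the word FAILED together with one of the two messages quoted above, and the sample log lines are hypothetical rather than copied from a real RegressionTests log.

```python
from collections import Counter

# The two failure reasons treated as expected in this PR.
EXPECTED_REASONS = (
    "UNABLE TO COMPLETE COMPARISON",
    "UNABLE TO START TEST",
)


def classify_failures(log_text):
    """Count expected failure reasons and collect any other failure lines.

    Assumption: every failure line contains the substring "FAILED".
    """
    counts = Counter()
    unexpected = []
    for line in log_text.splitlines():
        if "FAILED" not in line:
            continue
        for reason in EXPECTED_REASONS:
            if reason in line:
                counts[reason] += 1
                break
        else:
            # A failure with neither expected reason needs a closer look.
            unexpected.append(line.strip())
    return counts, unexpected


if __name__ == "__main__":
    # Hypothetical log excerpt, for illustration only.
    sample = "\n".join([
        "PASS -- TEST 'some_other_test_intel'",
        "FAILED: UNABLE TO COMPLETE COMPARISON -- TEST 'cpld_control_gfsv17_intel'",
        "FAILED: UNABLE TO START TEST -- TEST 'cpld_control_gfsv17_iau_intel'",
    ])
    counts, unexpected = classify_failures(sample)
    print(dict(counts), unexpected)
```

If `unexpected` comes back non-empty, the log contains a failure that is not explained by the compiler change and should be examined before scheduling.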

grantfirl · 2025-10-14T20:29:37Z

This has been combined into #2882

rickgrubin-noaa and others added 2 commits July 31, 2025 16:14

[email protected] / CPE PrgEnv-intel/8.6.0 stack

bc90377

Merge branch 'ufs-community:develop' into gaeac6-oneapi

bfce9c7

MichaelLueken mentioned this pull request Aug 4, 2025

ESMF 8.8.0 issues with spack-stack 1.9.2 on all machines when QUILTING = false JCSDA/spack-stack#1730

Closed

Merge branch 'ufs-community:develop' into gaeac6-oneapi

c0247f2

jkbk2004 mentioned this pull request Aug 6, 2025

Add two-way ocean-wave coupling feature to the HAFS applications #2584

Merged

14 tasks

rickgrubin-noaa added 5 commits August 11, 2025 14:50

Merge branch 'ufs-community:develop' into gaeac6-oneapi

c967705

Merge branch 'ufs-community:develop' into gaeac6-oneapi

60b5a30

Merge branch 'ufs-community:develop' into gaeac6-oneapi

e330840

Merge branch 'ufs-community:develop' into gaeac6-oneapi

1a839c4

Merge branch 'ufs-community:develop' into gaeac6-oneapi

28db3e2

gspetro-NOAA added this to PRs to Process Sep 18, 2025

gspetro-NOAA moved this to Evaluating in PRs to Process Sep 18, 2025

gspetro-NOAA added the No Baseline Change label Sep 18, 2025

Merge branch 'ufs-community:develop' into gaeac6-oneapi

ecba893

gspetro-NOAA moved this from Evaluating to Review in PRs to Process Sep 29, 2025

rickgrubin-noaa added 10 commits September 30, 2025 09:38

Merge branch 'ufs-community:develop' into gaeac6-oneapi

c61b9a0

Update ufs_gaeac6.intel.lua

3c49f43

Fix typo in MODULEPATH

Update ufs_gaeac6.intel.lua

34a5cbb

Fix stack compiler type to load

Update ufs_gaeac6.intel.lua

bde80ab

Require libfabric/1.20.1

Update ufs_gaeac6.intelllvm.lua

06696c9

Fix stack name, force libfabric/1.20.1

Update ufs_gaeac6.intelllvm.lua

7b021b6

Update rt.sh

333c416

Fixes for gaeac6 OS upgrade

Update compile.sh

939807e

Remove module reset for gaeac6

Update run_test.sh

04ca92f

Updates for new OS

Update rt.sh

ac7be13

RatkoVasic-NOAA mentioned this pull request Oct 1, 2025

[INSTALL]: Reinstall spack-stack 1.9.1, 1.9.2 and 1.9.3 on C6 JCSDA/spack-stack#1779

Closed

Merge branch 'develop' into gaeac6-oneapi

da73809

gspetro-NOAA added the Baseline Updates label (current baselines will be updated) and removed the No Baseline Change label Oct 8, 2025

add gaea c6 logs and test_changes.list

bb44b58

gspetro-NOAA moved this from Review to Schedule in PRs to Process Oct 13, 2025

gspetro-NOAA mentioned this pull request Oct 14, 2025

Sync from NCAR/main + Thompson params + [email protected] / CPE PrgEnv-intel/8.6.0 stack #2882

Open

14 tasks
