-
Following up on Cenlin's question: I am new to ufs-weather-model, but I spent some time this weekend trying to help him, looking around to start learning the configuration and workflow. The only place I was able to find node-related machine configuration was in the CICE module. Is there another place in the workflow where there is machine-specific config? Alternatively, are there any hints for where the layout is specified? I know that for the SRW app, the layout is part of the grid specification. Would we need to change something like that (create a custom grid for Derecho) in order to get closer to 128 tasks per node when running on Derecho? Just throwing out a couple of ideas here to get the discussion going. Thank you!
-
Perhaps @jkbk2004 or @natalie-perlin can weigh in here. I know that at the time Derecho was added to the supported platforms, issue #2033 was created, but I don't know its status.
-
@cenlinhe #2033 comes at this from a slightly different angle; it's about the runtime -depth THRD option on Derecho. BTW, the layout is set in input.nml for the FV3 domain decomposition, and component PE resources are specified in ufs.configure as well.
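For anyone looking for those two files, here is a minimal sketch of the relevant pieces (the values are illustrative, not the ones used by any particular test):

```
! input.nml -- FV3 domain decomposition (illustrative values)
&fv_core_nml
  layout    = 3, 8     ! per-tile decomposition: 3 x 8 over 6 tiles = 144 compute tasks
  io_layout = 1, 1
/
```

```
# ufs.configure -- per-component PE assignments (illustrative values)
ATM_petlist_bounds: 0 149
OCN_petlist_bounds: 150 169
ICE_petlist_bounds: 170 199
```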
-
@cenlinhe FYI: https://ufs-weather-model.readthedocs.io/en/latest/FAQ.html#fv3atm
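The arithmetic behind that FAQ entry, with illustrative numbers (not necessarily the cpld_control_p8 defaults):

```
# FV3atm task count = layout_x * layout_y * 6 tiles (+ write-component tasks, if quilting is on)
#   layout = 3, 8              ->  3 * 8 * 6 = 144 compute tasks
#   write_groups = 1
#   write_tasks_per_group = 6  ->  144 + 1 * 6 = 150 atmosphere tasks total
```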
-
If I understand @cenlinhe's initial question, this has nothing to do w/ FV3 layouts or how we specify FV3 atm resources; it has to do with efficient use of core-hours on Derecho. Take the same case (cpld_control_p8) on Gaea. Gaea also has 128 tasks/node available. We use the same ufs.configure (requiring a total of 200 tasks), and we also specify that ESMF is managing the threading. Checking the PET logs for PET201 and above shows that nothing is happening on those tasks (you'll see the config reading and then finalizing, and that's all). We request 2 "full" nodes (256 tasks) and use 200 of them (to allow for the ESMF-managed threading). I'm not sure, but I believe in this case we're "charged" for the use of both nodes. In the case of Derecho, if I understand correctly, we're requesting 3 nodes but only using 96 on each. The concern is that this results in core-hours being wasted.
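Putting rough numbers on that concern, using the figures quoted in this thread (and assuming node-exclusive charging on both machines):

```
# Gaea:    2 full nodes requested  ->  2 * 128 = 256 cores charged, 200 MPI ranks active
# Derecho: select=3:ncpus=96       ->  3 * 96  = 288 cores requested, 200 MPI ranks active,
#          but charging counts whole nodes: 3 * 128 = 384 core-hours per wallclock hour
```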
-
@zach1221 @FernandoAndrade-NOAA we need an experiment to reset TPN on Derecho: https://github.com/ufs-community/ufs-weather-model/blob/develop/tests/tests/cpld_control_p8#L86
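For context, a hypothetical sketch of the kind of change being suggested; the actual variable names in the test file and the place where the per-machine default lives may differ:

```
# tests/tests/cpld_control_p8 (sketch only)
export TPN=$TPN_cpl_dflt     # tasks per node, resolved from a per-machine default

# the per-machine default for Derecho would then move from something like
#   TPN_cpl_dflt=96
# to
#   TPN_cpl_dflt=128
```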
-
I am testing the default UFS global coupled run (cpld_control_p8) on the NCAR Derecho HPC. By default, in the run directory, the UFS Derecho run script sets the following:
"#PBS -l select=3:ncpus=96:mpiprocs=96:ompthreads=1"
and
"mpiexec -n 200 -ppn 67 --hostfile $PBS_NODEFILE ./fv3.exe"
Note that on Derecho, each node has 128 CPUs, and even if users do not request all 128 CPUs on a node, the core-hours charged will still count 128 CPUs for each node. In this case, it will be 3 * 128 instead of 3 * 96.
It seems that
So my questions are:
Any help and advice would be really appreciated!
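One direction that mirrors the Gaea setup described earlier in this thread, shown purely as an untested illustration (the exact mpiprocs/ppn split needed for ESMF-managed threading may differ):

```
# Hypothetical alternative: 2 full Derecho nodes (256 cores, all charged anyway),
# 200 MPI ranks, spare cores left for ESMF-managed threading
#PBS -l select=2:ncpus=128:mpiprocs=100:ompthreads=1
mpiexec -n 200 -ppn 100 --hostfile $PBS_NODEFILE ./fv3.exe
```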