Tests fail only when user runs a suite (chrysalis-intel) #7956

@rljacob

When I run e3sm_integration on my own, I get test failures that Jenkins does not see and that are not reproducible when a test is run by itself.

Reproducing:

  1. grab an interactive node with plenty of time: srun -N 1 -t 10:00:00 --pty bash
  2. From that interactive node, run the e3sm_integration test suite using the testing PE layouts (just like Jenkins):

cd cime/scripts; ./create_test --test-id mctintmaster -c --pesfile ../../cime_config/testmods_dirs/config_pes_tests.xml e3sm_integration -b master --output-root /lcrc/group/e3sm/jacob/scratch/chrys/mctintmaster

I ran this with master ea18a13, which had an all-green result on the Jenkins dashboard (except for the one moab test). My run, however, had 33 failed tests. All were run failures, and the tracebacks came from different, unrelated places in the code.

I tried re-running one by simply going into the test dir and running "./case.submit". Then it passed!
For another, I rebuilt the test using ./create_test with just that one test as the argument. That also passed.

Test results are in /lcrc/group/e3sm/jacob/scratch/chrys/mctintmaster

Output of cs.status: /lcrc/group/e3sm/jacob/scratch/chrys/mctintmaster/mctintmaster.output
Just the failed tests with some notes: /lcrc/group/e3sm/jacob/scratch/chrys/mctintmaster/mctintmaster.fails

Since simply rerunning gave a different result, I suspect something differs in my runtime environment when running a suite vs. running a single test.

My .bashrc has

ulimit -u unlimited 2>/dev/null || true
ulimit -c unlimited 2>/dev/null || true

Some of the failed tests have core files.
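
One way to check whether the runtime really differs between the suite run and a single-test rerun is to snapshot the resource limits and environment in each case and diff the results. The sketch below is only an illustration: `snapshot_runtime` and the `/tmp/runtime.*.txt` file names are hypothetical and not part of CIME or E3SM, and it captures only what the calling shell sees, which may not match what the model processes see.

```shell
# Hypothetical helper (not part of CIME/E3SM): write the current shell's
# resource limits and sorted environment to the file given as $1.
snapshot_runtime() {
    {
        echo "## ulimit -a"
        ulimit -a
        echo "## environment"
        # Sorting makes two snapshots easier to diff; multi-line values
        # (e.g. exported shell functions) may be split across the output.
        env | sort
    } > "$1"
}

# Example use, once in the shell that launches the suite and once in the
# shell used for the single-test rerun:
#   snapshot_runtime /tmp/runtime.suite.txt
#   snapshot_runtime /tmp/runtime.single.txt
#   diff /tmp/runtime.suite.txt /tmp/runtime.single.txt
```

Any line that differs between the two snapshots (for example the "max user processes" or "core file size" limits set in .bashrc) would be a candidate explanation for the suite-only failures.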
