
Unexpected Error with Increasing MPI ranks related to search alg. #589

@nkzk-stan


I am seeing unexpected behavior from Nalu. In short, I am rotating a square with a sliding mesh interface at a reasonably low omega (1.5) relative to other cases I have run for a circle and an ellipse.

When I deploy on 1 node with 16 CPUs, the case runs without a problem.

When I deploy on 1 node with 32 CPUs, it fails with error 159:
_Throw number = 159

Throw test that evaluated to true: !std::isfinite(Teuchos::ScalarTraits::magnitude(omega))

Prolongator damping factor needs to be finite.
MueLu::Exceptions::RuntimeError'
what(): /shared/nalu/build/packages/Trilinos/packages/muelu/src/Transfers/Smoothed-Aggregation/MueLu_SaPFactory_def.hpp:228:_

I resubmitted the job with 32 CPUs, and it failed with error 160:
_Throw number = 160

Throw test that evaluated to true: true

Belos::StatusTestImpResNorm::checkStatus(): One or more of the current implicit residual norms is NaN.
Belos::StatusTestError'
what(): /shared/nalu/build/packages/Trilinos/packages/belos/src/BelosStatusTestImpResNorm.hpp:635:_

Both errors were corrected by changing the following search settings to:
search_tolerance: 0.05
activate_dynamic_search_algorithm: no
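
For readers unsure where these two keys go, here is a minimal sketch of how they might sit in the input file. The placement under `non_conformal_user_data` inside a `non_conformal_boundary_condition` block is an assumption based on typical Nalu inputs, not something confirmed by the attached file. The name `bc_interface` is hypothetical, and the surface names are taken from the `DgInfo` line in the log output further down.

```yaml
# Sketch only: the block layout and the name bc_interface are assumptions,
# not copied from the attached input file.
boundary_conditions:
  - non_conformal_boundary_condition: bc_interface   # hypothetical name
    current_target_name: surface_5      # taken from the DgInfo name in the log
    opposing_target_name: surface_55    # taken from the DgInfo name in the log
    non_conformal_user_data:
      search_tolerance: 0.05                   # value reported as the fix
      activate_dynamic_search_algorithm: no    # value reported as the fix
```

If the interface is defined by more than one non-conformal boundary condition, the same two settings would presumably need to be applied to each of them.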

I have attached the input file (I had to rename it with a .pdf extension so it could be attached, so just remove the .pdf to access it).

In addition, for another simulation of a square at a slower omega (0.707), Nalu freezes at the same timestep. This occurred on both 16 and 32 CPUs. That case uses the same input file as the one above, with only the omega and timestep changed. This issue was also corrected by the fix above. This is the last output before Nalu stalls:


Time Step Count: 1075 Current Time: 15.5672
dtN: 0.0149393 dtNm1: 0.0149967 gammas: 1.49904 -1.99617 0.497129
Volume 796 min: 0.000463178 max: 0.00877652
NonConformal alg will ghost a new number of entities: 14 and remove 84 entities from ghosting.
DgInfo size overview for name: Current_surface_5__Opposing_surface_55

dgSquare_R1.i.pdf
