
WeeklyTelcon_20170103

Geoffrey Paulsen edited this page Jan 9, 2018 · 1 revision

Open MPI Weekly Telcon


  • Dialup Info: (Do not post to public mailing list or public wiki)

Attendees

  • David Bernholdt (Oak Ridge)
  • Jeff Squyres (Cisco)
  • Artem Polyakov (Mellanox)
  • Edgar Gabriel
  • Geoffroy Vallee
  • George (UTK)
  • Howard
  • Ralph
  • Sylvain Jeaugey (Nvidia)
  • Todd Kordenbrock
  • Geoff Paulsen

Agenda

  • Jeff created a non-blocker issue/PR for a missing include; it should also be PR'd to v1.10.x.

PMIx update

  • PMIx 1.2.0 was released
  • PMIx 1.2.1
    • Found an unexpected issue that Boris and Artem fixed.
    • Some memory footprint issues to fix (really bad on KNL or other high-PPN nodes)
      • Need to have Nathan confirm that master branch works, then Ralph can backport memory fix to PMIx 1.2.x.
    • Ralph can look at other simple things to improve memory footprint.
  • PMIx 2.0 doesn't address memory footprint.
  • PMIx master now has new, better integration with debuggers in the PMIx reference server. It will be MPIR v2 based, but that is just a renaming of PMIx calls.
  • Open MPI v2.1.x needs PMIx 1.2.x; we don't want to go to PMIx 2.0 there.

  • MTT problems
    • Server “stalled” over the holiday
      • There is a script that runs every year to generate a new table.
      • Though Josh did that a few weeks ago.
    • No daily summaries are being delivered (Brian will look at it)

MTT Dev status:


Exceptional topics

  • Complete the SPI on-boarding process
    • move funds, domains, etc.
  • Jeff and Ralph will discuss and make it happen.

Status Updates:

  • Cisco -
  • ORNL - David Bernholdt - manages folks who have contributed to OMPI in the past.
    • New project - a new exascale project.
      • Would be happy to show a few slides on the whole proposal in an upcoming telecon or face-to-face.
    • Wants to get more connected with the OMPI team. Most folks have been off for the last few weeks.
  • UTK - focus on thread safety and GPU support.
  • Nvidia - Made some heavy modifications to their MTT cluster before Christmas. It seems a bit slower, and they are not sure why. Need to look at that.

Status Update Rotation

  1. Cisco, ORNL, UTK, NVIDIA
  2. Mellanox, Sandia, Intel
  3. LANL, Houston, IBM, Fujitsu

Back to 2017 WeeklyTelcon-2017
