-
Notifications
You must be signed in to change notification settings - Fork 497
Added Intel Gaudi GDR support for UCX #11017
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
WalkthroughAdds HabanaLabs Gaudi support: autoconf checks and build integration, new GAUDI memory type, Gaudi topology provider, Gaudi base/SCAL utilities, a Gaudi GDR MD and iface, pkg-config template and Debian install entries, plus IB MD extensions for Gaudi dmabuf detection. Changes
Sequence Diagram(s)sequenceDiagram
participant App as Application
participant UCX as UCX Core
participant GaudiMD as Gaudi GDR MD
participant GaudiBase as Gaudi Base / SCAL
participant Topo as Gaudi Topology
participant Sysfs as /sys & /dev
App->>UCX: init
UCX->>GaudiMD: query_md_resources()
GaudiMD->>GaudiBase: uct_gaudi_base_query_devices()
GaudiBase->>Sysfs: open /dev/accel*, hlthunk, read PCI BDF
Sysfs-->>GaudiBase: fd, PCI info
GaudiBase-->>GaudiMD: device resource
App->>UCX: open md
UCX->>GaudiMD: uct_gaudi_md_open()
GaudiMD->>GaudiBase: get_fd / get_info -> dmabuf fd, base, size
GaudiBase-->>GaudiMD: dmabuf_fd, addresses
GaudiMD-->>UCX: md handle
UCX->>Topo: ucs_gaudi_find_best_connection(name)
Topo->>Sysfs: enumerate accel/hnic, build matrix (lazy)
Topo-->>UCX: best HNIC + port / UCS_ERR_NO_ELEM
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
🧹 Nitpick comments (7)
src/uct/ib/base/ib_md.c (1)
1344-1348: Gaudi GPUDirect detection looks correct; comment wording is slightly misleadingThe new
#ifdef HAVE_GAUDIblock reusesuct_ib_check_gpudirect_driverconsistently with the existing CUDA/ROCm checks and safely gatesUCS_MEMORY_TYPE_GAUDIregistration on the presence of Gaudi device nodes. This integrates cleanly into the later GPUDirect enablement check.Minor nit: the comment says “Gaudi DMABuf support”, but this block is only checking for Gaudi device/driver presence; actual dmabuf capability is still probed by
uct_ib_md_check_dmabuf(md)below. You might want to rephrase the comment to avoid confusion, e.g.:- /* Check for HabanaLabs Gaudi DMABuf support */ + /* Check for HabanaLabs Gaudi driver (to allow Gaudi GDR via dmabuf) */src/uct/gaudi/ucx-gaudi.pc.in (1)
7-11: PopulateLibs/Libs.privateto make the pkg-config entry usefulRight now
Libs:andLibs.private:are empty, sopkg-config --libs @PACKAGE@-gaudiwill not return any link flags. It would be better to mirror the patterns used by other UCX TL.pc.infiles (e.g., add the appropriate-L/-lflags and/orRequires:), so consumers can link Gaudi support just by depending on this pkg-config module.Please cross-check with the existing UCX TL
.pc.intemplates in this repo to ensure the Gaudi one follows the same conventions.src/uct/gaudi/gaudi_gdr/gaudi_gdr_md.h (1)
6-27: Consider tightening include guard name and normalizing field namingFunctionally this header looks fine, but there are a couple of style nits:
- The include guard
GAUDI_MD_His quite generic; most UCX headers use a more specific name (e.g.,UCT_GAUDI_MD_H) to avoid collisions.- The
uct_gaudi_md_tmembers mostly use snake_case, buttotalSizestands out in camelCase; renaming tototal_sizewould better match the surrounding code.These are small cleanups and not blocking, but aligning with existing UCX style will help long-term maintainability.
If you decide to rename
totalSize, please update all its uses in the corresponding.cfiles and any other references.src/ucs/sys/topo/gaudi/topo.h (1)
7-64: API surface looks good; add<stdint.h>for robustnessThe Gaudi topology API (init/cleanup, module-id lookup, and best-connection selection) is well documented and matches the implementation behavior in
topo.c, including error reporting and lazy initialization semantics.One minor robustness improvement: this header uses
uint32_tbut does not include<stdint.h>directly, instead relying on transitive includes from other headers. Adding an explicit include would avoid surprises if those headers ever change:#include <stdint.h> #include <ucs/sys/topo/base/topo.h> #include <ucs/type/status.h>Please confirm this change compiles cleanly across all supported platforms/toolchains, in case any rely on specific include ordering.
src/uct/gaudi/base/gaudi_base.c (2)
23-31: ClarifysynDeviceGetInfo(-1, ...)vsdevice_idsemantics
uct_gaudi_base_get_fd()ignoresdevice_idwhensynDeviceGetInfo(-1, &deviceInfo)succeeds and always returnsdeviceInfo.fd. If multiple Gaudi devices are present, that may not correspond to the requesteddevice_idunless the API guarantees that-1already reflects the intended device anddevice_idis only for the hlthunk fallback. Consider either passingdevice_idtosynDeviceGetInfo(if supported) or documenting the assumption thatdevice_idis used only on the hlthunk path.
51-95: Add a defensive check before subtractinghbm_pool_start - addrIn
uct_gaudi_base_get_info(),offsetis computed ashbm_pool_start - addrwith both operandsuint64_t. If SCAL ever reportsdevice_base_address < device_base_allocated_address, this will underflow and produce a hugeoffsetpassed intohlthunk_device_mapped_memory_export_dmabuf_fd(). It would be safer to assert or explicitly check the ordering and fail early, e.g. returning an error ifhbm_pool_start < addr, rather than relying on implicit SCAL guarantees.src/uct/gaudi/gaudi_gdr/gaudi_gdr_iface.c (1)
16-27: Double-checkmax_num_eps = 0and near-zero bandwidth values
uct_gaudi_gdr_iface_query()overrides the basemax_num_epswith0and sets an extremely small non-zero dedicated bandwidth. If0is not explicitly treated as “unlimited” in UCT, this could be interpreted as “no endpoints supported”. If the TL is only a control/auxiliary iface, documenting this choice or simply preservingiface->super.config.max_num_epswould make the behavior clearer.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (20)
config/m4/gaudi.m4(1 hunks)debian/ucx-gaudi.install(1 hunks)src/ucs/Makefile.am(2 hunks)src/ucs/memory/memory_type.c(2 hunks)src/ucs/memory/memory_type.h(1 hunks)src/ucs/sys/topo/gaudi/topo.c(1 hunks)src/ucs/sys/topo/gaudi/topo.h(1 hunks)src/uct/Makefile.am(1 hunks)src/uct/configure.m4(1 hunks)src/uct/gaudi/Makefile.am(1 hunks)src/uct/gaudi/base/gaudi_base.c(1 hunks)src/uct/gaudi/base/gaudi_base.h(1 hunks)src/uct/gaudi/base/scal.h(1 hunks)src/uct/gaudi/configure.m4(1 hunks)src/uct/gaudi/gaudi_gdr/gaudi_gdr_iface.c(1 hunks)src/uct/gaudi/gaudi_gdr/gaudi_gdr_iface.h(1 hunks)src/uct/gaudi/gaudi_gdr/gaudi_gdr_md.c(1 hunks)src/uct/gaudi/gaudi_gdr/gaudi_gdr_md.h(1 hunks)src/uct/gaudi/ucx-gaudi.pc.in(1 hunks)src/uct/ib/base/ib_md.c(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (6)
src/uct/gaudi/base/gaudi_base.h (1)
src/uct/gaudi/base/gaudi_base.c (4)
uct_gaudi_base_get_fd(23-31)uct_gaudi_base_get_sysdev(33-49)uct_gaudi_base_get_info(51-95)uct_gaudi_base_query_devices(97-112)
src/uct/gaudi/gaudi_gdr/gaudi_gdr_iface.c (3)
src/uct/base/uct_iface.c (4)
uct_base_iface_query(509-515)uct_base_iface_is_reachable(343-355)uct_iface_base_query_v2(573-581)uct_base_iface_t(721-724)src/ucs/sys/stubs.c (5)
ucs_empty_function(15-17)ucs_empty_function_return_success(49-52)ucs_empty_function_return_unsupported(54-57)ucs_empty_function_return_zero(19-22)ucs_empty_function_return_zero_int(29-32)src/uct/gaudi/base/gaudi_base.c (1)
uct_gaudi_base_query_devices(97-112)
src/ucs/sys/topo/gaudi/topo.h (1)
src/ucs/sys/topo/gaudi/topo.c (4)
ucs_gaudi_topo_init(1716-1753)ucs_gaudi_topo_cleanup(1852-1899)ucs_gaudi_get_index_from_module_id(304-342)ucs_gaudi_find_best_connection(1513-1564)
src/uct/gaudi/base/gaudi_base.c (2)
src/ucs/sys/topo/base/topo.c (1)
ucs_topo_find_device_by_bdf_name(737-759)src/uct/base/uct_iface.c (1)
uct_single_device_resource(550-571)
src/uct/gaudi/gaudi_gdr/gaudi_gdr_md.c (5)
src/uct/base/uct_md.c (3)
uct_md_base_md_query(505-523)uct_md_query_single_md_resource(144-162)uct_md_stub_rkey_unpack(173-181)src/ucs/debug/memtrack.c (2)
ucs_free(368-372)ucs_malloc(328-334)src/ucs/memory/memtype_cache.c (1)
ucs_memtype_cache_update(266-277)src/ucs/sys/stubs.c (3)
ucs_empty_function_return_unsupported(54-57)ucs_empty_function_return_success(49-52)ucs_empty_function(15-17)src/uct/gaudi/base/gaudi_base.c (3)
uct_gaudi_base_get_fd(23-31)uct_gaudi_base_get_info(51-95)uct_gaudi_base_get_sysdev(33-49)
src/ucs/sys/topo/gaudi/topo.c (6)
src/ucs/sys/topo/base/topo.c (6)
ucs_topo_get_device_bus_id(355-364)ucs_topo_num_devices(211-220)ucs_topo_sys_device_get_name(784-801)ucs_topo_find_device_by_bdf_name(737-759)ucs_topo_sys_device_set_name(761-782)ucs_topo_sys_device_get_numa_node(803-820)src/ucs/sys/string.c (5)
ucs_strncpy_safe(235-249)ucs_snprintf_safe(226-233)ucs_string_alloc_path_buffer(454-465)ucs_path_get_common_parent(373-381)ucs_path_calc_distance(383-389)src/ucs/debug/memtrack.c (3)
ucs_free(368-372)ucs_calloc(336-342)ucs_malloc(328-334)src/ucs/sys/sys.c (3)
ucs_sys_read_sysfs_file(1634-1654)ucs_make_affinity_str(1312-1344)ucs_sys_pthread_getaffinity(1376-1384)src/ucs/memory/numa.c (4)
ucs_numa_num_configured_cpus(110-122)ucs_numa_node_of_cpu(124-152)ucs_numa_num_configured_nodes(95-108)ucs_numa_distance(232-254)src/ucs/type/spinlock.h (3)
ucs_spin_lock(81-84)ucs_spin_unlock(127-130)ucs_spinlock_init(43-59)
🪛 Clang (14.0.6)
src/uct/gaudi/base/scal.h
[error] 4-4: 'stdint.h' file not found
(clang-diagnostic-error)
src/uct/gaudi/base/gaudi_base.h
[error] 9-9: 'uct/base/uct_iface.h' file not found
(clang-diagnostic-error)
src/uct/gaudi/gaudi_gdr/gaudi_gdr_iface.h
[error] 9-9: 'uct/base/uct_iface.h' file not found
(clang-diagnostic-error)
src/ucs/sys/topo/gaudi/topo.h
[error] 10-10: 'ucs/sys/topo/base/topo.h' file not found
(clang-diagnostic-error)
src/uct/gaudi/gaudi_gdr/gaudi_gdr_md.c
[error] 10-10: 'ucs/memory/memtype_cache.h' file not found
(clang-diagnostic-error)
src/uct/gaudi/gaudi_gdr/gaudi_gdr_md.h
[error] 9-9: 'uct/base/uct_md.h' file not found
(clang-diagnostic-error)
🔇 Additional comments (12)
debian/ucx-gaudi.install (1)
1-2: Install entries look correct; please confirm paths match built GAUDI artifactsThe two entries align with the new GAUDI perf-test and UCT libraries; just double‑check that the actual
.sonames and install directories produced by the build (including any multi‑arch directory like...-linux-gnu) match these paths and the rest of the Debian packaging for UCX.src/ucs/Makefile.am (2)
93-96: GAUDI topo header conditional looks good; verify path and macro wiringConditionally adding
sys/topo/gaudi/topo.htonobase_dist_libucs_la_HEADERSunderHAVE_GAUDIis consistent with the new topology support and keeps GAUDI headers out when the feature is disabled. Please just verify:
- The header really lives at
src/ucs/sys/topo/gaudi/topo.h, andHAVE_GAUDIis the same Automake conditional defined in your newUCX_CHECK_GAUDIlogic (so this block is never evaluated before the conditional is set).
228-231: GAUDI topo source inclusion is correct and properly guardedAdding
sys/topo/gaudi/topo.ctolibucs_la_SOURCESunderif HAVE_GAUDIcorrectly isolates GAUDI‑specific code from non‑GAUDI builds. The indentation tweak formemcpy_thunderx2.Sis cosmetic and has no functional effect.Also applies to: 235-235
config/m4/gaudi.m4 (1)
58-60: EnsureUCX_CHECK_GAUDIis invoked soHAVE_GAUDIconditional exists for MakefilesYou define
AM_CONDITIONAL([HAVE_GAUDI], [test "x$gaudi_happy" != xno])here and then useif HAVE_GAUDIin multiple Makefile.ams. Please verify thatUCX_CHECK_GAUDIis actually invoked fromconfigure.ac(or a top‑level UCX config macro) beforeAC_OUTPUT, otherwise Automake will complain that conditionalHAVE_GAUDIis not defined.src/uct/Makefile.am (1)
10-10: Includinggaudiin UCTSUBDIRSis appropriate; confirm gaudi subdir is no‑op when disabledAdding
gaudialongsidecuda/rocm/...ensures the GAUDI transport is built as part of the normal UCT build. Please just confirm thatsrc/uct/gaudi/Makefile.amfully guards its targets withHAVE_GAUDI(or similar), so that when GAUDI is not available the gaudi subdir effectively becomes a no‑op and doesn’t introduce spurious build failures.src/uct/configure.m4 (1)
7-15: GAUDI UCT configure fragment inclusion is consistent with other transportsIncluding
src/uct/gaudi/configure.m4alongside the other transport fragments is the right way to hook GAUDI module configuration into UCT. Please verify that the gaudi fragment:
- Appends
gauditouct_moduleswhen enabled, and- Invokes
UCX_CHECK_GAUDI(or otherwise ensuresHAVE_GAUDI/GAUDI_* flags are set)so that the new transport is both discoverable and correctly conditioned.
src/ucs/memory/memory_type.h (1)
38-51: GAUDI memory type addition looks consistentAdding
UCS_MEMORY_TYPE_GAUDIbeforeUCS_MEMORY_TYPE_LASTmatches the existing enum style and is correctly positioned for use with theucs_memory_type_for_eachmacro.Please ensure all switch statements over
ucs_memory_type_tin the rest of the codebase are updated to handleUCS_MEMORY_TYPE_GAUDIwhere appropriate (e.g., mapping to MDs/UCTs, logging, config parsing).src/ucs/memory/memory_type.c (1)
17-44: Names/descriptions for GAUDI are wired correctlyThe
ucs_memory_type_namesanducs_memory_type_descsentries forUCS_MEMORY_TYPE_GAUDIare consistent with the enum and the existing naming scheme; sentinel entries forUCS_MEMORY_TYPE_LASTremain intact.If there are helper routines that map string names to
ucs_memory_type_t(e.g., config parsing), please confirm they also recognize"gaudi"so configuration and logging remain consistent.src/uct/gaudi/gaudi_gdr/gaudi_gdr_iface.h (1)
6-15: Gaudi GDR iface definition is minimal and consistentThe TL name macro and
uct_gaudi_gdr_iface_twrapper overuct_base_iface_tfollow existing UCT conventions; include guard and header dependency look correct. The reporteduct/base/uct_iface.h file not foundfrom static analysis is almost certainly an include-path artifact, not a code issue.src/uct/gaudi/base/gaudi_base.h (1)
1-20: Gaudi base header API looks consistent with the implementationFunction prototypes and types line up with
gaudi_base.cand existing UCT/UCT topo APIs; nothing blocking here from a header/API perspective.src/uct/gaudi/Makefile.am (1)
6-38: GAUDI module Makefile wiring looks correctConditional build, sources, headers, and linkage flags are wired the same way as other UCT transports; no functional issues spotted here.
src/uct/gaudi/base/scal.h (1)
1-31: SCAL header is self-contained and matches current usageOpaque handle definitions,
scal_memory_pool_infoV2, and function prototypes line up with how they’re used ingaudi_base.c; nothing blocking from this header as written.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (1)
src/uct/gaudi/gaudi_gdr/gaudi_gdr_md.c (1)
35-36: Remove unsupported registration capability advertisement.Lines 35-36 advertise
reg_mem_typesfor HOST and GAUDI memory, but line 143 shows.mem_regreturnsUCS_ERR_UNSUPPORTED. This inconsistency will cause UCP to believe registration is available and fail at runtime.Apply this diff:
- attr->reg_mem_types = UCS_BIT(UCS_MEMORY_TYPE_HOST) | - UCS_BIT(UCS_MEMORY_TYPE_GAUDI); + attr->reg_mem_types = 0;
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/ucs/sys/topo/gaudi/topo.c(1 hunks)src/uct/gaudi/gaudi_gdr/gaudi_gdr_md.c(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
src/ucs/sys/topo/gaudi/topo.c (7)
src/ucs/sys/topo/base/topo.c (6)
ucs_topo_get_device_bus_id(355-364)ucs_topo_num_devices(211-220)ucs_topo_sys_device_get_name(784-801)ucs_topo_find_device_by_bdf_name(737-759)ucs_topo_sys_device_set_name(761-782)ucs_topo_sys_device_get_numa_node(803-820)src/ucs/sys/string.c (5)
ucs_strncpy_safe(235-249)ucs_snprintf_safe(226-233)ucs_string_alloc_path_buffer(454-465)ucs_path_get_common_parent(373-381)ucs_path_calc_distance(383-389)src/ucs/debug/memtrack.c (3)
ucs_free(368-372)ucs_calloc(336-342)ucs_malloc(328-334)src/ucs/sys/sys.c (3)
ucs_sys_read_sysfs_file(1634-1654)ucs_make_affinity_str(1312-1344)ucs_sys_pthread_getaffinity(1376-1384)src/ucs/memory/numa.c (4)
ucs_numa_num_configured_cpus(110-122)ucs_numa_node_of_cpu(124-152)ucs_numa_num_configured_nodes(95-108)ucs_numa_distance(232-254)src/ucs/type/spinlock.h (3)
ucs_spin_lock(81-84)ucs_spin_unlock(127-130)ucs_spinlock_init(43-59)src/ucs/datastruct/list.h (1)
ucs_list_del(106-110)
src/uct/gaudi/gaudi_gdr/gaudi_gdr_md.c (5)
src/uct/base/uct_md.c (3)
uct_md_base_md_query(505-523)uct_md_query_single_md_resource(144-162)uct_md_stub_rkey_unpack(173-181)src/ucs/debug/memtrack.c (2)
ucs_free(368-372)ucs_malloc(328-334)src/ucs/memory/memtype_cache.c (1)
ucs_memtype_cache_update(266-277)src/ucs/sys/stubs.c (3)
ucs_empty_function_return_unsupported(54-57)ucs_empty_function_return_success(49-52)ucs_empty_function(15-17)src/uct/gaudi/base/gaudi_base.c (3)
uct_gaudi_base_get_fd(23-31)uct_gaudi_base_get_info(51-95)uct_gaudi_base_get_sysdev(33-49)
🪛 Clang (14.0.6)
src/uct/gaudi/gaudi_gdr/gaudi_gdr_md.c
[error] 10-10: 'ucs/memory/memtype_cache.h' file not found
(clang-diagnostic-error)
🔇 Additional comments (8)
src/ucs/sys/topo/gaudi/topo.c (5)
1718-1722: Spinlock initialization issue from past review has been addressed.The past review concern about uninitialized spinlock has been properly fixed. The
pthread_oncemechanism now ensures the spinlock is initialized before anyucs_spin_lockcalls in bothucs_gaudi_topo_initanducs_gaudi_lazy_init.Also applies to: 1741-1747, 1773-1779
121-158: Helper function correctly manages realpath allocation.The
realpath(path, NULL)call allocates memory that is properly freed at line 155. Error handling is correct.
627-721: Distance estimation logic is sound.The NUMA-aware distance estimation with PCIe hop counting is well-structured. The fallback handling for undefined NUMA nodes and the incremental latency/bandwidth degradation per hop are reasonable heuristics.
1179-1513: Balanced assignment algorithm is well-designed.The two-pass NUMA-aware assignment with epsilon-based tie-breaking is sophisticated and correct. Memory management is proper with all temporary allocations freed. The soft-cap preference in Pass 2 elegantly handles NUMA nodes without local NICs.
1866-1911: Cleanup function properly manages resources and locks.The cleanup sequence correctly:
- Guards against double-cleanup with
provider_addedcheck- Takes locks in proper order (mutex then spinlock)
- Removes provider from global list
- Frees all allocations
- Resets state
- Releases locks
src/uct/gaudi/gaudi_gdr/gaudi_gdr_md.c (3)
42-52: Device FD is now properly closed on teardown.The past review issue has been fixed. Lines 48-50 now correctly close
md->fdbefore freeing the MD structure, preventing FD leaks.
54-74: Memory attribute query correctly validates address range.The range check at lines 63-66 ensures only addresses within the Gaudi device memory are accepted. The helper correctly populates memory info without duplicating the FD (duplication happens in the caller).
110-110: FD duplication is correct for dmabuf field.Line 110 properly duplicates the dmabuf FD so the caller owns an independent file descriptor. This follows UCX conventions for FD passing.
*UCT/GAUDI: Free the fds when called md_close
*UCT/GAUDI: Clean unused configuration
*CONFIG/M4: Clean config/m4 for gaudi
*UCT/GAUDI: Remove unused reg_mem type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (1)
src/ucs/sys/topo/gaudi/topo.c (1)
347-524: Tighten Gaudi enumeration and connection-matrix edge cases to avoid “ghost” devices and spurious ENOMEMIn
ucs_gaudi_enumerate_devices()and the downstream connection/assignment logic there are a couple of robustness gaps:
num_gaudi_devicesis not reduced to the number actually mapped into UCX topo
num_gaudi_devicesis first set from the count ofaccel*directories (Lines 375–389).- During the populate pass (Lines 444–480), some entries can be skipped (e.g. failed PCI addr read or
ucs_topo_find_device_by_bdf_name()failure), leaving those slots zero‑initialized. However,ucs_gaudi_topo_ctx.num_gaudi_devicesremains the original directory count.- Later code (e.g.
ucs_gaudi_create_connection_matrix()anducs_gaudi_build_assignment_balanced()) iterates up tonum_gaudi_devicesand will treat those trailing zero entries as real Gaudi devices, building connections and assignments for “ghost” devices that can never be addressed by name.A contained improvement is to clamp the count after the populate loop:
} - + /* Only keep Gaudi devices that were successfully mapped into UCX topo */ + ucs_gaudi_topo_ctx.num_gaudi_devices = gaudi_idx;
ucs_gaudi_create_connection_matrix()treatsmax_num_connections == 0as ENOMEMWhen
num_gaudi_devicesornum_hnic_devicesis 0,max_num_connectionsbecomes 0 (Lines 733–734),ucs_calloc(0, ...)can legitimately returnNULL, and the function translates this intoUCS_ERR_NO_MEMORY(Lines 735–740). That’s misleading: “no devices” is not an allocation failure.You can avoid this by short‑circuiting the zero case:
max_num_connections = ucs_gaudi_topo_ctx.num_gaudi_devices * ucs_gaudi_topo_ctx.num_hnic_devices; + if (max_num_connections == 0) { + ucs_gaudi_topo_ctx.num_connections = 0; + return UCS_OK; + } + ucs_gaudi_topo_ctx.connections = ucs_calloc(max_num_connections, + sizeof(ucs_gaudi_connection_t), + "gaudi_connections"); + if (!ucs_gaudi_topo_ctx.connections) { + return UCS_ERR_NO_MEMORY; + }This plays nicely with
ucs_gaudi_build_assignment_balanced(), which already short‑circuits the “no devices to assign” case (Lines 1247–1252).These changes are not strictly required for correctness in the happy path, but they make behaviour more predictable when sysfs/topology is incomplete or partially broken and avoid surprising “virtual” Gaudis and fake ENOMEM.
Also applies to: 720-780, 1155-1510
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
config/m4/gaudi.m4(1 hunks)src/ucs/sys/topo/gaudi/topo.c(1 hunks)src/uct/gaudi/base/gaudi_base.c(1 hunks)src/uct/gaudi/base/gaudi_base.h(1 hunks)src/uct/gaudi/configure.m4(1 hunks)src/uct/gaudi/gaudi_gdr/gaudi_gdr_md.c(1 hunks)src/uct/gaudi/gaudi_gdr/gaudi_gdr_md.h(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- config/m4/gaudi.m4
🧰 Additional context used
🧬 Code graph analysis (4)
src/uct/gaudi/gaudi_gdr/gaudi_gdr_md.c (5)
src/uct/base/uct_md.c (3)
uct_md_base_md_query(505-523)uct_md_query_single_md_resource(144-162)uct_md_stub_rkey_unpack(173-181)src/uct/gaudi/base/gaudi_base.c (5)
uct_gaudi_base_close_dmabuf_fd(48-52)uct_gaudi_base_close_fd(42-46)uct_gaudi_base_get_fd(23-40)uct_gaudi_base_get_info(72-116)uct_gaudi_base_get_sysdev(54-70)src/ucs/debug/memtrack.c (2)
ucs_free(368-372)ucs_malloc(328-334)src/ucs/memory/memtype_cache.c (1)
ucs_memtype_cache_update(266-277)src/ucs/sys/stubs.c (3)
ucs_empty_function_return_unsupported(54-57)ucs_empty_function_return_success(49-52)ucs_empty_function(15-17)
src/ucs/sys/topo/gaudi/topo.c (6)
src/ucs/sys/topo/base/topo.c (6)
ucs_topo_get_device_bus_id(355-364)ucs_topo_num_devices(211-220)ucs_topo_sys_device_get_name(784-801)ucs_topo_find_device_by_bdf_name(737-759)ucs_topo_sys_device_set_name(761-782)ucs_topo_sys_device_get_numa_node(803-820)src/ucs/sys/string.c (3)
ucs_strncpy_safe(235-249)ucs_snprintf_safe(226-233)ucs_string_alloc_path_buffer(454-465)src/ucs/debug/memtrack.c (3)
ucs_free(368-372)ucs_calloc(336-342)ucs_malloc(328-334)src/ucs/sys/sys.c (3)
ucs_sys_read_sysfs_file(1634-1654)ucs_make_affinity_str(1312-1344)ucs_sys_pthread_getaffinity(1376-1384)src/ucs/memory/numa.c (4)
ucs_numa_num_configured_cpus(110-122)ucs_numa_node_of_cpu(124-152)ucs_numa_num_configured_nodes(95-108)ucs_numa_distance(232-254)src/ucs/type/spinlock.h (3)
ucs_spin_lock(81-84)ucs_spin_unlock(127-130)ucs_spinlock_init(43-59)
src/uct/gaudi/base/gaudi_base.h (1)
src/uct/gaudi/base/gaudi_base.c (6)
uct_gaudi_base_get_fd(23-40)uct_gaudi_base_close_fd(42-46)uct_gaudi_base_close_dmabuf_fd(48-52)uct_gaudi_base_get_sysdev(54-70)uct_gaudi_base_get_info(72-116)uct_gaudi_base_query_devices(118-133)
src/uct/gaudi/base/gaudi_base.c (3)
bindings/go/src/ucx/ucs_constants.go (2)
UCS_ERR_IO_ERROR(62-62)UCS_ERR_INVALID_ADDR(66-66)src/ucs/sys/topo/base/topo.c (1)
ucs_topo_find_device_by_bdf_name(737-759)src/uct/base/uct_iface.c (1)
uct_single_device_resource(550-571)
🪛 Clang (14.0.6)
src/uct/gaudi/gaudi_gdr/gaudi_gdr_md.c
[error] 10-10: 'ucs/memory/memtype_cache.h' file not found
(clang-diagnostic-error)
src/uct/gaudi/base/gaudi_base.h
[error] 9-9: 'stdbool.h' file not found
(clang-diagnostic-error)
src/uct/gaudi/gaudi_gdr/gaudi_gdr_md.h
[error] 9-9: 'stdbool.h' file not found
(clang-diagnostic-error)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: UCX PR (Codestyle ctags check)
- GitHub Check: UCX PR (Codestyle commit title)
- GitHub Check: UCX PR (Codestyle codespell check)
- GitHub Check: UCX PR (Codestyle format code)
- GitHub Check: UCX PR (Codestyle AUTHORS file update check)
- GitHub Check: UCX release DRP (Prepare CheckRelease)
- GitHub Check: UCX release (Prepare CheckRelease)
- GitHub Check: UCX snapshot (Prepare Check)
🔇 Additional comments (1)
src/ucs/sys/topo/gaudi/topo.c (1)
40-44: Spinlock initialization and init/cleanup ordering now look correctThe revised initialization and teardown flow for the Gaudi topo context is solid:
gaudi_init_mutex+provider_addedgate provider registration and cleanup.ucs_gaudi_spinlock_once_init()withpthread_once(Lines 1715–1719, 1738–1744, 1770–1776) guaranteesucs_gaudi_topo_ctx.lockis initialized exactly once, regardless of whetherucs_gaudi_topo_init()orucs_gaudi_lazy_init()runs first.ucs_gaudi_topo_cleanup()only takes the spinlock ifprovider_addedis set (Lines 1868–1875), matching the invariant that successful provider registration implies a successfully initialized spinlock.- Lock ordering is consistent (
gaudi_init_mutex→ucs_gaudi_topo_ctx.lock), so there is no obvious deadlock cycle with lazy init.This addresses the earlier uninitialized-spinlock UB and gives a clean story for concurrent init/lazy-init/cleanup.
Also applies to: 1715-1777, 1863-1908
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (6)
src/uct/gaudi/gaudi_gdr/gaudi_gdr_md.c (2)
54-60: Prefer integer (or byte-pointer) arithmetic for address range and offset computationsThe range check currently compares
addrwithbegin/endasvoid *, which is technically undefined for unrelated pointers and can upset some analyzers. You can make this more robust and explicit by doing the math inuintptr_t(oruint8_t *) and reusing that representation when computingdmabuf_offset:- void *begin = (void *)gaudi_md->device_base_address; - void *end = (uint8_t *)begin + gaudi_md->totalSize; - - if ((addr < begin) || (addr >= end)) { + uintptr_t begin = gaudi_md->device_base_address; + uintptr_t end = begin + gaudi_md->totalSize; + uintptr_t a = (uintptr_t)addr; + + if ((a < begin) || (a >= end)) { mem_info->type = UCS_MEMORY_TYPE_LAST; return UCS_ERR_OUT_OF_RANGE; } @@ - if (mem_attr_p->field_mask & UCT_MD_MEM_ATTR_FIELD_DMABUF_OFFSET) { - mem_attr_p->dmabuf_offset = UCS_PTR_BYTE_DIFF(mem_info.base_address, - addr); - } + if (mem_attr_p->field_mask & UCT_MD_MEM_ATTR_FIELD_DMABUF_OFFSET) { + mem_attr_p->dmabuf_offset = + (ptrdiff_t)((uintptr_t)addr - + (uintptr_t)mem_info.base_address); + }This keeps the behavior the same while making the intent clearer and avoiding undefined pointer relational operations.
Also applies to: 111-114
49-50: Either uselengthfor bounds or mark it explicitly unused
lengthis accepted byuct_gaudi_md_query_attributes,uct_gaudi_md_mem_query, anduct_gaudi_md_detect_memory_typebut never used. That’s fine from a semantics standpoint, but with-Wextrait may generate unused-parameter warnings.Two options:
- If you expect callers to pass meaningful ranges, extend the bounds check to ensure
[addr, addr + length)stays within the device region.- If range length is intentionally ignored, mark it unused to keep builds quiet, e.g.:
uct_gaudi_md_query_attributes(uct_md_h md, const void *addr, size_t length, ucs_memory_info_t *mem_info, int *dmabuf_fd) { (void)length; /* intentionally unused */ ... }(Or use whatever
ucs_unused-style macro UCX prefers.)Also applies to: 70-73, 119-121
src/ucs/sys/topo/gaudi/topo.c (4)
80-95: Mirroring internal topo provider structs is ABI-fragile.
compatible_topo_ops_t/compatible_topo_provider_tmirror the current layout ofucs_sys_topo_ops_t/ucs_sys_topo_provider_tfromtopo.cand are then inserted intoucs_sys_topo_providers_list. This works only as long as the internal struct layout (field order, types, padding) does not change; any future change intopo.ccould silently break this provider.If possible, consider:
- Moving the real
ucs_sys_topo_provider_tdefinition into a shared header, or- Adding a registration API that hides the provider struct layout and just takes function pointers and a name.
That would make the Gaudi provider robust against internal topology changes.
266-304: Consider adding a range check when parsingmodule_idto avoid silent truncation.
ucs_gaudi_read_module_id()parses the module ID into anunsigned longand then casts touint32_twithout checking for overflow, unlikeucs_gaudi_read_vendor_id()which guards againstval > UINT16_MAX. While sysfs is unlikely to expose an out-of-rangemodule_id, adding a symmetric check would prevent silent truncation and keep behavior consistent with vendor parsing.A minimal adjustment:
- errno = 0; - val = strtoul(buffer, &endptr, 10); + errno = 0; + val = strtoul(buffer, &endptr, 10); @@ - if (errno != 0 || endptr == buffer || *endptr != '\0') { + if (errno != 0 || endptr == buffer || *endptr != '\0') { ucs_debug("Invalid module ID in %s: '%s'", path, buffer); return UCS_ERR_INVALID_PARAM; } - - *module_id = (uint32_t)val; - return UCS_OK; + if (val > UINT32_MAX) { + ucs_debug("Module ID %lu exceeds uint32_t in %s", val, path); + return UCS_ERR_INVALID_PARAM; + } + + *module_id = (uint32_t)val; + return UCS_OK;(Assuming
UINT32_MAXis already available via existing headers; otherwise, include it in the project-preferred way.)
351-400: Clarify semantics for “no Gaudi / no HNIC” vs. genuine initialization failures.
ucs_gaudi_enumerate_devices()returnsUCS_ERR_NO_DEVICEwhen there are no Gaudi devices (lines 395–400) or no Mellanox/Broadcom NICs (lines 419–424).ucs_gaudi_lazy_init()(lines 1762–1796) then propagates this as an error, which in turn causes:
ucs_gaudi_get_distance()to log an error and returnUCS_ERR_NO_DEVICE, anducs_gaudi_find_best_connection()to treat it as a hard failure (it only special-casesUCS_ERR_UNSUPPORTEDas “provider disabled”).On nodes where “no Gaudi / no RoCE NIC” is expected (common in mixed clusters), this may be more naturally treated as “provider not applicable” rather than an error, depending on how the global topo framework reacts to non-OK provider statuses.
I’d suggest double-checking the intended behavior:
- If “no Gaudi present” should not be an error for generic topology queries, consider mapping
UCS_ERR_NO_DEVICEfromucs_gaudi_enumerate_devices()toUCS_ERR_UNSUPPORTEDinsideucs_gaudi_lazy_init(), and then havingucs_gaudi_find_best_connection()continue to returnUCS_ERR_NO_ELEMin that case.- If you do want “no Gaudi / no NIC” to be a hard error for Gaudi-specific code but benign for generic topo, ensure the caller side (e.g., the topo provider dispatcher) explicitly treats
UCS_ERR_NO_DEVICEfrom this provider as ignorable.Also applies to: 419-424, 1762-1796
1606-1620: Sysfs vendor-ID lookups on every distance/memory query may be unnecessarily expensive.Both
ucs_gaudi_get_distance()(vendor checks at lines 1606–1620) anducs_gaudi_get_memory_distance()(lines 1667–1671) callucs_gaudi_read_vendor_id()on each invocation. That routine doesucs_gaudi_sys_dev_to_sysfs_path()plusfopen/fgetson/sys, which can be relatively expensive if these hooks are hit frequently in hot paths.Given you already have:
ucs_gaudi_topo_ctx.gaudi_devices[]anducs_gaudi_topo_ctx.hnic_devices[], anducs_gaudi_topo_ctx.hnic_vendor_ids[],you could avoid repeated sysfs I/O by:
- Caching vendor IDs per
ucs_sys_device_tin the Gaudi context (e.g., a small array or map), or- First checking whether
deviceis known to be a Gaudi or HNIC via the existing arrays, and only falling back toucs_gaudi_read_vendor_id()when not found.This is not correctness-critical, but could reduce overhead if these functions are queried often.
Also applies to: 1667-1671
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/ucs/sys/topo/gaudi/topo.c(1 hunks)src/ucs/sys/topo/gaudi/topo.h(1 hunks)src/uct/gaudi/gaudi_gdr/gaudi_gdr_md.c(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
src/ucs/sys/topo/gaudi/topo.c (7)
src/ucs/sys/topo/base/topo.c (6)
ucs_topo_get_device_bus_id(355-364)ucs_topo_num_devices(211-220)ucs_topo_sys_device_get_name(784-801)ucs_topo_find_device_by_bdf_name(737-759)ucs_topo_sys_device_set_name(761-782)ucs_topo_sys_device_get_numa_node(803-820)src/ucs/sys/string.c (5)
ucs_strncpy_safe(235-249)ucs_snprintf_safe(226-233)ucs_string_alloc_path_buffer(454-465)ucs_path_get_common_parent(373-381)ucs_path_calc_distance(383-389)src/ucs/debug/memtrack.c (3)
ucs_free(368-372)ucs_calloc(336-342)ucs_malloc(328-334)src/ucs/sys/sys.c (3)
ucs_sys_read_sysfs_file(1634-1654)ucs_make_affinity_str(1312-1344)ucs_sys_pthread_getaffinity(1376-1384)src/ucs/memory/numa.c (4)
ucs_numa_num_configured_cpus(110-122)ucs_numa_node_of_cpu(124-152)ucs_numa_num_configured_nodes(95-108)ucs_numa_distance(232-254)src/ucs/type/spinlock.h (3)
ucs_spin_lock(81-84)ucs_spin_unlock(127-130)ucs_spinlock_init(43-59)src/ucs/datastruct/list.h (1)
ucs_list_del(106-110)
src/uct/gaudi/gaudi_gdr/gaudi_gdr_md.c (4)
src/uct/base/uct_md.c (3)
uct_md_base_md_query(505-523)uct_md_query_single_md_resource(144-162)uct_md_stub_rkey_unpack(173-181)src/uct/gaudi/base/gaudi_base.c (5)
uct_gaudi_base_close_dmabuf_fd(48-52)uct_gaudi_base_close_fd(42-46)uct_gaudi_base_get_fd(23-40)uct_gaudi_base_get_info(72-116)uct_gaudi_base_get_sysdev(54-70)src/ucs/memory/memtype_cache.c (1)
ucs_memtype_cache_update(266-277)src/ucs/sys/stubs.c (3)
ucs_empty_function_return_unsupported(54-57)ucs_empty_function_return_success(49-52)ucs_empty_function(15-17)
src/ucs/sys/topo/gaudi/topo.h (1)
src/ucs/sys/topo/gaudi/topo.c (4)
ucs_gaudi_topo_init(1725-1760)ucs_gaudi_topo_cleanup(1867-1911)ucs_gaudi_get_index_from_module_id(307-348)ucs_gaudi_find_best_connection(1516-1567)
🪛 Clang (14.0.6)
src/uct/gaudi/gaudi_gdr/gaudi_gdr_md.c
[error] 10-10: 'ucs/memory/memtype_cache.h' file not found
(clang-diagnostic-error)
src/ucs/sys/topo/gaudi/topo.h
[error] 10-10: 'ucs/sys/topo/base/topo.h' file not found
(clang-diagnostic-error)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: UCX PR (Codestyle commit title)
- GitHub Check: UCX PR (Codestyle AUTHORS file update check)
- GitHub Check: UCX PR (Codestyle format code)
- GitHub Check: UCX PR (Codestyle ctags check)
- GitHub Check: UCX PR (Codestyle codespell check)
- GitHub Check: UCX release DRP (Prepare CheckRelease)
- GitHub Check: UCX release (Prepare CheckRelease)
- GitHub Check: UCX snapshot (Prepare Check)
🔇 Additional comments (1)
src/ucs/sys/topo/gaudi/topo.h (1)
15-60: Gaudi topology public API and docs match the implementation.The prototypes and documented semantics (init/cleanup pairing,
-1sentinel forucs_gaudi_get_index_from_module_id(), and detailed status codes forucs_gaudi_find_best_connection()) align with the definitions and behavior intopo.c. Nothing blocking here from an API or documentation standpoint.
ee9a8df to
b108672
Compare
…on all the comments UCS: Added Intel gaudi topolody for gaudi devices
b108672 to
b0bc79f
Compare
|
The failure seems unrelated to our PR codes, anyone can help here to resolve the issue? |
yosefe
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
initial review
| /* check if ROCM KFD driver is loaded */ | ||
| uct_ib_check_gpudirect_driver(md, "/dev/kfd", UCS_MEMORY_TYPE_ROCM); | ||
|
|
||
| #ifdef HAVE_GAUDI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's remove this ifdef
|
|
||
| SUBDIRS = . | ||
|
|
||
| module_LTLIBRARIES = libuct_gaudi.la |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: align = on column
| esoha-nvidia <[email protected]> | ||
| lyu <[email protected]> | ||
| lzhang2 <[email protected]> | ||
| nileshnegi <[email protected]> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems like unrelated changes are added to the AUTHORS file
| /** | ||
| * Copyright (C) Intel Corporation, 2025. ALL RIGHTS RESERVED. | ||
| * | ||
| * See file LICENSE for terms. | ||
| */ | ||
|
|
||
| #ifdef HAVE_CONFIG_H | ||
| #include "config.h" | ||
| #endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the code in this file seems gaudi-specific, can we move it to uct/gaudi (to the transport)?
| #include <hlthunk.h> | ||
| #include <synapse_api.h> | ||
|
|
||
| int uct_gaudi_base_get_fd(int device_id, bool *fd_created) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls check clang-format output (here, { should be on next line)
|
|
||
| #include <stdint.h> | ||
|
|
||
| #define SCAL_SUCCESS 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use UCT_GAUID_ prefix for identifiers in gaudi transport
| * Otherwise, the device is opened via hlthunk_open_by_module_id function. | ||
| */ | ||
| rc = scal_init(fd, "", &scal_handle, NULL); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls use spaces between blocks
| ucs_status_t uct_gaudi_base_get_sysdev(int fd, ucs_sys_device_t* sys_dev) { | ||
| ucs_status_t status; | ||
| char pci_bus_id[13]; | ||
| int rc = hlthunk_get_pci_bus_id_from_fd(fd, pci_bus_id, sizeof(pci_bus_id)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls don't use c99 style declarations (even in technically it compiles)
| int rc = hlthunk_get_pci_bus_id_from_fd(fd, pci_bus_id, sizeof(pci_bus_id)); | |
| int rc; | |
| rc = hlthunk_get_pci_bus_id_from_fd(fd, pci_bus_id, sizeof(pci_bus_id)); |
| #include <uct/base/uct_md.h> | ||
| #include "scal.h" | ||
|
|
||
| int uct_gaudi_base_get_fd(int device_id, bool *fd_created); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls add space lines between functions
| rc = scal_init(fd, "", &scal_handle, NULL); | ||
| } | ||
| if (rc != SCAL_SUCCESS) { | ||
| ucs_error("Failed to get scal handle"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls use small case for debug/error prints, and add information if possible like device name, error code meaning/string, etc
What?
This PR is to enable Intel Gaudi GDR to UCX, so the UCX will have the capability to use Gaudi GDR to access Gaudi memory using standard UCP API through host NIC.
Why?
Enhanced the UCX to Intel Gaudi devices.
How?
Added Gaudi memory support in USM and the rest are new Gaudi GDR codes.
Summary by CodeRabbit
New Features
Chores
✏️ Tip: You can customize this high-level summary in your review settings.