|
11 | 11 | ### Features: |
12 | 12 | ### Bugfixes: |
13 | 13 |
|
14 | | -## 1.15.0-rc6 (September 20, 2023) |
15 | | -### Bugfixes: |
16 | | -#### UCP |
17 | | -* Fixed assertion when sending from noncontig GPU buffer to managed buffer. |
18 | | - |
19 | | -## 1.15.0-rc5 (September 12, 2023) |
20 | | -### Bugfixes: |
21 | | -#### UCP |
22 | | -* Fixed the data race on endpoint configurations. |
23 | | - |
24 | | -## 1.15.0-rc4 (August 30, 2023) |
25 | | -### Bugfixes: |
26 | | -#### RDMA CORE (IB, ROCE, etc.) |
27 | | -* Fixed dma-buf based memory region registration |
28 | | -* Fixed memory handle data corruption when PCIe relaxed ordering is enabled |
29 | | -#### UCS |
30 | | -* Fixed lane selection, adding bandwidth estimation for Sapphire Rapids family |
31 | | - |
32 | | -## 1.15.0-rc3 (August 8, 2023) |
33 | | -### Bugfixes: |
34 | | -#### UCP |
35 | | -* Fixed endpoint reconfiguration issues because of assymetrical selection |
36 | | -#### UCT |
37 | | -* Check dmabuf kernel support in ROCm memory domain |
38 | | -#### UCM |
39 | | -* Fixed conditional jump patching |
40 | | -#### Tools |
41 | | - * Fixed memory access flags in perftest |
42 | | - |
43 | | -## 1.15.0-rc2 (July 27, 2023) |
44 | | -### Features: |
45 | | -#### RDMA CORE (IB, ROCE, etc.) |
46 | | -* Implemented is_reachable_v2 for IB interfaces |
47 | | -#### Build |
48 | | -* Enabled build with binutils 2.40 |
49 | | -* Added versioned dependency to switch between packages with the same names |
50 | | - |
51 | | -### Bugfixes: |
52 | | -#### UCP |
53 | | -* Fixed endpoint reconfiguration error due to wrong locality detection |
54 | | -#### RDMA CORE (IB, ROCE, etc.) |
55 | | -* Fixed performance degradation when indirect atomic key is not supported by the hardware |
56 | | -* Fixed remote access error to strict-order key because of wrong offset |
57 | | -#### GPU (CUDA, ROCM) |
58 | | -* Fixed CUDA IPC performance degradation after libnuma removal |
59 | | - |
60 | | -## 1.15.0-rc1 (May 10, 2023) |
| 14 | +## 1.15.0 (September 28, 2023) |
61 | 15 | ### Features: |
62 | 16 | #### UCP |
63 | 17 | * Added 2-stage pipeline protocol in the new protocol infrastructure |
|
75 | 29 | * Added base implementation of is_reachable_v2 API using intra/inter flag |
76 | 30 | * Introduced MD capability for non-blocking registration memory types |
77 | 31 | #### RDMA CORE (IB, ROCE, etc.) |
| 32 | +* Added implementation of is_reachable_v2 routine to IB interface |
78 | 33 | * Added option to control CQE zipping per CQ RX/TX direction |
79 | 34 | * Added option to specify how DCI selects port under RoCE LAG |
80 | 35 | * Added hw_dcs to the list of policies to select DCI by an endpoint |
|
104 | 59 | * Added user-side memcpy option for AM benchmarks in ucx_perftest |
105 | 60 | * Added wireshark LUA dissectors for some UCX protocols |
106 | 61 | #### Build |
| 62 | +* Added support for binutils 2.40 |
| 63 | +* Added versioned dependency to switch between packages with the same names |
107 | 64 | * Added a separate xpmem deb subpackage |
108 | 65 | * Added aarch64 support to the binary distribution pipeline |
109 | 66 | * Removed dependency on libnuma |
110 | | - |
111 | 67 | ### Bugfixes: |
112 | 68 | #### UCP |
| 69 | +* Fixed assertion when sending from non-contiguous GPU buffer to managed buffer |
| 70 | +* Fixed the race condition on endpoint configurations |
| 71 | +* Fixed endpoint reconfiguration issues due to asymmetrical selection |
| 72 | +* Fixed endpoint reconfiguration error due to wrong locality detection |
113 | 73 | * Fixed crash during connection manager cleanup |
114 | 74 | * Fixed rkey index calculation for rendezvous protocol |
115 | 75 | * Fixed rcache dump function |
|
123 | 83 | * Fixed CPU/device atomics selection in the new protocol infrastructure |
124 | 84 | * Multiple fixes in the new protocol infrastructure information output |
125 | 85 | #### UCT |
| 86 | +* Added check for dmabuf kernel support in ROCm memory domain |
126 | 87 | * Fixed exported memh packing |
127 | 88 | * Fixed an error in checking return status of multi-threaded memory registration function |
128 | 89 | #### RDMA CORE (IB, ROCE, etc.) |
| 90 | +* Fixed dma-buf based memory region registration |
| 91 | +* Fixed memory handle data corruption when PCIe relaxed ordering is enabled |
| 92 | +* Fixed performance degradation when indirect atomic key is not supported by the hardware |
| 93 | +* Fixed remote access error to strict-order keys because of wrong offset |
129 | 94 | * Added check for UAR support to memory domain opening |
130 | 95 | * Fixed updating port counters for devx qp |
131 | 96 | * Fixed ibv_create_cq error message on node without Infiniband |
132 | 97 | * Fixed performance degradation due to using 2 paths on NDR400 by default |
133 | 98 | * Removed unnecessary async lock which otherwise would block UD progress |
| 99 | +#### GPU (CUDA, ROCM) |
| 100 | +* Fixed CUDA IPC performance degradation due to libnuma removal |
134 | 101 | #### UCS |
| 102 | +* Fixed lane selection and added bandwidth estimation for Sapphire Rapids family |
135 | 103 | * Fixed displaying wrong environment variable suggestions |
136 | 104 | * Fixed VFS warning output |
137 | 105 | * Fixed SEGV in ucs_debug_backtrace_next(), upon previous SEGV handling, due to ENOMEM situation |
138 | 106 | * Fixed memory corruption when using UCX_MPOOL_FIFO=y |
139 | 107 | #### UCM |
| 108 | +* Fixed conditional jump patching |
140 | 109 | * Fixed mremap() override |
141 | 110 | #### GPU (CUDA, ROCM) |
142 | 111 | * Fixed usage of dmabuf when the buffer is not page-aligned |
|
148 | 117 | #### Tests |
149 | 118 | * Fixed wrong usage of ep_close in examples |
150 | 119 | #### Tools |
| 120 | +* Fixed memory access flags in perftest |
151 | 121 | * Removed support for librte from perf |
152 | 122 | * Fixed worker flush deadlock when using multiple workers in ucx_perftest |
153 | 123 | #### Build |
|