Skip to content

Conversation

shijin-aws
Copy link
Contributor

Currently, there is no track for the reg/dereg status of
an MR in efa provider, because fi_close(mr) will simply
free all the efa_mr object. This will cause crash or other
undeterministic when an application close a mr when there
are outstanding operations still reference it.

This patch resolves this problem by introducing two fields
in efa_mr structure: active and ref.

ref is the count of operations (op entries) that an mr
is referenced. It is initialized as 0 during mr registration,
, incremented when the mr is referenced by an operation,
and decremented when the operation is resolved.

active is a simple boolean that is toggled as true when
mr is registered and toggled as false when mr is deregistered
by fi_close.

The lifecycle or an efa_mr object behaves as:

  1. Allocated during fi_mr_reg*, initialize active as 1, ref as 0
  2. op 1 referenced the mr, ref++
  3. op 2 referenced the mr, ref--
  4. fi_close dereg the mr, active is set as 0, while the object is
    not freed.
  5. op 1 completes, ref--
  6. op 2 completes, ref--, now ref is 0 and active is 0, object is
    freed.

The sequence of 4, 5, 6 can be any order, and the object will be
freed only when active is 0 && ref is 0

Also introduce a check in the operation calls that any access to
inactive mr will fail with -FI_EINVAL.

Currently, there is no track for the reg/dereg status of
an MR in efa provider, because `fi_close(mr)` will simply
free all the `efa_mr` object. This will cause crash or other
undeterministic when an application close a mr when there
are outstanding operations still reference it.

This patch resolves this problem by introducing two fields
in efa_mr structure: active and ref.

ref is the count of operations (op entries) that an mr
is referenced. It is initialized as 0 during mr registration,
, incremented when the mr is referenced by an operation,
and decremented when the operation is resolved.

active is a simple boolean that is toggled as true when
mr is registered and toggled as false when mr is deregistered
by fi_close.

The lifecycle or an efa_mr object behaves as:

1. Allocated during fi_mr_reg*, initialize active as 1, ref as 0
2. op 1 referenced the mr, ref++
3. op 2 referenced the mr, ref--
4. fi_close dereg the mr, active is set as 0, while the object is
not freed.
5. op 1 completes, ref--
6. op 2 completes, ref--, now ref is 0 and active is 0, object is
freed.

The sequence of 4, 5, 6 can be any order, and the object will be
freed only when active is 0 && ref is 0

Also introduce a check in the operation calls that any access to
inactive mr will fail with -FI_EINVAL.

Signed-off-by: Shi Jin <[email protected]>
When the pkt send/write call failed, the tx pkt should
be released, because it was removed from the queued_pkts
dlist already. Otherwise it will not be cleaned in
the iteration of queued_pkts in the txe/rxe error handling
and causes assertion errors in tx pkt pool destroy

Signed-off-by: Shi Jin <[email protected]>
For tx pkt, pkt_entry->entry is linked to either
peer->outstanding_tx_pkts or txe->queued_pkts.
It must be removed from the list as part of the
pkt entry release, otherwise it will cause double
free bug when txe_handle_err and txe_release are
both called for the same txe.

Signed-off-by: Shi Jin <[email protected]>
@shijin-aws shijin-aws marked this pull request as draft June 18, 2025 17:07
The rx pkt in this error path may already be freed.
Temporarily skip the release to avoid double free.

Signed-off-by: Shi Jin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant