Unable to connect to new Discovery Controller on Discovery Controller Change Asynchronous Event Notification #2719

Open
@ankkuma


Hi,

I want the NVMe host to connect to a new discovery controller after it receives the AEN for a Discovery Log Page change notice.

Below are the steps I followed:

1. Make a persistent connection to the discovery controller.

sudo nvme discover --transport tcp --traddr 10.0.0.4 --trsvcid 4420 --persistent

10.0.0.4 => Current NVMe-oF target / discovery controller.

On the NVMe-oF host:

[azureuser@NVMe-oF-Initiator ~]$ sudo nvme list-subsys

nvme-subsys1 - NQN=nqn.2014-08.org.nvmexpress.discovery
hostnqn=nqn.2014-08.org.nvmexpress:uuid:142142ac-e66c-41dd-9535-4c6d3466fae4
iopolicy=numa

+- nvme1 tcp traddr=10.0.0.4,trsvcid=4420,src_addr=10.0.0.5 live
nvme-subsys0 - NQN=nqn.2014-08.org.nvmexpress:uuid:5a513dee-5df0-4dbb-b063-b4f4b4853c00
hostnqn=nqn.2014-08.org.nvmexpress:uuid:142142ac-e66c-41dd-9535-4c6d3466fae4
iopolicy=numa

2. The NVMe-oF target sends the AEN, triggered through an SPDK RPC.

On the target:

sudo ./scripts/rpc.py nvmf_discovery_add_referral --subnqn nqn.2014-08.org.nvmexpress.discovery --trtype TCP --traddr 10.0.0.6 --trsvcid 4420 --adrfam ipv4

AEN details from the target:

async_event_type = SPDK_NVME_ASYNC_EVENT_TYPE_NOTICE;
async_event_info = SPDK_NVME_ASYNC_EVENT_DISCOVERY_LOG_CHANGE;
log_page_identifier = SPDK_NVME_LOG_DISCOVERY;
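
For reference, the three fields above map onto the single value that the host logs for this event (0x70f002). The sketch below splits that value into its parts, assuming the standard Asynchronous Event Request completion layout (event type in bits 2:0, event information in bits 15:8, log page identifier in bits 23:16). The names `AenResult` and `decode_aen` are hypothetical helpers for illustration, not part of SPDK or nvme-stas.

```python
from typing import NamedTuple


class AenResult(NamedTuple):
    """Fields of an AEN completion dword 0 (hypothetical helper type)."""
    event_type: int   # bits 2:0, 0x2 = Notice
    event_info: int   # bits 15:8, 0xF0 = Discovery Log Page Change
    log_page_id: int  # bits 23:16, 0x70 = Discovery log page


def decode_aen(dword0: int) -> AenResult:
    """Split the AEN result value into type, info, and log page ID."""
    return AenResult(
        event_type=dword0 & 0x7,
        event_info=(dword0 >> 8) & 0xFF,
        log_page_id=(dword0 >> 16) & 0xFF,
    )


aen = decode_aen(0x70F002)
print(f"type={aen.event_type:#x} info={aen.event_info:#x} log_page={aen.log_page_id:#x}")
# prints: type=0x2 info=0xf0 log_page=0x70
```

If the decoded fields match what the target intended to send, the AEN itself was delivered correctly and the problem is more likely in what the host does afterwards.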

3. The NVMe-oF host received the AEN. It should then connect to the new discovery controller (10.0.0.6), but it failed to do so.

Feb 21 06:52:25 NVMe-oF-Initiator stafd[355165]: nvme1 - Received AEN: Change of Discovery Log Page (0x70f002)
Feb 21 06:52:25 NVMe-oF-Initiator stafd[355165]: Invoking: systemctl start nvmf-connect@--device\x3dnvme1.service
Feb 21 06:52:25 NVMe-oF-Initiator systemd[1]: Started NVMf auto-connect scan upon nvme discovery controller Events.
░░ Subject: A start job for unit nvmf-connect@--device\x3dnvme1.service has finished successfully
░░ Defined-By: systemd
░░ Support:  https://access.redhat.com/support
░░
░░ A start job for unit nvmf-connect@--device\x3dnvme1.service has finished successfully.
░░
░░ The job identifier is 82439.
Feb 21 06:52:25 NVMe-oF-Initiator sh[355422]: ctrl device nvme1 found, ignoring non matching command-line options
Feb 21 06:52:25 NVMe-oF-Initiator sh[355422]: Failed to write to /dev/nvme-fabrics: Required key not available
Feb 21 06:52:25 NVMe-oF-Initiator kernel: nvme nvme2: no valid PSK found
Feb 21 06:52:25 NVMe-oF-Initiator kernel: nvme nvme2: no valid PSK found
Feb 21 06:52:25 NVMe-oF-Initiator sh[355422]: Failed to write to /dev/nvme-fabrics: Required key not available
Feb 21 06:52:25 NVMe-oF-Initiator systemd[1]: nvmf-connect@--device\x3dnvme1.service: Deactivated successfully.
4. Expectation: I expected the host to also connect to the new discovery controller that was sent as part of the AEN arguments. The host did receive it (verified in the Wireshark traces).

However, that is not happening: the host auto-connects only to the same old discovery controller. I expected connections to both discovery controllers.
What is the solution? Do I need a different AEN? A server-side change would be welcome if it can achieve this. Do we need a different Linux or nvme-stas version?
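
To help triage, here is a minimal sketch that summarizes the journal lines from step 3. It only pattern-matches the messages quoted above; the function name `summarize` and the regular expressions are assumptions made for this illustration and are not part of nvme-stas or nvme-cli. Read this way, the log suggests that a connect was attempted for a new controller (nvme2) and was rejected with "Required key not available" after the kernel reported no valid PSK. That is an inference from the excerpt, not a confirmed root cause.

```python
import re

# Patterns written against the exact messages quoted in this report (assumption:
# other versions of stafd, nvme-cli, or the kernel may word them differently).
AEN_RE = re.compile(r"(?P<dev>nvme\d+) - Received AEN: .+? \((?P<code>0x[0-9a-fA-F]+)\)")
FAIL_RE = re.compile(r"Failed to write to /dev/nvme-fabrics: (?P<reason>.+)$")
PSK_RE = re.compile(r"kernel: nvme (?P<dev>nvme\d+): no valid PSK found")


def summarize(lines):
    """Collect AENs, failed fabrics writes, and PSK complaints from journal lines."""
    report = {"aens": [], "connect_failures": [], "psk_missing": []}
    for raw in lines:
        line = raw.strip()
        if m := AEN_RE.search(line):
            report["aens"].append((m["dev"], m["code"]))
        elif m := FAIL_RE.search(line):
            report["connect_failures"].append(m["reason"])
        elif m := PSK_RE.search(line):
            report["psk_missing"].append(m["dev"])
    return report


LOG = [
    "Feb 21 06:52:25 NVMe-oF-Initiator stafd[355165]: nvme1 - Received AEN: Change of Discovery Log Page (0x70f002)",
    "Feb 21 06:52:25 NVMe-oF-Initiator sh[355422]: Failed to write to /dev/nvme-fabrics: Required key not available",
    "Feb 21 06:52:25 NVMe-oF-Initiator kernel: nvme nvme2: no valid PSK found",
]

report = summarize(LOG)
print(report)
```

The useful detail is that the AEN arrives on nvme1 while the PSK message names nvme2, which hints that the host did try to create a second controller before the attempt was rejected.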

Version Details:

[azureuser@NVMe-oF-Target spdk]$ cat /etc/redhat-release
Red Hat Enterprise Linux release 9.4 (Plow)

[azureuser@NVMe-oF-Initiator ~]$ rpm -qi nvme-stas
Name : nvme-stas
Version : 2.2.1
Release : 2.el9
Architecture: noarch
Install Date: Sun 26 Jan 2025 03:42:56 PM UTC
Group : Unspecified
Size : 623388
License : ASL 2.0
Signature : RSA/SHA256, Wed 12 Apr 2023 07:58:31 AM UTC, Key ID 199e2f91fd431d51
Source RPM : nvme-stas-2.2.1-2.el9.src.rpm
Build Date : Fri 07 Apr 2023 04:18:36 PM UTC
Build Host : ppc-061.build.eng.bos.redhat.com
Packager : Red Hat, Inc.  http://bugzilla.redhat.com/bugzilla
Vendor : Red Hat, Inc.
URL :  https://github.com/linux-nvme/nvme-stas
Summary : NVMe STorage Appliance Services

Labels: need more info (the bug report doesn't have enough information to process)