Introduce inflight requests migration #38

Open
AlexMoshkov wants to merge 4 commits into yandex-cloud:master from AlexMoshkov:inflight-requests-migration
@AlexMoshkov

Now, on GET_VRING_BASE, all queued and in-flight requests are cancelled (or ignored) and then migrated to the destination server as resubmitted requests.

@AlexMoshkov AlexMoshkov force-pushed the inflight-requests-migration branch from ed20019 to 16659bf Compare December 22, 2025 11:04
vhd_complete_bio() now obtains rq from the io request instead of from the
vring. This will help later commits prevent a race between vring cleanup
and vhd_complete_bio().

Signed-off-by: Alexandr Moshkov <[email protected]>
@AlexMoshkov AlexMoshkov force-pushed the inflight-requests-migration branch 4 times, most recently from fae7409 to e64bc9f Compare February 10, 2026 08:48
@AlexMoshkov AlexMoshkov force-pushed the inflight-requests-migration branch from e64bc9f to 50bd5a4 Compare February 16, 2026 10:27
server.c Outdated
int vhd_cancel_inflight_requests(struct vhd_request_queue *rq,
const struct vhd_vring *vring)
{
int num_force_cancelled = 0;
Collaborator

You end up storing this value in a uint16_t, so I think it makes sense to use uint16_t here as well.

vdev.h Outdated
/* #requests pending completion when the queue is requested to stop */
uint16_t num_in_flight_at_stop;
/* #requests in flight when the queue is requested to force cancel */
uint16_t num_in_flight_at_cancell;
Collaborator

num_in_flight_at_cancell -> num_in_flight_at_cancel

while (io) {
struct vhd_io *next = TAILQ_NEXT(io, inflight_link);
if (unlikely(io->vring == vring)) {
io->force_cancelled = true;
Collaborator

Can you explain how this handles the case where NBS completes requests at the very moment you are cancelling them? How is the race resolved?

Author

If NBS completes a request at that same moment, it calls vhd_complete_bio, after which the request is added to the same bh queue in which requests are cancelled.

In other words, we are guaranteed to mark all requests with force_cancelled = true first, and only after that is completion_handler called for those requests.
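The ordering argument in the reply above can be sketched as a small single-threaded model. This is only an illustration under assumptions: the struct and function names below (`example_io`, `example_cancel_inflight`, `example_complete`) are hypothetical and are not the library's actual code. It assumes that both the cancel step and the completion handlers run on the same serialized queue, so a completion can never interleave with the marking loop.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the request status and request struct. */
enum example_status { EXAMPLE_OK, EXAMPLE_CANCELED };

struct example_io {
    int vring_id;             /* which vring the request belongs to */
    bool force_cancelled;     /* set by the cancel step */
    bool done;                /* set once the completion handler ran */
    enum example_status status;
};

/*
 * Cancel step: runs on the serialized queue and marks every request of
 * the given vring that has not completed yet. Returns how many were marked.
 */
static uint16_t example_cancel_inflight(struct example_io *ios, size_t n,
                                        int vring_id)
{
    uint16_t marked = 0;
    for (size_t i = 0; i < n; i++) {
        if (!ios[i].done && ios[i].vring_id == vring_id) {
            ios[i].force_cancelled = true;
            marked++;
        }
    }
    return marked;
}

/*
 * Completion handler: runs later on the same queue. A request that was
 * marked is reported as cancelled regardless of the backend's result.
 */
static void example_complete(struct example_io *io,
                             enum example_status backend_status)
{
    io->status = io->force_cancelled ? EXAMPLE_CANCELED : backend_status;
    io->done = true;
}
```

Because the two functions never run concurrently in this model, a backend completion that arrives during cancellation is simply processed after the marking loop and observes force_cancelled already set.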

if (unlikely(io->vring == vring)) {
TAILQ_REMOVE(&rq->submission, io, submission_link);
io->status = VHD_BDEV_CANCELED;
req_complete(io);
Collaborator

Doesn't the request structure itself end up leaking here if you don't call complete?

Author

In theory it shouldn't. req_complete will still be called, just later.

That is, I simply set force_cancelled on this request as well.

After that it is moved into the rq->inflight queue like any ordinary submission and is handled the same way as any other inflight request.

Though I'm now trying to remember why I did it this way: there was some problem with the counter if it was left as it was...

void vhd_cancel_queued_requests(struct vhd_request_queue *rq,
int vhd_cancel_queued_requests(struct vhd_request_queue *rq,
const struct vhd_vring *vring);
Indentation.

const struct vhd_vring *vring);

int vhd_cancel_inflight_requests(struct vhd_request_queue *rq,
const struct vhd_vring *vring);
Indentation.

VHOST_USER_PROTOCOL_F_GET_VRING_BASE_INFLIGHT will be used to determine
whether to wait for in-flight requests to drain.

Move has_feature() upwards

Signed-off-by: Alexandr Moshkov <[email protected]>
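A minimal sketch of how such a protocol feature bit might gate the drain decision. Everything here is an assumption made for illustration: the bit number is a placeholder (the real value comes from the project's headers), `example_has_feature()` is not the library's has_feature(), and the reading that a negotiated feature means "do not wait for drain" is inferred from the commit message rather than confirmed by the code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit number, for illustration only; not the real value. */
#define EXAMPLE_F_GET_VRING_BASE_INFLIGHT 20u

/* Hypothetical helper: test one bit in a 64-bit feature mask. */
static bool example_has_feature(uint64_t features, unsigned int bit)
{
    return (features & (UINT64_C(1) << bit)) != 0;
}

/*
 * Assumed policy: without the feature, wait for in-flight requests to
 * drain before answering GET_VRING_BASE; with it, cancel them so they
 * can be migrated instead.
 */
static bool example_must_wait_for_drain(uint64_t negotiated_features)
{
    return !example_has_feature(negotiated_features,
                                EXAMPLE_F_GET_VRING_BASE_INFLIGHT);
}
```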
In case of GET_VRING_BASE on migration, cancel in-flight requests so that
they are migrated to the destination server.

In the rq event, set the force_cancelled flag on all requests and
set num_in_inflight = 0, so that vring_mark_drained is executed as soon
as possible. All requests completed in the completion bh will be set to
the CANCEL status and cleaned up.

Signed-off-by: Alexandr Moshkov <[email protected]>

server.c: fix queued requests cancellation

As with in-flight requests, we must mark a submission request with
force_cancelled = true, so that it is ignored once it becomes in-flight.

Signed-off-by: Alexandr Moshkov <[email protected]>

vdev.c: fix GET_VRING_BASE response

Send last_avail minus the number of in-flight requests that will be
migrated, so that the last_avail counter is correct in QEMU and, after
SET_VRING_BASE, in the other instance of libvhost-server.

Signed-off-by: Alexandr Moshkov <[email protected]>
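The arithmetic described in this commit message can be sketched as follows. The function name and parameters are hypothetical, and the sketch reflects only what the message states (report last_avail reduced by the number of requests to be migrated). It also assumes a 16-bit free-running ring index, so the subtraction must wrap modulo 2^16.

```c
#include <stdint.h>

/*
 * Hypothetical helper: compute the index to report in the GET_VRING_BASE
 * reply. The explicit cast keeps the result modulo 2^16, matching a
 * 16-bit ring index that wraps around.
 */
static uint16_t example_vring_base_reply(uint16_t last_avail,
                                         uint16_t num_migrated)
{
    return (uint16_t)(last_avail - num_migrated);
}
```

For instance, with last_avail = 100 and 3 requests to migrate the reply would carry 97, and with last_avail = 1 the result wraps to 65534.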
This number also shows how many in-flight requests will be migrated to
another instance of libvhost-server.

Signed-off-by: Alexandr Moshkov <[email protected]>
@AlexMoshkov AlexMoshkov force-pushed the inflight-requests-migration branch from 50bd5a4 to c8eb70b Compare February 25, 2026 07:52