Message ID | 20190109112728.9214-3-xieyongji@baidu.com (mailing list archive) |
---|---|
State | New, archived |
Series | vhost-user-blk: Add support for backend reconnecting |
On Wed, Jan 09, 2019 at 07:27:23PM +0800, elohimes@gmail.com wrote:
> @@ -382,6 +397,30 @@ If VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD protocol feature is negotiated,
>  slave can send file descriptors (at most 8 descriptors in each message)
>  to master via ancillary data using this fd communication channel.
>
> +Inflight I/O tracking
> +---------------------
> +
> +To support slave reconnecting, slave need to track inflight I/O in a
> +shared memory. VHOST_USER_GET_INFLIGHT_FD and VHOST_USER_SET_INFLIGHT_FD
> +are used to transfer the memory between master and slave. And to encourage
> +consistency, we provide a recommended format for this memory:

I think we should make a stronger statement and actually
just say what the format is. Not recommend it weakly.

> +
> +offset   width    description
> +0x0      0x400    region for queue0
> +0x400    0x400    region for queue1
> +0x800    0x400    region for queue2
> +...      ...      ...
> +
> +For each virtqueue, we have a 1024 bytes region.

Why is the size hardcoded? Why not a function of VQ size?

> The region's format is like:
> +
> +offset   width    description
> +0x0      0x1      descriptor 0 is in use or not
> +0x1      0x1      descriptor 1 is in use or not
> +0x2      0x1      descriptor 2 is in use or not
> +...      ...      ...
> +
> +For each descriptor, we use one byte to specify whether it's in use or not.
> +
>  Protocol features
>  -----------------
>

I think that it's a good idea to have a version in this region.
Otherwise how are you going to handle compatibility when
this needs to be extended?
On Tue, 15 Jan 2019 at 06:25, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jan 09, 2019 at 07:27:23PM +0800, elohimes@gmail.com wrote:
> > @@ -382,6 +397,30 @@ If VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD protocol feature is negotiated,
> >  slave can send file descriptors (at most 8 descriptors in each message)
> >  to master via ancillary data using this fd communication channel.
> >
> > +Inflight I/O tracking
> > +---------------------
> > +
> > +To support slave reconnecting, slave need to track inflight I/O in a
> > +shared memory. VHOST_USER_GET_INFLIGHT_FD and VHOST_USER_SET_INFLIGHT_FD
> > +are used to transfer the memory between master and slave. And to encourage
> > +consistency, we provide a recommended format for this memory:
>
> I think we should make a stronger statement and actually
> just say what the format is. Not recommend it weakly.
>

Okay, will do it.

> > +
> > +offset   width    description
> > +0x0      0x400    region for queue0
> > +0x400    0x400    region for queue1
> > +0x800    0x400    region for queue2
> > +...      ...      ...
> > +
> > +For each virtqueue, we have a 1024 bytes region.
>
>
> Why is the size hardcoded? Why not a function of VQ size?
>

Sorry, I didn't get your point. Should the region's size be fixed? Do
you mean we need to document a function for the region's size?

> > The region's format is like:
> > +
> > +offset   width    description
> > +0x0      0x1      descriptor 0 is in use or not
> > +0x1      0x1      descriptor 1 is in use or not
> > +0x2      0x1      descriptor 2 is in use or not
> > +...      ...      ...
> > +
> > +For each descriptor, we use one byte to specify whether it's in use or not.
> > +
> >  Protocol features
> >  -----------------
> >
>
> I think that it's a good idea to have a version in this region.
> Otherwise how are you going to handle compatibility when
> this needs to be extended?
>

I have put the version into the message's payload: VhostUserInflight. Is it OK?

Thanks,
Yongji
On Tue, Jan 15, 2019 at 02:46:42PM +0800, Yongji Xie wrote:
> On Tue, 15 Jan 2019 at 06:25, Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Wed, Jan 09, 2019 at 07:27:23PM +0800, elohimes@gmail.com wrote:
> > > @@ -382,6 +397,30 @@ If VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD protocol feature is negotiated,
> > >  slave can send file descriptors (at most 8 descriptors in each message)
> > >  to master via ancillary data using this fd communication channel.
> > >
> > > +Inflight I/O tracking
> > > +---------------------
> > > +
> > > +To support slave reconnecting, slave need to track inflight I/O in a
> > > +shared memory. VHOST_USER_GET_INFLIGHT_FD and VHOST_USER_SET_INFLIGHT_FD
> > > +are used to transfer the memory between master and slave. And to encourage
> > > +consistency, we provide a recommended format for this memory:
> >
> > I think we should make a stronger statement and actually
> > just say what the format is. Not recommend it weakly.
> >
>
> Okay, will do it.
>
> > > +
> > > +offset   width    description
> > > +0x0      0x400    region for queue0
> > > +0x400    0x400    region for queue1
> > > +0x800    0x400    region for queue2
> > > +...      ...      ...
> > > +
> > > +For each virtqueue, we have a 1024 bytes region.
> >
> >
> > Why is the size hardcoded? Why not a function of VQ size?
> >
>
> Sorry, I didn't get your point. Should the region's size be fixed? Do
> you mean we need to document a function for the region's size?

Well you are saying 0x0 to 0x400 is for queue0.
How do you know that's enough? And why are 0x400
bytes necessary? After all max queue size can be very small.

> > > The region's format is like:
> > > +
> > > +offset   width    description
> > > +0x0      0x1      descriptor 0 is in use or not
> > > +0x1      0x1      descriptor 1 is in use or not
> > > +0x2      0x1      descriptor 2 is in use or not
> > > +...      ...      ...
> > > +
> > > +For each descriptor, we use one byte to specify whether it's in use or not.
> > > +
> > >  Protocol features
> > >  -----------------
> > >
> >
> > I think that it's a good idea to have a version in this region.
> > Otherwise how are you going to handle compatibility when
> > this needs to be extended?
> >
>
> I have put the version into the message's payload: VhostUserInflight. Is it OK?
>
> Thanks,
> Yongji

I'm not sure I like it. So is qemu expected to maintain it? Reset it?
Also don't you want to be able to detect that qemu has reset the buffer?
If we have version 1 at a known offset that can serve both purposes.
Given it only has value within the buffer why not store it there?
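The idea of keeping the version at a known offset inside the buffer, so that a zeroed (reset) buffer can be told apart from one carrying state, could be sketched as follows. This is only an illustration and not code from the series: `struct inflight_hdr`, `INFLIGHT_VERSION` and `inflight_attach()` are hypothetical names, and the sketch assumes the version field sits at offset 0 and that a reset buffer reads as version 0.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header at offset 0 of the shared inflight buffer. */
struct inflight_hdr {
    uint16_t version;   /* 0 is assumed to mean "reset or never initialised" */
};

#define INFLIGHT_VERSION 1  /* assumed layout version, not from the patch */

/*
 * Backend side, called after mapping the buffer.
 * Returns  1 if the buffer carries state written by a previous run,
 *          0 if it was zeroed and has now been initialised,
 *         -1 if it is too small or holds an unknown version.
 */
static int inflight_attach(void *buf, size_t len)
{
    struct inflight_hdr hdr;

    assert(buf != NULL);
    if (len < sizeof(hdr)) {
        return -1;
    }

    memcpy(&hdr, buf, sizeof(hdr));
    if (hdr.version == 0) {
        /* Fresh or reset buffer: stamp it with the version we write. */
        hdr.version = INFLIGHT_VERSION;
        memcpy(buf, &hdr, sizeof(hdr));
        return 0;
    }

    return hdr.version == INFLIGHT_VERSION ? 1 : -1;
}
```

With this arrangement a single field answers both questions raised above: which layout the buffer uses, and whether the other side has cleared it since the backend last wrote to it.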
On Tue, 15 Jan 2019 at 20:54, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Jan 15, 2019 at 02:46:42PM +0800, Yongji Xie wrote:
> > On Tue, 15 Jan 2019 at 06:25, Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Wed, Jan 09, 2019 at 07:27:23PM +0800, elohimes@gmail.com wrote:
> > > > @@ -382,6 +397,30 @@ If VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD protocol feature is negotiated,
> > > >  slave can send file descriptors (at most 8 descriptors in each message)
> > > >  to master via ancillary data using this fd communication channel.
> > > >
> > > > +Inflight I/O tracking
> > > > +---------------------
> > > > +
> > > > +To support slave reconnecting, slave need to track inflight I/O in a
> > > > +shared memory. VHOST_USER_GET_INFLIGHT_FD and VHOST_USER_SET_INFLIGHT_FD
> > > > +are used to transfer the memory between master and slave. And to encourage
> > > > +consistency, we provide a recommended format for this memory:
> > >
> > > I think we should make a stronger statement and actually
> > > just say what the format is. Not recommend it weakly.
> > >
> >
> > Okay, will do it.
> >
> > > > +
> > > > +offset   width    description
> > > > +0x0      0x400    region for queue0
> > > > +0x400    0x400    region for queue1
> > > > +0x800    0x400    region for queue2
> > > > +...      ...      ...
> > > > +
> > > > +For each virtqueue, we have a 1024 bytes region.
> > >
> > >
> > > Why is the size hardcoded? Why not a function of VQ size?
> > >
> >
> > Sorry, I didn't get your point. Should the region's size be fixed? Do
> > you mean we need to document a function for the region's size?
>
>
> Well you are saying 0x0 to 0x400 is for queue0.
> How do you know that's enough? And why are 0x400
> bytes necessary? After all max queue size can be very small.
>
>

OK, I think I get your point. So we need something like:

region's size = max_queue_size * 32 byte + xxx byte (if any)

Right?

>
> > > > The region's format is like:
> > > > +
> > > > +offset   width    description
> > > > +0x0      0x1      descriptor 0 is in use or not
> > > > +0x1      0x1      descriptor 1 is in use or not
> > > > +0x2      0x1      descriptor 2 is in use or not
> > > > +...      ...      ...
> > > > +
> > > > +For each descriptor, we use one byte to specify whether it's in use or not.
> > > > +
> > > >  Protocol features
> > > >  -----------------
> > > >
> > >
> > > I think that it's a good idea to have a version in this region.
> > > Otherwise how are you going to handle compatibility when
> > > this needs to be extended?
> > >
> >
> > I have put the version into the message's payload: VhostUserInflight. Is it OK?
> >
> > Thanks,
> > Yongji
>
> I'm not sure I like it. So is qemu expected to maintain it? Reset it?
> Also don't you want to be able to detect that qemu has reset the buffer?
> If we have version 1 at a known offset that can serve both purposes.
> Given it only has value within the buffer why not store it there?
>

Yes, that looks better. Will update it in v5.

Thanks,
Yongji
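A region size derived from the queue size, as discussed above, might be computed along these lines. This is a rough sketch under stated assumptions rather than anything from the patch: `align_up()`, `inflight_region_size()` and `inflight_mmap_size()` are hypothetical helpers, the per-descriptor and header byte counts are left as parameters because the thread has not settled them, and `align` is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to a multiple of align; align must be a power of two. */
static uint64_t align_up(uint64_t v, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (v + align - 1) & ~(align - 1);
}

/*
 * Size of one queue's region: an optional fixed header plus a fixed number
 * of bytes per descriptor, padded so the next region starts aligned.
 */
static uint64_t inflight_region_size(uint16_t queue_size, uint32_t desc_bytes,
                                     uint32_t hdr_bytes, uint32_t align)
{
    return align_up((uint64_t)queue_size * desc_bytes + hdr_bytes, align);
}

/* Total size of the shared area when every queue uses the same parameters. */
static uint64_t inflight_mmap_size(uint16_t num_queues, uint16_t queue_size,
                                   uint32_t desc_bytes, uint32_t hdr_bytes,
                                   uint32_t align)
{
    return (uint64_t)num_queues *
           inflight_region_size(queue_size, desc_bytes, hdr_bytes, align);
}
```

For comparison, with one byte per descriptor and no header, the fixed 0x400 region in the posted text corresponds to a queue size of exactly 1024; a queue of 256 entries would need only 256 bytes.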
On Tue, 15 Jan 2019 at 22:18, Yongji Xie <elohimes@gmail.com> wrote:
>
> On Tue, 15 Jan 2019 at 20:54, Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Jan 15, 2019 at 02:46:42PM +0800, Yongji Xie wrote:
> > > On Tue, 15 Jan 2019 at 06:25, Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Wed, Jan 09, 2019 at 07:27:23PM +0800, elohimes@gmail.com wrote:
> > > > > @@ -382,6 +397,30 @@ If VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD protocol feature is negotiated,
> > > > >  slave can send file descriptors (at most 8 descriptors in each message)
> > > > >  to master via ancillary data using this fd communication channel.
> > > > >
> > > > > +Inflight I/O tracking
> > > > > +---------------------
> > > > > +
> > > > > +To support slave reconnecting, slave need to track inflight I/O in a
> > > > > +shared memory. VHOST_USER_GET_INFLIGHT_FD and VHOST_USER_SET_INFLIGHT_FD
> > > > > +are used to transfer the memory between master and slave. And to encourage
> > > > > +consistency, we provide a recommended format for this memory:
> > > >
> > > > I think we should make a stronger statement and actually
> > > > just say what the format is. Not recommend it weakly.
> > > >
> > >
> > > Okay, will do it.
> > >
> > > > > +
> > > > > +offset   width    description
> > > > > +0x0      0x400    region for queue0
> > > > > +0x400    0x400    region for queue1
> > > > > +0x800    0x400    region for queue2
> > > > > +...      ...      ...
> > > > > +
> > > > > +For each virtqueue, we have a 1024 bytes region.
> > > >
> > > >
> > > > Why is the size hardcoded? Why not a function of VQ size?
> > > >
> > >
> > > Sorry, I didn't get your point. Should the region's size be fixed? Do
> > > you mean we need to document a function for the region's size?
> >
> >
> > Well you are saying 0x0 to 0x400 is for queue0.
> > How do you know that's enough? And why are 0x400
> > bytes necessary? After all max queue size can be very small.
> >
> >
>
> OK, I think I get your point. So we need something like:
>
> region's size = max_queue_size * 32 byte + xxx byte (if any)
>
> Right?
>
> >
> > > > > The region's format is like:
> > > > > +
> > > > > +offset   width    description
> > > > > +0x0      0x1      descriptor 0 is in use or not
> > > > > +0x1      0x1      descriptor 1 is in use or not
> > > > > +0x2      0x1      descriptor 2 is in use or not
> > > > > +...      ...      ...
> > > > > +
> > > > > +For each descriptor, we use one byte to specify whether it's in use or not.
> > > > > +
> > > > >  Protocol features
> > > > >  -----------------
> > > > >
> > > >
> > > > I think that it's a good idea to have a version in this region.
> > > > Otherwise how are you going to handle compatibility when
> > > > this needs to be extended?
> > > >
> > >
> > > I have put the version into the message's payload: VhostUserInflight. Is it OK?
> > >
> > > Thanks,
> > > Yongji
> >
> > I'm not sure I like it. So is qemu expected to maintain it? Reset it?
> > Also don't you want to be able to detect that qemu has reset the buffer?
> > If we have version 1 at a known offset that can serve both purposes.
> > Given it only has value within the buffer why not store it there?
> >
>
> Yes, that looks better. Will update it in v5.
>

Hi Michael,

I found a problem while implementing this. If we put the version into
the shared buffer, QEMU will reset it on VM reset. If the backend
restarts at the same time, the version of this buffer will be lost. So
maybe QEMU still needs to maintain it.

Thanks,
Yongji
diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
index c2194711d9..67da41fdd2 100644
--- a/docs/interop/vhost-user.txt
+++ b/docs/interop/vhost-user.txt
@@ -142,6 +142,18 @@ Depending on the request type, payload can be:
    Offset: a 64-bit offset of this area from the start of the
        supplied file descriptor
 
+ * Inflight description
+   ----------------------------------------------------------
+   | mmap size | mmap offset | align | num queues | version |
+   ----------------------------------------------------------
+
+   mmap size: a 64-bit size of area to track inflight I/O
+   mmap offset: a 64-bit offset of this area from the start
+                of the supplied file descriptor
+   align: a 32-bit align of each region in this area
+   num queues: a 16-bit number of virtqueues
+   version: a 16-bit version of this area
+
 In QEMU the vhost-user message is implemented with the following struct:
 
 typedef struct VhostUserMsg {
@@ -157,6 +169,7 @@ typedef struct VhostUserMsg {
         struct vhost_iotlb_msg iotlb;
         VhostUserConfig config;
         VhostUserVringArea area;
+        VhostUserInflight inflight;
     };
 } QEMU_PACKED VhostUserMsg;
 
@@ -175,6 +188,7 @@ the ones that do:
  * VHOST_USER_GET_PROTOCOL_FEATURES
  * VHOST_USER_GET_VRING_BASE
  * VHOST_USER_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
+ * VHOST_USER_GET_INFLIGHT_FD (if VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD)
 
 [ Also see the section on REPLY_ACK protocol extension. ]
 
@@ -188,6 +202,7 @@ in the ancillary data:
  * VHOST_USER_SET_VRING_CALL
  * VHOST_USER_SET_VRING_ERR
  * VHOST_USER_SET_SLAVE_REQ_FD
+ * VHOST_USER_SET_INFLIGHT_FD (if VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD)
 
 If Master is unable to send the full message or receives a wrong reply it will
 close the connection. An optional reconnection mechanism can be implemented.
@@ -382,6 +397,30 @@ If VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD protocol feature is negotiated,
 slave can send file descriptors (at most 8 descriptors in each message)
 to master via ancillary data using this fd communication channel.
 
+Inflight I/O tracking
+---------------------
+
+To support slave reconnecting, slave need to track inflight I/O in a
+shared memory. VHOST_USER_GET_INFLIGHT_FD and VHOST_USER_SET_INFLIGHT_FD
+are used to transfer the memory between master and slave. And to encourage
+consistency, we provide a recommended format for this memory:
+
+offset   width    description
+0x0      0x400    region for queue0
+0x400    0x400    region for queue1
+0x800    0x400    region for queue2
+...      ...      ...
+
+For each virtqueue, we have a 1024 bytes region. The region's format is like:
+
+offset   width    description
+0x0      0x1      descriptor 0 is in use or not
+0x1      0x1      descriptor 1 is in use or not
+0x2      0x1      descriptor 2 is in use or not
+...      ...      ...
+
+For each descriptor, we use one byte to specify whether it's in use or not.
+
 Protocol features
 -----------------
 
@@ -397,6 +436,7 @@ Protocol features
 #define VHOST_USER_PROTOCOL_F_CONFIG         9
 #define VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD  10
 #define VHOST_USER_PROTOCOL_F_HOST_NOTIFIER  11
+#define VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD 12
 
 Master message types
 --------------------
@@ -761,6 +801,26 @@ Master message types
       was previously sent.
       The value returned is an error indication; 0 is success.
 
+ * VHOST_USER_GET_INFLIGHT_FD
+      Id: 31
+      Equivalent ioctl: N/A
+      Master payload: inflight description
+
+      When VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD protocol feature has been
+      successfully negotiated, this message is submitted by master to get
+      a shared memory from slave. The shared memory will be used to track
+      inflight I/O by slave. Master should clear it when vm reset.
+
+ * VHOST_USER_SET_INFLIGHT_FD
+      Id: 32
+      Equivalent ioctl: N/A
+      Master payload: inflight description
+
+      When VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD protocol feature has been
+      successfully negotiated, this message is submitted by master to send
+      the shared inflight buffer back to slave so that slave could get
+      inflight I/O after a crash or restart.
+
 Slave message types
 -------------------
 
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index e09bed0e4a..4d118c6e14 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -52,6 +52,7 @@ enum VhostUserProtocolFeature {
     VHOST_USER_PROTOCOL_F_CONFIG = 9,
     VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD = 10,
     VHOST_USER_PROTOCOL_F_HOST_NOTIFIER = 11,
+    VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD = 12,
     VHOST_USER_PROTOCOL_F_MAX
 };
 
@@ -89,6 +90,8 @@ typedef enum VhostUserRequest {
     VHOST_USER_POSTCOPY_ADVISE = 28,
     VHOST_USER_POSTCOPY_LISTEN = 29,
     VHOST_USER_POSTCOPY_END = 30,
+    VHOST_USER_GET_INFLIGHT_FD = 31,
+    VHOST_USER_SET_INFLIGHT_FD = 32,
     VHOST_USER_MAX
 } VhostUserRequest;
 
@@ -147,6 +150,14 @@ typedef struct VhostUserVringArea {
     uint64_t offset;
 } VhostUserVringArea;
 
+typedef struct VhostUserInflight {
+    uint64_t mmap_size;
+    uint64_t mmap_offset;
+    uint32_t align;
+    uint16_t num_queues;
+    uint16_t version;
+} VhostUserInflight;
+
 typedef struct {
     VhostUserRequest request;
 
@@ -169,6 +180,7 @@ typedef union {
         VhostUserConfig config;
         VhostUserCryptoSession session;
         VhostUserVringArea area;
+        VhostUserInflight inflight;
 } VhostUserPayload;
 
 typedef struct VhostUserMsg {
@@ -1739,6 +1751,100 @@ static bool vhost_user_mem_section_filter(struct vhost_dev *dev,
     return result;
 }
 
+static int vhost_user_get_inflight_fd(struct vhost_dev *dev,
+                                      struct vhost_inflight *inflight)
+{
+    void *addr;
+    int fd;
+    struct vhost_user *u = dev->opaque;
+    CharBackend *chr = u->user->chr;
+    VhostUserMsg msg = {
+        .hdr.request = VHOST_USER_GET_INFLIGHT_FD,
+        .hdr.flags = VHOST_USER_VERSION,
+        .payload.inflight.num_queues = dev->nvqs,
+        .hdr.size = sizeof(msg.payload.inflight),
+    };
+
+    if (!virtio_has_feature(dev->protocol_features,
+                            VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD)) {
+        return 0;
+    }
+
+    if (vhost_user_write(dev, &msg, NULL, 0) < 0) {
+        return -1;
+    }
+
+    if (vhost_user_read(dev, &msg) < 0) {
+        return -1;
+    }
+
+    if (msg.hdr.request != VHOST_USER_GET_INFLIGHT_FD) {
+        error_report("Received unexpected msg type. "
+                     "Expected %d received %d",
+                     VHOST_USER_GET_INFLIGHT_FD, msg.hdr.request);
+        return -1;
+    }
+
+    if (msg.hdr.size != sizeof(msg.payload.inflight)) {
+        error_report("Received bad msg size.");
+        return -1;
+    }
+
+    if (!msg.payload.inflight.mmap_size) {
+        return 0;
+    }
+
+    fd = qemu_chr_fe_get_msgfd(chr);
+    if (fd < 0) {
+        error_report("Failed to get mem fd");
+        return -1;
+    }
+
+    addr = mmap(0, msg.payload.inflight.mmap_size, PROT_READ | PROT_WRITE,
+                MAP_SHARED, fd, msg.payload.inflight.mmap_offset);
+
+    if (addr == MAP_FAILED) {
+        error_report("Failed to mmap mem fd");
+        close(fd);
+        return -1;
+    }
+
+    inflight->addr = addr;
+    inflight->fd = fd;
+    inflight->size = msg.payload.inflight.mmap_size;
+    inflight->offset = msg.payload.inflight.mmap_offset;
+    inflight->align = msg.payload.inflight.align;
+    inflight->version = msg.payload.inflight.version;
+
+    return 0;
+}
+
+static int vhost_user_set_inflight_fd(struct vhost_dev *dev,
+                                      struct vhost_inflight *inflight)
+{
+    VhostUserMsg msg = {
+        .hdr.request = VHOST_USER_SET_INFLIGHT_FD,
+        .hdr.flags = VHOST_USER_VERSION,
+        .payload.inflight.mmap_size = inflight->size,
+        .payload.inflight.mmap_offset = inflight->offset,
+        .payload.inflight.align = inflight->align,
+        .payload.inflight.num_queues = dev->nvqs,
+        .payload.inflight.version = inflight->version,
+        .hdr.size = sizeof(msg.payload.inflight),
+    };
+
+    if (!virtio_has_feature(dev->protocol_features,
+                            VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD)) {
+        return 0;
+    }
+
+    if (vhost_user_write(dev, &msg, &inflight->fd, 1) < 0) {
+        return -1;
+    }
+
+    return 0;
+}
+
 VhostUserState *vhost_user_init(void)
 {
     VhostUserState *user = g_new0(struct VhostUserState, 1);
@@ -1790,4 +1896,6 @@ const VhostOps user_ops = {
         .vhost_crypto_create_session = vhost_user_crypto_create_session,
         .vhost_crypto_close_session = vhost_user_crypto_close_session,
         .vhost_backend_mem_section_filter = vhost_user_mem_section_filter,
+        .vhost_get_inflight_fd = vhost_user_get_inflight_fd,
+        .vhost_set_inflight_fd = vhost_user_set_inflight_fd,
 };
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 569c4053ea..730f436692 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1481,6 +1481,114 @@ void vhost_dev_set_config_notifier(struct vhost_dev *hdev,
     hdev->config_ops = ops;
 }
 
+void vhost_dev_reset_inflight(struct vhost_inflight *inflight)
+{
+    if (inflight->addr) {
+        memset(inflight->addr, 0, inflight->size);
+    }
+}
+
+void vhost_dev_free_inflight(struct vhost_inflight *inflight)
+{
+    if (inflight->addr) {
+        qemu_memfd_free(inflight->addr, inflight->size, inflight->fd);
+        inflight->addr = NULL;
+        inflight->fd = -1;
+    }
+}
+
+static int vhost_dev_resize_inflight(struct vhost_inflight *inflight,
+                                     uint64_t new_size)
+{
+    Error *err = NULL;
+    int fd = -1;
+    void *addr = qemu_memfd_alloc("vhost-inflight", new_size,
+                                  F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL,
+                                  &fd, &err);
+
+    if (err) {
+        error_report_err(err);
+        return -1;
+    }
+
+    vhost_dev_free_inflight(inflight);
+    inflight->offset = 0;
+    inflight->addr = addr;
+    inflight->fd = fd;
+    inflight->size = new_size;
+
+    return 0;
+}
+
+void vhost_dev_save_inflight(struct vhost_inflight *inflight, QEMUFile *f)
+{
+    if (inflight->addr) {
+        qemu_put_be64(f, inflight->size);
+        qemu_put_be64(f, inflight->offset);
+        qemu_put_be32(f, inflight->align);
+        qemu_put_be16(f, inflight->version);
+        qemu_put_buffer(f, inflight->addr, inflight->size);
+    } else {
+        qemu_put_be64(f, 0);
+    }
+}
+
+int vhost_dev_load_inflight(struct vhost_inflight *inflight, QEMUFile *f)
+{
+    uint64_t size;
+
+    size = qemu_get_be64(f);
+    if (!size) {
+        return 0;
+    }
+
+    if (inflight->size != size) {
+        if (vhost_dev_resize_inflight(inflight, size)) {
+            return -1;
+        }
+    }
+    inflight->size = size;
+    inflight->offset = qemu_get_be64(f);
+    inflight->align = qemu_get_be32(f);
+    inflight->version = qemu_get_be16(f);
+
+    qemu_get_buffer(f, inflight->addr, size);
+
+    return 0;
+}
+
+int vhost_dev_set_inflight(struct vhost_dev *dev,
+                           struct vhost_inflight *inflight)
+{
+    int r;
+
+    if (dev->vhost_ops->vhost_set_inflight_fd && inflight->addr) {
+        r = dev->vhost_ops->vhost_set_inflight_fd(dev, inflight);
+        if (r) {
+            VHOST_OPS_DEBUG("vhost_set_inflight_fd failed");
+            return -errno;
+        }
+    }
+
+    return 0;
+}
+
+int vhost_dev_get_inflight(struct vhost_dev *dev,
+                           struct vhost_inflight *inflight)
+{
+    int r;
+
+    if (dev->vhost_ops->vhost_get_inflight_fd) {
+        r = dev->vhost_ops->vhost_get_inflight_fd(dev, inflight);
+        if (r) {
+            VHOST_OPS_DEBUG("vhost_get_inflight_fd failed");
+            return -errno;
+        }
+    }
+
+    return 0;
+}
+
 /* Host notifiers must be enabled at this point. */
 int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
 {
diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
index 81283ec50f..97676bd237 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -25,6 +25,7 @@ typedef enum VhostSetConfigType {
     VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
 } VhostSetConfigType;
 
+struct vhost_inflight;
 struct vhost_dev;
 struct vhost_log;
 struct vhost_memory;
@@ -104,6 +105,12 @@ typedef int (*vhost_crypto_close_session_op)(struct vhost_dev *dev,
 typedef bool (*vhost_backend_mem_section_filter_op)(struct vhost_dev *dev,
                                                 MemoryRegionSection *section);
 
+typedef int (*vhost_get_inflight_fd_op)(struct vhost_dev *dev,
+                                        struct vhost_inflight *inflight);
+
+typedef int (*vhost_set_inflight_fd_op)(struct vhost_dev *dev,
+                                        struct vhost_inflight *inflight);
+
 typedef struct VhostOps {
     VhostBackendType backend_type;
     vhost_backend_init vhost_backend_init;
@@ -142,6 +149,8 @@ typedef struct VhostOps {
     vhost_crypto_create_session_op vhost_crypto_create_session;
     vhost_crypto_close_session_op vhost_crypto_close_session;
     vhost_backend_mem_section_filter_op vhost_backend_mem_section_filter;
+    vhost_get_inflight_fd_op vhost_get_inflight_fd;
+    vhost_set_inflight_fd_op vhost_set_inflight_fd;
 } VhostOps;
 
 extern const VhostOps user_ops;
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index a7f449fa87..0a71596d8b 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -7,6 +7,16 @@
 #include "exec/memory.h"
 
 /* Generic structures common for any vhost based device. */
+
+struct vhost_inflight {
+    int fd;
+    void *addr;
+    uint64_t size;
+    uint64_t offset;
+    uint32_t align;
+    uint16_t version;
+};
+
 struct vhost_virtqueue {
     int kick;
     int call;
@@ -120,4 +130,13 @@ int vhost_dev_set_config(struct vhost_dev *dev, const uint8_t *data,
  */
 void vhost_dev_set_config_notifier(struct vhost_dev *dev,
                                    const VhostDevConfigOps *ops);
+
+void vhost_dev_reset_inflight(struct vhost_inflight *inflight);
+void vhost_dev_free_inflight(struct vhost_inflight *inflight);
+void vhost_dev_save_inflight(struct vhost_inflight *inflight, QEMUFile *f);
+int vhost_dev_load_inflight(struct vhost_inflight *inflight, QEMUFile *f);
+int vhost_dev_set_inflight(struct vhost_dev *dev,
+                           struct vhost_inflight *inflight);
+int vhost_dev_get_inflight(struct vhost_dev *dev,
+                           struct vhost_inflight *inflight);
 #endif
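To make the documented layout concrete (a 0x400-byte region per queue, one byte per descriptor), a backend could address and scan the shared memory roughly as below. This is a minimal sketch of the layout as posted in this version of the patch, not backend code from the series: `inflight_slot()`, `inflight_mark()` and `inflight_scan()` are hypothetical helpers, and a real backend would additionally need to order these writes against its used-ring updates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INFLIGHT_REGION_SIZE 0x400  /* per-queue region size from the posted doc */

/* Address of the one-byte flag for descriptor `desc` of queue `queue`. */
static uint8_t *inflight_slot(uint8_t *base, unsigned queue, unsigned desc)
{
    assert(desc < INFLIGHT_REGION_SIZE);
    return base + (size_t)queue * INFLIGHT_REGION_SIZE + desc;
}

/* Mark a descriptor as in use when it is fetched, clear it on completion. */
static void inflight_mark(uint8_t *base, unsigned queue, unsigned desc,
                          int in_use)
{
    *inflight_slot(base, queue, desc) = in_use ? 1 : 0;
}

/*
 * After a reconnect: collect the descriptors of one queue that are still
 * marked in use so they can be resubmitted. Returns how many were stored.
 */
static size_t inflight_scan(uint8_t *base, unsigned queue, unsigned queue_size,
                            uint16_t *out, size_t max)
{
    size_t n = 0;

    for (unsigned i = 0; i < queue_size && n < max; i++) {
        if (*inflight_slot(base, queue, i)) {
            out[n++] = (uint16_t)i;
        }
    }
    return n;
}
```

Here `base` stands for the address the backend obtained by mapping the fd received with VHOST_USER_SET_INFLIGHT_FD at the given mmap offset.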