diff mbox series

vfio/pci: Document the MSI[X] resize side effects properly

Message ID 87im23bh72.ffs@nanos.tec.linutronix.de (mailing list archive)
State New, archived

Commit Message

Thomas Gleixner June 24, 2021, 12:06 p.m. UTC
The documentation of VFIO_IRQ_INFO_NORESIZE is inaccurate as it suggests
that it is safe to dynamically add new MSI-X vectors even when
previously allocated vectors are already in use and enabled.

Enabling additional vectors is possible according to the MSI-X specification,
but the kernel does not have any mechanisms today to do that safely.

The only available mechanism is to tear down the already active vectors
and to set up the full vector set afterwards.

This requires temporarily disabling MSI-X, which redirects any interrupt
raised by the device during this time to the legacy PCI/INTX, which is
not handled, so the interrupt is lost.

Update the documentation of VFIO_IRQ_INFO_NORESIZE accordingly.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/uapi/linux/vfio.h |   17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

Comments

Alex Williamson June 24, 2021, 10:22 p.m. UTC | #1
On Thu, 24 Jun 2021 14:06:09 +0200
Thomas Gleixner <tglx@linutronix.de> wrote:

> The documentation of VFIO_IRQ_INFO_NORESIZE is inaccurate as it suggests
> that it is safe to dynamically add new MSI-X vectors even when
> previously allocated vectors are already in use and enabled.
> 
> Enabling additional vectors is possible according to the MSI-X specification,
> but the kernel does not have any mechanisms today to do that safely.
> 
> The only available mechanism is to tear down the already active vectors
> and to set up the full vector set afterwards.
> 
> This requires temporarily disabling MSI-X, which redirects any interrupt
> raised by the device during this time to the legacy PCI/INTX, which is
> not handled, so the interrupt is lost.
> 
> Update the documentation of VFIO_IRQ_INFO_NORESIZE accordingly.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  include/uapi/linux/vfio.h |   17 +++++++++++++----
>  1 file changed, 13 insertions(+), 4 deletions(-)
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -699,10 +699,19 @@ struct vfio_region_info_cap_nvlink2_lnks
>   * disabling the entire index.  This is used for interrupts like PCI MSI
>   * and MSI-X where the driver may only use a subset of the available
>   * indexes, but VFIO needs to enable a specific number of vectors
> - * upfront.  In the case of MSI-X, where the user can enable MSI-X and
> - * then add and unmask vectors, it's up to userspace to make the decision
> - * whether to allocate the maximum supported number of vectors or tear
> - * down setup and incrementally increase the vectors as each is enabled.
> + * upfront.
> + *
> + * MSI cannot be resized safely when interrupts are in use already because
> + * resizing requires temporary disablement of MSI for updating the relevant
> + * PCI config space entries. Disabling MSI redirects an interrupt raised by
> + * the device during this time to the unhandled legacy PCI/INTX, which
> + * means the interrupt is lost.
> + *
> + * Enabling additional vectors for MSI-X is possible at least from the
> + * perspective of the MSI-X specification, but not supported by the
> + * existing PCI/MSI-X mechanisms in the kernel. The kernel currently
> + * provides only a full teardown/setup cycle, which requires disabling
> + * MSI-X temporarily with the same side effects as for MSI.
>   */
>  struct vfio_irq_info {
>  	__u32	argsz;
> 

There's good information here, but as per my other reply I think
NORESIZE might be only a host implementation issue for both MSI and
MSI-X.

I'd also rather not focus on that existing implementation in this
header, which is essentially the uAPI spec, because that implementation
can change and we're unlikely to remember to update the description
here.  We might even be describing a device that emulates MSI-X in such
a way that it's not bound by this limitation.  For example maybe Intel's
emulation of MSI-X backed by IMS wouldn't need this flag and we could
update QEMU to finally have a branch that avoids the teardown/setup.
We have a flag to indicate this behavior; consequences should be
relative to the presence of that flag.
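
To make "relative to the presence of that flag" concrete, here is a minimal
sketch of the decision a userspace driver might make before growing the
number of enabled vectors.  It is not taken from QEMU or the kernel.  The
SKETCH_* names and sketch_plan_resize() are hypothetical; only the bit
value is meant to match VFIO_IRQ_INFO_NORESIZE from
include/uapi/linux/vfio.h, redefined locally so the example builds without
kernel headers.

```c
#include <stdint.h>

/*
 * Assumption: same bit as VFIO_IRQ_INFO_NORESIZE in the uAPI header,
 * redefined under a hypothetical name to keep the sketch self-contained.
 */
#define SKETCH_IRQ_INFO_NORESIZE	(1u << 3)

/* Hypothetical result type, not part of any real API. */
enum sketch_resize_plan {
	SKETCH_RESIZE_NONE,	/* enough vectors are enabled already */
	SKETCH_RESIZE_EXTEND,	/* enable more without touching active ones */
	SKETCH_RESIZE_TEARDOWN,	/* disable the whole index, then set it up again */
};

/*
 * irq_flags: flags as reported for the IRQ index (e.g. by
 *            VFIO_DEVICE_GET_IRQ_INFO)
 * enabled:   number of vectors currently enabled
 * wanted:    number of vectors the driver now needs
 */
static enum sketch_resize_plan
sketch_plan_resize(uint32_t irq_flags, uint32_t enabled, uint32_t wanted)
{
	if (wanted <= enabled)
		return SKETCH_RESIZE_NONE;

	/* Nothing is active yet, so there is nothing to tear down. */
	if (enabled == 0)
		return SKETCH_RESIZE_EXTEND;

	/*
	 * With NORESIZE set, new subindexes cannot be enabled without first
	 * disabling the entire index, so interrupts raised in that window
	 * can be lost.
	 */
	if (irq_flags & SKETCH_IRQ_INFO_NORESIZE)
		return SKETCH_RESIZE_TEARDOWN;

	return SKETCH_RESIZE_EXTEND;
}
```

A caller would pass the flags field of struct vfio_irq_info for the MSI or
MSI-X index.  If the result is SKETCH_RESIZE_TEARDOWN, the caller either
accepts the lost-interrupt window or avoids it by allocating the maximum
number of vectors upfront.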

Finally a nit, I don't really see a strong case that the existing text
is actually inaccurate or implying some safety against lost interrupts.
It's actually making note of the issue here already, though the more
explicit description is welcome.  Thanks,

Alex
Patch

--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -699,10 +699,19 @@  struct vfio_region_info_cap_nvlink2_lnks
  * disabling the entire index.  This is used for interrupts like PCI MSI
  * and MSI-X where the driver may only use a subset of the available
  * indexes, but VFIO needs to enable a specific number of vectors
- * upfront.  In the case of MSI-X, where the user can enable MSI-X and
- * then add and unmask vectors, it's up to userspace to make the decision
- * whether to allocate the maximum supported number of vectors or tear
- * down setup and incrementally increase the vectors as each is enabled.
+ * upfront.
+ *
+ * MSI cannot be resized safely when interrupts are in use already because
+ * resizing requires temporary disablement of MSI for updating the relevant
+ * PCI config space entries. Disabling MSI redirects an interrupt raised by
+ * the device during this time to the unhandled legacy PCI/INTX, which
+ * means the interrupt is lost.
+ *
+ * Enabling additional vectors for MSI-X is possible at least from the
+ * perspective of the MSI-X specification, but not supported by the
+ * existing PCI/MSI-X mechanisms in the kernel. The kernel currently
+ * provides only a full teardown/setup cycle, which requires disabling
+ * MSI-X temporarily with the same side effects as for MSI.
  */
 struct vfio_irq_info {
 	__u32	argsz;