diff mbox series

[1/7] vfio/pci: Disable auto-enable of exclusive INTx IRQ

Message ID 20240306211445.1856768-2-alex.williamson@redhat.com (mailing list archive)
State New, archived
Headers show
Series vfio: Interrupt eventfd hardening | expand

Commit Message

Alex Williamson March 6, 2024, 9:14 p.m. UTC
Currently for devices requiring masking at the irqchip for INTx, ie.
devices without DisINTx support, the IRQ is enabled in request_irq()
and subsequently disabled as necessary to align with the masked status
flag.  This presents a window where the interrupt could fire between
these events, resulting in the IRQ incrementing the disable depth twice.
This would be unrecoverable for a user since the masked flag prevents
nested enables through vfio.

Instead, invert the logic using IRQF_NO_AUTOEN such that exclusive INTx
is never auto-enabled, then unmask as required.

Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/vfio/pci/vfio_pci_intrs.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

Comments

Tian, Kevin March 7, 2024, 8:28 a.m. UTC | #1
> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Thursday, March 7, 2024 5:15 AM
> 
> Currently for devices requiring masking at the irqchip for INTx, ie.
> devices without DisINTx support, the IRQ is enabled in request_irq()
> and subsequently disabled as necessary to align with the masked status
> flag.  This presents a window where the interrupt could fire between
> these events, resulting in the IRQ incrementing the disable depth twice.

Can you elaborate the last point about disable depth?

> This would be unrecoverable for a user since the masked flag prevents
> nested enables through vfio.

What is 'nested enables'?

> 
> Instead, invert the logic using IRQF_NO_AUTOEN such that exclusive INTx
> is never auto-enabled, then unmask as required.
> 
> Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver")
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

But this patch looks good to me:

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

with one nit...

> 
> +	/*
> +	 * Devices without DisINTx support require an exclusive interrupt,
> +	 * IRQ masking is performed at the IRQ chip.  The masked status is

"exclusive interrupt, with IRQ masking performed at..."
Tian, Kevin March 7, 2024, 8:39 a.m. UTC | #2
> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Thursday, March 7, 2024 5:15 AM
> 
> Currently for devices requiring masking at the irqchip for INTx, ie.
> devices without DisINTx support, the IRQ is enabled in request_irq()
> and subsequently disabled as necessary to align with the masked status
> flag.  This presents a window where the interrupt could fire between
> these events, resulting in the IRQ incrementing the disable depth twice.
> This would be unrecoverable for a user since the masked flag prevents
> nested enables through vfio.
> 
> Instead, invert the logic using IRQF_NO_AUTOEN such that exclusive INTx
> is never auto-enabled, then unmask as required.
> 
> Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver")
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

CC stable?
Alex Williamson March 7, 2024, 8:17 p.m. UTC | #3
On Thu, 7 Mar 2024 08:28:45 +0000
"Tian, Kevin" <kevin.tian@intel.com> wrote:

> > From: Alex Williamson <alex.williamson@redhat.com>
> > Sent: Thursday, March 7, 2024 5:15 AM
> > 
> > Currently for devices requiring masking at the irqchip for INTx, ie.
> > devices without DisINTx support, the IRQ is enabled in request_irq()
> > and subsequently disabled as necessary to align with the masked status
> > flag.  This presents a window where the interrupt could fire between
> > these events, resulting in the IRQ incrementing the disable depth twice.  
> 
> Can you elaborate the last point about disable depth?

Each irq_desc maintains a depth field, a disable increments the depth,
an enable decrements.  On the disable transition from 0 to 1 the IRQ
chip is disabled, on the enable transition from 1 to 0 the IRQ chip is
enabled.

Therefore if an interrupt fires between request_irq() and
disable_irq(), vfio_intx_handler() will disable the IRQ (depth 0->1).
Note that masked is not tested here, the interrupt is expected to be
exclusive for non-pci_2_3 devices.  @masked would be redundantly set to
true.  The setup call path would increment depth to 2.  The order these
happen is not important so long as the interrupt is in-flight before
the setup path disables the IRQ.

> > This would be unrecoverable for a user since the masked flag prevents
> > nested enables through vfio.  
> 
> What is 'nested enables'?

In the case above we have masked true and disable depth 2.  If the user
now unmasks the interrupt then depth is reduced to 1, the IRQ is still
disabled, and masked is false.  The masked value is now out of sync
with the IRQ line and prevents the user from unmasking again.  The
disable depth is stuck at 1.

Nested enables would be if we allowed the user to unmask a line that we
think is already unmasked.
 
> > Instead, invert the logic using IRQF_NO_AUTOEN such that exclusive INTx
> > is never auto-enabled, then unmask as required.
> > 
> > Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver")
> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>  
> 
> But this patch looks good to me:
> 
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> 
> with one nit...
> 
> > 
> > +	/*
> > +	 * Devices without DisINTx support require an exclusive interrupt,
> > +	 * IRQ masking is performed at the IRQ chip.  The masked status is  
> 
> "exclusive interrupt, with IRQ masking performed at..."

TBH, the difference is too subtle for me.  With my version above you
could replace the comma with a period, I think it has the same meaning.
However, "...exclusive interrupt, with IRQ masking performed at..."
almost suggests that we need a specific type of exclusive interrupt
with this property.  There's nothing unique about the exclusive
interrupt, we could arbitrarily decide we want an exclusive interrupt
for DisINTx masking if we wanted to frustrate a lot of users.

Performing masking at the IRQ chip is actually what necessitates the
exclusive interrupt here.  Thanks,

Alex
Alex Williamson March 7, 2024, 8:23 p.m. UTC | #4
On Thu, 7 Mar 2024 08:39:16 +0000
"Tian, Kevin" <kevin.tian@intel.com> wrote:

> > From: Alex Williamson <alex.williamson@redhat.com>
> > Sent: Thursday, March 7, 2024 5:15 AM
> > 
> > Currently for devices requiring masking at the irqchip for INTx, ie.
> > devices without DisINTx support, the IRQ is enabled in request_irq()
> > and subsequently disabled as necessary to align with the masked status
> > flag.  This presents a window where the interrupt could fire between
> > these events, resulting in the IRQ incrementing the disable depth twice.
> > This would be unrecoverable for a user since the masked flag prevents
> > nested enables through vfio.
> > 
> > Instead, invert the logic using IRQF_NO_AUTOEN such that exclusive INTx
> > is never auto-enabled, then unmask as required.
> > 
> > Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver")
> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>  
> 
> CC stable?

I've always found that having a Fixes: tag is sufficient to get picked
up for stable, so I typically don't do both.  If it helps out someone's
process I'd be happy to though.  Thanks,

Alex
Tian, Kevin March 8, 2024, 7:20 a.m. UTC | #5
> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Friday, March 8, 2024 4:17 AM
> 
> On Thu, 7 Mar 2024 08:28:45 +0000
> "Tian, Kevin" <kevin.tian@intel.com> wrote:
> 
> > > From: Alex Williamson <alex.williamson@redhat.com>
> > > Sent: Thursday, March 7, 2024 5:15 AM
> > >
> > > Currently for devices requiring masking at the irqchip for INTx, ie.
> > > devices without DisINTx support, the IRQ is enabled in request_irq()
> > > and subsequently disabled as necessary to align with the masked status
> > > flag.  This presents a window where the interrupt could fire between
> > > these events, resulting in the IRQ incrementing the disable depth twice.
> >
> > Can you elaborate the last point about disable depth?
> 
> Each irq_desc maintains a depth field, a disable increments the depth,
> an enable decrements.  On the disable transition from 0 to 1 the IRQ
> chip is disabled, on the enable transition from 1 to 0 the IRQ chip is
> enabled.
> 
> Therefore if an interrupt fires between request_irq() and
> disable_irq(), vfio_intx_handler() will disable the IRQ (depth 0->1).
> Note that masked is not tested here, the interrupt is expected to be
> exclusive for non-pci_2_3 devices.  @masked would be redundantly set to
> true.  The setup call path would increment depth to 2.  The order these
> happen is not important so long as the interrupt is in-flight before
> the setup path disables the IRQ.
> 
> > > This would be unrecoverable for a user since the masked flag prevents
> > > nested enables through vfio.
> >
> > What is 'nested enables'?
> 
> In the case above we have masked true and disable depth 2.  If the user
> now unmasks the interrupt then depth is reduced to 1, the IRQ is still
> disabled, and masked is false.  The masked value is now out of sync
> with the IRQ line and prevents the user from unmasking again.  The
> disable depth is stuck at 1.
> 
> Nested enables would be if we allowed the user to unmask a line that we
> think is already unmasked.

Thanks! clear to me now.

> 
> > > Instead, invert the logic using IRQF_NO_AUTOEN such that exclusive INTx
> > > is never auto-enabled, then unmask as required.
> > >
> > > Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver")
> > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> >
> > But this patch looks good to me:
> >
> > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> >
> > with one nit...
> >
> > >
> > > +	/*
> > > +	 * Devices without DisINTx support require an exclusive interrupt,
> > > +	 * IRQ masking is performed at the IRQ chip.  The masked status is
> >
> > "exclusive interrupt, with IRQ masking performed at..."
> 
> TBH, the difference is too subtle for me.  With my version above you
> could replace the comma with a period, I think it has the same meaning.
> However, "...exclusive interrupt, with IRQ masking performed at..."
> almost suggests that we need a specific type of exclusive interrupt
> with this property.  There's nothing unique about the exclusive
> interrupt, we could arbitrarily decide we want an exclusive interrupt
> for DisINTx masking if we wanted to frustrate a lot of users.
> 
> Performing masking at the IRQ chip is actually what necessitates the
> exclusive interrupt here.  Thanks,
> 

make sense. and I saw you replaced the commaon with a period in patch4.
Tian, Kevin March 8, 2024, 7:23 a.m. UTC | #6
> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Friday, March 8, 2024 4:24 AM
> 
> On Thu, 7 Mar 2024 08:39:16 +0000
> "Tian, Kevin" <kevin.tian@intel.com> wrote:
> 
> > > From: Alex Williamson <alex.williamson@redhat.com>
> > > Sent: Thursday, March 7, 2024 5:15 AM
> > >
> > > Currently for devices requiring masking at the irqchip for INTx, ie.
> > > devices without DisINTx support, the IRQ is enabled in request_irq()
> > > and subsequently disabled as necessary to align with the masked status
> > > flag.  This presents a window where the interrupt could fire between
> > > these events, resulting in the IRQ incrementing the disable depth twice.
> > > This would be unrecoverable for a user since the masked flag prevents
> > > nested enables through vfio.
> > >
> > > Instead, invert the logic using IRQF_NO_AUTOEN such that exclusive INTx
> > > is never auto-enabled, then unmask as required.
> > >
> > > Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver")
> > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> >
> > CC stable?
> 
> I've always found that having a Fixes: tag is sufficient to get picked
> up for stable, so I typically don't do both.  If it helps out someone's
> process I'd be happy to though.  Thanks,
> 

According to "Documentation/process/submitting-patches.rst":

  Note: Attaching a Fixes: tag does not subvert the stable kernel rules
  process nor the requirement to Cc: stable@vger.kernel.org on all stable
  patch candidates. For more information, please read
  Documentation/process/stable-kernel-rules.rst.

Probably it's fine as long as the stable kernel maintainers don't complain. 
Alex Williamson March 8, 2024, 5:03 p.m. UTC | #7
On Fri, 8 Mar 2024 07:23:21 +0000
"Tian, Kevin" <kevin.tian@intel.com> wrote:

> > From: Alex Williamson <alex.williamson@redhat.com>
> > Sent: Friday, March 8, 2024 4:24 AM
> > 
> > On Thu, 7 Mar 2024 08:39:16 +0000
> > "Tian, Kevin" <kevin.tian@intel.com> wrote:
> >   
> > > > From: Alex Williamson <alex.williamson@redhat.com>
> > > > Sent: Thursday, March 7, 2024 5:15 AM
> > > >
> > > > Currently for devices requiring masking at the irqchip for INTx, ie.
> > > > devices without DisINTx support, the IRQ is enabled in request_irq()
> > > > and subsequently disabled as necessary to align with the masked status
> > > > flag.  This presents a window where the interrupt could fire between
> > > > these events, resulting in the IRQ incrementing the disable depth twice.
> > > > This would be unrecoverable for a user since the masked flag prevents
> > > > nested enables through vfio.
> > > >
> > > > Instead, invert the logic using IRQF_NO_AUTOEN such that exclusive INTx
> > > > is never auto-enabled, then unmask as required.
> > > >
> > > > Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver")
> > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>  
> > >
> > > CC stable?  
> > 
> > I've always found that having a Fixes: tag is sufficient to get picked
> > up for stable, so I typically don't do both.  If it helps out someone's
> > process I'd be happy to though.  Thanks,
> >   
> 
> According to "Documentation/process/submitting-patches.rst":
> 
>   Note: Attaching a Fixes: tag does not subvert the stable kernel rules
>   process nor the requirement to Cc: stable@vger.kernel.org on all stable
>   patch candidates. For more information, please read
>   Documentation/process/stable-kernel-rules.rst.
> 
> Probably it's fine as long as the stable kernel maintainers don't complain. 
Jason Gunthorpe March 8, 2024, 5:05 p.m. UTC | #8
On Thu, Mar 07, 2024 at 01:23:48PM -0700, Alex Williamson wrote:
> On Thu, 7 Mar 2024 08:39:16 +0000
> "Tian, Kevin" <kevin.tian@intel.com> wrote:
> 
> > > From: Alex Williamson <alex.williamson@redhat.com>
> > > Sent: Thursday, March 7, 2024 5:15 AM
> > > 
> > > Currently for devices requiring masking at the irqchip for INTx, ie.
> > > devices without DisINTx support, the IRQ is enabled in request_irq()
> > > and subsequently disabled as necessary to align with the masked status
> > > flag.  This presents a window where the interrupt could fire between
> > > these events, resulting in the IRQ incrementing the disable depth twice.
> > > This would be unrecoverable for a user since the masked flag prevents
> > > nested enables through vfio.
> > > 
> > > Instead, invert the logic using IRQF_NO_AUTOEN such that exclusive INTx
> > > is never auto-enabled, then unmask as required.
> > > 
> > > Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver")
> > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>  
> > 
> > CC stable?
> 
> I've always found that having a Fixes: tag is sufficient to get picked
> up for stable, so I typically don't do both.  If it helps out someone's
> process I'd be happy to though.  Thanks,

It helps other distros in the ecosystem to flag patches that really
should be backported. Not everyone runs their backport trees as
agressively as a stable does.

Jason
diff mbox series

Patch

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 237beac83809..136101179fcb 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -296,8 +296,15 @@  static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd)
 
 	ctx->trigger = trigger;
 
+	/*
+	 * Devices without DisINTx support require an exclusive interrupt,
+	 * IRQ masking is performed at the IRQ chip.  The masked status is
+	 * protected by vdev->irqlock. Setup the IRQ without auto-enable and
+	 * unmask as necessary below under lock.  DisINTx is unmodified by
+	 * the IRQ configuration and may therefore use auto-enable.
+	 */
 	if (!vdev->pci_2_3)
-		irqflags = 0;
+		irqflags = IRQF_NO_AUTOEN;
 
 	ret = request_irq(pdev->irq, vfio_intx_handler,
 			  irqflags, ctx->name, vdev);
@@ -308,13 +315,9 @@  static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd)
 		return ret;
 	}
 
-	/*
-	 * INTx disable will stick across the new irq setup,
-	 * disable_irq won't.
-	 */
 	spin_lock_irqsave(&vdev->irqlock, flags);
-	if (!vdev->pci_2_3 && ctx->masked)
-		disable_irq_nosync(pdev->irq);
+	if (!vdev->pci_2_3 && !ctx->masked)
+		enable_irq(pdev->irq);
 	spin_unlock_irqrestore(&vdev->irqlock, flags);
 
 	return 0;