diff mbox series

PCI: hv: fix reading of PCI_INTERRUPT_LINE and PCI_INTERRUPT_PIN

Message ID 20240621014815.263590-1-wei.liu@kernel.org (mailing list archive)
State Changes Requested
Delegated to: Krzysztof Wilczyński
Headers show
Series PCI: hv: fix reading of PCI_INTERRUPT_LINE and PCI_INTERRUPT_PIN | expand

Commit Message

Wei Liu June 21, 2024, 1:48 a.m. UTC
The intent of the code snippet is to always return 0 for both fields.
The check is wrong though. Fix that.

This is discovered by this call in VFIO:

    pci_read_config_byte(vdev->pdev, PCI_INTERRUPT_PIN, &pin);

The old code does not set *val to 0 because the second half of the check is
incorrect.

Fixes: 4daace0d8ce85 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")
Cc: stable@kernel.org
Signed-off-by: Wei Liu <wei.liu@kernel.org>
---
 drivers/pci/controller/pci-hyperv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Michael Kelley June 21, 2024, 3:15 a.m. UTC | #1
From: Wei Liu <wei.liu@kernel.org> Sent: Thursday, June 20, 2024 6:48 PM
> 
> The intent of the code snippet is to always return 0 for both fields.
> The check is wrong though. Fix that.
> 
> This is discovered by this call in VFIO:
> 
>     pci_read_config_byte(vdev->pdev, PCI_INTERRUPT_PIN, &pin);
> 
> The old code does not set *val to 0 because the second half of the check is
> incorrect.
> 
> Fixes: 4daace0d8ce85 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V
> VMs")
> Cc: stable@kernel.org
> Signed-off-by: Wei Liu <wei.liu@kernel.org>
> ---
>  drivers/pci/controller/pci-hyperv.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index 5992280e8110..eec087c8f670 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -1130,8 +1130,8 @@ static void _hv_pcifront_read_config(struct hv_pci_dev
> *hpdev, int where,
>  		   PCI_CAPABILITY_LIST) {
>  		/* ROM BARs are unimplemented */
>  		*val = 0;
> -	} else if (where >= PCI_INTERRUPT_LINE && where + size <=
> -		   PCI_INTERRUPT_PIN) {
> +	} else if ((where == PCI_INTERRUPT_LINE || where == PCI_INTERRUPT_PIN) &&
> +		   size == 1) {

Any reason not to continue the pattern of the rest of the function,
and do the following to fix the bug?

   	} else if (where >= PCI_INTERRUPT_LINE && where + size <= 
  		   PCI_MIN_GNT) {

Your fix doesn't allow PCI_INTERRUPT_LINE and PCI_INTERRUPT_PIN
to be read together as a 2-byte access, though I don't know if that
matters.

I have a slight preference for the more consistent approach, but
don't really object to what you've done.  Treat my idea as a
suggestion to consider, but if you want to go with your approach,
that's OK too.

Michael

>  		/*
>  		 * Interrupt Line and Interrupt PIN are hard-wired to zero
>  		 * because this front-end only supports message-signaled
> --
> 2.43.0
>
Wei Liu June 21, 2024, 6:19 a.m. UTC | #2
On Fri, Jun 21, 2024 at 03:15:19AM +0000, Michael Kelley wrote:
> From: Wei Liu <wei.liu@kernel.org> Sent: Thursday, June 20, 2024 6:48 PM
> > 
> > The intent of the code snippet is to always return 0 for both fields.
> > The check is wrong though. Fix that.
> > 
> > This is discovered by this call in VFIO:
> > 
> >     pci_read_config_byte(vdev->pdev, PCI_INTERRUPT_PIN, &pin);
> > 
> > The old code does not set *val to 0 because the second half of the check is
> > incorrect.
> > 
> > Fixes: 4daace0d8ce85 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V
> > VMs")
> > Cc: stable@kernel.org
> > Signed-off-by: Wei Liu <wei.liu@kernel.org>
> > ---
> >  drivers/pci/controller/pci-hyperv.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> > index 5992280e8110..eec087c8f670 100644
> > --- a/drivers/pci/controller/pci-hyperv.c
> > +++ b/drivers/pci/controller/pci-hyperv.c
> > @@ -1130,8 +1130,8 @@ static void _hv_pcifront_read_config(struct hv_pci_dev
> > *hpdev, int where,
> >  		   PCI_CAPABILITY_LIST) {
> >  		/* ROM BARs are unimplemented */
> >  		*val = 0;
> > -	} else if (where >= PCI_INTERRUPT_LINE && where + size <=
> > -		   PCI_INTERRUPT_PIN) {
> > +	} else if ((where == PCI_INTERRUPT_LINE || where == PCI_INTERRUPT_PIN) &&
> > +		   size == 1) {
> 
> Any reason not to continue the pattern of the rest of the function,
> and do the following to fix the bug?
> 
>    	} else if (where >= PCI_INTERRUPT_LINE && where + size <= 
>   		   PCI_MIN_GNT) {
> 
> Your fix doesn't allow PCI_INTERRUPT_LINE and PCI_INTERRUPT_PIN
> to be read together as a 2-byte access, though I don't know if that
> matters.

I don't think this is a sane use case. Someone calls
pci_read_config_word on PCI_INTERRUPT_LINE and then breaks the two
fields out from a word size value.

There is only one in-tree instance attempting to read both fields at the
same time.  And it gets away with it because it immediately writes the
same value back to another register.

(Search for PCI_INTERRUPT_LINE in sound/pci/ctxfi/cthw20k1.c)

The rest of the code always does a byte read.

I had a version that looked like this:

diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 5992280e8110..cdd5be16021d 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -1130,8 +1130,8 @@ static void _hv_pcifront_read_config(struct hv_pci_dev *hpdev, int where,
                   PCI_CAPABILITY_LIST) {
                /* ROM BARs are unimplemented */
                *val = 0;
-       } else if (where >= PCI_INTERRUPT_LINE && where + size <=
-                  PCI_INTERRUPT_PIN) {
+       } else if ((where >= PCI_INTERRUPT_LINE && where + size <= PCI_INTERRUPT_PIN) ||
+                  (where >= PCI_INTERRUPT_PIN && where + size <= PCI_MIN_GNT)) {
                /*
                 * Interrupt Line and Interrupt PIN are hard-wired to zero
                 * because this front-end only supports message-signaled

It was consistent with the old style. But I thought it was a bit too
tedious to read, so I changed to the current version.

At the very least I would like to enforce the separation of the two
fields. 

I can send out the older version tomorrow -- just waiting for others to
chime in.

Thanks,
Wei.


> 
> I have a slight preference for the more consistent approach, but
> don't really object to what you've done.  Treat my idea as a
> suggestion to consider, but if you want to go with your approach,
> that's OK too.
> 
> Michael
> 
> >  		/*
> >  		 * Interrupt Line and Interrupt PIN are hard-wired to zero
> >  		 * because this front-end only supports message-signaled
> > --
> > 2.43.0
> > 
> 
>
Saurabh Singh Sengar June 21, 2024, 11:03 a.m. UTC | #3
On Fri, Jun 21, 2024 at 06:19:05AM +0000, Wei Liu wrote:
> On Fri, Jun 21, 2024 at 03:15:19AM +0000, Michael Kelley wrote:
> > From: Wei Liu <wei.liu@kernel.org> Sent: Thursday, June 20, 2024 6:48 PM
> > > 
> > > The intent of the code snippet is to always return 0 for both fields.
> > > The check is wrong though. Fix that.
> > > 
> > > This is discovered by this call in VFIO:
> > > 
> > >     pci_read_config_byte(vdev->pdev, PCI_INTERRUPT_PIN, &pin);
> > > 
> > > The old code does not set *val to 0 because the second half of the check is
> > > incorrect.
> > > 
> > > Fixes: 4daace0d8ce85 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V
> > > VMs")

12 characters are preferred for Fixes commit id.
'Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")'

> > > Cc: stable@kernel.org
> > > Signed-off-by: Wei Liu <wei.liu@kernel.org>
> > > ---
> > >  drivers/pci/controller/pci-hyperv.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> > > index 5992280e8110..eec087c8f670 100644
> > > --- a/drivers/pci/controller/pci-hyperv.c
> > > +++ b/drivers/pci/controller/pci-hyperv.c
> > > @@ -1130,8 +1130,8 @@ static void _hv_pcifront_read_config(struct hv_pci_dev
> > > *hpdev, int where,

<snip>


> I had a version that looked like this:
> 
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index 5992280e8110..cdd5be16021d 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -1130,8 +1130,8 @@ static void _hv_pcifront_read_config(struct hv_pci_dev *hpdev, int where,
>                    PCI_CAPABILITY_LIST) {
>                 /* ROM BARs are unimplemented */
>                 *val = 0;
> -       } else if (where >= PCI_INTERRUPT_LINE && where + size <=
> -                  PCI_INTERRUPT_PIN) {
> +       } else if ((where >= PCI_INTERRUPT_LINE && where + size <= PCI_INTERRUPT_PIN) ||
> +                  (where >= PCI_INTERRUPT_PIN && where + size <= PCI_MIN_GNT)) {

IMHO, I prefer this one due to consistency. We can have these 2 condition as separate "else if"
as well, which would align better with the rest of the logic in this function. However, I don't
have a strong preference on this matter.

- Saurabh
Dexuan Cui June 21, 2024, 6:41 p.m. UTC | #4
From: Jake Oshins <jakeo@microsoft.com> 
Sent: Friday, June 21, 2024 9:51 AM
> [...]
>On Fri, Jun 21, 2024 at 06:19:05AM +0000, Wei Liu wrote:
> On Fri, Jun 21, 2024 at 03:15:19AM +0000, Michael Kelley wrote:
> > From: Wei Liu <mailto:wei.liu@kernel.org> Sent: Thursday, June 20, 2024 6:48 PM
> > >
> > > The intent of the code snippet is to always return 0 for both fields.
> > > The check is wrong though. Fix that.
> > >
> > > This is discovered by this call in VFIO:
> > >
> > >     pci_read_config_byte(vdev->pdev, PCI_INTERRUPT_PIN, &pin);
> > >
> > > The old code does not set *val to 0 because the second half of the check is
> > > incorrect.

Hi Wei, so you got a non-zero 'pin' value returned by the host when the guest reads
from the MMIO config page. What's the consequence? Will VFIO try to use the legacy INTx 
rather than MSI/MSI-X? I'm curious how you noticed the bug. I'm also curious why the
host doesn't return 0 for the 'PIN' register when the guest reads it from the config page.

>  I believe that this fix is correct.  (And I'm frankly surprised that this bug didn't
> cause a problem before this.  It's been there since I first wrote the code.)
> -- Jake Oshins

I suppose it didn't cause any issue because the PCI device drivers use MSI/MSI-X,
so they don't care about the values of the 'PIN' and 'LINE' registers.

Thanks,
Dexuan
Wei Liu June 21, 2024, 7:33 p.m. UTC | #5
On Fri, Jun 21, 2024 at 06:41:04PM +0000, Dexuan Cui wrote:
> From: Jake Oshins <jakeo@microsoft.com> 
> Sent: Friday, June 21, 2024 9:51 AM
> > [...]
> >On Fri, Jun 21, 2024 at 06:19:05AM +0000, Wei Liu wrote:
> > On Fri, Jun 21, 2024 at 03:15:19AM +0000, Michael Kelley wrote:
> > > From: Wei Liu <mailto:wei.liu@kernel.org> Sent: Thursday, June 20, 2024 6:48 PM
> > > >
> > > > The intent of the code snippet is to always return 0 for both fields.
> > > > The check is wrong though. Fix that.
> > > >
> > > > This is discovered by this call in VFIO:
> > > >
> > > >     pci_read_config_byte(vdev->pdev, PCI_INTERRUPT_PIN, &pin);
> > > >
> > > > The old code does not set *val to 0 because the second half of the check is
> > > > incorrect.
> 
> Hi Wei, so you got a non-zero 'pin' value returned by the host when the guest reads
> from the MMIO config page. What's the consequence? Will VFIO try to use the legacy INTx 
> rather than MSI/MSI-X? I'm curious how you noticed the bug. I'm also curious why the
> host doesn't return 0 for the 'PIN' register when the guest reads it from the config page.

It is not the guest reading the register. The VM has not launched yet.
Everything happens on the host side. The host side software is preparing
the device for the VM to use.

The consequence of this bug is that user space code will think INTx is
available while in fact it is not.

VFIO itself doesn't care much. I noticed the bug because our VMM (Cloud
Hypervisor) initializes INTx whenever it is available.

> 
> >  I believe that this fix is correct.  (And I'm frankly surprised that this bug didn't
> > cause a problem before this.  It's been there since I first wrote the code.)
> > -- Jake Oshins
> 
> I suppose it didn't cause any issue because the PCI device drivers use MSI/MSI-X,
> so they don't care about the values of the 'PIN' and 'LINE' registers.

I suspect the same. Drivers almost always prefer MSI / MSI-X over INTx.
No one else triggered that code path before.

Thanks,
Wei.

> 
> Thanks,
> Dexuan
>
Wei Liu June 21, 2024, 8:32 p.m. UTC | #6
On Fri, Jun 21, 2024 at 04:03:27AM -0700, Saurabh Singh Sengar wrote:
> On Fri, Jun 21, 2024 at 06:19:05AM +0000, Wei Liu wrote:
> > On Fri, Jun 21, 2024 at 03:15:19AM +0000, Michael Kelley wrote:
> > > From: Wei Liu <wei.liu@kernel.org> Sent: Thursday, June 20, 2024 6:48 PM
> > > > 
> > > > The intent of the code snippet is to always return 0 for both fields.
> > > > The check is wrong though. Fix that.
> > > > 
> > > > This is discovered by this call in VFIO:
> > > > 
> > > >     pci_read_config_byte(vdev->pdev, PCI_INTERRUPT_PIN, &pin);
> > > > 
> > > > The old code does not set *val to 0 because the second half of the check is
> > > > incorrect.
> > > > 
> > > > Fixes: 4daace0d8ce85 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V
> > > > VMs")
> 
> 12 characters are preferred for Fixes commit id.
> 'Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")'

Fixed. Thanks.
diff mbox series

Patch

diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 5992280e8110..eec087c8f670 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -1130,8 +1130,8 @@  static void _hv_pcifront_read_config(struct hv_pci_dev *hpdev, int where,
 		   PCI_CAPABILITY_LIST) {
 		/* ROM BARs are unimplemented */
 		*val = 0;
-	} else if (where >= PCI_INTERRUPT_LINE && where + size <=
-		   PCI_INTERRUPT_PIN) {
+	} else if ((where == PCI_INTERRUPT_LINE || where == PCI_INTERRUPT_PIN) &&
+		   size == 1) {
 		/*
 		 * Interrupt Line and Interrupt PIN are hard-wired to zero
 		 * because this front-end only supports message-signaled