diff mbox

[v7] pci: quirk to skip msi disable on shutdown

Message ID 1441553385-3810-1-git-send-email-mst@redhat.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Michael S. Tsirkin Sept. 6, 2015, 3:32 p.m. UTC
On some hypervisors, virtio devices tend to generate spurious interrupts
when switching between MSI and non-MSI mode.  Normally, either MSI or
non-MSI is used and all is well, but during shutdown, linux disables MSI
which then causes an "irq %d: nobody cared" message, with irq being
subsequently disabled.

Since bus mastering is already disabled at this point, disabling MSI
isn't actually useful for spec compliant devices: MSI interrupts are
in-bus memory writes so disabling Bus Master (which is already done)
disables these as well: after some research, it appears to be there for
the benefit of devices that ignore the bus master bit.

As it's not clear how common this kind of bug is, this patch simply adds
a quirk, to be set by devices that wish to skip disabling msi on
shutdown, relying on bus master instead.

We set this quirk in virtio core.

Reported-by: Fam Zheng <famz@redhat.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Yinghai Lu <yhlu.kernel.send@gmail.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---


changes from v6:
	limit changes to virtio only
changes from v5:
        rebased on top of pci/msi
        fixed commit log, including comments by Bjorn
        and adding explanation to address comments/questions by Eric
        dropped stable Cc, this patch does not seem to qualify for stable
changes from v4:
        Yijing Wang <wangyijing@huawei.com> noted that
        early fixups rely on pci_msi_off.
        Split out the functionality and move off the
        required part to run early during pci_device_setup.
Changes from v3:
        fix a copy-and-paste error in
          pci: drop some duplicate code
        other patches are unchanged
        drop Cc stable for now
Changes from v2:
        move code from probe to device enumeration
        add patches to unexport pci_msi_off


 include/linux/pci.h                | 2 ++
 drivers/pci/pci-driver.c           | 6 ++++--
 drivers/virtio/virtio_pci_common.c | 4 ++++
 3 files changed, 10 insertions(+), 2 deletions(-)

Comments

Bjorn Helgaas Sept. 17, 2015, 3:44 p.m. UTC | #1
On Sun, Sep 06, 2015 at 06:32:35PM +0300, Michael S. Tsirkin wrote:
> On some hypervisors, virtio devices tend to generate spurious interrupts
> when switching between MSI and non-MSI mode.  Normally, either MSI or
> non-MSI is used and all is well, but during shutdown, linux disables MSI
> which then causes an "irq %d: nobody cared" message, with irq being
> subsequently disabled.
> 
> Since bus mastering is already disabled at this point, disabling MSI
> isn't actually useful for spec compliant devices: MSI interrupts are
> in-bus memory writes so disabling Bus Master (which is already done)
> disables these as well: after some research, it appears to be there for
> the benefit of devices that ignore the bus master bit.
> 
> As it's not clear how common this kind of bug is, this patch simply adds
> a quirk, to be set by devices that wish to skip disabling msi on
> shutdown, relying on bus master instead.
> 
> We set this quirk in virtio core.
> 
> Reported-by: Fam Zheng <famz@redhat.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Yinghai Lu <yhlu.kernel.send@gmail.com>
> Cc: Ulrich Obergfell <uobergfe@redhat.com>
> Cc: Rusty Russell <rusty@rustcorp.com.au>
> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Eric, what do you think of this?  You had many comments on previous
versions.

Minor comment on a comment below.

> ---
> 
> 
> changes from v6:
> 	limit changes to virtio only
> changes from v5:
>         rebased on top of pci/msi
>         fixed commit log, including comments by Bjorn
>         and adding explanation to address comments/questions by Eric
>         dropped stable Cc, this patch does not seem to qualify for stable
> changes from v4:
>         Yijing Wang <wangyijing@huawei.com> noted that
>         early fixups rely on pci_msi_off.
>         Split out the functionality and move off the
>         required part to run early during pci_device_setup.
> Changes from v3:
>         fix a copy-and-paste error in
>           pci: drop some duplicate code
>         other patches are unchanged
>         drop Cc stable for now
> Changes from v2:
>         move code from probe to device enumeration
>         add patches to unexport pci_msi_off
> 
> 
>  include/linux/pci.h                | 2 ++
>  drivers/pci/pci-driver.c           | 6 ++++--
>  drivers/virtio/virtio_pci_common.c | 4 ++++
>  3 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 860c751..80f3494 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -180,6 +180,8 @@ enum pci_dev_flags {
>  	PCI_DEV_FLAGS_NO_BUS_RESET = (__force pci_dev_flags_t) (1 << 6),
>  	/* Do not use PM reset even if device advertises NoSoftRst- */
>  	PCI_DEV_FLAGS_NO_PM_RESET = (__force pci_dev_flags_t) (1 << 7),
> +	/* Do not disable MSI on shutdown, disable bus master instead */

This comment doesn't really match what the patch does.  The patch merely
does "Do not disable MSI on shutdown."  It doesn't "disable bus master
instead."

Bus master may be disabled elsewhere, but that is independent of the
PCI_DEV_FLAGS_NO_MSI_SHUTDOWN flag.

> +	PCI_DEV_FLAGS_NO_MSI_SHUTDOWN = (__force pci_dev_flags_t) (1 << 8),
>  };
>  
>  enum pci_irq_reroute_variant {
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 3cb2210..59d9e40 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -450,8 +450,10 @@ static void pci_device_shutdown(struct device *dev)
>  
>  	if (drv && drv->shutdown)
>  		drv->shutdown(pci_dev);
> -	pci_msi_shutdown(pci_dev);
> -	pci_msix_shutdown(pci_dev);
> +	if (!(pci_dev->dev_flags & PCI_DEV_FLAGS_NO_MSI_SHUTDOWN)) {
> +		pci_msi_shutdown(pci_dev);
> +		pci_msix_shutdown(pci_dev);
> +	}
>  
>  #ifdef CONFIG_KEXEC
>  	/*
> diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
> index 78f804a..26f46c3 100644
> --- a/drivers/virtio/virtio_pci_common.c
> +++ b/drivers/virtio/virtio_pci_common.c
> @@ -528,6 +528,8 @@ static int virtio_pci_probe(struct pci_dev *pci_dev,
>  	if (rc)
>  		goto err_register;
>  
> +	pci_dev->dev_flags |= PCI_DEV_FLAGS_NO_MSI_SHUTDOWN;
> +
>  	return 0;
>  
>  err_register:
> @@ -546,6 +548,8 @@ static void virtio_pci_remove(struct pci_dev *pci_dev)
>  {
>  	struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev);
>  
> +	pci_dev->dev_flags &= ~PCI_DEV_FLAGS_NO_MSI_SHUTDOWN;
> +
>  	unregister_virtio_device(&vp_dev->vdev);
>  
>  	if (vp_dev->ioaddr)
> -- 
> MST
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric W. Biederman Sept. 17, 2015, 3:49 p.m. UTC | #2
Bjorn Helgaas <bhelgaas@google.com> writes:

> On Sun, Sep 06, 2015 at 06:32:35PM +0300, Michael S. Tsirkin wrote:
>> On some hypervisors, virtio devices tend to generate spurious interrupts
>> when switching between MSI and non-MSI mode.  Normally, either MSI or
>> non-MSI is used and all is well, but during shutdown, linux disables MSI
>> which then causes an "irq %d: nobody cared" message, with irq being
>> subsequently disabled.
>> 
>> Since bus mastering is already disabled at this point, disabling MSI
>> isn't actually useful for spec compliant devices: MSI interrupts are
>> in-bus memory writes so disabling Bus Master (which is already done)
>> disables these as well: after some research, it appears to be there for
>> the benefit of devices that ignore the bus master bit.
>> 
>> As it's not clear how common this kind of bug is, this patch simply adds
>> a quirk, to be set by devices that wish to skip disabling msi on
>> shutdown, relying on bus master instead.
>> 
>> We set this quirk in virtio core.
>> 
>> Reported-by: Fam Zheng <famz@redhat.com>
>> Cc: Bjorn Helgaas <bhelgaas@google.com>
>> Cc: Yinghai Lu <yhlu.kernel.send@gmail.com>
>> Cc: Ulrich Obergfell <uobergfe@redhat.com>
>> Cc: Rusty Russell <rusty@rustcorp.com.au>
>> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>
> Eric, what do you think of this?  You had many comments on previous
> versions.

I still think it is a hack to avoid actually fixing a driver.

I think the hack is better as a quirk.  I think the quirk would tend to
be better if it was limited to whatever set of hypervisors where this is
actually a problem.

And of course given that on any sane configuration we have the irq
watchdog I don't see what this is fixing in practice.

Other than suggesting this hack become find a way to limit itself to the
driver that is actually having problems I don't see a way to improve it.

> Minor comment on a comment below.
>
>> ---
>> 
>> 
>> changes from v6:
>> 	limit changes to virtio only
>> changes from v5:
>>         rebased on top of pci/msi
>>         fixed commit log, including comments by Bjorn
>>         and adding explanation to address comments/questions by Eric
>>         dropped stable Cc, this patch does not seem to qualify for stable
>> changes from v4:
>>         Yijing Wang <wangyijing@huawei.com> noted that
>>         early fixups rely on pci_msi_off.
>>         Split out the functionality and move off the
>>         required part to run early during pci_device_setup.
>> Changes from v3:
>>         fix a copy-and-paste error in
>>           pci: drop some duplicate code
>>         other patches are unchanged
>>         drop Cc stable for now
>> Changes from v2:
>>         move code from probe to device enumeration
>>         add patches to unexport pci_msi_off
>> 
>> 
>>  include/linux/pci.h                | 2 ++
>>  drivers/pci/pci-driver.c           | 6 ++++--
>>  drivers/virtio/virtio_pci_common.c | 4 ++++
>>  3 files changed, 10 insertions(+), 2 deletions(-)
>> 
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 860c751..80f3494 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -180,6 +180,8 @@ enum pci_dev_flags {
>>  	PCI_DEV_FLAGS_NO_BUS_RESET = (__force pci_dev_flags_t) (1 << 6),
>>  	/* Do not use PM reset even if device advertises NoSoftRst- */
>>  	PCI_DEV_FLAGS_NO_PM_RESET = (__force pci_dev_flags_t) (1 << 7),
>> +	/* Do not disable MSI on shutdown, disable bus master instead */
>
> This comment doesn't really match what the patch does.  The patch merely
> does "Do not disable MSI on shutdown."  It doesn't "disable bus master
> instead."
>
> Bus master may be disabled elsewhere, but that is independent of the
> PCI_DEV_FLAGS_NO_MSI_SHUTDOWN flag.
>
>> +	PCI_DEV_FLAGS_NO_MSI_SHUTDOWN = (__force pci_dev_flags_t) (1 << 8),
>>  };
>>  
>>  enum pci_irq_reroute_variant {
>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>> index 3cb2210..59d9e40 100644
>> --- a/drivers/pci/pci-driver.c
>> +++ b/drivers/pci/pci-driver.c
>> @@ -450,8 +450,10 @@ static void pci_device_shutdown(struct device *dev)
>>  
>>  	if (drv && drv->shutdown)
>>  		drv->shutdown(pci_dev);
>> -	pci_msi_shutdown(pci_dev);
>> -	pci_msix_shutdown(pci_dev);
>> +	if (!(pci_dev->dev_flags & PCI_DEV_FLAGS_NO_MSI_SHUTDOWN)) {
>> +		pci_msi_shutdown(pci_dev);
>> +		pci_msix_shutdown(pci_dev);
>> +	}
>>  
>>  #ifdef CONFIG_KEXEC
>>  	/*
>> diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
>> index 78f804a..26f46c3 100644
>> --- a/drivers/virtio/virtio_pci_common.c
>> +++ b/drivers/virtio/virtio_pci_common.c
>> @@ -528,6 +528,8 @@ static int virtio_pci_probe(struct pci_dev *pci_dev,
>>  	if (rc)
>>  		goto err_register;
>>  
>> +	pci_dev->dev_flags |= PCI_DEV_FLAGS_NO_MSI_SHUTDOWN;
>> +
>>  	return 0;
>>  
>>  err_register:
>> @@ -546,6 +548,8 @@ static void virtio_pci_remove(struct pci_dev *pci_dev)
>>  {
>>  	struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev);
>>  
>> +	pci_dev->dev_flags &= ~PCI_DEV_FLAGS_NO_MSI_SHUTDOWN;
>> +
>>  	unregister_virtio_device(&vp_dev->vdev);
>>  
>>  	if (vp_dev->ioaddr)
>> -- 
>> MST
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael S. Tsirkin Sept. 17, 2015, 3:56 p.m. UTC | #3
On Thu, Sep 17, 2015 at 10:44:24AM -0500, Bjorn Helgaas wrote:
> On Sun, Sep 06, 2015 at 06:32:35PM +0300, Michael S. Tsirkin wrote:
> > On some hypervisors, virtio devices tend to generate spurious interrupts
> > when switching between MSI and non-MSI mode.  Normally, either MSI or
> > non-MSI is used and all is well, but during shutdown, linux disables MSI
> > which then causes an "irq %d: nobody cared" message, with irq being
> > subsequently disabled.
> > 
> > Since bus mastering is already disabled at this point, disabling MSI
> > isn't actually useful for spec compliant devices: MSI interrupts are
> > in-bus memory writes so disabling Bus Master (which is already done)
> > disables these as well: after some research, it appears to be there for
> > the benefit of devices that ignore the bus master bit.
> > 
> > As it's not clear how common this kind of bug is, this patch simply adds
> > a quirk, to be set by devices that wish to skip disabling msi on
> > shutdown, relying on bus master instead.
> > 
> > We set this quirk in virtio core.
> > 
> > Reported-by: Fam Zheng <famz@redhat.com>
> > Cc: Bjorn Helgaas <bhelgaas@google.com>
> > Cc: Yinghai Lu <yhlu.kernel.send@gmail.com>
> > Cc: Ulrich Obergfell <uobergfe@redhat.com>
> > Cc: Rusty Russell <rusty@rustcorp.com.au>
> > Cc: "Eric W. Biederman" <ebiederm@xmission.com>
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> Eric, what do you think of this?  You had many comments on previous
> versions.
> 
> Minor comment on a comment below.
> 
> > ---
> > 
> > 
> > changes from v6:
> > 	limit changes to virtio only
> > changes from v5:
> >         rebased on top of pci/msi
> >         fixed commit log, including comments by Bjorn
> >         and adding explanation to address comments/questions by Eric
> >         dropped stable Cc, this patch does not seem to qualify for stable
> > changes from v4:
> >         Yijing Wang <wangyijing@huawei.com> noted that
> >         early fixups rely on pci_msi_off.
> >         Split out the functionality and move off the
> >         required part to run early during pci_device_setup.
> > Changes from v3:
> >         fix a copy-and-paste error in
> >           pci: drop some duplicate code
> >         other patches are unchanged
> >         drop Cc stable for now
> > Changes from v2:
> >         move code from probe to device enumeration
> >         add patches to unexport pci_msi_off
> > 
> > 
> >  include/linux/pci.h                | 2 ++
> >  drivers/pci/pci-driver.c           | 6 ++++--
> >  drivers/virtio/virtio_pci_common.c | 4 ++++
> >  3 files changed, 10 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index 860c751..80f3494 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -180,6 +180,8 @@ enum pci_dev_flags {
> >  	PCI_DEV_FLAGS_NO_BUS_RESET = (__force pci_dev_flags_t) (1 << 6),
> >  	/* Do not use PM reset even if device advertises NoSoftRst- */
> >  	PCI_DEV_FLAGS_NO_PM_RESET = (__force pci_dev_flags_t) (1 << 7),
> > +	/* Do not disable MSI on shutdown, disable bus master instead */
> 
> This comment doesn't really match what the patch does.  The patch merely
> does "Do not disable MSI on shutdown."  It doesn't "disable bus master
> instead."
> 
> Bus master may be disabled elsewhere, but that is independent of the
> PCI_DEV_FLAGS_NO_MSI_SHUTDOWN flag.

Yes but I wanted to point out that bus master needs to work.
One of the reasons we don't do this patch for all devices is because
some builtin devices might have a broken bus master.

How about
"Some other means (e.g. disabling bus master) suppress interrupts."


> > +	PCI_DEV_FLAGS_NO_MSI_SHUTDOWN = (__force pci_dev_flags_t) (1 << 8),
> >  };
> >  
> >  enum pci_irq_reroute_variant {
> > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> > index 3cb2210..59d9e40 100644
> > --- a/drivers/pci/pci-driver.c
> > +++ b/drivers/pci/pci-driver.c
> > @@ -450,8 +450,10 @@ static void pci_device_shutdown(struct device *dev)
> >  
> >  	if (drv && drv->shutdown)
> >  		drv->shutdown(pci_dev);
> > -	pci_msi_shutdown(pci_dev);
> > -	pci_msix_shutdown(pci_dev);
> > +	if (!(pci_dev->dev_flags & PCI_DEV_FLAGS_NO_MSI_SHUTDOWN)) {
> > +		pci_msi_shutdown(pci_dev);
> > +		pci_msix_shutdown(pci_dev);
> > +	}
> >  
> >  #ifdef CONFIG_KEXEC
> >  	/*
> > diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
> > index 78f804a..26f46c3 100644
> > --- a/drivers/virtio/virtio_pci_common.c
> > +++ b/drivers/virtio/virtio_pci_common.c
> > @@ -528,6 +528,8 @@ static int virtio_pci_probe(struct pci_dev *pci_dev,
> >  	if (rc)
> >  		goto err_register;
> >  
> > +	pci_dev->dev_flags |= PCI_DEV_FLAGS_NO_MSI_SHUTDOWN;
> > +
> >  	return 0;
> >  
> >  err_register:
> > @@ -546,6 +548,8 @@ static void virtio_pci_remove(struct pci_dev *pci_dev)
> >  {
> >  	struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev);
> >  
> > +	pci_dev->dev_flags &= ~PCI_DEV_FLAGS_NO_MSI_SHUTDOWN;
> > +
> >  	unregister_virtio_device(&vp_dev->vdev);
> >  
> >  	if (vp_dev->ioaddr)
> > -- 
> > MST
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael S. Tsirkin Sept. 17, 2015, 4:03 p.m. UTC | #4
On Thu, Sep 17, 2015 at 10:49:06AM -0500, Eric W. Biederman wrote:
> Bjorn Helgaas <bhelgaas@google.com> writes:
> 
> > On Sun, Sep 06, 2015 at 06:32:35PM +0300, Michael S. Tsirkin wrote:
> >> On some hypervisors, virtio devices tend to generate spurious interrupts
> >> when switching between MSI and non-MSI mode.  Normally, either MSI or
> >> non-MSI is used and all is well, but during shutdown, linux disables MSI
> >> which then causes an "irq %d: nobody cared" message, with irq being
> >> subsequently disabled.
> >> 
> >> Since bus mastering is already disabled at this point, disabling MSI
> >> isn't actually useful for spec compliant devices: MSI interrupts are
> >> in-bus memory writes so disabling Bus Master (which is already done)
> >> disables these as well: after some research, it appears to be there for
> >> the benefit of devices that ignore the bus master bit.
> >> 
> >> As it's not clear how common this kind of bug is, this patch simply adds
> >> a quirk, to be set by devices that wish to skip disabling msi on
> >> shutdown, relying on bus master instead.
> >> 
> >> We set this quirk in virtio core.
> >> 
> >> Reported-by: Fam Zheng <famz@redhat.com>
> >> Cc: Bjorn Helgaas <bhelgaas@google.com>
> >> Cc: Yinghai Lu <yhlu.kernel.send@gmail.com>
> >> Cc: Ulrich Obergfell <uobergfe@redhat.com>
> >> Cc: Rusty Russell <rusty@rustcorp.com.au>
> >> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
> >> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >
> > Eric, what do you think of this?  You had many comments on previous
> > versions.
> 
> I still think it is a hack to avoid actually fixing a driver.

The bug's in the hypervisor though.

> I think the hack is better as a quirk.

You mean put it in drivers/pci/quirks.c?

>  I think the quirk would tend to
> be better if it was limited to whatever set of hypervisors where this is
> actually a problem.

I guess we might add a new flag "msi disable at shutdown is safe" in future
hypervisors. Until such hypervisors ship I don't see a way to detect
this.

> And of course given that on any sane configuration we have the irq
> watchdog I don't see what this is fixing in practice.
> 
> Other than suggesting this hack become find a way to limit itself to the
> driver that is actually having problems I don't see a way to improve it.

It's already limited: virtio is the only one that sets
PCI_DEV_FLAGS_NO_MSI_SHUTDOWN. Did you notice this?

> > Minor comment on a comment below.
> >
> >> ---
> >> 
> >> 
> >> changes from v6:
> >> 	limit changes to virtio only
> >> changes from v5:
> >>         rebased on top of pci/msi
> >>         fixed commit log, including comments by Bjorn
> >>         and adding explanation to address comments/questions by Eric
> >>         dropped stable Cc, this patch does not seem to qualify for stable
> >> changes from v4:
> >>         Yijing Wang <wangyijing@huawei.com> noted that
> >>         early fixups rely on pci_msi_off.
> >>         Split out the functionality and move off the
> >>         required part to run early during pci_device_setup.
> >> Changes from v3:
> >>         fix a copy-and-paste error in
> >>           pci: drop some duplicate code
> >>         other patches are unchanged
> >>         drop Cc stable for now
> >> Changes from v2:
> >>         move code from probe to device enumeration
> >>         add patches to unexport pci_msi_off
> >> 
> >> 
> >>  include/linux/pci.h                | 2 ++
> >>  drivers/pci/pci-driver.c           | 6 ++++--
> >>  drivers/virtio/virtio_pci_common.c | 4 ++++
> >>  3 files changed, 10 insertions(+), 2 deletions(-)
> >> 
> >> diff --git a/include/linux/pci.h b/include/linux/pci.h
> >> index 860c751..80f3494 100644
> >> --- a/include/linux/pci.h
> >> +++ b/include/linux/pci.h
> >> @@ -180,6 +180,8 @@ enum pci_dev_flags {
> >>  	PCI_DEV_FLAGS_NO_BUS_RESET = (__force pci_dev_flags_t) (1 << 6),
> >>  	/* Do not use PM reset even if device advertises NoSoftRst- */
> >>  	PCI_DEV_FLAGS_NO_PM_RESET = (__force pci_dev_flags_t) (1 << 7),
> >> +	/* Do not disable MSI on shutdown, disable bus master instead */
> >
> > This comment doesn't really match what the patch does.  The patch merely
> > does "Do not disable MSI on shutdown."  It doesn't "disable bus master
> > instead."
> >
> > Bus master may be disabled elsewhere, but that is independent of the
> > PCI_DEV_FLAGS_NO_MSI_SHUTDOWN flag.
> >
> >> +	PCI_DEV_FLAGS_NO_MSI_SHUTDOWN = (__force pci_dev_flags_t) (1 << 8),
> >>  };
> >>  
> >>  enum pci_irq_reroute_variant {
> >> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> >> index 3cb2210..59d9e40 100644
> >> --- a/drivers/pci/pci-driver.c
> >> +++ b/drivers/pci/pci-driver.c
> >> @@ -450,8 +450,10 @@ static void pci_device_shutdown(struct device *dev)
> >>  
> >>  	if (drv && drv->shutdown)
> >>  		drv->shutdown(pci_dev);
> >> -	pci_msi_shutdown(pci_dev);
> >> -	pci_msix_shutdown(pci_dev);
> >> +	if (!(pci_dev->dev_flags & PCI_DEV_FLAGS_NO_MSI_SHUTDOWN)) {
> >> +		pci_msi_shutdown(pci_dev);
> >> +		pci_msix_shutdown(pci_dev);
> >> +	}
> >>  
> >>  #ifdef CONFIG_KEXEC
> >>  	/*
> >> diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
> >> index 78f804a..26f46c3 100644
> >> --- a/drivers/virtio/virtio_pci_common.c
> >> +++ b/drivers/virtio/virtio_pci_common.c
> >> @@ -528,6 +528,8 @@ static int virtio_pci_probe(struct pci_dev *pci_dev,
> >>  	if (rc)
> >>  		goto err_register;
> >>  
> >> +	pci_dev->dev_flags |= PCI_DEV_FLAGS_NO_MSI_SHUTDOWN;
> >> +
> >>  	return 0;
> >>  
> >>  err_register:
> >> @@ -546,6 +548,8 @@ static void virtio_pci_remove(struct pci_dev *pci_dev)
> >>  {
> >>  	struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev);
> >>  
> >> +	pci_dev->dev_flags &= ~PCI_DEV_FLAGS_NO_MSI_SHUTDOWN;
> >> +
> >>  	unregister_virtio_device(&vp_dev->vdev);
> >>  
> >>  	if (vp_dev->ioaddr)
> >> -- 
> >> MST
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Sept. 21, 2015, 6:21 p.m. UTC | #5
On Sun, Sep 06, 2015 at 06:32:35PM +0300, Michael S. Tsirkin wrote:
> On some hypervisors, virtio devices tend to generate spurious interrupts
> when switching between MSI and non-MSI mode.  Normally, either MSI or
> non-MSI is used and all is well, but during shutdown, linux disables MSI
> which then causes an "irq %d: nobody cared" message, with irq being
> subsequently disabled.

My understanding is:

  Linux disables MSI/MSI-X during device shutdown.  If the device
  signals an interrupt after that, it may use INTx.

This INTx interrupt is not necessarily spurious.  Using INTx to signal an
interrupt that occurs when MSI is disabled seems like reasonable behavior
for any PCI device.

And it doesn't seem related to switching between MSI and non-MSI mode.
Yes, the INTx happens *after* disabling MSI, but it is not at all
*because* we disabled MSI.  So I wouldn't say "they generate spurious
interrupts when switching between MSI and non-MSI."

Why doesn't virtio-pci just register an INTx handler in addition to an MSI
handler?

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael S. Tsirkin Sept. 21, 2015, 7:42 p.m. UTC | #6
On Mon, Sep 21, 2015 at 01:21:47PM -0500, Bjorn Helgaas wrote:
> On Sun, Sep 06, 2015 at 06:32:35PM +0300, Michael S. Tsirkin wrote:
> > On some hypervisors, virtio devices tend to generate spurious interrupts
> > when switching between MSI and non-MSI mode.  Normally, either MSI or
> > non-MSI is used and all is well, but during shutdown, linux disables MSI
> > which then causes an "irq %d: nobody cared" message, with irq being
> > subsequently disabled.
> 
> My understanding is:
> 
>   Linux disables MSI/MSI-X during device shutdown.  If the device
>   signals an interrupt after that, it may use INTx.
> 
> This INTx interrupt is not necessarily spurious.  Using INTx to signal an
> interrupt that occurs when MSI is disabled seems like reasonable behavior
> for any PCI device.
> And it doesn't seem related to switching between MSI and non-MSI mode.
> Yes, the INTx happens *after* disabling MSI, but it is not at all
> *because* we disabled MSI.  So I wouldn't say "they generate spurious
> interrupts when switching between MSI and non-MSI."
> 
> Why doesn't virtio-pci just register an INTx handler in addition to an MSI
> handler?
> 
> Bjorn

The handler causes an expensive exit to the hypervisor,
and the INTx lines are shared with other devices.

Seems silly to slow them down just so we can do something
that triggers the device bug.  The bus master is disabled by that time,
if linux can just desist from touching MSI enable device won't
send either INTx (because MSI is on) or MSI
(because bus master is on) and all will be well.
Bjorn Helgaas Sept. 21, 2015, 10:10 p.m. UTC | #7
On Mon, Sep 21, 2015 at 10:42:13PM +0300, Michael S. Tsirkin wrote:
> On Mon, Sep 21, 2015 at 01:21:47PM -0500, Bjorn Helgaas wrote:
> > On Sun, Sep 06, 2015 at 06:32:35PM +0300, Michael S. Tsirkin wrote:
> > > On some hypervisors, virtio devices tend to generate spurious interrupts
> > > when switching between MSI and non-MSI mode.  Normally, either MSI or
> > > non-MSI is used and all is well, but during shutdown, linux disables MSI
> > > which then causes an "irq %d: nobody cared" message, with irq being
> > > subsequently disabled.
> > 
> > My understanding is:
> > 
> >   Linux disables MSI/MSI-X during device shutdown.  If the device
> >   signals an interrupt after that, it may use INTx.
> > 
> > This INTx interrupt is not necessarily spurious.  Using INTx to signal an
> > interrupt that occurs when MSI is disabled seems like reasonable behavior
> > for any PCI device.
> > And it doesn't seem related to switching between MSI and non-MSI mode.
> > Yes, the INTx happens *after* disabling MSI, but it is not at all
> > *because* we disabled MSI.  So I wouldn't say "they generate spurious
> > interrupts when switching between MSI and non-MSI."
> > 
> > Why doesn't virtio-pci just register an INTx handler in addition to an MSI
> > handler?
> 
> The handler causes an expensive exit to the hypervisor,
> and the INTx lines are shared with other devices.

Do we care?  Is this a performance path?  I thought we were in a kexec
shutdown path.

> Seems silly to slow them down just so we can do something
> that triggers the device bug.  The bus master is disabled by that time,
> if linux can just desist from touching MSI enable device won't
> send either INTx (because MSI is on) or MSI
> (because bus master is on) and all will be well.

It would also be silly to put special-purpose code in the PCI core
if there's a reasonable way to handle this in a driver.

Can you describe exactly what the device bug is?  Apparently you're
saying that if we shut down MSI, it triggers the bug?  And I guess
you're talking about a virtio device as implemented in qemu or other
hypervisors?

If we leave MSI enabled (as your patch does), then the device has MSI
enabled and Bus Master disabled.  I can see these possibilities:

  1) the device never recognizes an interrupt condition
  2) the device sets the pending bit but doesn't issue the MSI write,
     so the OS doesn't see the interrupt unless it polls for it
  3) the device signals MSI and we still have an MSI handler
     registered, so we silently handle it
  4) the device signals INTx

You seem to suggest that if we leave MSI enabled (as your patch does),
we're in case 1.  But I doubt that disabling MSI causes the device to
interrupt.

Case 2 seems more likely to me: the device recognized an interrupt
condition, e.g., an event occurred, and the OS simply doesn't see the
interrupt because the device can't issue the MSI message.

Case 3 does seem like it would be a device bug, because the device
shouldn't do an MSI write when Bus Master is disabled.  I don't see
this case mentioned explicitly in the PCI spec, but PCIe r3.0 spec sec
7.5.1.1 does make it clear that disabling Bus Master also disables MSI
messages.

I don't know whether case 4 would be legal or not.  But apparently it
doesn't happen with the virtio device anyway, so it's not really a
concern here.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael S. Tsirkin Sept. 22, 2015, 11:29 a.m. UTC | #8
On Mon, Sep 21, 2015 at 05:10:43PM -0500, Bjorn Helgaas wrote:
> On Mon, Sep 21, 2015 at 10:42:13PM +0300, Michael S. Tsirkin wrote:
> > On Mon, Sep 21, 2015 at 01:21:47PM -0500, Bjorn Helgaas wrote:
> > > On Sun, Sep 06, 2015 at 06:32:35PM +0300, Michael S. Tsirkin wrote:
> > > > On some hypervisors, virtio devices tend to generate spurious interrupts
> > > > when switching between MSI and non-MSI mode.  Normally, either MSI or
> > > > non-MSI is used and all is well, but during shutdown, linux disables MSI
> > > > which then causes an "irq %d: nobody cared" message, with irq being
> > > > subsequently disabled.
> > > 
> > > My understanding is:
> > > 
> > >   Linux disables MSI/MSI-X during device shutdown.  If the device
> > >   signals an interrupt after that, it may use INTx.
> > > 
> > > This INTx interrupt is not necessarily spurious.  Using INTx to signal an
> > > interrupt that occurs when MSI is disabled seems like reasonable behavior
> > > for any PCI device.
> > > And it doesn't seem related to switching between MSI and non-MSI mode.
> > > Yes, the INTx happens *after* disabling MSI, but it is not at all
> > > *because* we disabled MSI.  So I wouldn't say "they generate spurious
> > > interrupts when switching between MSI and non-MSI."
> > > 
> > > Why doesn't virtio-pci just register an INTx handler in addition to an MSI
> > > handler?
> > 
> > The handler causes an expensive exit to the hypervisor,
> > and the INTx lines are shared with other devices.
> 
> Do we care?  Is this a performance path?  I thought we were in a kexec
> shutdown path.

Yes but the handler would always have to be registered, right?

> > Seems silly to slow them down just so we can do something
> > that triggers the device bug.  The bus master is disabled by that time,
> > if linux can just desist from touching MSI enable device won't
> > send either INTx (because MSI is on) or MSI
> > (because bus master is on) and all will be well.
> 
> It would also be silly to put special-purpose code in the PCI core
> if there's a reasonable way to handle this in a driver.
> 
> Can you describe exactly what the device bug is?  Apparently you're
> saying that if we shut down MSI, it triggers the bug?  And I guess
> you're talking about a virtio device as implemented in qemu or other
> hypervisors?

Yes. Basically depending on an internal device state, disabling MSI
sometimes wedges it.  The most easy to debug effect is if it starts
sending INTx interrupts, for which there's no handler currently.
Full system reset always gets us out of the bad state.


> If we leave MSI enabled (as your patch does), then the device has MSI
> enabled and Bus Master disabled.  I can see these possibilities:
> 
>   1) the device never recognizes an interrupt condition
>   2) the device sets the pending bit but doesn't issue the MSI write,
>      so the OS doesn't see the interrupt unless it polls for it
>   3) the device signals MSI and we still have an MSI handler
>      registered, so we silently handle it
>   4) the device signals INTx
> You seem to suggest that if we leave MSI enabled (as your patch does),
> we're in case 1.  But I doubt that disabling MSI causes the device to
> interrupt.
> 
> Case 2 seems more likely to me: the device recognized an interrupt
> condition, e.g., an event occurred, and the OS simply doesn't see the
> interrupt because the device can't issue the MSI message.
> 
> Case 3 does seem like it would be a device bug, because the device
> shouldn't do an MSI write when Bus Master is disabled.  I don't see
> this case mentioned explicitly in the PCI spec, but PCIe r3.0 spec sec
> 7.5.1.1 does make it clear that disabling Bus Master also disables MSI
> messages.
> 
> I don't know whether case 4 would be legal or not.  But apparently it
> doesn't happen with the virtio device anyway, so it's not really a
> concern here.
> 
> Bjorn

It's case 2. Except it doesn't actually set the pending
bit, because PCI spec has language like

	"While a vector is masked, the function is prohibited from sending the
	associated message, and the function must set the associated Pending bit"

but doesn't actually require to set pending if bus master enable
prevents sending MSI.
Bjorn Helgaas Sept. 22, 2015, 12:36 p.m. UTC | #9
On Tue, Sep 22, 2015 at 02:29:03PM +0300, Michael S. Tsirkin wrote:
> On Mon, Sep 21, 2015 at 05:10:43PM -0500, Bjorn Helgaas wrote:
> > On Mon, Sep 21, 2015 at 10:42:13PM +0300, Michael S. Tsirkin wrote:
> > > On Mon, Sep 21, 2015 at 01:21:47PM -0500, Bjorn Helgaas wrote:
> > > > On Sun, Sep 06, 2015 at 06:32:35PM +0300, Michael S. Tsirkin wrote:
> > > > > On some hypervisors, virtio devices tend to generate spurious interrupts
> > > > > when switching between MSI and non-MSI mode.  Normally, either MSI or
> > > > > non-MSI is used and all is well, but during shutdown, linux disables MSI
> > > > > which then causes an "irq %d: nobody cared" message, with irq being
> > > > > subsequently disabled.
> > > > 
> > > > My understanding is:
> > > > 
> > > >   Linux disables MSI/MSI-X during device shutdown.  If the device
> > > >   signals an interrupt after that, it may use INTx.
> > > > 
> > > > This INTx interrupt is not necessarily spurious.  Using INTx to signal an
> > > > interrupt that occurs when MSI is disabled seems like reasonable behavior
> > > > for any PCI device.
> > > > And it doesn't seem related to switching between MSI and non-MSI mode.
> > > > Yes, the INTx happens *after* disabling MSI, but it is not at all
> > > > *because* we disabled MSI.  So I wouldn't say "they generate spurious
> > > > interrupts when switching between MSI and non-MSI."
> > > > 
> > > > Why doesn't virtio-pci just register an INTx handler in addition to an MSI
> > > > handler?
> > > 
> > > The handler causes an expensive exit to the hypervisor,
> > > and the INTx lines are shared with other devices.
> > 
> > Do we care?  Is this a performance path?  I thought we were in a kexec
> > shutdown path.
> 
> Yes but the handler would always have to be registered, right?

The pci_device_shutdown() path you're modifying calls drv->shutdown()
immediately before disabling MSI, so I suppose you could register a
handler in a virtio shutdown method.

> > > Seems silly to slow them down just so we can do something
> > > that triggers the device bug.  The bus master is disabled by that time,
> > > if linux can just desist from touching MSI enable device won't
> > > send either INTx (because MSI is on) or MSI
> > > (because bus master is on) and all will be well.
> > 
> > It would also be silly to put special-purpose code in the PCI core
> > if there's a reasonable way to handle this in a driver.
> > 
> > Can you describe exactly what the device bug is?  Apparently you're
> > saying that if we shut down MSI, it triggers the bug?  And I guess
> > you're talking about a virtio device as implemented in qemu or other
> > hypervisors?
> 
> Yes. Basically depending on an internal device state, disabling MSI
> sometimes wedges it.  The most easy to debug effect is if it starts
> sending INTx interrupts, for which there's no handler currently.
> Full system reset always gets us out of the bad state.

If disabling MSI causes the device to use INTx interrupts, that sounds
perfectly normal to me.

If disabling MSI causes the device to hang, *that* sounds like a bug.
Since this is virtio, we should be able to figure out exactly where
that happens.  Do you have a pointer to a virtio bug report, or even a
QEMU commit that fixes this virtio bug?

I understand that even if there is a virtio fix in QEMU, we want a
solution that works even with an old QEMU that doesn't contain the
fix.  But a pointer to a QEMU fix would really help understand and
document the Linux fix.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael S. Tsirkin Sept. 22, 2015, 2:07 p.m. UTC | #10
On Tue, Sep 22, 2015 at 07:36:40AM -0500, Bjorn Helgaas wrote:
> On Tue, Sep 22, 2015 at 02:29:03PM +0300, Michael S. Tsirkin wrote:
> > On Mon, Sep 21, 2015 at 05:10:43PM -0500, Bjorn Helgaas wrote:
> > > On Mon, Sep 21, 2015 at 10:42:13PM +0300, Michael S. Tsirkin wrote:
> > > > On Mon, Sep 21, 2015 at 01:21:47PM -0500, Bjorn Helgaas wrote:
> > > > > On Sun, Sep 06, 2015 at 06:32:35PM +0300, Michael S. Tsirkin wrote:
> > > > > > On some hypervisors, virtio devices tend to generate spurious interrupts
> > > > > > when switching between MSI and non-MSI mode.  Normally, either MSI or
> > > > > > non-MSI is used and all is well, but during shutdown, linux disables MSI
> > > > > > which then causes an "irq %d: nobody cared" message, with irq being
> > > > > > subsequently disabled.
> > > > > 
> > > > > My understanding is:
> > > > > 
> > > > >   Linux disables MSI/MSI-X during device shutdown.  If the device
> > > > >   signals an interrupt after that, it may use INTx.
> > > > > 
> > > > > This INTx interrupt is not necessarily spurious.  Using INTx to signal an
> > > > > interrupt that occurs when MSI is disabled seems like reasonable behavior
> > > > > for any PCI device.
> > > > > And it doesn't seem related to switching between MSI and non-MSI mode.
> > > > > Yes, the INTx happens *after* disabling MSI, but it is not at all
> > > > > *because* we disabled MSI.  So I wouldn't say "they generate spurious
> > > > > interrupts when switching between MSI and non-MSI."
> > > > > 
> > > > > Why doesn't virtio-pci just register an INTx handler in addition to an MSI
> > > > > handler?
> > > > 
> > > > The handler causes an expensive exit to the hypervisor,
> > > > and the INTx lines are shared with other devices.
> > > 
> > > Do we care?  Is this a performance path?  I thought we were in a kexec
> > > shutdown path.
> > 
> > Yes but the handler would always have to be registered, right?
> 
> The pci_device_shutdown() path you're modifying calls drv->shutdown()
> immediately before disabling MSI, so I suppose you could register a
> handler in a virtio shutdown method.

I guess we could. Not sure what we'd do e.g. if that fails.

> > > > Seems silly to slow them down just so we can do something
> > > > that triggers the device bug.  The bus master is disabled by that time,
> > > > if linux can just desist from touching MSI enable device won't
> > > > send either INTx (because MSI is on) or MSI
> > > > (because bus master is on) and all will be well.
> > > 
> > > It would also be silly to put special-purpose code in the PCI core
> > > if there's a reasonable way to handle this in a driver.
> > > 
> > > Can you describe exactly what the device bug is?  Apparently you're
> > > saying that if we shut down MSI, it triggers the bug?  And I guess
> > > you're talking about a virtio device as implemented in qemu or other
> > > hypervisors?
> > 
> > Yes. Basically depending on an internal device state, disabling MSI
> > sometimes wedges it.  The most easy to debug effect is if it starts
> > sending INTx interrupts, for which there's no handler currently.
> > Full system reset always gets us out of the bad state.
> 
> If disabling MSI causes the device to use INTx interrupts, that sounds
> perfectly normal to me.
> 
> If disabling MSI causes the device to hang, *that* sounds like a bug.
> Since this is virtio, we should be able to figure out exactly where
> that happens.  Do you have a pointer to a virtio bug report, or even a
> QEMU commit that fixes this virtio bug?
> 
> I understand that even if there is a virtio fix in QEMU, we want a
> solution that works even with an old QEMU that doesn't contain the
> fix.  But a pointer to a QEMU fix would really help understand and
> document the Linux fix.
> 
> Bjorn

I'm not sure we ever understood it completely.

I think some of it has to do with the way the whole virtio 0
device register layout changes when you enable/disable MSI.  So should
be ok when using the modern virtio 1 model since we fixed this thing.

I was hoping that since disabling MSI in pci core is only useful as a
work-around (for devices with a broken bus master enable - even though I
don't think we know what these are exactly), a flag for not disabling it
won't be held to such a high standard.
Bjorn Helgaas Sept. 22, 2015, 3:46 p.m. UTC | #11
On Tue, Sep 22, 2015 at 05:07:19PM +0300, Michael S. Tsirkin wrote:
> On Tue, Sep 22, 2015 at 07:36:40AM -0500, Bjorn Helgaas wrote:
> > On Tue, Sep 22, 2015 at 02:29:03PM +0300, Michael S. Tsirkin wrote:
> > > On Mon, Sep 21, 2015 at 05:10:43PM -0500, Bjorn Helgaas wrote:

> > > > Can you describe exactly what the device bug is?  Apparently you're
> > > > saying that if we shut down MSI, it triggers the bug?  And I guess
> > > > you're talking about a virtio device as implemented in qemu or other
> > > > hypervisors?
> > > 
> > > Yes. Basically depending on an internal device state, disabling MSI
> > > sometimes wedges it.  The most easy to debug effect is if it starts
> > > sending INTx interrupts, for which there's no handler currently.
> > > Full system reset always gets us out of the bad state.
> > 
> > If disabling MSI causes the device to use INTx interrupts, that sounds
> > perfectly normal to me.
> > 
> > If disabling MSI causes the device to hang, *that* sounds like a bug.
> > Since this is virtio, we should be able to figure out exactly where
> > that happens.  Do you have a pointer to a virtio bug report, or even a
> > QEMU commit that fixes this virtio bug?
> > 
> > I understand that even if there is a virtio fix in QEMU, we want a
> > solution that works even with an old QEMU that doesn't contain the
> > fix.  But a pointer to a QEMU fix would really help understand and
> > document the Linux fix.
> 
> I'm not sure we ever understood it completely.
> 
> I think some of it has to do with the way the whole virtio 0
> device register layout changes when you enable/disable MSI.  So should
> be ok when using the modern virtio 1 model since we fixed this thing.

If you know that:

  - pci_device_shutdown() disables MSI,
  - disabling MSI changes the virtio register layout, and
  - the driver may touch the device after pci_device_shutdown(),

it seems like you might want a virtio shutdown method so you can
find out when the register layout changes.  Changing the register
layout sounds completely broken, and dealing with it sounds racy, but
it sounds like a mess that should be handled by the driver.

> I was hoping that since disabling MSI in pci core is only useful as a
> work-around (for devices with a broken bus master enable - even though I
> don't think we know what these are exactly), a flag for not disabling it
> won't be held to such a high standard.

Well, I suppose you see the problem with adding workarounds on top of
workarounds, especially when we don't have a clear understanding of
why we need them.  The "disable MSI on shutdown" code was added here:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=d52877c7b1af

and there is no information in that patch about it being a workaround
for devices with broken bus master enable.

The cumulative effect of stuff like this is that it becomes impossible
to do any meaningful restructuring in the core without breaking some
corner case.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 860c751..80f3494 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -180,6 +180,8 @@  enum pci_dev_flags {
 	PCI_DEV_FLAGS_NO_BUS_RESET = (__force pci_dev_flags_t) (1 << 6),
 	/* Do not use PM reset even if device advertises NoSoftRst- */
 	PCI_DEV_FLAGS_NO_PM_RESET = (__force pci_dev_flags_t) (1 << 7),
+	/* Do not disable MSI on shutdown, disable bus master instead */
+	PCI_DEV_FLAGS_NO_MSI_SHUTDOWN = (__force pci_dev_flags_t) (1 << 8),
 };
 
 enum pci_irq_reroute_variant {
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 3cb2210..59d9e40 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -450,8 +450,10 @@  static void pci_device_shutdown(struct device *dev)
 
 	if (drv && drv->shutdown)
 		drv->shutdown(pci_dev);
-	pci_msi_shutdown(pci_dev);
-	pci_msix_shutdown(pci_dev);
+	if (!(pci_dev->dev_flags & PCI_DEV_FLAGS_NO_MSI_SHUTDOWN)) {
+		pci_msi_shutdown(pci_dev);
+		pci_msix_shutdown(pci_dev);
+	}
 
 #ifdef CONFIG_KEXEC
 	/*
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index 78f804a..26f46c3 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -528,6 +528,8 @@  static int virtio_pci_probe(struct pci_dev *pci_dev,
 	if (rc)
 		goto err_register;
 
+	pci_dev->dev_flags |= PCI_DEV_FLAGS_NO_MSI_SHUTDOWN;
+
 	return 0;
 
 err_register:
@@ -546,6 +548,8 @@  static void virtio_pci_remove(struct pci_dev *pci_dev)
 {
 	struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev);
 
+	pci_dev->dev_flags &= ~PCI_DEV_FLAGS_NO_MSI_SHUTDOWN;
+
 	unregister_virtio_device(&vp_dev->vdev);
 
 	if (vp_dev->ioaddr)