Message ID | 1426577832-23164-1-git-send-email-jiang.liu@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
On Tue, Mar 17, 2015 at 03:37:12PM +0800, Jiang Liu wrote: > To support IOAPIC hot-removal, we need to release PCI interrupt resource > when unbinding PCI device driver. But due to historical reason, > /* > * We would love to complain here if pci_dev->is_enabled is set, that > * the driver should have called pci_disable_device(), but the > * unfortunate fact is there are too many odd BIOS and bridge setups > * that don't like drivers doing that all of the time. > * Oh well, we can dream of sane hardware when we sleep, no matter how > * horrible the crap we have to deal with is when we are awake... > */ Quoting the comment here (especially the last two lines) is overkill and obscures the real point. The important thing is that some drivers have legitimate reasons for not calling pci_disable_device(). > some drivers don't call pci_disable_device() when unloading, which > prevents us from reallocating PCI interrupt resource on reloading > PCI driver and causes regressions. This isn't very clear. I can believe that "drivers not calling pci_disable_device()" means we don't release IRQ resources, which might prevent you from hot-removing an IOAPIC. But "drivers not calling pci_disable_device()" doesn't cause regressions. > So release PCI interrupt resource only if PCI device is disabled when > unbinding. By this way, we could support IOAPIC hot-removal on latest > platforms and avoid regressions on old platforms. Does this mean you can only hot-remove IOAPICs if all drivers for devices using the IOAPIC call pci_disable_device()? If so, it seems sort of dubious that we have to rely on drivers for that. What happens if we try to hot-remove an IOAPIC where we haven't released all the IRQ resources? Is there a nice error message that will help us debug problem reports? This has nothing to do with "latest platforms" and "old platforms." That text pretends to convey information, but it doesn't. 
To be useful, it would have to say something specific about how "latest" and "old" platforms are different. I haven't even figured out what causes the regressions yet. I guess maybe it's the fact that after b4b55cda5874, we always call pcibios_disable_irq(), while before we only called it if the driver used pci_disable_device()? The changelog should be clear about this. > Please aslo refer to: "also" > https://bugzilla.kernel.org/show_bug.cgi?id=94721 > This apparently fixes something and needs a Fixes: tag to help people who might backport the broken commit. > Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> > Reported-by: Alex Williamson <alex.williamson@redhat.com> > Reported-by: Thomas Hellstrom <thellstrom@vmware.com> > Reviewed-by: Rafael J. Wysocki <rjw@rjwysocki.net> > --- > Hi Rafael, > I have assumed an Reviewed-by from you, is that OK? > Thanks! > Gerry > --- > arch/x86/pci/common.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c > index 3d2612b68694..8d792142cb2a 100644 > --- a/arch/x86/pci/common.c > +++ b/arch/x86/pci/common.c > @@ -527,7 +527,7 @@ static int pci_irq_notifier(struct notifier_block *nb, unsigned long action, I know the notifier was added by b4b55cda5874, not this patch, but I don't think it's the best mechanism. I would rather do something like calling pcibios_disable_irq() directly from pci_device_remove(). That way the call is more explicit, it's in arch-independent code, and it's more parallel with how we call pcibios_enable_irq() in the pci_enable_device() path. This code is all x86-specific. But other arches use IOAPIC, and there's nothing obviously x86-specific here. Won't they still have issues here? 
> if (action != BUS_NOTIFY_UNBOUND_DRIVER) > return NOTIFY_DONE; > > - if (pcibios_disable_irq) > + if (!pci_is_enabled(dev) && pcibios_disable_irq) > pcibios_disable_irq(dev); > > return NOTIFY_OK; > -- > 1.7.10.4 > -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2015/3/19 6:11, Bjorn Helgaas wrote: > On Tue, Mar 17, 2015 at 03:37:12PM +0800, Jiang Liu wrote: >> To support IOAPIC hot-removal, we need to release PCI interrupt resource >> when unbinding PCI device driver. But due to historical reason, >> /* >> * We would love to complain here if pci_dev->is_enabled is set, that >> * the driver should have called pci_disable_device(), but the >> * unfortunate fact is there are too many odd BIOS and bridge setups >> * that don't like drivers doing that all of the time. >> * Oh well, we can dream of sane hardware when we sleep, no matter how >> * horrible the crap we have to deal with is when we are awake... >> */ > > Quoting the comment here (especially the last two lines) is overkill and > obscures the real point. The important thing is that some drivers have > legitimate reasons for not calling pci_disable_device(). Hi Bjorn, Thanks for review. I will rewrite the commit message. >> some drivers don't call pci_disable_device() when unloading, which >> prevents us from reallocating PCI interrupt resource on reloading >> PCI driver and causes regressions. > > This isn't very clear. I can believe that "drivers not calling > pci_disable_device()" means we don't release IRQ resources, which might > prevent you from hot-removing an IOAPIC. > > But "drivers not calling pci_disable_device()" doesn't cause regressions. > >> So release PCI interrupt resource only if PCI device is disabled when >> unbinding. By this way, we could support IOAPIC hot-removal on latest >> platforms and avoid regressions on old platforms. > > Does this mean you can only hot-remove IOAPICs if all drivers for devices > using the IOAPIC call pci_disable_device()? If so, it seems sort of > dubious that we have to rely on drivers for that. This is a quickfix for v4.0 merging window. We will try to solve this issue for next merging window. > > What happens if we try to hot-remove an IOAPIC where we haven't released > all the IRQ resources? 
Is there a nice error message that will help us > debug problem reports? > > This has nothing to do with "latest platforms" and "old platforms." That > text pretends to convey information, but it doesn't. To be useful, it > would have to say something specific about how "latest" and "old" platforms > are different. > > I haven't even figured out what causes the regressions yet. I guess maybe > it's the fact that after b4b55cda5874, we always call pcibios_disable_irq(), > while before we only called it if the driver used pci_disable_device()? > The changelog should be clear about this. > >> Please aslo refer to: > > "also" > >> https://bugzilla.kernel.org/show_bug.cgi?id=94721 >> > > This apparently fixes something and needs a Fixes: tag to help people who > might backport the broken commit. > >> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> >> Reported-by: Alex Williamson <alex.williamson@redhat.com> >> Reported-by: Thomas Hellstrom <thellstrom@vmware.com> >> Reviewed-by: Rafael J. Wysocki <rjw@rjwysocki.net> >> --- >> Hi Rafael, >> I have assumed an Reviewed-by from you, is that OK? >> Thanks! >> Gerry >> --- >> arch/x86/pci/common.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c >> index 3d2612b68694..8d792142cb2a 100644 >> --- a/arch/x86/pci/common.c >> +++ b/arch/x86/pci/common.c >> @@ -527,7 +527,7 @@ static int pci_irq_notifier(struct notifier_block *nb, unsigned long action, > > I know the notifier was added by b4b55cda5874, not this patch, but I don't > think it's the best mechanism. I would rather do something like calling > pcibios_disable_irq() directly from pci_device_remove(). That way the call > is more explicit, it's in arch-independent code, and it's more parallel > with how we call pcibios_enable_irq() in the pci_enable_device() path. > > This code is all x86-specific. But other arches use IOAPIC, and there's > nothing obviously x86-specific here. 
Won't they still have issues here?

pcibios_enable_irq() and pcibios_disable_irq() are currently x86-specific, so I tried to keep this change x86-specific. On the other hand, we want to release the IOAPIC pin when a PCI device becomes unused rather than when it is removed, so that IRQ numbers are assigned only to PCI devices that are in use.

How about this commit message?
------------------------------------------------------------------
x86/PCI: Release PCI IRQ resource only if PCI device is disabled when unbinding

To support IOAPIC hot-removal, we need to track IOAPIC pin usage, that is, allocate a pin on demand and release it when unused. According to the original design, IOAPIC pins should be allocated and released as below:

pci_enable_device()
  -> pcibios_enable_device() when pci_dev->enable_cnt changes from 0 to 1
    -> pcibios_enable_irq()
      -> allocate IOAPIC pin
pci_disable_device()
  -> pcibios_disable_device() when pci_dev->enable_cnt changes from 1 to 0
    -> pcibios_disable_irq()
      -> release IOAPIC pin

But the above design conflicts with the PCI PM design. When suspending, a PCI device driver may call pci_disable_device() and eventually release the IOAPIC pin. When resuming, the PCI device driver calls pci_enable_device() and reallocates the IOAPIC pin. Since v3.19, the IOAPIC driver dynamically allocates the IRQ number for an IOAPIC pin. So when resuming, a different IRQ number may be assigned, which breaks some PCI drivers' suspend/resume implementations.

So commit ("x86/PCI: Refine the way to release PCI IRQ resources") tries to fix the PM regressions by releasing the IOAPIC pin unconditionally when unbinding the PCI driver, which causes new regressions on some old platforms because:
1) some PCI device drivers skip calling pci_disable_device() when unbinding due to BIOS flaws, which leaves pci_dev->enable_cnt non-zero after driver unbinding.
2) pci_enable_device() doesn't call pcibios_enable_irq() because pci_dev->enable_cnt is not zero when rebinding the device driver, thus no IOAPIC pin (IRQ number) is assigned to the PCI device after rebinding.

This patch implements a quick workaround which releases the IOAPIC pin only if pci_dev->enable_cnt is zero after driver unbinding.

A better solution would be to make IOAPIC pin allocation/releasing symmetric:
1) calling pcibios_enable_irq() on BUS_NOTIFY_BIND_DRIVER notification
2) calling pcibios_disable_irq() on BUS_NOTIFY_UNBOUND_DRIVER notification
That way IOAPIC pin allocation/releasing becomes independent of pci_dev->enable_cnt. We will try the symmetric solution for the next merge window.

Please also refer to:
https://bugzilla.kernel.org/show_bug.cgi?id=94721

Fixes: b4b55cda5874 ("x86/PCI: Refine the way to release PCI IRQ resources")
------------------------------------------------------------------
Thanks!
Gerry

>
>> 	if (action != BUS_NOTIFY_UNBOUND_DRIVER)
>> 		return NOTIFY_DONE;
>>
>> -	if (pcibios_disable_irq)
>> +	if (!pci_is_enabled(dev) && pcibios_disable_irq)
>> 		pcibios_disable_irq(dev);
>>
>> 	return NOTIFY_OK;
>> --
>> 1.7.10.4
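The rebinding failure described in the proposed changelog can be reproduced with a small userspace model. This is only a sketch under stated assumptions: every toy_* name is hypothetical and not a kernel API, enable_cnt stands in for pci_dev->enable_cnt, and has_irq stands in for "an IOAPIC pin (IRQ number) is assigned". It compares the unconditional release done since b4b55cda5874 with the check added by this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; none of these names are kernel APIs. */
struct toy_pci_dev {
    int enable_cnt;  /* models pci_dev->enable_cnt */
    bool has_irq;    /* true while an IOAPIC pin / IRQ is assigned */
};

/* Models pci_enable_device(): IRQ setup happens only on the 0 -> 1 transition. */
void toy_enable_device(struct toy_pci_dev *dev)
{
    if (dev->enable_cnt++ == 0)
        dev->has_irq = true;
}

/* Models pci_disable_device(): IRQ release happens only on the 1 -> 0 transition. */
void toy_disable_device(struct toy_pci_dev *dev)
{
    assert(dev->enable_cnt > 0);
    if (--dev->enable_cnt == 0)
        dev->has_irq = false;
}

/* Unbind hook that releases the IRQ unconditionally. */
void toy_unbind_unconditional(struct toy_pci_dev *dev)
{
    dev->has_irq = false;
}

/* Unbind hook with the proposed check: release only if the device is disabled. */
void toy_unbind_checked(struct toy_pci_dev *dev)
{
    if (dev->enable_cnt == 0)
        dev->has_irq = false;
}

/*
 * Bind a driver, unbind it through the given hook, then bind again.
 * driver_disables says whether the driver calls the disable function
 * in its remove path. Returns whether the device has an IRQ after rebinding.
 */
bool toy_irq_after_rebind(void (*unbind)(struct toy_pci_dev *),
                          bool driver_disables)
{
    struct toy_pci_dev dev = { 0, false };

    toy_enable_device(&dev);          /* first probe */
    if (driver_disables)
        toy_disable_device(&dev);     /* well-behaved remove path */
    unbind(&dev);                     /* unbind notification */
    toy_enable_device(&dev);          /* probe again after rebinding */
    return dev.has_irq;
}
```

In this model, a driver that skips the disable call leaves enable_cnt at 1, so the second enable call is not a 0 -> 1 transition and never sets the IRQ up again. With the unconditional hook the device ends up without an IRQ; with the checked hook the original IRQ is simply kept.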
On Thursday, March 19, 2015 03:49:33 PM Jiang Liu wrote: > On 2015/3/19 6:11, Bjorn Helgaas wrote: > > On Tue, Mar 17, 2015 at 03:37:12PM +0800, Jiang Liu wrote: > >> To support IOAPIC hot-removal, we need to release PCI interrupt resource > >> when unbinding PCI device driver. But due to historical reason, > >> /* > >> * We would love to complain here if pci_dev->is_enabled is set, that > >> * the driver should have called pci_disable_device(), but the > >> * unfortunate fact is there are too many odd BIOS and bridge setups > >> * that don't like drivers doing that all of the time. > >> * Oh well, we can dream of sane hardware when we sleep, no matter how > >> * horrible the crap we have to deal with is when we are awake... > >> */ > > > > Quoting the comment here (especially the last two lines) is overkill and > > obscures the real point. The important thing is that some drivers have > > legitimate reasons for not calling pci_disable_device(). > Hi Bjorn, > Thanks for review. I will rewrite the commit message. > >> some drivers don't call pci_disable_device() when unloading, which > >> prevents us from reallocating PCI interrupt resource on reloading > >> PCI driver and causes regressions. > > > > This isn't very clear. I can believe that "drivers not calling > > pci_disable_device()" means we don't release IRQ resources, which might > > prevent you from hot-removing an IOAPIC. > > > > But "drivers not calling pci_disable_device()" doesn't cause regressions. > > > >> So release PCI interrupt resource only if PCI device is disabled when > >> unbinding. By this way, we could support IOAPIC hot-removal on latest > >> platforms and avoid regressions on old platforms. > > > > Does this mean you can only hot-remove IOAPICs if all drivers for devices > > using the IOAPIC call pci_disable_device()? If so, it seems sort of > > dubious that we have to rely on drivers for that. > This is a quickfix for v4.0 merging window. 
We will try to solve this > issue for next merging window. If that is the plan, then I'd rather revert the offending commit and try again in the next cycle. Bjorn, what do you think?
On Thu, Mar 19, 2015 at 6:29 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > On Thursday, March 19, 2015 03:49:33 PM Jiang Liu wrote: >> On 2015/3/19 6:11, Bjorn Helgaas wrote: >> > On Tue, Mar 17, 2015 at 03:37:12PM +0800, Jiang Liu wrote: >> >> To support IOAPIC hot-removal, we need to release PCI interrupt resource >> >> when unbinding PCI device driver. But due to historical reason, >> >> /* >> >> * We would love to complain here if pci_dev->is_enabled is set, that >> >> * the driver should have called pci_disable_device(), but the >> >> * unfortunate fact is there are too many odd BIOS and bridge setups >> >> * that don't like drivers doing that all of the time. >> >> * Oh well, we can dream of sane hardware when we sleep, no matter how >> >> * horrible the crap we have to deal with is when we are awake... >> >> */ >> > >> > Quoting the comment here (especially the last two lines) is overkill and >> > obscures the real point. The important thing is that some drivers have >> > legitimate reasons for not calling pci_disable_device(). >> Hi Bjorn, >> Thanks for review. I will rewrite the commit message. >> >> some drivers don't call pci_disable_device() when unloading, which >> >> prevents us from reallocating PCI interrupt resource on reloading >> >> PCI driver and causes regressions. >> > >> > This isn't very clear. I can believe that "drivers not calling >> > pci_disable_device()" means we don't release IRQ resources, which might >> > prevent you from hot-removing an IOAPIC. >> > >> > But "drivers not calling pci_disable_device()" doesn't cause regressions. >> > >> >> So release PCI interrupt resource only if PCI device is disabled when >> >> unbinding. By this way, we could support IOAPIC hot-removal on latest >> >> platforms and avoid regressions on old platforms. >> > >> > Does this mean you can only hot-remove IOAPICs if all drivers for devices >> > using the IOAPIC call pci_disable_device()? 
If so, it seems sort of
>> > dubious that we have to rely on drivers for that.
>> This is a quickfix for v4.0 merging window. We will try to solve this
>> issue for next merging window.
>
> If that is the plan, then I'd rather revert the offending commit and try
> again in the next cycle.
>
> Bjorn, what do you think?

I don't know how hard it is to just revert that one commit at this point, but I would be in favor of doing that if it's feasible.

We're headed toward a real morass of changelogs for a design that seems destined for overhaul. That makes it really hard to backport and rework things later.

From the revised changelog:

>> When suspending, PCI
>> device driver may call pci_disable_device() and eventually release
>> IOAPIC pin. When resuming, PCI device driver call pci_enable_device()
>> and reallocating IOAPIC pin. Since v3.19, IOAPIC driver dynamically
>> allocates IRQ number for IOAPIC pin. So when resuming, a different
>> IRQ number may assigned, which breaks some PCI drivers' suspend/resume
>> implementation.

It seems like you're really standing on your head to make this situation work, and I think the result is too complicated and error-prone. One test is to see whether you can write a short, simple description of how driver writers need to manage IRQs with respect to probe/remove/suspend/resume.

There are two other possibilities I can see:

1) Decide that a driver that captures the IRQ and then calls pci_enable_device() is just broken, and fix those drivers to re-capture the IRQ every time they call pci_enable_device(). I assume you've looked at this already and concluded it's not practical?

2) Configure the IRQ in pci_device_probe(). Then it would be configured before the driver sees the device, and you could dispose of it in pci_device_remove() when the driver is unbound.

Does either of those make sense?

Bjorn
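The difference between tying the IRQ to pci_enable_device()/pci_disable_device() and tying it to driver bind/unbind (possibility 2 above) can be sketched in a userspace model. Everything here is an assumption made for illustration: the toy_* names are hypothetical, and the allocator always hands out a fresh number, which exaggerates the "a different IRQ number may be assigned" case from the changelog into "always different".

```c
#include <assert.h>

/* Hypothetical dynamic allocator: every allocation returns a fresh number. */
struct toy_irq_domain {
    int next_irq;
};

struct toy_pci_dev {
    int enable_cnt;  /* models pci_dev->enable_cnt */
    int irq;         /* 0 means "no IRQ assigned" */
};

static int toy_alloc_irq(struct toy_irq_domain *dom)
{
    return dom->next_irq++;
}

/* Design A: IRQ lifetime is coupled to enable/disable transitions. */
void toy_enable_coupled(struct toy_irq_domain *dom, struct toy_pci_dev *dev)
{
    if (dev->enable_cnt++ == 0)
        dev->irq = toy_alloc_irq(dom);
}

void toy_disable_coupled(struct toy_pci_dev *dev)
{
    assert(dev->enable_cnt > 0);
    if (--dev->enable_cnt == 0)
        dev->irq = 0;
}

/* Design B: IRQ lifetime is coupled to driver bind/unbind only. */
void toy_bind(struct toy_irq_domain *dom, struct toy_pci_dev *dev)
{
    dev->irq = toy_alloc_irq(dom);
}

void toy_unbind(struct toy_pci_dev *dev)
{
    dev->irq = 0;
}

void toy_enable_decoupled(struct toy_pci_dev *dev)
{
    dev->enable_cnt++;
}

void toy_disable_decoupled(struct toy_pci_dev *dev)
{
    assert(dev->enable_cnt > 0);
    dev->enable_cnt--;
}

/*
 * Probe, cache the IRQ number as a driver might, then run one
 * suspend (disable) / resume (enable) cycle.
 * Returns 1 if the cached number still matches the device's IRQ.
 */
int toy_cached_irq_survives_resume(int decoupled)
{
    struct toy_irq_domain dom = { 16 };
    struct toy_pci_dev dev = { 0, 0 };
    int cached;

    if (decoupled) {
        toy_bind(&dom, &dev);
        toy_enable_decoupled(&dev);
    } else {
        toy_enable_coupled(&dom, &dev);
    }
    cached = dev.irq;

    if (decoupled) {
        toy_disable_decoupled(&dev);   /* suspend */
        toy_enable_decoupled(&dev);    /* resume */
    } else {
        toy_disable_coupled(&dev);     /* suspend releases the IRQ */
        toy_enable_coupled(&dom, &dev);/* resume allocates a new one */
    }
    return dev.irq == cached;
}
```

Under these assumptions the coupled design hands the driver a new number after resume, while the bind/unbind design leaves the number untouched for as long as the driver stays bound.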
On Thursday, March 19, 2015 09:08:38 AM Bjorn Helgaas wrote: > On Thu, Mar 19, 2015 at 6:29 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > > On Thursday, March 19, 2015 03:49:33 PM Jiang Liu wrote: > >> On 2015/3/19 6:11, Bjorn Helgaas wrote: > >> > On Tue, Mar 17, 2015 at 03:37:12PM +0800, Jiang Liu wrote: > >> >> To support IOAPIC hot-removal, we need to release PCI interrupt resource > >> >> when unbinding PCI device driver. But due to historical reason, > >> >> /* > >> >> * We would love to complain here if pci_dev->is_enabled is set, that > >> >> * the driver should have called pci_disable_device(), but the > >> >> * unfortunate fact is there are too many odd BIOS and bridge setups > >> >> * that don't like drivers doing that all of the time. > >> >> * Oh well, we can dream of sane hardware when we sleep, no matter how > >> >> * horrible the crap we have to deal with is when we are awake... > >> >> */ > >> > > >> > Quoting the comment here (especially the last two lines) is overkill and > >> > obscures the real point. The important thing is that some drivers have > >> > legitimate reasons for not calling pci_disable_device(). > >> Hi Bjorn, > >> Thanks for review. I will rewrite the commit message. > >> >> some drivers don't call pci_disable_device() when unloading, which > >> >> prevents us from reallocating PCI interrupt resource on reloading > >> >> PCI driver and causes regressions. > >> > > >> > This isn't very clear. I can believe that "drivers not calling > >> > pci_disable_device()" means we don't release IRQ resources, which might > >> > prevent you from hot-removing an IOAPIC. > >> > > >> > But "drivers not calling pci_disable_device()" doesn't cause regressions. > >> > > >> >> So release PCI interrupt resource only if PCI device is disabled when > >> >> unbinding. By this way, we could support IOAPIC hot-removal on latest > >> >> platforms and avoid regressions on old platforms. 
> >> > > >> > Does this mean you can only hot-remove IOAPICs if all drivers for devices > >> > using the IOAPIC call pci_disable_device()? If so, it seems sort of > >> > dubious that we have to rely on drivers for that. > >> This is a quickfix for v4.0 merging window. We will try to solve this > >> issue for next merging window. > > > > If that is the plan, then I'd rather revert the offending commit and try > > again in the next cycle. > > > > Bjorn, what do you think? > > I don't know how hard it is to just revert that one commit at this > point, but I would be in favor of doing that if it's feasible. The commit reverts cleanly and reverting it won't break anything that used to work in 3.19 and earlier (Gerry, please let me know if that is not correct). The only adverse consequence of reverting it I can see would be that the IOAPIC hotplug won't work in 4.0, but it didn't work before either and it's supposed to be a new feature in 4.0. > We're headed toward a real morass of changelogs for a design that > seems destined for overhaul. That makes it really hard to backport > and rework things later. Precisely. Rafael
On 2015/3/19 22:08, Bjorn Helgaas wrote: > On Thu, Mar 19, 2015 at 6:29 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: >> On Thursday, March 19, 2015 03:49:33 PM Jiang Liu wrote: >>> On 2015/3/19 6:11, Bjorn Helgaas wrote: >>>> On Tue, Mar 17, 2015 at 03:37:12PM +0800, Jiang Liu wrote: >>>>> To support IOAPIC hot-removal, we need to release PCI interrupt resource >>>>> when unbinding PCI device driver. But due to historical reason, >>>>> /* >>>>> * We would love to complain here if pci_dev->is_enabled is set, that >>>>> * the driver should have called pci_disable_device(), but the >>>>> * unfortunate fact is there are too many odd BIOS and bridge setups >>>>> * that don't like drivers doing that all of the time. >>>>> * Oh well, we can dream of sane hardware when we sleep, no matter how >>>>> * horrible the crap we have to deal with is when we are awake... >>>>> */ >>>> >>>> Quoting the comment here (especially the last two lines) is overkill and >>>> obscures the real point. The important thing is that some drivers have >>>> legitimate reasons for not calling pci_disable_device(). >>> Hi Bjorn, >>> Thanks for review. I will rewrite the commit message. >>>>> some drivers don't call pci_disable_device() when unloading, which >>>>> prevents us from reallocating PCI interrupt resource on reloading >>>>> PCI driver and causes regressions. >>>> >>>> This isn't very clear. I can believe that "drivers not calling >>>> pci_disable_device()" means we don't release IRQ resources, which might >>>> prevent you from hot-removing an IOAPIC. >>>> >>>> But "drivers not calling pci_disable_device()" doesn't cause regressions. >>>> >>>>> So release PCI interrupt resource only if PCI device is disabled when >>>>> unbinding. By this way, we could support IOAPIC hot-removal on latest >>>>> platforms and avoid regressions on old platforms. >>>> >>>> Does this mean you can only hot-remove IOAPICs if all drivers for devices >>>> using the IOAPIC call pci_disable_device()? 
If so, it seems sort of >>>> dubious that we have to rely on drivers for that. >>> This is a quickfix for v4.0 merging window. We will try to solve this >>> issue for next merging window. >> >> If that is the plan, then I'd rather revert the offending commit and try >> again in the next cycle. >> >> Bjorn, what do you think? > > I don't know how hard it is to just revert that one commit at this > point, but I would be in favor of doing that if it's feasible. I will investigate about reverting. > > We're headed toward a real morass of changelogs for a design that > seems destined for overhaul. That makes it really hard to backport > and rework things later. > > From the revised changelog: > >>> When suspending, PCI >>> device driver may call pci_disable_device() and eventually release >>> IOAPIC pin. When resuming, PCI device driver call pci_enable_device() >>> and reallocating IOAPIC pin. Since v3.19, IOAPIC driver dynamically >>> allocates IRQ number for IOAPIC pin. So when resuming, a different >>> IRQ number may assigned, which breaks some PCI drivers' suspend/resume >>> implementation. > > It seems like you're really standing on your head to make this > situation work, and I think the result is too complicated and > error-prone. One test is to see whether you can write a short, simple > description of how driver writers need to manage IRQs with respect to > probe/remove/suspend/remove. > > There are two other possibilities I can see: > > 1) Decide that a driver that captures the IRQ and then calls > pci_enable_device() is just broken, and fix those drivers to > re-capture the IRQ every time they call pci_enable_device(). I assume > you've looked at this already and concluded it's not practical? > > 2) Configure the IRQ in pci_device_probe(). Then it would be > configured before the driver sees the device, and you could dispose of > it in pci_device_remove() when the driver is unbound. Actually I prefer solution 2 above. 
The key idea is to decouple IRQ resource allocation from pci_enable/disable_device(), so the IRQ resource will be allocated just before driver binding and released after driver unbinding.

One issue left is how to hook driver binding/unbinding events. Currently pcibios_enable/disable_irq() are x86-specific, so I use the PCI bus notification to hook driver binding/unbinding events. If you are OK with introducing two new weak functions pcibios_enable/disable_irq() into the PCI core, that's obviously a cleaner solution, easier to maintain, and may benefit other platforms in the future.

So should I introduce pcibios_enable/disable_irq() into the PCI core?
Thanks!
Gerry

>
> Does either of those make sense?
>
> Bjorn
On 2015/3/19 23:57, Rafael J. Wysocki wrote: > On Thursday, March 19, 2015 09:08:38 AM Bjorn Helgaas wrote: >> On Thu, Mar 19, 2015 at 6:29 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: >>> On Thursday, March 19, 2015 03:49:33 PM Jiang Liu wrote: >>>> On 2015/3/19 6:11, Bjorn Helgaas wrote: >>>>> On Tue, Mar 17, 2015 at 03:37:12PM +0800, Jiang Liu wrote: >>>>>> To support IOAPIC hot-removal, we need to release PCI interrupt resource >>>>>> when unbinding PCI device driver. But due to historical reason, >>>>>> /* >>>>>> * We would love to complain here if pci_dev->is_enabled is set, that >>>>>> * the driver should have called pci_disable_device(), but the >>>>>> * unfortunate fact is there are too many odd BIOS and bridge setups >>>>>> * that don't like drivers doing that all of the time. >>>>>> * Oh well, we can dream of sane hardware when we sleep, no matter how >>>>>> * horrible the crap we have to deal with is when we are awake... >>>>>> */ >>>>> >>>>> Quoting the comment here (especially the last two lines) is overkill and >>>>> obscures the real point. The important thing is that some drivers have >>>>> legitimate reasons for not calling pci_disable_device(). >>>> Hi Bjorn, >>>> Thanks for review. I will rewrite the commit message. >>>>>> some drivers don't call pci_disable_device() when unloading, which >>>>>> prevents us from reallocating PCI interrupt resource on reloading >>>>>> PCI driver and causes regressions. >>>>> >>>>> This isn't very clear. I can believe that "drivers not calling >>>>> pci_disable_device()" means we don't release IRQ resources, which might >>>>> prevent you from hot-removing an IOAPIC. >>>>> >>>>> But "drivers not calling pci_disable_device()" doesn't cause regressions. >>>>> >>>>>> So release PCI interrupt resource only if PCI device is disabled when >>>>>> unbinding. By this way, we could support IOAPIC hot-removal on latest >>>>>> platforms and avoid regressions on old platforms. 
>>>>> >>>>> Does this mean you can only hot-remove IOAPICs if all drivers for devices >>>>> using the IOAPIC call pci_disable_device()? If so, it seems sort of >>>>> dubious that we have to rely on drivers for that. >>>> This is a quickfix for v4.0 merging window. We will try to solve this >>>> issue for next merging window. >>> >>> If that is the plan, then I'd rather revert the offending commit and try >>> again in the next cycle. >>> >>> Bjorn, what do you think? >> >> I don't know how hard it is to just revert that one commit at this >> point, but I would be in favor of doing that if it's feasible. > > The commit reverts cleanly and reverting it won't break anything that used to > work in 3.19 and earlier (Gerry, please let me know if that is not correct). Yes, revert should not cause new issues. Commit b4b55cda5874("Refine the way to release PCI IRQ resources") is a bugfix for xen-pciback. But the bugfix causes regressions on other platform. So it would be better to revert it and fix the issue in another better way in next merging window. > > The only adverse consequence of reverting it I can see would be that the > IOAPIC hotplug won't work in 4.0, but it didn't work before either and > it's supposed to be a new feature in 4.0. IOAPIC hotplug may still work, it only causes regressions to some PCI drivers. > >> We're headed toward a real morass of changelogs for a design that >> seems destined for overhaul. That makes it really hard to backport >> and rework things later. > > Precisely. Sorry for the troubles. When designing IOAPIC hotplug, I found architect has provided suitable hook points for IOAPIC pin usage track, so I adopted hook points in pci_enable/disable_device(). But recent regression reports remind me that's wrong decision, so will rework it in new way. Thanks! 
Gerry
On Thu, Mar 19, 2015 at 10:09 PM, Jiang Liu <jiang.liu@linux.intel.com> wrote:
> On 2015/3/19 22:08, Bjorn Helgaas wrote:
>> There are two other possibilities I can see:
>>
>> 1) Decide that a driver that captures the IRQ and then calls
>> pci_enable_device() is just broken, and fix those drivers to
>> re-capture the IRQ every time they call pci_enable_device(). I assume
>> you've looked at this already and concluded it's not practical?

Did you look at this or not? I don't have any idea of the scope of the problem.

I think in general we want drivers to start from scratch whenever they call pci_enable_device(). They should not assume that BARs are the same, IRQs are the same, etc. If we ever want to dynamically reassign resources, e.g., move BARs around to accommodate new hotplugged devices, a path involving pci_enable_device() seems a likely route of having the drivers learn about resource changes.

>> 2) Configure the IRQ in pci_device_probe(). Then it would be
>> configured before the driver sees the device, and you could dispose of
>> it in pci_device_remove() when the driver is unbound.
> Actually I prefer solution 2 above. The key idea is to decouple
> IRQ resource allocation from pci_enable/disable_device(), so irq
> resource will be allocated just before driver binding and will
> be released after driver unbinding.

Solution 2 does have the advantage of making it simpler for driver writers.

One disadvantage is that it *forces* us to do IRQ allocation, even though it may not be required. There are drivers that don't need IRQs because they use MSI or don't need interrupts at all. If we do IRQ allocation before binding the driver, and the allocation fails, these drivers will no longer work even though they don't need the IRQs.

> One issue left is the way to hook driver binding/unbinding events.
> Currently pcibios_enable/disable_irq() are x86 specific, so I use
> PCI notification to hook driver binding/unbinding events.
> If you are OK with introducing two new weak functions
> pcibios_enable/disable_irq() into PCI core, that's obviously
> a clear solution, easier to maintain and may benefit other platforms
> too in future.
>
> So should I introduce pcibios_enable/disable_irq() into PCI core?

I think it would be better to add new weak interfaces than to use bus_register_notifier(). New interfaces are much more explicit when reading the code, and their ordering is very clearly defined.

Bjorn
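What "weak interfaces with clearly defined ordering" could look like is sketched below as a userspace model. It is not the PCI core and not a proposed patch: the toy_* names are hypothetical, and __attribute__((weak)) is the GCC/Clang extension that the kernel's __weak macro expands to. The weak defaults do nothing; an architecture would override them by providing strong definitions with the same names in another file.

```c
#include <assert.h>
#include <stddef.h>

/* Toy device; "bound" records whether a driver is currently attached. */
struct toy_pci_dev {
    int bound;
};

/* Weak default: architectures with nothing to set up succeed trivially. */
__attribute__((weak)) int toy_pcibios_enable_irq(struct toy_pci_dev *dev)
{
    (void)dev;
    return 0;
}

/* Weak default: nothing to release. */
__attribute__((weak)) void toy_pcibios_disable_irq(struct toy_pci_dev *dev)
{
    (void)dev;
}

typedef int (*toy_probe_fn)(struct toy_pci_dev *dev);

/*
 * Bind path: the IRQ hook runs before the driver's probe callback, and
 * is undone if the callback fails. A failing hook aborts the bind here,
 * which is exactly where the concern about drivers that only use MSI
 * or need no interrupts at all would apply.
 */
int toy_device_probe(struct toy_pci_dev *dev, toy_probe_fn drv_probe)
{
    int ret;

    assert(drv_probe != NULL);

    ret = toy_pcibios_enable_irq(dev);
    if (ret)
        return ret;

    ret = drv_probe(dev);
    if (ret) {
        toy_pcibios_disable_irq(dev);
        return ret;
    }

    dev->bound = 1;
    return 0;
}

/* Unbind path: the IRQ hook runs after the driver is detached. */
void toy_device_remove(struct toy_pci_dev *dev)
{
    dev->bound = 0;
    toy_pcibios_disable_irq(dev);
}

/* Two stand-in driver probe callbacks for exercising the paths above. */
int toy_drv_probe_ok(struct toy_pci_dev *dev)
{
    (void)dev;
    return 0;
}

int toy_drv_probe_fail(struct toy_pci_dev *dev)
{
    (void)dev;
    return -1;
}
```

Compared with a bus notifier, the calls sit directly in the bind and unbind paths, so a reader can see when the hooks run relative to the driver's own callbacks.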
On Friday, March 20, 2015 01:40:46 PM Jiang Liu wrote:
> On 2015/3/19 23:57, Rafael J. Wysocki wrote:
> > On Thursday, March 19, 2015 09:08:38 AM Bjorn Helgaas wrote:
> >> On Thu, Mar 19, 2015 at 6:29 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >>> On Thursday, March 19, 2015 03:49:33 PM Jiang Liu wrote:
> >>>> On 2015/3/19 6:11, Bjorn Helgaas wrote:
> >>>>> On Tue, Mar 17, 2015 at 03:37:12PM +0800, Jiang Liu wrote:
> >>>>>> To support IOAPIC hot-removal, we need to release PCI interrupt resource
> >>>>>> when unbinding PCI device driver. But due to historical reason,
> >>>>>> /*
> >>>>>>  * We would love to complain here if pci_dev->is_enabled is set, that
> >>>>>>  * the driver should have called pci_disable_device(), but the
> >>>>>>  * unfortunate fact is there are too many odd BIOS and bridge setups
> >>>>>>  * that don't like drivers doing that all of the time.
> >>>>>>  * Oh well, we can dream of sane hardware when we sleep, no matter how
> >>>>>>  * horrible the crap we have to deal with is when we are awake...
> >>>>>>  */
> >>>>>
> >>>>> Quoting the comment here (especially the last two lines) is overkill and
> >>>>> obscures the real point. The important thing is that some drivers have
> >>>>> legitimate reasons for not calling pci_disable_device().
> >>>> Hi Bjorn,
> >>>> Thanks for review. I will rewrite the commit message.
> >>>>>> some drivers don't call pci_disable_device() when unloading, which
> >>>>>> prevents us from reallocating PCI interrupt resource on reloading
> >>>>>> PCI driver and causes regressions.
> >>>>>
> >>>>> This isn't very clear. I can believe that "drivers not calling
> >>>>> pci_disable_device()" means we don't release IRQ resources, which might
> >>>>> prevent you from hot-removing an IOAPIC.
> >>>>>
> >>>>> But "drivers not calling pci_disable_device()" doesn't cause regressions.
> >>>>>
> >>>>>> So release PCI interrupt resource only if PCI device is disabled when
> >>>>>> unbinding. By this way, we could support IOAPIC hot-removal on latest
> >>>>>> platforms and avoid regressions on old platforms.
> >>>>>
> >>>>> Does this mean you can only hot-remove IOAPICs if all drivers for devices
> >>>>> using the IOAPIC call pci_disable_device()? If so, it seems sort of
> >>>>> dubious that we have to rely on drivers for that.
> >>>> This is a quickfix for v4.0 merging window. We will try to solve this
> >>>> issue for next merging window.
> >>>
> >>> If that is the plan, then I'd rather revert the offending commit and try
> >>> again in the next cycle.
> >>>
> >>> Bjorn, what do you think?
> >>
> >> I don't know how hard it is to just revert that one commit at this
> >> point, but I would be in favor of doing that if it's feasible.
> >
> > The commit reverts cleanly and reverting it won't break anything that used to
> > work in 3.19 and earlier (Gerry, please let me know if that is not correct).
> Yes, revert should not cause new issues.
> Commit b4b55cda5874("Refine the way to release PCI IRQ resources")
> is a bugfix for xen-pciback. But the bugfix causes regressions on
> other platform. So it would be better to revert it and fix the issue
> in another better way in next merging window.

OK, I've queued up a revert of b4b55cda5874 and I'm going to push it to Linus
for 4.0-rc5 later today.

Thanks!
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 3d2612b68694..8d792142cb2a 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -527,7 +527,7 @@ static int pci_irq_notifier(struct notifier_block *nb, unsigned long action,
 	if (action != BUS_NOTIFY_UNBOUND_DRIVER)
 		return NOTIFY_DONE;
 
-	if (pcibios_disable_irq)
+	if (!pci_is_enabled(dev) && pcibios_disable_irq)
 		pcibios_disable_irq(dev);
 
 	return NOTIFY_OK;
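The effect of the one-line change can be modeled in userspace C as a rough sketch. Everything here is a hypothetical stand-in: struct model_pci_dev, the MODEL_* constants and the model_* functions are not kernel interfaces. The sketch assumes "enabled" means an enable count above zero and that the release hook is an optional function pointer, as the NULL check in the patch suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical return codes and actions; values chosen for this sketch. */
enum { MODEL_NOTIFY_DONE = 0, MODEL_NOTIFY_OK = 1 };
enum { MODEL_UNBOUND_DRIVER = 1, MODEL_OTHER_ACTION = 2 };

/* Hypothetical stand-in for a PCI device. */
struct model_pci_dev {
	int enable_cnt; /* > 0: the driver left the device enabled */
	bool irq_held;  /* the IRQ resource is still allocated */
};

static bool model_is_enabled(const struct model_pci_dev *dev)
{
	return dev->enable_cnt > 0;
}

static void model_release_irq(struct model_pci_dev *dev)
{
	dev->irq_held = false;
}

/* Optional hook, NULL when the platform provides no release routine. */
static void (*model_disable_irq)(struct model_pci_dev *) = model_release_irq;

/* Models the patched notifier: release only if the device is disabled. */
int model_irq_notifier(struct model_pci_dev *dev, int action)
{
	if (action != MODEL_UNBOUND_DRIVER)
		return MODEL_NOTIFY_DONE;

	if (!model_is_enabled(dev) && model_disable_irq)
		model_disable_irq(dev);

	return MODEL_NOTIFY_OK;
}
```

In this model, a driver that unbinds without disabling the device keeps its IRQ resource, which is the reliance on driver behavior questioned in the review.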