Message ID | 92EBB4272BF81E4089A7126EC1E7B284665970B5@IRSMSX101.ger.corp.intel.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
Hi Mayurkumar, On Wed, Aug 17, 2016 at 01:42:18PM +0000, Patel, Mayurkumar wrote: > Currently, if very fast hotplug removal and insertion event comes > as following > > [ 608.823412] pciehp 0000:00:1c.1:pcie04: Card not present on Slot(1) > [ 608.835249] pciehp 0000:00:1c.1:pcie04: Card present on Slot(1) > > In this case following scenario happens, > > While removal: > pcie_isr() -> pciehp_queue_interrupt_event() -> triggers queue_work(). > work invokes interrupt_event_handler() -> case INT_PRESENCE_OFF > and calls handle_surprise_event(). > > handle_surprise_event() again calls pciehp_get_adapter_status() > and reads slot status which might have been changed > already due to PCI_EXP_SLTSTA_PDC event for hotplug insertion > has happened. So it queues, ENABLE_REQ for both removal > and insertion interrupt based on latest slot status. > > In this case, PCIe device can not be hot-add again because > it was never removed due to which device can not get enabled. > > handle_surprise_event() can be removed and pciehp_queue_power_work() > can be directly triggered based on INT_PRESENCE_ON and INT_PRESENCE_OFF > from the switch case exist in interrupt_event_hanlder(). > > The patch ensures the pciehp_queue_power_work() processes > presence detect change for removal and insertion correctly. > > Signed-off-by: Mayurkumar Patel <mayurkumar.patel@intel.com> > --- > Resending the patch addressing to PCI Maintainer Bjorn Helgaas. > > drivers/pci/hotplug/pciehp_ctrl.c | 18 ++---------------- > 1 file changed, 2 insertions(+), 16 deletions(-) > > diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c > index 880978b..87c5bea 100644 > --- a/drivers/pci/hotplug/pciehp_ctrl.c > +++ b/drivers/pci/hotplug/pciehp_ctrl.c > @@ -301,20 +301,6 @@ static void handle_button_press_event(struct slot *p_slot) > /* > * Note: This function must be called with slot->lock held > */ > -static void handle_surprise_event(struct slot *p_slot) > -{ > - u8 getstatus; > - > - pciehp_get_adapter_status(p_slot, &getstatus); > - if (!getstatus) > - pciehp_queue_power_work(p_slot, DISABLE_REQ); > - else > - pciehp_queue_power_work(p_slot, ENABLE_REQ); > -} > - > -/* > - * Note: This function must be called with slot->lock held > - */ > static void handle_link_event(struct slot *p_slot, u32 event) > { > struct controller *ctrl = p_slot->ctrl; > @@ -377,14 +363,14 @@ static void interrupt_event_handler(struct work_struct *work) > pciehp_green_led_off(p_slot); > break; > case INT_PRESENCE_ON: > - handle_surprise_event(p_slot); > + pciehp_queue_power_work(p_slot, ENABLE_REQ); > break; > case INT_PRESENCE_OFF: > /* > * Regardless of surprise capability, we need to > * definitely remove a card that has been pulled out! > */ > - handle_surprise_event(p_slot); > + pciehp_queue_power_work(p_slot, DISABLE_REQ); > break; > case INT_LINK_UP: > case INT_LINK_DOWN: Thanks a lot for this. I think other people have seen the same issue. Even with this fix, don't we have essentially the same problem one layer back? The first thing pcie_isr() does is read PCI_EXP_SLTSTA, then few lines down, we call pciehp_get_adapter_status(), which reads PCI_EXP_SLTSTA *again*. So I think the window is smaller but still there. I think what we really should do is read the status registers (PCI_EXP_SLTSTA and probably also PCI_EXP_LNKSTA) *once* in pcie_isr(), before we write PCI_EXP_SLTSTA to clear the RW1C bits there, and then queue up events based on those values, without re-reading the registers. What do you think? Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Aug 17, 2016 at 10:12 AM, Bjorn Helgaas <helgaas@kernel.org> wrote: > > Hi Mayurkumar, > > On Wed, Aug 17, 2016 at 01:42:18PM +0000, Patel, Mayurkumar wrote: > > Currently, if very fast hotplug removal and insertion event comes > > as following > > > > [ 608.823412] pciehp 0000:00:1c.1:pcie04: Card not present on Slot(1) > > [ 608.835249] pciehp 0000:00:1c.1:pcie04: Card present on Slot(1) > > > > In this case following scenario happens, > > > > While removal: > > pcie_isr() -> pciehp_queue_interrupt_event() -> triggers queue_work(). > > work invokes interrupt_event_handler() -> case INT_PRESENCE_OFF > > and calls handle_surprise_event(). > > > > handle_surprise_event() again calls pciehp_get_adapter_status() > > and reads slot status which might have been changed > > already due to PCI_EXP_SLTSTA_PDC event for hotplug insertion > > has happened. So it queues, ENABLE_REQ for both removal > > and insertion interrupt based on latest slot status. > > > > In this case, PCIe device can not be hot-add again because > > it was never removed due to which device can not get enabled. > > > > handle_surprise_event() can be removed and pciehp_queue_power_work() > > can be directly triggered based on INT_PRESENCE_ON and INT_PRESENCE_OFF > > from the switch case exist in interrupt_event_hanlder(). > > > > The patch ensures the pciehp_queue_power_work() processes > > presence detect change for removal and insertion correctly. > > > > Signed-off-by: Mayurkumar Patel <mayurkumar.patel@intel.com> Acked-by: Rajat Jain <rajatxjain@gmail.com> > > > --- > > Resending the patch addressing to PCI Maintainer Bjorn Helgaas. > > > > drivers/pci/hotplug/pciehp_ctrl.c | 18 ++---------------- > > 1 file changed, 2 insertions(+), 16 deletions(-) > > > > diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c > > index 880978b..87c5bea 100644 > > --- a/drivers/pci/hotplug/pciehp_ctrl.c > > +++ b/drivers/pci/hotplug/pciehp_ctrl.c > > @@ -301,20 +301,6 @@ static void handle_button_press_event(struct slot *p_slot) > > /* > > * Note: This function must be called with slot->lock held > > */ > > -static void handle_surprise_event(struct slot *p_slot) > > -{ > > - u8 getstatus; > > - > > - pciehp_get_adapter_status(p_slot, &getstatus); > > - if (!getstatus) > > - pciehp_queue_power_work(p_slot, DISABLE_REQ); > > - else > > - pciehp_queue_power_work(p_slot, ENABLE_REQ); > > -} > > - > > -/* > > - * Note: This function must be called with slot->lock held > > - */ > > static void handle_link_event(struct slot *p_slot, u32 event) > > { > > struct controller *ctrl = p_slot->ctrl; > > @@ -377,14 +363,14 @@ static void interrupt_event_handler(struct work_struct *work) > > pciehp_green_led_off(p_slot); > > break; > > case INT_PRESENCE_ON: > > - handle_surprise_event(p_slot); > > + pciehp_queue_power_work(p_slot, ENABLE_REQ); > > break; > > case INT_PRESENCE_OFF: > > /* > > * Regardless of surprise capability, we need to > > * definitely remove a card that has been pulled out! > > */ > > - handle_surprise_event(p_slot); > > + pciehp_queue_power_work(p_slot, DISABLE_REQ); > > break; > > case INT_LINK_UP: > > case INT_LINK_DOWN: > > Thanks a lot for this. I think other people have seen the same issue. > > Even with this fix, don't we have essentially the same problem one > layer back? The first thing pcie_isr() does is read PCI_EXP_SLTSTA, > then few lines down, we call pciehp_get_adapter_status(), which reads > PCI_EXP_SLTSTA *again*. So I think the window is smaller but still > there. > > I think what we really should do is read the status registers > (PCI_EXP_SLTSTA and probably also PCI_EXP_LNKSTA) *once* in > pcie_isr(), before we write PCI_EXP_SLTSTA to clear the RW1C bits > there, and then queue up events based on those values, without > re-reading the registers. > > What do you think? Yes, I agree. We need to do something about that *in addition * to the above patch to cover the whole story. However I think there still will be a room for some interrupt misses because we are collecting the interrupts in intr_loc, and theoretically we could be in a situation where in the pcie_isr, the do { ... } while(detected) loop gets a removal->insertion->removal all while in the same invocation of pcie_isr(). If this happens, the intr_loc will have recorded a single insertion and a single removal, and the final result will depend on the order in which we decide to process the events in intr_loc. Or, may be we can make the calls to pciehp_queue_interrupt_event() before clearing the RW1C in the slot status register (in the loop)? Thanks, rajat > > > Bjorn > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Rajat, thanks for chiming in! On Wed, Aug 17, 2016 at 10:54:12AM -0700, Rajat Jain wrote: > On Wed, Aug 17, 2016 at 10:12 AM, Bjorn Helgaas <helgaas@kernel.org> wrote: > > > > Hi Mayurkumar, > > > > On Wed, Aug 17, 2016 at 01:42:18PM +0000, Patel, Mayurkumar wrote: > > > Currently, if very fast hotplug removal and insertion event comes > > > as following > > > > > > [ 608.823412] pciehp 0000:00:1c.1:pcie04: Card not present on Slot(1) > > > [ 608.835249] pciehp 0000:00:1c.1:pcie04: Card present on Slot(1) > > > > > > In this case following scenario happens, > > > > > > While removal: > > > pcie_isr() -> pciehp_queue_interrupt_event() -> triggers queue_work(). > > > work invokes interrupt_event_handler() -> case INT_PRESENCE_OFF > > > and calls handle_surprise_event(). > > > > > > handle_surprise_event() again calls pciehp_get_adapter_status() > > > and reads slot status which might have been changed > > > already due to PCI_EXP_SLTSTA_PDC event for hotplug insertion > > > has happened. So it queues, ENABLE_REQ for both removal > > > and insertion interrupt based on latest slot status. > > > > > > In this case, PCIe device can not be hot-add again because > > > it was never removed due to which device can not get enabled. > > > > > > handle_surprise_event() can be removed and pciehp_queue_power_work() > > > can be directly triggered based on INT_PRESENCE_ON and INT_PRESENCE_OFF > > > from the switch case exist in interrupt_event_hanlder(). > > > > > > The patch ensures the pciehp_queue_power_work() processes > > > presence detect change for removal and insertion correctly. > > > > > > Signed-off-by: Mayurkumar Patel <mayurkumar.patel@intel.com> > > Acked-by: Rajat Jain <rajatxjain@gmail.com> > > > > > > --- > > > Resending the patch addressing to PCI Maintainer Bjorn Helgaas. > > > > > > drivers/pci/hotplug/pciehp_ctrl.c | 18 ++---------------- > > > 1 file changed, 2 insertions(+), 16 deletions(-) > > > > > > diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c > > > index 880978b..87c5bea 100644 > > > --- a/drivers/pci/hotplug/pciehp_ctrl.c > > > +++ b/drivers/pci/hotplug/pciehp_ctrl.c > > > @@ -301,20 +301,6 @@ static void handle_button_press_event(struct slot *p_slot) > > > /* > > > * Note: This function must be called with slot->lock held > > > */ > > > -static void handle_surprise_event(struct slot *p_slot) > > > -{ > > > - u8 getstatus; > > > - > > > - pciehp_get_adapter_status(p_slot, &getstatus); > > > - if (!getstatus) > > > - pciehp_queue_power_work(p_slot, DISABLE_REQ); > > > - else > > > - pciehp_queue_power_work(p_slot, ENABLE_REQ); > > > -} > > > - > > > -/* > > > - * Note: This function must be called with slot->lock held > > > - */ > > > static void handle_link_event(struct slot *p_slot, u32 event) > > > { > > > struct controller *ctrl = p_slot->ctrl; > > > @@ -377,14 +363,14 @@ static void interrupt_event_handler(struct work_struct *work) > > > pciehp_green_led_off(p_slot); > > > break; > > > case INT_PRESENCE_ON: > > > - handle_surprise_event(p_slot); > > > + pciehp_queue_power_work(p_slot, ENABLE_REQ); > > > break; > > > case INT_PRESENCE_OFF: > > > /* > > > * Regardless of surprise capability, we need to > > > * definitely remove a card that has been pulled out! > > > */ > > > - handle_surprise_event(p_slot); > > > + pciehp_queue_power_work(p_slot, DISABLE_REQ); > > > break; > > > case INT_LINK_UP: > > > case INT_LINK_DOWN: > > > > Thanks a lot for this. I think other people have seen the same issue. > > > > Even with this fix, don't we have essentially the same problem one > > layer back? The first thing pcie_isr() does is read PCI_EXP_SLTSTA, > > then few lines down, we call pciehp_get_adapter_status(), which reads > > PCI_EXP_SLTSTA *again*. So I think the window is smaller but still > > there. > > > > I think what we really should do is read the status registers > > (PCI_EXP_SLTSTA and probably also PCI_EXP_LNKSTA) *once* in > > pcie_isr(), before we write PCI_EXP_SLTSTA to clear the RW1C bits > > there, and then queue up events based on those values, without > > re-reading the registers. > > > > What do you think? > > > Yes, I agree. We need to do something about that *in addition * to the > above patch to cover the > whole story. However I think there still will be a room for some > interrupt misses because we are > collecting the interrupts in intr_loc, and theoretically we could be > in a situation where in the pcie_isr, the > > do { > ... > } while(detected) > > loop gets a removal->insertion->removal all while in the same > invocation of pcie_isr(). > If this happens, the intr_loc will have recorded a single insertion > and a single removal, and > the final result will depend on the order in which we decide to > process the events in intr_loc. I don't quite understand how that "do { .. } while (detected)" loop works or why it's done that way. Collecting interrupt status bits in an ISR is obviously a very common task; it seems like there should be a standard, idiomatic way of doing it, but I don't know it. > Or, may be we can make the calls to pciehp_queue_interrupt_event() > before clearing the > RW1C in the slot status register (in the loop)? Yeah, it seems like we should read PCI_EXP_SLTSTA once, queue up any events related to it, then clear the relevant SLTSTA bits. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Bjorn and Rajat Thanks for replying. > Hi Rajat, thanks for chiming in! > > On Wed, Aug 17, 2016 at 10:54:12AM -0700, Rajat Jain wrote: > > On Wed, Aug 17, 2016 at 10:12 AM, Bjorn Helgaas <helgaas@kernel.org> wrote: > > > > > > Hi Mayurkumar, > > > > > > On Wed, Aug 17, 2016 at 01:42:18PM +0000, Patel, Mayurkumar wrote: > > > > Currently, if very fast hotplug removal and insertion event comes > > > > as following > > > > > > > > [ 608.823412] pciehp 0000:00:1c.1:pcie04: Card not present on Slot(1) > > > > [ 608.835249] pciehp 0000:00:1c.1:pcie04: Card present on Slot(1) > > > > > > > > In this case following scenario happens, > > > > > > > > While removal: > > > > pcie_isr() -> pciehp_queue_interrupt_event() -> triggers queue_work(). > > > > work invokes interrupt_event_handler() -> case INT_PRESENCE_OFF > > > > and calls handle_surprise_event(). > > > > > > > > handle_surprise_event() again calls pciehp_get_adapter_status() > > > > and reads slot status which might have been changed > > > > already due to PCI_EXP_SLTSTA_PDC event for hotplug insertion > > > > has happened. So it queues, ENABLE_REQ for both removal > > > > and insertion interrupt based on latest slot status. > > > > > > > > In this case, PCIe device can not be hot-add again because > > > > it was never removed due to which device can not get enabled. > > > > > > > > handle_surprise_event() can be removed and pciehp_queue_power_work() > > > > can be directly triggered based on INT_PRESENCE_ON and INT_PRESENCE_OFF > > > > from the switch case exist in interrupt_event_hanlder(). > > > > > > > > The patch ensures the pciehp_queue_power_work() processes > > > > presence detect change for removal and insertion correctly. > > > > > > > > Signed-off-by: Mayurkumar Patel <mayurkumar.patel@intel.com> > > > > Acked-by: Rajat Jain <rajatxjain@gmail.com> > > > > > > > > > --- > > > > Resending the patch addressing to PCI Maintainer Bjorn Helgaas. > > > > > > > > drivers/pci/hotplug/pciehp_ctrl.c | 18 ++---------------- > > > > 1 file changed, 2 insertions(+), 16 deletions(-) > > > > > > > > diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c > > > > index 880978b..87c5bea 100644 > > > > --- a/drivers/pci/hotplug/pciehp_ctrl.c > > > > +++ b/drivers/pci/hotplug/pciehp_ctrl.c > > > > @@ -301,20 +301,6 @@ static void handle_button_press_event(struct slot *p_slot) > > > > /* > > > > * Note: This function must be called with slot->lock held > > > > */ > > > > -static void handle_surprise_event(struct slot *p_slot) > > > > -{ > > > > - u8 getstatus; > > > > - > > > > - pciehp_get_adapter_status(p_slot, &getstatus); > > > > - if (!getstatus) > > > > - pciehp_queue_power_work(p_slot, DISABLE_REQ); > > > > - else > > > > - pciehp_queue_power_work(p_slot, ENABLE_REQ); > > > > -} > > > > - > > > > -/* > > > > - * Note: This function must be called with slot->lock held > > > > - */ > > > > static void handle_link_event(struct slot *p_slot, u32 event) > > > > { > > > > struct controller *ctrl = p_slot->ctrl; > > > > @@ -377,14 +363,14 @@ static void interrupt_event_handler(struct work_struct *work) > > > > pciehp_green_led_off(p_slot); > > > > break; > > > > case INT_PRESENCE_ON: > > > > - handle_surprise_event(p_slot); > > > > + pciehp_queue_power_work(p_slot, ENABLE_REQ); > > > > break; > > > > case INT_PRESENCE_OFF: > > > > /* > > > > * Regardless of surprise capability, we need to > > > > * definitely remove a card that has been pulled out! > > > > */ > > > > - handle_surprise_event(p_slot); > > > > + pciehp_queue_power_work(p_slot, DISABLE_REQ); > > > > break; > > > > case INT_LINK_UP: > > > > case INT_LINK_DOWN: > > > > > > Thanks a lot for this. I think other people have seen the same issue. > > > > > > Even with this fix, don't we have essentially the same problem one > > > layer back? The first thing pcie_isr() does is read PCI_EXP_SLTSTA, > > > then few lines down, we call pciehp_get_adapter_status(), which reads > > > PCI_EXP_SLTSTA *again*. So I think the window is smaller but still > > > there. > > > > > > I think what we really should do is read the status registers > > > (PCI_EXP_SLTSTA and probably also PCI_EXP_LNKSTA) *once* in > > > pcie_isr(), before we write PCI_EXP_SLTSTA to clear the RW1C bits > > > there, and then queue up events based on those values, without > > > re-reading the registers. > > > > > > What do you think? > > > > > > Yes, I agree. Yes indeed that should be done too. > > We need to do something about that *in addition * to the > > above patch to cover the > > whole story. However I think there still will be a room for some > > interrupt misses because we are > > collecting the interrupts in intr_loc, and theoretically we could be > > in a situation where in the pcie_isr, the > > > > do { > > ... > > } while(detected) > > > > loop gets a removal->insertion->removal all while in the same > > invocation of pcie_isr(). > > If this happens, the intr_loc will have recorded a single insertion > > and a single removal, and > > the final result will depend on the order in which we decide to > > process the events in intr_loc. > > I don't quite understand how that "do { .. } while (detected)" loop > works or why it's done that way. Collecting interrupt status bits in > an ISR is obviously a very common task; it seems like there should be > a standard, idiomatic way of doing it, but I don't know it. > > > Or, may be we can make the calls to pciehp_queue_interrupt_event() > > before clearing the > > RW1C in the slot status register (in the loop)? > > Yeah, it seems like we should read PCI_EXP_SLTSTA once, queue up any > events related to it, then clear the relevant SLTSTA bits. > Do you mean to remove the do {...} while loop and just read PCI_EXP_SLTSTA once in ISR , queue the work and clear interrupts? > Bjorn Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de Managing Directors: Christin Eisenschmid, Christian Lamprechter Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928 -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Aug 17, 2016 at 10:37:13PM +0000, Patel, Mayurkumar wrote: > > On Wed, Aug 17, 2016 at 10:54:12AM -0700, Rajat Jain wrote: > > > We need to do something about that *in addition * to the > > > above patch to cover the > > > whole story. However I think there still will be a room for some > > > interrupt misses because we are > > > collecting the interrupts in intr_loc, and theoretically we could be > > > in a situation where in the pcie_isr, the > > > > > > do { > > > ... > > > } while(detected) > > > > > > loop gets a removal->insertion->removal all while in the same > > > invocation of pcie_isr(). > > > If this happens, the intr_loc will have recorded a single insertion > > > and a single removal, and > > > the final result will depend on the order in which we decide to > > > process the events in intr_loc. > > > > I don't quite understand how that "do { .. } while (detected)" loop > > works or why it's done that way. Collecting interrupt status bits in > > an ISR is obviously a very common task; it seems like there should be > > a standard, idiomatic way of doing it, but I don't know it. > > > > > Or, may be we can make the calls to pciehp_queue_interrupt_event() > > > before clearing the > > > RW1C in the slot status register (in the loop)? > > > > Yeah, it seems like we should read PCI_EXP_SLTSTA once, queue up any > > events related to it, then clear the relevant SLTSTA bits. > > > > Do you mean to remove the do {...} while loop and just > read PCI_EXP_SLTSTA once in ISR , queue the work and clear interrupts? I don't know if removing the loop is the right thing or not. We need to understand why the loop is there in the first place and make sure removing it wouldn't break things. But I do think that in the resulting code, the connection between (1) the events we learn from the interrupt and (2) the queued work items needs to be crystal clear. Right now it's a bit muddy because of things like the case you fixed: a work item that goes back and looks at PCI_EXP_SLTSTA after it's been cleared and the hardware may have already set bits for new events. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> On Wed, Aug 17, 2016 at 10:37:13PM +0000, Patel, Mayurkumar wrote: > > > On Wed, Aug 17, 2016 at 10:54:12AM -0700, Rajat Jain wrote: > > > > > We need to do something about that *in addition * to the > > > > above patch to cover the > > > > whole story. However I think there still will be a room for some > > > > interrupt misses because we are > > > > collecting the interrupts in intr_loc, and theoretically we could be > > > > in a situation where in the pcie_isr, the > > > > > > > > do { > > > > ... > > > > } while(detected) > > > > > > > > loop gets a removal->insertion->removal all while in the same > > > > invocation of pcie_isr(). > > > > If this happens, the intr_loc will have recorded a single insertion > > > > and a single removal, and > > > > the final result will depend on the order in which we decide to > > > > process the events in intr_loc. > > > > > > I don't quite understand how that "do { .. } while (detected)" loop > > > works or why it's done that way. Collecting interrupt status bits in > > > an ISR is obviously a very common task; it seems like there should be > > > a standard, idiomatic way of doing it, but I don't know it. > > > > > > > Or, may be we can make the calls to pciehp_queue_interrupt_event() > > > > before clearing the > > > > RW1C in the slot status register (in the loop)? > > > > > > Yeah, it seems like we should read PCI_EXP_SLTSTA once, queue up any > > > events related to it, then clear the relevant SLTSTA bits. > > > > > > > Do you mean to remove the do {...} while loop and just > > read PCI_EXP_SLTSTA once in ISR , queue the work and clear interrupts? > > I don't know if removing the loop is the right thing or not. We need > to understand why the loop is there in the first place and make sure > removing it wouldn't break things. > The pcie base spec. 3.0 says from chapter 6.7.3.1. Slot Events: A Downstream Port with hot-plug capabilities monitors the slot it controls for the slot events listed above. When one of these slot events is detected, the Port indicates that the event has occurred by setting the status field associated with the event. At that point, the event is pending until software clears the status field. Once a slot event is pending on a particular slot, all subsequent events of that type are ignored on that slot until the event is cleared. The Port must continue to monitor the slot for all other slot event types and report them as they occur. So does it mean that the port would continue providing MSI if there has been any other events occurred apart from the event which is not cleared? If that is the case then it's not sure why the loop is still needed. Also with having a loop, pcie_isr() is reading the PCI_EXP_SLTSTA again and taking action just based on latest event happens. That may mean removal->insertion happens fast enough then removal could be overwritten by insertion and pciehp_get_adapter_status () will return insertion only and we may miss the removal event to be processed for the loop. If we don't clear the Slot event quick enough then according to spec. statement above that event could get ignored and SW may never get notified. > But I do think that in the resulting code, the connection between > > (1) the events we learn from the interrupt and > (2) the queued work items > > needs to be crystal clear. Right now it's a bit muddy because of > things like the case you fixed: a work item that goes back and looks > at PCI_EXP_SLTSTA after it's been cleared and the hardware may have > already set bits for new events. > I have made a prototype patch in my follow up reply and tested ok on my existing setup on which I caught the previous issue although I cant say it will work on any HW. Clearing events after queuing events gave some problems and did not work properly on my existing HW where I test very fast insertion and removal events, in which case only "removal" event comes and "insertion" does not occur even if HW gets powered. > Bjorn Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de Managing Directors: Christin Eisenschmid, Christian Lamprechter Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928 -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Aug 18, 2016 at 1:59 PM, Patel, Mayurkumar <mayurkumar.patel@intel.com> wrote: > > > > > On Wed, Aug 17, 2016 at 10:37:13PM +0000, Patel, Mayurkumar wrote: > > > > On Wed, Aug 17, 2016 at 10:54:12AM -0700, Rajat Jain wrote: > > > > > > > We need to do something about that *in addition * to the > > > > > above patch to cover the > > > > > whole story. However I think there still will be a room for some > > > > > interrupt misses because we are > > > > > collecting the interrupts in intr_loc, and theoretically we could be > > > > > in a situation where in the pcie_isr, the > > > > > > > > > > do { > > > > > ... > > > > > } while(detected) > > > > > > > > > > loop gets a removal->insertion->removal all while in the same > > > > > invocation of pcie_isr(). > > > > > If this happens, the intr_loc will have recorded a single insertion > > > > > and a single removal, and > > > > > the final result will depend on the order in which we decide to > > > > > process the events in intr_loc. > > > > > > > > I don't quite understand how that "do { .. } while (detected)" loop > > > > works or why it's done that way. Collecting interrupt status bits in > > > > an ISR is obviously a very common task; it seems like there should be > > > > a standard, idiomatic way of doing it, but I don't know it. > > > > > > > > > Or, may be we can make the calls to pciehp_queue_interrupt_event() > > > > > before clearing the > > > > > RW1C in the slot status register (in the loop)? > > > > > > > > Yeah, it seems like we should read PCI_EXP_SLTSTA once, queue up any > > > > events related to it, then clear the relevant SLTSTA bits. > > > > > > > > > > Do you mean to remove the do {...} while loop and just > > > read PCI_EXP_SLTSTA once in ISR , queue the work and clear interrupts? > > > > I don't know if removing the loop is the right thing or not. We need > > to understand why the loop is there in the first place and make sure > > removing it wouldn't break things. > > > > > The pcie base spec. 3.0 says from chapter 6.7.3.1. Slot Events: > > A Downstream Port with hot-plug capabilities monitors the slot it controls for the slot events listed > above. When one of these slot events is detected, the Port indicates that the event has occurred by > setting the status field associated with the event. At that point, the event is pending until software > clears the status field. > Once a slot event is pending on a particular slot, all subsequent events of that type are ignored on > that slot until the event is cleared. The Port must continue to monitor the slot for all other slot > event types and report them as they occur. > > > So does it mean that the port would continue providing MSI if there has been > any other events occurred apart from the event which is not cleared? If that > is the case then it's not sure why the loop is still needed. I'd think that the loop is needed because we don't want to handle just one event on one interrupt. We want to handle as many events as we can (to keep the interrupt latency overhead low), that have happened during the ISR invocation. And hence the loop. > > Also with having a loop, pcie_isr() is reading the PCI_EXP_SLTSTA again and taking action just based on > latest event happens. That may mean removal->insertion happens fast enough then > removal could be overwritten by insertion and pciehp_get_adapter_status () > will return insertion only and we may miss the removal event to be processed for the loop. Current. This is the problem that Bjorn highlighted above. I think the current code works well when events of different kinds have happened (due to the loop). In case of similar kind of events (insertion / removal), I think there are 2 problems: (1) we use the transient values of slot status which may not be valid at that time. (2) because we ack the event before we queue it up, it leaves chance for error. I think we'd need more careful examination in order to fix these issues. > > If we don't clear the Slot event quick enough then according to spec. statement above > that event could get ignored and SW may never get notified. > > > > > But I do think that in the resulting code, the connection between > > > > (1) the events we learn from the interrupt and > > (2) the queued work items > > > > needs to be crystal clear. Right now it's a bit muddy because of > > things like the case you fixed: a work item that goes back and looks > > at PCI_EXP_SLTSTA after it's been cleared and the hardware may have > > already set bits for new events. > > > > I have made a prototype patch in my follow up reply and tested ok on my existing setup > on which I caught the previous issue although I cant say it will work on any HW. > Clearing events after queuing events gave some problems and did not work properly on my > existing HW where I test very fast insertion and removal events, in which > case only "removal" event comes and "insertion" does not occur even if HW gets powered. > > > Bjorn > Intel Deutschland GmbH > Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany > Tel: +49 89 99 8853-0, www.intel.de > Managing Directors: Christin Eisenschmid, Christian Lamprechter > Chairperson of the Supervisory Board: Nicole Lau > Registered Office: Munich > Commercial Register: Amtsgericht Muenchen HRB 186928 > -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > > > On Wed, Aug 17, 2016 at 10:37:13PM +0000, Patel, Mayurkumar wrote: > > > > > On Wed, Aug 17, 2016 at 10:54:12AM -0700, Rajat Jain wrote: > > > > > > > > > We need to do something about that *in addition * to the > > > > > > above patch to cover the > > > > > > whole story. However I think there still will be a room for some > > > > > > interrupt misses because we are > > > > > > collecting the interrupts in intr_loc, and theoretically we could be > > > > > > in a situation where in the pcie_isr, the > > > > > > > > > > > > do { > > > > > > ... > > > > > > } while(detected) > > > > > > > > > > > > loop gets a removal->insertion->removal all while in the same > > > > > > invocation of pcie_isr(). > > > > > > If this happens, the intr_loc will have recorded a single insertion > > > > > > and a single removal, and > > > > > > the final result will depend on the order in which we decide to > > > > > > process the events in intr_loc. > > > > > > > > > > I don't quite understand how that "do { .. } while (detected)" loop > > > > > works or why it's done that way. Collecting interrupt status bits in > > > > > an ISR is obviously a very common task; it seems like there should be > > > > > a standard, idiomatic way of doing it, but I don't know it. > > > > > > > > > > > Or, may be we can make the calls to pciehp_queue_interrupt_event() > > > > > > before clearing the > > > > > > RW1C in the slot status register (in the loop)? > > > > > > > > > > Yeah, it seems like we should read PCI_EXP_SLTSTA once, queue up any > > > > > events related to it, then clear the relevant SLTSTA bits. > > > > > > > > > > > > > Do you mean to remove the do {...} while loop and just > > > > read PCI_EXP_SLTSTA once in ISR , queue the work and clear interrupts? > > > > > > I don't know if removing the loop is the right thing or not. We need > > > to understand why the loop is there in the first place and make sure > > > removing it wouldn't break things. > > > > > > > > > The pcie base spec. 3.0 says from chapter 6.7.3.1. Slot Events: > > > > A Downstream Port with hot-plug capabilities monitors the slot it controls for the slot events listed > > above. When one of these slot events is detected, the Port indicates that the event has occurred by > > setting the status field associated with the event. At that point, the event is pending until software > > clears the status field. > > Once a slot event is pending on a particular slot, all subsequent events of that type are ignored on > > that slot until the event is cleared. The Port must continue to monitor the slot for all other slot > > event types and report them as they occur. > > > > > > So does it mean that the port would continue providing MSI if there has been > > any other events occurred apart from the event which is not cleared? If that > > is the case then it's not sure why the loop is still needed. > > I'd think that the loop is needed because we don't want to handle just > one event on one interrupt. We want to handle as many events as we can > (to keep the interrupt latency overhead low), that have happened > during the ISR invocation. And hence the loop. > Hmm, Right. I created a proposal patch (=>[v1,2/2] PCI: pciehp: Rework hotplug interrupt routine) to remove the do {...} while loop which I guess then should be rejected, right? As we decide to keep this loop? > > > > Also with having a loop, pcie_isr() is reading the PCI_EXP_SLTSTA again and taking action just based on > > latest event happens. That may mean removal->insertion happens fast enough then > > removal could be overwritten by insertion and pciehp_get_adapter_status () > > will return insertion only and we may miss the removal event to be processed for the loop. > > Current. This is the problem that Bjorn highlighted above. I think the > current code works well when events of different kinds have happened > (due to the loop). > > In case of similar kind of events (insertion / removal), I think there > are 2 problems: (1) we use the transient values of slot status which > may not be valid at that time. (2) because we ack the event before we > queue it up, it leaves chance for error. I think we'd need more > careful examination in order to fix these issues. > I also see 3rd problem too with this Loop. For example, if we are in the loop, If only slot removal comes -> detect = PCI_EXP_SLTSTA_PDC -> intr_lock |= detected -> detected==true so clear PDC interrupt bit => So we have a chance to detect another insertion event. We are in the same loop & if insertion comes again -> so detected = PCI_EXP_SLTSTA_PDC -> then detected &= ~intr_loc -> detected == false => This means PCI_EXP_SLTSTA would not be cleared for insertion event happened and there we could get a problem that No further PDC events can get triggered by HW. So if we want to keep the loop, I think clearing PCI_EXP_SLTSTA should be shifted outside of the loop to clear Slot events(but not based on detected) to ensure that loop just looks for the events Which has not occurred, also before clearing SLTSTA outside the loop, we can update link and slot_status to trigger the right events for DLLSC and PDC types. What do you think about following proposal? @@ -542,7 +542,7 @@ static irqreturn_t pcie_isr(int irq, void *dev_id) struct pci_bus *subordinate = pdev->subordinate; struct pci_dev *dev; struct slot *slot = ctrl->slot; - u16 detected, intr_loc; + u16 detected, intr_loc, slot_status; u8 present; bool link; @@ -553,26 +553,30 @@ static irqreturn_t pcie_isr(int irq, void *dev_id) */ intr_loc = 0; do { - pcie_capability_read_word(pdev, PCI_EXP_SLTSTA, &detected); - if (detected == (u16) ~0) { + pcie_capability_read_word(pdev, PCI_EXP_SLTSTA, &slot_status); + if (slot_status == (u16) ~0) { ctrl_info(ctrl, "%s: no response from device\n", __func__); return IRQ_HANDLED; } - detected &= (PCI_EXP_SLTSTA_ABP | PCI_EXP_SLTSTA_PFD | + detected = slot_status & (PCI_EXP_SLTSTA_ABP | + PCI_EXP_SLTSTA_PFD | PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_CC | PCI_EXP_SLTSTA_DLLSC); detected &= ~intr_loc; intr_loc |= detected; if (!intr_loc) return IRQ_NONE; - if (detected) - pcie_capability_write_word(pdev, PCI_EXP_SLTSTA, - intr_loc); } while (detected); ctrl_dbg(ctrl, "pending interrupts %#06x from Slot Status\n", intr_loc); + if (intr_loc & PCI_EXP_SLTSTA_PDC) + present = !!(slot_status & PCI_EXP_SLTSTA_PDS); + if (intr_loc & PCI_EXP_SLTSTA_DLLSC) + link = pciehp_check_link_active(ctrl); + + pcie_capability_write_word(pdev, PCI_EXP_SLTSTA, intr_loc); /* Check Command Complete Interrupt Pending */ if (intr_loc & PCI_EXP_SLTSTA_CC) { @@ -603,7 +607,6 @@ static irqreturn_t pcie_isr(int irq, void *dev_id) /* Check Presence Detect Changed */ if (intr_loc & PCI_EXP_SLTSTA_PDC) { - pciehp_get_adapter_status(slot, &present); ctrl_info(ctrl, "Card %spresent on Slot(%s)\n", present ? "" : "not ", slot_name(slot)); pciehp_queue_interrupt_event(slot, present ? INT_PRESENCE_ON : @@ -618,7 +621,6 @@ static irqreturn_t pcie_isr(int irq, void *dev_id) } if (intr_loc & PCI_EXP_SLTSTA_DLLSC) { - link = pciehp_check_link_active(ctrl); ctrl_info(ctrl, "slot(%s): Link %s event\n", slot_name(slot), link ? "Up" : "Down"); pciehp_queue_interrupt_event(slot, link ? INT_LINK_UP : > > > > If we don't clear the Slot event quick enough then according to spec. statement above > > that event could get ignored and SW may never get notified. > > > > > > > > > But I do think that in the resulting code, the connection between > > > > > > (1) the events we learn from the interrupt and > > > (2) the queued work items > > > > > > needs to be crystal clear. Right now it's a bit muddy because of > > > things like the case you fixed: a work item that goes back and looks > > > at PCI_EXP_SLTSTA after it's been cleared and the hardware may have > > > already set bits for new events. > > > > > > > I have made a prototype patch in my follow up reply and tested ok on my existing setup > > on which I caught the previous issue although I cant say it will work on any HW. > > Clearing events after queuing events gave some problems and did not work properly on my > > existing HW where I test very fast insertion and removal events, in which > > case only "removal" event comes and "insertion" does not occur even if HW gets powered. > > Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de Managing Directors: Christin Eisenschmid, Christian Lamprechter Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928
On Tue, Aug 23, 2016 at 04:47:44PM -0700, Rajat Jain wrote: > On Thu, Aug 18, 2016 at 1:59 PM, Patel, Mayurkumar > <mayurkumar.patel@intel.com> wrote: > > > > So does it mean that the port would continue providing MSI if there has been > > any other events occurred apart from the event which is not cleared? If that > > is the case then it's not sure why the loop is still needed. > > I'd think that the loop is needed because we don't want to handle just > one event on one interrupt. We want to handle as many events as we can > (to keep the interrupt latency overhead low), that have happened > during the ISR invocation. And hence the loop. The Slot Status doesn't have to provide a single event at a time. But even if it does, this isn't exactly a performance path that would be hurt by having one interrupt per event, right? -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > > > > > > On Wed, Aug 17, 2016 at 10:37:13PM +0000, Patel, Mayurkumar wrote: > > > > > > On Wed, Aug 17, 2016 at 10:54:12AM -0700, Rajat Jain wrote: > > > > > > > > > > > We need to do something about that *in addition * to the > > > > > > > above patch to cover the > > > > > > > whole story. However I think there still will be a room for some > > > > > > > interrupt misses because we are > > > > > > > collecting the interrupts in intr_loc, and theoretically we could be > > > > > > > in a situation where in the pcie_isr, the > > > > > > > > > > > > > > do { > > > > > > > ... > > > > > > > } while(detected) > > > > > > > > > > > > > > loop gets a removal->insertion->removal all while in the same > > > > > > > invocation of pcie_isr(). > > > > > > > If this happens, the intr_loc will have recorded a single insertion > > > > > > > and a single removal, and > > > > > > > the final result will depend on the order in which we decide to > > > > > > > process the events in intr_loc. > > > > > > > > > > > > I don't quite understand how that "do { .. } while (detected)" loop > > > > > > works or why it's done that way. Collecting interrupt status bits in > > > > > > an ISR is obviously a very common task; it seems like there should be > > > > > > a standard, idiomatic way of doing it, but I don't know it. > > > > > > > > > > > > > Or, may be we can make the calls to pciehp_queue_interrupt_event() > > > > > > > before clearing the > > > > > > > RW1C in the slot status register (in the loop)? > > > > > > > > > > > > Yeah, it seems like we should read PCI_EXP_SLTSTA once, queue up any > > > > > > events related to it, then clear the relevant SLTSTA bits. > > > > > > > > > > > > > > > > Do you mean to remove the do {...} while loop and just > > > > > read PCI_EXP_SLTSTA once in ISR , queue the work and clear interrupts? > > > > > > > > I don't know if removing the loop is the right thing or not. We need > > > > to understand why the loop is there in the first place and make sure > > > > removing it wouldn't break things. > > > > > > > > > > > > > The pcie base spec. 3.0 says from chapter 6.7.3.1. Slot Events: > > > > > > A Downstream Port with hot-plug capabilities monitors the slot it controls for the slot events listed > > > above. When one of these slot events is detected, the Port indicates that the event has occurred by > > > setting the status field associated with the event. At that point, the event is pending until software > > > clears the status field. > > > Once a slot event is pending on a particular slot, all subsequent events of that type are ignored on > > > that slot until the event is cleared. The Port must continue to monitor the slot for all other slot > > > event types and report them as they occur. > > > > > > > > > So does it mean that the port would continue providing MSI if there has been > > > any other events occurred apart from the event which is not cleared? If that > > > is the case then it's not sure why the loop is still needed. > > > > I'd think that the loop is needed because we don't want to handle just > > one event on one interrupt. We want to handle as many events as we can > > (to keep the interrupt latency overhead low), that have happened > > during the ISR invocation. And hence the loop. > > > > Hmm, Right. I created a proposal patch > (=>[v1,2/2] PCI: pciehp: Rework hotplug interrupt routine) > to remove the do {...} while loop which > I guess then should be rejected, right? As we decide to keep this loop? > > > > > > > Also with having a loop, pcie_isr() is reading the PCI_EXP_SLTSTA again and taking action just based on > > > latest event happens. That may mean removal->insertion happens fast enough then > > > removal could be overwritten by insertion and pciehp_get_adapter_status () > > > will return insertion only and we may miss the removal event to be processed for the loop. > > > > Current. This is the problem that Bjorn highlighted above. I think the > > current code works well when events of different kinds have happened > > (due to the loop). > > > > In case of similar kind of events (insertion / removal), I think there > > are 2 problems: (1) we use the transient values of slot status which > > may not be valid at that time. (2) because we ack the event before we > > queue it up, it leaves chance for error. I think we'd need more > > careful examination in order to fix these issues. > > > > I also see 3rd problem too with this Loop. > For example, if we are in the loop, > If only slot removal comes -> detect = PCI_EXP_SLTSTA_PDC -> intr_lock |= detected > -> detected==true so clear PDC interrupt bit => So we have a chance to detect > another insertion event. > > We are in the same loop & if insertion comes again -> so detected = PCI_EXP_SLTSTA_PDC > -> then detected &= ~intr_loc -> detected == false => This means > PCI_EXP_SLTSTA would not be cleared for insertion event happened > and there we could get a problem that No further PDC events can get triggered by HW. > > So if we want to keep the loop, I think clearing PCI_EXP_SLTSTA should be shifted > outside of the loop to clear Slot events(but not based on detected) to ensure that loop > just looks for the events Which has not occurred, also before clearing SLTSTA outside > the loop, we can update link and slot_status to trigger the right events for DLLSC and PDC types. > > What do you think about following proposal? > > @@ -542,7 +542,7 @@ static irqreturn_t pcie_isr(int irq, void *dev_id) > struct pci_bus *subordinate = pdev->subordinate; > struct pci_dev *dev; > struct slot *slot = ctrl->slot; > - u16 detected, intr_loc; > + u16 detected, intr_loc, slot_status; > u8 present; > bool link; > > @@ -553,26 +553,30 @@ static irqreturn_t pcie_isr(int irq, void *dev_id) > */ > intr_loc = 0; > do { > - pcie_capability_read_word(pdev, PCI_EXP_SLTSTA, &detected); > - if (detected == (u16) ~0) { > + pcie_capability_read_word(pdev, PCI_EXP_SLTSTA, &slot_status); > + if (slot_status == (u16) ~0) { > ctrl_info(ctrl, "%s: no response from device\n", > __func__); > return IRQ_HANDLED; > } > > - detected &= (PCI_EXP_SLTSTA_ABP | PCI_EXP_SLTSTA_PFD | > + detected = slot_status & (PCI_EXP_SLTSTA_ABP | > + PCI_EXP_SLTSTA_PFD | > PCI_EXP_SLTSTA_PDC | > PCI_EXP_SLTSTA_CC | PCI_EXP_SLTSTA_DLLSC); > detected &= ~intr_loc; > intr_loc |= detected; > if (!intr_loc) > return IRQ_NONE; > - if (detected) > - pcie_capability_write_word(pdev, PCI_EXP_SLTSTA, > - intr_loc); > } while (detected); > > ctrl_dbg(ctrl, "pending interrupts %#06x from Slot Status\n", intr_loc); > + if (intr_loc & PCI_EXP_SLTSTA_PDC) > + present = !!(slot_status & PCI_EXP_SLTSTA_PDS); > + if (intr_loc & PCI_EXP_SLTSTA_DLLSC) > + link = pciehp_check_link_active(ctrl); > + > + pcie_capability_write_word(pdev, PCI_EXP_SLTSTA, intr_loc); > > /* Check Command Complete Interrupt Pending */ > if (intr_loc & PCI_EXP_SLTSTA_CC) { > @@ -603,7 +607,6 @@ static irqreturn_t pcie_isr(int irq, void *dev_id) > > /* Check Presence Detect Changed */ > if (intr_loc & PCI_EXP_SLTSTA_PDC) { > - pciehp_get_adapter_status(slot, &present); > ctrl_info(ctrl, "Card %spresent on Slot(%s)\n", > present ? "" : "not ", slot_name(slot)); > pciehp_queue_interrupt_event(slot, present ? INT_PRESENCE_ON : > @@ -618,7 +621,6 @@ static irqreturn_t pcie_isr(int irq, void *dev_id) > } > > if (intr_loc & PCI_EXP_SLTSTA_DLLSC) { > - link = pciehp_check_link_active(ctrl); > ctrl_info(ctrl, "slot(%s): Link %s event\n", > slot_name(slot), link ? "Up" : "Down"); > pciehp_queue_interrupt_event(slot, link ? INT_LINK_UP : > > Hi Rajat/Bjorn Could you please comment, if the proposal can be worked out or we should consider removing do {..} while at all as Keith, Busch commented as that's not really performance path or is there any other way/suggestion you think of? > > > > > > If we don't clear the Slot event quick enough then according to spec. statement above > > > that event could get ignored and SW may never get notified. > > > > > > > > > > > > > But I do think that in the resulting code, the connection between > > > > > > > > (1) the events we learn from the interrupt and > > > > (2) the queued work items > > > > > > > > needs to be crystal clear. Right now it's a bit muddy because of > > > > things like the case you fixed: a work item that goes back and looks > > > > at PCI_EXP_SLTSTA after it's been cleared and the hardware may have > > > > already set bits for new events. > > > > > > > > > > I have made a prototype patch in my follow up reply and tested ok on my existing setup > > > on which I caught the previous issue although I cant say it will work on any HW. > > > Clearing events after queuing events gave some problems and did not work properly on my > > > existing HW where I test very fast insertion and removal events, in which > > > case only "removal" event comes and "insertion" does not occur even if HW gets powered. > > > > Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de Managing Directors: Christin Eisenschmid, Christian Lamprechter Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928
diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c index 880978b..87c5bea 100644 --- a/drivers/pci/hotplug/pciehp_ctrl.c +++ b/drivers/pci/hotplug/pciehp_ctrl.c @@ -301,20 +301,6 @@ static void handle_button_press_event(struct slot *p_slot) /* * Note: This function must be called with slot->lock held */ -static void handle_surprise_event(struct slot *p_slot) -{ - u8 getstatus; - - pciehp_get_adapter_status(p_slot, &getstatus); - if (!getstatus) - pciehp_queue_power_work(p_slot, DISABLE_REQ); - else - pciehp_queue_power_work(p_slot, ENABLE_REQ); -} - -/* - * Note: This function must be called with slot->lock held - */ static void handle_link_event(struct slot *p_slot, u32 event) { struct controller *ctrl = p_slot->ctrl; @@ -377,14 +363,14 @@ static void interrupt_event_handler(struct work_struct *work) pciehp_green_led_off(p_slot); break; case INT_PRESENCE_ON: - handle_surprise_event(p_slot); + pciehp_queue_power_work(p_slot, ENABLE_REQ); break; case INT_PRESENCE_OFF: /* * Regardless of surprise capability, we need to * definitely remove a card that has been pulled out! */ - handle_surprise_event(p_slot); + pciehp_queue_power_work(p_slot, DISABLE_REQ); break; case INT_LINK_UP: case INT_LINK_DOWN:
Currently, if very fast hotplug removal and insertion event comes as following [ 608.823412] pciehp 0000:00:1c.1:pcie04: Card not present on Slot(1) [ 608.835249] pciehp 0000:00:1c.1:pcie04: Card present on Slot(1) In this case following scenario happens, While removal: pcie_isr() -> pciehp_queue_interrupt_event() -> triggers queue_work(). work invokes interrupt_event_handler() -> case INT_PRESENCE_OFF and calls handle_surprise_event(). handle_surprise_event() again calls pciehp_get_adapter_status() and reads slot status which might have been changed already due to PCI_EXP_SLTSTA_PDC event for hotplug insertion has happened. So it queues, ENABLE_REQ for both removal and insertion interrupt based on latest slot status. In this case, PCIe device can not be hot-add again because it was never removed due to which device can not get enabled. handle_surprise_event() can be removed and pciehp_queue_power_work() can be directly triggered based on INT_PRESENCE_ON and INT_PRESENCE_OFF from the switch case exist in interrupt_event_hanlder(). The patch ensures the pciehp_queue_power_work() processes presence detect change for removal and insertion correctly. Signed-off-by: Mayurkumar Patel <mayurkumar.patel@intel.com> --- Resending the patch addressing to PCI Maintainer Bjorn Helgaas. drivers/pci/hotplug/pciehp_ctrl.c | 18 ++---------------- 1 file changed, 2 insertions(+), 16 deletions(-) -- 1.7.9.5 Mayurkumar Patel Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de Managing Directors: Christin Eisenschmid, Christian Lamprechter Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928 Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de Managing Directors: Christin Eisenschmid, Christian Lamprechter Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928 -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html