diff mbox

PCI, pciehp: Turn on link a while to workaround presense detection

Message ID 1386361357-7767-1-git-send-email-yinghai@kernel.org (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Yinghai Lu Dec. 6, 2013, 8:22 p.m. UTC
During hotplug test on platform, found slot status register does not
report present status correctly. That present bit does not get cleared
even after that card is removed.

That problem is caused by commit:
| commit 2debd9289997fc5d1c0043b41201a8b40d5e11d0
|
|    PCI: pciehp: Disable/enable link during slot power off/on

It looks like chipset bug, that PresDet bit is "OR" operation between
sideband input from FPGA, and chipset inband detection from pcie link.
It does not like if we disable pcie link at first and power off
other side device, and it has input from inband detection
always 1.

Workaround: Try turn on link a while after power off.

After this patch, PresDet report correct status when removing or adding
card later.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Cc: <stable@vger.kernel.org> 3.4+

---
 drivers/pci/hotplug/pciehp_hpc.c |    9 +++++++++
 1 file changed, 9 insertions(+)

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Ethan Zhao Dec. 7, 2013, 5:11 a.m. UTC | #1
Yinghai,
    Does this workaround have any known side effect to 'other'
platform ?  for example, some old platforms don't have such buggy
chipset/FPPA logic.  have you tested it on those hotplug capable box ?
    If the issue is caused by commit
2debd9289997fc5d1c0043b41201a8b40d5e11d0, should we revert it
first ?
    And if the workaround is specific to any vendor's platform, could
we introduce a method of only workaround the specific vendor id /
device id instead of  patching generic code ?


Thanks,
Ethan

On Sat, Dec 7, 2013 at 4:22 AM, Yinghai Lu <yinghai@kernel.org> wrote:
> During hotplug test on platform, found slot status register does not
> report present status correctly. That present bit does not get cleared
> even after that card is removed.
>
> That problem is caused by commit:
> | commit 2debd9289997fc5d1c0043b41201a8b40d5e11d0
> |
> |    PCI: pciehp: Disable/enable link during slot power off/on
>
> It looks like chipset bug, that PresDet bit is "OR" operation between
> sideband input from FPGA, and chipset inband detection from pcie link.
> It does not like if we disable pcie link at first and power off
> other side device, and it has input from inband detection
> always 1.
>
> Workaround: Try turn on link a while after power off.
>
> After this patch, PresDet report correct status when removing or adding
> card later.
>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> Cc: <stable@vger.kernel.org> 3.4+
>
> ---
>  drivers/pci/hotplug/pciehp_hpc.c |    9 +++++++++
>  1 file changed, 9 insertions(+)
>
> Index: linux-2.6/drivers/pci/hotplug/pciehp_hpc.c
> ===================================================================
> --- linux-2.6.orig/drivers/pci/hotplug/pciehp_hpc.c
> +++ linux-2.6/drivers/pci/hotplug/pciehp_hpc.c
> @@ -637,6 +637,15 @@ int pciehp_power_off_slot(struct slot *
>         }
>         ctrl_dbg(ctrl, "%s: SLOTCTRL %x write cmd %x\n", __func__,
>                  pci_pcie_cap(ctrl->pcie->port) + PCI_EXP_SLTCTL, slot_cmd);
> +
> +       /*
> +        * Enable link for a while so chipset could settle down
> +        * inband presence detection logic
> +        */
> +       pciehp_link_enable(ctrl);
> +       msleep(20);
> +       pciehp_link_disable(ctrl);
> +
>         return 0;
>  }
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Dec. 7, 2013, 6:35 p.m. UTC | #2
On Fri, Dec 6, 2013 at 1:22 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> During hotplug test on platform, found slot status register does not
> report present status correctly. That present bit does not get cleared
> even after that card is removed.
>
> That problem is caused by commit:
> | commit 2debd9289997fc5d1c0043b41201a8b40d5e11d0
> |
> |    PCI: pciehp: Disable/enable link during slot power off/on
>
> It looks like chipset bug, that PresDet bit is "OR" operation between
> sideband input from FPGA, and chipset inband detection from pcie link.

This doesn't sound like a chipset bug.  It sounds like exactly what
the spec describes: "Presence Detect State – This bit indicates the
presence of an adapter in the slot, reflected by the logical “OR” of
the Physical Layer in-band presence detect mechanism and, if present,
any out-of-band presence detect mechanism defined for the slot’s
corresponding form factor" (PCIe 3.0, sec 7.8.11).

So I want to fix this pciehp problem, but between 2debd9289997 and
this patch, pciehp_power_off_slot() is accumulating warts that make it
look like we're just reacting to things that break rather than making
a robust design to start with.

> It does not like if we disable pcie link at first and power off
> other side device, and it has input from inband detection
> always 1.
>
> Workaround: Try turn on link a while after power off.
>
> After this patch, PresDet report correct status when removing or adding
> card later.
>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> Cc: <stable@vger.kernel.org> 3.4+
>
> ---
>  drivers/pci/hotplug/pciehp_hpc.c |    9 +++++++++
>  1 file changed, 9 insertions(+)
>
> Index: linux-2.6/drivers/pci/hotplug/pciehp_hpc.c
> ===================================================================
> --- linux-2.6.orig/drivers/pci/hotplug/pciehp_hpc.c
> +++ linux-2.6/drivers/pci/hotplug/pciehp_hpc.c
> @@ -637,6 +637,15 @@ int pciehp_power_off_slot(struct slot *
>         }
>         ctrl_dbg(ctrl, "%s: SLOTCTRL %x write cmd %x\n", __func__,
>                  pci_pcie_cap(ctrl->pcie->port) + PCI_EXP_SLTCTL, slot_cmd);
> +
> +       /*
> +        * Enable link for a while so chipset could settle down
> +        * inband presence detection logic
> +        */
> +       pciehp_link_enable(ctrl);
> +       msleep(20);
> +       pciehp_link_disable(ctrl);
> +
>         return 0;
>  }
>
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu Dec. 8, 2013, 3:19 a.m. UTC | #3
On Sat, Dec 7, 2013 at 10:35 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>
> This doesn't sound like a chipset bug.  It sounds like exactly what
> the spec describes: "Presence Detect State – This bit indicates the
> presence of an adapter in the slot, reflected by the logical “OR” of
> the Physical Layer in-band presence detect mechanism and, if present,
> any out-of-band presence detect mechanism defined for the slot’s
> corresponding form factor" (PCIe 3.0, sec 7.8.11).

Not sure. the silicon does not report in-band correctly if we turn off
pcie link before turn the power of card.

case 1:
a. disable pci link
b. turn off power of pci cards
c. Present bit: card present.
d. remove the card.
e. Present bit: card is present.

case 2:
a. turn off power of pci cards
b. disable pci link
c. Present bit: card present.
d. remove the card.
e. Present bit: card is NOT present.

case 3:
a. disable pci link
b. turn off power of pci cards
c. Present bit: card present.
d. turn on pcie link
e. wait 20ms
f. turn off pcie link
g. remove the card.
h. Present bit: card is NOT present.

>
> So I want to fix this pciehp problem, but between 2debd9289997 and
> this patch, pciehp_power_off_slot() is accumulating warts that make it
> look like we're just reacting to things that break rather than making
> a robust design to start with.

We have 2debd9289997 (turn off link before turn off power) to avoid
some AER ....

with this patch we still keep that and link off
also have presence detect work correctly.

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ethan Zhao Dec. 8, 2013, 4:55 a.m. UTC | #4
On Sun, Dec 8, 2013 at 11:19 AM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Sat, Dec 7, 2013 at 10:35 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>>
>> This doesn't sound like a chipset bug.  It sounds like exactly what
>> the spec describes: "Presence Detect State – This bit indicates the
>> presence of an adapter in the slot, reflected by the logical “OR” of
>> the Physical Layer in-band presence detect mechanism and, if present,
>> any out-of-band presence detect mechanism defined for the slot’s
>> corresponding form factor" (PCIe 3.0, sec 7.8.11).
>
> Not sure. the silicon does not report in-band correctly if we turn off
> pcie link before turn the power of card.
>

Obvious there are both in-band and out-of band present detect
mechanism on Yinghai's test platform:

> case 1:
> a. disable pci link
> b. turn off power of pci cards

in-band-present = should be 'off ', but still 'on'
the hardware/FPGA of Yinghai's test platform has special temper here,
it couldn't set in-band-present right if disable pci link first.
but other platform could work well when commit 2debd9289997 applied ?
non report that.

> c. Present bit: card present.
> d. remove the card.

out-band-present = off

present = n-band-present  | out-band-present
                 on | off
               = on
not work well.
> e. Present bit: card is present.
>
> case 2:
> a. turn off power of pci cards
> b. disable pci link

in-band-present = off
works well

> c. Present bit: card present.
> d. remove the card.

out-band-present = off

> e. Present bit: card is NOT present.

present = n-band-present  | out-band-present
                 off | off
              = off
works well.
>
> case 3:
> a. disable pci link
> b. turn off power of pci cards

in-band-present = should be 'off' , but still 'on'

> c. Present bit: card present.
> d. turn on pcie link
> e. wait 20ms

in-band-present = on ?

> f. turn off pcie link
in-band-present = off ?
> g. remove the card.
 out-band-present = off

present = in-band-present  | out-band-present = off

It seems,  Yinghai's hardware logic that sets present bit with in-band
OR out-band detection has special nature, it coundn't see power_off
command and set in-band-present right if you disable link first.  and
it works well with original code path shown by case 2.

if so, we should revert 2debd9289997 and workaround the AER flood with
AER mask bit somewhere instead of play such trick in generic code.
because of the original code could work on both this 'buggy'
flatform  and others.  and these workaround just make the code hard to
understand and far away from simple and generic.

just my 2 cents, of coz, you could ignore it wildly.

Thanks,
Ethan

> h. Present bit: card is NOT present.
>
>>
>> So I want to fix this pciehp problem, but between 2debd9289997 and
>> this patch, pciehp_power_off_slot() is accumulating warts that make it
>> look like we're just reacting to things that break rather than making
>> a robust design to start with.
>
> We have 2debd9289997 (turn off link before turn off power) to avoid
> some AER ....
>
> with this patch we still keep that and link off
> also have presence detect work correctly.
>
> Yinghai
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu Dec. 8, 2013, 6:04 a.m. UTC | #5
On Sat, Dec 7, 2013 at 8:55 PM, Ethan Zhao <ethan.kernel@gmail.com> wrote:
> On Sun, Dec 8, 2013 at 11:19 AM, Yinghai Lu <yinghai@kernel.org> wrote:
>> On Sat, Dec 7, 2013 at 10:35 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>>>
>>> This doesn't sound like a chipset bug.  It sounds like exactly what
>>> the spec describes: "Presence Detect State – This bit indicates the
>>> presence of an adapter in the slot, reflected by the logical “OR” of
>>> the Physical Layer in-band presence detect mechanism and, if present,
>>> any out-of-band presence detect mechanism defined for the slot’s
>>> corresponding form factor" (PCIe 3.0, sec 7.8.11).
>>
>> Not sure. the silicon does not report in-band correctly if we turn off
>> pcie link before turn the power of card.
>>
>
> Obvious there are both in-band and out-of band present detect
> mechanism on Yinghai's test platform:
>
>> case 1:
>> a. disable pci link
>> b. turn off power of pci cards
>
> in-band-present = should be 'off ', but still 'on'
> the hardware/FPGA of Yinghai's test platform has special temper here,
> it couldn't set in-band-present right if disable pci link first.
> but other platform could work well when commit 2debd9289997 applied ?
> non report that.

It happens on all intel platform from nehalem, westmere, sandybridge,
and ivybridge.

If the users are careful enough then they would find the presence is
still set even after
card is removed.

but hotadd back later is still working, as attention button pressing
will still generate interrupt,..

>
>> c. Present bit: card present.
>> d. remove the card.
>
> out-band-present = off
>
> present = n-band-present  | out-band-present
>                  on | off
>                = on
> not work well.
>> e. Present bit: card is present.
>>
>> case 2:
>> a. turn off power of pci cards
>> b. disable pci link
>
> in-band-present = off
> works well
>
>> c. Present bit: card present.
>> d. remove the card.
>
> out-band-present = off
>
>> e. Present bit: card is NOT present.
>
> present = n-band-present  | out-band-present
>                  off | off
>               = off
> works well.

We can not check in-band presence directly.

We only can check result after "OR", and out-band presence.
and out-band presence from FPGA always report correctly.

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rajat Jain Dec. 8, 2013, 9:29 p.m. UTC | #6
Hello,

> -----Original Message-----
> From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-
> owner@vger.kernel.org] On Behalf Of Bjorn Helgaas
> Sent: Saturday, December 07, 2013 10:36 AM
> To: Yinghai Lu
> Cc: linux-pci@vger.kernel.org; Kenji Kaneshige; stable@vger.kernel.org
> Subject: Re: [PATCH] PCI, pciehp: Turn on link a while to workaround
> presense detection
> 
> On Fri, Dec 6, 2013 at 1:22 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> > During hotplug test on platform, found slot status register does not
> > report present status correctly. That present bit does not get cleared
> > even after that card is removed.
> >
> > That problem is caused by commit:
> > | commit 2debd9289997fc5d1c0043b41201a8b40d5e11d0
> > |
> > |    PCI: pciehp: Disable/enable link during slot power off/on
> >
> > It looks like chipset bug, that PresDet bit is "OR" operation between
> > sideband input from FPGA, and chipset inband detection from pcie link.
> 
> This doesn't sound like a chipset bug.  It sounds like exactly what the
> spec describes: "Presence Detect State - This bit indicates the presence
> of an adapter in the slot, reflected by the logical "OR" of the Physical
> Layer in-band presence detect mechanism and, if present, any out-of-band
> presence detect mechanism defined for the slot's corresponding form
> factor" (PCIe 3.0, sec 7.8.11).
> 
> So I want to fix this pciehp problem, but between 2debd9289997 and this
> patch, pciehp_power_off_slot() is accumulating warts that make it look
> like we're just reacting to things that break rather than making a
> robust design to start with.

Yes.

In fact things do get more complicated if we introduce link state based hot-plug, because then that requires that we do NOT disable the link permanently during power off. (Because we would want to receive future hot-plug link-up events).

> 
> > It does not like if we disable pcie link at first and power off other
> > side device, and it has input from inband detection always 1.
> >
> > Workaround: Try turn on link a while after power off.
> >
> > After this patch, PresDet report correct status when removing or
> > adding card later.
> >
> > Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> > Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> > Cc: <stable@vger.kernel.org> 3.4+
> >
> > ---
> >  drivers/pci/hotplug/pciehp_hpc.c |    9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > Index: linux-2.6/drivers/pci/hotplug/pciehp_hpc.c
> > ===================================================================
> > --- linux-2.6.orig/drivers/pci/hotplug/pciehp_hpc.c
> > +++ linux-2.6/drivers/pci/hotplug/pciehp_hpc.c
> > @@ -637,6 +637,15 @@ int pciehp_power_off_slot(struct slot *
> >         }
> >         ctrl_dbg(ctrl, "%s: SLOTCTRL %x write cmd %x\n", __func__,
> >                  pci_pcie_cap(ctrl->pcie->port) + PCI_EXP_SLTCTL,
> > slot_cmd);
> > +
> > +       /*
> > +        * Enable link for a while so chipset could settle down
> > +        * inband presence detection logic
> > +        */
> > +       pciehp_link_enable(ctrl);
> > +       msleep(20);
> > +       pciehp_link_disable(ctrl);
> > +
> >         return 0;
> >  }
> >
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org More majordomo info
> at  http://vger.kernel.org/majordomo-info.html
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu Dec. 9, 2013, 12:58 a.m. UTC | #7
On Sun, Dec 8, 2013 at 1:29 PM, Rajat Jain <rajatjain@juniper.net> wrote:
>
> In fact things do get more complicated if we introduce link state based hot-plug, because then that requires that we do NOT disable the link permanently during power off. (Because we would want to receive future hot-plug link-up events).

is that out of spec?

Why do we need to use link state event with hotplug? What is attention
button doing?

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ethan Zhao Dec. 9, 2013, 2:10 a.m. UTC | #8
Rajat,

    Do you have a draft to extend the hotplug spec ?  seem you wanna
introduce link change interrupt to
the your chip or just poll the link status register ?

Thanks,
Ethan

On Mon, Dec 9, 2013 at 5:29 AM, Rajat Jain <rajatjain@juniper.net> wrote:
> Hello,
>
>> -----Original Message-----
>> From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-
>> owner@vger.kernel.org] On Behalf Of Bjorn Helgaas
>> Sent: Saturday, December 07, 2013 10:36 AM
>> To: Yinghai Lu
>> Cc: linux-pci@vger.kernel.org; Kenji Kaneshige; stable@vger.kernel.org
>> Subject: Re: [PATCH] PCI, pciehp: Turn on link a while to workaround
>> presense detection
>>
>> On Fri, Dec 6, 2013 at 1:22 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>> > During hotplug test on platform, found slot status register does not
>> > report present status correctly. That present bit does not get cleared
>> > even after that card is removed.
>> >
>> > That problem is caused by commit:
>> > | commit 2debd9289997fc5d1c0043b41201a8b40d5e11d0
>> > |
>> > |    PCI: pciehp: Disable/enable link during slot power off/on
>> >
>> > It looks like chipset bug, that PresDet bit is "OR" operation between
>> > sideband input from FPGA, and chipset inband detection from pcie link.
>>
>> This doesn't sound like a chipset bug.  It sounds like exactly what the
>> spec describes: "Presence Detect State - This bit indicates the presence
>> of an adapter in the slot, reflected by the logical "OR" of the Physical
>> Layer in-band presence detect mechanism and, if present, any out-of-band
>> presence detect mechanism defined for the slot's corresponding form
>> factor" (PCIe 3.0, sec 7.8.11).
>>
>> So I want to fix this pciehp problem, but between 2debd9289997 and this
>> patch, pciehp_power_off_slot() is accumulating warts that make it look
>> like we're just reacting to things that break rather than making a
>> robust design to start with.
>
> Yes.
>
> In fact things do get more complicated if we introduce link state based hot-plug, because then that requires that we do NOT disable the link permanently during power off. (Because we would want to receive future hot-plug link-up events).
>
>>
>> > It does not like if we disable pcie link at first and power off other
>> > side device, and it has input from inband detection always 1.
>> >
>> > Workaround: Try turn on link a while after power off.
>> >
>> > After this patch, PresDet report correct status when removing or
>> > adding card later.
>> >
>> > Signed-off-by: Yinghai Lu <yinghai@kernel.org>
>> > Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
>> > Cc: <stable@vger.kernel.org> 3.4+
>> >
>> > ---
>> >  drivers/pci/hotplug/pciehp_hpc.c |    9 +++++++++
>> >  1 file changed, 9 insertions(+)
>> >
>> > Index: linux-2.6/drivers/pci/hotplug/pciehp_hpc.c
>> > ===================================================================
>> > --- linux-2.6.orig/drivers/pci/hotplug/pciehp_hpc.c
>> > +++ linux-2.6/drivers/pci/hotplug/pciehp_hpc.c
>> > @@ -637,6 +637,15 @@ int pciehp_power_off_slot(struct slot *
>> >         }
>> >         ctrl_dbg(ctrl, "%s: SLOTCTRL %x write cmd %x\n", __func__,
>> >                  pci_pcie_cap(ctrl->pcie->port) + PCI_EXP_SLTCTL,
>> > slot_cmd);
>> > +
>> > +       /*
>> > +        * Enable link for a while so chipset could settle down
>> > +        * inband presence detection logic
>> > +        */
>> > +       pciehp_link_enable(ctrl);
>> > +       msleep(20);
>> > +       pciehp_link_disable(ctrl);
>> > +
>> >         return 0;
>> >  }
>> >
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>> the body of a message to majordomo@vger.kernel.org More majordomo info
>> at  http://vger.kernel.org/majordomo-info.html
>>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Dec. 9, 2013, 6:33 p.m. UTC | #9
On Sat, Dec 7, 2013 at 8:19 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Sat, Dec 7, 2013 at 10:35 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>>
>> This doesn't sound like a chipset bug.  It sounds like exactly what
>> the spec describes: "Presence Detect State – This bit indicates the
>> presence of an adapter in the slot, reflected by the logical “OR” of
>> the Physical Layer in-band presence detect mechanism and, if present,
>> any out-of-band presence detect mechanism defined for the slot’s
>> corresponding form factor" (PCIe 3.0, sec 7.8.11).
>
> Not sure. the silicon does not report in-band correctly if we turn off
> pcie link before turn the power of card.

Yes, that's exactly what the spec says: "Note that the in-band
presence detect mechanism requires that power be applied to an adapter
for its presence to be detected."  So pciehp has to be designed to
take that into account.

Presence detect is obviously important for things like ExpressCard
where there's no attention button.  I expect that pciehp will use
presence detect to notice when a device is inserted and removed.

But I assume your hotplug situation is slots with attention buttons,
and I don't think presence detect is as important there.  If you
insert a card into an empty slot, nothing should happen until the
attention button is pressed.  We don't need presence detect there.  If
pciehp turns off a device to prepare for its physical removal, why do
we care about presence detect after the power is turned off?  I think
all pciehp can say is "this slot is powered off, and I don't know
whether there's still a card in it."

I think pciehp is somewhat buggy here -- it looks like
pciehp_enable_slot() looks at presence detect before turning on the
power.  I don't know why it does that, and it doesn't seem valid per
the spec statement I quoted above.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rajat Jain Dec. 9, 2013, 6:46 p.m. UTC | #10
Hello,

> -----Original Message-----
> From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-
> owner@vger.kernel.org] On Behalf Of Ethan Zhao
> Sent: Sunday, December 08, 2013 6:10 PM
> To: Rajat Jain
> Cc: Bjorn Helgaas; Yinghai Lu; linux-pci@vger.kernel.org; Kenji
> Kaneshige; stable@vger.kernel.org
> Subject: Re: [PATCH] PCI, pciehp: Turn on link a while to workaround
> presense detection
> 
> Rajat,
> 
>     Do you have a draft to extend the hotplug spec ?  seem you wanna
> introduce link change interrupt to the your chip or just poll the link
> status register ?

This is for the systems that do not implement elements like attention button:

http://www.spinics.net/lists/hotplug/msg05802.html
(Inspiration from the thread: http://www.spinics.net/lists/linux-pci/msg05783.html)


Here was the proposed minimal patch:

http://marc.info/?l=linux-pci&m=138610993912921&w=2
http://marc.info/?t=138611014300001&r=1&w=2

Would appreciate any review comments.

Thanks,

Rajat


> 
> Thanks,
> Ethan
> 
> On Mon, Dec 9, 2013 at 5:29 AM, Rajat Jain <rajatjain@juniper.net>
> wrote:
> > Hello,
> >
> >> -----Original Message-----
> >> From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-
> >> owner@vger.kernel.org] On Behalf Of Bjorn Helgaas
> >> Sent: Saturday, December 07, 2013 10:36 AM
> >> To: Yinghai Lu
> >> Cc: linux-pci@vger.kernel.org; Kenji Kaneshige;
> >> stable@vger.kernel.org
> >> Subject: Re: [PATCH] PCI, pciehp: Turn on link a while to workaround
> >> presense detection
> >>
> >> On Fri, Dec 6, 2013 at 1:22 PM, Yinghai Lu <yinghai@kernel.org>
> wrote:
> >> > During hotplug test on platform, found slot status register does
> >> > not report present status correctly. That present bit does not get
> >> > cleared even after that card is removed.
> >> >
> >> > That problem is caused by commit:
> >> > | commit 2debd9289997fc5d1c0043b41201a8b40d5e11d0
> >> > |
> >> > |    PCI: pciehp: Disable/enable link during slot power off/on
> >> >
> >> > It looks like chipset bug, that PresDet bit is "OR" operation
> >> > between sideband input from FPGA, and chipset inband detection from
> pcie link.
> >>
> >> This doesn't sound like a chipset bug.  It sounds like exactly what
> >> the spec describes: "Presence Detect State - This bit indicates the
> >> presence of an adapter in the slot, reflected by the logical "OR" of
> >> the Physical Layer in-band presence detect mechanism and, if present,
> >> any out-of-band presence detect mechanism defined for the slot's
> >> corresponding form factor" (PCIe 3.0, sec 7.8.11).
> >>
> >> So I want to fix this pciehp problem, but between 2debd9289997 and
> >> this patch, pciehp_power_off_slot() is accumulating warts that make
> >> it look like we're just reacting to things that break rather than
> >> making a robust design to start with.
> >
> > Yes.
> >
> > In fact things do get more complicated if we introduce link state
> based hot-plug, because then that requires that we do NOT disable the
> link permanently during power off. (Because we would want to receive
> future hot-plug link-up events).
> >
> >>
> >> > It does not like if we disable pcie link at first and power off
> >> > other side device, and it has input from inband detection always 1.
> >> >
> >> > Workaround: Try turn on link a while after power off.
> >> >
> >> > After this patch, PresDet report correct status when removing or
> >> > adding card later.
> >> >
> >> > Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> >> > Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> >> > Cc: <stable@vger.kernel.org> 3.4+
> >> >
> >> > ---
> >> >  drivers/pci/hotplug/pciehp_hpc.c |    9 +++++++++
> >> >  1 file changed, 9 insertions(+)
> >> >
> >> > Index: linux-2.6/drivers/pci/hotplug/pciehp_hpc.c
> >> > ===================================================================
> >> > --- linux-2.6.orig/drivers/pci/hotplug/pciehp_hpc.c
> >> > +++ linux-2.6/drivers/pci/hotplug/pciehp_hpc.c
> >> > @@ -637,6 +637,15 @@ int pciehp_power_off_slot(struct slot *
> >> >         }
> >> >         ctrl_dbg(ctrl, "%s: SLOTCTRL %x write cmd %x\n", __func__,
> >> >                  pci_pcie_cap(ctrl->pcie->port) + PCI_EXP_SLTCTL,
> >> > slot_cmd);
> >> > +
> >> > +       /*
> >> > +        * Enable link for a while so chipset could settle down
> >> > +        * inband presence detection logic
> >> > +        */
> >> > +       pciehp_link_enable(ctrl);
> >> > +       msleep(20);
> >> > +       pciehp_link_disable(ctrl);
> >> > +
> >> >         return 0;
> >> >  }
> >> >
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-pci"
> >> in the body of a message to majordomo@vger.kernel.org More majordomo
> >> info at  http://vger.kernel.org/majordomo-info.html
> >>
> >
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-pci"
> > in the body of a message to majordomo@vger.kernel.org More majordomo
> > info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org More majordomo info
> at  http://vger.kernel.org/majordomo-info.html
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu Dec. 9, 2013, 7:28 p.m. UTC | #11
On Mon, Dec 9, 2013 at 10:46 AM, Rajat Jain <rajatjain@juniper.net> wrote:
>
> This is for the systems that do not implement elements like attention button:

then your pcie slot should add Surprise Removal/Insert support in your HW?

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rajat Jain Dec. 9, 2013, 7:35 p.m. UTC | #12
Hi Yinghai,

> -----Original Message-----
> From: yhlu.kernel@gmail.com [mailto:yhlu.kernel@gmail.com] On Behalf Of
> Yinghai Lu
> Sent: Monday, December 09, 2013 11:28 AM
> To: Rajat Jain
> Cc: Ethan Zhao; Bjorn Helgaas; linux-pci@vger.kernel.org; Kenji
> Kaneshige; stable@vger.kernel.org; Guenter Roeck
> Subject: Re: [PATCH] PCI, pciehp: Turn on link a while to workaround
> presense detection
> 
> On Mon, Dec 9, 2013 at 10:46 AM, Rajat Jain <rajatjain@juniper.net>
> wrote:
> >
> > This is for the systems that do not implement elements like attention
> button:
> 
> then your pcie slot should add Surprise Removal/Insert support in your
> HW?
> 

Actually there are situations where the presence detect signal does not make much sense, because the card is not really plugged out physically. For e.g. how do we deal with PCIe end points that may be on the board itself, but say come out of reset after the bootup? Or go down during a firmware upgrade and again come back up after the upgrade? 

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu Dec. 9, 2013, 8:20 p.m. UTC | #13
On Mon, Dec 9, 2013 at 11:35 AM, Rajat Jain <rajatjain@juniper.net> wrote:

> Actually there are situations where the presence detect signal does not make much sense, because the card is not really plugged out physically. For e.g. how do we deal with PCIe end points that may be on the board itself, but say come out of reset after the bootup? Or go down during a firmware upgrade and again come back up after the upgrade?
>

That is driver's job.
mpt2sas driver does diag_reset.

Think about kernel that could even be compiled without pciehp built-in.

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck Dec. 10, 2013, 12:34 a.m. UTC | #14
> -----Original Message-----
> From: yhlu.kernel@gmail.com [mailto:yhlu.kernel@gmail.com] On Behalf Of
> Yinghai Lu
> Sent: Monday, December 09, 2013 12:20 PM
> To: Rajat Jain
> Cc: Ethan Zhao; Bjorn Helgaas; linux-pci@vger.kernel.org; Kenji
> Kaneshige; stable@vger.kernel.org; Guenter Roeck
> Subject: Re: [PATCH] PCI, pciehp: Turn on link a while to workaround
> presense detection
> 
> On Mon, Dec 9, 2013 at 11:35 AM, Rajat Jain <rajatjain@juniper.net>
> wrote:
> 
> > Actually there are situations where the presence detect signal does
> not make much sense, because the card is not really plugged out
> physically. For e.g. how do we deal with PCIe end points that may be on
> the board itself, but say come out of reset after the bootup? Or go
> down during a firmware upgrade and again come back up after the
> upgrade?
> >
> 
> That is driver's job.
> mpt2sas driver does diag_reset.
> 
Possibly, but quite frankly, after looking into that code, it seems quite dirty
to me. Of course, that is my personal opinion only. It may suggest, however,
assuming that the mpt2sas driver's resent handler causes a link down/link up 
sequence of events, that there is a valid case for the boot option.
I don't entirely understand, though, how that driver manages to reset its 
PCIe interface without the need to re-initialize it, but maybe I am just
missing something.

There are a number of other problems with the in-driver approach.
For example, it would require the driver to re-program the chip's PCIe
configuration space, as it will be 'vanilla' after the reset. Not even
sure how this can work, since that includes the chip BAR registers
and various other configuration registers which are normally set by the
PCIe subsystem. I'd rather handle this through the PCIe subsystem
instead of kludging around the problem in the driver.

Another problem is resulting boot delays. If link up/down is handled through
the PCIe subsystem, we can have the driver return -EPROBE_DEFER in the initial
probe, and everything just works automatically like magic. Waiting in probe
for the chip to re-initialize itself would add substantial delay to the boot
sequence. With several such chips in the system that can add up to a lot.

Even if we manage to implement all this, we still didn't solve our other
problems, which include chips which are activated after initial PCIe scan,
chips which (unfortunately) tend to hang up themselves without notice,
and real OIR devices which don't implement "correct" PCIe OIR handling.

Rajat's patch solves all those problems, and in my opinion it does so
quite elegantly.

Guenter


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ethan Zhao Dec. 10, 2013, 4:07 a.m. UTC | #15
Rajat,
    After read almost all the proposed material you mentioned, got
some of my points here for you and other guru's reference.

    1. It is not totally out of spec,  most of it have been described
in PCIe spec 3.0 section 6.7.3.3. Data Link Layer State Changed
Events.

    2. if it works on your hardware, none could stop you, because you
didn't hurt anybody, that is not evil at all !

    3. Introduce link status event  as replacement to all other
existed machinism is too aggressive that will make us lose some other
features of PCIe hotplug.  for example,  attention button let system
and user have time to make preparation and undo.  presence bit is
confirmation to system etc.  someone like me would say no if it would
be replacement while not an additament.

   4. Just as Bjorn mentioned, don't break it and 'fix' it, try to
make a complete and well formed , more important --- considered design
for all things,  the existing and the one you would like introduce. it
is welcome.

 Thanks,
Ethan

On Tue, Dec 10, 2013 at 2:46 AM, Rajat Jain <rajatjain@juniper.net> wrote:
> Hello,
>
>> -----Original Message-----
>> From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-
>> owner@vger.kernel.org] On Behalf Of Ethan Zhao
>> Sent: Sunday, December 08, 2013 6:10 PM
>> To: Rajat Jain
>> Cc: Bjorn Helgaas; Yinghai Lu; linux-pci@vger.kernel.org; Kenji
>> Kaneshige; stable@vger.kernel.org
>> Subject: Re: [PATCH] PCI, pciehp: Turn on link a while to workaround
>> presense detection
>>
>> Rajat,
>>
>>     Do you have a draft to extend the hotplug spec ?  seem you wanna
>> introduce link change interrupt to the your chip or just poll the link
>> status register ?
>
> This is for the systems that do not implement elements like attention button:
>
> http://www.spinics.net/lists/hotplug/msg05802.html
> (Inspiration from the thread: http://www.spinics.net/lists/linux-pci/msg05783.html)
>
>
> Here was the proposed minimal patch:
>
> http://marc.info/?l=linux-pci&m=138610993912921&w=2
> http://marc.info/?t=138611014300001&r=1&w=2
>
> Would appreciate any review comments.
>
> Thanks,
>
> Rajat
>
>
>>
>> Thanks,
>> Ethan
>>
>> On Mon, Dec 9, 2013 at 5:29 AM, Rajat Jain <rajatjain@juniper.net>
>> wrote:
>> > Hello,
>> >
>> >> -----Original Message-----
>> >> From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-
>> >> owner@vger.kernel.org] On Behalf Of Bjorn Helgaas
>> >> Sent: Saturday, December 07, 2013 10:36 AM
>> >> To: Yinghai Lu
>> >> Cc: linux-pci@vger.kernel.org; Kenji Kaneshige;
>> >> stable@vger.kernel.org
>> >> Subject: Re: [PATCH] PCI, pciehp: Turn on link a while to workaround
>> >> presense detection
>> >>
>> >> On Fri, Dec 6, 2013 at 1:22 PM, Yinghai Lu <yinghai@kernel.org>
>> wrote:
>> >> > During hotplug test on platform, found slot status register does
>> >> > not report present status correctly. That present bit does not get
>> >> > cleared even after that card is removed.
>> >> >
>> >> > That problem is caused by commit:
>> >> > | commit 2debd9289997fc5d1c0043b41201a8b40d5e11d0
>> >> > |
>> >> > |    PCI: pciehp: Disable/enable link during slot power off/on
>> >> >
>> >> > It looks like chipset bug, that PresDet bit is "OR" operation
>> >> > between sideband input from FPGA, and chipset inband detection from
>> pcie link.
>> >>
>> >> This doesn't sound like a chipset bug.  It sounds like exactly what
>> >> the spec describes: "Presence Detect State - This bit indicates the
>> >> presence of an adapter in the slot, reflected by the logical "OR" of
>> >> the Physical Layer in-band presence detect mechanism and, if present,
>> >> any out-of-band presence detect mechanism defined for the slot's
>> >> corresponding form factor" (PCIe 3.0, sec 7.8.11).
>> >>
>> >> So I want to fix this pciehp problem, but between 2debd9289997 and
>> >> this patch, pciehp_power_off_slot() is accumulating warts that make
>> >> it look like we're just reacting to things that break rather than
>> >> making a robust design to start with.
>> >
>> > Yes.
>> >
>> > In fact things do get more complicated if we introduce link state
>> based hot-plug, because then that requires that we do NOT disable the
>> link permanently during power off. (Because we would want to receive
>> future hot-plug link-up events).
>> >
>> >>
>> >> > It does not like if we disable pcie link at first and power off
>> >> > other side device, and it has input from inband detection always 1.
>> >> >
>> >> > Workaround: Try turn on link a while after power off.
>> >> >
>> >> > After this patch, PresDet report correct status when removing or
>> >> > adding card later.
>> >> >
>> >> > Signed-off-by: Yinghai Lu <yinghai@kernel.org>
>> >> > Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
>> >> > Cc: <stable@vger.kernel.org> 3.4+
>> >> >
>> >> > ---
>> >> >  drivers/pci/hotplug/pciehp_hpc.c |    9 +++++++++
>> >> >  1 file changed, 9 insertions(+)
>> >> >
>> >> > Index: linux-2.6/drivers/pci/hotplug/pciehp_hpc.c
>> >> > ===================================================================
>> >> > --- linux-2.6.orig/drivers/pci/hotplug/pciehp_hpc.c
>> >> > +++ linux-2.6/drivers/pci/hotplug/pciehp_hpc.c
>> >> > @@ -637,6 +637,15 @@ int pciehp_power_off_slot(struct slot *
>> >> >         }
>> >> >         ctrl_dbg(ctrl, "%s: SLOTCTRL %x write cmd %x\n", __func__,
>> >> >                  pci_pcie_cap(ctrl->pcie->port) + PCI_EXP_SLTCTL,
>> >> > slot_cmd);
>> >> > +
>> >> > +       /*
>> >> > +        * Enable link for a while so chipset could settle down
>> >> > +        * inband presence detection logic
>> >> > +        */
>> >> > +       pciehp_link_enable(ctrl);
>> >> > +       msleep(20);
>> >> > +       pciehp_link_disable(ctrl);
>> >> > +
>> >> >         return 0;
>> >> >  }
>> >> >
>> >> --
>> >> To unsubscribe from this list: send the line "unsubscribe linux-pci"
>> >> in the body of a message to majordomo@vger.kernel.org More majordomo
>> >> info at  http://vger.kernel.org/majordomo-info.html
>> >>
>> >
>> >
>> > --
>> > To unsubscribe from this list: send the line "unsubscribe linux-pci"
>> > in the body of a message to majordomo@vger.kernel.org More majordomo
>> > info at  http://vger.kernel.org/majordomo-info.html
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>> the body of a message to majordomo@vger.kernel.org More majordomo info
>> at  http://vger.kernel.org/majordomo-info.html
>>
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rajat Jain Dec. 10, 2013, 9:01 a.m. UTC | #16
Hello,

> -----Original Message-----
> From: Ethan Zhao [mailto:ethan.kernel@gmail.com]
> Sent: Monday, December 09, 2013 8:08 PM
> To: Rajat Jain
> Cc: Bjorn Helgaas; Yinghai Lu; linux-pci@vger.kernel.org; Kenji
> Kaneshige; stable@vger.kernel.org; Guenter Roeck
> Subject: Re: [PATCH] PCI, pciehp: Turn on link a while to workaround
> presense detection
> 
> Rajat,
>     After read almost all the proposed material you mentioned,

Thanks a bunch for putting in efforts, I appreciate.

> got some
> of my points here for you and other guru's reference.
> 
>     1. It is not totally out of spec,  most of it have been described in
> PCIe spec 3.0 section 6.7.3.3. Data Link Layer State Changed Events.

Yes. That is what I had based my patch on.

> 
>     2. if it works on your hardware, none could stop you, because you
> didn't hurt anybody, that is not evil at all !
> 
>     3. Introduce link status event  as replacement to all other existed
> machinism is too aggressive that will make us lose some other features
> of PCIe hotplug.

I totally agree. My proposal was to add link state event * in addition to * other already existing mechanisms. That should (atleast in principle) not disturb existing mechanisms.

In fact, originally I proposed introducing a module parameter for enabling use of link events for hotplug. But later after discussion with Bjorn, I changed it to always enable.

>  for example,  attention button let system and user
> have time to make preparation and undo.  presence bit is confirmation to
> system etc.  someone like me would say no if it would be replacement
> while not an additament.

Yes, I understand. In fact I had discussed with Bjorn how the behavior might change and atleast my impression was that he seemed to be okay with it:

http://marc.info/?t=138513966800006&r=1&w=2

> 
>    4. Just as Bjorn mentioned, don't break it and 'fix' it, try to make
> a complete and well formed , more important --- considered design for
> all things,  the existing and the one you would like introduce. it is
> welcome.

Sure, I understand and would fully comply.

I'm have reasonable level of confidence that my patch addresses a problem that is
Faced by many products that do not support traditional HP elements. Thus to that
Extent, I wanted my work to be useful to the community.

As far as I could think of, I had submitted my patch with the hope that some one will take a look and point out any foreseen Issues, but unfortunately could not get a lot of attention. Not sure what should be the next step :-(
  
Actually the reason I spoke up in this thread was that I wanted to make aware that I am proposing that the link not be permanently left disabled at the time of power off, and discuss any related issues thereof. 

Thanks & best Regards,

Rajat

> 
>  Thanks,
> Ethan
> 
> On Tue, Dec 10, 2013 at 2:46 AM, Rajat Jain <rajatjain@juniper.net>
> wrote:
> > Hello,
> >
> >> -----Original Message-----
> >> From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-
> >> owner@vger.kernel.org] On Behalf Of Ethan Zhao
> >> Sent: Sunday, December 08, 2013 6:10 PM
> >> To: Rajat Jain
> >> Cc: Bjorn Helgaas; Yinghai Lu; linux-pci@vger.kernel.org; Kenji
> >> Kaneshige; stable@vger.kernel.org
> >> Subject: Re: [PATCH] PCI, pciehp: Turn on link a while to workaround
> >> presense detection
> >>
> >> Rajat,
> >>
> >>     Do you have a draft to extend the hotplug spec ?  seem you wanna
> >> introduce link change interrupt to the your chip or just poll the
> >> link status register ?
> >
> > This is for the systems that do not implement elements like attention
> button:
> >
> > http://www.spinics.net/lists/hotplug/msg05802.html
> > (Inspiration from the thread:
> > http://www.spinics.net/lists/linux-pci/msg05783.html)
> >
> >
> > Here was the proposed minimal patch:
> >
> > http://marc.info/?l=linux-pci&m=138610993912921&w=2
> > http://marc.info/?t=138611014300001&r=1&w=2
> >
> > Would appreciate any review comments.
> >
> > Thanks,
> >
> > Rajat
> >
> >
> >>
> >> Thanks,
> >> Ethan
> >>
> >> On Mon, Dec 9, 2013 at 5:29 AM, Rajat Jain <rajatjain@juniper.net>
> >> wrote:
> >> > Hello,
> >> >
> >> >> -----Original Message-----
> >> >> From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-
> >> >> owner@vger.kernel.org] On Behalf Of Bjorn Helgaas
> >> >> Sent: Saturday, December 07, 2013 10:36 AM
> >> >> To: Yinghai Lu
> >> >> Cc: linux-pci@vger.kernel.org; Kenji Kaneshige;
> >> >> stable@vger.kernel.org
> >> >> Subject: Re: [PATCH] PCI, pciehp: Turn on link a while to
> >> >> workaround presense detection
> >> >>
> >> >> On Fri, Dec 6, 2013 at 1:22 PM, Yinghai Lu <yinghai@kernel.org>
> >> wrote:
> >> >> > During hotplug test on platform, found slot status register does
> >> >> > not report present status correctly. That present bit does not
> >> >> > get cleared even after that card is removed.
> >> >> >
> >> >> > That problem is caused by commit:
> >> >> > | commit 2debd9289997fc5d1c0043b41201a8b40d5e11d0
> >> >> > |
> >> >> > |    PCI: pciehp: Disable/enable link during slot power off/on
> >> >> >
> >> >> > It looks like chipset bug, that PresDet bit is "OR" operation
> >> >> > between sideband input from FPGA, and chipset inband detection
> >> >> > from
> >> pcie link.
> >> >>
> >> >> This doesn't sound like a chipset bug.  It sounds like exactly
> >> >> what the spec describes: "Presence Detect State - This bit
> >> >> indicates the presence of an adapter in the slot, reflected by the
> >> >> logical "OR" of the Physical Layer in-band presence detect
> >> >> mechanism and, if present, any out-of-band presence detect
> >> >> mechanism defined for the slot's corresponding form factor" (PCIe
> 3.0, sec 7.8.11).
> >> >>
> >> >> So I want to fix this pciehp problem, but between 2debd9289997 and
> >> >> this patch, pciehp_power_off_slot() is accumulating warts that
> >> >> make it look like we're just reacting to things that break rather
> >> >> than making a robust design to start with.
> >> >
> >> > Yes.
> >> >
> >> > In fact things do get more complicated if we introduce link state
> >> based hot-plug, because then that requires that we do NOT disable the
> >> link permanently during power off. (Because we would want to receive
> >> future hot-plug link-up events).
> >> >
> >> >>
> >> >> > It does not like if we disable pcie link at first and power off
> >> >> > other side device, and it has input from inband detection always
> 1.
> >> >> >
> >> >> > Workaround: Try turn on link a while after power off.
> >> >> >
> >> >> > After this patch, PresDet report correct status when removing or
> >> >> > adding card later.
> >> >> >
> >> >> > Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> >> >> > Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> >> >> > Cc: <stable@vger.kernel.org> 3.4+
> >> >> >
> >> >> > ---
> >> >> >  drivers/pci/hotplug/pciehp_hpc.c |    9 +++++++++
> >> >> >  1 file changed, 9 insertions(+)
> >> >> >
> >> >> > Index: linux-2.6/drivers/pci/hotplug/pciehp_hpc.c
> >> >> > ================================================================
> >> >> > ===
> >> >> > --- linux-2.6.orig/drivers/pci/hotplug/pciehp_hpc.c
> >> >> > +++ linux-2.6/drivers/pci/hotplug/pciehp_hpc.c
> >> >> > @@ -637,6 +637,15 @@ int pciehp_power_off_slot(struct slot *
> >> >> >         }
> >> >> >         ctrl_dbg(ctrl, "%s: SLOTCTRL %x write cmd %x\n",
> __func__,
> >> >> >                  pci_pcie_cap(ctrl->pcie->port) +
> >> >> > PCI_EXP_SLTCTL, slot_cmd);
> >> >> > +
> >> >> > +       /*
> >> >> > +        * Enable link for a while so chipset could settle down
> >> >> > +        * inband presence detection logic
> >> >> > +        */
> >> >> > +       pciehp_link_enable(ctrl);
> >> >> > +       msleep(20);
> >> >> > +       pciehp_link_disable(ctrl);
> >> >> > +
> >> >> >         return 0;
> >> >> >  }
> >> >> >
> >> >> --
> >> >> To unsubscribe from this list: send the line "unsubscribe linux-
> pci"
> >> >> in the body of a message to majordomo@vger.kernel.org More
> >> >> majordomo info at  http://vger.kernel.org/majordomo-info.html
> >> >>
> >> >
> >> >
> >> > --
> >> > To unsubscribe from this list: send the line "unsubscribe linux-
> pci"
> >> > in the body of a message to majordomo@vger.kernel.org More
> >> > majordomo info at  http://vger.kernel.org/majordomo-info.html
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-pci"
> >> in the body of a message to majordomo@vger.kernel.org More majordomo
> >> info at  http://vger.kernel.org/majordomo-info.html
> >>
> >
> >
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu Dec. 10, 2013, 9:44 p.m. UTC | #17
On Mon, Dec 9, 2013 at 4:34 PM, Guenter Roeck <groeck@juniper.net> wrote:
>> > Actually there are situations where the presence detect signal does
>> not make much sense, because the card is not really plugged out
>> physically. For e.g. how do we deal with PCIe end points that may be on
>> the board itself, but say come out of reset after the bootup? Or go
>> down during a firmware upgrade and again come back up after the
>> upgrade?
>> >
>>
>> That is driver's job.
>> mpt2sas driver does diag_reset.
>>
> Possibly, but quite frankly, after looking into that code, it seems quite dirty
> to me. Of course, that is my personal opinion only. It may suggest, however,
> assuming that the mpt2sas driver's resent handler causes a link down/link up
> sequence of events, that there is a valid case for the boot option.
> I don't entirely understand, though, how that driver manages to reset its
> PCIe interface without the need to re-initialize it, but maybe I am just
> missing something.
>
> There are a number of other problems with the in-driver approach.
> For example, it would require the driver to re-program the chip's PCIe
> configuration space, as it will be 'vanilla' after the reset. Not even
> sure how this can work, since that includes the chip BAR registers
> and various other configuration registers which are normally set by the
> PCIe subsystem. I'd rather handle this through the PCIe subsystem
> instead of kludging around the problem in the driver.
>
> Another problem is resulting boot delays. If link up/down is handled through
> the PCIe subsystem, we can have the driver return -EPROBE_DEFER in the initial
> probe, and everything just works automatically like magic. Waiting in probe
> for the chip to re-initialize itself would add substantial delay to the boot
> sequence. With several such chips in the system that can add up to a lot.
>
> Even if we manage to implement all this, we still didn't solve our other
> problems, which include chips which are activated after initial PCIe scan,
> chips which (unfortunately) tend to hang up themselves without notice,
> and real OIR devices which don't implement "correct" PCIe OIR handling.
>
> Rajat's patch solves all those problems, and in my opinion it does so
> quite elegantly.

According to your description, i think you can do them with scripts in user
land and your driver.
Your driver should have provide interface to load fw.
Have a scripts and load fw, and from /sys stop and remove related pci devices.
then rescan corresponding bridge later.

In that case, you don't even to add dummy pcie slot cap...to get pciehp loaded.

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck Dec. 10, 2013, 10 p.m. UTC | #18
> -----Original Message-----
> From: yhlu.kernel@gmail.com [mailto:yhlu.kernel@gmail.com] On Behalf Of
> Yinghai Lu
> Sent: Tuesday, December 10, 2013 1:44 PM
> To: Guenter Roeck
> Cc: Rajat Jain; Ethan Zhao; Bjorn Helgaas; linux-pci@vger.kernel.org;
> Kenji Kaneshige; stable@vger.kernel.org
> Subject: Re: [PATCH] PCI, pciehp: Turn on link a while to workaround
> presense detection
> 
> On Mon, Dec 9, 2013 at 4:34 PM, Guenter Roeck <groeck@juniper.net>
> wrote:
> >> > Actually there are situations where the presence detect signal
> does
> >> not make much sense, because the card is not really plugged out
> >> physically. For e.g. how do we deal with PCIe end points that may be
> >> on the board itself, but say come out of reset after the bootup? Or
> >> go down during a firmware upgrade and again come back up after the
> >> upgrade?
> >> >
> >>
> >> That is driver's job.
> >> mpt2sas driver does diag_reset.
> >>
> > Possibly, but quite frankly, after looking into that code, it seems
> > quite dirty to me. Of course, that is my personal opinion only. It
> may
> > suggest, however, assuming that the mpt2sas driver's resent handler
> > causes a link down/link up sequence of events, that there is a valid
> case for the boot option.
> > I don't entirely understand, though, how that driver manages to reset
> > its PCIe interface without the need to re-initialize it, but maybe I
> > am just missing something.
> >
> > There are a number of other problems with the in-driver approach.
> > For example, it would require the driver to re-program the chip's
> PCIe
> > configuration space, as it will be 'vanilla' after the reset. Not
> even
> > sure how this can work, since that includes the chip BAR registers
> and
> > various other configuration registers which are normally set by the
> > PCIe subsystem. I'd rather handle this through the PCIe subsystem
> > instead of kludging around the problem in the driver.
> >
> > Another problem is resulting boot delays. If link up/down is handled
> > through the PCIe subsystem, we can have the driver return
> > -EPROBE_DEFER in the initial probe, and everything just works
> > automatically like magic. Waiting in probe for the chip to
> > re-initialize itself would add substantial delay to the boot
> sequence. With several such chips in the system that can add up to a
> lot.
> >
> > Even if we manage to implement all this, we still didn't solve our
> > other problems, which include chips which are activated after initial
> > PCIe scan, chips which (unfortunately) tend to hang up themselves
> > without notice, and real OIR devices which don't implement "correct"
> PCIe OIR handling.
> >
> > Rajat's patch solves all those problems, and in my opinion it does so
> > quite elegantly.
> 
> According to your description, i think you can do them with scripts in
> user land and your driver.
> Your driver should have provide interface to load fw.
> Have a scripts and load fw, and from /sys stop and remove related pci
> devices.
> then rescan corresponding bridge later.

The FW itself is accessible through an eeprom and can be updated that way.
To load it we need to set a flag in the device PCI memory space which causes
the entire chip including its PCIe interface to reset.

Sure, we can handle rescan from userspace. However, that is quite vulnerable
and messy (eg don't try to initiate a rescan without first explicitly removing
any devices which have since been reset, removed, and/or reinitialized).
Do anything wrong, or in the wrong sequence, and the PCIe data structures
in the kernel are completely messed up. This may be ok for experimental code,
but I would not want to ship it to a customer.

Besides, it still doesn't solve all problems such as to detect hung devices.
Handling it in the kernel is much more straightforward and cleaner.

Guenter


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

Index: linux-2.6/drivers/pci/hotplug/pciehp_hpc.c
===================================================================
--- linux-2.6.orig/drivers/pci/hotplug/pciehp_hpc.c
+++ linux-2.6/drivers/pci/hotplug/pciehp_hpc.c
@@ -637,6 +637,15 @@  int pciehp_power_off_slot(struct slot *
 	}
 	ctrl_dbg(ctrl, "%s: SLOTCTRL %x write cmd %x\n", __func__,
 		 pci_pcie_cap(ctrl->pcie->port) + PCI_EXP_SLTCTL, slot_cmd);
+
+	/*
+	 * Enable link for a while so chipset could settle down
+	 * inband presence detection logic
+	 */
+	pciehp_link_enable(ctrl);
+	msleep(20);
+	pciehp_link_disable(ctrl);
+
 	return 0;
 }