mbox series

[0/4] mwifiex PCI/wake-up interrupt fixes

Message ID 20190224140426.3267-1-marc.zyngier@arm.com (mailing list archive)
Headers show
Series mwifiex PCI/wake-up interrupt fixes | expand

Message

Marc Zyngier Feb. 24, 2019, 2:04 p.m. UTC
For quite some time, I wondered why the PCI mwifiex device built in my
Chromebook was unable to use the good old legacy interrupts. But as MSIs
were working fine, I never really bothered investigating. I finally had a
look, and the result isn't very pretty.

On this machine (rk3399-based kevin), the wake-up interrupt is described as
such:

&pci_rootport {
	mvl_wifi: wifi@0,0 {
		compatible = "pci1b4b,2b42";
		reg = <0x83010000 0x0 0x00000000 0x0 0x00100000
		       0x83010000 0x0 0x00100000 0x0 0x00100000>;
		interrupt-parent = <&gpio0>;
		interrupts = <8 IRQ_TYPE_LEVEL_LOW>;
		pinctrl-names = "default";
		pinctrl-0 = <&wlan_host_wake_l>;
		wakeup-source;
	};
};

Note how the interrupt is part of the properties directly attached to the
PCI node. And yet, this interrupt has nothing to do with a PCI legacy
interrupt, as it is attached to the wake-up widget that bypasses the PCIe RC
altogether (Yay for the broken design!). This is in total violation of the
IEEE Std 1275-1994 spec[1], which clearly documents that such interrupt
specifiers describe the PCI device interrupts, and must obey the
INT-{A,B,C,D} mapping. Oops!

The net effect of the above is that Linux tries to do something vaguely
sensible, and uses the same interrupt for both the wake-up widget and the
PCI device. This doesn't work for two reasons: (1) the wake-up widget grabs
the interrupt in exclusive mode, and (2) the PCI interrupt is still routed
to the RC, leading to a screaming interrupt. This simply cannot work.

To sort out this mess, we need to lift the confusion between the two
interrupts. This is done by extending the DT binding to allow the wake-up
interrupt to be described in a 'wake-up' subnode, sidestepping the issue
completely. On my Chromebook, it now looks like this:

&pci_rootport {
	mvl_wifi: wifi@0,0 {
		compatible = "pci1b4b,2b42";
		reg = <0x83010000 0x0 0x00000000 0x0 0x00100000
		       0x83010000 0x0 0x00100000 0x0 0x00100000>;
		pinctrl-names = "default";
		pinctrl-0 = <&wlan_host_wake_l>;
		wake-up {
			interrupt-parent = <&gpio0>;
			interrupts = <8 IRQ_TYPE_LEVEL_LOW>;
			wakeup-source;
		};
	};
};

The driver is then updated to look for this subnode first, and fallback to
the original, broken behaviour (spitting out a warning in the offending
configuration).

For good measure, there are two additional patches:

- The wake-up interrupt requesting is horribly racy, and could lead to
  unpredictable behaviours. Let's fix that properly.

- A final patch implementing the above transformation for the whole
  RK3399-based Chromebook range, which all use the same broken
  configuration.

With all that, I finally have PCI legacy interrupts working with the mwifiex
driver on my Chromebook.

[1] http://www.devicetree.org/open-firmware/bindings/pci/pci2_1.pdf

Marc Zyngier (4):
  dt-bindings/marvell-8xxx: Allow wake-up interrupt to be placed in a
    separate node
  mwifiex: Fetch wake-up interrupt from 'wake-up' subnode when it exists
  mwifiex: Flag wake-up interrupt as IRQ_NOAUTOEN rather than disabling
    it too late
  arm64: dts: rockchip: gru: Move wifi wake-up interrupt into its own
    subnode

 .../bindings/net/wireless/marvell-8xxx.txt    | 23 ++++++++++++++++++-
 .../dts/rockchip/rk3399-gru-chromebook.dtsi   |  8 ++++---
 drivers/net/wireless/marvell/mwifiex/main.c   | 13 +++++++++--
 3 files changed, 38 insertions(+), 6 deletions(-)

Comments

Ard Biesheuvel Feb. 25, 2019, 12:45 p.m. UTC | #1
On Sun, 24 Feb 2019 at 15:08, Marc Zyngier <marc.zyngier@arm.com> wrote:
>
> For quite some time, I wondered why the PCI mwifiex device built in my
> Chromebook was unable to use the good old legacy interrupts. But as MSIs
> were working fine, I never really bothered investigating. I finally had a
> look, and the result isn't very pretty.
>
> On this machine (rk3399-based kevin), the wake-up interrupt is described as
> such:
>
> &pci_rootport {
>         mvl_wifi: wifi@0,0 {
>                 compatible = "pci1b4b,2b42";
>                 reg = <0x83010000 0x0 0x00000000 0x0 0x00100000
>                        0x83010000 0x0 0x00100000 0x0 0x00100000>;
>                 interrupt-parent = <&gpio0>;
>                 interrupts = <8 IRQ_TYPE_LEVEL_LOW>;
>                 pinctrl-names = "default";
>                 pinctrl-0 = <&wlan_host_wake_l>;
>                 wakeup-source;
>         };
> };
>
> Note how the interrupt is part of the properties directly attached to the
> PCI node. And yet, this interrupt has nothing to do with a PCI legacy
> interrupt, as it is attached to the wake-up widget that bypasses the PCIe RC
> altogether (Yay for the broken design!). This is in total violation of the
> IEEE Std 1275-1994 spec[1], which clearly documents that such interrupt
> specifiers describe the PCI device interrupts, and must obey the
> INT-{A,B,C,D} mapping. Oops!
>

The mapping of legacy PCIe INTx interrupts onto wired system
interrupts is a property of the PCIe host controller, not of a
particular PCIe device. So I would argue that the code is broken here
as well: it should never attempt to interpret 'interrupt' properties
at the PCI device level as having any bearing on how legacy interrupts
are routed.

> The net effect of the above is that Linux tries to do something vaguely
> sensible, and uses the same interrupt for both the wake-up widget and the
> PCI device. This doesn't work for two reasons: (1) the wake-up widget grabs
> the interrupt in exclusive mode, and (2) the PCI interrupt is still routed
> to the RC, leading to a screaming interrupt. This simply cannot work.
>
> To sort out this mess, we need to lift the confusion between the two
> interrupts. This is done by extending the DT binding to allow the wake-up
> interrupt to be described in a 'wake-up' subnode, sidestepping the issue
> completely. On my Chromebook, it now looks like this:
>
> &pci_rootport {
>         mvl_wifi: wifi@0,0 {
>                 compatible = "pci1b4b,2b42";
>                 reg = <0x83010000 0x0 0x00000000 0x0 0x00100000
>                        0x83010000 0x0 0x00100000 0x0 0x00100000>;
>                 pinctrl-names = "default";
>                 pinctrl-0 = <&wlan_host_wake_l>;
>                 wake-up {
>                         interrupt-parent = <&gpio0>;
>                         interrupts = <8 IRQ_TYPE_LEVEL_LOW>;
>                         wakeup-source;
>                 };
>         };
> };
>
> The driver is then updated to look for this subnode first, and fallback to
> the original, broken behaviour (spitting out a warning in the offending
> configuration).
>
> For good measure, there are two additional patches:
>
> - The wake-up interrupt requesting is horribly racy, and could lead to
>   unpredictable behaviours. Let's fix that properly.
>
> - A final patch implementing the above transformation for the whole
>   RK3399-based Chromebook range, which all use the same broken
>   configuration.
>
> With all that, I finally have PCI legacy interrupts working with the mwifiex
> driver on my Chromebook.
>
> [1] http://www.devicetree.org/open-firmware/bindings/pci/pci2_1.pdf
>
> Marc Zyngier (4):
>   dt-bindings/marvell-8xxx: Allow wake-up interrupt to be placed in a
>     separate node
>   mwifiex: Fetch wake-up interrupt from 'wake-up' subnode when it exists
>   mwifiex: Flag wake-up interrupt as IRQ_NOAUTOEN rather than disabling
>     it too late
>   arm64: dts: rockchip: gru: Move wifi wake-up interrupt into its own
>     subnode
>
>  .../bindings/net/wireless/marvell-8xxx.txt    | 23 ++++++++++++++++++-
>  .../dts/rockchip/rk3399-gru-chromebook.dtsi   |  8 ++++---
>  drivers/net/wireless/marvell/mwifiex/main.c   | 13 +++++++++--
>  3 files changed, 38 insertions(+), 6 deletions(-)
>
> --
> 2.20.1
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Marc Zyngier Feb. 25, 2019, 2:52 p.m. UTC | #2
Hi Ard,

On 25/02/2019 12:45, Ard Biesheuvel wrote:
> On Sun, 24 Feb 2019 at 15:08, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>
>> For quite some time, I wondered why the PCI mwifiex device built in my
>> Chromebook was unable to use the good old legacy interrupts. But as MSIs
>> were working fine, I never really bothered investigating. I finally had a
>> look, and the result isn't very pretty.
>>
>> On this machine (rk3399-based kevin), the wake-up interrupt is described as
>> such:
>>
>> &pci_rootport {
>>         mvl_wifi: wifi@0,0 {
>>                 compatible = "pci1b4b,2b42";
>>                 reg = <0x83010000 0x0 0x00000000 0x0 0x00100000
>>                        0x83010000 0x0 0x00100000 0x0 0x00100000>;
>>                 interrupt-parent = <&gpio0>;
>>                 interrupts = <8 IRQ_TYPE_LEVEL_LOW>;
>>                 pinctrl-names = "default";
>>                 pinctrl-0 = <&wlan_host_wake_l>;
>>                 wakeup-source;
>>         };
>> };
>>
>> Note how the interrupt is part of the properties directly attached to the
>> PCI node. And yet, this interrupt has nothing to do with a PCI legacy
>> interrupt, as it is attached to the wake-up widget that bypasses the PCIe RC
>> altogether (Yay for the broken design!). This is in total violation of the
>> IEEE Std 1275-1994 spec[1], which clearly documents that such interrupt
>> specifiers describe the PCI device interrupts, and must obey the
>> INT-{A,B,C,D} mapping. Oops!
>>
> 
> The mapping of legacy PCIe INTx interrupts onto wired system
> interrupts is a property of the PCIe host controller, not of a
> particular PCIe device. So I would argue that the code is broken here
> as well: it should never attempt to interpret 'interrupt' properties
> at the PCI device level as having any bearing on how legacy interrupts
> are routed.

OpenFirmware says that this node contains the interrupt number of the
device (4.1.1. Open Firmware-defined Properties for Child Nodes). The
trick is that this property is generated *from* the device, and not set
in stone.

DT, on the other hand, takes whatever is described there and uses it as
the gospel to configure the OS, no matter how the PCI device is actually
configured. If the two don't match (like in this case), things break.
This is the "DT describes the HW" mantra, for (sometimes) better or
(more generally) worse.

What the DT code does is to interpret the whole interrupt specifier,
*including the interrupt-parent*. And that feels wrong. It should always
be in the context of the host controller. But on the other side, the DT
code is not in the business of validating the DT either...

It outlines one thing: If you have to interpret per-device PCI
properties from DT, you're in for serious trouble. I should get some
better HW.

Thanks,

	M.
Ard Biesheuvel Feb. 26, 2019, 4:21 p.m. UTC | #3
On Mon, 25 Feb 2019 at 15:53, Marc Zyngier <marc.zyngier@arm.com> wrote:
>
> Hi Ard,
>
> On 25/02/2019 12:45, Ard Biesheuvel wrote:
> > On Sun, 24 Feb 2019 at 15:08, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >>
> >> For quite some time, I wondered why the PCI mwifiex device built in my
> >> Chromebook was unable to use the good old legacy interrupts. But as MSIs
> >> were working fine, I never really bothered investigating. I finally had a
> >> look, and the result isn't very pretty.
> >>
> >> On this machine (rk3399-based kevin), the wake-up interrupt is described as
> >> such:
> >>
> >> &pci_rootport {
> >>         mvl_wifi: wifi@0,0 {
> >>                 compatible = "pci1b4b,2b42";
> >>                 reg = <0x83010000 0x0 0x00000000 0x0 0x00100000
> >>                        0x83010000 0x0 0x00100000 0x0 0x00100000>;
> >>                 interrupt-parent = <&gpio0>;
> >>                 interrupts = <8 IRQ_TYPE_LEVEL_LOW>;
> >>                 pinctrl-names = "default";
> >>                 pinctrl-0 = <&wlan_host_wake_l>;
> >>                 wakeup-source;
> >>         };
> >> };
> >>
> >> Note how the interrupt is part of the properties directly attached to the
> >> PCI node. And yet, this interrupt has nothing to do with a PCI legacy
> >> interrupt, as it is attached to the wake-up widget that bypasses the PCIe RC
> >> altogether (Yay for the broken design!). This is in total violation of the
> >> IEEE Std 1275-1994 spec[1], which clearly documents that such interrupt
> >> specifiers describe the PCI device interrupts, and must obey the
> >> INT-{A,B,C,D} mapping. Oops!
> >>
> >
> > The mapping of legacy PCIe INTx interrupts onto wired system
> > interrupts is a property of the PCIe host controller, not of a
> > particular PCIe device. So I would argue that the code is broken here
> > as well: it should never attempt to interpret 'interrupt' properties
> > at the PCI device level as having any bearing on how legacy interrupts
> > are routed.
>
> OpenFirmware says that this node contains the interrupt number of the
> device (4.1.1. Open Firmware-defined Properties for Child Nodes). The
> trick is that this property is generated *from* the device, and not set
> in stone.
>
> DT, on the other hand, takes whatever is described there and uses it as
> the gospel to configure the OS, no matter how the PCI device is actually
> configured. If the two don't match (like in this case), things break.
> This is the "DT describes the HW" mantra, for (sometimes) better or
> (more generally) worse.
>
> What the DT code does is to interpret the whole interrupt specifier,
> *including the interrupt-parent*. And that feels wrong. It should always
> be in the context of the host controller. But on the other side, the DT
> code is not in the business of validating the DT either...
>
> It outlines one thing: If you have to interpret per-device PCI
> properties from DT, you're in for serious trouble. I should get some
> better HW.
>

Yeah, it obviously makes no sense at all for the interrupt parent of a
PCI device to deviate from the host bridge's interrupt parent, and
it's quite unfortunate that we can't simply ban it now that the cat is
out of the bag already.

Arguably, the wake up widget is not part of the PCI device, but I have
no opinion as to whether it is better modeling it as a sub device as
you are proposing or as an entirely separate device referenced via a
phandle.
Marc Zyngier Feb. 26, 2019, 5:14 p.m. UTC | #4
On 26/02/2019 16:21, Ard Biesheuvel wrote:
> On Mon, 25 Feb 2019 at 15:53, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>
>> Hi Ard,
>>
>> On 25/02/2019 12:45, Ard Biesheuvel wrote:
>>> On Sun, 24 Feb 2019 at 15:08, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>>
>>>> For quite some time, I wondered why the PCI mwifiex device built in my
>>>> Chromebook was unable to use the good old legacy interrupts. But as MSIs
>>>> were working fine, I never really bothered investigating. I finally had a
>>>> look, and the result isn't very pretty.
>>>>
>>>> On this machine (rk3399-based kevin), the wake-up interrupt is described as
>>>> such:
>>>>
>>>> &pci_rootport {
>>>>         mvl_wifi: wifi@0,0 {
>>>>                 compatible = "pci1b4b,2b42";
>>>>                 reg = <0x83010000 0x0 0x00000000 0x0 0x00100000
>>>>                        0x83010000 0x0 0x00100000 0x0 0x00100000>;
>>>>                 interrupt-parent = <&gpio0>;
>>>>                 interrupts = <8 IRQ_TYPE_LEVEL_LOW>;
>>>>                 pinctrl-names = "default";
>>>>                 pinctrl-0 = <&wlan_host_wake_l>;
>>>>                 wakeup-source;
>>>>         };
>>>> };
>>>>
>>>> Note how the interrupt is part of the properties directly attached to the
>>>> PCI node. And yet, this interrupt has nothing to do with a PCI legacy
>>>> interrupt, as it is attached to the wake-up widget that bypasses the PCIe RC
>>>> altogether (Yay for the broken design!). This is in total violation of the
>>>> IEEE Std 1275-1994 spec[1], which clearly documents that such interrupt
>>>> specifiers describe the PCI device interrupts, and must obey the
>>>> INT-{A,B,C,D} mapping. Oops!
>>>>
>>>
>>> The mapping of legacy PCIe INTx interrupts onto wired system
>>> interrupts is a property of the PCIe host controller, not of a
>>> particular PCIe device. So I would argue that the code is broken here
>>> as well: it should never attempt to interpret 'interrupt' properties
>>> at the PCI device level as having any bearing on how legacy interrupts
>>> are routed.
>>
>> OpenFirmware says that this node contains the interrupt number of the
>> device (4.1.1. Open Firmware-defined Properties for Child Nodes). The
>> trick is that this property is generated *from* the device, and not set
>> in stone.
>>
>> DT, on the other hand, takes whatever is described there and uses it as
>> the gospel to configure the OS, no matter how the PCI device is actually
>> configured. If the two don't match (like in this case), things break.
>> This is the "DT describes the HW" mantra, for (sometimes) better or
>> (more generally) worse.
>>
>> What the DT code does is to interpret the whole interrupt specifier,
>> *including the interrupt-parent*. And that feels wrong. It should always
>> be in the context of the host controller. But on the other side, the DT
>> code is not in the business of validating the DT either...
>>
>> It outlines one thing: If you have to interpret per-device PCI
>> properties from DT, you're in for serious trouble. I should get some
>> better HW.
>>
> 
> Yeah, it obviously makes no sense at all for the interrupt parent of a
> PCI device to deviate from the host bridge's interrupt parent, and
> it's quite unfortunate that we can't simply ban it now that the cat is
> out of the bag already.
> 
> Arguably, the wake up widget is not part of the PCI device, but I have
> no opinion as to whether it is better modeling it as a sub device as
> you are proposing or as an entirely separate device referenced via a
> phandle.

It is not that clear. The widget seems to be an integral part of the
device, as it is the same basic IP that is used for SDIO and USB.

It looks like the good old pre-PCI-2.2 days, where you had to have a
separate cable between your network card and the base-board for the
wake-up interrupt to be delivered. Starting with PCI-2.2, the bus can
carry the signal just fine. With PCIe, it should just be an interrupt
TLP sent to the RC, but that's obviously not within the capabilities of
the HW.

Anyway, it'd be good if the Marvell people could chime in and let us
know how they'd prefer to handle this.

Thanks,

	M.
Brian Norris Feb. 26, 2019, 11:28 p.m. UTC | #5
+ others

Hi Marc,

Thanks for the series. I have a few bits of history to add to this, and
some comments.

On Sun, Feb 24, 2019 at 02:04:22PM +0000, Marc Zyngier wrote:
> For quite some time, I wondered why the PCI mwifiex device built in my
> Chromebook was unable to use the good old legacy interrupts. But as MSIs
> were working fine, I never really bothered investigating. I finally had a
> look, and the result isn't very pretty.
> 
> On this machine (rk3399-based kevin), the wake-up interrupt is described as
> such:
> 
> &pci_rootport {
> 	mvl_wifi: wifi@0,0 {
> 		compatible = "pci1b4b,2b42";
> 		reg = <0x83010000 0x0 0x00000000 0x0 0x00100000
> 		       0x83010000 0x0 0x00100000 0x0 0x00100000>;
> 		interrupt-parent = <&gpio0>;
> 		interrupts = <8 IRQ_TYPE_LEVEL_LOW>;
> 		pinctrl-names = "default";
> 		pinctrl-0 = <&wlan_host_wake_l>;
> 		wakeup-source;
> 	};
> };
> 
> Note how the interrupt is part of the properties directly attached to the
> PCI node. And yet, this interrupt has nothing to do with a PCI legacy
> interrupt, as it is attached to the wake-up widget that bypasses the PCIe RC
> altogether (Yay for the broken design!). This is in total violation of the
> IEEE Std 1275-1994 spec[1], which clearly documents that such interrupt
> specifiers describe the PCI device interrupts, and must obey the
> INT-{A,B,C,D} mapping. Oops!

You're not the first person to notice this. All the motivations are not
necessarily painted clearly in their cover letter, but here are some
previous attempts at solving this problem:

[RFC PATCH v11 0/5] PCI: rockchip: Move PCIe WAKE# handling into pci core
https://lkml.kernel.org/lkml/20171225114742.18920-1-jeffy.chen@rock-chips.com/
http://lkml.kernel.org/lkml/20171226023646.17722-1-jeffy.chen@rock-chips.com/

As you can see by the 12th iteration, it wasn't left unsolved for lack
of trying...

Frankly, if a proper DT replacement to the admittedly bad binding isn't
agreed upon quickly, I'd be very happy to just have WAKE# support
removed from the DTS for now, and the existing mwifiex binding should
just be removed. (Wake-on-WiFi was never properly vetted on these
platforms anyway.) It mostly serves to just cause problems like you've
noticed.

> The net effect of the above is that Linux tries to do something vaguely
> sensible, and uses the same interrupt for both the wake-up widget and the
> PCI device. This doesn't work for two reasons: (1) the wake-up widget grabs
> the interrupt in exclusive mode, and (2) the PCI interrupt is still routed
> to the RC, leading to a screaming interrupt. This simply cannot work.
> 
> To sort out this mess, we need to lift the confusion between the two
> interrupts. This is done by extending the DT binding to allow the wake-up
> interrupt to be described in a 'wake-up' subnode, sidestepping the issue
> completely. On my Chromebook, it now looks like this:
> 
> &pci_rootport {
> 	mvl_wifi: wifi@0,0 {
> 		compatible = "pci1b4b,2b42";
> 		reg = <0x83010000 0x0 0x00000000 0x0 0x00100000
> 		       0x83010000 0x0 0x00100000 0x0 0x00100000>;
> 		pinctrl-names = "default";
> 		pinctrl-0 = <&wlan_host_wake_l>;
> 		wake-up {
> 			interrupt-parent = <&gpio0>;
> 			interrupts = <8 IRQ_TYPE_LEVEL_LOW>;
> 			wakeup-source;
> 		};
> 	};
> };

One problem Rockchip authors were also trying to resolve here is that
PCIe WAKE# handling should not really be something the PCI device driver
has to handle directly. Despite your complaints about not using in-band
TLP wakeup, a separate WAKE# pin is in fact a documented part of the
PCIe standard, and it so happens that the Rockchip RC does not support
handling TLPs in S3, if you want to have decent power consumption. (Your
"bad hardware" complaints could justifiably fall here, I suppose.)

Additionally, I've had pushback from PCI driver authors/maintainers on
adding more special handling for this sort of interrupt property (not
the binding specifically, but just the concept of handling WAKE# in the
driver), as they claim this should be handled by the system firmware,
when they set the appropriate wakeup flags, which filter down to
__pci_enable_wake() -> platform_pci_set_wakeup(). That's how x86 systems
do it (note: I know for a fact that many have a very similar
architecture -- WAKE# is not routed to the RC, because, why does it need
to? and they *don't* use TLP wakeup either -- they just hide it in
firmware better), and it Just Works.

So, we basically concluded that we should standardize on a way to
describe WAKE# interrupts such that PCI drivers don't have to deal with
it at all, and the PCI core can do it for us. 12 revisions later
and...we still never agreed on a good device tree binding for this.

IOW, maybe your wake-up sub-node is the best way to side-step the
problems of conflicting with the OF PCI spec. But I'd still really like
to avoid parsing it in mwifiex, if at all possible.

(We'd still be left with the marvell,wakeup-pin propery to parse
specifically in mwifiex, which sadly has to exist because....well,
Samsung decided to do chip-on-board, and then they failed to use the
correct pin on Marvell's side when wiring up WAKE#. Sigh.)

> The driver is then updated to look for this subnode first, and fallback to
> the original, broken behaviour (spitting out a warning in the offending
> configuration).
> 
> For good measure, there are two additional patches:
> 
> - The wake-up interrupt requesting is horribly racy, and could lead to
>   unpredictable behaviours. Let's fix that properly.

Ack. That mistake was repeated in other drivers and has since been fixed
in those. We need it here too.

Brian

> - A final patch implementing the above transformation for the whole
>   RK3399-based Chromebook range, which all use the same broken
>   configuration.
> 
> With all that, I finally have PCI legacy interrupts working with the mwifiex
> driver on my Chromebook.
> 
> [1] http://www.devicetree.org/open-firmware/bindings/pci/pci2_1.pdf
> 
> Marc Zyngier (4):
>   dt-bindings/marvell-8xxx: Allow wake-up interrupt to be placed in a
>     separate node
>   mwifiex: Fetch wake-up interrupt from 'wake-up' subnode when it exists
>   mwifiex: Flag wake-up interrupt as IRQ_NOAUTOEN rather than disabling
>     it too late
>   arm64: dts: rockchip: gru: Move wifi wake-up interrupt into its own
>     subnode
> 
>  .../bindings/net/wireless/marvell-8xxx.txt    | 23 ++++++++++++++++++-
>  .../dts/rockchip/rk3399-gru-chromebook.dtsi   |  8 ++++---
>  drivers/net/wireless/marvell/mwifiex/main.c   | 13 +++++++++--
>  3 files changed, 38 insertions(+), 6 deletions(-)
> 
> -- 
> 2.20.1
>
Brian Norris Feb. 26, 2019, 11:44 p.m. UTC | #6
Hi,

On Tue, Feb 26, 2019 at 05:14:00PM +0000, Marc Zyngier wrote:
> On 26/02/2019 16:21, Ard Biesheuvel wrote:
> > On Mon, 25 Feb 2019 at 15:53, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >> It outlines one thing: If you have to interpret per-device PCI
> >> properties from DT, you're in for serious trouble. I should get some
> >> better HW.
> >>
> > 
> > Yeah, it obviously makes no sense at all for the interrupt parent of a
> > PCI device to deviate from the host bridge's interrupt parent, and
> > it's quite unfortunate that we can't simply ban it now that the cat is
> > out of the bag already.
> > 
> > Arguably, the wake up widget is not part of the PCI device, but I have
> > no opinion as to whether it is better modeling it as a sub device as
> > you are proposing or as an entirely separate device referenced via a
> > phandle.
> 
> It is not that clear. The widget seems to be an integral part of the
> device, as it is the same basic IP that is used for SDIO and USB.

It's not really a widget specific to this IP. It's just a GPIO. It so
happens that both SDIO and PCIe designs have wanted to use a GPIO for
wakeup, as many other devices do. (Note: it's not just cheap ARM
devices; pulling up some Intel Chromebook designs, I see the exact same
WAKE# GPIO on their PCIe WiFi as well.)

> It looks like the good old pre-PCI-2.2 days, where you had to have a
> separate cable between your network card and the base-board for the
> wake-up interrupt to be delivered. Starting with PCI-2.2, the bus can
> carry the signal just fine. With PCIe, it should just be an interrupt
> TLP sent to the RC, but that's obviously not within the capabilities of
> the HW.

You should search the PCI Express specification for WAKE#. There is a
clearly-documented "side-band wake" feature that is part of the
standard, as an alternative to in-band TLP wakeup. While you claim this
is an ancient thing, it in fact still in use on many systems -- it's
just usually abstracted better by ACPI firmware, whereas the dirty
laundry is aired a bit more on a Device Tree system. And we got it
wrong.

> Anyway, it'd be good if the Marvell people could chime in and let us
> know how they'd prefer to handle this.

I'm not sure this is really a Marvell-specific problem. (Well, except
for the marvell,wakeup-pin silliness, which is somewhat orthogonal.) In
fact, if we cared a little more about Wake-on-WiFi, we'd be trying to
support the same (out-of-band WAKE#) with other WiFi drivers.

Brian
Marc Zyngier Feb. 27, 2019, 9:27 a.m. UTC | #7
On 26/02/2019 23:44, Brian Norris wrote:
> Hi,
> 
> On Tue, Feb 26, 2019 at 05:14:00PM +0000, Marc Zyngier wrote:
>> On 26/02/2019 16:21, Ard Biesheuvel wrote:
>>> On Mon, 25 Feb 2019 at 15:53, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>> It outlines one thing: If you have to interpret per-device PCI
>>>> properties from DT, you're in for serious trouble. I should get some
>>>> better HW.
>>>>
>>>
>>> Yeah, it obviously makes no sense at all for the interrupt parent of a
>>> PCI device to deviate from the host bridge's interrupt parent, and
>>> it's quite unfortunate that we can't simply ban it now that the cat is
>>> out of the bag already.
>>>
>>> Arguably, the wake up widget is not part of the PCI device, but I have
>>> no opinion as to whether it is better modeling it as a sub device as
>>> you are proposing or as an entirely separate device referenced via a
>>> phandle.
>>
>> It is not that clear. The widget seems to be an integral part of the
>> device, as it is the same basic IP that is used for SDIO and USB.
> 
> It's not really a widget specific to this IP. It's just a GPIO. It so
> happens that both SDIO and PCIe designs have wanted to use a GPIO for
> wakeup, as many other devices do. (Note: it's not just cheap ARM
> devices; pulling up some Intel Chromebook designs, I see the exact same
> WAKE# GPIO on their PCIe WiFi as well.)

Arghh! If I can't even point people to an Intel design as an example of
something done halfway right, we're screwed! ;-)

> 
>> It looks like the good old pre-PCI-2.2 days, where you had to have a
>> separate cable between your network card and the base-board for the
>> wake-up interrupt to be delivered. Starting with PCI-2.2, the bus can
>> carry the signal just fine. With PCIe, it should just be an interrupt
>> TLP sent to the RC, but that's obviously not within the capabilities of
>> the HW.
> 
> You should search the PCI Express specification for WAKE#. There is a
> clearly-documented "side-band wake" feature that is part of the
> standard, as an alternative to in-band TLP wakeup. While you claim this
> is an ancient thing, it in fact still in use on many systems -- it's
> just usually abstracted better by ACPI firmware, whereas the dirty
> laundry is aired a bit more on a Device Tree system. And we got it
> wrong.

I stand corrected. I was really hoping for these side-band wires to be a
thing of the past, but clearly the world hasn't moved as quickly as I hoped.

>> Anyway, it'd be good if the Marvell people could chime in and let us
>> know how they'd prefer to handle this.
> 
> I'm not sure this is really a Marvell-specific problem. (Well, except
> for the marvell,wakeup-pin silliness, which is somewhat orthogonal.) In
> fact, if we cared a little more about Wake-on-WiFi, we'd be trying to
> support the same (out-of-band WAKE#) with other WiFi drivers.

I'd definitely like to see this standardized. It would give us more
coverage, and prevent everyone from doing their own thing. But as you
mention, WoW doesn't seem to get much traction.

Thanks,

	M.
Marc Zyngier Feb. 27, 2019, 10:02 a.m. UTC | #8
+ Lorenzo

Hi Brian,

On 26/02/2019 23:28, Brian Norris wrote:
> + others
> 
> Hi Marc,
> 
> Thanks for the series. I have a few bits of history to add to this, and
> some comments.
> 
> On Sun, Feb 24, 2019 at 02:04:22PM +0000, Marc Zyngier wrote:
>> For quite some time, I wondered why the PCI mwifiex device built in my
>> Chromebook was unable to use the good old legacy interrupts. But as MSIs
>> were working fine, I never really bothered investigating. I finally had a
>> look, and the result isn't very pretty.
>>
>> On this machine (rk3399-based kevin), the wake-up interrupt is described as
>> such:
>>
>> &pci_rootport {
>> 	mvl_wifi: wifi@0,0 {
>> 		compatible = "pci1b4b,2b42";
>> 		reg = <0x83010000 0x0 0x00000000 0x0 0x00100000
>> 		       0x83010000 0x0 0x00100000 0x0 0x00100000>;
>> 		interrupt-parent = <&gpio0>;
>> 		interrupts = <8 IRQ_TYPE_LEVEL_LOW>;
>> 		pinctrl-names = "default";
>> 		pinctrl-0 = <&wlan_host_wake_l>;
>> 		wakeup-source;
>> 	};
>> };
>>
>> Note how the interrupt is part of the properties directly attached to the
>> PCI node. And yet, this interrupt has nothing to do with a PCI legacy
>> interrupt, as it is attached to the wake-up widget that bypasses the PCIe RC
>> altogether (Yay for the broken design!). This is in total violation of the
>> IEEE Std 1275-1994 spec[1], which clearly documents that such interrupt
>> specifiers describe the PCI device interrupts, and must obey the
>> INT-{A,B,C,D} mapping. Oops!
> 
> You're not the first person to notice this. All the motivations are not
> necessarily painted clearly in their cover letter, but here are some
> previous attempts at solving this problem:
> 
> [RFC PATCH v11 0/5] PCI: rockchip: Move PCIe WAKE# handling into pci core
> https://lkml.kernel.org/lkml/20171225114742.18920-1-jeffy.chen@rock-chips.com/
> http://lkml.kernel.org/lkml/20171226023646.17722-1-jeffy.chen@rock-chips.com/
> 
> As you can see by the 12th iteration, it wasn't left unsolved for lack
> of trying...

I wasn't aware of this. That's definitely a better approach than my
hack, and I would really like this to be revived.

> 
> Frankly, if a proper DT replacement to the admittedly bad binding isn't
> agreed upon quickly, I'd be very happy to just have WAKE# support
> removed from the DTS for now, and the existing mwifiex binding should
> just be removed. (Wake-on-WiFi was never properly vetted on these
> platforms anyway.) It mostly serves to just cause problems like you've
> noticed.

Agreed. If there is no actual use for this, and that we can build a case
for a better solution, let's remove the wakeup support from the Gru DT
(it is invalid anyway), and bring it back if and when we get the right
level of support.

[...]

> One problem Rockchip authors were also trying to resolve here is that
> PCIe WAKE# handling should not really be something the PCI device driver
> has to handle directly. Despite your complaints about not using in-band
> TLP wakeup, a separate WAKE# pin is in fact a documented part of the
> PCIe standard, and it so happens that the Rockchip RC does not support
> handling TLPs in S3, if you want to have decent power consumption. (Your
> "bad hardware" complaints could justifiably fall here, I suppose.)
> 
> Additionally, I've had pushback from PCI driver authors/maintainers on
> adding more special handling for this sort of interrupt property (not
> the binding specifically, but just the concept of handling WAKE# in the
> driver), as they claim this should be handled by the system firmware,
> when they set the appropriate wakeup flags, which filter down to
> __pci_enable_wake() -> platform_pci_set_wakeup(). That's how x86 systems
> do it (note: I know for a fact that many have a very similar
> architecture -- WAKE# is not routed to the RC, because, why does it need
> to? and they *don't* use TLP wakeup either -- they just hide it in
> firmware better), and it Just Works.

Even on an arm64 platform, there is no reason why a wakeup interrupt
couldn't be handled by FW rather than the OS. It just need to be wired
to the right spot (so that it generates a secure interrupt that can be
handled by FW).

> So, we basically concluded that we should standardize on a way to
> describe WAKE# interrupts such that PCI drivers don't have to deal with
> it at all, and the PCI core can do it for us. 12 revisions later
> and...we still never agreed on a good device tree binding for this.

Is the DT binding the only problem? Do we have an agreement for the core
code?

> IOW, maybe your wake-up sub-node is the best way to side-step the
> problems of conflicting with the OF PCI spec. But I'd still really like
> to avoid parsing it in mwifiex, if at all possible.

Honestly, my solution is just a terrible hack. I wasn't aware that this
was a more general problem, and I'd love it to be addressed in the core
PCI code.

> (We'd still be left with the marvell,wakeup-pin propery to parse
> specifically in mwifiex, which sadly has to exist because....well,
> Samsung decided to do chip-on-board, and then they failed to use the
> correct pin on Marvell's side when wiring up WAKE#. Sigh.)

Oh well...

Thanks,

	M.
Ard Biesheuvel Feb. 27, 2019, 10:16 a.m. UTC | #9
On Wed, 27 Feb 2019 at 11:02, Marc Zyngier <marc.zyngier@arm.com> wrote:
>
> + Lorenzo
>
> Hi Brian,
>
> On 26/02/2019 23:28, Brian Norris wrote:
> > + others
> >
> > Hi Marc,
> >
> > Thanks for the series. I have a few bits of history to add to this, and
> > some comments.
> >
> > On Sun, Feb 24, 2019 at 02:04:22PM +0000, Marc Zyngier wrote:
> >> For quite some time, I wondered why the PCI mwifiex device built in my
> >> Chromebook was unable to use the good old legacy interrupts. But as MSIs
> >> were working fine, I never really bothered investigating. I finally had a
> >> look, and the result isn't very pretty.
> >>
> >> On this machine (rk3399-based kevin), the wake-up interrupt is described as
> >> such:
> >>
> >> &pci_rootport {
> >>      mvl_wifi: wifi@0,0 {
> >>              compatible = "pci1b4b,2b42";
> >>              reg = <0x83010000 0x0 0x00000000 0x0 0x00100000
> >>                     0x83010000 0x0 0x00100000 0x0 0x00100000>;
> >>              interrupt-parent = <&gpio0>;
> >>              interrupts = <8 IRQ_TYPE_LEVEL_LOW>;
> >>              pinctrl-names = "default";
> >>              pinctrl-0 = <&wlan_host_wake_l>;
> >>              wakeup-source;
> >>      };
> >> };
> >>
> >> Note how the interrupt is part of the properties directly attached to the
> >> PCI node. And yet, this interrupt has nothing to do with a PCI legacy
> >> interrupt, as it is attached to the wake-up widget that bypasses the PCIe RC
> >> altogether (Yay for the broken design!). This is in total violation of the
> >> IEEE Std 1275-1994 spec[1], which clearly documents that such interrupt
> >> specifiers describe the PCI device interrupts, and must obey the
> >> INT-{A,B,C,D} mapping. Oops!
> >
> > You're not the first person to notice this. All the motivations are not
> > necessarily painted clearly in their cover letter, but here are some
> > previous attempts at solving this problem:
> >
> > [RFC PATCH v11 0/5] PCI: rockchip: Move PCIe WAKE# handling into pci core
> > https://lkml.kernel.org/lkml/20171225114742.18920-1-jeffy.chen@rock-chips.com/
> > http://lkml.kernel.org/lkml/20171226023646.17722-1-jeffy.chen@rock-chips.com/
> >
> > As you can see by the 12th iteration, it wasn't left unsolved for lack
> > of trying...
>
> I wasn't aware of this. That's definitely a better approach than my
> hack, and I would really like this to be revived.
>

I don't think this approach is entirely sound either.

From the side of the PCI device, WAKE# is just a GPIO line, and how it
is wired into the system is an entirely separate matter. So I don't
think it is justified to overload the notion of legacy interrupts with
some other pin that may behave in a way that is vaguely similar to how
a true wake-up capable interrupt works.

So I'd argue that we should add an optional 'wake-gpio' DT property
instead to the generic PCI device binding, and leave the interrupt
binding and discovery alone.
Brian Norris Feb. 27, 2019, 8:51 p.m. UTC | #10
Hi Marc,

On Wed, Feb 27, 2019 at 10:02:16AM +0000, Marc Zyngier wrote:
> On 26/02/2019 23:28, Brian Norris wrote:
> > On Sun, Feb 24, 2019 at 02:04:22PM +0000, Marc Zyngier wrote:
> >> Note how the interrupt is part of the properties directly attached to the
> >> PCI node. And yet, this interrupt has nothing to do with a PCI legacy
> >> interrupt, as it is attached to the wake-up widget that bypasses the PCIe RC
> >> altogether (Yay for the broken design!). This is in total violation of the
> >> IEEE Std 1275-1994 spec[1], which clearly documents that such interrupt
> >> specifiers describe the PCI device interrupts, and must obey the
> >> INT-{A,B,C,D} mapping. Oops!
> > 
> > You're not the first person to notice this. All the motivations are not
> > necessarily painted clearly in their cover letter, but here are some
> > previous attempts at solving this problem:
> > 
> > [RFC PATCH v11 0/5] PCI: rockchip: Move PCIe WAKE# handling into pci core
> > https://lkml.kernel.org/lkml/20171225114742.18920-1-jeffy.chen@rock-chips.com/
> > http://lkml.kernel.org/lkml/20171226023646.17722-1-jeffy.chen@rock-chips.com/
> > 
> > As you can see by the 12th iteration, it wasn't left unsolved for lack
> > of trying...
> 
> I wasn't aware of this. That's definitely a better approach than my
> hack, and I would really like this to be revived.

Well, in some respects it may be better (mostly, handling in the PCI
core rather than each driver). But I'm still unsure about the DT
binding.

And while perhaps I could find time to revive it, it's probably more
expedient to kill the bad binding first.

> > Frankly, if a proper DT replacement to the admittedly bad binding isn't
> > agreed upon quickly, I'd be very happy to just have WAKE# support
> > removed from the DTS for now, and the existing mwifiex binding should
> > just be removed. (Wake-on-WiFi was never properly vetted on these
> > platforms anyway.) It mostly serves to just cause problems like you've
> > noticed.
> 
> Agreed. If there is no actual use for this, and that we can build a case
> for a better solution, let's remove the wakeup support from the Gru DT
> (it is invalid anyway), and bring it back if and when we get the right
> level of support.

+1

Today, something simple like NL80211_WOWLAN_TRIG_DISCONNECT and
NL80211_WOWLAN_TRIG_NET_DETECT may work OK, but I'm not confident that
anything more complicated is really a compelling story today (well,
outside of Android, which has a massively more complicated--and not
upstream--setup for this stuff).

> [...]
> 
> > One problem Rockchip authors were also trying to resolve here is that
> > PCIe WAKE# handling should not really be something the PCI device driver
> > has to handle directly. Despite your complaints about not using in-band
> > TLP wakeup, a separate WAKE# pin is in fact a documented part of the
> > PCIe standard, and it so happens that the Rockchip RC does not support
> > handling TLPs in S3, if you want to have decent power consumption. (Your
> > "bad hardware" complaints could justifiably fall here, I suppose.)
> > 
> > Additionally, I've had pushback from PCI driver authors/maintainers on
> > adding more special handling for this sort of interrupt property (not
> > the binding specifically, but just the concept of handling WAKE# in the
> > driver), as they claim this should be handled by the system firmware,
> > when they set the appropriate wakeup flags, which filter down to
> > __pci_enable_wake() -> platform_pci_set_wakeup(). That's how x86 systems
> > do it (note: I know for a fact that many have a very similar
> > architecture -- WAKE# is not routed to the RC, because, why does it need
> > to? and they *don't* use TLP wakeup either -- they just hide it in
> > firmware better), and it Just Works.
> 
> Even on an arm64 platform, there is no reason why a wakeup interrupt
> couldn't be handled by FW rather than the OS. It just need to be wired
> to the right spot (so that it generates a secure interrupt that can be
> handled by FW).

True...but then you also need a configuration (enable/disable) API for
it too. I don't think we have such a per-device API? So it would be a
pretty similar problem to solve anyway.

> > So, we basically concluded that we should standardize on a way to
> > describe WAKE# interrupts such that PCI drivers don't have to deal with
> > it at all, and the PCI core can do it for us. 12 revisions later
> > and...we still never agreed on a good device tree binding for this.
> 
> Is the DT binding the only problem? Do we have an agreement for the core
> code?

I'll have to re-read the old threads. I don't really remember where we
got bogged down... I think one outstanding question was whether WAKE#
should be associated with the port vs. the device. That might have been
might fault for confusing that one though.

> > IOW, maybe your wake-up sub-node is the best way to side-step the
> > problems of conflicting with the OF PCI spec. But I'd still really like
> > to avoid parsing it in mwifiex, if at all possible.
> 
> Honestly, my solution is just a terrible hack. I wasn't aware that this
> was a more general problem, and I'd love it to be addressed in the core
> PCI code.

Ack, so we agree.

Thanks,
Brian
Brian Norris Feb. 27, 2019, 8:57 p.m. UTC | #11
Hi Ard,

On Wed, Feb 27, 2019 at 11:16:12AM +0100, Ard Biesheuvel wrote:
> On Wed, 27 Feb 2019 at 11:02, Marc Zyngier <marc.zyngier@arm.com> wrote:
> > On 26/02/2019 23:28, Brian Norris wrote:
> > > You're not the first person to notice this. All the motivations are not
> > > necessarily painted clearly in their cover letter, but here are some
> > > previous attempts at solving this problem:
> > >
> > > [RFC PATCH v11 0/5] PCI: rockchip: Move PCIe WAKE# handling into pci core
> > > https://lkml.kernel.org/lkml/20171225114742.18920-1-jeffy.chen@rock-chips.com/
> > > http://lkml.kernel.org/lkml/20171226023646.17722-1-jeffy.chen@rock-chips.com/
> > >
> > > As you can see by the 12th iteration, it wasn't left unsolved for lack
> > > of trying...
> >
> > I wasn't aware of this. That's definitely a better approach than my
> > hack, and I would really like this to be revived.
> >
> 
> I don't think this approach is entirely sound either.

(I'm sure there may be problems with the above series. We probably
should give it another shot though someday, as I think it's closer to
the mark.)

> From the side of the PCI device, WAKE# is just a GPIO line, and how it
> is wired into the system is an entirely separate matter. So I don't
> think it is justified to overload the notion of legacy interrupts with
> some other pin that may behave in a way that is vaguely similar to how
> a true wake-up capable interrupt works.

I think you've conflated INTx with WAKE# just a bit (and to be fair,
that's exactly what the bad binding we're trying to replace did,
accidentally). We're not trying to claim this WAKE# signal replaces the
typical PCI interrupts, but it *is* an interrupt in some sense --
"depending on your definition of interrupt", per our IRC conversation ;)

> So I'd argue that we should add an optional 'wake-gpio' DT property
> instead to the generic PCI device binding, and leave the interrupt
> binding and discovery alone.

So I think Mark Rutland already shot that one down; it's conceptually an
interrupt from the device's perspective. We just need to figure out a
good way of representing it that doesn't stomp on the existing INTx
definitions.

Brian
Rafael J. Wysocki Feb. 27, 2019, 11:03 p.m. UTC | #12
On Wed, Feb 27, 2019 at 9:58 PM Brian Norris <briannorris@chromium.org> wrote:
>
> Hi Ard,
>
> On Wed, Feb 27, 2019 at 11:16:12AM +0100, Ard Biesheuvel wrote:
> > On Wed, 27 Feb 2019 at 11:02, Marc Zyngier <marc.zyngier@arm.com> wrote:
> > > On 26/02/2019 23:28, Brian Norris wrote:
> > > > You're not the first person to notice this. All the motivations are not
> > > > necessarily painted clearly in their cover letter, but here are some
> > > > previous attempts at solving this problem:
> > > >
> > > > [RFC PATCH v11 0/5] PCI: rockchip: Move PCIe WAKE# handling into pci core
> > > > https://lkml.kernel.org/lkml/20171225114742.18920-1-jeffy.chen@rock-chips.com/
> > > > http://lkml.kernel.org/lkml/20171226023646.17722-1-jeffy.chen@rock-chips.com/
> > > >
> > > > As you can see by the 12th iteration, it wasn't left unsolved for lack
> > > > of trying...
> > >
> > > I wasn't aware of this. That's definitely a better approach than my
> > > hack, and I would really like this to be revived.
> > >
> >
> > I don't think this approach is entirely sound either.
>
> (I'm sure there may be problems with the above series. We probably
> should give it another shot though someday, as I think it's closer to
> the mark.)
>
> > From the side of the PCI device, WAKE# is just a GPIO line, and how it
> > is wired into the system is an entirely separate matter. So I don't
> > think it is justified to overload the notion of legacy interrupts with
> > some other pin that may behave in a way that is vaguely similar to how
> > a true wake-up capable interrupt works.
>
> I think you've conflated INTx with WAKE# just a bit (and to be fair,
> that's exactly what the bad binding we're trying to replace did,
> accidentally). We're not trying to claim this WAKE# signal replaces the
> typical PCI interrupts, but it *is* an interrupt in some sense --
> "depending on your definition of interrupt", per our IRC conversation ;)
>
> > So I'd argue that we should add an optional 'wake-gpio' DT property
> > instead to the generic PCI device binding, and leave the interrupt
> > binding and discovery alone.
>
> So I think Mark Rutland already shot that one down; it's conceptually an
> interrupt from the device's perspective.

Which device are you talking about?  The one that signals wakeup?  If
so, then I beg to differ.

On ACPI platforms WAKE# is represented as an ACPI GPE that is signaled
through SCI and handled at a different level (on HW-reduced ACPI it
actually can be a GPIO interrupt, but it still is handled with the
help of AML).  The driver of the device signaling wakeup need not even
be aware that WAKE# has been asserted.

> We just need to figure out a good way of representing it that doesn't stomp on the existing INTx
> definitions.

WAKE# is a signal that is converted into an interrupt, but that
interrupt may arrive at some place your driver has nothing to do with.
It generally doesn't make sense to represent it as an interrupt for
the target device.
Brian Norris Feb. 28, 2019, 2:29 a.m. UTC | #13
Hi Rafael,

On Wed, Feb 27, 2019 at 3:04 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Wed, Feb 27, 2019 at 9:58 PM Brian Norris <briannorris@chromium.org> wrote:
> > On Wed, Feb 27, 2019 at 11:16:12AM +0100, Ard Biesheuvel wrote:
> > > So I'd argue that we should add an optional 'wake-gpio' DT property
> > > instead to the generic PCI device binding, and leave the interrupt
> > > binding and discovery alone.
> >
> > So I think Mark Rutland already shot that one down; it's conceptually an
> > interrupt from the device's perspective.

Perhaps I shouldn't speak for Mark, but I am basically quoting him off IRC.

> Which device are you talking about?  The one that signals wakeup?  If
> so, then I beg to differ.

Yes, the endpoint device.

> On ACPI platforms WAKE# is represented as an ACPI GPE that is signaled
> through SCI and handled at a different level (on HW-reduced ACPI it
> actually can be a GPIO interrupt, but it still is handled with the
> help of AML).  The driver of the device signaling wakeup need not even
> be aware that WAKE# has been asserted.

Frankly, ACPI is not relevant to how we represent WAKE# in DT, IMO.
Also, we're talking about the *device*, not the driver. When talking
about Device Tree, that distinction is relevant.

So while the driver need not be aware (and I agree! it only needs to
care about enabling/disabling wake), *something* should be aware, and
the signal that "something" should be receiving is simply "did WAKE
happen"? That sounds basically like the device is signalling an
interrupt to me.

Maybe this goes back to some confusion we had elsewhere: what is the
meaning of "interrupt" in device tree?

> > We just need to figure out a good way of representing it that doesn't stomp on the existing INTx
> > definitions.
>
> WAKE# is a signal that is converted into an interrupt, but that
> interrupt may arrive at some place your driver has nothing to do with.

I could agree with that, perhaps. But that's also what Device Tree is
all about, really. We describe the relation between devices. So some
other <foo> handles events that are triggered by <bar>, so we use a
phandle to relate <bar> to <foo>.

> It generally doesn't make sense to represent it as an interrupt for
> the target device.

What would you suggest then? I'm not clearly understanding how you
think we should (a) describe (in DT) and (b) implement this WAKE#
handling.

Brian
Rafael J. Wysocki Feb. 28, 2019, 11:03 a.m. UTC | #14
On Thu, Feb 28, 2019 at 3:29 AM Brian Norris <briannorris@chromium.org> wrote:
>
> Hi Rafael,
>
> On Wed, Feb 27, 2019 at 3:04 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > On Wed, Feb 27, 2019 at 9:58 PM Brian Norris <briannorris@chromium.org> wrote:
> > > On Wed, Feb 27, 2019 at 11:16:12AM +0100, Ard Biesheuvel wrote:
> > > > So I'd argue that we should add an optional 'wake-gpio' DT property
> > > > instead to the generic PCI device binding, and leave the interrupt
> > > > binding and discovery alone.
> > >
> > > So I think Mark Rutland already shot that one down; it's conceptually an
> > > interrupt from the device's perspective.
>
> Perhaps I shouldn't speak for Mark, but I am basically quoting him off IRC.
>
> > Which device are you talking about?  The one that signals wakeup?  If
> > so, then I beg to differ.
>
> Yes, the endpoint device.
>
> > On ACPI platforms WAKE# is represented as an ACPI GPE that is signaled
> > through SCI and handled at a different level (on HW-reduced ACPI it
> > actually can be a GPIO interrupt, but it still is handled with the
> > help of AML).  The driver of the device signaling wakeup need not even
> > be aware that WAKE# has been asserted.
>
> Frankly, ACPI is not relevant to how we represent WAKE# in DT, IMO.

I mentioned ACPI as an example of how WAKE# can be handled.

> Also, we're talking about the *device*, not the driver. When talking
> about Device Tree, that distinction is relevant.

I'm not sure what you mean, really.

I guess we are talking about information in DT and some of it is
consumed by the driver while some of it is consumed by some other
pieces of code.

> So while the driver need not be aware (and I agree! it only needs to
> care about enabling/disabling wake),

Not even that.  Actually, PME enable is handled by the PCI bus type code.

> *something* should be aware,

Right.

> and the signal that "something" should be receiving is simply "did WAKE
> happen"?

Right.  But it may not be clear which device signaled it, because
WAKE# may be shared in general.

> That sounds basically like the device is signalling an interrupt to me.

Well, consider the native PME here.  In there, the device sends an
in-band message over the PCIe hierarchy to a root port and the
interrupt is signaled from there.  There is a PME driver in the kernel
that requests that IRQ and handles it (for all ports that can trigger
it), but for each port it binds to a separate device (or "service" if
you will) that takes care of all PME messages coming from all devices
under that port.

At an abstract level, WAKE# is expected to be similar AFAICS except
that there is a physical signal going from the device(s) (in low-power
states) that have detected external activity (wakeup) to an entity
that can trigger an interrupt.

The original idea in the PCI PM spec regarding WAKE# seems to be to
wire up WAKE# from all devices on the bus to one input that will cause
an interrupt to trigger when one of them drives WAKE# and whatever
handled that interrupt was expected to walk the bus, find the
device(s) with PME Status set and put it (or them) into D0 (clearing
PME Status in the process).  That still is a valid case to consider
IMO.

However, designers of some boards decided to provide dedicated
per-device interrupts for WAKE# and you may regard them as "wakeup
IRQs" except that the handling of each of them is the same as for the
"global WAKE#" above: if the PME Status of the device is set, put it
into D0 etc.  That part belongs to the PCI bus type layer IMO.

Of course, because nothing is easy and simple, there is a third case
in which one WAKE# line (and interrupt) can be shared by a subset of
devices on the bus and there are multiple (say two or three) subsets.

> Maybe this goes back to some confusion we had elsewhere: what is the
> meaning of "interrupt" in device tree?

Maybe.

> > > We just need to figure out a good way of representing it that doesn't stomp on the existing INTx
> > > definitions.
> >
> > WAKE# is a signal that is converted into an interrupt, but that
> > interrupt may arrive at some place your driver has nothing to do with.
>
> I could agree with that, perhaps. But that's also what Device Tree is
> all about, really. We describe the relation between devices. So some
> other <foo> handles events that are triggered by <bar>, so we use a
> phandle to relate <bar> to <foo>.

So you have a PCI endpoint on the one hand and the "wakeup serivce"
device on the other.

> > It generally doesn't make sense to represent it as an interrupt for
> > the target device.
>
> What would you suggest then? I'm not clearly understanding how you
> think we should (a) describe (in DT) and (b) implement this WAKE#
> handling.

I would introduce a "wakeup service" concept (along the lines of the
native PME) and make it request all interrupts associated with WAKE#.

In the case when the WAKE# interrupts are dedicated per-device, it
would be good to avoid walking the bus and use some
"wakeup-IRQ-to-device" mapping if available.

But as I said above, IMO the *handling* of all WAKE# interrupts
belongs to the PCI bus type as it is the same for all PCI devices.
Kalle Valo March 8, 2019, 8:26 a.m. UTC | #15
Marc Zyngier <marc.zyngier@arm.com> writes:

> For quite some time, I wondered why the PCI mwifiex device built in my
> Chromebook was unable to use the good old legacy interrupts. But as MSIs
> were working fine, I never really bothered investigating. I finally had a
> look, and the result isn't very pretty.
>
> On this machine (rk3399-based kevin), the wake-up interrupt is described as
> such:
>
> &pci_rootport {
> 	mvl_wifi: wifi@0,0 {
> 		compatible = "pci1b4b,2b42";
> 		reg = <0x83010000 0x0 0x00000000 0x0 0x00100000
> 		       0x83010000 0x0 0x00100000 0x0 0x00100000>;
> 		interrupt-parent = <&gpio0>;
> 		interrupts = <8 IRQ_TYPE_LEVEL_LOW>;
> 		pinctrl-names = "default";
> 		pinctrl-0 = <&wlan_host_wake_l>;
> 		wakeup-source;
> 	};
> };
>
> Note how the interrupt is part of the properties directly attached to the
> PCI node. And yet, this interrupt has nothing to do with a PCI legacy
> interrupt, as it is attached to the wake-up widget that bypasses the PCIe RC
> altogether (Yay for the broken design!). This is in total violation of the
> IEEE Std 1275-1994 spec[1], which clearly documents that such interrupt
> specifiers describe the PCI device interrupts, and must obey the
> INT-{A,B,C,D} mapping. Oops!
>
> The net effect of the above is that Linux tries to do something vaguely
> sensible, and uses the same interrupt for both the wake-up widget and the
> PCI device. This doesn't work for two reasons: (1) the wake-up widget grabs
> the interrupt in exclusive mode, and (2) the PCI interrupt is still routed
> to the RC, leading to a screaming interrupt. This simply cannot work.
>
> To sort out this mess, we need to lift the confusion between the two
> interrupts. This is done by extending the DT binding to allow the wake-up
> interrupt to be described in a 'wake-up' subnode, sidestepping the issue
> completely. On my Chromebook, it now looks like this:
>
> &pci_rootport {
> 	mvl_wifi: wifi@0,0 {
> 		compatible = "pci1b4b,2b42";
> 		reg = <0x83010000 0x0 0x00000000 0x0 0x00100000
> 		       0x83010000 0x0 0x00100000 0x0 0x00100000>;
> 		pinctrl-names = "default";
> 		pinctrl-0 = <&wlan_host_wake_l>;
> 		wake-up {
> 			interrupt-parent = <&gpio0>;
> 			interrupts = <8 IRQ_TYPE_LEVEL_LOW>;
> 			wakeup-source;
> 		};
> 	};
> };
>
> The driver is then updated to look for this subnode first, and fallback to
> the original, broken behaviour (spitting out a warning in the offending
> configuration).
>
> For good measure, there are two additional patches:
>
> - The wake-up interrupt requesting is horribly racy, and could lead to
>   unpredictable behaviours. Let's fix that properly.
>
> - A final patch implementing the above transformation for the whole
>   RK3399-based Chromebook range, which all use the same broken
>   configuration.
>
> With all that, I finally have PCI legacy interrupts working with the mwifiex
> driver on my Chromebook.
>
> [1] http://www.devicetree.org/open-firmware/bindings/pci/pci2_1.pdf
>
> Marc Zyngier (4):
>   dt-bindings/marvell-8xxx: Allow wake-up interrupt to be placed in a
>     separate node
>   mwifiex: Fetch wake-up interrupt from 'wake-up' subnode when it exists
>   mwifiex: Flag wake-up interrupt as IRQ_NOAUTOEN rather than disabling
>     it too late
>   arm64: dts: rockchip: gru: Move wifi wake-up interrupt into its own
>     subnode
>
>  .../bindings/net/wireless/marvell-8xxx.txt    | 23 ++++++++++++++++++-
>  .../dts/rockchip/rk3399-gru-chromebook.dtsi   |  8 ++++---
>  drivers/net/wireless/marvell/mwifiex/main.c   | 13 +++++++++--
>  3 files changed, 38 insertions(+), 6 deletions(-)

I didn't read the discussion in detail, but I understanding is that I
should drop this series and wait for a new version. Please correct me if
I misunderstood.
Marc Zyngier March 8, 2019, 9:02 a.m. UTC | #16
On 08/03/2019 08:26, Kalle Valo wrote:
> Marc Zyngier <marc.zyngier@arm.com> writes:
> 
>> For quite some time, I wondered why the PCI mwifiex device built in my
>> Chromebook was unable to use the good old legacy interrupts. But as MSIs
>> were working fine, I never really bothered investigating. I finally had a
>> look, and the result isn't very pretty.
>>
>> On this machine (rk3399-based kevin), the wake-up interrupt is described as
>> such:
>>
>> &pci_rootport {
>> 	mvl_wifi: wifi@0,0 {
>> 		compatible = "pci1b4b,2b42";
>> 		reg = <0x83010000 0x0 0x00000000 0x0 0x00100000
>> 		       0x83010000 0x0 0x00100000 0x0 0x00100000>;
>> 		interrupt-parent = <&gpio0>;
>> 		interrupts = <8 IRQ_TYPE_LEVEL_LOW>;
>> 		pinctrl-names = "default";
>> 		pinctrl-0 = <&wlan_host_wake_l>;
>> 		wakeup-source;
>> 	};
>> };
>>
>> Note how the interrupt is part of the properties directly attached to the
>> PCI node. And yet, this interrupt has nothing to do with a PCI legacy
>> interrupt, as it is attached to the wake-up widget that bypasses the PCIe RC
>> altogether (Yay for the broken design!). This is in total violation of the
>> IEEE Std 1275-1994 spec[1], which clearly documents that such interrupt
>> specifiers describe the PCI device interrupts, and must obey the
>> INT-{A,B,C,D} mapping. Oops!
>>
>> The net effect of the above is that Linux tries to do something vaguely
>> sensible, and uses the same interrupt for both the wake-up widget and the
>> PCI device. This doesn't work for two reasons: (1) the wake-up widget grabs
>> the interrupt in exclusive mode, and (2) the PCI interrupt is still routed
>> to the RC, leading to a screaming interrupt. This simply cannot work.
>>
>> To sort out this mess, we need to lift the confusion between the two
>> interrupts. This is done by extending the DT binding to allow the wake-up
>> interrupt to be described in a 'wake-up' subnode, sidestepping the issue
>> completely. On my Chromebook, it now looks like this:
>>
>> &pci_rootport {
>> 	mvl_wifi: wifi@0,0 {
>> 		compatible = "pci1b4b,2b42";
>> 		reg = <0x83010000 0x0 0x00000000 0x0 0x00100000
>> 		       0x83010000 0x0 0x00100000 0x0 0x00100000>;
>> 		pinctrl-names = "default";
>> 		pinctrl-0 = <&wlan_host_wake_l>;
>> 		wake-up {
>> 			interrupt-parent = <&gpio0>;
>> 			interrupts = <8 IRQ_TYPE_LEVEL_LOW>;
>> 			wakeup-source;
>> 		};
>> 	};
>> };
>>
>> The driver is then updated to look for this subnode first, and fallback to
>> the original, broken behaviour (spitting out a warning in the offending
>> configuration).
>>
>> For good measure, there are two additional patches:
>>
>> - The wake-up interrupt requesting is horribly racy, and could lead to
>>   unpredictable behaviours. Let's fix that properly.
>>
>> - A final patch implementing the above transformation for the whole
>>   RK3399-based Chromebook range, which all use the same broken
>>   configuration.
>>
>> With all that, I finally have PCI legacy interrupts working with the mwifiex
>> driver on my Chromebook.
>>
>> [1] http://www.devicetree.org/open-firmware/bindings/pci/pci2_1.pdf
>>
>> Marc Zyngier (4):
>>   dt-bindings/marvell-8xxx: Allow wake-up interrupt to be placed in a
>>     separate node
>>   mwifiex: Fetch wake-up interrupt from 'wake-up' subnode when it exists
>>   mwifiex: Flag wake-up interrupt as IRQ_NOAUTOEN rather than disabling
>>     it too late
>>   arm64: dts: rockchip: gru: Move wifi wake-up interrupt into its own
>>     subnode
>>
>>  .../bindings/net/wireless/marvell-8xxx.txt    | 23 ++++++++++++++++++-
>>  .../dts/rockchip/rk3399-gru-chromebook.dtsi   |  8 ++++---
>>  drivers/net/wireless/marvell/mwifiex/main.c   | 13 +++++++++--
>>  3 files changed, 38 insertions(+), 6 deletions(-)
> 
> I didn't read the discussion in detail, but I understanding is that I
> should drop this series and wait for a new version. Please correct me if
> I misunderstood.

I indeed plan to resend the series with a slightly different approach,
removing support for the wake-up interrupt on mwifiex PCI devices
altogether.

Note that patch #3 is a fairly independent fix, which could be applied
right now.

Thanks,

	M.
Kalle Valo March 8, 2019, 9:36 a.m. UTC | #17
Marc Zyngier <marc.zyngier@arm.com> writes:

> On 08/03/2019 08:26, Kalle Valo wrote:
>> Marc Zyngier <marc.zyngier@arm.com> writes:
>> 
>>>   dt-bindings/marvell-8xxx: Allow wake-up interrupt to be placed in a
>>>     separate node
>>>   mwifiex: Fetch wake-up interrupt from 'wake-up' subnode when it exists
>>>   mwifiex: Flag wake-up interrupt as IRQ_NOAUTOEN rather than disabling
>>>     it too late
>>>   arm64: dts: rockchip: gru: Move wifi wake-up interrupt into its own
>>>     subnode
>>>
>>>  .../bindings/net/wireless/marvell-8xxx.txt    | 23 ++++++++++++++++++-
>>>  .../dts/rockchip/rk3399-gru-chromebook.dtsi   |  8 ++++---
>>>  drivers/net/wireless/marvell/mwifiex/main.c   | 13 +++++++++--
>>>  3 files changed, 38 insertions(+), 6 deletions(-)
>> 
>> I didn't read the discussion in detail, but I understanding is that I
>> should drop this series and wait for a new version. Please correct me if
>> I misunderstood.
>
> I indeed plan to resend the series with a slightly different approach,
> removing support for the wake-up interrupt on mwifiex PCI devices
> altogether.
>
> Note that patch #3 is a fairly independent fix, which could be applied
> right now.

Ok, I'll queue that for 5.2 then:

https://patchwork.kernel.org/patch/10827971/