[RFC,v2,0/2] fw_devlink overlay fix

Message ID 20240409053704.428336-1-saravanak@google.com

Message

Saravana Kannan April 9, 2024, 5:37 a.m. UTC
Don't bother reviewing this patch. It needs to be tested and possibly
refactored first.

Geert and Herve,

This patch series should hopefully fix both of your use cases
[1][2][3]. Can you please check to make sure the device links created
to/from the overlay devices are to/from the right ones?

I've only compile tested it. If I made some obvious mistake, feel free
to fix it and give it a shot.

Cc: Rob Herring <robh@kernel.org>

[1] - https://lore.kernel.org/lkml/CAMuHMdXEnSD4rRJ-o90x4OprUacN_rJgyo8x6=9F9rZ+-KzjOg@mail.gmail.com/
[2] - https://lore.kernel.org/all/20240221095137.616d2aaa@bootlin.com/
[3] - https://lore.kernel.org/lkml/20240312151835.29ef62a0@bootlin.com/

Thanks,
Saravana


Saravana Kannan (2):
  Revert "treewide: Fix probing of devices in DT overlays"
  of: dynamic: Fix probing of overlay devices

 drivers/base/core.c       | 23 +++++++++++++++++++++++
 drivers/bus/imx-weim.c    |  6 ------
 drivers/i2c/i2c-core-of.c |  5 -----
 drivers/of/dynamic.c      |  2 +-
 drivers/of/platform.c     |  5 -----
 drivers/spi/spi.c         |  5 -----
 include/linux/fwnode.h    |  1 +
 7 files changed, 25 insertions(+), 22 deletions(-)

Comments

Herve Codina April 9, 2024, 1:02 p.m. UTC | #1
Hi Saravana,

+CC Luca and Thomas

On Mon,  8 Apr 2024 22:37:01 -0700
Saravana Kannan <saravanak@google.com> wrote:

> Don't bother reviewing this patch. It needs to be tested and possibly
> refactored first.
> 
> Geert and Herve,
> 
> This patch series should hopefully fix both of your use cases
> [1][2][3]. Can you please check to make sure the device links created
> to/from the overlay devices are to/from the right ones?
> 
> I've only compile tested it. If I made some obvious mistake, feel free
> to fix it and give it a shot.
> 
> Cc: Rob Herring <robh@kernel.org>
> 
> [1] - https://lore.kernel.org/lkml/CAMuHMdXEnSD4rRJ-o90x4OprUacN_rJgyo8x6=9F9rZ+-KzjOg@mail.gmail.com/
> [2] - https://lore.kernel.org/all/20240221095137.616d2aaa@bootlin.com/
> [3] - https://lore.kernel.org/lkml/20240312151835.29ef62a0@bootlin.com/
> 

I tested your patches.

Concerning my use cases, they fix the issue described in
  https://lore.kernel.org/all/20240221095137.616d2aaa@bootlin.com/

But not the one described in
  https://lore.kernel.org/lkml/20240312151835.29ef62a0@bootlin.com/
A link is still present between the i2c@600 and the PCI device
instead of between the i2c@600 and the pci-ep-bus.

Adding the patch clearing the FWNODE_FLAG_NOT_DEVICE in device_add() available
at [1] on top of your patches fixes the link issue.
With this additional patch applied, the link is present between the i2c@600
and the pci-ep-bus.

[1] https://lore.kernel.org/lkml/20240220111044.133776-2-herve.codina@bootlin.com/

Best regards,
Hervé
Geert Uytterhoeven April 9, 2024, 3:10 p.m. UTC | #2
Hi Saravana,

On Tue, Apr 9, 2024 at 7:37 AM Saravana Kannan <saravanak@google.com> wrote:
> Don't bother reviewing this patch. It needs to be tested and possibly
> refactored first.
>
> Geert and Herve,
>
> This patch series should hopefully fix both of your use cases
> [1][2][3]. Can you please check to make sure the device links created
> to/from the overlay devices are to/from the right ones?

Thanks for your series!

After applying the first patch (the revert), the issue reported in
[1] is back, as expected.
After applying both patches, applying[A]/unapplying[B]/reapplying[C]
overlay [4] works as without this series, so
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>

Note that the state of /sys/class/devlink/ after [C] is still not the
same as after [A], as reported before in [5]:
  - platform:e6060000.pinctrl--platform:keys link is not recreated in [B],
  - nothing changes in /sys/class/devlink in [C].
But that issue is not introduced in this series.

> [1] - https://lore.kernel.org/lkml/CAMuHMdXEnSD4rRJ-o90x4OprUacN_rJgyo8x6=9F9rZ+-KzjOg@mail.gmail.com/

[4] "arm64: dts: renesas: ebisu: cn41: Add overlay for MSIOF0 and 25LC040"
    https://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-drivers.git/commit/?h=topic/renesas-overlays&id=222a4936b0d3dabd43bdffb3a578423bff97b02d
[5] https://lore.kernel.org/lkml/CAMuHMdXNoYH8PJE1xb4PK-vzjXtOzrxNJoZhsHT-H4Ucm=7_ig@mail.gmail.com/

Gr{oetje,eeting}s,

                        Geert
Saravana Kannan April 10, 2024, 12:41 a.m. UTC | #3
On Tue, Apr 9, 2024 at 8:10 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> Hi Saravana,
>
> On Tue, Apr 9, 2024 at 7:37 AM Saravana Kannan <saravanak@google.com> wrote:
> > Don't bother reviewing this patch. It needs to be tested and possibly
> > refactored first.
> >
> > Geert and Herve,
> >
> > This patch series should hopefully fix both of your use cases
> > [1][2][3]. Can you please check to make sure the device links created
> > to/from the overlay devices are to/from the right ones?
>
> Thanks for your series!
>
> After applying the first patch (the revert), the issue reported in
> [1] is back, as expected.
> After applying both patches, applying[A]/unapplying[B]/reapplying[C]
> overlay [4] works as without this series, so
> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
>
> Note that the state of /sys/class/devlink/ after [C] is still not the
> same as after [A], as reported before in [5]:
>   - platform:e6060000.pinctrl--platform:keys link is not recreated in [B],
>   - nothing changes in /sys/class/devlink in [C].
> But that issue is not introduced in this series.

Thanks for the testing and additional info! Looks like I'll need to
make more changes to accommodate more cases. I'll send out v3 once I
figure it out, but it should continue working for you.

-Saravana

>
> > [1] - https://lore.kernel.org/lkml/CAMuHMdXEnSD4rRJ-o90x4OprUacN_rJgyo8x6=9F9rZ+-KzjOg@mail.gmail.com/
>
> [4] "arm64: dts: renesas: ebisu: cn41: Add overlay for MSIOF0 and 25LC040"
>     https://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-drivers.git/commit/?h=topic/renesas-overlays&id=222a4936b0d3dabd43bdffb3a578423bff97b02d
> [5] https://lore.kernel.org/lkml/CAMuHMdXNoYH8PJE1xb4PK-vzjXtOzrxNJoZhsHT-H4Ucm=7_ig@mail.gmail.com/
>
> Gr{oetje,eeting}s,
>
>                         Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds
Saravana Kannan April 10, 2024, 1:06 a.m. UTC | #4
On Tue, Apr 9, 2024 at 6:02 AM Herve Codina <herve.codina@bootlin.com> wrote:
>
> Hi Saravana,
>
> +CC Luca and Thomas
>
> On Mon,  8 Apr 2024 22:37:01 -0700
> Saravana Kannan <saravanak@google.com> wrote:
>
> > Don't bother reviewing this patch. It needs to be tested and possibly
> > refactored first.
> >
> > Geert and Herve,
> >
> > This patch series should hopefully fix both of your use cases
> > [1][2][3]. Can you please check to make sure the device links created
> > to/from the overlay devices are to/from the right ones?
> >
> > I've only compile tested it. If I made some obvious mistake, feel free
> > to fix it and give it a shot.
> >
> > Cc: Rob Herring <robh@kernel.org>
> >
> > [1] - https://lore.kernel.org/lkml/CAMuHMdXEnSD4rRJ-o90x4OprUacN_rJgyo8x6=9F9rZ+-KzjOg@mail.gmail.com/
> > [2] - https://lore.kernel.org/all/20240221095137.616d2aaa@bootlin.com/
> > [3] - https://lore.kernel.org/lkml/20240312151835.29ef62a0@bootlin.com/
> >
>
> I tested your patches.
>
> Concerning my use cases, they fix the issue described in
>   https://lore.kernel.org/all/20240221095137.616d2aaa@bootlin.com/

I went back and looked at the example, and I'm not sure I understand
it. In the example at the link above, are you saying that, without any
changes to upstream, reg_dock_sys_3v3 was listing its supplier as i2c5
instead of tca6424_dock_1? Why weren't Geert's existing changes in
of_i2c_notify() sufficient? Looking at it, it does:
rd->dn->fwnode.flags &= ~FWNODE_FLAG_NOT_DEVICE;

Which should clear the flag for tca6424_dock_1. Can you help me
understand why it's not getting cleared?
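For readers less familiar with the idiom, the statement quoted above clears a single bit and leaves every other flag alone. A minimal userspace sketch, assuming made-up bit positions and a hypothetical helper name (neither is taken from include/linux/fwnode.h):

```c
/* Made-up bit positions, for illustration only. */
#define TOY_FLAG_OTHER      (1u << 0)
#define TOY_FLAG_NOT_DEVICE (1u << 1)

/* Clear only the NOT_DEVICE bit; every other bit is preserved. */
static unsigned int toy_clear_not_device(unsigned int flags)
{
	return flags & ~TOY_FLAG_NOT_DEVICE;
}
```

Calling the helper on a value that does not have the bit set is a no-op, so the ordering question in this thread is about when the clear happens, not about whether it is safe to repeat.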

> But not the one described in
>   https://lore.kernel.org/lkml/20240312151835.29ef62a0@bootlin.com/
> A link is still present between the i2c@600 and the PCI device
> instead of between the i2c@600 and the pci-ep-bus.

What do you mean by PCI device here? You say the same thing in the
link, but I don't understand what you mean. Can you clarify what
exactly gets added by the overlay? Please use the fwnode name in all
the descriptions, even when talking about device links. That should
help avoid the confusion.

Also, if you can show what the target node of the overlay looks like,
that'd help too.

> Adding the patch clearing the FWNODE_FLAG_NOT_DEVICE in device_add() available
> at [1] on top of your patches fixes the link issue.
> With this additional patch applied, the link is present between the i2c@600
> and the pci-ep-bus.

I know the problem with this patch series. But to fix it properly, I
need to understand the root of the overlay node in your examples and
the target it's applied to.

-Saravana

>
> [1] https://lore.kernel.org/lkml/20240220111044.133776-2-herve.codina@bootlin.com/
>
> Best regards,
> Hervé
Herve Codina April 10, 2024, 12:35 p.m. UTC | #5
Hi Saravana,

On Tue, 9 Apr 2024 18:06:33 -0700
Saravana Kannan <saravanak@google.com> wrote:

> On Tue, Apr 9, 2024 at 6:02 AM Herve Codina <herve.codina@bootlin.com> wrote:
> >
> > Hi Saravana,
> >
> > +CC Luca and Thomas
> >
> > On Mon,  8 Apr 2024 22:37:01 -0700
> > Saravana Kannan <saravanak@google.com> wrote:
> >  
> > > Don't bother reviewing this patch. It needs to be tested and possibly
> > > refactored first.
> > >
> > > Geert and Herve,
> > >
> > > This patch series should hopefully fix both of your use cases
> > > [1][2][3]. Can you please check to make sure the device links created
> > > to/from the overlay devices are to/from the right ones?
> > >
> > > I've only compile tested it. If I made some obvious mistake, feel free
> > > to fix it and give it a shot.
> > >
> > > Cc: Rob Herring <robh@kernel.org>
> > >
> > > [1] - https://lore.kernel.org/lkml/CAMuHMdXEnSD4rRJ-o90x4OprUacN_rJgyo8x6=9F9rZ+-KzjOg@mail.gmail.com/
> > > [2] - https://lore.kernel.org/all/20240221095137.616d2aaa@bootlin.com/
> > > [3] - https://lore.kernel.org/lkml/20240312151835.29ef62a0@bootlin.com/
> > >  
> >
> > I tested your patches.
> >
> > Concerning my use cases, they fix the issue described in
> >   https://lore.kernel.org/all/20240221095137.616d2aaa@bootlin.com/  
> 
> I went back and looked at the example, and I'm not sure I understand
> it. In the example at the link above, are you saying that, without any
> changes to upstream, reg_dock_sys_3v3 was listing its supplier as i2c5
> instead of tca6424_dock_1? Why weren't Geert's existing changes in
> of_i2c_notify() sufficient? Looking at it, it does:
> rd->dn->fwnode.flags &= ~FWNODE_FLAG_NOT_DEVICE;
> 
> Which should clear the flag for tca6424_dock_1. Can you help me
> understand why it's not getting cleared?

I don't really know, but I can make some assumptions.

Maybe the link involved in the issue is created quite early in the overlay
applying process, before the calls to the of notifier of_i2c_notify() that
can clear the FWNODE_FLAG_NOT_DEVICE.

The link is created by of_link_to_phandle(). This function is called
from of_fwnode_add_links(), which is the .add_links() operation of the OF
fwnode_operations.

The operation is called by fw_devlink_parse_fwnode(), itself triggered by
fw_devlink_link_device() each time device_add() is called.

If the device_add() related to the reg_dock_sys_3v3 is called before the
of_i2c_notify() related to tca6424_dock_1, the FWNODE_FLAG_NOT_DEVICE flag
in tca6424_dock_1 is not cleared when of_link_to_phandle() is called.

Does this scenario make sense?
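
The ordering described above can be modelled with a small userspace sketch. This is not kernel code: the toy_* names, the struct layout and the flag value are all hypothetical, and the "fall back to the nearest unflagged ancestor" rule is an assumption chosen to match the i2c5-instead-of-tca6424_dock_1 symptom discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for FWNODE_FLAG_NOT_DEVICE; the bit value is arbitrary here. */
#define TOY_FLAG_NOT_DEVICE (1u << 0)

struct toy_fwnode {
	const char *name;
	unsigned int flags;
	struct toy_fwnode *parent;
};

/* Models the notifier step: the node now has a device, so drop the flag. */
static void toy_notify(struct toy_fwnode *node)
{
	node->flags &= ~TOY_FLAG_NOT_DEVICE;
}

/*
 * Models link creation (assumed behaviour): a flagged supplier is skipped
 * and the nearest unflagged ancestor is used instead.
 */
static const struct toy_fwnode *toy_pick_supplier(const struct toy_fwnode *sup)
{
	while (sup && (sup->flags & TOY_FLAG_NOT_DEVICE))
		sup = sup->parent;
	return sup;
}

/* Returns true when the link lands on tca6424_dock_1 rather than on i2c5. */
static bool toy_link_hits_expander(bool notifier_runs_first)
{
	struct toy_fwnode i2c5 = { "i2c5", 0, NULL };
	struct toy_fwnode tca = { "tca6424_dock_1", TOY_FLAG_NOT_DEVICE, &i2c5 };
	const struct toy_fwnode *picked;

	if (notifier_runs_first)
		toy_notify(&tca);

	picked = toy_pick_supplier(&tca);
	assert(picked != NULL);

	if (!notifier_runs_first)
		toy_notify(&tca);	/* too late: the link target is already chosen */

	return strcmp(picked->name, "tca6424_dock_1") == 0;
}
```

In this model, toy_link_hits_expander(true) picks tca6424_dock_1, while toy_link_hits_expander(false) picks i2c5 even though the flag is cleared afterwards.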

> 
> > But not the one described in
> >   https://lore.kernel.org/lkml/20240312151835.29ef62a0@bootlin.com/
> > A link is still present between the i2c@600 and the PCI device
> > instead of between the i2c@600 and the pci-ep-bus.
> 
> What do you mean by PCI device here? You say the same thing in the
> link, but I don't understand what you mean. Can you clarify what
> exactly gets added by the overlay? Please use the fwnode name in all
> the descriptions, even when talking about device links. That should
> help avoid the confusion.
> 
> Also, if you can show what the target node of the overlay looks like,
> that'd help too.
> 
> > Adding the patch clearing the FWNODE_FLAG_NOT_DEVICE in device_add() available
> > at [1] on top of your patches fixes the link issue.
> > With this additional patch applied, the link is present between the i2c@600
> > and the pci-ep-bus.  
> 
> I know the problem with this patch series. But to fix it properly, I
> need to understand the root of the overlay node in your examples and
> the target it's applied to.


This is the Microchip Lan966x PCI device use case.
The Lan966x is a component that can operate either as a "standard" SoC
(i.e. core CPUs and some peripherals) or as a PCI device.
In PCI device mode, the core CPUs are not available and are replaced
by a PCI endpoint. The PCI host then has to play the role that the core
CPUs play in SoC mode. In other words, the PCI host has access to the
peripherals (through PCI MMIO accesses) and has to handle each of them.

Drivers for these peripherals are already available in the kernel (reset
controller, i2c controller, network switch, clock controllers, ...) and are
functional when the Lan966x is in SoC mode.

In order to re-use all of these drivers, the solution was to have a Lan966x
PCI driver whose probe() is called on a match with the Lan966x PCI Vendor/Device ID.
This driver then loads a DT overlay to describe the Lan966x internal components.

Basically, the base DT (the DT describing the PCI host board) has the PCI nodes
between the root PCI node and the Lan966x PCI device built at runtime during the
PCI bus scan. At the end of the scan, the Lan966x driver loads the overlay and
the final DT looks like the following:
--- 8< ---
soc {
	...
	pcie@d0070000 {
		compatible = "marvell,armada-3700-pcie";
		...
		/* A bridge, created at runtime during PCI scan */
		pci@0,0 {
			compatible = "pci11ab,100\0pciclass,060400\0pciclass,0604";
			...

			/* The Lan966x PCI device, created at runtime during PCI scan */
			dev@0,0 { 
				compatible = "pci1055,9660\0pciclass,020000\0pciclass,0200";
				...

				/*
				 * This node is added by the overlay
				 * during lan966x PCI driver probe()
				 */
				pci-ep-bus@0 {
					compatible = "simple-bus";
					...

					flx0: flexcom@e0040000 {
						compatible = "atmel,sama5d2-flexcom";
						reg = <0xe0040000 0x100>;
						ranges = <0x0 0xe0040000 0x800>;
						...			

						i2c_lan966x: i2c@600 {
							compatible = "microchip,lan966x-i2c";
							reg = <0x600 0x200>;
							...
						};
					};
					...
				};
			};
		};
	};
};
--- 8< ---

Without clearing FWNODE_FLAG_NOT_DEVICE, a link is created between
i2c@600 and dev@0,0.
With the FWNODE_FLAG_NOT_DEVICE cleared in device_add(), the link is
created between i2c@600 and pci-ep-bus@0.

flexcom@e0040000 is an MFD device.

In the lan966x PCI driver, the overlay is applied using the following:
--- 8< ---

static int lan966x_pci_load_overlay(struct lan966x_pci *data)
{
	u32 dtbo_size = __dtbo_lan966x_pci_end - __dtbo_lan966x_pci_begin;
	void *dtbo_start = __dtbo_lan966x_pci_begin;
	int ret;

	ret = of_overlay_fdt_apply(dtbo_start, dtbo_size, &data->ovcs_id,
				   data->dev->of_node);
	if (ret)
		return ret;

	return 0;
}

static int lan966x_pci_probe(struct pci_dev *pdev,
			     const struct pci_device_id *id)
{
	struct device *dev = &pdev->dev;
	struct lan966x_pci *data;
	...
	data->dev = dev;
	...
	ret = lan966x_pci_load_overlay(data);
	if (ret)
		return ret;

	...
	ret = of_platform_default_populate(dev->of_node, NULL, dev);
	if (ret)
		goto err_unload_overlay;

	return 0;
	...
}
--- 8< ---

Hope my explanations were clear enough and answered your question.
Feel free to ask for more details if needed.

I provided only excerpts of the full dtso file. If you need more
information (all nodes, all properties), feel free to ask.

Best regards,
Hervé