diff mbox series

[v4,2/7] nvme: assign of_node to nvme device

Message ID 20240809172106.25892-3-ansuelsmth@gmail.com (mailing list archive)
State New
Headers show
Series mtd: improve block2mtd + airoha parser | expand

Commit Message

Christian Marangi Aug. 9, 2024, 5:21 p.m. UTC
Introduce support for a dedicated node for a nvme card. This will be a
subnode of the nvme controller node that will have the "nvme-card"
compatible.

This follow a similar implementation done for mmc where the specific mmc
card have a dedicated of_node.

This can be used for scenario where block2mtd module is used to declare
partition in DT and block2mtd is called on the root block of the nvme
card, permitting the usage of fixed-partition parser or alternative
ones.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
---
 drivers/nvme/host/core.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Christoph Hellwig Aug. 12, 2024, 11:12 a.m. UTC | #1
On Fri, Aug 09, 2024 at 07:21:00PM +0200, Christian Marangi wrote:
> Introduce support for a dedicated node for a nvme card. This will be a
> subnode of the nvme controller node that will have the "nvme-card"
> compatible.

FYI, there really is no such thing as an NVMe card.  There is an
NVMe Namespace, which is the entity that contains the block data,
the Controller which corresponds to the pci_dev for NVMe-PCIe, and
the NVMe Subsystem, which contains Controllers and Namespaces.

> This follow a similar implementation done for mmc where the specific mmc
> card have a dedicated of_node.

That's not a good explanation to be honest.  Most eMMC host controllers
are OF probed devices, so of course they'll have an of_node.

Binding PCIe functions to of_nodes seems completely weird to me, and
you'll need to explain what this totally non-obvious thing makes sense.
Maybe it does, but it needs to be backed up with a very good rationale
that is very clearly documented.
Christian Marangi Aug. 12, 2024, 12:10 p.m. UTC | #2
On Mon, Aug 12, 2024 at 01:12:05PM +0200, Christoph Hellwig wrote:
> On Fri, Aug 09, 2024 at 07:21:00PM +0200, Christian Marangi wrote:
> > Introduce support for a dedicated node for a nvme card. This will be a
> > subnode of the nvme controller node that will have the "nvme-card"
> > compatible.
> 
> FYI, there really is no such thing as an NVMe card.  There is an
> NVMe Namespace, which is the entity that contains the block data,
> the Controller which corresponds to the pci_dev for NVMe-PCIe, and
> the NVMe Subsystem, which contains Controllers and Namespaces.
>

The chosen name was arbritrary just to follow eMMC ones. Can totally
change if problematic.

> > This follow a similar implementation done for mmc where the specific mmc
> > card have a dedicated of_node.
> 
> That's not a good explanation to be honest.  Most eMMC host controllers
> are OF probed devices, so of course they'll have an of_node.
> 
> Binding PCIe functions to of_nodes seems completely weird to me, and
> you'll need to explain what this totally non-obvious thing makes sense.
> Maybe it does, but it needs to be backed up with a very good rationale
> that is very clearly documented.
> 

But support of OF for PCIe is already a thing for a long time. (it all
works by setting the compatible of the PCIe ID card) and used in wifi
card at assign MAC address, calibration data, disable frequency.

In this context we would do a similar thing with declaring the NVMe with
the PCIe ID card (already supported) and we add support for defining an
additional subnode for usage of block2mtd.

The subnode is needed to keep consistency in how the disk struct are
defined with all the parenting levels.

Just to stress it more... This is really for consistency as PCIe OF node
are already a thing and on block2mtd (for example) the same thing can be
done with something like

disk->parent->parent.of_node (as it would point, if present, to the OF node
of the PCIe card (the NVMe))

With eMMC with the mmc-card subnode we would have to use

disk->parent.of_node

Not having this well organized and consistent schema in DT will result
in additional condition in the drivers...

Also consider that if the card is not detected, nothing is probed so
those additional node won't cause any harm.

If these 2 patch are problematic I can totally drop from the series but
it was really to add consistency in NVMe and eMMC. The real important
part is eMMC that is becoming the de-facto replacement for NAND/NOR on
high tier devices (mostly wifi6/7 consumer router)
Christoph Hellwig Aug. 12, 2024, 1:31 p.m. UTC | #3
On Mon, Aug 12, 2024 at 02:10:28PM +0200, Christian Marangi wrote:
> The chosen name was arbritrary just to follow eMMC ones. Can totally
> change if problematic.

NVMe namespaces are dynamic and can be created and deleted at will
at runtime.  I just don't see how they would even fit into OF
concepts.

There is a huge impedance mismatch here, to the point where I completely
fail to understand what you are trying to do.

> But support of OF for PCIe is already a thing for a long time. (it all
> works by setting the compatible of the PCIe ID card) and used in wifi
> card at assign MAC address, calibration data, disable frequency.

Please point to a document describing how, but more importantly why
this is done.  I've worked with and maintained Linux PCI(e) drivers for
about 20 years and never seen it.  And the concept simply doesn't make
sense in terms of a dynamically probed bus.

> Not having this well organized and consistent schema in DT will result
> in additional condition in the drivers...

NVMe Controllers are PCI functions (or virtual entities over the network).
Defining them in a static DT scheme does not make sense.  NVMe Namespaces
which are what contains the block data are dynamically discoverred and
can be created and deleted at runtime, so refering to them in DT is even
more broken.  I really don't see how any of this could remotely work.

> If these 2 patch are problematic I can totally drop from the series but
> it was really to add consistency in NVMe and eMMC. The real important
> part is eMMC that is becoming the de-facto replacement for NAND/NOR on
> high tier devices (mostly wifi6/7 consumer router)

If you aren't dealing with raw(ish) NAND don't use mtd.  MTD is designed
to deal with the nitty gritty details of NOR and NAND flash.  If you
already have an FTL running in the device there is absolutely no reason
to use it.
Markus Elfring Aug. 13, 2024, 9:15 a.m. UTC | #4
> This follow a similar implementation done for mmc …

       follows?

Regards,
Markus
Rob Herring (Arm) Aug. 13, 2024, 8:04 p.m. UTC | #5
On Mon, Aug 12, 2024 at 03:31:28PM +0200, Christoph Hellwig wrote:
> On Mon, Aug 12, 2024 at 02:10:28PM +0200, Christian Marangi wrote:
> > The chosen name was arbritrary just to follow eMMC ones. Can totally
> > change if problematic.

The purpose of those eMMC nodes was to describe various non-standard 
stuff you get when discoverable devices get soldered on a board. Power 
rails, reset GPIOs, etc. It was not to put partition info there, but I 
suppose having the node opens it to such abuse.

> 
> NVMe namespaces are dynamic and can be created and deleted at will
> at runtime.  I just don't see how they would even fit into OF
> concepts.
> 
> There is a huge impedance mismatch here, to the point where I completely
> fail to understand what you are trying to do.
> 
> > But support of OF for PCIe is already a thing for a long time. (it all
> > works by setting the compatible of the PCIe ID card) and used in wifi
> > card at assign MAC address, calibration data, disable frequency.
> 
> Please point to a document describing how, but more importantly why
> this is done.  I've worked with and maintained Linux PCI(e) drivers for
> about 20 years and never seen it.  And the concept simply doesn't make
> sense in terms of a dynamically probed bus.

With OpenFirmware systems, the firmware will probe PCI and populate 
nodes with what it discovered and with how it configured things. IBM 
PSeries uses DT for PCI hotplug as well, AIUI.

For Flattened DT, PCI devices are typically only present if there are 
extra non-discoverable things (again, happens when devices are soldered 
down and not in standard slots). Another example is a NIC where they 
cheaped out and didn't put an EEPROM to store the MAC address, so you 
store it in the DT. Now we're starting to see non-discoverable things 
sitting behind PCI devices, so we need the PCI device in DT to describe 
those downstream devices.

> > Not having this well organized and consistent schema in DT will result
> > in additional condition in the drivers...
> 
> NVMe Controllers are PCI functions (or virtual entities over the network).
> Defining them in a static DT scheme does not make sense.  NVMe Namespaces
> which are what contains the block data are dynamically discoverred and
> can be created and deleted at runtime, so refering to them in DT is even
> more broken.  I really don't see how any of this could remotely work.
> 
> > If these 2 patch are problematic I can totally drop from the series but
> > it was really to add consistency in NVMe and eMMC. The real important
> > part is eMMC that is becoming the de-facto replacement for NAND/NOR on
> > high tier devices (mostly wifi6/7 consumer router)
> 
> If you aren't dealing with raw(ish) NAND don't use mtd.  MTD is designed
> to deal with the nitty gritty details of NOR and NAND flash.  If you
> already have an FTL running in the device there is absolutely no reason
> to use it.

Yes, agreed.

Rob
diff mbox series

Patch

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 053d5b4909cd..344523274d1b 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -14,6 +14,7 @@ 
 #include <linux/kernel.h>
 #include <linux/module.h>
 #include <linux/backing-dev.h>
+#include <linux/of.h>
 #include <linux/slab.h>
 #include <linux/types.h>
 #include <linux/pr.h>
@@ -4651,6 +4652,7 @@  void nvme_uninit_ctrl(struct nvme_ctrl *ctrl)
 	nvme_hwmon_exit(ctrl);
 	nvme_fault_inject_fini(&ctrl->fault_inject);
 	dev_pm_qos_hide_latency_tolerance(ctrl->device);
+	of_node_put(ctrl->device->of_node);
 	cdev_device_del(&ctrl->cdev, ctrl->device);
 	nvme_put_ctrl(ctrl);
 }
@@ -4771,6 +4773,8 @@  int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
 	else
 		ctrl->device->groups = nvme_dev_attr_groups;
 	ctrl->device->release = nvme_free_ctrl;
+	ctrl->device->of_node = of_get_compatible_child(ctrl->dev->of_node,
+							"nvme-card");
 	dev_set_drvdata(ctrl->device, ctrl);
 
 	return ret;