Message ID | 20191107160834.21087-12-parav@mellanox.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Series | Mellanox, mlx5 sub function support | expand |
On Thu, 7 Nov 2019 10:08:27 -0600, Parav Pandit wrote: > Introduce a new mdev port flavour for mdev devices. > PF. > Prepare such port's phys_port_name using unique mdev alias. > > An example output for eswitch ports with one physical port and > one mdev port: > > $ devlink port show > pci/0000:06:00.0/65535: type eth netdev p0 flavour physical port 0 > pci/0000:06:00.0/32768: type eth netdev p1b0348cf880a flavour mdev alias 1b0348cf880a Surely those devices are anchored in on of the PF (or possibly VFs) that should be exposed here from the start. > Signed-off-by: Parav Pandit <parav@mellanox.com> > @@ -6649,6 +6678,9 @@ static int __devlink_port_phys_port_name_get(struct devlink_port *devlink_port, > n = snprintf(name, len, "pf%uvf%u", > attrs->pci_vf.pf, attrs->pci_vf.vf); > break; > + case DEVLINK_PORT_FLAVOUR_MDEV: > + n = snprintf(name, len, "p%s", attrs->mdev.mdev_alias); Didn't you say m$alias in the cover letter? Not p$alias? > + break; > } > > if (n >= len)
> -----Original Message----- > From: kvm-owner@vger.kernel.org <kvm-owner@vger.kernel.org> On Behalf > Of Jakub Kicinski > Sent: Thursday, November 7, 2019 2:39 PM > To: Parav Pandit <parav@mellanox.com> > Cc: alex.williamson@redhat.com; davem@davemloft.net; > kvm@vger.kernel.org; netdev@vger.kernel.org; Saeed Mahameed > <saeedm@mellanox.com>; kwankhede@nvidia.com; leon@kernel.org; > cohuck@redhat.com; Jiri Pirko <jiri@mellanox.com>; linux- > rdma@vger.kernel.org > Subject: Re: [PATCH net-next 12/19] devlink: Introduce mdev port flavour > > On Thu, 7 Nov 2019 10:08:27 -0600, Parav Pandit wrote: > > Introduce a new mdev port flavour for mdev devices. > > PF. > > Prepare such port's phys_port_name using unique mdev alias. > > > > An example output for eswitch ports with one physical port and one > > mdev port: > > > > $ devlink port show > > pci/0000:06:00.0/65535: type eth netdev p0 flavour physical port 0 > > pci/0000:06:00.0/32768: type eth netdev p1b0348cf880a flavour mdev > > alias 1b0348cf880a > > Surely those devices are anchored in on of the PF (or possibly VFs) that should > be exposed here from the start. > They are anchored to PCI device in this implementation and all mdev device has their parent device too. However mdev devices establishes their unique identity at system level using unique UUID. So prefixing it with pf0, will shorten the remaining phys_port_name letter we get to use. Since we get unique 12 letters alias in a system for each mdev, prefixing it with pf/vf is redundant. In case of VFs, given the VF numbers can repeat among multiple PFs, and representor can be over just one eswitch instance, it was necessary to prefix. Mdev's devices parent PCI device is clearly seen in the PCI sysfs hierarchy, so don't prefer to duplicate it. > > Signed-off-by: Parav Pandit <parav@mellanox.com> > > > @@ -6649,6 +6678,9 @@ static int > __devlink_port_phys_port_name_get(struct devlink_port *devlink_port, > > n = snprintf(name, len, "pf%uvf%u", > > attrs->pci_vf.pf, attrs->pci_vf.vf); > > break; > > + case DEVLINK_PORT_FLAVOUR_MDEV: > > + n = snprintf(name, len, "p%s", attrs->mdev.mdev_alias); > > Didn't you say m$alias in the cover letter? Not p$alias? > In cover letter I described the naming scheme for the netdevice of the mdev device (not the representor). Representor follows current unique phys_port_name method. > > + break; > > } > > > > if (n >= len)
On Thu, 7 Nov 2019 21:03:09 +0000, Parav Pandit wrote: > > Subject: Re: [PATCH net-next 12/19] devlink: Introduce mdev port flavour > > > > On Thu, 7 Nov 2019 10:08:27 -0600, Parav Pandit wrote: > > > Introduce a new mdev port flavour for mdev devices. > > > PF. > > > Prepare such port's phys_port_name using unique mdev alias. > > > > > > An example output for eswitch ports with one physical port and one > > > mdev port: > > > > > > $ devlink port show > > > pci/0000:06:00.0/65535: type eth netdev p0 flavour physical port 0 > > > pci/0000:06:00.0/32768: type eth netdev p1b0348cf880a flavour mdev > > > alias 1b0348cf880a > > > > Surely those devices are anchored in on of the PF (or possibly VFs) that should > > be exposed here from the start. > > > They are anchored to PCI device in this implementation and all mdev device has their parent device too. > However mdev devices establishes their unique identity at system level using unique UUID. > So prefixing it with pf0, will shorten the remaining phys_port_name letter we get to use. > Since we get unique 12 letters alias in a system for each mdev, prefixing it with pf/vf is redundant. > In case of VFs, given the VF numbers can repeat among multiple PFs, and representor can be over just one eswitch instance, it was necessary to prefix. > Mdev's devices parent PCI device is clearly seen in the PCI sysfs hierarchy, so don't prefer to duplicate it. I'm talking about netlink attributes. I'm not suggesting to sprintf it all into the phys_port_name. > > > Signed-off-by: Parav Pandit <parav@mellanox.com> > > > > > @@ -6649,6 +6678,9 @@ static int > > __devlink_port_phys_port_name_get(struct devlink_port *devlink_port, > > > n = snprintf(name, len, "pf%uvf%u", > > > attrs->pci_vf.pf, attrs->pci_vf.vf); > > > break; > > > + case DEVLINK_PORT_FLAVOUR_MDEV: > > > + n = snprintf(name, len, "p%s", attrs->mdev.mdev_alias); > > > > Didn't you say m$alias in the cover letter? Not p$alias? > > > In cover letter I described the naming scheme for the netdevice of > the mdev device (not the representor). Representor follows current > unique phys_port_name method. So we're reusing the letter that normal ports use? Why does it matter to name the virtualized device? In case of other reprs its the repr that has the canonical name, in case of containers and VMs they will not care at all what hypervisor identifier the device has. > > > + break; > > > } > > > > > > if (n >= len)
> -----Original Message----- > From: Jakub Kicinski <jakub.kicinski@netronome.com> > Sent: Thursday, November 7, 2019 7:18 PM > To: Parav Pandit <parav@mellanox.com> > Cc: alex.williamson@redhat.com; davem@davemloft.net; > kvm@vger.kernel.org; netdev@vger.kernel.org; Saeed Mahameed > <saeedm@mellanox.com>; kwankhede@nvidia.com; leon@kernel.org; > cohuck@redhat.com; Jiri Pirko <jiri@mellanox.com>; linux- > rdma@vger.kernel.org > Subject: Re: [PATCH net-next 12/19] devlink: Introduce mdev port flavour > > On Thu, 7 Nov 2019 21:03:09 +0000, Parav Pandit wrote: > > > Subject: Re: [PATCH net-next 12/19] devlink: Introduce mdev port > > > flavour > > > > > > On Thu, 7 Nov 2019 10:08:27 -0600, Parav Pandit wrote: > > > > Introduce a new mdev port flavour for mdev devices. > > > > PF. > > > > Prepare such port's phys_port_name using unique mdev alias. > > > > > > > > An example output for eswitch ports with one physical port and one > > > > mdev port: > > > > > > > > $ devlink port show > > > > pci/0000:06:00.0/65535: type eth netdev p0 flavour physical port 0 > > > > pci/0000:06:00.0/32768: type eth netdev p1b0348cf880a flavour mdev > > > > alias 1b0348cf880a > > > > > > Surely those devices are anchored in on of the PF (or possibly VFs) > > > that should be exposed here from the start. > > > > > They are anchored to PCI device in this implementation and all mdev > device has their parent device too. > > However mdev devices establishes their unique identity at system level > using unique UUID. > > So prefixing it with pf0, will shorten the remaining phys_port_name letter > we get to use. > > Since we get unique 12 letters alias in a system for each mdev, prefixing it > with pf/vf is redundant. > > In case of VFs, given the VF numbers can repeat among multiple PFs, and > representor can be over just one eswitch instance, it was necessary to prefix. > > Mdev's devices parent PCI device is clearly seen in the PCI sysfs hierarchy, > so don't prefer to duplicate it. > > I'm talking about netlink attributes. I'm not suggesting to sprintf it all into > the phys_port_name. > I didn't follow your comment. For devlink port show command output you said, "Surely those devices are anchored in on of the PF (or possibly VFs) that should be exposed here from the start." So I was trying to explain why we don't expose PF/VF detail in the port attributes which contains (a) flavour (b) netdev representor (name derived from phys_port_name) (c) mdev alias Can you please describe which netlink attribute I missed? > > > > Signed-off-by: Parav Pandit <parav@mellanox.com> > > > > > > > @@ -6649,6 +6678,9 @@ static int > > > __devlink_port_phys_port_name_get(struct devlink_port *devlink_port, > > > > n = snprintf(name, len, "pf%uvf%u", > > > > attrs->pci_vf.pf, attrs->pci_vf.vf); > > > > break; > > > > + case DEVLINK_PORT_FLAVOUR_MDEV: > > > > + n = snprintf(name, len, "p%s", attrs->mdev.mdev_alias); > > > > > > Didn't you say m$alias in the cover letter? Not p$alias? > > > > > In cover letter I described the naming scheme for the netdevice of the > > mdev device (not the representor). Representor follows current unique > > phys_port_name method. > > So we're reusing the letter that normal ports use? > I initially had 'm' as prefix to make it easy to recognize as mdev's port, instead of 'p', but during internal review Jiri's input was to just use 'p'. > Why does it matter to name the virtualized device? In case of other reprs its > the repr that has the canonical name, in case of containers and VMs they > will not care at all what hypervisor identifier the device has. > Well, many orchestration framework probably won't care of what name is picked up. And such name will likely get renamed to eth0 in VM or container. Unlike vxlan, macvlan interfaces, user explicitly specify the netdevice name, and when newlink() netlink command completes with success, user know the device to use. If we don't have persistent name for mdev, if a random name ethX is picked up, user needs refer to sysfs device hierarchy to know its netdev. Its super easy to do refer that, but having persistent name based out of alias makes things aligned like naming device on PCI bus. This way devices can be used without VM/container use cases too, for example user is interested in only 4 or 8 mdev devices in system and its setup is done through systemd.service. > > > > + break; > > > > } > > > > > > > > if (n >= len)
On Fri, 8 Nov 2019 01:44:53 +0000, Parav Pandit wrote: > > I'm talking about netlink attributes. I'm not suggesting to sprintf it all into > > the phys_port_name. > > > I didn't follow your comment. For devlink port show command output you said, > > "Surely those devices are anchored in on of the PF (or possibly VFs) > that should be exposed here from the start." > So I was trying to explain why we don't expose PF/VF detail in the > port attributes which contains > (a) flavour > (b) netdev representor (name derived from phys_port_name) > (c) mdev alias > > Can you please describe which netlink attribute I missed? Identification of the PCI device. The PCI devices are not linked to devlink ports, so the sysfs hierarchy (a) is irrelevant, (b) may not be visible in multi-host (or SmartNIC). > > > > > Signed-off-by: Parav Pandit <parav@mellanox.com> > > > > > > > > > @@ -6649,6 +6678,9 @@ static int > > > > __devlink_port_phys_port_name_get(struct devlink_port *devlink_port, > > > > > n = snprintf(name, len, "pf%uvf%u", > > > > > attrs->pci_vf.pf, attrs->pci_vf.vf); > > > > > break; > > > > > + case DEVLINK_PORT_FLAVOUR_MDEV: > > > > > + n = snprintf(name, len, "p%s", attrs->mdev.mdev_alias); > > > > > > > > Didn't you say m$alias in the cover letter? Not p$alias? > > > > > > > In cover letter I described the naming scheme for the netdevice of the > > > mdev device (not the representor). Representor follows current unique > > > phys_port_name method. > > > > So we're reusing the letter that normal ports use? > > > I initially had 'm' as prefix to make it easy to recognize as mdev's port, instead of 'p', but during internal review Jiri's input was to just use 'p'. Let's way for Jiri to weigh in then. > > Why does it matter to name the virtualized device? In case of other reprs its > > the repr that has the canonical name, in case of containers and VMs they > > will not care at all what hypervisor identifier the device has. > > > Well, many orchestration framework probably won't care of what name is picked up. > And such name will likely get renamed to eth0 in VM or container. > Unlike vxlan, macvlan interfaces, user explicitly specify the netdevice name, and when newlink() netlink command completes with success, user know the device to use. > If we don't have persistent name for mdev, if a random name ethX is picked up, user needs refer to sysfs device hierarchy to know its netdev. > Its super easy to do refer that, but having persistent name based out of alias makes things aligned like naming device on PCI bus. > This way devices can be used without VM/container use cases too, for example user is interested in only 4 or 8 mdev devices in system and its setup is done through systemd.service.
> -----Original Message----- > From: Jakub Kicinski <jakub.kicinski@netronome.com> > Sent: Thursday, November 7, 2019 8:20 PM > To: Parav Pandit <parav@mellanox.com> > Cc: alex.williamson@redhat.com; davem@davemloft.net; > kvm@vger.kernel.org; netdev@vger.kernel.org; Saeed Mahameed > <saeedm@mellanox.com>; kwankhede@nvidia.com; leon@kernel.org; > cohuck@redhat.com; Jiri Pirko <jiri@mellanox.com>; linux- > rdma@vger.kernel.org > Subject: Re: [PATCH net-next 12/19] devlink: Introduce mdev port flavour > > On Fri, 8 Nov 2019 01:44:53 +0000, Parav Pandit wrote: > > > I'm talking about netlink attributes. I'm not suggesting to sprintf > > > it all into the phys_port_name. > > > > > I didn't follow your comment. For devlink port show command output you > > said, > > > > "Surely those devices are anchored in on of the PF (or possibly VFs) > > that should be exposed here from the start." > > So I was trying to explain why we don't expose PF/VF detail in the > > port attributes which contains > > (a) flavour > > (b) netdev representor (name derived from phys_port_name) > > (c) mdev alias > > > > Can you please describe which netlink attribute I missed? > > Identification of the PCI device. The PCI devices are not linked to devlink > ports, so the sysfs hierarchy (a) is irrelevant, (b) may not be visible in multi- > host (or SmartNIC). > It's the unique mdev device alias. It is not right to attach to the PCI device. Mdev is bus in itself where devices are identified uniquely. So an alias suffice that identity. > > > > > > Signed-off-by: Parav Pandit <parav@mellanox.com> > > > > > > > > > > > @@ -6649,6 +6678,9 @@ static int > > > > > __devlink_port_phys_port_name_get(struct devlink_port > > > > > *devlink_port, > > > > > > n = snprintf(name, len, "pf%uvf%u", > > > > > > attrs->pci_vf.pf, attrs->pci_vf.vf); > > > > > > break; > > > > > > + case DEVLINK_PORT_FLAVOUR_MDEV: > > > > > > + n = snprintf(name, len, "p%s", attrs- > >mdev.mdev_alias); > > > > > > > > > > Didn't you say m$alias in the cover letter? Not p$alias? > > > > > > > > > In cover letter I described the naming scheme for the netdevice of > > > > the mdev device (not the representor). Representor follows current > > > > unique phys_port_name method. > > > > > > So we're reusing the letter that normal ports use? > > > > > I initially had 'm' as prefix to make it easy to recognize as mdev's port, > instead of 'p', but during internal review Jiri's input was to just use 'p'. > > Let's way for Jiri to weigh in then. Yeah. I remember his point was to not confuse the <en><m> prefix in the persistent device name with 'm' prefix in phys_port_name. Hence, his input was just 'p'. > > > > Why does it matter to name the virtualized device? In case of other > > > reprs its the repr that has the canonical name, in case of > > > containers and VMs they will not care at all what hypervisor identifier > the device has. > > > > > Well, many orchestration framework probably won't care of what name is > picked up. > > And such name will likely get renamed to eth0 in VM or container. > > Unlike vxlan, macvlan interfaces, user explicitly specify the netdevice name, > and when newlink() netlink command completes with success, user know the > device to use. > > If we don't have persistent name for mdev, if a random name ethX is > picked up, user needs refer to sysfs device hierarchy to know its netdev. > > Its super easy to do refer that, but having persistent name based out of > alias makes things aligned like naming device on PCI bus. > > This way devices can be used without VM/container use cases too, for > example user is interested in only 4 or 8 mdev devices in system and its > setup is done through systemd.service.
Fri, Nov 08, 2019 at 03:20:24AM CET, jakub.kicinski@netronome.com wrote: >On Fri, 8 Nov 2019 01:44:53 +0000, Parav Pandit wrote: [...] >> > > > > @@ -6649,6 +6678,9 @@ static int >> > > > __devlink_port_phys_port_name_get(struct devlink_port *devlink_port, >> > > > > n = snprintf(name, len, "pf%uvf%u", >> > > > > attrs->pci_vf.pf, attrs->pci_vf.vf); >> > > > > break; >> > > > > + case DEVLINK_PORT_FLAVOUR_MDEV: >> > > > > + n = snprintf(name, len, "p%s", attrs->mdev.mdev_alias); >> > > > >> > > > Didn't you say m$alias in the cover letter? Not p$alias? >> > > > >> > > In cover letter I described the naming scheme for the netdevice of the >> > > mdev device (not the representor). Representor follows current unique >> > > phys_port_name method. >> > >> > So we're reusing the letter that normal ports use? >> > >> I initially had 'm' as prefix to make it easy to recognize as mdev's port, instead of 'p', but during internal review Jiri's input was to just use 'p'. > >Let's way for Jiri to weigh in then. Hmm, it's been so far I can't really recall. But looking at what we have now: DEVLINK_PORT_FLAVOUR_PHYSICAL "p%u"/"p%us%u" DEVLINK_PORT_FLAVOUR_PCI_PF "pf%u" DEVLINK_PORT_FLAVOUR_PCI_VF "pf%uvf%u" For mdev, the ideal format would be: "pf%um%s" or "pf%uvf%um%s", but that would be too long. I guess that "m%s" is fine. "p" is probably not a good idea as phys ports already have that. [...]
Fri, Nov 08, 2019 at 03:31:02AM CET, parav@mellanox.com wrote: > > >> -----Original Message----- >> From: Jakub Kicinski <jakub.kicinski@netronome.com> >> Sent: Thursday, November 7, 2019 8:20 PM >> To: Parav Pandit <parav@mellanox.com> >> Cc: alex.williamson@redhat.com; davem@davemloft.net; >> kvm@vger.kernel.org; netdev@vger.kernel.org; Saeed Mahameed >> <saeedm@mellanox.com>; kwankhede@nvidia.com; leon@kernel.org; >> cohuck@redhat.com; Jiri Pirko <jiri@mellanox.com>; linux- >> rdma@vger.kernel.org >> Subject: Re: [PATCH net-next 12/19] devlink: Introduce mdev port flavour >> >> On Fri, 8 Nov 2019 01:44:53 +0000, Parav Pandit wrote: >> > > I'm talking about netlink attributes. I'm not suggesting to sprintf >> > > it all into the phys_port_name. >> > > >> > I didn't follow your comment. For devlink port show command output you >> > said, >> > >> > "Surely those devices are anchored in on of the PF (or possibly VFs) >> > that should be exposed here from the start." >> > So I was trying to explain why we don't expose PF/VF detail in the >> > port attributes which contains >> > (a) flavour >> > (b) netdev representor (name derived from phys_port_name) >> > (c) mdev alias >> > >> > Can you please describe which netlink attribute I missed? >> >> Identification of the PCI device. The PCI devices are not linked to devlink >> ports, so the sysfs hierarchy (a) is irrelevant, (b) may not be visible in multi- >> host (or SmartNIC). >> > >It's the unique mdev device alias. It is not right to attach to the PCI device. >Mdev is bus in itself where devices are identified uniquely. So an alias suffice that identity. Wait a sec. For mdev, what you say is correct. But here we talk about devlink_port which is representing this mdev. And this devlink_port is very similar to VF devlink_port. It is bound to specific PF (in case of mdev it could be PF-VF). > >> > > > > > Signed-off-by: Parav Pandit <parav@mellanox.com> >> > > > > >> > > > > > @@ -6649,6 +6678,9 @@ static int >> > > > > __devlink_port_phys_port_name_get(struct devlink_port >> > > > > *devlink_port, >> > > > > > n = snprintf(name, len, "pf%uvf%u", >> > > > > > attrs->pci_vf.pf, attrs->pci_vf.vf); >> > > > > > break; >> > > > > > + case DEVLINK_PORT_FLAVOUR_MDEV: >> > > > > > + n = snprintf(name, len, "p%s", attrs- >> >mdev.mdev_alias); >> > > > > >> > > > > Didn't you say m$alias in the cover letter? Not p$alias? >> > > > > >> > > > In cover letter I described the naming scheme for the netdevice of >> > > > the mdev device (not the representor). Representor follows current >> > > > unique phys_port_name method. >> > > >> > > So we're reusing the letter that normal ports use? >> > > >> > I initially had 'm' as prefix to make it easy to recognize as mdev's port, >> instead of 'p', but during internal review Jiri's input was to just use 'p'. >> >> Let's way for Jiri to weigh in then. > >Yeah. >I remember his point was to not confuse the <en><m> prefix in the persistent device name with 'm' prefix in phys_port_name. >Hence, his input was just 'p'. Not sure what are you referring to. Udev places "n" in front of whatever string we construct here, so the namespace is entirely in our hands. > >> >> > > Why does it matter to name the virtualized device? In case of other >> > > reprs its the repr that has the canonical name, in case of >> > > containers and VMs they will not care at all what hypervisor identifier >> the device has. >> > > >> > Well, many orchestration framework probably won't care of what name is >> picked up. >> > And such name will likely get renamed to eth0 in VM or container. >> > Unlike vxlan, macvlan interfaces, user explicitly specify the netdevice name, >> and when newlink() netlink command completes with success, user know the >> device to use. >> > If we don't have persistent name for mdev, if a random name ethX is >> picked up, user needs refer to sysfs device hierarchy to know its netdev. >> > Its super easy to do refer that, but having persistent name based out of >> alias makes things aligned like naming device on PCI bus. >> > This way devices can be used without VM/container use cases too, for >> example user is interested in only 4 or 8 mdev devices in system and its >> setup is done through systemd.service.
> -----Original Message----- > From: Jiri Pirko <jiri@resnulli.us> > Sent: Friday, November 8, 2019 3:30 AM > To: Jakub Kicinski <jakub.kicinski@netronome.com> > Cc: Parav Pandit <parav@mellanox.com>; alex.williamson@redhat.com; > davem@davemloft.net; kvm@vger.kernel.org; netdev@vger.kernel.org; > Saeed Mahameed <saeedm@mellanox.com>; kwankhede@nvidia.com; > leon@kernel.org; cohuck@redhat.com; Jiri Pirko <jiri@mellanox.com>; linux- > rdma@vger.kernel.org > Subject: Re: [PATCH net-next 12/19] devlink: Introduce mdev port flavour > > Fri, Nov 08, 2019 at 03:20:24AM CET, jakub.kicinski@netronome.com wrote: > >On Fri, 8 Nov 2019 01:44:53 +0000, Parav Pandit wrote: > > [...] > > >> > > > > @@ -6649,6 +6678,9 @@ static int > >> > > > __devlink_port_phys_port_name_get(struct devlink_port > >> > > > *devlink_port, > >> > > > > n = snprintf(name, len, "pf%uvf%u", > >> > > > > attrs->pci_vf.pf, attrs->pci_vf.vf); > >> > > > > break; > >> > > > > + case DEVLINK_PORT_FLAVOUR_MDEV: > >> > > > > + n = snprintf(name, len, "p%s", attrs- > >mdev.mdev_alias); > >> > > > > >> > > > Didn't you say m$alias in the cover letter? Not p$alias? > >> > > > > >> > > In cover letter I described the naming scheme for the netdevice > >> > > of the mdev device (not the representor). Representor follows > >> > > current unique phys_port_name method. > >> > > >> > So we're reusing the letter that normal ports use? > >> > > >> I initially had 'm' as prefix to make it easy to recognize as mdev's port, > instead of 'p', but during internal review Jiri's input was to just use 'p'. > > > >Let's way for Jiri to weigh in then. > > Hmm, it's been so far I can't really recall. But looking at what we have > now: > DEVLINK_PORT_FLAVOUR_PHYSICAL "p%u"/"p%us%u" > DEVLINK_PORT_FLAVOUR_PCI_PF "pf%u" > DEVLINK_PORT_FLAVOUR_PCI_VF "pf%uvf%u" > For mdev, the ideal format would be: > "pf%um%s" or "pf%uvf%um%s", but that would be too long. > I guess that "m%s" is fine. > "p" is probably not a good idea as phys ports already have that. > > [...] Ok. I will revise to use "m%s". Thanks.
> -----Original Message----- > From: Jiri Pirko <jiri@resnulli.us> > Sent: Friday, November 8, 2019 3:47 AM > To: Parav Pandit <parav@mellanox.com> > Cc: Jakub Kicinski <jakub.kicinski@netronome.com>; > alex.williamson@redhat.com; davem@davemloft.net; kvm@vger.kernel.org; > netdev@vger.kernel.org; Saeed Mahameed <saeedm@mellanox.com>; > kwankhede@nvidia.com; leon@kernel.org; cohuck@redhat.com; Jiri Pirko > <jiri@mellanox.com>; linux-rdma@vger.kernel.org > Subject: Re: [PATCH net-next 12/19] devlink: Introduce mdev port flavour > > Fri, Nov 08, 2019 at 03:31:02AM CET, parav@mellanox.com wrote: > > > > > >> -----Original Message----- > >> From: Jakub Kicinski <jakub.kicinski@netronome.com> > >> Sent: Thursday, November 7, 2019 8:20 PM > >> To: Parav Pandit <parav@mellanox.com> > >> Cc: alex.williamson@redhat.com; davem@davemloft.net; > >> kvm@vger.kernel.org; netdev@vger.kernel.org; Saeed Mahameed > >> <saeedm@mellanox.com>; kwankhede@nvidia.com; leon@kernel.org; > >> cohuck@redhat.com; Jiri Pirko <jiri@mellanox.com>; linux- > >> rdma@vger.kernel.org > >> Subject: Re: [PATCH net-next 12/19] devlink: Introduce mdev port > >> flavour > >> > >> On Fri, 8 Nov 2019 01:44:53 +0000, Parav Pandit wrote: > >> > > I'm talking about netlink attributes. I'm not suggesting to > >> > > sprintf it all into the phys_port_name. > >> > > > >> > I didn't follow your comment. For devlink port show command output > >> > you said, > >> > > >> > "Surely those devices are anchored in on of the PF (or possibly > >> > VFs) that should be exposed here from the start." > >> > So I was trying to explain why we don't expose PF/VF detail in the > >> > port attributes which contains > >> > (a) flavour > >> > (b) netdev representor (name derived from phys_port_name) > >> > (c) mdev alias > >> > > >> > Can you please describe which netlink attribute I missed? > >> > >> Identification of the PCI device. The PCI devices are not linked to > >> devlink ports, so the sysfs hierarchy (a) is irrelevant, (b) may not > >> be visible in multi- host (or SmartNIC). > >> > > > >It's the unique mdev device alias. It is not right to attach to the PCI device. > >Mdev is bus in itself where devices are identified uniquely. So an alias > suffice that identity. > > Wait a sec. For mdev, what you say is correct. But here we talk about > devlink_port which is representing this mdev. And this devlink_port is very > similar to VF devlink_port. It is bound to specific PF (in case of mdev it could > be PF-VF). > But mdev port has unique phys_port_name in system, it incorrect to use PF/VF prefix. What in hypothetical case, mdev is not on top of PCI...
Fri, Nov 08, 2019 at 04:45:06PM CET, parav@mellanox.com wrote: > > >> -----Original Message----- >> From: Jiri Pirko <jiri@resnulli.us> >> Sent: Friday, November 8, 2019 3:47 AM >> To: Parav Pandit <parav@mellanox.com> >> Cc: Jakub Kicinski <jakub.kicinski@netronome.com>; >> alex.williamson@redhat.com; davem@davemloft.net; kvm@vger.kernel.org; >> netdev@vger.kernel.org; Saeed Mahameed <saeedm@mellanox.com>; >> kwankhede@nvidia.com; leon@kernel.org; cohuck@redhat.com; Jiri Pirko >> <jiri@mellanox.com>; linux-rdma@vger.kernel.org >> Subject: Re: [PATCH net-next 12/19] devlink: Introduce mdev port flavour >> >> Fri, Nov 08, 2019 at 03:31:02AM CET, parav@mellanox.com wrote: >> > >> > >> >> -----Original Message----- >> >> From: Jakub Kicinski <jakub.kicinski@netronome.com> >> >> Sent: Thursday, November 7, 2019 8:20 PM >> >> To: Parav Pandit <parav@mellanox.com> >> >> Cc: alex.williamson@redhat.com; davem@davemloft.net; >> >> kvm@vger.kernel.org; netdev@vger.kernel.org; Saeed Mahameed >> >> <saeedm@mellanox.com>; kwankhede@nvidia.com; leon@kernel.org; >> >> cohuck@redhat.com; Jiri Pirko <jiri@mellanox.com>; linux- >> >> rdma@vger.kernel.org >> >> Subject: Re: [PATCH net-next 12/19] devlink: Introduce mdev port >> >> flavour >> >> >> >> On Fri, 8 Nov 2019 01:44:53 +0000, Parav Pandit wrote: >> >> > > I'm talking about netlink attributes. I'm not suggesting to >> >> > > sprintf it all into the phys_port_name. >> >> > > >> >> > I didn't follow your comment. For devlink port show command output >> >> > you said, >> >> > >> >> > "Surely those devices are anchored in on of the PF (or possibly >> >> > VFs) that should be exposed here from the start." >> >> > So I was trying to explain why we don't expose PF/VF detail in the >> >> > port attributes which contains >> >> > (a) flavour >> >> > (b) netdev representor (name derived from phys_port_name) >> >> > (c) mdev alias >> >> > >> >> > Can you please describe which netlink attribute I missed? >> >> >> >> Identification of the PCI device. The PCI devices are not linked to >> >> devlink ports, so the sysfs hierarchy (a) is irrelevant, (b) may not >> >> be visible in multi- host (or SmartNIC). >> >> >> > >> >It's the unique mdev device alias. It is not right to attach to the PCI device. >> >Mdev is bus in itself where devices are identified uniquely. So an alias >> suffice that identity. >> >> Wait a sec. For mdev, what you say is correct. But here we talk about >> devlink_port which is representing this mdev. And this devlink_port is very >> similar to VF devlink_port. It is bound to specific PF (in case of mdev it could >> be PF-VF). >> >But mdev port has unique phys_port_name in system, it incorrect to use PF/VF prefix. Why incorrect? It is always bound to pf/vf? >What in hypothetical case, mdev is not on top of PCI... Okay, let's go hypothetical. In that case, it is going to be on top of something else, wouldn't it?
> -----Original Message----- > From: Jiri Pirko <jiri@resnulli.us> > >> >> On Fri, 8 Nov 2019 01:44:53 +0000, Parav Pandit wrote: > >> >> > > I'm talking about netlink attributes. I'm not suggesting to > >> >> > > sprintf it all into the phys_port_name. > >> >> > > > >> >> > I didn't follow your comment. For devlink port show command > >> >> > output you said, > >> >> > > >> >> > "Surely those devices are anchored in on of the PF (or possibly > >> >> > VFs) that should be exposed here from the start." > >> >> > So I was trying to explain why we don't expose PF/VF detail in > >> >> > the port attributes which contains > >> >> > (a) flavour > >> >> > (b) netdev representor (name derived from phys_port_name) > >> >> > (c) mdev alias > >> >> > > >> >> > Can you please describe which netlink attribute I missed? > >> >> > >> >> Identification of the PCI device. The PCI devices are not linked > >> >> to devlink ports, so the sysfs hierarchy (a) is irrelevant, (b) > >> >> may not be visible in multi- host (or SmartNIC). > >> >> > >> > > >> >It's the unique mdev device alias. It is not right to attach to the PCI > device. > >> >Mdev is bus in itself where devices are identified uniquely. So an > >> >alias > >> suffice that identity. > >> > >> Wait a sec. For mdev, what you say is correct. But here we talk about > >> devlink_port which is representing this mdev. And this devlink_port > >> is very similar to VF devlink_port. It is bound to specific PF (in > >> case of mdev it could be PF-VF). > >> > >But mdev port has unique phys_port_name in system, it incorrect to use > PF/VF prefix. > > Why incorrect? It is always bound to pf/vf? > Because mdev device already identified using its unique alias. Why does it need prefix? Mdev core generating the alias is not aware of the prefixes applied devlink. it shouldn't be. We want more letters towards uniqueness of the alias and filling it up with such prefixes doesn't make sense. > >What in hypothetical case, mdev is not on top of PCI... > > Okay, let's go hypothetical. In that case, it is going to be on top of something > else, wouldn't it? Yes, it will be. But just because it is on top of something, doesn't mean we include the whole parent dev, its bridge, its rc hierarchy here. There should be a need. It was needed in PF/VF case due to overlapping numbers of VFs via single devlink instance. You probably missed my reply to Jakub. Here it is no overlap.
Fri, Nov 08, 2019 at 05:43:43PM CET, parav@mellanox.com wrote: > > >> -----Original Message----- >> From: Jiri Pirko <jiri@resnulli.us> >> >> >> On Fri, 8 Nov 2019 01:44:53 +0000, Parav Pandit wrote: >> >> >> > > I'm talking about netlink attributes. I'm not suggesting to >> >> >> > > sprintf it all into the phys_port_name. >> >> >> > > >> >> >> > I didn't follow your comment. For devlink port show command >> >> >> > output you said, >> >> >> > >> >> >> > "Surely those devices are anchored in on of the PF (or possibly >> >> >> > VFs) that should be exposed here from the start." >> >> >> > So I was trying to explain why we don't expose PF/VF detail in >> >> >> > the port attributes which contains >> >> >> > (a) flavour >> >> >> > (b) netdev representor (name derived from phys_port_name) >> >> >> > (c) mdev alias >> >> >> > >> >> >> > Can you please describe which netlink attribute I missed? >> >> >> >> >> >> Identification of the PCI device. The PCI devices are not linked >> >> >> to devlink ports, so the sysfs hierarchy (a) is irrelevant, (b) >> >> >> may not be visible in multi- host (or SmartNIC). >> >> >> >> >> > >> >> >It's the unique mdev device alias. It is not right to attach to the PCI >> device. >> >> >Mdev is bus in itself where devices are identified uniquely. So an >> >> >alias >> >> suffice that identity. >> >> >> >> Wait a sec. For mdev, what you say is correct. But here we talk about >> >> devlink_port which is representing this mdev. And this devlink_port >> >> is very similar to VF devlink_port. It is bound to specific PF (in >> >> case of mdev it could be PF-VF). >> >> >> >But mdev port has unique phys_port_name in system, it incorrect to use >> PF/VF prefix. >> >> Why incorrect? It is always bound to pf/vf? >> >Because mdev device already identified using its unique alias. Why does it need prefix? >Mdev core generating the alias is not aware of the prefixes applied devlink. it shouldn't be. >We want more letters towards uniqueness of the alias and filling it up with such prefixes doesn't make sense. mdev belongs undev pf/vf, no matter how uniqueue the name/alias is. Well, I don't really need those in the phys_port_name, mainly simply because they would not fit. However, I believe that you should fillup the PF/VF devlink netlink attrs. Note that we are not talking here about the actual mdev, but rather devlink_port associated with this mdev. And devlink port should have this info. > >> >What in hypothetical case, mdev is not on top of PCI... >> >> Okay, let's go hypothetical. In that case, it is going to be on top of something >> else, wouldn't it? >Yes, it will be. But just because it is on top of something, doesn't mean we include the whole parent dev, its bridge, its rc hierarchy here. >There should be a need. >It was needed in PF/VF case due to overlapping numbers of VFs via single devlink instance. You probably missed my reply to Jakub. Sure. Again, I don't really care about having that in phys_port_name. But please fillup the attrs. >Here it is no overlap. >
> -----Original Message----- > From: Jiri Pirko <jiri@resnulli.us> [..] > Well, I don't really need those in the phys_port_name, mainly simply because > they would not fit. However, I believe that you should fillup the PF/VF devlink > netlink attrs. > > Note that we are not talking here about the actual mdev, but rather > devlink_port associated with this mdev. And devlink port should have this info. > > > > > >> >What in hypothetical case, mdev is not on top of PCI... > >> > >> Okay, let's go hypothetical. In that case, it is going to be on top > >> of something else, wouldn't it? > >Yes, it will be. But just because it is on top of something, doesn't mean we > include the whole parent dev, its bridge, its rc hierarchy here. > >There should be a need. > >It was needed in PF/VF case due to overlapping numbers of VFs via single > devlink instance. You probably missed my reply to Jakub. > > Sure. Again, I don't really care about having that in phys_port_name. > But please fillup the attrs. > Ah ok. but than that would be optional attribute? Because you can have non pci based mdev, though it doesn't exist today along with devlink to my knowledge.
Fri, Nov 08, 2019 at 07:23:44PM CET, parav@mellanox.com wrote: > > >> -----Original Message----- >> From: Jiri Pirko <jiri@resnulli.us> > >[..] >> Well, I don't really need those in the phys_port_name, mainly simply because >> they would not fit. However, I believe that you should fillup the PF/VF devlink >> netlink attrs. >> >> Note that we are not talking here about the actual mdev, but rather >> devlink_port associated with this mdev. And devlink port should have this info. >> >> >> > >> >> >What in hypothetical case, mdev is not on top of PCI... >> >> >> >> Okay, let's go hypothetical. In that case, it is going to be on top >> >> of something else, wouldn't it? >> >Yes, it will be. But just because it is on top of something, doesn't mean we >> include the whole parent dev, its bridge, its rc hierarchy here. >> >There should be a need. >> >It was needed in PF/VF case due to overlapping numbers of VFs via single >> devlink instance. You probably missed my reply to Jakub. >> >> Sure. Again, I don't really care about having that in phys_port_name. >> But please fillup the attrs. >> >Ah ok. but than that would be optional attribute? >Because you can have non pci based mdev, though it doesn't exist today along with devlink to my knowledge. Non-optional now. We can always change the code to not fill it up or fill up another attr instead. no UAPI harm.
> -----Original Message----- > From: Jiri Pirko <jiri@resnulli.us> > Sent: Friday, November 8, 2019 12:34 PM > To: Parav Pandit <parav@mellanox.com> > Cc: Jakub Kicinski <jakub.kicinski@netronome.com>; > alex.williamson@redhat.com; davem@davemloft.net; kvm@vger.kernel.org; > netdev@vger.kernel.org; Saeed Mahameed <saeedm@mellanox.com>; > kwankhede@nvidia.com; leon@kernel.org; cohuck@redhat.com; Jiri Pirko > <jiri@mellanox.com>; linux-rdma@vger.kernel.org > Subject: Re: [PATCH net-next 12/19] devlink: Introduce mdev port flavour > > Fri, Nov 08, 2019 at 07:23:44PM CET, parav@mellanox.com wrote: > > > > > >> -----Original Message----- > >> From: Jiri Pirko <jiri@resnulli.us> > > > >[..] > >> Well, I don't really need those in the phys_port_name, mainly simply > >> because they would not fit. However, I believe that you should fillup > >> the PF/VF devlink netlink attrs. > >> > >> Note that we are not talking here about the actual mdev, but rather > >> devlink_port associated with this mdev. And devlink port should have this > info. > >> > >> > >> > > >> >> >What in hypothetical case, mdev is not on top of PCI... > >> >> > >> >> Okay, let's go hypothetical. In that case, it is going to be on > >> >> top of something else, wouldn't it? > >> >Yes, it will be. But just because it is on top of something, doesn't > >> >mean we > >> include the whole parent dev, its bridge, its rc hierarchy here. > >> >There should be a need. > >> >It was needed in PF/VF case due to overlapping numbers of VFs via > >> >single > >> devlink instance. You probably missed my reply to Jakub. > >> > >> Sure. Again, I don't really care about having that in phys_port_name. > >> But please fillup the attrs. > >> > >Ah ok. but than that would be optional attribute? > >Because you can have non pci based mdev, though it doesn't exist today along > with devlink to my knowledge. > > Non-optional now. We can always change the code to not fill it up or fill up > another attr instead. no UAPI harm. Ok. sounds good. Will implement this in the respin.
diff --git a/include/net/devlink.h b/include/net/devlink.h index 6bf3b9e0595a..fcffc7f7cff2 100644 --- a/include/net/devlink.h +++ b/include/net/devlink.h @@ -60,6 +60,10 @@ struct devlink_port_pci_vf_attrs { u16 vf; /* Associated PCI VF for of the PCI PF for this port. */ }; +struct devlink_port_mdev_attrs { + const char *mdev_alias; /* Unique mdev alias used for this port. */ +}; + struct devlink_port_attrs { u8 set:1, split:1, @@ -70,6 +74,7 @@ struct devlink_port_attrs { struct devlink_port_phys_attrs phys; struct devlink_port_pci_pf_attrs pci_pf; struct devlink_port_pci_vf_attrs pci_vf; + struct devlink_port_mdev_attrs mdev; }; }; @@ -802,6 +807,10 @@ void devlink_port_attrs_pci_vf_set(struct devlink_port *devlink_port, const unsigned char *switch_id, unsigned char switch_id_len, u16 pf, u16 vf); +void devlink_port_attrs_mdev_set(struct devlink_port *devlink_port, + const unsigned char *switch_id, + unsigned char switch_id_len, + const char *mdev_alias); int devlink_sb_register(struct devlink *devlink, unsigned int sb_index, u32 size, u16 ingress_pools_count, u16 egress_pools_count, u16 ingress_tc_count, diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h index b558ea88b766..db803c0d0e9f 100644 --- a/include/uapi/linux/devlink.h +++ b/include/uapi/linux/devlink.h @@ -187,6 +187,10 @@ enum devlink_port_flavour { * for the PCI VF. It is an internal * port that faces the PCI VF. */ + DEVLINK_PORT_FLAVOUR_MDEV, /* Represents eswitch port for the + * mdev device. It is an internal + * port that faces the mdev device. + */ }; enum devlink_param_cmode { @@ -424,6 +428,7 @@ enum devlink_attr { DEVLINK_ATTR_NETNS_FD, /* u32 */ DEVLINK_ATTR_NETNS_PID, /* u32 */ DEVLINK_ATTR_NETNS_ID, /* u32 */ + DEVLINK_ATTR_PORT_MDEV_ALIAS, /* string */ /* add new attributes above here, update the policy in devlink.c */ diff --git a/net/core/devlink.c b/net/core/devlink.c index 97e9a2246929..cb7b6ef5d520 100644 --- a/net/core/devlink.c +++ b/net/core/devlink.c @@ -542,6 +542,11 @@ static int devlink_nl_port_attrs_put(struct sk_buff *msg, attrs->pci_vf.vf)) return -EMSGSIZE; break; + case DEVLINK_PORT_FLAVOUR_MDEV: + if (nla_put_string(msg, DEVLINK_ATTR_PORT_MDEV_ALIAS, + attrs->mdev.mdev_alias)) + return -EMSGSIZE; + break; case DEVLINK_PORT_FLAVOUR_PHYSICAL: case DEVLINK_PORT_FLAVOUR_CPU: case DEVLINK_PORT_FLAVOUR_DSA: @@ -6617,6 +6622,30 @@ void devlink_port_attrs_pci_vf_set(struct devlink_port *devlink_port, } EXPORT_SYMBOL_GPL(devlink_port_attrs_pci_vf_set); +/** + * devlink_port_attrs_mdev_set - Set mdev port attributes + * + * @devlink_port: devlink port + * @switch_id: if the port is part of switch, this is buffer with ID, + * otherwise this is NULL + * @switch_id_len: length of the switch_id buffer + * @mdev_alias: unique mdev alias for this port used to form phys_port_name + */ +void devlink_port_attrs_mdev_set(struct devlink_port *devlink_port, + const unsigned char *switch_id, + unsigned char switch_id_len, + const char *mdev_alias) +{ + struct devlink_port_attrs *attrs = &devlink_port->attrs; + + if (__devlink_port_attrs_set(devlink_port, + DEVLINK_PORT_FLAVOUR_MDEV, + switch_id, switch_id_len)) + return; + attrs->mdev.mdev_alias = mdev_alias; +} +EXPORT_SYMBOL_GPL(devlink_port_attrs_mdev_set); + static int __devlink_port_phys_port_name_get(struct devlink_port *devlink_port, char *name, size_t len) { @@ -6649,6 +6678,9 @@ static int __devlink_port_phys_port_name_get(struct devlink_port *devlink_port, n = snprintf(name, len, "pf%uvf%u", attrs->pci_vf.pf, attrs->pci_vf.vf); break; + case DEVLINK_PORT_FLAVOUR_MDEV: + n = snprintf(name, len, "p%s", attrs->mdev.mdev_alias); + break; } if (n >= len)
Introduce a new mdev port flavour for mdev devices. PF. Prepare such port's phys_port_name using unique mdev alias. An example output for eswitch ports with one physical port and one mdev port: $ devlink port show pci/0000:06:00.0/65535: type eth netdev p0 flavour physical port 0 pci/0000:06:00.0/32768: type eth netdev p1b0348cf880a flavour mdev alias 1b0348cf880a Signed-off-by: Parav Pandit <parav@mellanox.com> --- include/net/devlink.h | 9 +++++++++ include/uapi/linux/devlink.h | 5 +++++ net/core/devlink.c | 32 ++++++++++++++++++++++++++++++++ 3 files changed, 46 insertions(+)