Message ID | 20190808141255.45236-1-parav@mellanox.com (mailing list archive) |
---|---|
Series | Simplify mtty driver and mdev core |
On Thu, 8 Aug 2019 09:12:53 -0500
Parav Pandit <parav@mellanox.com> wrote:

> Currently mtty sample driver uses mdev state and UUID in convoluted way to
> generate an interrupt.
> It uses several translations from mdev_state to mdev_device to mdev uuid.
> After which it does linear search of long uuid comparison to
> find out mdev_state in mtty_trigger_interrupt().
> mdev_state is already available while generating interrupt from which all
> such translations are done to reach back to mdev_state.
>
> These translations are done during interrupt generation path.
> This is unnecessary and redundant.

Is the interrupt handling efficiency of this particular sample driver
really relevant, or is its purpose more to illustrate the API and
provide a proof of concept? If we go to the trouble to optimize the
sample driver and remove this interface from the API, what do we lose?

This interface was added via commit:

99e3123e3d72 vfio-mdev: Make mdev_device private and abstract interfaces

Where the goal was to create a more formal interface and abstract
driver access to the struct mdev_device. In part this served to make
out-of-tree mdev vendor drivers more supportable; the object is
considered opaque and access is provided via an API rather than through
direct structure fields.

I believe that the NVIDIA GRID mdev driver does make use of this
interface and it's likely included in the sample driver specifically so
that there is an in-kernel user for it (i.e., specifically to avoid it
being removed so casually). An interesting feature of the NVIDIA mdev
driver is that I believe it has portions that run in userspace. As we
know, mdevs are named with a UUID, so I can imagine there are some
efficiencies to be gained in having direct access to the UUID for a
device when interacting with userspace, rather than repeatedly parsing
it from a device name. Is that really something we want to make more
difficult in order to optimize a sample driver? Knowing that an mdev
device uses a UUID for its name, as tools like libvirt and mdevctl
expect, is it really worthwhile to remove such a trivial API?

> Hence,
> Patch-1 simplifies mtty sample driver to directly use mdev_state.
>
> Patch-2, Since no production driver uses mdev_uuid(), simplifies and
> removes redundant mdev_uuid() exported symbol.

s/no production driver/no in-kernel production driver/

I'd be interested to hear how the NVIDIA folks make use of this API
interface. Thanks,

Alex

> ---
> Changelog:
> v1->v2:
>  - Corrected email of Kirti
>  - Updated cover letter commit log to address comment from Cornelia
>  - Added Reviewed-by tag
> v0->v1:
>  - Updated commit log
>
> Parav Pandit (2):
>   vfio-mdev/mtty: Simplify interrupt generation
>   vfio/mdev: Removed unused and redundant API for mdev UUID
>
>  drivers/vfio/mdev/mdev_core.c |  6 ------
>  include/linux/mdev.h          |  1 -
>  samples/vfio-mdev/mtty.c      | 39 +++++++----------------------------
>  3 files changed, 8 insertions(+), 38 deletions(-)
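The round trip described in the cover letter can be illustrated with a small userspace sketch. This is not the mtty code: `struct dev_state`, `find_state_by_uuid()`, `trigger_irq()`, and `trigger_irq_by_uuid()` are hypothetical names, and the device list is assumed to be a plain array rather than a kernel list. It only contrasts finding the per-device state by comparing 16-byte UUIDs with using the state pointer the caller already holds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define UUID_LEN 16

/* Hypothetical per-device state; stands in for a driver's private data. */
struct dev_state {
    unsigned char uuid[UUID_LEN];
    int irq_count;              /* number of interrupts "raised" so far */
};

/* Linear search: compare the wanted UUID against every known device. */
static struct dev_state *find_state_by_uuid(struct dev_state *devs, size_t n,
                                            const unsigned char *uuid)
{
    for (size_t i = 0; i < n; i++) {
        if (memcmp(devs[i].uuid, uuid, UUID_LEN) == 0)
            return &devs[i];
    }
    return NULL;
}

/* Direct path: the caller already holds the state, so no lookup is needed. */
static void trigger_irq(struct dev_state *state)
{
    assert(state != NULL);
    state->irq_count++;
}

/* Round-trip path: state -> UUID -> search -> the same state again. */
static int trigger_irq_by_uuid(struct dev_state *devs, size_t n,
                               const unsigned char *uuid)
{
    struct dev_state *state = find_state_by_uuid(devs, n, uuid);

    if (!state)
        return -1;
    trigger_irq(state);
    return 0;
}
```

Both paths end up incrementing the same counter; the second one simply pays for a search whose answer the caller already had, which is the redundancy the cover letter objects to.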
On Thu, 8 Aug 2019 17:02:47 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Thu, 8 Aug 2019 09:12:53 -0500
> Parav Pandit <parav@mellanox.com> wrote:
>
> > [...]
>
> Is the interrupt handling efficiency of this particular sample driver
> really relevant, or is its purpose more to illustrate the API and
> provide a proof of concept? If we go to the trouble to optimize the
> sample driver and remove this interface from the API, what do we lose?

Not sure how useful the sample driver is as a template; blindly copying
its interrupt handling is probably not a good idea.

> This interface was added via commit:
>
> 99e3123e3d72 vfio-mdev: Make mdev_device private and abstract interfaces
>
> Where the goal was to create a more formal interface and abstract
> driver access to the struct mdev_device. In part this served to make
> out-of-tree mdev vendor drivers more supportable; the object is
> considered opaque and access is provided via an API rather than through
> direct structure fields.
>
> I believe that the NVIDIA GRID mdev driver does make use of this
> interface and it's likely included in the sample driver specifically so
> that there is an in-kernel user for it (i.e., specifically to avoid it
> being removed so casually). An interesting feature of the NVIDIA mdev
> driver is that I believe it has portions that run in userspace. As we
> know, mdevs are named with a UUID, so I can imagine there are some
> efficiencies to be gained in having direct access to the UUID for a
> device when interacting with userspace, rather than repeatedly parsing
> it from a device name. Is that really something we want to make more
> difficult in order to optimize a sample driver? Knowing that an mdev
> device uses a UUID for its name, as tools like libvirt and mdevctl
> expect, is it really worthwhile to remove such a trivial API?

Ripping out the uuid is a bad idea, I agree. The device name simply is
not a good replacement for that.

If there's a good use case for using the uuid in a vendor driver, let's
keep the accessor. But then we probably should either leave the sample
driver alone, or add a more compelling use of the API there.
On 8/9/2019 4:32 AM, Alex Williamson wrote:
> On Thu, 8 Aug 2019 09:12:53 -0500
> Parav Pandit <parav@mellanox.com> wrote:
>
> [...]
>
> I believe that the NVIDIA GRID mdev driver does make use of this
> interface and it's likely included in the sample driver specifically so
> that there is an in-kernel user for it (i.e., specifically to avoid it
> being removed so casually). An interesting feature of the NVIDIA mdev
> driver is that I believe it has portions that run in userspace. As we
> know, mdevs are named with a UUID, so I can imagine there are some
> efficiencies to be gained in having direct access to the UUID for a
> device when interacting with userspace, rather than repeatedly parsing
> it from a device name.

That's right.

> Is that really something we want to make more
> difficult in order to optimize a sample driver? Knowing that an mdev
> device uses a UUID for its name, as tools like libvirt and mdevctl
> expect, is it really worthwhile to remove such a trivial API?
>
> [...]
>
> I'd be interested to hear how the NVIDIA folks make use of this API
> interface. Thanks,

Yes, the NVIDIA mdev driver does use this interface. I don't agree with
removing the mdev_uuid() interface.

Thanks,
Kirti
> -----Original Message-----
> From: Kirti Wankhede <kwankhede@nvidia.com>
> Sent: Monday, August 12, 2019 5:06 PM
> To: Alex Williamson <alex.williamson@redhat.com>; Parav Pandit <parav@mellanox.com>
> Cc: kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cohuck@redhat.com; cjia@nvidia.com
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> On 8/9/2019 4:32 AM, Alex Williamson wrote:
>
> [...]
>
> > I'd be interested to hear how the NVIDIA folks make use of this API
> > interface. Thanks,
>
> Yes, the NVIDIA mdev driver does use this interface. I don't agree with
> removing the mdev_uuid() interface.

We need to ask Greg or Linus about the kernel policy on whether an API
should exist without an in-kernel driver. We don't add such APIs in
netdev, rdma, and possibly other subsystems. Where can we find this
mdev driver in-tree?
Hi Alex,

> -----Original Message-----
> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Friday, August 9, 2019 4:33 AM
> To: Parav Pandit <parav@mellanox.com>
> Cc: kvm@vger.kernel.org; kwankhede@nvidia.com; linux-kernel@vger.kernel.org; cohuck@redhat.com; cjia@nvidia.com
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> [...]
>
> This interface was added via commit:
>
> 99e3123e3d72 vfio-mdev: Make mdev_device private and abstract interfaces
>
> Where the goal was to create a more formal interface and abstract driver
> access to the struct mdev_device. In part this served to make out-of-tree mdev
> vendor drivers more supportable; the object is considered opaque and access is
> provided via an API rather than through direct structure fields.

It is not common practice in the kernel to provide an exported symbol
for every single field of a structure.

> I believe that the NVIDIA GRID mdev driver does make use of this interface and
> it's likely included in the sample driver specifically so that there is an in-kernel
> user for it (i.e., specifically to avoid it being removed so casually). An interesting
> feature of the NVIDIA mdev driver is that I believe it has portions that run in
> userspace. As we know, mdevs are named with a UUID, so I can imagine there
> are some efficiencies to be gained in having direct access to the UUID for a
> device when interacting with userspace, rather than repeatedly parsing it from
> a device name.

Can you please point to the kernel code that accesses the UUID?

> Is that really something we want to make more difficult in
> order to optimize a sample driver? Knowing that an mdev device uses a UUID
> for its name, as tools like libvirt and mdevctl expect, is it really worthwhile to
> remove such a trivial API?

Yes, it is worthwhile to not keep any dead code in the kernel when there
is no in-kernel driver using it. Did I miss a caller?
The sample driver sets the wrong example of how/when the uuid is used.
There has to be a better example to show how/when/why to use it.

To my understanding, an out-of-tree driver doesn't justify adding an API.
I would like to hear from Greg and others about including an API without
a user, as I haven't come across such a practice in other subsystems such
as nvme, netdev, rdma.
On Tue, 13 Aug 2019 14:40:02 +0000
Parav Pandit <parav@mellanox.com> wrote:

> > -----Original Message-----
> > From: Kirti Wankhede <kwankhede@nvidia.com>
> > Sent: Monday, August 12, 2019 5:06 PM
> >
> > [...]
> >
> > Yes, the NVIDIA mdev driver does use this interface. I don't agree with
> > removing the mdev_uuid() interface.
>
> We need to ask Greg or Linus about the kernel policy on whether an API
> should exist without an in-kernel driver. We don't add such APIs in
> netdev, rdma, and possibly other subsystems. Where can we find this
> mdev driver in-tree?

We probably would not have added the API only for an out-of-tree
driver, but we do have a sample driver that uses it, even if it's
rather convoluted. The sample driver is showing an example of using
the API, which is rather its purpose more so than absolutely efficient
interrupt handling. Also, let's not overstate what this particular API
callback provides; it's simply access to the uuid of the device, which
is a fundamental property of a mediated device. This API was added
simply to provide data abstraction, allowing the struct mdev_device to
be opaque to vendor drivers. Thanks,

Alex
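The data-abstraction argument can be sketched in plain C. The names below (`struct device_handle`, `device_handle_uuid()`, `device_handle_create()`) are hypothetical and are not the mdev API; the sketch only assumes the usual opaque-type pattern, in which users see an incomplete type plus accessor functions and only the core knows the structure layout.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* ---- "Public header": users see only an incomplete type. ---- */
struct device_handle;                       /* hypothetical opaque object */
const unsigned char *device_handle_uuid(const struct device_handle *h);
struct device_handle *device_handle_create(const unsigned char uuid[16]);

/* ---- "Core": the only place that knows the layout. ---- */
struct device_handle {
    unsigned char uuid[16];
    int private_refcount;                   /* field users should not touch */
};

/* Accessor: read-only view of the UUID without exposing the struct. */
const unsigned char *device_handle_uuid(const struct device_handle *h)
{
    assert(h != NULL);
    return h->uuid;
}

/* Hypothetical constructor so the example can create an object. */
struct device_handle *device_handle_create(const unsigned char uuid[16])
{
    struct device_handle *h = calloc(1, sizeof(*h));

    if (h)
        memcpy(h->uuid, uuid, sizeof(h->uuid));
    return h;
}
```

With this split the core can reorder or add fields without touching any user of the accessor. The cost is one exported function per exposed property, which is the trade-off being argued over in this thread.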
> -----Original Message-----
> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Tuesday, August 13, 2019 8:23 PM
> To: Parav Pandit <parav@mellanox.com>
> Cc: Kirti Wankhede <kwankhede@nvidia.com>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cohuck@redhat.com; cjia@nvidia.com
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> On Tue, 13 Aug 2019 14:40:02 +0000
> Parav Pandit <parav@mellanox.com> wrote:
>
> [...]
>
> > We need to ask Greg or Linus about the kernel policy on whether an API
> > should exist without an in-kernel driver. We don't add such APIs in
> > netdev, rdma, and possibly other subsystems. Where can we find this mdev
> > driver in-tree?
>
> We probably would not have added the API only for an out-of-tree driver, but
> we do have a sample driver that uses it, even if it's rather convoluted. The
> sample driver is showing an example of using the API, which is rather its
> purpose more so than absolutely efficient interrupt handling.

To show API use, it doesn't have to be convoluted, and certainly not in
interrupt handling code.
It could be just dev_info(" UUID print..)
But the whole point is to have a useful API that a non-sample driver
needs to use. And there is none.

The bigger objective, which I wanted to discuss after this cleanup patch,
is to expand mdev to have more user-friendly device names.

Before we get there, I should include a patch that eliminates storing
the UUID itself in the mdev_device.

> Also, let's not
> overstate what this particular API callback provides; it's simply access to the
> uuid of the device, which is a fundamental property of a mediated device.

This fundamental property is already available in the form of the device name.

> This API was added simply to provide data abstraction, allowing the struct
> mdev_device to be opaque to vendor drivers. Thanks,

I get that part. I prefer to remove the UUID itself from the structure,
and therefore removing this API makes a lot more sense?
On Tue, 13 Aug 2019 16:28:53 +0000
Parav Pandit <parav@mellanox.com> wrote:

> The bigger objective, which I wanted to discuss after this cleanup patch,
> is to expand mdev to have more user-friendly device names.

Uh, what is unfriendly about uuids?

> Before we get there, I should include a patch that eliminates storing
> the UUID itself in the mdev_device.

I do not think that's a great idea. A uuid is, well, a unique
identifier. What's so bad about it that it should be eliminated?

> > Also, let's not
> > overstate what this particular API callback provides; it's simply access to the
> > uuid of the device, which is a fundamental property of a mediated device.
>
> This fundamental property is already available in the form of the device name.

Let me reiterate that the device name is a string containing a
formatted uuid, not a uuid.

> > This API was added simply to provide data abstraction, allowing the struct
> > mdev_device to be opaque to vendor drivers. Thanks,
>
> I get that part. I prefer to remove the UUID itself from the structure,
> and therefore removing this API makes a lot more sense?

What I don't get is why you want to eliminate the uuid in the first
place? Again, what's so bad about it?
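The distinction between a uuid and a string containing a formatted uuid can be made concrete with a sketch of what a consumer would have to do if only the name were available. `parse_uuid_name()` and `hex_val()` are hypothetical helpers, not kernel functions, and the sketch assumes the bytes are stored in the order the hex digits appear; a real implementation would have to follow whatever byte order the binary type defines.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define UUID_BYTES 16
#define UUID_STR_LEN 36   /* 32 hex digits plus 4 hyphens */

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = (char)tolower((unsigned char)c);
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/*
 * Parse "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" into 16 bytes, in the order
 * the digits appear. Returns 0 on success, -1 on malformed input.
 */
static int parse_uuid_name(const char *name, unsigned char out[UUID_BYTES])
{
    size_t pos = 0;

    assert(name != NULL && out != NULL);
    if (strlen(name) != UUID_STR_LEN)
        return -1;

    for (size_t i = 0; i < UUID_BYTES; i++) {
        /* Hyphens sit before bytes 4, 6, 8 and 10. */
        if (i == 4 || i == 6 || i == 8 || i == 10) {
            if (name[pos++] != '-')
                return -1;
        }
        int hi = hex_val(name[pos]);
        int lo = hex_val(name[pos + 1]);
        if (hi < 0 || lo < 0)
            return -1;
        out[i] = (unsigned char)((hi << 4) | lo);
        pos += 2;
    }
    return 0;
}
```

Every caller that wants the binary value has to repeat this validation and conversion, and it fails outright once the name is allowed to be an arbitrary string, whereas an accessor returning the stored value needs neither.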
On Tue, Aug 13, 2019 at 02:40:02PM +0000, Parav Pandit wrote:
> We need to ask Greg or Linus about the kernel policy on whether an API
> should exist without an in-kernel driver.
> We don't add such APIs in netdev, rdma, and possibly other subsystems.
> Where can we find this mdev driver in-tree?

The clear policy is that we don't keep such symbols around. Been
there, done that, only recently again.

The other interesting thing is that, given the amount of code nvidia and
partner developers have pushed into the kernel tree for exclusive use of
their driver, it should be clearly established by now that it is a
derived work, but that is for a different discussion.
On Tue, 13 Aug 2019 16:28:53 +0000 Parav Pandit <parav@mellanox.com> wrote: > > -----Original Message----- > > From: Alex Williamson <alex.williamson@redhat.com> > > Sent: Tuesday, August 13, 2019 8:23 PM > > To: Parav Pandit <parav@mellanox.com> > > Cc: Kirti Wankhede <kwankhede@nvidia.com>; kvm@vger.kernel.org; linux- > > kernel@vger.kernel.org; cohuck@redhat.com; cjia@nvidia.com > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > On Tue, 13 Aug 2019 14:40:02 +0000 > > Parav Pandit <parav@mellanox.com> wrote: > > > > > > -----Original Message----- > > > > From: Kirti Wankhede <kwankhede@nvidia.com> > > > > Sent: Monday, August 12, 2019 5:06 PM > > > > To: Alex Williamson <alex.williamson@redhat.com>; Parav Pandit > > > > <parav@mellanox.com> > > > > Cc: kvm@vger.kernel.org; linux-kernel@vger.kernel.org; > > > > cohuck@redhat.com; cjia@nvidia.com > > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > > > > > > > > > > > > > On 8/9/2019 4:32 AM, Alex Williamson wrote: > > > > > On Thu, 8 Aug 2019 09:12:53 -0500 Parav Pandit > > > > > <parav@mellanox.com> wrote: > > > > > > > > > >> Currently mtty sample driver uses mdev state and UUID in > > > > >> convoluated way to generate an interrupt. > > > > >> It uses several translations from mdev_state to mdev_device to mdev > > uuid. > > > > >> After which it does linear search of long uuid comparision to > > > > >> find out mdev_state in mtty_trigger_interrupt(). > > > > >> mdev_state is already available while generating interrupt from > > > > >> which all such translations are done to reach back to mdev_state. > > > > >> > > > > >> This translations are done during interrupt generation path. > > > > >> This is unnecessary and reduandant. > > > > > > > > > > Is the interrupt handling efficiency of this particular sample > > > > > driver really relevant, or is its purpose more to illustrate the > > > > > API and provide a proof of concept? 
If we go to the trouble to > > > > > optimize the sample driver and remove this interface from the API, what > > do we lose? > > > > > > > > > > This interface was added via commit: > > > > > > > > > > 99e3123e3d72 vfio-mdev: Make mdev_device private and abstract > > > > > interfaces > > > > > > > > > > Where the goal was to create a more formal interface and abstract > > > > > driver access to the struct mdev_device. In part this served to > > > > > make out-of-tree mdev vendor drivers more supportable; the object > > > > > is considered opaque and access is provided via an API rather than > > > > > through direct structure fields. > > > > > > > > > > I believe that the NVIDIA GRID mdev driver does make use of this > > > > > interface and it's likely included in the sample driver > > > > > specifically so that there is an in-kernel user for it (ie. > > > > > specifically to avoid it being removed so casually). An > > > > > interesting feature of the NVIDIA mdev driver is that I believe it has > > portions that run in userspace. > > > > > As we know, mdevs are named with a UUID, so I can imagine there > > > > > are some efficiencies to be gained in having direct access to the > > > > > UUID for a device when interacting with userspace, rather than > > > > > repeatedly parsing it from a device name. > > > > > > > > That's right. > > > > > > > > > Is that really something we want to make more difficult in order > > > > > to optimize a sample driver? Knowing that an mdev device uses a > > > > > UUID for it's name, as tools like libvirt and mdevctl expect, is > > > > > it really worthwhile to remove such a trivial API? > > > > > > > > > >> Hence, > > > > >> Patch-1 simplifies mtty sample driver to directly use mdev_state. > > > > >> > > > > >> Patch-2, Since no production driver uses mdev_uuid(), simplifies > > > > >> and removes redandant mdev_uuid() exported symbol. 
> > > > > > > > > > s/no production driver/no in-kernel production driver/ > > > > > > > > > > I'd be interested to hear how the NVIDIA folks make use of this > > > > > API interface. Thanks, > > > > > > > > > > > > > Yes, NVIDIA mdev driver do use this interface. I don't agree on > > > > removing > > > > mdev_uuid() interface. > > > > > > > We need to ask Greg or Linus on the kernel policy on whether an API > > > should exist without in-kernel driver. We don't add such API in > > > netdev, rdma and possibly other subsystem. Where can we find this mdev > > > driver in-tree? > > > > We probably would not have added the API only for an out of tree driver, but > > we do have a sample driver that uses it, even if it's rather convoluted. The > > sample driver is showing an example of using the API, which is rather its > > purpose more so than absolutely efficient interrupt handling. > For showing API use, it doesn't have to convoluted that too in interrupt handling code. > It could be just dev_info(" UUID print..) I was thinking we could have the mtty driver expose a vendor sysfs attribute providing the UUID if you insist on cleaning up the interrupt path. > But the whole point is to have useful API that non sample driver need to use. > And there is none. Kirti has already indicated this API was useful and it's not a burden to maintain it. The trouble is that we don't have any in-kernel mdev drivers sophisticated enough to have the same kernel/user split as the NVIDIA driver, but that's a feature that I believe we wish to continue to support. The kernel exposes the device by UUID, userspace references the device by UUID, so it seems intuitive to provide a core API to retrieve the UUID for a device without parsing it from a device name string. We can continue to add trivial sample driver use cases or we can just agree that this is a useful interface for UUID based device. 
> In bigger objective, I wanted to discuss post this cleanup patch, is > to expand mdev to have more user friendly device names. "Friendly" is a matter of opinion. UUIDs provide us with consistent names, effectively avoids name collisions which in turn effectively avoids races in device creation, they're easy to deal with, and they're well known. Naming things is hard. Dealing with arbitrary user generated names or defining a policy around acceptable naming is hard. > Before we reach there, I should include a patch that eliminates > storing UUID itself in the mdev_device. > > > Also, let's not > > overstate what this particular API callback provides, it's simply > > access to the uuid of the device, which is a fundamental property > > of a mediated device. > This fundamental property is available in form of device name already. So you're wanting to optimize a sample driver interrupt handler in order to eliminate an API which makes the UUID available without parsing it from a string? And the complexity grows when you later propose that the string is now arbitrary? It doesn't seem like a good replacement. > > API was added simply to provide data abstraction, allowing the > > struct mdev_device to be opaque to vendor drivers. Thanks, > > > I get that part. I prefer to remove the UUID itself from the > structure and therefore removing this API makes lot more sense? Mdev and support tools around mdev are based on UUIDs because it's defined in the documentation. I don't think it's as simple as saying "voila, UUID dependencies are removed, users are free to use arbitrary strings". We'd need to create some kind of naming policy, what characters are allows so that we can potentially expand the creation parameters as has been proposed a couple times, how do we deal with collisions and races, and why should we make such a change when a UUID is a perfectly reasonable devices name. Thanks, Alex
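To illustrate what "parsing it from a device name string" involves, here is a minimal userspace sketch in C. It is not taken from the mdev core or from any vendor driver. The helper names `uuid_from_name()` and `hex_val()` are hypothetical, and the sketch assumes the device name is a canonical 36-character UUID string.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <string.h>

#define UUID_STR_LEN 36 /* canonical 8-4-4-4-12 text form */

/* Convert one hex digit to its value; the caller has checked isxdigit(). */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower((unsigned char)c) - 'a' + 10;
}

/*
 * Hypothetical helper (not a kernel API): parse a device name that is
 * expected to be a canonical UUID string into 16 raw bytes, in the
 * order the digits appear in the text.
 * Returns 0 on success, -1 if the name is not a canonical UUID string.
 * On failure, the contents of out[] are unspecified.
 */
static int uuid_from_name(const char *name, uint8_t out[16])
{
    size_t i, n = 0;

    assert(name != NULL && out != NULL);
    if (strlen(name) != UUID_STR_LEN)
        return -1;

    for (i = 0; i < UUID_STR_LEN; i++) {
        /* Dashes sit at fixed offsets in the canonical form. */
        if (i == 8 || i == 13 || i == 18 || i == 23) {
            if (name[i] != '-')
                return -1;
            continue;
        }
        if (!isxdigit((unsigned char)name[i]) ||
            !isxdigit((unsigned char)name[i + 1]))
            return -1;
        out[n++] = (uint8_t)((hex_val(name[i]) << 4) | hex_val(name[i + 1]));
        i++; /* two hex digits were consumed */
    }
    return 0;
}
```

A caller would pass the device name and a 16-byte buffer. A name that is not a UUID, such as an arbitrary string, is rejected, which is the extra case a driver would have to handle if names were no longer guaranteed to be UUIDs.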
On Tue, Aug 13, 2019 at 09:37:21AM -0700, Christoph Hellwig wrote:
> On Tue, Aug 13, 2019 at 02:40:02PM +0000, Parav Pandit wrote:
> > We need to ask Greg or Linus on the kernel policy on whether an API should exist without in-kernel driver.

I "love" it when people try to ask a question of me and they don't actually cc: me. That means they really do not want the answer (or they already know it...)

Thanks Christoph for adding me here.

The policy is that the api should not exist at all, everyone knows this, why is this even a question?

> > We don't add such API in netdev, rdma and possibly other subsystem.
> > Where can we find this mdev driver in-tree?
>
> The clear policy is that we don't keep such symbols around. Been
> there done that only recently again.

Agreed. If anyone knows of anything else that isn't being used, we will be glad to free up the space by cleaning it up.

> The other interesting thing is the amount of code nvidia and partner
> developers have pushed into the kernel tree for exclusive use of their
> driver it should be clearly established by now that it is a derived
> work, but that is for a different discussion.

That's a discussion the lawyers on their side keep wanting us to ignore, it's as if they think we are stupid and they are "pulling one over on us."

ugh...

thanks,

greg "not a lawyer, but spends lots of time with them" k-h
Hi Christoph, Greg,

> -----Original Message-----
> From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Sent: Tuesday, August 13, 2019 11:10 PM
> To: Christoph Hellwig <hch@infradead.org>; Parav Pandit <parav@mellanox.com>
> Cc: Kirti Wankhede <kwankhede@nvidia.com>; Alex Williamson <alex.williamson@redhat.com>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cohuck@redhat.com; cjia@nvidia.com
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> On Tue, Aug 13, 2019 at 09:37:21AM -0700, Christoph Hellwig wrote:
> > On Tue, Aug 13, 2019 at 02:40:02PM +0000, Parav Pandit wrote:
> > > We need to ask Greg or Linus on the kernel policy on whether an API should exist without in-kernel driver.
>
> I "love" it when people try to ask a question of me and they don't actually cc: me.
> That means they really do not want the answer (or they already know it...)
>
> Thanks Christoph for adding me here.
>
I pretty much knew your answer, and I was just hinting to Kirti that if you ask Greg you would get the same answer.
So we had better clean up without reaching out to you. :-)

> The policy is that the api should not exist at all, everyone knows this, why is this
> even a question?
>
Yes, I am aware of this. The few subsystems in which I have worked have followed this policy carefully.
But when I heard of a different policy for mdev, I asked for others' wisdom.

> > > We don't add such API in netdev, rdma and possibly other subsystem.
> > > Where can we find this mdev driver in-tree?
> >
> > The clear policy is that we don't keep such symbols around. Been
> > there done that only recently again.
>
> Agreed. If anyone knows of anything else that isn't being used, we will be glad
> to free up the space by cleaning it up.
>
Ok, so this small patchset makes sense.
Thanks for the ack and direction, Christoph and Greg.
> -----Original Message----- > From: Alex Williamson <alex.williamson@redhat.com> > Sent: Tuesday, August 13, 2019 10:42 PM > To: Parav Pandit <parav@mellanox.com> > Cc: Kirti Wankhede <kwankhede@nvidia.com>; kvm@vger.kernel.org; linux- > kernel@vger.kernel.org; cohuck@redhat.com; cjia@nvidia.com > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > On Tue, 13 Aug 2019 16:28:53 +0000 > Parav Pandit <parav@mellanox.com> wrote: > > > > -----Original Message----- > > > From: Alex Williamson <alex.williamson@redhat.com> > > > Sent: Tuesday, August 13, 2019 8:23 PM > > > To: Parav Pandit <parav@mellanox.com> > > > Cc: Kirti Wankhede <kwankhede@nvidia.com>; kvm@vger.kernel.org; > > > linux- kernel@vger.kernel.org; cohuck@redhat.com; cjia@nvidia.com > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > > > On Tue, 13 Aug 2019 14:40:02 +0000 > > > Parav Pandit <parav@mellanox.com> wrote: > > > > > > > > -----Original Message----- > > > > > From: Kirti Wankhede <kwankhede@nvidia.com> > > > > > Sent: Monday, August 12, 2019 5:06 PM > > > > > To: Alex Williamson <alex.williamson@redhat.com>; Parav Pandit > > > > > <parav@mellanox.com> > > > > > Cc: kvm@vger.kernel.org; linux-kernel@vger.kernel.org; > > > > > cohuck@redhat.com; cjia@nvidia.com > > > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > > > > > > > > > > > > > > > > > On 8/9/2019 4:32 AM, Alex Williamson wrote: > > > > > > On Thu, 8 Aug 2019 09:12:53 -0500 Parav Pandit > > > > > > <parav@mellanox.com> wrote: > > > > > > > > > > > >> Currently mtty sample driver uses mdev state and UUID in > > > > > >> convoluated way to generate an interrupt. > > > > > >> It uses several translations from mdev_state to mdev_device > > > > > >> to mdev > > > uuid. > > > > > >> After which it does linear search of long uuid comparision to > > > > > >> find out mdev_state in mtty_trigger_interrupt(). 
> > > > > >> mdev_state is already available while generating interrupt > > > > > >> from which all such translations are done to reach back to > mdev_state. > > > > > >> > > > > > >> This translations are done during interrupt generation path. > > > > > >> This is unnecessary and reduandant. > > > > > > > > > > > > Is the interrupt handling efficiency of this particular sample > > > > > > driver really relevant, or is its purpose more to illustrate > > > > > > the API and provide a proof of concept? If we go to the > > > > > > trouble to optimize the sample driver and remove this > > > > > > interface from the API, what > > > do we lose? > > > > > > > > > > > > This interface was added via commit: > > > > > > > > > > > > 99e3123e3d72 vfio-mdev: Make mdev_device private and abstract > > > > > > interfaces > > > > > > > > > > > > Where the goal was to create a more formal interface and > > > > > > abstract driver access to the struct mdev_device. In part > > > > > > this served to make out-of-tree mdev vendor drivers more > > > > > > supportable; the object is considered opaque and access is > > > > > > provided via an API rather than through direct structure fields. > > > > > > > > > > > > I believe that the NVIDIA GRID mdev driver does make use of > > > > > > this interface and it's likely included in the sample driver > > > > > > specifically so that there is an in-kernel user for it (ie. > > > > > > specifically to avoid it being removed so casually). An > > > > > > interesting feature of the NVIDIA mdev driver is that I > > > > > > believe it has > > > portions that run in userspace. > > > > > > As we know, mdevs are named with a UUID, so I can imagine > > > > > > there are some efficiencies to be gained in having direct > > > > > > access to the UUID for a device when interacting with > > > > > > userspace, rather than repeatedly parsing it from a device name. > > > > > > > > > > That's right. 
> > > > > > > > > > > Is that really something we want to make more difficult in > > > > > > order to optimize a sample driver? Knowing that an mdev > > > > > > device uses a UUID for it's name, as tools like libvirt and > > > > > > mdevctl expect, is it really worthwhile to remove such a trivial API? > > > > > > > > > > > >> Hence, > > > > > >> Patch-1 simplifies mtty sample driver to directly use mdev_state. > > > > > >> > > > > > >> Patch-2, Since no production driver uses mdev_uuid(), > > > > > >> simplifies and removes redandant mdev_uuid() exported symbol. > > > > > > > > > > > > s/no production driver/no in-kernel production driver/ > > > > > > > > > > > > I'd be interested to hear how the NVIDIA folks make use of > > > > > > this API interface. Thanks, > > > > > > > > > > > > > > > > Yes, NVIDIA mdev driver do use this interface. I don't agree on > > > > > removing > > > > > mdev_uuid() interface. > > > > > > > > > We need to ask Greg or Linus on the kernel policy on whether an > > > > API should exist without in-kernel driver. We don't add such API > > > > in netdev, rdma and possibly other subsystem. Where can we find > > > > this mdev driver in-tree? > > > > > > We probably would not have added the API only for an out of tree > > > driver, but we do have a sample driver that uses it, even if it's > > > rather convoluted. The sample driver is showing an example of using the > API, which is rather its > > > purpose more so than absolutely efficient interrupt handling. > > For showing API use, it doesn't have to convoluted that too in interrupt > handling code. > > It could be just dev_info(" UUID print..) > > I was thinking we could have the mtty driver expose a vendor sysfs attribute > providing the UUID if you insist on cleaning up the interrupt path. > A vendor driver can add its own sysfs file per device. A while back I refactored rdma system to simplify all of such sysfs entries for several drivers. We had hurdle when we introduced namespaces to it. 
So I do not recommend adding it in the vendor driver. Rather, mdev code can add sysfs entry per device exposing its UUID. This will be unified across the vendors. And for below discussion, if the device is not based on UUID, this sysfs entry won't exist. More below. > > But the whole point is to have useful API that non sample driver need to use. > > And there is none. > > Kirti has already indicated this API was useful and it's not a burden to maintain > it. The trouble is that we don't have any in-kernel mdev drivers sophisticated > enough to have the same kernel/user split as the NVIDIA driver, but that's a > feature that I believe we wish to continue to support. The kernel exposes the > device by UUID, userspace references the device by UUID, so it seems intuitive > to provide a core API to retrieve the UUID for a device without parsing it from a > device name string. We can continue to add trivial sample driver use cases or But mdev or any vendor drivers are not parsing it anywhere today. That is why I was asking for an example production driver that explains why/how it uses it. > we can just agree that this is a useful interface for UUID based device. > > > In bigger objective, I wanted to discuss post this cleanup patch, is > > to expand mdev to have more user friendly device names. > > "Friendly" is a matter of opinion. UUIDs provide us with consistent names, > effectively avoids name collisions which in turn effectively avoids races in > device creation, they're easy to deal with, and they're well known. Naming > things is hard. Dealing with arbitrary user generated names or defining a policy > around acceptable naming is hard. > > > Before we reach there, I should include a patch that eliminates > > storing UUID itself in the mdev_device. > > > > > Also, let's not > > > overstate what this particular API callback provides, it's simply > > > access to the uuid of the device, which is a fundamental property of > > > a mediated device. 
> > This fundamental property is available in form of device name already. > > So you're wanting to optimize a sample driver interrupt handler in order to > eliminate an API which makes the UUID available without parsing it from a > string? Yes. because none production driver needs this and it can be derived from the device name. > And the complexity grows when you later propose that the string is now > arbitrary? It doesn't seem like a good replacement. > Well sample driver shouldn't use the API the way it uses in convoluted approach. Just a UUID print during create() call using mdev_uuid() is good enough for sake of showing example of an API. If we want to uniquely identify mdev using UUID, but not derive their names from UUID, by having additional parameter for its name, It makes sense to me. In this approach, mdev_uuid() can be added/kept if a vendor driver is interested in the UUID. At present there is none. > > > API was added simply to provide data abstraction, allowing the > > > struct mdev_device to be opaque to vendor drivers. Thanks, > > > > > I get that part. I prefer to remove the UUID itself from the structure > > and therefore removing this API makes lot more sense? > > Mdev and support tools around mdev are based on UUIDs because it's defined > in the documentation. When we introduce newer device naming scheme, it will update the documentation also. May be that is the time to move to .rst format too. > I don't think it's as simple as saying "voila, UUID > dependencies are removed, users are free to use arbitrary strings". We'd need > to create some kind of naming policy, what characters are allows so that we > can potentially expand the creation parameters as has been proposed a couple > times, how do we deal with collisions and races, and why should we make such > a change when a UUID is a perfectly reasonable devices name. Thanks, > Sure, we should define a policy on device naming to be more relaxed. We have enough examples in-kernel. 
Few that I am aware of are netdev (vxlan, macvlan, ipvlan, lot more), rdma etc which has arbitrary device names and ID based device names. Collisions and race is already taken care today in the mdev core. Same unique device names continue.
On Wed, 14 Aug 2019 05:54:36 +0000
Parav Pandit <parav@mellanox.com> wrote:

> > > I get that part. I prefer to remove the UUID itself from the structure
> > > and therefore removing this API makes lot more sense?
> >
> > Mdev and support tools around mdev are based on UUIDs because it's defined
> > in the documentation.
> When we introduce newer device naming scheme, it will update the documentation also.
> May be that is the time to move to .rst format too.

You are aware that there are existing tools that expect a uuid naming scheme, right?

> > I don't think it's as simple as saying "voila, UUID
> > dependencies are removed, users are free to use arbitrary strings". We'd need
> > to create some kind of naming policy, what characters are allows so that we
> > can potentially expand the creation parameters as has been proposed a couple
> > times, how do we deal with collisions and races, and why should we make such
> > a change when a UUID is a perfectly reasonable devices name. Thanks,
> >
> Sure, we should define a policy on device naming to be more relaxed.
> We have enough examples in-kernel.
> Few that I am aware of are netdev (vxlan, macvlan, ipvlan, lot more), rdma etc which has arbitrary device names and ID based device names.
>
> Collisions and race is already taken care today in the mdev core. Same unique device names continue.

I'm still completely missing a rationale _why_ uuids are supposedly bad/restricting/etc. We want to uniquely identify a device, across different types of vendor drivers. An uuid is a unique identifier and even a well-defined one. Tools (e.g. mdevctl) are relying on it for mdev devices today.

What is the problem you're trying to solve?
+ Jiri, + netdev
To get perspective on the ndo->phys_port_name for the representor netdev of mdev.

Hi Cornelia,

> -----Original Message-----
> From: Cornelia Huck <cohuck@redhat.com>
> Sent: Wednesday, August 14, 2019 1:32 PM
> To: Parav Pandit <parav@mellanox.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>; Kirti Wankhede <kwankhede@nvidia.com>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia@nvidia.com
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> On Wed, 14 Aug 2019 05:54:36 +0000
> Parav Pandit <parav@mellanox.com> wrote:
>
> > > > I get that part. I prefer to remove the UUID itself from the
> > > > structure and therefore removing this API makes lot more sense?
> > >
> > > Mdev and support tools around mdev are based on UUIDs because it's defined
> > > in the documentation.
> > When we introduce newer device naming scheme, it will update the documentation also.
> > May be that is the time to move to .rst format too.
>
> You are aware that there are existing tools that expect a uuid naming scheme,
> right?
>
Yes, Alex mentioned that too.
The good tool that I am aware of is [1], which is 4 months old. Not sure if it is part of any distros yet.

The README also says that it is 'early in development'. So we have scope to improve it for non-UUID names, but let's discuss that more below.

> >
> > > I don't think it's as simple as saying "voila, UUID dependencies are
> > > removed, users are free to use arbitrary strings". We'd need to
> > > create some kind of naming policy, what characters are allows so
> > > that we can potentially expand the creation parameters as has been
> > > proposed a couple times, how do we deal with collisions and races,
> > > and why should we make such a change when a UUID is a perfectly
> > > reasonable devices name. Thanks,
> > >
> > Sure, we should define a policy on device naming to be more relaxed.
> > We have enough examples in-kernel.
> > Few that I am aware of are netdev (vxlan, macvlan, ipvlan, lot more), rdma
> > etc which has arbitrary device names and ID based device names.
> >
> > Collisions and race is already taken care today in the mdev core. Same
> > unique device names continue.
>
> I'm still completely missing a rationale _why_ uuids are supposedly
> bad/restricting/etc.
There is nothing bad about UUID-based naming.
It's just too long a name from which to derive the phys_port_name of a netdev.
Details below.

For a given mdev of networking type, we would like to have
(a) a representor netdevice [2]
(b) an associated devlink port [3]

Currently these representor netdevices exist only for PCIe SR-IOV VFs.
This is being extended to mdevs without SR-IOV.

Each devlink port is attached to a representor netdevice [4].

This netdevice's phys_port_name should be unique and derived from some property of the mdev.
Udev/systemd uses phys_port_name to derive a unique representor netdev name.
This netdev name is further used by orchestration and switching software in user space.
One such distro-supported switching software is ovs [5], which relies on the persistent device name of the representor netdevice.

phys_port_name is limited to 15 characters.
A UUID doesn't fit in phys_port_name.
Long UUID names create a snowball effect, not just in the networking stack but in many user space tools too.
(Apart from the recently introduced mdevctl, are there more mdev tools which depend on UUID names?)

Instead of the mdev subsystem creating such an effect, one option we are considering is to have shorter mdev names
(similar to netdev, rdma, nvme devices),
such as mdev1, mdev2000 etc.

The second option I was considering is to have an optional alias for a UUID-based mdev.
This name alias is given at the time of mdev creation.
The devlink port's phys_port_name is derived from this shorter mdev name alias.
This way, mdev remains UUID based, with an optional extension.
However, I prefer the first option, to relax the mdev naming scheme.

> We want to uniquely identify a device, across different
> types of vendor drivers. An uuid is a unique identifier and even a well-defined
> one. Tools (e.g. mdevctl) are relying on it for mdev devices today.
>
> What is the problem you're trying to solve?
Unique device naming is still achieved without a UUID scheme by various subsystems in the kernel, using alphanumeric strings.
Such string-based names continue to be unique.

I hope I described the problem and two solutions above.

[1] https://github.com/awilliam/mdevctl
[2] https://elixir.bootlin.com/linux/v5.3-rc4/source/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
[3] http://man7.org/linux/man-pages/man8/devlink-port.8.html
[4] https://elixir.bootlin.com/linux/v5.3-rc4/source/net/core/devlink.c#L6921
[5] https://www.openvswitch.org/
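The length constraint described in this message can be checked with a small C sketch. It assumes the 15-character limit quoted above, and it uses two hypothetical helpers, `fits_port_name()` and `short_mdev_name()`, that are not part of devlink or the mdev core. The index-based name only illustrates the "mdev1, mdev2000" style of the first option.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Limit quoted in the thread; treated as an assumption in this sketch. */
#define PORT_NAME_MAX 15

/* Returns 1 if the candidate name fits within the limit, 0 otherwise. */
static int fits_port_name(const char *name)
{
    assert(name != NULL);
    return strlen(name) <= PORT_NAME_MAX;
}

/*
 * Hypothetical index-based name ("mdev1", "mdev2000", ...).
 * Returns the length written, or -1 if the buffer is too small.
 */
static int short_mdev_name(unsigned int index, char *buf, size_t len)
{
    int n = snprintf(buf, len, "mdev%u", index);

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

A canonical UUID string is 36 characters long, so it fails the check, while a name such as "mdev2000" is 8 characters and passes. The sketch says nothing about how uniqueness or persistence of such short names would be guaranteed, which is the policy question raised elsewhere in the thread.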
On Wed, 14 Aug 2019 12:27:01 +0000 Parav Pandit <parav@mellanox.com> wrote: > + Jiri, + netdev > To get perspective on the ndo->phys_port_name for the representor netdev of mdev. > > Hi Cornelia, > > > -----Original Message----- > > From: Cornelia Huck <cohuck@redhat.com> > > Sent: Wednesday, August 14, 2019 1:32 PM > > To: Parav Pandit <parav@mellanox.com> > > Cc: Alex Williamson <alex.williamson@redhat.com>; Kirti Wankhede > > <kwankhede@nvidia.com>; kvm@vger.kernel.org; linux- > > kernel@vger.kernel.org; cjia@nvidia.com > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > On Wed, 14 Aug 2019 05:54:36 +0000 > > Parav Pandit <parav@mellanox.com> wrote: > > > > > > > I get that part. I prefer to remove the UUID itself from the > > > > > structure and therefore removing this API makes lot more sense? > > > > > > > > Mdev and support tools around mdev are based on UUIDs because it's > > defined > > > > in the documentation. > > > When we introduce newer device naming scheme, it will update the > > documentation also. > > > May be that is the time to move to .rst format too. > > > > You are aware that there are existing tools that expect a uuid naming scheme, > > right? > > > Yes, Alex mentioned too. > The good tool that I am aware of is [1], which is 4 months old. Not sure if it is part of any distros yet. > > README also says, that it is in 'early in development. So we have scope to improve it for non UUID names, but lets discuss that more below. The up-to-date reference for mdevctl is https://github.com/mdevctl/mdevctl. There is currently an effort to get this packaged in Fedora. > > > > > > > > I don't think it's as simple as saying "voila, UUID dependencies are > > > > removed, users are free to use arbitrary strings". 
We'd need to > > > > create some kind of naming policy, what characters are allows so > > > > that we can potentially expand the creation parameters as has been > > > > proposed a couple times, how do we deal with collisions and races, > > > > and why should we make such a change when a UUID is a perfectly > > > > reasonable devices name. Thanks, > > > > > > > Sure, we should define a policy on device naming to be more relaxed. > > > We have enough examples in-kernel. > > > Few that I am aware of are netdev (vxlan, macvlan, ipvlan, lot more), rdma > > etc which has arbitrary device names and ID based device names. > > > > > > Collisions and race is already taken care today in the mdev core. Same > > unique device names continue. > > > > I'm still completely missing a rationale _why_ uuids are supposedly > > bad/restricting/etc. > There is nothing bad about uuid based naming. > Its just too long name to derive phys_port_name of a netdev. > In details below. > > For a given mdev of networking type, we would like to have > (a) representor netdevice [2] > (b) associated devlink port [3] > > Currently these representor netdevice exist only for the PCIe SR-IOV VFs. > It is further getting extended for mdev without SR-IOV. > > Each of the devlink port is attached to representor netdevice [4]. > > This netdevice phys_port_name should be a unique derived from some property of mdev. > Udev/systemd uses phys_port_name to derive unique representor netdev name. > This netdev name is further use by orchestration and switching software in user space. > One such distro supported switching software is ovs [4], which relies on the persistent device name of the representor netdevice. Ok, let me rephrase this to check that I understand this correctly. I'm not sure about some of the terms you use here (even after looking at the linked doc/code), but that's probably still ok. We want to derive an unique (and probably persistent?) 
netdev name so that userspace can refer to a representor netdevice. Makes sense. For generating that name, udev uses the phys_port_name (which represents the devlink port, IIUC). Also makes sense. > > phys_port_name has limitation to be only 15 characters long. > UUID doesn't fit in phys_port_name. Understood. But why do we need to derive the phys_port_name from the mdev device name? This netdevice use case seems to be just one use case for using mdev devices? If this is a specialized mdev type for this setup, why not just expose a shorter identifier via an extra attribute? > Longer UUID names are creating snow ball effect, not just in networking stack but many user space tools too. This snowball effect mainly comes from the device name -> phys_port_name setup, IIUC. > (as opposed to recently introduced mdevctl, are they more mdev tools which has dependency on UUID name?) I am aware that people have written scripts etc. to manage their mdevs. Given that the mdev infrastructure has been around for quite some time, I'd say the chance of some of those scripts relying on uuid names is non-zero. > > Instead of mdev subsystem creating such effect, one option we are considering is to have shorter mdev names. > (Similar to netdev, rdma, nvme devices). > Such as mdev1, mdev2000 etc. > > Second option I was considering is to have an optional alias for UUID based mdev. > This name alias is given at time of mdev creation. > Devlink port's phys_port_name is derived out of this shorter mdev name alias. > This way, mdev remains to be UUID based with optional extension. > However, I prefer first option to relax mdev naming scheme. Actually, I think that second option makes much more sense, as you avoid potentially breaking existing tooling. > > > We want to uniquely identify a device, across different > > types of vendor drivers. An uuid is a unique identifier and even a well-defined > > one. Tools (e.g. mdevctl) are relying on it for mdev devices today. 
> > > > What is the problem you're trying to solve? > Unique device naming is still achieved without UUID scheme by various subsystems in kernel using alpha-numeric string. > Having such string based continue to provide unique names. > > I hope I described the problem and two solutions above. > > [1] https://github.com/awilliam/mdevctl > [2] https://elixir.bootlin.com/linux/v5.3-rc4/source/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c > [3] http://man7.org/linux/man-pages/man8/devlink-port.8.html > [4] https://elixir.bootlin.com/linux/v5.3-rc4/source/net/core/devlink.c#L6921 > [5] https://www.openvswitch.org/ >
> -----Original Message----- > From: Cornelia Huck <cohuck@redhat.com> > Sent: Wednesday, August 14, 2019 6:39 PM > To: Parav Pandit <parav@mellanox.com> > Cc: Alex Williamson <alex.williamson@redhat.com>; Kirti Wankhede > <kwankhede@nvidia.com>; kvm@vger.kernel.org; linux- > kernel@vger.kernel.org; cjia@nvidia.com; Jiri Pirko <jiri@mellanox.com>; > netdev@vger.kernel.org > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > On Wed, 14 Aug 2019 12:27:01 +0000 > Parav Pandit <parav@mellanox.com> wrote: > > > + Jiri, + netdev > > To get perspective on the ndo->phys_port_name for the representor netdev > of mdev. > > > > Hi Cornelia, > > > > > -----Original Message----- > > > From: Cornelia Huck <cohuck@redhat.com> > > > Sent: Wednesday, August 14, 2019 1:32 PM > > > To: Parav Pandit <parav@mellanox.com> > > > Cc: Alex Williamson <alex.williamson@redhat.com>; Kirti Wankhede > > > <kwankhede@nvidia.com>; kvm@vger.kernel.org; linux- > > > kernel@vger.kernel.org; cjia@nvidia.com > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > > > On Wed, 14 Aug 2019 05:54:36 +0000 > > > Parav Pandit <parav@mellanox.com> wrote: > > > > > > > > > I get that part. I prefer to remove the UUID itself from the > > > > > > structure and therefore removing this API makes lot more sense? > > > > > > > > > > Mdev and support tools around mdev are based on UUIDs because > > > > > it's > > > defined > > > > > in the documentation. > > > > When we introduce newer device naming scheme, it will update the > > > documentation also. > > > > May be that is the time to move to .rst format too. > > > > > > You are aware that there are existing tools that expect a uuid > > > naming scheme, right? > > > > > Yes, Alex mentioned too. > > The good tool that I am aware of is [1], which is 4 months old. Not sure if it is > part of any distros yet. > > > > README also says, that it is in 'early in development. 
So we have scope to > improve it for non UUID names, but lets discuss that more below. > > The up-to-date reference for mdevctl is > https://github.com/mdevctl/mdevctl. There is currently an effort to get this > packaged in Fedora. > Awesome. > > > > > > > > > > > I don't think it's as simple as saying "voila, UUID dependencies > > > > > are removed, users are free to use arbitrary strings". We'd > > > > > need to create some kind of naming policy, what characters are > > > > > allows so that we can potentially expand the creation parameters > > > > > as has been proposed a couple times, how do we deal with > > > > > collisions and races, and why should we make such a change when > > > > > a UUID is a perfectly reasonable devices name. Thanks, > > > > > > > > > Sure, we should define a policy on device naming to be more relaxed. > > > > We have enough examples in-kernel. > > > > Few that I am aware of are netdev (vxlan, macvlan, ipvlan, lot > > > > more), rdma > > > etc which has arbitrary device names and ID based device names. > > > > > > > > Collisions and race is already taken care today in the mdev core. > > > > Same > > > unique device names continue. > > > > > > I'm still completely missing a rationale _why_ uuids are supposedly > > > bad/restricting/etc. > > There is nothing bad about uuid based naming. > > Its just too long name to derive phys_port_name of a netdev. > > In details below. > > > > For a given mdev of networking type, we would like to have > > (a) representor netdevice [2] > > (b) associated devlink port [3] > > > > Currently these representor netdevice exist only for the PCIe SR-IOV VFs. > > It is further getting extended for mdev without SR-IOV. > > > > Each of the devlink port is attached to representor netdevice [4]. > > > > This netdevice phys_port_name should be a unique derived from some > property of mdev. > > Udev/systemd uses phys_port_name to derive unique representor netdev > name. 
> > This netdev name is further use by orchestration and switching software in > user space. > > One such distro supported switching software is ovs [4], which relies on the > persistent device name of the representor netdevice. > > Ok, let me rephrase this to check that I understand this correctly. I'm not sure > about some of the terms you use here (even after looking at the linked > doc/code), but that's probably still ok. > > We want to derive an unique (and probably persistent?) netdev name so that > userspace can refer to a representor netdevice. Makes sense. > For generating that name, udev uses the phys_port_name (which represents > the devlink port, IIUC). Also makes sense. > You understood it correctly. > > > > phys_port_name has limitation to be only 15 characters long. > > UUID doesn't fit in phys_port_name. > > Understood. But why do we need to derive the phys_port_name from the mdev > device name? This netdevice use case seems to be just one use case for using > mdev devices? If this is a specialized mdev type for this setup, why not just > expose a shorter identifier via an extra attribute? > Representor netdev, represents mdev's switch port (like PCI SRIOV VF's switch port). So user must be able to relate this two objects in similar manner as SRIOV VFs. Phys_port_name is derived from the PCI PF and VF numbering scheme. Similarly mdev's such port should be derived from mdev's id/name/attribute. > > Longer UUID names are creating snow ball effect, not just in networking stack > but many user space tools too. > > This snowball effect mainly comes from the device name -> phys_port_name > setup, IIUC. > Right. > > (as opposed to recently introduced mdevctl, are they more mdev tools > > which has dependency on UUID name?) > > I am aware that people have written scripts etc. to manage their mdevs. > Given that the mdev infrastructure has been around for quite some time, I'd > say the chance of some of those scripts relying on uuid names is non-zero. > Ok. 
but those scripts have never managed networking devices, so they won't break: they will always create mdev devices using a UUID.
When they start using these new networking devices, they will need more than their current scripts provide.
So a user space upgrade for such a mixed-mode case is reasonable.

> >
> > Instead of mdev subsystem creating such effect, one option we are
> considering is to have shorter mdev names.
> > (Similar to netdev, rdma, nvme devices).
> > Such as mdev1, mdev2000 etc.
> >
> > Second option I was considering is to have an optional alias for UUID based
> mdev.
> > This name alias is given at time of mdev creation.
> > Devlink port's phys_port_name is derived out of this shorter mdev name
> alias.
> > This way, mdev remains to be UUID based with optional extension.
> > However, I prefer first option to relax mdev naming scheme.
>
> Actually, I think that second option makes much more sense, as you avoid
> potentially breaking existing tooling.

Let's first understand what exactly will break in existing tools if they see a non-UUID based device.
Existing tooling continues to work with UUID devices.
Do you have an example of what can break if they see a non-UUID based device name?
I think you are clear on this, but to be sure: UUID based creation will continue to be there.
Optionally, an mdev will be created with an alpha-numeric string, if we don't do it as an additional attribute.

> >
> > > We want to uniquely identify a device, across different types of
> > > vendor drivers. An uuid is a unique identifier and even a
> > > well-defined one. Tools (e.g. mdevctl) are relying on it for mdev devices
> today.
> > >
> > > What is the problem you're trying to solve?
> > Unique device naming is still achieved without UUID scheme by various
> subsystems in kernel using alpha-numeric string.
> > Having such string based continue to provide unique names.
> >
> > I hope I described the problem and two solutions above.
> > > > [1] https://github.com/awilliam/mdevctl > > [2] > > https://elixir.bootlin.com/linux/v5.3-rc4/source/drivers/net/ethernet/ > > mellanox/mlx5/core/en_rep.c [3] > > http://man7.org/linux/man-pages/man8/devlink-port.8.html > > [4] > > https://elixir.bootlin.com/linux/v5.3-rc4/source/net/core/devlink.c#L6 > > 921 > > [5] https://www.openvswitch.org/ > >
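The length problem raised in this message can be checked with a few lines of Python. This is only an illustrative sketch: the 15-character limit is the figure quoted in the thread, and the short name `pf0vf3` is a made-up example in the style of PF/VF numbering, not something taken from the patches.

```python
import uuid

# Limit quoted in the thread for phys_port_name (usable characters).
PHYS_PORT_NAME_MAX = 15


def fits_phys_port_name(name: str) -> bool:
    """Return True if `name` is short enough to be used as phys_port_name."""
    return len(name) <= PHYS_PORT_NAME_MAX


# A UUID in its canonical 8-4-4-4-12 text form is always 36 characters long.
mdev_uuid = str(uuid.uuid4())

print(len(mdev_uuid), fits_phys_port_name(mdev_uuid))
print(fits_phys_port_name("pf0vf3"))  # hypothetical short, index-based name
```

A UUID string is more than twice the allowed length, which is why a name derived from PF/VF numbering works for SR-IOV while the mdev UUID cannot be used directly.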
On Wed, 14 Aug 2019 13:45:49 +0000 Parav Pandit <parav@mellanox.com> wrote: > > -----Original Message----- > > From: Cornelia Huck <cohuck@redhat.com> > > Sent: Wednesday, August 14, 2019 6:39 PM > > To: Parav Pandit <parav@mellanox.com> > > Cc: Alex Williamson <alex.williamson@redhat.com>; Kirti Wankhede > > <kwankhede@nvidia.com>; kvm@vger.kernel.org; linux- > > kernel@vger.kernel.org; cjia@nvidia.com; Jiri Pirko <jiri@mellanox.com>; > > netdev@vger.kernel.org > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > On Wed, 14 Aug 2019 12:27:01 +0000 > > Parav Pandit <parav@mellanox.com> wrote: > > > > > + Jiri, + netdev > > > To get perspective on the ndo->phys_port_name for the representor netdev > > of mdev. > > > > > > Hi Cornelia, > > > > > > > -----Original Message----- > > > > From: Cornelia Huck <cohuck@redhat.com> > > > > Sent: Wednesday, August 14, 2019 1:32 PM > > > > To: Parav Pandit <parav@mellanox.com> > > > > Cc: Alex Williamson <alex.williamson@redhat.com>; Kirti Wankhede > > > > <kwankhede@nvidia.com>; kvm@vger.kernel.org; linux- > > > > kernel@vger.kernel.org; cjia@nvidia.com > > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > > > > > On Wed, 14 Aug 2019 05:54:36 +0000 > > > > Parav Pandit <parav@mellanox.com> wrote: > > > > > > > > > > > I get that part. I prefer to remove the UUID itself from the > > > > > > > structure and therefore removing this API makes lot more sense? > > > > > > > > > > > > Mdev and support tools around mdev are based on UUIDs because > > > > > > it's > > > > defined > > > > > > in the documentation. > > > > > When we introduce newer device naming scheme, it will update the > > > > documentation also. > > > > > May be that is the time to move to .rst format too. > > > > > > > > You are aware that there are existing tools that expect a uuid > > > > naming scheme, right? > > > > > > > Yes, Alex mentioned too. > > > The good tool that I am aware of is [1], which is 4 months old. 
Not sure if it is > > part of any distros yet. > > > > > > README also says, that it is in 'early in development. So we have scope to > > improve it for non UUID names, but lets discuss that more below. > > > > The up-to-date reference for mdevctl is > > https://github.com/mdevctl/mdevctl. There is currently an effort to get this > > packaged in Fedora. > > > Awesome. > > > > > > > > > > > > > > > I don't think it's as simple as saying "voila, UUID dependencies > > > > > > are removed, users are free to use arbitrary strings". We'd > > > > > > need to create some kind of naming policy, what characters are > > > > > > allows so that we can potentially expand the creation parameters > > > > > > as has been proposed a couple times, how do we deal with > > > > > > collisions and races, and why should we make such a change when > > > > > > a UUID is a perfectly reasonable devices name. Thanks, > > > > > > > > > > > Sure, we should define a policy on device naming to be more relaxed. > > > > > We have enough examples in-kernel. > > > > > Few that I am aware of are netdev (vxlan, macvlan, ipvlan, lot > > > > > more), rdma > > > > etc which has arbitrary device names and ID based device names. > > > > > > > > > > Collisions and race is already taken care today in the mdev core. > > > > > Same > > > > unique device names continue. > > > > > > > > I'm still completely missing a rationale _why_ uuids are supposedly > > > > bad/restricting/etc. > > > There is nothing bad about uuid based naming. > > > Its just too long name to derive phys_port_name of a netdev. > > > In details below. > > > > > > For a given mdev of networking type, we would like to have > > > (a) representor netdevice [2] > > > (b) associated devlink port [3] > > > > > > Currently these representor netdevice exist only for the PCIe SR-IOV VFs. > > > It is further getting extended for mdev without SR-IOV. > > > > > > Each of the devlink port is attached to representor netdevice [4]. 
> > > > > > This netdevice phys_port_name should be a unique derived from some > > property of mdev. > > > Udev/systemd uses phys_port_name to derive unique representor netdev > > name. > > > This netdev name is further use by orchestration and switching software in > > user space. > > > One such distro supported switching software is ovs [4], which relies on the > > persistent device name of the representor netdevice. > > > > Ok, let me rephrase this to check that I understand this correctly. I'm not sure > > about some of the terms you use here (even after looking at the linked > > doc/code), but that's probably still ok. > > > > We want to derive an unique (and probably persistent?) netdev name so that > > userspace can refer to a representor netdevice. Makes sense. > > For generating that name, udev uses the phys_port_name (which represents > > the devlink port, IIUC). Also makes sense. > > > You understood it correctly. > > > > > > > phys_port_name has limitation to be only 15 characters long. > > > UUID doesn't fit in phys_port_name. > > > > Understood. But why do we need to derive the phys_port_name from the mdev > > device name? This netdevice use case seems to be just one use case for using > > mdev devices? If this is a specialized mdev type for this setup, why not just > > expose a shorter identifier via an extra attribute? > > > Representor netdev, represents mdev's switch port (like PCI SRIOV VF's switch port). > So user must be able to relate this two objects in similar manner as SRIOV VFs. > Phys_port_name is derived from the PCI PF and VF numbering scheme. > Similarly mdev's such port should be derived from mdev's id/name/attribute. > > > > Longer UUID names are creating snow ball effect, not just in networking stack > > but many user space tools too. > > > > This snowball effect mainly comes from the device name -> phys_port_name > > setup, IIUC. > > > Right. 
>
> > > (as opposed to recently introduced mdevctl, are they more mdev tools
> > > which has dependency on UUID name?)
> >
> > I am aware that people have written scripts etc. to manage their mdevs.
> > Given that the mdev infrastructure has been around for quite some time, I'd
> > say the chance of some of those scripts relying on uuid names is non-zero.
> >
> Ok. but those scripts have never managed networking devices.
> So those scripts won't break because they will always create mdev devices using UUID.
> When they use these new networking devices, they need more things than their scripts.
> So user space upgrade for such mixed mode case is reasonable.

Tools like mdevctl are agnostic of the type of mdev device they're
managing. It shouldn't matter that they've never managed a networking
mdev previously; it follows the standards of mdev management.

> > >
> > > Instead of mdev subsystem creating such effect, one option we are
> > considering is to have shorter mdev names.
> > > (Similar to netdev, rdma, nvme devices).
> > > Such as mdev1, mdev2000 etc.

Note that these are kernel generated names, as are the other examples.
In the case of mdev, the user is providing the UUID, which becomes the
device name. When a user writes to the create attribute, there needs
to be determinism that the user can identify the device they created
vs another that may have been created concurrently. I don't see that
we can put users in the path of managing device instance numbers.

> > > Second option I was considering is to have an optional alias for UUID based
> > mdev.
> > > This name alias is given at time of mdev creation.
> > > Devlink port's phys_port_name is derived out of this shorter mdev name
> > alias.
> > > This way, mdev remains to be UUID based with optional extension.
> > > However, I prefer first option to relax mdev naming scheme.
> >
> > Actually, I think that second option makes much more sense, as you avoid
> > potentially breaking existing tooling.
> Let's first understand of what exactly will break with existing tool > if they see non_uuid based device. Do we really want a mixed namespace of device names, some UUID, some... something else? That seems like a mess. > Existing tooling continue to work with UUID devices. > Do you have example of what can break if they see non_uuid based > device name? I think you are clear, but to be sure, UUID based > creation will continue to be there. Optionally mdev will be created > with alpha-numeric string, if we don't it as additional attribute. I'm not onboard with a UUID being just one of the possible naming strings via which we can create mdev devices. I think that becomes untenable for userspace. I don't think a sufficient argument has been made against the alias approach, which seems to keep the UUID as a canonical name, providing a consistent namespace, augmented with user or kernel provided short alias. Thanks, Alex > > > > > > > We want to uniquely identify a device, across different types of > > > > vendor drivers. An uuid is a unique identifier and even a > > > > well-defined one. Tools (e.g. mdevctl) are relying on it for > > > > mdev devices > > today. > > > > > > > > What is the problem you're trying to solve? > > > Unique device naming is still achieved without UUID scheme by > > > various > > subsystems in kernel using alpha-numeric string. > > > Having such string based continue to provide unique names. > > > > > > I hope I described the problem and two solutions above. > > > > > > [1] https://github.com/awilliam/mdevctl > > > [2] > > > https://elixir.bootlin.com/linux/v5.3-rc4/source/drivers/net/ethernet/ > > > mellanox/mlx5/core/en_rep.c [3] > > > http://man7.org/linux/man-pages/man8/devlink-port.8.html > > > [4] > > > https://elixir.bootlin.com/linux/v5.3-rc4/source/net/core/devlink.c#L6 > > > 921 > > > [5] https://www.openvswitch.org/ > > > >
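The alias approach argued for in this reply keeps the UUID as the canonical name and treats the short name as an optional, unique add-on. A toy user space model of that idea is sketched below. It is not the mdev core implementation: the class `MdevRegistry` and its methods are hypothetical names chosen for illustration, and real collision handling in the kernel would also need locking.

```python
import uuid
from typing import Dict, Optional


class MdevRegistry:
    """Toy model: the UUID is the canonical key, the alias is optional and unique."""

    def __init__(self) -> None:
        self._alias_by_uuid: Dict[uuid.UUID, Optional[str]] = {}
        self._uuid_by_alias: Dict[str, uuid.UUID] = {}

    def create(self, dev_uuid: str, alias: Optional[str] = None) -> uuid.UUID:
        """Register a device; the caller always supplies the UUID."""
        key = uuid.UUID(dev_uuid)  # raises ValueError for a non-UUID string
        if key in self._alias_by_uuid:
            raise FileExistsError(f"mdev {key} already exists")
        if alias is not None and alias in self._uuid_by_alias:
            raise FileExistsError(f"alias {alias!r} already in use")
        self._alias_by_uuid[key] = alias
        if alias is not None:
            self._uuid_by_alias[alias] = key
        return key

    def lookup(self, name: str) -> uuid.UUID:
        """Resolve an alias or a UUID string to the canonical UUID."""
        if name in self._uuid_by_alias:
            return self._uuid_by_alias[name]
        try:
            key = uuid.UUID(name)
        except ValueError:
            raise KeyError(name) from None
        if key not in self._alias_by_uuid:
            raise KeyError(name)
        return key
```

Because the caller chooses the UUID, two concurrent creators can each tell which device is theirs, which is the determinism concern raised above; the alias only adds a second, shorter handle to the same object.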
> -----Original Message----- > From: Alex Williamson <alex.williamson@redhat.com> > Sent: Wednesday, August 14, 2019 8:28 PM > To: Parav Pandit <parav@mellanox.com> > Cc: Cornelia Huck <cohuck@redhat.com>; Kirti Wankhede > <kwankhede@nvidia.com>; kvm@vger.kernel.org; linux- > kernel@vger.kernel.org; cjia@nvidia.com; Jiri Pirko <jiri@mellanox.com>; > netdev@vger.kernel.org > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > On Wed, 14 Aug 2019 13:45:49 +0000 > Parav Pandit <parav@mellanox.com> wrote: > > > > -----Original Message----- > > > From: Cornelia Huck <cohuck@redhat.com> > > > Sent: Wednesday, August 14, 2019 6:39 PM > > > To: Parav Pandit <parav@mellanox.com> > > > Cc: Alex Williamson <alex.williamson@redhat.com>; Kirti Wankhede > > > <kwankhede@nvidia.com>; kvm@vger.kernel.org; linux- > > > kernel@vger.kernel.org; cjia@nvidia.com; Jiri Pirko > > > <jiri@mellanox.com>; netdev@vger.kernel.org > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > > > On Wed, 14 Aug 2019 12:27:01 +0000 > > > Parav Pandit <parav@mellanox.com> wrote: > > > > > > > + Jiri, + netdev > > > > To get perspective on the ndo->phys_port_name for the representor > > > > netdev > > > of mdev. > > > > > > > > Hi Cornelia, > > > > > > > > > -----Original Message----- > > > > > From: Cornelia Huck <cohuck@redhat.com> > > > > > Sent: Wednesday, August 14, 2019 1:32 PM > > > > > To: Parav Pandit <parav@mellanox.com> > > > > > Cc: Alex Williamson <alex.williamson@redhat.com>; Kirti Wankhede > > > > > <kwankhede@nvidia.com>; kvm@vger.kernel.org; linux- > > > > > kernel@vger.kernel.org; cjia@nvidia.com > > > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > > > > > > > On Wed, 14 Aug 2019 05:54:36 +0000 Parav Pandit > > > > > <parav@mellanox.com> wrote: > > > > > > > > > > > > > I get that part. I prefer to remove the UUID itself from > > > > > > > > the structure and therefore removing this API makes lot more > sense? 
> > > > > > > > > > > > > > Mdev and support tools around mdev are based on UUIDs > > > > > > > because it's > > > > > defined > > > > > > > in the documentation. > > > > > > When we introduce newer device naming scheme, it will update > > > > > > the > > > > > documentation also. > > > > > > May be that is the time to move to .rst format too. > > > > > > > > > > You are aware that there are existing tools that expect a uuid > > > > > naming scheme, right? > > > > > > > > > Yes, Alex mentioned too. > > > > The good tool that I am aware of is [1], which is 4 months old. > > > > Not sure if it is > > > part of any distros yet. > > > > > > > > README also says, that it is in 'early in development. So we have > > > > scope to > > > improve it for non UUID names, but lets discuss that more below. > > > > > > The up-to-date reference for mdevctl is > > > https://github.com/mdevctl/mdevctl. There is currently an effort to > > > get this packaged in Fedora. > > > > > Awesome. > > > > > > > > > > > > > > > > > > > I don't think it's as simple as saying "voila, UUID > > > > > > > dependencies are removed, users are free to use arbitrary > > > > > > > strings". We'd need to create some kind of naming policy, > > > > > > > what characters are allows so that we can potentially expand > > > > > > > the creation parameters as has been proposed a couple times, > > > > > > > how do we deal with collisions and races, and why should we > > > > > > > make such a change when a UUID is a perfectly reasonable > > > > > > > devices name. Thanks, > > > > > > > > > > > > > Sure, we should define a policy on device naming to be more relaxed. > > > > > > We have enough examples in-kernel. > > > > > > Few that I am aware of are netdev (vxlan, macvlan, ipvlan, lot > > > > > > more), rdma > > > > > etc which has arbitrary device names and ID based device names. > > > > > > > > > > > > Collisions and race is already taken care today in the mdev core. 
> > > > > > Same > > > > > unique device names continue. > > > > > > > > > > I'm still completely missing a rationale _why_ uuids are > > > > > supposedly bad/restricting/etc. > > > > There is nothing bad about uuid based naming. > > > > Its just too long name to derive phys_port_name of a netdev. > > > > In details below. > > > > > > > > For a given mdev of networking type, we would like to have > > > > (a) representor netdevice [2] > > > > (b) associated devlink port [3] > > > > > > > > Currently these representor netdevice exist only for the PCIe SR-IOV VFs. > > > > It is further getting extended for mdev without SR-IOV. > > > > > > > > Each of the devlink port is attached to representor netdevice [4]. > > > > > > > > This netdevice phys_port_name should be a unique derived from some > > > property of mdev. > > > > Udev/systemd uses phys_port_name to derive unique representor > > > > netdev > > > name. > > > > This netdev name is further use by orchestration and switching > > > > software in > > > user space. > > > > One such distro supported switching software is ovs [4], which > > > > relies on the > > > persistent device name of the representor netdevice. > > > > > > Ok, let me rephrase this to check that I understand this correctly. > > > I'm not sure about some of the terms you use here (even after > > > looking at the linked doc/code), but that's probably still ok. > > > > > > We want to derive an unique (and probably persistent?) netdev name > > > so that userspace can refer to a representor netdevice. Makes sense. > > > For generating that name, udev uses the phys_port_name (which > > > represents the devlink port, IIUC). Also makes sense. > > > > > You understood it correctly. > > > > > > > > > > phys_port_name has limitation to be only 15 characters long. > > > > UUID doesn't fit in phys_port_name. > > > > > > Understood. But why do we need to derive the phys_port_name from the > > > mdev device name? 
This netdevice use case seems to be just one use > > > case for using mdev devices? If this is a specialized mdev type for > > > this setup, why not just expose a shorter identifier via an extra attribute? > > > > > Representor netdev, represents mdev's switch port (like PCI SRIOV VF's switch > port). > > So user must be able to relate this two objects in similar manner as SRIOV > VFs. > > Phys_port_name is derived from the PCI PF and VF numbering scheme. > > Similarly mdev's such port should be derived from mdev's id/name/attribute. > > > > > > Longer UUID names are creating snow ball effect, not just in > > > > networking stack > > > but many user space tools too. > > > > > > This snowball effect mainly comes from the device name -> > > > phys_port_name setup, IIUC. > > > > > Right. > > > > > > (as opposed to recently introduced mdevctl, are they more mdev > > > > tools which has dependency on UUID name?) > > > > > > I am aware that people have written scripts etc. to manage their mdevs. > > > Given that the mdev infrastructure has been around for quite some > > > time, I'd say the chance of some of those scripts relying on uuid names is > non-zero. > > > > > Ok. but those scripts have never managed networking devices. > > So those scripts won't break because they will always create mdev devices > using UUID. > > When they use these new networking devices, they need more things than > their scripts. > > So user space upgrade for such mixed mode case is reasonable. > > Tools like mdevctl are agnostic of the type of mdev device they're managing, it > shouldn't matter than they've never managed a networking mdev previously, it > follows the standards of mdev management. > > > > > > > > > Instead of mdev subsystem creating such effect, one option we are > > > considering is to have shorter mdev names. > > > > (Similar to netdev, rdma, nvme devices). > > > > Such as mdev1, mdev2000 etc. > > Note that these are kernel generated names, as are the other examples. No. 
I probably gave the wrong examples. User-provided mdev names can be 'foo', 'bar', 'foo1'.

> In the case of mdev, the user is providing the UUID, which becomes the device
> name. When a user writes to the create attribute, there needs to be
> determinism that the user can identify the device they created vs another that
> may have been created concurrently. I don't see that we can put users in the
> path of managing device instance numbers.

No. It's just user-provided names.

>
> > > > Second option I was considering is to have an optional alias for
> > > > UUID based
> > > mdev.
> > > > This name alias is given at time of mdev creation.
> > > > Devlink port's phys_port_name is derived out of this shorter mdev
> > > > name
> > > alias.
> > > > This way, mdev remains to be UUID based with optional extension.
> > > > However, I prefer first option to relax mdev naming scheme.
> > >
> > > Actually, I think that second option makes much more sense, as you
> > > avoid potentially breaking existing tooling.
> > Let's first understand of what exactly will break with existing tool
> > if they see non_uuid based device.
>
> Do we really want a mixed namespace of device names, some UUID, some...
> something else? That seems like a mess.
>

So you prefer the alias as an attribute?
If so, it should be an optional additional parameter at create time, because it is preferable not to invent new callbacks for setting such attributes (and rewrite them).

> > Existing tooling continue to work with UUID devices.
> > Do you have example of what can break if they see non_uuid based
> > device name? I think you are clear, but to be sure, UUID based
> > creation will continue to be there. Optionally mdev will be created
> > with alpha-numeric string, if we don't it as additional attribute.
>
> I'm not onboard with a UUID being just one of the possible naming strings via
> which we can create mdev devices. I think that becomes untenable for
> userspace.
I don't think a sufficient argument has been made against the alias > approach, which seems to keep the UUID as a canonical name, providing a > consistent namespace, augmented with user or kernel provided short alias. > Thanks, > If I understand you correctly, you prefer alias name approach to keep UUID naming scheme intact in mdev? > Alex > > > > > > > > > > We want to uniquely identify a device, across different types of > > > > > vendor drivers. An uuid is a unique identifier and even a > > > > > well-defined one. Tools (e.g. mdevctl) are relying on it for > > > > > mdev devices > > > today. > > > > > > > > > > What is the problem you're trying to solve? > > > > Unique device naming is still achieved without UUID scheme by > > > > various > > > subsystems in kernel using alpha-numeric string. > > > > Having such string based continue to provide unique names. > > > > > > > > I hope I described the problem and two solutions above. > > > > > > > > [1] https://github.com/awilliam/mdevctl > > > > [2] > > > > https://elixir.bootlin.com/linux/v5.3-rc4/source/drivers/net/ether > > > > net/ > > > > mellanox/mlx5/core/en_rep.c [3] > > > > http://man7.org/linux/man-pages/man8/devlink-port.8.html > > > > [4] > > > > https://elixir.bootlin.com/linux/v5.3-rc4/source/net/core/devlink. > > > > c#L6 > > > > 921 > > > > [5] https://www.openvswitch.org/ > > > > > >
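The idea of passing the alias as an optional extra parameter at create time can be sketched as a small parser. Everything here is an assumption for illustration: the `<uuid> key=value` input format, the `alias` key, and the `m` prefix used to build phys_port_name are hypothetical and do not describe an existing kernel interface.

```python
import uuid
from typing import Dict, Optional, Tuple

PHYS_PORT_NAME_MAX = 15  # limit quoted in the thread


def parse_create(line: str) -> Tuple[uuid.UUID, Dict[str, str]]:
    """Split '<uuid> [key=value ...]' into the UUID and optional parameters."""
    fields = line.split()
    if not fields:
        raise ValueError("empty create string")
    dev_uuid = uuid.UUID(fields[0])  # the UUID stays mandatory
    params: Dict[str, str] = {}
    for field in fields[1:]:
        key, sep, value = field.partition("=")
        if not sep or not key or not value:
            raise ValueError(f"malformed parameter: {field!r}")
        params[key] = value
    return dev_uuid, params


def phys_port_name(params: Dict[str, str]) -> Optional[str]:
    """Derive a short port name from the alias, or None if no alias was given."""
    alias = params.get("alias")
    if alias is None:
        return None
    name = "m" + alias  # hypothetical prefix marking an mdev port
    if len(name) > PHYS_PORT_NAME_MAX:
        raise ValueError(f"{name!r} exceeds {PHYS_PORT_NAME_MAX} characters")
    return name
```

A create string without extra parameters parses exactly as before, so tools that only write a UUID would be unaffected under this sketch.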
+ Dave.

Hi Jiri, Dave, Alex, Kirti, Cornelia,

Please provide your feedback on this. How shall we proceed?

Short summary of requirements.
For a given mdev (mediated device [1]), there is one representor netdevice and devlink port in switchdev mode (similar to an SR-IOV VF),
and there is one netdevice for the actual mdev when the mdev is probed.

(a) The representor netdev and devlink port should be able to derive phys_port_name(),
so that the representor netdev name can be built deterministically across reboots.

(b) For the mdev's netdevice, the mdev device should have an attribute.
This attribute can be used by udev rules/systemd or something else to rename the netdev deterministically.

(c) IFNAMSIZ of 16 bytes is too small to fit the whole UUID.
A simple grep for IFNAMSIZ in the stack hints at hundreds of users of IFNAMSIZ in drivers, uapi, netlink, boot config area and more.
Changing IFNAMSIZ for an mdev bus doesn't really look like a reasonable option to me.

Hence, I would like to discuss the options below.

Option-1: mdev index
Introduce an optional mdev index/handle as u32 during mdev create time.
User passes the mdev index/handle as input.

phys_port_name=mIndex=m%u
mdev_index will be available in sysfs as an mdev attribute for udev to name the mdev's netdev.

example mdev create command:
UUID=$(uuidgen)
echo $UUID index=10 > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create

example netdevs:
repnetdev=ens2f0_m10	/* ens2f0 is parent PF's netdevice */
mdev_netdev=enm10

Pros:
1. mdevctl and any other existing tools are unaffected.
2. netdev stack, ovs and other switching platforms are unaffected.
3. Achieves a unique phys_port_name for the representor netdev.
4. Achieves a unique mdev eth netdev name for the mdev using a udev/systemd extension.
5. Aligns well with the mdev and netdev subsystems and is similar to existing sriov bdf's.

Option-2: shorter mdev name
Extend mdev to have a shorter mdev device name in addition to the UUID,
such as 'foo', 'bar'.
Mdev will continue to have a UUID.
phys_port_name=mdev_name

Pros:
1. All same as option-1, except mdevctl needs an upgrade for the newer usage.
It is common practice to upgrade the iproute2 package along with the kernel.
A similar practice is to be followed with mdevctl.
2. Newer users of mdevctl who want to work with non-UUID names will use newer mdevctl/tools.

Cons:
1. A dual naming scheme for mdev might affect some of the existing tools.
It's unclear how, or whether, it actually affects them.
mdevctl [2] is very recently developed and can be enhanced for a dual naming scheme.

Option-3: mdev uuid alias
Instead of a shorter mdev name or mdev index, have an alpha-numeric name alias.
The alias is an optional mdev sysfs attribute such as 'foo', 'bar'.

example mdev create command:
UUID=$(uuidgen)
echo $UUID alias=foo > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create

example netdevs:
repnetdev=ens2f0_mfoo
mdev_netdev=enmfoo

Pros:
1. All same as option-1.
2. Doesn't affect the existing mdev naming scheme.

Cons:
1. The index scheme of option-1 is better: it can number a large number of mdevs with fewer characters, simplifying the management tool.

Option-4: extend IFNAMSIZ to be 64 bytes
Extend IFNAMSIZ from 16 to 64 bytes.
phys_port_name=mdev_UUID_string
mdev_netdev_name=enmUUID

Pros:
1. Doesn't require an mdev extension.

Cons:
1. netdev stack, driver, uapi, user space, boot config wide changes.
2. Possible changes to user space that assumed a name size of 16 characters.
3. A single device type demands a name size change for all netdev types.

[1] https://www.kernel.org/doc/Documentation/vfio-mediated-device.txt
[2] https://github.com/mdevctl/mdevctl

Regards,
Parav Pandit

> -----Original Message----- > From: linux-kernel-owner@vger.kernel.org <linux-kernel- > owner@vger.kernel.org> On Behalf Of Parav Pandit > Sent: Wednesday, August 14, 2019 9:51 PM > To: Alex Williamson <alex.williamson@redhat.com> > Cc: Cornelia Huck <cohuck@redhat.com>; Kirti Wankhede > <kwankhede@nvidia.com>; kvm@vger.kernel.org; linux- > kernel@vger.kernel.org; cjia@nvidia.com; Jiri Pirko <jiri@mellanox.com>; > netdev@vger.kernel.org > Subject: RE: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > > -----Original Message----- > > From: Alex Williamson <alex.williamson@redhat.com> > > Sent: Wednesday, August 14, 2019 8:28 PM > > To: Parav Pandit <parav@mellanox.com> > > Cc: Cornelia Huck <cohuck@redhat.com>; Kirti Wankhede > > <kwankhede@nvidia.com>; kvm@vger.kernel.org; linux- > > kernel@vger.kernel.org; cjia@nvidia.com; Jiri Pirko > > <jiri@mellanox.com>; netdev@vger.kernel.org > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > On Wed, 14 Aug 2019 13:45:49 +0000 > > Parav Pandit <parav@mellanox.com> wrote: > > > > > > -----Original Message----- > > > > From: Cornelia Huck <cohuck@redhat.com> > > > > Sent: Wednesday, August 14, 2019 6:39 PM > > > > To: Parav Pandit <parav@mellanox.com> > > > > Cc: Alex Williamson <alex.williamson@redhat.com>; Kirti Wankhede > > > > <kwankhede@nvidia.com>; kvm@vger.kernel.org; linux- > > > > kernel@vger.kernel.org; cjia@nvidia.com; Jiri Pirko > > > > <jiri@mellanox.com>; netdev@vger.kernel.org > > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > > > > > On Wed, 14 Aug 2019 12:27:01 +0000 Parav Pandit > > > > <parav@mellanox.com> wrote: > > > > > > > > > + Jiri, + netdev > > > > > To get perspective on the ndo->phys_port_name for the > > > > > representor netdev > > > > of mdev.
> > > > > > > > > > Hi Cornelia, > > > > > > > > > > > -----Original Message----- > > > > > > From: Cornelia Huck <cohuck@redhat.com> > > > > > > Sent: Wednesday, August 14, 2019 1:32 PM > > > > > > To: Parav Pandit <parav@mellanox.com> > > > > > > Cc: Alex Williamson <alex.williamson@redhat.com>; Kirti > > > > > > Wankhede <kwankhede@nvidia.com>; kvm@vger.kernel.org; linux- > > > > > > kernel@vger.kernel.org; cjia@nvidia.com > > > > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > > > > > > > > > On Wed, 14 Aug 2019 05:54:36 +0000 Parav Pandit > > > > > > <parav@mellanox.com> wrote: > > > > > > > > > > > > > > > I get that part. I prefer to remove the UUID itself from > > > > > > > > > the structure and therefore removing this API makes lot > > > > > > > > > more > > sense? > > > > > > > > > > > > > > > > Mdev and support tools around mdev are based on UUIDs > > > > > > > > because it's > > > > > > defined > > > > > > > > in the documentation. > > > > > > > When we introduce newer device naming scheme, it will update > > > > > > > the > > > > > > documentation also. > > > > > > > May be that is the time to move to .rst format too. > > > > > > > > > > > > You are aware that there are existing tools that expect a uuid > > > > > > naming scheme, right? > > > > > > > > > > > Yes, Alex mentioned too. > > > > > The good tool that I am aware of is [1], which is 4 months old. > > > > > Not sure if it is > > > > part of any distros yet. > > > > > > > > > > README also says, that it is in 'early in development. So we > > > > > have scope to > > > > improve it for non UUID names, but lets discuss that more below. > > > > > > > > The up-to-date reference for mdevctl is > > > > https://github.com/mdevctl/mdevctl. There is currently an effort > > > > to get this packaged in Fedora. > > > > > > > Awesome. 
> > > > > > > > > > > > > > > > > > > > > > > I don't think it's as simple as saying "voila, UUID > > > > > > > > dependencies are removed, users are free to use arbitrary > > > > > > > > strings". We'd need to create some kind of naming policy, > > > > > > > > what characters are allows so that we can potentially > > > > > > > > expand the creation parameters as has been proposed a > > > > > > > > couple times, how do we deal with collisions and races, > > > > > > > > and why should we make such a change when a UUID is a > > > > > > > > perfectly reasonable devices name. Thanks, > > > > > > > > > > > > > > > Sure, we should define a policy on device naming to be more > relaxed. > > > > > > > We have enough examples in-kernel. > > > > > > > Few that I am aware of are netdev (vxlan, macvlan, ipvlan, > > > > > > > lot more), rdma > > > > > > etc which has arbitrary device names and ID based device names. > > > > > > > > > > > > > > Collisions and race is already taken care today in the mdev core. > > > > > > > Same > > > > > > unique device names continue. > > > > > > > > > > > > I'm still completely missing a rationale _why_ uuids are > > > > > > supposedly bad/restricting/etc. > > > > > There is nothing bad about uuid based naming. > > > > > Its just too long name to derive phys_port_name of a netdev. > > > > > In details below. > > > > > > > > > > For a given mdev of networking type, we would like to have > > > > > (a) representor netdevice [2] > > > > > (b) associated devlink port [3] > > > > > > > > > > Currently these representor netdevice exist only for the PCIe SR-IOV > VFs. > > > > > It is further getting extended for mdev without SR-IOV. > > > > > > > > > > Each of the devlink port is attached to representor netdevice [4]. > > > > > > > > > > This netdevice phys_port_name should be a unique derived from > > > > > some > > > > property of mdev. > > > > > Udev/systemd uses phys_port_name to derive unique representor > > > > > netdev > > > > name. 
> > > > > This netdev name is further use by orchestration and switching > > > > > software in > > > > user space. > > > > > One such distro supported switching software is ovs [4], which > > > > > relies on the > > > > persistent device name of the representor netdevice. > > > > > > > > Ok, let me rephrase this to check that I understand this correctly. > > > > I'm not sure about some of the terms you use here (even after > > > > looking at the linked doc/code), but that's probably still ok. > > > > > > > > We want to derive an unique (and probably persistent?) netdev name > > > > so that userspace can refer to a representor netdevice. Makes sense. > > > > For generating that name, udev uses the phys_port_name (which > > > > represents the devlink port, IIUC). Also makes sense. > > > > > > > You understood it correctly. > > > > > > > > > > > > > phys_port_name has limitation to be only 15 characters long. > > > > > UUID doesn't fit in phys_port_name. > > > > > > > > Understood. But why do we need to derive the phys_port_name from > > > > the mdev device name? This netdevice use case seems to be just one > > > > use case for using mdev devices? If this is a specialized mdev > > > > type for this setup, why not just expose a shorter identifier via an extra > attribute? > > > > > > > Representor netdev, represents mdev's switch port (like PCI SRIOV > > > VF's switch > > port). > > > So user must be able to relate this two objects in similar manner as > > > SRIOV > > VFs. > > > Phys_port_name is derived from the PCI PF and VF numbering scheme. > > > Similarly mdev's such port should be derived from mdev's > id/name/attribute. > > > > > > > > Longer UUID names are creating snow ball effect, not just in > > > > > networking stack > > > > but many user space tools too. > > > > > > > > This snowball effect mainly comes from the device name -> > > > > phys_port_name setup, IIUC. > > > > > > > Right. 
> > > > > > > > (as opposed to recently introduced mdevctl, are they more mdev > > > > > tools which has dependency on UUID name?) > > > > > > > > I am aware that people have written scripts etc. to manage their mdevs. > > > > Given that the mdev infrastructure has been around for quite some > > > > time, I'd say the chance of some of those scripts relying on uuid > > > > names is > > non-zero. > > > > > > > Ok. but those scripts have never managed networking devices. > > > So those scripts won't break because they will always create mdev > > > devices > > using UUID. > > > When they use these new networking devices, they need more things > > > than > > their scripts. > > > So user space upgrade for such mixed mode case is reasonable. > > > > Tools like mdevctl are agnostic of the type of mdev device they're > > managing, it shouldn't matter than they've never managed a networking > > mdev previously, it follows the standards of mdev management. > > > > > > > > > > > > Instead of mdev subsystem creating such effect, one option we > > > > > are > > > > considering is to have shorter mdev names. > > > > > (Similar to netdev, rdma, nvme devices). > > > > > Such as mdev1, mdev2000 etc. > > > > Note that these are kernel generated names, as are the other examples. > No. I probably gave the wrong examples. > Mdev user provided names can be 'foo', 'bar', 'foo1'. > > > In the case of mdev, the user is providing the UUID, which becomes the > > device name. When a user writes to the create attribute, there needs > > to be determinism that the user can identify the device they created > > vs another that may have been created concurrently. I don't see that > > we can put users in the path of managing device instance numbers. > No. Its just user provided names. > > > > > > > > Second option I was considering is to have an optional alias for > > > > > UUID based > > > > mdev. > > > > > This name alias is given at time of mdev creation. 
> > > > > Devlink port's phys_port_name is derived out of this shorter > > > > > mdev name > > > > alias. > > > > > This way, mdev remains to be UUID based with optional extension. > > > > > However, I prefer first option to relax mdev naming scheme. > > > > > > > > Actually, I think that second option makes much more sense, as you > > > > avoid potentially breaking existing tooling. > > > Let's first understand of what exactly will break with existing tool > > > if they see non_uuid based device. > > > > Do we really want a mixed namespace of device names, some UUID, some... > > something else? That seems like a mess. > > > So you prefer alias as an attribute? If so, it should be an optional additional > parameter during create time, because it is desired to not invent new callbacks > for such attributes setting and (and rewrite them). > > > > Existing tooling continue to work with UUID devices. > > > Do you have example of what can break if they see non_uuid based > > > device name? I think you are clear, but to be sure, UUID based > > > creation will continue to be there. Optionally mdev will be created > > > with alpha-numeric string, if we don't it as additional attribute. > > > > I'm not onboard with a UUID being just one of the possible naming > > strings via which we can create mdev devices. I think that becomes > > untenable for userspace. I don't think a sufficient argument has been > > made against the alias approach, which seems to keep the UUID as a > > canonical name, providing a consistent namespace, augmented with user or > kernel provided short alias. > > Thanks, > > > If I understand you correctly, you prefer alias name approach to keep UUID > naming scheme intact in mdev? > > > Alex > > > > > > > > > > > > > We want to uniquely identify a device, across different types > > > > > > of vendor drivers. An uuid is a unique identifier and even a > > > > > > well-defined one. Tools (e.g. 
mdevctl) are relying on it for > > > > > > mdev devices > > > > today. > > > > > > > > > > > > What is the problem you're trying to solve? > > > > > Unique device naming is still achieved without UUID scheme by > > > > > various > > > > subsystems in kernel using alpha-numeric string. > > > > > Having such string based continue to provide unique names. > > > > > > > > > > I hope I described the problem and two solutions above. > > > > > > > > > > [1] https://github.com/awilliam/mdevctl > > > > > [2] > > > > > https://elixir.bootlin.com/linux/v5.3-rc4/source/drivers/net/eth > > > > > er > > > > > net/ > > > > > mellanox/mlx5/core/en_rep.c [3] > > > > > http://man7.org/linux/man-pages/man8/devlink-port.8.html > > > > > [4] > > > > > https://elixir.bootlin.com/linux/v5.3-rc4/source/net/core/devlink. > > > > > c#L6 > > > > > 921 > > > > > [5] https://www.openvswitch.org/ > > > > > > > >
Parav Pandit writes: > + Dave. > > Hi Jiri, Dave, Alex, Kirti, Cornelia, > > Please provide your feedback on it, how shall we proceed? > > Hence, I would like to discuss below options. > > Option-1: mdev index > Introduce an optional mdev index/handle as u32 during mdev create time. > User passes mdev index/handle as input. > > phys_port_name=mIndex=m%u > mdev_index will be available in sysfs as mdev attribute for udev to name the mdev's netdev. > > example mdev create command: > UUID=$(uuidgen) > echo $UUID index=10 > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create > example netdevs: > repnetdev=ens2f0_m10 /*ens2f0 is parent PF's netdevice */ > mdev_netdev=enm10 > > Pros: > 1. mdevctl and any other existing tools are unaffected. > 2. netdev stack, ovs and other switching platforms are unaffected. > 3. achieves unique phys_port_name for representor netdev > 4. achieves unique mdev eth netdev name for the mdev using udev/systemd extension. > 5. Aligns well with mdev and netdev subsystem and similar to existing sriov bdf's. > > Option-2: shorter mdev name > Extend mdev to have shorter mdev device name in addition to UUID. > such as 'foo', 'bar'. > Mdev will continue to have UUID. > phys_port_name=mdev_name > > Pros: > 1. All same as option-1, except mdevctl needs upgrade for newer usage. > It is common practice to upgrade iproute2 package along with the kernel. > Similar practice to be done with mdevctl. > 2. Newer users of mdevctl who wants to work with non_UUID names, will use newer mdevctl/tools. > Cons: > 1. Dual naming scheme of mdev might affect some of the existing tools. > It's unclear how/if it actually affects. > mdevctl [2] is very recently developed and can be enhanced for dual naming scheme. > > Option-3: mdev uuid alias > Instead of shorter mdev name or mdev index, have alpha-numeric name alias. > Alias is an optional mdev sysfs attribute such as 'foo', 'bar'. 
> example mdev create command: > UUID=$(uuidgen) > echo $UUID alias=foo > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create > example netdevs: > examle netdevs: > repnetdev = ens2f0_mfoo > mdev_netdev=enmfoo > > Pros: > 1. All same as option-1. > 2. Doesn't affect existing mdev naming scheme. > Cons: > 1. Index scheme of option-1 is better which can number large number of mdevs with fewer characters, simplifying the management tool. I believe that Alex pointed out another "Cons" to all three options, which is that it forces user-space to resolve potential race conditions when creating an index or short name or alias. Also, what happens if `index=10` is not provided on the command-line? Does that make the device unusable for your purpose? -- Cheers, Christophe de Dinechin (IRC c3d)
> -----Original Message----- > From: Christophe de Dinechin <christophe.de.dinechin@gmail.com> > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > Parav Pandit writes: > > > + Dave. > > > > Hi Jiri, Dave, Alex, Kirti, Cornelia, > > > > Please provide your feedback on it, how shall we proceed? > > > > Hence, I would like to discuss below options. > > > > Option-1: mdev index > > Introduce an optional mdev index/handle as u32 during mdev create time. > > User passes mdev index/handle as input. > > > > phys_port_name=mIndex=m%u > > mdev_index will be available in sysfs as mdev attribute for udev to name the > mdev's netdev. > > > > example mdev create command: > > UUID=$(uuidgen) > > echo $UUID index=10 > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create > > example netdevs: > > repnetdev=ens2f0_m10 /*ens2f0 is parent PF's netdevice */ > > mdev_netdev=enm10 > > > > Pros: > > 1. mdevctl and any other existing tools are unaffected. > > 2. netdev stack, ovs and other switching platforms are unaffected. > > 3. achieves unique phys_port_name for representor netdev 4. achieves > > unique mdev eth netdev name for the mdev using udev/systemd extension. > > 5. Aligns well with mdev and netdev subsystem and similar to existing sriov > bdf's. > > > > Option-2: shorter mdev name > > Extend mdev to have shorter mdev device name in addition to UUID. > > such as 'foo', 'bar'. > > Mdev will continue to have UUID. > > phys_port_name=mdev_name > > > > Pros: > > 1. All same as option-1, except mdevctl needs upgrade for newer usage. > > It is common practice to upgrade iproute2 package along with the kernel. > > Similar practice to be done with mdevctl. > > 2. Newer users of mdevctl who wants to work with non_UUID names, will use > newer mdevctl/tools. > > Cons: > > 1. Dual naming scheme of mdev might affect some of the existing tools. > > It's unclear how/if it actually affects. 
> > mdevctl [2] is very recently developed and can be enhanced for dual naming
> > scheme.
> >
> > Option-3: mdev uuid alias
> > Instead of shorter mdev name or mdev index, have alpha-numeric name alias.
> > Alias is an optional mdev sysfs attribute such as 'foo', 'bar'.
> > example mdev create command:
> > UUID=$(uuidgen)
> > echo $UUID alias=foo > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create
> > example netdevs:
> > repnetdev = ens2f0_mfoo
> > mdev_netdev=enmfoo
> >
> > Pros:
> > 1. All same as option-1.
> > 2. Doesn't affect existing mdev naming scheme.
> > Cons:
> > 1. Index scheme of option-1 is better which can number large number of
> > mdevs with fewer characters, simplifying the management tool.
>
> I believe that Alex pointed out another "Cons" to all three options, which is that
> it forces user-space to resolve potential race conditions when creating an index
> or short name or alias.
>
This race condition exists for at least two subsystems that I know of, i.e. netdev and rdma.
If a device with a given name already exists, the subsystem returns an error.
When user space gets the error code EEXIST, it can pick a different identifier.

> Also, what happens if `index=10` is not provided on the command-line?
> Does that make the device unusable for your purpose?
Yes, it is unusable to an extent.
Currently we have DEVLINK_PORT_FLAVOUR_PCI_VF in include/uapi/linux/devlink.h.
Similarly, we need DEVLINK_PORT_FLAVOUR_MDEV for mdev eswitch ports.
This port flavour needs to generate phys_port_name(). This should be driven by a user parameter, because the representor netdevice name is generated from that parameter.
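The EEXIST handling described above can be sketched from the user-space side roughly as follows. This is only an illustration under stated assumptions: `FakeSubsystem` is a hypothetical in-memory stand-in for a kernel subsystem that rejects duplicate names, and `create_with_retry` is a made-up helper, not an existing mdev, netdev, or rdma interface.

```python
import errno


class FakeSubsystem:
    """Hypothetical stand-in for a subsystem that rejects duplicate names."""

    def __init__(self):
        self._names = set()

    def create(self, name):
        # A name that is already taken fails with EEXIST, as described above.
        if name in self._names:
            raise FileExistsError(errno.EEXIST, "name already in use", name)
        self._names.add(name)
        return name


def create_with_retry(subsystem, candidates):
    """Try each candidate identifier in order; return the first one accepted."""
    for name in candidates:
        try:
            return subsystem.create(name)
        except FileExistsError:
            continue  # another creator got there first; try the next identifier
    raise RuntimeError("all candidate identifiers are taken")


subsystem = FakeSubsystem()
subsystem.create("m10")  # simulate a concurrent user that already took index 10
chosen = create_with_retry(subsystem, (f"m{i}" for i in range(10, 13)))
print(chosen)
```

The point of the sketch is only that the collision is resolved by the caller, which is the burden on user space that the objection is about.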
On Tue, 20 Aug 2019 11:25:05 +0000 Parav Pandit <parav@mellanox.com> wrote: > > -----Original Message----- > > From: Christophe de Dinechin <christophe.de.dinechin@gmail.com> > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > > > Parav Pandit writes: > > > > > + Dave. > > > > > > Hi Jiri, Dave, Alex, Kirti, Cornelia, > > > > > > Please provide your feedback on it, how shall we proceed? > > > > > > Hence, I would like to discuss below options. > > > > > > Option-1: mdev index > > > Introduce an optional mdev index/handle as u32 during mdev create time. > > > User passes mdev index/handle as input. > > > > > > phys_port_name=mIndex=m%u > > > mdev_index will be available in sysfs as mdev attribute for udev to name the > > mdev's netdev. > > > > > > example mdev create command: > > > UUID=$(uuidgen) > > > echo $UUID index=10 > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create > > > example netdevs: > > > repnetdev=ens2f0_m10 /*ens2f0 is parent PF's netdevice */ > > > mdev_netdev=enm10 > > > > > > Pros: > > > 1. mdevctl and any other existing tools are unaffected. > > > 2. netdev stack, ovs and other switching platforms are unaffected. > > > 3. achieves unique phys_port_name for representor netdev 4. achieves > > > unique mdev eth netdev name for the mdev using udev/systemd extension. > > > 5. Aligns well with mdev and netdev subsystem and similar to existing sriov > > bdf's. > > > > > > Option-2: shorter mdev name > > > Extend mdev to have shorter mdev device name in addition to UUID. > > > such as 'foo', 'bar'. > > > Mdev will continue to have UUID. I fail to understand how 'uses uuid' and 'allow shorter device name' are supposed to play together? > > > phys_port_name=mdev_name > > > > > > Pros: > > > 1. All same as option-1, except mdevctl needs upgrade for newer usage. > > > It is common practice to upgrade iproute2 package along with the kernel. > > > Similar practice to be done with mdevctl. > > > 2. 
Newer users of mdevctl who wants to work with non_UUID names, will use > > newer mdevctl/tools. > > > Cons: > > > 1. Dual naming scheme of mdev might affect some of the existing tools. > > > It's unclear how/if it actually affects. > > > mdevctl [2] is very recently developed and can be enhanced for dual naming > > scheme. The main problem is not tools we know about (i.e. mdevctl), but those we don't know about. IOW, this (and the IFNAMESIZ change, which seems even worse) are the options I would not want at all. > > > > > > Option-3: mdev uuid alias > > > Instead of shorter mdev name or mdev index, have alpha-numeric name > > alias. > > > Alias is an optional mdev sysfs attribute such as 'foo', 'bar'. > > > example mdev create command: > > > UUID=$(uuidgen) > > > echo $UUID alias=foo > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create > > > example netdevs: > > > examle netdevs: > > > repnetdev = ens2f0_mfoo > > > mdev_netdev=enmfoo > > > > > > Pros: > > > 1. All same as option-1. > > > 2. Doesn't affect existing mdev naming scheme. > > > Cons: > > > 1. Index scheme of option-1 is better which can number large number of > > mdevs with fewer characters, simplifying the management tool. > > > > I believe that Alex pointed out another "Cons" to all three options, which is that > > it forces user-space to resolve potential race conditions when creating an index > > or short name or alias. > > > This race condition exists for at least two subsystems that I know of, i.e. netdev and rdma. > If a device with a given name exists, subsystem returns error. > When user space gets error code EEXIST, and it can picks up different identifier(s). If you decouple device creation and setting the alias/index, you make the issue visible and thus much more manageable. > > > Also, what happens if `index=10` is not provided on the command-line? > > Does that make the device unusable for your purpose? > Yes, it is unusable to an extent. 
> Currently we have DEVLINK_PORT_FLAVOUR_PCI_VF in include/uapi/linux/devlink.h > Similar to it, we need to have DEVLINK_PORT_FLAVOUR_MDEV for mdev eswitch ports. > This port flavour needs to generate phys_port_name(). This should be user parameter driven. > Because representor netdevice name is generated based on this parameter. I'm also unsure how the extra parameter is supposed to work; writing it to the create attribute does not sound right. mdevctl supports setting additional parameters on an already created device (see the examples provided for vfio-ap), so going that route would actually work out of the box from the tooling side. What you would need is some kind of synchronization/locking to make sure that you only link up to the other device after the extra attribute has been set and that you don't allow to change it as long as it is associated with the other side. I do not know enough about the actual devices to suggest something here; if you need userspace cooperation, maybe uevents would be an option.
On Tue, 20 Aug 2019 08:58:02 +0000 Parav Pandit <parav@mellanox.com> wrote: > + Dave. > > Hi Jiri, Dave, Alex, Kirti, Cornelia, > > Please provide your feedback on it, how shall we proceed? > > Short summary of requirements. > For a given mdev (mediated device [1]), there is one representor > netdevice and devlink port in switchdev mode (similar to SR-IOV VF), > And there is one netdevice for the actual mdev when mdev is probed. > > (a) representor netdev and devlink port should be able derive > phys_port_name(). So that representor netdev name can be built > deterministically across reboots. > > (b) for mdev's netdevice, mdev's device should have an attribute. > This attribute can be used by udev rules/systemd or something else to > rename netdev name deterministically. > > (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID. > A simple grep IFNAMSIZ in stack hints hundreds of users of IFNAMSIZ > in drivers, uapi, netlink, boot config area and more. Changing > IFNAMSIZ for a mdev bus doesn't really look reasonable option to me. How many characters do we really have to work with? Your examples below prepend various characters, ex. option-1 results in ens2f0_m10 or enm10. Do the extra 8 or 3 characters in these count against IFNAMSIZ? > Hence, I would like to discuss below options. > > Option-1: mdev index > Introduce an optional mdev index/handle as u32 during mdev create > time. User passes mdev index/handle as input. > > phys_port_name=mIndex=m%u > mdev_index will be available in sysfs as mdev attribute for udev to > name the mdev's netdev. > > example mdev create command: > UUID=$(uuidgen) > echo $UUID index=10 > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create Nit, IIRC previous discussions of additional parameters used comma separators, ex. echo $UUID,index=10 >... > > example netdevs: > repnetdev=ens2f0_m10 /*ens2f0 is parent PF's netdevice */ Is the parent really relevant in the name? 
Tools like mdevctl are meant to provide persistence, creating the same mdev devices on the same parent, but that's simply the easiest policy decision. We can also imagine that multiple parent devices might support a specified mdev type and policies factoring in proximity, load-balancing, power consumption, etc might be weighed such that we really don't want to promote userspace creating dependencies on the parent association. > mdev_netdev=enm10 > > Pros: > 1. mdevctl and any other existing tools are unaffected. > 2. netdev stack, ovs and other switching platforms are unaffected. > 3. achieves unique phys_port_name for representor netdev > 4. achieves unique mdev eth netdev name for the mdev using > udev/systemd extension. 5. Aligns well with mdev and netdev subsystem > and similar to existing sriov bdf's. A user provided index seems strange to me. It's not really an index, just a user specified instance number. Presumably you have the user providing this because if it really were an index, then the value depends on the creation order and persistence is lost. Now the user needs to both avoid uuid collision as well as "index" number collision. The uuid namespace is large enough to mostly ignore this, but this is not. This seems like a burden. > Option-2: shorter mdev name > Extend mdev to have shorter mdev device name in addition to UUID. > such as 'foo', 'bar'. > Mdev will continue to have UUID. > phys_port_name=mdev_name > > Pros: > 1. All same as option-1, except mdevctl needs upgrade for newer usage. > It is common practice to upgrade iproute2 package along with the > kernel. Similar practice to be done with mdevctl. > 2. Newer users of mdevctl who wants to work with non_UUID names, will > use newer mdevctl/tools. Cons: > 1. Dual naming scheme of mdev might affect some of the existing tools. > It's unclear how/if it actually affects. > mdevctl [2] is very recently developed and can be enhanced for dual > naming scheme. 
I think we've already nak'ed this one, the device namespace becomes meaningless if the name becomes just a string where a uuid might be an example string. mdevs are named by uuid. > Option-3: mdev uuid alias > Instead of shorter mdev name or mdev index, have alpha-numeric name > alias. Alias is an optional mdev sysfs attribute such as 'foo', 'bar'. > example mdev create command: > UUID=$(uuidgen) > echo $UUID alias=foo > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create > > example netdevs: > examle netdevs: > repnetdev = ens2f0_mfoo > mdev_netdev=enmfoo > > Pros: > 1. All same as option-1. > 2. Doesn't affect existing mdev naming scheme. > Cons: > 1. Index scheme of option-1 is better which can number large number > of mdevs with fewer characters, simplifying the management tool. No better than option-1, simply a larger secondary namespace, but still requires the user to come up with two independent names for the device. > Option-4: extend IFNAMESZ to be 64 bytes Extended IFNAMESZ from 16 to > 64 bytes phys_port_name=mdev_UUID_string mdev_netdev_name=enmUUID > > Pros: > 1. Doesn't require mdev extension > Cons: > 1. netdev stack, driver, uapi, user space, boot config wide changes > 2. Possible user space extensions who assumed name size being 16 > characters 3. Single device type demands namesize change for all > netdev types What about an alias based on the uuid? For example, we use 160-bit sha1s daily with git (uuids are only 128-bit), but we generally don't reference git commits with the full 20 character string. Generally 12 characters is recommended to avoid ambiguity. Could mdev automatically create an abbreviated sha1 alias for the device? If so, how many characters should we use and what do we do on collision? The colliding device could add enough alias characters to disambiguate (we likely couldn't re-alias the existing device to disambiguate, but I'm not sure it matters, userspace has sysfs to associate aliases). Ex. 
UUID=$(uuidgen)
ALIAS=$(echo $UUID | sha1sum | colrm 13)

Since there seems to be some prefix overhead, as I ask about above in how many characters we actually have to work with in IFNAMSIZ, maybe we start with 8 characters (matching your "index" namespace) and expand as necessary for disambiguation. If we can eliminate overhead in IFNAMSIZ, let's start with 12.

Thanks,
Alex
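A minimal sketch of the abbreviated-sha1 alias idea, including the "add characters on collision" rule, might look like the following. It is an illustration only: `sha1_hex` and `pick_alias` are hypothetical helper names, the `taken` set stands in for whatever the mdev core would track, and the UUID is an arbitrary example value.

```python
import hashlib


def sha1_hex(uuid_str):
    # `echo $UUID` appends a newline, so hash "uuid\n" to match the shell pipeline.
    return hashlib.sha1((uuid_str + "\n").encode("ascii")).hexdigest()


def pick_alias(uuid_str, taken, start_len=12):
    """Return the shortest SHA-1 prefix (at least start_len long) not in `taken`."""
    digest = sha1_hex(uuid_str)  # 40 hexadecimal characters
    for length in range(start_len, len(digest) + 1):
        alias = digest[:length]
        if alias not in taken:
            return alias
    raise ValueError("full digest collides with an existing alias")


uuid_str = "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001"  # example UUID only
taken = set()

first = pick_alias(uuid_str, taken)   # 12 characters, like `colrm 13`
taken.add(first)

# Simulate another device whose digest shares the same 12-character prefix:
# the newcomer gets one more character, the existing alias is left untouched.
second = pick_alias(uuid_str, taken)
print(first, second)
```

As in the proposal, the existing device keeps its alias and only the colliding newcomer is lengthened.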
On Tue, 20 Aug 2019 11:19:04 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> What about an alias based on the uuid? For example, we use 160-bit
> sha1s daily with git (uuids are only 128-bit), but we generally don't
> reference git commits with the full 20 character string. Generally 12
> characters is recommended to avoid ambiguity. Could mdev automatically
> create an abbreviated sha1 alias for the device? If so, how many
> characters should we use and what do we do on collision? The colliding
> device could add enough alias characters to disambiguate (we likely
> couldn't re-alias the existing device to disambiguate, but I'm not sure
> it matters, userspace has sysfs to associate aliases). Ex.
>
> UUID=$(uuidgen)
> ALIAS=$(echo $UUID | sha1sum | colrm 13)
>
> Since there seems to be some prefix overhead, as I ask about above in
> how many characters we actually have to work with in IFNAMSIZ, maybe we
> start with 8 characters (matching your "index" namespace) and expand as
> necessary for disambiguation. If we can eliminate overhead in
> IFNAMSIZ, let's start with 12. Thanks,
>
> Alex

I really like that idea, and it seems the best option proposed yet, as we don't need to create a secondary identifier.
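On the question of how many characters are actually available: with IFNAMSIZ at 16 bytes and one byte for the terminating NUL, 15 characters remain for the whole netdev name. The arithmetic for the two example prefixes used in this thread can be sketched as below; `alias_budget` is a hypothetical helper name used only for illustration.

```python
IFNAMSIZ = 16                # bytes, including the terminating NUL
MAX_NAME_LEN = IFNAMSIZ - 1  # usable characters in a netdev name


def alias_budget(prefix):
    """Characters left for an index or alias after a fixed name prefix."""
    return MAX_NAME_LEN - len(prefix)


# "ens2f0_m" is the representor prefix and "enm" the mdev netdev prefix
# taken from the examples in this thread.
for prefix in ("ens2f0_m", "enm"):
    print(prefix, alias_budget(prefix))
```

Under these assumptions the representor prefix leaves 7 characters and the mdev netdev prefix leaves 12.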
> -----Original Message----- > From: Cornelia Huck <cohuck@redhat.com> > Sent: Tuesday, August 20, 2019 10:01 PM > > > > Option-1: mdev index > > > > Introduce an optional mdev index/handle as u32 during mdev create > time. > > > > User passes mdev index/handle as input. > > > > > > > > phys_port_name=mIndex=m%u > > > > mdev_index will be available in sysfs as mdev attribute for udev > > > > to name the > > > mdev's netdev. > > > > > > > > example mdev create command: > > > > UUID=$(uuidgen) > > > > echo $UUID index=10 > > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create > > > > example netdevs: > > > > repnetdev=ens2f0_m10 /*ens2f0 is parent PF's netdevice */ > > > > mdev_netdev=enm10 > > > > > > > > Pros: > > > > 1. mdevctl and any other existing tools are unaffected. > > > > 2. netdev stack, ovs and other switching platforms are unaffected. > > > > 3. achieves unique phys_port_name for representor netdev 4. > > > > achieves unique mdev eth netdev name for the mdev using udev/systemd > extension. > > > > 5. Aligns well with mdev and netdev subsystem and similar to > > > > existing sriov > > > bdf's. > > > > > > > > Option-2: shorter mdev name > > > > Extend mdev to have shorter mdev device name in addition to UUID. > > > > such as 'foo', 'bar'. > > > > Mdev will continue to have UUID. > > I fail to understand how 'uses uuid' and 'allow shorter device name' > are supposed to play together? > Each mdev will have uuid as today. Instead of naming device based on UUID, name it based on explicit name given by the user. Again, I want to repeat, this name parameter is optional. > > > > phys_port_name=mdev_name > > > > > > > > Pros: > > > > 1. All same as option-1, except mdevctl needs upgrade for newer usage. > > > > It is common practice to upgrade iproute2 package along with the kernel. > > > > Similar practice to be done with mdevctl. > > > > 2. 
Newer users of mdevctl who wants to work with non_UUID names, > > > > will use > > > newer mdevctl/tools. > > > > Cons: > > > > 1. Dual naming scheme of mdev might affect some of the existing tools. > > > > It's unclear how/if it actually affects. > > > > mdevctl [2] is very recently developed and can be enhanced for > > > > dual naming > > > scheme. > > The main problem is not tools we know about (i.e. mdevctl), but those we don't > know about. > Well, if it not part of the distros, there is very little can do about it by kernel. I tried mdevctl with mdev named using non UUID and it were able to list them. > IOW, this (and the IFNAMESIZ change, which seems even worse) are the > options I would not want at all. > Ok. > > > > > > > > Option-3: mdev uuid alias > > > > Instead of shorter mdev name or mdev index, have alpha-numeric > > > > name > > > alias. > > > > Alias is an optional mdev sysfs attribute such as 'foo', 'bar'. > > > > example mdev create command: > > > > UUID=$(uuidgen) > > > > echo $UUID alias=foo > > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create > > > > example netdevs: > > > > examle netdevs: > > > > repnetdev = ens2f0_mfoo > > > > mdev_netdev=enmfoo > > > > > > > > Pros: > > > > 1. All same as option-1. > > > > 2. Doesn't affect existing mdev naming scheme. > > > > Cons: > > > > 1. Index scheme of option-1 is better which can number large > > > > number of > > > mdevs with fewer characters, simplifying the management tool. > > > > > > I believe that Alex pointed out another "Cons" to all three options, > > > which is that it forces user-space to resolve potential race > > > conditions when creating an index or short name or alias. > > > > > This race condition exists for at least two subsystems that I know of, i.e. > netdev and rdma. > > If a device with a given name exists, subsystem returns error. > > When user space gets error code EEXIST, and it can picks up different > identifier(s). 
>
> If you decouple device creation and setting the alias/index, you make the issue
> visible and thus much more manageable.
>
I thought about it. It has two issues.
1. The user should be able to set this only once. Allowing it to be set repeatedly requires handling the change and notifying about it.
2. Setting the alias would translate into creating the devlink port, which doesn't sound correct, because if the user attempts to reset it to a different value, that requires unregistration and re-registration.
Handling all such race conditions is not worth it.
So setting the index (I like Alex's term 'instance number' more) at instance creation time is a lot simpler.

> >
> > > Also, what happens if `index=10` is not provided on the command-line?
> > > Does that make the device unusable for your purpose?
> > Yes, it is unusable to an extent.
> > Currently we have DEVLINK_PORT_FLAVOUR_PCI_VF in
> > include/uapi/linux/devlink.h Similar to it, we need to have
> > DEVLINK_PORT_FLAVOUR_MDEV for mdev eswitch ports.
> > This port flavour needs to generate phys_port_name(). This should be user
> > parameter driven.
> > Because representor netdevice name is generated based on this parameter.
>
> I'm also unsure how the extra parameter is supposed to work; writing it to the
> create attribute does not sound right.
>
Why?
When you create a device, it takes multiple mandatory and optional parameters.
This is common for netdev (vxlan, vlan, macvlan, ipvlan, gre and more).

> mdevctl supports setting additional parameters on an already created device
> (see the examples provided for vfio-ap), so going that route would actually
> work out of the box from the tooling side.
>
As explained above, setting and re-setting an attribute for what is an instance-creation-time value is not worth it.

> What you would need is some kind of synchronization/locking to make sure that
> you only link up to the other device after the extra attribute has been set and
> that you don't allow to change it as long as it is associated with the other side.
I do not know enough about the actual devices to suggest something here; if you need userspace cooperation, maybe uevents would be an option.
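The EEXIST handling described above (user space picks an identifier, and on a "name already exists" error tries another one) can be sketched in shell. This is only an illustration: the registry directory, `claim_instance` and `pick_instance` are hypothetical stand-ins for the kernel-side namespace, and nothing here writes to the real mdev `create` attribute.

```shell
#!/bin/sh
# Sketch only: a temporary directory stands in for the per-parent namespace
# of instance numbers. mkdir is atomic and fails if the entry already
# exists, which plays the role of the EEXIST error mentioned in the thread.
REGISTRY=$(mktemp -d)

# claim_instance N: succeed only if instance number N is still free.
claim_instance() {
    mkdir "$REGISTRY/$1" 2>/dev/null
}

# pick_instance START: try START, START+1, ... until one can be claimed,
# then print the number that was claimed.
pick_instance() {
    n=$1
    until claim_instance "$n"; do
        n=$((n + 1))
    done
    echo "$n"
}

first=$(pick_instance 10)    # 10 is free, so it is claimed
second=$(pick_instance 10)   # 10 is now taken, so the caller falls back to 11
echo "first=$first second=$second"

rm -rf "$REGISTRY"
```

Because the claim itself is the atomic step, two callers racing for the same number cannot both succeed; the loser simply retries with a different value.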
> -----Original Message----- > From: Alex Williamson <alex.williamson@redhat.com> > Sent: Tuesday, August 20, 2019 10:49 PM > To: Parav Pandit <parav@mellanox.com> > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>; > Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck > <cohuck@redhat.com>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org; > cjia <cjia@nvidia.com>; netdev@vger.kernel.org > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > On Tue, 20 Aug 2019 08:58:02 +0000 > Parav Pandit <parav@mellanox.com> wrote: > > > + Dave. > > > > Hi Jiri, Dave, Alex, Kirti, Cornelia, > > > > Please provide your feedback on it, how shall we proceed? > > > > Short summary of requirements. > > For a given mdev (mediated device [1]), there is one representor > > netdevice and devlink port in switchdev mode (similar to SR-IOV VF), > > And there is one netdevice for the actual mdev when mdev is probed. > > > > (a) representor netdev and devlink port should be able derive > > phys_port_name(). So that representor netdev name can be built > > deterministically across reboots. > > > > (b) for mdev's netdevice, mdev's device should have an attribute. > > This attribute can be used by udev rules/systemd or something else to > > rename netdev name deterministically. > > > > (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID. > > A simple grep IFNAMSIZ in stack hints hundreds of users of IFNAMSIZ in > > drivers, uapi, netlink, boot config area and more. Changing IFNAMSIZ > > for a mdev bus doesn't really look reasonable option to me. > > How many characters do we really have to work with? Your examples below > prepend various characters, ex. option-1 results in ens2f0_m10 or enm10. Do > the extra 8 or 3 characters in these count against IFNAMSIZ? > Maximum 15. Last is null termination. Some udev rules setting by user prefix the PF netdev interface. I took such example below where ens2f0 netdev named is prefixed. 
Some prefer not to prefix. > > Hence, I would like to discuss below options. > > > > Option-1: mdev index > > Introduce an optional mdev index/handle as u32 during mdev create > > time. User passes mdev index/handle as input. > > > > phys_port_name=mIndex=m%u > > mdev_index will be available in sysfs as mdev attribute for udev to > > name the mdev's netdev. > > > > example mdev create command: > > UUID=$(uuidgen) > > echo $UUID index=10 > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create > > Nit, IIRC previous discussions of additional parameters used comma separators, > ex. echo $UUID,index=10 >... > Yes, ok. > > > example netdevs: > > repnetdev=ens2f0_m10 /*ens2f0 is parent PF's netdevice */ > > Is the parent really relevant in the name? No. I just picked one udev example who prefixed the parent netdev name. But there are users who do not prefix it. > Tools like mdevctl are meant to > provide persistence, creating the same mdev devices on the same parent, but > that's simply the easiest policy decision. We can also imagine that multiple > parent devices might support a specified mdev type and policies factoring in > proximity, load-balancing, power consumption, etc might be weighed such that > we really don't want to promote userspace creating dependencies on the > parent association. > > > mdev_netdev=enm10 > > > > Pros: > > 1. mdevctl and any other existing tools are unaffected. > > 2. netdev stack, ovs and other switching platforms are unaffected. > > 3. achieves unique phys_port_name for representor netdev 4. achieves > > unique mdev eth netdev name for the mdev using udev/systemd extension. > > 5. Aligns well with mdev and netdev subsystem and similar to existing > > sriov bdf's. > > A user provided index seems strange to me. It's not really an index, just a user > specified instance number. 
Presumably you have the user providing this > because if it really were an index, then the value depends on the creation order > and persistence is lost. Now the user needs to both avoid uuid collision as well > as "index" number collision. The uuid namespace is large enough to mostly > ignore this, but this is not. This seems like a burden. > I liked the term 'instance number', which is lot better way to say than index/handle. Yes, user needs to avoid both the collision. UUID collision should not occur in most cases, they way UUID are generated. So practically users needs to pick unique 'instance number', similar to how it picks unique netdev names. Burden to user comes from the requirement to get uniqueness. > > Option-2: shorter mdev name > > Extend mdev to have shorter mdev device name in addition to UUID. > > such as 'foo', 'bar'. > > Mdev will continue to have UUID. > > phys_port_name=mdev_name > > > > Pros: > > 1. All same as option-1, except mdevctl needs upgrade for newer usage. > > It is common practice to upgrade iproute2 package along with the > > kernel. Similar practice to be done with mdevctl. > > 2. Newer users of mdevctl who wants to work with non_UUID names, will > > use newer mdevctl/tools. Cons: > > 1. Dual naming scheme of mdev might affect some of the existing tools. > > It's unclear how/if it actually affects. > > mdevctl [2] is very recently developed and can be enhanced for dual > > naming scheme. > > I think we've already nak'ed this one, the device namespace becomes > meaningless if the name becomes just a string where a uuid might be an > example string. mdevs are named by uuid. > > > Option-3: mdev uuid alias > > Instead of shorter mdev name or mdev index, have alpha-numeric name > > alias. Alias is an optional mdev sysfs attribute such as 'foo', 'bar'. 
> > example mdev create command: > > UUID=$(uuidgen) > > echo $UUID alias=foo > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create > > > example netdevs: > > examle netdevs: > > repnetdev = ens2f0_mfoo > > mdev_netdev=enmfoo > > > > Pros: > > 1. All same as option-1. > > 2. Doesn't affect existing mdev naming scheme. > > Cons: > > 1. Index scheme of option-1 is better which can number large number of > > mdevs with fewer characters, simplifying the management tool. > > No better than option-1, simply a larger secondary namespace, but still > requires the user to come up with two independent names for the device. > > > Option-4: extend IFNAMESZ to be 64 bytes Extended IFNAMESZ from 16 to > > 64 bytes phys_port_name=mdev_UUID_string mdev_netdev_name=enmUUID > > > > Pros: > > 1. Doesn't require mdev extension > > Cons: > > 1. netdev stack, driver, uapi, user space, boot config wide changes 2. > > Possible user space extensions who assumed name size being 16 > > characters 3. Single device type demands namesize change for all > > netdev types > > What about an alias based on the uuid? For example, we use 160-bit sha1s > daily with git (uuids are only 128-bit), but we generally don't reference git > commits with the full 20 character string. Generally 12 characters is > recommended to avoid ambiguity. Could mdev automatically create an > abbreviated sha1 alias for the device? If so, how many characters should we > use and what do we do on collision? The colliding device could add enough > alias characters to disambiguate (we likely couldn't re-alias the existing device > to disambiguate, but I'm not sure it matters, userspace has sysfs to associate > aliases). Ex. > > UUID=$(uuidgen) > ALIAS=$(echo $UUID | sha1sum | colrm 13) > I explained in previous reply to Cornelia, we should set UUID and ALIAS at the same time. Setting is via different sysfs attribute is lot code burden with no extra benefit. 
> Since there seems to be some prefix overhead, as I ask about above in how
> many characters we actually have to work with in IFNAMESZ, maybe we start
> with 8 characters (matching your "index" namespace) and expand as necessary
> for disambiguation. If we can eliminate overhead in IFNAMESZ, let's start with
> 12. Thanks,
>
If user is going to choose the alias, why does it have to be limited to sha1?
Or you just told it as an example?

It can be an alpha-numeric string.
Instead of mdev imposing number of characters on the alias, it should be best left to the user.
Because in future if netdev improves on the naming scheme, mdev will be limiting it, which is not right.
So not restricting alias size seems right to me.
User configuring mdev for networking devices in a given kernel knows what user is doing.
So user can choose alias name size as it finds suitable.

> Alex
> -----Original Message-----
> From: Cornelia Huck <cohuck@redhat.com>
> Sent: Tuesday, August 20, 2019 11:25 PM
> To: Alex Williamson <alex.williamson@redhat.com>
> Cc: Parav Pandit <parav@mellanox.com>; Jiri Pirko <jiri@mellanox.com>;
> David S . Miller <davem@davemloft.net>; Kirti Wankhede
> <kwankhede@nvidia.com>; kvm@vger.kernel.org;
> linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>; netdev@vger.kernel.org
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> On Tue, 20 Aug 2019 11:19:04 -0600
> Alex Williamson <alex.williamson@redhat.com> wrote:
>
> > What about an alias based on the uuid? For example, we use 160-bit
> > sha1s daily with git (uuids are only 128-bit), but we generally don't
> > reference git commits with the full 20 character string. Generally 12
> > characters is recommended to avoid ambiguity. Could mdev
> > automatically create an abbreviated sha1 alias for the device? If so,
> > how many characters should we use and what do we do on collision? The
> > colliding device could add enough alias characters to disambiguate (we
> > likely couldn't re-alias the existing device to disambiguate, but I'm
> > not sure it matters, userspace has sysfs to associate aliases). Ex.
> >
> > UUID=$(uuidgen)
> > ALIAS=$(echo $UUID | sha1sum | colrm 13)
> >
> > Since there seems to be some prefix overhead, as I ask about above in
> > how many characters we actually have to work with in IFNAMESZ, maybe
> > we start with 8 characters (matching your "index" namespace) and
> > expand as necessary for disambiguation. If we can eliminate overhead
> > in IFNAMESZ, let's start with 12. Thanks,
> >
> > Alex
>
> I really like that idea, and it seems the best option proposed yet, as we don't
> need to create a secondary identifier.

User setting this alias at mdev creation time and exposed via sysfs as read only attribute works.
Exposing that as
const char *mdev_alias(struct mdev_device *dev) to vendor drivers..
On Wed, 21 Aug 2019 03:42:25 +0000 Parav Pandit <parav@mellanox.com> wrote: > > -----Original Message----- > > From: Alex Williamson <alex.williamson@redhat.com> > > Sent: Tuesday, August 20, 2019 10:49 PM > > To: Parav Pandit <parav@mellanox.com> > > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>; > > Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck > > <cohuck@redhat.com>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org; > > cjia <cjia@nvidia.com>; netdev@vger.kernel.org > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > On Tue, 20 Aug 2019 08:58:02 +0000 > > Parav Pandit <parav@mellanox.com> wrote: > > > > > + Dave. > > > > > > Hi Jiri, Dave, Alex, Kirti, Cornelia, > > > > > > Please provide your feedback on it, how shall we proceed? > > > > > > Short summary of requirements. > > > For a given mdev (mediated device [1]), there is one representor > > > netdevice and devlink port in switchdev mode (similar to SR-IOV VF), > > > And there is one netdevice for the actual mdev when mdev is probed. > > > > > > (a) representor netdev and devlink port should be able derive > > > phys_port_name(). So that representor netdev name can be built > > > deterministically across reboots. > > > > > > (b) for mdev's netdevice, mdev's device should have an attribute. > > > This attribute can be used by udev rules/systemd or something else to > > > rename netdev name deterministically. > > > > > > (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID. > > > A simple grep IFNAMSIZ in stack hints hundreds of users of IFNAMSIZ in > > > drivers, uapi, netlink, boot config area and more. Changing IFNAMSIZ > > > for a mdev bus doesn't really look reasonable option to me. > > > > How many characters do we really have to work with? Your examples below > > prepend various characters, ex. option-1 results in ens2f0_m10 or enm10. Do > > the extra 8 or 3 characters in these count against IFNAMSIZ? > > > Maximum 15. 
Last is null termination. > Some udev rules setting by user prefix the PF netdev interface. I took such example below where ens2f0 netdev named is prefixed. > Some prefer not to prefix. > > > > Hence, I would like to discuss below options. > > > > > > Option-1: mdev index > > > Introduce an optional mdev index/handle as u32 during mdev create > > > time. User passes mdev index/handle as input. > > > > > > phys_port_name=mIndex=m%u > > > mdev_index will be available in sysfs as mdev attribute for udev to > > > name the mdev's netdev. > > > > > > example mdev create command: > > > UUID=$(uuidgen) > > > echo $UUID index=10 > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create > > > > Nit, IIRC previous discussions of additional parameters used comma separators, > > ex. echo $UUID,index=10 >... > > > Yes, ok. > > > > > example netdevs: > > > repnetdev=ens2f0_m10 /*ens2f0 is parent PF's netdevice */ > > > > Is the parent really relevant in the name? > No. I just picked one udev example who prefixed the parent netdev name. > But there are users who do not prefix it. > > > Tools like mdevctl are meant to > > provide persistence, creating the same mdev devices on the same parent, but > > that's simply the easiest policy decision. We can also imagine that multiple > > parent devices might support a specified mdev type and policies factoring in > > proximity, load-balancing, power consumption, etc might be weighed such that > > we really don't want to promote userspace creating dependencies on the > > parent association. > > > > > mdev_netdev=enm10 > > > > > > Pros: > > > 1. mdevctl and any other existing tools are unaffected. > > > 2. netdev stack, ovs and other switching platforms are unaffected. > > > 3. achieves unique phys_port_name for representor netdev 4. achieves > > > unique mdev eth netdev name for the mdev using udev/systemd extension. > > > 5. Aligns well with mdev and netdev subsystem and similar to existing > > > sriov bdf's. 
> > > > A user provided index seems strange to me. It's not really an index, just a user > > specified instance number. Presumably you have the user providing this > > because if it really were an index, then the value depends on the creation order > > and persistence is lost. Now the user needs to both avoid uuid collision as well > > as "index" number collision. The uuid namespace is large enough to mostly > > ignore this, but this is not. This seems like a burden. > > > I liked the term 'instance number', which is lot better way to say than index/handle. > Yes, user needs to avoid both the collision. > UUID collision should not occur in most cases, they way UUID are generated. > So practically users needs to pick unique 'instance number', similar to how it picks unique netdev names. > > Burden to user comes from the requirement to get uniqueness. > > > > Option-2: shorter mdev name > > > Extend mdev to have shorter mdev device name in addition to UUID. > > > such as 'foo', 'bar'. > > > Mdev will continue to have UUID. > > > phys_port_name=mdev_name > > > > > > Pros: > > > 1. All same as option-1, except mdevctl needs upgrade for newer usage. > > > It is common practice to upgrade iproute2 package along with the > > > kernel. Similar practice to be done with mdevctl. > > > 2. Newer users of mdevctl who wants to work with non_UUID names, will > > > use newer mdevctl/tools. Cons: > > > 1. Dual naming scheme of mdev might affect some of the existing tools. > > > It's unclear how/if it actually affects. > > > mdevctl [2] is very recently developed and can be enhanced for dual > > > naming scheme. > > > > I think we've already nak'ed this one, the device namespace becomes > > meaningless if the name becomes just a string where a uuid might be an > > example string. mdevs are named by uuid. > > > > > Option-3: mdev uuid alias > > > Instead of shorter mdev name or mdev index, have alpha-numeric name > > > alias. 
Alias is an optional mdev sysfs attribute such as 'foo', 'bar'. > > > example mdev create command: > > > UUID=$(uuidgen) > > > echo $UUID alias=foo > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create > > > > example netdevs: > > > examle netdevs: > > > repnetdev = ens2f0_mfoo > > > mdev_netdev=enmfoo > > > > > > Pros: > > > 1. All same as option-1. > > > 2. Doesn't affect existing mdev naming scheme. > > > Cons: > > > 1. Index scheme of option-1 is better which can number large number of > > > mdevs with fewer characters, simplifying the management tool. > > > > No better than option-1, simply a larger secondary namespace, but still > > requires the user to come up with two independent names for the device. > > > > > Option-4: extend IFNAMESZ to be 64 bytes Extended IFNAMESZ from 16 to > > > 64 bytes phys_port_name=mdev_UUID_string mdev_netdev_name=enmUUID > > > > > > Pros: > > > 1. Doesn't require mdev extension > > > Cons: > > > 1. netdev stack, driver, uapi, user space, boot config wide changes 2. > > > Possible user space extensions who assumed name size being 16 > > > characters 3. Single device type demands namesize change for all > > > netdev types > > > > What about an alias based on the uuid? For example, we use 160-bit sha1s > > daily with git (uuids are only 128-bit), but we generally don't reference git > > commits with the full 20 character string. Generally 12 characters is > > recommended to avoid ambiguity. Could mdev automatically create an ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > abbreviated sha1 alias for the device? If so, how many characters should we ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > use and what do we do on collision? The colliding device could add enough > > alias characters to disambiguate (we likely couldn't re-alias the existing device > > to disambiguate, but I'm not sure it matters, userspace has sysfs to associate > > aliases). Ex. 
> >
> > UUID=$(uuidgen)
> > ALIAS=$(echo $UUID | sha1sum | colrm 13)
> >
> I explained in previous reply to Cornelia, we should set UUID and ALIAS at the same time.
> Setting is via different sysfs attribute is lot code burden with no extra benefit.

Just an example of the alias, not proposing how it's set. In fact,
proposing that the user does not set it, mdev-core provides one
automatically.

> > Since there seems to be some prefix overhead, as I ask about above in how
> > many characters we actually have to work with in IFNAMESZ, maybe we start
> > with 8 characters (matching your "index" namespace) and expand as necessary
> > for disambiguation. If we can eliminate overhead in IFNAMESZ, let's start with
> > 12. Thanks,
> >
> If user is going to choose the alias, why does it have to be limited to sha1?
> Or you just told it as an example?
>
> It can be an alpha-numeric string.

No, I'm proposing a different solution where mdev-core creates an alias
based on an abbreviated sha1. The user does not provide the alias.

> Instead of mdev imposing number of characters on the alias, it should be best left to the user.
> Because in future if netdev improves on the naming scheme, mdev will be limiting it, which is not right.
> So not restricting alias size seems right to me.
> User configuring mdev for networking devices in a given kernel knows what user is doing.
> So user can choose alias name size as it finds suitable.

That's not what I'm proposing, please read again. Thanks,

Alex
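The proposal above (an alias derived automatically from an abbreviated sha1 of the UUID, with a colliding device taking extra characters) can be sketched in shell. This is a sketch under assumptions, not how mdev-core does or would implement it: `USED_ALIASES`, `full_sha1`, `alias_in_use` and `make_alias` are hypothetical names, `cut -c1-N` is used where the thread uses `colrm`, and `sha1sum` from coreutils is assumed to be installed.

```shell
#!/bin/sh
# Sketch only: derive a short alias from a UUID by abbreviating its SHA-1,
# growing the abbreviation when it collides with an alias already in use.
# USED_ALIASES is a newline-separated stand-in for whatever list of
# existing aliases an implementation would keep.
USED_ALIASES=""

# full_sha1 UUID: print the 40 hex digits of the SHA-1 of "UUID\n",
# matching `echo $UUID | sha1sum` from the thread.
full_sha1() {
    printf '%s\n' "$1" | sha1sum | cut -d' ' -f1
}

# alias_in_use ALIAS: succeed if ALIAS is already recorded (exact match).
alias_in_use() {
    printf '%s\n' "$USED_ALIASES" | grep -qxF "$1"
}

# make_alias UUID MINLEN: set ALIAS to the shortest unused prefix of the
# hash that is at least MINLEN characters long, and record it.
# Existing aliases are never changed; only the newcomer is lengthened.
make_alias() {
    hash=$(full_sha1 "$1")
    len=$2
    ALIAS=$(printf '%s' "$hash" | cut -c1-"$len")
    while alias_in_use "$ALIAS" && [ "$len" -lt 40 ]; do
        len=$((len + 1))
        ALIAS=$(printf '%s' "$hash" | cut -c1-"$len")
    done
    USED_ALIASES="$USED_ALIASES
$ALIAS"
}

UUID=83b8f4f2-509f-382f-3c1e-e6bfe0fa1001   # arbitrary example value

make_alias "$UUID" 8
first=$ALIAS

# Artificial collision for demonstration: hashing the same input again
# yields the same 8-character prefix, so the second alias grows to 9.
make_alias "$UUID" 8
second=$ALIAS

echo "first=$first (${#first} chars) second=$second (${#second} chars)"
```

Lengthening only the newcomer matches the remark in the thread that the existing device likely cannot be re-aliased once user space has seen its alias.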
> -----Original Message----- > From: Alex Williamson <alex.williamson@redhat.com> > Sent: Wednesday, August 21, 2019 9:51 AM > To: Parav Pandit <parav@mellanox.com> > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>; > Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck > <cohuck@redhat.com>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org; > cjia <cjia@nvidia.com>; netdev@vger.kernel.org > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > On Wed, 21 Aug 2019 03:42:25 +0000 > Parav Pandit <parav@mellanox.com> wrote: > > > > -----Original Message----- > > > From: Alex Williamson <alex.williamson@redhat.com> > > > Sent: Tuesday, August 20, 2019 10:49 PM > > > To: Parav Pandit <parav@mellanox.com> > > > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller > > > <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>; > > > Cornelia Huck <cohuck@redhat.com>; kvm@vger.kernel.org; > > > linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>; > > > netdev@vger.kernel.org > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > > > On Tue, 20 Aug 2019 08:58:02 +0000 > > > Parav Pandit <parav@mellanox.com> wrote: > > > > > > > + Dave. > > > > > > > > Hi Jiri, Dave, Alex, Kirti, Cornelia, > > > > > > > > Please provide your feedback on it, how shall we proceed? > > > > > > > > Short summary of requirements. > > > > For a given mdev (mediated device [1]), there is one representor > > > > netdevice and devlink port in switchdev mode (similar to SR-IOV > > > > VF), And there is one netdevice for the actual mdev when mdev is probed. > > > > > > > > (a) representor netdev and devlink port should be able derive > > > > phys_port_name(). So that representor netdev name can be built > > > > deterministically across reboots. > > > > > > > > (b) for mdev's netdevice, mdev's device should have an attribute. 
> > > > This attribute can be used by udev rules/systemd or something else > > > > to rename netdev name deterministically. > > > > > > > > (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID. > > > > A simple grep IFNAMSIZ in stack hints hundreds of users of > > > > IFNAMSIZ in drivers, uapi, netlink, boot config area and more. > > > > Changing IFNAMSIZ for a mdev bus doesn't really look reasonable option > to me. > > > > > > How many characters do we really have to work with? Your examples > > > below prepend various characters, ex. option-1 results in ens2f0_m10 > > > or enm10. Do the extra 8 or 3 characters in these count against IFNAMSIZ? > > > > > Maximum 15. Last is null termination. > > Some udev rules setting by user prefix the PF netdev interface. I took such > example below where ens2f0 netdev named is prefixed. > > Some prefer not to prefix. > > > > > > Hence, I would like to discuss below options. > > > > > > > > Option-1: mdev index > > > > Introduce an optional mdev index/handle as u32 during mdev create > > > > time. User passes mdev index/handle as input. > > > > > > > > phys_port_name=mIndex=m%u > > > > mdev_index will be available in sysfs as mdev attribute for udev > > > > to name the mdev's netdev. > > > > > > > > example mdev create command: > > > > UUID=$(uuidgen) > > > > echo $UUID index=10 > > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create > > > > > > Nit, IIRC previous discussions of additional parameters used comma > > > separators, ex. echo $UUID,index=10 >... > > > > > Yes, ok. > > > > > > > example netdevs: > > > > repnetdev=ens2f0_m10 /*ens2f0 is parent PF's netdevice */ > > > > > > Is the parent really relevant in the name? > > No. I just picked one udev example who prefixed the parent netdev name. > > But there are users who do not prefix it. 
> > > > > Tools like mdevctl are meant to > > > provide persistence, creating the same mdev devices on the same > > > parent, but that's simply the easiest policy decision. We can also > > > imagine that multiple parent devices might support a specified mdev > > > type and policies factoring in proximity, load-balancing, power > > > consumption, etc might be weighed such that we really don't want to > > > promote userspace creating dependencies on the parent association. > > > > > > > mdev_netdev=enm10 > > > > > > > > Pros: > > > > 1. mdevctl and any other existing tools are unaffected. > > > > 2. netdev stack, ovs and other switching platforms are unaffected. > > > > 3. achieves unique phys_port_name for representor netdev 4. > > > > achieves unique mdev eth netdev name for the mdev using udev/systemd > extension. > > > > 5. Aligns well with mdev and netdev subsystem and similar to > > > > existing sriov bdf's. > > > > > > A user provided index seems strange to me. It's not really an > > > index, just a user specified instance number. Presumably you have > > > the user providing this because if it really were an index, then the > > > value depends on the creation order and persistence is lost. Now > > > the user needs to both avoid uuid collision as well as "index" > > > number collision. The uuid namespace is large enough to mostly ignore > this, but this is not. This seems like a burden. > > > > > I liked the term 'instance number', which is lot better way to say than > index/handle. > > Yes, user needs to avoid both the collision. > > UUID collision should not occur in most cases, they way UUID are generated. > > So practically users needs to pick unique 'instance number', similar to how it > picks unique netdev names. > > > > Burden to user comes from the requirement to get uniqueness. > > > > > > Option-2: shorter mdev name > > > > Extend mdev to have shorter mdev device name in addition to UUID. > > > > such as 'foo', 'bar'. 
> > > > Mdev will continue to have UUID. > > > > phys_port_name=mdev_name > > > > > > > > Pros: > > > > 1. All same as option-1, except mdevctl needs upgrade for newer usage. > > > > It is common practice to upgrade iproute2 package along with the > > > > kernel. Similar practice to be done with mdevctl. > > > > 2. Newer users of mdevctl who wants to work with non_UUID names, > > > > will use newer mdevctl/tools. Cons: > > > > 1. Dual naming scheme of mdev might affect some of the existing tools. > > > > It's unclear how/if it actually affects. > > > > mdevctl [2] is very recently developed and can be enhanced for > > > > dual naming scheme. > > > > > > I think we've already nak'ed this one, the device namespace becomes > > > meaningless if the name becomes just a string where a uuid might be > > > an example string. mdevs are named by uuid. > > > > > > > Option-3: mdev uuid alias > > > > Instead of shorter mdev name or mdev index, have alpha-numeric > > > > name alias. Alias is an optional mdev sysfs attribute such as 'foo', 'bar'. > > > > example mdev create command: > > > > UUID=$(uuidgen) > > > > echo $UUID alias=foo > > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create > > > > > example netdevs: > > > > examle netdevs: > > > > repnetdev = ens2f0_mfoo > > > > mdev_netdev=enmfoo > > > > > > > > Pros: > > > > 1. All same as option-1. > > > > 2. Doesn't affect existing mdev naming scheme. > > > > Cons: > > > > 1. Index scheme of option-1 is better which can number large > > > > number of mdevs with fewer characters, simplifying the management > tool. > > > > > > No better than option-1, simply a larger secondary namespace, but > > > still requires the user to come up with two independent names for the > device. > > > > > > > Option-4: extend IFNAMESZ to be 64 bytes Extended IFNAMESZ from 16 > > > > to > > > > 64 bytes phys_port_name=mdev_UUID_string > mdev_netdev_name=enmUUID > > > > > > > > Pros: > > > > 1. 
Doesn't require mdev extension > > > > Cons: > > > > 1. netdev stack, driver, uapi, user space, boot config wide changes 2. > > > > Possible user space extensions who assumed name size being 16 > > > > characters 3. Single device type demands namesize change for all > > > > netdev types > > > > > > What about an alias based on the uuid? For example, we use 160-bit > > > sha1s daily with git (uuids are only 128-bit), but we generally > > > don't reference git commits with the full 20 character string. > > > Generally 12 characters is recommended to avoid ambiguity. Could > > > mdev automatically create an > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > > abbreviated sha1 alias for the device? If so, how many characters > > > should we > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > > use and what do we do on collision? The colliding device could add > > > enough alias characters to disambiguate (we likely couldn't re-alias > > > the existing device to disambiguate, but I'm not sure it matters, > > > userspace has sysfs to associate aliases). Ex. > > > > > > UUID=$(uuidgen) > > > ALIAS=$(echo $UUID | sha1sum | colrm 13) > > > > > I explained in previous reply to Cornelia, we should set UUID and ALIAS at the > same time. > > Setting is via different sysfs attribute is lot code burden with no extra benefit. > > Just an example of the alias, not proposing how it's set. In fact, proposing that > the user does not set it, mdev-core provides one automatically. > > > > Since there seems to be some prefix overhead, as I ask about above > > > in how many characters we actually have to work with in IFNAMESZ, > > > maybe we start with 8 characters (matching your "index" namespace) > > > and expand as necessary for disambiguation. If we can eliminate > > > overhead in IFNAMESZ, let's start with 12. Thanks, > > > > > If user is going to choose the alias, why does it have to be limited to sha1? > > Or you just told it as an example? > > > > It can be an alpha-numeric string. 
>
> No, I'm proposing a different solution where mdev-core creates an alias based
> on an abbreviated sha1. The user does not provide the alias.
>
> > Instead of mdev imposing number of characters on the alias, it should be best
> > left to the user.
> > Because in future if netdev improves on the naming scheme, mdev will be
> > limiting it, which is not right.
> > So not restricting alias size seems right to me.
> > User configuring mdev for networking devices in a given kernel knows what
> > user is doing.
> > So user can choose alias name size as it finds suitable.
>
> That's not what I'm proposing, please read again. Thanks,

I understood your point. But mdev doesn't know how the user is going to use udev/systemd to name the netdev.
So even if mdev chose to pick 12 characters, it could result in a collision.
Hence the proposal that the user provide the alias, as the user knows the best policy for their use case in the environment they are using.
So the 12-character sha1 method can still be applied by the user.
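The length constraint behind this exchange is simple arithmetic: a netdev name holds at most IFNAMSIZ - 1 = 15 characters, so whatever prefix a udev rule adds is subtracted from the room left for the alias. A minimal sketch, where `fits_ifname`, `max_alias_len` and the sample alias are hypothetical and the prefixes are the ones used in the examples in this thread:

```shell
#!/bin/sh
# IFNAMSIZ is 16 bytes including the terminating NUL, so a name can use
# at most 15 visible characters.
IFNAME_MAX=15

# fits_ifname PREFIX ALIAS: succeed if PREFIX followed by ALIAS is short
# enough to be an interface name.
fits_ifname() {
    name="$1$2"
    [ "${#name}" -le "$IFNAME_MAX" ]
}

# max_alias_len PREFIX: print how many alias characters remain after PREFIX.
max_alias_len() {
    echo $((IFNAME_MAX - ${#1}))
}

alias12=0123456789ab   # stand-in for a 12-character abbreviated sha1

fits_ifname enm "$alias12" && echo "enm + 12 chars fits"
fits_ifname ens2f0_m "$alias12" || echo "ens2f0_m + 12 chars is too long"
echo "room after ens2f0_m: $(max_alias_len ens2f0_m)"
echo "room after enm: $(max_alias_len enm)"
```

With the 8-character `ens2f0_m` prefix only 7 characters remain, so even an 8-character alias would overflow, while the 3-character `enm` prefix leaves exactly 12.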
On Wed, 21 Aug 2019 04:40:15 +0000 Parav Pandit <parav@mellanox.com> wrote:

> > -----Original Message-----
> > From: Alex Williamson <alex.williamson@redhat.com>
> > Sent: Wednesday, August 21, 2019 9:51 AM
> > To: Parav Pandit <parav@mellanox.com>
> > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>;
> > Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck
> > <cohuck@redhat.com>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org;
> > cjia <cjia@nvidia.com>; netdev@vger.kernel.org
> > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> >
> > On Wed, 21 Aug 2019 03:42:25 +0000
> > Parav Pandit <parav@mellanox.com> wrote:
> >
> > > > -----Original Message-----
> > > > From: Alex Williamson <alex.williamson@redhat.com>
> > > > Sent: Tuesday, August 20, 2019 10:49 PM
> > > > To: Parav Pandit <parav@mellanox.com>
> > > > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller
> > > > <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>;
> > > > Cornelia Huck <cohuck@redhat.com>; kvm@vger.kernel.org;
> > > > linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>;
> > > > netdev@vger.kernel.org
> > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > > >
> > > > On Tue, 20 Aug 2019 08:58:02 +0000
> > > > Parav Pandit <parav@mellanox.com> wrote:
> > > >
> > > > > + Dave.
> > > > >
> > > > > Hi Jiri, Dave, Alex, Kirti, Cornelia,
> > > > >
> > > > > Please provide your feedback on it, how shall we proceed?
> > > > >
> > > > > Short summary of requirements.
> > > > > For a given mdev (mediated device [1]), there is one representor
> > > > > netdevice and devlink port in switchdev mode (similar to SR-IOV
> > > > > VF), and there is one netdevice for the actual mdev when mdev is probed.
> > > > >
> > > > > (a) representor netdev and devlink port should be able to derive
> > > > > phys_port_name(), so that the representor netdev name can be built
> > > > > deterministically across reboots.
> > > > >
> > > > > (b) for mdev's netdevice, mdev's device should have an attribute.
> > > > > This attribute can be used by udev rules/systemd or something else
> > > > > to rename netdev name deterministically.
> > > > >
> > > > > (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID.
> > > > > A simple grep IFNAMSIZ in stack hints hundreds of users of
> > > > > IFNAMSIZ in drivers, uapi, netlink, boot config area and more.
> > > > > Changing IFNAMSIZ for a mdev bus doesn't really look like a
> > > > > reasonable option to me.
> > > >
> > > > How many characters do we really have to work with?  Your examples
> > > > below prepend various characters, ex. option-1 results in ens2f0_m10
> > > > or enm10.  Do the extra 8 or 3 characters in these count against IFNAMSIZ?
> > > >
> > > Maximum 15. Last is null termination.
> > > Some user-configured udev rules prefix the PF netdev interface name. I took
> > > such an example below, where the ens2f0 netdev name is prefixed.
> > > Some prefer not to prefix.
> > >
> > > > > Hence, I would like to discuss below options.
> > > > >
> > > > > Option-1: mdev index
> > > > > Introduce an optional mdev index/handle as u32 during mdev create
> > > > > time. User passes mdev index/handle as input.
> > > > >
> > > > > phys_port_name=mIndex=m%u
> > > > > mdev_index will be available in sysfs as mdev attribute for udev
> > > > > to name the mdev's netdev.
> > > > >
> > > > > example mdev create command:
> > > > > UUID=$(uuidgen)
> > > > > echo $UUID index=10 > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create
> > > >
> > > > Nit, IIRC previous discussions of additional parameters used comma
> > > > separators, ex. echo $UUID,index=10 >...
> > > >
> > > Yes, ok.
> > >
> > > > > example netdevs:
> > > > > repnetdev=ens2f0_m10 /*ens2f0 is parent PF's netdevice */
> > > >
> > > > Is the parent really relevant in the name?
> > > No. I just picked one udev example that prefixed the parent netdev name.
> > > But there are users who do not prefix it.
> > >
> > > > Tools like mdevctl are meant to
> > > > provide persistence, creating the same mdev devices on the same
> > > > parent, but that's simply the easiest policy decision.  We can also
> > > > imagine that multiple parent devices might support a specified mdev
> > > > type and policies factoring in proximity, load-balancing, power
> > > > consumption, etc might be weighed such that we really don't want to
> > > > promote userspace creating dependencies on the parent association.
> > > >
> > > > > mdev_netdev=enm10
> > > > >
> > > > > Pros:
> > > > > 1. mdevctl and any other existing tools are unaffected.
> > > > > 2. netdev stack, ovs and other switching platforms are unaffected.
> > > > > 3. achieves unique phys_port_name for representor netdev
> > > > > 4. achieves unique mdev eth netdev name for the mdev using
> > > > > udev/systemd extension.
> > > > > 5. Aligns well with mdev and netdev subsystem and similar to
> > > > > existing sriov bdf's.
> > > >
> > > > A user provided index seems strange to me.  It's not really an
> > > > index, just a user specified instance number.  Presumably you have
> > > > the user providing this because if it really were an index, then the
> > > > value depends on the creation order and persistence is lost.  Now
> > > > the user needs to both avoid uuid collision as well as "index"
> > > > number collision.  The uuid namespace is large enough to mostly ignore
> > > > this, but this is not.  This seems like a burden.
> > > >
> > > I liked the term 'instance number', which is a much better way to say it
> > > than index/handle.
> > > Yes, the user needs to avoid both collisions.
> > > UUID collision should not occur in most cases, given the way UUIDs are
> > > generated.
> > > So practically the user needs to pick a unique 'instance number', similar
> > > to how it picks unique netdev names.
> > >
> > > The burden on the user comes from the requirement to get uniqueness.
> > >
> > > > > Option-2: shorter mdev name
> > > > > Extend mdev to have shorter mdev device name in addition to UUID.
> > > > > such as 'foo', 'bar'.
> > > > > Mdev will continue to have UUID.
> > > > > phys_port_name=mdev_name
> > > > >
> > > > > Pros:
> > > > > 1. All same as option-1, except mdevctl needs upgrade for newer usage.
> > > > > It is common practice to upgrade iproute2 package along with the
> > > > > kernel. Similar practice to be done with mdevctl.
> > > > > 2. Newer users of mdevctl who want to work with non_UUID names,
> > > > > will use newer mdevctl/tools.
> > > > > Cons:
> > > > > 1. Dual naming scheme of mdev might affect some of the existing tools.
> > > > > It's unclear how/if it actually affects.
> > > > > mdevctl [2] is very recently developed and can be enhanced for
> > > > > dual naming scheme.
> > > >
> > > > I think we've already nak'ed this one, the device namespace becomes
> > > > meaningless if the name becomes just a string where a uuid might be
> > > > an example string.  mdevs are named by uuid.
> > > >
> > > > > Option-3: mdev uuid alias
> > > > > Instead of shorter mdev name or mdev index, have alpha-numeric
> > > > > name alias. Alias is an optional mdev sysfs attribute such as 'foo', 'bar'.
> > > > > example mdev create command:
> > > > > UUID=$(uuidgen)
> > > > > echo $UUID alias=foo > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create
> > > > > example netdevs:
> > > > > repnetdev = ens2f0_mfoo
> > > > > mdev_netdev=enmfoo
> > > > >
> > > > > Pros:
> > > > > 1. All same as option-1.
> > > > > 2. Doesn't affect existing mdev naming scheme.
> > > > > Cons:
> > > > > 1. Index scheme of option-1 is better which can number large
> > > > > number of mdevs with fewer characters, simplifying the management
> > > > > tool.
> > > >
> > > > No better than option-1, simply a larger secondary namespace, but
> > > > still requires the user to come up with two independent names for the
> > > > device.
> > > >
> > > > > Option-4: extend IFNAMSIZ to be 64 bytes
> > > > > Extended IFNAMSIZ from 16 to 64 bytes
> > > > > phys_port_name=mdev_UUID_string
> > > > > mdev_netdev_name=enmUUID
> > > > >
> > > > > Pros:
> > > > > 1. Doesn't require mdev extension
> > > > > Cons:
> > > > > 1. netdev stack, driver, uapi, user space, boot config wide changes
> > > > > 2. Possible user space extensions who assumed name size being 16
> > > > > characters
> > > > > 3. Single device type demands namesize change for all netdev types
> > > >
> > > > What about an alias based on the uuid?  For example, we use 160-bit
> > > > sha1s daily with git (uuids are only 128-bit), but we generally
> > > > don't reference git commits with the full 40-character hex string.
> > > > Generally 12 characters is recommended to avoid ambiguity.  Could
> > > > mdev automatically create an
> >   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > abbreviated sha1 alias for the device?  If so, how many characters
> > > > should we
> >   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > use and what do we do on collision?  The colliding device could add
> > > > enough alias characters to disambiguate (we likely couldn't re-alias
> > > > the existing device to disambiguate, but I'm not sure it matters,
> > > > userspace has sysfs to associate aliases).  Ex.
> > > >
> > > > UUID=$(uuidgen)
> > > > ALIAS=$(echo $UUID | sha1sum | colrm 13)
> > > >
> > > As I explained in my previous reply to Cornelia, we should set UUID and
> > > ALIAS at the same time.
> > > Setting it via a different sysfs attribute is a lot of code burden with no
> > > extra benefit.
> >
> > Just an example of the alias, not proposing how it's set.  In fact, proposing that
> > the user does not set it, mdev-core provides one automatically.
> >
> > > > Since there seems to be some prefix overhead, as I ask about above
> > > > in how many characters we actually have to work with in IFNAMSIZ,
> > > > maybe we start with 8 characters (matching your "index" namespace)
> > > > and expand as necessary for disambiguation.  If we can eliminate
> > > > overhead in IFNAMSIZ, let's start with 12.  Thanks,
> > > >
> > > If the user is going to choose the alias, why does it have to be limited to sha1?
> > > Or did you just mention it as an example?
> > >
> > > It can be an alpha-numeric string.
> >
> > No, I'm proposing a different solution where mdev-core creates an alias based
> > on an abbreviated sha1.  The user does not provide the alias.
> >
> > > Instead of mdev imposing number of characters on the alias, it should be best
> > > left to the user.
> > > Because in future if netdev improves on the naming scheme, mdev will be
> > > limiting it, which is not right.
> > > So not restricting alias size seems right to me.
> > > User configuring mdev for networking devices in a given kernel knows what
> > > user is doing.
> > > So user can choose alias name size as it finds suitable.
> >
> > That's not what I'm proposing, please read again.  Thanks,
>
> I understood your point. But mdev doesn't know how the user is going to use udev/systemd to name the netdev.
> So even if mdev chose to pick 12 characters, it could result in a collision.
> Hence the proposal that the user provide the alias, as the user knows the best policy for their use case in the environment they are using.
> So the 12-character sha1 method will still work if the user chooses it.

Haven't you already provided examples where certain drivers or subsystems
have unique netdev prefixes?  If mdev provides a unique alias within the
subsystem, couldn't we simply define a netdev prefix for the mdev subsystem
and avoid all other collisions?  I'm not in favor of the user providing both
a uuid and an alias/instance.  Thanks,

Alex
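The alias scheme proposed in this message (abbreviate a sha1 of the UUID, and let a colliding device add characters until it is unique) can be sketched in userspace as follows. This is only an illustration extending the `sha1sum | colrm 13` example above; it is not how mdev-core does or would implement it. The function name `mdev_alias`, the file of already-used aliases, and the default starting length of 12 are all assumptions made for the sketch, and it assumes coreutils `sha1sum` is available.

```shell
#!/bin/sh
# Hypothetical helper, not part of mdev-core or mdevctl.
#   $1: mdev UUID
#   $2: file listing aliases already in use, one per line (assumed bookkeeping)
#   $3: starting alias length in hex digits (default 12, as suggested above)
mdev_alias() {
    # sha1sum prints "<40 hex digits>  -"; keep only the digest.
    ma_digest=$(printf '%s\n' "$1" | sha1sum | cut -d' ' -f1)
    ma_len=${3:-12}
    while [ "$ma_len" -le 40 ]; do
        ma_candidate=$(printf '%s' "$ma_digest" | cut -c1-"$ma_len")
        # Accept the first prefix that no existing device is using.
        if ! grep -qxF -- "$ma_candidate" "$2"; then
            printf '%s\n' "$ma_candidate"
            return 0
        fi
        # Collision: the new device adds one more character and retries.
        ma_len=$((ma_len + 1))
    done
    return 1   # every prefix up to the full digest is already taken
}

# Usage with a fixed example UUID (any UUID string works).
taken=$(mktemp)
first=$(mdev_alias 11111111-2222-3333-4444-555555555555 "$taken")
echo "$first" >> "$taken"
echo "alias: $first"
```

Only the new device's alias grows on collision; the existing device keeps its shorter alias, matching the remark above that re-aliasing an existing device is probably not possible.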
> -----Original Message-----
> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Wednesday, August 21, 2019 10:27 AM
> To: Parav Pandit <parav@mellanox.com>
> Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>;
> Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck
> <cohuck@redhat.com>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org;
> cjia <cjia@nvidia.com>; netdev@vger.kernel.org
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> On Wed, 21 Aug 2019 04:40:15 +0000
> Parav Pandit <parav@mellanox.com> wrote:
>
> > I understood your point. But mdev doesn't know how user is going to use
> > udev/systemd to name the netdev.
> > So even if mdev chose to pick 12 characters, it could result in collision.
> > Hence the proposal to provide the alias by the user, as user know the best
> > policy for its use case in the environment its using.
> > So 12 character sha1 method will still work by user.
>
> Haven't you already provided examples where certain drivers or subsystems
> have unique netdev prefixes?  If mdev provides a unique alias within the
> subsystem, couldn't we simply define a netdev prefix for the mdev subsystem
> and avoid all other collisions?  I'm not in favor of the user providing both a uuid
> and an alias/instance.  Thanks,
>
For a given prefix, say ens2f0, can the first 9 characters of the sha1 of two different UUIDs collide?
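The figure of 9 characters in the question above follows from the IFNAMSIZ limit discussed earlier in the thread: 16 bytes including the terminating NUL leaves 15 usable characters, and whatever the prefix consumes is no longer available for the alias. A small sketch of that arithmetic, where the helper name `alias_budget` is made up for illustration:

```shell
#!/bin/sh
IFNAMSIZ=16   # bytes, including the terminating NUL (15 usable characters)

# Hypothetical helper: characters left for an alias after a given prefix ($1).
alias_budget() {
    echo $((IFNAMSIZ - 1 - ${#1}))
}

for prefix in ens2f0 ens2f0_m enm; do
    printf '%-9s leaves %s characters\n' "$prefix" "$(alias_budget "$prefix")"
done
```

With the prefixes used in this thread, `ens2f0` leaves 9 characters, `ens2f0_m` leaves 7, and `enm` leaves 12, which is why the choice of prefix directly limits how long the sha1 abbreviation can be.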
On Wed, 21 Aug 2019 05:01:52 +0000 Parav Pandit <parav@mellanox.com> wrote: > > -----Original Message----- > > From: Alex Williamson <alex.williamson@redhat.com> > > Sent: Wednesday, August 21, 2019 10:27 AM > > To: Parav Pandit <parav@mellanox.com> > > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>; > > Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck > > <cohuck@redhat.com>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org; > > cjia <cjia@nvidia.com>; netdev@vger.kernel.org > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > On Wed, 21 Aug 2019 04:40:15 +0000 > > Parav Pandit <parav@mellanox.com> wrote: > > > > > > -----Original Message----- > > > > From: Alex Williamson <alex.williamson@redhat.com> > > > > Sent: Wednesday, August 21, 2019 9:51 AM > > > > To: Parav Pandit <parav@mellanox.com> > > > > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller > > > > <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>; > > > > Cornelia Huck <cohuck@redhat.com>; kvm@vger.kernel.org; > > > > linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>; > > > > netdev@vger.kernel.org > > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > > > > > On Wed, 21 Aug 2019 03:42:25 +0000 > > > > Parav Pandit <parav@mellanox.com> wrote: > > > > > > > > > > -----Original Message----- > > > > > > From: Alex Williamson <alex.williamson@redhat.com> > > > > > > Sent: Tuesday, August 20, 2019 10:49 PM > > > > > > To: Parav Pandit <parav@mellanox.com> > > > > > > Cc: Jiri Pirko <jiri@mellanox.com>; David S . 
Miller > > > > > > <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>; > > > > > > Cornelia Huck <cohuck@redhat.com>; kvm@vger.kernel.org; > > > > > > linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>; > > > > > > netdev@vger.kernel.org > > > > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > > > > > > > > > On Tue, 20 Aug 2019 08:58:02 +0000 Parav Pandit > > > > > > <parav@mellanox.com> wrote: > > > > > > > > > > > > > + Dave. > > > > > > > > > > > > > > Hi Jiri, Dave, Alex, Kirti, Cornelia, > > > > > > > > > > > > > > Please provide your feedback on it, how shall we proceed? > > > > > > > > > > > > > > Short summary of requirements. > > > > > > > For a given mdev (mediated device [1]), there is one > > > > > > > representor netdevice and devlink port in switchdev mode > > > > > > > (similar to SR-IOV VF), And there is one netdevice for the actual mdev > > when mdev is probed. > > > > > > > > > > > > > > (a) representor netdev and devlink port should be able derive > > > > > > > phys_port_name(). So that representor netdev name can be built > > > > > > > deterministically across reboots. > > > > > > > > > > > > > > (b) for mdev's netdevice, mdev's device should have an attribute. > > > > > > > This attribute can be used by udev rules/systemd or something > > > > > > > else to rename netdev name deterministically. > > > > > > > > > > > > > > (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID. > > > > > > > A simple grep IFNAMSIZ in stack hints hundreds of users of > > > > > > > IFNAMSIZ in drivers, uapi, netlink, boot config area and more. > > > > > > > Changing IFNAMSIZ for a mdev bus doesn't really look > > > > > > > reasonable option > > > > to me. > > > > > > > > > > > > How many characters do we really have to work with? Your > > > > > > examples below prepend various characters, ex. option-1 results > > > > > > in ens2f0_m10 or enm10. Do the extra 8 or 3 characters in these count > > against IFNAMSIZ? 
> > > > > > > > > > > Maximum 15. Last is null termination. > > > > > Some udev rules setting by user prefix the PF netdev interface. I > > > > > took such > > > > example below where ens2f0 netdev named is prefixed. > > > > > Some prefer not to prefix. > > > > > > > > > > > > Hence, I would like to discuss below options. > > > > > > > > > > > > > > Option-1: mdev index > > > > > > > Introduce an optional mdev index/handle as u32 during mdev > > > > > > > create time. User passes mdev index/handle as input. > > > > > > > > > > > > > > phys_port_name=mIndex=m%u > > > > > > > mdev_index will be available in sysfs as mdev attribute for > > > > > > > udev to name the mdev's netdev. > > > > > > > > > > > > > > example mdev create command: > > > > > > > UUID=$(uuidgen) > > > > > > > echo $UUID index=10 > > > > > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/cr > > > > > > > > eate > > > > > > > > > > > > Nit, IIRC previous discussions of additional parameters used > > > > > > comma separators, ex. echo $UUID,index=10 >... > > > > > > > > > > > Yes, ok. > > > > > > > > > > > > > example netdevs: > > > > > > > repnetdev=ens2f0_m10 /*ens2f0 is parent PF's netdevice */ > > > > > > > > > > > > Is the parent really relevant in the name? > > > > > No. I just picked one udev example who prefixed the parent netdev name. > > > > > But there are users who do not prefix it. > > > > > > > > > > > Tools like mdevctl are meant to > > > > > > provide persistence, creating the same mdev devices on the same > > > > > > parent, but that's simply the easiest policy decision. We can > > > > > > also imagine that multiple parent devices might support a > > > > > > specified mdev type and policies factoring in proximity, > > > > > > load-balancing, power consumption, etc might be weighed such > > > > > > that we really don't want to promote userspace creating dependencies > > on the parent association. 
> > > > > > > > > > > > > mdev_netdev=enm10 > > > > > > > > > > > > > > Pros: > > > > > > > 1. mdevctl and any other existing tools are unaffected. > > > > > > > 2. netdev stack, ovs and other switching platforms are unaffected. > > > > > > > 3. achieves unique phys_port_name for representor netdev 4. > > > > > > > achieves unique mdev eth netdev name for the mdev using > > > > > > > udev/systemd > > > > extension. > > > > > > > 5. Aligns well with mdev and netdev subsystem and similar to > > > > > > > existing sriov bdf's. > > > > > > > > > > > > A user provided index seems strange to me. It's not really an > > > > > > index, just a user specified instance number. Presumably you > > > > > > have the user providing this because if it really were an index, > > > > > > then the value depends on the creation order and persistence is > > > > > > lost. Now the user needs to both avoid uuid collision as well as "index" > > > > > > number collision. The uuid namespace is large enough to mostly > > > > > > ignore > > > > this, but this is not. This seems like a burden. > > > > > > > > > > > I liked the term 'instance number', which is lot better way to say > > > > > than > > > > index/handle. > > > > > Yes, user needs to avoid both the collision. > > > > > UUID collision should not occur in most cases, they way UUID are > > generated. > > > > > So practically users needs to pick unique 'instance number', > > > > > similar to how it > > > > picks unique netdev names. > > > > > > > > > > Burden to user comes from the requirement to get uniqueness. > > > > > > > > > > > > Option-2: shorter mdev name > > > > > > > Extend mdev to have shorter mdev device name in addition to UUID. > > > > > > > such as 'foo', 'bar'. > > > > > > > Mdev will continue to have UUID. > > > > > > > phys_port_name=mdev_name > > > > > > > > > > > > > > Pros: > > > > > > > 1. All same as option-1, except mdevctl needs upgrade for newer > > usage. 
> > > > > > > It is common practice to upgrade iproute2 package along with > > > > > > > the kernel. Similar practice to be done with mdevctl. > > > > > > > 2. Newer users of mdevctl who wants to work with non_UUID > > > > > > > names, will use newer mdevctl/tools. Cons: > > > > > > > 1. Dual naming scheme of mdev might affect some of the existing > > tools. > > > > > > > It's unclear how/if it actually affects. > > > > > > > mdevctl [2] is very recently developed and can be enhanced for > > > > > > > dual naming scheme. > > > > > > > > > > > > I think we've already nak'ed this one, the device namespace > > > > > > becomes meaningless if the name becomes just a string where a > > > > > > uuid might be an example string. mdevs are named by uuid. > > > > > > > > > > > > > Option-3: mdev uuid alias > > > > > > > Instead of shorter mdev name or mdev index, have alpha-numeric > > > > > > > name alias. Alias is an optional mdev sysfs attribute such as 'foo', > > 'bar'. > > > > > > > example mdev create command: > > > > > > > UUID=$(uuidgen) > > > > > > > echo $UUID alias=foo > > > > > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/cr > > > > > > > > eate > > > > > > > > example netdevs: > > > > > > > examle netdevs: > > > > > > > repnetdev = ens2f0_mfoo > > > > > > > mdev_netdev=enmfoo > > > > > > > > > > > > > > Pros: > > > > > > > 1. All same as option-1. > > > > > > > 2. Doesn't affect existing mdev naming scheme. > > > > > > > Cons: > > > > > > > 1. Index scheme of option-1 is better which can number large > > > > > > > number of mdevs with fewer characters, simplifying the > > > > > > > management > > > > tool. > > > > > > > > > > > > No better than option-1, simply a larger secondary namespace, > > > > > > but still requires the user to come up with two independent > > > > > > names for the > > > > device. 
> > > > > > > > > > > > > Option-4: extend IFNAMESZ to be 64 bytes Extended IFNAMESZ > > > > > > > from 16 to > > > > > > > 64 bytes phys_port_name=mdev_UUID_string > > > > mdev_netdev_name=enmUUID > > > > > > > > > > > > > > Pros: > > > > > > > 1. Doesn't require mdev extension > > > > > > > Cons: > > > > > > > 1. netdev stack, driver, uapi, user space, boot config wide changes 2. > > > > > > > Possible user space extensions who assumed name size being 16 > > > > > > > characters 3. Single device type demands namesize change for > > > > > > > all netdev types > > > > > > > > > > > > What about an alias based on the uuid? For example, we use > > > > > > 160-bit sha1s daily with git (uuids are only 128-bit), but we > > > > > > generally don't reference git commits with the full 20 character string. > > > > > > Generally 12 characters is recommended to avoid ambiguity. > > > > > > Could mdev automatically create an > > > > > > > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > > > > > abbreviated sha1 alias for the device? If so, how many > > > > > > characters should we > > > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > > > > > use and what do we do on collision? The colliding device could > > > > > > add enough alias characters to disambiguate (we likely couldn't > > > > > > re-alias the existing device to disambiguate, but I'm not sure > > > > > > it matters, userspace has sysfs to associate aliases). Ex. > > > > > > > > > > > > UUID=$(uuidgen) > > > > > > ALIAS=$(echo $UUID | sha1sum | colrm 13) > > > > > > > > > > > I explained in previous reply to Cornelia, we should set UUID and > > > > > ALIAS at the > > > > same time. > > > > > Setting is via different sysfs attribute is lot code burden with no extra > > benefit. > > > > > > > > Just an example of the alias, not proposing how it's set. In fact, > > > > proposing that the user does not set it, mdev-core provides one > > automatically. 
> > > >
> > > > > > Since there seems to be some prefix overhead, as I ask about
> > > > > > above in how many characters we actually have to work with in
> > > > > > IFNAMESZ, maybe we start with 8 characters (matching your
> > > > > > "index" namespace) and expand as necessary for disambiguation.
> > > > > > If we can eliminate overhead in IFNAMESZ, let's start with 12.
> > > > > > Thanks,
> > > > > >
> > > > > If user is going to choose the alias, why does it have to be limited
> > > > > to sha1?
> > > > > Or you just told it as an example?
> > > > >
> > > > > It can be an alpha-numeric string.
> > > >
> > > > No, I'm proposing a different solution where mdev-core creates an
> > > > alias based on an abbreviated sha1. The user does not provide the alias.
> > > >
> > > > > Instead of mdev imposing number of characters on the alias, it
> > > > > should be best left to the user.
> > > > > Because in future if netdev improves on the naming scheme, mdev
> > > > > will be limiting it, which is not right.
> > > > > So not restricting alias size seems right to me.
> > > > > User configuring mdev for networking devices in a given kernel
> > > > > knows what user is doing.
> > > > > So user can choose alias name size as it finds suitable.
> > > >
> > > > That's not what I'm proposing, please read again. Thanks,
> > >
> > > I understood your point. But mdev doesn't know how user is going to use
> > > udev/systemd to name the netdev.
> > > So even if mdev chose to pick 12 characters, it could result in collision.
> > > Hence the proposal to provide the alias by the user, as user know the best
> > > policy for its use case in the environment its using.
> > > So 12 character sha1 method will still work by user.
> >
> > Haven't you already provided examples where certain drivers or subsystems
> > have unique netdev prefixes?
> > If mdev provides a unique alias within the subsystem, couldn't we simply
> > define a netdev prefix for the mdev subsystem and avoid all other
> > collisions? I'm not in favor of the user providing both a uuid and an
> > alias/instance. Thanks,
>
> For a given prefix, say ens2f0, can two UUID->sha1 first 9 characters have
> collision?

I think it would be a mistake to waste so many chars on a prefix, but 9
characters of sha1 likely wouldn't have a collision before we have 10s of
thousands of devices. Thanks,

Alex
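As a rough check on the estimate above, the birthday bound gives the chance that at least two of n devices end up with the same abbreviated sha1 alias. The sketch below is illustrative only: the function name and the device counts are assumptions, and it treats sha1 output as uniformly random.

```python
import math


def alias_collision_probability(num_devices: int, hex_chars: int) -> float:
    """Approximate probability that at least two aliases collide.

    Uses the birthday approximation 1 - exp(-n(n-1) / (2N)), where N is
    the number of distinct aliases (16 ** hex_chars). Assumes the hash
    prefix is uniformly distributed.
    """
    space = 16 ** hex_chars
    pairs = num_devices * (num_devices - 1) / 2
    # -expm1(-x) computes 1 - exp(-x) accurately for very small x.
    return -math.expm1(-pairs / space)


if __name__ == "__main__":
    for hex_chars in (8, 9, 12):
        for n in (1_000, 10_000, 100_000):
            p = alias_collision_probability(n, hex_chars)
            print(f"{hex_chars} hex chars, {n:>7} devices: p ~= {p:.2e}")
```

Under these assumptions, 9 hex characters (36 bits) give a collision probability of roughly 0.07% at 10,000 devices, and a collision only becomes more likely than not at around 300,000 devices.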
> -----Original Message-----
> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Wednesday, August 21, 2019 10:56 AM
> To: Parav Pandit <parav@mellanox.com>
> Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>;
> Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck <cohuck@redhat.com>;
> kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>;
> netdev@vger.kernel.org
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> [...]
>
> > > Haven't you already provided examples where certain drivers or
> > > subsystems have unique netdev prefixes? If mdev provides a unique
> > > alias within the subsystem, couldn't we simply define a netdev
> > > prefix for the mdev subsystem and avoid all other collisions? I'm
> > > not in favor of the user providing both a uuid and an
> > > alias/instance. Thanks,
> > >
> > For a given prefix, say ens2f0, can two UUID->sha1 first 9 characters have
> > collision?
>
> I think it would be a mistake to waste so many chars on a prefix, but 9
> characters of sha1 likely wouldn't have a collision before we have 10s of
> thousands of devices. Thanks,
>
> Alex

Jiri, Dave,
Are you OK with this for the devlink/netdev part?
Mdev core will create an alias from the UUID.

This will be supplied during devlink port attr set, such as:

devlink_port_attrs_mdev_set(struct devlink_port *port, const char *mdev_alias);

This alias is used to generate the representor netdev's phys_port_name.
The same alias, read from the mdev device's sysfs, will be used by udev/systemd to generate a predictable netdev name.
Example: enm<mdev_alias_first_12_chars>
I took an Ethernet mdev as an example.
The new prefix 'm' stands for mediated device.
The remaining 12 characters are the first 12 chars of the mdev alias.
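A user-space sketch of the naming scheme described above, assuming the alias is the abbreviated sha1 of the UUID string exactly as in the earlier `echo $UUID | sha1sum | colrm 13` example (note that `echo` appends a newline before hashing). The helper names are hypothetical and this is not the mdev-core implementation, whose exact hashing input was not specified in the thread.

```python
import hashlib
import uuid


def mdev_alias(mdev_uuid: str, length: int = 12) -> str:
    """Abbreviated sha1 of the UUID string, mirroring `echo $UUID | sha1sum`."""
    # `echo` writes the UUID followed by a newline, so hash the same bytes.
    digest = hashlib.sha1((mdev_uuid + "\n").encode("ascii")).hexdigest()
    return digest[:length]


def mdev_netdev_name(alias: str) -> str:
    """Compose the proposed netdev name: 'en' (Ethernet) + 'm' (mdev) + alias."""
    return "enm" + alias[:12]


if __name__ == "__main__":
    example_uuid = str(uuid.uuid4())  # stand-in for $(uuidgen)
    alias = mdev_alias(example_uuid)
    print("uuid:  ", example_uuid)
    print("alias: ", alias)
    print("netdev:", mdev_netdev_name(alias))
```

With a 12-character alias the composed name is 15 characters long, which is the most an interface name can hold.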
Wed, Aug 21, 2019 at 08:23:17AM CEST, parav@mellanox.com wrote:
>
>[...]
>
>Jiri, Dave,
>Are you ok with it for devlink/netdev part?
>Mdev core will create an alias from a UUID.
>
>This will be supplied during devlink port attr set such as,
>
>devlink_port_attrs_mdev_set(struct devlink_port *port, const char *mdev_alias);
>
>This alias is used to generate representor netdev's phys_port_name.
>This alias from the mdev device's sysfs will be used by the udev/systemd to generate predictable netdev's name.
>Example: enm<mdev_alias_first_12_chars>

What happens in the unlikely case that the aliases of 2 UUIDs collide?

>I took Ethernet mdev as an example.
>New prefix 'm' stands for mediated device.
>Remaining 12 characters are first 12 chars of the mdev alias.

Does this resolve the identification of the devlink port representor? I assume
you want to use the same 12 (or so) chars, don't you?
> -----Original Message-----
> From: Jiri Pirko <jiri@resnulli.us>
> Sent: Thursday, August 22, 2019 2:59 PM
> To: Parav Pandit <parav@mellanox.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>; Jiri Pirko
> <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>; Kirti
> Wankhede <kwankhede@nvidia.com>; Cornelia Huck <cohuck@redhat.com>;
> kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>;
> netdev@vger.kernel.org
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> Wed, Aug 21, 2019 at 08:23:17AM CEST, parav@mellanox.com wrote:
> >
> >[...]
> >
> >This alias is used to generate representor netdev's phys_port_name.
> >This alias from the mdev device's sysfs will be used by the udev/systemd to
> >generate predictable netdev's name.
> >Example: enm<mdev_alias_first_12_chars>
>
> What happens in unlikely case of 2 UUIDs collide?
>
Since the user sees two devices with the same phys_port_name, the user should destroy the most recently created mdev and recreate it with a different UUID?

> >I took Ethernet mdev as an example.
> >New prefix 'm' stands for mediated device.
> >Remaining 12 characters are first 12 chars of the mdev alias.
>
> Does this resolve the identification of devlink port representor?

Not sure if I understood your question correctly, attempting to answer below.
The phys_port_name of the devlink port is defined by the first 12 characters of the mdev alias.

> I assume you want to use the same 12(or so) chars, don't you?

The mdev's netdev will also use the same mdev alias from sysfs to rename the netdev from ethX to enm<mdev_alias>, where en=Ethernet, m=mdev.

So yes, the same 12 characters are used for the mdev's netdev and the mdev devlink port's phys_port_name.

Is that what you are asking?
Thu, Aug 22, 2019 at 11:42:13AM CEST, parav@mellanox.com wrote:
>
>[...]
>
>> What happens in unlikely case of 2 UUIDs collide?
>>
>Since users sees two devices with same phys_port_name, user should destroy
>recently created mdev and recreate mdev with different UUID?

The driver should make sure the phys port name won't collide; in this case,
that it does not provide the same attrs for 2 different ports.
Hmm, so the order of creation matters. That is not good.

>>
>> >I took Ethernet mdev as an example.
>> >New prefix 'm' stands for mediated device.
>> >Remaining 12 characters are first 12 chars of the mdev alias.
>>
>> Does this resolve the identification of devlink port representor?
>Not sure if I understood your question correctly, attempting to answer below.
>phys_port_name of devlink port is defined by the first 12 characters of mdev alias.
>> I assume you want to use the same 12(or so) chars, don't you?
>Mdev's netdev will also use the same mdev alias from the sysfs to rename netdev name from ethX to enm<mdev_alias>, where en=Ethernet, m=mdev.
>
>So yes, same 12 characters are used for mdev's netdev and mdev devlink port's phys_port_name.
>
>Is that what you are asking?

Yes. Then you have 3 chars to handle the rest of the name (pci, pf)...
> -----Original Message-----
> From: Jiri Pirko <jiri@resnulli.us>
> Sent: Thursday, August 22, 2019 3:28 PM
> To: Parav Pandit <parav@mellanox.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>; Jiri Pirko
> <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>; Kirti
> Wankhede <kwankhede@nvidia.com>; Cornelia Huck <cohuck@redhat.com>;
> kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>;
> netdev@vger.kernel.org
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> Thu, Aug 22, 2019 at 11:42:13AM CEST, parav@mellanox.com wrote:
> >
> >[...]
> >
> >> What happens in unlikely case of 2 UUIDs collide?
> >>
> >Since users sees two devices with same phys_port_name, user should destroy
> >recently created mdev and recreate mdev with different UUID?
>
> Driver should make sure phys port name wont collide,

So when mdev creation is initiated, mdev core calculates the alias, and if any other mdev with the same alias exists, it returns an -EEXIST error before progressing further.
This way the user learns about a collision upfront, before the mdev device gets created.
How about that?

> in this case that it does
> not provide 2 same attrs for 2 different ports.
> Hmm, so the order of creation matters. That is not good.
>
> [...]
>
> >Is that what are you asking?
>
> Yes. Then you have 3 chars to handle the rest of the name (pci, pf)...
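The create-time check proposed above can be modeled in user space as follows. This is only a sketch of the idea, not mdev-core code: the `MdevRegistry` class and its methods are hypothetical, and the alias is assumed to be the leading hex characters of the sha1 of the UUID string.

```python
import errno
import hashlib


class MdevRegistry:
    """Toy model of mdev creation that rejects colliding aliases up front."""

    def __init__(self, alias_len: int = 12):
        self.alias_len = alias_len
        self._by_alias = {}  # alias -> uuid of the device that owns it

    def _alias(self, mdev_uuid: str) -> str:
        digest = hashlib.sha1(mdev_uuid.encode("ascii")).hexdigest()
        return digest[: self.alias_len]

    def create(self, mdev_uuid: str) -> str:
        """Register a device, or raise FileExistsError (EEXIST) on collision.

        The check runs before anything is stored, so a failed create
        leaves the registry unchanged.
        """
        alias = self._alias(mdev_uuid)
        if alias in self._by_alias:
            raise FileExistsError(errno.EEXIST, f"alias {alias} already in use")
        self._by_alias[alias] = mdev_uuid
        return alias

    def remove(self, mdev_uuid: str) -> None:
        """Drop a device so its alias can be reused."""
        self._by_alias.pop(self._alias(mdev_uuid), None)
```

Because the check happens before the device exists, the caller can simply retry with a freshly generated UUID; the already-created device keeps its alias, so existing names do not change.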
Thu, Aug 22, 2019 at 12:04:02PM CEST, parav@mellanox.com wrote: > > >> -----Original Message----- >> From: Jiri Pirko <jiri@resnulli.us> >> Sent: Thursday, August 22, 2019 3:28 PM >> To: Parav Pandit <parav@mellanox.com> >> Cc: Alex Williamson <alex.williamson@redhat.com>; Jiri Pirko >> <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>; Kirti >> Wankhede <kwankhede@nvidia.com>; Cornelia Huck <cohuck@redhat.com>; >> kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>; >> netdev@vger.kernel.org >> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core >> >> Thu, Aug 22, 2019 at 11:42:13AM CEST, parav@mellanox.com wrote: >> > >> > >> >> -----Original Message----- >> >> From: Jiri Pirko <jiri@resnulli.us> >> >> Sent: Thursday, August 22, 2019 2:59 PM >> >> To: Parav Pandit <parav@mellanox.com> >> >> Cc: Alex Williamson <alex.williamson@redhat.com>; Jiri Pirko >> >> <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>; Kirti >> >> Wankhede <kwankhede@nvidia.com>; Cornelia Huck >> <cohuck@redhat.com>; >> >> kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia >> >> <cjia@nvidia.com>; netdev@vger.kernel.org >> >> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core >> >> >> >> Wed, Aug 21, 2019 at 08:23:17AM CEST, parav@mellanox.com wrote: >> >> > >> >> > >> >> >> -----Original Message----- >> >> >> From: Alex Williamson <alex.williamson@redhat.com> >> >> >> Sent: Wednesday, August 21, 2019 10:56 AM >> >> >> To: Parav Pandit <parav@mellanox.com> >> >> >> Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller >> >> >> <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>; >> >> >> Cornelia Huck <cohuck@redhat.com>; kvm@vger.kernel.org; >> >> >> linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>; >> >> >> netdev@vger.kernel.org >> >> >> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core >> >> >> >> >> >> > > > > Just an example of the alias, not proposing how it's set. 
>> >> >> > > > > In fact, proposing that the user does not set it, >> >> >> > > > > mdev-core provides one >> >> >> > > automatically. >> >> >> > > > > >> >> >> > > > > > > Since there seems to be some prefix overhead, as I ask >> >> >> > > > > > > about above in how many characters we actually have to >> >> >> > > > > > > work with in IFNAMESZ, maybe we start with 8 >> >> >> > > > > > > characters (matching your "index" namespace) and >> >> >> > > > > > > expand as necessary for >> >> disambiguation. >> >> >> > > > > > > If we can eliminate overhead in IFNAMESZ, let's start with 12. >> >> >> > > > > > > Thanks, >> >> >> > > > > > > >> >> >> > > > > > If user is going to choose the alias, why does it have >> >> >> > > > > > to be limited to >> >> >> sha1? >> >> >> > > > > > Or you just told it as an example? >> >> >> > > > > > >> >> >> > > > > > It can be an alpha-numeric string. >> >> >> > > > > >> >> >> > > > > No, I'm proposing a different solution where mdev-core >> >> >> > > > > creates an alias based on an abbreviated sha1. The user >> >> >> > > > > does not provide the >> >> >> alias. >> >> >> > > > > >> >> >> > > > > > Instead of mdev imposing number of characters on the >> >> >> > > > > > alias, it should be best >> >> >> > > > > left to the user. >> >> >> > > > > > Because in future if netdev improves on the naming >> >> >> > > > > > scheme, mdev will be >> >> >> > > > > limiting it, which is not right. >> >> >> > > > > > So not restricting alias size seems right to me. >> >> >> > > > > > User configuring mdev for networking devices in a given >> >> >> > > > > > kernel knows what >> >> >> > > > > user is doing. >> >> >> > > > > > So user can choose alias name size as it finds suitable. >> >> >> > > > > >> >> >> > > > > That's not what I'm proposing, please read again. Thanks, >> >> >> > > > >> >> >> > > > I understood your point. But mdev doesn't know how user is >> >> >> > > > going to use >> >> >> > > udev/systemd to name the netdev. 
>> >> >> > > > So even if mdev chose to pick 12 characters, it could result in >> collision. >> >> >> > > > Hence the proposal to provide the alias by the user, as user >> >> >> > > > know the best >> >> >> > > policy for its use case in the environment its using. >> >> >> > > > So 12 character sha1 method will still work by user. >> >> >> > > >> >> >> > > Haven't you already provided examples where certain drivers or >> >> >> > > subsystems have unique netdev prefixes? If mdev provides a >> >> >> > > unique alias within the subsystem, couldn't we simply define a >> >> >> > > netdev prefix for the mdev subsystem and avoid all other >> >> >> > > collisions? I'm not in favor of the user providing both a >> >> >> > > uuid and an alias/instance. Thanks, >> >> >> > > >> >> >> > For a given prefix, say ens2f0, can two UUID->sha1 first 9 >> >> >> > characters have >> >> >> collision? >> >> >> >> >> >> I think it would be a mistake to waste so many chars on a prefix, >> >> >> but >> >> >> 9 characters of sha1 likely wouldn't have a collision before we >> >> >> have 10s of thousands of devices. Thanks, >> >> >> >> >> >> Alex >> >> > >> >> >Jiri, Dave, >> >> >Are you ok with it for devlink/netdev part? >> >> >Mdev core will create an alias from a UUID. >> >> > >> >> >This will be supplied during devlink port attr set such as, >> >> > >> >> >devlink_port_attrs_mdev_set(struct devlink_port *port, const char >> >> >*mdev_alias); >> >> > >> >> >This alias is used to generate representor netdev's phys_port_name. >> >> >This alias from the mdev device's sysfs will be used by the >> >> >udev/systemd to >> >> generate predicable netdev's name. >> >> >Example: enm<mdev_alias_first_12_chars> >> >> >> >> What happens in unlikely case of 2 UUIDs collide? >> >> >> >Since users sees two devices with same phys_port_name, user should destroy >> recently created mdev and recreate mdev with different UUID? 
>>
>> Driver should make sure phys port name won't collide,
>So when mdev creation is initiated, mdev core calculates the alias, and if any other mdev with the same alias exists, it returns an -EEXIST error before progressing further.
>This way the user will know upfront about a collision, before the mdev device gets created.
>How about that?

Sounds fine to me. Now the question is how many chars do we want to have.

>
>
>> in this case that it does
>> not provide 2 same attrs for 2 different ports.
>> Hmm, so the order of creation matters. That is not good.
>>
>> >>
>> >> >I took Ethernet mdev as an example.
>> >> >New prefix 'm' stands for mediated device.
>> >> >Remaining 12 characters are first 12 chars of the mdev alias.
>> >>
>> >> Does this resolve the identification of devlink port representor?
>> >Not sure if I understood your question correctly, attempting to answer below.
>> >phys_port_name of devlink port is defined by the first 12 characters of mdev alias.
>> >> I assume you want to use the same 12(or so) chars, don't you?
>> >Mdev's netdev will also use the same mdev alias from the sysfs to rename netdev name from ethX to enm<mdev_alias>, where en=Ethernet, m=mdev.
>> >
>> >So yes, the same 12 characters are used for mdev's netdev and mdev devlink port's phys_port_name.
>> >
>> >Is that what you are asking?
>>
>> Yes. Then you have 3 chars to handle the rest of the name (pci, pf)...
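The scheme agreed on above (mdev core derives the alias from the UUID and refuses creation with -EEXIST when the alias is already taken) can be sketched in userspace Python. This is only an illustration under stated assumptions, not the kernel implementation: the thread does not say exactly which bytes are hashed, so hashing the canonical lowercase UUID string is an assumption, and `mdev_alias`, `MdevRegistry` and `ALIAS_LEN` are hypothetical names.

```python
import errno
import hashlib
import uuid

ALIAS_LEN = 12  # assumed abbreviation length, in hex characters


def mdev_alias(dev_uuid, length=ALIAS_LEN):
    """Return the first `length` hex characters of SHA-1 over the UUID string."""
    # Assumption: the canonical lowercase, hyphenated UUID string is hashed.
    digest = hashlib.sha1(str(dev_uuid).encode("ascii")).hexdigest()
    return digest[:length]


class MdevRegistry:
    """Hypothetical stand-in for mdev core's list of existing devices."""

    def __init__(self, alias_len=ALIAS_LEN):
        self.alias_len = alias_len
        self._uuid_by_alias = {}

    def create(self, dev_uuid):
        alias = mdev_alias(dev_uuid, self.alias_len)
        if alias in self._uuid_by_alias:
            # Reject before any device state is set up, mirroring -EEXIST.
            raise FileExistsError(errno.EEXIST, "mdev alias already in use: " + alias)
        self._uuid_by_alias[alias] = dev_uuid
        return alias


registry = MdevRegistry()
alias = registry.create(uuid.UUID("83b8f4f2-509f-382f-3c1e-e6bfe0fa1001"))
print("enm" + alias)
```

Because the check happens at creation time, a user whose UUID collides gets an error immediately and can retry with a different UUID, instead of discovering two ports with the same phys_port_name later.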
> -----Original Message-----
> From: Jiri Pirko <jiri@resnulli.us>
> Sent: Thursday, August 22, 2019 5:50 PM
> To: Parav Pandit <parav@mellanox.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>; Jiri Pirko
> <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>; Kirti
> Wankhede <kwankhede@nvidia.com>; Cornelia Huck <cohuck@redhat.com>;
> kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>;
> netdev@vger.kernel.org
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> Thu, Aug 22, 2019 at 12:04:02PM CEST, parav@mellanox.com wrote:
> >> Driver should make sure phys port name wont collide,
> >So when mdev creation is initiated, mdev core calculates the alias and if there is any other mdev with same alias exist, it returns -EEXIST error before progressing further.
> >This way user will get to know upfront in event of collision before the mdev device gets created.
> >How about that?
>
> Sounds fine to me. Now the question is how many chars do we want to have.
>
12 characters from Alex's suggestion similar to git?
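The character budget behind the "12 characters" proposal can be checked with a few lines of Python. This is a sketch under assumptions: `IFNAMSIZ` is 16 on Linux (15 usable characters plus the trailing NUL), the `enm` prefix is the one proposed in this thread, and `netdev_name` is a hypothetical helper, not an existing udev or kernel function.

```python
IFNAMSIZ = 16  # Linux interface-name buffer size, including the trailing NUL


def netdev_name(alias, prefix="enm", alias_chars=12):
    """Build a netdev name from a prefix and the leading alias characters."""
    name = prefix + alias[:alias_chars]
    if len(name) > IFNAMSIZ - 1:
        raise ValueError(
            "%r needs %d characters, limit is %d" % (name, len(name), IFNAMSIZ - 1)
        )
    return name


alias = "0123456789abcdef0123"  # placeholder digest, not a real mdev alias

# "enm" + 12 alias characters uses all 15 available characters.
print(netdev_name(alias))

# A 6-character prefix such as "ens2f0" leaves room for only 9 alias characters.
print(netdev_name(alias, prefix="ens2f0", alias_chars=9))
```

With 12 alias characters, exactly 3 characters remain for everything else in the name, which is the constraint raised at the end of the quoted exchange.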
Thu, Aug 22, 2019 at 03:33:30PM CEST, parav@mellanox.com wrote:
>> Sounds fine to me. Now the question is how many chars do we want to have.
>>
>12 characters from Alex's suggestion similar to git?

Ok.
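To put rough numbers on the choice between 9 and 12 characters, the standard birthday-bound approximation can be evaluated directly. This is a back-of-the-envelope sketch that assumes abbreviated SHA-1 values behave like uniformly random hex strings; `collision_probability` is a hypothetical helper name.

```python
import math


def collision_probability(devices, hex_chars):
    """Approximate chance that any two of `devices` aliases share a prefix.

    Uses the birthday approximation p = 1 - exp(-n(n-1) / (2N)),
    where N = 16**hex_chars is the number of possible aliases.
    """
    space = 16 ** hex_chars
    pairs = devices * (devices - 1) / 2
    return -math.expm1(-pairs / space)


for chars in (8, 9, 12):
    for devices in (1_000, 10_000, 100_000):
        p = collision_probability(devices, chars)
        print("%2d chars, %7d devices: p = %.3g" % (chars, devices, p))
```

Under this model, 10,000 devices with 9 characters give a collision probability below 0.1%, and 12 characters lower it by a further factor of 4096, so an up-front -EEXIST check should rarely fire in practice.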
Hi Alex,

> -----Original Message-----
> From: Jiri Pirko <jiri@resnulli.us>
> Sent: Friday, August 23, 2019 1:42 PM
> To: Parav Pandit <parav@mellanox.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>; Jiri Pirko
> <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>; Kirti
> Wankhede <kwankhede@nvidia.com>; Cornelia Huck <cohuck@redhat.com>;
> kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>;
> netdev@vger.kernel.org
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> Thu, Aug 22, 2019 at 03:33:30PM CEST, parav@mellanox.com wrote:
> >> Sounds fine to me. Now the question is how many chars do we want to have.
> >>
> >12 characters from Alex's suggestion similar to git?
>
> Ok.
>
Can you please confirm this scheme looks good now? I like to get patches started.
On Fri, 23 Aug 2019 08:14:39 +0000 Parav Pandit <parav@mellanox.com> wrote: > Hi Alex, > > > > -----Original Message----- > > From: Jiri Pirko <jiri@resnulli.us> > > Sent: Friday, August 23, 2019 1:42 PM > > To: Parav Pandit <parav@mellanox.com> > > Cc: Alex Williamson <alex.williamson@redhat.com>; Jiri Pirko > > <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>; Kirti > > Wankhede <kwankhede@nvidia.com>; Cornelia Huck <cohuck@redhat.com>; > > kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>; > > netdev@vger.kernel.org > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > Thu, Aug 22, 2019 at 03:33:30PM CEST, parav@mellanox.com wrote: > > > > > > > > >> -----Original Message----- > > >> From: Jiri Pirko <jiri@resnulli.us> > > >> Sent: Thursday, August 22, 2019 5:50 PM > > >> To: Parav Pandit <parav@mellanox.com> > > >> Cc: Alex Williamson <alex.williamson@redhat.com>; Jiri Pirko > > >> <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>; Kirti > > >> Wankhede <kwankhede@nvidia.com>; Cornelia Huck > > <cohuck@redhat.com>; > > >> kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia > > >> <cjia@nvidia.com>; netdev@vger.kernel.org > > >> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > >> > > >> Thu, Aug 22, 2019 at 12:04:02PM CEST, parav@mellanox.com wrote: > > >> > > > >> > > > >> >> -----Original Message----- > > >> >> From: Jiri Pirko <jiri@resnulli.us> > > >> >> Sent: Thursday, August 22, 2019 3:28 PM > > >> >> To: Parav Pandit <parav@mellanox.com> > > >> >> Cc: Alex Williamson <alex.williamson@redhat.com>; Jiri Pirko > > >> >> <jiri@mellanox.com>; David S . 
Miller <davem@davemloft.net>; Kirti > > >> >> Wankhede <kwankhede@nvidia.com>; Cornelia Huck > > >> <cohuck@redhat.com>; > > >> >> kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia > > >> >> <cjia@nvidia.com>; netdev@vger.kernel.org > > >> >> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > >> >> > > >> >> Thu, Aug 22, 2019 at 11:42:13AM CEST, parav@mellanox.com wrote: > > >> >> > > > >> >> > > > >> >> >> -----Original Message----- > > >> >> >> From: Jiri Pirko <jiri@resnulli.us> > > >> >> >> Sent: Thursday, August 22, 2019 2:59 PM > > >> >> >> To: Parav Pandit <parav@mellanox.com> > > >> >> >> Cc: Alex Williamson <alex.williamson@redhat.com>; Jiri Pirko > > >> >> >> <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>; > > >> >> >> Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck > > >> >> <cohuck@redhat.com>; > > >> >> >> kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia > > >> >> >> <cjia@nvidia.com>; netdev@vger.kernel.org > > >> >> >> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > >> >> >> > > >> >> >> Wed, Aug 21, 2019 at 08:23:17AM CEST, parav@mellanox.com wrote: > > >> >> >> > > > >> >> >> > > > >> >> >> >> -----Original Message----- > > >> >> >> >> From: Alex Williamson <alex.williamson@redhat.com> > > >> >> >> >> Sent: Wednesday, August 21, 2019 10:56 AM > > >> >> >> >> To: Parav Pandit <parav@mellanox.com> > > >> >> >> >> Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller > > >> >> >> >> <davem@davemloft.net>; Kirti Wankhede > > >> >> >> >> <kwankhede@nvidia.com>; Cornelia Huck <cohuck@redhat.com>; > > >> >> >> >> kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia > > >> >> >> >> <cjia@nvidia.com>; netdev@vger.kernel.org > > >> >> >> >> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev > > >> >> >> >> core > > >> >> >> >> > > >> >> >> >> > > > > Just an example of the alias, not proposing how it's set. 
> > >> >> >> >> > > > > In fact, proposing that the user does not set it, > > >> >> >> >> > > > > mdev-core provides one > > >> >> >> >> > > automatically. > > >> >> >> >> > > > > > > >> >> >> >> > > > > > > Since there seems to be some prefix overhead, as > > >> >> >> >> > > > > > > I ask about above in how many characters we > > >> >> >> >> > > > > > > actually have to work with in IFNAMESZ, maybe we > > >> >> >> >> > > > > > > start with > > >> >> >> >> > > > > > > 8 characters (matching your "index" namespace) > > >> >> >> >> > > > > > > and expand as necessary for > > >> >> >> disambiguation. > > >> >> >> >> > > > > > > If we can eliminate overhead in IFNAMESZ, let's > > >> >> >> >> > > > > > > start with > > >> 12. > > >> >> >> >> > > > > > > Thanks, > > >> >> >> >> > > > > > > > > >> >> >> >> > > > > > If user is going to choose the alias, why does it > > >> >> >> >> > > > > > have to be limited to > > >> >> >> >> sha1? > > >> >> >> >> > > > > > Or you just told it as an example? > > >> >> >> >> > > > > > > > >> >> >> >> > > > > > It can be an alpha-numeric string. > > >> >> >> >> > > > > > > >> >> >> >> > > > > No, I'm proposing a different solution where > > >> >> >> >> > > > > mdev-core creates an alias based on an abbreviated > > >> >> >> >> > > > > sha1. The user does not provide the > > >> >> >> >> alias. > > >> >> >> >> > > > > > > >> >> >> >> > > > > > Instead of mdev imposing number of characters on > > >> >> >> >> > > > > > the alias, it should be best > > >> >> >> >> > > > > left to the user. > > >> >> >> >> > > > > > Because in future if netdev improves on the naming > > >> >> >> >> > > > > > scheme, mdev will be > > >> >> >> >> > > > > limiting it, which is not right. > > >> >> >> >> > > > > > So not restricting alias size seems right to me. > > >> >> >> >> > > > > > User configuring mdev for networking devices in a > > >> >> >> >> > > > > > given kernel knows what > > >> >> >> >> > > > > user is doing. 
> > >> >> >> >> > > > > > So user can choose alias name size as it finds suitable. > > >> >> >> >> > > > > > > >> >> >> >> > > > > That's not what I'm proposing, please read again. > > >> >> >> >> > > > > Thanks, > > >> >> >> >> > > > > > >> >> >> >> > > > I understood your point. But mdev doesn't know how > > >> >> >> >> > > > user is going to use > > >> >> >> >> > > udev/systemd to name the netdev. > > >> >> >> >> > > > So even if mdev chose to pick 12 characters, it could > > >> >> >> >> > > > result in > > >> >> collision. > > >> >> >> >> > > > Hence the proposal to provide the alias by the user, > > >> >> >> >> > > > as user know the best > > >> >> >> >> > > policy for its use case in the environment its using. > > >> >> >> >> > > > So 12 character sha1 method will still work by user. > > >> >> >> >> > > > > >> >> >> >> > > Haven't you already provided examples where certain > > >> >> >> >> > > drivers or subsystems have unique netdev prefixes? If > > >> >> >> >> > > mdev provides a unique alias within the subsystem, > > >> >> >> >> > > couldn't we simply define a netdev prefix for the mdev > > >> >> >> >> > > subsystem and avoid all other collisions? I'm not in > > >> >> >> >> > > favor of the user providing both a uuid and an > > >> >> >> >> > > alias/instance. Thanks, > > >> >> >> >> > > > > >> >> >> >> > For a given prefix, say ens2f0, can two UUID->sha1 first 9 > > >> >> >> >> > characters have > > >> >> >> >> collision? > > >> >> >> >> > > >> >> >> >> I think it would be a mistake to waste so many chars on a > > >> >> >> >> prefix, but > > >> >> >> >> 9 characters of sha1 likely wouldn't have a collision before > > >> >> >> >> we have 10s of thousands of devices. Thanks, > > >> >> >> >> > > >> >> >> >> Alex > > >> >> >> > > > >> >> >> >Jiri, Dave, > > >> >> >> >Are you ok with it for devlink/netdev part? > > >> >> >> >Mdev core will create an alias from a UUID. 
> > >> >> >> > > > >> >> >> >This will be supplied during devlink port attr set such as, > > >> >> >> > > > >> >> >> >devlink_port_attrs_mdev_set(struct devlink_port *port, const > > >> >> >> >char *mdev_alias); > > >> >> >> > > > >> >> >> >This alias is used to generate representor netdev's phys_port_name. > > >> >> >> >This alias from the mdev device's sysfs will be used by the > > >> >> >> >udev/systemd to > > >> >> >> generate predicable netdev's name. > > >> >> >> >Example: enm<mdev_alias_first_12_chars> > > >> >> >> > > >> >> >> What happens in unlikely case of 2 UUIDs collide? > > >> >> >> > > >> >> >Since users sees two devices with same phys_port_name, user > > >> >> >should destroy > > >> >> recently created mdev and recreate mdev with different UUID? > > >> >> > > >> >> Driver should make sure phys port name wont collide, > > >> >So when mdev creation is initiated, mdev core calculates the alias > > >> >and if there > > >> is any other mdev with same alias exist, it returns -EEXIST error > > >> before progressing further. > > >> >This way user will get to know upfront in event of collision before > > >> >the mdev > > >> device gets created. > > >> >How about that? > > >> > > >> Sounds fine to me. Now the question is how many chars do we want to have. > > >> > > >12 characters from Alex's suggestion similar to git? > > > > Ok. > > > > Can you please confirm this scheme looks good now? I like to get patches started. My only concern is your comment that in the event of an abbreviated sha1 collision (as exceptionally rare as that might be at 12-chars), we'd fail the device create, while my original suggestion was that vfio-core would add an extra character to the alias. For non-networking devices, the sha1 is unnecessary, so the extension behavior seems preferred. The user is only responsible to provide a unique uuid. Perhaps the failure behavior could be applied based on the mdev device_api. 
A module option on mdev to specify the default number of alias chars
would also be useful for testing so that we can set it low enough to
validate the collision behavior. Thanks,

Alex

> > >> >> in this case that it does
> > >> >> not provide 2 same attrs for 2 different ports.
> > >> >> Hmm, so the order of creation matters. That is not good.
> > >> >>
> > >> >> >> >I took Ethernet mdev as an example.
> > >> >> >> >New prefix 'm' stands for mediated device.
> > >> >> >> >Remaining 12 characters are first 12 chars of the mdev alias.
> > >> >> >>
> > >> >> >> Does this resolve the identification of devlink port representor?
> > >> >> >Not sure if I understood your question correctly, attempting to answer below.
> > >> >> >phys_port_name of devlink port is defined by the first 12 characters of mdev alias.
> > >> >> >> I assume you want to use the same 12(or so) chars, don't you?
> > >> >> >Mdev's netdev will also use the same mdev alias from the sysfs to rename netdev name from ethX to enm<mdev_alias>, where en=Ethernet, m=mdev.
> > >> >> >
> > >> >> >So yes, same 12 characters are used for mdev's netdev and mdev devlink port's phys_port_name.
> > >> >> >
> > >> >> >Is that what you are asking?
> > >> >>
> > >> >> Yes. Then you have 3 chars to handle the rest of the name (pci, pf)...
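As a rough check on how rare a collision is at 9 versus 12 characters, the birthday bound can be evaluated directly. This is only an estimate, assuming the abbreviated sha1 digits behave like uniformly random hex digits; the function name and the device counts are illustrative and not taken from any patch in this thread.

```python
import math


def collision_probability(num_devices: int, alias_hex_chars: int) -> float:
    """Birthday-bound estimate of the chance that at least two of
    num_devices aliases of alias_hex_chars hex digits are identical.

    Assumption: aliases are uniformly distributed over 16**alias_hex_chars
    values, which is how a truncated SHA-1 digest is expected to behave.
    """
    space = 16 ** alias_hex_chars
    pairs = num_devices * (num_devices - 1) / 2
    # 1 - exp(-pairs/space), computed with expm1 for accuracy near zero
    return -math.expm1(-pairs / space)


# Illustrative device counts, not figures from the thread.
for chars in (8, 9, 12):
    for devices in (1_000, 10_000, 100_000):
        p = collision_probability(devices, chars)
        print(f"{chars:2d} hex chars, {devices:7,d} devices: p ~ {p:.2e}")
```

Under this model the 12-character case stays far below the 9-character case for the same device count, which is consistent with the "exceptionally rare" wording above.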
> -----Original Message-----
> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Friday, August 23, 2019 7:58 PM
> To: Parav Pandit <parav@mellanox.com>
> Cc: Jiri Pirko <jiri@resnulli.us>; Jiri Pirko <jiri@mellanox.com>; David S . Miller
> <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>; Cornelia
> Huck <cohuck@redhat.com>; kvm@vger.kernel.org;
> linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>; netdev@vger.kernel.org
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> On Fri, 23 Aug 2019 08:14:39 +0000
> Parav Pandit <parav@mellanox.com> wrote:
>
> > Hi Alex,
> >
> > [...]
> >
> > Can you please confirm this scheme looks good now? I like to get patches started.
>
> My only concern is your comment that in the event of an abbreviated
> sha1 collision (as exceptionally rare as that might be at 12-chars), we'd fail the
> device create, while my original suggestion was that vfio-core would add an
> extra character to the alias. For non-networking devices, the sha1 is
> unnecessary, so the extension behavior seems preferred. The user is only
> responsible to provide a unique uuid. Perhaps the failure behavior could be
> applied based on the mdev device_api. A module option on mdev to specify the
> default number of alias chars would also be useful for testing so that we can set
> it low enough to validate the collision behavior. Thanks,
>

The idea is to make the mdev alias optional.
Each mdev_parent says whether it wants mdev_core to generate an alias or not.
So only networking device drivers would set it to true.
For the rest, an alias won't be generated and won't be compared at creation time.
The user continues to provide only a uuid.
I am tempted to have alias collision detection only among child mdevs of the same parent, but doing so would always mandate a prefix in the netdev name.
And currently we are left with only 3 characters for the prefix, so that may not be good either.
Hence, I think an mdev-core-wide alias with 12 characters is better.

I do not understand how an extra character reduces collisions, if that's what you meant.
Module options are mostly discouraged nowadays in other subsystems/drivers.

For testing the collision rate, a sample user space script and the sample mtty driver are easy to use and give us a collision count too.
We shouldn't add a module option to a production kernel for that.
I practically have the code ready to play with; changing 12 to a smaller value is easy with a module reload.

#define MDEV_ALIAS_LEN 12
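The "sample user space script" idea can be sketched roughly as follows. This is not code from the patch series: `mdev_alias`, `MdevRegistry` and `count_collisions` are hypothetical names, and hashing the canonical UUID string is an assumption, since the thread does not say exactly which bytes the sha1 is computed over. The registry models the proposed behavior of failing creation with EEXIST when an alias already exists across the whole mdev core.

```python
import errno
import hashlib
import random
import uuid

MDEV_ALIAS_LEN = 12  # same value as the #define quoted in the thread


def mdev_alias(dev_uuid: str, length: int = MDEV_ALIAS_LEN) -> str:
    """Abbreviated SHA-1 of the UUID string (input encoding is an assumption)."""
    return hashlib.sha1(dev_uuid.encode("ascii")).hexdigest()[:length]


class MdevRegistry:
    """Hypothetical user space stand-in for the core-wide list of mdevs."""

    def __init__(self, alias_len: int = MDEV_ALIAS_LEN):
        self.alias_len = alias_len
        self.by_alias = {}

    def create(self, dev_uuid: str) -> str:
        alias = mdev_alias(dev_uuid, self.alias_len)
        if alias in self.by_alias:
            # Fail up front, before the device exists, as proposed above.
            raise FileExistsError(errno.EEXIST, "mdev alias collision", alias)
        self.by_alias[alias] = dev_uuid
        return alias


def count_collisions(num_devices: int, alias_len: int, seed: int = 0) -> int:
    """Create num_devices pseudo-random UUIDs and count rejected creations."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    registry = MdevRegistry(alias_len)
    collisions = 0
    for _ in range(num_devices):
        dev_uuid = str(uuid.UUID(int=rng.getrandbits(128), version=4))
        try:
            registry.create(dev_uuid)
        except FileExistsError:
            collisions += 1
    return collisions


# Shorter aliases stand in for "changing 12 to a smaller value".
for alias_len in (2, 4, MDEV_ALIAS_LEN):
    print(alias_len, count_collisions(1000, alias_len))
```

Shrinking `alias_len` plays the role of lowering `MDEV_ALIAS_LEN` and reloading the module, without needing a module option.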
Fri, Aug 23, 2019 at 04:53:06PM CEST, parav@mellanox.com wrote: > > >> -----Original Message----- >> From: Alex Williamson <alex.williamson@redhat.com> >> Sent: Friday, August 23, 2019 7:58 PM >> To: Parav Pandit <parav@mellanox.com> >> Cc: Jiri Pirko <jiri@resnulli.us>; Jiri Pirko <jiri@mellanox.com>; David S . Miller >> <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>; Cornelia >> Huck <cohuck@redhat.com>; kvm@vger.kernel.org; linux- >> kernel@vger.kernel.org; cjia <cjia@nvidia.com>; netdev@vger.kernel.org >> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core >> >> On Fri, 23 Aug 2019 08:14:39 +0000 >> Parav Pandit <parav@mellanox.com> wrote: >> >> > Hi Alex, >> > >> > >> > > -----Original Message----- >> > > From: Jiri Pirko <jiri@resnulli.us> >> > > Sent: Friday, August 23, 2019 1:42 PM >> > > To: Parav Pandit <parav@mellanox.com> >> > > Cc: Alex Williamson <alex.williamson@redhat.com>; Jiri Pirko >> > > <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>; Kirti >> > > Wankhede <kwankhede@nvidia.com>; Cornelia Huck >> <cohuck@redhat.com>; >> > > kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia >> > > <cjia@nvidia.com>; netdev@vger.kernel.org >> > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core >> > > >> > > Thu, Aug 22, 2019 at 03:33:30PM CEST, parav@mellanox.com wrote: >> > > > >> > > > >> > > >> -----Original Message----- >> > > >> From: Jiri Pirko <jiri@resnulli.us> >> > > >> Sent: Thursday, August 22, 2019 5:50 PM >> > > >> To: Parav Pandit <parav@mellanox.com> >> > > >> Cc: Alex Williamson <alex.williamson@redhat.com>; Jiri Pirko >> > > >> <jiri@mellanox.com>; David S . 
Miller <davem@davemloft.net>; >> > > >> Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck >> > > <cohuck@redhat.com>; >> > > >> kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia >> > > >> <cjia@nvidia.com>; netdev@vger.kernel.org >> > > >> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core >> > > >> >> > > >> Thu, Aug 22, 2019 at 12:04:02PM CEST, parav@mellanox.com wrote: >> > > >> > >> > > >> > >> > > >> >> -----Original Message----- >> > > >> >> From: Jiri Pirko <jiri@resnulli.us> >> > > >> >> Sent: Thursday, August 22, 2019 3:28 PM >> > > >> >> To: Parav Pandit <parav@mellanox.com> >> > > >> >> Cc: Alex Williamson <alex.williamson@redhat.com>; Jiri Pirko >> > > >> >> <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>; >> > > >> >> Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck >> > > >> <cohuck@redhat.com>; >> > > >> >> kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia >> > > >> >> <cjia@nvidia.com>; netdev@vger.kernel.org >> > > >> >> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core >> > > >> >> >> > > >> >> Thu, Aug 22, 2019 at 11:42:13AM CEST, parav@mellanox.com wrote: >> > > >> >> > >> > > >> >> > >> > > >> >> >> -----Original Message----- >> > > >> >> >> From: Jiri Pirko <jiri@resnulli.us> >> > > >> >> >> Sent: Thursday, August 22, 2019 2:59 PM >> > > >> >> >> To: Parav Pandit <parav@mellanox.com> >> > > >> >> >> Cc: Alex Williamson <alex.williamson@redhat.com>; Jiri >> > > >> >> >> Pirko <jiri@mellanox.com>; David S . 
Miller >> > > >> >> >> <davem@davemloft.net>; Kirti Wankhede >> > > >> >> >> <kwankhede@nvidia.com>; Cornelia Huck >> > > >> >> <cohuck@redhat.com>; >> > > >> >> >> kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia >> > > >> >> >> <cjia@nvidia.com>; netdev@vger.kernel.org >> > > >> >> >> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev >> > > >> >> >> core >> > > >> >> >> >> > > >> >> >> Wed, Aug 21, 2019 at 08:23:17AM CEST, parav@mellanox.com >> wrote: >> > > >> >> >> > >> > > >> >> >> > >> > > >> >> >> >> -----Original Message----- >> > > >> >> >> >> From: Alex Williamson <alex.williamson@redhat.com> >> > > >> >> >> >> Sent: Wednesday, August 21, 2019 10:56 AM >> > > >> >> >> >> To: Parav Pandit <parav@mellanox.com> >> > > >> >> >> >> Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller >> > > >> >> >> >> <davem@davemloft.net>; Kirti Wankhede >> > > >> >> >> >> <kwankhede@nvidia.com>; Cornelia Huck >> > > >> >> >> >> <cohuck@redhat.com>; kvm@vger.kernel.org; >> > > >> >> >> >> linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>; >> > > >> >> >> >> netdev@vger.kernel.org >> > > >> >> >> >> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and >> > > >> >> >> >> mdev core >> > > >> >> >> >> >> > > >> >> >> >> > > > > Just an example of the alias, not proposing how it's set. >> > > >> >> >> >> > > > > In fact, proposing that the user does not set >> > > >> >> >> >> > > > > it, mdev-core provides one >> > > >> >> >> >> > > automatically. >> > > >> >> >> >> > > > > >> > > >> >> >> >> > > > > > > Since there seems to be some prefix >> > > >> >> >> >> > > > > > > overhead, as I ask about above in how many >> > > >> >> >> >> > > > > > > characters we actually have to work with in >> > > >> >> >> >> > > > > > > IFNAMESZ, maybe we start with >> > > >> >> >> >> > > > > > > 8 characters (matching your "index" >> > > >> >> >> >> > > > > > > namespace) and expand as necessary for >> > > >> >> >> disambiguation. 
>> > > >> >> >> >> > > > > > > If we can eliminate overhead in IFNAMESZ, >> > > >> >> >> >> > > > > > > let's start with >> > > >> 12. >> > > >> >> >> >> > > > > > > Thanks, >> > > >> >> >> >> > > > > > > >> > > >> >> >> >> > > > > > If user is going to choose the alias, why does >> > > >> >> >> >> > > > > > it have to be limited to >> > > >> >> >> >> sha1? >> > > >> >> >> >> > > > > > Or you just told it as an example? >> > > >> >> >> >> > > > > > >> > > >> >> >> >> > > > > > It can be an alpha-numeric string. >> > > >> >> >> >> > > > > >> > > >> >> >> >> > > > > No, I'm proposing a different solution where >> > > >> >> >> >> > > > > mdev-core creates an alias based on an >> > > >> >> >> >> > > > > abbreviated sha1. The user does not provide the >> > > >> >> >> >> alias. >> > > >> >> >> >> > > > > >> > > >> >> >> >> > > > > > Instead of mdev imposing number of characters >> > > >> >> >> >> > > > > > on the alias, it should be best >> > > >> >> >> >> > > > > left to the user. >> > > >> >> >> >> > > > > > Because in future if netdev improves on the >> > > >> >> >> >> > > > > > naming scheme, mdev will be >> > > >> >> >> >> > > > > limiting it, which is not right. >> > > >> >> >> >> > > > > > So not restricting alias size seems right to me. >> > > >> >> >> >> > > > > > User configuring mdev for networking devices >> > > >> >> >> >> > > > > > in a given kernel knows what >> > > >> >> >> >> > > > > user is doing. >> > > >> >> >> >> > > > > > So user can choose alias name size as it finds suitable. >> > > >> >> >> >> > > > > >> > > >> >> >> >> > > > > That's not what I'm proposing, please read again. >> > > >> >> >> >> > > > > Thanks, >> > > >> >> >> >> > > > >> > > >> >> >> >> > > > I understood your point. But mdev doesn't know how >> > > >> >> >> >> > > > user is going to use >> > > >> >> >> >> > > udev/systemd to name the netdev. >> > > >> >> >> >> > > > So even if mdev chose to pick 12 characters, it >> > > >> >> >> >> > > > could result in >> > > >> >> collision. 
>> > > >> >> >> >> > > > Hence the proposal to provide the alias by the >> > > >> >> >> >> > > > user, as user know the best >> > > >> >> >> >> > > policy for its use case in the environment its using. >> > > >> >> >> >> > > > So 12 character sha1 method will still work by user. >> > > >> >> >> >> > > >> > > >> >> >> >> > > Haven't you already provided examples where certain >> > > >> >> >> >> > > drivers or subsystems have unique netdev prefixes? >> > > >> >> >> >> > > If mdev provides a unique alias within the >> > > >> >> >> >> > > subsystem, couldn't we simply define a netdev prefix >> > > >> >> >> >> > > for the mdev subsystem and avoid all other >> > > >> >> >> >> > > collisions? I'm not in favor of the user providing >> > > >> >> >> >> > > both a uuid and an alias/instance. Thanks, >> > > >> >> >> >> > > >> > > >> >> >> >> > For a given prefix, say ens2f0, can two UUID->sha1 >> > > >> >> >> >> > first 9 characters have >> > > >> >> >> >> collision? >> > > >> >> >> >> >> > > >> >> >> >> I think it would be a mistake to waste so many chars on >> > > >> >> >> >> a prefix, but >> > > >> >> >> >> 9 characters of sha1 likely wouldn't have a collision >> > > >> >> >> >> before we have 10s of thousands of devices. Thanks, >> > > >> >> >> >> >> > > >> >> >> >> Alex >> > > >> >> >> > >> > > >> >> >> >Jiri, Dave, >> > > >> >> >> >Are you ok with it for devlink/netdev part? >> > > >> >> >> >Mdev core will create an alias from a UUID. >> > > >> >> >> > >> > > >> >> >> >This will be supplied during devlink port attr set such >> > > >> >> >> >as, >> > > >> >> >> > >> > > >> >> >> >devlink_port_attrs_mdev_set(struct devlink_port *port, >> > > >> >> >> >const char *mdev_alias); >> > > >> >> >> > >> > > >> >> >> >This alias is used to generate representor netdev's >> phys_port_name. >> > > >> >> >> >This alias from the mdev device's sysfs will be used by >> > > >> >> >> >the udev/systemd to >> > > >> >> >> generate predicable netdev's name. 
>> > > >> >> >> >Example: enm<mdev_alias_first_12_chars> >> > > >> >> >> >> > > >> >> >> What happens in unlikely case of 2 UUIDs collide? >> > > >> >> >> >> > > >> >> >Since users sees two devices with same phys_port_name, user >> > > >> >> >should destroy >> > > >> >> recently created mdev and recreate mdev with different UUID? >> > > >> >> >> > > >> >> Driver should make sure phys port name wont collide, >> > > >> >So when mdev creation is initiated, mdev core calculates the >> > > >> >alias and if there >> > > >> is any other mdev with same alias exist, it returns -EEXIST error >> > > >> before progressing further. >> > > >> >This way user will get to know upfront in event of collision >> > > >> >before the mdev >> > > >> device gets created. >> > > >> >How about that? >> > > >> >> > > >> Sounds fine to me. Now the question is how many chars do we want to >> have. >> > > >> >> > > >12 characters from Alex's suggestion similar to git? >> > > >> > > Ok. >> > > >> > >> > Can you please confirm this scheme looks good now? I like to get patches >> started. >> >> My only concern is your comment that in the event of an abbreviated >> sha1 collision (as exceptionally rare as that might be at 12-chars), we'd fail the >> device create, while my original suggestion was that vfio-core would add an >> extra character to the alias. For non-networking devices, the sha1 is >> unnecessary, so the extension behavior seems preferred. The user is only >> responsible to provide a unique uuid. Perhaps the failure behavior could be >> applied based on the mdev device_api. A module option on mdev to specify the >> default number of alias chars would also be useful for testing so that we can set >> it low enough to validate the collision behavior. Thanks, >> > >Idea is to have mdev alias as optional. >Each mdev_parent says whether it wants mdev_core to generate an alias or not. >So only networking device drivers would set it to true. 
>For rest, alias won't be generated, and won't be compared either during creation time. >User continue to provide only uuid. >I am tempted to have alias collision detection only within children mdevs of the same parent, but doing so will always mandate to prefix in netdev name. >And currently we are left with only 3 characters to prefix it, so that may not be good either. >Hence, I think mdev core wide alias is better with 12 characters. > >I do not understand how an extra character reduces collision, if that's what you meant. Also, that breaks the naming consistency for different creation order. >Module options are almost not encouraged anymore with other subsystems/drivers. > >For testing collision rate, a sample user space script and sample mtty is easy and get us collision count too. >We shouldn't put that using module option in production kernel. >I practically have the code ready to play with; Changing 12 to smaller value is easy with module reload. > >#define MDEV_ALIAS_LEN 12 > >> Alex >> >> > > >> >> in this case that it does >> > > >> >> not provide 2 same attrs for 2 different ports. >> > > >> >> Hmm, so the order of creation matters. That is not good. >> > > >> >> >> > > >> >> >> >> > > >> >> >> >I took Ethernet mdev as an example. >> > > >> >> >> >New prefix 'm' stands for mediated device. >> > > >> >> >> >Remaining 12 characters are first 12 chars of the mdev alias. >> > > >> >> >> >> > > >> >> >> Does this resolve the identification of devlink port representor? >> > > >> >> >Not sure if I understood your question correctly, attemping >> > > >> >> >to answer >> > > >> below. >> > > >> >> >phys_port_name of devlink port is defined by the first 12 >> > > >> >> >characters of mdev >> > > >> >> alias. >> > > >> >> >> I assume you want to use the same 12(or so) chars, don't you? 
>> > > >> >> >Mdev's netdev will also use the same mdev alias from the >> > > >> >> >sysfs to rename >> > > >> >> netdev name from ethX to enm<mdev_alias>, where en=Etherenet, >> > > >> m=mdev. >> > > >> >> > >> > > >> >> >So yes, same 12 characters are use for mdev's netdev and mdev >> > > >> >> >devlink port's >> > > >> >> phys_port_name. >> > > >> >> > >> > > >> >> >Is that what are you asking? >> > > >> >> >> > > >> >> Yes. Then you have 3 chars to handle the rest of the name (pci, pf)... >
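For illustration only, the name-length arithmetic discussed above can be sketched in C. This assumes the interface name limit is 16 bytes including the terminating NUL (mirroring the kernel's IFNAMSIZ, which the thread writes as IFNAMESZ); NETDEV_NAME_MAX and mdev_netdev_name() are hypothetical names, not part of mdev core or netdev. With the 3-character prefix "enm" and a 12-character alias, all 15 usable characters are consumed, which is why a longer prefix such as "ens2f0" leaves no room.

```c
#include <stdio.h>
#include <string.h>

/* Assumption: mirrors the kernel's IFNAMSIZ (16 bytes, including the NUL). */
#define NETDEV_NAME_MAX 16
/* From the thread: proposed alias length in hex characters. */
#define MDEV_ALIAS_LEN 12

/*
 * Hypothetical helper: build "<prefix><first MDEV_ALIAS_LEN chars of alias>".
 * Returns 0 on success, -1 if the result would not fit in an interface name.
 */
static int mdev_netdev_name(const char *prefix, const char *alias,
                            char out[NETDEV_NAME_MAX])
{
    size_t alias_len = strlen(alias);

    if (alias_len > MDEV_ALIAS_LEN)
        alias_len = MDEV_ALIAS_LEN;

    /* One byte is reserved for the terminating NUL. */
    if (strlen(prefix) + alias_len > NETDEV_NAME_MAX - 1)
        return -1;

    snprintf(out, NETDEV_NAME_MAX, "%s%.*s", prefix, MDEV_ALIAS_LEN, alias);
    return 0;
}
```

Calling mdev_netdev_name("enm", alias, name) with a full hex digest keeps only its first 12 characters; calling it with the prefix "ens2f0" is rejected because 6 + 12 characters exceed the 15 available.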
On Fri, 23 Aug 2019 14:53:06 +0000 Parav Pandit <parav@mellanox.com> wrote: > > -----Original Message----- > > From: Alex Williamson <alex.williamson@redhat.com> > > Sent: Friday, August 23, 2019 7:58 PM > > To: Parav Pandit <parav@mellanox.com> > > Cc: Jiri Pirko <jiri@resnulli.us>; Jiri Pirko <jiri@mellanox.com>; David S . Miller > > <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>; Cornelia > > Huck <cohuck@redhat.com>; kvm@vger.kernel.org; linux- > > kernel@vger.kernel.org; cjia <cjia@nvidia.com>; netdev@vger.kernel.org > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > On Fri, 23 Aug 2019 08:14:39 +0000 > > Parav Pandit <parav@mellanox.com> wrote: > > > > > Hi Alex, > > > > > > > > > > -----Original Message----- > > > > From: Jiri Pirko <jiri@resnulli.us> > > > > Sent: Friday, August 23, 2019 1:42 PM > > > > To: Parav Pandit <parav@mellanox.com> > > > > Cc: Alex Williamson <alex.williamson@redhat.com>; Jiri Pirko > > > > <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>; Kirti > > > > Wankhede <kwankhede@nvidia.com>; Cornelia Huck > > <cohuck@redhat.com>; > > > > kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia > > > > <cjia@nvidia.com>; netdev@vger.kernel.org > > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > > > > > Thu, Aug 22, 2019 at 03:33:30PM CEST, parav@mellanox.com wrote: > > > > > > > > > > > > > > >> -----Original Message----- > > > > >> From: Jiri Pirko <jiri@resnulli.us> > > > > >> Sent: Thursday, August 22, 2019 5:50 PM > > > > >> To: Parav Pandit <parav@mellanox.com> > > > > >> Cc: Alex Williamson <alex.williamson@redhat.com>; Jiri Pirko > > > > >> <jiri@mellanox.com>; David S . 
Miller <davem@davemloft.net>; > > > > >> Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck > > > > <cohuck@redhat.com>; > > > > >> kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia > > > > >> <cjia@nvidia.com>; netdev@vger.kernel.org > > > > >> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > >> > > > > >> Thu, Aug 22, 2019 at 12:04:02PM CEST, parav@mellanox.com wrote: > > > > >> > > > > > >> > > > > > >> >> -----Original Message----- > > > > >> >> From: Jiri Pirko <jiri@resnulli.us> > > > > >> >> Sent: Thursday, August 22, 2019 3:28 PM > > > > >> >> To: Parav Pandit <parav@mellanox.com> > > > > >> >> Cc: Alex Williamson <alex.williamson@redhat.com>; Jiri Pirko > > > > >> >> <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>; > > > > >> >> Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck > > > > >> <cohuck@redhat.com>; > > > > >> >> kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia > > > > >> >> <cjia@nvidia.com>; netdev@vger.kernel.org > > > > >> >> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > >> >> > > > > >> >> Thu, Aug 22, 2019 at 11:42:13AM CEST, parav@mellanox.com wrote: > > > > >> >> > > > > > >> >> > > > > > >> >> >> -----Original Message----- > > > > >> >> >> From: Jiri Pirko <jiri@resnulli.us> > > > > >> >> >> Sent: Thursday, August 22, 2019 2:59 PM > > > > >> >> >> To: Parav Pandit <parav@mellanox.com> > > > > >> >> >> Cc: Alex Williamson <alex.williamson@redhat.com>; Jiri > > > > >> >> >> Pirko <jiri@mellanox.com>; David S . 
Miller > > > > >> >> >> <davem@davemloft.net>; Kirti Wankhede > > > > >> >> >> <kwankhede@nvidia.com>; Cornelia Huck > > > > >> >> <cohuck@redhat.com>; > > > > >> >> >> kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia > > > > >> >> >> <cjia@nvidia.com>; netdev@vger.kernel.org > > > > >> >> >> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev > > > > >> >> >> core > > > > >> >> >> > > > > >> >> >> Wed, Aug 21, 2019 at 08:23:17AM CEST, parav@mellanox.com > > wrote: > > > > >> >> >> > > > > > >> >> >> > > > > > >> >> >> >> -----Original Message----- > > > > >> >> >> >> From: Alex Williamson <alex.williamson@redhat.com> > > > > >> >> >> >> Sent: Wednesday, August 21, 2019 10:56 AM > > > > >> >> >> >> To: Parav Pandit <parav@mellanox.com> > > > > >> >> >> >> Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller > > > > >> >> >> >> <davem@davemloft.net>; Kirti Wankhede > > > > >> >> >> >> <kwankhede@nvidia.com>; Cornelia Huck > > > > >> >> >> >> <cohuck@redhat.com>; kvm@vger.kernel.org; > > > > >> >> >> >> linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>; > > > > >> >> >> >> netdev@vger.kernel.org > > > > >> >> >> >> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and > > > > >> >> >> >> mdev core > > > > >> >> >> >> > > > > >> >> >> >> > > > > Just an example of the alias, not proposing how it's set. > > > > >> >> >> >> > > > > In fact, proposing that the user does not set > > > > >> >> >> >> > > > > it, mdev-core provides one > > > > >> >> >> >> > > automatically. > > > > >> >> >> >> > > > > > > > > >> >> >> >> > > > > > > Since there seems to be some prefix > > > > >> >> >> >> > > > > > > overhead, as I ask about above in how many > > > > >> >> >> >> > > > > > > characters we actually have to work with in > > > > >> >> >> >> > > > > > > IFNAMESZ, maybe we start with > > > > >> >> >> >> > > > > > > 8 characters (matching your "index" > > > > >> >> >> >> > > > > > > namespace) and expand as necessary for > > > > >> >> >> disambiguation. 
> > > > >> >> >> >> > > > > > > If we can eliminate overhead in IFNAMESZ, > > > > >> >> >> >> > > > > > > let's start with > > > > >> 12. > > > > >> >> >> >> > > > > > > Thanks, > > > > >> >> >> >> > > > > > > > > > > >> >> >> >> > > > > > If user is going to choose the alias, why does > > > > >> >> >> >> > > > > > it have to be limited to > > > > >> >> >> >> sha1? > > > > >> >> >> >> > > > > > Or you just told it as an example? > > > > >> >> >> >> > > > > > > > > > >> >> >> >> > > > > > It can be an alpha-numeric string. > > > > >> >> >> >> > > > > > > > > >> >> >> >> > > > > No, I'm proposing a different solution where > > > > >> >> >> >> > > > > mdev-core creates an alias based on an > > > > >> >> >> >> > > > > abbreviated sha1. The user does not provide the > > > > >> >> >> >> alias. > > > > >> >> >> >> > > > > > > > > >> >> >> >> > > > > > Instead of mdev imposing number of characters > > > > >> >> >> >> > > > > > on the alias, it should be best > > > > >> >> >> >> > > > > left to the user. > > > > >> >> >> >> > > > > > Because in future if netdev improves on the > > > > >> >> >> >> > > > > > naming scheme, mdev will be > > > > >> >> >> >> > > > > limiting it, which is not right. > > > > >> >> >> >> > > > > > So not restricting alias size seems right to me. > > > > >> >> >> >> > > > > > User configuring mdev for networking devices > > > > >> >> >> >> > > > > > in a given kernel knows what > > > > >> >> >> >> > > > > user is doing. > > > > >> >> >> >> > > > > > So user can choose alias name size as it finds suitable. > > > > >> >> >> >> > > > > > > > > >> >> >> >> > > > > That's not what I'm proposing, please read again. > > > > >> >> >> >> > > > > Thanks, > > > > >> >> >> >> > > > > > > > >> >> >> >> > > > I understood your point. But mdev doesn't know how > > > > >> >> >> >> > > > user is going to use > > > > >> >> >> >> > > udev/systemd to name the netdev. 
> > > > >> >> >> >> > > > So even if mdev chose to pick 12 characters, it > > > > >> >> >> >> > > > could result in > > > > >> >> collision. > > > > >> >> >> >> > > > Hence the proposal to provide the alias by the > > > > >> >> >> >> > > > user, as user know the best > > > > >> >> >> >> > > policy for its use case in the environment its using. > > > > >> >> >> >> > > > So 12 character sha1 method will still work by user. > > > > >> >> >> >> > > > > > > >> >> >> >> > > Haven't you already provided examples where certain > > > > >> >> >> >> > > drivers or subsystems have unique netdev prefixes? > > > > >> >> >> >> > > If mdev provides a unique alias within the > > > > >> >> >> >> > > subsystem, couldn't we simply define a netdev prefix > > > > >> >> >> >> > > for the mdev subsystem and avoid all other > > > > >> >> >> >> > > collisions? I'm not in favor of the user providing > > > > >> >> >> >> > > both a uuid and an alias/instance. Thanks, > > > > >> >> >> >> > > > > > > >> >> >> >> > For a given prefix, say ens2f0, can two UUID->sha1 > > > > >> >> >> >> > first 9 characters have > > > > >> >> >> >> collision? > > > > >> >> >> >> > > > > >> >> >> >> I think it would be a mistake to waste so many chars on > > > > >> >> >> >> a prefix, but > > > > >> >> >> >> 9 characters of sha1 likely wouldn't have a collision > > > > >> >> >> >> before we have 10s of thousands of devices. Thanks, > > > > >> >> >> >> > > > > >> >> >> >> Alex > > > > >> >> >> > > > > > >> >> >> >Jiri, Dave, > > > > >> >> >> >Are you ok with it for devlink/netdev part? > > > > >> >> >> >Mdev core will create an alias from a UUID. > > > > >> >> >> > > > > > >> >> >> >This will be supplied during devlink port attr set such > > > > >> >> >> >as, > > > > >> >> >> > > > > > >> >> >> >devlink_port_attrs_mdev_set(struct devlink_port *port, > > > > >> >> >> >const char *mdev_alias); > > > > >> >> >> > > > > > >> >> >> >This alias is used to generate representor netdev's > > phys_port_name. 
> > > > >> >> >> >This alias from the mdev device's sysfs will be used by > > > > >> >> >> >the udev/systemd to > > > > >> >> >> generate predicable netdev's name. > > > > >> >> >> >Example: enm<mdev_alias_first_12_chars> > > > > >> >> >> > > > > >> >> >> What happens in unlikely case of 2 UUIDs collide? > > > > >> >> >> > > > > >> >> >Since users sees two devices with same phys_port_name, user > > > > >> >> >should destroy > > > > >> >> recently created mdev and recreate mdev with different UUID? > > > > >> >> > > > > >> >> Driver should make sure phys port name wont collide, > > > > >> >So when mdev creation is initiated, mdev core calculates the > > > > >> >alias and if there > > > > >> is any other mdev with same alias exist, it returns -EEXIST error > > > > >> before progressing further. > > > > >> >This way user will get to know upfront in event of collision > > > > >> >before the mdev > > > > >> device gets created. > > > > >> >How about that? > > > > >> > > > > >> Sounds fine to me. Now the question is how many chars do we want to > > have. > > > > >> > > > > >12 characters from Alex's suggestion similar to git? > > > > > > > > Ok. > > > > > > > > > > Can you please confirm this scheme looks good now? I like to get patches > > started. > > > > My only concern is your comment that in the event of an abbreviated > > sha1 collision (as exceptionally rare as that might be at 12-chars), we'd fail the > > device create, while my original suggestion was that vfio-core would add an > > extra character to the alias. For non-networking devices, the sha1 is > > unnecessary, so the extension behavior seems preferred. The user is only > > responsible to provide a unique uuid. Perhaps the failure behavior could be > > applied based on the mdev device_api. A module option on mdev to specify the > > default number of alias chars would also be useful for testing so that we can set > > it low enough to validate the collision behavior. 
Thanks,
> >
>
> Idea is to have mdev alias as optional.
> Each mdev_parent says whether it wants mdev_core to generate an alias
> or not. So only networking device drivers would set it to true.
> For rest, alias won't be generated, and won't be compared either
> during creation time. User continue to provide only uuid.

Ok

> I am tempted to have alias collision detection only within children
> mdevs of the same parent, but doing so will always mandate to prefix
> in netdev name. And currently we are left with only 3 characters to
> prefix it, so that may not be good either. Hence, I think mdev core
> wide alias is better with 12 characters.

I suppose it depends on the API, if the vendor driver can ask the mdev
core for an alias as part of the device creation process, then it could
manage the netdev namespace for all its devices, choosing how many
characters to use, and fail the creation if it can't meet a uniqueness
requirement. IOW, mdev-core would always provide a full sha1 and
therefore gets itself out of the uniqueness/collision aspects.

> I do not understand how an extra character reduces collision, if
> that's what you meant.

If the default were for example 3-chars, we might already have device
'abc'. A collision would expose one more char of the new device, so we
might add device with alias 'abcd'. I mentioned previously that this
leaves an issue for userspace that we can't change the alias of device
abc, so without additional information, userspace can only determine
via elimination the mapping of alias to device, but userspace has more
information available to it in the form of sysfs links.

> Module options are almost not encouraged
> anymore with other subsystems/drivers.

We don't live in a world of absolutes. I agree that the defaults
should work in the vast majority of cases. Requiring a user to twiddle
module options to make things work is undesirable, verging on a bug. A
module option to enable some specific feature, unsafe condition, or
test that is outside of the typical use case is reasonable, imo.

> For testing collision rate, a sample user space script and sample
> mtty is easy and get us collision count too. We shouldn't put that
> using module option in production kernel. I practically have the code
> ready to play with; Changing 12 to smaller value is easy with module
> reload.
>
> #define MDEV_ALIAS_LEN 12

If it can't be tested with a shipping binary, it probably won't be
tested. Thanks,

Alex
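For illustration only, the two collision policies discussed in this message can be sketched in C: either expose one more character of the new device's digest until the alias is unique (the 'abc' then 'abcd' example), or fail the create with -EEXIST. The function alias_assign(), its parameters, and ALIAS_MAX are hypothetical and are not an mdev-core interface; the digest is assumed to be supplied as a hex string computed elsewhere.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define ALIAS_MAX 40   /* a sha1 digest is 40 hex characters */

/*
 * Hypothetical helper, not an mdev-core API.
 * existing:     aliases already in use
 * digest:       hex digest derived from the new device's uuid
 * min_len:      default alias length to try first
 * allow_extend: nonzero = add characters on collision, zero = fail at once
 * out:          receives the alias, must hold ALIAS_MAX + 1 bytes
 * Returns 0 on success, -EEXIST if no unique alias was found,
 * -EINVAL on invalid lengths.
 */
static int alias_assign(const char *const *existing, size_t n,
                        const char *digest, size_t min_len,
                        int allow_extend, char *out)
{
    size_t max_len = strlen(digest);
    size_t len, i;

    if (min_len == 0 || min_len > max_len || max_len > ALIAS_MAX)
        return -EINVAL;

    for (len = min_len; len <= max_len; len++) {
        int taken = 0;

        /* Candidate alias: the first len characters of the digest. */
        memcpy(out, digest, len);
        out[len] = '\0';

        for (i = 0; i < n; i++) {
            if (strcmp(existing[i], out) == 0) {
                taken = 1;
                break;
            }
        }
        if (!taken)
            return 0;
        if (!allow_extend)
            break;          /* fail-on-collision policy */
    }

    out[0] = '\0';
    return -EEXIST;
}
```

Note that in the extend policy the alias of the already existing device ('abc') is left unchanged, which is the userspace mapping caveat mentioned above.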
> -----Original Message-----
> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Friday, August 23, 2019 9:22 PM
> To: Parav Pandit <parav@mellanox.com>
> Cc: Jiri Pirko <jiri@resnulli.us>; Jiri Pirko <jiri@mellanox.com>; David S . Miller
> <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>; Cornelia
> Huck <cohuck@redhat.com>; kvm@vger.kernel.org;
> linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>; netdev@vger.kernel.org
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> On Fri, 23 Aug 2019 14:53:06 +0000
> Parav Pandit <parav@mellanox.com> wrote:
>
> > > My only concern is your comment that in the event of an abbreviated
> > > sha1 collision (as exceptionally rare as that might be at 12-chars),
> > > we'd fail the device create, while my original suggestion was that
> > > vfio-core would add an extra character to the alias. For
> > > non-networking devices, the sha1 is unnecessary, so the extension
> > > behavior seems preferred. The user is only responsible to provide a
> > > unique uuid. Perhaps the failure behavior could be applied based on
> > > the mdev device_api. A module option on mdev to specify the default
> > > number of alias chars would also be useful for testing so that we
> > > can set it low enough to validate the collision behavior. Thanks,
> > >
> >
> > Idea is to have mdev alias as optional.
> > Each mdev_parent says whether it wants mdev_core to generate an alias
> > or not. So only networking device drivers would set it to true.
> > For rest, alias won't be generated, and won't be compared either
> > during creation time. User continue to provide only uuid.
>
> Ok
>
> > I am tempted to have alias collision detection only within children
> > mdevs of the same parent, but doing so will always mandate to prefix
> > in netdev name. And currently we are left with only 3 characters to
> > prefix it, so that may not be good either. Hence, I think mdev core
> > wide alias is better with 12 characters.
>
> I suppose it depends on the API, if the vendor driver can ask the mdev core for
> an alias as part of the device creation process, then it could manage the netdev
> namespace for all its devices, choosing how many characters to use, and fail
> the creation if it can't meet a uniqueness requirement. IOW, mdev-core would
> always provide a full sha1 and therefore gets itself out of the
> uniqueness/collision aspects.
>

This doesn't work. At the mdev core level, the full 20-byte sha1 is unique, so mdev core allows the mdev to be created.
Then devlink core uses only 6 bytes (12 characters), and there is a collision. Things fall apart.
Since mdev provides a unique uuid-based scheme, it is the mdev core's responsibility to provide unique aliases.

> > I do not understand how an extra character reduces collision, if
> > that's what you meant.
>
> If the default were for example 3-chars, we might already have device 'abc'. A
> collision would expose one more char of the new device, so we might add
> device with alias 'abcd'. I mentioned previously that this leaves an issue for
> userspace that we can't change the alias of device abc, so without additional
> information, userspace can only determine via elimination the mapping of alias
> to device, but userspace has more information available to it in the form of
> sysfs links.
>
> > Module options are almost not encouraged anymore with other
> > subsystems/drivers.
>
> We don't live in a world of absolutes. I agree that the defaults should work in
> the vast majority of cases. Requiring a user to twiddle module options to make
> things work is undesirable, verging on a bug. A module option to enable some
> specific feature, unsafe condition, or test that is outside of the typical use case
> is reasonable, imo.
>
> > For testing collision rate, a sample user space script and sample mtty
> > is easy and get us collision count too. We shouldn't put that using
> > module option in production kernel. I practically have the code ready
> > to play with; Changing 12 to smaller value is easy with module reload.
> >
> > #define MDEV_ALIAS_LEN 12
>
> If it can't be tested with a shipping binary, it probably won't be tested. Thanks,
>

It is not the role of mdev core to expose the collision efficiency/deficiency of sha1.
It can be tested outside, before mdev chooses to use it.
I am saying we should test with 12 characters and 10,000 or more devices and see how often collisions occur.
Even if a collision occurs, mdev returns EEXIST status, telling the user to pick a different UUID in those rare cases.
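For illustration only, the collision rate at a given alias length can also be estimated analytically before any measurement with real devices. The C sketch below computes the birthday-problem probability that at least two of n devices share an alias of hex_chars hex characters. It assumes that a truncated sha1 behaves like a uniformly random value; alias_collision_probability() is a hypothetical helper, not code from the patch series.

```c
/*
 * Probability that at least two of n devices get the same alias when each
 * alias is hex_chars uniformly random hex digits (birthday problem).
 * Assumption: truncated sha1 output is treated as uniformly random.
 */
static double alias_collision_probability(unsigned long n,
                                          unsigned int hex_chars)
{
    double space = 1.0;     /* number of distinct aliases: 16^hex_chars */
    double p_unique = 1.0;  /* probability that all n aliases differ */
    unsigned long i;
    unsigned int c;

    for (c = 0; c < hex_chars; c++)
        space *= 16.0;

    /* The (i+1)-th device must avoid the i aliases already handed out. */
    for (i = 1; i < n; i++) {
        if ((double)i >= space)
            return 1.0;     /* more devices than aliases: collision certain */
        p_unique *= 1.0 - (double)i / space;
    }
    return 1.0 - p_unique;
}
```

Under that assumption, 10,000 devices with 12 characters give a probability on the order of 1e-7, and with 9 characters it stays below 1e-3, while at 3 characters a collision is certain. That gap is why a collision path is hard to exercise at the default length without lowering it.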
On Fri, 23 Aug 2019 16:14:04 +0000 Parav Pandit <parav@mellanox.com> wrote: > > -----Original Message----- > > From: Alex Williamson <alex.williamson@redhat.com> > > Sent: Friday, August 23, 2019 9:22 PM > > To: Parav Pandit <parav@mellanox.com> > > Cc: Jiri Pirko <jiri@resnulli.us>; Jiri Pirko <jiri@mellanox.com>; David S . Miller > > <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>; Cornelia > > Huck <cohuck@redhat.com>; kvm@vger.kernel.org; linux- > > kernel@vger.kernel.org; cjia <cjia@nvidia.com>; netdev@vger.kernel.org > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > On Fri, 23 Aug 2019 14:53:06 +0000 > > Parav Pandit <parav@mellanox.com> wrote: > > > > > > -----Original Message----- > > > > From: Alex Williamson <alex.williamson@redhat.com> > > > > Sent: Friday, August 23, 2019 7:58 PM > > > > To: Parav Pandit <parav@mellanox.com> > > > > Cc: Jiri Pirko <jiri@resnulli.us>; Jiri Pirko <jiri@mellanox.com>; > > > > David S . Miller <davem@davemloft.net>; Kirti Wankhede > > > > <kwankhede@nvidia.com>; Cornelia Huck <cohuck@redhat.com>; > > > > kvm@vger.kernel.org; linux- kernel@vger.kernel.org; cjia > > > > <cjia@nvidia.com>; netdev@vger.kernel.org > > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > > > > > On Fri, 23 Aug 2019 08:14:39 +0000 > > > > Parav Pandit <parav@mellanox.com> wrote: > > > > > > > > > Hi Alex, > > > > > > > > > > > > > > > > -----Original Message----- > > > > > > From: Jiri Pirko <jiri@resnulli.us> > > > > > > Sent: Friday, August 23, 2019 1:42 PM > > > > > > To: Parav Pandit <parav@mellanox.com> > > > > > > Cc: Alex Williamson <alex.williamson@redhat.com>; Jiri Pirko > > > > > > <jiri@mellanox.com>; David S . 
Miller <davem@davemloft.net>; > > > > > > Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck > > > > <cohuck@redhat.com>; > > > > > > kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia > > > > > > <cjia@nvidia.com>; netdev@vger.kernel.org > > > > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > > > > > > > > > Thu, Aug 22, 2019 at 03:33:30PM CEST, parav@mellanox.com wrote: > > > > > > > > > > > > > > > > > > > > >> -----Original Message----- > > > > > > >> From: Jiri Pirko <jiri@resnulli.us> > > > > > > >> Sent: Thursday, August 22, 2019 5:50 PM > > > > > > >> To: Parav Pandit <parav@mellanox.com> > > > > > > >> Cc: Alex Williamson <alex.williamson@redhat.com>; Jiri Pirko > > > > > > >> <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>; > > > > > > >> Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck > > > > > > <cohuck@redhat.com>; > > > > > > >> kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia > > > > > > >> <cjia@nvidia.com>; netdev@vger.kernel.org > > > > > > >> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev > > > > > > >> core > > > > > > >> > > > > > > >> Thu, Aug 22, 2019 at 12:04:02PM CEST, parav@mellanox.com wrote: > > > > > > >> > > > > > > > >> > > > > > > > >> >> -----Original Message----- > > > > > > >> >> From: Jiri Pirko <jiri@resnulli.us> > > > > > > >> >> Sent: Thursday, August 22, 2019 3:28 PM > > > > > > >> >> To: Parav Pandit <parav@mellanox.com> > > > > > > >> >> Cc: Alex Williamson <alex.williamson@redhat.com>; Jiri > > > > > > >> >> Pirko <jiri@mellanox.com>; David S . 
Miller > > > > > > >> >> <davem@davemloft.net>; Kirti Wankhede > > > > > > >> >> <kwankhede@nvidia.com>; Cornelia Huck > > > > > > >> <cohuck@redhat.com>; > > > > > > >> >> kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia > > > > > > >> >> <cjia@nvidia.com>; netdev@vger.kernel.org > > > > > > >> >> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev > > > > > > >> >> core > > > > > > >> >> > > > > > > >> >> Thu, Aug 22, 2019 at 11:42:13AM CEST, parav@mellanox.com > > wrote: > > > > > > >> >> > > > > > > > >> >> > > > > > > > >> >> >> -----Original Message----- > > > > > > >> >> >> From: Jiri Pirko <jiri@resnulli.us> > > > > > > >> >> >> Sent: Thursday, August 22, 2019 2:59 PM > > > > > > >> >> >> To: Parav Pandit <parav@mellanox.com> > > > > > > >> >> >> Cc: Alex Williamson <alex.williamson@redhat.com>; Jiri > > > > > > >> >> >> Pirko <jiri@mellanox.com>; David S . Miller > > > > > > >> >> >> <davem@davemloft.net>; Kirti Wankhede > > > > > > >> >> >> <kwankhede@nvidia.com>; Cornelia Huck > > > > > > >> >> <cohuck@redhat.com>; > > > > > > >> >> >> kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia > > > > > > >> >> >> <cjia@nvidia.com>; netdev@vger.kernel.org > > > > > > >> >> >> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and > > > > > > >> >> >> mdev core > > > > > > >> >> >> > > > > > > >> >> >> Wed, Aug 21, 2019 at 08:23:17AM CEST, > > > > > > >> >> >> parav@mellanox.com > > > > wrote: > > > > > > >> >> >> > > > > > > > >> >> >> > > > > > > > >> >> >> >> -----Original Message----- > > > > > > >> >> >> >> From: Alex Williamson <alex.williamson@redhat.com> > > > > > > >> >> >> >> Sent: Wednesday, August 21, 2019 10:56 AM > > > > > > >> >> >> >> To: Parav Pandit <parav@mellanox.com> > > > > > > >> >> >> >> Cc: Jiri Pirko <jiri@mellanox.com>; David S . 
Miller > > > > > > >> >> >> >> <davem@davemloft.net>; Kirti Wankhede > > > > > > >> >> >> >> <kwankhede@nvidia.com>; Cornelia Huck > > > > > > >> >> >> >> <cohuck@redhat.com>; kvm@vger.kernel.org; > > > > > > >> >> >> >> linux-kernel@vger.kernel.org; cjia > > > > > > >> >> >> >> <cjia@nvidia.com>; netdev@vger.kernel.org > > > > > > >> >> >> >> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and > > > > > > >> >> >> >> mdev core > > > > > > >> >> >> >> > > > > > > >> >> >> >> > > > > Just an example of the alias, not proposing how it's > > set. > > > > > > >> >> >> >> > > > > In fact, proposing that the user does not > > > > > > >> >> >> >> > > > > set it, mdev-core provides one > > > > > > >> >> >> >> > > automatically. > > > > > > >> >> >> >> > > > > > > > > > > >> >> >> >> > > > > > > Since there seems to be some prefix > > > > > > >> >> >> >> > > > > > > overhead, as I ask about above in how > > > > > > >> >> >> >> > > > > > > many characters we actually have to work > > > > > > >> >> >> >> > > > > > > with in IFNAMESZ, maybe we start with > > > > > > >> >> >> >> > > > > > > 8 characters (matching your "index" > > > > > > >> >> >> >> > > > > > > namespace) and expand as necessary for > > > > > > >> >> >> disambiguation. > > > > > > >> >> >> >> > > > > > > If we can eliminate overhead in > > > > > > >> >> >> >> > > > > > > IFNAMESZ, let's start with > > > > > > >> 12. > > > > > > >> >> >> >> > > > > > > Thanks, > > > > > > >> >> >> >> > > > > > > > > > > > > >> >> >> >> > > > > > If user is going to choose the alias, why > > > > > > >> >> >> >> > > > > > does it have to be limited to > > > > > > >> >> >> >> sha1? > > > > > > >> >> >> >> > > > > > Or you just told it as an example? > > > > > > >> >> >> >> > > > > > > > > > > > >> >> >> >> > > > > > It can be an alpha-numeric string. 
> > > > > > >> >> >> >> > > > > > > > > > > >> >> >> >> > > > > No, I'm proposing a different solution where > > > > > > >> >> >> >> > > > > mdev-core creates an alias based on an > > > > > > >> >> >> >> > > > > abbreviated sha1. The user does not provide > > > > > > >> >> >> >> > > > > the > > > > > > >> >> >> >> alias. > > > > > > >> >> >> >> > > > > > > > > > > >> >> >> >> > > > > > Instead of mdev imposing number of > > > > > > >> >> >> >> > > > > > characters on the alias, it should be best > > > > > > >> >> >> >> > > > > left to the user. > > > > > > >> >> >> >> > > > > > Because in future if netdev improves on > > > > > > >> >> >> >> > > > > > the naming scheme, mdev will be > > > > > > >> >> >> >> > > > > limiting it, which is not right. > > > > > > >> >> >> >> > > > > > So not restricting alias size seems right to me. > > > > > > >> >> >> >> > > > > > User configuring mdev for networking > > > > > > >> >> >> >> > > > > > devices in a given kernel knows what > > > > > > >> >> >> >> > > > > user is doing. > > > > > > >> >> >> >> > > > > > So user can choose alias name size as it finds > > suitable. > > > > > > >> >> >> >> > > > > > > > > > > >> >> >> >> > > > > That's not what I'm proposing, please read again. > > > > > > >> >> >> >> > > > > Thanks, > > > > > > >> >> >> >> > > > > > > > > > >> >> >> >> > > > I understood your point. But mdev doesn't know > > > > > > >> >> >> >> > > > how user is going to use > > > > > > >> >> >> >> > > udev/systemd to name the netdev. > > > > > > >> >> >> >> > > > So even if mdev chose to pick 12 characters, > > > > > > >> >> >> >> > > > it could result in > > > > > > >> >> collision. > > > > > > >> >> >> >> > > > Hence the proposal to provide the alias by the > > > > > > >> >> >> >> > > > user, as user know the best > > > > > > >> >> >> >> > > policy for its use case in the environment its using. > > > > > > >> >> >> >> > > > So 12 character sha1 method will still work by user. 
> > > > > > >> >> >> >> > > > > > > > > >> >> >> >> > > Haven't you already provided examples where > > > > > > >> >> >> >> > > certain drivers or subsystems have unique netdev > > prefixes? > > > > > > >> >> >> >> > > If mdev provides a unique alias within the > > > > > > >> >> >> >> > > subsystem, couldn't we simply define a netdev > > > > > > >> >> >> >> > > prefix for the mdev subsystem and avoid all > > > > > > >> >> >> >> > > other collisions? I'm not in favor of the user > > > > > > >> >> >> >> > > providing both a uuid and an alias/instance. > > > > > > >> >> >> >> > > Thanks, > > > > > > >> >> >> >> > > > > > > > > >> >> >> >> > For a given prefix, say ens2f0, can two UUID->sha1 > > > > > > >> >> >> >> > first 9 characters have > > > > > > >> >> >> >> collision? > > > > > > >> >> >> >> > > > > > > >> >> >> >> I think it would be a mistake to waste so many chars > > > > > > >> >> >> >> on a prefix, but > > > > > > >> >> >> >> 9 characters of sha1 likely wouldn't have a > > > > > > >> >> >> >> collision before we have 10s of thousands of > > > > > > >> >> >> >> devices. Thanks, > > > > > > >> >> >> >> > > > > > > >> >> >> >> Alex > > > > > > >> >> >> > > > > > > > >> >> >> >Jiri, Dave, > > > > > > >> >> >> >Are you ok with it for devlink/netdev part? > > > > > > >> >> >> >Mdev core will create an alias from a UUID. > > > > > > >> >> >> > > > > > > > >> >> >> >This will be supplied during devlink port attr set > > > > > > >> >> >> >such as, > > > > > > >> >> >> > > > > > > > >> >> >> >devlink_port_attrs_mdev_set(struct devlink_port *port, > > > > > > >> >> >> >const char *mdev_alias); > > > > > > >> >> >> > > > > > > > >> >> >> >This alias is used to generate representor netdev's > > > > phys_port_name. > > > > > > >> >> >> >This alias from the mdev device's sysfs will be used > > > > > > >> >> >> >by the udev/systemd to > > > > > > >> >> >> generate predicable netdev's name. 
> > > > > > >> >> >> >Example: enm<mdev_alias_first_12_chars> > > > > > > >> >> >> > > > > > > >> >> >> What happens in unlikely case of 2 UUIDs collide? > > > > > > >> >> >> > > > > > > >> >> >Since users sees two devices with same phys_port_name, > > > > > > >> >> >user should destroy > > > > > > >> >> recently created mdev and recreate mdev with different UUID? > > > > > > >> >> > > > > > > >> >> Driver should make sure phys port name wont collide, > > > > > > >> >So when mdev creation is initiated, mdev core calculates the > > > > > > >> >alias and if there > > > > > > >> is any other mdev with same alias exist, it returns -EEXIST > > > > > > >> error before progressing further. > > > > > > >> >This way user will get to know upfront in event of collision > > > > > > >> >before the mdev > > > > > > >> device gets created. > > > > > > >> >How about that? > > > > > > >> > > > > > > >> Sounds fine to me. Now the question is how many chars do we > > > > > > >> want to > > > > have. > > > > > > >> > > > > > > >12 characters from Alex's suggestion similar to git? > > > > > > > > > > > > Ok. > > > > > > > > > > > > > > > > Can you please confirm this scheme looks good now? I like to get > > > > > patches > > > > started. > > > > > > > > My only concern is your comment that in the event of an abbreviated > > > > sha1 collision (as exceptionally rare as that might be at 12-chars), > > > > we'd fail the device create, while my original suggestion was that > > > > vfio-core would add an extra character to the alias. For > > > > non-networking devices, the sha1 is unnecessary, so the extension > > > > behavior seems preferred. The user is only responsible to provide a > > > > unique uuid. Perhaps the failure behavior could be applied based on > > > > the mdev device_api. A module option on mdev to specify the default > > > > number of alias chars would also be useful for testing so that we > > > > can set it low enough to validate the collision behavior. 
Thanks, > > > > > > > > > > Idea is to have mdev alias as optional. > > > Each mdev_parent says whether it wants mdev_core to generate an alias > > > or not. So only networking device drivers would set it to true. > > > For rest, alias won't be generated, and won't be compared either > > > during creation time. User continue to provide only uuid. > > > > Ok > > > > > I am tempted to have alias collision detection only within children > > > mdevs of the same parent, but doing so will always mandate to prefix > > > in netdev name. And currently we are left with only 3 characters to > > > prefix it, so that may not be good either. Hence, I think mdev core > > > wide alias is better with 12 characters. > > > > I suppose it depends on the API, if the vendor driver can ask the mdev core for > > an alias as part of the device creation process, then it could manage the netdev > > namespace for all its devices, choosing how many characters to use, and fail > > the creation if it can't meet a uniqueness requirement. IOW, mdev-core would > > always provide a full sha1 and therefore gets itself out of the > > uniqueness/collision aspects. > > > This doesn't work. At mdev core level 20 bytes sha1 are unique, so > mdev core allowed to create a mdev. The mdev vendor driver has the opportunity to fail the device creation in mdev_parent_ops.create(). > And then devlink core chooses > only 6 bytes (12 characters) and there is collision. Things fall > apart. Since mdev provides unique uuid based scheme, it's the mdev > core's ownership to provide unique aliases. You're suggesting/contemplating multiple solutions here, 3-char prefix + 12-char sha1 vs <parent netdev> + ?-char sha1. Also, the 15-char total limit is imposed by an external subsystem, where the vendor driver is the gateway between that subsystem and mdev. How would mdev integrate with another subsystem that maybe only has 9-chars available? 
Would the vendor driver API specify "I need an alias" or would it specify "I need an X-char length alias"? Does it make sense that mdev-core would fail creation of a device if there's a collision in the 12-char address space between different subsystems? For example, does enm0123456789ab really collide with xyz0123456789ab? So if mdev were to provided a 40-char sha1, is it possible that the vendor driver could consume this in its create callback, truncate it to the number of chars required by the vendor driver's subsystem, and determine whether a collision exists? > > > I do not understand how an extra character reduces collision, if > > > that's what you meant. > > > > If the default were for example 3-chars, we might already have > > device 'abc'. A collision would expose one more char of the new > > device, so we might add device with alias 'abcd'. I mentioned > > previously that this leaves an issue for userspace that we can't > > change the alias of device abc, so without additional information, > > userspace can only determine via elimination the mapping of alias > > to device, but userspace has more information available to it in > > the form of sysfs links. > > > Module options are almost not encouraged anymore with other > > > subsystems/drivers. > > > > We don't live in a world of absolutes. I agree that the defaults > > should work in the vast majority of cases. Requiring a user to > > twiddle module options to make things work is undesirable, verging > > on a bug. A module option to enable some specific feature, unsafe > > condition, or test that is outside of the typical use case is > > reasonable, imo. > > > For testing collision rate, a sample user space script and sample > > > mtty is easy and get us collision count too. We shouldn't put > > > that using module option in production kernel. I practically have > > > the code ready to play with; Changing 12 to smaller value is easy > > > with module reload. 
> > >
> > > #define MDEV_ALIAS_LEN 12
> >
> > If it can't be tested with a shipping binary, it probably won't be
> > tested. Thanks,

> It is not the role of mdev core to expose collision
> efficiency/deficiency of the sha1. It can be tested outside before
> mdev choose to use it.

The testing I'm considering is the user and kernel response to a collision.

> I am saying we should test with 12 characters with 10,000 or more
> devices and see how collision occurs. Even if collision occurs, mdev
> returns EEXIST status indicating user to pick a different UUID for
> those rare conditions.

The only way we're going to see collision with a 12-char sha1 is if we burn the CPU cycles to find uuids that collide in that space. 10,000 devices is not remotely enough to generate a collision in that address space. That puts a prerequisite in place that in order to test collision, someone needs to know certain magic inputs. OTOH, if we could use a shorter abbreviation, collisions are trivial to test experimentally. Thanks,

Alex
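For a rough sense of scale on the 10,000-device point above, here is a minimal user-space sketch in Python (not kernel code). It assumes the alias is simply the first N hex characters of the SHA-1 of the UUID string, which is only one reading of the scheme discussed in this thread. The helper names mdev_alias() and collision_probability() are hypothetical, and the probability is the usual birthday-bound approximation rather than an exact figure.

```python
import hashlib
import math


def mdev_alias(dev_uuid: str, length: int = 12) -> str:
    """Hypothetical alias: first `length` hex chars of sha1(uuid string)."""
    return hashlib.sha1(dev_uuid.encode("ascii")).hexdigest()[:length]


def collision_probability(num_devices: int, alias_len: int) -> float:
    """Birthday-bound estimate of at least one collision among the aliases."""
    space = 16 ** alias_len                      # each hex char carries 4 bits
    pairs = num_devices * (num_devices - 1) / 2  # number of device pairs
    return -math.expm1(-pairs / space)           # 1 - exp(-pairs / space)


if __name__ == "__main__":
    for alias_len in (3, 6, 9, 12):
        p = collision_probability(10_000, alias_len)
        print(f"{alias_len:2d} chars: P(collision among 10,000) ~ {p:.3g}")
    print(mdev_alias("83b8f4f2-509f-382f-3c1e-e6bfe0fa1001"))
```

Under these assumptions the 12-character case comes out on the order of 1e-7 for 10,000 devices, while a 3-character alias (4096 possible values) is effectively certain to collide, which is why a shorter abbreviation makes the collision path easy to exercise.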
> -----Original Message----- > From: Alex Williamson <alex.williamson@redhat.com> > Sent: Friday, August 23, 2019 10:47 PM > To: Parav Pandit <parav@mellanox.com> > Cc: Jiri Pirko <jiri@resnulli.us>; Jiri Pirko <jiri@mellanox.com>; David S . Miller > <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>; Cornelia > Huck <cohuck@redhat.com>; kvm@vger.kernel.org; linux- > kernel@vger.kernel.org; cjia <cjia@nvidia.com>; netdev@vger.kernel.org > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > On Fri, 23 Aug 2019 16:14:04 +0000 > Parav Pandit <parav@mellanox.com> wrote: > > > > > Idea is to have mdev alias as optional. > > > > Each mdev_parent says whether it wants mdev_core to generate an > > > > alias or not. So only networking device drivers would set it to true. > > > > For rest, alias won't be generated, and won't be compared either > > > > during creation time. User continue to provide only uuid. > > > > > > Ok > > > > > > > I am tempted to have alias collision detection only within > > > > children mdevs of the same parent, but doing so will always > > > > mandate to prefix in netdev name. And currently we are left with > > > > only 3 characters to prefix it, so that may not be good either. > > > > Hence, I think mdev core wide alias is better with 12 characters. > > > > > > I suppose it depends on the API, if the vendor driver can ask the > > > mdev core for an alias as part of the device creation process, then > > > it could manage the netdev namespace for all its devices, choosing > > > how many characters to use, and fail the creation if it can't meet a > > > uniqueness requirement. IOW, mdev-core would always provide a full > > > sha1 and therefore gets itself out of the uniqueness/collision aspects. > > > > > This doesn't work. At mdev core level 20 bytes sha1 are unique, so > > mdev core allowed to create a mdev. > > The mdev vendor driver has the opportunity to fail the device creation in > mdev_parent_ops.create(). 
>
That is not helpful for below reasons.
1. vendor driver doesn't have visibility in other vendor's alias.
2. Even for single vendor, it needs to maintain global list of devices to see collision.
3. multiple vendors needs to implement same scheme.

Mdev core should be the owner. Shifting ownership from one layer to a lower layer in vendor driver doesn't solve the problem (if there is one, which I think doesn't exist).

> > And then devlink core chooses
> > only 6 bytes (12 characters) and there is collision. Things fall
> > apart. Since mdev provides unique uuid based scheme, it's the mdev
> > core's ownership to provide unique aliases.
>
> You're suggesting/contemplating multiple solutions here, 3-char prefix + 12-char
> sha1 vs <parent netdev> + ?-char sha1. Also, the 15-char total limit is
> imposed by an external subsystem, where the vendor driver is the gateway
> between that subsystem and mdev. How would mdev integrate with another
> subsystem that maybe only has 9-chars available? Would the vendor driver API
> specify "I need an alias" or would it specify "I need an X-char length alias"?

Yes, Vendor driver should say how long the alias it wants.
However before we implement that, I suggest let such vendor/user/driver arrive which needs that.
Such variable length alias can be added at that time and even with that alias collision can be detected by single mdev module.

> Does it make sense that mdev-core would fail creation of a device if there's a
> collision in the 12-char address space between different subsystems? For
> example, does enm0123456789ab really collide with xyz0123456789ab?

I think so, because at mdev level its 12-char alias matters.
Choosing the prefix not adding prefix is really a user space choice.
> So if
> mdev were to provided a 40-char sha1, is it possible that the vendor driver
> could consume this in its create callback, truncate it to the number of chars
> required by the vendor driver's subsystem, and determine whether a collision
> exists?

We shouldn't shift the problem from mdev to multiple vendor drivers to detect collision.

I still think that user providing alias is better because it knows the use-case system in use, and eliminates these collision issue.

>
> > > > I do not understand how an extra character reduces collision, if
> > > > that's what you meant.
> > >
> > > If the default were for example 3-chars, we might already have
> > > device 'abc'. A collision would expose one more char of the new
> > > device, so we might add device with alias 'abcd'. I mentioned
> > > previously that this leaves an issue for userspace that we can't
> > > change the alias of device abc, so without additional information,
> > > userspace can only determine via elimination the mapping of alias to
> > > device, but userspace has more information available to it in the
> > > form of sysfs links.
> > > > Module options are almost not encouraged anymore with other
> > > > subsystems/drivers.
> > >
> > > We don't live in a world of absolutes. I agree that the defaults
> > > should work in the vast majority of cases. Requiring a user to
> > > twiddle module options to make things work is undesirable, verging
> > > on a bug. A module option to enable some specific feature, unsafe
> > > condition, or test that is outside of the typical use case is
> > > reasonable, imo.
> > > > For testing collision rate, a sample user space script and sample
> > > > mtty is easy and get us collision count too. We shouldn't put that
> > > > using module option in production kernel. I practically have the
> > > > code ready to play with; Changing 12 to smaller value is easy with
> > > > module reload.
> > > >
> > > > #define MDEV_ALIAS_LEN 12
> > >
> > > If it can't be tested with a shipping binary, it probably won't be
> > > tested. Thanks,
> > It is not the role of mdev core to expose collision
> > efficiency/deficiency of the sha1. It can be tested outside before
> > mdev choose to use it.
>
> The testing I'm considering is the user and kernel response to a collision.
>
> > I am saying we should test with 12 characters with 10,000 or more
> > devices and see how collision occurs. Even if collision occurs, mdev
> > returns EEXIST status indicating user to pick a different UUID for
> > those rare conditions.
>
> The only way we're going to see collision with a 12-char sha1 is if we burn the
> CPU cycles to find uuids that collide in that space. 10,000 devices is not
> remotely enough to generate a collision in that address space. That puts a
> prerequisite in place that in order to test collision, someone needs to know
> certain magic inputs. OTOH, if we could use a shorter abbreviation, collisions
> are trivial to test experimentally. Thanks,
>

Yes, and therefore a sane user who wants to create more mdevs, wouldn't intentionally stress it to see failures.

> Alex
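The "mdev core owns the alias and returns EEXIST" behavior argued for in the message above can be modeled in user space as a single shared table. This is only a sketch under stated assumptions: AliasRegistry and its create() method are hypothetical names, the alias is assumed to be a truncated SHA-1 of the UUID string, and a Python OSError carrying errno.EEXIST stands in for the kernel returning -EEXIST before the device is created.

```python
import errno
import hashlib


class AliasRegistry:
    """Toy stand-in for one core-wide alias table shared by all parents."""

    def __init__(self, alias_len: int = 12):
        self.alias_len = alias_len
        self._owner_by_alias = {}  # alias -> uuid that owns it

    def create(self, dev_uuid: str) -> str:
        """Register a device; fail up front if its alias is already taken."""
        digest = hashlib.sha1(dev_uuid.encode("ascii")).hexdigest()
        alias = digest[: self.alias_len]
        owner = self._owner_by_alias.get(alias)
        if owner is not None:
            # Mirrors the proposed -EEXIST: nothing is created on collision.
            raise OSError(errno.EEXIST, f"alias {alias} already used by {owner}")
        self._owner_by_alias[alias] = dev_uuid
        return alias


if __name__ == "__main__":
    registry = AliasRegistry()
    first = "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001"
    print("created with alias", registry.create(first))
    try:
        registry.create(first)  # same UUID, therefore the same alias
    except OSError as err:
        print("second create failed:", errno.errorcode[err.errno])
```

Because the table is global, no vendor driver needs visibility into another vendor's aliases, which is the ownership argument made above. The sketch takes no position on whether failing creation or extending the alias is the better policy.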
On Fri, 23 Aug 2019 18:00:30 +0000 Parav Pandit <parav@mellanox.com> wrote: > > -----Original Message----- > > From: Alex Williamson <alex.williamson@redhat.com> > > Sent: Friday, August 23, 2019 10:47 PM > > To: Parav Pandit <parav@mellanox.com> > > Cc: Jiri Pirko <jiri@resnulli.us>; Jiri Pirko <jiri@mellanox.com>; David S . Miller > > <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>; Cornelia > > Huck <cohuck@redhat.com>; kvm@vger.kernel.org; linux- > > kernel@vger.kernel.org; cjia <cjia@nvidia.com>; netdev@vger.kernel.org > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > On Fri, 23 Aug 2019 16:14:04 +0000 > > Parav Pandit <parav@mellanox.com> wrote: > > > > > > > Idea is to have mdev alias as optional. > > > > > Each mdev_parent says whether it wants mdev_core to generate an > > > > > alias or not. So only networking device drivers would set it to true. > > > > > For rest, alias won't be generated, and won't be compared either > > > > > during creation time. User continue to provide only uuid. > > > > > > > > Ok > > > > > > > > > I am tempted to have alias collision detection only within > > > > > children mdevs of the same parent, but doing so will always > > > > > mandate to prefix in netdev name. And currently we are left with > > > > > only 3 characters to prefix it, so that may not be good either. > > > > > Hence, I think mdev core wide alias is better with 12 characters. > > > > > > > > I suppose it depends on the API, if the vendor driver can ask the > > > > mdev core for an alias as part of the device creation process, then > > > > it could manage the netdev namespace for all its devices, choosing > > > > how many characters to use, and fail the creation if it can't meet a > > > > uniqueness requirement. IOW, mdev-core would always provide a full > > > > sha1 and therefore gets itself out of the uniqueness/collision aspects. > > > > > > > This doesn't work. 
At mdev core level 20 bytes sha1 are unique, so > > > mdev core allowed to create a mdev. > > > > The mdev vendor driver has the opportunity to fail the device creation in > > mdev_parent_ops.create(). > > > That is not helpful for below reasons. > 1. vendor driver doesn't have visibility in other vendor's alias. > 2. Even for single vendor, it needs to maintain global list of devices to see collision. > 3. multiple vendors needs to implement same scheme. > > Mdev core should be the owner. Shifting ownership from one layer to a > lower layer in vendor driver doesn't solve the problem (if there is > one, which I think doesn't exist). > > > > And then devlink core chooses > > > only 6 bytes (12 characters) and there is collision. Things fall > > > apart. Since mdev provides unique uuid based scheme, it's the mdev > > > core's ownership to provide unique aliases. > > > > You're suggesting/contemplating multiple solutions here, 3-char > > prefix + 12- char sha1 vs <parent netdev> + ?-char sha1. Also, the > > 15-char total limit is imposed by an external subsystem, where the > > vendor driver is the gateway between that subsystem and mdev. How > > would mdev integrate with another subsystem that maybe only has > > 9-chars available? Would the vendor driver API specify "I need an > > alias" or would it specify "I need an X-char length alias"? > Yes, Vendor driver should say how long the alias it wants. > However before we implement that, I suggest let such > vendor/user/driver arrive which needs that. Such variable length > alias can be added at that time and even with that alias collision > can be detected by single mdev module. If we agree that different alias lengths are possible, then I would request that minimally an mdev sample driver be modified to request an alias with a length that can be adjusted without recompiling in order to exercise the collision path. 
If mdev-core is guaranteeing uniqueness, does this indicate that each alias length constitutes a separate namespace? ie. strictly a strcmp(), not a strncmp() to the shorter alias.

> > Does it make sense that mdev-core would fail creation of a device
> > if there's a collision in the 12-char address space between
> > different subsystems? For example, does enm0123456789ab really
> > collide with xyz0123456789ab?
> I think so, because at mdev level its 12-char alias matters.
> Choosing the prefix not adding prefix is really a user space choice.
>
> > So if
> > mdev were to provided a 40-char sha1, is it possible that the
> > vendor driver could consume this in its create callback, truncate
> > it to the number of chars required by the vendor driver's
> > subsystem, and determine whether a collision exists?
> We shouldn't shift the problem from mdev to multiple vendor drivers
> to detect collision.
>
> I still think that user providing alias is better because it knows
> the use-case system in use, and eliminates these collision issue.

How is a user provided alias immune from collisions? The burden is on the user to provide both a unique uuid and a unique alias. That makes it trivial to create a collision.

> > > > > I do not understand how an extra character reduces collision,
> > > > > if that's what you meant.
> > > >
> > > > If the default were for example 3-chars, we might already have
> > > > device 'abc'. A collision would expose one more char of the new
> > > > device, so we might add device with alias 'abcd'. I mentioned
> > > > previously that this leaves an issue for userspace that we can't
> > > > change the alias of device abc, so without additional
> > > > information, userspace can only determine via elimination the
> > > > mapping of alias to device, but userspace has more information
> > > > available to it in the form of sysfs links.
> > > > > Module options are almost not encouraged anymore with other
> > > > > subsystems/drivers.
> > > > > > > > We don't live in a world of absolutes. I agree that the > > > > defaults should work in the vast majority of cases. Requiring > > > > a user to twiddle module options to make things work is > > > > undesirable, verging on a bug. A module option to enable some > > > > specific feature, unsafe condition, or test that is outside of > > > > the typical use case is reasonable, imo. > > > > > For testing collision rate, a sample user space script and > > > > > sample mtty is easy and get us collision count too. We > > > > > shouldn't put that using module option in production kernel. > > > > > I practically have the code ready to play with; Changing 12 > > > > > to smaller value is easy with module reload. > > > > > > > > > > #define MDEV_ALIAS_LEN 12 > > > > > > > > If it can't be tested with a shipping binary, it probably won't > > > > be tested. Thanks, > > > It is not the role of mdev core to expose collision > > > efficiency/deficiency of the sha1. It can be tested outside before > > > mdev choose to use it. > > > > The testing I'm considering is the user and kernel response to a > > collision. > > > I am saying we should test with 12 characters with 10,000 or more > > > devices and see how collision occurs. Even if collision occurs, > > > mdev returns EEXIST status indicating user to pick a different > > > UUID for those rare conditions. > > > > The only way we're going to see collision with a 12-char sha1 is if > > we burn the CPU cycles to find uuids that collide in that space. > > 10,000 devices is not remotely enough to generate a collision in > > that address space. That puts a prerequisite in place that in > > order to test collision, someone needs to know certain magic > > inputs. OTOH, if we could use a shorter abbreviation, collisions > > are trivial to test experimentally. Thanks, > Yes, and therefore a sane user who wants to create more mdevs, > wouldn't intentionally stress it to see failures. I don't understand this logic. 
I'm simply asking that we have a way to test the collision behavior without changing the binary. The path we're driving towards seems to be making this easier and easier. If the vendor can request an alias of a specific length, then a sample driver with a module option to set the desired alias length to 1-char makes it trivially easy to induce a collision. It doesn't even need to be exposed in a real driver. Besides, when do we ever get to design interfaces that only worry about sane users??? Thanks,

Alex
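The 1-char test case above can be tried out in user space. The sketch below is Python rather than a sample driver, and it assumes the alias is a prefix of the SHA-1 of the UUID string. With a 1-character alias there are only 16 possible values, so 17 devices must collide by the pigeonhole principle. It also models the extension behavior suggested earlier in the thread ('abc' exists, so the new device gets 'abcd'), using exact string comparison in the spirit of strcmp() rather than strncmp(). The names assign_alias() and demo() are hypothetical.

```python
import hashlib
import uuid


def assign_alias(taken: set, dev_uuid: str, min_len: int) -> str:
    """Pick the shortest sha1 prefix (>= min_len chars) not already taken.

    Comparison is exact, so 'abc' and 'abcd' are distinct aliases.
    """
    digest = hashlib.sha1(dev_uuid.encode("ascii")).hexdigest()
    for length in range(min_len, len(digest) + 1):
        candidate = digest[:length]
        if candidate not in taken:
            taken.add(candidate)
            return candidate
    raise ValueError("full sha1 already registered (duplicate uuid?)")


def demo(num_devices: int = 17, min_len: int = 1) -> list:
    """Create more devices than there are 1-char aliases to force a collision."""
    taken = set()
    uuids = [str(uuid.UUID(int=i)) for i in range(num_devices)]
    return [assign_alias(taken, u, min_len) for u in uuids]


if __name__ == "__main__":
    aliases = demo()
    extended = [a for a in aliases if len(a) > 1]
    print(f"{len(aliases)} devices, {len(extended)} needed a longer alias")
```

A fail-on-collision policy could be tested the same way by raising an error instead of extending the prefix. In both cases the point is that a short, adjustable alias length makes the collision path reachable without searching for special UUIDs.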
> -----Original Message----- > From: Alex Williamson <alex.williamson@redhat.com> > Sent: Saturday, August 24, 2019 1:14 AM > To: Parav Pandit <parav@mellanox.com> > Cc: Jiri Pirko <jiri@resnulli.us>; Jiri Pirko <jiri@mellanox.com>; David S . Miller > <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>; Cornelia > Huck <cohuck@redhat.com>; kvm@vger.kernel.org; linux- > kernel@vger.kernel.org; cjia <cjia@nvidia.com>; netdev@vger.kernel.org > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > On Fri, 23 Aug 2019 18:00:30 +0000 > Parav Pandit <parav@mellanox.com> wrote: > > > > -----Original Message----- > > > From: Alex Williamson <alex.williamson@redhat.com> > > > Sent: Friday, August 23, 2019 10:47 PM > > > To: Parav Pandit <parav@mellanox.com> > > > Cc: Jiri Pirko <jiri@resnulli.us>; Jiri Pirko <jiri@mellanox.com>; > > > David S . Miller <davem@davemloft.net>; Kirti Wankhede > > > <kwankhede@nvidia.com>; Cornelia Huck <cohuck@redhat.com>; > > > kvm@vger.kernel.org; linux- kernel@vger.kernel.org; cjia > > > <cjia@nvidia.com>; netdev@vger.kernel.org > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > > > On Fri, 23 Aug 2019 16:14:04 +0000 > > > Parav Pandit <parav@mellanox.com> wrote: > > > > > > > > > Idea is to have mdev alias as optional. > > > > > > Each mdev_parent says whether it wants mdev_core to generate > > > > > > an alias or not. So only networking device drivers would set it to true. > > > > > > For rest, alias won't be generated, and won't be compared > > > > > > either during creation time. User continue to provide only uuid. > > > > > > > > > > Ok > > > > > > > > > > > I am tempted to have alias collision detection only within > > > > > > children mdevs of the same parent, but doing so will always > > > > > > mandate to prefix in netdev name. And currently we are left > > > > > > with only 3 characters to prefix it, so that may not be good either. 
> > > > > > Hence, I think mdev core wide alias is better with 12 characters. > > > > > > > > > > I suppose it depends on the API, if the vendor driver can ask > > > > > the mdev core for an alias as part of the device creation > > > > > process, then it could manage the netdev namespace for all its > > > > > devices, choosing how many characters to use, and fail the > > > > > creation if it can't meet a uniqueness requirement. IOW, > > > > > mdev-core would always provide a full > > > > > sha1 and therefore gets itself out of the uniqueness/collision aspects. > > > > > > > > > This doesn't work. At mdev core level 20 bytes sha1 are unique, so > > > > mdev core allowed to create a mdev. > > > > > > The mdev vendor driver has the opportunity to fail the device > > > creation in mdev_parent_ops.create(). > > > > > That is not helpful for below reasons. > > 1. vendor driver doesn't have visibility in other vendor's alias. > > 2. Even for single vendor, it needs to maintain global list of devices to see > collision. > > 3. multiple vendors needs to implement same scheme. > > > > Mdev core should be the owner. Shifting ownership from one layer to a > > lower layer in vendor driver doesn't solve the problem (if there is > > one, which I think doesn't exist). > > > > > > And then devlink core chooses > > > > only 6 bytes (12 characters) and there is collision. Things fall > > > > apart. Since mdev provides unique uuid based scheme, it's the mdev > > > > core's ownership to provide unique aliases. > > > > > > You're suggesting/contemplating multiple solutions here, 3-char > > > prefix + 12- char sha1 vs <parent netdev> + ?-char sha1. Also, the > > > 15-char total limit is imposed by an external subsystem, where the > > > vendor driver is the gateway between that subsystem and mdev. How > > > would mdev integrate with another subsystem that maybe only has > > > 9-chars available? 
Would the vendor driver API specify "I need an > > > alias" or would it specify "I need an X-char length alias"? > > Yes, Vendor driver should say how long the alias it wants. > > However before we implement that, I suggest let such > > vendor/user/driver arrive which needs that. Such variable length alias > > can be added at that time and even with that alias collision can be > > detected by single mdev module. > > If we agree that different alias lengths are possible, then I would request that > minimally an mdev sample driver be modified to request an alias with a length > that can be adjusted without recompiling in order to exercise the collision path. > Yes. this can be done. But I fail to understand the need to do so. It is not the responsibility of the mdev core to show case sha1 collision efficiency/deficiency. So why do you insist exercise it? > If mdev-core is guaranteeing uniqueness, does this indicate that each alias > length constitutes a separate namespace? ie. strictly a strcmp(), not a > strncmp() to the shorter alias. > Yes. > > > Does it make sense that mdev-core would fail creation of a device if > > > there's a collision in the 12-char address space between different > > > subsystems? For example, does enm0123456789ab really > > > collide with xyz0123456789ab? > > I think so, because at mdev level its 12-char alias matters. > > Choosing the prefix not adding prefix is really a user space choice. > > > > > So if > > > mdev were to provided a 40-char sha1, is it possible that the vendor > > > driver could consume this in its create callback, truncate it to the > > > number of chars required by the vendor driver's subsystem, and > > > determine whether a collision exists? > > We shouldn't shift the problem from mdev to multiple vendor drivers to > > detect collision. > > > > I still think that user providing alias is better because it knows the > > use-case system in use, and eliminates these collision issue. 
> > How is a user provided alias immune from collisions? The burden is on the user > to provide both a unique uuid and a unique alias. That makes it trivial to create > a collision. > Than such collision should have occurred for other subsystem such as netdev while creating vlan, macvlan, ipvlan, vxlan and more devices who are named by the user. But that isn't the case. > > > > > > I do not understand how an extra character reduces collision, > > > > > > if that's what you meant. > > > > > > > > > > If the default were for example 3-chars, we might already have > > > > > device 'abc'. A collision would expose one more char of the new > > > > > device, so we might add device with alias 'abcd'. I mentioned > > > > > previously that this leaves an issue for userspace that we can't > > > > > change the alias of device abc, so without additional > > > > > information, userspace can only determine via elimination the > > > > > mapping of alias to device, but userspace has more information > > > > > available to it in the form of sysfs links. > > > > > > Module options are almost not encouraged anymore with other > > > > > > subsystems/drivers. > > > > > > > > > > We don't live in a world of absolutes. I agree that the > > > > > defaults should work in the vast majority of cases. Requiring a > > > > > user to twiddle module options to make things work is > > > > > undesirable, verging on a bug. A module option to enable some > > > > > specific feature, unsafe condition, or test that is outside of > > > > > the typical use case is reasonable, imo. > > > > > > For testing collision rate, a sample user space script and > > > > > > sample mtty is easy and get us collision count too. We > > > > > > shouldn't put that using module option in production kernel. > > > > > > I practically have the code ready to play with; Changing 12 to > > > > > > smaller value is easy with module reload. 
> > > > > > > > > > > > #define MDEV_ALIAS_LEN 12 > > > > > > > > > > If it can't be tested with a shipping binary, it probably won't > > > > > be tested. Thanks, > > > > It is not the role of mdev core to expose collision > > > > efficiency/deficiency of the sha1. It can be tested outside before > > > > mdev choose to use it. > > > > > > The testing I'm considering is the user and kernel response to a > > > collision. > > > > I am saying we should test with 12 characters with 10,000 or more > > > > devices and see how collision occurs. Even if collision occurs, > > > > mdev returns EEXIST status indicating user to pick a different > > > > UUID for those rare conditions. > > > > > > The only way we're going to see collision with a 12-char sha1 is if > > > we burn the CPU cycles to find uuids that collide in that space. > > > 10,000 devices is not remotely enough to generate a collision in > > > that address space. That puts a prerequisite in place that in order > > > to test collision, someone needs to know certain magic inputs. > > > OTOH, if we could use a shorter abbreviation, collisions are trivial > > > to test experimentally. Thanks, > > Yes, and therefore a sane user who wants to create more mdevs, > > wouldn't intentionally stress it to see failures. > > I don't understand this logic. I'm simply asking that we have a way to test the > collision behavior without changing the binary. The path we're driving towards > seems to be making this easier and easier. If the vendor can request an alias of > a specific length, then a sample driver with a module option to set the desired > alias length to 1-char makes it trivially easy to induce a collision. Sure it is easy to test collision, but my point is - mdev core is not sha1 test module. Hence adding functionality of variable alias length to test collision doesn't make sense. When the actual user arrives who needs small alias, we will be able to add additional pieces very easily. 
> It doesn't
> even need to be exposed in a real driver. Besides, when do we ever get to
> design interfaces that only worry about sane users??? Thanks,
>
I intend to say that a sane user who wants to create mdevs will do fine, because collisions are rare. If there is a collision, EEXIST is returned and a sane user picks a different UUID. If a user intentionally picks UUIDs in a way that triggers a sha1 collision, the intention is likely not to create mdevs for actual use. And if the interface returns an error code, that is still fine.

> Alex
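The exchange above settles that each alias length is its own namespace: aliases are compared with a strict strcmp(), never a strncmp() against the shorter one, and a duplicate is reported as EEXIST. A minimal userspace sketch of that rule follows. The `alias_table` structure and `alias_table_add()` are hypothetical names invented for illustration and are not part of mdev core; the fixed-size table is only a stand-in for whatever list mdev core would actually walk.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define ALIAS_MAX_LEN    40  /* a full sha1 is 40 hex characters */
#define ALIAS_TABLE_SIZE 64  /* arbitrary capacity for this sketch */

/* Hypothetical stand-in for the set of aliases mdev core would track. */
struct alias_table {
	char aliases[ALIAS_TABLE_SIZE][ALIAS_MAX_LEN + 1];
	size_t count;
};

/*
 * Register an alias in the table.
 * Returns 0 on success, -EEXIST if the exact alias is already present,
 * -EINVAL for an empty or over-long alias, -ENOSPC if the table is full.
 */
static int alias_table_add(struct alias_table *t, const char *alias)
{
	size_t len = strlen(alias);
	size_t i;

	if (len == 0 || len > ALIAS_MAX_LEN)
		return -EINVAL;

	for (i = 0; i < t->count; i++) {
		/* Whole-string compare: "abc" and "abcd" do not collide. */
		if (strcmp(t->aliases[i], alias) == 0)
			return -EEXIST;
	}

	if (t->count == ALIAS_TABLE_SIZE)
		return -ENOSPC;

	memcpy(t->aliases[t->count], alias, len + 1);
	t->count++;
	return 0;
}
```

With this rule, registering "abc" and then "abcd" both succeed, while a second "abc" fails with -EEXIST. A strncmp() against the shorter alias would instead reject "abcd".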
Hi Alex,

> -----Original Message-----
> From: linux-kernel-owner@vger.kernel.org <linux-kernel-owner@vger.kernel.org> On Behalf Of Parav Pandit
> Sent: Saturday, August 24, 2019 9:26 AM
> To: Alex Williamson <alex.williamson@redhat.com>
> Cc: Jiri Pirko <jiri@resnulli.us>; Jiri Pirko <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck <cohuck@redhat.com>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>; netdev@vger.kernel.org
> Subject: RE: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> > I don't understand this logic. I'm simply asking that we have a way
> > to test the collision behavior without changing the binary. The path
> > we're driving towards seems to be making this easier and easier. If
> > the vendor can request an alias of a specific length, then a sample
> > driver with a module option to set the desired alias length to 1-char makes
> it trivially easy to induce a collision.
> Sure it is easy to test collision, but my point is - mdev core is not sha1 test
> module.
> Hence adding functionality of variable alias length to test collision doesn't
> make sense.
> When the actual user arrives who needs small alias, we will be able to add
> additional pieces very easily.

My initial thought was to add a bool flag to parent_ops saying whether to generate an alias. However, instead of a bool, I will keep it an unsigned int: zero skips the alias, and a non-zero value is the length of the alias to generate. This serves both purposes with trivial handling.
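The unsigned-int proposal above can be sketched in plain C as follows. This is only an illustration under stated assumptions: `parent_ops_sketch` and `make_alias()` are hypothetical names, the sha1 of the UUID is assumed to be already available as a 40-character hex string, and the real mdev core may derive and store the alias quite differently.

```c
#include <stddef.h>
#include <string.h>

#define SHA1_HEX_LEN 40  /* a sha1 digest printed as hex characters */

/* Hypothetical subset of the parent ops discussed in the thread. */
struct parent_ops_sketch {
	unsigned int alias_len;  /* 0: no alias; otherwise length in hex chars */
};

/*
 * Derive an alias by truncating a hex sha1 digest to ops->alias_len
 * characters. Returns the alias length written to out (0 when the parent
 * asked for no alias), or -1 if the request cannot be satisfied.
 */
static int make_alias(const struct parent_ops_sketch *ops,
		      const char *digest_hex, char *out, size_t out_size)
{
	if (out_size == 0)
		return -1;
	out[0] = '\0';

	/* Zero means this parent does not want an alias at all. */
	if (ops->alias_len == 0)
		return 0;

	if (ops->alias_len > SHA1_HEX_LEN ||
	    ops->alias_len >= out_size ||
	    strlen(digest_hex) < ops->alias_len)
		return -1;

	memcpy(out, digest_hex, ops->alias_len);
	out[ops->alias_len] = '\0';
	return (int)ops->alias_len;
}
```

A networking parent would set `alias_len` to 12 and prepend its own 3-character prefix to stay within the 15-character netdev name limit mentioned in the thread; a sample driver could set it to 1 to exercise the collision path.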
On Sat, 24 Aug 2019 03:56:08 +0000 Parav Pandit <parav@mellanox.com> wrote: > > -----Original Message----- > > From: Alex Williamson <alex.williamson@redhat.com> > > Sent: Saturday, August 24, 2019 1:14 AM > > To: Parav Pandit <parav@mellanox.com> > > Cc: Jiri Pirko <jiri@resnulli.us>; Jiri Pirko <jiri@mellanox.com>; David S . Miller > > <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>; Cornelia > > Huck <cohuck@redhat.com>; kvm@vger.kernel.org; linux- > > kernel@vger.kernel.org; cjia <cjia@nvidia.com>; netdev@vger.kernel.org > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > On Fri, 23 Aug 2019 18:00:30 +0000 > > Parav Pandit <parav@mellanox.com> wrote: > > > > > > -----Original Message----- > > > > From: Alex Williamson <alex.williamson@redhat.com> > > > > Sent: Friday, August 23, 2019 10:47 PM > > > > To: Parav Pandit <parav@mellanox.com> > > > > Cc: Jiri Pirko <jiri@resnulli.us>; Jiri Pirko <jiri@mellanox.com>; > > > > David S . Miller <davem@davemloft.net>; Kirti Wankhede > > > > <kwankhede@nvidia.com>; Cornelia Huck <cohuck@redhat.com>; > > > > kvm@vger.kernel.org; linux- kernel@vger.kernel.org; cjia > > > > <cjia@nvidia.com>; netdev@vger.kernel.org > > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > > > > > On Fri, 23 Aug 2019 16:14:04 +0000 > > > > Parav Pandit <parav@mellanox.com> wrote: > > > > > > > > > > > Idea is to have mdev alias as optional. > > > > > > > Each mdev_parent says whether it wants mdev_core to generate > > > > > > > an alias or not. So only networking device drivers would set it to true. > > > > > > > For rest, alias won't be generated, and won't be compared > > > > > > > either during creation time. User continue to provide only uuid. > > > > > > > > > > > > Ok > > > > > > > > > > > > > I am tempted to have alias collision detection only within > > > > > > > children mdevs of the same parent, but doing so will always > > > > > > > mandate to prefix in netdev name. 
And currently we are left > > > > > > > with only 3 characters to prefix it, so that may not be good either. > > > > > > > Hence, I think mdev core wide alias is better with 12 characters. > > > > > > > > > > > > I suppose it depends on the API, if the vendor driver can ask > > > > > > the mdev core for an alias as part of the device creation > > > > > > process, then it could manage the netdev namespace for all its > > > > > > devices, choosing how many characters to use, and fail the > > > > > > creation if it can't meet a uniqueness requirement. IOW, > > > > > > mdev-core would always provide a full > > > > > > sha1 and therefore gets itself out of the uniqueness/collision aspects. > > > > > > > > > > > This doesn't work. At mdev core level 20 bytes sha1 are unique, so > > > > > mdev core allowed to create a mdev. > > > > > > > > The mdev vendor driver has the opportunity to fail the device > > > > creation in mdev_parent_ops.create(). > > > > > > > That is not helpful for below reasons. > > > 1. vendor driver doesn't have visibility in other vendor's alias. > > > 2. Even for single vendor, it needs to maintain global list of devices to see > > collision. > > > 3. multiple vendors needs to implement same scheme. > > > > > > Mdev core should be the owner. Shifting ownership from one layer to a > > > lower layer in vendor driver doesn't solve the problem (if there is > > > one, which I think doesn't exist). > > > > > > > > And then devlink core chooses > > > > > only 6 bytes (12 characters) and there is collision. Things fall > > > > > apart. Since mdev provides unique uuid based scheme, it's the mdev > > > > > core's ownership to provide unique aliases. > > > > > > > > You're suggesting/contemplating multiple solutions here, 3-char > > > > prefix + 12- char sha1 vs <parent netdev> + ?-char sha1. Also, the > > > > 15-char total limit is imposed by an external subsystem, where the > > > > vendor driver is the gateway between that subsystem and mdev. 
How
> > > > would mdev integrate with another subsystem that maybe only has
> > > > 9-chars available? Would the vendor driver API specify "I need an
> > > > alias" or would it specify "I need an X-char length alias"?
> > > Yes, Vendor driver should say how long the alias it wants.
> > > However before we implement that, I suggest let such
> > > vendor/user/driver arrive which needs that. Such variable length alias
> > > can be added at that time and even with that alias collision can be
> > > detected by single mdev module.
> >
> > If we agree that different alias lengths are possible, then I would request that
> > minimally an mdev sample driver be modified to request an alias with a length
> > that can be adjusted without recompiling in order to exercise the collision path.
> >
> Yes. this can be done. But I fail to understand the need to do so.
> It is not the responsibility of the mdev core to show case sha1
> collision efficiency/deficiency. So why do you insist exercise it?

I don't understand what you're trying to imply with "show case sha1 collision efficiency/deficiency". Are you suggesting that I'm asking for this feature to experimentally test the probability of collisions at different character lengths? We can use shell scripts for that. I'm simply observing that collisions are possible based on user input, but they're not practical to test for at the character lengths we're using. Therefore, how do I tell QA to develop tests to make sure the kernel and userspace tools that might be involved behave correctly when this rare event occurs? As I mentioned previously, we can burn the CPU cycles to find some uuids which will collide with our aliases, but the more accessible approach seems to be to have a tunable to reduce the alias address space such that we can simply throw enough random uuids into the test to guarantee a collision.
Simply generating 10,000 devices with a 12-character alias, as you suggested previously, has effectively a 0% probability of generating a collision. If we accept that different vendor drivers might have different alias requirements, and therefore the vendor driver should have the ability to specify an alias length, then this all fits very nicely into modifying a sample driver to request a sufficiently short alias such that we can use it to test the behavior of mdev-core and surrounding code when an alias collision occurs. > > If mdev-core is guaranteeing uniqueness, does this indicate that > > each alias length constitutes a separate namespace? ie. strictly a > > strcmp(), not a strncmp() to the shorter alias. > > > Yes. > > > > > > Does it make sense that mdev-core would fail creation of a > > > > device if there's a collision in the 12-char address space > > > > between different subsystems? For example, does > > > > enm0123456789ab really collide with xyz0123456789ab? > > > I think so, because at mdev level its 12-char alias matters. > > > Choosing the prefix not adding prefix is really a user space > > > choice. > > > > So if > > > > mdev were to provided a 40-char sha1, is it possible that the > > > > vendor driver could consume this in its create callback, > > > > truncate it to the number of chars required by the vendor > > > > driver's subsystem, and determine whether a collision exists? > > > We shouldn't shift the problem from mdev to multiple vendor > > > drivers to detect collision. > > > > > > I still think that user providing alias is better because it > > > knows the use-case system in use, and eliminates these collision > > > issue. > > > > How is a user provided alias immune from collisions? The burden is > > on the user to provide both a unique uuid and a unique alias. That > > makes it trivial to create a collision. 
> > > Than such collision should have occurred for other subsystem such as > netdev while creating vlan, macvlan, ipvlan, vxlan and more devices > who are named by the user. But that isn't the case. > > > > > > > > I do not understand how an extra character reduces > > > > > > > collision, if that's what you meant. > > > > > > > > > > > > If the default were for example 3-chars, we might already > > > > > > have device 'abc'. A collision would expose one more char > > > > > > of the new device, so we might add device with alias > > > > > > 'abcd'. I mentioned previously that this leaves an issue > > > > > > for userspace that we can't change the alias of device abc, > > > > > > so without additional information, userspace can only > > > > > > determine via elimination the mapping of alias to device, > > > > > > but userspace has more information available to it in the > > > > > > form of sysfs links. > > > > > > > Module options are almost not encouraged anymore with > > > > > > > other subsystems/drivers. > > > > > > > > > > > > We don't live in a world of absolutes. I agree that the > > > > > > defaults should work in the vast majority of cases. > > > > > > Requiring a user to twiddle module options to make things > > > > > > work is undesirable, verging on a bug. A module option to > > > > > > enable some specific feature, unsafe condition, or test > > > > > > that is outside of the typical use case is reasonable, > > > > > > imo. > > > > > > > For testing collision rate, a sample user space script and > > > > > > > sample mtty is easy and get us collision count too. We > > > > > > > shouldn't put that using module option in production > > > > > > > kernel. I practically have the code ready to play with; > > > > > > > Changing 12 to smaller value is easy with module reload. > > > > > > > > > > > > > > #define MDEV_ALIAS_LEN 12 > > > > > > > > > > > > If it can't be tested with a shipping binary, it probably > > > > > > won't be tested. 
Thanks, > > > > > It is not the role of mdev core to expose collision > > > > > efficiency/deficiency of the sha1. It can be tested outside > > > > > before mdev choose to use it. > > > > > > > > The testing I'm considering is the user and kernel response to a > > > > collision. > > > > > I am saying we should test with 12 characters with 10,000 or > > > > > more devices and see how collision occurs. Even if collision > > > > > occurs, mdev returns EEXIST status indicating user to pick a > > > > > different UUID for those rare conditions. > > > > > > > > The only way we're going to see collision with a 12-char sha1 > > > > is if we burn the CPU cycles to find uuids that collide in that > > > > space. 10,000 devices is not remotely enough to generate a > > > > collision in that address space. That puts a prerequisite in > > > > place that in order to test collision, someone needs to know > > > > certain magic inputs. OTOH, if we could use a shorter > > > > abbreviation, collisions are trivial to test experimentally. > > > > Thanks, > > > Yes, and therefore a sane user who wants to create more mdevs, > > > wouldn't intentionally stress it to see failures. > > > > I don't understand this logic. I'm simply asking that we have a > > way to test the collision behavior without changing the binary. > > The path we're driving towards seems to be making this easier and > > easier. If the vendor can request an alias of a specific length, > > then a sample driver with a module option to set the desired alias > > length to 1-char makes it trivially easy to induce a collision. > Sure it is easy to test collision, but my point is - mdev core is not > sha1 test module. Hence adding functionality of variable alias length > to test collision doesn't make sense. When the actual user arrives > who needs small alias, we will be able to add additional pieces very > easily. > > > It doesn't > > even need to be exposed in a real driver. 
Besides, when do we ever
> > get to design interfaces that only worry about sane users???
> > Thanks,
> I intent to say that a sane user who wants to create mdev's will just
> work fine with less collision. If there is collision EEXIST is
> returns and sane user picks different UUID. If user is intentionally
> picking UUIDs in such a way that triggers sha1 collision, his
> intention is likely to not create mdevs for actual use. And if
> interface returns error code it is still fine.

These are exactly the scenarios for which I'm asking "how do we test that it works as we expect". I can test that passing identical uuids into the mdev create interface only allows the first to succeed. With a 12-char sha1 alias, it's not practical to construct a test to validate the alias collision behavior. Do you suggest we rely only on code inspection instead?

Thanks,
Alex
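The claim that 10,000 devices cannot realistically collide in a 12-character alias space, while a short alias makes collisions trivial, can be checked with a birthday-problem estimate. The sketch below assumes that a truncated sha1 behaves like a uniformly random value, which is an idealisation; `alias_collision_prob()` is a hypothetical helper written for this illustration, not kernel code.

```c
#include <stdint.h>

/*
 * Probability that at least two of n uniformly random aliases collide,
 * where each alias is hex_chars hex digits (16^hex_chars possible values).
 * Returns -1.0 when hex_chars is outside the range this sketch supports.
 */
static double alias_collision_prob(uint64_t n, unsigned int hex_chars)
{
	uint64_t space;
	double p_unique = 1.0;
	uint64_t i;

	if (hex_chars == 0 || hex_chars > 15)
		return -1.0;

	space = (uint64_t)1 << (4 * hex_chars);

	/* Pigeonhole: more aliases than values forces a collision. */
	if (n > space)
		return 1.0;

	/* P(all unique) = product over i = 1..n-1 of (1 - i / space). */
	for (i = 1; i < n; i++)
		p_unique *= 1.0 - (double)i / (double)space;

	return 1.0 - p_unique;
}
```

Under that assumption, `alias_collision_prob(10000, 12)` is roughly 1.8e-7, whereas with a 1-character alias five random devices already collide with probability just over one half, and 17 devices collide with certainty. That gap is why a tunable alias length makes the collision path testable with random uuids.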
> -----Original Message----- > From: Alex Williamson <alex.williamson@redhat.com> > Sent: Saturday, August 24, 2019 10:29 AM > To: Parav Pandit <parav@mellanox.com> > Cc: Jiri Pirko <jiri@resnulli.us>; Jiri Pirko <jiri@mellanox.com>; David S . > Miller <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>; > Cornelia Huck <cohuck@redhat.com>; kvm@vger.kernel.org; linux- > kernel@vger.kernel.org; cjia <cjia@nvidia.com>; netdev@vger.kernel.org > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > On Sat, 24 Aug 2019 03:56:08 +0000 > Parav Pandit <parav@mellanox.com> wrote: > > > > -----Original Message----- > > > From: Alex Williamson <alex.williamson@redhat.com> > > > Sent: Saturday, August 24, 2019 1:14 AM > > > To: Parav Pandit <parav@mellanox.com> > > > Cc: Jiri Pirko <jiri@resnulli.us>; Jiri Pirko <jiri@mellanox.com>; > > > David S . Miller <davem@davemloft.net>; Kirti Wankhede > > > <kwankhede@nvidia.com>; Cornelia Huck <cohuck@redhat.com>; > > > kvm@vger.kernel.org; linux- kernel@vger.kernel.org; cjia > > > <cjia@nvidia.com>; netdev@vger.kernel.org > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > > > On Fri, 23 Aug 2019 18:00:30 +0000 > > > Parav Pandit <parav@mellanox.com> wrote: > > > > > > > > -----Original Message----- > > > > > From: Alex Williamson <alex.williamson@redhat.com> > > > > > Sent: Friday, August 23, 2019 10:47 PM > > > > > To: Parav Pandit <parav@mellanox.com> > > > > > Cc: Jiri Pirko <jiri@resnulli.us>; Jiri Pirko > > > > > <jiri@mellanox.com>; David S . 
Miller <davem@davemloft.net>; > > > > > Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck > > > > > <cohuck@redhat.com>; kvm@vger.kernel.org; linux- > > > > > kernel@vger.kernel.org; cjia <cjia@nvidia.com>; > > > > > netdev@vger.kernel.org > > > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core > > > > > > > > > > On Fri, 23 Aug 2019 16:14:04 +0000 Parav Pandit > > > > > <parav@mellanox.com> wrote: > > > > > > > > > > > > > Idea is to have mdev alias as optional. > > > > > > > > Each mdev_parent says whether it wants mdev_core to > > > > > > > > generate an alias or not. So only networking device drivers > would set it to true. > > > > > > > > For rest, alias won't be generated, and won't be compared > > > > > > > > either during creation time. User continue to provide only uuid. > > > > > > > > > > > > > > Ok > > > > > > > > > > > > > > > I am tempted to have alias collision detection only within > > > > > > > > children mdevs of the same parent, but doing so will > > > > > > > > always mandate to prefix in netdev name. And currently we > > > > > > > > are left with only 3 characters to prefix it, so that may not be > good either. > > > > > > > > Hence, I think mdev core wide alias is better with 12 characters. > > > > > > > > > > > > > > I suppose it depends on the API, if the vendor driver can > > > > > > > ask the mdev core for an alias as part of the device > > > > > > > creation process, then it could manage the netdev namespace > > > > > > > for all its devices, choosing how many characters to use, > > > > > > > and fail the creation if it can't meet a uniqueness > > > > > > > requirement. IOW, mdev-core would always provide a full > > > > > > > sha1 and therefore gets itself out of the uniqueness/collision > aspects. > > > > > > > > > > > > > This doesn't work. At mdev core level 20 bytes sha1 are > > > > > > unique, so mdev core allowed to create a mdev. 
> > > > > > > > > > The mdev vendor driver has the opportunity to fail the device > > > > > creation in mdev_parent_ops.create(). > > > > > > > > > That is not helpful for below reasons. > > > > 1. vendor driver doesn't have visibility in other vendor's alias. > > > > 2. Even for single vendor, it needs to maintain global list of > > > > devices to see > > > collision. > > > > 3. multiple vendors needs to implement same scheme. > > > > > > > > Mdev core should be the owner. Shifting ownership from one layer > > > > to a lower layer in vendor driver doesn't solve the problem (if > > > > there is one, which I think doesn't exist). > > > > > > > > > > And then devlink core chooses > > > > > > only 6 bytes (12 characters) and there is collision. Things > > > > > > fall apart. Since mdev provides unique uuid based scheme, it's > > > > > > the mdev core's ownership to provide unique aliases. > > > > > > > > > > You're suggesting/contemplating multiple solutions here, 3-char > > > > > prefix + 12- char sha1 vs <parent netdev> + ?-char sha1. Also, > > > > > the 15-char total limit is imposed by an external subsystem, > > > > > where the vendor driver is the gateway between that subsystem > > > > > and mdev. How would mdev integrate with another subsystem that > > > > > maybe only has 9-chars available? Would the vendor driver API > > > > > specify "I need an alias" or would it specify "I need an X-char length > alias"? > > > > Yes, Vendor driver should say how long the alias it wants. > > > > However before we implement that, I suggest let such > > > > vendor/user/driver arrive which needs that. Such variable length > > > > alias can be added at that time and even with that alias collision > > > > can be detected by single mdev module. 
> > >
> > > If we agree that different alias lengths are possible, then I would
> > > request that minimally an mdev sample driver be modified to request
> > > an alias with a length that can be adjusted without recompiling in order
> to exercise the collision path.
> > >
> > Yes. this can be done. But I fail to understand the need to do so.
> > It is not the responsibility of the mdev core to show case sha1
> > collision efficiency/deficiency. So why do you insist exercise it?
>
> I don't understand what you're trying to imply with "show case sha1 collision
> efficiency/deficiency". Are you suggesting that I'm asking for this feature to
> experimentally test the probability of collisions at different character
> lengths? We can use shell scripts for that.
> I'm simply observing that collisions are possible based on user input, but
> they're not practical to test for at the character lengths we're using.
> Therefore, how do I tell QA to develop a tests to make sure the kernel and
> userspace tools that might be involved behave correctly when this rare event
> occurs?
>
OK, so you want code coverage and a knob to get it. That is fine. I will expose mdev_parent->ops.alias_len as the API instead of a bool, and extend the mtty module with a parameter that sets the alias length.

Unfortunately, similar code coverage doesn't exist for APIs like mdev_get/set_iommu_device() in a sample or real vendor driver, and QA is not able to test this functionality without tainting the kernel.