Message ID: 1644340446-125084-1-git-send-email-moshe@nvidia.com
Series: net/mlx5: Introduce devlink param to disable SF aux dev probe
On Tue, 8 Feb 2022 19:14:02 +0200 Moshe Shemesh wrote:
> $ devlink dev param set pci/0000:08:00.0 name enable_sfs_aux_devs \
>   value false cmode runtime
>
> Create SF:
> $ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11
> $ devlink port function set pci/0000:08:00.0/32768 \
>   hw_addr 00:00:00:00:00:11 state active
>
> Now depending on the use case, the user can enable specific auxiliary
> device(s). For example:
>
> $ devlink dev param set auxiliary/mlx5_core.sf.1 \
>   name enable_vnet value true cmode driverinit
>
> Afterwards, the user needs to reload the SF in order for the SF to come
> up with the specific configuration:
>
> $ devlink dev reload auxiliary/mlx5_core.sf.1

If the user just wants vnet, why not add an API which tells the driver
which functionality the user wants when the "port" is "spawned"?
On 2/9/2022 7:23 AM, Jakub Kicinski wrote:
> On Tue, 8 Feb 2022 19:14:02 +0200 Moshe Shemesh wrote:
>> $ devlink dev param set pci/0000:08:00.0 name enable_sfs_aux_devs \
>>   value false cmode runtime
>>
>> Create SF:
>> $ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11
>> $ devlink port function set pci/0000:08:00.0/32768 \
>>   hw_addr 00:00:00:00:00:11 state active
>>
>> Now depending on the use case, the user can enable specific auxiliary
>> device(s). For example:
>>
>> $ devlink dev param set auxiliary/mlx5_core.sf.1 \
>>   name enable_vnet value true cmode driverinit
>>
>> Afterwards, the user needs to reload the SF in order for the SF to come
>> up with the specific configuration:
>>
>> $ devlink dev reload auxiliary/mlx5_core.sf.1
>
> If the user just wants vnet, why not add an API which tells the driver
> which functionality the user wants when the "port" is "spawned"?

Well, we don't have the SFs at that stage; how can we tell which SF will
use vnet and which SF will use eth?
Wed, Feb 09, 2022 at 06:23:41AM CET, kuba@kernel.org wrote:
>On Tue, 8 Feb 2022 19:14:02 +0200 Moshe Shemesh wrote:
>> $ devlink dev param set pci/0000:08:00.0 name enable_sfs_aux_devs \
>>   value false cmode runtime
>>
>> Create SF:
>> $ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11
>> $ devlink port function set pci/0000:08:00.0/32768 \
>>   hw_addr 00:00:00:00:00:11 state active
>>
>> Now depending on the use case, the user can enable specific auxiliary
>> device(s). For example:
>>
>> $ devlink dev param set auxiliary/mlx5_core.sf.1 \
>>   name enable_vnet value true cmode driverinit
>>
>> Afterwards, the user needs to reload the SF in order for the SF to come
>> up with the specific configuration:
>>
>> $ devlink dev reload auxiliary/mlx5_core.sf.1
>
>If the user just wants vnet, why not add an API which tells the driver
>which functionality the user wants when the "port" is "spawned"?

It's a different user. One works with the eswitch and creates the port
function. The other one takes the created instance and works with it.
Note that it may be on a different host.
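[Editor's note: to make the two roles Jiri describes concrete, a minimal sketch
of the split, grouping the commands already shown in the cover letter by side.
The SF instance name auxiliary/mlx5_core.sf.1 and the comments are illustrative;
the two groups may run on the same host or on different hosts.]

# eswitch-side admin (owns pci/0000:08:00.0): creates and activates the port function
$ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11
$ devlink port function set pci/0000:08:00.0/32768 \
    hw_addr 00:00:00:00:00:11 state active

# SF-side user (owns the spawned instance, possibly on another host):
# selects the wanted functionality and reloads to apply the driverinit parameter
$ devlink dev param set auxiliary/mlx5_core.sf.1 \
    name enable_vnet value true cmode driverinit
$ devlink dev reload auxiliary/mlx5_core.sf.1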
On Wed, 9 Feb 2022 09:39:54 +0200 Moshe Shemesh wrote:
> Well, we don't have the SFs at that stage; how can we tell which SF will
> use vnet and which SF will use eth?

On Wed, 9 Feb 2022 10:57:21 +0100 Jiri Pirko wrote:
> It's a different user. One works with the eswitch and creates the port
> function. The other one takes the created instance and works with it.
> Note that it may be on a different host.

It is a little confusing, so I may well be misunderstanding, but the
cover letter says:

$ devlink dev param set pci/0000:08:00.0 name enable_sfs_aux_devs \
  value false cmode runtime

$ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11

So both of these run on the same side, no?

What I meant is make the former part of the latter:

$ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11 noprobe

Maybe worth clarifying - pci/0000:08:00.0 is the eswitch side and
auxiliary/mlx5_core.sf.1 is the... "customer" side, correct?
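[Editor's note: for comparison, the two creation flows under discussion. The
"noprobe" keyword is only Jakub's proposed syntax at this point, not an
existing devlink port option.]

# flow in the posted series: a runtime policy on the eswitch devlink
# instance, set once before creating SFs
$ devlink dev param set pci/0000:08:00.0 name enable_sfs_aux_devs \
    value false cmode runtime
$ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11

# proposed alternative: a per-port attribute supplied at creation time
# (hypothetical syntax)
$ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11 noprobe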
Thu, Feb 10, 2022 at 02:25:25AM CET, kuba@kernel.org wrote:
>On Wed, 9 Feb 2022 09:39:54 +0200 Moshe Shemesh wrote:
>> Well, we don't have the SFs at that stage; how can we tell which SF will
>> use vnet and which SF will use eth?
>
>On Wed, 9 Feb 2022 10:57:21 +0100 Jiri Pirko wrote:
>> It's a different user. One works with the eswitch and creates the port
>> function. The other one takes the created instance and works with it.
>> Note that it may be on a different host.
>
>It is a little confusing, so I may well be misunderstanding, but the
>cover letter says:
>
>$ devlink dev param set pci/0000:08:00.0 name enable_sfs_aux_devs \
>  value false cmode runtime
>
>$ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11
>
>So both of these run on the same side, no?
>
>What I meant is make the former part of the latter:
>
>$ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11 noprobe

I see. So it would not be a "global policy" but a per-instance option during
creation. That makes sense. I wonder if the HW is capable of such a flow,
Moshe, Saeed?

>
>Maybe worth clarifying - pci/0000:08:00.0 is the eswitch side and
>auxiliary/mlx5_core.sf.1 is the... "customer" side, correct?

Yep.
On 2/10/2022 9:02 AM, Jiri Pirko wrote:
> Thu, Feb 10, 2022 at 02:25:25AM CET, kuba@kernel.org wrote:
>> On Wed, 9 Feb 2022 09:39:54 +0200 Moshe Shemesh wrote:
>>> Well, we don't have the SFs at that stage; how can we tell which SF will
>>> use vnet and which SF will use eth?
>> On Wed, 9 Feb 2022 10:57:21 +0100 Jiri Pirko wrote:
>>> It's a different user. One works with the eswitch and creates the port
>>> function. The other one takes the created instance and works with it.
>>> Note that it may be on a different host.
>> It is a little confusing, so I may well be misunderstanding, but the
>> cover letter says:
>>
>> $ devlink dev param set pci/0000:08:00.0 name enable_sfs_aux_devs \
>>   value false cmode runtime
>>
>> $ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11
>>
>> So both of these run on the same side, no?

Yes.

>> What I meant is make the former part of the latter:
>>
>> $ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11 noprobe
> I see. So it would not be a "global policy" but a per-instance option during
> creation. That makes sense. I wonder if the HW is capable of such a flow,
> Moshe, Saeed?

LGTM. Thanks.

>
>> Maybe worth clarifying - pci/0000:08:00.0 is the eswitch side and
>> auxiliary/mlx5_core.sf.1 is the... "customer" side, correct?
> Yep.
> From: Moshe Shemesh <moshe@nvidia.com>
> Sent: Thursday, February 10, 2022 3:58 PM
>
> On 2/10/2022 9:02 AM, Jiri Pirko wrote:
> > Thu, Feb 10, 2022 at 02:25:25AM CET, kuba@kernel.org wrote:
> >> On Wed, 9 Feb 2022 09:39:54 +0200 Moshe Shemesh wrote:
> >>> Well, we don't have the SFs at that stage; how can we tell which SF
> >>> will use vnet and which SF will use eth?
> >> On Wed, 9 Feb 2022 10:57:21 +0100 Jiri Pirko wrote:
> >>> It's a different user. One works with the eswitch and creates the
> >>> port function. The other one takes the created instance and works with it.
> >>> Note that it may be on a different host.
> >> It is a little confusing, so I may well be misunderstanding, but the
> >> cover letter says:
> >>
> >> $ devlink dev param set pci/0000:08:00.0 name enable_sfs_aux_devs \
> >>   value false cmode runtime
> >>
> >> $ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11
> >>
> >> So both of these run on the same side, no?
> Yes.

In this cover letter example it is on the same side. But as Jiri explained,
both can be on different hosts.

> >> What I meant is make the former part of the latter:
> >>
> >> $ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11
> >> noprobe
> > I see. So it would not be a "global policy" but a per-instance option
> > during creation. That makes sense. I wonder if the HW is capable of
> > such a flow, Moshe, Saeed?

At present the device isn't capable of propagating this hint.
Moreover, the probe option is for the auxiliary devices of the SF (net, vdpa,
rdma). We still need to probe the SF's main auxiliary device so that a devlink
instance of the SF is present to control the SF parameters [1] used to compose
it.

The one big advantage I see in Jakub's per-SF suggestion is the ability to
compose most properties of an SF in one place, on the eswitch side.

However, even with a per-SF approach on the eswitch side, the hurdle was
assigning the cpu affinity of the SF, which is something preferable to do on
the host where the actual workload is running. So per-SF cpu affinity
assignment on the host side requires a devlink reload anyway. With that
consideration it is better to control the rest of the parameters [1] on the
customer side (auxiliary/mlx5_core.sf.1) as well.

[1] https://www.kernel.org/doc/html/latest/networking/devlink/devlink-params.html

> LGTM. Thanks.
>
> >
> >>
> >> Maybe worth clarifying - pci/0000:08:00.0 is the eswitch side and
> >> auxiliary/mlx5_core.sf.1 is the... "customer" side, correct?
> > Yep.

It is important to describe both use cases in the cover letter, where the
customer side and the eswitch side can be on the same or a different host,
with examples.

Moshe,
Can you please revise the cover letter?
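[Editor's note: a rough sketch of the customer-side composition Parav
describes, using the generic enable_eth/enable_rdma/enable_vnet driverinit
parameters documented in [1]. Which parameters and auxiliary devices are
available depends on the device and driver support.]

# on the host that owns the SF: choose which auxiliary devices to expose
$ devlink dev param set auxiliary/mlx5_core.sf.1 \
    name enable_eth value false cmode driverinit
$ devlink dev param set auxiliary/mlx5_core.sf.1 \
    name enable_rdma value false cmode driverinit
$ devlink dev param set auxiliary/mlx5_core.sf.1 \
    name enable_vnet value true cmode driverinit

# apply the driverinit parameters; only the selected auxiliary device(s)
# are probed after the reload
$ devlink dev reload auxiliary/mlx5_core.sf.1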
On 2/10/2022 9:09 PM, Parav Pandit wrote:
>> From: Moshe Shemesh <moshe@nvidia.com>
>> Sent: Thursday, February 10, 2022 3:58 PM
>>
>> On 2/10/2022 9:02 AM, Jiri Pirko wrote:
>>> Thu, Feb 10, 2022 at 02:25:25AM CET, kuba@kernel.org wrote:
>>>> On Wed, 9 Feb 2022 09:39:54 +0200 Moshe Shemesh wrote:
>>>>> Well, we don't have the SFs at that stage; how can we tell which SF
>>>>> will use vnet and which SF will use eth?
>>>> On Wed, 9 Feb 2022 10:57:21 +0100 Jiri Pirko wrote:
>>>>> It's a different user. One works with the eswitch and creates the
>>>>> port function. The other one takes the created instance and works with it.
>>>>> Note that it may be on a different host.
>>>> It is a little confusing, so I may well be misunderstanding, but the
>>>> cover letter says:
>>>>
>>>> $ devlink dev param set pci/0000:08:00.0 name enable_sfs_aux_devs \
>>>>   value false cmode runtime
>>>>
>>>> $ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11
>>>>
>>>> So both of these run on the same side, no?
>> Yes.
> In this cover letter example it is on the same side. But as Jiri explained,
> both can be on different hosts.
>
>>>> What I meant is make the former part of the latter:
>>>>
>>>> $ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11
>>>> noprobe
>>> I see. So it would not be a "global policy" but a per-instance option
>>> during creation. That makes sense. I wonder if the HW is capable of
>>> such a flow, Moshe, Saeed?
> At present the device isn't capable of propagating this hint.
> Moreover, the probe option is for the auxiliary devices of the SF (net, vdpa,
> rdma). We still need to probe the SF's main auxiliary device so that a devlink
> instance of the SF is present to control the SF parameters [1] used to compose
> it.
>
> The one big advantage I see in Jakub's per-SF suggestion is the ability to
> compose most properties of an SF in one place, on the eswitch side.
>
> However, even with a per-SF approach on the eswitch side, the hurdle was
> assigning the cpu affinity of the SF, which is something preferable to do on
> the host where the actual workload is running. So per-SF cpu affinity
> assignment on the host side requires a devlink reload anyway. With that
> consideration it is better to control the rest of the parameters [1] on the
> customer side (auxiliary/mlx5_core.sf.1) as well.
>
> [1] https://www.kernel.org/doc/html/latest/networking/devlink/devlink-params.html
>
>> LGTM. Thanks.
>>
>>>> Maybe worth clarifying - pci/0000:08:00.0 is the eswitch side and
>>>> auxiliary/mlx5_core.sf.1 is the... "customer" side, correct?
>>> Yep.
> It is important to describe both use cases in the cover letter, where the
> customer side and the eswitch side can be on the same or a different host,
> with examples.
>
> Moshe,
> Can you please revise the cover letter?

Yes, I will send a revised version.