diff mbox series

[net-next,v4] net: psample: Introduce stubs to remove NIC driver dependency

Message ID 20210128014543.521151-1-cmi@nvidia.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series [net-next,v4] net: psample: Introduce stubs to remove NIC driver dependency | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net-next
netdev/subject_prefix success Link
netdev/cc_maintainers warning 5 maintainers not CCed: jhs@mojatatu.com yotam.gi@gmail.com jiri@resnulli.us davem@davemloft.net xiyou.wangcong@gmail.com
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 8 this patch: 8
netdev/kdoc success Errors and warnings before: 1 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch warning WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
netdev/build_allmodconfig_warn success Errors and warnings before: 8 this patch: 8
netdev/header_inline success Link
netdev/stable success Stable not CCed

Commit Message

Chris Mi Jan. 28, 2021, 1:45 a.m. UTC
In order to send sampled packets to userspace, NIC driver calls
psample api directly. But it creates a hard dependency on module
psample. Introduce psample_ops to remove the hard dependency.
It is initialized when psample module is loaded and set to NULL
when the module is unloaded.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
---
v1->v2:
 - fix sparse errors
v2->v3:
 - remove inline
v3->v4:
 - add inline back

 include/net/psample.h    | 27 +++++++++++++++++++++++++++
 net/psample/psample.c    | 13 ++++++++++++-
 net/sched/Makefile       |  2 +-
 net/sched/psample_stub.c |  7 +++++++
 4 files changed, 47 insertions(+), 2 deletions(-)
 create mode 100644 net/sched/psample_stub.c

Comments

Cong Wang Jan. 29, 2021, 5:14 a.m. UTC | #1
On Wed, Jan 27, 2021 at 5:48 PM Chris Mi <cmi@nvidia.com> wrote:
>
> In order to send sampled packets to userspace, NIC driver calls
> psample api directly. But it creates a hard dependency on module
> psample. Introduce psample_ops to remove the hard dependency.
> It is initialized when psample module is loaded and set to NULL
> when the module is unloaded.

Is it too crazy to just make CONFIG_PSAMPLE a bool so that it
won't be a module?

Thanks.
Chris Mi Jan. 29, 2021, 6:08 a.m. UTC | #2
On 1/29/2021 1:14 PM, Cong Wang wrote:
> On Wed, Jan 27, 2021 at 5:48 PM Chris Mi <cmi@nvidia.com> wrote:
>> In order to send sampled packets to userspace, NIC driver calls
>> psample api directly. But it creates a hard dependency on module
>> psample. Introduce psample_ops to remove the hard dependency.
>> It is initialized when psample module is loaded and set to NULL
>> when the module is unloaded.
> Is it too crazy to just make CONFIG_PSAMPLE a bool so that it
> won't be a module?
>
> Thanks.
That's a crazy and simple solution. But adding such a stub is a common 
method to
remove the NIC driver dependency for modules. And comparing with other 
act_xxx
modules, psample is not small now.

Instead of discussing it several days, maybe it's better to review 
current patch, so that
we can move forward :)

Thanks.
Jakub Kicinski Jan. 29, 2021, 8:30 p.m. UTC | #3
On Fri, 29 Jan 2021 14:08:39 +0800 Chris Mi wrote:
> Instead of discussing it several days, maybe it's better to review 
> current patch, so that we can move forward :)

It took you 4 revisions to post a patch which builds cleanly and now
you want to hasten the review? My favorite kind of submission.

The mlxsw core + spectrum drivers are 65 times the size of psample 
on my system. Why is the dependency a problem?

What's going to make sure the module gets loaded when it's needed?
Saeed Mahameed Jan. 29, 2021, 8:44 p.m. UTC | #4
On Fri, 2021-01-29 at 12:30 -0800, Jakub Kicinski wrote:
> On Fri, 29 Jan 2021 14:08:39 +0800 Chris Mi wrote:
> > Instead of discussing it several days, maybe it's better to review 
> > current patch, so that we can move forward :)
> 
> It took you 4 revisions to post a patch which builds cleanly and now
> you want to hasten the review? My favorite kind of submission.
> 
> The mlxsw core + spectrum drivers are 65 times the size of psample 
> on my system. Why is the dependency a problem?
> 
> What's going to make sure the module gets loaded when it's needed?

The issue is with distros who ship modules independently.. having a
hard dependency will make it impossible for basic mlx5_core.ko users to
load the driver when psample is not installed/loaded.

I prefer to have 0 dependency on external modules in a HW driver.

Thanks,
Saeed.
Jakub Kicinski Jan. 29, 2021, 9:47 p.m. UTC | #5
On Fri, 29 Jan 2021 12:44:59 -0800 Saeed Mahameed wrote:
> On Fri, 2021-01-29 at 12:30 -0800, Jakub Kicinski wrote:
> > On Fri, 29 Jan 2021 14:08:39 +0800 Chris Mi wrote:  
> > > Instead of discussing it several days, maybe it's better to review 
> > > current patch, so that we can move forward :)  
> > 
> > It took you 4 revisions to post a patch which builds cleanly and now
> > you want to hasten the review? My favorite kind of submission.
> > 
> > The mlxsw core + spectrum drivers are 65 times the size of psample 
> > on my system. Why is the dependency a problem?
> > 
> > What's going to make sure the module gets loaded when it's needed?  
> 
> The issue is with distros who ship modules independently.. having a
> hard dependency will make it impossible for basic mlx5_core.ko users to
> load the driver when psample is not installed/loaded.
> 
> I prefer to have 0 dependency on external modules in a HW driver.

I see. Well such distros are clearly broken, right? You can't ship 
a module without its dependencies.
Jakub Kicinski Jan. 29, 2021, 9:50 p.m. UTC | #6
On Thu, 28 Jan 2021 09:45:43 +0800 Chris Mi wrote:
> In order to send sampled packets to userspace, NIC driver calls
> psample api directly. But it creates a hard dependency on module
> psample. Introduce psample_ops to remove the hard dependency.
> It is initialized when psample module is loaded and set to NULL
> when the module is unloaded.
> 
> Reported-by: kernel test robot <lkp@intel.com>
> Signed-off-by: Chris Mi <cmi@nvidia.com>
> Reviewed-by: Jiri Pirko <jiri@nvidia.com>

> diff --git a/include/net/psample.h b/include/net/psample.h
> index 68ae16bb0a4a..e6a73128de59 100644
> --- a/include/net/psample.h
> +++ b/include/net/psample.h
> @@ -4,6 +4,7 @@
>  
>  #include <uapi/linux/psample.h>
>  #include <linux/list.h>
> +#include <linux/skbuff.h>

Forward declaration should be enough.

>  struct psample_group {
>  	struct list_head list;

>  static void __exit psample_module_exit(void)
>  {
> +	RCU_INIT_POINTER(psample_ops, NULL);

I think you can use rcu_assign_pointer(), it handles constants 
right these days.

Please add a comment here saying that you depend on
genl_unregister_family() executing synchronize_rcu() 
and name the function where it does so.

>  	genl_unregister_family(&psample_nl_family);
>  }

> diff --git a/net/sched/psample_stub.c b/net/sched/psample_stub.c
> new file mode 100644
> index 000000000000..0615a7b64000
> --- /dev/null
> +++ b/net/sched/psample_stub.c
> @@ -0,0 +1,7 @@
> +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
> +/* Copyright (c) 2021 Mellanox Technologies. */
> +
> +#include <net/psample.h>

Forward declaration is sufficient.

> +const struct psample_ops __rcu *psample_ops __read_mostly;
> +EXPORT_SYMBOL_GPL(psample_ops);
Chris Mi Jan. 30, 2021, 2:35 a.m. UTC | #7
On 1/30/2021 5:50 AM, Jakub Kicinski wrote:
> On Thu, 28 Jan 2021 09:45:43 +0800 Chris Mi wrote:
>> In order to send sampled packets to userspace, NIC driver calls
>> psample api directly. But it creates a hard dependency on module
>> psample. Introduce psample_ops to remove the hard dependency.
>> It is initialized when psample module is loaded and set to NULL
>> when the module is unloaded.
>>
>> Reported-by: kernel test robot <lkp@intel.com>
>> Signed-off-by: Chris Mi <cmi@nvidia.com>
>> Reviewed-by: Jiri Pirko <jiri@nvidia.com>
>> diff --git a/include/net/psample.h b/include/net/psample.h
>> index 68ae16bb0a4a..e6a73128de59 100644
>> --- a/include/net/psample.h
>> +++ b/include/net/psample.h
>> @@ -4,6 +4,7 @@
>>   
>>   #include <uapi/linux/psample.h>
>>   #include <linux/list.h>
>> +#include <linux/skbuff.h>
> Forward declaration should be enough.
Done.
>
>>   struct psample_group {
>>   	struct list_head list;
>>   static void __exit psample_module_exit(void)
>>   {
>> +	RCU_INIT_POINTER(psample_ops, NULL);
> I think you can use rcu_assign_pointer(), it handles constants
> right these days.
Done.
>
> Please add a comment here saying that you depend on
> genl_unregister_family() executing synchronize_rcu()
> and name the function where it does so.
I added synchronize_rcu() here directly. Maybe it's more clear.
>
>>   	genl_unregister_family(&psample_nl_family);
>>   }
>> diff --git a/net/sched/psample_stub.c b/net/sched/psample_stub.c
>> new file mode 100644
>> index 000000000000..0615a7b64000
>> --- /dev/null
>> +++ b/net/sched/psample_stub.c
>> @@ -0,0 +1,7 @@
>> +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
>> +/* Copyright (c) 2021 Mellanox Technologies. */
>> +
>> +#include <net/psample.h>
> Forward declaration is sufficient.
Done.

Thanks,
Chris
>
>> +const struct psample_ops __rcu *psample_ops __read_mostly;
>> +EXPORT_SYMBOL_GPL(psample_ops);
Ido Schimmel Jan. 30, 2021, 2:42 p.m. UTC | #8
On Fri, Jan 29, 2021 at 12:30:09PM -0800, Jakub Kicinski wrote:
> On Fri, 29 Jan 2021 14:08:39 +0800 Chris Mi wrote:
> > Instead of discussing it several days, maybe it's better to review 
> > current patch, so that we can move forward :)
> 
> It took you 4 revisions to post a patch which builds cleanly and now
> you want to hasten the review? My favorite kind of submission.
> 
> The mlxsw core + spectrum drivers are 65 times the size of psample 
> on my system. Why is the dependency a problem?

mlxsw has been using psample for ~4 years and I don't remember seeing a
single complaint about the dependency. I don't understand why this patch
is needed.
Chris Mi Feb. 1, 2021, 1:37 a.m. UTC | #9
Hi Ido,

On 1/30/2021 10:42 PM, Ido Schimmel wrote:
> On Fri, Jan 29, 2021 at 12:30:09PM -0800, Jakub Kicinski wrote:
>> On Fri, 29 Jan 2021 14:08:39 +0800 Chris Mi wrote:
>>> Instead of discussing it several days, maybe it's better to review
>>> current patch, so that we can move forward :)
>> It took you 4 revisions to post a patch which builds cleanly and now
>> you want to hasten the review? My favorite kind of submission.
>>
>> The mlxsw core + spectrum drivers are 65 times the size of psample
>> on my system. Why is the dependency a problem?
> mlxsw has been using psample for ~4 years and I don't remember seeing a
> single complaint about the dependency. I don't understand why this patch
> is needed.
Please see Saeed's comment in previous email:

"

The issue is with distros who ship modules independently.. having a
hard dependency will make it impossible for basic mlx5_core.ko users to
load the driver when psample is not installed/loaded.

I prefer to have 0 dependency on external modules in a HW driver.
"

We are working on a tc sample offload feature for mlx5_core. The distros
are likely to request us to do this. So we address it before submitting
the driver changes.

Regards,
Chris
Ido Schimmel Feb. 1, 2021, 6:08 p.m. UTC | #10
On Mon, Feb 01, 2021 at 09:37:11AM +0800, Chris Mi wrote:
> Hi Ido,
> 
> On 1/30/2021 10:42 PM, Ido Schimmel wrote:
> > On Fri, Jan 29, 2021 at 12:30:09PM -0800, Jakub Kicinski wrote:
> > > On Fri, 29 Jan 2021 14:08:39 +0800 Chris Mi wrote:
> > > > Instead of discussing it several days, maybe it's better to review
> > > > current patch, so that we can move forward :)
> > > It took you 4 revisions to post a patch which builds cleanly and now
> > > you want to hasten the review? My favorite kind of submission.
> > > 
> > > The mlxsw core + spectrum drivers are 65 times the size of psample
> > > on my system. Why is the dependency a problem?
> > mlxsw has been using psample for ~4 years and I don't remember seeing a
> > single complaint about the dependency. I don't understand why this patch
> > is needed.
> Please see Saeed's comment in previous email:
> 
> "
> 
> The issue is with distros who ship modules independently.. having a
> hard dependency will make it impossible for basic mlx5_core.ko users to
> load the driver when psample is not installed/loaded.
> 
> I prefer to have 0 dependency on external modules in a HW driver.
> "

I saw it, but it basically comes down to personal preferences.

> 
> We are working on a tc sample offload feature for mlx5_core. The distros
> are likely to request us to do this. So we address it before submitting
> the driver changes.

Which distros? Can they comment here? mlxsw is in RHEL and I don't
remember queries from them about the psample module.

> 
> Regards,
> Chris
>
Leon Romanovsky Feb. 8, 2021, 7:03 a.m. UTC | #11
On Mon, Feb 01, 2021 at 08:08:37PM +0200, Ido Schimmel wrote:
> On Mon, Feb 01, 2021 at 09:37:11AM +0800, Chris Mi wrote:
> > Hi Ido,
> >
> > On 1/30/2021 10:42 PM, Ido Schimmel wrote:
> > > On Fri, Jan 29, 2021 at 12:30:09PM -0800, Jakub Kicinski wrote:
> > > > On Fri, 29 Jan 2021 14:08:39 +0800 Chris Mi wrote:
> > > > > Instead of discussing it several days, maybe it's better to review
> > > > > current patch, so that we can move forward :)
> > > > It took you 4 revisions to post a patch which builds cleanly and now
> > > > you want to hasten the review? My favorite kind of submission.
> > > >
> > > > The mlxsw core + spectrum drivers are 65 times the size of psample
> > > > on my system. Why is the dependency a problem?
> > > mlxsw has been using psample for ~4 years and I don't remember seeing a
> > > single complaint about the dependency. I don't understand why this patch
> > > is needed.
> > Please see Saeed's comment in previous email:
> >
> > "
> >
> > The issue is with distros who ship modules independently.. having a
> > hard dependency will make it impossible for basic mlx5_core.ko users to
> > load the driver when psample is not installed/loaded.
> >
> > I prefer to have 0 dependency on external modules in a HW driver.
> > "
>
> I saw it, but it basically comes down to personal preferences.

It is more than personal preferences. In opposite to the mlxsw which is
used for netdev only, the mlx5_core is used by other subsystems, e.g. RDMA,
so Saeed's request to avoid extra dependencies makes sense.

We don't need psample dependency to run RDMA traffic.

>
> >
> > We are working on a tc sample offload feature for mlx5_core. The distros
> > are likely to request us to do this. So we address it before submitting
> > the driver changes.
>
> Which distros? Can they comment here? mlxsw is in RHEL and I don't
> remember queries from them about the psample module.

There is a huge difference between being in RHEL and actively work with
partners as mlx5 does.

The open mailing list is not the right place to discuss our partnership
relations.

Thanks

>
> >
> > Regards,
> > Chris
> >
Ido Schimmel Feb. 8, 2021, 8:57 a.m. UTC | #12
On Mon, Feb 08, 2021 at 09:03:50AM +0200, Leon Romanovsky wrote:
> On Mon, Feb 01, 2021 at 08:08:37PM +0200, Ido Schimmel wrote:
> > On Mon, Feb 01, 2021 at 09:37:11AM +0800, Chris Mi wrote:
> > > Hi Ido,
> > >
> > > On 1/30/2021 10:42 PM, Ido Schimmel wrote:
> > > > On Fri, Jan 29, 2021 at 12:30:09PM -0800, Jakub Kicinski wrote:
> > > > > On Fri, 29 Jan 2021 14:08:39 +0800 Chris Mi wrote:
> > > > > > Instead of discussing it several days, maybe it's better to review
> > > > > > current patch, so that we can move forward :)
> > > > > It took you 4 revisions to post a patch which builds cleanly and now
> > > > > you want to hasten the review? My favorite kind of submission.
> > > > >
> > > > > The mlxsw core + spectrum drivers are 65 times the size of psample
> > > > > on my system. Why is the dependency a problem?
> > > > mlxsw has been using psample for ~4 years and I don't remember seeing a
> > > > single complaint about the dependency. I don't understand why this patch
> > > > is needed.
> > > Please see Saeed's comment in previous email:
> > >
> > > "
> > >
> > > The issue is with distros who ship modules independently.. having a
> > > hard dependency will make it impossible for basic mlx5_core.ko users to
> > > load the driver when psample is not installed/loaded.
> > >
> > > I prefer to have 0 dependency on external modules in a HW driver.
> > > "
> >
> > I saw it, but it basically comes down to personal preferences.
> 
> It is more than personal preferences. In opposite to the mlxsw which is
> used for netdev only, the mlx5_core is used by other subsystems, e.g. RDMA,
> so Saeed's request to avoid extra dependencies makes sense.
> 
> We don't need psample dependency to run RDMA traffic.

Right, you don't need it. The dependency is "PSAMPLE || PSAMPLE=n". You
can compile out psample and RDMA will work.

> 
> >
> > >
> > > We are working on a tc sample offload feature for mlx5_core. The distros
> > > are likely to request us to do this. So we address it before submitting
> > > the driver changes.
> >
> > Which distros? Can they comment here? mlxsw is in RHEL and I don't
> > remember queries from them about the psample module.
> 
> There is a huge difference between being in RHEL and actively work with
> partners as mlx5 does.
> 
> The open mailing list is not the right place to discuss our partnership
> relations.

I did not ask about "partnership relations". I asked for someone more
familiar with the problem that can explain the distro issue. But if such
a basic question can't be asked, then the distro argument should not
have been made in the first place.
Leon Romanovsky Feb. 8, 2021, 9:07 a.m. UTC | #13
On Mon, Feb 08, 2021 at 10:57:46AM +0200, Ido Schimmel wrote:
> On Mon, Feb 08, 2021 at 09:03:50AM +0200, Leon Romanovsky wrote:
> > On Mon, Feb 01, 2021 at 08:08:37PM +0200, Ido Schimmel wrote:
> > > On Mon, Feb 01, 2021 at 09:37:11AM +0800, Chris Mi wrote:
> > > > Hi Ido,
> > > >
> > > > On 1/30/2021 10:42 PM, Ido Schimmel wrote:
> > > > > On Fri, Jan 29, 2021 at 12:30:09PM -0800, Jakub Kicinski wrote:
> > > > > > On Fri, 29 Jan 2021 14:08:39 +0800 Chris Mi wrote:
> > > > > > > Instead of discussing it several days, maybe it's better to review
> > > > > > > current patch, so that we can move forward :)
> > > > > > It took you 4 revisions to post a patch which builds cleanly and now
> > > > > > you want to hasten the review? My favorite kind of submission.
> > > > > >
> > > > > > The mlxsw core + spectrum drivers are 65 times the size of psample
> > > > > > on my system. Why is the dependency a problem?
> > > > > mlxsw has been using psample for ~4 years and I don't remember seeing a
> > > > > single complaint about the dependency. I don't understand why this patch
> > > > > is needed.
> > > > Please see Saeed's comment in previous email:
> > > >
> > > > "
> > > >
> > > > The issue is with distros who ship modules independently.. having a
> > > > hard dependency will make it impossible for basic mlx5_core.ko users to
> > > > load the driver when psample is not installed/loaded.
> > > >
> > > > I prefer to have 0 dependency on external modules in a HW driver.
> > > > "
> > >
> > > I saw it, but it basically comes down to personal preferences.
> >
> > It is more than personal preferences. In opposite to the mlxsw which is
> > used for netdev only, the mlx5_core is used by other subsystems, e.g. RDMA,
> > so Saeed's request to avoid extra dependencies makes sense.
> >
> > We don't need psample dependency to run RDMA traffic.
>
> Right, you don't need it. The dependency is "PSAMPLE || PSAMPLE=n". You
> can compile out psample and RDMA will work.

So do you suggest to all our HPC users recompile their distribution kernel
just to make sure that psample is not called?

>
> >
> > >
> > > >
> > > > We are working on a tc sample offload feature for mlx5_core. The distros
> > > > are likely to request us to do this. So we address it before submitting
> > > > the driver changes.
> > >
> > > Which distros? Can they comment here? mlxsw is in RHEL and I don't
> > > remember queries from them about the psample module.
> >
> > There is a huge difference between being in RHEL and actively work with
> > partners as mlx5 does.
> >
> > The open mailing list is not the right place to discuss our partnership
> > relations.
>
> I did not ask about "partnership relations". I asked for someone more
> familiar with the problem that can explain the distro issue. But if such
> a basic question can't be asked, then the distro argument should not
> have been made in the first place.

It is not what you wrote, but if you don't want to take distro argument
into account, please don't bring mlxsw either.

Thanks
Ido Schimmel Feb. 8, 2021, 5:07 p.m. UTC | #14
On Mon, Feb 08, 2021 at 11:07:02AM +0200, Leon Romanovsky wrote:
> On Mon, Feb 08, 2021 at 10:57:46AM +0200, Ido Schimmel wrote:
> > On Mon, Feb 08, 2021 at 09:03:50AM +0200, Leon Romanovsky wrote:
> > > On Mon, Feb 01, 2021 at 08:08:37PM +0200, Ido Schimmel wrote:
> > > > On Mon, Feb 01, 2021 at 09:37:11AM +0800, Chris Mi wrote:
> > > > > Hi Ido,
> > > > >
> > > > > On 1/30/2021 10:42 PM, Ido Schimmel wrote:
> > > > > > On Fri, Jan 29, 2021 at 12:30:09PM -0800, Jakub Kicinski wrote:
> > > > > > > On Fri, 29 Jan 2021 14:08:39 +0800 Chris Mi wrote:
> > > > > > > > Instead of discussing it several days, maybe it's better to review
> > > > > > > > current patch, so that we can move forward :)
> > > > > > > It took you 4 revisions to post a patch which builds cleanly and now
> > > > > > > you want to hasten the review? My favorite kind of submission.
> > > > > > >
> > > > > > > The mlxsw core + spectrum drivers are 65 times the size of psample
> > > > > > > on my system. Why is the dependency a problem?
> > > > > > mlxsw has been using psample for ~4 years and I don't remember seeing a
> > > > > > single complaint about the dependency. I don't understand why this patch
> > > > > > is needed.
> > > > > Please see Saeed's comment in previous email:
> > > > >
> > > > > "
> > > > >
> > > > > The issue is with distros who ship modules independently.. having a
> > > > > hard dependency will make it impossible for basic mlx5_core.ko users to
> > > > > load the driver when psample is not installed/loaded.
> > > > >
> > > > > I prefer to have 0 dependency on external modules in a HW driver.
> > > > > "
> > > >
> > > > I saw it, but it basically comes down to personal preferences.
> > >
> > > It is more than personal preferences. In opposite to the mlxsw which is
> > > used for netdev only, the mlx5_core is used by other subsystems, e.g. RDMA,
> > > so Saeed's request to avoid extra dependencies makes sense.
> > >
> > > We don't need psample dependency to run RDMA traffic.
> >
> > Right, you don't need it. The dependency is "PSAMPLE || PSAMPLE=n". You
> > can compile out psample and RDMA will work.
> 
> So do you suggest to all our HPC users recompile their distribution kernel
> just to make sure that psample is not called?

I don't know. What are they complaining about? That psample needs to be
installed for mlx5_core to be loaded? How come the rest of the
dependencies are installed?

Or are they complaining about the size / memory footprint of psample?
Because then they should first check mlx5_core when all of its options
are blindly enabled as part of a distribution config.

AFAICS, mlx5 still does not have any code that uses psample. You can
wrap it in a config option and keep the weak dependency on psample.
Something like:

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
index ad45d20f9d44..d17d03d8cc8b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
@@ -104,6 +104,15 @@ config MLX5_TC_CT
 
          If unsure, set to Y
 
+config MLX5_TC_SAMPLE
+       bool "MLX5 TC sample offload support"
+       depends on MLX5_CLS_ACT
+       depends on PSAMPLE || PSAMPLE=n
+       default n
+       help
+         Say Y here if you want to support offloading tc rules that use sample
+          action.
+
 config MLX5_CORE_EN_DCB
        bool "Data Center Bridging (DCB) Support"
        default y

> 
> >
> > >
> > > >
> > > > >
> > > > > We are working on a tc sample offload feature for mlx5_core. The distros
> > > > > are likely to request us to do this. So we address it before submitting
> > > > > the driver changes.
> > > >
> > > > Which distros? Can they comment here? mlxsw is in RHEL and I don't
> > > > remember queries from them about the psample module.
> > >
> > > There is a huge difference between being in RHEL and actively work with
> > > partners as mlx5 does.
> > >
> > > The open mailing list is not the right place to discuss our partnership
> > > relations.
> >
> > I did not ask about "partnership relations". I asked for someone more
> > familiar with the problem that can explain the distro issue. But if such
> > a basic question can't be asked, then the distro argument should not
> > have been made in the first place.
> 
> It is not what you wrote, but if you don't want to take distro argument
> into account, please don't bring mlxsw either.
> 
> Thanks
Leon Romanovsky Feb. 9, 2021, 6:47 a.m. UTC | #15
On Mon, Feb 08, 2021 at 07:07:35PM +0200, Ido Schimmel wrote:
> On Mon, Feb 08, 2021 at 11:07:02AM +0200, Leon Romanovsky wrote:
> > On Mon, Feb 08, 2021 at 10:57:46AM +0200, Ido Schimmel wrote:
> > > On Mon, Feb 08, 2021 at 09:03:50AM +0200, Leon Romanovsky wrote:
> > > > On Mon, Feb 01, 2021 at 08:08:37PM +0200, Ido Schimmel wrote:
> > > > > On Mon, Feb 01, 2021 at 09:37:11AM +0800, Chris Mi wrote:
> > > > > > Hi Ido,
> > > > > >
> > > > > > On 1/30/2021 10:42 PM, Ido Schimmel wrote:
> > > > > > > On Fri, Jan 29, 2021 at 12:30:09PM -0800, Jakub Kicinski wrote:
> > > > > > > > On Fri, 29 Jan 2021 14:08:39 +0800 Chris Mi wrote:
> > > > > > > > > Instead of discussing it several days, maybe it's better to review
> > > > > > > > > current patch, so that we can move forward :)
> > > > > > > > It took you 4 revisions to post a patch which builds cleanly and now
> > > > > > > > you want to hasten the review? My favorite kind of submission.
> > > > > > > >
> > > > > > > > The mlxsw core + spectrum drivers are 65 times the size of psample
> > > > > > > > on my system. Why is the dependency a problem?
> > > > > > > mlxsw has been using psample for ~4 years and I don't remember seeing a
> > > > > > > single complaint about the dependency. I don't understand why this patch
> > > > > > > is needed.
> > > > > > Please see Saeed's comment in previous email:
> > > > > >
> > > > > > "
> > > > > >
> > > > > > The issue is with distros who ship modules independently.. having a
> > > > > > hard dependency will make it impossible for basic mlx5_core.ko users to
> > > > > > load the driver when psample is not installed/loaded.
> > > > > >
> > > > > > I prefer to have 0 dependency on external modules in a HW driver.
> > > > > > "
> > > > >
> > > > > I saw it, but it basically comes down to personal preferences.
> > > >
> > > > It is more than personal preferences. In opposite to the mlxsw which is
> > > > used for netdev only, the mlx5_core is used by other subsystems, e.g. RDMA,
> > > > so Saeed's request to avoid extra dependencies makes sense.
> > > >
> > > > We don't need psample dependency to run RDMA traffic.
> > >
> > > Right, you don't need it. The dependency is "PSAMPLE || PSAMPLE=n". You
> > > can compile out psample and RDMA will work.
> >
> > So do you suggest to all our HPC users recompile their distribution kernel
> > just to make sure that psample is not called?
>
> I don't know. What are they complaining about? That psample needs to be
> installed for mlx5_core to be loaded? How come the rest of the
> dependencies are installed?

The psample module was first dependency that caught our attention. It is
here as an example of such not-needed dependency. Like Saeed said, we are
interested in more general solution that will allow us to use external
modules in fully dynamic mode.

Internally, as a preparation to the submission of mlx5 code that used nf_conntrack,
we found that restart of firewald service will bring down our mlx5_core driver, because
of such dependency.

So to answer on your question, HPC didn't complain yet, but we don't have any plans
to wait till they complain.

>
> Or are they complaining about the size / memory footprint of psample?
> Because then they should first check mlx5_core when all of its options
> are blindly enabled as part of a distribution config.

You are too focused on psample, while Saeed and I are saying more
general statement "I prefer to have 0 dependency on external modules in a HW driver."

>
> AFAICS, mlx5 still does not have any code that uses psample. You can
> wrap it in a config option and keep the weak dependency on psample.
> Something like:
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
> index ad45d20f9d44..d17d03d8cc8b 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
> @@ -104,6 +104,15 @@ config MLX5_TC_CT
>
>           If unsure, set to Y
>
> +config MLX5_TC_SAMPLE
> +       bool "MLX5 TC sample offload support"
> +       depends on MLX5_CLS_ACT
> +       depends on PSAMPLE || PSAMPLE=n
> +       default n
> +       help
> +         Say Y here if you want to support offloading tc rules that use sample
> +          action.
> +

This is another problem with mlx5 - complete madness with config options
that are not possible to test.
➜  kernel git:(rdma-next) grep -h "config MLX" drivers/net/ethernet/mellanox/mlx5/core/Kconfig | awk '{ print $2}' | sort |uniq |wc -l
19

And it is not weak dependency, but still hard dependency because you
need to recompile your kernel/module to disable/enable it. Any service
that will need to reload psample module for some reason will remove
mlx5_core.

Thanks
Or Gerlitz Feb. 9, 2021, 9:25 a.m. UTC | #16
On Tue, Feb 9, 2021 at 8:49 AM Leon Romanovsky <leon@kernel.org> wrote:

[..]

> This is another problem with mlx5 - complete madness with config options
> that are not possible to test.
> ➜  kernel git:(rdma-next) grep -h "config MLX" drivers/net/ethernet/mellanox/mlx5/core/Kconfig | awk '{ print $2}' | sort |uniq |wc -l
> 19

wait, why do you call it madness? we were suggested by some users (do
git blame for the patches) to refine things with multiple configs and it seem
to work quite well  -- what you don't like? and what's wrong with simple grep..

$ grep "config MLX5_" drivers/net/ethernet/mellanox/mlx5/core/Kconfig
config MLX5_CORE
config MLX5_ACCEL
config MLX5_FPGA
config MLX5_CORE_EN
config MLX5_EN_ARFS
config MLX5_EN_RXNFC
config MLX5_MPFS
config MLX5_ESWITCH
config MLX5_CLS_ACT
config MLX5_TC_CT
config MLX5_CORE_EN_DCB
config MLX5_CORE_IPOIB
config MLX5_FPGA_IPSEC
config MLX5_IPSEC
config MLX5_EN_IPSEC
config MLX5_FPGA_TLS
config MLX5_TLS
config MLX5_EN_TLS
config MLX5_SW_STEERING
config MLX5_SF
config MLX5_SF_MANAGER
config MLX5_EN_NVMEOTCP
Leon Romanovsky Feb. 9, 2021, 10:01 a.m. UTC | #17
On Tue, Feb 09, 2021 at 11:25:33AM +0200, Or Gerlitz wrote:
> On Tue, Feb 9, 2021 at 8:49 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> [..]
>
> > This is another problem with mlx5 - complete madness with config options
> > that are not possible to test.
> > ➜  kernel git:(rdma-next) grep -h "config MLX" drivers/net/ethernet/mellanox/mlx5/core/Kconfig | awk '{ print $2}' | sort |uniq |wc -l
> > 19
>
> wait, why do you call it madness? we were suggested by some users (do
> git blame for the patches) to refine things with multiple configs and it seem
> to work quite well  -- what you don't like? and what's wrong with simple grep..

Yes, I aware of these users and what and why they asked it.

They didn't ask us to have new config for every feature/file, but to have
light ethernet device.

Other users are distributions and they enable all options that supported in
the specific kernel they picked, because they don't know in advance where their
distro will be used.

You also don't have capacity to test various combinations, so you
test only small subset of common ones that are pretty standard. This is why
you have this feeling of "work quite well".

And I'm not talking about compilations but actual regression runs.

I suggest to reduce number of configs to small amount, something like 3-5 options:
 * Basic ethernet driver
 * + ETH
 * + RDMA
 * + VDPA
 * ....
 * Full mlx5 driver

And there is nothing wrong with simple grep, it was my copy/paste from
another internal thread where this config bloat caused us to some griefs
in regression.

Thanks

>
> $ grep "config MLX5_" drivers/net/ethernet/mellanox/mlx5/core/Kconfig
> config MLX5_CORE
> config MLX5_ACCEL
> config MLX5_FPGA
> config MLX5_CORE_EN
> config MLX5_EN_ARFS
> config MLX5_EN_RXNFC
> config MLX5_MPFS
> config MLX5_ESWITCH
> config MLX5_CLS_ACT
> config MLX5_TC_CT
> config MLX5_CORE_EN_DCB
> config MLX5_CORE_IPOIB
> config MLX5_FPGA_IPSEC
> config MLX5_IPSEC
> config MLX5_EN_IPSEC
> config MLX5_FPGA_TLS
> config MLX5_TLS
> config MLX5_EN_TLS
> config MLX5_SW_STEERING
> config MLX5_SF
> config MLX5_SF_MANAGER
> config MLX5_EN_NVMEOTCP
Or Gerlitz Feb. 9, 2021, 10:49 a.m. UTC | #18
On Tue, Feb 9, 2021 at 12:01 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Tue, Feb 09, 2021 at 11:25:33AM +0200, Or Gerlitz wrote:
> > On Tue, Feb 9, 2021 at 8:49 AM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > [..]
> >
> > > This is another problem with mlx5 - complete madness with config options
> > > that are not possible to test.
> > > ➜  kernel git:(rdma-next) grep -h "config MLX" drivers/net/ethernet/mellanox/mlx5/core/Kconfig | awk '{ print $2}' | sort |uniq |wc -l
> > > 19
> >
> > wait, why do you call it madness? we were suggested by some users (do
> > git blame for the patches) to refine things with multiple configs and it seem
> > to work quite well  -- what you don't like? and what's wrong with simple grep..
>
> Yes, I aware of these users and what and why they asked it.
>
> They didn't ask us to have new config for every feature/file, but to have
> light ethernet device.
>
> Other users are distributions and they enable all options that supported in
> the specific kernel they picked, because they don't know in advance where their
> distro will be used.
>
> You also don't have capacity to test various combinations, so you
> test only small subset of common ones that are pretty standard. This is why
> you have this feeling of "work quite well".

ok, point taken

> And I'm not talking about compilations but actual regression runs.

understood

> I suggest to reduce number of configs to small amount, something like 3-5 options:
>  * Basic ethernet driver
>  * + ETH
>  * + RDMA
>  * + VDPA
>  * ....
>  * Full mlx5 driver

doesn't sound impossible

> And there is nothing wrong with simple grep [..]

no worries
Saeed Mahameed Feb. 11, 2021, 5:09 a.m. UTC | #19
On Tue, 2021-02-09 at 08:47 +0200, Leon Romanovsky wrote:
> On Mon, Feb 08, 2021 at 07:07:35PM +0200, Ido Schimmel wrote:
> > On Mon, Feb 08, 2021 at 11:07:02AM +0200, Leon Romanovsky wrote:
> > 
> > 

[...]

> > I don't know. What are they complaining about? That psample needs
> > to be
> > installed for mlx5_core to be loaded? How come the rest of the
> > dependencies are installed?
> 
> The psample module was first dependency that caught our attention. It
> is
> here as an example of such not-needed dependency. Like Saeed said, we
> are
> interested in more general solution that will allow us to use
> external
> modules in fully dynamic mode.
> 
> Internally, as a preparation to the submission of mlx5 code that used
> nf_conntrack,
> we found that restart of firewald service will bring down our
> mlx5_core driver, because
> of such dependency.
> 
> So to answer on your question, HPC didn't complain yet, but we don't
> have any plans
> to wait till they complain.
> 
> > 
> > Or are they complaining about the size / memory footprint of
> > psample?
> > Because then they should first check mlx5_core when all of its
> > options
> > are blindly enabled as part of a distribution config.
> 
> You are too focused on psample, while Saeed and I are saying more
> general statement "I prefer to have 0 dependency on external modules
> in a HW driver."
> 
> > 
> > AFAICS, mlx5 still does not have any code that uses psample. You
> > can
> > wrap it in a config option and keep the weak dependency on psample.
> > Something like:
> > 
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
> > b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
> > index ad45d20f9d44..d17d03d8cc8b 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
> > @@ -104,6 +104,15 @@ config MLX5_TC_CT
> > 
> >           If unsure, set to Y
> > 
> > +config MLX5_TC_SAMPLE
> > +       bool "MLX5 TC sample offload support"
> > +       depends on MLX5_CLS_ACT
> > +       depends on PSAMPLE || PSAMPLE=n
> > +       default n
> > +       help
> > +         Say Y here if you want to support offloading tc rules
> > that use sample
> > +          action.
> > +
> 

This won't solve anything other than compilation time dependency
between built-in modules to external modules, this is not the case.

our case is when both mlx5 and psample are modules, you can't load mlx5
without loading psample, even if you are not planning to use psample or
mlx5 psample features, which is 99.99% of the cases.

What we are asking for here is not new, and is a common practice in
netdev stack

see :
udp_tunnel_nic_ops
netfilter is full of these, see nf_ct_hook..

I don't see anything wrong with either repeating this practice for any
module or having some sort of a generic proxy in the built-in netdev
layer..
Ido Schimmel Feb. 11, 2021, 9:59 p.m. UTC | #20
On Wed, Feb 10, 2021 at 09:09:41PM -0800, Saeed Mahameed wrote:
> This won't solve anything other than compilation time dependency
> between built-in modules to external modules, this is not the case.
> 
> our case is when both mlx5 and psample are modules, you can't load mlx5
> without loading psample, even if you are not planning to use psample or
> mlx5 psample features, which is 99.99% of the cases.

You are again explaining what are the implications of the dependency,
but you are not explaining why it is the end of the world for you. All I
hear are hypotheticals. Dependencies are also a common practice in the
kernel and mlx5 has a few of its own (I see that pci_hyperv_intf was
loaded by mlx5_core on my system, for example).

> What we are asking for here is not new, and is a common practice in
> netdev stack
> 
> see :
> udp_tunnel_nic_ops
> netfilter is full of these, see nf_ct_hook..
> 
> I don't see anything wrong with either repeating this practice for any
> module or having some sort of a generic proxy in the built-in netdev
> layer..

If you want to move forward with patch, then I ask that you provide a
proper commit message with all the information that was exchanged in
this thread so that multiple people wouldn't need to milk it again upon
re-submission. For example:

"
The tc-sample action sends sampled packets to the psample module which
encapsulates them in generic netlink messages along with associated
metadata and emits notifications to user space.

Device drivers that offload this action are expected to report sampled
packets directly to the psample module by calling
psample_sample_packet(). This creates a dependency between these drivers
and the psample module.

While we are not aware of a problem this dependency can create, we
prefer to avoid it due to past experience with other dependencies. For
example, we discovered that a dependency of mlx5_core on nf_conntrack
will result in mlx5_core being unloaded upon a restart of the firewalld
service. This is because the firewalld service iterates over all the
dependants of nf_conntrack and unloads them so that it could eventually
unload nf_conntrack [1]. In addition, the psample module is only needed
in a small subset of deployments.

Therefore, avoid this dependency by doing XYZ. This is a common way to
reduce dependencies and employed by XYZ, for example.

Note that while the psample module will not be loaded upon the loading
of offloading drivers, it will be loaded by act_sample, which depends on
it. And since drivers offload the act_sample action, psample will be
loaded when needed.

Encapsulating the sampling code in a driver with a config option and
making it depend on psample will result in the psample module being
loaded in most cases given that some distributions blindly enable all
config options.

[1] https://github.com/firewalld/firewalld/blob/master/src/firewall/core/modules.py#L97
"

I also ask that this patch will be routed via the mlxsw queue. Few
reasons:

1. mlxsw already depends on psample while mlx5 does not. Therefore, this
patch needs to take care of mlxsw first. There is no reason to call into
psample differently from different drivers

2. We are about to queue changes (for 5.13) to psample that are going to
conflict with this patch. To avoid the conflict, I want to queue this
patch on top of these changes. The changes also contain selftests which
will provide better test coverage for this patch

Thanks
Saeed Mahameed Feb. 12, 2021, 12:41 a.m. UTC | #21
On Thu, 2021-02-11 at 23:59 +0200, Ido Schimmel wrote:
> On Wed, Feb 10, 2021 at 09:09:41PM -0800, Saeed Mahameed wrote:
> > This won't solve anything other than compilation time dependency
> > between built-in modules to external modules, this is not the case.
> > 
> > our case is when both mlx5 and psample are modules, you can't load
> > mlx5
> > without loading psample, even if you are not planning to use
> > psample or
> > mlx5 psample features, which is 99.99% of the cases.
> 
> You are again explaining what are the implications of the dependency,
> but you are not explaining why it is the end of the world for you.
> All I
> hear are hypotheticals. Dependencies are also a common practice in
> the
> kernel and mlx5 has a few of its own (I see that pci_hyperv_intf was
> loaded by mlx5_core on my system, for example).
> 

Very great example actually, the reason there is pci_hyperv_intf in
first place is exactly to avoid hard dependency between mlx5_core and
pci_hyperv.ko so they came up with pci_hyperv_intf which is a thin
layer between pci_hyperv and mlx5_core, microsoft insisted at the time
that pci_hyperv_intf can be a module (I really don't know why
though) ..

if there wasn't  pci_hyperv_intf then pci_hyperv would be a direct
dependency of mlx5_core, the problem s that pci_hperv can only be
loaded on a hyperv VM.. if you try to load it on any different host it
will fail, thus mlx5 can never load on any system where pci_hyperv.ko
is enabled as CONFIG option but it is not a hyperv VM..

Anyway the point is clear and very straight forward:

A hw_driver.ko should never depend on feature_x.ko, because who knows
what dependencies feature_x.ko are hiding behind or what pre-conditions
feature_x.ko can impose.


> > What we are asking for here is not new, and is a common practice in
> > netdev stack
> > 
> > see :
> > udp_tunnel_nic_ops
> > netfilter is full of these, see nf_ct_hook..
> > 
> > I don't see anything wrong with either repeating this practice for
> > any
> > module or having some sort of a generic proxy in the built-in
> > netdev
> > layer..
> 
> If you want to move forward with patch, then I ask that you provide a
> proper commit message with all the information that was exchanged in
> this thread so that multiple people wouldn't need to milk it again
> upon

Apparently you didn't milk this thread well enough, we already shut the
door on this patch weeks ago :)

But We are still discussing the dependency claim, not related to
psample..
We have some features lined up with harder dependencies than psample.

Again as i previously said, I will try to come up with a unified
universal approach with the necessary documentation and explanations.


> re-submission. For example:
> 
> "
> The tc-sample action sends sampled packets to the psample module
> which
> encapsulates them in generic netlink messages along with associated
> metadata and emits notifications to user space.
> 
> Device drivers that offload this action are expected to report
> sampled
> packets directly to the psample module by calling
> psample_sample_packet(). This creates a dependency between these
> drivers
> and the psample module.
> 
> While we are not aware of a problem this dependency can create, we
> prefer to avoid it due to past experience with other dependencies.
> For
> example, we discovered that a dependency of mlx5_core on nf_conntrack
> will result in mlx5_core being unloaded upon a restart of the
> firewalld
> service. This is because the firewalld service iterates over all the
> dependants of nf_conntrack and unloads them so that it could
> eventually
> unload nf_conntrack [1]. In addition, the psample module is only
> needed
> in a small subset of deployments.
> 
> Therefore, avoid this dependency by doing XYZ. This is a common way
> to
> reduce dependencies and employed by XYZ, for example.
> 
> Note that while the psample module will not be loaded upon the
> loading
> of offloading drivers, it will be loaded by act_sample, which depends
> on
> it. And since drivers offload the act_sample action, psample will be
> loaded when needed.
> 
> Encapsulating the sampling code in a driver with a config option and
> making it depend on psample will result in the psample module being
> loaded in most cases given that some distributions blindly enable all
> config options.
> 
> [1]   
> https://github.com/firewalld/firewalld/blob/master/src/firewall/core/modules.py#L97
> "
> 
> I also ask that this patch will be routed via the mlxsw queue. Few
> reasons:
> 
> 1. mlxsw already depends on psample while mlx5 does not. Therefore,
> this
> patch needs to take care of mlxsw first. There is no reason to call
> into
> psample differently from different drivers
> 
> 2. We are about to queue changes (for 5.13) to psample that are going
> to
> conflict with this patch. To avoid the conflict, I want to queue this
> patch on top of these changes. The changes also contain selftests
> which
> will provide better test coverage for this patch
> 
> Thanks

Thanks, will keep this in mind.
diff mbox series

Patch

diff --git a/include/net/psample.h b/include/net/psample.h
index 68ae16bb0a4a..e6a73128de59 100644
--- a/include/net/psample.h
+++ b/include/net/psample.h
@@ -4,6 +4,7 @@ 
 
 #include <uapi/linux/psample.h>
 #include <linux/list.h>
+#include <linux/skbuff.h>
 
 struct psample_group {
 	struct list_head list;
@@ -14,6 +15,15 @@  struct psample_group {
 	struct rcu_head rcu;
 };
 
+struct psample_ops {
+	void (*sample_packet)(struct psample_group *group, struct sk_buff *skb,
+			      u32 trunc_size, int in_ifindex, int out_ifindex,
+			      u32 sample_rate);
+
+};
+
+extern const struct psample_ops __rcu *psample_ops __read_mostly;
+
 struct psample_group *psample_group_get(struct net *net, u32 group_num);
 void psample_group_take(struct psample_group *group);
 void psample_group_put(struct psample_group *group);
@@ -35,4 +45,21 @@  static inline void psample_sample_packet(struct psample_group *group,
 
 #endif
 
+static inline void
+psample_nic_sample_packet(struct psample_group *group,
+			  struct sk_buff *skb, u32 trunc_size,
+			  int in_ifindex, int out_ifindex,
+			  u32 sample_rate)
+{
+	const struct psample_ops *ops;
+
+	rcu_read_lock();
+	ops = rcu_dereference(psample_ops);
+	if (ops)
+		ops->sample_packet(group, skb, trunc_size,
+				   in_ifindex, out_ifindex,
+				   sample_rate);
+	rcu_read_unlock();
+}
+
 #endif /* __NET_PSAMPLE_H */
diff --git a/net/psample/psample.c b/net/psample/psample.c
index 33e238c965bd..2a9fbfe09395 100644
--- a/net/psample/psample.c
+++ b/net/psample/psample.c
@@ -8,6 +8,7 @@ 
 #include <linux/kernel.h>
 #include <linux/skbuff.h>
 #include <linux/module.h>
+#include <linux/rcupdate.h>
 #include <net/net_namespace.h>
 #include <net/sock.h>
 #include <net/netlink.h>
@@ -35,6 +36,10 @@  static const struct genl_multicast_group psample_nl_mcgrps[] = {
 
 static struct genl_family psample_nl_family __ro_after_init;
 
+static const struct psample_ops psample_sample_ops = {
+	.sample_packet	= psample_sample_packet,
+};
+
 static int psample_group_nl_fill(struct sk_buff *msg,
 				 struct psample_group *group,
 				 enum psample_command cmd, u32 portid, u32 seq,
@@ -456,11 +461,17 @@  EXPORT_SYMBOL_GPL(psample_sample_packet);
 
 static int __init psample_module_init(void)
 {
-	return genl_register_family(&psample_nl_family);
+	int ret;
+
+	ret = genl_register_family(&psample_nl_family);
+	if (!ret)
+		RCU_INIT_POINTER(psample_ops, &psample_sample_ops);
+	return ret;
 }
 
 static void __exit psample_module_exit(void)
 {
+	RCU_INIT_POINTER(psample_ops, NULL);
 	genl_unregister_family(&psample_nl_family);
 }
 
diff --git a/net/sched/Makefile b/net/sched/Makefile
index dd14ef413fda..0d92bb98bb26 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -3,7 +3,7 @@ 
 # Makefile for the Linux Traffic Control Unit.
 #
 
-obj-y	:= sch_generic.o sch_mq.o
+obj-y	:= sch_generic.o sch_mq.o psample_stub.o
 
 obj-$(CONFIG_INET)		+= sch_frag.o
 obj-$(CONFIG_NET_SCHED)		+= sch_api.o sch_blackhole.o
diff --git a/net/sched/psample_stub.c b/net/sched/psample_stub.c
new file mode 100644
index 000000000000..0615a7b64000
--- /dev/null
+++ b/net/sched/psample_stub.c
@@ -0,0 +1,7 @@ 
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/* Copyright (c) 2021 Mellanox Technologies. */
+
+#include <net/psample.h>
+
+const struct psample_ops __rcu *psample_ops __read_mostly;
+EXPORT_SYMBOL_GPL(psample_ops);