diff mbox series

[net-next,v3,5/9] genetlink: implement release callback and free sk_user_data there

Message ID 20231120084657.458076-6-jiri@resnulli.us (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series devlink: introduce notifications filtering | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net-next
netdev/apply fail Patch does not apply to net-next

Commit Message

Jiri Pirko Nov. 20, 2023, 8:46 a.m. UTC
From: Jiri Pirko <jiri@nvidia.com>

If any generic netlink family would like to allocate data store the
pointer to sk_user_data, there is no way to do cleanup in the family
code.

Assume that kfree() is good for now, as the only user introduced by the
follow-up patch (devlink) will use kzalloc() for the allocation of
the memory pointed by a pointer stored in sk_user_data. If later on
this needs to be implemented per-family, a callback is going
to be needed. Until then (if ever), do this in a simple way.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
---
 net/netlink/genetlink.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Jakub Kicinski Nov. 21, 2023, 2:50 a.m. UTC | #1
On Mon, 20 Nov 2023 09:46:53 +0100 Jiri Pirko wrote:
> If any generic netlink family would like to allocate data store the
> pointer to sk_user_data, there is no way to do cleanup in the family
> code.

How is this supposed to work?

genetlink sockets are not bound to a family. User can use a single
socket to subscribe to notifications from all families and presumably
each one of the would interpret sk->sk_user_data as their own state?

You need to store the state locally in the family, keyed
on pid, and free it using the NETLINK_URELEASE notifier...
Jiri Pirko Nov. 21, 2023, 7:36 a.m. UTC | #2
Tue, Nov 21, 2023 at 03:50:22AM CET, kuba@kernel.org wrote:
>On Mon, 20 Nov 2023 09:46:53 +0100 Jiri Pirko wrote:
>> If any generic netlink family would like to allocate data store the
>> pointer to sk_user_data, there is no way to do cleanup in the family
>> code.
>
>How is this supposed to work?
>
>genetlink sockets are not bound to a family. User can use a single
>socket to subscribe to notifications from all families and presumably
>each one of the would interpret sk->sk_user_data as their own state?
>
>You need to store the state locally in the family, keyed
>on pid, and free it using the NETLINK_URELEASE notifier...

Ah, correct. Will fix.
Jiri Pirko Nov. 21, 2023, 1:12 p.m. UTC | #3
Tue, Nov 21, 2023 at 03:50:22AM CET, kuba@kernel.org wrote:
>On Mon, 20 Nov 2023 09:46:53 +0100 Jiri Pirko wrote:
>> If any generic netlink family would like to allocate data store the
>> pointer to sk_user_data, there is no way to do cleanup in the family
>> code.
>
>How is this supposed to work?
>
>genetlink sockets are not bound to a family. User can use a single
>socket to subscribe to notifications from all families and presumably
>each one of the would interpret sk->sk_user_data as their own state?
>
>You need to store the state locally in the family, keyed
>on pid, and free it using the NETLINK_URELEASE notifier...

Well, pin can have 2 sockets of different config. I think that sk/family
tuple is needed. I'm exploring a possibility to have genetlink
sk->sk_user_data used to store the hashlist keyed by the sk/family tuple.
Jakub Kicinski Nov. 21, 2023, 5:55 p.m. UTC | #4
On Tue, 21 Nov 2023 14:12:55 +0100 Jiri Pirko wrote:
> >How is this supposed to work?
> >
> >genetlink sockets are not bound to a family. User can use a single
> >socket to subscribe to notifications from all families and presumably
> >each one of the would interpret sk->sk_user_data as their own state?
> >
> >You need to store the state locally in the family, keyed
> >on pid, and free it using the NETLINK_URELEASE notifier...  
> 
> Well, pin can have 2 sockets of different config. I think that sk/family
> tuple is needed. I'm exploring a possibility to have genetlink
> sk->sk_user_data used to store the hashlist keyed by the sk/family tuple.

If you're doing it centrally, please put the state as a new field in
the netlink socket. sk_user_data is for the user.

Also let's start with a list, practically speaking using one socket 
in many families should be very rare.
Jiri Pirko Nov. 22, 2023, 9:29 a.m. UTC | #5
Tue, Nov 21, 2023 at 06:55:12PM CET, kuba@kernel.org wrote:
>On Tue, 21 Nov 2023 14:12:55 +0100 Jiri Pirko wrote:
>> >How is this supposed to work?
>> >
>> >genetlink sockets are not bound to a family. User can use a single
>> >socket to subscribe to notifications from all families and presumably
>> >each one of the would interpret sk->sk_user_data as their own state?
>> >
>> >You need to store the state locally in the family, keyed
>> >on pid, and free it using the NETLINK_URELEASE notifier...  
>> 
>> Well, pin can have 2 sockets of different config. I think that sk/family
>> tuple is needed. I'm exploring a possibility to have genetlink
>> sk->sk_user_data used to store the hashlist keyed by the sk/family tuple.
>
>If you're doing it centrally, please put the state as a new field in
>the netlink socket. sk_user_data is for the user.

I planned to use sk_user_data. What do you mean it is for the user?
I see it is already used for similar usecase by connector for example:

$ git grep sk_user_data drivers/connector/
drivers/connector/cn_proc.c:    if (!dsk || !dsk->sk_user_data || !data)
drivers/connector/cn_proc.c:    val = ((struct proc_input *)(dsk->sk_user_data))->event_type;
drivers/connector/cn_proc.c:    mc_op = ((struct proc_input *)(dsk->sk_user_data))->mcast_op;
drivers/connector/cn_proc.c:            if (sk->sk_user_data == NULL) {
drivers/connector/cn_proc.c:                    sk->sk_user_data = kzalloc(sizeof(struct proc_input),
drivers/connector/cn_proc.c:                    if (sk->sk_user_data == NULL) {
drivers/connector/cn_proc.c:                    ((struct proc_input *)(sk->sk_user_data))->mcast_op;
drivers/connector/cn_proc.c:            ((struct proc_input *)(sk->sk_user_data))->event_type =
drivers/connector/cn_proc.c:            ((struct proc_input *)(sk->sk_user_data))->mcast_op = mc_op;
drivers/connector/cn_proc.c:            ((struct proc_input *)(sk->sk_user_data))->event_type =
drivers/connector/connector.c:          kfree(sk->sk_user_data);
drivers/connector/connector.c:          sk->sk_user_data = NULL;


>
>Also let's start with a list, practically speaking using one socket 
>in many families should be very rare.

Okay.
Jakub Kicinski Nov. 22, 2023, 5:08 p.m. UTC | #6
On Wed, 22 Nov 2023 10:29:44 +0100 Jiri Pirko wrote:
> >If you're doing it centrally, please put the state as a new field in
> >the netlink socket. sk_user_data is for the user.  
> 
> I planned to use sk_user_data. What do you mean it is for the user?
> I see it is already used for similar usecase by connector for example:

I'm pretty sure I complained when it was being added. Long story.
AFAIU user as in if the socket is opened by a kernel module, the kernel
module is the user. There's no need to use this field for the
implementation since the implementation can simply extend its 
own structure to add a properly typed field.
Jiri Pirko Nov. 22, 2023, 6:20 p.m. UTC | #7
Wed, Nov 22, 2023 at 06:08:20PM CET, kuba@kernel.org wrote:
>On Wed, 22 Nov 2023 10:29:44 +0100 Jiri Pirko wrote:
>> >If you're doing it centrally, please put the state as a new field in
>> >the netlink socket. sk_user_data is for the user.  
>> 
>> I planned to use sk_user_data. What do you mean it is for the user?
>> I see it is already used for similar usecase by connector for example:
>
>I'm pretty sure I complained when it was being added. Long story.
>AFAIU user as in if the socket is opened by a kernel module, the kernel
>module is the user. There's no need to use this field for the
>implementation since the implementation can simply extend its 
>own structure to add a properly typed field.

Okay, excuse me, as always I'm slow here. What structure are
you refering to?

Thanks!
Jakub Kicinski Nov. 22, 2023, 7:50 p.m. UTC | #8
On Wed, 22 Nov 2023 19:20:58 +0100 Jiri Pirko wrote:
> >I'm pretty sure I complained when it was being added. Long story.
> >AFAIU user as in if the socket is opened by a kernel module, the kernel
> >module is the user. There's no need to use this field for the
> >implementation since the implementation can simply extend its 
> >own structure to add a properly typed field.  
> 
> Okay, excuse me, as always I'm slow here. What structure are
> you refering to?

struct netlink_sock. Technically it may be a little cleaner to wrap 
that in another struct that will be genetlink specific. But that's
a larger surgery for relatively little benefit at this stage.
Jiri Pirko Nov. 23, 2023, 6:37 a.m. UTC | #9
Wed, Nov 22, 2023 at 08:50:55PM CET, kuba@kernel.org wrote:
>On Wed, 22 Nov 2023 19:20:58 +0100 Jiri Pirko wrote:
>> >I'm pretty sure I complained when it was being added. Long story.
>> >AFAIU user as in if the socket is opened by a kernel module, the kernel
>> >module is the user. There's no need to use this field for the
>> >implementation since the implementation can simply extend its 
>> >own structure to add a properly typed field.  
>> 
>> Okay, excuse me, as always I'm slow here. What structure are
>> you refering to?
>
>struct netlink_sock. Technically it may be a little cleaner to wrap 
>that in another struct that will be genetlink specific. But that's
>a larger surgery for relatively little benefit at this stage.

Got it. Will explore that path. Thanks!
Jiri Pirko Nov. 23, 2023, 10:32 a.m. UTC | #10
Wed, Nov 22, 2023 at 06:08:20PM CET, kuba@kernel.org wrote:
>On Wed, 22 Nov 2023 10:29:44 +0100 Jiri Pirko wrote:
>> >If you're doing it centrally, please put the state as a new field in
>> >the netlink socket. sk_user_data is for the user.  
>> 
>> I planned to use sk_user_data. What do you mean it is for the user?
>> I see it is already used for similar usecase by connector for example:
>
>I'm pretty sure I complained when it was being added. Long story.
>AFAIU user as in if the socket is opened by a kernel module, the kernel
>module is the user. There's no need to use this field for the
>implementation since the implementation can simply extend its 
>own structure to add a properly typed field.

In this case, the socket is not opened by kernel, but it is opened by
the userspace app.

I basically need to have per-user-sk pointer somewhere I'm not clear why
to put it in struct netlink_sock when I can use sk_user_data which is
already there. From the usage of this pointer in kernel, I understand
this is exactly the reason to have it.

Are you afraid of a collision of sk_user_data use with somebody else
here? I don't see how that could happen for netlink socket.
Jakub Kicinski Nov. 23, 2023, 4:24 p.m. UTC | #11
On Thu, 23 Nov 2023 11:32:07 +0100 Jiri Pirko wrote:
> In this case, the socket is not opened by kernel, but it is opened by
> the userspace app.
> 
> I basically need to have per-user-sk pointer somewhere I'm not clear why
> to put it in struct netlink_sock when I can use sk_user_data which is
> already there. From the usage of this pointer in kernel, I understand
> this is exactly the reason to have it.

Various people stuck various things in that pointer just because,
it's a mess. IIUC the initial motivation for it is that someone
like NFS opens a kernel socket and needs to put private data
somewhere. A kernel user gets a callback for a socket, like data
ready, and needs to find their private state.

> Are you afraid of a collision of sk_user_data use with somebody else
> here? I don't see how that could happen for netlink socket.

Normally upper layer wraps the socket struct in its own struct. 
Look at the struct nesting for TCP or any other bona fide protocol. 
genetlink will benefit from having socket state, I bet it wasn't done
that way from the start because Jamal/Thomas were told to start small.

Please add a properly typed field to the netlink struct, unless you
have technical reasons not to.
Jiri Pirko Nov. 23, 2023, 4:53 p.m. UTC | #12
Thu, Nov 23, 2023 at 05:24:08PM CET, kuba@kernel.org wrote:
>On Thu, 23 Nov 2023 11:32:07 +0100 Jiri Pirko wrote:
>> In this case, the socket is not opened by kernel, but it is opened by
>> the userspace app.
>> 
>> I basically need to have per-user-sk pointer somewhere I'm not clear why
>> to put it in struct netlink_sock when I can use sk_user_data which is
>> already there. From the usage of this pointer in kernel, I understand
>> this is exactly the reason to have it.
>
>Various people stuck various things in that pointer just because,
>it's a mess. IIUC the initial motivation for it is that someone
>like NFS opens a kernel socket and needs to put private data
>somewhere. A kernel user gets a callback for a socket, like data
>ready, and needs to find their private state.
>
>> Are you afraid of a collision of sk_user_data use with somebody else
>> here? I don't see how that could happen for netlink socket.
>
>Normally upper layer wraps the socket struct in its own struct. 
>Look at the struct nesting for TCP or any other bona fide protocol. 
>genetlink will benefit from having socket state, I bet it wasn't done
>that way from the start because Jamal/Thomas were told to start small.
>
>Please add a properly typed field to the netlink struct, unless you
>have technical reasons not to.

Okay.
diff mbox series

Patch

diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c
index 92ef5ed2e7b0..905c5a167f53 100644
--- a/net/netlink/genetlink.c
+++ b/net/netlink/genetlink.c
@@ -1699,12 +1699,18 @@  static int genl_bind(struct net *net, int group)
 	return ret;
 }
 
+static void genl_release(struct sock *sk, unsigned long *groups)
+{
+	kfree(sk->sk_user_data);
+}
+
 static int __net_init genl_pernet_init(struct net *net)
 {
 	struct netlink_kernel_cfg cfg = {
 		.input		= genl_rcv,
 		.flags		= NL_CFG_F_NONROOT_RECV,
 		.bind		= genl_bind,
+		.release	= genl_release,
 	};
 
 	/* we'll bump the group number right afterwards */