Message ID | 20231123181546.521488-6-jiri@resnulli.us (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | devlink: introduce notifications filtering | expand |
On Thu, 2023-11-23 at 19:15 +0100, Jiri Pirko wrote: [...] > +/** > + * genl_sk_priv_store - Store per-socket private pointer for family > + * > + * @sk: socket > + * @family: family > + * @priv: private pointer > + * > + * Store a private pointer per-socket for a specified > + * Generic netlink family. > + * > + * Caller has to make sure this is not called in parallel multiple times > + * for the same sock and also in parallel to genl_release() for the same sock. > + * > + * Returns: previously stored private pointer for the family (could be NULL) > + * on success, otherwise negative error value encoded by ERR_PTR(). > + */ > +void *genl_sk_priv_store(struct sock *sk, struct genl_family *family, > + void *priv) > +{ > + struct genl_sk_ctx *ctx; > + void *old_priv; > + > + ctx = rcu_dereference_raw(nlk_sk(sk)->priv); Minor nit: Looking at the following patch, this will be called under the rtnl lock. Since a look is needed to ensure the priv ptr consistency, what about adding the relevant lockdep annotation here? No need to repost for the above. Cheers, Paolo
Mon, Nov 27, 2023 at 12:13:09PM CET, pabeni@redhat.com wrote: >On Thu, 2023-11-23 at 19:15 +0100, Jiri Pirko wrote: >[...] >> +/** >> + * genl_sk_priv_store - Store per-socket private pointer for family >> + * >> + * @sk: socket >> + * @family: family >> + * @priv: private pointer >> + * >> + * Store a private pointer per-socket for a specified >> + * Generic netlink family. >> + * >> + * Caller has to make sure this is not called in parallel multiple times >> + * for the same sock and also in parallel to genl_release() for the same sock. >> + * >> + * Returns: previously stored private pointer for the family (could be NULL) >> + * on success, otherwise negative error value encoded by ERR_PTR(). >> + */ >> +void *genl_sk_priv_store(struct sock *sk, struct genl_family *family, >> + void *priv) >> +{ >> + struct genl_sk_ctx *ctx; >> + void *old_priv; >> + >> + ctx = rcu_dereference_raw(nlk_sk(sk)->priv); > >Minor nit: Looking at the following patch, this will be called under >the rtnl lock. Since a look is needed to ensure the priv ptr Well, depends on the user (as the comment says). Devlink does not call it with rtnl lock. Relies on the socket skb processing and cleanup flows. >consistency, what about adding the relevant lockdep annotation here? > >No need to repost for the above. > >Cheers, > >Paolo >
On Thu, 23 Nov 2023 19:15:42 +0100 Jiri Pirko wrote:
> + void __rcu *priv;
How many times did I say "typed struct" in the feedback to the broken
v3 of this series? You complain so much about people not addressing
feedback you've given them, it's really hypocritical.
Put the xarray pointer here directly. Plus a lock to protect the init.
The size of the per-family struct should be in family definition,
allocation should happen on first get automatically. Family definition
should also hold a callback to how the data is going to be freed.
Mon, Nov 27, 2023 at 11:46:26PM CET, kuba@kernel.org wrote: >On Thu, 23 Nov 2023 19:15:42 +0100 Jiri Pirko wrote: >> + void __rcu *priv; > >How many times did I say "typed struct" in the feedback to the broken >v3 of this series? You complain so much about people not addressing >feedback you've given them, it's really hypocritical. I did what I thought is best. Since this is struct netlink_sock, it is not related only to genetlink. So other implementations may in theory use this pointer for something else. Looked a bit odd to put genetlink-specific pointer here, but as you wish. > >Put the xarray pointer here directly. Plus a lock to protect the init. Okay, just to make this clear. You want me to have: struct xarray __rcu *family_privs; in struct netlink_sock, correct? Why I need a lock? If I read things correctly, skbs are processed in serial over one sock. Since this is per-sock, should be safe. > >The size of the per-family struct should be in family definition, >allocation should happen on first get automatically. Family definition Yes, I thought about that. But I decided to do this lockless, allocating new priv every time the user sets the filter and replace the xarray item so it could be accessed in rcu read section during notification processing. What you suggest requires some lock to protect the memory being changed during filter set and suring accessing in in notify. But okay, if you insist. >should also hold a callback to how the data is going to be freed. If it is alloceted automatically, why is it needed? >-- >pw-bot: cr
On 11/23/23 19:15, Jiri Pirko wrote: > From: Jiri Pirko <jiri@nvidia.com> > > Introduce a priv pointer into struct netlink_sock. Use it to store a per > socket xarray that contains family->id indexed priv pointer storage. > Note I used xarray instead of suggested linked list as it is more > convenient, without need to have a container struct that would > contain struct list_head item. > > Introduce genl_sk_priv_store() to store the priv pointer. > Introduce genl_sk_priv_get() to obtain the priv pointer under RCU > read lock. > > Assume that kfree() is good for free of privs for now, as the only user > introduced by the follow-up patch (devlink) will use kzalloc() for the > allocation of the memory of the stored pointer. If later on > this needs to be made custom, a callback is going to be needed. > Until then (if ever), do this in a simple way. > > Signed-off-by: Jiri Pirko <jiri@nvidia.com> > --- > v3->v4: > - new patch > --- > include/net/genetlink.h | 3 ++ > net/netlink/af_netlink.h | 1 + > net/netlink/genetlink.c | 98 ++++++++++++++++++++++++++++++++++++++++ > 3 files changed, 102 insertions(+) > > diff --git a/include/net/genetlink.h b/include/net/genetlink.h > index e18a4c0d69ee..66c1e50415e0 100644 > --- a/include/net/genetlink.h > +++ b/include/net/genetlink.h > @@ -300,6 +300,9 @@ int genl_register_family(struct genl_family *family); > int genl_unregister_family(const struct genl_family *family); > void genl_notify(const struct genl_family *family, struct sk_buff *skb, > struct genl_info *info, u32 group, gfp_t flags); > +void *genl_sk_priv_get(struct sock *sk, struct genl_family *family); > +void *genl_sk_priv_store(struct sock *sk, struct genl_family *family, > + void *priv); > > void *genlmsg_put(struct sk_buff *skb, u32 portid, u32 seq, > const struct genl_family *family, int flags, u8 cmd); > diff --git a/net/netlink/af_netlink.h b/net/netlink/af_netlink.h > index 2145979b9986..5d96135a4cf3 100644 > --- a/net/netlink/af_netlink.h > +++ b/net/netlink/af_netlink.h > @@ -51,6 +51,7 @@ struct netlink_sock { > struct rhash_head node; > struct rcu_head rcu; > struct work_struct work; > + void __rcu *priv; > }; > > static inline struct netlink_sock *nlk_sk(struct sock *sk) > diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c > index 92ef5ed2e7b0..aae5e63fa50b 100644 > --- a/net/netlink/genetlink.c > +++ b/net/netlink/genetlink.c > @@ -21,6 +21,7 @@ > #include <linux/idr.h> > #include <net/sock.h> > #include <net/genetlink.h> > +#include "af_netlink.h" > > static DEFINE_MUTEX(genl_mutex); /* serialization of message processing */ > static DECLARE_RWSEM(cb_lock); > @@ -1699,12 +1700,109 @@ static int genl_bind(struct net *net, int group) > return ret; > } > > +struct genl_sk_ctx { > + struct xarray family_privs; > +}; > + > +static struct genl_sk_ctx *genl_sk_ctx_alloc(void) > +{ > + struct genl_sk_ctx *ctx; > + > + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL); > + if (!ctx) > + return NULL; > + xa_init_flags(&ctx->family_privs, XA_FLAGS_ALLOC); > + return ctx; > +} > + > +static void genl_sk_ctx_free(struct genl_sk_ctx *ctx) > +{ > + unsigned long family_id; > + void *priv; > + > + xa_for_each(&ctx->family_privs, family_id, priv) { > + xa_erase(&ctx->family_privs, family_id); > + kfree(priv); > + } > + xa_destroy(&ctx->family_privs); > + kfree(ctx); > +} > + > +/** > + * genl_sk_priv_get - Get per-socket private pointer for family > + * > + * @sk: socket > + * @family: family > + * > + * Lookup a private pointer stored per-socket by a specified > + * Generic netlink family. > + * > + * Caller should make sure this is called in RCU read locked section. > + * > + * Returns: valid pointer on success, otherwise NULL. since you are going to post next revision, kernel-doc requires "Return:" section (singular form) https://docs.kernel.org/doc-guide/kernel-doc.html#function-documentation for new code we should strive to fulfil the requirement (or piss-off someone powerful enough to change the requirement ;)) [snip]
Tue, Nov 28, 2023 at 01:30:51PM CET, przemyslaw.kitszel@intel.com wrote: >On 11/23/23 19:15, Jiri Pirko wrote: >> From: Jiri Pirko <jiri@nvidia.com> >> >> Introduce a priv pointer into struct netlink_sock. Use it to store a per >> socket xarray that contains family->id indexed priv pointer storage. >> Note I used xarray instead of suggested linked list as it is more >> convenient, without need to have a container struct that would >> contain struct list_head item. >> >> Introduce genl_sk_priv_store() to store the priv pointer. >> Introduce genl_sk_priv_get() to obtain the priv pointer under RCU >> read lock. >> >> Assume that kfree() is good for free of privs for now, as the only user >> introduced by the follow-up patch (devlink) will use kzalloc() for the >> allocation of the memory of the stored pointer. If later on >> this needs to be made custom, a callback is going to be needed. >> Until then (if ever), do this in a simple way. >> >> Signed-off-by: Jiri Pirko <jiri@nvidia.com> >> --- >> v3->v4: >> - new patch >> --- >> include/net/genetlink.h | 3 ++ >> net/netlink/af_netlink.h | 1 + >> net/netlink/genetlink.c | 98 ++++++++++++++++++++++++++++++++++++++++ >> 3 files changed, 102 insertions(+) >> >> diff --git a/include/net/genetlink.h b/include/net/genetlink.h >> index e18a4c0d69ee..66c1e50415e0 100644 >> --- a/include/net/genetlink.h >> +++ b/include/net/genetlink.h >> @@ -300,6 +300,9 @@ int genl_register_family(struct genl_family *family); >> int genl_unregister_family(const struct genl_family *family); >> void genl_notify(const struct genl_family *family, struct sk_buff *skb, >> struct genl_info *info, u32 group, gfp_t flags); >> +void *genl_sk_priv_get(struct sock *sk, struct genl_family *family); >> +void *genl_sk_priv_store(struct sock *sk, struct genl_family *family, >> + void *priv); >> void *genlmsg_put(struct sk_buff *skb, u32 portid, u32 seq, >> const struct genl_family *family, int flags, u8 cmd); >> diff --git a/net/netlink/af_netlink.h b/net/netlink/af_netlink.h >> index 2145979b9986..5d96135a4cf3 100644 >> --- a/net/netlink/af_netlink.h >> +++ b/net/netlink/af_netlink.h >> @@ -51,6 +51,7 @@ struct netlink_sock { >> struct rhash_head node; >> struct rcu_head rcu; >> struct work_struct work; >> + void __rcu *priv; >> }; >> static inline struct netlink_sock *nlk_sk(struct sock *sk) >> diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c >> index 92ef5ed2e7b0..aae5e63fa50b 100644 >> --- a/net/netlink/genetlink.c >> +++ b/net/netlink/genetlink.c >> @@ -21,6 +21,7 @@ >> #include <linux/idr.h> >> #include <net/sock.h> >> #include <net/genetlink.h> >> +#include "af_netlink.h" >> static DEFINE_MUTEX(genl_mutex); /* serialization of message processing */ >> static DECLARE_RWSEM(cb_lock); >> @@ -1699,12 +1700,109 @@ static int genl_bind(struct net *net, int group) >> return ret; >> } >> +struct genl_sk_ctx { >> + struct xarray family_privs; >> +}; >> + >> +static struct genl_sk_ctx *genl_sk_ctx_alloc(void) >> +{ >> + struct genl_sk_ctx *ctx; >> + >> + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL); >> + if (!ctx) >> + return NULL; >> + xa_init_flags(&ctx->family_privs, XA_FLAGS_ALLOC); >> + return ctx; >> +} >> + >> +static void genl_sk_ctx_free(struct genl_sk_ctx *ctx) >> +{ >> + unsigned long family_id; >> + void *priv; >> + >> + xa_for_each(&ctx->family_privs, family_id, priv) { >> + xa_erase(&ctx->family_privs, family_id); >> + kfree(priv); >> + } >> + xa_destroy(&ctx->family_privs); >> + kfree(ctx); >> +} >> + >> +/** >> + * genl_sk_priv_get - Get per-socket private pointer for family >> + * >> + * @sk: socket >> + * @family: family >> + * >> + * Lookup a private pointer stored per-socket by a specified >> + * Generic netlink family. >> + * >> + * Caller should make sure this is called in RCU read locked section. >> + * >> + * Returns: valid pointer on success, otherwise NULL. > >since you are going to post next revision, > >kernel-doc requires "Return:" section (singular form) >https://docs.kernel.org/doc-guide/kernel-doc.html#function-documentation > >for new code we should strive to fulfil the requirement >(or piss-off someone powerful enough to change the requirement ;)) Okay, will fix. I just thought this is okay when scripts/kernel-doc is happy. > > > >[snip]
On Tue, 28 Nov 2023 09:25:21 +0100 Jiri Pirko wrote: > >Put the xarray pointer here directly. Plus a lock to protect the init. > > Okay, just to make this clear. You want me to have: > struct xarray __rcu *family_privs; > > in struct netlink_sock, correct? > > > Why I need a lock? If I read things correctly, skbs are processed in > serial over one sock. Since this is per-sock, should be safe. Okay, then add an assertion that the socket lock is held, at least. Also, is the socket lock not held yet when the filtering happens? Maybe the __rcu annotation isn't necessary then either? > >The size of the per-family struct should be in family definition, > >allocation should happen on first get automatically. Family definition > > Yes, I thought about that. But I decided to do this lockless, allocating > new priv every time the user sets the filter and replace the xarray item > so it could be accessed in rcu read section during notification > processing. > > What you suggest requires some lock to protect the memory being changed > during filter set and suring accessing in in notify. But okay, > if you insist. Not necessarily, you can have a helper which doesn't allocate, too. What I'm saying is that the common case for ops will be to access the state and allocate if it doesn't exist. How about genl_sk_family_priv() and genl_sk_has_family_priv() ? > >should also hold a callback to how the data is going to be freed. > > If it is alloceted automatically, why is it needed? Because priv may be a complex type which has member that need individual fields to be destroyed (in fullness of time we also need a constructor which can init things like list_head, but we can defer that). I'm guessing in your case the priv will look like this: struct devlink_sk_priv { const char *nft_fltr_instance_name; }; static void devlink_sk_priv_free(void *ptr) { struct devlink_sk_priv *priv = ptr; kfree(priv->nft_fltr_instance_name); }
Tue, Nov 28, 2023 at 04:11:16PM CET, kuba@kernel.org wrote: >On Tue, 28 Nov 2023 09:25:21 +0100 Jiri Pirko wrote: >> >Put the xarray pointer here directly. Plus a lock to protect the init. >> >> Okay, just to make this clear. You want me to have: >> struct xarray __rcu *family_privs; >> >> in struct netlink_sock, correct? >> >> >> Why I need a lock? If I read things correctly, skbs are processed in >> serial over one sock. Since this is per-sock, should be safe. > >Okay, then add an assertion that the socket lock is held, at least. No socket lock, but I assumed revcmsg could be called only in parallel. But I guess that with multiple threads, this assumption is broken. Okay, will add a spin lock. >Also, is the socket lock not held yet when the filtering happens? Nope. >Maybe the __rcu annotation isn't necessary then either? > >> >The size of the per-family struct should be in family definition, >> >allocation should happen on first get automatically. Family definition >> >> Yes, I thought about that. But I decided to do this lockless, allocating >> new priv every time the user sets the filter and replace the xarray item >> so it could be accessed in rcu read section during notification >> processing. >> >> What you suggest requires some lock to protect the memory being changed >> during filter set and suring accessing in in notify. But okay, >> if you insist. > >Not necessarily, you can have a helper which doesn't allocate, too. >What I'm saying is that the common case for ops will be to access >the state and allocate if it doesn't exist. > >How about genl_sk_family_priv() and genl_sk_has_family_priv() ? My point is, with what you suggest, it will look something like this: 1) user does DEVLINK_CMD_NOTIFY_FILTER_SET 2) devlink calls into genetlink code for a sk_priv and inserts in xa_array 3) genetlink allocates sk_priv and returns back 4) devlink fills-up the sk_priv 5) user does DEVLINK_CMD_NOTIFY_FILTER_SET, again 6) devlink calls into genetlink code for a sk_priv 7) genetlink returns already exising sk_priv found in xa_array 8) devlink fills-up the sk_priv Now the notification thread, sees sk_priv zeroed between 3) and 4) and inconsistent during 4) and 8) I originally solved that by rcu, DEVLINK_CMD_NOTIFY_FILTER_SET code always allocates and flips the rcu pointer. Notification thread always sees sk_priv consistent. If you want to allocate sk_priv in genetlink code once, hard to use the rcu mechanism and have to protect the sk_priv memory by a lock. What am I missing? > >> >should also hold a callback to how the data is going to be freed. >> >> If it is alloceted automatically, why is it needed? > >Because priv may be a complex type which has member that need >individual fields to be destroyed (in fullness of time we also >need a constructor which can init things like list_head, but >we can defer that). > >I'm guessing in your case the priv will look like this: > >struct devlink_sk_priv { > const char *nft_fltr_instance_name; >}; > >static void devlink_sk_priv_free(void *ptr) >{ > struct devlink_sk_priv *priv = ptr; > > kfree(priv->nft_fltr_instance_name); >} If genetlink code does the allocation, it should know how to free. Does not make sense to pass destructor to genetlink code to free memory it actually allocated :/ If devlink does the allocation, this callback makes sense. I was thinking about having it, but decided kfree is okay for now and destructor could be always introduced if needed. Quoting the patch description: Assume that kfree() is good for free of privs for now, as the only user introduced by the follow-up patch (devlink) will use kzalloc() for the allocation of the memory of the stored pointer. If later on this needs to be made custom, a callback is going to be needed. Until then (if ever), do this in a simple way.
On Tue, Nov 28, 2023 at 01:30:51PM +0100, Przemek Kitszel wrote: > On 11/23/23 19:15, Jiri Pirko wrote: ... > > + * Returns: valid pointer on success, otherwise NULL. > > since you are going to post next revision, > > kernel-doc requires "Return:" section (singular form) > https://docs.kernel.org/doc-guide/kernel-doc.html#function-documentation > > for new code we should strive to fulfil the requirement > (or piss-off someone powerful enough to change the requirement ;)) Interestingly that the script accepts plural for a few keywords. Is it documented somewhere as deprecated?
On Tue, 28 Nov 2023 17:05:48 +0100 Jiri Pirko wrote: > >Not necessarily, you can have a helper which doesn't allocate, too. > >What I'm saying is that the common case for ops will be to access > >the state and allocate if it doesn't exist. > > > >How about genl_sk_family_priv() and genl_sk_has_family_priv() ? > > My point is, with what you suggest, it will look something like this: > > 1) user does DEVLINK_CMD_NOTIFY_FILTER_SET > 2) devlink calls into genetlink code for a sk_priv and inserts in xa_array > 3) genetlink allocates sk_priv and returns back > 4) devlink fills-up the sk_priv > > 5) user does DEVLINK_CMD_NOTIFY_FILTER_SET, again > 6) devlink calls into genetlink code for a sk_priv > 7) genetlink returns already exising sk_priv found in xa_array > 8) devlink fills-up the sk_priv > > Now the notification thread, sees sk_priv zeroed between 3) and 4) > and inconsistent during 4) and 8) > > I originally solved that by rcu, DEVLINK_CMD_NOTIFY_FILTER_SET > code always allocates and flips the rcu pointer. Notification thread > always sees sk_priv consistent. > > If you want to allocate sk_priv in genetlink code once, hard to use > the rcu mechanism and have to protect the sk_priv memory by a lock. No, you can do exact same thing, just instead of putting the string directly into the xarray you put a struct which points to the string. > What am I missing? The fact that someone in the future may want to add another devlink priv field, and if the state is basically a pointer to a string, with complicated lifetime, they will have to suffer undoing that. > >> If it is alloceted automatically, why is it needed? > > > >Because priv may be a complex type which has member that need > >individual fields to be destroyed (in fullness of time we also > >need a constructor which can init things like list_head, but > >we can defer that). > > > >I'm guessing in your case the priv will look like this: > > > >struct devlink_sk_priv { > > const char *nft_fltr_instance_name; > >}; > > > >static void devlink_sk_priv_free(void *ptr) > >{ > > struct devlink_sk_priv *priv = ptr; > > > > kfree(priv->nft_fltr_instance_name); > >} > > If genetlink code does the allocation, it should know how to free. > Does not make sense to pass destructor to genetlink code to free memory > it actually allocated :/ > > If devlink does the allocation, this callback makes sense. I was > thinking about having it, but decided kfree is okay for now and > destructor could be always introduced if needed. Did you read the code snippet above? Core still does the kfree of the container (struct devlink_sk_priv). But what's inside the container struct (string pointer) has to be handled by the destructor. Feels like you focus on how to prove me wrong more than on understanding what I'm saying :|
On 11/28/2023 8:18 AM, Andy Shevchenko wrote: > On Tue, Nov 28, 2023 at 01:30:51PM +0100, Przemek Kitszel wrote: >> On 11/23/23 19:15, Jiri Pirko wrote: > > ... > >>> + * Returns: valid pointer on success, otherwise NULL. >> >> since you are going to post next revision, >> >> kernel-doc requires "Return:" section (singular form) >> https://docs.kernel.org/doc-guide/kernel-doc.html#function-documentation >> >> for new code we should strive to fulfil the requirement >> (or piss-off someone powerful enough to change the requirement ;)) > > Interestingly that the script accepts plural for a few keywords. > Is it documented somewhere as deprecated? > I also checked the source: $git grep --count -h 'Returns:' | awk '{ sum += $1 } END { print sum }'3646 $git grep --count -h 'Return:' | awk '{ sum += $1 } END { print sum }' 10907 So there is a big favor towards using 'Return:', but there are still about 1/3 as many uses of 'Returns:'. I dug into kernel-doc and it looks like it has accepted both "Return" and "Returns" since the first time that section headers were limited: f624adef3d0b ("kernel-doc: limit the "section header:" detection to a select few") I don't see any documentation on 'Returns;' being deprecated, but the documentation does only call out 'Return:'. Thanks, Jake
On Tue, Nov 28, 2023 at 11:59:05AM -0800, Jacob Keller wrote: > On 11/28/2023 8:18 AM, Andy Shevchenko wrote: > > On Tue, Nov 28, 2023 at 01:30:51PM +0100, Przemek Kitszel wrote: > >> On 11/23/23 19:15, Jiri Pirko wrote: ... > >>> + * Returns: valid pointer on success, otherwise NULL. > >> > >> since you are going to post next revision, > >> > >> kernel-doc requires "Return:" section (singular form) > >> https://docs.kernel.org/doc-guide/kernel-doc.html#function-documentation > >> > >> for new code we should strive to fulfil the requirement > >> (or piss-off someone powerful enough to change the requirement ;)) > > > > Interestingly that the script accepts plural for a few keywords. > > Is it documented somewhere as deprecated? > > I also checked the source: > > $git grep --count -h 'Returns:' | awk '{ sum += $1 } END { print sum }' > 3646 > $git grep --count -h 'Return:' | awk '{ sum += $1 } END { print sum }' > 10907 > > So there is a big favor towards using 'Return:', but there are still > about 1/3 as many uses of 'Returns:'. > > I dug into kernel-doc and it looks like it has accepted both "Return" > and "Returns" since the first time that section headers were limited: > f624adef3d0b ("kernel-doc: limit the "section header:" detection to a > select few") > > I don't see any documentation on 'Returns;' being deprecated, but the > documentation does only call out 'Return:'. Then I would amend documentation followed by amending scripts, etc. Before that it's unclear to me that contributor must use Return:. It sounds like similar collision to 80 vs. 100 (former in documentation, latter in the checkpatch). Of course, there might be sunsystem rules, but again, has to be documented. Right?
Tue, Nov 28, 2023 at 05:36:05PM CET, kuba@kernel.org wrote: >On Tue, 28 Nov 2023 17:05:48 +0100 Jiri Pirko wrote: >> >Not necessarily, you can have a helper which doesn't allocate, too. >> >What I'm saying is that the common case for ops will be to access >> >the state and allocate if it doesn't exist. >> > >> >How about genl_sk_family_priv() and genl_sk_has_family_priv() ? >> >> My point is, with what you suggest, it will look something like this: >> >> 1) user does DEVLINK_CMD_NOTIFY_FILTER_SET >> 2) devlink calls into genetlink code for a sk_priv and inserts in xa_array >> 3) genetlink allocates sk_priv and returns back >> 4) devlink fills-up the sk_priv >> >> 5) user does DEVLINK_CMD_NOTIFY_FILTER_SET, again >> 6) devlink calls into genetlink code for a sk_priv >> 7) genetlink returns already exising sk_priv found in xa_array >> 8) devlink fills-up the sk_priv >> >> Now the notification thread, sees sk_priv zeroed between 3) and 4) >> and inconsistent during 4) and 8) >> >> I originally solved that by rcu, DEVLINK_CMD_NOTIFY_FILTER_SET >> code always allocates and flips the rcu pointer. Notification thread >> always sees sk_priv consistent. >> >> If you want to allocate sk_priv in genetlink code once, hard to use >> the rcu mechanism and have to protect the sk_priv memory by a lock. > >No, you can do exact same thing, just instead of putting the string >directly into the xarray you put a struct which points to the string. I'm lost. What "string" are you talking about exactly? I'm not putting any string to xarray. In the existing implementation, I have following struct: struct devlink_obj_desc { struct rcu_head rcu; const char *bus_name; const char *dev_name; unsigned int port_index; bool port_index_valid; long data[]; }; This is the struct put pointer to into the xarray. Pointer to this struct is dereferenced under rcu in notification code and the struct is freed by kfree_rcu(). > >> What am I missing? > >The fact that someone in the future may want to add another devlink >priv field, and if the state is basically a pointer to a string, "the state is basically a pointer to a string". I don't follow what you mean by this :/ >with complicated lifetime, they will have to suffer undoing that. > >> >> If it is alloceted automatically, why is it needed? >> > >> >Because priv may be a complex type which has member that need >> >individual fields to be destroyed (in fullness of time we also >> >need a constructor which can init things like list_head, but >> >we can defer that). >> > >> >I'm guessing in your case the priv will look like this: >> > >> >struct devlink_sk_priv { >> > const char *nft_fltr_instance_name; >> >}; >> > >> >static void devlink_sk_priv_free(void *ptr) >> >{ >> > struct devlink_sk_priv *priv = ptr; >> > >> > kfree(priv->nft_fltr_instance_name); >> >} >> >> If genetlink code does the allocation, it should know how to free. >> Does not make sense to pass destructor to genetlink code to free memory >> it actually allocated :/ >> >> If devlink does the allocation, this callback makes sense. I was >> thinking about having it, but decided kfree is okay for now and >> destructor could be always introduced if needed. > >Did you read the code snippet above? Sure. > >Core still does the kfree of the container (struct devlink_sk_priv). >But what's inside the container struct (string pointer) has to be >handled by the destructor. > >Feels like you focus on how to prove me wrong more than on >understanding what I'm saying :| Not at all, I have no reason for it. I just want to get my job done and I am having very hard time to understand what you want exactly.
On Wed, 29 Nov 2023 14:59:31 +0100 Jiri Pirko wrote: > Tue, Nov 28, 2023 at 05:36:05PM CET, kuba@kernel.org wrote: > >No, you can do exact same thing, just instead of putting the string > >directly into the xarray you put a struct which points to the string. > > I'm lost. What "string" are you talking about exactly? I'm not putting > any string to xarray. > > In the existing implementation, I have following struct: > struct devlink_obj_desc { > struct rcu_head rcu; > const char *bus_name; > const char *dev_name; > unsigned int port_index; > bool port_index_valid; > long data[]; > }; > > This is the struct put pointer to into the xarray. Pointer to this > struct is dereferenced under rcu in notification code and the struct > is freed by kfree_rcu(). Sorry I was looking at patch 8 which has only: +struct devlink_obj_desc { + struct rcu_head rcu; + const char *bus_name; + const char *dev_name; + long data[]; +}; that's basically a string and an rcu_head, that's what I meant. > >Core still does the kfree of the container (struct devlink_sk_priv). > >But what's inside the container struct (string pointer) has to be > >handled by the destructor. > > > >Feels like you focus on how to prove me wrong more than on > >understanding what I'm saying :| > > Not at all, I have no reason for it. I just want to get my job done > and I am having very hard time to understand what you want exactly. Sockets may want to hold state for more than filtering. Try to look past your singular use case.
Wed, Nov 29, 2023 at 04:01:57PM CET, kuba@kernel.org wrote: >On Wed, 29 Nov 2023 14:59:31 +0100 Jiri Pirko wrote: >> Tue, Nov 28, 2023 at 05:36:05PM CET, kuba@kernel.org wrote: >> >No, you can do exact same thing, just instead of putting the string >> >directly into the xarray you put a struct which points to the string. >> >> I'm lost. What "string" are you talking about exactly? I'm not putting >> any string to xarray. >> >> In the existing implementation, I have following struct: >> struct devlink_obj_desc { >> struct rcu_head rcu; >> const char *bus_name; >> const char *dev_name; >> unsigned int port_index; >> bool port_index_valid; >> long data[]; >> }; >> >> This is the struct put pointer to into the xarray. Pointer to this >> struct is dereferenced under rcu in notification code and the struct >> is freed by kfree_rcu(). > >Sorry I was looking at patch 8 which has only: > >+struct devlink_obj_desc { >+ struct rcu_head rcu; >+ const char *bus_name; >+ const char *dev_name; >+ long data[]; >+}; > >that's basically a string and an rcu_head, that's what I meant. > >> >Core still does the kfree of the container (struct devlink_sk_priv). >> >But what's inside the container struct (string pointer) has to be >> >handled by the destructor. >> > >> >Feels like you focus on how to prove me wrong more than on >> >understanding what I'm saying :| >> >> Not at all, I have no reason for it. I just want to get my job done >> and I am having very hard time to understand what you want exactly. > >Sockets may want to hold state for more than filtering. >Try to look past your singular use case. Okay, I'll try to make another V. But please don't be mad I misunderstood something if I do :)
On 11/28/2023 12:06 PM, Andy Shevchenko wrote: > On Tue, Nov 28, 2023 at 11:59:05AM -0800, Jacob Keller wrote: >> On 11/28/2023 8:18 AM, Andy Shevchenko wrote: >>> On Tue, Nov 28, 2023 at 01:30:51PM +0100, Przemek Kitszel wrote: >>>> On 11/23/23 19:15, Jiri Pirko wrote: > > ... > >>>>> + * Returns: valid pointer on success, otherwise NULL. >>>> >>>> since you are going to post next revision, >>>> >>>> kernel-doc requires "Return:" section (singular form) >>>> https://docs.kernel.org/doc-guide/kernel-doc.html#function-documentation >>>> >>>> for new code we should strive to fulfil the requirement >>>> (or piss-off someone powerful enough to change the requirement ;)) >>> >>> Interestingly that the script accepts plural for a few keywords. >>> Is it documented somewhere as deprecated? >> >> I also checked the source: >> >> $git grep --count -h 'Returns:' | awk '{ sum += $1 } END { print sum }' >> 3646 >> $git grep --count -h 'Return:' | awk '{ sum += $1 } END { print sum }' >> 10907 >> >> So there is a big favor towards using 'Return:', but there are still >> about 1/3 as many uses of 'Returns:'. >> >> I dug into kernel-doc and it looks like it has accepted both "Return" >> and "Returns" since the first time that section headers were limited: >> f624adef3d0b ("kernel-doc: limit the "section header:" detection to a >> select few") >> >> I don't see any documentation on 'Returns;' being deprecated, but the >> documentation does only call out 'Return:'. > > Then I would amend documentation followed by amending scripts, etc. > Before that it's unclear to me that contributor must use Return:. It > sounds like similar collision to 80 vs. 100 (former in documentation, > latter in the checkpatch). > > Of course, there might be sunsystem rules, but again, has to be documented. > Right? > Documenting it seems reasonable to me. Thanks, Jake
diff --git a/include/net/genetlink.h b/include/net/genetlink.h index e18a4c0d69ee..66c1e50415e0 100644 --- a/include/net/genetlink.h +++ b/include/net/genetlink.h @@ -300,6 +300,9 @@ int genl_register_family(struct genl_family *family); int genl_unregister_family(const struct genl_family *family); void genl_notify(const struct genl_family *family, struct sk_buff *skb, struct genl_info *info, u32 group, gfp_t flags); +void *genl_sk_priv_get(struct sock *sk, struct genl_family *family); +void *genl_sk_priv_store(struct sock *sk, struct genl_family *family, + void *priv); void *genlmsg_put(struct sk_buff *skb, u32 portid, u32 seq, const struct genl_family *family, int flags, u8 cmd); diff --git a/net/netlink/af_netlink.h b/net/netlink/af_netlink.h index 2145979b9986..5d96135a4cf3 100644 --- a/net/netlink/af_netlink.h +++ b/net/netlink/af_netlink.h @@ -51,6 +51,7 @@ struct netlink_sock { struct rhash_head node; struct rcu_head rcu; struct work_struct work; + void __rcu *priv; }; static inline struct netlink_sock *nlk_sk(struct sock *sk) diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c index 92ef5ed2e7b0..aae5e63fa50b 100644 --- a/net/netlink/genetlink.c +++ b/net/netlink/genetlink.c @@ -21,6 +21,7 @@ #include <linux/idr.h> #include <net/sock.h> #include <net/genetlink.h> +#include "af_netlink.h" static DEFINE_MUTEX(genl_mutex); /* serialization of message processing */ static DECLARE_RWSEM(cb_lock); @@ -1699,12 +1700,109 @@ static int genl_bind(struct net *net, int group) return ret; } +struct genl_sk_ctx { + struct xarray family_privs; +}; + +static struct genl_sk_ctx *genl_sk_ctx_alloc(void) +{ + struct genl_sk_ctx *ctx; + + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL); + if (!ctx) + return NULL; + xa_init_flags(&ctx->family_privs, XA_FLAGS_ALLOC); + return ctx; +} + +static void genl_sk_ctx_free(struct genl_sk_ctx *ctx) +{ + unsigned long family_id; + void *priv; + + xa_for_each(&ctx->family_privs, family_id, priv) { + xa_erase(&ctx->family_privs, family_id); + kfree(priv); + } + xa_destroy(&ctx->family_privs); + kfree(ctx); +} + +/** + * genl_sk_priv_get - Get per-socket private pointer for family + * + * @sk: socket + * @family: family + * + * Lookup a private pointer stored per-socket by a specified + * Generic netlink family. + * + * Caller should make sure this is called in RCU read locked section. + * + * Returns: valid pointer on success, otherwise NULL. + */ +void *genl_sk_priv_get(struct sock *sk, struct genl_family *family) +{ + struct genl_sk_ctx *ctx; + + ctx = rcu_dereference(nlk_sk(sk)->priv); + if (!ctx) + return NULL; + return xa_load(&ctx->family_privs, family->id); +} + +/** + * genl_sk_priv_store - Store per-socket private pointer for family + * + * @sk: socket + * @family: family + * @priv: private pointer + * + * Store a private pointer per-socket for a specified + * Generic netlink family. + * + * Caller has to make sure this is not called in parallel multiple times + * for the same sock and also in parallel to genl_release() for the same sock. + * + * Returns: previously stored private pointer for the family (could be NULL) + * on success, otherwise negative error value encoded by ERR_PTR(). + */ +void *genl_sk_priv_store(struct sock *sk, struct genl_family *family, + void *priv) +{ + struct genl_sk_ctx *ctx; + void *old_priv; + + ctx = rcu_dereference_raw(nlk_sk(sk)->priv); + if (!ctx) { + ctx = genl_sk_ctx_alloc(); + if (!ctx) + return ERR_PTR(-ENOMEM); + rcu_assign_pointer(nlk_sk(sk)->priv, ctx); + } + + old_priv = xa_store(&ctx->family_privs, family->id, priv, GFP_KERNEL); + if (xa_is_err(old_priv)) + return ERR_PTR(xa_err(old_priv)); + return old_priv; +} + +static void genl_release(struct sock *sk, unsigned long *groups) +{ + struct genl_sk_ctx *ctx; + + ctx = rcu_dereference_raw(nlk_sk(sk)->priv); + if (ctx) + genl_sk_ctx_free(ctx); +} + static int __net_init genl_pernet_init(struct net *net) { struct netlink_kernel_cfg cfg = { .input = genl_rcv, .flags = NL_CFG_F_NONROOT_RECV, .bind = genl_bind, + .release = genl_release, }; /* we'll bump the group number right afterwards */