Message ID | 20230201150815.409582-5-urezki@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 74ad5a1bf0685861d0a3d2871148c9bab642a00a |
Headers | show |
Series | Rename k[v]free_rcu() single argument to k[v]free_rcu_mightsleep() | expand |
> The kvfree_rcu()'s single argument name is deprecated therefore > rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly > underline that it is for sleepable contexts. > > Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> > Could you please add you reviwed-by or Acked-by tags so we can bring our series with renaming for the next merge window? Thanks! -- Uladzisslau Rezki
On Thu, 9 Mar 2023 14:45:21 +0100 Uladzislau Rezki <urezki@gmail.com> wrote: > > The kvfree_rcu()'s single argument name is deprecated therefore > > rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly > > underline that it is for sleepable contexts. > > > > Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> > > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> > > > Could you please add you reviwed-by or Acked-by tags so we can bring > our series with renaming for the next merge window? I don't know. Perhaps we should just apply this patch and not worry about sleeping and whatnot. -- Steve diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c index 04f0fdae19a1..5de945a8f61d 100644 --- a/kernel/trace/trace_osnoise.c +++ b/kernel/trace/trace_osnoise.c @@ -76,6 +76,7 @@ static unsigned long osnoise_options = OSN_DEFAULT_OPTIONS; struct osnoise_instance { struct list_head list; struct trace_array *tr; + struct rcu_head rcu; }; static struct list_head osnoise_instances; @@ -159,7 +160,7 @@ static void osnoise_unregister_instance(struct trace_array *tr) if (!found) return; - kvfree_rcu(inst); + kvfree_rcu(inst, rcu); } /* diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c index 20d0c4a97633..ef5fafb40c76 100644 --- a/kernel/trace/trace_probe.c +++ b/kernel/trace/trace_probe.c @@ -1172,7 +1172,7 @@ int trace_probe_remove_file(struct trace_probe *tp, return -ENOENT; list_del_rcu(&link->list); - kvfree_rcu(link); + kvfree_rcu(link, rcu); if (list_empty(&tp->event->files)) trace_probe_clear_flag(tp, TP_FLAG_TRACE); diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h index ef8ed3b65d05..e6037752dcf0 100644 --- a/kernel/trace/trace_probe.h +++ b/kernel/trace/trace_probe.h @@ -256,6 +256,7 @@ struct trace_probe { struct event_file_link { struct trace_event_file *file; struct list_head list; + struct rcu_head rcu; }; static inline bool trace_probe_test_flag(struct trace_probe *tp,
On 3/15/23 4:36 PM, Steven Rostedt wrote: > On Thu, 9 Mar 2023 14:45:21 +0100 > Uladzislau Rezki <urezki@gmail.com> wrote: > >>> The kvfree_rcu()'s single argument name is deprecated therefore >>> rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly >>> underline that it is for sleepable contexts. >>> >>> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> >>> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> >>> >> Could you please add you reviwed-by or Acked-by tags so we can bring >> our series with renaming for the next merge window? > > I don't know. Perhaps we should just apply this patch and not worry about > sleeping and whatnot. That's a cop out, just removing the one case you care about. Fact is the naming is awful, and the 1/2 argument thing is making it worse. If a big change is warranted, why not do it right and ACTUALLY get it right?
On Wed, Mar 15, 2023 at 05:19:18PM -0600, Jens Axboe wrote: > On 3/15/23 4:36 PM, Steven Rostedt wrote: > > On Thu, 9 Mar 2023 14:45:21 +0100 > > Uladzislau Rezki <urezki@gmail.com> wrote: > > > >>> The kvfree_rcu()'s single argument name is deprecated therefore > >>> rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly > >>> underline that it is for sleepable contexts. > >>> > >>> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> > >>> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> > >>> > >> Could you please add you reviwed-by or Acked-by tags so we can bring > >> our series with renaming for the next merge window? > > > > I don't know. Perhaps we should just apply this patch and not worry about > > sleeping and whatnot. That does work, and I am guessing that the size increase is not a big problem for you there. > That's a cop out, just removing the one case you care about. Fact is > the naming is awful, and the 1/2 argument thing is making it worse. > If a big change is warranted, why not do it right and ACTUALLY > get it right? You both do realize that the kvfree_rcu_mightsleep() definition is already in mainline, right? Anyway, to sum up, kvfree_rcu_mightsleep()--or whatever the entire community eventually decides to name it--can do any of the following: 1. Put the pointer into an already allocated array of pointers. 2. Allocate a new array of pointers, have the allocation succeed without sleeping, then put the pointer into an already allocated array of pointers. 3. Allocate a new array of pointers, have the allocation succeed after sleeping, then put the pointer into an already allocated array of pointers. 4. Attempt to allocate a new array of pointers, have the allocation fail (presumably after sleeping), then invoke synchronize_rcu() directly. Too much fun! ;-) Thanx, Paul
On Wed, 15 Mar 2023 17:37:30 -0700 "Paul E. McKenney" <paulmck@kernel.org> wrote: > > That does work, and I am guessing that the size increase is not a big > problem for you there. Well, I was fine with it as long as it stayed in the headers, where ugliness is warmly welcomed. Just ask all the #ifdefs. > > > That's a cop out, just removing the one case you care about. Fact is > > the naming is awful, and the 1/2 argument thing is making it worse. > > If a big change is warranted, why not do it right and ACTUALLY > > get it right? > > You both do realize that the kvfree_rcu_mightsleep() definition is > already in mainline, right? > > Anyway, to sum up, kvfree_rcu_mightsleep()--or whatever the entire > community eventually decides to name it--can do any of the following: > > 1. Put the pointer into an already allocated array of pointers. > > 2. Allocate a new array of pointers, have the allocation succeed > without sleeping, then put the pointer into an already allocated > array of pointers. > > 3. Allocate a new array of pointers, have the allocation succeed > after sleeping, then put the pointer into an already allocated > array of pointers. > > 4. Attempt to allocate a new array of pointers, have the allocation > fail (presumably after sleeping), then invoke synchronize_rcu() > directly. > > Too much fun! ;-) > kvfree_rcu_kitchen_sink() ? kvfree_rcu_goldie_locks()? I honestly like the name "headless" as that perfectly describes the difference between kvfree_rcu(arg1, arg2) and kvfree_rcu(arg1). Whereas mightsleep() is confusing to me because it doesn't tell me why kvfree_rcu() has two args and kvfree_rcu_mightsleep() has only one. Usually, code that has two sleep variants is about limiting the functionality of the atomic friendly one. -- Steve
On Wed, Mar 15, 2023 at 10:23:23PM -0400, Steven Rostedt wrote: > On Wed, 15 Mar 2023 17:37:30 -0700 > "Paul E. McKenney" <paulmck@kernel.org> wrote: > > > > > That does work, and I am guessing that the size increase is not a big > > problem for you there. > > Well, I was fine with it as long as it stayed in the headers, where > ugliness is warmly welcomed. Just ask all the #ifdefs. > > > > > > That's a cop out, just removing the one case you care about. Fact is > > > the naming is awful, and the 1/2 argument thing is making it worse. > > > If a big change is warranted, why not do it right and ACTUALLY > > > get it right? > > > > You both do realize that the kvfree_rcu_mightsleep() definition is > > already in mainline, right? > > > > Anyway, to sum up, kvfree_rcu_mightsleep()--or whatever the entire > > community eventually decides to name it--can do any of the following: > > > > 1. Put the pointer into an already allocated array of pointers. > > > > 2. Allocate a new array of pointers, have the allocation succeed > > without sleeping, then put the pointer into an already allocated > > array of pointers. > > > > 3. Allocate a new array of pointers, have the allocation succeed > > after sleeping, then put the pointer into an already allocated > > array of pointers. > > > > 4. Attempt to allocate a new array of pointers, have the allocation > > fail (presumably after sleeping), then invoke synchronize_rcu() > > directly. > > > > Too much fun! ;-) > > > > kvfree_rcu_kitchen_sink() ? > > kvfree_rcu_goldie_locks()? > > I honestly like the name "headless" as that perfectly describes the > difference between kvfree_rcu(arg1, arg2) and kvfree_rcu(arg1). > > Whereas mightsleep() is confusing to me because it doesn't tell me why > kvfree_rcu() has two args and kvfree_rcu_mightsleep() has only one. > Usually, code that has two sleep variants is about limiting the > functionality of the atomic friendly one. kvfree_rcu_alloc_head()? kvfree_rcu_dynhead()? kvfree_rcu_gearhead()? kvfree_rcu_radiohead()? kvfree_rcu_getahead()? I don't know about you guys, but to me, kvfree_rcu_mightsleep() is sounding better and better by comparison... Thanx, Paul
On Wed, Mar 15, 2023 at 11:44 PM Paul E. McKenney <paulmck@kernel.org> wrote: > > On Wed, Mar 15, 2023 at 10:23:23PM -0400, Steven Rostedt wrote: > > On Wed, 15 Mar 2023 17:37:30 -0700 > > "Paul E. McKenney" <paulmck@kernel.org> wrote: > > > > > > > > That does work, and I am guessing that the size increase is not a big > > > problem for you there. > > > > Well, I was fine with it as long as it stayed in the headers, where > > ugliness is warmly welcomed. Just ask all the #ifdefs. > > > > > > > > > That's a cop out, just removing the one case you care about. Fact is > > > > the naming is awful, and the 1/2 argument thing is making it worse. > > > > If a big change is warranted, why not do it right and ACTUALLY > > > > get it right? > > > > > > You both do realize that the kvfree_rcu_mightsleep() definition is > > > already in mainline, right? > > > > > > Anyway, to sum up, kvfree_rcu_mightsleep()--or whatever the entire > > > community eventually decides to name it--can do any of the following: > > > > > > 1. Put the pointer into an already allocated array of pointers. > > > > > > 2. Allocate a new array of pointers, have the allocation succeed > > > without sleeping, then put the pointer into an already allocated > > > array of pointers. > > > > > > 3. Allocate a new array of pointers, have the allocation succeed > > > after sleeping, then put the pointer into an already allocated > > > array of pointers. > > > > > > 4. Attempt to allocate a new array of pointers, have the allocation > > > fail (presumably after sleeping), then invoke synchronize_rcu() > > > directly. > > > > > > Too much fun! ;-) > > > > > > > kvfree_rcu_kitchen_sink() ? > > > > kvfree_rcu_goldie_locks()? > > > > I honestly like the name "headless" as that perfectly describes the > > difference between kvfree_rcu(arg1, arg2) and kvfree_rcu(arg1). > > > > Whereas mightsleep() is confusing to me because it doesn't tell me why > > kvfree_rcu() has two args and kvfree_rcu_mightsleep() has only one. > > Usually, code that has two sleep variants is about limiting the > > functionality of the atomic friendly one. > > kvfree_rcu_alloc_head()? > kvfree_rcu_dynhead()? > kvfree_rcu_gearhead()? > kvfree_rcu_radiohead()? > kvfree_rcu_getahead()? > > I don't know about you guys, but to me, kvfree_rcu_mightsleep() is > sounding better and better by comparison... Indeed, and one could argue that "headless" sounds like something out of a horror movie ;-). Which of course does match the situation when the API is applied incorrectly. - Joel
> On Thu, 9 Mar 2023 14:45:21 +0100 > Uladzislau Rezki <urezki@gmail.com> wrote: > > > > The kvfree_rcu()'s single argument name is deprecated therefore > > > rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly > > > underline that it is for sleepable contexts. > > > > > > Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> > > > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> > > > > > Could you please add you reviwed-by or Acked-by tags so we can bring > > our series with renaming for the next merge window? > > I don't know. Perhaps we should just apply this patch and not worry about > sleeping and whatnot. > > -- Steve > > diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c > index 04f0fdae19a1..5de945a8f61d 100644 > --- a/kernel/trace/trace_osnoise.c > +++ b/kernel/trace/trace_osnoise.c > @@ -76,6 +76,7 @@ static unsigned long osnoise_options = OSN_DEFAULT_OPTIONS; > struct osnoise_instance { > struct list_head list; > struct trace_array *tr; > + struct rcu_head rcu; > }; > > static struct list_head osnoise_instances; > @@ -159,7 +160,7 @@ static void osnoise_unregister_instance(struct trace_array *tr) > if (!found) > return; > > - kvfree_rcu(inst); > + kvfree_rcu(inst, rcu); > } > > /* > diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c > index 20d0c4a97633..ef5fafb40c76 100644 > --- a/kernel/trace/trace_probe.c > +++ b/kernel/trace/trace_probe.c > @@ -1172,7 +1172,7 @@ int trace_probe_remove_file(struct trace_probe *tp, > return -ENOENT; > > list_del_rcu(&link->list); > - kvfree_rcu(link); > + kvfree_rcu(link, rcu); > > if (list_empty(&tp->event->files)) > trace_probe_clear_flag(tp, TP_FLAG_TRACE); > diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h > index ef8ed3b65d05..e6037752dcf0 100644 > --- a/kernel/trace/trace_probe.h > +++ b/kernel/trace/trace_probe.h > @@ -256,6 +256,7 @@ struct trace_probe { > struct event_file_link { > struct trace_event_file *file; > struct list_head list; > + struct rcu_head rcu; > }; > > static inline bool trace_probe_test_flag(struct trace_probe *tp, > struct foo_a { int a; int b; }; your obj size is 8 byte struct foo_b { struct rcu_head rcu; int a; int b; }; now it becomes 16 + 8 = 24 bytes. In reallity a foo_b object will be 32 bytes since there is no slab for 24 bytes: <snip> kmalloc-32 19840 19840 32 128 1 : tunables 0 0 0 : slabdata 155 155 0 kmalloc-16 28857 28928 16 256 1 : tunables 0 0 0 : slabdata 113 113 0 kmalloc-8 37376 37376 8 512 1 : tunables 0 0 0 : slabdata 73 73 0 <snip> if we allocate 512 objects of foo_a it would be 4096 bytes in case of foo_b it is 24 * 512 = 12228 bytes. single argument will give you 4096 + 512 * 8 = 8192 bytes int terms of memory consumtion. And double argument will not give you better performance comparing with a single argument. So it depends on what you want to achieve by that patch. -- Uladzislau Rezki
On Thu, 16 Mar 2023 00:16:39 -0400 Joel Fernandes <joel@joelfernandes.org> wrote: > Indeed, and one could argue that "headless" sounds like something out > of a horror movie ;-). Which of course does match the situation when > the API is applied incorrectly. Well, "headless" is a common term in IT. https://en.wikipedia.org/wiki/Headless_software We could be specific to what horror movie/story, and call it: kvfree_rcu_sleepy_hollow() Which will imply both headless *and* might_sleep! -- Steve
On Thu, 16 Mar 2023 09:16:37 +0100 Uladzislau Rezki <urezki@gmail.com> wrote: > > diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h > > index ef8ed3b65d05..e6037752dcf0 100644 > > --- a/kernel/trace/trace_probe.h > > +++ b/kernel/trace/trace_probe.h > > @@ -256,6 +256,7 @@ struct trace_probe { > > struct event_file_link { > > struct trace_event_file *file; > > struct list_head list; > > + struct rcu_head rcu; > > }; > > > > static inline bool trace_probe_test_flag(struct trace_probe *tp, > > > struct foo_a { > int a; > int b; > }; Most machines today are 64 bits, even low end machines. struct foo_a { long long a; long long b; }; is more accurate. That's 16 bytes. Although it is more likely off because list_head is a double pointer. But let's just go with this, as the amount really doesn't matter here. > > your obj size is 8 byte > > struct foo_b { > struct rcu_head rcu; Isn't rcu_head defined as; struct callback_head { struct callback_head *next; void (*func)(struct callback_head *head); } __attribute__((aligned(sizeof(void *)))); #define rcu_head callback_head Which makes it 8 not 16 on 32 bit as well? > int a; > int b; > }; So it should be 8 + 8 = 16, on 32 bit and 16 + 16 = 32 on 64bit. > > now it becomes 16 + 8 = 24 bytes. In reallity a foo_b object > will be 32 bytes since there is no slab for 24 bytes: > > <snip> > kmalloc-32 19840 19840 32 128 1 : tunables 0 0 0 : slabdata 155 155 0 > kmalloc-16 28857 28928 16 256 1 : tunables 0 0 0 : slabdata 113 113 0 > kmalloc-8 37376 37376 8 512 1 : tunables 0 0 0 : slabdata 73 73 0 > <snip> > > if we allocate 512 objects of foo_a it would be 4096 bytes > in case of foo_b it is 24 * 512 = 12228 bytes. This is for probe events. We usually allocate 1, maybe 2. Oh, some may even allocate 100 to be crazy. But each probe event is in reality much larger (1K perhaps) as each one allocates dentry's, inodes, etc. So 8 or 16 bytes extra is still lost in the noise. > > single argument will give you 4096 + 512 * 8 = 8192 bytes > int terms of memory consumtion. If someone allocate 512 instances, that would be closer to a meg in size without this change. 8k is probably less than 1% > > And double argument will not give you better performance comparing > with a single argument. It will, because it will no longer have to allocate anything if need be. Note, when it doesn't allocate the system is probably mostly idle and we don't care about performance, but when it needs allocation, that's likely a time when performance is a bit more important. -- Steve
On Thu, Mar 16, 2023 at 08:14:24AM -0400, Steven Rostedt wrote: > On Thu, 16 Mar 2023 00:16:39 -0400 > Joel Fernandes <joel@joelfernandes.org> wrote: > > > Indeed, and one could argue that "headless" sounds like something out > > of a horror movie ;-). Which of course does match the situation when > > the API is applied incorrectly. > > Well, "headless" is a common term in IT. > > https://en.wikipedia.org/wiki/Headless_software Indeed it is. But RCU is incapable of headlessness, so in the kvfree_rcu_mightsleep() case, the rcu_head is dynamically allocated. > We could be specific to what horror movie/story, and call it: > > kvfree_rcu_sleepy_hollow() > > Which will imply both headless *and* might_sleep! Heh! That one is almost bad enough to be good! Almost! ;-) Thanx, Paul
On Thu, Mar 16, 2023 at 09:56:53AM -0400, Steven Rostedt wrote: > On Thu, 16 Mar 2023 09:16:37 +0100 > Uladzislau Rezki <urezki@gmail.com> wrote: > > > > diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h > > > index ef8ed3b65d05..e6037752dcf0 100644 > > > --- a/kernel/trace/trace_probe.h > > > +++ b/kernel/trace/trace_probe.h > > > @@ -256,6 +256,7 @@ struct trace_probe { > > > struct event_file_link { > > > struct trace_event_file *file; > > > struct list_head list; > > > + struct rcu_head rcu; > > > }; > > > > > > static inline bool trace_probe_test_flag(struct trace_probe *tp, > > > > > struct foo_a { > > int a; > > int b; > > }; > > Most machines today are 64 bits, even low end machines. > > struct foo_a { > long long a; > long long b; > }; > > is more accurate. That's 16 bytes. > > Although it is more likely off because list_head is a double pointer. But > let's just go with this, as the amount really doesn't matter here. > > > > > your obj size is 8 byte > > > > struct foo_b { > > struct rcu_head rcu; > > Isn't rcu_head defined as; > > struct callback_head { > struct callback_head *next; > void (*func)(struct callback_head *head); > } __attribute__((aligned(sizeof(void *)))); > #define rcu_head callback_head > > Which makes it 8 not 16 on 32 bit as well? > > > int a; > > int b; > > }; > > So it should be 8 + 8 = 16, on 32 bit and 16 + 16 = 32 on 64bit. > > > > > now it becomes 16 + 8 = 24 bytes. In reallity a foo_b object > > will be 32 bytes since there is no slab for 24 bytes: > > > > <snip> > > kmalloc-32 19840 19840 32 128 1 : tunables 0 0 0 : slabdata 155 155 0 > > kmalloc-16 28857 28928 16 256 1 : tunables 0 0 0 : slabdata 113 113 0 > > kmalloc-8 37376 37376 8 512 1 : tunables 0 0 0 : slabdata 73 73 0 > > <snip> > > > > if we allocate 512 objects of foo_a it would be 4096 bytes > > in case of foo_b it is 24 * 512 = 12228 bytes. > > This is for probe events. We usually allocate 1, maybe 2. Oh, some may even > allocate 100 to be crazy. But each probe event is in reality much larger > (1K perhaps) as each one allocates dentry's, inodes, etc. So 8 or 16 bytes > extra is still lost in the noise. > > > > > single argument will give you 4096 + 512 * 8 = 8192 bytes > > int terms of memory consumtion. > > If someone allocate 512 instances, that would be closer to a meg in size > without this change. 8k is probably less than 1% > In percentage. My case. (12228 - 8192) * 100 / 12228 = ~33% difference. > > > > And double argument will not give you better performance comparing > > with a single argument. > > It will, because it will no longer have to allocate anything if need be. > Note, when it doesn't allocate the system is probably mostly idle and we > don't care about performance, but when it needs allocation, that's likely a > time when performance is a bit more important. > The problem further is about pointer chasing, like comparing arrays and lists. It will take longer time to offload all pointers. -- Uladzislau Rezki
On Thu, Mar 16, 2023 at 09:56:53AM -0400, Steven Rostedt wrote: > On Thu, 16 Mar 2023 09:16:37 +0100 > Uladzislau Rezki <urezki@gmail.com> wrote: > > > > diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h > > > index ef8ed3b65d05..e6037752dcf0 100644 > > > --- a/kernel/trace/trace_probe.h > > > +++ b/kernel/trace/trace_probe.h > > > @@ -256,6 +256,7 @@ struct trace_probe { > > > struct event_file_link { > > > struct trace_event_file *file; > > > struct list_head list; > > > + struct rcu_head rcu; > > > }; > > > > > > static inline bool trace_probe_test_flag(struct trace_probe *tp, > > > > > struct foo_a { > > int a; > > int b; > > }; > > Most machines today are 64 bits, even low end machines. > > struct foo_a { > long long a; > long long b; > }; > > is more accurate. That's 16 bytes. > > Although it is more likely off because list_head is a double pointer. But > let's just go with this, as the amount really doesn't matter here. > > > > > your obj size is 8 byte > > > > struct foo_b { > > struct rcu_head rcu; > > Isn't rcu_head defined as; > > struct callback_head { > struct callback_head *next; > void (*func)(struct callback_head *head); > } __attribute__((aligned(sizeof(void *)))); > #define rcu_head callback_head > > Which makes it 8 not 16 on 32 bit as well? > > > int a; > > int b; > > }; > > So it should be 8 + 8 = 16, on 32 bit and 16 + 16 = 32 on 64bit. > > > > > now it becomes 16 + 8 = 24 bytes. In reallity a foo_b object > > will be 32 bytes since there is no slab for 24 bytes: > > > > <snip> > > kmalloc-32 19840 19840 32 128 1 : tunables 0 0 0 : slabdata 155 155 0 > > kmalloc-16 28857 28928 16 256 1 : tunables 0 0 0 : slabdata 113 113 0 > > kmalloc-8 37376 37376 8 512 1 : tunables 0 0 0 : slabdata 73 73 0 > > <snip> > > > > if we allocate 512 objects of foo_a it would be 4096 bytes > > in case of foo_b it is 24 * 512 = 12228 bytes. > > This is for probe events. We usually allocate 1, maybe 2. Oh, some may even > allocate 100 to be crazy. But each probe event is in reality much larger > (1K perhaps) as each one allocates dentry's, inodes, etc. So 8 or 16 bytes > extra is still lost in the noise. Agreed, in this case, there is almost no penalty for adding an rcu_head structure to the probe event... > > single argument will give you 4096 + 512 * 8 = 8192 bytes > > int terms of memory consumtion. > > If someone allocate 512 instances, that would be closer to a meg in size > without this change. 8k is probably less than 1% > > > > > And double argument will not give you better performance comparing > > with a single argument. > > It will, because it will no longer have to allocate anything if need be. > Note, when it doesn't allocate the system is probably mostly idle and we > don't care about performance, but when it needs allocation, that's likely a > time when performance is a bit more important. ... and also agreed that there are performance benefits to doing so, especially under OOM conditions. Thanx, Paul
On Wed, Mar 15, 2023 at 06:36:48PM -0400, Steven Rostedt wrote: > On Thu, 9 Mar 2023 14:45:21 +0100 > Uladzislau Rezki <urezki@gmail.com> wrote: > > > > The kvfree_rcu()'s single argument name is deprecated therefore > > > rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly > > > underline that it is for sleepable contexts. > > > > > > Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> > > > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> > > > > > Could you please add you reviwed-by or Acked-by tags so we can bring > > our series with renaming for the next merge window? > > I don't know. Perhaps we should just apply this patch and not worry about > sleeping and whatnot. Just in case it is not clear: Acked-by: Paul E. McKenney <paulmck@kernel.org> > -- Steve > > diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c > index 04f0fdae19a1..5de945a8f61d 100644 > --- a/kernel/trace/trace_osnoise.c > +++ b/kernel/trace/trace_osnoise.c > @@ -76,6 +76,7 @@ static unsigned long osnoise_options = OSN_DEFAULT_OPTIONS; > struct osnoise_instance { > struct list_head list; > struct trace_array *tr; > + struct rcu_head rcu; > }; > > static struct list_head osnoise_instances; > @@ -159,7 +160,7 @@ static void osnoise_unregister_instance(struct trace_array *tr) > if (!found) > return; > > - kvfree_rcu(inst); > + kvfree_rcu(inst, rcu); > } > > /* > diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c > index 20d0c4a97633..ef5fafb40c76 100644 > --- a/kernel/trace/trace_probe.c > +++ b/kernel/trace/trace_probe.c > @@ -1172,7 +1172,7 @@ int trace_probe_remove_file(struct trace_probe *tp, > return -ENOENT; > > list_del_rcu(&link->list); > - kvfree_rcu(link); > + kvfree_rcu(link, rcu); > > if (list_empty(&tp->event->files)) > trace_probe_clear_flag(tp, TP_FLAG_TRACE); > diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h > index ef8ed3b65d05..e6037752dcf0 100644 > --- a/kernel/trace/trace_probe.h > +++ b/kernel/trace/trace_probe.h > @@ -256,6 +256,7 @@ struct trace_probe { > struct event_file_link { > struct trace_event_file *file; > struct list_head list; > + struct rcu_head rcu; > }; > > static inline bool trace_probe_test_flag(struct trace_probe *tp,
On Wed, Mar 15, 2023 at 06:36:48PM -0400, Steven Rostedt wrote: > On Thu, 9 Mar 2023 14:45:21 +0100 > Uladzislau Rezki <urezki@gmail.com> wrote: > > > > The kvfree_rcu()'s single argument name is deprecated therefore > > > rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly > > > underline that it is for sleepable contexts. > > > > > > Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> > > > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> > > > > > Could you please add you reviwed-by or Acked-by tags so we can bring > > our series with renaming for the next merge window? > > I don't know. Perhaps we should just apply this patch and not worry about > sleeping and whatnot. > > -- Steve > > diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c > index 04f0fdae19a1..5de945a8f61d 100644 > --- a/kernel/trace/trace_osnoise.c > +++ b/kernel/trace/trace_osnoise.c > @@ -76,6 +76,7 @@ static unsigned long osnoise_options = OSN_DEFAULT_OPTIONS; > struct osnoise_instance { > struct list_head list; > struct trace_array *tr; > + struct rcu_head rcu; > }; > > static struct list_head osnoise_instances; > @@ -159,7 +160,7 @@ static void osnoise_unregister_instance(struct trace_array *tr) > if (!found) > return; > > - kvfree_rcu(inst); > + kvfree_rcu(inst, rcu); > } > > /* > diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c > index 20d0c4a97633..ef5fafb40c76 100644 > --- a/kernel/trace/trace_probe.c > +++ b/kernel/trace/trace_probe.c > @@ -1172,7 +1172,7 @@ int trace_probe_remove_file(struct trace_probe *tp, > return -ENOENT; > > list_del_rcu(&link->list); > - kvfree_rcu(link); > + kvfree_rcu(link, rcu); > > if (list_empty(&tp->event->files)) > trace_probe_clear_flag(tp, TP_FLAG_TRACE); > diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h > index ef8ed3b65d05..e6037752dcf0 100644 > --- a/kernel/trace/trace_probe.h > +++ b/kernel/trace/trace_probe.h > @@ -256,6 +256,7 @@ struct trace_probe { > struct event_file_link { > struct trace_event_file *file; > struct list_head list; > + struct rcu_head rcu; > }; > > static inline bool trace_probe_test_flag(struct trace_probe *tp, > Anyway i do not see any problems with it Acked-by: Uladzislau Rezki (Sony) <urezki@gmail.com> -- Uladzislau Rezki
> On Mar 16, 2023, at 1:58 PM, Uladzislau Rezki <urezki@gmail.com> wrote: > > On Wed, Mar 15, 2023 at 06:36:48PM -0400, Steven Rostedt wrote: >> On Thu, 9 Mar 2023 14:45:21 +0100 >> Uladzislau Rezki <urezki@gmail.com> wrote: >> >>>> The kvfree_rcu()'s single argument name is deprecated therefore >>>> rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly >>>> underline that it is for sleepable contexts. >>>> >>>> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> >>>> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> >>>> >>> Could you please add you reviwed-by or Acked-by tags so we can bring >>> our series with renaming for the next merge window? >> >> I don't know. Perhaps we should just apply this patch and not worry about >> sleeping and whatnot. >> >> -- Steve >> >> diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c >> index 04f0fdae19a1..5de945a8f61d 100644 >> --- a/kernel/trace/trace_osnoise.c >> +++ b/kernel/trace/trace_osnoise.c >> @@ -76,6 +76,7 @@ static unsigned long osnoise_options = OSN_DEFAULT_OPTIONS; >> struct osnoise_instance { >> struct list_head list; >> struct trace_array *tr; >> + struct rcu_head rcu; >> }; >> >> static struct list_head osnoise_instances; >> @@ -159,7 +160,7 @@ static void osnoise_unregister_instance(struct trace_array *tr) >> if (!found) >> return; >> >> - kvfree_rcu(inst); >> + kvfree_rcu(inst, rcu); >> } >> >> /* >> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c >> index 20d0c4a97633..ef5fafb40c76 100644 >> --- a/kernel/trace/trace_probe.c >> +++ b/kernel/trace/trace_probe.c >> @@ -1172,7 +1172,7 @@ int trace_probe_remove_file(struct trace_probe *tp, >> return -ENOENT; >> >> list_del_rcu(&link->list); >> - kvfree_rcu(link); >> + kvfree_rcu(link, rcu); >> >> if (list_empty(&tp->event->files)) >> trace_probe_clear_flag(tp, TP_FLAG_TRACE); >> diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h >> index ef8ed3b65d05..e6037752dcf0 100644 >> --- a/kernel/trace/trace_probe.h >> +++ b/kernel/trace/trace_probe.h >> @@ -256,6 +256,7 @@ struct trace_probe { >> struct event_file_link { >> struct trace_event_file *file; >> struct list_head list; >> + struct rcu_head rcu; >> }; >> >> static inline bool trace_probe_test_flag(struct trace_probe *tp, >> > Anyway i do not see any problems with it > > Acked-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Just to clarify, we are dropping the original patch and instead taking Steves version? If so, Steve please send a patch when you are able to and with Vlads Ack, we can take it via the RCU tree if that is Ok with you. Thanks, - Joel > > -- > Uladzislau Rezki
On Thu, Mar 16, 2023 at 04:05:09PM +0100, Uladzislau Rezki wrote: > On Thu, Mar 16, 2023 at 09:56:53AM -0400, Steven Rostedt wrote: > > On Thu, 16 Mar 2023 09:16:37 +0100 > > Uladzislau Rezki <urezki@gmail.com> wrote: > > > > > > diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h > > > > index ef8ed3b65d05..e6037752dcf0 100644 > > > > --- a/kernel/trace/trace_probe.h > > > > +++ b/kernel/trace/trace_probe.h > > > > @@ -256,6 +256,7 @@ struct trace_probe { > > > > struct event_file_link { > > > > struct trace_event_file *file; > > > > struct list_head list; > > > > + struct rcu_head rcu; > > > > }; > > > > > > > > static inline bool trace_probe_test_flag(struct trace_probe *tp, > > > > > > > struct foo_a { > > > int a; > > > int b; > > > }; > > > > Most machines today are 64 bits, even low end machines. > > > > struct foo_a { > > long long a; > > long long b; > > }; > > > > is more accurate. That's 16 bytes. > > > > Although it is more likely off because list_head is a double pointer. But > > let's just go with this, as the amount really doesn't matter here. > > > > > > > > your obj size is 8 byte > > > > > > struct foo_b { > > > struct rcu_head rcu; > > > > Isn't rcu_head defined as; > > > > struct callback_head { > > struct callback_head *next; > > void (*func)(struct callback_head *head); > > } __attribute__((aligned(sizeof(void *)))); > > #define rcu_head callback_head > > > > Which makes it 8 not 16 on 32 bit as well? > > > > > int a; > > > int b; > > > }; > > > > So it should be 8 + 8 = 16, on 32 bit and 16 + 16 = 32 on 64bit. > > > > > > > > now it becomes 16 + 8 = 24 bytes. In reallity a foo_b object > > > will be 32 bytes since there is no slab for 24 bytes: > > > > > > <snip> > > > kmalloc-32 19840 19840 32 128 1 : tunables 0 0 0 : slabdata 155 155 0 > > > kmalloc-16 28857 28928 16 256 1 : tunables 0 0 0 : slabdata 113 113 0 > > > kmalloc-8 37376 37376 8 512 1 : tunables 0 0 0 : slabdata 73 73 0 > > > <snip> > > > > > > if we allocate 512 objects of foo_a it would be 4096 bytes > > > in case of foo_b it is 24 * 512 = 12228 bytes. > > > > This is for probe events. We usually allocate 1, maybe 2. Oh, some may even > > allocate 100 to be crazy. But each probe event is in reality much larger > > (1K perhaps) as each one allocates dentry's, inodes, etc. So 8 or 16 bytes > > extra is still lost in the noise. > > > > > > > > single argument will give you 4096 + 512 * 8 = 8192 bytes > > > int terms of memory consumtion. > > > > If someone allocate 512 instances, that would be closer to a meg in size > > without this change. 8k is probably less than 1% > > > In percentage. My case. (12228 - 8192) * 100 / 12228 = ~33% difference. > > > > > > > And double argument will not give you better performance comparing > > > with a single argument. > > > > It will, because it will no longer have to allocate anything if need be. > > Note, when it doesn't allocate the system is probably mostly idle and we > > don't care about performance, but when it needs allocation, that's likely a > > time when performance is a bit more important. > > > The problem further is about pointer chasing, like comparing arrays and > lists. It will take longer time to offload all pointers. > Since i have a data, IMHO it is better to share than not: --bootargs "rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=3 rcuscale.holdoff=20 rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot" # double-argument 10 run Total time taken by all kfree'ers: 4387872408 ns, loops: 10000, batches: 958, memory footprint: 40MB Total time taken by all kfree'ers: 4415232304 ns, loops: 10000, batches: 982, memory footprint: 39MB Total time taken by all kfree'ers: 4270303081 ns, loops: 10000, batches: 955, memory footprint: 42MB Total time taken by all kfree'ers: 4364984147 ns, loops: 10000, batches: 953, memory footprint: 40MB Total time taken by all kfree'ers: 4225994506 ns, loops: 10000, batches: 942, memory footprint: 40MB Total time taken by all kfree'ers: 4601087346 ns, loops: 10000, batches: 1033, memory footprint: 40MB Total time taken by all kfree'ers: 4853397855 ns, loops: 10000, batches: 1109, memory footprint: 38MB Total time taken by all kfree'ers: 4627914204 ns, loops: 10000, batches: 1037, memory footprint: 39MB Total time taken by all kfree'ers: 4274587317 ns, loops: 10000, batches: 938, memory footprint: 33MB Total time taken by all kfree'ers: 4642151205 ns, loops: 10000, batches: 1068, memory footprint: 39MB # single-argument 10 run Total time taken by all kfree'ers: 3661190052 ns, loops: 10000, batches: 831, memory footprint: 29MB Total time taken by all kfree'ers: 3616277061 ns, loops: 10000, batches: 780, memory footprint: 27MB Total time taken by all kfree'ers: 3704584439 ns, loops: 10000, batches: 810, memory footprint: 27MB Total time taken by all kfree'ers: 3631291959 ns, loops: 10000, batches: 812, memory footprint: 28MB Total time taken by all kfree'ers: 3610490769 ns, loops: 10000, batches: 795, memory footprint: 27MB Total time taken by all kfree'ers: 3595446243 ns, loops: 10000, batches: 825, memory footprint: 28MB Total time taken by all kfree'ers: 3686252889 ns, loops: 10000, batches: 784, memory footprint: 27MB Total time taken by all kfree'ers: 3821475275 ns, loops: 10000, batches: 869, memory footprint: 27MB Total time taken by all kfree'ers: 3740407185 ns, loops: 10000, batches: 813, memory footprint: 28MB Total time taken by all kfree'ers: 3646684795 ns, loops: 10000, batches: 780, memory footprint: 24MB And yes, there are side effects. For example a low memory condition. -- Uladzislau Rezki
On Thu, 16 Mar 2023 14:01:30 -0400 Joel Fernandes <joel@joelfernandes.org> wrote: > If so, Steve please send a patch when you are able to and with Vlads Ack, > we can take it via the RCU tree if that is Ok with you. Daniel acked it, so you can just take it as he can be responsible for that code. For the trace_probe.c if you get an ack from Masami, then you can take the original patch as is. -- Steve
On Sat, Mar 18, 2023 at 12:11:31PM -0400, Steven Rostedt wrote: > On Thu, 16 Mar 2023 14:01:30 -0400 > Joel Fernandes <joel@joelfernandes.org> wrote: > > > If so, Steve please send a patch when you are able to and with Vlads Ack, > > we can take it via the RCU tree if that is Ok with you. > > Daniel acked it, so you can just take it as he can be responsible for that code. > > For the trace_probe.c if you get an ack from Masami, then you can take > the original patch as is. Masami, do you Ack the changes of the original patch to trace_probe.c ? Here it is again: https://lore.kernel.org/lkml/20230201150815.409582-5-urezki@gmail.com/ thanks, - Joel
diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c index 94c1b5eb1dc0..67392d872b67 100644 --- a/kernel/trace/trace_osnoise.c +++ b/kernel/trace/trace_osnoise.c @@ -160,7 +160,7 @@ static void osnoise_unregister_instance(struct trace_array *tr) if (!found) return; - kvfree_rcu(inst); + kvfree_rcu_mightsleep(inst); } /* diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c index 01ebabbbe8c9..32a7741dc731 100644 --- a/kernel/trace/trace_probe.c +++ b/kernel/trace/trace_probe.c @@ -1170,7 +1170,7 @@ int trace_probe_remove_file(struct trace_probe *tp, return -ENOENT; list_del_rcu(&link->list); - kvfree_rcu(link); + kvfree_rcu_mightsleep(link); if (list_empty(&tp->event->files)) trace_probe_clear_flag(tp, TP_FLAG_TRACE);
The kvfree_rcu()'s single argument name is deprecated therefore rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly underline that it is for sleepable contexts. Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> --- kernel/trace/trace_osnoise.c | 2 +- kernel/trace/trace_probe.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)