Message ID | 20211117193932.4049412-2-gerald.schaefer@linux.ibm.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [RFC,1/1] mm/slub: fix endless "No data" printing for alloc/free_traces attribute | expand |
On 11/17/21 20:39, Gerald Schaefer wrote: > Reading from alloc/free_traces attribute in /sys/kernel/debug/slab/ results > in an endless sequence of "No data". This is because slab_debugfs_start() > does not check for a "past end of file" condition and return NULL. I still have no idea how that endless sequence happens. To get it, we would have to call slab_debugfs_show() repeatedly with such v that *v == 0. Which should only happen with slab_debugfs_start() with *ppos == 0. Which your patch won't change because you add a '*ppos > t->count' condition, so *ppos has to be at least 1 to trigger this. But yeah, AFAIK we should detect this in slab_debugfs_start() anyway. But I think the condition should be something like below, because we are past end of file already with *ppos == t->count. But if both are 0, we want to proceed for the "No data" output. // to show the No data if (!*ppos && !t->count) return ppos; if (*ppos >= t->count) return ppos; return ppos; > Fix it by adding such a check and return NULL. > > Fixes: 64dd68497be7 ("mm: slub: move sysfs slab alloc/free interfaces to debugfs") > Cc: <stable@vger.kernel.org> # v5.14+ > Reported-by: Steffen Maier <maier@linux.ibm.com> > Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> > --- > mm/slub.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/mm/slub.c b/mm/slub.c > index f7368bfffb7a..336609671bc2 100644 > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -6115,6 +6115,11 @@ static void *slab_debugfs_next(struct seq_file *seq, void *v, loff_t *ppos) > > static void *slab_debugfs_start(struct seq_file *seq, loff_t *ppos) > { > + struct loc_track *t = seq->private; > + > + if (*ppos > t->count) > + return NULL; > + > return ppos; > } > >
On Fri, 19 Nov 2021 11:41:38 +0100 Vlastimil Babka <vbabka@suse.cz> wrote: > On 11/17/21 20:39, Gerald Schaefer wrote: > > Reading from alloc/free_traces attribute in /sys/kernel/debug/slab/ results > > in an endless sequence of "No data". This is because slab_debugfs_start() > > does not check for a "past end of file" condition and return NULL. > > I still have no idea how that endless sequence happens. > To get it, we would have to call slab_debugfs_show() repeatedly with such v > that *v == 0. Which should only happen with slab_debugfs_start() with *ppos > == 0. Which your patch won't change because you add a '*ppos > t->count' > condition, so *ppos has to be at least 1 to trigger this. Yes, very strange. After a closer look to fs/seq_file.c, especially seq_read_iter(), it seems that op->next will only be called when m->count == 0, at least in the first while(1) loop. Printing "No data\n" sets m->count to 8, so it will continue after Fill:, then call op->next, which returns NULL and breaks the second while(1) loop, and also calls op->stop. Then it returns from seq_read_iter(), only to be called again, and again, ... Only when op->start returns NULL it will end it for good, probably because seq_read_iter() will then return 0 instead of 8. Not sure if there is a better way to fix this than by adding a second "return NULL" to op->start, which feels a bit awkward and makes you wonder why the "return NULL" from op->next is not enough. > > But yeah, AFAIK we should detect this in slab_debugfs_start() anyway. > But I think the condition should be something like below, because we are > past end of file already with *ppos == t->count. But if both are 0, we want > to proceed for the "No data" output. Ah ok, I wasn't sure about the "t->count > 0" case, i.e. if the check for "*ppos > t->count" would still be correct there. So apparently it wouldn't, and we need two checks, like you suggested > > // to show the No data > if (!*ppos && !t->count) > return ppos; > > if (*ppos >= t->count) > return ppos; That should be return NULL here, right? > > return ppos; > Will send a new patch, unless I find a better way after investigating the endless seq_read_iter() calls mentioned above. Is there an easy way to test the "t->count > 0" case, i.e. what would need to be done to get some other reply than "No data"?
On 11/19/21 20:59, Gerald Schaefer wrote: > On Fri, 19 Nov 2021 11:41:38 +0100 > Vlastimil Babka <vbabka@suse.cz> wrote: > >> On 11/17/21 20:39, Gerald Schaefer wrote: >> > Reading from alloc/free_traces attribute in /sys/kernel/debug/slab/ results >> > in an endless sequence of "No data". This is because slab_debugfs_start() >> > does not check for a "past end of file" condition and return NULL. >> >> I still have no idea how that endless sequence happens. >> To get it, we would have to call slab_debugfs_show() repeatedly with such v >> that *v == 0. Which should only happen with slab_debugfs_start() with *ppos >> == 0. Which your patch won't change because you add a '*ppos > t->count' >> condition, so *ppos has to be at least 1 to trigger this. > > Yes, very strange. After a closer look to fs/seq_file.c, especially > seq_read_iter(), it seems that op->next will only be called when m->count == 0, > at least in the first while(1) loop. Printing "No data\n" sets m->count > to 8, so it will continue after Fill:, then call op->next, which returns NULL > and breaks the second while(1) loop, and also calls op->stop. Then it returns > from seq_read_iter(), only to be called again, and again, ... > > Only when op->start returns NULL it will end it for good, probably > because seq_read_iter() will then return 0 instead of 8. Ah, thanks for investigating. > Not sure if > there is a better way to fix this than by adding a second "return NULL" > to op->start, which feels a bit awkward and makes you wonder why the > "return NULL" from op->next is not enough. I think it's fine to require op->start to return NULL, even if it didn't cause this infinite loop. >> >> But yeah, AFAIK we should detect this in slab_debugfs_start() anyway. >> But I think the condition should be something like below, because we are >> past end of file already with *ppos == t->count. But if both are 0, we want >> to proceed for the "No data" output. > > Ah ok, I wasn't sure about the "t->count > 0" case, i.e. if the check for > "*ppos > t->count" would still be correct there. So apparently it wouldn't, > and we need two checks, like you suggested > >> >> // to show the No data >> if (!*ppos && !t->count) >> return ppos; >> >> if (*ppos >= t->count) >> return ppos; > > That should be return NULL here, right? Doh, right. >> >> return ppos; >> > > Will send a new patch, unless I find a better way after investigating the > endless seq_read_iter() calls mentioned above. > Is there an easy way to test the "t->count > 0" case, i.e. what would need > to be done to get some other reply than "No data"? Hm the debugfs files alloc_tracess/free_traces for any cache with non-zero objects (see /proc/slabinfo for that) should have t->count > 0. If the files are created for a cache, it means the related SLAB_STORE_USER debugging was enabled both during config and boot-time. If you see only a few caches with alloc_tracess/free_traces (because they are from e.g. some test module that adds SLAB_STORE_USER explicitly) and all happen to have 0 objects, boot with slub_debug=U parameter and then all caches will have this enabled and many will have >0 objects.
On Mon, 22 Nov 2021 16:02:28 +0100 Vlastimil Babka <vbabka@suse.cz> wrote: > On 11/19/21 20:59, Gerald Schaefer wrote: > > On Fri, 19 Nov 2021 11:41:38 +0100 > > Vlastimil Babka <vbabka@suse.cz> wrote: > > > >> On 11/17/21 20:39, Gerald Schaefer wrote: > >> > Reading from alloc/free_traces attribute in /sys/kernel/debug/slab/ results > >> > in an endless sequence of "No data". This is because slab_debugfs_start() > >> > does not check for a "past end of file" condition and return NULL. > >> > >> I still have no idea how that endless sequence happens. > >> To get it, we would have to call slab_debugfs_show() repeatedly with such v > >> that *v == 0. Which should only happen with slab_debugfs_start() with *ppos > >> == 0. Which your patch won't change because you add a '*ppos > t->count' > >> condition, so *ppos has to be at least 1 to trigger this. > > > > Yes, very strange. After a closer look to fs/seq_file.c, especially > > seq_read_iter(), it seems that op->next will only be called when m->count == 0, > > at least in the first while(1) loop. Printing "No data\n" sets m->count > > to 8, so it will continue after Fill:, then call op->next, which returns NULL > > and breaks the second while(1) loop, and also calls op->stop. Then it returns > > from seq_read_iter(), only to be called again, and again, ... > > > > Only when op->start returns NULL it will end it for good, probably > > because seq_read_iter() will then return 0 instead of 8. > > Ah, thanks for investigating. > > > Not sure if > > there is a better way to fix this than by adding a second "return NULL" > > to op->start, which feels a bit awkward and makes you wonder why the > > "return NULL" from op->next is not enough. > > I think it's fine to require op->start to return NULL, even if it didn't > cause this infinite loop. Yes, this all seems to work as designed, and it is slowly beginning to make some sense. The only other way to stop it would be to return from op->show, w/o printing anything, like it would be done for t->count > 0, when idx becomes >= t->count. In theory at least, see below. > > >> > >> But yeah, AFAIK we should detect this in slab_debugfs_start() anyway. > >> But I think the condition should be something like below, because we are > >> past end of file already with *ppos == t->count. But if both are 0, we want > >> to proceed for the "No data" output. > > > > Ah ok, I wasn't sure about the "t->count > 0" case, i.e. if the check for > > "*ppos > t->count" would still be correct there. So apparently it wouldn't, > > and we need two checks, like you suggested > > > >> > >> // to show the No data > >> if (!*ppos && !t->count) > >> return ppos; > >> > >> if (*ppos >= t->count) > >> return ppos; > > > > That should be return NULL here, right? > > Doh, right. > > >> > >> return ppos; > >> > > > > Will send a new patch, unless I find a better way after investigating the > > endless seq_read_iter() calls mentioned above. > > Is there an easy way to test the "t->count > 0" case, i.e. what would need > > to be done to get some other reply than "No data"? > > Hm the debugfs files alloc_tracess/free_traces for any cache with non-zero > objects (see /proc/slabinfo for that) should have t->count > 0. If the files > are created for a cache, it means the related SLAB_STORE_USER debugging was > enabled both during config and boot-time. If you see only a few caches with > alloc_tracess/free_traces (because they are from e.g. some test module that > adds SLAB_STORE_USER explicitly) and all happen to have 0 objects, boot with > slub_debug=U parameter and then all caches will have this enabled and many > will have >0 objects. Thanks. While testing this properly, yet another bug showed up. The idx in op->show remains 0 in all iterations, so I always see the same line printed t->count times (or infinitely, ATM). Not sure if this only shows on s390 due to endianness, but the reason is this: unsigned int idx = *(unsigned int *)v; IIUC, void *v is always the same as loff_t *ppos, and therefore idx also should be *ppos. De-referencing the loff_t * with an unsigned int * only gives the upper 32 bit half of the 64 bit value, which remains 0. This would be fixed e.g. with unsigned int idx = (unsigned int) *(loff_t *) v; With this fixed, my original patch actually also works for t->count > 0, because then op->show would return w/o printing anything when idx reaches t->count. For t->count > 0, it would even work w/o any extra checks in op->start because of that, only "No data" would be printed infinitely. It probably still makes sense to make this explicit in op->start, by checking separately for !*ppos && !t->count, and returning NULL for *ppos >= t->count, as you suggested. I think I will also make idx an unsigned long again, like it was before commit 64dd68497be7, and similar to t->count. Not sure if it needs to be, and with proper casting unsigned int is also possible, but why change it?
On Mon, 22 Nov 2021 21:14:00 +0100 Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote: [...] > > Thanks. While testing this properly, yet another bug showed up. The idx > in op->show remains 0 in all iterations, so I always see the same line > printed t->count times (or infinitely, ATM). Not sure if this only shows > on s390 due to endianness, but the reason is this: > > unsigned int idx = *(unsigned int *)v; > > IIUC, void *v is always the same as loff_t *ppos, and therefore idx also > should be *ppos. De-referencing the loff_t * with an unsigned int * only > gives the upper 32 bit half of the 64 bit value, which remains 0. > > This would be fixed e.g. with > > unsigned int idx = (unsigned int) *(loff_t *) v; > > With this fixed, my original patch actually also works for t->count > 0, > because then op->show would return w/o printing anything when idx reaches > t->count. For t->count > 0, it would even work w/o any extra checks in > op->start because of that, only "No data" would be printed infinitely. Oh, no, that would actually also fix the "No data" part, as op->show will then also return w/o printing in the next iteration, so that op->next would correctly end it all. This could also explain why it might all have worked fine on x86 (haven't verified), and really only showed on big-endian s390. Hmm, now I'm not so sure anymore if we really want the additional checks and return NULL in op->start, just to make it "double safe". > > It probably still makes sense to make this explicit in op->start, by > checking separately for !*ppos && !t->count, and returning NULL for > *ppos >= t->count, as you suggested. > > I think I will also make idx an unsigned long again, like it was before > commit 64dd68497be7, and similar to t->count. Not sure if it needs to > be, and with proper casting unsigned int is also possible, but why > change it?
On 11/22/21 21:33, Gerald Schaefer wrote: > On Mon, 22 Nov 2021 21:14:00 +0100 > Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote: > > [...] >> >> Thanks. While testing this properly, yet another bug showed up. The idx >> in op->show remains 0 in all iterations, so I always see the same line >> printed t->count times (or infinitely, ATM). Not sure if this only shows >> on s390 due to endianness, but the reason is this: >> >> unsigned int idx = *(unsigned int *)v; Uh, good catch. I was actually looking suspiciously at how we cast signed to unsigned, but didn't occur to me that shortening together with endiannes is the problem. >> >> IIUC, void *v is always the same as loff_t *ppos, and therefore idx also >> should be *ppos. De-referencing the loff_t * with an unsigned int * only >> gives the upper 32 bit half of the 64 bit value, which remains 0. >> >> This would be fixed e.g. with >> >> unsigned int idx = (unsigned int) *(loff_t *) v; With all this experience I'm now inclined to rather follow more the example in Documentation/filesystems/seq_file.rst and don't pass around the pointer that we got as ppos in slab_debugfs_start(), and that seq_file.c points to m->index. In that example an own value is kmalloced: loff_t *spos = kmalloc(sizeof(loff_t), GFP_KERNEL); while we could just make this a field of loc_track? >> With this fixed, my original patch actually also works for t->count > 0, >> because then op->show would return w/o printing anything when idx reaches >> t->count. For t->count > 0, it would even work w/o any extra checks in >> op->start because of that, only "No data" would be printed infinitely. > > Oh, no, that would actually also fix the "No data" part, as op->show > will then also return w/o printing in the next iteration, so that op->next > would correctly end it all. > > This could also explain why it might all have worked fine on x86 (haven't > verified), and really only showed on big-endian s390. > > Hmm, now I'm not so sure anymore if we really want the additional > checks and return NULL in op->start, just to make it "double safe". I guess we don't. >> >> It probably still makes sense to make this explicit in op->start, by >> checking separately for !*ppos && !t->count, and returning NULL for >> *ppos >= t->count, as you suggested. >> >> I think I will also make idx an unsigned long again, like it was before >> commit 64dd68497be7, and similar to t->count. Not sure if it needs to >> be, and with proper casting unsigned int is also possible, but why >> change it? >
On Tue, 23 Nov 2021 15:19:49 +0100 Vlastimil Babka <vbabka@suse.cz> wrote: > On 11/22/21 21:33, Gerald Schaefer wrote: > > On Mon, 22 Nov 2021 21:14:00 +0100 > > Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote: > > > > [...] > >> > >> Thanks. While testing this properly, yet another bug showed up. The idx > >> in op->show remains 0 in all iterations, so I always see the same line > >> printed t->count times (or infinitely, ATM). Not sure if this only shows > >> on s390 due to endianness, but the reason is this: > >> > >> unsigned int idx = *(unsigned int *)v; > > Uh, good catch. I was actually looking suspiciously at how we cast signed to > unsigned, but didn't occur to me that shortening together with endiannes is > the problem. > > >> > >> IIUC, void *v is always the same as loff_t *ppos, and therefore idx also > >> should be *ppos. De-referencing the loff_t * with an unsigned int * only > >> gives the upper 32 bit half of the 64 bit value, which remains 0. > >> > >> This would be fixed e.g. with > >> > >> unsigned int idx = (unsigned int) *(loff_t *) v; > > With all this experience I'm now inclined to rather follow more the example > in Documentation/filesystems/seq_file.rst and don't pass around the pointer > that we got as ppos in slab_debugfs_start(), and that seq_file.c points to > m->index. > > In that example an own value is kmalloced: > > loff_t *spos = kmalloc(sizeof(loff_t), GFP_KERNEL); > > while we could just make this a field of loc_track? Yes, following the example sounds good, and it would also make proper use of *v in op->next, which might make the code more readable. It also looks like it already does exactly what is needed here, i.e. have a simple iterator that just counts the lines. I don't think the iterator needs to be saved in loc_track. IIUC, it is already passed around like in the example, and can then be simply compared to t->count, similar to the existing code. This is what I'm currently testing, and it seems to work fine. Will send a new patch, if there are no objections: --- mm/slub.c | 21 +++++++++++++-------- 1 file changed, 13 insertions(+), 8 deletions(-) diff --git a/mm/slub.c b/mm/slub.c index a8626825a829..effb7b8d8f88 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -6052,10 +6052,9 @@ __initcall(slab_sysfs_init); #if defined(CONFIG_SLUB_DEBUG) && defined(CONFIG_DEBUG_FS) static int slab_debugfs_show(struct seq_file *seq, void *v) { - - struct location *l; - unsigned int idx = *(unsigned int *)v; struct loc_track *t = seq->private; + loff_t idx = *(loff_t *) v; + struct location *l; if (idx < t->count) { l = &t->loc[idx]; @@ -6099,23 +6098,29 @@ static int slab_debugfs_show(struct seq_file *seq, void *v) static void slab_debugfs_stop(struct seq_file *seq, void *v) { + kfree(v); } static void *slab_debugfs_next(struct seq_file *seq, void *v, loff_t *ppos) { struct loc_track *t = seq->private; + loff_t *idxp = (loff_t *) v; - v = ppos; - ++*ppos; - if (*ppos <= t->count) - return v; + *ppos = ++(*idxp); + if (*idxp <= t->count) + return idxp; return NULL; } static void *slab_debugfs_start(struct seq_file *seq, loff_t *ppos) { - return ppos; + loff_t *idxp = kmalloc(sizeof(loff_t), GFP_KERNEL); + + if (!idxp) + return NULL; + *idxp = *ppos; + return idxp; } static const struct seq_operations slab_debugfs_sops = {
On Thu, 25 Nov 2021 17:13:10 +0100 Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote: > On Tue, 23 Nov 2021 15:19:49 +0100 > Vlastimil Babka <vbabka@suse.cz> wrote: > > > On 11/22/21 21:33, Gerald Schaefer wrote: > > > On Mon, 22 Nov 2021 21:14:00 +0100 > > > Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote: > > > > > > [...] > > >> > > >> Thanks. While testing this properly, yet another bug showed up. The idx > > >> in op->show remains 0 in all iterations, so I always see the same line > > >> printed t->count times (or infinitely, ATM). Not sure if this only shows > > >> on s390 due to endianness, but the reason is this: > > >> > > >> unsigned int idx = *(unsigned int *)v; > > > > Uh, good catch. I was actually looking suspiciously at how we cast signed to > > unsigned, but didn't occur to me that shortening together with endiannes is > > the problem. > > > > >> > > >> IIUC, void *v is always the same as loff_t *ppos, and therefore idx also > > >> should be *ppos. De-referencing the loff_t * with an unsigned int * only > > >> gives the upper 32 bit half of the 64 bit value, which remains 0. > > >> > > >> This would be fixed e.g. with > > >> > > >> unsigned int idx = (unsigned int) *(loff_t *) v; > > > > With all this experience I'm now inclined to rather follow more the example > > in Documentation/filesystems/seq_file.rst and don't pass around the pointer > > that we got as ppos in slab_debugfs_start(), and that seq_file.c points to > > m->index. > > > > In that example an own value is kmalloced: > > > > loff_t *spos = kmalloc(sizeof(loff_t), GFP_KERNEL); > > > > while we could just make this a field of loc_track? > > Yes, following the example sounds good, and it would also make proper use > of *v in op->next, which might make the code more readable. It also looks > like it already does exactly what is needed here, i.e. have a simple > iterator that just counts the lines. > > I don't think the iterator needs to be saved in loc_track. IIUC, it is > already passed around like in the example, and can then be simply compared > to t->count, similar to the existing code. > > This is what I'm currently testing, and it seems to work fine. Will send > a new patch, if there are no objections: Oh well, I have one objection, returning NULL from op->next will be passed to op->stop, and then it will not free the allocated value. The example is elegantly avoiding this, by not returning NULL anywhere, and also not stopping. Sigh. Maybe not return NULL in op->next, but only from op->start, and only when no allocation was made or it was freed already? Or free it only/ already in op->next, when returning NULL?
On 11/25/21 21:12, Gerald Schaefer wrote: > On Thu, 25 Nov 2021 17:13:10 +0100 > Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote: > >> On Tue, 23 Nov 2021 15:19:49 +0100 >> Vlastimil Babka <vbabka@suse.cz> wrote: >> >>> On 11/22/21 21:33, Gerald Schaefer wrote: >>>> On Mon, 22 Nov 2021 21:14:00 +0100 >>>> Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote: >>>> >>>> [...] >>>>> >>>>> Thanks. While testing this properly, yet another bug showed up. The idx >>>>> in op->show remains 0 in all iterations, so I always see the same line >>>>> printed t->count times (or infinitely, ATM). Not sure if this only shows >>>>> on s390 due to endianness, but the reason is this: >>>>> >>>>> unsigned int idx = *(unsigned int *)v; >>> >>> Uh, good catch. I was actually looking suspiciously at how we cast signed to >>> unsigned, but didn't occur to me that shortening together with endiannes is >>> the problem. >>> >>>>> >>>>> IIUC, void *v is always the same as loff_t *ppos, and therefore idx also >>>>> should be *ppos. De-referencing the loff_t * with an unsigned int * only >>>>> gives the upper 32 bit half of the 64 bit value, which remains 0. >>>>> >>>>> This would be fixed e.g. with >>>>> >>>>> unsigned int idx = (unsigned int) *(loff_t *) v; >>> >>> With all this experience I'm now inclined to rather follow more the example >>> in Documentation/filesystems/seq_file.rst and don't pass around the pointer >>> that we got as ppos in slab_debugfs_start(), and that seq_file.c points to >>> m->index. >>> >>> In that example an own value is kmalloced: >>> >>> loff_t *spos = kmalloc(sizeof(loff_t), GFP_KERNEL); >>> >>> while we could just make this a field of loc_track? >> >> Yes, following the example sounds good, and it would also make proper use >> of *v in op->next, which might make the code more readable. It also looks >> like it already does exactly what is needed here, i.e. have a simple >> iterator that just counts the lines. >> >> I don't think the iterator needs to be saved in loc_track. IIUC, it is >> already passed around like in the example, and can then be simply compared >> to t->count, similar to the existing code. Saving it the loc_track doesn't preclude using the pointer that's being passed around. It however avoids the extra kmalloc and turns out it should also solve the return NULL from op->next() issue you describe below? >> This is what I'm currently testing, and it seems to work fine. Will send >> a new patch, if there are no objections: > > Oh well, I have one objection, returning NULL from op->next will be > passed to op->stop, and then it will not free the allocated value. > > The example is elegantly avoiding this, by not returning NULL anywhere, > and also not stopping. Sigh. > > Maybe not return NULL in op->next, but only from op->start, and only > when no allocation was made or it was freed already? Or free it only/ > already in op->next, when returning NULL? From these two probably the "free in op->next". But still inclined to store it in loc_track. Why does the API need to be so awkward...
On Thu, 25 Nov 2021 23:00:47 +0100 Vlastimil Babka <vbabka@suse.cz> wrote: > On 11/25/21 21:12, Gerald Schaefer wrote: > > On Thu, 25 Nov 2021 17:13:10 +0100 > > Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote: > > > >> On Tue, 23 Nov 2021 15:19:49 +0100 > >> Vlastimil Babka <vbabka@suse.cz> wrote: > >> > >>> On 11/22/21 21:33, Gerald Schaefer wrote: > >>>> On Mon, 22 Nov 2021 21:14:00 +0100 > >>>> Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote: > >>>> > >>>> [...] > >>>>> > >>>>> Thanks. While testing this properly, yet another bug showed up. The idx > >>>>> in op->show remains 0 in all iterations, so I always see the same line > >>>>> printed t->count times (or infinitely, ATM). Not sure if this only shows > >>>>> on s390 due to endianness, but the reason is this: > >>>>> > >>>>> unsigned int idx = *(unsigned int *)v; > >>> > >>> Uh, good catch. I was actually looking suspiciously at how we cast signed to > >>> unsigned, but didn't occur to me that shortening together with endiannes is > >>> the problem. > >>> > >>>>> > >>>>> IIUC, void *v is always the same as loff_t *ppos, and therefore idx also > >>>>> should be *ppos. De-referencing the loff_t * with an unsigned int * only > >>>>> gives the upper 32 bit half of the 64 bit value, which remains 0. > >>>>> > >>>>> This would be fixed e.g. with > >>>>> > >>>>> unsigned int idx = (unsigned int) *(loff_t *) v; > >>> > >>> With all this experience I'm now inclined to rather follow more the example > >>> in Documentation/filesystems/seq_file.rst and don't pass around the pointer > >>> that we got as ppos in slab_debugfs_start(), and that seq_file.c points to > >>> m->index. > >>> > >>> In that example an own value is kmalloced: > >>> > >>> loff_t *spos = kmalloc(sizeof(loff_t), GFP_KERNEL); > >>> > >>> while we could just make this a field of loc_track? > >> > >> Yes, following the example sounds good, and it would also make proper use > >> of *v in op->next, which might make the code more readable. It also looks > >> like it already does exactly what is needed here, i.e. have a simple > >> iterator that just counts the lines. > >> > >> I don't think the iterator needs to be saved in loc_track. IIUC, it is > >> already passed around like in the example, and can then be simply compared > >> to t->count, similar to the existing code. > > Saving it the loc_track doesn't preclude using the pointer that's being > passed around. It however avoids the extra kmalloc and turns out it > should also solve the return NULL from op->next() issue you describe below? Yes, storing idx in loc_track seems straight-forward, and should also improve code readability. Patch will follow right here. I made idx a loff_t, as it seemed easier to handle in op->start/next, then casted to unsigned long again in op->show. Not sure if the unsigned int had any benefit. Given that loff_t is a signed 64 bit type, I guess both are wrong if it should ever turn negative (can it, i.e. can ppos turn negative?). With unsigned long, chances are better that it will at least turn into something big enough to not pass the "idx < t->count" check. > > >> This is what I'm currently testing, and it seems to work fine. Will send > >> a new patch, if there are no objections: > > > > Oh well, I have one objection, returning NULL from op->next will be > > passed to op->stop, and then it will not free the allocated value. > > > > The example is elegantly avoiding this, by not returning NULL anywhere, > > and also not stopping. Sigh. > > > > Maybe not return NULL in op->next, but only from op->start, and only > > when no allocation was made or it was freed already? Or free it only/ > > already in op->next, when returning NULL? > > From these two probably the "free in op->next". But still inclined to > store it in loc_track. > Why does the API need to be so awkward... Maybe we are just holding it wrong :-)
diff --git a/mm/slub.c b/mm/slub.c index f7368bfffb7a..336609671bc2 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -6115,6 +6115,11 @@ static void *slab_debugfs_next(struct seq_file *seq, void *v, loff_t *ppos) static void *slab_debugfs_start(struct seq_file *seq, loff_t *ppos) { + struct loc_track *t = seq->private; + + if (*ppos > t->count) + return NULL; + return ppos; }
Reading from alloc/free_traces attribute in /sys/kernel/debug/slab/ results in an endless sequence of "No data". This is because slab_debugfs_start() does not check for a "past end of file" condition and return NULL. Fix it by adding such a check and return NULL. Fixes: 64dd68497be7 ("mm: slub: move sysfs slab alloc/free interfaces to debugfs") Cc: <stable@vger.kernel.org> # v5.14+ Reported-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> --- mm/slub.c | 5 +++++ 1 file changed, 5 insertions(+)