Message ID | 20240621102750.oH9p1FQq@linutronix.de (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | sched/task_struct: Move alloc_tag to the end of the struct. | expand |
On Fri, Jun 21, 2024 at 12:27:50PM +0200, Sebastian Andrzej Siewior wrote: > The alloc_tag member has been added to task_struct at the very > beginning. This is a pointer and on 64bit architectures it forces 4 byte > padding after `ptrace' and then forcing another another 4 byte padding > after `on_cpu'. A few members later, `se' requires a cacheline aligned > due to struct sched_avg resulting in 52 hole before `se'. > > This is the case on 64bit-SMP architectures. > The 52 byte hole can be avoided by moving alloc_tag away where it > currently resides. > > Move alloc_tag to the end of task_struct. There is likely a hole before > `thread' due to its alignment requirement and the previous members are > likely to be already pointer-aligned. We sure we want it at the end? we do want it on a hot cacheline
On 2024-06-21 10:20:58 [-0400], Kent Overstreet wrote: > On Fri, Jun 21, 2024 at 12:27:50PM +0200, Sebastian Andrzej Siewior wrote: > > The alloc_tag member has been added to task_struct at the very > > beginning. This is a pointer and on 64bit architectures it forces 4 byte > > padding after `ptrace' and then forcing another another 4 byte padding > > after `on_cpu'. A few members later, `se' requires a cacheline aligned > > due to struct sched_avg resulting in 52 hole before `se'. > > > > This is the case on 64bit-SMP architectures. > > The 52 byte hole can be avoided by moving alloc_tag away where it > > currently resides. > > > > Move alloc_tag to the end of task_struct. There is likely a hole before > > `thread' due to its alignment requirement and the previous members are > > likely to be already pointer-aligned. > > We sure we want it at the end? we do want it on a hot cacheline Well, the front is bad. Looking at pgalloc_tag_add() and its callers, there is no task_struct touching. alloc_tag_save()/restore might be the critical one. This is random code… Puh. So if the end is too cold, what about around the mm pointer? Other suggestions? Sebastian
On Fri, Jun 21, 2024 at 08:29:15PM +0200, Sebastian Andrzej Siewior wrote: > On 2024-06-21 10:20:58 [-0400], Kent Overstreet wrote: > > On Fri, Jun 21, 2024 at 12:27:50PM +0200, Sebastian Andrzej Siewior wrote: > > > The alloc_tag member has been added to task_struct at the very > > > beginning. This is a pointer and on 64bit architectures it forces 4 byte > > > padding after `ptrace' and then forcing another another 4 byte padding > > > after `on_cpu'. A few members later, `se' requires a cacheline aligned > > > due to struct sched_avg resulting in 52 hole before `se'. > > > > > > This is the case on 64bit-SMP architectures. > > > The 52 byte hole can be avoided by moving alloc_tag away where it > > > currently resides. > > > > > > Move alloc_tag to the end of task_struct. There is likely a hole before > > > `thread' due to its alignment requirement and the previous members are > > > likely to be already pointer-aligned. > > > > We sure we want it at the end? we do want it on a hot cacheline > > Well, the front is bad. > Looking at pgalloc_tag_add() and its callers, there is no task_struct > touching. alloc_tag_save()/restore might be the critical one. This is > random code… Puh. So if the end is too cold, what about around the mm > pointer? Not there, that's not actually that hot. It needs to be by task_struct->flags; that's used in the same paths.
On 2024-06-21 14:49:23 [-0400], Kent Overstreet wrote: > On Fri, Jun 21, 2024 at 08:29:15PM +0200, Sebastian Andrzej Siewior wrote: > > On 2024-06-21 10:20:58 [-0400], Kent Overstreet wrote: > > > On Fri, Jun 21, 2024 at 12:27:50PM +0200, Sebastian Andrzej Siewior wrote: > > > > The alloc_tag member has been added to task_struct at the very > > > > beginning. This is a pointer and on 64bit architectures it forces 4 byte > > > > padding after `ptrace' and then forcing another another 4 byte padding > > > > after `on_cpu'. A few members later, `se' requires a cacheline aligned > > > > due to struct sched_avg resulting in 52 hole before `se'. > > > > > > > > This is the case on 64bit-SMP architectures. > > > > The 52 byte hole can be avoided by moving alloc_tag away where it > > > > currently resides. > > > > > > > > Move alloc_tag to the end of task_struct. There is likely a hole before > > > > `thread' due to its alignment requirement and the previous members are > > > > likely to be already pointer-aligned. > > > > > > We sure we want it at the end? we do want it on a hot cacheline > > > > Well, the front is bad. > > Looking at pgalloc_tag_add() and its callers, there is no task_struct > > touching. alloc_tag_save()/restore might be the critical one. This is > > random code… Puh. So if the end is too cold, what about around the mm > > pointer? > > Not there, that's not actually that hot. It needs to be by > task_struct->flags; that's used in the same paths. But there is no space without the additional 52 bytes. Was this by any chance on purpose? Sebastian
On Fri, Jun 21, 2024 at 09:07:19PM +0200, Sebastian Andrzej Siewior wrote: > On 2024-06-21 14:49:23 [-0400], Kent Overstreet wrote: > > On Fri, Jun 21, 2024 at 08:29:15PM +0200, Sebastian Andrzej Siewior wrote: > > > On 2024-06-21 10:20:58 [-0400], Kent Overstreet wrote: > > > > On Fri, Jun 21, 2024 at 12:27:50PM +0200, Sebastian Andrzej Siewior wrote: > > > > > The alloc_tag member has been added to task_struct at the very > > > > > beginning. This is a pointer and on 64bit architectures it forces 4 byte > > > > > padding after `ptrace' and then forcing another another 4 byte padding > > > > > after `on_cpu'. A few members later, `se' requires a cacheline aligned > > > > > due to struct sched_avg resulting in 52 hole before `se'. > > > > > > > > > > This is the case on 64bit-SMP architectures. > > > > > The 52 byte hole can be avoided by moving alloc_tag away where it > > > > > currently resides. > > > > > > > > > > Move alloc_tag to the end of task_struct. There is likely a hole before > > > > > `thread' due to its alignment requirement and the previous members are > > > > > likely to be already pointer-aligned. > > > > > > > > We sure we want it at the end? we do want it on a hot cacheline > > > > > > Well, the front is bad. > > > Looking at pgalloc_tag_add() and its callers, there is no task_struct > > > touching. alloc_tag_save()/restore might be the critical one. This is > > > random code… Puh. So if the end is too cold, what about around the mm > > > pointer? > > > > Not there, that's not actually that hot. It needs to be by > > task_struct->flags; that's used in the same paths. > > But there is no space without the additional 52 bytes. Was this by any > chance on purpose? No, that wasn't, and it doesn't have to be the exact same cacheline, but we do want it near current->flags; we store PF_MEMALLOC flags there that are converted to gfp flags and used exactly where we're using current->alloc_tag.
On 2024-06-21 15:13:19 [-0400], Kent Overstreet wrote: > > > > random code… Puh. So if the end is too cold, what about around the mm > > > > pointer? > > > > > > Not there, that's not actually that hot. It needs to be by > > > task_struct->flags; that's used in the same paths. > > > > But there is no space without the additional 52 bytes. Was this by any > > chance on purpose? > > No, that wasn't, and it doesn't have to be the exact same cacheline, but > we do want it near current->flags; we store PF_MEMALLOC flags there that > are converted to gfp flags and used exactly where we're using > current->alloc_tag. Hmm. `stack' and `usage' are the only two member that you would have to move (away) in order the stash the conditional variable there. The `ptrace' one uses the same flags as `flags' so it wouldn't make sense to move that one. Sebastian
On 2024-06-21 12:27:52 [+0200], To linux-mm@kvack.org wrote: > The alloc_tag member has been added to task_struct at the very > beginning. This is a pointer and on 64bit architectures it forces 4 byte > padding after `ptrace' and then forcing another another 4 byte padding > after `on_cpu'. A few members later, `se' requires a cacheline aligned > due to struct sched_avg resulting in 52 hole before `se'. > > This is the case on 64bit-SMP architectures. > The 52 byte hole can be avoided by moving alloc_tag away where it > currently resides. > > Move alloc_tag to the end of task_struct. There is likely a hole before > `thread' due to its alignment requirement and the previous members are > likely to be already pointer-aligned. > > Fixes: 22d407b164ff7 ("lib: add allocation tagging support for memory allocation profiling") > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Could we please get this merged and worry about possible performance regression later? Or once there is a test case or an idea where this pointer might fit better but clearly the current situation is worse. Sebastian
On Fri, 28 Jun 2024 11:49:44 +0200 Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote: > On 2024-06-21 12:27:52 [+0200], To linux-mm@kvack.org wrote: > > The alloc_tag member has been added to task_struct at the very > > beginning. This is a pointer and on 64bit architectures it forces 4 byte > > padding after `ptrace' and then forcing another another 4 byte padding > > after `on_cpu'. A few members later, `se' requires a cacheline aligned > > due to struct sched_avg resulting in 52 hole before `se'. > > > > This is the case on 64bit-SMP architectures. > > The 52 byte hole can be avoided by moving alloc_tag away where it > > currently resides. > > > > Move alloc_tag to the end of task_struct. There is likely a hole before > > `thread' due to its alignment requirement and the previous members are > > likely to be already pointer-aligned. > > > > Fixes: 22d407b164ff7 ("lib: add allocation tagging support for memory allocation profiling") > > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > > Could we please get this merged and worry about possible performance > regression later? Or once there is a test case or an idea where this > pointer might fit better but clearly the current situation is worse. > All in favor of saving 56 bytes from the task_struct, but we can do that by moving various things around. Was alloc_tag the best choice?
On Fri, Jun 28, 2024 at 11:49:44AM +0200, Sebastian Andrzej Siewior wrote: > On 2024-06-21 12:27:52 [+0200], To linux-mm@kvack.org wrote: > > The alloc_tag member has been added to task_struct at the very > > beginning. This is a pointer and on 64bit architectures it forces 4 byte > > padding after `ptrace' and then forcing another another 4 byte padding > > after `on_cpu'. A few members later, `se' requires a cacheline aligned > > due to struct sched_avg resulting in 52 hole before `se'. > > > > This is the case on 64bit-SMP architectures. > > The 52 byte hole can be avoided by moving alloc_tag away where it > > currently resides. > > > > Move alloc_tag to the end of task_struct. There is likely a hole before > > `thread' due to its alignment requirement and the previous members are > > likely to be already pointer-aligned. > > > > Fixes: 22d407b164ff7 ("lib: add allocation tagging support for memory allocation profiling") > > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > > Could we please get this merged and worry about possible performance > regression later? Or once there is a test case or an idea where this > pointer might fit better but clearly the current situation is worse. Sebastian, I gave you review feedback on your patch; if you can incorporate it into a new version your patch will sail right in.
On 2024-06-28 15:35:38 [-0400], Kent Overstreet wrote: > On Fri, Jun 28, 2024 at 11:49:44AM +0200, Sebastian Andrzej Siewior wrote: > > On 2024-06-21 12:27:52 [+0200], To linux-mm@kvack.org wrote: > > > The alloc_tag member has been added to task_struct at the very > > > beginning. This is a pointer and on 64bit architectures it forces 4 byte > > > padding after `ptrace' and then forcing another another 4 byte padding > > > after `on_cpu'. A few members later, `se' requires a cacheline aligned > > > due to struct sched_avg resulting in 52 hole before `se'. > > > > > > This is the case on 64bit-SMP architectures. > > > The 52 byte hole can be avoided by moving alloc_tag away where it > > > currently resides. > > > > > > Move alloc_tag to the end of task_struct. There is likely a hole before > > > `thread' due to its alignment requirement and the previous members are > > > likely to be already pointer-aligned. > > > > > > Fixes: 22d407b164ff7 ("lib: add allocation tagging support for memory allocation profiling") > > > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > > > > Could we please get this merged and worry about possible performance > > regression later? Or once there is a test case or an idea where this > > pointer might fit better but clearly the current situation is worse. > > Sebastian, I gave you review feedback on your patch; if you can > incorporate it into a new version your patch will sail right in. Kent, you said you didn't want it where it currently is. Fine. You said you want it at the front next to `flags'. This isn't going to work since there is no space left. You didn't make another suggestion or say how to make room. I don't impose this version, I just don't see a better way right now. Sebastian
On Fri, Jun 28, 2024 at 09:55:53PM +0200, Sebastian Andrzej Siewior wrote: > On 2024-06-28 15:35:38 [-0400], Kent Overstreet wrote: > > On Fri, Jun 28, 2024 at 11:49:44AM +0200, Sebastian Andrzej Siewior wrote: > > > On 2024-06-21 12:27:52 [+0200], To linux-mm@kvack.org wrote: > > > > The alloc_tag member has been added to task_struct at the very > > > > beginning. This is a pointer and on 64bit architectures it forces 4 byte > > > > padding after `ptrace' and then forcing another another 4 byte padding > > > > after `on_cpu'. A few members later, `se' requires a cacheline aligned > > > > due to struct sched_avg resulting in 52 hole before `se'. > > > > > > > > This is the case on 64bit-SMP architectures. > > > > The 52 byte hole can be avoided by moving alloc_tag away where it > > > > currently resides. > > > > > > > > Move alloc_tag to the end of task_struct. There is likely a hole before > > > > `thread' due to its alignment requirement and the previous members are > > > > likely to be already pointer-aligned. > > > > > > > > Fixes: 22d407b164ff7 ("lib: add allocation tagging support for memory allocation profiling") > > > > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > > > > > > Could we please get this merged and worry about possible performance > > > regression later? Or once there is a test case or an idea where this > > > pointer might fit better but clearly the current situation is worse. > > > > Sebastian, I gave you review feedback on your patch; if you can > > incorporate it into a new version your patch will sail right in. > > Kent, you said you didn't want it where it currently is. Fine. You said > you want it at the front next to `flags'. This isn't going to work since > there is no space left. You didn't make another suggestion or say how to > make room. It doesn't need to be on the exact same cacheline, just as near as you can get it.
diff --git a/include/linux/sched.h b/include/linux/sched.h index 61591ac6eab6d..d76c61510ef1d 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -770,10 +770,6 @@ struct task_struct { unsigned int flags; unsigned int ptrace; -#ifdef CONFIG_MEM_ALLOC_PROFILING - struct alloc_tag *alloc_tag; -#endif - #ifdef CONFIG_SMP int on_cpu; struct __call_single_node wake_entry; @@ -1553,6 +1549,9 @@ struct task_struct { #ifdef CONFIG_USER_EVENTS struct user_event_mm *user_event_mm; #endif +#ifdef CONFIG_MEM_ALLOC_PROFILING + struct alloc_tag *alloc_tag; +#endif /* * New fields for task_struct should be added above here, so that
The alloc_tag member has been added to task_struct at the very beginning. This is a pointer and on 64bit architectures it forces 4 byte padding after `ptrace' and then forcing another another 4 byte padding after `on_cpu'. A few members later, `se' requires a cacheline aligned due to struct sched_avg resulting in 52 hole before `se'. This is the case on 64bit-SMP architectures. The 52 byte hole can be avoided by moving alloc_tag away where it currently resides. Move alloc_tag to the end of task_struct. There is likely a hole before `thread' due to its alignment requirement and the previous members are likely to be already pointer-aligned. Fixes: 22d407b164ff7 ("lib: add allocation tagging support for memory allocation profiling") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> --- include/linux/sched.h | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)