Message ID | 20210718214134.2619099-1-surenb@google.com (mailing list archive) |
---|---|
State | New |
Series | [v2,1/3] mm, oom: move task_will_free_mem up in the file to be used in process_mrelease |
On 18.07.21 23:41, Suren Baghdasaryan wrote:
> process_mrelease needs to be added in the CONFIG_MMU-dependent block which
> comes before __task_will_free_mem and task_will_free_mem. Move these
> functions before this block so that new process_mrelease syscall can use
> them.
>
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> ---
> changes in v2:
> - Fixed build error when CONFIG_MMU=n, reported by kernel test robot. This
>   required moving task_will_free_mem implemented in the first patch
> - Renamed process_reap to process_mrelease, per majority of votes
> - Replaced "dying process" with "process which was sent a SIGKILL signal" in
>   the manual page text, per Florian Weimer
> - Added ERRORS section in the manual page text
> - Resolved conflicts in syscall numbers caused by the new memfd_secret syscall
> - Separated boilerplate code wiring-up the new syscall into a separate patch
>   to facilitate the review process
>
>  mm/oom_kill.c | 150 +++++++++++++++++++++++++-------------------------
>  1 file changed, 75 insertions(+), 75 deletions(-)

TBH, I really dislike this move as it makes git blame a lot harder without
any real benefit.

Can't you just use prototypes to avoid the move for now in patch #2?

static bool task_will_free_mem(struct task_struct *task);

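For readers less familiar with the prototype approach suggested above, here is a minimal userspace sketch. It assumes a hypothetical `struct task` and a hypothetical `early_caller()` standing in for code inside the CONFIG_MMU block; neither comes from mm/oom_kill.c. It only illustrates that a forward declaration lets earlier code call a function whose definition stays further down the file.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct task_struct. */
struct task {
	bool exiting;
};

/* Forward declaration (prototype): code above the definition may now call it. */
static bool task_will_free_mem(const struct task *task);

/*
 * Hypothetical early caller, standing in for code that lives in the
 * CONFIG_MMU block and therefore precedes the definition below.
 */
static int early_caller(const struct task *task)
{
	return task_will_free_mem(task) ? 0 : -1;
}

/* The definition stays where it was, so no lines are moved in the file. */
static bool task_will_free_mem(const struct task *task)
{
	return task->exiting;
}
```

Calling `early_caller()` with a task whose `exiting` field is true returns 0, otherwise -1. The trade-off discussed in this thread is that the prototype adds one line of duplication but leaves the existing function bodies untouched in the file history.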
On Tue, Jul 20, 2021 at 5:44 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 18.07.21 23:41, Suren Baghdasaryan wrote:
> > [...]
>
> TBH, I really dislike this move as it makes git blame a lot harder without
> any real benefit.
>
> Can't you just use prototypes to avoid the move for now in patch #2?
>
> static bool task_will_free_mem(struct task_struct *task);

Sure, I can use a forward-declaration. Just thought this would be cleaner.
Will change in the next rev.
Thanks!

On Tue, 20 Jul 2021 14:43:52 +0200 David Hildenbrand <david@redhat.com> wrote:

> On 18.07.21 23:41, Suren Baghdasaryan wrote:
> > [...]
>
> TBH, I really dislike this move as it makes git blame a lot harder without
> any real benefit.
>
> Can't you just use prototypes to avoid the move for now in patch #2?
>
> static bool task_will_free_mem(struct task_struct *task);

This change makes the code better - it's silly to be adding forward
declarations just because the functions are in the wrong place.

If that messes up git-blame then let's come up with better tooling
rather than suffering poorer kernel code because the tools aren't doing
what we want of them. Surely?

On 21.07.21 01:07, Andrew Morton wrote:
> On Tue, 20 Jul 2021 14:43:52 +0200 David Hildenbrand <david@redhat.com> wrote:
>
> [...]
>
> This change makes the code better - it's silly to be adding forward
> declarations just because the functions are in the wrong place.

I'd really love to learn what "better" here means and if it's rather
subjective. When it comes to navigating the code, we do have established
tools for that (ctags), and personally I couldn't care less where
exactly in a file the code is located.

Sure, ending up with a forward-declaration for every function might not
be what we want ;)

> If that messes up git-blame then let's come up with better tooling
> rather than suffering poorer kernel code because the tools aren't doing
> what we want of them. Surely?

I don't agree that what we get is "poorer kernel code" in this very
instance; I can understand that we avoid forward-declarations when
moving smallish functions. But moving two functions with 75 LOC is a bit
too much for my taste at least -- speaking as someone who cares about
easy backports and git-blame.

Anyhow, just my 2 cents.

On Wed, Jul 21, 2021 at 12:30 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 21.07.21 01:07, Andrew Morton wrote:
> > [...]
> >
> > If that messes up git-blame then let's come up with better tooling
> > rather than suffering poorer kernel code because the tools aren't doing
> > what we want of them. Surely?
>
> I don't agree that what we get is "poorer kernel code" in this very
> instance; I can understand that we avoid forward-declarations when
> moving smallish functions. But moving two functions with 75 LOC is a bit
> too much for my taste at least -- speaking as someone who cares about
> easy backports and git-blame.

There is a third alternative here to have process_mrelease() at the
end of the file with its own #ifdef CONFIG_MMU block, maybe even
embedded in the function like this:

int process_mrelease(int pidfd, unsigned int flags)
{
#ifdef CONFIG_MMU
        ...
#else
        return -ENOSYS;
#endif
}

This would not require moving other functions.
Would that be better than the current approach or the forward declaration?

>
> Anyhow, just my 2 cents.

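The third alternative described above can be sketched in plain C. This is a userspace illustration, not the code that was eventually merged. `process_mrelease_sketch()` and `do_mrelease()` are hypothetical names, and CONFIG_MMU, which the kernel build normally sets through Kconfig, would here have to be supplied by hand (for example with `-DCONFIG_MMU`). The sketch returns a negative errno value, which is the usual in-kernel convention.

```c
#include <errno.h>

#ifdef CONFIG_MMU
/* Hypothetical stand-in for the real memory-reaping work. */
static int do_mrelease(int pidfd, unsigned int flags)
{
	(void)pidfd;
	/* Assumption for this sketch: no flags are defined, so reject any. */
	return flags ? -EINVAL : 0;
}
#endif

/* Hypothetical userspace model of the syscall body. */
int process_mrelease_sketch(int pidfd, unsigned int flags)
{
#ifdef CONFIG_MMU
	return do_mrelease(pidfd, flags);
#else
	/* Without an MMU the feature is compiled out entirely. */
	(void)pidfd;
	(void)flags;
	return -ENOSYS;
#endif
}
```

Because the conditional sits inside the function body, the function can live at the end of the file, after every helper it needs, and nothing else has to move.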
On 21.07.21 17:33, Suren Baghdasaryan wrote:
> [...]
>
> There is a third alternative here to have process_mrelease() at the
> end of the file with its own #ifdef CONFIG_MMU block, maybe even
> embedded in the function like this:
>
> int process_mrelease(int pidfd, unsigned int flags)
> {
> #ifdef CONFIG_MMU
>         ...
> #else
>         return -ENOSYS;
> #endif
> }
>
> This would not require moving other functions.
> Would that be better than the current approach or the forward declaration?

IMHO that could be an easy, possible alternative.

On Wed, Jul 21, 2021 at 9:13 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 21.07.21 17:33, Suren Baghdasaryan wrote:
> > [...]
> >
> > This would not require moving other functions.
> > Would that be better than the current approach or the forward declaration?
>
> IMHO that could be an easy, possible alternative.

Andrew, others? Should I follow this path instead?

On Wed, 21 Jul 2021 13:19:35 -0700 Suren Baghdasaryan <surenb@google.com> wrote:

> > > This would not require moving other functions.
> > > Would that be better than the current approach or the forward declaration?
> >
> > IMHO that could be an easy, possible alternative.
>
> Andrew, others? Should I follow this path instead?

Whatever you prefer ;)

On Wed, Jul 21, 2021 at 1:51 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Wed, 21 Jul 2021 13:19:35 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
>
> > > > This would not require moving other functions.
> > > > Would that be better than the current approach or the forward declaration?
> > >
> > > IMHO that could be an easy, possible alternative.
> >
> > Andrew, others? Should I follow this path instead?
>
> Whatever you prefer ;)

I understand David's concern too well to ignore it, so I prefer to
follow this middle-ground approach if you don't mind :)

On Wed, Jul 21, 2021 at 1:59 PM Suren Baghdasaryan <surenb@google.com> wrote:
>
> On Wed, Jul 21, 2021 at 1:51 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > [...]
> >
> > Whatever you prefer ;)
>
> I understand David's concern too well to ignore it, so I prefer to
> follow this middle-ground approach if you don't mind :)

v3 with the refactoring is posted at
https://lore.kernel.org/patchwork/project/lkml/list/?series=509230

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index c729a4c4a1ac..d04a13dc9fde 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -501,6 +501,81 @@ bool process_shares_mm(struct task_struct *p, struct mm_struct *mm)
 	return false;
 }
 
+static inline bool __task_will_free_mem(struct task_struct *task)
+{
+	struct signal_struct *sig = task->signal;
+
+	/*
+	 * A coredumping process may sleep for an extended period in exit_mm(),
+	 * so the oom killer cannot assume that the process will promptly exit
+	 * and release memory.
+	 */
+	if (sig->flags & SIGNAL_GROUP_COREDUMP)
+		return false;
+
+	if (sig->flags & SIGNAL_GROUP_EXIT)
+		return true;
+
+	if (thread_group_empty(task) && (task->flags & PF_EXITING))
+		return true;
+
+	return false;
+}
+
+/*
+ * Checks whether the given task is dying or exiting and likely to
+ * release its address space. This means that all threads and processes
+ * sharing the same mm have to be killed or exiting.
+ * Caller has to make sure that task->mm is stable (hold task_lock or
+ * it operates on the current).
+ */
+static bool task_will_free_mem(struct task_struct *task)
+{
+	struct mm_struct *mm = task->mm;
+	struct task_struct *p;
+	bool ret = true;
+
+	/*
+	 * Skip tasks without mm because it might have passed its exit_mm and
+	 * exit_oom_victim. oom_reaper could have rescued that but do not rely
+	 * on that for now. We can consider find_lock_task_mm in future.
+	 */
+	if (!mm)
+		return false;
+
+	if (!__task_will_free_mem(task))
+		return false;
+
+	/*
+	 * This task has already been drained by the oom reaper so there are
+	 * only small chances it will free some more
+	 */
+	if (test_bit(MMF_OOM_SKIP, &mm->flags))
+		return false;
+
+	if (atomic_read(&mm->mm_users) <= 1)
+		return true;
+
+	/*
+	 * Make sure that all tasks which share the mm with the given tasks
+	 * are dying as well to make sure that a) nobody pins its mm and
+	 * b) the task is also reapable by the oom reaper.
+	 */
+	rcu_read_lock();
+	for_each_process(p) {
+		if (!process_shares_mm(p, mm))
+			continue;
+		if (same_thread_group(task, p))
+			continue;
+		ret = __task_will_free_mem(p);
+		if (!ret)
+			break;
+	}
+	rcu_read_unlock();
+
+	return ret;
+}
+
 #ifdef CONFIG_MMU
 /*
  * OOM Reaper kernel thread which tries to reap the memory used by the OOM
@@ -781,81 +856,6 @@ bool oom_killer_disable(signed long timeout)
 	return true;
 }
 
-static inline bool __task_will_free_mem(struct task_struct *task)
-{
-	struct signal_struct *sig = task->signal;
-
-	/*
-	 * A coredumping process may sleep for an extended period in exit_mm(),
-	 * so the oom killer cannot assume that the process will promptly exit
-	 * and release memory.
-	 */
-	if (sig->flags & SIGNAL_GROUP_COREDUMP)
-		return false;
-
-	if (sig->flags & SIGNAL_GROUP_EXIT)
-		return true;
-
-	if (thread_group_empty(task) && (task->flags & PF_EXITING))
-		return true;
-
-	return false;
-}
-
-/*
- * Checks whether the given task is dying or exiting and likely to
- * release its address space. This means that all threads and processes
- * sharing the same mm have to be killed or exiting.
- * Caller has to make sure that task->mm is stable (hold task_lock or
- * it operates on the current).
- */
-static bool task_will_free_mem(struct task_struct *task)
-{
-	struct mm_struct *mm = task->mm;
-	struct task_struct *p;
-	bool ret = true;
-
-	/*
-	 * Skip tasks without mm because it might have passed its exit_mm and
-	 * exit_oom_victim. oom_reaper could have rescued that but do not rely
-	 * on that for now. We can consider find_lock_task_mm in future.
-	 */
-	if (!mm)
-		return false;
-
-	if (!__task_will_free_mem(task))
-		return false;
-
-	/*
-	 * This task has already been drained by the oom reaper so there are
-	 * only small chances it will free some more
-	 */
-	if (test_bit(MMF_OOM_SKIP, &mm->flags))
-		return false;
-
-	if (atomic_read(&mm->mm_users) <= 1)
-		return true;
-
-	/*
-	 * Make sure that all tasks which share the mm with the given tasks
-	 * are dying as well to make sure that a) nobody pins its mm and
-	 * b) the task is also reapable by the oom reaper.
-	 */
-	rcu_read_lock();
-	for_each_process(p) {
-		if (!process_shares_mm(p, mm))
-			continue;
-		if (same_thread_group(task, p))
-			continue;
-		ret = __task_will_free_mem(p);
-		if (!ret)
-			break;
-	}
-	rcu_read_unlock();
-
-	return ret;
-}
-
 static void __oom_kill_process(struct task_struct *victim, const char *message)
 {
 	struct task_struct *p;

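The order of checks in __task_will_free_mem() matters: the coredump test comes first and overrides the group-exit test. A reduced userspace model may make that easier to see. Everything here is an assumption for illustration only: `struct task_model`, the `MODEL_*` flag values, and `model_task_will_free_mem()` are hypothetical and do not match the kernel's real types or constants.

```c
#include <stdbool.h>

/* Hypothetical flag values; the kernel's real constants differ. */
enum {
	MODEL_GROUP_EXIT     = 1 << 0,
	MODEL_GROUP_COREDUMP = 1 << 1,
	MODEL_PF_EXITING     = 1 << 2,
};

/* Hypothetical, heavily reduced stand-in for struct task_struct. */
struct task_model {
	unsigned int signal_flags;  /* models task->signal->flags */
	unsigned int task_flags;    /* models task->flags */
	bool thread_group_empty;    /* models thread_group_empty(task) */
};

/* Mirrors the check order of __task_will_free_mem() in the diff above. */
static bool model_task_will_free_mem(const struct task_model *t)
{
	/* A coredumping task may sleep for a long time: do not count on it. */
	if (t->signal_flags & MODEL_GROUP_COREDUMP)
		return false;

	/* The whole thread group is exiting. */
	if (t->signal_flags & MODEL_GROUP_EXIT)
		return true;

	/* A single-threaded task that is already exiting. */
	if (t->thread_group_empty && (t->task_flags & MODEL_PF_EXITING))
		return true;

	return false;
}
```

In this model a task with both the coredump and group-exit flags set is reported as not freeing memory, while an exiting thread that still has siblings in its thread group is also rejected unless the group-exit flag is set.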
process_mrelease needs to be added in the CONFIG_MMU-dependent block which
comes before __task_will_free_mem and task_will_free_mem. Move these
functions before this block so that new process_mrelease syscall can use
them.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
changes in v2:
- Fixed build error when CONFIG_MMU=n, reported by kernel test robot. This
  required moving task_will_free_mem implemented in the first patch
- Renamed process_reap to process_mrelease, per majority of votes
- Replaced "dying process" with "process which was sent a SIGKILL signal" in
  the manual page text, per Florian Weimer
- Added ERRORS section in the manual page text
- Resolved conflicts in syscall numbers caused by the new memfd_secret syscall
- Separated boilerplate code wiring-up the new syscall into a separate patch
  to facilitate the review process

 mm/oom_kill.c | 150 +++++++++++++++++++++++++-------------------------
 1 file changed, 75 insertions(+), 75 deletions(-)