diff mbox series

[RFC,3/3] mm, proc: report PR_SET_THP_DISABLE in proc

Message ID 20181120103515.25280-4-mhocko@kernel.org (mailing list archive)
State New, archived
Headers show
Series THP eligibility reporting via proc | expand

Commit Message

Michal Hocko Nov. 20, 2018, 10:35 a.m. UTC
From: Michal Hocko <mhocko@suse.com>

David Rientjes has reported that 1860033237d4 ("mm: make
PR_SET_THP_DISABLE immediately active") has changed the way how
we report THPable VMAs to the userspace. Their monitoring tool is
triggering false alarms on PR_SET_THP_DISABLE tasks because it considers
an insufficient THP usage as a memory fragmentation resp. memory
pressure issue.

Before the said commit each newly created VMA inherited VM_NOHUGEPAGE
flag and that got exposed to the userspace via /proc/<pid>/smaps file.
This implementation had its downsides as explained in the commit message
but it is true that the userspace doesn't have any means to query for
the process wide THP enabled/disabled status.

PR_SET_THP_DISABLE is a process wide flag so it makes a lot of sense
to export in the process wide context rather than per-vma. Introduce
a new field to /proc/<pid>/status which export this status.  If
PR_SET_THP_DISABLE is used then it reports false same as when the THP is
not compiled in. It doesn't consider the global THP status because we
already export that information via sysfs

Fixes: 1860033237d4 ("mm: make PR_SET_THP_DISABLE immediately active")
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 Documentation/filesystems/proc.txt |  3 +++
 fs/proc/array.c                    | 10 ++++++++++
 2 files changed, 13 insertions(+)

Comments

Michal Hocko Nov. 20, 2018, 11:42 a.m. UTC | #1
Damn, David somehow didn't make it to the CC list. Sorry about that.

On Tue 20-11-18 11:35:15, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> David Rientjes has reported that 1860033237d4 ("mm: make
> PR_SET_THP_DISABLE immediately active") has changed the way how
> we report THPable VMAs to the userspace. Their monitoring tool is
> triggering false alarms on PR_SET_THP_DISABLE tasks because it considers
> an insufficient THP usage as a memory fragmentation resp. memory
> pressure issue.
> 
> Before the said commit each newly created VMA inherited VM_NOHUGEPAGE
> flag and that got exposed to the userspace via /proc/<pid>/smaps file.
> This implementation had its downsides as explained in the commit message
> but it is true that the userspace doesn't have any means to query for
> the process wide THP enabled/disabled status.
> 
> PR_SET_THP_DISABLE is a process wide flag so it makes a lot of sense
> to export in the process wide context rather than per-vma. Introduce
> a new field to /proc/<pid>/status which export this status.  If
> PR_SET_THP_DISABLE is used then it reports false same as when the THP is
> not compiled in. It doesn't consider the global THP status because we
> already export that information via sysfs
> 
> Fixes: 1860033237d4 ("mm: make PR_SET_THP_DISABLE immediately active")
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  Documentation/filesystems/proc.txt |  3 +++
>  fs/proc/array.c                    | 10 ++++++++++
>  2 files changed, 13 insertions(+)
> 
> diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
> index 06562bab509a..7995e9322889 100644
> --- a/Documentation/filesystems/proc.txt
> +++ b/Documentation/filesystems/proc.txt
> @@ -182,6 +182,7 @@ For example, to get the status information of a process, all you have to do is
>    VmSwap:        0 kB
>    HugetlbPages:          0 kB
>    CoreDumping:    0
> +  THP_enabled:	  1
>    Threads:        1
>    SigQ:   0/28578
>    SigPnd: 0000000000000000
> @@ -256,6 +257,8 @@ Table 1-2: Contents of the status files (as of 4.8)
>   HugetlbPages                size of hugetlb memory portions
>   CoreDumping                 process's memory is currently being dumped
>                               (killing the process may lead to a corrupted core)
> + THP_enabled		     process is allowed to use THP (returns 0 when
> +			     PR_SET_THP_DISABLE is set on the process
>   Threads                     number of threads
>   SigQ                        number of signals queued/max. number for queue
>   SigPnd                      bitmap of pending signals for the thread
> diff --git a/fs/proc/array.c b/fs/proc/array.c
> index 0ceb3b6b37e7..9d428d5a0ac8 100644
> --- a/fs/proc/array.c
> +++ b/fs/proc/array.c
> @@ -392,6 +392,15 @@ static inline void task_core_dumping(struct seq_file *m, struct mm_struct *mm)
>  	seq_putc(m, '\n');
>  }
>  
> +static inline void task_thp_status(struct seq_file *m, struct mm_struct *mm)
> +{
> +	bool thp_enabled = IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE);
> +
> +	if (thp_enabled)
> +		thp_enabled = !test_bit(MMF_DISABLE_THP, &mm->flags);
> +	seq_printf(m, "THP_enabled:\t%d\n", thp_enabled);
> +}
> +
>  int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
>  			struct pid *pid, struct task_struct *task)
>  {
> @@ -406,6 +415,7 @@ int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
>  	if (mm) {
>  		task_mem(m, mm);
>  		task_core_dumping(m, mm);
> +		task_thp_status(m, mm);
>  		mmput(mm);
>  	}
>  	task_sig(m, task);
> -- 
> 2.19.1
Vlastimil Babka Nov. 23, 2018, 3:49 p.m. UTC | #2
On 11/20/18 11:35 AM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> David Rientjes has reported that 1860033237d4 ("mm: make
> PR_SET_THP_DISABLE immediately active") has changed the way how
> we report THPable VMAs to the userspace. Their monitoring tool is
> triggering false alarms on PR_SET_THP_DISABLE tasks because it considers
> an insufficient THP usage as a memory fragmentation resp. memory
> pressure issue.
> 
> Before the said commit each newly created VMA inherited VM_NOHUGEPAGE
> flag and that got exposed to the userspace via /proc/<pid>/smaps file.
> This implementation had its downsides as explained in the commit message
> but it is true that the userspace doesn't have any means to query for
> the process wide THP enabled/disabled status.
> 
> PR_SET_THP_DISABLE is a process wide flag so it makes a lot of sense
> to export in the process wide context rather than per-vma. Introduce

Agreed.

> a new field to /proc/<pid>/status which export this status.  If
> PR_SET_THP_DISABLE is used then it reports false same as when the THP is
> not compiled in. It doesn't consider the global THP status because we
> already export that information via sysfs
> 
> Fixes: 1860033237d4 ("mm: make PR_SET_THP_DISABLE immediately active")
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>
William Kucharski Nov. 27, 2018, 12:33 a.m. UTC | #3
This determines whether the page can theoretically be THP-mapped , but is the intention to also check for proper alignment and/or preexisting PAGESIZE page cache mappings for the address range?

I'm having to deal with both these issues in the text page THP prototype I've been working on for some time now.

    Thanks,
         William Kucharski
Michal Hocko Nov. 27, 2018, 1:17 p.m. UTC | #4
On Mon 26-11-18 17:33:32, William Kucharski wrote:
> 
> 
> This determines whether the page can theoretically be THP-mapped , but
> is the intention to also check for proper alignment and/or preexisting
> PAGESIZE page cache mappings for the address range?

This is only about the process wide flag to disable THP. I do not see
how this can be alighnement related. I suspect you wanted to ask in the
smaps patch?

> I'm having to deal with both these issues in the text page THP
> prototype I've been working on for some time now.

Could you be more specific about the issue and how the alignment comes
into the game? The only thing I can think of is to not report VMAs
smaller than the THP as eligible. Is this what you are looking for?
William Kucharski Nov. 27, 2018, 2:50 p.m. UTC | #5
> On Nov 27, 2018, at 6:17 AM, Michal Hocko <mhocko@kernel.org> wrote:
> 
> This is only about the process wide flag to disable THP. I do not see
> how this can be alighnement related. I suspect you wanted to ask in the
> smaps patch?

No, answered below.

> 
>> I'm having to deal with both these issues in the text page THP
>> prototype I've been working on for some time now.
> 
> Could you be more specific about the issue and how the alignment comes
> into the game? The only thing I can think of is to not report VMAs
> smaller than the THP as eligible. Is this what you are looking for?

Basically, if the faulting VA is one that cannot be mapped with a THP
due to alignment or size constraints, it may be "eligible" for THP
mapping but ultimately can't be.

I was just double checking that this was meant to be more of a check done
before code elsewhere performs additional checks and does the actual THP
mapping, not an all-encompassing go/no go check for THP mapping.

    Thanks,
         William Kucharski
Michal Hocko Nov. 27, 2018, 4:25 p.m. UTC | #6
On Tue 27-11-18 07:50:08, William Kucharski wrote:
> 
> 
> > On Nov 27, 2018, at 6:17 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > 
> > This is only about the process wide flag to disable THP. I do not see
> > how this can be alighnement related. I suspect you wanted to ask in the
> > smaps patch?
> 
> No, answered below.
> 
> > 
> >> I'm having to deal with both these issues in the text page THP
> >> prototype I've been working on for some time now.
> > 
> > Could you be more specific about the issue and how the alignment comes
> > into the game? The only thing I can think of is to not report VMAs
> > smaller than the THP as eligible. Is this what you are looking for?
> 
> Basically, if the faulting VA is one that cannot be mapped with a THP
> due to alignment or size constraints, it may be "eligible" for THP
> mapping but ultimately can't be.
> 
> I was just double checking that this was meant to be more of a check done
> before code elsewhere performs additional checks and does the actual THP
> mapping, not an all-encompassing go/no go check for THP mapping.

I am still not sure I follow you completely here. This just reports
per-task eligibility. The system wide eligibility is reported via sysfs
and the per vma eligibility is reported via /proc/<pid>/smaps.
Vlastimil Babka Nov. 27, 2018, 4:50 p.m. UTC | #7
On 11/27/18 3:50 PM, William Kucharski wrote:
> 
> I was just double checking that this was meant to be more of a check done
> before code elsewhere performs additional checks and does the actual THP
> mapping, not an all-encompassing go/no go check for THP mapping.

Yes, the code doing the actual mapping is still checking also alignment etc.
William Kucharski Nov. 27, 2018, 5:06 p.m. UTC | #8
> On Nov 27, 2018, at 9:50 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
> 
> On 11/27/18 3:50 PM, William Kucharski wrote:
>> 
>> I was just double checking that this was meant to be more of a check done
>> before code elsewhere performs additional checks and does the actual THP
>> mapping, not an all-encompassing go/no go check for THP mapping.
> 
> Yes, the code doing the actual mapping is still checking also alignment etc.

Thanks, yes, that is what I was getting at.
diff mbox series

Patch

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 06562bab509a..7995e9322889 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -182,6 +182,7 @@  For example, to get the status information of a process, all you have to do is
   VmSwap:        0 kB
   HugetlbPages:          0 kB
   CoreDumping:    0
+  THP_enabled:	  1
   Threads:        1
   SigQ:   0/28578
   SigPnd: 0000000000000000
@@ -256,6 +257,8 @@  Table 1-2: Contents of the status files (as of 4.8)
  HugetlbPages                size of hugetlb memory portions
  CoreDumping                 process's memory is currently being dumped
                              (killing the process may lead to a corrupted core)
+ THP_enabled		     process is allowed to use THP (returns 0 when
+			     PR_SET_THP_DISABLE is set on the process
  Threads                     number of threads
  SigQ                        number of signals queued/max. number for queue
  SigPnd                      bitmap of pending signals for the thread
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 0ceb3b6b37e7..9d428d5a0ac8 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -392,6 +392,15 @@  static inline void task_core_dumping(struct seq_file *m, struct mm_struct *mm)
 	seq_putc(m, '\n');
 }
 
+static inline void task_thp_status(struct seq_file *m, struct mm_struct *mm)
+{
+	bool thp_enabled = IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE);
+
+	if (thp_enabled)
+		thp_enabled = !test_bit(MMF_DISABLE_THP, &mm->flags);
+	seq_printf(m, "THP_enabled:\t%d\n", thp_enabled);
+}
+
 int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
 			struct pid *pid, struct task_struct *task)
 {
@@ -406,6 +415,7 @@  int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
 	if (mm) {
 		task_mem(m, mm);
 		task_core_dumping(m, mm);
+		task_thp_status(m, mm);
 		mmput(mm);
 	}
 	task_sig(m, task);