diff mbox

[v7,2/2] Refactor part of the oom report in dump_header

Message ID 1527940734-35161-2-git-send-email-ufo19890607@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

禹舟键 June 2, 2018, 11:58 a.m. UTC
From: yuzhoujian <yuzhoujian@didichuxing.com>

The dump_header does not print the memcg's name when the system
oom happened, so users cannot locate the certain container which
contains the task that has been killed by the oom killer.

I follow the advices of David Rientjes and Michal Hocko, and refactor
part of the oom report in a backwards compatible way. After this patch,
users can get the memcg's path from the oom report and check the certain
container more quickly.

Below is the part of the oom report in the dmesg
...
[  142.158316] panic cpuset=/ mems_allowed=0-1
[  142.158983] CPU: 15 PID: 8682 Comm: panic Not tainted 4.17.0-rc6+ #13
[  142.159659] Hardware name: Inspur SA5212M4/YZMB-00370-107, BIOS 4.1.10 11/14/2016
[  142.160342] Call Trace:
[  142.161037]  dump_stack+0x78/0xb3
[  142.161734]  dump_header+0x7d/0x334
[  142.162433]  oom_kill_process+0x228/0x490
[  142.163126]  ? oom_badness+0x2a/0x130
[  142.163821]  out_of_memory+0xf0/0x280
[  142.164532]  __alloc_pages_slowpath+0x711/0xa07
[  142.165241]  __alloc_pages_nodemask+0x23f/0x260
[  142.165947]  alloc_pages_vma+0x73/0x180
[  142.166665]  do_anonymous_page+0xed/0x4e0
[  142.167388]  __handle_mm_fault+0xbd2/0xe00
[  142.168114]  handle_mm_fault+0x116/0x250
[  142.168841]  __do_page_fault+0x233/0x4d0
[  142.169567]  do_page_fault+0x32/0x130
[  142.170303]  ? page_fault+0x8/0x30
[  142.171036]  page_fault+0x1e/0x30
[  142.171764] RIP: 0033:0x7f403000a860
[  142.172517] RSP: 002b:00007ffc9f745c28 EFLAGS: 00010206
[  142.173268] RAX: 00007f3f6fd7d000 RBX: 0000000000000000 RCX: 00007f3f7f5cd000
[  142.174040] RDX: 00007f3fafd7d000 RSI: 0000000000000000 RDI: 00007f3f6fd7d000
[  142.174806] RBP: 00007ffc9f745c50 R08: ffffffffffffffff R09: 0000000000000000
[  142.175623] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000400490
[  142.176542] R13: 00007ffc9f745d30 R14: 0000000000000000 R15: 0000000000000000
[  142.177709] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),origin_memcg=(null),kill_memcg=/test/test1/test2,task=panic,pid= 8622,uid=    0
...

Changes since v6:
- divide the patch v5 into two parts. One part is to add an array of const char and
put enum oom_constraint into the memcontrol.h; the other is will refactor the output
in the dump_header.
- limit the memory usage for the static char array by using NAME_MAX in the mem_cgroup_print_oom_context.
- eliminate the spurious spaces in the oom's output and fix the spelling of "constrain".

Changes since v5:
- add an array of const char for each constraint.
- replace all of the pr_cont with a single line print of the pr_info.
- put enum oom_constraint into the memcontrol.c file for printing oom constraint.

Changes since v4:
- rename the helper's name to mem_cgroup_print_oom_context.
- rename the mem_cgroup_print_oom_info to mem_cgroup_print_oom_meminfo.
- add the constrain info in the dump_header.

Changes since v3:
- rename the helper's name to mem_cgroup_print_oom_memcg_name.
- add the rcu lock held to the helper.
- remove the print info of memcg's name in mem_cgroup_print_oom_info.

Changes since v2:
- add the mem_cgroup_print_memcg_name helper to print the memcg's
  name which contains the task that will be killed by the oom-killer.

Changes since v1:
- replace adding mem_cgroup_print_oom_info with printing the memcg's
  name only.

Signed-off-by: yuzhoujian <yuzhoujian@didichuxing.com>
---
 include/linux/memcontrol.h | 15 ++++++++++---
 mm/memcontrol.c            | 55 ++++++++++++++++++++++++++++++++--------------
 mm/oom_kill.c              |  5 +++--
 3 files changed, 53 insertions(+), 22 deletions(-)

Comments

Mike Rapoport June 3, 2018, 12:49 p.m. UTC | #1
On Sat, Jun 02, 2018 at 07:58:52PM +0800, ufo19890607@gmail.com wrote:
> From: yuzhoujian <yuzhoujian@didichuxing.com>
> 
> The dump_header does not print the memcg's name when the system
> oom happened, so users cannot locate the certain container which
> contains the task that has been killed by the oom killer.
> 
> I follow the advices of David Rientjes and Michal Hocko, and refactor
> part of the oom report in a backwards compatible way. After this patch,
> users can get the memcg's path from the oom report and check the certain
> container more quickly.
> 
> Below is the part of the oom report in the dmesg
> ...
> [  142.158316] panic cpuset=/ mems_allowed=0-1
> [  142.158983] CPU: 15 PID: 8682 Comm: panic Not tainted 4.17.0-rc6+ #13
> [  142.159659] Hardware name: Inspur SA5212M4/YZMB-00370-107, BIOS 4.1.10 11/14/2016
> [  142.160342] Call Trace:
> [  142.161037]  dump_stack+0x78/0xb3
> [  142.161734]  dump_header+0x7d/0x334
> [  142.162433]  oom_kill_process+0x228/0x490
> [  142.163126]  ? oom_badness+0x2a/0x130
> [  142.163821]  out_of_memory+0xf0/0x280
> [  142.164532]  __alloc_pages_slowpath+0x711/0xa07
> [  142.165241]  __alloc_pages_nodemask+0x23f/0x260
> [  142.165947]  alloc_pages_vma+0x73/0x180
> [  142.166665]  do_anonymous_page+0xed/0x4e0
> [  142.167388]  __handle_mm_fault+0xbd2/0xe00
> [  142.168114]  handle_mm_fault+0x116/0x250
> [  142.168841]  __do_page_fault+0x233/0x4d0
> [  142.169567]  do_page_fault+0x32/0x130
> [  142.170303]  ? page_fault+0x8/0x30
> [  142.171036]  page_fault+0x1e/0x30
> [  142.171764] RIP: 0033:0x7f403000a860
> [  142.172517] RSP: 002b:00007ffc9f745c28 EFLAGS: 00010206
> [  142.173268] RAX: 00007f3f6fd7d000 RBX: 0000000000000000 RCX: 00007f3f7f5cd000
> [  142.174040] RDX: 00007f3fafd7d000 RSI: 0000000000000000 RDI: 00007f3f6fd7d000
> [  142.174806] RBP: 00007ffc9f745c50 R08: ffffffffffffffff R09: 0000000000000000
> [  142.175623] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000400490
> [  142.176542] R13: 00007ffc9f745d30 R14: 0000000000000000 R15: 0000000000000000
> [  142.177709] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),origin_memcg=(null),kill_memcg=/test/test1/test2,task=panic,pid= 8622,uid=    0
> ...
> 
> Changes since v6:
> - divide the patch v5 into two parts. One part is to add an array of const char and
> put enum oom_constraint into the memcontrol.h; the other is will refactor the output
> in the dump_header.
> - limit the memory usage for the static char array by using NAME_MAX in the mem_cgroup_print_oom_context.
> - eliminate the spurious spaces in the oom's output and fix the spelling of "constrain".
> 
> Changes since v5:
> - add an array of const char for each constraint.
> - replace all of the pr_cont with a single line print of the pr_info.
> - put enum oom_constraint into the memcontrol.c file for printing oom constraint.
> 
> Changes since v4:
> - rename the helper's name to mem_cgroup_print_oom_context.
> - rename the mem_cgroup_print_oom_info to mem_cgroup_print_oom_meminfo.
> - add the constrain info in the dump_header.
> 
> Changes since v3:
> - rename the helper's name to mem_cgroup_print_oom_memcg_name.
> - add the rcu lock held to the helper.
> - remove the print info of memcg's name in mem_cgroup_print_oom_info.
> 
> Changes since v2:
> - add the mem_cgroup_print_memcg_name helper to print the memcg's
>   name which contains the task that will be killed by the oom-killer.
> 
> Changes since v1:
> - replace adding mem_cgroup_print_oom_info with printing the memcg's
>   name only.
> 
> Signed-off-by: yuzhoujian <yuzhoujian@didichuxing.com>
> ---
>  include/linux/memcontrol.h | 15 ++++++++++---
>  mm/memcontrol.c            | 55 ++++++++++++++++++++++++++++++++--------------
>  mm/oom_kill.c              |  5 +++--
>  3 files changed, 53 insertions(+), 22 deletions(-)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 57311b6c4d67..1c7d5da1c827 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -478,8 +478,11 @@ void mem_cgroup_handle_over_high(void);
> 
>  unsigned long mem_cgroup_get_limit(struct mem_cgroup *memcg);
> 
> -void mem_cgroup_print_oom_info(struct mem_cgroup *memcg,
> -				struct task_struct *p);
> +void mem_cgroup_print_oom_context(struct mem_cgroup *memcg,
> +		struct task_struct *p, enum oom_constraint constraint,
> +		nodemask_t *nodemask);
> +
> +void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg);
> 
>  static inline void mem_cgroup_oom_enable(void)
>  {
> @@ -873,7 +876,13 @@ static inline unsigned long mem_cgroup_get_limit(struct mem_cgroup *memcg)
>  }
> 
>  static inline void
> -mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
> +mem_cgroup_print_oom_context(struct mem_cgroup *memcg, struct task_struct *p,
> +			enum oom_constraint constraint, nodemask_t *nodemask)
> +{
> +}
> +
> +static inline void
> +mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg)
>  {
>  }
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 2bd3df3d101a..fd1172938c8e 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1118,33 +1118,54 @@ static const char *const memcg1_stat_names[] = {
>  };
> 
>  #define K(x) ((x) << (PAGE_SHIFT-10))
> -/**
> - * mem_cgroup_print_oom_info: Print OOM information relevant to memory controller.
> - * @memcg: The memory cgroup that went over limit
> +/*
> + * mem_cgroup_print_oom_context: Print OOM context information relevant to
> + * memory controller, which includes allocation constraint, nodemask, origin
> + * memcg that has reached its limit, kill memcg that contains the killed
> + * process, killed process's command, pid and uid.

Please keep the brief description of the function actually brief and move
the detailed explanation after the parameters description.

> + * @memcg: The origin memory cgroup that went over limit
>   * @p: Task that is going to be killed
> + * @constraint: The allocation constraint
> + * @nodemask: The allocation nodemask
>   *
>   * NOTE: @memcg and @p's mem_cgroup can be different when hierarchy is
>   * enabled
>   */
> -void mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
> +void mem_cgroup_print_oom_context(struct mem_cgroup *memcg, struct task_struct *p,
> +			enum oom_constraint constraint, nodemask_t *nodemask)
>  {
> -	struct mem_cgroup *iter;
> -	unsigned int i;
> +	static char origin_memcg_name[NAME_MAX], kill_memcg_name[NAME_MAX];
> +	struct cgroup *origin_cgrp, *kill_cgrp;
> 
>  	rcu_read_lock();
> -
> -	if (p) {
> -		pr_info("Task in ");
> -		pr_cont_cgroup_path(task_cgroup(p, memory_cgrp_id));
> -		pr_cont(" killed as a result of limit of ");
> -	} else {
> -		pr_info("Memory limit reached of cgroup ");
> +	if (memcg) {
> +		origin_cgrp = memcg->css.cgroup;
> +		cgroup_path(origin_cgrp, origin_memcg_name, NAME_MAX);
>  	}
> -
> -	pr_cont_cgroup_path(memcg->css.cgroup);
> -	pr_cont("\n");
> -
> +	kill_cgrp = task_cgroup(p, memory_cgrp_id);
> +	cgroup_path(kill_cgrp, kill_memcg_name, NAME_MAX);
> +
> +	if (p)
> +		pr_info("oom-kill:constraint=%s,nodemask=%*pbl,origin_memcg=%s,kill_memcg=%s,task=%s,pid=%5d,uid=%5d\n",
> +			oom_constraint_text[constraint], nodemask_pr_args(nodemask),
> +			strlen(origin_memcg_name) ? origin_memcg_name : "(null)",
> +			kill_memcg_name, p->comm, p->pid,
> +			from_kuid(&init_user_ns, task_uid(p)));
> +	else
> +		pr_info("oom-kill:constraint=%s,nodemask=%*pbl,origin_memcg=%s,kill_memcg=%s\n",
> +			oom_constraint_text[constraint], nodemask_pr_args(nodemask),
> +			strlen(origin_memcg_name) ? origin_memcg_name : "(null)", kill_memcg_name);
>  	rcu_read_unlock();
> +}
> +
> +/**
> + * mem_cgroup_print_oom_info: Print OOM memory information relevant to memory controller.
> + * @memcg: The memory cgroup that went over limit
> + */
> +void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg)
> +{
> +	struct mem_cgroup *iter;
> +	unsigned int i;
> 
>  	pr_info("memory: usage %llukB, limit %llukB, failcnt %lu\n",
>  		K((u64)page_counter_read(&memcg->memory)),
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index c806cd656af6..af0efab8a9e5 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -16,7 +16,6 @@
>   *  for newbie kernel hackers. It features several pointers to major
>   *  kernel subsystems and hints as to where to find out what things do.
>   */
> -
>  #include <linux/oom.h>
>  #include <linux/mm.h>
>  #include <linux/err.h>
> @@ -414,6 +413,7 @@ static void dump_tasks(struct mem_cgroup *memcg, const nodemask_t *nodemask)
> 
>  static void dump_header(struct oom_control *oc, struct task_struct *p)
>  {
> +	enum oom_constraint constraint = constrained_alloc(oc);

The allocation constraint is detected by the dump_header() callers, why not
just use it here?

>  	pr_warn("%s invoked oom-killer: gfp_mask=%#x(%pGg), nodemask=%*pbl, order=%d, oom_score_adj=%hd\n",
>  		current->comm, oc->gfp_mask, &oc->gfp_mask,
>  		nodemask_pr_args(oc->nodemask), oc->order,
> @@ -423,8 +423,9 @@ static void dump_header(struct oom_control *oc, struct task_struct *p)
> 
>  	cpuset_print_current_mems_allowed();
>  	dump_stack();
> +	mem_cgroup_print_oom_context(oc->memcg, p, constraint, oc->nodemask);
>  	if (is_memcg_oom(oc))
> -		mem_cgroup_print_oom_info(oc->memcg, p);
> +		mem_cgroup_print_oom_meminfo(oc->memcg);
>  	else {
>  		show_mem(SHOW_MEM_FILTER_NODES, oc->nodemask);
>  		if (is_dump_unreclaim_slabs())
> -- 
> 2.14.1
>
Tetsuo Handa June 3, 2018, 2:45 p.m. UTC | #2
On 2018/06/02 20:58, yuzhoujian wrote:
> -void mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
> +void mem_cgroup_print_oom_context(struct mem_cgroup *memcg, struct task_struct *p,
> +			enum oom_constraint constraint, nodemask_t *nodemask)
>  {
> -	struct mem_cgroup *iter;
> -	unsigned int i;
> +	static char origin_memcg_name[NAME_MAX], kill_memcg_name[NAME_MAX];
> +	struct cgroup *origin_cgrp, *kill_cgrp;
>  
>  	rcu_read_lock();
> -
> -	if (p) {
> -		pr_info("Task in ");
> -		pr_cont_cgroup_path(task_cgroup(p, memory_cgrp_id));
> -		pr_cont(" killed as a result of limit of ");
> -	} else {
> -		pr_info("Memory limit reached of cgroup ");
> +	if (memcg) {
> +		origin_cgrp = memcg->css.cgroup;
> +		cgroup_path(origin_cgrp, origin_memcg_name, NAME_MAX);
>  	}
> -
> -	pr_cont_cgroup_path(memcg->css.cgroup);
> -	pr_cont("\n");
> -
> +	kill_cgrp = task_cgroup(p, memory_cgrp_id);
> +	cgroup_path(kill_cgrp, kill_memcg_name, NAME_MAX);
> +
> +	if (p)
> +		pr_info("oom-kill:constraint=%s,nodemask=%*pbl,origin_memcg=%s,kill_memcg=%s,task=%s,pid=%5d,uid=%5d\n",
> +			oom_constraint_text[constraint], nodemask_pr_args(nodemask),
> +			strlen(origin_memcg_name) ? origin_memcg_name : "(null)",

Since origin_memcg_name is printed for both memcg OOM and !memcg OOM,
it is strange that origin_memcg_name is updated only when memcg != NULL.
Have you really tested !memcg OOM case?
禹舟键 June 4, 2018, 1:58 a.m. UTC | #3
Hi Mike
> Please keep the brief description of the function actually brief and move the detailed explanation after the parameters description.
Thanks for your advice.

> The allocation constraint is detected by the dump_header() callers, why not just use it here?
David suggest that constraint need to be printed in the oom report, so
I add the enum variable in this function.

Thanks
Wind
禹舟键 June 4, 2018, 2:41 a.m. UTC | #4
Hi Tetsuo
> Since origin_memcg_name is printed for both memcg OOM and !memcg OOM, it is strange that origin_memcg_name is updated only when memcg != NULL. Have you really tested !memcg OOM case?

if memcg == NULL , origin_memcg_name will also be NULL, so the length
of it is 0. origin_memcg_name will be "(null)". I've tested !memcg OOM
case with CONFIG_MEMCG and !CONFIG_MEMCG, and found nothing wrong.

Thanks
Wind
禹舟键 <ufo19890607@gmail.com> 于2018年6月4日周一 上午9:58写道:
>
> Hi Mike
> > Please keep the brief description of the function actually brief and move the detailed explanation after the parameters description.
> Thanks for your advice.
>
> > The allocation constraint is detected by the dump_header() callers, why not just use it here?
> David suggest that constraint need to be printed in the oom report, so
> I add the enum variable in this function.
>
> Thanks
> Wind
Mike Rapoport June 4, 2018, 4:58 a.m. UTC | #5
On Mon, Jun 04, 2018 at 10:41:10AM +0800, 禹舟键 wrote:
> Hi Tetsuo
> > Since origin_memcg_name is printed for both memcg OOM and !memcg OOM, it is strange that origin_memcg_name is updated only when memcg != NULL. Have you really tested !memcg OOM case?
> 
> if memcg == NULL , origin_memcg_name will also be NULL, so the length
> of it is 0. origin_memcg_name will be "(null)". I've tested !memcg OOM
> case with CONFIG_MEMCG and !CONFIG_MEMCG, and found nothing wrong.
> 
> Thanks
> Wind
> 禹舟键 <ufo19890607@gmail.com> 于2018年6月4日周一 上午9:58写道:
> >
> > Hi Mike
> > > Please keep the brief description of the function actually brief and move the detailed explanation after the parameters description.
> > Thanks for your advice.
> >
> > > The allocation constraint is detected by the dump_header() callers, why not just use it here?
> > David suggest that constraint need to be printed in the oom report, so
> > I add the enum variable in this function.

My question was why do you call to alloc_constrained in the dump_header()
function rather than pass the constraint that was detected a bit earlier to
that function? 

Sorry if wasn't clear enough.

> > Thanks
> > Wind
>
禹舟键 June 4, 2018, 6:25 a.m. UTC | #6
Hi Mike
> My question was why do you call to alloc_constrained in the dump_header()
>function rather than pass the constraint that was detected a bit earlier to
>that function?

Ok, I will add a  new parameter in the dump_header.

Thank you.
Michal Hocko June 4, 2018, 6:52 a.m. UTC | #7
On Sat 02-06-18 19:58:52, ufo19890607@gmail.com wrote:
> From: yuzhoujian <yuzhoujian@didichuxing.com>
> 
> The dump_header does not print the memcg's name when the system
> oom happened, so users cannot locate the certain container which
> contains the task that has been killed by the oom killer.
> 
> I follow the advices of David Rientjes and Michal Hocko, and refactor
> part of the oom report in a backwards compatible way. After this patch,
> users can get the memcg's path from the oom report and check the certain
> container more quickly.

I have earlier suggested that you split this into two parts. One to add
the missing information and the later to convert it to a single printk
output. Reducing the overhead from PATH_MAX to NAME_MAX is a good step
but it still really begs an example why we really insist on a single
printk and that should be in its own changelog.

Sorry if that was not clear previously.
禹舟键 June 4, 2018, 8:18 a.m. UTC | #8
Hi Tetsuo
I know what you mean. Actully I refer to the code for kernel version:
3.10.0-514.  I think I can just use an array of type char, rather than
static char. Is it right?

Thanks
禹舟键 June 4, 2018, 8:57 a.m. UTC | #9
Hi Michal

> I have earlier suggested that you split this into two parts. One to add
> the missing information and the later to convert it to a single printk
> output.

I'm sorry I do not get your point.  What do you mean the missing information?

> but it still really begs an example why we really insist on a single
> printk and that should be in its own changelog.

Actually , I just know that we should avoid the interleaving messages
in the dmesg.
But I don't know how to reproduce this issue.  I think I can just
recount this issue in
the changelog.

Thanks
Michal Hocko June 4, 2018, 9:52 a.m. UTC | #10
On Mon 04-06-18 16:57:17, 禹舟键 wrote:
> Hi Michal
> 
> > I have earlier suggested that you split this into two parts. One to add
> > the missing information and the later to convert it to a single printk
> > output.
> 
> I'm sorry I do not get your point.  What do you mean the missing information?

memcg of the killed process

> 
> > but it still really begs an example why we really insist on a single
> > printk and that should be in its own changelog.
> 
> Actually , I just know that we should avoid the interleaving messages
> in the dmesg.

Yeah, that would be great. But you are increasing the static kernel size
and that is something to weigh in when considering the benefit. How
often those messages get interleaved? Is it worth another 512B of size?
Maybe yes, I am not sure. But this should be its own patch so that we
can revert it easily if the cost turns out to be bigger than the
benefit. You should realize that the OOM is a rare case and spending
resources on it is not really appreciated.

That being said, I am ready to ack a patch which adds the memcg of the
oom victim. I will not ack (nor nack) the patch which turns it into a
single print because I am not sure the benefit is really worth it. Maybe
others will though.
禹舟键 June 4, 2018, 12:13 p.m. UTC | #11
Hi Michal
I will add the missing information in the cover-letter.

> That being said, I am ready to ack a patch which adds the memcg of the
> oom victim. I will not ack (nor nack) the patch which turns it into a
> single print because I am not sure the benefit is really worth it. Maybe
> others will though.

OK, I will use the pr_cont_cgroup_name() to print origin and kill
memcg's name. I hope David will not have other opinions :)

Thanks
Michal Hocko June 4, 2018, 12:17 p.m. UTC | #12
On Mon 04-06-18 20:13:44, 禹舟键 wrote:
> Hi Michal
> I will add the missing information in the cover-letter.

I do not really think the cover letter needs much improvements. It is
the patch description that should be as specific as possible. Cover
letter should contain a highlevel description usually.
 
> > That being said, I am ready to ack a patch which adds the memcg of the
> > oom victim. I will not ack (nor nack) the patch which turns it into a
> > single print because I am not sure the benefit is really worth it. Maybe
> > others will though.
> 
> OK, I will use the pr_cont_cgroup_name() to print origin and kill
> memcg's name. I hope David will not have other opinions :)

As I've said this can be always added on top pressuming there is a good
justification.
禹舟键 June 8, 2018, 9:53 a.m. UTC | #13
Hi Mike
> My question was why do you call to alloc_constrained in the dump_header()
> function rather than pass the constraint that was detected a bit earlier to
> that function?

dump_header will be called by three functions: oom_kill_process,
check_panic_on_oom, out_of_memory.
We can get the constraint from the last two
functions(check_panic_on_oom, out_of_memory), but I need to
pass a new parameter(constraint) for oom_kill_process.

Thanks
Mike Rapoport June 10, 2018, 5:12 a.m. UTC | #14
On Fri, Jun 08, 2018 at 05:53:14PM +0800, 禹舟键 wrote:
> Hi Mike
> > My question was why do you call to alloc_constrained in the dump_header()
> > function rather than pass the constraint that was detected a bit earlier to
> > that function?
> 
> dump_header will be called by three functions: oom_kill_process,
> check_panic_on_oom, out_of_memory.
> We can get the constraint from the last two
> functions(check_panic_on_oom, out_of_memory), but I need to
> pass a new parameter(constraint) for oom_kill_process.

Another option is to add the constraint to the oom_control structure.
 
> Thanks
>
Michal Hocko June 11, 2018, 7:07 a.m. UTC | #15
On Sun 10-06-18 08:12:16, Mike Rapoport wrote:
> On Fri, Jun 08, 2018 at 05:53:14PM +0800, 禹舟键 wrote:
> > Hi Mike
> > > My question was why do you call to alloc_constrained in the dump_header()
> > > function rather than pass the constraint that was detected a bit earlier to
> > > that function?
> > 
> > dump_header will be called by three functions: oom_kill_process,
> > check_panic_on_oom, out_of_memory.
> > We can get the constraint from the last two
> > functions(check_panic_on_oom, out_of_memory), but I need to
> > pass a new parameter(constraint) for oom_kill_process.
> 
> Another option is to add the constraint to the oom_control structure.

Which would make more sense because oom_control should contain the full
OOM context.
diff mbox

Patch

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 57311b6c4d67..1c7d5da1c827 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -478,8 +478,11 @@  void mem_cgroup_handle_over_high(void);
 
 unsigned long mem_cgroup_get_limit(struct mem_cgroup *memcg);
 
-void mem_cgroup_print_oom_info(struct mem_cgroup *memcg,
-				struct task_struct *p);
+void mem_cgroup_print_oom_context(struct mem_cgroup *memcg,
+		struct task_struct *p, enum oom_constraint constraint,
+		nodemask_t *nodemask);
+
+void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg);
 
 static inline void mem_cgroup_oom_enable(void)
 {
@@ -873,7 +876,13 @@  static inline unsigned long mem_cgroup_get_limit(struct mem_cgroup *memcg)
 }
 
 static inline void
-mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
+mem_cgroup_print_oom_context(struct mem_cgroup *memcg, struct task_struct *p,
+			enum oom_constraint constraint, nodemask_t *nodemask)
+{
+}
+
+static inline void
+mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg)
 {
 }
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2bd3df3d101a..fd1172938c8e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1118,33 +1118,54 @@  static const char *const memcg1_stat_names[] = {
 };
 
 #define K(x) ((x) << (PAGE_SHIFT-10))
-/**
- * mem_cgroup_print_oom_info: Print OOM information relevant to memory controller.
- * @memcg: The memory cgroup that went over limit
+/*
+ * mem_cgroup_print_oom_context: Print OOM context information relevant to
+ * memory controller, which includes allocation constraint, nodemask, origin
+ * memcg that has reached its limit, kill memcg that contains the killed
+ * process, killed process's command, pid and uid.
+ * @memcg: The origin memory cgroup that went over limit
  * @p: Task that is going to be killed
+ * @constraint: The allocation constraint
+ * @nodemask: The allocation nodemask
  *
  * NOTE: @memcg and @p's mem_cgroup can be different when hierarchy is
  * enabled
  */
-void mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
+void mem_cgroup_print_oom_context(struct mem_cgroup *memcg, struct task_struct *p,
+			enum oom_constraint constraint, nodemask_t *nodemask)
 {
-	struct mem_cgroup *iter;
-	unsigned int i;
+	static char origin_memcg_name[NAME_MAX], kill_memcg_name[NAME_MAX];
+	struct cgroup *origin_cgrp, *kill_cgrp;
 
 	rcu_read_lock();
-
-	if (p) {
-		pr_info("Task in ");
-		pr_cont_cgroup_path(task_cgroup(p, memory_cgrp_id));
-		pr_cont(" killed as a result of limit of ");
-	} else {
-		pr_info("Memory limit reached of cgroup ");
+	if (memcg) {
+		origin_cgrp = memcg->css.cgroup;
+		cgroup_path(origin_cgrp, origin_memcg_name, NAME_MAX);
 	}
-
-	pr_cont_cgroup_path(memcg->css.cgroup);
-	pr_cont("\n");
-
+	kill_cgrp = task_cgroup(p, memory_cgrp_id);
+	cgroup_path(kill_cgrp, kill_memcg_name, NAME_MAX);
+
+	if (p)
+		pr_info("oom-kill:constraint=%s,nodemask=%*pbl,origin_memcg=%s,kill_memcg=%s,task=%s,pid=%5d,uid=%5d\n",
+			oom_constraint_text[constraint], nodemask_pr_args(nodemask),
+			strlen(origin_memcg_name) ? origin_memcg_name : "(null)",
+			kill_memcg_name, p->comm, p->pid,
+			from_kuid(&init_user_ns, task_uid(p)));
+	else
+		pr_info("oom-kill:constraint=%s,nodemask=%*pbl,origin_memcg=%s,kill_memcg=%s\n",
+			oom_constraint_text[constraint], nodemask_pr_args(nodemask),
+			strlen(origin_memcg_name) ? origin_memcg_name : "(null)", kill_memcg_name);
 	rcu_read_unlock();
+}
+
+/**
+ * mem_cgroup_print_oom_info: Print OOM memory information relevant to memory controller.
+ * @memcg: The memory cgroup that went over limit
+ */
+void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg)
+{
+	struct mem_cgroup *iter;
+	unsigned int i;
 
 	pr_info("memory: usage %llukB, limit %llukB, failcnt %lu\n",
 		K((u64)page_counter_read(&memcg->memory)),
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index c806cd656af6..af0efab8a9e5 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -16,7 +16,6 @@ 
  *  for newbie kernel hackers. It features several pointers to major
  *  kernel subsystems and hints as to where to find out what things do.
  */
-
 #include <linux/oom.h>
 #include <linux/mm.h>
 #include <linux/err.h>
@@ -414,6 +413,7 @@  static void dump_tasks(struct mem_cgroup *memcg, const nodemask_t *nodemask)
 
 static void dump_header(struct oom_control *oc, struct task_struct *p)
 {
+	enum oom_constraint constraint = constrained_alloc(oc);
 	pr_warn("%s invoked oom-killer: gfp_mask=%#x(%pGg), nodemask=%*pbl, order=%d, oom_score_adj=%hd\n",
 		current->comm, oc->gfp_mask, &oc->gfp_mask,
 		nodemask_pr_args(oc->nodemask), oc->order,
@@ -423,8 +423,9 @@  static void dump_header(struct oom_control *oc, struct task_struct *p)
 
 	cpuset_print_current_mems_allowed();
 	dump_stack();
+	mem_cgroup_print_oom_context(oc->memcg, p, constraint, oc->nodemask);
 	if (is_memcg_oom(oc))
-		mem_cgroup_print_oom_info(oc->memcg, p);
+		mem_cgroup_print_oom_meminfo(oc->memcg);
 	else {
 		show_mem(SHOW_MEM_FILTER_NODES, oc->nodemask);
 		if (is_dump_unreclaim_slabs())