diff mbox series

mm/memcontrol: update documentation about invoking oom killer

Message ID 157270779336.1961.6528158720593572480.stgit@buzz (mailing list archive)
State New, archived

Commit Message

Konstantin Khlebnikov Nov. 2, 2019, 3:16 p.m. UTC
Since commit 29ef680ae7c2 ("memcg, oom: move out_of_memory back to the
charge path") memcg invokes oom killer not only for user page-faults.
This means 0-order allocation will either succeed or task get killed.

Fixes: 8e675f7af507 ("mm/oom_kill: count global and memory cgroup oom kills")
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
---
 Documentation/admin-guide/cgroup-v2.rst |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Comments

Damian Tometzki Nov. 2, 2019, 4:02 p.m. UTC | #1
On Sat, 02. Nov 18:16, Konstantin Khlebnikov wrote:
> Since commit 29ef680ae7c2 ("memcg, oom: move out_of_memory back to the
> charge path") memcg invokes oom killer not only for user page-faults.
> This means 0-order allocation will either succeed or task get killed.
> 
> Fixes: 8e675f7af507 ("mm/oom_kill: count global and memory cgroup oom kills")
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> ---
>  Documentation/admin-guide/cgroup-v2.rst |    9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> index 5361ebec3361..eb47815e137b 100644
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> @@ -1219,8 +1219,13 @@ PAGE_SIZE multiple when read back.
>  
>  		Failed allocation in its turn could be returned into
>  		userspace as -ENOMEM or silently ignored in cases like
> -		disk readahead.  For now OOM in memory cgroup kills
> -		tasks iff shortage has happened inside page fault.
> +		disk readahead.
> +
> +		Before 4.19 OOM in memory cgroup killed tasks iff
Hello Konstantin,

iff --> if :-)

Best regards
Damian


> +		shortage has happened inside page fault, random
> +		syscall may fail with ENOMEM or EFAULT. Since 4.19
> +		failed memory cgroup allocation invokes oom killer and
> +		keeps retrying until it succeeds.
>  
>  		This event is not raised if the OOM killer is not
>  		considered as an option, e.g. for failed high-order
>
Konstantin Khlebnikov Nov. 2, 2019, 4:14 p.m. UTC | #2
On 02/11/2019 19.02, Damian Tometzki wrote:
> On Sat, 02. Nov 18:16, Konstantin Khlebnikov wrote:
>> Since commit 29ef680ae7c2 ("memcg, oom: move out_of_memory back to the
>> charge path") memcg invokes oom killer not only for user page-faults.
>> This means 0-order allocation will either succeed or task get killed.
>>
>> Fixes: 8e675f7af507 ("mm/oom_kill: count global and memory cgroup oom kills")
>> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
>> ---
>>   Documentation/admin-guide/cgroup-v2.rst |    9 +++++++--
>>   1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
>> index 5361ebec3361..eb47815e137b 100644
>> --- a/Documentation/admin-guide/cgroup-v2.rst
>> +++ b/Documentation/admin-guide/cgroup-v2.rst
>> @@ -1219,8 +1219,13 @@ PAGE_SIZE multiple when read back.
>>   
>>   		Failed allocation in its turn could be returned into
>>   		userspace as -ENOMEM or silently ignored in cases like
>> -		disk readahead.  For now OOM in memory cgroup kills
>> -		tasks iff shortage has happened inside page fault.
>> +		disk readahead.
>> +
>> +		Before 4.19 OOM in memory cgroup killed tasks iff
> Hello Konstantin,
> 
> iff --> if :-)
> 

This "iff" is shortened "if and only if".
https://en.wikipedia.org/wiki/If_and_only_if

> Best regards
> Damian
> 
> 
>> +		shortage has happened inside page fault, random
>> +		syscall may fail with ENOMEM or EFAULT. Since 4.19
>> +		failed memory cgroup allocation invokes oom killer and
>> +		keeps retrying until it succeeds.
>>   
>>   		This event is not raised if the OOM killer is not
>>   		considered as an option, e.g. for failed high-order
>>
Damian Tometzki Nov. 2, 2019, 4:28 p.m. UTC | #3
On Sat, 02. Nov 19:14, Konstantin Khlebnikov wrote:
> 
> 
> On 02/11/2019 19.02, Damian Tometzki wrote:
> > On Sat, 02. Nov 18:16, Konstantin Khlebnikov wrote:
> >> Since commit 29ef680ae7c2 ("memcg, oom: move out_of_memory back to the
> >> charge path") memcg invokes oom killer not only for user page-faults.
> >> This means 0-order allocation will either succeed or task get killed.
> >>
> >> Fixes: 8e675f7af507 ("mm/oom_kill: count global and memory cgroup oom kills")
> >> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> >> ---
> >>   Documentation/admin-guide/cgroup-v2.rst |    9 +++++++--
> >>   1 file changed, 7 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> >> index 5361ebec3361..eb47815e137b 100644
> >> --- a/Documentation/admin-guide/cgroup-v2.rst
> >> +++ b/Documentation/admin-guide/cgroup-v2.rst
> >> @@ -1219,8 +1219,13 @@ PAGE_SIZE multiple when read back.
> >>   
> >>   		Failed allocation in its turn could be returned into
> >>   		userspace as -ENOMEM or silently ignored in cases like
> >> -		disk readahead.  For now OOM in memory cgroup kills
> >> -		tasks iff shortage has happened inside page fault.
> >> +		disk readahead.
> >> +
> >> +		Before 4.19 OOM in memory cgroup killed tasks iff
> > Hello Konstantin,
> > 
> > iff --> if :-)
> > 
> 
> This "iff" is shortened "if and only if".
> https://en.wikipedia.org/wiki/If_and_only_if

good to know :-)

> 
> > Best regards
> > Damian
> > 
> > 
> >> +		shortage has happened inside page fault, random
> >> +		syscall may fail with ENOMEM or EFAULT. Since 4.19
> >> +		failed memory cgroup allocation invokes oom killer and
> >> +		keeps retrying until it succeeds.
> >>   
> >>   		This event is not raised if the OOM killer is not
> >>   		considered as an option, e.g. for failed high-order
> >>
David Rientjes Nov. 2, 2019, 11:55 p.m. UTC | #4
On Sat, 2 Nov 2019, Konstantin Khlebnikov wrote:

> Since commit 29ef680ae7c2 ("memcg, oom: move out_of_memory back to the
> charge path") memcg invokes oom killer not only for user page-faults.
> This means 0-order allocation will either succeed or task get killed.
> 
> Fixes: 8e675f7af507 ("mm/oom_kill: count global and memory cgroup oom kills")
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> ---
>  Documentation/admin-guide/cgroup-v2.rst |    9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> index 5361ebec3361..eb47815e137b 100644
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> @@ -1219,8 +1219,13 @@ PAGE_SIZE multiple when read back.
>  
>  		Failed allocation in its turn could be returned into
>  		userspace as -ENOMEM or silently ignored in cases like
> -		disk readahead.  For now OOM in memory cgroup kills
> -		tasks iff shortage has happened inside page fault.
> +		disk readahead.
> +
> +		Before 4.19 OOM in memory cgroup killed tasks iff
> +		shortage has happened inside page fault, random
> +		syscall may fail with ENOMEM or EFAULT. Since 4.19
> +		failed memory cgroup allocation invokes oom killer and
> +		keeps retrying until it succeeds.
>  
>  		This event is not raised if the OOM killer is not
>  		considered as an option, e.g. for failed high-order

The previous text is obviously incorrect for today's kernels, but I'm 
curious if we should be conflating the documentation here by describing 
the pre-4.19 behavior.  OOM killing no longer happens only on page fault 
so maybe better to document the exact behavior today and not attempt to 
describe differences with previous versions?
Konstantin Khlebnikov Nov. 3, 2019, 10:46 a.m. UTC | #5
On 03/11/2019 02.55, David Rientjes wrote:
> On Sat, 2 Nov 2019, Konstantin Khlebnikov wrote:
> 
>> Since commit 29ef680ae7c2 ("memcg, oom: move out_of_memory back to the
>> charge path") memcg invokes oom killer not only for user page-faults.
>> This means 0-order allocation will either succeed or task get killed.
>>
>> Fixes: 8e675f7af507 ("mm/oom_kill: count global and memory cgroup oom kills")
>> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
>> ---
>>   Documentation/admin-guide/cgroup-v2.rst |    9 +++++++--
>>   1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
>> index 5361ebec3361..eb47815e137b 100644
>> --- a/Documentation/admin-guide/cgroup-v2.rst
>> +++ b/Documentation/admin-guide/cgroup-v2.rst
>> @@ -1219,8 +1219,13 @@ PAGE_SIZE multiple when read back.
>>   
>>   		Failed allocation in its turn could be returned into
>>   		userspace as -ENOMEM or silently ignored in cases like
>> -		disk readahead.  For now OOM in memory cgroup kills
>> -		tasks iff shortage has happened inside page fault.
>> +		disk readahead.
>> +
>> +		Before 4.19 OOM in memory cgroup killed tasks iff
>> +		shortage has happened inside page fault, random
>> +		syscall may fail with ENOMEM or EFAULT. Since 4.19
>> +		failed memory cgroup allocation invokes oom killer and
>> +		keeps retrying until it succeeds.
>>   
>>   		This event is not raised if the OOM killer is not
>>   		considered as an option, e.g. for failed high-order
> 
> The previous text is obviously incorrect for today's kernels, but I'm
> curious if we should be conflating the documentation here by describing
> the pre-4.19 behavior.  OOM killing no longer happens only on page fault
> so maybe better to document the exact behavior today and not attempt to
> describe differences with previous versions?
> 

Previous behaviour was here for ages and 4.19 is not so old.
According to https://www.kernel.org/category/releases.html pre-4.19 will
be maintained for a couple of years at least. Let's keep this tombstone.

I've seen a lot of strange side effects of old behaviour.
Most obscure was a hang inside libc fork() when clone(CLONE_CHILD_SETTID)
silently fails to set child pid =)
https://lore.kernel.org/lkml/20150206162301.18031.32251.stgit@buzz/
Michal Hocko Nov. 5, 2019, 6:09 a.m. UTC | #6
On Sat 02-11-19 18:16:33, Konstantin Khlebnikov wrote:
> Since commit 29ef680ae7c2 ("memcg, oom: move out_of_memory back to the
> charge path") memcg invokes oom killer not only for user page-faults.
> This means 0-order allocation will either succeed or task get killed.
> 
> Fixes: 8e675f7af507 ("mm/oom_kill: count global and memory cgroup oom kills")

Is this really appropriate? 8e675f7af507 was correct at the time. It was
29ef680ae7c2 that hasn't updated the documentation. I would just drop
the Fixes tag.

> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> ---
>  Documentation/admin-guide/cgroup-v2.rst |    9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> index 5361ebec3361..eb47815e137b 100644
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> @@ -1219,8 +1219,13 @@ PAGE_SIZE multiple when read back.
>  
>  		Failed allocation in its turn could be returned into
>  		userspace as -ENOMEM or silently ignored in cases like
> -		disk readahead.  For now OOM in memory cgroup kills
> -		tasks iff shortage has happened inside page fault.
> +		disk readahead.
> +
> +		Before 4.19 OOM in memory cgroup killed tasks iff

I would go with "Kernels between 3.12 and 4.19 invoked the oom killer
only if shortage has happened inside page fault."

> +		shortage has happened inside page fault, random
> +		syscall may fail with ENOMEM or EFAULT. Since 4.19
> +		failed memory cgroup allocation invokes oom killer and
> +		keeps retrying until it succeeds.
>  
>  		This event is not raised if the OOM killer is not
>  		considered as an option, e.g. for failed high-order

Patch

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 5361ebec3361..eb47815e137b 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1219,8 +1219,13 @@  PAGE_SIZE multiple when read back.
 
 		Failed allocation in its turn could be returned into
 		userspace as -ENOMEM or silently ignored in cases like
-		disk readahead.  For now OOM in memory cgroup kills
-		tasks iff shortage has happened inside page fault.
+		disk readahead.
+
+		Before 4.19 OOM in memory cgroup killed tasks iff
+		shortage has happened inside page fault, random
+		syscall may fail with ENOMEM or EFAULT. Since 4.19
+		failed memory cgroup allocation invokes oom killer and
+		keeps retrying until it succeeds.
 
 		This event is not raised if the OOM killer is not
 		considered as an option, e.g. for failed high-order
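
For readers who want to observe the behaviour discussed in this thread, here is a minimal sketch of reading the per-cgroup OOM counters from userspace. It assumes the hunk above belongs to the description of the "oom" entry of the cgroup v2 memory.events file, and that the file uses the flat "key value" format with "oom" and "oom_kill" keys. The function names and the sample text are hypothetical and are not part of the patch.

```python
from pathlib import Path
from typing import Dict


def parse_memory_events(text: str) -> Dict[str, int]:
    """Parse flat 'key value' lines, as assumed for memory.events."""
    counters: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            # Skip blank or unexpected lines rather than failing.
            continue
        key, value = parts
        counters[key] = int(value)
    return counters


def read_memory_events(cgroup_dir: str) -> Dict[str, int]:
    """Read memory.events from a cgroup v2 directory (path supplied by caller)."""
    return parse_memory_events((Path(cgroup_dir) / "memory.events").read_text())


# Hypothetical sample contents, used here instead of a real cgroup directory.
sample = "low 0\nhigh 12\nmax 340\noom 3\noom_kill 2\n"
events = parse_memory_events(sample)
print("oom events:", events["oom"], "oom kills:", events["oom_kill"])
```

On a live system one would call read_memory_events() with the cgroup's directory and compare the two counters over time. On pre-4.19 kernels, as described in the thread, an "oom" event outside a page fault could end in a failed syscall rather than a kill.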