diff mbox series

mm: Make failslab writable again

Message ID 20220920082033.1727374-1-alexander.atanasov@virtuozzo.com (mailing list archive)
State New
Headers show
Series mm: Make failslab writable again | expand

Commit Message

Alexander Atanasov Sept. 20, 2022, 8:20 a.m. UTC
In (060807f841ac mm, slub: make remaining slub_debug related attributes
read-only failslab) it was made RO.
I think it became a collateral victim to the other two options
(sanity_checks and trace) for which the reasons are perfectly valid.
Here is why:
 - sanity_checks and trace are slab internal debug options,
   failslab is used for fault injection.
 - for fault injections, which by presumption are random, it
   does not matter if it is not set atomically. You need to
   set atleast one more option to trigger fault injection.
 - in a testing scenario you may need to change it at runtime
   example: module loading - you test all allocations limited
   by the space option. Then you move to test only your module's
   own slabs.
 - when set by command line flags it effectively disables all
   cache merges.

Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Jann Horn <jannh@google.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/20200610163135.17364-5-vbabka@suse.cz

Signed-off-by: Alexander Atanasov <alexander.atanasov@virtuozzo.com>
---
 Documentation/mm/slub.rst |  2 ++
 mm/slub.c                 | 14 +++++++++++++-
 2 files changed, 15 insertions(+), 1 deletion(-)


base-commit: 80e78fcce86de0288793a0ef0f6acf37656ee4cf

Comments

Vlastimil Babka Sept. 20, 2022, 8:42 a.m. UTC | #1
On 9/20/22 10:20, Alexander Atanasov wrote:
> In (060807f841ac mm, slub: make remaining slub_debug related attributes
> read-only failslab) it was made RO.

"read-only) failslab was made RO" ?

> I think it became a collateral victim to the other two options
> (sanity_checks and trace) for which the reasons are perfectly valid.

The commit also mentioned that modifying the flags is not protected in any
way, see below.

> Here is why:
>  - sanity_checks and trace are slab internal debug options,
>    failslab is used for fault injection.
>  - for fault injections, which by presumption are random, it
>    does not matter if it is not set atomically. You need to
>    set atleast one more option to trigger fault injection.
>  - in a testing scenario you may need to change it at runtime
>    example: module loading - you test all allocations limited
>    by the space option. Then you move to test only your module's
>    own slabs.
>  - when set by command line flags it effectively disables all
>    cache merges.
> 
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Roman Gushchin <guro@fb.com>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Jann Horn <jannh@google.com>
> Cc: Vijayanand Jitta <vjitta@codeaurora.org>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: Pekka Enberg <penberg@kernel.org>
> Link: http://lkml.kernel.org/r/20200610163135.17364-5-vbabka@suse.cz
> 
> Signed-off-by: Alexander Atanasov <alexander.atanasov@virtuozzo.com>
> ---
>  Documentation/mm/slub.rst |  2 ++
>  mm/slub.c                 | 14 +++++++++++++-
>  2 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/mm/slub.rst b/Documentation/mm/slub.rst
> index 43063ade737a..86837073a39e 100644
> --- a/Documentation/mm/slub.rst
> +++ b/Documentation/mm/slub.rst
> @@ -116,6 +116,8 @@ options from the ``slub_debug`` parameter translate to the following files::
>  	T	trace
>  	A	failslab
>  
> +failslab file is writable, so writing 1 or 0 will enable or disable
> +the option at runtime. Write returns -EINVAL if cache is an alias.
>  Careful with tracing: It may spew out lots of information and never stop if
>  used on the wrong slab.
>  
> diff --git a/mm/slub.c b/mm/slub.c
> index 862dbd9af4f5..7c15d312e0fb 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -5617,7 +5617,19 @@ static ssize_t failslab_show(struct kmem_cache *s, char *buf)
>  {
>  	return sysfs_emit(buf, "%d\n", !!(s->flags & SLAB_FAILSLAB));
>  }
> -SLAB_ATTR_RO(failslab);
> +
> +static ssize_t failslab_store(struct kmem_cache *s, const char *buf,
> +				size_t length)
> +{
> +	if (s->refcount > 1)
> +		return -EINVAL;
> +
> +	s->flags &= ~SLAB_FAILSLAB;
> +	if (buf[0] == '1')
> +		s->flags |= SLAB_FAILSLAB;

Could we at least use a temporary variable to set up the final value and
then do a WRITE_ONCE() to s->flags, so the compiler is not allowed to do
some funky stuff? Assuming this is really the only place where we modify
s->flags during runtime, so we can't miss other updates due to RMW.

> +	return length;
> +}
> +SLAB_ATTR(failslab);
>  #endif
>  
>  static ssize_t shrink_show(struct kmem_cache *s, char *buf)
> 
> base-commit: 80e78fcce86de0288793a0ef0f6acf37656ee4cf
Alexander Atanasov Sept. 20, 2022, 9:17 a.m. UTC | #2
Hello,

On 20.09.22 11:42, Vlastimil Babka wrote:
> On 9/20/22 10:20, Alexander Atanasov wrote:
>> In (060807f841ac mm, slub: make remaining slub_debug related attributes
>> read-only failslab) it was made RO.
> 
> "read-only) failslab was made RO" ?

Yep.

>> I think it became a collateral victim to the other two options
>> (sanity_checks and trace) for which the reasons are perfectly valid.
> 
> The commit also mentioned that modifying the flags is not protected in any
> way, see below.

Yes, indeed.

>> +static ssize_t failslab_store(struct kmem_cache *s, const char *buf,
>> +				size_t length)
>> +{
>> +	if (s->refcount > 1)
>> +		return -EINVAL;
>> +
>> +	s->flags &= ~SLAB_FAILSLAB;
>> +	if (buf[0] == '1')
>> +		s->flags |= SLAB_FAILSLAB;
> 
> Could we at least use a temporary variable to set up the final value and
> then do a WRITE_ONCE() to s->flags, so the compiler is not allowed to do
> some funky stuff? Assuming this is really the only place where we modify
> s->flags during runtime, so we can't miss other updates due to RMW.

Since it is set or clear - instead of temporary variable and potentially 
two writes and RMW issues i would suggest this:
+	if (buf[0] == '1')
+		s->flags |= SLAB_FAILSLAB;
+       else
+		s->flags &= ~SLAB_FAILSLAB;

If at some point more places need to modify the flags at runtime they 
can switch to atomic bit ops.
Vlastimil Babka Sept. 20, 2022, 9:29 a.m. UTC | #3
On 9/20/22 11:17, Alexander Atanasov wrote:
> Hello,
> 
> On 20.09.22 11:42, Vlastimil Babka wrote:
>> On 9/20/22 10:20, Alexander Atanasov wrote:
>>> In (060807f841ac mm, slub: make remaining slub_debug related attributes
>>> read-only failslab) it was made RO.
>>
>> "read-only) failslab was made RO" ?
> 
> Yep.
> 
>>> I think it became a collateral victim to the other two options
>>> (sanity_checks and trace) for which the reasons are perfectly valid.
>>
>> The commit also mentioned that modifying the flags is not protected in any
>> way, see below.
> 
> Yes, indeed.
> 
>>> +static ssize_t failslab_store(struct kmem_cache *s, const char *buf,
>>> +                size_t length)
>>> +{
>>> +    if (s->refcount > 1)
>>> +        return -EINVAL;
>>> +
>>> +    s->flags &= ~SLAB_FAILSLAB;
>>> +    if (buf[0] == '1')
>>> +        s->flags |= SLAB_FAILSLAB;
>>
>> Could we at least use a temporary variable to set up the final value and
>> then do a WRITE_ONCE() to s->flags, so the compiler is not allowed to do
>> some funky stuff? Assuming this is really the only place where we modify
>> s->flags during runtime, so we can't miss other updates due to RMW.
> 
> Since it is set or clear - instead of temporary variable and potentially two
> writes and RMW issues i would suggest this:
> +    if (buf[0] == '1')
> +        s->flags |= SLAB_FAILSLAB;
> +       else
> +        s->flags &= ~SLAB_FAILSLAB;

This way also has RMW issues, and also the compiler is allowed to
temporarily modify s->flags any way it likes; with WRITE_ONCE() it can't.

> If at some point more places need to modify the flags at runtime they can
> switch to atomic bit ops.
Alexander Atanasov Sept. 20, 2022, 10:21 a.m. UTC | #4
On 20.09.22 12:29, Vlastimil Babka wrote:
> On 9/20/22 11:17, Alexander Atanasov wrote:
>> Hello,
>>
>> On 20.09.22 11:42, Vlastimil Babka wrote:
>>>> +static ssize_t failslab_store(struct kmem_cache *s, const char *buf,
>>>> +                size_t length)
>>>> +{
>>>> +    if (s->refcount > 1)
>>>> +        return -EINVAL;
>>>> +
>>>> +    s->flags &= ~SLAB_FAILSLAB;
>>>> +    if (buf[0] == '1')
>>>> +        s->flags |= SLAB_FAILSLAB;
>>>
>>> Could we at least use a temporary variable to set up the final value and
>>> then do a WRITE_ONCE() to s->flags, so the compiler is not allowed to do
>>> some funky stuff? Assuming this is really the only place where we modify
>>> s->flags during runtime, so we can't miss other updates due to RMW.
>>
>> Since it is set or clear - instead of temporary variable and potentially two
>> writes and RMW issues i would suggest this:
>> +    if (buf[0] == '1')
>> +        s->flags |= SLAB_FAILSLAB;
>> +       else
>> +        s->flags &= ~SLAB_FAILSLAB;
> 
> This way also has RMW issues, and also the compiler is allowed to
> temporarily modify s->flags any way it likes; with WRITE_ONCE() it can't.

Okay, so the safest way is this?

if (buf[0] == '1')
	WRITE_ONCE(s->flags, READ_ONCE(s->flags) | SLAB_FAILSLAB);
else
	WRITE_ONCE(s->flags, READ_ONCE(s->flags) & ~SLAB_FAILSLAB);

It got me thinking how many places would break if the compiler
starts to temporariliy modify the flags - i hope it never does.
Vlastimil Babka Sept. 20, 2022, 10:31 a.m. UTC | #5
On 9/20/22 12:21, Alexander Atanasov wrote:
> On 20.09.22 12:29, Vlastimil Babka wrote:
>> On 9/20/22 11:17, Alexander Atanasov wrote:
>>> Hello,
>>>
>>> On 20.09.22 11:42, Vlastimil Babka wrote:
>>>>> +static ssize_t failslab_store(struct kmem_cache *s, const char *buf,
>>>>> +                size_t length)
>>>>> +{
>>>>> +    if (s->refcount > 1)
>>>>> +        return -EINVAL;
>>>>> +
>>>>> +    s->flags &= ~SLAB_FAILSLAB;
>>>>> +    if (buf[0] == '1')
>>>>> +        s->flags |= SLAB_FAILSLAB;
>>>>
>>>> Could we at least use a temporary variable to set up the final value and
>>>> then do a WRITE_ONCE() to s->flags, so the compiler is not allowed to do
>>>> some funky stuff? Assuming this is really the only place where we modify
>>>> s->flags during runtime, so we can't miss other updates due to RMW.
>>>
>>> Since it is set or clear - instead of temporary variable and potentially two
>>> writes and RMW issues i would suggest this:
>>> +    if (buf[0] == '1')
>>> +        s->flags |= SLAB_FAILSLAB;
>>> +       else
>>> +        s->flags &= ~SLAB_FAILSLAB;
>>
>> This way also has RMW issues, and also the compiler is allowed to
>> temporarily modify s->flags any way it likes; with WRITE_ONCE() it can't.
> 
> Okay, so the safest way is this?
> 
> if (buf[0] == '1')
>     WRITE_ONCE(s->flags, READ_ONCE(s->flags) | SLAB_FAILSLAB);
> else
>     WRITE_ONCE(s->flags, READ_ONCE(s->flags) & ~SLAB_FAILSLAB);

Yeah, that would work. Given we are the only writer, we shouldn't even need
a READ_ONCE.

> It got me thinking how many places would break if the compiler
> starts to temporariliy modify the flags - i hope it never does.

That's likely true as well. But the macros have been introduced for this
purpose AFAIK.
diff mbox series

Patch

diff --git a/Documentation/mm/slub.rst b/Documentation/mm/slub.rst
index 43063ade737a..86837073a39e 100644
--- a/Documentation/mm/slub.rst
+++ b/Documentation/mm/slub.rst
@@ -116,6 +116,8 @@  options from the ``slub_debug`` parameter translate to the following files::
 	T	trace
 	A	failslab
 
+failslab file is writable, so writing 1 or 0 will enable or disable
+the option at runtime. Write returns -EINVAL if cache is an alias.
 Careful with tracing: It may spew out lots of information and never stop if
 used on the wrong slab.
 
diff --git a/mm/slub.c b/mm/slub.c
index 862dbd9af4f5..7c15d312e0fb 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5617,7 +5617,19 @@  static ssize_t failslab_show(struct kmem_cache *s, char *buf)
 {
 	return sysfs_emit(buf, "%d\n", !!(s->flags & SLAB_FAILSLAB));
 }
-SLAB_ATTR_RO(failslab);
+
+static ssize_t failslab_store(struct kmem_cache *s, const char *buf,
+				size_t length)
+{
+	if (s->refcount > 1)
+		return -EINVAL;
+
+	s->flags &= ~SLAB_FAILSLAB;
+	if (buf[0] == '1')
+		s->flags |= SLAB_FAILSLAB;
+	return length;
+}
+SLAB_ATTR(failslab);
 #endif
 
 static ssize_t shrink_show(struct kmem_cache *s, char *buf)