diff mbox series

[RFC,3/5] mm, slub: remove runtime allocation order changes

Message ID 20200602141519.7099-4-vbabka@suse.cz (mailing list archive)
State New, archived
Series replace runtime slub_debug toggling with more capable boot parameter

Commit Message

Vlastimil Babka June 2, 2020, 2:15 p.m. UTC
SLUB allows runtime changing of page allocation order by writing into the
/sys/kernel/slab/<cache>/order file. Jann has reported [1] that this interface
allows the order to be set too small, leading to crashes.

While it's possible to fix the immediate issue, closer inspection reveals
potential races. Storing a new order invokes calculate_sizes(), which
non-atomically updates many kmem_cache fields while the cache is still in
use. Unexpected behavior might occur even if the fields are written with the
same values they already had.

This could be fixed by splitting out the part of calculate_sizes() that depends
on forced_order, so that only the kmem_cache.oo field is updated. That would
still race with init_cache_random_seq(), shuffle_freelist() and allocate_slab().
While it might be possible to audit those paths and e.g. add some
READ_ONCE/WRITE_ONCE accesses, it is easier to just remove the runtime order
changes, which is what this patch does. If there are valid use cases for
per-cache order setting, we could e.g. extend the boot parameters to do that.

[1] https://lore.kernel.org/r/CAG48ez31PP--h6_FzVyfJ4H86QYczAFPdxtJHUEEan+7VJETAQ@mail.gmail.com

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/slub.c | 19 +------------------
 1 file changed, 1 insertion(+), 18 deletions(-)

Comments

Kees Cook June 5, 2020, 9:06 p.m. UTC | #1
On Tue, Jun 02, 2020 at 04:15:17PM +0200, Vlastimil Babka wrote:
> [...]

Reviewed-by: Kees Cook <keescook@chromium.org>
Roman Gushchin June 6, 2020, 12:32 a.m. UTC | #2
On Tue, Jun 02, 2020 at 04:15:17PM +0200, Vlastimil Babka wrote:
> [...]

Acked-by: Roman Gushchin <guro@fb.com>

Thanks!


Patch

diff --git a/mm/slub.c b/mm/slub.c
index ac198202dbb0..58c1e9e7b3b3 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5111,28 +5111,11 @@ static ssize_t objs_per_slab_show(struct kmem_cache *s, char *buf)
 }
 SLAB_ATTR_RO(objs_per_slab);
 
-static ssize_t order_store(struct kmem_cache *s,
-				const char *buf, size_t length)
-{
-	unsigned int order;
-	int err;
-
-	err = kstrtouint(buf, 10, &order);
-	if (err)
-		return err;
-
-	if (order > slub_max_order || order < slub_min_order)
-		return -EINVAL;
-
-	calculate_sizes(s, order);
-	return length;
-}
-
 static ssize_t order_show(struct kmem_cache *s, char *buf)
 {
 	return sprintf(buf, "%u\n", oo_order(s->oo));
 }
-SLAB_ATTR(order);
+SLAB_ATTR_RO(order);
 
 static ssize_t min_partial_show(struct kmem_cache *s, char *buf)
 {