[1/2] mm: krealloc: consider spare memory for __GFP_ZERO

Message ID 20240730194214.31483-1-dakr@kernel.org (mailing list archive)
State New

Commit Message

Danilo Krummrich July 30, 2024, 7:42 p.m. UTC
As long as krealloc() is called with __GFP_ZERO consistently, starting
with the initial memory allocation, __GFP_ZERO should be fully honored.

However, if krealloc() is called on an existing allocation with a
decreased size, it is not ensured that the spare portion of the
allocation is zeroed. Thus, if krealloc() is subsequently called with a
larger size again, __GFP_ZERO can't be fully honored, since we don't know
the previous size, but only the bucket size.

Example:

	buf = kzalloc(64, GFP_KERNEL);
	memset(buf, 0xff, 64);

	buf = krealloc(buf, 48, GFP_KERNEL | __GFP_ZERO);

	/* After this call the last 16 bytes are still 0xff. */
	buf = krealloc(buf, 64, GFP_KERNEL | __GFP_ZERO);

Fix this by explicitly zeroing the spare memory when shrinking an
allocation with the __GFP_ZERO flag set or init_on_alloc enabled.

Signed-off-by: Danilo Krummrich <dakr@kernel.org>
---
 mm/slab_common.c | 7 +++++++
 1 file changed, 7 insertions(+)


base-commit: 7c3dd6d99f2df6a9d7944ee8505b195ba51c9b68
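
The example from the commit message can be replayed in userspace with a toy
model. This is only a sketch under stated assumptions: `toy_obj`,
`toy_krealloc_unpatched()`, `toy_krealloc_patched()` and `stale_bytes()` are
hypothetical names, the bucket size is fixed at 64 bytes, and every call is
assumed to carry __GFP_ZERO. It models only the in-place path where the
request still fits in the bucket, not the real slab allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUCKET_SIZE 64	/* hypothetical bucket size ("ks" in the patch) */

/*
 * Toy stand-in for a slab object. As in the real allocator, only the
 * bucket size is known, never the size the caller last asked for.
 */
struct toy_obj {
	unsigned char data[BUCKET_SIZE];
};

/* In-place path without the patch: the request fits, nothing is zeroed. */
static void toy_krealloc_unpatched(struct toy_obj *o, size_t new_size)
{
	(void)o;
	(void)new_size;
}

/* In-place path with the patch: zero the spare tail [new_size, ks). */
static void toy_krealloc_patched(struct toy_obj *o, size_t new_size)
{
	memset(o->data + new_size, 0, BUCKET_SIZE - new_size);
}

/*
 * Replay the commit message example and count the non-zero bytes left in
 * [48, 64) after shrinking to 48 and growing back to 64.
 */
static size_t stale_bytes(void (*toy_krealloc)(struct toy_obj *, size_t))
{
	struct toy_obj o;
	size_t i, stale = 0;

	memset(o.data, 0, BUCKET_SIZE);	/* kzalloc(64, ...) */
	memset(o.data, 0xff, 64);	/* caller fills the buffer */
	toy_krealloc(&o, 48);		/* shrink */
	toy_krealloc(&o, 64);		/* grow again */

	for (i = 48; i < 64; i++)
		stale += o.data[i] != 0;
	return stale;
}
```

Calling `stale_bytes(toy_krealloc_unpatched)` counts the 16 leftover 0xff
bytes from the example, while `stale_bytes(toy_krealloc_patched)` finds none,
because the shrink already cleared the tail.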

Comments

Andrew Morton July 30, 2024, 8:31 p.m. UTC | #1
On Tue, 30 Jul 2024 21:42:05 +0200 Danilo Krummrich <dakr@kernel.org> wrote:

> As long as krealloc() is called with __GFP_ZERO consistently, starting
> with the initial memory allocation, __GFP_ZERO should be fully honored.
> 
> However, if for an existing allocation krealloc() is called with a
> decreased size, it is not ensured that the spare portion the allocation
> is zeroed. Thus, if krealloc() is subsequently called with a larger size
> again, __GFP_ZERO can't be fully honored, since we don't know the
> previous size, but only the bucket size.

Well that's bad.

> Example:
> 
> 	buf = kzalloc(64, GFP_KERNEL);

If this was kmalloc()

> 	memset(buf, 0xff, 64);
> 
> 	buf = krealloc(buf, 48, GFP_KERNEL | __GFP_ZERO);
> 
> 	/* After this call the last 16 bytes are still 0xff. */
> 	buf = krealloc(buf, 64, GFP_KERNEL | __GFP_ZERO);

then this would expose uninitialized kernel memory to kernel code, with
a risk that the kernel code will expose that to userspace, yes?

This does seem rather a trap, and I wonder whether krealloc() should
just zero out any such data by default.

> Fix this, by explicitly setting spare memory to zero, when shrinking an
> allocation with __GFP_ZERO flag set or init_on_alloc enabled.
> 
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -1273,6 +1273,13 @@ __do_krealloc(const void *p, size_t new_size, gfp_t flags)
>  
>  	/* If the object still fits, repoison it precisely. */
>  	if (ks >= new_size) {
> +		/* Zero out spare memory. */
> +		if (want_init_on_alloc(flags)) {
> +			kasan_disable_current();
> +			memset((void *)p + new_size, 0, ks - new_size);

Casting away the constness of `*p'.  This is just misleading everyone,
really.  It would be better to make argument `p' have type "void *".
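
To illustrate the point about constness: if the pointer arrives as plain
`void *`, the write needs no const-stripping cast. A minimal sketch, where
`zero_spare()` is a hypothetical helper and not something the patch adds; the
`char *` cast is only there because portable C has no arithmetic on `void *`
(the kernel relies on the GNU extension instead).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: zero the spare bytes [new_size, ks) of an object.
 * Because p is `void *` rather than `const void *`, the signature itself
 * tells the reader that the object may be written to.
 */
static void zero_spare(void *p, size_t new_size, size_t ks)
{
	memset((char *)p + new_size, 0, ks - new_size);
}
```

With an 8-byte buffer filled with 0xff, `zero_spare(buf, 5, 8)` leaves bytes
0..4 untouched and clears bytes 5..7.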

> +			kasan_enable_current();
> +		}
> +
>  		p = kasan_krealloc((void *)p, new_size, flags);
Vlastimil Babka July 30, 2024, 9:06 p.m. UTC | #2
On 7/30/24 10:31 PM, Andrew Morton wrote:
> On Tue, 30 Jul 2024 21:42:05 +0200 Danilo Krummrich <dakr@kernel.org> wrote:
> 
>> As long as krealloc() is called with __GFP_ZERO consistently, starting
>> with the initial memory allocation, __GFP_ZERO should be fully honored.
>> 
>> However, if for an existing allocation krealloc() is called with a
>> decreased size, it is not ensured that the spare portion the allocation
>> is zeroed. Thus, if krealloc() is subsequently called with a larger size
>> again, __GFP_ZERO can't be fully honored, since we don't know the
>> previous size, but only the bucket size.
> 
> Well that's bad.
> 
>> Example:
>> 
>> 	buf = kzalloc(64, GFP_KERNEL);
> 
> If this was kmalloc()

Then already here we have uninitialized kernel memory that a buggy user could
expose, no?

>> 	memset(buf, 0xff, 64);
>> 
>> 	buf = krealloc(buf, 48, GFP_KERNEL | __GFP_ZERO);
>> 
>> 	/* After this call the last 16 bytes are still 0xff. */
>> 	buf = krealloc(buf, 64, GFP_KERNEL | __GFP_ZERO);
> 
> then this would expose uninitialized kernel memory to kernel code, with
> a risk that the kernel code will expose that to userspace, yes?
> 
> This does seem rather a trap, and I wonder whether krealloc() should
> just zero out any such data by default.

So unless I'm missing how this differs from plain kmalloc(), relying on
want_init_on_alloc() seems the right way to opt in to hardening against this
potential exposure.
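
For readers unfamiliar with the opt-in being discussed, the decision made by
want_init_on_alloc() can be modelled in userspace roughly as below. This is a
sketch, not the kernel source: `toy_gfp_t`, `TOY_GFP_ZERO`,
`toy_init_on_alloc` and `toy_want_init_on_alloc()` are hypothetical
stand-ins, the flag value is arbitrary, and a plain bool replaces the
kernel's static key.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int toy_gfp_t;

#define TOY_GFP_ZERO 0x100u	/* hypothetical bit, not the kernel's value */

/* Stand-in for the init_on_alloc hardening switch. */
static bool toy_init_on_alloc;

/*
 * Zero the memory if hardening is enabled globally, or if the caller
 * asked for zeroed memory explicitly via the flag.
 */
static bool toy_want_init_on_alloc(toy_gfp_t flags)
{
	if (toy_init_on_alloc)
		return true;
	return (flags & TOY_GFP_ZERO) != 0;
}
```

With the switch off, only callers passing the zero flag get the spare memory
cleared; with it on, every caller does. That is the opt-in behaviour the
patch hooks into.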

>> Fix this, by explicitly setting spare memory to zero, when shrinking an
>> allocation with __GFP_ZERO flag set or init_on_alloc enabled.
>> 
>> --- a/mm/slab_common.c
>> +++ b/mm/slab_common.c
>> @@ -1273,6 +1273,13 @@ __do_krealloc(const void *p, size_t new_size, gfp_t flags)
>>  
>>  	/* If the object still fits, repoison it precisely. */
>>  	if (ks >= new_size) {
>> +		/* Zero out spare memory. */
>> +		if (want_init_on_alloc(flags)) {
>> +			kasan_disable_current();
>> +			memset((void *)p + new_size, 0, ks - new_size);
> 
> Casting away the constness of `*p'.  This is just misleading everyone,
> really.  It would be better to make argument `p' have type "void *".
> 
>> +			kasan_enable_current();
>> +		}
>> +
>>  		p = kasan_krealloc((void *)p, new_size, flags);
>
Vlastimil Babka July 30, 2024, 9:14 p.m. UTC | #3
On 7/30/24 9:42 PM, Danilo Krummrich wrote:
> As long as krealloc() is called with __GFP_ZERO consistently, starting
> with the initial memory allocation, __GFP_ZERO should be fully honored.
> 
> However, if for an existing allocation krealloc() is called with a
> decreased size, it is not ensured that the spare portion the allocation
> is zeroed. Thus, if krealloc() is subsequently called with a larger size
> again, __GFP_ZERO can't be fully honored, since we don't know the
> previous size, but only the bucket size.
> 
> Example:
> 
> 	buf = kzalloc(64, GFP_KERNEL);
> 	memset(buf, 0xff, 64);
> 
> 	buf = krealloc(buf, 48, GFP_KERNEL | __GFP_ZERO);
> 
> 	/* After this call the last 16 bytes are still 0xff. */
> 	buf = krealloc(buf, 64, GFP_KERNEL | __GFP_ZERO);
> 
> Fix this, by explicitly setting spare memory to zero, when shrinking an
> allocation with __GFP_ZERO flag set or init_on_alloc enabled.
> 
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>
> ---
>  mm/slab_common.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 40b582a014b8..cff602cedf8e 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -1273,6 +1273,13 @@ __do_krealloc(const void *p, size_t new_size, gfp_t flags)
>  
>  	/* If the object still fits, repoison it precisely. */
>  	if (ks >= new_size) {
> +		/* Zero out spare memory. */
> +		if (want_init_on_alloc(flags)) {
> +			kasan_disable_current();
> +			memset((void *)p + new_size, 0, ks - new_size);
> +			kasan_enable_current();

If we do kasan_krealloc() first, shouldn't the memset then be legal
afterwards without the disable/enable dance?

> +		}
> +
>  		p = kasan_krealloc((void *)p, new_size, flags);
>  		return (void *)p;
>  	}
> 
> base-commit: 7c3dd6d99f2df6a9d7944ee8505b195ba51c9b68
Danilo Krummrich July 30, 2024, 11:54 p.m. UTC | #4
On Tue, Jul 30, 2024 at 11:14:16PM +0200, Vlastimil Babka wrote:
> On 7/30/24 9:42 PM, Danilo Krummrich wrote:
> > As long as krealloc() is called with __GFP_ZERO consistently, starting
> > with the initial memory allocation, __GFP_ZERO should be fully honored.
> > 
> > However, if for an existing allocation krealloc() is called with a
> > decreased size, it is not ensured that the spare portion the allocation
> > is zeroed. Thus, if krealloc() is subsequently called with a larger size
> > again, __GFP_ZERO can't be fully honored, since we don't know the
> > previous size, but only the bucket size.
> > 
> > Example:
> > 
> > 	buf = kzalloc(64, GFP_KERNEL);
> > 	memset(buf, 0xff, 64);
> > 
> > 	buf = krealloc(buf, 48, GFP_KERNEL | __GFP_ZERO);
> > 
> > 	/* After this call the last 16 bytes are still 0xff. */
> > 	buf = krealloc(buf, 64, GFP_KERNEL | __GFP_ZERO);
> > 
> > Fix this, by explicitly setting spare memory to zero, when shrinking an
> > allocation with __GFP_ZERO flag set or init_on_alloc enabled.
> > 
> > Signed-off-by: Danilo Krummrich <dakr@kernel.org>
> > ---
> >  mm/slab_common.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/mm/slab_common.c b/mm/slab_common.c
> > index 40b582a014b8..cff602cedf8e 100644
> > --- a/mm/slab_common.c
> > +++ b/mm/slab_common.c
> > @@ -1273,6 +1273,13 @@ __do_krealloc(const void *p, size_t new_size, gfp_t flags)
> >  
> >  	/* If the object still fits, repoison it precisely. */
> >  	if (ks >= new_size) {
> > +		/* Zero out spare memory. */
> > +		if (want_init_on_alloc(flags)) {
> > +			kasan_disable_current();
> > +			memset((void *)p + new_size, 0, ks - new_size);
> > +			kasan_enable_current();
> 
> If we do kasan_krealloc() first, shouldn't the memset then be legal
> afterwards without the disable/enable dance?

No, we always write into the poisoned area. The following tables show what we do
in each case:

Shrink
------
          new        old
0         size       size        ks
|----------|----------|----------|
|   keep   |        poison       |  <- poison
|--------------------------------|
|   keep   |         zero        |  <- data


Poisoning and zeroing between old size and ks is not necessary, but we don't
know the old size, hence we have to do it between new size and ks.

Grow
----
          old        new
0         size       size        ks
|----------|----------|----------|
|       unpoison      |   keep   | <- poison
|--------------------------------|
|         keep        |   zero   | <- data

Zeroing between new_size and ks is not necessary in this case, since it must be
zero already. But without knowing the old size we don't know whether we shrink
and actually need to do something, or if we grow and don't need to do anything.

Analogously, we also unpoison things between 0 and old size for the same reason.
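
The two tables above can be replayed with a toy shadow map. This is a sketch
under assumptions, not KASAN itself: `toy_obj`, `toy_init()` and
`toy_resize()` are hypothetical, `KS` is an arbitrary bucket size, and
poisoning is modelled as one bool per byte. In both the shrink and the grow
case the zeroed range [new_size, ks) is poisoned both before and after the
shadow update, which is why the patch brackets the memset() with
kasan_disable_current() and kasan_enable_current().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define KS 32	/* hypothetical bucket size */

struct toy_obj {
	unsigned char data[KS];
	bool poisoned[KS];	/* toy shadow: true = access would be reported */
};

/* Set up an object of `size` bytes filled with `fill`; the tail is zero. */
static void toy_init(struct toy_obj *o, size_t size, unsigned char fill)
{
	size_t i;

	memset(o->data, 0, KS);
	memset(o->data, fill, size);
	for (i = 0; i < KS; i++)
		o->poisoned[i] = i >= size;
}

/*
 * In-place resize. The old size is unknown, so the same work is done for
 * shrink and grow: zero [new_size, KS), unpoison [0, new_size) and
 * poison [new_size, KS).
 */
static void toy_resize(struct toy_obj *o, size_t new_size)
{
	size_t i;

	/* Data row of the tables: keep the head, zero the tail. */
	memset(o->data + new_size, 0, KS - new_size);

	/* Poison row of the tables. */
	for (i = 0; i < KS; i++)
		o->poisoned[i] = i >= new_size;
}
```

Shrinking a 20-byte object to 10 zeroes and poisons bytes 10..31; growing it
back to 20 unpoisons bytes 10..19, which now read as zero.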

> 
> > +		}
> > +
> >  		p = kasan_krealloc((void *)p, new_size, flags);
> >  		return (void *)p;
> >  	}
> > 
> > base-commit: 7c3dd6d99f2df6a9d7944ee8505b195ba51c9b68
>
Vlastimil Babka July 31, 2024, 2:30 p.m. UTC | #5
On 7/31/24 1:54 AM, Danilo Krummrich wrote:
> On Tue, Jul 30, 2024 at 11:14:16PM +0200, Vlastimil Babka wrote:
>> On 7/30/24 9:42 PM, Danilo Krummrich wrote:
>> > As long as krealloc() is called with __GFP_ZERO consistently, starting
>> > with the initial memory allocation, __GFP_ZERO should be fully honored.
>> > 
>> > However, if for an existing allocation krealloc() is called with a
>> > decreased size, it is not ensured that the spare portion the allocation
>> > is zeroed. Thus, if krealloc() is subsequently called with a larger size
>> > again, __GFP_ZERO can't be fully honored, since we don't know the
>> > previous size, but only the bucket size.
>> > 
>> > Example:
>> > 
>> > 	buf = kzalloc(64, GFP_KERNEL);
>> > 	memset(buf, 0xff, 64);
>> > 
>> > 	buf = krealloc(buf, 48, GFP_KERNEL | __GFP_ZERO);
>> > 
>> > 	/* After this call the last 16 bytes are still 0xff. */
>> > 	buf = krealloc(buf, 64, GFP_KERNEL | __GFP_ZERO);
>> > 
>> > Fix this, by explicitly setting spare memory to zero, when shrinking an
>> > allocation with __GFP_ZERO flag set or init_on_alloc enabled.
>> > 
>> > Signed-off-by: Danilo Krummrich <dakr@kernel.org>
>> > ---
>> >  mm/slab_common.c | 7 +++++++
>> >  1 file changed, 7 insertions(+)
>> > 
>> > diff --git a/mm/slab_common.c b/mm/slab_common.c
>> > index 40b582a014b8..cff602cedf8e 100644
>> > --- a/mm/slab_common.c
>> > +++ b/mm/slab_common.c
>> > @@ -1273,6 +1273,13 @@ __do_krealloc(const void *p, size_t new_size, gfp_t flags)
>> >  
>> >  	/* If the object still fits, repoison it precisely. */
>> >  	if (ks >= new_size) {
>> > +		/* Zero out spare memory. */
>> > +		if (want_init_on_alloc(flags)) {
>> > +			kasan_disable_current();
>> > +			memset((void *)p + new_size, 0, ks - new_size);
>> > +			kasan_enable_current();
>> 
>> If we do kasan_krealloc() first, shouldn't the memset then be legal
>> afterwards without the disable/enable dance?
> 
> No, we always write into the poisoned area. The following tables show what we do
> in the particular case:
> 
> Shrink
> ------
>           new        old
> 0         size       size        ks
> |----------|----------|----------|
> |   keep   |        poison       |  <- poison
> |--------------------------------|
> |   keep   |         zero        |  <- data
> 
> 
> Poison and zero things between old size and ks is not necessary, but we don't
> know old size, hence we have do it between new size and ks.
> 
> Grow
> ----
>           old        new
> 0         size       size        ks
> |----------|----------|----------|
> |       unpoison      |   keep   | <- poison
> |--------------------------------|
> |         keep        |   zero   | <- data
> 
> Zeroing between new_size and ks in not necessary in this case, since it must be
> zero already. But without knowing the old size we don't know whether we shrink
> and actually need to do something, or if we grow and don't need to do anything.
> 
> Analogously, we also unpoison things between 0 and old size for the same reason.

Thanks, you're right!

>> 
>> > +		}
>> > +
>> >  		p = kasan_krealloc((void *)p, new_size, flags);
>> >  		return (void *)p;
>> >  	}
>> > 
>> > base-commit: 7c3dd6d99f2df6a9d7944ee8505b195ba51c9b68
>>
Vlastimil Babka July 31, 2024, 2:31 p.m. UTC | #6
On 7/30/24 9:42 PM, Danilo Krummrich wrote:
> As long as krealloc() is called with __GFP_ZERO consistently, starting
> with the initial memory allocation, __GFP_ZERO should be fully honored.
> 
> However, if for an existing allocation krealloc() is called with a
> decreased size, it is not ensured that the spare portion the allocation
> is zeroed. Thus, if krealloc() is subsequently called with a larger size
> again, __GFP_ZERO can't be fully honored, since we don't know the
> previous size, but only the bucket size.
> 
> Example:
> 
> 	buf = kzalloc(64, GFP_KERNEL);
> 	memset(buf, 0xff, 64);
> 
> 	buf = krealloc(buf, 48, GFP_KERNEL | __GFP_ZERO);
> 
> 	/* After this call the last 16 bytes are still 0xff. */
> 	buf = krealloc(buf, 64, GFP_KERNEL | __GFP_ZERO);
> 
> Fix this, by explicitly setting spare memory to zero, when shrinking an
> allocation with __GFP_ZERO flag set or init_on_alloc enabled.
> 
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/slab_common.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 40b582a014b8..cff602cedf8e 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -1273,6 +1273,13 @@ __do_krealloc(const void *p, size_t new_size, gfp_t flags)
>  
>  	/* If the object still fits, repoison it precisely. */
>  	if (ks >= new_size) {
> +		/* Zero out spare memory. */
> +		if (want_init_on_alloc(flags)) {
> +			kasan_disable_current();
> +			memset((void *)p + new_size, 0, ks - new_size);
> +			kasan_enable_current();
> +		}
> +
>  		p = kasan_krealloc((void *)p, new_size, flags);
>  		return (void *)p;
>  	}
> 
> base-commit: 7c3dd6d99f2df6a9d7944ee8505b195ba51c9b68
Patch

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 40b582a014b8..cff602cedf8e 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1273,6 +1273,13 @@  __do_krealloc(const void *p, size_t new_size, gfp_t flags)
 
 	/* If the object still fits, repoison it precisely. */
 	if (ks >= new_size) {
+		/* Zero out spare memory. */
+		if (want_init_on_alloc(flags)) {
+			kasan_disable_current();
+			memset((void *)p + new_size, 0, ks - new_size);
+			kasan_enable_current();
+		}
+
 		p = kasan_krealloc((void *)p, new_size, flags);
 		return (void *)p;
 	}