Message ID | 20220225221625.3531852-1-keescook@chromium.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | mm: Handle ksize() vs __alloc_size by forgetting size | expand |
On Fri, 25 Feb 2022 14:16:25 -0800 Kees Cook <keescook@chromium.org> wrote: > If ksize() is used on an allocation, the compiler cannot make any > assumptions about its size any more (as hinted by __alloc_size). Force > it to forget. > > One caller was using a container_of() construction that needed to be > worked around. Please, when fixing something do fully explain what that thing is. I, for one, simply cannot understand why this change is being proposed. Especially when proposing a -stable backport! Tell readers what was the end-user impact of the bug. > Link: https://github.com/ClangBuiltLinux/linux/issues/1599 Even that didn't tell me. Is it just a clang warning? Does the kernel post your private keys on reddit then scribble all over your disk drive? I dunno.
On Fri, 25 Feb 2022 at 23:16, Kees Cook <keescook@chromium.org> wrote: > > If ksize() is used on an allocation, the compiler cannot make any > assumptions about its size any more (as hinted by __alloc_size). Force > it to forget. > > One caller was using a container_of() construction that needed to be > worked around. > > Cc: Marco Elver <elver@google.com> > Cc: Pekka Enberg <penberg@kernel.org> > Cc: David Rientjes <rientjes@google.com> > Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Vlastimil Babka <vbabka@suse.cz> > Cc: linux-mm@kvack.org > Link: https://github.com/ClangBuiltLinux/linux/issues/1599 > Fixes: c37495d6254c ("slab: add __alloc_size attributes for better bounds checking") > Cc: stable@vger.kernel.org > Signed-off-by: Kees Cook <keescook@chromium.org> > --- > This appears to work for me, but I'm waiting for more feedback on > the specific instance got tripped over in Android. > --- > drivers/base/devres.c | 4 +++- > include/linux/slab.h | 26 +++++++++++++++++++++++++- > mm/slab_common.c | 19 +++---------------- > 3 files changed, 31 insertions(+), 18 deletions(-) > > diff --git a/drivers/base/devres.c b/drivers/base/devres.c > index eaa9a5cd1db9..1a2645bd7234 100644 > --- a/drivers/base/devres.c > +++ b/drivers/base/devres.c > @@ -855,6 +855,7 @@ void *devm_krealloc(struct device *dev, void *ptr, size_t new_size, gfp_t gfp) > size_t total_new_size, total_old_size; > struct devres *old_dr, *new_dr; > unsigned long flags; > + void *allocation; > > if (unlikely(!new_size)) { > devm_kfree(dev, ptr); > @@ -874,7 +875,8 @@ void *devm_krealloc(struct device *dev, void *ptr, size_t new_size, gfp_t gfp) > if (!check_dr_size(new_size, &total_new_size)) > return NULL; > > - total_old_size = ksize(container_of(ptr, struct devres, data)); > + allocation = container_of(ptr, struct devres, data); > + total_old_size = ksize(allocation); > if (total_old_size == 0) { > WARN(1, "Pointer doesn't point to dynamically allocated memory."); > return NULL; > diff --git a/include/linux/slab.h b/include/linux/slab.h > index 37bde99b74af..a14f3bfa2f44 100644 > --- a/include/linux/slab.h > +++ b/include/linux/slab.h > @@ -182,8 +182,32 @@ int kmem_cache_shrink(struct kmem_cache *s); > void * __must_check krealloc(const void *objp, size_t new_size, gfp_t flags) __alloc_size(2); > void kfree(const void *objp); > void kfree_sensitive(const void *objp); > + > +/** > + * ksize - get the actual amount of memory allocated for a given object > + * @objp: Pointer to the object > + * > + * kmalloc may internally round up allocations and return more memory > + * than requested. ksize() can be used to determine the actual amount of > + * memory allocated. The caller may use this additional memory, even though > + * a smaller amount of memory was initially specified with the kmalloc call. > + * The caller must guarantee that objp points to a valid object previously > + * allocated with either kmalloc() or kmem_cache_alloc(). The object > + * must not be freed during the duration of the call. > + * > + * Return: size of the actual memory used by @objp in bytes > + */ > +#define ksize(objp) ({ \ > + /* \ > + * Getting the actual allocation size means the __alloc_size \ > + * hints are no longer valid, and the compiler needs to \ > + * forget about them. \ > + */ \ > + OPTIMIZER_HIDE_VAR(objp); \ > + _ksize(objp); \ > +}) So per that ClangBuiltLinux issue I'm gleaning that the __alloc_size annotations are actually causing the compiler to generate wrong code? Possibly due to the compiler thinking that the accesses must stay within some bound, and anything beyond that will be "undefined behaviour"? Clearly, per the slab APIs, in particular with the provision of ksize(), the compiler is wrong. At first I thought this was only related to UBSAN bounds checking generating false positives, in which case a simple workaround as you present above would probably take care of most cases. But if the real issue is the compiler suddenly doing more aggressive compiler optimizations because it thinks accesses beyond the object size (per __alloc_size) is UB, but UB can never happen, and thus does crazy things [1], I think the answer (at least with what we have right now) should be to find a different solution that is more reliable. [1] https://lore.kernel.org/all/20220218131358.3032912-1-gregkh@linuxfoundation.org/ Because who's to say that there's not some code that does: foo = kmalloc(...); ... bar = foo; s = ksize(bar); ... makes access address-dependent on 's' and 'foo' (but not 'bar') ... This doesn't look like code anyone would write, but I fear with enough macro and inline function magic, it's not too unlikely. I can see a few options: 1. Dropping __alloc_size. 2. Somehow statically computing the size-class's size (kmalloc_index() might help here), removing __alloc_size from allocation functions and instead use some wrapper. 3. Teaching the compiler to drop *all* object sizes upon encountering a ksize(). So I think #1 is probably not what you want. #2 seems quite complicated, and in many cases likely too relaxed and would miss bugs, so also not ideal. #3 would be the most reliable, but OPTIMIZER_HIDE_VAR() doesn't cut it, and we need something stronger. The downside of #3 is that it might pessimize code generation, but given ksize() is used sparingly, might be ok. Thanks, -- Marco
On Mon, Feb 28, 2022 at 12:24:51PM +0100, Marco Elver wrote: > 2. Somehow statically computing the size-class's size (kmalloc_index() > might help here), removing __alloc_size from allocation functions and > instead use some wrapper. I don't think that's computable. I have been thinking about a slab flag that would say "speed is more important than size; if the smallest slab for this size of allocation has no free objects, search larger slabs to get memory instead of allocating a new slab". If we did have such a feature, it would be impossible to know how large ksize() would report.
There aren't many calls to ksize in the entire Linux kernel source tree. Most use cases are using the memory as some kind of (chunked) dynamic array, where they either need to realloc or kmalloc a new chunk when it runs out of space. Changing the approach to this API would be both more efficient and fully compatible with alloc_size since it would simply not be added to these functions: kmalloc_size(size, &result_size) krealloc_size(p, new_size, &result_size) Nearly every use of ksize could be easily phased out this way. There are probably a few which are harder to remove. It can be gradually phased out by keeping around ksize temporarily but documenting that it's only correct to use it on memory allocated with kmalloc_size/krealloc_size. I think it can be phased out quite quickly though. Look at how many calls there are to it. It's really not a lot, especially if you filter out the uses of the identifier 'ksize' for variables rather than calls to that function. I brought this up when I originally submitted CONFIG_FORTIFY_SOURCE. It's the main reason that I didn't bother trying to submit the alloc_size attributes myself. The most important ones are for kmalloc and it isn't technically correct to use it.
Looked through them and nearly all seem easy to replace. By the way, the call to ksize in arch/x86/kernel/cpu/microcode/amd.c doesn't look right. It allocates the memory with kmemdup. I don't see how it can assume that the padding is zero, and that seems to be a requirement since it zeroes the destination before copying to it. Seems far more reasonable to add a field with the size.
On Fri, Feb 25, 2022 at 2:16 PM Kees Cook <keescook@chromium.org> wrote: > > diff --git a/include/linux/slab.h b/include/linux/slab.h > index 37bde99b74af..a14f3bfa2f44 100644 > --- a/include/linux/slab.h > +++ b/include/linux/slab.h > @@ -182,8 +182,32 @@ int kmem_cache_shrink(struct kmem_cache *s); > void * __must_check krealloc(const void *objp, size_t new_size, gfp_t flags) __alloc_size(2); > void kfree(const void *objp); > void kfree_sensitive(const void *objp); > + > +/** > + * ksize - get the actual amount of memory allocated for a given object > + * @objp: Pointer to the object > + * > + * kmalloc may internally round up allocations and return more memory > + * than requested. ksize() can be used to determine the actual amount of > + * memory allocated. The caller may use this additional memory, even though > + * a smaller amount of memory was initially specified with the kmalloc call. > + * The caller must guarantee that objp points to a valid object previously > + * allocated with either kmalloc() or kmem_cache_alloc(). The object > + * must not be freed during the duration of the call. > + * > + * Return: size of the actual memory used by @objp in bytes > + */ > +#define ksize(objp) ({ \ > + /* \ > + * Getting the actual allocation size means the __alloc_size \ > + * hints are no longer valid, and the compiler needs to \ > + * forget about them. \ > + */ \ > + OPTIMIZER_HIDE_VAR(objp); \ > + _ksize(objp); \ > +}) > size_t __ksize(const void *objp); > -size_t ksize(const void *objp); > +size_t _ksize(const void *objp); If you wanted to discourage others from calling _ksize, you could hide its declaration within the scope of statement expression within ksize: https://godbolt.org/z/e4sd4nE6q
On Fri, Feb 25, 2022 at 03:45:18PM -0800, Andrew Morton wrote: > On Fri, 25 Feb 2022 14:16:25 -0800 Kees Cook <keescook@chromium.org> wrote: > > > If ksize() is used on an allocation, the compiler cannot make any > > assumptions about its size any more (as hinted by __alloc_size). Force > > it to forget. > > > > One caller was using a container_of() construction that needed to be > > worked around. > > Please, when fixing something do fully explain what that thing is. I, > for one, simply cannot understand why this change is being proposed. > > Especially when proposing a -stable backport! Tell readers what was > the end-user impact of the bug. > > > Link: https://github.com/ClangBuiltLinux/linux/issues/1599 > > Even that didn't tell me. Is it just a clang warning? Does the kernel > post your private keys on reddit then scribble all over your disk > drive? I dunno. Yup, sorry. I tend to get so deep changes like this that I forget to give an appropriately detailed summary. As others have mentioned, this is trying to fix a miscompilation issue, triggered by what can be considered either a mis-application of __alloc_size, or a failure to correctly disable compiler optimizations in the face of ksize().
On Mon, Feb 28, 2022 at 12:24:51PM +0100, Marco Elver wrote: > On Fri, 25 Feb 2022 at 23:16, Kees Cook <keescook@chromium.org> wrote: > > > > If ksize() is used on an allocation, the compiler cannot make any > > assumptions about its size any more (as hinted by __alloc_size). Force > > it to forget. > > [...] > > diff --git a/include/linux/slab.h b/include/linux/slab.h > > index 37bde99b74af..a14f3bfa2f44 100644 > > --- a/include/linux/slab.h > > +++ b/include/linux/slab.h > > @@ -182,8 +182,32 @@ int kmem_cache_shrink(struct kmem_cache *s); > > void * __must_check krealloc(const void *objp, size_t new_size, gfp_t flags) __alloc_size(2); > > void kfree(const void *objp); > > void kfree_sensitive(const void *objp); > > + > > +/** > > + * ksize - get the actual amount of memory allocated for a given object > > + * @objp: Pointer to the object > > + * > > + * kmalloc may internally round up allocations and return more memory > > + * than requested. ksize() can be used to determine the actual amount of > > + * memory allocated. The caller may use this additional memory, even though > > + * a smaller amount of memory was initially specified with the kmalloc call. > > + * The caller must guarantee that objp points to a valid object previously > > + * allocated with either kmalloc() or kmem_cache_alloc(). The object > > + * must not be freed during the duration of the call. > > + * > > + * Return: size of the actual memory used by @objp in bytes > > + */ > > +#define ksize(objp) ({ \ > > + /* \ > > + * Getting the actual allocation size means the __alloc_size \ > > + * hints are no longer valid, and the compiler needs to \ > > + * forget about them. \ > > + */ \ > > + OPTIMIZER_HIDE_VAR(objp); \ > > + _ksize(objp); \ > > +}) > > So per that ClangBuiltLinux issue I'm gleaning that the __alloc_size > annotations are actually causing the compiler to generate wrong code? I would say "incompatible", but yes, the result in the same. The compiler is doing exactly what we told it that it should do. Using __alloc_size means it can assume (and enforce) the size of the allocation. AFAICT, This became an issue due to: -fsanitize=bounds -fsanitize-undefined-trap-on-error I would expect -Warray-bounds to warn about it, but since we don't have -Warray-bounds enabled, this leads to a silent problem. (Note that I and others been working very hard to get the code base well enough into shape that we can enable -Warray-bounds.) I have not, however, been able to reduce the issue to a minimal a test case yet. This patch was proposed to see if Greg (or other Android folks) could check it. > Possibly due to the compiler thinking that the accesses must stay > within some bound, and anything beyond that will be "undefined > behaviour"? Clearly, per the slab APIs, in particular with the > provision of ksize(), the compiler is wrong. Right. __alloc_size vs ksize() is the problem. > At first I thought this was only related to UBSAN bounds checking > generating false positives, in which case a simple workaround as you > present above would probably take care of most cases. > > But if the real issue is the compiler suddenly doing more aggressive > compiler optimizations because it thinks accesses beyond the object > size (per __alloc_size) is UB, but UB can never happen, and thus does > crazy things [1], I think the answer (at least with what we have right > now) should be to find a different solution that is more reliable. > > [1] https://lore.kernel.org/all/20220218131358.3032912-1-gregkh@linuxfoundation.org/ I think it's -fsanitize-undefined-trap-on-error causing the "problem". It seems to be doing exactly what it was told to do. :) But again, I haven't got a good example, so maybe there is _also_ a miscompilation happening? > Because who's to say that there's not some code that does: > > foo = kmalloc(...); > ... > bar = foo; > s = ksize(bar); > ... makes access address-dependent on 's' and 'foo' (but not 'bar') ... > > This doesn't look like code anyone would write, but I fear with enough > macro and inline function magic, it's not too unlikely. Right ... I would _hope_ the s-based access would be on _bar_ only. :P > I can see a few options: > > 1. Dropping __alloc_size. Right; that's the big-hammer solution (and is what Greg has already done in [1] for the _particular_ case, but there may be others). > 2. Somehow statically computing the size-class's size (kmalloc_index() > might help here), removing __alloc_size from allocation functions and > instead use some wrapper. I want to do this for other reasons too, but yes, it looks a bit finicky. And there are some plans to make this even MORE dynamic, but if that happens, I think we could fall that condition back to "no __alloc_size". > 3. Teaching the compiler to drop *all* object sizes upon encountering a ksize(). size_t ksize(void *ptr) __just_kidding_about_the_alloc_size; > So I think #1 is probably not what you want. #2 seems quite > complicated, and in many cases likely too relaxed and would miss bugs, > so also not ideal. #3 would be the most reliable, but > OPTIMIZER_HIDE_VAR() doesn't cut it, and we need something stronger. > The downside of #3 is that it might pessimize code generation, but > given ksize() is used sparingly, might be ok. Okay, so, using __alloc_size is currently very limited in scope, in the sense that bounds checking against such an allocated pointer is only within the resulting function (and inlining) bounds. This _has_ caught flaws, though, so I remain interested in keeping it. Using it within the same scope as with ksize(), however, is the problem. Daniel suggests a reasonable solution here, which is "remove ksize" (I'll call this #4), since what he points out is that the users of ksize() follow a very specific pattern and don't benefit from __alloc_size. As I'd like to be expanding the scope of __alloc_size's effects even more strong into runtime (via other users of the information like __builtin_dynamic_object_size()), I think we need to either use the patch I've proposed, make kmalloc have compile-time-visible buckets so __alloc_size seems the "right" size, or remove ksize(). #4 seems maybe doable. Here's a slightly trimmed list: arch/x86/kernel/cpu/microcode/amd.c: memcpy(amd_ucode_patch, p->data, min_t(u32, ksize(p->data), PATCH_MAX_SIZE)); drivers/base/devres.c: total_old_size = ksize(container_of(ptr, struct devres, data)); drivers/dma-buf/dma-resv.c: list->shared_max = (ksize(list) - offsetof(typeof(*list), shared)) / drivers/gpu/drm/drm_managed.c: WARN_ON(dev + 1 > (struct drm_device *) (container + ksize(container))); drivers/net/ethernet/intel/igb/igb_main.c: } else if (size > ksize(q_vector)) { drivers/net/ipa/gsi_trans.c: pool->count = ksize(pool->base) / size; drivers/net/wireless/intel/iwlwifi/mvm/scan.c: memset(mvm->scan_cmd, 0, ksize(mvm->scan_cmd)); fs/btrfs/send.c: p->buf_len = ksize(p->buf); fs/coredump.c: cn->size = ksize(corename); include/linux/slab.h:size_t ksize(const void *objp); kernel/bpf/verifier.c: if (ksize(dst) < bytes) { mm/mempool.c: __check_element(pool, element, ksize(element)); mm/nommu.c: return ksize(objp); mm/slab_common.c: ks = ksize(mem); net/core/skbuff.c: unsigned int size = frag_size ? : ksize(data); net/core/skbuff.c: osize = ksize(data); net/core/skbuff.c: size = SKB_WITH_OVERHEAD(ksize(data)); net/core/skbuff.c: size = SKB_WITH_OVERHEAD(ksize(data)); net/core/skbuff.c: size = SKB_WITH_OVERHEAD(ksize(data)); security/tomoyo/gc.c: tomoyo_memory_used[TOMOYO_MEMORY_POLICY] -= ksize(ptr); security/tomoyo/memory.c: const size_t s = ksize(ptr); I would really like to have #2, though, as I could use it for some future kmalloc defenses that need compile-time sizing visibility.
diff --git a/drivers/base/devres.c b/drivers/base/devres.c index eaa9a5cd1db9..1a2645bd7234 100644 --- a/drivers/base/devres.c +++ b/drivers/base/devres.c @@ -855,6 +855,7 @@ void *devm_krealloc(struct device *dev, void *ptr, size_t new_size, gfp_t gfp) size_t total_new_size, total_old_size; struct devres *old_dr, *new_dr; unsigned long flags; + void *allocation; if (unlikely(!new_size)) { devm_kfree(dev, ptr); @@ -874,7 +875,8 @@ void *devm_krealloc(struct device *dev, void *ptr, size_t new_size, gfp_t gfp) if (!check_dr_size(new_size, &total_new_size)) return NULL; - total_old_size = ksize(container_of(ptr, struct devres, data)); + allocation = container_of(ptr, struct devres, data); + total_old_size = ksize(allocation); if (total_old_size == 0) { WARN(1, "Pointer doesn't point to dynamically allocated memory."); return NULL; diff --git a/include/linux/slab.h b/include/linux/slab.h index 37bde99b74af..a14f3bfa2f44 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -182,8 +182,32 @@ int kmem_cache_shrink(struct kmem_cache *s); void * __must_check krealloc(const void *objp, size_t new_size, gfp_t flags) __alloc_size(2); void kfree(const void *objp); void kfree_sensitive(const void *objp); + +/** + * ksize - get the actual amount of memory allocated for a given object + * @objp: Pointer to the object + * + * kmalloc may internally round up allocations and return more memory + * than requested. ksize() can be used to determine the actual amount of + * memory allocated. The caller may use this additional memory, even though + * a smaller amount of memory was initially specified with the kmalloc call. + * The caller must guarantee that objp points to a valid object previously + * allocated with either kmalloc() or kmem_cache_alloc(). The object + * must not be freed during the duration of the call. + * + * Return: size of the actual memory used by @objp in bytes + */ +#define ksize(objp) ({ \ + /* \ + * Getting the actual allocation size means the __alloc_size \ + * hints are no longer valid, and the compiler needs to \ + * forget about them. \ + */ \ + OPTIMIZER_HIDE_VAR(objp); \ + _ksize(objp); \ +}) size_t __ksize(const void *objp); -size_t ksize(const void *objp); +size_t _ksize(const void *objp); #ifdef CONFIG_PRINTK bool kmem_valid_obj(void *object); void kmem_dump_obj(void *object); diff --git a/mm/slab_common.c b/mm/slab_common.c index 23f2ab0713b7..ba5fa8481396 100644 --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -1245,21 +1245,8 @@ void kfree_sensitive(const void *p) } EXPORT_SYMBOL(kfree_sensitive); -/** - * ksize - get the actual amount of memory allocated for a given object - * @objp: Pointer to the object - * - * kmalloc may internally round up allocations and return more memory - * than requested. ksize() can be used to determine the actual amount of - * memory allocated. The caller may use this additional memory, even though - * a smaller amount of memory was initially specified with the kmalloc call. - * The caller must guarantee that objp points to a valid object previously - * allocated with either kmalloc() or kmem_cache_alloc(). The object - * must not be freed during the duration of the call. - * - * Return: size of the actual memory used by @objp in bytes - */ -size_t ksize(const void *objp) +/* Should not be called directly: use "ksize" macro wrapper. */ +size_t _ksize(const void *objp) { size_t size; @@ -1289,7 +1276,7 @@ size_t ksize(const void *objp) kasan_unpoison_range(objp, size); return size; } -EXPORT_SYMBOL(ksize); +EXPORT_SYMBOL(_ksize); /* Tracepoints definitions. */ EXPORT_TRACEPOINT_SYMBOL(kmalloc);
If ksize() is used on an allocation, the compiler cannot make any assumptions about its size any more (as hinted by __alloc_size). Force it to forget. One caller was using a container_of() construction that needed to be worked around. Cc: Marco Elver <elver@google.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: linux-mm@kvack.org Link: https://github.com/ClangBuiltLinux/linux/issues/1599 Fixes: c37495d6254c ("slab: add __alloc_size attributes for better bounds checking") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> --- This appears to work for me, but I'm waiting for more feedback on the specific instance got tripped over in Android. --- drivers/base/devres.c | 4 +++- include/linux/slab.h | 26 +++++++++++++++++++++++++- mm/slab_common.c | 19 +++---------------- 3 files changed, 31 insertions(+), 18 deletions(-)