Message ID | 5cef104d9b842899489b4054fe8d1339a71acee0.1700502145.git.andreyknvl@google.com (mailing list archive) |
---|---|
State | New |
Series | stackdepot: allow evicting stack traces |
On Tue, Nov 21, 2023 at 1:08 PM <andrey.konovalov@linux.dev> wrote:
>
> From: Andrey Konovalov <andreyknvl@google.com>
>
> Evict alloc/free stack traces from the stack depot for Generic KASAN
> once they are evicted from the quarantine.
>
> For auxiliary stack traces, evict the oldest stack trace once a new one
> is saved (KASAN only keeps references to the last two).
>
> Also evict all saved stack traces on krealloc.
>
> To avoid double-evicting and mis-evicting stack traces (in case KASAN's
> metadata was corrupted), reset KASAN's per-object metadata that stores
> stack depot handles when the object is initialized and when it's evicted
> from the quarantine.
>
> Note that stack_depot_put is no-op if the handle is 0.
>
> Reviewed-by: Marco Elver <elver@google.com>
> Signed-off-by: Andrey Konovalov <andreyknvl@google.com>

I observed boot hangs on a few SLUB configurations.

Having other users of stackdepot might be the cause. After passing
'slub_debug=-' which disables SLUB debugging, it boots fine.

compiler version: gcc-11
config: https://download.kerneltesting.org/builds/2023-11-21-f121f2/.config
bisect log: https://download.kerneltesting.org/builds/2023-11-21-f121f2/bisect.log.txt

[dmesg]
(gdb) lx-dmesg
[    0.000000] Linux version 6.7.0-rc1-00136-g0e8b630f3053 (hyeyoo@localhost.localdomain) (gcc (GCC) 11.3.1 20221121 (R3
[    0.000000] Command line: console=ttyS0 root=/dev/sda1 nokaslr
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc1-00136-g0e8b630f3053 #22
[    0.000000] RIP: 0010:setup_arch+0x500/0x2250
[    0.000000] Code: c6 09 08 00 48 89 c5 48 85 c0 0f 84 58 13 00 00 48 c1 e8 03 48 83 05 be 97 66 00 01 80 3c 18 00 0f3
[    0.000000] RSP: 0000:ffffffff86007e00 EFLAGS: 00010046 ORIG_RAX: 0000000000000009
[    0.000000] RAX: 1fffffffffe40088 RBX: dffffc0000000000 RCX: 1ffffffff11ed630
[    0.000000] RDX: 0000000000000000 RSI: feec4698e8103000 RDI: ffffffff88f6b180
[    0.000000] RBP: ffffffffff200444 R08: 8000000000000163 R09: 1ffffffff11ed628
[    0.000000] R10: ffffffff88f7a150 R11: 0000000000000000 R12: 0000000000000010
[    0.000000] R13: ffffffffff200450 R14: feec4698e8102444 R15: feec4698e8102444
[    0.000000] FS: 0000000000000000(0000) GS:ffffffff88d5b000(0000) knlGS:0000000000000000
[    0.000000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.000000] CR2: ffffffffff200444 CR3: 0000000008f0e000 CR4: 00000000000000b0
[    0.000000] Call Trace:
[    0.000000] <TASK>
[    0.000000] ? show_regs+0x87/0xa0
[    0.000000] ? early_fixup_exception+0x130/0x310
[    0.000000] ? do_early_exception+0x23/0x90
[    0.000000] ? early_idt_handler_common+0x2f/0x40
[    0.000000] ? setup_arch+0x500/0x2250
[    0.000000] ? __pfx_setup_arch+0x10/0x10
[    0.000000] ? vprintk_default+0x20/0x30
[    0.000000] ? vprintk+0x4c/0x80
[    0.000000] ? _printk+0xba/0xf0
[    0.000000] ? __pfx__printk+0x10/0x10
[    0.000000] ? init_cgroup_root+0x10f/0x2f0
--Type <RET> for more, q to quit, c to continue without paging--
[    0.000000] ? cgroup_init_early+0x1e4/0x440
[    0.000000] ? start_kernel+0xae/0x790
[    0.000000] ? x86_64_start_reservations+0x28/0x50
[    0.000000] ? x86_64_start_kernel+0x10e/0x130
[    0.000000] ? secondary_startup_64_no_verify+0x178/0x17b
[    0.000000] </TASK>

--
Hyeonggon
On Wed, Nov 22, 2023 at 12:17 PM Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
>
> On Tue, Nov 21, 2023 at 1:08 PM <andrey.konovalov@linux.dev> wrote:
> >
> > From: Andrey Konovalov <andreyknvl@google.com>
> >
> > Evict alloc/free stack traces from the stack depot for Generic KASAN
> > once they are evicted from the quarantine.
> >
> > For auxiliary stack traces, evict the oldest stack trace once a new one
> > is saved (KASAN only keeps references to the last two).
> >
> > Also evict all saved stack traces on krealloc.
> >
> > To avoid double-evicting and mis-evicting stack traces (in case KASAN's
> > metadata was corrupted), reset KASAN's per-object metadata that stores
> > stack depot handles when the object is initialized and when it's evicted
> > from the quarantine.
> >
> > Note that stack_depot_put is no-op if the handle is 0.
> >
> > Reviewed-by: Marco Elver <elver@google.com>
> > Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
>
> I observed boot hangs on a few SLUB configurations.
>
> Having other users of stackdepot might be the cause. After passing
> 'slub_debug=-' which disables SLUB debugging, it boots fine.

Looks like I forgot to Cc regzbot.
If you need more information, please let me know.

#regzbot introduced: f0ff84b7c3a

Thanks,
Hyeonggon

> compiler version: gcc-11
> config: https://download.kerneltesting.org/builds/2023-11-21-f121f2/.config
> bisect log: https://download.kerneltesting.org/builds/2023-11-21-f121f2/bisect.log.txt
>
> [dmesg]
> (gdb) lx-dmesg
> [    0.000000] Linux version 6.7.0-rc1-00136-g0e8b630f3053
> (hyeyoo@localhost.localdomain) (gcc (GCC) 11.3.1 20221121 (R3[
> 0.000000] Command line: console=ttyS0 root=/dev/sda1 nokaslr
> [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted
> 6.7.0-rc1-00136-g0e8b630f3053 #22
> [    0.000000] RIP: 0010:setup_arch+0x500/0x2250
> [    0.000000] Code: c6 09 08 00 48 89 c5 48 85 c0 0f 84 58 13 00 00
> 48 c1 e8 03 48 83 05 be 97 66 00 01 80 3c 18 00 0f3[    0.000000] RSP:
> 0000:ffffffff86007e00 EFLAGS: 00010046 ORIG_RAX: 0000000000000009
> [    0.000000] RAX: 1fffffffffe40088 RBX: dffffc0000000000 RCX: 1ffffffff11ed630
> [    0.000000] RDX: 0000000000000000 RSI: feec4698e8103000 RDI: ffffffff88f6b180
> [    0.000000] RBP: ffffffffff200444 R08: 8000000000000163 R09: 1ffffffff11ed628
> [    0.000000] R10: ffffffff88f7a150 R11: 0000000000000000 R12: 0000000000000010
> [    0.000000] R13: ffffffffff200450 R14: feec4698e8102444 R15: feec4698e8102444
> [    0.000000] FS: 0000000000000000(0000) GS:ffffffff88d5b000(0000)
> knlGS:0000000000000000
> [    0.000000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    0.000000] CR2: ffffffffff200444 CR3: 0000000008f0e000 CR4: 00000000000000b0
> [    0.000000] Call Trace:
> [    0.000000] <TASK>
> [    0.000000] ? show_regs+0x87/0xa0
> [    0.000000] ? early_fixup_exception+0x130/0x310
> [    0.000000] ? do_early_exception+0x23/0x90
> [    0.000000] ? early_idt_handler_common+0x2f/0x40
> [    0.000000] ? setup_arch+0x500/0x2250
> [    0.000000] ? __pfx_setup_arch+0x10/0x10
> [    0.000000] ? vprintk_default+0x20/0x30
> [    0.000000] ? vprintk+0x4c/0x80
> [    0.000000] ? _printk+0xba/0xf0
> [    0.000000] ? __pfx__printk+0x10/0x10
> [    0.000000] ? init_cgroup_root+0x10f/0x2f0
> --Type <RET> for more, q to quit, c to continue without paging--
> [    0.000000] ? cgroup_init_early+0x1e4/0x440
> [    0.000000] ? start_kernel+0xae/0x790
> [    0.000000] ? x86_64_start_reservations+0x28/0x50
> [    0.000000] ? x86_64_start_kernel+0x10e/0x130
> [    0.000000] ? secondary_startup_64_no_verify+0x178/0x17b
> [    0.000000] </TASK>
On Wed, Nov 22, 2023 at 4:17 AM Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
>
> On Tue, Nov 21, 2023 at 1:08 PM <andrey.konovalov@linux.dev> wrote:
> >
> > From: Andrey Konovalov <andreyknvl@google.com>
> >
> > Evict alloc/free stack traces from the stack depot for Generic KASAN
> > once they are evicted from the quarantine.
> >
> > For auxiliary stack traces, evict the oldest stack trace once a new one
> > is saved (KASAN only keeps references to the last two).
> >
> > Also evict all saved stack traces on krealloc.
> >
> > To avoid double-evicting and mis-evicting stack traces (in case KASAN's
> > metadata was corrupted), reset KASAN's per-object metadata that stores
> > stack depot handles when the object is initialized and when it's evicted
> > from the quarantine.
> >
> > Note that stack_depot_put is no-op if the handle is 0.
> >
> > Reviewed-by: Marco Elver <elver@google.com>
> > Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
>
> I observed boot hangs on a few SLUB configurations.
>
> Having other users of stackdepot might be the cause. After passing
> 'slub_debug=-' which disables SLUB debugging, it boots fine.

Hi Hyeonggon,

Just mailed a fix.

Thank you for the report!
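
For readers following the commit message, the auxiliary-stack rule it describes (keep references to the last two traces, drop the oldest when a new one is saved, and treat a put on handle 0 as a no-op) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every `toy_*` name is hypothetical, the table is a plain array rather than the kernel's stack depot, and unlike the real depot it does not deduplicate identical traces.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model. NOT the kernel's stack depot; all toy_* names are hypothetical. */
typedef unsigned int toy_handle_t;	/* 0 means "no stack trace saved" */

#define TOY_MAX_RECORDS 8

static int toy_refcount[TOY_MAX_RECORDS + 1];	/* index 0 is never used */
static toy_handle_t toy_next_handle = 1;

/* Save a "trace" and take one reference, loosely modelling a save with a GET flag. */
static toy_handle_t toy_save(void)
{
	if (toy_next_handle > TOY_MAX_RECORDS)
		return 0;	/* table full: report "nothing saved" */

	toy_handle_t h = toy_next_handle++;
	toy_refcount[h] = 1;
	return h;
}

/* Drop one reference. A zero handle is a no-op, as the commit message notes. */
static void toy_put(toy_handle_t h)
{
	if (h == 0)
		return;
	assert(toy_refcount[h] > 0);	/* would catch a double put */
	toy_refcount[h]--;
}

/* Per-object metadata holding the two most recent auxiliary traces. */
struct toy_alloc_meta {
	toy_handle_t aux_stack[2];
};

/* Evict the oldest auxiliary trace, shift the newer one down, save a new one. */
static void toy_record_aux(struct toy_alloc_meta *meta)
{
	toy_put(meta->aux_stack[1]);
	meta->aux_stack[1] = meta->aux_stack[0];
	meta->aux_stack[0] = toy_save();
}

/* Count records that still hold at least one reference. */
static int toy_live_records(void)
{
	int live = 0;

	for (toy_handle_t h = 1; h < toy_next_handle; h++)
		if (toy_refcount[h] > 0)
			live++;
	return live;
}
```

Calling `toy_record_aux()` three times on zero-initialized metadata leaves two live records in this model: the first two calls put handle 0 (a no-op), and the third drops the reference taken by the first. Zeroing the metadata before first use is what makes those early puts harmless, which is the same reason the patch resets the per-object metadata on initialization.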
diff --git a/mm/kasan/common.c b/mm/kasan/common.c
index 825a0240ec02..b5d8bd26fced 100644
--- a/mm/kasan/common.c
+++ b/mm/kasan/common.c
@@ -50,7 +50,8 @@ depot_stack_handle_t kasan_save_stack(gfp_t flags, depot_flags_t depot_flags)
 void kasan_set_track(struct kasan_track *track, gfp_t flags)
 {
 	track->pid = current->pid;
-	track->stack = kasan_save_stack(flags, STACK_DEPOT_FLAG_CAN_ALLOC);
+	track->stack = kasan_save_stack(flags,
+			STACK_DEPOT_FLAG_CAN_ALLOC | STACK_DEPOT_FLAG_GET);
 }
 
 #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
index 5d168c9afb32..50cc519e23f4 100644
--- a/mm/kasan/generic.c
+++ b/mm/kasan/generic.c
@@ -449,10 +449,14 @@ struct kasan_free_meta *kasan_get_free_meta(struct kmem_cache *cache,
 void kasan_init_object_meta(struct kmem_cache *cache, const void *object)
 {
 	struct kasan_alloc_meta *alloc_meta;
+	struct kasan_free_meta *free_meta;
 
 	alloc_meta = kasan_get_alloc_meta(cache, object);
 	if (alloc_meta)
 		__memset(alloc_meta, 0, sizeof(*alloc_meta));
+	free_meta = kasan_get_free_meta(cache, object);
+	if (free_meta)
+		__memset(free_meta, 0, sizeof(*free_meta));
 }
 
 size_t kasan_metadata_size(struct kmem_cache *cache, bool in_object)
@@ -489,18 +493,20 @@ static void __kasan_record_aux_stack(void *addr, depot_flags_t depot_flags)
 	if (!alloc_meta)
 		return;
 
+	stack_depot_put(alloc_meta->aux_stack[1]);
 	alloc_meta->aux_stack[1] = alloc_meta->aux_stack[0];
 	alloc_meta->aux_stack[0] = kasan_save_stack(0, depot_flags);
 }
 
 void kasan_record_aux_stack(void *addr)
 {
-	return __kasan_record_aux_stack(addr, STACK_DEPOT_FLAG_CAN_ALLOC);
+	return __kasan_record_aux_stack(addr,
+			STACK_DEPOT_FLAG_CAN_ALLOC | STACK_DEPOT_FLAG_GET);
 }
 
 void kasan_record_aux_stack_noalloc(void *addr)
 {
-	return __kasan_record_aux_stack(addr, 0);
+	return __kasan_record_aux_stack(addr, STACK_DEPOT_FLAG_GET);
 }
 
 void kasan_save_alloc_info(struct kmem_cache *cache, void *object, gfp_t flags)
@@ -508,8 +514,16 @@ void kasan_save_alloc_info(struct kmem_cache *cache, void *object, gfp_t flags)
 	struct kasan_alloc_meta *alloc_meta;
 
 	alloc_meta = kasan_get_alloc_meta(cache, object);
-	if (alloc_meta)
-		kasan_set_track(&alloc_meta->alloc_track, flags);
+	if (!alloc_meta)
+		return;
+
+	/* Evict previous stack traces (might exist for krealloc). */
+	stack_depot_put(alloc_meta->alloc_track.stack);
+	stack_depot_put(alloc_meta->aux_stack[0]);
+	stack_depot_put(alloc_meta->aux_stack[1]);
+	__memset(alloc_meta, 0, sizeof(*alloc_meta));
+
+	kasan_set_track(&alloc_meta->alloc_track, flags);
 }
 
 void kasan_save_free_info(struct kmem_cache *cache, void *object)
diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
index ca4529156735..265ca2bbe2dd 100644
--- a/mm/kasan/quarantine.c
+++ b/mm/kasan/quarantine.c
@@ -143,11 +143,22 @@ static void *qlink_to_object(struct qlist_node *qlink, struct kmem_cache *cache)
 static void qlink_free(struct qlist_node *qlink, struct kmem_cache *cache)
 {
 	void *object = qlink_to_object(qlink, cache);
-	struct kasan_free_meta *meta = kasan_get_free_meta(cache, object);
+	struct kasan_alloc_meta *alloc_meta = kasan_get_alloc_meta(cache, object);
+	struct kasan_free_meta *free_meta = kasan_get_free_meta(cache, object);
 	unsigned long flags;
 
-	if (IS_ENABLED(CONFIG_SLAB))
-		local_irq_save(flags);
+	if (alloc_meta) {
+		stack_depot_put(alloc_meta->alloc_track.stack);
+		stack_depot_put(alloc_meta->aux_stack[0]);
+		stack_depot_put(alloc_meta->aux_stack[1]);
+		__memset(alloc_meta, 0, sizeof(*alloc_meta));
+	}
+
+	if (free_meta &&
+	    *(u8 *)kasan_mem_to_shadow(object) == KASAN_SLAB_FREETRACK) {
+		stack_depot_put(free_meta->free_track.stack);
+		free_meta->free_track.stack = 0;
+	}
 
 	/*
 	 * If init_on_free is enabled and KASAN's free metadata is stored in
@@ -157,14 +168,17 @@ static void qlink_free(struct qlist_node *qlink, struct kmem_cache *cache)
 	 */
 	if (slab_want_init_on_free(cache) &&
 	    cache->kasan_info.free_meta_offset == 0)
-		memzero_explicit(meta, sizeof(*meta));
+		memzero_explicit(free_meta, sizeof(*free_meta));
 
 	/*
-	 * As the object now gets freed from the quarantine, assume that its
-	 * free track is no longer valid.
+	 * As the object now gets freed from the quarantine,
+	 * take note that its free track is no longer exists.
 	 */
 	*(u8 *)kasan_mem_to_shadow(object) = KASAN_SLAB_FREE;
 
+	if (IS_ENABLED(CONFIG_SLAB))
+		local_irq_save(flags);
+
 	___cache_free(cache, object, _THIS_IP_);
 
 	if (IS_ENABLED(CONFIG_SLAB))