[v6,00/39] kasan, vmalloc, arm64: add vmalloc tagging support for SW/HW_TAGS

Message ID cover.1643047180.git.andreyknvl@google.com (mailing list archive)

Message

andrey.konovalov@linux.dev Jan. 24, 2022, 6:02 p.m. UTC
From: Andrey Konovalov <andreyknvl@google.com>

Hi,

This patchset adds vmalloc tagging support for SW_TAGS and HW_TAGS
KASAN modes.

The tree with patches is available here:

https://github.com/xairy/linux/tree/up-kasan-vmalloc-tags-v6

About half of the patches are cleanups I made along the way. None of
them seem to be important enough to go through stable, so I decided
not to split them out into separate patches/series.

The patchset is partially based on an early version of the HW_TAGS
patchset by Vincenzo that had vmalloc support. Thus, I added a
Co-developed-by tag into a few patches.

SW_TAGS vmalloc tagging support is straightforward. It reuses all of
the generic KASAN machinery, but uses shadow memory to store tags
instead of magic values. Naturally, vmalloc tagging requires adding
a few kasan_reset_tag() annotations to the vmalloc code.
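
To illustrate the idea, here is a minimal userspace sketch of tag-based
checking with shadow memory. It is not the kernel implementation: all
sketch_* names, the 256-byte modeled region and the initial shadow value
of 0 are assumptions made for this example. It assumes the tag sits in the
top byte of a 64-bit pointer, one shadow byte covers a 16-byte granule,
and 0xFF is a match-all tag; sketch_reset_tag() plays the role that
kasan_reset_tag() plays in the vmalloc code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TAG_SHIFT     56     /* the tag occupies the pointer's top byte */
#define SKETCH_TAG_MATCH_ALL 0xFFu  /* accesses with this tag are not checked */
#define SKETCH_GRANULE       16u    /* one shadow byte covers 16 bytes */
#define SKETCH_NGRANULES     16u    /* modeled region: 256 bytes */

/* Shadow for the modeled region; in this model granules start with tag 0. */
static uint8_t sketch_shadow[SKETCH_NGRANULES];

/* Return addr with its top byte replaced by tag. */
static uint64_t sketch_set_tag(uint64_t addr, uint8_t tag)
{
	uint64_t mask = (uint64_t)0xFF << SKETCH_TAG_SHIFT;

	return (addr & ~mask) | ((uint64_t)tag << SKETCH_TAG_SHIFT);
}

static uint8_t sketch_get_tag(uint64_t addr)
{
	return (uint8_t)(addr >> SKETCH_TAG_SHIFT);
}

/* Hypothetical analogue of kasan_reset_tag(): switch to the match-all tag. */
static uint64_t sketch_reset_tag(uint64_t addr)
{
	return sketch_set_tag(addr, SKETCH_TAG_MATCH_ALL);
}

/* Offset into the modeled region: the address with the tag byte removed. */
static uint64_t sketch_offset(uint64_t addr)
{
	return addr & (((uint64_t)1 << SKETCH_TAG_SHIFT) - 1);
}

/* Store tag in the shadow byte of every granule in [addr, addr + size). */
static void sketch_tag_range(uint64_t addr, size_t size, uint8_t tag)
{
	uint64_t start = sketch_offset(addr);
	uint64_t end = start + size;
	uint64_t g;

	assert(start % SKETCH_GRANULE == 0);
	assert(end <= (uint64_t)SKETCH_NGRANULES * SKETCH_GRANULE);

	for (g = start / SKETCH_GRANULE; g * SKETCH_GRANULE < end; g++)
		sketch_shadow[g] = tag;
}

/* An access passes if the pointer tag is match-all or equals the shadow tag. */
static int sketch_access_ok(uint64_t addr)
{
	uint64_t g = sketch_offset(addr) / SKETCH_GRANULE;

	if (g >= SKETCH_NGRANULES)
		return 0;
	if (sketch_get_tag(addr) == SKETCH_TAG_MATCH_ALL)
		return 1;
	return sketch_shadow[g] == sketch_get_tag(addr);
}
```

In this model, an access one byte past a tagged 32-byte object fails the
check, while the same access through a pointer passed through
sketch_reset_tag() goes through. That is why code that legitimately
touches memory via differently tagged pointers needs the reset.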

HW_TAGS vmalloc tagging support stands out. HW_TAGS KASAN is based on
Arm MTE, which can only assign tags to physical memory. As a result,
HW_TAGS KASAN only tags vmalloc() allocations, which are backed by
page_alloc memory. It ignores vmap() and others.
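
As a rough illustration of the resulting policy, the sketch below models
the decision of whether a mapping in the vmalloc area gets tagged in
HW_TAGS mode. The enum, struct and function names are hypothetical and
exist only for this example; the conditions mirror what is described in
this cover letter (only vmalloc() allocations, only normal non-executable
mappings, only when HW_TAGS KASAN is enabled) and are not the actual
checks in __vmalloc_node_range().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: how the mapping was created. */
enum sketch_origin {
	SKETCH_ORIGIN_VMALLOC,	/* backed by page_alloc memory */
	SKETCH_ORIGIN_VMAP,	/* maps pages supplied by the caller */
	SKETCH_ORIGIN_OTHER,
};

/* Hypothetical: coarse classification of the page protection. */
enum sketch_prot {
	SKETCH_PROT_NORMAL,	/* normal non-executable mapping */
	SKETCH_PROT_EXEC,	/* executable mapping, e.g. module code */
	SKETCH_PROT_OTHER,
};

struct sketch_mapping {
	enum sketch_origin origin;
	enum sketch_prot prot;
};

/* Decide whether this mapping would be tagged under the modeled policy. */
static bool sketch_hw_tags_should_tag(const struct sketch_mapping *m,
				      bool hw_tags_enabled)
{
	assert(m != NULL);

	if (!hw_tags_enabled)
		return false;
	/* Tags are assigned to physical memory, so skip vmap() and others. */
	if (m->origin != SKETCH_ORIGIN_VMALLOC)
		return false;
	/* Executable and other special mappings are left untagged. */
	return m->prot == SKETCH_PROT_NORMAL;
}
```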

Thanks!

Changes in v5->v6:
- Rebased onto mainline/5.17-rc1.
- Drop unnecessary explicit checks for software KASAN modes from
  should_skip_init().

Changes in v4->v5:
- Rebase onto fresh mm.
- Mention optimization intention in the comment for __GFP_ZEROTAGS.
- Replace "kasan: simplify kasan_init_hw_tags" with "kasan: clean up
  feature flags for HW_TAGS mode".
- Use true as kasan_flag_vmalloc static key default.
- Cosmetic changes to __def_gfpflag_names_kasan and __GFP_BITS_SHIFT.

Changes in v3->v4:
- Rebase onto fresh mm.
- Rename KASAN_VMALLOC_NOEXEC to KASAN_VMALLOC_PROT_NORMAL.
- Compare prot with PAGE_KERNEL instead of using pgprot_nx() to
  identify normal non-executable mappings.
- Rename arch_vmalloc_pgprot_modify() to arch_vmap_pgprot_tagged().
- Move checks from arch_vmap_pgprot_tagged() to __vmalloc_node_range()
  as the same condition is used for other things in subsequent patches.
- Use proper kasan_hw_tags_enabled() checks instead of
  IS_ENABLED(CONFIG_KASAN_HW_TAGS).
- Set __GFP_SKIP_KASAN_UNPOISON and __GFP_SKIP_ZERO flags instead of
  resetting.
- Only define KASAN GFP flags when HW_TAGS KASAN is enabled.
- Move setting KASAN GFP flags to __vmalloc_node_range() and do it
  only for normal non-executable mapping when HW_TAGS KASAN is enabled.
- Add new GFP flags to include/trace/events/mmflags.h.
- Don't forget to save tagged addr to vm_struct->addr for VM_ALLOC
  so that find_vm_area(addr)->addr == addr for vmalloc().
- Reset pointer tag in change_memory_common().
- Add test checks for set_memory_*() on vmalloc() allocations.
- Minor patch descriptions and comments fixes.

Changes in v2->v3:
- Rebase onto mm.
- New patch: "kasan, arm64: reset pointer tags of vmapped stacks".
- New patch: "kasan, vmalloc: don't tag executable vmalloc allocations".
- New patch: "kasan, arm64: don't tag executable vmalloc allocations".
- Allowing enabling KASAN_VMALLOC with SW/HW_TAGS is moved to
  "kasan: allow enabling KASAN_VMALLOC and SW/HW_TAGS", as this can only
  be done once executable allocations are no longer tagged.
- Minor fixes, see patches for lists of changes.

Changes in v1->v2:
- Move memory init for vmalloc() into vmalloc code for HW_TAGS KASAN.
- Minor fixes and code reshuffling, see patches for lists of changes.

Acked-by: Marco Elver <elver@google.com>

Andrey Konovalov (39):
  kasan, page_alloc: deduplicate should_skip_kasan_poison
  kasan, page_alloc: move tag_clear_highpage out of
    kernel_init_free_pages
  kasan, page_alloc: merge kasan_free_pages into free_pages_prepare
  kasan, page_alloc: simplify kasan_poison_pages call site
  kasan, page_alloc: init memory of skipped pages on free
  kasan: drop skip_kasan_poison variable in free_pages_prepare
  mm: clarify __GFP_ZEROTAGS comment
  kasan: only apply __GFP_ZEROTAGS when memory is zeroed
  kasan, page_alloc: refactor init checks in post_alloc_hook
  kasan, page_alloc: merge kasan_alloc_pages into post_alloc_hook
  kasan, page_alloc: combine tag_clear_highpage calls in post_alloc_hook
  kasan, page_alloc: move SetPageSkipKASanPoison in post_alloc_hook
  kasan, page_alloc: move kernel_init_free_pages in post_alloc_hook
  kasan, page_alloc: rework kasan_unpoison_pages call site
  kasan: clean up metadata byte definitions
  kasan: define KASAN_VMALLOC_INVALID for SW_TAGS
  kasan, x86, arm64, s390: rename functions for modules shadow
  kasan, vmalloc: drop outdated VM_KASAN comment
  kasan: reorder vmalloc hooks
  kasan: add wrappers for vmalloc hooks
  kasan, vmalloc: reset tags in vmalloc functions
  kasan, fork: reset pointer tags of vmapped stacks
  kasan, arm64: reset pointer tags of vmapped stacks
  kasan, vmalloc: add vmalloc tagging for SW_TAGS
  kasan, vmalloc, arm64: mark vmalloc mappings as pgprot_tagged
  kasan, vmalloc: unpoison VM_ALLOC pages after mapping
  kasan, mm: only define ___GFP_SKIP_KASAN_POISON with HW_TAGS
  kasan, page_alloc: allow skipping unpoisoning for HW_TAGS
  kasan, page_alloc: allow skipping memory init for HW_TAGS
  kasan, vmalloc: add vmalloc tagging for HW_TAGS
  kasan, vmalloc: only tag normal vmalloc allocations
  kasan, arm64: don't tag executable vmalloc allocations
  kasan: mark kasan_arg_stacktrace as __initdata
  kasan: clean up feature flags for HW_TAGS mode
  kasan: add kasan.vmalloc command line flag
  kasan: allow enabling KASAN_VMALLOC and SW/HW_TAGS
  arm64: select KASAN_VMALLOC for SW/HW_TAGS modes
  kasan: documentation updates
  kasan: improve vmalloc tests

 Documentation/dev-tools/kasan.rst   |  17 ++-
 arch/arm64/Kconfig                  |   2 +-
 arch/arm64/include/asm/vmalloc.h    |   6 +
 arch/arm64/include/asm/vmap_stack.h |   5 +-
 arch/arm64/kernel/module.c          |   5 +-
 arch/arm64/mm/pageattr.c            |   2 +-
 arch/arm64/net/bpf_jit_comp.c       |   3 +-
 arch/s390/kernel/module.c           |   2 +-
 arch/x86/kernel/module.c            |   2 +-
 include/linux/gfp.h                 |  35 +++--
 include/linux/kasan.h               |  97 +++++++++-----
 include/linux/vmalloc.h             |  18 +--
 include/trace/events/mmflags.h      |  14 +-
 kernel/fork.c                       |   1 +
 kernel/scs.c                        |   4 +-
 lib/Kconfig.kasan                   |  20 +--
 lib/test_kasan.c                    | 189 ++++++++++++++++++++++++++-
 mm/kasan/common.c                   |   4 +-
 mm/kasan/hw_tags.c                  | 193 ++++++++++++++++++++++------
 mm/kasan/kasan.h                    |  18 ++-
 mm/kasan/shadow.c                   |  63 +++++----
 mm/page_alloc.c                     | 152 +++++++++++++++-------
 mm/vmalloc.c                        |  99 +++++++++++---
 23 files changed, 731 insertions(+), 220 deletions(-)

Comments

Andrey Konovalov Jan. 24, 2022, 6:32 p.m. UTC | #1
On Mon, Jan 24, 2022 at 7:09 PM Marco Elver <elver@google.com> wrote:
>
> On Mon, 24 Jan 2022 at 19:02, <andrey.konovalov@linux.dev> wrote:
> >
> > From: Andrey Konovalov <andreyknvl@google.com>
> >
> > Hi,
> >
> > This patchset adds vmalloc tagging support for SW_TAGS and HW_TAGS
> > KASAN modes.
> [...]
> >
> > Acked-by: Marco Elver <elver@google.com>
>
> FYI, my Ack may get lost here - on rebase you could apply it to all
> patches to carry it forward. As-is, Andrew would still have to apply
> it manually.

Sounds good, will do if there is a v7.

> An Ack to the cover letter saves replying to each patch and thus
> generates fewer emails, which I think is preferred.
>
> My Ack is still valid, given v6 is mainly a rebase and I don't see any
> major changes.

Thanks, Marco!

Qian Cai April 28, 2022, 2:13 p.m. UTC | #2
On Mon, Jan 24, 2022 at 07:02:08PM +0100, andrey.konovalov@linux.dev wrote:
> From: Andrey Konovalov <andreyknvl@google.com>
> 
> Hi,
> 
> This patchset adds vmalloc tagging support for SW_TAGS and HW_TAGS
> KASAN modes.
> 
> The tree with patches is available here:
> 
> https://github.com/xairy/linux/tree/up-kasan-vmalloc-tags-v6
> 
> About half of the patches are cleanups I made along the way. None of
> them seem to be important enough to go through stable, so I decided
> not to split them out into separate patches/series.
> 
> The patchset is partially based on an early version of the HW_TAGS
> patchset by Vincenzo that had vmalloc support. Thus, I added a
> Co-developed-by tag into a few patches.
> 
> SW_TAGS vmalloc tagging support is straightforward. It reuses all of
> the generic KASAN machinery, but uses shadow memory to store tags
> instead of magic values. Naturally, vmalloc tagging requires adding
> a few kasan_reset_tag() annotations to the vmalloc code.
> 
> HW_TAGS vmalloc tagging support stands out. HW_TAGS KASAN is based on
> Arm MTE, which can only assign tags to physical memory. As a result,
> HW_TAGS KASAN only tags vmalloc() allocations, which are backed by
> page_alloc memory. It ignores vmap() and others.

I could use some help here. Ever since this series, our system has started
to trigger bad page state bugs from time to time. Any thoughts?

 BUG: Bad page state in process systemd-udevd  pfn:83ffffcd
 page:fffffc20fdfff340 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x83ffffcd
 flags: 0xbfffc0000001000(reserved|node=0|zone=2|lastcpupid=0xffff)
 raw: 0bfffc0000001000 fffffc20fdfff348 fffffc20fdfff348 0000000000000000
 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
 page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
 page_owner info is not present (never set?)
 CPU: 76 PID: 1873 Comm: systemd-udevd Not tainted 5.18.0-rc4-next-20220428-dirty #67
 Call trace:
  dump_backtrace
  show_stack
  dump_stack_lvl
  dump_stack
  bad_page
  free_pcp_prepare
  free_unref_page
  __free_pages
  free_pages.part.0
  free_pages
  kasan_depopulate_vmalloc_pte
  (inlined by) kasan_depopulate_vmalloc_pte at mm/kasan/shadow.c:361
  apply_to_pte_range
  apply_to_pmd_range
  apply_to_pud_range
  __apply_to_page_range
  apply_to_existing_page_range
  kasan_release_vmalloc
  (inlined by) kasan_release_vmalloc at mm/kasan/shadow.c:469
  __purge_vmap_area_lazy
  purge_vmap_area_lazy
  alloc_vmap_area
  __get_vm_area_node.constprop.0
  __vmalloc_node_range
  module_alloc
  move_module
  layout_and_allocate
  load_module
  __do_sys_finit_module
  __arm64_sys_finit_module
  invoke_syscall
  el0_svc_common.constprop.0
  do_el0_svc
  el0_svc
  el0t_64_sync_handler
  el0t_64_sync
 Disabling lock debugging due to kernel taint
 BUG: Bad page state in process systemd-udevd  pfn:83ffffcc
 page:fffffc20fdfff300 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x83ffffcc
 flags: 0xbfffc0000001000(reserved|node=0|zone=2|lastcpupid=0xffff)
 raw: 0bfffc0000001000 fffffc20fdfff308 fffffc20fdfff308 0000000000000000
 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
 page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
 page_owner info is not present (never set?)
 CPU: 76 PID: 1873 Comm: systemd-udevd Tainted: G    B             5.18.0-rc4-next-20220428-dirty #67
 Call trace:
  dump_backtrace
  show_stack
  dump_stack_lvl
  dump_stack
  bad_page
  free_pcp_prepare
  free_unref_page
  __free_pages
  free_pages.part.0
  free_pages
  kasan_depopulate_vmalloc_pte
  apply_to_pte_range
  apply_to_pmd_range
  apply_to_pud_range
  __apply_to_page_range
  apply_to_existing_page_range
  kasan_release_vmalloc
  __purge_vmap_area_lazy
  purge_vmap_area_lazy
  alloc_vmap_area
  __get_vm_area_node.constprop.0
  __vmalloc_node_range
  module_alloc
  move_module
  layout_and_allocate
  load_module
  __do_sys_finit_module
  __arm64_sys_finit_module
  invoke_syscall
  el0_svc_common.constprop.0
  do_el0_svc
  el0_svc
  el0t_64_sync_handler
  el0t_64_sync

Andrey Konovalov April 28, 2022, 3:28 p.m. UTC | #3
On Thu, Apr 28, 2022 at 4:14 PM Qian Cai <quic_qiancai@quicinc.com> wrote:
>
> > SW_TAGS vmalloc tagging support is straightforward. It reuses all of
> > the generic KASAN machinery, but uses shadow memory to store tags
> > instead of magic values. Naturally, vmalloc tagging requires adding
> > a few kasan_reset_tag() annotations to the vmalloc code.
>
> I could use some help here. Ever since this series, our system has started
> to trigger bad page state bugs from time to time. Any thoughts?
>
>  BUG: Bad page state in process systemd-udevd  pfn:83ffffcd
>  page:fffffc20fdfff340 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x83ffffcd
>  flags: 0xbfffc0000001000(reserved|node=0|zone=2|lastcpupid=0xffff)
>  raw: 0bfffc0000001000 fffffc20fdfff348 fffffc20fdfff348 0000000000000000
>  raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
>  page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
>  page_owner info is not present (never set?)

Hi Qian,

No ideas so far.

Looks like the page has the reserved flag set when it's being freed.
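
For reference, a small userspace sketch of what that check amounts to,
using the flags words from the dumps in this thread. The names are
hypothetical, and the bit position is an assumption read off the dump:
the only low flag bit set in the reported word is 0x1000, which the
decoder prints as "reserved". The real PAGE_FLAGS_CHECK_AT_FREE mask
covers more flags than this one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit 12 (0x1000) is the "reserved" flag, as the dump suggests. */
#define SKETCH_PG_RESERVED_BIT	12

/* Stand-in for the check-at-free mask; it holds only the reserved flag. */
#define SKETCH_CHECK_AT_FREE	((uint64_t)1 << SKETCH_PG_RESERVED_BIT)

/* True if a page with these flags would be reported as bad when freed. */
static bool sketch_bad_at_free(uint64_t flags)
{
	return (flags & SKETCH_CHECK_AT_FREE) != 0;
}

/* Usage: feed in the flags words printed in the reports. */
void sketch_demo(void)
{
	/* Both flags words from the reports have 0x1000 set. */
	assert(sketch_bad_at_free(0xbfffc0000001000ULL));
	assert(sketch_bad_at_free(0x1bfffc0000001000ULL));

	/* The same word with the reserved bit cleared passes the check. */
	assert(!sketch_bad_at_free(0xbfffc0000001000ULL & ~SKETCH_CHECK_AT_FREE));
}
```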

Does this crash only happen with the SW_TAGS mode?

Does this crash only happen when loading modules?

Does your system have any hot-plugged memory?

Thanks!

Qian Cai April 28, 2022, 4:12 p.m. UTC | #4
On Thu, Apr 28, 2022 at 05:28:12PM +0200, Andrey Konovalov wrote:
> No ideas so far.
> 
> Looks like the page has the reserved flag set when it's being freed.
> 
> Does this crash only happen with the SW_TAGS mode?

No, the system is running exclusively with CONFIG_KASAN_GENERIC=y

> Does this crash only happen when loading modules?

Yes. Here is another slightly different path at the bottom.

> Does your system have any hot-plugged memory?

No.

 BUG: Bad page state in process systemd-udevd  pfn:403fc007c
 page:fffffd00fd001f00 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x403fc007c
 flags: 0x1bfffc0000001000(reserved|node=1|zone=2|lastcpupid=0xffff)
 raw: 1bfffc0000001000 fffffd00fd001f08 fffffd00fd001f08 0000000000000000
 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
 page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
 CPU: 101 PID: 2004 Comm: systemd-udevd Not tainted 5.17.0-rc8-next-20220317-dirty #39
 Call trace:
  dump_backtrace
  show_stack
  dump_stack_lvl
  dump_stack
  bad_page
  free_pcp_prepare
  free_pages_prepare at mm/page_alloc.c:1348
  (inlined by) free_pcp_prepare at mm/page_alloc.c:1403
  free_unref_page
  __free_pages
  free_pages.part.0
  free_pages
  kasan_depopulate_vmalloc_pte
  (inlined by) kasan_depopulate_vmalloc_pte at mm/kasan/shadow.c:359
  apply_to_pte_range
  apply_to_pte_range at mm/memory.c:2547
  apply_to_pmd_range
  apply_to_pud_range
  __apply_to_page_range
  apply_to_existing_page_range
  kasan_release_vmalloc
  (inlined by) kasan_release_vmalloc at mm/kasan/shadow.c:469
  __purge_vmap_area_lazy
  _vm_unmap_aliases.part.0
  __vunmap
  __vfree
  vfree
  module_memfree
  free_module
  do_init_module
  load_module
  __do_sys_finit_module
  __arm64_sys_finit_module
  invoke_syscall
  el0_svc_common.constprop.0
  do_el0_svc
  el0_svc
  el0t_64_sync_handler
  el0t_64_sync
 Disabling lock debugging due to kernel taint
 BUG: Bad page state in process systemd-udevd  pfn:403fc007b
 page:fffffd00fd001ec0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x403fc007b
 flags: 0x1bfffc0000001000(reserved|node=1|zone=2|lastcpupid=0xffff)
 raw: 1bfffc0000001000 fffffd00fd001ec8 fffffd00fd001ec8 0000000000000000
 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
 page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
 CPU: 101 PID: 2004 Comm: systemd-udevd Tainted: G    B             5.17.0-rc8-next-20220317-dirty #39
 Call trace:
  dump_backtrace
  show_stack
  dump_stack_lvl
  dump_stack
  bad_page
  free_pcp_prepare
  free_unref_page
  __free_pages
  free_pages.part.0
  free_pages
  kasan_depopulate_vmalloc_pte
  apply_to_pte_range
  apply_to_pmd_range
  apply_to_pud_range
  __apply_to_page_range
  apply_to_existing_page_range
  kasan_release_vmalloc
  __purge_vmap_area_lazy
  _vm_unmap_aliases.part.0
  __vunmap
  __vfree
  vfree
  module_memfree
  free_module
  do_init_module
  load_module
  __do_sys_finit_module
  __arm64_sys_finit_module
  invoke_syscall
  el0_svc_common.constprop.0
  do_el0_svc
  el0_svc
  el0t_64_sync_handler
  el0t_64_sync