[0/3] kasan: Fix ordering between MTE tag colouring and page->flags

Message ID: 20220517180945.756303-1-catalin.marinas@arm.com

Message

Catalin Marinas May 17, 2022, 6:09 p.m. UTC
Hi,

That's more of an RFC to get a discussion started. I plan to eventually
apply the third patch reverting the page_kasan_tag_reset() calls under
arch/arm64 since they don't cover all cases (the race is rare and we
haven't hit anything yet but it's possible).

On a system with MTE and KASAN_HW_TAGS enabled, when a page is allocated
kasan_unpoison_pages() sets a random tag and saves it in page->flags so
that page_to_virt() re-creates the correct tagged pointer. We need to
ensure that the in-memory tags are visible before setting the
page->flags:

P0 (__kasan_unpoison_range):	P1 (access via virt_to_page):
  Wtags=x			  Rflags=x
    |				    |
    | DMB			    | address dependency
    V				    V
  Wflags=x			  Rtags=x

The first patch changes the order of page unpoisoning with the tag
storing in page->flags. page_kasan_tag_set() has the right barriers
through try_cmpxchg().

If such page is mapped in user-space with PROT_MTE, the architecture
code will set the tag to 0 and a subsequent page_to_virt() dereference
will fault. We currently try to fix this by resetting the tag in
page->flags so that it is 0xff (match-all, not faulting). However,
setting the tags and flags can race with another CPU reading the flags
(page_to_virt()) and barriers can't help, e.g.:

P0 (mte_sync_page_tags):        P1 (memcpy from virt_to_page):
                                  Rflags!=0xff
  Wflags=0xff
  DMB (doesn't help)
  Wtags=0
                                  Rtags=0   // fault

Since clearing the flags in the arch code doesn't work, try to do this
at page allocation time via a new flag added to GFP_USER. Could we
instead add __GFP_SKIP_KASAN_UNPOISON rather than a new flag?

Thanks.

Catalin Marinas (3):
  mm: kasan: Ensure the tags are visible before the tag in page->flags
  mm: kasan: Reset the tag on pages intended for user
  arm64: kasan: Revert "arm64: mte: reset the page tag in page->flags"

 arch/arm64/kernel/hibernate.c |  5 -----
 arch/arm64/kernel/mte.c       |  9 ---------
 arch/arm64/mm/copypage.c      |  9 ---------
 arch/arm64/mm/fault.c         |  1 -
 arch/arm64/mm/mteswap.c       |  9 ---------
 include/linux/gfp.h           | 10 +++++++---
 mm/kasan/common.c             |  3 ++-
 mm/page_alloc.c               |  9 ++++++---
 8 files changed, 15 insertions(+), 40 deletions(-)

Comments

Andrey Konovalov May 19, 2022, 9:45 p.m. UTC | #1
On Tue, May 17, 2022 at 8:09 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> Hi,

Hi Catalin,

> That's more of an RFC to get a discussion started. I plan to eventually
> apply the third patch reverting the page_kasan_tag_reset() calls under
> arch/arm64 since they don't cover all cases (the race is rare and we
> haven't hit anything yet but it's possible).
>
> On a system with MTE and KASAN_HW_TAGS enabled, when a page is allocated
> kasan_unpoison_pages() sets a random tag and saves it in page->flags so
> that page_to_virt() re-creates the correct tagged pointer. We need to
> ensure that the in-memory tags are visible before setting the
> page->flags:
>
> P0 (__kasan_unpoison_range):    P1 (access via virt_to_page):
>   Wtags=x                         Rflags=x
>     |                               |
>     | DMB                           | address dependency
>     V                               V
>   Wflags=x                        Rtags=x

This is confusing: the paragraph mentions page_to_virt() and the
diagram - virt_to_page(). I assume it should be page_to_virt().

alloc_pages(), which calls kasan_unpoison_pages(), has to return
before page_to_virt() can be called. So they only can race if the tags
don't get propagated to memory before alloc_pages() returns, right?
This is why you say that the race is rare?

> The first patch changes the order of page unpoisoning with the tag
> storing in page->flags. page_kasan_tag_set() has the right barriers
> through try_cmpxchg().

[...]

> If such page is mapped in user-space with PROT_MTE, the architecture
> code will set the tag to 0 and a subsequent page_to_virt() dereference
> will fault. We currently try to fix this by resetting the tag in
> page->flags so that it is 0xff (match-all, not faulting). However,
> setting the tags and flags can race with another CPU reading the flags
> (page_to_virt()) and barriers can't help, e.g.:
>
> P0 (mte_sync_page_tags):        P1 (memcpy from virt_to_page):
>                                   Rflags!=0xff
>   Wflags=0xff
>   DMB (doesn't help)
>   Wtags=0
>                                   Rtags=0   // fault

So this change, effectively, makes the tag in page->flags for GFP_USER
pages to be reset at allocation time. And the current approach of
resetting the tag when the kernel is about to access these pages is
not good because: 1. it's inconvenient to track all places where this
should be done and 2. the tag reset can race with page_to_virt() even
with patch #1 applied. Is my understanding correct?

This will reset the tags for all kinds of GFP_USER allocations, not
only for the ones intended for MAP_ANONYMOUS and RAM-based file
mappings, for which userspace can set tags, right? This will thus
weaken in-kernel MTE for pages whose tags can't even be set by
userspace. Is there a way to deal with this?

> Since clearing the flags in the arch code doesn't work, try to do this
> at page allocation time by a new flag added to GFP_USER. Could we
> instead add __GFP_SKIP_KASAN_UNPOISON rather than a new flag?

Why do we need a new flag? Can we just check & GFP_USER instead?

Thanks!
Catalin Marinas May 20, 2022, 1:01 p.m. UTC | #2
On Thu, May 19, 2022 at 11:45:04PM +0200, Andrey Konovalov wrote:
> On Tue, May 17, 2022 at 8:09 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
> > That's more of an RFC to get a discussion started. I plan to eventually
> > apply the third patch reverting the page_kasan_tag_reset() calls under
> > arch/arm64 since they don't cover all cases (the race is rare and we
> > haven't hit anything yet but it's possible).
> >
> > On a system with MTE and KASAN_HW_TAGS enabled, when a page is allocated
> > kasan_unpoison_pages() sets a random tag and saves it in page->flags so
> > that page_to_virt() re-creates the correct tagged pointer. We need to
> > ensure that the in-memory tags are visible before setting the
> > page->flags:
> >
> > P0 (__kasan_unpoison_range):    P1 (access via virt_to_page):
> >   Wtags=x                         Rflags=x
> >     |                               |
> >     | DMB                           | address dependency
> >     V                               V
> >   Wflags=x                        Rtags=x
> 
> This is confusing: the paragraph mentions page_to_virt() and the
> diagram - virt_to_page(). I assume it should be page_to_virt().

Yes, it should be page_to_virt().

> alloc_pages(), which calls kasan_unpoison_pages(), has to return
> before page_to_virt() can be called. So they only can race if the tags
> don't get propagated to memory before alloc_pages() returns, right?
> This is why you say that the race is rare?

Yeah, it involves another CPU getting the pfn or page address and trying
to access it before the tags are propagated (not necessarily to DRAM; it
can be some cache level, or they are just stuck in a write buffer).
It's unlikely but still possible.

See a somewhat related recent memory ordering fix, it was found in
actual testing:

https://git.kernel.org/arm64/c/1d0cb4c8864a

> > If such page is mapped in user-space with PROT_MTE, the architecture
> > code will set the tag to 0 and a subsequent page_to_virt() dereference
> > will fault. We currently try to fix this by resetting the tag in
> > page->flags so that it is 0xff (match-all, not faulting). However,
> > setting the tags and flags can race with another CPU reading the flags
> > (page_to_virt()) and barriers can't help, e.g.:
> >
> > P0 (mte_sync_page_tags):        P1 (memcpy from virt_to_page):
> >                                   Rflags!=0xff
> >   Wflags=0xff
> >   DMB (doesn't help)
> >   Wtags=0
> >                                   Rtags=0   // fault
> 
> So this change, effectively, makes the tag in page->flags for GFP_USER
> pages to be reset at allocation time. And the current approach of
> resetting the tag when the kernel is about to access these pages is
> not good because: 1. it's inconvenient to track all places where this
> should be done and 2. the tag reset can race with page_to_virt() even
> with patch #1 applied. Is my understanding correct?

Yes. Regarding (1), it's pretty impractical. There are some clear places
like copy_user_highpage() where we could untag the page address; in
others it may not be as simple. We could try to reset the page flags
when we do get_user_pages() to cover another class, but we still have
swap and page migration that may read a page with a mismatched tag.

> This will reset the tags for all kinds of GFP_USER allocations, not
> only for the ones intended for MAP_ANONYMOUS and RAM-based file
> mappings, for which userspace can set tags, right? This will thus
> weaken in-kernel MTE for pages whose tags can't even be set by
> userspace. Is there a way to deal with this?

That's correct, it will weaken some of the allocations where the user
doesn't care about MTE. And TBH, I'm not sure it covers all cases
either (can we have an anonymous or memfd page mapped in user space that
was not allocated with GFP_USER?).

Another option would be to lock the page but set_pte_at() seems to be
called for pages both locked and unlocked.

Any suggestions are welcome.

> > Since clearing the flags in the arch code doesn't work, try to do this
> > at page allocation time by a new flag added to GFP_USER. Could we
> > instead add __GFP_SKIP_KASAN_UNPOISON rather than a new flag?
> 
> Why do we need a new flag? Can we just check & GFP_USER instead?

GFP_USER is not a flag as such but a combination of flags, none of which
explicitly says it's meant for user space.
Andrey Konovalov May 21, 2022, 10:20 p.m. UTC | #3
On Fri, May 20, 2022 at 3:01 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> > So this change, effectively, makes the tag in page->flags for GFP_USER
> > pages to be reset at allocation time. And the current approach of
> > resetting the tag when the kernel is about to access these pages is
> > not good because: 1. it's inconvenient to track all places where this
> > should be done and 2. the tag reset can race with page_to_virt() even
> > with patch #1 applied. Is my understanding correct?
>
> Yes. Regarding (1), it's pretty impractical. There are some clear places
> like copy_user_highpage() where we could untag the page address; in
> others it may not be as simple. We could try to reset the page flags
> when we do get_user_pages() to cover another class, but we still have
> swap and page migration that may read a page with a mismatched tag.

I see.

> > This will reset the tags for all kinds of GFP_USER allocations, not
> > only for the ones intended for MAP_ANONYMOUS and RAM-based file
> > mappings, for which userspace can set tags, right? This will thus
> > weaken in-kernel MTE for pages whose tags can't even be set by
> > userspace. Is there a way to deal with this?
>
> That's correct, it will weaken some of the allocations where the user
> doesn't care about MTE.

Well, while this is unfortunate, I don't mind the change.

I've left some comments on the patches.

> > > Since clearing the flags in the arch code doesn't work, try to do this
> > > at page allocation time by a new flag added to GFP_USER.

Does this have to be GFP_USER? Can we add new flags to
GFP_HIGHUSER_MOVABLE instead?

For instance, Peter added __GFP_SKIP_KASAN_POISON to
GFP_HIGHUSER_MOVABLE in c275c5c6d50a0.

> > > Could we
> > > instead add __GFP_SKIP_KASAN_UNPOISON rather than a new flag?

Adding __GFP_SKIP_KASAN_UNPOISON makes sense, but we still need to
reset the tag in page->flags.

Thanks!
Catalin Marinas May 25, 2022, 3:45 p.m. UTC | #4
On Sun, May 22, 2022 at 12:20:26AM +0200, Andrey Konovalov wrote:
> On Fri, May 20, 2022 at 3:01 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > This will reset the tags for all kinds of GFP_USER allocations, not
> > > only for the ones intended for MAP_ANONYMOUS and RAM-based file
> > > mappings, for which userspace can set tags, right? This will thus
> > > weaken in-kernel MTE for pages whose tags can't even be set by
> > > userspace. Is there a way to deal with this?
> >
> > That's correct, it will weaken some of the allocations where the user
> > doesn't care about MTE.
> 
> Well, while this is unfortunate, I don't mind the change.
> 
> I've left some comments on the patches.

Thanks. I'll update and post at -rc1.

> > > > Since clearing the flags in the arch code doesn't work, try to do this
> > > > at page allocation time by a new flag added to GFP_USER.
> 
> Does this have to be GFP_USER? Can we add new flags to
> GFP_HIGHUSER_MOVABLE instead?
> 
> For instance, Peter added __GFP_SKIP_KASAN_POISON to
> GFP_HIGHUSER_MOVABLE in c275c5c6d50a0.

The above commit was a performance improvement. Here we need to address
the correctness. However, looking through the GFP_USER cases, I don't
think any of them is at risk of ending up in user space with PROT_MTE.
There are places where GFP_USER is passed to kmalloc() for in-kernel
objects that would never be mapped to user, though the new gfp flag
won't be taken into account.

I'm ok to move the new flag to GFP_HIGHUSER_MOVABLE but probably
still keep a page_kasan_tag_reset() on the set_pte_at() path together
with a WARN_ON_ONCE() in case we miss anything.

> > > > Could we
> > > > instead add __GFP_SKIP_KASAN_UNPOISON rather than a new flag?
> 
> Adding __GFP_SKIP_KASAN_UNPOISON makes sense, but we still need to
> reset the tag in page->flags.

My thought was to reset the tag in page->flags based on 'unpoison'
alone without any extra flags. We use this flag for vmalloc() pages but
it seems we don't reset the page tags (as we do via
kasan_poison_slab()).
Andrey Konovalov May 25, 2022, 5:41 p.m. UTC | #5
On Wed, May 25, 2022 at 5:45 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> > Does this have to be GFP_USER? Can we add new flags to
> > GFP_HIGHUSER_MOVABLE instead?
> >
> > For instance, Peter added __GFP_SKIP_KASAN_POISON to
> > GFP_HIGHUSER_MOVABLE in c275c5c6d50a0.
>
> The above commit was a performance improvement. Here we need to address
> the correctness. However, looking through the GFP_USER cases, I don't
> think any of them is at risk of ending up in user space with PROT_MTE.
> There are places where GFP_USER is passed to kmalloc() for in-kernel
> objects that would never be mapped to user, though the new gfp flag
> won't be taken into account.

Yeah, those kmalloc()'s look suspicious.

> I'm ok to move the new flag to the GFP_HIGHUSER_MOVABLE but probably
> still keep a page_kasan_tag_reset() on the set_pte_at() path together
> with a WARN_ON_ONCE() if we miss anything.

GFP_HIGHUSER_MOVABLE is used in fewer places than GFP_USER, so if it
works - great!

However, see below.

> > Adding __GFP_SKIP_KASAN_UNPOISON makes sense, but we still need to
> > reset the tag in page->flags.
>
> My thought was to reset the tag in page->flags based on 'unpoison'
> alone without any extra flags. We use this flag for vmalloc() pages but
> it seems we don't reset the page tags (as we do via
> kasan_poison_slab()).

I just realized that we already have __GFP_ZEROTAGS that initializes
both in-memory and page->flags tags. Currently only used for user
pages allocated via alloc_zeroed_user_highpage_movable(). Perhaps we
can add this flag to GFP_HIGHUSER_MOVABLE?

We'll also need to change the behavior of __GFP_ZEROTAGS to work even
when GFP_ZERO is not set, but this doesn't seem to be a problem.

And, at this point, we can probably combine __GFP_ZEROTAGS with
__GFP_SKIP_KASAN_POISON, as they both would target user pages.

Does this make sense?
Catalin Marinas May 26, 2022, 12:24 p.m. UTC | #6
On Wed, May 25, 2022 at 07:41:08PM +0200, Andrey Konovalov wrote:
> On Wed, May 25, 2022 at 5:45 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > Adding __GFP_SKIP_KASAN_UNPOISON makes sense, but we still need to
> > > reset the tag in page->flags.
> >
> > My thought was to reset the tag in page->flags based on 'unpoison'
> > alone without any extra flags. We use this flag for vmalloc() pages but
> > it seems we don't reset the page tags (as we do via
> > kasan_poison_slab()).
> 
> I just realized that we already have __GFP_ZEROTAGS that initializes
> both in-memory and page->flags tags.

IIUC it only zeroes the tags and skips the unpoisoning but
page_kasan_tag() remains unchanged.

> Currently only used for user
> pages allocated via alloc_zeroed_user_highpage_movable(). Perhaps we
> can add this flag to GFP_HIGHUSER_MOVABLE?

I wouldn't add __GFP_ZEROTAGS to GFP_HIGHUSER_MOVABLE as we only need it
if the page is mapped with PROT_MTE. Clearing a page without tags may be
marginally faster.

> We'll also need to change the behavior of __GFP_ZEROTAGS to work even
> when GFP_ZERO is not set, but this doesn't seem to be a problem.

Why? We'd get unnecessary tag zeroing. We have these cases for
anonymous, private pages:

1. Zeroed page allocation without PROT_MTE: we need GFP_ZERO and
   page_kasan_tag_reset() in case of later mprotect(PROT_MTE).

2. Zeroed page allocation with PROT_MTE: we need GFP_ZERO,
   __GFP_ZEROTAGS and page_kasan_tag_reset().

3. CoW page allocation without PROT_MTE: copy data and we only need
   page_kasan_tag_reset() in case of later mprotect(PROT_MTE).

4. CoW page allocation with PROT_MTE: copy data and tags together with
   page_kasan_tag_reset().

So basically we always need page_kasan_tag_reset() for pages mapped in
user space even if they are not PROT_MTE, in case of a later
mprotect(PROT_MTE). For (1), (3) and (4) we don't need to zero the tags.
For (1) maybe we could do it as part of data zeroing (subject to some
benchmarks) but for (3) and (4) they'd be overridden by the copy anyway.

> And, at this point, we can probably combine __GFP_ZEROTAGS with
> __GFP_SKIP_KASAN_POISON, as they both would target user pages.

For user pages, I think we should skip unpoisoning as well. We can keep
unpoisoning around but if we end up calling page_kasan_tag_reset(),
there's not much value, at least in page_address() accesses since the
pointer would match all tags. That's unless you want to detect other
stray pointers to such pages but we already skip the poisoning on free,
so it doesn't seem to be a use-case.

If we skip unpoisoning (not just poisoning as we already do) for user
pages, we should reset the tags in page->flags. Whether __GFP_ZEROTAGS
is passed is complementary, depending on the reason for allocation.
Currently if __GFP_ZEROTAGS is passed, the unpoisoning is skipped but I
think we should have just added __GFP_SKIP_KASAN_UNPOISON instead and
not add a new argument to should_skip_kasan_unpoison(). If we decide to
always skip unpoisoning, something like below on top of the vanilla
kernel:

-------------8<-----------------
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 3e3d36fc2109..df0ec30524fb 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -348,7 +348,7 @@ struct vm_area_struct;
 #define GFP_DMA32	__GFP_DMA32
 #define GFP_HIGHUSER	(GFP_USER | __GFP_HIGHMEM)
 #define GFP_HIGHUSER_MOVABLE	(GFP_HIGHUSER | __GFP_MOVABLE | \
-			 __GFP_SKIP_KASAN_POISON)
+			 __GFP_SKIP_KASAN_POISON | __GFP_SKIP_KASAN_UNPOISON)
 #define GFP_TRANSHUGE_LIGHT	((GFP_HIGHUSER_MOVABLE | __GFP_COMP | \
 			 __GFP_NOMEMALLOC | __GFP_NOWARN) & ~__GFP_RECLAIM)
 #define GFP_TRANSHUGE	(GFP_TRANSHUGE_LIGHT | __GFP_DIRECT_RECLAIM)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0e42038382c1..3173e8f0e69a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2346,7 +2346,7 @@ static inline bool check_new_pcp(struct page *page, unsigned int order)
 }
 #endif /* CONFIG_DEBUG_VM */

-static inline bool should_skip_kasan_unpoison(gfp_t flags, bool init_tags)
+static inline bool should_skip_kasan_unpoison(gfp_t flags)
 {
 	/* Don't skip if a software KASAN mode is enabled. */
 	if (IS_ENABLED(CONFIG_KASAN_GENERIC) ||
@@ -2358,12 +2358,10 @@ static inline bool should_skip_kasan_unpoison(gfp_t flags, bool init_tags)
 		return true;

 	/*
-	 * With hardware tag-based KASAN enabled, skip if either:
-	 *
-	 * 1. Memory tags have already been cleared via tag_clear_highpage().
-	 * 2. Skipping has been requested via __GFP_SKIP_KASAN_UNPOISON.
+	 * With hardware tag-based KASAN enabled, skip if this was requested
+	 * via __GFP_SKIP_KASAN_UNPOISON.
 	 */
-	return init_tags || (flags & __GFP_SKIP_KASAN_UNPOISON);
+	return flags & __GFP_SKIP_KASAN_UNPOISON;
 }

 static inline bool should_skip_init(gfp_t flags)
@@ -2416,7 +2414,7 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
 		/* Note that memory is already initialized by the loop above. */
 		init = false;
 	}
-	if (!should_skip_kasan_unpoison(gfp_flags, init_tags)) {
+	if (!should_skip_kasan_unpoison(gfp_flags)) {
 		/* Unpoison shadow memory or set memory tags. */
 		kasan_unpoison_pages(page, order, init);
 
-------------8<-----------------

With the above, we can wire up page_kasan_tag_reset() to the
__GFP_SKIP_KASAN_UNPOISON check without any additional flags.
Andrey Konovalov May 31, 2022, 5:16 p.m. UTC | #7
On Thu, May 26, 2022 at 2:24 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> On Wed, May 25, 2022 at 07:41:08PM +0200, Andrey Konovalov wrote:
> > On Wed, May 25, 2022 at 5:45 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > > Adding __GFP_SKIP_KASAN_UNPOISON makes sense, but we still need to
> > > > reset the tag in page->flags.
> > >
> > > My thought was to reset the tag in page->flags based on 'unpoison'
> > > alone without any extra flags. We use this flag for vmalloc() pages but
> > > it seems we don't reset the page tags (as we do via
> > > kasan_poison_slab()).
> >
> > I just realized that we already have __GFP_ZEROTAGS that initializes
> > both in-memory and page->flags tags.
>
> IIUC it only zeroes the tags and skips the unpoisoning but
> page_kasan_tag() remains unchanged.

No, it does page_kasan_tag_reset() via tag_clear_highpage(). At least,
currently.

> > Currently only used for user
> > pages allocated via alloc_zeroed_user_highpage_movable(). Perhaps we
> > can add this flag to GFP_HIGHUSER_MOVABLE?
>
> I wouldn't add __GFP_ZEROTAGS to GFP_HIGHUSER_MOVABLE as we only need it
> if the page is mapped with PROT_MTE. Clearing a page without tags may be
> marginally faster.

Ah, right. We need a dedicated flag for PROT_MTE allocations.

> > We'll also need to change the behavior of __GFP_ZEROTAGS to work even
> > when GFP_ZERO is not set, but this doesn't seem to be a problem.
>
> Why? We'd get unnecessary tag zeroing. We have these cases for
> anonymous, private pages:
>
> 1. Zeroed page allocation without PROT_MTE: we need GFP_ZERO and
>    page_kasan_tag_reset() in case of later mprotect(PROT_MTE).
>
> 2. Zeroed page allocation with PROT_MTE: we need GFP_ZERO,
>    __GFP_ZEROTAGS and page_kasan_tag_reset().
>
> 3. CoW page allocation without PROT_MTE: copy data and we only need
>    page_kasan_tag_reset() in case of later mprotect(PROT_MTE).
>
> 4. CoW page allocation with PROT_MTE: copy data and tags together with
>    page_kasan_tag_reset().
>
> So basically we always need page_kasan_tag_reset() for pages mapped in
> user space even if they are not PROT_MTE, in case of a later
> mprotect(PROT_MTE). For (1), (3) and (4) we don't need to zero the tags.
> For (1) maybe we could do it as part of data zeroing (subject to some
> benchmarks) but for (3) and (4) they'd be overridden by the copy anyway.

Ack.

> > And, at this point, we can probably combine __GFP_ZEROTAGS with
> > __GFP_SKIP_KASAN_POISON, as they both would target user pages.
>
> For user pages, I think we should skip unpoisoning as well. We can keep
> unpoisoning around but if we end up calling page_kasan_tag_reset(),
> there's not much value, at least in page_address() accesses since the
> pointer would match all tags. That's unless you want to detect other
> stray pointers to such pages but we already skip the poisoning on free,
> so it doesn't seem to be a use-case.

Skipping unpoisoning makes sense.

> If we skip unpoisoning (not just poisoning as we already do) for user
> pages, we should reset the tags in page->flags. Whether __GFP_ZEROTAGS
> is passed is complementary, depending on the reason for allocation.

[...]

> Currently if __GFP_ZEROTAGS is passed, the unpoisoning is skipped but I
> think we should have just added __GFP_SKIP_KASAN_UNPOISON instead and
> not add a new argument to should_skip_kasan_unpoison(). If we decide to
> always skip unpoisoning, something like below on top of the vanilla
> kernel:

[...]

> With the above, we can wire up page_kasan_tag_reset() to the
> __GFP_SKIP_KASAN_UNPOISON check without any additional flags.

This would make __GFP_SKIP_KASAN_UNPOISON do two logically unrelated
things: skip setting memory tags and reset page tags. This seems
weird.

I think it makes more sense to split __GFP_ZEROTAGS into
__GFP_ZERO_MEMORY_TAGS and __GFP_ZERO_PAGE_TAGS: the first one does
tag_clear_highpage() without page_kasan_tag_reset() and the second one
does page_kasan_tag_reset() in post_alloc_hook(). Then, add
__GFP_ZERO_PAGE_TAGS to GFP_HIGHUSER_MOVABLE along with
__GFP_SKIP_KASAN_UNPOISON and __GFP_SKIP_KASAN_POISON. And replace
__GFP_ZEROTAGS with __GFP_ZERO_MEMORY_TAGS in
alloc_zeroed_user_highpage_movable().

As an alternative approach that would reduce the number of GFP flags,
we could extend your suggestion and pre-combine all standalone
MTE-related GFP flags based on their use cases:

__GFP_KASAN_VMALLOC == essentially __GFP_SKIP_KASAN_UNPOISON | __GFP_SKIP_ZERO
__GFP_MTE_USER == essentially __GFP_ZERO_PAGE_TAGS | __GFP_SKIP_KASAN_UNPOISON | __GFP_SKIP_KASAN_POISON
__GFP_MTE_USER_ZERO == essentially __GFP_ZERO_MEMORY_TAGS

Then we would only need 3 flags instead of 5.

However, this seems to be at odds with the idea that __GFP flags
should enable/disable a single piece of functionality. So I like the
first approach better.

What do you think?

Thanks!
Catalin Marinas June 9, 2022, 6:32 p.m. UTC | #8
Hi Andrey,

Sorry, I got distracted by the merging window.

On Tue, May 31, 2022 at 07:16:03PM +0200, Andrey Konovalov wrote:
> On Thu, May 26, 2022 at 2:24 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
> > If we skip unpoisoning (not just poisoning as we already do) for user
> > pages, we should reset the tags in page->flags. Whether __GFP_ZEROTAGS
> > is passed is complementary, depending on the reason for allocation.
> 
> [...]
> 
> > Currently if __GFP_ZEROTAGS is passed, the unpoisoning is skipped but I
> > think we should have just added __GFP_SKIP_KASAN_UNPOISON instead and
> > not add a new argument to should_skip_kasan_unpoison(). If we decide to
> > always skip unpoisoning, something like below on top of the vanilla
> > kernel:
> 
> [...]
> 
> > With the above, we can wire up page_kasan_tag_reset() to the
> > __GFP_SKIP_KASAN_UNPOISON check without any additional flags.
> 
> This would make __GFP_SKIP_KASAN_UNPOISON do two logically unrelated
> things: skip setting memory tags and reset page tags. This seems
> weird.

Not entirely weird, it depends on how you look at it. After allocation,
you expect the accesses to page_address() to work, irrespective of the
GFP flags. __kasan_unpoison_pages() ensures that the page->flags match
the written tag without a new GFP flag to set the page->flags. If you
skip the unpoisoning something should reset the page->flags tag to
ensure an accessible page_address(). I find it weirder that you need
another GFP flag to pretty much say 'give me an accessible page'.

> I think it makes more sense to split __GFP_ZEROTAGS into
> __GFP_ZERO_MEMORY_TAGS and __GFP_ZERO_PAGE_TAGS: the first one does
> tag_clear_highpage() without page_kasan_tag_reset() and the second one
> does page_kasan_tag_reset() in post_alloc_hook(). Then, add
> __GFP_ZERO_PAGE_TAGS to GFP_HIGHUSER_MOVABLE along with
> __GFP_SKIP_KASAN_UNPOISON and __GFP_SKIP_KASAN_POISON. And replace
> __GFP_ZEROTAGS with __GFP_ZERO_MEMORY_TAGS in
> alloc_zeroed_user_highpage_movable().

As above, my preference would be to avoid a new flag, just wire this up
to __GFP_SKIP_KASAN_UNPOISON. But if you do want fine-grained control, I
can add the above.

Thanks.
Andrey Konovalov June 9, 2022, 6:40 p.m. UTC | #9
Hi Catalin,

On Thu, Jun 9, 2022 at 8:32 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> > This would make __GFP_SKIP_KASAN_UNPOISON do two logically unrelated
> > things: skip setting memory tags and reset page tags. This seems
> > weird.
>
> Not entirely weird, it depends on how you look at it. After allocation,
> you expect the accesses to page_address() to work, irrespective of the
> GFP flags. __kasan_unpoison_pages() ensures that the page->flags match
> the written tag without a new GFP flag to set the page->flags. If you
> skip the unpoisoning something should reset the page->flags tag to
> ensure an accessible page_address(). I find it weirder that you need
> another GFP flag to pretty much say 'give me an accessible page'.

Hm, this makes sense.

> As above, my preference would be to avoid a new flag, just wire this up
> to __GFP_SKIP_KASAN_UNPOISON. But if you do want fine-grained control, I
> can add the above.

OK, let's do as you suggest.

Thanks!