[RFC,v1,2/2] mseal: allow noop mprotect

Message ID	20250312002117.2556240-3-jeffxu@google.com (mailing list archive)
State	New
Headers	From: jeffxu@chromium.org
	To: akpm@linux-foundation.org, vbabka@suse.cz, lorenzo.stoakes@oracle.com, Liam.Howlett@Oracle.com, broonie@kernel.org, skhan@linuxfoundation.org
	Cc: linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, jorgelo@chromium.org, keescook@chromium.org, pedro.falcato@gmail.com, rdunlap@infradead.org, jannh@google.com, Jeff Xu <jeffxu@chromium.org>
	Subject: [RFC PATCH v1 2/2] mseal: allow noop mprotect
	Date: Wed, 12 Mar 2025 00:21:17 +0000
	In-Reply-To: <20250312002117.2556240-1-jeffxu@google.com>
Series	mseal: allow noop mprotect
	[RFC,v1,0/2] mseal: allow noop mprotect
	[RFC,v1,1/2] selftests/mm: mseal_test: avoid using no-op mprotect
	[RFC,v1,2/2] mseal: allow noop mprotect

Jeff Xu March 12, 2025, 12:21 a.m. UTC

From: Jeff Xu <jeffxu@chromium.org>

When mseal was introduced in 6.10, an mprotect call was rejected if any VMA
within the specified address range was sealed, leaving all VMAs unmodified.
However, the extra loop that checked the mseal flag of every VMA slowed things
down a bit, so in 6.12 this was addressed by removing can_modify_mm and
checking each VMA’s mseal flag directly, without an extra loop [1]. This is a
semantic change: partial updates are allowed, i.e. VMAs can be updated until a
sealed VMA is found.

The new semantics also mean that we could allow mprotect on a sealed VMA if the
new attributes of the VMA are the same as the old ones. Relaxing this avoids
unnecessary impact on applications that want to seal a particular mapping.
Doing this also has no security impact.

[1] https://lore.kernel.org/all/20240817-mseal-depessimize-v3-0-d8d2e037df30@gmail.com/

Fixes: 4a2dd02b0916 ("mm/mprotect: replace can_modify_mm with can_modify_vma")
Signed-off-by: Jeff Xu <jeffxu@chromium.org>
---
 mm/mprotect.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
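
The effect of the reordering can be modelled outside the kernel. The following
is a minimal userspace sketch, not the kernel code: struct toy_vma and the
toy_* functions are hypothetical names, and only the order of the two checks
from the mm/mprotect.c hunk is reproduced.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a VMA; only the fields this model needs. */
struct toy_vma {
	unsigned long flags;	/* current protection flags */
	bool sealed;		/* true once the mapping has been sealed */
};

/* Models the idea behind can_modify_vma(): sealed VMAs may not be modified. */
static bool toy_can_modify(const struct toy_vma *vma)
{
	return !vma->sealed;
}

/* Ordering before the patch: seal check first, then the no-op shortcut. */
static int toy_fixup_seal_first(struct toy_vma *vma, unsigned long newflags)
{
	assert(vma);
	if (!toy_can_modify(vma))
		return -EPERM;
	if (newflags == vma->flags)
		return 0;
	vma->flags = newflags;
	return 0;
}

/* Ordering proposed by the patch: no-op shortcut first, then the seal check. */
static int toy_fixup_noop_first(struct toy_vma *vma, unsigned long newflags)
{
	assert(vma);
	if (newflags == vma->flags)
		return 0;
	if (!toy_can_modify(vma))
		return -EPERM;
	vma->flags = newflags;
	return 0;
}
```

For a sealed toy_vma whose flags already equal newflags, toy_fixup_seal_first()
returns -EPERM while toy_fixup_noop_first() returns 0. A request that would
actually change the flags of a sealed toy_vma is rejected by both.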

Lorenzo Stoakes March 12, 2025, 1:49 p.m. UTC | #1

On Wed, Mar 12, 2025 at 12:21:17AM +0000, jeffxu@chromium.org wrote:
> From: Jeff Xu <jeffxu@chromium.org>
>
> When mseal was introduced in 6.10, an mprotect call was rejected if any VMA
> within the specified address range was sealed, leaving all VMAs unmodified.
> However, the extra loop that checked the mseal flag of every VMA slowed things
> down a bit, so in 6.12 this was addressed by removing can_modify_mm and
> checking each VMA’s mseal flag directly, without an extra loop [1]. This is a
> semantic change: partial updates are allowed, i.e. VMAs can be updated until a
> sealed VMA is found.
>
> The new semantics also mean that we could allow mprotect on a sealed VMA if the
> new attributes of the VMA are the same as the old ones. Relaxing this avoids
> unnecessary impact on applications that want to seal a particular mapping.
> Doing this also has no security impact.
>
> [1] https://lore.kernel.org/all/20240817-mseal-depessimize-v3-0-d8d2e037df30@gmail.com/
>
> Fixes: 4a2dd02b0916 ("mm/mprotect: replace can_modify_mm with can_modify_vma")
> Signed-off-by: Jeff Xu <jeffxu@chromium.org>
> ---
>  mm/mprotect.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index 516b1d847e2c..a24d23967aa5 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -613,14 +613,14 @@ mprotect_fixup(struct vma_iterator *vmi, struct mmu_gather *tlb,
>  	unsigned long charged = 0;
>  	int error;
>
> -	if (!can_modify_vma(vma))
> -		return -EPERM;
> -
>  	if (newflags == oldflags) {
>  		*pprev = vma;
>  		return 0;
>  	}
>
> +	if (!can_modify_vma(vma))
> +		return -EPERM;
> +
>  	/*
>  	 * Do PROT_NONE PFN permission checks here when we can still
>  	 * bail out without undoing a lot of state. This is a rather
> --
> 2.49.0.rc0.332.g42c0ae87b1-goog
>

Hm I'm not so sure about this, to me a seal means 'don't touch', even if
the touch would be a no-op. It's simpler to be totally consistent on this
and makes the code easier everywhere.

Because if we start saying 'apply mseal rules, except if we can determine
this to be a no-op' then that implies we might have some inconsistency in
other operations that do not do that, and sometimes a 'no-op' might be
ill-defined etc.

I think generally I'd rather leave things as they are unless you have a
specific real-life case where this is causing problems?

Kees Cook March 12, 2025, 3:27 p.m. UTC | #2

On March 12, 2025 6:49:39 AM PDT, Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
>
>Hm I'm not so sure about this, to me a seal means 'don't touch', even if
>the touch would be a no-op. It's simpler to be totally consistent on this
>and makes the code easier everywhere.
>
>Because if we start saying 'apply mseal rules, except if we can determine
>this to be a no-op' then that implies we might have some inconsistency in
>other operations that do not do that, and sometimes a 'no-op' might be
>ill-defined etc.

Does mseal mean "you cannot call mprotect on this VMA" or does it mean "you cannot change this VMA"? I've always considered it the latter, since the entry point to making VMA changes doesn't matter (mmap, mprotect, etc.); it's the VMA that can't change. Even the internal function name is "can_modify", and if the flags aren't changing then it's not a modification.

I think it's more ergonomic to check for _changes_.

-Kees

Pedro Falcato March 12, 2025, 3:48 p.m. UTC | #3

On Wed, Mar 12, 2025 at 3:28 PM Kees Cook <kees@kernel.org> wrote:
>
>
>
> On March 12, 2025 6:49:39 AM PDT, Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
> >
> >Hm I'm not so sure about this, to me a seal means 'don't touch', even if
> >the touch would be a no-op. It's simpler to be totally consistent on this
> >and makes the code easier everywhere.
> >
> >Because if we start saying 'apply mseal rules, except if we can determine
> >this to be a no-op' then that implies we might have some inconsistency in
> >other operations that do not do that, and sometimes a 'no-op' might be
> >ill-defined etc.
>
> Does mseal mean "you cannot call mprotect on this VMA" or does it mean "you cannot change this VMA"? I've always considered it the latter, since the entry point to making VMA changes doesn't matter (mmap, mprotect, etc.); it's the VMA that can't change. Even the internal function name is "can_modify", and if the flags aren't changing then it's not a modification.
>
> I think it's more ergonomic to check for _changes_.

I think this is a slippery slope because some changes are not trivial
to deal with, e.g.:

int fd = open("somefile", O_RDONLY);
void *ptr = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
/* Map the same file range over ptr again with identical protections. */
mmap(ptr, 4096, PROT_READ, MAP_FIXED | MAP_SHARED, fd, 0);


So, on the one hand, I don't really have grounds to say this patch is
incorrect. On the other hand, I'd like to see either a particular
problem or consistent criteria we can apply to all VMA-related
situations.

Lorenzo Stoakes March 12, 2025, 3:50 p.m. UTC | #4

On Wed, Mar 12, 2025 at 08:27:57AM -0700, Kees Cook wrote:
>
>
> On March 12, 2025 6:49:39 AM PDT, Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
> >
> >Hm I'm not so sure about this, to me a seal means 'don't touch', even if
> >the touch would be a no-op. It's simpler to be totally consistent on this
> >and makes the code easier everywhere.
> >
> >Because if we start saying 'apply mseal rules, except if we can determine
> >this to be a no-op' then that implies we might have some inconsistency in
> >other operations that do not do that, and sometimes a 'no-op' might be
> >ill-defined etc.
>
> Does mseal mean "you cannot call mprotect on this VMA" or does it mean "you cannot change this VMA"? I've always considered it the latter, since the entry point to making VMA changes doesn't matter (mmap, mprotect, etc.); it's the VMA that can't change. Even the internal function name is "can_modify", and if the flags aren't changing then it's not a modification.

Right, but here it's easy to determine that.

What about madvise() with MADV_DONTNEED on a r/o VMA that's not faulted in?
That's a no-op right? But it's not permitted.

So now we have an inconsistency between the two calls.

Should we now check to see if all the madvise() calls are somehow no-ops
and permit them? Because that gets potentially egregious, fast.

My concern is that we set a trap for ourselves by establishing some kind of
contract, implicit or not, that otherwise-mseal-prevented-calls will be
permitted if they result in a no-op.

To me it's simpler to say 'if we touch a VMA with a call that modifies
things, and it's sealed, we abort'.

Easy, doesn't set traps, no reasonable situation in which that should cause
problems.

>
> I think it's more ergonomic to check for _changes_.

I don't know what you mean by 'ergonomic'?

>
> -Kees
>
> --
> Kees Cook

My reply seemed to get truncated at the end here :) So let me ask again -
do you have a practical case in mind for this?

Kees Cook March 12, 2025, 4:45 p.m. UTC | #5

On Wed, Mar 12, 2025 at 03:50:40PM +0000, Lorenzo Stoakes wrote:
> What about madvise() with MADV_DONTNEED on a r/o VMA that's not faulted in?
> That's a no-op right? But it's not permitted.

Hmm, yes, that's a good example. Thank you!

> So now we have an inconsistency between the two calls.

Yeah, I see your concern now.

> I don't know what you mean by 'ergonomic'?

I was thinking about idempotent-ness. Like, some library setting up a
memory region, it can't call its setup routine twice if the second time
through (where no changes are made) it gets rejected. But I think this
is likely just a userspace problem: check for the VMAs before blindly
trying to do it again. (This is strictly an imagined situation.)

> My reply seemed to get truncated at the end here :) So let me ask again -
> do you have a practical case in mind for this?

Sorry, I didn't have any reply to that part, so I left it off. If Jeff
has a specific case in mind, I'll let him answer that part. :)

-Kees

Jeff Xu March 12, 2025, 11:29 p.m. UTC | #6

On Wed, Mar 12, 2025 at 9:45 AM Kees Cook <kees@kernel.org> wrote:
>
> On Wed, Mar 12, 2025 at 03:50:40PM +0000, Lorenzo Stoakes wrote:
> > What about madvise() with MADV_DONTNEED on a r/o VMA that's not faulted in?
> > That's a no-op right? But it's not permitted.
>
Madvise's semantics are about behavior, while mprotect's are about
attributes. To me, madvise is like "make this VMA do that" while
mprotect is "update this VMA's attributes to a new value".

It is more difficult to determine if a behavior is no-op, so I don't
intend to apply the same no-op concept to madvise().

> Hmm, yes, that's a good example. Thank you!
>
> > So now we have an inconsistency between the two calls.
>
> Yeah, I see your concern now.
>
> > I don't know what you mean by 'ergonomic'?
>
> I was thinking about idempotent-ness. Like, some library setting up a
> memory region, it can't call its setup routine twice if the second time
> through (where no changes are made) it gets rejected. But I think this
> is likely just a userspace problem: check for the VMAs before blindly
> trying to do it again. (This is strictly an imagined situation.)
>
Yes.

We also don't have a system call to query the mprotect attributes,
so it is understandable that userspace would rely on mprotect being
idempotent.

> > My reply seemed to get truncated at the end here :) So let me ask again -
> > do you have a practical case in mind for this?
>
I noticed there were idempotent mprotects last year while working on
applying mseal on stack in Android. I assume this might not be the
only instance since mprotect gets called a lot in general.

Blocking this won't improve security; it could actually hinder the
adoption of mseal by forcing apps to make code changes.
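
The case in question can be probed from userspace with something like the
sketch below. It is illustrative only: probe_noop_mprotect() and the
probe_result values are hypothetical names, and the fallback syscall number
462 for mseal is an assumption that should be checked against the headers of
the target architecture.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_mseal
#define __NR_mseal 462	/* assumed value; verify for your architecture */
#endif

enum probe_result {
	PROBE_UNSUPPORTED,	/* mmap or mseal did not work on this system */
	PROBE_REJECTED,		/* no-op mprotect on the sealed page gave EPERM */
	PROBE_ALLOWED,		/* no-op mprotect on the sealed page succeeded */
	PROBE_OTHER_ERROR,	/* mprotect failed for some other reason */
};

/* Seal one read-only page, then request the protection it already has. */
static enum probe_result probe_noop_mprotect(void)
{
	long page = sysconf(_SC_PAGESIZE);
	void *p;

	assert(page > 0);

	p = mmap(NULL, (size_t)page, PROT_READ,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return PROBE_UNSUPPORTED;

	if (syscall(__NR_mseal, p, (size_t)page, 0UL) != 0) {
		munmap(p, (size_t)page);
		return PROBE_UNSUPPORTED;
	}

	/*
	 * The page is sealed from here on, so it cannot be unmapped and
	 * stays mapped until the process exits.
	 */
	if (mprotect(p, (size_t)page, PROT_READ) == 0)
		return PROBE_ALLOWED;

	return errno == EPERM ? PROBE_REJECTED : PROBE_OTHER_ERROR;
}
```

On a kernel without this patch the expectation is PROBE_REJECTED, and with it
PROBE_ALLOWED; systems without mseal support report PROBE_UNSUPPORTED.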

-Jeff

> Sorry, I didn't have any reply to that part, so I left it off. If Jeff
> has a specific case in mind, I'll let him answer that part. :)
>
> -Kees
>
> --
> Kees Cook
