[v1,0/2] mseal: fixing madvise for file-backed mapping and PROT_NONE

Message ID 20241017005105.3047458-1-jeffxu@chromium.org (mailing list archive)

Message

Jeff Xu Oct. 17, 2024, 12:51 a.m. UTC
From: Jeff Xu <jeffxu@google.com>

Two fixes for madvise(MADV_DONTNEED) when sealed.

For PROT_NONE mappings, the previous blocking of
madvise(MADV_DONTNEED) is unnecessary. As PROT_NONE already prohibits
memory access, madvise(MADV_DONTNEED) should be allowed to proceed in
order to free the page.

For file-backed, private, read-only memory mappings, we previously did
not block the madvise(MADV_DONTNEED). This was based on
the assumption that the memory's content, being file-backed, could be
retrieved from the file if accessed again. However, this assumption
failed to consider scenarios where a mapping is initially created as
read-write, modified, and subsequently changed to read-only. The newly
introduced VM_WASWRITE flag addresses this oversight.
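
The read-write, modify, then read-only sequence described above can be reproduced from user space. The following is a minimal sketch, not code from the series: the function name and the /tmp path template are made up for illustration, and the mseal() call is left out (it would sit between mprotect() and madvise()) so that the sketch also runs on kernels without mseal. It only shows that MADV_DONTNEED drops the private modification and that the next read sees the file's content again.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch series).
 * Returns 1 if MADV_DONTNEED discarded a private modification made
 * before the mapping was switched to read-only, 0 if the modification
 * survived, and -1 if the setup failed.
 */
int dontneed_discards_private_write(void)
{
	char path[] = "/tmp/mseal-demo-XXXXXX";	/* assumed scratch location */
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	int result = -1;
	char *p;
	int fd;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);

	/* The file itself contains "file" at offset 0. */
	if (ftruncate(fd, (off_t)len) != 0 || pwrite(fd, "file", 4, 0) != 4)
		goto out_close;

	/* Step 1: private, file-backed mapping created read-write. */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	if (p == MAP_FAILED)
		goto out_close;

	/* Step 2: modify it; the write lands in a private copy-on-write page. */
	memcpy(p, "anon", 4);

	/* Step 3: change the mapping to read-only. */
	if (mprotect(p, len, PROT_READ) != 0)
		goto out_unmap;

	/* In the sealed scenario, mseal(p, len, 0) would be called here. */

	/* Step 4: discard the pages; the private copy is dropped. */
	if (madvise(p, len, MADV_DONTNEED) != 0)
		goto out_unmap;

	/* The next read faults the page back in from the file. */
	result = (memcmp(p, "file", 4) == 0);

out_unmap:
	munmap(p, len);
out_close:
	close(fd);
	return result;
}
```

Calling this function from a small main() and checking for a return value of 1 shows the data loss that a read-only, file-backed, private mapping can suffer when madvise(MADV_DONTNEED) is not blocked.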

Jeff Xu (2):
  mseal: Two fixes for madvise(MADV_DONTNEED) when sealed
  selftest/mseal: Add tests for madvise

 include/linux/mm.h                      |   2 +
 mm/mprotect.c                           |   3 +
 mm/mseal.c                              |  42 +++++++--
 tools/testing/selftests/mm/mseal_test.c | 118 +++++++++++++++++++++++-
 4 files changed, 157 insertions(+), 8 deletions(-)

Comments

Lorenzo Stoakes Oct. 17, 2024, 8:38 a.m. UTC | #1
NACK.

On Thu, Oct 17, 2024 at 12:51:03AM +0000, jeffxu@chromium.org wrote:
> From: Jeff Xu <jeffxu@google.com>
>
> Two fixes for madvise(MADV_DONTNEED) when sealed.
>
> For PROT_NONE mappings, the previous blocking of
> madvise(MADV_DONTNEED) is unnecessary. As PROT_NONE already prohibits
> memory access, madvise(MADV_DONTNEED) should be allowed to proceed in
> order to free the page.

Except if they are VM_MAYWRITE...

>
> For file-backed, private, read-only memory mappings, we previously did
> not block the madvise(MADV_DONTNEED). This was based on
> the assumption that the memory's content, being file-backed, could be
> retrieved from the file if accessed again. However, this assumption
> failed to consider scenarios where a mapping is initially created as
> read-write, modified, and subsequently changed to read-only. The newly
> introduced VM_WASWRITE flag addresses this oversight.

There's no justification for adding a new VMA flag, especially given it
will break VMA merging for everyone.

This whole approach seems broken. What you seem to need is to check whether
a mapping _could_ be mapped writably at some stage.

The kernel doesn't need to track whether the mapping was ever writable in
the past; it only needs to know whether it could be mapped writably.

Please look at VM_MAYWRITE and mapping_writably_mapped() (to account for
memfd seal behaviour).
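
To illustrate the distinction between "writable now" and "could be made writable", here is a small user-space model. It is only a sketch under stated assumptions: the flag values mirror those in include/linux/mm.h but are redefined locally so the file builds outside the kernel, both function names are hypothetical, and the mapping_writably_mapped() side of the check is not modelled at all.

```c
#include <stdbool.h>

/* Local copies of the VMA flag bits; values mirror include/linux/mm.h. */
#define VM_READ		0x00000001UL
#define VM_WRITE	0x00000002UL
#define VM_MAYREAD	0x00000010UL
#define VM_MAYWRITE	0x00000020UL

/* Hypothetical helper: is the mapping writable at this moment? */
bool vma_writable_now(unsigned long vm_flags)
{
	return (vm_flags & VM_WRITE) != 0;
}

/*
 * Hypothetical helper: could mprotect() ever have granted, or still
 * grant, write access? VM_MAYWRITE answers this without recording
 * any history in a new flag.
 */
bool vma_could_be_writable(unsigned long vm_flags)
{
	return (vm_flags & VM_MAYWRITE) != 0;
}
```

In this model, a mapping with flags VM_READ | VM_MAYREAD | VM_MAYWRITE is not writable now but could be writable, which covers the read-write-then-read-only case, while a mapping with only VM_READ | VM_MAYREAD can never have held private modifications.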

Also you need to rewrite your tests to be readable.

>
> Jeff Xu (2):
>   mseal: Two fixes for madvise(MADV_DONTNEED) when sealed
>   selftest/mseal: Add tests for madvise
>
>  include/linux/mm.h                      |   2 +
>  mm/mprotect.c                           |   3 +
>  mm/mseal.c                              |  42 +++++++--
>  tools/testing/selftests/mm/mseal_test.c | 118 +++++++++++++++++++++++-
>  4 files changed, 157 insertions(+), 8 deletions(-)
>
> --
> 2.47.0.rc1.288.g06298d1525-goog
>