riscv: Allow PROT_WRITE-only mmap()

Message ID 20220908170133.1159747-1-abrestic@rivosinc.com (mailing list archive)
State New, archived

Commit Message

Andrew Bresticker Sept. 8, 2022, 5:01 p.m. UTC
Commit 2139619bcad7 ("riscv: mmap with PROT_WRITE but no PROT_READ is
invalid") made mmap() return EINVAL if PROT_WRITE was set without
PROT_READ with the justification that a write-only PTE is considered a
reserved PTE permission bit pattern in the privileged spec. This check
is unnecessary since RISC-V defines its protection_map such that PROT_WRITE
maps to the same PTE permissions as PROT_WRITE|PROT_READ, and it is
inconsistent with other architectures that don't support write-only PTEs,
creating a potential software portability issue. Just remove the check
altogether and let PROT_WRITE imply PROT_READ as is the case on other
architectures.

Fixes: 2139619bcad7 ("riscv: mmap with PROT_WRITE but no PROT_READ is invalid")
Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com>
---
 arch/riscv/kernel/sys_riscv.c | 3 ---
 1 file changed, 3 deletions(-)

Comments

SS JieJi Sept. 8, 2022, 5:21 p.m. UTC | #1
> is unnecessary since RISC-V defines its protection_map such that PROT_WRITE
> maps to the same PTE permissions as PROT_WRITE|PROT_READ, and it is
> inconsistent with other architectures that don't support write-only PTEs,
> creating a potential software portability issue.

I don't believe that the check is unnecessary. The missing check was
discovered in a real-world scenario, while we were fixing libaio's test
failure on RISC-V [1]. A minimal reproducible example is uploaded to
https://fars.ee/1sPb, showing *inconsistent* read results on -r- pages
before/after a write attempt performed by the kernel.

[1]: https://pagure.io/libaio/blob/1b18bfafc6a2f7b9fa2c6be77a95afed8b7be448/f/harness/cases/5.t

> -       if (unlikely((prot & PROT_WRITE) && !(prot & PROT_READ)))
> -               return -EINVAL;
> -

Just to mention, this revert patch also removes the check for exec
without read (--x).

SS JieJi Sept. 8, 2022, 5:28 p.m. UTC | #2
> https://fars.ee/1sPb, showing *inconsistent* read results on -r- pages
> before/after a write attempt performed by the kernel.

That said, maybe prohibiting mmap-ing of -w- pages is not the best fix
for this issue. If -w- pages are irreplaceable for some use cases (and
hence need to be allowed), I'd suggest that we at least need to fix the
read result inconsistency issue somewhere else instead of simply
reverting the patch.

Yours, Pan Ruizhe

Andrew Bresticker Sept. 8, 2022, 6:14 p.m. UTC | #3
On Thu, Sep 8, 2022 at 1:28 PM SS JieJi <c141028@gmail.com> wrote:
>
> > https://fars.ee/1sPb, showing *inconsistent* read results on -r- pages
> > before/after a write attempt performed by the kernel.
>
> That said, maybe prohibiting mmap-ing of -w- pages is not the best fix
> for this issue. If -w- pages are irreplaceable for some use cases (and
> hence need to be allowed), I'd suggest that we at least need to fix the
> read result inconsistency issue somewhere else instead of simply
> reverting the patch.

Ah, this is because do_page_fault() also needs to be made aware of
write-implying-read. Will send a v2 shortly.

-Andrew

>
> Yours, Pan Ruizhe

Patch

diff --git a/arch/riscv/kernel/sys_riscv.c b/arch/riscv/kernel/sys_riscv.c
index 571556bb9261..5d3f2fbeb33c 100644
--- a/arch/riscv/kernel/sys_riscv.c
+++ b/arch/riscv/kernel/sys_riscv.c
@@ -18,9 +18,6 @@  static long riscv_sys_mmap(unsigned long addr, unsigned long len,
 	if (unlikely(offset & (~PAGE_MASK >> page_shift_offset)))
 		return -EINVAL;
 
-	if (unlikely((prot & PROT_WRITE) && !(prot & PROT_READ)))
-		return -EINVAL;
-
 	return ksys_mmap_pgoff(addr, len, prot, flags, fd,
 			       offset >> (PAGE_SHIFT - page_shift_offset));
 }