mbox series

[v3,0/3] RISC-V: mm: do not treat hint addr on mmap as the upper bound to search

Message ID tencent_108260B43689E30AAE5D0C7C085AA31ADF06@qq.com (mailing list archive)
Headers show
Series RISC-V: mm: do not treat hint addr on mmap as the upper bound to search | expand

Message

Yangyu Chen Aug. 27, 2024, 8:05 a.m. UTC
Previous patch series[1][2] changes a mmap behavior that treats the hint
address as the upper bound of the mmap address range. The motivation of the
previous patch series is that some user space software may assume 48-bit
address space and use higher bits to encode some information, which may
collide with large virtual address space mmap may return. However, to make
sv48 by default, we don't need to change the meaning of the hint address on
mmap as the upper bound of the mmap address range. This behavior breaks
some user space software like Chromium that gets ENOMEM error when the hint
address + size is not big enough, as specified in [3].

Other ISAs with larger than 48-bit virtual address space like x86, arm64,
and powerpc do not have this special mmap behavior on hint address. They
all just make 48-bit / 47-bit virtual address space by default, and if a
user space software wants to large virtual address space, it only need to
specify a hint address larger than 48-bit / 47-bit.

Thus, this patch series change mmap to use sv48 by default but does not
treat the hint address as the upper bound of the mmap address range. After
this patch, the behavior of mmap will align with existing behavior on other
ISAs with larger than 48-bit virtual address space like x86, arm64, and
powerpc. The user space software will no longer need to rewrite their code
to fit with this special mmap behavior only on RISC-V.

Note: Charlie also created another series [4] to completely remove the
arch_get_mmap_end and arch_get_mmap_base behavior based on the hint address
and size. However, this will cause programs like Go and Java, which need to
store information in the higher bits of the pointer, to fail on Sv57
machines.

Changes in v3:
- Rebase to newest master
- Changes some information in cover letter after patchset [2]
- Use patch [5] to patch selftests
- Link to v2: https://lore.kernel.org/linux-riscv/tencent_B2D0435BC011135736262764B511994F4805@qq.com/

Changes in v2:
- correct arch_get_mmap_end and arch_get_mmap_base
- Add description in documentation about mmap behavior on kernel v6.6-6.7.
- Improve commit message and cover letter
- Rebase to newest riscv/for-next branch
- Link to v1: https://lore.kernel.org/linux-riscv/tencent_F3B3B5AB1C9D704763CA423E1A41F8BE0509@qq.com/

[1] https://lore.kernel.org/linux-riscv/20230809232218.849726-1-charlie@rivosinc.com/
[2] https://lore.kernel.org/linux-riscv/20240130-use_mmap_hint_address-v3-0-8a655cfa8bcb@rivosinc.com/
[3] https://lore.kernel.org/linux-riscv/MEYP282MB2312A08FF95D44014AB78411C68D2@MEYP282MB2312.AUSP282.PROD.OUTLOOK.COM/
[4] https://lore.kernel.org/linux-riscv/20240826-riscv_mmap-v1-0-cd8962afe47f@rivosinc.com/
[5] https://lore.kernel.org/linux-riscv/20240826-riscv_mmap-v1-2-cd8962afe47f@rivosinc.com/

Charlie Jenkins (1):
  riscv: selftests: Remove mmap hint address checks

Yangyu Chen (2):
  RISC-V: mm: not use hint addr as upper bound
  Documentation: riscv: correct sv57 kernel behavior

 Documentation/arch/riscv/vm-layout.rst        | 43 ++++++++----
 arch/riscv/include/asm/processor.h            | 20 ++----
 .../selftests/riscv/mm/mmap_bottomup.c        |  2 -
 .../testing/selftests/riscv/mm/mmap_default.c |  2 -
 tools/testing/selftests/riscv/mm/mmap_test.h  | 67 -------------------
 5 files changed, 36 insertions(+), 98 deletions(-)

Comments

Palmer Dabbelt Aug. 27, 2024, 4:33 p.m. UTC | #1
On Tue, 27 Aug 2024 01:05:15 PDT (-0700), cyy@cyyself.name wrote:
> Previous patch series[1][2] changes a mmap behavior that treats the hint
> address as the upper bound of the mmap address range. The motivation of the
> previous patch series is that some user space software may assume 48-bit
> address space and use higher bits to encode some information, which may
> collide with large virtual address space mmap may return. However, to make
> sv48 by default, we don't need to change the meaning of the hint address on
> mmap as the upper bound of the mmap address range. This behavior breaks
> some user space software like Chromium that gets ENOMEM error when the hint
> address + size is not big enough, as specified in [3].
>
> Other ISAs with larger than 48-bit virtual address space like x86, arm64,
> and powerpc do not have this special mmap behavior on hint address. They
> all just make 48-bit / 47-bit virtual address space by default, and if a
> user space software wants to large virtual address space, it only need to
> specify a hint address larger than 48-bit / 47-bit.
>
> Thus, this patch series change mmap to use sv48 by default but does not
> treat the hint address as the upper bound of the mmap address range. After
> this patch, the behavior of mmap will align with existing behavior on other
> ISAs with larger than 48-bit virtual address space like x86, arm64, and
> powerpc. The user space software will no longer need to rewrite their code
> to fit with this special mmap behavior only on RISC-V.

So it actually looks like we just screwed up the original version of 
this: the reason we went with the more complicated address splits were 
than we actually started with a defacto 39-bit page table uABI (ie 
38-bit user VAs), and moving to even 48-bit page tables (ie, 47-bit user 
VAs) broke users (here's an ASAN bug, for example: 
https://github.com/google/android-riscv64/issues/64).  

Unless I'm missing something, though, the code doesn't actually do that.  
I remember having that discussion at some point, but I must have 
forgotten to make sure it worked.  As far as I can tell we've just moved 
to the 48-bit VAs by default, which breaks the whole point of doing the 
compatibilty stuff.  Probably a good sign I need to pay more attention 
to this stuff.

So I'm not really sure what to do here: we can just copy the arm64 
behavior at tell the other users that's just how things work, but then 
we're just pushing around breakages.  At a certain point all we can 
really do with this hint stuff is push around problems, though, and at 
least if we copy arm64 then most of those problems get reported as bugs 
for us.

> Note: Charlie also created another series [4] to completely remove the
> arch_get_mmap_end and arch_get_mmap_base behavior based on the hint address
> and size. However, this will cause programs like Go and Java, which need to
> store information in the higher bits of the pointer, to fail on Sv57
> machines.
>
> Changes in v3:
> - Rebase to newest master
> - Changes some information in cover letter after patchset [2]
> - Use patch [5] to patch selftests
> - Link to v2: https://lore.kernel.org/linux-riscv/tencent_B2D0435BC011135736262764B511994F4805@qq.com/
>
> Changes in v2:
> - correct arch_get_mmap_end and arch_get_mmap_base
> - Add description in documentation about mmap behavior on kernel v6.6-6.7.
> - Improve commit message and cover letter
> - Rebase to newest riscv/for-next branch
> - Link to v1: https://lore.kernel.org/linux-riscv/tencent_F3B3B5AB1C9D704763CA423E1A41F8BE0509@qq.com/
>
> [1] https://lore.kernel.org/linux-riscv/20230809232218.849726-1-charlie@rivosinc.com/
> [2] https://lore.kernel.org/linux-riscv/20240130-use_mmap_hint_address-v3-0-8a655cfa8bcb@rivosinc.com/
> [3] https://lore.kernel.org/linux-riscv/MEYP282MB2312A08FF95D44014AB78411C68D2@MEYP282MB2312.AUSP282.PROD.OUTLOOK.COM/
> [4] https://lore.kernel.org/linux-riscv/20240826-riscv_mmap-v1-0-cd8962afe47f@rivosinc.com/
> [5] https://lore.kernel.org/linux-riscv/20240826-riscv_mmap-v1-2-cd8962afe47f@rivosinc.com/
>
> Charlie Jenkins (1):
>   riscv: selftests: Remove mmap hint address checks
>
> Yangyu Chen (2):
>   RISC-V: mm: not use hint addr as upper bound
>   Documentation: riscv: correct sv57 kernel behavior
>
>  Documentation/arch/riscv/vm-layout.rst        | 43 ++++++++----
>  arch/riscv/include/asm/processor.h            | 20 ++----
>  .../selftests/riscv/mm/mmap_bottomup.c        |  2 -
>  .../testing/selftests/riscv/mm/mmap_default.c |  2 -
>  tools/testing/selftests/riscv/mm/mmap_test.h  | 67 -------------------
>  5 files changed, 36 insertions(+), 98 deletions(-)
Charlie Jenkins Aug. 27, 2024, 4:40 p.m. UTC | #2
On Tue, Aug 27, 2024 at 09:33:11AM -0700, Palmer Dabbelt wrote:
> On Tue, 27 Aug 2024 01:05:15 PDT (-0700), cyy@cyyself.name wrote:
> > Previous patch series[1][2] changes a mmap behavior that treats the hint
> > address as the upper bound of the mmap address range. The motivation of the
> > previous patch series is that some user space software may assume 48-bit
> > address space and use higher bits to encode some information, which may
> > collide with large virtual address space mmap may return. However, to make
> > sv48 by default, we don't need to change the meaning of the hint address on
> > mmap as the upper bound of the mmap address range. This behavior breaks
> > some user space software like Chromium that gets ENOMEM error when the hint
> > address + size is not big enough, as specified in [3].
> > 
> > Other ISAs with larger than 48-bit virtual address space like x86, arm64,
> > and powerpc do not have this special mmap behavior on hint address. They
> > all just make 48-bit / 47-bit virtual address space by default, and if a
> > user space software wants to large virtual address space, it only need to
> > specify a hint address larger than 48-bit / 47-bit.
> > 
> > Thus, this patch series change mmap to use sv48 by default but does not
> > treat the hint address as the upper bound of the mmap address range. After
> > this patch, the behavior of mmap will align with existing behavior on other
> > ISAs with larger than 48-bit virtual address space like x86, arm64, and
> > powerpc. The user space software will no longer need to rewrite their code
> > to fit with this special mmap behavior only on RISC-V.
> 
> So it actually looks like we just screwed up the original version of this:
> the reason we went with the more complicated address splits were than we
> actually started with a defacto 39-bit page table uABI (ie 38-bit user VAs),
> and moving to even 48-bit page tables (ie, 47-bit user VAs) broke users
> (here's an ASAN bug, for example:
> https://github.com/google/android-riscv64/issues/64).
> 
> Unless I'm missing something, though, the code doesn't actually do that.  I
> remember having that discussion at some point, but I must have forgotten to
> make sure it worked.  As far as I can tell we've just moved to the 48-bit
> VAs by default, which breaks the whole point of doing the compatibilty
> stuff.  Probably a good sign I need to pay more attention to this stuff.
> 
> So I'm not really sure what to do here: we can just copy the arm64 behavior
> at tell the other users that's just how things work, but then we're just
> pushing around breakages.  At a certain point all we can really do with this
> hint stuff is push around problems, though, and at least if we copy arm64
> then most of those problems get reported as bugs for us.

Relying on the hint address in any capacity will push around breakages
is my perspective as well. I messed this up from the start. I believe
the only way to have consistent behavior is to mark mmap relying on the
hint address as a bug, and only rely on the hint address if a flag
defines the behavior.

There is an awkward window of releases that will have this "buggy"
behavior. However, since the mmap changes introduced a variety of
userspace bugs it seems acceptable to revert to the previous behavior
and to create a consistent path forward.

- Charlie

> 
> > Note: Charlie also created another series [4] to completely remove the
> > arch_get_mmap_end and arch_get_mmap_base behavior based on the hint address
> > and size. However, this will cause programs like Go and Java, which need to
> > store information in the higher bits of the pointer, to fail on Sv57
> > machines.
> > 
> > Changes in v3:
> > - Rebase to newest master
> > - Changes some information in cover letter after patchset [2]
> > - Use patch [5] to patch selftests
> > - Link to v2: https://lore.kernel.org/linux-riscv/tencent_B2D0435BC011135736262764B511994F4805@qq.com/
> > 
> > Changes in v2:
> > - correct arch_get_mmap_end and arch_get_mmap_base
> > - Add description in documentation about mmap behavior on kernel v6.6-6.7.
> > - Improve commit message and cover letter
> > - Rebase to newest riscv/for-next branch
> > - Link to v1: https://lore.kernel.org/linux-riscv/tencent_F3B3B5AB1C9D704763CA423E1A41F8BE0509@qq.com/
> > 
> > [1] https://lore.kernel.org/linux-riscv/20230809232218.849726-1-charlie@rivosinc.com/
> > [2] https://lore.kernel.org/linux-riscv/20240130-use_mmap_hint_address-v3-0-8a655cfa8bcb@rivosinc.com/
> > [3] https://lore.kernel.org/linux-riscv/MEYP282MB2312A08FF95D44014AB78411C68D2@MEYP282MB2312.AUSP282.PROD.OUTLOOK.COM/
> > [4] https://lore.kernel.org/linux-riscv/20240826-riscv_mmap-v1-0-cd8962afe47f@rivosinc.com/
> > [5] https://lore.kernel.org/linux-riscv/20240826-riscv_mmap-v1-2-cd8962afe47f@rivosinc.com/
> > 
> > Charlie Jenkins (1):
> >   riscv: selftests: Remove mmap hint address checks
> > 
> > Yangyu Chen (2):
> >   RISC-V: mm: not use hint addr as upper bound
> >   Documentation: riscv: correct sv57 kernel behavior
> > 
> >  Documentation/arch/riscv/vm-layout.rst        | 43 ++++++++----
> >  arch/riscv/include/asm/processor.h            | 20 ++----
> >  .../selftests/riscv/mm/mmap_bottomup.c        |  2 -
> >  .../testing/selftests/riscv/mm/mmap_default.c |  2 -
> >  tools/testing/selftests/riscv/mm/mmap_test.h  | 67 -------------------
> >  5 files changed, 36 insertions(+), 98 deletions(-)
Yangyu Chen Aug. 27, 2024, 5:47 p.m. UTC | #3
> On Aug 28, 2024, at 00:33, Palmer Dabbelt <palmer@rivosinc.com> wrote:
> 
> On Tue, 27 Aug 2024 01:05:15 PDT (-0700), cyy@cyyself.name wrote:
>> Previous patch series[1][2] changes a mmap behavior that treats the hint
>> address as the upper bound of the mmap address range. The motivation of the
>> previous patch series is that some user space software may assume 48-bit
>> address space and use higher bits to encode some information, which may
>> collide with large virtual address space mmap may return. However, to make
>> sv48 by default, we don't need to change the meaning of the hint address on
>> mmap as the upper bound of the mmap address range. This behavior breaks
>> some user space software like Chromium that gets ENOMEM error when the hint
>> address + size is not big enough, as specified in [3].
>> 
>> Other ISAs with larger than 48-bit virtual address space like x86, arm64,
>> and powerpc do not have this special mmap behavior on hint address. They
>> all just make 48-bit / 47-bit virtual address space by default, and if a
>> user space software wants to large virtual address space, it only need to
>> specify a hint address larger than 48-bit / 47-bit.
>> 
>> Thus, this patch series change mmap to use sv48 by default but does not
>> treat the hint address as the upper bound of the mmap address range. After
>> this patch, the behavior of mmap will align with existing behavior on other
>> ISAs with larger than 48-bit virtual address space like x86, arm64, and
>> powerpc. The user space software will no longer need to rewrite their code
>> to fit with this special mmap behavior only on RISC-V.
> 
> So it actually looks like we just screwed up the original version of this: the reason we went with the more complicated address splits were than we actually started with a defacto 39-bit page table uABI (ie 38-bit user VAs), and moving to even 48-bit page tables (ie, 47-bit user VAs) broke users (here's an ASAN bug, for example: https://github.com/google/android-riscv64/issues/64).  
> Unless I'm missing something, though, the code doesn't actually do that.  I remember having that discussion at some point, but I must have forgotten to make sure it worked.  As far as I can tell we've just moved to the 48-bit VAs by default, which breaks the whole point of doing the compatibilty stuff.  Probably a good sign I need to pay more attention to this stuff.
> 

It seems the issues have been solved in LLVM D139823 [1] and LLVM D152895 [2].

[1] https://reviews.llvm.org/D139823
[2] https://reviews.llvm.org/D152895

> So I'm not really sure what to do here: we can just copy the arm64 behavior at tell the other users that's just how things work, but then we're just pushing around breakages.  At a certain point all we can really do with this hint stuff is push around problems, though, and at least if we copy arm64 then most of those problems get reported as bugs for us.
> 
>> Note: Charlie also created another series [4] to completely remove the
>> arch_get_mmap_end and arch_get_mmap_base behavior based on the hint address
>> and size. However, this will cause programs like Go and Java, which need to
>> store information in the higher bits of the pointer, to fail on Sv57
>> machines.
>> 
>> Changes in v3:
>> - Rebase to newest master
>> - Changes some information in cover letter after patchset [2]
>> - Use patch [5] to patch selftests
>> - Link to v2: https://lore.kernel.org/linux-riscv/tencent_B2D0435BC011135736262764B511994F4805@qq.com/
>> 
>> Changes in v2:
>> - correct arch_get_mmap_end and arch_get_mmap_base
>> - Add description in documentation about mmap behavior on kernel v6.6-6.7.
>> - Improve commit message and cover letter
>> - Rebase to newest riscv/for-next branch
>> - Link to v1: https://lore.kernel.org/linux-riscv/tencent_F3B3B5AB1C9D704763CA423E1A41F8BE0509@qq.com/
>> 
>> [1] https://lore.kernel.org/linux-riscv/20230809232218.849726-1-charlie@rivosinc.com/
>> [2] https://lore.kernel.org/linux-riscv/20240130-use_mmap_hint_address-v3-0-8a655cfa8bcb@rivosinc.com/
>> [3] https://lore.kernel.org/linux-riscv/MEYP282MB2312A08FF95D44014AB78411C68D2@MEYP282MB2312.AUSP282.PROD.OUTLOOK.COM/
>> [4] https://lore.kernel.org/linux-riscv/20240826-riscv_mmap-v1-0-cd8962afe47f@rivosinc.com/
>> [5] https://lore.kernel.org/linux-riscv/20240826-riscv_mmap-v1-2-cd8962afe47f@rivosinc.com/
>> 
>> Charlie Jenkins (1):
>>  riscv: selftests: Remove mmap hint address checks
>> 
>> Yangyu Chen (2):
>>  RISC-V: mm: not use hint addr as upper bound
>>  Documentation: riscv: correct sv57 kernel behavior
>> 
>> Documentation/arch/riscv/vm-layout.rst        | 43 ++++++++----
>> arch/riscv/include/asm/processor.h            | 20 ++----
>> .../selftests/riscv/mm/mmap_bottomup.c        |  2 -
>> .../testing/selftests/riscv/mm/mmap_default.c |  2 -
>> tools/testing/selftests/riscv/mm/mmap_test.h  | 67 -------------------
>> 5 files changed, 36 insertions(+), 98 deletions(-)
Yangyu Chen Aug. 27, 2024, 6:04 p.m. UTC | #4
> On Aug 28, 2024, at 00:40, Charlie Jenkins <charlie@rivosinc.com> wrote:
> 
> On Tue, Aug 27, 2024 at 09:33:11AM -0700, Palmer Dabbelt wrote:
>> On Tue, 27 Aug 2024 01:05:15 PDT (-0700), cyy@cyyself.name wrote:
>>> Previous patch series[1][2] changes a mmap behavior that treats the hint
>>> address as the upper bound of the mmap address range. The motivation of the
>>> previous patch series is that some user space software may assume 48-bit
>>> address space and use higher bits to encode some information, which may
>>> collide with large virtual address space mmap may return. However, to make
>>> sv48 by default, we don't need to change the meaning of the hint address on
>>> mmap as the upper bound of the mmap address range. This behavior breaks
>>> some user space software like Chromium that gets ENOMEM error when the hint
>>> address + size is not big enough, as specified in [3].
>>> 
>>> Other ISAs with larger than 48-bit virtual address space like x86, arm64,
>>> and powerpc do not have this special mmap behavior on hint address. They
>>> all just make 48-bit / 47-bit virtual address space by default, and if a
>>> user space software wants to large virtual address space, it only need to
>>> specify a hint address larger than 48-bit / 47-bit.
>>> 
>>> Thus, this patch series change mmap to use sv48 by default but does not
>>> treat the hint address as the upper bound of the mmap address range. After
>>> this patch, the behavior of mmap will align with existing behavior on other
>>> ISAs with larger than 48-bit virtual address space like x86, arm64, and
>>> powerpc. The user space software will no longer need to rewrite their code
>>> to fit with this special mmap behavior only on RISC-V.
>> 
>> So it actually looks like we just screwed up the original version of this:
>> the reason we went with the more complicated address splits were than we
>> actually started with a defacto 39-bit page table uABI (ie 38-bit user VAs),
>> and moving to even 48-bit page tables (ie, 47-bit user VAs) broke users
>> (here's an ASAN bug, for example:
>> https://github.com/google/android-riscv64/issues/64).
>> 
>> Unless I'm missing something, though, the code doesn't actually do that.  I
>> remember having that discussion at some point, but I must have forgotten to
>> make sure it worked.  As far as I can tell we've just moved to the 48-bit
>> VAs by default, which breaks the whole point of doing the compatibilty
>> stuff.  Probably a good sign I need to pay more attention to this stuff.
>> 
>> So I'm not really sure what to do here: we can just copy the arm64 behavior
>> at tell the other users that's just how things work, but then we're just
>> pushing around breakages.  At a certain point all we can really do with this
>> hint stuff is push around problems, though, and at least if we copy arm64
>> then most of those problems get reported as bugs for us.
> 
> Relying on the hint address in any capacity will push around breakages
> is my perspective as well. I messed this up from the start. I believe
> the only way to have consistent behavior is to mark mmap relying on the
> hint address as a bug, and only rely on the hint address if a flag
> defines the behavior.
> 

I agree with this. However, since we already have this behavior on
x86 and aarch64 for quite a long time, to prevent breaking userspace,
I think we can use this patch and then add a flag like MAP_VA_FULL
to enable full va address in the future.

Thanks,
Yangyu Chen

> There is an awkward window of releases that will have this "buggy"
> behavior. However, since the mmap changes introduced a variety of
> userspace bugs it seems acceptable to revert to the previous behavior
> and to create a consistent path forward.
> 
> - Charlie
> 
>> 
>>> Note: Charlie also created another series [4] to completely remove the
>>> arch_get_mmap_end and arch_get_mmap_base behavior based on the hint address
>>> and size. However, this will cause programs like Go and Java, which need to
>>> store information in the higher bits of the pointer, to fail on Sv57
>>> machines.
>>> 
>>> Changes in v3:
>>> - Rebase to newest master
>>> - Changes some information in cover letter after patchset [2]
>>> - Use patch [5] to patch selftests
>>> - Link to v2: https://lore.kernel.org/linux-riscv/tencent_B2D0435BC011135736262764B511994F4805@qq.com/
>>> 
>>> Changes in v2:
>>> - correct arch_get_mmap_end and arch_get_mmap_base
>>> - Add description in documentation about mmap behavior on kernel v6.6-6.7.
>>> - Improve commit message and cover letter
>>> - Rebase to newest riscv/for-next branch
>>> - Link to v1: https://lore.kernel.org/linux-riscv/tencent_F3B3B5AB1C9D704763CA423E1A41F8BE0509@qq.com/
>>> 
>>> [1] https://lore.kernel.org/linux-riscv/20230809232218.849726-1-charlie@rivosinc.com/
>>> [2] https://lore.kernel.org/linux-riscv/20240130-use_mmap_hint_address-v3-0-8a655cfa8bcb@rivosinc.com/
>>> [3] https://lore.kernel.org/linux-riscv/MEYP282MB2312A08FF95D44014AB78411C68D2@MEYP282MB2312.AUSP282.PROD.OUTLOOK.COM/
>>> [4] https://lore.kernel.org/linux-riscv/20240826-riscv_mmap-v1-0-cd8962afe47f@rivosinc.com/
>>> [5] https://lore.kernel.org/linux-riscv/20240826-riscv_mmap-v1-2-cd8962afe47f@rivosinc.com/
>>> 
>>> Charlie Jenkins (1):
>>>  riscv: selftests: Remove mmap hint address checks
>>> 
>>> Yangyu Chen (2):
>>>  RISC-V: mm: not use hint addr as upper bound
>>>  Documentation: riscv: correct sv57 kernel behavior
>>> 
>>> Documentation/arch/riscv/vm-layout.rst        | 43 ++++++++----
>>> arch/riscv/include/asm/processor.h            | 20 ++----
>>> .../selftests/riscv/mm/mmap_bottomup.c        |  2 -
>>> .../testing/selftests/riscv/mm/mmap_default.c |  2 -
>>> tools/testing/selftests/riscv/mm/mmap_test.h  | 67 -------------------
>>> 5 files changed, 36 insertions(+), 98 deletions(-)
Charlie Jenkins Aug. 27, 2024, 7:35 p.m. UTC | #5
On Wed, Aug 28, 2024 at 02:04:29AM +0800, Yangyu Chen wrote:
> 
> 
> > On Aug 28, 2024, at 00:40, Charlie Jenkins <charlie@rivosinc.com> wrote:
> > 
> > On Tue, Aug 27, 2024 at 09:33:11AM -0700, Palmer Dabbelt wrote:
> >> On Tue, 27 Aug 2024 01:05:15 PDT (-0700), cyy@cyyself.name wrote:
> >>> Previous patch series[1][2] changes a mmap behavior that treats the hint
> >>> address as the upper bound of the mmap address range. The motivation of the
> >>> previous patch series is that some user space software may assume 48-bit
> >>> address space and use higher bits to encode some information, which may
> >>> collide with large virtual address space mmap may return. However, to make
> >>> sv48 by default, we don't need to change the meaning of the hint address on
> >>> mmap as the upper bound of the mmap address range. This behavior breaks
> >>> some user space software like Chromium that gets ENOMEM error when the hint
> >>> address + size is not big enough, as specified in [3].
> >>> 
> >>> Other ISAs with larger than 48-bit virtual address space like x86, arm64,
> >>> and powerpc do not have this special mmap behavior on hint address. They
> >>> all just make 48-bit / 47-bit virtual address space by default, and if a
> >>> user space software wants to large virtual address space, it only need to
> >>> specify a hint address larger than 48-bit / 47-bit.
> >>> 
> >>> Thus, this patch series change mmap to use sv48 by default but does not
> >>> treat the hint address as the upper bound of the mmap address range. After
> >>> this patch, the behavior of mmap will align with existing behavior on other
> >>> ISAs with larger than 48-bit virtual address space like x86, arm64, and
> >>> powerpc. The user space software will no longer need to rewrite their code
> >>> to fit with this special mmap behavior only on RISC-V.
> >> 
> >> So it actually looks like we just screwed up the original version of this:
> >> the reason we went with the more complicated address splits were than we
> >> actually started with a defacto 39-bit page table uABI (ie 38-bit user VAs),
> >> and moving to even 48-bit page tables (ie, 47-bit user VAs) broke users
> >> (here's an ASAN bug, for example:
> >> https://github.com/google/android-riscv64/issues/64).
> >> 
> >> Unless I'm missing something, though, the code doesn't actually do that.  I
> >> remember having that discussion at some point, but I must have forgotten to
> >> make sure it worked.  As far as I can tell we've just moved to the 48-bit
> >> VAs by default, which breaks the whole point of doing the compatibilty
> >> stuff.  Probably a good sign I need to pay more attention to this stuff.
> >> 
> >> So I'm not really sure what to do here: we can just copy the arm64 behavior
> >> at tell the other users that's just how things work, but then we're just
> >> pushing around breakages.  At a certain point all we can really do with this
> >> hint stuff is push around problems, though, and at least if we copy arm64
> >> then most of those problems get reported as bugs for us.
> > 
> > Relying on the hint address in any capacity will push around breakages
> > is my perspective as well. I messed this up from the start. I believe
> > the only way to have consistent behavior is to mark mmap relying on the
> > hint address as a bug, and only rely on the hint address if a flag
> > defines the behavior.
> > 
> 
> I agree with this. However, since we already have this behavior on
> x86 and aarch64 for quite a long time, to prevent breaking userspace,
> I think we can use this patch and then add a flag like MAP_VA_FULL
> to enable full va address in the future.

Since riscv is not x86 or aarch64, we should be able to make decisions
that are best for riscv regardless of if it is identical to how it is
implemented in x86 or aarch64.

- Charlie

> 
> Thanks,
> Yangyu Chen
> 
> > There is an awkward window of releases that will have this "buggy"
> > behavior. However, since the mmap changes introduced a variety of
> > userspace bugs it seems acceptable to revert to the previous behavior
> > and to create a consistent path forward.
> > 
> > - Charlie
> > 
> >> 
> >>> Note: Charlie also created another series [4] to completely remove the
> >>> arch_get_mmap_end and arch_get_mmap_base behavior based on the hint address
> >>> and size. However, this will cause programs like Go and Java, which need to
> >>> store information in the higher bits of the pointer, to fail on Sv57
> >>> machines.
> >>> 
> >>> Changes in v3:
> >>> - Rebase to newest master
> >>> - Changes some information in cover letter after patchset [2]
> >>> - Use patch [5] to patch selftests
> >>> - Link to v2: https://lore.kernel.org/linux-riscv/tencent_B2D0435BC011135736262764B511994F4805@qq.com/
> >>> 
> >>> Changes in v2:
> >>> - correct arch_get_mmap_end and arch_get_mmap_base
> >>> - Add description in documentation about mmap behavior on kernel v6.6-6.7.
> >>> - Improve commit message and cover letter
> >>> - Rebase to newest riscv/for-next branch
> >>> - Link to v1: https://lore.kernel.org/linux-riscv/tencent_F3B3B5AB1C9D704763CA423E1A41F8BE0509@qq.com/
> >>> 
> >>> [1] https://lore.kernel.org/linux-riscv/20230809232218.849726-1-charlie@rivosinc.com/
> >>> [2] https://lore.kernel.org/linux-riscv/20240130-use_mmap_hint_address-v3-0-8a655cfa8bcb@rivosinc.com/
> >>> [3] https://lore.kernel.org/linux-riscv/MEYP282MB2312A08FF95D44014AB78411C68D2@MEYP282MB2312.AUSP282.PROD.OUTLOOK.COM/
> >>> [4] https://lore.kernel.org/linux-riscv/20240826-riscv_mmap-v1-0-cd8962afe47f@rivosinc.com/
> >>> [5] https://lore.kernel.org/linux-riscv/20240826-riscv_mmap-v1-2-cd8962afe47f@rivosinc.com/
> >>> 
> >>> Charlie Jenkins (1):
> >>>  riscv: selftests: Remove mmap hint address checks
> >>> 
> >>> Yangyu Chen (2):
> >>>  RISC-V: mm: not use hint addr as upper bound
> >>>  Documentation: riscv: correct sv57 kernel behavior
> >>> 
> >>> Documentation/arch/riscv/vm-layout.rst        | 43 ++++++++----
> >>> arch/riscv/include/asm/processor.h            | 20 ++----
> >>> .../selftests/riscv/mm/mmap_bottomup.c        |  2 -
> >>> .../testing/selftests/riscv/mm/mmap_default.c |  2 -
> >>> tools/testing/selftests/riscv/mm/mmap_test.h  | 67 -------------------
> >>> 5 files changed, 36 insertions(+), 98 deletions(-)
> 
>
patchwork-bot+linux-riscv@kernel.org Sept. 3, 2024, 2:30 p.m. UTC | #6
Hello:

This series was applied to riscv/linux.git (fixes)
by Palmer Dabbelt <palmer@rivosinc.com>:

On Tue, 27 Aug 2024 16:05:15 +0800 you wrote:
> Previous patch series[1][2] changes a mmap behavior that treats the hint
> address as the upper bound of the mmap address range. The motivation of the
> previous patch series is that some user space software may assume 48-bit
> address space and use higher bits to encode some information, which may
> collide with large virtual address space mmap may return. However, to make
> sv48 by default, we don't need to change the meaning of the hint address on
> mmap as the upper bound of the mmap address range. This behavior breaks
> some user space software like Chromium that gets ENOMEM error when the hint
> address + size is not big enough, as specified in [3].
> 
> [...]

Here is the summary with links:
  - [v3,1/3] riscv: selftests: Remove mmap hint address checks
    https://git.kernel.org/riscv/c/83dae72ac038
  - [v3,2/3] RISC-V: mm: not use hint addr as upper bound
    (no matching commit)
  - [v3,3/3] Documentation: riscv: correct sv57 kernel behavior
    (no matching commit)

You are awesome, thank you!