diff mbox series

[V2,4/4] riscv: mm: Optimize TASK_SIZE definition

Message ID 20231221154702.2267684-5-guoren@kernel.org (mailing list archive)
State Superseded
Headers show
Series riscv: mm: Fixup & Optimize COMPAT code | expand

Checks

Context Check Description
conchuod/vmtest-for-next-PR success PR summary
conchuod/patch-4-test-1 success .github/scripts/patches/build_rv32_defconfig.sh
conchuod/patch-4-test-2 success .github/scripts/patches/build_rv64_clang_allmodconfig.sh
conchuod/patch-4-test-3 success .github/scripts/patches/build_rv64_gcc_allmodconfig.sh
conchuod/patch-4-test-4 success .github/scripts/patches/build_rv64_nommu_k210_defconfig.sh
conchuod/patch-4-test-5 success .github/scripts/patches/build_rv64_nommu_virt_defconfig.sh
conchuod/patch-4-test-6 success .github/scripts/patches/checkpatch.sh
conchuod/patch-4-test-7 success .github/scripts/patches/dtb_warn_rv64.sh
conchuod/patch-4-test-8 success .github/scripts/patches/header_inline.sh
conchuod/patch-4-test-9 success .github/scripts/patches/kdoc.sh
conchuod/patch-4-test-10 success .github/scripts/patches/module_param.sh
conchuod/patch-4-test-11 success .github/scripts/patches/verify_fixes.sh
conchuod/patch-4-test-12 success .github/scripts/patches/verify_signedoff.sh

Commit Message

Guo Ren Dec. 21, 2023, 3:47 p.m. UTC
From: Guo Ren <guoren@linux.alibaba.com>

Unify the TASK_SIZE definition with VA_BITS for better readability.
Add COMPAT mode user address space info in the comment.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 arch/riscv/include/asm/pgtable.h | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

Comments

Leonardo Bras Dec. 22, 2023, 5:09 a.m. UTC | #1
On Thu, Dec 21, 2023 at 10:47:01AM -0500, guoren@kernel.org wrote:
> From: Guo Ren <guoren@linux.alibaba.com>
> 
> Unify the TASK_SIZE definition with VA_BITS for better readability.
> Add COMPAT mode user address space info in the comment.
> 
> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> Signed-off-by: Guo Ren <guoren@kernel.org>
> ---
>  arch/riscv/include/asm/pgtable.h | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> index e415582276ec..d165ddae3b42 100644
> --- a/arch/riscv/include/asm/pgtable.h
> +++ b/arch/riscv/include/asm/pgtable.h
> @@ -866,6 +866,7 @@ static inline pte_t pte_swp_clear_exclusive(pte_t pte)
>   * Note that PGDIR_SIZE must evenly divide TASK_SIZE.
>   * Task size is:
>   * -        0x9fc00000	(~2.5GB) for RV32.
> + * -        0x80000000	(   2GB) for RV64 compat mode
>   * -      0x4000000000	( 256GB) for RV64 using SV39 mmu
>   * -    0x800000000000	( 128TB) for RV64 using SV48 mmu
>   * - 0x100000000000000	(  64PB) for RV64 using SV57 mmu
> @@ -877,11 +878,11 @@ static inline pte_t pte_swp_clear_exclusive(pte_t pte)
>   * Similarly for SV57, bits 63–57 must be equal to bit 56.
>   */
>  #ifdef CONFIG_64BIT
> -#define TASK_SIZE_64	(PGDIR_SIZE * PTRS_PER_PGD / 2)
> +#define TASK_SIZE_64	(UL(1) << (VA_BITS - 1))

Checked for l5, l4 and l3, and it seems a correct replacement.

>  
>  #ifdef CONFIG_COMPAT
> -#define TASK_SIZE_32	(_AC(0x80000000, UL))
> -#define TASK_SIZE	(test_thread_flag(TIF_32BIT) ? \
> +#define TASK_SIZE_32	(UL(1) << (VA_BITS_SV32 - 1))

Oh, much better. Thanks for removing the magic number :)

> +#define TASK_SIZE	(is_compat_task() ? \
>  			 TASK_SIZE_32 : TASK_SIZE_64)
>  #else
>  #define TASK_SIZE	TASK_SIZE_64
> -- 
> 2.40.1
> 

That's much more readable IMO now. Thanks!

FWIW:
Reviewed-by: Leonardo Bras <leobras@redhat.com>
Guo Ren Dec. 22, 2023, 11:25 a.m. UTC | #2
On Fri, Dec 22, 2023 at 1:09 PM Leonardo Bras <leobras@redhat.com> wrote:
>
> On Thu, Dec 21, 2023 at 10:47:01AM -0500, guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > Unify the TASK_SIZE definition with VA_BITS for better readability.
> > Add COMPAT mode user address space info in the comment.
> >
> > Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> > Signed-off-by: Guo Ren <guoren@kernel.org>
> > ---
> >  arch/riscv/include/asm/pgtable.h | 7 ++++---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> > index e415582276ec..d165ddae3b42 100644
> > --- a/arch/riscv/include/asm/pgtable.h
> > +++ b/arch/riscv/include/asm/pgtable.h
> > @@ -866,6 +866,7 @@ static inline pte_t pte_swp_clear_exclusive(pte_t pte)
> >   * Note that PGDIR_SIZE must evenly divide TASK_SIZE.
> >   * Task size is:
> >   * -        0x9fc00000       (~2.5GB) for RV32.
> > + * -        0x80000000       (   2GB) for RV64 compat mode
> >   * -      0x4000000000       ( 256GB) for RV64 using SV39 mmu
> >   * -    0x800000000000       ( 128TB) for RV64 using SV48 mmu
> >   * - 0x100000000000000       (  64PB) for RV64 using SV57 mmu
> > @@ -877,11 +878,11 @@ static inline pte_t pte_swp_clear_exclusive(pte_t pte)
> >   * Similarly for SV57, bits 63–57 must be equal to bit 56.
> >   */
> >  #ifdef CONFIG_64BIT
> > -#define TASK_SIZE_64 (PGDIR_SIZE * PTRS_PER_PGD / 2)
> > +#define TASK_SIZE_64 (UL(1) << (VA_BITS - 1))
>
> Checked for l5, l4 and l3, and it seems a correct replacement.
>
> >
> >  #ifdef CONFIG_COMPAT
> > -#define TASK_SIZE_32 (_AC(0x80000000, UL))
> > -#define TASK_SIZE    (test_thread_flag(TIF_32BIT) ? \
> > +#define TASK_SIZE_32 (UL(1) << (VA_BITS_SV32 - 1))
>
> Oh, much better. Thanks for removing the magic number :)
>
> > +#define TASK_SIZE    (is_compat_task() ? \
> >                        TASK_SIZE_32 : TASK_SIZE_64)
I would remove is_compat_task() in the next version because your patch
contains that.

> >  #else
> >  #define TASK_SIZE    TASK_SIZE_64
> > --
> > 2.40.1
> >
>
> That's much more readable IMO now. Thanks!
>
> FWIW:
> Reviewed-by: Leonardo Bras <leobras@redhat.com>
>
David Laight Dec. 22, 2023, 11:52 a.m. UTC | #3
From: Guo Ren
> Sent: 22 December 2023 11:25
...
> > > +#define TASK_SIZE    (is_compat_task() ? \
> > >                        TASK_SIZE_32 : TASK_SIZE_64)
> I would remove is_compat_task() in the next version because your patch
> contains that.

Does TASK_SIZE get used in access_ok() ?
If so the repeated expansion of that 'mess' will slow things down.

OTOH access_ok(ptr, len) can just check (ptr | (ptr + len)) < 0)
and rely on the page faults for everything else.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Guo Ren Dec. 23, 2023, 2:38 a.m. UTC | #4
Hi David,

On Fri, Dec 22, 2023 at 7:52 PM David Laight <David.Laight@aculab.com> wrote:
>
> From: Guo Ren
> > Sent: 22 December 2023 11:25
> ...
> > > > +#define TASK_SIZE    (is_compat_task() ? \
> > > >                        TASK_SIZE_32 : TASK_SIZE_64)
> > I would remove is_compat_task() in the next version because your patch
> > contains that.
>
> Does TASK_SIZE get used in access_ok() ?
> If so the repeated expansion of that 'mess' will slow things down.
>
> OTOH access_ok(ptr, len) can just check (ptr | (ptr + len)) < 0)
> and rely on the page faults for everything else.
I mean, I would remove is_compat_task() optimization.
test_thread_flag(TIF_32BIT) -> (is_compat_task() ?
Sorry for the bad wording.

Leonardo's new patch series contains the optimization on
is_compat_task(), so I canceled mine.


>
>         David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
Guo Ren Dec. 23, 2023, 2:52 a.m. UTC | #5
On Fri, Dec 22, 2023 at 7:52 PM David Laight <David.Laight@aculab.com> wrote:
>
> From: Guo Ren
> > Sent: 22 December 2023 11:25
> ...
> > > > +#define TASK_SIZE    (is_compat_task() ? \
> > > >                        TASK_SIZE_32 : TASK_SIZE_64)
> > I would remove is_compat_task() in the next version because your patch
> > contains that.
>
> Does TASK_SIZE get used in access_ok() ?
> If so the repeated expansion of that 'mess' will slow things down.
>
> OTOH access_ok(ptr, len) can just check (ptr | (ptr + len)) < 0)
> and rely on the page faults for everything else.
Or do you want to discuss the bad side effect of is_compat_task()?

Yes, test_thread_flag(TIF_32BIT) would slow down access_ok(). But if
we use TASK_SIZE_MAX, VA_BITS still needs pgtable_l5_enabled,
pgtable_l4_enabled detectation for riscv.

It's not only for compat mode, but also Sv39, Sv48, Sv57. All treat
TASK_SIZE_MAX as 0x8000000000000000, right? Then:
access_ok(ptr, len) can just check (ptr | (ptr + len)) < 0)

It's another feature and does not relate to compat mode.

>
>         David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
David Laight Dec. 23, 2023, 10:31 a.m. UTC | #6
From: Guo Ren
> Sent: 23 December 2023 02:53
> 
> On Fri, Dec 22, 2023 at 7:52 PM David Laight <David.Laight@aculab.com> wrote:
> >
> > From: Guo Ren
> > > Sent: 22 December 2023 11:25
> > ...
> > > > > +#define TASK_SIZE    (is_compat_task() ? \
> > > > >                        TASK_SIZE_32 : TASK_SIZE_64)
> > > I would remove is_compat_task() in the next version because your patch
> > > contains that.
> >
> > Does TASK_SIZE get used in access_ok() ?
> > If so the repeated expansion of that 'mess' will slow things down.
> >
> > OTOH access_ok(ptr, len) can just check (ptr | (ptr + len)) < 0)
> > and rely on the page faults for everything else.
> Or do you want to discuss the bad side effect of is_compat_task()?
> 
> Yes, test_thread_flag(TIF_32BIT) would slow down access_ok(). But if
> we use TASK_SIZE_MAX, VA_BITS still needs pgtable_l5_enabled,
> pgtable_l4_enabled detectation for riscv.
> 
> It's not only for compat mode, but also Sv39, Sv48, Sv57. All treat
> TASK_SIZE_MAX as 0x8000000000000000, right? Then:
> access_ok(ptr, len) can just check (ptr | (ptr + len)) < 0)
> 
> It's another feature and does not relate to compat mode.

Compat mode just makes it worse...

One possibility would be to save the task's max user address
in the task structure itself - that would save all the conditionals
at a cost of an extra value in the task structure.

There is also the question of whether a normally 64-bit task
can actually make the compat mmap() system call?
On x86 that is certainly possible (IIRC wine does it), x86
userspace can flip between 32bit and 63bit mode without a
system call.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Guo Ren Dec. 24, 2023, 1:24 a.m. UTC | #7
On Sat, Dec 23, 2023 at 6:31 PM David Laight <David.Laight@aculab.com> wrote:
>
> From: Guo Ren
> > Sent: 23 December 2023 02:53
> >
> > On Fri, Dec 22, 2023 at 7:52 PM David Laight <David.Laight@aculab.com> wrote:
> > >
> > > From: Guo Ren
> > > > Sent: 22 December 2023 11:25
> > > ...
> > > > > > +#define TASK_SIZE    (is_compat_task() ? \
> > > > > >                        TASK_SIZE_32 : TASK_SIZE_64)
> > > > I would remove is_compat_task() in the next version because your patch
> > > > contains that.
> > >
> > > Does TASK_SIZE get used in access_ok() ?
> > > If so the repeated expansion of that 'mess' will slow things down.
> > >
> > > OTOH access_ok(ptr, len) can just check (ptr | (ptr + len)) < 0)
> > > and rely on the page faults for everything else.
> > Or do you want to discuss the bad side effect of is_compat_task()?
> >
> > Yes, test_thread_flag(TIF_32BIT) would slow down access_ok(). But if
> > we use TASK_SIZE_MAX, VA_BITS still needs pgtable_l5_enabled,
> > pgtable_l4_enabled detectation for riscv.
> >
> > It's not only for compat mode, but also Sv39, Sv48, Sv57. All treat
> > TASK_SIZE_MAX as 0x8000000000000000, right? Then:
> > access_ok(ptr, len) can just check (ptr | (ptr + len)) < 0)
> >
> > It's another feature and does not relate to compat mode.
>
> Compat mode just makes it worse...
It's hard to observe.

>
> One possibility would be to save the task's max user address
> in the task structure itself - that would save all the conditionals
> at a cost of an extra value in the task structure.
It would still cause memory load operation, although it is $tp->xxx.
If we want to gain observability benefits, "just check (ptr | (ptr +
len)) < 0)" is better.

>
> There is also the question of whether a normally 64-bit task
> can actually make the compat mmap() system call?
No.

> On x86 that is certainly possible (IIRC wine does it), x86
> userspace can flip between 32bit and 63bit mode without a
> system call.
RISC-V can't do that because it needs sstatux.uxl=32/64, which can
only be modified in S-mode.


>
>         David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
David Laight Dec. 24, 2023, 2:37 p.m. UTC | #8
From: Guo Ren
> Sent: 24 December 2023 01:24
...
> > One possibility would be to save the task's max user address
> > in the task structure itself - that would save all the conditionals
> > at a cost of an extra value in the task structure.

> It would still cause memory load operation, although it is $tp->xxx.

All the (mispredicted) branches are likely to cause more of a
problem than a load from the current task structure.

> If we want to gain observability benefits, "just check (ptr | (ptr +
> len)) < 0)" is better.

If you can guarantee a faulting page between user and kernel addresses
and assume (check) that the accesses are 'reasonably sequential'
then you only need to check the base address.
That is likely hard for 32bit but easier for 64bit (except arm64)
because A63 and A62 have to match.
Unless you have some hardware address masking which makes it much
more likely that 'random values' will be valid addresses.
(Someone remind me why that is a good idea unless the high bits
are validated by the hardware.)

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
diff mbox series

Patch

diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index e415582276ec..d165ddae3b42 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -866,6 +866,7 @@  static inline pte_t pte_swp_clear_exclusive(pte_t pte)
  * Note that PGDIR_SIZE must evenly divide TASK_SIZE.
  * Task size is:
  * -        0x9fc00000	(~2.5GB) for RV32.
+ * -        0x80000000	(   2GB) for RV64 compat mode
  * -      0x4000000000	( 256GB) for RV64 using SV39 mmu
  * -    0x800000000000	( 128TB) for RV64 using SV48 mmu
  * - 0x100000000000000	(  64PB) for RV64 using SV57 mmu
@@ -877,11 +878,11 @@  static inline pte_t pte_swp_clear_exclusive(pte_t pte)
  * Similarly for SV57, bits 63–57 must be equal to bit 56.
  */
 #ifdef CONFIG_64BIT
-#define TASK_SIZE_64	(PGDIR_SIZE * PTRS_PER_PGD / 2)
+#define TASK_SIZE_64	(UL(1) << (VA_BITS - 1))
 
 #ifdef CONFIG_COMPAT
-#define TASK_SIZE_32	(_AC(0x80000000, UL))
-#define TASK_SIZE	(test_thread_flag(TIF_32BIT) ? \
+#define TASK_SIZE_32	(UL(1) << (VA_BITS_SV32 - 1))
+#define TASK_SIZE	(is_compat_task() ? \
 			 TASK_SIZE_32 : TASK_SIZE_64)
 #else
 #define TASK_SIZE	TASK_SIZE_64