diff mbox series

[v3,13/29] riscv mmu: write protect and shadow stack

Message ID 20240403234054.2020347-14-debug@rivosinc.com (mailing list archive)
State RFC
Headers show
Series riscv control-flow integrity for usermode | expand

Checks

Context Check Description
conchuod/vmtest-fixes-PR fail merge-conflict

Commit Message

Deepak Gupta April 3, 2024, 11:35 p.m. UTC
`fork` implements copy on write (COW) by making pages readonly in child
and parent both.

ptep_set_wrprotect and pte_wrprotect clears _PAGE_WRITE in PTE.
Assumption is that page is readable and on fault copy on write happens.

To implement COW on such pages, clearing up W bit makes them XWR = 000.
This will result in wrong PTE setting which says no perms but V=1 and PFN
field pointing to final page. Instead desired behavior is to turn it into
a readable page, take an access (load/store) fault on sspush/sspop
(shadow stack) and then perform COW on such pages. This way regular reads
would still be allowed and not lead to COW maintaining current behavior
of COW on non-shadow stack but writeable memory.

On the other hand it doesn't interfere with existing COW for read-write
memory. Assumption is always that _PAGE_READ must have been set and thus
setting _PAGE_READ is harmless.

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
---
 arch/riscv/include/asm/pgtable.h | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

Comments

Alexandre Ghiti May 12, 2024, 4:31 p.m. UTC | #1
On 04/04/2024 01:35, Deepak Gupta wrote:
> `fork` implements copy on write (COW) by making pages readonly in child
> and parent both.
>
> ptep_set_wrprotect and pte_wrprotect clears _PAGE_WRITE in PTE.
> Assumption is that page is readable and on fault copy on write happens.
>
> To implement COW on such pages,


I guess you mean "shadow stack pages" here.


>   clearing up W bit makes them XWR = 000.
> This will result in wrong PTE setting which says no perms but V=1 and PFN
> field pointing to final page. Instead desired behavior is to turn it into
> a readable page, take an access (load/store) fault on sspush/sspop
> (shadow stack) and then perform COW on such pages.
> This way regular reads
> would still be allowed and not lead to COW maintaining current behavior
> of COW on non-shadow stack but writeable memory.
>
> On the other hand it doesn't interfere with existing COW for read-write
> memory. Assumption is always that _PAGE_READ must have been set and thus
> setting _PAGE_READ is harmless.
>
> Signed-off-by: Deepak Gupta <debug@rivosinc.com>
> ---
>   arch/riscv/include/asm/pgtable.h | 12 ++++++++++--
>   1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> index 9b837239d3e8..7a1c2a98d272 100644
> --- a/arch/riscv/include/asm/pgtable.h
> +++ b/arch/riscv/include/asm/pgtable.h
> @@ -398,7 +398,7 @@ static inline int pte_special(pte_t pte)
>   
>   static inline pte_t pte_wrprotect(pte_t pte)
>   {
> -	return __pte(pte_val(pte) & ~(_PAGE_WRITE));
> +	return __pte((pte_val(pte) & ~(_PAGE_WRITE)) | (_PAGE_READ));
>   }
>   
>   /* static inline pte_t pte_mkread(pte_t pte) */
> @@ -581,7 +581,15 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
>   static inline void ptep_set_wrprotect(struct mm_struct *mm,
>   				      unsigned long address, pte_t *ptep)
>   {
> -	atomic_long_and(~(unsigned long)_PAGE_WRITE, (atomic_long_t *)ptep);
> +	volatile pte_t read_pte = *ptep;
> +	/*
> +	 * ptep_set_wrprotect can be called for shadow stack ranges too.
> +	 * shadow stack memory is XWR = 010 and thus clearing _PAGE_WRITE will lead to
> +	 * encoding 000b which is wrong encoding with V = 1. This should lead to page fault
> +	 * but we dont want this wrong configuration to be set in page tables.
> +	 */
> +	atomic_long_set((atomic_long_t *)ptep,
> +			((pte_val(read_pte) & ~(unsigned long)_PAGE_WRITE) | _PAGE_READ));
>   }
>   
>   #define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH


Doesn't making the shadow stack page readable allow "normal" loads to 
access the page? If it does, isn't that an issue (security-wise)?
Deepak Gupta May 13, 2024, 5:32 p.m. UTC | #2
On Sun, May 12, 2024 at 06:31:24PM +0200, Alexandre Ghiti wrote:
>On 04/04/2024 01:35, Deepak Gupta wrote:
>>`fork` implements copy on write (COW) by making pages readonly in child
>>and parent both.
>>
>>ptep_set_wrprotect and pte_wrprotect clears _PAGE_WRITE in PTE.
>>Assumption is that page is readable and on fault copy on write happens.
>>
>>To implement COW on such pages,
>
>
>I guess you mean "shadow stack pages" here.

Yes I meant shadow stack pages. Will fix the message.

>
>
>>  clearing up W bit makes them XWR = 000.
>>This will result in wrong PTE setting which says no perms but V=1 and PFN
>>field pointing to final page. Instead desired behavior is to turn it into
>>a readable page, take an access (load/store) fault on sspush/sspop
>>(shadow stack) and then perform COW on such pages.
>>This way regular reads
>>would still be allowed and not lead to COW maintaining current behavior
>>of COW on non-shadow stack but writeable memory.
>>
>>On the other hand it doesn't interfere with existing COW for read-write
>>memory. Assumption is always that _PAGE_READ must have been set and thus
>>setting _PAGE_READ is harmless.
>>
>>Signed-off-by: Deepak Gupta <debug@rivosinc.com>
>>---
>>  arch/riscv/include/asm/pgtable.h | 12 ++++++++++--
>>  1 file changed, 10 insertions(+), 2 deletions(-)
>>
>>diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
>>index 9b837239d3e8..7a1c2a98d272 100644
>>--- a/arch/riscv/include/asm/pgtable.h
>>+++ b/arch/riscv/include/asm/pgtable.h
>>@@ -398,7 +398,7 @@ static inline int pte_special(pte_t pte)
>>  static inline pte_t pte_wrprotect(pte_t pte)
>>  {
>>-	return __pte(pte_val(pte) & ~(_PAGE_WRITE));
>>+	return __pte((pte_val(pte) & ~(_PAGE_WRITE)) | (_PAGE_READ));
>>  }
>>  /* static inline pte_t pte_mkread(pte_t pte) */
>>@@ -581,7 +581,15 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
>>  static inline void ptep_set_wrprotect(struct mm_struct *mm,
>>  				      unsigned long address, pte_t *ptep)
>>  {
>>-	atomic_long_and(~(unsigned long)_PAGE_WRITE, (atomic_long_t *)ptep);
>>+	volatile pte_t read_pte = *ptep;
>>+	/*
>>+	 * ptep_set_wrprotect can be called for shadow stack ranges too.
>>+	 * shadow stack memory is XWR = 010 and thus clearing _PAGE_WRITE will lead to
>>+	 * encoding 000b which is wrong encoding with V = 1. This should lead to page fault
>>+	 * but we dont want this wrong configuration to be set in page tables.
>>+	 */
>>+	atomic_long_set((atomic_long_t *)ptep,
>>+			((pte_val(read_pte) & ~(unsigned long)_PAGE_WRITE) | _PAGE_READ));
>>  }
>>  #define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
>
>
>Doesn't making the shadow stack page readable allow "normal" loads to 
>access the page? If it does, isn't that an issue (security-wise)?

When shadow stack permissions are there (i.e. R=0, W=1, X=0), then also shadow stack is
readable through "normal" loads. So nothing changes when it converts into a readonly page
from page permissions perspective.

Security-wise it's not a concern because from threat modeling perspective, if attacker had
read-write primitives (via some bug in program) available to read and write address space
of process/task; then they would have availiblity of return addresses on normal stack. It's
the write primitive that is concerning and to be protected against. And that's why shadow stack
is not writeable using "normal" stores.

>
Alexandre Ghiti May 23, 2024, 2:59 p.m. UTC | #3
Hi Deepak,

On Mon, May 13, 2024 at 7:32 PM Deepak Gupta <debug@rivosinc.com> wrote:
>
> On Sun, May 12, 2024 at 06:31:24PM +0200, Alexandre Ghiti wrote:
> >On 04/04/2024 01:35, Deepak Gupta wrote:
> >>`fork` implements copy on write (COW) by making pages readonly in child
> >>and parent both.
> >>
> >>ptep_set_wrprotect and pte_wrprotect clears _PAGE_WRITE in PTE.
> >>Assumption is that page is readable and on fault copy on write happens.
> >>
> >>To implement COW on such pages,
> >
> >
> >I guess you mean "shadow stack pages" here.
>
> Yes I meant shadow stack pages. Will fix the message.
>
> >
> >
> >>  clearing up W bit makes them XWR = 000.
> >>This will result in wrong PTE setting which says no perms but V=1 and PFN
> >>field pointing to final page. Instead desired behavior is to turn it into
> >>a readable page, take an access (load/store) fault on sspush/sspop
> >>(shadow stack) and then perform COW on such pages.
> >>This way regular reads
> >>would still be allowed and not lead to COW maintaining current behavior
> >>of COW on non-shadow stack but writeable memory.
> >>
> >>On the other hand it doesn't interfere with existing COW for read-write
> >>memory. Assumption is always that _PAGE_READ must have been set and thus
> >>setting _PAGE_READ is harmless.
> >>
> >>Signed-off-by: Deepak Gupta <debug@rivosinc.com>
> >>---
> >>  arch/riscv/include/asm/pgtable.h | 12 ++++++++++--
> >>  1 file changed, 10 insertions(+), 2 deletions(-)
> >>
> >>diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> >>index 9b837239d3e8..7a1c2a98d272 100644
> >>--- a/arch/riscv/include/asm/pgtable.h
> >>+++ b/arch/riscv/include/asm/pgtable.h
> >>@@ -398,7 +398,7 @@ static inline int pte_special(pte_t pte)
> >>  static inline pte_t pte_wrprotect(pte_t pte)
> >>  {
> >>-     return __pte(pte_val(pte) & ~(_PAGE_WRITE));
> >>+     return __pte((pte_val(pte) & ~(_PAGE_WRITE)) | (_PAGE_READ));
> >>  }
> >>  /* static inline pte_t pte_mkread(pte_t pte) */
> >>@@ -581,7 +581,15 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
> >>  static inline void ptep_set_wrprotect(struct mm_struct *mm,
> >>                                    unsigned long address, pte_t *ptep)
> >>  {
> >>-     atomic_long_and(~(unsigned long)_PAGE_WRITE, (atomic_long_t *)ptep);
> >>+     volatile pte_t read_pte = *ptep;

Sorry I missed this ^. You need to use ptep_get() to get the value of
a pte. And why do you need the volatile here?

> >>+     /*
> >>+      * ptep_set_wrprotect can be called for shadow stack ranges too.
> >>+      * shadow stack memory is XWR = 010 and thus clearing _PAGE_WRITE will lead to
> >>+      * encoding 000b which is wrong encoding with V = 1. This should lead to page fault
> >>+      * but we dont want this wrong configuration to be set in page tables.
> >>+      */
> >>+     atomic_long_set((atomic_long_t *)ptep,
> >>+                     ((pte_val(read_pte) & ~(unsigned long)_PAGE_WRITE) | _PAGE_READ));
> >>  }
> >>  #define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
> >
> >
> >Doesn't making the shadow stack page readable allow "normal" loads to
> >access the page? If it does, isn't that an issue (security-wise)?
>
> When shadow stack permissions are there (i.e. R=0, W=1, X=0), then also shadow stack is
> readable through "normal" loads. So nothing changes when it converts into a readonly page
> from page permissions perspective.
>
> Security-wise it's not a concern because from threat modeling perspective, if attacker had
> read-write primitives (via some bug in program) available to read and write address space
> of process/task; then they would have availiblity of return addresses on normal stack. It's
> the write primitive that is concerning and to be protected against. And that's why shadow stack
> is not writeable using "normal" stores.
>
> >

Thanks for the explanation!

With the use of ptep_get(), you can add:

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>

Thanks,

Alex
Deepak Gupta May 24, 2024, 7:16 a.m. UTC | #4
On Thu, May 23, 2024 at 04:59:30PM +0200, Alexandre Ghiti wrote:
>Hi Deepak,
>
>On Mon, May 13, 2024 at 7:32 PM Deepak Gupta <debug@rivosinc.com> wrote:
>>
>> On Sun, May 12, 2024 at 06:31:24PM +0200, Alexandre Ghiti wrote:
>> >On 04/04/2024 01:35, Deepak Gupta wrote:
>> >>`fork` implements copy on write (COW) by making pages readonly in child
>> >>and parent both.
>> >>
>> >>ptep_set_wrprotect and pte_wrprotect clears _PAGE_WRITE in PTE.
>> >>Assumption is that page is readable and on fault copy on write happens.
>> >>
>> >>To implement COW on such pages,
>> >
>> >
>> >I guess you mean "shadow stack pages" here.
>>
>> Yes I meant shadow stack pages. Will fix the message.
>>
>> >
>> >
>> >>  clearing up W bit makes them XWR = 000.
>> >>This will result in wrong PTE setting which says no perms but V=1 and PFN
>> >>field pointing to final page. Instead desired behavior is to turn it into
>> >>a readable page, take an access (load/store) fault on sspush/sspop
>> >>(shadow stack) and then perform COW on such pages.
>> >>This way regular reads
>> >>would still be allowed and not lead to COW maintaining current behavior
>> >>of COW on non-shadow stack but writeable memory.
>> >>
>> >>On the other hand it doesn't interfere with existing COW for read-write
>> >>memory. Assumption is always that _PAGE_READ must have been set and thus
>> >>setting _PAGE_READ is harmless.
>> >>
>> >>Signed-off-by: Deepak Gupta <debug@rivosinc.com>
>> >>---
>> >>  arch/riscv/include/asm/pgtable.h | 12 ++++++++++--
>> >>  1 file changed, 10 insertions(+), 2 deletions(-)
>> >>
>> >>diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
>> >>index 9b837239d3e8..7a1c2a98d272 100644
>> >>--- a/arch/riscv/include/asm/pgtable.h
>> >>+++ b/arch/riscv/include/asm/pgtable.h
>> >>@@ -398,7 +398,7 @@ static inline int pte_special(pte_t pte)
>> >>  static inline pte_t pte_wrprotect(pte_t pte)
>> >>  {
>> >>-     return __pte(pte_val(pte) & ~(_PAGE_WRITE));
>> >>+     return __pte((pte_val(pte) & ~(_PAGE_WRITE)) | (_PAGE_READ));
>> >>  }
>> >>  /* static inline pte_t pte_mkread(pte_t pte) */
>> >>@@ -581,7 +581,15 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
>> >>  static inline void ptep_set_wrprotect(struct mm_struct *mm,
>> >>                                    unsigned long address, pte_t *ptep)
>> >>  {
>> >>-     atomic_long_and(~(unsigned long)_PAGE_WRITE, (atomic_long_t *)ptep);
>> >>+     volatile pte_t read_pte = *ptep;
>
>Sorry I missed this ^. You need to use ptep_get() to get the value of
>a pte. 

Noted. will fix it.

>And why do you need the volatile here?

I don't remember the reason. It's probably not needed here.
But I am sure I was debugging something and trying everything.
And this probably slipped sanitization before sending patches.

Will fix it.
>
>> >>+     /*
>> >>+      * ptep_set_wrprotect can be called for shadow stack ranges too.
>> >>+      * shadow stack memory is XWR = 010 and thus clearing _PAGE_WRITE will lead to
>> >>+      * encoding 000b which is wrong encoding with V = 1. This should lead to page fault
>> >>+      * but we dont want this wrong configuration to be set in page tables.
>> >>+      */
>> >>+     atomic_long_set((atomic_long_t *)ptep,
>> >>+                     ((pte_val(read_pte) & ~(unsigned long)_PAGE_WRITE) | _PAGE_READ));
>> >>  }
>> >>  #define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
>> >
>> >
>> >Doesn't making the shadow stack page readable allow "normal" loads to
>> >access the page? If it does, isn't that an issue (security-wise)?
>>
>> When shadow stack permissions are there (i.e. R=0, W=1, X=0), then also shadow stack is
>> readable through "normal" loads. So nothing changes when it converts into a readonly page
>> from page permissions perspective.
>>
>> Security-wise it's not a concern because from threat modeling perspective, if attacker had
>> read-write primitives (via some bug in program) available to read and write address space
>> of process/task; then they would have availiblity of return addresses on normal stack. It's
>> the write primitive that is concerning and to be protected against. And that's why shadow stack
>> is not writeable using "normal" stores.
>>
>> >
>
>Thanks for the explanation!
>
>With the use of ptep_get(), you can add:
>
>Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
>
>Thanks,
>
>Alex
diff mbox series

Patch

diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 9b837239d3e8..7a1c2a98d272 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -398,7 +398,7 @@  static inline int pte_special(pte_t pte)
 
 static inline pte_t pte_wrprotect(pte_t pte)
 {
-	return __pte(pte_val(pte) & ~(_PAGE_WRITE));
+	return __pte((pte_val(pte) & ~(_PAGE_WRITE)) | (_PAGE_READ));
 }
 
 /* static inline pte_t pte_mkread(pte_t pte) */
@@ -581,7 +581,15 @@  static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 static inline void ptep_set_wrprotect(struct mm_struct *mm,
 				      unsigned long address, pte_t *ptep)
 {
-	atomic_long_and(~(unsigned long)_PAGE_WRITE, (atomic_long_t *)ptep);
+	volatile pte_t read_pte = *ptep;
+	/*
+	 * ptep_set_wrprotect can be called for shadow stack ranges too.
+	 * shadow stack memory is XWR = 010 and thus clearing _PAGE_WRITE will lead to
+	 * encoding 000b which is wrong encoding with V = 1. This should lead to page fault
+	 * but we dont want this wrong configuration to be set in page tables.
+	 */
+	atomic_long_set((atomic_long_t *)ptep,
+			((pte_val(read_pte) & ~(unsigned long)_PAGE_WRITE) | _PAGE_READ));
 }
 
 #define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH