diff mbox series

[2/2] mm/hugetlb: Fix cow where page writtable in child

Message ID 20210501144110.8784-3-peterx@redhat.com (mailing list archive)
State New, archived
Headers show
Series mm/hugetlb: Fix issues on file sealing and fork | expand

Commit Message

Peter Xu May 1, 2021, 2:41 p.m. UTC
When fork() and copy hugetlb page range, we'll remember to wrprotect src pte if
needed, however we forget about the child!  Without it, the child will be able
to write to parent's pages when mapped as PROT_READ|PROT_WRITE and MAP_PRIVATE,
which will cause data corruption in the parent process.

This issue can also be exposed by "memfd_test hugetlbfs" kselftest (if it can
pass the F_SEAL_FUTURE_WRITE test first, though).

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 mm/hugetlb.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Mike Kravetz May 3, 2021, 8:53 p.m. UTC | #1
On 5/1/21 7:41 AM, Peter Xu wrote:
> When fork() and copy hugetlb page range, we'll remember to wrprotect src pte if
> needed, however we forget about the child!  Without it, the child will be able
> to write to parent's pages when mapped as PROT_READ|PROT_WRITE and MAP_PRIVATE,
> which will cause data corruption in the parent process.
> 
> This issue can also be exposed by "memfd_test hugetlbfs" kselftest (if it can
> pass the F_SEAL_FUTURE_WRITE test first, though).
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  mm/hugetlb.c | 2 ++
>  1 file changed, 2 insertions(+)

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

I think we need to add, "Fixes: 4eae4efa2c29" as this is now in v5.12
Peter Xu May 3, 2021, 9:41 p.m. UTC | #2
On Mon, May 03, 2021 at 01:53:03PM -0700, Mike Kravetz wrote:
> On 5/1/21 7:41 AM, Peter Xu wrote:
> > When fork() and copy hugetlb page range, we'll remember to wrprotect src pte if
> > needed, however we forget about the child!  Without it, the child will be able
> > to write to parent's pages when mapped as PROT_READ|PROT_WRITE and MAP_PRIVATE,
> > which will cause data corruption in the parent process.
> > 
> > This issue can also be exposed by "memfd_test hugetlbfs" kselftest (if it can
> > pass the F_SEAL_FUTURE_WRITE test first, though).
> > 
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  mm/hugetlb.c | 2 ++
> >  1 file changed, 2 insertions(+)
> 
> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

Thanks!

> 
> I think we need to add, "Fixes: 4eae4efa2c29" as this is now in v5.12

I could be mistaken, but my understanding is it's broken from the most initial
cow support of hugetlbfs in 2006...  So if we want a fixes tag, maybe this?

Fixes: 1e8f889b10d8d ("[PATCH] Hugetlb: Copy on Write support")
Mike Kravetz May 3, 2021, 10:10 p.m. UTC | #3
On 5/3/21 2:41 PM, Peter Xu wrote:
> On Mon, May 03, 2021 at 01:53:03PM -0700, Mike Kravetz wrote:
>> On 5/1/21 7:41 AM, Peter Xu wrote:
>>> When fork() and copy hugetlb page range, we'll remember to wrprotect src pte if
>>> needed, however we forget about the child!  Without it, the child will be able
>>> to write to parent's pages when mapped as PROT_READ|PROT_WRITE and MAP_PRIVATE,
>>> which will cause data corruption in the parent process.
>>>
>>> This issue can also be exposed by "memfd_test hugetlbfs" kselftest (if it can
>>> pass the F_SEAL_FUTURE_WRITE test first, though).
>>>
>>> Signed-off-by: Peter Xu <peterx@redhat.com>
>>> ---
>>>  mm/hugetlb.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>
>> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
> 
> Thanks!
> 
>>
>> I think we need to add, "Fixes: 4eae4efa2c29" as this is now in v5.12
> 
> I could be mistaken, but my understanding is it's broken from the most initial
> cow support of hugetlbfs in 2006...  So if we want a fixes tag, maybe this?
> 
> Fixes: 1e8f889b10d8d ("[PATCH] Hugetlb: Copy on Write support")
> 

Here is why I think it was broken in 4eae4efa2c29.  Prior to that commit
the code looked like this:

			if (cow) {
				/*
				 * No need to notify as we are downgrading page
				 * table protection not changing it to point
				 * to a new page.
				 *
				 * See Documentation/vm/mmu_notifier.rst
				 */
				huge_ptep_set_wrprotect(src, addr, src_pte);
			}
			entry = huge_ptep_get(src_pte);
			ptepage = pte_page(entry);
			get_page(ptepage);
			page_dup_rmap(ptepage, true);
			set_huge_pte_at(dst, addr, dst_pte, entry);
			hugetlb_count_add(pages_per_huge_page(h), dst);

After setting the wrprotect in the source pte, we 'huge_ptep_get' the
source to create the destination.  Hence, wrprotect will be set in the
destination as well.  It is perhaps not the most efficient, but
I think it 'works'.

It is subtle, or am I missing something?
Peter Xu May 3, 2021, 10:24 p.m. UTC | #4
On Mon, May 03, 2021 at 03:10:04PM -0700, Mike Kravetz wrote:
> On 5/3/21 2:41 PM, Peter Xu wrote:
> > On Mon, May 03, 2021 at 01:53:03PM -0700, Mike Kravetz wrote:
> >> On 5/1/21 7:41 AM, Peter Xu wrote:
> >>> When fork() and copy hugetlb page range, we'll remember to wrprotect src pte if
> >>> needed, however we forget about the child!  Without it, the child will be able
> >>> to write to parent's pages when mapped as PROT_READ|PROT_WRITE and MAP_PRIVATE,
> >>> which will cause data corruption in the parent process.
> >>>
> >>> This issue can also be exposed by "memfd_test hugetlbfs" kselftest (if it can
> >>> pass the F_SEAL_FUTURE_WRITE test first, though).
> >>>
> >>> Signed-off-by: Peter Xu <peterx@redhat.com>
> >>> ---
> >>>  mm/hugetlb.c | 2 ++
> >>>  1 file changed, 2 insertions(+)
> >>
> >> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
> > 
> > Thanks!
> > 
> >>
> >> I think we need to add, "Fixes: 4eae4efa2c29" as this is now in v5.12
> > 
> > I could be mistaken, but my understanding is it's broken from the most initial
> > cow support of hugetlbfs in 2006...  So if we want a fixes tag, maybe this?
> > 
> > Fixes: 1e8f889b10d8d ("[PATCH] Hugetlb: Copy on Write support")
> > 
> 
> Here is why I think it was broken in 4eae4efa2c29.  Prior to that commit
> the code looked like this:
> 
> 			if (cow) {
> 				/*
> 				 * No need to notify as we are downgrading page
> 				 * table protection not changing it to point
> 				 * to a new page.
> 				 *
> 				 * See Documentation/vm/mmu_notifier.rst
> 				 */
> 				huge_ptep_set_wrprotect(src, addr, src_pte);
> 			}
> 			entry = huge_ptep_get(src_pte);
> 			ptepage = pte_page(entry);
> 			get_page(ptepage);
> 			page_dup_rmap(ptepage, true);
> 			set_huge_pte_at(dst, addr, dst_pte, entry);
> 			hugetlb_count_add(pages_per_huge_page(h), dst);
> 
> After setting the wrprotect in the source pte, we 'huge_ptep_get' the
> source to create the destination.  Hence, wrprotect will be set in the
> destination as well.  It is perhaps not the most efficient, but
> I think it 'works'.
> 
> It is subtle, or am I missing something?

You're right, thanks Mike.  I'll repost and add correct fixes tag.
diff mbox series

Patch

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 629aa4c2259c8..9978fb73b8caf 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4056,6 +4056,8 @@  int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 				 * See Documentation/vm/mmu_notifier.rst
 				 */
 				huge_ptep_set_wrprotect(src, addr, src_pte);
+				/* Child cannot write too! */
+				entry = huge_pte_wrprotect(entry);
 			}
 
 			page_dup_rmap(ptepage, true);