| Message ID | 20220901173516.702122-8-surenb@google.com (mailing list archive) |
| --- | --- |
| State | New |
| Series | per-VMA locks proposal |
On 01/09/2022 at 19:34, Suren Baghdasaryan wrote:
> Protect VMAs from the concurrent page fault handler while performing
> copy_page_range() for VMAs that do not have the VM_WIPEONFORK flag set.

I'm wondering why that is necessary.
The copied mm is write-locked, and the destination one is not reachable.
If any other readers are using the VMA, it is only for page fault handling.

I must have missed something, because I can't see any need to mark the VMA
locked here.

> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> ---
>  kernel/fork.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index bfab31ecd11e..1872ad549fed 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -709,8 +709,10 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
>  		rb_parent = &tmp->vm_rb;
>
>  		mm->map_count++;
> -		if (!(tmp->vm_flags & VM_WIPEONFORK))
> +		if (!(tmp->vm_flags & VM_WIPEONFORK)) {
> +			vma_mark_locked(mpnt);
>  			retval = copy_page_range(tmp, mpnt);
> +		}
>
>  		if (tmp->vm_ops && tmp->vm_ops->open)
>  			tmp->vm_ops->open(tmp);
On Tue, Sep 6, 2022 at 7:38 AM Laurent Dufour <ldufour@linux.ibm.com> wrote:
>
> On 01/09/2022 at 19:34, Suren Baghdasaryan wrote:
> > Protect VMAs from the concurrent page fault handler while performing
> > copy_page_range() for VMAs that do not have the VM_WIPEONFORK flag set.
>
> I'm wondering why that is necessary.
> The copied mm is write-locked, and the destination one is not reachable.
> If any other readers are using the VMA, it is only for page fault handling.

Correct, this is done to prevent page faulting in the VMA being
duplicated. I assume we want to prevent the pages in that VMA from
changing while we are calling copy_page_range(). Am I wrong?

> I must have missed something, because I can't see any need to mark the VMA
> locked here.
>
> [...]
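For readers following the thread, this is how the marking blocks concurrent faults. The sketch below is a rough approximation of the mark/check scheme from earlier patches in this series, not the exact patch code: the names vma_mark_locked(), vma->lock, and vm_lock_seq/mm_lock_seq follow the RFC, but the bodies here are simplified assumptions.

```c
/*
 * Approximate sketch of the per-VMA lock scheme (simplified, not the
 * exact patch code). Each mm carries a sequence count, mm_lock_seq,
 * which is bumped when mmap_lock is write-unlocked; that releases all
 * marked VMAs in bulk. Marking a VMA stamps the current count into it.
 */
static inline void vma_mark_locked(struct vm_area_struct *vma)
{
	int mm_lock_seq;

	mmap_assert_write_locked(vma->vm_mm);

	mm_lock_seq = READ_ONCE(vma->vm_mm->mm_lock_seq);
	/* Already marked during this mmap_lock write cycle. */
	if (vma->vm_lock_seq == mm_lock_seq)
		return;

	down_write(&vma->lock);
	vma->vm_lock_seq = mm_lock_seq;
	up_write(&vma->lock);
}

/* Fault-handler side: succeeds only if the VMA is not marked locked. */
static inline bool vma_read_trylock(struct vm_area_struct *vma)
{
	if (unlikely(!down_read_trylock(&vma->lock)))
		return false;

	/* Equal counts mean the VMA was marked under a still-held write lock. */
	if (unlikely(vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))) {
		up_read(&vma->lock);
		return false;
	}
	return true;
}
```

Under that scheme, once dup_mmap() has marked mpnt while holding mmap_lock for write, a concurrent fault fails vma_read_trylock() and has to fall back to mmap_lock, blocking until the fork-time copy completes.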
On 09/09/2022 at 01:57, Suren Baghdasaryan wrote:
> On Tue, Sep 6, 2022 at 7:38 AM Laurent Dufour <ldufour@linux.ibm.com> wrote:
>>
>> On 01/09/2022 at 19:34, Suren Baghdasaryan wrote:
>>> Protect VMAs from the concurrent page fault handler while performing
>>> copy_page_range() for VMAs that do not have the VM_WIPEONFORK flag set.
>>
>> I'm wondering why that is necessary.
>> The copied mm is write-locked, and the destination one is not reachable.
>> If any other readers are using the VMA, it is only for page fault handling.
>
> Correct, this is done to prevent page faulting in the VMA being
> duplicated. I assume we want to prevent the pages in that VMA from
> changing while we are calling copy_page_range(). Am I wrong?

If a page is faulted while copy_page_range() is in progress, the page may
not be backed on the child side (the PTE lock should protect the copy,
shouldn't it?).
Is that a real problem? It will be backed later if accessed on the child
side. Maybe the per-process page accounting could be incorrect...

> [...]
On Fri, Sep 9, 2022 at 6:27 AM Laurent Dufour <ldufour@linux.ibm.com> wrote:
>
> On 09/09/2022 at 01:57, Suren Baghdasaryan wrote:
> > [...]
> > Correct, this is done to prevent page faulting in the VMA being
> > duplicated. I assume we want to prevent the pages in that VMA from
> > changing while we are calling copy_page_range(). Am I wrong?
>
> If a page is faulted while copy_page_range() is in progress, the page may
> not be backed on the child side (the PTE lock should protect the copy,
> shouldn't it?).
> Is that a real problem? It will be backed later if accessed on the child
> side. Maybe the per-process page accounting could be incorrect...

This feels to me like walking on the edge. Maybe we can discuss this with
more people at LPC before trying it?

> [...]
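To make the trade-off being weighed here concrete, below is a sketch of the shape a per-VMA-lock fault path takes under this proposal. It is illustrative only: find_vma() locking is elided, and handle_mm_fault_locked()/handle_mm_fault_vma() are hypothetical stand-ins for the series' actual fault entry points.

```c
/*
 * Illustrative only; the two handle_mm_fault_* helpers are hypothetical
 * stand-ins. Without vma_mark_locked(mpnt) in dup_mmap(), the trylock
 * below could succeed while copy_page_range() walks the same VMA: the
 * PTE lock keeps each individual entry consistent, but a page faulted
 * in behind the copy cursor is not copied to the child, so the child's
 * view and the fork-time RSS accounting can diverge, as discussed
 * above. With the marking, the trylock fails and the fault blocks on
 * mmap_lock, which dup_mmap() holds for write.
 */
static vm_fault_t fault_one(struct mm_struct *mm, unsigned long addr)
{
	struct vm_area_struct *vma = find_vma(mm, addr); /* lookup locking elided */
	vm_fault_t ret;

	if (!vma || !vma_read_trylock(vma)) {
		/* VMA absent or marked locked: serialize against the writer. */
		mmap_read_lock(mm);
		ret = handle_mm_fault_locked(mm, addr);	/* hypothetical */
		mmap_read_unlock(mm);
		return ret;
	}

	ret = handle_mm_fault_vma(vma, addr);		/* hypothetical */
	up_read(&vma->lock);
	return ret;
}
```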
```diff
diff --git a/kernel/fork.c b/kernel/fork.c
index bfab31ecd11e..1872ad549fed 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -709,8 +709,10 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 		rb_parent = &tmp->vm_rb;
 
 		mm->map_count++;
-		if (!(tmp->vm_flags & VM_WIPEONFORK))
+		if (!(tmp->vm_flags & VM_WIPEONFORK)) {
+			vma_mark_locked(mpnt);
 			retval = copy_page_range(tmp, mpnt);
+		}
 
 		if (tmp->vm_ops && tmp->vm_ops->open)
 			tmp->vm_ops->open(tmp);
```
Protect VMAs from the concurrent page fault handler while performing
copy_page_range() for VMAs that do not have the VM_WIPEONFORK flag set.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
 kernel/fork.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)