Message ID | 239432a0bc56464e58a6baf3622fdc72526c8d57.1724226076.git.zhengqi.arch@bytedance.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | introduce pte_offset_map_{readonly|maywrite}_nolock() | expand |
Le 21/08/2024 à 10:18, Qi Zheng a écrit : > In handle_pte_fault(), we may modify the vmf->pte after acquiring the > vmf->ptl, so convert it to using pte_offset_map_maywrite_nolock(). But > since we already do the pte_same() check, so there is no need to get > pmdval to do pmd_same() check, just pass NULL to pmdvalp parameter. > > Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> > --- > mm/memory.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/mm/memory.c b/mm/memory.c > index 93c0c25433d02..d3378e98faf13 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -5504,9 +5504,14 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf) > * pmd by anon khugepaged, since that takes mmap_lock in write > * mode; but shmem or file collapse to THP could still morph > * it into a huge pmd: just retry later if so. > + * > + * Use the maywrite version to indicate that vmf->pte will be > + * modified, but since we will use pte_same() to detect the > + * change of the pte entry, there is no need to get pmdval. > */ > - vmf->pte = pte_offset_map_nolock(vmf->vma->vm_mm, vmf->pmd, > - vmf->address, &vmf->ptl); > + vmf->pte = pte_offset_map_maywrite_nolock(vmf->vma->vm_mm, > + vmf->pmd, vmf->address, > + NULL, &vmf->ptl); This might be the demonstration that the function name is becoming too long. Can you find shorter names ? > if (unlikely(!vmf->pte)) > return 0; > vmf->orig_pte = ptep_get_lockless(vmf->pte);
On 2024/8/21 17:17, LEROY Christophe wrote: > > > Le 21/08/2024 à 10:18, Qi Zheng a écrit : >> In handle_pte_fault(), we may modify the vmf->pte after acquiring the >> vmf->ptl, so convert it to using pte_offset_map_maywrite_nolock(). But >> since we already do the pte_same() check, so there is no need to get >> pmdval to do pmd_same() check, just pass NULL to pmdvalp parameter. >> >> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> >> --- >> mm/memory.c | 9 +++++++-- >> 1 file changed, 7 insertions(+), 2 deletions(-) >> >> diff --git a/mm/memory.c b/mm/memory.c >> index 93c0c25433d02..d3378e98faf13 100644 >> --- a/mm/memory.c >> +++ b/mm/memory.c >> @@ -5504,9 +5504,14 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf) >> * pmd by anon khugepaged, since that takes mmap_lock in write >> * mode; but shmem or file collapse to THP could still morph >> * it into a huge pmd: just retry later if so. >> + * >> + * Use the maywrite version to indicate that vmf->pte will be >> + * modified, but since we will use pte_same() to detect the >> + * change of the pte entry, there is no need to get pmdval. >> */ >> - vmf->pte = pte_offset_map_nolock(vmf->vma->vm_mm, vmf->pmd, >> - vmf->address, &vmf->ptl); >> + vmf->pte = pte_offset_map_maywrite_nolock(vmf->vma->vm_mm, >> + vmf->pmd, vmf->address, >> + NULL, &vmf->ptl); > > This might be the demonstration that the function name is becoming too long. > > Can you find shorter names ? Maybe use abbreviations? pte_offset_map_ro_nolock() pte_offset_map_rw_nolock() > >> if (unlikely(!vmf->pte)) >> return 0; >> vmf->orig_pte = ptep_get_lockless(vmf->pte);
On 21.08.24 11:24, Qi Zheng wrote: > > > On 2024/8/21 17:17, LEROY Christophe wrote: >> >> >> Le 21/08/2024 à 10:18, Qi Zheng a écrit : >>> In handle_pte_fault(), we may modify the vmf->pte after acquiring the >>> vmf->ptl, so convert it to using pte_offset_map_maywrite_nolock(). But >>> since we already do the pte_same() check, so there is no need to get >>> pmdval to do pmd_same() check, just pass NULL to pmdvalp parameter. >>> >>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> >>> --- >>> mm/memory.c | 9 +++++++-- >>> 1 file changed, 7 insertions(+), 2 deletions(-) >>> >>> diff --git a/mm/memory.c b/mm/memory.c >>> index 93c0c25433d02..d3378e98faf13 100644 >>> --- a/mm/memory.c >>> +++ b/mm/memory.c >>> @@ -5504,9 +5504,14 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf) >>> * pmd by anon khugepaged, since that takes mmap_lock in write >>> * mode; but shmem or file collapse to THP could still morph >>> * it into a huge pmd: just retry later if so. >>> + * >>> + * Use the maywrite version to indicate that vmf->pte will be >>> + * modified, but since we will use pte_same() to detect the >>> + * change of the pte entry, there is no need to get pmdval. >>> */ >>> - vmf->pte = pte_offset_map_nolock(vmf->vma->vm_mm, vmf->pmd, >>> - vmf->address, &vmf->ptl); >>> + vmf->pte = pte_offset_map_maywrite_nolock(vmf->vma->vm_mm, >>> + vmf->pmd, vmf->address, >>> + NULL, &vmf->ptl); I think we discussed that passing NULL should be forbidden for that function. >> >> This might be the demonstration that the function name is becoming too long. >> >> Can you find shorter names ? > > Maybe use abbreviations? > > pte_offset_map_ro_nolock() > pte_offset_map_rw_nolock() At least the "ro" is better, but "rw" does not express the "maywrite" -- because without taking the lock we are not allowed to write. But maybe "rw" is good enough for that if we document it properly. And you can use up to 100 characters, if it helps readability.
On 2024/8/21 17:41, David Hildenbrand wrote: > On 21.08.24 11:24, Qi Zheng wrote: >> >> >> On 2024/8/21 17:17, LEROY Christophe wrote: >>> >>> >>> Le 21/08/2024 à 10:18, Qi Zheng a écrit : >>>> In handle_pte_fault(), we may modify the vmf->pte after acquiring the >>>> vmf->ptl, so convert it to using pte_offset_map_maywrite_nolock(). But >>>> since we already do the pte_same() check, so there is no need to get >>>> pmdval to do pmd_same() check, just pass NULL to pmdvalp parameter. >>>> >>>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> >>>> --- >>>> mm/memory.c | 9 +++++++-- >>>> 1 file changed, 7 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/mm/memory.c b/mm/memory.c >>>> index 93c0c25433d02..d3378e98faf13 100644 >>>> --- a/mm/memory.c >>>> +++ b/mm/memory.c >>>> @@ -5504,9 +5504,14 @@ static vm_fault_t handle_pte_fault(struct >>>> vm_fault *vmf) >>>> * pmd by anon khugepaged, since that takes mmap_lock in >>>> write >>>> * mode; but shmem or file collapse to THP could still >>>> morph >>>> * it into a huge pmd: just retry later if so. >>>> + * >>>> + * Use the maywrite version to indicate that vmf->pte will be >>>> + * modified, but since we will use pte_same() to detect the >>>> + * change of the pte entry, there is no need to get pmdval. >>>> */ >>>> - vmf->pte = pte_offset_map_nolock(vmf->vma->vm_mm, vmf->pmd, >>>> - vmf->address, &vmf->ptl); >>>> + vmf->pte = pte_offset_map_maywrite_nolock(vmf->vma->vm_mm, >>>> + vmf->pmd, vmf->address, >>>> + NULL, &vmf->ptl); > > I think we discussed that passing NULL should be forbidden for that > function. Yes, but for some maywrite case, there is no need to get pmdval to do pmd_same() check. So I passed NULL and added a comment to explain this. > >>> >>> This might be the demonstration that the function name is becoming >>> too long. >>> >>> Can you find shorter names ? >> >> Maybe use abbreviations? >> >> pte_offset_map_ro_nolock() >> pte_offset_map_rw_nolock() > > At least the "ro" is better, but "rw" does not express the "maywrite" -- > because without taking the lock we are not allowed to write. But maybe > "rw" is good enough for that if we document it properly. OK, will change to it in the next version. > > And you can use up to 100 characters, if it helps readability Got it. >
On 21.08.24 11:51, Qi Zheng wrote: > > > On 2024/8/21 17:41, David Hildenbrand wrote: >> On 21.08.24 11:24, Qi Zheng wrote: >>> >>> >>> On 2024/8/21 17:17, LEROY Christophe wrote: >>>> >>>> >>>> Le 21/08/2024 à 10:18, Qi Zheng a écrit : >>>>> In handle_pte_fault(), we may modify the vmf->pte after acquiring the >>>>> vmf->ptl, so convert it to using pte_offset_map_maywrite_nolock(). But >>>>> since we already do the pte_same() check, so there is no need to get >>>>> pmdval to do pmd_same() check, just pass NULL to pmdvalp parameter. >>>>> >>>>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> >>>>> --- >>>>> mm/memory.c | 9 +++++++-- >>>>> 1 file changed, 7 insertions(+), 2 deletions(-) >>>>> >>>>> diff --git a/mm/memory.c b/mm/memory.c >>>>> index 93c0c25433d02..d3378e98faf13 100644 >>>>> --- a/mm/memory.c >>>>> +++ b/mm/memory.c >>>>> @@ -5504,9 +5504,14 @@ static vm_fault_t handle_pte_fault(struct >>>>> vm_fault *vmf) >>>>> * pmd by anon khugepaged, since that takes mmap_lock in >>>>> write >>>>> * mode; but shmem or file collapse to THP could still >>>>> morph >>>>> * it into a huge pmd: just retry later if so. >>>>> + * >>>>> + * Use the maywrite version to indicate that vmf->pte will be >>>>> + * modified, but since we will use pte_same() to detect the >>>>> + * change of the pte entry, there is no need to get pmdval. >>>>> */ >>>>> - vmf->pte = pte_offset_map_nolock(vmf->vma->vm_mm, vmf->pmd, >>>>> - vmf->address, &vmf->ptl); >>>>> + vmf->pte = pte_offset_map_maywrite_nolock(vmf->vma->vm_mm, >>>>> + vmf->pmd, vmf->address, >>>>> + NULL, &vmf->ptl); >> >> I think we discussed that passing NULL should be forbidden for that >> function. > > Yes, but for some maywrite case, there is no need to get pmdval to > do pmd_same() check. So I passed NULL and added a comment to > explain this. I wonder if it's better to pass a dummy variable instead. One has to think harder why that is required compared to blindly passing "NULL" :)
On 2024/8/21 17:53, David Hildenbrand wrote: > On 21.08.24 11:51, Qi Zheng wrote: >> >> >> On 2024/8/21 17:41, David Hildenbrand wrote: >>> On 21.08.24 11:24, Qi Zheng wrote: >>>> >>>> >>>> On 2024/8/21 17:17, LEROY Christophe wrote: >>>>> >>>>> >>>>> Le 21/08/2024 à 10:18, Qi Zheng a écrit : >>>>>> In handle_pte_fault(), we may modify the vmf->pte after acquiring the >>>>>> vmf->ptl, so convert it to using pte_offset_map_maywrite_nolock(). >>>>>> But >>>>>> since we already do the pte_same() check, so there is no need to get >>>>>> pmdval to do pmd_same() check, just pass NULL to pmdvalp parameter. >>>>>> >>>>>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> >>>>>> --- >>>>>> mm/memory.c | 9 +++++++-- >>>>>> 1 file changed, 7 insertions(+), 2 deletions(-) >>>>>> >>>>>> diff --git a/mm/memory.c b/mm/memory.c >>>>>> index 93c0c25433d02..d3378e98faf13 100644 >>>>>> --- a/mm/memory.c >>>>>> +++ b/mm/memory.c >>>>>> @@ -5504,9 +5504,14 @@ static vm_fault_t handle_pte_fault(struct >>>>>> vm_fault *vmf) >>>>>> * pmd by anon khugepaged, since that takes mmap_lock in >>>>>> write >>>>>> * mode; but shmem or file collapse to THP could still >>>>>> morph >>>>>> * it into a huge pmd: just retry later if so. >>>>>> + * >>>>>> + * Use the maywrite version to indicate that vmf->pte >>>>>> will be >>>>>> + * modified, but since we will use pte_same() to detect the >>>>>> + * change of the pte entry, there is no need to get pmdval. >>>>>> */ >>>>>> - vmf->pte = pte_offset_map_nolock(vmf->vma->vm_mm, vmf->pmd, >>>>>> - vmf->address, &vmf->ptl); >>>>>> + vmf->pte = pte_offset_map_maywrite_nolock(vmf->vma->vm_mm, >>>>>> + vmf->pmd, vmf->address, >>>>>> + NULL, &vmf->ptl); >>> >>> I think we discussed that passing NULL should be forbidden for that >>> function. >> >> Yes, but for some maywrite case, there is no need to get pmdval to >> do pmd_same() check. So I passed NULL and added a comment to >> explain this. > > I wonder if it's better to pass a dummy variable instead. One has to > think harder why that is required compared to blindly passing "NULL" :) You are afraid that subsequent caller will abuse this function, right? My initial concern was that this would add a useless local vaiable, but perhaps that is not a big deal. Both are fine for me. ;) >
On 21.08.24 12:03, Qi Zheng wrote: > > > On 2024/8/21 17:53, David Hildenbrand wrote: >> On 21.08.24 11:51, Qi Zheng wrote: >>> >>> >>> On 2024/8/21 17:41, David Hildenbrand wrote: >>>> On 21.08.24 11:24, Qi Zheng wrote: >>>>> >>>>> >>>>> On 2024/8/21 17:17, LEROY Christophe wrote: >>>>>> >>>>>> >>>>>> Le 21/08/2024 à 10:18, Qi Zheng a écrit : >>>>>>> In handle_pte_fault(), we may modify the vmf->pte after acquiring the >>>>>>> vmf->ptl, so convert it to using pte_offset_map_maywrite_nolock(). >>>>>>> But >>>>>>> since we already do the pte_same() check, so there is no need to get >>>>>>> pmdval to do pmd_same() check, just pass NULL to pmdvalp parameter. >>>>>>> >>>>>>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> >>>>>>> --- >>>>>>> mm/memory.c | 9 +++++++-- >>>>>>> 1 file changed, 7 insertions(+), 2 deletions(-) >>>>>>> >>>>>>> diff --git a/mm/memory.c b/mm/memory.c >>>>>>> index 93c0c25433d02..d3378e98faf13 100644 >>>>>>> --- a/mm/memory.c >>>>>>> +++ b/mm/memory.c >>>>>>> @@ -5504,9 +5504,14 @@ static vm_fault_t handle_pte_fault(struct >>>>>>> vm_fault *vmf) >>>>>>> * pmd by anon khugepaged, since that takes mmap_lock in >>>>>>> write >>>>>>> * mode; but shmem or file collapse to THP could still >>>>>>> morph >>>>>>> * it into a huge pmd: just retry later if so. >>>>>>> + * >>>>>>> + * Use the maywrite version to indicate that vmf->pte >>>>>>> will be >>>>>>> + * modified, but since we will use pte_same() to detect the >>>>>>> + * change of the pte entry, there is no need to get pmdval. >>>>>>> */ >>>>>>> - vmf->pte = pte_offset_map_nolock(vmf->vma->vm_mm, vmf->pmd, >>>>>>> - vmf->address, &vmf->ptl); >>>>>>> + vmf->pte = pte_offset_map_maywrite_nolock(vmf->vma->vm_mm, >>>>>>> + vmf->pmd, vmf->address, >>>>>>> + NULL, &vmf->ptl); >>>> >>>> I think we discussed that passing NULL should be forbidden for that >>>> function. >>> >>> Yes, but for some maywrite case, there is no need to get pmdval to >>> do pmd_same() check. So I passed NULL and added a comment to >>> explain this. >> >> I wonder if it's better to pass a dummy variable instead. One has to >> think harder why that is required compared to blindly passing "NULL" :) > > You are afraid that subsequent caller will abuse this function, right? Yes! "oh, I don't need a pmdval, why would I? let's just pass NULL, easy" :) > My initial concern was that this would add a useless local vaiable, but > perhaps that is not a big deal. How many of these "special" instances do we have? > > Both are fine for me. ;) Also no strong opinion, but having to pass a variable makes you think what you are supposed to do with it and why it is not optional.
Hi David, On 2024/8/22 17:29, David Hildenbrand wrote: > On 21.08.24 12:03, Qi Zheng wrote: >> >> [...] >>>>>>>> - vmf->pte = pte_offset_map_nolock(vmf->vma->vm_mm, >>>>>>>> vmf->pmd, >>>>>>>> - vmf->address, &vmf->ptl); >>>>>>>> + vmf->pte = pte_offset_map_maywrite_nolock(vmf->vma->vm_mm, >>>>>>>> + vmf->pmd, vmf->address, >>>>>>>> + NULL, &vmf->ptl); >>>>> >>>>> I think we discussed that passing NULL should be forbidden for that >>>>> function. >>>> >>>> Yes, but for some maywrite case, there is no need to get pmdval to >>>> do pmd_same() check. So I passed NULL and added a comment to >>>> explain this. >>> >>> I wonder if it's better to pass a dummy variable instead. One has to >>> think harder why that is required compared to blindly passing "NULL" :) >> >> You are afraid that subsequent caller will abuse this function, right? > > Yes! "oh, I don't need a pmdval, why would I? let's just pass NULL, > easy" :) > >> My initial concern was that this would add a useless local vaiable, but >> perhaps that is not a big deal. > > How many of these "special" instances do we have? We have 5 such special instances. > >> >> Both are fine for me. ;) > > Also no strong opinion, but having to pass a variable makes you think > what you are supposed to do with it and why it is not optional. Yeah, I added 'BUG_ON(!pmdvalp);' in pte_offset_map_ro_nolock(), and have updated the v2 version [1]. [1]. https://lore.kernel.org/lkml/cover.1724310149.git.zhengqi.arch@bytedance.com/ Thanks, Qi >
On 22.08.24 14:17, Qi Zheng wrote: > Hi David, > > On 2024/8/22 17:29, David Hildenbrand wrote: >> On 21.08.24 12:03, Qi Zheng wrote: >>> >>> > > [...] > >>>>>>>>> - vmf->pte = pte_offset_map_nolock(vmf->vma->vm_mm, >>>>>>>>> vmf->pmd, >>>>>>>>> - vmf->address, &vmf->ptl); >>>>>>>>> + vmf->pte = pte_offset_map_maywrite_nolock(vmf->vma->vm_mm, >>>>>>>>> + vmf->pmd, vmf->address, >>>>>>>>> + NULL, &vmf->ptl); >>>>>> >>>>>> I think we discussed that passing NULL should be forbidden for that >>>>>> function. >>>>> >>>>> Yes, but for some maywrite case, there is no need to get pmdval to >>>>> do pmd_same() check. So I passed NULL and added a comment to >>>>> explain this. >>>> >>>> I wonder if it's better to pass a dummy variable instead. One has to >>>> think harder why that is required compared to blindly passing "NULL" :) >>> >>> You are afraid that subsequent caller will abuse this function, right? >> >> Yes! "oh, I don't need a pmdval, why would I? let's just pass NULL, >> easy" :) >> >>> My initial concern was that this would add a useless local vaiable, but >>> perhaps that is not a big deal. >> >> How many of these "special" instances do we have? > > We have 5 such special instances. > >> >>> >>> Both are fine for me. ;) >> >> Also no strong opinion, but having to pass a variable makes you think >> what you are supposed to do with it and why it is not optional. > > Yeah, I added 'BUG_ON(!pmdvalp);' in pte_offset_map_ro_nolock(), and > have updated the v2 version [1]. No BUG_ON please :) VM_WARN_ON_ONCE() is good enough.
On 2024/8/22 20:19, David Hildenbrand wrote: > On 22.08.24 14:17, Qi Zheng wrote: >> Hi David, >> >> On 2024/8/22 17:29, David Hildenbrand wrote: >>> On 21.08.24 12:03, Qi Zheng wrote: >>>> >>>> >> >> [...] >> >>>>>>>>>> - vmf->pte = pte_offset_map_nolock(vmf->vma->vm_mm, >>>>>>>>>> vmf->pmd, >>>>>>>>>> - vmf->address, &vmf->ptl); >>>>>>>>>> + vmf->pte = >>>>>>>>>> pte_offset_map_maywrite_nolock(vmf->vma->vm_mm, >>>>>>>>>> + vmf->pmd, vmf->address, >>>>>>>>>> + NULL, &vmf->ptl); >>>>>>> >>>>>>> I think we discussed that passing NULL should be forbidden for that >>>>>>> function. >>>>>> >>>>>> Yes, but for some maywrite case, there is no need to get pmdval to >>>>>> do pmd_same() check. So I passed NULL and added a comment to >>>>>> explain this. >>>>> >>>>> I wonder if it's better to pass a dummy variable instead. One has to >>>>> think harder why that is required compared to blindly passing >>>>> "NULL" :) >>>> >>>> You are afraid that subsequent caller will abuse this function, right? >>> >>> Yes! "oh, I don't need a pmdval, why would I? let's just pass NULL, >>> easy" :) >>> >>>> My initial concern was that this would add a useless local vaiable, but >>>> perhaps that is not a big deal. >>> >>> How many of these "special" instances do we have? >> >> We have 5 such special instances. >> >>> >>>> >>>> Both are fine for me. ;) >>> >>> Also no strong opinion, but having to pass a variable makes you think >>> what you are supposed to do with it and why it is not optional. >> >> Yeah, I added 'BUG_ON(!pmdvalp);' in pte_offset_map_ro_nolock(), and >> have updated the v2 version [1]. > > No BUG_ON please :) VM_WARN_ON_ONCE() is good enough. Got it. Will do in the next version. >
diff --git a/mm/memory.c b/mm/memory.c index 93c0c25433d02..d3378e98faf13 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -5504,9 +5504,14 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf) * pmd by anon khugepaged, since that takes mmap_lock in write * mode; but shmem or file collapse to THP could still morph * it into a huge pmd: just retry later if so. + * + * Use the maywrite version to indicate that vmf->pte will be + * modified, but since we will use pte_same() to detect the + * change of the pte entry, there is no need to get pmdval. */ - vmf->pte = pte_offset_map_nolock(vmf->vma->vm_mm, vmf->pmd, - vmf->address, &vmf->ptl); + vmf->pte = pte_offset_map_maywrite_nolock(vmf->vma->vm_mm, + vmf->pmd, vmf->address, + NULL, &vmf->ptl); if (unlikely(!vmf->pte)) return 0; vmf->orig_pte = ptep_get_lockless(vmf->pte);
In handle_pte_fault(), we may modify the vmf->pte after acquiring the vmf->ptl, so convert it to using pte_offset_map_maywrite_nolock(). But since we already do the pte_same() check, so there is no need to get pmdval to do pmd_same() check, just pass NULL to pmdvalp parameter. Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> --- mm/memory.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)