Message ID | 20240917073117.1531207-4-anshuman.khandual@arm.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm: Use pxdp_get() for accessing page table entries | expand |
On 17/09/2024 08:31, Anshuman Khandual wrote: > Convert PTE accesses via ptep_get() helper that defaults as READ_ONCE() but > also provides the platform an opportunity to override when required. This > stores read page table entry value in a local variable which can be used in > multiple instances there after. This helps in avoiding multiple memory load > operations as well possible race conditions. > > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: David Hildenbrand <david@redhat.com> > Cc: Ryan Roberts <ryan.roberts@arm.com> > Cc: "Mike Rapoport (IBM)" <rppt@kernel.org> > Cc: linux-mm@kvack.org > Cc: linux-kernel@vger.kernel.org > Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> > --- > include/linux/pgtable.h | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h > index 2a6a3cccfc36..547eeae8c43f 100644 > --- a/include/linux/pgtable.h > +++ b/include/linux/pgtable.h > @@ -1060,7 +1060,8 @@ static inline int pgd_same(pgd_t pgd_a, pgd_t pgd_b) > */ > #define set_pte_safe(ptep, pte) \ > ({ \ > - WARN_ON_ONCE(pte_present(*ptep) && !pte_same(*ptep, pte)); \ > + pte_t __old = ptep_get(ptep); \ > + WARN_ON_ONCE(pte_present(__old) && !pte_same(__old, pte)); \ > set_pte(ptep, pte); \ > }) >
On 17.09.24 09:31, Anshuman Khandual wrote: > Convert PTE accesses via ptep_get() helper that defaults as READ_ONCE() but > also provides the platform an opportunity to override when required. This > stores read page table entry value in a local variable which can be used in > multiple instances there after. This helps in avoiding multiple memory load > operations as well possible race conditions. > Please make it clearer in the subject+description that this really only involves set_pte_safe(). > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: David Hildenbrand <david@redhat.com> > Cc: Ryan Roberts <ryan.roberts@arm.com> > Cc: "Mike Rapoport (IBM)" <rppt@kernel.org> > Cc: linux-mm@kvack.org > Cc: linux-kernel@vger.kernel.org > Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> > --- > include/linux/pgtable.h | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h > index 2a6a3cccfc36..547eeae8c43f 100644 > --- a/include/linux/pgtable.h > +++ b/include/linux/pgtable.h > @@ -1060,7 +1060,8 @@ static inline int pgd_same(pgd_t pgd_a, pgd_t pgd_b) > */ > #define set_pte_safe(ptep, pte) \ > ({ \ > - WARN_ON_ONCE(pte_present(*ptep) && !pte_same(*ptep, pte)); \ > + pte_t __old = ptep_get(ptep); \ > + WARN_ON_ONCE(pte_present(__old) && !pte_same(__old, pte)); \ > set_pte(ptep, pte); \ > }) > I don't think this is necessary. PTE present cannot flip concurrently, that's the whole reason of the "safe" part after all. Can we just move these weird set_pte/pmd_safe() stuff to x86 init code and be done with it? Then it's also clear *where* it is getting used and for which reason.
On 9/17/24 15:58, David Hildenbrand wrote: > On 17.09.24 09:31, Anshuman Khandual wrote: >> Convert PTE accesses via ptep_get() helper that defaults as READ_ONCE() but >> also provides the platform an opportunity to override when required. This >> stores read page table entry value in a local variable which can be used in >> multiple instances there after. This helps in avoiding multiple memory load >> operations as well possible race conditions. >> > > Please make it clearer in the subject+description that this really only involves set_pte_safe(). I will update the commit message with some thing like this. mm: Use ptep_get() in set_pte_safe() This converts PTE accesses in set_pte_safe() via ptep_get() helper which defaults as READ_ONCE() but also provides the platform an opportunity to override when required. This stores read page table entry value in a local variable which can be used in multiple instances there after. This helps in avoiding multiple memory load operations as well as some possible race conditions. > > >> Cc: Andrew Morton <akpm@linux-foundation.org> >> Cc: David Hildenbrand <david@redhat.com> >> Cc: Ryan Roberts <ryan.roberts@arm.com> >> Cc: "Mike Rapoport (IBM)" <rppt@kernel.org> >> Cc: linux-mm@kvack.org >> Cc: linux-kernel@vger.kernel.org >> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> >> --- >> include/linux/pgtable.h | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h >> index 2a6a3cccfc36..547eeae8c43f 100644 >> --- a/include/linux/pgtable.h >> +++ b/include/linux/pgtable.h >> @@ -1060,7 +1060,8 @@ static inline int pgd_same(pgd_t pgd_a, pgd_t pgd_b) >> */ >> #define set_pte_safe(ptep, pte) \ >> ({ \ >> - WARN_ON_ONCE(pte_present(*ptep) && !pte_same(*ptep, pte)); \ >> + pte_t __old = ptep_get(ptep); \ >> + WARN_ON_ONCE(pte_present(__old) && !pte_same(__old, pte)); \ >> set_pte(ptep, pte); \ >> }) >> > > I don't think this is necessary. PTE present cannot flip concurrently, that's the whole reason of the "safe" part after all. Which is not necessary ? Converting de-references to ptep_get() OR caching the page table read value in a local variable ? ptep_get() conversion also serves the purpose providing an opportunity for platform to override. > > Can we just move these weird set_pte/pmd_safe() stuff to x86 init code and be done with it? Then it's also clear *where* it is getting used and for which reason. > set_pte/pmd_safe() can be moved to x86 platform - as that is currently the sole user for these helpers. But because set_pgd_safe() gets used in riscv platform, just wondering would it be worth moving only the pte/pmd helpers but not the pgd one ?
On 18.09.24 08:32, Anshuman Khandual wrote: > > > On 9/17/24 15:58, David Hildenbrand wrote: >> On 17.09.24 09:31, Anshuman Khandual wrote: >>> Convert PTE accesses via ptep_get() helper that defaults as READ_ONCE() but >>> also provides the platform an opportunity to override when required. This >>> stores read page table entry value in a local variable which can be used in >>> multiple instances there after. This helps in avoiding multiple memory load >>> operations as well possible race conditions. >>> >> >> Please make it clearer in the subject+description that this really only involves set_pte_safe(). > > I will update the commit message with some thing like this. > > mm: Use ptep_get() in set_pte_safe() > > This converts PTE accesses in set_pte_safe() via ptep_get() helper which > defaults as READ_ONCE() but also provides the platform an opportunity to > override when required. This stores read page table entry value in a local > variable which can be used in multiple instances there after. This helps > in avoiding multiple memory load operations as well as some possible race > conditions. > >> >> >>> Cc: Andrew Morton <akpm@linux-foundation.org> >>> Cc: David Hildenbrand <david@redhat.com> >>> Cc: Ryan Roberts <ryan.roberts@arm.com> >>> Cc: "Mike Rapoport (IBM)" <rppt@kernel.org> >>> Cc: linux-mm@kvack.org >>> Cc: linux-kernel@vger.kernel.org >>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> >>> --- >>> include/linux/pgtable.h | 3 ++- >>> 1 file changed, 2 insertions(+), 1 deletion(-) >>> >>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h >>> index 2a6a3cccfc36..547eeae8c43f 100644 >>> --- a/include/linux/pgtable.h >>> +++ b/include/linux/pgtable.h >>> @@ -1060,7 +1060,8 @@ static inline int pgd_same(pgd_t pgd_a, pgd_t pgd_b) >>> */ >>> #define set_pte_safe(ptep, pte) \ >>> ({ \ >>> - WARN_ON_ONCE(pte_present(*ptep) && !pte_same(*ptep, pte)); \ >>> + pte_t __old = ptep_get(ptep); \ >>> + WARN_ON_ONCE(pte_present(__old) && !pte_same(__old, pte)); \ >>> set_pte(ptep, pte); \ >>> }) >>> >> >> I don't think this is necessary. PTE present cannot flip concurrently, that's the whole reason of the "safe" part after all. > > Which is not necessary ? Converting de-references to ptep_get() OR caching > the page table read value in a local variable ? ptep_get() conversion also > serves the purpose providing an opportunity for platform to override. Which arch override are you thinking of where this change here would make a real difference? Would it even make a difference with cont-pte on arm? > >> >> Can we just move these weird set_pte/pmd_safe() stuff to x86 init code and be done with it? Then it's also clear *where* it is getting used and for which reason. >> > set_pte/pmd_safe() can be moved to x86 platform - as that is currently the > sole user for these helpers. But because set_pgd_safe() gets used in riscv > platform, just wondering would it be worth moving only the pte/pmd helpers > but not the pgd one ? My take would be just to move them where they are used, and possibly even inlining them. The point is that it's absolutely underdocumented what "_safe" is supposed to be here, and I don't really see the reason to have this in common code (making the common API more complicated).
On 9/19/24 13:34, David Hildenbrand wrote: > On 18.09.24 08:32, Anshuman Khandual wrote: >> >> >> On 9/17/24 15:58, David Hildenbrand wrote: >>> On 17.09.24 09:31, Anshuman Khandual wrote: >>>> Convert PTE accesses via ptep_get() helper that defaults as READ_ONCE() but >>>> also provides the platform an opportunity to override when required. This >>>> stores read page table entry value in a local variable which can be used in >>>> multiple instances there after. This helps in avoiding multiple memory load >>>> operations as well possible race conditions. >>>> >>> >>> Please make it clearer in the subject+description that this really only involves set_pte_safe(). >> >> I will update the commit message with some thing like this. >> >> mm: Use ptep_get() in set_pte_safe() >> >> This converts PTE accesses in set_pte_safe() via ptep_get() helper which >> defaults as READ_ONCE() but also provides the platform an opportunity to >> override when required. This stores read page table entry value in a local >> variable which can be used in multiple instances there after. This helps >> in avoiding multiple memory load operations as well as some possible race >> conditions. >> >>> >>> >>>> Cc: Andrew Morton <akpm@linux-foundation.org> >>>> Cc: David Hildenbrand <david@redhat.com> >>>> Cc: Ryan Roberts <ryan.roberts@arm.com> >>>> Cc: "Mike Rapoport (IBM)" <rppt@kernel.org> >>>> Cc: linux-mm@kvack.org >>>> Cc: linux-kernel@vger.kernel.org >>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> >>>> --- >>>> include/linux/pgtable.h | 3 ++- >>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h >>>> index 2a6a3cccfc36..547eeae8c43f 100644 >>>> --- a/include/linux/pgtable.h >>>> +++ b/include/linux/pgtable.h >>>> @@ -1060,7 +1060,8 @@ static inline int pgd_same(pgd_t pgd_a, pgd_t pgd_b) >>>> */ >>>> #define set_pte_safe(ptep, pte) \ >>>> ({ \ >>>> - WARN_ON_ONCE(pte_present(*ptep) && !pte_same(*ptep, pte)); \ >>>> + pte_t __old = ptep_get(ptep); \ >>>> + WARN_ON_ONCE(pte_present(__old) && !pte_same(__old, pte)); \ >>>> set_pte(ptep, pte); \ >>>> }) >>>> >>> >>> I don't think this is necessary. PTE present cannot flip concurrently, that's the whole reason of the "safe" part after all. >> >> Which is not necessary ? Converting de-references to ptep_get() OR caching >> the page table read value in a local variable ? ptep_get() conversion also >> serves the purpose providing an opportunity for platform to override. > > Which arch override are you thinking of where this change here would make a real difference? Would it even make a difference with cont-pte on arm? As we figured out already this code is not used any where other than x86 platform. So changing this, won't make a difference for arm64 unless I am missing something. The idea behind the series is to ensure that, there are no direct de-referencing of page table entries in generic MM code and all accesses should go via available helpers instead. But if we move these set_pxd_safe() helpers into platform code as you have suggested earlier, those changes will not be necessary anymore. > >> >>> >>> Can we just move these weird set_pte/pmd_safe() stuff to x86 init code and be done with it? Then it's also clear *where* it is getting used and for which reason. >>> >> set_pte/pmd_safe() can be moved to x86 platform - as that is currently the >> sole user for these helpers. But because set_pgd_safe() gets used in riscv >> platform, just wondering would it be worth moving only the pte/pmd helpers >> but not the pgd one ? > > My take would be just to move them where they are used, and possibly even inlining them. > > The point is that it's absolutely underdocumented what "_safe" is supposed to be here, and I don't really see the reason to have this in common code (making the common API more complicated). Agreed, it makes sense for these helpers to be in the platform code instead where they get used (x86, riscv). Will move them as required.
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 2a6a3cccfc36..547eeae8c43f 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -1060,7 +1060,8 @@ static inline int pgd_same(pgd_t pgd_a, pgd_t pgd_b) */ #define set_pte_safe(ptep, pte) \ ({ \ - WARN_ON_ONCE(pte_present(*ptep) && !pte_same(*ptep, pte)); \ + pte_t __old = ptep_get(ptep); \ + WARN_ON_ONCE(pte_present(__old) && !pte_same(__old, pte)); \ set_pte(ptep, pte); \ })
Convert PTE accesses via ptep_get() helper that defaults as READ_ONCE() but also provides the platform an opportunity to override when required. This stores read page table entry value in a local variable which can be used in multiple instances there after. This helps in avoiding multiple memory load operations as well possible race conditions. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Hildenbrand <david@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: "Mike Rapoport (IBM)" <rppt@kernel.org> Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> --- include/linux/pgtable.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)