| Message ID | 20221022114424.906110403@infradead.org (mailing list archive) |
|---|---|
| State | New |
| Series | Clean up pmd_get_atomic() and i386-PAE |
On Sat, 22 Oct 2022, Peter Zijlstra wrote:

> On architectures where the PTE/PMD is larger than the native word size
> (i386-PAE for example), READ_ONCE() can do the wrong thing. Use
> pmdp_get_lockless() just like we use ptep_get_lockless().

I thought that was something Will Deacon put a lot of effort
into handling around 5.8 and 5.9: see "strong prevailing wind" in
include/asm-generic/rwonce.h, formerly in include/linux/compiler.h.

Was it too optimistic? Did the wind drop?

I'm interested in the answer, but I've certainly no objection
to making this all more obviously robust - thanks.

Hugh

> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  kernel/events/core.c | 2 +-
>  mm/gup.c             | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -7186,7 +7186,7 @@ static u64 perf_get_pgtable_size(struct
>  		return pud_leaf_size(pud);
>
>  	pmdp = pmd_offset_lockless(pudp, pud, addr);
> -	pmd = READ_ONCE(*pmdp);
> +	pmd = pmdp_get_lockless(pmdp);
>  	if (!pmd_present(pmd))
>  		return 0;
>
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2507,7 +2507,7 @@ static int gup_pmd_range(pud_t *pudp, pu
>
>  	pmdp = pmd_offset_lockless(pudp, pud, addr);
>  	do {
> -		pmd_t pmd = READ_ONCE(*pmdp);
> +		pmd_t pmd = pmdp_get_lockless(pmdp);
>
>  		next = pmd_addr_end(addr, end);
>  		if (!pmd_present(pmd))
On Sat, Oct 22, 2022 at 05:42:18PM -0700, Hugh Dickins wrote:
> On Sat, 22 Oct 2022, Peter Zijlstra wrote:
>
> > On architectures where the PTE/PMD is larger than the native word size
> > (i386-PAE for example), READ_ONCE() can do the wrong thing. Use
> > pmdp_get_lockless() just like we use ptep_get_lockless().
>
> I thought that was something Will Deacon put a lot of effort
> into handling around 5.8 and 5.9: see "strong prevailing wind" in
> include/asm-generic/rwonce.h, formerly in include/linux/compiler.h.
>
> Was it too optimistic? Did the wind drop?
>
> I'm interested in the answer, but I've certainly no objection
> to making this all more obviously robust - thanks.

READ_ONCE() can't do what the hardware can't do. There is absolutely no
way i386 can do an atomic 64bit load without resorting to cmpxchg8b.

Also see the comment that goes with compiletime_assert_rwonce_type(). It
explicitly allows 64bit because there's just too much stuff that does
that (and there's actually 32bit hardware that *can* do it).

But it's still very wrong.
On Mon, 24 Oct 2022, Peter Zijlstra wrote:
> On Sat, Oct 22, 2022 at 05:42:18PM -0700, Hugh Dickins wrote:
> > On Sat, 22 Oct 2022, Peter Zijlstra wrote:
> >
> > > On architectures where the PTE/PMD is larger than the native word size
> > > (i386-PAE for example), READ_ONCE() can do the wrong thing. Use
> > > pmdp_get_lockless() just like we use ptep_get_lockless().
> >
> > I thought that was something Will Deacon put a lot of effort
> > into handling around 5.8 and 5.9: see "strong prevailing wind" in
> > include/asm-generic/rwonce.h, formerly in include/linux/compiler.h.
> >
> > Was it too optimistic? Did the wind drop?
> >
> > I'm interested in the answer, but I've certainly no objection
> > to making this all more obviously robust - thanks.
>
> READ_ONCE() can't do what the hardware can't do. There is absolutely no
> way i386 can do an atomic 64bit load without resorting to cmpxchg8b.

Right.

> Also see the comment that goes with compiletime_assert_rwonce_type(). It
> explicitly allows 64bit because there's just too much stuff that does
> that (and there's actually 32bit hardware that *can* do it).

Yes, the "strong prevailing wind" comment. I think I've never read that
carefully enough, until you redirected me back there: it is in fact
quite clear, that it's only *atomic* in the Armv7 + LPAE case; but
READ_ONCEy (READ_EACH_HALF_ONCE I guess) for other 64-on-32 cases.

> But it's still very wrong.

Somewhat clearer to me now, thanks.

Hugh
```diff
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7186,7 +7186,7 @@ static u64 perf_get_pgtable_size(struct
 		return pud_leaf_size(pud);
 
 	pmdp = pmd_offset_lockless(pudp, pud, addr);
-	pmd = READ_ONCE(*pmdp);
+	pmd = pmdp_get_lockless(pmdp);
 	if (!pmd_present(pmd))
 		return 0;
 
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2507,7 +2507,7 @@ static int gup_pmd_range(pud_t *pudp, pu
 
 	pmdp = pmd_offset_lockless(pudp, pud, addr);
 	do {
-		pmd_t pmd = READ_ONCE(*pmdp);
+		pmd_t pmd = pmdp_get_lockless(pmdp);
 
 		next = pmd_addr_end(addr, end);
 		if (!pmd_present(pmd))
```
On architectures where the PTE/PMD is larger than the native word size
(i386-PAE for example), READ_ONCE() can do the wrong thing. Use
pmdp_get_lockless() just like we use ptep_get_lockless().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/events/core.c | 2 +-
 mm/gup.c             | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)