Message ID | 20180216102100.GB25201@hirez.programming.kicks-ass.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Peter, On Fri, Feb 16, 2018 at 11:21:00AM +0100, Peter Zijlstra wrote: > On Thu, Feb 15, 2018 at 06:20:49PM +0000, Will Deacon wrote: > > On Thu, Feb 15, 2018 at 06:08:47PM +0100, Peter Zijlstra wrote: > > > On Thu, Feb 15, 2018 at 03:29:33PM +0000, Will Deacon wrote: > > > > +static inline void __clear_bit_unlock(unsigned int nr, > > > > + volatile unsigned long *p) > > > > +{ > > > > + unsigned long old; > > > > > > > > + p += BIT_WORD(nr); > > > > + old = READ_ONCE(*p); > > > > + old &= ~BIT_MASK(nr); > > > > + smp_store_release(p, old); > > > > > > This should be atomic_set_release() I think, for the special case where > > > atomics are implemented with spinlocks, see the 'fun' case in > > > Documentation/atomic_t.txt. > > > > My understanding of __clear_bit_unlock is that there is guaranteed to be > > no concurrent accesses to the same word, so why would it matter whether > > locks are used to implement atomics? > > > commit f75d48644c56a31731d17fa693c8175328957e1d > Author: Peter Zijlstra <peterz@infradead.org> > Date: Wed Mar 9 12:40:54 2016 +0100 > > bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. > > If an arch is able to do better; is must provide an implementation of > __clear_bit_unlock() itself. > > Specifically, this came up as a result of hackbench livelock'ing in > slab_lock() on ARC with SMP + SLUB + !LLSC. > > The issue was incorrect pairing of atomic ops. > > slab_lock() -> bit_spin_lock() -> test_and_set_bit() > slab_unlock() -> __bit_spin_unlock() -> __clear_bit() > > The non serializing __clear_bit() was getting "lost" > > 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set > 80543b90: or r3,r2,1 <--- (B) other core unlocks right here > 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) Ah, so it's problematic for the case where atomics are built using locks. Got it. I'll err on the side of caution here and have the asm-generic header (which should be bitops/lock.h not bitops/atomic.h) conditionally define __clear_bit_unlock as clear_bit_lock unless the architecture has provided its own implementation. Thanks, Will
On Mon, Feb 19, 2018 at 02:01:43PM +0000, Will Deacon wrote: > > The non serializing __clear_bit() was getting "lost" > > > > 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set > > 80543b90: or r3,r2,1 <--- (B) other core unlocks right here > > 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > Ah, so it's problematic for the case where atomics are built using locks. > Got it. I'll err on the side of caution here and have the asm-generic header > (which should be bitops/lock.h not bitops/atomic.h) conditionally define > __clear_bit_unlock as clear_bit_lock unless the architecture has provided > its own implementation. So I think we get it all right if we use atomic_set_release(). If the atomics are implemented using locks, atomic_set*() should be implemented like atomic_xchg() and avoid the above problem.
diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h index c30266e94806..8ef0ccbf8167 100644 --- a/include/asm-generic/bitops/lock.h +++ b/include/asm-generic/bitops/lock.h @@ -29,16 +29,16 @@ do { \ * @nr: the bit to set * @addr: the address to start counting from * - * This operation is like clear_bit_unlock, however it is not atomic. - * It does provide release barrier semantics so it can be used to unlock - * a bit lock, however it would only be used if no other CPU can modify - * any bits in the memory until the lock is released (a good example is - * if the bit lock itself protects access to the other bits in the word). + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all + * the bits in the word are protected by this lock some archs can use weaker + * ops to safely unlock. + * + * See for example x86's implementation. */ #define __clear_bit_unlock(nr, addr) \ do { \ - smp_mb(); \ - __clear_bit(nr, addr); \ + smp_mb__before_atomic(); \ + clear_bit(nr, addr); \ } while (0) #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */