Message ID | Zehp16cKYeGWknJs@libra05 (mailing list archive) |
---|---|
State | RFC |
Headers | show |
Series | net/rds: Improper memory ordering semantic in release_in_xmit() | expand |
On Wed, 2024-03-06 at 22:04 +0900, Yewon Choi wrote: > Hello, > > It seems to be that clear_bit() in release_in_xmit() doesn't have > release semantic while it works as a bit lock in rds_send_xmit(). > Since acquire/release_in_xmit() are used in rds_send_xmit() for the > serialization between callers of rds_send_xmit(), they should imply > acquire/release semantics like other locks. > > Although smp_mb__after_atomic() is placed after clear_bit(), it > cannot > prevent that instructions before clear_bit() (in critical section) > are > reordered after clear_bit(). > As a result, mutual exclusion may not be guaranteed in specific > HW architectures like Arm. > > We tested that this locking implementation doesn't guarantee the > atomicity of > critical section in Arm server. Testing was done with Arm Neoverse N1 > cores, > and the testing code was generated by litmus testing tool (klitmus7). > > Initial condition: > > l = x = y = r0 = r1 = 0 > > Thread 0: > > if (test_and_set_bit(0, l) == 0) { > WRITE_ONCE(*x, 1); > WRITE_ONCE(*y, 1); > clear_bit(0, l); > smp_mb__after_atomic(); > } > > Thread 1: > > if (test_and_set_bit(0, l) == 0) { > r0 = READ_ONCE(*x); > r1 = READ_ONCE(*y); > clear_bit(0, l); > smp_mb__after_atomic(); > } > > If the implementation is correct, the value of r0 and r1 should show > all-or-nothing behavior (both 0 or 1). However, below test result > shows > that atomicity violation is very rare, but exists: > > Histogram (4 states) > 9673811 :>1:r0=0; 1:r1=0; > 5647 :>1:r0=1; 1:r1=0; // Violate atomicity > 9605 :>1:r0=0; 1:r1=1; // Violate atomicity > 6310937 :>1:r0=1; 1:r1=1; > > So, we suggest introducing release semantic using clear_bit_unlock() > instead of clear_bit(): > > diff --git a/net/rds/send.c b/net/rds/send.c > index 5e57a1581dc6..65b1bb06ca71 100644 > --- a/net/rds/send.c > +++ b/net/rds/send.c > @@ -108,7 +108,7 @@ static int acquire_in_xmit(struct rds_conn_path > *cp) > > static void release_in_xmit(struct rds_conn_path *cp) > { > - clear_bit(RDS_IN_XMIT, &cp->cp_flags); > + clear_bit_unlock(RDS_IN_XMIT, &cp->cp_flags); > smp_mb__after_atomic(); > /* > * We don't use wait_on_bit()/wake_up_bit() because our > waking is in a > > Could you check this please? If needed, we will send a patch. Hi Yewon, Thank you for finding this. I had a look at the code you had mentioned, and while I don't see any use cases of release_in_xmit() that might result in an out of order read, I do think that the proposed change is a good clean up. If you choose to submit a patch, please remove the proceeding "smp_mb__after_atomic" line as well, as it would no longer be needed. Also, please update acquire_in_xmit() to use the corresponding test_and_set_bit_lock() call. Thank you! Allison > > Best Regards, > Yewon Choi
On Thu, Mar 07, 2024 at 08:13:50PM +0000, Allison Henderson wrote: > On Wed, 2024-03-06 at 22:04 +0900, Yewon Choi wrote: > > Hello, > > > > It seems to be that clear_bit() in release_in_xmit() doesn't have > > release semantic while it works as a bit lock in rds_send_xmit(). > > Since acquire/release_in_xmit() are used in rds_send_xmit() for the > > serialization between callers of rds_send_xmit(), they should imply > > acquire/release semantics like other locks. > > > > Although smp_mb__after_atomic() is placed after clear_bit(), it > > cannot > > prevent that instructions before clear_bit() (in critical section) > > are > > reordered after clear_bit(). > > As a result, mutual exclusion may not be guaranteed in specific > > HW architectures like Arm. > > > > We tested that this locking implementation doesn't guarantee the > > atomicity of > > critical section in Arm server. Testing was done with Arm Neoverse N1 > > cores, > > and the testing code was generated by litmus testing tool (klitmus7). > > > > Initial condition: > > > > l = x = y = r0 = r1 = 0 > > > > Thread 0: > > > > if (test_and_set_bit(0, l) == 0) { > > WRITE_ONCE(*x, 1); > > WRITE_ONCE(*y, 1); > > clear_bit(0, l); > > smp_mb__after_atomic(); > > } > > > > Thread 1: > > > > if (test_and_set_bit(0, l) == 0) { > > r0 = READ_ONCE(*x); > > r1 = READ_ONCE(*y); > > clear_bit(0, l); > > smp_mb__after_atomic(); > > } > > > > If the implementation is correct, the value of r0 and r1 should show > > all-or-nothing behavior (both 0 or 1). However, below test result > > shows > > that atomicity violation is very rare, but exists: > > > > Histogram (4 states) > > 9673811 :>1:r0=0; 1:r1=0; > > 5647 :>1:r0=1; 1:r1=0; // Violate atomicity > > 9605 :>1:r0=0; 1:r1=1; // Violate atomicity > > 6310937 :>1:r0=1; 1:r1=1; > > > > So, we suggest introducing release semantic using clear_bit_unlock() > > instead of clear_bit(): > > > > diff --git a/net/rds/send.c b/net/rds/send.c > > index 5e57a1581dc6..65b1bb06ca71 100644 > > --- a/net/rds/send.c > > +++ b/net/rds/send.c > > @@ -108,7 +108,7 @@ static int acquire_in_xmit(struct rds_conn_path > > *cp) > > > > static void release_in_xmit(struct rds_conn_path *cp) > > { > > - clear_bit(RDS_IN_XMIT, &cp->cp_flags); > > + clear_bit_unlock(RDS_IN_XMIT, &cp->cp_flags); > > smp_mb__after_atomic(); > > /* > > * We don't use wait_on_bit()/wake_up_bit() because our > > waking is in a > > > > Could you check this please? If needed, we will send a patch. > > Hi Yewon, > > Thank you for finding this. I had a look at the code you had > mentioned, and while I don't see any use cases of release_in_xmit() > that might result in an out of order read, I do think that the proposed > change is a good clean up. If you choose to submit a patch, please > remove the proceeding "smp_mb__after_atomic" line as well, as it would > no longer be needed. Also, please update acquire_in_xmit() to use the > corresponding test_and_set_bit_lock() call. Thank you! > Thank you for examining this and giving suggestions! I sent a patch with changes including your suggestions. If it has problems, I will correct them as soon as possible. Sincerely, Yewon Choi > Allison > > > > > > Best Regards, > > Yewon Choi >
diff --git a/net/rds/send.c b/net/rds/send.c index 5e57a1581dc6..65b1bb06ca71 100644 --- a/net/rds/send.c +++ b/net/rds/send.c @@ -108,7 +108,7 @@ static int acquire_in_xmit(struct rds_conn_path *cp) static void release_in_xmit(struct rds_conn_path *cp) { - clear_bit(RDS_IN_XMIT, &cp->cp_flags); + clear_bit_unlock(RDS_IN_XMIT, &cp->cp_flags); smp_mb__after_atomic(); /* * We don't use wait_on_bit()/wake_up_bit() because our waking is in a