
[3/5] btrfs: access eb::blocking_writers according to ACCESS_ONCE policies

Message ID cb93f314165c09d91cbbeb7a1d4ee59b54496220.1571340084.git.dsterba@suse.com (mailing list archive)
State New, archived
Series Extent buffer locking and documentation

Commit Message

David Sterba Oct. 17, 2019, 7:38 p.m. UTC
A nice writeup of the LKMM (Linux Kernel Memory Model) rules for access
once policies can be found here
https://lwn.net/Articles/799218/#Access-Marking%20Policies .

The locked and unlocked access to eb::blocking_writers should be
annotated accordingly, following this:

Writes:

- locked write must use ONCE, may use plain read
- unlocked write must use ONCE

Reads:

- unlocked read must use ONCE
- locked read may use plain read iff not mixed with unlocked read
- unlocked read then locked must use ONCE

There's one difference at the assembly level: btrfs_tree_read_lock_atomic
and btrfs_try_tree_read_lock used to rely on the cached value and did not
re-evaluate it after taking the lock. This could have missed some
opportunities to take the lock if the number of blocking writers changed
between the two checks, but the window is only a few instructions long.
As these are try-lock style helpers, the callers handle that anyway.
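
To illustrate the pattern (a minimal sketch with made-up identifiers, not
the btrfs code itself): the lockless fast-path check has to be marked, and
the value is re-read after the lock is taken so a concurrent change is not
missed:

#include <linux/compiler.h>	/* READ_ONCE() */
#include <linux/spinlock.h>	/* rwlock_t, read_trylock() */

struct foo {
	rwlock_t lock;
	int blocking_writers;	/* written under lock, read locklessly */
};

/* Try-lock style: optimistic unlocked check, then recheck under the lock */
static int foo_try_read_lock(struct foo *f)
{
	/* Unlocked read, must be marked */
	if (READ_ONCE(f->blocking_writers))
		return 0;

	if (!read_trylock(&f->lock))
		return 0;

	/* Refetch after taking the lock, the value may have changed */
	if (READ_ONCE(f->blocking_writers)) {
		read_unlock(&f->lock);
		return 0;
	}
	return 1;
}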

Signed-off-by: David Sterba <dsterba@suse.com>
---
 fs/btrfs/locking.c | 31 +++++++++++++++++++------------
 1 file changed, 19 insertions(+), 12 deletions(-)

Comments

Nikolay Borisov Oct. 23, 2019, 8:42 a.m. UTC | #1
On 17.10.19 at 22:38, David Sterba wrote:
> A nice writeup of the LKMM (Linux Kernel Memory Model) rules for access
> once policies can be found here
> https://lwn.net/Articles/799218/#Access-Marking%20Policies .
> 
> The locked and unlocked access to eb::blocking_writers should be
> annotated accordingly, following this:
> 
> Writes:
> 
> - locked write must use ONCE, may use plain read
> - unlocked write must use ONCE
> 
> Reads:
> 
> - unlocked read must use ONCE
> - locked read may use plain read iff not mixed with unlocked read
> - unlocked read then locked must use ONCE
> 
> There's one difference at the assembly level: btrfs_tree_read_lock_atomic
> and btrfs_try_tree_read_lock used to rely on the cached value and did not
> re-evaluate it after taking the lock. This could have missed some
> opportunities to take the lock if the number of blocking writers changed
> between the two checks, but the window is only a few instructions long.
> As these are try-lock style helpers, the callers handle that anyway.

Looking at blocking_writers, it seems this variable is only modified under
eb->lock (in btrfs_set_lock_blocking_write, which requires holding the
spinning lock first, meaning the setting thread is the owner). That owning
thread will eventually have to unlock it, which happens in
btrfs_tree_unlock. There the read/write of the variable is either locked
or unlocked, depending on whether the lock is blocking (unlocked) or
spinning (locked).

OTOH, other readers of the lock might access it unlocked, namely:
btrfs_tree_read_lock (in case wait_event is called) and
btrfs_tree_read_lock_atomic, where the first access is unlocked, plus the
optimistic unlocked checks in btrfs_try_tree_(read|write)_lock.

So I believe this puts us in scenario 3 in the referenced section of
that LWN article:

A shared variable (blocking_writers) is only modified while holding a
given lock by a given owning CPU or thread (that's the thread which
first took a spinning lock and eventually converted it to a blocking
one), but is read by other CPUs or threads or by code not holding that
lock (that's other threads that might want to acquire the lock on the
given eb).
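
In code form that scenario is roughly the following (a hypothetical
sketch, not the btrfs functions themselves): the owner's reads under the
lock can stay plain, its stores are marked because lockless readers
observe them, and the lockless readers mark their loads:

#include <linux/compiler.h>	/* READ_ONCE(), WRITE_ONCE() */
#include <linux/spinlock.h>	/* rwlock_t */

struct eb_like {
	rwlock_t lock;
	int blocking_writers;	/* only modified while holding ->lock */
};

/* Owner: called with the write lock held, converts spinning to blocking */
static void owner_set_blocking(struct eb_like *e)
{
	/* Owner's read under the lock can stay plain */
	if (e->blocking_writers == 0) {
		/* The store is seen by lockless readers, so mark it */
		WRITE_ONCE(e->blocking_writers, 1);
		write_unlock(&e->lock);
	}
}

/* Any other thread, no lock held: the load must be marked */
static int someone_is_blocking(struct eb_like *e)
{
	return READ_ONCE(e->blocking_writers) != 0;
}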

Apart from one minor nit below (and provided my understanding is correct
as per the above exposé), you can add:

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

> 
> Signed-off-by: David Sterba <dsterba@suse.com>
> ---
>  fs/btrfs/locking.c | 31 +++++++++++++++++++------------
>  1 file changed, 19 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c
> index 00edf91c3d1c..a12f3edc3505 100644
> --- a/fs/btrfs/locking.c
> +++ b/fs/btrfs/locking.c
> @@ -109,7 +109,7 @@ void btrfs_set_lock_blocking_write(struct extent_buffer *eb)
>  	if (eb->blocking_writers == 0) {
>  		btrfs_assert_spinning_writers_put(eb);
>  		btrfs_assert_tree_locked(eb);
> -		eb->blocking_writers = 1;
> +		WRITE_ONCE(eb->blocking_writers, 1);
>  		write_unlock(&eb->lock);
>  	}
>  }

<snip>


> @@ -294,7 +299,8 @@ void btrfs_tree_lock(struct extent_buffer *eb)
>   */
>  void btrfs_tree_unlock(struct extent_buffer *eb)
>  {
> -	int blockers = eb->blocking_writers;
> +	/* This is read both locked and unlocked */
> +	int blockers = READ_ONCE(eb->blocking_writers);

Actually aren't we guaranteed that btrfs_tree_unlock's caller is the
owner of blocking_writers meaning we can use plain loads as per:

"The owning CPU or thread may use plain loads..."

>  
>  	BUG_ON(blockers > 1);
>  
> @@ -305,7 +311,8 @@ void btrfs_tree_unlock(struct extent_buffer *eb)
>  
>  	if (blockers) {
>  		btrfs_assert_no_spinning_writers(eb);
> -		eb->blocking_writers = 0;
> +		/* Unlocked write */
> +		WRITE_ONCE(eb->blocking_writers, 0);
>  		/*
>  		 * We need to order modifying blocking_writers above with
>  		 * actually waking up the sleepers to ensure they see the
>
David Sterba Oct. 29, 2019, 9:33 p.m. UTC | #2
On Wed, Oct 23, 2019 at 11:42:29AM +0300, Nikolay Borisov wrote:
> On 17.10.19 at 22:38, David Sterba wrote:
> > @@ -294,7 +299,8 @@ void btrfs_tree_lock(struct extent_buffer *eb)
> >   */
> >  void btrfs_tree_unlock(struct extent_buffer *eb)
> >  {
> > -	int blockers = eb->blocking_writers;
> > +	/* This is read both locked and unlocked */
> > +	int blockers = READ_ONCE(eb->blocking_writers);
> 
> Actually aren't we guaranteed that btrfs_tree_unlock's caller is the
> owner of blocking_writers meaning we can use plain loads as per:
> 
> "The owning CPU or thread may use plain loads..."

That's right, I'll remove READ_ONCE and leave a comment as it's not
obvious from the context.
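
Something along these lines, i.e. a plain load with a comment explaining
why it is safe (just a sketch of the discussed change, the actual
follow-up patch is not shown in this thread):

void btrfs_tree_unlock(struct extent_buffer *eb)
{
	/*
	 * blocking_writers is only modified while holding the lock by the
	 * owning thread, which is us here, so a plain load is sufficient.
	 * Lockless readers elsewhere still need READ_ONCE().
	 */
	int blockers = eb->blocking_writers;

	BUG_ON(blockers > 1);
	/* ... rest of the function unchanged ... */
}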

Patch

diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c
index 00edf91c3d1c..a12f3edc3505 100644
--- a/fs/btrfs/locking.c
+++ b/fs/btrfs/locking.c
@@ -109,7 +109,7 @@  void btrfs_set_lock_blocking_write(struct extent_buffer *eb)
 	if (eb->blocking_writers == 0) {
 		btrfs_assert_spinning_writers_put(eb);
 		btrfs_assert_tree_locked(eb);
-		eb->blocking_writers = 1;
+		WRITE_ONCE(eb->blocking_writers, 1);
 		write_unlock(&eb->lock);
 	}
 }
@@ -145,7 +145,7 @@  void btrfs_tree_read_lock(struct extent_buffer *eb)
 		}
 		read_unlock(&eb->lock);
 		wait_event(eb->write_lock_wq,
-			   eb->blocking_writers == 0);
+			   READ_ONCE(eb->blocking_writers) == 0);
 		goto again;
 	}
 	btrfs_assert_tree_read_locks_get(eb);
@@ -160,11 +160,12 @@  void btrfs_tree_read_lock(struct extent_buffer *eb)
  */
 int btrfs_tree_read_lock_atomic(struct extent_buffer *eb)
 {
-	if (eb->blocking_writers)
+	if (READ_ONCE(eb->blocking_writers))
 		return 0;
 
 	read_lock(&eb->lock);
-	if (eb->blocking_writers) {
+	/* Refetch value after lock */
+	if (READ_ONCE(eb->blocking_writers)) {
 		read_unlock(&eb->lock);
 		return 0;
 	}
@@ -180,13 +181,14 @@  int btrfs_tree_read_lock_atomic(struct extent_buffer *eb)
  */
 int btrfs_try_tree_read_lock(struct extent_buffer *eb)
 {
-	if (eb->blocking_writers)
+	if (READ_ONCE(eb->blocking_writers))
 		return 0;
 
 	if (!read_trylock(&eb->lock))
 		return 0;
 
-	if (eb->blocking_writers) {
+	/* Refetch value after lock */
+	if (READ_ONCE(eb->blocking_writers)) {
 		read_unlock(&eb->lock);
 		return 0;
 	}
@@ -202,11 +204,12 @@  int btrfs_try_tree_read_lock(struct extent_buffer *eb)
  */
 int btrfs_try_tree_write_lock(struct extent_buffer *eb)
 {
-	if (eb->blocking_writers || atomic_read(&eb->blocking_readers))
+	if (READ_ONCE(eb->blocking_writers) || atomic_read(&eb->blocking_readers))
 		return 0;
 
 	write_lock(&eb->lock);
-	if (eb->blocking_writers || atomic_read(&eb->blocking_readers)) {
+	/* Refetch value after lock */
+	if (READ_ONCE(eb->blocking_writers) || atomic_read(&eb->blocking_readers)) {
 		write_unlock(&eb->lock);
 		return 0;
 	}
@@ -277,9 +280,11 @@  void btrfs_tree_lock(struct extent_buffer *eb)
 	WARN_ON(eb->lock_owner == current->pid);
 again:
 	wait_event(eb->read_lock_wq, atomic_read(&eb->blocking_readers) == 0);
-	wait_event(eb->write_lock_wq, eb->blocking_writers == 0);
+	wait_event(eb->write_lock_wq, READ_ONCE(eb->blocking_writers) == 0);
 	write_lock(&eb->lock);
-	if (atomic_read(&eb->blocking_readers) || eb->blocking_writers) {
+	/* Refetch value after lock */
+	if (atomic_read(&eb->blocking_readers) ||
+	    READ_ONCE(eb->blocking_writers)) {
 		write_unlock(&eb->lock);
 		goto again;
 	}
@@ -294,7 +299,8 @@  void btrfs_tree_lock(struct extent_buffer *eb)
  */
 void btrfs_tree_unlock(struct extent_buffer *eb)
 {
-	int blockers = eb->blocking_writers;
+	/* This is read both locked and unlocked */
+	int blockers = READ_ONCE(eb->blocking_writers);
 
 	BUG_ON(blockers > 1);
 
@@ -305,7 +311,8 @@  void btrfs_tree_unlock(struct extent_buffer *eb)
 
 	if (blockers) {
 		btrfs_assert_no_spinning_writers(eb);
-		eb->blocking_writers = 0;
+		/* Unlocked write */
+		WRITE_ONCE(eb->blocking_writers, 0);
 		/*
 		 * We need to order modifying blocking_writers above with
 		 * actually waking up the sleepers to ensure they see the