[5/5] btrfs: document extent buffer locking

Message ID ed989ccddbc8822e61df533d861d907ce0a43040.1571340084.git.dsterba@suse.com
State New
Series
  • Extent buffer locking and documentation

Commit Message

David Sterba Oct. 17, 2019, 7:39 p.m. UTC
Signed-off-by: David Sterba <dsterba@suse.com>
---
 fs/btrfs/locking.c | 110 +++++++++++++++++++++++++++++++++++++++------
 1 file changed, 96 insertions(+), 14 deletions(-)

Comments

Qu Wenruo Oct. 18, 2019, 12:17 a.m. UTC | #1
On 2019/10/18 3:39 a.m., David Sterba wrote:
> Signed-off-by: David Sterba <dsterba@suse.com>

Great document.

Some questions inlined below.
> ---
>  fs/btrfs/locking.c | 110 +++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 96 insertions(+), 14 deletions(-)
> 
> diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c
> index e0e0430577aa..2a0e828b4470 100644
> --- a/fs/btrfs/locking.c
> +++ b/fs/btrfs/locking.c
> @@ -13,6 +13,48 @@
>  #include "extent_io.h"
>  #include "locking.h"
>  
> +/*
> + * Extent buffer locking
> + * ~~~~~~~~~~~~~~~~~~~~~
> + *
> + * The locks use a custom scheme that allows us to do more operations than are
> + * available from the current locking primitives. The building blocks are still
> + * rwlock and wait queues.
> + *
> + * Required semantics:
> + *
> + * - reader/writer exclusion
> + * - writer/writer exclusion
> + * - reader/reader sharing
> + * - spinning lock semantics
> + * - blocking lock semantics
> + * - try-lock semantics for readers and writers
> + * - one level nesting, allowing read lock to be taken by the same thread that
> + *   already has write lock

Is there any example of this scenario? IIRC there is only one user of the
nested lock. Although we know it has existed for a long time, I guess it
would be better to try to remove such call sites?

> + *
> + * The extent buffer locks (also called tree locks) manage access to eb data.

One of my concerns related to "access to eb data" is that, to access some
members, we don't really need any lock at all.

Some members should never change during the lifespan of an eb. E.g.
bytenr, transid.

Some code is already taking advantage of this, like tree-checker
checking the transid without holding a lock.
Not sure if we should make use of this.

> + * We want concurrency of many readers and safe updates. The underlying locking
> + * is done by read-write spinlock and the blocking part is implemented using
> + * counters and wait queues.
> + *
> + * spinning semantics - the low-level rwlock is held so all other threads that
> + *                      want to take it are spinning on it.
> + *
> + * blocking semantics - the low-level rwlock is not held but the counter
> + *                      denotes how many times the blocking lock was held;
> + *                      sleeping is possible

What about an example/state machine of all read/write and
spinning/blocking combinations?

Thanks,
Qu

> + *
> + * Write lock always allows only one thread to access the data.
> + *
> + *
> + * Debugging
> + * ~~~~~~~~~
> + *
> + * There are additional state counters that are asserted in various contexts,
> + * removed from non-debug build to reduce extent_buffer size and for
> + * performance reasons.
> + */
> +
>  #ifdef CONFIG_BTRFS_DEBUG
>  static inline void btrfs_assert_spinning_writers_get(struct extent_buffer *eb)
>  {
> @@ -80,6 +122,15 @@ static void btrfs_assert_tree_write_locks_get(struct extent_buffer *eb) { }
>  static void btrfs_assert_tree_write_locks_put(struct extent_buffer *eb) { }
>  #endif
>  
> +/*
> + * Mark already held read lock as blocking. Can be nested in write lock by the
> + * same thread.
> + *
> + * Use when there are potentially long operations ahead so other threads waiting
> + * on the lock will not actively spin but sleep instead.
> + *
> + * The rwlock is released and blocking reader counter is increased.
> + */
>  void btrfs_set_lock_blocking_read(struct extent_buffer *eb)
>  {
>  	trace_btrfs_set_lock_blocking_read(eb);
> @@ -96,6 +147,14 @@ void btrfs_set_lock_blocking_read(struct extent_buffer *eb)
>  	read_unlock(&eb->lock);
>  }
>  
> +/*
> + * Mark already held write lock as blocking.
> + *
> + * Use when there are potentially long operations ahead so other threads
> + * waiting on the lock will not actively spin but sleep instead.
> + *
> + * The rwlock is released and the blocking writers counter is set.
> + */
>  void btrfs_set_lock_blocking_write(struct extent_buffer *eb)
>  {
>  	trace_btrfs_set_lock_blocking_write(eb);
> @@ -127,8 +186,13 @@ void btrfs_set_lock_blocking_write(struct extent_buffer *eb)
>  }
>  
>  /*
> - * take a spinning read lock.  This will wait for any blocking
> - * writers
> + * Lock the extent buffer for read. Wait for any writers (spinning or blocking).
> + * Can be nested in write lock by the same thread.
> + *
> + * Use when the locked section only does lightweight actions and busy waiting
> + * would be cheaper than making other threads do the wait/wake loop.
> + *
> + * The rwlock is held upon exit.
>   */
>  void btrfs_tree_read_lock(struct extent_buffer *eb)
>  {
> @@ -166,9 +230,10 @@ void btrfs_tree_read_lock(struct extent_buffer *eb)
>  }
>  
>  /*
> - * take a spinning read lock.
> - * returns 1 if we get the read lock and 0 if we don't
> - * this won't wait for blocking writers
> + * Lock extent buffer for read, optimistically expecting that there are no
> + * contending blocking writers. If there are, don't wait.
> + *
> + * Return 1 if the rwlock has been taken, 0 otherwise
>   */
>  int btrfs_tree_read_lock_atomic(struct extent_buffer *eb)
>  {
> @@ -188,8 +253,9 @@ int btrfs_tree_read_lock_atomic(struct extent_buffer *eb)
>  }
>  
>  /*
> - * returns 1 if we get the read lock and 0 if we don't
> - * this won't wait for blocking writers
> + * Try-lock for read. Don't block or wait for contending writers.
> + *
> + * Return 1 if the rwlock has been taken, 0 otherwise
>   */
>  int btrfs_try_tree_read_lock(struct extent_buffer *eb)
>  {
> @@ -211,8 +277,10 @@ int btrfs_try_tree_read_lock(struct extent_buffer *eb)
>  }
>  
>  /*
> - * returns 1 if we get the read lock and 0 if we don't
> - * this won't wait for blocking writers or readers
> + * Try-lock for write. May block until the lock is uncontended, but does not
> + * wait until it is free.
> + *
> + * Return 1 if the rwlock has been taken, 0 otherwise
>   */
>  int btrfs_try_tree_write_lock(struct extent_buffer *eb)
>  {
> @@ -233,7 +301,10 @@ int btrfs_try_tree_write_lock(struct extent_buffer *eb)
>  }
>  
>  /*
> - * drop a spinning read lock
> + * Release read lock. Must be used only if the lock is in spinning mode. If
> + * the read lock is nested, it must pair with the read lock before the write unlock.
> + *
> + * The rwlock is not held upon exit.
>   */
>  void btrfs_tree_read_unlock(struct extent_buffer *eb)
>  {
> @@ -255,7 +326,11 @@ void btrfs_tree_read_unlock(struct extent_buffer *eb)
>  }
>  
>  /*
> - * drop a blocking read lock
> + * Release read lock, previously set to blocking by a pairing call to
> + * btrfs_set_lock_blocking_read(). Can be nested in write lock by the same
> + * thread.
> + *
> + * State of rwlock is unchanged, last reader wakes waiting threads.
>   */
>  void btrfs_tree_read_unlock_blocking(struct extent_buffer *eb)
>  {
> @@ -279,8 +354,10 @@ void btrfs_tree_read_unlock_blocking(struct extent_buffer *eb)
>  }
>  
>  /*
> - * take a spinning write lock.  This will wait for both
> - * blocking readers or writers
> + * Lock for write. Wait for all blocking and spinning readers and writers. This
> + * starts context where reader lock could be nested by the same thread.
> + *
> + * The rwlock is held for write upon exit.
>   */
>  void btrfs_tree_lock(struct extent_buffer *eb)
>  {
> @@ -307,7 +384,12 @@ void btrfs_tree_lock(struct extent_buffer *eb)
>  }
>  
>  /*
> - * drop a spinning or a blocking write lock.
> + * Release the write lock, either blocking or spinning (i.e. there's no need
> + * for an explicit blocking unlock, like btrfs_tree_read_unlock_blocking).
> + * This also ends the context for nesting, the read lock must have been
> + * released already.
> + *
> + * Tasks blocked and waiting are woken, rwlock is not held upon exit.
>   */
>  void btrfs_tree_unlock(struct extent_buffer *eb)
>  {
>
David Sterba Oct. 18, 2019, 11:56 a.m. UTC | #2
On Fri, Oct 18, 2019 at 08:17:44AM +0800, Qu Wenruo wrote:
> > + * Required semantics:
> > + *
> > + * - reader/writer exclusion
> > + * - writer/writer exclusion
> > + * - reader/reader sharing
> > + * - spinning lock semantics
> > + * - blocking lock semantics
> > + * - try-lock semantics for readers and writers
> > + * - one level nesting, allowing read lock to be taken by the same thread that
> > + *   already has write lock
> 
> Is there any example of this scenario? IIRC there is only one user of the
> nested lock. Although we know it has existed for a long time, I guess it
> would be better to try to remove such call sites?

It could make a few things easier on the locking side, but I don't know
how intrusive it would be in the scenario where it happens. The
stacktrace is quite long so some sort of propagation of the locked
context would have to happen.

> 
> > + *
> > + * The extent buffer locks (also called tree locks) manage access to eb data.
> 
> One of my concerns related to "access to eb data" is that, to access some
> members, we don't really need any lock at all.
> 
> Some members should never change during the lifespan of an eb. E.g.
> bytenr, transid.

OK, so the paragraph can be more specific: the locking concerns the
b-tree data, i.e. item keys and the corresponding item data.

> Some code is already taking advantage of this, like tree-checker
> checking the transid without holding a lock.
> Not sure if we should make use of this.
> 
> > + * We want concurrency of many readers and safe updates. The underlying locking
> > + * is done by read-write spinlock and the blocking part is implemented using
> > + * counters and wait queues.
> > + *
> > + * spinning semantics - the low-level rwlock is held so all other threads that
> > + *                      want to take it are spinning on it.
> > + *
> > + * blocking semantics - the low-level rwlock is not held but the counter
> > + *                      denotes how many times the blocking lock was held;
> > + *                      sleeping is possible
> 
> What about an example/state machine of all read/write and
> spinning/blocking combinations?

The state machine is really trivial: either use just the spinning locks,
or at some point set the lock blocking and unblock/unlock at the end. The
main difference is whether the rwlock is held or not.

A few more words about that are in
https://github.com/btrfs/btrfs-dev-docs/blob/master/extent-buffer-locking.txt
Nikolay Borisov Oct. 22, 2019, 9:53 a.m. UTC | #3
On 17.10.19 22:39, David Sterba wrote:
> Signed-off-by: David Sterba <dsterba@suse.com>
> ---
>  fs/btrfs/locking.c | 110 +++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 96 insertions(+), 14 deletions(-)
> 
> diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c
> index e0e0430577aa..2a0e828b4470 100644
> --- a/fs/btrfs/locking.c
> +++ b/fs/btrfs/locking.c
> @@ -13,6 +13,48 @@
>  #include "extent_io.h"
>  #include "locking.h"
>  
> +/*
> + * Extent buffer locking
> + * ~~~~~~~~~~~~~~~~~~~~~
> + *
> + * The locks use a custom scheme that allows us to do more operations than are
> + * available from the current locking primitives. The building blocks are still
> + * rwlock and wait queues.
> + *
> + * Required semantics:

Thinking out loud here..

> + *
> + * - reader/writer exclusion

RWSEM has that

> + * - writer/writer exclusion

RWSEM has that

> + * - reader/reader sharing

RWSEM has that
> + * - spinning lock semantics

RWSEM has that in the form of optimistic spinning which would be shorter
than what the custom implementation provides.

> + * - blocking lock semantics

RWSEM has that

> + * - try-lock semantics for readers and writers

down_write_trylock, down_read_trylock.

> + * - one level nesting, allowing read lock to be taken by the same thread that
> + *   already has write lock

this might be somewhat problematic but there is downgrade_write which
could be used, I'll have to check.

> + *
> + * The extent buffer locks (also called tree locks) manage access to eb data.
> + * We want concurrency of many readers and safe updates. The underlying locking
> + * is done by read-write spinlock and the blocking part is implemented using
> + * counters and wait queues.
> + *
> + * spinning semantics - the low-level rwlock is held so all other threads that
> + *                      want to take it are spinning on it.
> + *
> + * blocking semantics - the low-level rwlock is not held but the counter
> + *                      denotes how many times the blocking lock was held;
> + *                      sleeping is possible
> + *
> + * Write lock always allows only one thread to access the data.

<snip>
David Sterba Oct. 22, 2019, 10:41 a.m. UTC | #4
On Tue, Oct 22, 2019 at 12:53:29PM +0300, Nikolay Borisov wrote:
> 
> 
On 17.10.19 22:39, David Sterba wrote:
> > Signed-off-by: David Sterba <dsterba@suse.com>
> > ---
> >  fs/btrfs/locking.c | 110 +++++++++++++++++++++++++++++++++++++++------
> >  1 file changed, 96 insertions(+), 14 deletions(-)
> > 
> > diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c
> > index e0e0430577aa..2a0e828b4470 100644
> > --- a/fs/btrfs/locking.c
> > +++ b/fs/btrfs/locking.c
> > @@ -13,6 +13,48 @@
> >  #include "extent_io.h"
> >  #include "locking.h"
> >  
> > +/*
> > + * Extent buffer locking
> > + * ~~~~~~~~~~~~~~~~~~~~~
> > + *
> > + * The locks use a custom scheme that allows us to do more operations than are
> > + * available from the current locking primitives. The building blocks are still
> > + * rwlock and wait queues.
> > + *
> > + * Required semantics:
> 
> Thinking out loud here..
> 
> > + *
> > + * - reader/writer exclusion
> 
> RWSEM has that
> 
> > + * - writer/writer exclusion
> 
> RWSEM has that
> 
> > + * - reader/reader sharing
> 
> RWSEM has that
> > + * - spinning lock semantics
> 
> RWSEM has that in the form of optimistic spinning which would be shorter
> than what the custom implementation provides.
> 
> > + * - blocking lock semantics
> 
> RWSEM has that

'Blocking' in the btrfs lock does not hold the rwlock; does rwsem do it
the same way? The term might be confusing: a thread that tries to get a
lock is also blocking (and spinning while trying), but the point here is
that the lock is not held, so any sleeping operation can happen. And with
the locking status offloaded, we're not nesting the locking primitives,
but only the big tree locks.

> > + * - try-lock semantics for readers and writers
> 
> down_write_trylock, down_read_trylock.
> 
> > + * - one level nesting, allowing read lock to be taken by the same thread that
> > + *   already has write lock
> 
> this might be somewhat problematic but there is downgrade_write which
> could be used, I'll have to check.

The read lock happens transparently from the POV of the thread that
takes it. As there's only one known context where the write->read
happens, it would be possible to do some magic to convert it, but it's
still problematic.

The read function is also called without the write lock. So it would
have to atomically check if the lock is held for write AND by the same
task. I don't know if this is feasible.
Nikolay Borisov Oct. 23, 2019, 11:16 a.m. UTC | #5
On 17.10.19 22:39, David Sterba wrote:
> Signed-off-by: David Sterba <dsterba@suse.com>
> ---
>  fs/btrfs/locking.c | 110 +++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 96 insertions(+), 14 deletions(-)
> 
> diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c
> index e0e0430577aa..2a0e828b4470 100644
> --- a/fs/btrfs/locking.c
> +++ b/fs/btrfs/locking.c
> @@ -13,6 +13,48 @@
>  #include "extent_io.h"
>  #include "locking.h"
>  
> +/*
> + * Extent buffer locking
> + * ~~~~~~~~~~~~~~~~~~~~~
> + *
> + * The locks use a custom scheme that allows us to do more operations than are
> + * available from the current locking primitives. The building blocks are still
> + * rwlock and wait queues.
> + *
> + * Required semantics:
> + *
> + * - reader/writer exclusion
> + * - writer/writer exclusion
> + * - reader/reader sharing
> + * - spinning lock semantics
> + * - blocking lock semantics
> + * - try-lock semantics for readers and writers
> + * - one level nesting, allowing read lock to be taken by the same thread that
> + *   already has write lock
> + *
> + * The extent buffer locks (also called tree locks) manage access to eb data.
> + * We want concurrency of many readers and safe updates. The underlying locking
> + * is done by read-write spinlock and the blocking part is implemented using
> + * counters and wait queues.
> + *
> + * spinning semantics - the low-level rwlock is held so all other threads that
> + *                      want to take it are spinning on it.
> + *
> + * blocking semantics - the low-level rwlock is not held but the counter
> + *                      denotes how many times the blocking lock was held;
> + *                      sleeping is possible
> + *
> + * Write lock always allows only one thread to access the data.
> + *
> + *
> + * Debugging
> + * ~~~~~~~~~
> + *
> + * There are additional state counters that are asserted in various contexts,
> + * removed from non-debug build to reduce extent_buffer size and for
> + * performance reasons.
> + */
> +
>  #ifdef CONFIG_BTRFS_DEBUG
>  static inline void btrfs_assert_spinning_writers_get(struct extent_buffer *eb)
>  {
> @@ -80,6 +122,15 @@ static void btrfs_assert_tree_write_locks_get(struct extent_buffer *eb) { }
>  static void btrfs_assert_tree_write_locks_put(struct extent_buffer *eb) { }
>  #endif
>  
> +/*
> + * Mark already held read lock as blocking. Can be nested in write lock by the
> + * same thread.
> + *
> + * Use when there are potentially long operations ahead so other threads waiting
> + * on the lock will not actively spin but sleep instead.
> + *
> + * The rwlock is released and blocking reader counter is increased.
> + */
>  void btrfs_set_lock_blocking_read(struct extent_buffer *eb)
>  {

I think for this and its write counterpart a
lockdep_assert_held(&eb->lock) would be a better way to document the fact
that the lock needs to be held when those functions are called.

>  	trace_btrfs_set_lock_blocking_read(eb);

<snip>

>  /*
> - * drop a blocking read lock
> + * Release read lock, previously set to blocking by a pairing call to
> + * btrfs_set_lock_blocking_read(). Can be nested in write lock by the same
> + * thread.
> + *
> + * State of rwlock is unchanged, last reader wakes waiting threads.
>   */
>  void btrfs_tree_read_unlock_blocking(struct extent_buffer *eb)
>  {
> @@ -279,8 +354,10 @@ void btrfs_tree_read_unlock_blocking(struct extent_buffer *eb)
>  }
>  
>  /*
> - * take a spinning write lock.  This will wait for both
> - * blocking readers or writers
> + * Lock for write. Wait for all blocking and spinning readers and writers. This
> + * starts context where reader lock could be nested by the same thread.

IMO you shouldn't omit the explicit mention that this takes a spinning lock.

> + *
> + * The rwlock is held for write upon exit.
>   */
>  void btrfs_tree_lock(struct extent_buffer *eb)
>  {
> @@ -307,7 +384,12 @@ void btrfs_tree_lock(struct extent_buffer *eb)
>  }
>  
>  /*
> - * drop a spinning or a blocking write lock.
> + * Release the write lock, either blocking or spinning (ie. there's no need
> + * for an explicit blocking unlock, like btrfs_tree_read_unlock_blocking).
> + * This also ends the context for nesting, the read lock must have been
> + * released already.
> + *
> + * Tasks blocked and waiting are woken, rwlock is not held upon exit.
>   */
>  void btrfs_tree_unlock(struct extent_buffer *eb)
>  {
>
David Sterba Oct. 29, 2019, 5:33 p.m. UTC | #6
On Wed, Oct 23, 2019 at 02:16:26PM +0300, Nikolay Borisov wrote:
> >  void btrfs_set_lock_blocking_read(struct extent_buffer *eb)
> >  {
> 
> I think for this and its write counterpart a
> lockdep_assert_held(&eb->lock) would be a better way to document the fact
> that the lock needs to be held when those functions are called.

Ok, will add it.

> >  	trace_btrfs_set_lock_blocking_read(eb);
> 
> <snip>
> 
> >  /*
> > - * drop a blocking read lock
> > + * Release read lock, previously set to blocking by a pairing call to
> > + * btrfs_set_lock_blocking_read(). Can be nested in write lock by the same
> > + * thread.
> > + *
> > + * State of rwlock is unchanged, last reader wakes waiting threads.
> >   */
> >  void btrfs_tree_read_unlock_blocking(struct extent_buffer *eb)
> >  {
> > @@ -279,8 +354,10 @@ void btrfs_tree_read_unlock_blocking(struct extent_buffer *eb)
> >  }
> >  
> >  /*
> > - * take a spinning write lock.  This will wait for both
> > - * blocking readers or writers
> > + * Lock for write. Wait for all blocking and spinning readers and writers. This
> > + * starts context where reader lock could be nested by the same thread.
> 
> IMO you shouldn't omit the explicit mention that this takes a spinning lock.

But

> > + * The rwlock is held for write upon exit.

the next line in the comment says that. There's no explicit 'spinning'
but I find it sufficient. Feel free to suggest better wording.

Patch

diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c
index e0e0430577aa..2a0e828b4470 100644
--- a/fs/btrfs/locking.c
+++ b/fs/btrfs/locking.c
@@ -13,6 +13,48 @@ 
 #include "extent_io.h"
 #include "locking.h"
 
+/*
+ * Extent buffer locking
+ * ~~~~~~~~~~~~~~~~~~~~~
+ *
+ * The locks use a custom scheme that allows us to do more operations than are
+ * available from the current locking primitives. The building blocks are still
+ * rwlock and wait queues.
+ *
+ * Required semantics:
+ *
+ * - reader/writer exclusion
+ * - writer/writer exclusion
+ * - reader/reader sharing
+ * - spinning lock semantics
+ * - blocking lock semantics
+ * - try-lock semantics for readers and writers
+ * - one level nesting, allowing read lock to be taken by the same thread that
+ *   already has write lock
+ *
+ * The extent buffer locks (also called tree locks) manage access to eb data.
+ * We want concurrency of many readers and safe updates. The underlying locking
+ * is done by read-write spinlock and the blocking part is implemented using
+ * counters and wait queues.
+ *
+ * spinning semantics - the low-level rwlock is held so all other threads that
+ *                      want to take it are spinning on it.
+ *
+ * blocking semantics - the low-level rwlock is not held but the counter
+ *                      denotes how many times the blocking lock was held;
+ *                      sleeping is possible
+ *
+ * Write lock always allows only one thread to access the data.
+ *
+ *
+ * Debugging
+ * ~~~~~~~~~
+ *
+ * There are additional state counters that are asserted in various contexts,
+ * removed from non-debug build to reduce extent_buffer size and for
+ * performance reasons.
+ */
+
 #ifdef CONFIG_BTRFS_DEBUG
 static inline void btrfs_assert_spinning_writers_get(struct extent_buffer *eb)
 {
@@ -80,6 +122,15 @@  static void btrfs_assert_tree_write_locks_get(struct extent_buffer *eb) { }
 static void btrfs_assert_tree_write_locks_put(struct extent_buffer *eb) { }
 #endif
 
+/*
+ * Mark already held read lock as blocking. Can be nested in write lock by the
+ * same thread.
+ *
+ * Use when there are potentially long operations ahead so other threads waiting
+ * on the lock will not actively spin but sleep instead.
+ *
+ * The rwlock is released and blocking reader counter is increased.
+ */
 void btrfs_set_lock_blocking_read(struct extent_buffer *eb)
 {
 	trace_btrfs_set_lock_blocking_read(eb);
@@ -96,6 +147,14 @@  void btrfs_set_lock_blocking_read(struct extent_buffer *eb)
 	read_unlock(&eb->lock);
 }
 
+/*
+ * Mark already held write lock as blocking.
+ *
+ * Use when there are potentially long operations ahead so other threads
+ * waiting on the lock will not actively spin but sleep instead.
+ *
+ * The rwlock is released and the blocking writers counter is set.
+ */
 void btrfs_set_lock_blocking_write(struct extent_buffer *eb)
 {
 	trace_btrfs_set_lock_blocking_write(eb);
@@ -127,8 +186,13 @@  void btrfs_set_lock_blocking_write(struct extent_buffer *eb)
 }
 
 /*
- * take a spinning read lock.  This will wait for any blocking
- * writers
+ * Lock the extent buffer for read. Wait for any writers (spinning or blocking).
+ * Can be nested in write lock by the same thread.
+ *
+ * Use when the locked section only does lightweight actions and busy waiting
+ * would be cheaper than making other threads do the wait/wake loop.
+ *
+ * The rwlock is held upon exit.
  */
 void btrfs_tree_read_lock(struct extent_buffer *eb)
 {
@@ -166,9 +230,10 @@  void btrfs_tree_read_lock(struct extent_buffer *eb)
 }
 
 /*
- * take a spinning read lock.
- * returns 1 if we get the read lock and 0 if we don't
- * this won't wait for blocking writers
+ * Lock extent buffer for read, optimistically expecting that there are no
+ * contending blocking writers. If there are, don't wait.
+ *
+ * Return 1 if the rwlock has been taken, 0 otherwise
  */
 int btrfs_tree_read_lock_atomic(struct extent_buffer *eb)
 {
@@ -188,8 +253,9 @@  int btrfs_tree_read_lock_atomic(struct extent_buffer *eb)
 }
 
 /*
- * returns 1 if we get the read lock and 0 if we don't
- * this won't wait for blocking writers
+ * Try-lock for read. Don't block or wait for contending writers.
+ *
+ * Return 1 if the rwlock has been taken, 0 otherwise
  */
 int btrfs_try_tree_read_lock(struct extent_buffer *eb)
 {
@@ -211,8 +277,10 @@  int btrfs_try_tree_read_lock(struct extent_buffer *eb)
 }
 
 /*
- * returns 1 if we get the read lock and 0 if we don't
- * this won't wait for blocking writers or readers
+ * Try-lock for write. May block until the lock is uncontended, but does not
+ * wait until it is free.
+ *
+ * Return 1 if the rwlock has been taken, 0 otherwise
  */
 int btrfs_try_tree_write_lock(struct extent_buffer *eb)
 {
@@ -233,7 +301,10 @@  int btrfs_try_tree_write_lock(struct extent_buffer *eb)
 }
 
 /*
- * drop a spinning read lock
+ * Release read lock. Must be used only if the lock is in spinning mode. If
+ * the read lock is nested, it must pair with the read lock before the write unlock.
+ *
+ * The rwlock is not held upon exit.
  */
 void btrfs_tree_read_unlock(struct extent_buffer *eb)
 {
@@ -255,7 +326,11 @@  void btrfs_tree_read_unlock(struct extent_buffer *eb)
 }
 
 /*
- * drop a blocking read lock
+ * Release read lock, previously set to blocking by a pairing call to
+ * btrfs_set_lock_blocking_read(). Can be nested in write lock by the same
+ * thread.
+ *
+ * State of rwlock is unchanged, last reader wakes waiting threads.
  */
 void btrfs_tree_read_unlock_blocking(struct extent_buffer *eb)
 {
@@ -279,8 +354,10 @@  void btrfs_tree_read_unlock_blocking(struct extent_buffer *eb)
 }
 
 /*
- * take a spinning write lock.  This will wait for both
- * blocking readers or writers
+ * Lock for write. Wait for all blocking and spinning readers and writers. This
+ * starts context where reader lock could be nested by the same thread.
+ *
+ * The rwlock is held for write upon exit.
  */
 void btrfs_tree_lock(struct extent_buffer *eb)
 {
@@ -307,7 +384,12 @@  void btrfs_tree_lock(struct extent_buffer *eb)
 }
 
 /*
- * drop a spinning or a blocking write lock.
+ * Release the write lock, either blocking or spinning (i.e. there's no need
+ * for an explicit blocking unlock, like btrfs_tree_read_unlock_blocking).
+ * This also ends the context for nesting, the read lock must have been
+ * released already.
+ *
+ * Tasks blocked and waiting are woken, rwlock is not held upon exit.
  */
 void btrfs_tree_unlock(struct extent_buffer *eb)
 {