diff mbox

[v3,2/2] locking/percpu-rwsem: Annotate rwsem ownership transfer by setting RWSEM_OWNER_UNKNOWN

Message ID 1526405884-4860-3-git-send-email-longman@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Waiman Long May 15, 2018, 5:38 p.m. UTC
The filesystem freezing code needs to transfer ownership of a rwsem
embedded in a percpu-rwsem from the task that does the freezing to
another one that does the thawing by calling percpu_rwsem_release()
after freezing and percpu_rwsem_acquire() before thawing.

However, the new rwsem debug code runs afoul with this scheme by warning
that the task that releases the rwsem isn't the one that acquires it.

[   20.302978] ------------[ cut here ]------------
[   20.305016] DEBUG_LOCKS_WARN_ON(sem->owner != get_current())
[   20.305029] WARNING: CPU: 1 PID: 1401 at
/home/amir/build/src/linux/kernel/locking/rwsem.c:133 up_write+0x59/0x79
[   20.311252] CPU: 1 PID: 1401 Comm: fsfreeze Not tainted 4.17.0-rc3-xfstests-00049-g39e47bf59eb3 #3276
[   20.314808] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[   20.318403] RIP: 0010:up_write+0x59/0x79
[   20.320928] RSP: 0018:ffffc90000717e48 EFLAGS: 00010286
[   20.322955] RAX: 0000000000000030 RBX: ffff880078f1c680 RCX: ffff880078e42200
[   20.325665] RDX: ffffffff810cc9c1 RSI: 0000000000000001 RDI: 0000000000000202
[   20.328844] RBP: ffffc90000717e80 R08: 0000000000000001 R09: 0000000000000001
[   20.332340] R10: ffffc90000717c58 R11: ffffffff836807ad R12: ffff880078f1c388
[   20.335095] R13: ffff880078a8b980 R14: 0000000000000000 R15: 00000000fffffff7
[   20.338009] FS:  00007fb61ca42700(0000) GS:ffff88007f400000(0000) knlGS:0000000000000000
[   20.341423] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   20.343772] CR2: 00007fb61c559b30 CR3: 0000000078da6000 CR4: 00000000000006e0
[   20.346463] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   20.349201] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   20.351960] Call Trace:
[   20.352911]  percpu_up_write+0x1f/0x28
[   20.354344]  thaw_super_locked+0xdf/0x120
[   20.355944]  do_vfs_ioctl+0x270/0x5f1
[   20.357390]  ? __se_sys_newfstat+0x2e/0x39
[   20.358969]  ? entry_SYSCALL_64_after_hwframe+0x59/0xbe
[   20.360991]  ksys_ioctl+0x52/0x71
[   20.362384]  __x64_sys_ioctl+0x16/0x19
[   20.363702]  do_syscall_64+0x5d/0x167
[   20.365099]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

To work properly with the rwsem debug code, we need to annotate that the
rwsem ownership is unknown during the tranfer period until a brave soul
comes forward to acquire the ownership. During that period, optimistic
spinning will be disabled.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 include/linux/percpu-rwsem.h | 6 +++++-
 include/linux/rwsem.h        | 5 +++++
 2 files changed, 10 insertions(+), 1 deletion(-)

Comments

Peter Zijlstra May 15, 2018, 5:58 p.m. UTC | #1
On Tue, May 15, 2018 at 01:38:04PM -0400, Waiman Long wrote:
> +/*
> + * Owner value to indicate the rwsem's owner is not currently known.
> + */
> +#define RWSEM_OWNER_UNKNOWN	((struct task_struct *)-1)

It might be nice to comment that this works and relies on having that
ANON_OWNER bit set.
Waiman Long May 15, 2018, 6:02 p.m. UTC | #2
On 05/15/2018 01:58 PM, Peter Zijlstra wrote:
> On Tue, May 15, 2018 at 01:38:04PM -0400, Waiman Long wrote:
>> +/*
>> + * Owner value to indicate the rwsem's owner is not currently known.
>> + */
>> +#define RWSEM_OWNER_UNKNOWN	((struct task_struct *)-1)
> It might be nice to comment that this works and relies on having that
> ANON_OWNER bit set.

I am just trying not to expose internal working of rwsem, but I can
document that one of the bits is the real deal without specifying which one.

Cheers,
Longman
Matthew Wilcox May 15, 2018, 6:02 p.m. UTC | #3
On Tue, May 15, 2018 at 07:58:05PM +0200, Peter Zijlstra wrote:
> On Tue, May 15, 2018 at 01:38:04PM -0400, Waiman Long wrote:
> > +/*
> > + * Owner value to indicate the rwsem's owner is not currently known.
> > + */
> > +#define RWSEM_OWNER_UNKNOWN	((struct task_struct *)-1)
> 
> It might be nice to comment that this works and relies on having that
> ANON_OWNER bit set.

I'd rather change the definition to be ((struct task_struct *)2)
otherwise this is both reader-owned and anonymously-owned which doesn't
make much sense.
Amir Goldstein May 15, 2018, 6:05 p.m. UTC | #4
On Tue, May 15, 2018 at 9:02 PM, Waiman Long <longman@redhat.com> wrote:
> On 05/15/2018 01:58 PM, Peter Zijlstra wrote:
>> On Tue, May 15, 2018 at 01:38:04PM -0400, Waiman Long wrote:
>>> +/*
>>> + * Owner value to indicate the rwsem's owner is not currently known.
>>> + */
>>> +#define RWSEM_OWNER_UNKNOWN ((struct task_struct *)-1)
>> It might be nice to comment that this works and relies on having that
>> ANON_OWNER bit set.
>
> I am just trying not to expose internal working of rwsem, but I can
> document that one of the bits is the real deal without specifying which one.
>

BUILD_BUG_ON(!rwsem_has_anonymous_owner(RWSEM_OWNER_UNKNOWN))
inside rwsem.c and comment above to explain it.

Thanks,
Amir.
Peter Zijlstra May 15, 2018, 6:05 p.m. UTC | #5
On Tue, May 15, 2018 at 02:02:00PM -0400, Waiman Long wrote:
> On 05/15/2018 01:58 PM, Peter Zijlstra wrote:
> > On Tue, May 15, 2018 at 01:38:04PM -0400, Waiman Long wrote:
> >> +/*
> >> + * Owner value to indicate the rwsem's owner is not currently known.
> >> + */
> >> +#define RWSEM_OWNER_UNKNOWN	((struct task_struct *)-1)
> > It might be nice to comment that this works and relies on having that
> > ANON_OWNER bit set.
> 
> I am just trying not to expose internal working of rwsem, but I can
> document that one of the bits is the real deal without specifying which one.

Thing is, you don't want someone changing this without knowing about
that one magic bit. It doesn't hurt to be explicit here.

Also, do we want -1L instead of a -1 literal?
Peter Zijlstra May 15, 2018, 6:07 p.m. UTC | #6
On Tue, May 15, 2018 at 11:02:19AM -0700, Matthew Wilcox wrote:
> On Tue, May 15, 2018 at 07:58:05PM +0200, Peter Zijlstra wrote:
> > On Tue, May 15, 2018 at 01:38:04PM -0400, Waiman Long wrote:
> > > +/*
> > > + * Owner value to indicate the rwsem's owner is not currently known.
> > > + */
> > > +#define RWSEM_OWNER_UNKNOWN	((struct task_struct *)-1)
> > 
> > It might be nice to comment that this works and relies on having that
> > ANON_OWNER bit set.
> 
> I'd rather change the definition to be ((struct task_struct *)2)
> otherwise this is both reader-owned and anonymously-owned which doesn't
> make much sense.

Works for me.
Waiman Long May 15, 2018, 6:35 p.m. UTC | #7
On 05/15/2018 02:05 PM, Peter Zijlstra wrote:
> On Tue, May 15, 2018 at 02:02:00PM -0400, Waiman Long wrote:
>> On 05/15/2018 01:58 PM, Peter Zijlstra wrote:
>>> On Tue, May 15, 2018 at 01:38:04PM -0400, Waiman Long wrote:
>>>> +/*
>>>> + * Owner value to indicate the rwsem's owner is not currently known.
>>>> + */
>>>> +#define RWSEM_OWNER_UNKNOWN	((struct task_struct *)-1)
>>> It might be nice to comment that this works and relies on having that
>>> ANON_OWNER bit set.
>> I am just trying not to expose internal working of rwsem, but I can
>> document that one of the bits is the real deal without specifying which one.
> Thing is, you don't want someone changing this without knowing about
> that one magic bit. It doesn't hurt to be explicit here.
>
> Also, do we want -1L instead of a -1 literal?
>
Yes, I should -1L instead.

Cheers,
Longman
Waiman Long May 15, 2018, 6:45 p.m. UTC | #8
On 05/15/2018 02:02 PM, Matthew Wilcox wrote:
> On Tue, May 15, 2018 at 07:58:05PM +0200, Peter Zijlstra wrote:
>> On Tue, May 15, 2018 at 01:38:04PM -0400, Waiman Long wrote:
>>> +/*
>>> + * Owner value to indicate the rwsem's owner is not currently known.
>>> + */
>>> +#define RWSEM_OWNER_UNKNOWN	((struct task_struct *)-1)
>> It might be nice to comment that this works and relies on having that
>> ANON_OWNER bit set.
> I'd rather change the definition to be ((struct task_struct *)2)
> otherwise this is both reader-owned and anonymously-owned which doesn't
> make much sense.

Thinking about it a bit more. I can actually just use one special bit
(bit 0) to designate an unknown owner. So for a reader-owned lock, it is
just owner == 1 as the owners are unknown for a reader owned lock. For a
lock owned by an unknown writer, it is (owner & 1) && (owner != 1). That
will justify the use of -1L and save bit 1 for future extension.

Cheers,
Longman
Waiman Long May 15, 2018, 6:45 p.m. UTC | #9
On 05/15/2018 02:05 PM, Amir Goldstein wrote:
> On Tue, May 15, 2018 at 9:02 PM, Waiman Long <longman@redhat.com> wrote:
>> On 05/15/2018 01:58 PM, Peter Zijlstra wrote:
>>> On Tue, May 15, 2018 at 01:38:04PM -0400, Waiman Long wrote:
>>>> +/*
>>>> + * Owner value to indicate the rwsem's owner is not currently known.
>>>> + */
>>>> +#define RWSEM_OWNER_UNKNOWN ((struct task_struct *)-1)
>>> It might be nice to comment that this works and relies on having that
>>> ANON_OWNER bit set.
>> I am just trying not to expose internal working of rwsem, but I can
>> document that one of the bits is the real deal without specifying which one.
>>
> BUILD_BUG_ON(!rwsem_has_anonymous_owner(RWSEM_OWNER_UNKNOWN))
> inside rwsem.c and comment above to explain it.

Yes, it is a good idea. I will do that.

-Longman
Matthew Wilcox May 15, 2018, 9:21 p.m. UTC | #10
On Tue, May 15, 2018 at 02:45:12PM -0400, Waiman Long wrote:
> On 05/15/2018 02:02 PM, Matthew Wilcox wrote:
> > On Tue, May 15, 2018 at 07:58:05PM +0200, Peter Zijlstra wrote:
> >> On Tue, May 15, 2018 at 01:38:04PM -0400, Waiman Long wrote:
> >>> +/*
> >>> + * Owner value to indicate the rwsem's owner is not currently known.
> >>> + */
> >>> +#define RWSEM_OWNER_UNKNOWN	((struct task_struct *)-1)
> >> It might be nice to comment that this works and relies on having that
> >> ANON_OWNER bit set.
> > I'd rather change the definition to be ((struct task_struct *)2)
> > otherwise this is both reader-owned and anonymously-owned which doesn't
> > make much sense.
> 
> Thinking about it a bit more. I can actually just use one special bit
> (bit 0) to designate an unknown owner. So for a reader-owned lock, it is
> just owner == 1 as the owners are unknown for a reader owned lock. For a
> lock owned by an unknown writer, it is (owner & 1) && (owner != 1). That
> will justify the use of -1L and save bit 1 for future extension.

To quote from your patch:

- * In essence, the owner field now has the following 3 states:
+ * In essence, the owner field now has the following 4 states:
  *  1) 0
  *     - lock is free or the owner hasn't set the field yet
  *  2) RWSEM_READER_OWNED
  *     - lock is currently or previously owned by readers (lock is free
  *	  or not set by owner yet)
- *  3) Other non-zero value
- *     - a writer owns the lock
+ *  3) RWSEM_ANONYMOUSLY_OWNED
+ *     - lock is owned by an anonymous writer, so spinning on the lock
+ *	  owner should be disabled.
+ *  4) Other non-zero value
+ *     - a writer owns the lock and other writers can spin on the lock owner.

I'd leave these as 0, 1, 2, other.  It's not really worth messing with
testing bits.

Actually, if you change them to all be values -- s/NULL/RWSEM_NO_OWNER/

then you could define them as:

RWSEM_READER_OWNED	0
RWSEM_ANON_OWNED	1
RWSEM_NO_OWNER		2

and rwsem_should_spin() is just sem->owner > 1.
Waiman Long May 15, 2018, 9:30 p.m. UTC | #11
On 05/15/2018 05:21 PM, Matthew Wilcox wrote:
> On Tue, May 15, 2018 at 02:45:12PM -0400, Waiman Long wrote:
>> On 05/15/2018 02:02 PM, Matthew Wilcox wrote:
>>> On Tue, May 15, 2018 at 07:58:05PM +0200, Peter Zijlstra wrote:
>>>> On Tue, May 15, 2018 at 01:38:04PM -0400, Waiman Long wrote:
>>>>> +/*
>>>>> + * Owner value to indicate the rwsem's owner is not currently known.
>>>>> + */
>>>>> +#define RWSEM_OWNER_UNKNOWN	((struct task_struct *)-1)
>>>> It might be nice to comment that this works and relies on having that
>>>> ANON_OWNER bit set.
>>> I'd rather change the definition to be ((struct task_struct *)2)
>>> otherwise this is both reader-owned and anonymously-owned which doesn't
>>> make much sense.
>> Thinking about it a bit more. I can actually just use one special bit
>> (bit 0) to designate an unknown owner. So for a reader-owned lock, it is
>> just owner == 1 as the owners are unknown for a reader owned lock. For a
>> lock owned by an unknown writer, it is (owner & 1) && (owner != 1). That
>> will justify the use of -1L and save bit 1 for future extension.
> To quote from your patch:
>
> - * In essence, the owner field now has the following 3 states:
> + * In essence, the owner field now has the following 4 states:
>   *  1) 0
>   *     - lock is free or the owner hasn't set the field yet
>   *  2) RWSEM_READER_OWNED
>   *     - lock is currently or previously owned by readers (lock is free
>   *	  or not set by owner yet)
> - *  3) Other non-zero value
> - *     - a writer owns the lock
> + *  3) RWSEM_ANONYMOUSLY_OWNED
> + *     - lock is owned by an anonymous writer, so spinning on the lock
> + *	  owner should be disabled.
> + *  4) Other non-zero value
> + *     - a writer owns the lock and other writers can spin on the lock owner.
>
> I'd leave these as 0, 1, 2, other.  It's not really worth messing with
> testing bits.
>
> Actually, if you change them to all be values -- s/NULL/RWSEM_NO_OWNER/
>
> then you could define them as:
>
> RWSEM_READER_OWNED	0
> RWSEM_ANON_OWNED	1
> RWSEM_NO_OWNER		2
>
> and rwsem_should_spin() is just sem->owner > 1.

I would like to have owner equal to NULL if it is not locked. If it is
locked, the owner can be used to get information about the owner. So I
am not sure your scheme will work.

Cheers,
Longman
diff mbox

Patch

diff --git a/include/linux/percpu-rwsem.h b/include/linux/percpu-rwsem.h
index b1f37a8..79b99d6 100644
--- a/include/linux/percpu-rwsem.h
+++ b/include/linux/percpu-rwsem.h
@@ -133,7 +133,7 @@  static inline void percpu_rwsem_release(struct percpu_rw_semaphore *sem,
 	lock_release(&sem->rw_sem.dep_map, 1, ip);
 #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
 	if (!read)
-		sem->rw_sem.owner = NULL;
+		sem->rw_sem.owner = RWSEM_OWNER_UNKNOWN;
 #endif
 }
 
@@ -141,6 +141,10 @@  static inline void percpu_rwsem_acquire(struct percpu_rw_semaphore *sem,
 					bool read, unsigned long ip)
 {
 	lock_acquire(&sem->rw_sem.dep_map, 0, 1, read, 1, NULL, ip);
+#ifdef CONFIG_RWSEM_SPIN_ON_OWNER
+	if (!read)
+		sem->rw_sem.owner = current;
+#endif
 }
 
 #endif
diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h
index 56707d5..fc2ed27 100644
--- a/include/linux/rwsem.h
+++ b/include/linux/rwsem.h
@@ -44,6 +44,11 @@  struct rw_semaphore {
 #endif
 };
 
+/*
+ * Owner value to indicate the rwsem's owner is not currently known.
+ */
+#define RWSEM_OWNER_UNKNOWN	((struct task_struct *)-1)
+
 extern struct rw_semaphore *rwsem_down_read_failed(struct rw_semaphore *sem);
 extern struct rw_semaphore *rwsem_down_read_failed_killable(struct rw_semaphore *sem);
 extern struct rw_semaphore *rwsem_down_write_failed(struct rw_semaphore *sem);