[03/10] locking: bring back lglocks

Message ID 20180518074918.13816-7-kent.overstreet@gmail.com (mailing list archive)
State New, archived

Commit Message

Kent Overstreet May 18, 2018, 7:49 a.m. UTC
bcachefs makes use of them - also, add a proper lg_lock_init()

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
---
 include/linux/lglock.h  |  97 +++++++++++++++++++++++++++++++++++++
 kernel/locking/Makefile |   1 +
 kernel/locking/lglock.c | 105 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 203 insertions(+)
 create mode 100644 include/linux/lglock.h
 create mode 100644 kernel/locking/lglock.c

Comments

Peter Zijlstra May 18, 2018, 9:51 a.m. UTC | #1
On Fri, May 18, 2018 at 03:49:04AM -0400, Kent Overstreet wrote:
> bcachefs makes use of them - also, add a proper lg_lock_init()

Why?! lglocks are horrid things, we got rid of them for a reason. They
have terrifying worst case preemption off latencies.

Why can't you use something like per-cpu rwsems?
Kent Overstreet May 18, 2018, 10:13 a.m. UTC | #2
On Fri, May 18, 2018 at 11:51:02AM +0200, Peter Zijlstra wrote:
> On Fri, May 18, 2018 at 03:49:04AM -0400, Kent Overstreet wrote:
> > bcachefs makes use of them - also, add a proper lg_lock_init()
> 
> Why?! lglocks are horrid things, we got rid of them for a reason. They
> have terrifying worst case preemption off latencies.

Ah. That was missing from your commit message.

> Why can't you use something like per-cpu rwsems?

Well,

 a) in my use case, lg_global_lock() pretty much isn't used in normal operation;
    it's only called when starting mark and sweep gc (which is no longer needed,
    is disabled by default, and will eventually get rolled into online fsck)
    and for device resize

 b) I'm using it in conjunction with percpu counters, and while technically I
    certainly _could_ use per-cpu sleepable locks (mutexes would make more sense
    for me than rwsems), there's a bit of a clash there: it's going to be ugly
    and messy, and it's more work for me. (this_cpu_ptr() no longer makes any
    sense in that case, so it'd mean auditing and converting all the code that
    touches the relevant data structures.)

If you're really that dead set against lglocks I might just come up with a new
lock with similar semantics that doesn't break this_cpu_ptr() but sleeps if the
global lock is held...
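
To illustrate the pattern in (b) -- a rough sketch only, with made-up names,
not the actual bcachefs code:

#include <linux/lglock.h>
#include <linux/percpu.h>

struct disk_stats {
	u64	sectors_dirty;
	u64	sectors_cached;
};

static DEFINE_STATIC_LGLOCK(stats_lock);
static DEFINE_PER_CPU(struct disk_stats, stats);

static void stats_update(u64 dirty, u64 cached)
{
	struct disk_stats *s;

	lg_local_lock(&stats_lock);	/* disables preemption */
	s = this_cpu_ptr(&stats);	/* safe: we can't migrate */
	s->sectors_dirty  += dirty;
	s->sectors_cached += cached;
	lg_local_unlock(&stats_lock);
}

/* slow path (gc, device resize): exclude all local lockers, then sum */
static u64 stats_sum_dirty(void)
{
	u64 sum = 0;
	int cpu;

	lg_global_lock(&stats_lock);
	for_each_possible_cpu(cpu)
		sum += per_cpu_ptr(&stats, cpu)->sectors_dirty;
	lg_global_unlock(&stats_lock);
	return sum;
}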
Peter Zijlstra May 18, 2018, 11:03 a.m. UTC | #3
On Fri, May 18, 2018 at 06:13:53AM -0400, Kent Overstreet wrote:
> On Fri, May 18, 2018 at 11:51:02AM +0200, Peter Zijlstra wrote:
> > On Fri, May 18, 2018 at 03:49:04AM -0400, Kent Overstreet wrote:
> > > bcachefs makes use of them - also, add a proper lg_lock_init()
> > 
> > Why?! lglocks are horrid things, we got rid of them for a reason. They
> > have terrifying worst case preemption off latencies.
> 
> Ah. That was missing from your commit message.

Yeah, sorry, sometimes it's hard to state what is obvious to oneself :/

> > Why can't you use something like per-cpu rwsems?
> 
> Well,
> 
>  a) in my use case, lg_global_lock() pretty much isn't used in normal operation;
>     it's only called when starting mark and sweep gc (which is no longer needed,
>     is disabled by default, and will eventually get rolled into online fsck)
>     and for device resize
> 
>  b) I'm using it in conjunction with percpu counters, and while technically I
>     certainly _could_ use per-cpu sleepable locks (mutexes would make more sense
>     for me than rwsems), there's a bit of a clash there: it's going to be ugly
>     and messy, and it's more work for me. (this_cpu_ptr() no longer makes any
>     sense in that case, so it'd mean auditing and converting all the code that
>     touches the relevant data structures.)

Well, lg is a reader-writer style lock by definition, as you want
concurrency on the local and full exclusion against the global, so I'm
not sure how mutexes fit into this.

In any case, have a look at percpu_down_read_preempt_disable() and
percpu_up_read_preempt_enable(); they're a bit of a hack but they should
work for you I think.

They will sleep at down_read, but the entire actual critical section
will run with preemption disabled -- therefore it had better be short and
bounded, and the RT guys will thank you for not using spinlock_t under
it (use raw_spinlock_t if you have to).

The (global) writer side will block and be preemptible like normal.

> If you're really that dead set against lglocks I might just come up with a new
> lock with similar semantics that doesn't break this_cpu_ptr() but sleeps if the
> global lock is held...

See above, we already have this ;-)
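
Something like this (an illustrative sketch; the counter and names are
made up):

#include <linux/percpu-rwsem.h>
#include <linux/percpu.h>

static DEFINE_STATIC_PERCPU_RWSEM(mark_lock);
static DEFINE_PER_CPU(u64, nr_dirty);	/* hypothetical percpu counter */

static void fast_path(u64 v)
{
	/* may sleep here, while a writer holds the sem ... */
	percpu_down_read_preempt_disable(&mark_lock);
	/* ... but the critical section itself runs preemption-off,
	 * so this_cpu_*() stays valid: */
	this_cpu_add(nr_dirty, v);
	percpu_up_read_preempt_enable(&mark_lock);
}

static void slow_path(void)	/* gc / device resize */
{
	percpu_down_write(&mark_lock);	/* blocks, fully preemptible */
	/* ... exclusive access to all CPUs' state ... */
	percpu_up_write(&mark_lock);
}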
Kent Overstreet May 18, 2018, 11:39 a.m. UTC | #4
On Fri, May 18, 2018 at 01:03:39PM +0200, Peter Zijlstra wrote:
> On Fri, May 18, 2018 at 06:13:53AM -0400, Kent Overstreet wrote:
> > On Fri, May 18, 2018 at 11:51:02AM +0200, Peter Zijlstra wrote:
> > > On Fri, May 18, 2018 at 03:49:04AM -0400, Kent Overstreet wrote:
> > > > bcachefs makes use of them - also, add a proper lg_lock_init()
> > > 
> > > Why?! lglocks are horrid things, we got rid of them for a reason. They
> > > have terrifying worst case preemption off latencies.
> > 
> > Ah. That was missing from your commit message.
> 
> Yeah, sorry, sometimes it's hard to state what is obvious to oneself :/
> 
> > > Why can't you use something like per-cpu rwsems?
> > 
> > Well,
> > 
> >  a) in my use case, lg_global_lock() pretty much isn't used in normal operation;
> >     it's only called when starting mark and sweep gc (which is no longer needed,
> >     is disabled by default, and will eventually get rolled into online fsck)
> >     and for device resize
> > 
> >  b) I'm using it in conjunction with percpu counters, and while technically I
> >     certainly _could_ use per-cpu sleepable locks (mutexes would make more sense
> >     for me than rwsems), there's a bit of a clash there: it's going to be ugly
> >     and messy, and it's more work for me. (this_cpu_ptr() no longer makes any
> >     sense in that case, so it'd mean auditing and converting all the code that
> >     touches the relevant data structures.)
> 
> Well, lg is a reader-writer style lock by definition, as you want
> concurrency on the local and full exclusion against the global, so I'm
> not sure how mutexes fit into this.
> 
> In any case, have a look at percpu_down_read_preempt_disable() and
> percpu_up_read_preempt_enable(); they're a bit of a hack but they should
> work for you I think.
> 
> They will sleep at down_read, but the entire actual critical section
> will run with preemption disabled -- therefore it had better be short and
> bounded, and the RT guys will thank you for not using spinlock_t under
> it (use raw_spinlock_t if you have to).
> 
> The (global) writer side will block and be preemptible like normal.
> 
> > If you're really that dead set against lglocks I might just come up with a new
> > lock with similar semantics that doesn't break this_cpu_ptr() but sleeps if the
> > global lock is held...
> 
> See above, we already have this ;-)

Ok, I think this might work. I'll have to stare at it awhile and make sure I
remember everything I'm currently depending on the lglock for...

Patch

diff --git a/include/linux/lglock.h b/include/linux/lglock.h
new file mode 100644
index 0000000000..c1fbe64bd2
--- /dev/null
+++ b/include/linux/lglock.h
@@ -0,0 +1,97 @@ 
+/*
+ * Specialised local-global spinlock. Can only be declared as global variables
+ * to avoid overhead and keep things simple (and we don't want to start using
+ * these inside dynamically allocated structures).
+ *
+ * "local/global locks" (lglocks) can be used to:
+ *
+ * - Provide fast exclusive access to per-CPU data, with exclusive access to
+ *   another CPU's data allowed but possibly subject to contention, and to
+ *   provide very slow exclusive access to all per-CPU data.
+ * - Or to provide very fast and scalable read serialisation, and to provide
+ *   very slow exclusive serialisation of data (not necessarily per-CPU data).
+ *
+ * Brlocks are also implemented as a short-hand notation for the latter use
+ * case.
+ *
+ * Copyright 2009, 2010, Nick Piggin, Novell Inc.
+ */
+#ifndef __LINUX_LGLOCK_H
+#define __LINUX_LGLOCK_H
+
+#include <linux/spinlock.h>
+#include <linux/lockdep.h>
+#include <linux/percpu.h>
+#include <linux/cpu.h>
+#include <linux/notifier.h>
+
+#ifdef CONFIG_SMP
+
+struct lglock {
+	arch_spinlock_t __percpu *lock;
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	struct lockdep_map    dep_map;
+#endif
+};
+
+#define DEFINE_LGLOCK(name)						\
+	static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock)		\
+	= __ARCH_SPIN_LOCK_UNLOCKED;					\
+	struct lglock name = { .lock = &name ## _lock }
+
+#define DEFINE_STATIC_LGLOCK(name)					\
+	static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock)		\
+	= __ARCH_SPIN_LOCK_UNLOCKED;					\
+	static struct lglock name = { .lock = &name ## _lock }
+
+static inline void lg_lock_free(struct lglock *lg)
+{
+	free_percpu(lg->lock);
+}
+
+#define lg_lock_lockdep_init(lock)					\
+do {									\
+	static struct lock_class_key __key;				\
+									\
+	lockdep_init_map(&(lock)->dep_map, #lock, &__key, 0);		\
+} while (0)
+
+static inline int __lg_lock_init(struct lglock *lg)
+{
+	lg->lock = alloc_percpu(*lg->lock);
+	return lg->lock ? 0 : -ENOMEM;
+}
+
+#define lg_lock_init(lock)						\
+({									\
+	lg_lock_lockdep_init(lock);					\
+	__lg_lock_init(lock);						\
+})
+
+void lg_local_lock(struct lglock *lg);
+void lg_local_unlock(struct lglock *lg);
+void lg_local_lock_cpu(struct lglock *lg, int cpu);
+void lg_local_unlock_cpu(struct lglock *lg, int cpu);
+
+void lg_double_lock(struct lglock *lg, int cpu1, int cpu2);
+void lg_double_unlock(struct lglock *lg, int cpu1, int cpu2);
+
+void lg_global_lock(struct lglock *lg);
+void lg_global_unlock(struct lglock *lg);
+
+#else
+/* When !CONFIG_SMP, map lglock to spinlock */
+#define lglock spinlock
+#define DEFINE_LGLOCK(name) DEFINE_SPINLOCK(name)
+#define DEFINE_STATIC_LGLOCK(name) static DEFINE_SPINLOCK(name)
+#define lg_lock_init(lg)	({ spin_lock_init(lg); 0; })
+#define lg_lock_free(lg)	do {} while (0)
+#define lg_local_lock spin_lock
+#define lg_local_unlock spin_unlock
+#define lg_local_lock_cpu(lg, cpu) spin_lock(lg)
+#define lg_local_unlock_cpu(lg, cpu) spin_unlock(lg)
+#define lg_global_lock spin_lock
+#define lg_global_unlock spin_unlock
+#endif
+
+#endif
diff --git a/kernel/locking/Makefile b/kernel/locking/Makefile
index 392c7f23af..e5bb62823d 100644
--- a/kernel/locking/Makefile
+++ b/kernel/locking/Makefile
@@ -19,6 +19,7 @@  obj-$(CONFIG_LOCKDEP) += lockdep_proc.o
 endif
 obj-$(CONFIG_SMP) += spinlock.o
 obj-$(CONFIG_LOCK_SPIN_ON_OWNER) += osq_lock.o
+obj-$(CONFIG_SMP) += lglock.o
 obj-$(CONFIG_PROVE_LOCKING) += spinlock.o
 obj-$(CONFIG_QUEUED_SPINLOCKS) += qspinlock.o
 obj-$(CONFIG_RT_MUTEXES) += rtmutex.o
diff --git a/kernel/locking/lglock.c b/kernel/locking/lglock.c
new file mode 100644
index 0000000000..051feaccc4
--- /dev/null
+++ b/kernel/locking/lglock.c
@@ -0,0 +1,105 @@ 
+/* See include/linux/lglock.h for description */
+#include <linux/module.h>
+#include <linux/lglock.h>
+#include <linux/cpu.h>
+#include <linux/string.h>
+
+/*
+ * Note there is no uninit, so lglocks cannot be defined in
+ * modules (but it's fine to use them from there)
+ * Could be added though, just undo lg_lock_init
+ */
+
+void lg_local_lock(struct lglock *lg)
+{
+	arch_spinlock_t *lock;
+
+	preempt_disable();
+	lock_acquire_shared(&lg->dep_map, 0, 0, NULL, _RET_IP_);
+	lock = this_cpu_ptr(lg->lock);
+	arch_spin_lock(lock);
+}
+EXPORT_SYMBOL(lg_local_lock);
+
+void lg_local_unlock(struct lglock *lg)
+{
+	arch_spinlock_t *lock;
+
+	lock_release(&lg->dep_map, 1, _RET_IP_);
+	lock = this_cpu_ptr(lg->lock);
+	arch_spin_unlock(lock);
+	preempt_enable();
+}
+EXPORT_SYMBOL(lg_local_unlock);
+
+void lg_local_lock_cpu(struct lglock *lg, int cpu)
+{
+	arch_spinlock_t *lock;
+
+	preempt_disable();
+	lock_acquire_shared(&lg->dep_map, 0, 0, NULL, _RET_IP_);
+	lock = per_cpu_ptr(lg->lock, cpu);
+	arch_spin_lock(lock);
+}
+EXPORT_SYMBOL(lg_local_lock_cpu);
+
+void lg_local_unlock_cpu(struct lglock *lg, int cpu)
+{
+	arch_spinlock_t *lock;
+
+	lock_release(&lg->dep_map, 1, _RET_IP_);
+	lock = per_cpu_ptr(lg->lock, cpu);
+	arch_spin_unlock(lock);
+	preempt_enable();
+}
+EXPORT_SYMBOL(lg_local_unlock_cpu);
+
+void lg_double_lock(struct lglock *lg, int cpu1, int cpu2)
+{
+	BUG_ON(cpu1 == cpu2);
+
+	/* lock in cpu order, just like lg_global_lock */
+	if (cpu2 < cpu1)
+		swap(cpu1, cpu2);
+
+	preempt_disable();
+	lock_acquire_shared(&lg->dep_map, 0, 0, NULL, _RET_IP_);
+	arch_spin_lock(per_cpu_ptr(lg->lock, cpu1));
+	arch_spin_lock(per_cpu_ptr(lg->lock, cpu2));
+}
+
+void lg_double_unlock(struct lglock *lg, int cpu1, int cpu2)
+{
+	lock_release(&lg->dep_map, 1, _RET_IP_);
+	arch_spin_unlock(per_cpu_ptr(lg->lock, cpu1));
+	arch_spin_unlock(per_cpu_ptr(lg->lock, cpu2));
+	preempt_enable();
+}
+
+void lg_global_lock(struct lglock *lg)
+{
+	int i;
+
+	preempt_disable();
+	lock_acquire_exclusive(&lg->dep_map, 0, 0, NULL, _RET_IP_);
+	for_each_possible_cpu(i) {
+		arch_spinlock_t *lock;
+		lock = per_cpu_ptr(lg->lock, i);
+		arch_spin_lock(lock);
+	}
+}
+EXPORT_SYMBOL(lg_global_lock);
+
+void lg_global_unlock(struct lglock *lg)
+{
+	int i;
+
+	lock_release(&lg->dep_map, 1, _RET_IP_);
+	for_each_possible_cpu(i) {
+		arch_spinlock_t *lock;
+		lock = per_cpu_ptr(lg->lock, i);
+		arch_spin_unlock(lock);
+	}
+	preempt_enable();
+}
+EXPORT_SYMBOL(lg_global_unlock);
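
For reference, a minimal usage sketch of the API added above (hypothetical
caller, not part of the patch):

#include <linux/lglock.h>

static DEFINE_STATIC_LGLOCK(example_lock);

static void fast_path(void)
{
	/* Takes only this CPU's spinlock; preemption is disabled for the
	 * duration, so keep the critical section short. */
	lg_local_lock(&example_lock);
	/* ... touch this CPU's data ... */
	lg_local_unlock(&example_lock);
}

static void slow_path(void)
{
	/* Takes every CPU's spinlock in order -- O(nr_cpus) acquisitions
	 * with preemption off the whole time, hence the worst-case
	 * latency concern raised in the thread. */
	lg_global_lock(&example_lock);
	/* ... touch any CPU's data ... */
	lg_global_unlock(&example_lock);
}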