[v2,07/10] mmap locking API: add mmap_read_release() and mmap_read_unlock_non_owner()

Message ID 20200327021058.221911-8-walken@google.com (mailing list archive)
State New, archived
Series Add a new mmap locking API wrapping mmap_sem calls

Commit Message

Michel Lespinasse March 27, 2020, 2:10 a.m. UTC
Add a couple of APIs to allow splitting mmap_read_unlock() into two calls:
- mmap_read_release(), called by the task that had taken the mmap lock;
- mmap_read_unlock_non_owner(), called from a work queue.

These APIs are used by kernel/bpf/stackmap.c only.

Signed-off-by: Michel Lespinasse <walken@google.com>
---
 include/linux/mmap_lock.h | 10 ++++++++++
 kernel/bpf/stackmap.c     |  9 ++++-----
 2 files changed, 14 insertions(+), 5 deletions(-)
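
To make the split easier to picture, here is a minimal userspace sketch.
It is a toy model under stated assumptions, not kernel code: every toy_*
name is hypothetical, "readers" stands in for the real rwsem state, and
"tracked" stands in for lockdep's view of which task holds the lock.

```c
/*
 * Userspace toy model, NOT kernel code. All toy_* names are hypothetical
 * stand-ins: "readers" plays the role of the real rwsem state and
 * "tracked" plays the role of lockdep's view of who holds the lock.
 */
#include <assert.h>

struct toy_mm {
	int readers;	/* real lock state: read holders still counted */
	int tracked;	/* lockdep-like view: holds attributed to a task */
};

/* Take the read lock: both the lock state and the tracking change. */
void toy_mmap_read_lock(struct toy_mm *mm)
{
	mm->readers++;
	mm->tracked++;
}

/* Ordinary unlock by the owner: tracking and lock drop together. */
void toy_mmap_read_unlock(struct toy_mm *mm)
{
	mm->tracked--;
	mm->readers--;
}

/* Half 1, run by the task that took the lock: drop the tracking only. */
void toy_mmap_read_release(struct toy_mm *mm)
{
	mm->tracked--;
}

/* Half 2, run later from a different context: drop the real lock. */
void toy_mmap_read_unlock_non_owner(struct toy_mm *mm)
{
	mm->readers--;
}

/* Walk through the split sequence; returns 0 if every step checks out. */
int toy_split_unlock_demo(void)
{
	struct toy_mm mm = { 0, 0 };
	struct toy_mm *deferred;	/* what the queued work would carry */

	toy_mmap_read_lock(&mm);
	deferred = &mm;			/* like work->mm = current->mm */
	toy_mmap_read_release(&mm);	/* owner is done, as far as tracking goes */
	assert(mm.readers == 1 && mm.tracked == 0);	/* still really locked */

	toy_mmap_read_unlock_non_owner(deferred);	/* deferred context */
	assert(mm.readers == 0 && mm.tracked == 0);
	return 0;
}
```

Between the two halves the lock is still held for real but no longer
attributed to the original task, which is the window the stackmap code
relies on when it queues the unlock.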

Comments

Davidlohr Bueso March 27, 2020, 4:46 a.m. UTC | #1
On Thu, 26 Mar 2020, Michel Lespinasse wrote:

>Add a couple APIs to allow splitting mmap_read_unlock() into two calls:
>- mmap_read_release(), called by the task that had taken the mmap lock;
>- mmap_read_unlock_non_owner(), called from a work queue.
>
>These apis are used by kernel/bpf/stackmap.c only.

I'm not crazy about the idea of generalizing such calls into an mm API.
We try to stay away from non-owner semantics in locking - granted
the IS_ENABLED(CONFIG_PREEMPT_RT) warning, but still.

Could this give future users the wrong impression? What about just
using rwsem calls directly in bpf?

Thanks,
Davidlohr

Michel Lespinasse March 27, 2020, 5:09 a.m. UTC | #2
On Thu, Mar 26, 2020 at 9:48 PM Davidlohr Bueso <dave@stgolabs.net> wrote:
>
> On Thu, 26 Mar 2020, Michel Lespinasse wrote:
>
> >Add a couple APIs to allow splitting mmap_read_unlock() into two calls:
> >- mmap_read_release(), called by the task that had taken the mmap lock;
> >- mmap_read_unlock_non_owner(), called from a work queue.
> >
> >These apis are used by kernel/bpf/stackmap.c only.
>
> I'm not crazy about the idea generalizing such calls into an mm api.
> We try to stay away from non-owner semantics in locking - granted
> the IS_ENABLED(CONFIG_PREEMPT_RT) warning, but still.
>
> Could this give future users the wrong impression? What about just
> using rwsem calls directly in bpf?

I see what you mean, and I certainly don't want to encourage any new
non-owner call sites to appear. This bpf stackmap site is a small
pain point in my larger range locking patchset too.

I am not sure what the proper response to it is; the opposite side of
your argument could be that using a direct rwsem call there hides the
issue and makes it less likely for someone to fix it? I don't have a
very strong opinion on this, as I think it can be argued either way...

But at a minimum, I think it'd be worth adding a comment asking people
not to add new call sites for the mmap_read_release() and
mmap_read_unlock_non_owner() APIs?
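
As a rough illustration of what such a comment might look like, here is a
hedged sketch. The comment wording is hypothetical and not part of the
posted patch. The struct definitions and the bodies of rwsem_release()
and up_read_non_owner() are toy userspace stand-ins added only so the
fragment compiles on its own; the two wrappers are copied from the patch.

```c
/*
 * Hypothetical sketch of the proposed comment, not part of the posted patch.
 * The struct definitions and the two helper bodies below are toy userspace
 * stand-ins so the fragment compiles on its own; the real ones live in the
 * kernel's rwsem and lockdep code.
 */
struct lockdep_map { int held; };
struct rw_semaphore { int readers; struct lockdep_map dep_map; };
struct mm_struct { struct rw_semaphore mmap_sem; };

static inline void rwsem_release(struct lockdep_map *map, unsigned long ip)
{
	(void)ip;
	map->held--;		/* stand-in: lockdep forgets the hold */
}

static inline void up_read_non_owner(struct rw_semaphore *sem)
{
	sem->readers--;		/* stand-in: the real read unlock */
}

/*
 * mmap_read_release() and mmap_read_unlock_non_owner() exist only for
 * kernel/bpf/stackmap.c, which hands the final unlock to an irq_work.
 * Non-owner unlocks are discouraged: please do not add new call sites.
 */
static inline void mmap_read_release(struct mm_struct *mm, unsigned long ip)
{
	rwsem_release(&mm->mmap_sem.dep_map, ip);
}

static inline void mmap_read_unlock_non_owner(struct mm_struct *mm)
{
	up_read_non_owner(&mm->mmap_sem);
}
```

Keeping the warning directly above the two wrappers puts it where anyone
adding a new call site would have to read it first.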

Patch

diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index 40a972a26857..00d6cc02581d 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -62,6 +62,16 @@  static inline void mmap_read_unlock(struct mm_struct *mm)
 	up_read(&mm->mmap_sem);
 }
 
+static inline void mmap_read_release(struct mm_struct *mm, unsigned long ip)
+{
+	rwsem_release(&mm->mmap_sem.dep_map, ip);
+}
+
+static inline void mmap_read_unlock_non_owner(struct mm_struct *mm)
+{
+	up_read_non_owner(&mm->mmap_sem);
+}
+
 static inline bool mmap_is_locked(struct mm_struct *mm)
 {
 	return rwsem_is_locked(&mm->mmap_sem) != 0;
diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index f2115f691577..413b512a99eb 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -33,7 +33,7 @@  struct bpf_stack_map {
 /* irq_work to run up_read() for build_id lookup in nmi context */
 struct stack_map_irq_work {
 	struct irq_work irq_work;
-	struct rw_semaphore *sem;
+	struct mm_struct *mm;
 };
 
 static void do_up_read(struct irq_work *entry)
@@ -41,8 +41,7 @@  static void do_up_read(struct irq_work *entry)
 	struct stack_map_irq_work *work;
 
 	work = container_of(entry, struct stack_map_irq_work, irq_work);
-	up_read_non_owner(work->sem);
-	work->sem = NULL;
+	mmap_read_unlock_non_owner(work->mm);
 }
 
 static DEFINE_PER_CPU(struct stack_map_irq_work, up_read_work);
@@ -332,14 +331,14 @@  static void stack_map_get_build_id_offset(struct bpf_stack_build_id *id_offs,
 	if (!work) {
 		mmap_read_unlock(current->mm);
 	} else {
-		work->sem = &current->mm->mmap_sem;
+		work->mm = current->mm;
 		irq_work_queue(&work->irq_work);
 		/*
 		 * The irq_work will release the mmap_sem with
 		 * up_read_non_owner(). The rwsem_release() is called
 		 * here to release the lock from lockdep's perspective.
 		 */
-		rwsem_release(&current->mm->mmap_sem.dep_map, _RET_IP_);
+		mmap_read_release(current->mm, _RET_IP_);
 	}
 }