
[2/7] IB/rxe: Disable completion upcalls when a CQ is destroyed

Message ID: 1500989968-30889-3-git-send-email-andrew.boyer@dell.com (mailing list archive)
State: Changes Requested

Commit Message

Andrew Boyer July 25, 2017, 1:39 p.m. UTC
This prevents the stack from accessing userspace objects while they
are being torn down.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Andrew Boyer <andrew.boyer@dell.com>
---
 drivers/infiniband/sw/rxe/rxe_cq.c    | 19 +++++++++++++++++++
 drivers/infiniband/sw/rxe/rxe_loc.h   |  2 ++
 drivers/infiniband/sw/rxe/rxe_verbs.c |  2 ++
 drivers/infiniband/sw/rxe/rxe_verbs.h |  1 +
 4 files changed, 24 insertions(+)

Comments

Moni Shoua July 27, 2017, 9:35 a.m. UTC | #1
On Tue, Jul 25, 2017 at 4:39 PM, Andrew Boyer <andrew.boyer@dell.com> wrote:
> This prevents the stack from accessing userspace objects while they
> are being torn down.
>
> Fixes: 8700e3e7c485 ("Soft RoCE driver")
> Signed-off-by: Andrew Boyer <andrew.boyer@dell.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_cq.c    | 19 +++++++++++++++++++
>  drivers/infiniband/sw/rxe/rxe_loc.h   |  2 ++
>  drivers/infiniband/sw/rxe/rxe_verbs.c |  2 ++
>  drivers/infiniband/sw/rxe/rxe_verbs.h |  1 +
>  4 files changed, 24 insertions(+)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_cq.c b/drivers/infiniband/sw/rxe/rxe_cq.c
> index 49fe42c..c4aabf7 100644
> --- a/drivers/infiniband/sw/rxe/rxe_cq.c
> +++ b/drivers/infiniband/sw/rxe/rxe_cq.c
> @@ -69,6 +69,14 @@ int rxe_cq_chk_attr(struct rxe_dev *rxe, struct rxe_cq *cq,
>  static void rxe_send_complete(unsigned long data)
>  {
>         struct rxe_cq *cq = (struct rxe_cq *)data;
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&cq->cq_lock, flags);
> +       if (cq->is_dying) {
> +               spin_unlock_irqrestore(&cq->cq_lock, flags);
> +               return;
> +       }
> +       spin_unlock_irqrestore(&cq->cq_lock, flags);
What if the CQ is destroyed here, after you pass the is_dying test?
Maybe you should think of a solution based on ref counting.
>         cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
>  }
Andrew Boyer July 27, 2017, 1:19 p.m. UTC | #2
On 7/27/17, 5:35 AM, Moni Shoua <monis@mellanox.com> (via monisonlists@gmail.com) wrote:

>On Tue, Jul 25, 2017 at 4:39 PM, Andrew Boyer <andrew.boyer@dell.com>
>wrote:
>> This prevents the stack from accessing userspace objects while they
>> are being torn down.
>>
>> Fixes: 8700e3e7c485 ("Soft RoCE driver")
>> Signed-off-by: Andrew Boyer <andrew.boyer@dell.com>
>> ---
>>  drivers/infiniband/sw/rxe/rxe_cq.c    | 19 +++++++++++++++++++
>>  drivers/infiniband/sw/rxe/rxe_loc.h   |  2 ++
>>  drivers/infiniband/sw/rxe/rxe_verbs.c |  2 ++
>>  drivers/infiniband/sw/rxe/rxe_verbs.h |  1 +
>>  4 files changed, 24 insertions(+)
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_cq.c
>>b/drivers/infiniband/sw/rxe/rxe_cq.c
>> index 49fe42c..c4aabf7 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_cq.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_cq.c
>> @@ -69,6 +69,14 @@ int rxe_cq_chk_attr(struct rxe_dev *rxe, struct
>>rxe_cq *cq,
>>  static void rxe_send_complete(unsigned long data)
>>  {
>>         struct rxe_cq *cq = (struct rxe_cq *)data;
>> +       unsigned long flags;
>> +
>> +       spin_lock_irqsave(&cq->cq_lock, flags);
>> +       if (cq->is_dying) {
>> +               spin_unlock_irqrestore(&cq->cq_lock, flags);
>> +               return;
>> +       }
>> +       spin_unlock_irqrestore(&cq->cq_lock, flags);
>What if the CQ is destroyed here, after you pass the is_dying test?
>Maybe you should think of a solution based on ref counting.
>>         cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
>>  }

Hello Moni,
Thank you for all of the reviews. I'll address commit messages etc. in a
revised series.

This is the situation that causes a crash here:
 - Userspace program exits
 - ib_uverbs_cleanup_ucontext() runs, calling ib_destroy_qp(),
ib_destroy_cq(), etc. and releasing/freeing the UCQ
   - The QP still has tasklets running, so it isn't destroyed yet
   - The CQ is referenced (twice) by the QP, so the CQ isn't destroyed yet
   - The UCQ is kfree()'d!
 - A send work request completes
 - rxe_send_complete() calls cq->ibcq.comp_handler()
 - ib_uverbs_comp_handler() runs and crashes; the event queue is checked
for is_closed, but it has no way to check the ib_ucq_object

As you can see, the reference counting on the CQ doesn't protect us.
There's no interface I could find that would deregister the UCQ from the
CQ. I didn't think adding reference counting to the UCQ would be a good
way to go, since the solution I posted above is so much simpler (if hacky).

It looks like ib_uverbs_cleanup_ucontext() is gone in 4.12. I don't know if
whatever replaced it addresses this issue already, by accident or by
design.

Does this make sense? Do you have a better idea for a fix?

Thank you,
Andrew

P.S. Sorry for the Outlook garbage formatting.
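
If the residual window described above (a tasklet instance that has already
passed the is_dying check when the CQ is being destroyed) needs to be closed
completely, one possible hardening, sketched here as an assumption rather
than as part of this series, would be to also flush the completion tasklet in
rxe_cq_disable(). tasklet_kill() can block, which is acceptable because
rxe_destroy_cq() runs in process context.

/* Possible hardening, not part of the posted series: after marking the CQ
 * as dying, wait for any in-flight completion tasklet to finish so that
 * the upcall cannot race with the rest of the teardown.  tasklet_kill()
 * is declared in <linux/interrupt.h>.
 */
void rxe_cq_disable(struct rxe_cq *cq)
{
	unsigned long flags;

	spin_lock_irqsave(&cq->cq_lock, flags);
	cq->is_dying = true;
	spin_unlock_irqrestore(&cq->cq_lock, flags);

	tasklet_kill(&cq->comp_task);
}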


Patch

diff --git a/drivers/infiniband/sw/rxe/rxe_cq.c b/drivers/infiniband/sw/rxe/rxe_cq.c
index 49fe42c..c4aabf7 100644
--- a/drivers/infiniband/sw/rxe/rxe_cq.c
+++ b/drivers/infiniband/sw/rxe/rxe_cq.c
@@ -69,6 +69,14 @@  int rxe_cq_chk_attr(struct rxe_dev *rxe, struct rxe_cq *cq,
 static void rxe_send_complete(unsigned long data)
 {
 	struct rxe_cq *cq = (struct rxe_cq *)data;
+	unsigned long flags;
+
+	spin_lock_irqsave(&cq->cq_lock, flags);
+	if (cq->is_dying) {
+		spin_unlock_irqrestore(&cq->cq_lock, flags);
+		return;
+	}
+	spin_unlock_irqrestore(&cq->cq_lock, flags);
 
 	cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
 }
@@ -97,6 +105,8 @@  int rxe_cq_from_init(struct rxe_dev *rxe, struct rxe_cq *cq, int cqe,
 	if (udata)
 		cq->is_user = 1;
 
+	cq->is_dying = false;
+
 	tasklet_init(&cq->comp_task, rxe_send_complete, (unsigned long)cq);
 
 	spin_lock_init(&cq->cq_lock);
@@ -156,6 +166,15 @@  int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
 	return 0;
 }
 
+void rxe_cq_disable(struct rxe_cq *cq)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&cq->cq_lock, flags);
+	cq->is_dying = true;
+	spin_unlock_irqrestore(&cq->cq_lock, flags);
+}
+
 void rxe_cq_cleanup(struct rxe_pool_entry *arg)
 {
 	struct rxe_cq *cq = container_of(arg, typeof(*cq), pelem);
diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
index d6299ed..64f8fa1 100644
--- a/drivers/infiniband/sw/rxe/rxe_loc.h
+++ b/drivers/infiniband/sw/rxe/rxe_loc.h
@@ -64,6 +64,8 @@  int rxe_cq_from_init(struct rxe_dev *rxe, struct rxe_cq *cq, int cqe,
 
 int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited);
 
+void rxe_cq_disable(struct rxe_cq *cq);
+
 void rxe_cq_cleanup(struct rxe_pool_entry *arg);
 
 /* rxe_mcast.c */
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index af90a7d..02a39e8 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -960,6 +960,8 @@  static int rxe_destroy_cq(struct ib_cq *ibcq)
 {
 	struct rxe_cq *cq = to_rcq(ibcq);
 
+	rxe_cq_disable(cq);
+
 	rxe_drop_ref(cq);
 	return 0;
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index 5a180fb..b09a9e2 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -89,6 +89,7 @@  struct rxe_cq {
 	struct rxe_queue	*queue;
 	spinlock_t		cq_lock;
 	u8			notify;
+	bool			is_dying;
 	int			is_user;
 	struct tasklet_struct	comp_task;
 };