[rdma-rc] RDMA/restrack: don't use uaccess_kernel()

Message ID eac01039be8c0518e37928e5b003a911c32e53a7.1518468700.git.swise@opengridcomputing.com (mailing list archive)
State Changes Requested
Delegated to: Jason Gunthorpe
Commit Message

Steve Wise Feb. 8, 2018, 8:09 p.m. UTC
uaccess_kernel() isn't sufficient to determine whether an rdma resource is
user-mode or not.  For example, resources allocated in the add_one()
function of an ib_client get falsely labeled as user mode when they
are kernel mode allocations, e.g. mad qps.

The result is that these qps are skipped over during an nldev query
because of an erroneous namespace mismatch.

So now we determine if the resource is user-mode by looking at the object
struct's uobject or similar pointer to know if it was allocated for user
mode applications.

Fixes: 02d8883f520e ("RDMA/restrack: Add general infrastructure to track RDMA resources")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
---
 drivers/infiniband/core/restrack.c | 36 ++++++++++++++++++++++++++++++++++--
 1 file changed, 34 insertions(+), 2 deletions(-)
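
The uobject check described in the commit message can be illustrated outside
the kernel. The following is only a minimal userspace sketch: the mock_*
types and mock_res_is_user() are hypothetical stand-ins, not the real
ib_pd/ib_cq definitions, and container_of() is re-implemented locally because
the kernel header is not available in userspace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, trimmed-down mock types; not the kernel definitions. */
enum mock_restrack_type { MOCK_RESTRACK_PD, MOCK_RESTRACK_CQ };

struct mock_restrack_entry {
	enum mock_restrack_type type;
};

struct mock_pd {
	void *uobject;	/* assumed non-NULL only for user-mode allocations */
	struct mock_restrack_entry res;
};

struct mock_cq {
	void *uobject;
	struct mock_restrack_entry res;
};

/*
 * Decide user vs. kernel from the owning object rather than from the
 * calling context: recover the outer struct from the embedded entry,
 * then test its uobject pointer.
 */
static bool mock_res_is_user(struct mock_restrack_entry *res)
{
	assert(res != NULL);

	switch (res->type) {
	case MOCK_RESTRACK_PD:
		return container_of(res, struct mock_pd, res)->uobject != NULL;
	case MOCK_RESTRACK_CQ:
		return container_of(res, struct mock_cq, res)->uobject != NULL;
	}
	return false;
}
```

A mock_pd whose uobject is NULL is reported as a kernel resource no matter
which process context performs the call, which is the property the commit
message is after.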

Comments

Leon Romanovsky Feb. 13, 2018, 9:01 a.m. UTC | #1
On Thu, Feb 08, 2018 at 12:09:43PM -0800, Steve Wise wrote:
> uaccess_kernel() isn't sufficient to determine if an rdma resource is
> user-mode or not.  For example, resources allocated in the add_one()
> function of an ib_client get falsely labeled as user mode, when they
> are kernel mode allocations.  EG: mad qps.

I found the reason.

It is due to the difference between code compiled as
a module (=m) and code compiled in as a monolith (=y). The "add_one()" runs in
user space context in the first scenario and in kernel space context in the
second scenario. I'm not using modules so it worked for me.

Thanks,
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Steve Wise Feb. 13, 2018, 3:10 p.m. UTC | #2
> 
> On Thu, Feb 08, 2018 at 12:09:43PM -0800, Steve Wise wrote:
> > uaccess_kernel() isn't sufficient to determine if an rdma resource is
> > user-mode or not.  For example, resources allocated in the add_one()
> > function of an ib_client get falsely labeled as user mode, when they
> > are kernel mode allocations.  EG: mad qps.
> 
> I found the reason.
> 
> It is due to the difference between code compiled as
> a module (=m) or compiled in as monolith (=y). The "add_one()" runs in user
> space contexts in first scenario and as kernel space context in second
> scenario. I'm not using modules so it worked for me.
> 
> Thanks,
> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>

Good to know. Thanks!

Steve.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe Feb. 15, 2018, 9:58 p.m. UTC | #3
On Thu, Feb 08, 2018 at 12:09:43PM -0800, Steve Wise wrote:
> +	case RDMA_RESTRACK_QP:
> +		qp = container_of(res, struct ib_qp, res);
> +		if (qp->pd)
> +			is_user = qp->pd->uobject;

?? Why is this like this?

struct ib_qp has a uobject, why do we need to look at the PD?

Jason
Steve Wise Feb. 15, 2018, 10:04 p.m. UTC | #4
> 
> On Thu, Feb 08, 2018 at 12:09:43PM -0800, Steve Wise wrote:
> > +	case RDMA_RESTRACK_QP:
> > +		qp = container_of(res, struct ib_qp, res);
> > +		if (qp->pd)
> > +			is_user = qp->pd->uobject;
> 
> ?? Why is this like this?
> 
> struct ib_qp has a uobject, why do we need to look at the PD?

I wasn't getting the correct user/kernel setting when I used qp->uobject.
Perhaps I should revisit this.

Steve.

Jason Gunthorpe Feb. 15, 2018, 10:41 p.m. UTC | #5
On Thu, Feb 15, 2018 at 04:04:02PM -0600, Steve Wise wrote:
> > 
> > On Thu, Feb 08, 2018 at 12:09:43PM -0800, Steve Wise wrote:
> > > +	case RDMA_RESTRACK_QP:
> > > +		qp = container_of(res, struct ib_qp, res);
> > > +		if (qp->pd)
> > > +			is_user = qp->pd->uobject;
> > 
> > ?? Why is this like this?
> > 
> > struct ib_qp has a uobject, why do we need to look at the PD?
> 
> I wasn't getting the correct user/kernel setting when I used qp->uobject.
> Perhaps I should revisit this.

Oh that answer certainly makes me nervous.. Let's explain that please
and either change it or add a big fat comment here about why it has to
be like this.

Thanks,
Jason
Steve Wise Feb. 16, 2018, 3:47 p.m. UTC | #6
> > > On Thu, Feb 08, 2018 at 12:09:43PM -0800, Steve Wise wrote:
> > > > +	case RDMA_RESTRACK_QP:
> > > > +		qp = container_of(res, struct ib_qp, res);
> > > > +		if (qp->pd)
> > > > +			is_user = qp->pd->uobject;
> > >
> > > ?? Why is this like this?
> > >
> > > struct ib_qp has a uobject, why do we need to look at the PD?
> >
> > I wasn't getting the correct user/kernel setting when I used qp->uobject.
> > Perhaps I should revisit this.
> 
> Oh that answer certainly makes me nervous.. Let's explain that please
> and either change it or add a big fat comment here about why it has to
> be like this.
> 

I see the problem:  in uverbs_cmd.c:create_qp(), the call to _ib_create_qp()
happens before qp->uobject is initialized.  I'll move the initialization of
qp->uobject to before the call to _ib_create_qp(), test it out, and send a
new patch.

Thanks Jason!

Steve.
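
The ordering problem described in this reply can be reproduced with a small
userspace sketch. Everything below is hypothetical: mock_qp, mock_track() and
the two mock_create_*() helpers only model the idea that the user/kernel
decision is sampled once, at tracking time, so a uobject assigned afterwards
is never seen. It assumes the tracking step runs inside the creation call and
does not mirror the real _ib_create_qp() signature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mock of a QP; not the kernel's struct ib_qp. */
struct mock_qp {
	void *uobject;		/* assumed non-NULL for user-mode QPs */
	bool tracked_as_user;	/* decision recorded at tracking time */
};

/* Stand-in for resource tracking: samples uobject exactly once. */
static void mock_track(struct mock_qp *qp)
{
	assert(qp != NULL);
	qp->tracked_as_user = (qp->uobject != NULL);
}

/* Ordering described as the problem: track first, set uobject later. */
static void mock_create_track_first(struct mock_qp *qp, void *uobj)
{
	qp->uobject = NULL;
	mock_track(qp);		/* sees NULL, records "kernel" */
	qp->uobject = uobj;	/* too late for the tracker */
}

/* Proposed ordering: set uobject first, then track. */
static void mock_create_set_first(struct mock_qp *qp, void *uobj)
{
	qp->uobject = uobj;
	mock_track(qp);		/* sees the user object, records "user" */
}
```

With the first helper a user QP ends up recorded as a kernel resource even
though its uobject is eventually set. That matches the symptom reported
earlier in the thread when qp->uobject was used directly.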

Patch

diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c
index 857637b..83bce7e 100644
--- a/drivers/infiniband/core/restrack.c
+++ b/drivers/infiniband/core/restrack.c
@@ -7,7 +7,6 @@ 
 #include <rdma/restrack.h>
 #include <linux/mutex.h>
 #include <linux/sched/task.h>
-#include <linux/uaccess.h>
 #include <linux/pid_namespace.h>
 
 void rdma_restrack_init(struct rdma_restrack_root *res)
@@ -93,6 +92,39 @@  static struct ib_device *res_to_dev(struct rdma_restrack_entry *res)
 	return dev;
 }
 
+static bool res_is_user(struct rdma_restrack_entry *res)
+{
+	enum rdma_restrack_type type = res->type;
+	struct ib_xrcd *xrcd;
+	struct ib_pd *pd;
+	struct ib_cq *cq;
+	struct ib_qp *qp;
+	bool is_user = false;
+
+	switch (type) {
+	case RDMA_RESTRACK_PD:
+		pd = container_of(res, struct ib_pd, res);
+		is_user = pd->uobject;
+		break;
+	case RDMA_RESTRACK_CQ:
+		cq = container_of(res, struct ib_cq, res);
+		is_user = cq->uobject;
+		break;
+	case RDMA_RESTRACK_QP:
+		qp = container_of(res, struct ib_qp, res);
+		if (qp->pd)
+			is_user = qp->pd->uobject;
+		break;
+	case RDMA_RESTRACK_XRCD:
+		xrcd = container_of(res, struct ib_xrcd, res);
+		is_user = xrcd->inode;
+		break;
+	default:
+		WARN_ONCE(true, "Wrong resource tracking type %u\n", type);
+	}
+	return is_user;
+}
+
 void rdma_restrack_add(struct rdma_restrack_entry *res)
 {
 	struct ib_device *dev = res_to_dev(res);
@@ -100,7 +132,7 @@  void rdma_restrack_add(struct rdma_restrack_entry *res)
 	if (!dev)
 		return;
 
-	if (!uaccess_kernel()) {
+	if (res_is_user(res)) {
 		get_task_struct(current);
 		res->task = current;
 		res->kern_name = NULL;