diff mbox

[RFC,rdma-core,4/5] mlx5: Add support for ibv_parent domain and its related verbs

Message ID 1510522903-6838-5-git-send-email-yishaih@mellanox.com (mailing list archive)
State RFC
Headers show

Commit Message

Yishai Hadas Nov. 12, 2017, 9:41 p.m. UTC
This patch is the initial implementation of ibv_alloc/dealloc_parent
domain verbs in the mlx5 driver.

The driver uses internally mlx5_pd structure which its prefix holds
the legacy ibv_pd so that the return ibv_pd can be used with any verb
around that gets a protection domain.

In addition, the driver saves an indication whether this ibv_pd is a
parent domain and as such may hold a thread domain and a pointer to a
previously allocated protection domain.

In downstream patches the mlx5 driver will consider upon getting an
ibv_pd which fields should be used based on the above indicator.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
---
 providers/mlx5/mlx5.c  |  2 ++
 providers/mlx5/mlx5.h  |  9 +++++++++
 providers/mlx5/verbs.c | 23 +++++++++++++++++++++++
 3 files changed, 34 insertions(+)

Comments

Jason Gunthorpe Nov. 13, 2017, 8:03 p.m. UTC | #1
On Sun, Nov 12, 2017 at 11:41:42PM +0200, Yishai Hadas wrote:

> @@ -319,6 +319,9 @@ struct mlx5_buf {
>  struct mlx5_pd {
>  	struct ibv_pd			ibv_pd;
>  	uint32_t			pdn;
> +	int 				is_parent_domain;
> +	struct ibv_td			*td;
> +	struct ibv_pd			*protection_domain;

is_parent_domain can just be protection_domain != NULL

> +struct ibv_pd *mlx5_alloc_parent_domain(struct ibv_context *context,
> +			struct ibv_parent_domain_init_attr *attr)
> +{
> +	struct mlx5_pd	*pd;
> +
> +	pd = calloc(1, sizeof *pd);
> +	if (!pd)
> +		return NULL;
> +
> +	pd->is_parent_domain = 1;
> +	pd->td = attr->td;
> +	pd->protection_domain= attr->pd;

Don't we need some kind of ref counting here?

What is the intention for the final version of this patch?

Are you going to do
  pd->ibv_pd = *attr->pd;

During create?

Or change every call site very roughly like like..

inline struct mlx5_pd *resolve_pd(struct ibv_pd *pd)
{
   if (pd->protection_domain)
      return pd->protection_domain;
   return pd;
}

mlx5_foo(struct ibv_pd *arg_pd)
{
    struct mlx5_pd *pd = resolve_pd(arg_pd);

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yishai Hadas Nov. 14, 2017, 10:29 a.m. UTC | #2
On 11/13/2017 10:03 PM, Jason Gunthorpe wrote:
> On Sun, Nov 12, 2017 at 11:41:42PM +0200, Yishai Hadas wrote:
> 
>> @@ -319,6 +319,9 @@ struct mlx5_buf {
>>   struct mlx5_pd {
>>   	struct ibv_pd			ibv_pd;
>>   	uint32_t			pdn;
>> +	int 				is_parent_domain;
>> +	struct ibv_td			*td;
>> +	struct ibv_pd			*protection_domain;
> 
> is_parent_domain can just be protection_domain != NULL

It's not enough, there might be cases that protection_domain is NULL but 
still the object is a parent domain as of in the CQ case where td may be 
applicable but not the protection_domain. Better having here clear 
indication which is set upon the creation API rather than some semantic.

>> +struct ibv_pd *mlx5_alloc_parent_domain(struct ibv_context *context,
>> +			struct ibv_parent_domain_init_attr *attr)
>> +{
>> +	struct mlx5_pd	*pd;
>> +
>> +	pd = calloc(1, sizeof *pd);
>> +	if (!pd)
>> +		return NULL;
>> +
>> +	pd->is_parent_domain = 1;
>> +	pd->td = attr->td;
>> +	pd->protection_domain= attr->pd;
> 
> Don't we need some kind of ref counting here?
> 

We thought about it as well, however reference counting can't prevent 
race conditions when more than one thread uses the API.
We expect the application to work as the API is documented. (e.g. 
dealloc_parent_domain() first and then its internal protection domain 
and thread domain.

Similar behavior exists also in current code which puts the 
responsibility on the application to correctly use the API instead of 
using some reference counting /locks inside the libraries.

For example if one thread is calling alloc_pd() and then create_qp() and 
second thread will call in parallel to dealloc_pd() and it will run 
before the create_qp() will ended the application might oops in the user 
area when ibv_pd will be used post its free as part of the command.

> What is the intention for the final version of this patch?

The next patch of this RFC around QP creation introduces the expected usage.

Look at:
static void mlx5_get_domains(struct ibv_pd *pd, struct ibv_pd
                             **protection_domain, struct ibv_td **td)

It gets an ibv_pd and returns both the protection domain and the thread 
domain. This helper function will be used in other mlx5 verbs around 
that gets an ibv_pd. In the final series there may be a separate patch 
for that post this patch that introduces the parent domain.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe Nov. 14, 2017, 7:15 p.m. UTC | #3
On Tue, Nov 14, 2017 at 12:29:14PM +0200, Yishai Hadas wrote:

> It's not enough, there might be cases that protection_domain is NULL but
> still the object is a parent domain as of in the CQ case where td may be
> applicable but not the protection_domain. Better having here clear
> indication which is set upon the creation API rather than some semantic.

Ugh, the idea of a ibv_pd without a ibv_pd is a ugly. That is a whole
bunch more checking and failure scenarios.

I think I'd rather have the CQ receive a PD it doesn't need than do
that..

> >>+struct ibv_pd *mlx5_alloc_parent_domain(struct ibv_context *context,
> >>+			struct ibv_parent_domain_init_attr *attr)

> >>+	pd->protection_domain= attr->pd;
> >
> >Don't we need some kind of ref counting here?
> >
> 
> We thought about it as well, however reference counting can't prevent race
> conditions when more than one thread uses the API.

Not really concerned about races. More concerned about helping the
user use it properly..

> Similar behavior exists also in current code which puts the responsibility
> on the application to correctly use the API instead of using some reference
> counting /locks inside the libraries.

We have refcounts in the kernel side for many things and generate
errors if the user does stuff in the wrong order.

> >What is the intention for the final version of this patch?
> 
> The next patch of this RFC around QP creation introduces the
> expected usage.

Well, this transformation would have to ultimately be done in this
patch or before.

Looking more, this isn't good enough, at the very least I think you
need to memcpy the struct ibv_pd portion so context and handle are
set, as those are public ABIs.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/providers/mlx5/mlx5.c b/providers/mlx5/mlx5.c
index 0c073e2..b160373 100644
--- a/providers/mlx5/mlx5.c
+++ b/providers/mlx5/mlx5.c
@@ -1011,6 +1011,8 @@  static int mlx5_init_context(struct verbs_device *vdev,
 	verbs_set_ctx_op(v_ctx, post_srq_ops, mlx5_post_srq_ops);
 	verbs_set_ctx_op(v_ctx, alloc_td, mlx5_alloc_td);
 	verbs_set_ctx_op(v_ctx, dealloc_td, mlx5_dealloc_td);
+	verbs_set_ctx_op(v_ctx, alloc_parent_domain, mlx5_alloc_parent_domain);
+	verbs_set_ctx_op(v_ctx, dealloc_parent_domain, mlx5_dealloc_parent_domain);
 
 	memset(&device_attr, 0, sizeof(device_attr));
 	if (!mlx5_query_device_ex(ctx, NULL, &device_attr,
diff --git a/providers/mlx5/mlx5.h b/providers/mlx5/mlx5.h
index da0aa91..f9b83ad 100644
--- a/providers/mlx5/mlx5.h
+++ b/providers/mlx5/mlx5.h
@@ -319,6 +319,9 @@  struct mlx5_buf {
 struct mlx5_pd {
 	struct ibv_pd			ibv_pd;
 	uint32_t			pdn;
+	int 				is_parent_domain;
+	struct ibv_td			*td;
+	struct ibv_pd			*protection_domain;
 };
 
 struct mlx5_td {
@@ -766,6 +769,12 @@  int mlx5_post_srq_ops(struct ibv_srq *srq,
 struct ibv_td *mlx5_alloc_td(struct ibv_context *context);
 int mlx5_dealloc_td(struct ibv_td *td);
 
+struct ibv_pd *mlx5_alloc_parent_domain(struct ibv_context *context,
+					struct ibv_parent_domain_init_attr *attr);
+
+int mlx5_dealloc_parent_domain(struct ibv_pd *parent_domain);
+
+
 static inline void *mlx5_find_uidx(struct mlx5_context *ctx, uint32_t uidx)
 {
 	int tind = uidx >> MLX5_UIDX_TABLE_SHIFT;
diff --git a/providers/mlx5/verbs.c b/providers/mlx5/verbs.c
index 8cf14c0..13844b3 100644
--- a/providers/mlx5/verbs.c
+++ b/providers/mlx5/verbs.c
@@ -196,6 +196,29 @@  int mlx5_dealloc_td(struct ibv_td *ib_td)
 	return 0;
 }
 
+struct ibv_pd *mlx5_alloc_parent_domain(struct ibv_context *context,
+			struct ibv_parent_domain_init_attr *attr)
+{
+	struct mlx5_pd	*pd;
+
+	pd = calloc(1, sizeof *pd);
+	if (!pd)
+		return NULL;
+
+	pd->is_parent_domain = 1;
+	pd->td = attr->td;
+	pd->protection_domain= attr->pd;
+
+	pd->ibv_pd.context = context;
+	return &pd->ibv_pd;
+}
+
+int mlx5_dealloc_parent_domain(struct ibv_pd *parent_domain)
+{
+	free(to_mpd(parent_domain));
+	return 0;
+}
+
 int mlx5_free_pd(struct ibv_pd *pd)
 {
 	int ret;