diff mbox

[RFC,02/10] IB/core: Add support to finalize objects in one transaction

Message ID 1492615225-55118-3-git-send-email-matanb@mellanox.com (mailing list archive)
State RFC
Headers show

Commit Message

Matan Barak April 19, 2017, 3:20 p.m. UTC
The new ioctl based infrastructure either commits or rollbacks
all objects of the command as one transaction. In order to do
that, we introduce a notion of dealing with a collection of
objects that are related to a specific action.

This also requires adding a notion of an action and attribute.
An action contains a groups of attributes, where each group
contains several attributes.

When declaring these actions and attributes, we actually declare
their specifications. When a command is executed, we actually
allocates some space to hold auxiliary information.

Signed-off-by: Matan Barak <matanb@mellanox.com>
---
 drivers/infiniband/core/rdma_core.c | 43 ++++++++++++++++++++++
 drivers/infiniband/core/rdma_core.h | 20 ++++++++++-
 include/rdma/uverbs_ioctl.h         | 72 +++++++++++++++++++++++++++++++++++++
 3 files changed, 134 insertions(+), 1 deletion(-)

Comments

Amrani, Ram May 4, 2017, 2:50 p.m. UTC | #1
> The new ioctl based infrastructure either commits or rollbacks
> all objects of the command as one transaction. In order to do
> that, we introduce a notion of dealing with a collection of
> objects that are related to a specific action.
> 
> This also requires adding a notion of an action and attribute.
> An action contains a groups of attributes, where each group
> contains several attributes.
> 
> When declaring these actions and attributes, we actually declare
> their specifications. When a command is executed, we actually
> allocates some space to hold auxiliary information.
> 
> Signed-off-by: Matan Barak <matanb@mellanox.com>
> ---

Matan, thanks for the RFC!

If I got this correctly each object will go through three phases - get, handler, and a put.
I don't quite understand how a batch operation, like destroy QPs, can be undone after the handler phase.
I do see it working if at first multiple gets are performed and one of them fails.
In that case undoing is easy because the handlers weren't invoked yet.

In the case were some user-objects failed the operation how is this reflected upwards?

Also, I wonder, is there another intention behind batch operations except speed?


Thanks,
Ram
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Matan Barak May 7, 2017, 11 a.m. UTC | #2
On Thu, May 4, 2017 at 5:50 PM, Amrani, Ram <Ram.Amrani@cavium.com> wrote:
>> The new ioctl based infrastructure either commits or rollbacks
>> all objects of the command as one transaction. In order to do
>> that, we introduce a notion of dealing with a collection of
>> objects that are related to a specific action.
>>
>> This also requires adding a notion of an action and attribute.
>> An action contains a groups of attributes, where each group
>> contains several attributes.
>>
>> When declaring these actions and attributes, we actually declare
>> their specifications. When a command is executed, we actually
>> allocates some space to hold auxiliary information.
>>
>> Signed-off-by: Matan Barak <matanb@mellanox.com>
>> ---
>
> Matan, thanks for the RFC!
>
> If I got this correctly each object will go through three phases - get, handler, and a put.
> I don't quite understand how a batch operation, like destroy QPs, can be undone after the handler phase.
> I do see it working if at first multiple gets are performed and one of them fails.
> In that case undoing is easy because the handlers weren't invoked yet.
>
> In the case were some user-objects failed the operation how is this reflected upwards?
>
> Also, I wonder, is there another intention behind batch operations except speed?
>

It's really not recommended to batch create/destroy. The reason is
exactly what you've pointer out.
If you batch several "destroy objects" and the n'th one fail, you
can't unwind the successful ones.

So basically, we want to support a semantic which is similar to what
we have today - create a single object or destroy a single object.
In this case, the pre-handler stage locks the dependencies of this
object (for example, in create_qp you lock the pd and cq so they won't
be destroyed)
and create the uobject for the QP. The handler itself can assume the
requirements it stated in the specifications are filled and just
create the QP and
tie the uobject to the QP object. In the post-handler stage we commit
the QP's uobject and unlock the dependencies (assuming the handler
increased
the required refcounts).
Destroying an object is similar. The only different is that the
destruction itself isn't done by the handler, but in the
post-handler's code (to share this code between
regular "destroy" calls with process tear-down and hardware removal).

>
> Thanks,
> Ram

- Matan

> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Amrani, Ram May 7, 2017, 12:02 p.m. UTC | #3
> >> The new ioctl based infrastructure either commits or rollbacks

> >> all objects of the command as one transaction. In order to do

> >> that, we introduce a notion of dealing with a collection of

> >> objects that are related to a specific action.

> >>

> >> This also requires adding a notion of an action and attribute.

> >> An action contains a groups of attributes, where each group

> >> contains several attributes.

> >>

> >> When declaring these actions and attributes, we actually declare

> >> their specifications. When a command is executed, we actually

> >> allocates some space to hold auxiliary information.

> >>

> >> Signed-off-by: Matan Barak <matanb@mellanox.com>

> >> ---

> >

> > Matan, thanks for the RFC!

> >

> > If I got this correctly each object will go through three phases - get, handler, and a put.

> > I don't quite understand how a batch operation, like destroy QPs, can be undone after the handler phase.

> > I do see it working if at first multiple gets are performed and one of them fails.

> > In that case undoing is easy because the handlers weren't invoked yet.

> >

> > In the case were some user-objects failed the operation how is this reflected upwards?

> >

> > Also, I wonder, is there another intention behind batch operations except speed?

> >

> 

> It's really not recommended to batch create/destroy. The reason is

> exactly what you've pointer out.

> If you batch several "destroy objects" and the n'th one fail, you

> can't unwind the successful ones.

> 

> So basically, we want to support a semantic which is similar to what

> we have today - create a single object or destroy a single object.

> In this case, the pre-handler stage locks the dependencies of this

> object (for example, in create_qp you lock the pd and cq so they won't

> be destroyed)

> and create the uobject for the QP. The handler itself can assume the

> requirements it stated in the specifications are filled and just

> create the QP and

> tie the uobject to the QP object. In the post-handler stage we commit

> the QP's uobject and unlock the dependencies (assuming the handler

> increased

> the required refcounts).

> Destroying an object is similar. The only different is that the

> destruction itself isn't done by the handler, but in the

> post-handler's code (to share this code between

> regular "destroy" calls with process tear-down and hardware removal).

 
I understand there are two sets of objects here. Let's make sure I'm not confusing them.

(1) A collection of user-objects passed via ioctl. This is indicated in the first paragraph.
But as you indicated now we shouldn't support this. So why (apparently) do we?

(2) A collection of user-objects that should be locked for the creation/deletion/modification
of another that was requested via ioctl.
In this case the handler doesn't need to be invoked at all for the collection.
We can easily roll back the "get" phase, if failed during.
We don't expect the "put" phase to fail, if it will fail for some reason. Then this won't really
be handled as the handler was already invoked.

Thanks,
Ram
Matan Barak May 7, 2017, 12:39 p.m. UTC | #4
On Sun, May 7, 2017 at 3:02 PM, Amrani, Ram <Ram.Amrani@cavium.com> wrote:
>> >> The new ioctl based infrastructure either commits or rollbacks
>> >> all objects of the command as one transaction. In order to do
>> >> that, we introduce a notion of dealing with a collection of
>> >> objects that are related to a specific action.
>> >>
>> >> This also requires adding a notion of an action and attribute.
>> >> An action contains a groups of attributes, where each group
>> >> contains several attributes.
>> >>
>> >> When declaring these actions and attributes, we actually declare
>> >> their specifications. When a command is executed, we actually
>> >> allocates some space to hold auxiliary information.
>> >>
>> >> Signed-off-by: Matan Barak <matanb@mellanox.com>
>> >> ---
>> >
>> > Matan, thanks for the RFC!
>> >
>> > If I got this correctly each object will go through three phases - get, handler, and a put.
>> > I don't quite understand how a batch operation, like destroy QPs, can be undone after the handler phase.
>> > I do see it working if at first multiple gets are performed and one of them fails.
>> > In that case undoing is easy because the handlers weren't invoked yet.
>> >
>> > In the case were some user-objects failed the operation how is this reflected upwards?
>> >
>> > Also, I wonder, is there another intention behind batch operations except speed?
>> >
>>
>> It's really not recommended to batch create/destroy. The reason is
>> exactly what you've pointer out.
>> If you batch several "destroy objects" and the n'th one fail, you
>> can't unwind the successful ones.
>>
>> So basically, we want to support a semantic which is similar to what
>> we have today - create a single object or destroy a single object.
>> In this case, the pre-handler stage locks the dependencies of this
>> object (for example, in create_qp you lock the pd and cq so they won't
>> be destroyed)
>> and create the uobject for the QP. The handler itself can assume the
>> requirements it stated in the specifications are filled and just
>> create the QP and
>> tie the uobject to the QP object. In the post-handler stage we commit
>> the QP's uobject and unlock the dependencies (assuming the handler
>> increased
>> the required refcounts).
>> Destroying an object is similar. The only different is that the
>> destruction itself isn't done by the handler, but in the
>> post-handler's code (to share this code between
>> regular "destroy" calls with process tear-down and hardware removal).
>
> I understand there are two sets of objects here. Let's make sure I'm not confusing them.
>
> (1) A collection of user-objects passed via ioctl. This is indicated in the first paragraph.
> But as you indicated now we shouldn't support this. So why (apparently) do we?
>
> (2) A collection of user-objects that should be locked for the creation/deletion/modification
> of another that was requested via ioctl.
> In this case the handler doesn't need to be invoked at all for the collection.
> We can easily roll back the "get" phase, if failed during.
> We don't expect the "put" phase to fail, if it will fail for some reason. Then this won't really
> be handled as the handler was already invoked.
>

The infrastructure is agnostic to whether the objects are common or
driver specific.
It actually gives you a (hopefully) convenient way to invoke verbs
handlers in the kernel.
Each handler is a function which could have some arguments. However,
we can't pass kernel pointers
from user-space to kernel and we can't trust the user-space from
executing two calls concurrently that
could use an object and destroy it at the same time. Currently
(current infrastructure), when you write such a handler, you need
to open code this yourself - map ids to objects and lock them.

What we propose here is to have some additional info to the handler.
This info could be thought as the function's deceleration.
This additional info makes the infrastructure validates syntactically
your attributes, map them to the actual kernel pointers and lock them.
Since the kernel developer writes this "additional info"
(specification), it can make sure only one "DESTROY" or "CREATE"
object exists per a specification (to avoid the behavior you
mentioned).

A command handling always consists of 3 stages: pre, handler and post
(for all handlers).
So, overall, if you have a create based command, the "pre" stage
creates the uobject and locks its dependencies. If the handler fails,
this is totally reversible (unlock dependencies and destroy the
uobject).
If you have a destroy command, the "pre" stage locks the uobject for
exclusive access. If the handler fails, it just unlocks the object. If
it's successful, the "post" stage actually destroys it.
In other commands, the "pre" stage just locks the uobjects and
obviously it's reversible.

I hope that answers your questions.

> Thanks,
> Ram
>
>

- Matan
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Amrani, Ram May 7, 2017, 2:29 p.m. UTC | #5
> The infrastructure is agnostic to whether the objects are common or

> driver specific.

> It actually gives you a (hopefully) convenient way to invoke verbs

> handlers in the kernel.

> Each handler is a function which could have some arguments. However,

> we can't pass kernel pointers

> from user-space to kernel and we can't trust the user-space from

> executing two calls concurrently that

> could use an object and destroy it at the same time. Currently

> (current infrastructure), when you write such a handler, you need

> to open code this yourself - map ids to objects and lock them.

> 

> What we propose here is to have some additional info to the handler.

> This info could be thought as the function's deceleration.

> This additional info makes the infrastructure validates syntactically

> your attributes, map them to the actual kernel pointers and lock them.

> Since the kernel developer writes this "additional info"

> (specification), it can make sure only one "DESTROY" or "CREATE"

> object exists per a specification (to avoid the behavior you

> mentioned).

> 

> A command handling always consists of 3 stages: pre, handler and post

> (for all handlers).

> So, overall, if you have a create based command, the "pre" stage

> creates the uobject and locks its dependencies. If the handler fails,

> this is totally reversible (unlock dependencies and destroy the

> uobject).

> If you have a destroy command, the "pre" stage locks the uobject for

> exclusive access. If the handler fails, it just unlocks the object. If

> it's successful, the "post" stage actually destroys it.

> In other commands, the "pre" stage just locks the uobjects and

> obviously it's reversible.

> 

> I hope that answers your questions.

> 


I'm in sync with you.
Thanks for further explaining.

- Ram
diff mbox

Patch

diff --git a/drivers/infiniband/core/rdma_core.c b/drivers/infiniband/core/rdma_core.c
index 269fa7f..78ffd8c 100644
--- a/drivers/infiniband/core/rdma_core.c
+++ b/drivers/infiniband/core/rdma_core.c
@@ -676,3 +676,46 @@  int uverbs_finalize_object(struct ib_uobject *uobj,
 
 	return ret;
 }
+
+int uverbs_finalize_objects(struct uverbs_attr_array *attr_array,
+			    size_t num,
+			    const struct uverbs_action *action,
+			    bool commit)
+{
+	unsigned int i;
+	int ret = 0;
+
+	for (i = 0; i < num; i++) {
+		struct uverbs_attr_array *attr_spec_array = &attr_array[i];
+		const struct uverbs_attr_spec_group *attr_spec_group =
+			action->attr_groups[i];
+		unsigned int j;
+
+		for (j = 0; j < attr_spec_array->num_attrs; j++) {
+			struct uverbs_attr *attr = &attr_spec_array->attrs[j];
+			struct uverbs_attr_spec *spec = &attr_spec_group->attrs[j];
+
+			if (!uverbs_is_valid(attr_spec_array, j))
+				continue;
+
+			if (spec->type == UVERBS_ATTR_TYPE_IDR ||
+			    spec->type == UVERBS_ATTR_TYPE_FD) {
+				int current_ret;
+
+				/*
+				 * refcounts should be handled at the object
+				 * level and not at the uobject level. Refcounts
+				 * of the objects themselves are done in
+				 * handlers.
+				 */
+				current_ret = uverbs_finalize_object(attr->obj_attr.uobject,
+								     spec->obj.access,
+								     commit);
+				if (!ret)
+					ret = current_ret;
+			}
+		}
+	}
+	return ret;
+}
+
diff --git a/drivers/infiniband/core/rdma_core.h b/drivers/infiniband/core/rdma_core.h
index 5a1da24..0aebc47 100644
--- a/drivers/infiniband/core/rdma_core.h
+++ b/drivers/infiniband/core/rdma_core.h
@@ -81,7 +81,7 @@ 
  * the object is from the given type. Lock it to the required access.
  * This function could create (access == NEW) or destroy (access == DESTROY)
  * objects if required. The action will be finalized only when
- * uverbs_finalize_object is called.
+ * uverbs_finalize_object or uverbs_finalize_objects is called.
  */
 struct ib_uobject *uverbs_get_uobject_from_context(const struct uverbs_obj_type *type_attrs,
 						   struct ib_ucontext *ucontext,
@@ -90,5 +90,23 @@  struct ib_uobject *uverbs_get_uobject_from_context(const struct uverbs_obj_type
 int uverbs_finalize_object(struct ib_uobject *uobj,
 			   enum uverbs_idr_access access,
 			   bool commit);
+/*
+ * Note that certain finalize stages could return a status:
+ *   (a) alloc_commit could return a failure if the object is committed at the
+ *       same time when the context is destroyed.
+ *   (b) remove_commit could fail if the object wasn't destroyed successfully.
+ * Since multiple objects could be finalized in one transaction, it is very NOT
+ * recommended to have several finalize actions which have side affects.
+ * For example, it's NOT recommended to have a certain action which has both
+ * a commit action and a destroy action or two destroy objects in the same
+ * action. The rule of thumb is to have one destroy or commit action with
+ * multiple lookups.
+ * The first non zero return value of finalize_object is returned from this
+ * function.
+ */
+int uverbs_finalize_objects(struct uverbs_attr_array *attr_array,
+			    size_t num,
+			    const struct uverbs_action *action,
+			    bool success);
 
 #endif /* RDMA_CORE_H */
diff --git a/include/rdma/uverbs_ioctl.h b/include/rdma/uverbs_ioctl.h
index a18468e..1f84f30 100644
--- a/include/rdma/uverbs_ioctl.h
+++ b/include/rdma/uverbs_ioctl.h
@@ -41,6 +41,12 @@ 
  * =======================================
  */
 
+enum uverbs_attr_type {
+	UVERBS_ATTR_TYPE_NA,
+	UVERBS_ATTR_TYPE_IDR,
+	UVERBS_ATTR_TYPE_FD,
+};
+
 enum uverbs_idr_access {
 	UVERBS_ACCESS_READ,
 	UVERBS_ACCESS_WRITE,
@@ -48,5 +54,71 @@  enum uverbs_idr_access {
 	UVERBS_ACCESS_DESTROY
 };
 
+struct uverbs_attr_spec {
+	enum uverbs_attr_type		type;
+	union {
+		u16				len;
+		struct {
+			u16			obj_type;
+			u8			access;
+		} obj;
+	};
+};
+
+struct uverbs_attr_spec_group {
+	struct uverbs_attr_spec		*attrs;
+	size_t				num_attrs;
+};
+
+struct uverbs_action {
+	const struct uverbs_attr_spec_group		**attr_groups;
+	size_t						num_groups;
+};
+
+/* =================================================
+ *              Parsing infrastructure
+ * =================================================
+ */
+
+struct uverbs_fd_attr {
+	int		fd;
+};
+
+struct uverbs_uobj_attr {
+	/*  idr handle */
+	u32	idr;
+};
+
+struct uverbs_obj_attr {
+	/* pointer to the kernel descriptor -> type, access, etc */
+	struct ib_uverbs_attr __user	*uattr;
+	const struct uverbs_type_alloc_action	*type;
+	struct ib_uobject		*uobject;
+	union {
+		struct uverbs_fd_attr		fd;
+		struct uverbs_uobj_attr		uobj;
+	};
+};
+
+struct uverbs_attr {
+	union {
+		struct uverbs_obj_attr	obj_attr;
+	};
+};
+
+/* output of one validator */
+struct uverbs_attr_array {
+	unsigned long *valid_bitmap;
+	size_t num_attrs;
+	/* arrays of attrubytes, index is the id i.e SEND_CQ */
+	struct uverbs_attr *attrs;
+};
+
+static inline bool uverbs_is_valid(const struct uverbs_attr_array *attr_array,
+				   unsigned int idx)
+{
+	return test_bit(idx, attr_array->valid_bitmap);
+}
+
 #endif