Message ID | 1492615225-55118-3-git-send-email-matanb@mellanox.com (mailing list archive) |
---|---|
State | RFC |
Headers | show |
> The new ioctl based infrastructure either commits or rollbacks > all objects of the command as one transaction. In order to do > that, we introduce a notion of dealing with a collection of > objects that are related to a specific action. > > This also requires adding a notion of an action and attribute. > An action contains a groups of attributes, where each group > contains several attributes. > > When declaring these actions and attributes, we actually declare > their specifications. When a command is executed, we actually > allocates some space to hold auxiliary information. > > Signed-off-by: Matan Barak <matanb@mellanox.com> > --- Matan, thanks for the RFC! If I got this correctly each object will go through three phases - get, handler, and a put. I don't quite understand how a batch operation, like destroy QPs, can be undone after the handler phase. I do see it working if at first multiple gets are performed and one of them fails. In that case undoing is easy because the handlers weren't invoked yet. In the case were some user-objects failed the operation how is this reflected upwards? Also, I wonder, is there another intention behind batch operations except speed? Thanks, Ram -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, May 4, 2017 at 5:50 PM, Amrani, Ram <Ram.Amrani@cavium.com> wrote: >> The new ioctl based infrastructure either commits or rollbacks >> all objects of the command as one transaction. In order to do >> that, we introduce a notion of dealing with a collection of >> objects that are related to a specific action. >> >> This also requires adding a notion of an action and attribute. >> An action contains a groups of attributes, where each group >> contains several attributes. >> >> When declaring these actions and attributes, we actually declare >> their specifications. When a command is executed, we actually >> allocates some space to hold auxiliary information. >> >> Signed-off-by: Matan Barak <matanb@mellanox.com> >> --- > > Matan, thanks for the RFC! > > If I got this correctly each object will go through three phases - get, handler, and a put. > I don't quite understand how a batch operation, like destroy QPs, can be undone after the handler phase. > I do see it working if at first multiple gets are performed and one of them fails. > In that case undoing is easy because the handlers weren't invoked yet. > > In the case were some user-objects failed the operation how is this reflected upwards? > > Also, I wonder, is there another intention behind batch operations except speed? > It's really not recommended to batch create/destroy. The reason is exactly what you've pointer out. If you batch several "destroy objects" and the n'th one fail, you can't unwind the successful ones. So basically, we want to support a semantic which is similar to what we have today - create a single object or destroy a single object. In this case, the pre-handler stage locks the dependencies of this object (for example, in create_qp you lock the pd and cq so they won't be destroyed) and create the uobject for the QP. The handler itself can assume the requirements it stated in the specifications are filled and just create the QP and tie the uobject to the QP object. In the post-handler stage we commit the QP's uobject and unlock the dependencies (assuming the handler increased the required refcounts). Destroying an object is similar. The only different is that the destruction itself isn't done by the handler, but in the post-handler's code (to share this code between regular "destroy" calls with process tear-down and hardware removal). > > Thanks, > Ram - Matan > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> >> The new ioctl based infrastructure either commits or rollbacks > >> all objects of the command as one transaction. In order to do > >> that, we introduce a notion of dealing with a collection of > >> objects that are related to a specific action. > >> > >> This also requires adding a notion of an action and attribute. > >> An action contains a groups of attributes, where each group > >> contains several attributes. > >> > >> When declaring these actions and attributes, we actually declare > >> their specifications. When a command is executed, we actually > >> allocates some space to hold auxiliary information. > >> > >> Signed-off-by: Matan Barak <matanb@mellanox.com> > >> --- > > > > Matan, thanks for the RFC! > > > > If I got this correctly each object will go through three phases - get, handler, and a put. > > I don't quite understand how a batch operation, like destroy QPs, can be undone after the handler phase. > > I do see it working if at first multiple gets are performed and one of them fails. > > In that case undoing is easy because the handlers weren't invoked yet. > > > > In the case were some user-objects failed the operation how is this reflected upwards? > > > > Also, I wonder, is there another intention behind batch operations except speed? > > > > It's really not recommended to batch create/destroy. The reason is > exactly what you've pointer out. > If you batch several "destroy objects" and the n'th one fail, you > can't unwind the successful ones. > > So basically, we want to support a semantic which is similar to what > we have today - create a single object or destroy a single object. > In this case, the pre-handler stage locks the dependencies of this > object (for example, in create_qp you lock the pd and cq so they won't > be destroyed) > and create the uobject for the QP. The handler itself can assume the > requirements it stated in the specifications are filled and just > create the QP and > tie the uobject to the QP object. In the post-handler stage we commit > the QP's uobject and unlock the dependencies (assuming the handler > increased > the required refcounts). > Destroying an object is similar. The only different is that the > destruction itself isn't done by the handler, but in the > post-handler's code (to share this code between > regular "destroy" calls with process tear-down and hardware removal). I understand there are two sets of objects here. Let's make sure I'm not confusing them. (1) A collection of user-objects passed via ioctl. This is indicated in the first paragraph. But as you indicated now we shouldn't support this. So why (apparently) do we? (2) A collection of user-objects that should be locked for the creation/deletion/modification of another that was requested via ioctl. In this case the handler doesn't need to be invoked at all for the collection. We can easily roll back the "get" phase, if failed during. We don't expect the "put" phase to fail, if it will fail for some reason. Then this won't really be handled as the handler was already invoked. Thanks, Ram
On Sun, May 7, 2017 at 3:02 PM, Amrani, Ram <Ram.Amrani@cavium.com> wrote: >> >> The new ioctl based infrastructure either commits or rollbacks >> >> all objects of the command as one transaction. In order to do >> >> that, we introduce a notion of dealing with a collection of >> >> objects that are related to a specific action. >> >> >> >> This also requires adding a notion of an action and attribute. >> >> An action contains a groups of attributes, where each group >> >> contains several attributes. >> >> >> >> When declaring these actions and attributes, we actually declare >> >> their specifications. When a command is executed, we actually >> >> allocates some space to hold auxiliary information. >> >> >> >> Signed-off-by: Matan Barak <matanb@mellanox.com> >> >> --- >> > >> > Matan, thanks for the RFC! >> > >> > If I got this correctly each object will go through three phases - get, handler, and a put. >> > I don't quite understand how a batch operation, like destroy QPs, can be undone after the handler phase. >> > I do see it working if at first multiple gets are performed and one of them fails. >> > In that case undoing is easy because the handlers weren't invoked yet. >> > >> > In the case were some user-objects failed the operation how is this reflected upwards? >> > >> > Also, I wonder, is there another intention behind batch operations except speed? >> > >> >> It's really not recommended to batch create/destroy. The reason is >> exactly what you've pointer out. >> If you batch several "destroy objects" and the n'th one fail, you >> can't unwind the successful ones. >> >> So basically, we want to support a semantic which is similar to what >> we have today - create a single object or destroy a single object. >> In this case, the pre-handler stage locks the dependencies of this >> object (for example, in create_qp you lock the pd and cq so they won't >> be destroyed) >> and create the uobject for the QP. The handler itself can assume the >> requirements it stated in the specifications are filled and just >> create the QP and >> tie the uobject to the QP object. In the post-handler stage we commit >> the QP's uobject and unlock the dependencies (assuming the handler >> increased >> the required refcounts). >> Destroying an object is similar. The only different is that the >> destruction itself isn't done by the handler, but in the >> post-handler's code (to share this code between >> regular "destroy" calls with process tear-down and hardware removal). > > I understand there are two sets of objects here. Let's make sure I'm not confusing them. > > (1) A collection of user-objects passed via ioctl. This is indicated in the first paragraph. > But as you indicated now we shouldn't support this. So why (apparently) do we? > > (2) A collection of user-objects that should be locked for the creation/deletion/modification > of another that was requested via ioctl. > In this case the handler doesn't need to be invoked at all for the collection. > We can easily roll back the "get" phase, if failed during. > We don't expect the "put" phase to fail, if it will fail for some reason. Then this won't really > be handled as the handler was already invoked. > The infrastructure is agnostic to whether the objects are common or driver specific. It actually gives you a (hopefully) convenient way to invoke verbs handlers in the kernel. Each handler is a function which could have some arguments. However, we can't pass kernel pointers from user-space to kernel and we can't trust the user-space from executing two calls concurrently that could use an object and destroy it at the same time. Currently (current infrastructure), when you write such a handler, you need to open code this yourself - map ids to objects and lock them. What we propose here is to have some additional info to the handler. This info could be thought as the function's deceleration. This additional info makes the infrastructure validates syntactically your attributes, map them to the actual kernel pointers and lock them. Since the kernel developer writes this "additional info" (specification), it can make sure only one "DESTROY" or "CREATE" object exists per a specification (to avoid the behavior you mentioned). A command handling always consists of 3 stages: pre, handler and post (for all handlers). So, overall, if you have a create based command, the "pre" stage creates the uobject and locks its dependencies. If the handler fails, this is totally reversible (unlock dependencies and destroy the uobject). If you have a destroy command, the "pre" stage locks the uobject for exclusive access. If the handler fails, it just unlocks the object. If it's successful, the "post" stage actually destroys it. In other commands, the "pre" stage just locks the uobjects and obviously it's reversible. I hope that answers your questions. > Thanks, > Ram > > - Matan -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> The infrastructure is agnostic to whether the objects are common or > driver specific. > It actually gives you a (hopefully) convenient way to invoke verbs > handlers in the kernel. > Each handler is a function which could have some arguments. However, > we can't pass kernel pointers > from user-space to kernel and we can't trust the user-space from > executing two calls concurrently that > could use an object and destroy it at the same time. Currently > (current infrastructure), when you write such a handler, you need > to open code this yourself - map ids to objects and lock them. > > What we propose here is to have some additional info to the handler. > This info could be thought as the function's deceleration. > This additional info makes the infrastructure validates syntactically > your attributes, map them to the actual kernel pointers and lock them. > Since the kernel developer writes this "additional info" > (specification), it can make sure only one "DESTROY" or "CREATE" > object exists per a specification (to avoid the behavior you > mentioned). > > A command handling always consists of 3 stages: pre, handler and post > (for all handlers). > So, overall, if you have a create based command, the "pre" stage > creates the uobject and locks its dependencies. If the handler fails, > this is totally reversible (unlock dependencies and destroy the > uobject). > If you have a destroy command, the "pre" stage locks the uobject for > exclusive access. If the handler fails, it just unlocks the object. If > it's successful, the "post" stage actually destroys it. > In other commands, the "pre" stage just locks the uobjects and > obviously it's reversible. > > I hope that answers your questions. > I'm in sync with you. Thanks for further explaining. - Ram
diff --git a/drivers/infiniband/core/rdma_core.c b/drivers/infiniband/core/rdma_core.c index 269fa7f..78ffd8c 100644 --- a/drivers/infiniband/core/rdma_core.c +++ b/drivers/infiniband/core/rdma_core.c @@ -676,3 +676,46 @@ int uverbs_finalize_object(struct ib_uobject *uobj, return ret; } + +int uverbs_finalize_objects(struct uverbs_attr_array *attr_array, + size_t num, + const struct uverbs_action *action, + bool commit) +{ + unsigned int i; + int ret = 0; + + for (i = 0; i < num; i++) { + struct uverbs_attr_array *attr_spec_array = &attr_array[i]; + const struct uverbs_attr_spec_group *attr_spec_group = + action->attr_groups[i]; + unsigned int j; + + for (j = 0; j < attr_spec_array->num_attrs; j++) { + struct uverbs_attr *attr = &attr_spec_array->attrs[j]; + struct uverbs_attr_spec *spec = &attr_spec_group->attrs[j]; + + if (!uverbs_is_valid(attr_spec_array, j)) + continue; + + if (spec->type == UVERBS_ATTR_TYPE_IDR || + spec->type == UVERBS_ATTR_TYPE_FD) { + int current_ret; + + /* + * refcounts should be handled at the object + * level and not at the uobject level. Refcounts + * of the objects themselves are done in + * handlers. + */ + current_ret = uverbs_finalize_object(attr->obj_attr.uobject, + spec->obj.access, + commit); + if (!ret) + ret = current_ret; + } + } + } + return ret; +} + diff --git a/drivers/infiniband/core/rdma_core.h b/drivers/infiniband/core/rdma_core.h index 5a1da24..0aebc47 100644 --- a/drivers/infiniband/core/rdma_core.h +++ b/drivers/infiniband/core/rdma_core.h @@ -81,7 +81,7 @@ * the object is from the given type. Lock it to the required access. * This function could create (access == NEW) or destroy (access == DESTROY) * objects if required. The action will be finalized only when - * uverbs_finalize_object is called. + * uverbs_finalize_object or uverbs_finalize_objects is called. */ struct ib_uobject *uverbs_get_uobject_from_context(const struct uverbs_obj_type *type_attrs, struct ib_ucontext *ucontext, @@ -90,5 +90,23 @@ struct ib_uobject *uverbs_get_uobject_from_context(const struct uverbs_obj_type int uverbs_finalize_object(struct ib_uobject *uobj, enum uverbs_idr_access access, bool commit); +/* + * Note that certain finalize stages could return a status: + * (a) alloc_commit could return a failure if the object is committed at the + * same time when the context is destroyed. + * (b) remove_commit could fail if the object wasn't destroyed successfully. + * Since multiple objects could be finalized in one transaction, it is very NOT + * recommended to have several finalize actions which have side affects. + * For example, it's NOT recommended to have a certain action which has both + * a commit action and a destroy action or two destroy objects in the same + * action. The rule of thumb is to have one destroy or commit action with + * multiple lookups. + * The first non zero return value of finalize_object is returned from this + * function. + */ +int uverbs_finalize_objects(struct uverbs_attr_array *attr_array, + size_t num, + const struct uverbs_action *action, + bool success); #endif /* RDMA_CORE_H */ diff --git a/include/rdma/uverbs_ioctl.h b/include/rdma/uverbs_ioctl.h index a18468e..1f84f30 100644 --- a/include/rdma/uverbs_ioctl.h +++ b/include/rdma/uverbs_ioctl.h @@ -41,6 +41,12 @@ * ======================================= */ +enum uverbs_attr_type { + UVERBS_ATTR_TYPE_NA, + UVERBS_ATTR_TYPE_IDR, + UVERBS_ATTR_TYPE_FD, +}; + enum uverbs_idr_access { UVERBS_ACCESS_READ, UVERBS_ACCESS_WRITE, @@ -48,5 +54,71 @@ enum uverbs_idr_access { UVERBS_ACCESS_DESTROY }; +struct uverbs_attr_spec { + enum uverbs_attr_type type; + union { + u16 len; + struct { + u16 obj_type; + u8 access; + } obj; + }; +}; + +struct uverbs_attr_spec_group { + struct uverbs_attr_spec *attrs; + size_t num_attrs; +}; + +struct uverbs_action { + const struct uverbs_attr_spec_group **attr_groups; + size_t num_groups; +}; + +/* ================================================= + * Parsing infrastructure + * ================================================= + */ + +struct uverbs_fd_attr { + int fd; +}; + +struct uverbs_uobj_attr { + /* idr handle */ + u32 idr; +}; + +struct uverbs_obj_attr { + /* pointer to the kernel descriptor -> type, access, etc */ + struct ib_uverbs_attr __user *uattr; + const struct uverbs_type_alloc_action *type; + struct ib_uobject *uobject; + union { + struct uverbs_fd_attr fd; + struct uverbs_uobj_attr uobj; + }; +}; + +struct uverbs_attr { + union { + struct uverbs_obj_attr obj_attr; + }; +}; + +/* output of one validator */ +struct uverbs_attr_array { + unsigned long *valid_bitmap; + size_t num_attrs; + /* arrays of attrubytes, index is the id i.e SEND_CQ */ + struct uverbs_attr *attrs; +}; + +static inline bool uverbs_is_valid(const struct uverbs_attr_array *attr_array, + unsigned int idx) +{ + return test_bit(idx, attr_array->valid_bitmap); +} + #endif
The new ioctl based infrastructure either commits or rollbacks all objects of the command as one transaction. In order to do that, we introduce a notion of dealing with a collection of objects that are related to a specific action. This also requires adding a notion of an action and attribute. An action contains a groups of attributes, where each group contains several attributes. When declaring these actions and attributes, we actually declare their specifications. When a command is executed, we actually allocates some space to hold auxiliary information. Signed-off-by: Matan Barak <matanb@mellanox.com> --- drivers/infiniband/core/rdma_core.c | 43 ++++++++++++++++++++++ drivers/infiniband/core/rdma_core.h | 20 ++++++++++- include/rdma/uverbs_ioctl.h | 72 +++++++++++++++++++++++++++++++++++++ 3 files changed, 134 insertions(+), 1 deletion(-)