
[for-next,2/7] IB: Introduce Work Queue object and its verbs

Message ID 1444893410-13242-3-git-send-email-yishaih@mellanox.com (mailing list archive)
State Superseded

Commit Message

Yishai Hadas Oct. 15, 2015, 7:16 a.m. UTC
Introduce the Work Queue object and its create/destroy/modify verbs.

A QP can be created without internal WQs "packaged" inside it; such a
QP can be configured to use an "external" WQ object as its
receive/send queue.
The WQ is a necessary component for RSS technology, since the RSS
mechanism is supposed to distribute the traffic between multiple
Receive Work Queues.

A WQ is associated (many to one) with a Completion Queue, and it owns
its own properties (PD, WQ size, etc.).
A WQ has a type; this patch introduces IB_WQT_RQ (i.e. receive queue),
and it may be extended to others such as IB_WQT_SQ (send queue).
A WQ of type IB_WQT_RQ contains receive work requests.

The WQ context is subject to well-defined state transitions done by
the modify_wq verb.
When a WQ is created its initial state is IB_WQS_RESET.
From IB_WQS_RESET it can be modified to itself or to IB_WQS_RDY.
From IB_WQS_RDY it can be modified to itself, to IB_WQS_RESET
or to IB_WQS_ERR.
From IB_WQS_ERR it can be modified to IB_WQS_RESET.

Note: a transition to IB_WQS_ERR might occur implicitly in case of a
HW error.
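
For illustration, a minimal kernel-side usage sketch of the new verbs
(pd and cq are placeholder objects created beforehand, error handling
is trimmed, and the QP/RSS wiring around the WQ is introduced by later
patches in this series):

	struct ib_wq_init_attr init_attr = {
		.wq_type = IB_WQT_RQ,
		.max_wr  = 128,
		.max_sge = 1,
		.cq      = cq,	/* CQ the WQ reports completions to */
	};
	struct ib_wq_attr wq_attr = {};
	struct ib_wq *wq;
	int ret;

	wq = ib_create_wq(pd, &init_attr);	/* created in IB_WQS_RESET */
	if (IS_ERR(wq))
		return PTR_ERR(wq);

	/* IB_WQS_RESET -> IB_WQS_RDY so the WQ can start receiving */
	wq_attr.wq_state = IB_WQS_RDY;
	ret = ib_modify_wq(wq, &wq_attr, IB_WQ_STATE);
	if (ret) {
		ib_destroy_wq(wq);
		return ret;
	}

	/* ... use the WQ as the receive queue of one or more QPs,
	 * then tear it down with ib_destroy_wq(wq) when done ...
	 */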


Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
---
 drivers/infiniband/core/verbs.c |   59 ++++++++++++++++++++++++++
 include/rdma/ib_verbs.h         |   88 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 147 insertions(+), 0 deletions(-)

Comments

Sagi Grimberg Oct. 15, 2015, 8:50 a.m. UTC | #1
Hi Yishai,

> +/**
> + * ib_create_wq - Creates a WQ associated with the specified protection
> + * domain.
> + * @pd: The protection domain associated with the WQ.
> + * @wq_init_attr: A list of initial attributes required to create the
> + * WQ. If WQ creation succeeds, then the attributes are updated to
> + * the actual capabilities of the created WQ.
> + *
> + * wq_init_attr->max_wr and wq_init_attr->max_sge determine
> + * the requested size of the WQ, and set to the actual values allocated
> + * on return.
> + * If ib_create_wq() succeeds, then max_wr and max_sge will always be
> + * at least as large as the requested values.
> + *
> + * Return Value
> + * ib_create_wq() returns a pointer to the created WQ, or NULL if the request
> + * fails.
> + */
> +struct ib_wq *ib_create_wq(struct ib_pd *pd,
> +			   struct ib_wq_init_attr *init_attr);
> +

We have started shifting function documentation from the header
declarations to the *.c implementations. Would you mind moving this one too?
Parav Pandit Oct. 15, 2015, 9:13 a.m. UTC | #2
Just curious, why does the WQ need to be bound to a PD?
Isn't the ucontext sufficient?
Or, because a kcontext doesn't exist, does the PD serve that role?
Or is this just a manifestation of how the hardware behaves?

Since you mentioned that "QP can be configured to use "external" WQ
object", might it be worth reusing the WQ across multiple QPs of
different PDs?
The MR/QP validation check has to happen between the MR and the actual
QP, and might not require that check against the WQ.

Parav

On Thu, Oct 15, 2015 at 12:46 PM, Yishai Hadas <yishaih@mellanox.com> wrote:
> Introduce Work Queue object and its create/destroy/modify verbs.
>
> QP can be created without internal WQs "packaged" inside it,
> this QP can be configured to use "external" WQ object as its
> receive/send queue.
> WQ is a necessary component for RSS technology since RSS mechanism
> is supposed to distribute the traffic between multiple
> Receive Work Queues.
>
> WQ associated (many to one) with Completion Queue and it owns WQ
> properties (PD, WQ size, etc.).
> WQ has a type, this patch introduces the IB_WQT_RQ (i.e.receive queue),
> it may be extend to others such as IB_WQT_SQ. (send queue).
> WQ from type IB_WQT_RQ contains receive work requests.
>
> WQ context is subject to a well-defined state transitions done by
> the modify_wq verb.
> When WQ is created its initial state becomes IB_WQS_RESET.
> From IB_WQS_RESET it can be modified to itself or to IB_WQS_RDY.
> From IB_WQS_RDY it can be modified to itself, to IB_WQS_RESET
> or to IB_WQS_ERR.
> From IB_WQS_ERR it can be modified to IB_WQS_RESET.
>
> Note: transition to IB_WQS_ERR might occur implicitly in case there
> was some HW error.
>
>
> Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
> ---
>  drivers/infiniband/core/verbs.c |   59 ++++++++++++++++++++++++++
>  include/rdma/ib_verbs.h         |   88 +++++++++++++++++++++++++++++++++++++++
>  2 files changed, 147 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
> index e1f2c98..c63c622 100644
> --- a/drivers/infiniband/core/verbs.c
> +++ b/drivers/infiniband/core/verbs.c
> @@ -1435,6 +1435,65 @@ int ib_dealloc_xrcd(struct ib_xrcd *xrcd)
>  }
>  EXPORT_SYMBOL(ib_dealloc_xrcd);
>
> +struct ib_wq *ib_create_wq(struct ib_pd *pd,
> +                          struct ib_wq_init_attr *wq_attr)
> +{
> +       struct ib_wq *wq;
> +
> +       if (!pd->device->create_wq)
> +               return ERR_PTR(-ENOSYS);
> +
> +       wq = pd->device->create_wq(pd, wq_attr, NULL);
> +       if (!IS_ERR(wq)) {
> +               wq->event_handler = wq_attr->event_handler;
> +               wq->wq_context = wq_attr->wq_context;
> +               wq->wq_type = wq_attr->wq_type;
> +               wq->cq = wq_attr->cq;
> +               wq->device = pd->device;
> +               wq->pd = pd;
> +               wq->uobject = NULL;
> +               atomic_inc(&pd->usecnt);
> +               atomic_inc(&wq_attr->cq->usecnt);
> +               atomic_set(&wq->usecnt, 0);
> +       }
> +       return wq;
> +}
> +EXPORT_SYMBOL(ib_create_wq);
> +
> +int ib_destroy_wq(struct ib_wq *wq)
> +{
> +       int err;
> +       struct ib_cq *cq = wq->cq;
> +       struct ib_pd *pd = wq->pd;
> +
> +       if (!wq->device->destroy_wq)
> +               return -ENOSYS;
> +
> +       if (atomic_read(&wq->usecnt))
> +               return -EBUSY;
> +
> +       err = wq->device->destroy_wq(wq);
> +       if (!err) {
> +               atomic_dec(&pd->usecnt);
> +               atomic_dec(&cq->usecnt);
> +       }
> +       return err;
> +}
> +EXPORT_SYMBOL(ib_destroy_wq);
> +
> +int ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
> +                enum ib_wq_attr_mask attr_mask)
> +{
> +       int err;
> +
> +       if (!wq->device->modify_wq)
> +               return -ENOSYS;
> +
> +       err = wq->device->modify_wq(wq, wq_attr, attr_mask, NULL);
> +       return err;
> +}
> +EXPORT_SYMBOL(ib_modify_wq);
> +
>  struct ib_flow *ib_create_flow(struct ib_qp *qp,
>                                struct ib_flow_attr *flow_attr,
>                                int domain)
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index e1f65e2..0c6291b 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1310,6 +1310,48 @@ struct ib_srq {
>         } ext;
>  };
>
> +enum ib_wq_type {
> +       IB_WQT_RQ
> +};
> +
> +enum ib_wq_state {
> +       IB_WQS_RESET,
> +       IB_WQS_RDY,
> +       IB_WQS_ERR
> +};
> +
> +struct ib_wq {
> +       struct ib_device       *device;
> +       struct ib_uobject      *uobject;
> +       void                *wq_context;
> +       void                (*event_handler)(struct ib_event *, void *);
> +       struct ib_pd           *pd;
> +       struct ib_cq           *cq;
> +       u32             wq_num;
> +       enum ib_wq_state       state;
> +       enum ib_wq_type wq_type;
> +       atomic_t                usecnt;
> +};
> +
> +struct ib_wq_init_attr {
> +       void                   *wq_context;
> +       enum ib_wq_type wq_type;
> +       u32             max_wr;
> +       u32             max_sge;
> +       struct  ib_cq          *cq;
> +       void                (*event_handler)(struct ib_event *, void *);
> +};
> +
> +enum ib_wq_attr_mask {
> +       IB_WQ_STATE     = 1 << 0,
> +       IB_WQ_CUR_STATE = 1 << 1,
> +};
> +
> +struct ib_wq_attr {
> +       enum    ib_wq_state     wq_state;
> +       enum    ib_wq_state     curr_wq_state;
> +};
> +
>  struct ib_qp {
>         struct ib_device       *device;
>         struct ib_pd           *pd;
> @@ -1771,6 +1813,14 @@ struct ib_device {
>         int                        (*check_mr_status)(struct ib_mr *mr, u32 check_mask,
>                                                       struct ib_mr_status *mr_status);
>         void                       (*disassociate_ucontext)(struct ib_ucontext *ibcontext);
> +       struct ib_wq *             (*create_wq)(struct ib_pd *pd,
> +                                               struct ib_wq_init_attr *init_attr,
> +                                               struct ib_udata *udata);
> +       int                        (*destroy_wq)(struct ib_wq *wq);
> +       int                        (*modify_wq)(struct ib_wq *wq,
> +                                               struct ib_wq_attr *attr,
> +                                               enum ib_wq_attr_mask attr_mask,
> +                                               struct ib_udata *udata);
>
>         struct ib_dma_mapping_ops   *dma_ops;
>
> @@ -3024,4 +3074,42 @@ struct net_device *ib_get_net_dev_by_params(struct ib_device *dev, u8 port,
>                                             u16 pkey, const union ib_gid *gid,
>                                             const struct sockaddr *addr);
>
> +/**
> + * ib_create_wq - Creates a WQ associated with the specified protection
> + * domain.
> + * @pd: The protection domain associated with the WQ.
> + * @wq_init_attr: A list of initial attributes required to create the
> + * WQ. If WQ creation succeeds, then the attributes are updated to
> + * the actual capabilities of the created WQ.
> + *
> + * wq_init_attr->max_wr and wq_init_attr->max_sge determine
> + * the requested size of the WQ, and set to the actual values allocated
> + * on return.
> + * If ib_create_wq() succeeds, then max_wr and max_sge will always be
> + * at least as large as the requested values.
> + *
> + * Return Value
> + * ib_create_wq() returns a pointer to the created WQ, or NULL if the request
> + * fails.
> + */
> +struct ib_wq *ib_create_wq(struct ib_pd *pd,
> +                          struct ib_wq_init_attr *init_attr);
> +
> +/**
> + * ib_destroy_wq - Destroys the specified WQ.
> + * @wq: The WQ to destroy.
> + */
> +int ib_destroy_wq(struct ib_wq *wq);
> +
> +/**
> + * ib_modify_wq - Modifies the specified WQ.
> + * @wq: The WQ to modify.
> + * @wq_attr: On input, specifies the WQ attributes to modify.
> + * @attr_mask: A bit-mask used to specify which attributes of the WQ
> + *   are being modified.
> + * On output, the current values of selected WQ attributes are returned.
> + */
> +int ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *attr,
> +                enum ib_wq_attr_mask attr_mask);
> +
>  #endif /* IB_VERBS_H */
> --
> 1.7.1
>
Yishai Hadas Oct. 15, 2015, 2:12 p.m. UTC | #3
On 10/15/2015 12:13 PM, Parav Pandit wrote:
> Just curious, why does WQ need to be bind to PD?
> Isn't ucontext sufficient?
> Or because kcontext doesn't exist, PD serves that role?
> Or Is this just manifestation of how hardware behave?

A PD is an attribute of a work queue (i.e. a send/receive queue); it is
used by the hardware for security validation before scattering to a
memory region. For that reason, an external WQ object needs a PD of its
own, letting the hardware perform that validation.

> Since you mentioned, "QP can be configured to use "external" WQ
> object", it might be worth to reuse the WQ across multiple QPs of
> different PD?

Correct, an external WQ can be used across multiple QPs. In that case
its PD is used by the hardware for security validation when it accesses
the MR, and the QP's PD is not in use.

> Because MR and QP validation check has to happen among MR and actual
> QP and might not require that check against WQ.

No, in the case of an external WQ its PD is used and the QP's PD is not
in use.
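
To make this concrete, a small hypothetical sketch (wq_pd, rx_cq and
the QP PDs are placeholder objects; the mechanism that attaches QPs to
the WQ is added by later patches in the series):

	/* The external WQ carries its own PD; this is the PD the
	 * hardware validates against when it scatters received data
	 * into an MR.
	 */
	struct ib_wq_init_attr attr = {
		.wq_type = IB_WQT_RQ,
		.max_wr  = 256,
		.max_sge = 1,
		.cq      = rx_cq,
	};
	struct ib_wq *shared_rq = ib_create_wq(wq_pd, &attr);

	if (IS_ERR(shared_rq))
		return PTR_ERR(shared_rq);

	/* QPs created on other PDs (e.g. qp_pd_a, qp_pd_b) can later be
	 * configured to use shared_rq as their receive queue; for the
	 * receive buffers the hardware checks wq_pd, not the QPs' PDs.
	 */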

Parav Pandit Oct. 15, 2015, 3:17 p.m. UTC | #4
On Thu, Oct 15, 2015 at 7:42 PM, Yishai Hadas
<yishaih@dev.mellanox.co.il> wrote:
> On 10/15/2015 12:13 PM, Parav Pandit wrote:
>>
>> Just curious, why does WQ need to be bind to PD?
>> Isn't ucontext sufficient?
>> Or because kcontext doesn't exist, PD serves that role?
>> Or Is this just manifestation of how hardware behave?
>
>
> PD is an attribute of a work queue (i.e. send/receive queue), it's used by
> the hardware for security validation before scattering to a memory region.
> For that, an external WQ object needs a PD, letting the
> hardware makes that validation.
>
>> Since you mentioned, "QP can be configured to use "external" WQ
>> object", it might be worth to reuse the WQ across multiple QPs of
>> different PD?
>
>
> Correct, external WQ can be used across multiple QPs, in that case its PD is
> used by the hardware for security validation when it accesses to the MR, in
> that case the QP's PD is not in use.
>
I think I get it; just confirming with the example below.

So I think the following is possible:
WQ_A having PD=1.
QP_A having PD=2, bound to WQ_A.
QP_B having PD=3, bound to WQ_A.
MR_X having PD=2.
And the checks are done between the MR and the QP.

In another use case, no MR is used at all (only physical addresses are
used):
WQ_A having PD=1.
QP_A having PD=2, bound to WQ_A.
QP_B having PD=3, bound to WQ_A.

Here the WQ entries fail, since no MR is associated and the QPs are
bound to a different PD than the PD of WQ_A. This is because at the
time a QP is bound to a WQ, it is unknown whether it will use an MR in
the WQE at run time.
Right?


>> Because MR and QP validation check has to happen among MR and actual
>> QP and might not require that check against WQ.
>
>
> No, in that case of an external WQ its PD is used and the QP's PD is not in
> use.
>
Yishai Hadas Oct. 15, 2015, 4:25 p.m. UTC | #5
On 10/15/2015 6:17 PM, Parav Pandit wrote:
> On Thu, Oct 15, 2015 at 7:42 PM, Yishai Hadas
> <yishaih@dev.mellanox.co.il> wrote:
>> On 10/15/2015 12:13 PM, Parav Pandit wrote:
>>>
>>> Just curious, why does WQ need to be bind to PD?
>>> Isn't ucontext sufficient?
>>> Or because kcontext doesn't exist, PD serves that role?
>>> Or Is this just manifestation of how hardware behave?
>>
>>
>> PD is an attribute of a work queue (i.e. send/receive queue), it's used by
>> the hardware for security validation before scattering to a memory region.
>> For that, an external WQ object needs a PD, letting the
>> hardware makes that validation.
>>
>>> Since you mentioned, "QP can be configured to use "external" WQ
>>> object", it might be worth to reuse the WQ across multiple QPs of
>>> different PD?
>>
>>
>> Correct, external WQ can be used across multiple QPs, in that case its PD is
>> used by the hardware for security validation when it accesses to the MR, in
>> that case the QP's PD is not in use.
>>
> I think I get it, just confirming with below example.


> So I think below is possible.
> WQ_A having PD=1.
> QP_A having PD=2 bound to WQ_A.
> QP_B having PD=3 bound to WQ_A.
> MR_X having PD=2.
> And checks are done between MR and QP.
No, please follow the description above; in that case PD=1 of WQ_A is
used for the checks.

> In other use case,
> MR is not at all used. (only physical addresses are used)
> WQ_A having PD=1.
> QP_A having PD=2 bound to WQ_A.
> QP_B having PD=3 bound to WQ_A.
>
> WQ entries fail as MR is not associated and QP are bound to different
> PD than the PD of WQ_A.
> Because at QP bound time with WQ, its unknown whether it will use MR
> or not in the WQE at run time.
> Right?

In case there is an MR for the physical addresses, it has a PD and the
WQ's PD is used; in case there is no MR, the PD is not applicable.
Parav Pandit Oct. 15, 2015, 4:49 p.m. UTC | #6
On Thu, Oct 15, 2015 at 9:55 PM, Yishai Hadas
<yishaih@dev.mellanox.co.il> wrote:
> On 10/15/2015 6:17 PM, Parav Pandit wrote:
>>
>> On Thu, Oct 15, 2015 at 7:42 PM, Yishai Hadas
>> <yishaih@dev.mellanox.co.il> wrote:
>>>
>>> On 10/15/2015 12:13 PM, Parav Pandit wrote:
>>>>
>>>>
>>>> Just curious, why does WQ need to be bind to PD?
>>>> Isn't ucontext sufficient?
>>>> Or because kcontext doesn't exist, PD serves that role?
>>>> Or Is this just manifestation of how hardware behave?
>>>
>>>
>>>
>>> PD is an attribute of a work queue (i.e. send/receive queue), it's used
>>> by
>>> the hardware for security validation before scattering to a memory
>>> region.
>>> For that, an external WQ object needs a PD, letting the
>>> hardware makes that validation.
>>>
>>>> Since you mentioned, "QP can be configured to use "external" WQ
>>>> object", it might be worth to reuse the WQ across multiple QPs of
>>>> different PD?
>>>
>>>
>>>
>>> Correct, external WQ can be used across multiple QPs, in that case its PD
>>> is
>>> used by the hardware for security validation when it accesses to the MR,
>>> in
>>> that case the QP's PD is not in use.
>>>
>> I think I get it, just confirming with below example.
>
>
> .
>
>> So I think below is possible.
>> WQ_A having PD=1.
>> QP_A having PD=2 bound to WQ_A.
>> QP_B having PD=3 bound to WQ_A.
>> MR_X having PD=2.
>> And checks are done between MR and QP.
>
> No, please follow above description, in that case PD=1 of WQ_A is used for
> the checks.
>
This appears to me to be a manifestation of the hardware implementation
surfacing at the verb layer.
There may be nothing wrong with it, but it is worth knowing how to
actually do the verb programming.

If a stateless WQ is being used by multiple QPs in a multiplexed way,
it should be able to multiplex between QPs of different PDs as well.
Otherwise, for every PD being created there will have to be one WQ to
service all the QPs belonging to that PD.

>> In other use case,
>> MR is not at all used. (only physical addresses are used)
>> WQ_A having PD=1.
>> QP_A having PD=2 bound to WQ_A.
>> QP_B having PD=3 bound to WQ_A.
>>
>> WQ entries fail as MR is not associated and QP are bound to different
>> PD than the PD of WQ_A.
>> Because at QP bound time with WQ, its unknown whether it will use MR
>> or not in the WQE at run time.
>> Right?
>
>
> In case there is MR for physical addresses it has a PD and the WQ's PD is
> used, in case there is no MR the PD is not applicable.
Yishai Hadas Oct. 18, 2015, 3:08 p.m. UTC | #7
On 10/15/2015 7:49 PM, Parav Pandit wrote:

> If there is stateless WQ being used by multiple QPs in multiplexed

The WQ is not stateless and always has its own PD.

> way, it should be able to multiplex between QP's of different PD as
> well.
> Otherwise for every PD being created, there will have be one WQ needed
> to service all the QPs belonging to that PD.

As mentioned, the same WQ can serve multiple QPs; from a PD point of
view it behaves similarly to an SRQ, which may be associated with many
QPs with different PDs.

See IB SPEC, Release 1.3, o10-2.2.1:
"SRQ may be associated with the same PD as used by one or more of its 
associated QPs or a different PD."

As part of the coming V1 I will improve the commit message to better
clarify the WQ's PD behavior, thanks.
Yishai Hadas Oct. 18, 2015, 3:13 p.m. UTC | #8
On 10/15/2015 11:50 AM, Sagi Grimberg wrote:
> Hi Yishai,
>
>> +/**
>> + * ib_create_wq - Creates a WQ associated with the specified protection
>> + * domain.
>> + * @pd: The protection domain associated with the WQ.
>> + * @wq_init_attr: A list of initial attributes required to create the
>> + * WQ. If WQ creation succeeds, then the attributes are updated to
>> + * the actual capabilities of the created WQ.
>> + *
>> + * wq_init_attr->max_wr and wq_init_attr->max_sge determine
>> + * the requested size of the WQ, and set to the actual values allocated
>> + * on return.
>> + * If ib_create_wq() succeeds, then max_wr and max_sge will always be
>> + * at least as large as the requested values.
>> + *
>> + * Return Value
>> + * ib_create_wq() returns a pointer to the created WQ, or NULL if the
>> request
>> + * fails.
>> + */
>> +struct ib_wq *ib_create_wq(struct ib_pd *pd,
>> +               struct ib_wq_init_attr *init_attr);
>> +
>
> We started shifting function documentations from the header declarations
> to the *.c implmentations. Would you mind moving it too?

OK, I will move it into the C file as part of V1.
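
For example, the moved kernel-doc would sit roughly like this above the
definition in drivers/infiniband/core/verbs.c (sketch only, wording as
in this patch):

	/**
	 * ib_create_wq - Creates a WQ associated with the specified
	 * protection domain.
	 * @pd: The protection domain associated with the WQ.
	 * @wq_attr: A list of initial attributes required to create the
	 * WQ. If WQ creation succeeds, then the attributes are updated
	 * to the actual capabilities of the created WQ.
	 */
	struct ib_wq *ib_create_wq(struct ib_pd *pd,
				   struct ib_wq_init_attr *wq_attr)
	{
		/* implementation as in this patch */
	}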
Parav Pandit Oct. 18, 2015, 3:15 p.m. UTC | #9
On Sun, Oct 18, 2015 at 8:38 PM, Yishai Hadas
<yishaih@dev.mellanox.co.il> wrote:
> On 10/15/2015 7:49 PM, Parav Pandit wrote:
>
>> If there is stateless WQ being used by multiple QPs in multiplexed
>
>
> The WQ is not stateless and always has its own PD.
>
>> way, it should be able to multiplex between QP's of different PD as
>> well.
>> Otherwise for every PD being created, there will have be one WQ needed
>> to service all the QPs belonging to that PD.
>
>
> As mentioned, same WQ can serve multiple QPs, from PD point of view it
> behaves similarly to SRQ that may be associated with many QPs with different
> PDs.
>
> See IB SPEC, Release 1.3, o10-2.2.1:
> "SRQ may be associated with the same PD as used by one or more of its
> associated QPs or a different PD."
>
> As part of coming V1 will improve the commit message to better clarify the
> WQ's PD behavior, thanks.

Ok. Got it. Thanks.
Devesh Sharma Oct. 18, 2015, 3:38 p.m. UTC | #10
Hi All,

Would it be a good idea to have a separate header for this feature?
Let's not keep appending to ib_verbs.h.

-Regards
Devesh

On Sun, Oct 18, 2015 at 8:45 PM, Parav Pandit <pandit.parav@gmail.com> wrote:
> On Sun, Oct 18, 2015 at 8:38 PM, Yishai Hadas
> <yishaih@dev.mellanox.co.il> wrote:
>> On 10/15/2015 7:49 PM, Parav Pandit wrote:
>>
>>> If there is stateless WQ being used by multiple QPs in multiplexed
>>
>>
>> The WQ is not stateless and always has its own PD.
>>
>>> way, it should be able to multiplex between QP's of different PD as
>>> well.
>>> Otherwise for every PD being created, there will have be one WQ needed
>>> to service all the QPs belonging to that PD.
>>
>>
>> As mentioned, same WQ can serve multiple QPs, from PD point of view it
>> behaves similarly to SRQ that may be associated with many QPs with different
>> PDs.
>>
>> See IB SPEC, Release 1.3, o10-2.2.1:
>> "SRQ may be associated with the same PD as used by one or more of its
>> associated QPs or a different PD."
>>
>> As part of coming V1 will improve the commit message to better clarify the
>> WQ's PD behavior, thanks.
>
> Ok. Got it. Thanks.

Patch

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index e1f2c98..c63c622 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1435,6 +1435,65 @@  int ib_dealloc_xrcd(struct ib_xrcd *xrcd)
 }
 EXPORT_SYMBOL(ib_dealloc_xrcd);
 
+struct ib_wq *ib_create_wq(struct ib_pd *pd,
+			   struct ib_wq_init_attr *wq_attr)
+{
+	struct ib_wq *wq;
+
+	if (!pd->device->create_wq)
+		return ERR_PTR(-ENOSYS);
+
+	wq = pd->device->create_wq(pd, wq_attr, NULL);
+	if (!IS_ERR(wq)) {
+		wq->event_handler = wq_attr->event_handler;
+		wq->wq_context = wq_attr->wq_context;
+		wq->wq_type = wq_attr->wq_type;
+		wq->cq = wq_attr->cq;
+		wq->device = pd->device;
+		wq->pd = pd;
+		wq->uobject = NULL;
+		atomic_inc(&pd->usecnt);
+		atomic_inc(&wq_attr->cq->usecnt);
+		atomic_set(&wq->usecnt, 0);
+	}
+	return wq;
+}
+EXPORT_SYMBOL(ib_create_wq);
+
+int ib_destroy_wq(struct ib_wq *wq)
+{
+	int err;
+	struct ib_cq *cq = wq->cq;
+	struct ib_pd *pd = wq->pd;
+
+	if (!wq->device->destroy_wq)
+		return -ENOSYS;
+
+	if (atomic_read(&wq->usecnt))
+		return -EBUSY;
+
+	err = wq->device->destroy_wq(wq);
+	if (!err) {
+		atomic_dec(&pd->usecnt);
+		atomic_dec(&cq->usecnt);
+	}
+	return err;
+}
+EXPORT_SYMBOL(ib_destroy_wq);
+
+int ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
+		 enum ib_wq_attr_mask attr_mask)
+{
+	int err;
+
+	if (!wq->device->modify_wq)
+		return -ENOSYS;
+
+	err = wq->device->modify_wq(wq, wq_attr, attr_mask, NULL);
+	return err;
+}
+EXPORT_SYMBOL(ib_modify_wq);
+
 struct ib_flow *ib_create_flow(struct ib_qp *qp,
 			       struct ib_flow_attr *flow_attr,
 			       int domain)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index e1f65e2..0c6291b 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1310,6 +1310,48 @@  struct ib_srq {
 	} ext;
 };
 
+enum ib_wq_type {
+	IB_WQT_RQ
+};
+
+enum ib_wq_state {
+	IB_WQS_RESET,
+	IB_WQS_RDY,
+	IB_WQS_ERR
+};
+
+struct ib_wq {
+	struct ib_device       *device;
+	struct ib_uobject      *uobject;
+	void		    *wq_context;
+	void		    (*event_handler)(struct ib_event *, void *);
+	struct ib_pd	       *pd;
+	struct ib_cq	       *cq;
+	u32		wq_num;
+	enum ib_wq_state       state;
+	enum ib_wq_type	wq_type;
+	atomic_t		usecnt;
+};
+
+struct ib_wq_init_attr {
+	void		       *wq_context;
+	enum ib_wq_type	wq_type;
+	u32		max_wr;
+	u32		max_sge;
+	struct	ib_cq	       *cq;
+	void		    (*event_handler)(struct ib_event *, void *);
+};
+
+enum ib_wq_attr_mask {
+	IB_WQ_STATE	= 1 << 0,
+	IB_WQ_CUR_STATE	= 1 << 1,
+};
+
+struct ib_wq_attr {
+	enum	ib_wq_state	wq_state;
+	enum	ib_wq_state	curr_wq_state;
+};
+
 struct ib_qp {
 	struct ib_device       *device;
 	struct ib_pd	       *pd;
@@ -1771,6 +1813,14 @@  struct ib_device {
 	int			   (*check_mr_status)(struct ib_mr *mr, u32 check_mask,
 						      struct ib_mr_status *mr_status);
 	void			   (*disassociate_ucontext)(struct ib_ucontext *ibcontext);
+	struct ib_wq *		   (*create_wq)(struct ib_pd *pd,
+						struct ib_wq_init_attr *init_attr,
+						struct ib_udata *udata);
+	int			   (*destroy_wq)(struct ib_wq *wq);
+	int			   (*modify_wq)(struct ib_wq *wq,
+						struct ib_wq_attr *attr,
+						enum ib_wq_attr_mask attr_mask,
+						struct ib_udata *udata);
 
 	struct ib_dma_mapping_ops   *dma_ops;
 
@@ -3024,4 +3074,42 @@  struct net_device *ib_get_net_dev_by_params(struct ib_device *dev, u8 port,
 					    u16 pkey, const union ib_gid *gid,
 					    const struct sockaddr *addr);
 
+/**
+ * ib_create_wq - Creates a WQ associated with the specified protection
+ * domain.
+ * @pd: The protection domain associated with the WQ.
+ * @wq_init_attr: A list of initial attributes required to create the
+ * WQ. If WQ creation succeeds, then the attributes are updated to
+ * the actual capabilities of the created WQ.
+ *
+ * wq_init_attr->max_wr and wq_init_attr->max_sge determine
+ * the requested size of the WQ, and set to the actual values allocated
+ * on return.
+ * If ib_create_wq() succeeds, then max_wr and max_sge will always be
+ * at least as large as the requested values.
+ *
+ * Return Value
+ * ib_create_wq() returns a pointer to the created WQ, or NULL if the request
+ * fails.
+ */
+struct ib_wq *ib_create_wq(struct ib_pd *pd,
+			   struct ib_wq_init_attr *init_attr);
+
+/**
+ * ib_destroy_wq - Destroys the specified WQ.
+ * @wq: The WQ to destroy.
+ */
+int ib_destroy_wq(struct ib_wq *wq);
+
+/**
+ * ib_modify_wq - Modifies the specified WQ.
+ * @wq: The WQ to modify.
+ * @wq_attr: On input, specifies the WQ attributes to modify.
+ * @attr_mask: A bit-mask used to specify which attributes of the WQ
+ *   are being modified.
+ * On output, the current values of selected WQ attributes are returned.
+ */
+int ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *attr,
+		 enum ib_wq_attr_mask attr_mask);
+
 #endif /* IB_VERBS_H */