Message ID | 20200318150257.198402-2-maxg@mellanox.com (mailing list archive) |
---|---|
State | Not Applicable |
Series | nvmet-rdma/srpt: SRQ per completion vector |
> ULP's can use this API to create/destroy SRQ's with the same
> characteristics for implementing a logic that aimed to save resources
> without significant performance penalty (e.g. create SRQ per completion
> vector and use shared receive buffers for multiple controllers of the
> ULP).

There is almost no logic in here. Is there a real point in having
in the way it is?

What is the point of creating a pool, getting all the srqs, manage
in the ulp (in an array), putting back, and destroy as a pool?

I'd expect to have a refcount for each qp referencing a srq from the
pool, and also that the pool would manage the srqs themselves.

srqs are long lived resources, unlike mrs which are taken and restored
to the pool on a per I/O basis...

Its not that I hate it or something, just not clear to me how useful it
is to have in this form...
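For illustration, a minimal sketch of the refcounted shape being asked for
above. Every name in it (rdma_srq_pool, rdma_pooled_srq, the get/put
helpers) is hypothetical and not taken from the posted patch; the
assumption is that the ULP's QP create path calls get and its QP destroy
path calls put:

#include <linux/list.h>
#include <linux/refcount.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <rdma/ib_verbs.h>

/* Sketch only: the pool owns its SRQs and hands out refcounted entries,
 * so an SRQ is destroyed only once the pool is torn down and the last QP
 * using it is gone.
 */
struct rdma_pooled_srq {
	struct ib_srq		*srq;
	refcount_t		refs;	/* 1 for the pool + 1 per QP, set at pool init */
	struct list_head	entry;
};

struct rdma_srq_pool {
	spinlock_t		lock;
	struct list_head	srqs;
};

/* QP create path: pick an SRQ and take a reference on it. */
static struct rdma_pooled_srq *rdma_pooled_srq_get(struct rdma_srq_pool *pool)
{
	struct rdma_pooled_srq *ps;

	spin_lock(&pool->lock);
	ps = list_first_entry_or_null(&pool->srqs, struct rdma_pooled_srq,
				      entry);
	if (ps) {
		refcount_inc(&ps->refs);
		/* rotate so consecutive QPs spread over the pooled SRQs */
		list_rotate_left(&pool->srqs);
	}
	spin_unlock(&pool->lock);
	return ps;
}

/* QP destroy path (and pool teardown): drop a reference, free on last put. */
static void rdma_pooled_srq_put(struct rdma_pooled_srq *ps)
{
	if (refcount_dec_and_test(&ps->refs)) {
		ib_destroy_srq(ps->srq);
		kfree(ps);
	}
}

Pool teardown would then unlink each entry and drop the pool's own
reference, so an SRQ still attached to a live QP survives until that QP's
final put.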
On 3/20/2020 7:59 AM, Sagi Grimberg wrote:
>
>> ULP's can use this API to create/destroy SRQ's with the same
>> characteristics for implementing a logic that aimed to save resources
>> without significant performance penalty (e.g. create SRQ per completion
>> vector and use shared receive buffers for multiple controllers of the
>> ULP).
>
> There is almost no logic in here. Is there a real point in having
> in the way it is?
>
> What is the point of creating a pool, getting all the srqs, manage
> in the ulp (in an array), putting back, and destroy as a pool?
>
> I'd expect to have a refcount for each qp referencing a srq from the
> pool, and also that the pool would manage the srqs themselves.
>
> srqs are long lived resources, unlike mrs which are taken and restored
> to the pool on a per I/O basis...
>
> Its not that I hate it or something, just not clear to me how useful it
> is to have in this form...

Sagi,

It's surprising to me since in my V1 two years ago I sent a pure
nvmet/RDMA implementation and no srq_pool in there. And you have asked
to add a srq_pool in the review.

Also I was asked to add another implementation with this API for another
ULP back then and I didn't have the capacity for it.

Now I've done both NVMf and SRP target implementation with the SRQ pool.

I'm ok with removing/upgrading the pool in a way everyone would be happy.

I'm ok with removing the SRP implementation if it's not needed.

I just want to add this feature to NVMf target 5.7 release.

So please decide on the implementation and I'll send the patches.

-Max.
On Thu, Mar 19, 2020 at 10:59:33PM -0700, Sagi Grimberg wrote:
> > ULP's can use this API to create/destroy SRQ's with the same
> > characteristics for implementing a logic that aimed to save resources
> > without significant performance penalty (e.g. create SRQ per completion
> > vector and use shared receive buffers for multiple controllers of the
> > ULP).
>
> There is almost no logic in here. Is there a real point in having
> in the way it is?
>
> What is the point of creating a pool, getting all the srqs, manage
> in the ulp (in an array), putting back, and destroy as a pool?
>
> I'd expect to have a refcount for each qp referencing a srq from the
> pool, and also that the pool would manage the srqs themselves.
>
> srqs are long lived resources, unlike mrs which are taken and restored
> to the pool on a per I/O basis...
>
> Its not that I hate it or something, just not clear to me how useful it
> is to have in this form...

Sagi,

Why do we need such complexity of referencing to the QPs?

This SRQ pool is intended for the kernel users whom we trust and we
don't expect that QP will be destroyed before SRQ.

Thanks
diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile
index d1b14887..ca377b0 100644
--- a/drivers/infiniband/core/Makefile
+++ b/drivers/infiniband/core/Makefile
@@ -12,7 +12,7 @@ ib_core-y :=			packer.o ud_header.o verbs.o cq.o rw.o sysfs.o \
 				roce_gid_mgmt.o mr_pool.o addr.o sa_query.o \
 				multicast.o mad.o smi.o agent.o mad_rmpp.o \
 				nldev.o restrack.o counters.o ib_core_uverbs.o \
-				trace.o
+				trace.o srq_pool.o
 
 ib_core-$(CONFIG_SECURITY_INFINIBAND) += security.o
 ib_core-$(CONFIG_CGROUP_RDMA) += cgroup.o
diff --git a/drivers/infiniband/core/srq_pool.c b/drivers/infiniband/core/srq_pool.c
new file mode 100644
index 0000000..68321f0
--- /dev/null
+++ b/drivers/infiniband/core/srq_pool.c
@@ -0,0 +1,75 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/*
+ * Copyright (c) 2020 Mellanox Technologies. All rights reserved.
+ */
+
+#include <rdma/srq_pool.h>
+
+struct ib_srq *rdma_srq_pool_get(struct ib_pd *pd)
+{
+	struct ib_srq *srq;
+	unsigned long flags;
+
+	spin_lock_irqsave(&pd->srq_lock, flags);
+	srq = list_first_entry_or_null(&pd->srqs, struct ib_srq, pd_entry);
+	if (srq)
+		list_del(&srq->pd_entry);
+	spin_unlock_irqrestore(&pd->srq_lock, flags);
+
+	return srq;
+}
+EXPORT_SYMBOL(rdma_srq_pool_get);
+
+void rdma_srq_pool_put(struct ib_pd *pd, struct ib_srq *srq)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&pd->srq_lock, flags);
+	list_add(&srq->pd_entry, &pd->srqs);
+	spin_unlock_irqrestore(&pd->srq_lock, flags);
+}
+EXPORT_SYMBOL(rdma_srq_pool_put);
+
+int rdma_srq_pool_init(struct ib_pd *pd, int nr,
+		struct ib_srq_init_attr *srq_attr)
+{
+	struct ib_srq *srq;
+	unsigned long flags;
+	int ret, i;
+
+	for (i = 0; i < nr; i++) {
+		srq = ib_create_srq(pd, srq_attr);
+		if (IS_ERR(srq)) {
+			ret = PTR_ERR(srq);
+			goto out;
+		}
+
+		spin_lock_irqsave(&pd->srq_lock, flags);
+		list_add_tail(&srq->pd_entry, &pd->srqs);
+		spin_unlock_irqrestore(&pd->srq_lock, flags);
+	}
+
+	return 0;
+out:
+	rdma_srq_pool_destroy(pd);
+	return ret;
+}
+EXPORT_SYMBOL(rdma_srq_pool_init);
+
+void rdma_srq_pool_destroy(struct ib_pd *pd)
+{
+	struct ib_srq *srq;
+	unsigned long flags;
+
+	spin_lock_irqsave(&pd->srq_lock, flags);
+	while (!list_empty(&pd->srqs)) {
+		srq = list_first_entry(&pd->srqs, struct ib_srq, pd_entry);
+		list_del(&srq->pd_entry);
+
+		spin_unlock_irqrestore(&pd->srq_lock, flags);
+		ib_destroy_srq(srq);
+		spin_lock_irqsave(&pd->srq_lock, flags);
+	}
+	spin_unlock_irqrestore(&pd->srq_lock, flags);
+}
+EXPORT_SYMBOL(rdma_srq_pool_destroy);
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index e62c9df..0bb69d2 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -272,6 +272,8 @@ struct ib_pd *__ib_alloc_pd(struct ib_device *device, unsigned int flags,
 	pd->__internal_mr = NULL;
 	atomic_set(&pd->usecnt, 0);
 	pd->flags = flags;
+	spin_lock_init(&pd->srq_lock);
+	INIT_LIST_HEAD(&pd->srqs);
 
 	pd->res.type = RDMA_RESTRACK_PD;
 	rdma_restrack_set_task(&pd->res, caller);
@@ -340,6 +342,7 @@ void ib_dealloc_pd_user(struct ib_pd *pd, struct ib_udata *udata)
 		pd->__internal_mr = NULL;
 	}
 
+	WARN_ON_ONCE(!list_empty(&pd->srqs));
 	/* uverbs manipulates usecnt with proper locking, while the kabi
 	   requires the caller to guarantee we can't race here. */
 	WARN_ON(atomic_read(&pd->usecnt));
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 1f779fa..1dcfefb 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1517,6 +1517,9 @@ struct ib_pd {
 	u32			unsafe_global_rkey;
 
+	spinlock_t		srq_lock;
+	struct list_head	srqs;
+
 	/*
 	 * Implementation details of the RDMA core, don't use in drivers:
 	 */
@@ -1585,6 +1588,7 @@ struct ib_srq {
 	void		       *srq_context;
 	enum ib_srq_type	srq_type;
 	atomic_t		usecnt;
+	struct list_head	pd_entry; /* srq pool entry */
 
 	struct {
 		struct ib_cq   *cq;
diff --git a/include/rdma/srq_pool.h b/include/rdma/srq_pool.h
new file mode 100644
index 0000000..ee83896
--- /dev/null
+++ b/include/rdma/srq_pool.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) */
+/*
+ * Copyright (c) 2020 Mellanox Technologies. All rights reserved.
+ */
+
+#ifndef _RDMA_SRQ_POOL_H
+#define _RDMA_SRQ_POOL_H
+
+#include <rdma/ib_verbs.h>
+
+struct ib_srq *rdma_srq_pool_get(struct ib_pd *pd);
+void rdma_srq_pool_put(struct ib_pd *pd, struct ib_srq *srq);
+
+int rdma_srq_pool_init(struct ib_pd *pd, int nr,
+		struct ib_srq_init_attr *srq_attr);
+void rdma_srq_pool_destroy(struct ib_pd *pd);
+
+#endif /* _RDMA_SRQ_POOL_H */
ULP's can use this API to create/destroy SRQ's with the same
characteristics for implementing a logic that aimed to save resources
without significant performance penalty (e.g. create SRQ per completion
vector and use shared receive buffers for multiple controllers of the
ULP).

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/infiniband/core/Makefile   |  2 +-
 drivers/infiniband/core/srq_pool.c | 75 ++++++++++++++++++++++++++++++++++++++
 drivers/infiniband/core/verbs.c    |  3 ++
 include/rdma/ib_verbs.h            |  4 ++
 include/rdma/srq_pool.h            | 18 +++++++++
 5 files changed, 101 insertions(+), 1 deletion(-)
 create mode 100644 drivers/infiniband/core/srq_pool.c
 create mode 100644 include/rdma/srq_pool.h
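As a rough usage sketch of the API above: a ULP could create one SRQ per
completion vector and hand each QP the SRQ matching its vector. The
rdma_srq_pool_* calls are the ones added by this patch; the ulp_* names,
the srqs array and the attribute values (srq_size, max_sge) are made up
for illustration and are not from the nvmet/srpt patches in this series:

#include <linux/slab.h>
#include <rdma/srq_pool.h>

static struct ib_srq **ulp_srqs;	/* hypothetical: one SRQ per completion vector */

static int ulp_setup_srqs(struct ib_device *dev, struct ib_pd *pd, u32 srq_size)
{
	struct ib_srq_init_attr attr = {
		.attr = {
			.max_wr		= srq_size,
			.max_sge	= 1,
		},
		.srq_type = IB_SRQT_BASIC,
	};
	int nr = dev->num_comp_vectors;
	int i, ret;

	/* create all SRQs with identical attributes in one shot */
	ret = rdma_srq_pool_init(pd, nr, &attr);
	if (ret)
		return ret;

	ulp_srqs = kcalloc(nr, sizeof(*ulp_srqs), GFP_KERNEL);
	if (!ulp_srqs) {
		rdma_srq_pool_destroy(pd);
		return -ENOMEM;
	}

	/* take one SRQ per vector; receive buffers would be posted here */
	for (i = 0; i < nr; i++)
		ulp_srqs[i] = rdma_srq_pool_get(pd);

	return 0;
}

static void ulp_teardown_srqs(struct ib_device *dev, struct ib_pd *pd)
{
	int i;

	for (i = 0; i < dev->num_comp_vectors; i++)
		rdma_srq_pool_put(pd, ulp_srqs[i]);
	kfree(ulp_srqs);
	rdma_srq_pool_destroy(pd);
}

A QP created for completion vector i would then point qp_init_attr.srq at
ulp_srqs[i], so all controllers sharing that vector share the same
receive buffers.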