
[v2] RDMA/srpt: Make slab cache names unique

Message ID 20241007203726.3076222-1-bvanassche@acm.org (mailing list archive)
State Superseded
Series [v2] RDMA/srpt: Make slab cache names unique

Commit Message

Bart Van Assche Oct. 7, 2024, 8:37 p.m. UTC
Since commit 4c39529663b9 ("slab: Warn on duplicate cache names when
DEBUG_VM=y"), slab complains about duplicate cache names. Hence this
patch. The approach is as follows:
- Maintain an xarray with the slab size as index and a reference count
  and a kmem_cache pointer as contents. Use srpt-${slab_size} as kmem
  cache name.
- Use 512-byte alignment for all slabs instead of only for some of the
  slabs.
- Increment the reference count instead of calling kmem_cache_create().
- Decrement the reference count instead of calling kmem_cache_destroy().

Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Closes: https://lore.kernel.org/linux-block/xpe6bea7rakpyoyfvspvin2dsozjmjtjktpph7rep3h25tv7fb@ooz4cu5z6bq6/
Cc: Zhu Yanjun <yanjun.zhu@linux.dev>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Fixes: 5dabcd0456d7 ("RDMA/srpt: Add support for immediate data")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 79 +++++++++++++++++++++++----
 1 file changed, 67 insertions(+), 12 deletions(-)

Changes compared to v1:
 - Instead of using an ida to make slab names unique, maintain an xarray
   with the slab size as index and the slab pointer and a reference count as
   contents.

Comments

Shinichiro Kawasaki Oct. 8, 2024, 4:36 a.m. UTC | #1
On Oct 07, 2024 / 13:37, Bart Van Assche wrote:
> Since commit 4c39529663b9 ("slab: Warn on duplicate cache names when
> DEBUG_VM=y"), slab complains about duplicate cache names. Hence this
> patch. The approach is as follows:
> - Maintain an xarray with the slab size as index and a reference count
>   and a kmem_cache pointer as contents. Use srpt-${slab_size} as kmem
>   cache name.
> - Use 512-byte alignment for all slabs instead of only for some of the
>   slabs.
> - Increment the reference count instead of calling kmem_cache_create().
> - Decrement the reference count instead of calling kmem_cache_destroy().
> 
> Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> Closes: https://lore.kernel.org/linux-block/xpe6bea7rakpyoyfvspvin2dsozjmjtjktpph7rep3h25tv7fb@ooz4cu5z6bq6/
> Cc: Zhu Yanjun <yanjun.zhu@linux.dev>
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Fixes: 5dabcd0456d7 ("RDMA/srpt: Add support for immediate data")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Thank you for the fix. I tested this v2 patch as well and confirmed that it
avoids the failures; I observed no regressions. Looks good.

Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Leon Romanovsky Oct. 8, 2024, 12:04 p.m. UTC | #2
On Mon, Oct 07, 2024 at 01:37:26PM -0700, Bart Van Assche wrote:
> Since commit 4c39529663b9 ("slab: Warn on duplicate cache names when
> DEBUG_VM=y"), slab complains about duplicate cache names. Hence this
> patch. The approach is as follows:
> - Maintain an xarray with the slab size as index and a reference count
>   and a kmem_cache pointer as contents. Use srpt-${slab_size} as kmem
>   cache name.
> - Use 512-byte alignment for all slabs instead of only for some of the
>   slabs.
> - Increment the reference count instead of calling kmem_cache_create().
> - Decrement the reference count instead of calling kmem_cache_destroy().
> 
> Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> Closes: https://lore.kernel.org/linux-block/xpe6bea7rakpyoyfvspvin2dsozjmjtjktpph7rep3h25tv7fb@ooz4cu5z6bq6/
> Cc: Zhu Yanjun <yanjun.zhu@linux.dev>
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Fixes: 5dabcd0456d7 ("RDMA/srpt: Add support for immediate data")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  drivers/infiniband/ulp/srpt/ib_srpt.c | 79 +++++++++++++++++++++++----
>  1 file changed, 67 insertions(+), 12 deletions(-)
> 
> Changes compared to v1:
>  - Instead of using an ida to make slab names unique, maintain an xarray
>    with the slab size as index and the slab pointer and a reference count as
>    contents.

<...>

> +	if (IS_ERR(xa_store(&srpt_memory_caches, object_size, e, GFP_KERNEL)))

Return values of the xarray API need to be checked with xa_is_err() instead of IS_ERR().

Thanks
Zhu Yanjun Oct. 8, 2024, 12:48 p.m. UTC | #3
On 2024/10/8 4:37, Bart Van Assche wrote:
> Since commit 4c39529663b9 ("slab: Warn on duplicate cache names when
> DEBUG_VM=y"), slab complains about duplicate cache names. Hence this
> patch. The approach is as follows:
> - Maintain an xarray with the slab size as index and a reference count
>    and a kmem_cache pointer as contents. Use srpt-${slab_size} as kmem
>    cache name.
> - Use 512-byte alignment for all slabs instead of only for some of the
>    slabs.
> - Increment the reference count instead of calling kmem_cache_create().
> - Decrement the reference count instead of calling kmem_cache_destroy().
> 
> Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> Closes: https://lore.kernel.org/linux-block/xpe6bea7rakpyoyfvspvin2dsozjmjtjktpph7rep3h25tv7fb@ooz4cu5z6bq6/
> Cc: Zhu Yanjun <yanjun.zhu@linux.dev>
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Fixes: 5dabcd0456d7 ("RDMA/srpt: Add support for immediate data")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>   drivers/infiniband/ulp/srpt/ib_srpt.c | 79 +++++++++++++++++++++++----
>   1 file changed, 67 insertions(+), 12 deletions(-)
> 
> Changes compared to v1:
>   - Instead of using an ida to make slab names unique, maintain an xarray
>     with the slab size as index and the slab pointer and a reference count as
>     contents.
> 
> diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
> index 9632afbd727b..cc18179693bf 100644
> --- a/drivers/infiniband/ulp/srpt/ib_srpt.c
> +++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
> @@ -68,6 +68,8 @@ MODULE_LICENSE("Dual BSD/GPL");
>   static u64 srpt_service_guid;
>   static DEFINE_SPINLOCK(srpt_dev_lock);	/* Protects srpt_dev_list. */
>   static LIST_HEAD(srpt_dev_list);	/* List of srpt_device structures. */
> +static DEFINE_MUTEX(srpt_mc_mutex);	/* Protects srpt_memory_caches. */

I am not sure whether a call to mutex_destroy() is also needed in this commit.

Zhu Yanjun

> +static DEFINE_XARRAY(srpt_memory_caches); /* See also srpt_memory_cache_entry */
>   
>   static unsigned srp_max_req_size = DEFAULT_MAX_REQ_SIZE;
>   module_param(srp_max_req_size, int, 0444);
> @@ -105,6 +107,62 @@ static void srpt_recv_done(struct ib_cq *cq, struct ib_wc *wc);
>   static void srpt_send_done(struct ib_cq *cq, struct ib_wc *wc);
>   static void srpt_process_wait_list(struct srpt_rdma_ch *ch);
>   
> +/* Type of the entries in srpt_memory_caches. */
> +struct srpt_memory_cache_entry {
> +	refcount_t ref;
> +	struct kmem_cache *c;
> +};
> +
> +static struct kmem_cache *srpt_cache_get(unsigned int object_size)
> +{
> +	struct srpt_memory_cache_entry *e;
> +	char name[32];
> +
> +	guard(mutex)(&srpt_mc_mutex);
> +	e = xa_load(&srpt_memory_caches, object_size);
> +	if (e) {
> +		refcount_inc(&e->ref);
> +		return e->c;
> +	}
> +	snprintf(name, sizeof(name), "srpt-%u", object_size);
> +	e = kmalloc(sizeof(*e), GFP_KERNEL);
> +	if (!e)
> +		return NULL;
> +	refcount_set(&e->ref, 1);
> +	e->c = kmem_cache_create(name, object_size, /*align=*/512, 0, NULL);
> +	if (!e->c)
> +		goto free_entry;
> +	if (IS_ERR(xa_store(&srpt_memory_caches, object_size, e, GFP_KERNEL)))
> +		goto destroy_cache;
> +	return e->c;
> +
> +destroy_cache:
> +	kmem_cache_destroy(e->c);
> +
> +free_entry:
> +	kfree(e);
> +	return NULL;
> +}
> +
> +static void srpt_cache_put(struct kmem_cache *c)
> +{
> +	struct srpt_memory_cache_entry *e = NULL;
> +	unsigned long object_size;
> +
> +	guard(mutex)(&srpt_mc_mutex);
> +	xa_for_each(&srpt_memory_caches, object_size, e)
> +		if (e->c == c)
> +			break;
> +	if (WARN_ON_ONCE(!e))
> +		return;
> +	WARN_ON_ONCE(e->c != c);
> +	if (!refcount_dec_and_test(&e->ref))
> +		return;
> +	WARN_ON_ONCE(xa_erase(&srpt_memory_caches, object_size) != e);
> +	kmem_cache_destroy(e->c);
> +	kfree(e);
> +}
> +
>   /*
>    * The only allowed channel state changes are those that change the channel
>    * state into a state with a higher numerical value. Hence the new > prev test.
> @@ -2119,13 +2177,13 @@ static void srpt_release_channel_work(struct work_struct *w)
>   			     ch->sport->sdev, ch->rq_size,
>   			     ch->rsp_buf_cache, DMA_TO_DEVICE);
>   
> -	kmem_cache_destroy(ch->rsp_buf_cache);
> +	srpt_cache_put(ch->rsp_buf_cache);
>   
>   	srpt_free_ioctx_ring((struct srpt_ioctx **)ch->ioctx_recv_ring,
>   			     sdev, ch->rq_size,
>   			     ch->req_buf_cache, DMA_FROM_DEVICE);
>   
> -	kmem_cache_destroy(ch->req_buf_cache);
> +	srpt_cache_put(ch->req_buf_cache);
>   
>   	kref_put(&ch->kref, srpt_free_ch);
>   }
> @@ -2245,8 +2303,7 @@ static int srpt_cm_req_recv(struct srpt_device *const sdev,
>   	INIT_LIST_HEAD(&ch->cmd_wait_list);
>   	ch->max_rsp_size = ch->sport->port_attrib.srp_max_rsp_size;
>   
> -	ch->rsp_buf_cache = kmem_cache_create("srpt-rsp-buf", ch->max_rsp_size,
> -					      512, 0, NULL);
> +	ch->rsp_buf_cache = srpt_cache_get(ch->max_rsp_size);
>   	if (!ch->rsp_buf_cache)
>   		goto free_ch;
>   
> @@ -2280,8 +2337,7 @@ static int srpt_cm_req_recv(struct srpt_device *const sdev,
>   		alignment_offset = round_up(imm_data_offset, 512) -
>   			imm_data_offset;
>   		req_sz = alignment_offset + imm_data_offset + srp_max_req_size;
> -		ch->req_buf_cache = kmem_cache_create("srpt-req-buf", req_sz,
> -						      512, 0, NULL);
> +		ch->req_buf_cache = srpt_cache_get(req_sz);
>   		if (!ch->req_buf_cache)
>   			goto free_rsp_ring;
>   
> @@ -2478,7 +2534,7 @@ static int srpt_cm_req_recv(struct srpt_device *const sdev,
>   			     ch->req_buf_cache, DMA_FROM_DEVICE);
>   
>   free_recv_cache:
> -	kmem_cache_destroy(ch->req_buf_cache);
> +	srpt_cache_put(ch->req_buf_cache);
>   
>   free_rsp_ring:
>   	srpt_free_ioctx_ring((struct srpt_ioctx **)ch->ioctx_ring,
> @@ -2486,7 +2542,7 @@ static int srpt_cm_req_recv(struct srpt_device *const sdev,
>   			     ch->rsp_buf_cache, DMA_TO_DEVICE);
>   
>   free_rsp_cache:
> -	kmem_cache_destroy(ch->rsp_buf_cache);
> +	srpt_cache_put(ch->rsp_buf_cache);
>   
>   free_ch:
>   	if (rdma_cm_id)
> @@ -3055,7 +3111,7 @@ static void srpt_free_srq(struct srpt_device *sdev)
>   	srpt_free_ioctx_ring((struct srpt_ioctx **)sdev->ioctx_ring, sdev,
>   			     sdev->srq_size, sdev->req_buf_cache,
>   			     DMA_FROM_DEVICE);
> -	kmem_cache_destroy(sdev->req_buf_cache);
> +	srpt_cache_put(sdev->req_buf_cache);
>   	sdev->srq = NULL;
>   }
>   
> @@ -3082,8 +3138,7 @@ static int srpt_alloc_srq(struct srpt_device *sdev)
>   	pr_debug("create SRQ #wr= %d max_allow=%d dev= %s\n", sdev->srq_size,
>   		 sdev->device->attrs.max_srq_wr, dev_name(&device->dev));
>   
> -	sdev->req_buf_cache = kmem_cache_create("srpt-srq-req-buf",
> -						srp_max_req_size, 0, 0, NULL);
> +	sdev->req_buf_cache = srpt_cache_get(srp_max_req_size);
>   	if (!sdev->req_buf_cache)
>   		goto free_srq;
>   
> @@ -3105,7 +3160,7 @@ static int srpt_alloc_srq(struct srpt_device *sdev)
>   	return 0;
>   
>   free_cache:
> -	kmem_cache_destroy(sdev->req_buf_cache);
> +	srpt_cache_put(sdev->req_buf_cache);
>   
>   free_srq:
>   	ib_destroy_srq(srq);
Bart Van Assche Oct. 9, 2024, 8:47 p.m. UTC | #4
On 10/8/24 5:04 AM, Leon Romanovsky wrote:
> On Mon, Oct 07, 2024 at 01:37:26PM -0700, Bart Van Assche wrote:
>> +	if (IS_ERR(xa_store(&srpt_memory_caches, object_size, e, GFP_KERNEL)))
> 
> Return values of the xarray API need to be checked with xa_is_err() instead of IS_ERR().

Thanks Leon. I will fix this. In case you were not aware: this is the first
patch in which I use the xarray API.

Bart.
Bart Van Assche Oct. 9, 2024, 9 p.m. UTC | #5
On 10/8/24 5:48 AM, Zhu Yanjun wrote:
> On 2024/10/8 4:37, Bart Van Assche wrote:
>> +static DEFINE_MUTEX(srpt_mc_mutex);    /* Protects srpt_memory_caches. */
> 
> I am not sure whether a call to mutex_destroy() is also needed in this commit.

Probably not, since the module memory is freed shortly after
srpt_cleanup_module() has been called. From kernel/locking/mutex-debug.c:

void mutex_destroy(struct mutex *lock)
{
	DEBUG_LOCKS_WARN_ON(mutex_is_locked(lock));
	lock->magic = NULL;
}

Bart.

Patch

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 9632afbd727b..cc18179693bf 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -68,6 +68,8 @@  MODULE_LICENSE("Dual BSD/GPL");
 static u64 srpt_service_guid;
 static DEFINE_SPINLOCK(srpt_dev_lock);	/* Protects srpt_dev_list. */
 static LIST_HEAD(srpt_dev_list);	/* List of srpt_device structures. */
+static DEFINE_MUTEX(srpt_mc_mutex);	/* Protects srpt_memory_caches. */
+static DEFINE_XARRAY(srpt_memory_caches); /* See also srpt_memory_cache_entry */
 
 static unsigned srp_max_req_size = DEFAULT_MAX_REQ_SIZE;
 module_param(srp_max_req_size, int, 0444);
@@ -105,6 +107,62 @@  static void srpt_recv_done(struct ib_cq *cq, struct ib_wc *wc);
 static void srpt_send_done(struct ib_cq *cq, struct ib_wc *wc);
 static void srpt_process_wait_list(struct srpt_rdma_ch *ch);
 
+/* Type of the entries in srpt_memory_caches. */
+struct srpt_memory_cache_entry {
+	refcount_t ref;
+	struct kmem_cache *c;
+};
+
+static struct kmem_cache *srpt_cache_get(unsigned int object_size)
+{
+	struct srpt_memory_cache_entry *e;
+	char name[32];
+
+	guard(mutex)(&srpt_mc_mutex);
+	e = xa_load(&srpt_memory_caches, object_size);
+	if (e) {
+		refcount_inc(&e->ref);
+		return e->c;
+	}
+	snprintf(name, sizeof(name), "srpt-%u", object_size);
+	e = kmalloc(sizeof(*e), GFP_KERNEL);
+	if (!e)
+		return NULL;
+	refcount_set(&e->ref, 1);
+	e->c = kmem_cache_create(name, object_size, /*align=*/512, 0, NULL);
+	if (!e->c)
+		goto free_entry;
+	if (IS_ERR(xa_store(&srpt_memory_caches, object_size, e, GFP_KERNEL)))
+		goto destroy_cache;
+	return e->c;
+
+destroy_cache:
+	kmem_cache_destroy(e->c);
+
+free_entry:
+	kfree(e);
+	return NULL;
+}
+
+static void srpt_cache_put(struct kmem_cache *c)
+{
+	struct srpt_memory_cache_entry *e = NULL;
+	unsigned long object_size;
+
+	guard(mutex)(&srpt_mc_mutex);
+	xa_for_each(&srpt_memory_caches, object_size, e)
+		if (e->c == c)
+			break;
+	if (WARN_ON_ONCE(!e))
+		return;
+	WARN_ON_ONCE(e->c != c);
+	if (!refcount_dec_and_test(&e->ref))
+		return;
+	WARN_ON_ONCE(xa_erase(&srpt_memory_caches, object_size) != e);
+	kmem_cache_destroy(e->c);
+	kfree(e);
+}
+
 /*
  * The only allowed channel state changes are those that change the channel
  * state into a state with a higher numerical value. Hence the new > prev test.
@@ -2119,13 +2177,13 @@  static void srpt_release_channel_work(struct work_struct *w)
 			     ch->sport->sdev, ch->rq_size,
 			     ch->rsp_buf_cache, DMA_TO_DEVICE);
 
-	kmem_cache_destroy(ch->rsp_buf_cache);
+	srpt_cache_put(ch->rsp_buf_cache);
 
 	srpt_free_ioctx_ring((struct srpt_ioctx **)ch->ioctx_recv_ring,
 			     sdev, ch->rq_size,
 			     ch->req_buf_cache, DMA_FROM_DEVICE);
 
-	kmem_cache_destroy(ch->req_buf_cache);
+	srpt_cache_put(ch->req_buf_cache);
 
 	kref_put(&ch->kref, srpt_free_ch);
 }
@@ -2245,8 +2303,7 @@  static int srpt_cm_req_recv(struct srpt_device *const sdev,
 	INIT_LIST_HEAD(&ch->cmd_wait_list);
 	ch->max_rsp_size = ch->sport->port_attrib.srp_max_rsp_size;
 
-	ch->rsp_buf_cache = kmem_cache_create("srpt-rsp-buf", ch->max_rsp_size,
-					      512, 0, NULL);
+	ch->rsp_buf_cache = srpt_cache_get(ch->max_rsp_size);
 	if (!ch->rsp_buf_cache)
 		goto free_ch;
 
@@ -2280,8 +2337,7 @@  static int srpt_cm_req_recv(struct srpt_device *const sdev,
 		alignment_offset = round_up(imm_data_offset, 512) -
 			imm_data_offset;
 		req_sz = alignment_offset + imm_data_offset + srp_max_req_size;
-		ch->req_buf_cache = kmem_cache_create("srpt-req-buf", req_sz,
-						      512, 0, NULL);
+		ch->req_buf_cache = srpt_cache_get(req_sz);
 		if (!ch->req_buf_cache)
 			goto free_rsp_ring;
 
@@ -2478,7 +2534,7 @@  static int srpt_cm_req_recv(struct srpt_device *const sdev,
 			     ch->req_buf_cache, DMA_FROM_DEVICE);
 
 free_recv_cache:
-	kmem_cache_destroy(ch->req_buf_cache);
+	srpt_cache_put(ch->req_buf_cache);
 
 free_rsp_ring:
 	srpt_free_ioctx_ring((struct srpt_ioctx **)ch->ioctx_ring,
@@ -2486,7 +2542,7 @@  static int srpt_cm_req_recv(struct srpt_device *const sdev,
 			     ch->rsp_buf_cache, DMA_TO_DEVICE);
 
 free_rsp_cache:
-	kmem_cache_destroy(ch->rsp_buf_cache);
+	srpt_cache_put(ch->rsp_buf_cache);
 
 free_ch:
 	if (rdma_cm_id)
@@ -3055,7 +3111,7 @@  static void srpt_free_srq(struct srpt_device *sdev)
 	srpt_free_ioctx_ring((struct srpt_ioctx **)sdev->ioctx_ring, sdev,
 			     sdev->srq_size, sdev->req_buf_cache,
 			     DMA_FROM_DEVICE);
-	kmem_cache_destroy(sdev->req_buf_cache);
+	srpt_cache_put(sdev->req_buf_cache);
 	sdev->srq = NULL;
 }
 
@@ -3082,8 +3138,7 @@  static int srpt_alloc_srq(struct srpt_device *sdev)
 	pr_debug("create SRQ #wr= %d max_allow=%d dev= %s\n", sdev->srq_size,
 		 sdev->device->attrs.max_srq_wr, dev_name(&device->dev));
 
-	sdev->req_buf_cache = kmem_cache_create("srpt-srq-req-buf",
-						srp_max_req_size, 0, 0, NULL);
+	sdev->req_buf_cache = srpt_cache_get(srp_max_req_size);
 	if (!sdev->req_buf_cache)
 		goto free_srq;
 
@@ -3105,7 +3160,7 @@  static int srpt_alloc_srq(struct srpt_device *sdev)
 	return 0;
 
 free_cache:
-	kmem_cache_destroy(sdev->req_buf_cache);
+	srpt_cache_put(sdev->req_buf_cache);
 
 free_srq:
 	ib_destroy_srq(srq);