
RDMA/irdma: Use dma_alloc_coherent() instead of kmalloc/dma_map_single()

Message ID: 20210926061124.335-1-caihuoqing@baidu.com (mailing list archive)
State: Changes Requested
Delegated to: Jason Gunthorpe
Series: RDMA/irdma: Use dma_alloc_coherent() instead of kmalloc/dma_map_single()

Commit Message

Cai Huoqing Sept. 26, 2021, 6:11 a.m. UTC
Replacing kmalloc/kfree/dma_map_single/dma_unmap_single() with
dma_alloc_coherent/dma_free_coherent() reduces code size, simplifies
the code, and coherent DMA avoids a cache flush on every transfer.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
---
 drivers/infiniband/hw/irdma/puda.c | 19 ++++---------------
 1 file changed, 4 insertions(+), 15 deletions(-)
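
For reference, the two allocation patterns being swapped look roughly
like this. This is only a sketch; dev, len and the helper names are
placeholders, not the driver's own code.

	#include <linux/dma-mapping.h>
	#include <linux/slab.h>

	/* Streaming DMA: cacheable kernel memory plus an explicit mapping.
	 * CPU access has to be bracketed with dma_sync_single_*() calls.
	 */
	static void *buf_alloc_streaming(struct device *dev, size_t len,
					 dma_addr_t *pa)
	{
		void *va = kzalloc(len, GFP_KERNEL);

		if (!va)
			return NULL;
		*pa = dma_map_single(dev, va, len, DMA_BIDIRECTIONAL);
		if (dma_mapping_error(dev, *pa)) {
			kfree(va);
			return NULL;
		}
		return va;
	}

	/* Coherent DMA: one call returns both the CPU address and the
	 * DMA address, and no per-transfer sync calls are needed.
	 */
	static void *buf_alloc_coherent(struct device *dev, size_t len,
					dma_addr_t *pa)
	{
		return dma_alloc_coherent(dev, len, pa, GFP_KERNEL);
	}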

Comments

Jason Gunthorpe Sept. 27, 2021, 12:02 p.m. UTC | #1
On Sun, Sep 26, 2021 at 02:11:23PM +0800, Cai Huoqing wrote:
> Replacing kmalloc/kfree/dma_map_single/dma_unmap_single() with
> dma_alloc_coherent/dma_free_coherent() reduces code size, simplifies
> the code, and coherent DMA avoids a cache flush on every transfer.
> 
> Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
> ---
>  drivers/infiniband/hw/irdma/puda.c | 19 ++++---------------
>  1 file changed, 4 insertions(+), 15 deletions(-)

This I'm not sure about; I see lots of calls to dma_sync_single_* for
this memory, and it is not unconditionally true that using coherent
memory is better than doing the cache flushes. It depends very much
on the access pattern.

At the very least, if you convert to coherent memory I expect to see
the syncs removed too.

Jason
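
For reference, the pattern Jason is describing looks roughly like this
(a sketch with placeholder names such as process_rx_data, not the irdma
code itself): with a dma_map_single() buffer the driver has to sync
before and after the CPU touches the data, while a buffer from
dma_alloc_coherent() is always coherent and should not be passed to
dma_sync_single_*() at all, which is why the conversion is expected to
drop these calls as well.

	/* Streaming buffer: make the device's writes visible to the CPU,
	 * do the CPU-side work, then hand ownership back to the device.
	 */
	dma_sync_single_for_cpu(dev, pa, len, DMA_BIDIRECTIONAL);
	process_rx_data(va, len);	/* hypothetical CPU-side work */
	dma_sync_single_for_device(dev, pa, len, DMA_BIDIRECTIONAL);
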
Christoph Hellwig Sept. 27, 2021, 2:30 p.m. UTC | #2
On Mon, Sep 27, 2021 at 09:02:35AM -0300, Jason Gunthorpe wrote:
> This I'm not sure about; I see lots of calls to dma_sync_single_* for
> this memory, and it is not unconditionally true that using coherent
> memory is better than doing the cache flushes. It depends very much
> on the access pattern.
> 
> At the very least, if you convert to coherent memory I expect to see
> the syncs removed too.

In general coherent memory actually is worse for non-cache-coherent
architectures when the CPU has to touch it, and should make no
difference for cache coherent ones.  So I'd like to see numbers here
instead of a claim.
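
One way to get such numbers, sketched very roughly here with
hypothetical coherent_va/streaming_va buffers of the same length, is to
time the CPU touching each kind of allocation; on a platform that is
not cache coherent the coherent buffer is typically mapped uncached, so
its CPU-side pass is expected to be noticeably slower.

	/* Hypothetical timing sketch (needs linux/ktime.h): compare a CPU
	 * pass over a dma_alloc_coherent() buffer against one over an
	 * ordinary kzalloc()/dma_map_single() buffer of the same size.
	 */
	ktime_t t0, t1;

	t0 = ktime_get();
	memset(coherent_va, 0, len);	/* uncached on non-coherent HW */
	t1 = ktime_get();
	pr_info("coherent:  %lld ns\n", ktime_to_ns(ktime_sub(t1, t0)));

	t0 = ktime_get();
	memset(streaming_va, 0, len);	/* cacheable kernel memory */
	t1 = ktime_get();
	pr_info("streaming: %lld ns\n", ktime_to_ns(ktime_sub(t1, t0)));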

Patch

diff --git a/drivers/infiniband/hw/irdma/puda.c b/drivers/infiniband/hw/irdma/puda.c
index 58e7d875643b..feafe21b12c6 100644
--- a/drivers/infiniband/hw/irdma/puda.c
+++ b/drivers/infiniband/hw/irdma/puda.c
@@ -151,24 +151,15 @@  static struct irdma_puda_buf *irdma_puda_alloc_buf(struct irdma_sc_dev *dev,
 
 	buf = buf_mem.va;
 	buf->mem.size = len;
-	buf->mem.va = kzalloc(buf->mem.size, GFP_KERNEL);
+	buf->mem.va = dma_alloc_coherent(dev->hw->device, len,
+					 &buf->mem.pa, GFP_KERNEL);
 	if (!buf->mem.va)
-		goto free_virt;
-	buf->mem.pa = dma_map_single(dev->hw->device, buf->mem.va,
-				     buf->mem.size, DMA_BIDIRECTIONAL);
-	if (dma_mapping_error(dev->hw->device, buf->mem.pa)) {
-		kfree(buf->mem.va);
-		goto free_virt;
-	}
+		return NULL;
 
 	buf->buf_mem.va = buf_mem.va;
 	buf->buf_mem.size = buf_mem.size;
 
 	return buf;
-
-free_virt:
-	kfree(buf_mem.va);
-	return NULL;
 }
 
 /**
@@ -179,9 +170,7 @@  static struct irdma_puda_buf *irdma_puda_alloc_buf(struct irdma_sc_dev *dev,
 static void irdma_puda_dele_buf(struct irdma_sc_dev *dev,
 				struct irdma_puda_buf *buf)
 {
-	dma_unmap_single(dev->hw->device, buf->mem.pa, buf->mem.size,
-			 DMA_BIDIRECTIONAL);
-	kfree(buf->mem.va);
+	dma_free_coherent(dev->hw->device, buf->mem.size, buf->mem.va, buf->mem.pa);
 	kfree(buf->buf_mem.va);
 }