Message ID | 20210926061124.335-1-caihuoqing@baidu.com |
---|---|
State | Changes Requested |
Delegated to: | Jason Gunthorpe |
Series | RDMA/irdma: Use dma_alloc_coherent() instead of kmalloc/dma_map_single() |
On Sun, Sep 26, 2021 at 02:11:23PM +0800, Cai Huoqing wrote:
> Replacing kmalloc/kfree/dma_map_single/dma_unmap_single()
> with dma_alloc_coherent/dma_free_coherent() helps to reduce
> code size, and simplify the code, and coherent DMA will not
> clear the cache every time.
>
> Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
> ---
>  drivers/infiniband/hw/irdma/puda.c | 19 ++++---------------
>  1 file changed, 4 insertions(+), 15 deletions(-)

This I'm not sure about, I see lots of calls to dma_sync_single_* for
this memory and it is not unconditionally true that using coherent
memory is better than doing the cache flushes. It depends very much
on the access pattern.

At the very least if you convert to coherent memory I expect to see
the sync's removed too..

Jason
On Mon, Sep 27, 2021 at 09:02:35AM -0300, Jason Gunthorpe wrote:
> This I'm not sure about, I see lots of calls to dma_sync_single_* for
> this memory and it is not unconditionally true that using coherent
> memory is better than doing the cache flushes. It depends very much
> on the access pattern.
>
> At the very least if you convert to coherent memory I expect to see
> the sync's removed too..

In general coherent memory actually is worse for not cache coherent
architectures when you have a choice, and should make no difference
for cache coherent ones. So I'd like to see numbers here instead of
a claim.
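To make the point about the sync calls concrete, here is a minimal userspace sketch of the buffer-ownership handoff that dma_sync_single_for_cpu() and dma_sync_single_for_device() express for a streaming mapping. It is a model only: every `model_*` name is hypothetical, nothing here is kernel or irdma code, and it says nothing about which variant is faster. It only shows that a receive path on a streaming buffer performs two ownership transfers per access, and that those transfers have no counterpart once the buffer is coherent.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace model only: none of these names are kernel APIs. */
enum model_owner { MODEL_OWNER_CPU, MODEL_OWNER_DEVICE };

struct model_buf {
	unsigned char data[64];
	enum model_owner owner;   /* who may touch data right now */
	unsigned int sync_calls;  /* ownership transfers performed so far */
	int coherent;             /* 1: buffer needs no ownership transfers */
};

/* Stands in for dma_sync_single_for_cpu() on a streaming mapping. */
static void model_sync_for_cpu(struct model_buf *b)
{
	b->owner = MODEL_OWNER_CPU;
	b->sync_calls++;
}

/* Stands in for dma_sync_single_for_device() on a streaming mapping. */
static void model_sync_for_device(struct model_buf *b)
{
	b->owner = MODEL_OWNER_DEVICE;
	b->sync_calls++;
}

/* A streaming buffer may only be read while the CPU owns it. */
static void model_cpu_read(const struct model_buf *b, void *dst, size_t len)
{
	assert(b->coherent || b->owner == MODEL_OWNER_CPU);
	assert(len <= sizeof(b->data));
	memcpy(dst, b->data, len);
}

/* Streaming receive path: take ownership, read, hand the buffer back. */
static void model_rx_streaming(struct model_buf *b, void *dst, size_t len)
{
	model_sync_for_cpu(b);
	model_cpu_read(b, dst, len);
	model_sync_for_device(b);
}

/* Coherent receive path: the read needs no surrounding sync calls. */
static void model_rx_coherent(struct model_buf *b, void *dst, size_t len)
{
	model_cpu_read(b, dst, len);
}
```

In this model a driver converted to coherent memory would follow the shape of model_rx_coherent(), which is why leftover sync calls in a converted driver stand out in review.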
diff --git a/drivers/infiniband/hw/irdma/puda.c b/drivers/infiniband/hw/irdma/puda.c
index 58e7d875643b..feafe21b12c6 100644
--- a/drivers/infiniband/hw/irdma/puda.c
+++ b/drivers/infiniband/hw/irdma/puda.c
@@ -151,24 +151,15 @@ static struct irdma_puda_buf *irdma_puda_alloc_buf(struct irdma_sc_dev *dev,
 
 	buf = buf_mem.va;
 	buf->mem.size = len;
-	buf->mem.va = kzalloc(buf->mem.size, GFP_KERNEL);
+	buf->mem.va = dma_alloc_coherent(dev->hw->device, len,
+					 &buf->mem.pa, GFP_KERNEL);
 	if (!buf->mem.va)
-		goto free_virt;
-	buf->mem.pa = dma_map_single(dev->hw->device, buf->mem.va,
-				     buf->mem.size, DMA_BIDIRECTIONAL);
-	if (dma_mapping_error(dev->hw->device, buf->mem.pa)) {
-		kfree(buf->mem.va);
-		goto free_virt;
-	}
+		return NULL;
 
 	buf->buf_mem.va = buf_mem.va;
 	buf->buf_mem.size = buf_mem.size;
 
 	return buf;
-
-free_virt:
-	kfree(buf_mem.va);
-	return NULL;
 }
 
 /**
@@ -179,9 +170,7 @@ static struct irdma_puda_buf *irdma_puda_alloc_buf(struct irdma_sc_dev *dev,
 static void irdma_puda_dele_buf(struct irdma_sc_dev *dev,
 				struct irdma_puda_buf *buf)
 {
-	dma_unmap_single(dev->hw->device, buf->mem.pa, buf->mem.size,
-			 DMA_BIDIRECTIONAL);
-	kfree(buf->mem.va);
+	dma_free_coherent(dev->hw->device, buf->mem.size, buf->mem.va, buf->mem.pa);
 	kfree(buf->buf_mem.va);
 }
 
Replacing kmalloc/kfree/dma_map_single/dma_unmap_single()
with dma_alloc_coherent/dma_free_coherent() helps to reduce
code size, and simplify the code, and coherent DMA will not
clear the cache every time.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
---
 drivers/infiniband/hw/irdma/puda.c | 19 ++++---------------
 1 file changed, 4 insertions(+), 15 deletions(-)