Message ID | 56138854.4040209@dev.mellanox.co.il (mailing list archive)
---|---
State | Superseded
On Tue, Oct 06, 2015 at 11:37:40AM +0300, Sagi Grimberg wrote:
> The issue is that the device requires the MR page array to have
> an alignment (0x40 for mlx4 and 0x400 for mlx5). When I modified the
> page array allocation to be non-coherent I didn't take care of
> alignment.

Just curious: why did you switch away from the coherent dma allocations
anyway? Seems like the page lists are mapped as long as they are
allocated, so the coherent allocator would seem like a nice fit.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/7/2015 12:20 PM, Christoph Hellwig wrote:
> On Tue, Oct 06, 2015 at 11:37:40AM +0300, Sagi Grimberg wrote:
>> The issue is that the device requires the MR page array to have
>> an alignment (0x40 for mlx4 and 0x400 for mlx5). When I modified the
>> page array allocation to be non-coherent I didn't take care of
>> alignment.
>
> Just curious: why did you switch away from the coherent dma allocations
> anyway? Seems like the page lists are mapped as long as they are
> allocated so the coherent allocator would seem like a nice fit.

Bart suggested that having to sync once for the entire page list might
perform better than coherent memory. I'll settle either way, since using
non-coherent memory might cause higher-order allocations due to
alignment, so it's not free of charge.

Sagi.
On Wed, Oct 07, 2015 at 12:25:25PM +0300, Sagi Grimberg wrote:
> Bart suggested that having to sync once for the entire page list might
> perform better than coherent memory. I'll settle either way since using
> non-coherent memory might cause higher-order allocations due to
> alignment, so it's not free-of-charge.

I don't really care either way; it just seemed like an odd change hiding
in here that I missed when reviewing earlier.
> I don't really care either way, it just seemed like an odd change hiding
> in here that I missed when reviewing earlier.

OK, so I'm sticking with it until someone suggests otherwise.

Sagi.
On 10/07/2015 02:20 AM, Christoph Hellwig wrote:
> On Tue, Oct 06, 2015 at 11:37:40AM +0300, Sagi Grimberg wrote:
>> The issue is that the device requires the MR page array to have
>> an alignment (0x40 for mlx4 and 0x400 for mlx5). When I modified the
>> page array allocation to be non-coherent I didn't take care of
>> alignment.
>
> Just curious: why did you switch away from the coherent dma allocations
> anyway? Seems like the page lists are mapped as long as they are
> allocated so the coherent allocator would seem like a nice fit.

Hello Christoph,

My concern is that caching and/or write combining might be disabled for
DMA-coherent memory regions. This is why I assume that calling
dma_map_single() and dma_unmap_single() will be faster for registering
multiple pages as a single memory region than using DMA-coherent memory.

Bart.
diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index de6eab3..4c69247 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -129,6 +129,8 @@ struct mlx4_ib_cq {
 	struct list_head	recv_qp_list;
 };
 
+#define MLX4_MR_PAGES_ALIGN 0x40
+
 struct mlx4_ib_mr {
 	struct ib_mr		ibmr;
 	__be64			*pages;
@@ -137,6 +139,7 @@ struct mlx4_ib_mr {
 	u32			max_pages;
 	struct mlx4_mr		mmr;
 	struct ib_umem	       *umem;
+	void			*pages_alloc;
 };
 
 struct mlx4_ib_mw {
diff --git a/drivers/infiniband/hw/mlx4/mr.c b/drivers/infiniband/hw/mlx4/mr.c
index fa01f75..d3f8175 100644
--- a/drivers/infiniband/hw/mlx4/mr.c
+++ b/drivers/infiniband/hw/mlx4/mr.c
@@ -279,10 +279,14 @@ mlx4_alloc_priv_pages(struct ib_device *device,
 	int size = max_pages * sizeof(u64);
 	int ret;
 
-	mr->pages = kzalloc(size, GFP_KERNEL);
-	if (!mr->pages)
+	size += max_t(int, MLX4_MR_PAGES_ALIGN - ARCH_KMALLOC_MINALIGN, 0);
+
+	mr->pages_alloc = kzalloc(size, GFP_KERNEL);
+	if (!mr->pages_alloc)
 		return -ENOMEM;
 
+	mr->pages = PTR_ALIGN(mr->pages_alloc, MLX4_MR_PAGES_ALIGN);
+
 	mr->page_map = dma_map_single(device->dma_device, mr->pages,
 				      size, DMA_TO_DEVICE);
 
@@ -293,20 +297,22 @@ mlx4_alloc_priv_pages(struct ib_device *device,
 	return 0;
 err:
-	kfree(mr->pages);
+	kfree(mr->pages_alloc);
 	return ret;
 }
 
 static void
 mlx4_free_priv_pages(struct mlx4_ib_mr *mr)
 {
-	struct ib_device *device = mr->ibmr.device;
-	int size = mr->max_pages * sizeof(u64);
-
 	if (mr->pages) {
+		struct ib_device *device = mr->ibmr.device;
+		int size = mr->max_pages * sizeof(u64);
+
+		size += max_t(int, MLX4_MR_PAGES_ALIGN - ARCH_KMALLOC_MINALIGN, 0);
+
 		dma_unmap_single(device->dma_device, mr->page_map,
 				 size, DMA_TO_DEVICE);
-		kfree(mr->pages);
+		kfree(mr->pages_alloc);
 		mr->pages = NULL;
 	}