Message ID | 20190520111902.7104DE0184@unicorn.suse.cz (mailing list archive) |
---|---|
State | Mainlined |
Commit | 37eb86c4507abcb14fc346863e83aa8751aa4675 |
Delegated to: | Jason Gunthorpe |
Series | mlx5: avoid 64-bit division |
On Mon, May 20, 2019 at 01:19:02PM +0200, Michal Kubecek wrote:
> Commit 25c13324d03d ("IB/mlx5: Add steering SW ICM device memory type")
> breaks i386 build by introducing three 64-bit divisions. As the divisor
> is MLX5_SW_ICM_BLOCK_SIZE() which is always a power of 2, we can replace
> the division with bit operations.

Interesting, we tried to solve it differently.
I added it to our regression to be on the same side.

Thanks

On Mon, May 20, 2019 at 01:19:02PM +0200, Michal Kubecek wrote:
> Commit 25c13324d03d ("IB/mlx5: Add steering SW ICM device memory type")
> breaks i386 build by introducing three 64-bit divisions. As the divisor
> is MLX5_SW_ICM_BLOCK_SIZE() which is always a power of 2, we can replace
> the division with bit operations.
>
> Fixes: 25c13324d03d ("IB/mlx5: Add steering SW ICM device memory type")
> Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
> ---
>  drivers/infiniband/hw/mlx5/cmd.c  | 9 +++++++--
>  drivers/infiniband/hw/mlx5/main.c | 2 +-
>  2 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/infiniband/hw/mlx5/cmd.c b/drivers/infiniband/hw/mlx5/cmd.c
> index e3ec79b8f7f5..6c8645033102 100644
> --- a/drivers/infiniband/hw/mlx5/cmd.c
> +++ b/drivers/infiniband/hw/mlx5/cmd.c
> @@ -190,12 +190,12 @@ int mlx5_cmd_alloc_sw_icm(struct mlx5_dm *dm, int type, u64 length,
>  			  u16 uid, phys_addr_t *addr, u32 *obj_id)
>  {
>  	struct mlx5_core_dev *dev = dm->dev;
> -	u32 num_blocks = DIV_ROUND_UP(length, MLX5_SW_ICM_BLOCK_SIZE(dev));
>  	u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
>  	u32 in[MLX5_ST_SZ_DW(create_sw_icm_in)] = {};
>  	unsigned long *block_map;
>  	u64 icm_start_addr;
>  	u32 log_icm_size;
> +	u32 num_blocks;
>  	u32 max_blocks;
>  	u64 block_idx;
>  	void *sw_icm;
> @@ -224,6 +224,8 @@ int mlx5_cmd_alloc_sw_icm(struct mlx5_dm *dm, int type, u64 length,
>  		return -EINVAL;
>  	}
>
> +	num_blocks = (length + MLX5_SW_ICM_BLOCK_SIZE(dev) - 1) >>
> +		     MLX5_LOG_SW_ICM_BLOCK_SIZE(dev);
>  	max_blocks = BIT(log_icm_size - MLX5_LOG_SW_ICM_BLOCK_SIZE(dev));
>  	spin_lock(&dm->lock);
>  	block_idx = bitmap_find_next_zero_area(block_map,
> @@ -266,13 +268,16 @@ int mlx5_cmd_dealloc_sw_icm(struct mlx5_dm *dm, int type, u64 length,
>  			    u16 uid, phys_addr_t addr, u32 obj_id)
>  {
>  	struct mlx5_core_dev *dev = dm->dev;
> -	u32 num_blocks = DIV_ROUND_UP(length, MLX5_SW_ICM_BLOCK_SIZE(dev));
>  	u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
>  	u32 in[MLX5_ST_SZ_DW(general_obj_in_cmd_hdr)] = {};
>  	unsigned long *block_map;
> +	u32 num_blocks;
>  	u64 start_idx;
>  	int err;
>
> +	num_blocks = (length + MLX5_SW_ICM_BLOCK_SIZE(dev) - 1) >>
> +		     MLX5_LOG_SW_ICM_BLOCK_SIZE(dev);
> +
>  	switch (type) {
>  	case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
>  		start_idx =
> diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> index abac70ad5c7c..340290b883fe 100644
> --- a/drivers/infiniband/hw/mlx5/main.c
> +++ b/drivers/infiniband/hw/mlx5/main.c
> @@ -2344,7 +2344,7 @@ static int handle_alloc_dm_sw_icm(struct ib_ucontext *ctx,
>  	/* Allocation size must a multiple of the basic block size
>  	 * and a power of 2.
>  	 */
> -	act_size = roundup(attr->length, MLX5_SW_ICM_BLOCK_SIZE(dm_db->dev));
> +	act_size = round_up(attr->length, MLX5_SW_ICM_BLOCK_SIZE(dm_db->dev));
>  	act_size = roundup_pow_of_two(act_size);

It is kind of weird that we have round_up and the bitshift
version.. None of this is performance critical so why not just use
round_up everywhere?

Ariel, it is true MLX5_SW_ICM_BLOCK_SIZE will always be a power of
two?

Jason

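To illustrate the question above, here is a minimal user-space sketch contrasting a division-based rounding helper with a mask-based one. The names `roundup_div` and `round_up_mask` are hypothetical stand-ins written for this example, not the kernel macros themselves. The mask form avoids the 64-bit divide, but it only gives the right answer when the alignment is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Division-based rounding: valid for any non-zero y, but it needs a
 * 64-bit divide. Hypothetical helper, not the kernel macro. */
uint64_t roundup_div(uint64_t x, uint64_t y)
{
	return ((x + y - 1) / y) * y;
}

/* Mask-based rounding: no division, but y must be a power of two.
 * Hypothetical helper, not the kernel macro. */
uint64_t round_up_mask(uint64_t x, uint64_t y)
{
	return ((x - 1) | (y - 1)) + 1;
}

/* Self-check: both helpers agree whenever y is a power of two. */
void check_rounding_helpers(void)
{
	for (uint64_t y = 1; y <= 4096; y <<= 1)
		for (uint64_t x = 0; x <= 3 * y; x++)
			assert(roundup_div(x, y) == round_up_mask(x, y));
}
```

Calling `check_rounding_helpers()` from a test program exercises the agreement for small power-of-two alignments. With a non-power-of-two alignment such as 6 the two helpers disagree, which is why the power-of-two guarantee matters for this patch.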
On Mon, May 27, 2019 at 03:15:34PM -0300, Jason Gunthorpe wrote:
> On Mon, May 20, 2019 at 01:19:02PM +0200, Michal Kubecek wrote:
> > diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> > index abac70ad5c7c..340290b883fe 100644
> > --- a/drivers/infiniband/hw/mlx5/main.c
> > +++ b/drivers/infiniband/hw/mlx5/main.c
> > @@ -2344,7 +2344,7 @@ static int handle_alloc_dm_sw_icm(struct ib_ucontext *ctx,
> >  	/* Allocation size must a multiple of the basic block size
> >  	 * and a power of 2.
> >  	 */
> > -	act_size = roundup(attr->length, MLX5_SW_ICM_BLOCK_SIZE(dm_db->dev));
> > +	act_size = round_up(attr->length, MLX5_SW_ICM_BLOCK_SIZE(dm_db->dev));
> >  	act_size = roundup_pow_of_two(act_size);
>
> It is kind of weird that we have round_up and the bitshift
> version.. None of this is performance critical so why not just use
> round_up everywhere?
>
> Ariel, it is true MLX5_SW_ICM_BLOCK_SIZE will always be a power of
> two?

If it weren't, the requirements from the comment above could never be
satisfied as a power of two can only be a multiple of another power of
two. Which also means that what the code above does is in fact
equivalent to

  act_size = max_t(u64, roundup_pow_of_two(attr->length),
                   MLX5_SW_ICM_BLOCK_SIZE(dm_db->dev));

or

  act_size = roundup_pow_of_two(max_t(u64, attr->length,
                                      MLX5_SW_ICM_BLOCK_SIZE(dm_db->dev)));

Michal Kubecek

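The equivalence claimed above can be checked with a small user-space sketch. All function names here (`pow2_ceil`, `align_up`, `act_size_two_step`, `act_size_max_outer`, `act_size_max_inner`) are hypothetical stand-ins for the kernel helpers, and the check assumes a non-zero length and a power-of-two block size; it is an illustration, not the driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest power of two >= n, for 1 <= n <= 2^63 (user-space stand-in). */
uint64_t pow2_ceil(uint64_t n)
{
	uint64_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Round x up to a multiple of y; y must be a power of two. */
uint64_t align_up(uint64_t x, uint64_t y)
{
	return (x + y - 1) & ~(y - 1);
}

/* Two-step form: align to the block size, then round to a power of two. */
uint64_t act_size_two_step(uint64_t length, uint64_t block)
{
	return pow2_ceil(align_up(length, block));
}

/* First alternative: maximum of the rounded length and the block size. */
uint64_t act_size_max_outer(uint64_t length, uint64_t block)
{
	uint64_t p = pow2_ceil(length);

	return p > block ? p : block;
}

/* Second alternative: round the maximum of length and block size. */
uint64_t act_size_max_inner(uint64_t length, uint64_t block)
{
	return pow2_ceil(length > block ? length : block);
}

/* Self-check over small power-of-two block sizes and non-zero lengths. */
void check_act_size_equivalence(void)
{
	for (uint64_t block = 1; block <= 1024; block <<= 1)
		for (uint64_t length = 1; length <= 8 * block; length++) {
			uint64_t expected = act_size_two_step(length, block);

			assert(act_size_max_outer(length, block) == expected);
			assert(act_size_max_inner(length, block) == expected);
		}
}
```

A length of zero is left out on purpose: rounding zero up to a power of two is not meaningful, so the three forms are only compared for lengths of at least one.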
On Mon, May 20, 2019 at 02:28:35PM +0300, Leon Romanovsky wrote:
> On Mon, May 20, 2019 at 01:19:02PM +0200, Michal Kubecek wrote:
> > Commit 25c13324d03d ("IB/mlx5: Add steering SW ICM device memory type")
> > breaks i386 build by introducing three 64-bit divisions. As the divisor
> > is MLX5_SW_ICM_BLOCK_SIZE() which is always a power of 2, we can replace
> > the division with bit operations.
>
> Interesting, we tried to solve it differently.
> I added it to our regression to be on the same side.

This patch works for us.

Thanks,
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>

On Mon, May 20, 2019 at 02:28:35PM +0300, Leon Romanovsky wrote:
> > On Mon, May 20, 2019 at 01:19:02PM +0200, Michal Kubecek wrote:
> > > Commit 25c13324d03d ("IB/mlx5: Add steering SW ICM device memory type")
> > > breaks i386 build by introducing three 64-bit divisions. As the divisor
> > > is MLX5_SW_ICM_BLOCK_SIZE() which is always a power of 2, we can replace
> > > the division with bit operations.
> >
> > Interesting, we tried to solve it differently.
> > I added it to our regression to be on the same side.
> This patch works for us.

Yes, this value is guaranteed to be a power of 2.
We safely use round_up() instead as suggested in the patch.

On Mon, May 20, 2019 at 01:19:02PM +0200, Michal Kubecek wrote:
> Commit 25c13324d03d ("IB/mlx5: Add steering SW ICM device memory type")
> breaks i386 build by introducing three 64-bit divisions. As the divisor
> is MLX5_SW_ICM_BLOCK_SIZE() which is always a power of 2, we can replace
> the division with bit operations.
>
> Fixes: 25c13324d03d ("IB/mlx5: Add steering SW ICM device memory type")
> Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
> ---
>  drivers/infiniband/hw/mlx5/cmd.c  | 9 +++++++--
>  drivers/infiniband/hw/mlx5/main.c | 2 +-
>  2 files changed, 8 insertions(+), 3 deletions(-)

Applied to for-rc, thanks

Jason

diff --git a/drivers/infiniband/hw/mlx5/cmd.c b/drivers/infiniband/hw/mlx5/cmd.c
index e3ec79b8f7f5..6c8645033102 100644
--- a/drivers/infiniband/hw/mlx5/cmd.c
+++ b/drivers/infiniband/hw/mlx5/cmd.c
@@ -190,12 +190,12 @@ int mlx5_cmd_alloc_sw_icm(struct mlx5_dm *dm, int type, u64 length,
 			  u16 uid, phys_addr_t *addr, u32 *obj_id)
 {
 	struct mlx5_core_dev *dev = dm->dev;
-	u32 num_blocks = DIV_ROUND_UP(length, MLX5_SW_ICM_BLOCK_SIZE(dev));
 	u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
 	u32 in[MLX5_ST_SZ_DW(create_sw_icm_in)] = {};
 	unsigned long *block_map;
 	u64 icm_start_addr;
 	u32 log_icm_size;
+	u32 num_blocks;
 	u32 max_blocks;
 	u64 block_idx;
 	void *sw_icm;
@@ -224,6 +224,8 @@ int mlx5_cmd_alloc_sw_icm(struct mlx5_dm *dm, int type, u64 length,
 		return -EINVAL;
 	}
 
+	num_blocks = (length + MLX5_SW_ICM_BLOCK_SIZE(dev) - 1) >>
+		     MLX5_LOG_SW_ICM_BLOCK_SIZE(dev);
 	max_blocks = BIT(log_icm_size - MLX5_LOG_SW_ICM_BLOCK_SIZE(dev));
 	spin_lock(&dm->lock);
 	block_idx = bitmap_find_next_zero_area(block_map,
@@ -266,13 +268,16 @@ int mlx5_cmd_dealloc_sw_icm(struct mlx5_dm *dm, int type, u64 length,
 			    u16 uid, phys_addr_t addr, u32 obj_id)
 {
 	struct mlx5_core_dev *dev = dm->dev;
-	u32 num_blocks = DIV_ROUND_UP(length, MLX5_SW_ICM_BLOCK_SIZE(dev));
 	u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
 	u32 in[MLX5_ST_SZ_DW(general_obj_in_cmd_hdr)] = {};
 	unsigned long *block_map;
+	u32 num_blocks;
 	u64 start_idx;
 	int err;
 
+	num_blocks = (length + MLX5_SW_ICM_BLOCK_SIZE(dev) - 1) >>
+		     MLX5_LOG_SW_ICM_BLOCK_SIZE(dev);
+
 	switch (type) {
 	case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
 		start_idx =
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index abac70ad5c7c..340290b883fe 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2344,7 +2344,7 @@ static int handle_alloc_dm_sw_icm(struct ib_ucontext *ctx,
 	/* Allocation size must a multiple of the basic block size
 	 * and a power of 2.
 	 */
-	act_size = roundup(attr->length, MLX5_SW_ICM_BLOCK_SIZE(dm_db->dev));
+	act_size = round_up(attr->length, MLX5_SW_ICM_BLOCK_SIZE(dm_db->dev));
 	act_size = roundup_pow_of_two(act_size);
 
 	dm->size = act_size;

Commit 25c13324d03d ("IB/mlx5: Add steering SW ICM device memory type")
breaks i386 build by introducing three 64-bit divisions. As the divisor
is MLX5_SW_ICM_BLOCK_SIZE() which is always a power of 2, we can replace
the division with bit operations.

Fixes: 25c13324d03d ("IB/mlx5: Add steering SW ICM device memory type")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
---
 drivers/infiniband/hw/mlx5/cmd.c  | 9 +++++++--
 drivers/infiniband/hw/mlx5/main.c | 2 +-
 2 files changed, 8 insertions(+), 3 deletions(-)
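
For background: on 32-bit targets such as i386, the compiler usually turns a 64-bit `/` into a call to a libgcc helper (for example `__udivdi3`), which the kernel does not link against, so the build fails. The user-space sketch below shows why a shift can stand in for the division when the block size is a power of two. The function names and the `log_block_size` parameter are hypothetical and chosen for this example; they are not the driver's identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Block count via division, in the style of a DIV_ROUND_UP computation.
 * Hypothetical helper for illustration only. */
uint32_t num_blocks_div(uint64_t length, unsigned int log_block_size)
{
	uint64_t block_size = UINT64_C(1) << log_block_size;

	return (uint32_t)((length + block_size - 1) / block_size);
}

/* Same block count via a right shift; no 64-bit divide is needed because
 * the block size is 2^log_block_size. */
uint32_t num_blocks_shift(uint64_t length, unsigned int log_block_size)
{
	uint64_t block_size = UINT64_C(1) << log_block_size;

	return (uint32_t)((length + block_size - 1) >> log_block_size);
}

/* Self-check: both forms agree for small lengths and block sizes. */
void check_num_blocks(void)
{
	for (unsigned int log = 0; log <= 16; log++)
		for (uint64_t length = 0; length <= 5000; length++)
			assert(num_blocks_div(length, log) ==
			       num_blocks_shift(length, log));
}
```

The shift form mirrors the structure of the patched `num_blocks` computation: add the block size minus one, then shift right by the log of the block size.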