Message ID | TYCP286MB23238842958D7C083D6B67CECA349@TYCP286MB2323.JPNP286.PROD.OUTLOOK.COM (mailing list archive)
---|---
State | New, archived
Series | [v3] block: simplify blksize_bits() implementation
I'm not sure if it matters, but the change does look fine to me:
Reviewed-by: Christoph Hellwig <hch@lst.de>
On 10/29/22 22:20, Dawei Li wrote:
> @@ -1349,12 +1349,7 @@ static inline int blk_rq_aligned(struct request_queue *q, unsigned long addr,
>  /* assumes size > 256 */
>  static inline unsigned int blksize_bits(unsigned int size)
>  {
> -	unsigned int bits = 8;
> -	do {
> -		bits++;
> -		size >>= 1;
> -	} while (size > 256);
> -	return bits;
> +	return order_base_2(size >> SECTOR_SHIFT) + SECTOR_SHIFT;
>  }

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
On 10/29/2022 10:20 PM, Dawei Li wrote:
> Convert current looping-based implementation into bit operation,
> which can bring improvement for:
>
> 1) bitops is more efficient for its arch-level optimization.
>

Do you have quantitative data to prove that? Also, which arch benefits the most? Is it true for all?

> 2) Given that blksize_bits() is inline, _if_ @size is compile-time
> constant, it's possible that order_base_2() _may_ make output
> compile-time evaluated, depending on code context and compiler behavior.
>

Patches like this need to be supported by quantitative data, else I've seen reviewers taking an objection... either way:

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>

-ck
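Neither claim was benchmarked in the thread, but both are easy to illustrate in userspace. Below is a minimal sketch, not from the thread: __builtin_clz() is assumed as a stand-in for the fls()/ilog2() machinery behind the kernel's order_base_2(), and for the power-of-two sizes blksize_bits() is defined for, floor and round-up log2 agree.

/* Userspace sketch (not kernel code) of the two claims above: a
 * builtin-backed log2 compiles to a single clz/bsr-class instruction
 * on most architectures, and it constant-folds when @size is a
 * compile-time constant. */
#include <stdio.h>

#define SECTOR_SHIFT 9

static inline unsigned int blksize_bits_sketch(unsigned int size)
{
	/* 31 - clz(n) is floor(log2(n)) for n > 0; valid here because
	 * size >> SECTOR_SHIFT is nonzero for size >= 512. */
	return (31 - (unsigned int)__builtin_clz(size >> SECTOR_SHIFT))
	       + SECTOR_SHIFT;
}

int main(void)
{
	/* gcc/clang fold this call to the constant 12 at compile time,
	 * whereas the old do/while loop is only removed if the
	 * optimizer decides to unroll it. */
	printf("%u\n", blksize_bits_sketch(4096));
	return 0;
}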
On Sun, 30 Oct 2022 13:20:08 +0800, Dawei Li wrote:
> Convert current looping-based implementation into bit operation,
> which can bring improvement for:
>
> 1) bitops is more efficient for its arch-level optimization.
>
> 2) Given that blksize_bits() is inline, _if_ @size is compile-time
> constant, it's possible that order_base_2() _may_ make output
> compile-time evaluated, depending on code context and compiler behavior.
>
> [...]

Applied, thanks!

[1/1] block: simplify blksize_bits() implementation
      commit: adff215830fcf3ef74f2f0d4dd5a47a6927d450b

Best regards,
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 57ed49f20d2e..32137d85c9ad 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1349,12 +1349,7 @@ static inline int blk_rq_aligned(struct request_queue *q, unsigned long addr,
 /* assumes size > 256 */
 static inline unsigned int blksize_bits(unsigned int size)
 {
-	unsigned int bits = 8;
-	do {
-		bits++;
-		size >>= 1;
-	} while (size > 256);
-	return bits;
+	return order_base_2(size >> SECTOR_SHIFT) + SECTOR_SHIFT;
 }
 
 static inline unsigned int block_size(struct block_device *bdev)
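For reference, a minimal userspace check (not part of the patch) that the new expression agrees with the old loop over the power-of-two block sizes blksize_bits() is documented for; order_base_2_u() below is a hypothetical stand-in for the kernel's order_base_2():

/* Userspace sketch: compare the removed loop against the new
 * expression for power-of-two sizes from 512 bytes to 64 KiB. */
#include <assert.h>
#include <stdio.h>

#define SECTOR_SHIFT 9

static unsigned int blksize_bits_old(unsigned int size)
{
	unsigned int bits = 8;
	do {
		bits++;
		size >>= 1;
	} while (size > 256);
	return bits;
}

/* Round-up log2 for n >= 1, mirroring order_base_2() semantics. */
static unsigned int order_base_2_u(unsigned int n)
{
	unsigned int order = 0;
	while ((1u << order) < n)
		order++;
	return order;
}

static unsigned int blksize_bits_new(unsigned int size)
{
	return order_base_2_u(size >> SECTOR_SHIFT) + SECTOR_SHIFT;
}

int main(void)
{
	/* Block sizes are powers of two from one sector up. */
	for (unsigned int size = 512; size <= 65536; size <<= 1) {
		assert(blksize_bits_old(size) == blksize_bits_new(size));
		printf("blksize_bits(%u) = %u\n", size,
		       blksize_bits_new(size));
	}
	return 0;
}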
Convert current looping-based implementation into bit operation,
which can bring improvement for:

1) bitops is more efficient for its arch-level optimization.

2) Given that blksize_bits() is inline, _if_ @size is compile-time
constant, it's possible that order_base_2() _may_ make output
compile-time evaluated, depending on code context and compiler behavior.

v1: https://lore.kernel.org/all/TYCP286MB2323169D81A806A7C1F7FDF1CA309@TYCP286MB2323.JPNP286.PROD.OUTLOOK.COM

v2: Remove the ternary operator, based on Bart's suggestion.
    But this may lead to breakage for the corner case below:
    BUILD_BUG_ON(blksize_bits(1025) != 11);
    So make a minor modification by adding (SECTOR_SIZE - 1) before shifting.

v3: Remove the rounding stuff.

base-commit: 30209debe98b6f66b13591e59e5272cb65b3945e

Signed-off-by: Dawei Li <set_pte_at@outlook.com>
---
 include/linux/blkdev.h | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)
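As a worked check of the v2 note above (a sketch, not from the thread; order_base_2_u() is again a hypothetical round-up-log2 stand-in for order_base_2()): shifting 1025 right by SECTOR_SHIFT truncates to 2 before the log is taken, while the (SECTOR_SIZE - 1) rounding keeps the extra byte visible.

/* Userspace sketch of the v2 corner case for size = 1025. */
#include <stdio.h>

#define SECTOR_SHIFT 9
#define SECTOR_SIZE  (1u << SECTOR_SHIFT)

static unsigned int order_base_2_u(unsigned int n)
{
	unsigned int order = 0;
	while ((1u << order) < n)
		order++;
	return order;
}

int main(void)
{
	unsigned int size = 1025;

	/* Plain v3 expression: 1025 >> 9 == 2, so the result is 10. */
	printf("%u\n", order_base_2_u(size >> SECTOR_SHIFT) + SECTOR_SHIFT);

	/* v2 rounding variant: (1025 + 511) >> 9 == 3, giving 11, which
	 * is what BUILD_BUG_ON(blksize_bits(1025) != 11) expects. */
	printf("%u\n",
	       order_base_2_u((size + SECTOR_SIZE - 1) >> SECTOR_SHIFT)
	       + SECTOR_SHIFT);
	return 0;
}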