Message ID | 9ded7cf6a3af7e6e577d12a835a385657da4a69e.1634676157.git.asml.silence@gmail.com
---|---
State | New, archived
Series | block optimisation round
On Tue, Oct 19, 2021 at 10:24:24PM +0100, Pavel Begunkov wrote:
> +	unsigned int op = bio_op(bio);
> +
> +	if (op != REQ_OP_READ && op != REQ_OP_WRITE && op != REQ_OP_FLUSH) {
> +		switch (op) {
> +		case REQ_OP_DISCARD:
> +		case REQ_OP_SECURE_ERASE:
> +		case REQ_OP_WRITE_ZEROES:
> +		case REQ_OP_WRITE_SAME:
> +			return true; /* non-trivial splitting decisions */
> +		default:
> +			break;
> +		}

Nesting the if and the switch is too ugly to live.  If you want ifs do
just them.  But I'd really like to see numbers for this, also compared
to explicitly checking for REQ_OP_READ and REQ_OP_WRITE and maybe using
__builtin_expect for those values.
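[For concreteness, the alternative alluded to above — explicit checks on the
hot ops plus a branch hint — might look roughly like the sketch below. This
is not code from the thread: the function tail is elided, and likely() is
the kernel's wrapper around __builtin_expect().]

	/* Hypothetical sketch, not from the thread: test the hot ops
	 * explicitly and hint the compiler that they dominate. */
	static inline bool blk_may_split(struct request_queue *q, struct bio *bio)
	{
		unsigned int op = bio_op(bio);

		/* likely() expands to __builtin_expect(!!(x), 1) */
		if (likely(op == REQ_OP_READ || op == REQ_OP_WRITE ||
			   op == REQ_OP_FLUSH))
			goto plain_rw;

		if (op == REQ_OP_DISCARD || op == REQ_OP_SECURE_ERASE ||
		    op == REQ_OP_WRITE_ZEROES || op == REQ_OP_WRITE_SAME)
			return true;	/* non-trivial splitting decisions */

	plain_rw:
		/* ... the original multi-segment checks would follow here ... */
		return false;	/* placeholder for the elided tail */
	}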
On 10/20/21 07:25, Christoph Hellwig wrote:
> On Tue, Oct 19, 2021 at 10:24:24PM +0100, Pavel Begunkov wrote:
>> +	unsigned int op = bio_op(bio);
>> +
>> +	if (op != REQ_OP_READ && op != REQ_OP_WRITE && op != REQ_OP_FLUSH) {
>> +		switch (op) {
>> +		case REQ_OP_DISCARD:
>> +		case REQ_OP_SECURE_ERASE:
>> +		case REQ_OP_WRITE_ZEROES:
>> +		case REQ_OP_WRITE_SAME:
>> +			return true; /* non-trivial splitting decisions */
>> +		default:
>> +			break;
>> +		}
>
> Nesting the if and the switch is too ugly to live.  If you want ifs do
> just them.  But I'd really like to see numbers for this, also compared
> to explicitly checking for REQ_OP_READ and REQ_OP_WRITE and maybe using
> __builtin_expect for those values.

What I want to get from the compiler is:

	if (op <= REQ_OP_FLUSH)
		goto after_switch_label;
	else
		switch () { ... }

I was trying to hint it somehow (gcc 11.1) with

	(void)__builtin_expect(op <= FLUSH, 1);

but no luck, the asm doesn't change. Not sure why people don't like it,
so let me ask: which one is better?

1)	if (op == read || op == write ...)
		goto label;
	else
		switch () { }

2)	if (op == read || op == write ...)
		goto label;
	else if ()
		...
	else if ()
		...
	else if ()
		...

For the numbers, I had profiling for the whole series (nullblk):

+    2.82%     2.82%  io_uring  [kernel.vmlinux]  [k] submit_bio_checks
+    2.51%     2.50%  io_uring  [kernel.vmlinux]  [k] submit_bio_checks

Note that the relative % for "after" should grow because of the other
optimisations, so the difference should be _a bit_ larger. Need to
retest.

And some asm (for submit_bio_checks()) for comparison.

Before:

	# block/blk-core.c:823: 	switch (op) {
	cmpl	$9, %eax	#, op
	je	.L616	#,
	ja	.L617	#,
	cmpl	$5, %eax	#, op
	je	.L618	#,
	cmpl	$7, %eax	#, op
	jne	.L696	#,
	...
.L696:
	cmpl	$3, %eax	#, op
	jne	.L621	#,
	...
.L621 (label after switch)

After:

	# block/blk-core.c:822: 	if (op != REQ_OP_READ && op != REQ_OP_WRITE && op != REQ_OP_FLUSH) {
	cmpb	$2, %al	#, _18
	# block/blk-core.c:822: 	if (op != REQ_OP_READ && op != REQ_OP_WRITE && op != REQ_OP_FLUSH) {
	jbe	.L616	#,
	...
.L616 (label after switch)
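[The codegen result above is easy to reproduce outside the kernel tree.
Below is a standalone toy version of the same branch structure; the enum
values mirror the REQ_OP_* constants of that era — they match the
immediates $3/$5/$7/$9 in the asm above — but the names are local
stand-ins, not kernel code.]

	/* fold.c - standalone reproduction of the branch-folding question.
	 * Build and inspect:  gcc -O2 -S fold.c && cat fold.s
	 * With OP_READ/OP_WRITE/OP_FLUSH contiguous from 0, the three
	 * inequality tests collapse into a single "cmp $2; jbe". */
	enum op {
		OP_READ		= 0,
		OP_WRITE	= 1,
		OP_FLUSH	= 2,
		OP_DISCARD	= 3,
		OP_SECURE_ERASE	= 5,
		OP_WRITE_SAME	= 7,
		OP_WRITE_ZEROES	= 9,
	};

	int may_split(unsigned int op)
	{
		if (op != OP_READ && op != OP_WRITE && op != OP_FLUSH) {
			switch (op) {
			case OP_DISCARD:
			case OP_SECURE_ERASE:
			case OP_WRITE_ZEROES:
			case OP_WRITE_SAME:
				return 1;	/* non-trivial splitting decisions */
			default:
				break;
			}
		}
		return 0;	/* plain read/write/flush path */
	}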
diff --git a/block/blk.h b/block/blk.h
index 6a039e6c7d07..0bf00e96e1f0 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -269,14 +269,18 @@ ssize_t part_timeout_store(struct device *, struct device_attribute *,
 static inline bool blk_may_split(struct request_queue *q, struct bio *bio)
 {
-	switch (bio_op(bio)) {
-	case REQ_OP_DISCARD:
-	case REQ_OP_SECURE_ERASE:
-	case REQ_OP_WRITE_ZEROES:
-	case REQ_OP_WRITE_SAME:
-		return true; /* non-trivial splitting decisions */
-	default:
-		break;
+	unsigned int op = bio_op(bio);
+
+	if (op != REQ_OP_READ && op != REQ_OP_WRITE && op != REQ_OP_FLUSH) {
+		switch (op) {
+		case REQ_OP_DISCARD:
+		case REQ_OP_SECURE_ERASE:
+		case REQ_OP_WRITE_ZEROES:
+		case REQ_OP_WRITE_SAME:
+			return true; /* non-trivial splitting decisions */
+		default:
+			break;
+		}
 	}
 
 	/*
Read/write/flush are the most common operations, so optimise the switch
in blk_may_split() for these cases. All three added conditions are
compiled into a single comparison, as the corresponding REQ_OP_* values
take 0-2.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 block/blk.h | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)
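[The "single comparison" claim rests on REQ_OP_READ, REQ_OP_WRITE and
REQ_OP_FLUSH being 0, 1 and 2: for an unsigned value, three inequality
tests against a contiguous range starting at zero are equivalent to one
range test. A minimal standalone check of that equivalence — plain C, no
kernel headers:]

	#include <assert.h>

	/* For unsigned v: (v != 0 && v != 1 && v != 2)  <=>  (v > 2),
	 * which is the single cmp/ja (or cmpb/jbe) seen in the asm. */
	int main(void)
	{
		for (unsigned int v = 0; v < 32; v++)
			assert((v != 0 && v != 1 && v != 2) == (v > 2));
		return 0;
	}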