diff mbox series

block: cleanup and fix batch completion adding conditions

Message ID 20575f0a-656e-4bb3-9d82-dec6c7e3a35c@kernel.dk (mailing list archive)
State New
Headers show
Series block: cleanup and fix batch completion adding conditions | expand

Commit Message

Jens Axboe Feb. 13, 2025, 3:18 p.m. UTC
The conditions for whether or not a request is allowed adding to a
completion batch are a bit hard to read, and they also have a few
issues. One is that ioerror may indeed be a random value on passthrough,
and it's being checked unconditionally of whether or not the given
request is a passthrough request or not.

Rewrite the conditions to be separate for easier reading, and only check
ioerror for non-passthrough requests. This fixes an issue with bio
unmapping on passthrough, where it fails getting added to a batch. This
both leads to suboptimal performance, and may trigger a potential
schedule-under-atomic condition for polled passthrough IO.

Fixes: f794f3351f26 ("block: add support for blk_mq_end_request_batch()")
Signed-off-by: Jens Axboe <axboe@kernel.dk>

---

Comments

Shinichiro Kawasaki Feb. 19, 2025, 7:26 a.m. UTC | #1
CC+: Alan,

On Feb 13, 2025 / 08:18, Jens Axboe wrote:
> The conditions for whether or not a request is allowed adding to a
> completion batch are a bit hard to read, and they also have a few
> issues. One is that ioerror may indeed be a random value on passthrough,
> and it's being checked unconditionally of whether or not the given
> request is a passthrough request or not.
> 
> Rewrite the conditions to be separate for easier reading, and only check
> ioerror for non-passthrough requests. This fixes an issue with bio
> unmapping on passthrough, where it fails getting added to a batch. This
> both leads to suboptimal performance, and may trigger a potential
> schedule-under-atomic condition for polled passthrough IO.
> 
> Fixes: f794f3351f26 ("block: add support for blk_mq_end_request_batch()")
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

I observed the blktests test case nvme/039 failure with v6.14-rc3 kernel. I
bisected and found that this patch in the v6.14-rc3 is the trigger.

The test run output is as follows:

nvme/039 => nvme0n1 (test error logging)                     [failed]
    runtime  5.378s  ...  5.354s
    --- tests/nvme/039.out      2024-09-20 11:20:26.405380875 +0900
    +++ /home/shin/Blktests/blktests/results/nvme0n1/nvme/039.out.bad   2025-02-19 16:13:05.061387179 +0900
    @@ -1,7 +1,3 @@
     Running nvme/039
    - Read(0x2) @ LBA 0, 1 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) DNR
    - Read(0x2) @ LBA 0, 1 blocks, Unknown (sct 0x3 / sc 0x75) DNR
    - Write(0x1) @ LBA 0, 1 blocks, Write Fault (sct 0x2 / sc 0x80) DNR
      Identify(0x6), Access Denied (sct 0x2 / sc 0x86) DNR cdw10=0x1 cdw11=0x0 cdw12=0x0 cdw13=0x0 cdw14=0x0 cdw15=0x0
    - Read(0x2), Invalid Command Opcode (sct 0x0 / sc 0x1) DNR cdw10=0x0 cdw11=0x0 cdw12=0x1 cdw13=0x0 cdw14=0x0 cdw15=0x0
     Test complete


The test case does error injection. Test method requires reconsideration to
adjust to this kernel change, probably. Help for fix will be appreciated.
Alan Adamson Feb. 19, 2025, 7:17 p.m. UTC | #2
On 2/18/25 11:26 PM, Shinichiro Kawasaki wrote:
> CC+: Alan,
>
> On Feb 13, 2025 / 08:18, Jens Axboe wrote:
>> The conditions for whether or not a request is allowed adding to a
>> completion batch are a bit hard to read, and they also have a few
>> issues. One is that ioerror may indeed be a random value on passthrough,
>> and it's being checked unconditionally of whether or not the given
>> request is a passthrough request or not.
>>
>> Rewrite the conditions to be separate for easier reading, and only check
>> ioerror for non-passthrough requests. This fixes an issue with bio
>> unmapping on passthrough, where it fails getting added to a batch. This
>> both leads to suboptimal performance, and may trigger a potential
>> schedule-under-atomic condition for polled passthrough IO.
>>
>> Fixes: f794f3351f26 ("block: add support for blk_mq_end_request_batch()")
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> I observed the blktests test case nvme/039 failure with v6.14-rc3 kernel. I
> bisected and found that this patch in the v6.14-rc3 is the trigger.
>
> The test run output is as follows:
>
> nvme/039 => nvme0n1 (test error logging)                     [failed]
>      runtime  5.378s  ...  5.354s
>      --- tests/nvme/039.out      2024-09-20 11:20:26.405380875 +0900
>      +++ /home/shin/Blktests/blktests/results/nvme0n1/nvme/039.out.bad   2025-02-19 16:13:05.061387179 +0900
>      @@ -1,7 +1,3 @@
>       Running nvme/039
>      - Read(0x2) @ LBA 0, 1 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) DNR
>      - Read(0x2) @ LBA 0, 1 blocks, Unknown (sct 0x3 / sc 0x75) DNR
>      - Write(0x1) @ LBA 0, 1 blocks, Write Fault (sct 0x2 / sc 0x80) DNR
>        Identify(0x6), Access Denied (sct 0x2 / sc 0x86) DNR cdw10=0x1 cdw11=0x0 cdw12=0x0 cdw13=0x0 cdw14=0x0 cdw15=0x0
>      - Read(0x2), Invalid Command Opcode (sct 0x0 / sc 0x1) DNR cdw10=0x0 cdw11=0x0 cdw12=0x1 cdw13=0x0 cdw14=0x0 cdw15=0x0
>       Test complete
>
>
> The test case does error injection. Test method requires reconsideration to
> adjust to this kernel change, probably. Help for fix will be appreciated.

Not really an issue with the test, rather the error injector is broken. 
I'll investigate.

Alan
diff mbox series

Patch

diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index a0a9007cc1e3..cc1cb5de8fb6 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -861,12 +861,22 @@  static inline bool blk_mq_add_to_batch(struct request *req,
 				       void (*complete)(struct io_comp_batch *))
 {
 	/*
-	 * blk_mq_end_request_batch() can't end request allocated from
-	 * sched tags
+	 * Check various conditions that exclude batch processing:
+	 * 1) No batch container
+	 * 2) Has scheduler data attached
+	 * 3) Not a passthrough request and end_io set
+	 * 4) Not a passthrough request and an ioerror
 	 */
-	if (!iob || (req->rq_flags & RQF_SCHED_TAGS) || ioerror ||
-			(req->end_io && !blk_rq_is_passthrough(req)))
+	if (!iob)
 		return false;
+	if (req->rq_flags & RQF_SCHED_TAGS)
+		return false;
+	if (!blk_rq_is_passthrough(req)) {
+		if (req->end_io)
+			return false;
+		if (ioerror < 0)
+			return false;
+	}
 
 	if (!iob->complete)
 		iob->complete = complete;