Message ID | 20200212202320.GA2704@avx2 (mailing list archive) |
---|---|
State | New, archived |
Series | null_blk: fix spurious IO errors after failed past-wp access |
Alexey, thanks for the patch however the description is not simple to understand.
I just sent a patch with a description and the test result.

On 02/12/2020 12:23 PM, Alexey Dobriyan wrote:
> Steps to reproduce:
>
> 	BLKRESETZONE zone 0
>
> 	// force EIO
> 	pwrite(fd, buf, 4096, 4096);
>
> 	[issue more IO including zone ioctls]
>
> It will start failing randomly including IO to unrelated zones because of
> ->error "reuse". Trigger can be partition detection as well if test is not
> run immediately which is even more entertaining.
>
> The fix is of course to clear ->error where necessary.
>
> Signed-off-by: Alexey Dobriyan (SK hynix) <adobriyan@gmail.com>
> ---
>
>  drivers/block/null_blk_main.c | 2 ++

On Wed, Feb 12, 2020 at 11:23:20PM +0300, Alexey Dobriyan wrote:
> Steps to reproduce:
>
> 	BLKRESETZONE zone 0
>
> 	// force EIO
> 	pwrite(fd, buf, 4096, 4096);
>
> 	[issue more IO including zone ioctls]
>
> It will start failing randomly including IO to unrelated zones because of
> ->error "reuse". Trigger can be partition detection as well if test is not
> run immediately which is even more entertaining.
>
> The fix is of course to clear ->error where necessary.
>
> Signed-off-by: Alexey Dobriyan (SK hynix) <adobriyan@gmail.com>
> ---
>
>  drivers/block/null_blk_main.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> --- a/drivers/block/null_blk_main.c
> +++ b/drivers/block/null_blk_main.c
> @@ -605,6 +605,7 @@ static struct nullb_cmd *__alloc_cmd(struct nullb_queue *nq)
>  	if (tag != -1U) {
>  		cmd = &nq->cmds[tag];
>  		cmd->tag = tag;
> +		cmd->error = BLK_STS_OK;

I'd place this line in null_queue_bio to match the blk-mq patch more closely.

Otherwise this looks fine:

Reviewed-by: Christoph Hellwig <hch@lst.de>

Can you add your testcase to blktests?

On 2/12/20 1:23 PM, Alexey Dobriyan wrote:
> Steps to reproduce:
>
> 	BLKRESETZONE zone 0
>
> 	// force EIO
> 	pwrite(fd, buf, 4096, 4096);
>
> 	[issue more IO including zone ioctls]
>
> It will start failing randomly including IO to unrelated zones because of
> ->error "reuse". Trigger can be partition detection as well if test is not
> run immediately which is even more entertaining.
>
> The fix is of course to clear ->error where necessary.

Applied, thanks.

--- a/drivers/block/null_blk_main.c
+++ b/drivers/block/null_blk_main.c
@@ -605,6 +605,7 @@ static struct nullb_cmd *__alloc_cmd(struct nullb_queue *nq)
 	if (tag != -1U) {
 		cmd = &nq->cmds[tag];
 		cmd->tag = tag;
+		cmd->error = BLK_STS_OK;
 		cmd->nq = nq;
 		if (nq->dev->irqmode == NULL_IRQ_TIMER) {
 			hrtimer_init(&cmd->timer, CLOCK_MONOTONIC,
@@ -1385,6 +1386,7 @@ static blk_status_t null_queue_rq(struct blk_mq_hw_ctx *hctx,
 		cmd->timer.function = null_cmd_timer_expired;
 	}
 	cmd->rq = bd->rq;
+	cmd->error = BLK_STS_OK;
 	cmd->nq = nq;
 
 	blk_mq_start_request(bd->rq);

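To illustrate why a missing reset of ->error leaks a failure into later commands, here is a small userspace model of a tag-indexed command pool. It is only a sketch and not the driver's code: `struct cmd`, `struct queue`, `alloc_cmd`, `submit_io`, the `clear_error` switch and the status values are all hypothetical names invented for this example, and the model assumes the completion path sets the error field only on failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 4

/* Hypothetical stand-ins for BLK_STS_OK / BLK_STS_IOERR. */
enum status { STS_OK = 0, STS_IOERR = 10 };

struct cmd {
	unsigned int tag;
	enum status error;	/* survives across reuse of the same slot */
};

struct queue {
	struct cmd cmds[POOL_SIZE];
	bool busy[POOL_SIZE];
};

/* Hand out the first free slot; optionally clear the stale error. */
static struct cmd *alloc_cmd(struct queue *q, bool clear_error)
{
	for (unsigned int tag = 0; tag < POOL_SIZE; tag++) {
		if (!q->busy[tag]) {
			struct cmd *cmd = &q->cmds[tag];

			q->busy[tag] = true;
			cmd->tag = tag;
			if (clear_error)
				cmd->error = STS_OK;	/* models the added line */
			return cmd;
		}
	}
	return NULL;
}

static void free_cmd(struct queue *q, struct cmd *cmd)
{
	q->busy[cmd->tag] = false;
}

/*
 * Submit one IO. Assumption of this model: the handler writes the error
 * field only when the IO fails and never resets it on success.
 */
enum status submit_io(struct queue *q, bool should_fail, bool clear_error)
{
	struct cmd *cmd = alloc_cmd(q, clear_error);
	enum status result;

	assert(cmd != NULL);
	if (should_fail)
		cmd->error = STS_IOERR;
	result = cmd->error;
	free_cmd(q, cmd);
	return result;
}
```

With `clear_error` set to false, a successful IO that happens to reuse the slot of an earlier failed IO reports the old error; with it set to true, the same sequence reports success.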
Steps to reproduce:

	BLKRESETZONE zone 0

	// force EIO
	pwrite(fd, buf, 4096, 4096);

	[issue more IO including zone ioctls]

It will start failing randomly including IO to unrelated zones because of
->error "reuse". Trigger can be partition detection as well if test is not
run immediately which is even more entertaining.

The fix is of course to clear ->error where necessary.

Signed-off-by: Alexey Dobriyan (SK hynix) <adobriyan@gmail.com>
---

 drivers/block/null_blk_main.c | 2 ++
 1 file changed, 2 insertions(+)
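The reproduction steps above could be written out roughly as follows. This is a sketch under several assumptions that the original message does not state: the device is opened with O_DIRECT, the zone size is queried with BLKGETZONESZ, and the "more IO" step is a loop of zone resets followed by valid writes at the write pointer. The function name `repro_past_wp` and its `followups` parameter are hypothetical. It overwrites data in zone 0, so it should only be pointed at a scratch zoned null_blk device.

```c
#define _GNU_SOURCE		/* for O_DIRECT */
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/blkzoned.h>

/*
 * Returns -1 if the setup fails or the past-wp write does not fail,
 * otherwise the number of follow-up operations that failed although
 * they were valid (expected to be 0 on a fixed kernel).
 */
int repro_past_wp(const char *dev_path, int followups)
{
	struct blk_zone_range range = { .sector = 0 };
	uint32_t zone_sectors = 0;
	void *buf = NULL;
	int failures = 0;
	int fd;

	fd = open(dev_path, O_RDWR | O_DIRECT);
	if (fd < 0)
		return -1;

	if (ioctl(fd, BLKGETZONESZ, &zone_sectors) < 0 || zone_sectors == 0 ||
	    posix_memalign(&buf, 4096, 4096) != 0) {
		close(fd);
		return -1;
	}
	memset(buf, 0, 4096);
	range.nr_sectors = zone_sectors;

	/* Step 1: BLKRESETZONE zone 0, so the write pointer is at offset 0. */
	if (ioctl(fd, BLKRESETZONE, &range) < 0)
		goto fail;

	/* Step 2: force EIO by writing at 4096, past the write pointer. */
	if (pwrite(fd, buf, 4096, 4096) >= 0)
		goto fail;	/* the write was accepted, nothing to test */

	/* Step 3: issue more IO, including zone ioctls; all of it is valid. */
	for (int i = 0; i < followups; i++) {
		if (ioctl(fd, BLKRESETZONE, &range) < 0)
			failures++;
		else if (pwrite(fd, buf, 4096, 0) != 4096)
			failures++;
	}

	free(buf);
	close(fd);
	return failures;

fail:
	free(buf);
	close(fd);
	return -1;
}
```

A caller might run `repro_past_wp("/dev/nullb0", 64)` as root against a device created with zoned mode enabled; the device path here is an assumption, not something the patch specifies.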