Message ID | 20200429074844.6241-1-mcgrof@kernel.org (mailing list archive) |
---|---|
Headers | show |
Series | block: add error handling for *add_disk*() | expand |
On 2020-04-29 00:48, Luis Chamberlain wrote: > While working on some blktrace races I noticed that we don't do > error handling on *add_disk*() and friends. This is my initial > work on that. > > This is based on linux-next tag next-20200428, you can also get this > on my branch 20200428-block-fixes [0]. > > Let me know what you think. Hi Luis, Thank you for having done this work. Since triggering error paths can be challenging, how about adding fault injection capabilities that make it possible to trigger all modified error paths and how about adding blktests that trigger these paths? That is the strategy that I followed myself recently to fix an error path in blk_mq_realloc_hw_ctxs(). Thanks, Bart.
On Tue, May 05, 2020 at 06:18:22PM -0700, Bart Van Assche wrote: > On 2020-04-29 00:48, Luis Chamberlain wrote: > > While working on some blktrace races I noticed that we don't do > > error handling on *add_disk*() and friends. This is my initial > > work on that. > > > > This is based on linux-next tag next-20200428, you can also get this > > on my branch 20200428-block-fixes [0]. > > > > Let me know what you think. > Hi Luis, > > Thank you for having done this work. My pleasure, I just made one minor change to this series, but that's all so far. Note that break-blktrace run_0004.sh still yields: debugfs: Directory 'loop0' with parent 'block' already present! And so I suspect something else is up, this is even after. That's using my latest: https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20200508-block-fixes Some more eyebealls on that would be useful. > Since triggering error paths can be > challenging, how about adding fault injection capabilities that make it > possible to trigger all modified error paths and how about adding > blktests that trigger these paths? That is the strategy that I followed > myself recently to fix an error path in blk_mq_realloc_hw_ctxs(). Sure thing, but I get the impression that adding this may make it odd to or harder to review. Shouldn't this be done after we have *some* error handling? Right now we shouldn't regress as we never fail, and that seemss worse. Let me know, either way, I'll start work on it. Luis