mbox series

[RFC,v1,0/6] block: add error handling for *add_disk*()

Message ID 20200429074844.6241-1-mcgrof@kernel.org (mailing list archive)
Headers show
Series block: add error handling for *add_disk*() | expand

Message

Luis Chamberlain April 29, 2020, 7:48 a.m. UTC
While working on some blktrace races I noticed that we don't do
error handling on *add_disk*() and friends. This is my initial
work on that.

This is based on linux-next tag next-20200428, you can also get this
on my branch 20200428-block-fixes [0].

Let me know what you think.

[0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20200428-blktrace-fixes

Luis Chamberlain (6):
  block: refcount the request_queue early in __device_add_disk()
  block: move disk announce work from register_disk() to a helper
  block: move disk invalidation from del_gendisk() into a helper
  block: move disk unregistration work from del_gendisk() to a helper
  block: add initial error handling for *add_disk()* and friends
  loop: add error handling support for add_disk()

 block/blk-integrity.c |  13 +-
 block/blk-sysfs.c     |   7 +-
 block/blk.h           |   5 +-
 block/genhd.c         | 366 +++++++++++++++++++++++++++---------------
 drivers/block/loop.c  |   7 +-
 include/linux/genhd.h |  16 +-
 6 files changed, 265 insertions(+), 149 deletions(-)

Comments

Bart Van Assche May 6, 2020, 1:18 a.m. UTC | #1
On 2020-04-29 00:48, Luis Chamberlain wrote:
> While working on some blktrace races I noticed that we don't do
> error handling on *add_disk*() and friends. This is my initial
> work on that.
> 
> This is based on linux-next tag next-20200428, you can also get this
> on my branch 20200428-block-fixes [0].
> 
> Let me know what you think.
Hi Luis,

Thank you for having done this work. Since triggering error paths can be
challenging, how about adding fault injection capabilities that make it
possible to trigger all modified error paths and how about adding
blktests that trigger these paths? That is the strategy that I followed
myself recently to fix an error path in blk_mq_realloc_hw_ctxs().

Thanks,

Bart.
Luis Chamberlain May 9, 2020, 3:43 a.m. UTC | #2
On Tue, May 05, 2020 at 06:18:22PM -0700, Bart Van Assche wrote:
> On 2020-04-29 00:48, Luis Chamberlain wrote:
> > While working on some blktrace races I noticed that we don't do
> > error handling on *add_disk*() and friends. This is my initial
> > work on that.
> > 
> > This is based on linux-next tag next-20200428, you can also get this
> > on my branch 20200428-block-fixes [0].
> > 
> > Let me know what you think.
> Hi Luis,
> 
> Thank you for having done this work.

My pleasure, I just made one minor change to this series, but that's
all so far. Note that break-blktrace run_0004.sh still yields:

debugfs: Directory 'loop0' with parent 'block' already present!

And so I suspect something else is up, this is even after. That's using
my latest:

https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20200508-block-fixes

Some more eyebealls on that would be useful.

> Since triggering error paths can be
> challenging, how about adding fault injection capabilities that make it
> possible to trigger all modified error paths and how about adding
> blktests that trigger these paths? That is the strategy that I followed
> myself recently to fix an error path in blk_mq_realloc_hw_ctxs().

Sure thing, but I get the impression that adding this may make it odd
to or harder to review. Shouldn't this be done after we have *some*
error handling? Right now we shouldn't regress as we never fail, and
that seemss worse.

Let me know, either way, I'll start work on it.

  Luis