[0/3] rbd: fix some issues around flushing notifies

Message ID 20200317120422.3406-1-idryomov@gmail.com (mailing list archive)

Message

Ilya Dryomov March 17, 2020, 12:04 p.m. UTC
Hello,

A recent snapshot-based mirroring experiment exposed a deadlock on
header_rwsem in the error path of rbd_dev_image_probe() (i.e. "rbd
map").

Thanks,

                Ilya


Ilya Dryomov (3):
  rbd: avoid a deadlock on header_rwsem when flushing notifies
  rbd: call rbd_dev_unprobe() after unwatching and flushing notifies
  rbd: don't test rbd_dev->opts in rbd_dev_image_release()

 drivers/block/rbd.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

Comments

Jason Dillaman March 17, 2020, 4:41 p.m. UTC | #1
On Tue, Mar 17, 2020 at 8:06 AM Ilya Dryomov <idryomov@gmail.com> wrote:
>
> Hello,
>
> A recent snapshot-based mirroring experiment exposed a deadlock on
> header_rwsem in the error path of rbd_dev_image_probe() (i.e. "rbd
> map").
>
> Thanks,
>
>                 Ilya
>
>
> Ilya Dryomov (3):
>   rbd: avoid a deadlock on header_rwsem when flushing notifies
>   rbd: call rbd_dev_unprobe() after unwatching and flushing notifies
>   rbd: don't test rbd_dev->opts in rbd_dev_image_release()
>
>  drivers/block/rbd.c | 23 ++++++++++++++---------
>  1 file changed, 14 insertions(+), 9 deletions(-)
>
> --
> 2.19.2
>

The "get_snapcontext" call is still going to hang (albeit in an
interruptible state) if the image has > 510 snapshots, correct?

Reviewed-by: Jason Dillaman <dillaman@redhat.com>

Ilya Dryomov March 17, 2020, 5:24 p.m. UTC | #2
On Tue, Mar 17, 2020 at 5:41 PM Jason Dillaman <jdillama@redhat.com> wrote:
>
> On Tue, Mar 17, 2020 at 8:06 AM Ilya Dryomov <idryomov@gmail.com> wrote:
> >
> > Hello,
> >
> > A recent snapshot-based mirroring experiment exposed a deadlock on
> > header_rwsem in the error path of rbd_dev_image_probe() (i.e. "rbd
> > map").
> >
> > Thanks,
> >
> >                 Ilya
> >
> >
> > Ilya Dryomov (3):
> >   rbd: avoid a deadlock on header_rwsem when flushing notifies
> >   rbd: call rbd_dev_unprobe() after unwatching and flushing notifies
> >   rbd: don't test rbd_dev->opts in rbd_dev_image_release()
> >
> >  drivers/block/rbd.c | 23 ++++++++++++++---------
> >  1 file changed, 14 insertions(+), 9 deletions(-)
> >
> > --
> > 2.19.2
> >
>
> The "get_snapcontext" call is still going to hang (albeit in an
> interruptible state) if the image has > 510 snapshots, correct?

Yes, this has been a limitation of the messenger and its interface
since day one.  The krbd ticket below is specifically about snapshots,
but the limitation affects a lot more than that:

  https://tracker.ceph.com/issues/12874

It is engraved pretty deeply, but I'm planning to address it while
the messenger is opened up for surgery.

Thanks,

                Ilya
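
For context on the 510 figure in the exchange above: the thread does not say where it comes from. It appears to match a reply buffer sized to fit in a single 4 KiB page, laid out as a 64-bit sequence number, a 32-bit snapshot count, and one 64-bit ID per snapshot. The sketch below only checks that arithmetic; the layout and the RBD_MAX_SNAP_COUNT name are assumptions recalled from drivers/block/rbd.c rather than quoted here, and snapc_reply_size() is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed value and name; not quoted anywhere in this thread. */
#define RBD_MAX_SNAP_COUNT 510

/*
 * Bytes needed for a snapshot context reply under the assumed layout:
 * 64-bit seq + 32-bit count + snap_count 64-bit snapshot IDs.
 */
size_t snapc_reply_size(size_t snap_count)
{
    return sizeof(uint64_t) + sizeof(uint32_t) +
           snap_count * sizeof(uint64_t);
}
```

Under these assumptions, 510 snapshots need 4092 bytes, just under 4096, while 511 would need 4100 and no longer fit in one page.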