libceph: add ignore cache/overlay flag if got redirect reply

Message ID CAKQB+fug_Y4y8wYe-vG=itf+0BmYFPfDm-ch7DTobtkipQz-yw@mail.gmail.com
State New

Commit Message

Jerry Lee May 18, 2020, 8:03 a.m. UTC
The OSD client should set the ignore cache/overlay flags when it gets a
redirect reply.  Otherwise, the client hangs when the cache tier is in
forward mode.

Similar issues:
   https://tracker.ceph.com/issues/23296
   https://tracker.ceph.com/issues/36406

Signed-off-by: Jerry Lee <leisurelysw24@gmail.com>
---
 net/ceph/osd_client.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Ilya Dryomov May 19, 2020, 9:14 a.m. UTC | #1
On Mon, May 18, 2020 at 10:03 AM Jerry Lee <leisurelysw24@gmail.com> wrote:
>
> osd client should ignore cache/overlay flag if got redirect reply.
> Otherwise, the client hangs when the cache tier is in forward mode.
>
> Similar issues:
>    https://tracker.ceph.com/issues/23296
>    https://tracker.ceph.com/issues/36406
>
> Signed-off-by: Jerry Lee <leisurelysw24@gmail.com>
> ---
>  net/ceph/osd_client.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index 998e26b..1d4973f 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -3649,7 +3649,9 @@ static void handle_reply(struct ceph_osd *osd,
> struct ceph_msg *msg)
>                  * supported.
>                  */
>                 req->r_t.target_oloc.pool = m.redirect.oloc.pool;
> -               req->r_flags |= CEPH_OSD_FLAG_REDIRECTED;
> +               req->r_flags |= CEPH_OSD_FLAG_REDIRECTED |
> +                               CEPH_OSD_FLAG_IGNORE_OVERLAY |
> +                               CEPH_OSD_FLAG_IGNORE_CACHE;
>                 req->r_tid = 0;
>                 __submit_request(req, false);
>                 goto out_unlock_osdc;

Hi Jerry,

Looks good (although the patch was whitespace damaged).  I've fixed
it up, but check out Documentation/process/email-clients.rst.

Also, out of curiosity, are you actually using the forward cache mode?

Thanks,

                Ilya
Jerry Lee May 19, 2020, 10:30 a.m. UTC | #2
On Tue, 19 May 2020 at 17:14, Ilya Dryomov <idryomov@gmail.com> wrote:
>
> Hi Jerry,
>
> Looks good (although the patch was whitespace damaged).  I've fixed
> it up, but check out Documentation/process/email-clients.rst.
Thanks for sharing the doc!
>
> Also, out of curiosity, are you actually using the forward cache mode?
No, we accidentally found the issue when removing a writeback cache.
The kernel client got blocked when the cache mode switched from
writeback to forward and waited for the cache pool to be flushed.

BTW, a warning (Error EPERM: 'forward' is not a well-supported cache
mode and may corrupt your data.) is shown when the cache mode is
changed to forward mode.  Does it mean that the data integrity and IO
ordering cannot be ensured in this mode?

Thanks!

- Jerry
>
> Thanks,
>
>                 Ilya
Ilya Dryomov May 19, 2020, 1:32 p.m. UTC | #3
On Tue, May 19, 2020 at 12:30 PM Jerry Lee <leisurelysw24@gmail.com> wrote:
>
> BTW, a warning (Error EPERM: 'forward' is not a well-supported cache
> mode and may corrupt your data.) is shown when the cache mode is
> changed to forward mode.  Does it mean that the data integrity and IO
> ordering cannot be ensured in this mode?

Yes.  The problem with redirects is that they can mess up the order
of requests.  The forward mode is based on redirects and therefore
inherently flawed.

Use proxy and readproxy modes instead of forward and readforward.

Thanks,

                Ilya
Jerry Lee May 20, 2020, 2:35 a.m. UTC | #4
On Tue, 19 May 2020 at 21:32, Ilya Dryomov <idryomov@gmail.com> wrote:
>
> Yes.  The problem with redirects is that they can mess up the order
> of requests.  The forward mode is based on redirects and therefore
> inherently flawed.
>
> Use proxy and readproxy modes instead of forward and readforward.
>

Thanks for the clarification.  I was referring to the Mimic version of
the cache-tiering configuration guide, which says to set forward mode
when removing a writeback cache.  However, the up-to-date doc (master)
recommends proxy mode.  I'll use proxy mode instead.

Thanks,
- Jerry

> Thanks,
>
>                 Ilya

Patch

diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index 998e26b..1d4973f 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -3649,7 +3649,9 @@ static void handle_reply(struct ceph_osd *osd, struct ceph_msg *msg)
                 * supported.
                 */
                req->r_t.target_oloc.pool = m.redirect.oloc.pool;
-               req->r_flags |= CEPH_OSD_FLAG_REDIRECTED;
+               req->r_flags |= CEPH_OSD_FLAG_REDIRECTED |
+                               CEPH_OSD_FLAG_IGNORE_OVERLAY |
+                               CEPH_OSD_FLAG_IGNORE_CACHE;
                req->r_tid = 0;
                __submit_request(req, false);
                goto out_unlock_osdc;
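
For readers who want to see why the extra flag matters, here is a minimal
userspace sketch of the hang described in the commit message.  It is not
the libceph code.  The names (demo_pool, demo_target_pool, demo_submit) and
the flag values are made up for illustration, and it assumes a simplified
model in which a request aimed at the base pool is sent to the overlay
(cache) pool unless the ignore-overlay flag is set, and in which a cache
pool in forward mode answers every request with a redirect to the base
pool.  CEPH_OSD_FLAG_IGNORE_CACHE is not modelled at all.

```c
/* Illustrative flag bits; the values are placeholders, not the kernel's. */
#define DEMO_FLAG_READ           0x1u
#define DEMO_FLAG_REDIRECTED     0x2u
#define DEMO_FLAG_IGNORE_OVERLAY 0x4u

/* Hypothetical, simplified description of a base pool with a cache tier. */
struct demo_pool {
	long id;
	long read_tier;   /* overlay pool used for reads, or -1 if none */
	long write_tier;  /* overlay pool used for writes, or -1 if none */
};

/* Pick the pool a request aimed at @pool is actually sent to. */
static long demo_target_pool(const struct demo_pool *pool, unsigned int flags)
{
	if (flags & DEMO_FLAG_IGNORE_OVERLAY)
		return pool->id;
	if ((flags & DEMO_FLAG_READ) && pool->read_tier >= 0)
		return pool->read_tier;
	if (!(flags & DEMO_FLAG_READ) && pool->write_tier >= 0)
		return pool->write_tier;
	return pool->id;
}

/*
 * Submit a request and follow redirects.  Every time the request lands on
 * the cache pool (@cache_id), the cache "replies" with a redirect and the
 * client ORs @redirect_flags into the request before resubmitting.
 *
 * Returns the number of redirects taken before the request reached the
 * base pool, or -1 if it was still bouncing after @max_tries submissions.
 */
static int demo_submit(const struct demo_pool *base, long cache_id,
		       unsigned int flags, unsigned int redirect_flags,
		       int max_tries)
{
	int tries;

	for (tries = 0; tries < max_tries; tries++) {
		if (demo_target_pool(base, flags) != cache_id)
			return tries;        /* reached the base pool */
		flags |= redirect_flags;     /* handle the redirect reply */
	}
	return -1;                           /* never left the cache tier */
}
```

In this model, with a base pool { .id = 0, .read_tier = 1, .write_tier = 1 },
resubmitting with only DEMO_FLAG_REDIRECTED keeps mapping the request back
to the cache pool, so demo_submit() gives up and returns -1.  Resubmitting
with DEMO_FLAG_REDIRECTED | DEMO_FLAG_IGNORE_OVERLAY reaches the base pool
after a single redirect.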