Message ID | 20191120082902.38666-1-xiubli@redhat.com (mailing list archive) |
---|---|
Series | mdsmap: fix mds choosing |
On Wed, 2019-11-20 at 03:28 -0500, xiubli@redhat.com wrote:
> From: Xiubo Li <xiubli@redhat.com>
>
> Xiubo Li (3):
>   mdsmap: add more debug info when decoding
>   mdsmap: fix mdsmap cluster available check based on laggy number
>   mdsmap: only choose one MDS who is in up:active state without laggy
>
>  fs/ceph/mds_client.c |  6 ++++--
>  fs/ceph/mdsmap.c     | 27 ++++++++++++++++++---------
>  2 files changed, 22 insertions(+), 11 deletions(-)
>

These all look good to me. I'll plan to merge them for v5.5, unless anyone else sees issues with them.

Thanks!
On 11/20/19 9:50 PM, Jeff Layton wrote:
> On Wed, 2019-11-20 at 03:28 -0500, xiubli@redhat.com wrote:
>> From: Xiubo Li <xiubli@redhat.com>
>>
>> Xiubo Li (3):
>>   mdsmap: add more debug info when decoding
>>   mdsmap: fix mdsmap cluster available check based on laggy number
>>   mdsmap: only choose one MDS who is in up:active state without laggy
>>
>>  fs/ceph/mds_client.c |  6 ++++--
>>  fs/ceph/mdsmap.c     | 27 ++++++++++++++++++---------
>>  2 files changed, 22 insertions(+), 11 deletions(-)
>>
>
> These all look good to me. I'll plan to merge them for v5.5, unless
> anyone else sees issues with them.
>
> Thanks!
>

Main problem of this series is that we need to distinguish between mds crash and transient mds laggy.
On 2019/11/21 10:42, Yan, Zheng wrote:
> On 11/20/19 9:50 PM, Jeff Layton wrote:
>> On Wed, 2019-11-20 at 03:28 -0500, xiubli@redhat.com wrote:
>>> From: Xiubo Li <xiubli@redhat.com>
>>>
>>> Xiubo Li (3):
>>>   mdsmap: add more debug info when decoding
>>>   mdsmap: fix mdsmap cluster available check based on laggy number
>>>   mdsmap: only choose one MDS who is in up:active state without laggy
>>>
>>>  fs/ceph/mds_client.c |  6 ++++--
>>>  fs/ceph/mdsmap.c     | 27 ++++++++++++++++++---------
>>>  2 files changed, 22 insertions(+), 11 deletions(-)
>>>
>>
>> These all look good to me. I'll plan to merge them for v5.5, unless
>> anyone else sees issues with them.
>>
>> Thanks!
>>
>
> Main problem of this series is that we need to distinguish between mds
> crash and transient mds laggy.

How about let's try to check and get an up:active & !laggy mds first, if we couldn't find one then fall back to one that is up:active & laggy ?

For the auth mds case, we will ignore the laggy stuff.

BRs
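The two-step selection proposed above could look roughly like the sketch below. This is only an illustration of the idea, not the kernel code from the series: `struct mds_entry`, `enum mds_state` and `choose_mds()` are hypothetical names, the sketch returns the first match instead of picking at random, and the auth MDS case (where laggy would be ignored) is not modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one entry of an MDS map. */
enum mds_state { MDS_DOWN, MDS_UP_STANDBY, MDS_UP_ACTIVE };

struct mds_entry {
    enum mds_state state;
    bool laggy;
};

/*
 * Return the index of the chosen MDS, or -1 if none is up:active.
 * Pass 0 accepts only up:active && !laggy entries.
 * Pass 1 is the fallback and accepts any up:active entry, laggy or not.
 */
static int choose_mds(const struct mds_entry *m, size_t n)
{
    for (int pass = 0; pass < 2; pass++) {
        for (size_t i = 0; i < n; i++) {
            if (m[i].state != MDS_UP_ACTIVE)
                continue;
            if (pass == 0 && m[i].laggy)
                continue;
            return (int)i;
        }
    }
    return -1;
}
```

With this shape, a transiently laggy MDS is only skipped while a healthy alternative exists, so a cluster whose up:active ranks are all laggy is still treated as usable.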
On Thu, 2019-11-21 at 19:28 +0800, Xiubo Li wrote:
> On 2019/11/21 10:42, Yan, Zheng wrote:
> > On 11/20/19 9:50 PM, Jeff Layton wrote:
> > > On Wed, 2019-11-20 at 03:28 -0500, xiubli@redhat.com wrote:
> > > > From: Xiubo Li <xiubli@redhat.com>
> > > >
> > > > Xiubo Li (3):
> > > >   mdsmap: add more debug info when decoding
> > > >   mdsmap: fix mdsmap cluster available check based on laggy number
> > > >   mdsmap: only choose one MDS who is in up:active state without laggy
> > > >
> > > >  fs/ceph/mds_client.c |  6 ++++--
> > > >  fs/ceph/mdsmap.c     | 27 ++++++++++++++++++---------
> > > >  2 files changed, 22 insertions(+), 11 deletions(-)
> > > >
> > >
> > > These all look good to me. I'll plan to merge them for v5.5, unless
> > > anyone else sees issues with them.
> > >
> > > Thanks!
> > >
> >
> > Main problem of this series is that we need to distinguish between mds
> > crash and transient mds laggy.
>
> How about let's try to check and get an up:active & !laggy mds first, if
> we couldn't find one then fall back to one that is up:active & laggy ?
>
> For the auth mds case, we will ignore the laggy stuff.
>

Ok. I've dropped this series for now with the expectation that you'll re-post when you have something ready.

Cheers,
From: Xiubo Li <xiubli@redhat.com>

Xiubo Li (3):
  mdsmap: add more debug info when decoding
  mdsmap: fix mdsmap cluster available check based on laggy number
  mdsmap: only choose one MDS who is in up:active state without laggy

 fs/ceph/mds_client.c |  6 ++++--
 fs/ceph/mdsmap.c     | 27 ++++++++++++++++++---------
 2 files changed, 22 insertions(+), 11 deletions(-)