Message ID | 20240514070856.194701-1-xiubli@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | ceph: stop reconnecting to MDS after connection being closed |
Hi Xiubo,

On Tue, May 14, 2024 at 12:39 PM <xiubli@redhat.com> wrote:
>
> From: Xiubo Li <xiubli@redhat.com>
>
> The reconnect feature never been supported by MDS in mds non-RECONNECT
> state. This reconnect requests will incorrectly close the just reopened
> sessions when the MDS kills them during the "mds_session_blocklist_on_evict"
> option is disabled.
>
> Remove it for now.
>
> Fixes: 7e70f0ed9f3e ("ceph: attempt mds reconnect if mds closes our session")
> URL: https://tracker.ceph.com/issues/65647
> Signed-off-by: Xiubo Li <xiubli@redhat.com>
> ---
>  fs/ceph/mds_client.c | 3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index f5b25d178118..97a126c54578 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -6241,9 +6241,6 @@ static void mds_peer_reset(struct ceph_connection *con)
>
>  	pr_warn_client(mdsc->fsc->client, "mds%d closed our session\n",
>  		       s->s_mds);
> -	if (READ_ONCE(mdsc->fsc->mount_state) != CEPH_MOUNT_FENCE_IO &&
> -	    ceph_mdsmap_get_state(mdsc->mdsmap, s->s_mds) >= CEPH_MDS_STATE_RECONNECT)
> -		send_mds_reconnect(mdsc, s);
>  }
>
>  static void mds_dispatch(struct ceph_connection *con, struct ceph_msg *msg)
> --
> 2.44.0
>

I don't see this change in the testing branch so that the fix can be
verified with

https://github.com/ceph/ceph/pull/57458
On 7/25/24 13:22, Venky Shankar wrote:
> Hi Xiubo,
>
> On Tue, May 14, 2024 at 12:39 PM <xiubli@redhat.com> wrote:
>> From: Xiubo Li <xiubli@redhat.com>
>>
>> The reconnect feature never been supported by MDS in mds non-RECONNECT
>> state. This reconnect requests will incorrectly close the just reopened
>> sessions when the MDS kills them during the "mds_session_blocklist_on_evict"
>> option is disabled.
>>
>> Remove it for now.
>>
>> Fixes: 7e70f0ed9f3e ("ceph: attempt mds reconnect if mds closes our session")
>> URL: https://tracker.ceph.com/issues/65647
>> Signed-off-by: Xiubo Li <xiubli@redhat.com>
>> ---
>>  fs/ceph/mds_client.c | 3 ---
>>  1 file changed, 3 deletions(-)
>>
>> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
>> index f5b25d178118..97a126c54578 100644
>> --- a/fs/ceph/mds_client.c
>> +++ b/fs/ceph/mds_client.c
>> @@ -6241,9 +6241,6 @@ static void mds_peer_reset(struct ceph_connection *con)
>>
>>  	pr_warn_client(mdsc->fsc->client, "mds%d closed our session\n",
>>  		       s->s_mds);
>> -	if (READ_ONCE(mdsc->fsc->mount_state) != CEPH_MOUNT_FENCE_IO &&
>> -	    ceph_mdsmap_get_state(mdsc->mdsmap, s->s_mds) >= CEPH_MDS_STATE_RECONNECT)
>> -		send_mds_reconnect(mdsc, s);
>>  }
>>
>>  static void mds_dispatch(struct ceph_connection *con, struct ceph_msg *msg)
>> --
>> 2.44.0
>>
> I don't see this change in the testing branch so that the fix can be
> verified with
>
> https://github.com/ceph/ceph/pull/57458

Venky,

As I remember, this is buggy and there was a failure in the QA runs, so
this change won't work as is.

Let me check again.

Thanks

- Xiubo
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index f5b25d178118..97a126c54578 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -6241,9 +6241,6 @@ static void mds_peer_reset(struct ceph_connection *con)
 
 	pr_warn_client(mdsc->fsc->client, "mds%d closed our session\n",
 		       s->s_mds);
-	if (READ_ONCE(mdsc->fsc->mount_state) != CEPH_MOUNT_FENCE_IO &&
-	    ceph_mdsmap_get_state(mdsc->mdsmap, s->s_mds) >= CEPH_MDS_STATE_RECONNECT)
-		send_mds_reconnect(mdsc, s);
 }
 
 static void mds_dispatch(struct ceph_connection *con, struct ceph_msg *msg)
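
For readers following the thread, here is a rough userspace sketch of the mismatch the commit message describes: the removed check fires for any MDS state at or past RECONNECT, while the commit message says the MDS only supports reconnect in the RECONNECT state itself. This is not kernel code. The enum values are assumed to mirror the CEPH_MDS_STATE_* constants in include/linux/ceph/ceph_fs.h and should be verified against the tree, and all three function names are hypothetical. Note also that the follow-up in this thread reports a QA failure with the removal, so this only illustrates the stated rationale and does not show that the patch is correct.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative copies of the MDS state values. ASSUMPTION: these match
 * the CEPH_MDS_STATE_* constants in include/linux/ceph/ceph_fs.h.
 */
enum mds_state {
	MDS_STATE_REPLAY       = 8,
	MDS_STATE_RESOLVE      = 9,
	MDS_STATE_RECONNECT    = 10,
	MDS_STATE_REJOIN       = 11,
	MDS_STATE_CLIENTREPLAY = 12,
	MDS_STATE_ACTIVE       = 13,
};

/* The sketch relies on RECONNECT ordering before the later states. */
static_assert(MDS_STATE_RECONNECT < MDS_STATE_ACTIVE,
	      "RECONNECT must sort before ACTIVE");

/* Hypothetical model of the removed condition in mds_peer_reset(). */
bool old_check_sends_reconnect(bool fence_io, int mds_state)
{
	return !fence_io && mds_state >= MDS_STATE_RECONNECT;
}

/* Per the commit message: reconnect is only handled in RECONNECT state. */
bool mds_accepts_reconnect(int mds_state)
{
	return mds_state == MDS_STATE_RECONNECT;
}

/* True when the old check would send a reconnect the MDS does not support. */
bool old_check_mismatch(bool fence_io, int mds_state)
{
	return old_check_sends_reconnect(fence_io, mds_state) &&
	       !mds_accepts_reconnect(mds_state);
}
```

Under these assumptions, old_check_mismatch(false, MDS_STATE_ACTIVE) is true: against an active MDS the old code would still send a reconnect, which is the case the commit message calls out.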