diff mbox series

[v2] ceph: defer stopping the mdsc delayed_work

Message ID 20230724084214.321005-1-xiubli@redhat.com (mailing list archive)
State New, archived
Headers show
Series [v2] ceph: defer stopping the mdsc delayed_work | expand

Commit Message

Xiubo Li July 24, 2023, 8:42 a.m. UTC
From: Xiubo Li <xiubli@redhat.com>

Flushing the dirty buffer may take a long time if the Rados is
overloaded or if there is network issue. So we should ping the
MDSs periodically to keep alive, else the MDS will blocklist
the kclient.

Cc: stable@vger.kernel.org
Cc: Venky Shankar <vshankar@redhat.com>
URL: https://tracker.ceph.com/issues/61843
Reviewed-by: Milind Changire <mchangir@redhat.com>
Signed-off-by: Xiubo Li <xiubli@redhat.com>
---

V2:
- Fixed a typo in commit comment, thanks Milind.


 fs/ceph/mds_client.c | 2 +-
 fs/ceph/mds_client.h | 3 ++-
 fs/ceph/super.c      | 7 ++++---
 3 files changed, 7 insertions(+), 5 deletions(-)

Comments

Ilya Dryomov July 24, 2023, 11:12 a.m. UTC | #1
On Mon, Jul 24, 2023 at 10:44 AM <xiubli@redhat.com> wrote:
>
> From: Xiubo Li <xiubli@redhat.com>
>
> Flushing the dirty buffer may take a long time if the Rados is
> overloaded or if there is network issue. So we should ping the
> MDSs periodically to keep alive, else the MDS will blocklist
> the kclient.
>
> Cc: stable@vger.kernel.org

Hi Xiubo,

The stable tag doesn't make sense here as this commit enhances commit
2789c08342f7 ("ceph: drop the messages from MDS when unmounting") which
isn't upstream.  It should probably just be folded there.

Thanks,

                Ilya
Xiubo Li July 24, 2023, 12:20 p.m. UTC | #2
On 7/24/23 19:12, Ilya Dryomov wrote:
> On Mon, Jul 24, 2023 at 10:44 AM <xiubli@redhat.com> wrote:
>> From: Xiubo Li <xiubli@redhat.com>
>>
>> Flushing the dirty buffer may take a long time if the Rados is
>> overloaded or if there is network issue. So we should ping the
>> MDSs periodically to keep alive, else the MDS will blocklist
>> the kclient.
>>
>> Cc: stable@vger.kernel.org
> Hi Xiubo,
>
> The stable tag doesn't make sense here as this commit enhances commit
> 2789c08342f7 ("ceph: drop the messages from MDS when unmounting") which
> isn't upstream.  It should probably just be folded there.

No, Ilya. This is not an enhancement for commit 2789c08342f7.

They are for different issues here. This patch just based on that. We 
can apply this first and then I can rebase the testing branch.

Thanks

- Xiubo


> Thanks,
>
>                  Ilya
>
Ilya Dryomov July 24, 2023, 12:34 p.m. UTC | #3
On Mon, Jul 24, 2023 at 2:20 PM Xiubo Li <xiubli@redhat.com> wrote:
>
>
> On 7/24/23 19:12, Ilya Dryomov wrote:
> > On Mon, Jul 24, 2023 at 10:44 AM <xiubli@redhat.com> wrote:
> >> From: Xiubo Li <xiubli@redhat.com>
> >>
> >> Flushing the dirty buffer may take a long time if the Rados is
> >> overloaded or if there is network issue. So we should ping the
> >> MDSs periodically to keep alive, else the MDS will blocklist
> >> the kclient.
> >>
> >> Cc: stable@vger.kernel.org
> > Hi Xiubo,
> >
> > The stable tag doesn't make sense here as this commit enhances commit
> > 2789c08342f7 ("ceph: drop the messages from MDS when unmounting") which
> > isn't upstream.  It should probably just be folded there.
>
> No, Ilya. This is not an enhancement for commit 2789c08342f7.
>
> They are for different issues here. This patch just based on that. We
> can apply this first and then I can rebase the testing branch.

Ah, thanks for letting me know.  Please go ahead and structure it that
way in the testing branch.  As it is, it conflicts heavily as the enum
that it adds a new member to was added in commit 2789c08342f7.

                Ilya
Xiubo Li July 24, 2023, 12:40 p.m. UTC | #4
On 7/24/23 20:34, Ilya Dryomov wrote:
> On Mon, Jul 24, 2023 at 2:20 PM Xiubo Li <xiubli@redhat.com> wrote:
>>
>> On 7/24/23 19:12, Ilya Dryomov wrote:
>>> On Mon, Jul 24, 2023 at 10:44 AM <xiubli@redhat.com> wrote:
>>>> From: Xiubo Li <xiubli@redhat.com>
>>>>
>>>> Flushing the dirty buffer may take a long time if the Rados is
>>>> overloaded or if there is network issue. So we should ping the
>>>> MDSs periodically to keep alive, else the MDS will blocklist
>>>> the kclient.
>>>>
>>>> Cc: stable@vger.kernel.org
>>> Hi Xiubo,
>>>
>>> The stable tag doesn't make sense here as this commit enhances commit
>>> 2789c08342f7 ("ceph: drop the messages from MDS when unmounting") which
>>> isn't upstream.  It should probably just be folded there.
>> No, Ilya. This is not an enhancement for commit 2789c08342f7.
>>
>> They are for different issues here. This patch just based on that. We
>> can apply this first and then I can rebase the testing branch.
> Ah, thanks for letting me know.  Please go ahead and structure it that
> way in the testing branch.  As it is, it conflicts heavily as the enum
> that it adds a new member to was added in commit 2789c08342f7.

Sure. Thanks


>
>                  Ilya
>
diff mbox series

Patch

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 65230ebefd51..70987b3c198a 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -5192,7 +5192,7 @@  static void delayed_work(struct work_struct *work)
 
 	doutc(mdsc->fsc->client, "mdsc delayed_work\n");
 
-	if (mdsc->stopping)
+	if (mdsc->stopping >= CEPH_MDSC_STOPPING_FLUSHED)
 		return;
 
 	mutex_lock(&mdsc->mutex);
diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h
index 5d02c8c582fd..befbd384428e 100644
--- a/fs/ceph/mds_client.h
+++ b/fs/ceph/mds_client.h
@@ -400,7 +400,8 @@  struct cap_wait {
 
 enum {
 	CEPH_MDSC_STOPPING_BEGIN = 1,
-	CEPH_MDSC_STOPPING_FLUSHED = 2,
+	CEPH_MDSC_STOPPING_FLUSHING = 2,
+	CEPH_MDSC_STOPPING_FLUSHED = 3,
 };
 
 /*
diff --git a/fs/ceph/super.c b/fs/ceph/super.c
index 0a370852cf02..49fd17fbba9f 100644
--- a/fs/ceph/super.c
+++ b/fs/ceph/super.c
@@ -1477,7 +1477,7 @@  static int ceph_init_fs_context(struct fs_context *fc)
 static bool __inc_stopping_blocker(struct ceph_mds_client *mdsc)
 {
 	spin_lock(&mdsc->stopping_lock);
-	if (mdsc->stopping >= CEPH_MDSC_STOPPING_FLUSHED) {
+	if (mdsc->stopping >= CEPH_MDSC_STOPPING_FLUSHING) {
 		spin_unlock(&mdsc->stopping_lock);
 		return false;
 	}
@@ -1490,7 +1490,7 @@  static void __dec_stopping_blocker(struct ceph_mds_client *mdsc)
 {
 	spin_lock(&mdsc->stopping_lock);
 	if (!atomic_dec_return(&mdsc->stopping_blockers) &&
-	    mdsc->stopping >= CEPH_MDSC_STOPPING_FLUSHED)
+	    mdsc->stopping >= CEPH_MDSC_STOPPING_FLUSHING)
 		complete_all(&mdsc->stopping_waiter);
 	spin_unlock(&mdsc->stopping_lock);
 }
@@ -1551,7 +1551,7 @@  static void ceph_kill_sb(struct super_block *s)
 	sync_filesystem(s);
 
 	spin_lock(&mdsc->stopping_lock);
-	mdsc->stopping = CEPH_MDSC_STOPPING_FLUSHED;
+	mdsc->stopping = CEPH_MDSC_STOPPING_FLUSHING;
 	wait = !!atomic_read(&mdsc->stopping_blockers);
 	spin_unlock(&mdsc->stopping_lock);
 
@@ -1565,6 +1565,7 @@  static void ceph_kill_sb(struct super_block *s)
 			pr_warn_client(cl, "umount was killed, %ld\n", timeleft);
 	}
 
+	mdsc->stopping = CEPH_MDSC_STOPPING_FLUSHED;
 	kill_anon_super(s);
 
 	fsc->client->extra_mon_dispatch = NULL;