[v2] ceph: flush the mdlog for filesystem sync

Message ID 20220414054512.386293-1-xiubli@redhat.com (mailing list archive)
State New, archived

Commit Message

Xiubo Li April 14, 2022, 5:45 a.m. UTC
Before waiting for a request's safe reply, we will send the mdlog
flush request to the relevant MDS. This will also flush the mdlog
for all the other unsafe requests in the same session, so we can
record the last session and skip flushing the mdlog again in the
next loop iteration. There are still cases where the mdlog flush
request may be sent twice or more, but that should not happen often.

URL: https://tracker.ceph.com/issues/55284
Signed-off-by: Xiubo Li <xiubli@redhat.com>
---

V2:
- Fixed possible NULL pointer dereference for the req->r_session


 fs/ceph/mds_client.c | 11 +++++++++++
 1 file changed, 11 insertions(+)
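
The "record the last session" idea from the commit message can be illustrated outside the kernel. The following is only a userspace sketch under stated assumptions: `struct session`, `struct request`, `flush_mdlog()` and `flush_for_requests()` are hypothetical stand-ins, not the real ceph structures or helpers, and the session reference counting done in the patch via ceph_get_mds_session()/ceph_put_mds_session() is deliberately left out.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real ceph structs. */
struct session {
	int mds;      /* MDS rank this session talks to */
	int flushes;  /* how many flush requests were sent on it */
};

struct request {
	struct session *session;  /* may be NULL, like req->r_session */
};

/* Stand-in for send_flush_mdlog(): only counts the flushes. */
static void flush_mdlog(struct session *s)
{
	s->flushes++;
}

/*
 * Walk the requests in order and flush only when the session differs
 * from the one flushed most recently. Returns the number of flushes sent.
 */
static int flush_for_requests(struct request *reqs, size_t n)
{
	struct session *last = NULL;
	int sent = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		struct session *s = reqs[i].session;

		/* Skip requests without a session and repeats of the last one. */
		if (s && s != last) {
			flush_mdlog(s);
			last = s;
			sent++;
		}
	}
	return sent;
}
```

Because only the most recent session is remembered, a request order such as A, A, B, A sends two flushes to session A. That is the "twice or more" case the commit message accepts as uncommon.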

Comments

Jeff Layton April 18, 2022, 10:25 a.m. UTC | #1
On Thu, 2022-04-14 at 13:45 +0800, Xiubo Li wrote:
> Before waiting for a request's safe reply, we will send the mdlog
> flush request to the relevant MDS. And this will also flush the
> mdlog for all the other unsafe requests in the same session, so
> we can record the last session and no need to flush mdlog again
> in the next loop. But there still have cases that it may send the
> mdlog flush requst twice or more, but that should be not often.
> 
> URL: https://tracker.ceph.com/issues/55284
> Signed-off-by: Xiubo Li <xiubli@redhat.com>
> ---
> 
> V2:
> - Fixed possible NULL pointer dereference for the req->r_session
> 
> 
>  fs/ceph/mds_client.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index 0da85c9ce73a..4aaa7b14136e 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -5098,6 +5098,7 @@ void ceph_mdsc_pre_umount(struct ceph_mds_client *mdsc)
>  static void wait_unsafe_requests(struct ceph_mds_client *mdsc, u64 want_tid)
>  {
>  	struct ceph_mds_request *req = NULL, *nextreq;
> +	struct ceph_mds_session *last_session = NULL, *s;
>  	struct rb_node *n;
>  
>  	mutex_lock(&mdsc->mutex);
> @@ -5117,6 +5118,15 @@ static void wait_unsafe_requests(struct ceph_mds_client *mdsc, u64 want_tid)
>  			ceph_mdsc_get_request(req);
>  			if (nextreq)
>  				ceph_mdsc_get_request(nextreq);
> +
> +			/* send flush mdlog request to MDS */
> +			s = req->r_session;
> +			if (s && last_session != s) {
> +				send_flush_mdlog(s);
> +				ceph_put_mds_session(last_session);
> +				last_session = ceph_get_mds_session(s);
> +			}
> +
>  			mutex_unlock(&mdsc->mutex);
>  			dout("wait_unsafe_requests  wait on %llu (want %llu)\n",
>  			     req->r_tid, want_tid);
> @@ -5135,6 +5145,7 @@ static void wait_unsafe_requests(struct ceph_mds_client *mdsc, u64 want_tid)
>  		req = nextreq;
>  	}
>  	mutex_unlock(&mdsc->mutex);
> +	ceph_put_mds_session(last_session);
>  	dout("wait_unsafe_requests done\n");
>  }
>  

Looks reasonable. My only minor nit is that "wait_unsafe_requests" is
not really descriptive of this function anymore since you're not just
waiting on requests anymore, but also sending mdlog flush requests.

The sync handling in this code is a bit of a mess too. We have
unsafe_request_wait which is called from the fsync codepath, and then we
also have wait_unsafe_requests which is called from ceph_sync_fs. I
suspect they do enough of the same things that those could be combined.

So, I'll give my ACK on this, but wouldn't mind seeing some other
cleanup in this area.

Acked-by: Jeff Layton <jlayton@kernel.org>

Xiubo Li April 18, 2022, 10:41 a.m. UTC | #2
On 4/18/22 6:25 PM, Jeff Layton wrote:
> On Thu, 2022-04-14 at 13:45 +0800, Xiubo Li wrote:
>> Before waiting for a request's safe reply, we will send the mdlog
>> flush request to the relevant MDS. And this will also flush the
>> mdlog for all the other unsafe requests in the same session, so
>> we can record the last session and no need to flush mdlog again
>> in the next loop. But there still have cases that it may send the
>> mdlog flush requst twice or more, but that should be not often.
>>
>> URL: https://tracker.ceph.com/issues/55284
>> Signed-off-by: Xiubo Li <xiubli@redhat.com>
>> ---
>>
>> V2:
>> - Fixed possible NULL pointer dereference for the req->r_session
>>
>>
>>   fs/ceph/mds_client.c | 11 +++++++++++
>>   1 file changed, 11 insertions(+)
>>
>> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
>> index 0da85c9ce73a..4aaa7b14136e 100644
>> --- a/fs/ceph/mds_client.c
>> +++ b/fs/ceph/mds_client.c
>> @@ -5098,6 +5098,7 @@ void ceph_mdsc_pre_umount(struct ceph_mds_client *mdsc)
>>   static void wait_unsafe_requests(struct ceph_mds_client *mdsc, u64 want_tid)
>>   {
>>   	struct ceph_mds_request *req = NULL, *nextreq;
>> +	struct ceph_mds_session *last_session = NULL, *s;
>>   	struct rb_node *n;
>>   
>>   	mutex_lock(&mdsc->mutex);
>> @@ -5117,6 +5118,15 @@ static void wait_unsafe_requests(struct ceph_mds_client *mdsc, u64 want_tid)
>>   			ceph_mdsc_get_request(req);
>>   			if (nextreq)
>>   				ceph_mdsc_get_request(nextreq);
>> +
>> +			/* send flush mdlog request to MDS */
>> +			s = req->r_session;
>> +			if (s && last_session != s) {
>> +				send_flush_mdlog(s);
>> +				ceph_put_mds_session(last_session);
>> +				last_session = ceph_get_mds_session(s);
>> +			}
>> +
>>   			mutex_unlock(&mdsc->mutex);
>>   			dout("wait_unsafe_requests  wait on %llu (want %llu)\n",
>>   			     req->r_tid, want_tid);
>> @@ -5135,6 +5145,7 @@ static void wait_unsafe_requests(struct ceph_mds_client *mdsc, u64 want_tid)
>>   		req = nextreq;
>>   	}
>>   	mutex_unlock(&mdsc->mutex);
>> +	ceph_put_mds_session(last_session);
>>   	dout("wait_unsafe_requests done\n");
>>   }
>>   
> Looks reasonable. My only minor nit is that "wait_unsafe_requests" is
> not really descriptive of this function anymore since you're not just
> waiting on requests anymore, but also sending mdlog flush requests.
>
> The sync handling in this code is a bit of a mess too. We have
> unsafe_request_wait which is called from the fsync codepath, and then we
> also have wait_unsafe_requests which is called from ceph_sync_fs. I
> suspect they do enough of the same things that those could be combined.

I tried, and it was hard to combine them IMO.

The fsync() path first iterates "ci->i_unsafe_iops" and
"ci->i_unsafe_dirops" to collect all the possible sessions, and then
sends a flush mdlog request to each of them.

ceph_sync_fs(), by contrast, needs to iterate the global
"mdsc->request_tree" instead.

-- Xiubo
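
For contrast with the sync path, the collect-then-flush shape described for fsync() might look roughly like the userspace sketch below. Everything here is an assumption made for illustration: the types, `MAX_MDS`, `collect_sessions()` and `flush_for_inode()` are hypothetical names, the two arrays stand in for the per-inode lists, and indexing the sessions by MDS rank is just one simple way to deduplicate them.

```c
#include <stddef.h>

#define MAX_MDS 8  /* hypothetical upper bound on MDS ranks */

/* Hypothetical stand-ins for the kernel types; not the real ceph structs. */
struct session {
	int mds;      /* MDS rank this session talks to */
	int flushes;  /* how many flush requests were sent on it */
};

struct request {
	struct session *session;  /* may be NULL */
};

/* Stand-in for send_flush_mdlog(): only counts the flushes. */
static void flush_mdlog(struct session *s)
{
	s->flushes++;
}

/* Record every distinct session seen in one list, indexed by MDS rank. */
static void collect_sessions(const struct request *reqs, size_t n,
			     struct session *seen[MAX_MDS])
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct session *s = reqs[i].session;

		if (s && s->mds >= 0 && s->mds < MAX_MDS)
			seen[s->mds] = s;
	}
}

/*
 * Gather the sessions from both per-inode lists first, then flush each
 * one exactly once. Returns the number of flushes sent.
 */
static int flush_for_inode(const struct request *iops, size_t n_iops,
			   const struct request *dirops, size_t n_dirops)
{
	struct session *seen[MAX_MDS] = { NULL };
	int sent = 0;
	int i;

	collect_sessions(iops, n_iops, seen);
	collect_sessions(dirops, n_dirops, seen);

	for (i = 0; i < MAX_MDS; i++) {
		if (seen[i]) {
			flush_mdlog(seen[i]);
			sent++;
		}
	}
	return sent;
}
```

Collecting first means no session is flushed twice, at the cost of a second pass and a per-call table. The sync path instead flushes while it walks one global tree and remembers only the last session, which is part of why the two are hard to merge.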


> So, I'll give my ACK on this, but wouldn't mind seeing some other
> cleanup in this area.
>
> Acked-by: Jeff Layton <jlayton@kernel.org>
>
Patch

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 0da85c9ce73a..4aaa7b14136e 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -5098,6 +5098,7 @@  void ceph_mdsc_pre_umount(struct ceph_mds_client *mdsc)
 static void wait_unsafe_requests(struct ceph_mds_client *mdsc, u64 want_tid)
 {
 	struct ceph_mds_request *req = NULL, *nextreq;
+	struct ceph_mds_session *last_session = NULL, *s;
 	struct rb_node *n;
 
 	mutex_lock(&mdsc->mutex);
@@ -5117,6 +5118,15 @@  static void wait_unsafe_requests(struct ceph_mds_client *mdsc, u64 want_tid)
 			ceph_mdsc_get_request(req);
 			if (nextreq)
 				ceph_mdsc_get_request(nextreq);
+
+			/* send flush mdlog request to MDS */
+			s = req->r_session;
+			if (s && last_session != s) {
+				send_flush_mdlog(s);
+				ceph_put_mds_session(last_session);
+				last_session = ceph_get_mds_session(s);
+			}
+
 			mutex_unlock(&mdsc->mutex);
 			dout("wait_unsafe_requests  wait on %llu (want %llu)\n",
 			     req->r_tid, want_tid);
@@ -5135,6 +5145,7 @@  static void wait_unsafe_requests(struct ceph_mds_client *mdsc, u64 want_tid)
 		req = nextreq;
 	}
 	mutex_unlock(&mdsc->mutex);
+	ceph_put_mds_session(last_session);
 	dout("wait_unsafe_requests done\n");
 }