diff mbox

libceph: init sent and completed when starting

Message ID 518E81BC.5020702@inktank.com (mailing list archive)
State New, archived
Headers show

Commit Message

Alex Elder May 11, 2013, 5:37 p.m. UTC
The rbd code has a need to be able to restart an osd request that
has already been started and completed once before.  This currently
wouldn't work right because the osd client code assumes an osd
request will be started exactly once  Certain fields in a request
are never cleared and this leads to trouble if you try to reuse it.

Specifically, the r_sent, r_got_reply, and r_completed fields are
never cleared.  The r_sent field records the osd incarnation at the
time the request was sent to that osd.  If that's non-zero, the
message won't get re-mapped to a target osd properly, and won't be
put on the unsafe requests list the first time it's sent as it
should.  The r_got_reply field is used in handle_reply() to ensure
the reply to a request is processed only once.  And the r_completed
field is used for lingering requests to avoid calling the callback
function every time the osd client re-sends the request on behalf of
its initiator.

Each osd request passes through ceph_osdc_start_request() when
responsibility for the request is handed over to the osd client for
completion.  We can safely zero these three fields there each time a
request gets started.

One last related change--clear the r_linger flag when a request
is no longer registered as a linger request.

This resolves:
    http://tracker.ceph.com/issues/5026

Signed-off-by: Alex Elder <elder@inktank.com>
---
 net/ceph/osd_client.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

 	mutex_unlock(&osdc->request_mutex);
@@ -2120,7 +2121,9 @@ int ceph_osdc_start_request(struct ceph_osd_client
*osdc,
 	down_read(&osdc->map_sem);
 	mutex_lock(&osdc->request_mutex);
 	__register_request(osdc, req);
-	WARN_ON(req->r_sent);
+	req->r_sent = 0;
+	req->r_got_reply = 0;
+	req->r_completed = 0;
 	rc = __map_request(osdc, req, 0);
 	if (rc < 0) {
 		if (nofail) {

Comments

Alex Elder May 11, 2013, 5:48 p.m. UTC | #1
On 05/11/2013 12:37 PM, Alex Elder wrote:
> The rbd code has a need to be able to restart an osd request that
> has already been started and completed once before.  This currently
> wouldn't work right because the osd client code assumes an osd
> request will be started exactly once  Certain fields in a request
> are never cleared and this leads to trouble if you try to reuse it.

This patch and the next two, and the two series of patches
that follow them, are available in the "review/wip-flatten"
branch of the ceph-glient git repository.

					-Alex


> Specifically, the r_sent, r_got_reply, and r_completed fields are
> never cleared.  The r_sent field records the osd incarnation at the
> time the request was sent to that osd.  If that's non-zero, the
> message won't get re-mapped to a target osd properly, and won't be
> put on the unsafe requests list the first time it's sent as it
> should.  The r_got_reply field is used in handle_reply() to ensure
> the reply to a request is processed only once.  And the r_completed
> field is used for lingering requests to avoid calling the callback
> function every time the osd client re-sends the request on behalf of
> its initiator.
> 
> Each osd request passes through ceph_osdc_start_request() when
> responsibility for the request is handed over to the osd client for
> completion.  We can safely zero these three fields there each time a
> request gets started.
> 
> One last related change--clear the r_linger flag when a request
> is no longer registered as a linger request.
> 
> This resolves:
>     http://tracker.ceph.com/issues/5026
> 
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>  net/ceph/osd_client.c |    5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index a3395fd..d5953b8 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -1204,6 +1204,7 @@ void ceph_osdc_unregister_linger_request(struct
> ceph_osd_client *osdc,
>  	mutex_lock(&osdc->request_mutex);
>  	if (req->r_linger) {
>  		__unregister_linger_request(osdc, req);
> +		req->r_linger = 0;
>  		ceph_osdc_put_request(req);
>  	}
>  	mutex_unlock(&osdc->request_mutex);
> @@ -2120,7 +2121,9 @@ int ceph_osdc_start_request(struct ceph_osd_client
> *osdc,
>  	down_read(&osdc->map_sem);
>  	mutex_lock(&osdc->request_mutex);
>  	__register_request(osdc, req);
> -	WARN_ON(req->r_sent);
> +	req->r_sent = 0;
> +	req->r_got_reply = 0;
> +	req->r_completed = 0;
>  	rc = __map_request(osdc, req, 0);
>  	if (rc < 0) {
>  		if (nofail) {
> 

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Josh Durgin May 11, 2013, 9:11 p.m. UTC | #2
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

On 05/11/2013 10:37 AM, Alex Elder wrote:
> The rbd code has a need to be able to restart an osd request that
> has already been started and completed once before.  This currently
> wouldn't work right because the osd client code assumes an osd
> request will be started exactly once  Certain fields in a request
> are never cleared and this leads to trouble if you try to reuse it.
>
> Specifically, the r_sent, r_got_reply, and r_completed fields are
> never cleared.  The r_sent field records the osd incarnation at the
> time the request was sent to that osd.  If that's non-zero, the
> message won't get re-mapped to a target osd properly, and won't be
> put on the unsafe requests list the first time it's sent as it
> should.  The r_got_reply field is used in handle_reply() to ensure
> the reply to a request is processed only once.  And the r_completed
> field is used for lingering requests to avoid calling the callback
> function every time the osd client re-sends the request on behalf of
> its initiator.
>
> Each osd request passes through ceph_osdc_start_request() when
> responsibility for the request is handed over to the osd client for
> completion.  We can safely zero these three fields there each time a
> request gets started.
>
> One last related change--clear the r_linger flag when a request
> is no longer registered as a linger request.
>
> This resolves:
>      http://tracker.ceph.com/issues/5026
>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>   net/ceph/osd_client.c |    5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index a3395fd..d5953b8 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -1204,6 +1204,7 @@ void ceph_osdc_unregister_linger_request(struct
> ceph_osd_client *osdc,
>   	mutex_lock(&osdc->request_mutex);
>   	if (req->r_linger) {
>   		__unregister_linger_request(osdc, req);
> +		req->r_linger = 0;
>   		ceph_osdc_put_request(req);
>   	}
>   	mutex_unlock(&osdc->request_mutex);
> @@ -2120,7 +2121,9 @@ int ceph_osdc_start_request(struct ceph_osd_client
> *osdc,
>   	down_read(&osdc->map_sem);
>   	mutex_lock(&osdc->request_mutex);
>   	__register_request(osdc, req);
> -	WARN_ON(req->r_sent);
> +	req->r_sent = 0;
> +	req->r_got_reply = 0;
> +	req->r_completed = 0;
>   	rc = __map_request(osdc, req, 0);
>   	if (rc < 0) {
>   		if (nofail) {
>

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index a3395fd..d5953b8 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -1204,6 +1204,7 @@  void ceph_osdc_unregister_linger_request(struct
ceph_osd_client *osdc,
 	mutex_lock(&osdc->request_mutex);
 	if (req->r_linger) {
 		__unregister_linger_request(osdc, req);
+		req->r_linger = 0;
 		ceph_osdc_put_request(req);
 	}