Message ID | 20171204001726.5747-1-trond.myklebust@primarydata.com (mailing list archive)
---|---
State | New, archived
Hi Trond-

> On Dec 3, 2017, at 7:17 PM, Trond Myklebust <trond.myklebust@primarydata.com> wrote:
> 
> We must ensure that the call to rpc_sleep_on() in xprt_transmit() cannot
> race with the call to xprt_complete_rqst().
> 
> Reported-by: Chuck Lever <chuck.lever@oracle.com>
> Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=317
> Fixes: ce7c252a8c74 ("SUNRPC: Add a separate spinlock to protect..")
> Cc: stable@vger.kernel.org # 4.14+
> Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
> ---
> net/sunrpc/xprt.c | 26 +++++++++++++++++---------
> 1 file changed, 17 insertions(+), 9 deletions(-)
> 
> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> index 333b9d697ae5..5e4278e9ce37 100644
> --- a/net/sunrpc/xprt.c
> +++ b/net/sunrpc/xprt.c
> @@ -1024,6 +1024,7 @@ void xprt_transmit(struct rpc_task *task)
> 	} else if (!req->rq_bytes_sent)
> 		return;
> 
> +	req->rq_connect_cookie = xprt->connect_cookie;
> 	req->rq_xtime = ktime_get();
> 	status = xprt->ops->send_request(task);
> 	trace_xprt_transmit(xprt, req->rq_xid, status);
> @@ -1047,20 +1048,27 @@ void xprt_transmit(struct rpc_task *task)
> 	xprt->stat.bklog_u += xprt->backlog.qlen;
> 	xprt->stat.sending_u += xprt->sending.qlen;
> 	xprt->stat.pending_u += xprt->pending.qlen;
> +	spin_unlock_bh(&xprt->transport_lock);
> 
> -	/* Don't race with disconnect */
> -	if (!xprt_connected(xprt))
> -		task->tk_status = -ENOTCONN;
> -	else {
> +	if (rpc_reply_expected(task) && !READ_ONCE(req->rq_reply_bytes_recvd)) {
> 		/*
> -		 * Sleep on the pending queue since
> -		 * we're expecting a reply.
> +		 * Sleep on the pending queue if we're expecting a reply.
> +		 * The spinlock ensures atomicity between the test of
> +		 * req->rq_reply_bytes_recvd, and the call to rpc_sleep_on().
> 		 */
> -		if (!req->rq_reply_bytes_recvd && rpc_reply_expected(task))
> +		spin_lock(&xprt->recv_lock);
> +		if (!req->rq_reply_bytes_recvd) {
> 			rpc_sleep_on(&xprt->pending, task, xprt_timer);
> -		req->rq_connect_cookie = xprt->connect_cookie;
> +			/*
> +			 * Send an extra queue wakeup call if the
> +			 * connection was dropped in case the call to
> +			 * rpc_sleep_on() raced.
> +			 */
> +			if (!xprt_connected(xprt))
> +				xprt_wake_pending_tasks(xprt, -ENOTCONN);
> +		}
> +		spin_unlock(&xprt->recv_lock);
> 	}
> -	spin_unlock_bh(&xprt->transport_lock);
> }
> 
> static void xprt_add_backlog(struct rpc_xprt *xprt, struct rpc_task *task)
> -- 
> 2.14.3

I've run into a problem with this version of the patch (on v4.15-rc3).

With dbench on NFSv4.0 (on RDMA) I no longer see constant throughput
and a temporarily climbing latency followed by a recovery, but this
can happen on occasion:

releasing clients
  16       584   322.55 MB/sec  warmup   1 sec  latency 807.121 ms
  16       584   161.28 MB/sec  warmup   2 sec  latency 1807.175 ms
  16       584   107.52 MB/sec  warmup   3 sec  latency 2807.198 ms
  16       584    80.64 MB/sec  warmup   4 sec  latency 3807.217 ms
  16       584    64.51 MB/sec  warmup   5 sec  latency 4807.235 ms
  16       584    53.76 MB/sec  warmup   6 sec  latency 5807.251 ms
  16       584    46.08 MB/sec  warmup   7 sec  latency 6807.269 ms
  16       584    40.32 MB/sec  warmup   8 sec  latency 7807.285 ms
  16       584    35.84 MB/sec  warmup   9 sec  latency 8807.303 ms
  16       584    32.26 MB/sec  warmup  10 sec  latency 9807.318 ms
  16       584    29.32 MB/sec  warmup  11 sec  latency 10807.334 ms
  16       584    26.88 MB/sec  warmup  12 sec  latency 11807.355 ms
  16       584    24.81 MB/sec  warmup  13 sec  latency 12807.372 ms
  16       584    23.04 MB/sec  warmup  14 sec  latency 13807.387 ms
  16       584    21.50 MB/sec  warmup  15 sec  latency 14807.406 ms
  16       584    20.16 MB/sec  warmup  16 sec  latency 15807.423 ms

No recovery. The latency number keeps climbing, and throughput drops.

"kernel: nfs: server klimt-ib not responding, still trying"
appears in the client's /var/log/messages.

Without this patch applied, the NFSv4.0 behavior is the same as I
reported with NFSv3: every once in a while, one RPC completion is
lost, but when the timer fires, the client notices the reply
actually did arrive and the RPC completes normally.

I'm looking into it.

-- 
Chuck Lever
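[Illustration: the recovery described above relies on the check made
when the RPC retransmit timer fires: if reply bytes arrived in the
meantime, the request completes instead of timing out. Below is a
minimal user-space C model of that decision, with stand-in types; it
is a sketch of the logic, not the kernel code.]

#include <errno.h>
#include <stdio.h>

/* Stand-ins for the kernel structures; only the fields that the
 * timer check reads are modeled here. */
struct rpc_rqst { unsigned int rq_reply_bytes_recvd; };
struct rpc_task { int tk_status; struct rpc_rqst *tk_rqstp; };

/* Models the decision in xprt_timer(): a "lost" wakeup is harmless
 * as long as the reply has actually arrived by the time the timer
 * fires; the task picks the reply up and completes normally. */
static void timer_fires(struct rpc_task *task)
{
	if (task->tk_rqstp->rq_reply_bytes_recvd)
		task->tk_status = 0;		/* reply is here; proceed */
	else
		task->tk_status = -ETIMEDOUT;	/* genuinely no reply */
}

int main(void)
{
	struct rpc_rqst req = { .rq_reply_bytes_recvd = 120 };
	struct rpc_task task = { .tk_rqstp = &req };

	timer_fires(&task);
	printf("tk_status after timer: %d\n", task.tk_status);
	return 0;
}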
> On Dec 13, 2017, at 11:14 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
> Hi Trond-
> 
>> On Dec 3, 2017, at 7:17 PM, Trond Myklebust <trond.myklebust@primarydata.com> wrote:
>> 
>> We must ensure that the call to rpc_sleep_on() in xprt_transmit() cannot
>> race with the call to xprt_complete_rqst().
>> 
>> [...]
> 
> I've run into a problem with this version of the patch (on v4.15-rc3).
> 
> [...]
> 
> I'm looking into it.

Root cause:

The transport is temporarily out of resources and returns -ENOBUFS
from ->send_request.

The FSM schedules a delay then calls call_transmit again, which
invokes xprt_prepare_transmit.

On NFSv3 mounts, xprt_prepare_transmit tries to acquire the
transport write lock.

On NFSv4 mounts, RPC_TASK_NO_RETRANS_TIMEOUT is set. This causes
xprt_prepare_transmit to see that the transport is connected and
rq_connect_cookie is the same as xprt->connect_cookie, so it
queues the task on ->pending and returns false.

The transport never sends the Call, but still waits for a Reply,
forever. This is not a bug in your patch, it is a pre-existing
issue.

There's a simple fix, which is to set

    rq_connect_cookie = xprt->connect_cookie - 1;

if RPC-over-RDMA header marshaling fails. This is a one-line fix,
appropriate for -rc and backport to stable (to at least v4.8).

A more robust fix might be to have ->send_request return -EAGAIN
instead, and let xprtrdma call xprt_write_space when the resource
exhaustion is relieved. That might be more involved, and thus is
future-facing.

I'm investigating both fixes.

-- 
Chuck Lever
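[Illustration: the deadlock condition in the root-cause analysis
above condenses to one test. This user-space model mirrors the
early-exit logic of xprt_prepare_transmit() for tasks with
RPC_TASK_NO_RETRANS_TIMEOUT; the types and locking are stubbed out,
so treat it as a sketch of the decision, not the actual function.]

#include <stdbool.h>
#include <stdio.h>

struct rpc_xprt {
	unsigned long	connect_cookie;
	bool		connected;
};

struct rpc_rqst {
	unsigned long	rq_connect_cookie;
	unsigned int	rq_reply_bytes_recvd;
};

/* If the request looks like it was already sent on the *current*
 * connection, do not retransmit; just wait on ->pending for a reply. */
static bool prepare_transmit(struct rpc_xprt *xprt, struct rpc_rqst *req)
{
	if (!req->rq_reply_bytes_recvd &&
	    xprt->connected &&
	    req->rq_connect_cookie == xprt->connect_cookie)
		return false;	/* park the task on ->pending */
	return true;		/* go ahead and (re)transmit */
}

int main(void)
{
	struct rpc_xprt xprt = { .connect_cookie = 7, .connected = true };
	/* The patch stamps rq_connect_cookie *before* ->send_request().
	 * If the send then fails with -ENOBUFS, the retry finds a
	 * matching cookie and parks forever, exactly as described: */
	struct rpc_rqst req = { .rq_connect_cookie = 7 };

	printf("retransmit allowed: %d\n", prepare_transmit(&xprt, &req));
	return 0;
}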
> On Dec 13, 2017, at 6:42 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
>> On Dec 13, 2017, at 11:14 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
>> 
>> [...]
>> 
>> I'm looking into it.
> 
> Root cause:
> 
> [...]
> 
> The transport never sends the Call, but still waits for a Reply,
> forever. This is not a bug in your patch, it is a pre-existing
> issue.

Sorry for one more reply...

I think this last statement is incorrect. It is a bug in your
patch after all.

I forgot that your patch moves the line that sets

    req->rq_connect_cookie = xprt->connect_cookie;

to before the call to ->send_request. But rq_connect_cookie
must be set iff ->send_request is successful. Moving this
line seems to fix the issue:

1027         req->rq_xtime = ktime_get();
1028         status = xprt->ops->send_request(task);
1029         trace_xprt_transmit(xprt, req->rq_xid, status);
1030         if (status != 0) {
1031                 task->tk_status = status;
1032                 return;
1033         }
1034         req->rq_connect_cookie = xprt->connect_cookie;   <<<<
1035         xprt_inject_disconnect(xprt);

> There's a simple fix, which is to set
> 
>     rq_connect_cookie = xprt->connect_cookie - 1;
> 
> if RPC-over-RDMA header marshaling fails. This is a one-line fix,
> appropriate for -rc and backport to stable (to at least v4.8).

This fix does exactly the same thing.

> A more robust fix might be to have ->send_request return -EAGAIN
> instead, and let xprtrdma call xprt_write_space when the resource
> exhaustion is relieved. That might be more involved, and thus is
> future-facing.
> 
> I'm investigating both fixes.

-- 
Chuck Lever
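[Illustration: both fixes floated above converge on the same
observable state: after a failed transmit, the request's cookie must
not match the transport's current cookie. A few lines of C make the
equivalence concrete; variable names only echo the kernel fields,
and the values are arbitrary.]

#include <stdio.h>

int main(void)
{
	unsigned long connect_cookie = 7;	/* transport's current cookie */

	/* Fix 1: assign rq_connect_cookie only after a successful
	 * ->send_request(), so after a failure it keeps a stale value
	 * (here, one from an earlier connection). */
	unsigned long cookie_fix1 = 6;

	/* Fix 2: on a marshaling failure, force a mismatch outright. */
	unsigned long cookie_fix2 = connect_cookie - 1;

	/* Either way, the "already sent on this connection?" test that
	 * xprt_prepare_transmit() applies now evaluates false, so the
	 * request is retransmitted instead of parked: */
	printf("fix1 looks sent: %d, fix2 looks sent: %d\n",
	       cookie_fix1 == connect_cookie, cookie_fix2 == connect_cookie);
	return 0;
}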
On Wed, 2017-12-13 at 20:13 -0500, Chuck Lever wrote:
> > On Dec 13, 2017, at 6:42 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> > 
> > [...]
> 
> I forgot that your patch moves the line that sets
> 
>     req->rq_connect_cookie = xprt->connect_cookie;
> 
> to before the call to ->send_request. But rq_connect_cookie
> must be set iff ->send_request is successful. Moving this
> line seems to fix the issue:
> 
> [...]

Let's rather apply the attached fix (incremental to the original patch
and intended to be applied using 'git rebase --autosquash'). The reason
for wanting to set the connect cookie before we call send_request() is
to avoid races with disconnects that happen immediately after we sent
the data (but before we get round to setting the cookie).

8<--------------------------------------------------------
From 326c12727b179d5c2fd42f374a4c0e2d56ee90c5 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust@primarydata.com>
Date: Thu, 14 Dec 2017 07:05:27 -0500
Subject: [PATCH] fixup! SUNRPC: Fix a race in the receive code path

---
 net/sunrpc/xprt.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 5e4278e9ce37..2cf4cf21a878 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -1021,10 +1021,11 @@ void xprt_transmit(struct rpc_task *task)
 			/* Turn off autodisconnect */
 			del_singleshot_timer_sync(&xprt->timer);
 		}
+		if (!req->rq_bytes_sent)
+			req->rq_connect_cookie = xprt->connect_cookie;
 	} else if (!req->rq_bytes_sent)
 		return;
 
-	req->rq_connect_cookie = xprt->connect_cookie;
 	req->rq_xtime = ktime_get();
 	status = xprt->ops->send_request(task);
 	trace_xprt_transmit(xprt, req->rq_xid, status);
-- 
2.14.3

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com
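[Illustration: the race Trond wants to close is an ordering window.
If the cookie is assigned only after ->send_request() returns, a
disconnect/reconnect can slip into that window and the request then
records the wrong connection. A timeline in C; the cookie values are
arbitrary and the types are stand-ins.]

#include <stdio.h>

int main(void)
{
	unsigned long connect_cookie = 7;	/* transport: connection #7 */
	unsigned long rq_connect_cookie = 0;

	/* 1. ->send_request() succeeds on connection #7. */
	/* 2. The connection drops and is re-established; the connect
	 *    path bumps the transport cookie. */
	connect_cookie = 8;
	/* 3. Only now does the sender reach the assignment. */
	rq_connect_cookie = connect_cookie;

	/* The request claims it was sent on connection #8, although the
	 * bytes went out on #7, so a lost reply would never trigger a
	 * retransmit. Stamping the cookie before ->send_request(), as
	 * the fixup does for the !rq_bytes_sent case, closes the window. */
	printf("recorded %lu, actually sent on 7\n", rq_connect_cookie);
	return 0;
}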
> On Dec 14, 2017, at 7:16 AM, Trond Myklebust <trondmy@primarydata.com> wrote:
> 
> On Wed, 2017-12-13 at 20:13 -0500, Chuck Lever wrote:
>> [...]
>> 
>> I forgot that your patch moves the line that sets
>> 
>>     req->rq_connect_cookie = xprt->connect_cookie;
>> 
>> to before the call to ->send_request. But rq_connect_cookie
>> must be set iff ->send_request is successful.
> 
> Let's rather apply the attached fix (incremental to the original patch
> and intended to be applied using 'git rebase --autosquash'). The reason
> for wanting to set the connect cookie before we call send_request() is
> to avoid races with disconnects that happen immediately after we sent
> the data (but before we get round to setting the cookie).
> 
> [...]

Tested with a client that returns -ENOBUFS from ->send_request
with a little extra frequency.

This doesn't appear to address the problem. The transport
deadlocks as before when send_request() fails to send the RPC.

I imagine this issue could also impact write buffer exhaustion
on TCP.

-- 
Chuck Lever
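[Illustration: the reproduction technique mentioned above, "a client
that returns -ENOBUFS with a little extra frequency", amounts to a
fault-injecting wrapper around the transmit path. This is a
hypothetical user-space sketch of the idea; the actual test
presumably modified xprtrdma's ->send_request directly.]

#include <errno.h>
#include <stdio.h>

static int real_send_request(void)
{
	return 0;			/* pretend the send worked */
}

/* Hypothetical wrapper: fail every eighth transmit with -ENOBUFS to
 * exercise the retry path often enough to hit the deadlock. */
static int flaky_send_request(void)
{
	static unsigned int calls;

	if (++calls % 8 == 0)
		return -ENOBUFS;
	return real_send_request();
}

int main(void)
{
	for (int i = 1; i <= 10; i++)
		printf("send %2d -> %d\n", i, flaky_send_request());
	return 0;
}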
On Thu, 2017-12-14 at 10:49 -0500, Chuck Lever wrote:
> [...]
> 
> Tested with a client that returns -ENOBUFS from ->send_request
> with a little extra frequency.
> 
> This doesn't appear to address the problem. The transport
> deadlocks as before when send_request() fails to send the RPC.

Does the RDMA code update the connect cookie when the connection
breaks? It looks to me as if it only does that when the connection is
re-established. We really want both.

> I imagine this issue could also impact write buffer exhaustion
> on TCP.

See net/sunrpc/xprtsock.c:xs_tcp_state_change()

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com
> On Dec 14, 2017, at 2:03 PM, Trond Myklebust <trondmy@primarydata.com> wrote:
> 
> On Thu, 2017-12-14 at 10:49 -0500, Chuck Lever wrote:
>> [...]
> 
> Does the RDMA code update the connect cookie when the connection
> breaks? It looks to me as if it only does that when the connection is
> re-established. We really want both.
> 
>> I imagine this issue could also impact write buffer exhaustion
>> on TCP.
> 
> See net/sunrpc/xprtsock.c:xs_tcp_state_change()

xprtrdma manipulates the connect_cookie in its connect worker,
see rpcrdma_connect_worker. This was added by:

commit 575448bd36208f99fe0dd554a43518d798966740
Author:     Tom Talpey <talpey@netapp.com>
AuthorDate: Thu Oct 9 15:00:40 2008 -0400
Commit:     Trond Myklebust <Trond.Myklebust@netapp.com>
CommitDate: Fri Oct 10 15:10:36 2008 -0400

    RPC/RDMA: suppress retransmit on RPC/RDMA clients.

Would it be more correct to bump the cookie in rpcrdma_conn_upcall,
which is the equivalent to xs_tcp_state_change? (if so, why, so
I can compose a reasonable patch description)

It could be bumped in the RDMA_CM_EVENT_ESTABLISHED and the
RDMA_CM_EVENT_DISCONNECTED cases, for example. I'm not sure
RDMA provides a distinction between "server disconnected"
and "client disconnected" although that probably does not
matter for this purpose.

But, why would the additional cookie update help? The transport
is not disconnecting before the deadlock.

-- 
Chuck Lever
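[Illustration: what Chuck proposes would make the RDMA connection
event handler bump the cookie on both edges of the connection,
mirroring xs_tcp_state_change(). Below is a stubbed, user-space
sketch of that hypothetical change, not a tested kernel patch; the
stand-in event codes merely echo the RDMA_CM_EVENT_* names.]

#include <stdio.h>

enum cm_event {				/* stand-ins for rdma_cm events */
	EVENT_ESTABLISHED,
	EVENT_DISCONNECTED,
};

struct rpc_xprt { unsigned long connect_cookie; };

/* Hypothetical: bump the cookie when the connection comes up *and*
 * when it goes down, instead of only in the connect worker, so a
 * request stamped before a disconnect can never look "sent on the
 * current connection". */
static void conn_upcall(struct rpc_xprt *xprt, enum cm_event event)
{
	switch (event) {
	case EVENT_ESTABLISHED:
	case EVENT_DISCONNECTED:
		xprt->connect_cookie++;
		break;
	}
}

int main(void)
{
	struct rpc_xprt xprt = { .connect_cookie = 0 };

	conn_upcall(&xprt, EVENT_ESTABLISHED);
	conn_upcall(&xprt, EVENT_DISCONNECTED);
	printf("cookie after up+down: %lu\n", xprt.connect_cookie);
	return 0;
}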
On Thu, 2017-12-14 at 14:22 -0500, Chuck Lever wrote:
> [...]
> 
> Would it be more correct to bump the cookie in rpcrdma_conn_upcall,
> which is the equivalent to xs_tcp_state_change? (if so, why, so
> I can compose a reasonable patch description)
> 
> [...]
> 
> But, why would the additional cookie update help? The transport
> is not disconnecting before the deadlock.

The connection cookie's purpose is twofold:

1) It tracks whether or not a request has been transmitted on the
current connection or not.

2) It ensures that when several requests with the same connection
cookie all call xprt_conditional_disconnect(), then that results in a
single disconnection event. To do so, it assumes that xprt_autoclose()
will change the cookie if the disconnection attempt is successful.

In TCP we do so in the xs_tcp_state_change(). If the RDMA transport can
guarantee that the call to xprt->ops->close(xprt) is always successful,
then you could do so there.

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com
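[Illustration: Trond's second point, many requests but one
disconnection event, falls out of the cookie comparison in
xprt_conditional_disconnect(). Here is a user-space model with the
locking stubbed away; the real function runs under the
transport_lock and queues the actual autoclose work.]

#include <stdio.h>

struct rpc_xprt {
	unsigned long	connect_cookie;
	int		disconnect_events;
};

/* Only a request stamped with the *current* cookie may trigger a
 * disconnect; the disconnect path is assumed to change the cookie,
 * so later callers holding the old cookie become no-ops. */
static void conditional_disconnect(struct rpc_xprt *xprt,
				   unsigned long cookie)
{
	if (cookie != xprt->connect_cookie)
		return;			/* connection already torn down */
	xprt->disconnect_events++;
	xprt->connect_cookie++;		/* models autoclose changing it */
}

int main(void)
{
	struct rpc_xprt xprt = { .connect_cookie = 7 };

	/* Three requests sent on connection #7 all give up at once: */
	conditional_disconnect(&xprt, 7);
	conditional_disconnect(&xprt, 7);
	conditional_disconnect(&xprt, 7);
	printf("disconnect events: %d\n", xprt.disconnect_events);
	return 0;
}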
> On Dec 14, 2017, at 3:37 PM, Trond Myklebust <trondmy@primarydata.com> wrote:
>
> On Thu, 2017-12-14 at 14:22 -0500, Chuck Lever wrote:
>>> On Dec 14, 2017, at 2:03 PM, Trond Myklebust <trondmy@primarydata.com> wrote:
>>>
>>> Does the RDMA code update the connect cookie when the connection
>>> breaks? It looks to me as if it only does that when the connection is
>>> re-established. We really want both.
>>>
>>>> I imagine this issue could also impact write buffer exhaustion
>>>> on TCP.
>>>
>>> See net/sunrpc/xprtsock.c:xs_tcp_state_change()
>>
>> xprtrdma manipulates the connect_cookie in its connect worker,
>> see rpcrdma_connect_worker. This was added by:
>>
>> commit 575448bd36208f99fe0dd554a43518d798966740
>> Author:     Tom Talpey <talpey@netapp.com>
>> AuthorDate: Thu Oct 9 15:00:40 2008 -0400
>> Commit:     Trond Myklebust <Trond.Myklebust@netapp.com>
>> CommitDate: Fri Oct 10 15:10:36 2008 -0400
>>
>>     RPC/RDMA: suppress retransmit on RPC/RDMA clients.
>>
>> Would it be more correct to bump the cookie in rpcrdma_conn_upcall,
>> which is the equivalent of xs_tcp_state_change? (if so, why, so
>> I can compose a reasonable patch description)
>>
>> It could be bumped in the RDMA_CM_EVENT_ESTABLISHED and the
>> RDMA_CM_EVENT_DISCONNECTED cases, for example. I'm not sure
>> RDMA provides a distinction between "server disconnected"
>> and "client disconnected", although that probably does not
>> matter for this purpose.
>>
>> But why would the additional cookie update help? The transport
>> is not disconnecting before the deadlock.
>
> The connection cookie's purpose is twofold:
>
> 1) It tracks whether or not a request has been transmitted on the
> current connection.

That's broken by setting the cookie unconditionally outside
the transport_lock, isn't it?

> 2) It ensures that when several requests with the same connection
> cookie all call xprt_conditional_disconnect(), that results in a
> single disconnection event. To do so, it assumes that xprt_autoclose()
> will change the cookie if the disconnection attempt is successful.
>
> In TCP we do so in xs_tcp_state_change(). If the RDMA transport can
> guarantee that the call to xprt->ops->close(xprt) is always successful,
> then you could do so there.

I don't mind moving the cookie bump to rpcrdma_conn_upcall,
but I'm not sure I understand the locking requirements.

Currently, xprt_transmit sets the connect_cookie while holding
the transport_lock.

xprt_conditional_disconnect compares the cookie while holding
the transport_lock.

For TCP, the transport_lock is held when bumping the cookie
in the ESTABLISHED case, but _not_ in the two CLOSE cases?

xprtrdma holds the transport_lock when bumping the cookie,
which it does in its connect worker. It has to hold the lock
because it skips the value 0. xprtrdma needs to guarantee
that an RPC is never transmitted on the same connection
twice (and maybe it could use rq_connect_cookie instead of
its own cookie).

xprt_reserve_init is holding the reserve_lock but not the
transport_lock when it grabs the cookie. Maybe it should
not be initializing the rqst's cookie there?

Seems to me that xprt_transmit needs to update the rqst's
cookie while holding the transport_lock, especially if
xprtrdma needs to skip a cookie value? I'm sure I'm missing
something.

--
Chuck Lever
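[For reference, the comparison Chuck describes does happen under the transport_lock; the 4.15-era net/sunrpc/xprt.c code reads roughly as follows.]

/*
 * A request that notices its connection misbehaving calls this with
 * the cookie it sampled at transmit time. Only the first caller with
 * a current cookie actually schedules the autoclose.
 */
void xprt_conditional_disconnect(struct rpc_xprt *xprt, unsigned int cookie)
{
	/* Don't race with the test_bit() in xprt_clear_locked() */
	spin_lock_bh(&xprt->transport_lock);
	/* A stale cookie means another caller already forced a disconnect */
	if (cookie != xprt->connect_cookie)
		goto out;
	if (test_bit(XPRT_CLOSING, &xprt->state))
		goto out;
	set_bit(XPRT_CLOSE_WAIT, &xprt->state);
	/* Try to schedule an autoclose RPC call */
	if (test_and_set_bit(XPRT_LOCKED, &xprt->state) == 0)
		queue_work(xprtiod_workqueue, &xprt->task_cleanup);
	xprt_wake_pending_tasks(xprt, -EAGAIN);
out:
	spin_unlock_bh(&xprt->transport_lock);
}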
On Thu, 2017-12-14 at 15:59 -0500, Chuck Lever wrote:
> > On Dec 14, 2017, at 3:37 PM, Trond Myklebust <trondmy@primarydata.com> wrote:
> > 
> > On Thu, 2017-12-14 at 14:22 -0500, Chuck Lever wrote:
> > > > On Dec 14, 2017, at 2:03 PM, Trond Myklebust <trondmy@primarydata.com> wrote:
> > > > 
> > > > Does the RDMA code update the connect cookie when the connection
> > > > breaks? It looks to me as if it only does that when the connection is
> > > > re-established. We really want both.
> > > > 
> > > > > I imagine this issue could also impact write buffer exhaustion
> > > > > on TCP.
> > > > 
> > > > See net/sunrpc/xprtsock.c:xs_tcp_state_change()
> > > 
> > > xprtrdma manipulates the connect_cookie in its connect worker,
> > > see rpcrdma_connect_worker. This was added by:
> > > 
> > > commit 575448bd36208f99fe0dd554a43518d798966740
> > > Author:     Tom Talpey <talpey@netapp.com>
> > > AuthorDate: Thu Oct 9 15:00:40 2008 -0400
> > > Commit:     Trond Myklebust <Trond.Myklebust@netapp.com>
> > > CommitDate: Fri Oct 10 15:10:36 2008 -0400
> > > 
> > >     RPC/RDMA: suppress retransmit on RPC/RDMA clients.
> > > 
> > > Would it be more correct to bump the cookie in rpcrdma_conn_upcall,
> > > which is the equivalent of xs_tcp_state_change? (if so, why, so
> > > I can compose a reasonable patch description)
> > > 
> > > It could be bumped in the RDMA_CM_EVENT_ESTABLISHED and the
> > > RDMA_CM_EVENT_DISCONNECTED cases, for example. I'm not sure
> > > RDMA provides a distinction between "server disconnected"
> > > and "client disconnected", although that probably does not
> > > matter for this purpose.
> > > 
> > > But why would the additional cookie update help? The transport
> > > is not disconnecting before the deadlock.
> > 
> > The connection cookie's purpose is twofold:
> > 
> > 1) It tracks whether or not a request has been transmitted on the
> > current connection.
> 
> That's broken by setting the cookie unconditionally outside
> the transport_lock, isn't it?
> 
> > 2) It ensures that when several requests with the same connection
> > cookie all call xprt_conditional_disconnect(), that results in a
> > single disconnection event. To do so, it assumes that xprt_autoclose()
> > will change the cookie if the disconnection attempt is successful.
> > 
> > In TCP we do so in xs_tcp_state_change(). If the RDMA transport can
> > guarantee that the call to xprt->ops->close(xprt) is always successful,
> > then you could do so there.
> 
> I don't mind moving the cookie bump to rpcrdma_conn_upcall,
> but I'm not sure I understand the locking requirements.
> 
> Currently, xprt_transmit sets the connect_cookie while holding
> the transport_lock.
> 
> xprt_conditional_disconnect compares the cookie while holding
> the transport_lock.
> 
> For TCP, the transport_lock is held when bumping the cookie
> in the ESTABLISHED case, but _not_ in the two CLOSE cases?

That should be OK. The networking layer should provide sufficient
serialisation that we don't have to worry about collisions.

> xprtrdma holds the transport_lock when bumping the cookie,
> which it does in its connect worker. It has to hold the lock
> because it skips the value 0. xprtrdma needs to guarantee
> that an RPC is never transmitted on the same connection
> twice (and maybe it could use rq_connect_cookie instead of
> its own cookie).
> 
> xprt_reserve_init is holding the reserve_lock but not the
> transport_lock when it grabs the cookie. Maybe it should
> not be initializing the rqst's cookie there?
> 
> Seems to me that xprt_transmit needs to update the rqst's
> cookie while holding the transport_lock, especially if
> xprtrdma needs to skip a cookie value? I'm sure I'm missing
> something.

It should be OK, given that the connection is a state machine.
However, I missed something that you said earlier about
xprt_prepare_transmit().

OK. How about the following fixup patch instead of the earlier one?

8<---------------------------------------------------
From 21cdb2802d9d8b71553998e6be5aafeff0742142 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust@primarydata.com>
Date: Thu, 14 Dec 2017 07:05:27 -0500
Subject: [PATCH] fixup! SUNRPC: Fix a race in the receive code path

---
 net/sunrpc/xprt.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 5e4278e9ce37..33b74fd84051 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -1001,6 +1001,7 @@ void xprt_transmit(struct rpc_task *task)
 {
 	struct rpc_rqst	*req = task->tk_rqstp;
 	struct rpc_xprt	*xprt = req->rq_xprt;
+	unsigned int connect_cookie;
 	int status, numreqs;
 
 	dprintk("RPC: %5u xprt_transmit(%u)\n", task->tk_pid, req->rq_slen);
@@ -1024,7 +1025,7 @@ void xprt_transmit(struct rpc_task *task)
 	} else if (!req->rq_bytes_sent)
 		return;
 
-	req->rq_connect_cookie = xprt->connect_cookie;
+	connect_cookie = xprt->connect_cookie;
 	req->rq_xtime = ktime_get();
 	status = xprt->ops->send_request(task);
 	trace_xprt_transmit(xprt, req->rq_xid, status);
@@ -1050,6 +1051,7 @@ void xprt_transmit(struct rpc_task *task)
 	xprt->stat.pending_u += xprt->pending.qlen;
 	spin_unlock_bh(&xprt->transport_lock);
 
+	req->rq_connect_cookie = connect_cookie;
 	if (rpc_reply_expected(task) && !READ_ONCE(req->rq_reply_bytes_recvd)) {
 		/*
 		 * Sleep on the pending queue if we're expecting a reply.
-- 
2.14.3

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com
> On Dec 14, 2017, at 4:33 PM, Trond Myklebust <trondmy@primarydata.com> wrote:
>
> On Thu, 2017-12-14 at 15:59 -0500, Chuck Lever wrote:
>>> On Dec 14, 2017, at 3:37 PM, Trond Myklebust <trondmy@primarydata.com> wrote:
>>>
>>> On Thu, 2017-12-14 at 14:22 -0500, Chuck Lever wrote:
>>>>> On Dec 14, 2017, at 2:03 PM, Trond Myklebust <trondmy@primarydata.com> wrote:
>>>>>
>>>>> Does the RDMA code update the connect cookie when the connection
>>>>> breaks? It looks to me as if it only does that when the connection is
>>>>> re-established. We really want both.
>>>>>
>>>>>> I imagine this issue could also impact write buffer exhaustion
>>>>>> on TCP.
>>>>>
>>>>> See net/sunrpc/xprtsock.c:xs_tcp_state_change()
>>>>
>>>> xprtrdma manipulates the connect_cookie in its connect worker,
>>>> see rpcrdma_connect_worker. This was added by:
>>>>
>>>> commit 575448bd36208f99fe0dd554a43518d798966740
>>>> Author:     Tom Talpey <talpey@netapp.com>
>>>> AuthorDate: Thu Oct 9 15:00:40 2008 -0400
>>>> Commit:     Trond Myklebust <Trond.Myklebust@netapp.com>
>>>> CommitDate: Fri Oct 10 15:10:36 2008 -0400
>>>>
>>>>     RPC/RDMA: suppress retransmit on RPC/RDMA clients.
>>>>
>>>> Would it be more correct to bump the cookie in rpcrdma_conn_upcall,
>>>> which is the equivalent of xs_tcp_state_change? (if so, why, so
>>>> I can compose a reasonable patch description)
>>>>
>>>> It could be bumped in the RDMA_CM_EVENT_ESTABLISHED and the
>>>> RDMA_CM_EVENT_DISCONNECTED cases, for example. I'm not sure
>>>> RDMA provides a distinction between "server disconnected"
>>>> and "client disconnected", although that probably does not
>>>> matter for this purpose.
>>>>
>>>> But why would the additional cookie update help? The transport
>>>> is not disconnecting before the deadlock.
>>>
>>> The connection cookie's purpose is twofold:
>>>
>>> 1) It tracks whether or not a request has been transmitted on the
>>> current connection.
>>
>> That's broken by setting the cookie unconditionally outside
>> the transport_lock, isn't it?
>>
>>> 2) It ensures that when several requests with the same connection
>>> cookie all call xprt_conditional_disconnect(), that results in a
>>> single disconnection event. To do so, it assumes that xprt_autoclose()
>>> will change the cookie if the disconnection attempt is successful.
>>>
>>> In TCP we do so in xs_tcp_state_change(). If the RDMA transport can
>>> guarantee that the call to xprt->ops->close(xprt) is always successful,
>>> then you could do so there.
>>
>> I don't mind moving the cookie bump to rpcrdma_conn_upcall,
>> but I'm not sure I understand the locking requirements.
>>
>> Currently, xprt_transmit sets the connect_cookie while holding
>> the transport_lock.
>>
>> xprt_conditional_disconnect compares the cookie while holding
>> the transport_lock.
>>
>> For TCP, the transport_lock is held when bumping the cookie
>> in the ESTABLISHED case, but _not_ in the two CLOSE cases?
>
> That should be OK. The networking layer should provide sufficient
> serialisation that we don't have to worry about collisions.
>
>> xprtrdma holds the transport_lock when bumping the cookie,
>> which it does in its connect worker. It has to hold the lock
>> because it skips the value 0. xprtrdma needs to guarantee
>> that an RPC is never transmitted on the same connection
>> twice (and maybe it could use rq_connect_cookie instead of
>> its own cookie).
>>
>> xprt_reserve_init is holding the reserve_lock but not the
>> transport_lock when it grabs the cookie. Maybe it should
>> not be initializing the rqst's cookie there?
>>
>> Seems to me that xprt_transmit needs to update the rqst's
>> cookie while holding the transport_lock, especially if
>> xprtrdma needs to skip a cookie value? I'm sure I'm missing
>> something.
>
> It should be OK, given that the connection is a state machine.
> However, I missed something that you said earlier about
> xprt_prepare_transmit().
>
> OK. How about the following fixup patch instead of the earlier one?
>
> 8<---------------------------------------------------
> From 21cdb2802d9d8b71553998e6be5aafeff0742142 Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <trond.myklebust@primarydata.com>
> Date: Thu, 14 Dec 2017 07:05:27 -0500
> Subject: [PATCH] fixup! SUNRPC: Fix a race in the receive code path
>
> ---
> net/sunrpc/xprt.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> index 5e4278e9ce37..33b74fd84051 100644
> --- a/net/sunrpc/xprt.c
> +++ b/net/sunrpc/xprt.c
> @@ -1001,6 +1001,7 @@ void xprt_transmit(struct rpc_task *task)
> {
> 	struct rpc_rqst	*req = task->tk_rqstp;
> 	struct rpc_xprt	*xprt = req->rq_xprt;
> +	unsigned int connect_cookie;
> 	int status, numreqs;
>
> 	dprintk("RPC: %5u xprt_transmit(%u)\n", task->tk_pid, req->rq_slen);
> @@ -1024,7 +1025,7 @@ void xprt_transmit(struct rpc_task *task)
> 	} else if (!req->rq_bytes_sent)
> 		return;
>
> -	req->rq_connect_cookie = xprt->connect_cookie;
> +	connect_cookie = xprt->connect_cookie;
> 	req->rq_xtime = ktime_get();
> 	status = xprt->ops->send_request(task);
> 	trace_xprt_transmit(xprt, req->rq_xid, status);
> @@ -1050,6 +1051,7 @@ void xprt_transmit(struct rpc_task *task)
> 	xprt->stat.pending_u += xprt->pending.qlen;
> 	spin_unlock_bh(&xprt->transport_lock);
>
> +	req->rq_connect_cookie = connect_cookie;
> 	if (rpc_reply_expected(task) && !READ_ONCE(req->rq_reply_bytes_recvd)) {
> 		/*
> 		 * Sleep on the pending queue if we're expecting a reply.
> --
> 2.14.3

No problems here; this passed basic testing with NFSv4.0 on a client
with extra send_request fault injection.

I hope we can get the recv race fix (as updated here) and the
queue-work-on patch [1] into v4.15-rc.

--
Chuck Lever

[1] https://marc.info/?l=linux-nfs&m=151241427912572&w=2
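[The fault injection Chuck mentions is a local test hack, not upstream code. A hypothetical wrapper in the same spirit might look like the following; every identifier here is made up for illustration.]

#include <linux/atomic.h>

/* The transport's actual ->send_request method (name assumed). */
static int xprt_real_send_request(struct rpc_task *task);

/*
 * Hypothetical sketch: fail roughly one send in 64 with -ENOBUFS
 * to exercise the xprt_transmit() error path discussed in this
 * thread.
 */
static int xprt_faulty_send_request(struct rpc_task *task)
{
	static atomic_t nr_sends;

	if ((atomic_inc_return(&nr_sends) & 63) == 0)
		return -ENOBUFS;
	return xprt_real_send_request(task);
}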