From patchwork Mon Jun 20 16:11:10 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chuck Lever X-Patchwork-Id: 9188089 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 275836075F for ; Mon, 20 Jun 2016 16:14:29 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 14B6120410 for ; Mon, 20 Jun 2016 16:14:29 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0759427A98; Mon, 20 Jun 2016 16:14:29 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_HI,T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7E9FD20410 for ; Mon, 20 Jun 2016 16:14:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751225AbcFTQNc (ORCPT ); Mon, 20 Jun 2016 12:13:32 -0400 Received: from mail-it0-f68.google.com ([209.85.214.68]:34575 "EHLO mail-it0-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755091AbcFTQNN (ORCPT ); Mon, 20 Jun 2016 12:13:13 -0400 Received: by mail-it0-f68.google.com with SMTP id f6so7331183ith.1; Mon, 20 Jun 2016 09:11:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:subject:from:to:date:message-id:in-reply-to:references :user-agent:mime-version:content-transfer-encoding; bh=rufxiUuqEvtaup6E7COhwUz+LyypuK2nBibI7cgcnU8=; b=o3R0DxGPKK8pKvhF+kInZjvzdH+u0knj5IWiI98kJx2IQmSTJqcdw6uwIjnEcjxMIk wvzsphP/Pd5eaaEb2UcXwe0SN9RePV1HnokZAyPLygqFtVp9/EDQw6ALIfkgXcJm5RPZ MIQSU/3eTH9VeebbflH5A3FqpE8gkTVmmZxEDM5NEwe9HQyNaDYD8eSqyqWb6pNZp9ZX e0ZM9+QyUAXzXop3pm43KPy1vAet4zUgeiZKuV82Smf43clWq0BATLicjz+2En/NZpZy Grty1zwod5qlekeLcx0ZrZihBvDrNTkS1Wnyw3DnXURM6JEEO0F40O3vhhqXPs7UkdpG mXFw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:sender:subject:from:to:date:message-id :in-reply-to:references:user-agent:mime-version :content-transfer-encoding; bh=rufxiUuqEvtaup6E7COhwUz+LyypuK2nBibI7cgcnU8=; b=Di6E+t0AFdMI9nNjhvXUUqOAWbti914WAmXRivN42qh9QnBqXK9Jd7jO1rpTRzUYW4 yyB1FzKF9rlhpOT3LgBUIxDW5dHH6H+yuYYbBbbXccshOPS5mb0yAPpEb4wdbtppJvvQ I/mRGk4MtLAE5SDzVSJOJ53NKvKBA6e5HRw6lAXt/fuB/fDUB9lzYeZTjHv/mpMo2DXh +i26L1SQipozreLV7c4xL5MwNK0mAsIIxAfYBM4gkP1D9aFjDorqIqPpZlJ4LwAHiuEg wauPyfar5eW4A7pu7DRy8EluZH8kJXGQNaTsL8vevhENqLepVGTGSh45l1r9t25Dm+dZ feBQ== X-Gm-Message-State: ALyK8tIkrxiOs6Tdh3A2N4nHRaDUxcy0hOchXG9onPUtmAsSnftvk1tfrSGr6qCC1BX1lg== X-Received: by 10.36.213.135 with SMTP id a129mr174903itg.8.1466439071934; Mon, 20 Jun 2016 09:11:11 -0700 (PDT) Received: from manet.1015granger.net ([2604:8800:100:81fc:ec4:7aff:fe6c:1dce]) by smtp.gmail.com with ESMTPSA id f69sm27162198ioj.39.2016.06.20.09.11.11 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 20 Jun 2016 09:11:11 -0700 (PDT) Subject: [PATCH v3 19/25] xprtrdma: Do not update {head, tail}.iov_len in rpcrdma_inline_fixup() From: Chuck Lever To: linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org Date: Mon, 20 Jun 2016 12:11:10 -0400 Message-ID: <20160620161110.10809.24136.stgit@manet.1015granger.net> In-Reply-To: <20160620155751.10809.22262.stgit@manet.1015granger.net> References: <20160620155751.10809.22262.stgit@manet.1015granger.net> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP While trying NFSv4.0/RDMA with sec=krb5p, I noticed small NFS READ operations failed. After the client unwrapped the NFS READ reply message, the NFS READ XDR decoder was not able to decode the reply. The message was "Server cheating in reply", with the reported number of received payload bytes being zero. Applications reported a read(2) that returned -1/EIO. The problem is rpcrdma_inline_fixup() sets the tail.iov_len to zero when the incoming reply fits entirely in the head iovec. The zero tail.iov_len confused xdr_buf_trim(), which then mangled the actual reply data instead of simply removing the trailing GSS checksum. As near as I can tell, RPC transports are not supposed to update the head.iov_len, page_len, or tail.iov_len fields in the receive XDR buffer when handling an incoming RPC reply message. These fields contain the length of each component of the XDR buffer, and hence the maximum number of bytes of reply data that can be stored in each XDR buffer component. I've concluded this because: - This is how xdr_partial_copy_from_skb() appears to behave - rpcrdma_inline_fixup() already does not alter page_len - call_decode() compares rq_private_buf and rq_rcv_buf and WARNs if they are not exactly the same Unfortunately, as soon as I tried the simple fix to just remove the line that sets tail.iov_len to zero, I saw that the logic that appends the implicit Write chunk pad inline depends on inline_fixup setting tail.iov_len to zero. To address this, re-organize the tail iovec handling logic to use the same approach as with the head iovec: simply point tail.iov_base to the correct bytes in the receive buffer. While I remember all this, write down the conclusion in documenting comments. Signed-off-by: Chuck Lever --- net/sunrpc/xprtrdma/rpc_rdma.c | 61 ++++++++++++++++++++++------------------ 1 file changed, 33 insertions(+), 28 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c index e3560c2..d018eb7 100644 --- a/net/sunrpc/xprtrdma/rpc_rdma.c +++ b/net/sunrpc/xprtrdma/rpc_rdma.c @@ -740,8 +740,16 @@ rpcrdma_count_chunks(struct rpcrdma_rep *rep, int wrchunk, __be32 **iptrp) return total_len; } -/* - * Scatter inline received data back into provided iov's. +/** + * rpcrdma_inline_fixup - Scatter inline received data into rqst's iovecs + * @rqst: controlling RPC request + * @srcp: points to RPC message payload in receive buffer + * @copy_len: remaining length of receive buffer content + * @pad: Write chunk pad bytes needed (zero for pure inline) + * + * The upper layer has set the maximum number of bytes it can + * receive in each component of rq_rcv_buf. These values are set in + * the head.iov_len, page_len, tail.iov_len, and buflen fields. */ static void rpcrdma_inline_fixup(struct rpc_rqst *rqst, char *srcp, int copy_len, int pad) @@ -751,17 +759,19 @@ rpcrdma_inline_fixup(struct rpc_rqst *rqst, char *srcp, int copy_len, int pad) struct page **ppages; int page_base; + /* The head iovec is redirected to the RPC reply message + * in the receive buffer, to avoid a memcopy. + */ + rqst->rq_rcv_buf.head[0].iov_base = srcp; + + /* The contents of the receive buffer that follow + * head.iov_len bytes are copied into the page list. + */ curlen = rqst->rq_rcv_buf.head[0].iov_len; - if (curlen > copy_len) { /* write chunk header fixup */ + if (curlen > copy_len) curlen = copy_len; - rqst->rq_rcv_buf.head[0].iov_len = curlen; - } - dprintk("RPC: %s: srcp 0x%p len %d hdrlen %d\n", __func__, srcp, copy_len, curlen); - - /* Shift pointer for first receive segment only */ - rqst->rq_rcv_buf.head[0].iov_base = srcp; srcp += curlen; copy_len -= curlen; @@ -798,28 +808,23 @@ rpcrdma_inline_fixup(struct rpc_rqst *rqst, char *srcp, int copy_len, int pad) break; page_base = 0; } - } - if (copy_len && rqst->rq_rcv_buf.tail[0].iov_len) { - curlen = copy_len; - if (curlen > rqst->rq_rcv_buf.tail[0].iov_len) - curlen = rqst->rq_rcv_buf.tail[0].iov_len; - if (rqst->rq_rcv_buf.tail[0].iov_base != srcp) - memmove(rqst->rq_rcv_buf.tail[0].iov_base, srcp, curlen); - dprintk("RPC: %s: tail srcp 0x%p len %d curlen %d\n", - __func__, srcp, copy_len, curlen); - rqst->rq_rcv_buf.tail[0].iov_len = curlen; - copy_len -= curlen; ++i; - } else - rqst->rq_rcv_buf.tail[0].iov_len = 0; - - if (pad) { - /* implicit padding on terminal chunk */ - unsigned char *p = rqst->rq_rcv_buf.tail[0].iov_base; - while (pad--) - p[rqst->rq_rcv_buf.tail[0].iov_len++] = 0; + /* Implicit padding for the last segment in a Write + * chunk is inserted inline at the front of the tail + * iovec. The upper layer ignores the content of + * the pad. Simply ensure inline content in the tail + * that follows the Write chunk is properly aligned. + */ + if (pad) + srcp -= pad; } + /* The tail iovec is redirected to the remaining data + * in the receive buffer, to avoid a memcopy. + */ + if (copy_len || pad) + rqst->rq_rcv_buf.tail[0].iov_base = srcp; + if (copy_len) dprintk("RPC: %s: %d bytes in" " %d extra segments (%d lost)\n",