From patchwork Wed Jun 15 03:17:51 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chuck Lever X-Patchwork-Id: 9177387 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id E58266021C for ; Wed, 15 Jun 2016 03:18:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D7476281FE for ; Wed, 15 Jun 2016 03:18:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CA19A28349; Wed, 15 Jun 2016 03:18:00 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_HI,T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 416FB281FE for ; Wed, 15 Jun 2016 03:18:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932635AbcFODR7 (ORCPT ); Tue, 14 Jun 2016 23:17:59 -0400 Received: from mail-it0-f67.google.com ([209.85.214.67]:33501 "EHLO mail-it0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932473AbcFODR6 (ORCPT ); Tue, 14 Jun 2016 23:17:58 -0400 Received: by mail-it0-f67.google.com with SMTP id i6so1618474ith.0; Tue, 14 Jun 2016 20:17:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:subject:from:to:date:message-id:in-reply-to:references :user-agent:mime-version:content-transfer-encoding; bh=rufxiUuqEvtaup6E7COhwUz+LyypuK2nBibI7cgcnU8=; b=AqTqGMkTyWNUMXf6NUC83ybLmqwU3x8Mix3naexGYIkS6PgmAmr980+/h5YSISiKWm Foekj7qAzRZIEWcNqd3SC06Yb+V/036Qdji7QZEVEJ5/GFwTyLJPDbf8uI9wWx99H5Z2 lIiCIfqW6PCyOo31DirDr9Gi/9uHprE4Cn0/Z4DSbgM/sZIFaX5qdJOOFXDiY+mj3Ln+ WofP0WCPDxZI6a4H6GSSgvlfLxahE5vUDIzksbr86iBon1pNu9Hl81SGHyiPhRxNb3/H Bnzffoua2fQnq2RMuG9+I50bDo2rSxQaG72PRlyMg1CVB72cPg3zlWvq4Ksovt1CEo0L RSsA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:sender:subject:from:to:date:message-id :in-reply-to:references:user-agent:mime-version :content-transfer-encoding; bh=rufxiUuqEvtaup6E7COhwUz+LyypuK2nBibI7cgcnU8=; b=LukJZBgf89tLVLKodEWb1/jdyNxN0dULTI0o2mQWNm5Kjv8o9ycEvdyk4c2vbqW4Bf aD0uETL+ncf2o0Mrvn4/3LBXqX12nFK0Au5w1W/6Pc6iile0oNkn82do2TwY3JSf2QSc SJAXdjlHv9nt1CCFyAoWEYNlfi59oWdquwSOse+cS7Dh9mSsIvTDJ9JAt35LzLrD9yEd cbMnAQAue9Md5K55NDoV6gwHcX2nGioeUFaXQNcmZ2mH+LxnRun0OEen8ARx52yHjuyb C68dJEVuQ0bliK/z8yuHQOipd8ThucvmswTJ9koXd1QfBrUx8BDHvXR346ZaNfjQ1QCb /38Q== X-Gm-Message-State: ALyK8tLp41X6tODYZ2l9C2cZMONFNrXygnSO75XDp8tyXLDQNYStelcVnvbVLApAcxMpPw== X-Received: by 10.36.2.2 with SMTP id 2mr32380351itu.34.1465960672050; Tue, 14 Jun 2016 20:17:52 -0700 (PDT) Received: from manet.1015granger.net (c-68-46-169-226.hsd1.mi.comcast.net. [68.46.169.226]) by smtp.gmail.com with ESMTPSA id j65sm3284894ith.17.2016.06.14.20.17.51 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 14 Jun 2016 20:17:51 -0700 (PDT) Subject: [PATCH v2 19/24] xprtrdma: Do not update {head, tail}.iov_len in rpcrdma_inline_fixup() From: Chuck Lever To: linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org Date: Tue, 14 Jun 2016 23:17:51 -0400 Message-ID: <20160615031751.14794.79660.stgit@manet.1015granger.net> In-Reply-To: <20160615030626.14794.43805.stgit@manet.1015granger.net> References: <20160615030626.14794.43805.stgit@manet.1015granger.net> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP While trying NFSv4.0/RDMA with sec=krb5p, I noticed small NFS READ operations failed. After the client unwrapped the NFS READ reply message, the NFS READ XDR decoder was not able to decode the reply. The message was "Server cheating in reply", with the reported number of received payload bytes being zero. Applications reported a read(2) that returned -1/EIO. The problem is rpcrdma_inline_fixup() sets the tail.iov_len to zero when the incoming reply fits entirely in the head iovec. The zero tail.iov_len confused xdr_buf_trim(), which then mangled the actual reply data instead of simply removing the trailing GSS checksum. As near as I can tell, RPC transports are not supposed to update the head.iov_len, page_len, or tail.iov_len fields in the receive XDR buffer when handling an incoming RPC reply message. These fields contain the length of each component of the XDR buffer, and hence the maximum number of bytes of reply data that can be stored in each XDR buffer component. I've concluded this because: - This is how xdr_partial_copy_from_skb() appears to behave - rpcrdma_inline_fixup() already does not alter page_len - call_decode() compares rq_private_buf and rq_rcv_buf and WARNs if they are not exactly the same Unfortunately, as soon as I tried the simple fix to just remove the line that sets tail.iov_len to zero, I saw that the logic that appends the implicit Write chunk pad inline depends on inline_fixup setting tail.iov_len to zero. To address this, re-organize the tail iovec handling logic to use the same approach as with the head iovec: simply point tail.iov_base to the correct bytes in the receive buffer. While I remember all this, write down the conclusion in documenting comments. Signed-off-by: Chuck Lever --- net/sunrpc/xprtrdma/rpc_rdma.c | 61 ++++++++++++++++++++++------------------ 1 file changed, 33 insertions(+), 28 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c index e3560c2..d018eb7 100644 --- a/net/sunrpc/xprtrdma/rpc_rdma.c +++ b/net/sunrpc/xprtrdma/rpc_rdma.c @@ -740,8 +740,16 @@ rpcrdma_count_chunks(struct rpcrdma_rep *rep, int wrchunk, __be32 **iptrp) return total_len; } -/* - * Scatter inline received data back into provided iov's. +/** + * rpcrdma_inline_fixup - Scatter inline received data into rqst's iovecs + * @rqst: controlling RPC request + * @srcp: points to RPC message payload in receive buffer + * @copy_len: remaining length of receive buffer content + * @pad: Write chunk pad bytes needed (zero for pure inline) + * + * The upper layer has set the maximum number of bytes it can + * receive in each component of rq_rcv_buf. These values are set in + * the head.iov_len, page_len, tail.iov_len, and buflen fields. */ static void rpcrdma_inline_fixup(struct rpc_rqst *rqst, char *srcp, int copy_len, int pad) @@ -751,17 +759,19 @@ rpcrdma_inline_fixup(struct rpc_rqst *rqst, char *srcp, int copy_len, int pad) struct page **ppages; int page_base; + /* The head iovec is redirected to the RPC reply message + * in the receive buffer, to avoid a memcopy. + */ + rqst->rq_rcv_buf.head[0].iov_base = srcp; + + /* The contents of the receive buffer that follow + * head.iov_len bytes are copied into the page list. + */ curlen = rqst->rq_rcv_buf.head[0].iov_len; - if (curlen > copy_len) { /* write chunk header fixup */ + if (curlen > copy_len) curlen = copy_len; - rqst->rq_rcv_buf.head[0].iov_len = curlen; - } - dprintk("RPC: %s: srcp 0x%p len %d hdrlen %d\n", __func__, srcp, copy_len, curlen); - - /* Shift pointer for first receive segment only */ - rqst->rq_rcv_buf.head[0].iov_base = srcp; srcp += curlen; copy_len -= curlen; @@ -798,28 +808,23 @@ rpcrdma_inline_fixup(struct rpc_rqst *rqst, char *srcp, int copy_len, int pad) break; page_base = 0; } - } - if (copy_len && rqst->rq_rcv_buf.tail[0].iov_len) { - curlen = copy_len; - if (curlen > rqst->rq_rcv_buf.tail[0].iov_len) - curlen = rqst->rq_rcv_buf.tail[0].iov_len; - if (rqst->rq_rcv_buf.tail[0].iov_base != srcp) - memmove(rqst->rq_rcv_buf.tail[0].iov_base, srcp, curlen); - dprintk("RPC: %s: tail srcp 0x%p len %d curlen %d\n", - __func__, srcp, copy_len, curlen); - rqst->rq_rcv_buf.tail[0].iov_len = curlen; - copy_len -= curlen; ++i; - } else - rqst->rq_rcv_buf.tail[0].iov_len = 0; - - if (pad) { - /* implicit padding on terminal chunk */ - unsigned char *p = rqst->rq_rcv_buf.tail[0].iov_base; - while (pad--) - p[rqst->rq_rcv_buf.tail[0].iov_len++] = 0; + /* Implicit padding for the last segment in a Write + * chunk is inserted inline at the front of the tail + * iovec. The upper layer ignores the content of + * the pad. Simply ensure inline content in the tail + * that follows the Write chunk is properly aligned. + */ + if (pad) + srcp -= pad; } + /* The tail iovec is redirected to the remaining data + * in the receive buffer, to avoid a memcopy. + */ + if (copy_len || pad) + rqst->rq_rcv_buf.tail[0].iov_base = srcp; + if (copy_len) dprintk("RPC: %s: %d bytes in" " %d extra segments (%d lost)\n",