From: Kirill Smelkov
Subject: [RESEND4, PATCH 1/2] fuse: retrieve: cap requested size to negotiated max_write
Date: Wed, 27 Mar 2019 10:15:19 +0000
To: Miklos Szeredi
Cc: Han-Wen Nienhuys, Jakob Unterwurzacher, Kirill Tkhai, Andrew Morton, Kirill Smelkov
Message-Id: <12f7d0d98555ee0d174d04bb47644f65c07f035a.1553680185.git.kirr@nexedi.com>
X-Patchwork-Submitter: Kirill Smelkov
X-Patchwork-Id: 10873243
X-Mailing-List: linux-fsdevel@vger.kernel.org

During initialization, the FUSE filesystem server and the kernel client
negotiate the maximum write size (max_write) that the client will ever issue.
Correspondingly, the filesystem server then queues sys_read calls to read
requests with a buffer capacity large enough to carry the request header plus
max_write bytes. A filesystem server is free to set its max_write anywhere in
the range [1·page, fc->max_pages·page]. In particular, go-fuse[2] sets
max_write to 64K by default, whereas the default fc->max_pages corresponds to
128K. Libfuse also allows users to configure max_write, but by default presets
it to the maximum possible value.

If max_write is < fc->max_pages·page and the NOTIFY_RETRIEVE handler allows
retrieving more than max_write bytes, the corresponding prepared NOTIFY_REPLY
will be thrown away by fuse_dev_do_read, because the filesystem server, in
full accordance with the server/client contract, only queues sys_read calls
with ~max_write buffer capacity, and fuse_dev_do_read throws away requests
that cannot fit into the server's request buffer. In turn, the filesystem
server could get stuck waiting indefinitely for NOTIFY_REPLY, since the
NOTIFY_RETRIEVE handler returned OK, which clients understand to mean that
NOTIFY_REPLY was queued and will be sent back.

-> Cap the requested size to the negotiated max_write to avoid the problem.
This aligns with the way the NOTIFY_RETRIEVE handler works, which already
unconditionally caps the requested retrieve size to fuse_conn->max_pages.
This way it should not hurt NOTIFY_RETRIEVE semantics if we return less data
than was originally requested.

Please see [1] for context: where the problem of the stuck filesystem was hit
for real, how the situation was traced, and a more involved patch that did
not make it into the tree.
[1] https://marc.info/?l=linux-fsdevel&m=155057023600853&w=2
[2] https://github.com/hanwen/go-fuse

Signed-off-by: Kirill Smelkov
Cc: Han-Wen Nienhuys
Cc: Jakob Unterwurzacher
Cc: # v2.6.36+
Signed-off-by: Kirill Smelkov
---
 fs/fuse/dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 8a63e52785e9..38e94bc43053 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1749,7 +1749,7 @@ static int fuse_retrieve(struct fuse_conn *fc, struct inode *inode,
 	offset = outarg->offset & ~PAGE_MASK;
 	file_size = i_size_read(inode);
 
-	num = outarg->size;
+	num = min(outarg->size, fc->max_write);
 	if (outarg->offset > file_size)
 		num = 0;
 	else if (outarg->offset + num > file_size)
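
For readers who want to check the arithmetic outside the kernel, here is a
minimal userspace sketch of the clamping done in the hunk above, together with
the "does it fit into the server's read buffer" condition described in the
commit message. It is not kernel code: the sketch_* names, min_u32 and the
80-byte SKETCH_HDR_BYTES header overhead are assumptions made for
illustration only; the 64K and 128K figures are the go-fuse default and the
default fc->max_pages size quoted in the message.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: bytes of request header before the payload. */
#define SKETCH_HDR_BYTES 80u

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Mirrors the clamping in the patched fuse_retrieve(): first cap the
 * requested size to max_write, then trim it to what the file can provide.
 */
static uint32_t sketch_retrieve_num(uint64_t offset, uint32_t size,
				    uint32_t max_write, uint64_t file_size)
{
	uint32_t num = min_u32(size, max_write);

	if (offset > file_size)
		num = 0;
	else if (offset + num > file_size)
		num = (uint32_t)(file_size - offset);
	return num;
}

/*
 * Models the condition under which a queued request can be delivered:
 * the server reads with a buffer of roughly header + max_write bytes, so
 * a request carrying more payload than max_write does not fit.
 */
static int sketch_fits(uint32_t payload, uint32_t max_write)
{
	uint64_t capacity = (uint64_t)SKETCH_HDR_BYTES + max_write;
	uint64_t needed = (uint64_t)SKETCH_HDR_BYTES + payload;

	return needed <= capacity;
}

/* Walks through the go-fuse case from the commit message. */
void sketch_selfcheck(void)
{
	const uint32_t max_write = 64 * 1024;   /* go-fuse default */
	const uint32_t requested = 128 * 1024;  /* default max_pages size */
	const uint64_t file_size = 1024 * 1024; /* arbitrary example file */
	uint32_t num;

	/* Without the cap, a 128K reply cannot fit a ~64K read buffer. */
	assert(!sketch_fits(requested, max_write));

	/* With the cap, the reply shrinks to max_write and fits. */
	num = sketch_retrieve_num(0, requested, max_write, file_size);
	assert(num == max_write);
	assert(sketch_fits(num, max_write));

	/* Near end of file, the existing file-size trimming still applies. */
	assert(sketch_retrieve_num(file_size - 100, requested, max_write,
				   file_size) == 100);
}
```

Calling sketch_selfcheck() from a main() exercises the three cases: the
uncapped 128K request is rejected by the fit check, the capped request is
reduced to 64K and accepted, and a request starting 100 bytes before end of
file is trimmed to 100 bytes.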