From patchwork Fri Nov 30 16:39:53 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10706709 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5A20D17F0 for ; Fri, 30 Nov 2018 16:40:17 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4968C303C9 for ; Fri, 30 Nov 2018 16:40:17 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3DB2E303DE; Fri, 30 Nov 2018 16:40:17 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D24CC300B7 for ; Fri, 30 Nov 2018 16:40:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727450AbeLADuE (ORCPT ); Fri, 30 Nov 2018 22:50:04 -0500 Received: from mx1.redhat.com ([209.132.183.28]:53184 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727195AbeLADuE (ORCPT ); Fri, 30 Nov 2018 22:50:04 -0500 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 3ADD9300127F; Fri, 30 Nov 2018 16:40:11 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-120-113.rdu2.redhat.com [10.10.120.113]) by smtp.corp.redhat.com (Postfix) with ESMTP id 804BD5EE03; Fri, 30 Nov 2018 16:39:53 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 1/7] cachefiles: Fix an assertion failure when trying to update a failed object From: David Howells To: torvalds@linux-foundation.org Cc: Zhibin Li , dhowells@redhat.com, linux-cachefs@redhat.com, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Fri, 30 Nov 2018 16:39:53 +0000 Message-ID: <154359599280.18703.15983385548712952353.stgit@warthog.procyon.org.uk> In-Reply-To: <154359597927.18703.424445427183570230.stgit@warthog.procyon.org.uk> References: <154359597927.18703.424445427183570230.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.45]); Fri, 30 Nov 2018 16:40:11 +0000 (UTC) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP If cachefiles gets an error other then ENOENT when trying to look up an object in the cache (in this case, EACCES), the object state machine will eventually transition to the DROP_OBJECT state. This state invokes fscache_drop_object() which tries to sync the auxiliary data with the cache (this is done lazily since commit 402cb8dda949d) on an incomplete cache object struct. The problem comes when cachefiles_update_object_xattr() is called to rewrite the xattr holding the data. There's an assertion there that the cache object points to a dentry as we're going to update its xattr. The assertion trips, however, as dentry didn't get set. Fix the problem by skipping the update in cachefiles if the object doesn't refer to a dentry. A better way to do it could be to skip the update from the DROP_OBJECT state handler in fscache, but that might deny the cache the opportunity to update intermediate state. If this error occurs, the kernel log includes lines that look like the following: CacheFiles: Lookup failed error -13 CacheFiles: CacheFiles: Assertion failed ------------[ cut here ]------------ kernel BUG at fs/cachefiles/xattr.c:138! ... Workqueue: fscache_object fscache_object_work_func [fscache] RIP: 0010:cachefiles_update_object_xattr.cold.4+0x18/0x1a [cachefiles] ... Call Trace: cachefiles_update_object+0xdd/0x1c0 [cachefiles] fscache_update_aux_data+0x23/0x30 [fscache] fscache_drop_object+0x18e/0x1c0 [fscache] fscache_object_work_func+0x74/0x2b0 [fscache] process_one_work+0x18d/0x340 worker_thread+0x2e/0x390 ? pwq_unbound_release_workfn+0xd0/0xd0 kthread+0x112/0x130 ? kthread_bind+0x30/0x30 ret_from_fork+0x35/0x40 Note that there are actually two issues here: (1) EACCES happened on a cache object and (2) an oops occurred. I think that the second is a consequence of the first (it certainly looks like it ought to be). This patch only deals with the second. Fixes: 402cb8dda949 ("fscache: Attach the index key and aux data to the cookie") Reported-by: Zhibin Li Signed-off-by: David Howells --- fs/cachefiles/xattr.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/cachefiles/xattr.c b/fs/cachefiles/xattr.c index 0a29a00aed2e..511e6c68156a 100644 --- a/fs/cachefiles/xattr.c +++ b/fs/cachefiles/xattr.c @@ -135,7 +135,8 @@ int cachefiles_update_object_xattr(struct cachefiles_object *object, struct dentry *dentry = object->dentry; int ret; - ASSERT(dentry); + if (!dentry) + return -ESTALE; _enter("%p,#%d", object, auxdata->len); From patchwork Fri Nov 30 16:40:16 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10706711 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 31F2E13B0 for ; Fri, 30 Nov 2018 16:40:35 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 215B4303E8 for ; Fri, 30 Nov 2018 16:40:35 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1F822301E1; Fri, 30 Nov 2018 16:40:35 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9BA1D303E8 for ; Fri, 30 Nov 2018 16:40:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727559AbeLADuV (ORCPT ); Fri, 30 Nov 2018 22:50:21 -0500 Received: from mx1.redhat.com ([209.132.183.28]:45280 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726663AbeLADuV (ORCPT ); Fri, 30 Nov 2018 22:50:21 -0500 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 7A4E43D95C; Fri, 30 Nov 2018 16:40:28 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-120-113.rdu2.redhat.com [10.10.120.113]) by smtp.corp.redhat.com (Postfix) with ESMTP id 59BB261530; Fri, 30 Nov 2018 16:40:22 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 2/7] fscache: Fix race in fscache_op_complete() due to split atomic_sub & read From: David Howells To: torvalds@linux-foundation.org Cc: Kiran Kumar Modukuri , dhowells@redhat.com, linux-cachefs@redhat.com, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Fri, 30 Nov 2018 16:40:16 +0000 Message-ID: <154359601645.18703.16410473531194112220.stgit@warthog.procyon.org.uk> In-Reply-To: <154359597927.18703.424445427183570230.stgit@warthog.procyon.org.uk> References: <154359597927.18703.424445427183570230.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.30]); Fri, 30 Nov 2018 16:40:28 +0000 (UTC) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: kiran.modukuri The code in fscache_retrieval_complete is using atomic_sub followed by an atomic_read: atomic_sub(n_pages, &op->n_pages); if (atomic_read(&op->n_pages) <= 0) fscache_op_complete(&op->op, true); This causes two threads doing a decrement of n_pages to race with each other seeing the op->refcount 0 at same time - and they end up calling fscache_op_complete() in both the threads leading to an assertion failure. Fix this by using atomic_sub_return_relaxed() instead of two calls. Note that I'm using 'relaxed' rather than, say, 'release' as there aren't multiple variables that appear to need ordering across the release. The oops looks something like: FS-Cache: Assertion failed FS-Cache: 0 > 0 is false ... kernel BUG at /usr/src/linux-4.4.0/fs/fscache/operation.c:449! ... Workqueue: fscache_operation fscache_op_work_func [fscache] ... RIP: 0010:[] fscache_op_complete+0x10d/0x180 [fscache] ... Call Trace: [] cachefiles_read_copier+0x3a9/0x410 [cachefiles] [] fscache_op_work_func+0x22/0x50 [fscache] [] process_one_work+0x150/0x3f0 [] worker_thread+0x11a/0x470 [] ? __schedule+0x359/0x980 [] ? rescuer_thread+0x310/0x310 [] kthread+0xd6/0xf0 [] ? kthread_park+0x60/0x60 [] ret_from_fork+0x3f/0x70 [] ? kthread_park+0x60/0x60 This seen this in 4.4.x kernels and the same bug affects fscache in latest upstreams kernels. Fixes: 1bb4b7f98f36 ("FS-Cache: The retrieval remaining-pages counter needs to be atomic_t") Signed-off-by: Kiran Kumar Modukuri Signed-off-by: David Howells --- include/linux/fscache-cache.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h index 34cf0fdd7dc7..610815e3f1aa 100644 --- a/include/linux/fscache-cache.h +++ b/include/linux/fscache-cache.h @@ -196,8 +196,7 @@ static inline void fscache_enqueue_retrieval(struct fscache_retrieval *op) static inline void fscache_retrieval_complete(struct fscache_retrieval *op, int n_pages) { - atomic_sub(n_pages, &op->n_pages); - if (atomic_read(&op->n_pages) <= 0) + if (atomic_sub_return_relaxed(n_pages, &op->n_pages) <= 0) fscache_op_complete(&op->op, false); } From patchwork Fri Nov 30 16:40:33 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10706713 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 04A254B7E for ; Fri, 30 Nov 2018 16:40:38 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E86A7300CB for ; Fri, 30 Nov 2018 16:40:37 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E6CC2303F7; Fri, 30 Nov 2018 16:40:37 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 752A5301E1 for ; Fri, 30 Nov 2018 16:40:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727615AbeLADu3 (ORCPT ); Fri, 30 Nov 2018 22:50:29 -0500 Received: from mx1.redhat.com ([209.132.183.28]:45390 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726663AbeLADu3 (ORCPT ); Fri, 30 Nov 2018 22:50:29 -0500 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 546874E937; Fri, 30 Nov 2018 16:40:36 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-120-113.rdu2.redhat.com [10.10.120.113]) by smtp.corp.redhat.com (Postfix) with ESMTP id 769C65EE03; Fri, 30 Nov 2018 16:40:34 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 3/7] cachefiles: Fix page leak in cachefiles_read_backing_file while vmscan is active From: David Howells To: torvalds@linux-foundation.org Cc: Daniel Axtens , Shantanu Goel , Kiran Kumar Modukuri , Daniel Axtens , dhowells@redhat.com, linux-cachefs@redhat.com, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Fri, 30 Nov 2018 16:40:33 +0000 Message-ID: <154359603369.18703.763590641473461495.stgit@warthog.procyon.org.uk> In-Reply-To: <154359597927.18703.424445427183570230.stgit@warthog.procyon.org.uk> References: <154359597927.18703.424445427183570230.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Fri, 30 Nov 2018 16:40:36 +0000 (UTC) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Kiran Kumar Modukuri [Description] In a heavily loaded system where the system pagecache is nearing memory limits and fscache is enabled, pages can be leaked by fscache while trying read pages from cachefiles backend. This can happen because two applications can be reading same page from a single mount, two threads can be trying to read the backing page at same time. This results in one of the threads finding that a page for the backing file or netfs file is already in the radix tree. During the error handling cachefiles does not clean up the reference on backing page, leading to page leak. [Fix] The fix is straightforward, to decrement the reference when error is encountered. [dhowells: Note that I've removed the clearance and put of newpage as they aren't attested in the commit message and don't appear to actually achieve anything since a new page is only allocated is newpage!=NULL and any residual new page is cleared before returning.] [Testing] I have tested the fix using following method for 12+ hrs. 1) mkdir -p /mnt/nfs ; mount -o vers=3,fsc :/export /mnt/nfs 2) create 10000 files of 2.8MB in a NFS mount. 3) start a thread to simulate heavy VM presssure (while true ; do echo 3 > /proc/sys/vm/drop_caches ; sleep 1 ; done)& 4) start multiple parallel reader for data set at same time find /mnt/nfs -type f | xargs -P 80 cat > /dev/null & find /mnt/nfs -type f | xargs -P 80 cat > /dev/null & find /mnt/nfs -type f | xargs -P 80 cat > /dev/null & .. .. find /mnt/nfs -type f | xargs -P 80 cat > /dev/null & find /mnt/nfs -type f | xargs -P 80 cat > /dev/null & 5) finally check using cat /proc/fs/fscache/stats | grep -i pages ; free -h , cat /proc/meminfo and page-types -r -b lru to ensure all pages are freed. Reviewed-by: Daniel Axtens Signed-off-by: Shantanu Goel Signed-off-by: Kiran Kumar Modukuri [dja: forward ported to current upstream] Signed-off-by: Daniel Axtens Signed-off-by: David Howells --- fs/cachefiles/rdwr.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c index 40f7595aad10..db233588a69a 100644 --- a/fs/cachefiles/rdwr.c +++ b/fs/cachefiles/rdwr.c @@ -535,7 +535,10 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object, netpage->index, cachefiles_gfp); if (ret < 0) { if (ret == -EEXIST) { + put_page(backpage); + backpage = NULL; put_page(netpage); + netpage = NULL; fscache_retrieval_complete(op, 1); continue; } @@ -608,7 +611,10 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object, netpage->index, cachefiles_gfp); if (ret < 0) { if (ret == -EEXIST) { + put_page(backpage); + backpage = NULL; put_page(netpage); + netpage = NULL; fscache_retrieval_complete(op, 1); continue; } From patchwork Fri Nov 30 16:40:41 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10706715 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 837161057 for ; Fri, 30 Nov 2018 16:40:54 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6EC51303E9 for ; Fri, 30 Nov 2018 16:40:54 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 63422303F0; Fri, 30 Nov 2018 16:40:54 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 03774301E1 for ; Fri, 30 Nov 2018 16:40:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727648AbeLADuk (ORCPT ); Fri, 30 Nov 2018 22:50:40 -0500 Received: from mx1.redhat.com ([209.132.183.28]:39458 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726663AbeLADuk (ORCPT ); Fri, 30 Nov 2018 22:50:40 -0500 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 8A8DEC7913; Fri, 30 Nov 2018 16:40:47 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-120-113.rdu2.redhat.com [10.10.120.113]) by smtp.corp.redhat.com (Postfix) with ESMTP id 56A785D9CD; Fri, 30 Nov 2018 16:40:42 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 4/7] fscache: fix race between enablement and dropping of object From: David Howells To: torvalds@linux-foundation.org Cc: NeilBrown , dhowells@redhat.com, linux-cachefs@redhat.com, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Fri, 30 Nov 2018 16:40:41 +0000 Message-ID: <154359604153.18703.13408831778890364347.stgit@warthog.procyon.org.uk> In-Reply-To: <154359597927.18703.424445427183570230.stgit@warthog.procyon.org.uk> References: <154359597927.18703.424445427183570230.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Fri, 30 Nov 2018 16:40:47 +0000 (UTC) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: NeilBrown It was observed that a process blocked indefintely in __fscache_read_or_alloc_page(), waiting for FSCACHE_COOKIE_LOOKING_UP to be cleared via fscache_wait_for_deferred_lookup(). At this time, ->backing_objects was empty, which would normaly prevent __fscache_read_or_alloc_page() from getting to the point of waiting. This implies that ->backing_objects was cleared *after* __fscache_read_or_alloc_page was was entered. When an object is "killed" and then "dropped", FSCACHE_COOKIE_LOOKING_UP is cleared in fscache_lookup_failure(), then KILL_OBJECT and DROP_OBJECT are "called" and only in DROP_OBJECT is ->backing_objects cleared. This leaves a window where something else can set FSCACHE_COOKIE_LOOKING_UP and __fscache_read_or_alloc_page() can start waiting, before ->backing_objects is cleared There is some uncertainty in this analysis, but it seems to be fit the observations. Adding the wake in this patch will be handled correctly by __fscache_read_or_alloc_page(), as it checks if ->backing_objects is empty again, after waiting. Customer which reported the hang, also report that the hang cannot be reproduced with this fix. The backtrace for the blocked process looked like: PID: 29360 TASK: ffff881ff2ac0f80 CPU: 3 COMMAND: "zsh" #0 [ffff881ff43efbf8] schedule at ffffffff815e56f1 #1 [ffff881ff43efc58] bit_wait at ffffffff815e64ed #2 [ffff881ff43efc68] __wait_on_bit at ffffffff815e61b8 #3 [ffff881ff43efca0] out_of_line_wait_on_bit at ffffffff815e625e #4 [ffff881ff43efd08] fscache_wait_for_deferred_lookup at ffffffffa04f2e8f [fscache] #5 [ffff881ff43efd18] __fscache_read_or_alloc_page at ffffffffa04f2ffe [fscache] #6 [ffff881ff43efd58] __nfs_readpage_from_fscache at ffffffffa0679668 [nfs] #7 [ffff881ff43efd78] nfs_readpage at ffffffffa067092b [nfs] #8 [ffff881ff43efda0] generic_file_read_iter at ffffffff81187a73 #9 [ffff881ff43efe50] nfs_file_read at ffffffffa066544b [nfs] #10 [ffff881ff43efe70] __vfs_read at ffffffff811fc756 #11 [ffff881ff43efee8] vfs_read at ffffffff811fccfa #12 [ffff881ff43eff18] sys_read at ffffffff811fda62 #13 [ffff881ff43eff50] entry_SYSCALL_64_fastpath at ffffffff815e986e Signed-off-by: NeilBrown Signed-off-by: David Howells --- fs/fscache/object.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/fs/fscache/object.c b/fs/fscache/object.c index 9edc920f651f..6d9cb1719de5 100644 --- a/fs/fscache/object.c +++ b/fs/fscache/object.c @@ -730,6 +730,9 @@ static const struct fscache_state *fscache_drop_object(struct fscache_object *ob if (awaken) wake_up_bit(&cookie->flags, FSCACHE_COOKIE_INVALIDATING); + if (test_and_clear_bit(FSCACHE_COOKIE_LOOKING_UP, &cookie->flags)) + wake_up_bit(&cookie->flags, FSCACHE_COOKIE_LOOKING_UP); + /* Prevent a race with our last child, which has to signal EV_CLEARED * before dropping our spinlock. From patchwork Fri Nov 30 16:40:52 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10706717 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 204DE1057 for ; Fri, 30 Nov 2018 16:41:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0F2E5303EA for ; Fri, 30 Nov 2018 16:41:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 03B1330413; Fri, 30 Nov 2018 16:40:59 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 70081303EA for ; Fri, 30 Nov 2018 16:40:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727143AbeLADuv (ORCPT ); Fri, 30 Nov 2018 22:50:51 -0500 Received: from mx1.redhat.com ([209.132.183.28]:46718 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726663AbeLADuv (ORCPT ); Fri, 30 Nov 2018 22:50:51 -0500 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 6C8653F3B4; Fri, 30 Nov 2018 16:40:58 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-120-113.rdu2.redhat.com [10.10.120.113]) by smtp.corp.redhat.com (Postfix) with ESMTP id E4EDB6012E; Fri, 30 Nov 2018 16:40:53 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 5/7] cachefiles: Explicitly cast enumerated type in put_object From: David Howells To: torvalds@linux-foundation.org Cc: Nick Desaulniers , Nathan Chancellor , dhowells@redhat.com, linux-cachefs@redhat.com, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Fri, 30 Nov 2018 16:40:52 +0000 Message-ID: <154359605275.18703.13154398219288238138.stgit@warthog.procyon.org.uk> In-Reply-To: <154359597927.18703.424445427183570230.stgit@warthog.procyon.org.uk> References: <154359597927.18703.424445427183570230.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Fri, 30 Nov 2018 16:40:58 +0000 (UTC) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Nathan Chancellor Clang warns when one enumerated type is implicitly converted to another. fs/cachefiles/namei.c:247:50: warning: implicit conversion from enumeration type 'enum cachefiles_obj_ref_trace' to different enumeration type 'enum fscache_obj_ref_trace' [-Wenum-conversion] cache->cache.ops->put_object(&xobject->fscache, cachefiles_obj_put_wait_retry); Silence this warning by explicitly casting to fscache_obj_ref_trace, which is also done in put_object. Reported-by: Nick Desaulniers Signed-off-by: Nathan Chancellor Signed-off-by: David Howells --- fs/cachefiles/namei.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c index 95983c744164..5ab411d4bc59 100644 --- a/fs/cachefiles/namei.c +++ b/fs/cachefiles/namei.c @@ -244,11 +244,13 @@ static int cachefiles_mark_object_active(struct cachefiles_cache *cache, ASSERT(!test_bit(CACHEFILES_OBJECT_ACTIVE, &xobject->flags)); - cache->cache.ops->put_object(&xobject->fscache, cachefiles_obj_put_wait_retry); + cache->cache.ops->put_object(&xobject->fscache, + (enum fscache_obj_ref_trace)cachefiles_obj_put_wait_retry); goto try_again; requeue: - cache->cache.ops->put_object(&xobject->fscache, cachefiles_obj_put_wait_timeo); + cache->cache.ops->put_object(&xobject->fscache, + (enum fscache_obj_ref_trace)cachefiles_obj_put_wait_timeo); _leave(" = -ETIMEDOUT"); return -ETIMEDOUT; } From patchwork Fri Nov 30 16:41:03 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10706719 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E94C21057 for ; Fri, 30 Nov 2018 16:41:15 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D92423041C for ; Fri, 30 Nov 2018 16:41:15 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CCE6730413; Fri, 30 Nov 2018 16:41:15 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7D494303E9 for ; Fri, 30 Nov 2018 16:41:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726881AbeLADvC (ORCPT ); Fri, 30 Nov 2018 22:51:02 -0500 Received: from mx1.redhat.com ([209.132.183.28]:46786 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726647AbeLADvC (ORCPT ); Fri, 30 Nov 2018 22:51:02 -0500 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id CA1A653F5; Fri, 30 Nov 2018 16:41:08 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-120-113.rdu2.redhat.com [10.10.120.113]) by smtp.corp.redhat.com (Postfix) with ESMTP id 52C715D9CD; Fri, 30 Nov 2018 16:41:04 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 6/7] cachefiles: avoid deprecated get_seconds() From: David Howells To: torvalds@linux-foundation.org Cc: Arnd Bergmann , dhowells@redhat.com, linux-cachefs@redhat.com, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Fri, 30 Nov 2018 16:41:03 +0000 Message-ID: <154359606362.18703.14445019258226218319.stgit@warthog.procyon.org.uk> In-Reply-To: <154359597927.18703.424445427183570230.stgit@warthog.procyon.org.uk> References: <154359597927.18703.424445427183570230.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Fri, 30 Nov 2018 16:41:08 +0000 (UTC) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Arnd Bergmann get_seconds() returns an unsigned long can overflow on some architectures and is deprecated because of that. In cachefs, we cast that number to a a 32-bit integer, which will overflow in year 2106 on all architectures. As confirmed by David Howells, the overflow probably isn't harmful in the end, since the timestamps are only used to make the file names unique, but they don't strictly have to be in monotonically increasing order since the files only exist in order to be deleted as quickly as possible. Moving to ktime_get_real_seconds() avoids the deprecated interface. Signed-off-by: Arnd Bergmann Signed-off-by: David Howells --- fs/cachefiles/namei.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c index 5ab411d4bc59..1645fcfd9691 100644 --- a/fs/cachefiles/namei.c +++ b/fs/cachefiles/namei.c @@ -338,7 +338,7 @@ static int cachefiles_bury_object(struct cachefiles_cache *cache, try_again: /* first step is to make up a grave dentry in the graveyard */ sprintf(nbuffer, "%08x%08x", - (uint32_t) get_seconds(), + (uint32_t) ktime_get_real_seconds(), (uint32_t) atomic_inc_return(&cache->gravecounter)); /* do the multiway lock magic */ From patchwork Fri Nov 30 16:41:14 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 10706721 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A776014E2 for ; Fri, 30 Nov 2018 16:41:20 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 98C8530405 for ; Fri, 30 Nov 2018 16:41:20 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 86EFA30428; Fri, 30 Nov 2018 16:41:20 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1D6B730407 for ; Fri, 30 Nov 2018 16:41:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726830AbeLADvM (ORCPT ); Fri, 30 Nov 2018 22:51:12 -0500 Received: from mx1.redhat.com ([209.132.183.28]:46412 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726647AbeLADvM (ORCPT ); Fri, 30 Nov 2018 22:51:12 -0500 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 3DBA1C074EE8; Fri, 30 Nov 2018 16:41:19 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-120-113.rdu2.redhat.com [10.10.120.113]) by smtp.corp.redhat.com (Postfix) with ESMTP id B147817253; Fri, 30 Nov 2018 16:41:14 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 7/7] fscache, cachefiles: remove redundant variable 'cache' From: David Howells To: torvalds@linux-foundation.org Cc: Colin Ian King , dhowells@redhat.com, linux-cachefs@redhat.com, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Fri, 30 Nov 2018 16:41:14 +0000 Message-ID: <154359607401.18703.8945342302168543879.stgit@warthog.procyon.org.uk> In-Reply-To: <154359597927.18703.424445427183570230.stgit@warthog.procyon.org.uk> References: <154359597927.18703.424445427183570230.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Fri, 30 Nov 2018 16:41:19 +0000 (UTC) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Colin Ian King Variable 'cache' is being assigned but is never used hence it is redundant and can be removed. Cleans up clang warning: warning: variable 'cache' set but not used [-Wunused-but-set-variable] Signed-off-by: Colin Ian King Signed-off-by: David Howells --- fs/cachefiles/rdwr.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c index db233588a69a..8a577409d030 100644 --- a/fs/cachefiles/rdwr.c +++ b/fs/cachefiles/rdwr.c @@ -968,11 +968,8 @@ void cachefiles_uncache_page(struct fscache_object *_object, struct page *page) __releases(&object->fscache.cookie->lock) { struct cachefiles_object *object; - struct cachefiles_cache *cache; object = container_of(_object, struct cachefiles_object, fscache); - cache = container_of(object->fscache.cache, - struct cachefiles_cache, cache); _enter("%p,{%lu}", object, page->index);