From patchwork Mon Feb 15 00:15:22 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff Layton
X-Patchwork-Id: 8307281
From: Jeff Layton
X-Google-Original-From: Jeff Layton
To: Andrew Morton
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	"Paul E. McKenney", Jan Kara, Eric Paris, Eryu Guan
Subject: [PATCH 1/2] Revert "fsnotify: destroy marks with call_srcu instead of dedicated thread"
Date: Sun, 14 Feb 2016 19:15:22 -0500
Message-Id: <1455495323-29605-1-git-send-email-jeff.layton@primarydata.com>
X-Mailer: git-send-email 2.5.0
X-Mailing-List: linux-fsdevel@vger.kernel.org

This reverts commit c510eff6bebaa244e577b8f499e86606b5e5d4c7.

Eryu reported that he was seeing some OOM kills kick in when running a
testcase that adds and removes inotify marks on a file in a tight loop.

The above commit changed the code to use call_srcu to clean up the
marks.
While that does (in principle) work, the srcu callback job is limited
to cleaning up entries in small batches and only once per jiffy. It's
easily possible to overwhelm that machinery with too many call_srcu
callbacks, and Eryu's reproducer did just that.

There's also another potential problem with using call_srcu here. While
you can obviously sleep while holding the srcu_read_lock, the callbacks
run under local_bh_disable, so you can't sleep there. It's possible when
putting the last reference to the fsnotify_mark that we'll end up
putting a chain of references including the fsnotify_group, uid, and
associated keys. While I don't see any obvious ways that that could
occur, it's probably still best to avoid using call_srcu here after all.

This patch reverts the above patch. A later patch will take a different
approach to eliminate the dedicated thread here.

Cc: Jan Kara
Cc: Eric Paris
Cc: Andrew Morton
Cc: Eryu Guan
Signed-off-by: Jeff Layton
---
 fs/notify/mark.c                 | 66 +++++++++++++++++++++++++++++++---------
 include/linux/fsnotify_backend.h |  5 +--
 2 files changed, 53 insertions(+), 18 deletions(-)

diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index cfcbf114676e..fc0df4442f7b 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -92,6 +92,9 @@
 #include "fsnotify.h"
 
 struct srcu_struct fsnotify_mark_srcu;
+static DEFINE_SPINLOCK(destroy_lock);
+static LIST_HEAD(destroy_list);
+static DECLARE_WAIT_QUEUE_HEAD(destroy_waitq);
 
 void fsnotify_get_mark(struct fsnotify_mark *mark)
 {
@@ -165,19 +168,10 @@ void fsnotify_detach_mark(struct fsnotify_mark *mark)
 	atomic_dec(&group->num_marks);
 }
 
-static void
-fsnotify_mark_free_rcu(struct rcu_head *rcu)
-{
-	struct fsnotify_mark *mark;
-
-	mark = container_of(rcu, struct fsnotify_mark, g_rcu);
-	fsnotify_put_mark(mark);
-}
-
 /*
- * Free fsnotify mark. The freeing is actually happening from a call_srcu
- * callback. Caller must have a reference to the mark or be protected by
- * fsnotify_mark_srcu.
+ * Free fsnotify mark. The freeing is actually happening from a kthread which
+ * first waits for srcu period end. Caller must have a reference to the mark
+ * or be protected by fsnotify_mark_srcu.
  */
 void fsnotify_free_mark(struct fsnotify_mark *mark)
 {
@@ -192,7 +186,10 @@ void fsnotify_free_mark(struct fsnotify_mark *mark)
 	mark->flags &= ~FSNOTIFY_MARK_FLAG_ALIVE;
 	spin_unlock(&mark->lock);
 
-	call_srcu(&fsnotify_mark_srcu, &mark->g_rcu, fsnotify_mark_free_rcu);
+	spin_lock(&destroy_lock);
+	list_add(&mark->g_list, &destroy_list);
+	spin_unlock(&destroy_lock);
+	wake_up(&destroy_waitq);
 
 	/*
 	 * Some groups like to know that marks are being freed. This is a
@@ -388,7 +385,11 @@ err:
 
 	spin_unlock(&mark->lock);
 
-	call_srcu(&fsnotify_mark_srcu, &mark->g_rcu, fsnotify_mark_free_rcu);
+	spin_lock(&destroy_lock);
+	list_add(&mark->g_list, &destroy_list);
+	spin_unlock(&destroy_lock);
+	wake_up(&destroy_waitq);
+
 	return ret;
 }
 
@@ -491,3 +492,40 @@ void fsnotify_init_mark(struct fsnotify_mark *mark,
 	atomic_set(&mark->refcnt, 1);
 	mark->free_mark = free_mark;
 }
+
+static int fsnotify_mark_destroy(void *ignored)
+{
+	struct fsnotify_mark *mark, *next;
+	struct list_head private_destroy_list;
+
+	for (;;) {
+		spin_lock(&destroy_lock);
+		/* exchange the list head */
+		list_replace_init(&destroy_list, &private_destroy_list);
+		spin_unlock(&destroy_lock);
+
+		synchronize_srcu(&fsnotify_mark_srcu);
+
+		list_for_each_entry_safe(mark, next, &private_destroy_list, g_list) {
+			list_del_init(&mark->g_list);
+			fsnotify_put_mark(mark);
+		}
+
+		wait_event_interruptible(destroy_waitq, !list_empty(&destroy_list));
+	}
+
+	return 0;
+}
+
+static int __init fsnotify_mark_init(void)
+{
+	struct task_struct *thread;
+
+	thread = kthread_run(fsnotify_mark_destroy, NULL,
+			     "fsnotify_mark");
+	if (IS_ERR(thread))
+		panic("unable to start fsnotify mark destruction thread.");
+
+	return 0;
+}
+device_initcall(fsnotify_mark_init);
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 6b7e89f45aa4..533c4408529a 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -220,10 +220,7 @@ struct fsnotify_mark {
 	/* List of marks by group->i_fsnotify_marks. Also reused for queueing
 	 * mark into destroy_list when it's waiting for the end of SRCU period
 	 * before it can be freed. [group->mark_mutex] */
-	union {
-		struct list_head g_list;
-		struct rcu_head g_rcu;
-	};
+	struct list_head g_list;
 	/* Protects inode / mnt pointers, flags, masks */
 	spinlock_t lock;
 	/* List of marks for inode / vfsmount [obj_lock] */