From patchwork Tue Jul 6 13:52:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Henriques X-Patchwork-Id: 12360703 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNPARSEABLE_RELAY, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 072BAC07E96 for ; Tue, 6 Jul 2021 15:14:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E0AFB61983 for ; Tue, 6 Jul 2021 15:14:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231779AbhGFPR3 (ORCPT ); Tue, 6 Jul 2021 11:17:29 -0400 Received: from smtp-out2.suse.de ([195.135.220.29]:33716 "EHLO smtp-out2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231266AbhGFPR2 (ORCPT ); Tue, 6 Jul 2021 11:17:28 -0400 Received: from imap1.suse-dmz.suse.de (imap1.suse-dmz.suse.de [192.168.254.73]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 5A0791FF81; Tue, 6 Jul 2021 13:52:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1625579564; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=K2OyEQ2NjpZQXif7OzBbQKABUSVbIXRvJoMQ9p8XetM=; b=vdQMyIREkOOz7ms/RwsWo5wGnIow/IxbhLmeP9yRtZ5oV6JJDf3B6tDbrcHpoPeJ6Hw0lm aCX8Fxady6vp3KkRp1NwSJrS4K17EEnzT3zBeUoqBnznn1lhz7e1M7BtrnTfaCrVZYuK8/ yZJHMqdLa+rwzd+JPMj6YiuCdw+feuA= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1625579564; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=K2OyEQ2NjpZQXif7OzBbQKABUSVbIXRvJoMQ9p8XetM=; b=lCCXhzffDA8aeMVYB+Rzr6SR7IIh+AIvoPZOWKzjLryndmuff8gqy1OFlfo2nn9bJv5B26 unvM/pNT3IxOFOCA== Received: from imap1.suse-dmz.suse.de (imap1.suse-dmz.suse.de [192.168.254.73]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap1.suse-dmz.suse.de (Postfix) with ESMTPS id EFFA5133D0; Tue, 6 Jul 2021 13:52:43 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap1.suse-dmz.suse.de with ESMTPSA id OBhfNytg5GCPKAAAGKfGzw (envelope-from ); Tue, 06 Jul 2021 13:52:43 +0000 Received: from localhost (brahms [local]) by brahms (OpenSMTPD) with ESMTPA id fca8c589; Tue, 6 Jul 2021 13:52:42 +0000 (UTC) From: Luis Henriques To: Jeff Layton , Ilya Dryomov Cc: ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org, Luis Henriques Subject: [PATCH v3 1/2] ceph: allow schedule_delayed() callers to set delay for workqueue Date: Tue, 6 Jul 2021 14:52:41 +0100 Message-Id: <20210706135242.9978-2-lhenriques@suse.de> In-Reply-To: <20210706135242.9978-1-lhenriques@suse.de> References: <20210706135242.9978-1-lhenriques@suse.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Allow schedule_delayed() callers to explicitly set the delay value instead of defaulting to a 5 secs value. Signed-off-by: Luis Henriques --- fs/ceph/mds_client.c | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index e5af591d3bd4..f5dc58a05f9f 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -4502,13 +4502,19 @@ void inc_session_sequence(struct ceph_mds_session *s) } /* - * delayed work -- periodically trim expired leases, renew caps with mds + * delayed work -- periodically trim expired leases, renew caps with mds. If + * the @delay parameter is set to 0 or if it's more than 5 secs, the default + * workqueue delay value of 5 secs will be used. */ -static void schedule_delayed(struct ceph_mds_client *mdsc) +static void schedule_delayed(struct ceph_mds_client *mdsc, unsigned long delay) { - int delay = 5; - unsigned hz = round_jiffies_relative(HZ * delay); - schedule_delayed_work(&mdsc->delayed_work, hz); + unsigned long max_delay = HZ * 5; + + /* 5 secs default delay */ + if (!delay || (delay > max_delay)) + delay = max_delay; + schedule_delayed_work(&mdsc->delayed_work, + round_jiffies_relative(delay)); } static void delayed_work(struct work_struct *work) @@ -4565,7 +4571,7 @@ static void delayed_work(struct work_struct *work) maybe_recover_session(mdsc); - schedule_delayed(mdsc); + schedule_delayed(mdsc, 0); } int ceph_mdsc_init(struct ceph_fs_client *fsc) @@ -5042,7 +5048,7 @@ void ceph_mdsc_handle_mdsmap(struct ceph_mds_client *mdsc, struct ceph_msg *msg) mdsc->mdsmap->m_epoch); mutex_unlock(&mdsc->mutex); - schedule_delayed(mdsc); + schedule_delayed(mdsc, 0); return; bad_unlock: From patchwork Tue Jul 6 13:52:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Henriques X-Patchwork-Id: 12360707 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNPARSEABLE_RELAY, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3DFBDC07E96 for ; Tue, 6 Jul 2021 15:14:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 26BC061A0F for ; Tue, 6 Jul 2021 15:14:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232287AbhGFPRc (ORCPT ); Tue, 6 Jul 2021 11:17:32 -0400 Received: from smtp-out2.suse.de ([195.135.220.29]:33768 "EHLO smtp-out2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231422AbhGFPR3 (ORCPT ); Tue, 6 Jul 2021 11:17:29 -0400 Received: from imap1.suse-dmz.suse.de (imap1.suse-dmz.suse.de [192.168.254.73]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id E54631FF83; Tue, 6 Jul 2021 13:52:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1625579564; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Q5f1LbCjS19naTSoFESy0vSf6+nQCjxcCvLGSFudqS0=; b=Qdua2UOSBG514HrgXVAdYTdFHv7+oPhHQ0sjPvbOz2Z7xwbJcyQmtuPQ+7zubd5U2IgR8j DkiaBuoLTF79f2vSw8eIO86rholBVgv/+auCqSvUknZRXkQzITIHwZEPpLzOwTcuUDpKmp a8ZWUI1NsOaj9wjPNlxpsGBZZ7+rgAA= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1625579564; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Q5f1LbCjS19naTSoFESy0vSf6+nQCjxcCvLGSFudqS0=; b=+D5BDnA47QVUw1qPePfb8jdDeS7x5g4zbKe1K8WzZ+jsVOyS7MibfsEC+LNFXe9+osVx+G Eo3IVICa1W1H39Dw== Received: from imap1.suse-dmz.suse.de (imap1.suse-dmz.suse.de [192.168.254.73]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap1.suse-dmz.suse.de (Postfix) with ESMTPS id 6FC2F133D0; Tue, 6 Jul 2021 13:52:44 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap1.suse-dmz.suse.de with ESMTPSA id ANkxGCxg5GCPKAAAGKfGzw (envelope-from ); Tue, 06 Jul 2021 13:52:44 +0000 Received: from localhost (brahms [local]) by brahms (OpenSMTPD) with ESMTPA id 1835e700; Tue, 6 Jul 2021 13:52:43 +0000 (UTC) From: Luis Henriques To: Jeff Layton , Ilya Dryomov Cc: ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org, Luis Henriques , stable@vger.kernel.org Subject: [PATCH v3 2/2] ceph: reduce contention in ceph_check_delayed_caps() Date: Tue, 6 Jul 2021 14:52:42 +0100 Message-Id: <20210706135242.9978-3-lhenriques@suse.de> In-Reply-To: <20210706135242.9978-1-lhenriques@suse.de> References: <20210706135242.9978-1-lhenriques@suse.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Function ceph_check_delayed_caps() is called from the mdsc->delayed_work workqueue and it can be kept looping for quite some time if caps keep being added back to the mdsc->cap_delay_list. This may result in the watchdog tainting the kernel with the softlockup flag. This patch breaks this loop if the caps have been recently (i.e. during the loop execution). Any new caps added to the list will be handled in the next run. Cc: stable@vger.kernel.org Link: https://tracker.ceph.com/issues/46284 Signed-off-by: Luis Henriques --- fs/ceph/caps.c | 17 ++++++++++++++++- fs/ceph/mds_client.c | 7 ++++--- fs/ceph/super.h | 2 +- 3 files changed, 21 insertions(+), 5 deletions(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index a5e93b185515..c79b8dff25d7 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -4224,11 +4224,19 @@ void ceph_handle_caps(struct ceph_mds_session *session, /* * Delayed work handler to process end of delayed cap release LRU list. + * + * If new caps are added to the list while processing it, these won't get + * processed in this run. In this case, the ci->i_hold_caps_max will be + * returned so that the work can be scheduled accordingly. */ -void ceph_check_delayed_caps(struct ceph_mds_client *mdsc) +unsigned long ceph_check_delayed_caps(struct ceph_mds_client *mdsc) { struct inode *inode; struct ceph_inode_info *ci; + struct ceph_mount_options *opt = mdsc->fsc->mount_options; + unsigned long delay_max = opt->caps_wanted_delay_max * HZ; + unsigned long loop_start = jiffies; + unsigned long delay = 0; dout("check_delayed_caps\n"); spin_lock(&mdsc->cap_delay_lock); @@ -4236,6 +4244,11 @@ void ceph_check_delayed_caps(struct ceph_mds_client *mdsc) ci = list_first_entry(&mdsc->cap_delay_list, struct ceph_inode_info, i_cap_delay_list); + if (time_before(loop_start, ci->i_hold_caps_max - delay_max)) { + dout("%s caps added recently. Exiting loop", __func__); + delay = ci->i_hold_caps_max; + break; + } if ((ci->i_ceph_flags & CEPH_I_FLUSH) == 0 && time_before(jiffies, ci->i_hold_caps_max)) break; @@ -4252,6 +4265,8 @@ void ceph_check_delayed_caps(struct ceph_mds_client *mdsc) } } spin_unlock(&mdsc->cap_delay_lock); + + return delay; } /* diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index f5dc58a05f9f..a6f985786d68 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -4519,11 +4519,12 @@ static void schedule_delayed(struct ceph_mds_client *mdsc, unsigned long delay) static void delayed_work(struct work_struct *work) { - int i; struct ceph_mds_client *mdsc = container_of(work, struct ceph_mds_client, delayed_work.work); + unsigned long delay; int renew_interval; int renew_caps; + int i; dout("mdsc delayed_work\n"); @@ -4563,7 +4564,7 @@ static void delayed_work(struct work_struct *work) } mutex_unlock(&mdsc->mutex); - ceph_check_delayed_caps(mdsc); + delay = ceph_check_delayed_caps(mdsc); ceph_queue_cap_reclaim_work(mdsc); @@ -4571,7 +4572,7 @@ static void delayed_work(struct work_struct *work) maybe_recover_session(mdsc); - schedule_delayed(mdsc, 0); + schedule_delayed(mdsc, delay); } int ceph_mdsc_init(struct ceph_fs_client *fsc) diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 839e6b0239ee..3b5207c82767 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -1170,7 +1170,7 @@ extern void ceph_flush_snaps(struct ceph_inode_info *ci, extern bool __ceph_should_report_size(struct ceph_inode_info *ci); extern void ceph_check_caps(struct ceph_inode_info *ci, int flags, struct ceph_mds_session *session); -extern void ceph_check_delayed_caps(struct ceph_mds_client *mdsc); +extern unsigned long ceph_check_delayed_caps(struct ceph_mds_client *mdsc); extern void ceph_flush_dirty_caps(struct ceph_mds_client *mdsc); extern int ceph_drop_caps_for_unlink(struct inode *inode); extern int ceph_encode_inode_release(void **p, struct inode *inode,