From patchwork Thu Jun 29 18:42:30 2023
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13297200
Subject: [PATCH RFC 1/8] SUNRPC: Deduplicate thread wake-up code
From: Chuck Lever
To: linux-nfs@vger.kernel.org
Cc: Chuck Lever, lorenzo@kernel.org, neilb@suse.de, jlayton@redhat.com,
    david@fromorbit.com
Date: Thu, 29 Jun 2023 14:42:30 -0400
Message-ID: <168806415041.1034990.11822594910002824781.stgit@morisot.1015granger.net>
In-Reply-To: <168806401782.1034990.9686296943273298604.stgit@morisot.1015granger.net>
References: <168806401782.1034990.9686296943273298604.stgit@morisot.1015granger.net>

From: Chuck Lever

Refactor: Extract the loop that finds an idle service thread from
svc_xprt_enqueue() and svc_wake_up().

Signed-off-by: Chuck Lever
---
 include/linux/sunrpc/svc.h |    1 +
 net/sunrpc/svc.c           |   28 +++++++++++++++++++++++++++
 net/sunrpc/svc_xprt.c      |   46 +++++++++++++-------------------------------
 3 files changed, 43 insertions(+), 32 deletions(-)

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index f8751118c122..dc2d90a655e2 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -427,6 +427,7 @@ int svc_register(const struct svc_serv *, struct net *, const int,
 void		   svc_wake_up(struct svc_serv *);
 void		   svc_reserve(struct svc_rqst *rqstp, int space);
+struct svc_rqst	*svc_pool_wake_idle_thread(struct svc_pool *pool);
 struct svc_pool *svc_pool_for_cpu(struct svc_serv *serv);
 char *		   svc_print_addr(struct svc_rqst *, char *, size_t);
 const char *	   svc_proc_name(const struct svc_rqst *rqstp);
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 587811a002c9..e81ce5f76abd 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -689,6 +689,34 @@ svc_prepare_thread(struct svc_serv *serv, struct svc_pool *pool, int node)
 	return rqstp;
 }
 
+/**
+ * svc_pool_wake_idle_thread - wake an idle thread in @pool
+ * @pool: service thread pool
+ *
+ * Returns an idle service thread (now marked BUSY), or NULL
+ * if no service threads are available. Finding an idle service
+ * thread and marking it BUSY is atomic with respect to other
+ * calls to svc_pool_wake_idle_thread().
+ */
+struct svc_rqst *svc_pool_wake_idle_thread(struct svc_pool *pool)
+{
+	struct svc_rqst	*rqstp;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(rqstp, &pool->sp_all_threads, rq_all) {
+		if (test_and_set_bit(RQ_BUSY, &rqstp->rq_flags))
+			continue;
+
+		rcu_read_unlock();
+		WRITE_ONCE(rqstp->rq_qtime, ktime_get());
+		wake_up_process(rqstp->rq_task);
+		percpu_counter_inc(&pool->sp_threads_woken);
+		return rqstp;
+	}
+	rcu_read_unlock();
+	return NULL;
+}
+
 /*
  * Choose a pool in which to create a new thread, for svc_set_num_threads
  */
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 62c7919ea610..f14476d11b67 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -455,8 +455,8 @@ static bool svc_xprt_ready(struct svc_xprt *xprt)
  */
 void svc_xprt_enqueue(struct svc_xprt *xprt)
 {
+	struct svc_rqst	*rqstp;
 	struct svc_pool *pool;
-	struct svc_rqst	*rqstp = NULL;
 
 	if (!svc_xprt_ready(xprt))
 		return;
@@ -476,20 +476,10 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
 	list_add_tail(&xprt->xpt_ready, &pool->sp_sockets);
 	spin_unlock_bh(&pool->sp_lock);
 
-	/* find a thread for this xprt */
-	rcu_read_lock();
-	list_for_each_entry_rcu(rqstp, &pool->sp_all_threads, rq_all) {
-		if (test_and_set_bit(RQ_BUSY, &rqstp->rq_flags))
-			continue;
-		percpu_counter_inc(&pool->sp_threads_woken);
-		rqstp->rq_qtime = ktime_get();
-		wake_up_process(rqstp->rq_task);
-		goto out_unlock;
-	}
-	set_bit(SP_CONGESTED, &pool->sp_flags);
-	rqstp = NULL;
-out_unlock:
-	rcu_read_unlock();
+	rqstp = svc_pool_wake_idle_thread(pool);
+	if (!rqstp)
+		set_bit(SP_CONGESTED, &pool->sp_flags);
+
 	trace_svc_xprt_enqueue(xprt, rqstp);
 }
 EXPORT_SYMBOL_GPL(svc_xprt_enqueue);
@@ -581,7 +571,10 @@ static void svc_xprt_release(struct svc_rqst *rqstp)
 	svc_xprt_put(xprt);
 }
 
-/*
+/**
+ * svc_wake_up - Wake up a service thread for non-transport work
+ * @serv: RPC service
+ *
  * Some svc_serv's will have occasional work to do, even when a xprt is not
  * waiting to be serviced. This function is there to "kick" a task in one of
  * those services so that it can wake up and do that work. Note that we only
@@ -590,27 +583,16 @@ static void svc_xprt_release(struct svc_rqst *rqstp)
  */
 void svc_wake_up(struct svc_serv *serv)
 {
+	struct svc_pool *pool = &serv->sv_pools[0];
 	struct svc_rqst	*rqstp;
-	struct svc_pool *pool;
 
-	pool = &serv->sv_pools[0];
-
-	rcu_read_lock();
-	list_for_each_entry_rcu(rqstp, &pool->sp_all_threads, rq_all) {
-		/* skip any that aren't queued */
-		if (test_bit(RQ_BUSY, &rqstp->rq_flags))
-			continue;
-		rcu_read_unlock();
-		wake_up_process(rqstp->rq_task);
-		trace_svc_wake_up(rqstp->rq_task->pid);
+	rqstp = svc_pool_wake_idle_thread(pool);
+	if (!rqstp) {
+		set_bit(SP_TASK_PENDING, &pool->sp_flags);
 		return;
 	}
-	rcu_read_unlock();
 
-	/* No free entries available */
-	set_bit(SP_TASK_PENDING, &pool->sp_flags);
-	smp_wmb();
-	trace_svc_wake_up(0);
+	trace_svc_wake_up(rqstp->rq_task->pid);
 }
 EXPORT_SYMBOL_GPL(svc_wake_up);

From patchwork Thu Jun 29 18:42:37 2023
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13297201
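The heart of the extracted helper is an atomic claim step: each caller that races through the thread list can win any given thread at most once, because test_and_set_bit() is an atomic read-modify-write. Below is a minimal userspace sketch of that claim pattern using C11 atomics; `struct worker` and `claim_idle_worker` are illustrative names, not kernel APIs, and the RCU list walk is simplified to an array scan.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct svc_rqst with its RQ_BUSY flag. */
struct worker {
	atomic_flag busy;	/* analogous to test_and_set_bit(RQ_BUSY, ...) */
	int id;
};

/*
 * Sketch of the svc_pool_wake_idle_thread() claim step: scan the pool
 * and atomically mark the first idle worker busy.  Exactly one caller
 * can win each worker, because atomic_flag_test_and_set() is an atomic
 * read-modify-write, like test_and_set_bit() in the patch.
 */
struct worker *claim_idle_worker(struct worker *pool, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (atomic_flag_test_and_set(&pool[i].busy))
			continue;	/* already BUSY: keep scanning */
		return &pool[i];	/* claimed: caller would wake it */
	}
	return NULL;			/* no idle worker: pool congested */
}
```

In the kernel helper, the winning caller also records rq_qtime and calls wake_up_process(); the NULL return maps to the SP_CONGESTED / SP_TASK_PENDING fallbacks in the two call sites above.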
Subject: [PATCH RFC 2/8] SUNRPC: Report when no service thread is available.
From: Chuck Lever
To: linux-nfs@vger.kernel.org
Cc: Chuck Lever, lorenzo@kernel.org, neilb@suse.de, jlayton@redhat.com,
    david@fromorbit.com
Date: Thu, 29 Jun 2023 14:42:37 -0400
Message-ID: <168806415706.1034990.3299102237393518473.stgit@morisot.1015granger.net>
In-Reply-To: <168806401782.1034990.9686296943273298604.stgit@morisot.1015granger.net>
References: <168806401782.1034990.9686296943273298604.stgit@morisot.1015granger.net>

From: Chuck Lever

Count and record thread starvation. Administrators can take action
by increasing thread count or decreasing workload.

Signed-off-by: Chuck Lever
---
 include/linux/sunrpc/svc.h    |    5 +++-
 include/trace/events/sunrpc.h |   49 ++++++++++++++++++++++++++++++++-------
 net/sunrpc/svc.c              |    9 +++++++-
 net/sunrpc/svc_xprt.c         |   21 ++++++++++--------
 4 files changed, 64 insertions(+), 20 deletions(-)

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index dc2d90a655e2..fbfe6ea737c8 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -22,7 +22,6 @@
 #include
 
 /*
- *
  * RPC service thread pool.
  *
  * Pool of threads and temporary sockets. Generally there is only
@@ -42,6 +41,7 @@ struct svc_pool {
 	struct percpu_counter	sp_sockets_queued;
 	struct percpu_counter	sp_threads_woken;
 	struct percpu_counter	sp_threads_timedout;
+	struct percpu_counter	sp_threads_starved;
 
 #define	SP_TASK_PENDING		(0)	/* still work to do even if no
 					 * xprt is queued. */
@@ -427,7 +427,8 @@ int svc_register(const struct svc_serv *, struct net *, const int,
 void		   svc_wake_up(struct svc_serv *);
 void		   svc_reserve(struct svc_rqst *rqstp, int space);
-struct svc_rqst	*svc_pool_wake_idle_thread(struct svc_pool *pool);
+struct svc_rqst	*svc_pool_wake_idle_thread(struct svc_serv *serv,
+					   struct svc_pool *pool);
 struct svc_pool *svc_pool_for_cpu(struct svc_serv *serv);
 char *		   svc_print_addr(struct svc_rqst *, char *, size_t);
 const char *	   svc_proc_name(const struct svc_rqst *rqstp);
diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
index 69e42ef30979..9813f4560eef 100644
--- a/include/trace/events/sunrpc.h
+++ b/include/trace/events/sunrpc.h
@@ -1918,21 +1918,21 @@ TRACE_EVENT(svc_xprt_create_err,
 TRACE_EVENT(svc_xprt_enqueue,
 	TP_PROTO(
 		const struct svc_xprt *xprt,
-		const struct svc_rqst *rqst
+		const struct svc_rqst *wakee
 	),
 
-	TP_ARGS(xprt, rqst),
+	TP_ARGS(xprt, wakee),
 
 	TP_STRUCT__entry(
 		SVC_XPRT_ENDPOINT_FIELDS(xprt)
 
-		__field(int, pid)
+		__field(pid_t, pid)
 	),
 
 	TP_fast_assign(
 		SVC_XPRT_ENDPOINT_ASSIGNMENTS(xprt);
 
-		__entry->pid = rqst ? rqst->rq_task->pid : 0;
+		__entry->pid = wakee->rq_task->pid;
 	),
 
 	TP_printk(SVC_XPRT_ENDPOINT_FORMAT " pid=%d",
@@ -1963,6 +1963,39 @@ TRACE_EVENT(svc_xprt_dequeue,
 		SVC_XPRT_ENDPOINT_VARARGS, __entry->wakeup)
 );
 
+#define show_svc_pool_flags(x)						\
+	__print_flags(x, "|",						\
+		{ BIT(SP_TASK_PENDING),		"TASK_PENDING" },	\
+		{ BIT(SP_CONGESTED),		"CONGESTED" })
+
+TRACE_EVENT(svc_pool_starved,
+	TP_PROTO(
+		const struct svc_serv *serv,
+		const struct svc_pool *pool
+	),
+
+	TP_ARGS(serv, pool),
+
+	TP_STRUCT__entry(
+		__string(name, serv->sv_name)
+		__field(int, pool_id)
+		__field(unsigned int, nrthreads)
+		__field(unsigned long, flags)
+	),
+
+	TP_fast_assign(
+		__assign_str(name, serv->sv_name);
+		__entry->pool_id = pool->sp_id;
+		__entry->nrthreads = pool->sp_nrthreads;
+		__entry->flags = pool->sp_flags;
+	),
+
+	TP_printk("service=%s pool=%d flags=%s nrthreads=%u",
+		__get_str(name), __entry->pool_id,
+		show_svc_pool_flags(__entry->flags), __entry->nrthreads
+	)
+);
+
 DECLARE_EVENT_CLASS(svc_xprt_event,
 	TP_PROTO(
 		const struct svc_xprt *xprt
@@ -2033,16 +2066,16 @@ TRACE_EVENT(svc_xprt_accept,
 );
 
 TRACE_EVENT(svc_wake_up,
-	TP_PROTO(int pid),
+	TP_PROTO(const struct svc_rqst *wakee),
 
-	TP_ARGS(pid),
+	TP_ARGS(wakee),
 
 	TP_STRUCT__entry(
-		__field(int, pid)
+		__field(pid_t, pid)
 	),
 
 	TP_fast_assign(
-		__entry->pid = pid;
+		__entry->pid = wakee->rq_task->pid;
 	),
 
 	TP_printk("pid=%d", __entry->pid)
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index e81ce5f76abd..04151e22ec44 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -516,6 +516,7 @@ __svc_create(struct svc_program *prog, unsigned int bufsize, int npools,
 		percpu_counter_init(&pool->sp_sockets_queued, 0, GFP_KERNEL);
 		percpu_counter_init(&pool->sp_threads_woken, 0, GFP_KERNEL);
 		percpu_counter_init(&pool->sp_threads_timedout, 0, GFP_KERNEL);
+		percpu_counter_init(&pool->sp_threads_starved, 0, GFP_KERNEL);
 	}
 
 	return serv;
@@ -591,6 +592,7 @@ svc_destroy(struct kref *ref)
 		percpu_counter_destroy(&pool->sp_sockets_queued);
 		percpu_counter_destroy(&pool->sp_threads_woken);
 		percpu_counter_destroy(&pool->sp_threads_timedout);
+		percpu_counter_destroy(&pool->sp_threads_starved);
 	}
 	kfree(serv->sv_pools);
 	kfree(serv);
@@ -691,6 +693,7 @@ svc_prepare_thread(struct svc_serv *serv, struct svc_pool *pool, int node)
 
 /**
  * svc_pool_wake_idle_thread - wake an idle thread in @pool
+ * @serv: RPC service
  * @pool: service thread pool
  *
  * Returns an idle service thread (now marked BUSY), or NULL
@@ -698,7 +701,8 @@ svc_prepare_thread(struct svc_serv *serv, struct svc_pool *pool, int node)
  * thread and marking it BUSY is atomic with respect to other
  * calls to svc_pool_wake_idle_thread().
  */
-struct svc_rqst *svc_pool_wake_idle_thread(struct svc_pool *pool)
+struct svc_rqst *svc_pool_wake_idle_thread(struct svc_serv *serv,
+					   struct svc_pool *pool)
 {
 	struct svc_rqst	*rqstp;
 
@@ -714,6 +718,9 @@ struct svc_rqst *svc_pool_wake_idle_thread(struct svc_pool *pool)
 		return rqstp;
 	}
 	rcu_read_unlock();
+
+	trace_svc_pool_starved(serv, pool);
+	percpu_counter_inc(&pool->sp_threads_starved);
 	return NULL;
 }
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index f14476d11b67..859eecb7d52c 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -455,7 +455,7 @@ static bool svc_xprt_ready(struct svc_xprt *xprt)
  */
 void svc_xprt_enqueue(struct svc_xprt *xprt)
 {
-	struct svc_rqst *rqstp;
+	struct svc_rqst	*rqstp;
 	struct svc_pool *pool;
 
 	if (!svc_xprt_ready(xprt))
		return;
@@ -476,9 +476,11 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
 	list_add_tail(&xprt->xpt_ready, &pool->sp_sockets);
 	spin_unlock_bh(&pool->sp_lock);
 
-	rqstp = svc_pool_wake_idle_thread(pool);
-	if (!rqstp)
+	rqstp = svc_pool_wake_idle_thread(xprt->xpt_server, pool);
+	if (!rqstp) {
 		set_bit(SP_CONGESTED, &pool->sp_flags);
+		return;
+	}
 
 	trace_svc_xprt_enqueue(xprt, rqstp);
 }
@@ -584,15 +586,15 @@ static void svc_xprt_release(struct svc_rqst *rqstp)
 void svc_wake_up(struct svc_serv *serv)
 {
 	struct svc_pool *pool = &serv->sv_pools[0];
-	struct svc_rqst *rqstp;
+	struct svc_rqst	*rqstp;
 
-	rqstp = svc_pool_wake_idle_thread(pool);
+	rqstp = svc_pool_wake_idle_thread(serv, pool);
 	if (!rqstp) {
 		set_bit(SP_TASK_PENDING, &pool->sp_flags);
 		return;
 	}
 
-	trace_svc_wake_up(rqstp->rq_task->pid);
+	trace_svc_wake_up(rqstp);
 }
 EXPORT_SYMBOL_GPL(svc_wake_up);
 
@@ -1434,16 +1436,17 @@ static int svc_pool_stats_show(struct seq_file *m, void *p)
 	struct svc_pool *pool = p;
 
 	if (p == SEQ_START_TOKEN) {
-		seq_puts(m, "# pool packets-arrived sockets-enqueued threads-woken threads-timedout\n");
+		seq_puts(m, "# pool packets-arrived xprts-enqueued threads-woken threads-timedout starved\n");
 		return 0;
 	}
 
-	seq_printf(m, "%u %llu %llu %llu %llu\n",
+	seq_printf(m, "%u %llu %llu %llu %llu %llu\n",
 		pool->sp_id,
 		percpu_counter_sum_positive(&pool->sp_sockets_queued),
 		percpu_counter_sum_positive(&pool->sp_sockets_queued),
 		percpu_counter_sum_positive(&pool->sp_threads_woken),
-		percpu_counter_sum_positive(&pool->sp_threads_timedout));
+		percpu_counter_sum_positive(&pool->sp_threads_timedout),
+		percpu_counter_sum_positive(&pool->sp_threads_starved));
 
 	return 0;
 }

From patchwork Thu Jun 29 18:42:43 2023
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13297202
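The new sp_threads_starved field follows the same percpu_counter pattern as the other pool statistics: increments touch only a local shard, and the shards are summed only when /proc/fs/nfsd/pool_stats is read. A minimal userspace sketch of that sharded-counter idea, assuming illustrative names (`struct sharded_counter` is not a kernel type, and real percpu counters batch into a central count rather than using a plain per-CPU array):

```c
#include <assert.h>
#include <stddef.h>

#define NR_SHARDS 4	/* stands in for the number of CPUs */

/* Illustrative stand-in for struct percpu_counter: one shard per CPU,
 * incremented locally, summed only when statistics are reported. */
struct sharded_counter {
	long shard[NR_SHARDS];
};

static void counter_inc(struct sharded_counter *c, int cpu)
{
	c->shard[cpu % NR_SHARDS]++;	/* no cross-CPU contention on the hot path */
}

/* Mirrors percpu_counter_sum_positive(): sum all shards, clamp at zero. */
static long counter_sum_positive(const struct sharded_counter *c)
{
	long sum = 0;

	for (size_t i = 0; i < NR_SHARDS; i++)
		sum += c->shard[i];
	return sum > 0 ? sum : 0;
}
```

The design choice matters here because starvation is counted on the enqueue hot path: a single shared counter would add cache-line contention exactly when the pool is already overloaded.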
Subject: [PATCH RFC 3/8] SUNRPC: Split the svc_xprt_dequeue tracepoint
From: Chuck Lever
To: linux-nfs@vger.kernel.org
Cc: Chuck Lever, lorenzo@kernel.org, neilb@suse.de, jlayton@redhat.com,
    david@fromorbit.com
Date: Thu, 29 Jun 2023 14:42:43 -0400
Message-ID: <168806416357.1034990.16815431273227880388.stgit@morisot.1015granger.net>
In-Reply-To: <168806401782.1034990.9686296943273298604.stgit@morisot.1015granger.net>
References: <168806401782.1034990.9686296943273298604.stgit@morisot.1015granger.net>

From: Chuck Lever

Distinguish between the case where new work was picked up just by
looking at the transport queue versus when the thread was awoken. This
gives us better visibility about how well-utilized the thread pool is.

Signed-off-by: Chuck Lever
---
 include/trace/events/sunrpc.h |   48 +++++++++++++++++++++++++++++++----------
 net/sunrpc/svc_xprt.c         |    9 +++++---
 2 files changed, 42 insertions(+), 15 deletions(-)

diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
index 9813f4560eef..cf3d404ca6d8 100644
--- a/include/trace/events/sunrpc.h
+++ b/include/trace/events/sunrpc.h
@@ -1939,34 +1939,58 @@ TRACE_EVENT(svc_xprt_enqueue,
 		SVC_XPRT_ENDPOINT_VARARGS, __entry->pid)
 );
 
-TRACE_EVENT(svc_xprt_dequeue,
+#define show_svc_pool_flags(x)						\
+	__print_flags(x, "|",						\
+		{ BIT(SP_TASK_PENDING),		"TASK_PENDING" },	\
+		{ BIT(SP_CONGESTED),		"CONGESTED" })
+
+DECLARE_EVENT_CLASS(svc_pool_scheduler_class,
 	TP_PROTO(
-		const struct svc_rqst *rqst
+		const struct svc_pool *pool,
+		const struct svc_rqst *rqstp
 	),
 
-	TP_ARGS(rqst),
+	TP_ARGS(pool, rqstp),
 
 	TP_STRUCT__entry(
-		SVC_XPRT_ENDPOINT_FIELDS(rqst->rq_xprt)
+		SVC_XPRT_ENDPOINT_FIELDS(rqstp->rq_xprt)
 
+		__string(name, rqstp->rq_server->sv_name)
+		__field(int, pool_id)
+		__field(unsigned int, nrthreads)
+		__field(unsigned long, pool_flags)
 		__field(unsigned long, wakeup)
 	),
 
 	TP_fast_assign(
-		SVC_XPRT_ENDPOINT_ASSIGNMENTS(rqst->rq_xprt);
+		SVC_XPRT_ENDPOINT_ASSIGNMENTS(rqstp->rq_xprt);
 
+		__assign_str(name, rqstp->rq_server->sv_name);
+		__entry->pool_id = pool->sp_id;
+		__entry->nrthreads = pool->sp_nrthreads;
+		__entry->pool_flags = pool->sp_flags;
 		__entry->wakeup = ktime_to_us(ktime_sub(ktime_get(),
-							rqst->rq_qtime));
+							rqstp->rq_qtime));
 	),
 
-	TP_printk(SVC_XPRT_ENDPOINT_FORMAT " wakeup-us=%lu",
-		SVC_XPRT_ENDPOINT_VARARGS, __entry->wakeup)
+	TP_printk(SVC_XPRT_ENDPOINT_FORMAT
+		" service=%s pool=%d pool_flags=%s nrthreads=%u wakeup-us=%lu",
+		SVC_XPRT_ENDPOINT_VARARGS, __get_str(name), __entry->pool_id,
+		show_svc_pool_flags(__entry->pool_flags), __entry->nrthreads,
+		__entry->wakeup
+	)
 );
 
-#define show_svc_pool_flags(x)						\
-	__print_flags(x, "|",						\
-		{ BIT(SP_TASK_PENDING),		"TASK_PENDING" },	\
-		{ BIT(SP_CONGESTED),		"CONGESTED" })
+#define DEFINE_SVC_POOL_SCHEDULER_EVENT(name)				\
+	DEFINE_EVENT(svc_pool_scheduler_class, svc_pool_##name,		\
+			TP_PROTO(					\
+				const struct svc_pool *pool,		\
+				const struct svc_rqst *rqstp		\
+			),						\
+			TP_ARGS(pool, rqstp))
+
+DEFINE_SVC_POOL_SCHEDULER_EVENT(polled);
+DEFINE_SVC_POOL_SCHEDULER_EVENT(awoken);
 
 TRACE_EVENT(svc_pool_starved,
 	TP_PROTO(
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 859eecb7d52c..7d5aed4d1766 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -743,8 +743,10 @@ static struct svc_xprt *svc_get_next_xprt(struct svc_rqst *rqstp, long timeout)
 	WARN_ON_ONCE(rqstp->rq_xprt);
 
 	rqstp->rq_xprt = svc_xprt_dequeue(pool);
-	if (rqstp->rq_xprt)
+	if (rqstp->rq_xprt) {
+		trace_svc_pool_polled(pool, rqstp);
 		goto out_found;
+	}
 
 	/*
 	 * We have to be able to interrupt this wait
@@ -766,8 +768,10 @@ static struct svc_xprt *svc_get_next_xprt(struct svc_rqst *rqstp, long timeout)
 	set_bit(RQ_BUSY, &rqstp->rq_flags);
 	smp_mb__after_atomic();
 	rqstp->rq_xprt = svc_xprt_dequeue(pool);
-	if (rqstp->rq_xprt)
+	if (rqstp->rq_xprt) {
+		trace_svc_pool_awoken(pool, rqstp);
 		goto out_found;
+	}
 
 	if (!time_left)
 		percpu_counter_inc(&pool->sp_threads_timedout);
@@ -783,7 +787,6 @@ static struct svc_xprt *svc_get_next_xprt(struct svc_rqst *rqstp, long timeout)
 		rqstp->rq_chandle.thread_wait = 5*HZ;
 	else
 		rqstp->rq_chandle.thread_wait = 1*HZ;
-	trace_svc_xprt_dequeue(rqstp);
 	return rqstp->rq_xprt;
 }

From patchwork Thu Jun 29 18:42:50 2023
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13297203
Subject: [PATCH RFC 4/8] SUNRPC: Clean up svc_set_num_threads
From: Chuck Lever
To: linux-nfs@vger.kernel.org
Cc: Chuck Lever, lorenzo@kernel.org, neilb@suse.de, jlayton@redhat.com,
    david@fromorbit.com
Date: Thu, 29 Jun 2023 14:42:50 -0400
Message-ID: <168806417022.1034990.13187981091421789973.stgit@morisot.1015granger.net>
In-Reply-To: <168806401782.1034990.9686296943273298604.stgit@morisot.1015granger.net>
References: <168806401782.1034990.9686296943273298604.stgit@morisot.1015granger.net>

From: Chuck Lever

Document the API contract and remove stale or obvious comments.

Signed-off-by: Chuck Lever
---
 net/sunrpc/svc.c |   60 +++++++++++++++++++++++-------------------------------
 1 file changed, 25 insertions(+), 35 deletions(-)

diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 04151e22ec44..cf2e58ead35d 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -724,23 +724,14 @@ struct svc_rqst *svc_pool_wake_idle_thread(struct svc_serv *serv,
 	return NULL;
 }
 
-/*
- * Choose a pool in which to create a new thread, for svc_set_num_threads
- */
-static inline struct svc_pool *
-choose_pool(struct svc_serv *serv, struct svc_pool *pool, unsigned int *state)
+static struct svc_pool *
+svc_pool_next(struct svc_serv *serv, struct svc_pool *pool, unsigned int *state)
 {
-	if (pool != NULL)
-		return pool;
-
-	return &serv->sv_pools[(*state)++ % serv->sv_nrpools];
+	return pool ? pool : &serv->sv_pools[(*state)++ % serv->sv_nrpools];
 }
 
-/*
- * Choose a thread to kill, for svc_set_num_threads
- */
-static inline struct task_struct *
-choose_victim(struct svc_serv *serv, struct svc_pool *pool, unsigned int *state)
+static struct task_struct *
+svc_pool_victim(struct svc_serv *serv, struct svc_pool *pool, unsigned int *state)
 {
 	unsigned int i;
 	struct task_struct *task = NULL;
@@ -748,7 +739,6 @@ choose_victim(struct svc_serv *serv, struct svc_pool *pool, unsigned int *state)
 	if (pool != NULL) {
 		spin_lock_bh(&pool->sp_lock);
 	} else {
-		/* choose a pool in round-robin fashion */
 		for (i = 0; i < serv->sv_nrpools; i++) {
 			pool = &serv->sv_pools[--(*state) % serv->sv_nrpools];
 			spin_lock_bh(&pool->sp_lock);
@@ -763,21 +753,15 @@ choose_victim(struct svc_serv *serv, struct svc_pool *pool, unsigned int *state)
 	if (!list_empty(&pool->sp_all_threads)) {
 		struct svc_rqst *rqstp;
 
-		/*
-		 * Remove from the pool->sp_all_threads list
-		 * so we don't try to kill it again.
-		 */
 		rqstp = list_entry(pool->sp_all_threads.next, struct svc_rqst, rq_all);
 		set_bit(RQ_VICTIM, &rqstp->rq_flags);
 		list_del_rcu(&rqstp->rq_all);
 		task = rqstp->rq_task;
 	}
 	spin_unlock_bh(&pool->sp_lock);
-
 	return task;
 }
 
-/* create new threads */
 static int
 svc_start_kthreads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
 {
@@ -789,13 +773,12 @@ svc_start_kthreads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
 	do {
 		nrservs--;
-		chosen_pool = choose_pool(serv, pool, &state);
-
+		chosen_pool = svc_pool_next(serv, pool, &state);
 		node = svc_pool_map_get_node(chosen_pool->sp_id);
+
 		rqstp = svc_prepare_thread(serv, chosen_pool, node);
 		if (IS_ERR(rqstp))
 			return PTR_ERR(rqstp);
-
 		task = kthread_create_on_node(serv->sv_threadfn, rqstp,
					      node, "%s", serv->sv_name);
 		if (IS_ERR(task)) {
@@ -814,15 +797,6 @@ svc_start_kthreads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
 	return 0;
 }
 
-/*
- * Create or destroy enough new threads to make the number
- * of threads the given number. If `pool' is non-NULL, applies
- * only to threads in that pool, otherwise round-robins between
- * all pools. Caller must ensure that mutual exclusion between this and
- * server startup or shutdown.
- */
-
-/* destroy old threads */
 static int
 svc_stop_kthreads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
 {
@@ -830,9 +804,8 @@ svc_stop_kthreads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
 	struct task_struct *task;
 	unsigned int state = serv->sv_nrthreads-1;
 
-	/* destroy old threads */
 	do {
-		task = choose_victim(serv, pool, &state);
+		task = svc_pool_victim(serv, pool, &state);
 		if (task == NULL)
 			break;
 		rqstp = kthread_data(task);
@@ -844,6 +817,23 @@ svc_stop_kthreads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
 	return 0;
 }
 
+/**
+ * svc_set_num_threads - adjust number of threads per RPC service
+ * @serv: RPC service to adjust
+ * @pool: Specific pool from which to choose threads, or NULL
+ * @nrservs: New number of threads for @serv (0 or less means kill all threads)
+ *
+ * Create or destroy threads to make the number of threads for @serv the
+ * given number. If @pool is non-NULL, change only threads in that pool;
+ * otherwise, round-robin between all pools for @serv. @serv's
+ * sv_nrthreads is adjusted for each thread created or destroyed.
+ *
+ * Caller must ensure mutual exclusion between this and server startup or
+ * shutdown.
+ *
+ * Returns zero on success or a negative errno if an error occurred while
+ * starting a thread.
+ */
 int
 svc_set_num_threads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
 {

From patchwork Thu Jun 29 18:42:56 2023
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13297204
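The renamed svc_pool_next() helper collapses the old choose_pool() into a single expression: a caller-supplied pool always wins, otherwise pools are visited round-robin using *state as a running index. A small standalone sketch of that selection logic, with `struct pool` and `pool_next` as illustrative names:

```c
#include <assert.h>
#include <stddef.h>

struct pool {
	int id;
};

/*
 * Sketch of svc_pool_next() from the patch: if the caller pinned a
 * pool, use it; otherwise rotate through the array, exactly like
 * &serv->sv_pools[(*state)++ % serv->sv_nrpools].
 */
static struct pool *
pool_next(struct pool *pools, size_t nrpools, struct pool *fixed,
	  unsigned int *state)
{
	return fixed ? fixed : &pools[(*state)++ % nrpools];
}
```

Note that *state advances only on the round-robin path, so alternating pinned and unpinned callers do not skip pools.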
55TZzI/tjgdzLTW7hrd8WOqNlq3Ljk4s2CW/PkcgkjhUzIXczzt5kQ3dORIHDyPQ8O TKArG337X72EcOv3CT78fKDzcXYKANuofwYZgkbSaXbEp6871YwSP95cABZke1FZHW xT38mH1SmvYDA== Subject: [PATCH RFC 5/8] SUNRPC: Replace dprintk() call site in __svc_create() From: Chuck Lever To: linux-nfs@vger.kernel.org Cc: Chuck Lever , lorenzo@kernel.org, neilb@suse.de, jlayton@redhat.com, david@fromorbit.com Date: Thu, 29 Jun 2023 14:42:56 -0400 Message-ID: <168806417679.1034990.17820560466387975643.stgit@morisot.1015granger.net> In-Reply-To: <168806401782.1034990.9686296943273298604.stgit@morisot.1015granger.net> References: <168806401782.1034990.9686296943273298604.stgit@morisot.1015granger.net> User-Agent: StGit/1.5 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org From: Chuck Lever Done as part of converting SunRPC observability from printk to tracepoints. Signed-off-by: Chuck Lever --- include/trace/events/sunrpc.h | 23 +++++++++++++++++++++++ net/sunrpc/svc.c | 5 ++--- 2 files changed, 25 insertions(+), 3 deletions(-) diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h index cf3d404ca6d8..70f3bc22c429 100644 --- a/include/trace/events/sunrpc.h +++ b/include/trace/events/sunrpc.h @@ -1842,6 +1842,29 @@ TRACE_EVENT(svc_stats_latency, __get_str(procedure), __entry->execute) ); +TRACE_EVENT(svc_pool_init, + TP_PROTO( + const struct svc_serv *serv, + const struct svc_pool *pool + ), + + TP_ARGS(serv, pool), + + TP_STRUCT__entry( + __string(name, serv->sv_name) + __field(int, pool_id) + ), + + TP_fast_assign( + __assign_str(name, serv->sv_name); + __entry->pool_id = pool->sp_id; + ), + + TP_printk("service=%s pool=%d", + __get_str(name), __entry->pool_id + ) +); + #define show_svc_xprt_flags(flags) \ __print_flags(flags, "|", \ { BIT(XPT_BUSY), "BUSY" }, \ diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c index cf2e58ead35d..828d28883ea8 100644 --- a/net/sunrpc/svc.c +++ b/net/sunrpc/svc.c @@ -505,9 +505,6 @@ __svc_create(struct svc_program *prog, 
unsigned int bufsize, int npools, for (i = 0; i < serv->sv_nrpools; i++) { struct svc_pool *pool = &serv->sv_pools[i]; - dprintk("svc: initialising pool %u for %s\n", - i, serv->sv_name); - pool->sp_id = i; INIT_LIST_HEAD(&pool->sp_sockets); INIT_LIST_HEAD(&pool->sp_all_threads); @@ -517,6 +514,8 @@ __svc_create(struct svc_program *prog, unsigned int bufsize, int npools, percpu_counter_init(&pool->sp_threads_woken, 0, GFP_KERNEL); percpu_counter_init(&pool->sp_threads_timedout, 0, GFP_KERNEL); percpu_counter_init(&pool->sp_threads_starved, 0, GFP_KERNEL); + + trace_svc_pool_init(serv, pool); } return serv; From patchwork Thu Jun 29 18:43:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chuck Lever X-Patchwork-Id: 13297205 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6B9A2EB64DD for ; Thu, 29 Jun 2023 18:43:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231668AbjF2SnO (ORCPT ); Thu, 29 Jun 2023 14:43:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57446 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232825AbjF2SnK (ORCPT ); Thu, 29 Jun 2023 14:43:10 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0F5C62681 for ; Thu, 29 Jun 2023 11:43:07 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 3AA76615E8 for ; Thu, 29 Jun 2023 18:43:05 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with 
ESMTPSA id 35468C433C0; Thu, 29 Jun 2023 18:43:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1688064184; bh=gFNY8BUQgoiA7qVABypwdwxxB4umlPMej7xWlY7ZoMY=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=d92mHh+81At5RXQ3+sBo43OUnsZz9R0A7TV2L9SFQMnTcC0ZNNKl7+p5O09HOdVRd cY0lkseW9I6UfOAlYXS8ipdTWy8UinAGPh+1+wp0VNJU/tgmQxkQ6HtDGExLad8CAj HkaMUorqyWUrAm2N+2LaJkCd0pI8kc0Dr7al6C9txJFfffYjBCHz42iHyfZTTJW3Zb oo5SswhCl2j7LKQs6dsBXajkxFr8yjGbxDQQ6Bmn8XJ4Cu02N94OR5fbsWxN0FZguh iqRExuNgW2VNvPRCObA4ntVGtPb9UA+olJv6h0NYAYevN3wUodyic4YAwgEC9Bb5kx 7mG7wJfWMA+nw== Subject: [PATCH RFC 6/8] SUNRPC: Replace sp_threads_all with an xarray From: Chuck Lever To: linux-nfs@vger.kernel.org Cc: Chuck Lever , lorenzo@kernel.org, neilb@suse.de, jlayton@redhat.com, david@fromorbit.com Date: Thu, 29 Jun 2023 14:43:03 -0400 Message-ID: <168806418337.1034990.3706968041401141634.stgit@morisot.1015granger.net> In-Reply-To: <168806401782.1034990.9686296943273298604.stgit@morisot.1015granger.net> References: <168806401782.1034990.9686296943273298604.stgit@morisot.1015granger.net> User-Agent: StGit/1.5 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org From: Chuck Lever We want a thread lookup operation that can be done with RCU only, but to avoid the linked-list walk, which does not scale well in the number of svc threads. BH-disabled locking is no longer necessary because we're no longer sharing the pool's sp_lock to protect either the xarray or the pool's thread count. sp_lock also protects transport activity. As far as I can tell, there are no callers of svc_set_num_threads() that run outside of process context. 
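The conversion above trades a linked-list walk for an ID-indexed lookup. As a rough userspace illustration of that idea only (this is not the kernel xarray API; `thread_ctx`, `pool_add_thread`, and `pool_find_thread` are all hypothetical names), each thread could be handed the lowest free integer ID at creation time, after which lookup by ID is a constant-time index operation rather than an O(n) list traversal:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct svc_rqst: only the field this sketch needs. */
struct thread_ctx {
	unsigned int id;	/* index into the pool's table, like rq_thread_id */
};

#define POOL_MAX 256

/* Toy "pool": a fixed table indexed by thread ID, loosely analogous to an
 * xarray keyed by rq_thread_id. NULL slots are free. */
struct pool {
	struct thread_ctx *slots[POOL_MAX];
};

/* Allocate the lowest free ID, as an ID allocator would. */
static struct thread_ctx *pool_add_thread(struct pool *p)
{
	for (unsigned int i = 0; i < POOL_MAX; i++) {
		if (!p->slots[i]) {
			struct thread_ctx *t = calloc(1, sizeof(*t));
			if (!t)
				return NULL;
			t->id = i;
			p->slots[i] = t;
			return t;
		}
	}
	return NULL;	/* pool is full */
}

/* O(1) lookup by ID -- the operation the list walk could not provide. */
static struct thread_ctx *pool_find_thread(struct pool *p, unsigned int id)
{
	return id < POOL_MAX ? p->slots[id] : NULL;
}
```

The kernel xarray additionally provides locking and RCU-safe traversal; this sketch shows only the indexing idea.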
Signed-off-by: Chuck Lever
---
 fs/nfsd/nfssvc.c              |  3 +-
 include/linux/sunrpc/svc.h    | 11 +++----
 include/trace/events/sunrpc.h | 47 +++++++++++++++++++++++++++++-
 net/sunrpc/svc.c              | 65 +++++++++++++++++++++++++----------------
 net/sunrpc/svc_xprt.c         |  2 +
 5 files changed, 92 insertions(+), 36 deletions(-)

diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index 2154fa63c5f2..d42b2a40c93c 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -62,8 +62,7 @@ static __be32 nfsd_init_request(struct svc_rqst *,
 * If (out side the lock) nn->nfsd_serv is non-NULL, then it must point to a
 * properly initialised 'struct svc_serv' with ->sv_nrthreads > 0 (unless
 * nn->keep_active is set).  That number of nfsd threads must
- * exist and each must be listed in ->sp_all_threads in some entry of
- * ->sv_pools[].
+ * exist and each must be listed in some entry of ->sv_pools[].
 *
 * Each active thread holds a counted reference on nn->nfsd_serv, as does
 * the nn->keep_active flag and various transient calls to svc_get().
diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index fbfe6ea737c8..45aa7648dca6 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -32,10 +32,10 @@
 */
 struct svc_pool {
 	unsigned int		sp_id;		/* pool id; also node id on NUMA */
-	spinlock_t		sp_lock;	/* protects all fields */
+	spinlock_t		sp_lock;	/* protects sp_sockets */
 	struct list_head	sp_sockets;	/* pending sockets */
 	unsigned int		sp_nrthreads;	/* # of threads in pool */
-	struct list_head	sp_all_threads;	/* all server threads */
+	struct xarray		sp_thread_xa;
 
 	/* statistics on pool operation */
 	struct percpu_counter	sp_sockets_queued;
@@ -194,7 +194,6 @@ extern u32 svc_max_payload(const struct svc_rqst *rqstp);
 * processed.
 */
 struct svc_rqst {
-	struct list_head	rq_all;		/* all threads list */
 	struct rcu_head		rq_rcu_head;	/* for RCU deferred kfree */
 	struct svc_xprt *	rq_xprt;	/* transport ptr */
@@ -239,10 +238,10 @@ struct svc_rqst {
 #define	RQ_SPLICE_OK	(4)			/* turned off in gss privacy
 						 * to prevent encrypting page
 						 * cache pages */
-#define	RQ_VICTIM	(5)			/* about to be shut down */
-#define	RQ_BUSY		(6)			/* request is busy */
-#define	RQ_DATA		(7)			/* request has data */
+#define	RQ_BUSY		(5)			/* request is busy */
+#define	RQ_DATA		(6)			/* request has data */
 	unsigned long		rq_flags;	/* flags field */
+	u32			rq_thread_id;	/* xarray index */
 	ktime_t			rq_qtime;	/* enqueue time */
 
 	void *			rq_argp;	/* decoded arguments */
diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
index 70f3bc22c429..4ec746048f15 100644
--- a/include/trace/events/sunrpc.h
+++ b/include/trace/events/sunrpc.h
@@ -1600,7 +1600,6 @@ DEFINE_SVCXDRBUF_EVENT(sendto);
 		svc_rqst_flag(USEDEFERRAL)				\
 		svc_rqst_flag(DROPME)					\
 		svc_rqst_flag(SPLICE_OK)				\
-		svc_rqst_flag(VICTIM)					\
 		svc_rqst_flag(BUSY)					\
 		svc_rqst_flag_end(DATA)
 
@@ -2043,6 +2042,52 @@ TRACE_EVENT(svc_pool_starved,
 	)
 );
 
+DECLARE_EVENT_CLASS(svc_thread_lifetime_class,
+	TP_PROTO(
+		const struct svc_serv *serv,
+		const struct svc_pool *pool,
+		const struct svc_rqst *rqstp
+	),
+
+	TP_ARGS(serv, pool, rqstp),
+
+	TP_STRUCT__entry(
+		__string(name, serv->sv_name)
+		__field(int, pool_id)
+		__field(unsigned int, nrthreads)
+		__field(unsigned long, pool_flags)
+		__field(u32, thread_id)
+		__field(const void *, rqstp)
+	),
+
+	TP_fast_assign(
+		__assign_str(name, serv->sv_name);
+		__entry->pool_id = pool->sp_id;
+		__entry->nrthreads = pool->sp_nrthreads;
+		__entry->pool_flags = pool->sp_flags;
+		__entry->thread_id = rqstp->rq_thread_id;
+		__entry->rqstp = rqstp;
+	),
+
+	TP_printk("service=%s pool=%d pool_flags=%s nrthreads=%u thread_id=%u",
+		__get_str(name), __entry->pool_id,
+		show_svc_pool_flags(__entry->pool_flags),
+		__entry->nrthreads, __entry->thread_id
+	)
+);
+
+#define DEFINE_SVC_THREAD_LIFETIME_EVENT(name) \
+	DEFINE_EVENT(svc_thread_lifetime_class, svc_pool_##name, \
+			TP_PROTO( \
+				const struct svc_serv *serv, \
+				const struct svc_pool *pool, \
+				const struct svc_rqst *rqstp \
+			), \
+			TP_ARGS(serv, pool, rqstp))
+
+DEFINE_SVC_THREAD_LIFETIME_EVENT(thread_init);
+DEFINE_SVC_THREAD_LIFETIME_EVENT(thread_exit);
+
 DECLARE_EVENT_CLASS(svc_xprt_event,
 	TP_PROTO(
 		const struct svc_xprt *xprt
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 828d28883ea8..18fbb98895ea 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -507,8 +507,8 @@ __svc_create(struct svc_program *prog, unsigned int bufsize, int npools,
 		pool->sp_id = i;
 		INIT_LIST_HEAD(&pool->sp_sockets);
-		INIT_LIST_HEAD(&pool->sp_all_threads);
 		spin_lock_init(&pool->sp_lock);
+		xa_init_flags(&pool->sp_thread_xa, XA_FLAGS_ALLOC);
 
 		percpu_counter_init(&pool->sp_sockets_queued, 0, GFP_KERNEL);
 		percpu_counter_init(&pool->sp_threads_woken, 0, GFP_KERNEL);
@@ -592,6 +592,8 @@ svc_destroy(struct kref *ref)
 		percpu_counter_destroy(&pool->sp_threads_woken);
 		percpu_counter_destroy(&pool->sp_threads_timedout);
 		percpu_counter_destroy(&pool->sp_threads_starved);
+
+		xa_destroy(&pool->sp_thread_xa);
 	}
 	kfree(serv->sv_pools);
 	kfree(serv);
@@ -672,7 +674,11 @@ EXPORT_SYMBOL_GPL(svc_rqst_alloc);
 static struct svc_rqst *
 svc_prepare_thread(struct svc_serv *serv, struct svc_pool *pool, int node)
 {
+	static const struct xa_limit limit = {
+		.max = UINT_MAX,
+	};
 	struct svc_rqst	*rqstp;
+	int ret;
 
 	rqstp = svc_rqst_alloc(serv, pool, node);
 	if (!rqstp)
@@ -683,11 +689,21 @@ svc_prepare_thread(struct svc_serv *serv, struct svc_pool *pool, int node)
 	serv->sv_nrthreads += 1;
 	spin_unlock_bh(&serv->sv_lock);
 
-	spin_lock_bh(&pool->sp_lock);
+	xa_lock(&pool->sp_thread_xa);
+	ret = __xa_alloc(&pool->sp_thread_xa, &rqstp->rq_thread_id, rqstp,
+			 limit, GFP_KERNEL);
+	if (ret) {
+		xa_unlock(&pool->sp_thread_xa);
+		goto out_free;
+	}
 	pool->sp_nrthreads++;
-	list_add_rcu(&rqstp->rq_all, &pool->sp_all_threads);
-	spin_unlock_bh(&pool->sp_lock);
+	xa_unlock(&pool->sp_thread_xa);
+	trace_svc_pool_thread_init(serv, pool, rqstp);
 	return rqstp;
+
+out_free:
+	svc_rqst_free(rqstp);
+	return ERR_PTR(ret);
 }
 
 /**
@@ -704,19 +720,17 @@ struct svc_rqst *svc_pool_wake_idle_thread(struct svc_serv *serv,
 					   struct svc_pool *pool)
 {
 	struct svc_rqst	*rqstp;
+	unsigned long index;
 
-	rcu_read_lock();
-	list_for_each_entry_rcu(rqstp, &pool->sp_all_threads, rq_all) {
+	xa_for_each(&pool->sp_thread_xa, index, rqstp) {
 		if (test_and_set_bit(RQ_BUSY, &rqstp->rq_flags))
 			continue;
 
-		rcu_read_unlock();
 		WRITE_ONCE(rqstp->rq_qtime, ktime_get());
 		wake_up_process(rqstp->rq_task);
 		percpu_counter_inc(&pool->sp_threads_woken);
 		return rqstp;
 	}
-	rcu_read_unlock();
 
 	trace_svc_pool_starved(serv, pool);
 	percpu_counter_inc(&pool->sp_threads_starved);
@@ -732,32 +746,31 @@ svc_pool_next(struct svc_serv *serv, struct svc_pool *pool, unsigned int *state)
 static struct task_struct *
 svc_pool_victim(struct svc_serv *serv, struct svc_pool *pool, unsigned int *state)
 {
-	unsigned int i;
 	struct task_struct *task = NULL;
+	struct svc_rqst *rqstp;
+	unsigned long zero = 0;
+	unsigned int i;
 
 	if (pool != NULL) {
-		spin_lock_bh(&pool->sp_lock);
+		xa_lock(&pool->sp_thread_xa);
 	} else {
 		for (i = 0; i < serv->sv_nrpools; i++) {
 			pool = &serv->sv_pools[--(*state) % serv->sv_nrpools];
-			spin_lock_bh(&pool->sp_lock);
-			if (!list_empty(&pool->sp_all_threads))
+			xa_lock(&pool->sp_thread_xa);
+			if (!xa_empty(&pool->sp_thread_xa))
 				goto found_pool;
-			spin_unlock_bh(&pool->sp_lock);
+			xa_unlock(&pool->sp_thread_xa);
 		}
 		return NULL;
 	}
 
 found_pool:
-	if (!list_empty(&pool->sp_all_threads)) {
-		struct svc_rqst *rqstp;
-
-		rqstp = list_entry(pool->sp_all_threads.next, struct svc_rqst, rq_all);
-		set_bit(RQ_VICTIM, &rqstp->rq_flags);
-		list_del_rcu(&rqstp->rq_all);
+	rqstp = xa_find(&pool->sp_thread_xa, &zero, U32_MAX, XA_PRESENT);
+	if (rqstp) {
+		__xa_erase(&pool->sp_thread_xa, rqstp->rq_thread_id);
 		task = rqstp->rq_task;
 	}
-	spin_unlock_bh(&pool->sp_lock);
+	xa_unlock(&pool->sp_thread_xa);
 
 	return task;
 }
@@ -839,9 +852,9 @@ svc_set_num_threads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
 	if (pool == NULL) {
 		nrservs -= serv->sv_nrthreads;
 	} else {
-		spin_lock_bh(&pool->sp_lock);
+		xa_lock(&pool->sp_thread_xa);
 		nrservs -= pool->sp_nrthreads;
-		spin_unlock_bh(&pool->sp_lock);
+		xa_unlock(&pool->sp_thread_xa);
 	}
 
 	if (nrservs > 0)
@@ -928,11 +941,11 @@ svc_exit_thread(struct svc_rqst *rqstp)
 	struct svc_serv	*serv = rqstp->rq_server;
 	struct svc_pool	*pool = rqstp->rq_pool;
 
-	spin_lock_bh(&pool->sp_lock);
+	xa_lock(&pool->sp_thread_xa);
 	pool->sp_nrthreads--;
-	if (!test_and_set_bit(RQ_VICTIM, &rqstp->rq_flags))
-		list_del_rcu(&rqstp->rq_all);
-	spin_unlock_bh(&pool->sp_lock);
+	__xa_erase(&pool->sp_thread_xa, rqstp->rq_thread_id);
+	xa_unlock(&pool->sp_thread_xa);
+	trace_svc_pool_thread_exit(serv, pool, rqstp);
 
 	spin_lock_bh(&serv->sv_lock);
 	serv->sv_nrthreads -= 1;
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 7d5aed4d1766..77fc20b2181d 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -46,7 +46,7 @@ static LIST_HEAD(svc_xprt_class_list);
 
 /* SMP locking strategy:
 *
- *	svc_pool->sp_lock protects most of the fields of that pool.
+ *	svc_pool->sp_lock protects sp_sockets.
 *	svc_serv->sv_lock protects sv_tempsocks, sv_permsocks, sv_tmpcnt.
 *	when both need to be taken (rare), svc_serv->sv_lock is first.
 *	The "service mutex" protects svc_serv->sv_nrthread.
From patchwork Thu Jun 29 18:43:09 2023
Subject: [PATCH RFC 7/8] SUNRPC: Convert RQ_BUSY into a per-pool bitmap
From: Chuck Lever
To: linux-nfs@vger.kernel.org
Cc: Chuck Lever, lorenzo@kernel.org, neilb@suse.de, jlayton@redhat.com, david@fromorbit.com
Date: Thu, 29 Jun 2023 14:43:09 -0400
Message-ID: <168806418985.1034990.14686512686720974159.stgit@morisot.1015granger.net>
In-Reply-To: <168806401782.1034990.9686296943273298604.stgit@morisot.1015granger.net>
References: <168806401782.1034990.9686296943273298604.stgit@morisot.1015granger.net>

From: Chuck Lever

I've noticed that server request latency goes up simply when the nfsd
thread count is increased.

List walking is known to be memory-inefficient. On a busy server with
many threads, enqueuing a transport will walk the "all threads" list
quite frequently. This also pulls in the cache lines for some hot
fields in each svc_rqst.

The svc_xprt_enqueue() call that concerns me most is the one in
svc_rdma_wc_receive(), which is single-threaded per CQ. Slowing down
completion handling will limit the total throughput per RDMA
connection.

So, avoid walking the "all threads" list to find an idle thread to
wake. Instead, set up an idle bitmap and use find_next_bit, which
should work the same way as RQ_BUSY but will touch only the cacheline
that the bitmap is in. I think we can stick with atomic bit operations
here to avoid taking the pool lock.

The server can keep track of up to 64 threads in just one unsigned
long, and the bitmap can be multiple words long to handle even more
threads.
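The idle-bitmap search described above can be sketched in userspace with C11 atomics. This is a simplified, hypothetical model, not the kernel implementation: it scans bit by bit rather than word at a time as the kernel's find_next_bit does, and `idle_map`, `mark_idle`, and `claim_idle_thread` are invented names. It does show the key property, though: finding a thread to wake touches only the bitmap's cachelines, and an atomic test-and-clear guarantees each idle thread is claimed by exactly one waker without taking a lock.

```c
#include <assert.h>
#include <stdatomic.h>

#define BITS_PER_WORD 64
#define MAX_THREADS 256

/* Toy idle map: one bit per thread; set = idle, clear = busy.
 * Zero-initialized, so all threads start out "busy", as in the patch. */
static _Atomic unsigned long long idle_map[MAX_THREADS / BITS_PER_WORD];

static void mark_idle(unsigned int id)
{
	atomic_fetch_or(&idle_map[id / BITS_PER_WORD],
			1ULL << (id % BITS_PER_WORD));
}

/* Find an idle thread and atomically claim it (mark it busy again).
 * Returns the thread id, or -1 if every thread is busy --
 * the "pool starved" case. */
static int claim_idle_thread(unsigned int nrthreads)
{
	for (unsigned int id = 0; id < nrthreads; id++) {
		unsigned long long bit = 1ULL << (id % BITS_PER_WORD);
		_Atomic unsigned long long *word = &idle_map[id / BITS_PER_WORD];

		/* Skip busy threads with a plain load... */
		if (!(atomic_load(word) & bit))
			continue;
		/* ...then claim with an atomic test-and-clear, like
		 * test_and_clear_bit(): only one caller can win the bit. */
		if (atomic_fetch_and(word, ~bit) & bit)
			return (int)id;
	}
	return -1;
}
```

A thread would call `mark_idle()` before sleeping and the enqueue path would call `claim_idle_thread()` before `wake_up_process()`; the kernel version then maps the claimed bit index back to a thread via the xarray.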
Signed-off-by: Chuck Lever
---
 include/linux/sunrpc/svc.h    |  6 ++++--
 include/trace/events/sunrpc.h |  1 -
 net/sunrpc/svc.c              | 38 ++++++++++++++++++++++++++------------
 net/sunrpc/svc_xprt.c         | 23 +++++++++++++++++++----
 4 files changed, 49 insertions(+), 19 deletions(-)

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 45aa7648dca6..ffa58a7a689d 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -35,6 +35,7 @@ struct svc_pool {
 	spinlock_t		sp_lock;	/* protects sp_sockets */
 	struct list_head	sp_sockets;	/* pending sockets */
 	unsigned int		sp_nrthreads;	/* # of threads in pool */
+	unsigned long		*sp_idle_map;	/* idle threads */
 	struct xarray		sp_thread_xa;
 
 	/* statistics on pool operation */
@@ -189,6 +190,8 @@ extern u32 svc_max_payload(const struct svc_rqst *rqstp);
 #define RPCSVC_MAXPAGES		((RPCSVC_MAXPAYLOAD+PAGE_SIZE-1)/PAGE_SIZE \
 				+ 2 + 1)
 
+#define RPCSVC_MAXPOOLTHREADS	(256)
+
 /*
 * The context of a single thread, including the request currently being
 * processed.
@@ -238,8 +241,7 @@ struct svc_rqst {
 #define	RQ_SPLICE_OK	(4)			/* turned off in gss privacy
 						 * to prevent encrypting page
 						 * cache pages */
-#define	RQ_BUSY		(5)			/* request is busy */
-#define	RQ_DATA		(6)			/* request has data */
+#define	RQ_DATA		(5)			/* request has data */
 	unsigned long		rq_flags;	/* flags field */
 	u32			rq_thread_id;	/* xarray index */
 	ktime_t			rq_qtime;	/* enqueue time */
diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
index 4ec746048f15..f64c255975ab 100644
--- a/include/trace/events/sunrpc.h
+++ b/include/trace/events/sunrpc.h
@@ -1600,7 +1600,6 @@ DEFINE_SVCXDRBUF_EVENT(sendto);
 		svc_rqst_flag(USEDEFERRAL)				\
 		svc_rqst_flag(DROPME)					\
 		svc_rqst_flag(SPLICE_OK)				\
-		svc_rqst_flag(BUSY)					\
 		svc_rqst_flag_end(DATA)
 
 #undef svc_rqst_flag
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 18fbb98895ea..c2cba61a890c 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -509,6 +509,12 @@ __svc_create(struct svc_program *prog, unsigned int bufsize, int npools,
 		INIT_LIST_HEAD(&pool->sp_sockets);
 		spin_lock_init(&pool->sp_lock);
 		xa_init_flags(&pool->sp_thread_xa, XA_FLAGS_ALLOC);
+		/* All threads initially marked "busy" */
+		pool->sp_idle_map =
+			bitmap_zalloc_node(RPCSVC_MAXPOOLTHREADS, GFP_KERNEL,
+					   svc_pool_map_get_node(i));
+		if (!pool->sp_idle_map)
+			return NULL;
 
 		percpu_counter_init(&pool->sp_sockets_queued, 0, GFP_KERNEL);
 		percpu_counter_init(&pool->sp_threads_woken, 0, GFP_KERNEL);
@@ -594,6 +600,8 @@ svc_destroy(struct kref *ref)
 		percpu_counter_destroy(&pool->sp_threads_starved);
 
 		xa_destroy(&pool->sp_thread_xa);
+		bitmap_free(pool->sp_idle_map);
+		pool->sp_idle_map = NULL;
 	}
 	kfree(serv->sv_pools);
 	kfree(serv);
@@ -645,7 +653,6 @@ svc_rqst_alloc(struct svc_serv *serv, struct svc_pool *pool, int node)
 
 	folio_batch_init(&rqstp->rq_fbatch);
 
-	__set_bit(RQ_BUSY, &rqstp->rq_flags);
 	rqstp->rq_server = serv;
 	rqstp->rq_pool = pool;
 
@@ -675,7 +682,7 @@ static struct svc_rqst *
 svc_prepare_thread(struct svc_serv *serv, struct svc_pool *pool, int node)
 {
 	static const struct xa_limit limit = {
-		.max = UINT_MAX,
+		.max = RPCSVC_MAXPOOLTHREADS,
 	};
 	struct svc_rqst	*rqstp;
 	int ret;
@@ -720,18 +727,24 @@ struct svc_rqst *svc_pool_wake_idle_thread(struct svc_serv *serv,
 					   struct svc_pool *pool)
 {
 	struct svc_rqst	*rqstp;
-	unsigned long index;
+	unsigned long bit;
 
-	xa_for_each(&pool->sp_thread_xa, index, rqstp) {
-		if (test_and_set_bit(RQ_BUSY, &rqstp->rq_flags))
-			continue;
+	bit = 0;
+	do {
+		bit = find_next_bit(pool->sp_idle_map, pool->sp_nrthreads, bit);
+		if (bit == pool->sp_nrthreads)
+			goto out_starved;
+	} while (!test_and_clear_bit(bit, pool->sp_idle_map));
 
-		WRITE_ONCE(rqstp->rq_qtime, ktime_get());
-		wake_up_process(rqstp->rq_task);
-		percpu_counter_inc(&pool->sp_threads_woken);
-		return rqstp;
-	}
+	rqstp = xa_find(&pool->sp_thread_xa, &bit, bit, XA_PRESENT);
+	if (!rqstp)
+		goto out_starved;
+	WRITE_ONCE(rqstp->rq_qtime, ktime_get());
+	wake_up_process(rqstp->rq_task);
+	percpu_counter_inc(&pool->sp_threads_woken);
+	return rqstp;
+
+out_starved:
 	trace_svc_pool_starved(serv, pool);
 	percpu_counter_inc(&pool->sp_threads_starved);
 	return NULL;
@@ -765,7 +778,8 @@ svc_pool_victim(struct svc_serv *serv, struct svc_pool *pool, unsigned int *stat
 	}
 
 found_pool:
-	rqstp = xa_find(&pool->sp_thread_xa, &zero, U32_MAX, XA_PRESENT);
+	rqstp = xa_find(&pool->sp_thread_xa, &zero, RPCSVC_MAXPOOLTHREADS,
+			XA_PRESENT);
 	if (rqstp) {
 		__xa_erase(&pool->sp_thread_xa, rqstp->rq_thread_id);
 		task = rqstp->rq_task;
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 77fc20b2181d..e22f1432aabb 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -734,6 +734,18 @@ rqst_should_sleep(struct svc_rqst *rqstp)
 	return true;
 }
 
+static void svc_rqst_mark_idle(struct svc_rqst *rqstp)
+{
+	set_bit(rqstp->rq_thread_id, rqstp->rq_pool->sp_idle_map);
+	smp_mb__after_atomic();
+}
+
+static void svc_rqst_mark_busy(struct svc_rqst *rqstp)
+{
+	clear_bit(rqstp->rq_thread_id, rqstp->rq_pool->sp_idle_map);
+	smp_mb__after_atomic();
+}
+
 static struct svc_xprt *svc_get_next_xprt(struct svc_rqst *rqstp, long timeout)
 {
 	struct svc_pool		*pool = rqstp->rq_pool;
@@ -755,8 +767,7 @@ static struct svc_xprt *svc_get_next_xprt(struct svc_rqst *rqstp, long timeout)
 	set_current_state(TASK_INTERRUPTIBLE);
 	smp_mb__before_atomic();
 	clear_bit(SP_CONGESTED, &pool->sp_flags);
-	clear_bit(RQ_BUSY, &rqstp->rq_flags);
-	smp_mb__after_atomic();
+	svc_rqst_mark_idle(rqstp);
 
 	if (likely(rqst_should_sleep(rqstp)))
 		time_left = schedule_timeout(timeout);
@@ -765,8 +776,12 @@ static struct svc_xprt *svc_get_next_xprt(struct svc_rqst *rqstp, long timeout)
 
 	try_to_freeze();
 
-	set_bit(RQ_BUSY, &rqstp->rq_flags);
-	smp_mb__after_atomic();
+	/* Post-sleep: look for more work.
+	 *
+	 * Note: If we were awoken, then this rqstp has already
+	 * been marked busy.
+	 */
+	svc_rqst_mark_busy(rqstp);
 	rqstp->rq_xprt = svc_xprt_dequeue(pool);
 	if (rqstp->rq_xprt) {
 		trace_svc_pool_awoken(pool, rqstp);

From patchwork Thu Jun 29 18:43:16 2023
Subject: [PATCH RFC 8/8] SUNRPC: Don't disable BH's when taking sp_lock
From: Chuck Lever
To: linux-nfs@vger.kernel.org
Cc: Chuck Lever, lorenzo@kernel.org, neilb@suse.de, jlayton@redhat.com, david@fromorbit.com
Date: Thu, 29 Jun 2023 14:43:16 -0400
Message-ID: <168806419648.1034990.7540913098847778540.stgit@morisot.1015granger.net>
In-Reply-To: <168806401782.1034990.9686296943273298604.stgit@morisot.1015granger.net>
References: <168806401782.1034990.9686296943273298604.stgit@morisot.1015granger.net>

From: Chuck Lever

Consumers of sp_lock now all run in process context.

Signed-off-by: Chuck Lever
---
 net/sunrpc/svc_xprt.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index e22f1432aabb..6a56cd202148 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -472,9 +472,9 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
 	pool = svc_pool_for_cpu(xprt->xpt_server);
 
 	percpu_counter_inc(&pool->sp_sockets_queued);
-	spin_lock_bh(&pool->sp_lock);
+	spin_lock(&pool->sp_lock);
 	list_add_tail(&xprt->xpt_ready, &pool->sp_sockets);
-	spin_unlock_bh(&pool->sp_lock);
+	spin_unlock(&pool->sp_lock);
 
 	rqstp = svc_pool_wake_idle_thread(xprt->xpt_server, pool);
 	if (!rqstp) {
@@ -496,14 +496,14 @@ static struct svc_xprt *svc_xprt_dequeue(struct svc_pool *pool)
 	if (list_empty(&pool->sp_sockets))
 		goto out;
 
-	spin_lock_bh(&pool->sp_lock);
+	spin_lock(&pool->sp_lock);
 	if (likely(!list_empty(&pool->sp_sockets))) {
 		xprt = list_first_entry(&pool->sp_sockets,
 					struct svc_xprt, xpt_ready);
 		list_del_init(&xprt->xpt_ready);
 		svc_xprt_get(xprt);
 	}
-	spin_unlock_bh(&pool->sp_lock);
+	spin_unlock(&pool->sp_lock);
 out:
 	return xprt;
 }
@@ -1129,15 +1129,15 @@ static struct svc_xprt *svc_dequeue_net(struct svc_serv *serv, struct net *net)
 	for (i = 0; i < serv->sv_nrpools; i++) {
 		pool = &serv->sv_pools[i];
 
-		spin_lock_bh(&pool->sp_lock);
+		spin_lock(&pool->sp_lock);
 		list_for_each_entry_safe(xprt, tmp, &pool->sp_sockets, xpt_ready) {
 			if (xprt->xpt_net != net)
 				continue;
 			list_del_init(&xprt->xpt_ready);
-			spin_unlock_bh(&pool->sp_lock);
+			spin_unlock(&pool->sp_lock);
 			return xprt;
 		}
-		spin_unlock_bh(&pool->sp_lock);
+		spin_unlock(&pool->sp_lock);
 	}
 	return NULL;
 }