From patchwork Thu Jun 22 08:53:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288604 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 56090C001B3 for ; Thu, 22 Jun 2023 08:55:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230462AbjFVIzE (ORCPT ); Thu, 22 Jun 2023 04:55:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37798 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231466AbjFVIyl (ORCPT ); Thu, 22 Jun 2023 04:54:41 -0400 Received: from mail-pl1-x62f.google.com (mail-pl1-x62f.google.com [IPv6:2607:f8b0:4864:20::62f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B7ACC213C for ; Thu, 22 Jun 2023 01:54:11 -0700 (PDT) Received: by mail-pl1-x62f.google.com with SMTP id d9443c01a7336-1b5466bc5f8so9651705ad.1 for ; Thu, 22 Jun 2023 01:54:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424051; x=1690016051; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=DFMKKDRv9BEw0wZ8UUzCf9pHUtxe8HkFUWg9bN69VRM=; b=AuFJ2ff77l/LXEOtYzZWEupZaAfbwt3boPeShk25fcGMwTsb6VlfqfAhMOVgIPnKgi eqcOSF5WVHBBVioigqwmY9UAoTi5JEhiUSGdMReu4dDGNOFEVgxVIc2OKO3nltUsVDLn bIiXH0DnqlXam4z90R2tFa6XgNVpJSLaIq6Gbf9RXLnB8BOaf0v/n1c1hG3kRZPk0MVc mjQpq7kQFhiCgtBijMi8fR+x4TAH8zuouOKbSAqWM+GHeJnchPRb4XIBJxswbiSbm3XO sXBcqfmZ/ARe540iNA/E7F7ZKvP/VvSbnbTILDPrN3l0DanjIGoHjsT1U12ioP3F05dG yTtg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424051; x=1690016051; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=DFMKKDRv9BEw0wZ8UUzCf9pHUtxe8HkFUWg9bN69VRM=; b=NnGceUK4auyyMU1THRaZt4+h1qvoJCy6cAJKffiWzvVCssZL1ixNK3Z4y1CmcchKVX fkJJNx1vKKRQCQm+wjwFXe+9K0ZjT+sQqIriCBxX56ceXv5rEplZnqKwZ6bhgXi3my65 TuFWQ5jP0BR0leK6Wvc6PQppELcIY5XdThgXu/T/zjszBWxr/+B3eTatXEtSYA3fxwUi Hz/RCZpbkfLwKHc3VqDcw3FZA5fxO7Zp9NNNPMhUugGl9y6qs+jXyDkHWi6NHDKD2qV7 i7H0HLwZGkPJpFViJ5vgK/mCxRS0GiHCVaFV4Y+UWWWbcsGl1CDbSa3aDNp7nJgGoYoV gMkQ== X-Gm-Message-State: AC+VfDwD4SBXIT5jrh2AyjE813nK+DrAsUOuHWMaNBZI8D7fTHDzV8A2 QIDEJ9JdwPKGg2JyDIfMF8DRNQ== X-Google-Smtp-Source: ACHHUZ41Ov0V9PIJeBixnqaY8MSMJrvGTOtmFbuG0vAMaQAXgKytEj5uEpxgQHkUs9eKy8wwGzA30Q== X-Received: by 2002:a17:902:f691:b0:1b3:d4bb:3515 with SMTP id l17-20020a170902f69100b001b3d4bb3515mr21827814plg.0.1687424051106; Thu, 22 Jun 2023 01:54:11 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.54.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:54:10 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 01/29] mm: shrinker: add shrinker::private_data field Date: Thu, 22 Jun 2023 16:53:07 +0800 Message-Id: <20230622085335.77010-2-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org To prepare for the dynamic allocation of shrinker instances embedded in other structures, add a private_data field to struct shrinker, so that we can use shrinker::private_data to record and get the original embedded structure. Signed-off-by: Qi Zheng --- include/linux/shrinker.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h index 224293b2dd06..43e6fcabbf51 100644 --- a/include/linux/shrinker.h +++ b/include/linux/shrinker.h @@ -70,6 +70,8 @@ struct shrinker { int seeks; /* seeks to recreate an obj */ unsigned flags; + void *private_data; + /* These are for internal use */ struct list_head list; #ifdef CONFIG_MEMCG From patchwork Thu Jun 22 08:53:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288606 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B3613EB64D8 for ; Thu, 22 Jun 2023 08:55:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230296AbjFVIzH (ORCPT ); Thu, 22 Jun 2023 04:55:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36260 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231446AbjFVIy1 (ORCPT ); Thu, 22 Jun 2023 04:54:27 -0400 Received: from mail-pl1-x629.google.com (mail-pl1-x629.google.com [IPv6:2607:f8b0:4864:20::629]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 968EF2114 for ; Thu, 22 Jun 2023 01:54:19 -0700 (PDT) Received: by mail-pl1-x629.google.com with SMTP id d9443c01a7336-1b693afe799so2463735ad.1 for ; Thu, 22 Jun 2023 01:54:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424059; x=1690016059; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=4+KB+8A1a3nPwmndVeoDnF0UUBrNJStZPfOFWRboZRM=; b=Sk9j4D1M64xyAQoy0v1SMiRRuiugSGVsgHhv7vYv9c3IikKaUFVDiFJczRjS+t8QJh 8bRZCGyYvkRGHDm8uUJh3JrpsRJfZFHSfxrg783G1zZ8ag+Nrop0sZP8BnmbcU+0S25q ZIKtGKgKPIH7cGpK2fHrvYtY9p82kt8TosN3NcpFb8CLbiW0zQyLzdahaQYLKutpL1Kb iV83VCrLSsoe4tZLXCgr7bzLBi5xWr4Iu6AgSr+ApJLZMKUxhfto21PwHxkzor+s4sBV FrT1a+LNzQzeAZGp0p/ozv8VRhpBFeAgfX/MFGdkWk3iK2t7SrfDgjAEjXYPkdmChJzo lkUw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424059; x=1690016059; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=4+KB+8A1a3nPwmndVeoDnF0UUBrNJStZPfOFWRboZRM=; b=QF1dGnOzwWuvaRIO2IVn8bIstUgsElMsSeqxuXQM4SCQeLXkmU1zc7AfI1/jufsYXJ FxDyGJbu6Uadvl1XUlGuPufEY0xB4fXWmsWdcQq3wgg6yy/A1wVaFco/lPriwNeJXpYV knhSX8ofJbO3V6Ab6lkwZvDIyJDpFEdBanyD6KABy69RHwihmP2NEizNfbejCuHOLUhM K37fBTuQnCc1fTkbcQ19bGDzshzLoz/dQ17lVN2/rZMtxSuCQjKitF2Ujl9mD7FRB5k1 X/+pIdBKmHJiPFzHZ7oEAEVCwYHwtuEVXaBjmP2CQmEFay5Iy2Otym6xd4rU0t0MHpac 3glQ== X-Gm-Message-State: AC+VfDxi7vpgHaz8f6w291IvbZrBg4FgVPaN3Y/vh/Bij6FMBPDETyYx j1OogT3vmPmKlb8bDcWSZfHtOg== X-Google-Smtp-Source: ACHHUZ4ysdKcTcW8I0NFLAt2c3adAhLn+VAaFg6uutJFhO8zbaZNgTMf0Qt9rlrj2Rh11Ffv9G+1Gw== X-Received: by 2002:a17:903:2451:b0:1b0:34c6:3bf2 with SMTP id l17-20020a170903245100b001b034c63bf2mr21537674pls.5.1687424059079; Thu, 22 Jun 2023 01:54:19 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.54.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:54:18 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 02/29] mm: vmscan: introduce some helpers for dynamically allocating shrinker Date: Thu, 22 Jun 2023 16:53:08 +0800 Message-Id: <20230622085335.77010-3-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Introduce some helpers for dynamically allocating shrinker instance, and their uses are as follows: 1. shrinker_alloc_and_init() Used to allocate and initialize a shrinker instance, the priv_data parameter is used to pass the pointer of the previously embedded structure of the shrinker instance. 2. shrinker_free() Used to free the shrinker instance when the registration of shrinker fails. 3. unregister_and_free_shrinker() Used to unregister and free the shrinker instance, and the kfree() will be changed to kfree_rcu() later. Signed-off-by: Qi Zheng --- include/linux/shrinker.h | 12 ++++++++++++ mm/vmscan.c | 35 +++++++++++++++++++++++++++++++++++ 2 files changed, 47 insertions(+) diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h index 43e6fcabbf51..8e9ba6fa3fcc 100644 --- a/include/linux/shrinker.h +++ b/include/linux/shrinker.h @@ -107,6 +107,18 @@ extern void unregister_shrinker(struct shrinker *shrinker); extern void free_prealloced_shrinker(struct shrinker *shrinker); extern void synchronize_shrinkers(void); +typedef unsigned long (*count_objects_cb)(struct shrinker *s, + struct shrink_control *sc); +typedef unsigned long (*scan_objects_cb)(struct shrinker *s, + struct shrink_control *sc); + +struct shrinker *shrinker_alloc_and_init(count_objects_cb count, + scan_objects_cb scan, long batch, + int seeks, unsigned flags, + void *priv_data); +void shrinker_free(struct shrinker *shrinker); +void unregister_and_free_shrinker(struct shrinker *shrinker); + #ifdef CONFIG_SHRINKER_DEBUG extern int shrinker_debugfs_add(struct shrinker *shrinker); extern struct dentry *shrinker_debugfs_detach(struct shrinker *shrinker, diff --git a/mm/vmscan.c b/mm/vmscan.c index 45d17c7cc555..64ff598fbad9 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -809,6 +809,41 @@ void unregister_shrinker(struct shrinker *shrinker) } EXPORT_SYMBOL(unregister_shrinker); +struct shrinker *shrinker_alloc_and_init(count_objects_cb count, + scan_objects_cb scan, long batch, + int seeks, unsigned flags, + void *priv_data) +{ + struct shrinker *shrinker; + + shrinker = kzalloc(sizeof(struct shrinker), GFP_KERNEL); + if (!shrinker) + return NULL; + + shrinker->count_objects = count; + shrinker->scan_objects = scan; + shrinker->batch = batch; + shrinker->seeks = seeks; + shrinker->flags = flags; + shrinker->private_data = priv_data; + + return shrinker; +} +EXPORT_SYMBOL(shrinker_alloc_and_init); + +void shrinker_free(struct shrinker *shrinker) +{ + kfree(shrinker); +} +EXPORT_SYMBOL(shrinker_free); + +void unregister_and_free_shrinker(struct shrinker *shrinker) +{ + unregister_shrinker(shrinker); + kfree(shrinker); +} +EXPORT_SYMBOL(unregister_and_free_shrinker); + /** * synchronize_shrinkers - Wait for all running shrinkers to complete. * From patchwork Thu Jun 22 08:53:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288607 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3650CEB64DA for ; Thu, 22 Jun 2023 08:56:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231603AbjFVIzn (ORCPT ); Thu, 22 Jun 2023 04:55:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37600 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231599AbjFVIy5 (ORCPT ); Thu, 22 Jun 2023 04:54:57 -0400 Received: from mail-pl1-x62a.google.com (mail-pl1-x62a.google.com [IPv6:2607:f8b0:4864:20::62a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8CDA71738 for ; Thu, 22 Jun 2023 01:54:27 -0700 (PDT) Received: by mail-pl1-x62a.google.com with SMTP id d9443c01a7336-1b52418c25bso12030775ad.0 for ; Thu, 22 Jun 2023 01:54:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424067; x=1690016067; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=qIt8zXdDCaGLXCE6rKF2JMKvp8G2AK8nuTQphjV73ms=; b=cC13yiK2b2HqEC2WYanx92tTHG9ibgrmYLbrgzYcUarLdydxeypI53cwuyYhjaQzJp onJq4vBBmMajr1poWh86FQTLq2y1xSvoknQDzCFUhqc78tS5eFxPwQt8ueUiPUlQcGJ1 NoUMAyqrL7rvWv1XaJ4CPnN6K7xK1sRx+XiYPW8paVbE3vb5Wkc4IFamUXhI7GXmfwho QPeLUm4YlUE0P3+vsMp1UZZv6g44Pla+Jp2DOFJOg6O2qS7xw748UZa1YIoXHuTtcdry A4esVq8CH+pV+6r7bJr/mBJgAfFPjM5J1TM9ToGnDJSCePlm9yunBGrnuwbojkh7OG6q rFSQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424067; x=1690016067; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=qIt8zXdDCaGLXCE6rKF2JMKvp8G2AK8nuTQphjV73ms=; b=OelVzCiTdWCE6eNVsogqOiLxNsdGdUhkFDQBHOp8OU0NNSyWjPEj679IdS+dGJHts/ G4E8lD1E8Ozbq9uL5PbOyK1M0tpGmJdd8tpbnrFQWoiLLH2cLl8w+T4Gpd7stSK6nzSk gGYYNsW8nIGMG8Go77sZBV+LjbdSZedKjWaargbzNSYsoVpAWOWPlXCfYFWfENfDYlmd ih2C23ghepPATTD3+V9n3414pWbDUp412CwxU/BpZVZmGAptfd81kD1goOL2Gt2Mp/+V 4FLV31qLu6BN8CXoBDx6fba/mQZ5IabS/9uwmXLPT0VNF22vPSZ/o8fwkZB0JqlUb7sC uLTg== X-Gm-Message-State: AC+VfDwAIqwDQqWLaFQhZqbGikd7GHswdE99l/BK21PTcpYKjRWEsyU1 lKvvW3RBKPSfCxeTbH4cNx+GJw== X-Google-Smtp-Source: ACHHUZ68xppE2OzenI0HziNaJw5pBbhsLnOSkDmr33RiuwGHaNU1efiGcJgTbt3uCyHHsLierWgKoQ== X-Received: by 2002:a17:902:d489:b0:1ae:4567:2737 with SMTP id c9-20020a170902d48900b001ae45672737mr21909019plg.2.1687424067136; Thu, 22 Jun 2023 01:54:27 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.54.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:54:26 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 03/29] drm/i915: dynamically allocate the i915_gem_mm shrinker Date: Thu, 22 Jun 2023 16:53:09 +0800 Message-Id: <20230622085335.77010-4-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org In preparation for implementing lockless slab shrink, we need to dynamically allocate the i915_gem_mm shrinker, so that it can be freed asynchronously by using kfree_rcu(). Then it doesn't need to wait for RCU read-side critical section when releasing the struct drm_i915_private. Signed-off-by: Qi Zheng --- drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 27 ++++++++++---------- drivers/gpu/drm/i915/i915_drv.h | 3 ++- 2 files changed, 16 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c index 214763942aa2..4dcdace26a08 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -284,8 +284,7 @@ unsigned long i915_gem_shrink_all(struct drm_i915_private *i915) static unsigned long i915_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc) { - struct drm_i915_private *i915 = - container_of(shrinker, struct drm_i915_private, mm.shrinker); + struct drm_i915_private *i915 = shrinker->private_data; unsigned long num_objects; unsigned long count; @@ -302,8 +301,8 @@ i915_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc) if (num_objects) { unsigned long avg = 2 * count / num_objects; - i915->mm.shrinker.batch = - max((i915->mm.shrinker.batch + avg) >> 1, + i915->mm.shrinker->batch = + max((i915->mm.shrinker->batch + avg) >> 1, 128ul /* default SHRINK_BATCH */); } @@ -313,8 +312,7 @@ i915_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc) static unsigned long i915_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc) { - struct drm_i915_private *i915 = - container_of(shrinker, struct drm_i915_private, mm.shrinker); + struct drm_i915_private *i915 = shrinker->private_data; unsigned long freed; sc->nr_scanned = 0; @@ -422,12 +420,15 @@ i915_gem_shrinker_vmap(struct notifier_block *nb, unsigned long event, void *ptr void i915_gem_driver_register__shrinker(struct drm_i915_private *i915) { - i915->mm.shrinker.scan_objects = i915_gem_shrinker_scan; - i915->mm.shrinker.count_objects = i915_gem_shrinker_count; - i915->mm.shrinker.seeks = DEFAULT_SEEKS; - i915->mm.shrinker.batch = 4096; - drm_WARN_ON(&i915->drm, register_shrinker(&i915->mm.shrinker, - "drm-i915_gem")); + i915->mm.shrinker = shrinker_alloc_and_init(i915_gem_shrinker_count, + i915_gem_shrinker_scan, + 4096, DEFAULT_SEEKS, 0, + i915); + if (i915->mm.shrinker && + register_shrinker(i915->mm.shrinker, "drm-i915_gem")) { + shrinker_free(i915->mm.shrinker); + drm_WARN_ON(&i915->drm, 1); + } i915->mm.oom_notifier.notifier_call = i915_gem_shrinker_oom; drm_WARN_ON(&i915->drm, register_oom_notifier(&i915->mm.oom_notifier)); @@ -443,7 +444,7 @@ void i915_gem_driver_unregister__shrinker(struct drm_i915_private *i915) unregister_vmap_purge_notifier(&i915->mm.vmap_notifier)); drm_WARN_ON(&i915->drm, unregister_oom_notifier(&i915->mm.oom_notifier)); - unregister_shrinker(&i915->mm.shrinker); + unregister_and_free_shrinker(i915->mm.shrinker); } void i915_gem_shrinker_taints_mutex(struct drm_i915_private *i915, diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index b4cf6f0f636d..06b04428596d 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -163,7 +163,8 @@ struct i915_gem_mm { struct notifier_block oom_notifier; struct notifier_block vmap_notifier; - struct shrinker shrinker; + + struct shrinker *shrinker; #ifdef CONFIG_MMU_NOTIFIER /** From patchwork Thu Jun 22 08:53:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288603 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 14B17EB64DC for ; Thu, 22 Jun 2023 08:55:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230320AbjFVIzD (ORCPT ); Thu, 22 Jun 2023 04:55:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37288 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231464AbjFVIyl (ORCPT ); Thu, 22 Jun 2023 04:54:41 -0400 Received: from mail-pl1-x62a.google.com (mail-pl1-x62a.google.com [IPv6:2607:f8b0:4864:20::62a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C06EB2120 for ; Thu, 22 Jun 2023 01:54:35 -0700 (PDT) Received: by mail-pl1-x62a.google.com with SMTP id d9443c01a7336-1b5466bc5f8so9652705ad.1 for ; Thu, 22 Jun 2023 01:54:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424075; x=1690016075; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=ftOXXP2bfTIIt8St3T9822C8Mz+yI5OAEWwVCqdxqi8=; b=AF1n71Ibzw+/eJzW8ZfFCXm2Gb4L6rcgcO3MWUfrJXSOCg2q4/6sdbKki+cciYNMm8 euETZJp/Yv9okGd+5XPvMxEgndnOAmqccUN3sNROxfnHbIeAfCrF3CESQGiUHVaKUyC0 MpSE/83T59YTQuiU5NMvjWSDBOsWVTepqqBeQE1tIy9JZFyOQuULXF0wbqj90XyCa0gR wARDSKvxEvJy+BtXj+ipth/FF0Y38R/9qPz9cOOh0wy/h6cCvspbbaNHGZoc8L1j5F66 eXHRgIUKfoLp9EhNKNNczQL499XVYbOwcbEp2A8mZxNkm++j1bZaQj9vcnyBlktuqhCK 186Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424075; x=1690016075; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ftOXXP2bfTIIt8St3T9822C8Mz+yI5OAEWwVCqdxqi8=; b=fN7DT6AOHk6K4Wr5n20QiK5t9pcpfmYC7pVlkYRE16LXVSM1MuUNp9R3Ef11Sc6yCY ltBrQNPk9v8QL9xy9e2+xwJK+0+KJB6mlq05VZjuxDWcNnU2KYHN7/iatLNZxjN5g2Nh POrUFWXqdYLhfs//Gt4Yxg8INd0k8xKH+rspMeTq6Vx9PwyNRXfnWvWrJpObYQNiyAXC IMW/v7pq2z51SWCmi/9ASQHLZXCOh3zrDoZ21hGEwYHKdpKH8wKO9wGwEPfM6x/V1HD/ 2/6w2siY2bR7bgsEHrNG7s29DiIPhtLTDuXxb9GONyOULr64WvG1KC2OXd+m3Jzgnfo7 rX6Q== X-Gm-Message-State: AC+VfDyfbt/UR9nUuHrk+v5fMIZWJNooHeAzlZXhXr6ilbeXxHaRLIgA yJ+b4JBNROG7SIM2AWQD2OLt7w== X-Google-Smtp-Source: ACHHUZ5hV1GhQPf78RPEQj8dLuXzVGGtbX4uDKxYqhUuZR/F2WM6CQGQNWMXHDLRseslqPACLVPKJA== X-Received: by 2002:a17:903:1246:b0:1b3:d8ac:8db3 with SMTP id u6-20020a170903124600b001b3d8ac8db3mr21942648plh.6.1687424075082; Thu, 22 Jun 2023 01:54:35 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.54.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:54:34 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 04/29] drm/msm: dynamically allocate the drm-msm_gem shrinker Date: Thu, 22 Jun 2023 16:53:10 +0800 Message-Id: <20230622085335.77010-5-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org In preparation for implementing lockless slab shrink, we need to dynamically allocate the drm-msm_gem shrinker, so that it can be freed asynchronously using kfree_rcu(). Then it doesn't need to wait for RCU read-side critical section when releasing the struct msm_drm_private. Signed-off-by: Qi Zheng --- drivers/gpu/drm/msm/msm_drv.h | 2 +- drivers/gpu/drm/msm/msm_gem_shrinker.c | 25 ++++++++++++++----------- 2 files changed, 15 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h index e13a8cbd61c9..4f3ba55058cd 100644 --- a/drivers/gpu/drm/msm/msm_drv.h +++ b/drivers/gpu/drm/msm/msm_drv.h @@ -217,7 +217,7 @@ struct msm_drm_private { } vram; struct notifier_block vmap_notifier; - struct shrinker shrinker; + struct shrinker *shrinker; struct drm_atomic_state *pm_state; diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c b/drivers/gpu/drm/msm/msm_gem_shrinker.c index f38296ad8743..db7582ae1f19 100644 --- a/drivers/gpu/drm/msm/msm_gem_shrinker.c +++ b/drivers/gpu/drm/msm/msm_gem_shrinker.c @@ -34,8 +34,7 @@ static bool can_block(struct shrink_control *sc) static unsigned long msm_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc) { - struct msm_drm_private *priv = - container_of(shrinker, struct msm_drm_private, shrinker); + struct msm_drm_private *priv = shrinker->private_data; unsigned count = priv->lru.dontneed.count; if (can_swap()) @@ -100,8 +99,7 @@ active_evict(struct drm_gem_object *obj) static unsigned long msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc) { - struct msm_drm_private *priv = - container_of(shrinker, struct msm_drm_private, shrinker); + struct msm_drm_private *priv = shrinker->private_data; struct { struct drm_gem_lru *lru; bool (*shrink)(struct drm_gem_object *obj); @@ -151,7 +149,7 @@ msm_gem_shrinker_shrink(struct drm_device *dev, unsigned long nr_to_scan) int ret; fs_reclaim_acquire(GFP_KERNEL); - ret = msm_gem_shrinker_scan(&priv->shrinker, &sc); + ret = msm_gem_shrinker_scan(priv->shrinker, &sc); fs_reclaim_release(GFP_KERNEL); return ret; @@ -213,10 +211,15 @@ msm_gem_shrinker_vmap(struct notifier_block *nb, unsigned long event, void *ptr) void msm_gem_shrinker_init(struct drm_device *dev) { struct msm_drm_private *priv = dev->dev_private; - priv->shrinker.count_objects = msm_gem_shrinker_count; - priv->shrinker.scan_objects = msm_gem_shrinker_scan; - priv->shrinker.seeks = DEFAULT_SEEKS; - WARN_ON(register_shrinker(&priv->shrinker, "drm-msm_gem")); + + priv->shrinker = shrinker_alloc_and_init(msm_gem_shrinker_count, + msm_gem_shrinker_scan, 0, + DEFAULT_SEEKS, 0, priv); + if (priv->shrinker && + register_shrinker(priv->shrinker, "drm-msm_gem")) { + shrinker_free(priv->shrinker); + WARN_ON(1); + } priv->vmap_notifier.notifier_call = msm_gem_shrinker_vmap; WARN_ON(register_vmap_purge_notifier(&priv->vmap_notifier)); @@ -232,8 +235,8 @@ void msm_gem_shrinker_cleanup(struct drm_device *dev) { struct msm_drm_private *priv = dev->dev_private; - if (priv->shrinker.nr_deferred) { + if (priv->shrinker->nr_deferred) { WARN_ON(unregister_vmap_purge_notifier(&priv->vmap_notifier)); - unregister_shrinker(&priv->shrinker); + unregister_and_free_shrinker(priv->shrinker); } } From patchwork Thu Jun 22 08:53:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288673 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 68C1FEB64DB for ; Thu, 22 Jun 2023 08:56:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231663AbjFVI4G (ORCPT ); Thu, 22 Jun 2023 04:56:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37624 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231447AbjFVIzN (ORCPT ); Thu, 22 Jun 2023 04:55:13 -0400 Received: from mail-pl1-x630.google.com (mail-pl1-x630.google.com [IPv6:2607:f8b0:4864:20::630]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BF32D2129 for ; Thu, 22 Jun 2023 01:54:43 -0700 (PDT) Received: by mail-pl1-x630.google.com with SMTP id d9443c01a7336-1b515ec39feso12619185ad.0 for ; Thu, 22 Jun 2023 01:54:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424083; x=1690016083; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=TQaNNe8oXaUFeFpLguh+yca1iGAODo9pDL1sZw6pJSE=; b=bLkRMM1jSdqXsEty9KLAFC9vFSq6Mb8s85EPujNxx1fG7cL+WhObUy7eJvjIIVVT27 Kiy5rsYxFdvQkjX/Hn7QfCF/H515sUldxTAV8THo6pKuerK5LdnjnrVU9ezaWAUUfrz8 rV0yX1ewF3SUu27Tytlc8UIGxdmoc2rf5KwQXmt4TN0xZENUrrBpDEpOBdYckFOGmW+/ CnWb/ghzLgegjrSPuJYmDsQVFRVYxO2XxU7X4C2jxwq5osf5N5i/+CYLyq1fNxhJsVCX q//piIl1OJNIHjgKTj8Fx/a4slkfZ5A+zdUvxvyRJBR/16JSIqSsZUTa9z9xdPFzC7nq BhVg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424083; x=1690016083; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=TQaNNe8oXaUFeFpLguh+yca1iGAODo9pDL1sZw6pJSE=; b=eFxbv2AmFFcs6+z0cyJxz1B5cYXgYGVoleMSthvT1Yc2oimiqYjYPmflfU55xzaS/l S601ZSik2iPCA/2JP8U7d350OUH9lyM3Q6m2Ep28IrYWqXcSwLdkUV6XGioOuizvjEG8 PwJ13tNxSlE4+1E8IZ5S9vBK5ymNtDpUesVjwwxTumm9vE/lIZJVHMhwSM1BdkwC4jl9 vvUw1/PyBsYCmak2RoMSqzpc7vhszuTZyt2VTyUs2lCsLJAoDGNapSSTPt8z2p5+SfPE LvZ4UWpQIclqzYfiG91j6j2r7/xuvqPJ3LE86OP7ygJ2w1Kki7UKJN5OH9EPM/IlDjvo DmIw== X-Gm-Message-State: AC+VfDzDOErK6ZGb9yH26KZvUjb3O+vDyLnb4WAV98Fo9U56M2Op/poj +CPmI1rVNbFF6lS9TueUKAZoLg== X-Google-Smtp-Source: ACHHUZ6gH/Pv0NlMjwBsqv6k+8Uik3ghEf6lWW7kovYDQCKB++KLEA1PTcfQABA8HOR66RClg4yAXA== X-Received: by 2002:a17:903:32c4:b0:1b0:3cda:6351 with SMTP id i4-20020a17090332c400b001b03cda6351mr21555341plr.0.1687424083229; Thu, 22 Jun 2023 01:54:43 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.54.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:54:42 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 05/29] drm/panfrost: dynamically allocate the drm-panfrost shrinker Date: Thu, 22 Jun 2023 16:53:11 +0800 Message-Id: <20230622085335.77010-6-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org In preparation for implementing lockless slab shrink, we need to dynamically allocate the drm-panfrost shrinker, so that it can be freed asynchronously using kfree_rcu(). Then it doesn't need to wait for RCU read-side critical section when releasing the struct panfrost_device. Signed-off-by: Qi Zheng --- drivers/gpu/drm/panfrost/panfrost_device.h | 2 +- .../gpu/drm/panfrost/panfrost_gem_shrinker.c | 24 ++++++++++--------- 2 files changed, 14 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h index b0126b9fbadc..e667e5689353 100644 --- a/drivers/gpu/drm/panfrost/panfrost_device.h +++ b/drivers/gpu/drm/panfrost/panfrost_device.h @@ -118,7 +118,7 @@ struct panfrost_device { struct mutex shrinker_lock; struct list_head shrinker_list; - struct shrinker shrinker; + struct shrinker *shrinker; struct panfrost_devfreq pfdevfreq; }; diff --git a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c index bf0170782f25..2a5513eb9e1f 100644 --- a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c +++ b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c @@ -18,8 +18,7 @@ static unsigned long panfrost_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc) { - struct panfrost_device *pfdev = - container_of(shrinker, struct panfrost_device, shrinker); + struct panfrost_device *pfdev = shrinker->private_data; struct drm_gem_shmem_object *shmem; unsigned long count = 0; @@ -65,8 +64,7 @@ static bool panfrost_gem_purge(struct drm_gem_object *obj) static unsigned long panfrost_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc) { - struct panfrost_device *pfdev = - container_of(shrinker, struct panfrost_device, shrinker); + struct panfrost_device *pfdev = shrinker->private_data; struct drm_gem_shmem_object *shmem, *tmp; unsigned long freed = 0; @@ -100,10 +98,15 @@ panfrost_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc) void panfrost_gem_shrinker_init(struct drm_device *dev) { struct panfrost_device *pfdev = dev->dev_private; - pfdev->shrinker.count_objects = panfrost_gem_shrinker_count; - pfdev->shrinker.scan_objects = panfrost_gem_shrinker_scan; - pfdev->shrinker.seeks = DEFAULT_SEEKS; - WARN_ON(register_shrinker(&pfdev->shrinker, "drm-panfrost")); + + pfdev->shrinker = shrinker_alloc_and_init(panfrost_gem_shrinker_count, + panfrost_gem_shrinker_scan, 0, + DEFAULT_SEEKS, 0, pfdev); + if (pfdev->shrinker && + register_shrinker(pfdev->shrinker, "drm-panfrost")) { + shrinker_free(pfdev->shrinker); + WARN_ON(1); + } } /** @@ -116,7 +119,6 @@ void panfrost_gem_shrinker_cleanup(struct drm_device *dev) { struct panfrost_device *pfdev = dev->dev_private; - if (pfdev->shrinker.nr_deferred) { - unregister_shrinker(&pfdev->shrinker); - } + if (pfdev->shrinker->nr_deferred) + unregister_and_free_shrinker(pfdev->shrinker); } From patchwork Thu Jun 22 08:53:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288674 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5D6C0EB64D8 for ; Thu, 22 Jun 2023 08:57:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230323AbjFVI4i (ORCPT ); Thu, 22 Jun 2023 04:56:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36590 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231503AbjFVIzW (ORCPT ); Thu, 22 Jun 2023 04:55:22 -0400 Received: from mail-pf1-x436.google.com (mail-pf1-x436.google.com [IPv6:2607:f8b0:4864:20::436]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B563A213B for ; Thu, 22 Jun 2023 01:54:51 -0700 (PDT) Received: by mail-pf1-x436.google.com with SMTP id d2e1a72fcca58-668842bc50dso866024b3a.1 for ; Thu, 22 Jun 2023 01:54:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424091; x=1690016091; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=161EjO+w9TmzA/a8qJytM8fXp/e7+Gx/R5PIlDH2uAk=; b=VsUY2aigV4ZcD/zqH6R6qf5YjeDjgcBdr5DWNiFL+xj8FWQJXfxqjN+JjtQQqPUTZU y/hvwjl2otPmZYcL1mDMWeSCzXx77cZKx3CoOlix2I89OTP/XqbmazxkdkRcxT8IoFUn 2QEonOktxP7xKSbe3Xt0JkWeJ0vXY7bYmU0ntULjSwRJaXbaN1TAoEufMFnqYH3EWbZ2 epggxRJm4wrl127Vz5WoCbm7/HM2e7h8o7U+VP/mEnC0Icgf0KIm3PI/NPbIYz3ozq3C dpiJUcJ4yXe7UOxEIQJgQOohxTSbyOV1lnJIMgAbmswsZtfNIYimqUElkeBCjB3tvm0/ 8Aig== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424091; x=1690016091; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=161EjO+w9TmzA/a8qJytM8fXp/e7+Gx/R5PIlDH2uAk=; b=OJqR3yKvBoZnWxfeqZ/APrzvz06Ui1trLnwSG92JCvYhGSSap264OIhBJT/rkoC9e9 K8WESBKhhtGc4WJBFCBsWU/MJQ8tycqntgBLkENT9qDyHIWkdiaaTbqMwmxm1pgVyrDV Seek5JaUdR/NeNB2SHhEgJ3GAwZ6EEOS8l+mcqpKXAQBb0Q25Mh5/msn1mCM7LsBKkPe BWkBpptJWYVFWmFYHntJzGr9YX+BFMevf1ETGzgGvsydZqQi8lW5mFp8XCkGAGxDkmIL oVKT56vHytbdX9+mIK/E/t2NbWrXg50V+k3wkkeVKYXHODpbnfJZYDr6rR5msJNQrU0t BS9w== X-Gm-Message-State: AC+VfDzG+68rpXZxd2Tg3eiTPXIq3RyBkDdDLFz0K2BeUdapj4Xsqlfc viqToMJQMKptdhYHNkoAsfhWEA== X-Google-Smtp-Source: ACHHUZ6Zp8ahuhiEYJqMKGH6XvbaL5lpWamc0XC/4KRVYsGWUu1w4+GFAcXsiV5/9t8MrDqgvh99eg== X-Received: by 2002:a17:902:ea01:b0:1b3:e842:40a7 with SMTP id s1-20020a170902ea0100b001b3e84240a7mr20959635plg.1.1687424091188; Thu, 22 Jun 2023 01:54:51 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.54.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:54:50 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 06/29] dm: dynamically allocate the dm-bufio shrinker Date: Thu, 22 Jun 2023 16:53:12 +0800 Message-Id: <20230622085335.77010-7-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org In preparation for implementing lockless slab shrink, we need to dynamically allocate the dm-bufio shrinker, so that it can be freed asynchronously using kfree_rcu(). Then it doesn't need to wait for RCU read-side critical section when releasing the struct dm_bufio_client. Signed-off-by: Qi Zheng --- drivers/md/dm-bufio.c | 23 +++++++++++++---------- 1 file changed, 13 insertions(+), 10 deletions(-) diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c index eea977662e81..9472470d456d 100644 --- a/drivers/md/dm-bufio.c +++ b/drivers/md/dm-bufio.c @@ -963,7 +963,7 @@ struct dm_bufio_client { sector_t start; - struct shrinker shrinker; + struct shrinker *shrinker; struct work_struct shrink_work; atomic_long_t need_shrink; @@ -2385,7 +2385,7 @@ static unsigned long dm_bufio_shrink_scan(struct shrinker *shrink, struct shrink { struct dm_bufio_client *c; - c = container_of(shrink, struct dm_bufio_client, shrinker); + c = shrink->private_data; atomic_long_add(sc->nr_to_scan, &c->need_shrink); queue_work(dm_bufio_wq, &c->shrink_work); @@ -2394,7 +2394,7 @@ static unsigned long dm_bufio_shrink_scan(struct shrinker *shrink, struct shrink static unsigned long dm_bufio_shrink_count(struct shrinker *shrink, struct shrink_control *sc) { - struct dm_bufio_client *c = container_of(shrink, struct dm_bufio_client, shrinker); + struct dm_bufio_client *c = shrink->private_data; unsigned long count = cache_total(&c->cache); unsigned long retain_target = get_retain_buffers(c); unsigned long queued_for_cleanup = atomic_long_read(&c->need_shrink); @@ -2507,14 +2507,15 @@ struct dm_bufio_client *dm_bufio_client_create(struct block_device *bdev, unsign INIT_WORK(&c->shrink_work, shrink_work); atomic_long_set(&c->need_shrink, 0); - c->shrinker.count_objects = dm_bufio_shrink_count; - c->shrinker.scan_objects = dm_bufio_shrink_scan; - c->shrinker.seeks = 1; - c->shrinker.batch = 0; - r = register_shrinker(&c->shrinker, "dm-bufio:(%u:%u)", + c->shrinker = shrinker_alloc_and_init(dm_bufio_shrink_count, + dm_bufio_shrink_scan, 0, 1, 0, c); + if (!c->shrinker) + goto bad; + + r = register_shrinker(c->shrinker, "dm-bufio:(%u:%u)", MAJOR(bdev->bd_dev), MINOR(bdev->bd_dev)); if (r) - goto bad; + goto bad_shrinker; mutex_lock(&dm_bufio_clients_lock); dm_bufio_client_count++; @@ -2524,6 +2525,8 @@ struct dm_bufio_client *dm_bufio_client_create(struct block_device *bdev, unsign return c; +bad_shrinker: + shrinker_free(c->shrinker); bad: while (!list_empty(&c->reserved_buffers)) { struct dm_buffer *b = list_to_buffer(c->reserved_buffers.next); @@ -2554,7 +2557,7 @@ void dm_bufio_client_destroy(struct dm_bufio_client *c) drop_buffers(c); - unregister_shrinker(&c->shrinker); + unregister_and_free_shrinker(c->shrinker); flush_work(&c->shrink_work); mutex_lock(&dm_bufio_clients_lock); From patchwork Thu Jun 22 08:53:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288675 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AE200C001DB for ; Thu, 22 Jun 2023 08:57:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231567AbjFVI5r (ORCPT ); Thu, 22 Jun 2023 04:57:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37578 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231642AbjFVIzg (ORCPT ); Thu, 22 Jun 2023 04:55:36 -0400 Received: from mail-pl1-x636.google.com (mail-pl1-x636.google.com [IPv6:2607:f8b0:4864:20::636]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 886D11FF0 for ; Thu, 22 Jun 2023 01:54:59 -0700 (PDT) Received: by mail-pl1-x636.google.com with SMTP id d9443c01a7336-1b693afe799so2464335ad.1 for ; Thu, 22 Jun 2023 01:54:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424099; x=1690016099; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=8pXD3DmjPUfWk2+jjA+m+4zKKy0fC+8AwUsZhq2lRmk=; b=E+tFDKMSki037PBhfmhvWVloKXWuzU3Ri56xXQW9hCkunzqr3/MhlEZSqiNZ+tNI9q wXEuj2hNk0ysqFsBunC7adB4hsOSpWoOkzyNHyQickBfIH13Ob7zUzdQVkoowJm+gwHf GqLtSyQYvEY7J6FdTyYJ1xVnKRHg63OezBYT/3UycGb6EQo3xYiC+CTrGWJfn3US3UhJ dbGuvQxkFI38s4idyaP5iHfWFNKc7ZWCaOV72DK+NmzkQO6lFrqKljjPnoN3sVHURF0C M8AH8Js6rxjyGoBC/sNAUYyKjN9tqWo3Nk1SQH9TmJx7zV/vF6Qb8YX+sV5g8XxbHf9I nLxQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424099; x=1690016099; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=8pXD3DmjPUfWk2+jjA+m+4zKKy0fC+8AwUsZhq2lRmk=; b=gotM5Rpc7vfAQRA+uiTCQsY9NbhedY9PqtUC+0DSB1zTSZjnCV3rdRJmJfPaRxnQt5 gGpH0Up1SeqTGIVOEYmvJEeY+EHufJCDpcGLIOTNancUW3NDJe2j+BBC/A1AMqCg3GVk ZgJCwdbFGECaWq5dr0mgTAenoOHBUYwtXt3Vzlk0QxOl3p1vHNYeb5QmsVbdbO5WRJSN ZjD7K6L50eD93DfbI/Z12rrjR1vffaioNg73Y+f9o6oSdCNxVtd7q1eNBizKzFCEBtOc 4sVU6uDwAn9I4uaJSfCUkL/YnsSqZeNfY9HZIPLMDoAHFpVxPvNkC+U37EytqqEB8UsT dYoQ== X-Gm-Message-State: AC+VfDyAlRom/YdzvTh5jwJu0HvElYshmgijD2FDPLm1fs96klX5wjZH aqoMtkTkKQvuihDnelxBfFHxwxpn4Y7QzA7Myuw= X-Google-Smtp-Source: ACHHUZ6uW4WhSYRgUY2aQCit4ogoH8R/r57rghu14Eyu6d6WG0AvLb+rcmWHNr1Voea08aO5tIYo3Q== X-Received: by 2002:a17:902:ecc6:b0:1ae:1364:6086 with SMTP id a6-20020a170902ecc600b001ae13646086mr21600250plh.2.1687424099018; Thu, 22 Jun 2023 01:54:59 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.54.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:54:58 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 07/29] dm zoned: dynamically allocate the dm-zoned-meta shrinker Date: Thu, 22 Jun 2023 16:53:13 +0800 Message-Id: <20230622085335.77010-8-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org In preparation for implementing lockless slab shrink, we need to dynamically allocate the dm-zoned-meta shrinker, so that it can be freed asynchronously using kfree_rcu(). Then it doesn't need to wait for RCU read-side critical section when releasing the struct dmz_metadata. Signed-off-by: Qi Zheng --- drivers/md/dm-zoned-metadata.c | 25 ++++++++++++++++--------- 1 file changed, 16 insertions(+), 9 deletions(-) diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c index 9d3cca8e3dc9..41b10ffb968a 100644 --- a/drivers/md/dm-zoned-metadata.c +++ b/drivers/md/dm-zoned-metadata.c @@ -187,7 +187,7 @@ struct dmz_metadata { struct rb_root mblk_rbtree; struct list_head mblk_lru_list; struct list_head mblk_dirty_list; - struct shrinker mblk_shrinker; + struct shrinker *mblk_shrinker; /* Zone allocation management */ struct mutex map_lock; @@ -615,7 +615,7 @@ static unsigned long dmz_shrink_mblock_cache(struct dmz_metadata *zmd, static unsigned long dmz_mblock_shrinker_count(struct shrinker *shrink, struct shrink_control *sc) { - struct dmz_metadata *zmd = container_of(shrink, struct dmz_metadata, mblk_shrinker); + struct dmz_metadata *zmd = shrink->private_data; return atomic_read(&zmd->nr_mblks); } @@ -626,7 +626,7 @@ static unsigned long dmz_mblock_shrinker_count(struct shrinker *shrink, static unsigned long dmz_mblock_shrinker_scan(struct shrinker *shrink, struct shrink_control *sc) { - struct dmz_metadata *zmd = container_of(shrink, struct dmz_metadata, mblk_shrinker); + struct dmz_metadata *zmd = shrink->private_data; unsigned long count; spin_lock(&zmd->mblk_lock); @@ -2936,17 +2936,22 @@ int dmz_ctr_metadata(struct dmz_dev *dev, int num_dev, */ zmd->min_nr_mblks = 2 + zmd->nr_map_blocks + zmd->zone_nr_bitmap_blocks * 16; zmd->max_nr_mblks = zmd->min_nr_mblks + 512; - zmd->mblk_shrinker.count_objects = dmz_mblock_shrinker_count; - zmd->mblk_shrinker.scan_objects = dmz_mblock_shrinker_scan; - zmd->mblk_shrinker.seeks = DEFAULT_SEEKS; + + zmd->mblk_shrinker = shrinker_alloc_and_init(dmz_mblock_shrinker_count, + dmz_mblock_shrinker_scan, + 0, DEFAULT_SEEKS, 0, zmd); + if (!zmd->mblk_shrinker) { + dmz_zmd_err(zmd, "allocate metadata cache shrinker failed"); + goto err; + } /* Metadata cache shrinker */ - ret = register_shrinker(&zmd->mblk_shrinker, "dm-zoned-meta:(%u:%u)", + ret = register_shrinker(zmd->mblk_shrinker, "dm-zoned-meta:(%u:%u)", MAJOR(dev->bdev->bd_dev), MINOR(dev->bdev->bd_dev)); if (ret) { dmz_zmd_err(zmd, "Register metadata cache shrinker failed"); - goto err; + goto err_shrinker; } dmz_zmd_info(zmd, "DM-Zoned metadata version %d", zmd->sb_version); @@ -2982,6 +2987,8 @@ int dmz_ctr_metadata(struct dmz_dev *dev, int num_dev, *metadata = zmd; return 0; +err_shrinker: + shrinker_free(zmd->mblk_shrinker); err: dmz_cleanup_metadata(zmd); kfree(zmd); @@ -2995,7 +3002,7 @@ int dmz_ctr_metadata(struct dmz_dev *dev, int num_dev, */ void dmz_dtr_metadata(struct dmz_metadata *zmd) { - unregister_shrinker(&zmd->mblk_shrinker); + unregister_and_free_shrinker(zmd->mblk_shrinker); dmz_cleanup_metadata(zmd); kfree(zmd); } From patchwork Thu Jun 22 08:53:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288676 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 590A6C001DE for ; Thu, 22 Jun 2023 08:57:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231577AbjFVI5s (ORCPT ); Thu, 22 Jun 2023 04:57:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37798 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231578AbjFVIzl (ORCPT ); Thu, 22 Jun 2023 04:55:41 -0400 Received: from mail-pg1-x534.google.com (mail-pg1-x534.google.com [IPv6:2607:f8b0:4864:20::534]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 714602680 for ; Thu, 22 Jun 2023 01:55:07 -0700 (PDT) Received: by mail-pg1-x534.google.com with SMTP id 41be03b00d2f7-54fd4a7ce25so1096065a12.0 for ; Thu, 22 Jun 2023 01:55:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424107; x=1690016107; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=VJl14vOz0XQFz+azMteX8g8XWQLmhD27loPyzQxCyWo=; b=dSY63dBGkLY8Soj8dXVS98dEuYYMZ3kYNqwn6f5//glPXLQFLJz27NFKmb5CQWTM2k APxbCFally5Ye2wcKhRjXEzkXpY2MMujAr7sHxpWVpY6Tw+UW8TdBughq+A6CDzAWoWy +AmwZXYTUtj1EVbssFnJPqE+UVXszSk7I45NYzxDje6ZfgHjeGWZHMg6aUv73LqWwWwW 5itKACc99wkxtDt6xsJWBL2urtctKzIlWYA5VnKnCf1Z8lrwitknq6fV+Y7yJzzApX1O ankK3WiQYH84OGQwTOkmZRg+avhtQPcT8KVvO5N8KslMNgzr0qkBoxHulDKJPx12AE+X /x4Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424107; x=1690016107; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=VJl14vOz0XQFz+azMteX8g8XWQLmhD27loPyzQxCyWo=; b=FFyJOPBCjKamg2AJa6z4730T++2/Ti6AmkbU3GG7nLwMWJD5j7GU6RgLsmE5Ls96wD SZIt+lRDQYE+1QwnWOnbJ5s19EixVlpCc+uorvubp//feuG3HdalJd2IAWWLMAWWIviv /xNmPjkndpx7sBd+j5A2ybMteqHVLugvoJJN80/8ZW1BKHPdOSc7EjSK2MkqTZcSzf9B B+TxXikVWVjyAkjLJQFfYDL+NgyGpQKbF0J1+jqGwk26LPkN64PV4uzGUXorE+cqSDFj SRkYpGPePa7WWWuMNrgefXUnpkP/4r8tIp8z/xd0Yxz3JRDhLQRyBAIDTR1I18XIAQv/ y5cQ== X-Gm-Message-State: AC+VfDzFKy1TI23Ae9+fKQttJgYXbjeN1xoKMI3fJG4R/bZeY17LoSNX /dMZmgAw8mYrGC2JF/PEZuZUAA== X-Google-Smtp-Source: ACHHUZ4GugeqE/IaG8xLwh4Bzf5woebFV8d7i9CEhVh9D6pBJJ8ySrAWkjhYmHqxBxnT9m8efiqyXQ== X-Received: by 2002:a17:902:c945:b0:1ac:40f7:8b5a with SMTP id i5-20020a170902c94500b001ac40f78b5amr21000858pla.3.1687424106926; Thu, 22 Jun 2023 01:55:06 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.54.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:55:06 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 08/29] md/raid5: dynamically allocate the md-raid5 shrinker Date: Thu, 22 Jun 2023 16:53:14 +0800 Message-Id: <20230622085335.77010-9-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org In preparation for implementing lockless slab shrink, we need to dynamically allocate the md-raid5 shrinker, so that it can be freed asynchronously using kfree_rcu(). Then it doesn't need to wait for RCU read-side critical section when releasing the struct r5conf. Signed-off-by: Qi Zheng --- drivers/md/raid5.c | 28 +++++++++++++++++----------- drivers/md/raid5.h | 2 +- 2 files changed, 18 insertions(+), 12 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index f4eea1bbbeaf..4866cad1ad62 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -7391,7 +7391,7 @@ static void free_conf(struct r5conf *conf) log_exit(conf); - unregister_shrinker(&conf->shrinker); + unregister_and_free_shrinker(conf->shrinker); free_thread_groups(conf); shrink_stripes(conf); raid5_free_percpu(conf); @@ -7439,7 +7439,7 @@ static int raid5_alloc_percpu(struct r5conf *conf) static unsigned long raid5_cache_scan(struct shrinker *shrink, struct shrink_control *sc) { - struct r5conf *conf = container_of(shrink, struct r5conf, shrinker); + struct r5conf *conf = shrink->private_data; unsigned long ret = SHRINK_STOP; if (mutex_trylock(&conf->cache_size_mutex)) { @@ -7460,7 +7460,7 @@ static unsigned long raid5_cache_scan(struct shrinker *shrink, static unsigned long raid5_cache_count(struct shrinker *shrink, struct shrink_control *sc) { - struct r5conf *conf = container_of(shrink, struct r5conf, shrinker); + struct r5conf *conf = shrink->private_data; if (conf->max_nr_stripes < conf->min_nr_stripes) /* unlikely, but not impossible */ @@ -7695,16 +7695,21 @@ static struct r5conf *setup_conf(struct mddev *mddev) * it reduces the queue depth and so can hurt throughput. * So set it rather large, scaled by number of devices. */ - conf->shrinker.seeks = DEFAULT_SEEKS * conf->raid_disks * 4; - conf->shrinker.scan_objects = raid5_cache_scan; - conf->shrinker.count_objects = raid5_cache_count; - conf->shrinker.batch = 128; - conf->shrinker.flags = 0; - ret = register_shrinker(&conf->shrinker, "md-raid5:%s", mdname(mddev)); + conf->shrinker = shrinker_alloc_and_init(raid5_cache_count, + raid5_cache_scan, 128, + DEFAULT_SEEKS * conf->raid_disks * 4, + 0, conf); + if (!conf->shrinker) { + pr_warn("md/raid:%s: couldn't allocate shrinker.\n", + mdname(mddev)); + goto abort; + } + + ret = register_shrinker(conf->shrinker, "md-raid5:%s", mdname(mddev)); if (ret) { pr_warn("md/raid:%s: couldn't register shrinker.\n", mdname(mddev)); - goto abort; + goto abort_shrinker; } sprintf(pers_name, "raid%d", mddev->new_level); @@ -7717,7 +7722,8 @@ static struct r5conf *setup_conf(struct mddev *mddev) } return conf; - +abort_shrinker: + shrinker_free(conf->shrinker); abort: if (conf) free_conf(conf); diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h index 6a92fafb0748..806f84681599 100644 --- a/drivers/md/raid5.h +++ b/drivers/md/raid5.h @@ -670,7 +670,7 @@ struct r5conf { wait_queue_head_t wait_for_stripe; wait_queue_head_t wait_for_overlap; unsigned long cache_state; - struct shrinker shrinker; + struct shrinker *shrinker; int pool_size; /* number of disks in stripeheads in pool */ spinlock_t device_lock; struct disk_info *disks; From patchwork Thu Jun 22 08:53:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288677 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6F3CCEB64DA for ; Thu, 22 Jun 2023 08:57:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230172AbjFVI5x (ORCPT ); Thu, 22 Jun 2023 04:57:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37654 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231643AbjFVIzx (ORCPT ); Thu, 22 Jun 2023 04:55:53 -0400 Received: from mail-pl1-x630.google.com (mail-pl1-x630.google.com [IPv6:2607:f8b0:4864:20::630]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5EB712693 for ; Thu, 22 Jun 2023 01:55:15 -0700 (PDT) Received: by mail-pl1-x630.google.com with SMTP id d9443c01a7336-1b5585e84b4so6827385ad.0 for ; Thu, 22 Jun 2023 01:55:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424115; x=1690016115; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=9Gb8UpD3lZt4bh5cbhFKOyQevww2sfofkVlmbPNwxg8=; b=D5LVC+4HokJCz4FPeXcY+dsaTssife3Qqo2QFad+aTpXncu7uEYIb5Esgo0X9193yq JD9ShpMbmoNV55hSp3a3PNB6eqKE4aEEgjXjfXMhn0CWCq43W/gk5ji4OA6QFiz5YEbm Of780eCZQIl3oU3nzqHMjitXTfpqsvwkQAoJ3/aSYN48pFnlXdDicr+5TJONhzqSsJ8t IgcwK0tiB9rdlMicFxhfFEkVC1N5FWb0hV5n3c+iSTxmBlXa0ZlHHEAwcq0BLcrHgmll f7y7d2fR0/LzWuGi/rCVXp+lACayEzs+7pdEprt9XcBuFLw9Vycb7reaOybOjmFS0TWh I1Bg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424115; x=1690016115; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=9Gb8UpD3lZt4bh5cbhFKOyQevww2sfofkVlmbPNwxg8=; b=F0GHFXj8+JHFnFMHZBWpVpXnhM099UC722Nxp9Tm9XSinAyqkepQhCycpFHTT6Se/E quAXZYsitY6USacCTUTaUqFvIo9u/edA5ODPu7lv5A4s/JZpvJnLsZZOooCkwFHdXxu3 fBH5pzWctGhP/ck1H0QqiWCXZ96MkECauH1mfU7A6ouweQJta8u/GTOWtf67G+11/gH5 jIP55LIknLOlxQe16OMPVgJOss/7EKgYvuhALIMkr6814N8k+Cr87UoUH4uPZ28tEXTp U6N+G4y3NcxV04b0e8NrPoZkxRqT+zyGotkZ9mXOnRO+zA6axwmAOgYKH7PIULicfWyc A/Qg== X-Gm-Message-State: AC+VfDzkoVsZcBoQ13qcQexxYAAc7NTymBSRdat+V/EKhvib5wSY2KKz 9WLgzjR/lTIclpl+tQ/UMG4ing== X-Google-Smtp-Source: ACHHUZ4eMYIbb4I7cyZSyRLa7A39+u7jKZFqr72/TYsRg3oKUy2dca/PQ8JGq6DHBe5p8nrFI+JLcw== X-Received: by 2002:a17:903:2451:b0:1b0:34c6:3bf2 with SMTP id l17-20020a170903245100b001b034c63bf2mr21539426pls.5.1687424114850; Thu, 22 Jun 2023 01:55:14 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.55.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:55:14 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 09/29] bcache: dynamically allocate the md-bcache shrinker Date: Thu, 22 Jun 2023 16:53:15 +0800 Message-Id: <20230622085335.77010-10-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org In preparation for implementing lockless slab shrink, we need to dynamically allocate the md-bcache shrinker, so that it can be freed asynchronously using kfree_rcu(). Then it doesn't need to wait for RCU read-side critical section when releasing the struct cache_set. Signed-off-by: Qi Zheng --- drivers/md/bcache/bcache.h | 2 +- drivers/md/bcache/btree.c | 23 ++++++++++++++--------- drivers/md/bcache/sysfs.c | 2 +- 3 files changed, 16 insertions(+), 11 deletions(-) diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h index 700dc5588d5f..53c73b372e7a 100644 --- a/drivers/md/bcache/bcache.h +++ b/drivers/md/bcache/bcache.h @@ -541,7 +541,7 @@ struct cache_set { struct bio_set bio_split; /* For the btree cache */ - struct shrinker shrink; + struct shrinker *shrink; /* For the btree cache and anything allocation related */ struct mutex bucket_lock; diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c index 569f48958bde..1131ae91f62a 100644 --- a/drivers/md/bcache/btree.c +++ b/drivers/md/bcache/btree.c @@ -667,7 +667,7 @@ static int mca_reap(struct btree *b, unsigned int min_order, bool flush) static unsigned long bch_mca_scan(struct shrinker *shrink, struct shrink_control *sc) { - struct cache_set *c = container_of(shrink, struct cache_set, shrink); + struct cache_set *c = shrink->private_data; struct btree *b, *t; unsigned long i, nr = sc->nr_to_scan; unsigned long freed = 0; @@ -734,7 +734,7 @@ static unsigned long bch_mca_scan(struct shrinker *shrink, static unsigned long bch_mca_count(struct shrinker *shrink, struct shrink_control *sc) { - struct cache_set *c = container_of(shrink, struct cache_set, shrink); + struct cache_set *c = shrink->private_data; if (c->shrinker_disabled) return 0; @@ -752,8 +752,8 @@ void bch_btree_cache_free(struct cache_set *c) closure_init_stack(&cl); - if (c->shrink.list.next) - unregister_shrinker(&c->shrink); + if (c->shrink->list.next) + unregister_and_free_shrinker(c->shrink); mutex_lock(&c->bucket_lock); @@ -828,14 +828,19 @@ int bch_btree_cache_alloc(struct cache_set *c) c->verify_data = NULL; #endif - c->shrink.count_objects = bch_mca_count; - c->shrink.scan_objects = bch_mca_scan; - c->shrink.seeks = 4; - c->shrink.batch = c->btree_pages * 2; + c->shrink = shrinker_alloc_and_init(bch_mca_count, bch_mca_scan, + c->btree_pages * 2, 4, 0, c); + if (!c->shrink) { + pr_warn("bcache: %s: could not allocate shrinker\n", + __func__); + return -ENOMEM; + } - if (register_shrinker(&c->shrink, "md-bcache:%pU", c->set_uuid)) + if (register_shrinker(c->shrink, "md-bcache:%pU", c->set_uuid)) { pr_warn("bcache: %s: could not register shrinker\n", __func__); + shrinker_free(c->shrink); + } return 0; } diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c index c6f677059214..771577581f52 100644 --- a/drivers/md/bcache/sysfs.c +++ b/drivers/md/bcache/sysfs.c @@ -866,7 +866,7 @@ STORE(__bch_cache_set) sc.gfp_mask = GFP_KERNEL; sc.nr_to_scan = strtoul_or_return(buf); - c->shrink.scan_objects(&c->shrink, &sc); + c->shrink->scan_objects(c->shrink, &sc); } sysfs_strtoul_clamp(congested_read_threshold_us, From patchwork Thu Jun 22 08:53:16 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288678 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B7851EB64DC for ; Thu, 22 Jun 2023 08:57:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231617AbjFVI5y (ORCPT ); Thu, 22 Jun 2023 04:57:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37032 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231258AbjFVI5D (ORCPT ); Thu, 22 Jun 2023 04:57:03 -0400 Received: from mail-pl1-x632.google.com (mail-pl1-x632.google.com [IPv6:2607:f8b0:4864:20::632]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B150126AD for ; Thu, 22 Jun 2023 01:55:23 -0700 (PDT) Received: by mail-pl1-x632.google.com with SMTP id d9443c01a7336-1b5466bc5f8so9653975ad.1 for ; Thu, 22 Jun 2023 01:55:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424123; x=1690016123; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=u/vExFBLW3qTniAw4XeP9KQ69i9zow/4wwrpInqlwjM=; b=f9niG/1nDZCnLyGAdmXOPnr0UhYRgdnlIknKHTp5fpAvQQctnSzqFRWkCf20Ww177d QG9jeeQWMuTex/IwaFDMA8vXlH6fpJPO6Zn2dpmhMfbdqmxsLU2UIGO2NDOzTGYWiDw7 ywIPFwnvUTjEHuZeccmcjrjEkLsXSYquHahbHd6ZdjnzqEP2RHdSMNMWgiY9L8TVZ0lC G4aVTSqjJ5NrWtnVCJTDf1mbR/bEFK3aai4gwzFQmPA34Ikuzxs8i6IqNfr5Nxc5N0yO /Hq8+LGyQI4XHPCqHbrNX4k2UTVR4h1zk7vD27aNaHUnKMeOkCqJba/XQtzAGJZToHzN Aj7A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424123; x=1690016123; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=u/vExFBLW3qTniAw4XeP9KQ69i9zow/4wwrpInqlwjM=; b=e/WuZWtL5wZoRlG0hYo+ltQcivoq+lvdZHvTKhbsEGnpSQWOJXUtFbwo9JYI3C64px 2FWeyPfU9VzvN2q/yqxzIcbdpmDt0VABWZu4lgufPAspbH77cugtZh0NHoEsQMIZdrpc LCAQK1FkEpxRQNzgR1PBscYdLtsQQejNkz2S73pEHEGATJkJCzPEbBF2SiNkifvZJ+6l NU5yyGi2BKdr52kZJ60FAEksTrqtxSbpYhDJQmkFPhBm8NmDNuQFkidObqeDKJmS4b5v n+CBfBnZdfpHRmEczOPYIAdiXqzpMSVG+o2dGbtw0182DYKFszbeUWCZfYmoXxgnrtRM EwcA== X-Gm-Message-State: AC+VfDwb8OnAMQ53L66l9cxpenKi/Hn5rWx/SM1fqiwKcHXqzQ/PnLhk bJl/btEml+qnCUzcJxrign25PA== X-Google-Smtp-Source: ACHHUZ4R6XsCch8DMbBn2ilzIIAPskFMWcDdECndDXpbpwroXAZTFBoXbn8TaLTiXdt1Xz94/St+OA== X-Received: by 2002:a17:902:dac6:b0:1a1:956d:2281 with SMTP id q6-20020a170902dac600b001a1956d2281mr22035085plx.3.1687424123198; Thu, 22 Jun 2023 01:55:23 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.55.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:55:22 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 10/29] vmw_balloon: dynamically allocate the vmw-balloon shrinker Date: Thu, 22 Jun 2023 16:53:16 +0800 Message-Id: <20230622085335.77010-11-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org In preparation for implementing lockless slab shrink, we need to dynamically allocate the vmw-balloon shrinker, so that it can be freed asynchronously using kfree_rcu(). Then it doesn't need to wait for RCU read-side critical section when releasing the struct vmballoon. Signed-off-by: Qi Zheng --- drivers/misc/vmw_balloon.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c index 9ce9b9e0e9b6..2f86f666b476 100644 --- a/drivers/misc/vmw_balloon.c +++ b/drivers/misc/vmw_balloon.c @@ -380,7 +380,7 @@ struct vmballoon { /** * @shrinker: shrinker interface that is used to avoid over-inflation. */ - struct shrinker shrinker; + struct shrinker *shrinker; /** * @shrinker_registered: whether the shrinker was registered. @@ -1569,7 +1569,7 @@ static unsigned long vmballoon_shrinker_count(struct shrinker *shrinker, static void vmballoon_unregister_shrinker(struct vmballoon *b) { if (b->shrinker_registered) - unregister_shrinker(&b->shrinker); + unregister_and_free_shrinker(b->shrinker); b->shrinker_registered = false; } @@ -1581,14 +1581,18 @@ static int vmballoon_register_shrinker(struct vmballoon *b) if (!vmwballoon_shrinker_enable) return 0; - b->shrinker.scan_objects = vmballoon_shrinker_scan; - b->shrinker.count_objects = vmballoon_shrinker_count; - b->shrinker.seeks = DEFAULT_SEEKS; + b->shrinker = shrinker_alloc_and_init(vmballoon_shrinker_count, + vmballoon_shrinker_scan, + 0, DEFAULT_SEEKS, 0, b); + if (!b->shrinker) + return -ENOMEM; - r = register_shrinker(&b->shrinker, "vmw-balloon"); + r = register_shrinker(b->shrinker, "vmw-balloon"); if (r == 0) b->shrinker_registered = true; + else + shrinker_free(b->shrinker); return r; } From patchwork Thu Jun 22 08:53:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288679 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EFCA6EB64D8 for ; Thu, 22 Jun 2023 08:58:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231708AbjFVI6t (ORCPT ); Thu, 22 Jun 2023 04:58:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42330 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231480AbjFVI6B (ORCPT ); Thu, 22 Jun 2023 04:58:01 -0400 Received: from mail-pl1-x62b.google.com (mail-pl1-x62b.google.com [IPv6:2607:f8b0:4864:20::62b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2DE4626BC for ; Thu, 22 Jun 2023 01:55:32 -0700 (PDT) Received: by mail-pl1-x62b.google.com with SMTP id d9443c01a7336-1b5585e84b4so6827715ad.0 for ; Thu, 22 Jun 2023 01:55:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424131; x=1690016131; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Rw+THIB7nYtdEoKFrcRpf+DtRSemfk8nQmXyWQCWLOU=; b=DMwdTIl9g+okPZHJv8qTrb+gYAPEj9v1sxcSDlzK5ch7N+TuIC/BcFHKohEZo55ND8 XU29tOhBmT8qLwPtzV/wEHQ6iOqaUqSy8YTL9GNxgMKyLojaih+JCKFRV8sA50VDaMhL 4ENhUxcmuP8buO31FrzpXihlkiHbj7xjQ5jQCZ/XAAfjoNJ005ZfE8KQlMXJnunKzCcO Fe2zvwkJ/uki5/MCqM+WcH9ckA1bFEYVaXIUmwP+4s7h2vpQxpmZaq/OfODYIqbWYJLH gNO1EGlemqhIu8SDpgbpJo6kXONx1A4ErgZXcMZMC3KoLi5/PQ7Kas2P+1tIIIXAXu47 3GiA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424131; x=1690016131; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Rw+THIB7nYtdEoKFrcRpf+DtRSemfk8nQmXyWQCWLOU=; b=ICaAe/1zo9DeOLlwHbq+fPZiSHMIaII0TPPJcTCSBfQVPAq2sEEKzvwHNdKfUNTakk NcADxfd1+TeWBJw/VeKrQdWhqB2WaKzMs2d7zktJMhNp+A7bNXalsgUaD1z496Bth0MA kWN0s0jGvb0l/pK4y0NtIwPa/KXiasEb1sN+4YVSMyW0uDvftM/ruSZ9JOYDL80TJI8Y LFs5iVtCjsvfvyhV4WpN3XnzVl0lUsXfpTwZtXL/zPJRrqCOJFb7Nv2EkggocoHil1kX ke1QOxVhUbRzv/QJbyb6qv9bsOQEB3F8dILE+KQIe1f2xTMXiV0w4kl2sh/Nw6pmgP6G /MBQ== X-Gm-Message-State: AC+VfDwTqu4aWmCQeTUfOjWkHC7pbq2tconD1pJsnJjodZmjv3bQupjO 1U9F5EK5OQbwHI2xPgwjHE4XzQ== X-Google-Smtp-Source: ACHHUZ74w6zUyxxONmkdhWFpbjqcQv1Ri8s1BW02LunWtr1K9sXtmODVbQHvvX6jLBNiniDuaX5mmA== X-Received: by 2002:a17:902:d489:b0:1b4:ddef:841e with SMTP id c9-20020a170902d48900b001b4ddef841emr21417877plg.4.1687424131373; Thu, 22 Jun 2023 01:55:31 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.55.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:55:31 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 11/29] virtio_balloon: dynamically allocate the virtio-balloon shrinker Date: Thu, 22 Jun 2023 16:53:17 +0800 Message-Id: <20230622085335.77010-12-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org In preparation for implementing lockless slab shrink, we need to dynamically allocate the virtio-balloon shrinker, so that it can be freed asynchronously using kfree_rcu(). Then it doesn't need to wait for RCU read-side critical section when releasing the struct virtio_balloon. Signed-off-by: Qi Zheng --- drivers/virtio/virtio_balloon.c | 26 ++++++++++++++++---------- 1 file changed, 16 insertions(+), 10 deletions(-) diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c index 5b15936a5214..fa051bff8d90 100644 --- a/drivers/virtio/virtio_balloon.c +++ b/drivers/virtio/virtio_balloon.c @@ -111,7 +111,7 @@ struct virtio_balloon { struct virtio_balloon_stat stats[VIRTIO_BALLOON_S_NR]; /* Shrinker to return free pages - VIRTIO_BALLOON_F_FREE_PAGE_HINT */ - struct shrinker shrinker; + struct shrinker *shrinker; /* OOM notifier to deflate on OOM - VIRTIO_BALLOON_F_DEFLATE_ON_OOM */ struct notifier_block oom_nb; @@ -816,8 +816,7 @@ static unsigned long shrink_free_pages(struct virtio_balloon *vb, static unsigned long virtio_balloon_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc) { - struct virtio_balloon *vb = container_of(shrinker, - struct virtio_balloon, shrinker); + struct virtio_balloon *vb = shrinker->private_data; return shrink_free_pages(vb, sc->nr_to_scan); } @@ -825,8 +824,7 @@ static unsigned long virtio_balloon_shrinker_scan(struct shrinker *shrinker, static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc) { - struct virtio_balloon *vb = container_of(shrinker, - struct virtio_balloon, shrinker); + struct virtio_balloon *vb = shrinker->private_data; return vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES; } @@ -847,16 +845,24 @@ static int virtio_balloon_oom_notify(struct notifier_block *nb, static void virtio_balloon_unregister_shrinker(struct virtio_balloon *vb) { - unregister_shrinker(&vb->shrinker); + unregister_and_free_shrinker(vb->shrinker); } static int virtio_balloon_register_shrinker(struct virtio_balloon *vb) { - vb->shrinker.scan_objects = virtio_balloon_shrinker_scan; - vb->shrinker.count_objects = virtio_balloon_shrinker_count; - vb->shrinker.seeks = DEFAULT_SEEKS; + int ret; + + vb->shrinker = shrinker_alloc_and_init(virtio_balloon_shrinker_count, + virtio_balloon_shrinker_scan, + 0, DEFAULT_SEEKS, 0, vb); + if (!vb->shrinker) + return -ENOMEM; + + ret = register_shrinker(vb->shrinker, "virtio-balloon"); + if (ret) + shrinker_free(vb->shrinker); - return register_shrinker(&vb->shrinker, "virtio-balloon"); + return ret; } static int virtballoon_probe(struct virtio_device *vdev) From patchwork Thu Jun 22 08:53:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288680 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8F108C001B3 for ; Thu, 22 Jun 2023 08:58:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231723AbjFVI6w (ORCPT ); Thu, 22 Jun 2023 04:58:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38326 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231625AbjFVI6J (ORCPT ); Thu, 22 Jun 2023 04:58:09 -0400 Received: from mail-pl1-x636.google.com (mail-pl1-x636.google.com [IPv6:2607:f8b0:4864:20::636]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AA6FB2713 for ; Thu, 22 Jun 2023 01:55:40 -0700 (PDT) Received: by mail-pl1-x636.google.com with SMTP id d9443c01a7336-1b693afe799so2465025ad.1 for ; Thu, 22 Jun 2023 01:55:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424140; x=1690016140; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Qqb5b2GNvTgVTRVLR7iEMYBQ9KFY39GKHYb23c1SNH4=; b=DTUuH0I9uHLii3LJ9Zyuu9c+EPbxYZC20xJ5+3ma+Kp5qpBNFOFTSMnclhdsEVCqLi IhMw89U+Z8kUvjOX0p1/U1TPShfVW/vJXGuaCCTRG1bGCD9nXSC4/ZiagSGalcNeTARf jURjtWkQSw5PrzWP1wBPTMrMWiplRo47rrBJgWE68ZF11RySU51PM2cKEbziuuvrByZZ Zy6z/i7b8JtNNsvUGe9g7/sl4BzjGsT0dMzcCmwPn8WMxyE8EEE7gXWjStqzol2ehPl/ 69+rSZ9IFQ8XiOW8vY+908jjX5pcw5aN354YWworQkitYsT+fnzRPZItmEL7YHvgl8Dv 6USw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424140; x=1690016140; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Qqb5b2GNvTgVTRVLR7iEMYBQ9KFY39GKHYb23c1SNH4=; b=iuAGazHM1JtubrfW0AOOPzol81kHRw4ylEeMfto5jR8Q+zrSUcnGoIbXYnVf7/NjKr r50JWh3LXYWo2FQyYEoyD9LF60VKO5FlmEVg/9g/QvO2ORE//S4AAzha27GGlXjDuCBV yaXWWKP+uKzDKJ9nLggC1BsbDFL2sBVlfGsSU2crm5Q/mtMdzkyEJqvcfduy+FoFktqY Z+AlBenQNCx5GVbF93Do3gE+CII7sd/4x73r3M2tbpwrH8rK1K7GgYnBCGPkG8TTJ7KB mIVU/kMCEyveEPvXYNL5mSqPZz5SswbCjo6guocbs5LI2jgOsEsexyz0DGs20PpEbEhi xKKQ== X-Gm-Message-State: AC+VfDz1mOedmtpgsf/aOrS3EzkMKI4JCVe259hOHAIrNB0Q2f4EZITK frVSP3WZNQ0EvY53S3J/1QdHUQ== X-Google-Smtp-Source: ACHHUZ4U5RlW8/2AChRXKukLEKek4tD/ljCLASAoWCRPvf/eu3Wdfis/k+W/YQGMZOvsRYh9ePd5+g== X-Received: by 2002:a17:902:d509:b0:1af:a349:3f31 with SMTP id b9-20020a170902d50900b001afa3493f31mr21732029plg.3.1687424140189; Thu, 22 Jun 2023 01:55:40 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.55.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:55:39 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 12/29] mbcache: dynamically allocate the mbcache shrinker Date: Thu, 22 Jun 2023 16:53:18 +0800 Message-Id: <20230622085335.77010-13-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org In preparation for implementing lockless slab shrink, we need to dynamically allocate the mbcache shrinker, so that it can be freed asynchronously using kfree_rcu(). Then it doesn't need to wait for RCU read-side critical section when releasing the struct mb_cache. Signed-off-by: Qi Zheng --- fs/mbcache.c | 39 +++++++++++++++++++++------------------ 1 file changed, 21 insertions(+), 18 deletions(-) diff --git a/fs/mbcache.c b/fs/mbcache.c index 2a4b8b549e93..fec393e55a66 100644 --- a/fs/mbcache.c +++ b/fs/mbcache.c @@ -37,7 +37,7 @@ struct mb_cache { struct list_head c_list; /* Number of entries in cache */ unsigned long c_entry_count; - struct shrinker c_shrink; + struct shrinker *c_shrink; /* Work for shrinking when the cache has too many entries */ struct work_struct c_shrink_work; }; @@ -293,8 +293,7 @@ EXPORT_SYMBOL(mb_cache_entry_touch); static unsigned long mb_cache_count(struct shrinker *shrink, struct shrink_control *sc) { - struct mb_cache *cache = container_of(shrink, struct mb_cache, - c_shrink); + struct mb_cache *cache = shrink->private_data; return cache->c_entry_count; } @@ -333,8 +332,8 @@ static unsigned long mb_cache_shrink(struct mb_cache *cache, static unsigned long mb_cache_scan(struct shrinker *shrink, struct shrink_control *sc) { - struct mb_cache *cache = container_of(shrink, struct mb_cache, - c_shrink); + struct mb_cache *cache = shrink->private_data; + return mb_cache_shrink(cache, sc->nr_to_scan); } @@ -370,26 +369,30 @@ struct mb_cache *mb_cache_create(int bucket_bits) cache->c_hash = kmalloc_array(bucket_count, sizeof(struct hlist_bl_head), GFP_KERNEL); - if (!cache->c_hash) { - kfree(cache); - goto err_out; - } + if (!cache->c_hash) + goto err_c_hash; + for (i = 0; i < bucket_count; i++) INIT_HLIST_BL_HEAD(&cache->c_hash[i]); - cache->c_shrink.count_objects = mb_cache_count; - cache->c_shrink.scan_objects = mb_cache_scan; - cache->c_shrink.seeks = DEFAULT_SEEKS; - if (register_shrinker(&cache->c_shrink, "mbcache-shrinker")) { - kfree(cache->c_hash); - kfree(cache); - goto err_out; - } + cache->c_shrink = shrinker_alloc_and_init(mb_cache_count, mb_cache_scan, + 0, DEFAULT_SEEKS, 0, cache); + if (!cache->c_shrink) + goto err_shrinker; + + if (register_shrinker(cache->c_shrink, "mbcache-shrinker")) + goto err_register; INIT_WORK(&cache->c_shrink_work, mb_cache_shrink_worker); return cache; +err_register: + shrinker_free(cache->c_shrink); +err_shrinker: + kfree(cache->c_hash); +err_c_hash: + kfree(cache); err_out: return NULL; } @@ -406,7 +409,7 @@ void mb_cache_destroy(struct mb_cache *cache) { struct mb_cache_entry *entry, *next; - unregister_shrinker(&cache->c_shrink); + unregister_and_free_shrinker(cache->c_shrink); /* * We don't bother with any locking. Cache must not be used at this From patchwork Thu Jun 22 08:53:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288681 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 86F01EB64DB for ; Thu, 22 Jun 2023 08:59:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230088AbjFVI7h (ORCPT ); Thu, 22 Jun 2023 04:59:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43152 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231713AbjFVI6u (ORCPT ); Thu, 22 Jun 2023 04:58:50 -0400 Received: from mail-pg1-x535.google.com (mail-pg1-x535.google.com [IPv6:2607:f8b0:4864:20::535]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9ECAC2949 for ; Thu, 22 Jun 2023 01:55:49 -0700 (PDT) Received: by mail-pg1-x535.google.com with SMTP id 41be03b00d2f7-51f64817809so680065a12.1 for ; Thu, 22 Jun 2023 01:55:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424148; x=1690016148; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=eFTuD75yazkbkprx1nDK+J2a+gsKcpdTH5uLLFxlsCE=; b=YL01pLqxeJdJ3yx7vlwGwrScHLp900A8hEE3Wf+RWJNlhx4L8kHsp9yLDKLvMKBCnC 3rsy/Bt3MovGZd9xDykjB2GgcKTKrjV0UKR/p/LUcqeAgCcK1oeyqL4lKiu0nVi2NGBN TaYF2c0QLLlLRjtauaahzpKEMrC5bmEQ14igH5gpZVZzZ5PZmJEXxyLsPSab+WZLJtP7 soRQ7v+HKrW5nh3UOzeGEOAr+30lPAAddvMLxo2KnZ7MK/KtFrltdlmvaznhvWw7dzRl oronsxPEPE0DLPq3etiyjPQ96cDhygWHhtXaurPBhnnq3MU+faq6ga287Qti+jNFsnPy uaEQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424148; x=1690016148; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=eFTuD75yazkbkprx1nDK+J2a+gsKcpdTH5uLLFxlsCE=; b=f5HzeEsKHTLUvswCnHYft903+ZQTSByinPhtwgu7yRQjVZgspEo/lk75QSplR8aCQS Dqdi45nqnecJBtW76kuKduOJMbmXGSVv5FggvF14C0i8H1uwvpWBwJuLM8UCZ/t4SGc0 aSCMl1DsJymDZup1bqTU9Gv0OshPVWzlfMp/VE44HVl1n98IiirPlyqxPK+bRF6lxs9S bLz2qHmuwe2cHjhTvyjveptWRo7m09N4HvfnUSeFExBCA2U3ipwWdYSUxoz92TUWzuQm TGjVTQrhZThqn0Pl3Kg7P1ElT2k2IiSdJIHdf/duwO5BqAJPXNvIbNiJpl2EFZIHkJ77 7ywQ== X-Gm-Message-State: AC+VfDwsdkJG+XJE94JfWybLI6AKFKMcpPBWNI3RUHB2J52NffLBczV2 22gQuR3mWSkBP/+gVAcIO0yt1A== X-Google-Smtp-Source: ACHHUZ6jVuCzKKZsdSS1vKWkjhi+wc7lz2Kfe1cJ6l4zUout3MX707Lbn9nWcxBb7Ur8w7srUJ3z+g== X-Received: by 2002:a17:903:2451:b0:1b0:34c6:3bf2 with SMTP id l17-20020a170903245100b001b034c63bf2mr21540496pls.5.1687424148210; Thu, 22 Jun 2023 01:55:48 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.55.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:55:47 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 13/29] ext4: dynamically allocate the ext4-es shrinker Date: Thu, 22 Jun 2023 16:53:19 +0800 Message-Id: <20230622085335.77010-14-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org In preparation for implementing lockless slab shrink, we need to dynamically allocate the ext4-es shrinker, so that it can be freed asynchronously using kfree_rcu(). Then it doesn't need to wait for RCU read-side critical section when releasing the struct ext4_sb_info. Signed-off-by: Qi Zheng --- fs/ext4/ext4.h | 2 +- fs/ext4/extents_status.c | 21 ++++++++++++--------- 2 files changed, 13 insertions(+), 10 deletions(-) diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 0a2d55faa095..1bd150d454f5 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -1651,7 +1651,7 @@ struct ext4_sb_info { __u32 s_csum_seed; /* Reclaim extents from extent status tree */ - struct shrinker s_es_shrinker; + struct shrinker *s_es_shrinker; struct list_head s_es_list; /* List of inodes with reclaimable extents */ long s_es_nr_inode; struct ext4_es_stats s_es_stats; diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c index 9b5b8951afb4..fea82339f4b4 100644 --- a/fs/ext4/extents_status.c +++ b/fs/ext4/extents_status.c @@ -1596,7 +1596,7 @@ static unsigned long ext4_es_count(struct shrinker *shrink, unsigned long nr; struct ext4_sb_info *sbi; - sbi = container_of(shrink, struct ext4_sb_info, s_es_shrinker); + sbi = shrink->private_data; nr = percpu_counter_read_positive(&sbi->s_es_stats.es_stats_shk_cnt); trace_ext4_es_shrink_count(sbi->s_sb, sc->nr_to_scan, nr); return nr; @@ -1605,8 +1605,7 @@ static unsigned long ext4_es_count(struct shrinker *shrink, static unsigned long ext4_es_scan(struct shrinker *shrink, struct shrink_control *sc) { - struct ext4_sb_info *sbi = container_of(shrink, - struct ext4_sb_info, s_es_shrinker); + struct ext4_sb_info *sbi = shrink->private_data; int nr_to_scan = sc->nr_to_scan; int ret, nr_shrunk; @@ -1690,15 +1689,19 @@ int ext4_es_register_shrinker(struct ext4_sb_info *sbi) if (err) goto err3; - sbi->s_es_shrinker.scan_objects = ext4_es_scan; - sbi->s_es_shrinker.count_objects = ext4_es_count; - sbi->s_es_shrinker.seeks = DEFAULT_SEEKS; - err = register_shrinker(&sbi->s_es_shrinker, "ext4-es:%s", + sbi->s_es_shrinker = shrinker_alloc_and_init(ext4_es_count, ext4_es_scan, + 0, DEFAULT_SEEKS, 0, sbi); + if (!sbi->s_es_shrinker) + goto err4; + + err = register_shrinker(sbi->s_es_shrinker, "ext4-es:%s", sbi->s_sb->s_id); if (err) - goto err4; + goto err5; return 0; +err5: + shrinker_free(sbi->s_es_shrinker); err4: percpu_counter_destroy(&sbi->s_es_stats.es_stats_shk_cnt); err3: @@ -1716,7 +1719,7 @@ void ext4_es_unregister_shrinker(struct ext4_sb_info *sbi) percpu_counter_destroy(&sbi->s_es_stats.es_stats_cache_misses); percpu_counter_destroy(&sbi->s_es_stats.es_stats_all_cnt); percpu_counter_destroy(&sbi->s_es_stats.es_stats_shk_cnt); - unregister_shrinker(&sbi->s_es_shrinker); + unregister_and_free_shrinker(sbi->s_es_shrinker); } /* From patchwork Thu Jun 22 08:53:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288682 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 55B0CEB64DA for ; Thu, 22 Jun 2023 09:00:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231732AbjFVI7w (ORCPT ); Thu, 22 Jun 2023 04:59:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43746 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231744AbjFVI7P (ORCPT ); Thu, 22 Jun 2023 04:59:15 -0400 Received: from mail-pl1-x62b.google.com (mail-pl1-x62b.google.com [IPv6:2607:f8b0:4864:20::62b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BF6DA2979 for ; Thu, 22 Jun 2023 01:55:56 -0700 (PDT) Received: by mail-pl1-x62b.google.com with SMTP id d9443c01a7336-1b5585e84b4so6828175ad.0 for ; Thu, 22 Jun 2023 01:55:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424156; x=1690016156; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=YV1TxBHonYX3pT+e4dDoFW63Wrih64tWAHWeaspLQMY=; b=IN1IoI2uOCQQeS9Pc7yMEYFtdbTKp2FDPBFMAyikkB3bBw/FCEgPQpTXBrLGJxS9op OYz9dXXM6oULWR3l3wYKXb5/BI9AkFpryZP4yoepCwR6hfyyNUOfJ8hsO4sBRwfJ49YX gUV9lDqFE5hQ/d7HdmIIwZSFFlm7MsgfCCmjn2SzIPpkrW6AhbhGMDaDZJJIRMhzbnyY 28028P9U8PoGDzKdCZpYkcFtSyvZMpWWRxH6zTertHKnPVPdcLDGdjwoWDesvLV0LnN9 TRCiB1NZo1an6jc3DcyUcaVRTuirz1I2fp4pQQgMnnutBNGuXTEGp5MRQRxr/+etF0+R bUJg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424156; x=1690016156; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=YV1TxBHonYX3pT+e4dDoFW63Wrih64tWAHWeaspLQMY=; b=W0wQAJnlYiB98G2UrE7IdCDKbNYj/oNS0a9+YEYHmbG2FjSiH5aw8nlM/iKALOtjdV bp7C/O9hjncImAZ12pwvUElq22p2f3ZFCBrv1k6KyffZz7Vbwht3Bw0cPqMQ5psu4/9l tmvS6I951NvWNCoVgLNeyLfClIjaojy+5MfEpl+Z4gltOMPEc8URbS4vx5A2dPJDwhQv qTpnUQKscf5GmCEcXDpoPXWaSf7FodO3zkaZl1qdR9yTw/5netU8w/TkoI77C/yqT0Pz YYtHaPABYs7oxgNEaEPgzCjcUuks/KaSqejlo3c2GGLc1u6DFvHXA6WNUjnjy6aGeCDT buWg== X-Gm-Message-State: AC+VfDwAjRII2rnJSOZmliiS1oziyk1ZGLrWsf0dXHKMvxhKQqfJASDP aVS8r0ghaS/qZzoSCBnsyivvrw== X-Google-Smtp-Source: ACHHUZ6yC2snRIv6FqrsdGDMg/3uXpE9V6V7JpWahZzREljgaxmC3AuzAW6pVJNk/nQXbwoLHaAThg== X-Received: by 2002:a17:903:2451:b0:1b0:34c6:3bf2 with SMTP id l17-20020a170903245100b001b034c63bf2mr21540756pls.5.1687424156178; Thu, 22 Jun 2023 01:55:56 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.55.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:55:55 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 14/29] jbd2,ext4: dynamically allocate the jbd2-journal shrinker Date: Thu, 22 Jun 2023 16:53:20 +0800 Message-Id: <20230622085335.77010-15-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org In preparation for implementing lockless slab shrink, we need to dynamically allocate the jbd2-journal shrinker, so that it can be freed asynchronously using kfree_rcu(). Then it doesn't need to wait for RCU read-side critical section when releasing the struct journal_s. Signed-off-by: Qi Zheng --- fs/jbd2/journal.c | 32 +++++++++++++++++++------------- include/linux/jbd2.h | 2 +- 2 files changed, 20 insertions(+), 14 deletions(-) diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c index eee3c0ae349a..92a2f4360b5f 100644 --- a/fs/jbd2/journal.c +++ b/fs/jbd2/journal.c @@ -1301,7 +1301,7 @@ static int jbd2_min_tag_size(void) static unsigned long jbd2_journal_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) { - journal_t *journal = container_of(shrink, journal_t, j_shrinker); + journal_t *journal = shrink->private_data; unsigned long nr_to_scan = sc->nr_to_scan; unsigned long nr_shrunk; unsigned long count; @@ -1327,7 +1327,7 @@ static unsigned long jbd2_journal_shrink_scan(struct shrinker *shrink, static unsigned long jbd2_journal_shrink_count(struct shrinker *shrink, struct shrink_control *sc) { - journal_t *journal = container_of(shrink, journal_t, j_shrinker); + journal_t *journal = shrink->private_data; unsigned long count; count = percpu_counter_read_positive(&journal->j_checkpoint_jh_count); @@ -1415,21 +1415,27 @@ static journal_t *journal_init_common(struct block_device *bdev, journal->j_superblock = (journal_superblock_t *)bh->b_data; journal->j_shrink_transaction = NULL; - journal->j_shrinker.scan_objects = jbd2_journal_shrink_scan; - journal->j_shrinker.count_objects = jbd2_journal_shrink_count; - journal->j_shrinker.seeks = DEFAULT_SEEKS; - journal->j_shrinker.batch = journal->j_max_transaction_buffers; if (percpu_counter_init(&journal->j_checkpoint_jh_count, 0, GFP_KERNEL)) goto err_cleanup; - if (register_shrinker(&journal->j_shrinker, "jbd2-journal:(%u:%u)", - MAJOR(bdev->bd_dev), MINOR(bdev->bd_dev))) { - percpu_counter_destroy(&journal->j_checkpoint_jh_count); - goto err_cleanup; - } + journal->j_shrinker = shrinker_alloc_and_init(jbd2_journal_shrink_count, + jbd2_journal_shrink_scan, + journal->j_max_transaction_buffers, + DEFAULT_SEEKS, 0, journal); + if (!journal->j_shrinker) + goto err_shrinker; + + if (register_shrinker(journal->j_shrinker, "jbd2-journal:(%u:%u)", + MAJOR(bdev->bd_dev), MINOR(bdev->bd_dev))) + goto err_register; + return journal; +err_register: + shrinker_free(journal->j_shrinker); +err_shrinker: + percpu_counter_destroy(&journal->j_checkpoint_jh_count); err_cleanup: brelse(journal->j_sb_buffer); kfree(journal->j_wbuf); @@ -2190,9 +2196,9 @@ int jbd2_journal_destroy(journal_t *journal) brelse(journal->j_sb_buffer); } - if (journal->j_shrinker.flags & SHRINKER_REGISTERED) { + if (journal->j_shrinker->flags & SHRINKER_REGISTERED) { percpu_counter_destroy(&journal->j_checkpoint_jh_count); - unregister_shrinker(&journal->j_shrinker); + unregister_and_free_shrinker(journal->j_shrinker); } if (journal->j_proc_entry) jbd2_stats_proc_exit(journal); diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h index 44c298aa58d4..beb4c4586320 100644 --- a/include/linux/jbd2.h +++ b/include/linux/jbd2.h @@ -891,7 +891,7 @@ struct journal_s * Journal head shrinker, reclaim buffer's journal head which * has been written back. */ - struct shrinker j_shrinker; + struct shrinker *j_shrinker; /** * @j_checkpoint_jh_count: From patchwork Thu Jun 22 08:53:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288684 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8C94EEB64DD for ; Thu, 22 Jun 2023 09:00:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231258AbjFVJA1 (ORCPT ); Thu, 22 Jun 2023 05:00:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42928 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231898AbjFVJAF (ORCPT ); Thu, 22 Jun 2023 05:00:05 -0400 Received: from mail-pl1-x634.google.com (mail-pl1-x634.google.com [IPv6:2607:f8b0:4864:20::634]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6C82330D3 for ; Thu, 22 Jun 2023 01:56:05 -0700 (PDT) Received: by mail-pl1-x634.google.com with SMTP id d9443c01a7336-1b5466bc5f8so9654975ad.1 for ; Thu, 22 Jun 2023 01:56:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424164; x=1690016164; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=FWRSgLiVxGu7by+bRdZCMSZgL2CmyzCgjcAmFywfg3g=; b=W9Q0mEFJ4YX/Kxo42ZpDHLFmjt3CYHNqumMzPD6lCEJ/AclBj6ki+9nXx+XP2OFoT9 biokZFjtTO4s6n85EeiNgXVRb8aHb7r2IMBB+HnwNOlBqet/0l5xfJVO+DolK/U7XTWU M277NuxA39AWH+q/lhKoP2vIlehvWLhpgJj/4uGWoGxCITX36OL+YHFtfzDKSSw2iPaX MbP+C5V24hBPOg4LWw9eV/ZehzOW8RyY8zV5pnDOyFwQ720HnrExvRoWQvOw0JMgQMOU XwvMcYncbAVdeNBjwwgU6wHkMwvamAyfgHl3RPNu98UxqyF0R9Sxc8Wx/1hGBgxkRDYX Kkuw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424164; x=1690016164; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=FWRSgLiVxGu7by+bRdZCMSZgL2CmyzCgjcAmFywfg3g=; b=cEGPnVzLBpuYIlbHVpr8I7+HDe8JwwfK6S5LIj2TBBja2cv9Kj4ZOUE9cx45LRxz1Q 6Q5nDqyIWCgfuSL3Ua8b5CKHqxQ8qQOecTh333FOFZtH3TM5DpDrsr4mcvQ4cRuk2C4s WTP3J2WFPyP8zNY1s2LbDeD7CUAAr77U6SE7n59tjFEg+YGdagiQ8TL6v/9uTXfHWisy a3IC7zuwhLmAOYUz2THTNTOfOa4wbMGZ9lnuUuF/BjjefwxoFLXQB0ZAWzMgJxMkufQI 3+npyG4lmA4gsCK5p+TwgX9X3QzY00ugj21+N6tLOMOn5s4p8J+nDiHX3OYIKnwxrm98 kehQ== X-Gm-Message-State: AC+VfDy7zoOfeGDWrfGNDN6lLiH/XnFzBcBQZP6IlENbt//dalI+YzQR GwmQ6IYDsWKGHWx0PMm3P0q38A== X-Google-Smtp-Source: ACHHUZ4nYFfrGKqZLYUWSywLes9hZrkxqoQAJfwtv5FmoHmc5MLCqSR93ZcAMHhMn41B6ov+vpDAfw== X-Received: by 2002:a17:902:ea01:b0:1b0:4680:60 with SMTP id s1-20020a170902ea0100b001b046800060mr21968987plg.1.1687424164102; Thu, 22 Jun 2023 01:56:04 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.55.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:56:03 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 15/29] NFSD: dynamically allocate the nfsd-client shrinker Date: Thu, 22 Jun 2023 16:53:21 +0800 Message-Id: <20230622085335.77010-16-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org In preparation for implementing lockless slab shrink, we need to dynamically allocate the nfsd-client shrinker, so that it can be freed asynchronously using kfree_rcu(). Then it doesn't need to wait for RCU read-side critical section when releasing the struct nfsd_net. Signed-off-by: Qi Zheng Acked-by: Chuck Lever --- fs/nfsd/netns.h | 2 +- fs/nfsd/nfs4state.c | 20 ++++++++++++-------- 2 files changed, 13 insertions(+), 9 deletions(-) diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h index ec49b200b797..f669444d5336 100644 --- a/fs/nfsd/netns.h +++ b/fs/nfsd/netns.h @@ -195,7 +195,7 @@ struct nfsd_net { int nfs4_max_clients; atomic_t nfsd_courtesy_clients; - struct shrinker nfsd_client_shrinker; + struct shrinker *nfsd_client_shrinker; struct work_struct nfsd_shrinker_work; }; diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index 6e61fa3acaf1..a06184270548 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -4388,8 +4388,7 @@ static unsigned long nfsd4_state_shrinker_count(struct shrinker *shrink, struct shrink_control *sc) { int count; - struct nfsd_net *nn = container_of(shrink, - struct nfsd_net, nfsd_client_shrinker); + struct nfsd_net *nn = shrink->private_data; count = atomic_read(&nn->nfsd_courtesy_clients); if (!count) @@ -8094,14 +8093,19 @@ static int nfs4_state_create_net(struct net *net) INIT_WORK(&nn->nfsd_shrinker_work, nfsd4_state_shrinker_worker); get_net(net); - nn->nfsd_client_shrinker.scan_objects = nfsd4_state_shrinker_scan; - nn->nfsd_client_shrinker.count_objects = nfsd4_state_shrinker_count; - nn->nfsd_client_shrinker.seeks = DEFAULT_SEEKS; - - if (register_shrinker(&nn->nfsd_client_shrinker, "nfsd-client")) + nn->nfsd_client_shrinker = shrinker_alloc_and_init(nfsd4_state_shrinker_count, + nfsd4_state_shrinker_scan, + 0, DEFAULT_SEEKS, 0, + nn); + if (!nn->nfsd_client_shrinker) goto err_shrinker; + + if (register_shrinker(nn->nfsd_client_shrinker, "nfsd-client")) + goto err_register; return 0; +err_register: + shrinker_free(nn->nfsd_client_shrinker); err_shrinker: put_net(net); kfree(nn->sessionid_hashtbl); @@ -8197,7 +8201,7 @@ nfs4_state_shutdown_net(struct net *net) struct list_head *pos, *next, reaplist; struct nfsd_net *nn = net_generic(net, nfsd_net_id); - unregister_shrinker(&nn->nfsd_client_shrinker); + unregister_and_free_shrinker(nn->nfsd_client_shrinker); cancel_work(&nn->nfsd_shrinker_work); cancel_delayed_work_sync(&nn->laundromat_work); locks_end_grace(&nn->nfsd4_manager); From patchwork Thu Jun 22 08:53:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288683 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CDC98EB64D8 for ; Thu, 22 Jun 2023 09:00:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231160AbjFVJAZ (ORCPT ); Thu, 22 Jun 2023 05:00:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43706 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231861AbjFVI77 (ORCPT ); Thu, 22 Jun 2023 04:59:59 -0400 Received: from mail-pg1-x52b.google.com (mail-pg1-x52b.google.com [IPv6:2607:f8b0:4864:20::52b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D2BCA3593 for ; Thu, 22 Jun 2023 01:56:12 -0700 (PDT) Received: by mail-pg1-x52b.google.com with SMTP id 41be03b00d2f7-543a37c5c03so1091196a12.1 for ; Thu, 22 Jun 2023 01:56:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424172; x=1690016172; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=zUNTs2D5J+Px6+8NyTz0IGsKdVlRq3fVdCNXQo6zMeo=; b=PB56VJEtwK39jb6ujoKeOUgYB/PJLrruyxKRjJr7vrytJTdiBm6YHrGem+eoP0b4tN ySO0Fp6iHJx1jYG5cUtED8VGJ9nuR5OzkvfKjRzGq011b3+5HaAEFO0JOZ1JrJyRnWbE zD+J47hPkoBpD2IoCQquNzCqiGwmlgwN3y5YbOJ5VGLqEl2FtVJoITLOXOxAzhFddu+k 6lt/tmVHj10PbE9jdwz/MvuKMpEDpm3abrU9F/6h0dgBlWZ5OjmzxDq5+jKr9MCDVLPh Sp8IUY5vM81KMKjfLSjlTI62E4wEBy9NQ8VxDpQ2xwoZwC4ItSHDc6u/r/EGjzaiDmrG iUiw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424172; x=1690016172; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=zUNTs2D5J+Px6+8NyTz0IGsKdVlRq3fVdCNXQo6zMeo=; b=aTt9SghtyWYIGvl415u0PQ3dxdMKeOf/V8FJooNHveY7RNOBugDQTHahzWMfpNJfdN 17HFNeEp+JGyIxkxi4maJ/woSJpMIYhieEhHu7Bob5ITP+Yf2pktzN+wGf9t3oKYtivd 8FJnHH+Bj0u9a7QLsP2wxcsSts0z3Vd/pYh72FAWJ6ImL1DKZpsofgBORW4BOI78arof rSt+a1/Ey3CDO2QCwwcaYifC0ZlaZsHM9ui4ZuNyyc+1STZl1BW2iQPjBJjrahT5Ujtc hJ1kuruzK59bczVHqYPGjbQV3wWFYbi88Hr5G2r0mL4cbcCVeNt/Xw+2TlqyNi3BGuvM 29bw== X-Gm-Message-State: AC+VfDw+iEpd3j9MeVHkKfZ7fl4H+69RRfhH3ah5a/suTFlCctvTRep6 4cicEzSU23Xj7yPamsYwZYA2cg== X-Google-Smtp-Source: ACHHUZ7RZPPxDHHt9nXlq48w0UysjH5yX+GnVgYNyQdTPpVzG1MBfOvOCSKK+16aR/H5lLV5H8OFJg== X-Received: by 2002:a17:902:f688:b0:1ad:e3a8:3c4 with SMTP id l8-20020a170902f68800b001ade3a803c4mr20681104plg.4.1687424171977; Thu, 22 Jun 2023 01:56:11 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.56.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:56:11 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 16/29] NFSD: dynamically allocate the nfsd-reply shrinker Date: Thu, 22 Jun 2023 16:53:22 +0800 Message-Id: <20230622085335.77010-17-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org In preparation for implementing lockless slab shrink, we need to dynamically allocate the nfsd-reply shrinker, so that it can be freed asynchronously using kfree_rcu(). Then it doesn't need to wait for RCU read-side critical section when releasing the struct nfsd_net. Signed-off-by: Qi Zheng --- fs/nfsd/netns.h | 2 +- fs/nfsd/nfscache.c | 33 +++++++++++++++++++-------------- 2 files changed, 20 insertions(+), 15 deletions(-) diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h index f669444d5336..ab303a8b77d5 100644 --- a/fs/nfsd/netns.h +++ b/fs/nfsd/netns.h @@ -177,7 +177,7 @@ struct nfsd_net { /* size of cache when we saw the longest hash chain */ unsigned int longest_chain_cachesize; - struct shrinker nfsd_reply_cache_shrinker; + struct shrinker *nfsd_reply_cache_shrinker; /* tracking server-to-server copy mounts */ spinlock_t nfsd_ssc_lock; diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c index 041faa13b852..ec33de8e418b 100644 --- a/fs/nfsd/nfscache.c +++ b/fs/nfsd/nfscache.c @@ -173,19 +173,23 @@ int nfsd_reply_cache_init(struct nfsd_net *nn) if (status) goto out_nomem; - nn->nfsd_reply_cache_shrinker.scan_objects = nfsd_reply_cache_scan; - nn->nfsd_reply_cache_shrinker.count_objects = nfsd_reply_cache_count; - nn->nfsd_reply_cache_shrinker.seeks = 1; - status = register_shrinker(&nn->nfsd_reply_cache_shrinker, - "nfsd-reply:%s", nn->nfsd_name); - if (status) - goto out_stats_destroy; - nn->drc_hashtbl = kvzalloc(array_size(hashsize, sizeof(*nn->drc_hashtbl)), GFP_KERNEL); if (!nn->drc_hashtbl) + goto out_stats_destroy; + + nn->nfsd_reply_cache_shrinker = + shrinker_alloc_and_init(nfsd_reply_cache_count, + nfsd_reply_cache_scan, + 0, 1, 0, nn); + if (!nn->nfsd_reply_cache_shrinker) goto out_shrinker; + status = register_shrinker(nn->nfsd_reply_cache_shrinker, + "nfsd-reply:%s", nn->nfsd_name); + if (status) + goto out_register; + for (i = 0; i < hashsize; i++) { INIT_LIST_HEAD(&nn->drc_hashtbl[i].lru_head); spin_lock_init(&nn->drc_hashtbl[i].cache_lock); @@ -193,8 +197,11 @@ int nfsd_reply_cache_init(struct nfsd_net *nn) nn->drc_hashsize = hashsize; return 0; + +out_register: + shrinker_free(nn->nfsd_reply_cache_shrinker); out_shrinker: - unregister_shrinker(&nn->nfsd_reply_cache_shrinker); + kvfree(nn->drc_hashtbl); out_stats_destroy: nfsd_reply_cache_stats_destroy(nn); out_nomem: @@ -207,7 +214,7 @@ void nfsd_reply_cache_shutdown(struct nfsd_net *nn) struct svc_cacherep *rp; unsigned int i; - unregister_shrinker(&nn->nfsd_reply_cache_shrinker); + unregister_and_free_shrinker(nn->nfsd_reply_cache_shrinker); for (i = 0; i < nn->drc_hashsize; i++) { struct list_head *head = &nn->drc_hashtbl[i].lru_head; @@ -297,8 +304,7 @@ prune_cache_entries(struct nfsd_net *nn) static unsigned long nfsd_reply_cache_count(struct shrinker *shrink, struct shrink_control *sc) { - struct nfsd_net *nn = container_of(shrink, - struct nfsd_net, nfsd_reply_cache_shrinker); + struct nfsd_net *nn = shrink->private_data; return atomic_read(&nn->num_drc_entries); } @@ -306,8 +312,7 @@ nfsd_reply_cache_count(struct shrinker *shrink, struct shrink_control *sc) static unsigned long nfsd_reply_cache_scan(struct shrinker *shrink, struct shrink_control *sc) { - struct nfsd_net *nn = container_of(shrink, - struct nfsd_net, nfsd_reply_cache_shrinker); + struct nfsd_net *nn = shrink->private_data; return prune_cache_entries(nn); } From patchwork Thu Jun 22 08:53:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288685 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8D447EB64D8 for ; Thu, 22 Jun 2023 09:01:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231777AbjFVJBI (ORCPT ); Thu, 22 Jun 2023 05:01:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43736 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232035AbjFVJAU (ORCPT ); Thu, 22 Jun 2023 05:00:20 -0400 Received: from mail-pg1-x52a.google.com (mail-pg1-x52a.google.com [IPv6:2607:f8b0:4864:20::52a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2109F3AA0 for ; Thu, 22 Jun 2023 01:56:35 -0700 (PDT) Received: by mail-pg1-x52a.google.com with SMTP id 41be03b00d2f7-54fd4a7ce25so1096313a12.0 for ; Thu, 22 Jun 2023 01:56:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424180; x=1690016180; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=DX6SUDFVJBECNsblFEChPYzpO8xY8cES72+0dMzxuzQ=; b=J0xdX/HVFp5YYG4TfcvtFJvRxLyQLKtNeoVYCgaC2J+nRoYDa/BzwrOrCj094o0cZq 1xReHr0GJ1hygdSXflfCN7O6Wd3k5bnTYjzGehkNWElkKcnI57FpdEY2zo1QIhRivIfW Vn3A6HH2yEtIwY+bYcUHLIMQK61OnnzFACK07hJ1eCTN8lHwm9h5Rchf6mt0xuf3bmH2 YrMt/i+taGwUAGQfmqtCpero+/nADk45X1pPDnlfELgRkE2o8P26K8ugc6xZW6o9l2x2 0OyQF06tkzpnptZeoSrMW5oczymeBmy807qxrTJz07x7dVgWrGVUEsLlXTXBF7IGucek WQiw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424180; x=1690016180; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=DX6SUDFVJBECNsblFEChPYzpO8xY8cES72+0dMzxuzQ=; b=c6ZcnNg7aWk9VUAmdGh9vH8N/ETD8OkEYTKVNjS2x1Y6xu9ALOHdoezOyTLXoRySh9 mGY3qcOEEogZ78UwbgTQm7xVLiKMYgi8ol2GtcGLKVYemU3M6T5b4r8yR1bNvkBq4Vsp YfsXyr27eeOVggT6bR111gzrERxJxEXpDGXSsx+OQKKJX8zfWYNMpGoydN1g0TNTjrAd HwFTZIjpjPOIEjnvLXpS7742ZTEpYiDzjgNF0COz/loYhGoNCF2u4U5G9uy55Rdos+Fx vnGCT+GGU4ZpMWhQiriRNihqK6wPMK4vtKIHbuzpQ9jeHp9HZrd//WtDIoT/rW+X0Rny ibXg== X-Gm-Message-State: AC+VfDxRs9uDxEtlSRUzmvSIBnWE0S5RimZBTpqfr+9p8kegh5WNa54Q f8RAOptz1cfTTx2O0FpslKPWMw== X-Google-Smtp-Source: ACHHUZ6cRwGFUE7wdLf8WreSOUecoWuLPpHsSO23aKngJIhhKsUdOoFlGAyrZhuJGN3RD5MvXrHaNg== X-Received: by 2002:a17:903:2452:b0:1a9:581b:fbaa with SMTP id l18-20020a170903245200b001a9581bfbaamr20887740pls.2.1687424179969; Thu, 22 Jun 2023 01:56:19 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.56.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:56:19 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 17/29] xfs: dynamically allocate the xfs-buf shrinker Date: Thu, 22 Jun 2023 16:53:23 +0800 Message-Id: <20230622085335.77010-18-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org In preparation for implementing lockless slab shrink, we need to dynamically allocate the xfs-buf shrinker, so that it can be freed asynchronously using kfree_rcu(). Then it doesn't need to wait for RCU read-side critical section when releasing the struct xfs_buftarg. Signed-off-by: Qi Zheng --- fs/xfs/xfs_buf.c | 25 ++++++++++++++----------- fs/xfs/xfs_buf.h | 2 +- 2 files changed, 15 insertions(+), 12 deletions(-) diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c index 15d1e5a7c2d3..6657d285d26f 100644 --- a/fs/xfs/xfs_buf.c +++ b/fs/xfs/xfs_buf.c @@ -1906,8 +1906,7 @@ xfs_buftarg_shrink_scan( struct shrinker *shrink, struct shrink_control *sc) { - struct xfs_buftarg *btp = container_of(shrink, - struct xfs_buftarg, bt_shrinker); + struct xfs_buftarg *btp = shrink->private_data; LIST_HEAD(dispose); unsigned long freed; @@ -1929,8 +1928,7 @@ xfs_buftarg_shrink_count( struct shrinker *shrink, struct shrink_control *sc) { - struct xfs_buftarg *btp = container_of(shrink, - struct xfs_buftarg, bt_shrinker); + struct xfs_buftarg *btp = shrink->private_data; return list_lru_shrink_count(&btp->bt_lru, sc); } @@ -1938,7 +1936,7 @@ void xfs_free_buftarg( struct xfs_buftarg *btp) { - unregister_shrinker(&btp->bt_shrinker); + unregister_and_free_shrinker(btp->bt_shrinker); ASSERT(percpu_counter_sum(&btp->bt_io_count) == 0); percpu_counter_destroy(&btp->bt_io_count); list_lru_destroy(&btp->bt_lru); @@ -2021,15 +2019,20 @@ xfs_alloc_buftarg( if (percpu_counter_init(&btp->bt_io_count, 0, GFP_KERNEL)) goto error_lru; - btp->bt_shrinker.count_objects = xfs_buftarg_shrink_count; - btp->bt_shrinker.scan_objects = xfs_buftarg_shrink_scan; - btp->bt_shrinker.seeks = DEFAULT_SEEKS; - btp->bt_shrinker.flags = SHRINKER_NUMA_AWARE; - if (register_shrinker(&btp->bt_shrinker, "xfs-buf:%s", - mp->m_super->s_id)) + btp->bt_shrinker = shrinker_alloc_and_init(xfs_buftarg_shrink_count, + xfs_buftarg_shrink_scan, + 0, DEFAULT_SEEKS, + SHRINKER_NUMA_AWARE, btp); + if (!btp->bt_shrinker) goto error_pcpu; + + if (register_shrinker(btp->bt_shrinker, "xfs-buf:%s", + mp->m_super->s_id)) + goto error_shrinker; return btp; +error_shrinker: + shrinker_free(btp->bt_shrinker); error_pcpu: percpu_counter_destroy(&btp->bt_io_count); error_lru: diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h index 549c60942208..4e6969a675f7 100644 --- a/fs/xfs/xfs_buf.h +++ b/fs/xfs/xfs_buf.h @@ -102,7 +102,7 @@ typedef struct xfs_buftarg { size_t bt_logical_sectormask; /* LRU control structures */ - struct shrinker bt_shrinker; + struct shrinker *bt_shrinker; struct list_lru bt_lru; struct percpu_counter bt_io_count; From patchwork Thu Jun 22 08:53:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288739 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8864AEB64DB for ; Thu, 22 Jun 2023 09:02:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231834AbjFVJCI (ORCPT ); Thu, 22 Jun 2023 05:02:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41442 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231831AbjFVJBN (ORCPT ); Thu, 22 Jun 2023 05:01:13 -0400 Received: from mail-pl1-x62c.google.com (mail-pl1-x62c.google.com [IPv6:2607:f8b0:4864:20::62c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 04A9B3C15 for ; Thu, 22 Jun 2023 01:56:52 -0700 (PDT) Received: by mail-pl1-x62c.google.com with SMTP id d9443c01a7336-1b5079b8cb3so12591085ad.1 for ; Thu, 22 Jun 2023 01:56:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424188; x=1690016188; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=tTZEGaIAEFBjvotA1NXyBA/23sGgDmW1pgUANJLmL6g=; b=RR2ZBd6H05awxDqDqVthUnK8c2aiJ3/40MwwmbAPlBGQiSswQkBwcxNT/M8cNWLnOA 82Up+za+KnH3T12j6kLa8MWMs1OLNz27XFTdAlPbZ38sAlCSEZeBeei4jIEF3DOGcfFf sqLx7qpbJbMhGZOLWmDQDC3qBDYjKQCUQ+pHZmp4+zwVYsBG9SEPO8LfQRHMRrJmIPZH mEo73QmcjA4JR+vhCDbl2LS55wzlx82Sg5Hmni0lGgyYmJGvSzc9k9c1k7gD3cDGUkxo aNJG0MDIsuR+BfQGSCBSsk1/g4GTm7BXhgoQS8izvkkK6lhK2DzzoNDI+34/82OB/BW3 znuw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424188; x=1690016188; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=tTZEGaIAEFBjvotA1NXyBA/23sGgDmW1pgUANJLmL6g=; b=XJXOH0MtHemLV7jy1v2FYaTu9UCWO6/8IOpEASxbwEHM3oUuEvu/9fHWVHKoYo35+l 5B37j2QIIbZM13K78tYMAmfATO/pzyYpnSPr/HDPo1ZuFW1JAGA0wE9mbmm7GyZgVpii XAoH1nnBUHHS6kDvYhEg8wakXZwJHVX705y6nLBlyq5oU+doaJtCY1APLx/WHROLQrvF dhvdvNvtNxy3T9natMLkjtkMfDlbf0hJ//3FBgmFnLLn00uPRRahMScGwXTpEdRcVRf5 C3jZl/U0r4SssXym7WHPIFurHNNwUlBzHxoa/lLgIANHJCl00yil745SZoN4hBO8EzBK RWyA== X-Gm-Message-State: AC+VfDx1ZW1QsZ8t+1ZxKQZfQJmb/fLrYTyLyfaWqmdFhY3KJ3b4QUEt rm4561dBPVHpzykuXNOMlstPvw== X-Google-Smtp-Source: ACHHUZ6OWalxJOPaEyVlek6DUPm2brDmU3Y4XuKo9voozhXt0KV98CJSzaGxqKjpiLWREy1Rn2GjcA== X-Received: by 2002:a17:902:dac6:b0:1ac:656f:a68d with SMTP id q6-20020a170902dac600b001ac656fa68dmr21469547plx.4.1687424187780; Thu, 22 Jun 2023 01:56:27 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.56.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:56:27 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 18/29] xfs: dynamically allocate the xfs-inodegc shrinker Date: Thu, 22 Jun 2023 16:53:24 +0800 Message-Id: <20230622085335.77010-19-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org In preparation for implementing lockless slab shrink, we need to dynamically allocate the xfs-inodegc shrinker, so that it can be freed asynchronously using kfree_rcu(). Then it doesn't need to wait for RCU read-side critical section when releasing the struct xfs_mount. Signed-off-by: Qi Zheng --- fs/xfs/xfs_icache.c | 27 ++++++++++++++++----------- fs/xfs/xfs_mount.c | 4 ++-- fs/xfs/xfs_mount.h | 2 +- 3 files changed, 19 insertions(+), 14 deletions(-) diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c index 453890942d9f..1ef0c9fa57de 100644 --- a/fs/xfs/xfs_icache.c +++ b/fs/xfs/xfs_icache.c @@ -2225,8 +2225,7 @@ xfs_inodegc_shrinker_count( struct shrinker *shrink, struct shrink_control *sc) { - struct xfs_mount *mp = container_of(shrink, struct xfs_mount, - m_inodegc_shrinker); + struct xfs_mount *mp = shrink->private_data; struct xfs_inodegc *gc; int cpu; @@ -2247,8 +2246,7 @@ xfs_inodegc_shrinker_scan( struct shrinker *shrink, struct shrink_control *sc) { - struct xfs_mount *mp = container_of(shrink, struct xfs_mount, - m_inodegc_shrinker); + struct xfs_mount *mp = shrink->private_data; struct xfs_inodegc *gc; int cpu; bool no_items = true; @@ -2284,13 +2282,20 @@ int xfs_inodegc_register_shrinker( struct xfs_mount *mp) { - struct shrinker *shrink = &mp->m_inodegc_shrinker; + int ret; - shrink->count_objects = xfs_inodegc_shrinker_count; - shrink->scan_objects = xfs_inodegc_shrinker_scan; - shrink->seeks = 0; - shrink->flags = SHRINKER_NONSLAB; - shrink->batch = XFS_INODEGC_SHRINKER_BATCH; + mp->m_inodegc_shrinker = + shrinker_alloc_and_init(xfs_inodegc_shrinker_count, + xfs_inodegc_shrinker_scan, + XFS_INODEGC_SHRINKER_BATCH, + 0, SHRINKER_NONSLAB, mp); + if (!mp->m_inodegc_shrinker) + return -ENOMEM; + + ret = register_shrinker(mp->m_inodegc_shrinker, "xfs-inodegc:%s", + mp->m_super->s_id); + if (ret) + shrinker_free(mp->m_inodegc_shrinker); - return register_shrinker(shrink, "xfs-inodegc:%s", mp->m_super->s_id); + return ret; } diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c index fb87ffb48f7f..67286859ad34 100644 --- a/fs/xfs/xfs_mount.c +++ b/fs/xfs/xfs_mount.c @@ -1018,7 +1018,7 @@ xfs_mountfs( out_log_dealloc: xfs_log_mount_cancel(mp); out_inodegc_shrinker: - unregister_shrinker(&mp->m_inodegc_shrinker); + unregister_and_free_shrinker(mp->m_inodegc_shrinker); out_fail_wait: if (mp->m_logdev_targp && mp->m_logdev_targp != mp->m_ddev_targp) xfs_buftarg_drain(mp->m_logdev_targp); @@ -1100,7 +1100,7 @@ xfs_unmountfs( #if defined(DEBUG) xfs_errortag_clearall(mp); #endif - unregister_shrinker(&mp->m_inodegc_shrinker); + unregister_and_free_shrinker(mp->m_inodegc_shrinker); xfs_free_perag(mp); xfs_errortag_del(mp); diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h index e2866e7fa60c..562c294ca08e 100644 --- a/fs/xfs/xfs_mount.h +++ b/fs/xfs/xfs_mount.h @@ -217,7 +217,7 @@ typedef struct xfs_mount { atomic_t m_agirotor; /* last ag dir inode alloced */ /* Memory shrinker to throttle and reprioritize inodegc */ - struct shrinker m_inodegc_shrinker; + struct shrinker *m_inodegc_shrinker; /* * Workqueue item so that we can coalesce multiple inode flush attempts * into a single flush. From patchwork Thu Jun 22 08:53:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288740 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B310FEB64D8 for ; Thu, 22 Jun 2023 09:02:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231544AbjFVJC3 (ORCPT ); Thu, 22 Jun 2023 05:02:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43078 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231855AbjFVJBi (ORCPT ); Thu, 22 Jun 2023 05:01:38 -0400 Received: from mail-pg1-x52d.google.com (mail-pg1-x52d.google.com [IPv6:2607:f8b0:4864:20::52d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0B5EA3C39 for ; Thu, 22 Jun 2023 01:57:00 -0700 (PDT) Received: by mail-pg1-x52d.google.com with SMTP id 41be03b00d2f7-51f64817809so680147a12.1 for ; Thu, 22 Jun 2023 01:57:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424196; x=1690016196; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=tuTRU39Llc4f1pG+4j4EE0WoT0ftJSPXyoG6lJxyplk=; b=Ekn/zV8B3Pykd2GrPoTdVnpJwk5evd2tyUP0vKsA0sao3e0fC5CJiY4BTrhaxMXIHV uTPkEPgehKIEn13/c91Rf6hAUh1pL4moe2PR0AHRF6Kdv1EoYZyskenz8RKeyckC9Itb y+HACd1zVUdu9ru2Mkg+skG1k3PTY0I37XiwGKuN4fZctDPgzdbV+AVCjywsN0HsaZ69 Cuv8lvpq3tEmZe+xhsg7ur6ivxEYcTfKMKXUZmg3vRSFd562H9J1GbGm5bH6oFZpkxLe LAGfy+lFWFcKu0xxXlVXi22gjm4qtT4oNXfdGZLZ8KWmVvCWym8e4Apiabz+y3j9a60o NyUA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424196; x=1690016196; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=tuTRU39Llc4f1pG+4j4EE0WoT0ftJSPXyoG6lJxyplk=; b=iMRwVMRzXdWoemKvvxmZEIimY2MtcXK6diAljYwHGuJKzQbJVE4uuYjQeo7lnjJ0lK OyS3+K0gKurIH6cEmwfmbqoW1gXN2ZLqIoFpgpYZiZBkO8w9WT+t7ai+0MzxDPchd0C9 E4iniPP64u6UL3mwygcvJ/6YfVAjkhLCSLdtnAtVW1RWNVC6m+J+0CMd6W7+aNJcOiKb nHOfkwuJeOoZ6Vn3uPcCg8jq1gg/dgYTmAQgZdaxtvzEY7NJkeY9wq5MPZ8ucq4WIEde vwBRsI0gmNCoHrfAXqhUvBs6WPG8x2DqQywm01aM6vmIsDjZe5akN1VcQtga328YZZam LLxA== X-Gm-Message-State: AC+VfDwQnpt1kLBLbCg3/EmB+GSRqWS3VzH5KQ6Fmm1PSLUuNzI+tAim YTGSMmbhzYLKyBnNHOlFhmotbQ== X-Google-Smtp-Source: ACHHUZ67Tf+k1CheKb86ycYHhx5x/FfiuFJZEOT7coQiqeyoaYPxA8zsjGzghk4F71F8kxwZIRTJAQ== X-Received: by 2002:a17:902:ecc6:b0:1ae:1364:6086 with SMTP id a6-20020a170902ecc600b001ae13646086mr21603647plh.2.1687424195747; Thu, 22 Jun 2023 01:56:35 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.56.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:56:35 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 19/29] xfs: dynamically allocate the xfs-qm shrinker Date: Thu, 22 Jun 2023 16:53:25 +0800 Message-Id: <20230622085335.77010-20-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org In preparation for implementing lockless slab shrink, we need to dynamically allocate the xfs-qm shrinker, so that it can be freed asynchronously using kfree_rcu(). Then it doesn't need to wait for RCU read-side critical section when releasing the struct xfs_quotainfo. Signed-off-by: Qi Zheng --- fs/xfs/xfs_qm.c | 24 +++++++++++++----------- fs/xfs/xfs_qm.h | 2 +- 2 files changed, 14 insertions(+), 12 deletions(-) diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c index 6abcc34fafd8..b15320d469cc 100644 --- a/fs/xfs/xfs_qm.c +++ b/fs/xfs/xfs_qm.c @@ -504,8 +504,7 @@ xfs_qm_shrink_scan( struct shrinker *shrink, struct shrink_control *sc) { - struct xfs_quotainfo *qi = container_of(shrink, - struct xfs_quotainfo, qi_shrinker); + struct xfs_quotainfo *qi = shrink->private_data; struct xfs_qm_isolate isol; unsigned long freed; int error; @@ -539,8 +538,7 @@ xfs_qm_shrink_count( struct shrinker *shrink, struct shrink_control *sc) { - struct xfs_quotainfo *qi = container_of(shrink, - struct xfs_quotainfo, qi_shrinker); + struct xfs_quotainfo *qi = shrink->private_data; return list_lru_shrink_count(&qi->qi_lru, sc); } @@ -680,18 +678,22 @@ xfs_qm_init_quotainfo( if (XFS_IS_PQUOTA_ON(mp)) xfs_qm_set_defquota(mp, XFS_DQTYPE_PROJ, qinf); - qinf->qi_shrinker.count_objects = xfs_qm_shrink_count; - qinf->qi_shrinker.scan_objects = xfs_qm_shrink_scan; - qinf->qi_shrinker.seeks = DEFAULT_SEEKS; - qinf->qi_shrinker.flags = SHRINKER_NUMA_AWARE; + qinf->qi_shrinker = shrinker_alloc_and_init(xfs_qm_shrink_count, + xfs_qm_shrink_scan, + 0, DEFAULT_SEEKS, + SHRINKER_NUMA_AWARE, qinf); + if (!qinf->qi_shrinker) + goto out_free_inos; - error = register_shrinker(&qinf->qi_shrinker, "xfs-qm:%s", + error = register_shrinker(qinf->qi_shrinker, "xfs-qm:%s", mp->m_super->s_id); if (error) - goto out_free_inos; + goto out_shrinker; return 0; +out_shrinker: + shrinker_free(qinf->qi_shrinker); out_free_inos: mutex_destroy(&qinf->qi_quotaofflock); mutex_destroy(&qinf->qi_tree_lock); @@ -718,7 +720,7 @@ xfs_qm_destroy_quotainfo( qi = mp->m_quotainfo; ASSERT(qi != NULL); - unregister_shrinker(&qi->qi_shrinker); + unregister_and_free_shrinker(qi->qi_shrinker); list_lru_destroy(&qi->qi_lru); xfs_qm_destroy_quotainos(qi); mutex_destroy(&qi->qi_tree_lock); diff --git a/fs/xfs/xfs_qm.h b/fs/xfs/xfs_qm.h index 9683f0457d19..d5c9fc4ba591 100644 --- a/fs/xfs/xfs_qm.h +++ b/fs/xfs/xfs_qm.h @@ -63,7 +63,7 @@ struct xfs_quotainfo { struct xfs_def_quota qi_usr_default; struct xfs_def_quota qi_grp_default; struct xfs_def_quota qi_prj_default; - struct shrinker qi_shrinker; + struct shrinker *qi_shrinker; /* Minimum and maximum quota expiration timestamp values. */ time64_t qi_expiry_min; From patchwork Thu Jun 22 08:53:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288741 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 355B9EB64DD for ; Thu, 22 Jun 2023 09:03:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231915AbjFVJDB (ORCPT ); Thu, 22 Jun 2023 05:03:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42580 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231886AbjFVJC0 (ORCPT ); Thu, 22 Jun 2023 05:02:26 -0400 Received: from mail-pl1-x632.google.com (mail-pl1-x632.google.com [IPv6:2607:f8b0:4864:20::632]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5E132422A for ; Thu, 22 Jun 2023 01:57:11 -0700 (PDT) Received: by mail-pl1-x632.google.com with SMTP id d9443c01a7336-1b693afe799so2466135ad.1 for ; Thu, 22 Jun 2023 01:57:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424203; x=1690016203; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=uYJ0Isr1XZATX6JGPvTVuhl3oDZVpp/hY4/4gwVGW9E=; b=E8DR3f2Hoat1dr+eWUDbHZgFwAog/Y14Fyr901nadnjVcnG4wi/dzsYXue5lsR1fMf J4lGKXcngL6tZ98mDRA+meQDFEQqI8M8C2U7Jzsh+P5pb02lqX9x9oPIGG3MqGtcEQBd gtcaY6d5Wvcszm2wAYU1/GbNiI4/W6jf6+osdVtp14uPtbwdSj3xBB3cuLupxVd/0Bva 3aNVjboq1g3R8QF56ponf3Z0k9HzBLbK6msqxIkke8ZOuuk+1QHFYDcPM2BRj2r4DT2+ yfbsCvnlQ4hLgF8msig/7Mg+peyLnaUmJHbCBGncbsN2mdvtqIXKdXSMs3btfn7dPCo7 J6Zg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424203; x=1690016203; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=uYJ0Isr1XZATX6JGPvTVuhl3oDZVpp/hY4/4gwVGW9E=; b=iWoAVy+BMFKReE44PNE3Q95gJrIpk9OkIXcxLvyIBhgjbsINq58WF5XnQm3wYC5heF 5/28NeajbxApW1yDedmX+X8IPITIwLCbhR5roJ0zZs956F+e+RLzMJH54cOja+af8+WU W7n0XS18mi5tdifaaTNTBr3EZNxlNkaKobyr8yyOvUTSktzqJOj30SiEPx/QAztzN+uG YEY2xhnh8cBXUbe/sMCA8+mKzAVFKQjZrN1gamYSVVBK+jroWXv9MHVavaPM2v4RM7uh 0n0x7aKvlPo6QY6GBaKFQsx/XJlk9ALg2mujRX7CjPUIzo6TViVKuN+GUpzR8Tabtbsp G7yQ== X-Gm-Message-State: AC+VfDyjD4radVglok6IZ0wQ+Aavekx9ptHT+XyIfY9DJkQ+upfg4ZsR Dd6paeyjfCw52DkRzf6WbY7N4w== X-Google-Smtp-Source: ACHHUZ4t5WHqyo4ZYahXOm5IhQBts9kOj8de1dBmPyGeGIdQ3YL7Www59M51QQRN7AwqoJV6XrQ1Ww== X-Received: by 2002:a17:902:ea01:b0:1a9:6467:aa8d with SMTP id s1-20020a170902ea0100b001a96467aa8dmr21674286plg.1.1687424203648; Thu, 22 Jun 2023 01:56:43 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.56.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:56:43 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 20/29] zsmalloc: dynamically allocate the mm-zspool shrinker Date: Thu, 22 Jun 2023 16:53:26 +0800 Message-Id: <20230622085335.77010-21-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org In preparation for implementing lockless slab shrink, we need to dynamically allocate the mm-zspool shrinker, so that it can be freed asynchronously using kfree_rcu(). Then it doesn't need to wait for RCU read-side critical section when releasing the struct zs_pool. Signed-off-by: Qi Zheng --- mm/zsmalloc.c | 28 ++++++++++++++++------------ 1 file changed, 16 insertions(+), 12 deletions(-) diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c index 3f057970504e..c03b34ae637e 100644 --- a/mm/zsmalloc.c +++ b/mm/zsmalloc.c @@ -229,7 +229,7 @@ struct zs_pool { struct zs_pool_stats stats; /* Compact classes */ - struct shrinker shrinker; + struct shrinker *shrinker; #ifdef CONFIG_ZSMALLOC_STAT struct dentry *stat_dentry; @@ -2107,8 +2107,7 @@ static unsigned long zs_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc) { unsigned long pages_freed; - struct zs_pool *pool = container_of(shrinker, struct zs_pool, - shrinker); + struct zs_pool *pool = shrinker->private_data; /* * Compact classes and calculate compaction delta. @@ -2126,8 +2125,7 @@ static unsigned long zs_shrinker_count(struct shrinker *shrinker, int i; struct size_class *class; unsigned long pages_to_free = 0; - struct zs_pool *pool = container_of(shrinker, struct zs_pool, - shrinker); + struct zs_pool *pool = shrinker->private_data; for (i = ZS_SIZE_CLASSES - 1; i >= 0; i--) { class = pool->size_class[i]; @@ -2142,18 +2140,24 @@ static unsigned long zs_shrinker_count(struct shrinker *shrinker, static void zs_unregister_shrinker(struct zs_pool *pool) { - unregister_shrinker(&pool->shrinker); + unregister_and_free_shrinker(pool->shrinker); } static int zs_register_shrinker(struct zs_pool *pool) { - pool->shrinker.scan_objects = zs_shrinker_scan; - pool->shrinker.count_objects = zs_shrinker_count; - pool->shrinker.batch = 0; - pool->shrinker.seeks = DEFAULT_SEEKS; + int ret; + + pool->shrinker = shrinker_alloc_and_init(zs_shrinker_count, + zs_shrinker_scan, 0, + DEFAULT_SEEKS, 0, pool); + if (!pool->shrinker) + return -ENOMEM; - return register_shrinker(&pool->shrinker, "mm-zspool:%s", - pool->name); + ret = register_shrinker(pool->shrinker, "mm-zspool:%s", pool->name); + if (ret) + shrinker_free(pool->shrinker); + + return ret; } static int calculate_zspage_chain_size(int class_size) From patchwork Thu Jun 22 08:53:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288743 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 030B1EB64DA for ; Thu, 22 Jun 2023 09:03:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231952AbjFVJDl (ORCPT ); Thu, 22 Jun 2023 05:03:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47424 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231919AbjFVJDB (ORCPT ); Thu, 22 Jun 2023 05:03:01 -0400 Received: from mail-pl1-x635.google.com (mail-pl1-x635.google.com [IPv6:2607:f8b0:4864:20::635]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 93A5044A0 for ; Thu, 22 Jun 2023 01:57:20 -0700 (PDT) Received: by mail-pl1-x635.google.com with SMTP id d9443c01a7336-1b515ec39feso12622555ad.0 for ; Thu, 22 Jun 2023 01:57:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424211; x=1690016211; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=02DUTqACM/aSl9866C36A3I60VPYUs4OjMZ7Yfrc1/o=; b=H+1EjwPIJiQm5S0OenD/xh0iPse4xpKj+qmkmBU/srKV3WPi3ULtA4pLp++dC5Bqh9 7BlRLloRVIKfQ42/hs1T7Du0cHLmrcEPoZEBW7JCQkTGLYx1UQ3fbHqq8+O3YFu52yg3 0+mCV5UtaM+1WMi4gyiYw8XnFCdNZ/oJcDwzQAYJvzkblrrUfBpPXXo27BHoXCIIli5o hjfG1EZpMvdsdRL66NPCU/nz333dm9KHMwYWw99s85mydOWvyhkPwMLzowDcLeXpMtBs G9dmlhU3PRkH82ZAs6EjGsULbQMB1pyMMS03ZQNG2aou+qooblLYB2n9++fqbuAHUsk3 kqMA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424211; x=1690016211; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=02DUTqACM/aSl9866C36A3I60VPYUs4OjMZ7Yfrc1/o=; b=Wn51SqAlm+/MzLn9I53hfSvPSEBIyLAFz6dgw2rJoFVeSw4liGmgF1Wo5rUXBq4OyE 81Vp3RmqMmoJE5HNHVPANBoC2P1AZMtZ/lmuNRzUGVlogcsxTNIpzKGSP6ikmBjfE8xU 9cbYkLCd6aIoIEQtOXojaNz5bXZEEEinVvZ04XSQKpM9eeeLI0V4tmmwZ3U7TorBOd7a zr/eOa1wteBp7+GlV0uDq/PBuDBIdWMciS2L1s0uhPe6DN1w1OQhgKooUkVJ6qX7EYlY QF0YAOGE7QQUIxWhfGU4pnj8Pf7EYOSqfritm2NZI4Mv6l3ogQkAHNhH9QFrVASf52O9 bBuw== X-Gm-Message-State: AC+VfDxchLVqSz1loz6rkbBPMzKrQXZW0s1bfV1LM7rCL15HLJ/YSBKj yIFxlkAvaq5+k/uL209YlNGqJg== X-Google-Smtp-Source: ACHHUZ4bv27VnKnfH9aSIwqm+pi4rG014ro45egLBBsO6iOayIjodJqPa/Ct3mSpE0n2zy/iA6cg+A== X-Received: by 2002:a17:902:ecc6:b0:1b1:9272:55e2 with SMTP id a6-20020a170902ecc600b001b1927255e2mr21755968plh.3.1687424211650; Thu, 22 Jun 2023 01:56:51 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.56.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:56:51 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 21/29] fs: super: dynamically allocate the s_shrink Date: Thu, 22 Jun 2023 16:53:27 +0800 Message-Id: <20230622085335.77010-22-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org In preparation for implementing lockless slab shrink, we need to dynamically allocate the s_shrink, so that it can be freed asynchronously using kfree_rcu(). Then it doesn't need to wait for RCU read-side critical section when releasing the struct super_block. Signed-off-by: Qi Zheng --- fs/btrfs/super.c | 2 +- fs/kernfs/mount.c | 2 +- fs/proc/root.c | 2 +- fs/super.c | 38 ++++++++++++++++++++++---------------- include/linux/fs.h | 2 +- 5 files changed, 26 insertions(+), 20 deletions(-) diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index f1dd172d8d5b..fad4ded26c80 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -1513,7 +1513,7 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type, error = -EBUSY; } else { snprintf(s->s_id, sizeof(s->s_id), "%pg", bdev); - shrinker_debugfs_rename(&s->s_shrink, "sb-%s:%s", fs_type->name, + shrinker_debugfs_rename(s->s_shrink, "sb-%s:%s", fs_type->name, s->s_id); btrfs_sb(s)->bdev_holder = fs_type; error = btrfs_fill_super(s, fs_devices, data); diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c index d49606accb07..2657ff1181f1 100644 --- a/fs/kernfs/mount.c +++ b/fs/kernfs/mount.c @@ -256,7 +256,7 @@ static int kernfs_fill_super(struct super_block *sb, struct kernfs_fs_context *k sb->s_time_gran = 1; /* sysfs dentries and inodes don't require IO to create */ - sb->s_shrink.seeks = 0; + sb->s_shrink->seeks = 0; /* get root inode, initialize and unlock it */ down_read(&kf_root->kernfs_rwsem); diff --git a/fs/proc/root.c b/fs/proc/root.c index a86e65a608da..22b78b28b477 100644 --- a/fs/proc/root.c +++ b/fs/proc/root.c @@ -188,7 +188,7 @@ static int proc_fill_super(struct super_block *s, struct fs_context *fc) s->s_stack_depth = FILESYSTEM_MAX_STACK_DEPTH; /* procfs dentries and inodes don't require IO to create */ - s->s_shrink.seeks = 0; + s->s_shrink->seeks = 0; pde_get(&proc_root); root_inode = proc_get_inode(s, &proc_root); diff --git a/fs/super.c b/fs/super.c index 2e83c8cd435b..791342bb8ac9 100644 --- a/fs/super.c +++ b/fs/super.c @@ -67,7 +67,7 @@ static unsigned long super_cache_scan(struct shrinker *shrink, long dentries; long inodes; - sb = container_of(shrink, struct super_block, s_shrink); + sb = shrink->private_data; /* * Deadlock avoidance. We may hold various FS locks, and we don't want @@ -120,7 +120,7 @@ static unsigned long super_cache_count(struct shrinker *shrink, struct super_block *sb; long total_objects = 0; - sb = container_of(shrink, struct super_block, s_shrink); + sb = shrink->private_data; /* * We don't call trylock_super() here as it is a scalability bottleneck, @@ -182,7 +182,10 @@ static void destroy_unused_super(struct super_block *s) security_sb_free(s); put_user_ns(s->s_user_ns); kfree(s->s_subtype); - free_prealloced_shrinker(&s->s_shrink); + if (s->s_shrink) { + free_prealloced_shrinker(s->s_shrink); + shrinker_free(s->s_shrink); + } /* no delays needed */ destroy_super_work(&s->destroy_work); } @@ -259,16 +262,19 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags, s->s_time_min = TIME64_MIN; s->s_time_max = TIME64_MAX; - s->s_shrink.seeks = DEFAULT_SEEKS; - s->s_shrink.scan_objects = super_cache_scan; - s->s_shrink.count_objects = super_cache_count; - s->s_shrink.batch = 1024; - s->s_shrink.flags = SHRINKER_NUMA_AWARE | SHRINKER_MEMCG_AWARE; - if (prealloc_shrinker(&s->s_shrink, "sb-%s", type->name)) + s->s_shrink = shrinker_alloc_and_init(super_cache_count, + super_cache_scan, 1024, + DEFAULT_SEEKS, + SHRINKER_NUMA_AWARE | SHRINKER_MEMCG_AWARE, + s); + if (!s->s_shrink) + goto fail; + + if (prealloc_shrinker(s->s_shrink, "sb-%s", type->name)) goto fail; - if (list_lru_init_memcg(&s->s_dentry_lru, &s->s_shrink)) + if (list_lru_init_memcg(&s->s_dentry_lru, s->s_shrink)) goto fail; - if (list_lru_init_memcg(&s->s_inode_lru, &s->s_shrink)) + if (list_lru_init_memcg(&s->s_inode_lru, s->s_shrink)) goto fail; return s; @@ -326,7 +332,7 @@ void deactivate_locked_super(struct super_block *s) { struct file_system_type *fs = s->s_type; if (atomic_dec_and_test(&s->s_active)) { - unregister_shrinker(&s->s_shrink); + unregister_and_free_shrinker(s->s_shrink); fs->kill_sb(s); /* @@ -599,7 +605,7 @@ struct super_block *sget_fc(struct fs_context *fc, hlist_add_head(&s->s_instances, &s->s_type->fs_supers); spin_unlock(&sb_lock); get_filesystem(s->s_type); - register_shrinker_prepared(&s->s_shrink); + register_shrinker_prepared(s->s_shrink); return s; share_extant_sb: @@ -678,7 +684,7 @@ struct super_block *sget(struct file_system_type *type, hlist_add_head(&s->s_instances, &type->fs_supers); spin_unlock(&sb_lock); get_filesystem(type); - register_shrinker_prepared(&s->s_shrink); + register_shrinker_prepared(s->s_shrink); return s; } EXPORT_SYMBOL(sget); @@ -1308,7 +1314,7 @@ int get_tree_bdev(struct fs_context *fc, down_write(&s->s_umount); } else { snprintf(s->s_id, sizeof(s->s_id), "%pg", bdev); - shrinker_debugfs_rename(&s->s_shrink, "sb-%s:%s", + shrinker_debugfs_rename(s->s_shrink, "sb-%s:%s", fc->fs_type->name, s->s_id); sb_set_blocksize(s, block_size(bdev)); error = fill_super(s, fc); @@ -1381,7 +1387,7 @@ struct dentry *mount_bdev(struct file_system_type *fs_type, down_write(&s->s_umount); } else { snprintf(s->s_id, sizeof(s->s_id), "%pg", bdev); - shrinker_debugfs_rename(&s->s_shrink, "sb-%s:%s", + shrinker_debugfs_rename(s->s_shrink, "sb-%s:%s", fs_type->name, s->s_id); sb_set_blocksize(s, block_size(bdev)); error = fill_super(s, data, flags & SB_SILENT ? 1 : 0); diff --git a/include/linux/fs.h b/include/linux/fs.h index 53e0b5e98046..dd6f8ce28385 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1228,7 +1228,7 @@ struct super_block { const struct dentry_operations *s_d_op; /* default d_op for dentries */ - struct shrinker s_shrink; /* per-sb shrinker handle */ + struct shrinker *s_shrink; /* per-sb shrinker handle */ /* Number of inodes with nlink == 0 but still referenced */ atomic_long_t s_remove_count; From patchwork Thu Jun 22 08:53:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288742 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B495EC001B3 for ; Thu, 22 Jun 2023 09:03:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231823AbjFVJDa (ORCPT ); Thu, 22 Jun 2023 05:03:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44040 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231820AbjFVJCz (ORCPT ); Thu, 22 Jun 2023 05:02:55 -0400 Received: from mail-pl1-x630.google.com (mail-pl1-x630.google.com [IPv6:2607:f8b0:4864:20::630]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 341BE4683 for ; Thu, 22 Jun 2023 01:57:30 -0700 (PDT) Received: by mail-pl1-x630.google.com with SMTP id d9443c01a7336-1b52418c25bso12035515ad.0 for ; Thu, 22 Jun 2023 01:57:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424219; x=1690016219; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=kOpMuT1tSay47KCV2B0lCCsOSa7MUah7TRoPRzKvzWA=; b=ARIMz0kPY37zDpA2KUfwIMWCUvAyZHIYiFTXtRMngKg5guyDvAvZNDKsvB2ZgT8UTW A25yLHuKDIGVoaFMSsnctFsAVv7yr9hBv5sSwiDDhv/mHcXUMpPqBe+IawTWyFdB0del fl+Pxlo1XCZ20awrxgcMp8gUHyp7xw9pGxosYaiH18lXNbdNEUFeaQtGj4iDzF4J11pb jyEKbhwN4kpW66g5W4dvNRCNuWMTsByWhUSRp4GIzQ739D16xyqzHvKa0SrFnLXwxenz WdarMgZ86g8pGHWvsC0B3J0QiIU/iI2cSgeTldT8LPJGZSfxzUyvtAnV3WYD0uKX4hXF pNRw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424219; x=1690016219; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=kOpMuT1tSay47KCV2B0lCCsOSa7MUah7TRoPRzKvzWA=; b=gzz6oWIQhOmu/MSGydG8Px6Jc1yOvRfQ0Js6vQ8N0ejE9Z0EAN0vSKUIjRwJII7jHR PIKs/3DrygxVmY+TdVojbG5WYfSRI15UK0ewItdZYi5/7SF1tvMZW9rVSUi9/RTAW2V/ +0azCvZoldIxI6aSVZhPe0k/QuA1c8PtHMLEkPZRWw9uFluUYFnyZzef/sGhwIl+YIO0 B7hfeS2kH1j+ZRcy9N1kV6kYjVNiesD5d+hXPA+vU2whFzovvzIdqG8d1zQQ4AAuUy2d DXoydh7f9+FwEPMoiRwbpuwjorJYiZ8DcMkXBm/tNJS9Cpu28Qa2jmxRIgReWqu05Jow oDUA== X-Gm-Message-State: AC+VfDyqCCqMHHs7dX4WkdHBQnzJJsgf3oswe08oZr7fojFEeOzhXtni KDeOYAVuUNbNJNpFwCJH27n+Gw== X-Google-Smtp-Source: ACHHUZ7DqrxYhESUqlF1zio/fEoWG5nPFDtKmAeSV45runPF4798dCp+1eyZAVtYptcjl0nDKDsZDg== X-Received: by 2002:a17:903:230c:b0:1b3:ec39:f42c with SMTP id d12-20020a170903230c00b001b3ec39f42cmr21844690plh.5.1687424219648; Thu, 22 Jun 2023 01:56:59 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.56.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:56:59 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 22/29] drm/ttm: introduce pool_shrink_rwsem Date: Thu, 22 Jun 2023 16:53:28 +0800 Message-Id: <20230622085335.77010-23-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Currently, the synchronize_shrinkers() is only used by TTM pool. It only requires that no shrinkers run in parallel. After we use RCU+refcount method to implement the lockless slab shrink, we can not use shrinker_rwsem or synchronize_rcu() to guarantee that all shrinker invocations have seen an update before freeing memory. So we introduce a new pool_shrink_rwsem to implement a private synchronize_shrinkers(), so as to achieve the same purpose. Signed-off-by: Qi Zheng --- drivers/gpu/drm/ttm/ttm_pool.c | 15 +++++++++++++++ include/linux/shrinker.h | 1 - mm/vmscan.c | 15 --------------- 3 files changed, 15 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c index cddb9151d20f..713b1c0a70e1 100644 --- a/drivers/gpu/drm/ttm/ttm_pool.c +++ b/drivers/gpu/drm/ttm/ttm_pool.c @@ -74,6 +74,7 @@ static struct ttm_pool_type global_dma32_uncached[MAX_ORDER + 1]; static spinlock_t shrinker_lock; static struct list_head shrinker_list; static struct shrinker mm_shrinker; +static DECLARE_RWSEM(pool_shrink_rwsem); /* Allocate pages of size 1 << order with the given gfp_flags */ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags, @@ -317,6 +318,7 @@ static unsigned int ttm_pool_shrink(void) unsigned int num_pages; struct page *p; + down_read(&pool_shrink_rwsem); spin_lock(&shrinker_lock); pt = list_first_entry(&shrinker_list, typeof(*pt), shrinker_list); list_move_tail(&pt->shrinker_list, &shrinker_list); @@ -329,6 +331,7 @@ static unsigned int ttm_pool_shrink(void) } else { num_pages = 0; } + up_read(&pool_shrink_rwsem); return num_pages; } @@ -572,6 +575,18 @@ void ttm_pool_init(struct ttm_pool *pool, struct device *dev, } EXPORT_SYMBOL(ttm_pool_init); +/** + * synchronize_shrinkers - Wait for all running shrinkers to complete. + * + * This is useful to guarantee that all shrinker invocations have seen an + * update, before freeing memory, similar to rcu. + */ +static void synchronize_shrinkers(void) +{ + down_write(&pool_shrink_rwsem); + up_write(&pool_shrink_rwsem); +} + /** * ttm_pool_fini - Cleanup a pool * diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h index 8e9ba6fa3fcc..4094e4c44e80 100644 --- a/include/linux/shrinker.h +++ b/include/linux/shrinker.h @@ -105,7 +105,6 @@ extern int __printf(2, 3) register_shrinker(struct shrinker *shrinker, const char *fmt, ...); extern void unregister_shrinker(struct shrinker *shrinker); extern void free_prealloced_shrinker(struct shrinker *shrinker); -extern void synchronize_shrinkers(void); typedef unsigned long (*count_objects_cb)(struct shrinker *s, struct shrink_control *sc); diff --git a/mm/vmscan.c b/mm/vmscan.c index 64ff598fbad9..3a8d50ad6ff6 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -844,21 +844,6 @@ void unregister_and_free_shrinker(struct shrinker *shrinker) } EXPORT_SYMBOL(unregister_and_free_shrinker); -/** - * synchronize_shrinkers - Wait for all running shrinkers to complete. - * - * This is equivalent to calling unregister_shrink() and register_shrinker(), - * but atomically and with less overhead. This is useful to guarantee that all - * shrinker invocations have seen an update, before freeing memory, similar to - * rcu. - */ -void synchronize_shrinkers(void) -{ - down_write(&shrinker_rwsem); - up_write(&shrinker_rwsem); -} -EXPORT_SYMBOL(synchronize_shrinkers); - #define SHRINK_BATCH 128 static unsigned long do_shrink_slab(struct shrink_control *shrinkctl, From patchwork Thu Jun 22 08:53:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288745 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C9983EB64D8 for ; Thu, 22 Jun 2023 09:04:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231955AbjFVJEe (ORCPT ); Thu, 22 Jun 2023 05:04:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48378 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231961AbjFVJDu (ORCPT ); Thu, 22 Jun 2023 05:03:50 -0400 Received: from mail-pg1-x529.google.com (mail-pg1-x529.google.com [IPv6:2607:f8b0:4864:20::529]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4B075269F for ; Thu, 22 Jun 2023 01:57:44 -0700 (PDT) Received: by mail-pg1-x529.google.com with SMTP id 41be03b00d2f7-54fd4a7ce25so1096467a12.0 for ; Thu, 22 Jun 2023 01:57:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424227; x=1690016227; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=ZsvVWwvjCNbIpzLWR+JMbRdqcPMoA6IncSRNtrMJjtk=; b=btQFnTi1+Jxgpt9ZEv6SV1CxbtEwEnpYQ1WSSwhfGLNNoz+3LZU80wQIeDEMGJinhi 8+hFjjrmI0Ti2weoui+1nba3fuqZZ6jN3eQN4e+CjSeo0Ggyrqg3j3GGKdGsxTGtEKKB 6n4ntOH8ClqO9PsI9RmJeFYa/h0xYT0vefdwMfx1z83TBiR/mTSXCJDkn82jEtfe964Z +W5pu2WQm5/BmUewzteVLri/ASv79HUROzRO9GvLtgUyAJkwQyVDE/hnKFhVeWh3zZR/ QE/dybVozXXKkm+aMId69JeSrmobjvQPBxgZrNjMENKTPWKUy1hHiQ4+Qa8p1f7rRS3j Hymg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424227; x=1690016227; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ZsvVWwvjCNbIpzLWR+JMbRdqcPMoA6IncSRNtrMJjtk=; b=G95EgrCrgGkXC92rBlg0p6B9tuZawBPRieWGQhJsPJaWlZKxMqi+AkKkFgmqlkC/41 IdvBk46Xc9AZIkk9Vx4xp/IwPjFfpOQcdu4w5ZjZK49mp4Gp5eMnikhVwWHCx5HehggP 19KiemScRqVpZTo+T8m/qm4C3S4y1Pl6bseehvBmVwk9agNVR1NmLjs9cJxY6Hx3rAAa l2nUTfQ9N/vGtFAw6CEJRYVqb1UTKyzi9c3/DEsLMc9Ayg0hNnSXh6SwAybtahJbShsQ XXAct+6Hxq7Btf4pElIxNQrPlPq+JmRuLByDW1ddTjZ5x/uyVNHd9wL+A4yhVueid8p4 Bi4w== X-Gm-Message-State: AC+VfDyB9hv2293vwhIH5vRzGCwP0eW7SpTOAcRsgZJdp3ZeNeCyVgNr 97KXRxxGFTevlKVwbV0TNPCWUQ== X-Google-Smtp-Source: ACHHUZ6IzlQYehvkbwm6qTB7qvbww6jKgEwXRnqs2X+ZAvHSdOqqO5hMcbKXrRtXSYnzqJ1v0JsE2A== X-Received: by 2002:a17:903:1105:b0:1b3:ebda:654e with SMTP id n5-20020a170903110500b001b3ebda654emr20780380plh.5.1687424227653; Thu, 22 Jun 2023 01:57:07 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.57.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:57:07 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 23/29] mm: shrinker: add refcount and completion_wait fields Date: Thu, 22 Jun 2023 16:53:29 +0800 Message-Id: <20230622085335.77010-24-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org This commit introduces refcount and completion_wait fields to struct shrinker to manage the life cycle of shrinker instance. Just a preparation work for implementing the lockless slab shrink, no functional changes. Signed-off-by: Qi Zheng --- include/linux/shrinker.h | 11 +++++++++++ mm/vmscan.c | 5 +++++ 2 files changed, 16 insertions(+) diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h index 4094e4c44e80..7bfeb2f25246 100644 --- a/include/linux/shrinker.h +++ b/include/linux/shrinker.h @@ -4,6 +4,8 @@ #include #include +#include +#include /* * This struct is used to pass information from page reclaim to the shrinkers. @@ -70,6 +72,9 @@ struct shrinker { int seeks; /* seeks to recreate an obj */ unsigned flags; + refcount_t refcount; + struct completion completion_wait; + void *private_data; /* These are for internal use */ @@ -118,6 +123,12 @@ struct shrinker *shrinker_alloc_and_init(count_objects_cb count, void shrinker_free(struct shrinker *shrinker); void unregister_and_free_shrinker(struct shrinker *shrinker); +static inline void shrinker_put(struct shrinker *shrinker) +{ + if (refcount_dec_and_test(&shrinker->refcount)) + complete(&shrinker->completion_wait); +} + #ifdef CONFIG_SHRINKER_DEBUG extern int shrinker_debugfs_add(struct shrinker *shrinker); extern struct dentry *shrinker_debugfs_detach(struct shrinker *shrinker, diff --git a/mm/vmscan.c b/mm/vmscan.c index 3a8d50ad6ff6..6f9c4750effa 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -740,6 +740,8 @@ void free_prealloced_shrinker(struct shrinker *shrinker) void register_shrinker_prepared(struct shrinker *shrinker) { down_write(&shrinker_rwsem); + refcount_set(&shrinker->refcount, 1); + init_completion(&shrinker->completion_wait); list_add_tail(&shrinker->list, &shrinker_list); shrinker->flags |= SHRINKER_REGISTERED; shrinker_debugfs_add(shrinker); @@ -794,6 +796,9 @@ void unregister_shrinker(struct shrinker *shrinker) if (!(shrinker->flags & SHRINKER_REGISTERED)) return; + shrinker_put(shrinker); + wait_for_completion(&shrinker->completion_wait); + down_write(&shrinker_rwsem); list_del(&shrinker->list); shrinker->flags &= ~SHRINKER_REGISTERED; From patchwork Thu Jun 22 08:53:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288744 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C7F13EB64D8 for ; Thu, 22 Jun 2023 09:04:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231883AbjFVJEZ (ORCPT ); Thu, 22 Jun 2023 05:04:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48766 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231682AbjFVJDo (ORCPT ); Thu, 22 Jun 2023 05:03:44 -0400 Received: from mail-pl1-x633.google.com (mail-pl1-x633.google.com [IPv6:2607:f8b0:4864:20::633]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 667F744BD for ; Thu, 22 Jun 2023 01:57:53 -0700 (PDT) Received: by mail-pl1-x633.google.com with SMTP id d9443c01a7336-1b5466bc5f8so9657445ad.1 for ; Thu, 22 Jun 2023 01:57:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424235; x=1690016235; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=WJL0OVu56Nd2o0mxjMzDLbJ1YgOSFE+1vh1fraMJRQc=; b=ZL5KbVwmDDj8Gy97lbIJB7YMY8SjOouZdkJA02qk3b6DF+dyuydzFaZQylnxfC9S+S TSPs6tccznQsZMFsLcEJsyUCVw2T8A7tWiogG+8JrU9eTxXc/9BRSgGLnZUgk22umk+B SSngiAqdElEoARuVlyhvyt60KkFHlCbx7fUzwDo8HLyuhKs4MGrMo82qC5g6M3adQ/kW 5sTm4LvGQVeaef4Ot0YEMEqtMtR7HuPlighv8JjZTv9ZOAYC2jp2PBoT7dm2kyLjCYpo 2319G2aelwdOOipT4nDLl60x2b/otqGmO31pzkQB3PluAdANWpdqDdwlhTA0kEDJVKL2 ha0w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424235; x=1690016235; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=WJL0OVu56Nd2o0mxjMzDLbJ1YgOSFE+1vh1fraMJRQc=; b=iksWFq+vZs1PeZgwAZRaGP5i0vQuAoSK9ZO/PrBSM3pz3ObhwxZFH0DcWmsSPV/Zfe whOqeBcsduToLO6xGvieQ1DYROz4wOZ1gqi0oduqUKGQ5S/+SGcLLhwAheDVbTMcvpir lwQyvqfyKue4D2WqyChd+u3yasTabqqnXCHgvRnS5y/a38rUQ9pUao6ruSIYDnk4JF3x dvVr8NrB9pyLB409fMZe6NN66YWXpO5KGB8jyv8MMrZWYV1K49QlnNl/olRGyoa5rts8 OsDkVsngq19qB1KYjnIZHdnkwxy05rKrX0FlYLvGiPSGcGv7mzp0KlbECXx9HAqLvDsr buxw== X-Gm-Message-State: AC+VfDzh8EXC4ESxdfrfWpDW7xotev+uVY+43cOzyfqWweQaSwqa4lyL soHGghhXekEaOWZtpl151OnbXg== X-Google-Smtp-Source: ACHHUZ7SQ8rS1EvkM8z3rgy8xjUWZCKvjOIofntYWhPm+3TRoZDLL1AdCBUgXDGJA0nPgYbkXii2Dw== X-Received: by 2002:a17:902:c945:b0:1ae:3ff8:7fa7 with SMTP id i5-20020a170902c94500b001ae3ff87fa7mr21702029pla.4.1687424235565; Thu, 22 Jun 2023 01:57:15 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.57.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:57:15 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 24/29] mm: vmscan: make global slab shrink lockless Date: Thu, 22 Jun 2023 16:53:30 +0800 Message-Id: <20230622085335.77010-25-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org The shrinker_rwsem is a global read-write lock in shrinkers subsystem, which protects most operations such as slab shrink, registration and unregistration of shrinkers, etc. This can easily cause problems in the following cases. 1) When the memory pressure is high and there are many filesystems mounted or unmounted at the same time, slab shrink will be affected (down_read_trylock() failed). Such as the real workload mentioned by Kirill Tkhai: ``` One of the real workloads from my experience is start of an overcommitted node containing many starting containers after node crash (or many resuming containers after reboot for kernel update). In these cases memory pressure is huge, and the node goes round in long reclaim. ``` 2) If a shrinker is blocked (such as the case mentioned in [1]) and a writer comes in (such as mount a fs), then this writer will be blocked and cause all subsequent shrinker-related operations to be blocked. Even if there is no competitor when shrinking slab, there may still be a problem. If we have a long shrinker list and we do not reclaim enough memory with each shrinker, then the down_read_trylock() may be called with high frequency. Because of the poor multicore scalability of atomic operations, this can lead to a significant drop in IPC (instructions per cycle). We used to implement the lockless slab shrink with SRCU [1], but then kernel test robot reported -88.8% regression in stress-ng.ramfs.ops_per_sec test case [2], so we reverted it [3]. This commit uses the refcount+RCU method [4] proposed by by Dave Chinner to re-implement the lockless global slab shrink. The memcg slab shrink is handled in the subsequent patch. Currently, the shrinker instances can be divided into the following three types: a) global shrinker instance statically defined in the kernel, such as workingset_shadow_shrinker. b) global shrinker instance statically defined in the kernel modules, such as mmu_shrinker in x86. c) shrinker instance embedded in other structures. For case a, the memory of shrinker instance is never freed. For case b, the memory of shrinker instance will be freed after the module is unloaded. But we will call synchronize_rcu() in free_module() to wait for RCU read-side critical section to exit. For case c, the memory of shrinker instance will be dynamically freed by calling kfree_rcu(). So we can use rcu_read_{lock,unlock}() to ensure that the shrinker instance is valid. The shrinker::refcount mechanism ensures that the shrinker instance will not be run again after unregistration. So the structure that records the pointer of shrinker instance can be safely freed without waiting for the RCU read-side critical section. In this way, while we implement the lockless slab shrink, we don't need to be blocked in unregister_shrinker() to wait RCU read-side critical section. The following are the test results: stress-ng --timeout 60 --times --verify --metrics-brief --ramfs 9 & 1) Before applying this patchset: setting to a 60 second run per stressor dispatching hogs: 9 ramfs stressor bogo ops real time usr time sys time bogo ops/s bogo ops/s (secs) (secs) (secs) (real time) (usr+sys time) ramfs 880623 60.02 7.71 226.93 14671.45 3753.09 ramfs: 1 System Management Interrupt for a 60.03s run time: 5762.40s available CPU time 7.71s user time ( 0.13%) 226.93s system time ( 3.94%) 234.64s total time ( 4.07%) load average: 8.54 3.06 2.11 passed: 9: ramfs (9) failed: 0 skipped: 0 successful run completed in 60.03s (1 min, 0.03 secs) 2) After applying this patchset: setting to a 60 second run per stressor dispatching hogs: 9 ramfs stressor bogo ops real time usr time sys time bogo ops/s bogo ops/s (secs) (secs) (secs) (real time) (usr+sys time) ramfs 847562 60.02 7.44 230.22 14120.66 3566.23 ramfs: 4 System Management Interrupts for a 60.12s run time: 5771.95s available CPU time 7.44s user time ( 0.13%) 230.22s system time ( 3.99%) 237.66s total time ( 4.12%) load average: 8.18 2.43 0.84 passed: 9: ramfs (9) failed: 0 skipped: 0 successful run completed in 60.12s (1 min, 0.12 secs) We can see that the ops/s has hardly changed. [1]. https://lore.kernel.org/lkml/20230313112819.38938-1-zhengqi.arch@bytedance.com/ [2]. https://lore.kernel.org/lkml/202305230837.db2c233f-yujie.liu@intel.com/ [3]. https://lore.kernel.org/all/20230609081518.3039120-1-qi.zheng@linux.dev/ [4]. https://lore.kernel.org/lkml/ZIJhou1d55d4H1s0@dread.disaster.area/ Signed-off-by: Qi Zheng --- include/linux/shrinker.h | 6 ++++++ mm/vmscan.c | 33 ++++++++++++++------------------- 2 files changed, 20 insertions(+), 19 deletions(-) diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h index 7bfeb2f25246..b0c6c2df9db8 100644 --- a/include/linux/shrinker.h +++ b/include/linux/shrinker.h @@ -74,6 +74,7 @@ struct shrinker { refcount_t refcount; struct completion completion_wait; + struct rcu_head rcu; void *private_data; @@ -123,6 +124,11 @@ struct shrinker *shrinker_alloc_and_init(count_objects_cb count, void shrinker_free(struct shrinker *shrinker); void unregister_and_free_shrinker(struct shrinker *shrinker); +static inline bool shrinker_try_get(struct shrinker *shrinker) +{ + return refcount_inc_not_zero(&shrinker->refcount); +} + static inline void shrinker_put(struct shrinker *shrinker) { if (refcount_dec_and_test(&shrinker->refcount)) diff --git a/mm/vmscan.c b/mm/vmscan.c index 6f9c4750effa..767569698946 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -57,6 +57,7 @@ #include #include #include +#include #include #include @@ -742,7 +743,7 @@ void register_shrinker_prepared(struct shrinker *shrinker) down_write(&shrinker_rwsem); refcount_set(&shrinker->refcount, 1); init_completion(&shrinker->completion_wait); - list_add_tail(&shrinker->list, &shrinker_list); + list_add_tail_rcu(&shrinker->list, &shrinker_list); shrinker->flags |= SHRINKER_REGISTERED; shrinker_debugfs_add(shrinker); up_write(&shrinker_rwsem); @@ -800,7 +801,7 @@ void unregister_shrinker(struct shrinker *shrinker) wait_for_completion(&shrinker->completion_wait); down_write(&shrinker_rwsem); - list_del(&shrinker->list); + list_del_rcu(&shrinker->list); shrinker->flags &= ~SHRINKER_REGISTERED; if (shrinker->flags & SHRINKER_MEMCG_AWARE) unregister_memcg_shrinker(shrinker); @@ -845,7 +846,7 @@ EXPORT_SYMBOL(shrinker_free); void unregister_and_free_shrinker(struct shrinker *shrinker) { unregister_shrinker(shrinker); - kfree(shrinker); + kfree_rcu(shrinker, rcu); } EXPORT_SYMBOL(unregister_and_free_shrinker); @@ -1067,33 +1068,27 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid, if (!mem_cgroup_disabled() && !mem_cgroup_is_root(memcg)) return shrink_slab_memcg(gfp_mask, nid, memcg, priority); - if (!down_read_trylock(&shrinker_rwsem)) - goto out; - - list_for_each_entry(shrinker, &shrinker_list, list) { + rcu_read_lock(); + list_for_each_entry_rcu(shrinker, &shrinker_list, list) { struct shrink_control sc = { .gfp_mask = gfp_mask, .nid = nid, .memcg = memcg, }; + if (!shrinker_try_get(shrinker)) + continue; + rcu_read_unlock(); + ret = do_shrink_slab(&sc, shrinker, priority); if (ret == SHRINK_EMPTY) ret = 0; freed += ret; - /* - * Bail out if someone want to register a new shrinker to - * prevent the registration from being stalled for long periods - * by parallel ongoing shrinking. - */ - if (rwsem_is_contended(&shrinker_rwsem)) { - freed = freed ? : 1; - break; - } - } - up_read(&shrinker_rwsem); -out: + rcu_read_lock(); + shrinker_put(shrinker); + } + rcu_read_unlock(); cond_resched(); return freed; } From patchwork Thu Jun 22 08:53:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288801 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8730FC04FDF for ; Thu, 22 Jun 2023 09:06:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232161AbjFVJGS (ORCPT ); Thu, 22 Jun 2023 05:06:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49236 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232008AbjFVJEb (ORCPT ); Thu, 22 Jun 2023 05:04:31 -0400 Received: from mail-pg1-x52c.google.com (mail-pg1-x52c.google.com [IPv6:2607:f8b0:4864:20::52c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9FEFA49CC for ; Thu, 22 Jun 2023 01:58:04 -0700 (PDT) Received: by mail-pg1-x52c.google.com with SMTP id 41be03b00d2f7-54fd4a7ce25so1096510a12.0 for ; Thu, 22 Jun 2023 01:58:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424243; x=1690016243; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=A8Fm0YGxp8yCkyBdSSDbuW3Cqk4JmtEEZRGj6EeWqug=; b=VXm59oxeB5FguY23iZ6wxsHZUa/n0ixVM5+RJovFhBsSqk7jUN18WD75GpJCgArR0W JBsEkA9d39jtf7NG1SF+QvFe/XJdGOqobYaTEX0aq6YpeCNoIMMKV4Gdkmm7QKEFMCrH QNswdcRq+FKKLLJSUCefQDEwUuyAf6EXdUAA8SWzr8d5PlDWqAvD+NLb3ziu4Q1ZbddG 3AHOgUEiuKr4IMyzSNLNbfYcakjlWAem1FVIRkWrvCWnM54TEnQwCZtRjpDkJUyTk1gU FK042RTEuCbaP60yQQ0pKyvy7feFAoM0ianmMZaWx78B21cSO7WyFn4vwESs5Li+xEdq ozzg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424243; x=1690016243; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=A8Fm0YGxp8yCkyBdSSDbuW3Cqk4JmtEEZRGj6EeWqug=; b=gWeJttAgDiZ67DoiXFpmryK0gyTiTgiFY9wfc92z3tC7unl7bpfDGkHNKHFRnPLSOa DRgO/cGIvAb8SfkcGva2A2Gbdhb5ca7ZfaNa7Up6iAv3ViUDhZpTaKjY8kZC2xoyShuI Vnx3/iz2Jw3EwgqD8rGH3alrUXGObfG/6AHeAzZ8Totw4KLYUgaOzi/LmvDaQ0Nk7Tet 5OHqfy1ZrIYFYVlnyxjPvLon0oD1eaxHrwob1AT18nj6L0LWP/BsTN8eT2TzC/I2l9tT o2A1Lki2swN5DUqpNw14u5nYAgvjOaiLWaLrzDOJl1t2JVN0KayzlIZmlnGHsY7SK7sH T7pQ== X-Gm-Message-State: AC+VfDwXFZ3Px8BP8W6fvbI6Z9oy6kOYH3pPNB/t0uY4/JrIdkALVLY8 Q824hULhthUeU7xWeHQ7VasM/w== X-Google-Smtp-Source: ACHHUZ6zp07NG2bkXXxglRYaIEoFlfKXA7hhsSA7stt9xCgrqpdelrec/tU9od+DRY9y6jYGAzcstA== X-Received: by 2002:a17:902:da91:b0:1b0:3d54:358f with SMTP id j17-20020a170902da9100b001b03d54358fmr20812342plx.0.1687424243512; Thu, 22 Jun 2023 01:57:23 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.57.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:57:23 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 25/29] mm: vmscan: make memcg slab shrink lockless Date: Thu, 22 Jun 2023 16:53:31 +0800 Message-Id: <20230622085335.77010-26-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Like global slab shrink, this commit also uses refcount+RCU method to make memcg slab shrink lockless. We can reproduce the down_read_trylock() hotspot through the following script: ``` DIR="/root/shrinker/memcg/mnt" do_create() { mkdir -p /sys/fs/cgroup/memory/test mkdir -p /sys/fs/cgroup/perf_event/test echo 4G > /sys/fs/cgroup/memory/test/memory.limit_in_bytes for i in `seq 0 $1`; do mkdir -p /sys/fs/cgroup/memory/test/$i; echo $$ > /sys/fs/cgroup/memory/test/$i/cgroup.procs; echo $$ > /sys/fs/cgroup/perf_event/test/cgroup.procs; mkdir -p $DIR/$i; done } do_mount() { for i in `seq $1 $2`; do mount -t tmpfs $i $DIR/$i; done } do_touch() { for i in `seq $1 $2`; do echo $$ > /sys/fs/cgroup/memory/test/$i/cgroup.procs; echo $$ > /sys/fs/cgroup/perf_event/test/cgroup.procs; dd if=/dev/zero of=$DIR/$i/file$i bs=1M count=1 & done } case "$1" in touch) do_touch $2 $3 ;; test) do_create 4000 do_mount 0 4000 do_touch 0 3000 ;; *) exit 1 ;; esac ``` Save the above script, then run test and touch commands. Then we can use the following perf command to view hotspots: perf top -U -F 999 [-g] 1) Before applying this patchset: 35.34% [kernel] [k] down_read_trylock 18.44% [kernel] [k] shrink_slab 15.98% [kernel] [k] pv_native_safe_halt 15.08% [kernel] [k] up_read 5.33% [kernel] [k] idr_find 2.71% [kernel] [k] _find_next_bit 2.21% [kernel] [k] shrink_node 1.29% [kernel] [k] shrink_lruvec 0.66% [kernel] [k] do_shrink_slab 0.33% [kernel] [k] list_lru_count_one 0.33% [kernel] [k] __radix_tree_lookup 0.25% [kernel] [k] mem_cgroup_iter - 82.19% 19.49% [kernel] [k] shrink_slab - 62.00% shrink_slab 36.37% down_read_trylock 15.52% up_read 5.48% idr_find 3.38% _find_next_bit + 0.98% do_shrink_slab 2) After applying this patchset: 46.83% [kernel] [k] shrink_slab 20.52% [kernel] [k] pv_native_safe_halt 8.85% [kernel] [k] do_shrink_slab 7.71% [kernel] [k] _find_next_bit 1.72% [kernel] [k] xas_descend 1.70% [kernel] [k] shrink_node 1.44% [kernel] [k] shrink_lruvec 1.43% [kernel] [k] mem_cgroup_iter 1.28% [kernel] [k] xas_load 0.89% [kernel] [k] super_cache_count 0.84% [kernel] [k] xas_start 0.66% [kernel] [k] list_lru_count_one - 65.50% 40.44% [kernel] [k] shrink_slab - 22.96% shrink_slab 13.11% _find_next_bit - 9.91% do_shrink_slab - 1.59% super_cache_count 0.92% list_lru_count_one We can see that the first perf hotspot becomes shrink_slab, which is what we expect. Signed-off-by: Qi Zheng --- mm/vmscan.c | 58 +++++++++++++++++++++++++++++++++++++---------------- 1 file changed, 41 insertions(+), 17 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 767569698946..357a1f2ad690 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -213,6 +213,12 @@ static struct shrinker_info *shrinker_info_protected(struct mem_cgroup *memcg, lockdep_is_held(&shrinker_rwsem)); } +static struct shrinker_info *shrinker_info_rcu(struct mem_cgroup *memcg, + int nid) +{ + return rcu_dereference(memcg->nodeinfo[nid]->shrinker_info); +} + static int expand_one_shrinker_info(struct mem_cgroup *memcg, int map_size, int defer_size, int old_map_size, int old_defer_size, @@ -339,7 +345,7 @@ void set_shrinker_bit(struct mem_cgroup *memcg, int nid, int shrinker_id) struct shrinker_info *info; rcu_read_lock(); - info = rcu_dereference(memcg->nodeinfo[nid]->shrinker_info); + info = shrinker_info_rcu(memcg, nid); if (!WARN_ON_ONCE(shrinker_id >= info->map_nr_max)) { /* Pairs with smp mb in shrink_slab() */ smp_mb__before_atomic(); @@ -359,7 +365,6 @@ static int prealloc_memcg_shrinker(struct shrinker *shrinker) return -ENOSYS; down_write(&shrinker_rwsem); - /* This may call shrinker, so it must use down_read_trylock() */ id = idr_alloc(&shrinker_idr, shrinker, 0, 0, GFP_KERNEL); if (id < 0) goto unlock; @@ -392,18 +397,28 @@ static long xchg_nr_deferred_memcg(int nid, struct shrinker *shrinker, struct mem_cgroup *memcg) { struct shrinker_info *info; + long nr_deferred; - info = shrinker_info_protected(memcg, nid); - return atomic_long_xchg(&info->nr_deferred[shrinker->id], 0); + rcu_read_lock(); + info = shrinker_info_rcu(memcg, nid); + nr_deferred = atomic_long_xchg(&info->nr_deferred[shrinker->id], 0); + rcu_read_unlock(); + + return nr_deferred; } static long add_nr_deferred_memcg(long nr, int nid, struct shrinker *shrinker, struct mem_cgroup *memcg) { struct shrinker_info *info; + long nr_deferred; + + rcu_read_lock(); + info = shrinker_info_rcu(memcg, nid); + nr_deferred = atomic_long_add_return(nr, &info->nr_deferred[shrinker->id]); + rcu_read_unlock(); - info = shrinker_info_protected(memcg, nid); - return atomic_long_add_return(nr, &info->nr_deferred[shrinker->id]); + return nr_deferred; } void reparent_shrinker_deferred(struct mem_cgroup *memcg) @@ -955,19 +970,18 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid, { struct shrinker_info *info; unsigned long ret, freed = 0; - int i; + int i = 0; if (!mem_cgroup_online(memcg)) return 0; - if (!down_read_trylock(&shrinker_rwsem)) - return 0; - - info = shrinker_info_protected(memcg, nid); +again: + rcu_read_lock(); + info = shrinker_info_rcu(memcg, nid); if (unlikely(!info)) goto unlock; - for_each_set_bit(i, info->map, info->map_nr_max) { + for_each_set_bit_from(i, info->map, info->map_nr_max) { struct shrink_control sc = { .gfp_mask = gfp_mask, .nid = nid, @@ -982,6 +996,10 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid, continue; } + if (!shrinker_try_get(shrinker)) + continue; + rcu_read_unlock(); + /* Call non-slab shrinkers even though kmem is disabled */ if (!memcg_kmem_online() && !(shrinker->flags & SHRINKER_NONSLAB)) @@ -1014,13 +1032,19 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid, } freed += ret; - if (rwsem_is_contended(&shrinker_rwsem)) { - freed = freed ? : 1; - break; - } + shrinker_put(shrinker); + + /* + * We have already exited the read-side of rcu critical section + * before calling do_shrink_slab(), the shrinker_info may be + * released in expand_one_shrinker_info(), so restart the + * iteration. + */ + i++; + goto again; } unlock: - up_read(&shrinker_rwsem); + rcu_read_unlock(); return freed; } #else /* CONFIG_MEMCG */ From patchwork Thu Jun 22 08:53:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288799 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9A83CEB64DA for ; Thu, 22 Jun 2023 09:06:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232026AbjFVJGP (ORCPT ); Thu, 22 Jun 2023 05:06:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48818 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231991AbjFVJEZ (ORCPT ); Thu, 22 Jun 2023 05:04:25 -0400 Received: from mail-pl1-x630.google.com (mail-pl1-x630.google.com [IPv6:2607:f8b0:4864:20::630]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 925EF49E8 for ; Thu, 22 Jun 2023 01:58:09 -0700 (PDT) Received: by mail-pl1-x630.google.com with SMTP id d9443c01a7336-1b693afe799so2466775ad.1 for ; Thu, 22 Jun 2023 01:58:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424252; x=1690016252; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=YpqAnPCFjvBxCUJyQNRij7IdVYJH6Jib7EICYqtSs4Q=; b=I9WP6mv1rDRCGtgaLzA9lhBXeFPAHaj8y0iyY2DOjVEWCsT2F7nfNJVwyIikm+kMao zO6QJGdxbeP+Xd2yLUuD3B5OfZQRm3B4q5mm2BLvCFNnuFgYbDreBXTpY911jw5PjLjw 5qBcX7//daeu9fw2bM5v9uWJQ3Yj963JAAdMROcF9Ug6KJW+BwPKjtr4qnEx+svXyNj4 WQtBJXjBe1rX51UFB2Mb+BbnprNNOvKWLYHk9IXF69UQEETABYtrhl0ZHyku4x/+376s ovmIBUSgYXrVJUG8L2WcJE6QBBiCJCq44tzgQbbUxJYU3EpXiLR3t2TzR3wvfdUTA7wG r7SA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424252; x=1690016252; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=YpqAnPCFjvBxCUJyQNRij7IdVYJH6Jib7EICYqtSs4Q=; b=QfMfd4OGBE4oGFCAoCA0Y+q1T9TyYvk4T5e3+hUbiOZfmYvzeutHdWFBEka6WUphQj /63zKjdP1zblJApeeQmzktvxyhBHns4FoEUkbESMGRJzo4K9+OzNpmEPubgpZEl+IjSL aOn5O2mq4E+wpJSXetJpLM6oby7ip2+3fUpL+ps/btZ6yGQjUpcDMQbzG7XfnxgvdGo0 yLRizP+ipAoTjJ74BNYsTYr9Wp9CGVnXYITOlOUzHP3i4OakiY5tO4xGeSTbCXyrInpB ulQEtkFO9WB62s55jPw9Xg6+w18Oc6bozbYt7cwQG9+qPx8kUIb3wKOweTqn0Dr6gqpM hO7A== X-Gm-Message-State: AC+VfDx1lhkuglKgB+16ysU46q9Y71vk+h+NBkadjGY9Mqc59SuENS+i Y4gnSYrSIWfMPv2IyrShcYpOL8/ctQMgL6I9fro= X-Google-Smtp-Source: ACHHUZ6FoF5W+f0N2ZUpFBUVBh1vs3Mj6G9dcM80ve0PhcTP+4niqZQRBrFmYJGvwbhs9MGB3E6MuA== X-Received: by 2002:a17:903:32c4:b0:1b3:e352:6d88 with SMTP id i4-20020a17090332c400b001b3e3526d88mr21673124plr.6.1687424251794; Thu, 22 Jun 2023 01:57:31 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.57.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:57:31 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 26/29] mm: shrinker: make count and scan in shrinker debugfs lockless Date: Thu, 22 Jun 2023 16:53:32 +0800 Message-Id: <20230622085335.77010-27-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Like global and memcg slab shrink, also make count and scan operations in memory shrinker debugfs lockless. The debugfs_remove_recursive() will wait for debugfs_file_put() to return, so there is no need to call rcu_read_lock() before calling shrinker_try_get(). Signed-off-by: Qi Zheng --- mm/shrinker_debug.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c index 3ab53fad8876..c18fa9b6b7f0 100644 --- a/mm/shrinker_debug.c +++ b/mm/shrinker_debug.c @@ -55,8 +55,8 @@ static int shrinker_debugfs_count_show(struct seq_file *m, void *v) if (!count_per_node) return -ENOMEM; - ret = down_read_killable(&shrinker_rwsem); - if (ret) { + ret = shrinker_try_get(shrinker); + if (!ret) { kfree(count_per_node); return ret; } @@ -92,7 +92,7 @@ static int shrinker_debugfs_count_show(struct seq_file *m, void *v) } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL); rcu_read_unlock(); - up_read(&shrinker_rwsem); + shrinker_put(shrinker); kfree(count_per_node); return ret; @@ -146,8 +146,8 @@ static ssize_t shrinker_debugfs_scan_write(struct file *file, return -EINVAL; } - ret = down_read_killable(&shrinker_rwsem); - if (ret) { + ret = shrinker_try_get(shrinker); + if (!ret) { mem_cgroup_put(memcg); return ret; } @@ -159,7 +159,7 @@ static ssize_t shrinker_debugfs_scan_write(struct file *file, shrinker->scan_objects(shrinker, &sc); - up_read(&shrinker_rwsem); + shrinker_put(shrinker); mem_cgroup_put(memcg); return size; From patchwork Thu Jun 22 08:53:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288802 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 109E8C04FE1 for ; Thu, 22 Jun 2023 09:06:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232008AbjFVJGV (ORCPT ); Thu, 22 Jun 2023 05:06:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48054 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232093AbjFVJE7 (ORCPT ); Thu, 22 Jun 2023 05:04:59 -0400 Received: from mail-pl1-x62e.google.com (mail-pl1-x62e.google.com [IPv6:2607:f8b0:4864:20::62e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 21D0D4C03 for ; Thu, 22 Jun 2023 01:58:20 -0700 (PDT) Received: by mail-pl1-x62e.google.com with SMTP id d9443c01a7336-1b5079b8cb3so12593615ad.1 for ; Thu, 22 Jun 2023 01:58:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424260; x=1690016260; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=aRQUwtIbFrjRbKaWDWITc6Upds+hXxxlwcQqvFPvJow=; b=Kv6dWh2IIchnBu9RbCVLsx8HZLugRgYdolMOnFdVv/Mze68pcgFvadMFaompRsV/Ii Vr4ahAwXirWdfwADAjPFAtSTivwacduawKzyeKPVnp7cESQMTGYfaqAI7425tQvAdQ85 8rLDDpZCwcAbH67+6k+hMmW8Fi1etqH6LYFHkuQfbD+ESFiwPmeSocm5kXqI50kDiSjf PCd9vDumWTECSSKSpOEWhL8aCyPGbcSxO0SaB85LOV7uPD9a5LVIfvAATYV0qgfKCSQQ J+7inf8lDO5dvyUSQTFKKOcNdRaAbUMX2WbzVCHQayCH8tZA09PfVWiof+vKchWmVnWN 0e0g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424260; x=1690016260; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=aRQUwtIbFrjRbKaWDWITc6Upds+hXxxlwcQqvFPvJow=; b=BXcfqqIoJfBW4tvz1H8DqTkBesA4bNY3VfGQEIIgGleUr4NNtvZhj2heLw1v5YtvSC eRVdV/VFL/FMqKyULedk2xNiEU8ggu9UsLgNH+7Ivy+iT7Qn92bpxK9hHZ98RzzTTbx9 5nzUoKlz5l8Ggo8a94o2g3V0mnwGPUd6myeWBAXAvbKkrMjL60h5vt0eQVDSyftTDUMY hpdEfQKdxQJ8hCBh65xvhV1Uck+oR0JT5b498cYinWt4w5s1qD82bkWvBfR1JPJQYwTX IpYSDjSW5ATb/hX6RYh7uMOL7eNTLj7eDjwMKg9+dqIC6XGpDBt+rcth1m5/gRXJA9D0 Q7Eg== X-Gm-Message-State: AC+VfDyDSP5A6LQehOa40rY0LRpmb0/gKQH7QmastKgmceSuU2MDxDV9 XigNGbSg0QRb4qYYShyZp9W4Kg== X-Google-Smtp-Source: ACHHUZ4KKZ8AXrupsFYw18EsJWvGfL6qm7cyYV/rSPTKOdYWRQ8IhOkLOim7I9s7W96jDohmbJS9Lg== X-Received: by 2002:a17:903:41d2:b0:1a6:cf4b:4d7d with SMTP id u18-20020a17090341d200b001a6cf4b4d7dmr21650808ple.2.1687424260203; Thu, 22 Jun 2023 01:57:40 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.57.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:57:39 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 27/29] mm: vmscan: hold write lock to reparent shrinker nr_deferred Date: Thu, 22 Jun 2023 16:53:33 +0800 Message-Id: <20230622085335.77010-28-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org For now, reparent_shrinker_deferred() is the only holder of read lock of shrinker_rwsem. And it already holds the global cgroup_mutex, so it will not be called in parallel. Therefore, in order to convert shrinker_rwsem to shrinker_mutex later, here we change to hold the write lock of shrinker_rwsem to reparent. Signed-off-by: Qi Zheng --- mm/vmscan.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 357a1f2ad690..0711b63e68d9 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -433,7 +433,7 @@ void reparent_shrinker_deferred(struct mem_cgroup *memcg) parent = root_mem_cgroup; /* Prevent from concurrent shrinker_info expand */ - down_read(&shrinker_rwsem); + down_write(&shrinker_rwsem); for_each_node(nid) { child_info = shrinker_info_protected(memcg, nid); parent_info = shrinker_info_protected(parent, nid); @@ -442,7 +442,7 @@ void reparent_shrinker_deferred(struct mem_cgroup *memcg) atomic_long_add(nr, &parent_info->nr_deferred[i]); } } - up_read(&shrinker_rwsem); + up_write(&shrinker_rwsem); } static bool cgroup_reclaim(struct scan_control *sc) From patchwork Thu Jun 22 08:53:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288800 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3AAC5C05052 for ; Thu, 22 Jun 2023 09:06:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231991AbjFVJGQ (ORCPT ); Thu, 22 Jun 2023 05:06:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48070 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232363AbjFVJFd (ORCPT ); Thu, 22 Jun 2023 05:05:33 -0400 Received: from mail-pl1-x633.google.com (mail-pl1-x633.google.com [IPv6:2607:f8b0:4864:20::633]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B9AD32707 for ; Thu, 22 Jun 2023 01:58:25 -0700 (PDT) Received: by mail-pl1-x633.google.com with SMTP id d9443c01a7336-1b693afe799so2467015ad.1 for ; Thu, 22 Jun 2023 01:58:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424268; x=1690016268; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=+P5gj8bno5Yz104PoCsSeC7OSXGzrjC+L9aN78tCOOI=; b=jvhIGmYE039/syWqcfLyIdo6t9Qplyt6OsSxUBVvxtsPAkw48Ft91VW4Wz2O26wmhx R0vxBU2Gya4gcRkfUKW9xk7qtBQ/FB/itjiP7zdM6kYiVPyrZbA9p3LfIx3f71WW1xO7 LLQ+0DljNwUv19wViW0KnJsUJ8+fuKrQN9IBdM3/tIdgoFSPmAiPoFKu75bet98dvkR/ 1RKnODQtDxONo/sndY6eLEsUvIJFr3WP9l67MNEVgdfQojcwBUWJp7hhTFie6SRhHC3h g1EMHwWUjxMnky7BCMGJqXeVS7+DBkpxiGCDfvbg7pDqaD88DLgjwvpVG1NnvXM5z1NX rt6Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424268; x=1690016268; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=+P5gj8bno5Yz104PoCsSeC7OSXGzrjC+L9aN78tCOOI=; b=JNwWcKWbk4apgNtFW5YaYzwgoLqFw1IDksJwP32gdGBp8aGcwq37EY9s6b7XU9GXCL hmfvsoeGPnajUb8wlwHAF53v2cMAZ7DsagRuyQxYFBdVJklvI2SqsEBYDQQQ6HaU3DIt RLAB/oT0rMVAa2MDArKKWisD/BLml1LjveZ/5d96ZCwt7A2WjSHXekskvbFFtWIKUlq2 Tkbj+gaHOJXAB4xK4iHXDf9lEV4FnWdhW875jTTTLdPS5B/3MqpiEm7EwAgI+EOCklQB Y9X9QSaeXJTFeZeFXJtM2CMgri6nmVdOpEzIZ/DC8uxzFb4ZYLoRuIyvXXnp7pwao/Gi HqQw== X-Gm-Message-State: AC+VfDwJ9Al2DHL8f3d2/pNi0QNeks93U3F+UN8CWIcOxQJKfW10z18P vBffq823JQ/KE2maLZvnVgfIUA== X-Google-Smtp-Source: ACHHUZ57rUw3vY36SztXPXN7lhzSsB9eXMClAkEXukaFLir2ldlvIHkHUZcbPwVIoKn5tOhLnnwBWg== X-Received: by 2002:a17:902:ecc6:b0:1ae:1364:6086 with SMTP id a6-20020a170902ecc600b001ae13646086mr21606033plh.2.1687424268179; Thu, 22 Jun 2023 01:57:48 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.57.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:57:47 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 28/29] mm: shrinkers: convert shrinker_rwsem to mutex Date: Thu, 22 Jun 2023 16:53:34 +0800 Message-Id: <20230622085335.77010-29-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Now there are no readers of shrinker_rwsem, so we can simply replace it with mutex lock. Signed-off-by: Qi Zheng --- drivers/md/dm-cache-metadata.c | 2 +- drivers/md/dm-thin-metadata.c | 2 +- fs/super.c | 2 +- mm/shrinker_debug.c | 14 +++++++------- mm/vmscan.c | 34 +++++++++++++++++----------------- 5 files changed, 27 insertions(+), 27 deletions(-) diff --git a/drivers/md/dm-cache-metadata.c b/drivers/md/dm-cache-metadata.c index acffed750e3e..9e0c69958587 100644 --- a/drivers/md/dm-cache-metadata.c +++ b/drivers/md/dm-cache-metadata.c @@ -1828,7 +1828,7 @@ int dm_cache_metadata_abort(struct dm_cache_metadata *cmd) * Replacement block manager (new_bm) is created and old_bm destroyed outside of * cmd root_lock to avoid ABBA deadlock that would result (due to life-cycle of * shrinker associated with the block manager's bufio client vs cmd root_lock). - * - must take shrinker_rwsem without holding cmd->root_lock + * - must take shrinker_mutex without holding cmd->root_lock */ new_bm = dm_block_manager_create(cmd->bdev, DM_CACHE_METADATA_BLOCK_SIZE << SECTOR_SHIFT, CACHE_MAX_CONCURRENT_LOCKS); diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c index fd464fb024c3..9f5cb52c5763 100644 --- a/drivers/md/dm-thin-metadata.c +++ b/drivers/md/dm-thin-metadata.c @@ -1887,7 +1887,7 @@ int dm_pool_abort_metadata(struct dm_pool_metadata *pmd) * Replacement block manager (new_bm) is created and old_bm destroyed outside of * pmd root_lock to avoid ABBA deadlock that would result (due to life-cycle of * shrinker associated with the block manager's bufio client vs pmd root_lock). - * - must take shrinker_rwsem without holding pmd->root_lock + * - must take shrinker_mutex without holding pmd->root_lock */ new_bm = dm_block_manager_create(pmd->bdev, THIN_METADATA_BLOCK_SIZE << SECTOR_SHIFT, THIN_MAX_CONCURRENT_LOCKS); diff --git a/fs/super.c b/fs/super.c index 791342bb8ac9..471800ff793a 100644 --- a/fs/super.c +++ b/fs/super.c @@ -54,7 +54,7 @@ static char *sb_writers_name[SB_FREEZE_LEVELS] = { * One thing we have to be careful of with a per-sb shrinker is that we don't * drop the last active reference to the superblock from within the shrinker. * If that happens we could trigger unregistering the shrinker from within the - * shrinker path and that leads to deadlock on the shrinker_rwsem. Hence we + * shrinker path and that leads to deadlock on the shrinker_mutex. Hence we * take a passive reference to the superblock to avoid this from occurring. */ static unsigned long super_cache_scan(struct shrinker *shrink, diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c index c18fa9b6b7f0..7ad903f84463 100644 --- a/mm/shrinker_debug.c +++ b/mm/shrinker_debug.c @@ -7,7 +7,7 @@ #include /* defined in vmscan.c */ -extern struct rw_semaphore shrinker_rwsem; +extern struct mutex shrinker_mutex; extern struct list_head shrinker_list; static DEFINE_IDA(shrinker_debugfs_ida); @@ -177,7 +177,7 @@ int shrinker_debugfs_add(struct shrinker *shrinker) char buf[128]; int id; - lockdep_assert_held(&shrinker_rwsem); + lockdep_assert_held(&shrinker_mutex); /* debugfs isn't initialized yet, add debugfs entries later. */ if (!shrinker_debugfs_root) @@ -220,7 +220,7 @@ int shrinker_debugfs_rename(struct shrinker *shrinker, const char *fmt, ...) if (!new) return -ENOMEM; - down_write(&shrinker_rwsem); + mutex_lock(&shrinker_mutex); old = shrinker->name; shrinker->name = new; @@ -238,7 +238,7 @@ int shrinker_debugfs_rename(struct shrinker *shrinker, const char *fmt, ...) shrinker->debugfs_entry = entry; } - up_write(&shrinker_rwsem); + mutex_unlock(&shrinker_mutex); kfree_const(old); @@ -251,7 +251,7 @@ struct dentry *shrinker_debugfs_detach(struct shrinker *shrinker, { struct dentry *entry = shrinker->debugfs_entry; - lockdep_assert_held(&shrinker_rwsem); + lockdep_assert_held(&shrinker_mutex); kfree_const(shrinker->name); shrinker->name = NULL; @@ -280,14 +280,14 @@ static int __init shrinker_debugfs_init(void) shrinker_debugfs_root = dentry; /* Create debugfs entries for shrinkers registered at boot */ - down_write(&shrinker_rwsem); + mutex_lock(&shrinker_mutex); list_for_each_entry(shrinker, &shrinker_list, list) if (!shrinker->debugfs_entry) { ret = shrinker_debugfs_add(shrinker); if (ret) break; } - up_write(&shrinker_rwsem); + mutex_unlock(&shrinker_mutex); return ret; } diff --git a/mm/vmscan.c b/mm/vmscan.c index 0711b63e68d9..bcdd97caa403 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -35,7 +35,7 @@ #include #include #include -#include +#include #include #include #include @@ -190,7 +190,7 @@ struct scan_control { int vm_swappiness = 60; LIST_HEAD(shrinker_list); -DECLARE_RWSEM(shrinker_rwsem); +DEFINE_MUTEX(shrinker_mutex); #ifdef CONFIG_MEMCG static int shrinker_nr_max; @@ -210,7 +210,7 @@ static struct shrinker_info *shrinker_info_protected(struct mem_cgroup *memcg, int nid) { return rcu_dereference_protected(memcg->nodeinfo[nid]->shrinker_info, - lockdep_is_held(&shrinker_rwsem)); + lockdep_is_held(&shrinker_mutex)); } static struct shrinker_info *shrinker_info_rcu(struct mem_cgroup *memcg, @@ -283,7 +283,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) int nid, size, ret = 0; int map_size, defer_size = 0; - down_write(&shrinker_rwsem); + mutex_lock(&shrinker_mutex); map_size = shrinker_map_size(shrinker_nr_max); defer_size = shrinker_defer_size(shrinker_nr_max); size = map_size + defer_size; @@ -299,7 +299,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) info->map_nr_max = shrinker_nr_max; rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); } - up_write(&shrinker_rwsem); + mutex_unlock(&shrinker_mutex); return ret; } @@ -315,7 +315,7 @@ static int expand_shrinker_info(int new_id) if (!root_mem_cgroup) goto out; - lockdep_assert_held(&shrinker_rwsem); + lockdep_assert_held(&shrinker_mutex); map_size = shrinker_map_size(new_nr_max); defer_size = shrinker_defer_size(new_nr_max); @@ -364,7 +364,7 @@ static int prealloc_memcg_shrinker(struct shrinker *shrinker) if (mem_cgroup_disabled()) return -ENOSYS; - down_write(&shrinker_rwsem); + mutex_lock(&shrinker_mutex); id = idr_alloc(&shrinker_idr, shrinker, 0, 0, GFP_KERNEL); if (id < 0) goto unlock; @@ -378,7 +378,7 @@ static int prealloc_memcg_shrinker(struct shrinker *shrinker) shrinker->id = id; ret = 0; unlock: - up_write(&shrinker_rwsem); + mutex_unlock(&shrinker_mutex); return ret; } @@ -388,7 +388,7 @@ static void unregister_memcg_shrinker(struct shrinker *shrinker) BUG_ON(id < 0); - lockdep_assert_held(&shrinker_rwsem); + lockdep_assert_held(&shrinker_mutex); idr_remove(&shrinker_idr, id); } @@ -433,7 +433,7 @@ void reparent_shrinker_deferred(struct mem_cgroup *memcg) parent = root_mem_cgroup; /* Prevent from concurrent shrinker_info expand */ - down_write(&shrinker_rwsem); + mutex_lock(&shrinker_mutex); for_each_node(nid) { child_info = shrinker_info_protected(memcg, nid); parent_info = shrinker_info_protected(parent, nid); @@ -442,7 +442,7 @@ void reparent_shrinker_deferred(struct mem_cgroup *memcg) atomic_long_add(nr, &parent_info->nr_deferred[i]); } } - up_write(&shrinker_rwsem); + mutex_unlock(&shrinker_mutex); } static bool cgroup_reclaim(struct scan_control *sc) @@ -743,9 +743,9 @@ void free_prealloced_shrinker(struct shrinker *shrinker) shrinker->name = NULL; #endif if (shrinker->flags & SHRINKER_MEMCG_AWARE) { - down_write(&shrinker_rwsem); + mutex_lock(&shrinker_mutex); unregister_memcg_shrinker(shrinker); - up_write(&shrinker_rwsem); + mutex_unlock(&shrinker_mutex); return; } @@ -755,13 +755,13 @@ void free_prealloced_shrinker(struct shrinker *shrinker) void register_shrinker_prepared(struct shrinker *shrinker) { - down_write(&shrinker_rwsem); + mutex_lock(&shrinker_mutex); refcount_set(&shrinker->refcount, 1); init_completion(&shrinker->completion_wait); list_add_tail_rcu(&shrinker->list, &shrinker_list); shrinker->flags |= SHRINKER_REGISTERED; shrinker_debugfs_add(shrinker); - up_write(&shrinker_rwsem); + mutex_unlock(&shrinker_mutex); } static int __register_shrinker(struct shrinker *shrinker) @@ -815,13 +815,13 @@ void unregister_shrinker(struct shrinker *shrinker) shrinker_put(shrinker); wait_for_completion(&shrinker->completion_wait); - down_write(&shrinker_rwsem); + mutex_lock(&shrinker_mutex); list_del_rcu(&shrinker->list); shrinker->flags &= ~SHRINKER_REGISTERED; if (shrinker->flags & SHRINKER_MEMCG_AWARE) unregister_memcg_shrinker(shrinker); debugfs_entry = shrinker_debugfs_detach(shrinker, &debugfs_id); - up_write(&shrinker_rwsem); + mutex_unlock(&shrinker_mutex); shrinker_debugfs_remove(debugfs_entry, debugfs_id); From patchwork Thu Jun 22 08:53:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13288803 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 16EE4EB64D8 for ; Thu, 22 Jun 2023 09:06:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232148AbjFVJGS (ORCPT ); Thu, 22 Jun 2023 05:06:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48338 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232218AbjFVJFR (ORCPT ); Thu, 22 Jun 2023 05:05:17 -0400 Received: from mail-pl1-x629.google.com (mail-pl1-x629.google.com [IPv6:2607:f8b0:4864:20::629]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A15CE4C20 for ; Thu, 22 Jun 2023 01:58:31 -0700 (PDT) Received: by mail-pl1-x629.google.com with SMTP id d9443c01a7336-1b693afe799so2467165ad.1 for ; Thu, 22 Jun 2023 01:58:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424276; x=1690016276; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=PAKGQW5AKOb0i6yhTIQHZdj2Ew7zYpKm64DNJQ5dWlo=; b=IAFXPAQdKRfFFdX8R+urqMUX8XywDjZlIZgvkiyvxKq3OuDxGDvyj53uf/GwQxOSgj UCrH4ulAqIMmQ7X056WmoGhyeCnVWVEc9MkUT54moxveVQUVm/n1V13P8OOuQyRngFVd lkgqQbjTLhW3PPA79SQnKo1nifBNALOotV0hDjlRRH90/2Gp8IdTAPqu+2ReapWlGMbq zlnEUk3uncMYp9LT0mLOQRJqCv6RJvoFO7mY6AD+tEO3DKYoCqm/BoBWKqYPC/hMbbbK /L8GxpwKn26Y52nmXelpvPk6gmsdpmYPKF8i9Fu/0MG+e/S0nlhty3rfnlUBtkssiN/B zjDQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424276; x=1690016276; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=PAKGQW5AKOb0i6yhTIQHZdj2Ew7zYpKm64DNJQ5dWlo=; b=JeSiXagjLGjh8ediBYqP4h9+TGykMjliKSVh19bn/fK2Dly0+kHvPW+K+ySxdmLfmF oDnXE7nynSdc6em/gQdiRjJO/fH6BhcKeai7foKgyH1y5lnGWJZFNfQrHIS2g3ZOoaSy Byi9EsWWTXWHuX4ukbdiw0wAJVEVvLS9JW/ITobLZNZqTiqQQkAXbao+fA0mcq8y/uVD ZxGhvOTae0SzUQeS8KuGved0k+V5iiu7+hoQC05MbYovVnUVtKKQ6aLh7NIkdGJnxxwP ruI53+Y08xwhJ8y3qSVPflyUuUFoELurB8gGYuVeKshk/wKV63q8BqMInhk4M+HuVOdI YliQ== X-Gm-Message-State: AC+VfDws4P7ngjFrwz0H0z0B0ZhcIGLRdWsfY9EPtX06jdzN+0PtuLfN 6xvca7KL2FV8DKThE1qE1VjP0A== X-Google-Smtp-Source: ACHHUZ7H5apRKh4VQCyo6UXxNAhEyp0xI76DSHJvkAU1Y/8mfRtLEhaUfJSBdNPbX02WDiX9mNvMOA== X-Received: by 2002:a17:902:ea01:b0:1a9:6467:aa8d with SMTP id s1-20020a170902ea0100b001a96467aa8dmr21676629plg.1.1687424276223; Thu, 22 Jun 2023 01:57:56 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.57.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:57:55 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng Subject: [PATCH 29/29] mm: shrinker: move shrinker-related code into a separate file Date: Thu, 22 Jun 2023 16:53:35 +0800 Message-Id: <20230622085335.77010-30-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org The mm/vmscan.c file is too large, so separate the shrinker-related code from it into a separate file. No functional changes. Signed-off-by: Qi Zheng --- include/linux/shrinker.h | 3 + mm/Makefile | 4 +- mm/shrinker.c | 750 +++++++++++++++++++++++++++++++++++++++ mm/vmscan.c | 746 -------------------------------------- 4 files changed, 755 insertions(+), 748 deletions(-) create mode 100644 mm/shrinker.c diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h index b0c6c2df9db8..b837b4c0864a 100644 --- a/include/linux/shrinker.h +++ b/include/linux/shrinker.h @@ -104,6 +104,9 @@ struct shrinker { */ #define SHRINKER_NONSLAB (1 << 3) +unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memcg, + int priority); + extern int __printf(2, 3) prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...); extern void register_shrinker_prepared(struct shrinker *shrinker); diff --git a/mm/Makefile b/mm/Makefile index 678530a07326..891899186608 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -48,8 +48,8 @@ endif obj-y := filemap.o mempool.o oom_kill.o fadvise.o \ maccess.o page-writeback.o folio-compat.o \ - readahead.o swap.o truncate.o vmscan.o shmem.o \ - util.o mmzone.o vmstat.o backing-dev.o \ + readahead.o swap.o truncate.o vmscan.o shrinker.o \ + shmem.o util.o mmzone.o vmstat.o backing-dev.o \ mm_init.o percpu.o slab_common.o \ compaction.o show_mem.o\ interval_tree.o list_lru.o workingset.o \ diff --git a/mm/shrinker.c b/mm/shrinker.c new file mode 100644 index 000000000000..f88966f3d6be --- /dev/null +++ b/mm/shrinker.c @@ -0,0 +1,750 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include +#include + +LIST_HEAD(shrinker_list); +DEFINE_MUTEX(shrinker_mutex); + +#ifdef CONFIG_MEMCG +static int shrinker_nr_max; + +/* The shrinker_info is expanded in a batch of BITS_PER_LONG */ +static inline int shrinker_map_size(int nr_items) +{ + return (DIV_ROUND_UP(nr_items, BITS_PER_LONG) * sizeof(unsigned long)); +} + +static inline int shrinker_defer_size(int nr_items) +{ + return (round_up(nr_items, BITS_PER_LONG) * sizeof(atomic_long_t)); +} + +static struct shrinker_info *shrinker_info_protected(struct mem_cgroup *memcg, + int nid) +{ + return rcu_dereference_protected(memcg->nodeinfo[nid]->shrinker_info, + lockdep_is_held(&shrinker_mutex)); +} + +static struct shrinker_info *shrinker_info_rcu(struct mem_cgroup *memcg, + int nid) +{ + return rcu_dereference(memcg->nodeinfo[nid]->shrinker_info); +} + +static int expand_one_shrinker_info(struct mem_cgroup *memcg, + int map_size, int defer_size, + int old_map_size, int old_defer_size, + int new_nr_max) +{ + struct shrinker_info *new, *old; + struct mem_cgroup_per_node *pn; + int nid; + int size = map_size + defer_size; + + for_each_node(nid) { + pn = memcg->nodeinfo[nid]; + old = shrinker_info_protected(memcg, nid); + /* Not yet online memcg */ + if (!old) + return 0; + + /* Already expanded this shrinker_info */ + if (new_nr_max <= old->map_nr_max) + continue; + + new = kvmalloc_node(sizeof(*new) + size, GFP_KERNEL, nid); + if (!new) + return -ENOMEM; + + new->nr_deferred = (atomic_long_t *)(new + 1); + new->map = (void *)new->nr_deferred + defer_size; + new->map_nr_max = new_nr_max; + + /* map: set all old bits, clear all new bits */ + memset(new->map, (int)0xff, old_map_size); + memset((void *)new->map + old_map_size, 0, map_size - old_map_size); + /* nr_deferred: copy old values, clear all new values */ + memcpy(new->nr_deferred, old->nr_deferred, old_defer_size); + memset((void *)new->nr_deferred + old_defer_size, 0, + defer_size - old_defer_size); + + rcu_assign_pointer(pn->shrinker_info, new); + kvfree_rcu(old, rcu); + } + + return 0; +} + +void free_shrinker_info(struct mem_cgroup *memcg) +{ + struct mem_cgroup_per_node *pn; + struct shrinker_info *info; + int nid; + + for_each_node(nid) { + pn = memcg->nodeinfo[nid]; + info = rcu_dereference_protected(pn->shrinker_info, true); + kvfree(info); + rcu_assign_pointer(pn->shrinker_info, NULL); + } +} + +int alloc_shrinker_info(struct mem_cgroup *memcg) +{ + struct shrinker_info *info; + int nid, size, ret = 0; + int map_size, defer_size = 0; + + mutex_lock(&shrinker_mutex); + map_size = shrinker_map_size(shrinker_nr_max); + defer_size = shrinker_defer_size(shrinker_nr_max); + size = map_size + defer_size; + for_each_node(nid) { + info = kvzalloc_node(sizeof(*info) + size, GFP_KERNEL, nid); + if (!info) { + free_shrinker_info(memcg); + ret = -ENOMEM; + break; + } + info->nr_deferred = (atomic_long_t *)(info + 1); + info->map = (void *)info->nr_deferred + defer_size; + info->map_nr_max = shrinker_nr_max; + rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); + } + mutex_unlock(&shrinker_mutex); + + return ret; +} + +static int expand_shrinker_info(int new_id) +{ + int ret = 0; + int new_nr_max = round_up(new_id + 1, BITS_PER_LONG); + int map_size, defer_size = 0; + int old_map_size, old_defer_size = 0; + struct mem_cgroup *memcg; + + if (!root_mem_cgroup) + goto out; + + lockdep_assert_held(&shrinker_mutex); + + map_size = shrinker_map_size(new_nr_max); + defer_size = shrinker_defer_size(new_nr_max); + old_map_size = shrinker_map_size(shrinker_nr_max); + old_defer_size = shrinker_defer_size(shrinker_nr_max); + + memcg = mem_cgroup_iter(NULL, NULL, NULL); + do { + ret = expand_one_shrinker_info(memcg, map_size, defer_size, + old_map_size, old_defer_size, + new_nr_max); + if (ret) { + mem_cgroup_iter_break(NULL, memcg); + goto out; + } + } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL); +out: + if (!ret) + shrinker_nr_max = new_nr_max; + + return ret; +} + +void set_shrinker_bit(struct mem_cgroup *memcg, int nid, int shrinker_id) +{ + if (shrinker_id >= 0 && memcg && !mem_cgroup_is_root(memcg)) { + struct shrinker_info *info; + + rcu_read_lock(); + info = shrinker_info_rcu(memcg, nid); + if (!WARN_ON_ONCE(shrinker_id >= info->map_nr_max)) { + /* Pairs with smp mb in shrink_slab() */ + smp_mb__before_atomic(); + set_bit(shrinker_id, info->map); + } + rcu_read_unlock(); + } +} + +static DEFINE_IDR(shrinker_idr); + +static int prealloc_memcg_shrinker(struct shrinker *shrinker) +{ + int id, ret = -ENOMEM; + + if (mem_cgroup_disabled()) + return -ENOSYS; + + mutex_lock(&shrinker_mutex); + id = idr_alloc(&shrinker_idr, shrinker, 0, 0, GFP_KERNEL); + if (id < 0) + goto unlock; + + if (id >= shrinker_nr_max) { + if (expand_shrinker_info(id)) { + idr_remove(&shrinker_idr, id); + goto unlock; + } + } + shrinker->id = id; + ret = 0; +unlock: + mutex_unlock(&shrinker_mutex); + return ret; +} + +static void unregister_memcg_shrinker(struct shrinker *shrinker) +{ + int id = shrinker->id; + + BUG_ON(id < 0); + + lockdep_assert_held(&shrinker_mutex); + + idr_remove(&shrinker_idr, id); +} + +static long xchg_nr_deferred_memcg(int nid, struct shrinker *shrinker, + struct mem_cgroup *memcg) +{ + struct shrinker_info *info; + long nr_deferred; + + rcu_read_lock(); + info = shrinker_info_rcu(memcg, nid); + nr_deferred = atomic_long_xchg(&info->nr_deferred[shrinker->id], 0); + rcu_read_unlock(); + + return nr_deferred; +} + +static long add_nr_deferred_memcg(long nr, int nid, struct shrinker *shrinker, + struct mem_cgroup *memcg) +{ + struct shrinker_info *info; + long nr_deferred; + + rcu_read_lock(); + info = shrinker_info_rcu(memcg, nid); + nr_deferred = atomic_long_add_return(nr, &info->nr_deferred[shrinker->id]); + rcu_read_unlock(); + + return nr_deferred; +} + +void reparent_shrinker_deferred(struct mem_cgroup *memcg) +{ + int i, nid; + long nr; + struct mem_cgroup *parent; + struct shrinker_info *child_info, *parent_info; + + parent = parent_mem_cgroup(memcg); + if (!parent) + parent = root_mem_cgroup; + + /* Prevent from concurrent shrinker_info expand */ + mutex_lock(&shrinker_mutex); + for_each_node(nid) { + child_info = shrinker_info_protected(memcg, nid); + parent_info = shrinker_info_protected(parent, nid); + for (i = 0; i < child_info->map_nr_max; i++) { + nr = atomic_long_read(&child_info->nr_deferred[i]); + atomic_long_add(nr, &parent_info->nr_deferred[i]); + } + } + mutex_unlock(&shrinker_mutex); +} +#else +static int prealloc_memcg_shrinker(struct shrinker *shrinker) +{ + return -ENOSYS; +} + +static void unregister_memcg_shrinker(struct shrinker *shrinker) +{ +} + +static long xchg_nr_deferred_memcg(int nid, struct shrinker *shrinker, + struct mem_cgroup *memcg) +{ + return 0; +} + +static long add_nr_deferred_memcg(long nr, int nid, struct shrinker *shrinker, + struct mem_cgroup *memcg) +{ + return 0; +} +#endif /* CONFIG_MEMCG */ + +static long xchg_nr_deferred(struct shrinker *shrinker, + struct shrink_control *sc) +{ + int nid = sc->nid; + + if (!(shrinker->flags & SHRINKER_NUMA_AWARE)) + nid = 0; + + if (sc->memcg && + (shrinker->flags & SHRINKER_MEMCG_AWARE)) + return xchg_nr_deferred_memcg(nid, shrinker, + sc->memcg); + + return atomic_long_xchg(&shrinker->nr_deferred[nid], 0); +} + +static long add_nr_deferred(long nr, struct shrinker *shrinker, + struct shrink_control *sc) +{ + int nid = sc->nid; + + if (!(shrinker->flags & SHRINKER_NUMA_AWARE)) + nid = 0; + + if (sc->memcg && + (shrinker->flags & SHRINKER_MEMCG_AWARE)) + return add_nr_deferred_memcg(nr, nid, shrinker, + sc->memcg); + + return atomic_long_add_return(nr, &shrinker->nr_deferred[nid]); +} + +#define SHRINK_BATCH 128 + +static unsigned long do_shrink_slab(struct shrink_control *shrinkctl, + struct shrinker *shrinker, int priority) +{ + unsigned long freed = 0; + unsigned long long delta; + long total_scan; + long freeable; + long nr; + long new_nr; + long batch_size = shrinker->batch ? shrinker->batch + : SHRINK_BATCH; + long scanned = 0, next_deferred; + + freeable = shrinker->count_objects(shrinker, shrinkctl); + if (freeable == 0 || freeable == SHRINK_EMPTY) + return freeable; + + /* + * copy the current shrinker scan count into a local variable + * and zero it so that other concurrent shrinker invocations + * don't also do this scanning work. + */ + nr = xchg_nr_deferred(shrinker, shrinkctl); + + if (shrinker->seeks) { + delta = freeable >> priority; + delta *= 4; + do_div(delta, shrinker->seeks); + } else { + /* + * These objects don't require any IO to create. Trim + * them aggressively under memory pressure to keep + * them from causing refetches in the IO caches. + */ + delta = freeable / 2; + } + + total_scan = nr >> priority; + total_scan += delta; + total_scan = min(total_scan, (2 * freeable)); + + trace_mm_shrink_slab_start(shrinker, shrinkctl, nr, + freeable, delta, total_scan, priority); + + /* + * Normally, we should not scan less than batch_size objects in one + * pass to avoid too frequent shrinker calls, but if the slab has less + * than batch_size objects in total and we are really tight on memory, + * we will try to reclaim all available objects, otherwise we can end + * up failing allocations although there are plenty of reclaimable + * objects spread over several slabs with usage less than the + * batch_size. + * + * We detect the "tight on memory" situations by looking at the total + * number of objects we want to scan (total_scan). If it is greater + * than the total number of objects on slab (freeable), we must be + * scanning at high prio and therefore should try to reclaim as much as + * possible. + */ + while (total_scan >= batch_size || + total_scan >= freeable) { + unsigned long ret; + unsigned long nr_to_scan = min(batch_size, total_scan); + + shrinkctl->nr_to_scan = nr_to_scan; + shrinkctl->nr_scanned = nr_to_scan; + ret = shrinker->scan_objects(shrinker, shrinkctl); + if (ret == SHRINK_STOP) + break; + freed += ret; + + count_vm_events(SLABS_SCANNED, shrinkctl->nr_scanned); + total_scan -= shrinkctl->nr_scanned; + scanned += shrinkctl->nr_scanned; + + cond_resched(); + } + + /* + * The deferred work is increased by any new work (delta) that wasn't + * done, decreased by old deferred work that was done now. + * + * And it is capped to two times of the freeable items. + */ + next_deferred = max_t(long, (nr + delta - scanned), 0); + next_deferred = min(next_deferred, (2 * freeable)); + + /* + * move the unused scan count back into the shrinker in a + * manner that handles concurrent updates. + */ + new_nr = add_nr_deferred(next_deferred, shrinker, shrinkctl); + + trace_mm_shrink_slab_end(shrinker, shrinkctl->nid, freed, nr, new_nr, total_scan); + return freed; +} + +#ifdef CONFIG_MEMCG +static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid, + struct mem_cgroup *memcg, int priority) +{ + struct shrinker_info *info; + unsigned long ret, freed = 0; + int i = 0; + + if (!mem_cgroup_online(memcg)) + return 0; + +again: + rcu_read_lock(); + info = shrinker_info_rcu(memcg, nid); + if (unlikely(!info)) + goto unlock; + + for_each_set_bit_from(i, info->map, info->map_nr_max) { + struct shrink_control sc = { + .gfp_mask = gfp_mask, + .nid = nid, + .memcg = memcg, + }; + struct shrinker *shrinker; + + shrinker = idr_find(&shrinker_idr, i); + if (unlikely(!shrinker || !(shrinker->flags & SHRINKER_REGISTERED))) { + if (!shrinker) + clear_bit(i, info->map); + continue; + } + + if (!shrinker_try_get(shrinker)) + continue; + rcu_read_unlock(); + + /* Call non-slab shrinkers even though kmem is disabled */ + if (!memcg_kmem_online() && + !(shrinker->flags & SHRINKER_NONSLAB)) + continue; + + ret = do_shrink_slab(&sc, shrinker, priority); + if (ret == SHRINK_EMPTY) { + clear_bit(i, info->map); + /* + * After the shrinker reported that it had no objects to + * free, but before we cleared the corresponding bit in + * the memcg shrinker map, a new object might have been + * added. To make sure, we have the bit set in this + * case, we invoke the shrinker one more time and reset + * the bit if it reports that it is not empty anymore. + * The memory barrier here pairs with the barrier in + * set_shrinker_bit(): + * + * list_lru_add() shrink_slab_memcg() + * list_add_tail() clear_bit() + * + * set_bit() do_shrink_slab() + */ + smp_mb__after_atomic(); + ret = do_shrink_slab(&sc, shrinker, priority); + if (ret == SHRINK_EMPTY) + ret = 0; + else + set_shrinker_bit(memcg, nid, i); + } + freed += ret; + + shrinker_put(shrinker); + + /* + * We have already exited the read-side of rcu critical section + * before calling do_shrink_slab(), the shrinker_info may be + * released in expand_one_shrinker_info(), so restart the + * iteration. + */ + i++; + goto again; + } +unlock: + rcu_read_unlock(); + return freed; +} +#else /* CONFIG_MEMCG */ +static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid, + struct mem_cgroup *memcg, int priority) +{ + return 0; +} +#endif /* CONFIG_MEMCG */ + +/** + * shrink_slab - shrink slab caches + * @gfp_mask: allocation context + * @nid: node whose slab caches to target + * @memcg: memory cgroup whose slab caches to target + * @priority: the reclaim priority + * + * Call the shrink functions to age shrinkable caches. + * + * @nid is passed along to shrinkers with SHRINKER_NUMA_AWARE set, + * unaware shrinkers will receive a node id of 0 instead. + * + * @memcg specifies the memory cgroup to target. Unaware shrinkers + * are called only if it is the root cgroup. + * + * @priority is sc->priority, we take the number of objects and >> by priority + * in order to get the scan target. + * + * Returns the number of reclaimed slab objects. + */ +unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memcg, + int priority) +{ + unsigned long ret, freed = 0; + struct shrinker *shrinker; + + /* + * The root memcg might be allocated even though memcg is disabled + * via "cgroup_disable=memory" boot parameter. This could make + * mem_cgroup_is_root() return false, then just run memcg slab + * shrink, but skip global shrink. This may result in premature + * oom. + */ + if (!mem_cgroup_disabled() && !mem_cgroup_is_root(memcg)) + return shrink_slab_memcg(gfp_mask, nid, memcg, priority); + + rcu_read_lock(); + list_for_each_entry_rcu(shrinker, &shrinker_list, list) { + struct shrink_control sc = { + .gfp_mask = gfp_mask, + .nid = nid, + .memcg = memcg, + }; + + if (!shrinker_try_get(shrinker)) + continue; + rcu_read_unlock(); + + ret = do_shrink_slab(&sc, shrinker, priority); + if (ret == SHRINK_EMPTY) + ret = 0; + freed += ret; + + rcu_read_lock(); + shrinker_put(shrinker); + } + rcu_read_unlock(); + cond_resched(); + return freed; +} + +/* + * Add a shrinker callback to be called from the vm. + */ +static int __prealloc_shrinker(struct shrinker *shrinker) +{ + unsigned int size; + int err; + + if (shrinker->flags & SHRINKER_MEMCG_AWARE) { + err = prealloc_memcg_shrinker(shrinker); + if (err != -ENOSYS) + return err; + + shrinker->flags &= ~SHRINKER_MEMCG_AWARE; + } + + size = sizeof(*shrinker->nr_deferred); + if (shrinker->flags & SHRINKER_NUMA_AWARE) + size *= nr_node_ids; + + shrinker->nr_deferred = kzalloc(size, GFP_KERNEL); + if (!shrinker->nr_deferred) + return -ENOMEM; + + return 0; +} + +#ifdef CONFIG_SHRINKER_DEBUG +int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...) +{ + va_list ap; + int err; + + va_start(ap, fmt); + shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap); + va_end(ap); + if (!shrinker->name) + return -ENOMEM; + + err = __prealloc_shrinker(shrinker); + if (err) { + kfree_const(shrinker->name); + shrinker->name = NULL; + } + + return err; +} +#else +int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...) +{ + return __prealloc_shrinker(shrinker); +} +#endif + +void free_prealloced_shrinker(struct shrinker *shrinker) +{ +#ifdef CONFIG_SHRINKER_DEBUG + kfree_const(shrinker->name); + shrinker->name = NULL; +#endif + if (shrinker->flags & SHRINKER_MEMCG_AWARE) { + mutex_lock(&shrinker_mutex); + unregister_memcg_shrinker(shrinker); + mutex_unlock(&shrinker_mutex); + return; + } + + kfree(shrinker->nr_deferred); + shrinker->nr_deferred = NULL; +} + +void register_shrinker_prepared(struct shrinker *shrinker) +{ + mutex_lock(&shrinker_mutex); + refcount_set(&shrinker->refcount, 1); + init_completion(&shrinker->completion_wait); + list_add_tail_rcu(&shrinker->list, &shrinker_list); + shrinker->flags |= SHRINKER_REGISTERED; + shrinker_debugfs_add(shrinker); + mutex_unlock(&shrinker_mutex); +} + +static int __register_shrinker(struct shrinker *shrinker) +{ + int err = __prealloc_shrinker(shrinker); + + if (err) + return err; + register_shrinker_prepared(shrinker); + return 0; +} + +#ifdef CONFIG_SHRINKER_DEBUG +int register_shrinker(struct shrinker *shrinker, const char *fmt, ...) +{ + va_list ap; + int err; + + va_start(ap, fmt); + shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap); + va_end(ap); + if (!shrinker->name) + return -ENOMEM; + + err = __register_shrinker(shrinker); + if (err) { + kfree_const(shrinker->name); + shrinker->name = NULL; + } + return err; +} +#else +int register_shrinker(struct shrinker *shrinker, const char *fmt, ...) +{ + return __register_shrinker(shrinker); +} +#endif +EXPORT_SYMBOL(register_shrinker); + +/* + * Remove one + */ +void unregister_shrinker(struct shrinker *shrinker) +{ + struct dentry *debugfs_entry; + int debugfs_id; + + if (!(shrinker->flags & SHRINKER_REGISTERED)) + return; + + shrinker_put(shrinker); + wait_for_completion(&shrinker->completion_wait); + + mutex_lock(&shrinker_mutex); + list_del_rcu(&shrinker->list); + shrinker->flags &= ~SHRINKER_REGISTERED; + if (shrinker->flags & SHRINKER_MEMCG_AWARE) + unregister_memcg_shrinker(shrinker); + debugfs_entry = shrinker_debugfs_detach(shrinker, &debugfs_id); + mutex_unlock(&shrinker_mutex); + + shrinker_debugfs_remove(debugfs_entry, debugfs_id); + + kfree(shrinker->nr_deferred); + shrinker->nr_deferred = NULL; +} +EXPORT_SYMBOL(unregister_shrinker); + +struct shrinker *shrinker_alloc_and_init(count_objects_cb count, + scan_objects_cb scan, long batch, + int seeks, unsigned flags, + void *priv_data) +{ + struct shrinker *shrinker; + + shrinker = kzalloc(sizeof(struct shrinker), GFP_KERNEL); + if (!shrinker) + return NULL; + + shrinker->count_objects = count; + shrinker->scan_objects = scan; + shrinker->batch = batch; + shrinker->seeks = seeks; + shrinker->flags = flags; + shrinker->private_data = priv_data; + + return shrinker; +} +EXPORT_SYMBOL(shrinker_alloc_and_init); + +void shrinker_free(struct shrinker *shrinker) +{ + kfree(shrinker); +} +EXPORT_SYMBOL(shrinker_free); + +void unregister_and_free_shrinker(struct shrinker *shrinker) +{ + unregister_shrinker(shrinker); + kfree_rcu(shrinker, rcu); +} +EXPORT_SYMBOL(unregister_and_free_shrinker); diff --git a/mm/vmscan.c b/mm/vmscan.c index bcdd97caa403..0816c6496483 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -35,7 +35,6 @@ #include #include #include -#include #include #include #include @@ -57,7 +56,6 @@ #include #include #include -#include #include #include @@ -189,262 +187,7 @@ struct scan_control { */ int vm_swappiness = 60; -LIST_HEAD(shrinker_list); -DEFINE_MUTEX(shrinker_mutex); - #ifdef CONFIG_MEMCG -static int shrinker_nr_max; - -/* The shrinker_info is expanded in a batch of BITS_PER_LONG */ -static inline int shrinker_map_size(int nr_items) -{ - return (DIV_ROUND_UP(nr_items, BITS_PER_LONG) * sizeof(unsigned long)); -} - -static inline int shrinker_defer_size(int nr_items) -{ - return (round_up(nr_items, BITS_PER_LONG) * sizeof(atomic_long_t)); -} - -static struct shrinker_info *shrinker_info_protected(struct mem_cgroup *memcg, - int nid) -{ - return rcu_dereference_protected(memcg->nodeinfo[nid]->shrinker_info, - lockdep_is_held(&shrinker_mutex)); -} - -static struct shrinker_info *shrinker_info_rcu(struct mem_cgroup *memcg, - int nid) -{ - return rcu_dereference(memcg->nodeinfo[nid]->shrinker_info); -} - -static int expand_one_shrinker_info(struct mem_cgroup *memcg, - int map_size, int defer_size, - int old_map_size, int old_defer_size, - int new_nr_max) -{ - struct shrinker_info *new, *old; - struct mem_cgroup_per_node *pn; - int nid; - int size = map_size + defer_size; - - for_each_node(nid) { - pn = memcg->nodeinfo[nid]; - old = shrinker_info_protected(memcg, nid); - /* Not yet online memcg */ - if (!old) - return 0; - - /* Already expanded this shrinker_info */ - if (new_nr_max <= old->map_nr_max) - continue; - - new = kvmalloc_node(sizeof(*new) + size, GFP_KERNEL, nid); - if (!new) - return -ENOMEM; - - new->nr_deferred = (atomic_long_t *)(new + 1); - new->map = (void *)new->nr_deferred + defer_size; - new->map_nr_max = new_nr_max; - - /* map: set all old bits, clear all new bits */ - memset(new->map, (int)0xff, old_map_size); - memset((void *)new->map + old_map_size, 0, map_size - old_map_size); - /* nr_deferred: copy old values, clear all new values */ - memcpy(new->nr_deferred, old->nr_deferred, old_defer_size); - memset((void *)new->nr_deferred + old_defer_size, 0, - defer_size - old_defer_size); - - rcu_assign_pointer(pn->shrinker_info, new); - kvfree_rcu(old, rcu); - } - - return 0; -} - -void free_shrinker_info(struct mem_cgroup *memcg) -{ - struct mem_cgroup_per_node *pn; - struct shrinker_info *info; - int nid; - - for_each_node(nid) { - pn = memcg->nodeinfo[nid]; - info = rcu_dereference_protected(pn->shrinker_info, true); - kvfree(info); - rcu_assign_pointer(pn->shrinker_info, NULL); - } -} - -int alloc_shrinker_info(struct mem_cgroup *memcg) -{ - struct shrinker_info *info; - int nid, size, ret = 0; - int map_size, defer_size = 0; - - mutex_lock(&shrinker_mutex); - map_size = shrinker_map_size(shrinker_nr_max); - defer_size = shrinker_defer_size(shrinker_nr_max); - size = map_size + defer_size; - for_each_node(nid) { - info = kvzalloc_node(sizeof(*info) + size, GFP_KERNEL, nid); - if (!info) { - free_shrinker_info(memcg); - ret = -ENOMEM; - break; - } - info->nr_deferred = (atomic_long_t *)(info + 1); - info->map = (void *)info->nr_deferred + defer_size; - info->map_nr_max = shrinker_nr_max; - rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); - } - mutex_unlock(&shrinker_mutex); - - return ret; -} - -static int expand_shrinker_info(int new_id) -{ - int ret = 0; - int new_nr_max = round_up(new_id + 1, BITS_PER_LONG); - int map_size, defer_size = 0; - int old_map_size, old_defer_size = 0; - struct mem_cgroup *memcg; - - if (!root_mem_cgroup) - goto out; - - lockdep_assert_held(&shrinker_mutex); - - map_size = shrinker_map_size(new_nr_max); - defer_size = shrinker_defer_size(new_nr_max); - old_map_size = shrinker_map_size(shrinker_nr_max); - old_defer_size = shrinker_defer_size(shrinker_nr_max); - - memcg = mem_cgroup_iter(NULL, NULL, NULL); - do { - ret = expand_one_shrinker_info(memcg, map_size, defer_size, - old_map_size, old_defer_size, - new_nr_max); - if (ret) { - mem_cgroup_iter_break(NULL, memcg); - goto out; - } - } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL); -out: - if (!ret) - shrinker_nr_max = new_nr_max; - - return ret; -} - -void set_shrinker_bit(struct mem_cgroup *memcg, int nid, int shrinker_id) -{ - if (shrinker_id >= 0 && memcg && !mem_cgroup_is_root(memcg)) { - struct shrinker_info *info; - - rcu_read_lock(); - info = shrinker_info_rcu(memcg, nid); - if (!WARN_ON_ONCE(shrinker_id >= info->map_nr_max)) { - /* Pairs with smp mb in shrink_slab() */ - smp_mb__before_atomic(); - set_bit(shrinker_id, info->map); - } - rcu_read_unlock(); - } -} - -static DEFINE_IDR(shrinker_idr); - -static int prealloc_memcg_shrinker(struct shrinker *shrinker) -{ - int id, ret = -ENOMEM; - - if (mem_cgroup_disabled()) - return -ENOSYS; - - mutex_lock(&shrinker_mutex); - id = idr_alloc(&shrinker_idr, shrinker, 0, 0, GFP_KERNEL); - if (id < 0) - goto unlock; - - if (id >= shrinker_nr_max) { - if (expand_shrinker_info(id)) { - idr_remove(&shrinker_idr, id); - goto unlock; - } - } - shrinker->id = id; - ret = 0; -unlock: - mutex_unlock(&shrinker_mutex); - return ret; -} - -static void unregister_memcg_shrinker(struct shrinker *shrinker) -{ - int id = shrinker->id; - - BUG_ON(id < 0); - - lockdep_assert_held(&shrinker_mutex); - - idr_remove(&shrinker_idr, id); -} - -static long xchg_nr_deferred_memcg(int nid, struct shrinker *shrinker, - struct mem_cgroup *memcg) -{ - struct shrinker_info *info; - long nr_deferred; - - rcu_read_lock(); - info = shrinker_info_rcu(memcg, nid); - nr_deferred = atomic_long_xchg(&info->nr_deferred[shrinker->id], 0); - rcu_read_unlock(); - - return nr_deferred; -} - -static long add_nr_deferred_memcg(long nr, int nid, struct shrinker *shrinker, - struct mem_cgroup *memcg) -{ - struct shrinker_info *info; - long nr_deferred; - - rcu_read_lock(); - info = shrinker_info_rcu(memcg, nid); - nr_deferred = atomic_long_add_return(nr, &info->nr_deferred[shrinker->id]); - rcu_read_unlock(); - - return nr_deferred; -} - -void reparent_shrinker_deferred(struct mem_cgroup *memcg) -{ - int i, nid; - long nr; - struct mem_cgroup *parent; - struct shrinker_info *child_info, *parent_info; - - parent = parent_mem_cgroup(memcg); - if (!parent) - parent = root_mem_cgroup; - - /* Prevent from concurrent shrinker_info expand */ - mutex_lock(&shrinker_mutex); - for_each_node(nid) { - child_info = shrinker_info_protected(memcg, nid); - parent_info = shrinker_info_protected(parent, nid); - for (i = 0; i < child_info->map_nr_max; i++) { - nr = atomic_long_read(&child_info->nr_deferred[i]); - atomic_long_add(nr, &parent_info->nr_deferred[i]); - } - } - mutex_unlock(&shrinker_mutex); -} - static bool cgroup_reclaim(struct scan_control *sc) { return sc->target_mem_cgroup; @@ -479,27 +222,6 @@ static bool writeback_throttling_sane(struct scan_control *sc) return false; } #else -static int prealloc_memcg_shrinker(struct shrinker *shrinker) -{ - return -ENOSYS; -} - -static void unregister_memcg_shrinker(struct shrinker *shrinker) -{ -} - -static long xchg_nr_deferred_memcg(int nid, struct shrinker *shrinker, - struct mem_cgroup *memcg) -{ - return 0; -} - -static long add_nr_deferred_memcg(long nr, int nid, struct shrinker *shrinker, - struct mem_cgroup *memcg) -{ - return 0; -} - static bool cgroup_reclaim(struct scan_control *sc) { return false; @@ -568,39 +290,6 @@ static void flush_reclaim_state(struct scan_control *sc) } } -static long xchg_nr_deferred(struct shrinker *shrinker, - struct shrink_control *sc) -{ - int nid = sc->nid; - - if (!(shrinker->flags & SHRINKER_NUMA_AWARE)) - nid = 0; - - if (sc->memcg && - (shrinker->flags & SHRINKER_MEMCG_AWARE)) - return xchg_nr_deferred_memcg(nid, shrinker, - sc->memcg); - - return atomic_long_xchg(&shrinker->nr_deferred[nid], 0); -} - - -static long add_nr_deferred(long nr, struct shrinker *shrinker, - struct shrink_control *sc) -{ - int nid = sc->nid; - - if (!(shrinker->flags & SHRINKER_NUMA_AWARE)) - nid = 0; - - if (sc->memcg && - (shrinker->flags & SHRINKER_MEMCG_AWARE)) - return add_nr_deferred_memcg(nr, nid, shrinker, - sc->memcg); - - return atomic_long_add_return(nr, &shrinker->nr_deferred[nid]); -} - static bool can_demote(int nid, struct scan_control *sc) { if (!numa_demotion_enabled) @@ -682,441 +371,6 @@ static unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru, return size; } -/* - * Add a shrinker callback to be called from the vm. - */ -static int __prealloc_shrinker(struct shrinker *shrinker) -{ - unsigned int size; - int err; - - if (shrinker->flags & SHRINKER_MEMCG_AWARE) { - err = prealloc_memcg_shrinker(shrinker); - if (err != -ENOSYS) - return err; - - shrinker->flags &= ~SHRINKER_MEMCG_AWARE; - } - - size = sizeof(*shrinker->nr_deferred); - if (shrinker->flags & SHRINKER_NUMA_AWARE) - size *= nr_node_ids; - - shrinker->nr_deferred = kzalloc(size, GFP_KERNEL); - if (!shrinker->nr_deferred) - return -ENOMEM; - - return 0; -} - -#ifdef CONFIG_SHRINKER_DEBUG -int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...) -{ - va_list ap; - int err; - - va_start(ap, fmt); - shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap); - va_end(ap); - if (!shrinker->name) - return -ENOMEM; - - err = __prealloc_shrinker(shrinker); - if (err) { - kfree_const(shrinker->name); - shrinker->name = NULL; - } - - return err; -} -#else -int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...) -{ - return __prealloc_shrinker(shrinker); -} -#endif - -void free_prealloced_shrinker(struct shrinker *shrinker) -{ -#ifdef CONFIG_SHRINKER_DEBUG - kfree_const(shrinker->name); - shrinker->name = NULL; -#endif - if (shrinker->flags & SHRINKER_MEMCG_AWARE) { - mutex_lock(&shrinker_mutex); - unregister_memcg_shrinker(shrinker); - mutex_unlock(&shrinker_mutex); - return; - } - - kfree(shrinker->nr_deferred); - shrinker->nr_deferred = NULL; -} - -void register_shrinker_prepared(struct shrinker *shrinker) -{ - mutex_lock(&shrinker_mutex); - refcount_set(&shrinker->refcount, 1); - init_completion(&shrinker->completion_wait); - list_add_tail_rcu(&shrinker->list, &shrinker_list); - shrinker->flags |= SHRINKER_REGISTERED; - shrinker_debugfs_add(shrinker); - mutex_unlock(&shrinker_mutex); -} - -static int __register_shrinker(struct shrinker *shrinker) -{ - int err = __prealloc_shrinker(shrinker); - - if (err) - return err; - register_shrinker_prepared(shrinker); - return 0; -} - -#ifdef CONFIG_SHRINKER_DEBUG -int register_shrinker(struct shrinker *shrinker, const char *fmt, ...) -{ - va_list ap; - int err; - - va_start(ap, fmt); - shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap); - va_end(ap); - if (!shrinker->name) - return -ENOMEM; - - err = __register_shrinker(shrinker); - if (err) { - kfree_const(shrinker->name); - shrinker->name = NULL; - } - return err; -} -#else -int register_shrinker(struct shrinker *shrinker, const char *fmt, ...) -{ - return __register_shrinker(shrinker); -} -#endif -EXPORT_SYMBOL(register_shrinker); - -/* - * Remove one - */ -void unregister_shrinker(struct shrinker *shrinker) -{ - struct dentry *debugfs_entry; - int debugfs_id; - - if (!(shrinker->flags & SHRINKER_REGISTERED)) - return; - - shrinker_put(shrinker); - wait_for_completion(&shrinker->completion_wait); - - mutex_lock(&shrinker_mutex); - list_del_rcu(&shrinker->list); - shrinker->flags &= ~SHRINKER_REGISTERED; - if (shrinker->flags & SHRINKER_MEMCG_AWARE) - unregister_memcg_shrinker(shrinker); - debugfs_entry = shrinker_debugfs_detach(shrinker, &debugfs_id); - mutex_unlock(&shrinker_mutex); - - shrinker_debugfs_remove(debugfs_entry, debugfs_id); - - kfree(shrinker->nr_deferred); - shrinker->nr_deferred = NULL; -} -EXPORT_SYMBOL(unregister_shrinker); - -struct shrinker *shrinker_alloc_and_init(count_objects_cb count, - scan_objects_cb scan, long batch, - int seeks, unsigned flags, - void *priv_data) -{ - struct shrinker *shrinker; - - shrinker = kzalloc(sizeof(struct shrinker), GFP_KERNEL); - if (!shrinker) - return NULL; - - shrinker->count_objects = count; - shrinker->scan_objects = scan; - shrinker->batch = batch; - shrinker->seeks = seeks; - shrinker->flags = flags; - shrinker->private_data = priv_data; - - return shrinker; -} -EXPORT_SYMBOL(shrinker_alloc_and_init); - -void shrinker_free(struct shrinker *shrinker) -{ - kfree(shrinker); -} -EXPORT_SYMBOL(shrinker_free); - -void unregister_and_free_shrinker(struct shrinker *shrinker) -{ - unregister_shrinker(shrinker); - kfree_rcu(shrinker, rcu); -} -EXPORT_SYMBOL(unregister_and_free_shrinker); - -#define SHRINK_BATCH 128 - -static unsigned long do_shrink_slab(struct shrink_control *shrinkctl, - struct shrinker *shrinker, int priority) -{ - unsigned long freed = 0; - unsigned long long delta; - long total_scan; - long freeable; - long nr; - long new_nr; - long batch_size = shrinker->batch ? shrinker->batch - : SHRINK_BATCH; - long scanned = 0, next_deferred; - - freeable = shrinker->count_objects(shrinker, shrinkctl); - if (freeable == 0 || freeable == SHRINK_EMPTY) - return freeable; - - /* - * copy the current shrinker scan count into a local variable - * and zero it so that other concurrent shrinker invocations - * don't also do this scanning work. - */ - nr = xchg_nr_deferred(shrinker, shrinkctl); - - if (shrinker->seeks) { - delta = freeable >> priority; - delta *= 4; - do_div(delta, shrinker->seeks); - } else { - /* - * These objects don't require any IO to create. Trim - * them aggressively under memory pressure to keep - * them from causing refetches in the IO caches. - */ - delta = freeable / 2; - } - - total_scan = nr >> priority; - total_scan += delta; - total_scan = min(total_scan, (2 * freeable)); - - trace_mm_shrink_slab_start(shrinker, shrinkctl, nr, - freeable, delta, total_scan, priority); - - /* - * Normally, we should not scan less than batch_size objects in one - * pass to avoid too frequent shrinker calls, but if the slab has less - * than batch_size objects in total and we are really tight on memory, - * we will try to reclaim all available objects, otherwise we can end - * up failing allocations although there are plenty of reclaimable - * objects spread over several slabs with usage less than the - * batch_size. - * - * We detect the "tight on memory" situations by looking at the total - * number of objects we want to scan (total_scan). If it is greater - * than the total number of objects on slab (freeable), we must be - * scanning at high prio and therefore should try to reclaim as much as - * possible. - */ - while (total_scan >= batch_size || - total_scan >= freeable) { - unsigned long ret; - unsigned long nr_to_scan = min(batch_size, total_scan); - - shrinkctl->nr_to_scan = nr_to_scan; - shrinkctl->nr_scanned = nr_to_scan; - ret = shrinker->scan_objects(shrinker, shrinkctl); - if (ret == SHRINK_STOP) - break; - freed += ret; - - count_vm_events(SLABS_SCANNED, shrinkctl->nr_scanned); - total_scan -= shrinkctl->nr_scanned; - scanned += shrinkctl->nr_scanned; - - cond_resched(); - } - - /* - * The deferred work is increased by any new work (delta) that wasn't - * done, decreased by old deferred work that was done now. - * - * And it is capped to two times of the freeable items. - */ - next_deferred = max_t(long, (nr + delta - scanned), 0); - next_deferred = min(next_deferred, (2 * freeable)); - - /* - * move the unused scan count back into the shrinker in a - * manner that handles concurrent updates. - */ - new_nr = add_nr_deferred(next_deferred, shrinker, shrinkctl); - - trace_mm_shrink_slab_end(shrinker, shrinkctl->nid, freed, nr, new_nr, total_scan); - return freed; -} - -#ifdef CONFIG_MEMCG -static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid, - struct mem_cgroup *memcg, int priority) -{ - struct shrinker_info *info; - unsigned long ret, freed = 0; - int i = 0; - - if (!mem_cgroup_online(memcg)) - return 0; - -again: - rcu_read_lock(); - info = shrinker_info_rcu(memcg, nid); - if (unlikely(!info)) - goto unlock; - - for_each_set_bit_from(i, info->map, info->map_nr_max) { - struct shrink_control sc = { - .gfp_mask = gfp_mask, - .nid = nid, - .memcg = memcg, - }; - struct shrinker *shrinker; - - shrinker = idr_find(&shrinker_idr, i); - if (unlikely(!shrinker || !(shrinker->flags & SHRINKER_REGISTERED))) { - if (!shrinker) - clear_bit(i, info->map); - continue; - } - - if (!shrinker_try_get(shrinker)) - continue; - rcu_read_unlock(); - - /* Call non-slab shrinkers even though kmem is disabled */ - if (!memcg_kmem_online() && - !(shrinker->flags & SHRINKER_NONSLAB)) - continue; - - ret = do_shrink_slab(&sc, shrinker, priority); - if (ret == SHRINK_EMPTY) { - clear_bit(i, info->map); - /* - * After the shrinker reported that it had no objects to - * free, but before we cleared the corresponding bit in - * the memcg shrinker map, a new object might have been - * added. To make sure, we have the bit set in this - * case, we invoke the shrinker one more time and reset - * the bit if it reports that it is not empty anymore. - * The memory barrier here pairs with the barrier in - * set_shrinker_bit(): - * - * list_lru_add() shrink_slab_memcg() - * list_add_tail() clear_bit() - * - * set_bit() do_shrink_slab() - */ - smp_mb__after_atomic(); - ret = do_shrink_slab(&sc, shrinker, priority); - if (ret == SHRINK_EMPTY) - ret = 0; - else - set_shrinker_bit(memcg, nid, i); - } - freed += ret; - - shrinker_put(shrinker); - - /* - * We have already exited the read-side of rcu critical section - * before calling do_shrink_slab(), the shrinker_info may be - * released in expand_one_shrinker_info(), so restart the - * iteration. - */ - i++; - goto again; - } -unlock: - rcu_read_unlock(); - return freed; -} -#else /* CONFIG_MEMCG */ -static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid, - struct mem_cgroup *memcg, int priority) -{ - return 0; -} -#endif /* CONFIG_MEMCG */ - -/** - * shrink_slab - shrink slab caches - * @gfp_mask: allocation context - * @nid: node whose slab caches to target - * @memcg: memory cgroup whose slab caches to target - * @priority: the reclaim priority - * - * Call the shrink functions to age shrinkable caches. - * - * @nid is passed along to shrinkers with SHRINKER_NUMA_AWARE set, - * unaware shrinkers will receive a node id of 0 instead. - * - * @memcg specifies the memory cgroup to target. Unaware shrinkers - * are called only if it is the root cgroup. - * - * @priority is sc->priority, we take the number of objects and >> by priority - * in order to get the scan target. - * - * Returns the number of reclaimed slab objects. - */ -static unsigned long shrink_slab(gfp_t gfp_mask, int nid, - struct mem_cgroup *memcg, - int priority) -{ - unsigned long ret, freed = 0; - struct shrinker *shrinker; - - /* - * The root memcg might be allocated even though memcg is disabled - * via "cgroup_disable=memory" boot parameter. This could make - * mem_cgroup_is_root() return false, then just run memcg slab - * shrink, but skip global shrink. This may result in premature - * oom. - */ - if (!mem_cgroup_disabled() && !mem_cgroup_is_root(memcg)) - return shrink_slab_memcg(gfp_mask, nid, memcg, priority); - - rcu_read_lock(); - list_for_each_entry_rcu(shrinker, &shrinker_list, list) { - struct shrink_control sc = { - .gfp_mask = gfp_mask, - .nid = nid, - .memcg = memcg, - }; - - if (!shrinker_try_get(shrinker)) - continue; - rcu_read_unlock(); - - ret = do_shrink_slab(&sc, shrinker, priority); - if (ret == SHRINK_EMPTY) - ret = 0; - freed += ret; - - rcu_read_lock(); - shrinker_put(shrinker); - } - rcu_read_unlock(); - cond_resched(); - return freed; -} - static unsigned long drop_slab_node(int nid) { unsigned long freed = 0;