From patchwork Sat Jul 27 23:06:29 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Takero Funaki
X-Patchwork-Id: 13743806
From: Takero Funaki
To: Johannes Weiner, Yosry Ahmed, Nhat Pham, Chengming Zhou, Andrew Morton
Cc: Takero Funaki, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 1/2] mm: zswap: fix global shrinker memcg iteration
Date: Sat, 27 Jul 2024 23:06:29 +0000
Message-ID: <20240727230635.3170-2-flintglass@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240727230635.3170-1-flintglass@gmail.com>
References: <20240727230635.3170-1-flintglass@gmail.com>

This patch fixes an issue where the zswap global shrinker stopped
iterating through the memcg tree.

The problem was that shrink_worker() treated an offline memcg as a
failure and restarted the memcg tree walk from the tree root. As a
result, it aborted shrinking after encountering the same offline memcg
16 times (MAX_RECLAIM_RETRIES), even if there was only one offline
memcg in the tree.

After this change, an offline memcg in the tree is no longer considered
a failure. The shrinker continues shrinking the other online memcgs
regardless of whether an offline memcg exists, which results in higher
zswap writeback activity.

To avoid holding a reference to an offline memcg encountered during the
memcg tree walk, shrink_worker() must keep iterating past it. This
releases the offline memcg and ensures that the next memcg stored in
the cursor is online.

The offline memcg cleaner has also been changed to avoid the same
issue. Previously, when the memcg following the offlined memcg was also
offline, the reference stored in the iteration cursor was held until
the next shrink_worker() run. The cleaner must therefore keep advancing
the cursor until it reaches an online memcg or the end of the tree.
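
For illustration only (not part of this patch): a user-space toy model
of the cursor walk, assuming a flat array in place of the memcg tree.
toy_memcg, toy_iter(), walk_old() and walk_new() are hypothetical names,
and MAX_RETRIES stands in for MAX_RECLAIM_RETRIES with the value 16
quoted above. The model ignores reference counting, locking and the
accept threshold, so each walker simply runs until it has counted 16
failures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_MEMCG    4
#define MAX_RETRIES 16	/* assumption: stands in for MAX_RECLAIM_RETRIES */

/* Hypothetical stand-in for struct mem_cgroup. */
struct toy_memcg {
	bool online;
	int visits;	/* how often the walker tried to shrink this memcg */
};

/* Toy mem_cgroup_iter(): NULL -> first entry, last entry -> NULL. */
static struct toy_memcg *toy_iter(struct toy_memcg *tree,
				  struct toy_memcg *prev)
{
	assert(tree != NULL);
	if (!prev)
		return &tree[0];
	if (prev == &tree[NR_MEMCG - 1])
		return NULL;
	return prev + 1;
}

/* Old behaviour: an offline memcg is a failure and resets the cursor. */
static int walk_old(struct toy_memcg *tree)
{
	struct toy_memcg *cursor = NULL;
	int failures = 0, shrunk = 0;

	while (failures < MAX_RETRIES) {
		cursor = toy_iter(tree, cursor);
		if (!cursor) {			/* end of the tree walk */
			failures++;
			continue;
		}
		if (!cursor->online) {		/* restart from the root */
			cursor = NULL;
			failures++;
			continue;
		}
		cursor->visits++;
		shrunk++;
	}
	return shrunk;
}

/* New behaviour: skip offline memcgs; only the end of a walk fails. */
static int walk_new(struct toy_memcg *tree)
{
	struct toy_memcg *cursor = NULL;
	int failures = 0, shrunk = 0;

	while (failures < MAX_RETRIES) {
		do {
			cursor = toy_iter(tree, cursor);
		} while (cursor && !cursor->online);

		if (!cursor) {			/* end of the tree walk */
			failures++;
			continue;
		}
		cursor->visits++;
		shrunk++;
	}
	return shrunk;
}
```

With the entries {online, offline, online, online}, walk_old() only
ever reaches the first memcg because the offline one resets the cursor,
while walk_new() reaches all three online memcgs on every pass.
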
Fixes: a65b0e7607cc ("zswap: make shrinking memcg-aware")
Signed-off-by: Takero Funaki
Acked-by: Yosry Ahmed
Reviewed-by: Nhat Pham
---
 mm/zswap.c | 73 ++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 49 insertions(+), 24 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index adeaf9c97fde..e9b5343256cd 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -765,12 +765,31 @@ void zswap_folio_swapin(struct folio *folio)
 	}
 }
 
+/*
+ * This function should be called when a memcg is being offlined.
+ *
+ * Since the global shrinker shrink_worker() may hold a reference
+ * of the memcg, we must check and release the reference in
+ * zswap_next_shrink.
+ *
+ * shrink_worker() must handle the case where this function releases
+ * the reference of memcg being shrunk.
+ */
 void zswap_memcg_offline_cleanup(struct mem_cgroup *memcg)
 {
 	/* lock out zswap shrinker walking memcg tree */
 	spin_lock(&zswap_shrink_lock);
-	if (zswap_next_shrink == memcg)
-		zswap_next_shrink = mem_cgroup_iter(NULL, zswap_next_shrink, NULL);
+	if (zswap_next_shrink == memcg) {
+		do {
+			zswap_next_shrink = mem_cgroup_iter(NULL, zswap_next_shrink, NULL);
+		} while (zswap_next_shrink && !mem_cgroup_online(zswap_next_shrink));
+		/*
+		 * We verified the next memcg is online.  Even if the next
+		 * memcg is being offlined here, another cleaner must be
+		 * waiting for our lock.  We can leave the online memcg
+		 * reference.
+		 */
+	}
 	spin_unlock(&zswap_shrink_lock);
 }
 
@@ -1304,43 +1323,49 @@ static void shrink_worker(struct work_struct *w)
 	/* Reclaim down to the accept threshold */
 	thr = zswap_accept_thr_pages();
 
-	/* global reclaim will select cgroup in a round-robin fashion. */
+	/* global reclaim will select cgroup in a round-robin fashion.
+	 *
+	 * We save iteration cursor memcg into zswap_next_shrink,
+	 * which can be modified by the offline memcg cleaner
+	 * zswap_memcg_offline_cleanup().
+	 *
+	 * Since the offline cleaner is called only once, we cannot leave an
+	 * offline memcg reference in zswap_next_shrink.
+	 * We can rely on the cleaner only if we get online memcg under lock.
+	 *
+	 * If we get an offline memcg, we cannot determine if the cleaner has
+	 * already been called or will be called later. We must put back the
+	 * reference before returning from this function. Otherwise, the
+	 * offline memcg left in zswap_next_shrink will hold the reference
+	 * until the next run of shrink_worker().
+	 */
 	do {
 		spin_lock(&zswap_shrink_lock);
-		zswap_next_shrink = mem_cgroup_iter(NULL, zswap_next_shrink, NULL);
-		memcg = zswap_next_shrink;
 
 		/*
-		 * We need to retry if we have gone through a full round trip, or if we
-		 * got an offline memcg (or else we risk undoing the effect of the
-		 * zswap memcg offlining cleanup callback). This is not catastrophic
-		 * per se, but it will keep the now offlined memcg hostage for a while.
-		 *
+		 * Start shrinking from the next memcg after zswap_next_shrink.
+		 * When the offline cleaner has already advanced the cursor,
+		 * advancing the cursor here overlooks one memcg, but this
+		 * should be negligibly rare.
+		 */
+		do {
+			memcg = mem_cgroup_iter(NULL, zswap_next_shrink, NULL);
+			zswap_next_shrink = memcg;
+		} while (memcg && !mem_cgroup_tryget_online(memcg));
+		/*
 		 * Note that if we got an online memcg, we will keep the extra
 		 * reference in case the original reference obtained by mem_cgroup_iter
 		 * is dropped by the zswap memcg offlining callback, ensuring that the
 		 * memcg is not killed when we are reclaiming.
 		 */
-		if (!memcg) {
-			spin_unlock(&zswap_shrink_lock);
-			if (++failures == MAX_RECLAIM_RETRIES)
-				break;
-
-			goto resched;
-		}
-
-		if (!mem_cgroup_tryget_online(memcg)) {
-			/* drop the reference from mem_cgroup_iter() */
-			mem_cgroup_iter_break(NULL, memcg);
-			zswap_next_shrink = NULL;
-			spin_unlock(&zswap_shrink_lock);
+		spin_unlock(&zswap_shrink_lock);
 
+		if (!memcg) {
 			if (++failures == MAX_RECLAIM_RETRIES)
 				break;
 
 			goto resched;
 		}
-		spin_unlock(&zswap_shrink_lock);
 
 		ret = shrink_memcg(memcg);
 		/* drop the extra reference */

From patchwork Sat Jul 27 23:06:30 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Takero Funaki
X-Patchwork-Id: 13743807
From: Takero Funaki
To: Johannes Weiner, Yosry Ahmed, Nhat Pham, Chengming Zhou, Andrew Morton
Cc: Takero Funaki, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 2/2] mm: zswap: fix global shrinker error handling logic
Date: Sat, 27 Jul 2024 23:06:30 +0000
Message-ID: <20240727230635.3170-3-flintglass@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240727230635.3170-1-flintglass@gmail.com>
References: <20240727230635.3170-1-flintglass@gmail.com>

This patch fixes the zswap global shrinker, which did not shrink the
zpool as expected.

The issue is that shrink_worker() did not distinguish between
unexpected errors and expected errors, such as a failed writeback from
an empty memcg. The shrinker would stop shrinking after iterating
through the memcg tree 16 times, even if there was only one empty
memcg.

With this patch, the shrinker no longer treats any of the following as
a failure, as long as there are memcgs that are candidates for
writeback:
 - encountering an empty memcg,
 - encountering a memcg with writeback disabled,
 - reaching the end of a memcg tree walk.

Systems with one or more empty memcgs will now observe significantly
higher zswap writeback activity after the zswap pool limit is hit.

To avoid an infinite loop when there are no writeback candidates, this
patch tracks writeback attempts during memcg tree walks and limits
retries if no writeback candidates are found.

To handle the empty memcg case, the helper function shrink_memcg() is
modified to check whether the memcg is empty and, if so, return
-ENOENT. The writeback-disabled case now returns -ENOENT as well,
instead of -EINVAL.
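
For illustration only (not part of this patch): a user-space toy model
of the error handling described above, assuming an array of memcgs that
each hold a page count. toy_memcg, toy_total(), toy_shrink_memcg() and
toy_shrink_worker() are hypothetical names, and MAX_RETRIES stands in
for MAX_RECLAIM_RETRIES with the value 16 quoted above. Every
successful shrink writes back exactly one page, and the -EAGAIN case is
not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MAX_RETRIES 16	/* assumption: stands in for MAX_RECLAIM_RETRIES */

/* Hypothetical stand-in for a memcg and its zswap LRU. */
struct toy_memcg {
	bool writeback;	/* models memory.zswap.writeback */
	int pages;	/* pages this memcg currently has in zswap */
};

static int toy_total(const struct toy_memcg *tree, int n)
{
	int i, sum = 0;

	for (i = 0; i < n; i++)
		sum += tree[i].pages;
	return sum;
}

/* -ENOENT: not a writeback candidate; 0: one page written back. */
static int toy_shrink_memcg(struct toy_memcg *memcg)
{
	if (!memcg->writeback || memcg->pages == 0)
		return -ENOENT;
	memcg->pages--;
	return 0;
}

/* Returns the number of completed tree walks. */
static int toy_shrink_worker(struct toy_memcg *tree, int n, int thr)
{
	int cursor = -1;	/* -1 models a NULL cursor */
	int failures = 0, attempts = 0, walks = 0, ret;

	assert(n > 0);
	do {
		cursor++;
		if (cursor == n) {	/* end of a tree walk */
			cursor = -1;
			walks++;
			/* a walk only fails if it found no candidate */
			if (!attempts && ++failures == MAX_RETRIES)
				break;
			attempts = 0;
			continue;
		}

		ret = toy_shrink_memcg(&tree[cursor]);
		if (ret == -ENOENT)	/* skip, no attempt, no failure */
			continue;
		++attempts;

		if (ret && ++failures == MAX_RETRIES)
			break;
	} while (toy_total(tree, n) > thr);

	return walks;
}
```

In this model, an empty memcg next to a populated one no longer stops
reclaim before the threshold is reached, while a tree without any
writeback candidate still terminates after MAX_RETRIES walks.
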
Fixes: a65b0e7607cc ("zswap: make shrinking memcg-aware")
Signed-off-by: Takero Funaki
---
 mm/zswap.c | 41 ++++++++++++++++++++++++++++++++++-------
 1 file changed, 34 insertions(+), 7 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index e9b5343256cd..60c8b1232ec9 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1293,10 +1293,10 @@ static struct shrinker *zswap_alloc_shrinker(void)
 
 static int shrink_memcg(struct mem_cgroup *memcg)
 {
-	int nid, shrunk = 0;
+	int nid, shrunk = 0, scanned = 0;
 
 	if (!mem_cgroup_zswap_writeback_enabled(memcg))
-		return -EINVAL;
+		return -ENOENT;
 
 	/*
 	 * Skip zombies because their LRUs are reparented and we would be
@@ -1310,20 +1310,34 @@ static int shrink_memcg(struct mem_cgroup *memcg)
 
 		shrunk += list_lru_walk_one(&zswap_list_lru, nid, memcg,
 					    &shrink_memcg_cb, NULL, &nr_to_walk);
+		scanned += 1 - nr_to_walk;
 	}
+
+	if (!scanned)
+		return -ENOENT;
+
 	return shrunk ? 0 : -EAGAIN;
 }
 
 static void shrink_worker(struct work_struct *w)
 {
 	struct mem_cgroup *memcg;
-	int ret, failures = 0;
+	int ret, failures = 0, attempts = 0;
 	unsigned long thr;
 
 	/* Reclaim down to the accept threshold */
 	thr = zswap_accept_thr_pages();
 
-	/* global reclaim will select cgroup in a round-robin fashion.
+	/*
+	 * Global reclaim will select cgroup in a round-robin fashion from all
+	 * online memcgs, but memcgs that have no pages in zswap and
+	 * writeback-disabled memcgs (memory.zswap.writeback=0) are not
+	 * candidates for shrinking.
+	 *
+	 * Shrinking will be aborted if we encounter the following
+	 * MAX_RECLAIM_RETRIES times:
+	 * - No writeback-candidate memcgs found in a memcg tree walk.
+	 * - Shrinking a writeback-candidate memcg failed.
 	 *
 	 * We save iteration cursor memcg into zswap_next_shrink,
 	 * which can be modified by the offline memcg cleaner
@@ -1361,9 +1375,14 @@ static void shrink_worker(struct work_struct *w)
 		spin_unlock(&zswap_shrink_lock);
 
 		if (!memcg) {
-			if (++failures == MAX_RECLAIM_RETRIES)
+			/*
+			 * Continue shrinking without incrementing failures if
+			 * we found candidate memcgs in the last tree walk.
+			 */
+			if (!attempts && ++failures == MAX_RECLAIM_RETRIES)
 				break;
 
+			attempts = 0;
 			goto resched;
 		}
 
@@ -1371,8 +1390,16 @@ static void shrink_worker(struct work_struct *w)
 		/* drop the extra reference */
 		mem_cgroup_put(memcg);
 
-		if (ret == -EINVAL)
-			break;
+		/*
+		 * There are no writeback-candidate pages in the memcg.
+		 * This is not an issue as long as we can find another memcg
+		 * with pages in zswap. Skip this without incrementing attempts
+		 * and failures.
+		 */
+		if (ret == -ENOENT)
+			continue;
+		++attempts;
+
 		if (ret && ++failures == MAX_RECLAIM_RETRIES)
 			break;
 resched: