From patchwork Thu Mar 23 04:00:30 2023
X-Patchwork-Submitter: Yosry Ahmed
X-Patchwork-Id: 13184903
Date: Thu, 23 Mar 2023 04:00:30 +0000
X-Mailer: git-send-email 2.40.0.rc1.284.g88254d51c5-goog
Message-ID: <20230323040037.2389095-1-yosryahmed@google.com>
Subject: [RFC PATCH 0/7] Make rstat flushing IRQ and sleep friendly
From: Yosry Ahmed
To: Tejun Heo, Josef Bacik, Jens Axboe, Zefan Li, Johannes Weiner,
 Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton
Cc: Vasily Averin, cgroups@vger.kernel.org, linux-block@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, bpf@vger.kernel.org,
 Yosry Ahmed

Currently, if rstat flushing is invoked using the irqsafe variant
cgroup_rstat_flush_irqsafe(), we keep interrupts disabled and do not
sleep for the entire flush operation, which is O(# cpus * #
cgroups). This can be rather dangerous. Not every context that uses
cgroup_rstat_flush_irqsafe() is actually unable to sleep, and among
those that cannot sleep, not all require interrupts to be disabled.

This patch series breaks down the O(# cpus * # cgroups) duration that
we disable interrupts for into a series of O(# cgroups) durations.
Disabling interrupts is deferred to the caller if needed.

Patch 1 mainly addresses this by not requiring interrupts to be
disabled for the global rstat lock to be acquired. As a side effect of
that, we disable rstat flushing in interrupt context. See patch 1 for
more details. One thing I am not sure about is whether the only caller
of cgroup_rstat_flush_hold(), cgroup_base_stat_cputime_show(),
currently has any dependency on that call disabling interrupts.

Patch 2 follows suit for stats_flush_lock in the memcg code, allowing
it to be acquired without disabling interrupts.

Patch 3 removes cgroup_rstat_flush_irqsafe() and updates
cgroup_rstat_flush() to be more explicit about sleeping.

Patch 4 changes memcg code paths that invoke rstat flushing to sleep
where possible. The patch changes code paths where it is naturally
safe to sleep: userspace reads and the background periodic flusher.

Patches 5 & 6 allow sleeping during rstat flushing in reclaim context
and refault context. I am not sure if this is okay, especially the
latter, so I placed them in separate patches for ease of revert/drop.

Patch 7 is a slightly tangential optimization that limits the work
done by rstat flushing in some scenarios.
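
For illustration only, the change in the interrupts-off window described
above can be modeled in a few lines of userspace C. This is a toy sketch,
not code from the series: "interrupts" are a plain counter, the machine
size is made up, and every name below (irq_model, queue_updates(),
flush_irqs_off_throughout(), flush_irqs_off_per_cpu()) is hypothetical.
It only shows that the longest uninterrupted window drops from
O(# cpus * # cgroups) to O(# cgroups) work units.

```c
#include <assert.h>

#define NR_CPUS    4	/* hypothetical machine size */
#define NR_CGROUPS 8	/* hypothetical number of cgroups with updates */

/* Per-CPU pending deltas and the flushed (cumulative) totals. */
static long pending[NR_CPUS][NR_CGROUPS];
long total[NR_CGROUPS];

/* Toy stand-in for the CPU's interrupt state; not a kernel API. */
struct irq_model {
	int disabled;		/* nonzero while "interrupts" are off */
	long cur_window;	/* work units done in the current window */
	long max_window;	/* longest interrupts-off window seen */
};

static void model_irq_disable(struct irq_model *m)
{
	assert(!m->disabled);	/* the model does not nest */
	m->disabled = 1;
	m->cur_window = 0;
}

static void model_irq_enable(struct irq_model *m)
{
	assert(m->disabled);
	if (m->cur_window > m->max_window)
		m->max_window = m->cur_window;
	m->disabled = 0;
}

/* Fold one CPU's pending deltas into the totals: O(# cgroups) work. */
static void flush_one_cpu(struct irq_model *m, int cpu)
{
	for (int cg = 0; cg < NR_CGROUPS; cg++) {
		total[cg] += pending[cpu][cg];
		pending[cpu][cg] = 0;
		if (m->disabled)
			m->cur_window++;
	}
}

/* Queue one unit of update for every (cpu, cgroup) pair. */
void queue_updates(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		for (int cg = 0; cg < NR_CGROUPS; cg++)
			pending[cpu][cg] = 1;
}

/* Before: interrupts stay off for the whole flush. */
long flush_irqs_off_throughout(void)
{
	struct irq_model m = {0};

	model_irq_disable(&m);
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		flush_one_cpu(&m, cpu);
	model_irq_enable(&m);
	return m.max_window;
}

/* After: interrupts are off only around each per-CPU step. */
long flush_irqs_off_per_cpu(void)
{
	struct irq_model m = {0};

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		model_irq_disable(&m);
		flush_one_cpu(&m, cpu);
		model_irq_enable(&m);
		/* A sleepable caller could yield here between CPUs. */
	}
	return m.max_window;
}
```

Calling queue_updates() followed by either flush function returns the
longest interrupts-off window in work units: NR_CPUS * NR_CGROUPS for
the first variant, NR_CGROUPS for the second. Both leave the same
values in total[].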

Yosry Ahmed (7):
  cgroup: rstat: only disable interrupts for the percpu lock
  memcg: do not disable interrupts when holding stats_flush_lock
  cgroup: rstat: remove cgroup_rstat_flush_irqsafe()
  memcg: sleep during flushing stats in safe contexts
  vmscan: memcg: sleep when flushing stats during reclaim
  workingset: memcg: sleep when flushing stats in workingset_refault()
  memcg: do not modify rstat tree for zero updates

 block/blk-cgroup.c         |  2 +-
 include/linux/cgroup.h     |  3 +--
 include/linux/memcontrol.h |  8 +++---
 kernel/cgroup/cgroup.c     |  4 +--
 kernel/cgroup/rstat.c      | 54 ++++++++++++++++++++------------------
 mm/memcontrol.c            | 52 ++++++++++++++++++++++--------------
 mm/vmscan.c                |  2 +-
 mm/workingset.c            |  4 +--
 8 files changed, 73 insertions(+), 56 deletions(-)