From patchwork Sat Feb 26 20:41:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sebastian Andrzej Siewior X-Patchwork-Id: 12761449 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5AD1CC433FE for ; Sat, 26 Feb 2022 20:41:55 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A6B588D0001; Sat, 26 Feb 2022 15:41:53 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 907B58D0007; Sat, 26 Feb 2022 15:41:53 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5D2A18D0001; Sat, 26 Feb 2022 15:41:53 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (relay.hostedemail.com [64.99.140.28]) by kanga.kvack.org (Postfix) with ESMTP id 42A748D0003 for ; Sat, 26 Feb 2022 15:41:53 -0500 (EST) Received: from smtpin06.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 048E921741 for ; Sat, 26 Feb 2022 20:41:52 +0000 (UTC) X-FDA: 79186102506.06.7FD934F Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by imf02.hostedemail.com (Postfix) with ESMTP id 53DF18000D for ; Sat, 26 Feb 2022 20:41:52 +0000 (UTC) From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1645908111; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vzffaKtY2TPuSciDyqAdDELxkAsc6OFdR8wdvbRfvC8=; b=HfojqYzQ6+JkjfDeLFJcwECUNDYhedrn0fTUATceUUdx+2MnqdFSyJz9x1RXUzuAa567xe yl3CMMtLaCjUaHBL7+vqJBe6aKXv0b6TslTod4pLAmrLfbapb2/iNfFcMmdvhHwg1xdDyg RxsMIz57z2RoGNK0biAieGMonD2qdvpA6XVbBexJcE3x4H4tU6TY302Yr+dQyDKi9WZgqy uOcDZfuSXf5MCihdLx61r8uxKEt1yi8tzqE1D/tScrseZ3QDdtTu2IYvL/M9Pz0Ktg2JIF uv3RaHsJlodo6H70lTr4OJE16oQUmJPHYvo4Sset1N1xmjGvIQQ9CMrX72W6EQ== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1645908111; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vzffaKtY2TPuSciDyqAdDELxkAsc6OFdR8wdvbRfvC8=; b=XeDtsWWbNTY8NrX/xGqu21GrQtVyptp5oaHZP4roJVmI7X1TwzvM2EwJQWs2pvFq6LD3yy XeweT9UkNlYxraCA== To: cgroups@vger.kernel.org, linux-mm@kvack.org Cc: Andrew Morton , Johannes Weiner , Michal Hocko , =?utf-8?q?Michal_Koutn=C3=BD?= , Peter Zijlstra , Thomas Gleixner , Vladimir Davydov , Waiman Long , Sebastian Andrzej Siewior , Roman Gushchin Subject: [PATCH v5 3/6] mm/memcg: Protect per-CPU counter by disabling preemption on PREEMPT_RT where needed. Date: Sat, 26 Feb 2022 21:41:41 +0100 Message-Id: <20220226204144.1008339-4-bigeasy@linutronix.de> In-Reply-To: <20220226204144.1008339-1-bigeasy@linutronix.de> References: <20220226204144.1008339-1-bigeasy@linutronix.de> MIME-Version: 1.0 X-Rspam-User: X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 53DF18000D X-Stat-Signature: r1fnt57onmb35ktib8goqpr67kehz1qq Authentication-Results: imf02.hostedemail.com; dkim=pass header.d=linutronix.de header.s=2020 header.b=HfojqYzQ; dkim=pass header.d=linutronix.de header.s=2020e header.b=XeDtsWWb; spf=pass (imf02.hostedemail.com: domain of bigeasy@linutronix.de designates 193.142.43.55 as permitted sender) smtp.mailfrom=bigeasy@linutronix.de; dmarc=pass (policy=none) header.from=linutronix.de X-HE-Tag: 1645908112-790911 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The per-CPU counter are modified with the non-atomic modifier. The consistency is ensured by disabling interrupts for the update. On non PREEMPT_RT configuration this works because acquiring a spinlock_t typed lock with the _irq() suffix disables interrupts. On PREEMPT_RT configurations the RMW operation can be interrupted. Another problem is that mem_cgroup_swapout() expects to be invoked with disabled interrupts because the caller has to acquire a spinlock_t which is acquired with disabled interrupts. Since spinlock_t never disables interrupts on PREEMPT_RT the interrupts are never disabled at this point. The code is never called from in_irq() context on PREEMPT_RT therefore disabling preemption during the update is sufficient on PREEMPT_RT. The sections which explicitly disable interrupts can remain on PREEMPT_RT because the sections remain short and they don't involve sleeping locks (memcg_check_events() is doing nothing on PREEMPT_RT). Disable preemption during update of the per-CPU variables which do not explicitly disable interrupts. Signed-off-by: Sebastian Andrzej Siewior Acked-by: Roman Gushchin Reviewed-by: Shakeel Butt --- mm/memcontrol.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 55 insertions(+), 1 deletion(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 0b5117ed2ae08..238ea77aade5d 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -630,6 +630,35 @@ static DEFINE_SPINLOCK(stats_flush_lock); static DEFINE_PER_CPU(unsigned int, stats_updates); static atomic_t stats_flush_threshold = ATOMIC_INIT(0); +/* + * Accessors to ensure that preemption is disabled on PREEMPT_RT because it can + * not rely on this as part of an acquired spinlock_t lock. These functions are + * never used in hardirq context on PREEMPT_RT and therefore disabling preemtion + * is sufficient. + */ +static void memcg_stats_lock(void) +{ +#ifdef CONFIG_PREEMPT_RT + preempt_disable(); +#else + VM_BUG_ON(!irqs_disabled()); +#endif +} + +static void __memcg_stats_lock(void) +{ +#ifdef CONFIG_PREEMPT_RT + preempt_disable(); +#endif +} + +static void memcg_stats_unlock(void) +{ +#ifdef CONFIG_PREEMPT_RT + preempt_enable(); +#endif +} + static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val) { unsigned int x; @@ -706,6 +735,27 @@ void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx, pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec); memcg = pn->memcg; + /* + * The caller from rmap relay on disabled preemption becase they never + * update their counter from in-interrupt context. For these two + * counters we check that the update is never performed from an + * interrupt context while other caller need to have disabled interrupt. + */ + __memcg_stats_lock(); + if (IS_ENABLED(CONFIG_DEBUG_VM) && !IS_ENABLED(CONFIG_PREEMPT_RT)) { + switch (idx) { + case NR_ANON_MAPPED: + case NR_FILE_MAPPED: + case NR_ANON_THPS: + case NR_SHMEM_PMDMAPPED: + case NR_FILE_PMDMAPPED: + WARN_ON_ONCE(!in_task()); + break; + default: + WARN_ON_ONCE(!irqs_disabled()); + } + } + /* Update memcg */ __this_cpu_add(memcg->vmstats_percpu->state[idx], val); @@ -713,6 +763,7 @@ void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx, __this_cpu_add(pn->lruvec_stats_percpu->state[idx], val); memcg_rstat_updated(memcg, val); + memcg_stats_unlock(); } /** @@ -795,8 +846,10 @@ void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx, if (mem_cgroup_disabled()) return; + memcg_stats_lock(); __this_cpu_add(memcg->vmstats_percpu->events[idx], count); memcg_rstat_updated(memcg, count); + memcg_stats_unlock(); } static unsigned long memcg_events(struct mem_cgroup *memcg, int event) @@ -7140,8 +7193,9 @@ void mem_cgroup_swapout(struct page *page, swp_entry_t entry) * important here to have the interrupts disabled because it is the * only synchronisation we have for updating the per-CPU variables. */ - VM_BUG_ON(!irqs_disabled()); + memcg_stats_lock(); mem_cgroup_charge_statistics(memcg, -nr_entries); + memcg_stats_unlock(); memcg_check_events(memcg, page_to_nid(page)); css_put(&memcg->css);