From patchwork Wed Dec 22 11:41:08 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sebastian Andrzej Siewior
X-Patchwork-Id: 12691421
From: Sebastian Andrzej Siewior
To: cgroups@vger.kernel.org, linux-mm@kvack.org
Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
 Thomas Gleixner, Waiman Long, Peter Zijlstra
Subject: [RFC PATCH 0/3] mm/memcg: Address PREEMPT_RT problems instead of
 disabling it.
Date: Wed, 22 Dec 2021 12:41:08 +0100
Message-Id: <20211222114111.2206248-1-bigeasy@linutronix.de>

Hi,

this is a follow-up to
   https://lkml.kernel.org/r/20211207155208.eyre5svucpg7krxe@linutronix.de

where it has been suggested that I should try again with memcg instead
of simply disabling it.

Patch #1 deals with the counters.
It has been suggested to simply disable
preemption on RT (like in vmstats) and I followed that advice as closely
as possible. The local_irq_save() could be removed from mod_memcg_state()
and the other wrapper on RT, but I left it in since it does not hurt and
it might look nicer ;)

Patch #2 is a follow-up to
   https://lkml.kernel.org/r/20211214144412.447035-1-longman@redhat.com

Patch #3 restricts the task_obj usage to !PREEMPTION kernels. Based on my
understanding, the use of preempt_disable() minimizes (avoids?) the win of
the optimisation.

I tested them on CONFIG_PREEMPT_NONE and CONFIG_PREEMPT_RT with the
tools/testing/selftests/cgroup/* tests. It looked good except for the
following (which was also there before the patches):

- test_kmem sometimes complained about:
      not ok 2 test_kmem_memcg_deletion

- test_memcontrol always complained about
      not ok 3 test_memcg_min
      not ok 4 test_memcg_low
  and did not finish.

- lockdep complaints were triggered by test_core and test_freezer (both
  had to run):

      ======================================================
      WARNING: possible circular locking dependency detected
      5.16.0-rc5 #259 Not tainted
      ------------------------------------------------------
      test_core/5996 is trying to acquire lock:
      ffffffff829a1258 (css_set_lock){..-.}-{2:2}, at: obj_cgroup_release+0x2d/0xb0

      but task is already holding lock:
      ffff888103034618 (&sighand->siglock){....}-{2:2}, at: get_signal+0x8d/0xdb0

      which lock already depends on the new lock.

      the existing dependency chain (in reverse order) is:

      -> #1 (&sighand->siglock){....}-{2:2}:
             _raw_spin_lock+0x27/0x40
             cgroup_post_fork+0x1f5/0x290
             copy_process+0x191b/0x1f80
             kernel_clone+0x5a/0x410
             __do_sys_clone3+0xb3/0x110
             do_syscall_64+0x43/0x90
             entry_SYSCALL_64_after_hwframe+0x44/0xae

      -> #0 (css_set_lock){..-.}-{2:2}:
             __lock_acquire+0x1253/0x2280
             lock_acquire+0xd4/0x2e0
             _raw_spin_lock_irqsave+0x36/0x50
             obj_cgroup_release+0x2d/0xb0
             drain_obj_stock+0x1a9/0x1b0
             refill_obj_stock+0x4f/0x220
             memcg_slab_free_hook.part.0+0x108/0x290
             kmem_cache_free+0xf5/0x3c0
             dequeue_signal+0xaf/0x1e0
             get_signal+0x232/0xdb0
             arch_do_signal_or_restart+0xf8/0x740
             exit_to_user_mode_prepare+0x17d/0x270
             syscall_exit_to_user_mode+0x19/0x70
             do_syscall_64+0x50/0x90
             entry_SYSCALL_64_after_hwframe+0x44/0xae

      other info that might help us debug this:

       Possible unsafe locking scenario:

             CPU0                    CPU1
             ----                    ----
        lock(&sighand->siglock);
                                     lock(css_set_lock);
                                     lock(&sighand->siglock);
        lock(css_set_lock);

       *** DEADLOCK ***

      2 locks held by test_core/5996:
       #0: ffff888103034618 (&sighand->siglock){....}-{2:2}, at: get_signal+0x8d/0xdb0
       #1: ffffffff82905e40 (rcu_read_lock){....}-{1:2}, at: drain_obj_stock+0x71/0x1b0

      stack backtrace:
      CPU: 2 PID: 5996 Comm: test_core Not tainted 5.16.0-rc5 #259
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
      Call Trace:
       dump_stack_lvl+0x45/0x59
       check_noncircular+0xfe/0x110
       __lock_acquire+0x1253/0x2280
       lock_acquire+0xd4/0x2e0
       _raw_spin_lock_irqsave+0x36/0x50
       obj_cgroup_release+0x2d/0xb0
       drain_obj_stock+0x1a9/0x1b0
       refill_obj_stock+0x4f/0x220
       memcg_slab_free_hook.part.0+0x108/0x290
       kmem_cache_free+0xf5/0x3c0
       dequeue_signal+0xaf/0x1e0
       get_signal+0x232/0xdb0
       arch_do_signal_or_restart+0xf8/0x740
       exit_to_user_mode_prepare+0x17d/0x270
       syscall_exit_to_user_mode+0x19/0x70
       do_syscall_64+0x50/0x90
       entry_SYSCALL_64_after_hwframe+0x44/0xae

Sebastian