From patchwork Tue Jun 29 02:35:13 2021
X-Patchwork-Submitter: Andrew Morton <akpm@linux-foundation.org>
X-Patchwork-Id: 12348945
Date: Mon, 28 Jun 2021 19:35:13 -0700
From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, axelrasmussen@google.com, linux-mm@kvack.org,
 mm-commits@vger.kernel.org, nsaenzju@redhat.com, rostedt@goodmis.org,
 torvalds@linux-foundation.org, vbabka@suse.cz
Subject: [patch 037/192] mm: mmap_lock: use local locks instead of disabling
 preemption
Message-ID: <20210629023513.G6ax4BHRk%akpm@linux-foundation.org>
In-Reply-To: <20210628193256.008961950a714730751c1423@linux-foundation.org>
From: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Subject: mm: mmap_lock: use local locks instead of disabling preemption

mmap_lock explicitly disables/enables preemption while manipulating its
local CPU variables.  That is expected, but it doesn't play well with
PREEMPT_RT: the preemption-disabled section also takes a spinlock, and
spinlocks on RT kernels are sleeping locks that may schedule, which is
exactly what we are trying to avoid.

To fix this, convert the explicit preemption handling to local_locks,
which are RT-aware and disable migration instead of preemption when
PREEMPT_RT=y.

The faulty call trace looks like the following:

    __mmap_lock_do_trace_*()
      preempt_disable()
      get_mm_memcg_path()
        cgroup_path()
          kernfs_path_from_node()
            spin_lock_irqsave() /* Scheduling while atomic! */

Link: https://lkml.kernel.org/r/20210604163506.2103900-1-nsaenzju@redhat.com
Fixes: 2b5067a8143e3 ("mm: mmap_lock: add tracepoints around lock acquisition")
Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Tested-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/mmap_lock.c |   33 ++++++++++++++++++++++-----------
 1 file changed, 22 insertions(+), 11 deletions(-)
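For readers unfamiliar with local locks: the pattern the patch moves to
embeds a local_lock_t in the per-CPU data it protects.  A minimal,
illustrative sketch (hypothetical names, not part of the patch):

    #include <linux/local_lock.h>
    #include <linux/percpu.h>

    struct foo_pcpu {
            local_lock_t lock;
            int counter;
    };
    static DEFINE_PER_CPU(struct foo_pcpu, foo_pcpu) = {
            .lock = INIT_LOCAL_LOCK(lock),
    };

    static void foo_update(void)
    {
            local_lock(&foo_pcpu.lock);     /* was: preempt_disable() */
            this_cpu_inc(foo_pcpu.counter);
            /* may call code that takes sleeping locks on PREEMPT_RT */
            local_unlock(&foo_pcpu.lock);   /* was: preempt_enable() */
    }

On !PREEMPT_RT kernels local_lock()/local_unlock() map to
preempt_disable()/preempt_enable(), so non-RT behaviour is unchanged;
on PREEMPT_RT they take a per-CPU spinlock that disables migration but
leaves the critical section preemptible.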
--- a/mm/mmap_lock.c~mm-mmap_lock-use-local-locks-instead-of-disabling-preemption
+++ a/mm/mmap_lock.c
@@ -11,6 +11,7 @@
 #include <linux/rcupdate.h>
 #include <linux/smp.h>
 #include <linux/trace_events.h>
+#include <linux/local_lock.h>
 
 EXPORT_TRACEPOINT_SYMBOL(mmap_lock_start_locking);
 EXPORT_TRACEPOINT_SYMBOL(mmap_lock_acquire_returned);
@@ -39,21 +40,30 @@ static int reg_refcount; /* Protected by
  */
 #define CONTEXT_COUNT 4
 
-static DEFINE_PER_CPU(char __rcu *, memcg_path_buf);
+struct memcg_path {
+	local_lock_t lock;
+	char __rcu *buf;
+	local_t buf_idx;
+};
+static DEFINE_PER_CPU(struct memcg_path, memcg_paths) = {
+	.lock = INIT_LOCAL_LOCK(lock),
+	.buf_idx = LOCAL_INIT(0),
+};
+
 static char **tmp_bufs;
-static DEFINE_PER_CPU(int, memcg_path_buf_idx);
 
 /* Called with reg_lock held. */
 static void free_memcg_path_bufs(void)
 {
+	struct memcg_path *memcg_path;
 	int cpu;
 	char **old = tmp_bufs;
 
 	for_each_possible_cpu(cpu) {
-		*(old++) = rcu_dereference_protected(
-			per_cpu(memcg_path_buf, cpu),
+		memcg_path = per_cpu_ptr(&memcg_paths, cpu);
+		*(old++) = rcu_dereference_protected(memcg_path->buf,
 			lockdep_is_held(&reg_lock));
-		rcu_assign_pointer(per_cpu(memcg_path_buf, cpu), NULL);
+		rcu_assign_pointer(memcg_path->buf, NULL);
 	}
 
 	/* Wait for inflight memcg_path_buf users to finish. */
@@ -88,7 +98,7 @@ int trace_mmap_lock_reg(void)
 		new = kmalloc(MEMCG_PATH_BUF_SIZE * CONTEXT_COUNT, GFP_KERNEL);
 		if (new == NULL)
 			goto out_fail_free;
-		rcu_assign_pointer(per_cpu(memcg_path_buf, cpu), new);
+		rcu_assign_pointer(per_cpu_ptr(&memcg_paths, cpu)->buf, new);
 		/* Don't need to wait for inflights, they'd have gotten NULL. */
 	}
 
@@ -122,23 +132,24 @@ out:
 
 static inline char *get_memcg_path_buf(void)
 {
+	struct memcg_path *memcg_path = this_cpu_ptr(&memcg_paths);
 	char *buf;
 	int idx;
 
 	rcu_read_lock();
-	buf = rcu_dereference(*this_cpu_ptr(&memcg_path_buf));
+	buf = rcu_dereference(memcg_path->buf);
 	if (buf == NULL) {
 		rcu_read_unlock();
 		return NULL;
 	}
-	idx = this_cpu_add_return(memcg_path_buf_idx, MEMCG_PATH_BUF_SIZE) -
+	idx = local_add_return(MEMCG_PATH_BUF_SIZE, &memcg_path->buf_idx) -
 	      MEMCG_PATH_BUF_SIZE;
 	return &buf[idx];
 }
 
 static inline void put_memcg_path_buf(void)
 {
-	this_cpu_sub(memcg_path_buf_idx, MEMCG_PATH_BUF_SIZE);
+	local_sub(MEMCG_PATH_BUF_SIZE, &this_cpu_ptr(&memcg_paths)->buf_idx);
 	rcu_read_unlock();
 }
 
@@ -179,14 +190,14 @@ out:
 #define TRACE_MMAP_LOCK_EVENT(type, mm, ...)                           \
 	do {                                                           \
 		const char *memcg_path;                                \
-		preempt_disable();                                     \
+		local_lock(&memcg_paths.lock);                         \
 		memcg_path = get_mm_memcg_path(mm);                    \
 		trace_mmap_lock_##type(mm,                             \
 				       memcg_path != NULL ? memcg_path : "",   \
 				       ##__VA_ARGS__);                 \
 		if (likely(memcg_path != NULL))                        \
 			put_memcg_path_buf();                          \
-		preempt_enable();                                      \
+		local_unlock(&memcg_paths.lock);                       \
 	} while (0)
 
 #else /* !CONFIG_MEMCG */
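A note on converting buf_idx from a plain per-CPU int to local_t: each
per-CPU buffer holds CONTEXT_COUNT slots of MEMCG_PATH_BUF_SIZE bytes so
that nested contexts (e.g. an interrupt arriving inside a tracepoint)
get disjoint slices.  local_add_return()/local_sub() are atomic on the
owning CPU, so the bump/release stays correct even though, under
PREEMPT_RT, the local_lock section is preemptible and can nest via
interrupts.  A rough sketch of the slot scheme, with hypothetical
helper names:

    /* claim the next MEMCG_PATH_BUF_SIZE-byte slot (LIFO) */
    static inline char *take_slot(struct memcg_path *p, char *buf)
    {
            int idx = local_add_return(MEMCG_PATH_BUF_SIZE, &p->buf_idx) -
                      MEMCG_PATH_BUF_SIZE;
            return &buf[idx];
    }

    /* release the most recently claimed slot */
    static inline void release_slot(struct memcg_path *p)
    {
            local_sub(MEMCG_PATH_BUF_SIZE, &p->buf_idx);
    }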