From patchwork Fri Jan 14 22:02:52 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12714026
Date: Fri, 14 Jan 2022 14:02:52 -0800
From: Andrew Morton
To: akpm@linux-foundation.org, bmt@zurich.ibm.com, bristot@kernel.org,
 caihuoqing@baidu.com, dave@stgolabs.net, dledford@redhat.com, jgg@ziepe.ca,
 jiangshanlai@gmail.com, joel@joelfernandes.org, josh@joshtriplett.org,
 linux-mm@kvack.org, mathieu.desnoyers@efficios.com, mingo@redhat.com,
 mm-commits@vger.kernel.org, paulmck@kernel.org, rostedt@goodmis.org,
 torvalds@linux-foundation.org
Subject: [patch 001/146] kthread: add the helper function kthread_run_on_cpu()
Message-ID: <20220114220252.FkLY1R14S%akpm@linux-foundation.org>
In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org>

From: Cai Huoqing
Subject: kthread: add the helper function kthread_run_on_cpu()

The helper function kthread_run_on_cpu() wraps kthread_create_on_cpu()
followed by wake_up_process().

Callers that create a CPU-bound thread and wake it immediately can use
kthread_run_on_cpu() directly instead of
kthread_create_on_node/kthread_bind/wake_up_process() or
kthread_create_on_cpu/wake_up_process() or
kthread_create/kthread_bind/wake_up_process(), which simplifies the code.

[akpm@linux-foundation.org: export kthread_create_on_cpu to modules]

Link: https://lkml.kernel.org/r/20211022025711.3673-2-caihuoqing@baidu.com
Signed-off-by: Cai Huoqing
Cc: Bernard Metzler
Cc: Cai Huoqing
Cc: Daniel Bristot de Oliveira
Cc: Davidlohr Bueso
Cc: Doug Ledford
Cc: Ingo Molnar
Cc: Jason Gunthorpe
Cc: Joel Fernandes (Google)
Cc: Josh Triplett
Cc: Lai Jiangshan
Cc: Mathieu Desnoyers
Cc: "Paul E . McKenney"
Cc: Steven Rostedt
Signed-off-by: Andrew Morton
---
 include/linux/kthread.h |   25 +++++++++++++++++++++++++
 kernel/kthread.c        |    1 +
 2 files changed, 26 insertions(+)

--- a/include/linux/kthread.h~kthread-add-the-helper-function-kthread_run_on_cpu
+++ a/include/linux/kthread.h
@@ -56,6 +56,31 @@ bool kthread_is_per_cpu(struct task_stru
 	__k;								   \
 })
+/**
+ * kthread_run_on_cpu - create and wake a cpu bound thread.
+ * @threadfn: the function to run until signal_pending(current).
+ * @data: data ptr for @threadfn.
+ * @cpu: The cpu on which the thread should be bound,
+ * @namefmt: printf-style name for the thread. Format is restricted
+ *	     to "name.*%u". Code fills in cpu number.
+ *
+ * Description: Convenient wrapper for kthread_create_on_cpu()
+ * followed by wake_up_process(). Returns the kthread or
+ * ERR_PTR(-ENOMEM).
+ */
+static inline struct task_struct *
+kthread_run_on_cpu(int (*threadfn)(void *data), void *data,
+			unsigned int cpu, const char *namefmt)
+{
+	struct task_struct *p;
+
+	p = kthread_create_on_cpu(threadfn, data, cpu, namefmt);
+	if (!IS_ERR(p))
+		wake_up_process(p);
+
+	return p;
+}
+
 void free_kthread_struct(struct task_struct *k);
 void kthread_bind(struct task_struct *k, unsigned int cpu);
 void kthread_bind_mask(struct task_struct *k, const struct cpumask *mask);
--- a/kernel/kthread.c~kthread-add-the-helper-function-kthread_run_on_cpu
+++ a/kernel/kthread.c
@@ -523,6 +523,7 @@ struct task_struct *kthread_create_on_cp
 	to_kthread(p)->cpu = cpu;
 	return p;
 }
+EXPORT_SYMBOL(kthread_create_on_cpu);
 void kthread_set_per_cpu(struct task_struct *k, int cpu)
 {

From patchwork Fri Jan 14 22:02:55 2022
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12714027
Date: Fri, 14 Jan 2022 14:02:55 -0800
From: Andrew Morton
To: akpm@linux-foundation.org, bmt@zurich.ibm.com, bristot@kernel.org,
 caihuoqing@baidu.com, dave@stgolabs.net, dledford@redhat.com, jgg@ziepe.ca,
 jiangshanlai@gmail.com, joel@joelfernandes.org, josh@joshtriplett.org,
 linux-mm@kvack.org, mathieu.desnoyers@efficios.com, mingo@redhat.com,
 mm-commits@vger.kernel.org, paulmck@kernel.org, rostedt@goodmis.org,
 torvalds@linux-foundation.org
Subject: [patch 002/146] RDMA/siw: make use of the helper function kthread_run_on_cpu()
Message-ID: <20220114220255.ZPEpuUjOn%akpm@linux-foundation.org>
In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org>

From: Cai Huoqing
Subject: RDMA/siw: make use of the helper function kthread_run_on_cpu()

Replace kthread_create/kthread_bind/wake_up_process() with
kthread_run_on_cpu() to simplify the code.

Link: https://lkml.kernel.org/r/20211022025711.3673-3-caihuoqing@baidu.com
Signed-off-by: Cai Huoqing
Cc: Bernard Metzler
Cc: Daniel Bristot de Oliveira
Cc: Davidlohr Bueso
Cc: Doug Ledford
Cc: Ingo Molnar
Cc: Jason Gunthorpe
Cc: Joel Fernandes (Google)
Cc: Josh Triplett
Cc: Lai Jiangshan
Cc: Mathieu Desnoyers
Cc: "Paul E . McKenney"
Cc: Steven Rostedt
Signed-off-by: Andrew Morton
Reviewed-by: Bernard Metzler
---
 drivers/infiniband/sw/siw/siw_main.c |    7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

--- a/drivers/infiniband/sw/siw/siw_main.c~rdma-siw-make-use-of-the-helper-function-kthread_run_on_cpu
+++ a/drivers/infiniband/sw/siw/siw_main.c
@@ -98,15 +98,14 @@ static int siw_create_tx_threads(void)
 			continue;
 		siw_tx_thread[cpu] =
-			kthread_create(siw_run_sq, (unsigned long *)(long)cpu,
-				       "siw_tx/%d", cpu);
+			kthread_run_on_cpu(siw_run_sq,
+					   (unsigned long *)(long)cpu,
+					   cpu, "siw_tx/%u");
 		if (IS_ERR(siw_tx_thread[cpu])) {
 			siw_tx_thread[cpu] = NULL;
 			continue;
 		}
-		kthread_bind(siw_tx_thread[cpu], cpu);
-		wake_up_process(siw_tx_thread[cpu]);
 		assigned++;
 	}
 	return assigned;

From patchwork Fri Jan 14 22:02:59 2022
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12714028
Date: Fri, 14 Jan 2022 14:02:59 -0800
From: Andrew Morton
To: akpm@linux-foundation.org, bmt@zurich.ibm.com, bristot@kernel.org,
 caihuoqing@baidu.com, dave@stgolabs.net, dledford@redhat.com, jgg@ziepe.ca,
 jiangshanlai@gmail.com, joel@joelfernandes.org, josh@joshtriplett.org,
 linux-mm@kvack.org, mathieu.desnoyers@efficios.com, mingo@redhat.com,
 mm-commits@vger.kernel.org, paulmck@kernel.org, rostedt@goodmis.org,
 torvalds@linux-foundation.org
Subject: [patch 003/146] ring-buffer: make use of the helper function kthread_run_on_cpu()
Message-ID: <20220114220259.m6-lQwjxH%akpm@linux-foundation.org>
In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org>

From: Cai Huoqing
Subject: ring-buffer: make use of the helper function kthread_run_on_cpu()

Replace kthread_create/kthread_bind/wake_up_process() with
kthread_run_on_cpu() to simplify the code.

Link: https://lkml.kernel.org/r/20211022025711.3673-4-caihuoqing@baidu.com
Signed-off-by: Cai Huoqing
Cc: Bernard Metzler
Cc: Daniel Bristot de Oliveira
Cc: Davidlohr Bueso
Cc: Doug Ledford
Cc: Ingo Molnar
Cc: Jason Gunthorpe
Cc: Joel Fernandes (Google)
Cc: Josh Triplett
Cc: Lai Jiangshan
Cc: Mathieu Desnoyers
Cc: "Paul E . McKenney"
Cc: Steven Rostedt
Signed-off-by: Andrew Morton
---
 kernel/trace/ring_buffer.c |    7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

--- a/kernel/trace/ring_buffer.c~ring-buffer-make-use-of-the-helper-function-kthread_run_on_cpu
+++ a/kernel/trace/ring_buffer.c
@@ -5898,16 +5898,13 @@ static __init int test_ringbuffer(void)
 		rb_data[cpu].buffer = buffer;
 		rb_data[cpu].cpu = cpu;
 		rb_data[cpu].cnt = cpu;
-		rb_threads[cpu] = kthread_create(rb_test, &rb_data[cpu],
-						 "rbtester/%d", cpu);
+		rb_threads[cpu] = kthread_run_on_cpu(rb_test, &rb_data[cpu],
+						     cpu, "rbtester/%u");
 		if (WARN_ON(IS_ERR(rb_threads[cpu]))) {
 			pr_cont("FAILED\n");
 			ret = PTR_ERR(rb_threads[cpu]);
 			goto out_free;
 		}
-
-		kthread_bind(rb_threads[cpu], cpu);
-		wake_up_process(rb_threads[cpu]);
 	}
 	/* Now create the rb hammer! */

From patchwork Fri Jan 14 22:03:02 2022
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12714029
Date: Fri, 14 Jan 2022 14:03:02 -0800
From: Andrew Morton
To: akpm@linux-foundation.org, bmt@zurich.ibm.com, bristot@kernel.org,
 caihuoqing@baidu.com, dave@stgolabs.net, dledford@redhat.com, jgg@ziepe.ca,
 jiangshanlai@gmail.com, joel@joelfernandes.org, josh@joshtriplett.org,
 linux-mm@kvack.org, mathieu.desnoyers@efficios.com, mingo@redhat.com,
 mm-commits@vger.kernel.org, paulmck@kernel.org, rostedt@goodmis.org,
 torvalds@linux-foundation.org
Subject: [patch 004/146] rcutorture: make use of the helper function kthread_run_on_cpu()
Message-ID: <20220114220302.NBU0-uX6r%akpm@linux-foundation.org>
In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org>

From: Cai Huoqing
Subject: rcutorture: make use of the helper function kthread_run_on_cpu()

Replace kthread_create_on_node/kthread_bind/wake_up_process() with
kthread_run_on_cpu() to simplify the code.

Link: https://lkml.kernel.org/r/20211022025711.3673-5-caihuoqing@baidu.com
Signed-off-by: Cai Huoqing
Cc: Bernard Metzler
Cc: Daniel Bristot de Oliveira
Cc: Davidlohr Bueso
Cc: Doug Ledford
Cc: Ingo Molnar
Cc: Jason Gunthorpe
Cc: Joel Fernandes (Google)
Cc: Josh Triplett
Cc: Lai Jiangshan
Cc: Mathieu Desnoyers
Cc: "Paul E . McKenney"
Cc: Steven Rostedt
Signed-off-by: Andrew Morton
---
 kernel/rcu/rcutorture.c |    7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

--- a/kernel/rcu/rcutorture.c~rcutorture-make-use-of-the-helper-function-kthread_run_on_cpu
+++ a/kernel/rcu/rcutorture.c
@@ -1992,9 +1992,8 @@ static int rcutorture_booster_init(unsig
 	mutex_lock(&boost_mutex);
 	rcu_torture_disable_rt_throttle();
 	VERBOSE_TOROUT_STRING("Creating rcu_torture_boost task");
-	boost_tasks[cpu] = kthread_create_on_node(rcu_torture_boost, NULL,
-						  cpu_to_node(cpu),
-						  "rcu_torture_boost");
+	boost_tasks[cpu] = kthread_run_on_cpu(rcu_torture_boost, NULL,
+					      cpu, "rcu_torture_boost_%u");
 	if (IS_ERR(boost_tasks[cpu])) {
 		retval = PTR_ERR(boost_tasks[cpu]);
 		VERBOSE_TOROUT_STRING("rcu_torture_boost task create failed");
@@ -2003,8 +2002,6 @@ static int rcutorture_booster_init(unsig
 		mutex_unlock(&boost_mutex);
 		return retval;
 	}
-	kthread_bind(boost_tasks[cpu], cpu);
-	wake_up_process(boost_tasks[cpu]);
 	mutex_unlock(&boost_mutex);
 	return 0;
 }

From patchwork Fri Jan 14 22:03:06 2022
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12714030
Date: Fri, 14 Jan 2022 14:03:06 -0800
From: Andrew Morton
To: akpm@linux-foundation.org, bmt@zurich.ibm.com, bristot@kernel.org,
 caihuoqing@baidu.com, dave@stgolabs.net, dledford@redhat.com, jgg@ziepe.ca,
 jiangshanlai@gmail.com, joel@joelfernandes.org, josh@joshtriplett.org,
 linux-mm@kvack.org, mathieu.desnoyers@efficios.com, mingo@redhat.com,
 mm-commits@vger.kernel.org, paulmck@kernel.org, rostedt@goodmis.org,
 torvalds@linux-foundation.org
Subject: [patch 005/146] trace/osnoise: make use of the helper function kthread_run_on_cpu()
Message-ID: <20220114220306.E3xj0fyX8%akpm@linux-foundation.org>
In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org>

From: Cai Huoqing
Subject: trace/osnoise: make use of the helper function kthread_run_on_cpu()

Replace kthread_create_on_cpu/wake_up_process() with
kthread_run_on_cpu() to simplify the code.

Link: https://lkml.kernel.org/r/20211022025711.3673-6-caihuoqing@baidu.com
Signed-off-by: Cai Huoqing
Cc: Bernard Metzler
Cc: Daniel Bristot de Oliveira
Cc: Davidlohr Bueso
Cc: Doug Ledford
Cc: Ingo Molnar
Cc: Jason Gunthorpe
Cc: Joel Fernandes (Google)
Cc: Josh Triplett
Cc: Lai Jiangshan
Cc: Mathieu Desnoyers
Cc: "Paul E . McKenney"
Cc: Steven Rostedt
Signed-off-by: Andrew Morton
---
 kernel/trace/trace_osnoise.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/kernel/trace/trace_osnoise.c~trace-osnoise-make-use-of-the-helper-function-kthread_run_on_cpu
+++ a/kernel/trace/trace_osnoise.c
@@ -1701,7 +1701,7 @@ static int start_kthread(unsigned int cp
 		snprintf(comm, 24, "osnoise/%d", cpu);
 	}
-	kthread = kthread_create_on_cpu(main, NULL, cpu, comm);
+	kthread = kthread_run_on_cpu(main, NULL, cpu, comm);
 	if (IS_ERR(kthread)) {
 		pr_err(BANNER "could not start sampling thread\n");
@@ -1710,7 +1710,6 @@ static int start_kthread(unsigned int cp
 	}
 	per_cpu(per_cpu_osnoise_var, cpu).kthread = kthread;
-	wake_up_process(kthread);
 	return 0;
 }

From patchwork Fri Jan 14 22:03:10 2022
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12714031
Date: Fri, 14 Jan 2022 14:03:10 -0800
From: Andrew Morton
To: akpm@linux-foundation.org, bmt@zurich.ibm.com, bristot@kernel.org,
 caihuoqing@baidu.com, dave@stgolabs.net, dledford@redhat.com, jgg@ziepe.ca,
 jiangshanlai@gmail.com, joel@joelfernandes.org, josh@joshtriplett.org,
 linux-mm@kvack.org, mathieu.desnoyers@efficios.com, mingo@redhat.com,
 mm-commits@vger.kernel.org, paulmck@kernel.org, rostedt@goodmis.org,
 torvalds@linux-foundation.org
Subject: [patch 006/146] trace/hwlat: make use of the helper function kthread_run_on_cpu()
Message-ID: <20220114220310.cKAJFgLZs%akpm@linux-foundation.org>
In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org>

From: Cai Huoqing
Subject: trace/hwlat: make use of the helper function kthread_run_on_cpu()

Replace kthread_create_on_cpu/wake_up_process() with
kthread_run_on_cpu() to simplify the code.

Link: https://lkml.kernel.org/r/20211022025711.3673-7-caihuoqing@baidu.com
Signed-off-by: Cai Huoqing
Cc: Bernard Metzler
Cc: Daniel Bristot de Oliveira
Cc: Davidlohr Bueso
Cc: Doug Ledford
Cc: Ingo Molnar
Cc: Jason Gunthorpe
Cc: Joel Fernandes (Google)
Cc: Josh Triplett
Cc: Lai Jiangshan
Cc: Mathieu Desnoyers
Cc: "Paul E . McKenney"
Cc: Steven Rostedt
Signed-off-by: Andrew Morton
---
 kernel/trace/trace_hwlat.c |    6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

--- a/kernel/trace/trace_hwlat.c~trace-hwlat-make-use-of-the-helper-function-kthread_run_on_cpu
+++ a/kernel/trace/trace_hwlat.c
@@ -491,18 +491,14 @@ static void stop_per_cpu_kthreads(void)
 static int start_cpu_kthread(unsigned int cpu)
 {
 	struct task_struct *kthread;
-	char comm[24];
-	snprintf(comm, 24, "hwlatd/%d", cpu);
-
-	kthread = kthread_create_on_cpu(kthread_fn, NULL, cpu, comm);
+	kthread = kthread_run_on_cpu(kthread_fn, NULL, cpu, "hwlatd/%u");
 	if (IS_ERR(kthread)) {
 		pr_err(BANNER "could not start sampling thread\n");
 		return -ENOMEM;
 	}
 	per_cpu(hwlat_per_cpu_data, cpu).kthread = kthread;
-	wake_up_process(kthread);
 	return 0;
 }

From patchwork Fri Jan 14 22:03:13 2022
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12714033
Date: Fri, 14 Jan 2022 14:03:13 -0800
From: Andrew Morton
To: akpm@linux-foundation.org, davidcomponentone@gmail.com,
 linux-mm@kvack.org, mm-commits@vger.kernel.org,
 torvalds@linux-foundation.org, yang.guang5@zte.com.cn, zealci@zte.com.cn
Subject: [patch 007/146] ia64: module: use swap() to make code cleaner
Message-ID: <20220114220313._cn9cqHJT%akpm@linux-foundation.org>
In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org>

From: Yang Guang
Subject: ia64: module: use swap() to make code cleaner

Use the macro 'swap()' defined in 'include/linux/minmax.h' to avoid
open-coding it.

Link: https://lkml.kernel.org/r/20211104062642.1506539-1-yang.guang5@zte.com.cn
Signed-off-by: Yang Guang
Reported-by: Zeal Robot
Cc: David Yang
Signed-off-by: Andrew Morton
---
 arch/ia64/kernel/module.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

--- a/arch/ia64/kernel/module.c~ia64-module-use-swap-to-make-code-cleaner
+++ a/arch/ia64/kernel/module.c
@@ -848,7 +848,7 @@ register_unwind_table (struct module *mo
 {
 	struct unw_table_entry *start = (void *) mod->arch.unwind->sh_addr;
 	struct unw_table_entry *end = start + mod->arch.unwind->sh_size / sizeof (*start);
-	struct unw_table_entry tmp, *e1, *e2, *core, *init;
+	struct unw_table_entry *e1, *e2, *core, *init;
 	unsigned long num_init = 0, num_core = 0;
 	/* First, count how many init and core unwind-table entries there are. */
@@ -865,9 +865,7 @@ register_unwind_table (struct module *mo
 	for (e1 = start; e1 < end; ++e1) {
 		for (e2 = e1 + 1; e2 < end; ++e2) {
 			if (e2->start_offset < e1->start_offset) {
-				tmp = *e1;
-				*e1 = *e2;
-				*e2 = tmp;
+				swap(*e1, *e2);
 			}
 		}
 	}

From patchwork Fri Jan 14 22:03:16 2022
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12714034
smtp.kernel.org (Postfix) with ESMTPSA id E87EFC36AE5; Fri, 14 Jan 2022 22:03:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197797; bh=WFzDB0RhxBdKuLWNL/xqpojlHFldianfv0A/q3T/7yI=; h=Date:From:To:Subject:In-Reply-To:From; b=GbeSRuQc2pcSu6DMSuSSNIQn59XU8fJME3xIjnETt9o+UYg1eZBzN+THOkTE2JEWu lzPeGQy4lkRbCJ9g8hJwwTY5XbmnmL2GZoW0Q5y5dJ64C63ueY5WDUsBIA3faQyv0/ EZWdRga225N/FbA8aYHACXJwKm3KlEEm62Khl0z0= Date: Fri, 14 Jan 2022 14:03:16 -0800 From: Andrew Morton To: akpm@linux-foundation.org, davidcomponentone@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, yang.guang5@zte.com.cn, zealci@zte.com.cn Subject: [patch 008/146] arch/ia64/kernel/setup.c: use swap() to make code cleaner Message-ID: <20220114220316.7j_Se__l2%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=GbeSRuQc; dmarc=none; spf=pass (imf18.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Stat-Signature: p3ko6c9yiijzdcqy5thmyqjah4jmu9er X-Rspamd-Queue-Id: 956ED1C0003 X-Rspamd-Server: rspam12 X-HE-Tag: 1642197799-355732 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Yang Guang Subject: arch/ia64/kernel/setup.c: use swap() to make code cleaner Use the macro 'swap()' defined in 'include/linux/minmax.h' to avoid opencoding it. 
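As an illustration of the open-coded exchange being replaced, here is a minimal user-space sketch, not kernel code. It assumes a GNU C compiler for __typeof__; the macro below is only a stand-in for the kernel's swap(), and struct region and sort_regions_sketch() are hypothetical names that mirror the loop shape of sort_regions().

```c
#include <assert.h>
#include <stddef.h>

/*
 * User-space stand-in for the kernel's swap(); relies on the GNU C
 * __typeof__ extension (GCC/Clang). Works for any assignable type,
 * including whole structs.
 */
#define swap(a, b) \
	do { __typeof__(a) tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Hypothetical stand-in for struct rsvd_region. */
struct region {
	unsigned long start;
	unsigned long end;
};

/* Bubble sort by ascending start, same loop shape as in the patch. */
static void sort_regions_sketch(struct region *r, size_t max)
{
	size_t j;

	assert(r != NULL || max == 0);

	while (max--) {
		for (j = 0; j < max; ++j) {
			if (r[j].start > r[j + 1].start)
				swap(r[j], r[j + 1]);
		}
	}
}
```

Calling sort_regions_sketch(r, n) on an array of n entries orders it by ascending start. The whole struct is exchanged, so each end value moves together with its start, and no temporary has to be declared at the call site.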
Link: https://lkml.kernel.org/r/20211104001908.695110-1-yang.guang5@zte.com.cn Reported-by: Zeal Robot Signed-off-by: Yang Guang Cc: David Yang Signed-off-by: Andrew Morton --- arch/ia64/kernel/setup.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) --- a/arch/ia64/kernel/setup.c~ia64-use-swap-to-make-code-cleaner +++ a/arch/ia64/kernel/setup.c @@ -208,10 +208,7 @@ sort_regions (struct rsvd_region *rsvd_r while (max--) { for (j = 0; j < max; ++j) { if (rsvd_region[j].start > rsvd_region[j+1].start) { - struct rsvd_region tmp; - tmp = rsvd_region[j]; - rsvd_region[j] = rsvd_region[j + 1]; - rsvd_region[j + 1] = tmp; + swap(rsvd_region[j], rsvd_region[j + 1]); } } } From patchwork Fri Jan 14 22:03:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714035 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7E85BC433FE for ; Fri, 14 Jan 2022 22:03:23 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1406F6B0089; Fri, 14 Jan 2022 17:03:23 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 0A4966B008A; Fri, 14 Jan 2022 17:03:22 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E0F636B008C; Fri, 14 Jan 2022 17:03:22 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0021.hostedemail.com [216.40.44.21]) by kanga.kvack.org (Postfix) with ESMTP id C2CBE6B0089 for ; Fri, 14 Jan 2022 17:03:22 -0500 (EST) Received: from smtpin25.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 8581C998D6 for ; Fri, 14 Jan 2022 22:03:22 +0000 (UTC) X-FDA: 79030269444.25.E51884B Received: from dfw.source.kernel.org 
(dfw.source.kernel.org [139.178.84.217]) by imf02.hostedemail.com (Postfix) with ESMTP id 5E24E80002 for ; Fri, 14 Jan 2022 22:03:21 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id B1DED61FD7; Fri, 14 Jan 2022 22:03:20 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id EA6BBC36AE9; Fri, 14 Jan 2022 22:03:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197800; bh=8h2BgT3ZMcQgq/WkmRNRPLVea1gYYnRdpxuR56cd7Jk=; h=Date:From:To:Subject:In-Reply-To:From; b=F7mayFiRGSuCe8l7j6onNbOEdP7IyKdm3vHLNebdvSTu/eu1DpnwtlPUZqV+UXN0q y8uaP/RE2icpd/bFeU/b9Qbee2BEiDk2wLhKd4IQyXmYb3dzycnV3QitTSKtT3Tyof LuvNXUKKKl+y5wi/kA7pZ8EtthBQY82eWnBZ8qtE= Date: Fri, 14 Jan 2022 14:03:19 -0800 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, wangborong@cdjrlc.com Subject: [patch 009/146] ia64: fix typo in a comment Message-ID: <20220114220319.N8AxJActa%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 5E24E80002 X-Stat-Signature: wsi353n6fmiwnho8718wfjuot33c6qdi Authentication-Results: imf02.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=F7mayFiR; spf=pass (imf02.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-Rspamd-Server: rspam07 X-HE-Tag: 1642197801-995793 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Jason Wang Subject: ia64: fix typo in a comment The word `the' is duplicated in a comment; remove the
duplicate. Link: https://lkml.kernel.org/r/20211113030316.22650-1-wangborong@cdjrlc.com Signed-off-by: Jason Wang Signed-off-by: Andrew Morton --- arch/ia64/kernel/uncached.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/arch/ia64/kernel/uncached.c~ia64-fix-typo-in-a-comment +++ a/arch/ia64/kernel/uncached.c @@ -171,7 +171,7 @@ failed: * @n_pages: number of contiguous pages to allocate * * Allocate the specified number of contiguous uncached pages on the - * the requested node. If not enough contiguous uncached pages are available + * requested node. If not enough contiguous uncached pages are available * on the requested node, roundrobin starting with the next higher node. */ unsigned long uncached_alloc_page(int starting_nid, int n_pages) From patchwork Fri Jan 14 22:03:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714036 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id E2965C433F5 for ; Fri, 14 Jan 2022 22:03:25 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 7D0786B008C; Fri, 14 Jan 2022 17:03:25 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 75B1B6B0092; Fri, 14 Jan 2022 17:03:25 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5D4AB6B0093; Fri, 14 Jan 2022 17:03:25 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0109.hostedemail.com [216.40.44.109]) by kanga.kvack.org (Postfix) with ESMTP id 3FB2D6B008C for ; Fri, 14 Jan 2022 17:03:25 -0500 (EST) Received: from smtpin25.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id F34C91826B6AD for ; Fri, 14 Jan 2022 22:03:24
+0000 (UTC) X-FDA: 79030269528.25.CABDB32 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf06.hostedemail.com (Postfix) with ESMTP id 7FA8718000C for ; Fri, 14 Jan 2022 22:03:24 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id C2FED61FF0; Fri, 14 Jan 2022 22:03:23 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id EB793C36AE9; Fri, 14 Jan 2022 22:03:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197803; bh=n5C8xwI8l8pZqI3rdPSbxUwmk06mCoSvKTRrF5aI3to=; h=Date:From:To:Subject:In-Reply-To:From; b=oTNs1KRYHo+JW9/JLodVYuPrbxgd1LFuX7WWP2mwgFVAFVC3hg78pCIMEueTY12/t cSN62kAYir/hm95vmLiCb9FFVK5ZInfzJPgveuVFUAkHGYJIDR7CsW0YH+xtHOHlGv z8uPWEiAh0PY1L+4ynhF6wu3nFolx5+Wgr0hxTbw= Date: Fri, 14 Jan 2022 14:03:22 -0800 From: Andrew Morton To: akpm@linux-foundation.org, david@redhat.com, gregkh@linuxfoundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, rppt@kernel.org, torvalds@linux-foundation.org Subject: [patch 010/146] ia64: topology: use default_groups in kobj_type Message-ID: <20220114220322.BAnKJNfcw%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Stat-Signature: kjy8b99e511odoi633zkhyk1x8w77cio Authentication-Results: imf06.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=oTNs1KRY; dmarc=none; spf=pass (imf06.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: 7FA8718000C X-HE-Tag: 1642197804-331270 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: 
owner-majordomo@kvack.org List-ID: From: Greg Kroah-Hartman Subject: ia64: topology: use default_groups in kobj_type There are currently 2 ways to create a set of sysfs files for a kobj_type, through the default_attrs field, and the default_groups field. Move the ia64 topology sysfs code to use default_groups field which has been the preferred way since aa30f47cf666 ("kobject: Add support for default attribute groups to kobj_type") so that we can soon get rid of the obsolete default_attrs field. Link: https://lkml.kernel.org/r/20220104154800.1287947-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman Cc: Mike Rapoport Cc: David Hildenbrand Signed-off-by: Andrew Morton --- arch/ia64/kernel/topology.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) --- a/arch/ia64/kernel/topology.c~ia64-topology-use-default_groups-in-kobj_type +++ a/arch/ia64/kernel/topology.c @@ -264,6 +264,7 @@ static struct attribute * cache_default_ &shared_cpu_map.attr, NULL }; +ATTRIBUTE_GROUPS(cache_default); #define to_object(k) container_of(k, struct cache_info, kobj) #define to_attr(a) container_of(a, struct cache_attr, attr) @@ -284,7 +285,7 @@ static const struct sysfs_ops cache_sysf static struct kobj_type cache_ktype = { .sysfs_ops = &cache_sysfs_ops, - .default_attrs = cache_default_attrs, + .default_groups = cache_default_groups, }; static struct kobj_type cache_ktype_percpu_entry = { From patchwork Fri Jan 14 22:03:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714037 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id DC786C433FE for ; Fri, 14 Jan 2022 22:03:29 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 6E6326B0095; Fri, 14 Jan 2022 17:03:29 -0500 (EST) Received: by 
kanga.kvack.org (Postfix, from userid 40) id 620026B0096; Fri, 14 Jan 2022 17:03:29 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4E92E6B0098; Fri, 14 Jan 2022 17:03:29 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0172.hostedemail.com [216.40.44.172]) by kanga.kvack.org (Postfix) with ESMTP id 39C686B0095 for ; Fri, 14 Jan 2022 17:03:29 -0500 (EST) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id EB331998D9 for ; Fri, 14 Jan 2022 22:03:28 +0000 (UTC) X-FDA: 79030269696.18.CB13245 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf30.hostedemail.com (Postfix) with ESMTP id 88EE480003 for ; Fri, 14 Jan 2022 22:03:28 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 938C4B82630; Fri, 14 Jan 2022 22:03:27 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0D0DDC36AE9; Fri, 14 Jan 2022 22:03:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197806; bh=u3wrRiPF3Z6D6kDRf06hq6ekLBjCD2dUUJrMyl0La7Q=; h=Date:From:To:Subject:In-Reply-To:From; b=qiPK2ZbgtLpZ6QvFikz2kyT+MrdZj65x+tXeXxmVJeimtsKucYbiw7hh9Msgd7muk Gm9TiwSTj/C1O55FUeDDh7CYFMVuEQ/EsqXvihQhsFOuxuoKme3JPMJh5g60TBNd9x UWFFwIdU+ZgP9rQiYddwVIUH6yxJptUuPIMX8BHE= Date: Fri, 14 Jan 2022 14:03:25 -0800 From: Andrew Morton To: akpm@linux-foundation.org, colin.king@intel.com, dfustini@baylibre.com, gustavoars@kernel.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sven@narfation.org, tom.saeger@oracle.com, torvalds@linux-foundation.org, zuoqilin@yulong.com Subject: [patch 011/146] scripts/spelling.txt: add "oveflow" Message-ID: 
<20220114220325.gt0PHk9Tz%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam11 X-Rspamd-Queue-Id: 88EE480003 X-Stat-Signature: r6385yxcb7msbmb4r9cayyenbewi37dz Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=qiPK2Zbg; dmarc=none; spf=pass (imf30.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1642197808-940133 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Drew Fustini Subject: scripts/spelling.txt: add "oveflow" Add typo "oveflow" for "overflow". This typo was found and fixed in tools/testing/selftests/bpf/prog_tests/btf_dump.c Link: https://lore.kernel.org/all/20211122070528.837806-1-dfustini@baylibre.com/ Link: https://lkml.kernel.org/r/20211122072302.839102-1-dfustini@baylibre.com Signed-off-by: Drew Fustini Suggested-by: Gustavo A. R. 
Silva Cc: Colin Ian King Cc: Drew Fustini Cc: zuoqilin Cc: Tom Saeger Cc: Sven Eckelmann Signed-off-by: Andrew Morton --- scripts/spelling.txt | 1 + 1 file changed, 1 insertion(+) --- a/scripts/spelling.txt~scripts-spellingtxt-add-oveflow +++ a/scripts/spelling.txt @@ -1046,6 +1046,7 @@ oustanding||outstanding overaall||overall overhread||overhead overlaping||overlapping +oveflow||overflow overflw||overflow overlfow||overflow overide||override From patchwork Fri Jan 14 22:03:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714038 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C362DC433FE for ; Fri, 14 Jan 2022 22:03:32 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 539096B0099; Fri, 14 Jan 2022 17:03:32 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 4C0616B009A; Fri, 14 Jan 2022 17:03:32 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 362B86B009B; Fri, 14 Jan 2022 17:03:32 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0201.hostedemail.com [216.40.44.201]) by kanga.kvack.org (Postfix) with ESMTP id 1CEAE6B0099 for ; Fri, 14 Jan 2022 17:03:32 -0500 (EST) Received: from smtpin03.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id BFD34998CB for ; Fri, 14 Jan 2022 22:03:31 +0000 (UTC) X-FDA: 79030269822.03.738C5AE Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf28.hostedemail.com (Postfix) with ESMTP id B3C6FC0002 for ; Fri, 14 Jan 2022 22:03:30 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 1E0BA61FEE; Fri, 14 Jan 2022 22:03:30 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3CE36C36AE9; Fri, 14 Jan 2022 22:03:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197809; bh=qUC9HiRVD4PD7AiKof2Jd3v5ns4F17ZMp58gBOozCsc=; h=Date:From:To:Subject:In-Reply-To:From; b=AM7fM0iLS3sQ/YsnmUtr11ynLSIwAbYA6QeKJTFLO1MRhtQfSznKprEtnRzTOypYg YBlG0ykQpUKUL7yJD/dzlnRk+QUL220uicKHV8yv6GDB60FsYxO9Muhg9ju0h9Igp/ cRd7OvBswLZRedERJBYk1xmSKpsZ/0V1icXKfsew= Date: Fri, 14 Jan 2022 14:03:28 -0800 From: Andrew Morton To: abaci@linux.alibaba.com, akpm@linux-foundation.org, anton@tuxera.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, rdunlap@infradead.org, torvalds@linux-foundation.org, yang.lee@linux.alibaba.com Subject: [patch 012/146] fs/ntfs/attrib.c: fix one kernel-doc comment Message-ID: <20220114220328.-PVS4dlKr%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: B3C6FC0002 X-Stat-Signature: fxjyfb37heqeqhfwozp95q6xefs7w8um Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=AM7fM0iL; spf=pass (imf28.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642197810-776260 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Yang Li Subject: fs/ntfs/attrib.c: fix one kernel-doc comment /** * attrib.c - NTFS attribute operations. 
Part of the Linux-NTFS The file header comment should not be in kernel-doc format: scripts/kernel-doc mistakes it for the documentation of the function ntfs_map_runlist_nolock() and emits the following warnings. fs/ntfs/attrib.c:25: warning: Incorrect use of kernel-doc format: * ntfs_map_runlist_nolock - map (a part of) a runlist of an ntfs inode fs/ntfs/attrib.c:71: warning: Function parameter or member 'ni' not described in 'ntfs_map_runlist_nolock' fs/ntfs/attrib.c:71: warning: Function parameter or member 'vcn' not described in 'ntfs_map_runlist_nolock' fs/ntfs/attrib.c:71: warning: Function parameter or member 'ctx' not described in 'ntfs_map_runlist_nolock' fs/ntfs/attrib.c:71: warning: expecting prototype for attrib.c - NTFS attribute operations. Part of the Linux(). Prototype was for ntfs_map_runlist_nolock() instead Link: https://lkml.kernel.org/r/20220106015145.67067-1-yang.lee@linux.alibaba.com Signed-off-by: Yang Li Reported-by: Abaci Robot Acked-by: Randy Dunlap Cc: Anton Altaparmakov Signed-off-by: Andrew Morton --- fs/ntfs/attrib.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/fs/ntfs/attrib.c~ntfs-fix-one-kernel-doc-comment +++ a/fs/ntfs/attrib.c @@ -1,5 +1,5 @@ // SPDX-License-Identifier: GPL-2.0-or-later -/** +/* * attrib.c - NTFS attribute operations. Part of the Linux-NTFS project. * * Copyright (c) 2001-2012 Anton Altaparmakov and Tuxera Inc.
From patchwork Fri Jan 14 22:03:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714039 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 78A95C4332F for ; Fri, 14 Jan 2022 22:03:35 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 118B16B009B; Fri, 14 Jan 2022 17:03:35 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 0A2996B009C; Fri, 14 Jan 2022 17:03:35 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E845E6B009D; Fri, 14 Jan 2022 17:03:34 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0103.hostedemail.com [216.40.44.103]) by kanga.kvack.org (Postfix) with ESMTP id CE9816B009B for ; Fri, 14 Jan 2022 17:03:34 -0500 (EST) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 9151B998D9 for ; Fri, 14 Jan 2022 22:03:34 +0000 (UTC) X-FDA: 79030269948.12.35EB661 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf21.hostedemail.com (Postfix) with ESMTP id E9FA51C0005 for ; Fri, 14 Jan 2022 22:03:33 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 4A8BD61FF0; Fri, 14 Jan 2022 22:03:33 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6D26CC36AE9; Fri, 14 Jan 2022 22:03:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197812; bh=qgTwgytU62x3ftLkNF/pxm98+myLzfnACdaOnksfyw4=; 
h=Date:From:To:Subject:In-Reply-To:From; b=xkwq3AU6W7cYXR5uAAje09hFKir99TIfMOwb51DOA9CynvdEbKy0xlVJKDPbitvbR roWYvxWoyMFspktgPikxm/aWdaxF6ol4VvQPMOiTL904qqkr4zP5TWINezAJlL+oQh qmWDqNhcPUH8gcI1BwpU+ZPNN7PlYMw5JmCIKU5Y= Date: Fri, 14 Jan 2022 14:03:31 -0800 From: Andrew Morton To: akpm@linux-foundation.org, houtao1@huawei.com, linux-mm@kvack.org, miaoxie@huawei.com, mm-commits@vger.kernel.org, phillip@squashfs.org.uk, torvalds@linux-foundation.org, yi.zhang@huawei.com, zhengliang6@huawei.com Subject: [patch 013/146] squashfs: provide backing_dev_info in order to disable read-ahead Message-ID: <20220114220331.FqvedamY0%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=xkwq3AU6; dmarc=none; spf=pass (imf21.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Stat-Signature: ujcd1yozu8888rhwc5d1389haeo74ngd X-Rspamd-Queue-Id: E9FA51C0005 X-Rspamd-Server: rspam12 X-HE-Tag: 1642197813-943967 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Zheng Liang Subject: squashfs: provide backing_dev_info in order to disable read-ahead Commit c1f6925e1091 ("mm: put readahead pages in cache earlier") causes the read performance of squashfs to deteriorate. Testing shows that performance recovers when readahead is disabled for squashfs. So follow the approach of ubifs: provide a backing_dev_info and disable read-ahead. -------------------------------------------------------------------- We collected the following data with fio. squashfs image blocksize=128K test command: fio --name basic --bs=? --filename="/mnt/test_file" --rw=?
--iodepth=1 --ioengine=psync --runtime=200 --time_based

turn on squashfs readahead in 5.10 kernel

bs(k)   read/randread   MB/s
4       randread        271
128     randread        231
1024    randread        246
4       read            310
128     read            245
1024    read            247

turn off squashfs readahead in 5.10 kernel

bs(k)   read/randread   MB/s
4       randread        293
128     randread        330
1024    randread        363
4       read            338
128     read            360
1024    read            365

turn on squashfs readahead and revert the commit c1f6925e1091("mm: put readahead pages in cache earlier") in 5.10 kernel

bs(k)   read/randread   MB/s
4       randread        289
128     randread        306
1024    randread        335
4       read            337
128     read            336
1024    read            338

Link: https://lkml.kernel.org/r/20211116113141.1391026-1-zhengliang6@huawei.com Signed-off-by: Zheng Liang Reviewed-by: Phillip Lougher Cc: Zhang Yi Cc: Hou Tao Cc: Miao Xie Signed-off-by: Andrew Morton --- fs/squashfs/super.c | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) --- a/fs/squashfs/super.c~squashfs-provides-backing_dev_info-in-order-to-disable-read-ahead +++ a/fs/squashfs/super.c @@ -29,6 +29,7 @@ #include #include #include +#include #include "squashfs_fs.h" #include "squashfs_fs_sb.h" @@ -112,6 +113,24 @@ static const struct squashfs_decompresso return decompressor; } +static int squashfs_bdi_init(struct super_block *sb) +{ + int err; + unsigned int major = MAJOR(sb->s_dev); + unsigned int minor = MINOR(sb->s_dev); + + bdi_put(sb->s_bdi); + sb->s_bdi = &noop_backing_dev_info; + + err = super_setup_bdi_name(sb, "squashfs_%u_%u", major, minor); + if (err) + return err; + + sb->s_bdi->ra_pages = 0; + sb->s_bdi->io_pages = 0; + + return 0; +} static int squashfs_fill_super(struct super_block *sb, struct fs_context *fc) { @@ -127,6 +146,20 @@ static int squashfs_fill_super(struct su TRACE("Entered squashfs_fill_superblock\n"); + /* + * squashfs provides 'backing_dev_info' in order to disable read-ahead. For + * squashfs, I/O is not deferred, it is done immediately in readpage, + * which means the user would always have to wait their own I/O.
So the effect + * of readahead is very weak for squashfs. squashfs_bdi_init will set + * sb->s_bdi->ra_pages and sb->s_bdi->io_pages to 0 and close readahead for + * squashfs. + */ + err = squashfs_bdi_init(sb); + if (err) { + errorf(fc, "squashfs init bdi failed"); + return err; + } + sb->s_fs_info = kzalloc(sizeof(*msblk), GFP_KERNEL); if (sb->s_fs_info == NULL) { ERROR("Failed to allocate squashfs_sb_info\n"); From patchwork Fri Jan 14 22:03:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714040 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 97A19C433EF for ; Fri, 14 Jan 2022 22:03:39 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 204BF6B009E; Fri, 14 Jan 2022 17:03:39 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 18D506B009F; Fri, 14 Jan 2022 17:03:39 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 005C16B00A0; Fri, 14 Jan 2022 17:03:38 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0100.hostedemail.com [216.40.44.100]) by kanga.kvack.org (Postfix) with ESMTP id DCAB66B009E for ; Fri, 14 Jan 2022 17:03:38 -0500 (EST) Received: from smtpin19.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 9E3C982F4BEE for ; Fri, 14 Jan 2022 22:03:38 +0000 (UTC) X-FDA: 79030270116.19.E1850DE Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf05.hostedemail.com (Postfix) with ESMTP id 37081100007 for ; Fri, 14 Jan 2022 22:03:38 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 
(256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 3F7ADB8262F; Fri, 14 Jan 2022 22:03:37 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A27FDC36AE9; Fri, 14 Jan 2022 22:03:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197816; bh=u+LK7xStWhkzJS7v1TapIAfJyzhoww3EmEoACQ3I534=; h=Date:From:To:Subject:In-Reply-To:From; b=YDqU61Qf0lRKCRlJ80SD8Yc7jX/vRN2kZ2LntfVdaB2Jnvm0sIaGeqme7SMDE2Xqd VGxjQS96StrCTZyE/6tt2m9NBb4t4lAvDZ6UyIUwRg8ZAN4lvj/1MMmen20hlzvNhE y484cfc+ThkwrhHsRg4tBZLVqxlWiymBQb5p2K9o= Date: Fri, 14 Jan 2022 14:03:35 -0800 From: Andrew Morton To: akpm@linux-foundation.org, gechangwei@live.cn, ghe@suse.com, jlbec@evilplan.org, joseph.qi@linux.alibaba.com, junxiao.bi@oracle.com, linux-mm@kvack.org, mark@fasheh.com, mm-commits@vger.kernel.org, piaojun@huawei.com, torvalds@linux-foundation.org, zealci@zte.com.cn, zhang.mingyu@zte.com.cn Subject: [patch 014/146] ocfs2: use BUG_ON instead of if condition followed by BUG. Message-ID: <20220114220335.-FP0r81CP%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 37081100007 X-Stat-Signature: 71pb93gj4m53bixrhiwy5dycemqoh3m1 Authentication-Results: imf05.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=YDqU61Qf; spf=pass (imf05.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642197818-681757 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000574, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Zhang Mingyu Subject: ocfs2: use BUG_ON instead of if condition followed by BUG. This issue was detected with the help of Coccinelle. 
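The transformation can be sketched in user space as follows. This is only an illustration under stated assumptions: BUG() and BUG_ON() here are local stand-ins built on abort(), not the kernel definitions, and struct obj, grab() and the two take_ref_*() helpers are hypothetical names playing the role of the inode and igrab(). In this sketch both forms call grab() exactly once and abort only when it fails.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* User-space stand-ins, not the kernel definitions. */
#define BUG() \
	do { \
		fprintf(stderr, "BUG at %s:%d\n", __FILE__, __LINE__); \
		abort(); \
	} while (0)
#define BUG_ON(cond) do { if (cond) BUG(); } while (0)

/* Hypothetical refcounted object standing in for struct inode. */
struct obj {
	int refcount;
};

/* Plays the role of igrab(): take a reference, or return NULL on failure. */
static struct obj *grab(struct obj *o)
{
	if (!o)
		return NULL;
	assert(o->refcount >= 0);	/* sanity check for this sketch only */
	o->refcount++;
	return o;
}

/* Before: an if condition followed by BUG(). */
static void take_ref_open_coded(struct obj *o)
{
	if (!grab(o))
		BUG();
}

/* After: the same check expressed with BUG_ON(). */
static void take_ref_bug_on(struct obj *o)
{
	BUG_ON(!grab(o));
}
```

Both helpers leave the reference count one higher on success, so the rewrite is a change of form only in this sketch. The side effect lives inside the BUG_ON() argument, which is why the stand-in macro must always evaluate its condition.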
Link: https://lkml.kernel.org/r/20211105014424.75372-1-zhang.mingyu@zte.com.cn Signed-off-by: Zhang Mingyu Reported-by: Zeal Robot Acked-by: Joseph Qi Cc: Mark Fasheh Cc: Joel Becker Cc: Junxiao Bi Cc: Changwei Ge Cc: Gang He Cc: Jun Piao Signed-off-by: Andrew Morton --- fs/ocfs2/journal.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) --- a/fs/ocfs2/journal.c~ocfs2-use-bug_on-instead-of-if-condition-followed-by-bug +++ a/fs/ocfs2/journal.c @@ -1669,8 +1669,7 @@ static int ocfs2_replay_journal(struct o status = jbd2_journal_load(journal); if (status < 0) { mlog_errno(status); - if (!igrab(inode)) - BUG(); + BUG_ON(!igrab(inode)); jbd2_journal_destroy(journal); goto done; } @@ -1699,8 +1698,7 @@ static int ocfs2_replay_journal(struct o if (status < 0) mlog_errno(status); - if (!igrab(inode)) - BUG(); + BUG_ON(!igrab(inode)); jbd2_journal_destroy(journal); From patchwork Fri Jan 14 22:03:38 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714041 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0D860C433F5 for ; Fri, 14 Jan 2022 22:03:43 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 86CD76B009F; Fri, 14 Jan 2022 17:03:42 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 7F46F6B00A1; Fri, 14 Jan 2022 17:03:42 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6955C6B00A2; Fri, 14 Jan 2022 17:03:42 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0093.hostedemail.com [216.40.44.93]) by kanga.kvack.org (Postfix) with ESMTP id 53C4B6B009F for ; Fri, 14 Jan 2022 17:03:42 -0500 (EST) Received: from smtpin12.hostedemail.com 
(10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 1E66A9879B for ; Fri, 14 Jan 2022 22:03:42 +0000 (UTC) X-FDA: 79030270284.12.817745B Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf30.hostedemail.com (Postfix) with ESMTP id 9FB8780003 for ; Fri, 14 Jan 2022 22:03:41 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 9E0A4B8262E; Fri, 14 Jan 2022 22:03:40 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id EE5DCC36AE5; Fri, 14 Jan 2022 22:03:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197819; bh=Jpr0gDxBG2c1I+rPzuOJa8KfT9w3itjw0aD0P63usnE=; h=Date:From:To:Subject:In-Reply-To:From; b=rg8J1UL42cIdvG/e5nqyq9dSFPxwhQdHtC+c4Z7zscdXvqYeFHOza/k4ZPcFVGIWI hpX7BQTGp4NgRTqMnVzKPEFHYvk86gZGlYMix7IvV4Kt61Bc4Zn1aoRj/zXhCgToIY LLAmJFi4rjN0ULfEGh+jMffDq9CVOHO9MiJ3m274= Date: Fri, 14 Jan 2022 14:03:38 -0800 From: Andrew Morton To: akpm@linux-foundation.org, dan.carpenter@oracle.com, gechangwei@live.cn, ghe@suse.com, jlbec@evilplan.org, joseph.qi@linux.alibaba.com, junxiao.bi@oracle.com, linux-mm@kvack.org, mark@fasheh.com, mm-commits@vger.kernel.org, piaojun@huawei.com, torvalds@linux-foundation.org Subject: [patch 015/146] ocfs2: clearly handle ocfs2_grab_pages_for_write() return value Message-ID: <20220114220338.3M11cdFzD%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 9FB8780003 X-Stat-Signature: 5p3hn6fy56bktk9duc5364puzskiaxg6 Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=rg8J1UL4; spf=pass (imf30.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as 
permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-Rspamd-Server: rspam07 X-HE-Tag: 1642197821-806627 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000001, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Joseph Qi Subject: ocfs2: clearly handle ocfs2_grab_pages_for_write() return value ocfs2_grab_pages_for_write() may return -EAGAIN if write context type is mmap and it could not lock the target page. In this case, we exit with no error and no target page. And then trigger the caller page_mkwrite() to retry. Since there are other caller types, e.g. buffer and direct io, make the return value handling more clear. Link: https://lkml.kernel.org/r/20211206065051.103353-1-joseph.qi@linux.alibaba.com Signed-off-by: Joseph Qi Reported-by: Dan Carpenter Cc: Mark Fasheh Cc: Joel Becker Cc: Junxiao Bi Cc: Changwei Ge Cc: Gang He Cc: Jun Piao Signed-off-by: Andrew Morton --- fs/ocfs2/aops.c | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) --- a/fs/ocfs2/aops.c~ocfs2-clearly-handle-ocfs2_grab_pages_for_write-return-value +++ a/fs/ocfs2/aops.c @@ -1799,20 +1799,20 @@ try_again: */ ret = ocfs2_grab_pages_for_write(mapping, wc, wc->w_cpos, pos, len, cluster_of_pages, mmap_page); - if (ret && ret != -EAGAIN) { - mlog_errno(ret); - goto out_quota; - } + if (ret) { + /* + * ocfs2_grab_pages_for_write() returns -EAGAIN if it could not lock + * the target page. In this case, we exit with no error and no target + * page. This will trigger the caller, page_mkwrite(), to re-try + * the operation. + */ + if (type == OCFS2_WRITE_MMAP && ret == -EAGAIN) { + BUG_ON(wc->w_target_page); + ret = 0; + goto out_quota; + } - /* - * ocfs2_grab_pages_for_write() returns -EAGAIN if it could not lock - * the target page. In this case, we exit with no error and no target - * page. This will trigger the caller, page_mkwrite(), to re-try - * the operation. 
- */ - if (ret == -EAGAIN) { - BUG_ON(wc->w_target_page); - ret = 0; + mlog_errno(ret); goto out_quota; } From patchwork Fri Jan 14 22:03:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714042 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 91F6DC433F5 for ; Fri, 14 Jan 2022 22:03:45 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 18ED86B00A1; Fri, 14 Jan 2022 17:03:45 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 116A56B00A3; Fri, 14 Jan 2022 17:03:45 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id ED32B6B00A4; Fri, 14 Jan 2022 17:03:44 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0076.hostedemail.com [216.40.44.76]) by kanga.kvack.org (Postfix) with ESMTP id D85CB6B00A1 for ; Fri, 14 Jan 2022 17:03:44 -0500 (EST) Received: from smtpin08.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id A10891827251D for ; Fri, 14 Jan 2022 22:03:44 +0000 (UTC) X-FDA: 79030270368.08.AD1C7DA Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf18.hostedemail.com (Postfix) with ESMTP id EB5261C0003 for ; Fri, 14 Jan 2022 22:03:43 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 47AD461FF8; Fri, 14 Jan 2022 22:03:43 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 42A78C36AE5; Fri, 14 Jan 2022 22:03:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; 
d=linux-foundation.org; s=korg; t=1642197822; bh=HJbql4+3mauyMikf7+eBfBDh1LQCYCeuNiH0JIFBsqM=; h=Date:From:To:Subject:In-Reply-To:From; b=dOGPolUFpW1JC0IsK7XE7VcGNPcFCpSen0mOaq0pBWHZXLOOWydAfrxMB2AUkt9Xk EihYNYdjhZ/xPiQ+zjrgAdlE/r59SFzX1bsNMGFaB2Ec9Wp+x81BOXTmSwnO04dRxs VtXzwifF59lHBrCKS0mE95LNLoFgZINWRyKo6pG0= Date: Fri, 14 Jan 2022 14:03:41 -0800 From: Andrew Morton To: akpm@linux-foundation.org, gechangwei@live.cn, ghe@suse.com, gregkh@linuxfoundation.org, jlbec@evilplan.org, joseph.qi@linux.alibaba.com, junxiao.bi@oracle.com, linux-mm@kvack.org, mark@fasheh.com, mm-commits@vger.kernel.org, piaojun@huawei.com, torvalds@linux-foundation.org Subject: [patch 016/146] ocfs2: use default_groups in kobj_type Message-ID: <20220114220341.DGgJjC07l%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: EB5261C0003 X-Stat-Signature: a7fn38i5y79a7hce4hxaywox14rnfo57 Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=dOGPolUF; dmarc=none; spf=pass (imf18.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam08 X-HE-Tag: 1642197823-897482 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Greg Kroah-Hartman Subject: ocfs2: use default_groups in kobj_type There are currently 2 ways to create a set of sysfs files for a kobj_type, through the default_attrs field, and the default_groups field. Move the ocfs2 code to use default_groups field which has been the preferred way since aa30f47cf666 ("kobject: Add support for default attribute groups to kobj_type") so that we can soon get rid of the obsolete default_attrs field. 
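Illustration only, not part of the patch: a simplified userspace model of what ATTRIBUTE_GROUPS() generates, to show where the ocfs2_filecheck_groups symbol used in the diff comes from. The structs are cut down to the fields needed here, and MODEL_ATTRIBUTE_GROUPS and the demo_* names are hypothetical. The real macro lives in include/linux/sysfs.h.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down models of the kernel structs; only the fields used here. */
struct attribute {
	const char *name;
};

struct attribute_group {
	struct attribute **attrs;
};

/*
 * Model of ATTRIBUTE_GROUPS(name): given an existing name##_attrs array,
 * emit one name##_group wrapping it and a NULL-terminated name##_groups
 * list that a kobj_type can point .default_groups at.
 */
#define MODEL_ATTRIBUTE_GROUPS(_name) \
static const struct attribute_group _name##_group = { \
	.attrs = _name##_attrs, \
}; \
static const struct attribute_group *_name##_groups[] = { \
	&_name##_group, \
	NULL, \
}

static struct attribute demo_attr_check = { .name = "check" };
static struct attribute demo_attr_fix = { .name = "fix" };
static struct attribute demo_attr_set = { .name = "set" };

static struct attribute *demo_filecheck_attrs[] = {
	&demo_attr_check,
	&demo_attr_fix,
	&demo_attr_set,
	NULL
};
MODEL_ATTRIBUTE_GROUPS(demo_filecheck);

/* Walk every group and count the attributes it exposes. */
static size_t count_attrs(const struct attribute_group *const *groups)
{
	size_t n = 0;

	for (; *groups; groups++)
		for (struct attribute **a = (*groups)->attrs; *a; a++)
			n++;
	return n;
}
```

The same three attributes are reachable through the groups list, so switching from .default_attrs to .default_groups does not change which sysfs files appear.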
Link: https://lkml.kernel.org/r/20211228144517.391660-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman Acked-by: Joseph Qi Cc: Mark Fasheh Cc: Joel Becker Cc: Junxiao Bi Cc: Changwei Ge Cc: Gang He Cc: Jun Piao Signed-off-by: Andrew Morton --- fs/ocfs2/filecheck.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) --- a/fs/ocfs2/filecheck.c~ocfs2-use-default_groups-in-kobj_type +++ a/fs/ocfs2/filecheck.c @@ -94,6 +94,7 @@ static struct attribute *ocfs2_filecheck &ocfs2_filecheck_attr_set.attr, NULL }; +ATTRIBUTE_GROUPS(ocfs2_filecheck); static void ocfs2_filecheck_release(struct kobject *kobj) { @@ -138,7 +139,7 @@ static const struct sysfs_ops ocfs2_file }; static struct kobj_type ocfs2_ktype_filecheck = { - .default_attrs = ocfs2_filecheck_attrs, + .default_groups = ocfs2_filecheck_groups, .sysfs_ops = &ocfs2_filecheck_ops, .release = ocfs2_filecheck_release, }; From patchwork Fri Jan 14 22:03:45 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714043 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 99C38C433EF for ; Fri, 14 Jan 2022 22:03:51 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 3149E6B00A3; Fri, 14 Jan 2022 17:03:51 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 29C9B6B00A5; Fri, 14 Jan 2022 17:03:51 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 117146B00A6; Fri, 14 Jan 2022 17:03:51 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0121.hostedemail.com [216.40.44.121]) by kanga.kvack.org (Postfix) with ESMTP id 004B46B00A3 for ; Fri, 14 Jan 2022 17:03:50 -0500 (EST) Received: from smtpin20.hostedemail.com 
(10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id B679C9879B for ; Fri, 14 Jan 2022 22:03:50 +0000 (UTC) X-FDA: 79030270620.20.E24C437 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf08.hostedemail.com (Postfix) with ESMTP id 756A216000C for ; Fri, 14 Jan 2022 22:03:47 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 9BAF461FEE; Fri, 14 Jan 2022 22:03:46 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 947CBC36AE5; Fri, 14 Jan 2022 22:03:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197826; bh=KGBgco/Qg404H8hFSNDm02SFLKyBLNcFUj8aSRtbYg0=; h=Date:From:To:Subject:In-Reply-To:From; b=U3gp0LihsO4oM5ACeKGVsf0jhzD8k/+OjDZi6gBBEIYeufLg4bOK0h02Fh9DA3D0L zFmLFFgYIDBfvP2S5abfwjVllYbvaAc6MF8Dv2zvVArV7scXEmqs8/81tCZGKiMbBZ MfvpV6//C+3rdgCxdki9gKIsybsKayJ264aKQpw8= Date: Fri, 14 Jan 2022 14:03:45 -0800 From: Andrew Morton To: akpm@linux-foundation.org, colin.i.king@gmail.com, gechangwei@live.cn, ghe@suse.com, jlbec@evilplan.org, joseph.qi@linux.alibaba.com, junxiao.bi@oracle.com, linux-mm@kvack.org, mark@fasheh.com, mm-commits@vger.kernel.org, piaojun@huawei.com, torvalds@linux-foundation.org Subject: [patch 017/146] ocfs2: remove redundant assignment to pointer root_bh Message-ID: <20220114220345.-mGCap73R%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 756A216000C X-Stat-Signature: ipc7rayb1pubcnxsa8pfhx5ruw4ushq8 Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=U3gp0Lih; dmarc=none; spf=pass (imf08.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as 
permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam08 X-HE-Tag: 1642197827-934158 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Colin Ian King Subject: ocfs2: remove redundant assignment to pointer root_bh Pointer root_bh is being initialized with a value that is not read, it is being re-assigned later on closer to its use. The early initialization is redundant and can be removed. Link: https://lkml.kernel.org/r/20211228013719.620923-1-colin.i.king@gmail.com Signed-off-by: Colin Ian King Acked-by: Joseph Qi Cc: Mark Fasheh Cc: Joel Becker Cc: Junxiao Bi Cc: Changwei Ge Cc: Gang He Cc: Jun Piao Signed-off-by: Andrew Morton --- fs/ocfs2/alloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/fs/ocfs2/alloc.c~ocfs2-remove-redundant-assignment-to-pointer-root_bh +++ a/fs/ocfs2/alloc.c @@ -2040,7 +2040,7 @@ static void ocfs2_complete_edge_insert(h int i, idx; struct ocfs2_extent_list *el, *left_el, *right_el; struct ocfs2_extent_rec *left_rec, *right_rec; - struct buffer_head *root_bh = left_path->p_node[subtree_index].bh; + struct buffer_head *root_bh; /* * Update the counts and position values within all the From patchwork Fri Jan 14 22:03:48 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714044 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id E757BC433F5 for ; Fri, 14 Jan 2022 22:03:52 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 75C886B00A5; Fri, 14 Jan 2022 17:03:52 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 6E3E56B00A7; Fri, 14 Jan 2022 17:03:52 -0500 (EST) X-Delivered-To: 
int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 55D6C6B00A8; Fri, 14 Jan 2022 17:03:52 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0150.hostedemail.com [216.40.44.150]) by kanga.kvack.org (Postfix) with ESMTP id 326E86B00A5 for ; Fri, 14 Jan 2022 17:03:52 -0500 (EST) Received: from smtpin03.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id F060A98C29 for ; Fri, 14 Jan 2022 22:03:51 +0000 (UTC) X-FDA: 79030270662.03.D2BDF9B Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf16.hostedemail.com (Postfix) with ESMTP id 8059418000C for ; Fri, 14 Jan 2022 22:03:51 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 74DE6B82A39; Fri, 14 Jan 2022 22:03:50 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DD743C36AE5; Fri, 14 Jan 2022 22:03:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197829; bh=GpiJ2iTyuMfTog8vtRIZa1jO8WtE6H+rsP4YsjNdwpw=; h=Date:From:To:Subject:In-Reply-To:From; b=1RzNT0bfJMsjN3ygxZL/Y3az/9wv581NrpLd2cMnPQ6OqhKBM+rktj96W+dyXlwWH DgIlgJR8POZLOWP7PpMdaGZld/9YmEswAcG68FsWb0imINEw8kQt2dx3H5qKHUQvSt OUHtmZa1IKXmGQeQ31PYfMhvuORuvUnuhEf2Wc6U= Date: Fri, 14 Jan 2022 14:03:48 -0800 From: Andrew Morton To: akpm@linux-foundation.org, gechangwei@live.cn, ghe@suse.com, gregkh@linuxfoundation.org, jlbec@evilplan.org, joseph.qi@linux.alibaba.com, junxiao.bi@oracle.com, linux-mm@kvack.org, mark@fasheh.com, mm-commits@vger.kernel.org, piaojun@huawei.com, torvalds@linux-foundation.org Subject: [patch 018/146] ocfs2: cluster: use default_groups in kobj_type Message-ID: <20220114220348.KsXCW5R8X%akpm@linux-foundation.org> In-Reply-To: 
<20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 8059418000C X-Stat-Signature: 5we14jopmkge3ikcbtuhmefqjwqmsfos Authentication-Results: imf16.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=1RzNT0bf; spf=pass (imf16.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642197831-271626 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Greg Kroah-Hartman Subject: ocfs2: cluster: use default_groups in kobj_type There are currently 2 ways to create a set of sysfs files for a kobj_type, through the default_attrs field, and the default_groups field. Move the ocfs2 cluster sysfs code to use default_groups field which has been the preferred way since aa30f47cf666 ("kobject: Add support for default attribute groups to kobj_type") so that we can soon get rid of the obsolete default_attrs field. 
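Illustration only, not part of the patch: a userspace sketch of why the array is renamed and why filling it at init time still works. ATTRIBUTE_GROUPS(mlog_default) looks for an array named mlog_default_attrs, and the generated group only stores a pointer to that array, so entries written later by the init function are visible through the group. All demo_* names and DEMO_MAX_BITS are hypothetical, and the structs are cut down to the fields used.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_BITS 8

struct attribute {
	const char *name;
	unsigned short mode;
};

struct attribute_group {
	struct attribute **attrs;
};

struct demo_mlog_attribute {
	struct attribute attr;
};

/* Trailing entries are zero-filled, so mode == 0 terminates the walk. */
static struct demo_mlog_attribute demo_attrs[DEMO_MAX_BITS] = {
	{ .attr = { .name = "TCP", .mode = 0644 } },
	{ .attr = { .name = "KTHREAD", .mode = 0644 } },
};

/* Starts out empty; populated by demo_sys_init() below. */
static struct attribute *demo_default_attrs[DEMO_MAX_BITS] = { NULL, };

/* What a macro like ATTRIBUTE_GROUPS(demo_default) would generate. */
static const struct attribute_group demo_default_group = {
	.attrs = demo_default_attrs,
};
static const struct attribute_group *demo_default_groups[] = {
	&demo_default_group,
	NULL,
};

/* Mirrors the loop in the patch; returns how many attributes were set up. */
static int demo_sys_init(void)
{
	int i = 0;

	/* The bound keeps room for the NULL terminator in this sketch. */
	while (i < DEMO_MAX_BITS - 1 && demo_attrs[i].attr.mode) {
		demo_default_attrs[i] = &demo_attrs[i].attr;
		i++;
	}
	demo_default_attrs[i] = NULL;
	return i;
}
```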
Link: https://lkml.kernel.org/r/20220106102028.3345634-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman Reviewed-by: Joseph Qi Tested-by: Joseph Qi Cc: Mark Fasheh Cc: Joel Becker Cc: Junxiao Bi Cc: Changwei Ge Cc: Gang He Cc: Jun Piao Signed-off-by: Andrew Morton --- fs/ocfs2/cluster/masklog.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) --- a/fs/ocfs2/cluster/masklog.c~ocfs2-cluster-use-default_groups-in-kobj_type +++ a/fs/ocfs2/cluster/masklog.c @@ -120,7 +120,8 @@ static struct mlog_attribute mlog_attrs[ define_mask(KTHREAD), }; -static struct attribute *mlog_attr_ptrs[MLOG_MAX_BITS] = {NULL, }; +static struct attribute *mlog_default_attrs[MLOG_MAX_BITS] = {NULL, }; +ATTRIBUTE_GROUPS(mlog_default); static ssize_t mlog_show(struct kobject *obj, struct attribute *attr, char *buf) @@ -144,8 +145,8 @@ static const struct sysfs_ops mlog_attr_ }; static struct kobj_type mlog_ktype = { - .default_attrs = mlog_attr_ptrs, - .sysfs_ops = &mlog_attr_ops, + .default_groups = mlog_default_groups, + .sysfs_ops = &mlog_attr_ops, }; static struct kset mlog_kset = { @@ -157,10 +158,10 @@ int mlog_sys_init(struct kset *o2cb_kset int i = 0; while (mlog_attrs[i].attr.mode) { - mlog_attr_ptrs[i] = &mlog_attrs[i].attr; + mlog_default_attrs[i] = &mlog_attrs[i].attr; i++; } - mlog_attr_ptrs[i] = NULL; + mlog_default_attrs[i] = NULL; kobject_set_name(&mlog_kset.kobj, "logmask"); mlog_kset.kobj.kset = o2cb_kset; From patchwork Fri Jan 14 22:03:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714045 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3E472C433F5 for ; Fri, 14 Jan 2022 22:03:57 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C3DC06B00A7; Fri, 14 Jan 2022 
17:03:56 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id BC4416B00A9; Fri, 14 Jan 2022 17:03:56 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A66266B00AA; Fri, 14 Jan 2022 17:03:56 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0146.hostedemail.com [216.40.44.146]) by kanga.kvack.org (Postfix) with ESMTP id 8BC7A6B00A7 for ; Fri, 14 Jan 2022 17:03:56 -0500 (EST) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 4D1F180EC672 for ; Fri, 14 Jan 2022 22:03:56 +0000 (UTC) X-FDA: 79030270872.12.B5D113C Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf03.hostedemail.com (Postfix) with ESMTP id CB77F20006 for ; Fri, 14 Jan 2022 22:03:54 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id CFBA0B82A26; Fri, 14 Jan 2022 22:03:53 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 40722C36AE9; Fri, 14 Jan 2022 22:03:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197832; bh=e20GBA0OwqvknUZTAmxB+UM/XpDlxjDw0NO9T5731pM=; h=Date:From:To:Subject:In-Reply-To:From; b=mi1wdk46vNVKyErSwrYPp4la6SedFc9EZQuqtwIjr9upUrWuCbs+ADNxVTr6O1bAD ns/vPQG+cwO4qoh7JrJsq+psq15b+LXJuh/truIMQgaBlfUdfM6uMS+ycZoxOi7BqH ugDcImjLwGNjnsVmZTGPq+B5/iPHvv/QOwkcSCss= Date: Fri, 14 Jan 2022 14:03:51 -0800 From: Andrew Morton To: akpm@linux-foundation.org, colin.i.king@gmail.com, gechangwei@live.cn, ghe@suse.com, jlbec@evilplan.org, joseph.qi@linux.alibaba.com, junxiao.bi@oracle.com, linux-mm@kvack.org, mark@fasheh.com, mm-commits@vger.kernel.org, piaojun@huawei.com, torvalds@linux-foundation.org Subject: 
[patch 019/146] ocfs2: remove redundant assignment to variable free_space Message-ID: <20220114220351.qb2sDMT4e%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: CB77F20006 X-Stat-Signature: 8g1sj8quttbowcdncmna9n4gkyoz4e4p Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=mi1wdk46; dmarc=none; spf=pass (imf03.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-HE-Tag: 1642197834-803517 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Colin Ian King Subject: ocfs2: remove redundant assignment to variable free_space Variable free_space is being initialized with a value that is not read, it is being re-assigned later in the two paths of an if statement. The early initialization is redundant and can be removed. 
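Illustration only, not part of the patch: a small sketch of the dead-store pattern described above. The function names, parameters and DEMO_TRAILER_SIZE are hypothetical and do not reproduce ocfs2_find_dir_space_id(). The sketch only shows that an initializer overwritten on both paths of an if statement cannot affect the result.

```c
#include <assert.h>

#define DEMO_TRAILER_SIZE 64u

/* Before: the initial value is never read. */
static unsigned int free_space_before(unsigned int blocksize,
				      unsigned int used, int has_trailer)
{
	unsigned int free_space = blocksize;	/* dead store */

	if (has_trailer)
		free_space = blocksize - DEMO_TRAILER_SIZE - used;
	else
		free_space = blocksize - used;
	return free_space;
}

/* After: same logic, with the redundant initializer dropped. */
static unsigned int free_space_after(unsigned int blocksize,
				     unsigned int used, int has_trailer)
{
	unsigned int free_space;

	if (has_trailer)
		free_space = blocksize - DEMO_TRAILER_SIZE - used;
	else
		free_space = blocksize - used;
	return free_space;
}
```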
Link: https://lkml.kernel.org/r/20220112230411.1090761-1-colin.i.king@gmail.com Signed-off-by: Colin Ian King Acked-by: Joseph Qi Cc: Mark Fasheh Cc: Joel Becker Cc: Junxiao Bi Cc: Changwei Ge Cc: Gang He Cc: Jun Piao Signed-off-by: Andrew Morton --- fs/ocfs2/dir.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/fs/ocfs2/dir.c~ocfs2-remove-redundant-assignment-to-variable-free_space +++ a/fs/ocfs2/dir.c @@ -3343,7 +3343,7 @@ static int ocfs2_find_dir_space_id(struc struct ocfs2_dir_entry *de, *last_de = NULL; char *de_buf, *limit; unsigned long offset = 0; - unsigned int rec_len, new_rec_len, free_space = dir->i_sb->s_blocksize; + unsigned int rec_len, new_rec_len, free_space; /* * This calculates how many free bytes we'd have in block zero, should From patchwork Fri Jan 14 22:03:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714046 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8149FC433EF for ; Fri, 14 Jan 2022 22:03:58 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E12696B00A9; Fri, 14 Jan 2022 17:03:57 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id CFE206B00AB; Fri, 14 Jan 2022 17:03:57 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B9E2F6B00AC; Fri, 14 Jan 2022 17:03:57 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0052.hostedemail.com [216.40.44.52]) by kanga.kvack.org (Postfix) with ESMTP id 9CF106B00A9 for ; Fri, 14 Jan 2022 17:03:57 -0500 (EST) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 69F26998B4 for ; Fri, 14 Jan 2022 
22:03:57 +0000 (UTC) X-FDA: 79030270914.29.AEAC0EA Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf30.hostedemail.com (Postfix) with ESMTP id 139DE80009 for ; Fri, 14 Jan 2022 22:03:56 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 678CE61FE2; Fri, 14 Jan 2022 22:03:56 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8A900C36AE9; Fri, 14 Jan 2022 22:03:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197835; bh=ZNao/APMSnVraYf7V2i353zBOJ2zo6kziYgfy8pkItk=; h=Date:From:To:Subject:In-Reply-To:From; b=yhL8IHrnWE47GhxUbYUgzmEXKgX9CG7o+ckNKTCTakC6psqYzaXRFDI4obHPepTpg uAsaakpZh50lFvlr3vjb0jGDYhbt0NPKLru0fmmvI8Gc64a8xDYpUKULT5O38fiqMe HIH2kNAl8S1DJulKh2+2AGJcObw+/WcDizcvnJvw= Date: Fri, 14 Jan 2022 14:03:55 -0800 From: Andrew Morton To: akpm@linux-foundation.org, amit.kachhap@arm.com, Kevin.Brodsky@arm.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, Vincenzo.Frascino@arm.com, viro@zeniv.linux.org.uk Subject: [patch 020/146] fs/ioctl: remove unnecessary __user annotation Message-ID: <20220114220355.BH05XczvO%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 139DE80009 X-Stat-Signature: 4dyfzj7ojd9hywmxeo3d1drrhjd83qkc Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=yhL8IHrn; dmarc=none; spf=pass (imf30.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-HE-Tag: 1642197836-423905 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: 
owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Amit Daniel Kachhap Subject: fs/ioctl: remove unnecessary __user annotation __user annotations are used by the checker (e.g. sparse) to mark user pointers. However, here __user is applied to a struct directly, without a pointer being directly involved. Although the presence of __user does not cause sparse to emit a warning, __user should be removed for consistency with other uses of offsetof(). Note: No functional changes intended. Link: https://lkml.kernel.org/r/20211122101256.7875-1-amit.kachhap@arm.com Signed-off-by: Amit Daniel Kachhap Cc: Vincenzo Frascino Cc: Kevin Brodsky Cc: Al Viro Signed-off-by: Andrew Morton --- fs/ioctl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/fs/ioctl.c~fs-ioctl-remove-unnecessary-__user-annotation +++ a/fs/ioctl.c @@ -430,7 +430,7 @@ static int ioctl_file_dedupe_range(struc goto out; } - size = offsetof(struct file_dedupe_range __user, info[count]); + size = offsetof(struct file_dedupe_range, info[count]); if (size > PAGE_SIZE) { ret = -ENOMEM; goto out; From patchwork Fri Jan 14 22:03:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714047 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7D186C433FE for ; Fri, 14 Jan 2022 22:04:02 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 0A34C6B00AB; Fri, 14 Jan 2022 17:04:02 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 02CB96B00AD; Fri, 14 Jan 2022 17:04:01 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DE7D66B00AE; Fri, 14 Jan 2022 17:04:01 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from
forelay.hostedemail.com (smtprelay0216.hostedemail.com [216.40.44.216]) by kanga.kvack.org (Postfix) with ESMTP id C71E96B00AB for ; Fri, 14 Jan 2022 17:04:01 -0500 (EST) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 8C975181953FC for ; Fri, 14 Jan 2022 22:04:01 +0000 (UTC) X-FDA: 79030271082.06.E48F371 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf21.hostedemail.com (Postfix) with ESMTP id 783811C0006 for ; Fri, 14 Jan 2022 22:04:00 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id C11C761FFA; Fri, 14 Jan 2022 22:03:59 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B7327C36AE9; Fri, 14 Jan 2022 22:03:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197839; bh=iOVyw2IzQSARaWNOIL3nAQMty30KC8WIeK0cp2QJhL8=; h=Date:From:To:Subject:In-Reply-To:From; b=QCQTMvAIal2ShTmLmYp3amdPcmFbO2oSKRCpFQ9s1wBfPh2sfp6qzK5xWf35Q7zJS KMc6Dt54qytQ3FbsLUFJ6DOvrfFLSN9MrNXtDLI7Ep3wtffRTYPoFfBpGI5eyhiucI RwfAVDh5klhGzKrXWUvHCf1EoaffzGjqPD0SZN3E= Date: Fri, 14 Jan 2022 14:03:58 -0800 From: Andrew Morton To: akpm@linux-foundation.org, cl@linux.com, dvyukov@google.com, elver@google.com, glider@google.com, iamjoonsoo.kim@lge.com, linux-mm@kvack.org, mingo@redhat.com, mm-commits@vger.kernel.org, penberg@kernel.org, rientjes@google.com, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 021/146] mm/slab_common: use WARN() if cache still has objects on destroy Message-ID: <20220114220358.-PVz8K_7b%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 783811C0006 X-Stat-Signature: azzqbaii6no1kr6y6gnng9zcsg7wt5yp 
Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=QCQTMvAI; spf=pass (imf21.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none
X-Rspamd-Server: rspam07
X-HE-Tag: 1642197840-625052
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Marco Elver
Subject: mm/slab_common: use WARN() if cache still has objects on destroy

Calling kmem_cache_destroy() while the cache still has objects allocated
is a kernel bug, and will usually result in the entire cache being leaked.
While the message in kmem_cache_destroy() resembles a warning, it is
currently not implemented using a real WARN().

This is problematic for infrastructure testing the kernel, all of which
rely on the specific format of WARN()s to pick up on bugs.

Some 13 years ago this used to be a simple WARN_ON() in slub, but
d629d8195793 ("slub: improve kmem_cache_destroy() error message") changed
it into an open-coded warning to avoid confusion with a bug in slub
itself.

Instead, turn the open-coded warning into a real WARN() with the message
preserved, so that test systems can actually identify these issues, and we
get all the other benefits of using a normal WARN().  The warning message
is extended with "when called from " to make it even clearer where the
fault lies.

For most configurations this is only a cosmetic change, however, note that
WARN() here will now also respect panic_on_warn.
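As an illustration of the point above, here is a minimal userspace sketch (not kernel code) of why a warning with a fixed, recognizable format is easier for test tooling to pick up than an open-coded message. Everything in it is an assumption made for the example: warn_if() is a hypothetical stand-in for WARN(), the "WARNING: " prefix and the last_report buffer are invented for the sketch, and fake_shutdown_cache()/fake_cache_destroy() only mimic the shape of the patched call site. The real WARN() additionally dumps a backtrace and honours panic_on_warn, which this sketch does not model.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Last report emitted, kept in a buffer so a test harness can inspect it. */
static char last_report[256];

/*
 * Hypothetical stand-in for WARN(): if cond is non-zero, format a report
 * with a fixed "WARNING: " prefix that tooling can match on.
 * Returns 1 if the warning fired and 0 otherwise, so callers can branch
 * on it in the same way kernel code branches on WARN()'s result.
 */
static int warn_if(int cond, const char *fmt, ...)
{
	va_list ap;
	int off;

	if (!cond)
		return 0;

	off = snprintf(last_report, sizeof(last_report), "WARNING: ");
	va_start(ap, fmt);
	vsnprintf(last_report + off, sizeof(last_report) - off, fmt, ap);
	va_end(ap);

	fputs(last_report, stderr);
	fputc('\n', stderr);
	return 1;
}

/* Hypothetical shutdown routine: fails (returns -1) if objects remain. */
static int fake_shutdown_cache(int live_objects)
{
	return live_objects ? -1 : 0;
}

/* Mimics the shape of the patched call site: condition and message in one call. */
static int fake_cache_destroy(const char *name, int live_objects)
{
	return warn_if(fake_shutdown_cache(live_objects),
		       "%s %s: cache still has objects", __func__, name);
}
```

A caller would invoke fake_cache_destroy("demo", 3) and could then match the "WARNING: " prefix in the captured output, which is roughly what kernel test infrastructure does with real WARN() splats.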
Link: https://lkml.kernel.org/r/20211102170733.648216-1-elver@google.com Signed-off-by: Marco Elver Reviewed-by: Vlastimil Babka Acked-by: David Rientjes Cc: Christoph Lameter Cc: Pekka Enberg Cc: Joonsoo Kim Cc: Dmitry Vyukov Cc: Alexander Potapenko Cc: Ingo Molnar Signed-off-by: Andrew Morton --- mm/slab_common.c | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-) --- a/mm/slab_common.c~mm-slab_common-use-warn-if-cache-still-has-objects-on-destroy +++ a/mm/slab_common.c @@ -489,8 +489,6 @@ void slab_kmem_cache_release(struct kmem void kmem_cache_destroy(struct kmem_cache *s) { - int err; - if (unlikely(!s)) return; @@ -501,12 +499,9 @@ void kmem_cache_destroy(struct kmem_cach if (s->refcount) goto out_unlock; - err = shutdown_cache(s); - if (err) { - pr_err("%s %s: Slab cache still has objects\n", - __func__, s->name); - dump_stack(); - } + WARN(shutdown_cache(s), + "%s %s: Slab cache still has objects when called from %pS", + __func__, s->name, (void *)_RET_IP_); out_unlock: mutex_unlock(&slab_mutex); cpus_read_unlock(); From patchwork Fri Jan 14 22:04:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714048 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C1178C433EF for ; Fri, 14 Jan 2022 22:04:06 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 55F446B00AD; Fri, 14 Jan 2022 17:04:06 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 4E6826B00AF; Fri, 14 Jan 2022 17:04:06 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 338E76B00B0; Fri, 14 Jan 2022 17:04:06 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0129.hostedemail.com 
[216.40.44.129]) by kanga.kvack.org (Postfix) with ESMTP id 2191E6B00AD for ; Fri, 14 Jan 2022 17:04:06 -0500 (EST) Received: from smtpin17.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id D1B391810C7DF for ; Fri, 14 Jan 2022 22:04:05 +0000 (UTC) X-FDA: 79030271250.17.460DDEB Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf27.hostedemail.com (Postfix) with ESMTP id B06C340008 for ; Fri, 14 Jan 2022 22:04:04 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id AF708B8262F; Fri, 14 Jan 2022 22:04:03 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 08155C36AE9; Fri, 14 Jan 2022 22:04:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197842; bh=ix0Jn+5IN+yQioaMJD/5xRuRxcjZ6mnQSrrUSRw3vPE=; h=Date:From:To:Subject:In-Reply-To:From; b=X4yHtB5o6EhDvsZ1FfegyvcKLGZtRKSSAuW0o16y0xl3DhnQNzpuZzSQDLRdP8TYH n8PDyRkZDidwT0tpvdqWgr7NE3ZKB0LDztSBcic88Zgh4BOkazoHuHNawnvJ3cQXeg 2sOSqAhgV4gdFJxA0itbyKnnFBoTguXP+9FbuYW4= Date: Fri, 14 Jan 2022 14:04:01 -0800 From: Andrew Morton To: akpm@linux-foundation.org, cl@linux.com, iamjoonsoo.kim@lge.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, penberg@kernel.org, rientjes@google.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 022/146] mm: slab: make slab iterator functions static Message-ID: <20220114220401.-Yi3naZJB%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=X4yHtB5o; dmarc=none; spf=pass (imf27.hostedemail.com: domain of akpm@linux-foundation.org 
designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org
X-Stat-Signature: bdnueh93uuwboz943dxsnhx9qqp3qmsa
X-Rspamd-Queue-Id: B06C340008
X-Rspamd-Server: rspam12
X-HE-Tag: 1642197844-910523
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Muchun Song
Subject: mm: slab: make slab iterator functions static

There are no external users of slab_start/next/stop(), so make them
static.  Also, memory.kmem.slabinfo is deprecated and now outputs
nothing, so move memcg_slab_show() into mm/memcontrol.c and rename it to
mem_cgroup_slab_show() to be consistent with other function names.

Link: https://lkml.kernel.org/r/20211109133359.32881-1-songmuchun@bytedance.com
Signed-off-by: Muchun Song
Reviewed-by: Vlastimil Babka
Cc: Christoph Lameter
Cc: Pekka Enberg
Cc: David Rientjes
Cc: Joonsoo Kim
Signed-off-by: Andrew Morton
---

 mm/memcontrol.c  |   13 ++++++++++++-
 mm/slab.h        |    5 -----
 mm/slab_common.c |   17 +++--------------
 3 files changed, 15 insertions(+), 20 deletions(-)

--- a/mm/memcontrol.c~mm-slab-make-slab-iterator-functions-static
+++ a/mm/memcontrol.c
@@ -4845,6 +4845,17 @@ out_kfree:
 	return ret;
 }
 
+#if defined(CONFIG_MEMCG_KMEM) && (defined(CONFIG_SLAB) || defined(CONFIG_SLUB_DEBUG))
+static int mem_cgroup_slab_show(struct seq_file *m, void *p)
+{
+	/*
+	 * Deprecated.
+	 * Please, take a look at tools/cgroup/slabinfo.py .
+ */ + return 0; +} +#endif + static struct cftype mem_cgroup_legacy_files[] = { { .name = "usage_in_bytes", @@ -4945,7 +4956,7 @@ static struct cftype mem_cgroup_legacy_f (defined(CONFIG_SLAB) || defined(CONFIG_SLUB_DEBUG)) { .name = "kmem.slabinfo", - .seq_show = memcg_slab_show, + .seq_show = mem_cgroup_slab_show, }, #endif { --- a/mm/slab_common.c~mm-slab-make-slab-iterator-functions-static +++ a/mm/slab_common.c @@ -1039,18 +1039,18 @@ static void print_slabinfo_header(struct seq_putc(m, '\n'); } -void *slab_start(struct seq_file *m, loff_t *pos) +static void *slab_start(struct seq_file *m, loff_t *pos) { mutex_lock(&slab_mutex); return seq_list_start(&slab_caches, *pos); } -void *slab_next(struct seq_file *m, void *p, loff_t *pos) +static void *slab_next(struct seq_file *m, void *p, loff_t *pos) { return seq_list_next(p, &slab_caches, pos); } -void slab_stop(struct seq_file *m, void *p) +static void slab_stop(struct seq_file *m, void *p) { mutex_unlock(&slab_mutex); } @@ -1118,17 +1118,6 @@ void dump_unreclaimable_slab(void) mutex_unlock(&slab_mutex); } -#if defined(CONFIG_MEMCG_KMEM) -int memcg_slab_show(struct seq_file *m, void *p) -{ - /* - * Deprecated. - * Please, take a look at tools/cgroup/slabinfo.py . 
- */ - return 0; -} -#endif - /* * slabinfo_op - iterator that generates /proc/slabinfo * --- a/mm/slab.h~mm-slab-make-slab-iterator-functions-static +++ a/mm/slab.h @@ -575,11 +575,6 @@ static inline struct kmem_cache_node *ge #endif -void *slab_start(struct seq_file *m, loff_t *pos); -void *slab_next(struct seq_file *m, void *p, loff_t *pos); -void slab_stop(struct seq_file *m, void *p); -int memcg_slab_show(struct seq_file *m, void *p); - #if defined(CONFIG_SLAB) || defined(CONFIG_SLUB_DEBUG) void dump_unreclaimable_slab(void); #else From patchwork Fri Jan 14 22:04:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714049 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id CCE5FC433FE for ; Fri, 14 Jan 2022 22:04:08 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 6726A6B00AF; Fri, 14 Jan 2022 17:04:08 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 5D13B6B00B1; Fri, 14 Jan 2022 17:04:08 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3FD036B00B2; Fri, 14 Jan 2022 17:04:08 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0170.hostedemail.com [216.40.44.170]) by kanga.kvack.org (Postfix) with ESMTP id 281F36B00AF for ; Fri, 14 Jan 2022 17:04:08 -0500 (EST) Received: from smtpin03.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id DE8771810C7DF for ; Fri, 14 Jan 2022 22:04:07 +0000 (UTC) X-FDA: 79030271334.03.7340FFF Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf27.hostedemail.com (Postfix) with ESMTP id 1136A40006 for ; Fri, 14 Jan 2022 22:04:06 +0000 
(UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 526C161FB7; Fri, 14 Jan 2022 22:04:06 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6AF52C36AE5; Fri, 14 Jan 2022 22:04:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197845; bh=XvFXvx5IVVkhOJxYT4ysGdrTvp9RocgrSbuFPsQPXMM=; h=Date:From:To:Subject:In-Reply-To:From; b=SxZ/UOzwgmC7D35hAFDQ6Jko2tTMRTAB/KzQ++XorURw9RoCiG0PqUvR2WQ+P288q OGn4LvFQ92YsWnTVqeY4ys4YZczIjeEQ3YhOo4KbGLWr03SL5nykA7z+iyZNIK+cY+ 3aqfR/gQ4uFfppAa1fKtXhjP+kTQ2lq9HHNJm13I= Date: Fri, 14 Jan 2022 14:04:04 -0800 From: Andrew Morton To: akpm@linux-foundation.org, catalin.marinas@arm.com, Kuan-Ying.Lee@mediatek.com, linux-mm@kvack.org, mgorman@techsingularity.net, mm-commits@vger.kernel.org, opendmb@gmail.com, peterz@infradead.org, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 023/146] kmemleak: fix kmemleak false positive report with HW tag-based kasan enable Message-ID: <20220114220404.vc-aCuaCH%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 1136A40006 X-Stat-Signature: hqspzt1qun9z91ecwuega1oa74z1ryxa Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="SxZ/UOzw"; spf=pass (imf27.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642197846-171563 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Kuan-Ying Lee Subject: kmemleak: fix kmemleak false positive report with HW 
tag-based kasan enable

With HW tag-based KASAN enabled, we get a warning when we free an object
whose address starts with 0xFF.

This is because the kmemleak rbtree stores the tagged object pointer, and
the tag of the pointer being freed does not match the tag stored in the
rbtree.

In the example below, the kmemleak rbtree stores the tagged object from
kmalloc(), while kfree() gets a pointer with the 0xFF tag.

Call sequence:
ptr = kmalloc(size, GFP_KERNEL);
page = virt_to_page(ptr);
offset = offset_in_page(ptr);
kfree(page_address(page) + offset);
ptr = kmalloc(size, GFP_KERNEL);

A call sequence like this may trigger the following warnings:

1) Freeing unknown object:

In kfree(), kmemleak_free() warns about freeing an unknown object,
because the object (0xFx) in the kmemleak rbtree and the pointer (0xFF)
passed to kfree() have different tags.

2) Overlaps existing:

When that object is allocated again with the same HW tag, kmemleak finds
the overlap in its rbtree and the kmemleak thread is killed.

[ 116.685312] kmemleak: Freeing unknown object at 0xffff000003f88000
[ 116.686422] CPU: 5 PID: 177 Comm: cat Not tainted 5.16.0-rc1-dirty #21
[ 116.687067] Hardware name: linux,dummy-virt (DT)
[ 116.687496] Call trace:
[ 116.687792]  dump_backtrace+0x0/0x1ac
[ 116.688255]  show_stack+0x1c/0x30
[ 116.688663]  dump_stack_lvl+0x68/0x84
[ 116.689096]  dump_stack+0x1c/0x38
[ 116.689499]  kmemleak_free+0x6c/0x70
[ 116.689919]  slab_free_freelist_hook+0x104/0x200
[ 116.690420]  kmem_cache_free+0xa8/0x3d4
[ 116.690845]  test_version_show+0x270/0x3a0
[ 116.691344]  module_attr_show+0x28/0x40
[ 116.691789]  sysfs_kf_seq_show+0xb0/0x130
[ 116.692245]  kernfs_seq_show+0x30/0x40
[ 116.692678]  seq_read_iter+0x1bc/0x4b0
[ 116.692678]  seq_read_iter+0x1bc/0x4b0
[ 116.693114]  kernfs_fop_read_iter+0x144/0x1c0
[ 116.693586]  generic_file_splice_read+0xd0/0x184
[ 116.694078]  do_splice_to+0x90/0xe0
[ 116.694498]  splice_direct_to_actor+0xb8/0x250
[ 116.694975]  do_splice_direct+0x88/0xd4
[ 116.695409]  do_sendfile+0x2b0/0x344
[ 116.695829]
__arm64_sys_sendfile64+0x164/0x16c [ 116.696306] invoke_syscall+0x48/0x114 [ 116.696735] el0_svc_common.constprop.0+0x44/0xec [ 116.697263] do_el0_svc+0x74/0x90 [ 116.697665] el0_svc+0x20/0x80 [ 116.698261] el0t_64_sync_handler+0x1a8/0x1b0 [ 116.698695] el0t_64_sync+0x1ac/0x1b0 ... [ 117.520301] kmemleak: Cannot insert 0xf2ff000003f88000 into the object search tree (overlaps existing) [ 117.521118] CPU: 5 PID: 178 Comm: cat Not tainted 5.16.0-rc1-dirty #21 [ 117.521827] Hardware name: linux,dummy-virt (DT) [ 117.522287] Call trace: [ 117.522586] dump_backtrace+0x0/0x1ac [ 117.523053] show_stack+0x1c/0x30 [ 117.523578] dump_stack_lvl+0x68/0x84 [ 117.524039] dump_stack+0x1c/0x38 [ 117.524472] create_object.isra.0+0x2d8/0x2fc [ 117.524975] kmemleak_alloc+0x34/0x40 [ 117.525416] kmem_cache_alloc+0x23c/0x2f0 [ 117.525914] test_version_show+0x1fc/0x3a0 [ 117.526379] module_attr_show+0x28/0x40 [ 117.526827] sysfs_kf_seq_show+0xb0/0x130 [ 117.527363] kernfs_seq_show+0x30/0x40 [ 117.527848] seq_read_iter+0x1bc/0x4b0 [ 117.528320] kernfs_fop_read_iter+0x144/0x1c0 [ 117.528809] generic_file_splice_read+0xd0/0x184 [ 117.529316] do_splice_to+0x90/0xe0 [ 117.529734] splice_direct_to_actor+0xb8/0x250 [ 117.530227] do_splice_direct+0x88/0xd4 [ 117.530686] do_sendfile+0x2b0/0x344 [ 117.531154] __arm64_sys_sendfile64+0x164/0x16c [ 117.531673] invoke_syscall+0x48/0x114 [ 117.532111] el0_svc_common.constprop.0+0x44/0xec [ 117.532621] do_el0_svc+0x74/0x90 [ 117.533048] el0_svc+0x20/0x80 [ 117.533461] el0t_64_sync_handler+0x1a8/0x1b0 [ 117.533950] el0t_64_sync+0x1ac/0x1b0 [ 117.534625] kmemleak: Kernel memory leak detector disabled [ 117.535201] kmemleak: Object 0xf2ff000003f88000 (size 128): [ 117.535761] kmemleak: comm "cat", pid 177, jiffies 4294921177 [ 117.536339] kmemleak: min_count = 1 [ 117.536718] kmemleak: count = 0 [ 117.537068] kmemleak: flags = 0x1 [ 117.537429] kmemleak: checksum = 0 [ 117.537806] kmemleak: backtrace: [ 117.538211] kmem_cache_alloc+0x23c/0x2f0 [ 
117.538924] test_version_show+0x1fc/0x3a0 [ 117.539393] module_attr_show+0x28/0x40 [ 117.539844] sysfs_kf_seq_show+0xb0/0x130 [ 117.540304] kernfs_seq_show+0x30/0x40 [ 117.540750] seq_read_iter+0x1bc/0x4b0 [ 117.541206] kernfs_fop_read_iter+0x144/0x1c0 [ 117.541687] generic_file_splice_read+0xd0/0x184 [ 117.542182] do_splice_to+0x90/0xe0 [ 117.542611] splice_direct_to_actor+0xb8/0x250 [ 117.543097] do_splice_direct+0x88/0xd4 [ 117.543544] do_sendfile+0x2b0/0x344 [ 117.543983] __arm64_sys_sendfile64+0x164/0x16c [ 117.544471] invoke_syscall+0x48/0x114 [ 117.544917] el0_svc_common.constprop.0+0x44/0xec [ 117.545416] do_el0_svc+0x74/0x90 [ 117.554100] kmemleak: Automatic memory scanning thread ended [akpm@linux-foundation.org: whitespace tweak] Link: https://lkml.kernel.org/r/20211118054426.4123-1-Kuan-Ying.Lee@mediatek.com Signed-off-by: Kuan-Ying Lee Reviewed-by: Catalin Marinas Cc: Doug Berger Cc: Mel Gorman Cc: Peter Zijlstra Cc: Vlastimil Babka Signed-off-by: Andrew Morton --- mm/kmemleak.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-) --- a/mm/kmemleak.c~kmemleak-fix-kmemleak-false-positive-report-with-hw-tag-based-kasan-enable +++ a/mm/kmemleak.c @@ -381,15 +381,20 @@ static void dump_object_info(struct kmem static struct kmemleak_object *lookup_object(unsigned long ptr, int alias) { struct rb_node *rb = object_tree_root.rb_node; + unsigned long untagged_ptr = (unsigned long)kasan_reset_tag((void *)ptr); while (rb) { - struct kmemleak_object *object = - rb_entry(rb, struct kmemleak_object, rb_node); - if (ptr < object->pointer) + struct kmemleak_object *object; + unsigned long untagged_objp; + + object = rb_entry(rb, struct kmemleak_object, rb_node); + untagged_objp = (unsigned long)kasan_reset_tag((void *)object->pointer); + + if (untagged_ptr < untagged_objp) rb = object->rb_node.rb_left; - else if (object->pointer + object->size <= ptr) + else if (untagged_objp + object->size <= untagged_ptr) rb = object->rb_node.rb_right; - else 
if (object->pointer == ptr || alias) + else if (untagged_objp == untagged_ptr || alias) return object; else { kmemleak_warn("Found object by alias at 0x%08lx\n", @@ -576,6 +581,7 @@ static struct kmemleak_object *create_ob struct kmemleak_object *object, *parent; struct rb_node **link, *rb_parent; unsigned long untagged_ptr; + unsigned long untagged_objp; object = mem_pool_alloc(gfp); if (!object) { @@ -629,9 +635,10 @@ static struct kmemleak_object *create_ob while (*link) { rb_parent = *link; parent = rb_entry(rb_parent, struct kmemleak_object, rb_node); - if (ptr + size <= parent->pointer) + untagged_objp = (unsigned long)kasan_reset_tag((void *)parent->pointer); + if (untagged_ptr + size <= untagged_objp) link = &parent->rb_node.rb_left; - else if (parent->pointer + parent->size <= ptr) + else if (untagged_objp + parent->size <= untagged_ptr) link = &parent->rb_node.rb_right; else { kmemleak_stop("Cannot insert 0x%lx into the object search tree (overlaps existing)\n", From patchwork Fri Jan 14 22:04:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714050 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 76DB2C433FE for ; Fri, 14 Jan 2022 22:04:11 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 08B496B00B1; Fri, 14 Jan 2022 17:04:11 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 004BA6B00B3; Fri, 14 Jan 2022 17:04:10 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DE7EE6B00B4; Fri, 14 Jan 2022 17:04:10 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0055.hostedemail.com [216.40.44.55]) by kanga.kvack.org (Postfix) with ESMTP id 
C1AE86B00B1 for ; Fri, 14 Jan 2022 17:04:10 -0500 (EST) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 8D1809879B for ; Fri, 14 Jan 2022 22:04:10 +0000 (UTC) X-FDA: 79030271460.21.7CA671A Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf11.hostedemail.com (Postfix) with ESMTP id 149BB40007 for ; Fri, 14 Jan 2022 22:04:09 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 6620861FE2; Fri, 14 Jan 2022 22:04:09 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 90B25C36AE5; Fri, 14 Jan 2022 22:04:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197848; bh=OSe+im9JFy8BViBq6/nlfLKffW+SXhkaPqS3wYcfOMA=; h=Date:From:To:Subject:In-Reply-To:From; b=0LFyw2uk2AX7UAqxELXZ6XtVZCfadT8Y1bVx40Spjp3AZJ3/bQzAycVU8GkxKsiJe UUNbqJxruID8s0Fn85JMVukRF1qu1Jrt2d4DGOksgt+6Iu7AkBBgHR5QWcsNb3Y97f bDBp4JgQ+u5lm8Fwi4rb1aoGUQQcAdLhtPJZ9NqM= Date: Fri, 14 Jan 2022 14:04:08 -0800 From: Andrew Morton To: akpm@linux-foundation.org, calvinzhang.cool@gmail.com, catalin.marinas@arm.com, frowand.list@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, robh+dt@kernel.org, torvalds@linux-foundation.org Subject: [patch 024/146] mm: kmemleak: alloc gray object for reserved region with direct map Message-ID: <20220114220408.0dD7Bxyv7%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 149BB40007 X-Stat-Signature: bh5gbkg13oy9dggu3cjozzhi11zhbxyi Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=0LFyw2uk; spf=pass (imf11.hostedemail.com: domain of 
akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none
X-HE-Tag: 1642197849-383962
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Calvin Zhang
Subject: mm: kmemleak: alloc gray object for reserved region with direct map

Reserved regions with a direct mapping may contain references to other
regions.  A CMA region with a fixed location is reserved without a
kmemleak_object being created for it.

So add such regions as gray kmemleak objects.

Link: https://lkml.kernel.org/r/20211123090641.3654006-1-calvinzhang.cool@gmail.com
Signed-off-by: Calvin Zhang
Cc: Rob Herring
Cc: Frank Rowand
Cc: Catalin Marinas
Signed-off-by: Andrew Morton
---

 drivers/of/fdt.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/drivers/of/fdt.c~mm-kmemleak-alloc-gray-object-for-reserved-region-with-direct-map
+++ a/drivers/of/fdt.c
@@ -26,6 +26,7 @@
 #include
 #include
 #include
+#include
 
 #include /* for COMMAND_LINE_SIZE */
 #include
@@ -522,9 +523,12 @@ static int __init __reserved_mem_reserve
 		size = dt_mem_next_cell(dt_root_size_cells, &prop);
 
 		if (size &&
-		    early_init_dt_reserve_memory_arch(base, size, nomap) == 0)
+		    early_init_dt_reserve_memory_arch(base, size, nomap) == 0) {
 			pr_debug("Reserved memory: reserved region for node '%s': base %pa, size %lu MiB\n",
 				uname, &base, (unsigned long)(size / SZ_1M));
+			if (!nomap)
+				kmemleak_alloc_phys(base, size, 0, 0);
+		}
 		else
 			pr_info("Reserved memory: failed to reserve memory for node '%s': base %pa, size %lu MiB\n",
 				uname, &base, (unsigned long)(size / SZ_1M));

From patchwork Fri Jan 14 22:04:11 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12714051
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 54EEDC433FE for ; Fri, 14 Jan 2022 22:04:15 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DC3156B00B3; Fri, 14 Jan 2022 17:04:14 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id D4B056B00B5; Fri, 14 Jan 2022 17:04:14 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BEEFC6B00B6; Fri, 14 Jan 2022 17:04:14 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0034.hostedemail.com [216.40.44.34]) by kanga.kvack.org (Postfix) with ESMTP id A91A46B00B3 for ; Fri, 14 Jan 2022 17:04:14 -0500 (EST) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 63EBA82F76DF for ; Fri, 14 Jan 2022 22:04:14 +0000 (UTC) X-FDA: 79030271628.18.1600CAC Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf14.hostedemail.com (Postfix) with ESMTP id ECD7C10000C for ; Fri, 14 Jan 2022 22:04:13 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 44E8961FE2; Fri, 14 Jan 2022 22:04:13 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 007C0C36AE9; Fri, 14 Jan 2022 22:04:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197852; bh=/CK2R4Uz89Iinr47VRydhr2LXy8xCgyTN5YRCoTibK4=; h=Date:From:To:Subject:In-Reply-To:From; b=EmiextObaHvwm1GF/XFsKuFAOKgaGTwJpT/XihdT09tCB9I4OSKLJn7keUW4k274U uieYG5tBddZXF1CvGxWrSO8+wNVgqHEP0njqB4Qtg5s9QsEbgQ+HGw3EPpVLYzstrJ Zi8V1mqYgu2AafABDY6mio+ItWw1+5l0WTJ/xp9I= Date: Fri, 14 Jan 2022 14:04:11 -0800 From: Andrew Morton To: agordeev@linux.ibm.com, 
akpm@linux-foundation.org, andreyknvl@gmail.com, borntraeger@linux.ibm.com, bp@alien8.de, catalin.marinas@arm.com, dave.hansen@linux.intel.com, dvyukov@google.com, glider@google.com, gor@linux.ibm.com, hca@linux.ibm.com, linux-mm@kvack.org, liuyongqiang13@huawei.com, mingo@redhat.com, mm-commits@vger.kernel.org, ryabinin.a.a@gmail.com, tglx@linutronix.de, torvalds@linux-foundation.org, wangkefeng.wang@huawei.com, will@kernel.org
Subject: [patch 025/146] mm: defer kmemleak object creation of module_alloc()
Message-ID: <20220114220411.9MdA2JViM%akpm@linux-foundation.org>
In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org>
User-Agent: s-nail v14.8.16
X-Rspamd-Queue-Id: ECD7C10000C
X-Stat-Signature: qwrmqxesi3yaby7kn7xsi6h35x1uioip
Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=EmiextOb; dmarc=none; spf=pass (imf14.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org
X-Rspamd-Server: rspam08
X-HE-Tag: 1642197853-894144
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Kefeng Wang
Subject: mm: defer kmemleak object creation of module_alloc()

Yongqiang reports a kmemleak panic when a module is loaded or removed
(insmod/rmmod) with KASAN enabled (without KASAN_VMALLOC) on x86 [1].

When the module area allocates memory, its kmemleak_object is created
successfully, but the KASAN shadow memory for the module allocation is
not ready yet, so when kmemleak scans the module's pointers, the KASAN
check panics because there is no shadow memory.

  module_alloc
    __vmalloc_node_range
      kmemleak_vmalloc
				kmemleak_scan
				  update_checksum
    kasan_module_alloc
      kmemleak_ignore

Note, there is no problem if KASAN_VMALLOC is enabled, because the shadow
memory for the entire modules area is preallocated.
Thus, the bug only exists on architectures that support dynamic
allocation of the module area per module load; for now, only
x86/arm64/s390 are involved.

Add a VM_DEFER_KMEMLEAK flag to defer kmemleak registration of the
vmalloc'ed object in module_alloc(), which fixes this issue.

[1] https://lore.kernel.org/all/6d41e2b9-4692-5ec4-b1cd-cbe29ae89739@huawei.com/

[wangkefeng.wang@huawei.com: fix build]
  Link: https://lkml.kernel.org/r/20211125080307.27225-1-wangkefeng.wang@huawei.com
[akpm@linux-foundation.org: simplify ifdefs, per Andrey]
  Link: https://lkml.kernel.org/r/CA+fCnZcnwJHUQq34VuRxpdoY6_XbJCDJ-jopksS5Eia4PijPzw@mail.gmail.com
Link: https://lkml.kernel.org/r/20211124142034.192078-1-wangkefeng.wang@huawei.com
Fixes: 793213a82de4 ("s390/kasan: dynamic shadow mem allocation for modules")
Fixes: 39d114ddc682 ("arm64: add KASAN support")
Fixes: bebf56a1b176 ("kasan: enable instrumentation of global variables")
Signed-off-by: Kefeng Wang
Reported-by: Yongqiang Liu
Cc: Andrey Konovalov
Cc: Andrey Ryabinin
Cc: Dmitry Vyukov
Cc: Catalin Marinas
Cc: Will Deacon
Cc: Heiko Carstens
Cc: Vasily Gorbik
Cc: Christian Borntraeger
Cc: Alexander Gordeev
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: Dave Hansen
Cc: Alexander Potapenko
Cc: Kefeng Wang
Signed-off-by: Andrew Morton
---

 arch/arm64/kernel/module.c |    4 ++--
 arch/s390/kernel/module.c  |    5 +++--
 arch/x86/kernel/module.c   |    7 ++++---
 include/linux/kasan.h      |    4 ++--
 include/linux/vmalloc.h    |    7 +++++++
 mm/kasan/shadow.c          |    9 +++++++--
 mm/vmalloc.c               |    3 ++-
 7 files changed, 27 insertions(+), 12 deletions(-)

--- a/arch/arm64/kernel/module.c~mm-defer-kmemleak-object-creation-of-module_alloc
+++ a/arch/arm64/kernel/module.c
@@ -36,7 +36,7 @@ void *module_alloc(unsigned long size)
 		module_alloc_end = MODULES_END;
 
 	p = __vmalloc_node_range(size, MODULE_ALIGN, module_alloc_base,
-				module_alloc_end, gfp_mask, PAGE_KERNEL, 0,
+				module_alloc_end, gfp_mask, PAGE_KERNEL, VM_DEFER_KMEMLEAK,
 				NUMA_NO_NODE, __builtin_return_address(0));
 
 	if (!p &&
IS_ENABLED(CONFIG_ARM64_MODULE_PLTS) && @@ -58,7 +58,7 @@ void *module_alloc(unsigned long size) PAGE_KERNEL, 0, NUMA_NO_NODE, __builtin_return_address(0)); - if (p && (kasan_module_alloc(p, size) < 0)) { + if (p && (kasan_module_alloc(p, size, gfp_mask) < 0)) { vfree(p); return NULL; } --- a/arch/s390/kernel/module.c~mm-defer-kmemleak-object-creation-of-module_alloc +++ a/arch/s390/kernel/module.c @@ -37,14 +37,15 @@ void *module_alloc(unsigned long size) { + gfp_t gfp_mask = GFP_KERNEL; void *p; if (PAGE_ALIGN(size) > MODULES_LEN) return NULL; p = __vmalloc_node_range(size, MODULE_ALIGN, MODULES_VADDR, MODULES_END, - GFP_KERNEL, PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE, + gfp_mask, PAGE_KERNEL_EXEC, VM_DEFER_KMEMLEAK, NUMA_NO_NODE, __builtin_return_address(0)); - if (p && (kasan_module_alloc(p, size) < 0)) { + if (p && (kasan_module_alloc(p, size, gfp_mask) < 0)) { vfree(p); return NULL; } --- a/arch/x86/kernel/module.c~mm-defer-kmemleak-object-creation-of-module_alloc +++ a/arch/x86/kernel/module.c @@ -67,6 +67,7 @@ static unsigned long int get_module_load void *module_alloc(unsigned long size) { + gfp_t gfp_mask = GFP_KERNEL; void *p; if (PAGE_ALIGN(size) > MODULES_LEN) @@ -74,10 +75,10 @@ void *module_alloc(unsigned long size) p = __vmalloc_node_range(size, MODULE_ALIGN, MODULES_VADDR + get_module_load_offset(), - MODULES_END, GFP_KERNEL, - PAGE_KERNEL, 0, NUMA_NO_NODE, + MODULES_END, gfp_mask, + PAGE_KERNEL, VM_DEFER_KMEMLEAK, NUMA_NO_NODE, __builtin_return_address(0)); - if (p && (kasan_module_alloc(p, size) < 0)) { + if (p && (kasan_module_alloc(p, size, gfp_mask) < 0)) { vfree(p); return NULL; } --- a/include/linux/kasan.h~mm-defer-kmemleak-object-creation-of-module_alloc +++ a/include/linux/kasan.h @@ -474,12 +474,12 @@ static inline void kasan_populate_early_ * allocations with real shadow memory. With KASAN vmalloc, the special * case is unnecessary, as the work is handled in the generic case. 
*/ -int kasan_module_alloc(void *addr, size_t size); +int kasan_module_alloc(void *addr, size_t size, gfp_t gfp_mask); void kasan_free_shadow(const struct vm_struct *vm); #else /* (CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS) && !CONFIG_KASAN_VMALLOC */ -static inline int kasan_module_alloc(void *addr, size_t size) { return 0; } +static inline int kasan_module_alloc(void *addr, size_t size, gfp_t gfp_mask) { return 0; } static inline void kasan_free_shadow(const struct vm_struct *vm) {} #endif /* (CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS) && !CONFIG_KASAN_VMALLOC */ --- a/include/linux/vmalloc.h~mm-defer-kmemleak-object-creation-of-module_alloc +++ a/include/linux/vmalloc.h @@ -28,6 +28,13 @@ struct notifier_block; /* in notifier.h #define VM_MAP_PUT_PAGES 0x00000200 /* put pages and free array in vfree */ #define VM_NO_HUGE_VMAP 0x00000400 /* force PAGE_SIZE pte mapping */ +#if (defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)) && \ + !defined(CONFIG_KASAN_VMALLOC) +#define VM_DEFER_KMEMLEAK 0x00000800 /* defer kmemleak object creation */ +#else +#define VM_DEFER_KMEMLEAK 0 +#endif + /* * VM_KASAN is used slightly differently depending on CONFIG_KASAN_VMALLOC. 
* --- a/mm/kasan/shadow.c~mm-defer-kmemleak-object-creation-of-module_alloc +++ a/mm/kasan/shadow.c @@ -498,7 +498,7 @@ void kasan_release_vmalloc(unsigned long #else /* CONFIG_KASAN_VMALLOC */ -int kasan_module_alloc(void *addr, size_t size) +int kasan_module_alloc(void *addr, size_t size, gfp_t gfp_mask) { void *ret; size_t scaled_size; @@ -520,9 +520,14 @@ int kasan_module_alloc(void *addr, size_ __builtin_return_address(0)); if (ret) { + struct vm_struct *vm = find_vm_area(addr); __memset(ret, KASAN_SHADOW_INIT, shadow_size); - find_vm_area(addr)->flags |= VM_KASAN; + vm->flags |= VM_KASAN; kmemleak_ignore(ret); + + if (vm->flags & VM_DEFER_KMEMLEAK) + kmemleak_vmalloc(vm, size, gfp_mask); + return 0; } --- a/mm/vmalloc.c~mm-defer-kmemleak-object-creation-of-module_alloc +++ a/mm/vmalloc.c @@ -3074,7 +3074,8 @@ again: clear_vm_uninitialized_flag(area); size = PAGE_ALIGN(size); - kmemleak_vmalloc(area, size, gfp_mask); + if (!(vm_flags & VM_DEFER_KMEMLEAK)) + kmemleak_vmalloc(area, size, gfp_mask); return addr; From patchwork Fri Jan 14 22:04:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714052 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1E8D9C433FE for ; Fri, 14 Jan 2022 22:04:20 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A3F9C6B00B5; Fri, 14 Jan 2022 17:04:19 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 9C82B6B00B7; Fri, 14 Jan 2022 17:04:19 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 869A46B00B8; Fri, 14 Jan 2022 17:04:19 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0170.hostedemail.com [216.40.44.170]) by 
kanga.kvack.org (Postfix) with ESMTP id 6F7BC6B00B5 for ; Fri, 14 Jan 2022 17:04:19 -0500 (EST) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 3725B182693EE for ; Fri, 14 Jan 2022 22:04:19 +0000 (UTC) X-FDA: 79030271838.26.9D2A442 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf31.hostedemail.com (Postfix) with ESMTP id ADB8520003 for ; Fri, 14 Jan 2022 22:04:18 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id A6535B82A26; Fri, 14 Jan 2022 22:04:17 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C0F37C36AED; Fri, 14 Jan 2022 22:04:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197856; bh=aYIoNvdorZpC3opwE9aCghpmh6cUTrEoy7aFVNrAJg0=; h=Date:From:To:Subject:In-Reply-To:From; b=kCoKccPFKUMLVrqfCedpE9LvHSTtN6H9suwQ+c24+MVqMepQ6zd3YkZ/jO7gI8gcN q7KLqbv2884oy28pRU+Hv3losvhm2AVZP6+ISiF1axpa8uZaIDa6j5oM5S9i0znQmn u0y4n3z5jLEdSx0Nrxd4tfYU7Yt0H+PnFrOC+kRY= Date: Fri, 14 Jan 2022 14:04:15 -0800 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, dan.j.williams@intel.com, dave.jiang@intel.com, hch@lst.de, jane.chu@oracle.com, jgg@nvidia.com, jgg@ziepe.ca, jhubbard@nvidia.com, joao.m.martins@oracle.com, linux-mm@kvack.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, vishal.l.verma@intel.com, willy@infradead.org Subject: [patch 026/146] mm/page_alloc: split prep_compound_page into head and tail subparts Message-ID: <20220114220415.Wq5bV9-yy%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: ADB8520003 
X-Stat-Signature: c4bqhrurg5i177w1hgsof83qmoraqqet Authentication-Results: imf31.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=kCoKccPF; dmarc=none; spf=pass (imf31.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-HE-Tag: 1642197858-1474 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Joao Martins Subject: mm/page_alloc: split prep_compound_page into head and tail subparts Patch series "mm, device-dax: Introduce compound pages in devmap", v7. This series converts device-dax to use compound pages, and moves away from the 'struct page per basepage on PMD/PUD' that is done today. Doing so 1) unlocks a few noticeable improvements on unpin_user_pages() and makes the device-dax+altmap case 4x faster in pinning (numbers below and in the last patch), and 2) as mentioned in various other threads, is one important step towards cleaning up ZONE_DEVICE refcounting. I've split the compound pages on devmap part from the rest based on recent discussions on devmap pending and future work planned[5][6]. There is consensus that device-dax should be using compound pages to represent its PMD/PUDs just like HugeTLB and THP, and that leads to less specialization of the dax parts. I will pursue the rest of the work in parallel once this part is merged, in particular the GUP-{slow,fast} improvements [7] and the tail struct page deduplication memory savings part[8]. To summarize what the series does: Patch 1: Prepare hwpoisoning to work with dax compound pages. Patches 2-3: Split the current utility function of prep_compound_page() into head and tail and use those two helpers where appropriate to take advantage of caches being warm after __init_single_page(). 
This is used when initializing zone device when we bring up device-dax namespaces. Patches 4-10: Add devmap support for compound pages in device-dax. memmap_init_zone_device() initializes its metadata as compound pages, and it introduces a new devmap property known as vmemmap_shift which outlines how the vmemmap is structured (defaults to base pages as done today). The property essentially describes the page order of the metadata. While at it do a few cleanups in device-dax in patches 5-9. Finally enable device-dax usage of devmap @vmemmap_shift to a value based on its own @align property. @vmemmap_shift returns 0 by default (which is today's case of base pages in devmap, like fsdax or the others) and the usage of compound devmap is optional. Starting with device-dax (*not* fsdax) we enable it by default. There are a few pinning improvements, particularly in the unpinning case and with altmap, as well as unpin_user_page_range_dirty_lock() being just as effective as THP/hugetlb[0] pages. $ gup_test -f /dev/dax1.0 -m 16384 -r 10 -S -a -n 512 -w (pin_user_pages_fast 2M pages) put:~71 ms -> put:~22 ms [altmap] (pin_user_pages_fast 2M pages) get:~524ms put:~525 ms -> get: ~127ms put:~71ms $ gup_test -f /dev/dax1.0 -m 129022 -r 10 -S -a -n 512 -w (pin_user_pages_fast 2M pages) put:~513 ms -> put:~188 ms [altmap with -m 127004] (pin_user_pages_fast 2M pages) get:~4.1 secs put:~4.12 secs -> get:~1sec put:~563ms Tested on x86 with 1TB+ of pmem (alongside registering it with RDMA with and without altmap), alongside gup_test selftests with dynamic dax regions and static dax regions. Coupled with ndctl unit tests for dynamic dax devices that exercise all of this. Note, for dynamic dax regions I had to revert commit 8aa83e6395 ("x86/setup: Call early_reserve_memory() earlier"); it is a known issue that this commit broke efi_fake_mem=. This patch (of 11): Split the utility function prep_compound_page() into head and tail counterparts, and use them accordingly. 
This is in preparation for sharing the storage for compound page metadata. Link: https://lkml.kernel.org/r/20211202204422.26777-1-joao.m.martins@oracle.com Link: https://lkml.kernel.org/r/20211202204422.26777-3-joao.m.martins@oracle.com Signed-off-by: Joao Martins Acked-by: Mike Kravetz Reviewed-by: Dan Williams Reviewed-by: Muchun Song Cc: Vishal Verma Cc: Dave Jiang Cc: Naoya Horiguchi Cc: Matthew Wilcox (Oracle) Cc: Jason Gunthorpe Cc: John Hubbard Cc: Jane Chu Cc: Jonathan Corbet Cc: Christoph Hellwig Cc: Jason Gunthorpe Signed-off-by: Andrew Morton --- mm/page_alloc.c | 30 ++++++++++++++++++++---------- 1 file changed, 20 insertions(+), 10 deletions(-) --- a/mm/page_alloc.c~mm-page_alloc-split-prep_compound_page-into-head-and-tail-subparts +++ a/mm/page_alloc.c @@ -726,23 +726,33 @@ void free_compound_page(struct page *pag free_the_page(page, compound_order(page)); } +static void prep_compound_head(struct page *page, unsigned int order) +{ + set_compound_page_dtor(page, COMPOUND_PAGE_DTOR); + set_compound_order(page, order); + atomic_set(compound_mapcount_ptr(page), -1); + if (hpage_pincount_available(page)) + atomic_set(compound_pincount_ptr(page), 0); +} + +static void prep_compound_tail(struct page *head, int tail_idx) +{ + struct page *p = head + tail_idx; + + p->mapping = TAIL_MAPPING; + set_compound_head(p, head); +} + void prep_compound_page(struct page *page, unsigned int order) { int i; int nr_pages = 1 << order; __SetPageHead(page); - for (i = 1; i < nr_pages; i++) { - struct page *p = page + i; - p->mapping = TAIL_MAPPING; - set_compound_head(p, page); - } + for (i = 1; i < nr_pages; i++) + prep_compound_tail(page, i); - set_compound_page_dtor(page, COMPOUND_PAGE_DTOR); - set_compound_order(page, order); - atomic_set(compound_mapcount_ptr(page), -1); - if (hpage_pincount_available(page)) - atomic_set(compound_pincount_ptr(page), 0); + prep_compound_head(page, order); } #ifdef CONFIG_DEBUG_PAGEALLOC From patchwork Fri Jan 14 22:04:18 2022 
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714053 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C4F7EC433EF for ; Fri, 14 Jan 2022 22:04:22 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 55E126B00B7; Fri, 14 Jan 2022 17:04:22 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 4E6216B00B9; Fri, 14 Jan 2022 17:04:22 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3399D6B00BA; Fri, 14 Jan 2022 17:04:22 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0050.hostedemail.com [216.40.44.50]) by kanga.kvack.org (Postfix) with ESMTP id 1944B6B00B7 for ; Fri, 14 Jan 2022 17:04:22 -0500 (EST) Received: from smtpin28.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id CB2F9181B04B5 for ; Fri, 14 Jan 2022 22:04:21 +0000 (UTC) X-FDA: 79030271922.28.822A9B0 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf17.hostedemail.com (Postfix) with ESMTP id 67ADF40008 for ; Fri, 14 Jan 2022 22:04:21 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 84CB061FE2; Fri, 14 Jan 2022 22:04:20 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 58DF2C36AE5; Fri, 14 Jan 2022 22:04:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197860; bh=sFnPFoqeExZj0BfB0gVfxJfeE0AVkbIP0cxQpOgWjqw=; h=Date:From:To:Subject:In-Reply-To:From; 
b=k8OPKOgPkMPK3vjNQq+sNdvnUgtOnc7MdS5VDPLG8NWUpLKI25J28WsdYPJgYbkui Vuw+VaUg/bzXbIeZZ+/P1qvO0EWKrzuKxDYZe9M63taquw1Q+xqC+3BGivKFTbUVhW xccMaSRoJRIhg6gnabEzix2hSWS8zZEctxVf8yP8= Date: Fri, 14 Jan 2022 14:04:18 -0800 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, dan.j.williams@intel.com, dave.jiang@intel.com, hch@lst.de, jane.chu@oracle.com, jgg@nvidia.com, jgg@ziepe.ca, jhubbard@nvidia.com, joao.m.martins@oracle.com, linux-mm@kvack.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, vishal.l.verma@intel.com, willy@infradead.org Subject: [patch 027/146] mm/page_alloc: refactor memmap_init_zone_device() page init Message-ID: <20220114220418.RtHpbXDRe%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 67ADF40008 X-Stat-Signature: yb7prurt6dczxtg4xe87jtb1w7rr51d1 Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=k8OPKOgP; dmarc=none; spf=pass (imf17.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam06 X-HE-Tag: 1642197861-72759 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Joao Martins Subject: mm/page_alloc: refactor memmap_init_zone_device() page init Move struct page init to a helper function __init_zone_device_page(). This is in preparation for sharing the storage for compound page metadata. 
Link: https://lkml.kernel.org/r/20211202204422.26777-4-joao.m.martins@oracle.com Signed-off-by: Joao Martins Reviewed-by: Dan Williams Cc: Christoph Hellwig Cc: Dave Jiang Cc: Jane Chu Cc: Jason Gunthorpe Cc: Jason Gunthorpe Cc: John Hubbard Cc: Jonathan Corbet Cc: Matthew Wilcox (Oracle) Cc: Mike Kravetz Cc: Muchun Song Cc: Naoya Horiguchi Cc: Vishal Verma Signed-off-by: Andrew Morton --- mm/page_alloc.c | 74 +++++++++++++++++++++++++--------------------- 1 file changed, 41 insertions(+), 33 deletions(-) --- a/mm/page_alloc.c~mm-page_alloc-refactor-memmap_init_zone_device-page-init +++ a/mm/page_alloc.c @@ -6572,6 +6572,46 @@ void __meminit memmap_init_range(unsigne } #ifdef CONFIG_ZONE_DEVICE +static void __ref __init_zone_device_page(struct page *page, unsigned long pfn, + unsigned long zone_idx, int nid, + struct dev_pagemap *pgmap) +{ + + __init_single_page(page, pfn, zone_idx, nid); + + /* + * Mark page reserved as it will need to wait for onlining + * phase for it to be fully associated with a zone. + * + * We can use the non-atomic __set_bit operation for setting + * the flag as we are still initializing the pages. + */ + __SetPageReserved(page); + + /* + * ZONE_DEVICE pages union ->lru with a ->pgmap back pointer + * and zone_device_data. It is a bug if a ZONE_DEVICE page is + * ever freed or placed on a driver-private list. + */ + page->pgmap = pgmap; + page->zone_device_data = NULL; + + /* + * Mark the block movable so that blocks are reserved for + * movable at startup. This will force kernel allocations + * to reserve their blocks rather than leaking throughout + * the address space during boot when many long-lived + * kernel allocations are made. 
+ * + * Please note that MEMINIT_HOTPLUG path doesn't clear memmap + * because this is done early in section_activate() + */ + if (IS_ALIGNED(pfn, pageblock_nr_pages)) { + set_pageblock_migratetype(page, MIGRATE_MOVABLE); + cond_resched(); + } +} + void __ref memmap_init_zone_device(struct zone *zone, unsigned long start_pfn, unsigned long nr_pages, @@ -6600,39 +6640,7 @@ void __ref memmap_init_zone_device(struc for (pfn = start_pfn; pfn < end_pfn; pfn++) { struct page *page = pfn_to_page(pfn); - __init_single_page(page, pfn, zone_idx, nid); - - /* - * Mark page reserved as it will need to wait for onlining - * phase for it to be fully associated with a zone. - * - * We can use the non-atomic __set_bit operation for setting - * the flag as we are still initializing the pages. - */ - __SetPageReserved(page); - - /* - * ZONE_DEVICE pages union ->lru with a ->pgmap back pointer - * and zone_device_data. It is a bug if a ZONE_DEVICE page is - * ever freed or placed on a driver-private list. - */ - page->pgmap = pgmap; - page->zone_device_data = NULL; - - /* - * Mark the block movable so that blocks are reserved for - * movable at startup. This will force kernel allocations - * to reserve their blocks rather than leaking throughout - * the address space during boot when many long-lived - * kernel allocations are made. 
- * - * Please note that MEMINIT_HOTPLUG path doesn't clear memmap - * because this is done early in section_activate() - */ - if (IS_ALIGNED(pfn, pageblock_nr_pages)) { - set_pageblock_migratetype(page, MIGRATE_MOVABLE); - cond_resched(); - } + __init_zone_device_page(page, pfn, zone_idx, nid, pgmap); } pr_info("%s initialised %lu pages in %ums\n", __func__, From patchwork Fri Jan 14 22:04:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714054 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8B64BC433F5 for ; Fri, 14 Jan 2022 22:04:26 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 0F1C46B00B9; Fri, 14 Jan 2022 17:04:26 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 079C36B00BB; Fri, 14 Jan 2022 17:04:26 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E37D96B00BC; Fri, 14 Jan 2022 17:04:25 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0239.hostedemail.com [216.40.44.239]) by kanga.kvack.org (Postfix) with ESMTP id CE5566B00B9 for ; Fri, 14 Jan 2022 17:04:25 -0500 (EST) Received: from smtpin23.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 947DF998B4 for ; Fri, 14 Jan 2022 22:04:25 +0000 (UTC) X-FDA: 79030272090.23.C27E60C Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf23.hostedemail.com (Postfix) with ESMTP id 1F7BE140004 for ; Fri, 14 Jan 2022 22:04:24 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by 
dfw.source.kernel.org (Postfix) with ESMTPS id 2894E61FF8; Fri, 14 Jan 2022 22:04:24 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id EBE90C36AEC; Fri, 14 Jan 2022 22:04:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197863; bh=9iAiEeb4UvT+jYIwxrSfdvyYYMjAoD854bj1LI2V/Mc=; h=Date:From:To:Subject:In-Reply-To:From; b=pe4cG6PYMqGSxf/UQbpZYiyzK8P771o3RD1deYlCqlZE/e5T7cbJjQPZ9KXtQwR0m Kvjhg7hRlcgb5ccqqtSIuhr7aNZbXRVw3exccJSPi6XkIdp0qWQgvIRr6oS7u0KR+T X+PFAQaarWwWkiwsPOJcc1uN9ayKOWB2d+f0q+9Y= Date: Fri, 14 Jan 2022 14:04:22 -0800 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, dan.j.williams@intel.com, dave.jiang@intel.com, hch@lst.de, jane.chu@oracle.com, jgg@nvidia.com, jgg@ziepe.ca, jhubbard@nvidia.com, joao.m.martins@oracle.com, linux-mm@kvack.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, vishal.l.verma@intel.com, willy@infradead.org Subject: [patch 028/146] mm/memremap: add ZONE_DEVICE support for compound pages Message-ID: <20220114220422.NTYjq91hd%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 1F7BE140004 X-Stat-Signature: p53714zb4i7x7fafzi8ehuor93ib1hs1 Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=pe4cG6PY; spf=pass (imf23.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642197864-46059 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Joao Martins Subject: mm/memremap: add ZONE_DEVICE support for compound pages Add a new @vmemmap_shift property for struct 
dev_pagemap which specifies that a devmap is composed of a set of compound pages of order @vmemmap_shift, instead of base pages. When a compound page devmap is requested, all but the first page are initialised as tail pages instead of order-0 pages. For certain ZONE_DEVICE users like device-dax which have a fixed page size, this creates an opportunity to optimize GUP and GUP-fast walkers, treating it the same way as THP or hugetlb pages. Additionally, commit 7118fc2906e2 ("hugetlb: address ref count racing in prep_compound_gigantic_page") removed set_page_count() because the setting of page ref count to zero was redundant. devmap pages don't come from page allocator though and only head page refcount is used for compound pages, hence initialize tail page count to zero. Link: https://lkml.kernel.org/r/20211202204422.26777-5-joao.m.martins@oracle.com Signed-off-by: Joao Martins Reviewed-by: Dan Williams Cc: Christoph Hellwig Cc: Dave Jiang Cc: Jane Chu Cc: Jason Gunthorpe Cc: Jason Gunthorpe Cc: John Hubbard Cc: Jonathan Corbet Cc: Matthew Wilcox (Oracle) Cc: Mike Kravetz Cc: Muchun Song Cc: Naoya Horiguchi Cc: Vishal Verma Signed-off-by: Andrew Morton --- include/linux/memremap.h | 11 ++++++++++ mm/memremap.c | 18 +++++++++++------ mm/page_alloc.c | 38 ++++++++++++++++++++++++++++++++++++- 3 files changed, 60 insertions(+), 7 deletions(-) --- a/include/linux/memremap.h~mm-memremap-add-zone_device-support-for-compound-pages +++ a/include/linux/memremap.h @@ -99,6 +99,11 @@ struct dev_pagemap_ops { * @done: completion for @internal_ref * @type: memory type: see MEMORY_* in memory_hotplug.h * @flags: PGMAP_* flags to specify defailed behavior + * @vmemmap_shift: structural definition of how the vmemmap page metadata + * is populated, specifically the metadata page order. + * A zero value (default) uses base pages as the vmemmap metadata + * representation. A bigger value will set up compound struct pages + * of the requested order value. 
* @ops: method table * @owner: an opaque pointer identifying the entity that manages this * instance. Used by various helpers to make sure that no @@ -114,6 +119,7 @@ struct dev_pagemap { struct completion done; enum memory_type type; unsigned int flags; + unsigned long vmemmap_shift; const struct dev_pagemap_ops *ops; void *owner; int nr_range; @@ -130,6 +136,11 @@ static inline struct vmem_altmap *pgmap_ return NULL; } +static inline unsigned long pgmap_vmemmap_nr(struct dev_pagemap *pgmap) +{ + return 1 << pgmap->vmemmap_shift; +} + #ifdef CONFIG_ZONE_DEVICE void *memremap_pages(struct dev_pagemap *pgmap, int nid); void memunmap_pages(struct dev_pagemap *pgmap); --- a/mm/memremap.c~mm-memremap-add-zone_device-support-for-compound-pages +++ a/mm/memremap.c @@ -102,15 +102,22 @@ static unsigned long pfn_end(struct dev_ return (range->start + range_len(range)) >> PAGE_SHIFT; } -static unsigned long pfn_next(unsigned long pfn) +static unsigned long pfn_next(struct dev_pagemap *pgmap, unsigned long pfn) { - if (pfn % 1024 == 0) + if (pfn % (1024 << pgmap->vmemmap_shift)) cond_resched(); - return pfn + 1; + return pfn + pgmap_vmemmap_nr(pgmap); +} + +static unsigned long pfn_len(struct dev_pagemap *pgmap, unsigned long range_id) +{ + return (pfn_end(pgmap, range_id) - + pfn_first(pgmap, range_id)) >> pgmap->vmemmap_shift; } #define for_each_device_pfn(pfn, map, i) \ - for (pfn = pfn_first(map, i); pfn < pfn_end(map, i); pfn = pfn_next(pfn)) + for (pfn = pfn_first(map, i); pfn < pfn_end(map, i); \ + pfn = pfn_next(map, pfn)) static void dev_pagemap_kill(struct dev_pagemap *pgmap) { @@ -295,8 +302,7 @@ static int pagemap_range(struct dev_page memmap_init_zone_device(&NODE_DATA(nid)->node_zones[ZONE_DEVICE], PHYS_PFN(range->start), PHYS_PFN(range_len(range)), pgmap); - percpu_ref_get_many(pgmap->ref, pfn_end(pgmap, range_id) - - pfn_first(pgmap, range_id)); + percpu_ref_get_many(pgmap->ref, pfn_len(pgmap, range_id)); return 0; err_add_memory: --- 
a/mm/page_alloc.c~mm-memremap-add-zone_device-support-for-compound-pages +++ a/mm/page_alloc.c @@ -6612,6 +6612,35 @@ static void __ref __init_zone_device_pag } } +static void __ref memmap_init_compound(struct page *head, + unsigned long head_pfn, + unsigned long zone_idx, int nid, + struct dev_pagemap *pgmap, + unsigned long nr_pages) +{ + unsigned long pfn, end_pfn = head_pfn + nr_pages; + unsigned int order = pgmap->vmemmap_shift; + + __SetPageHead(head); + for (pfn = head_pfn + 1; pfn < end_pfn; pfn++) { + struct page *page = pfn_to_page(pfn); + + __init_zone_device_page(page, pfn, zone_idx, nid, pgmap); + prep_compound_tail(head, pfn - head_pfn); + set_page_count(page, 0); + + /* + * The first tail page stores compound_mapcount_ptr() and + * compound_order() and the second tail page stores + * compound_pincount_ptr(). Call prep_compound_head() after + * the first and second tail pages have been initialized to + * not have the data overwritten. + */ + if (pfn == head_pfn + 2) + prep_compound_head(head, order); + } +} + void __ref memmap_init_zone_device(struct zone *zone, unsigned long start_pfn, unsigned long nr_pages, @@ -6620,6 +6649,7 @@ void __ref memmap_init_zone_device(struc unsigned long pfn, end_pfn = start_pfn + nr_pages; struct pglist_data *pgdat = zone->zone_pgdat; struct vmem_altmap *altmap = pgmap_altmap(pgmap); + unsigned int pfns_per_compound = pgmap_vmemmap_nr(pgmap); unsigned long zone_idx = zone_idx(zone); unsigned long start = jiffies; int nid = pgdat->node_id; @@ -6637,10 +6667,16 @@ void __ref memmap_init_zone_device(struc nr_pages = end_pfn - start_pfn; } - for (pfn = start_pfn; pfn < end_pfn; pfn++) { + for (pfn = start_pfn; pfn < end_pfn; pfn += pfns_per_compound) { struct page *page = pfn_to_page(pfn); __init_zone_device_page(page, pfn, zone_idx, nid, pgmap); + + if (pfns_per_compound == 1) + continue; + + memmap_init_compound(page, pfn, zone_idx, nid, pgmap, + pfns_per_compound); } pr_info("%s initialised %lu pages in %ums\n", 
__func__, From patchwork Fri Jan 14 22:04:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714055 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B480EC433F5 for ; Fri, 14 Jan 2022 22:04:29 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 425D26B00BB; Fri, 14 Jan 2022 17:04:29 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 3ADCB6B00BD; Fri, 14 Jan 2022 17:04:29 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2284F6B00BE; Fri, 14 Jan 2022 17:04:29 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0196.hostedemail.com [216.40.44.196]) by kanga.kvack.org (Postfix) with ESMTP id 0D02B6B00BB for ; Fri, 14 Jan 2022 17:04:29 -0500 (EST) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id C351B8F174 for ; Fri, 14 Jan 2022 22:04:28 +0000 (UTC) X-FDA: 79030272216.29.900AC85 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf24.hostedemail.com (Postfix) with ESMTP id 6BB2F180006 for ; Fri, 14 Jan 2022 22:04:28 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id B5C4261FE2; Fri, 14 Jan 2022 22:04:27 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 87504C36AE5; Fri, 14 Jan 2022 22:04:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197867; bh=nGm195y+2axTOVVS2KA7kxHcTPl6Wpu5+3z5X6bVCaI=; 
h=Date:From:To:Subject:In-Reply-To:From; b=aO03WhQ0D9GKW4QeShEdrx2K7hyzRQXInYZ08vceFlNtJvSKx7V/RUfTu78bpUCU+ X0M1b5v3nHeF9FkmP91i9FqYBxBfPGoTOOZdgp5zPjCTd3DbwvMdKz6ylih8uzt4Bs kluxQEHsBuOr8rM50M2ZnF1kZRz1m32XbIiJOPVg= Date: Fri, 14 Jan 2022 14:04:26 -0800 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, dan.j.williams@intel.com, dave.jiang@intel.com, hch@lst.de, jane.chu@oracle.com, jgg@nvidia.com, jgg@ziepe.ca, jhubbard@nvidia.com, joao.m.martins@oracle.com, linux-mm@kvack.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, vishal.l.verma@intel.com, willy@infradead.org Subject: [patch 029/146] device-dax: use ALIGN() for determining pgoff Message-ID: <20220114220426.P_C9Y0zWa%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 6BB2F180006 X-Stat-Signature: dj3tb6ec5cpomiuptdzdfap64qa83nqb Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=aO03WhQ0; spf=pass (imf24.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642197868-350804 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Joao Martins Subject: device-dax: use ALIGN() for determining pgoff Rather than calculating @pgoff manually, switch to ALIGN() instead. 
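For readers comparing the two forms, a minimal userspace sketch follows. It is not part of this patch. The helper names sketch_round_down() and sketch_round_up() are hypothetical: the first is assumed to mirror the open-coded mask (address & ~(fault_size - 1)), the second the round-up arithmetic that the kernel's ALIGN() macro performs. Both assume a power-of-two size.

```c
#include <stdint.h>

/*
 * Hypothetical userspace stand-ins, for illustration only.
 * 'size' is assumed to be a power of two (e.g. a 2M fault size).
 */

/* Mirrors the open-coded mask: clears the low bits, i.e. rounds down. */
static inline uintptr_t sketch_round_down(uintptr_t addr, uintptr_t size)
{
	return addr & ~(size - 1);
}

/* Mirrors round-up alignment: bumps to the next multiple of 'size'. */
static inline uintptr_t sketch_round_up(uintptr_t addr, uintptr_t size)
{
	return (addr + size - 1) & ~(size - 1);
}
```

The two helpers agree only when the address is already a multiple of the size; for any other address the round-up form lands one full size unit higher. It may therefore be worth confirming which rounding direction the fault handler needs when reading the hunk below.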
Link: https://lkml.kernel.org/r/20211202204422.26777-6-joao.m.martins@oracle.com Suggested-by: Dan Williams Signed-off-by: Joao Martins Reviewed-by: Dan Williams Cc: Christoph Hellwig Cc: Dave Jiang Cc: Jane Chu Cc: Jason Gunthorpe Cc: Jason Gunthorpe Cc: John Hubbard Cc: Jonathan Corbet Cc: Matthew Wilcox (Oracle) Cc: Mike Kravetz Cc: Muchun Song Cc: Naoya Horiguchi Cc: Vishal Verma Signed-off-by: Andrew Morton --- drivers/dax/device.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/drivers/dax/device.c~device-dax-use-align-for-determining-pgoff +++ a/drivers/dax/device.c @@ -234,8 +234,8 @@ static vm_fault_t dev_dax_huge_fault(str * mapped. No need to consider the zero page, or racing * conflicting mappings. */ - pgoff = linear_page_index(vmf->vma, vmf->address - & ~(fault_size - 1)); + pgoff = linear_page_index(vmf->vma, + ALIGN(vmf->address, fault_size)); for (i = 0; i < fault_size / PAGE_SIZE; i++) { struct page *page; From patchwork Fri Jan 14 22:04:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714056 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 238E5C433FE for ; Fri, 14 Jan 2022 22:04:33 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A1A666B00BD; Fri, 14 Jan 2022 17:04:32 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 9A0D56B00BF; Fri, 14 Jan 2022 17:04:32 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 869166B00C0; Fri, 14 Jan 2022 17:04:32 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0081.hostedemail.com [216.40.44.81]) by kanga.kvack.org (Postfix) with ESMTP id 7311F6B00BD for ; Fri, 14 Jan 2022 17:04:32 
-0500 (EST) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 3712B80B7FD2 for ; Fri, 14 Jan 2022 22:04:32 +0000 (UTC) X-FDA: 79030272384.12.1BF0066 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf08.hostedemail.com (Postfix) with ESMTP id DCD8D160002 for ; Fri, 14 Jan 2022 22:04:31 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 3EC4461FEE; Fri, 14 Jan 2022 22:04:31 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 19962C36AEC; Fri, 14 Jan 2022 22:04:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197870; bh=8nC/w8mWUpbJk/PkjV0z0DwCUUddZmUk1HY3vqhKk5E=; h=Date:From:To:Subject:In-Reply-To:From; b=N1NINCSpbM0rBaQ0WEgO2zNyzJKP34ACLR6XPXK9voptACX5ewoNI3XDGboZYMDo6 StTDD66j/4BZea93Ws3aq4HNn2ENmgazdYEwcYXsMl5DwlaLTBfInUhQCl5OtxVwfj +fDtuKC2lznpk65tjk66A4yr4Ge3EigJs3NfSNR0= Date: Fri, 14 Jan 2022 14:04:29 -0800 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, dan.j.williams@intel.com, dave.jiang@intel.com, hch@lst.de, jane.chu@oracle.com, jgg@nvidia.com, jgg@ziepe.ca, jhubbard@nvidia.com, joao.m.martins@oracle.com, linux-mm@kvack.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, vishal.l.verma@intel.com, willy@infradead.org Subject: [patch 030/146] device-dax: use struct_size() Message-ID: <20220114220429.X7CsCUn1o%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: DCD8D160002 X-Stat-Signature: x47ic1yi1efqat5ecg15ebz8rsb7484j Authentication-Results: imf08.hostedemail.com; dkim=pass 
header.d=linux-foundation.org header.s=korg header.b=N1NINCSp; spf=pass (imf08.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-Rspamd-Server: rspam07 X-HE-Tag: 1642197871-213164 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Joao Martins Subject: device-dax: use struct_size() Use the struct_size() helper for the size of a struct with variable array member at the end, rather than manually calculating it. Link: https://lkml.kernel.org/r/20211202204422.26777-7-joao.m.martins@oracle.com Suggested-by: Dan Williams Signed-off-by: Joao Martins Cc: Christoph Hellwig Cc: Dave Jiang Cc: Jane Chu Cc: Jason Gunthorpe Cc: Jason Gunthorpe Cc: John Hubbard Cc: Jonathan Corbet Cc: Matthew Wilcox (Oracle) Cc: Mike Kravetz Cc: Muchun Song Cc: Naoya Horiguchi Cc: Vishal Verma Signed-off-by: Andrew Morton --- drivers/dax/device.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) --- a/drivers/dax/device.c~device-dax-use-struct_size +++ a/drivers/dax/device.c @@ -404,8 +404,9 @@ int dev_dax_probe(struct dev_dax *dev_da return -EINVAL; if (!pgmap) { - pgmap = devm_kzalloc(dev, sizeof(*pgmap) + sizeof(struct range) - * (dev_dax->nr_range - 1), GFP_KERNEL); + pgmap = devm_kzalloc(dev, + struct_size(pgmap, ranges, dev_dax->nr_range - 1), + GFP_KERNEL); if (!pgmap) return -ENOMEM; pgmap->nr_range = dev_dax->nr_range; From patchwork Fri Jan 14 22:04:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714057 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 465B2C433F5 for ; Fri, 14 
Jan 2022 22:04:37 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C148A6B00BF; Fri, 14 Jan 2022 17:04:36 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id B9CA76B00C1; Fri, 14 Jan 2022 17:04:36 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A16E06B00C2; Fri, 14 Jan 2022 17:04:36 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 89E6D6B00BF for ; Fri, 14 Jan 2022 17:04:36 -0500 (EST) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 4D133181C5C52 for ; Fri, 14 Jan 2022 22:04:36 +0000 (UTC) X-FDA: 79030272552.12.7E5702F Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf15.hostedemail.com (Postfix) with ESMTP id 87A21A000B for ; Fri, 14 Jan 2022 22:04:35 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id CB58261FB7; Fri, 14 Jan 2022 22:04:34 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A175CC36AE9; Fri, 14 Jan 2022 22:04:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197874; bh=V2lIad5Y7zSL5gtcIPp33sjppMbGmmamm9Pq6I/KhFo=; h=Date:From:To:Subject:In-Reply-To:From; b=mOPcEbevv2mbiKdyOD/YLu0xTdKPlxW56v8CmqJ4Yt4qM/eotKS+Sb6LoQLHIDyoX aaYH4bHUAHRnyvEeAbibSE9hxwowFXUvrutuexEVNgmgvntO2ODUw1HUE0ditZNj8B 5binFJhe4HIetgOv+0N2EWl4ovwm4z/EYIQk10DM= Date: Fri, 14 Jan 2022 14:04:33 -0800 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, dan.j.williams@intel.com, dave.jiang@intel.com, hch@lst.de, jane.chu@oracle.com, jgg@nvidia.com, jgg@ziepe.ca, jhubbard@nvidia.com, 
joao.m.martins@oracle.com, linux-mm@kvack.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, vishal.l.verma@intel.com, willy@infradead.org Subject: [patch 031/146] device-dax: ensure dev_dax->pgmap is valid for dynamic devices Message-ID: <20220114220433.i8Bxr8AF1%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 87A21A000B X-Stat-Signature: jf17cawjyqt6wu4cjbp3q66azb4zen3y Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=mOPcEbev; spf=pass (imf15.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642197875-703350 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Joao Martins Subject: device-dax: ensure dev_dax->pgmap is valid for dynamic devices Right now, only static dax regions have a valid @pgmap pointer in their struct dev_dax; dynamic dax devices do not. In preparation for device-dax compound devmap support, make sure that the dev_dax pgmap field is set after it has been allocated and initialized. A dynamic dax device has its @pgmap allocated at probe() and managed by devm (in contrast to a static dax region, where a pgmap is provided and dax core kfrees it). So in addition to ensuring a valid @pgmap, clear the pgmap when the dynamic dax device is released, to avoid the same pgmap ranges being re-requested across multiple region device reconfigs. Add a static_dev_dax() helper and use it in dev_dax_probe() to make the initialization differences between dynamic and static regions more explicit.
While at it, consolidate the ranges initialization when we allocate the @pgmap for the dynamic dax region case. Also take the opportunity to document the differences between static and dynamic dax regions. Link: https://lkml.kernel.org/r/20211202204422.26777-8-joao.m.martins@oracle.com Suggested-by: Dan Williams Signed-off-by: Joao Martins Cc: Christoph Hellwig Cc: Dave Jiang Cc: Jane Chu Cc: Jason Gunthorpe Cc: Jason Gunthorpe Cc: John Hubbard Cc: Jonathan Corbet Cc: Matthew Wilcox (Oracle) Cc: Mike Kravetz Cc: Muchun Song Cc: Naoya Horiguchi Cc: Vishal Verma Signed-off-by: Andrew Morton --- drivers/dax/bus.c | 32 ++++++++++++++++++++++++++++++++ drivers/dax/bus.h | 1 + drivers/dax/device.c | 29 +++++++++++++++++++++-------- 3 files changed, 54 insertions(+), 8 deletions(-) --- a/drivers/dax/bus.c~device-dax-ensure-dev_dax-pgmap-is-valid-for-dynamic-devices +++ a/drivers/dax/bus.c @@ -129,11 +129,35 @@ ATTRIBUTE_GROUPS(dax_drv); static int dax_bus_match(struct device *dev, struct device_driver *drv); +/* + * Static dax regions are regions created by an external subsystem + * nvdimm where a single range is assigned. Its boundaries are by the external + * subsystem and are usually limited to one physical memory range. For example, + * for PMEM it is usually defined by NVDIMM Namespace boundaries (i.e. a + * single contiguous range) + * + * On dynamic dax regions, the assigned region can be partitioned by dax core + * into multiple subdivisions. A subdivision is represented into one + * /dev/daxN.M device composed by one or more potentially discontiguous ranges. + * + * When allocating a dax region, drivers must set whether it's static + * (IORESOURCE_DAX_STATIC). On static dax devices, the @pgmap is pre-assigned + * to dax core when calling devm_create_dev_dax(), whereas in dynamic dax + * devices it is NULL but afterwards allocated by dax core on device ->probe().
+ * Care is needed to make sure that dynamic dax devices are torn down with a + * cleared @pgmap field (see kill_dev_dax()). + */ static bool is_static(struct dax_region *dax_region) { return (dax_region->res.flags & IORESOURCE_DAX_STATIC) != 0; } +bool static_dev_dax(struct dev_dax *dev_dax) +{ + return is_static(dev_dax->region); +} +EXPORT_SYMBOL_GPL(static_dev_dax); + static u64 dev_dax_size(struct dev_dax *dev_dax) { u64 size = 0; @@ -363,6 +387,14 @@ void kill_dev_dax(struct dev_dax *dev_da kill_dax(dax_dev); unmap_mapping_range(inode->i_mapping, 0, 0, 1); + + /* + * Dynamic dax region have the pgmap allocated via dev_kzalloc() + * and thus freed by devm. Clear the pgmap to not have stale pgmap + * ranges on probe() from previous reconfigurations of region devices. + */ + if (!static_dev_dax(dev_dax)) + dev_dax->pgmap = NULL; } EXPORT_SYMBOL_GPL(kill_dev_dax); --- a/drivers/dax/bus.h~device-dax-ensure-dev_dax-pgmap-is-valid-for-dynamic-devices +++ a/drivers/dax/bus.h @@ -48,6 +48,7 @@ int __dax_driver_register(struct dax_dev __dax_driver_register(driver, THIS_MODULE, KBUILD_MODNAME) void dax_driver_unregister(struct dax_device_driver *dax_drv); void kill_dev_dax(struct dev_dax *dev_dax); +bool static_dev_dax(struct dev_dax *dev_dax); #if IS_ENABLED(CONFIG_DEV_DAX_PMEM_COMPAT) int dev_dax_probe(struct dev_dax *dev_dax); --- a/drivers/dax/device.c~device-dax-ensure-dev_dax-pgmap-is-valid-for-dynamic-devices +++ a/drivers/dax/device.c @@ -398,18 +398,34 @@ int dev_dax_probe(struct dev_dax *dev_da void *addr; int rc, i; - pgmap = dev_dax->pgmap; - if (dev_WARN_ONCE(dev, pgmap && dev_dax->nr_range > 1, - "static pgmap / multi-range device conflict\n")) - return -EINVAL; + if (static_dev_dax(dev_dax)) { + if (dev_dax->nr_range > 1) { + dev_warn(dev, + "static pgmap / multi-range device conflict\n"); + return -EINVAL; + } + + pgmap = dev_dax->pgmap; + } else { + if (dev_dax->pgmap) { + dev_warn(dev, + "dynamic-dax with pre-populated page map\n"); + return -EINVAL; + 
} - if (!pgmap) { pgmap = devm_kzalloc(dev, struct_size(pgmap, ranges, dev_dax->nr_range - 1), GFP_KERNEL); if (!pgmap) return -ENOMEM; + pgmap->nr_range = dev_dax->nr_range; + dev_dax->pgmap = pgmap; + + for (i = 0; i < dev_dax->nr_range; i++) { + struct range *range = &dev_dax->ranges[i].range; + pgmap->ranges[i] = *range; + } } for (i = 0; i < dev_dax->nr_range; i++) { @@ -421,9 +437,6 @@ int dev_dax_probe(struct dev_dax *dev_da i, range->start, range->end); return -EBUSY; } - /* don't update the range for static pgmap */ - if (!dev_dax->pgmap) - pgmap->ranges[i] = *range; } pgmap->type = MEMORY_DEVICE_GENERIC; From patchwork Fri Jan 14 22:04:36 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714058 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6CED7C433F5 for ; Fri, 14 Jan 2022 22:04:41 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E85336B00C1; Fri, 14 Jan 2022 17:04:40 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id E0E7E6B00C3; Fri, 14 Jan 2022 17:04:40 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C87EB6B00C4; Fri, 14 Jan 2022 17:04:40 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0208.hostedemail.com [216.40.44.208]) by kanga.kvack.org (Postfix) with ESMTP id B4CDA6B00C1 for ; Fri, 14 Jan 2022 17:04:40 -0500 (EST) Received: from smtpin27.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 7FC64181B7E25 for ; Fri, 14 Jan 2022 22:04:40 +0000 (UTC) X-FDA: 79030272720.27.FDD8AD0 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by 
imf17.hostedemail.com (Postfix) with ESMTP id F24C040004 for ; Fri, 14 Jan 2022 22:04:39 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 05EB9B8262E; Fri, 14 Jan 2022 22:04:39 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 30386C36AE9; Fri, 14 Jan 2022 22:04:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197877; bh=cP2M1wf3ft2auD5+qyxpoqWOHWWusT3QFg35h1hSUBA=; h=Date:From:To:Subject:In-Reply-To:From; b=Z/inzFrtdOULFOfeCQ1dgtvJUgYlUXnIaqF10WchNNgTNl8B0FMclMU28VdvUeRkA b0xGY9Ziv3/lfIZWx9nTieglBDF8XLCoX3LdfDtGoYsijQAhSoBzWIqUI9LmDIb5rA /QJ1ilSoytOHz03cAbT57/X7hXl9RFOYJj7BjE8k= Date: Fri, 14 Jan 2022 14:04:36 -0800 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, dan.j.williams@intel.com, dave.jiang@intel.com, hch@lst.de, jane.chu@oracle.com, jgg@nvidia.com, jgg@ziepe.ca, jhubbard@nvidia.com, joao.m.martins@oracle.com, linux-mm@kvack.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, vishal.l.verma@intel.com, willy@infradead.org Subject: [patch 032/146] device-dax: factor out page mapping initialization Message-ID: <20220114220436.7rH5EsVGE%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="Z/inzFrt"; dmarc=none; spf=pass (imf17.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Stat-Signature: krnj8b9ku8ny94f1zx3hp9a75a96asdj X-Rspamd-Queue-Id: F24C040004 X-Rspamd-Server: rspam12 X-HE-Tag: 1642197879-710888 X-Bogosity: Ham, 
tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Joao Martins Subject: device-dax: factor out page mapping initialization Move initialization of page->mapping into a separate helper. This is in preparation to move the mapping set to be prior to inserting the page table entry and also for tidying up compound page handling into one helper. Link: https://lkml.kernel.org/r/20211202204422.26777-9-joao.m.martins@oracle.com Signed-off-by: Joao Martins Cc: Christoph Hellwig Cc: Dan Williams Cc: Dave Jiang Cc: Jane Chu Cc: Jason Gunthorpe Cc: Jason Gunthorpe Cc: John Hubbard Cc: Jonathan Corbet Cc: Matthew Wilcox (Oracle) Cc: Mike Kravetz Cc: Muchun Song Cc: Naoya Horiguchi Cc: Vishal Verma Signed-off-by: Andrew Morton --- drivers/dax/device.c | 45 ++++++++++++++++++++--------------------- 1 file changed, 23 insertions(+), 22 deletions(-) --- a/drivers/dax/device.c~device-dax-factor-out-page-mapping-initialization +++ a/drivers/dax/device.c @@ -73,6 +73,27 @@ __weak phys_addr_t dax_pgoff_to_phys(str return -1; } +static void dax_set_mapping(struct vm_fault *vmf, pfn_t pfn, + unsigned long fault_size) +{ + unsigned long i, nr_pages = fault_size / PAGE_SIZE; + struct file *filp = vmf->vma->vm_file; + pgoff_t pgoff; + + pgoff = linear_page_index(vmf->vma, + ALIGN(vmf->address, fault_size)); + + for (i = 0; i < nr_pages; i++) { + struct page *page = pfn_to_page(pfn_t_to_pfn(pfn) + i); + + if (page->mapping) + continue; + + page->mapping = filp->f_mapping; + page->index = pgoff + i; + } +} + static vm_fault_t __dev_dax_pte_fault(struct dev_dax *dev_dax, struct vm_fault *vmf, pfn_t *pfn) { @@ -224,28 +245,8 @@ static vm_fault_t dev_dax_huge_fault(str rc = VM_FAULT_SIGBUS; } - if (rc == VM_FAULT_NOPAGE) { - unsigned long i; - pgoff_t pgoff; - - /* - * In the device-dax case the only possibility for a - * VM_FAULT_NOPAGE result is when device-dax capacity is - * mapped. 
No need to consider the zero page, or racing - * conflicting mappings. - */ - pgoff = linear_page_index(vmf->vma, - ALIGN(vmf->address, fault_size)); - for (i = 0; i < fault_size / PAGE_SIZE; i++) { - struct page *page; - - page = pfn_to_page(pfn_t_to_pfn(pfn) + i); - if (page->mapping) - continue; - page->mapping = filp->f_mapping; - page->index = pgoff + i; - } - } + if (rc == VM_FAULT_NOPAGE) + dax_set_mapping(vmf, pfn, fault_size); dax_read_unlock(id); return rc; From patchwork Fri Jan 14 22:04:40 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714059 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 21728C433F5 for ; Fri, 14 Jan 2022 22:04:45 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A8E046B00C3; Fri, 14 Jan 2022 17:04:44 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 9EF656B00C5; Fri, 14 Jan 2022 17:04:44 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 841F26B00C6; Fri, 14 Jan 2022 17:04:44 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0136.hostedemail.com [216.40.44.136]) by kanga.kvack.org (Postfix) with ESMTP id 6E1136B00C3 for ; Fri, 14 Jan 2022 17:04:44 -0500 (EST) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 343C38F87C for ; Fri, 14 Jan 2022 22:04:44 +0000 (UTC) X-FDA: 79030272888.26.FBF3AE2 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf11.hostedemail.com (Postfix) with ESMTP id B214440003 for ; Fri, 14 Jan 2022 22:04:43 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) 
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 9EE0CB8260F; Fri, 14 Jan 2022 22:04:42 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C6925C36AE9; Fri, 14 Jan 2022 22:04:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197881; bh=8qEKIm0DQ7xfXek0F6iDSdibkKFhjySTVuu4TSBpeck=; h=Date:From:To:Subject:In-Reply-To:From; b=U5cAeLudma/AUSiCJCiDJD0LOc2YJ63SyXn8kpojJCUcIV2MvsZr54pWPG9axV0KT jmer/Pqp+QaqTN8wPm+AOqVnoR1X135v4Uc8JO3pxrJejgFhrYKPwjPFUACgHvgv9Q OA8e28hzXZPiCveDJyx5inUOh0vvxiyLB/mZFbSM= Date: Fri, 14 Jan 2022 14:04:40 -0800 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, dan.j.williams@intel.com, dave.jiang@intel.com, hch@lst.de, jane.chu@oracle.com, jgg@nvidia.com, jgg@ziepe.ca, jhubbard@nvidia.com, joao.m.martins@oracle.com, linux-mm@kvack.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, vishal.l.verma@intel.com, willy@infradead.org Subject: [patch 033/146] device-dax: set mapping prior to vmf_insert_pfn{,_pmd,pud}() Message-ID: <20220114220440.ft_KpKrLo%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=U5cAeLud; dmarc=none; spf=pass (imf11.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Stat-Signature: jhqtqskjgy1unqnmhbi95cm1j9iazt47 X-Rspamd-Queue-Id: B214440003 X-Rspamd-Server: rspam12 X-HE-Tag: 1642197883-763380 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Joao Martins 
Subject: device-dax: set mapping prior to vmf_insert_pfn{,_pmd,pud}() Normally, the @page mapping is set prior to inserting the page into a page table entry. Make device-dax adhere to the same ordering, rather than setting mapping after the PTE is inserted. The address_space never changes and it is always associated with the same inode and underlying pages. So, the page mapping is set once but cleared when the struct pages are removed/freed (i.e. after {devm_}memunmap_pages()). Link: https://lkml.kernel.org/r/20211202204422.26777-10-joao.m.martins@oracle.com Suggested-by: Jason Gunthorpe Signed-off-by: Joao Martins Cc: Christoph Hellwig Cc: Dan Williams Cc: Dave Jiang Cc: Jane Chu Cc: Jason Gunthorpe Cc: John Hubbard Cc: Jonathan Corbet Cc: Matthew Wilcox (Oracle) Cc: Mike Kravetz Cc: Muchun Song Cc: Naoya Horiguchi Cc: Vishal Verma Signed-off-by: Andrew Morton --- drivers/dax/device.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) --- a/drivers/dax/device.c~device-dax-set-mapping-prior-to-vmf_insert_pfn_pmdpud +++ a/drivers/dax/device.c @@ -121,6 +121,8 @@ static vm_fault_t __dev_dax_pte_fault(st *pfn = phys_to_pfn_t(phys, PFN_DEV|PFN_MAP); + dax_set_mapping(vmf, *pfn, fault_size); + return vmf_insert_mixed(vmf->vma, vmf->address, *pfn); } @@ -161,6 +163,8 @@ static vm_fault_t __dev_dax_pmd_fault(st *pfn = phys_to_pfn_t(phys, PFN_DEV|PFN_MAP); + dax_set_mapping(vmf, *pfn, fault_size); + return vmf_insert_pfn_pmd(vmf, *pfn, vmf->flags & FAULT_FLAG_WRITE); } @@ -203,6 +207,8 @@ static vm_fault_t __dev_dax_pud_fault(st *pfn = phys_to_pfn_t(phys, PFN_DEV|PFN_MAP); + dax_set_mapping(vmf, *pfn, fault_size); + return vmf_insert_pfn_pud(vmf, *pfn, vmf->flags & FAULT_FLAG_WRITE); } #else @@ -217,7 +223,6 @@ static vm_fault_t dev_dax_huge_fault(str enum page_entry_size pe_size) { struct file *filp = vmf->vma->vm_file; - unsigned long fault_size; vm_fault_t rc = VM_FAULT_SIGBUS; int id; pfn_t pfn; @@ -230,23 +235,18 @@ static vm_fault_t 
dev_dax_huge_fault(str id = dax_read_lock(); switch (pe_size) { case PE_SIZE_PTE: - fault_size = PAGE_SIZE; rc = __dev_dax_pte_fault(dev_dax, vmf, &pfn); break; case PE_SIZE_PMD: - fault_size = PMD_SIZE; rc = __dev_dax_pmd_fault(dev_dax, vmf, &pfn); break; case PE_SIZE_PUD: - fault_size = PUD_SIZE; rc = __dev_dax_pud_fault(dev_dax, vmf, &pfn); break; default: rc = VM_FAULT_SIGBUS; } - if (rc == VM_FAULT_NOPAGE) - dax_set_mapping(vmf, pfn, fault_size); dax_read_unlock(id); return rc; From patchwork Fri Jan 14 22:04:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714060 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9308FC433F5 for ; Fri, 14 Jan 2022 22:04:48 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 229356B00C5; Fri, 14 Jan 2022 17:04:48 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 1B1626B00C7; Fri, 14 Jan 2022 17:04:48 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 051E46B00C8; Fri, 14 Jan 2022 17:04:48 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0219.hostedemail.com [216.40.44.219]) by kanga.kvack.org (Postfix) with ESMTP id E75AD6B00C5 for ; Fri, 14 Jan 2022 17:04:47 -0500 (EST) Received: from smtpin30.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id AB96780F9B3C for ; Fri, 14 Jan 2022 22:04:47 +0000 (UTC) X-FDA: 79030273014.30.433D239 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf10.hostedemail.com (Postfix) with ESMTP id 39CD8C0002 for ; Fri, 14 Jan 2022 22:04:47 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org 
[52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 40244B8262F; Fri, 14 Jan 2022 22:04:46 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5F858C36AE9; Fri, 14 Jan 2022 22:04:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197885; bh=fHFUYb1fwx62u3tgrkOysJKXx9pbu4Mcnp2cF0Nnpx4=; h=Date:From:To:Subject:In-Reply-To:From; b=xp8R+iNn9etW6hfxmhoIIa+UjJi+bepAwSlOO2Wdvu33H19vbF66djvkI5k3cYxJ+ BJV3NDvlaZHBMzTlvRS7qTCnm57TUCBJuV4RPuqz0RLTyYB3K8WJnqLs08PW+3BQOF /miTdJpErE4ao0U+XC4ehv33RBRgMKn7oHacYObE= Date: Fri, 14 Jan 2022 14:04:43 -0800 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, dan.j.williams@intel.com, dave.jiang@intel.com, hch@lst.de, jane.chu@oracle.com, jgg@nvidia.com, jgg@ziepe.ca, jhubbard@nvidia.com, joao.m.martins@oracle.com, linux-mm@kvack.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, vishal.l.verma@intel.com, willy@infradead.org Subject: [patch 034/146] device-dax: remove pfn from __dev_dax_{pte,pmd,pud}_fault() Message-ID: <20220114220443.oOnsq2CFf%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam11 X-Rspamd-Queue-Id: 39CD8C0002 X-Stat-Signature: 7bxhmiwhdfx17coueyj4zwpq517z3gwm Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=xp8R+iNn; dmarc=none; spf=pass (imf10.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1642197887-289742 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: 
From: Joao Martins Subject: device-dax: remove pfn from __dev_dax_{pte,pmd,pud}_fault() After moving the page mapping to be set prior to pte insertion, the pfn in dev_dax_huge_fault() no longer is necessary. Remove it, as well as the @pfn argument passed to the internal fault handler helpers. [akpm@linux-foundation.org: fix CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=n build] Link: https://lkml.kernel.org/r/20211202204422.26777-11-joao.m.martins@oracle.com Signed-off-by: Joao Martins Suggested-by: Christoph Hellwig Cc: Dan Williams Cc: Dave Jiang Cc: Jane Chu Cc: Jason Gunthorpe Cc: Jason Gunthorpe Cc: John Hubbard Cc: Jonathan Corbet Cc: Matthew Wilcox (Oracle) Cc: Mike Kravetz Cc: Muchun Song Cc: Naoya Horiguchi Cc: Vishal Verma Signed-off-by: Andrew Morton --- drivers/dax/device.c | 36 +++++++++++++++++++----------------- 1 file changed, 19 insertions(+), 17 deletions(-) --- a/drivers/dax/device.c~device-dax-remove-pfn-from-__dev_dax_ptepmdpud_fault +++ a/drivers/dax/device.c @@ -95,10 +95,11 @@ static void dax_set_mapping(struct vm_fa } static vm_fault_t __dev_dax_pte_fault(struct dev_dax *dev_dax, - struct vm_fault *vmf, pfn_t *pfn) + struct vm_fault *vmf) { struct device *dev = &dev_dax->dev; phys_addr_t phys; + pfn_t pfn; unsigned int fault_size = PAGE_SIZE; if (check_vma(dev_dax, vmf->vma, __func__)) @@ -119,20 +120,21 @@ static vm_fault_t __dev_dax_pte_fault(st return VM_FAULT_SIGBUS; } - *pfn = phys_to_pfn_t(phys, PFN_DEV|PFN_MAP); + pfn = phys_to_pfn_t(phys, PFN_DEV|PFN_MAP); - dax_set_mapping(vmf, *pfn, fault_size); + dax_set_mapping(vmf, pfn, fault_size); - return vmf_insert_mixed(vmf->vma, vmf->address, *pfn); + return vmf_insert_mixed(vmf->vma, vmf->address, pfn); } static vm_fault_t __dev_dax_pmd_fault(struct dev_dax *dev_dax, - struct vm_fault *vmf, pfn_t *pfn) + struct vm_fault *vmf) { unsigned long pmd_addr = vmf->address & PMD_MASK; struct device *dev = &dev_dax->dev; phys_addr_t phys; pgoff_t pgoff; + pfn_t pfn; unsigned int fault_size = 
PMD_SIZE; if (check_vma(dev_dax, vmf->vma, __func__)) @@ -161,21 +163,22 @@ static vm_fault_t __dev_dax_pmd_fault(st return VM_FAULT_SIGBUS; } - *pfn = phys_to_pfn_t(phys, PFN_DEV|PFN_MAP); + pfn = phys_to_pfn_t(phys, PFN_DEV|PFN_MAP); - dax_set_mapping(vmf, *pfn, fault_size); + dax_set_mapping(vmf, pfn, fault_size); - return vmf_insert_pfn_pmd(vmf, *pfn, vmf->flags & FAULT_FLAG_WRITE); + return vmf_insert_pfn_pmd(vmf, pfn, vmf->flags & FAULT_FLAG_WRITE); } #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD static vm_fault_t __dev_dax_pud_fault(struct dev_dax *dev_dax, - struct vm_fault *vmf, pfn_t *pfn) + struct vm_fault *vmf) { unsigned long pud_addr = vmf->address & PUD_MASK; struct device *dev = &dev_dax->dev; phys_addr_t phys; pgoff_t pgoff; + pfn_t pfn; unsigned int fault_size = PUD_SIZE; @@ -205,15 +208,15 @@ static vm_fault_t __dev_dax_pud_fault(st return VM_FAULT_SIGBUS; } - *pfn = phys_to_pfn_t(phys, PFN_DEV|PFN_MAP); + pfn = phys_to_pfn_t(phys, PFN_DEV|PFN_MAP); - dax_set_mapping(vmf, *pfn, fault_size); + dax_set_mapping(vmf, pfn, fault_size); - return vmf_insert_pfn_pud(vmf, *pfn, vmf->flags & FAULT_FLAG_WRITE); + return vmf_insert_pfn_pud(vmf, pfn, vmf->flags & FAULT_FLAG_WRITE); } #else static vm_fault_t __dev_dax_pud_fault(struct dev_dax *dev_dax, - struct vm_fault *vmf, pfn_t *pfn) + struct vm_fault *vmf) { return VM_FAULT_FALLBACK; } @@ -225,7 +228,6 @@ static vm_fault_t dev_dax_huge_fault(str struct file *filp = vmf->vma->vm_file; vm_fault_t rc = VM_FAULT_SIGBUS; int id; - pfn_t pfn; struct dev_dax *dev_dax = filp->private_data; dev_dbg(&dev_dax->dev, "%s: %s (%#lx - %#lx) size = %d\n", current->comm, @@ -235,13 +237,13 @@ static vm_fault_t dev_dax_huge_fault(str id = dax_read_lock(); switch (pe_size) { case PE_SIZE_PTE: - rc = __dev_dax_pte_fault(dev_dax, vmf, &pfn); + rc = __dev_dax_pte_fault(dev_dax, vmf); break; case PE_SIZE_PMD: - rc = __dev_dax_pmd_fault(dev_dax, vmf, &pfn); + rc = __dev_dax_pmd_fault(dev_dax, vmf); break; case PE_SIZE_PUD: - 
rc = __dev_dax_pud_fault(dev_dax, vmf, &pfn); + rc = __dev_dax_pud_fault(dev_dax, vmf); break; default: rc = VM_FAULT_SIGBUS; From patchwork Fri Jan 14 22:04:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714061 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5B100C433EF for ; Fri, 14 Jan 2022 22:04:52 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D733F6B00C7; Fri, 14 Jan 2022 17:04:51 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id CFBF56B00C9; Fri, 14 Jan 2022 17:04:51 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B9CE36B00CA; Fri, 14 Jan 2022 17:04:51 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0218.hostedemail.com [216.40.44.218]) by kanga.kvack.org (Postfix) with ESMTP id A60776B00C7 for ; Fri, 14 Jan 2022 17:04:51 -0500 (EST) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 564C7181B7E25 for ; Fri, 14 Jan 2022 22:04:51 +0000 (UTC) X-FDA: 79030273182.06.E29FDED Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf31.hostedemail.com (Postfix) with ESMTP id EB56520003 for ; Fri, 14 Jan 2022 22:04:50 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id C1C28B8260F; Fri, 14 Jan 2022 22:04:49 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id F3652C36AE5; Fri, 14 Jan 2022 22:04:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197888; bh=ed71DndyqrKIKDTVCRds4jSThgSiLtT+AJBB5N2qXeM=; h=Date:From:To:Subject:In-Reply-To:From; b=IfiRUp8tVcq0mjrVFvWXddXIIV8MN7alk/BxqMyRGw34WTGsTVpDKMsk5ql2S3DXe Owa2bKEixAdhmiHkwxGvRU3SBtJhd7rnRGT1k/So85J9Pe0u4fJdAJ5x+yxpueZw6R BGMrK4jd3VpgHoD8Z4H+uzWGTWHGLz1+A4vKix0w= Date: Fri, 14 Jan 2022 14:04:47 -0800 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, dan.j.williams@intel.com, dave.jiang@intel.com, hch@lst.de, jane.chu@oracle.com, jgg@nvidia.com, jgg@ziepe.ca, jhubbard@nvidia.com, joao.m.martins@oracle.com, linux-mm@kvack.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, vishal.l.verma@intel.com, willy@infradead.org Subject: [patch 035/146] device-dax: compound devmap support Message-ID: <20220114220447.U7zzozxTZ%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: EB56520003 X-Stat-Signature: iuswd8z3opkf1r8s1mqan6c9orm6gu1x Authentication-Results: imf31.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=IfiRUp8t; dmarc=none; spf=pass (imf31.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-HE-Tag: 1642197890-347466 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Joao Martins Subject: device-dax: compound devmap support Use the newly added compound devmap facility, which maps the assigned dax ranges as compound pages at a page size of @align. dax devices are created with a fixed @align (huge page size), which is also enforced at mmap() of the device.
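As a worked example of how @align could translate into a compound page order (the diff in this patch computes pgmap->vmemmap_shift with order_base_2()), here is a minimal userspace sketch. It assumes 4 KiB base pages; the toy_* names are hypothetical stand-ins and not kernel APIs.

```c
/* Assumption: 4 KiB base pages. The kernel's PAGE_SHIFT is per-architecture. */
#define TOY_PAGE_SHIFT 12UL
#define TOY_PAGE_SIZE  (1UL << TOY_PAGE_SHIFT)

/*
 * Hypothetical stand-in for order_base_2(): the smallest order such that
 * (1 << order) >= n, with 0 returned for n of 0 or 1.
 */
static unsigned int toy_order_base_2(unsigned long n)
{
	unsigned int order = 0;

	while ((1UL << order) < n)
		order++;
	return order;
}

/*
 * Compound page order for a device alignment given in bytes.
 * An order of 0 means the range is mapped with base pages only.
 */
static unsigned int toy_vmemmap_shift(unsigned long align)
{
	if (align <= TOY_PAGE_SIZE)
		return 0;
	return toy_order_base_2(align >> TOY_PAGE_SHIFT);
}
```

Under these assumptions a 2M @align gives order 9 (512 base pages per compound page) and a 1G @align gives order 18.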
Consequently, faults also happen at the @align specified at creation, and that does not change throughout the dax device's lifetime. MCEs unmap a whole dax huge page, and splits likewise occur at the configured page size. Performance measured by gup_test improves considerably for unpin_user_pages() and altmap with NVDIMMs: $ gup_test -f /dev/dax1.0 -m 16384 -r 10 -S -a -n 512 -w (pin_user_pages_fast 2M pages) put:~71 ms -> put:~22 ms [altmap] (pin_user_pages_fast 2M pages) get:~524ms put:~525 ms -> get: ~127ms put:~71ms $ gup_test -f /dev/dax1.0 -m 129022 -r 10 -S -a -n 512 -w (pin_user_pages_fast 2M pages) put:~513 ms -> put:~188 ms [altmap with -m 127004] (pin_user_pages_fast 2M pages) get:~4.1 secs put:~4.12 secs -> get:~1sec put:~563ms .. as well as unpin_user_page_range_dirty_lock() being just as effective as THP/hugetlb[0] pages. [0] https://lore.kernel.org/linux-mm/20210212130843.13865-5-joao.m.martins@oracle.com/ Link: https://lkml.kernel.org/r/20211202204422.26777-12-joao.m.martins@oracle.com Signed-off-by: Joao Martins Reviewed-by: Dan Williams Cc: Christoph Hellwig Cc: Dave Jiang Cc: Jane Chu Cc: Jason Gunthorpe Cc: Jason Gunthorpe Cc: John Hubbard Cc: Jonathan Corbet Cc: Matthew Wilcox (Oracle) Cc: Mike Kravetz Cc: Muchun Song Cc: Naoya Horiguchi Cc: Vishal Verma Signed-off-by: Andrew Morton --- drivers/dax/device.c | 9 +++++++++ 1 file changed, 9 insertions(+) --- a/drivers/dax/device.c~device-dax-compound-devmap-support +++ a/drivers/dax/device.c @@ -78,14 +78,20 @@ static void dax_set_mapping(struct vm_fa { unsigned long i, nr_pages = fault_size / PAGE_SIZE; struct file *filp = vmf->vma->vm_file; + struct dev_dax *dev_dax = filp->private_data; pgoff_t pgoff; + /* mapping is only set on the head */ + if (dev_dax->pgmap->vmemmap_shift) + nr_pages = 1; + pgoff = linear_page_index(vmf->vma, ALIGN(vmf->address, fault_size)); for (i = 0; i < nr_pages; i++) { struct page *page = pfn_to_page(pfn_t_to_pfn(pfn) + i); + page = compound_head(page); if
(page->mapping) continue; @@ -443,6 +449,9 @@ int dev_dax_probe(struct dev_dax *dev_da } pgmap->type = MEMORY_DEVICE_GENERIC; + if (dev_dax->align > PAGE_SIZE) + pgmap->vmemmap_shift = + order_base_2(dev_dax->align >> PAGE_SHIFT); addr = devm_memremap_pages(dev, pgmap); if (IS_ERR(addr)) return PTR_ERR(addr); From patchwork Fri Jan 14 22:04:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714062 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5F4D9C433F5 for ; Fri, 14 Jan 2022 22:04:55 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E11E76B00C9; Fri, 14 Jan 2022 17:04:54 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id D98E46B00CB; Fri, 14 Jan 2022 17:04:54 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C39796B00CC; Fri, 14 Jan 2022 17:04:54 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0100.hostedemail.com [216.40.44.100]) by kanga.kvack.org (Postfix) with ESMTP id AD5E26B00C9 for ; Fri, 14 Jan 2022 17:04:54 -0500 (EST) Received: from smtpin15.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 7D804181B7E25 for ; Fri, 14 Jan 2022 22:04:54 +0000 (UTC) X-FDA: 79030273308.15.D686ABE Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf02.hostedemail.com (Postfix) with ESMTP id 0D3208000A for ; Fri, 14 Jan 2022 22:04:53 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 
0A67FB8262E; Fri, 14 Jan 2022 22:04:53 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7CF9EC36AE5; Fri, 14 Jan 2022 22:04:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197891; bh=K7d9/zwteFw3tKgifwWd6u4KHcJSfaDZ0dTFWzZcAzs=; h=Date:From:To:Subject:In-Reply-To:From; b=sYsVEv2joorPpttaqduymzcKWFl4I3EzBMZvKoehCv7p7gxhVd0//5FB+3G+8NF51 NlpkAl68IR1y6Uh+XxWuQ6FTeRXFuANcJP7KoC6E2MEa3XGgMD0ht6cUalQ/LsrJyJ m9Eb2C8j/XlS2oS5Ud6TTXdfzopbjwCvOukLViMQ= Date: Fri, 14 Jan 2022 14:04:51 -0800 From: Andrew Morton To: akpm@linux-foundation.org, andreyknvl@gmail.com, dvyukov@google.com, elver@google.com, glider@google.com, kaiwan.billimoria@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, ryabinin.a.a@gmail.com, torvalds@linux-foundation.org Subject: [patch 036/146] kasan: test: add globals left-out-of-bounds test Message-ID: <20220114220451.BSo5QR8et%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 0D3208000A X-Stat-Signature: nyf6nqftizfsy3mz6wx16s5szhw3kt1x Authentication-Results: imf02.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=sYsVEv2j; dmarc=none; spf=pass (imf02.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-HE-Tag: 1642197893-453386 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Marco Elver Subject: kasan: test: add globals left-out-of-bounds test Add a test checking that KASAN generic can also detect out-of-bounds accesses to the left of globals. Unfortunately it seems that GCC doesn't catch this (tested GCC 10, 11). 
The main difference between GCC's globals redzoning and Clang's is that GCC relies on increased alignment to produce padding, whereas Clang's redzoning implementation actually adds real data after the global and doesn't rely on alignment to produce padding. I believe this is the main reason why GCC can't reliably catch globals out-of-bounds in this case. Given this is now a known issue, to avoid failing the whole test suite, skip this test case with GCC. Link: https://lkml.kernel.org/r/20211117130714.135656-1-elver@google.com Signed-off-by: Marco Elver Reported-by: Kaiwan N Billimoria Reviewed-by: Andrey Konovalov Cc: Alexander Potapenko Cc: Andrey Ryabinin Cc: Dmitry Vyukov Cc: Kaiwan N Billimoria Signed-off-by: Andrew Morton --- lib/test_kasan.c | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-) --- a/lib/test_kasan.c~kasan-test-add-globals-left-out-of-bounds-test +++ a/lib/test_kasan.c @@ -700,7 +700,7 @@ static void kmem_cache_bulk(struct kunit static char global_array[10]; -static void kasan_global_oob(struct kunit *test) +static void kasan_global_oob_right(struct kunit *test) { /* * Deliberate out-of-bounds access. To prevent CONFIG_UBSAN_LOCAL_BOUNDS @@ -723,6 +723,20 @@ static void kasan_global_oob(struct kuni KUNIT_EXPECT_KASAN_FAIL(test, *(volatile char *)p); } +static void kasan_global_oob_left(struct kunit *test) +{ + char *volatile array = global_array; + char *p = array - 3; + + /* + * GCC is known to fail this test, skip it. + * See https://bugzilla.kernel.org/show_bug.cgi?id=215051. + */ + KASAN_TEST_NEEDS_CONFIG_ON(test, CONFIG_CC_IS_CLANG); + KASAN_TEST_NEEDS_CONFIG_ON(test, CONFIG_KASAN_GENERIC); + KUNIT_EXPECT_KASAN_FAIL(test, *(volatile char *)p); +} + /* Check that ksize() makes the whole object accessible.
*/ static void ksize_unpoisons_memory(struct kunit *test) { @@ -1162,7 +1176,8 @@ static struct kunit_case kasan_kunit_tes KUNIT_CASE(kmem_cache_oob), KUNIT_CASE(kmem_cache_accounted), KUNIT_CASE(kmem_cache_bulk), - KUNIT_CASE(kasan_global_oob), + KUNIT_CASE(kasan_global_oob_right), + KUNIT_CASE(kasan_global_oob_left), KUNIT_CASE(kasan_stack_oob), KUNIT_CASE(kasan_alloca_oob_left), KUNIT_CASE(kasan_alloca_oob_right), From patchwork Fri Jan 14 22:04:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714063 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id E550CC433FE for ; Fri, 14 Jan 2022 22:04:57 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 7331C6B00CB; Fri, 14 Jan 2022 17:04:57 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 6BACC6B00CD; Fri, 14 Jan 2022 17:04:57 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 55B766B00CE; Fri, 14 Jan 2022 17:04:57 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0071.hostedemail.com [216.40.44.71]) by kanga.kvack.org (Postfix) with ESMTP id 3EEFA6B00CB for ; Fri, 14 Jan 2022 17:04:57 -0500 (EST) Received: from smtpin15.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id F2C7996F0E for ; Fri, 14 Jan 2022 22:04:56 +0000 (UTC) X-FDA: 79030273392.15.D9D5902 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf14.hostedemail.com (Postfix) with ESMTP id 7B1C010000C for ; Fri, 14 Jan 2022 22:04:56 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id CF0A361FF6; Fri, 14 Jan 2022 22:04:55 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C0990C36AE9; Fri, 14 Jan 2022 22:04:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197895; bh=mkkkJ35mbtCFnAHFKOtKZhQFffsv0tDMjABiOPOBwlg=; h=Date:From:To:Subject:In-Reply-To:From; b=WsF6aaTBa7eRAoVxgyPiZp+99D5a2VdUbmWx4eJC++XLfdfRb1CAgklrXFhpfeZOC sg9wiOb6w3XCfCmRhWK9nQb8xYnO+BKFbipz07nx1+ZM7iGuldn2JSwhxu18rsZPyQ dOHOILNmGX16Oo8njYabzS+0Neb2jR8NNsDrDaN4= Date: Fri, 14 Jan 2022 14:04:54 -0800 From: Andrew Morton To: akpm@linux-foundation.org, andreyknvl@gmail.com, cl@linux.com, dvyukov@google.com, elver@google.com, glider@google.com, iamjoonsoo.kim@lge.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, penberg@kernel.org, rientjes@google.com, ryabinin.a.a@gmail.com, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 037/146] kasan: add ability to detect double-kmem_cache_destroy() Message-ID: <20220114220454._y_b0o7I7%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Stat-Signature: gedrsrsoz4n1kecp6fmn5w8ethwsdpgu Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=WsF6aaTB; dmarc=none; spf=pass (imf14.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: 7B1C010000C X-HE-Tag: 1642197896-422529 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Marco Elver Subject: kasan: add ability to detect double-kmem_cache_destroy() Because mm/slab_common.c is not instrumented with software KASAN 
modes, it is not possible to detect use-after-free of the kmem_cache passed into kmem_cache_destroy(). In particular, because of the s->refcount-- and subsequent early return if non-zero, KASAN would never be able to see the double-free via kmem_cache_free(kmem_cache, s). To be able to detect a double-kmem_cache_destroy(), check accessibility of the kmem_cache, and in case of failure return early. While KASAN_HW_TAGS is able to detect such bugs, by checking accessibility and returning early we fail more gracefully and also avoid corrupting reused objects (where tags mismatch). A recent case of a double-kmem_cache_destroy() was detected by KFENCE: https://lkml.kernel.org/r/0000000000003f654905c168b09d@google.com, which was not detectable by software KASAN modes. Link: https://lkml.kernel.org/r/20211119142219.1519617-1-elver@google.com Signed-off-by: Marco Elver Acked-by: Vlastimil Babka Reviewed-by: Andrey Konovalov Cc: Alexander Potapenko Cc: Andrey Ryabinin Cc: Christoph Lameter Cc: David Rientjes Cc: Dmitry Vyukov Cc: Joonsoo Kim Cc: Pekka Enberg Signed-off-by: Andrew Morton --- mm/slab_common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/slab_common.c~kasan-add-ability-to-detect-double-kmem_cache_destroy +++ a/mm/slab_common.c @@ -489,7 +489,7 @@ void slab_kmem_cache_release(struct kmem void kmem_cache_destroy(struct kmem_cache *s) { - if (unlikely(!s)) + if (unlikely(!s) || !kasan_check_byte(s)) return; cpus_read_lock(); From patchwork Fri Jan 14 22:04:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714064 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2FD7DC4332F for ; Fri, 14 Jan 2022 22:05:01 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 
AB4D36B00CD; Fri, 14 Jan 2022 17:05:00 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id A638A6B00CF; Fri, 14 Jan 2022 17:05:00 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9043D6B00D0; Fri, 14 Jan 2022 17:05:00 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0057.hostedemail.com [216.40.44.57]) by kanga.kvack.org (Postfix) with ESMTP id 7E4276B00CD for ; Fri, 14 Jan 2022 17:05:00 -0500 (EST) Received: from smtpin23.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 3F9381822D2B5 for ; Fri, 14 Jan 2022 22:05:00 +0000 (UTC) X-FDA: 79030273560.23.93C66F5 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf03.hostedemail.com (Postfix) with ESMTP id DC4C620003 for ; Fri, 14 Jan 2022 22:04:59 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 3A2F961FFA; Fri, 14 Jan 2022 22:04:59 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2DBFEC36AE5; Fri, 14 Jan 2022 22:04:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197898; bh=wqBPSDc1Kr3I6HWzV5gdSwjzmr5IF5sRGfh2s8Cn7ug=; h=Date:From:To:Subject:In-Reply-To:From; b=OCQ65dIBNAsYsgKAGcLhEOZGgKUvl41UUivXn2bzS6w+abV+K6dTFTpQf2MpKDDIf xNmBy+tBflDpHKOxer+YWOi4j6gjT9a+bU0MzaCzRKYFJv3glj2dCam+qhk12C3wnQ oDAnoc1XYrn6buudtMuLXirUFMMFQyh4Iu3PBPpE= Date: Fri, 14 Jan 2022 14:04:57 -0800 From: Andrew Morton To: akpm@linux-foundation.org, andreyknvl@gmail.com, cl@linux.com, dvyukov@google.com, elver@google.com, glider@google.com, iamjoonsoo.kim@lge.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, penberg@kernel.org, rientjes@google.com, 
ryabinin.a.a@gmail.com, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 038/146] kasan: test: add test case for double-kmem_cache_destroy() Message-ID: <20220114220457.lZG7Lp8OI%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam10 X-Rspamd-Queue-Id: DC4C620003 X-Stat-Signature: xepkfs4kw16gkgcygkgxw69k45hxnoni Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=OCQ65dIB; dmarc=none; spf=pass (imf03.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1642197899-610167 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Marco Elver Subject: kasan: test: add test case for double-kmem_cache_destroy() Add a test case for double-kmem_cache_destroy() detection. 
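The check being tested can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: the toy_* names, the fixed pool and the bool array standing in for KASAN's shadow state are hypothetical, and none of this is the kernel implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_NR_CACHES 4

/* Hypothetical stand-in for struct kmem_cache. */
struct toy_cache {
	int refcount;
};

static struct toy_cache toy_pool[TOY_NR_CACHES];
/* Hypothetical shadow state, kept outside the objects themselves. */
static bool toy_accessible[TOY_NR_CACHES];
static int toy_reports;

/* Models an accessibility check: report and return false if the object is gone. */
static bool toy_check_byte(const struct toy_cache *s)
{
	size_t idx = (size_t)(s - toy_pool);

	if (toy_accessible[idx])
		return true;
	toy_reports++;
	fprintf(stderr, "toy report: access to destroyed cache %zu\n", idx);
	return false;
}

static struct toy_cache *toy_cache_create(void)
{
	for (size_t i = 0; i < TOY_NR_CACHES; i++) {
		if (!toy_accessible[i]) {
			toy_pool[i].refcount = 1;
			toy_accessible[i] = true;
			return &toy_pool[i];
		}
	}
	return NULL;
}

/* The accessibility check runs before the refcount is touched. */
static void toy_cache_destroy(struct toy_cache *s)
{
	if (!s || !toy_check_byte(s))
		return;
	if (--s->refcount)
		return;
	toy_accessible[s - toy_pool] = false;
}

/* Destroys one cache twice and returns how many reports were raised. */
static int toy_double_destroy_reports(void)
{
	struct toy_cache *cache = toy_cache_create();

	toy_cache_destroy(cache);
	toy_cache_destroy(cache);
	return toy_reports;
}
```

In this model toy_double_destroy_reports() returns 1: the first destroy completes silently, and the second is reported and returns early, which mirrors what the new KUnit case expects from KUNIT_EXPECT_KASAN_FAIL().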
Link: https://lkml.kernel.org/r/20211119142219.1519617-2-elver@google.com Signed-off-by: Marco Elver Reviewed-by: Andrey Konovalov Cc: Andrey Ryabinin Cc: Alexander Potapenko Cc: Dmitry Vyukov Cc: Christoph Lameter Cc: Pekka Enberg Cc: David Rientjes Cc: Joonsoo Kim Cc: Vlastimil Babka Signed-off-by: Andrew Morton --- lib/test_kasan.c | 11 +++++++++++ 1 file changed, 11 insertions(+) --- a/lib/test_kasan.c~kasan-test-add-test-case-for-double-kmem_cache_destroy +++ a/lib/test_kasan.c @@ -866,6 +866,16 @@ static void kmem_cache_invalid_free(stru kmem_cache_destroy(cache); } +static void kmem_cache_double_destroy(struct kunit *test) +{ + struct kmem_cache *cache; + + cache = kmem_cache_create("test_cache", 200, 0, 0, NULL); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, cache); + kmem_cache_destroy(cache); + KUNIT_EXPECT_KASAN_FAIL(test, kmem_cache_destroy(cache)); +} + static void kasan_memchr(struct kunit *test) { char *ptr; @@ -1185,6 +1195,7 @@ static struct kunit_case kasan_kunit_tes KUNIT_CASE(ksize_uaf), KUNIT_CASE(kmem_cache_double_free), KUNIT_CASE(kmem_cache_invalid_free), + KUNIT_CASE(kmem_cache_double_destroy), KUNIT_CASE(kasan_memchr), KUNIT_CASE(kasan_memcmp), KUNIT_CASE(kasan_strings), From patchwork Fri Jan 14 22:05:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714065 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 41D64C433EF for ; Fri, 14 Jan 2022 22:05:05 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C14826B00CF; Fri, 14 Jan 2022 17:05:04 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id BC3E66B00D1; Fri, 14 Jan 2022 17:05:04 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 
A8B066B00D2; Fri, 14 Jan 2022 17:05:04 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0218.hostedemail.com [216.40.44.218]) by kanga.kvack.org (Postfix) with ESMTP id 92CA36B00CF for ; Fri, 14 Jan 2022 17:05:04 -0500 (EST) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 5B2B818229946 for ; Fri, 14 Jan 2022 22:05:04 +0000 (UTC) X-FDA: 79030273728.11.04AC952 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf01.hostedemail.com (Postfix) with ESMTP id 0A6E640002 for ; Fri, 14 Jan 2022 22:05:03 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 0B53BB82630; Fri, 14 Jan 2022 22:05:03 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 81664C36AEC; Fri, 14 Jan 2022 22:05:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197901; bh=aeiP0CVAOuboe4A0auBtbJ4KY6uC/wCHZee0BbxwxYM=; h=Date:From:To:Subject:In-Reply-To:From; b=A7+MhyY250/zaYyqloMogsYsYiswp3PoDmfdrCzmzVXEWslLBcDkns9D96XDUUGaN ytYvf+tqepS4DAPRsXRu79Q6dBUNaVvCvCnCfpgyM+YNYY7VyhLd+cUGVen5iKWEjD 2ZMa8EUGhzH+6fEhBZnphvWk0c66PMaOUPZJCISQ= Date: Fri, 14 Jan 2022 14:05:01 -0800 From: Andrew Morton To: akpm@linux-foundation.org, andreyknvl@gmail.com, andreyknvl@google.com, dvyukov@google.com, elver@google.com, glider@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, ryabinin.a.a@gmail.com, torvalds@linux-foundation.org Subject: [patch 039/146] kasan: fix quarantine conflicting with init_on_free Message-ID: <20220114220501.Y1H1AtuvM%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 0A6E640002 
X-Stat-Signature: fi81erywgeextdmd5f9aetdddtwhhkkn Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=A7+MhyY2; dmarc=none; spf=pass (imf01.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam06 X-HE-Tag: 1642197903-271604 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Andrey Konovalov Subject: kasan: fix quarantine conflicting with init_on_free KASAN's quarantine might save its metadata inside freed objects. As this happens after the memory is zeroed by the slab allocator when init_on_free is enabled, the memory coming out of quarantine is not properly zeroed. This causes lib/test_meminit.c tests to fail with Generic KASAN. Zero the metadata when the object is removed from quarantine. Link: https://lkml.kernel.org/r/2805da5df4b57138fdacd671f5d227d58950ba54.1640037083.git.andreyknvl@google.com Fixes: 6471384af2a6 ("mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options") Signed-off-by: Andrey Konovalov Reviewed-by: Marco Elver Cc: Alexander Potapenko Cc: Andrey Konovalov Cc: Dmitry Vyukov Cc: Andrey Ryabinin Signed-off-by: Andrew Morton --- mm/kasan/quarantine.c | 11 +++++++++++ 1 file changed, 11 insertions(+) --- a/mm/kasan/quarantine.c~kasan-fix-quarantine-conflicting-with-init_on_free +++ a/mm/kasan/quarantine.c @@ -132,12 +132,23 @@ static void *qlink_to_object(struct qlis static void qlink_free(struct qlist_node *qlink, struct kmem_cache *cache) { void *object = qlink_to_object(qlink, cache); + struct kasan_free_meta *meta = kasan_get_free_meta(cache, object); unsigned long flags; if (IS_ENABLED(CONFIG_SLAB)) local_irq_save(flags); /* + * If init_on_free is enabled and KASAN's free metadata is stored in + * the object, zero the metadata. 
Otherwise, the object's memory will + * not be properly zeroed, as KASAN saves the metadata after the slab + * allocator zeroes the object. + */ + if (slab_want_init_on_free(cache) && + cache->kasan_info.free_meta_offset == 0) + memzero_explicit(meta, sizeof(*meta)); + + /* * As the object now gets freed from the quarantine, assume that its * free track is no longer valid. */ From patchwork Fri Jan 14 22:05:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714066 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id BC3B3C433EF for ; Fri, 14 Jan 2022 22:05:08 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 423E46B00D1; Fri, 14 Jan 2022 17:05:08 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 3D3866B00D3; Fri, 14 Jan 2022 17:05:08 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2755E6B00D4; Fri, 14 Jan 2022 17:05:08 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0241.hostedemail.com [216.40.44.241]) by kanga.kvack.org (Postfix) with ESMTP id 135326B00D1 for ; Fri, 14 Jan 2022 17:05:08 -0500 (EST) Received: from smtpin08.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id CC7FD8EC06 for ; Fri, 14 Jan 2022 22:05:07 +0000 (UTC) X-FDA: 79030273854.08.8F3D06E Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf29.hostedemail.com (Postfix) with ESMTP id 4B62B120007 for ; Fri, 14 Jan 2022 22:05:07 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate 
requested) by ams.source.kernel.org (Postfix) with ESMTPS id 46140B825F5; Fri, 14 Jan 2022 22:05:06 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id AECACC36AE9; Fri, 14 Jan 2022 22:05:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197905; bh=JUbfp4e1LIqIT3UGacU1w3z7fU+6cnpQR3HSJYaGEPw=; h=Date:From:To:Subject:In-Reply-To:From; b=y5HRyhB6M+/7Cg7YSUP0j2gpKKFW8EjqDSUulrFBk6WsdLF9waYPQg5mUIuJvXVd0 O1bwKJIo79gPU1a4aJBNvfhafrM96Ks/IQ5dZ/WAPx6AcMOz8Tf1nhnJAFGEozZHbh jZbvD0B26inVjBgs4VUWEy4K7pO19JTIDFTRtxfU= Date: Fri, 14 Jan 2022 14:05:04 -0800 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, vbabka@suse.cz, william.kucharski@oracle.com, willy@infradead.org Subject: [patch 040/146] mm,fs: split dump_mapping() out from dump_page() Message-ID: <20220114220504.ODJjKdF8t%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 4B62B120007 X-Stat-Signature: ec6zw9xk7676uc57xqqeiqjg1yoeb7fu Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=y5HRyhB6; dmarc=none; spf=pass (imf29.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam08 X-HE-Tag: 1642197907-303609 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: "Matthew Wilcox (Oracle)" Subject: mm,fs: split dump_mapping() out from dump_page() dump_mapping() is a big chunk of dump_page(), and it'd be handy to be able to call it when we don't have a struct page. Split it out and move it to fs/inode.c. Take the opportunity to simplify some of the debug messages a little. 
Link: https://lkml.kernel.org/r/20211121121056.2870061-1-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: William Kucharski Acked-by: Michal Hocko Cc: Vlastimil Babka Signed-off-by: Andrew Morton --- fs/inode.c | 49 ++++++++++++++++++++++++++++++++++++++++ include/linux/fs.h | 1 mm/debug.c | 52 +------------------------------------------ 3 files changed, 52 insertions(+), 50 deletions(-) --- a/fs/inode.c~mmfs-split-dump_mapping-out-from-dump_page +++ a/fs/inode.c @@ -526,6 +526,55 @@ void __remove_inode_hash(struct inode *i } EXPORT_SYMBOL(__remove_inode_hash); +void dump_mapping(const struct address_space *mapping) +{ + struct inode *host; + const struct address_space_operations *a_ops; + struct hlist_node *dentry_first; + struct dentry *dentry_ptr; + struct dentry dentry; + unsigned long ino; + + /* + * If mapping is an invalid pointer, we don't want to crash + * accessing it, so probe everything depending on it carefully. + */ + if (get_kernel_nofault(host, &mapping->host) || + get_kernel_nofault(a_ops, &mapping->a_ops)) { + pr_warn("invalid mapping:%px\n", mapping); + return; + } + + if (!host) { + pr_warn("aops:%ps\n", a_ops); + return; + } + + if (get_kernel_nofault(dentry_first, &host->i_dentry.first) || + get_kernel_nofault(ino, &host->i_ino)) { + pr_warn("aops:%ps invalid inode:%px\n", a_ops, host); + return; + } + + if (!dentry_first) { + pr_warn("aops:%ps ino:%lx\n", a_ops, ino); + return; + } + + dentry_ptr = container_of(dentry_first, struct dentry, d_u.d_alias); + if (get_kernel_nofault(dentry, dentry_ptr)) { + pr_warn("aops:%ps ino:%lx invalid dentry:%px\n", + a_ops, ino, dentry_ptr); + return; + } + + /* + * if dentry is corrupted, the %pd handler may still crash, + * but it's unlikely that we reach here with a corrupt mapping + */ + pr_warn("aops:%ps ino:%lx dentry name:\"%pd\"\n", a_ops, ino, &dentry); +} + void clear_inode(struct inode *inode) { /* --- a/include/linux/fs.h~mmfs-split-dump_mapping-out-from-dump_page +++ 
a/include/linux/fs.h @@ -3152,6 +3152,7 @@ extern void unlock_new_inode(struct inod extern void discard_new_inode(struct inode *); extern unsigned int get_next_ino(void); extern void evict_inodes(struct super_block *sb); +void dump_mapping(const struct address_space *); /* * Userspace may rely on the the inode number being non-zero. For example, glibc --- a/mm/debug.c~mmfs-split-dump_mapping-out-from-dump_page +++ a/mm/debug.c @@ -112,56 +112,8 @@ static void __dump_page(struct page *pag type = "ksm "; else if (PageAnon(page)) type = "anon "; - else if (mapping) { - struct inode *host; - const struct address_space_operations *a_ops; - struct hlist_node *dentry_first; - struct dentry *dentry_ptr; - struct dentry dentry; - unsigned long ino; - - /* - * mapping can be invalid pointer and we don't want to crash - * accessing it, so probe everything depending on it carefully - */ - if (get_kernel_nofault(host, &mapping->host) || - get_kernel_nofault(a_ops, &mapping->a_ops)) { - pr_warn("failed to read mapping contents, not a valid kernel address?\n"); - goto out_mapping; - } - - if (!host) { - pr_warn("aops:%ps\n", a_ops); - goto out_mapping; - } - - if (get_kernel_nofault(dentry_first, &host->i_dentry.first) || - get_kernel_nofault(ino, &host->i_ino)) { - pr_warn("aops:%ps with invalid host inode %px\n", - a_ops, host); - goto out_mapping; - } - - if (!dentry_first) { - pr_warn("aops:%ps ino:%lx\n", a_ops, ino); - goto out_mapping; - } - - dentry_ptr = container_of(dentry_first, struct dentry, d_u.d_alias); - if (get_kernel_nofault(dentry, dentry_ptr)) { - pr_warn("aops:%ps ino:%lx with invalid dentry %px\n", - a_ops, ino, dentry_ptr); - } else { - /* - * if dentry is corrupted, the %pd handler may still - * crash, but it's unlikely that we reach here with a - * corrupted struct page - */ - pr_warn("aops:%ps ino:%lx dentry name:\"%pd\"\n", - a_ops, ino, &dentry); - } - } -out_mapping: + else if (mapping) + dump_mapping(mapping); BUILD_BUG_ON(ARRAY_SIZE(pageflag_names) 
!= __NR_PAGEFLAGS + 1); pr_warn("%sflags: %pGp%s\n", type, &head->flags, From patchwork Fri Jan 14 22:05:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714067 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 94FF3C4332F for ; Fri, 14 Jan 2022 22:05:10 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1DB266B00D3; Fri, 14 Jan 2022 17:05:10 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 18A3B6B00D5; Fri, 14 Jan 2022 17:05:10 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0791C6B00D6; Fri, 14 Jan 2022 17:05:10 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0203.hostedemail.com [216.40.44.203]) by kanga.kvack.org (Postfix) with ESMTP id DC84E6B00D3 for ; Fri, 14 Jan 2022 17:05:09 -0500 (EST) Received: from smtpin30.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 9C15D181C49C0 for ; Fri, 14 Jan 2022 22:05:09 +0000 (UTC) X-FDA: 79030273938.30.9BD99F2 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf31.hostedemail.com (Postfix) with ESMTP id 43DE220004 for ; Fri, 14 Jan 2022 22:05:09 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 8A67561FF6; Fri, 14 Jan 2022 22:05:08 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BCE30C36AEC; Fri, 14 Jan 2022 22:05:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; 
t=1642197908; bh=fYosh6t1jrY2eNo/PYk8o82juQtQ+9oPeBQ1g/Ukvow=; h=Date:From:To:Subject:In-Reply-To:From; b=dPamXJtWixS9l2Ydatp0YI0Is9BIsfIx/pRPiAKp3aNlKeKz385GkvZaNiLYgXMnv xQ4I1R1jcx61jsnknqt4HNU40fixuPmPolibAA+fQTQAeF957KoG27QVqgzxmP94Kc Ts0XqYWaEgyqsnp7cwhQ9uYijGtNHWCZENp+1sHI= Date: Fri, 14 Jan 2022 14:05:07 -0800 From: Andrew Morton To: akpm@linux-foundation.org, anshuman.khandual@arm.com, corbet@lwn.net, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 041/146] mm/debug_vm_pgtable: update comments regarding migration swap entries Message-ID: <20220114220507.FT-t6AQz5%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 43DE220004 X-Stat-Signature: o61ugeqq5i3cktm6hdmix7agc4xac7o6 Authentication-Results: imf31.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=dPamXJtW; spf=pass (imf31.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642197909-115027 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Anshuman Khandual Subject: mm/debug_vm_pgtable: update comments regarding migration swap entries Commit 4dd845b5a3e5 ("mm/swapops: rework swap entry manipulation code") changed the migration entry related helpers. Update the documentation that is kept in sync with debug_vm_pgtable() to reflect those changes.
Link: https://lkml.kernel.org/r/1641880417-24848-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual Cc: Jonathan Corbet Signed-off-by: Andrew Morton --- Documentation/vm/arch_pgtable_helpers.rst | 14 +++++++------- mm/debug_vm_pgtable.c | 4 ++-- 2 files changed, 9 insertions(+), 9 deletions(-) --- a/Documentation/vm/arch_pgtable_helpers.rst~mm-debug_vm_pgtable-update-comments-regarding-migration-swap-entries +++ a/Documentation/vm/arch_pgtable_helpers.rst @@ -247,12 +247,12 @@ SWAP Page Table Helpers | __swp_to_pmd_entry | Creates a mapped PMD from a swapped entry (arch) | +---------------------------+--------------------------------------------------+ | is_migration_entry | Tests a migration (read or write) swapped entry | -+---------------------------+--------------------------------------------------+ -| is_write_migration_entry | Tests a write migration swapped entry | -+---------------------------+--------------------------------------------------+ -| make_migration_entry_read | Converts into read migration swapped entry | -+---------------------------+--------------------------------------------------+ -| make_migration_entry | Creates a migration swapped entry (read or write)| -+---------------------------+--------------------------------------------------+ ++-------------------------------+----------------------------------------------+ +| is_writable_migration_entry | Tests a write migration swapped entry | ++-------------------------------+----------------------------------------------+ +| make_readable_migration_entry | Creates a read migration swapped entry | ++-------------------------------+----------------------------------------------+ +| make_writable_migration_entry | Creates a write migration swapped entry | ++-------------------------------+----------------------------------------------+ [1] https://lore.kernel.org/linux-mm/20181017020930.GN30832@redhat.com/ --- 
a/mm/debug_vm_pgtable.c~mm-debug_vm_pgtable-update-comments-regarding-migration-swap-entries +++ a/mm/debug_vm_pgtable.c @@ -888,8 +888,8 @@ static void __init swap_migration_tests( pr_debug("Validating swap migration\n"); /* - * make_migration_entry() expects given page to be - * locked, otherwise it stumbles upon a BUG_ON(). + * make_[readable|writable]_migration_entry() expects given page to + * be locked, otherwise it stumbles upon a BUG_ON(). */ __SetPageLocked(page); swp = make_writable_migration_entry(page_to_pfn(page)); From patchwork Fri Jan 14 22:05:10 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714068 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B9D9DC433F5 for ; Fri, 14 Jan 2022 22:05:14 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 44C8E6B00D5; Fri, 14 Jan 2022 17:05:14 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 3FB4C6B00D7; Fri, 14 Jan 2022 17:05:14 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2C3526B00D8; Fri, 14 Jan 2022 17:05:14 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 165D86B00D5 for ; Fri, 14 Jan 2022 17:05:14 -0500 (EST) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id CE8788EC06 for ; Fri, 14 Jan 2022 22:05:13 +0000 (UTC) X-FDA: 79030274106.12.94DA57A Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf21.hostedemail.com (Postfix) with ESMTP id 615081C0006 for ; Fri, 14 Jan 2022 22:05:13 +0000 (UTC) 
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 3E31AB82630; Fri, 14 Jan 2022 22:05:12 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BFDB2C36AE5; Fri, 14 Jan 2022 22:05:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197911; bh=/twb55yQS27uFy5si8zT+X4JSRlJUwALW3K1cG2Y3Jo=; h=Date:From:To:Subject:In-Reply-To:From; b=pEtNffu4lJWbtlZK1niuqiYA4dclWY47QVesHdOW+2PTUxm4nhqOjzQ83Ruh9tA/V zDHDciHRQymnMTf6+PdhpmzxvT0kErbotnK+D1Px8IBC+Uy0QCZnGYMckE0aZiNnwg PiDMBVBQvd5OVWjL0jMYaLRzQi6Zup7r27NLStDI= Date: Fri, 14 Jan 2022 14:05:10 -0800 From: Andrew Morton To: akpm@linux-foundation.org, chi.minghao@zte.com.cn, david@redhat.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, pankaj.gupta@ionos.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, zealci@zte.com.cm Subject: [patch 042/146] mm/truncate.c: remove unneeded variable Message-ID: <20220114220510.-gWU9LZq3%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 615081C0006 X-Stat-Signature: od8ckasxeijdiwyszdam7wo5dthk8g3t Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=pEtNffu4; spf=pass (imf21.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642197913-51488 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: chiminghao Subject: mm/truncate.c: remove unneeded variable Return the value directly instead of storing it in a redundant local variable first.
Link: https://lkml.kernel.org/r/20211207083222.401594-1-chi.minghao@zte.com.cn Signed-off-by: chiminghao Reported-by: Zeal Robot Reviewed-by: David Hildenbrand Reviewed-by: Pankaj Gupta Reviewed-by: Muchun Song Signed-off-by: Andrew Morton --- mm/truncate.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) --- a/mm/truncate.c~mm-remove-unneeded-variable +++ a/mm/truncate.c @@ -205,7 +205,6 @@ static void truncate_cleanup_page(struct static int invalidate_complete_page(struct address_space *mapping, struct page *page) { - int ret; if (page->mapping != mapping) return 0; @@ -213,9 +212,7 @@ invalidate_complete_page(struct address_ if (page_has_private(page) && !try_to_release_page(page, 0)) return 0; - ret = remove_mapping(mapping, page); - - return ret; + return remove_mapping(mapping, page); } int truncate_inode_page(struct address_space *mapping, struct page *page) From patchwork Fri Jan 14 22:05:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714069 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D3BE2C433F5 for ; Fri, 14 Jan 2022 22:05:17 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 5F9A56B00D7; Fri, 14 Jan 2022 17:05:17 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 5D0C46B00D9; Fri, 14 Jan 2022 17:05:17 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 497CB6B00DA; Fri, 14 Jan 2022 17:05:17 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0202.hostedemail.com [216.40.44.202]) by kanga.kvack.org (Postfix) with ESMTP id 356786B00D7 for ; Fri, 14 Jan 2022 17:05:17 -0500 (EST) Received: from smtpin24.hostedemail.com 
(10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id DE13A181CBDBB for ; Fri, 14 Jan 2022 22:05:16 +0000 (UTC) X-FDA: 79030274232.24.4BA1B7A Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf27.hostedemail.com (Postfix) with ESMTP id 5FD9140011 for ; Fri, 14 Jan 2022 22:05:15 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 9F06261FF0; Fri, 14 Jan 2022 22:05:14 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D56DBC36AE9; Fri, 14 Jan 2022 22:05:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197914; bh=JBh4/ao6a8cwJd+fW2YVjUpV2pIP7+/GDPGAhN/WGsA=; h=Date:From:To:Subject:In-Reply-To:From; b=S7y+GQ8mIDp6bw70lYmSKj5kkVdSzJYclCv2hOLSPl6H9ZPfuQ/gcmLZhWkai04a8 Z4vmFv4a4l9UIroOZ+ccJ5B8rtmEaM174YPvdyg1ByXpddOs0taUTHqGLspjgfdNpc wd7zNoao7KG31AHpznsLTSoWdGVADSZz0YvdOwAI= Date: Fri, 14 Jan 2022 14:05:13 -0800 From: Andrew Morton To: agruenba@redhat.com, akpm@linux-foundation.org, christophe.leroy@csgroup.eu, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 043/146] gup: avoid multiple user access locking/unlocking in fault_in_{read/write}able Message-ID: <20220114220513.oOei0Zeeb%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=S7y+GQ8m; dmarc=none; spf=pass (imf27.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Stat-Signature: kfr15iyjq9b6whb3i91ai4crf46bmt51 X-Rspamd-Queue-Id: 5FD9140011 X-Rspamd-Server: rspam12 X-HE-Tag: 
1642197915-920903 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Christophe Leroy Subject: gup: avoid multiple user access locking/unlocking in fault_in_{read/write}able fault_in_readable() and fault_in_writeable() perform __get_user() and __put_user() in a loop, implying multiple user access locking/unlocking. To avoid that, use user access blocks. Link: https://lkml.kernel.org/r/720dcf79314acca1a78fae56d478cc851952149d.1637084492.git.christophe.leroy@csgroup.eu Signed-off-by: Christophe Leroy Reviewed-by: Andreas Gruenbacher Signed-off-by: Andrew Morton --- mm/gup.c | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) --- a/mm/gup.c~gup-avoid-multiple-user-access-locking-unlocking-in-fault_in_read-writeable +++ a/mm/gup.c @@ -1672,21 +1672,22 @@ size_t fault_in_writeable(char __user *u if (unlikely(size == 0)) return 0; + if (!user_write_access_begin(uaddr, size)) + return size; if (!PAGE_ALIGNED(uaddr)) { - if (unlikely(__put_user(0, uaddr) != 0)) - return size; + unsafe_put_user(0, uaddr, out); uaddr = (char __user *)PAGE_ALIGN((unsigned long)uaddr); } end = (char __user *)PAGE_ALIGN((unsigned long)start + size); if (unlikely(end < start)) end = NULL; while (uaddr != end) { - if (unlikely(__put_user(0, uaddr) != 0)) - goto out; + unsafe_put_user(0, uaddr, out); uaddr += PAGE_SIZE; } out: + user_write_access_end(); if (size > uaddr - start) return size - (uaddr - start); return 0; @@ -1771,21 +1772,22 @@ size_t fault_in_readable(const char __us if (unlikely(size == 0)) return 0; + if (!user_read_access_begin(uaddr, size)) + return size; if (!PAGE_ALIGNED(uaddr)) { - if (unlikely(__get_user(c, uaddr) != 0)) - return size; + unsafe_get_user(c, uaddr, out); uaddr = (const char __user *)PAGE_ALIGN((unsigned long)uaddr); } end = (const char __user *)PAGE_ALIGN((unsigned long)start + size); if (unlikely(end < start)) end = NULL; 
while (uaddr != end) { - if (unlikely(__get_user(c, uaddr) != 0)) - goto out; + unsafe_get_user(c, uaddr, out); uaddr += PAGE_SIZE; } out: + user_read_access_end(); (void)c; if (size > uaddr - start) return size - (uaddr - start); From patchwork Fri Jan 14 22:05:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714070 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B524DC4332F for ; Fri, 14 Jan 2022 22:05:19 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 3D6C56B00D9; Fri, 14 Jan 2022 17:05:19 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 361BB6B00DB; Fri, 14 Jan 2022 17:05:19 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 201766B00DC; Fri, 14 Jan 2022 17:05:19 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0214.hostedemail.com [216.40.44.214]) by kanga.kvack.org (Postfix) with ESMTP id 05A1C6B00D9 for ; Fri, 14 Jan 2022 17:05:19 -0500 (EST) Received: from smtpin19.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id BB93798C1D for ; Fri, 14 Jan 2022 22:05:18 +0000 (UTC) X-FDA: 79030274316.19.3E52684 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf22.hostedemail.com (Postfix) with ESMTP id 6DB26C0002 for ; Fri, 14 Jan 2022 22:05:18 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id C3E6561FF0; Fri, 14 Jan 2022 22:05:17 +0000 (UTC) Received: by smtp.kernel.org (Postfix) 
with ESMTPSA id E0404C36AE9; Fri, 14 Jan 2022 22:05:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197917; bh=gN5DsMeAdSGJoe2owBHk09Ta8KpskfveIu7mEDq0Ux4=; h=Date:From:To:Subject:In-Reply-To:From; b=LCU6z3cA68spkFXerZ5eXcSjJuTkOsHZAfKiwnq9t/SEyA6skrZ2kQjZiLKQo7U56 fgGM4ydJB/pa0P5u8uUqiSyS8njJGqzTZDF6kBKpw74DRlE36ZYHVwPo4w/jdQ5OyE zovxCG71xZl7/WTSfIpHFY+QdpeWVu25QLI2fqoA= Date: Fri, 14 Jan 2022 14:05:16 -0800 From: Andrew Morton To: akpm@linux-foundation.org, kirill.shutemov@linux.intel.com, linmiaohe@huawei.com, linux-mm@kvack.org, lixinhai.lxh@gmail.com, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, ying.huang@intel.com, ziy@nvidia.com Subject: [patch 044/146] mm/gup.c: stricter check on THP migration entry during follow_pmd_mask Message-ID: <20220114220516.9_DFP1Inz%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 6DB26C0002 X-Stat-Signature: jt7ab6shbzxbutfciy4qsp5q5xa8y37m Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=LCU6z3cA; spf=pass (imf22.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642197918-301283 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Li Xinhai Subject: mm/gup.c: stricter check on THP migration entry during follow_pmd_mask When checking for a THP migration entry with BUG_ON, the existing code only covers the thp_migration_supported() case, not the !thp_migration_supported() case. If !thp_migration_supported() and !pmd_present(), the original code may loop forever in theory. To make the BUG_ON check consistent, we need to catch both cases.
Move the BUG_ON check one step earlier, because if the bug happens we should know about it regardless of whether the caller uses FOLL_MIGRATION. Because the is_pmd_migration_entry() check reads pmdval instead of *pmd, the existing code doesn't help to avoid useless locking within pmd_migration_entry_wait(), so remove that check. Link: https://lkml.kernel.org/r/20211217062559.737063-1-lixinhai.lxh@gmail.com Signed-off-by: Li Xinhai Reviewed-by: "Huang, Ying" Reviewed-by: Miaohe Lin Cc: Zi Yan Cc: "Kirill A. Shutemov" Signed-off-by: Andrew Morton --- mm/gup.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) --- a/mm/gup.c~mm-gupc-stricter-check-on-thp-migration-entry-during-follow_pmd_mask +++ a/mm/gup.c @@ -642,12 +642,17 @@ static struct page *follow_pmd_mask(stru } retry: if (!pmd_present(pmdval)) { + /* + * Should never reach here, if thp migration is not supported; + * Otherwise, it must be a thp migration entry. + */ + VM_BUG_ON(!thp_migration_supported() || + !is_pmd_migration_entry(pmdval)); + if (likely(!(flags & FOLL_MIGRATION))) return no_page_table(vma, flags); - VM_BUG_ON(thp_migration_supported() && - !is_pmd_migration_entry(pmdval)); - if (is_pmd_migration_entry(pmdval)) - pmd_migration_entry_wait(mm, pmd); + + pmd_migration_entry_wait(mm, pmd); pmdval = READ_ONCE(*pmd); /* * MADV_DONTNEED may convert the pmd to null because From patchwork Fri Jan 14 22:05:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714072 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 955FEC4332F for ; Fri, 14 Jan 2022 22:05:24 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 24F2F6B00DB; Fri, 14 Jan 2022 17:05:24 -0500 (EST) Received: by kanga.kvack.org (Postfix, from
userid 40) id 1FF8D6B00DD; Fri, 14 Jan 2022 17:05:24 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 079F06B00DE; Fri, 14 Jan 2022 17:05:24 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0146.hostedemail.com [216.40.44.146]) by kanga.kvack.org (Postfix) with ESMTP id E62AC6B00DB for ; Fri, 14 Jan 2022 17:05:23 -0500 (EST) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id AA1AD80F9B3E for ; Fri, 14 Jan 2022 22:05:23 +0000 (UTC) X-FDA: 79030274526.18.CC105B1 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf20.hostedemail.com (Postfix) with ESMTP id E4BFD1C0002 for ; Fri, 14 Jan 2022 22:05:22 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id E26A5B825F5; Fri, 14 Jan 2022 22:05:21 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 34B6DC36AEC; Fri, 14 Jan 2022 22:05:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197920; bh=8AIMNW2kktylIwIkCjAs27utNwB1FH1LY8ztkjD6O1I=; h=Date:From:To:Subject:In-Reply-To:From; b=QTN/Dn6uRPJV7k5hCQvVQ/92//NZshNWi1UEJvyfiloDkfkZvsRKpjNys7RMZ79Ed 8qht9a2kNlhrMPgOZhou+6fQ6yY/yEtsa5Gxh3C4hpxNeTaU60xv0xCzSKLIFw2WBh rHV18EWq9CTmZAgjlmDfNf8EIY2Wlg//qAZ9DmO0= Date: Fri, 14 Jan 2022 14:05:19 -0800 From: Andrew Morton To: ajaygargnsit@gmail.com, akpm@linux-foundation.org, andy.lavr@gmail.com, arnd@arndb.de, hughd@google.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, osalvador@suse.de, peterx@redhat.com, shy828301@gmail.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, willy@infradead.org 
Subject: [patch 045/146] mm: shmem: don't truncate page if memory failure happens Message-ID: <20220114220519.M1fWj4WeI%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: E4BFD1C0002 X-Stat-Signature: 3t6mnr5th8gedxuxt7t3met1i9eqpy5b Authentication-Results: imf20.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="QTN/Dn6u"; dmarc=none; spf=pass (imf20.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam08 X-HE-Tag: 1642197922-206010 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Yang Shi Subject: mm: shmem: don't truncate page if memory failure happens The current behavior of memory failure is to truncate the page cache regardless of whether the page is dirty or clean. If the page is dirty, a later access will get obsolete data from disk without any notification to the user. This may cause silent data loss. It is even worse for shmem: since shmem is an in-memory filesystem, truncating the page cache means discarding data blocks, and a later read would return all zeros. The right approach is to keep the corrupted page in the page cache, so that any later access returns an error for syscalls or SIGBUS for page faults, until the file is truncated, hole punched or removed. Regular storage-backed filesystems would be more complicated, so this patch focuses on shmem. This also unblocks support for soft offlining shmem THP.
[akpm@linux-foundation.org: coding style fixes] [arnd@arndb.de: fix uninitialized variable use in me_pagecache_clean()] Link: https://lkml.kernel.org/r/20211022064748.4173718-1-arnd@kernel.org [Fix invalid pointer dereference in shmem_read_mapping_page_gfp() with a slight different implementation from what Ajay Garg and Muchun Song proposed and reworked the error handling of shmem_write_begin() suggested by Linus] Link: https://lore.kernel.org/linux-mm/20211111084617.6746-1-ajaygargnsit@gmail.com/ Link: https://lkml.kernel.org/r/20211020210755.23964-6-shy828301@gmail.com Link: https://lkml.kernel.org/r/20211116193247.21102-1-shy828301@gmail.com Signed-off-by: Yang Shi Signed-off-by: Arnd Bergmann Cc: Hugh Dickins Cc: Kirill A. Shutemov Cc: Matthew Wilcox Cc: Naoya Horiguchi Cc: Oscar Salvador Cc: Peter Xu Cc: Ajay Garg Cc: Muchun Song Cc: Andy Lavr Signed-off-by: Andrew Morton --- mm/memory-failure.c | 14 +++++++++-- mm/shmem.c | 51 +++++++++++++++++++++++++++++++++++++----- mm/userfaultfd.c | 5 ++++ 3 files changed, 61 insertions(+), 9 deletions(-) --- a/mm/memory-failure.c~mm-shmem-dont-truncate-page-if-memory-failure-happens +++ a/mm/memory-failure.c @@ -58,6 +58,7 @@ #include #include #include +#include #include "internal.h" #include "ras/ras_event.h" @@ -867,6 +868,7 @@ static int me_pagecache_clean(struct pag { int ret; struct address_space *mapping; + bool extra_pins; delete_from_lru_cache(p); @@ -896,17 +898,23 @@ static int me_pagecache_clean(struct pag } /* + * The shmem page is kept in page cache instead of truncating + * so is expected to have an extra refcount after error-handling. + */ + extra_pins = shmem_mapping(mapping); + + /* * Truncation is a bit tricky. Enable it per file system for now. * * Open: to take i_rwsem or not for this? Right now we don't. 
*/ ret = truncate_error_page(p, page_to_pfn(p), mapping); + if (has_extra_refcount(ps, p, extra_pins)) + ret = MF_FAILED; + out: unlock_page(p); - if (has_extra_refcount(ps, p, false)) - ret = MF_FAILED; - return ret; } --- a/mm/shmem.c~mm-shmem-dont-truncate-page-if-memory-failure-happens +++ a/mm/shmem.c @@ -2457,6 +2457,7 @@ shmem_write_begin(struct file *file, str struct inode *inode = mapping->host; struct shmem_inode_info *info = SHMEM_I(inode); pgoff_t index = pos >> PAGE_SHIFT; + int ret = 0; /* i_rwsem is held by caller */ if (unlikely(info->seals & (F_SEAL_GROW | @@ -2467,7 +2468,19 @@ shmem_write_begin(struct file *file, str return -EPERM; } - return shmem_getpage(inode, index, pagep, SGP_WRITE); + ret = shmem_getpage(inode, index, pagep, SGP_WRITE); + + if (ret) + return ret; + + if (PageHWPoison(*pagep)) { + unlock_page(*pagep); + put_page(*pagep); + *pagep = NULL; + return -EIO; + } + + return 0; } static int @@ -2554,6 +2567,12 @@ static ssize_t shmem_file_read_iter(stru if (sgp == SGP_CACHE) set_page_dirty(page); unlock_page(page); + + if (PageHWPoison(page)) { + put_page(page); + error = -EIO; + break; + } } /* @@ -3093,7 +3112,8 @@ static const char *shmem_get_link(struct page = find_get_page(inode->i_mapping, 0); if (!page) return ERR_PTR(-ECHILD); - if (!PageUptodate(page)) { + if (PageHWPoison(page) || + !PageUptodate(page)) { put_page(page); return ERR_PTR(-ECHILD); } @@ -3101,6 +3121,13 @@ static const char *shmem_get_link(struct error = shmem_getpage(inode, 0, &page, SGP_READ); if (error) return ERR_PTR(error); + if (!page) + return ERR_PTR(-ECHILD); + if (PageHWPoison(page)) { + unlock_page(page); + put_page(page); + return ERR_PTR(-ECHILD); + } unlock_page(page); } set_delayed_call(done, shmem_put_link, page); @@ -3751,6 +3778,13 @@ static void shmem_destroy_inodecache(voi kmem_cache_destroy(shmem_inode_cachep); } +/* Keep the page in page cache instead of truncating it */ +static int shmem_error_remove_page(struct address_space *mapping, 
+ struct page *page) +{ + return 0; +} + const struct address_space_operations shmem_aops = { .writepage = shmem_writepage, .set_page_dirty = __set_page_dirty_no_writeback, @@ -3761,7 +3795,7 @@ const struct address_space_operations sh #ifdef CONFIG_MIGRATION .migratepage = migrate_page, #endif - .error_remove_page = generic_error_remove_page, + .error_remove_page = shmem_error_remove_page, }; EXPORT_SYMBOL(shmem_aops); @@ -4169,9 +4203,14 @@ struct page *shmem_read_mapping_page_gfp error = shmem_getpage_gfp(inode, index, &page, SGP_CACHE, gfp, NULL, NULL, NULL); if (error) - page = ERR_PTR(error); - else - unlock_page(page); + return ERR_PTR(error); + + unlock_page(page); + if (PageHWPoison(page)) { + put_page(page); + return ERR_PTR(-EIO); + } + return page; #else /* --- a/mm/userfaultfd.c~mm-shmem-dont-truncate-page-if-memory-failure-happens +++ a/mm/userfaultfd.c @@ -232,6 +232,11 @@ static int mcontinue_atomic_pte(struct m goto out; } + if (PageHWPoison(page)) { + ret = -EIO; + goto out_release; + } + ret = mfill_atomic_install_pte(dst_mm, dst_pmd, dst_vma, dst_addr, page, false, wp_copy); if (ret) From patchwork Fri Jan 14 22:05:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714071 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9A0CDC433EF for ; Fri, 14 Jan 2022 22:05:26 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 203F26B00DD; Fri, 14 Jan 2022 17:05:26 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 190D76B00DF; Fri, 14 Jan 2022 17:05:26 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 02B116B00E0; Fri, 14 Jan 2022 17:05:25 -0500 (EST) X-Delivered-To: linux-mm@kvack.org 
Received: from forelay.hostedemail.com (smtprelay0170.hostedemail.com [216.40.44.170]) by kanga.kvack.org (Postfix) with ESMTP id E22FB6B00DD for ; Fri, 14 Jan 2022 17:05:25 -0500 (EST) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id ADCAA80F9B3E for ; Fri, 14 Jan 2022 22:05:25 +0000 (UTC) X-FDA: 79030274610.18.213FE7B Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf07.hostedemail.com (Postfix) with ESMTP id 3634240003 for ; Fri, 14 Jan 2022 22:05:25 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 8D35961FF0; Fri, 14 Jan 2022 22:05:24 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id AB18FC36AE5; Fri, 14 Jan 2022 22:05:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197924; bh=UW1gO00Rf3OgyA+/XNaq+nZ6muF292ZuljiSVTpw6sI=; h=Date:From:To:Subject:In-Reply-To:From; b=j4JpGheqJ0ItsRvu1N1noyOETc4mCX/sQ0pijsmUYER1Zj6Nrp+Kzqk4XHVYxFJ7X SM1w71sQutykBkH68eyllbfSV0UNflDB2RRh8Dc68hMB852cbTYURIklYIo3exLgm0 7qRQle4nJftTxk+6r9SFTjT1/FwdvRKihR3cJcRQ= Date: Fri, 14 Jan 2022 14:05:23 -0800 From: Andrew Morton To: akpm@linux-foundation.org, hughd@google.com, kirill.shutemov@linux.intel.com, ligang.bdlg@bytedance.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, songmuchun@bytedance.com, stable@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 046/146] shmem: fix a race between shmem_unused_huge_shrink and shmem_evict_inode Message-ID: <20220114220523.T4gcXYcCD%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 3634240003 X-Stat-Signature: mf3unxrs581hw7w8bz8h1j8io8afg4uq Authentication-Results: 
imf07.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=j4JpGheq; dmarc=none; spf=pass (imf07.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam06 X-HE-Tag: 1642197925-28377 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

From: Gang Li
Subject: shmem: fix a race between shmem_unused_huge_shrink and shmem_evict_inode

Fix a data race in commit 779750d20b93 ("shmem: split huge pages beyond
i_size under memory pressure").

Here are the call traces that cause the race:

   Call Trace 1:
     shmem_unused_huge_shrink+0x3ae/0x410
     ? __list_lru_walk_one.isra.5+0x33/0x160
     super_cache_scan+0x17c/0x190
     shrink_slab.part.55+0x1ef/0x3f0
     shrink_node+0x10e/0x330
     kswapd+0x380/0x740
     kthread+0xfc/0x130
     ? mem_cgroup_shrink_node+0x170/0x170
     ? kthread_create_on_node+0x70/0x70
     ret_from_fork+0x1f/0x30

   Call Trace 2:
     shmem_evict_inode+0xd8/0x190
     evict+0xbe/0x1c0
     do_unlinkat+0x137/0x330
     do_syscall_64+0x76/0x120
     entry_SYSCALL_64_after_hwframe+0x3d/0xa2

A simple explanation:

Imagine there are 3 items in the local list (@list). In the first
traversal, A is not deleted from @list.

   1)    A->B->C
         ^
         |
         pos (leave)

In the second traversal, B is deleted from @list. Concurrently, A is
deleted from @list through shmem_evict_inode() because the last reference
to the inode is dropped by another thread. Then @list is corrupted.

   2)    A->B->C
         ^  ^
         |  |
      evict pos (drop)

We should make sure the inode is either on the global list or deleted from
any local list before iput().

Fix this by moving inodes back to the global list before we put them.
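
The ordering rule above can be illustrated with a small single-threaded
userspace sketch. This is not the kernel code: struct list_node and the
list_* helpers are simplified stand-ins (assumptions made for illustration)
for struct list_head and its helpers, locking is omitted, and the
concurrent eviction is simulated by a plain unlink.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct list_head (illustrative only). */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *head)
{
	head->prev = head;
	head->next = head;
}

static bool list_is_empty(const struct list_node *head)
{
	return head->next == head;
}

/* Unlink a node and leave it self-linked, in the spirit of list_del_init(). */
static void list_unlink_init(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	list_init(node);
}

/* Append a node at the tail of a list. */
static void list_append(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Move a node from its current list to the front of another list. */
static void list_move_to(struct list_node *node, struct list_node *head)
{
	list_unlink_init(node);
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

static size_t list_length(const struct list_node *head)
{
	size_t n = 0;

	for (const struct list_node *p = head->next; p != head; p = p->next)
		n++;
	return n;
}

/*
 * Hypothetical scenario: the local list holds A, B and C. A is moved back
 * to the global list before its reference is dropped, so a later eviction
 * of A only touches the global list. Returns the number of nodes still
 * linked on the local list, or 0 if the global list is not empty.
 */
static size_t requeue_then_evict(void)
{
	struct list_node global, local, a, b, c;

	list_init(&global);
	list_init(&local);
	list_append(&a, &local);
	list_append(&b, &local);
	list_append(&c, &local);

	/* "move_back": A leaves the local list before the reference is put. */
	list_move_to(&a, &global);

	/* Simulated eviction of A: it no longer sits on the local list. */
	list_unlink_init(&a);

	return list_is_empty(&global) ? list_length(&local) : 0;
}
```

In this model the local list still links B and C correctly after A is
evicted, because A was already off the local list when the eviction ran.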
[akpm@linux-foundation.org: coding style fixes] Link: https://lkml.kernel.org/r/20211125064502.99983-1-ligang.bdlg@bytedance.com Fixes: 779750d20b93 ("shmem: split huge pages beyond i_size under memory pressure") Signed-off-by: Gang Li Reviewed-by: Muchun Song Acked-by: Kirill A. Shutemov Cc: Hugh Dickins Cc: Signed-off-by: Andrew Morton --- mm/shmem.c | 37 +++++++++++++++++++++---------------- 1 file changed, 21 insertions(+), 16 deletions(-) --- a/mm/shmem.c~shmem-fix-a-race-between-shmem_unused_huge_shrink-and-shmem_evict_inode +++ a/mm/shmem.c @@ -554,7 +554,7 @@ static unsigned long shmem_unused_huge_s struct shmem_inode_info *info; struct page *page; unsigned long batch = sc ? sc->nr_to_scan : 128; - int removed = 0, split = 0; + int split = 0; if (list_empty(&sbinfo->shrinklist)) return SHRINK_STOP; @@ -569,7 +569,6 @@ static unsigned long shmem_unused_huge_s /* inode is about to be evicted */ if (!inode) { list_del_init(&info->shrinklist); - removed++; goto next; } @@ -577,12 +576,12 @@ static unsigned long shmem_unused_huge_s if (round_up(inode->i_size, PAGE_SIZE) == round_up(inode->i_size, HPAGE_PMD_SIZE)) { list_move(&info->shrinklist, &to_remove); - removed++; goto next; } list_move(&info->shrinklist, &list); next: + sbinfo->shrinklist_len--; if (!--batch) break; } @@ -602,7 +601,7 @@ next: inode = &info->vfs_inode; if (nr_to_split && split >= nr_to_split) - goto leave; + goto move_back; page = find_get_page(inode->i_mapping, (inode->i_size & HPAGE_PMD_MASK) >> PAGE_SHIFT); @@ -616,38 +615,44 @@ next: } /* - * Leave the inode on the list if we failed to lock - * the page at this time. + * Move the inode on the list back to shrinklist if we failed + * to lock the page at this time. * * Waiting for the lock may lead to deadlock in the * reclaim path. 
*/ if (!trylock_page(page)) { put_page(page); - goto leave; + goto move_back; } ret = split_huge_page(page); unlock_page(page); put_page(page); - /* If split failed leave the inode on the list */ + /* If split failed move the inode on the list back to shrinklist */ if (ret) - goto leave; + goto move_back; split++; drop: list_del_init(&info->shrinklist); - removed++; -leave: + goto put; +move_back: + /* + * Make sure the inode is either on the global list or deleted + * from any local list before iput() since it could be deleted + * in another thread once we put the inode (then the local list + * is corrupted). + */ + spin_lock(&sbinfo->shrinklist_lock); + list_move(&info->shrinklist, &sbinfo->shrinklist); + sbinfo->shrinklist_len++; + spin_unlock(&sbinfo->shrinklist_lock); +put: iput(inode); } - spin_lock(&sbinfo->shrinklist_lock); - list_splice_tail(&list, &sbinfo->shrinklist); - sbinfo->shrinklist_len -= removed; - spin_unlock(&sbinfo->shrinklist_lock); - return split; } From patchwork Fri Jan 14 22:05:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714073 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 722EAC433F5 for ; Fri, 14 Jan 2022 22:05:32 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id ED4176B00E1; Fri, 14 Jan 2022 17:05:31 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id E84116B00E3; Fri, 14 Jan 2022 17:05:31 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D4C426B00E2; Fri, 14 Jan 2022 17:05:31 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0226.hostedemail.com [216.40.44.226]) by kanga.kvack.org (Postfix) with ESMTP id 
C27666B00DF for ; Fri, 14 Jan 2022 17:05:31 -0500 (EST) Received: from smtpin10.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 82BE298C14 for ; Fri, 14 Jan 2022 22:05:28 +0000 (UTC) X-FDA: 79030274736.10.0C02F32 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf28.hostedemail.com (Postfix) with ESMTP id 16A5DC0008 for ; Fri, 14 Jan 2022 22:05:27 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 7DB2E61FF6; Fri, 14 Jan 2022 22:05:27 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BAAFDC36AE5; Fri, 14 Jan 2022 22:05:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197926; bh=A6orIj7zEKCwqVSN/lU708h9PEkKG+pv4FQp8bVap30=; h=Date:From:To:Subject:In-Reply-To:From; b=AZ5qYlpTZijVLdEZWCzhit5/fGinmeBPhnN4dBLAwG3jvetQYB/4j/3pysm0u3cUv 5cueOj4F88qZseM580hGBsObrDAXncZmpmsZ1S/dsPle+Hw9MoQwO+RBmFCrWbX6nw zf58UrTar/O6zwgeLV1PvJaQFY0LlhkSXTAS59KU= Date: Fri, 14 Jan 2022 14:05:26 -0800 From: Andrew Morton To: akpm@linux-foundation.org, christophe.jaillet@wanadoo.fr, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 047/146] mm/frontswap.c: use non-atomic '__set_bit()' when possible Message-ID: <20220114220526.Av3FTFVPw%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 16A5DC0008 X-Stat-Signature: i566znksc41mfedwb7497g8pj7umerho Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=AZ5qYlpT; dmarc=none; spf=pass (imf28.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) 
smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam03 X-HE-Tag: 1642197927-637623 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

From: Christophe JAILLET
Subject: mm/frontswap.c: use non-atomic '__set_bit()' when possible

The 'a' and 'b' bitmaps are local to this function, so no concurrent
access can occur.

So the non-atomic '__set_bit()' can be used to save a few cycles.

Link: https://lkml.kernel.org/r/e52476da5cee57151745c5c3c934a69798dc6fa4.1638132190.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET
Signed-off-by: Andrew Morton
---

 mm/frontswap.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/frontswap.c~mm-mempool-use-non-atomic-__set_bit-when-possible
+++ a/mm/frontswap.c
@@ -127,7 +127,7 @@ void frontswap_register_ops(struct front
 	spin_lock(&swap_lock);
 	plist_for_each_entry(si, &swap_active_head, list) {
 		if (!WARN_ON(!si->frontswap_map))
-			set_bit(si->type, a);
+			__set_bit(si->type, a);
 	}
 	spin_unlock(&swap_lock);
 
@@ -149,7 +149,7 @@ void frontswap_register_ops(struct front
 	spin_lock(&swap_lock);
 	plist_for_each_entry(si, &swap_active_head, list) {
 		if (si->frontswap_map)
-			set_bit(si->type, b);
+			__set_bit(si->type, b);
 	}
 	spin_unlock(&swap_lock);
 
From patchwork Fri Jan 14 22:05:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714074 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id CE0AEC433FE for ; Fri, 14 Jan 2022 22:05:33 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 477F06B00DF; Fri, 14 Jan 2022 17:05:32 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 3FB626B00E2; Fri, 14 Jan 2022
17:05:32 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1B6626B00E4; Fri, 14 Jan 2022 17:05:32 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0150.hostedemail.com [216.40.44.150]) by kanga.kvack.org (Postfix) with ESMTP id DB5696B00DF for ; Fri, 14 Jan 2022 17:05:31 -0500 (EST) Received: from smtpin31.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id A62B318272F0E for ; Fri, 14 Jan 2022 22:05:31 +0000 (UTC) X-FDA: 79030274862.31.3A3EA33 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf05.hostedemail.com (Postfix) with ESMTP id 4DE4B10000A for ; Fri, 14 Jan 2022 22:05:31 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id AA1D161FE2; Fri, 14 Jan 2022 22:05:30 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C0F53C36AE9; Fri, 14 Jan 2022 22:05:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197930; bh=F1ELaCyx4YcGBzGH+gZlEAcKQ7q0e/ULOdTBWIPHcWg=; h=Date:From:To:Subject:In-Reply-To:From; b=lFrpLxsxpqpJrPT+oQTJwLnymxBov9mT4XdLGdb4WQ9pLD//+jf/5JWIKA5k4Rbi3 2IPk4d6z01V9xNWw+AqxlKl8TJMDpMhsjKohDGhEp/wY8s5UrSh8y9B30861VoiI2y nnmHtHgbSiQLoNlHrEOc6xFTZLjfNLtPFzJ3rrwc= Date: Fri, 14 Jan 2022 14:05:29 -0800 From: Andrew Morton To: akpm@linux-foundation.org, chris@chrisdown.name, hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@kernel.org, mm-commits@vger.kernel.org, songmuchun@bytedance.com, torvalds@linux-foundation.org, vbabka@suse.cz, vdavydov.dev@gmail.com Subject: [patch 048/146] mm: memcontrol: make cgroup_memory_nokmem static Message-ID: <20220114220529.svLxttnTm%akpm@linux-foundation.org> In-Reply-To: 
<20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam10 X-Rspamd-Queue-Id: 4DE4B10000A X-Stat-Signature: yo9ag4oqmimiiickk4m8qow66qcsdu8x Authentication-Results: imf05.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=lFrpLxsx; dmarc=none; spf=pass (imf05.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1642197931-12766 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

From: Muchun Song
Subject: mm: memcontrol: make cgroup_memory_nokmem static

Commit 494c1dfe855e ("mm: memcg/slab: create a new set of kmalloc-cg-
caches") made cgroup_memory_nokmem global. However, this is unnecessary
because the function mem_cgroup_kmem_disabled() already exposes it. Just
make it static and replace it with mem_cgroup_kmem_disabled() in
mm/slab_common.c.

Link: https://lkml.kernel.org/r/20211109065418.21693-1-songmuchun@bytedance.com
Signed-off-by: Muchun Song
Acked-by: Chris Down
Acked-by: Vlastimil Babka
Cc: Johannes Weiner
Cc: Michal Hocko
Cc: Vladimir Davydov
Signed-off-by: Andrew Morton
---

 mm/internal.h    |    5 -----
 mm/memcontrol.c  |    2 +-
 mm/slab_common.c |    2 +-
 3 files changed, 2 insertions(+), 7 deletions(-)

--- a/mm/internal.h~mm-memcontrol-make-cgroup_memory_nokmem-static
+++ a/mm/internal.h
@@ -158,11 +158,6 @@ extern void reclaim_throttle(pg_data_t *
 extern pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address);
 
 /*
- * in mm/memcontrol.c:
- */
-extern bool cgroup_memory_nokmem;
-
-/*
  * in mm/page_alloc.c
  */
 
--- a/mm/memcontrol.c~mm-memcontrol-make-cgroup_memory_nokmem-static
+++ a/mm/memcontrol.c
@@ -84,7 +84,7 @@ EXPORT_PER_CPU_SYMBOL_GPL(int_active_mem
 static bool cgroup_memory_nosocket __ro_after_init;
 
 /* Kernel memory accounting disabled?
*/ -bool cgroup_memory_nokmem __ro_after_init; +static bool cgroup_memory_nokmem __ro_after_init; /* Whether the swap controller is active */ #ifdef CONFIG_MEMCG_SWAP --- a/mm/slab_common.c~mm-memcontrol-make-cgroup_memory_nokmem-static +++ a/mm/slab_common.c @@ -844,7 +844,7 @@ new_kmalloc_cache(int idx, enum kmalloc_ if (type == KMALLOC_RECLAIM) { flags |= SLAB_RECLAIM_ACCOUNT; } else if (IS_ENABLED(CONFIG_MEMCG_KMEM) && (type == KMALLOC_CGROUP)) { - if (cgroup_memory_nokmem) { + if (mem_cgroup_kmem_disabled()) { kmalloc_caches[type][idx] = kmalloc_caches[KMALLOC_NORMAL][idx]; return; } From patchwork Fri Jan 14 22:05:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714075 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3E7ECC433F5 for ; Fri, 14 Jan 2022 22:05:36 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C615B6B00E2; Fri, 14 Jan 2022 17:05:35 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id C10946B00E4; Fri, 14 Jan 2022 17:05:35 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A8BAE6B00E6; Fri, 14 Jan 2022 17:05:35 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0239.hostedemail.com [216.40.44.239]) by kanga.kvack.org (Postfix) with ESMTP id 8C5C56B00E2 for ; Fri, 14 Jan 2022 17:05:35 -0500 (EST) Received: from smtpin19.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 58C008EC06 for ; Fri, 14 Jan 2022 22:05:35 +0000 (UTC) X-FDA: 79030275030.19.A43CA94 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf09.hostedemail.com (Postfix) with ESMTP id 
E9BF4140008 for ; Fri, 14 Jan 2022 22:05:34 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id C3F4061FF0; Fri, 14 Jan 2022 22:05:33 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E834FC36AE9; Fri, 14 Jan 2022 22:05:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197933; bh=2dP30ttTaKtwyHMkE9iApBpNEEQRRLNZqJO6/Y8n2eE=; h=Date:From:To:Subject:In-Reply-To:From; b=Gcz+tMtM5C77uFuemT4Ze5ZNv7esbd32Ybx//cC3yb41GkF0MVnZeNB1Ss+uxIwy4 MLrD7HKzEFmukqpRIRBvWE25qS1i/evYAjVHkQD+G+OorUcUitXspH0uHi92pS+yrj uNngePFWXte15EI80ZpHSnsV9uCNLSxVzqyUjSS0= Date: Fri, 14 Jan 2022 14:05:32 -0800 From: Andrew Morton To: akpm@linux-foundation.org, dqiao@redhat.com, guro@fb.com, hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@kernel.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 049/146] mm/page_counter: remove an incorrect call to propagate_protected_usage() Message-ID: <20220114220532.E3LmAAg_1%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: E9BF4140008 X-Stat-Signature: bhxksogqdhd6npfxb3ha61cyejbyjpd9 Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Gcz+tMtM; spf=pass (imf09.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642197934-897320 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Donghai Qiao Subject: mm/page_counter: remove an incorrect call to propagate_protected_usage() 
propagate_protected_usage() is called to propagate the usage change in the page_counter structure. But there is a call to this function from page_counter_try_charge() when there is actually no usage change. Hence this call should be removed. Link: https://lkml.kernel.org/r/20211118181125.3918222-1-dqiao@redhat.com Signed-off-by: Donghai Qiao Reviewed-by: Roman Gushchin Cc: Michal Hocko Cc: Johannes Weiner Signed-off-by: Andrew Morton --- mm/page_counter.c | 1 - 1 file changed, 1 deletion(-) --- a/mm/page_counter.c~mm-page_counter-remove-an-incorrect-call-to-propagate_protected_usage +++ a/mm/page_counter.c @@ -120,7 +120,6 @@ bool page_counter_try_charge(struct page new = atomic_long_add_return(nr_pages, &c->usage); if (new > c->max) { atomic_long_sub(nr_pages, &c->usage); - propagate_protected_usage(c, new); /* * This is racy, but we can live with some * inaccuracy in the failcnt which is only used From patchwork Fri Jan 14 22:05:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714076 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 67BBEC433FE for ; Fri, 14 Jan 2022 22:05:40 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E5D4A6B00E4; Fri, 14 Jan 2022 17:05:39 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id E0C556B00E7; Fri, 14 Jan 2022 17:05:39 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CAD756B00E8; Fri, 14 Jan 2022 17:05:39 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0136.hostedemail.com [216.40.44.136]) by kanga.kvack.org (Postfix) with ESMTP id B96596B00E4 for ; Fri, 14 Jan 2022 17:05:39 -0500 (EST) Received: from 
smtpin25.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 8927F998C8 for ; Fri, 14 Jan 2022 22:05:39 +0000 (UTC) X-FDA: 79030275198.25.FC98F56 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf24.hostedemail.com (Postfix) with ESMTP id 10522180002 for ; Fri, 14 Jan 2022 22:05:38 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 060CEB825F5; Fri, 14 Jan 2022 22:05:38 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 36D70C36AEC; Fri, 14 Jan 2022 22:05:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197936; bh=4/QC+45WSYa3uCpbQJoUpc6dCsWZ74r9DZcUaqbb2lY=; h=Date:From:To:Subject:In-Reply-To:From; b=Z/eAqCphDtCF/uHFI9hLQX3HqybDRCOq6vFCYjEmFPAr/EC7rqPiofIsV03l2xM+y 7SSv78ZBF9toXA00EOc+2MUhU1bHnmZqRRRZhm43ZKCuW9jvB5fknH6V2P8xxNTHeH CfBnUonrV5ZcGVlOXZbhbRq7EtNi221+qDqrWiCk= Date: Fri, 14 Jan 2022 14:05:35 -0800 From: Andrew Morton To: akpm@linux-foundation.org, alexs@kernel.org, chris@chrisdown.name, corbet@lwn.net, guro@fb.com, hannes@cmpxchg.org, linux-mm@kvack.org, lizefan.x@bytedance.com, mhocko@suse.com, mm-commits@vger.kernel.org, richard.weiyang@gmail.com, schatzberg.dan@gmail.com, shakeelb@google.com, songmuchun@bytedance.com, tj@kernel.org, torvalds@linux-foundation.org, vdavydov.dev@gmail.com, willy@infradead.org Subject: [patch 050/146] mm/memcg: add oom_group_kill memory event Message-ID: <20220114220535.WrLxJtqd2%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 10522180002 X-Stat-Signature: g3ys58ntbeiuk5y749178utjy64fj7j3 Authentication-Results: imf24.hostedemail.com; dkim=pass 
header.d=linux-foundation.org header.s=korg header.b="Z/eAqCph"; spf=pass (imf24.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642197938-728885 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

From: Dan Schatzberg
Subject: mm/memcg: add oom_group_kill memory event

When a container exits, our container agent wants to know whether it was
OOM killed so that it can report this to the user. We use
memory.oom.group = 1 to ensure that OOM kills within the container's
cgroup kill everything. Existing memory.events are insufficient for
knowing if this triggered:

1) Our current approach reads memory.events oom_kill and reports the
   container was killed if the value is non-zero. This is erroneous in
   some cases where containers create their children cgroups with
   memory.oom.group=1, as such OOM kills will get counted against the
   container cgroup's oom_kill counter despite not actually OOM killing
   the entire container.

2) Reading memory.events.local will fail to identify OOM kills in leaf
   cgroups (that don't set memory.oom.group) within the container cgroup.

This patch adds a new oom_group_kill event, raised when memory.oom.group
triggers, to allow userspace to cleanly identify when an entire cgroup is
OOM killed.
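
As a rough illustration of how userspace might consume the new field, the
sketch below looks up one "key value" line in text laid out like
memory.events. The helper name memory_events_lookup() is hypothetical, and
the sketch works on an in-memory string; a real agent would first read the
cgroup's memory.events file, which is not shown here.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: find "key value" in text formatted like cgroup v2's
 * memory.events (one "name count" pair per line). Returns 0 and stores the
 * count in *value on success, or -1 if the key is absent or malformed.
 */
static int memory_events_lookup(const char *text, const char *key,
				unsigned long *value)
{
	size_t key_len = strlen(key);
	const char *line = text;

	while (line && *line) {
		/* Require the full key followed by a space, so that "oom"
		 * does not match "oom_kill" or "oom_group_kill". */
		if (strncmp(line, key, key_len) == 0 && line[key_len] == ' ') {
			if (sscanf(line + key_len + 1, "%lu", value) == 1)
				return 0;
			return -1;
		}
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

An agent could treat a non-zero oom_group_kill count as "the whole cgroup
was OOM killed", while oom_kill alone may also include kills that happened
in child cgroups.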
[schatzberg.dan@gmail.com: changes from Johannes and Chris] Link: https://lkml.kernel.org/r/20211213162511.2492267-1-schatzberg.dan@gmail.com Link: https://lkml.kernel.org/r/20211203162426.3375036-1-schatzberg.dan@gmail.com Signed-off-by: Dan Schatzberg Reviewed-by: Roman Gushchin Acked-by: Johannes Weiner Acked-by: Chris Down Reviewed-by: Shakeel Butt Acked-by: Michal Hocko Cc: Tejun Heo Cc: Zefan Li Cc: Jonathan Corbet Cc: Vladimir Davydov Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Alex Shi Cc: Wei Yang Signed-off-by: Andrew Morton --- Documentation/admin-guide/cgroup-v2.rst | 3 +++ include/linux/memcontrol.h | 1 + mm/memcontrol.c | 2 ++ mm/oom_kill.c | 1 + 4 files changed, 7 insertions(+) --- a/Documentation/admin-guide/cgroup-v2.rst~mm-add-group_oom_kill-memory-event +++ a/Documentation/admin-guide/cgroup-v2.rst @@ -1268,6 +1268,9 @@ PAGE_SIZE multiple when read back. The number of processes belonging to this cgroup killed by any kind of OOM killer. + oom_group_kill + The number of times a group OOM has occurred. + memory.events.local Similar to memory.events but the fields in the file are local to the cgroup i.e. not hierarchical. 
The file modified event --- a/include/linux/memcontrol.h~mm-add-group_oom_kill-memory-event +++ a/include/linux/memcontrol.h @@ -42,6 +42,7 @@ enum memcg_memory_event { MEMCG_MAX, MEMCG_OOM, MEMCG_OOM_KILL, + MEMCG_OOM_GROUP_KILL, MEMCG_SWAP_HIGH, MEMCG_SWAP_MAX, MEMCG_SWAP_FAIL, --- a/mm/memcontrol.c~mm-add-group_oom_kill-memory-event +++ a/mm/memcontrol.c @@ -6318,6 +6318,8 @@ static void __memory_events_show(struct seq_printf(m, "oom %lu\n", atomic_long_read(&events[MEMCG_OOM])); seq_printf(m, "oom_kill %lu\n", atomic_long_read(&events[MEMCG_OOM_KILL])); + seq_printf(m, "oom_group_kill %lu\n", + atomic_long_read(&events[MEMCG_OOM_GROUP_KILL])); } static int memory_events_show(struct seq_file *m, void *v) --- a/mm/oom_kill.c~mm-add-group_oom_kill-memory-event +++ a/mm/oom_kill.c @@ -994,6 +994,7 @@ static void oom_kill_process(struct oom_ * If necessary, kill all tasks in the selected memory cgroup. */ if (oom_group) { + memcg_memory_event(oom_group, MEMCG_OOM_GROUP_KILL); mem_cgroup_print_oom_group(oom_group); mem_cgroup_scan_tasks(oom_group, oom_kill_memcg_member, (void *)message); From patchwork Fri Jan 14 22:05:39 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714077 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C330EC433F5 for ; Fri, 14 Jan 2022 22:05:42 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 4C0C56B00E7; Fri, 14 Jan 2022 17:05:42 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 471816B00E9; Fri, 14 Jan 2022 17:05:42 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 29FA06B00EA; Fri, 14 Jan 2022 17:05:42 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from 
forelay.hostedemail.com (smtprelay0204.hostedemail.com [216.40.44.204]) by kanga.kvack.org (Postfix) with ESMTP id 0E23F6B00E7 for ; Fri, 14 Jan 2022 17:05:42 -0500 (EST) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id C25988F174 for ; Fri, 14 Jan 2022 22:05:41 +0000 (UTC) X-FDA: 79030275282.11.3049970 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf21.hostedemail.com (Postfix) with ESMTP id 555A31C0004 for ; Fri, 14 Jan 2022 22:05:41 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 93E6B61FE2; Fri, 14 Jan 2022 22:05:40 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id AC5FAC36AE5; Fri, 14 Jan 2022 22:05:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197940; bh=oKn+aBO7BR0Xe5e3+yhIf2M8qlqSguFVow4JagZrtXg=; h=Date:From:To:Subject:In-Reply-To:From; b=EhnOAkvDYQNsZXCdGYtQF8TE2r08hPbJjd91QooGQhDPcM38Bxyh2IXD+UbpyyVmx 2OzUmwxaeHOBbTDxKlzBr4zUQrYf9DHXUg7GIHMS8k3ECBXctwN4vY6X4F3+u6eal7 FTPDswWbdyZ263M9rPheQigPV1daoD/yopHsytEI= Date: Fri, 14 Jan 2022 14:05:39 -0800 From: Andrew Morton To: akpm@linux-foundation.org, hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@kernel.org, mkoutny@suse.com, mm-commits@vger.kernel.org, shakeelb@google.com, torvalds@linux-foundation.org Subject: [patch 051/146] memcg: better bounds on the memcg stats updates Message-ID: <20220114220539.4-Ius7f56%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 MIME-Version: 1.0 X-Stat-Signature: 84n73rwdetaiuf95a3udkm4h6tw513zu Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=EhnOAkvD; 
dmarc=none; spf=pass (imf21.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: 555A31C0004 X-HE-Tag: 1642197941-278138 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

From: Shakeel Butt
Subject: memcg: better bounds on the memcg stats updates

Commit 11192d9c124d ("memcg: flush stats only if updated") added tracking
of memcg stats updates, which readers use to flush only if the updates
are over a certain threshold. However, each individual update can
correspond to a large value change for a given stat. For example, adding
or removing a hugepage to an LRU changes the stat by thp_nr_pages (512 on
x86_64). Treating the update related to THP as one can keep the stat off,
in theory, by (thp_nr_pages * nr_cpus * CHARGE_BATCH) before flush.

To handle such scenarios, this patch adds consideration of the stat
update value as well instead of just the update event. In addition, let
the async flusher unconditionally flush the stats to put a time limit on
the stats skew; hopefully far fewer readers will then need to flush.
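
The effect of weighting updates by their magnitude can be sketched with a
small userspace model. It is only an approximation of the idea: struct
stats_model stands in for the per-CPU stats_updates counter and the atomic
stats_flush_threshold, CHARGE_BATCH is an assumed illustrative value, and
per-CPU and atomic semantics are ignored.

```c
#include <stdlib.h>

/* Assumed batch size for illustration; the kernel uses MEMCG_CHARGE_BATCH. */
#define CHARGE_BATCH 32U

/* Userspace stand-ins for the per-CPU counter and the global threshold. */
struct stats_model {
	unsigned int pending;		/* models per-CPU stats_updates */
	unsigned int flush_threshold;	/* models stats_flush_threshold */
};

/*
 * Account one stat update of magnitude |val|. Once more than CHARGE_BATCH
 * worth of change is pending, credit the threshold with the number of whole
 * batches and reset the pending count.
 */
static void model_rstat_updated(struct stats_model *m, int val)
{
	unsigned int x;

	m->pending += (unsigned int)abs(val);
	x = m->pending;
	if (x > CHARGE_BATCH) {
		m->flush_threshold += x / CHARGE_BATCH;
		m->pending = 0;
	}
}
```

With these assumptions, a single update of 512 pages advances the
threshold by 16 batches at once, whereas counting it as one event would
have left the threshold unchanged.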
Link: https://lkml.kernel.org/r/20211118065350.697046-1-shakeelb@google.com Signed-off-by: Shakeel Butt Cc: Johannes Weiner Cc: Michal Hocko Cc: "Michal Koutný" Signed-off-by: Andrew Morton --- mm/memcontrol.c | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-) --- a/mm/memcontrol.c~memcg-better-bounds-on-the-memcg-stats-updates +++ a/mm/memcontrol.c @@ -629,11 +629,17 @@ static DEFINE_SPINLOCK(stats_flush_lock) static DEFINE_PER_CPU(unsigned int, stats_updates); static atomic_t stats_flush_threshold = ATOMIC_INIT(0); -static inline void memcg_rstat_updated(struct mem_cgroup *memcg) +static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val) { + unsigned int x; + cgroup_rstat_updated(memcg->css.cgroup, smp_processor_id()); - if (!(__this_cpu_inc_return(stats_updates) % MEMCG_CHARGE_BATCH)) - atomic_inc(&stats_flush_threshold); + + x = __this_cpu_add_return(stats_updates, abs(val)); + if (x > MEMCG_CHARGE_BATCH) { + atomic_add(x / MEMCG_CHARGE_BATCH, &stats_flush_threshold); + __this_cpu_write(stats_updates, 0); + } } static void __mem_cgroup_flush_stats(void) @@ -656,7 +662,7 @@ void mem_cgroup_flush_stats(void) static void flush_memcg_stats_dwork(struct work_struct *w) { - mem_cgroup_flush_stats(); + __mem_cgroup_flush_stats(); queue_delayed_work(system_unbound_wq, &stats_flush_dwork, 2UL*HZ); } @@ -672,7 +678,7 @@ void __mod_memcg_state(struct mem_cgroup return; __this_cpu_add(memcg->vmstats_percpu->state[idx], val); - memcg_rstat_updated(memcg); + memcg_rstat_updated(memcg, val); } /* idx can be of type enum memcg_stat_item or node_stat_item. 
*/ @@ -705,7 +711,7 @@ void __mod_memcg_lruvec_state(struct lru /* Update lruvec */ __this_cpu_add(pn->lruvec_stats_percpu->state[idx], val); - memcg_rstat_updated(memcg); + memcg_rstat_updated(memcg, val); } /** @@ -789,7 +795,7 @@ void __count_memcg_events(struct mem_cgr return; __this_cpu_add(memcg->vmstats_percpu->events[idx], count); - memcg_rstat_updated(memcg); + memcg_rstat_updated(memcg, count); } static unsigned long memcg_events(struct mem_cgroup *memcg, int event) From patchwork Fri Jan 14 22:05:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714078 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C2117C433F5 for ; Fri, 14 Jan 2022 22:05:45 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 4E74C6B00E9; Fri, 14 Jan 2022 17:05:45 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 495EC6B00EB; Fri, 14 Jan 2022 17:05:45 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 360B16B00EC; Fri, 14 Jan 2022 17:05:45 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0179.hostedemail.com [216.40.44.179]) by kanga.kvack.org (Postfix) with ESMTP id 231DA6B00E9 for ; Fri, 14 Jan 2022 17:05:45 -0500 (EST) Received: from smtpin20.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id D9EAD998B4 for ; Fri, 14 Jan 2022 22:05:44 +0000 (UTC) X-FDA: 79030275408.20.DA77EB6 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf27.hostedemail.com (Postfix) with ESMTP id 62A8840002 for ; Fri, 14 Jan 2022 22:05:44 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org 
[52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id B52996200C; Fri, 14 Jan 2022 22:05:43 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D2ED4C36AE9; Fri, 14 Jan 2022 22:05:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197943; bh=JexHpmoYiIAWantUDQDX6rbtfMsRQ+uyVXiLRCHcRLg=; h=Date:From:To:Subject:In-Reply-To:From; b=PE5EYzOe0G1G5WjElw6bvndsE/3W2jz6AAxalB/p/zNHtS0y1A7vtn0TRfPsDRxLt 0FFkgAKaZ0xwW4fRK31ryrcPnMbD+/72pyf+fzK4gyFOt9xIivF5t10t0zVasWdKju V3RgsxWqQzf2c2RtE+ab5wNoO9yrZNnpH2pnchXA= Date: Fri, 14 Jan 2022 14:05:42 -0800 From: Andrew Morton To: akpm@linux-foundation.org, hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@kernel.org, mm-commits@vger.kernel.org, songmuchun@bytedance.com, torvalds@linux-foundation.org, vdavydov.dev@gmail.com, wangweiyang2@huawei.com Subject: [patch 052/146] mm/memcg: use struct_size() helper in kzalloc() Message-ID: <20220114220542.X_Xm3i4Ga%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 62A8840002 X-Stat-Signature: q8dex4t4fgmwsrcdnafpnfdhu9jsypri Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=PE5EYzOe; dmarc=none; spf=pass (imf27.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam03 X-HE-Tag: 1642197944-856119 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Wang Weiyang Subject: mm/memcg: use struct_size() helper in kzalloc() Make use of the struct_size() helper instead of an open-coded version, in order to avoid any potential type mistakes or 
integer overflows that, in the worst scenario, could lead to heap overflows. Link: https://github.com/KSPP/linux/issues/160 Link: https://lkml.kernel.org/r/20211216022024.127375-1-wangweiyang2@huawei.com Signed-off-by: Wang Weiyang Reviewed-by: Muchun Song Acked-by: Johannes Weiner Cc: Michal Hocko Cc: Vladimir Davydov Signed-off-by: Andrew Morton --- mm/memcontrol.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) --- a/mm/memcontrol.c~mm-memcg-use-struct_size-helper-in-kzalloc +++ a/mm/memcontrol.c @@ -5122,15 +5122,11 @@ static void mem_cgroup_free(struct mem_c static struct mem_cgroup *mem_cgroup_alloc(void) { struct mem_cgroup *memcg; - unsigned int size; int node; int __maybe_unused i; long error = -ENOMEM; - size = sizeof(struct mem_cgroup); - size += nr_node_ids * sizeof(struct mem_cgroup_per_node *); - - memcg = kzalloc(size, GFP_KERNEL); + memcg = kzalloc(struct_size(memcg, nodeinfo, nr_node_ids), GFP_KERNEL); if (!memcg) return ERR_PTR(error); From patchwork Fri Jan 14 22:05:45 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714079 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id CB393C433EF for ; Fri, 14 Jan 2022 22:05:49 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 5D2086B00EB; Fri, 14 Jan 2022 17:05:49 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 581126B00ED; Fri, 14 Jan 2022 17:05:49 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 448FF6B00EE; Fri, 14 Jan 2022 17:05:49 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0124.hostedemail.com [216.40.44.124]) by kanga.kvack.org (Postfix) with ESMTP id 2EA2F6B00EB for ; 
Fri, 14 Jan 2022 17:05:49 -0500 (EST) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id E9F6082F4BFD for ; Fri, 14 Jan 2022 22:05:48 +0000 (UTC) X-FDA: 79030275576.06.3A18F29 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf17.hostedemail.com (Postfix) with ESMTP id 72D3540003 for ; Fri, 14 Jan 2022 22:05:48 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 7057BB825F5; Fri, 14 Jan 2022 22:05:47 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id ED6FCC36AE9; Fri, 14 Jan 2022 22:05:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197946; bh=/kcWCfrx5i22N7Kl0YQiTP4RoT2jsC72pUgUtiJ8eLU=; h=Date:From:To:Subject:In-Reply-To:From; b=jAzL5E0D5v5CFnrv9d6Wd8dg9c5wbnd2PPcwxcgV5djXyl+hTF0IY+03gTqIHGFza yqviqNAyCrR2W6UFk3Or5FdDji8kRrorcE/9r59VUQvCS8r1PiLnJ9C78bfV3CUrjN rG6iINP5qIc+NLOldd01oyMF8IOGzfEo38O+0e1E= Date: Fri, 14 Jan 2022 14:05:45 -0800 From: Andrew Morton To: akpm@linux-foundation.org, guro@fb.com, hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, shakeelb@google.com, songmuchun@bytedance.com, torvalds@linux-foundation.org Subject: [patch 053/146] memcg: add per-memcg vmalloc stat Message-ID: <20220114220545.1dtJ90qzv%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 72D3540003 X-Stat-Signature: iyrzfgz9smf977i9ijw7gaiqje4i49cs Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=jAzL5E0D; dmarc=none; spf=pass (imf17.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted 
sender) smtp.mailfrom=akpm@linux-foundation.org
X-Rspamd-Server: rspam08
X-HE-Tag: 1642197948-671388
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Shakeel Butt
Subject: memcg: add per-memcg vmalloc stat

The kvmalloc* allocation functions can fall back to vmalloc allocations,
and do so more often on long-running machines. In addition, the kernel
does have __GFP_ACCOUNT kvmalloc* calls. So, often on long-running
machines, memory.stat does not give a complete picture of which type of
memory is charged to the memcg. So add a per-memcg vmalloc stat.

[shakeelb@google.com: page_memcg() within rcu lock, per Muchun]
Link: https://lkml.kernel.org/r/20211222052457.1960701-1-shakeelb@google.com
[akpm@linux-foundation.org: remove cast, per Muchun]
[shakeelb@google.com: remove area->page[0] checks and move to page by page accounting per Michal]
Link: https://lkml.kernel.org/r/20220104222341.3972772-1-shakeelb@google.com
Link: https://lkml.kernel.org/r/20211221215336.1922823-1-shakeelb@google.com
Signed-off-by: Shakeel Butt
Acked-by: Roman Gushchin
Reviewed-by: Muchun Song
Acked-by: Michal Hocko
Cc: Johannes Weiner
Signed-off-by: Andrew Morton
---

 Documentation/admin-guide/cgroup-v2.rst | 3 +++
 include/linux/memcontrol.h | 21 +++++++++++++++++++++
 mm/memcontrol.c | 1 +
 mm/vmalloc.c | 13 +++++++++++--
 4 files changed, 36 insertions(+), 2 deletions(-)

--- a/Documentation/admin-guide/cgroup-v2.rst~memcg-add-per-memcg-vmalloc-stat
+++ a/Documentation/admin-guide/cgroup-v2.rst
@@ -1314,6 +1314,9 @@ PAGE_SIZE multiple when read back.
 	  sock (npn)
 		Amount of memory used in network transmission buffers
 
+	  vmalloc (npn)
+		Amount of memory used for vmap backed memory.
+ shmem Amount of cached filesystem data that is swap-backed, such as tmpfs, shm segments, shared anonymous mmap()s --- a/include/linux/memcontrol.h~memcg-add-per-memcg-vmalloc-stat +++ a/include/linux/memcontrol.h @@ -33,6 +33,7 @@ enum memcg_stat_item { MEMCG_SWAP = NR_VM_NODE_STAT_ITEMS, MEMCG_SOCK, MEMCG_PERCPU_B, + MEMCG_VMALLOC, MEMCG_NR_STAT, }; @@ -992,6 +993,21 @@ static inline void mod_memcg_state(struc local_irq_restore(flags); } +static inline void mod_memcg_page_state(struct page *page, + int idx, int val) +{ + struct mem_cgroup *memcg; + + if (mem_cgroup_disabled()) + return; + + rcu_read_lock(); + memcg = page_memcg(page); + if (memcg) + mod_memcg_state(memcg, idx, val); + rcu_read_unlock(); +} + static inline unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx) { return READ_ONCE(memcg->vmstats.state[idx]); @@ -1447,6 +1463,11 @@ static inline void mod_memcg_state(struc { } +static inline void mod_memcg_page_state(struct page *page, + int idx, int val) +{ +} + static inline unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx) { return 0; --- a/mm/memcontrol.c~memcg-add-per-memcg-vmalloc-stat +++ a/mm/memcontrol.c @@ -1375,6 +1375,7 @@ static const struct memory_stat memory_s { "pagetables", NR_PAGETABLE }, { "percpu", MEMCG_PERCPU_B }, { "sock", MEMCG_SOCK }, + { "vmalloc", MEMCG_VMALLOC }, { "shmem", NR_SHMEM }, { "file_mapped", NR_FILE_MAPPED }, { "file_dirty", NR_FILE_DIRTY }, --- a/mm/vmalloc.c~memcg-add-per-memcg-vmalloc-stat +++ a/mm/vmalloc.c @@ -31,6 +31,7 @@ #include #include #include +#include #include #include #include @@ -2623,12 +2624,13 @@ static void __vunmap(const void *addr, i if (deallocate_pages) { unsigned int page_order = vm_area_page_order(area); - int i; + int i, step = 1U << page_order; - for (i = 0; i < area->nr_pages; i += 1U << page_order) { + for (i = 0; i < area->nr_pages; i += step) { struct page *page = area->pages[i]; BUG_ON(!page); + mod_memcg_page_state(page, MEMCG_VMALLOC, -step); 
__free_pages(page, page_order); cond_resched(); } @@ -2955,6 +2957,13 @@ static void *__vmalloc_area_node(struct page_order, nr_small_pages, area->pages); atomic_long_add(area->nr_pages, &nr_vmalloc_pages); + if (gfp_mask & __GFP_ACCOUNT) { + int i, step = 1U << page_order; + + for (i = 0; i < area->nr_pages; i += step) + mod_memcg_page_state(area->pages[i], MEMCG_VMALLOC, + step); + } /* * If not enough pages were obtained to accomplish an From patchwork Fri Jan 14 22:05:48 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714080 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2C3A3C433EF for ; Fri, 14 Jan 2022 22:05:52 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A8C9A6B00ED; Fri, 14 Jan 2022 17:05:51 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id A3C3B6B00EF; Fri, 14 Jan 2022 17:05:51 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 88E9E6B00F0; Fri, 14 Jan 2022 17:05:51 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0239.hostedemail.com [216.40.44.239]) by kanga.kvack.org (Postfix) with ESMTP id 7493B6B00ED for ; Fri, 14 Jan 2022 17:05:51 -0500 (EST) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 3AAA7181C49C0 for ; Fri, 14 Jan 2022 22:05:51 +0000 (UTC) X-FDA: 79030275702.18.213E6AA Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf09.hostedemail.com (Postfix) with ESMTP id C2FE9140004 for ; Fri, 14 Jan 2022 22:05:50 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with 
cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id CC3E961FB7; Fri, 14 Jan 2022 22:05:49 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 080A3C36AE9; Fri, 14 Jan 2022 22:05:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197949; bh=nzmXekE0bXlfnawVvQP/czFiiwQFPyO2nhpHEMNdO+Q=; h=Date:From:To:Subject:In-Reply-To:From; b=oMuIx1oAYAWJvA+cjZNp7XEeemlU6Rq19G63ka4jbIcL5prBkXz+I4dJBl541j25K 27N05x7AN4TRtxCP0J0V4gSiYr98XPVS5Z16kSPkR2ozcBoPwA4ksSV9E7LtxXKRxp nh6TyGhH/tYwESNW9x9c7GWV7Ka8AnFAPxir1/uA= Date: Fri, 14 Jan 2022 14:05:48 -0800 From: Andrew Morton To: akpm@linux-foundation.org, chi.minghao@zte.com.cn, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, zealci@zte.com.cn Subject: [patch 054/146] tools/testing/selftests/vm/userfaultfd.c: use swap() to make code cleaner Message-ID: <20220114220548.cmlrzhxJX%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: C2FE9140004 X-Stat-Signature: hfcrxoqugc5tmutn18ijt7s6pnpf97ix Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=oMuIx1oA; dmarc=none; spf=pass (imf09.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam03 X-HE-Tag: 1642197950-535998 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: chiminghao Subject: tools/testing/selftests/vm/userfaultfd.c: use swap() to make code cleaner Fix the following coccicheck REVIEW: ./tools/testing/selftests/vm/userfaultfd.c:1531:21-22:use swap() to make code cleaner Link: 
https://lkml.kernel.org/r/20211124031632.35317-1-chi.minghao@zte.com.cn Signed-off-by: chiminghao Reported-by: Zeal Robot Signed-off-by: Andrew Morton --- tools/testing/selftests/vm/userfaultfd.c | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-) --- a/tools/testing/selftests/vm/userfaultfd.c~selftests-vm-use-swap-to-make-code-cleaner +++ a/tools/testing/selftests/vm/userfaultfd.c @@ -1417,7 +1417,6 @@ static void userfaultfd_pagemap_test(uns static int userfaultfd_stress(void) { void *area; - char *tmp_area; unsigned long nr; struct uffdio_register uffdio_register; struct uffd_stats uffd_stats[nr_cpus]; @@ -1528,13 +1527,9 @@ static int userfaultfd_stress(void) count_verify[nr], nr); /* prepare next bounce */ - tmp_area = area_src; - area_src = area_dst; - area_dst = tmp_area; - - tmp_area = area_src_alias; - area_src_alias = area_dst_alias; - area_dst_alias = tmp_area; + swap(area_src, area_dst); + + swap(area_src_alias, area_dst_alias); uffd_stats_report(uffd_stats, nr_cpus); } From patchwork Fri Jan 14 22:05:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714081 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3F071C433F5 for ; Fri, 14 Jan 2022 22:05:56 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C623C6B00EF; Fri, 14 Jan 2022 17:05:55 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id C10026B00F1; Fri, 14 Jan 2022 17:05:55 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AB24B6B00F2; Fri, 14 Jan 2022 17:05:55 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0202.hostedemail.com [216.40.44.202]) by kanga.kvack.org (Postfix) 
with ESMTP id 957626B00EF for ; Fri, 14 Jan 2022 17:05:55 -0500 (EST) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 5E4E1181F3B4A for ; Fri, 14 Jan 2022 22:05:55 +0000 (UTC) X-FDA: 79030275870.21.A87AF17 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf07.hostedemail.com (Postfix) with ESMTP id CD02B40002 for ; Fri, 14 Jan 2022 22:05:54 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id AB82AB8262E; Fri, 14 Jan 2022 22:05:53 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1F32CC36AE9; Fri, 14 Jan 2022 22:05:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197952; bh=i1n8wjA0M5/HzYEAu9i51yKPWNXKF5/Wd7OhO20lNQw=; h=Date:From:To:Subject:In-Reply-To:From; b=KNQNDmHIeotTk8sToQttD8TV3o57WAsRHu82Hlr1MhkdjSK2URJ+SNiErHro1wB+8 tg4mk3fUYSlnhHGacMCSM2X6s0qJvyGOVgYBfmyFi/XgLA8ENVCYGf+/l5dkwKyb7i Oo2xV4wPtcgiBBTz8JO+5p2BuFFIgF/HmZETpsEY= Date: Fri, 14 Jan 2022 14:05:51 -0800 From: Andrew Morton To: akpm@linux-foundation.org, david@redhat.com, kirill@shutemov.name, linux-mm@kvack.org, mingo@redhat.com, mm-commits@vger.kernel.org, peterx@redhat.com, peterz@infradead.org, songmuchun@bytedance.com, torvalds@linux-foundation.org, zhengqi.arch@bytedance.com, zhouchengming@bytedance.com Subject: [patch 055/146] mm: remove redundant check about FAULT_FLAG_ALLOW_RETRY bit Message-ID: <20220114220551.g71d587A_%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: CD02B40002 X-Stat-Signature: qe5neua3hmdoxwqjss1hityhbj33wnrj Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg 
header.b=KNQNDmHI; dmarc=none; spf=pass (imf07.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org
X-Rspamd-Server: rspam08
X-HE-Tag: 1642197954-739347
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Qi Zheng
Subject: mm: remove redundant check about FAULT_FLAG_ALLOW_RETRY bit

Commit 4064b9827063 ("mm: allow VM_FAULT_RETRY for multiple times")
allowed a fault to return VM_FAULT_RETRY more than once. Since then, the
FAULT_FLAG_ALLOW_RETRY bit of the fault flags is no longer changed in the
page fault path, so the following check is no longer needed:

	flags & FAULT_FLAG_ALLOW_RETRY

So just remove it.

[akpm@linux-foundation.org: coding style fixes]
Link: https://lkml.kernel.org/r/20211110123358.36511-1-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: David Hildenbrand
Cc: Kirill Shutemov
Cc: Peter Xu
Cc: Muchun Song
Cc: Chengming Zhou
Signed-off-by: Andrew Morton
---

 arch/alpha/mm/fault.c | 16 +++++++---------
 arch/arc/mm/fault.c | 3 +--
 arch/arm/mm/fault.c | 2 +-
 arch/arm64/mm/fault.c | 6 ++----
 arch/hexagon/mm/vm_fault.c | 8 +++-----
 arch/ia64/mm/fault.c | 16 +++++++---------
 arch/m68k/mm/fault.c | 18 ++++++++----------
 arch/microblaze/mm/fault.c | 18 ++++++++----------
 arch/mips/mm/fault.c | 19 +++++++++----------
 arch/nds32/mm/fault.c | 16 +++++++---------
 arch/nios2/mm/fault.c | 18 ++++++++----------
 arch/openrisc/mm/fault.c | 18 ++++++++----------
 arch/parisc/mm/fault.c | 18 ++++++++----------
 arch/powerpc/mm/fault.c | 6 ++----
 arch/riscv/mm/fault.c | 2 +-
 arch/s390/mm/fault.c | 28 ++++++++++++++--------------
 arch/sh/mm/fault.c | 18 ++++++++----------
 arch/sparc/mm/fault_32.c | 16 +++++++---------
 arch/sparc/mm/fault_64.c | 16 +++++++---------
 arch/um/kernel/trap.c | 8 +++-----
 arch/x86/mm/fault.c | 3 +--
 arch/xtensa/mm/fault.c | 17
++++++++--------- 22 files changed, 128 insertions(+), 162 deletions(-) --- a/arch/alpha/mm/fault.c~mm-remove-redundant-check-about-fault_flag_allow_retry-bit +++ a/arch/alpha/mm/fault.c @@ -165,17 +165,15 @@ retry: BUG(); } - if (flags & FAULT_FLAG_ALLOW_RETRY) { - if (fault & VM_FAULT_RETRY) { - flags |= FAULT_FLAG_TRIED; + if (fault & VM_FAULT_RETRY) { + flags |= FAULT_FLAG_TRIED; - /* No need to mmap_read_unlock(mm) as we would - * have already released it in __lock_page_or_retry - * in mm/filemap.c. - */ + /* No need to mmap_read_unlock(mm) as we would + * have already released it in __lock_page_or_retry + * in mm/filemap.c. + */ - goto retry; - } + goto retry; } mmap_read_unlock(mm); --- a/arch/arc/mm/fault.c~mm-remove-redundant-check-about-fault_flag_allow_retry-bit +++ a/arch/arc/mm/fault.c @@ -149,8 +149,7 @@ retry: /* * Fault retry nuances, mmap_lock already relinquished by core mm */ - if (unlikely((fault & VM_FAULT_RETRY) && - (flags & FAULT_FLAG_ALLOW_RETRY))) { + if (unlikely(fault & VM_FAULT_RETRY)) { flags |= FAULT_FLAG_TRIED; goto retry; } --- a/arch/arm64/mm/fault.c~mm-remove-redundant-check-about-fault_flag_allow_retry-bit +++ a/arch/arm64/mm/fault.c @@ -606,10 +606,8 @@ retry: } if (fault & VM_FAULT_RETRY) { - if (mm_flags & FAULT_FLAG_ALLOW_RETRY) { - mm_flags |= FAULT_FLAG_TRIED; - goto retry; - } + mm_flags |= FAULT_FLAG_TRIED; + goto retry; } mmap_read_unlock(mm); --- a/arch/arm/mm/fault.c~mm-remove-redundant-check-about-fault_flag_allow_retry-bit +++ a/arch/arm/mm/fault.c @@ -312,7 +312,7 @@ retry: return 0; } - if (!(fault & VM_FAULT_ERROR) && flags & FAULT_FLAG_ALLOW_RETRY) { + if (!(fault & VM_FAULT_ERROR)) { if (fault & VM_FAULT_RETRY) { flags |= FAULT_FLAG_TRIED; goto retry; --- a/arch/hexagon/mm/vm_fault.c~mm-remove-redundant-check-about-fault_flag_allow_retry-bit +++ a/arch/hexagon/mm/vm_fault.c @@ -98,11 +98,9 @@ good_area: /* The most common case -- we are done. 
*/ if (likely(!(fault & VM_FAULT_ERROR))) { - if (flags & FAULT_FLAG_ALLOW_RETRY) { - if (fault & VM_FAULT_RETRY) { - flags |= FAULT_FLAG_TRIED; - goto retry; - } + if (fault & VM_FAULT_RETRY) { + flags |= FAULT_FLAG_TRIED; + goto retry; } mmap_read_unlock(mm); --- a/arch/ia64/mm/fault.c~mm-remove-redundant-check-about-fault_flag_allow_retry-bit +++ a/arch/ia64/mm/fault.c @@ -156,17 +156,15 @@ retry: BUG(); } - if (flags & FAULT_FLAG_ALLOW_RETRY) { - if (fault & VM_FAULT_RETRY) { - flags |= FAULT_FLAG_TRIED; + if (fault & VM_FAULT_RETRY) { + flags |= FAULT_FLAG_TRIED; - /* No need to mmap_read_unlock(mm) as we would - * have already released it in __lock_page_or_retry - * in mm/filemap.c. - */ + /* No need to mmap_read_unlock(mm) as we would + * have already released it in __lock_page_or_retry + * in mm/filemap.c. + */ - goto retry; - } + goto retry; } mmap_read_unlock(mm); --- a/arch/m68k/mm/fault.c~mm-remove-redundant-check-about-fault_flag_allow_retry-bit +++ a/arch/m68k/mm/fault.c @@ -153,18 +153,16 @@ good_area: BUG(); } - if (flags & FAULT_FLAG_ALLOW_RETRY) { - if (fault & VM_FAULT_RETRY) { - flags |= FAULT_FLAG_TRIED; + if (fault & VM_FAULT_RETRY) { + flags |= FAULT_FLAG_TRIED; - /* - * No need to mmap_read_unlock(mm) as we would - * have already released it in __lock_page_or_retry - * in mm/filemap.c. - */ + /* + * No need to mmap_read_unlock(mm) as we would + * have already released it in __lock_page_or_retry + * in mm/filemap.c. + */ - goto retry; - } + goto retry; } mmap_read_unlock(mm); --- a/arch/microblaze/mm/fault.c~mm-remove-redundant-check-about-fault_flag_allow_retry-bit +++ a/arch/microblaze/mm/fault.c @@ -232,18 +232,16 @@ good_area: BUG(); } - if (flags & FAULT_FLAG_ALLOW_RETRY) { - if (fault & VM_FAULT_RETRY) { - flags |= FAULT_FLAG_TRIED; + if (fault & VM_FAULT_RETRY) { + flags |= FAULT_FLAG_TRIED; - /* - * No need to mmap_read_unlock(mm) as we would - * have already released it in __lock_page_or_retry - * in mm/filemap.c. 
- */ + /* + * No need to mmap_read_unlock(mm) as we would + * have already released it in __lock_page_or_retry + * in mm/filemap.c. + */ - goto retry; - } + goto retry; } mmap_read_unlock(mm); --- a/arch/mips/mm/fault.c~mm-remove-redundant-check-about-fault_flag_allow_retry-bit +++ a/arch/mips/mm/fault.c @@ -171,18 +171,17 @@ good_area: goto do_sigbus; BUG(); } - if (flags & FAULT_FLAG_ALLOW_RETRY) { - if (fault & VM_FAULT_RETRY) { - flags |= FAULT_FLAG_TRIED; - /* - * No need to mmap_read_unlock(mm) as we would - * have already released it in __lock_page_or_retry - * in mm/filemap.c. - */ + if (fault & VM_FAULT_RETRY) { + flags |= FAULT_FLAG_TRIED; - goto retry; - } + /* + * No need to mmap_read_unlock(mm) as we would + * have already released it in __lock_page_or_retry + * in mm/filemap.c. + */ + + goto retry; } mmap_read_unlock(mm); --- a/arch/nds32/mm/fault.c~mm-remove-redundant-check-about-fault_flag_allow_retry-bit +++ a/arch/nds32/mm/fault.c @@ -230,16 +230,14 @@ good_area: goto bad_area; } - if (flags & FAULT_FLAG_ALLOW_RETRY) { - if (fault & VM_FAULT_RETRY) { - flags |= FAULT_FLAG_TRIED; + if (fault & VM_FAULT_RETRY) { + flags |= FAULT_FLAG_TRIED; - /* No need to mmap_read_unlock(mm) as we would - * have already released it in __lock_page_or_retry - * in mm/filemap.c. - */ - goto retry; - } + /* No need to mmap_read_unlock(mm) as we would + * have already released it in __lock_page_or_retry + * in mm/filemap.c. + */ + goto retry; } mmap_read_unlock(mm); --- a/arch/nios2/mm/fault.c~mm-remove-redundant-check-about-fault_flag_allow_retry-bit +++ a/arch/nios2/mm/fault.c @@ -149,18 +149,16 @@ good_area: BUG(); } - if (flags & FAULT_FLAG_ALLOW_RETRY) { - if (fault & VM_FAULT_RETRY) { - flags |= FAULT_FLAG_TRIED; + if (fault & VM_FAULT_RETRY) { + flags |= FAULT_FLAG_TRIED; - /* - * No need to mmap_read_unlock(mm) as we would - * have already released it in __lock_page_or_retry - * in mm/filemap.c. 
- */ + /* + * No need to mmap_read_unlock(mm) as we would + * have already released it in __lock_page_or_retry + * in mm/filemap.c. + */ - goto retry; - } + goto retry; } mmap_read_unlock(mm); --- a/arch/openrisc/mm/fault.c~mm-remove-redundant-check-about-fault_flag_allow_retry-bit +++ a/arch/openrisc/mm/fault.c @@ -177,18 +177,16 @@ good_area: BUG(); } - if (flags & FAULT_FLAG_ALLOW_RETRY) { - /*RGD modeled on Cris */ - if (fault & VM_FAULT_RETRY) { - flags |= FAULT_FLAG_TRIED; + /*RGD modeled on Cris */ + if (fault & VM_FAULT_RETRY) { + flags |= FAULT_FLAG_TRIED; - /* No need to mmap_read_unlock(mm) as we would - * have already released it in __lock_page_or_retry - * in mm/filemap.c. - */ + /* No need to mmap_read_unlock(mm) as we would + * have already released it in __lock_page_or_retry + * in mm/filemap.c. + */ - goto retry; - } + goto retry; } mmap_read_unlock(mm); --- a/arch/parisc/mm/fault.c~mm-remove-redundant-check-about-fault_flag_allow_retry-bit +++ a/arch/parisc/mm/fault.c @@ -324,16 +324,14 @@ good_area: goto bad_area; BUG(); } - if (flags & FAULT_FLAG_ALLOW_RETRY) { - if (fault & VM_FAULT_RETRY) { - /* - * No need to mmap_read_unlock(mm) as we would - * have already released it in __lock_page_or_retry - * in mm/filemap.c. - */ - flags |= FAULT_FLAG_TRIED; - goto retry; - } + if (fault & VM_FAULT_RETRY) { + /* + * No need to mmap_read_unlock(mm) as we would + * have already released it in __lock_page_or_retry + * in mm/filemap.c. + */ + flags |= FAULT_FLAG_TRIED; + goto retry; } mmap_read_unlock(mm); return; --- a/arch/powerpc/mm/fault.c~mm-remove-redundant-check-about-fault_flag_allow_retry-bit +++ a/arch/powerpc/mm/fault.c @@ -516,10 +516,8 @@ retry: * case. 
*/ if (unlikely(fault & VM_FAULT_RETRY)) { - if (flags & FAULT_FLAG_ALLOW_RETRY) { - flags |= FAULT_FLAG_TRIED; - goto retry; - } + flags |= FAULT_FLAG_TRIED; + goto retry; } mmap_read_unlock(current->mm); --- a/arch/riscv/mm/fault.c~mm-remove-redundant-check-about-fault_flag_allow_retry-bit +++ a/arch/riscv/mm/fault.c @@ -330,7 +330,7 @@ good_area: if (fault_signal_pending(fault, regs)) return; - if (unlikely((fault & VM_FAULT_RETRY) && (flags & FAULT_FLAG_ALLOW_RETRY))) { + if (unlikely(fault & VM_FAULT_RETRY)) { flags |= FAULT_FLAG_TRIED; /* --- a/arch/s390/mm/fault.c~mm-remove-redundant-check-about-fault_flag_allow_retry-bit +++ a/arch/s390/mm/fault.c @@ -452,21 +452,21 @@ retry: if (unlikely(fault & VM_FAULT_ERROR)) goto out_up; - if (flags & FAULT_FLAG_ALLOW_RETRY) { - if (fault & VM_FAULT_RETRY) { - if (IS_ENABLED(CONFIG_PGSTE) && gmap && - (flags & FAULT_FLAG_RETRY_NOWAIT)) { - /* FAULT_FLAG_RETRY_NOWAIT has been set, - * mmap_lock has not been released */ - current->thread.gmap_pfault = 1; - fault = VM_FAULT_PFAULT; - goto out_up; - } - flags &= ~FAULT_FLAG_RETRY_NOWAIT; - flags |= FAULT_FLAG_TRIED; - mmap_read_lock(mm); - goto retry; + if (fault & VM_FAULT_RETRY) { + if (IS_ENABLED(CONFIG_PGSTE) && gmap && + (flags & FAULT_FLAG_RETRY_NOWAIT)) { + /* + * FAULT_FLAG_RETRY_NOWAIT has been set, mmap_lock has + * not been released + */ + current->thread.gmap_pfault = 1; + fault = VM_FAULT_PFAULT; + goto out_up; } + flags &= ~FAULT_FLAG_RETRY_NOWAIT; + flags |= FAULT_FLAG_TRIED; + mmap_read_lock(mm); + goto retry; } if (IS_ENABLED(CONFIG_PGSTE) && gmap) { address = __gmap_link(gmap, current->thread.gmap_addr, --- a/arch/sh/mm/fault.c~mm-remove-redundant-check-about-fault_flag_allow_retry-bit +++ a/arch/sh/mm/fault.c @@ -485,17 +485,15 @@ good_area: if (mm_fault_error(regs, error_code, address, fault)) return; - if (flags & FAULT_FLAG_ALLOW_RETRY) { - if (fault & VM_FAULT_RETRY) { - flags |= FAULT_FLAG_TRIED; + if (fault & VM_FAULT_RETRY) { + flags |= 
FAULT_FLAG_TRIED; - /* - * No need to mmap_read_unlock(mm) as we would - * have already released it in __lock_page_or_retry - * in mm/filemap.c. - */ - goto retry; - } + /* + * No need to mmap_read_unlock(mm) as we would + * have already released it in __lock_page_or_retry + * in mm/filemap.c. + */ + goto retry; } mmap_read_unlock(mm); --- a/arch/sparc/mm/fault_32.c~mm-remove-redundant-check-about-fault_flag_allow_retry-bit +++ a/arch/sparc/mm/fault_32.c @@ -200,17 +200,15 @@ good_area: BUG(); } - if (flags & FAULT_FLAG_ALLOW_RETRY) { - if (fault & VM_FAULT_RETRY) { - flags |= FAULT_FLAG_TRIED; + if (fault & VM_FAULT_RETRY) { + flags |= FAULT_FLAG_TRIED; - /* No need to mmap_read_unlock(mm) as we would - * have already released it in __lock_page_or_retry - * in mm/filemap.c. - */ + /* No need to mmap_read_unlock(mm) as we would + * have already released it in __lock_page_or_retry + * in mm/filemap.c. + */ - goto retry; - } + goto retry; } mmap_read_unlock(mm); --- a/arch/sparc/mm/fault_64.c~mm-remove-redundant-check-about-fault_flag_allow_retry-bit +++ a/arch/sparc/mm/fault_64.c @@ -437,17 +437,15 @@ good_area: BUG(); } - if (flags & FAULT_FLAG_ALLOW_RETRY) { - if (fault & VM_FAULT_RETRY) { - flags |= FAULT_FLAG_TRIED; + if (fault & VM_FAULT_RETRY) { + flags |= FAULT_FLAG_TRIED; - /* No need to mmap_read_unlock(mm) as we would - * have already released it in __lock_page_or_retry - * in mm/filemap.c. - */ + /* No need to mmap_read_unlock(mm) as we would + * have already released it in __lock_page_or_retry + * in mm/filemap.c. 
+ */ - goto retry; - } + goto retry; } mmap_read_unlock(mm); --- a/arch/um/kernel/trap.c~mm-remove-redundant-check-about-fault_flag_allow_retry-bit +++ a/arch/um/kernel/trap.c @@ -87,12 +87,10 @@ good_area: } BUG(); } - if (flags & FAULT_FLAG_ALLOW_RETRY) { - if (fault & VM_FAULT_RETRY) { - flags |= FAULT_FLAG_TRIED; + if (fault & VM_FAULT_RETRY) { + flags |= FAULT_FLAG_TRIED; - goto retry; - } + goto retry; } pmd = pmd_off(mm, address); --- a/arch/x86/mm/fault.c~mm-remove-redundant-check-about-fault_flag_allow_retry-bit +++ a/arch/x86/mm/fault.c @@ -1413,8 +1413,7 @@ good_area: * and if there is a fatal signal pending there is no guarantee * that we made any progress. Handle this case first. */ - if (unlikely((fault & VM_FAULT_RETRY) && - (flags & FAULT_FLAG_ALLOW_RETRY))) { + if (unlikely(fault & VM_FAULT_RETRY)) { flags |= FAULT_FLAG_TRIED; goto retry; } --- a/arch/xtensa/mm/fault.c~mm-remove-redundant-check-about-fault_flag_allow_retry-bit +++ a/arch/xtensa/mm/fault.c @@ -127,17 +127,16 @@ good_area: goto do_sigbus; BUG(); } - if (flags & FAULT_FLAG_ALLOW_RETRY) { - if (fault & VM_FAULT_RETRY) { - flags |= FAULT_FLAG_TRIED; - /* No need to mmap_read_unlock(mm) as we would - * have already released it in __lock_page_or_retry - * in mm/filemap.c. - */ + if (fault & VM_FAULT_RETRY) { + flags |= FAULT_FLAG_TRIED; - goto retry; - } + /* No need to mmap_read_unlock(mm) as we would + * have already released it in __lock_page_or_retry + * in mm/filemap.c. 
+ */ + + goto retry; } mmap_read_unlock(mm); From patchwork Fri Jan 14 22:05:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714082 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3F06EC433FE for ; Fri, 14 Jan 2022 22:05:59 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C13A76B00F1; Fri, 14 Jan 2022 17:05:58 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id B9CE46B00F3; Fri, 14 Jan 2022 17:05:58 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9EE996B00F4; Fri, 14 Jan 2022 17:05:58 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0151.hostedemail.com [216.40.44.151]) by kanga.kvack.org (Postfix) with ESMTP id 850EF6B00F1 for ; Fri, 14 Jan 2022 17:05:58 -0500 (EST) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 497BA181F3B4A for ; Fri, 14 Jan 2022 22:05:58 +0000 (UTC) X-FDA: 79030275996.24.7FCD616 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf15.hostedemail.com (Postfix) with ESMTP id CC3F1A0007 for ; Fri, 14 Jan 2022 22:05:57 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 1BAAD6200C; Fri, 14 Jan 2022 22:05:57 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A9571C36AE9; Fri, 14 Jan 2022 22:05:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197956; 
bh=nrjO1l+ryFYU1dNFEZ+ygqUkzBGYKDsvKqXCEcGYS+8=; h=Date:From:To:Subject:In-Reply-To:From; b=j83QTi9u36yvVZt37MSRMtu+2c5TTNaONkTSX1jVpwuRN4+qehwOWnKZcHrulb/jP ZMC+zsBrwQlyhYEbGZyEq9XeDhKgKXRv1RLBDFPeZrjQpqvl8YymjPF0BtXIStjL6Z 0IXWe1KsqQCDE2Rk0//3EsODVLTQZ+0LndEd/dUc= Date: Fri, 14 Jan 2022 14:05:55 -0800 From: Andrew Morton To: akpm@linux-foundation.org, ccross@google.com, dave.hansen@intel.com, ebiederm@xmission.com, gorcunov@openvz.org, hannes@cmpxchg.org, hughd@google.com, jan.glauber@gmail.com, john.stultz@linaro.org, keescook@chromium.org, linux-mm@kvack.org, mgorman@suse.de, minchan@kernel.org, mingo@kernel.org, mm-commits@vger.kernel.org, oleg@redhat.com, penberg@kernel.org, peterz@infradead.org, rientjes@google.com, rob@landley.net, serge.hallyn@ubuntu.com, shli@fusionio.com, surenb@google.com, torvalds@linux-foundation.org, viro@zeniv.linux.org.uk Subject: [patch 056/146] mm: rearrange madvise code to allow for reuse Message-ID: <20220114220555.kAbhSEmus%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Stat-Signature: c8h6irjccs4fdgcq9mr81u9afckutj7z Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=j83QTi9u; dmarc=none; spf=pass (imf15.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: CC3F1A0007 X-HE-Tag: 1642197957-107295 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Colin Cross Subject: mm: rearrange madvise code to allow for reuse Patch series "mm: rearrange madvise code to allow for reuse", v11. Speed up fork() by up to 40% by refcounting the anon vma name field. 
I checked the image sizes with allnoconfig builds:

unpatched Linus' ToT
   text    data     bss     dec     hex filename
1324759      32   73928 1398719  1557bf vmlinux

After the first patch is applied (madvise refactoring)
   text    data     bss     dec     hex filename
1322346      32   73928 1396306  154e52 vmlinux
>>> 2413 bytes decrease vs ToT <<<

After all patches applied with CONFIG_ANON_VMA_NAME=n
   text    data     bss     dec     hex filename
1322337      32   73928 1396297  154e49 vmlinux
>>> 2422 bytes decrease vs ToT <<<

After all patches applied with CONFIG_ANON_VMA_NAME=y
   text    data     bss     dec     hex filename
1325228      32   73928 1399188  155994 vmlinux
>>> 469 bytes increase vs ToT <<<

This patch (of 3):

Refactor the madvise syscall to allow for parts of it to be reused by a prctl syscall that affects vmas.

Move the code that walks vmas in a virtual address range into a function that takes a function pointer as a parameter. The only caller for now is sys_madvise, which uses it to call madvise_vma_behavior on each vma, but the next patch will add an additional caller.

Move handling all vma behaviors inside madvise_behavior, and rename it to madvise_vma_behavior.

Move the code that updates the flags on a vma, including splitting or merging the vma as necessary, into a new function called madvise_update_vma. The next patch will add support for updating a new anon_name field as well.

Link: https://lkml.kernel.org/r/20211019215511.3771969-1-surenb@google.com
Signed-off-by: Colin Cross
Signed-off-by: Suren Baghdasaryan
Cc: Pekka Enberg
Cc: Dave Hansen
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Oleg Nesterov
Cc: "Eric W. Biederman"
Cc: Jan Glauber
Cc: John Stultz
Cc: Rob Landley
Cc: Cyrill Gorcunov
Cc: Kees Cook
Cc: "Serge E.
Hallyn" Cc: David Rientjes Cc: Al Viro Cc: Hugh Dickins Cc: Mel Gorman Cc: Shaohua Li Cc: Johannes Weiner Cc: Minchan Kim Signed-off-by: Andrew Morton --- mm/madvise.c | 338 +++++++++++++++++++++++++------------------------ 1 file changed, 178 insertions(+), 160 deletions(-) --- a/mm/madvise.c~mm-rearrange-madvise-code-to-allow-for-reuse +++ a/mm/madvise.c @@ -63,76 +63,20 @@ static int madvise_need_mmap_write(int b } /* - * We can potentially split a vm area into separate - * areas, each area with its own behavior. + * Update the vm_flags on region of a vma, splitting it or merging it as + * necessary. Must be called with mmap_sem held for writing; */ -static long madvise_behavior(struct vm_area_struct *vma, - struct vm_area_struct **prev, - unsigned long start, unsigned long end, int behavior) +static int madvise_update_vma(struct vm_area_struct *vma, + struct vm_area_struct **prev, unsigned long start, + unsigned long end, unsigned long new_flags) { struct mm_struct *mm = vma->vm_mm; - int error = 0; + int error; pgoff_t pgoff; - unsigned long new_flags = vma->vm_flags; - - switch (behavior) { - case MADV_NORMAL: - new_flags = new_flags & ~VM_RAND_READ & ~VM_SEQ_READ; - break; - case MADV_SEQUENTIAL: - new_flags = (new_flags & ~VM_RAND_READ) | VM_SEQ_READ; - break; - case MADV_RANDOM: - new_flags = (new_flags & ~VM_SEQ_READ) | VM_RAND_READ; - break; - case MADV_DONTFORK: - new_flags |= VM_DONTCOPY; - break; - case MADV_DOFORK: - if (vma->vm_flags & VM_IO) { - error = -EINVAL; - goto out; - } - new_flags &= ~VM_DONTCOPY; - break; - case MADV_WIPEONFORK: - /* MADV_WIPEONFORK is only supported on anonymous memory. 
*/ - if (vma->vm_file || vma->vm_flags & VM_SHARED) { - error = -EINVAL; - goto out; - } - new_flags |= VM_WIPEONFORK; - break; - case MADV_KEEPONFORK: - new_flags &= ~VM_WIPEONFORK; - break; - case MADV_DONTDUMP: - new_flags |= VM_DONTDUMP; - break; - case MADV_DODUMP: - if (!is_vm_hugetlb_page(vma) && new_flags & VM_SPECIAL) { - error = -EINVAL; - goto out; - } - new_flags &= ~VM_DONTDUMP; - break; - case MADV_MERGEABLE: - case MADV_UNMERGEABLE: - error = ksm_madvise(vma, start, end, behavior, &new_flags); - if (error) - goto out_convert_errno; - break; - case MADV_HUGEPAGE: - case MADV_NOHUGEPAGE: - error = hugepage_madvise(vma, &new_flags, behavior); - if (error) - goto out_convert_errno; - break; - } if (new_flags == vma->vm_flags) { *prev = vma; - goto out; + return 0; } pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT); @@ -147,23 +91,19 @@ static long madvise_behavior(struct vm_a *prev = vma; if (start != vma->vm_start) { - if (unlikely(mm->map_count >= sysctl_max_map_count)) { - error = -ENOMEM; - goto out; - } + if (unlikely(mm->map_count >= sysctl_max_map_count)) + return -ENOMEM; error = __split_vma(mm, vma, start, 1); if (error) - goto out_convert_errno; + return error; } if (end != vma->vm_end) { - if (unlikely(mm->map_count >= sysctl_max_map_count)) { - error = -ENOMEM; - goto out; - } + if (unlikely(mm->map_count >= sysctl_max_map_count)) + return -ENOMEM; error = __split_vma(mm, vma, end, 0); if (error) - goto out_convert_errno; + return error; } success: @@ -172,15 +112,7 @@ success: */ vma->vm_flags = new_flags; -out_convert_errno: - /* - * madvise() returns EAGAIN if kernel resources, such as - * slab, are temporarily unavailable. - */ - if (error == -ENOMEM) - error = -EAGAIN; -out: - return error; + return 0; } #ifdef CONFIG_SWAP @@ -930,6 +862,94 @@ static long madvise_remove(struct vm_are return error; } +/* + * Apply an madvise behavior to a region of a vma. 
madvise_update_vma + * will handle splitting a vm area into separate areas, each area with its own + * behavior. + */ +static int madvise_vma_behavior(struct vm_area_struct *vma, + struct vm_area_struct **prev, + unsigned long start, unsigned long end, + unsigned long behavior) +{ + int error; + unsigned long new_flags = vma->vm_flags; + + switch (behavior) { + case MADV_REMOVE: + return madvise_remove(vma, prev, start, end); + case MADV_WILLNEED: + return madvise_willneed(vma, prev, start, end); + case MADV_COLD: + return madvise_cold(vma, prev, start, end); + case MADV_PAGEOUT: + return madvise_pageout(vma, prev, start, end); + case MADV_FREE: + case MADV_DONTNEED: + return madvise_dontneed_free(vma, prev, start, end, behavior); + case MADV_POPULATE_READ: + case MADV_POPULATE_WRITE: + return madvise_populate(vma, prev, start, end, behavior); + case MADV_NORMAL: + new_flags = new_flags & ~VM_RAND_READ & ~VM_SEQ_READ; + break; + case MADV_SEQUENTIAL: + new_flags = (new_flags & ~VM_RAND_READ) | VM_SEQ_READ; + break; + case MADV_RANDOM: + new_flags = (new_flags & ~VM_SEQ_READ) | VM_RAND_READ; + break; + case MADV_DONTFORK: + new_flags |= VM_DONTCOPY; + break; + case MADV_DOFORK: + if (vma->vm_flags & VM_IO) + return -EINVAL; + new_flags &= ~VM_DONTCOPY; + break; + case MADV_WIPEONFORK: + /* MADV_WIPEONFORK is only supported on anonymous memory. 
*/ + if (vma->vm_file || vma->vm_flags & VM_SHARED) + return -EINVAL; + new_flags |= VM_WIPEONFORK; + break; + case MADV_KEEPONFORK: + new_flags &= ~VM_WIPEONFORK; + break; + case MADV_DONTDUMP: + new_flags |= VM_DONTDUMP; + break; + case MADV_DODUMP: + if (!is_vm_hugetlb_page(vma) && new_flags & VM_SPECIAL) + return -EINVAL; + new_flags &= ~VM_DONTDUMP; + break; + case MADV_MERGEABLE: + case MADV_UNMERGEABLE: + error = ksm_madvise(vma, start, end, behavior, &new_flags); + if (error) + goto out; + break; + case MADV_HUGEPAGE: + case MADV_NOHUGEPAGE: + error = hugepage_madvise(vma, &new_flags, behavior); + if (error) + goto out; + break; + } + + error = madvise_update_vma(vma, prev, start, end, new_flags); + +out: + /* + * madvise() returns EAGAIN if kernel resources, such as + * slab, are temporarily unavailable. + */ + if (error == -ENOMEM) + error = -EAGAIN; + return error; +} + #ifdef CONFIG_MEMORY_FAILURE /* * Error injection support for memory error handling. @@ -978,30 +998,6 @@ static int madvise_inject_error(int beha } #endif -static long -madvise_vma(struct vm_area_struct *vma, struct vm_area_struct **prev, - unsigned long start, unsigned long end, int behavior) -{ - switch (behavior) { - case MADV_REMOVE: - return madvise_remove(vma, prev, start, end); - case MADV_WILLNEED: - return madvise_willneed(vma, prev, start, end); - case MADV_COLD: - return madvise_cold(vma, prev, start, end); - case MADV_PAGEOUT: - return madvise_pageout(vma, prev, start, end); - case MADV_FREE: - case MADV_DONTNEED: - return madvise_dontneed_free(vma, prev, start, end, behavior); - case MADV_POPULATE_READ: - case MADV_POPULATE_WRITE: - return madvise_populate(vma, prev, start, end, behavior); - default: - return madvise_behavior(vma, prev, start, end, behavior); - } -} - static bool madvise_behavior_valid(int behavior) { @@ -1056,6 +1052,73 @@ process_madvise_behavior_valid(int behav } /* + * Walk the vmas in range [start,end), and call the visit function on each one. 
+ * The visit function will get start and end parameters that cover the overlap + * between the current vma and the original range. Any unmapped regions in the + * original range will result in this function returning -ENOMEM while still + * calling the visit function on all of the existing vmas in the range. + * Must be called with the mmap_lock held for reading or writing. + */ +static +int madvise_walk_vmas(struct mm_struct *mm, unsigned long start, + unsigned long end, unsigned long arg, + int (*visit)(struct vm_area_struct *vma, + struct vm_area_struct **prev, unsigned long start, + unsigned long end, unsigned long arg)) +{ + struct vm_area_struct *vma; + struct vm_area_struct *prev; + unsigned long tmp; + int unmapped_error = 0; + + /* + * If the interval [start,end) covers some unmapped address + * ranges, just ignore them, but return -ENOMEM at the end. + * - different from the way of handling in mlock etc. + */ + vma = find_vma_prev(mm, start, &prev); + if (vma && start > vma->vm_start) + prev = vma; + + for (;;) { + int error; + + /* Still start < end. */ + if (!vma) + return -ENOMEM; + + /* Here start < (end|vma->vm_end). */ + if (start < vma->vm_start) { + unmapped_error = -ENOMEM; + start = vma->vm_start; + if (start >= end) + break; + } + + /* Here vma->vm_start <= start < (end|vma->vm_end) */ + tmp = vma->vm_end; + if (end < tmp) + tmp = end; + + /* Here vma->vm_start <= start < tmp <= (end|vma->vm_end). */ + error = visit(vma, &prev, start, tmp, arg); + if (error) + return error; + start = tmp; + if (prev && start < prev->vm_end) + start = prev->vm_end; + if (start >= end) + break; + if (prev) + vma = prev->vm_next; + else /* madvise_remove dropped mmap_lock */ + vma = find_vma(mm, start); + } + + return unmapped_error; +} + +/* * The madvise(2) system call. 
* * Applications can use madvise() to advise the kernel how it should @@ -1127,10 +1190,8 @@ process_madvise_behavior_valid(int behav */ int do_madvise(struct mm_struct *mm, unsigned long start, size_t len_in, int behavior) { - unsigned long end, tmp; - struct vm_area_struct *vma, *prev; - int unmapped_error = 0; - int error = -EINVAL; + unsigned long end; + int error; int write; size_t len; struct blk_plug plug; @@ -1138,23 +1199,22 @@ int do_madvise(struct mm_struct *mm, uns start = untagged_addr(start); if (!madvise_behavior_valid(behavior)) - return error; + return -EINVAL; if (!PAGE_ALIGNED(start)) - return error; + return -EINVAL; len = PAGE_ALIGN(len_in); /* Check to see whether len was rounded up from small -ve to zero */ if (len_in && !len) - return error; + return -EINVAL; end = start + len; if (end < start) - return error; + return -EINVAL; - error = 0; if (end == start) - return error; + return 0; #ifdef CONFIG_MEMORY_FAILURE if (behavior == MADV_HWPOISON || behavior == MADV_SOFT_OFFLINE) @@ -1169,51 +1229,9 @@ int do_madvise(struct mm_struct *mm, uns mmap_read_lock(mm); } - /* - * If the interval [start,end) covers some unmapped address - * ranges, just ignore them, but return -ENOMEM at the end. - * - different from the way of handling in mlock etc. - */ - vma = find_vma_prev(mm, start, &prev); - if (vma && start > vma->vm_start) - prev = vma; - blk_start_plug(&plug); - for (;;) { - /* Still start < end. */ - error = -ENOMEM; - if (!vma) - goto out; - - /* Here start < (end|vma->vm_end). */ - if (start < vma->vm_start) { - unmapped_error = -ENOMEM; - start = vma->vm_start; - if (start >= end) - goto out; - } - - /* Here vma->vm_start <= start < (end|vma->vm_end) */ - tmp = vma->vm_end; - if (end < tmp) - tmp = end; - - /* Here vma->vm_start <= start < tmp <= (end|vma->vm_end). 
*/ - error = madvise_vma(vma, &prev, start, tmp, behavior); - if (error) - goto out; - start = tmp; - if (prev && start < prev->vm_end) - start = prev->vm_end; - error = unmapped_error; - if (start >= end) - goto out; - if (prev) - vma = prev->vm_next; - else /* madvise_remove dropped mmap_lock */ - vma = find_vma(mm, start); - } -out: + error = madvise_walk_vmas(mm, start, end, behavior, + madvise_vma_behavior); blk_finish_plug(&plug); if (write) mmap_write_unlock(mm); From patchwork Fri Jan 14 22:05:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714083 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B2196C433FE for ; Fri, 14 Jan 2022 22:06:04 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 44C0D6B00F3; Fri, 14 Jan 2022 17:06:04 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 3D32A6B00F6; Fri, 14 Jan 2022 17:06:04 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 20D856B00F3; Fri, 14 Jan 2022 17:06:04 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0188.hostedemail.com [216.40.44.188]) by kanga.kvack.org (Postfix) with ESMTP id 04F616B00F3 for ; Fri, 14 Jan 2022 17:06:04 -0500 (EST) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id B789193F28 for ; Fri, 14 Jan 2022 22:06:03 +0000 (UTC) X-FDA: 79030276206.13.E54D251 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf23.hostedemail.com (Postfix) with ESMTP id 125C3140002 for ; Fri, 14 Jan 2022 22:06:02 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) 
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id EBC8AB825F5; Fri, 14 Jan 2022 22:06:01 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D997AC36AE9; Fri, 14 Jan 2022 22:05:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197960; bh=JPAyqrfaYfQNRLLxBfQSNOG6BDBDtSN9wd+KLPmQOP0=; h=Date:From:To:Subject:In-Reply-To:From; b=xvCivqOkMCF/ccdagD+gFODyvtxe/drRLwhSeCBNl5YA/2oO+/BKKmSh2SoKLL4ze 1Sh27bkbNmSvdxSEb9BXil5UvVv5wcM893VWam0qTDttqyBlaUvTigv6fxpBvFUv0G 3zFQTt6mv9T34Po5v+dHmq/ucTTmd9rZHaiRAx2Y= Date: Fri, 14 Jan 2022 14:05:59 -0800 From: Andrew Morton To: akpm@linux-foundation.org, ccross@google.com, dave.hansen@intel.com, ebiederm@xmission.com, gorcunov@openvz.org, hannes@cmpxchg.org, hughd@google.com, jan.glauber@gmail.com, john.stultz@linaro.org, keescook@chromium.org, linux-mm@kvack.org, mgorman@suse.de, minchan@kernel.org, mingo@kernel.org, mm-commits@vger.kernel.org, oleg@redhat.com, penberg@kernel.org, peterz@infradead.org, rientjes@google.com, rob@landley.net, serge.hallyn@ubuntu.com, sfr@canb.auug.org.au, shli@fusionio.com, surenb@google.com, torvalds@linux-foundation.org, viro@zeniv.linux.org.uk Subject: [patch 057/146] mm: add a field to store names for private anonymous memory Message-ID: <20220114220559.Ctv1K5aWW%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 125C3140002 X-Stat-Signature: jqs944pakrwi1re5n45mnwrc4wt4nezd Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=xvCivqOk; spf=pass (imf23.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-Rspamd-Server: rspam07 X-HE-Tag: 1642197962-536802 X-Bogosity: Ham, 
tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

From: Colin Cross
Subject: mm: add a field to store names for private anonymous memory

In many userspace applications, and especially in the VM-based applications that Android uses heavily, there are multiple different allocators in use. At a minimum there are libc malloc and the stack, and in many cases there are libc malloc, the stack, direct syscalls to mmap anonymous memory, and multiple VM heaps (one for small objects, one for big objects, etc.). Each of these layers usually has its own tools to inspect its usage: malloc by compiling a debug version, the VM through heap inspection tools, and for direct syscalls there is usually no way to track them.

On Android we heavily use a set of tools that use an extended version of the logic covered in Documentation/vm/pagemap.txt to walk all pages mapped in userspace and slice their usage by process, shared (COW) vs. unique mappings, backing, etc. This can account for real physical memory usage even in cases like fork without exec (which Android uses heavily to share as many private COW pages as possible between processes), Kernel SamePage Merging, and clean zero pages. It produces a measurement of the pages that only exist in that process (USS, for unique), and a measurement of the physical memory usage of that process with the cost of shared pages being evenly split between processes that share them (PSS).

If all anonymous memory is indistinguishable, then figuring out the real physical memory usage (PSS) of each heap requires either a pagemap walking tool that can understand the heap debugging of every layer, or for every layer's heap debugging tools to implement the pagemap walking logic, in which case it is hard to get a consistent view of memory across the whole system.

Tracking the information in userspace leads to all sorts of problems.
It either needs to be stored inside the process, which means every process has to have an API to export its current heap information upon request, or it has to be stored externally in a filesystem that somebody needs to clean up on crashes. It needs to be readable while the process is still running, so it has to have some sort of synchronization with every layer of userspace. Efficiently tracking the ranges requires reimplementing something like the kernel vma trees, and linking to it from every layer of userspace. It requires more memory, more syscalls, more runtime cost, and more complexity to separately track regions that the kernel is already tracking.

This patch adds a field to /proc/pid/maps and /proc/pid/smaps to show a userspace-provided name for anonymous vmas. The names of named anonymous vmas are shown in /proc/pid/maps and /proc/pid/smaps as [anon:<name>].

Userspace can set the name for a region of memory by calling

   prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, start, len, (unsigned long)name);

Setting the name to NULL clears it. The name length limit is 80 bytes including the NUL terminator, and the name is checked to contain only printable ASCII characters (including space), except '[', ']', '\', '$' and '`'.

ASCII strings are used so that vmas have descriptive identifiers that can be understood by users reading /proc/pid/maps or /proc/pid/smaps. Names can be standardized for a given system and they can include some variable parts such as the name of the allocator or a library, tid of the thread using it, etc.

The name is stored in a pointer in the shared union in vm_area_struct that points to a null-terminated string. Anonymous vmas that have the same name (equivalent strings) and are otherwise mergeable will be merged. The name pointers are not shared between vmas even if they contain the same name. The name pointer is stored in a union with fields that are only used on file-backed mappings, so it does not increase memory usage.
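As an illustration only (not part of the patch), the prctl() interface and the name rules described above could be exercised from userspace roughly as follows. The helper names anon_name_is_valid() and name_anon_region() are hypothetical; the PR_SET_VMA and PR_SET_VMA_ANON_NAME values are copied from the include/uapi/linux/prctl.h hunk of this patch in case the installed headers predate it. The userspace pre-check is an assumption about what a caller might want; the kernel performs its own validation regardless.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/prctl.h>

/* Values from this patch's prctl.h hunk; only defined if headers lack them. */
#ifndef PR_SET_VMA
#define PR_SET_VMA 0x53564d41
#endif
#ifndef PR_SET_VMA_ANON_NAME
#define PR_SET_VMA_ANON_NAME 0
#endif

/* Mirrors the 80-byte limit (including the NUL terminator) from the patch. */
#define ANON_NAME_MAX_LEN 80

/*
 * Hypothetical userspace pre-check mirroring the documented rules:
 * printable ASCII only (space allowed), none of \ ` $ [ ].
 */
static bool anon_name_is_valid(const char *name)
{
	size_t len = strlen(name);

	if (len + 1 > ANON_NAME_MAX_LEN)
		return false;

	for (size_t i = 0; i < len; i++) {
		unsigned char ch = (unsigned char)name[i];

		if (ch < 0x20 || ch > 0x7e)
			return false;
		if (strchr("\\`$[]", ch))
			return false;
	}
	return true;
}

/*
 * Hypothetical wrapper: names the anonymous region [addr, addr + len).
 * Passing name == NULL asks the kernel to clear the name.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int name_anon_region(void *addr, size_t len, const char *name)
{
	assert(addr != NULL);

	if (name && !anon_name_is_valid(name)) {
		errno = EINVAL;
		return -1;
	}
	return prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME,
		     (unsigned long)addr, (unsigned long)len,
		     (unsigned long)name);
}
```

A caller would typically mmap() a private anonymous region and then call name_anon_region(addr, len, "my heap"). On a kernel built without CONFIG_ANON_VMA_NAME the call is expected to fail with EINVAL, since the stub prctl_set_vma() in this patch returns -EINVAL.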
The CONFIG_ANON_VMA_NAME kernel configuration option is introduced to enable this feature. It keeps the feature disabled by default to prevent any additional memory overhead and to avoid confusing procfs parsers on systems which are not ready to support named anonymous vmas.

The patch is based on the original patch developed by Colin Cross, more specifically on its latest version [1] posted upstream by Sumit Semwal. That version used a userspace pointer to store vma names. In that design, name pointers could be shared between vmas. However, during the last upstreaming attempt, Kees Cook raised concerns [2] about this approach and suggested copying the name into kernel memory space, performing validity checks [3], and storing it as a string referenced from vm_area_struct.

One big concern is fork() performance, since fork() would need to strdup anonymous vma names. Dave Hansen suggested experimenting with a worst-case scenario of forking a process with 64k vmas having the longest possible names [4]. I ran this experiment on an ARM64 Android device and recorded a worst-case regression of almost 40% when forking such a process.

This regression is addressed in the followup patch, which replaces the pointer to a name with a refcounted structure that allows sharing the name pointer between vmas of the same name. Instead of duplicating the string during fork() or when splitting a vma, it increments the refcount.

[1] https://lore.kernel.org/linux-mm/20200901161459.11772-4-sumit.semwal@linaro.org/
[2] https://lore.kernel.org/linux-mm/202009031031.D32EF57ED@keescook/
[3] https://lore.kernel.org/linux-mm/202009031022.3834F692@keescook/
[4] https://lore.kernel.org/linux-mm/5d0358ab-8c47-2f5f-8e43-23b89d6a8e95@intel.com/

Changes for prctl(2) manual page (in the options section):

PR_SET_VMA
Sets an attribute specified in arg2 for virtual memory areas starting from the address specified in arg3 and spanning the size specified in arg4. arg5 specifies the value of the attribute to be set.
Note that assigning an attribute to a virtual memory area might prevent it from being merged with adjacent virtual memory areas due to the difference in that attribute's value.

Currently, arg2 must be one of:

PR_SET_VMA_ANON_NAME
Set a name for anonymous virtual memory areas. arg5 should be a pointer to a null-terminated string containing the name. The name length including null byte cannot exceed 80 bytes. If arg5 is NULL, the name of the appropriate anonymous virtual memory areas will be reset. The name can contain only printable ascii characters (including space), except '[',']','\','$' and '`'.

This feature is available only if the kernel is built with the CONFIG_ANON_VMA_NAME option enabled.

[surenb@google.com: docs: proc.rst: /proc/PID/maps: fix malformed table]
Link: https://lkml.kernel.org/r/20211123185928.2513763-1-surenb@google.com
[surenb: rebased over v5.15-rc6, replaced userpointer with a kernel copy, added input sanitization and CONFIG_ANON_VMA_NAME config. The bulk of the work here was done by Colin Cross, therefore, with his permission, keeping him as the author]
Link: https://lkml.kernel.org/r/20211019215511.3771969-2-surenb@google.com
Signed-off-by: Colin Cross
Signed-off-by: Suren Baghdasaryan
Reviewed-by: Kees Cook
Cc: Stephen Rothwell
Cc: Al Viro
Cc: Cyrill Gorcunov
Cc: Dave Hansen
Cc: David Rientjes
Cc: "Eric W. Biederman"
Cc: Hugh Dickins
Cc: Ingo Molnar
Cc: Jan Glauber
Cc: Johannes Weiner
Cc: John Stultz
Cc: Mel Gorman
Cc: Minchan Kim
Cc: Oleg Nesterov
Cc: Pekka Enberg
Cc: Peter Zijlstra
Cc: Rob Landley
Cc: "Serge E.
Hallyn" Cc: Shaohua Li Signed-off-by: Andrew Morton --- Documentation/filesystems/proc.rst | 6 - fs/proc/task_mmu.c | 12 ++ fs/userfaultfd.c | 7 - include/linux/mm.h | 13 ++ include/linux/mm_types.h | 64 ++++++++++++- include/uapi/linux/prctl.h | 3 kernel/fork.c | 2 kernel/sys.c | 63 +++++++++++++ mm/Kconfig | 14 ++ mm/madvise.c | 129 ++++++++++++++++++++++++++- mm/mempolicy.c | 3 mm/mlock.c | 2 mm/mmap.c | 38 ++++--- mm/mprotect.c | 2 14 files changed, 324 insertions(+), 34 deletions(-) --- a/Documentation/filesystems/proc.rst~mm-add-a-field-to-store-names-for-private-anonymous-memory +++ a/Documentation/filesystems/proc.rst @@ -426,12 +426,14 @@ with the memory region, as the case woul The "pathname" shows the name associated file for this mapping. If the mapping is not associated with a file: - ======= ==================================== + ============= ==================================== [heap] the heap of the program [stack] the stack of the main process [vdso] the "virtual dynamic shared object", the kernel system call handler - ======= ==================================== + [anon:] an anonymous mapping that has been + named by userspace + ============= ==================================== or if empty, the mapping is anonymous. 
--- a/fs/proc/task_mmu.c~mm-add-a-field-to-store-names-for-private-anonymous-memory +++ a/fs/proc/task_mmu.c @@ -308,6 +308,8 @@ show_map_vma(struct seq_file *m, struct name = arch_vma_name(vma); if (!name) { + const char *anon_name; + if (!mm) { name = "[vdso]"; goto done; @@ -319,8 +321,16 @@ show_map_vma(struct seq_file *m, struct goto done; } - if (is_stack(vma)) + if (is_stack(vma)) { name = "[stack]"; + goto done; + } + + anon_name = vma_anon_name(vma); + if (anon_name) { + seq_pad(m, ' '); + seq_printf(m, "[anon:%s]", anon_name); + } } done: --- a/fs/userfaultfd.c~mm-add-a-field-to-store-names-for-private-anonymous-memory +++ a/fs/userfaultfd.c @@ -877,7 +877,7 @@ static int userfaultfd_release(struct in new_flags, vma->anon_vma, vma->vm_file, vma->vm_pgoff, vma_policy(vma), - NULL_VM_UFFD_CTX); + NULL_VM_UFFD_CTX, vma_anon_name(vma)); if (prev) vma = prev; else @@ -1436,7 +1436,8 @@ static int userfaultfd_register(struct u prev = vma_merge(mm, prev, start, vma_end, new_flags, vma->anon_vma, vma->vm_file, vma->vm_pgoff, vma_policy(vma), - ((struct vm_userfaultfd_ctx){ ctx })); + ((struct vm_userfaultfd_ctx){ ctx }), + vma_anon_name(vma)); if (prev) { vma = prev; goto next; @@ -1613,7 +1614,7 @@ static int userfaultfd_unregister(struct prev = vma_merge(mm, prev, start, vma_end, new_flags, vma->anon_vma, vma->vm_file, vma->vm_pgoff, vma_policy(vma), - NULL_VM_UFFD_CTX); + NULL_VM_UFFD_CTX, vma_anon_name(vma)); if (prev) { vma = prev; goto next; --- a/include/linux/mm.h~mm-add-a-field-to-store-names-for-private-anonymous-memory +++ a/include/linux/mm.h @@ -2658,7 +2658,7 @@ static inline int vma_adjust(struct vm_a extern struct vm_area_struct *vma_merge(struct mm_struct *, struct vm_area_struct *prev, unsigned long addr, unsigned long end, unsigned long vm_flags, struct anon_vma *, struct file *, pgoff_t, - struct mempolicy *, struct vm_userfaultfd_ctx); + struct mempolicy *, struct vm_userfaultfd_ctx, const char *); extern struct anon_vma 
*find_mergeable_anon_vma(struct vm_area_struct *); extern int __split_vma(struct mm_struct *, struct vm_area_struct *, unsigned long addr, int new_below); @@ -3391,5 +3391,16 @@ static inline int seal_check_future_writ return 0; } +#ifdef CONFIG_ANON_VMA_NAME +int madvise_set_anon_name(struct mm_struct *mm, unsigned long start, + unsigned long len_in, const char *name); +#else +static inline int +madvise_set_anon_name(struct mm_struct *mm, unsigned long start, + unsigned long len_in, const char *name) { + return 0; +} +#endif + #endif /* __KERNEL__ */ #endif /* _LINUX_MM_H */ --- a/include/linux/mm_types.h~mm-add-a-field-to-store-names-for-private-anonymous-memory +++ a/include/linux/mm_types.h @@ -426,11 +426,19 @@ struct vm_area_struct { /* * For areas with an address space and backing store, * linkage into the address_space->i_mmap interval tree. + * + * For private anonymous mappings, a pointer to a null terminated string + * containing the name given to the vma, or NULL if unnamed. */ - struct { - struct rb_node rb; - unsigned long rb_subtree_last; - } shared; + + union { + struct { + struct rb_node rb; + unsigned long rb_subtree_last; + } shared; + /* Serialized by mmap_sem. */ + char *anon_name; + }; /* * A file's MAP_PRIVATE vma can be in both i_mmap tree and anon_vma @@ -875,4 +883,52 @@ typedef struct { unsigned long val; } swp_entry_t; +#ifdef CONFIG_ANON_VMA_NAME +/* + * mmap_lock should be read-locked when calling vma_anon_name() and while using + * the returned pointer. + */ +extern const char *vma_anon_name(struct vm_area_struct *vma); + +/* + * mmap_lock should be read-locked for orig_vma->vm_mm. + * mmap_lock should be write-locked for new_vma->vm_mm or new_vma should be + * isolated. + */ +extern void dup_vma_anon_name(struct vm_area_struct *orig_vma, + struct vm_area_struct *new_vma); + +/* + * mmap_lock should be write-locked or vma should have been isolated under + * write-locked mmap_lock protection. 
+ */ +extern void free_vma_anon_name(struct vm_area_struct *vma); + +/* mmap_lock should be read-locked */ +static inline bool is_same_vma_anon_name(struct vm_area_struct *vma, + const char *name) +{ + const char *vma_name = vma_anon_name(vma); + + /* either both NULL, or pointers to same string */ + if (vma_name == name) + return true; + + return name && vma_name && !strcmp(name, vma_name); +} +#else /* CONFIG_ANON_VMA_NAME */ +static inline const char *vma_anon_name(struct vm_area_struct *vma) +{ + return NULL; +} +static inline void dup_vma_anon_name(struct vm_area_struct *orig_vma, + struct vm_area_struct *new_vma) {} +static inline void free_vma_anon_name(struct vm_area_struct *vma) {} +static inline bool is_same_vma_anon_name(struct vm_area_struct *vma, + const char *name) +{ + return true; +} +#endif /* CONFIG_ANON_VMA_NAME */ + #endif /* _LINUX_MM_TYPES_H */ --- a/include/uapi/linux/prctl.h~mm-add-a-field-to-store-names-for-private-anonymous-memory +++ a/include/uapi/linux/prctl.h @@ -272,4 +272,7 @@ struct prctl_mm_map { # define PR_SCHED_CORE_SCOPE_THREAD_GROUP 1 # define PR_SCHED_CORE_SCOPE_PROCESS_GROUP 2 +#define PR_SET_VMA 0x53564d41 +# define PR_SET_VMA_ANON_NAME 0 + #endif /* _LINUX_PRCTL_H */ --- a/kernel/fork.c~mm-add-a-field-to-store-names-for-private-anonymous-memory +++ a/kernel/fork.c @@ -365,12 +365,14 @@ struct vm_area_struct *vm_area_dup(struc *new = data_race(*orig); INIT_LIST_HEAD(&new->anon_vma_chain); new->vm_next = new->vm_prev = NULL; + dup_vma_anon_name(orig, new); } return new; } void vm_area_free(struct vm_area_struct *vma) { + free_vma_anon_name(vma); kmem_cache_free(vm_area_cachep, vma); } --- a/kernel/sys.c~mm-add-a-field-to-store-names-for-private-anonymous-memory +++ a/kernel/sys.c @@ -2261,6 +2261,66 @@ int __weak arch_prctl_spec_ctrl_set(stru #define PR_IO_FLUSHER (PF_MEMALLOC_NOIO | PF_LOCAL_THROTTLE) +#ifdef CONFIG_ANON_VMA_NAME + +#define ANON_VMA_NAME_MAX_LEN 80 +#define ANON_VMA_NAME_INVALID_CHARS "\\`$[]" + +static 
inline bool is_valid_name_char(char ch) +{ + /* printable ascii characters, excluding ANON_VMA_NAME_INVALID_CHARS */ + return ch > 0x1f && ch < 0x7f && + !strchr(ANON_VMA_NAME_INVALID_CHARS, ch); +} + +static int prctl_set_vma(unsigned long opt, unsigned long addr, + unsigned long size, unsigned long arg) +{ + struct mm_struct *mm = current->mm; + const char __user *uname; + char *name, *pch; + int error; + + switch (opt) { + case PR_SET_VMA_ANON_NAME: + uname = (const char __user *)arg; + if (uname) { + name = strndup_user(uname, ANON_VMA_NAME_MAX_LEN); + + if (IS_ERR(name)) + return PTR_ERR(name); + + for (pch = name; *pch != '\0'; pch++) { + if (!is_valid_name_char(*pch)) { + kfree(name); + return -EINVAL; + } + } + } else { + /* Reset the name */ + name = NULL; + } + + mmap_write_lock(mm); + error = madvise_set_anon_name(mm, addr, size, name); + mmap_write_unlock(mm); + kfree(name); + break; + default: + error = -EINVAL; + } + + return error; +} + +#else /* CONFIG_ANON_VMA_NAME */ +static int prctl_set_vma(unsigned long opt, unsigned long start, + unsigned long size, unsigned long arg) +{ + return -EINVAL; +} +#endif /* CONFIG_ANON_VMA_NAME */ + SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3, unsigned long, arg4, unsigned long, arg5) { @@ -2530,6 +2590,9 @@ SYSCALL_DEFINE5(prctl, int, option, unsi error = sched_core_share_pid(arg2, arg3, arg4, arg5); break; #endif + case PR_SET_VMA: + error = prctl_set_vma(arg2, arg3, arg4, arg5); + break; default: error = -EINVAL; break; --- a/mm/Kconfig~mm-add-a-field-to-store-names-for-private-anonymous-memory +++ a/mm/Kconfig @@ -900,6 +900,20 @@ config IO_MAPPING config SECRETMEM def_bool ARCH_HAS_SET_DIRECT_MAP && !EMBEDDED +config ANON_VMA_NAME + bool "Anonymous VMA name support" + depends on PROC_FS && ADVISE_SYSCALLS && MMU + + help + Allow naming anonymous virtual memory areas. + + This feature allows assigning names to virtual memory areas. 
Assigned + names can be later retrieved from /proc/pid/maps and /proc/pid/smaps + and help identifying individual anonymous memory areas. + Assigning a name to anonymous virtual memory area might prevent that + area from being merged with adjacent virtual memory areas due to the + difference in their name. + source "mm/damon/Kconfig" endmenu --- a/mm/madvise.c~mm-add-a-field-to-store-names-for-private-anonymous-memory +++ a/mm/madvise.c @@ -18,6 +18,7 @@ #include #include #include +#include #include #include #include @@ -62,19 +63,84 @@ static int madvise_need_mmap_write(int b } } +#ifdef CONFIG_ANON_VMA_NAME +static inline bool has_vma_anon_name(struct vm_area_struct *vma) +{ + return !vma->vm_file && vma->anon_name; +} + +const char *vma_anon_name(struct vm_area_struct *vma) +{ + if (!has_vma_anon_name(vma)) + return NULL; + + mmap_assert_locked(vma->vm_mm); + + return vma->anon_name; +} + +void dup_vma_anon_name(struct vm_area_struct *orig_vma, + struct vm_area_struct *new_vma) +{ + if (!has_vma_anon_name(orig_vma)) + return; + + new_vma->anon_name = kstrdup(orig_vma->anon_name, GFP_KERNEL); +} + +void free_vma_anon_name(struct vm_area_struct *vma) +{ + if (!has_vma_anon_name(vma)) + return; + + kfree(vma->anon_name); + vma->anon_name = NULL; +} + +/* mmap_lock should be write-locked */ +static int replace_vma_anon_name(struct vm_area_struct *vma, const char *name) +{ + if (!name) { + free_vma_anon_name(vma); + return 0; + } + + if (vma->anon_name) { + /* Same name, nothing to do here */ + if (!strcmp(name, vma->anon_name)) + return 0; + + free_vma_anon_name(vma); + } + vma->anon_name = kstrdup(name, GFP_KERNEL); + if (!vma->anon_name) + return -ENOMEM; + + return 0; +} +#else /* CONFIG_ANON_VMA_NAME */ +static int replace_vma_anon_name(struct vm_area_struct *vma, const char *name) +{ + if (name) + return -EINVAL; + + return 0; +} +#endif /* CONFIG_ANON_VMA_NAME */ /* * Update the vm_flags on region of a vma, splitting it or merging it as * necessary. 
Must be called with mmap_sem held for writing; */ static int madvise_update_vma(struct vm_area_struct *vma, struct vm_area_struct **prev, unsigned long start, - unsigned long end, unsigned long new_flags) + unsigned long end, unsigned long new_flags, + const char *name) { struct mm_struct *mm = vma->vm_mm; int error; pgoff_t pgoff; - if (new_flags == vma->vm_flags) { + if (new_flags == vma->vm_flags && is_same_vma_anon_name(vma, name)) { *prev = vma; return 0; } @@ -82,7 +148,7 @@ static int madvise_update_vma(struct vm_ pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT); *prev = vma_merge(mm, *prev, start, end, new_flags, vma->anon_vma, vma->vm_file, pgoff, vma_policy(vma), - vma->vm_userfaultfd_ctx); + vma->vm_userfaultfd_ctx, name); if (*prev) { vma = *prev; goto success; @@ -111,6 +177,11 @@ success: * vm_flags is protected by the mmap_lock held in write mode. */ vma->vm_flags = new_flags; + if (!vma->vm_file) { + error = replace_vma_anon_name(vma, name); + if (error) + return error; + } return 0; } @@ -938,7 +1009,8 @@ static int madvise_vma_behavior(struct v break; } - error = madvise_update_vma(vma, prev, start, end, new_flags); + error = madvise_update_vma(vma, prev, start, end, new_flags, + vma_anon_name(vma)); out: /* @@ -1118,6 +1190,55 @@ int madvise_walk_vmas(struct mm_struct * return unmapped_error; } +#ifdef CONFIG_ANON_VMA_NAME +static int madvise_vma_anon_name(struct vm_area_struct *vma, + struct vm_area_struct **prev, + unsigned long start, unsigned long end, + unsigned long name) +{ + int error; + + /* Only anonymous mappings can be named */ + if (vma->vm_file) + return -EBADF; + + error = madvise_update_vma(vma, prev, start, end, vma->vm_flags, + (const char *)name); + + /* + * madvise() returns EAGAIN if kernel resources, such as + * slab, are temporarily unavailable. 
+ */ + if (error == -ENOMEM) + error = -EAGAIN; + return error; +} + +int madvise_set_anon_name(struct mm_struct *mm, unsigned long start, + unsigned long len_in, const char *name) +{ + unsigned long end; + unsigned long len; + + if (start & ~PAGE_MASK) + return -EINVAL; + len = (len_in + ~PAGE_MASK) & PAGE_MASK; + + /* Check to see whether len was rounded up from small -ve to zero */ + if (len_in && !len) + return -EINVAL; + + end = start + len; + if (end < start) + return -EINVAL; + + if (end == start) + return 0; + + return madvise_walk_vmas(mm, start, end, (unsigned long)name, + madvise_vma_anon_name); +} +#endif /* CONFIG_ANON_VMA_NAME */ /* * The madvise(2) system call. * --- a/mm/mempolicy.c~mm-add-a-field-to-store-names-for-private-anonymous-memory +++ a/mm/mempolicy.c @@ -810,7 +810,8 @@ static int mbind_range(struct mm_struct ((vmstart - vma->vm_start) >> PAGE_SHIFT); prev = vma_merge(mm, prev, vmstart, vmend, vma->vm_flags, vma->anon_vma, vma->vm_file, pgoff, - new_pol, vma->vm_userfaultfd_ctx); + new_pol, vma->vm_userfaultfd_ctx, + vma_anon_name(vma)); if (prev) { vma = prev; next = vma->vm_next; --- a/mm/mlock.c~mm-add-a-field-to-store-names-for-private-anonymous-memory +++ a/mm/mlock.c @@ -512,7 +512,7 @@ static int mlock_fixup(struct vm_area_st pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT); *prev = vma_merge(mm, *prev, start, end, newflags, vma->anon_vma, vma->vm_file, pgoff, vma_policy(vma), - vma->vm_userfaultfd_ctx); + vma->vm_userfaultfd_ctx, vma_anon_name(vma)); if (*prev) { vma = *prev; goto success; --- a/mm/mmap.c~mm-add-a-field-to-store-names-for-private-anonymous-memory +++ a/mm/mmap.c @@ -1029,7 +1029,8 @@ again: */ static inline int is_mergeable_vma(struct vm_area_struct *vma, struct file *file, unsigned long vm_flags, - struct vm_userfaultfd_ctx vm_userfaultfd_ctx) + struct vm_userfaultfd_ctx vm_userfaultfd_ctx, + const char *anon_name) { /* * VM_SOFTDIRTY should not prevent from VMA merging, if we @@ -1047,6 +1048,8 @@ 
static inline int is_mergeable_vma(struc return 0; if (!is_mergeable_vm_userfaultfd_ctx(vma, vm_userfaultfd_ctx)) return 0; + if (!is_same_vma_anon_name(vma, anon_name)) + return 0; return 1; } @@ -1079,9 +1082,10 @@ static int can_vma_merge_before(struct vm_area_struct *vma, unsigned long vm_flags, struct anon_vma *anon_vma, struct file *file, pgoff_t vm_pgoff, - struct vm_userfaultfd_ctx vm_userfaultfd_ctx) + struct vm_userfaultfd_ctx vm_userfaultfd_ctx, + const char *anon_name) { - if (is_mergeable_vma(vma, file, vm_flags, vm_userfaultfd_ctx) && + if (is_mergeable_vma(vma, file, vm_flags, vm_userfaultfd_ctx, anon_name) && is_mergeable_anon_vma(anon_vma, vma->anon_vma, vma)) { if (vma->vm_pgoff == vm_pgoff) return 1; @@ -1100,9 +1104,10 @@ static int can_vma_merge_after(struct vm_area_struct *vma, unsigned long vm_flags, struct anon_vma *anon_vma, struct file *file, pgoff_t vm_pgoff, - struct vm_userfaultfd_ctx vm_userfaultfd_ctx) + struct vm_userfaultfd_ctx vm_userfaultfd_ctx, + const char *anon_name) { - if (is_mergeable_vma(vma, file, vm_flags, vm_userfaultfd_ctx) && + if (is_mergeable_vma(vma, file, vm_flags, vm_userfaultfd_ctx, anon_name) && is_mergeable_anon_vma(anon_vma, vma->anon_vma, vma)) { pgoff_t vm_pglen; vm_pglen = vma_pages(vma); @@ -1113,9 +1118,9 @@ can_vma_merge_after(struct vm_area_struc } /* - * Given a mapping request (addr,end,vm_flags,file,pgoff), figure out - * whether that can be merged with its predecessor or its successor. - * Or both (it neatly fills a hole). + * Given a mapping request (addr,end,vm_flags,file,pgoff,anon_name), + * figure out whether that can be merged with its predecessor or its + * successor. Or both (it neatly fills a hole). 
* * In most cases - when called for mmap, brk or mremap - [addr,end) is * certain not to be mapped by the time vma_merge is called; but when @@ -1160,7 +1165,8 @@ struct vm_area_struct *vma_merge(struct unsigned long end, unsigned long vm_flags, struct anon_vma *anon_vma, struct file *file, pgoff_t pgoff, struct mempolicy *policy, - struct vm_userfaultfd_ctx vm_userfaultfd_ctx) + struct vm_userfaultfd_ctx vm_userfaultfd_ctx, + const char *anon_name) { pgoff_t pglen = (end - addr) >> PAGE_SHIFT; struct vm_area_struct *area, *next; @@ -1190,7 +1196,7 @@ struct vm_area_struct *vma_merge(struct mpol_equal(vma_policy(prev), policy) && can_vma_merge_after(prev, vm_flags, anon_vma, file, pgoff, - vm_userfaultfd_ctx)) { + vm_userfaultfd_ctx, anon_name)) { /* * OK, it can. Can we now merge in the successor as well? */ @@ -1199,7 +1205,7 @@ struct vm_area_struct *vma_merge(struct can_vma_merge_before(next, vm_flags, anon_vma, file, pgoff+pglen, - vm_userfaultfd_ctx) && + vm_userfaultfd_ctx, anon_name) && is_mergeable_anon_vma(prev->anon_vma, next->anon_vma, NULL)) { /* cases 1, 6 */ @@ -1222,7 +1228,7 @@ struct vm_area_struct *vma_merge(struct mpol_equal(policy, vma_policy(next)) && can_vma_merge_before(next, vm_flags, anon_vma, file, pgoff+pglen, - vm_userfaultfd_ctx)) { + vm_userfaultfd_ctx, anon_name)) { if (prev && addr < prev->vm_end) /* case 4 */ err = __vma_adjust(prev, prev->vm_start, addr, prev->vm_pgoff, NULL, next); @@ -1754,7 +1760,7 @@ unsigned long mmap_region(struct file *f * Can we just expand an old mapping? 
*/ vma = vma_merge(mm, prev, addr, addr + len, vm_flags, - NULL, file, pgoff, NULL, NULL_VM_UFFD_CTX); + NULL, file, pgoff, NULL, NULL_VM_UFFD_CTX, NULL); if (vma) goto out; @@ -1803,7 +1809,7 @@ unsigned long mmap_region(struct file *f */ if (unlikely(vm_flags != vma->vm_flags && prev)) { merge = vma_merge(mm, prev, vma->vm_start, vma->vm_end, vma->vm_flags, - NULL, vma->vm_file, vma->vm_pgoff, NULL, NULL_VM_UFFD_CTX); + NULL, vma->vm_file, vma->vm_pgoff, NULL, NULL_VM_UFFD_CTX, NULL); if (merge) { /* ->mmap() can change vma->vm_file and fput the original file. So * fput the vma->vm_file here or we would add an extra fput for file @@ -3056,7 +3062,7 @@ static int do_brk_flags(unsigned long ad /* Can we just expand an old private anonymous mapping? */ vma = vma_merge(mm, prev, addr, addr + len, flags, - NULL, NULL, pgoff, NULL, NULL_VM_UFFD_CTX); + NULL, NULL, pgoff, NULL, NULL_VM_UFFD_CTX, NULL); if (vma) goto out; @@ -3249,7 +3255,7 @@ struct vm_area_struct *copy_vma(struct v return NULL; /* should never get here */ new_vma = vma_merge(mm, prev, addr, addr + len, vma->vm_flags, vma->anon_vma, vma->vm_file, pgoff, vma_policy(vma), - vma->vm_userfaultfd_ctx); + vma->vm_userfaultfd_ctx, vma_anon_name(vma)); if (new_vma) { /* * Source vma may have been merged into new_vma --- a/mm/mprotect.c~mm-add-a-field-to-store-names-for-private-anonymous-memory +++ a/mm/mprotect.c @@ -464,7 +464,7 @@ mprotect_fixup(struct vm_area_struct *vm pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT); *pprev = vma_merge(mm, *pprev, start, end, newflags, vma->anon_vma, vma->vm_file, pgoff, vma_policy(vma), - vma->vm_userfaultfd_ctx); + vma->vm_userfaultfd_ctx, vma_anon_name(vma)); if (*pprev) { vma = *pprev; VM_WARN_ON((vma->vm_flags ^ newflags) & ~VM_SOFTDIRTY); From patchwork Fri Jan 14 22:06:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714084 Return-Path: 
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 66C54C433EF for ; Fri, 14 Jan 2022 22:06:08 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E5BB96B00F6; Fri, 14 Jan 2022 17:06:07 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id E0C0B6B00F7; Fri, 14 Jan 2022 17:06:07 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CAD1B6B00F8; Fri, 14 Jan 2022 17:06:07 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0034.hostedemail.com [216.40.44.34]) by kanga.kvack.org (Postfix) with ESMTP id B46846B00F6 for ; Fri, 14 Jan 2022 17:06:07 -0500 (EST) Received: from smtpin27.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 81E6B92E13 for ; Fri, 14 Jan 2022 22:06:07 +0000 (UTC) X-FDA: 79030276374.27.512F245 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf15.hostedemail.com (Postfix) with ESMTP id 1B94AA0008 for ; Fri, 14 Jan 2022 22:06:06 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 12715B8262F; Fri, 14 Jan 2022 22:06:06 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DB656C36AE5; Fri, 14 Jan 2022 22:06:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197964; bh=OU1yyGKrmkfjJX/TshcXSbRF+JTsP5ncjFrE9Rd0l0E=; h=Date:From:To:Subject:In-Reply-To:From; b=hO1QqERGlEnsl8BK31PBcXCAtytM05OHGAbab25BIB7nGv6C6MgCV2Ac9Z1vbEKKl 32AxF/ROiiAedoLKMsOgYJYRwyJ5hQu4mWouFNUHdPeVAoR4OlI8PoQDqDRgmncco0 owc2ONbWZtE/MMq2TBbUD4Qyo7WEHmRTfr3ORe1Q= Date: 
Fri, 14 Jan 2022 14:06:03 -0800 From: Andrew Morton To: akpm@linux-foundation.org, ccross@google.com, dave.hansen@intel.com, ebiederm@xmission.com, gorcunov@openvz.org, hannes@cmpxchg.org, hughd@google.com, jan.glauber@gmail.com, john.stultz@linaro.org, keescook@chromium.org, linux-mm@kvack.org, mgorman@suse.de, minchan@kernel.org, mingo@kernel.org, mm-commits@vger.kernel.org, oleg@redhat.com, penberg@kernel.org, peterz@infradead.org, rientjes@google.com, rob@landley.net, serge.hallyn@ubuntu.com, shli@fusionio.com, surenb@google.com, torvalds@linux-foundation.org, viro@zeniv.linux.org.uk Subject: [patch 058/146] mm: add anonymous vma name refcounting Message-ID: <20220114220603.xTg_jpyN1%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 1B94AA0008 X-Stat-Signature: 833e88amhy1knbki897bu5izcsnja8y8 Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=hO1QqERG; dmarc=none; spf=pass (imf15.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam08 X-HE-Tag: 1642197966-182961 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Suren Baghdasaryan Subject: mm: add anonymous vma name refcounting While forking a process with a high number (64K) of named anonymous vmas, the overhead caused by strdup() is noticeable. Experiments with an ARM64 Android device show up to a 40% performance regression when forking a process with 64k unpopulated anonymous vmas using the maximum name length, compared with the same process with the same number of anonymous vmas having no name. Introduce a refcounted anon_vma_name structure to avoid the overhead of copying vma names during fork() and when splitting named anonymous vmas.
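As an illustration of the refcounting idea, here is a minimal userspace sketch in C. It is not the kernel implementation: the names name_ref, name_ref_alloc(), name_ref_get() and name_ref_put() are hypothetical, a plain unsigned counter stands in for struct kref, and there is no locking or atomic access, so the sketch is only valid single-threaded.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for a refcounted vma name object. */
struct name_ref {
	unsigned int refcount;	/* plain counter; the patch uses struct kref */
	char name[];		/* flexible array member, must be last */
};

/* Allocate an object holding a copy of 'name', starting with one reference. */
static struct name_ref *name_ref_alloc(const char *name)
{
	size_t count = strlen(name) + 1;	/* +1 for the NUL terminator */
	struct name_ref *ref = malloc(sizeof(*ref) + count);

	if (ref) {
		ref->refcount = 1;
		memcpy(ref->name, name, count);
	}
	return ref;
}

/* Share an existing name: bump the counter instead of copying the string. */
static struct name_ref *name_ref_get(struct name_ref *ref)
{
	if (ref)
		ref->refcount++;
	return ref;
}

/* Drop one reference; returns 1 if this was the last one and it was freed. */
static int name_ref_put(struct name_ref *ref)
{
	if (!ref)
		return 0;
	if (--ref->refcount > 0)
		return 0;
	free(ref);
	return 1;
}
```

With this layout, duplicating a named object costs one counter increment rather than an allocation plus a string copy, which is the saving the patch is after for fork() and vma splitting.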
When a vma is duplicated, instead of copying the name we increment the refcount of this structure. Multiple vmas can point to the same anon_vma_name as long as they increment the refcount. The name member of the anon_vma_name structure is assigned at structure allocation time and is never changed. If the vma name changes, the refcount of the original structure is dropped, a new anon_vma_name structure is allocated to hold the new name, and the vma pointer is updated to point to the new structure. With this approach the fork() performance regression is reduced 3-4x, and for use cases with a more reasonable number of VMAs (a few thousand) the regression is not measurable. Link: https://lkml.kernel.org/r/20211019215511.3771969-3-surenb@google.com Signed-off-by: Suren Baghdasaryan Reviewed-by: Kees Cook Cc: Al Viro Cc: Colin Cross Cc: Cyrill Gorcunov Cc: Dave Hansen Cc: David Rientjes Cc: "Eric W. Biederman" Cc: Hugh Dickins Cc: Ingo Molnar Cc: Jan Glauber Cc: Johannes Weiner Cc: John Stultz Cc: Mel Gorman Cc: Minchan Kim Cc: Oleg Nesterov Cc: Pekka Enberg Cc: Peter Zijlstra Cc: Rob Landley Cc: "Serge E. Hallyn" Cc: Shaohua Li Signed-off-by: Andrew Morton --- include/linux/mm_types.h | 9 +++++++ mm/madvise.c | 42 +++++++++++++++++++++++++++++++------ 2 files changed, 44 insertions(+), 7 deletions(-) --- a/include/linux/mm_types.h~mm-add-anonymous-vma-name-refcounting +++ a/include/linux/mm_types.h @@ -5,6 +5,7 @@ #include #include +#include #include #include #include @@ -386,6 +387,12 @@ struct vm_userfaultfd_ctx { struct vm_userfaultfd_ctx {}; #endif /* CONFIG_USERFAULTFD */ +struct anon_vma_name { + struct kref kref; + /* The name needs to be at the end because it is dynamically sized. */ + char name[]; +}; + /* * This struct describes a virtual memory area. There is one of these * per VM-area/task. A VM area is any part of the process virtual memory @@ -437,7 +444,7 @@ struct vm_area_struct { unsigned long rb_subtree_last; } shared; /* Serialized by mmap_sem.
*/ - char *anon_name; + struct anon_vma_name *anon_name; }; /* --- a/mm/madvise.c~mm-add-anonymous-vma-name-refcounting +++ a/mm/madvise.c @@ -64,6 +64,29 @@ static int madvise_need_mmap_write(int b } #ifdef CONFIG_ANON_VMA_NAME +static struct anon_vma_name *anon_vma_name_alloc(const char *name) +{ + struct anon_vma_name *anon_name; + size_t count; + + /* Add 1 for NUL terminator at the end of the anon_name->name */ + count = strlen(name) + 1; + anon_name = kmalloc(struct_size(anon_name, name, count), GFP_KERNEL); + if (anon_name) { + kref_init(&anon_name->kref); + memcpy(anon_name->name, name, count); + } + + return anon_name; +} + +static void vma_anon_name_free(struct kref *kref) +{ + struct anon_vma_name *anon_name = + container_of(kref, struct anon_vma_name, kref); + kfree(anon_name); +} + static inline bool has_vma_anon_name(struct vm_area_struct *vma) { return !vma->vm_file && vma->anon_name; @@ -76,7 +99,7 @@ const char *vma_anon_name(struct vm_area mmap_assert_locked(vma->vm_mm); - return vma->anon_name; + return vma->anon_name->name; } void dup_vma_anon_name(struct vm_area_struct *orig_vma, @@ -85,34 +108,41 @@ void dup_vma_anon_name(struct vm_area_st if (!has_vma_anon_name(orig_vma)) return; - new_vma->anon_name = kstrdup(orig_vma->anon_name, GFP_KERNEL); + kref_get(&orig_vma->anon_name->kref); + new_vma->anon_name = orig_vma->anon_name; } void free_vma_anon_name(struct vm_area_struct *vma) { + struct anon_vma_name *anon_name; + if (!has_vma_anon_name(vma)) return; - kfree(vma->anon_name); + anon_name = vma->anon_name; vma->anon_name = NULL; + kref_put(&anon_name->kref, vma_anon_name_free); } /* mmap_lock should be write-locked */ static int replace_vma_anon_name(struct vm_area_struct *vma, const char *name) { + const char *anon_name; + if (!name) { free_vma_anon_name(vma); return 0; } - if (vma->anon_name) { + anon_name = vma_anon_name(vma); + if (anon_name) { /* Same name, nothing to do here */ - if (!strcmp(name, vma->anon_name)) + if (!strcmp(name, 
anon_name)) return 0; free_vma_anon_name(vma); } - vma->anon_name = kstrdup(name, GFP_KERNEL); + vma->anon_name = anon_vma_name_alloc(name); if (!vma->anon_name) return -ENOMEM; From patchwork Fri Jan 14 22:06:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714085 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1E4DEC433EF for ; Fri, 14 Jan 2022 22:06:11 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 97B656B00F7; Fri, 14 Jan 2022 17:06:10 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 92A176B00F9; Fri, 14 Jan 2022 17:06:10 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7CB2A6B00FA; Fri, 14 Jan 2022 17:06:10 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0063.hostedemail.com [216.40.44.63]) by kanga.kvack.org (Postfix) with ESMTP id 68CDF6B00F7 for ; Fri, 14 Jan 2022 17:06:10 -0500 (EST) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 2563B182012AF for ; Fri, 14 Jan 2022 22:06:10 +0000 (UTC) X-FDA: 79030276500.05.06D0DB4 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf06.hostedemail.com (Postfix) with ESMTP id 85DB3180009 for ; Fri, 14 Jan 2022 22:06:09 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id DACCB61FB7; Fri, 14 Jan 2022 22:06:08 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BBEE6C36AE9; Fri, 14 Jan 2022 
22:06:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197968; bh=368je0VaVIJlFtzJl1GJqS8Yl6LTj8XhpVD9P1TcOwE=; h=Date:From:To:Subject:In-Reply-To:From; b=pFw/N6sKCa3XctkNyRuRAW8B6ve0OB1OFippsF6SGdGXvAybevXu2juH8kKwgs5VM vR2kRs8eK5YNMxsEuWskbZXfzJi3612VQ6zZyyva5EK4iM/dkpuqSzo30nYXMixZ4N fVsMs7hMzpFSJgeztZBRBw54KzIpj+1ZC+wlq/zU= Date: Fri, 14 Jan 2022 14:06:07 -0800 From: Andrew Morton To: akpm@linux-foundation.org, arnd@arndb.de, ccross@google.com, ebiederm@xmission.com, keescook@chromium.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, peterz@infradead.org, sfr@canb.auug.org.au, surenb@google.com, torvalds@linux-foundation.org, vbabka@suse.cz, viro@zeniv.linux.org.uk, willy@infradead.org, yuzhao@google.com Subject: [patch 059/146] mm: move anon_vma declarations to linux/mm_inline.h Message-ID: <20220114220607.ER5C3rmbo%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 85DB3180009 X-Stat-Signature: 4oxd3us4cmkrdya5xbfwtjgtqmjqhww8 Authentication-Results: imf06.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="pFw/N6sK"; dmarc=none; spf=pass (imf06.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-HE-Tag: 1642197969-508430 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Arnd Bergmann Subject: mm: move anon_vma declarations to linux/mm_inline.h The patch to add anonymous vma names causes a build failure in some configurations: include/linux/mm_types.h: In function 'is_same_vma_anon_name': include/linux/mm_types.h:924:37: error: implicit declaration of function 'strcmp' [-Werror=implicit-function-declaration] 924 | return name 
&& vma_name && !strcmp(name, vma_name); | ^~~~~~ include/linux/mm_types.h:22:1: note: 'strcmp' is defined in header '<string.h>'; did you forget to '#include <string.h>'? This should not really be part of linux/mm_types.h in the first place, as that header is meant to only contain structure definitions and needs a minimum set of indirect includes itself. While the header clearly includes more than it should at this point, let's not make it worse by including string.h as well, which would pull in the expensive (compile-speed wise) fortify-string logic. Move the new functions into a separate header that only needs to be included in a couple of locations. Link: https://lkml.kernel.org/r/20211207125710.2503446-1-arnd@kernel.org Fixes: "mm: add a field to store names for private anonymous memory" Signed-off-by: Arnd Bergmann Cc: Al Viro Cc: Colin Cross Cc: Eric Biederman Cc: Kees Cook Cc: Matthew Wilcox (Oracle) Cc: Peter Xu Cc: Peter Zijlstra (Intel) Cc: Stephen Rothwell Cc: Suren Baghdasaryan Cc: Vlastimil Babka Cc: Yu Zhao Signed-off-by: Andrew Morton --- fs/proc/task_mmu.c | 1 fs/userfaultfd.c | 1 include/linux/mm_inline.h | 50 ++++++++++++++++++++++++++++++++++++ include/linux/mm_types.h | 48 ---------------------------------- kernel/fork.c | 1 mm/madvise.c | 1 mm/mmap.c | 1 7 files changed, 55 insertions(+), 48 deletions(-) --- a/fs/proc/task_mmu.c~mm-move-anon_vma-declarations-to-linux-mm_inlineh +++ a/fs/proc/task_mmu.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 #include #include +#include #include #include #include --- a/fs/userfaultfd.c~mm-move-anon_vma-declarations-to-linux-mm_inlineh +++ a/fs/userfaultfd.c @@ -15,6 +15,7 @@ #include #include #include +#include #include #include #include --- a/include/linux/mm_inline.h~mm-move-anon_vma-declarations-to-linux-mm_inlineh +++ a/include/linux/mm_inline.h @@ -4,6 +4,7 @@ #include #include +#include /** * folio_is_file_lru - Should the folio be on a file LRU or anon LRU?
@@ -135,4 +136,53 @@ static __always_inline void del_page_fro { lruvec_del_folio(lruvec, page_folio(page)); } + +#ifdef CONFIG_ANON_VMA_NAME +/* + * mmap_lock should be read-locked when calling vma_anon_name() and while using + * the returned pointer. + */ +extern const char *vma_anon_name(struct vm_area_struct *vma); + +/* + * mmap_lock should be read-locked for orig_vma->vm_mm. + * mmap_lock should be write-locked for new_vma->vm_mm or new_vma should be + * isolated. + */ +extern void dup_vma_anon_name(struct vm_area_struct *orig_vma, + struct vm_area_struct *new_vma); + +/* + * mmap_lock should be write-locked or vma should have been isolated under + * write-locked mmap_lock protection. + */ +extern void free_vma_anon_name(struct vm_area_struct *vma); + +/* mmap_lock should be read-locked */ +static inline bool is_same_vma_anon_name(struct vm_area_struct *vma, + const char *name) +{ + const char *vma_name = vma_anon_name(vma); + + /* either both NULL, or pointers to same string */ + if (vma_name == name) + return true; + + return name && vma_name && !strcmp(name, vma_name); +} +#else /* CONFIG_ANON_VMA_NAME */ +static inline const char *vma_anon_name(struct vm_area_struct *vma) +{ + return NULL; +} +static inline void dup_vma_anon_name(struct vm_area_struct *orig_vma, + struct vm_area_struct *new_vma) {} +static inline void free_vma_anon_name(struct vm_area_struct *vma) {} +static inline bool is_same_vma_anon_name(struct vm_area_struct *vma, + const char *name) +{ + return true; +} +#endif /* CONFIG_ANON_VMA_NAME */ + #endif --- a/include/linux/mm_types.h~mm-move-anon_vma-declarations-to-linux-mm_inlineh +++ a/include/linux/mm_types.h @@ -890,52 +890,4 @@ typedef struct { unsigned long val; } swp_entry_t; -#ifdef CONFIG_ANON_VMA_NAME -/* - * mmap_lock should be read-locked when calling vma_anon_name() and while using - * the returned pointer. 
- */ -extern const char *vma_anon_name(struct vm_area_struct *vma); - -/* - * mmap_lock should be read-locked for orig_vma->vm_mm. - * mmap_lock should be write-locked for new_vma->vm_mm or new_vma should be - * isolated. - */ -extern void dup_vma_anon_name(struct vm_area_struct *orig_vma, - struct vm_area_struct *new_vma); - -/* - * mmap_lock should be write-locked or vma should have been isolated under - * write-locked mmap_lock protection. - */ -extern void free_vma_anon_name(struct vm_area_struct *vma); - -/* mmap_lock should be read-locked */ -static inline bool is_same_vma_anon_name(struct vm_area_struct *vma, - const char *name) -{ - const char *vma_name = vma_anon_name(vma); - - /* either both NULL, or pointers to same string */ - if (vma_name == name) - return true; - - return name && vma_name && !strcmp(name, vma_name); -} -#else /* CONFIG_ANON_VMA_NAME */ -static inline const char *vma_anon_name(struct vm_area_struct *vma) -{ - return NULL; -} -static inline void dup_vma_anon_name(struct vm_area_struct *orig_vma, - struct vm_area_struct *new_vma) {} -static inline void free_vma_anon_name(struct vm_area_struct *vma) {} -static inline bool is_same_vma_anon_name(struct vm_area_struct *vma, - const char *name) -{ - return true; -} -#endif /* CONFIG_ANON_VMA_NAME */ - #endif /* _LINUX_MM_TYPES_H */ --- a/kernel/fork.c~mm-move-anon_vma-declarations-to-linux-mm_inlineh +++ a/kernel/fork.c @@ -42,6 +42,7 @@ #include #include #include +#include #include #include #include --- a/mm/madvise.c~mm-move-anon_vma-declarations-to-linux-mm_inlineh +++ a/mm/madvise.c @@ -18,6 +18,7 @@ #include #include #include +#include #include #include #include --- a/mm/mmap.c~mm-move-anon_vma-declarations-to-linux-mm_inlineh +++ a/mm/mmap.c @@ -13,6 +13,7 @@ #include #include #include +#include #include #include #include From patchwork Fri Jan 14 22:06:10 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew 
Morton X-Patchwork-Id: 12714086 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id A0106C433EF for ; Fri, 14 Jan 2022 22:06:15 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 35F596B00F9; Fri, 14 Jan 2022 17:06:15 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 336D16B00FB; Fri, 14 Jan 2022 17:06:15 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1D79E6B00FC; Fri, 14 Jan 2022 17:06:15 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0032.hostedemail.com [216.40.44.32]) by kanga.kvack.org (Postfix) with ESMTP id 091386B00F9 for ; Fri, 14 Jan 2022 17:06:15 -0500 (EST) Received: from smtpin15.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id BCF8B8F174 for ; Fri, 14 Jan 2022 22:06:14 +0000 (UTC) X-FDA: 79030276668.15.212B38D Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf24.hostedemail.com (Postfix) with ESMTP id 2FEEA180009 for ; Fri, 14 Jan 2022 22:06:14 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id F00A1B8262E; Fri, 14 Jan 2022 22:06:12 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3C657C36AEC; Fri, 14 Jan 2022 22:06:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197971; bh=i/zyjDCm+j2/jhhg4CYtS0oYAgIxBsB9qXS7IlyLVwI=; h=Date:From:To:Subject:In-Reply-To:From; b=alBljQxD2xfIy/nUTTGSyZfyrPwOHYEkqJvGdFcZctnq2qRNN/Fc4dfWMQclZhumw OCjWUIIslbWQJXlTNxUlAYsiGjO2aCqEJ1snwRddmghD5sBkotpH8atHmnVEsUg7gK 
loyEJJEkdwiicaq6cV2jVJWlvlXQkmeZ/AeZvmVo= Date: Fri, 14 Jan 2022 14:06:10 -0800 From: Andrew Morton To: akpm@linux-foundation.org, arnd@arndb.de, ccross@google.com, ebiederm@xmission.com, keescook@chromium.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, peterz@infradead.org, sfr@canb.auug.org.au, surenb@google.com, torvalds@linux-foundation.org, vbabka@suse.cz, viro@zeniv.linux.org.uk, willy@infradead.org, yuzhao@google.com Subject: [patch 060/146] mm: move tlb_flush_pending inline helpers to mm_inline.h Message-ID: <20220114220610.523fehBV4%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 2FEEA180009 X-Stat-Signature: 94137bhk6zowiewk1dody7845qgofbda Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=alBljQxD; spf=pass (imf24.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642197974-946603 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Arnd Bergmann Subject: mm: move tlb_flush_pending inline helpers to mm_inline.h linux/mm_types.h should only contain structure definitions, to make it cheap to include elsewhere. The atomic_t helper function definitions are particularly large, so it's better to move the helpers using those into the existing linux/mm_inline.h and only include that where needed. As a follow-up, we may want to go through all the indirect includes in mm_types.h and reduce them as much as possible.
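For readers who have not looked at the helpers being moved, the counter they manipulate can be modelled in user space. This is an illustration only, not kernel code: struct mm_model and the model_* names are hypothetical stand-ins, C11 atomics replace the kernel's atomic_t, and none of the PTL or TLB-flush ordering that the real helpers depend on is reproduced.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the single mm_struct field the helpers touch. */
struct mm_model {
	atomic_int tlb_flush_pending;
};

static inline void model_init_tlb_flush_pending(struct mm_model *mm)
{
	atomic_init(&mm->tlb_flush_pending, 0);
}

static inline void model_inc_tlb_flush_pending(struct mm_model *mm)
{
	atomic_fetch_add(&mm->tlb_flush_pending, 1);
}

static inline void model_dec_tlb_flush_pending(struct mm_model *mm)
{
	atomic_fetch_sub(&mm->tlb_flush_pending, 1);
}

/* True while at least one flush has been announced and not yet completed. */
static inline bool model_mm_tlb_flush_pending(struct mm_model *mm)
{
	return atomic_load(&mm->tlb_flush_pending) != 0;
}

/* True only when more than one flush is in flight at the same time. */
static inline bool model_mm_tlb_flush_nested(struct mm_model *mm)
{
	return atomic_load(&mm->tlb_flush_pending) > 1;
}

/* Walks the counter through two overlapping flushes; returns 0 on success. */
int model_tlb_flush_demo(void)
{
	struct mm_model mm;

	model_init_tlb_flush_pending(&mm);
	assert(!model_mm_tlb_flush_pending(&mm));

	model_inc_tlb_flush_pending(&mm);	/* first flusher starts */
	assert(model_mm_tlb_flush_pending(&mm));
	assert(!model_mm_tlb_flush_nested(&mm));

	model_inc_tlb_flush_pending(&mm);	/* second, overlapping flusher */
	assert(model_mm_tlb_flush_nested(&mm));

	model_dec_tlb_flush_pending(&mm);
	model_dec_tlb_flush_pending(&mm);
	assert(!model_mm_tlb_flush_pending(&mm));
	return 0;
}
```

Because these helpers need the full atomic operations rather than just the type, keeping them out of linux/mm_types.h is what lets that header stay limited to structure definitions.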
Link: https://lkml.kernel.org/r/20211207125710.2503446-2-arnd@kernel.org Signed-off-by: Arnd Bergmann Cc: Al Viro Cc: Stephen Rothwell Cc: Suren Baghdasaryan Cc: Colin Cross Cc: Kees Cook Cc: Peter Xu Cc: Peter Zijlstra (Intel) Cc: Yu Zhao Cc: Vlastimil Babka Cc: Matthew Wilcox (Oracle) Cc: Eric Biederman Signed-off-by: Andrew Morton --- arch/x86/include/asm/pgtable.h | 2 include/linux/mm.h | 45 ---------- include/linux/mm_inline.h | 86 ++++++++++++++++++++ include/linux/mm_types.h | 129 ++++++++++--------------------- mm/ksm.c | 1 mm/mapping_dirty_helpers.c | 1 mm/memory.c | 1 mm/mmu_gather.c | 1 mm/pgtable-generic.c | 1 9 files changed, 137 insertions(+), 130 deletions(-) --- a/arch/x86/include/asm/pgtable.h~mm-move-tlb_flush_pending-inline-helpers-to-mm_inlineh +++ a/arch/x86/include/asm/pgtable.h @@ -752,7 +752,7 @@ static inline bool pte_accessible(struct return true; if ((pte_flags(a) & _PAGE_PROTNONE) && - mm_tlb_flush_pending(mm)) + atomic_read(&mm->tlb_flush_pending)) return true; return false; --- a/include/linux/mm.h~mm-move-tlb_flush_pending-inline-helpers-to-mm_inlineh +++ a/include/linux/mm.h @@ -424,51 +424,6 @@ extern unsigned int kobjsize(const void */ extern pgprot_t protection_map[16]; -/** - * enum fault_flag - Fault flag definitions. - * @FAULT_FLAG_WRITE: Fault was a write fault. - * @FAULT_FLAG_MKWRITE: Fault was mkwrite of existing PTE. - * @FAULT_FLAG_ALLOW_RETRY: Allow to retry the fault if blocked. - * @FAULT_FLAG_RETRY_NOWAIT: Don't drop mmap_lock and wait when retrying. - * @FAULT_FLAG_KILLABLE: The fault task is in SIGKILL killable region. - * @FAULT_FLAG_TRIED: The fault has been tried once. - * @FAULT_FLAG_USER: The fault originated in userspace. - * @FAULT_FLAG_REMOTE: The fault is not for current task/mm. - * @FAULT_FLAG_INSTRUCTION: The fault was during an instruction fetch. - * @FAULT_FLAG_INTERRUPTIBLE: The fault can be interrupted by non-fatal signals. 
- * - * About @FAULT_FLAG_ALLOW_RETRY and @FAULT_FLAG_TRIED: we can specify - * whether we would allow page faults to retry by specifying these two - * fault flags correctly. Currently there can be three legal combinations: - * - * (a) ALLOW_RETRY and !TRIED: this means the page fault allows retry, and - * this is the first try - * - * (b) ALLOW_RETRY and TRIED: this means the page fault allows retry, and - * we've already tried at least once - * - * (c) !ALLOW_RETRY and !TRIED: this means the page fault does not allow retry - * - * The unlisted combination (!ALLOW_RETRY && TRIED) is illegal and should never - * be used. Note that page faults can be allowed to retry for multiple times, - * in which case we'll have an initial fault with flags (a) then later on - * continuous faults with flags (b). We should always try to detect pending - * signals before a retry to make sure the continuous page faults can still be - * interrupted if necessary. - */ -enum fault_flag { - FAULT_FLAG_WRITE = 1 << 0, - FAULT_FLAG_MKWRITE = 1 << 1, - FAULT_FLAG_ALLOW_RETRY = 1 << 2, - FAULT_FLAG_RETRY_NOWAIT = 1 << 3, - FAULT_FLAG_KILLABLE = 1 << 4, - FAULT_FLAG_TRIED = 1 << 5, - FAULT_FLAG_USER = 1 << 6, - FAULT_FLAG_REMOTE = 1 << 7, - FAULT_FLAG_INSTRUCTION = 1 << 8, - FAULT_FLAG_INTERRUPTIBLE = 1 << 9, -}; - /* * The default fault flags that should be used by most of the * arch-specific page fault handlers. 
--- a/include/linux/mm_inline.h~mm-move-tlb_flush_pending-inline-helpers-to-mm_inlineh +++ a/include/linux/mm_inline.h @@ -2,6 +2,7 @@ #ifndef LINUX_MM_INLINE_H #define LINUX_MM_INLINE_H +#include #include #include #include @@ -185,4 +186,89 @@ static inline bool is_same_vma_anon_name } #endif /* CONFIG_ANON_VMA_NAME */ +static inline void init_tlb_flush_pending(struct mm_struct *mm) +{ + atomic_set(&mm->tlb_flush_pending, 0); +} + +static inline void inc_tlb_flush_pending(struct mm_struct *mm) +{ + atomic_inc(&mm->tlb_flush_pending); + /* + * The only time this value is relevant is when there are indeed pages + * to flush. And we'll only flush pages after changing them, which + * requires the PTL. + * + * So the ordering here is: + * + * atomic_inc(&mm->tlb_flush_pending); + * spin_lock(&ptl); + * ... + * set_pte_at(); + * spin_unlock(&ptl); + * + * spin_lock(&ptl) + * mm_tlb_flush_pending(); + * .... + * spin_unlock(&ptl); + * + * flush_tlb_range(); + * atomic_dec(&mm->tlb_flush_pending); + * + * Where the increment if constrained by the PTL unlock, it thus + * ensures that the increment is visible if the PTE modification is + * visible. After all, if there is no PTE modification, nobody cares + * about TLB flushes either. + * + * This very much relies on users (mm_tlb_flush_pending() and + * mm_tlb_flush_nested()) only caring about _specific_ PTEs (and + * therefore specific PTLs), because with SPLIT_PTE_PTLOCKS and RCpc + * locks (PPC) the unlock of one doesn't order against the lock of + * another PTL. + * + * The decrement is ordered by the flush_tlb_range(), such that + * mm_tlb_flush_pending() will not return false unless all flushes have + * completed. + */ +} + +static inline void dec_tlb_flush_pending(struct mm_struct *mm) +{ + /* + * See inc_tlb_flush_pending(). + * + * This cannot be smp_mb__before_atomic() because smp_mb() simply does + * not order against TLB invalidate completion, which is what we need. 
+ * + * Therefore we must rely on tlb_flush_*() to guarantee order. + */ + atomic_dec(&mm->tlb_flush_pending); +} + +static inline bool mm_tlb_flush_pending(struct mm_struct *mm) +{ + /* + * Must be called after having acquired the PTL; orders against that + * PTLs release and therefore ensures that if we observe the modified + * PTE we must also observe the increment from inc_tlb_flush_pending(). + * + * That is, it only guarantees to return true if there is a flush + * pending for _this_ PTL. + */ + return atomic_read(&mm->tlb_flush_pending); +} + +static inline bool mm_tlb_flush_nested(struct mm_struct *mm) +{ + /* + * Similar to mm_tlb_flush_pending(), we must have acquired the PTL + * for which there is a TLB flush pending in order to guarantee + * we've seen both that PTE modification and the increment. + * + * (no requirement on actually still holding the PTL, that is irrelevant) + */ + return atomic_read(&mm->tlb_flush_pending) > 1; +} + + #endif --- a/include/linux/mm_types.h~mm-move-tlb_flush_pending-inline-helpers-to-mm_inlineh +++ a/include/linux/mm_types.h @@ -692,90 +692,6 @@ extern void tlb_gather_mmu(struct mmu_ga extern void tlb_gather_mmu_fullmm(struct mmu_gather *tlb, struct mm_struct *mm); extern void tlb_finish_mmu(struct mmu_gather *tlb); -static inline void init_tlb_flush_pending(struct mm_struct *mm) -{ - atomic_set(&mm->tlb_flush_pending, 0); -} - -static inline void inc_tlb_flush_pending(struct mm_struct *mm) -{ - atomic_inc(&mm->tlb_flush_pending); - /* - * The only time this value is relevant is when there are indeed pages - * to flush. And we'll only flush pages after changing them, which - * requires the PTL. - * - * So the ordering here is: - * - * atomic_inc(&mm->tlb_flush_pending); - * spin_lock(&ptl); - * ... - * set_pte_at(); - * spin_unlock(&ptl); - * - * spin_lock(&ptl) - * mm_tlb_flush_pending(); - * .... 
- * spin_unlock(&ptl); - * - * flush_tlb_range(); - * atomic_dec(&mm->tlb_flush_pending); - * - * Where the increment if constrained by the PTL unlock, it thus - * ensures that the increment is visible if the PTE modification is - * visible. After all, if there is no PTE modification, nobody cares - * about TLB flushes either. - * - * This very much relies on users (mm_tlb_flush_pending() and - * mm_tlb_flush_nested()) only caring about _specific_ PTEs (and - * therefore specific PTLs), because with SPLIT_PTE_PTLOCKS and RCpc - * locks (PPC) the unlock of one doesn't order against the lock of - * another PTL. - * - * The decrement is ordered by the flush_tlb_range(), such that - * mm_tlb_flush_pending() will not return false unless all flushes have - * completed. - */ -} - -static inline void dec_tlb_flush_pending(struct mm_struct *mm) -{ - /* - * See inc_tlb_flush_pending(). - * - * This cannot be smp_mb__before_atomic() because smp_mb() simply does - * not order against TLB invalidate completion, which is what we need. - * - * Therefore we must rely on tlb_flush_*() to guarantee order. - */ - atomic_dec(&mm->tlb_flush_pending); -} - -static inline bool mm_tlb_flush_pending(struct mm_struct *mm) -{ - /* - * Must be called after having acquired the PTL; orders against that - * PTLs release and therefore ensures that if we observe the modified - * PTE we must also observe the increment from inc_tlb_flush_pending(). - * - * That is, it only guarantees to return true if there is a flush - * pending for _this_ PTL. - */ - return atomic_read(&mm->tlb_flush_pending); -} - -static inline bool mm_tlb_flush_nested(struct mm_struct *mm) -{ - /* - * Similar to mm_tlb_flush_pending(), we must have acquired the PTL - * for which there is a TLB flush pending in order to guarantee - * we've seen both that PTE modification and the increment. 
- * - * (no requirement on actually still holding the PTL, that is irrelevant) - */ - return atomic_read(&mm->tlb_flush_pending) > 1; -} - struct vm_fault; /** @@ -890,4 +806,49 @@ typedef struct { unsigned long val; } swp_entry_t; +/** + * enum fault_flag - Fault flag definitions. + * @FAULT_FLAG_WRITE: Fault was a write fault. + * @FAULT_FLAG_MKWRITE: Fault was mkwrite of existing PTE. + * @FAULT_FLAG_ALLOW_RETRY: Allow to retry the fault if blocked. + * @FAULT_FLAG_RETRY_NOWAIT: Don't drop mmap_lock and wait when retrying. + * @FAULT_FLAG_KILLABLE: The fault task is in SIGKILL killable region. + * @FAULT_FLAG_TRIED: The fault has been tried once. + * @FAULT_FLAG_USER: The fault originated in userspace. + * @FAULT_FLAG_REMOTE: The fault is not for current task/mm. + * @FAULT_FLAG_INSTRUCTION: The fault was during an instruction fetch. + * @FAULT_FLAG_INTERRUPTIBLE: The fault can be interrupted by non-fatal signals. + * + * About @FAULT_FLAG_ALLOW_RETRY and @FAULT_FLAG_TRIED: we can specify + * whether we would allow page faults to retry by specifying these two + * fault flags correctly. Currently there can be three legal combinations: + * + * (a) ALLOW_RETRY and !TRIED: this means the page fault allows retry, and + * this is the first try + * + * (b) ALLOW_RETRY and TRIED: this means the page fault allows retry, and + * we've already tried at least once + * + * (c) !ALLOW_RETRY and !TRIED: this means the page fault does not allow retry + * + * The unlisted combination (!ALLOW_RETRY && TRIED) is illegal and should never + * be used. Note that page faults can be allowed to retry for multiple times, + * in which case we'll have an initial fault with flags (a) then later on + * continuous faults with flags (b). We should always try to detect pending + * signals before a retry to make sure the continuous page faults can still be + * interrupted if necessary. 
+ */ +enum fault_flag { + FAULT_FLAG_WRITE = 1 << 0, + FAULT_FLAG_MKWRITE = 1 << 1, + FAULT_FLAG_ALLOW_RETRY = 1 << 2, + FAULT_FLAG_RETRY_NOWAIT = 1 << 3, + FAULT_FLAG_KILLABLE = 1 << 4, + FAULT_FLAG_TRIED = 1 << 5, + FAULT_FLAG_USER = 1 << 6, + FAULT_FLAG_REMOTE = 1 << 7, + FAULT_FLAG_INSTRUCTION = 1 << 8, + FAULT_FLAG_INTERRUPTIBLE = 1 << 9, +}; + #endif /* _LINUX_MM_TYPES_H */ --- a/mm/ksm.c~mm-move-tlb_flush_pending-inline-helpers-to-mm_inlineh +++ a/mm/ksm.c @@ -15,6 +15,7 @@ #include #include +#include #include #include #include --- a/mm/mapping_dirty_helpers.c~mm-move-tlb_flush_pending-inline-helpers-to-mm_inlineh +++ a/mm/mapping_dirty_helpers.c @@ -3,6 +3,7 @@ #include #include #include +#include #include #include --- a/mm/memory.c~mm-move-tlb_flush_pending-inline-helpers-to-mm_inlineh +++ a/mm/memory.c @@ -41,6 +41,7 @@ #include #include +#include #include #include #include --- a/mm/mmu_gather.c~mm-move-tlb_flush_pending-inline-helpers-to-mm_inlineh +++ a/mm/mmu_gather.c @@ -3,6 +3,7 @@ #include #include #include +#include #include #include #include --- a/mm/pgtable-generic.c~mm-move-tlb_flush_pending-inline-helpers-to-mm_inlineh +++ a/mm/pgtable-generic.c @@ -10,6 +10,7 @@ #include #include #include +#include #include /* From patchwork Fri Jan 14 22:06:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714087 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8DE84C433F5 for ; Fri, 14 Jan 2022 22:06:19 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1B2686B00FB; Fri, 14 Jan 2022 17:06:19 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 161E16B00FD; Fri, 14 Jan 2022 17:06:19 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by 
kanga.kvack.org (Postfix, from userid 63042) id 02A6D6B00FE; Fri, 14 Jan 2022 17:06:18 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0142.hostedemail.com [216.40.44.142]) by kanga.kvack.org (Postfix) with ESMTP id E3AE26B00FB for ; Fri, 14 Jan 2022 17:06:18 -0500 (EST) Received: from smtpin19.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id A5B1193F28 for ; Fri, 14 Jan 2022 22:06:18 +0000 (UTC) X-FDA: 79030276836.19.D82A4E1 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf06.hostedemail.com (Postfix) with ESMTP id 13A5018000A for ; Fri, 14 Jan 2022 22:06:17 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 15A27B82A39; Fri, 14 Jan 2022 22:06:17 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E345EC36AE9; Fri, 14 Jan 2022 22:06:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197975; bh=tF3xPSaf8EMG71ALylYwTHCZQ/17Lh6u4Lgb4gyKR88=; h=Date:From:To:Subject:In-Reply-To:From; b=eu+j83cFIt00ddEhdz7IqzY4bUZYQPxy3kXzCFq6yW2rpjaNw8Q04WQroHX04BPcz P7rN887bqAgGhl0vz6exr680p/ce8Z/AIYtkr7K+RtA4zisugz9UKsouCzbvlqcmGV A32RoPZhoskF2EZuSFaHEoZhCTdpFrSuDO4DG42c= Date: Fri, 14 Jan 2022 14:06:14 -0800 From: Andrew Morton To: aarcange@redhat.com, akpm@linux-foundation.org, christian.brauner@ubuntu.com, christian@brauner.io, david@redhat.com, fweimer@redhat.com, guro@fb.com, hannes@cmpxchg.org, hch@infradead.org, jannh@google.com, jengelh@inai.de, jgg@nvidia.com, kirill@shutemov.name, linux-mm@kvack.org, luto@kernel.org, mhocko@suse.com, minchan@kernel.org, mm-commits@vger.kernel.org, oleg@redhat.com, riel@surriel.com, rientjes@google.com, shakeelb@google.com, surenb@google.com, 
timmurray@google.com, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 061/146] mm: protect free_pgtables with mmap_lock write lock in exit_mmap Message-ID: <20220114220614.9QUmVM-rY%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam11 X-Rspamd-Queue-Id: 13A5018000A X-Stat-Signature: oey469h3z518gznntn51u7q734ef1j4g Authentication-Results: imf06.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=eu+j83cF; dmarc=none; spf=pass (imf06.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1642197977-389651 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Suren Baghdasaryan Subject: mm: protect free_pgtables with mmap_lock write lock in exit_mmap oom-reaper and process_mrelease system call should protect against races with exit_mmap which can destroy page tables while they walk the VMA tree. oom-reaper protects from that race by setting MMF_OOM_VICTIM and by relying on exit_mmap to set MMF_OOM_SKIP before taking and releasing mmap_write_lock. process_mrelease has to elevate mm->mm_users to prevent such race. Both oom-reaper and process_mrelease hold mmap_read_lock when walking the VMA tree. The locking rules and mechanisms could be simpler if exit_mmap takes mmap_write_lock while executing destructive operations such as free_pgtables. Change exit_mmap to hold the mmap_write_lock when calling unlock_range, free_pgtables and remove_vma. Note also that because oom-reaper checks VM_LOCKED flag, unlock_range() should not be allowed to race with it. 
Before this patch, remove_vma used to be called with no locks held, however with fput being executed asynchronously and vm_ops->close not being allowed to hold mmap_lock (it is called from __split_vma with mmap_sem held for write), changing that should be fine. In most cases this lock should be uncontended. Previously, Kirill reported ~4% regression caused by a similar change [1]. We reran the same test and although the individual results are quite noisy, the percentiles show lower regression with 1.6% being the worst case [2]. The change allows oom-reaper and process_mrelease to execute safely under mmap_read_lock without worries that exit_mmap might destroy page tables from under them. [1] https://lore.kernel.org/all/20170725141723.ivukwhddk2voyhuc@node.shutemov.name/ [2] https://lore.kernel.org/all/CAJuCfpGC9-c9P40x7oy=jy5SphMcd0o0G_6U1-+JAziGKG6dGA@mail.gmail.com/ Link: https://lkml.kernel.org/r/20211209191325.3069345-1-surenb@google.com Signed-off-by: Suren Baghdasaryan Acked-by: Michal Hocko Cc: David Rientjes Cc: Matthew Wilcox Cc: Johannes Weiner Cc: Roman Gushchin Cc: Rik van Riel Cc: Minchan Kim Cc: Kirill A. Shutemov Cc: Andrea Arcangeli Cc: Christian Brauner Cc: Christoph Hellwig Cc: Oleg Nesterov Cc: David Hildenbrand Cc: Jann Horn Cc: Shakeel Butt Cc: Andy Lutomirski Cc: Christian Brauner Cc: Florian Weimer Cc: Jan Engelhardt Cc: Tim Murray Cc: Jason Gunthorpe Signed-off-by: Andrew Morton --- mm/mmap.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) --- a/mm/mmap.c~mm-protect-free_pgtables-with-mmap_lock-write-lock-in-exit_mmap +++ a/mm/mmap.c @@ -3149,25 +3149,27 @@ void exit_mmap(struct mm_struct *mm) * to mmu_notifier_release(mm) ensures mmu notifier callbacks in * __oom_reap_task_mm() will not block. * - * This needs to be done before calling munlock_vma_pages_all(), + * This needs to be done before calling unlock_range(), * which clears VM_LOCKED, otherwise the oom reaper cannot * reliably test it. 
*/ (void)__oom_reap_task_mm(mm); set_bit(MMF_OOM_SKIP, &mm->flags); - mmap_write_lock(mm); - mmap_write_unlock(mm); } + mmap_write_lock(mm); if (mm->locked_vm) unlock_range(mm->mmap, ULONG_MAX); arch_exit_mmap(mm); vma = mm->mmap; - if (!vma) /* Can happen if dup_mmap() received an OOM */ + if (!vma) { + /* Can happen if dup_mmap() received an OOM */ + mmap_write_unlock(mm); return; + } lru_add_drain(); flush_cache_mm(mm); @@ -3178,16 +3180,14 @@ void exit_mmap(struct mm_struct *mm) free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING); tlb_finish_mmu(&tlb); - /* - * Walk the list again, actually closing and freeing it, - * with preemption enabled, without holding any MM locks. - */ + /* Walk the list again, actually closing and freeing it. */ while (vma) { if (vma->vm_flags & VM_ACCOUNT) nr_accounted += vma_pages(vma); vma = remove_vma(vma); cond_resched(); } + mmap_write_unlock(mm); vm_unacct_memory(nr_accounted); } From patchwork Fri Jan 14 22:06:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714088 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 20FFCC4332F for ; Fri, 14 Jan 2022 22:06:22 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A648A6B00FD; Fri, 14 Jan 2022 17:06:21 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id A14916B00FF; Fri, 14 Jan 2022 17:06:21 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8DC616B0100; Fri, 14 Jan 2022 17:06:21 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0207.hostedemail.com [216.40.44.207]) by kanga.kvack.org (Postfix) with ESMTP id 7D4886B00FD for ; Fri, 14 Jan 2022 17:06:21 -0500 
(EST) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 476FC181D6051 for ; Fri, 14 Jan 2022 22:06:21 +0000 (UTC) X-FDA: 79030276962.18.B57A0C2 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf08.hostedemail.com (Postfix) with ESMTP id EF4AB160004 for ; Fri, 14 Jan 2022 22:06:20 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 4C3936200F; Fri, 14 Jan 2022 22:06:20 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D5F25C36AE5; Fri, 14 Jan 2022 22:06:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197979; bh=pux9fU/4421rHpB9PuKYNriuo1AkKdCq7YCEgbrVSUA=; h=Date:From:To:Subject:In-Reply-To:From; b=hxfP21H+1qKYdjfo5Uw+x4FsYc1ZXhxxMp2bYQLXU27mQwxUxHUOiGAFSYk2Aqjwb v22Ay++0MTvdyWpBNnep+QVfW2twom8VzrCEZxYjqYkIQ4wiKbrUVoAhyPaOG5lmcN pYUUhlgdMU11h8mSrE1FBikda4waFR32fH2n88VY= Date: Fri, 14 Jan 2022 14:06:18 -0800 From: Andrew Morton To: aarcange@redhat.com, akpm@linux-foundation.org, christian.brauner@ubuntu.com, christian@brauner.io, david@redhat.com, fweimer@redhat.com, guro@fb.com, hannes@cmpxchg.org, hch@infradead.org, jannh@google.com, jengelh@inai.de, jgg@nvidia.com, kirill@shutemov.name, linux-mm@kvack.org, luto@kernel.org, mhocko@suse.com, minchan@kernel.org, mm-commits@vger.kernel.org, oleg@redhat.com, riel@surriel.com, rientjes@google.com, shakeelb@google.com, surenb@google.com, timmurray@google.com, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 062/146] mm: document locking restrictions for vm_operations_struct::close Message-ID: <20220114220618.Hd0iXILBd%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail 
v14.8.16 X-Rspamd-Queue-Id: EF4AB160004 X-Stat-Signature: ei313bxgeuru7qjrtb8j4gts7kmkch3b Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=hxfP21H+; dmarc=none; spf=pass (imf08.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam03 X-HE-Tag: 1642197980-443935 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Suren Baghdasaryan Subject: mm: document locking restrictions for vm_operations_struct::close Add comments for vm_operations_struct::close documenting locking requirements for this callback and its callers. Link: https://lkml.kernel.org/r/20211209191325.3069345-2-surenb@google.com Signed-off-by: Suren Baghdasaryan Acked-by: Michal Hocko Cc: Andrea Arcangeli Cc: Andy Lutomirski Cc: Christian Brauner Cc: Christian Brauner Cc: Christoph Hellwig Cc: David Hildenbrand Cc: David Rientjes Cc: Florian Weimer Cc: Jan Engelhardt Cc: Jann Horn Cc: Johannes Weiner Cc: Kirill A. Shutemov Cc: Matthew Wilcox Cc: Minchan Kim Cc: Oleg Nesterov Cc: Rik van Riel Cc: Roman Gushchin Cc: Shakeel Butt Cc: Tim Murray Cc: Jason Gunthorpe Signed-off-by: Andrew Morton --- include/linux/mm.h | 4 ++++ 1 file changed, 4 insertions(+) --- a/include/linux/mm.h~mm-document-locking-restrictions-for-vm_operations_struct-close +++ a/include/linux/mm.h @@ -532,6 +532,10 @@ enum page_entry_size { */ struct vm_operations_struct { void (*open)(struct vm_area_struct * area); + /** + * @close: Called when the VMA is being removed from the MM. + * Context: User context. May sleep. Caller holds mmap_lock. 
+ */ void (*close)(struct vm_area_struct * area); /* Called any time before splitting to check if it's allowed */ int (*may_split)(struct vm_area_struct *area, unsigned long addr); From patchwork Fri Jan 14 22:06:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714089 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3BF8AC433EF for ; Fri, 14 Jan 2022 22:06:27 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C118D6B00FF; Fri, 14 Jan 2022 17:06:26 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id BC16A6B0101; Fri, 14 Jan 2022 17:06:26 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AB0456B0102; Fri, 14 Jan 2022 17:06:26 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0007.hostedemail.com [216.40.44.7]) by kanga.kvack.org (Postfix) with ESMTP id 9980D6B00FF for ; Fri, 14 Jan 2022 17:06:26 -0500 (EST) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 5A11D93F28 for ; Fri, 14 Jan 2022 22:06:26 +0000 (UTC) X-FDA: 79030277172.12.1B7B1C4 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf21.hostedemail.com (Postfix) with ESMTP id CC0C01C0011 for ; Fri, 14 Jan 2022 22:06:25 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id D1EF4B82A3A; Fri, 14 Jan 2022 22:06:24 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id CA61EC36AE5; Fri, 14 Jan 2022 
22:06:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197983; bh=zoZ7pcOfeEEFLoaKWpM6u/ApyYu+Z5ckSOjULxdBdM8=; h=Date:From:To:Subject:In-Reply-To:From; b=cNaGWYAC1Oq8o71fInJXNwmpIxpLbFJRAaDm4VUpnVoDTLBK9Yh/PhxUqFdAYPWKK MYnV75H7YNQl2g6tOyR+k073s3UwTtvlnc9sASezB6N5+kjwhCJG8HDWesQSBDTpDe TgUzGsAxN6RaALdWjneVyZ6+4mbkS2T87q2a7Sx4= Date: Fri, 14 Jan 2022 14:06:22 -0800 From: Andrew Morton To: aarcange@redhat.com, akpm@linux-foundation.org, christian.brauner@ubuntu.com, christian@brauner.io, david@redhat.com, fweimer@redhat.com, guro@fb.com, hannes@cmpxchg.org, hch@infradead.org, jannh@google.com, jengelh@inai.de, jgg@nvidia.com, kirill@shutemov.name, linux-mm@kvack.org, luto@kernel.org, mhocko@suse.com, minchan@kernel.org, mm-commits@vger.kernel.org, oleg@redhat.com, riel@surriel.com, rientjes@google.com, shakeelb@google.com, surenb@google.com, timmurray@google.com, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 063/146] mm/oom_kill: allow process_mrelease to run under mmap_lock protection Message-ID: <20220114220622.9SK_j2BKF%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: CC0C01C0011 X-Stat-Signature: ydecdh8sqzheaao3wqpyayjajuzadeur Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=cNaGWYAC; spf=pass (imf21.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642197985-246312 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Suren Baghdasaryan Subject: mm/oom_kill: allow process_mrelease to run under mmap_lock protection With exit_mmap holding mmap_write_lock during free_pgtables call, 
process_mrelease does not need to elevate mm->mm_users in order to prevent exit_mmap from destroying page tables while __oom_reap_task_mm is walking the VMA tree. The change prevents process_mrelease from calling the last mmput, which can lead to waiting for IO completion in exit_aio. Link: https://lkml.kernel.org/r/20211209191325.3069345-3-surenb@google.com Signed-off-by: Suren Baghdasaryan Acked-by: Michal Hocko Reviewed-by: Jason Gunthorpe Cc: Andrea Arcangeli Cc: Andy Lutomirski Cc: Christian Brauner Cc: Christian Brauner Cc: Christoph Hellwig Cc: David Hildenbrand Cc: David Rientjes Cc: Florian Weimer Cc: Jan Engelhardt Cc: Jann Horn Cc: Johannes Weiner Cc: Kirill A. Shutemov Cc: Matthew Wilcox Cc: Minchan Kim Cc: Oleg Nesterov Cc: Rik van Riel Cc: Roman Gushchin Cc: Shakeel Butt Cc: Tim Murray Signed-off-by: Andrew Morton --- mm/oom_kill.c | 27 +++++++++++++++------------ 1 file changed, 15 insertions(+), 12 deletions(-) --- a/mm/oom_kill.c~mm-oom_kill-allow-process_mrelease-to-run-under-mmap_lock-protection +++ a/mm/oom_kill.c @@ -1170,15 +1170,15 @@ SYSCALL_DEFINE2(process_mrelease, int, p goto put_task; } - if (mmget_not_zero(p->mm)) { - mm = p->mm; - if (task_will_free_mem(p)) - reap = true; - else { - /* Error only if the work has not been done already */ - if (!test_bit(MMF_OOM_SKIP, &mm->flags)) - ret = -EINVAL; - } + mm = p->mm; + mmgrab(mm); + + if (task_will_free_mem(p)) + reap = true; + else { + /* Error only if the work has not been done already */ + if (!test_bit(MMF_OOM_SKIP, &mm->flags)) + ret = -EINVAL; } task_unlock(p); @@ -1189,13 +1189,16 @@ SYSCALL_DEFINE2(process_mrelease, int, p ret = -EINTR; goto drop_mm; } - if (!__oom_reap_task_mm(mm)) + /* + * Check MMF_OOM_SKIP again under mmap_read_lock protection to ensure + * possible change in exit_mmap is seen + */ + if (!test_bit(MMF_OOM_SKIP, &mm->flags) && !__oom_reap_task_mm(mm)) ret = -EAGAIN; mmap_read_unlock(mm); drop_mm: - if (mm) - mmput(mm); + mmdrop(mm); put_task: 
put_task_struct(task); return ret; From patchwork Fri Jan 14 22:06:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714090 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D3C9AC433EF for ; Fri, 14 Jan 2022 22:06:30 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 61FF36B0101; Fri, 14 Jan 2022 17:06:30 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 5CE3B6B0103; Fri, 14 Jan 2022 17:06:30 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 46FF06B0104; Fri, 14 Jan 2022 17:06:30 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0034.hostedemail.com [216.40.44.34]) by kanga.kvack.org (Postfix) with ESMTP id 350EF6B0101 for ; Fri, 14 Jan 2022 17:06:30 -0500 (EST) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 0436A182693EE for ; Fri, 14 Jan 2022 22:06:30 +0000 (UTC) X-FDA: 79030277340.24.3C058F8 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf15.hostedemail.com (Postfix) with ESMTP id EA60AA0004 for ; Fri, 14 Jan 2022 22:06:28 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id D3654B8260F; Fri, 14 Jan 2022 22:06:27 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 744FCC36AE5; Fri, 14 Jan 2022 22:06:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197986; 
bh=SL4aTYRo5Jhc2WweavqW9HTFwLTXylwg+N1li/TFi8Y=; h=Date:From:To:Subject:In-Reply-To:From; b=kVOVEEPx6OOYPcXTHjJSM0xiEnIFSOTqxZsEJcNt427EEQOxsurVfz+WnC5o9JtPn 8MZzkbSP3X1aSqgHXToSx/RsF5LfwjHAFjzWCspkpOheD1c8jQrjUfV50vzyLMQWZ7 vPXi4ljbKBsGyB8Yj66bkF4GHfw9WoVJFLQ7mBus= Date: Fri, 14 Jan 2022 14:06:26 -0800 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, linux-mm@kvack.org, luto@kernel.org, mm-commits@vger.kernel.org, skhan@linuxfoundation.org, torvalds@linux-foundation.org Subject: [patch 064/146] docs/vm: add vmalloced-kernel-stacks document Message-ID: <20220114220626.RQe8Ln1aD%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: EA60AA0004 X-Stat-Signature: 7pzu76wcfqihqew8syo7nz74oh9uekyp Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=kVOVEEPx; spf=pass (imf15.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642197988-54916 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Shuah Khan Subject: docs/vm: add vmalloced-kernel-stacks document Add a new document to explain Virtually Mapped Kernel Stack Support. This is a compilation of information from the code and original patch series that introduced the Virtually Mapped Kernel Stacks feature. This document summarizes the feature and provides details on allocation, free, and stack overflow handling. Provides reference to available tests. 
Link: https://lkml.kernel.org/r/20211215002004.47981-1-skhan@linuxfoundation.org Signed-off-by: Shuah Khan Cc: Jonathan Corbet Cc: Andy Lutomirski Signed-off-by: Andrew Morton --- Documentation/vm/index.rst | 1 Documentation/vm/vmalloced-kernel-stacks.rst | 153 +++++++++++++++++ 2 files changed, 154 insertions(+) --- a/Documentation/vm/index.rst~docs-vm-add-vmalloced-kernel-stacks-document +++ a/Documentation/vm/index.rst @@ -36,5 +36,6 @@ algorithms. If you are looking for advi split_page_table_lock transhuge unevictable-lru + vmalloced-kernel-stacks z3fold zsmalloc --- /dev/null +++ a/Documentation/vm/vmalloced-kernel-stacks.rst @@ -0,0 +1,153 @@ +.. SPDX-License-Identifier: GPL-2.0 + +===================================== +Virtually Mapped Kernel Stack Support +===================================== + +:Author: Shuah Khan + +.. contents:: :local: + +Overview +-------- + +This is a compilation of information from the code and original patch +series that introduced the `Virtually Mapped Kernel Stacks feature +` + +Introduction +------------ + +Kernel stack overflows are often hard to debug and make the kernel +susceptible to exploits. Problems could show up at a later time, making +it difficult to isolate and root-cause. + +Virtually-mapped kernel stacks with guard pages cause kernel stack +overflows to be caught immediately rather than causing difficult to +diagnose corruptions. + +HAVE_ARCH_VMAP_STACK and VMAP_STACK configuration options enable +support for virtually mapped stacks with guard pages. This feature +causes reliable faults when the stack overflows. The usability of +the stack trace after overflow and response to the overflow itself +is architecture dependent. + +.. note:: + As of this writing, arm64, powerpc, riscv, s390, um, and x86 have + support for VMAP_STACK. + +HAVE_ARCH_VMAP_STACK +-------------------- + +Architectures that can support Virtually Mapped Kernel Stacks should +enable this bool configuration option. 
The requirements are: + +- vmalloc space must be large enough to hold many kernel stacks. This + may rule out many 32-bit architectures. +- Stacks in vmalloc space need to work reliably. For example, if + vmap page tables are created on demand, either this mechanism + needs to work while the stack points to a virtual address with + unpopulated page tables or arch code (switch_to() and switch_mm(), + most likely) needs to ensure that the stack's page table entries + are populated before running on a possibly unpopulated stack. +- If the stack overflows into a guard page, something reasonable + should happen. The definition of "reasonable" is flexible, but + instantly rebooting without logging anything would be unfriendly. + +VMAP_STACK +---------- + +VMAP_STACK bool configuration option when enabled allocates virtually +mapped task stacks. This option depends on HAVE_ARCH_VMAP_STACK. + +- Enable this if you want to use virtually-mapped kernel stacks + with guard pages. This causes kernel stack overflows to be caught + immediately rather than causing difficult-to-diagnose corruption. + +.. note:: + + Using this feature with KASAN requires architecture support + for backing virtual mappings with real shadow memory, and + KASAN_VMALLOC must be enabled. + +.. note:: + + When VMAP_STACK is enabled, it is not possible to run DMA on stack + allocated data. + +Kernel configuration options and dependencies keep changing. Refer to +the latest code base: + +`Kconfig ` + +Allocation +----------- + +When a new kernel thread is created, thread stack is allocated from +virtually contiguous memory pages from the page level allocator. These +pages are mapped into contiguous kernel virtual space with PAGE_KERNEL +protections. + +alloc_thread_stack_node() calls __vmalloc_node_range() to allocate stack +with PAGE_KERNEL protections. + +- Allocated stacks are cached and later reused by new threads, so memcg + accounting is performed manually on assigning/releasing stacks to tasks. 
+ Hence, __vmalloc_node_range is called without __GFP_ACCOUNT. +- vm_struct is cached to be able to find when thread free is initiated + in interrupt context. free_thread_stack() can be called in interrupt + context. +- On arm64, all VMAP'd stacks need to have the same alignment to ensure + that VMAP'd stack overflow detection works correctly. Arch specific + vmap stack allocator takes care of this detail. +- This does not address interrupt stacks - according to the original patch + +Thread stack allocation is initiated from clone(), fork(), vfork(), +kernel_thread() via kernel_clone(). Leaving a few hints for searching +the code base to understand when and how thread stack is allocated. + +Bulk of the code is in: +`kernel/fork.c `. + +stack_vm_area pointer in task_struct keeps track of the virtually allocated +stack and a non-null stack_vm_area pointer serves as an indication that the +virtually mapped kernel stacks are enabled. + +:: + + struct vm_struct *stack_vm_area; + +Stack overflow handling +----------------------- + +Leading and trailing guard pages help detect stack overflows. When stack +overflows into the guard pages, handlers have to be careful not to overflow +the stack again. When handlers are called, it is likely that very little +stack space is left. + +On x86, this is done by handling the page fault indicating the kernel +stack overflow on the double-fault stack. + +Testing VMAP allocation with guard pages +---------------------------------------- + +How do we ensure that VMAP_STACK is actually allocating with a leading +and trailing guard page? The following lkdtm tests can help detect any +regressions. + +:: + + void lkdtm_STACK_GUARD_PAGE_LEADING() + void lkdtm_STACK_GUARD_PAGE_TRAILING() + +Conclusions +----------- + +- A percpu cache of vmalloced stacks appears to be a bit faster than a + high-order stack allocation, at least when the cache hits. 
+- THREAD_INFO_IN_TASK gets rid of arch-specific thread_info entirely and + simply embed the thread_info (containing only flags) and 'int cpu' into + task_struct. +- The thread stack can be free'ed as soon as the task is dead (without + waiting for RCU) and then, if vmapped stacks are in use, cache the + entire stack for reuse on the same cpu. From patchwork Fri Jan 14 22:06:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714091 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 70DC7C433FE for ; Fri, 14 Jan 2022 22:06:33 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 02B7E6B0103; Fri, 14 Jan 2022 17:06:33 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id F1CE36B0105; Fri, 14 Jan 2022 17:06:32 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DE51B6B0106; Fri, 14 Jan 2022 17:06:32 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0055.hostedemail.com [216.40.44.55]) by kanga.kvack.org (Postfix) with ESMTP id CCC746B0103 for ; Fri, 14 Jan 2022 17:06:32 -0500 (EST) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 9CA1B182698FF for ; Fri, 14 Jan 2022 22:06:32 +0000 (UTC) X-FDA: 79030277424.07.5EEE37C Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf09.hostedemail.com (Postfix) with ESMTP id 201FA140005 for ; Fri, 14 Jan 2022 22:06:31 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by 
dfw.source.kernel.org (Postfix) with ESMTPS id 59EAE62001; Fri, 14 Jan 2022 22:06:31 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id F0ACAC36AE5; Fri, 14 Jan 2022 22:06:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197990; bh=QywIrIHRdQWHbIF8krE1slR1movNTr0w4PFjaGy6SU0=; h=Date:From:To:Subject:In-Reply-To:From; b=CDndbAGi0qzABfsFWZZxNPJl3xnXc877FaHSbIfiW6LfSrLaC5e6QllEOIirPsFNm pIEjIK1nGDMm/aEy50y/ayDmOQHe/AkyJkDd8dZkqC2FnuF+Pfw0xqrkZBRm5xP3VN 0OKm0lgbsxRLyY5Y13J1hHUEPE8o+jOx0wanrilE= Date: Fri, 14 Jan 2022 14:06:29 -0800 From: Andrew Morton To: akpm@linux-foundation.org, aneesh.kumar@linux.ibm.com, corbet@lwn.net, dave.hansen@linux.intel.com, frederic@kernel.org, gthelen@google.com, hpa@zytor.com, hughd@google.com, jirislaby@kernel.org, keescook@chromium.org, linux-mm@kvack.org, masahiroy@kernel.org, mingo@redhat.com, mm-commits@vger.kernel.org, pasha.tatashin@soleen.com, peterz@infradead.org, pjt@google.com, rientjes@google.com, rppt@kernel.org, samitolvanen@google.com, songmuchun@bytedance.com, tglx@linutronix.de, torvalds@linux-foundation.org, weixugc@google.com, will@kernel.org Subject: [patch 065/146] mm: change page type prior to adding page table entry Message-ID: <20220114220629.y-VkIZDj3%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam11 X-Rspamd-Queue-Id: 201FA140005 X-Stat-Signature: xkr6e34hjwxmfnqgtxpzpm3kbt3qnmnz Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=CDndbAGi; dmarc=none; spf=pass (imf09.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1642197991-147545 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: 
owner-majordomo@kvack.org List-ID: From: Pasha Tatashin Subject: mm: change page type prior to adding page table entry Patch series "page table check", v3. Ensure that some memory corruptions are prevented by checking at the time of insertion of entries into user page tables that there is no illegal sharing. We have recently found a problem [1] that has existed in the kernel since 4.14. The problem was caused by a broken page ref count and led to memory leaking from one process into another. The problem was accidentally detected by studying a dump of one process and noticing that one page contains memory that should not belong to this process. There are some other page->_refcount related problems that were recently fixed: [2], [3] which potentially could also lead to illegal sharing. In addition to hardening refcount [4] itself, this work is an attempt to prevent this class of memory corruption issues. It uses a simple state machine that is independent from regular MM logic to check for illegal sharing at the time pages are inserted into and removed from page tables. [1] https://lore.kernel.org/all/xr9335nxwc5y.fsf@gthelen2.svl.corp.google.com [2] https://lore.kernel.org/all/1582661774-30925-2-git-send-email-akaher@vmware.com [3] https://lore.kernel.org/all/20210622021423.154662-3-mike.kravetz@oracle.com [4] https://lore.kernel.org/all/20211221150140.988298-1-pasha.tatashin@soleen.com This patch (of 4): There are a few places where we first update the entry in the user page table, and later change the struct page to indicate that this is an anonymous or file page. In most places, however, we first configure the page metadata and then insert entries into the page table. Page table check will use the information from struct page to verify the type of entry being inserted. Change the order in all places to first update struct page, and later to update page table. 
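As a rough illustration of why the ordering matters, here is a toy user-space model, not kernel code. Every toy_* name is hypothetical and only loosely stands in for the rmap calls and set_pte_at(); the counters are an assumption about how a checker might classify mappings. A checker that reads the page type at PTE-insertion time records the wrong type if the type is set afterwards.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page plus per-page checker state. */
struct toy_page {
	bool anon;          /* page type, set by the toy "rmap" step */
	int anon_map_count; /* mappings the checker recorded as anonymous */
	int file_map_count; /* mappings the checker recorded as file-backed */
};

/* Toy analogue of page_add_new_anon_rmap(): marks the page anonymous. */
static void toy_add_anon_rmap(struct toy_page *page)
{
	page->anon = true;
}

/*
 * Toy analogue of set_pte_at() with a checker hook. The hook classifies
 * the new mapping using the page type visible at this moment and returns
 * false if the page now appears mapped as both anonymous and file-backed.
 */
static bool toy_set_pte(struct toy_page *page)
{
	if (page->anon)
		page->anon_map_count++;
	else
		page->file_map_count++;
	return !(page->anon_map_count > 0 && page->file_map_count > 0);
}

/* Old order: insert the entry first, set the page type afterwards. */
static bool toy_map_old_order(struct toy_page *page)
{
	bool ok = toy_set_pte(page);

	toy_add_anon_rmap(page);
	return ok;
}

/* New order: set the page type first, then insert the entry. */
static bool toy_map_new_order(struct toy_page *page)
{
	toy_add_anon_rmap(page);
	return toy_set_pte(page);
}
```

With the old order, the first mapping of an anonymous page is counted as file-backed, so a later, correctly typed mapping of the same page would look like illegal sharing. With the new order, the counters match the real page type from the start.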
This means that we first do calls that may change the type of page (anon or file): page_move_anon_rmap page_add_anon_rmap do_page_add_anon_rmap page_add_new_anon_rmap page_add_file_rmap hugepage_add_anon_rmap hugepage_add_new_anon_rmap And after that do calls that add entries to the page table: set_huge_pte_at set_pte_at Link: https://lkml.kernel.org/r/20211221154650.1047963-1-pasha.tatashin@soleen.com Link: https://lkml.kernel.org/r/20211221154650.1047963-2-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin Cc: David Rientjes Cc: Paul Turner Cc: Wei Xu Cc: Greg Thelen Cc: Ingo Molnar Cc: Jonathan Corbet Cc: Will Deacon Cc: Mike Rapoport Cc: Kees Cook Cc: Thomas Gleixner Cc: Peter Zijlstra Cc: Masahiro Yamada Cc: Sami Tolvanen Cc: Dave Hansen Cc: Frederic Weisbecker Cc: "H. Peter Anvin" Cc: Aneesh Kumar K.V Cc: Jiri Slaby Cc: Muchun Song Cc: Hugh Dickins Signed-off-by: Andrew Morton --- mm/hugetlb.c | 6 +++--- mm/memory.c | 9 +++++---- mm/migrate.c | 5 ++--- mm/swapfile.c | 4 ++-- 4 files changed, 12 insertions(+), 12 deletions(-) --- a/mm/hugetlb.c~mm-change-page-type-prior-to-adding-page-table-entry +++ a/mm/hugetlb.c @@ -4684,8 +4684,8 @@ hugetlb_install_page(struct vm_area_stru struct page *new_page) { __SetPageUptodate(new_page); - set_huge_pte_at(vma->vm_mm, addr, ptep, make_huge_pte(vma, new_page, 1)); hugepage_add_new_anon_rmap(new_page, vma, addr); + set_huge_pte_at(vma->vm_mm, addr, ptep, make_huge_pte(vma, new_page, 1)); hugetlb_count_add(pages_per_huge_page(hstate_vma(vma)), vma->vm_mm); ClearHPageRestoreReserve(new_page); SetHPageMigratable(new_page); @@ -5259,10 +5259,10 @@ retry_avoidcopy: /* Break COW */ huge_ptep_clear_flush(vma, haddr, ptep); mmu_notifier_invalidate_range(mm, range.start, range.end); - set_huge_pte_at(mm, haddr, ptep, - make_huge_pte(vma, new_page, 1)); page_remove_rmap(old_page, true); hugepage_add_new_anon_rmap(new_page, vma, haddr); + set_huge_pte_at(mm, haddr, ptep, + make_huge_pte(vma, new_page, 1)); 
SetHPageMigratable(new_page); /* Make the old page be freed below */ new_page = old_page; --- a/mm/memory.c~mm-change-page-type-prior-to-adding-page-table-entry +++ a/mm/memory.c @@ -720,8 +720,6 @@ static void restore_exclusive_pte(struct else if (is_writable_device_exclusive_entry(entry)) pte = maybe_mkwrite(pte_mkdirty(pte), vma); - set_pte_at(vma->vm_mm, address, ptep, pte); - /* * No need to take a page reference as one was already * created when the swap entry was made. @@ -735,6 +733,8 @@ static void restore_exclusive_pte(struct */ WARN_ON_ONCE(!PageAnon(page)); + set_pte_at(vma->vm_mm, address, ptep, pte); + if (vma->vm_flags & VM_LOCKED) mlock_vma_page(page); @@ -3640,8 +3640,6 @@ vm_fault_t do_swap_page(struct vm_fault pte = pte_mkuffd_wp(pte); pte = pte_wrprotect(pte); } - set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte); - arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte); vmf->orig_pte = pte; /* ksm created a completely new copy */ @@ -3652,6 +3650,9 @@ vm_fault_t do_swap_page(struct vm_fault do_page_add_anon_rmap(page, vma, vmf->address, exclusive); } + set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte); + arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte); + swap_free(entry); if (mem_cgroup_swap_full(page) || (vma->vm_flags & VM_LOCKED) || PageMlocked(page)) --- a/mm/migrate.c~mm-change-page-type-prior-to-adding-page-table-entry +++ a/mm/migrate.c @@ -236,20 +236,19 @@ static bool remove_migration_pte(struct pte = pte_mkhuge(pte); pte = arch_make_huge_pte(pte, shift, vma->vm_flags); - set_huge_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte); if (PageAnon(new)) hugepage_add_anon_rmap(new, vma, pvmw.address); else page_dup_rmap(new, true); + set_huge_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte); } else #endif { - set_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte); - if (PageAnon(new)) page_add_anon_rmap(new, vma, pvmw.address, false); else page_add_file_rmap(new, false); + set_pte_at(vma->vm_mm, 
pvmw.address, pvmw.pte, pte); } if (vma->vm_flags & VM_LOCKED && !PageTransCompound(new)) mlock_vma_page(new); --- a/mm/swapfile.c~mm-change-page-type-prior-to-adding-page-table-entry +++ a/mm/swapfile.c @@ -1917,14 +1917,14 @@ static int unuse_pte(struct vm_area_stru dec_mm_counter(vma->vm_mm, MM_SWAPENTS); inc_mm_counter(vma->vm_mm, MM_ANONPAGES); get_page(page); - set_pte_at(vma->vm_mm, addr, pte, - pte_mkold(mk_pte(page, vma->vm_page_prot))); if (page == swapcache) { page_add_anon_rmap(page, vma, addr, false); } else { /* ksm created a completely new copy */ page_add_new_anon_rmap(page, vma, addr, false); lru_cache_add_inactive_or_unevictable(page, vma); } + set_pte_at(vma->vm_mm, addr, pte, + pte_mkold(mk_pte(page, vma->vm_page_prot))); swap_free(entry); out: pte_unmap_unlock(pte, ptl); From patchwork Fri Jan 14 22:06:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714092 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 69C84C433FE for ; Fri, 14 Jan 2022 22:06:38 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 079036B0105; Fri, 14 Jan 2022 17:06:38 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 028566B0107; Fri, 14 Jan 2022 17:06:37 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E32C96B0108; Fri, 14 Jan 2022 17:06:37 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0250.hostedemail.com [216.40.44.250]) by kanga.kvack.org (Postfix) with ESMTP id D00656B0105 for ; Fri, 14 Jan 2022 17:06:37 -0500 (EST) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 
941FE96353 for ; Fri, 14 Jan 2022 22:06:37 +0000 (UTC) X-FDA: 79030277634.05.52E269C Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf24.hostedemail.com (Postfix) with ESMTP id 1C54E180012 for ; Fri, 14 Jan 2022 22:06:36 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id E9EBFB8262F; Fri, 14 Jan 2022 22:06:35 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id CD29FC36AEC; Fri, 14 Jan 2022 22:06:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197994; bh=yFxuP4qqe798e6LZgS0TqBwAyKW6JB39G52ANX9xKRg=; h=Date:From:To:Subject:In-Reply-To:From; b=AToy7468ngIjzcsdQAuHeYRO+nFhqo+pMG0aa0FOj3vOhYvhozpSOgUqp5UoVnNJr HLifuNPz9yCCchXoNOrbTKC8U1WRdyaSVphFs1C6xDhXzv/Imaf8fCdc2wi/tgFnyz 26Cscga4oRYHUUlYEBS+9treN4hSaL70UBF6gMwk= Date: Fri, 14 Jan 2022 14:06:33 -0800 From: Andrew Morton To: akpm@linux-foundation.org, aneesh.kumar@linux.ibm.com, corbet@lwn.net, dave.hansen@linux.intel.com, frederic@kernel.org, gthelen@google.com, hpa@zytor.com, hughd@google.com, jirislaby@kernel.org, keescook@chromium.org, linux-mm@kvack.org, masahiroy@kernel.org, mingo@redhat.com, mm-commits@vger.kernel.org, pasha.tatashin@soleen.com, peterz@infradead.org, pjt@google.com, rientjes@google.com, rppt@kernel.org, samitolvanen@google.com, songmuchun@bytedance.com, tglx@linutronix.de, torvalds@linux-foundation.org, weixugc@google.com, will@kernel.org Subject: [patch 066/146] mm: ptep_clear() page table helper Message-ID: <20220114220633.tueCIXFno%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam11 X-Rspamd-Queue-Id: 1C54E180012 X-Stat-Signature: s8ahsftqnxdwdqrnhimwkuzg5rognii6 Authentication-Results: 
imf24.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=AToy7468; dmarc=none; spf=pass (imf24.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1642197996-779165 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Pasha Tatashin Subject: mm: ptep_clear() page table helper We have ptep_get_and_clear() and ptep_get_and_clear_full() helpers to clear PTE from user page tables, but there is no variant for simple clear of a present PTE from user page tables without using a low level pte_clear() which can be either native or para-virtualised. Add a new ptep_clear() that can be used in common code to clear PTEs from page table. We will need this call later in order to add a hook for page table check. Link: https://lkml.kernel.org/r/20211221154650.1047963-3-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin Cc: Aneesh Kumar K.V Cc: Dave Hansen Cc: David Rientjes Cc: Frederic Weisbecker Cc: Greg Thelen Cc: "H. 
Peter Anvin" Cc: Hugh Dickins Cc: Ingo Molnar Cc: Jiri Slaby Cc: Jonathan Corbet Cc: Kees Cook Cc: Masahiro Yamada Cc: Mike Rapoport Cc: Muchun Song Cc: Paul Turner Cc: Peter Zijlstra Cc: Sami Tolvanen Cc: Thomas Gleixner Cc: Wei Xu Cc: Will Deacon Signed-off-by: Andrew Morton --- Documentation/vm/arch_pgtable_helpers.rst | 6 ++++-- include/linux/pgtable.h | 8 ++++++++ mm/debug_vm_pgtable.c | 2 +- mm/khugepaged.c | 12 ++---------- 4 files changed, 15 insertions(+), 13 deletions(-) --- a/Documentation/vm/arch_pgtable_helpers.rst~mm-ptep_clear-page-table-helper +++ a/Documentation/vm/arch_pgtable_helpers.rst @@ -66,9 +66,11 @@ PTE Page Table Helpers +---------------------------+--------------------------------------------------+ | pte_mknotpresent | Invalidates a mapped PTE | +---------------------------+--------------------------------------------------+ -| ptep_get_and_clear | Clears a PTE | +| ptep_clear | Clears a PTE | +---------------------------+--------------------------------------------------+ -| ptep_get_and_clear_full | Clears a PTE | +| ptep_get_and_clear | Clears and returns PTE | ++---------------------------+--------------------------------------------------+ +| ptep_get_and_clear_full | Clears and returns PTE (batched PTE unmap) | +---------------------------+--------------------------------------------------+ | ptep_test_and_clear_young | Clears young from a PTE | +---------------------------+--------------------------------------------------+ --- a/include/linux/pgtable.h~mm-ptep_clear-page-table-helper +++ a/include/linux/pgtable.h @@ -258,6 +258,14 @@ static inline int pmdp_clear_flush_young #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ #endif +#ifndef __HAVE_ARCH_PTEP_CLEAR +static inline void ptep_clear(struct mm_struct *mm, unsigned long addr, + pte_t *ptep) +{ + pte_clear(mm, addr, ptep); +} +#endif + #ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long address, --- 
a/mm/debug_vm_pgtable.c~mm-ptep_clear-page-table-helper +++ a/mm/debug_vm_pgtable.c @@ -652,7 +652,7 @@ static void __init pte_clear_tests(struc set_pte_at(args->mm, args->vaddr, args->ptep, pte); flush_dcache_page(page); barrier(); - pte_clear(args->mm, args->vaddr, args->ptep); + ptep_clear(args->mm, args->vaddr, args->ptep); pte = ptep_get(args->ptep); WARN_ON(!pte_none(pte)); } --- a/mm/khugepaged.c~mm-ptep_clear-page-table-helper +++ a/mm/khugepaged.c @@ -756,11 +756,7 @@ static void __collapse_huge_page_copy(pt * ptl mostly unnecessary. */ spin_lock(ptl); - /* - * paravirt calls inside pte_clear here are - * superfluous. - */ - pte_clear(vma->vm_mm, address, _pte); + ptep_clear(vma->vm_mm, address, _pte); spin_unlock(ptl); } } else { @@ -774,11 +770,7 @@ static void __collapse_huge_page_copy(pt * inside page_remove_rmap(). */ spin_lock(ptl); - /* - * paravirt calls inside pte_clear here are - * superfluous. - */ - pte_clear(vma->vm_mm, address, _pte); + ptep_clear(vma->vm_mm, address, _pte); page_remove_rmap(src_page, false); spin_unlock(ptl); free_page_and_swap_cache(src_page); From patchwork Fri Jan 14 22:06:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714093 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 65CD9C433EF for ; Fri, 14 Jan 2022 22:06:41 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E5C396B0107; Fri, 14 Jan 2022 17:06:40 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id E0CBE6B0109; Fri, 14 Jan 2022 17:06:40 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C852D6B010A; Fri, 14 Jan 2022 17:06:40 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from 
forelay.hostedemail.com (smtprelay0174.hostedemail.com [216.40.44.174]) by kanga.kvack.org (Postfix) with ESMTP id B3DC76B0107 for ; Fri, 14 Jan 2022 17:06:40 -0500 (EST) Received: from smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 70CF3182693EE for ; Fri, 14 Jan 2022 22:06:40 +0000 (UTC) X-FDA: 79030277760.16.733D1EA Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf17.hostedemail.com (Postfix) with ESMTP id DD81440011 for ; Fri, 14 Jan 2022 22:06:39 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 3BD7062011; Fri, 14 Jan 2022 22:06:39 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C24B7C36AE5; Fri, 14 Jan 2022 22:06:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642197998; bh=QwuADMFxF9o4TXIv4wv/hsiVlOtlt58FZqXgCMff+ds=; h=Date:From:To:Subject:In-Reply-To:From; b=qO6s/llrkLbjFM4vMcrJJSaWKx9HpBl965VAGU1PbyX24mROJ9MUGJqnJuuf/tT/C SZDW+HvjvQS2Nt6ItVBdWTwbmXS0wGdlVgu1yFg7yVvOlMxkNs5rGbYi8lEyTcd8nl E7ohGENeNg9cs60YRXW0ukeDCkylEthJCNx5jD5E= Date: Fri, 14 Jan 2022 14:06:37 -0800 From: Andrew Morton To: akpm@linux-foundation.org, aneesh.kumar@linux.ibm.com, corbet@lwn.net, dave.hansen@linux.intel.com, frederic@kernel.org, gthelen@google.com, hpa@zytor.com, hughd@google.com, jirislaby@kernel.org, keescook@chromium.org, linux-mm@kvack.org, masahiroy@kernel.org, mingo@redhat.com, mm-commits@vger.kernel.org, pasha.tatashin@soleen.com, peterz@infradead.org, pjt@google.com, rientjes@google.com, rppt@kernel.org, samitolvanen@google.com, songmuchun@bytedance.com, tglx@linutronix.de, torvalds@linux-foundation.org, weixugc@google.com, will@kernel.org Subject: [patch 067/146] mm: page table check Message-ID: 
<20220114220637.fzWDGOzsB%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: DD81440011 X-Stat-Signature: tazb9j7mdo5nk74dk65x1rcebqx8mr31 Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="qO6s/llr"; spf=pass (imf17.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642197999-886681 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Pasha Tatashin Subject: mm: page table check Check user page table entries at the time they are added and removed. This allows memory corruption issues related to double mapping to be caught synchronously. When a pte for an anonymous page is added into a page table, we verify that this pte does not already point to a file backed page, and vice versa: if a file backed page is being added, we verify that this page does not have an anonymous mapping. We also enforce that only read-only sharing of anonymous pages is allowed (i.e. cow after fork). All other sharing must be for file pages. Page table check helps to protect against and debug cases where "struct page" metadata has become corrupted for some reason, for example when refcnt or mapcount becomes invalid. Link: https://lkml.kernel.org/r/20211221154650.1047963-4-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin Cc: Aneesh Kumar K.V Cc: Dave Hansen Cc: David Rientjes Cc: Frederic Weisbecker Cc: Greg Thelen Cc: "H.
Peter Anvin" Cc: Hugh Dickins Cc: Ingo Molnar Cc: Jiri Slaby Cc: Jonathan Corbet Cc: Kees Cook Cc: Masahiro Yamada Cc: Mike Rapoport Cc: Muchun Song Cc: Paul Turner Cc: Peter Zijlstra Cc: Sami Tolvanen Cc: Thomas Gleixner Cc: Wei Xu Cc: Will Deacon Signed-off-by: Andrew Morton --- Documentation/vm/index.rst | 1 Documentation/vm/page_table_check.rst | 56 ++++ MAINTAINERS | 9 arch/Kconfig | 3 include/linux/page_table_check.h | 147 +++++++++++++ mm/Kconfig.debug | 24 ++ mm/Makefile | 1 mm/page_alloc.c | 4 mm/page_ext.c | 4 mm/page_table_check.c | 270 ++++++++++++++++++++++++ 10 files changed, 519 insertions(+) --- a/arch/Kconfig~mm-page-table-check +++ a/arch/Kconfig @@ -1297,6 +1297,9 @@ config HAVE_ARCH_PFN_VALID config ARCH_SUPPORTS_DEBUG_PAGEALLOC bool +config ARCH_SUPPORTS_PAGE_TABLE_CHECK + bool + config ARCH_SPLIT_ARG64 bool help --- a/Documentation/vm/index.rst~mm-page-table-check +++ a/Documentation/vm/index.rst @@ -31,6 +31,7 @@ algorithms. If you are looking for advi page_migration page_frags page_owner + page_table_check remap_file_pages slub split_page_table_lock --- /dev/null +++ a/Documentation/vm/page_table_check.rst @@ -0,0 +1,56 @@ +.. SPDX-License-Identifier: GPL-2.0 + +.. _page_table_check: + +================ +Page Table Check +================ + +Introduction +============ + +Page table check allows to harden the kernel by ensuring that some types of +the memory corruptions are prevented. + +Page table check performs extra verifications at the time when new pages become +accessible from the userspace by getting their page table entries (PTEs, PMDs +etc.) added into the table. + +In case of detected corruption, the kernel is crashed. There is a small +performance and memory overhead associated with the page table check. Therefore, +it is disabled by default, but can be optionally enabled on systems where the +extra hardening outweighs the performance costs.
Also, because page table check +is synchronous, it can help with debugging double map memory corruption issues, +by crashing the kernel at the time a wrong mapping occurs instead of later, which is +often the case with memory corruption bugs. + +Double mapping detection logic +============================== + ++-------------------+-------------------+-------------------+------------------+ +| Current Mapping | New mapping | Permissions | Rule | ++===================+===================+===================+==================+ +| Anonymous | Anonymous | Read | Allow | ++-------------------+-------------------+-------------------+------------------+ +| Anonymous | Anonymous | Read / Write | Prohibit | ++-------------------+-------------------+-------------------+------------------+ +| Anonymous | Named | Any | Prohibit | ++-------------------+-------------------+-------------------+------------------+ +| Named | Anonymous | Any | Prohibit | ++-------------------+-------------------+-------------------+------------------+ +| Named | Named | Any | Allow | ++-------------------+-------------------+-------------------+------------------+ + +Enabling Page Table Check +========================= + +Build kernel with: + +- PAGE_TABLE_CHECK=y + Note, it can only be enabled on platforms where ARCH_SUPPORTS_PAGE_TABLE_CHECK + is available. + +- Boot with 'page_table_check=on' kernel parameter. + +Optionally, build kernel with PAGE_TABLE_CHECK_ENFORCED in order to have page +table support without extra kernel parameter. --- /dev/null +++ a/include/linux/page_table_check.h @@ -0,0 +1,147 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +/* + * Copyright (c) 2021, Google LLC.
+ * Pasha Tatashin + */ +#ifndef __LINUX_PAGE_TABLE_CHECK_H +#define __LINUX_PAGE_TABLE_CHECK_H + +#ifdef CONFIG_PAGE_TABLE_CHECK +#include + +extern struct static_key_true page_table_check_disabled; +extern struct page_ext_operations page_table_check_ops; + +void __page_table_check_zero(struct page *page, unsigned int order); +void __page_table_check_pte_clear(struct mm_struct *mm, unsigned long addr, + pte_t pte); +void __page_table_check_pmd_clear(struct mm_struct *mm, unsigned long addr, + pmd_t pmd); +void __page_table_check_pud_clear(struct mm_struct *mm, unsigned long addr, + pud_t pud); +void __page_table_check_pte_set(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte); +void __page_table_check_pmd_set(struct mm_struct *mm, unsigned long addr, + pmd_t *pmdp, pmd_t pmd); +void __page_table_check_pud_set(struct mm_struct *mm, unsigned long addr, + pud_t *pudp, pud_t pud); + +static inline void page_table_check_alloc(struct page *page, unsigned int order) +{ + if (static_branch_likely(&page_table_check_disabled)) + return; + + __page_table_check_zero(page, order); +} + +static inline void page_table_check_free(struct page *page, unsigned int order) +{ + if (static_branch_likely(&page_table_check_disabled)) + return; + + __page_table_check_zero(page, order); +} + +static inline void page_table_check_pte_clear(struct mm_struct *mm, + unsigned long addr, pte_t pte) +{ + if (static_branch_likely(&page_table_check_disabled)) + return; + + __page_table_check_pte_clear(mm, addr, pte); +} + +static inline void page_table_check_pmd_clear(struct mm_struct *mm, + unsigned long addr, pmd_t pmd) +{ + if (static_branch_likely(&page_table_check_disabled)) + return; + + __page_table_check_pmd_clear(mm, addr, pmd); +} + +static inline void page_table_check_pud_clear(struct mm_struct *mm, + unsigned long addr, pud_t pud) +{ + if (static_branch_likely(&page_table_check_disabled)) + return; + + __page_table_check_pud_clear(mm, addr, pud); +} + +static inline void 
page_table_check_pte_set(struct mm_struct *mm, + unsigned long addr, pte_t *ptep, + pte_t pte) +{ + if (static_branch_likely(&page_table_check_disabled)) + return; + + __page_table_check_pte_set(mm, addr, ptep, pte); +} + +static inline void page_table_check_pmd_set(struct mm_struct *mm, + unsigned long addr, pmd_t *pmdp, + pmd_t pmd) +{ + if (static_branch_likely(&page_table_check_disabled)) + return; + + __page_table_check_pmd_set(mm, addr, pmdp, pmd); +} + +static inline void page_table_check_pud_set(struct mm_struct *mm, + unsigned long addr, pud_t *pudp, + pud_t pud) +{ + if (static_branch_likely(&page_table_check_disabled)) + return; + + __page_table_check_pud_set(mm, addr, pudp, pud); +} + +#else + +static inline void page_table_check_alloc(struct page *page, unsigned int order) +{ +} + +static inline void page_table_check_free(struct page *page, unsigned int order) +{ +} + +static inline void page_table_check_pte_clear(struct mm_struct *mm, + unsigned long addr, pte_t pte) +{ +} + +static inline void page_table_check_pmd_clear(struct mm_struct *mm, + unsigned long addr, pmd_t pmd) +{ +} + +static inline void page_table_check_pud_clear(struct mm_struct *mm, + unsigned long addr, pud_t pud) +{ +} + +static inline void page_table_check_pte_set(struct mm_struct *mm, + unsigned long addr, pte_t *ptep, + pte_t pte) +{ +} + +static inline void page_table_check_pmd_set(struct mm_struct *mm, + unsigned long addr, pmd_t *pmdp, + pmd_t pmd) +{ +} + +static inline void page_table_check_pud_set(struct mm_struct *mm, + unsigned long addr, pud_t *pudp, + pud_t pud) +{ +} + +#endif /* CONFIG_PAGE_TABLE_CHECK */ +#endif /* __LINUX_PAGE_TABLE_CHECK_H */ --- a/MAINTAINERS~mm-page-table-check +++ a/MAINTAINERS @@ -14387,6 +14387,15 @@ F: include/net/page_pool.h F: include/trace/events/page_pool.h F: net/core/page_pool.c +PAGE TABLE CHECK +M: Pasha Tatashin +M: Andrew Morton +L: linux-mm@kvack.org +S: Maintained +F: Documentation/vm/page_table_check.rst +F: 
include/linux/page_table_check.h +F: mm/page_table_check.c + PANASONIC LAPTOP ACPI EXTRAS DRIVER M: Kenneth Chan L: platform-driver-x86@vger.kernel.org --- a/mm/Kconfig.debug~mm-page-table-check +++ a/mm/Kconfig.debug @@ -62,6 +62,30 @@ config PAGE_OWNER If unsure, say N. +config PAGE_TABLE_CHECK + bool "Check for invalid mappings in user page tables" + depends on ARCH_SUPPORTS_PAGE_TABLE_CHECK + select PAGE_EXTENSION + help + Check that anonymous page is not being mapped twice with read write + permissions. Check that anonymous and file pages are not being + erroneously shared. Since the checking is performed at the time + entries are added and removed to user page tables, leaking, corruption + and double mapping problems are detected synchronously. + + If unsure say "n". + +config PAGE_TABLE_CHECK_ENFORCED + bool "Enforce the page table checking by default" + depends on PAGE_TABLE_CHECK + help + Always enable page table checking. By default the page table checking + is disabled, and can be optionally enabled via page_table_check=on + kernel parameter. This config enforces that page table check is always + enabled. + + If unsure say "n". 
+ config PAGE_POISONING bool "Poison pages after freeing" help --- a/mm/Makefile~mm-page-table-check +++ a/mm/Makefile @@ -112,6 +112,7 @@ obj-$(CONFIG_GENERIC_EARLY_IOREMAP) += e obj-$(CONFIG_CMA) += cma.o obj-$(CONFIG_MEMORY_BALLOON) += balloon_compaction.o obj-$(CONFIG_PAGE_EXTENSION) += page_ext.o +obj-$(CONFIG_PAGE_TABLE_CHECK) += page_table_check.o obj-$(CONFIG_CMA_DEBUGFS) += cma_debug.o obj-$(CONFIG_SECRETMEM) += secretmem.o obj-$(CONFIG_CMA_SYSFS) += cma_sysfs.o --- a/mm/page_alloc.c~mm-page-table-check +++ a/mm/page_alloc.c @@ -63,6 +63,7 @@ #include #include #include +#include #include #include #include @@ -1307,6 +1308,7 @@ static __always_inline bool free_pages_p if (memcg_kmem_enabled() && PageMemcgKmem(page)) __memcg_kmem_uncharge_page(page, order); reset_page_owner(page, order); + page_table_check_free(page, order); return false; } @@ -1346,6 +1348,7 @@ static __always_inline bool free_pages_p page_cpupid_reset_last(page); page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP; reset_page_owner(page, order); + page_table_check_free(page, order); if (!PageHighMem(page)) { debug_check_no_locks_freed(page_address(page), @@ -2420,6 +2423,7 @@ inline void post_alloc_hook(struct page } set_page_owner(page, order, gfp_flags); + page_table_check_alloc(page, order); } static void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags, --- a/mm/page_ext.c~mm-page-table-check +++ a/mm/page_ext.c @@ -8,6 +8,7 @@ #include #include #include +#include /* * struct page extension @@ -75,6 +76,9 @@ static struct page_ext_operations *page_ #if defined(CONFIG_PAGE_IDLE_FLAG) && !defined(CONFIG_64BIT) &page_idle_ops, #endif +#ifdef CONFIG_PAGE_TABLE_CHECK + &page_table_check_ops, +#endif }; unsigned long page_ext_size = sizeof(struct page_ext); --- /dev/null +++ a/mm/page_table_check.c @@ -0,0 +1,270 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright (c) 2021, Google LLC. 
+ * Pasha Tatashin + */ +#include +#include + +#undef pr_fmt +#define pr_fmt(fmt) "page_table_check: " fmt + +struct page_table_check { + atomic_t anon_map_count; + atomic_t file_map_count; +}; + +static bool __page_table_check_enabled __initdata = + IS_ENABLED(CONFIG_PAGE_TABLE_CHECK_ENFORCED); + +DEFINE_STATIC_KEY_TRUE(page_table_check_disabled); +EXPORT_SYMBOL(page_table_check_disabled); + +static int __init early_page_table_check_param(char *buf) +{ + if (!buf) + return -EINVAL; + + if (strcmp(buf, "on") == 0) + __page_table_check_enabled = true; + else if (strcmp(buf, "off") == 0) + __page_table_check_enabled = false; + + return 0; +} + +early_param("page_table_check", early_page_table_check_param); + +static bool __init need_page_table_check(void) +{ + return __page_table_check_enabled; +} + +static void __init init_page_table_check(void) +{ + if (!__page_table_check_enabled) + return; + static_branch_disable(&page_table_check_disabled); +} + +struct page_ext_operations page_table_check_ops = { + .size = sizeof(struct page_table_check), + .need = need_page_table_check, + .init = init_page_table_check, +}; + +static struct page_table_check *get_page_table_check(struct page_ext *page_ext) +{ + BUG_ON(!page_ext); + return (void *)(page_ext) + page_table_check_ops.offset; +} + +static inline bool pte_user_accessible_page(pte_t pte) +{ + return (pte_val(pte) & _PAGE_PRESENT) && (pte_val(pte) & _PAGE_USER); +} + +static inline bool pmd_user_accessible_page(pmd_t pmd) +{ + return pmd_leaf(pmd) && (pmd_val(pmd) & _PAGE_PRESENT) && + (pmd_val(pmd) & _PAGE_USER); +} + +static inline bool pud_user_accessible_page(pud_t pud) +{ + return pud_leaf(pud) && (pud_val(pud) & _PAGE_PRESENT) && + (pud_val(pud) & _PAGE_USER); +} + +/* + * An entry is removed from the page table, decrement the counters for that page + * verify that it is of correct type and counters do not become negative.
+ */ +static void page_table_check_clear(struct mm_struct *mm, unsigned long addr, + unsigned long pfn, unsigned long pgcnt) +{ + struct page_ext *page_ext; + struct page *page; + bool anon; + int i; + + if (!pfn_valid(pfn)) + return; + + page = pfn_to_page(pfn); + page_ext = lookup_page_ext(page); + anon = PageAnon(page); + + for (i = 0; i < pgcnt; i++) { + struct page_table_check *ptc = get_page_table_check(page_ext); + + if (anon) { + BUG_ON(atomic_read(&ptc->file_map_count)); + BUG_ON(atomic_dec_return(&ptc->anon_map_count) < 0); + } else { + BUG_ON(atomic_read(&ptc->anon_map_count)); + BUG_ON(atomic_dec_return(&ptc->file_map_count) < 0); + } + page_ext = page_ext_next(page_ext); + } +} + +/* + * A new entry is added to the page table, increment the counters for that page + * verify that it is of correct type and is not being mapped with a different + * type to a different process. + */ +static void page_table_check_set(struct mm_struct *mm, unsigned long addr, + unsigned long pfn, unsigned long pgcnt, + bool rw) +{ + struct page_ext *page_ext; + struct page *page; + bool anon; + int i; + + if (!pfn_valid(pfn)) + return; + + page = pfn_to_page(pfn); + page_ext = lookup_page_ext(page); + anon = PageAnon(page); + + for (i = 0; i < pgcnt; i++) { + struct page_table_check *ptc = get_page_table_check(page_ext); + + if (anon) { + BUG_ON(atomic_read(&ptc->file_map_count)); + BUG_ON(atomic_inc_return(&ptc->anon_map_count) > 1 && rw); + } else { + BUG_ON(atomic_read(&ptc->anon_map_count)); + BUG_ON(atomic_inc_return(&ptc->file_map_count) < 0); + } + page_ext = page_ext_next(page_ext); + } +} + +/* + * page is on free list, or is being allocated, verify that counters are zeroes + * crash if they are not.
+ */ +void __page_table_check_zero(struct page *page, unsigned int order) +{ + struct page_ext *page_ext = lookup_page_ext(page); + int i; + + BUG_ON(!page_ext); + for (i = 0; i < (1 << order); i++) { + struct page_table_check *ptc = get_page_table_check(page_ext); + + BUG_ON(atomic_read(&ptc->anon_map_count)); + BUG_ON(atomic_read(&ptc->file_map_count)); + page_ext = page_ext_next(page_ext); + } +} + +void __page_table_check_pte_clear(struct mm_struct *mm, unsigned long addr, + pte_t pte) +{ + if (&init_mm == mm) + return; + + if (pte_user_accessible_page(pte)) { + page_table_check_clear(mm, addr, pte_pfn(pte), + PAGE_SIZE >> PAGE_SHIFT); + } +} +EXPORT_SYMBOL(__page_table_check_pte_clear); + +void __page_table_check_pmd_clear(struct mm_struct *mm, unsigned long addr, + pmd_t pmd) +{ + if (&init_mm == mm) + return; + + if (pmd_user_accessible_page(pmd)) { + page_table_check_clear(mm, addr, pmd_pfn(pmd), + PMD_PAGE_SIZE >> PAGE_SHIFT); + } +} +EXPORT_SYMBOL(__page_table_check_pmd_clear); + +void __page_table_check_pud_clear(struct mm_struct *mm, unsigned long addr, + pud_t pud) +{ + if (&init_mm == mm) + return; + + if (pud_user_accessible_page(pud)) { + page_table_check_clear(mm, addr, pud_pfn(pud), + PUD_PAGE_SIZE >> PAGE_SHIFT); + } +} +EXPORT_SYMBOL(__page_table_check_pud_clear); + +void __page_table_check_pte_set(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte) +{ + pte_t old_pte; + + if (&init_mm == mm) + return; + + old_pte = *ptep; + if (pte_user_accessible_page(old_pte)) { + page_table_check_clear(mm, addr, pte_pfn(old_pte), + PAGE_SIZE >> PAGE_SHIFT); + } + + if (pte_user_accessible_page(pte)) { + page_table_check_set(mm, addr, pte_pfn(pte), + PAGE_SIZE >> PAGE_SHIFT, + pte_write(pte)); + } +} +EXPORT_SYMBOL(__page_table_check_pte_set); + +void __page_table_check_pmd_set(struct mm_struct *mm, unsigned long addr, + pmd_t *pmdp, pmd_t pmd) +{ + pmd_t old_pmd; + + if (&init_mm == mm) + return; + + old_pmd = *pmdp; + if 
(pmd_user_accessible_page(old_pmd)) { + page_table_check_clear(mm, addr, pmd_pfn(old_pmd), + PMD_PAGE_SIZE >> PAGE_SHIFT); + } + + if (pmd_user_accessible_page(pmd)) { + page_table_check_set(mm, addr, pmd_pfn(pmd), + PMD_PAGE_SIZE >> PAGE_SHIFT, + pmd_write(pmd)); + } +} +EXPORT_SYMBOL(__page_table_check_pmd_set); + +void __page_table_check_pud_set(struct mm_struct *mm, unsigned long addr, + pud_t *pudp, pud_t pud) +{ + pud_t old_pud; + + if (&init_mm == mm) + return; + + old_pud = *pudp; + if (pud_user_accessible_page(old_pud)) { + page_table_check_clear(mm, addr, pud_pfn(old_pud), + PUD_PAGE_SIZE >> PAGE_SHIFT); + } + + if (pud_user_accessible_page(pud)) { + page_table_check_set(mm, addr, pud_pfn(pud), + PUD_PAGE_SIZE >> PAGE_SHIFT, + pud_write(pud)); + } +} +EXPORT_SYMBOL(__page_table_check_pud_set); From patchwork Fri Jan 14 22:06:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714094 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id F0042C433EF for ; Fri, 14 Jan 2022 22:06:44 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 7CC806B0109; Fri, 14 Jan 2022 17:06:44 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 77C286B010B; Fri, 14 Jan 2022 17:06:44 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 643EB6B010C; Fri, 14 Jan 2022 17:06:44 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0105.hostedemail.com [216.40.44.105]) by kanga.kvack.org (Postfix) with ESMTP id 540196B0109 for ; Fri, 14 Jan 2022 17:06:44 -0500 (EST) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with 
ESMTP id 1D0D596744 for ; Fri, 14 Jan 2022 22:06:44 +0000 (UTC) X-FDA: 79030277928.26.81E5B1F Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf14.hostedemail.com (Postfix) with ESMTP id B0F7D100017 for ; Fri, 14 Jan 2022 22:06:43 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 2EA3561FE2; Fri, 14 Jan 2022 22:06:43 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B6AD0C36AEC; Fri, 14 Jan 2022 22:06:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198002; bh=KupSSk++AjSc6KmAQh+XrNjtvrvn9YQIz7cmfHajYeY=; h=Date:From:To:Subject:In-Reply-To:From; b=a+b5xz52dDVha10GaknPD8V2JVOsO8kYTfbJgKKX+PpZdouepIRr/RDVSWl1d/w0X +ZnwCUtri1RcXdJqFK7X0oJX5B0whZZeGRVFe/kYsTq3UwMJG5sqVSuy+UoOTgwoWY H0kUxFF3Ry7xAyvHq+raJUM3275LdvHje2Wu5yBA= Date: Fri, 14 Jan 2022 14:06:41 -0800 From: Andrew Morton To: akpm@linux-foundation.org, aneesh.kumar@linux.ibm.com, corbet@lwn.net, dave.hansen@linux.intel.com, frederic@kernel.org, gthelen@google.com, hpa@zytor.com, hughd@google.com, jirislaby@kernel.org, keescook@chromium.org, linux-mm@kvack.org, masahiroy@kernel.org, mingo@redhat.com, mm-commits@vger.kernel.org, pasha.tatashin@soleen.com, peterz@infradead.org, pjt@google.com, rientjes@google.com, rppt@kernel.org, samitolvanen@google.com, songmuchun@bytedance.com, tglx@linutronix.de, torvalds@linux-foundation.org, weixugc@google.com, will@kernel.org Subject: [patch 068/146] x86: mm: add x86_64 support for page table check Message-ID: <20220114220641.t2cwQu05-%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: B0F7D100017 X-Stat-Signature: hw1nwmxjwq595xuxdgwsyq3u7o5gryei 
Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=a+b5xz52; spf=pass (imf14.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198003-172467 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Pasha Tatashin Subject: x86: mm: add x86_64 support for page table check Add page table check hooks into routines that modify user page tables. Link: https://lkml.kernel.org/r/20211221154650.1047963-5-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin Cc: Aneesh Kumar K.V Cc: Dave Hansen Cc: David Rientjes Cc: Frederic Weisbecker Cc: Greg Thelen Cc: "H. Peter Anvin" Cc: Hugh Dickins Cc: Ingo Molnar Cc: Jiri Slaby Cc: Jonathan Corbet Cc: Kees Cook Cc: Masahiro Yamada Cc: Mike Rapoport Cc: Muchun Song Cc: Paul Turner Cc: Peter Zijlstra Cc: Sami Tolvanen Cc: Thomas Gleixner Cc: Wei Xu Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/x86/Kconfig | 1 + arch/x86/include/asm/pgtable.h | 29 +++++++++++++++++++++++++++-- 2 files changed, 28 insertions(+), 2 deletions(-) --- a/arch/x86/include/asm/pgtable.h~x86-mm-add-x86_64-support-for-page-table-check +++ a/arch/x86/include/asm/pgtable.h @@ -26,6 +26,7 @@ #include #include #include +#include extern pgd_t early_top_pgt[PTRS_PER_PGD]; bool __init __early_make_pgtable(unsigned long address, pmdval_t pmd); @@ -1006,18 +1007,21 @@ static inline pud_t native_local_pudp_ge static inline void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte) { + page_table_check_pte_set(mm, addr, ptep, pte); set_pte(ptep, pte); } static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp, pmd_t pmd) { + page_table_check_pmd_set(mm, addr, pmdp, pmd); set_pmd(pmdp, pmd); } static inline void set_pud_at(struct 
mm_struct *mm, unsigned long addr, pud_t *pudp, pud_t pud) { + page_table_check_pud_set(mm, addr, pudp, pud); native_set_pud(pudp, pud); } @@ -1048,6 +1052,7 @@ static inline pte_t ptep_get_and_clear(s pte_t *ptep) { pte_t pte = native_ptep_get_and_clear(ptep); + page_table_check_pte_clear(mm, addr, pte); return pte; } @@ -1063,12 +1068,23 @@ static inline pte_t ptep_get_and_clear_f * care about updates and native needs no locking */ pte = native_local_ptep_get_and_clear(ptep); + page_table_check_pte_clear(mm, addr, pte); } else { pte = ptep_get_and_clear(mm, addr, ptep); } return pte; } +#define __HAVE_ARCH_PTEP_CLEAR +static inline void ptep_clear(struct mm_struct *mm, unsigned long addr, + pte_t *ptep) +{ + if (IS_ENABLED(CONFIG_PAGE_TABLE_CHECK)) + ptep_get_and_clear(mm, addr, ptep); + else + pte_clear(mm, addr, ptep); +} + #define __HAVE_ARCH_PTEP_SET_WRPROTECT static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep) @@ -1109,14 +1125,22 @@ static inline int pmd_write(pmd_t pmd) static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp) { - return native_pmdp_get_and_clear(pmdp); + pmd_t pmd = native_pmdp_get_and_clear(pmdp); + + page_table_check_pmd_clear(mm, addr, pmd); + + return pmd; } #define __HAVE_ARCH_PUDP_HUGE_GET_AND_CLEAR static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm, unsigned long addr, pud_t *pudp) { - return native_pudp_get_and_clear(pudp); + pud_t pud = native_pudp_get_and_clear(pudp); + + page_table_check_pud_clear(mm, addr, pud); + + return pud; } #define __HAVE_ARCH_PMDP_SET_WRPROTECT @@ -1137,6 +1161,7 @@ static inline int pud_write(pud_t pud) static inline pmd_t pmdp_establish(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp, pmd_t pmd) { + page_table_check_pmd_set(vma->vm_mm, address, pmdp, pmd); if (IS_ENABLED(CONFIG_SMP)) { return xchg(pmdp, pmd); } else { --- a/arch/x86/Kconfig~x86-mm-add-x86_64-support-for-page-table-check +++ 
a/arch/x86/Kconfig @@ -104,6 +104,7 @@ config X86 select ARCH_SUPPORTS_ACPI select ARCH_SUPPORTS_ATOMIC_RMW select ARCH_SUPPORTS_DEBUG_PAGEALLOC + select ARCH_SUPPORTS_PAGE_TABLE_CHECK if X86_64 select ARCH_SUPPORTS_NUMA_BALANCING if X86_64 select ARCH_SUPPORTS_KMAP_LOCAL_FORCE_MAP if NR_CPUS <= 4096 select ARCH_SUPPORTS_LTO_CLANG From patchwork Fri Jan 14 22:06:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714095 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 26D61C433F5 for ; Fri, 14 Jan 2022 22:06:48 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B27106B010B; Fri, 14 Jan 2022 17:06:47 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id AD7956B010D; Fri, 14 Jan 2022 17:06:47 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 99F136B010E; Fri, 14 Jan 2022 17:06:47 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0119.hostedemail.com [216.40.44.119]) by kanga.kvack.org (Postfix) with ESMTP id 897006B010B for ; Fri, 14 Jan 2022 17:06:47 -0500 (EST) Received: from smtpin03.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 5431C8F6D3 for ; Fri, 14 Jan 2022 22:06:47 +0000 (UTC) X-FDA: 79030278054.03.224B4FF Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf23.hostedemail.com (Postfix) with ESMTP id EE47E140014 for ; Fri, 14 Jan 2022 22:06:46 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) 
with ESMTPS id 42E9B62009; Fri, 14 Jan 2022 22:06:46 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 71D0CC36AE9; Fri, 14 Jan 2022 22:06:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198005; bh=cR+CvkD83f5p1kgvg0zFS47hOrMJ3pdQ115EVBQZvnQ=; h=Date:From:To:Subject:In-Reply-To:From; b=vYSlrefopH+TVjEVAmdZ6h83vrv5pGjJakulLPy0W6QY9b3YeSFujuoPJ4fRrbR9x EFSwIWCYTDFWZOoEJDKsxlG+I2WrOFqoaJ9/JfU/4CHlNuwawVQDVXxn1NkmCAAEGk PrxaHot6ZzGNCrsQKUGRh9YCfUvaXVjvVm9QxM/M= Date: Fri, 14 Jan 2022 14:06:44 -0800 From: Andrew Morton To: akpm@linux-foundation.org, david@redhat.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, william.kucharski@oracle.com, willy@infradead.org Subject: [patch 069/146] mm: remove last argument of reuse_swap_page() Message-ID: <20220114220644.bS3fPHyrC%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: EE47E140014 X-Stat-Signature: 3i4rrz9ra51wtc9at5xbnw6hu1cr8hmw Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=vYSlrefo; spf=pass (imf23.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-Rspamd-Server: rspam07 X-HE-Tag: 1642198006-352544 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: "Matthew Wilcox (Oracle)" Subject: mm: remove last argument of reuse_swap_page() None of the callers care about the total_map_swapcount() any more. 
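For illustration only, the interface change can be modelled in user space. The sketch below is not kernel code: struct mock_page, its two fields, and the _old/_new suffixes are hypothetical stand-ins, and it assumes the simple case of a non-compound page where the reuse decision reduces to the combined map and swap count being exactly 1. It is meant to show that dropping the optional out-parameter leaves the return value unchanged for callers that passed NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; the real structure is different. */
struct mock_page {
	int mapcount;	/* assumed: number of page table mappings */
	int swapcount;	/* assumed: number of swap references */
};

/* Old shape: an optional out-parameter that callers passed as NULL. */
bool reuse_swap_page_old(const struct mock_page *page, int *total_map_swapcount)
{
	int total = page->mapcount + page->swapcount;

	if (total_map_swapcount)
		*total_map_swapcount = total;
	return total == 1;
}

/* New shape: the same decision, with the out-parameter removed. */
bool reuse_swap_page_new(const struct mock_page *page)
{
	return page->mapcount + page->swapcount == 1;
}
```

In this model a call site changes from reuse_swap_page_old(page, NULL) to reuse_swap_page_new(page) and gets the same boolean back, which mirrors the mechanical caller updates in the diff that follows.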
Link: https://lkml.kernel.org/r/20211220205943.456187-1-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Acked-by: Linus Torvalds Reviewed-by: William Kucharski Reviewed-by: David Hildenbrand Signed-off-by: Andrew Morton --- include/linux/swap.h | 6 +++--- mm/huge_memory.c | 2 +- mm/khugepaged.c | 2 +- mm/memory.c | 2 +- mm/swapfile.c | 8 +------- 5 files changed, 7 insertions(+), 13 deletions(-) --- a/include/linux/swap.h~mm-remove-last-argument-of-reuse_swap_page +++ a/include/linux/swap.h @@ -514,7 +514,7 @@ extern int __swp_swapcount(swp_entry_t e extern int swp_swapcount(swp_entry_t entry); extern struct swap_info_struct *page_swap_info(struct page *); extern struct swap_info_struct *swp_swap_info(swp_entry_t entry); -extern bool reuse_swap_page(struct page *, int *); +extern bool reuse_swap_page(struct page *); extern int try_to_free_swap(struct page *); struct backing_dev_info; extern int init_swap_address_space(unsigned int type, unsigned long nr_pages); @@ -680,8 +680,8 @@ static inline int swp_swapcount(swp_entr return 0; } -#define reuse_swap_page(page, total_map_swapcount) \ - (page_trans_huge_mapcount(page, total_map_swapcount) == 1) +#define reuse_swap_page(page) \ + (page_trans_huge_mapcount(page, NULL) == 1) static inline int try_to_free_swap(struct page *page) { --- a/mm/huge_memory.c~mm-remove-last-argument-of-reuse_swap_page +++ a/mm/huge_memory.c @@ -1322,7 +1322,7 @@ vm_fault_t do_huge_pmd_wp_page(struct vm * We can only reuse the page if nobody else maps the huge page or it's * part. 
*/ - if (reuse_swap_page(page, NULL)) { + if (reuse_swap_page(page)) { pmd_t entry; entry = pmd_mkyoung(orig_pmd); entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma); --- a/mm/khugepaged.c~mm-remove-last-argument-of-reuse_swap_page +++ a/mm/khugepaged.c @@ -681,7 +681,7 @@ static int __collapse_huge_page_isolate( goto out; } if (!pte_write(pteval) && PageSwapCache(page) && - !reuse_swap_page(page, NULL)) { + !reuse_swap_page(page)) { /* * Page is in the swap cache and cannot be re-used. * It cannot be collapsed into a THP. --- a/mm/memory.c~mm-remove-last-argument-of-reuse_swap_page +++ a/mm/memory.c @@ -3627,7 +3627,7 @@ vm_fault_t do_swap_page(struct vm_fault inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES); dec_mm_counter_fast(vma->vm_mm, MM_SWAPENTS); pte = mk_pte(page, vma->vm_page_prot); - if ((vmf->flags & FAULT_FLAG_WRITE) && reuse_swap_page(page, NULL)) { + if ((vmf->flags & FAULT_FLAG_WRITE) && reuse_swap_page(page)) { pte = maybe_mkwrite(pte_mkdirty(pte), vma); vmf->flags &= ~FAULT_FLAG_WRITE; ret |= VM_FAULT_WRITE; --- a/mm/swapfile.c~mm-remove-last-argument-of-reuse_swap_page +++ a/mm/swapfile.c @@ -1668,12 +1668,8 @@ static int page_trans_huge_map_swapcount * to it. And as a side-effect, free up its swap: because the old content * on disk will never be read, and seeking back there to write new content * later would only waste time away from clustering. - * - * NOTE: total_map_swapcount should not be relied upon by the caller if - * reuse_swap_page() returns false, but it may be always overwritten - * (see the other implementation for CONFIG_SWAP=n). 
*/ -bool reuse_swap_page(struct page *page, int *total_map_swapcount) +bool reuse_swap_page(struct page *page) { int count, total_mapcount, total_swapcount; @@ -1682,8 +1678,6 @@ bool reuse_swap_page(struct page *page, return false; count = page_trans_huge_map_swapcount(page, &total_mapcount, &total_swapcount); - if (total_map_swapcount) - *total_map_swapcount = total_mapcount + total_swapcount; if (count == 1 && PageSwapCache(page) && (likely(!PageTransCompound(page)) || /* The remaining swap count will be freed soon */ From patchwork Fri Jan 14 22:06:48 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714096 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4CBCBC433F5 for ; Fri, 14 Jan 2022 22:06:51 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D24B56B010D; Fri, 14 Jan 2022 17:06:50 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id CABB56B010F; Fri, 14 Jan 2022 17:06:50 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B73A36B0110; Fri, 14 Jan 2022 17:06:50 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0051.hostedemail.com [216.40.44.51]) by kanga.kvack.org (Postfix) with ESMTP id A7C936B010D for ; Fri, 14 Jan 2022 17:06:50 -0500 (EST) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 5C7E3182698FF for ; Fri, 14 Jan 2022 22:06:50 +0000 (UTC) X-FDA: 79030278180.11.3DF7C3D Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf11.hostedemail.com (Postfix) with ESMTP id ED3EE40013 for ; Fri, 14 Jan 2022 22:06:49 +0000 (UTC) Received: 
from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 5D1C262011; Fri, 14 Jan 2022 22:06:49 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 88BE5C36AE5; Fri, 14 Jan 2022 22:06:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198008; bh=rTWyWbGFQtOk+sX6CUPreTJDiw6KILZcFlb2lH5J1Oc=; h=Date:From:To:Subject:In-Reply-To:From; b=LIpIT0mbOJSUPkHd9GiFdOWZFIqpccbjN4/jHE5bmy+Bel7KfVOmzXObssvu01BSq Poq00PCnZ/RU5EfMdAG3KBurDahlxG5E/UPa2tMW+Dsdc+8nLxQYhy5pY+MBFnqR5s RuGOjhcsZdjE3hLf63Q5d9W6bTWvd8odG/F0+Dr4= Date: Fri, 14 Jan 2022 14:06:48 -0800 From: Andrew Morton To: akpm@linux-foundation.org, david@redhat.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, william.kucharski@oracle.com, willy@infradead.org Subject: [patch 070/146] mm: remove the total_mapcount argument from page_trans_huge_map_swapcount() Message-ID: <20220114220648.YXzQLDqG5%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: ED3EE40013 X-Stat-Signature: sfkiuinuu1sggnm3w4kfajb5axqcn3jn Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=LIpIT0mb; spf=pass (imf11.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-Rspamd-Server: rspam07 X-HE-Tag: 1642198009-197400 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: "Matthew Wilcox (Oracle)" Subject: mm: remove the total_mapcount argument from page_trans_huge_map_swapcount() Now that we don't report it to the caller of 
reuse_swap_page(), we don't need to request it from page_trans_huge_map_swapcount(). Link: https://lkml.kernel.org/r/20211220205943.456187-2-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: William Kucharski Acked-by: Linus Torvalds Cc: David Hildenbrand Signed-off-by: Andrew Morton --- mm/swapfile.c | 32 ++++++++++++-------------------- 1 file changed, 12 insertions(+), 20 deletions(-) --- a/mm/swapfile.c~mm-remove-the-total_mapcount-argument-from-page_trans_huge_map_swapcount +++ a/mm/swapfile.c @@ -1601,31 +1601,30 @@ static bool page_swapped(struct page *pa return false; } -static int page_trans_huge_map_swapcount(struct page *page, int *total_mapcount, +static int page_trans_huge_map_swapcount(struct page *page, int *total_swapcount) { - int i, map_swapcount, _total_mapcount, _total_swapcount; + int i, map_swapcount, _total_swapcount; unsigned long offset = 0; struct swap_info_struct *si; struct swap_cluster_info *ci = NULL; unsigned char *map = NULL; - int mapcount, swapcount = 0; + int swapcount = 0; /* hugetlbfs shouldn't call it */ VM_BUG_ON_PAGE(PageHuge(page), page); if (!IS_ENABLED(CONFIG_THP_SWAP) || likely(!PageTransCompound(page))) { - mapcount = page_trans_huge_mapcount(page, total_mapcount); if (PageSwapCache(page)) swapcount = page_swapcount(page); if (total_swapcount) *total_swapcount = swapcount; - return mapcount + swapcount; + return swapcount + page_trans_huge_mapcount(page, NULL); } page = compound_head(page); - _total_mapcount = _total_swapcount = map_swapcount = 0; + _total_swapcount = map_swapcount = 0; if (PageSwapCache(page)) { swp_entry_t entry; @@ -1639,8 +1638,7 @@ static int page_trans_huge_map_swapcount if (map) ci = lock_cluster(si, offset); for (i = 0; i < HPAGE_PMD_NR; i++) { - mapcount = atomic_read(&page[i]._mapcount) + 1; - _total_mapcount += mapcount; + int mapcount = atomic_read(&page[i]._mapcount) + 1; if (map) { swapcount = swap_count(map[offset + i]); _total_swapcount += swapcount; @@ -1648,19 
+1646,14 @@ static int page_trans_huge_map_swapcount map_swapcount = max(map_swapcount, mapcount + swapcount); } unlock_cluster(ci); - if (PageDoubleMap(page)) { + + if (PageDoubleMap(page)) map_swapcount -= 1; - _total_mapcount -= HPAGE_PMD_NR; - } - mapcount = compound_mapcount(page); - map_swapcount += mapcount; - _total_mapcount += mapcount; - if (total_mapcount) - *total_mapcount = _total_mapcount; + if (total_swapcount) *total_swapcount = _total_swapcount; - return map_swapcount; + return map_swapcount + compound_mapcount(page); } /* @@ -1671,13 +1664,12 @@ static int page_trans_huge_map_swapcount */ bool reuse_swap_page(struct page *page) { - int count, total_mapcount, total_swapcount; + int count, total_swapcount; VM_BUG_ON_PAGE(!PageLocked(page), page); if (unlikely(PageKsm(page))) return false; - count = page_trans_huge_map_swapcount(page, &total_mapcount, - &total_swapcount); + count = page_trans_huge_map_swapcount(page, &total_swapcount); if (count == 1 && PageSwapCache(page) && (likely(!PageTransCompound(page)) || /* The remaining swap count will be freed soon */ From patchwork Fri Jan 14 22:06:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714097 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id AE136C433EF for ; Fri, 14 Jan 2022 22:06:55 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 449766B010F; Fri, 14 Jan 2022 17:06:55 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 3F93E6B0111; Fri, 14 Jan 2022 17:06:55 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2C0BA6B0112; Fri, 14 Jan 2022 17:06:55 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from 
forelay.hostedemail.com (smtprelay0188.hostedemail.com [216.40.44.188]) by kanga.kvack.org (Postfix) with ESMTP id 1C9F06B010F for ; Fri, 14 Jan 2022 17:06:55 -0500 (EST) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id D99EA82F76DF for ; Fri, 14 Jan 2022 22:06:54 +0000 (UTC) X-FDA: 79030278348.26.B9F323D Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf04.hostedemail.com (Postfix) with ESMTP id 6B8A440007 for ; Fri, 14 Jan 2022 22:06:54 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 133DAB825F5; Fri, 14 Jan 2022 22:06:53 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9C86FC36AE9; Fri, 14 Jan 2022 22:06:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198011; bh=VwBhdtaeRanacG0TzN0ZeUWC+8LPKCJXuMh5hJT7s+c=; h=Date:From:To:Subject:In-Reply-To:From; b=n7OGRrNUOtfTuZ6J4uwB08uOzSZNUN1AUqxvhd+zjk7G8mWPWva+qkNApSqz88GpO +RpHkH4NTmE0R1mtMByk/4BrIAv1dWr2vKx6nXis/8AQVy65NsaOC3pzPnMKmrQM0U 3WQQOzbwPifMtXK3lPQoFihELKR52cjWzOljsWQM= Date: Fri, 14 Jan 2022 14:06:51 -0800 From: Andrew Morton To: akpm@linux-foundation.org, david@redhat.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, william.kucharski@oracle.com, willy@infradead.org Subject: [patch 071/146] mm: remove the total_mapcount argument from page_trans_huge_mapcount() Message-ID: <20220114220651.8nMOH5xMP%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 6B8A440007 X-Stat-Signature: pshxefhdfssf5poxku3t1new8enxo4tj Authentication-Results: imf04.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg 
header.b=n7OGRrNU; dmarc=none; spf=pass (imf04.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-HE-Tag: 1642198014-404226 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: "Matthew Wilcox (Oracle)" Subject: mm: remove the total_mapcount argument from page_trans_huge_mapcount() All callers pass NULL, so we can stop calculating the value we would store in it. Link: https://lkml.kernel.org/r/20211220205943.456187-3-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: William Kucharski Acked-by: Linus Torvalds Cc: David Hildenbrand Signed-off-by: Andrew Morton --- include/linux/mm.h | 10 +++------- include/linux/swap.h | 2 +- mm/huge_memory.c | 30 ++++++++++-------------------- mm/swapfile.c | 2 +- 4 files changed, 15 insertions(+), 29 deletions(-) --- a/include/linux/mm.h~mm-remove-the-total_mapcount-argument-from-page_trans_huge_mapcount +++ a/include/linux/mm.h @@ -799,19 +799,15 @@ static inline int page_mapcount(struct p #ifdef CONFIG_TRANSPARENT_HUGEPAGE int total_mapcount(struct page *page); -int page_trans_huge_mapcount(struct page *page, int *total_mapcount); +int page_trans_huge_mapcount(struct page *page); #else static inline int total_mapcount(struct page *page) { return page_mapcount(page); } -static inline int page_trans_huge_mapcount(struct page *page, - int *total_mapcount) +static inline int page_trans_huge_mapcount(struct page *page) { - int mapcount = page_mapcount(page); - if (total_mapcount) - *total_mapcount = mapcount; - return mapcount; + return page_mapcount(page); } #endif --- a/include/linux/swap.h~mm-remove-the-total_mapcount-argument-from-page_trans_huge_mapcount +++ a/include/linux/swap.h @@ -681,7 +681,7 @@ static inline int swp_swapcount(swp_entr } #define reuse_swap_page(page) \ - 
(page_trans_huge_mapcount(page, NULL) == 1) + (page_trans_huge_mapcount(page) == 1) static inline int try_to_free_swap(struct page *page) { --- a/mm/huge_memory.c~mm-remove-the-total_mapcount-argument-from-page_trans_huge_mapcount +++ a/mm/huge_memory.c @@ -2542,38 +2542,28 @@ int total_mapcount(struct page *page) * need full accuracy to avoid breaking page pinning, because * page_trans_huge_mapcount() is slower than page_mapcount(). */ -int page_trans_huge_mapcount(struct page *page, int *total_mapcount) +int page_trans_huge_mapcount(struct page *page) { - int i, ret, _total_mapcount, mapcount; + int i, ret; /* hugetlbfs shouldn't call it */ VM_BUG_ON_PAGE(PageHuge(page), page); - if (likely(!PageTransCompound(page))) { - mapcount = atomic_read(&page->_mapcount) + 1; - if (total_mapcount) - *total_mapcount = mapcount; - return mapcount; - } + if (likely(!PageTransCompound(page))) + return atomic_read(&page->_mapcount) + 1; page = compound_head(page); - _total_mapcount = ret = 0; + ret = 0; for (i = 0; i < thp_nr_pages(page); i++) { - mapcount = atomic_read(&page[i]._mapcount) + 1; + int mapcount = atomic_read(&page[i]._mapcount) + 1; ret = max(ret, mapcount); - _total_mapcount += mapcount; } - if (PageDoubleMap(page)) { + + if (PageDoubleMap(page)) ret -= 1; - _total_mapcount -= thp_nr_pages(page); - } - mapcount = compound_mapcount(page); - ret += mapcount; - _total_mapcount += mapcount; - if (total_mapcount) - *total_mapcount = _total_mapcount; - return ret; + + return ret + compound_mapcount(page); } /* Racy check whether the huge page can be split */ --- a/mm/swapfile.c~mm-remove-the-total_mapcount-argument-from-page_trans_huge_mapcount +++ a/mm/swapfile.c @@ -1619,7 +1619,7 @@ static int page_trans_huge_map_swapcount swapcount = page_swapcount(page); if (total_swapcount) *total_swapcount = swapcount; - return swapcount + page_trans_huge_mapcount(page, NULL); + return swapcount + page_trans_huge_mapcount(page); } page = compound_head(page); From patchwork Fri 
Jan 14 22:06:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714098 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id CAEF7C433F5 for ; Fri, 14 Jan 2022 22:06:58 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 61EA86B0111; Fri, 14 Jan 2022 17:06:58 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 5CD4D6B0113; Fri, 14 Jan 2022 17:06:58 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4BD466B0114; Fri, 14 Jan 2022 17:06:58 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0213.hostedemail.com [216.40.44.213]) by kanga.kvack.org (Postfix) with ESMTP id 36CD36B0111 for ; Fri, 14 Jan 2022 17:06:58 -0500 (EST) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id DA9781826B6C1 for ; Fri, 14 Jan 2022 22:06:57 +0000 (UTC) X-FDA: 79030278474.21.6C138FB Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf12.hostedemail.com (Postfix) with ESMTP id 4D4EC40011 for ; Fri, 14 Jan 2022 22:06:57 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 4B29AB8260F; Fri, 14 Jan 2022 22:06:56 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C13AEC36AE5; Fri, 14 Jan 2022 22:06:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198015; bh=/oovxXJXTKtKuvnlkNmYq68O5ONmf4m79qsZGWgSX5k=; 
h=Date:From:To:Subject:In-Reply-To:From; b=jWNdtpuhK133rjj30s3o+yy1R+uG84R7rynEpeaf0XXpCTeXdFZK2c/zyB23lBS7M RGjN70c670O/kZIIMvZPuwPH5Qiy0Rx+N984cik0rpbrbu/TjNpUVgHn37GPQzZ8tq Hcv8FHYpceicrng+USlsM3iCyojcZUbxJI5hgwyI= Date: Fri, 14 Jan 2022 14:06:54 -0800 From: Andrew Morton To: ak@suse.de, akpm@linux-foundation.org, christian.koenig@amd.com, clameter@sgi.com, greg@kroah.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, rientjes@google.com, torvalds@linux-foundation.org, yinghai.lu@sun.com Subject: [patch 072/146] mm/dmapool.c: revert "make dma pool to use kmalloc_node" Message-ID: <20220114220654.Mqg0T2wlx%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 MIME-Version: 1.0 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 4D4EC40011 X-Stat-Signature: kp9aqbh3dwbo3naina49hkx1tge316ar Authentication-Results: imf12.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=jWNdtpuh; spf=pass (imf12.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198017-264895 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Christian König Subject: mm/dmapool.c: revert "make dma pool to use kmalloc_node" This reverts commit 2618c60b8b5836 ("dma: make dma pool to use kmalloc_node"). While working myself into the dmapool code I've found this little odd kmalloc_node(). What basically happens here is that we allocate the housekeeping structure on the numa node where the device is attached to. Since the device is never doing DMA to or from that memory this doesn't seem to make sense at all. So while this doesn't seem to cause much harm it's probably cleaner to revert the change for consistency. 
Link: https://lkml.kernel.org/r/20211221110724.97664-1-christian.koenig@amd.com Signed-off-by: Christian König Cc: Yinghai Lu Cc: Andi Kleen Cc: Christoph Lameter Cc: David Rientjes Cc: Greg KH Signed-off-by: Andrew Morton --- mm/dmapool.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/dmapool.c~dma-revert-make-dma-pool-to-use-kmalloc_node +++ a/mm/dmapool.c @@ -152,7 +152,7 @@ struct dma_pool *dma_pool_create(const c else if ((boundary < size) || (boundary & (boundary - 1))) return NULL; - retval = kmalloc_node(sizeof(*retval), GFP_KERNEL, dev_to_node(dev)); + retval = kmalloc(sizeof(*retval), GFP_KERNEL); if (!retval) return retval; From patchwork Fri Jan 14 22:06:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714099 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 31B96C433FE for ; Fri, 14 Jan 2022 22:07:01 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B4ED96B0113; Fri, 14 Jan 2022 17:07:00 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id AFD746B0115; Fri, 14 Jan 2022 17:07:00 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 99F1E6B0116; Fri, 14 Jan 2022 17:07:00 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0218.hostedemail.com [216.40.44.218]) by kanga.kvack.org (Postfix) with ESMTP id 841B16B0113 for ; Fri, 14 Jan 2022 17:07:00 -0500 (EST) Received: from smtpin28.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 54C1E9675D for ; Fri, 14 Jan 2022 22:07:00 +0000 (UTC) X-FDA: 79030278600.28.8DC329D Received: from dfw.source.kernel.org (dfw.source.kernel.org 
[139.178.84.217]) by imf25.hostedemail.com (Postfix) with ESMTP id EE007A0005 for ; Fri, 14 Jan 2022 22:06:59 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 2E16162014; Fri, 14 Jan 2022 22:06:59 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 19260C36AE9; Fri, 14 Jan 2022 22:06:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198018; bh=zq84wMjGP9tDsxm6FLajjoesNvn9MC8To9xmnKFingw=; h=Date:From:To:Subject:In-Reply-To:From; b=qAQspy3+QTLFRWAwl9eT74Z8pxosLOfdTK9QKSxUC9r6Ulhq/CEtRjRX5/j5zBDZR 9RN3KmtoPOt/dfh4O9iRd0OTJX0Dq+zs7SXWXWGMvlluLbhSLjTier1CTqUCOwxRVY XmLk9A1pqE8iFVDjjX4ULgxQl9pPXF7/oEfvHB4M= Date: Fri, 14 Jan 2022 14:06:57 -0800 From: Andrew Morton To: akpm@linux-foundation.org, bigeasy@linutronix.de, dchinner@redhat.com, hch@lst.de, idryomov@gmail.com, jlayton@kernel.org, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, neilb@suse.de, sfr@canb.auug.org.au, torvalds@linux-foundation.org, urezki@gmail.com, vbabka@suse.cz Subject: [patch 073/146] mm/vmalloc: alloc GFP_NO{FS,IO} for vmalloc Message-ID: <20220114220657.pv_rxF6F8%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: EE007A0005 X-Stat-Signature: uwobad3o97f8preccap9a3g8ptx8iemh Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=qAQspy3+; spf=pass (imf25.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198019-728499 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk 
X-Loop: owner-majordomo@kvack.org List-ID: From: Michal Hocko Subject: mm/vmalloc: alloc GFP_NO{FS,IO} for vmalloc Patch series "extend vmalloc support for constrained allocations", v2. Based on a recent discussion with Dave and Neil [1], I have tried to implement NOFS, NOIO and NOFAIL support for vmalloc to make life easier for kvmalloc users. The requirement for NOFAIL support in kvmalloc was new to me, but it seems to be really needed by the xfs code. NOFS/NOIO was a known, long-term problem which was hoped to be handled by the scope API. Those scopes should have been used at the reclaim recursion boundaries, both to document them and to remove the need for NOFS/NOIO constraints on all allocations within that scope. Instead, workarounds were developed to wrap a single allocation (like ceph_kvmalloc). The first patch implements NOFS/NOIO support for vmalloc. The second one adds NOFAIL support, and the third one bundles it all together into kvmalloc and drops ceph_kvmalloc, which can now use kvmalloc directly. [1] http://lkml.kernel.org/r/163184741778.29351.16920832234899124642.stgit@noble.brown This patch (of 4): vmalloc historically hasn't supported GFP_NO{FS,IO} requests because page table allocations do not support an externally provided gfp mask and perform GFP_KERNEL-like allocations. For a few years we have had the scope (memalloc_no{fs,io}_{save,restore}) APIs to enforce NOFS and NOIO constraints implicitly on all allocators within the scope. There was a hope that those scopes would be defined at a higher level, where the reclaim recursion boundary starts/stops (e.g. when a lock needed during memory reclaim is taken). It seems that not all NOFS/NOIO users have adopted this approach; instead they have taken a workaround approach of wrapping a single [k]vmalloc allocation in a scope API.
These workarounds serve neither the purpose of better documenting reclaim recursion nor that of reducing explicit GFP_NO{FS,IO} usage, so let's just provide the semantic they are asking for without the need for workarounds. Add support for GFP_NOFS and GFP_NOIO to vmalloc directly. All internal allocations already comply with the given gfp_mask. The only current exception is vmap_pages_range, which maps kernel page tables. Infer the proper scope API based on the given gfp mask. [sfr@canb.auug.org.au: mm/vmalloc.c needs linux/sched/mm.h] Link: https://lkml.kernel.org/r/20211217232641.0148710c@canb.auug.org.au Link: https://lkml.kernel.org/r/20211122153233.9924-1-mhocko@kernel.org Link: https://lkml.kernel.org/r/20211122153233.9924-2-mhocko@kernel.org Signed-off-by: Michal Hocko Signed-off-by: Stephen Rothwell Reviewed-by: Uladzislau Rezki (Sony) Acked-by: Vlastimil Babka Cc: Neil Brown Cc: Christoph Hellwig Cc: Ilya Dryomov Cc: Jeff Layton Cc: Dave Chinner Cc: Sebastian Andrzej Siewior Signed-off-by: Andrew Morton --- mm/vmalloc.c | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-) --- a/mm/vmalloc.c~mm-vmalloc-alloc-gfp_nofsio-for-vmalloc +++ a/mm/vmalloc.c @@ -39,6 +39,7 @@ #include #include #include +#include <linux/sched/mm.h> #include #include @@ -2928,6 +2929,8 @@ static void *__vmalloc_area_node(struct unsigned long array_size; unsigned int nr_small_pages = size >> PAGE_SHIFT; unsigned int page_order; + unsigned int flags; + int ret; array_size = (unsigned long)nr_small_pages * sizeof(struct page *); gfp_mask |= __GFP_NOWARN; @@ -2976,8 +2979,24 @@ static void *__vmalloc_area_node(struct goto fail; } - if (vmap_pages_range(addr, addr + size, prot, area->pages, - page_shift) < 0) { + /* + * page tables allocations ignore external gfp mask, enforce it + * by the scope API + */ + if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO) + flags = memalloc_nofs_save(); + else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0) + flags = memalloc_noio_save(); + +
ret = vmap_pages_range(addr, addr + size, prot, area->pages, + page_shift); + + if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO) + memalloc_nofs_restore(flags); + else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0) + memalloc_noio_restore(flags); + + if (ret < 0) { warn_alloc(orig_gfp_mask, NULL, "vmalloc error: size %lu, failed to map pages", area->nr_pages * PAGE_SIZE); From patchwork Fri Jan 14 22:07:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714100 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 98DC2C433FE for ; Fri, 14 Jan 2022 22:07:05 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 311B06B0115; Fri, 14 Jan 2022 17:07:05 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 2CE516B0117; Fri, 14 Jan 2022 17:07:05 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 188606B0118; Fri, 14 Jan 2022 17:07:05 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0094.hostedemail.com [216.40.44.94]) by kanga.kvack.org (Postfix) with ESMTP id 095AA6B0115 for ; Fri, 14 Jan 2022 17:07:05 -0500 (EST) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id BD16D9676F for ; Fri, 14 Jan 2022 22:07:04 +0000 (UTC) X-FDA: 79030278768.18.33747B7 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf07.hostedemail.com (Postfix) with ESMTP id 34AEE4000D for ; Fri, 14 Jan 2022 22:07:04 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate 
requested) by ams.source.kernel.org (Postfix) with ESMTPS id 35460B8262F; Fri, 14 Jan 2022 22:07:03 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 80F71C36AE5; Fri, 14 Jan 2022 22:07:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198022; bh=qCFAOoksNK3LHGlC2lmG0KH430b0yql12Aejhm+Z38o=; h=Date:From:To:Subject:In-Reply-To:From; b=UlVjEyJXdmIIdQZYOFIk+fk8LiDjC35tDFXy3sZsUEpjT+eu1jKdf9OqglR8ZZA/M WQa/wTI6m89QbyUvclVJ5fCnc4QMqSJO7eA2gGDYzIw1hE7bJG8pj7q9bqPd32SgPM wrsTOtw81z2pMFIS7krTstJ/WftF8ckR5EwotuYE= Date: Fri, 14 Jan 2022 14:07:01 -0800 From: Andrew Morton To: akpm@linux-foundation.org, bigeasy@linutronix.de, david@fromorbit.com, hch@lst.de, idryomov@gmail.com, jlayton@kernel.org, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, neilb@suse.de, torvalds@linux-foundation.org, urezki@gmail.com, vbabka@suse.cz Subject: [patch 074/146] mm/vmalloc: add support for __GFP_NOFAIL Message-ID: <20220114220701.zgLAA-x6k%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 34AEE4000D X-Stat-Signature: h4gt7ktcm6gmmm347iarwu65hbd3ke8b Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=UlVjEyJX; dmarc=none; spf=pass (imf07.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam03 X-HE-Tag: 1642198024-940270 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Michal Hocko Subject: mm/vmalloc: add support for __GFP_NOFAIL Dave Chinner has mentioned that some of the xfs code would benefit from kvmalloc support for __GFP_NOFAIL because they have allocations that cannot fail and they do not fit into a single page. 
Most of the vmalloc implementation already complies with the given gfp flags, so no work is needed there. The area and page table allocations are the exception. Implement a retry loop for those. Add a short sleep before retrying. The 1 jiffy timeout is completely arbitrary. Ideally the retry would wait for an explicit event, e.g. a change to the vmalloc space if the failure was caused by space fragmentation or depletion. But there are multiple different reasons to retry and this could become much more complex. Keep the retry simple for now and just sleep to avoid hogging CPUs. Link: https://lkml.kernel.org/r/20211122153233.9924-3-mhocko@kernel.org Signed-off-by: Michal Hocko Acked-by: Vlastimil Babka Cc: Christoph Hellwig Cc: Dave Chinner Cc: Ilya Dryomov Cc: Jeff Layton Cc: Neil Brown Cc: Sebastian Andrzej Siewior Cc: Uladzislau Rezki (Sony) Signed-off-by: Andrew Morton --- mm/vmalloc.c | 22 +++++++++++++++++----- 1 file changed, 17 insertions(+), 5 deletions(-) --- a/mm/vmalloc.c~mm-vmalloc-add-support-for-__gfp_nofail +++ a/mm/vmalloc.c @@ -2847,6 +2847,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid, * more permissive. */ if (!order) { + gfp_t bulk_gfp = gfp & ~__GFP_NOFAIL; + while (nr_allocated < nr_pages) { unsigned int nr, nr_pages_request; @@ -2864,12 +2866,12 @@ vm_area_alloc_pages(gfp_t gfp, int nid, * but mempolcy want to alloc memory by interleaving.
*/ if (IS_ENABLED(CONFIG_NUMA) && nid == NUMA_NO_NODE) - nr = alloc_pages_bulk_array_mempolicy(gfp, + nr = alloc_pages_bulk_array_mempolicy(bulk_gfp, nr_pages_request, pages + nr_allocated); else - nr = alloc_pages_bulk_array_node(gfp, nid, + nr = alloc_pages_bulk_array_node(bulk_gfp, nid, nr_pages_request, pages + nr_allocated); @@ -2924,6 +2926,7 @@ static void *__vmalloc_area_node(struct { const gfp_t nested_gfp = (gfp_mask & GFP_RECLAIM_MASK) | __GFP_ZERO; const gfp_t orig_gfp_mask = gfp_mask; + bool nofail = gfp_mask & __GFP_NOFAIL; unsigned long addr = (unsigned long)area->addr; unsigned long size = get_vm_area_size(area); unsigned long array_size; @@ -2988,8 +2991,12 @@ static void *__vmalloc_area_node(struct else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0) flags = memalloc_noio_save(); - ret = vmap_pages_range(addr, addr + size, prot, area->pages, + do { + ret = vmap_pages_range(addr, addr + size, prot, area->pages, page_shift); + if (nofail && (ret < 0)) + schedule_timeout_uninterruptible(1); + } while (nofail && (ret < 0)); if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO) memalloc_nofs_restore(flags); @@ -3084,9 +3091,14 @@ again: VM_UNINITIALIZED | vm_flags, start, end, node, gfp_mask, caller); if (!area) { + bool nofail = gfp_mask & __GFP_NOFAIL; warn_alloc(gfp_mask, NULL, - "vmalloc error: size %lu, vm_struct allocation failed", - real_size); + "vmalloc error: size %lu, vm_struct allocation failed%s", + real_size, (nofail) ? ". Retrying." 
: ""); + if (nofail) { + schedule_timeout_uninterruptible(1); + goto again; + } goto fail; } From patchwork Fri Jan 14 22:07:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714101 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C080EC433FE for ; Fri, 14 Jan 2022 22:07:08 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 55B5D6B0117; Fri, 14 Jan 2022 17:07:08 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 50A6D6B0119; Fri, 14 Jan 2022 17:07:08 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3FA2B6B011A; Fri, 14 Jan 2022 17:07:08 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0076.hostedemail.com [216.40.44.76]) by kanga.kvack.org (Postfix) with ESMTP id 2ED376B0117 for ; Fri, 14 Jan 2022 17:07:08 -0500 (EST) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id F36489676F for ; Fri, 14 Jan 2022 22:07:07 +0000 (UTC) X-FDA: 79030278894.26.EC2B48D Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf21.hostedemail.com (Postfix) with ESMTP id 8AA341C0003 for ; Fri, 14 Jan 2022 22:07:07 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 903D3B8262F; Fri, 14 Jan 2022 22:07:06 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D4D9BC36AE9; Fri, 14 Jan 2022 22:07:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; 
d=linux-foundation.org; s=korg; t=1642198025; bh=GgRLpPLhwdQFwmc0O1ArEwBnFOWwxwYDCaQv3EWkZcU=; h=Date:From:To:Subject:In-Reply-To:From; b=hxdbYPT7iJBTOOK2cELTdailGh0VL3dArnngKlH3SSVKJBdJLDEMNyD9C6THXyvhb bptklND5kBmucDKdxHkl5XBfYnNtC4uBmkjts/b57nA3+CZBluJcEiuWMTHHi+vSLq HOs14b6buWR6BJqDY20KWgAdqi8VepW3SlW52FOE= Date: Fri, 14 Jan 2022 14:07:04 -0800 From: Andrew Morton To: akpm@linux-foundation.org, bigeasy@linutronix.de, david@fromorbit.com, hch@lst.de, idryomov@gmail.com, jlayton@kernel.org, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, neilb@suse.de, torvalds@linux-foundation.org, urezki@gmail.com, vbabka@suse.cz Subject: [patch 075/146] mm/vmalloc: be more explicit about supported gfp flags. Message-ID: <20220114220704.yBdrgAyd4%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 8AA341C0003 X-Stat-Signature: 874prq3uguwpz3e35npj1uz914saqpoy Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=hxdbYPT7; dmarc=none; spf=pass (imf21.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-HE-Tag: 1642198027-151898 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:
From: Michal Hocko
Subject: mm/vmalloc: be more explicit about supported gfp flags.

b7d90e7a5ea8 ("mm/vmalloc: be more explicit about supported gfp flags") was merged prematurely, without the rest of the series and without addressing review feedback from Neil. Fix that up now. Only the wording is changed slightly.
Link: https://lkml.kernel.org/r/20211122153233.9924-4-mhocko@kernel.org Signed-off-by: Michal Hocko Reviewed-by: Uladzislau Rezki (Sony) Acked-by: Vlastimil Babka Cc: Christoph Hellwig Cc: Dave Chinner Cc: Ilya Dryomov Cc: Jeff Layton Cc: Neil Brown Cc: Sebastian Andrzej Siewior Signed-off-by: Andrew Morton --- mm/vmalloc.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) --- a/mm/vmalloc.c~mm-vmalloc-be-more-explicit-about-supported-gfp-flags +++ a/mm/vmalloc.c @@ -3031,12 +3031,14 @@ fail: * * Allocate enough pages to cover @size from the page level * allocator with @gfp_mask flags. Please note that the full set of gfp - * flags are not supported. GFP_KERNEL would be a preferred allocation mode - * but GFP_NOFS and GFP_NOIO are supported as well. Zone modifiers are not - * supported. From the reclaim modifiers__GFP_DIRECT_RECLAIM is required (aka - * GFP_NOWAIT is not supported) and only __GFP_NOFAIL is supported (aka - * __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported). - * __GFP_NOWARN can be used to suppress error messages about failures. + * flags are not supported. GFP_KERNEL, GFP_NOFS and GFP_NOIO are all + * supported. + * Zone modifiers are not supported. From the reclaim modifiers + * __GFP_DIRECT_RECLAIM is required (aka GFP_NOWAIT is not supported) + * and only __GFP_NOFAIL is supported (i.e. __GFP_NORETRY and + * __GFP_RETRY_MAYFAIL are not supported). + * + * __GFP_NOWARN can be used to suppress failures messages. * * Map them into contiguous kernel virtual space, using a pagetable * protection of @prot. 
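The rules stated in the updated comment can be read as a simple predicate over the gfp mask. The following is a minimal user-space sketch of that reading, not kernel code: the kernel has no such helper as far as this series shows, the function name demo_vmalloc_gfp_documented() is hypothetical, and the DEMO_* bit values are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define DEMO_GFP_DIRECT_RECLAIM 0x01u	/* models __GFP_DIRECT_RECLAIM */
#define DEMO_GFP_NORETRY        0x02u	/* models __GFP_NORETRY */
#define DEMO_GFP_RETRY_MAYFAIL  0x04u	/* models __GFP_RETRY_MAYFAIL */
#define DEMO_GFP_NOFAIL         0x08u	/* models __GFP_NOFAIL */
#define DEMO_GFP_ZONE_MODIFIER  0x10u	/* stands in for any zone modifier */

/*
 * Hypothetical predicate: does the mask stay within what the comment
 * documents as supported?
 */
static inline bool demo_vmalloc_gfp_documented(unsigned int gfp)
{
	if (!(gfp & DEMO_GFP_DIRECT_RECLAIM))
		return false;	/* GFP_NOWAIT-style masks are not supported */
	if (gfp & (DEMO_GFP_NORETRY | DEMO_GFP_RETRY_MAYFAIL))
		return false;	/* only __GFP_NOFAIL is supported here */
	if (gfp & DEMO_GFP_ZONE_MODIFIER)
		return false;	/* zone modifiers are not supported */
	return true;
}
```

Under these assumptions a mask with direct reclaim plus DEMO_GFP_NOFAIL passes, while a mask without direct reclaim, or one carrying DEMO_GFP_NORETRY, does not.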
From patchwork Fri Jan 14 22:07:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714102 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7CEDCC433FE for ; Fri, 14 Jan 2022 22:07:12 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D24126B0119; Fri, 14 Jan 2022 17:07:10 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id CD2926B011B; Fri, 14 Jan 2022 17:07:10 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B4D266B011C; Fri, 14 Jan 2022 17:07:10 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (relay038.a.hostedemail.com [64.99.140.38]) by kanga.kvack.org (Postfix) with ESMTP id A4E746B0119 for ; Fri, 14 Jan 2022 17:07:10 -0500 (EST) Received: from smtpin02.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 72C3B245CD for ; Fri, 14 Jan 2022 22:07:10 +0000 (UTC) X-FDA: 79030279020.02.678A9B5 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf24.hostedemail.com (Postfix) with ESMTP id EAD2218000D for ; Fri, 14 Jan 2022 22:07:09 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 3E5F462011; Fri, 14 Jan 2022 22:07:09 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3597FC36AE5; Fri, 14 Jan 2022 22:07:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198028; bh=0XSX5HzE+MIUOn3OaABCAvi1Gn2dMYH9+oMOYrdLJv0=; 
h=Date:From:To:Subject:In-Reply-To:From; b=lH6672N9ky1pUP1pyWqUisMmQNN37oFUSnAxmAL86v0U9kWYZtG1shKf5cW4mzF4v +iGJ+HdQpdhHELvprcjYyW+vn2NKNoujvr8bVDQPGYfcr7NZYnAoaqBwAtTEik45yj YTHvkLkRRTyaMZeDqYoTitabcJ699NN8JpWAfhNY= Date: Fri, 14 Jan 2022 14:07:07 -0800 From: Andrew Morton To: akpm@linux-foundation.org, bigeasy@linutronix.de, david@fromorbit.com, hch@lst.de, idryomov@gmail.com, jlayton@kernel.org, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, neilb@suse.de, torvalds@linux-foundation.org, urezki@gmail.com, vbabka@suse.cz Subject: [patch 076/146] mm: allow !GFP_KERNEL allocations for kvmalloc Message-ID: <20220114220707.SD7dYB5br%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: EAD2218000D X-Stat-Signature: ykdafqa7hib1ow31dcdkeo8rckfw7ujn Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=lH6672N9; spf=pass (imf24.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198029-253911 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:
From: Michal Hocko
Subject: mm: allow !GFP_KERNEL allocations for kvmalloc

Support for GFP_NO{FS,IO} and __GFP_NOFAIL has been implemented in vmalloc by the previous patches, so kvmalloc can now support them as well. This will allow some external users to simplify or completely remove their helpers.

GFP_NOWAIT semantics have not been supported so far, but that was never explicitly documented, so add a note about it.

ceph_kvmalloc is the first helper to be dropped and replaced with kvmalloc.
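To illustrate how kvmalloc can accept __GFP_NOFAIL, here is a minimal user-space sketch of the flag adjustment that the mm/util.c hunk of this patch applies before the first kmalloc attempt. It is an illustration under stated assumptions, not the kernel implementation: the DEMO_* bit values and the function name demo_first_attempt_flags() are hypothetical, and the sketch models only the statements visible in the hunk, not the condition that surrounds them in kvmalloc_node().

```c
#include <assert.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define DEMO_GFP_NORETRY       0x1u	/* models __GFP_NORETRY */
#define DEMO_GFP_RETRY_MAYFAIL 0x2u	/* models __GFP_RETRY_MAYFAIL */
#define DEMO_GFP_NOFAIL        0x4u	/* models __GFP_NOFAIL */

/* Derives the flags used for the first, physically contiguous attempt. */
static inline unsigned int demo_first_attempt_flags(unsigned int flags)
{
	unsigned int kmalloc_flags = flags;

	/* Fail fast unless the caller asked for a harder try. */
	if (!(kmalloc_flags & DEMO_GFP_RETRY_MAYFAIL))
		kmalloc_flags |= DEMO_GFP_NORETRY;

	/* The nofail guarantee is left to the vmalloc fallback. */
	kmalloc_flags &= ~DEMO_GFP_NOFAIL;

	return kmalloc_flags;
}
```

The point of the design is that the first attempt may fail cheaply even for a nofail caller, because the fallback path is the one that is expected to keep retrying.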
Link: https://lkml.kernel.org/r/20211122153233.9924-5-mhocko@kernel.org Signed-off-by: Michal Hocko Reviewed-by: Uladzislau Rezki (Sony) Acked-by: Vlastimil Babka Cc: Christoph Hellwig Cc: Dave Chinner Cc: Ilya Dryomov Cc: Jeff Layton Cc: Neil Brown Cc: Sebastian Andrzej Siewior Signed-off-by: Andrew Morton --- include/linux/ceph/libceph.h | 1 - mm/util.c | 15 ++++----------- net/ceph/buffer.c | 4 ++-- net/ceph/ceph_common.c | 27 --------------------------- net/ceph/crypto.c | 2 +- net/ceph/messenger.c | 2 +- net/ceph/messenger_v2.c | 2 +- net/ceph/osdmap.c | 12 ++++++------ 8 files changed, 15 insertions(+), 50 deletions(-) --- a/include/linux/ceph/libceph.h~mm-allow-gfp_kernel-allocations-for-kvmalloc +++ a/include/linux/ceph/libceph.h @@ -295,7 +295,6 @@ extern bool libceph_compatible(void *dat extern const char *ceph_msg_type_name(int type); extern int ceph_check_fsid(struct ceph_client *client, struct ceph_fsid *fsid); -extern void *ceph_kvmalloc(size_t size, gfp_t flags); struct fs_parameter; struct fc_log; --- a/mm/util.c~mm-allow-gfp_kernel-allocations-for-kvmalloc +++ a/mm/util.c @@ -549,13 +549,10 @@ EXPORT_SYMBOL(vm_mmap); * Uses kmalloc to get the memory but if the allocation fails then falls back * to the vmalloc allocator. Use kvfree for freeing the memory. * - * Reclaim modifiers - __GFP_NORETRY and __GFP_NOFAIL are not supported. + * GFP_NOWAIT and GFP_ATOMIC are not supported, neither is the __GFP_NORETRY modifier. * __GFP_RETRY_MAYFAIL is supported, and it should be used only if kmalloc is * preferable to the vmalloc fallback, due to visible performance drawbacks. * - * Please note that any use of gfp flags outside of GFP_KERNEL is careful to not - * fall back to vmalloc. 
- * * Return: pointer to the allocated memory of %NULL in case of failure */ void *kvmalloc_node(size_t size, gfp_t flags, int node) @@ -564,13 +561,6 @@ void *kvmalloc_node(size_t size, gfp_t f void *ret; /* - * vmalloc uses GFP_KERNEL for some internal allocations (e.g page tables) - * so the given set of flags has to be compatible. - */ - if ((flags & GFP_KERNEL) != GFP_KERNEL) - return kmalloc_node(size, flags, node); - - /* * We want to attempt a large physically contiguous block first because * it is less likely to fragment multiple larger blocks and therefore * contribute to a long term fragmentation less than vmalloc fallback. @@ -582,6 +572,9 @@ void *kvmalloc_node(size_t size, gfp_t f if (!(kmalloc_flags & __GFP_RETRY_MAYFAIL)) kmalloc_flags |= __GFP_NORETRY; + + /* nofail semantic is implemented by the vmalloc fallback */ + kmalloc_flags &= ~__GFP_NOFAIL; } ret = kmalloc_node(size, kmalloc_flags, node); --- a/net/ceph/buffer.c~mm-allow-gfp_kernel-allocations-for-kvmalloc +++ a/net/ceph/buffer.c @@ -7,7 +7,7 @@ #include #include -#include /* for ceph_kvmalloc */ +#include /* for kvmalloc */ struct ceph_buffer *ceph_buffer_new(size_t len, gfp_t gfp) { @@ -17,7 +17,7 @@ struct ceph_buffer *ceph_buffer_new(size if (!b) return NULL; - b->vec.iov_base = ceph_kvmalloc(len, gfp); + b->vec.iov_base = kvmalloc(len, gfp); if (!b->vec.iov_base) { kfree(b); return NULL; --- a/net/ceph/ceph_common.c~mm-allow-gfp_kernel-allocations-for-kvmalloc +++ a/net/ceph/ceph_common.c @@ -190,33 +190,6 @@ int ceph_compare_options(struct ceph_opt } EXPORT_SYMBOL(ceph_compare_options); -/* - * kvmalloc() doesn't fall back to the vmalloc allocator unless flags are - * compatible with (a superset of) GFP_KERNEL. This is because while the - * actual pages are allocated with the specified flags, the page table pages - * are always allocated with GFP_KERNEL. - * - * ceph_kvmalloc() may be called with GFP_KERNEL, GFP_NOFS or GFP_NOIO. 
- */ -void *ceph_kvmalloc(size_t size, gfp_t flags) -{ - void *p; - - if ((flags & (__GFP_IO | __GFP_FS)) == (__GFP_IO | __GFP_FS)) { - p = kvmalloc(size, flags); - } else if ((flags & (__GFP_IO | __GFP_FS)) == __GFP_IO) { - unsigned int nofs_flag = memalloc_nofs_save(); - p = kvmalloc(size, GFP_KERNEL); - memalloc_nofs_restore(nofs_flag); - } else { - unsigned int noio_flag = memalloc_noio_save(); - p = kvmalloc(size, GFP_KERNEL); - memalloc_noio_restore(noio_flag); - } - - return p; -} - static int parse_fsid(const char *str, struct ceph_fsid *fsid) { int i = 0; --- a/net/ceph/crypto.c~mm-allow-gfp_kernel-allocations-for-kvmalloc +++ a/net/ceph/crypto.c @@ -147,7 +147,7 @@ void ceph_crypto_key_destroy(struct ceph static const u8 *aes_iv = (u8 *)CEPH_AES_IV; /* - * Should be used for buffers allocated with ceph_kvmalloc(). + * Should be used for buffers allocated with kvmalloc(). * Currently these are encrypt out-buffer (ceph_buffer) and decrypt * in-buffer (msg front). * --- a/net/ceph/messenger.c~mm-allow-gfp_kernel-allocations-for-kvmalloc +++ a/net/ceph/messenger.c @@ -1920,7 +1920,7 @@ struct ceph_msg *ceph_msg_new2(int type, /* front */ if (front_len) { - m->front.iov_base = ceph_kvmalloc(front_len, flags); + m->front.iov_base = kvmalloc(front_len, flags); if (m->front.iov_base == NULL) { dout("ceph_msg_new can't allocate %d bytes\n", front_len); --- a/net/ceph/messenger_v2.c~mm-allow-gfp_kernel-allocations-for-kvmalloc +++ a/net/ceph/messenger_v2.c @@ -308,7 +308,7 @@ static void *alloc_conn_buf(struct ceph_ if (WARN_ON(con->v2.conn_buf_cnt >= ARRAY_SIZE(con->v2.conn_bufs))) return NULL; - buf = ceph_kvmalloc(len, GFP_NOIO); + buf = kvmalloc(len, GFP_NOIO); if (!buf) return NULL; --- a/net/ceph/osdmap.c~mm-allow-gfp_kernel-allocations-for-kvmalloc +++ a/net/ceph/osdmap.c @@ -980,7 +980,7 @@ static struct crush_work *alloc_workspac work_size = crush_work_size(c, CEPH_PG_MAX_SIZE); dout("%s work_size %zu bytes\n", __func__, work_size); - work = 
ceph_kvmalloc(work_size, GFP_NOIO); + work = kvmalloc(work_size, GFP_NOIO); if (!work) return NULL; @@ -1190,9 +1190,9 @@ static int osdmap_set_max_osd(struct cep if (max == map->max_osd) return 0; - state = ceph_kvmalloc(array_size(max, sizeof(*state)), GFP_NOFS); - weight = ceph_kvmalloc(array_size(max, sizeof(*weight)), GFP_NOFS); - addr = ceph_kvmalloc(array_size(max, sizeof(*addr)), GFP_NOFS); + state = kvmalloc(array_size(max, sizeof(*state)), GFP_NOFS); + weight = kvmalloc(array_size(max, sizeof(*weight)), GFP_NOFS); + addr = kvmalloc(array_size(max, sizeof(*addr)), GFP_NOFS); if (!state || !weight || !addr) { kvfree(state); kvfree(weight); @@ -1222,7 +1222,7 @@ static int osdmap_set_max_osd(struct cep if (map->osd_primary_affinity) { u32 *affinity; - affinity = ceph_kvmalloc(array_size(max, sizeof(*affinity)), + affinity = kvmalloc(array_size(max, sizeof(*affinity)), GFP_NOFS); if (!affinity) return -ENOMEM; @@ -1503,7 +1503,7 @@ static int set_primary_affinity(struct c if (!map->osd_primary_affinity) { int i; - map->osd_primary_affinity = ceph_kvmalloc( + map->osd_primary_affinity = kvmalloc( array_size(map->max_osd, sizeof(*map->osd_primary_affinity)), GFP_NOFS); if (!map->osd_primary_affinity) From patchwork Fri Jan 14 22:07:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714103 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id BFD5FC433EF for ; Fri, 14 Jan 2022 22:07:14 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 5348F6B011B; Fri, 14 Jan 2022 17:07:14 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 4E54E6B011D; Fri, 14 Jan 2022 17:07:14 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from 
userid 63042) id 3AB976B011E; Fri, 14 Jan 2022 17:07:14 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0086.hostedemail.com [216.40.44.86]) by kanga.kvack.org (Postfix) with ESMTP id 2733A6B011B for ; Fri, 14 Jan 2022 17:07:14 -0500 (EST) Received: from smtpin17.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id E130B98C03 for ; Fri, 14 Jan 2022 22:07:13 +0000 (UTC) X-FDA: 79030279146.17.BDEF02F Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf08.hostedemail.com (Postfix) with ESMTP id 50422160004 for ; Fri, 14 Jan 2022 22:07:13 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 9F54662011; Fri, 14 Jan 2022 22:07:12 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8FBD8C36AED; Fri, 14 Jan 2022 22:07:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198032; bh=GQEP01sTsyUivJvV23Ywfkcnu0Wj9586sffJkfCEypg=; h=Date:From:To:Subject:In-Reply-To:From; b=suHldoq+aVi67LkoPuPNMGgo+CsjKGB1bjP3Y2BytubngNtVwrJA4nylCPdGeZuAO EVNRTEmv7bSEodCIXArl/0fD8iCQzt8wtE5/oI+MMK3Kh/COSD11fDAYg9iEOXmqxj hBC/PePUcjIriTnQAJgXag9Z8pC1iWJsPs9daPnE= Date: Fri, 14 Jan 2022 14:07:11 -0800 From: Andrew Morton To: akpm@linux-foundation.org, bigeasy@linutronix.de, david@fromorbit.com, dchinner@redhat.com, hch@lst.de, idryomov@gmail.com, jlayton@kernel.org, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, neilb@suse.de, torvalds@linux-foundation.org, urezki@gmail.com, vbabka@suse.cz Subject: [patch 077/146] mm: make slab and vmalloc allocators __GFP_NOLOCKDEP aware Message-ID: <20220114220711.IJjDTpdMu%akpm@linux-foundation.org> In-Reply-To: 
<20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 50422160004 X-Stat-Signature: xyomc81sa65rrfyfnzgbxjaeddkx4ay8 Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=suHldoq+; spf=pass (imf08.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-Rspamd-Server: rspam07 X-HE-Tag: 1642198033-371349 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:
From: Michal Hocko
Subject: mm: make slab and vmalloc allocators __GFP_NOLOCKDEP aware

The sl?b and vmalloc allocators reduce the given gfp mask for their internal needs. For that they use GFP_RECLAIM_MASK to preserve the reclaim behavior and constraints.

__GFP_NOLOCKDEP is not part of that mask because, strictly speaking, it doesn't control the reclaim behavior. On the other hand, it tells the underlying page allocator to disable reclaim recursion detection, so arguably it should be part of the mask.

Having __GFP_NOLOCKDEP in the mask will not alter the behavior in any way, so this change is safe pretty much by definition. It also adds support for this flag to the SL?B and vmalloc allocators, which will in turn allow its use with kvmalloc as well.
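The effect of the mask change can be shown with a minimal user-space sketch. It is only an illustration: the DEMO_* bit values are made up, the two masks are heavily reduced models of GFP_RECLAIM_MASK before and after this patch, and demo_internal_gfp() is a hypothetical name for the "reduce the given gfp mask" step described above.

```c
#include <assert.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define DEMO_GFP_IO        0x01u	/* models __GFP_IO */
#define DEMO_GFP_FS        0x02u	/* models __GFP_FS */
#define DEMO_GFP_NOLOCKDEP 0x04u	/* models __GFP_NOLOCKDEP */
#define DEMO_GFP_HIGHMEM   0x08u	/* models a flag outside the mask */

/* Reduced models of GFP_RECLAIM_MASK before and after the patch. */
#define DEMO_RECLAIM_MASK_OLD (DEMO_GFP_IO | DEMO_GFP_FS)
#define DEMO_RECLAIM_MASK_NEW (DEMO_RECLAIM_MASK_OLD | DEMO_GFP_NOLOCKDEP)

/* Models an allocator deriving the flags for its internal allocations. */
static inline unsigned int demo_internal_gfp(unsigned int gfp, unsigned int mask)
{
	return gfp & mask;
}
```

With the old mask, a caller's DEMO_GFP_NOLOCKDEP is silently dropped for internal allocations; with the new mask it is carried through, while unrelated flags such as DEMO_GFP_HIGHMEM are still filtered out.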
A lack of the support has been noticed recently in http://lkml.kernel.org/r/20211119225435.GZ449541@dread.disaster.area Link: https://lkml.kernel.org/r/YZ9XtLY4AEjVuiEI@dhcp22.suse.cz Signed-off-by: Michal Hocko Reported-by: Sebastian Andrzej Siewior Acked-by: Dave Chinner Acked-by: Vlastimil Babka Cc: Christoph Hellwig Cc: Dave Chinner Cc: Ilya Dryomov Cc: Jeff Layton Cc: Neil Brown Cc: Uladzislau Rezki (Sony) Signed-off-by: Andrew Morton --- mm/internal.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/internal.h~mm-make-slab-and-vmalloc-allocators-__gfp_nolockdep-aware +++ a/mm/internal.h @@ -21,7 +21,7 @@ #define GFP_RECLAIM_MASK (__GFP_RECLAIM|__GFP_HIGH|__GFP_IO|__GFP_FS|\ __GFP_NOWARN|__GFP_RETRY_MAYFAIL|__GFP_NOFAIL|\ __GFP_NORETRY|__GFP_MEMALLOC|__GFP_NOMEMALLOC|\ - __GFP_ATOMIC) + __GFP_ATOMIC|__GFP_NOLOCKDEP) /* The GFP flags allowed during early boot */ #define GFP_BOOT_MASK (__GFP_BITS_MASK & ~(__GFP_RECLAIM|__GFP_IO|__GFP_FS)) From patchwork Fri Jan 14 22:07:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714104 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 30124C433EF for ; Fri, 14 Jan 2022 22:07:19 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B75BB6B011D; Fri, 14 Jan 2022 17:07:18 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id B24476B011F; Fri, 14 Jan 2022 17:07:18 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9C5A26B0120; Fri, 14 Jan 2022 17:07:18 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0098.hostedemail.com [216.40.44.98]) by kanga.kvack.org (Postfix) with ESMTP id 8AF396B011D for ; Fri, 14 
Jan 2022 17:07:18 -0500 (EST) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 4D82298C03 for ; Fri, 14 Jan 2022 22:07:18 +0000 (UTC) X-FDA: 79030279356.06.321EAD2 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf08.hostedemail.com (Postfix) with ESMTP id 8FCCB16000E for ; Fri, 14 Jan 2022 22:07:17 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 97CC7B8262E; Fri, 14 Jan 2022 22:07:16 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E7B74C36AE5; Fri, 14 Jan 2022 22:07:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198035; bh=DrUpgG71ZzSMNqsyDMOMR3pdmqOX/EemLl/2Wsh0t3I=; h=Date:From:To:Subject:In-Reply-To:From; b=qWifwkT55BIm8oaqvIBrn6ewf1ibMN5A4HGYT7lM9VVH7TdJSUuZdGT7rWLUeisH3 tJruD9EbkmeG4gcVYigxGtkhiIPEpdD/N2xVneBSmLE3mFJ3xNHNgixmr0NkQvtfn+ GBAGcQMN90ff7zlSNrfMb6rUoZ3Sanh528qaZmmo= Date: Fri, 14 Jan 2022 14:07:14 -0800 From: Andrew Morton To: akpm@linux-foundation.org, chao@kernel.org, chuck.lever@oracle.com, david@fromorbit.com, djwong@kernel.org, jaegeuk@kernel.org, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, neilb@suse.de, torvalds@linux-foundation.org, tytso@mit.edu Subject: [patch 078/146] mm: introduce memalloc_retry_wait() Message-ID: <20220114220714.jg5V2tDGp%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam10 X-Rspamd-Queue-Id: 8FCCB16000E X-Stat-Signature: 9qfbt1p419ghaz4agfj3kx5kcgi984i1 Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=qWifwkT5; dmarc=none; spf=pass (imf08.hostedemail.com: domain of 
akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1642198037-709321 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:
From: "NeilBrown"
Subject: mm: introduce memalloc_retry_wait()

Various places in the kernel - largely in filesystems - respond to a memory allocation failure by looping around and retrying. Some of these cannot conveniently use __GFP_NOFAIL, for reasons such as:

- a GFP_ATOMIC allocation, which __GFP_NOFAIL doesn't work on
- a need to check for the process being signalled between failures
- the possibility that other recovery actions could be performed
- the allocation is quite deep in support code, and passing down an extra flag to say if __GFP_NOFAIL is wanted would be clumsy.

Many of these currently use congestion_wait() which (in almost all cases) simply waits the given timeout - congestion isn't tracked for most devices.

It isn't clear what the best delay is for loops, but it is clear that the various filesystems shouldn't be responsible for choosing a timeout.

This patch introduces memalloc_retry_wait() which takes on that responsibility. Code that wants to retry a memory allocation can call this function passing the GFP flags that were used. It will wait however long is appropriate.

For now, it only considers __GFP_NORETRY and whatever gfpflags_allow_blocking() tests. If blocking is allowed without __GFP_NORETRY, then alloc_page either made some reclaim progress, or waited for a while, before failing. So there is no need for much further waiting. memalloc_retry_wait() will wait until the current jiffy ends. If this condition is not met, then alloc_page() won't have waited much if at all. In that case memalloc_retry_wait() waits about 20ms (HZ/50 jiffies). This is the delay that most current loops use.
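The decision described above can be modelled in user space as follows. This is a minimal sketch, not the kernel helper: the DEMO_* bit values and DEMO_HZ are hypothetical, the function returns a jiffy count instead of sleeping, and it leaves out the current_gfp_context() adjustment that the real helper in the diff applies first. Note that the diff passes HZ/50 to io_schedule_timeout() for the long wait, which is one fiftieth of a second.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-ins for kernel definitions. The bit values and the
 * tick rate are illustrative only and do not match any real kernel config.
 */
#define DEMO_HZ                 250u
#define DEMO_GFP_DIRECT_RECLAIM 0x1u	/* models __GFP_DIRECT_RECLAIM */
#define DEMO_GFP_NORETRY        0x2u	/* models __GFP_NORETRY */

/* Models gfpflags_allow_blocking(): the caller may sleep in direct reclaim. */
static inline bool demo_allow_blocking(unsigned int gfp)
{
	return (gfp & DEMO_GFP_DIRECT_RECLAIM) != 0;
}

/* Returns how many jiffies the modelled helper would sleep before a retry. */
static inline unsigned int demo_retry_wait_jiffies(unsigned int gfp)
{
	if (demo_allow_blocking(gfp) && !(gfp & DEMO_GFP_NORETRY))
		return 1;		/* allocator probably waited already */
	return DEMO_HZ / 50;		/* allocator probably did not wait */
}
```

A caller is expected to pass the flags of the most constrained allocation that might have failed, so a loop around an atomic allocation gets the longer wait even if other allocations in the same loop were allowed to block.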
linux/sched/mm.h needs to be included in some files now, but linux/backing-dev.h does not. Link: https://lkml.kernel.org/r/163754371968.13692.1277530886009912421@noble.neil.brown.name Signed-off-by: NeilBrown Cc: Dave Chinner Cc: Michal Hocko Cc: "Theodore Ts'o" Cc: Jaegeuk Kim Cc: Chao Yu Cc: Darrick J. Wong Cc: Chuck Lever Signed-off-by: Andrew Morton --- fs/ext4/extents.c | 8 +++----- fs/ext4/inline.c | 5 ++--- fs/ext4/page-io.c | 9 +++++---- fs/f2fs/data.c | 4 ++-- fs/f2fs/gc.c | 5 ++--- fs/f2fs/inode.c | 4 ++-- fs/f2fs/node.c | 4 ++-- fs/f2fs/recovery.c | 6 +++--- fs/f2fs/segment.c | 9 +++------ fs/f2fs/super.c | 5 ++--- fs/xfs/kmem.c | 3 +-- fs/xfs/xfs_buf.c | 2 +- include/linux/sched/mm.h | 26 ++++++++++++++++++++++++++ net/sunrpc/svc_xprt.c | 3 ++- 14 files changed, 56 insertions(+), 37 deletions(-) --- a/fs/ext4/extents.c~mm-introduce-memalloc_retry_wait +++ a/fs/ext4/extents.c @@ -27,8 +27,8 @@ #include #include #include -#include #include +#include #include "ext4_jbd2.h" #include "ext4_extents.h" #include "xattr.h" @@ -4407,8 +4407,7 @@ retry: err = ext4_es_remove_extent(inode, last_block, EXT_MAX_BLOCKS - last_block); if (err == -ENOMEM) { - cond_resched(); - congestion_wait(BLK_RW_ASYNC, HZ/50); + memalloc_retry_wait(GFP_ATOMIC); goto retry; } if (err) @@ -4416,8 +4415,7 @@ retry: retry_remove_space: err = ext4_ext_remove_space(inode, last_block, EXT_MAX_BLOCKS - 1); if (err == -ENOMEM) { - cond_resched(); - congestion_wait(BLK_RW_ASYNC, HZ/50); + memalloc_retry_wait(GFP_ATOMIC); goto retry_remove_space; } return err; --- a/fs/ext4/inline.c~mm-introduce-memalloc_retry_wait +++ a/fs/ext4/inline.c @@ -7,7 +7,7 @@ #include #include #include -#include +#include #include "ext4_jbd2.h" #include "ext4.h" @@ -1929,8 +1929,7 @@ int ext4_inline_data_truncate(struct ino retry: err = ext4_es_remove_extent(inode, 0, EXT_MAX_BLOCKS); if (err == -ENOMEM) { - cond_resched(); - congestion_wait(BLK_RW_ASYNC, HZ/50); + memalloc_retry_wait(GFP_ATOMIC); goto retry; } if 
(err) --- a/fs/ext4/page-io.c~mm-introduce-memalloc_retry_wait +++ a/fs/ext4/page-io.c @@ -24,7 +24,7 @@ #include #include #include -#include +#include #include "ext4_jbd2.h" #include "xattr.h" @@ -523,12 +523,13 @@ int ext4_bio_write_page(struct ext4_io_s ret = PTR_ERR(bounce_page); if (ret == -ENOMEM && (io->io_bio || wbc->sync_mode == WB_SYNC_ALL)) { - gfp_flags = GFP_NOFS; + gfp_t new_gfp_flags = GFP_NOFS; if (io->io_bio) ext4_io_submit(io); else - gfp_flags |= __GFP_NOFAIL; - congestion_wait(BLK_RW_ASYNC, HZ/50); + new_gfp_flags |= __GFP_NOFAIL; + memalloc_retry_wait(gfp_flags); + gfp_flags = new_gfp_flags; goto retry_encrypt; } --- a/fs/f2fs/data.c~mm-introduce-memalloc_retry_wait +++ a/fs/f2fs/data.c @@ -8,9 +8,9 @@ #include #include #include +#include #include #include -#include #include #include #include @@ -2542,7 +2542,7 @@ retry_encrypt: /* flush pending IOs and wait for a while in the ENOMEM case */ if (PTR_ERR(fio->encrypted_page) == -ENOMEM) { f2fs_flush_merged_writes(fio->sbi); - congestion_wait(BLK_RW_ASYNC, DEFAULT_IO_TIMEOUT); + memalloc_retry_wait(GFP_NOFS); gfp_flags |= __GFP_NOFAIL; goto retry_encrypt; } --- a/fs/f2fs/gc.c~mm-introduce-memalloc_retry_wait +++ a/fs/f2fs/gc.c @@ -7,7 +7,6 @@ */ #include #include -#include #include #include #include @@ -15,6 +14,7 @@ #include #include #include +#include #include "f2fs.h" #include "node.h" @@ -1375,8 +1375,7 @@ retry: if (err) { clear_page_private_gcing(page); if (err == -ENOMEM) { - congestion_wait(BLK_RW_ASYNC, - DEFAULT_IO_TIMEOUT); + memalloc_retry_wait(GFP_NOFS); goto retry; } if (is_dirty) --- a/fs/f2fs/inode.c~mm-introduce-memalloc_retry_wait +++ a/fs/f2fs/inode.c @@ -8,8 +8,8 @@ #include #include #include -#include #include +#include #include "f2fs.h" #include "node.h" @@ -562,7 +562,7 @@ retry: inode = f2fs_iget(sb, ino); if (IS_ERR(inode)) { if (PTR_ERR(inode) == -ENOMEM) { - congestion_wait(BLK_RW_ASYNC, DEFAULT_IO_TIMEOUT); + memalloc_retry_wait(GFP_NOFS); goto retry; } } --- 
a/fs/f2fs/node.c~mm-introduce-memalloc_retry_wait +++ a/fs/f2fs/node.c @@ -8,7 +8,7 @@ #include #include #include -#include +#include #include #include #include @@ -2750,7 +2750,7 @@ int f2fs_recover_inode_page(struct f2fs_ retry: ipage = f2fs_grab_cache_page(NODE_MAPPING(sbi), ino, false); if (!ipage) { - congestion_wait(BLK_RW_ASYNC, DEFAULT_IO_TIMEOUT); + memalloc_retry_wait(GFP_NOFS); goto retry; } --- a/fs/f2fs/recovery.c~mm-introduce-memalloc_retry_wait +++ a/fs/f2fs/recovery.c @@ -8,6 +8,7 @@ #include #include #include +#include #include "f2fs.h" #include "node.h" #include "segment.h" @@ -587,7 +588,7 @@ retry_dn: err = f2fs_get_dnode_of_data(&dn, start, ALLOC_NODE); if (err) { if (err == -ENOMEM) { - congestion_wait(BLK_RW_ASYNC, DEFAULT_IO_TIMEOUT); + memalloc_retry_wait(GFP_NOFS); goto retry_dn; } goto out; @@ -670,8 +671,7 @@ retry_prev: err = check_index_in_prev_nodes(sbi, dest, &dn); if (err) { if (err == -ENOMEM) { - congestion_wait(BLK_RW_ASYNC, - DEFAULT_IO_TIMEOUT); + memalloc_retry_wait(GFP_NOFS); goto retry_prev; } goto err; --- a/fs/f2fs/segment.c~mm-introduce-memalloc_retry_wait +++ a/fs/f2fs/segment.c @@ -9,6 +9,7 @@ #include #include #include +#include #include #include #include @@ -245,9 +246,7 @@ retry: LOOKUP_NODE); if (err) { if (err == -ENOMEM) { - congestion_wait(BLK_RW_ASYNC, - DEFAULT_IO_TIMEOUT); - cond_resched(); + memalloc_retry_wait(GFP_NOFS); goto retry; } err = -EAGAIN; @@ -424,9 +423,7 @@ retry: err = f2fs_do_write_data_page(&fio); if (err) { if (err == -ENOMEM) { - congestion_wait(BLK_RW_ASYNC, - DEFAULT_IO_TIMEOUT); - cond_resched(); + memalloc_retry_wait(GFP_NOFS); goto retry; } unlock_page(page); --- a/fs/f2fs/super.c~mm-introduce-memalloc_retry_wait +++ a/fs/f2fs/super.c @@ -8,9 +8,9 @@ #include #include #include +#include #include #include -#include #include #include #include @@ -2415,8 +2415,7 @@ repeat: page = read_cache_page_gfp(mapping, blkidx, GFP_NOFS); if (IS_ERR(page)) { if (PTR_ERR(page) == -ENOMEM) { - 
congestion_wait(BLK_RW_ASYNC, - DEFAULT_IO_TIMEOUT); + memalloc_retry_wait(GFP_NOFS); goto repeat; } set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR); --- a/fs/xfs/kmem.c~mm-introduce-memalloc_retry_wait +++ a/fs/xfs/kmem.c @@ -4,7 +4,6 @@ * All Rights Reserved. */ #include "xfs.h" -#include #include "xfs_message.h" #include "xfs_trace.h" @@ -26,6 +25,6 @@ kmem_alloc(size_t size, xfs_km_flags_t f "%s(%u) possible memory allocation deadlock size %u in %s (mode:0x%x)", current->comm, current->pid, (unsigned int)size, __func__, lflags); - congestion_wait(BLK_RW_ASYNC, HZ/50); + memalloc_retry_wait(lflags); } while (1); } --- a/fs/xfs/xfs_buf.c~mm-introduce-memalloc_retry_wait +++ a/fs/xfs/xfs_buf.c @@ -394,7 +394,7 @@ xfs_buf_alloc_pages( } XFS_STATS_INC(bp->b_mount, xb_page_retries); - congestion_wait(BLK_RW_ASYNC, HZ / 50); + memalloc_retry_wait(gfp_mask); } return 0; } --- a/include/linux/sched/mm.h~mm-introduce-memalloc_retry_wait +++ a/include/linux/sched/mm.h @@ -214,6 +214,32 @@ static inline void fs_reclaim_acquire(gf static inline void fs_reclaim_release(gfp_t gfp_mask) { } #endif +/* Any memory-allocation retry loop should use + * memalloc_retry_wait(), and pass the flags for the most + * constrained allocation attempt that might have failed. + * This provides useful documentation of where loops are, + * and a central place to fine tune the waiting as the MM + * implementation changes. + */ +static inline void memalloc_retry_wait(gfp_t gfp_flags) +{ + /* We use io_schedule_timeout because waiting for memory + * typically included waiting for dirty pages to be + * written out, which requires IO. 
+ */ + __set_current_state(TASK_UNINTERRUPTIBLE); + gfp_flags = current_gfp_context(gfp_flags); + if (gfpflags_allow_blocking(gfp_flags) && + !(gfp_flags & __GFP_NORETRY)) + /* Probably waited already, no need for much more */ + io_schedule_timeout(1); + else + /* Probably didn't wait, and has now released a lock, + * so now is a good time to wait + */ + io_schedule_timeout(HZ/50); +} + /** * might_alloc - Mark possible allocation sites * @gfp_mask: gfp_t flags that would be used to allocate --- a/net/sunrpc/svc_xprt.c~mm-introduce-memalloc_retry_wait +++ a/net/sunrpc/svc_xprt.c @@ -6,6 +6,7 @@ */ #include +#include #include #include #include @@ -688,7 +689,7 @@ static int svc_alloc_arg(struct svc_rqst return -EINTR; } trace_svc_alloc_arg_err(pages); - schedule_timeout(msecs_to_jiffies(500)); + memalloc_retry_wait(GFP_KERNEL); } rqstp->rq_page_end = &rqstp->rq_pages[pages]; rqstp->rq_pages[pages] = NULL; /* this might be seen in nfsd_splice_actor() */ From patchwork Fri Jan 14 22:07:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714105 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9B476C433FE for ; Fri, 14 Jan 2022 22:07:21 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 2E9A86B011F; Fri, 14 Jan 2022 17:07:21 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 299C66B0121; Fri, 14 Jan 2022 17:07:21 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 139CA6B0122; Fri, 14 Jan 2022 17:07:21 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0248.hostedemail.com [216.40.44.248]) by kanga.kvack.org (Postfix) with ESMTP id F28DF6B011F for ; Fri, 14 Jan 
2022 17:07:20 -0500 (EST) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id B3AAB96F31 for ; Fri, 14 Jan 2022 22:07:20 +0000 (UTC) X-FDA: 79030279440.07.9C0AC0C Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf07.hostedemail.com (Postfix) with ESMTP id 3A81A40010 for ; Fri, 14 Jan 2022 22:07:20 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 93BEC6201A; Fri, 14 Jan 2022 22:07:19 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 58F13C36AE5; Fri, 14 Jan 2022 22:07:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198039; bh=tDLobOzau0oLIqILsdj8C88xEKH5BlBa+zweQA20Hdo=; h=Date:From:To:Subject:In-Reply-To:From; b=XLun85IaBvyCf0H/FHqSejGKIpmNBsqyXaD/wel/3F4CSpp/0eC9yxw/1a0veuZjn 8OkqfZ+gvH0D56kcFGloHKhLaOPLkkFtPtkiMxpVHRK2kLLimeQRXhVO/wIoTtclVZ 1oDGWuKORyynJD3kf7EieaxhTZx5m9qIsN04AcOI= Date: Fri, 14 Jan 2022 14:07:17 -0800 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, crope@iki.fi, dave.hansen@linux.intel.com, hannes@cmpxchg.org, keescook@chromium.org, kernel@tuxforce.de, linux-mm@kvack.org, mcgrof@kernel.org, mgorman@techsingularity.net, mhocko@suse.com, mm-commits@vger.kernel.org, rppt@kernel.org, surenb@google.com, torvalds@linux-foundation.org, vbabka@suse.cz, xi.fengfei@h3c.com, yi.zhang@huawei.com, yzaikin@google.com Subject: [patch 079/146] mm/pagealloc: sysctl: change watermark_scale_factor max limit to 30% Message-ID: <20220114220717.gRYVbnFKZ%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg 
header.b=XLun85Ia; dmarc=none; spf=pass (imf07.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org
X-Stat-Signature: izzid5rx3765mi6u1ixhze78s6k4g8ds
X-Rspamd-Queue-Id: 3A81A40010
X-Rspamd-Server: rspam12
X-HE-Tag: 1642198040-590356
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Suren Baghdasaryan
Subject: mm/pagealloc: sysctl: change watermark_scale_factor max limit to 30%

On embedded systems with low total memory that have to run applications
with relatively large memory requirements, the 10% maximum for
watermark_scale_factor triggers direct reclaim every time such an
application is started. This results in slow application startup times
and a bad end-user experience. By increasing the watermark_scale_factor
maximum, we allow vendors more flexibility to choose the right level of
kswapd aggressiveness for their device and workload requirements.

Link: https://lkml.kernel.org/r/20211124193604.2758863-1-surenb@google.com
Signed-off-by: Suren Baghdasaryan
Acked-by: Johannes Weiner
Cc: Michal Hocko
Cc: Lukas Middendorf
Cc: Antti Palosaari
Cc: Luis Chamberlain
Cc: Kees Cook
Cc: Iurii Zaikin
Cc: Dave Hansen
Cc: Vlastimil Babka
Cc: Mel Gorman
Cc: Jonathan Corbet
Cc: Zhang Yi
Cc: Fengfei Xi
Cc: Mike Rapoport
Signed-off-by: Andrew Morton
---

 Documentation/admin-guide/sysctl/vm.rst | 2 +-
 kernel/sysctl.c | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

--- a/Documentation/admin-guide/sysctl/vm.rst~sysctl-change-watermark_scale_factor-max-limit-to-30%
+++ a/Documentation/admin-guide/sysctl/vm.rst
@@ -948,7 +948,7 @@ how much memory needs to be free before
 The unit is in fractions of 10,000. The default value of 10 means the
 distances between watermarks are 0.1% of the available memory in the
-node/system.
The maximum value is 1000, or 10% of memory. +node/system. The maximum value is 3000, or 30% of memory. A high rate of threads entering direct reclaim (allocstall) or kswapd going to sleep prematurely (kswapd_low_wmark_hit_quickly) can indicate --- a/kernel/sysctl.c~sysctl-change-watermark_scale_factor-max-limit-to-30% +++ a/kernel/sysctl.c @@ -122,6 +122,7 @@ static unsigned long long_max = LONG_MAX static int one_hundred = 100; static int two_hundred = 200; static int one_thousand = 1000; +static int three_thousand = 3000; #ifdef CONFIG_PRINTK static int ten_thousand = 10000; #endif @@ -2959,7 +2960,7 @@ static struct ctl_table vm_table[] = { .mode = 0644, .proc_handler = watermark_scale_factor_sysctl_handler, .extra1 = SYSCTL_ONE, - .extra2 = &one_thousand, + .extra2 = &three_thousand, }, { .procname = "percpu_pagelist_high_fraction", From patchwork Fri Jan 14 22:07:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714106 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B1EBAC433EF for ; Fri, 14 Jan 2022 22:07:24 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 386566B0121; Fri, 14 Jan 2022 17:07:24 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 335416B0123; Fri, 14 Jan 2022 17:07:24 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1FEC96B0124; Fri, 14 Jan 2022 17:07:24 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0062.hostedemail.com [216.40.44.62]) by kanga.kvack.org (Postfix) with ESMTP id 0D9576B0121 for ; Fri, 14 Jan 2022 17:07:24 -0500 (EST) Received: from smtpin30.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by 
forelay01.hostedemail.com (Postfix) with ESMTP id C1AE91826B6C5 for ; Fri, 14 Jan 2022 22:07:23 +0000 (UTC) X-FDA: 79030279566.30.BEA8A18 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf25.hostedemail.com (Postfix) with ESMTP id 5CA3EA0005 for ; Fri, 14 Jan 2022 22:07:23 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 88FCE6201D; Fri, 14 Jan 2022 22:07:22 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C02B3C36AEC; Fri, 14 Jan 2022 22:07:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198042; bh=7LuOOK14eXnlNYG8J9QodfnSLJlEQv7H2NUrOoEMzOg=; h=Date:From:To:Subject:In-Reply-To:From; b=1VTAfQ26b7Cf/+PEbx+koxZ9z1iHu7NVfozhK3Mavil4Z9dF3ykrt2QN4EXKsFfFn izn3v2SrAiCiWn0G7idmUCpQhb47utLlRIL6qcUDzQoVAqBvM5QHTBO/Hdh28D7WJI 2qZLM3KREM+WAiGe7SeUNXBzs8owD5IBEdvCDnpc= Date: Fri, 14 Jan 2022 14:07:21 -0800 From: Andrew Morton To: akpm@linux-foundation.org, deng.changcheng@zte.com.cn, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, zealci@zte.com.cn Subject: [patch 080/146] mm: fix boolreturn.cocci warning Message-ID: <20220114220721.-NgiYC23R%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam11 X-Rspamd-Queue-Id: 5CA3EA0005 X-Stat-Signature: ry3estu6595csomdj5fujyi9ojg8qius Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=1VTAfQ26; dmarc=none; spf=pass (imf25.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1642198043-40461 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 
Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Changcheng Deng Subject: mm: fix boolreturn.cocci warning Return statements in functions returning bool should use true/false instead of 1/0. Link: https://lkml.kernel.org/r/20211126073327.74815-1-deng.changcheng@zte.com.cn Signed-off-by: Changcheng Deng Reported-by: Zeal Robot Signed-off-by: Andrew Morton --- include/linux/page-flags.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/include/linux/page-flags.h~mm-fix-boolreturncocci-warning +++ a/include/linux/page-flags.h @@ -383,7 +383,7 @@ static __always_inline int TestClearPage TESTCLEARFLAG(uname, lname, policy) #define TESTPAGEFLAG_FALSE(uname, lname) \ -static inline bool folio_test_##lname(const struct folio *folio) { return 0; } \ +static inline bool folio_test_##lname(const struct folio *folio) { return false; } \ static inline int Page##uname(const struct page *page) { return 0; } #define SETPAGEFLAG_NOOP(uname, lname) \ From patchwork Fri Jan 14 22:07:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714107 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8AB75C433EF for ; Fri, 14 Jan 2022 22:07:28 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 2268D6B0123; Fri, 14 Jan 2022 17:07:28 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 1D6EE6B0125; Fri, 14 Jan 2022 17:07:28 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 09DB46B0126; Fri, 14 Jan 2022 17:07:28 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0250.hostedemail.com [216.40.44.250]) by kanga.kvack.org (Postfix) with 
ESMTP id EB5E86B0123 for ; Fri, 14 Jan 2022 17:07:27 -0500 (EST) Received: from smtpin23.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id B9DB28221941 for ; Fri, 14 Jan 2022 22:07:27 +0000 (UTC) X-FDA: 79030279734.23.07D7457 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf07.hostedemail.com (Postfix) with ESMTP id 5051040003 for ; Fri, 14 Jan 2022 22:07:27 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 4AED6B8262F; Fri, 14 Jan 2022 22:07:26 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C4921C36AED; Fri, 14 Jan 2022 22:07:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198045; bh=9PlbeLhugCgVuTC0nx8YtiRpKMewE3NTT+OCUMOAQ4Q=; h=Date:From:To:Subject:In-Reply-To:From; b=1Tbh3snbT7NNc53tO4wsborI/2qklFZA9aexmRw1vvQ9+XXJKK6mHiwEcjP9eufL9 6+LxEDXRdCIv9/WpmvjFod7PCJ78YXOPIZAgWGwIv0k4XcJSC1pinD34bvGC6dOZnk PjKye3MQQugt9/ubN/1l5aQd+grk2FG4sog/cDHw= Date: Fri, 14 Jan 2022 14:07:24 -0800 From: Andrew Morton To: akpm@linux-foundation.org, arthur.marsh@internode.on.net, linux-mm@kvack.org, mm-commits@vger.kernel.org, sxwjean@gmail.com, torvalds@linux-foundation.org Subject: [patch 081/146] mm: page_alloc: fix building error on -Werror=array-compare Message-ID: <20220114220724.BNY5-7-ga%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 5051040003 X-Stat-Signature: k7m7ic1m43cfdwkd3dsx5i3jxuxqau3p Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=1Tbh3snb; spf=pass (imf07.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 
as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none
X-HE-Tag: 1642198047-782044
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Xiongwei Song
Subject: mm: page_alloc: fix building error on -Werror=array-compare

Arthur Marsh reported that we hit the error below when building the
kernel with gcc-12:

  CC mm/page_alloc.o
mm/page_alloc.c: In function `mem_init_print_info':
mm/page_alloc.c:8173:27: error: comparison between two arrays [-Werror=array-compare]
 8173 | if (start <= pos && pos < end && size > adj) \
      |

C++20 deprecated comparison between arrays, and gcc-12 warns about it.
Compare the addresses of the first array elements explicitly instead.

Link: https://lkml.kernel.org/r/20211125130928.32465-1-sxwjean@me.com
Signed-off-by: Xiongwei Song
Reported-by: Arthur Marsh
Signed-off-by: Andrew Morton
---

 mm/page_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/page_alloc.c~mm-page_alloc-fix-building-error-on-werror=array-compare
+++ a/mm/page_alloc.c
@@ -8228,7 +8228,7 @@ void __init mem_init_print_info(void)
 	 */
 #define adj_init_size(start, end, size, pos, adj) \
 	do { \
-		if (start <= pos && pos < end && size > adj) \
+		if (&start[0] <= &pos[0] && &pos[0] < &end[0] && size > adj) \
 			size -= adj; \
 	} while (0)

From patchwork Fri Jan 14 22:07:27 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12714108
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3493FC433EF for ; Fri, 14 Jan 2022 22:07:32 +0000 (UTC)
Received: by kanga.kvack.org (Postfix) id C11876B0125; Fri, 14 Jan 2022 17:07:31 -0500 (EST)
Received: by kanga.kvack.org (Postfix, from userid 40) id BE78A6B0127; Fri, 14 Jan 2022 17:07:31 -0500 (EST)
X-Delivered-To:
int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AAFB76B0128; Fri, 14 Jan 2022 17:07:31 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (relay033.a.hostedemail.com [64.99.140.33]) by kanga.kvack.org (Postfix) with ESMTP id 9ABCA6B0125 for ; Fri, 14 Jan 2022 17:07:31 -0500 (EST) Received: from smtpin02.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 5A8B4244EE for ; Fri, 14 Jan 2022 22:07:31 +0000 (UTC) X-FDA: 79030279902.02.19EF0D7 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf10.hostedemail.com (Postfix) with ESMTP id C44D4C0002 for ; Fri, 14 Jan 2022 22:07:30 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id A0489B8262E; Fri, 14 Jan 2022 22:07:29 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E33BFC36AE9; Fri, 14 Jan 2022 22:07:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198048; bh=zilKDy3f92CxmPJP95EOc6C/qYXPkrquStMBdwKL4hQ=; h=Date:From:To:Subject:In-Reply-To:From; b=FRQojqsSc/rmcZZ05PJsacNozbATUtW7kPdcKZooRoNFDcWQ7hheFALhJobwUsHhE bRjiFvlpwNo1stXNKCDY5c6lXbTypQSij0LeluEqm6jIYqTQQ2tcqV8h1h8WznBA3j 1ZIPKeUYv/cRuWgKbs+rQWxiNSt0Mbw4db2joyzQ= Date: Fri, 14 Jan 2022 14:07:27 -0800 From: Andrew Morton To: aarcange@redhat.com, ak@linux.intel.com, akpm@linux-foundation.org, aneesh.kumar@linux.ibm.com, ben.widawsky@intel.com, dan.j.williams@intel.com, dave.hansen@linux.intel.com, feng.tang@intel.com, linux-mm@kvack.org, mgorman@techsingularity.net, mhocko@suse.com, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, rdunlap@infradead.org, torvalds@linux-foundation.org, vbabka@suse.cz, ying.huang@intel.com Subject: [patch 082/146] mm: drop 
node from alloc_pages_vma Message-ID: <20220114220727._VOw2pCCP%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: C44D4C0002 X-Stat-Signature: wicpksyehib6t4f7pf7u35ezncjyc8rh Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=FRQojqsS; spf=pass (imf10.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198050-948848 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Michal Hocko Subject: mm: drop node from alloc_pages_vma alloc_pages_vma is meant to allocate a page with a vma specific memory policy. The initial node parameter is always a local node so it is pointless to waste a function argument for this. Drop the parameter. 
Link: https://lkml.kernel.org/r/YaSnlv4QpryEpesG@dhcp22.suse.cz Signed-off-by: Michal Hocko Cc: Aneesh Kumar K.V Cc: Ben Widawsky Cc: Dave Hansen Cc: Feng Tang Cc: Andrea Arcangeli Cc: Mel Gorman Cc: Mike Kravetz Cc: Randy Dunlap Cc: Vlastimil Babka Cc: Andi Kleen Cc: Dan Williams Cc: "Huang, Ying" Signed-off-by: Andrew Morton --- include/linux/gfp.h | 8 ++++---- mm/mempolicy.c | 3 ++- mm/shmem.c | 3 +-- 3 files changed, 7 insertions(+), 7 deletions(-) --- a/include/linux/gfp.h~mm-drop-node-from-alloc_pages_vma +++ a/include/linux/gfp.h @@ -598,9 +598,9 @@ struct page *alloc_pages(gfp_t gfp, unsi struct folio *folio_alloc(gfp_t gfp, unsigned order); extern struct page *alloc_pages_vma(gfp_t gfp_mask, int order, struct vm_area_struct *vma, unsigned long addr, - int node, bool hugepage); + bool hugepage); #define alloc_hugepage_vma(gfp_mask, vma, addr, order) \ - alloc_pages_vma(gfp_mask, order, vma, addr, numa_node_id(), true) + alloc_pages_vma(gfp_mask, order, vma, addr, true) #else static inline struct page *alloc_pages(gfp_t gfp_mask, unsigned int order) { @@ -610,14 +610,14 @@ static inline struct folio *folio_alloc( { return __folio_alloc_node(gfp, order, numa_node_id()); } -#define alloc_pages_vma(gfp_mask, order, vma, addr, node, false)\ +#define alloc_pages_vma(gfp_mask, order, vma, addr, false)\ alloc_pages(gfp_mask, order) #define alloc_hugepage_vma(gfp_mask, vma, addr, order) \ alloc_pages(gfp_mask, order) #endif #define alloc_page(gfp_mask) alloc_pages(gfp_mask, 0) #define alloc_page_vma(gfp_mask, vma, addr) \ - alloc_pages_vma(gfp_mask, 0, vma, addr, numa_node_id(), false) + alloc_pages_vma(gfp_mask, 0, vma, addr, false) extern unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order); extern unsigned long get_zeroed_page(gfp_t gfp_mask); --- a/mm/mempolicy.c~mm-drop-node-from-alloc_pages_vma +++ a/mm/mempolicy.c @@ -2084,9 +2084,10 @@ static struct page *alloc_pages_preferre * Return: The page on success or NULL if allocation fails. 
*/ struct page *alloc_pages_vma(gfp_t gfp, int order, struct vm_area_struct *vma, - unsigned long addr, int node, bool hugepage) + unsigned long addr, bool hugepage) { struct mempolicy *pol; + int node = numa_node_id(); struct page *page; int preferred_nid; nodemask_t *nmask; --- a/mm/shmem.c~mm-drop-node-from-alloc_pages_vma +++ a/mm/shmem.c @@ -1564,8 +1564,7 @@ static struct page *shmem_alloc_hugepage return NULL; shmem_pseudo_vma_init(&pvma, info, hindex); - page = alloc_pages_vma(gfp, HPAGE_PMD_ORDER, &pvma, 0, numa_node_id(), - true); + page = alloc_pages_vma(gfp, HPAGE_PMD_ORDER, &pvma, 0, true); shmem_pseudo_vma_destroy(&pvma); if (page) prep_transhuge_page(page); From patchwork Fri Jan 14 22:07:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714109 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D35CBC4332F for ; Fri, 14 Jan 2022 22:07:33 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 708EB6B0127; Fri, 14 Jan 2022 17:07:33 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 6B7D76B0129; Fri, 14 Jan 2022 17:07:33 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 57F686B012A; Fri, 14 Jan 2022 17:07:33 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0159.hostedemail.com [216.40.44.159]) by kanga.kvack.org (Postfix) with ESMTP id 423F86B0127 for ; Fri, 14 Jan 2022 17:07:33 -0500 (EST) Received: from smtpin10.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id F3FC118272526 for ; Fri, 14 Jan 2022 22:07:32 +0000 (UTC) X-FDA: 79030279986.10.BF7F4B4 Received: from dfw.source.kernel.org 
(dfw.source.kernel.org [139.178.84.217]) by imf16.hostedemail.com (Postfix) with ESMTP id 9FB0D180006 for ; Fri, 14 Jan 2022 22:07:32 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 093C162016; Fri, 14 Jan 2022 22:07:32 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 44C11C36AE5; Fri, 14 Jan 2022 22:07:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198051; bh=dWuzg1DeLjJJExaIBfZZunBr+biz1Xgp5Yu22jkp1ko=; h=Date:From:To:Subject:In-Reply-To:From; b=Mn+YUkXtCbLDa1vPhshtRpwXRiQNLCg75Oi1OAeBakU4lTBtlDb149z6AinnUkVrx LFIT868pk8VIZHKmIzzF+RlxwfzQXlc2UkFyflaDDVHsvtjmwtT+K/bHFGeQAT3/PL 6cV+ue4DKKmcIok3Dewl956SKnHXTWbiSFJJgKoY= Date: Fri, 14 Jan 2022 14:07:30 -0800 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, miles.chen@mediatek.com, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 083/146] include/linux/gfp.h: further document GFP_DMA32 Message-ID: <20220114220730.NH9yCa_p7%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam10 X-Rspamd-Queue-Id: 9FB0D180006 X-Stat-Signature: 83ndpzau9iedd3jtfao61p4d3q83guo4 Authentication-Results: imf16.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Mn+YUkXt; dmarc=none; spf=pass (imf16.hostedemail.com: domain of akpm@linux-foundation.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1642198052-516924 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Miles Chen Subject: include/linux/gfp.h: further document GFP_DMA32 kmalloc(..., 
GFP_DMA32) does not return DMA32 memory because the DMA32 kmalloc cache array is not implemented. (Reason: there is no such user in kernel). Put a short comment about this so people can understand this by reading the comment. [1] https://lists.linuxfoundation.org/pipermail/iommu/2018-December/031696.html Link: https://lkml.kernel.org/r/20211207093610.6406-1-miles.chen@mediatek.com Signed-off-by: Miles Chen Signed-off-by: Andrew Morton --- include/linux/gfp.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) --- a/include/linux/gfp.h~gfp-further-document-gfp_dma32 +++ a/include/linux/gfp.h @@ -302,7 +302,9 @@ struct vm_area_struct; * lowest zone as a type of emergency reserve. * * %GFP_DMA32 is similar to %GFP_DMA except that the caller requires a 32-bit - * address. + * address. Note that kmalloc(..., GFP_DMA32) does not return DMA32 memory + * because the DMA32 kmalloc cache array is not implemented. + * (Reason: there is no such user in kernel). * * %GFP_HIGHUSER is for userspace allocations that may be mapped to userspace, * do not need to be directly accessible by the kernel but that cannot From patchwork Fri Jan 14 22:07:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714110 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 68FA9C433EF for ; Fri, 14 Jan 2022 22:07:40 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id EA9566B0129; Fri, 14 Jan 2022 17:07:39 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id E582F6B012B; Fri, 14 Jan 2022 17:07:39 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D203C6B012C; Fri, 14 Jan 2022 17:07:39 -0500 (EST) X-Delivered-To: linux-mm@kvack.org 
Received: from forelay.hostedemail.com (smtprelay0093.hostedemail.com [216.40.44.93]) by kanga.kvack.org (Postfix) with ESMTP id BEF7A6B0129 for ; Fri, 14 Jan 2022 17:07:39 -0500 (EST) Received: from smtpin15.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 8F2649675D for ; Fri, 14 Jan 2022 22:07:39 +0000 (UTC) X-FDA: 79030280238.15.5B9A2DC Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf19.hostedemail.com (Postfix) with ESMTP id F2D9D1A000E for ; Fri, 14 Jan 2022 22:07:38 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 01746CE24A4; Fri, 14 Jan 2022 22:07:36 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3CA01C36AE5; Fri, 14 Jan 2022 22:07:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198054; bh=WDobod/aoY5bdGeRdkcta3oKakA6AJ5+WGq6wRljvQo=; h=Date:From:To:Subject:In-Reply-To:From; b=UXT6Pnn+zO0rf0Ts96NiNt2tgid21g2QNJDKLObDJ+89D9thmjPln87OBfmoyx4Y6 SuMy8plvtSibxrRVpn3K9TqSCsoVP8LPA0YIInFnWh9e0ehmhCWDylXfiHf8qKmYdA /LweWUWN6/qQffMGQPh/6Vyf6WjLSbLj+nLYtntI= Date: Fri, 14 Jan 2022 14:07:33 -0800 From: Andrew Morton To: akpm@linux-foundation.org, anshuman.khandual@arm.com, david@redhat.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 084/146] mm/page_alloc.c: modify the comment section for alloc_contig_pages() Message-ID: <20220114220733.mDKSUOxi2%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam11 X-Rspamd-Queue-Id: F2D9D1A000E X-Stat-Signature: buwqcdrs834ty7nnf985dqwmuzixzrri Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=linux-foundation.org 
header.s=korg header.b=UXT6Pnn+; dmarc=none; spf=pass (imf19.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1642198058-765245 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Anshuman Khandual Subject: mm/page_alloc.c: modify the comment section for alloc_contig_pages() Clarify that the alloc_contig_pages() allocated range will always be aligned to the requested nr_pages. Link: https://lkml.kernel.org/r/1639545478-12160-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual Cc: David Hildenbrand Signed-off-by: Andrew Morton --- mm/page_alloc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/mm/page_alloc.c~mm-page_alloc-modify-the-comment-section-for-alloc_contig_pages +++ a/mm/page_alloc.c @@ -9272,8 +9272,8 @@ static bool zone_spans_last_pfn(const st * for allocation requests which can not be fulfilled with the buddy allocator. * * The allocated memory is always aligned to a page boundary. If nr_pages is a - * power of two then the alignment is guaranteed to be to the given nr_pages - * (e.g. 1GB request would be aligned to 1GB). + * power of two, then allocated range is also guaranteed to be aligned to same + * nr_pages (e.g. 1GB request would be aligned to 1GB). * * Allocated pages can be freed with free_contig_range() or by manually calling * __free_page() on each allocated page. 
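
The alignment rule stated in the updated alloc_contig_pages() comment can be illustrated with a small userspace model. This is only a sketch: is_pow2(), align_up_pfn() and range_is_naturally_aligned() are hypothetical helper names, not kernel functions, and the model assumes that candidate start pfns are multiples of nr_pages whenever nr_pages is a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when n is a non-zero power of two. */
static bool is_pow2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical helper: round pfn up to the next multiple of nr_pages.
 * The mask trick is only valid for a power-of-two nr_pages, so that
 * precondition is asserted.
 */
static uint64_t align_up_pfn(uint64_t pfn, uint64_t nr_pages)
{
	assert(is_pow2(nr_pages));
	return (pfn + nr_pages - 1) & ~(nr_pages - 1);
}

/*
 * Hypothetical helper: true when a range of nr_pages pages starting at
 * pfn is aligned to its own length. Non-power-of-two lengths get no
 * such guarantee in this model, so they always report false.
 */
static bool range_is_naturally_aligned(uint64_t pfn, uint64_t nr_pages)
{
	return is_pow2(nr_pages) && (pfn & (nr_pages - 1)) == 0;
}
```

For example, assuming 4 KiB pages, a 1GB request is 0x40000 pages, so in this model its first pfn is a multiple of 0x40000. A request of 0x300 pages is not a power of two and is only page aligned.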
From patchwork Fri Jan 14 22:07:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714111 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id E70FDC433FE for ; Fri, 14 Jan 2022 22:07:43 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 8417B6B012B; Fri, 14 Jan 2022 17:07:43 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 7EF6F6B012D; Fri, 14 Jan 2022 17:07:43 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6B70E6B012E; Fri, 14 Jan 2022 17:07:43 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0100.hostedemail.com [216.40.44.100]) by kanga.kvack.org (Postfix) with ESMTP id 5AEC06B012B for ; Fri, 14 Jan 2022 17:07:43 -0500 (EST) Received: from smtpin25.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 28FBB8221941 for ; Fri, 14 Jan 2022 22:07:43 +0000 (UTC) X-FDA: 79030280406.25.3295601 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf26.hostedemail.com (Postfix) with ESMTP id 4A39A140004 for ; Fri, 14 Jan 2022 22:07:42 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id CDE46CE19A9; Fri, 14 Jan 2022 22:07:39 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A8FAFC36AE9; Fri, 14 Jan 2022 22:07:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198058; bh=GvLI1VMIk/NS63MOGwEUgrmTl41cPtOIpDgQ1jhXHZY=; 
h=Date:From:To:Subject:In-Reply-To:From; b=lF6oR21qFJpecXE/MJh5AixdRuragDNro0gW2bUDgug5HWPvQ0oL3+vGjzmeasYl6 CmzdvA7y5xqS85SIf9EowzXfikOV4u+fia3mb0SYO2hXwhUnNQZdmTUreeo8S5gxUk ZvYLB6aSWV+/BZP9peU9JD/6l16FRZfSDAdCw1R8= Date: Fri, 14 Jan 2022 14:07:37 -0800 From: Andrew Morton To: 42.hyeyoo@gmail.com, akpm@linux-foundation.org, bhe@redhat.com, bp@alien8.de, cl@linux.com, David.Laight@ACULAB.COM, david@redhat.com, hch@lst.de, iamjoonsoo.kim@lge.com, john.p.donnelly@oracle.com, linux-mm@kvack.org, m.szyprowski@samsung.com, mm-commits@vger.kernel.org, penberg@kernel.org, rientjes@google.com, robin.murphy@arm.com, stable@vger.kernel.org, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 085/146] mm_zone: add function to check if managed dma zone exists Message-ID: <20220114220737.Yf78KyApy%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 4A39A140004 X-Stat-Signature: zjtktjo39aqii3ybj59x96gerzaxkxx4 Authentication-Results: imf26.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=lF6oR21q; dmarc=none; spf=pass (imf26.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam03 X-HE-Tag: 1642198062-810842 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Baoquan He Subject: mm_zone: add function to check if managed dma zone exists Patch series "Handle warning of allocation failure on DMA zone w/o managed pages", v4. **Problem observed: On x86_64, when crash is triggered and entering into kdump kernel, page allocation failure can always be seen. 
---------------------------------
 DMA: preallocated 128 KiB GFP_KERNEL pool for atomic allocations
 swapper/0: page allocation failure: order:5, mode:0xcc1(GFP_KERNEL|GFP_DMA), nodemask=(null),cpuset=/,mems_allowed=0
 CPU: 0 PID: 1 Comm: swapper/0
 Call Trace:
  dump_stack+0x7f/0xa1
  warn_alloc.cold+0x72/0xd6
  ......
  __alloc_pages+0x24d/0x2c0
  ......
  dma_atomic_pool_init+0xdb/0x176
  do_one_initcall+0x67/0x320
  ? rcu_read_lock_sched_held+0x3f/0x80
  kernel_init_freeable+0x290/0x2dc
  ? rest_init+0x24f/0x24f
  kernel_init+0xa/0x111
  ret_from_fork+0x22/0x30
 Mem-Info:
------------------------------------

***Root cause:
The current kernel assumes that the DMA zone must have managed pages and
tries to request pages from it whenever CONFIG_ZONE_DMA is enabled. This
is not always true. E.g. in the kdump kernel of x86_64, only the low 1M
is present and it is locked down at a very early stage of boot, so this
low 1M is never added to the buddy allocator to become managed pages of
the DMA zone. In this case any page allocation requested from the DMA
zone will always fail.

***Investigation:
This failure has happened since the commits below were merged into
Linus's tree:

  1a6a9044b967 x86/setup: Remove CONFIG_X86_RESERVE_LOW and reservelow= options
  23721c8e92f7 x86/crash: Remove crash_reserve_low_1M()
  f1d4d47c5851 x86/setup: Always reserve the first 1M of RAM
  7c321eb2b843 x86/kdump: Remove the backup region handling
  6f599d84231f x86/kdump: Always reserve the low 1M when the crashkernel option is specified

Before them, on x86_64, the low 640K area was reused by the kdump kernel.
So the content of the low 640K area was copied into a backup region for
dumping before jumping into the kdump kernel. Then, except for the
firmware reserved regions in [0, 640K], the remaining area was added to
the buddy allocator to become available managed pages of the DMA zone.

However, after the above commits were applied, in the kdump kernel of
x86_64 the low 1M is reserved by memblock but not released to the buddy
allocator. So any later page allocation requested from the DMA zone will
fail.

At the beginning, if crashkernel was reserved, the low 1M needed to be
locked down because AMD SME encrypts memory, making the old backup region
mechanism impossible when switching into the kdump kernel.

Later, it was also observed that there are BIOSes corrupting memory under
1M. To solve this, in commit f1d4d47c5851, the entire region of low 1M is
always reserved after the real mode trampoline is allocated.

Besides, recently, Intel engineers mentioned that their TDX (Trust Domain
Extensions), which is under development in the kernel, also needs to lock
down the low 1M. So we can't simply revert the above commits to fix the
page allocation failure from the DMA zone, as someone suggested.

***Solution:
Currently, only the DMA atomic pool and dma-kmalloc will initialize and
request page allocation with GFP_DMA during bootup.

So only initialize the DMA atomic pool when the DMA zone has available
managed pages; otherwise just skip the initialization.

For dma-kmalloc(), for the time being, let's mute the warning of
allocation failure when pages are requested from the DMA zone while it
has no managed pages. Meanwhile, change code to use the
dma_alloc_xx/dma_map_xx API to replace kmalloc(GFP_DMA), or do not use
GFP_DMA when calling kmalloc() if not necessary. Christoph is posting
patches to fix those under drivers/scsi/. Finally, we can remove the need
for dma-kmalloc() as people suggested.

This patch (of 3):

In some places, the current kernel assumes that the DMA zone must have
managed pages if CONFIG_ZONE_DMA is enabled. This is not always true.
E.g. in the kdump kernel of x86_64, only the low 1M is present and it is
locked down at a very early stage of boot, so there are no managed pages
at all in the DMA zone. In this case any page allocation requested from
the DMA zone will always fail.

Add the function has_managed_dma() and the relevant helper functions to
check whether there is a DMA zone with managed pages. It will be used in
later patches.
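
For readers who want the shape of the check before reading the diff, here
is a minimal userspace sketch. It is not kernel code: struct node_model
and has_managed_dma_model() are hypothetical stand-ins for the per-node
data and for has_managed_dma(), and the managed page counts are supplied
by the caller rather than read from a real zone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for one online node. Only the number of pages
 * that the buddy allocator manages in that node's DMA zone matters here.
 */
struct node_model {
	unsigned long dma_managed_pages;
};

/*
 * Model of the check: return true as soon as any node has a DMA zone
 * with at least one managed page, and false if none does.
 */
static bool has_managed_dma_model(const struct node_model *nodes,
				  size_t nr_nodes)
{
	assert(nodes != NULL || nr_nodes == 0);

	for (size_t i = 0; i < nr_nodes; i++) {
		if (nodes[i].dma_managed_pages > 0)
			return true;
	}
	return false;
}
```

In the kdump case described above, every node would report zero managed
DMA pages, so the model returns false and callers can skip GFP_DMA setup.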
Link: https://lkml.kernel.org/r/20211223094435.248523-1-bhe@redhat.com Link: https://lkml.kernel.org/r/20211223094435.248523-2-bhe@redhat.com Fixes: 6f599d84231f ("x86/kdump: Always reserve the low 1M when the crashkernel option is specified") Signed-off-by: Baoquan He Reviewed-by: David Hildenbrand Acked-by: John Donnelly Cc: Christoph Hellwig Cc: Christoph Lameter Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Pekka Enberg Cc: David Rientjes Cc: Joonsoo Kim Cc: Vlastimil Babka Cc: David Laight Cc: Borislav Petkov Cc: Marek Szyprowski Cc: Robin Murphy Cc: Signed-off-by: Andrew Morton --- include/linux/mmzone.h | 9 +++++++++ mm/page_alloc.c | 15 +++++++++++++++ 2 files changed, 24 insertions(+) --- a/include/linux/mmzone.h~mm_zone-add-function-to-check-if-managed-dma-zone-exists +++ a/include/linux/mmzone.h @@ -1047,6 +1047,15 @@ static inline int is_highmem_idx(enum zo #endif } +#ifdef CONFIG_ZONE_DMA +bool has_managed_dma(void); +#else +static inline bool has_managed_dma(void) +{ + return false; +} +#endif + /** * is_highmem - helper function to quickly check if a struct zone is a * highmem zone or not. 
This is an attempt to keep references --- a/mm/page_alloc.c~mm_zone-add-function-to-check-if-managed-dma-zone-exists +++ a/mm/page_alloc.c @@ -9518,3 +9518,18 @@ bool take_page_off_buddy(struct page *pa return ret; } #endif + +#ifdef CONFIG_ZONE_DMA +bool has_managed_dma(void) +{ + struct pglist_data *pgdat; + + for_each_online_pgdat(pgdat) { + struct zone *zone = &pgdat->node_zones[ZONE_DMA]; + + if (managed_zone(zone)) + return true; + } + return false; +} +#endif /* CONFIG_ZONE_DMA */ From patchwork Fri Jan 14 22:07:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714112 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C7BCBC433EF for ; Fri, 14 Jan 2022 22:07:45 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 5F7B76B012D; Fri, 14 Jan 2022 17:07:45 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 580766B012F; Fri, 14 Jan 2022 17:07:45 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 41FD36B0130; Fri, 14 Jan 2022 17:07:45 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0064.hostedemail.com [216.40.44.64]) by kanga.kvack.org (Postfix) with ESMTP id 2A30A6B012D for ; Fri, 14 Jan 2022 17:07:45 -0500 (EST) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id DE3041826B6C5 for ; Fri, 14 Jan 2022 22:07:44 +0000 (UTC) X-FDA: 79030280448.05.8870A4C Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf12.hostedemail.com (Postfix) with ESMTP id 7115A40002 for ; Fri, 14 Jan 2022 22:07:44 +0000 (UTC) Received: from smtp.kernel.org 
(relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 76490B825F5; Fri, 14 Jan 2022 22:07:43 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8341BC36AE9; Fri, 14 Jan 2022 22:07:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198062; bh=J5e179zjw3W3zMjjHqBdwyuqskrDIkdNg9dL/tTN6Vo=; h=Date:From:To:Subject:In-Reply-To:From; b=VJldcyoJLn7ojA85bDXn+/AWKfRuSDX1Z9qCOtJb4Kc8Jv0XsgqPAPG9g8z9koBU6 BWKDeu57r6ZS/SvgqafRPXMKggQnhjlK8e/6vXVGXjCYb1uUt2YZP/B3GKQMqSOEtf 3w/6kRFDaYbaE3rB43Q1ZTpeF2M4+D5aMU90ry8s= Date: Fri, 14 Jan 2022 14:07:41 -0800 From: Andrew Morton To: 42.hyeyoo@gmail.com, akpm@linux-foundation.org, bhe@redhat.com, bp@alien8.de, cl@linux.com, David.Laight@ACULAB.COM, david@redhat.com, hch@lst.de, iamjoonsoo.kim@lge.com, john.p.donnelly@oracle.com, linux-mm@kvack.org, m.szyprowski@samsung.com, mm-commits@vger.kernel.org, penberg@kernel.org, rientjes@google.com, robin.murphy@arm.com, stable@vger.kernel.org, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 086/146] dma/pool: create dma atomic pool only if dma zone has managed pages Message-ID: <20220114220741.hCFHv9Bem%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 7115A40002 X-Stat-Signature: tryckyifguyof1kffpcffruepxndfw79 Authentication-Results: imf12.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=VJldcyoJ; spf=pass (imf12.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198064-112925 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: 
owner-majordomo@kvack.org List-ID:
From: Baoquan He
Subject: dma/pool: create dma atomic pool only if dma zone has managed pages

Currently the three DMA atomic pools are initialized as long as the
relevant kernel code is built in. In the kdump kernel of x86_64, however,
this is not right when trying to create atomic_pool_dma, because there
are no managed pages in the DMA zone. In that case, the DMA zone only has
the low 1M of memory present, and it is locked down by the memblock
allocator, so no pages are added to the buddy allocator for the DMA zone.
Please check commit f1d4d47c5851 ("x86/setup: Always reserve the first 1M
of RAM").

Then in the kdump kernel of x86_64, it always prints the failure message
below:

 DMA: preallocated 128 KiB GFP_KERNEL pool for atomic allocations
 swapper/0: page allocation failure: order:5, mode:0xcc1(GFP_KERNEL|GFP_DMA), nodemask=(null),cpuset=/,mems_allowed=0
 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.0-0.rc5.20210611git929d931f2b40.42.fc35.x86_64 #1
 Hardware name: Dell Inc. PowerEdge R910/0P658H, BIOS 2.12.0 06/04/2018
 Call Trace:
  dump_stack+0x7f/0xa1
  warn_alloc.cold+0x72/0xd6
  ? _raw_spin_unlock_irq+0x24/0x40
  ? __alloc_pages_direct_compact+0x90/0x1b0
  __alloc_pages_slowpath.constprop.0+0xf29/0xf50
  ? __cond_resched+0x16/0x50
  ? prepare_alloc_pages.constprop.0+0x19d/0x1b0
  __alloc_pages+0x24d/0x2c0
  ? __dma_atomic_pool_init+0x93/0x93
  alloc_page_interleave+0x13/0xb0
  atomic_pool_expand+0x118/0x210
  ? __dma_atomic_pool_init+0x93/0x93
  __dma_atomic_pool_init+0x45/0x93
  dma_atomic_pool_init+0xdb/0x176
  do_one_initcall+0x67/0x320
  ? rcu_read_lock_sched_held+0x3f/0x80
  kernel_init_freeable+0x290/0x2dc
  ? rest_init+0x24f/0x24f
  kernel_init+0xa/0x111
  ret_from_fork+0x22/0x30
 Mem-Info:
 ......
 DMA: failed to allocate 128 KiB GFP_KERNEL|GFP_DMA pool for atomic allocation
 DMA: preallocated 128 KiB GFP_KERNEL|GFP_DMA32 pool for atomic allocations

Here, let's check whether the DMA zone has managed pages and create
atomic_pool_dma only if it does. Otherwise just skip it.
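
As a rough illustration of the intended behaviour, the following
userspace sketch models the two decisions this patch changes: whether the
DMA pool gets created at init time, and which pool is guessed first for a
request. It is only a sketch under stated assumptions. The SKETCH_GFP_*
values, struct pool_model, struct pools_model, pools_init_model() and
guess_pool_model() are hypothetical names invented for this example, and
the kernel's IS_ENABLED(CONFIG_ZONE_DMA32) test is folded into a plain
NULL check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values, used only by this sketch. */
#define SKETCH_GFP_DMA   0x1u
#define SKETCH_GFP_DMA32 0x2u

struct pool_model {
	const char *name;
};

struct pools_model {
	struct pool_model *kernel;
	struct pool_model *dma;   /* stays NULL without managed DMA pages */
	struct pool_model *dma32; /* NULL models CONFIG_ZONE_DMA32 being off */
};

static struct pool_model kernel_pool = { "kernel" };
static struct pool_model dma_pool = { "dma" };
static struct pool_model dma32_pool = { "dma32" };

/* Init step: the DMA pool is created only if the DMA zone has managed pages. */
static void pools_init_model(struct pools_model *p, bool managed_dma,
			     bool have_dma32)
{
	assert(p != NULL);
	p->kernel = &kernel_pool;
	p->dma = managed_dma ? &dma_pool : NULL;
	p->dma32 = have_dma32 ? &dma32_pool : NULL;
}

/*
 * First guess for a request: a GFP_DMA request uses the DMA pool only if
 * that pool exists, and otherwise falls through to the kernel pool.
 */
static struct pool_model *guess_pool_model(const struct pools_model *p,
					   unsigned int gfp)
{
	assert(p != NULL);
	if (p->dma32 && (gfp & SKETCH_GFP_DMA32))
		return p->dma32;
	if (p->dma && (gfp & SKETCH_GFP_DMA))
		return p->dma;
	return p->kernel;
}
```

The point of testing the pool pointer rather than the config option is
that a kernel built with CONFIG_ZONE_DMA can still end up without a
usable DMA pool, as in the kdump case above.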
Link: https://lkml.kernel.org/r/20211223094435.248523-3-bhe@redhat.com Fixes: 6f599d84231f ("x86/kdump: Always reserve the low 1M when the crashkernel option is specified") Signed-off-by: Baoquan He Reviewed-by: Christoph Hellwig Acked-by: John Donnelly Reviewed-by: David Hildenbrand Cc: Marek Szyprowski Cc: Robin Murphy Cc: Borislav Petkov Cc: Christoph Lameter Cc: David Laight Cc: David Rientjes Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Joonsoo Kim Cc: Pekka Enberg Cc: Vlastimil Babka Cc: Signed-off-by: Andrew Morton --- kernel/dma/pool.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/kernel/dma/pool.c~dma-pool-create-dma-atomic-pool-only-if-dma-zone-has-managed-pages +++ a/kernel/dma/pool.c @@ -203,7 +203,7 @@ static int __init dma_atomic_pool_init(v GFP_KERNEL); if (!atomic_pool_kernel) ret = -ENOMEM; - if (IS_ENABLED(CONFIG_ZONE_DMA)) { + if (has_managed_dma()) { atomic_pool_dma = __dma_atomic_pool_init(atomic_pool_size, GFP_KERNEL | GFP_DMA); if (!atomic_pool_dma) @@ -226,7 +226,7 @@ static inline struct gen_pool *dma_guess if (prev == NULL) { if (IS_ENABLED(CONFIG_ZONE_DMA32) && (gfp & GFP_DMA32)) return atomic_pool_dma32; - if (IS_ENABLED(CONFIG_ZONE_DMA) && (gfp & GFP_DMA)) + if (atomic_pool_dma && (gfp & GFP_DMA)) return atomic_pool_dma; return atomic_pool_kernel; } From patchwork Fri Jan 14 22:07:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714113 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C1397C433FE for ; Fri, 14 Jan 2022 22:07:51 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 55A6D6B012F; Fri, 14 Jan 2022 17:07:51 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 509D86B0131; Fri, 14 Jan 2022 17:07:51 -0500 (EST) 
X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3D1456B0132; Fri, 14 Jan 2022 17:07:51 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 2EC816B012F for ; Fri, 14 Jan 2022 17:07:51 -0500 (EST) Received: from smtpin23.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id EFF9C1827251D for ; Fri, 14 Jan 2022 22:07:50 +0000 (UTC) X-FDA: 79030280700.23.F3A6C5B Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf07.hostedemail.com (Postfix) with ESMTP id 243A640002 for ; Fri, 14 Jan 2022 22:07:50 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id AC8F7CE2384; Fri, 14 Jan 2022 22:07:47 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 61332C36AEC; Fri, 14 Jan 2022 22:07:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198066; bh=H4631Pyu+q2Mu2CLJ4PHUC0/uQIvI0vr+z8ZF4QT454=; h=Date:From:To:Subject:In-Reply-To:From; b=yT2+9SDTyn93j9Rkm5df/HkuRF1usCFfqJjoK4id5XW1pV/DyjWFVEClRlk0JokbG 0n28Nb1qSqA7G7PysuB+HJC5K1bPWdq1f1ar6iaqkOMgZW+ETrjxnXAh65Fz9b42HI prQW7PcnPnG9hT0oDwHBDmEfVnqo2LpUAr7iySyo= Date: Fri, 14 Jan 2022 14:07:44 -0800 From: Andrew Morton To: 42.hyeyoo@gmail.com, akpm@linux-foundation.org, bhe@redhat.com, bp@alien8.de, cl@linux.com, David.Laight@ACULAB.COM, david@redhat.com, hch@lst.de, iamjoonsoo.kim@lge.com, john.p.donnelly@oracle.com, linux-mm@kvack.org, m.szyprowski@samsung.com, mm-commits@vger.kernel.org, penberg@kernel.org, rientjes@google.com, robin.murphy@arm.com, stable@vger.kernel.org, torvalds@linux-foundation.org, vbabka@suse.cz Subject: 
[patch 087/146] mm/page_alloc.c: do not warn allocation failure on zone DMA if no managed pages Message-ID: <20220114220744.5KK-0rDF5%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 243A640002 X-Stat-Signature: sqni3hsjspcz9qfup87ut5rhx3cqeo1x Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=yT2+9SDT; dmarc=none; spf=pass (imf07.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-HE-Tag: 1642198070-661371 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Baoquan He Subject: mm/page_alloc.c: do not warn allocation failure on zone DMA if no managed pages In kdump kernel of x86_64, page allocation failure is observed: kworker/u2:2: page allocation failure: order:0, mode:0xcc1(GFP_KERNEL|GFP_DMA), nodemask=(null),cpuset=/,mems_allowed=0 CPU: 0 PID: 55 Comm: kworker/u2:2 Not tainted 5.16.0-rc4+ #5 Hardware name: AMD Dinar/Dinar, BIOS RDN1505B 06/05/2013 Workqueue: events_unbound async_run_entry_fn Call Trace: dump_stack_lvl+0x48/0x5e warn_alloc.cold+0x72/0xd6 __alloc_pages_slowpath.constprop.0+0xc69/0xcd0 __alloc_pages+0x1df/0x210 new_slab+0x389/0x4d0 ___slab_alloc+0x58f/0x770 __slab_alloc.constprop.0+0x4a/0x80 kmem_cache_alloc_trace+0x24b/0x2c0 sr_probe+0x1db/0x620 ...... device_add+0x405/0x920 ...... __scsi_add_device+0xe5/0x100 ata_scsi_scan_host+0x97/0x1d0 async_run_entry_fn+0x30/0x130 process_one_work+0x1e8/0x3c0 worker_thread+0x50/0x3b0 ? rescuer_thread+0x350/0x350 kthread+0x16b/0x190 ? set_kthread_struct+0x40/0x40 ret_from_fork+0x22/0x30 Mem-Info: ...... The above failure happened when calling kmalloc() to allocate buffer with GFP_DMA. 
It requests a slab page from the DMA zone although there are no managed
pages in it at all.

 sr_probe()
 --> get_capabilities()
     --> buffer = kmalloc(512, GFP_KERNEL | GFP_DMA);

This is because, in the current kernel, dma-kmalloc is created as long as
CONFIG_ZONE_DMA is enabled. However, the kdump kernel of x86_64 doesn't
have managed pages in the DMA zone since commit 6f599d84231f ("x86/kdump:
Always reserve the low 1M when the crashkernel option is specified"). The
failure can always be reproduced.

For now, let's mute the warning of allocation failure when pages are
requested from the DMA zone while it has no managed pages.

[akpm@linux-foundation.org: fix warning]
Link: https://lkml.kernel.org/r/20211223094435.248523-4-bhe@redhat.com
Fixes: 6f599d84231f ("x86/kdump: Always reserve the low 1M when the crashkernel option is specified")
Signed-off-by: Baoquan He
Acked-by: John Donnelly
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Christoph Lameter
Cc: Pekka Enberg
Cc: David Rientjes
Cc: Joonsoo Kim
Cc: Vlastimil Babka
Cc: Borislav Petkov
Cc: Christoph Hellwig
Cc: David Hildenbrand
Cc: David Laight
Cc: Marek Szyprowski
Cc: Robin Murphy
Cc:
Signed-off-by: Andrew Morton
---

 mm/page_alloc.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/mm/page_alloc.c~mm-page_allocc-do-not-warn-allocation-failure-on-zone-dma-if-no-managed-pages
+++ a/mm/page_alloc.c
@@ -4218,7 +4218,9 @@ void warn_alloc(gfp_t gfp_mask, nodemask
 	va_list args;
 	static DEFINE_RATELIMIT_STATE(nopage_rs, 10*HZ, 1);
 
-	if ((gfp_mask & __GFP_NOWARN) || !__ratelimit(&nopage_rs))
+	if ((gfp_mask & __GFP_NOWARN) ||
+	     !__ratelimit(&nopage_rs) ||
+	     ((gfp_mask & __GFP_DMA) && !has_managed_dma()))
 		return;
 
 	va_start(args, fmt);

From patchwork Fri Jan 14 22:07:48 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12714114
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id CBAE6C433EF for ; Fri, 14 Jan 2022 22:07:53 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 6468F6B0131; Fri, 14 Jan 2022 17:07:53 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 5CF236B0133; Fri, 14 Jan 2022 17:07:53 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4225A6B0134; Fri, 14 Jan 2022 17:07:53 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0250.hostedemail.com [216.40.44.250]) by kanga.kvack.org (Postfix) with ESMTP id A5DFA6B0131 for ; Fri, 14 Jan 2022 17:07:52 -0500 (EST) Received: from smtpin04.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 76CAC9675D for ; Fri, 14 Jan 2022 22:07:52 +0000 (UTC) X-FDA: 79030280784.04.BB200BD Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf08.hostedemail.com (Postfix) with ESMTP id EFCE7160003 for ; Fri, 14 Jan 2022 22:07:51 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id DA5BBB8262E; Fri, 14 Jan 2022 22:07:50 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 11AB3C36AE9; Fri, 14 Jan 2022 22:07:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198069; bh=qmiPWMZwTgA/BMJA7I+d9SQrNXt8PzR9//XGI0UIFRQ=; h=Date:From:To:Subject:In-Reply-To:From; b=Ubp4JbdoHHjlaZZXmtZpWALILW5CjG+MjIWNBAlOazj99yaZy/+WT9KTXB0eJMG9U /K+ynGYMm7AlaX63XFVxuglE++0xe7fYQ8I/tJl0YhVhP4SCxp5DlGYrwesWr7V3EJ mJAoEVRF4yoEg/BL25zkfTVMC8UIOm6GX+XTZ9MM= Date: Fri, 14 Jan 2022 14:07:48 -0800 From: Andrew Morton To: 
akpm@linux-foundation.org, almasrymina@google.com, cannonmatthews@google.com, colin.i.king@gmail.com, joannali@google.com, juew@google.com, keescook@chromium.org, linmiaohe@huawei.com, linux-mm@kvack.org, mhocko@suse.com, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, osalvador@suse.de, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, songmuchun@bytedance.com, torvalds@linux-foundation.org, ygyao@google.com Subject: [patch 088/146] hugetlb: add hugetlb.*.numa_stat file Message-ID: <20220114220748.yBUy4PjHh%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Stat-Signature: j5em1ckce86gxa66k95zn4s7m9ma3gy6 Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Ubp4Jbdo; dmarc=none; spf=pass (imf08.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: EFCE7160003 X-HE-Tag: 1642198071-280707 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Mina Almasry Subject: hugetlb: add hugetlb.*.numa_stat file For hugetlb backed jobs/VMs it's critical to understand the numa information for the memory backing these jobs to deliver optimal performance. Currently this technically can be queried from /proc/self/numa_maps, but there are significant issues with that. Namely: 1. Memory can be mapped or unmapped. 2. numa_maps are per process and need to be aggregated across all processes in the cgroup. For shared memory this is more involved as the userspace needs to make sure it doesn't double count shared mappings. 3. I believe querying numa_maps needs to hold the mmap_lock which adds to the contention on this lock. 
For these reasons I propose simply adding hugetlb.*.numa_stat file, which shows the numa information of the cgroup similarly to memory.numa_stat. On cgroup-v2: cat /sys/fs/cgroup/unified/test/hugetlb.2MB.numa_stat total=2097152 N0=2097152 N1=0 On cgroup-v1: cat /sys/fs/cgroup/hugetlb/test/hugetlb.2MB.numa_stat total=2097152 N0=2097152 N1=0 hierarichal_total=2097152 N0=2097152 N1=0 This patch was tested manually by allocating hugetlb memory and querying the hugetlb.*.numa_stat file of the cgroup and its parents. [colin.i.king@googlemail.com: fix spelling mistake "hierarichal" -> "hierarchical"] Link: https://lkml.kernel.org/r/20211125090635.23508-1-colin.i.king@gmail.com [keescook@chromium.org: fix copy/paste array assignment] Link: https://lkml.kernel.org/r/20211203065647.2819707-1-keescook@chromium.org Link: https://lkml.kernel.org/r/20211123001020.4083653-1-almasrymina@google.com Signed-off-by: Mina Almasry Signed-off-by: Colin Ian King Signed-off-by: Kees Cook Reviewed-by: Shakeel Butt Reviewed-by: Muchun Song Reviewed-by: Mike Kravetz Cc: Shuah Khan Cc: Miaohe Lin Cc: Oscar Salvador Cc: Michal Hocko Cc: David Rientjes Cc: Jue Wang Cc: Yang Yao Cc: Joanna Li Cc: Cannon Matthews Signed-off-by: Andrew Morton --- Documentation/admin-guide/cgroup-v1/hugetlb.rst | 4 Documentation/admin-guide/cgroup-v2.rst | 5 include/linux/hugetlb.h | 4 include/linux/hugetlb_cgroup.h | 7 mm/hugetlb_cgroup.c | 133 ++++++++++++-- 5 files changed, 141 insertions(+), 12 deletions(-) --- a/Documentation/admin-guide/cgroup-v1/hugetlb.rst~hugetlb-add-hugetlbnuma_stat-file +++ a/Documentation/admin-guide/cgroup-v1/hugetlb.rst @@ -29,12 +29,14 @@ Brief summary of control files:: hugetlb..max_usage_in_bytes # show max "hugepagesize" hugetlb usage recorded hugetlb..usage_in_bytes # show current usage for "hugepagesize" hugetlb hugetlb..failcnt # show the number of allocation failure due to HugeTLB usage limit + hugetlb..numa_stat # show the numa information of the hugetlb memory charged to this 
cgroup For a system supporting three hugepage sizes (64k, 32M and 1G), the control files include:: hugetlb.1GB.limit_in_bytes hugetlb.1GB.max_usage_in_bytes + hugetlb.1GB.numa_stat hugetlb.1GB.usage_in_bytes hugetlb.1GB.failcnt hugetlb.1GB.rsvd.limit_in_bytes @@ -43,6 +45,7 @@ files include:: hugetlb.1GB.rsvd.failcnt hugetlb.64KB.limit_in_bytes hugetlb.64KB.max_usage_in_bytes + hugetlb.64KB.numa_stat hugetlb.64KB.usage_in_bytes hugetlb.64KB.failcnt hugetlb.64KB.rsvd.limit_in_bytes @@ -51,6 +54,7 @@ files include:: hugetlb.64KB.rsvd.failcnt hugetlb.32MB.limit_in_bytes hugetlb.32MB.max_usage_in_bytes + hugetlb.32MB.numa_stat hugetlb.32MB.usage_in_bytes hugetlb.32MB.failcnt hugetlb.32MB.rsvd.limit_in_bytes --- a/Documentation/admin-guide/cgroup-v2.rst~hugetlb-add-hugetlbnuma_stat-file +++ a/Documentation/admin-guide/cgroup-v2.rst @@ -2266,6 +2266,11 @@ HugeTLB Interface Files are local to the cgroup i.e. not hierarchical. The file modified event generated on this file reflects only the local events. + hugetlb..numa_stat + Similar to memory.numa_stat, it shows the numa information of the + hugetlb pages of in this cgroup. Only active in + use hugetlb pages are included. The per-node values are in bytes. + Misc ---- --- a/include/linux/hugetlb_cgroup.h~hugetlb-add-hugetlbnuma_stat-file +++ a/include/linux/hugetlb_cgroup.h @@ -36,6 +36,11 @@ enum hugetlb_memory_event { HUGETLB_NR_MEMORY_EVENTS, }; +struct hugetlb_cgroup_per_node { + /* hugetlb usage in pages over all hstates. 
*/ + unsigned long usage[HUGE_MAX_HSTATE]; +}; + struct hugetlb_cgroup { struct cgroup_subsys_state css; @@ -57,6 +62,8 @@ struct hugetlb_cgroup { /* Handle for "hugetlb.events.local" */ struct cgroup_file events_local_file[HUGE_MAX_HSTATE]; + + struct hugetlb_cgroup_per_node *nodeinfo[]; }; static inline struct hugetlb_cgroup * --- a/include/linux/hugetlb.h~hugetlb-add-hugetlbnuma_stat-file +++ a/include/linux/hugetlb.h @@ -622,8 +622,8 @@ struct hstate { #endif #ifdef CONFIG_CGROUP_HUGETLB /* cgroup control files */ - struct cftype cgroup_files_dfl[7]; - struct cftype cgroup_files_legacy[9]; + struct cftype cgroup_files_dfl[8]; + struct cftype cgroup_files_legacy[10]; #endif char name[HSTATE_NAME_LEN]; }; --- a/mm/hugetlb_cgroup.c~hugetlb-add-hugetlbnuma_stat-file +++ a/mm/hugetlb_cgroup.c @@ -123,29 +123,58 @@ static void hugetlb_cgroup_init(struct h } } +static void hugetlb_cgroup_free(struct hugetlb_cgroup *h_cgroup) +{ + int node; + + for_each_node(node) + kfree(h_cgroup->nodeinfo[node]); + kfree(h_cgroup); +} + static struct cgroup_subsys_state * hugetlb_cgroup_css_alloc(struct cgroup_subsys_state *parent_css) { struct hugetlb_cgroup *parent_h_cgroup = hugetlb_cgroup_from_css(parent_css); struct hugetlb_cgroup *h_cgroup; + int node; + + h_cgroup = kzalloc(struct_size(h_cgroup, nodeinfo, nr_node_ids), + GFP_KERNEL); - h_cgroup = kzalloc(sizeof(*h_cgroup), GFP_KERNEL); if (!h_cgroup) return ERR_PTR(-ENOMEM); if (!parent_h_cgroup) root_h_cgroup = h_cgroup; + /* + * TODO: this routine can waste much memory for nodes which will + * never be onlined. It's better to use memory hotplug callback + * function. + */ + for_each_node(node) { + /* Set node_to_alloc to -1 for offline nodes. */ + int node_to_alloc = + node_state(node, N_NORMAL_MEMORY) ? 
node : -1; + h_cgroup->nodeinfo[node] = + kzalloc_node(sizeof(struct hugetlb_cgroup_per_node), + GFP_KERNEL, node_to_alloc); + if (!h_cgroup->nodeinfo[node]) + goto fail_alloc_nodeinfo; + } + hugetlb_cgroup_init(h_cgroup, parent_h_cgroup); return &h_cgroup->css; + +fail_alloc_nodeinfo: + hugetlb_cgroup_free(h_cgroup); + return ERR_PTR(-ENOMEM); } static void hugetlb_cgroup_css_free(struct cgroup_subsys_state *css) { - struct hugetlb_cgroup *h_cgroup; - - h_cgroup = hugetlb_cgroup_from_css(css); - kfree(h_cgroup); + hugetlb_cgroup_free(hugetlb_cgroup_from_css(css)); } /* @@ -289,7 +318,17 @@ static void __hugetlb_cgroup_commit_char return; __set_hugetlb_cgroup(page, h_cg, rsvd); - return; + if (!rsvd) { + unsigned long usage = + h_cg->nodeinfo[page_to_nid(page)]->usage[idx]; + /* + * This write is not atomic due to fetching usage and writing + * to it, but that's fine because we call this with + * hugetlb_lock held anyway. + */ + WRITE_ONCE(h_cg->nodeinfo[page_to_nid(page)]->usage[idx], + usage + nr_pages); + } } void hugetlb_cgroup_commit_charge(int idx, unsigned long nr_pages, @@ -328,8 +367,17 @@ static void __hugetlb_cgroup_uncharge_pa if (rsvd) css_put(&h_cg->css); - - return; + else { + unsigned long usage = + h_cg->nodeinfo[page_to_nid(page)]->usage[idx]; + /* + * This write is not atomic due to fetching usage and writing + * to it, but that's fine because we call this with + * hugetlb_lock held anyway. 
+ */ + WRITE_ONCE(h_cg->nodeinfo[page_to_nid(page)]->usage[idx], + usage - nr_pages); + } } void hugetlb_cgroup_uncharge_page(int idx, unsigned long nr_pages, @@ -418,6 +466,59 @@ enum { RES_RSVD_FAILCNT, }; +static int hugetlb_cgroup_read_numa_stat(struct seq_file *seq, void *dummy) +{ + int nid; + struct cftype *cft = seq_cft(seq); + int idx = MEMFILE_IDX(cft->private); + bool legacy = MEMFILE_ATTR(cft->private); + struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_css(seq_css(seq)); + struct cgroup_subsys_state *css; + unsigned long usage; + + if (legacy) { + /* Add up usage across all nodes for the non-hierarchical total. */ + usage = 0; + for_each_node_state(nid, N_MEMORY) + usage += READ_ONCE(h_cg->nodeinfo[nid]->usage[idx]); + seq_printf(seq, "total=%lu", usage * PAGE_SIZE); + + /* Simply print the per-node usage for the non-hierarchical total. */ + for_each_node_state(nid, N_MEMORY) + seq_printf(seq, " N%d=%lu", nid, + READ_ONCE(h_cg->nodeinfo[nid]->usage[idx]) * + PAGE_SIZE); + seq_putc(seq, '\n'); + } + + /* + * The hierarchical total is pretty much the value recorded by the + * counter, so use that. + */ + seq_printf(seq, "%stotal=%lu", legacy ? "hierarchical_" : "", + page_counter_read(&h_cg->hugepage[idx]) * PAGE_SIZE); + + /* + * For each node, transverse the css tree to obtain the hierarchical + * node usage. 
+ */ + for_each_node_state(nid, N_MEMORY) { + usage = 0; + rcu_read_lock(); + css_for_each_descendant_pre(css, &h_cg->css) { + usage += READ_ONCE(hugetlb_cgroup_from_css(css) + ->nodeinfo[nid] + ->usage[idx]); + } + rcu_read_unlock(); + seq_printf(seq, " N%d=%lu", nid, usage * PAGE_SIZE); + } + + seq_putc(seq, '\n'); + + return 0; +} + static u64 hugetlb_cgroup_read_u64(struct cgroup_subsys_state *css, struct cftype *cft) { @@ -668,8 +769,14 @@ static void __init __hugetlb_cgroup_file events_local_file[idx]); cft->flags = CFTYPE_NOT_ON_ROOT; - /* NULL terminate the last cft */ + /* Add the numa stat file */ cft = &h->cgroup_files_dfl[6]; + snprintf(cft->name, MAX_CFTYPE_NAME, "%s.numa_stat", buf); + cft->seq_show = hugetlb_cgroup_read_numa_stat; + cft->flags = CFTYPE_NOT_ON_ROOT; + + /* NULL terminate the last cft */ + cft = &h->cgroup_files_dfl[7]; memset(cft, 0, sizeof(*cft)); WARN_ON(cgroup_add_dfl_cftypes(&hugetlb_cgrp_subsys, @@ -739,8 +846,14 @@ static void __init __hugetlb_cgroup_file cft->write = hugetlb_cgroup_reset; cft->read_u64 = hugetlb_cgroup_read_u64; - /* NULL terminate the last cft */ + /* Add the numa stat file */ cft = &h->cgroup_files_legacy[8]; + snprintf(cft->name, MAX_CFTYPE_NAME, "%s.numa_stat", buf); + cft->private = MEMFILE_PRIVATE(idx, 1); + cft->seq_show = hugetlb_cgroup_read_numa_stat; + + /* NULL terminate the last cft */ + cft = &h->cgroup_files_legacy[9]; memset(cft, 0, sizeof(*cft)); WARN_ON(cgroup_add_legacy_cftypes(&hugetlb_cgrp_subsys, From patchwork Fri Jan 14 22:07:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714115 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 79916C433EF for ; Fri, 14 Jan 2022 22:07:56 +0000 (UTC) Received: by 
kanga.kvack.org (Postfix) id 09E916B0133; Fri, 14 Jan 2022 17:07:56 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 04EEC6B0135; Fri, 14 Jan 2022 17:07:55 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E34786B0136; Fri, 14 Jan 2022 17:07:55 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0190.hostedemail.com [216.40.44.190]) by kanga.kvack.org (Postfix) with ESMTP id D2C206B0133 for ; Fri, 14 Jan 2022 17:07:55 -0500 (EST) Received: from smtpin20.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 92A96998B1 for ; Fri, 14 Jan 2022 22:07:55 +0000 (UTC) X-FDA: 79030280910.20.61B07A3 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf18.hostedemail.com (Postfix) with ESMTP id 20BA61C0009 for ; Fri, 14 Jan 2022 22:07:54 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 1CF7BB82A26; Fri, 14 Jan 2022 22:07:54 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8AEF0C36AE9; Fri, 14 Jan 2022 22:07:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198072; bh=LLVBL4zoC2wVFwmXEE3n+SJftiNf+BT088LprLM5pXs=; h=Date:From:To:Subject:In-Reply-To:From; b=Pi7YD9T0/f6r3tSxBn03y7yon5Jt9GY22G/UkZpftFH97Ln9A0CmZ88kTQZIjv5p9 gj28EcxZLOYRTRntZt6xoMOI3tn+T2q3ao2zsQ5GNTWm6jQOALICQIPu8qppzcy0Uo H0sePDsTRvyZw+7NUREXSdT64QJcRK9lNzSn/DyU= Date: Fri, 14 Jan 2022 14:07:52 -0800 From: Andrew Morton To: akpm@linux-foundation.org, almasrymina@google.com, linux-mm@kvack.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, shuah@kernel.org, torvalds@linux-foundation.org, yosryahmed@google.com Subject: [patch 089/146] mm, 
hugepages: make memory size variable in hugepage-mremap selftest Message-ID: <20220114220752._hNtbvxep%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 20BA61C0009 X-Stat-Signature: 3fk6x13wqezew3k3bzen7k4ze66ins91 Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Pi7YD9T0; spf=pass (imf18.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-Rspamd-Server: rspam07 X-HE-Tag: 1642198074-910858 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Yosry Ahmed Subject: mm, hugepages: make memory size variable in hugepage-mremap selftest The hugetlb vma mremap() test currently maps 1GB of memory to trigger pmd sharing and make sure that the 'unshare' path in the mremap code works. The test originally only mapped 10MB of memory (as specified by the header comment) but was later modified to 1GB to tackle this case. However, not all machines will have 1GB of memory to spare for this test. Adding a mapping size arg will allow run_vmtests.sh to pass an adequate mapping size, while allowing users to run the test independently with arbitrary size mappings.
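As a rough illustration of the size argument described above, the sketch below parses an optional size in MB and falls back to 10MB when the argument is missing or unusable. The helper name mapping_length_from_args is hypothetical. It uses strtol() instead of the atoi() call in the patch so that trailing junk, negative values and overflow can be rejected, and it parenthesizes the macro argument. It is an assumption-laden variant for discussion, not the code in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEFAULT_LENGTH_MB 10UL
/* Parenthesize the argument so MB_TO_BYTES(a + b) expands as intended. */
#define MB_TO_BYTES(x) ((size_t)(x) * 1024 * 1024)

/*
 * Hypothetical helper (not in the patch): return the mapping size in bytes
 * taken from argv[1], or the default when argv[1] is absent or invalid.
 */
static size_t mapping_length_from_args(int argc, char *argv[])
{
	long mb = 0;

	assert(argc >= 1 && argv != NULL);

	if (argc > 1) {
		char *end = NULL;

		errno = 0;
		mb = strtol(argv[1], &end, 10);
		/* Reject empty input, trailing junk, non-positive and huge values. */
		if (errno != 0 || end == argv[1] || *end != '\0' || mb <= 0 ||
		    (unsigned long)mb > SIZE_MAX / (1024 * 1024))
			mb = 0;
	}
	if (mb == 0)
		mb = (long)DEFAULT_LENGTH_MB;

	return MB_TO_BYTES(mb);
}
```

With this variant, an invocation such as "./hugepage-mremap 256" would map 256MB, while "./hugepage-mremap abc" would silently use the 10MB default, which matches the fallback behaviour the patch describes.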
Link: https://lkml.kernel.org/r/20211124203805.3700355-1-yosryahmed@google.com Signed-off-by: Yosry Ahmed Cc: Shuah Khan Cc: Mina Almasry Cc: Mike Kravetz Signed-off-by: Andrew Morton --- tools/testing/selftests/vm/hugepage-mremap.c | 46 +++++++++++------ tools/testing/selftests/vm/run_vmtests.sh | 2 2 files changed, 31 insertions(+), 17 deletions(-) --- a/tools/testing/selftests/vm/hugepage-mremap.c~mm-hugepages-make-memory-size-variable-in-hugepage-mremap-selftest +++ a/tools/testing/selftests/vm/hugepage-mremap.c @@ -4,7 +4,11 @@ * * Example of remapping huge page memory in a user application using the * mremap system call. Code assumes a hugetlbfs filesystem is mounted - * at './huge'. The code will use 10MB worth of huge pages. + * at './huge'. The amount of memory used by this test is decided by a command + * line argument in MBs. If missing, the default amount is 10MB. + * + * To make sure the test triggers pmd sharing and goes through the 'unshare' + * path in the mremap code use 1GB (1024) or more. 
*/ #define _GNU_SOURCE @@ -18,8 +22,10 @@ #include #include -#define LENGTH (1UL * 1024 * 1024 * 1024) +#define DEFAULT_LENGTH_MB 10UL +#define MB_TO_BYTES(x) (x * 1024 * 1024) +#define FILE_NAME "huge/hugepagefile" #define PROTECTION (PROT_READ | PROT_WRITE | PROT_EXEC) #define FLAGS (MAP_SHARED | MAP_ANONYMOUS) @@ -28,20 +34,20 @@ static void check_bytes(char *addr) printf("First hex is %x\n", *((unsigned int *)addr)); } -static void write_bytes(char *addr) +static void write_bytes(char *addr, size_t len) { unsigned long i; - for (i = 0; i < LENGTH; i++) + for (i = 0; i < len; i++) *(addr + i) = (char)i; } -static int read_bytes(char *addr) +static int read_bytes(char *addr, size_t len) { unsigned long i; check_bytes(addr); - for (i = 0; i < LENGTH; i++) + for (i = 0; i < len; i++) if (*(addr + i) != (char)i) { printf("Mismatch at %lu\n", i); return 1; @@ -99,11 +105,19 @@ static void register_region_with_uffd(ch } } -int main(void) +int main(int argc, char *argv[]) { + /* Read memory length as the first arg if valid, otherwise fallback to + * the default length. Any additional args are ignored. + */ + size_t length = argc > 1 ? (size_t)atoi(argv[1]) : 0UL; + + length = length > 0 ? length : DEFAULT_LENGTH_MB; + length = MB_TO_BYTES(length); + int ret = 0; - int fd = open("/huge/test", O_CREAT | O_RDWR, 0755); + int fd = open(FILE_NAME, O_CREAT | O_RDWR, 0755); if (fd < 0) { perror("Open failed"); @@ -112,7 +126,7 @@ int main(void) /* mmap to a PUD aligned address to hopefully trigger pmd sharing. */ unsigned long suggested_addr = 0x7eaa40000000; - void *haddr = mmap((void *)suggested_addr, LENGTH, PROTECTION, + void *haddr = mmap((void *)suggested_addr, length, PROTECTION, MAP_HUGETLB | MAP_SHARED | MAP_POPULATE, fd, 0); printf("Map haddr: Returned address is %p\n", haddr); if (haddr == MAP_FAILED) { @@ -122,7 +136,7 @@ int main(void) /* mmap again to a dummy address to hopefully trigger pmd sharing. 
*/ suggested_addr = 0x7daa40000000; - void *daddr = mmap((void *)suggested_addr, LENGTH, PROTECTION, + void *daddr = mmap((void *)suggested_addr, length, PROTECTION, MAP_HUGETLB | MAP_SHARED | MAP_POPULATE, fd, 0); printf("Map daddr: Returned address is %p\n", daddr); if (daddr == MAP_FAILED) { @@ -132,16 +146,16 @@ int main(void) suggested_addr = 0x7faa40000000; void *vaddr = - mmap((void *)suggested_addr, LENGTH, PROTECTION, FLAGS, -1, 0); + mmap((void *)suggested_addr, length, PROTECTION, FLAGS, -1, 0); printf("Map vaddr: Returned address is %p\n", vaddr); if (vaddr == MAP_FAILED) { perror("mmap2"); exit(1); } - register_region_with_uffd(haddr, LENGTH); + register_region_with_uffd(haddr, length); - void *addr = mremap(haddr, LENGTH, LENGTH, + void *addr = mremap(haddr, length, length, MREMAP_MAYMOVE | MREMAP_FIXED, vaddr); if (addr == MAP_FAILED) { perror("mremap"); @@ -150,10 +164,10 @@ int main(void) printf("Mremap: Returned address is %p\n", addr); check_bytes(addr); - write_bytes(addr); - ret = read_bytes(addr); + write_bytes(addr, length); + ret = read_bytes(addr, length); - munmap(addr, LENGTH); + munmap(addr, length); return ret; } --- a/tools/testing/selftests/vm/run_vmtests.sh~mm-hugepages-make-memory-size-variable-in-hugepage-mremap-selftest +++ a/tools/testing/selftests/vm/run_vmtests.sh @@ -111,7 +111,7 @@ fi echo "-----------------------" echo "running hugepage-mremap" echo "-----------------------" -./hugepage-mremap +./hugepage-mremap 256 if [ $? 
-ne 0 ]; then echo "[FAIL]" exitcode=1 From patchwork Fri Jan 14 22:07:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714116 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id A9B09C433F5 for ; Fri, 14 Jan 2022 22:07:59 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 3FB276B0135; Fri, 14 Jan 2022 17:07:59 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 3AA9C6B0137; Fri, 14 Jan 2022 17:07:59 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2719B6B0138; Fri, 14 Jan 2022 17:07:59 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0003.hostedemail.com [216.40.44.3]) by kanga.kvack.org (Postfix) with ESMTP id 16C156B0135 for ; Fri, 14 Jan 2022 17:07:59 -0500 (EST) Received: from smtpin08.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id CB53E18272F05 for ; Fri, 14 Jan 2022 22:07:58 +0000 (UTC) X-FDA: 79030281036.08.6E00C3A Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf17.hostedemail.com (Postfix) with ESMTP id 394AD40006 for ; Fri, 14 Jan 2022 22:07:58 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 3D9DFB825F5; Fri, 14 Jan 2022 22:07:57 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A7B9FC36AE9; Fri, 14 Jan 2022 22:07:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198076; 
bh=8H8KZ2Pu6BX3OqoLt3oCpnmieTMWW/dKlmrxjQ9dGGw=; h=Date:From:To:Subject:In-Reply-To:From; b=OnvMBu9oQJHghHu5CooY7kO9z/0ZGJwjUF/qxJkqPhHl6tNgc41G4ELuawoPQMsUA bBhtsPfG0HlvmQvMhySWtbp7GgFfU+ripy7XuB02maTntNsYFI8YH7ztBozSrdJb92 4nUtSgltgFI14fdYoQ9Qh3pklA6Vftuks3N65Rig= Date: Fri, 14 Jan 2022 14:07:55 -0800 From: Andrew Morton To: akpm@linux-foundation.org, dave.hansen@linux.intel.com, linux-mm@kvack.org, mike.kravetz@oracle.com, minchan@kernel.org, mm-commits@vger.kernel.org, saravanand@fb.com, torvalds@linux-foundation.org, yang.yang29@zte.com.cn, ying.huang@intel.com Subject: [patch 090/146] mm/vmstat: add events for THP max_ptes_* exceeds Message-ID: <20220114220755.QzewFdx6k%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 394AD40006 X-Stat-Signature: sy3z6o9hq9z9h3bmwjum7dzzw8imsd51 Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=OnvMBu9o; dmarc=none; spf=pass (imf17.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam08 X-HE-Tag: 1642198078-495535 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Yang Yang Subject: mm/vmstat: add events for THP max_ptes_* exceeds There are interfaces to adjust the max_ptes_none, max_ptes_swap and max_ptes_shared values; see /sys/kernel/mm/transparent_hugepage/khugepaged/. But a system administrator may not know which values are best. Add these events to help tune max_ptes_* to suitable values. For example, if the default max_ptes_swap value causes too many failures and the system uses zram, whose I/O is fast, the administrator could increase max_ptes_swap until THP_SCAN_EXCEED_SWAP_PTE stops increasing.
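A minimal sketch of how a monitoring tool might pick one of the new counters out of /proc/vmstat-style text, assuming the usual "name value" layout with one counter per line. The helper vmstat_lookup is hypothetical and works on an in-memory buffer; reading the file is left to the caller. Note that this patch exports the shared-PTE counter under the string thp_scan_exceed_share_pte, so that is the name a lookup would have to use.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): find the line "name value"
 * in vmstat-formatted text and store the value. Returns 0 on success and
 * -1 if the counter is absent or its value cannot be parsed.
 */
static int vmstat_lookup(const char *text, const char *name,
			 unsigned long long *value)
{
	size_t name_len = strlen(name);
	const char *line = text;

	while (line && *line) {
		/* Require the full counter name followed by a space. */
		if (strncmp(line, name, name_len) == 0 &&
		    line[name_len] == ' ') {
			if (sscanf(line + name_len, "%llu", value) == 1)
				return 0;
			return -1;
		}
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

Sampling thp_scan_exceed_swap_pte before and after raising max_ptes_swap, and comparing the two values, is one way to follow the tuning procedure the commit message describes.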
Link: https://lkml.kernel.org/r/20211225094036.574157-1-yang.yang29@zte.com.cn Signed-off-by: Yang Yang Cc: "Huang, Ying" Cc: Dave Hansen Cc: Minchan Kim Cc: Saravanan D Cc: Mike Kravetz Signed-off-by: Andrew Morton --- include/linux/vm_event_item.h | 3 +++ mm/khugepaged.c | 7 +++++++ mm/vmstat.c | 3 +++ 3 files changed, 13 insertions(+) --- a/include/linux/vm_event_item.h~mm-vmstat-add-events-for-thp-max_ptes_-exceeds +++ a/include/linux/vm_event_item.h @@ -98,6 +98,9 @@ enum vm_event_item { PGPGIN, PGPGOUT, PS THP_SPLIT_PAGE_FAILED, THP_DEFERRED_SPLIT_PAGE, THP_SPLIT_PMD, + THP_SCAN_EXCEED_NONE_PTE, + THP_SCAN_EXCEED_SWAP_PTE, + THP_SCAN_EXCEED_SHARED_PTE, #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD THP_SPLIT_PUD, #endif --- a/mm/khugepaged.c~mm-vmstat-add-events-for-thp-max_ptes_-exceeds +++ a/mm/khugepaged.c @@ -618,6 +618,7 @@ static int __collapse_huge_page_isolate( continue; } else { result = SCAN_EXCEED_NONE_PTE; + count_vm_event(THP_SCAN_EXCEED_NONE_PTE); goto out; } } @@ -636,6 +637,7 @@ static int __collapse_huge_page_isolate( if (page_mapcount(page) > 1 && ++shared > khugepaged_max_ptes_shared) { result = SCAN_EXCEED_SHARED_PTE; + count_vm_event(THP_SCAN_EXCEED_SHARED_PTE); goto out; } @@ -1253,6 +1255,7 @@ static int khugepaged_scan_pmd(struct mm continue; } else { result = SCAN_EXCEED_SWAP_PTE; + count_vm_event(THP_SCAN_EXCEED_SWAP_PTE); goto out_unmap; } } @@ -1262,6 +1265,7 @@ static int khugepaged_scan_pmd(struct mm continue; } else { result = SCAN_EXCEED_NONE_PTE; + count_vm_event(THP_SCAN_EXCEED_NONE_PTE); goto out_unmap; } } @@ -1290,6 +1294,7 @@ static int khugepaged_scan_pmd(struct mm if (page_mapcount(page) > 1 && ++shared > khugepaged_max_ptes_shared) { result = SCAN_EXCEED_SHARED_PTE; + count_vm_event(THP_SCAN_EXCEED_SHARED_PTE); goto out_unmap; } @@ -2000,6 +2005,7 @@ static void khugepaged_scan_file(struct if (xa_is_value(page)) { if (++swap > khugepaged_max_ptes_swap) { result = SCAN_EXCEED_SWAP_PTE; + 
count_vm_event(THP_SCAN_EXCEED_SWAP_PTE); break; } continue; @@ -2046,6 +2052,7 @@ static void khugepaged_scan_file(struct if (result == SCAN_SUCCEED) { if (present < HPAGE_PMD_NR - khugepaged_max_ptes_none) { result = SCAN_EXCEED_NONE_PTE; + count_vm_event(THP_SCAN_EXCEED_NONE_PTE); } else { node = khugepaged_find_target_node(); collapse_file(mm, file, start, hpage, node); --- a/mm/vmstat.c~mm-vmstat-add-events-for-thp-max_ptes_-exceeds +++ a/mm/vmstat.c @@ -1353,6 +1353,9 @@ const char * const vmstat_text[] = { "thp_split_page_failed", "thp_deferred_split_page", "thp_split_pmd", + "thp_scan_exceed_none_pte", + "thp_scan_exceed_swap_pte", + "thp_scan_exceed_share_pte", #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD "thp_split_pud", #endif From patchwork Fri Jan 14 22:07:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714117 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2711BC433F5 for ; Fri, 14 Jan 2022 22:08:02 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id BC2A16B0137; Fri, 14 Jan 2022 17:08:01 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id B719B6B0139; Fri, 14 Jan 2022 17:08:01 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A604D6B013A; Fri, 14 Jan 2022 17:08:01 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0135.hostedemail.com [216.40.44.135]) by kanga.kvack.org (Postfix) with ESMTP id 95CD56B0137 for ; Fri, 14 Jan 2022 17:08:01 -0500 (EST) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 5A183998B4 for ; Fri, 14 Jan 2022 22:08:01 +0000 (UTC) 
X-FDA: 79030281162.07.200EBC1 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf09.hostedemail.com (Postfix) with ESMTP id 6FDFF14000A for ; Fri, 14 Jan 2022 22:08:00 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 7854BB8262E; Fri, 14 Jan 2022 22:07:59 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E2DB4C36AE5; Fri, 14 Jan 2022 22:07:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198079; bh=5kea9z7oh8Qbchd+POtd5OUJCwummpuralZ14j4Y2GQ=; h=Date:From:To:Subject:In-Reply-To:From; b=FSjTrcd9uwqHo0kyGhC7bbRjLpUtuKBUmTMwEy6yBbs/GVbG/48317LHJJbMY75e7 xYd8LEHkNX673vVc8wtMzjhihRUjJszsc5ClNUkq1scxl6BFI4G1eNBWkuEkjxlMbH uBm8e7X91TkBO3p0WXNWq1g9qBTotUQ1bCX0YDGc= Date: Fri, 14 Jan 2022 14:07:58 -0800 From: Andrew Morton To: akpm@linux-foundation.org, almasrymina@google.com, linux-mm@kvack.org, longman@redhat.com, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, shuah@kernel.org, torvalds@linux-foundation.org Subject: [patch 091/146] selftests/vm: make charge_reserved_hugetlb.sh work with existing cgroup setting Message-ID: <20220114220758.tfOfW645l%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 6FDFF14000A X-Stat-Signature: 1dormhpt49oj6iwndeupy6ugxcsuujia Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=FSjTrcd9; dmarc=none; spf=pass (imf09.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-HE-Tag: 1642198080-248603 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: 
owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Waiman Long Subject: selftests/vm: make charge_reserved_hugetlb.sh work with existing cgroup setting The hugetlb cgroup reservation test charge_reserved_hugetlb.sh assumes that no cgroup filesystems are mounted before running the test. That is not true in many cases. As a result, the test fails to run. Fix that by querying the current cgroup mount setting and using the existing cgroup setup, mounting a fresh cgroup filesystem only if none is found. A similar change is made for hugetlb_reparenting_test.sh, though it still has problems if cgroup v2 isn't used. The patched test scripts were run on a CentOS 8 based system to verify that they ran properly. Link: https://lkml.kernel.org/r/20220106201359.1646575-1-longman@redhat.com Fixes: 29750f71a9b4 ("hugetlb_cgroup: add hugetlb_cgroup reservation tests") Signed-off-by: Waiman Long Acked-by: Mina Almasry Cc: Shuah Khan Cc: Mike Kravetz Signed-off-by: Andrew Morton --- tools/testing/selftests/vm/charge_reserved_hugetlb.sh | 34 +++++----- tools/testing/selftests/vm/hugetlb_reparenting_test.sh | 21 +++--- tools/testing/selftests/vm/write_hugetlb_memory.sh | 2 3 files changed, 34 insertions(+), 23 deletions(-) --- a/tools/testing/selftests/vm/charge_reserved_hugetlb.sh~selftests-vm-make-charge_reserved_hugetlbsh-work-with-existing-cgroup-setting +++ a/tools/testing/selftests/vm/charge_reserved_hugetlb.sh @@ -24,19 +24,23 @@ if [[ "$1" == "-cgroup-v2" ]]; then reservation_usage_file=rsvd.current fi -cgroup_path=/dev/cgroup/memory -if [[ ! 
-e $cgroup_path ]]; then - mkdir -p $cgroup_path - if [[ $cgroup2 ]]; then +if [[ $cgroup2 ]]; then + cgroup_path=$(mount -t cgroup2 | head -1 | awk -e '{print $3}') + if [[ -z "$cgroup_path" ]]; then + cgroup_path=/dev/cgroup/memory mount -t cgroup2 none $cgroup_path - else + do_umount=1 + fi + echo "+hugetlb" >$cgroup_path/cgroup.subtree_control +else + cgroup_path=$(mount -t cgroup | grep ",hugetlb" | awk -e '{print $3}') + if [[ -z "$cgroup_path" ]]; then + cgroup_path=/dev/cgroup/memory mount -t cgroup memory,hugetlb $cgroup_path + do_umount=1 fi fi - -if [[ $cgroup2 ]]; then - echo "+hugetlb" >/dev/cgroup/memory/cgroup.subtree_control -fi +export cgroup_path function cleanup() { if [[ $cgroup2 ]]; then @@ -108,7 +112,7 @@ function setup_cgroup() { function wait_for_hugetlb_memory_to_get_depleted() { local cgroup="$1" - local path="/dev/cgroup/memory/$cgroup/hugetlb.${MB}MB.$reservation_usage_file" + local path="$cgroup_path/$cgroup/hugetlb.${MB}MB.$reservation_usage_file" # Wait for hugetlbfs memory to get depleted. while [ $(cat $path) != 0 ]; do echo Waiting for hugetlb memory to get depleted. @@ -121,7 +125,7 @@ function wait_for_hugetlb_memory_to_get_ local cgroup="$1" local size="$2" - local path="/dev/cgroup/memory/$cgroup/hugetlb.${MB}MB.$reservation_usage_file" + local path="$cgroup_path/$cgroup/hugetlb.${MB}MB.$reservation_usage_file" # Wait for hugetlbfs memory to get written. while [ $(cat $path) != $size ]; do echo Waiting for hugetlb memory reservation to reach size $size. @@ -134,7 +138,7 @@ function wait_for_hugetlb_memory_to_get_ local cgroup="$1" local size="$2" - local path="/dev/cgroup/memory/$cgroup/hugetlb.${MB}MB.$fault_usage_file" + local path="$cgroup_path/$cgroup/hugetlb.${MB}MB.$fault_usage_file" # Wait for hugetlbfs memory to get written. while [ $(cat $path) != $size ]; do echo Waiting for hugetlb memory to reach size $size. 
@@ -574,5 +578,7 @@ for populate in "" "-o"; do done # populate done # method -umount $cgroup_path -rmdir $cgroup_path +if [[ $do_umount ]]; then + umount $cgroup_path + rmdir $cgroup_path +fi --- a/tools/testing/selftests/vm/hugetlb_reparenting_test.sh~selftests-vm-make-charge_reserved_hugetlbsh-work-with-existing-cgroup-setting +++ a/tools/testing/selftests/vm/hugetlb_reparenting_test.sh @@ -18,19 +18,24 @@ if [[ "$1" == "-cgroup-v2" ]]; then usage_file=current fi -CGROUP_ROOT='/dev/cgroup/memory' -MNT='/mnt/huge/' -if [[ ! -e $CGROUP_ROOT ]]; then - mkdir -p $CGROUP_ROOT - if [[ $cgroup2 ]]; then +if [[ $cgroup2 ]]; then + CGROUP_ROOT=$(mount -t cgroup2 | head -1 | awk -e '{print $3}') + if [[ -z "$CGROUP_ROOT" ]]; then + CGROUP_ROOT=/dev/cgroup/memory mount -t cgroup2 none $CGROUP_ROOT - sleep 1 - echo "+hugetlb +memory" >$CGROUP_ROOT/cgroup.subtree_control - else + do_umount=1 + fi + echo "+hugetlb +memory" >$CGROUP_ROOT/cgroup.subtree_control +else + CGROUP_ROOT=$(mount -t cgroup | grep ",hugetlb" | awk -e '{print $3}') + if [[ -z "$CGROUP_ROOT" ]]; then + CGROUP_ROOT=/dev/cgroup/memory mount -t cgroup memory,hugetlb $CGROUP_ROOT + do_umount=1 fi fi +MNT='/mnt/huge/' function get_machine_hugepage_size() { hpz=$(grep -i hugepagesize /proc/meminfo) --- a/tools/testing/selftests/vm/write_hugetlb_memory.sh~selftests-vm-make-charge_reserved_hugetlbsh-work-with-existing-cgroup-setting +++ a/tools/testing/selftests/vm/write_hugetlb_memory.sh @@ -14,7 +14,7 @@ want_sleep=$8 reserve=$9 echo "Putting task in cgroup '$cgroup'" -echo $$ > /dev/cgroup/memory/"$cgroup"/cgroup.procs +echo $$ > ${cgroup_path:-/dev/cgroup/memory}/"$cgroup"/cgroup.procs echo "Method is $method" From patchwork Fri Jan 14 22:08:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714118 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3516CC433F5 for ; Fri, 14 Jan 2022 22:08:08 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C38186B0139; Fri, 14 Jan 2022 17:08:07 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id BE8CE6B013B; Fri, 14 Jan 2022 17:08:07 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A87F56B013C; Fri, 14 Jan 2022 17:08:07 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0044.hostedemail.com [216.40.44.44]) by kanga.kvack.org (Postfix) with ESMTP id 95B256B0139 for ; Fri, 14 Jan 2022 17:08:07 -0500 (EST) Received: from smtpin08.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 5832190079 for ; Fri, 14 Jan 2022 22:08:07 +0000 (UTC) X-FDA: 79030281414.08.640C0CD Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf11.hostedemail.com (Postfix) with ESMTP id BC3ED40011 for ; Fri, 14 Jan 2022 22:08:06 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id DAC6CCE19A9; Fri, 14 Jan 2022 22:08:03 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id F1C3EC36AE9; Fri, 14 Jan 2022 22:08:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198082; bh=U69HfuOa+YONiguT5W5kkfMrpQWWUFeHxAKhMreztxo=; h=Date:From:To:Subject:In-Reply-To:From; b=Bt70inFZBzwoQZaepBGeIemirUALYdtcFE/mem722BeEPMuciqd1rhw8kSECo7FXl WnlRwReGKgVKHUHtQ+57GhSv+KVBfYG8d8mN6omgKwtdbOV0FDjGFtwN7BKBpq66Jf TxK3QrEA7w+tg7jnzg9PU/PECH7IYHKrzTaiEwIg= Date: Fri, 14 Jan 2022 14:08:01 -0800 From: Andrew Morton To: 
aarcange@redhat.com, akpm@linux-foundation.org, axelrasmussen@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, nadav.amit@gmail.com, peterx@redhat.com, torvalds@linux-foundation.org Subject: [patch 092/146] selftests/uffd: allow EINTR/EAGAIN Message-ID: <20220114220801.vRTCklHUU%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: BC3ED40011 X-Stat-Signature: du1b6eife7e7qni7i44ztwtx4rzwqaeq Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Bt70inFZ; dmarc=none; spf=pass (imf11.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam06 X-HE-Tag: 1642198086-380228 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Peter Xu Subject: selftests/uffd: allow EINTR/EAGAIN This allows the test to continue when it is interrupted, for example by gdb.
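The retry idea can be sketched in isolation as below. The wrapper name poll_retry is hypothetical and is not what the patch adds; the patch open-codes the check inside the existing loop. One assumption differs from the patch: this sketch only consults errno when poll() returns a negative value, since a return of 0 means a timeout and does not set errno.

```c
#include <errno.h>
#include <poll.h>

/*
 * Hypothetical wrapper (not in the patch): call poll() again when it was
 * interrupted by a signal (EINTR) or asked to be retried (EAGAIN).
 * Returns whatever poll() returns once it is not a transient failure.
 */
static int poll_retry(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
	int ret;

	do {
		ret = poll(fds, nfds, timeout_ms);
	} while (ret < 0 && (errno == EINTR || errno == EAGAIN));

	return ret;
}
```

Attaching a debugger typically delivers signals to the traced process, which is one way a blocking poll() or read() can fail with EINTR even though nothing is wrong with the file descriptor.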
Link: https://lkml.kernel.org/r/20211115135219.85881-1-peterx@redhat.com Signed-off-by: Peter Xu Reviewed-by: Axel Rasmussen Cc: Andrea Arcangeli Cc: Nadav Amit Signed-off-by: Andrew Morton --- tools/testing/selftests/vm/userfaultfd.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) --- a/tools/testing/selftests/vm/userfaultfd.c~selftests-uffd-allow-eintr-eagain +++ a/tools/testing/selftests/vm/userfaultfd.c @@ -648,7 +648,7 @@ static int uffd_read_msg(int ufd, struct if (ret != sizeof(*msg)) { if (ret < 0) { - if (errno == EAGAIN) + if (errno == EAGAIN || errno == EINTR) return 1; err("blocking read error"); } else { @@ -724,8 +724,11 @@ static void *uffd_poll_thread(void *arg) for (;;) { ret = poll(pollfd, 2, -1); - if (ret <= 0) + if (ret <= 0) { + if (errno == EINTR || errno == EAGAIN) + continue; err("poll error: %d", ret); + } if (pollfd[1].revents & POLLIN) { if (read(pollfd[1].fd, &tmp_chr, 1) != 1) err("read pipefd error"); From patchwork Fri Jan 14 22:08:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714119 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 49D36C4332F for ; Fri, 14 Jan 2022 22:08:09 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B74576B013B; Fri, 14 Jan 2022 17:08:08 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id B239E6B013D; Fri, 14 Jan 2022 17:08:08 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9C5A36B013E; Fri, 14 Jan 2022 17:08:08 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0253.hostedemail.com [216.40.44.253]) by kanga.kvack.org (Postfix) with ESMTP id 8453B6B013B for ; Fri, 14 Jan 2022 
17:08:08 -0500 (EST) Received: from smtpin25.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 442341826B6C5 for ; Fri, 14 Jan 2022 22:08:08 +0000 (UTC) X-FDA: 79030281456.25.4DF028F Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf09.hostedemail.com (Postfix) with ESMTP id B9A18140003 for ; Fri, 14 Jan 2022 22:08:07 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id BE58CB8262F; Fri, 14 Jan 2022 22:08:06 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1CD19C36AE9; Fri, 14 Jan 2022 22:08:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198085; bh=X3CbClOOdYI8cFhZFL8AwWbzbDVa+nqZyRksH6v0Jkw=; h=Date:From:To:Subject:In-Reply-To:From; b=2I0V5DqqQkr5eIfwmeYSMQeWl+y8o/OtNJjCmb2sdT559cCGp9eoLJaR5mxBQtc4Z h+iCEgJ3ZJUGJKoyaQOo5QgbbHSfO+o8BVmUlvQuokBjHTKNy6AZIPPNKP+4adIMqp Jc+QLSd/wNPkI87iiqVL2htVEtJ0NQtzCWVH1aGk= Date: Fri, 14 Jan 2022 14:08:04 -0800 From: Andrew Morton To: aarcange@redhat.com, akpm@linux-foundation.org, almasrymina@google.com, axelrasmussen@google.com, linux-mm@kvack.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, peterx@redhat.com, shuah@kernel.org, torvalds@linux-foundation.org Subject: [patch 093/146] userfaultfd/selftests: clean up hugetlb allocation code Message-ID: <20220114220804.dCcvxHVal%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Stat-Signature: k19zxq3krwof66oo6xuzqsx98gcq1pf5 Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=2I0V5Dqq; dmarc=none; spf=pass (imf09.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as 
permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: B9A18140003 X-HE-Tag: 1642198087-404236 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Mike Kravetz Subject: userfaultfd/selftests: clean up hugetlb allocation code The message for commit f5c73297181c ("userfaultfd/selftests: fix hugetlb area allocations") says there is no need to create a hugetlb file in the non-shared testing case. However, the commit did not actually change the code to prevent creation of the file. While it is technically true that there is no need to create and use a hugetlb file in the case of non-shared-testing, it is useful. This is because 'hole punching' of a hugetlb file has the potentially incorrect side effect of also removing pages from private mappings. The userfaultfd test relies on this side effect for removing pages from the destination buffer during rounds of stress testing. Remove the incomplete code that was added to deal with no hugetlb file. Just keep the code that prevents reserves from being created for the destination area. 
Link: https://lkml.kernel.org/r/20220104021729.111006-1-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz Reviewed-by: Axel Rasmussen Cc: Peter Xu Cc: Andrea Arcangeli Cc: Mina Almasry Cc: Shuah Khan Signed-off-by: Andrew Morton --- tools/testing/selftests/vm/userfaultfd.c | 15 ++++++--------- 1 file changed, 6 insertions(+), 9 deletions(-) --- a/tools/testing/selftests/vm/userfaultfd.c~userfaultfd-selftests-clean-up-hugetlb-allocation-code +++ a/tools/testing/selftests/vm/userfaultfd.c @@ -87,7 +87,7 @@ static bool test_uffdio_minor = false; static bool map_shared; static int shm_fd; -static int huge_fd = -1; /* only used for hugetlb_shared test */ +static int huge_fd; static char *huge_fd_off0; static unsigned long long *count_verify; static int uffd = -1; @@ -223,9 +223,6 @@ static void noop_alias_mapping(__u64 *st static void hugetlb_release_pages(char *rel_area) { - if (huge_fd == -1) - return; - if (fallocate(huge_fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, rel_area == huge_fd_off0 ? 0 : nr_pages * page_size, nr_pages * page_size)) @@ -238,17 +235,17 @@ static void hugetlb_allocate_area(void * char **alloc_area_alias; *alloc_area = mmap(NULL, nr_pages * page_size, PROT_READ | PROT_WRITE, - map_shared ? MAP_SHARED : - MAP_PRIVATE | MAP_HUGETLB | + (map_shared ? MAP_SHARED : MAP_PRIVATE) | + MAP_HUGETLB | (*alloc_area == area_src ? 0 : MAP_NORESERVE), - huge_fd, - *alloc_area == area_src ? 0 : nr_pages * page_size); + huge_fd, *alloc_area == area_src ? 0 : + nr_pages * page_size); if (*alloc_area == MAP_FAILED) err("mmap of hugetlbfs file failed"); if (map_shared) { area_alias = mmap(NULL, nr_pages * page_size, PROT_READ | PROT_WRITE, - MAP_SHARED, + MAP_SHARED | MAP_HUGETLB, huge_fd, *alloc_area == area_src ? 
0 : nr_pages * page_size); if (area_alias == MAP_FAILED) From patchwork Fri Jan 14 22:08:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714120 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id AAB53C433EF for ; Fri, 14 Jan 2022 22:08:12 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 3AB786B013D; Fri, 14 Jan 2022 17:08:12 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 35BF76B013F; Fri, 14 Jan 2022 17:08:12 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 223D86B0140; Fri, 14 Jan 2022 17:08:12 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0070.hostedemail.com [216.40.44.70]) by kanga.kvack.org (Postfix) with ESMTP id 13DC46B013D for ; Fri, 14 Jan 2022 17:08:12 -0500 (EST) Received: from smtpin28.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id C31881826B6C5 for ; Fri, 14 Jan 2022 22:08:11 +0000 (UTC) X-FDA: 79030281582.28.BC10459 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf29.hostedemail.com (Postfix) with ESMTP id 1F61E120012 for ; Fri, 14 Jan 2022 22:08:10 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id C0527B825F5; Fri, 14 Jan 2022 22:08:09 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 445A8C36AEC; Fri, 14 Jan 2022 22:08:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198088; 
bh=x52Bihfq2g5s2vzRz3Ir1EyXlJtuNlkVDsMtcn41xeg=; h=Date:From:To:Subject:In-Reply-To:From; b=Ww7nwi0PBsyPfFbjx6TJtIBrqIVetFAznEfK7NFE772iF0VbC2gouXh2JrvgTHGCz 6KcvY7lVgf73JcTG7Gr9H284wAoKZpeOWkt0Y3TdyKW5jFNdk4vifyWRWnBFGj6SAN gK47Coinl41Nh1Yc2DsaqRTFV/+AyJUcOTn/wqYE= Date: Fri, 14 Jan 2022 14:08:07 -0800 From: Andrew Morton To: akpm@linux-foundation.org, david@redhat.com, ligang.bdlg@bytedance.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, songmuchun@bytedance.com, torvalds@linux-foundation.org Subject: [patch 094/146] vmscan: make drop_slab_node static Message-ID: <20220114220807.RmWYbLa8H%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 1F61E120012 X-Stat-Signature: ngdk165d4c3satfa6n9ije4ufu89z148 Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Ww7nwi0P; spf=pass (imf29.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198090-3485 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Gang Li Subject: vmscan: make drop_slab_node static drop_slab_node is only used in drop_slab, so remove its declaration from the header file and add the static keyword to its definition.
Link: https://lkml.kernel.org/r/20211111062445.5236-1-ligang.bdlg@bytedance.com Signed-off-by: Gang Li Reviewed-by: David Hildenbrand Reviewed-by: Muchun Song Signed-off-by: Andrew Morton --- include/linux/mm.h | 1 - mm/vmscan.c | 2 +- 2 files changed, 1 insertion(+), 2 deletions(-) --- a/include/linux/mm.h~vmscan-make-drop_slab_node-static +++ a/include/linux/mm.h @@ -3122,7 +3122,6 @@ int drop_caches_sysctl_handler(struct ct #endif void drop_slab(void); -void drop_slab_node(int nid); #ifndef CONFIG_MMU #define randomize_va_space 0 --- a/mm/vmscan.c~vmscan-make-drop_slab_node-static +++ a/mm/vmscan.c @@ -951,7 +951,7 @@ out: return freed; } -void drop_slab_node(int nid) +static void drop_slab_node(int nid) { unsigned long freed; int shift = 0; From patchwork Fri Jan 14 22:08:10 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714121 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 35974C433FE for ; Fri, 14 Jan 2022 22:08:17 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B9CA56B013F; Fri, 14 Jan 2022 17:08:16 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id B24496B0141; Fri, 14 Jan 2022 17:08:16 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9EB446B0142; Fri, 14 Jan 2022 17:08:16 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0158.hostedemail.com [216.40.44.158]) by kanga.kvack.org (Postfix) with ESMTP id 8F3556B013F for ; Fri, 14 Jan 2022 17:08:16 -0500 (EST) Received: from smtpin20.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 5901B998B9 for ; Fri, 14 Jan 2022 22:08:16 +0000 
(UTC) X-FDA: 79030281792.20.312A594 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf31.hostedemail.com (Postfix) with ESMTP id D8E362000D for ; Fri, 14 Jan 2022 22:08:15 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 5B51ECE19A9; Fri, 14 Jan 2022 22:08:13 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 62005C36AED; Fri, 14 Jan 2022 22:08:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198091; bh=/hqtVcwg+2V7+zxTh2BL3YYwBuzfVyzUlbO59/g3EIc=; h=Date:From:To:Subject:In-Reply-To:From; b=Z+tp/pN7aNIPtZNoWJp6DaonCHYYcs+iIK74Mb5UuGlqDWnY4XRx9zA7pw50KUmil U4FiGvuATAYmDwog2g1unoYodbz/zYOh2KHazHl94t6hcF3DoBw8FHPWfxm2r5LfRB KPu+2TYwBUX0yMAdvwjEPkUf2NnoYxHqB+rLq3ms= Date: Fri, 14 Jan 2022 14:08:10 -0800 From: Andrew Morton To: akpm@linux-foundation.org, chenwandun@huawei.com, iamjoonsoo.kim@lge.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, osalvador@suse.de, torvalds@linux-foundation.org, vbabka@suse.cz, wangkefeng.wang@huawei.com Subject: [patch 095/146] mm/page_isolation: unset migratetype directly for non Buddy page Message-ID: <20220114220810.RDDLbQECb%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: D8E362000D X-Stat-Signature: 9x1d5fzpcoxksdy83ibiimujyqi3sb15 Authentication-Results: imf31.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="Z+tp/pN7"; dmarc=none; spf=pass (imf31.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam06 X-HE-Tag: 1642198095-783576 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: 
owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Chen Wandun Subject: mm/page_isolation: unset migratetype directly for non Buddy page

In unset_migratetype_isolate(), we can bypass the call to move_freepages_block() for non-buddy pages. This saves a few CPU cycles in situations such as CMA and hugetlb allocation of contiguous pages, where alloc_contig_pages() is called:

	alloc_contig_pages
	__alloc_contig_migrate_range
	isolate_freepages_range    ==> pages have been removed from buddy
	undo_isolate_page_range
	unset_migratetype_isolate  ==> can directly set migratetype

[osalvador@suse.de: changelog tweak]
Link: https://lkml.kernel.org/r/20211229033649.2760586-1-chenwandun@huawei.com Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock") Signed-off-by: Chen Wandun Reviewed-by: Oscar Salvador Cc: Vlastimil Babka Cc: Joonsoo Kim Cc: Wang Kefeng Signed-off-by: Andrew Morton --- mm/page_isolation.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/page_isolation.c~mm-page_isolation-unset-migratetype-directly-for-non-buddy-page +++ a/mm/page_isolation.c @@ -115,7 +115,7 @@ static void unset_migratetype_isolate(st * onlining - just onlined memory won't immediately be considered for * allocation. 
*/ - if (!isolated_page) { + if (!isolated_page && PageBuddy(page)) { nr_pages = move_freepages_block(zone, page, migratetype, NULL); __mod_zone_freepage_state(zone, nr_pages, migratetype); } From patchwork Fri Jan 14 22:08:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714122 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8FD31C433EF for ; Fri, 14 Jan 2022 22:08:21 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 0294A6B0141; Fri, 14 Jan 2022 17:08:21 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id F1B0F6B0143; Fri, 14 Jan 2022 17:08:20 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DE2366B0144; Fri, 14 Jan 2022 17:08:20 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0152.hostedemail.com [216.40.44.152]) by kanga.kvack.org (Postfix) with ESMTP id CF5666B0141 for ; Fri, 14 Jan 2022 17:08:20 -0500 (EST) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 8F7ED1826B6C5 for ; Fri, 14 Jan 2022 22:08:20 +0000 (UTC) X-FDA: 79030281960.22.8E9E396 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf06.hostedemail.com (Postfix) with ESMTP id D7EB618000D for ; Fri, 14 Jan 2022 22:08:19 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 1C40DCE2497; Fri, 14 Jan 2022 22:08:17 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id AF92BC36AE5; Fri, 
14 Jan 2022 22:08:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198095; bh=ZbCchHDUPBb6OYDDv+U+Z5Qx/7yv5P7aY7rSX3YQsHc=; h=Date:From:To:Subject:In-Reply-To:From; b=qMkOv4HRiqzQVJ/4iNU5zcQbSsEMambZ0Y5GRW14MooyQHgzsgxsmv6/XE4sts3El C5Z/QEeZxIvcSHAjEKv3E1fMVpYfz5vmIpTaIJb34z9dp0Pqd8qtF0ln4Z1qhy9vlS QkajX+71dMS7ocL4DmDWGsmFgqOfY21ZPPmrj2kg= Date: Fri, 14 Jan 2022 14:08:14 -0800 From: Andrew Morton To: aarcange@redhat.com, ak@linux.intel.com, akpm@linux-foundation.org, aneesh.kumar@linux.ibm.com, ben.widawsky@intel.com, dan.j.williams@intel.com, dave.hansen@linux.intel.com, feng.tang@intel.com, linux-api@vger.kernel.org, linux-mm@kvack.org, mgorman@techsingularity.net, mhocko@suse.com, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, rdunlap@infradead.org, torvalds@linux-foundation.org, vbabka@suse.cz, ying.huang@intel.com Subject: [patch 096/146] mm/mempolicy: use policy_node helper with MPOL_PREFERRED_MANY Message-ID: <20220114220814.xl_k70knk%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: D7EB618000D X-Stat-Signature: gqdn1o7o5zdnkfxzy6i1ztka7h5i3syd Authentication-Results: imf06.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=qMkOv4HR; dmarc=none; spf=pass (imf06.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam08 X-HE-Tag: 1642198099-376885 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: "Aneesh Kumar K.V" Subject: mm/mempolicy: use policy_node helper with MPOL_PREFERRED_MANY Patch series "mm: add new syscall set_mempolicy_home_node", v6. 
This patch (of 3): A followup patch will enable setting a home node with the MPOL_PREFERRED_MANY memory policy. To facilitate that, switch to using the policy_node helper. There is no functional change in this patch. Link: https://lkml.kernel.org/r/20211202123810.267175-1-aneesh.kumar@linux.ibm.com Link: https://lkml.kernel.org/r/20211202123810.267175-2-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V Acked-by: Michal Hocko Cc: Ben Widawsky Cc: Dave Hansen Cc: Feng Tang Cc: Andrea Arcangeli Cc: Mel Gorman Cc: Mike Kravetz Cc: Randy Dunlap Cc: Vlastimil Babka Cc: Andi Kleen Cc: Dan Williams Cc: Huang Ying Cc: Signed-off-by: Andrew Morton --- mm/mempolicy.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) --- a/mm/mempolicy.c~mm-mempolicy-use-policy_node-helper-with-mpol_preferred_many +++ a/mm/mempolicy.c @@ -2062,7 +2062,7 @@ static struct page *alloc_pages_preferre preferred_gfp &= ~(__GFP_DIRECT_RECLAIM | __GFP_NOFAIL); page = __alloc_pages(preferred_gfp, order, nid, &pol->nodes); if (!page) - page = __alloc_pages(gfp, order, numa_node_id(), NULL); + page = __alloc_pages(gfp, order, nid, NULL); return page; } @@ -2104,6 +2104,7 @@ struct page *alloc_pages_vma(gfp_t gfp, } if (pol->mode == MPOL_PREFERRED_MANY) { + node = policy_node(gfp, pol, node); page = alloc_pages_preferred_many(gfp, order, node, pol); mpol_cond_put(pol); goto out; @@ -2187,7 +2188,7 @@ struct page *alloc_pages(gfp_t gfp, unsi page = alloc_page_interleave(gfp, order, interleave_nodes(pol)); else if (pol->mode == MPOL_PREFERRED_MANY) page = alloc_pages_preferred_many(gfp, order, - numa_node_id(), pol); + policy_node(gfp, pol, numa_node_id()), pol); else page = __alloc_pages(gfp, order, policy_node(gfp, pol, numa_node_id()), From patchwork Fri Jan 14 22:08:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714123 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C12B4C433F5 for ; Fri, 14 Jan 2022 22:08:22 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 559B66B0143; Fri, 14 Jan 2022 17:08:22 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 4E3466B0145; Fri, 14 Jan 2022 17:08:22 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3AC2D6B0146; Fri, 14 Jan 2022 17:08:22 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0095.hostedemail.com [216.40.44.95]) by kanga.kvack.org (Postfix) with ESMTP id 236166B0143 for ; Fri, 14 Jan 2022 17:08:22 -0500 (EST) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id D34F182F7A85 for ; Fri, 14 Jan 2022 22:08:21 +0000 (UTC) X-FDA: 79030282002.13.44CA559 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf27.hostedemail.com (Postfix) with ESMTP id 3F3AB4000B for ; Fri, 14 Jan 2022 22:08:21 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 378DDB825F5; Fri, 14 Jan 2022 22:08:20 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 576F7C36AE5; Fri, 14 Jan 2022 22:08:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198099; bh=QLB+ZKQJRGoI7Mh/ZuwssIuXIuBPqDuI11pPrmiYiGc=; h=Date:From:To:Subject:In-Reply-To:From; b=hsj7HJnakkIuStyMuOBCXT1W+PRiPU6tIx10DGycNTjUbDs9qASOktG3epuZv0OAq bDH1eyWDqOJevGFIfh/ZFFLu8AGjaV5tHC/6VHpeLWgtIrux4wFOvsO0ETnk7Ych6C zbrDkM745CjJHYBm6nwsBOIPVm0L/KeR5BNWAlK4= Date: Fri, 14 Jan 2022 14:08:17 -0800 From: Andrew Morton To: 
aarcange@redhat.com, ak@linux.intel.com, akpm@linux-foundation.org, aneesh.kumar@linux.ibm.com, ben.widawsky@intel.com, dan.j.williams@intel.com, dave.hansen@linux.intel.com, feng.tang@intel.com, linux-api@vger.kernel.org, linux-mm@kvack.org, mgorman@techsingularity.net, mhocko@kernel.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, rdunlap@infradead.org, torvalds@linux-foundation.org, vbabka@suse.cz, ying.huang@intel.com Subject: [patch 097/146] mm/mempolicy: add set_mempolicy_home_node syscall Message-ID: <20220114220817.MyJze7bLB%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=hsj7HJna; dmarc=none; spf=pass (imf27.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Stat-Signature: fw8t9fyxfyn98k9r9cgsiagmqpkux6x7 X-Rspamd-Queue-Id: 3F3AB4000B X-Rspamd-Server: rspam12 X-HE-Tag: 1642198101-158566 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: "Aneesh Kumar K.V" Subject: mm/mempolicy: add set_mempolicy_home_node syscall

This syscall can be used to set a home node for the MPOL_BIND and MPOL_PREFERRED_MANY memory policies. Users should call it after setting up a memory policy for the specified range, as shown below:

	mbind(p, nr_pages * page_size, MPOL_BIND, new_nodes->maskp,
	      new_nodes->size + 1, 0);
	sys_set_mempolicy_home_node((unsigned long)p, nr_pages * page_size,
				    home_node, 0);

The syscall allows specifying a home node/preferred node from which the kernel will fulfill memory allocation requests first.
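Until a libc wrapper exists, the call above can be tried from userspace through syscall(2). The following is only a sketch under stated assumptions: the number 450 is taken from the syscall tables wired up later in this series (alpha uses 560 there), and home_node_args_ok() and set_home_node() are hypothetical helper names. The pre-flight check merely mirrors the argument checks made at the top of the syscall in this patch (page-aligned start, zero flags, non-wrapping range); it does not check that the node is online or that a policy exists on the range.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdbool.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: 450 is the number used by most architectures in this series. */
#ifndef __NR_set_mempolicy_home_node
#define __NR_set_mempolicy_home_node 450
#endif

/*
 * Hypothetical pre-flight check mirroring the syscall's argument checks:
 * start must be page aligned, flags must be zero, and the page-rounded
 * range must not wrap around the address space.
 */
static bool home_node_args_ok(unsigned long start, unsigned long len,
			      unsigned long flags, unsigned long page_size)
{
	unsigned long end;

	if (start & (page_size - 1))
		return false;
	if (flags != 0)
		return false;
	len = (len + page_size - 1) & ~(page_size - 1);
	end = start + len;
	return end >= start;
}

/* Hypothetical wrapper: returns 0 on success, -1 with errno set on failure. */
static long set_home_node(void *addr, unsigned long len,
			  unsigned long home_node, unsigned long flags)
{
	long page_size = sysconf(_SC_PAGESIZE);

	if (page_size <= 0 ||
	    !home_node_args_ok((unsigned long)addr, len, flags,
			       (unsigned long)page_size)) {
		errno = EINVAL;
		return -1;
	}
	/* Raw syscall; the range must already carry a policy set via mbind(). */
	return syscall(__NR_set_mempolicy_home_node, (unsigned long)addr,
		       len, home_node, flags);
}
```

A caller would issue mbind() on the range first and then call set_home_node(p, nr_pages * page_size, home_node, 0). On a kernel without this series the raw syscall is expected to fail with ENOSYS.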
For an address range with the MPOL_BIND memory policy, if the nodemask specifies more than one node, page allocations will come from the node in the nodemask with sufficient free memory that is closest to the home node/preferred node.

For MPOL_PREFERRED_MANY, if the nodemask specifies more than one node, page allocations will come from the node in the nodemask with sufficient free memory that is closest to the home node/preferred node. If there is not enough memory on any of the nodes specified in the nodemask, the allocation will be attempted from the NUMA node in the system that is closest to the home node.

This lets applications hint at a preferred node for memory allocation and fall back to _only_ a given set of nodes if memory is not available on the preferred node. The fallback allocation is attempted from the node nearest to the preferred node.

This gives applications control over which NUMA nodes memory is allocated from and avoids the default fallback to slow-memory NUMA nodes. For example, consider a system with NUMA nodes 1, 2 and 3 backed by DRAM and nodes 10, 11 and 12 backed by slow memory:

	new_nodes = numa_bitmask_alloc(nr_nodes);

	numa_bitmask_setbit(new_nodes, 1);
	numa_bitmask_setbit(new_nodes, 2);
	numa_bitmask_setbit(new_nodes, 3);

	p = mmap(NULL, nr_pages * page_size, protflag, mapflag, -1, 0);
	mbind(p, nr_pages * page_size, MPOL_BIND, new_nodes->maskp,
	      new_nodes->size + 1, 0);

	sys_set_mempolicy_home_node(p, nr_pages * page_size, 2, 0);

This will allocate from nodes closer to node 2 and ensures that the kernel only allocates from nodes 1, 2 and 3. Memory will not be allocated from the slow-memory nodes 10, 11 and 12. This differs from the default MPOL_BIND behavior, where the allocation is attempted from the node closest to the local node. One reason to specify a home node is to allow allocations from a CPU-less NUMA node and its nearby NUMA nodes.
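The selection rule described above can be modelled in a few lines of userspace C. This is a toy sketch, not the kernel's allocator: struct topo, demo_topo, closest_with_memory() and pick_node() are hypothetical names, the four-node distance table is made up, and "sufficient free memory" is reduced to a non-zero page count. It only illustrates that candidates are ranked by distance from the home node, and that only the MPOL_PREFERRED_MANY case may leave the nodemask.

```c
#include <limits.h>
#include <stdbool.h>

#define NR_NODES 4	/* hypothetical toy topology, not a kernel constant */

struct topo {
	int dist[NR_NODES][NR_NODES];	/* NUMA distance between node pairs */
	long free_pages[NR_NODES];	/* free pages left on each node */
};

/* Made-up example: nodes 0-2 are DRAM, node 3 stands in for slow memory. */
static const struct topo demo_topo = {
	.dist = {
		{ 10, 20, 30, 40 },
		{ 20, 10, 21, 40 },
		{ 30, 21, 10, 40 },
		{ 40, 40, 40, 10 },
	},
	.free_pages = { 100, 100, 100, 100 },
};

/* Return the node in mask with free pages that is closest to home, or -1. */
static int closest_with_memory(const struct topo *t, unsigned int mask,
			       int home)
{
	int best = -1, best_dist = INT_MAX;

	for (int n = 0; n < NR_NODES; n++) {
		if (!(mask & (1u << n)) || t->free_pages[n] <= 0)
			continue;
		if (t->dist[home][n] < best_dist) {
			best = n;
			best_dist = t->dist[home][n];
		}
	}
	return best;
}

/*
 * Model of the described behaviour: try the nodemask first; only the
 * MPOL_PREFERRED_MANY case (preferred_many == true) may fall back to
 * any node in the system. The MPOL_BIND case returns -1 instead.
 */
static int pick_node(const struct topo *t, unsigned int mask, int home,
		     bool preferred_many)
{
	int node = closest_with_memory(t, mask, home);

	if (node < 0 && preferred_many)
		node = closest_with_memory(t, ~0u, home);
	return node;
}
```

With home node 1 and a nodemask of nodes 0 to 2, the model picks node 1 while it has memory, then the nearest remaining node in the mask. Once the whole mask is exhausted, it reaches node 3 only when preferred_many is set.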
MPOL_PREFERRED_MANY, on the other hand, will first try to allocate from the node closest to node 2 among nodes 1, 2 and 3. If those nodes don't have enough memory, the kernel will allocate from whichever of the slow-memory nodes 10, 11 and 12 is closest to node 2. Link: https://lkml.kernel.org/r/20211202123810.267175-3-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V Cc: Ben Widawsky Cc: Dave Hansen Cc: Feng Tang Cc: Michal Hocko Cc: Andrea Arcangeli Cc: Mel Gorman Cc: Mike Kravetz Cc: Randy Dunlap Cc: Vlastimil Babka Cc: Andi Kleen Cc: Dan Williams Cc: Huang Ying Cc: Signed-off-by: Andrew Morton --- Documentation/admin-guide/mm/numa_memory_policy.rst | 16 +- include/linux/mempolicy.h | 1 mm/mempolicy.c | 79 ++++++++++ 3 files changed, 95 insertions(+), 1 deletion(-) --- a/Documentation/admin-guide/mm/numa_memory_policy.rst~mm-mempolicy-add-set_mempolicy_home_node-syscall +++ a/Documentation/admin-guide/mm/numa_memory_policy.rst @@ -408,7 +408,7 @@ follows: Memory Policy APIs ================== -Linux supports 3 system calls for controlling memory policy. These APIS +Linux supports 4 system calls for controlling memory policy. These APIS always affect only the calling task, the calling task's address space, or some shared object mapped into the calling task's address space. @@ -460,6 +460,20 @@ requested via the 'flags' argument. See the mbind(2) man page for more details. +Set home node for a Range of Task's Address Spacec:: + + long sys_set_mempolicy_home_node(unsigned long start, unsigned long len, + unsigned long home_node, + unsigned long flags); + +sys_set_mempolicy_home_node set the home node for a VMA policy present in the +task's address range. The system call updates the home node only for the existing +mempolicy range. Other address ranges are ignored. A home node is the NUMA node +closest to which page allocation will come from. 
Specifying the home node override +the default allocation policy to allocate memory close to the local node for an +executing CPU. + + Memory Policy Command Line Interface ==================================== --- a/include/linux/mempolicy.h~mm-mempolicy-add-set_mempolicy_home_node-syscall +++ a/include/linux/mempolicy.h @@ -46,6 +46,7 @@ struct mempolicy { unsigned short mode; /* See MPOL_* above */ unsigned short flags; /* See set_mempolicy() MPOL_F_* above */ nodemask_t nodes; /* interleave/bind/perfer */ + int home_node; /* Home node to use for MPOL_BIND and MPOL_PREFERRED_MANY */ union { nodemask_t cpuset_mems_allowed; /* relative to these nodes */ --- a/mm/mempolicy.c~mm-mempolicy-add-set_mempolicy_home_node-syscall +++ a/mm/mempolicy.c @@ -296,6 +296,7 @@ static struct mempolicy *mpol_new(unsign atomic_set(&policy->refcnt, 1); policy->mode = mode; policy->flags = flags; + policy->home_node = NUMA_NO_NODE; return policy; } @@ -1478,6 +1479,77 @@ static long kernel_mbind(unsigned long s return do_mbind(start, len, lmode, mode_flags, &nodes, flags); } +SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, len, + unsigned long, home_node, unsigned long, flags) +{ + struct mm_struct *mm = current->mm; + struct vm_area_struct *vma; + struct mempolicy *new; + unsigned long vmstart; + unsigned long vmend; + unsigned long end; + int err = -ENOENT; + + start = untagged_addr(start); + if (start & ~PAGE_MASK) + return -EINVAL; + /* + * flags is used for future extension if any. + */ + if (flags != 0) + return -EINVAL; + + /* + * Check home_node is online to avoid accessing uninitialized + * NODE_DATA. 
+ */ + if (home_node >= MAX_NUMNODES || !node_online(home_node)) + return -EINVAL; + + len = (len + PAGE_SIZE - 1) & PAGE_MASK; + end = start + len; + + if (end < start) + return -EINVAL; + if (end == start) + return 0; + mmap_write_lock(mm); + vma = find_vma(mm, start); + for (; vma && vma->vm_start < end; vma = vma->vm_next) { + + vmstart = max(start, vma->vm_start); + vmend = min(end, vma->vm_end); + new = mpol_dup(vma_policy(vma)); + if (IS_ERR(new)) { + err = PTR_ERR(new); + break; + } + /* + * Only update home node if there is an existing vma policy + */ + if (!new) + continue; + + /* + * If any vma in the range got policy other than MPOL_BIND + * or MPOL_PREFERRED_MANY we return error. We don't reset + * the home node for vmas we already updated before. + */ + if (new->mode != MPOL_BIND && new->mode != MPOL_PREFERRED_MANY) { + err = -EOPNOTSUPP; + break; + } + + new->home_node = home_node; + err = mbind_range(mm, vmstart, vmend, new); + mpol_put(new); + if (err) + break; + } + mmap_write_unlock(mm); + return err; +} + SYSCALL_DEFINE6(mbind, unsigned long, start, unsigned long, len, unsigned long, mode, const unsigned long __user *, nmask, unsigned long, maxnode, unsigned int, flags) @@ -1802,6 +1874,11 @@ static int policy_node(gfp_t gfp, struct WARN_ON_ONCE(policy->mode == MPOL_BIND && (gfp & __GFP_THISNODE)); } + if ((policy->mode == MPOL_BIND || + policy->mode == MPOL_PREFERRED_MANY) && + policy->home_node != NUMA_NO_NODE) + return policy->home_node; + return nd; } @@ -2344,6 +2421,8 @@ bool __mpol_equal(struct mempolicy *a, s return false; if (a->flags != b->flags) return false; + if (a->home_node != b->home_node) + return false; if (mpol_store_user_nodemask(a)) if (!nodes_equal(a->w.user_nodemask, b->w.user_nodemask)) return false; From patchwork Fri Jan 14 22:08:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714124 Return-Path: 
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 266CAC433F5 for ; Fri, 14 Jan 2022 22:08:27 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A89806B0145; Fri, 14 Jan 2022 17:08:26 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id A3AB66B0147; Fri, 14 Jan 2022 17:08:26 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8DA4A6B0148; Fri, 14 Jan 2022 17:08:26 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0052.hostedemail.com [216.40.44.52]) by kanga.kvack.org (Postfix) with ESMTP id 7F1376B0145 for ; Fri, 14 Jan 2022 17:08:26 -0500 (EST) Received: from smtpin30.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 468AA82F7A85 for ; Fri, 14 Jan 2022 22:08:26 +0000 (UTC) X-FDA: 79030282212.30.91A7767 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf11.hostedemail.com (Postfix) with ESMTP id F3A1740007 for ; Fri, 14 Jan 2022 22:08:24 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id CEDEDB82630; Fri, 14 Jan 2022 22:08:23 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E2521C36AEC; Fri, 14 Jan 2022 22:08:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198102; bh=Rhl5u2YqdJnc2j+8AZS3DkUsx9SKcIAFHQAFTBvXDSU=; h=Date:From:To:Subject:In-Reply-To:From; b=mySE1nEcVZt2QWTRRO66ZBKOldPKsg805+1ZgHgev0ix1sVhmKrZXtUJLfzQocW2+ W1iiyi9mUItL2417mDmY8Q+Bl1Mi5RdEChfXWlYO04+9bk2Ui/z5OKny8j9V0hiUGF MSXCe+vfSBeleUahnDoe6LyhQKx/chWFltos0oKw= 
Date: Fri, 14 Jan 2022 14:08:21 -0800 From: Andrew Morton To: aarcange@redhat.com, ak@linux.intel.com, akpm@linux-foundation.org, aneesh.kumar@linux.ibm.com, ben.widawsky@intel.com, dan.j.williams@intel.com, dave.hansen@linux.intel.com, feng.tang@intel.com, linux-api@vger.kernel.org, linux-mm@kvack.org, mgorman@techsingularity.net, mhocko@kernel.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, rdunlap@infradead.org, torvalds@linux-foundation.org, vbabka@suse.cz, ying.huang@intel.com Subject: [patch 098/146] mm/mempolicy: wire up syscall set_mempolicy_home_node Message-ID: <20220114220821.AgzigM7yd%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: F3A1740007 X-Stat-Signature: kp3zb8ychuuftop1uqec4ixn71fyi4i9 Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=mySE1nEc; spf=pass (imf11.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198104-377298 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: "Aneesh Kumar K.V" Subject: mm/mempolicy: wire up syscall set_mempolicy_home_node Link: https://lkml.kernel.org/r/20211202123810.267175-4-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V Cc: Ben Widawsky Cc: Dave Hansen Cc: Feng Tang Cc: Michal Hocko Cc: Andrea Arcangeli Cc: Mel Gorman Cc: Mike Kravetz Cc: Randy Dunlap Cc: Vlastimil Babka Cc: Andi Kleen Cc: Dan Williams Cc: Huang Ying Cc: Signed-off-by: Andrew Morton --- arch/alpha/kernel/syscalls/syscall.tbl | 1 + arch/arm/tools/syscall.tbl | 1 + arch/arm64/include/asm/unistd.h | 2 +- arch/arm64/include/asm/unistd32.h | 2 ++ arch/ia64/kernel/syscalls/syscall.tbl | 1 + 
arch/m68k/kernel/syscalls/syscall.tbl | 1 + arch/microblaze/kernel/syscalls/syscall.tbl | 1 + arch/mips/kernel/syscalls/syscall_n32.tbl | 1 + arch/mips/kernel/syscalls/syscall_n64.tbl | 1 + arch/mips/kernel/syscalls/syscall_o32.tbl | 1 + arch/parisc/kernel/syscalls/syscall.tbl | 1 + arch/powerpc/kernel/syscalls/syscall.tbl | 1 + arch/s390/kernel/syscalls/syscall.tbl | 1 + arch/sh/kernel/syscalls/syscall.tbl | 1 + arch/sparc/kernel/syscalls/syscall.tbl | 1 + arch/x86/entry/syscalls/syscall_32.tbl | 1 + arch/x86/entry/syscalls/syscall_64.tbl | 1 + arch/xtensa/kernel/syscalls/syscall.tbl | 1 + include/linux/syscalls.h | 3 +++ include/uapi/asm-generic/unistd.h | 5 ++++- kernel/sys_ni.c | 1 + 21 files changed, 27 insertions(+), 2 deletions(-) --- a/arch/alpha/kernel/syscalls/syscall.tbl~mm-mempolicy-wire-up-syscall-set_mempolicy_home_node +++ a/arch/alpha/kernel/syscalls/syscall.tbl @@ -489,3 +489,4 @@ # 557 reserved for memfd_secret 558 common process_mrelease sys_process_mrelease 559 common futex_waitv sys_futex_waitv +560 common set_mempolicy_home_node sys_ni_syscall --- a/arch/arm64/include/asm/unistd32.h~mm-mempolicy-wire-up-syscall-set_mempolicy_home_node +++ a/arch/arm64/include/asm/unistd32.h @@ -905,6 +905,8 @@ __SYSCALL(__NR_landlock_restrict_self, s __SYSCALL(__NR_process_mrelease, sys_process_mrelease) #define __NR_futex_waitv 449 __SYSCALL(__NR_futex_waitv, sys_futex_waitv) +#define __NR_set_mempolicy_home_node 450 +__SYSCALL(__NR_set_mempolicy_home_node, sys_set_mempolicy_home_node) /* * Please add new compat syscalls above this comment and update --- a/arch/arm64/include/asm/unistd.h~mm-mempolicy-wire-up-syscall-set_mempolicy_home_node +++ a/arch/arm64/include/asm/unistd.h @@ -38,7 +38,7 @@ #define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE + 5) #define __ARM_NR_COMPAT_END (__ARM_NR_COMPAT_BASE + 0x800) -#define __NR_compat_syscalls 450 +#define __NR_compat_syscalls 451 #endif #define __ARCH_WANT_SYS_CLONE --- 
a/arch/arm/tools/syscall.tbl~mm-mempolicy-wire-up-syscall-set_mempolicy_home_node +++ a/arch/arm/tools/syscall.tbl @@ -463,3 +463,4 @@ # 447 reserved for memfd_secret 448 common process_mrelease sys_process_mrelease 449 common futex_waitv sys_futex_waitv +450 common set_mempolicy_home_node sys_set_mempolicy_home_node --- a/arch/ia64/kernel/syscalls/syscall.tbl~mm-mempolicy-wire-up-syscall-set_mempolicy_home_node +++ a/arch/ia64/kernel/syscalls/syscall.tbl @@ -370,3 +370,4 @@ # 447 reserved for memfd_secret 448 common process_mrelease sys_process_mrelease 449 common futex_waitv sys_futex_waitv +450 common set_mempolicy_home_node sys_set_mempolicy_home_node --- a/arch/m68k/kernel/syscalls/syscall.tbl~mm-mempolicy-wire-up-syscall-set_mempolicy_home_node +++ a/arch/m68k/kernel/syscalls/syscall.tbl @@ -449,3 +449,4 @@ # 447 reserved for memfd_secret 448 common process_mrelease sys_process_mrelease 449 common futex_waitv sys_futex_waitv +450 common set_mempolicy_home_node sys_set_mempolicy_home_node --- a/arch/microblaze/kernel/syscalls/syscall.tbl~mm-mempolicy-wire-up-syscall-set_mempolicy_home_node +++ a/arch/microblaze/kernel/syscalls/syscall.tbl @@ -455,3 +455,4 @@ # 447 reserved for memfd_secret 448 common process_mrelease sys_process_mrelease 449 common futex_waitv sys_futex_waitv +450 common set_mempolicy_home_node sys_set_mempolicy_home_node --- a/arch/mips/kernel/syscalls/syscall_n32.tbl~mm-mempolicy-wire-up-syscall-set_mempolicy_home_node +++ a/arch/mips/kernel/syscalls/syscall_n32.tbl @@ -388,3 +388,4 @@ # 447 reserved for memfd_secret 448 n32 process_mrelease sys_process_mrelease 449 n32 futex_waitv sys_futex_waitv +450 n32 set_mempolicy_home_node sys_set_mempolicy_home_node --- a/arch/mips/kernel/syscalls/syscall_n64.tbl~mm-mempolicy-wire-up-syscall-set_mempolicy_home_node +++ a/arch/mips/kernel/syscalls/syscall_n64.tbl @@ -364,3 +364,4 @@ # 447 reserved for memfd_secret 448 n64 process_mrelease sys_process_mrelease 449 n64 futex_waitv sys_futex_waitv +450 
common set_mempolicy_home_node sys_set_mempolicy_home_node --- a/arch/mips/kernel/syscalls/syscall_o32.tbl~mm-mempolicy-wire-up-syscall-set_mempolicy_home_node +++ a/arch/mips/kernel/syscalls/syscall_o32.tbl @@ -437,3 +437,4 @@ # 447 reserved for memfd_secret 448 o32 process_mrelease sys_process_mrelease 449 o32 futex_waitv sys_futex_waitv +450 o32 set_mempolicy_home_node sys_set_mempolicy_home_node --- a/arch/parisc/kernel/syscalls/syscall.tbl~mm-mempolicy-wire-up-syscall-set_mempolicy_home_node +++ a/arch/parisc/kernel/syscalls/syscall.tbl @@ -447,3 +447,4 @@ # 447 reserved for memfd_secret 448 common process_mrelease sys_process_mrelease 449 common futex_waitv sys_futex_waitv +450 common set_mempolicy_home_node sys_set_mempolicy_home_node --- a/arch/powerpc/kernel/syscalls/syscall.tbl~mm-mempolicy-wire-up-syscall-set_mempolicy_home_node +++ a/arch/powerpc/kernel/syscalls/syscall.tbl @@ -529,3 +529,4 @@ # 447 reserved for memfd_secret 448 common process_mrelease sys_process_mrelease 449 common futex_waitv sys_futex_waitv +450 nospu set_mempolicy_home_node sys_set_mempolicy_home_node --- a/arch/s390/kernel/syscalls/syscall.tbl~mm-mempolicy-wire-up-syscall-set_mempolicy_home_node +++ a/arch/s390/kernel/syscalls/syscall.tbl @@ -452,3 +452,4 @@ # 447 reserved for memfd_secret 448 common process_mrelease sys_process_mrelease sys_process_mrelease 449 common futex_waitv sys_futex_waitv sys_futex_waitv +450 common set_mempolicy_home_node sys_set_mempolicy_home_node sys_set_mempolicy_home_node --- a/arch/sh/kernel/syscalls/syscall.tbl~mm-mempolicy-wire-up-syscall-set_mempolicy_home_node +++ a/arch/sh/kernel/syscalls/syscall.tbl @@ -452,3 +452,4 @@ # 447 reserved for memfd_secret 448 common process_mrelease sys_process_mrelease 449 common futex_waitv sys_futex_waitv +450 common set_mempolicy_home_node sys_set_mempolicy_home_node --- a/arch/sparc/kernel/syscalls/syscall.tbl~mm-mempolicy-wire-up-syscall-set_mempolicy_home_node +++ a/arch/sparc/kernel/syscalls/syscall.tbl @@ 
-495,3 +495,4 @@ # 447 reserved for memfd_secret 448 common process_mrelease sys_process_mrelease 449 common futex_waitv sys_futex_waitv +450 common set_mempolicy_home_node sys_set_mempolicy_home_node --- a/arch/x86/entry/syscalls/syscall_32.tbl~mm-mempolicy-wire-up-syscall-set_mempolicy_home_node +++ a/arch/x86/entry/syscalls/syscall_32.tbl @@ -454,3 +454,4 @@ 447 i386 memfd_secret sys_memfd_secret 448 i386 process_mrelease sys_process_mrelease 449 i386 futex_waitv sys_futex_waitv +450 i386 set_mempolicy_home_node sys_set_mempolicy_home_node --- a/arch/x86/entry/syscalls/syscall_64.tbl~mm-mempolicy-wire-up-syscall-set_mempolicy_home_node +++ a/arch/x86/entry/syscalls/syscall_64.tbl @@ -371,6 +371,7 @@ 447 common memfd_secret sys_memfd_secret 448 common process_mrelease sys_process_mrelease 449 common futex_waitv sys_futex_waitv +450 common set_mempolicy_home_node sys_set_mempolicy_home_node # # Due to a historical design error, certain syscalls are numbered differently --- a/arch/xtensa/kernel/syscalls/syscall.tbl~mm-mempolicy-wire-up-syscall-set_mempolicy_home_node +++ a/arch/xtensa/kernel/syscalls/syscall.tbl @@ -420,3 +420,4 @@ # 447 reserved for memfd_secret 448 common process_mrelease sys_process_mrelease 449 common futex_waitv sys_futex_waitv +450 common set_mempolicy_home_node sys_set_mempolicy_home_node --- a/include/linux/syscalls.h~mm-mempolicy-wire-up-syscall-set_mempolicy_home_node +++ a/include/linux/syscalls.h @@ -1057,6 +1057,9 @@ asmlinkage long sys_landlock_add_rule(in const void __user *rule_attr, __u32 flags); asmlinkage long sys_landlock_restrict_self(int ruleset_fd, __u32 flags); asmlinkage long sys_memfd_secret(unsigned int flags); +asmlinkage long sys_set_mempolicy_home_node(unsigned long start, unsigned long len, + unsigned long home_node, + unsigned long flags); /* * Architecture-specific system calls --- a/include/uapi/asm-generic/unistd.h~mm-mempolicy-wire-up-syscall-set_mempolicy_home_node +++ a/include/uapi/asm-generic/unistd.h @@ 
-883,8 +883,11 @@ __SYSCALL(__NR_process_mrelease, sys_pro #define __NR_futex_waitv 449 __SYSCALL(__NR_futex_waitv, sys_futex_waitv) +#define __NR_set_mempolicy_home_node 450 +__SYSCALL(__NR_set_mempolicy_home_node, sys_set_mempolicy_home_node) + #undef __NR_syscalls -#define __NR_syscalls 450 +#define __NR_syscalls 451 /* * 32 bit systems traditionally used different --- a/kernel/sys_ni.c~mm-mempolicy-wire-up-syscall-set_mempolicy_home_node +++ a/kernel/sys_ni.c @@ -297,6 +297,7 @@ COND_SYSCALL(get_mempolicy); COND_SYSCALL(set_mempolicy); COND_SYSCALL(migrate_pages); COND_SYSCALL(move_pages); +COND_SYSCALL(set_mempolicy_home_node); COND_SYSCALL(perf_event_open); COND_SYSCALL(accept4); From patchwork Fri Jan 14 22:08:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714125 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 38C7DC433FE for ; Fri, 14 Jan 2022 22:08:28 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 9C7DD6B0147; Fri, 14 Jan 2022 17:08:27 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 99D4F6B0149; Fri, 14 Jan 2022 17:08:27 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7C8F56B014A; Fri, 14 Jan 2022 17:08:27 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0008.hostedemail.com [216.40.44.8]) by kanga.kvack.org (Postfix) with ESMTP id 5B2076B0147 for ; Fri, 14 Jan 2022 17:08:27 -0500 (EST) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 216B19675D for ; Fri, 14 Jan 2022 22:08:27 +0000 (UTC) X-FDA: 79030282254.09.5AACEEF Received: from 
ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf23.hostedemail.com (Postfix) with ESMTP id BB81C14000D for ; Fri, 14 Jan 2022 22:08:26 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id BFA25B8262F; Fri, 14 Jan 2022 22:08:25 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 49FD4C36AE9; Fri, 14 Jan 2022 22:08:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198105; bh=v6vpo38FPm930nZRXR76SyfDcmtpeF2iC8fXDHHjDGc=; h=Date:From:To:Subject:In-Reply-To:From; b=GML2lh0QuUQYJp2Q0BosyKpG3llyvdbbRKrqzsd3GJtrqGcKIwQPbT4H0HjxWgXd7 a88ss1+sJnILLJdHNn0tnQpEj1OPXQnIV4Qv6UCKbjfPl7H8xt1DAH8vAuTEKQaZ8K /C6+V+Y6Uku/Dxzu0N5173/FiwesVCXKls2nMi2s= Date: Fri, 14 Jan 2022 14:08:24 -0800 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, rdunlap@infradead.org, torvalds@linux-foundation.org Subject: [patch 099/146] mm/mempolicy: fix all kernel-doc warnings Message-ID: <20220114220824.z6HclrP9P%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: BB81C14000D X-Stat-Signature: gw3hr5nzig3kjeo3g8jqpeg5zmy7cfmy Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=GML2lh0Q; spf=pass (imf23.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-Rspamd-Server: rspam07 X-HE-Tag: 1642198106-612368 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Randy Dunlap Subject: mm/mempolicy: fix all kernel-doc warnings Fix 
kernel-doc warnings in mempolicy.c: mempolicy.c:139: warning: No description found for return value of 'numa_map_to_online_node' mempolicy.c:2165: warning: Excess function parameter 'node' description in 'alloc_pages_vma' mempolicy.c:2973: warning: No description found for return value of 'mpol_parse_str' Link: https://lkml.kernel.org/r/20211213233216.5477-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap Signed-off-by: Andrew Morton --- mm/mempolicy.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) --- a/mm/mempolicy.c~mm-mempolicy-fix-all-kernel-doc-warnings +++ a/mm/mempolicy.c @@ -134,6 +134,8 @@ static struct mempolicy preferred_node_p * @node: Node id to start the search * * Lookup the next closest node by distance if @nid is not online. + * + * Return: this @node if it is online, otherwise the closest node by distance */ int numa_map_to_online_node(int node) { @@ -2150,7 +2152,6 @@ static struct page *alloc_pages_preferre * @order: Order of the GFP allocation. * @vma: Pointer to VMA or NULL if not available. * @addr: Virtual address of the allocation. Must be inside @vma. - * @node: Which node to prefer for allocation (modulo policy). * @hugepage: For hugepages try only the preferred node if possible. 
* * Allocate a page for a specific address in @vma, using the appropriate @@ -2966,7 +2967,7 @@ static const char * const policy_modes[] * Format of input: * [=][:] * - * On success, returns 0, else 1 + * Return: %0 on success, else %1 */ int mpol_parse_str(char *str, struct mempolicy **mpol) { From patchwork Fri Jan 14 22:08:27 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714126 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5DD7AC433F5 for ; Fri, 14 Jan 2022 22:08:34 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id ECF6A6B0149; Fri, 14 Jan 2022 17:08:33 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id E7E0D6B014B; Fri, 14 Jan 2022 17:08:33 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D949B6B014C; Fri, 14 Jan 2022 17:08:33 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0121.hostedemail.com [216.40.44.121]) by kanga.kvack.org (Postfix) with ESMTP id CB2916B0149 for ; Fri, 14 Jan 2022 17:08:33 -0500 (EST) Received: from smtpin20.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 9769E80D1AB9 for ; Fri, 14 Jan 2022 22:08:33 +0000 (UTC) X-FDA: 79030282506.20.F3092A8 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf28.hostedemail.com (Postfix) with ESMTP id 04F94C000D for ; Fri, 14 Jan 2022 22:08:32 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 79E18CE2497; Fri, 14 
Jan 2022 22:08:30 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4AA55C36AEC; Fri, 14 Jan 2022 22:08:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198108; bh=sUu9jleLZuzhuolkSsvQLjphRcZDYQ9uRg1a5lDaPQM=; h=Date:From:To:Subject:In-Reply-To:From; b=r6xB0c2Ld8gkn8X004MhHiXFevR5Olg2Lb2r326uQ3ftFuJBy2nO24Rz+2sdDa8VB WNk2ic5TtvhHuGMT4Rc1KXUStJOyZliVar2WqdfcH47g+fbo3Rlbw1dHV0WaX8cssj whF0a8rgcJxH8xj1eb/zU4mLDMs+oeRLYO5tosfs= Date: Fri, 14 Jan 2022 14:08:27 -0800 From: Andrew Morton To: akpm@linux-foundation.org, jannh@google.com, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, rientjes@google.com, torvalds@linux-foundation.org Subject: [patch 100/146] mm, oom: OOM sysrq should always kill a process Message-ID: <20220114220827.OSbTBO1rc%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 04F94C000D X-Stat-Signature: beo5tzzcri8d6ybjzi8fhchsocz4ki6j Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=r6xB0c2L; dmarc=none; spf=pass (imf28.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-HE-Tag: 1642198112-411827 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Jann Horn Subject: mm, oom: OOM sysrq should always kill a process The OOM kill sysrq (alt+sysrq+F) should allow the user to kill the process with the highest OOM badness with a single execution. However, at the moment, the OOM kill can bail out if an OOM notifier (e.g. the i915 one) says that it reclaimed a tiny amount of memory from somewhere. That's probably not what the user wants, so skip the bailout if the OOM was triggered via sysrq. 
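The intended behavior can be sketched as a small userspace model, not kernel code. The names struct oom_ctl, notifiers_freed and should_bail_out() are hypothetical: they stand in for struct oom_control, the "freed" value reported by the OOM notifier chain, and the early-return check in out_of_memory().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the parts of struct oom_control used here. */
struct oom_ctl {
	bool memcg_oom;	/* models is_memcg_oom(oc) */
	bool sysrq_oom;	/* models is_sysrq_oom(oc) */
};

/*
 * Return true if the OOM killer should return early, without killing
 * a process, because a notifier reported that it reclaimed memory.
 * notifiers_freed models the "freed" count filled in by the notifiers.
 */
static bool should_bail_out(const struct oom_ctl *oc,
			    unsigned long notifiers_freed)
{
	if (oc->memcg_oom)
		return false;	/* notifiers are not consulted for memcg OOMs */

	/* A sysrq-triggered OOM never bails out, whatever was freed. */
	return notifiers_freed > 0 && !oc->sysrq_oom;
}
```

In this model a sysrq-triggered OOM with notifiers_freed == 1 does not bail out, while a regular global OOM with the same value does.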
Link: https://lkml.kernel.org/r/20220106102605.635656-1-jannh@google.com Signed-off-by: Jann Horn Acked-by: Michal Hocko Acked-by: David Rientjes Signed-off-by: Andrew Morton --- mm/oom_kill.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/oom_kill.c~mm-oom-oom-sysrq-should-always-kill-a-process +++ a/mm/oom_kill.c @@ -1058,7 +1058,7 @@ bool out_of_memory(struct oom_control *o if (!is_memcg_oom(oc)) { blocking_notifier_call_chain(&oom_notify_list, 0, &freed); - if (freed > 0) + if (freed > 0 && !is_sysrq_oom(oc)) /* Got some memory back in the last second. */ return true; } From patchwork Fri Jan 14 22:08:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714127 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7F318C433FE for ; Fri, 14 Jan 2022 22:08:35 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 0794C6B014B; Fri, 14 Jan 2022 17:08:35 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 0015B6B014D; Fri, 14 Jan 2022 17:08:34 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D6D256B014E; Fri, 14 Jan 2022 17:08:34 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0027.hostedemail.com [216.40.44.27]) by kanga.kvack.org (Postfix) with ESMTP id BA3E66B014B for ; Fri, 14 Jan 2022 17:08:34 -0500 (EST) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 7DC401809A350 for ; Fri, 14 Jan 2022 22:08:34 +0000 (UTC) X-FDA: 79030282548.05.D891035 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf19.hostedemail.com (Postfix) with ESMTP id 
CD6711A0013 for ; Fri, 14 Jan 2022 22:08:33 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id C59DAB8262E; Fri, 14 Jan 2022 22:08:32 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5F8F3C36AE5; Fri, 14 Jan 2022 22:08:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198111; bh=bf5nzAkdJboQbDq4AH0ai7Cxdq7IpRNVo9WD1L1IwZM=; h=Date:From:To:Subject:In-Reply-To:From; b=A6hp7MCKMvOG12k34GLQqYtmZwMA7gN/OLnKW0ueQaJ9OrHzoQdZIdncpKOYQP3lD djygfMzqq2J5nLOLkGlp0dxFzWHP6jjCM0dkyWdVUaa8yNp0CbFhU2FI3ArNonFk5x 6i8AM3fwWNlRVMQm/Ngln4Q7fPMAp4iIrMze69cM= Date: Fri, 14 Jan 2022 14:08:30 -0800 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, pbonzini@redhat.com, seanjc@google.com, torvalds@linux-foundation.org Subject: [patch 101/146] hugetlbfs: fix off-by-one error in hugetlb_vmdelete_list() Message-ID: <20220114220830.qBW1Hy4MM%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: CD6711A0013 X-Stat-Signature: zyrk7y75pbmnd13yszjm5eajmaigfpis Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=A6hp7MCK; spf=pass (imf19.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-Rspamd-Server: rspam07 X-HE-Tag: 1642198113-738608 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Sean Christopherson Subject: hugetlbfs: fix off-by-one error in hugetlb_vmdelete_list() Pass "end - 1" instead of "end" when 
walking the interval tree in hugetlb_vmdelete_list() to fix an inclusive vs. exclusive bug. The two callers that pass a non-zero "end" treat it as exclusive, whereas the interval tree iterator expects an inclusive "last". E.g. punching a hole in a file that precisely matches the size of a single hugepage, with a vma starting right on the boundary, will result in unmap_hugepage_range() being called twice, with the second call having start==end. The off-by-one error doesn't cause functional problems as __unmap_hugepage_range() turns into a massive nop due to short-circuiting its for-loop on "address < end". But the mmu_notifier invocations of invalidate_range_{start,end}() are passed a bogus zero-sized range, which may be unexpected behavior for secondary MMUs. The bug was exposed by commit ed922739c919 ("KVM: Use interval tree to do fast hva lookup in memslots"), currently queued in the KVM tree for 5.17, which added a WARN to detect ranges with start==end. Link: https://lkml.kernel.org/r/20211228234257.1926057-1-seanjc@google.com Fixes: 1bfad99ab425 ("hugetlbfs: hugetlb_vmtruncate_list() needs to take a range to delete") Signed-off-by: Sean Christopherson Reported-by: syzbot+4e697fe80a31aa7efe21@syzkaller.appspotmail.com Reviewed-by: Mike Kravetz Cc: Paolo Bonzini Signed-off-by: Andrew Morton --- fs/hugetlbfs/inode.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) --- a/fs/hugetlbfs/inode.c~hugetlbfs-fix-off-by-one-error-in-hugetlb_vmdelete_list +++ a/fs/hugetlbfs/inode.c @@ -409,10 +409,11 @@ hugetlb_vmdelete_list(struct rb_root_cac struct vm_area_struct *vma; /* - * end == 0 indicates that the entire range after - * start should be unmapped. + * end == 0 indicates that the entire range after start should be + * unmapped. Note, end is exclusive, whereas the interval tree takes + * an inclusive "last". */ - vma_interval_tree_foreach(vma, root, start, end ? end : ULONG_MAX) { + vma_interval_tree_foreach(vma, root, start, end ? 
end - 1 : ULONG_MAX) { unsigned long v_offset; unsigned long v_end; From patchwork Fri Jan 14 22:08:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714128 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6FD63C433FE for ; Fri, 14 Jan 2022 22:08:38 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id ECFE16B014D; Fri, 14 Jan 2022 17:08:37 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id E7E4A6B014F; Fri, 14 Jan 2022 17:08:37 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CF9CD6B0150; Fri, 14 Jan 2022 17:08:37 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0141.hostedemail.com [216.40.44.141]) by kanga.kvack.org (Postfix) with ESMTP id C06806B014D for ; Fri, 14 Jan 2022 17:08:37 -0500 (EST) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 829E11809A350 for ; Fri, 14 Jan 2022 22:08:37 +0000 (UTC) X-FDA: 79030282674.09.2D726F3 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf18.hostedemail.com (Postfix) with ESMTP id 018031C000A for ; Fri, 14 Jan 2022 22:08:36 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id E2027B825F5; Fri, 14 Jan 2022 22:08:35 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 797D4C36AE9; Fri, 14 Jan 2022 22:08:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; 
t=1642198114; bh=4Yx6fb62JkZoWnIGHGMght9Mdk8csjhT757pptAPsW8=; h=Date:From:To:Subject:In-Reply-To:From; b=ehQldkzoSaXK8L6Dy5dsTjdVOV437RlpPwKSxBU9wKv1yh9oFhR94QAM1XoOcYHsI dzLvxXE7zIGIv9c0LjudbhsXUpJHCfWXJLUg8ZgORozZgMYgsQ0+qDQTRCI9TFlNsn 8opgXICgg/8uEp3XkBDTwZ5FWe8xmEUCa4EKDmEQ= Date: Fri, 14 Jan 2022 14:08:34 -0800 From: Andrew Morton To: akpm@linux-foundation.org, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, rostedt@goodmis.org, torvalds@linux-foundation.org, ziy@nvidia.com Subject: [patch 102/146] mm: migrate: fix the return value of migrate_pages() Message-ID: <20220114220834.rKbLjEusz%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 018031C000A X-Stat-Signature: rrons85dbscxc5sro4hjdwmxd4mcsiob Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=ehQldkzo; dmarc=none; spf=pass (imf18.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam06 X-HE-Tag: 1642198116-817206 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Baolin Wang Subject: mm: migrate: fix the return value of migrate_pages() Patch series "Improve the migration stats". Following a discussion with Zi Yan [1], this patch set changes the return value of migrate_pages() so that it no longer returns a number larger than the number of pages the user tried to migrate via the move_pages() syscall. It also fixes the hugetlb migration stats and the migration stats in trace_mm_compaction_migratepages(). 
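As a rough illustration of the return-value problem, here is a userspace sketch with made-up names. It assumes a THP that fails to migrate without being split; struct failed_counts and both helpers are hypothetical and only model how the failure count is formed.

```c
/* Hypothetical summary of one migrate_pages() run; not a kernel structure. */
struct failed_counts {
	int normal_failed;	/* non-THP pages that were not migrated */
	int thp_failed;		/* THPs that were not migrated */
	int subpages_per_thp;	/* e.g. 512 for a 2MB THP with 4KB base pages */
};

/* Old return value: every subpage of a failed THP is counted. */
static int old_return_value(const struct failed_counts *c)
{
	return c->normal_failed + c->thp_failed * c->subpages_per_thp;
}

/* New return value: a failed THP counts once, like a normal page. */
static int new_return_value(const struct failed_counts *c)
{
	return c->normal_failed + c->thp_failed;
}
```

With a single THP of 512 base pages passed in and failing, the old accounting yields 512 for one requested page, while the new accounting yields 1.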
[1] https://lore.kernel.org/linux-mm/7E44019D-2A5D-4BA7-B4D5-00D4712F1687@nvidia.com/ This patch (of 3): As Zi Yan pointed out, the move_pages() syscall can return a non-migrated count larger than the number of pages the user tried to migrate when a THP fails to migrate. This is confusing for users. Other migration scenarios do not care about the actual number of non-migrated pages, except memory compaction migration, which will be fixed in a following patch. Thus we can change the return value to the number of non-migrated {normal page, THP, hugetlb} instead to avoid this issue; the number of THP splits will be considered as the number of non-migrated THPs, no matter how many subpages of the THP are migrated successfully. Meanwhile, the migration counters should still use the number of normal pages. Link: https://lkml.kernel.org/r/cover.1636275127.git.baolin.wang@linux.alibaba.com Link: https://lkml.kernel.org/r/6486fabc3e8c66ff613e150af25e89b3147977a6.1636275127.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang Signed-off-by: Zi Yan Co-developed-by: Zi Yan Cc: Steven Rostedt (VMware) Signed-off-by: Andrew Morton --- mm/migrate.c | 63 ++++++++++++++++++++++++++++++++++++------------- 1 file changed, 47 insertions(+), 16 deletions(-) --- a/mm/migrate.c~mm-migrate-fix-the-return-value-of-migrate_pages +++ a/mm/migrate.c @@ -1421,7 +1421,7 @@ static inline int try_split_thp(struct p * @mode: The migration mode that specifies the constraints for * page migration, if any. * @reason: The reason for page migration. - * @ret_succeeded: Set to the number of pages migrated successfully if + * @ret_succeeded: Set to the number of normal pages migrated successfully if * the caller passes a non-NULL pointer. 
* * The function returns after 10 attempts or if no pages are movable any more @@ -1429,7 +1429,9 @@ static inline int try_split_thp(struct p * It is caller's responsibility to call putback_movable_pages() to return pages * to the LRU or free list only if ret != 0. * - * Returns the number of pages that were not migrated, or an error code. + * Returns the number of {normal page, THP} that were not migrated, or an error code. + * The number of THP splits will be considered as the number of non-migrated THP, + * no matter how many subpages of the THP are migrated successfully. */ int migrate_pages(struct list_head *from, new_page_t get_new_page, free_page_t put_new_page, unsigned long private, @@ -1438,6 +1440,7 @@ int migrate_pages(struct list_head *from int retry = 1; int thp_retry = 1; int nr_failed = 0; + int nr_failed_pages = 0; int nr_succeeded = 0; int nr_thp_succeeded = 0; int nr_thp_failed = 0; @@ -1449,13 +1452,16 @@ int migrate_pages(struct list_head *from int swapwrite = current->flags & PF_SWAPWRITE; int rc, nr_subpages; LIST_HEAD(ret_pages); + LIST_HEAD(thp_split_pages); bool nosplit = (reason == MR_NUMA_MISPLACED); + bool no_subpage_counting = false; trace_mm_migrate_pages_start(mode, reason); if (!swapwrite) current->flags |= PF_SWAPWRITE; +thp_subpage_migration: for (pass = 0; pass < 10 && (retry || thp_retry); pass++) { retry = 0; thp_retry = 0; @@ -1504,18 +1510,20 @@ retry: case -ENOSYS: /* THP migration is unsupported */ if (is_thp) { - if (!try_split_thp(page, &page2, from)) { + nr_thp_failed++; + if (!try_split_thp(page, &page2, &thp_split_pages)) { nr_thp_split++; goto retry; } - nr_thp_failed++; - nr_failed += nr_subpages; + nr_failed_pages += nr_subpages; break; } /* Hugetlb migration is unsupported */ - nr_failed++; + if (!no_subpage_counting) + nr_failed++; + nr_failed_pages++; break; case -ENOMEM: /* @@ -1524,16 +1532,19 @@ retry: * THP NUMA faulting doesn't split THP to retry. 
*/ if (is_thp && !nosplit) { - if (!try_split_thp(page, &page2, from)) { + nr_thp_failed++; + if (!try_split_thp(page, &page2, &thp_split_pages)) { nr_thp_split++; goto retry; } - nr_thp_failed++; - nr_failed += nr_subpages; + nr_failed_pages += nr_subpages; goto out; } - nr_failed++; + + if (!no_subpage_counting) + nr_failed++; + nr_failed_pages++; goto out; case -EAGAIN: if (is_thp) { @@ -1559,17 +1570,37 @@ retry: */ if (is_thp) { nr_thp_failed++; - nr_failed += nr_subpages; + nr_failed_pages += nr_subpages; break; } - nr_failed++; + + if (!no_subpage_counting) + nr_failed++; + nr_failed_pages++; break; } } } - nr_failed += retry + thp_retry; + nr_failed += retry; nr_thp_failed += thp_retry; - rc = nr_failed; + /* + * Try to migrate subpages of fail-to-migrate THPs, no nr_failed + * counting in this round, since all subpages of a THP is counted + * as 1 failure in the first round. + */ + if (!list_empty(&thp_split_pages)) { + /* + * Move non-migrated pages (after 10 retries) to ret_pages + * to avoid migrating them again. 
+ */ + list_splice_init(from, &ret_pages); + list_splice_init(&thp_split_pages, from); + no_subpage_counting = true; + retry = 1; + goto thp_subpage_migration; + } + + rc = nr_failed + nr_thp_failed; out: /* * Put the permanent failure page back to migration list, they @@ -1578,11 +1609,11 @@ out: list_splice(&ret_pages, from); count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded); - count_vm_events(PGMIGRATE_FAIL, nr_failed); + count_vm_events(PGMIGRATE_FAIL, nr_failed_pages); count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded); count_vm_events(THP_MIGRATION_FAIL, nr_thp_failed); count_vm_events(THP_MIGRATION_SPLIT, nr_thp_split); - trace_mm_migrate_pages(nr_succeeded, nr_failed, nr_thp_succeeded, + trace_mm_migrate_pages(nr_succeeded, nr_failed_pages, nr_thp_succeeded, nr_thp_failed, nr_thp_split, mode, reason); if (!swapwrite) From patchwork Fri Jan 14 22:08:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714129 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 626A7C433EF for ; Fri, 14 Jan 2022 22:08:41 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id EA8726B014F; Fri, 14 Jan 2022 17:08:40 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id E57DF6B0151; Fri, 14 Jan 2022 17:08:40 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D1F986B0152; Fri, 14 Jan 2022 17:08:40 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0087.hostedemail.com [216.40.44.87]) by kanga.kvack.org (Postfix) with ESMTP id BF4096B014F for ; Fri, 14 Jan 2022 17:08:40 -0500 (EST) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by 
forelay02.hostedemail.com (Postfix) with ESMTP id 83090998B4 for ; Fri, 14 Jan 2022 22:08:40 +0000 (UTC) X-FDA: 79030282800.11.AE73CE9 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf25.hostedemail.com (Postfix) with ESMTP id 0AAB2A000F for ; Fri, 14 Jan 2022 22:08:39 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 09C9FB82630; Fri, 14 Jan 2022 22:08:39 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 869CEC36AE9; Fri, 14 Jan 2022 22:08:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198117; bh=H+8BKHlRw4ydJJIqoNXnYqxoxCJ7lnBuWLDXTYmOd5c=; h=Date:From:To:Subject:In-Reply-To:From; b=XFLWkfEEyZQkO6vv+cxFgcFlnKjb0lwQjNbwgMwotAC3LwJpRxEqgrkQZF5qu1cBB 3PKEp8z9P2lezSTX/GJEDdTJ+1p+pgcH0Vy3Uwo3k7i3MR9H3xUXD6lpyfxP+CpkPH FF0WP0gzKYW8Fp2SBW8d2MA7obt5s9V4IdPgcYG8= Date: Fri, 14 Jan 2022 14:08:37 -0800 From: Andrew Morton To: akpm@linux-foundation.org, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, rostedt@goodmis.org, torvalds@linux-foundation.org, ziy@nvidia.com Subject: [patch 103/146] mm: migrate: correct the hugetlb migration stats Message-ID: <20220114220837.ht0mOVvU5%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 0AAB2A000F X-Stat-Signature: 4s81fqant5tuz9nm8cqkaxjsnesjccak Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=XFLWkfEE; dmarc=none; spf=pass (imf25.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam06 X-HE-Tag: 1642198119-555337 X-Bogosity: Ham, tests=bogofilter, 
spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Baolin Wang Subject: mm: migrate: correct the hugetlb migration stats Correct the migration stats for hugetlb by using compound_nr() instead of thp_nr_pages(). Meanwhile, change 'nr_failed_pages' to record the number of normal pages that failed to migrate, including those of THP and hugetlb, and have 'nr_succeeded' record the number of normal pages migrated successfully. [baolin.wang@linux.alibaba.com: fix docs, per Mike] Link: https://lkml.kernel.org/r/141bdfc6-f898-3cc3-f692-726c5f6cb74d@linux.alibaba.com Link: https://lkml.kernel.org/r/71a4b6c22f208728fe8c78ad26375436c4ff9704.1636275127.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang Reviewed-by: Zi Yan Cc: Steven Rostedt (VMware) Signed-off-by: Andrew Morton --- Documentation/vm/page_migration.rst | 12 ++++++------ mm/migrate.c | 17 ++++++++--------- 2 files changed, 14 insertions(+), 15 deletions(-) --- a/Documentation/vm/page_migration.rst~mm-migrate-correct-the-hugetlb-migration-stats +++ a/Documentation/vm/page_migration.rst @@ -263,15 +263,15 @@ Monitoring Migration The following events (counters) can be used to monitor page migration. 1. PGMIGRATE_SUCCESS: Normal page migration success. Each count means that a - page was migrated. If the page was a non-THP page, then this counter is - increased by one. If the page was a THP, then this counter is increased by - the number of THP subpages. For example, migration of a single 2MB THP that - has 4KB-size base pages (subpages) will cause this counter to increase by - 512. + page was migrated. If the page was a non-THP and non-hugetlb page, then + this counter is increased by one. If the page was a THP or hugetlb, then + this counter is increased by the number of THP or hugetlb subpages. + For example, migration of a single 2MB THP that has 4KB-size base pages + (subpages) will cause this counter to increase by 512. 2. 
PGMIGRATE_FAIL: Normal page migration failure. Same counting rules as for PGMIGRATE_SUCCESS, above: this will be increased by the number of subpages, - if it was a THP. + if it was a THP or hugetlb. 3. THP_MIGRATION_SUCCESS: A THP was migrated without being split. --- a/mm/migrate.c~mm-migrate-correct-the-hugetlb-migration-stats +++ a/mm/migrate.c @@ -1429,9 +1429,9 @@ static inline int try_split_thp(struct p * It is caller's responsibility to call putback_movable_pages() to return pages * to the LRU or free list only if ret != 0. * - * Returns the number of {normal page, THP} that were not migrated, or an error code. - * The number of THP splits will be considered as the number of non-migrated THP, - * no matter how many subpages of the THP are migrated successfully. + * Returns the number of {normal page, THP, hugetlb} that were not migrated, or + * an error code. The number of THP splits will be considered as the number of + * non-migrated THP, no matter how many subpages of the THP are migrated successfully. */ int migrate_pages(struct list_head *from, new_page_t get_new_page, free_page_t put_new_page, unsigned long private, @@ -1474,7 +1474,7 @@ retry: * during migration. 
*/ is_thp = PageTransHuge(page) && !PageHuge(page); - nr_subpages = thp_nr_pages(page); + nr_subpages = compound_nr(page); cond_resched(); if (PageHuge(page)) @@ -1523,7 +1523,7 @@ retry: /* Hugetlb migration is unsupported */ if (!no_subpage_counting) nr_failed++; - nr_failed_pages++; + nr_failed_pages += nr_subpages; break; case -ENOMEM: /* @@ -1544,7 +1544,7 @@ retry: if (!no_subpage_counting) nr_failed++; - nr_failed_pages++; + nr_failed_pages += nr_subpages; goto out; case -EAGAIN: if (is_thp) { @@ -1554,12 +1554,11 @@ retry: retry++; break; case MIGRATEPAGE_SUCCESS: + nr_succeeded += nr_subpages; if (is_thp) { nr_thp_succeeded++; - nr_succeeded += nr_subpages; break; } - nr_succeeded++; break; default: /* @@ -1576,7 +1575,7 @@ retry: if (!no_subpage_counting) nr_failed++; - nr_failed_pages++; + nr_failed_pages += nr_subpages; break; } } From patchwork Fri Jan 14 22:08:40 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714130 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B2FA8C433EF for ; Fri, 14 Jan 2022 22:08:43 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 4BC6C6B0151; Fri, 14 Jan 2022 17:08:43 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 46CD56B0153; Fri, 14 Jan 2022 17:08:43 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 333F16B0154; Fri, 14 Jan 2022 17:08:43 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0202.hostedemail.com [216.40.44.202]) by kanga.kvack.org (Postfix) with ESMTP id 211006B0151 for ; Fri, 14 Jan 2022 17:08:43 -0500 (EST) Received: from smtpin25.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by 
forelay05.hostedemail.com (Postfix) with ESMTP id E03A81809A350 for ; Fri, 14 Jan 2022 22:08:42 +0000 (UTC) X-FDA: 79030282884.25.7EE421A Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf19.hostedemail.com (Postfix) with ESMTP id 419D31A0003 for ; Fri, 14 Jan 2022 22:08:41 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 22238B82A26; Fri, 14 Jan 2022 22:08:41 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 91C82C36AE5; Fri, 14 Jan 2022 22:08:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198120; bh=AbxaJI2ZzEZZvcHuk8pMyWSoIl0x4wvKLH4mxWiFD30=; h=Date:From:To:Subject:In-Reply-To:From; b=nOOaJVpVT4tEQBvbpUkM3/0IMy/lzozBo+bDJwMJI/Q4KHrSPOqfjL6sI3ZaxCDKv HXcNlLPYSif8Y5XydX3Tl2DW8sIRM6WCFiOeo8LpKYzxzlaV9L3xzhSai0nn7eyIbD /adefYQ2QlSnvZ+xCFE6j+DPdowdvckSOyyGlrhs= Date: Fri, 14 Jan 2022 14:08:40 -0800 From: Andrew Morton To: akpm@linux-foundation.org, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, rostedt@goodmis.org, torvalds@linux-foundation.org, ziy@nvidia.com Subject: [patch 104/146] mm: compaction: fix the migration stats in trace_mm_compaction_migratepages() Message-ID: <20220114220840.FROGeVO3F%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 419D31A0003 X-Stat-Signature: ctxkxct9skck1jmc6weyp6uy33313pqu Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=nOOaJVpV; dmarc=none; spf=pass (imf19.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam03 X-HE-Tag: 1642198121-91438 
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Baolin Wang Subject: mm: compaction: fix the migration stats in trace_mm_compaction_migratepages() migrate_pages() now returns the number of {normal page, THP, hugetlb} that were not migrated, so we should no longer use its return value to calculate the number of pages migrated successfully. Instead, we can use 'nr_succeeded', which indicates the number of normal pages migrated successfully, to calculate the number of non-migrated pages in trace_mm_compaction_migratepages(). Link: https://lkml.kernel.org/r/b4225251c4bec068dcd90d275ab7de88a39e2bd7.1636275127.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang Reviewed-by: Steven Rostedt (VMware) Cc: Zi Yan Signed-off-by: Andrew Morton --- include/trace/events/compaction.h | 24 ++++-------------------- mm/compaction.c | 7 ++++--- 2 files changed, 8 insertions(+), 23 deletions(-) --- a/include/trace/events/compaction.h~mm-compaction-fix-the-migration-stats-in-trace_mm_compaction_migratepages +++ a/include/trace/events/compaction.h @@ -68,10 +68,9 @@ DEFINE_EVENT(mm_compaction_isolate_templ TRACE_EVENT(mm_compaction_migratepages, TP_PROTO(unsigned long nr_all, - int migrate_rc, - struct list_head *migratepages), + unsigned int nr_succeeded), - TP_ARGS(nr_all, migrate_rc, migratepages), + TP_ARGS(nr_all, nr_succeeded), TP_STRUCT__entry( __field(unsigned long, nr_migrated) @@ -79,23 +78,8 @@ TRACE_EVENT(mm_compaction_migratepages, ), TP_fast_assign( - unsigned long nr_failed = 0; - struct list_head *page_lru; - - /* - * migrate_pages() returns either a non-negative number - * with the number of pages that failed migration, or an - * error code, in which case we need to count the remaining - * pages manually - */ - if (migrate_rc >= 0) - nr_failed = migrate_rc; - else - list_for_each(page_lru, migratepages) - nr_failed++; - - __entry->nr_migrated = 
nr_all - nr_failed; - __entry->nr_failed = nr_failed; + __entry->nr_migrated = nr_succeeded; + __entry->nr_failed = nr_all - nr_succeeded; ), TP_printk("nr_migrated=%lu nr_failed=%lu", --- a/mm/compaction.c~mm-compaction-fix-the-migration-stats-in-trace_mm_compaction_migratepages +++ a/mm/compaction.c @@ -2280,6 +2280,7 @@ compact_zone(struct compact_control *cc, unsigned long last_migrated_pfn; const bool sync = cc->mode != MIGRATE_ASYNC; bool update_cached; + unsigned int nr_succeeded = 0; /* * These counters track activities during zone compaction. Initialize @@ -2398,10 +2399,10 @@ compact_zone(struct compact_control *cc, err = migrate_pages(&cc->migratepages, compaction_alloc, compaction_free, (unsigned long)cc, cc->mode, - MR_COMPACTION, NULL); + MR_COMPACTION, &nr_succeeded); - trace_mm_compaction_migratepages(cc->nr_migratepages, err, - &cc->migratepages); + trace_mm_compaction_migratepages(cc->nr_migratepages, + nr_succeeded); /* All pages were either migrated or will be released */ cc->nr_migratepages = 0; From patchwork Fri Jan 14 22:08:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714131 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 09A75C433EF for ; Fri, 14 Jan 2022 22:08:48 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 9022F6B0153; Fri, 14 Jan 2022 17:08:47 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 8E74C6B0155; Fri, 14 Jan 2022 17:08:47 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7A1936B0156; Fri, 14 Jan 2022 17:08:47 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by 
kanga.kvack.org (Postfix) with ESMTP id 6B1636B0153 for ; Fri, 14 Jan 2022 17:08:47 -0500 (EST) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 3A9A29675D for ; Fri, 14 Jan 2022 22:08:47 +0000 (UTC) X-FDA: 79030283094.12.1307D46 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf02.hostedemail.com (Postfix) with ESMTP id 92D128000E for ; Fri, 14 Jan 2022 22:08:46 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 51FC2B8262F; Fri, 14 Jan 2022 22:08:45 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BE4E5C36AE5; Fri, 14 Jan 2022 22:08:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198124; bh=PALJUVJDxS9YD5XCN1rUh94wbpAhE0V91aPiJrc1y/I=; h=Date:From:To:Subject:In-Reply-To:From; b=coep9d6OGky1cLlk3+dMnVVw0pnx3/LNzvUVeT2wyz3hXYUn8cmZO81rh1te9qyVH lPLxI6GNdI/l/LKnKYTaRBUnRtwh0ZOfjNM70Ahi3sy8sTBJEGZWrZkLmUn2fTwQ6o boNQruDdJTjUY9iE81+DslOtodEVYN98omavABog= Date: Fri, 14 Jan 2022 14:08:43 -0800 From: Andrew Morton To: akpm@linux-foundation.org, baolin.wang@linux.alibaba.com, dave.hansen@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, osalvador@suse.de, shy828301@gmail.com, torvalds@linux-foundation.org, xlpang@linux.alibaba.com, ying.huang@intel.com, zhongjiang-ali@linux.alibaba.com, ziy@nvidia.com Subject: [patch 105/146] mm: migrate: support multiple target nodes demotion Message-ID: <20220114220843.Yephwi4-G%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 92D128000E X-Stat-Signature: ue4yzxscgg6dc8sbbrnrtgso3ytehozk Authentication-Results: imf02.hostedemail.com; dkim=pass 
header.d=linux-foundation.org header.s=korg header.b=coep9d6O; dmarc=none; spf=pass (imf02.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam03 X-HE-Tag: 1642198126-833207 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Baolin Wang Subject: mm: migrate: support multiple target nodes demotion We have some machines with multiple memory types, like the one below, which has one fast (DRAM) memory node and two slow (persistent memory) memory nodes. According to the current node demotion policy, if node 0 fills up, its memory should be migrated to node 1, and when node 1 fills up, its memory will be migrated to node 2: node 0 -> node 1 -> node 2 -> stop. But this is not an efficient or suitable memory migration route for our machine with multiple slow memory nodes, because the distance from node 0 to node 1 and from node 0 to node 2 is equal, and memory migration between slow memory nodes greatly increases persistent memory bandwidth usage, which hurts the whole system's performance. Thus for this case, we can treat slow memory nodes 1 and 2 as a single slow memory region, and we should migrate memory from node 0 to node 1 and node 2 if node 0 fills up. This patch changes the node_demotion data structure to support multiple target nodes, and establishes the migration path to support multiple target nodes, checking whether each candidate target is at the best node distance from the source node. 
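For illustration only, the resulting lookup can be sketched in user-space C. This is a simplified model, not the kernel code: NO_NODE, MAX_TARGETS, demo_table and pick_demotion_target() are hypothetical names, rand() stands in for the kernel's random helper, and the RCU and READ_ONCE() handling of the real next_demotion_node() is omitted.

```c
#include <stdlib.h>

#define NO_NODE (-1)     /* hypothetical stand-in for NUMA_NO_NODE */
#define MAX_TARGETS 15   /* assumed cap on targets per source node */

/* Per-source-node list of equally distant demotion targets. */
struct demotion_nodes {
	unsigned short nr;          /* number of valid entries in nodes[] */
	short nodes[MAX_TARGETS];   /* target node ids */
};

/* Table for the 3-node machine in this changelog: 0 -> 1/2 -> stop. */
static const struct demotion_nodes demo_table[3] = {
	{ .nr = 2, .nodes = { 1, 2 } },    /* node 0 demotes to node 1 or 2 */
	{ .nr = 0, .nodes = { NO_NODE } }, /* node 1 does not demote */
	{ .nr = 0, .nodes = { NO_NODE } }, /* node 2 does not demote */
};

/* Return a demotion target for 'node', or NO_NODE if it is terminal. */
static int pick_demotion_target(const struct demotion_nodes *table, int node)
{
	const struct demotion_nodes *nd = &table[node];

	if (nd->nr == 0)
		return NO_NODE;
	if (nd->nr == 1)
		return nd->nodes[0];

	/* Several equally good targets: pick one at random. */
	return nd->nodes[rand() % nd->nr];
}
```

With this table, a lookup for node 0 yields node 1 or node 2, while lookups for nodes 1 and 2 yield NO_NODE. Random selection keeps the lookup free of shared mutable state.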
available: 3 nodes (0-2) node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 node 0 size: 62153 MB node 0 free: 55135 MB node 1 cpus: node 1 size: 127007 MB node 1 free: 126930 MB node 2 cpus: node 2 size: 126968 MB node 2 free: 126878 MB node distances: node 0 1 2 0: 10 20 20 1: 20 10 20 2: 20 20 10 Link: https://lkml.kernel.org/r/00728da107789bb4ed9e0d28b1d08fd8056af2ef.1636697263.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang Reviewed-by: "Huang, Ying" Cc: Dave Hansen Cc: Zi Yan Cc: Oscar Salvador Cc: Yang Shi Cc: Baolin Wang Cc: zhongjiang-ali Cc: Xunlei Pang Signed-off-by: Andrew Morton --- mm/migrate.c | 164 ++++++++++++++++++++++++++++++++++++++----------- 1 file changed, 129 insertions(+), 35 deletions(-) --- a/mm/migrate.c~mm-migrate-support-multiple-target-nodes-demotion +++ a/mm/migrate.c @@ -50,6 +50,7 @@ #include #include #include +#include #include @@ -1118,12 +1119,25 @@ out: * * This is represented in the node_demotion[] like this: * - * { 1, // Node 0 migrates to 1 - * 2, // Node 1 migrates to 2 - * -1, // Node 2 does not migrate - * 4, // Node 3 migrates to 4 - * 5, // Node 4 migrates to 5 - * -1} // Node 5 does not migrate + * { nr=1, nodes[0]=1 }, // Node 0 migrates to 1 + * { nr=1, nodes[0]=2 }, // Node 1 migrates to 2 + * { nr=0, nodes[0]=-1 }, // Node 2 does not migrate + * { nr=1, nodes[0]=4 }, // Node 3 migrates to 4 + * { nr=1, nodes[0]=5 }, // Node 4 migrates to 5 + * { nr=0, nodes[0]=-1 }, // Node 5 does not migrate + * + * Moreover some systems may have multiple slow memory nodes. + * Suppose a system has one socket with 3 memory nodes, node 0 + * is fast memory type, and node 1/2 both are slow memory + * type, and the distance between fast memory node and slow + * memory node is same. 
So the migration path should be: + * + * 0 -> 1/2 -> stop + * + * This is represented in the node_demotion[] like this: + * { nr=2, {nodes[0]=1, nodes[1]=2} }, // Node 0 migrates to node 1 and node 2 + * { nr=0, nodes[0]=-1, }, // Node 1 dose not migrate + * { nr=0, nodes[0]=-1, }, // Node 2 does not migrate */ /* @@ -1134,8 +1148,20 @@ out: * must be held over all reads to ensure that no cycles are * observed. */ -static int node_demotion[MAX_NUMNODES] __read_mostly = - {[0 ... MAX_NUMNODES - 1] = NUMA_NO_NODE}; +#define DEFAULT_DEMOTION_TARGET_NODES 15 + +#if MAX_NUMNODES < DEFAULT_DEMOTION_TARGET_NODES +#define DEMOTION_TARGET_NODES (MAX_NUMNODES - 1) +#else +#define DEMOTION_TARGET_NODES DEFAULT_DEMOTION_TARGET_NODES +#endif + +struct demotion_nodes { + unsigned short nr; + short nodes[DEMOTION_TARGET_NODES]; +}; + +static struct demotion_nodes *node_demotion __read_mostly; /** * next_demotion_node() - Get the next node in the demotion path @@ -1148,8 +1174,15 @@ static int node_demotion[MAX_NUMNODES] _ */ int next_demotion_node(int node) { + struct demotion_nodes *nd; + unsigned short target_nr, index; int target; + if (!node_demotion) + return NUMA_NO_NODE; + + nd = &node_demotion[node]; + /* * node_demotion[] is updated without excluding this * function from running. RCU doesn't provide any @@ -1160,9 +1193,28 @@ int next_demotion_node(int node) * node_demotion[] reads need to be consistent. */ rcu_read_lock(); - target = READ_ONCE(node_demotion[node]); - rcu_read_unlock(); + target_nr = READ_ONCE(nd->nr); + switch (target_nr) { + case 0: + target = NUMA_NO_NODE; + goto out; + case 1: + index = 0; + break; + default: + /* + * If there are multiple target nodes, just select one + * target node randomly. + */ + index = get_random_int() % target_nr; + break; + } + + target = READ_ONCE(nd->nodes[index]); + +out: + rcu_read_unlock(); return target; } @@ -3003,10 +3055,16 @@ EXPORT_SYMBOL(migrate_vma_finalize); /* Disable reclaim-based migration. 
*/ static void __disable_all_migrate_targets(void) { - int node; + int node, i; - for_each_online_node(node) - node_demotion[node] = NUMA_NO_NODE; + if (!node_demotion) + return; + + for_each_online_node(node) { + node_demotion[node].nr = 0; + for (i = 0; i < DEMOTION_TARGET_NODES; i++) + node_demotion[node].nodes[i] = NUMA_NO_NODE; + } } static void disable_all_migrate_targets(void) @@ -3033,26 +3091,40 @@ static void disable_all_migrate_targets( * Failing here is OK. It might just indicate * being at the end of a chain. */ -static int establish_migrate_target(int node, nodemask_t *used) +static int establish_migrate_target(int node, nodemask_t *used, + int best_distance) { - int migration_target; + int migration_target, index, val; + struct demotion_nodes *nd; - /* - * Can not set a migration target on a - * node with it already set. - * - * No need for READ_ONCE() here since this - * in the write path for node_demotion[]. - * This should be the only thread writing. - */ - if (node_demotion[node] != NUMA_NO_NODE) + if (!node_demotion) return NUMA_NO_NODE; + nd = &node_demotion[node]; + migration_target = find_next_best_node(node, used); if (migration_target == NUMA_NO_NODE) return NUMA_NO_NODE; - node_demotion[node] = migration_target; + /* + * If the node has been set a migration target node before, + * which means it's the best distance between them. Still + * check if this node can be demoted to other target nodes + * if they have a same best distance. + */ + if (best_distance != -1) { + val = node_distance(node, migration_target); + if (val > best_distance) + return NUMA_NO_NODE; + } + + index = nd->nr; + if (WARN_ONCE(index >= DEMOTION_TARGET_NODES, + "Exceeds maximum demotion target nodes\n")) + return NUMA_NO_NODE; + + nd->nodes[index] = migration_target; + nd->nr++; return migration_target; } @@ -3068,7 +3140,9 @@ static int establish_migrate_target(int * * The difference here is that cycles must be avoided. 
If * node0 migrates to node1, then neither node1, nor anything - * node1 migrates to can migrate to node0. + * node1 migrates to can migrate to node0. Also one node can + * be migrated to multiple nodes if the target nodes all have + * a same best-distance against the source node. * * This function can run simultaneously with readers of * node_demotion[]. However, it can not run simultaneously @@ -3080,7 +3154,7 @@ static void __set_migration_target_nodes nodemask_t next_pass = NODE_MASK_NONE; nodemask_t this_pass = NODE_MASK_NONE; nodemask_t used_targets = NODE_MASK_NONE; - int node; + int node, best_distance; /* * Avoid any oddities like cycles that could occur @@ -3109,18 +3183,33 @@ again: * multiple source nodes to share a destination. */ nodes_or(used_targets, used_targets, this_pass); - for_each_node_mask(node, this_pass) { - int target_node = establish_migrate_target(node, &used_targets); - if (target_node == NUMA_NO_NODE) - continue; + for_each_node_mask(node, this_pass) { + best_distance = -1; /* - * Visit targets from this pass in the next pass. - * Eventually, every node will have been part of - * a pass, and will become set in 'used_targets'. + * Try to set up the migration path for the node, and the target + * migration nodes can be multiple, so doing a loop to find all + * the target nodes if they all have a best node distance. */ - node_set(target_node, next_pass); + do { + int target_node = + establish_migrate_target(node, &used_targets, + best_distance); + + if (target_node == NUMA_NO_NODE) + break; + + if (best_distance == -1) + best_distance = node_distance(node, target_node); + + /* + * Visit targets from this pass in the next pass. + * Eventually, every node will have been part of + * a pass, and will become set in 'used_targets'. 
+ */ + node_set(target_node, next_pass); + } while (1); } /* * 'next_pass' contains nodes which became migration @@ -3221,6 +3310,11 @@ static int __init migrate_on_reclaim_ini { int ret; + node_demotion = kmalloc_array(nr_node_ids, + sizeof(struct demotion_nodes), + GFP_KERNEL); + WARN_ON(!node_demotion); + ret = cpuhp_setup_state_nocalls(CPUHP_MM_DEMOTION_DEAD, "mm/demotion:offline", NULL, migration_offline_cpu); /* From patchwork Fri Jan 14 22:08:46 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714132 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 71195C433EF for ; Fri, 14 Jan 2022 22:08:52 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 001AC6B0155; Fri, 14 Jan 2022 17:08:52 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id EF2E86B0157; Fri, 14 Jan 2022 17:08:51 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DBBD36B0158; Fri, 14 Jan 2022 17:08:51 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0023.hostedemail.com [216.40.44.23]) by kanga.kvack.org (Postfix) with ESMTP id C97196B0155 for ; Fri, 14 Jan 2022 17:08:51 -0500 (EST) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 846D580C5512 for ; Fri, 14 Jan 2022 22:08:51 +0000 (UTC) X-FDA: 79030283262.22.D3033D9 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf01.hostedemail.com (Postfix) with ESMTP id B306640014 for ; Fri, 14 Jan 2022 22:08:49 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id B6C74B82A39; Fri, 14 Jan 2022 22:08:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0EAEAC36AE5; Fri, 14 Jan 2022 22:08:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198127; bh=TlrU+C6l/nFj2pbC+AT6rgSDEJdIHWVMfCtIE6Kos9Q=; h=Date:From:To:Subject:In-Reply-To:From; b=Ww52qYlA7cW/9rI40sXBdKJA0CkmXJeX9Q2AzImHUDLBG/V52VtK9+Iul3ExYO2xn blNMiG48z50CoeZGpMnKCAlDv692ZvqvI8S7zqE/tylclZSR/gvi7gEAsrKJAJAUU6 d81sRt0rjxXG5i14+K10X/R/3orSV1bhjMNc90gk= Date: Fri, 14 Jan 2022 14:08:46 -0800 From: Andrew Morton To: akpm@linux-foundation.org, baolin.wang@linux.alibaba.com, dave.hansen@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, shy828301@gmail.com, torvalds@linux-foundation.org, xlpang@linux.alibaba.com, ying.huang@intel.com, zhongjiang-ali@linux.alibaba.com, ziy@nvidia.com Subject: [patch 106/146] mm: migrate: add more comments for selecting target node randomly Message-ID: <20220114220846.k9XhK_39D%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam11 X-Rspamd-Queue-Id: B306640014 X-Stat-Signature: bd1p6no4searm9uixi9mbnj83r8fta3z Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Ww52qYlA; dmarc=none; spf=pass (imf01.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1642198129-201610 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Baolin Wang Subject: mm: migrate: add more comments for selecting target node randomly As Yang Shi suggested [1], it will be helpful to explain why we 
should select target node randomly now if there are multiple target nodes. [1] https://lore.kernel.org/all/CAHbLzkqSqCL+g7dfzeOw8fPyeEC0BBv13Ny1UVGHDkadnQdR=g@mail.gmail.com/ Link: https://lkml.kernel.org/r/c31d36bd097c6e9e69fc0f409c43b78e53e64fc2.1637766801.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang Reviewed-by: Yang Shi Cc: "Huang, Ying" Cc: Dave Hansen Cc: Zi Yan Cc: zhongjiang-ali Cc: Xunlei Pang Signed-off-by: Andrew Morton --- mm/migrate.c | 8 ++++++++ 1 file changed, 8 insertions(+) --- a/mm/migrate.c~mm-migrate-add-more-comments-for-selecting-target-node-randomly +++ a/mm/migrate.c @@ -1206,6 +1206,14 @@ int next_demotion_node(int node) /* * If there are multiple target nodes, just select one * target node randomly. + * + * In addition, we can also use round-robin to select + * target node, but we should introduce another variable + * for node_demotion[] to record last selected target node, + * that may cause cache ping-pong due to the changing of + * last target node. Or introducing per-cpu data to avoid + * caching issue, which seems more complicated. So selecting + * target node randomly seems better until now. 
*/ index = get_random_int() % target_nr; break; From patchwork Fri Jan 14 22:08:49 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714133 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id AD76DC433F5 for ; Fri, 14 Jan 2022 22:08:53 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 3FA086B0157; Fri, 14 Jan 2022 17:08:53 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 3A9356B0159; Fri, 14 Jan 2022 17:08:53 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 223226B015A; Fri, 14 Jan 2022 17:08:53 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0077.hostedemail.com [216.40.44.77]) by kanga.kvack.org (Postfix) with ESMTP id 0B6716B0157 for ; Fri, 14 Jan 2022 17:08:53 -0500 (EST) Received: from smtpin20.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id BAF91180A0581 for ; Fri, 14 Jan 2022 22:08:52 +0000 (UTC) X-FDA: 79030283304.20.2470801 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf07.hostedemail.com (Postfix) with ESMTP id 388C840013 for ; Fri, 14 Jan 2022 22:08:52 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 3E957B8262E; Fri, 14 Jan 2022 22:08:51 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5B4D5C36AE5; Fri, 14 Jan 2022 22:08:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198130; 
bh=k2U9fjn111PewIzPB6Wp40CyroSN+x4EIrTXLaRi6Ag=; h=Date:From:To:Subject:In-Reply-To:From; b=hY4PsMIx0sf4JUqvyyaqIcBagqbjtWxnUyrCgI5cYRT6uHFti8nry6DdsNn2HpqFs h9yzPawursBoag6P1wxtbBI2DyotbiPUFKwJG3lpKjXpdHkay4OA5MuyQSqBi7ndp9 AL0eZtauT1IFLQQCJcehSezg2WH9jV1d08sBgK3E= Date: Fri, 14 Jan 2022 14:08:49 -0800 From: Andrew Morton To: akpm@linux-foundation.org, baolin.wang@linux.alibaba.com, dan.j.williams@intel.com, dave.hansen@linux.intel.com, david@redhat.com, gthelen@google.com, kbusch@kernel.org, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, osalvador@suse.de, rientjes@google.com, shy828301@gmail.com, torvalds@linux-foundation.org, weixugc@google.com, yang.shi@linux.alibaba.com, ying.huang@intel.com, ziy@nvidia.com Subject: [patch 107/146] mm/migrate: move node demotion code to near its user Message-ID: <20220114220849.s2s5kA-M3%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 388C840013 X-Stat-Signature: isi9rreb5rk89rghuu67miiqqedzqenz Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=hY4PsMIx; dmarc=none; spf=pass (imf07.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam08 X-HE-Tag: 1642198132-666537 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Huang Ying Subject: mm/migrate: move node demotion code to near its user Now, node_demotion and next_demotion_node() are placed between __unmap_and_move() and unmap_and_move(). This hurts code readability. So move them near their users in the file. There's no functionality change in this patch. 
Link: https://lkml.kernel.org/r/20211206031227.3323097-1-ying.huang@intel.com Signed-off-by: "Huang, Ying" Reviewed-by: Baolin Wang Reviewed-by: Yang Shi Reviewed-by: Wei Xu Cc: Dave Hansen Cc: Zi Yan Cc: Oscar Salvador Cc: Michal Hocko Cc: David Rientjes Cc: Dan Williams Cc: David Hildenbrand Cc: Greg Thelen Cc: Keith Busch Cc: Yang Shi Signed-off-by: Andrew Morton --- mm/migrate.c | 265 ++++++++++++++++++++++++------------------------- 1 file changed, 132 insertions(+), 133 deletions(-) --- a/mm/migrate.c~mm-migrate-move-node-demotion-code-to-near-its-user +++ a/mm/migrate.c @@ -1093,139 +1093,6 @@ out: return rc; } - -/* - * node_demotion[] example: - * - * Consider a system with two sockets. Each socket has - * three classes of memory attached: fast, medium and slow. - * Each memory class is placed in its own NUMA node. The - * CPUs are placed in the node with the "fast" memory. The - * 6 NUMA nodes (0-5) might be split among the sockets like - * this: - * - * Socket A: 0, 1, 2 - * Socket B: 3, 4, 5 - * - * When Node 0 fills up, its memory should be migrated to - * Node 1. When Node 1 fills up, it should be migrated to - * Node 2. The migration path start on the nodes with the - * processors (since allocations default to this node) and - * fast memory, progress through medium and end with the - * slow memory: - * - * 0 -> 1 -> 2 -> stop - * 3 -> 4 -> 5 -> stop - * - * This is represented in the node_demotion[] like this: - * - * { nr=1, nodes[0]=1 }, // Node 0 migrates to 1 - * { nr=1, nodes[0]=2 }, // Node 1 migrates to 2 - * { nr=0, nodes[0]=-1 }, // Node 2 does not migrate - * { nr=1, nodes[0]=4 }, // Node 3 migrates to 4 - * { nr=1, nodes[0]=5 }, // Node 4 migrates to 5 - * { nr=0, nodes[0]=-1 }, // Node 5 does not migrate - * - * Moreover some systems may have multiple slow memory nodes. 
- * Suppose a system has one socket with 3 memory nodes, node 0 - * is fast memory type, and node 1/2 both are slow memory - * type, and the distance between fast memory node and slow - * memory node is same. So the migration path should be: - * - * 0 -> 1/2 -> stop - * - * This is represented in the node_demotion[] like this: - * { nr=2, {nodes[0]=1, nodes[1]=2} }, // Node 0 migrates to node 1 and node 2 - * { nr=0, nodes[0]=-1, }, // Node 1 dose not migrate - * { nr=0, nodes[0]=-1, }, // Node 2 does not migrate - */ - -/* - * Writes to this array occur without locking. Cycles are - * not allowed: Node X demotes to Y which demotes to X... - * - * If multiple reads are performed, a single rcu_read_lock() - * must be held over all reads to ensure that no cycles are - * observed. - */ -#define DEFAULT_DEMOTION_TARGET_NODES 15 - -#if MAX_NUMNODES < DEFAULT_DEMOTION_TARGET_NODES -#define DEMOTION_TARGET_NODES (MAX_NUMNODES - 1) -#else -#define DEMOTION_TARGET_NODES DEFAULT_DEMOTION_TARGET_NODES -#endif - -struct demotion_nodes { - unsigned short nr; - short nodes[DEMOTION_TARGET_NODES]; -}; - -static struct demotion_nodes *node_demotion __read_mostly; - -/** - * next_demotion_node() - Get the next node in the demotion path - * @node: The starting node to lookup the next node - * - * Return: node id for next memory node in the demotion path hierarchy - * from @node; NUMA_NO_NODE if @node is terminal. This does not keep - * @node online or guarantee that it *continues* to be the next demotion - * target. - */ -int next_demotion_node(int node) -{ - struct demotion_nodes *nd; - unsigned short target_nr, index; - int target; - - if (!node_demotion) - return NUMA_NO_NODE; - - nd = &node_demotion[node]; - - /* - * node_demotion[] is updated without excluding this - * function from running. RCU doesn't provide any - * compiler barriers, so the READ_ONCE() is required - * to avoid compiler reordering or read merging. 
- * - * Make sure to use RCU over entire code blocks if - * node_demotion[] reads need to be consistent. - */ - rcu_read_lock(); - target_nr = READ_ONCE(nd->nr); - - switch (target_nr) { - case 0: - target = NUMA_NO_NODE; - goto out; - case 1: - index = 0; - break; - default: - /* - * If there are multiple target nodes, just select one - * target node randomly. - * - * In addition, we can also use round-robin to select - * target node, but we should introduce another variable - * for node_demotion[] to record last selected target node, - * that may cause cache ping-pong due to the changing of - * last target node. Or introducing per-cpu data to avoid - * caching issue, which seems more complicated. So selecting - * target node randomly seems better until now. - */ - index = get_random_int() % target_nr; - break; - } - - target = READ_ONCE(nd->nodes[index]); - -out: - rcu_read_unlock(); - return target; -} - /* * Obtain the lock on page, remove all ptes and migrate the page * to the newly allocated page in newpage. @@ -3059,6 +2926,138 @@ void migrate_vma_finalize(struct migrate EXPORT_SYMBOL(migrate_vma_finalize); #endif /* CONFIG_DEVICE_PRIVATE */ +/* + * node_demotion[] example: + * + * Consider a system with two sockets. Each socket has + * three classes of memory attached: fast, medium and slow. + * Each memory class is placed in its own NUMA node. The + * CPUs are placed in the node with the "fast" memory. The + * 6 NUMA nodes (0-5) might be split among the sockets like + * this: + * + * Socket A: 0, 1, 2 + * Socket B: 3, 4, 5 + * + * When Node 0 fills up, its memory should be migrated to + * Node 1. When Node 1 fills up, it should be migrated to + * Node 2. 
The migration path start on the nodes with the + * processors (since allocations default to this node) and + * fast memory, progress through medium and end with the + * slow memory: + * + * 0 -> 1 -> 2 -> stop + * 3 -> 4 -> 5 -> stop + * + * This is represented in the node_demotion[] like this: + * + * { nr=1, nodes[0]=1 }, // Node 0 migrates to 1 + * { nr=1, nodes[0]=2 }, // Node 1 migrates to 2 + * { nr=0, nodes[0]=-1 }, // Node 2 does not migrate + * { nr=1, nodes[0]=4 }, // Node 3 migrates to 4 + * { nr=1, nodes[0]=5 }, // Node 4 migrates to 5 + * { nr=0, nodes[0]=-1 }, // Node 5 does not migrate + * + * Moreover some systems may have multiple slow memory nodes. + * Suppose a system has one socket with 3 memory nodes, node 0 + * is fast memory type, and node 1/2 both are slow memory + * type, and the distance between fast memory node and slow + * memory node is same. So the migration path should be: + * + * 0 -> 1/2 -> stop + * + * This is represented in the node_demotion[] like this: + * { nr=2, {nodes[0]=1, nodes[1]=2} }, // Node 0 migrates to node 1 and node 2 + * { nr=0, nodes[0]=-1, }, // Node 1 dose not migrate + * { nr=0, nodes[0]=-1, }, // Node 2 does not migrate + */ + +/* + * Writes to this array occur without locking. Cycles are + * not allowed: Node X demotes to Y which demotes to X... + * + * If multiple reads are performed, a single rcu_read_lock() + * must be held over all reads to ensure that no cycles are + * observed. 
+ */ +#define DEFAULT_DEMOTION_TARGET_NODES 15 + +#if MAX_NUMNODES < DEFAULT_DEMOTION_TARGET_NODES +#define DEMOTION_TARGET_NODES (MAX_NUMNODES - 1) +#else +#define DEMOTION_TARGET_NODES DEFAULT_DEMOTION_TARGET_NODES +#endif + +struct demotion_nodes { + unsigned short nr; + short nodes[DEMOTION_TARGET_NODES]; +}; + +static struct demotion_nodes *node_demotion __read_mostly; + +/** + * next_demotion_node() - Get the next node in the demotion path + * @node: The starting node to lookup the next node + * + * Return: node id for next memory node in the demotion path hierarchy + * from @node; NUMA_NO_NODE if @node is terminal. This does not keep + * @node online or guarantee that it *continues* to be the next demotion + * target. + */ +int next_demotion_node(int node) +{ + struct demotion_nodes *nd; + unsigned short target_nr, index; + int target; + + if (!node_demotion) + return NUMA_NO_NODE; + + nd = &node_demotion[node]; + + /* + * node_demotion[] is updated without excluding this + * function from running. RCU doesn't provide any + * compiler barriers, so the READ_ONCE() is required + * to avoid compiler reordering or read merging. + * + * Make sure to use RCU over entire code blocks if + * node_demotion[] reads need to be consistent. + */ + rcu_read_lock(); + target_nr = READ_ONCE(nd->nr); + + switch (target_nr) { + case 0: + target = NUMA_NO_NODE; + goto out; + case 1: + index = 0; + break; + default: + /* + * If there are multiple target nodes, just select one + * target node randomly. + * + * In addition, we can also use round-robin to select + * target node, but we should introduce another variable + * for node_demotion[] to record last selected target node, + * that may cause cache ping-pong due to the changing of + * last target node. Or introducing per-cpu data to avoid + * caching issue, which seems more complicated. So selecting + * target node randomly seems better until now. 
+ */ + index = get_random_int() % target_nr; + break; + } + + target = READ_ONCE(nd->nodes[index]); + +out: + rcu_read_unlock(); + return target; +} + #if defined(CONFIG_HOTPLUG_CPU) /* Disable reclaim-based migration. */ static void __disable_all_migrate_targets(void) From patchwork Fri Jan 14 22:08:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714134 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 511ADC433EF for ; Fri, 14 Jan 2022 22:08:59 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DBFF26B0159; Fri, 14 Jan 2022 17:08:58 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id D6CC56B015B; Fri, 14 Jan 2022 17:08:58 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C35466B015C; Fri, 14 Jan 2022 17:08:58 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0134.hostedemail.com [216.40.44.134]) by kanga.kvack.org (Postfix) with ESMTP id B09C06B0159 for ; Fri, 14 Jan 2022 17:08:58 -0500 (EST) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 722EC9675D for ; Fri, 14 Jan 2022 22:08:58 +0000 (UTC) X-FDA: 79030283556.29.E7EF0D9 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf28.hostedemail.com (Postfix) with ESMTP id B49F1C0011 for ; Fri, 14 Jan 2022 22:08:57 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 9399CCE2384; Fri, 14 Jan 2022 22:08:55 +0000 (UTC) 
Received: by smtp.kernel.org (Postfix) with ESMTPSA id B32DEC36AED; Fri, 14 Jan 2022 22:08:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198133; bh=OEr23vaDxvm6QNKQAFBebrhz6efR0tnYgbOQuxOfUbI=; h=Date:From:To:Subject:In-Reply-To:From; b=ldadwEOjh9XwSw7AQIVUBDy+R+Kogjb5/jz2EKoHjbGN53kCFljdasINAkyCkHgZL 8LXOSuwXP0OJsvO83deZ4/1DkgpgW9Z3dUIVutqu9VAHEPj9IxVI6fMSHPIqrp6tzF RWIcFejmvED6ff/ey9tgrFjhdK+6YA+Sy2BuyAx4= Date: Fri, 14 Jan 2022 14:08:53 -0800 From: Andrew Morton To: akpm@linux-foundation.org, colin.i.king@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 108/146] mm/migrate: remove redundant variables used in a for-loop Message-ID: <20220114220853.LyjnURLZU%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Stat-Signature: pegyddrr7npnwmp4jyez9xjjdnngw3rr Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=ldadwEOj; dmarc=none; spf=pass (imf28.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: B49F1C0011 X-HE-Tag: 1642198137-289539 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Colin Ian King Subject: mm/migrate: remove redundant variables used in a for-loop The variable addr is being set and incremented in a for-loop but not actually being used. It is redundant and so addr and also variable start can be removed. 
Link: https://lkml.kernel.org/r/20211221185729.609630-1-colin.i.king@gmail.com Signed-off-by: Colin Ian King Signed-off-by: Andrew Morton --- mm/migrate.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) --- a/mm/migrate.c~mm-migrate-remove-redundant-variables-used-in-a-for-loop +++ a/mm/migrate.c @@ -2481,8 +2481,7 @@ static bool migrate_vma_check_page(struc static void migrate_vma_unmap(struct migrate_vma *migrate) { const unsigned long npages = migrate->npages; - const unsigned long start = migrate->start; - unsigned long addr, i, restore = 0; + unsigned long i, restore = 0; bool allow_drain = true; lru_add_drain(); @@ -2528,7 +2527,7 @@ static void migrate_vma_unmap(struct mig } } - for (addr = start, i = 0; i < npages && restore; addr += PAGE_SIZE, i++) { + for (i = 0; i < npages && restore; i++) { struct page *page = migrate_pfn_to_page(migrate->src[i]); if (!page || (migrate->src[i] & MIGRATE_PFN_MIGRATE)) From patchwork Fri Jan 14 22:08:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714135 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4E796C433F5 for ; Fri, 14 Jan 2022 22:09:00 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 35D246B015B; Fri, 14 Jan 2022 17:08:59 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 30D976B015D; Fri, 14 Jan 2022 17:08:59 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1D3AA6B015E; Fri, 14 Jan 2022 17:08:59 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0112.hostedemail.com [216.40.44.112]) by kanga.kvack.org (Postfix) with ESMTP id 08DDF6B015D for ; Fri, 14 Jan 2022 17:08:59 -0500 (EST) 
Received: from smtpin10.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id C24379675D for ; Fri, 14 Jan 2022 22:08:58 +0000 (UTC) X-FDA: 79030283556.10.D2518D9 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf24.hostedemail.com (Postfix) with ESMTP id 53BC0180010 for ; Fri, 14 Jan 2022 22:08:58 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 58E62B8262F; Fri, 14 Jan 2022 22:08:57 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BAFFEC36AE9; Fri, 14 Jan 2022 22:08:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198137; bh=X8f+gyK//G9NkkmWg+k7VoiisPVxh9cYQA0c6j7PTjc=; h=Date:From:To:Subject:In-Reply-To:From; b=cwSwCBujZyMsabrH0xJtOOXfLhdLAQy+YrxaZJe73JUBjJGhBfwC/OhDY8CFeEqZj Sm/dkp1xry0US/UaE4jcooDkNfA3HboGc6LXubM5MCq422LBHx58FWbPJSTlvXTLIU FbDYcQoySkQmTJuPFqApkeF3posKlx0bfTDPNG6w= Date: Fri, 14 Jan 2022 14:08:56 -0800 From: Andrew Morton To: akpm@linux-foundation.org, anshuman.khandual@arm.com, david@redhat.com, kirill@shutemov.name, linux-mm@kvack.org, mingo@redhat.com, mm-commits@vger.kernel.org, rostedt@goodmis.org, torvalds@linux-foundation.org Subject: [patch 109/146] mm/thp: drop unused trace events hugepage_[invalidate|splitting] Message-ID: <20220114220856.uc7z7U0Nn%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 53BC0180010 X-Stat-Signature: ij346cjsat4bkdzxdzdeha3m4oqonfhu Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=cwSwCBuj; spf=pass (imf24.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as 
permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198138-665928 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Anshuman Khandual Subject: mm/thp: drop unused trace events hugepage_[invalidate|splitting] The trace events hugepage_[invalidate|splitting] were added by commit 9e813308a5c1 ("powerpc/thp: Add tracepoints to track hugepage invalidate"). Their call sites, i.e. trace_hugepage_[invalidate|splitting], were later dropped, leaving these tracepoints unused. So drop them. Link: https://lkml.kernel.org/r/1641546351-15109-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual Reviewed-by: David Hildenbrand Cc: Steven Rostedt Cc: Ingo Molnar Cc: Kirill A. Shutemov Signed-off-by: Andrew Morton --- include/trace/events/thp.h | 35 ----------------------------------- 1 file changed, 35 deletions(-) --- a/include/trace/events/thp.h~mm-thp-drop-unused-trace-events-hugepage_ +++ a/include/trace/events/thp.h @@ -8,24 +8,6 @@ #include #include -TRACE_EVENT(hugepage_invalidate, - - TP_PROTO(unsigned long addr, unsigned long pte), - TP_ARGS(addr, pte), - TP_STRUCT__entry( - __field(unsigned long, addr) - __field(unsigned long, pte) - ), - - TP_fast_assign( - __entry->addr = addr; - __entry->pte = pte; - ), - - TP_printk("hugepage invalidate at addr 0x%lx and pte = 0x%lx", - __entry->addr, __entry->pte) -); - TRACE_EVENT(hugepage_set_pmd, TP_PROTO(unsigned long addr, unsigned long pmd), @@ -65,23 +47,6 @@ TRACE_EVENT(hugepage_update, TP_printk("hugepage update at addr 0x%lx and pte = 0x%lx clr = 0x%lx, set = 0x%lx", __entry->addr, __entry->pte, __entry->clr, __entry->set) ); -TRACE_EVENT(hugepage_splitting, - - TP_PROTO(unsigned long addr, unsigned long pte), - TP_ARGS(addr, pte), - TP_STRUCT__entry( - __field(unsigned long, addr) - __field(unsigned long, pte) - ), - - TP_fast_assign( - __entry->addr
= addr; - __entry->pte = pte; - ), - - TP_printk("hugepage splitting at addr 0x%lx and pte = 0x%lx", - __entry->addr, __entry->pte) -); #endif /* _TRACE_THP_H */ From patchwork Fri Jan 14 22:08:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714136 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0B789C433EF for ; Fri, 14 Jan 2022 22:09:04 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 929AC6B015D; Fri, 14 Jan 2022 17:09:03 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 8D8776B015F; Fri, 14 Jan 2022 17:09:03 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7C7E86B0160; Fri, 14 Jan 2022 17:09:03 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0042.hostedemail.com [216.40.44.42]) by kanga.kvack.org (Postfix) with ESMTP id 6E87A6B015D for ; Fri, 14 Jan 2022 17:09:03 -0500 (EST) Received: from smtpin25.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 2FE268F6D3 for ; Fri, 14 Jan 2022 22:09:03 +0000 (UTC) X-FDA: 79030283766.25.C233382 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf10.hostedemail.com (Postfix) with ESMTP id 96EA7C0005 for ; Fri, 14 Jan 2022 22:09:02 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 915ECB82A26; Fri, 14 Jan 2022 22:09:01 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0ABFFC36AE9; Fri, 14 Jan 2022 22:09:00 +0000 (UTC) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198140; bh=DlxIKWoIIBJ6DIUVppCn1s3lQaDYGfLGvk8Cw0emKH0=; h=Date:From:To:Subject:In-Reply-To:From; b=TBE9jCN14aBf+yYyQt4xJ8DWpgRhVo8Z6Rqa92XScQXY5W+c+vHjd5JRE/jZId4Kt yrzxfNNl1GVU1Ros8334SttGIsXdbBGSeUARZhTKFBWOiP2DFzMypHvlI3yFMn4Tez fXiOvwvRB3XVEM0MeeGM5PLGaqHuIeDi4D9h1Yu8= Date: Fri, 14 Jan 2022 14:08:59 -0800 From: Andrew Morton To: akpm@linux-foundation.org, hughd@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, sunnanyong@huawei.com, torvalds@linux-foundation.org, wangkefeng.wang@huawei.com Subject: [patch 110/146] mm: ksm: fix use-after-free kasan report in ksm_might_need_to_copy Message-ID: <20220114220859.RBibTk8EB%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 96EA7C0005 X-Stat-Signature: w9nm1n71ffmsa9chiy53csijgdgz64ay Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=TBE9jCN1; spf=pass (imf10.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198142-761338 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Nanyong Sun Subject: mm: ksm: fix use-after-free kasan report in ksm_might_need_to_copy Under swap-in/swap-out stress with KSM enabled, there is a low probability that KASAN reports a use-after-free BUG in ksm_might_need_to_copy() during swap-in. The freed object is the anon_vma obtained from page_anon_vma(page). This happens because a swapcache page associated with one anon_vma is now needed for another anon_vma, but the page's original vma was unmapped and its anon_vma was freed.
In this case the if condition below always evaluates to false, so a new page is allocated and the contents are copied to it. The swap-in path then uses the new page and continues to run correctly, so the report is actually harmless. } else if (anon_vma->root == vma->anon_vma->root && page->index == linear_page_index(vma, address)) { This patch swaps the order of the two tests above to avoid the KASAN warning. The CPU now evaluates "page->index == linear_page_index(vma, address)" first; in this case it is generally false, which skips the read of anon_vma->root that may trigger the KASAN use-after-free warning. ================================================================== BUG: KASAN: use-after-free in ksm_might_need_to_copy+0x12e/0x5b0 Read of size 8 at addr ffff88be9977dbd0 by task khugepaged/694 CPU: 8 PID: 694 Comm: khugepaged Kdump: loaded Tainted: G OE - 4.18.0.x86_64 Hardware name: 1288H V5/BC11SPSC0, BIOS 7.93 01/14/2021 Call Trace: dump_stack+0xf1/0x19b print_address_description+0x70/0x360 kasan_report+0x1b2/0x330 ksm_might_need_to_copy+0x12e/0x5b0 do_swap_page+0x452/0xe70 __collapse_huge_page_swapin+0x24b/0x720 khugepaged_scan_pmd+0xcae/0x1ff0 khugepaged+0x8ee/0xd70 kthread+0x1a2/0x1d0 ret_from_fork+0x1f/0x40 Allocated by task 2306153: kasan_kmalloc+0xa0/0xd0 kmem_cache_alloc+0xc0/0x1c0 anon_vma_clone+0xf7/0x380 anon_vma_fork+0xc0/0x390 copy_process+0x447b/0x4810 _do_fork+0x118/0x620 do_syscall_64+0x112/0x360 entry_SYSCALL_64_after_hwframe+0x65/0xca Freed by task 2306242: __kasan_slab_free+0x130/0x180 kmem_cache_free+0x78/0x1d0 unlink_anon_vmas+0x19c/0x4a0 free_pgtables+0x137/0x1b0 exit_mmap+0x133/0x320 mmput+0x15e/0x390 do_exit+0x8c5/0x1210 do_group_exit+0xb5/0x1b0 __x64_sys_exit_group+0x21/0x30 do_syscall_64+0x112/0x360 entry_SYSCALL_64_after_hwframe+0x65/0xca The buggy address belongs to the object at ffff88be9977dba0 which belongs to the cache anon_vma_chain of size 64 The buggy address is located 48 bytes inside of 64-byte region [ffff88be9977dba0, ffff88be9977dbe0) The buggy address belongs to the
page: page:ffffea00fa65df40 count:1 mapcount:0 mapping:ffff888107717800 index:0x0 flags: 0x17ffffc0000100(slab) ================================================================== Link: https://lkml.kernel.org/r/20211202102940.1069634-1-sunnanyong@huawei.com Signed-off-by: Nanyong Sun Cc: Hugh Dickins Cc: Kefeng Wang Signed-off-by: Andrew Morton --- mm/ksm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/mm/ksm.c~mm-ksm-fix-use-after-free-kasan-report-in-ksm_might_need_to_copy +++ a/mm/ksm.c @@ -2576,8 +2576,8 @@ struct page *ksm_might_need_to_copy(stru return page; /* no need to copy it */ } else if (!anon_vma) { return page; /* no need to copy it */ - } else if (anon_vma->root == vma->anon_vma->root && - page->index == linear_page_index(vma, address)) { + } else if (page->index == linear_page_index(vma, address) && + anon_vma->root == vma->anon_vma->root) { return page; /* still no need to copy it */ } if (!PageUptodate(page)) From patchwork Fri Jan 14 22:09:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714137 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 99639C433EF for ; Fri, 14 Jan 2022 22:09:09 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 299FA6B015F; Fri, 14 Jan 2022 17:09:09 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 248EB6B0161; Fri, 14 Jan 2022 17:09:09 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0EA666B0162; Fri, 14 Jan 2022 17:09:09 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0097.hostedemail.com [216.40.44.97]) by kanga.kvack.org (Postfix) with ESMTP id F2B196B015F for ; Fri, 14 Jan 2022 
17:09:08 -0500 (EST) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id AD67180C7BF8 for ; Fri, 14 Jan 2022 22:09:08 +0000 (UTC) X-FDA: 79030283976.11.8B887D7 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf14.hostedemail.com (Postfix) with ESMTP id DDB7C10000E for ; Fri, 14 Jan 2022 22:09:07 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 661B8CE19A9; Fri, 14 Jan 2022 22:09:05 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 40547C36AE5; Fri, 14 Jan 2022 22:09:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198143; bh=8YVev0luQKHfSKmGcdBlF001qT6IsletKbXz5bXpA98=; h=Date:From:To:Subject:In-Reply-To:From; b=OTRHQqQnj2pX4SESqTWgsFLqYHEdBltIPHwFXzIie3Heq/G62exfY+MRBEOYrwvsG EC/lF1M2vsixT54b/bb+TFw6I/HoJT4SUH+GIRRr5CBbx1U5CvYPfDrpa7IjAJ0v+k PiD1SxpFE1+y5EvYVzma2KgxGNPwVOIHO4QC0JOQ= Date: Fri, 14 Jan 2022 14:09:02 -0800 From: Andrew Morton To: akpm@linux-foundation.org, aneesh.kumar@linux.vnet.ibm.com, david@redhat.com, dinghui@sangfor.com.cn, linmiaohe@huawei.com, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, osalvador@suse.de, peterx@redhat.com, shy828301@gmail.com, tony.luck@intel.com, torvalds@linux-foundation.org Subject: [patch 111/146] mm/hwpoison: mf_mutex for soft offline and unpoison Message-ID: <20220114220902.48bKohjgW%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: DDB7C10000E X-Stat-Signature: pwfxk5oyua7fb3zsq8i6p6wsge7h4jjn Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=OTRHQqQn; 
dmarc=none; spf=pass (imf14.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam06 X-HE-Tag: 1642198147-628842 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Naoya Horiguchi Subject: mm/hwpoison: mf_mutex for soft offline and unpoison Patch series "mm/hwpoison: fix unpoison_memory()", v4. The main purpose of this series is to sync the unpoison code with recent changes in how the hwpoison code takes page refcounts. Unpoison should either work or simply fail (without crashing) when it is impossible. The recent work on keeping hwpoison pages in the shmem pagecache introduces a new state for hwpoisoned pages, but unpoison for such pages is not yet supported by this series. It seems that soft-offline and unpoison could be used as a general-purpose page offline/online mechanism (outside the context of memory errors). I think some additional work is needed to realize that, because soft-offline and unpoison are currently assumed not to happen frequently (they print too many messages for aggressive use cases). But anyway, this could be an interesting next topic. v1: https://lore.kernel.org/linux-mm/20210614021212.223326-1-nao.horiguchi@gmail.com/ v2: https://lore.kernel.org/linux-mm/20211025230503.2650970-1-naoya.horiguchi@linux.dev/ v3: https://lore.kernel.org/linux-mm/20211105055058.3152564-1-naoya.horiguchi@linux.dev/ This patch (of 3): mf_mutex was originally introduced to serialize multiple MCE events, but there is little benefit in allowing unpoison to run in parallel with memory_failure() and soft offline. So apply mf_mutex to soft offline and unpoison as well. The memory failure handler and soft offline handler become simpler as a result.
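For illustration only (not part of this patch), the serialization described above can be sketched in userspace C with a pthread mutex standing in for the kernel mutex. All names here (mf_mutex_sketch, page_poisoned, num_poisoned, soft_offlined and the three *_sketch() functions) are hypothetical, and a single flag stands in for the per-page PG_hwpoison bit. The point is that once all three entry points take the same mutex, none of them needs to recheck the flag for a concurrent change made by another.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* One mutex shared by all three entry points, like the file-scope mf_mutex. */
static pthread_mutex_t mf_mutex_sketch = PTHREAD_MUTEX_INITIALIZER;

static bool page_poisoned;  /* stand-in for PG_hwpoison on a single page */
static long num_poisoned;   /* stand-in for the poisoned-page counter */
static long soft_offlined;  /* counts successful soft offline attempts */

/* Mark the page poisoned; a repeated event on the same page is a no-op. */
int memory_failure_sketch(void)
{
	pthread_mutex_lock(&mf_mutex_sketch);
	if (!page_poisoned) {
		page_poisoned = true;
		num_poisoned++;
	}
	pthread_mutex_unlock(&mf_mutex_sketch);
	return 0;
}

/* Clear the poison flag; every exit path leaves through unlock_mutex. */
int unpoison_memory_sketch(void)
{
	int ret = 0;

	pthread_mutex_lock(&mf_mutex_sketch);
	if (!page_poisoned)
		goto unlock_mutex;  /* page was already unpoisoned */

	assert(num_poisoned > 0);
	page_poisoned = false;
	num_poisoned--;

unlock_mutex:
	pthread_mutex_unlock(&mf_mutex_sketch);
	return ret;
}

/* Soft offline skips a page that is already poisoned. */
int soft_offline_page_sketch(void)
{
	pthread_mutex_lock(&mf_mutex_sketch);
	if (page_poisoned) {
		pthread_mutex_unlock(&mf_mutex_sketch);
		return 0;
	}
	soft_offlined++;  /* the real code would migrate the page here */
	pthread_mutex_unlock(&mf_mutex_sketch);
	return 0;
}
```

Because the flag is only read or written with the mutex held, the "just unpoisoned" rechecks that this patch removes from the memory failure paths have no counterpart in the sketch.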
Link: https://lkml.kernel.org/r/20211115084006.3728254-1-naoya.horiguchi@linux.dev Link: https://lkml.kernel.org/r/20211115084006.3728254-2-naoya.horiguchi@linux.dev Signed-off-by: Naoya Horiguchi Reviewed-by: Yang Shi Cc: "Aneesh Kumar K.V" Cc: David Hildenbrand Cc: Ding Hui Cc: Miaohe Lin Cc: Michal Hocko Cc: Oscar Salvador Cc: Peter Xu Cc: Tony Luck Signed-off-by: Andrew Morton --- mm/memory-failure.c | 62 ++++++++++++------------------------------ 1 file changed, 18 insertions(+), 44 deletions(-) --- a/mm/memory-failure.c~mm-hwpoison-mf_mutex-for-soft-offline-and-unpoison +++ a/mm/memory-failure.c @@ -1502,14 +1502,6 @@ static int memory_failure_hugetlb(unsign lock_page(head); page_flags = head->flags; - if (!PageHWPoison(head)) { - pr_err("Memory failure: %#lx: just unpoisoned\n", pfn); - num_poisoned_pages_dec(); - unlock_page(head); - put_page(head); - return 0; - } - /* * TODO: hwpoison for pud-sized hugetlb doesn't work right now, so * simply disable it. In order to make it work properly, we need @@ -1623,6 +1615,8 @@ out: return rc; } +static DEFINE_MUTEX(mf_mutex); + /** * memory_failure - Handle memory failure of a page. 
* @pfn: Page Number of the corrupted page @@ -1649,7 +1643,6 @@ int memory_failure(unsigned long pfn, in int res = 0; unsigned long page_flags; bool retry = true; - static DEFINE_MUTEX(mf_mutex); if (!sysctl_memory_failure_recovery) panic("Memory failure on page %lx", pfn); @@ -1783,16 +1776,6 @@ try_again: */ page_flags = p->flags; - /* - * unpoison always clear PG_hwpoison inside page lock - */ - if (!PageHWPoison(p)) { - pr_err("Memory failure: %#lx: just unpoisoned\n", pfn); - num_poisoned_pages_dec(); - unlock_page(p); - put_page(p); - goto unlock_mutex; - } if (hwpoison_filter(p)) { if (TestClearPageHWPoison(p)) num_poisoned_pages_dec(); @@ -1973,6 +1956,7 @@ int unpoison_memory(unsigned long pfn) struct page *page; struct page *p; int freeit = 0; + int ret = 0; unsigned long flags = 0; static DEFINE_RATELIMIT_STATE(unpoison_rs, DEFAULT_RATELIMIT_INTERVAL, DEFAULT_RATELIMIT_BURST); @@ -1983,39 +1967,30 @@ int unpoison_memory(unsigned long pfn) p = pfn_to_page(pfn); page = compound_head(p); + mutex_lock(&mf_mutex); + if (!PageHWPoison(p)) { unpoison_pr_info("Unpoison: Page was already unpoisoned %#lx\n", pfn, &unpoison_rs); - return 0; + goto unlock_mutex; } if (page_count(page) > 1) { unpoison_pr_info("Unpoison: Someone grabs the hwpoison page %#lx\n", pfn, &unpoison_rs); - return 0; + goto unlock_mutex; } if (page_mapped(page)) { unpoison_pr_info("Unpoison: Someone maps the hwpoison page %#lx\n", pfn, &unpoison_rs); - return 0; + goto unlock_mutex; } if (page_mapping(page)) { unpoison_pr_info("Unpoison: the hwpoison page has non-NULL mapping %#lx\n", pfn, &unpoison_rs); - return 0; - } - - /* - * unpoison_memory() can encounter thp only when the thp is being - * worked by memory_failure() and the page lock is not held yet. - * In such case, we yield to memory_failure() and make unpoison fail. 
- */ - if (!PageHuge(page) && PageTransHuge(page)) { - unpoison_pr_info("Unpoison: Memory failure is now running on %#lx\n", - pfn, &unpoison_rs); - return 0; + goto unlock_mutex; } if (!get_hwpoison_page(p, flags)) { @@ -2023,29 +1998,23 @@ int unpoison_memory(unsigned long pfn) num_poisoned_pages_dec(); unpoison_pr_info("Unpoison: Software-unpoisoned free page %#lx\n", pfn, &unpoison_rs); - return 0; + goto unlock_mutex; } - lock_page(page); - /* - * This test is racy because PG_hwpoison is set outside of page lock. - * That's acceptable because that won't trigger kernel panic. Instead, - * the PG_hwpoison page will be caught and isolated on the entrance to - * the free buddy page pool. - */ if (TestClearPageHWPoison(page)) { unpoison_pr_info("Unpoison: Software-unpoisoned page %#lx\n", pfn, &unpoison_rs); num_poisoned_pages_dec(); freeit = 1; } - unlock_page(page); put_page(page); if (freeit && !(pfn == my_zero_pfn(0) && page_count(p) == 1)) put_page(page); - return 0; +unlock_mutex: + mutex_unlock(&mf_mutex); + return ret; } EXPORT_SYMBOL(unpoison_memory); @@ -2226,9 +2195,12 @@ int soft_offline_page(unsigned long pfn, return -EIO; } + mutex_lock(&mf_mutex); + if (PageHWPoison(page)) { pr_info("%s: %#lx page already poisoned\n", __func__, pfn); put_ref_page(ref_page); + mutex_unlock(&mf_mutex); return 0; } @@ -2247,5 +2219,7 @@ retry: } } + mutex_unlock(&mf_mutex); + return ret; } From patchwork Fri Jan 14 22:09:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714138 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8A087C433FE for ; Fri, 14 Jan 2022 22:09:12 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 188C46B0161; Fri, 14 Jan 2022 17:09:12 -0500 (EST) Received: by 
kanga.kvack.org (Postfix, from userid 40) id 137E46B0163; Fri, 14 Jan 2022 17:09:12 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 026B76B0164; Fri, 14 Jan 2022 17:09:11 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0057.hostedemail.com [216.40.44.57]) by kanga.kvack.org (Postfix) with ESMTP id E82F46B0161 for ; Fri, 14 Jan 2022 17:09:11 -0500 (EST) Received: from smtpin04.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id B68B2944D0 for ; Fri, 14 Jan 2022 22:09:11 +0000 (UTC) X-FDA: 79030284102.04.B78C471 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf29.hostedemail.com (Postfix) with ESMTP id F3FCC12001D for ; Fri, 14 Jan 2022 22:09:10 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id C8FFBCE2384; Fri, 14 Jan 2022 22:09:08 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A50BDC36AE9; Fri, 14 Jan 2022 22:09:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198147; bh=d6K5UYoNfBteh//DxGgqdVWBJmQMdSLVG9nyQrxvyWA=; h=Date:From:To:Subject:In-Reply-To:From; b=DQSQW6b3nU6Zoe5EQy1mNhG1Hksqn45qz/MOtNi3OmcIRh6P8n1VOn3zZEHBrPJjN 7VTp4GbTCXfybP2mV12xLb2CoV1mLNqb3lj5fVNoBQVu4A5gcTgBxHvNIVhpqrms6D itWgpGRon7Wi8Ar6QwQ6goGpIKn12W9ZK2vWuIBk= Date: Fri, 14 Jan 2022 14:09:06 -0800 From: Andrew Morton To: akpm@linux-foundation.org, aneesh.kumar@linux.vnet.ibm.com, david@redhat.com, dinghui@sangfor.com.cn, linmiaohe@huawei.com, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, osalvador@suse.de, peterx@redhat.com, shy828301@gmail.com, tony.luck@intel.com, 
torvalds@linux-foundation.org Subject: [patch 112/146] mm/hwpoison: remove MF_MSG_BUDDY_2ND and MF_MSG_POISONED_HUGE Message-ID: <20220114220906.qMYr-bmbz%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=DQSQW6b3; dmarc=none; spf=pass (imf29.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Stat-Signature: dtodhie59bj1xqxzs1p95ra846oy5dxo X-Rspamd-Queue-Id: F3FCC12001D X-Rspamd-Server: rspam12 X-HE-Tag: 1642198150-435583 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Naoya Horiguchi Subject: mm/hwpoison: remove MF_MSG_BUDDY_2ND and MF_MSG_POISONED_HUGE These action_page_types are no longer used, so remove them. 
Link: https://lkml.kernel.org/r/20211115084006.3728254-3-naoya.horiguchi@linux.dev Signed-off-by: Naoya Horiguchi Acked-by: Yang Shi Cc: "Aneesh Kumar K.V" Cc: David Hildenbrand Cc: Ding Hui Cc: Miaohe Lin Cc: Michal Hocko Cc: Oscar Salvador Cc: Peter Xu Cc: Tony Luck Signed-off-by: Andrew Morton --- include/linux/mm.h | 2 -- include/ras/ras_event.h | 2 -- mm/memory-failure.c | 2 -- 3 files changed, 6 deletions(-) --- a/include/linux/mm.h~mm-hwpoison-remove-mf_msg_buddy_2nd-and-mf_msg_poisoned_huge +++ a/include/linux/mm.h @@ -3201,7 +3201,6 @@ enum mf_action_page_type { MF_MSG_KERNEL_HIGH_ORDER, MF_MSG_SLAB, MF_MSG_DIFFERENT_COMPOUND, - MF_MSG_POISONED_HUGE, MF_MSG_HUGE, MF_MSG_FREE_HUGE, MF_MSG_NON_PMD_HUGE, @@ -3216,7 +3215,6 @@ enum mf_action_page_type { MF_MSG_CLEAN_LRU, MF_MSG_TRUNCATED_LRU, MF_MSG_BUDDY, - MF_MSG_BUDDY_2ND, MF_MSG_DAX, MF_MSG_UNSPLIT_THP, MF_MSG_UNKNOWN, --- a/include/ras/ras_event.h~mm-hwpoison-remove-mf_msg_buddy_2nd-and-mf_msg_poisoned_huge +++ a/include/ras/ras_event.h @@ -358,7 +358,6 @@ TRACE_EVENT(aer_event, EM ( MF_MSG_KERNEL_HIGH_ORDER, "high-order kernel page" ) \ EM ( MF_MSG_SLAB, "kernel slab page" ) \ EM ( MF_MSG_DIFFERENT_COMPOUND, "different compound page after locking" ) \ - EM ( MF_MSG_POISONED_HUGE, "huge page already hardware poisoned" ) \ EM ( MF_MSG_HUGE, "huge page" ) \ EM ( MF_MSG_FREE_HUGE, "free huge page" ) \ EM ( MF_MSG_NON_PMD_HUGE, "non-pmd-sized huge page" ) \ @@ -373,7 +372,6 @@ TRACE_EVENT(aer_event, EM ( MF_MSG_CLEAN_LRU, "clean LRU page" ) \ EM ( MF_MSG_TRUNCATED_LRU, "already truncated LRU page" ) \ EM ( MF_MSG_BUDDY, "free buddy page" ) \ - EM ( MF_MSG_BUDDY_2ND, "free buddy page (2nd try)" ) \ EM ( MF_MSG_DAX, "dax page" ) \ EM ( MF_MSG_UNSPLIT_THP, "unsplit thp" ) \ EMe ( MF_MSG_UNKNOWN, "unknown page" ) --- a/mm/memory-failure.c~mm-hwpoison-remove-mf_msg_buddy_2nd-and-mf_msg_poisoned_huge +++ a/mm/memory-failure.c @@ -723,7 +723,6 @@ static const char * const action_page_ty [MF_MSG_KERNEL_HIGH_ORDER] = 
"high-order kernel page", [MF_MSG_SLAB] = "kernel slab page", [MF_MSG_DIFFERENT_COMPOUND] = "different compound page after locking", - [MF_MSG_POISONED_HUGE] = "huge page already hardware poisoned", [MF_MSG_HUGE] = "huge page", [MF_MSG_FREE_HUGE] = "free huge page", [MF_MSG_NON_PMD_HUGE] = "non-pmd-sized huge page", @@ -738,7 +737,6 @@ static const char * const action_page_ty [MF_MSG_CLEAN_LRU] = "clean LRU page", [MF_MSG_TRUNCATED_LRU] = "already truncated LRU page", [MF_MSG_BUDDY] = "free buddy page", - [MF_MSG_BUDDY_2ND] = "free buddy page (2nd try)", [MF_MSG_DAX] = "dax page", [MF_MSG_UNSPLIT_THP] = "unsplit thp", [MF_MSG_UNKNOWN] = "unknown page", From patchwork Fri Jan 14 22:09:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714139 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5A5FDC433EF for ; Fri, 14 Jan 2022 22:09:14 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E0E866B0163; Fri, 14 Jan 2022 17:09:13 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id D6F916B0165; Fri, 14 Jan 2022 17:09:13 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C0E796B0166; Fri, 14 Jan 2022 17:09:13 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id A9F3C6B0163 for ; Fri, 14 Jan 2022 17:09:13 -0500 (EST) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 681F718259E4F for ; Fri, 14 Jan 2022 22:09:13 +0000 (UTC) X-FDA: 79030284186.11.5664EB2 Received: from ams.source.kernel.org (ams.source.kernel.org 
[145.40.68.75]) by imf03.hostedemail.com (Postfix) with ESMTP id C26F120016 for ; Fri, 14 Jan 2022 22:09:12 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id CA4E8B8262F; Fri, 14 Jan 2022 22:09:11 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 09DDDC36AE5; Fri, 14 Jan 2022 22:09:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198150; bh=Ilbibu6gt61locO/NIi/QYhj8p/ZfYjIBqjfa1d6kXw=; h=Date:From:To:Subject:In-Reply-To:From; b=MCbaqil789pe9GrxT8YYssrbfgl6hgq57g/vhbq5023eCDMuMcwZM1tujp6eqkGCv bN/38RNOGwew5pNSIQA11n0w4Yq/ZMUNzIVwyodVuyMZt7YQxVMjgBCA+gl9DIYlfN kcoBXPyPbE1orUk9VhzecHOhannAQxoGK6plyeeQ= Date: Fri, 14 Jan 2022 14:09:09 -0800 From: Andrew Morton To: akpm@linux-foundation.org, aneesh.kumar@linux.vnet.ibm.com, david@redhat.com, dinghui@sangfor.com.cn, linmiaohe@huawei.com, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, osalvador@suse.de, peterx@redhat.com, shy828301@gmail.com, tony.luck@intel.com, torvalds@linux-foundation.org Subject: [patch 113/146] mm/hwpoison: fix unpoison_memory() Message-ID: <20220114220909.zXvwN3m7r%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: C26F120016 X-Stat-Signature: 16e81hncbhew1d7q4ak5hawuk4ar7off Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=MCbaqil7; spf=pass (imf03.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-Rspamd-Server: rspam07 X-HE-Tag: 1642198152-989717 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: 
owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Naoya Horiguchi Subject: mm/hwpoison: fix unpoison_memory()

After the recent soft-offline rework, error pages can be taken off the buddy allocator, but the existing unpoison_memory() does not properly undo that operation. Moreover, due to the recent change to __get_hwpoison_page(), get_page_unless_zero() is hardly ever called for hwpoisoned pages. So __get_hwpoison_page() is highly likely to return -EBUSY (meaning it failed to grab a page refcount), and unpoison just clears PG_hwpoison without releasing a refcount. That does not lead to a critical issue like a kernel panic, but unpoisoned pages never get back to buddy (they are leaked permanently), which is not good.

To (partially) fix this, we need to distinguish "taken off" pages from other types of hwpoisoned pages. We can't use the refcount or page flags for this purpose, so a pseudo flag is defined by hacking the ->private field. Someone might think that put_page() is enough to cancel taken-off pages, but the normal free path contains some operations not suitable for the current purpose, and can fire VM_BUG_ON().

Note that unpoison_memory() is now supposed to cancel only hwpoison events injected by madvise() or /sys/devices/system/memory/{hard,soft}_offline_page, not by MCE injection, so please don't try to use unpoison when testing with MCE injection.
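The pseudo-flag idea described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the struct page here is a two-field stand-in rather than the kernel's, and the helper names (page_hwpoison_taken_off() and friends, model_selfcheck()) are hypothetical. Only the MAGIC_HWPOISON value is taken from the patch itself.

```c
/* User-space model only: struct page is a stand-in, not the kernel's. */
#include <assert.h>
#include <stdbool.h>

#define MAGIC_HWPOISON 0x48575053U /* "HWPS", value taken from the patch */

struct page {
	bool hwpoison;         /* stands in for the PG_hwpoison page flag */
	unsigned long private; /* stands in for page->private */
};

/* "Taken off" means: poisoned AND the magic value sits in ->private. */
static bool page_hwpoison_taken_off(const struct page *page)
{
	return page->hwpoison && page->private == MAGIC_HWPOISON;
}

/* Mark a page as removed from the (modelled) buddy free list. */
static void set_page_hwpoison_taken_off(struct page *page)
{
	page->private = MAGIC_HWPOISON;
}

/* Drop the marker, but only for pages that are still poisoned. */
static void clear_page_hwpoison_taken_off(struct page *page)
{
	if (page->hwpoison)
		page->private = 0;
}

/* Walk one page through poison -> take off buddy -> unpoison. */
static void model_selfcheck(void)
{
	struct page p = { .hwpoison = true, .private = 0 };

	assert(!page_hwpoison_taken_off(&p)); /* poisoned, not taken off */
	set_page_hwpoison_taken_off(&p);
	assert(page_hwpoison_taken_off(&p));  /* poisoned and taken off */
	clear_page_hwpoison_taken_off(&p);
	p.hwpoison = false;                   /* models TestClearPageHWPoison() */
	assert(!page_hwpoison_taken_off(&p)); /* back to a normal page */
}
```

The point of the two-part test is that the magic value alone proves nothing, since ->private may hold arbitrary data for pages in other states; it only counts as a flag while PG_hwpoison is also set.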
[lkp@intel.com: report build failure for ARCH=i386] Link: https://lkml.kernel.org/r/20211115084006.3728254-4-naoya.horiguchi@linux.dev Signed-off-by: Naoya Horiguchi Reviewed-by: Yang Shi Cc: David Hildenbrand Cc: Oscar Salvador Cc: Michal Hocko Cc: Ding Hui Cc: Tony Luck Cc: "Aneesh Kumar K.V" Cc: Miaohe Lin Cc: Peter Xu Signed-off-by: Andrew Morton --- include/linux/mm.h | 1 include/linux/page-flags.h | 4 + mm/memory-failure.c | 109 ++++++++++++++++++++++++++++------- mm/page_alloc.c | 27 ++++++++ 4 files changed, 122 insertions(+), 19 deletions(-) --- a/include/linux/mm.h~mm-hwpoison-fix-unpoison_memory +++ a/include/linux/mm.h @@ -3174,6 +3174,7 @@ enum mf_flags { MF_ACTION_REQUIRED = 1 << 1, MF_MUST_KILL = 1 << 2, MF_SOFT_OFFLINE = 1 << 3, + MF_UNPOISON = 1 << 4, }; extern int memory_failure(unsigned long pfn, int flags); extern void memory_failure_queue(unsigned long pfn, int flags); --- a/include/linux/page-flags.h~mm-hwpoison-fix-unpoison_memory +++ a/include/linux/page-flags.h @@ -522,7 +522,11 @@ PAGEFLAG_FALSE(Uncached, uncached) PAGEFLAG(HWPoison, hwpoison, PF_ANY) TESTSCFLAG(HWPoison, hwpoison, PF_ANY) #define __PG_HWPOISON (1UL << PG_hwpoison) +#define MAGIC_HWPOISON 0x48575053U /* HWPS */ +extern void SetPageHWPoisonTakenOff(struct page *page); +extern void ClearPageHWPoisonTakenOff(struct page *page); extern bool take_page_off_buddy(struct page *page); +extern bool put_page_back_buddy(struct page *page); #else PAGEFLAG_FALSE(HWPoison, hwpoison) #define __PG_HWPOISON 0 --- a/mm/memory-failure.c~mm-hwpoison-fix-unpoison_memory +++ a/mm/memory-failure.c @@ -1160,6 +1160,22 @@ static int page_action(struct page_state return (result == MF_RECOVERED || result == MF_DELAYED) ? 
0 : -EBUSY; } +static inline bool PageHWPoisonTakenOff(struct page *page) +{ + return PageHWPoison(page) && page_private(page) == MAGIC_HWPOISON; +} + +void SetPageHWPoisonTakenOff(struct page *page) +{ + set_page_private(page, MAGIC_HWPOISON); +} + +void ClearPageHWPoisonTakenOff(struct page *page) +{ + if (PageHWPoison(page)) + set_page_private(page, 0); +} + /* * Return true if a page type of a given page is supported by hwpoison * mechanism (while handling could fail), otherwise false. This function @@ -1262,6 +1278,27 @@ out: return ret; } +static int __get_unpoison_page(struct page *page) +{ + struct page *head = compound_head(page); + int ret = 0; + bool hugetlb = false; + + ret = get_hwpoison_huge_page(head, &hugetlb); + if (hugetlb) + return ret; + + /* + * PageHWPoisonTakenOff pages are not only marked as PG_hwpoison, + * but also isolated from buddy freelist, so need to identify the + * state and have to cancel both operations to unpoison. + */ + if (PageHWPoisonTakenOff(page)) + return -EHWPOISON; + + return get_page_unless_zero(page) ? 1 : 0; +} + /** * get_hwpoison_page() - Get refcount for memory error handling * @p: Raw error page (hit by memory error) @@ -1278,18 +1315,26 @@ out: * extra care for the error page's state (as done in __get_hwpoison_page()), * and has some retry logic in get_any_page(). * + * When called from unpoison_memory(), the caller should already ensure that + * the given page has PG_hwpoison. So it's never reused for other page + * allocations, and __get_unpoison_page() never races with them. + * * Return: 0 on failure, * 1 on success for in-use pages in a well-defined state, * -EIO for pages on which we can not handle memory errors, * -EBUSY when get_hwpoison_page() has raced with page lifecycle - * operations like allocation and free. + * operations like allocation and free, + * -EHWPOISON when the page is hwpoisoned and taken off from buddy. 
*/ static int get_hwpoison_page(struct page *p, unsigned long flags) { int ret; zone_pcp_disable(page_zone(p)); - ret = get_any_page(p, flags); + if (flags & MF_UNPOISON) + ret = __get_unpoison_page(p); + else + ret = get_any_page(p, flags); zone_pcp_enable(page_zone(p)); return ret; @@ -1937,6 +1982,28 @@ core_initcall(memory_failure_init); pr_info(fmt, pfn); \ }) +static inline int clear_page_hwpoison(struct ratelimit_state *rs, struct page *p) +{ + if (TestClearPageHWPoison(p)) { + unpoison_pr_info("Unpoison: Software-unpoisoned page %#lx\n", + page_to_pfn(p), rs); + num_poisoned_pages_dec(); + return 1; + } + return 0; +} + +static inline int unpoison_taken_off_page(struct ratelimit_state *rs, + struct page *p) +{ + if (put_page_back_buddy(p)) { + unpoison_pr_info("Unpoison: Software-unpoisoned page %#lx\n", + page_to_pfn(p), rs); + return 0; + } + return -EBUSY; +} + /** * unpoison_memory - Unpoison a previously poisoned page * @pfn: Page number of the to be unpoisoned page @@ -1953,9 +2020,7 @@ int unpoison_memory(unsigned long pfn) { struct page *page; struct page *p; - int freeit = 0; - int ret = 0; - unsigned long flags = 0; + int ret = -EBUSY; static DEFINE_RATELIMIT_STATE(unpoison_rs, DEFAULT_RATELIMIT_INTERVAL, DEFAULT_RATELIMIT_BURST); @@ -1991,24 +2056,30 @@ int unpoison_memory(unsigned long pfn) goto unlock_mutex; } - if (!get_hwpoison_page(p, flags)) { - if (TestClearPageHWPoison(p)) - num_poisoned_pages_dec(); - unpoison_pr_info("Unpoison: Software-unpoisoned free page %#lx\n", - pfn, &unpoison_rs); + if (PageSlab(page) || PageTable(page)) goto unlock_mutex; - } - if (TestClearPageHWPoison(page)) { - unpoison_pr_info("Unpoison: Software-unpoisoned page %#lx\n", - pfn, &unpoison_rs); - num_poisoned_pages_dec(); - freeit = 1; - } + ret = get_hwpoison_page(p, MF_UNPOISON); + if (!ret) { + if (clear_page_hwpoison(&unpoison_rs, page)) + ret = 0; + else + ret = -EBUSY; + } else if (ret < 0) { + if (ret == -EHWPOISON) { + ret = 
unpoison_taken_off_page(&unpoison_rs, p); + } else + unpoison_pr_info("Unpoison: failed to grab page %#lx\n", + pfn, &unpoison_rs); + } else { + int freeit = clear_page_hwpoison(&unpoison_rs, p); - put_page(page); - if (freeit && !(pfn == my_zero_pfn(0) && page_count(p) == 1)) put_page(page); + if (freeit && !(pfn == my_zero_pfn(0) && page_count(p) == 1)) { + put_page(page); + ret = 0; + } + } unlock_mutex: mutex_unlock(&mf_mutex); --- a/mm/page_alloc.c~mm-hwpoison-fix-unpoison_memory +++ a/mm/page_alloc.c @@ -19,6 +19,7 @@ #include #include #include +#include #include #include #include @@ -9508,6 +9509,7 @@ bool take_page_off_buddy(struct page *pa del_page_from_free_list(page_head, zone, page_order); break_down_buddy_pages(zone, page_head, page, 0, page_order, migratetype); + SetPageHWPoisonTakenOff(page); if (!is_migrate_isolate(migratetype)) __mod_zone_freepage_state(zone, -1, migratetype); ret = true; @@ -9519,6 +9521,31 @@ bool take_page_off_buddy(struct page *pa spin_unlock_irqrestore(&zone->lock, flags); return ret; } + +/* + * Cancel takeoff done by take_page_off_buddy(). 
+ */ +bool put_page_back_buddy(struct page *page) +{ + struct zone *zone = page_zone(page); + unsigned long pfn = page_to_pfn(page); + unsigned long flags; + int migratetype = get_pfnblock_migratetype(page, pfn); + bool ret = false; + + spin_lock_irqsave(&zone->lock, flags); + if (put_page_testzero(page)) { + ClearPageHWPoisonTakenOff(page); + __free_one_page(page, pfn, zone, 0, migratetype, FPI_NONE); + if (TestClearPageHWPoison(page)) { + num_poisoned_pages_dec(); + ret = true; + } + } + spin_unlock_irqrestore(&zone->lock, flags); + + return ret; +} #endif #ifdef CONFIG_ZONE_DMA From patchwork Fri Jan 14 22:09:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714140 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 78A67C433F5 for ; Fri, 14 Jan 2022 22:09:17 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 0EC716B0165; Fri, 14 Jan 2022 17:09:17 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 0757A6B0167; Fri, 14 Jan 2022 17:09:17 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E7FDC6B0168; Fri, 14 Jan 2022 17:09:16 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0066.hostedemail.com [216.40.44.66]) by kanga.kvack.org (Postfix) with ESMTP id D5F9A6B0165 for ; Fri, 14 Jan 2022 17:09:16 -0500 (EST) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 9606D998CC for ; Fri, 14 Jan 2022 22:09:16 +0000 (UTC) X-FDA: 79030284312.13.0B0AB49 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf03.hostedemail.com (Postfix) with ESMTP id EF8EF20012 
for ; Fri, 14 Jan 2022 22:09:15 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id E76E9B82A39; Fri, 14 Jan 2022 22:09:14 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 53D3DC36AED; Fri, 14 Jan 2022 22:09:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198153; bh=MVKBiCbgfbFPrjIK9rQgMu8XnzgLeX+0UI6C6gCEpV4=; h=Date:From:To:Subject:In-Reply-To:From; b=zqaHT1uZ9C8x53yDPeXscKhH5zQ2OE2SmRVg6bEtfoKCUdPCHK5Hz0NHaViKlyGH9 tK0TVAS1UfM4S1ddW00kBjajY+Nk7xN94POjsV0y51N/mEhkiOOtc/Uij7HeobQJSq ryrJ/VdOL/CW6a8DLtOK7qAHest7juc1AI9LQ0yU= Date: Fri, 14 Jan 2022 14:09:12 -0800 From: Andrew Morton To: akpm@linux-foundation.org, cl@linux.com, dennis@kernel.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, songmuchun@bytedance.com, tj@kernel.org, torvalds@linux-foundation.org, zhengqi.arch@bytedance.com Subject: [patch 114/146] mm: memcg/percpu: account extra objcg space to memory cgroups Message-ID: <20220114220912.5x99uuvei%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam10 X-Rspamd-Queue-Id: EF8EF20012 X-Stat-Signature: eyupqspgzwkw4wuxuug8pdukqr3wr5ne Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=zqaHT1uZ; dmarc=none; spf=pass (imf03.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1642198155-953855 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Qi Zheng Subject: mm: memcg/percpu: account extra objcg space to memory cgroups Similar to slab 
memory allocator, for each accounted percpu object there is an extra space which is used to store obj_cgroup membership. Charge it too. [akpm@linux-foundation.org: fix layout] Link: https://lkml.kernel.org/r/20211126040606.97836-1-zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng Acked-by: Dennis Zhou Cc: Tejun Heo Cc: Christoph Lameter Cc: Muchun Song Signed-off-by: Andrew Morton --- mm/percpu-internal.h | 18 ++++++++++++++++++ mm/percpu.c | 10 +++++----- 2 files changed, 23 insertions(+), 5 deletions(-) --- a/mm/percpu.c~mm-memcg-percpu-account-extra-objcg-space-to-memory-cgroups +++ a/mm/percpu.c @@ -1635,7 +1635,7 @@ static bool pcpu_memcg_pre_alloc_hook(si if (!objcg) return true; - if (obj_cgroup_charge(objcg, gfp, size * num_possible_cpus())) { + if (obj_cgroup_charge(objcg, gfp, pcpu_obj_full_size(size))) { obj_cgroup_put(objcg); return false; } @@ -1656,10 +1656,10 @@ static void pcpu_memcg_post_alloc_hook(s rcu_read_lock(); mod_memcg_state(obj_cgroup_memcg(objcg), MEMCG_PERCPU_B, - size * num_possible_cpus()); + pcpu_obj_full_size(size)); rcu_read_unlock(); } else { - obj_cgroup_uncharge(objcg, size * num_possible_cpus()); + obj_cgroup_uncharge(objcg, pcpu_obj_full_size(size)); obj_cgroup_put(objcg); } } @@ -1676,11 +1676,11 @@ static void pcpu_memcg_free_hook(struct return; chunk->obj_cgroups[off >> PCPU_MIN_ALLOC_SHIFT] = NULL; - obj_cgroup_uncharge(objcg, size * num_possible_cpus()); + obj_cgroup_uncharge(objcg, pcpu_obj_full_size(size)); rcu_read_lock(); mod_memcg_state(obj_cgroup_memcg(objcg), MEMCG_PERCPU_B, - -(size * num_possible_cpus())); + -pcpu_obj_full_size(size)); rcu_read_unlock(); obj_cgroup_put(objcg); --- a/mm/percpu-internal.h~mm-memcg-percpu-account-extra-objcg-space-to-memory-cgroups +++ a/mm/percpu-internal.h @@ -113,6 +113,24 @@ static inline int pcpu_chunk_map_bits(st return pcpu_nr_pages_to_map_bits(chunk->nr_pages); } +#ifdef CONFIG_MEMCG_KMEM +/** + * pcpu_obj_full_size - helper to calculate size of each accounted object + * 
@size: size of area to allocate in bytes + * + * For each accounted object there is an extra space which is used to store + * obj_cgroup membership. Charge it too. + */ +static inline size_t pcpu_obj_full_size(size_t size) +{ + size_t extra_size; + + extra_size = size / PCPU_MIN_ALLOC_SIZE * sizeof(struct obj_cgroup *); + + return size * num_possible_cpus() + extra_size; +} +#endif /* CONFIG_MEMCG_KMEM */ + #ifdef CONFIG_PERCPU_STATS #include From patchwork Fri Jan 14 22:09:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714141 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D30BAC433F5 for ; Fri, 14 Jan 2022 22:09:22 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 666226B0167; Fri, 14 Jan 2022 17:09:22 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 61B3E6B0169; Fri, 14 Jan 2022 17:09:22 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 50B066B016B; Fri, 14 Jan 2022 17:09:22 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0142.hostedemail.com [216.40.44.142]) by kanga.kvack.org (Postfix) with ESMTP id 3EC756B0167 for ; Fri, 14 Jan 2022 17:09:22 -0500 (EST) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 0CC6A80E5F1F for ; Fri, 14 Jan 2022 22:09:22 +0000 (UTC) X-FDA: 79030284564.13.280F67C Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf13.hostedemail.com (Postfix) with ESMTP id 23C1B2000D for ; Fri, 14 Jan 2022 22:09:20 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id AFCE8CE2497; Fri, 14 Jan 2022 22:09:18 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B6363C36AEF; Fri, 14 Jan 2022 22:09:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198157; bh=T3z6J5B2A7bPEhmCzHtZCKjTZNHGf9MgB8sBT7dXqFE=; h=Date:From:To:Subject:In-Reply-To:From; b=v5wt2457hvFOSqzA8ICDXZjqOkCmkKnBDihwa7U+QE+dW5A0jPrpoAnyGupGUmeOr AMksiCPm2MB2DH/tALfO1Gi1KRM4QzRh//lX9ZyFqtsI7gZZ6qrRCzHvEoiZTRHPye PmVYlcnIMa3IZ1FQOpwj/cz94wBBg7cbLuuQ0BLI= Date: Fri, 14 Jan 2022 14:09:16 -0800 From: Andrew Morton To: aarcange@redhat.com, akpm@linux-foundation.org, dave.hansen@linux.intel.com, elver@google.com, linux-mm@kvack.org, luto@kernel.org, mgorman@techsingularity.net, mm-commits@vger.kernel.org, namit@vmware.com, torvalds@linux-foundation.org, will@kernel.org, ying.huang@intel.com, yuzhao@google.com Subject: [patch 115/146] mm/rmap: fix potential batched TLB flush race Message-ID: <20220114220916.k8LSl3Sd6%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Stat-Signature: jrcbkm5hxtbt4835zc4ntukqx4p49gq7 Authentication-Results: imf13.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=v5wt2457; dmarc=none; spf=pass (imf13.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: 23C1B2000D X-HE-Tag: 1642198160-343428 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Huang Ying Subject: mm/rmap: fix potential batched TLB flush race In theory, the following race is possible for batched TLB flushing. 

CPU0                               CPU1
----                               ----
shrink_page_list()
                                   unmap
                                     zap_pte_range()
                                       flush_tlb_batched_pending()
                                         flush_tlb_mm()
  try_to_unmap()
    set_tlb_ubc_flush_pending()
      mm->tlb_flush_batched = true
                                         mm->tlb_flush_batched = false

After the TLB is flushed on CPU1 via flush_tlb_mm() and before mm->tlb_flush_batched is set to false, some PTE is unmapped on CPU0 and its TLB flush is left pending. That pending TLB flush will then be lost. Although both set_tlb_ubc_flush_pending() and flush_tlb_batched_pending() are called with the PTL held, different PTL instances may be used.

Because the race window is really small, and the lost TLB flush will cause a problem only if a TLB entry is inserted before the unmapping in the race window, the race is only theoretical. But the fix is simple and cheap too.

Syzbot has also reported this, as follows:

================================================================== BUG: KCSAN: data-race in flush_tlb_batched_pending / try_to_unmap_one write to 0xffff8881072cfbbc of 1 bytes by task 17406 on cpu 1: flush_tlb_batched_pending+0x5f/0x80 mm/rmap.c:691 madvise_free_pte_range+0xee/0x7d0 mm/madvise.c:594 walk_pmd_range mm/pagewalk.c:128 [inline] walk_pud_range mm/pagewalk.c:205 [inline] walk_p4d_range mm/pagewalk.c:240 [inline] walk_pgd_range mm/pagewalk.c:277 [inline] __walk_page_range+0x981/0x1160 mm/pagewalk.c:379 walk_page_range+0x131/0x300 mm/pagewalk.c:475 madvise_free_single_vma mm/madvise.c:734 [inline] madvise_dontneed_free mm/madvise.c:822 [inline] madvise_vma mm/madvise.c:996 [inline] do_madvise+0xe4a/0x1140 mm/madvise.c:1202 __do_sys_madvise mm/madvise.c:1228 [inline] __se_sys_madvise mm/madvise.c:1226 [inline] __x64_sys_madvise+0x5d/0x70 mm/madvise.c:1226 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae write to 0xffff8881072cfbbc of 1 bytes by task 71 on cpu 0: set_tlb_ubc_flush_pending mm/rmap.c:636 [inline] try_to_unmap_one+0x60e/0x1220 mm/rmap.c:1515
rmap_walk_anon+0x2fb/0x470 mm/rmap.c:2301 try_to_unmap+0xec/0x110 shrink_page_list+0xe91/0x2620 mm/vmscan.c:1719 shrink_inactive_list+0x3fb/0x730 mm/vmscan.c:2394 shrink_list mm/vmscan.c:2621 [inline] shrink_lruvec+0x3c9/0x710 mm/vmscan.c:2940 shrink_node_memcgs+0x23e/0x410 mm/vmscan.c:3129 shrink_node+0x8f6/0x1190 mm/vmscan.c:3252 kswapd_shrink_node mm/vmscan.c:4022 [inline] balance_pgdat+0x702/0xd30 mm/vmscan.c:4213 kswapd+0x200/0x340 mm/vmscan.c:4473 kthread+0x2c7/0x2e0 kernel/kthread.c:327 ret_from_fork+0x1f/0x30 value changed: 0x01 -> 0x00 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 71 Comm: kswapd0 Not tainted 5.16.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 ================================================================== [akpm@linux-foundation.org: tweak comments] Link: https://lkml.kernel.org/r/20211201021104.126469-1-ying.huang@intel.com Signed-off-by: "Huang, Ying" Reported-by: syzbot+aa5bebed695edaccf0df@syzkaller.appspotmail.com Cc: Nadav Amit Cc: Mel Gorman Cc: Andrea Arcangeli Cc: Andy Lutomirski Cc: Dave Hansen Cc: Will Deacon Cc: Yu Zhao Cc: Marco Elver Signed-off-by: Andrew Morton --- include/linux/mm_types.h | 2 - mm/rmap.c | 43 ++++++++++++++++++++++++++++++------- 2 files changed, 37 insertions(+), 8 deletions(-) --- a/include/linux/mm_types.h~mm-rmap-fix-potential-batched-tlb-flush-race +++ a/include/linux/mm_types.h @@ -647,7 +647,7 @@ struct mm_struct { atomic_t tlb_flush_pending; #ifdef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH /* See flush_tlb_batched_pending() */ - bool tlb_flush_batched; + atomic_t tlb_flush_batched; #endif struct uprobes_state uprobes_state; #ifdef CONFIG_PREEMPT_RT --- a/mm/rmap.c~mm-rmap-fix-potential-batched-tlb-flush-race +++ a/mm/rmap.c @@ -621,9 +621,20 @@ void try_to_unmap_flush_dirty(void) try_to_unmap_flush(); } +/* + * Bits 0-14 of mm->tlb_flush_batched record pending generations. 
+ * Bits 16-30 of mm->tlb_flush_batched bit record flushed generations. + */ +#define TLB_FLUSH_BATCH_FLUSHED_SHIFT 16 +#define TLB_FLUSH_BATCH_PENDING_MASK \ + ((1 << (TLB_FLUSH_BATCH_FLUSHED_SHIFT - 1)) - 1) +#define TLB_FLUSH_BATCH_PENDING_LARGE \ + (TLB_FLUSH_BATCH_PENDING_MASK / 2) + static void set_tlb_ubc_flush_pending(struct mm_struct *mm, bool writable) { struct tlbflush_unmap_batch *tlb_ubc = &current->tlb_ubc; + int batch, nbatch; arch_tlbbatch_add_mm(&tlb_ubc->arch, mm); tlb_ubc->flush_required = true; @@ -633,7 +644,22 @@ static void set_tlb_ubc_flush_pending(st * before the PTE is cleared. */ barrier(); - mm->tlb_flush_batched = true; + batch = atomic_read(&mm->tlb_flush_batched); +retry: + if ((batch & TLB_FLUSH_BATCH_PENDING_MASK) > TLB_FLUSH_BATCH_PENDING_LARGE) { + /* + * Prevent `pending' from catching up with `flushed' because of + * overflow. Reset `pending' and `flushed' to be 1 and 0 if + * `pending' becomes large. + */ + nbatch = atomic_cmpxchg(&mm->tlb_flush_batched, batch, 1); + if (nbatch != batch) { + batch = nbatch; + goto retry; + } + } else { + atomic_inc(&mm->tlb_flush_batched); + } /* * If the PTE was dirty then it's best to assume it's writable. The @@ -680,15 +706,18 @@ static bool should_defer_flush(struct mm */ void flush_tlb_batched_pending(struct mm_struct *mm) { - if (data_race(mm->tlb_flush_batched)) { - flush_tlb_mm(mm); + int batch = atomic_read(&mm->tlb_flush_batched); + int pending = batch & TLB_FLUSH_BATCH_PENDING_MASK; + int flushed = batch >> TLB_FLUSH_BATCH_FLUSHED_SHIFT; + if (pending != flushed) { + flush_tlb_mm(mm); /* - * Do not allow the compiler to re-order the clearing of - * tlb_flush_batched before the tlb is flushed. + * If the new TLB flushing is pending during flushing, leave + * mm->tlb_flush_batched as is, to avoid losing flushing. 
*/ - barrier(); - mm->tlb_flush_batched = false; + atomic_cmpxchg(&mm->tlb_flush_batched, batch, + pending | (pending << TLB_FLUSH_BATCH_FLUSHED_SHIFT)); } } #else From patchwork Fri Jan 14 22:09:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714142 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0306BC433EF for ; Fri, 14 Jan 2022 22:09:24 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 8DA276B0169; Fri, 14 Jan 2022 17:09:22 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 6E0296B016C; Fri, 14 Jan 2022 17:09:22 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 57CD76B016A; Fri, 14 Jan 2022 17:09:22 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0197.hostedemail.com [216.40.44.197]) by kanga.kvack.org (Postfix) with ESMTP id 478F56B0169 for ; Fri, 14 Jan 2022 17:09:22 -0500 (EST) Received: from smtpin15.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 08ACE18163995 for ; Fri, 14 Jan 2022 22:09:22 +0000 (UTC) X-FDA: 79030284564.15.FC512ED Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf07.hostedemail.com (Postfix) with ESMTP id 8933B4000F for ; Fri, 14 Jan 2022 22:09:21 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 757B6B8262E; Fri, 14 Jan 2022 22:09:20 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id ECB73C36AE5; Fri, 14 Jan 2022 22:09:19 +0000 
(UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198160; bh=poGYbaTX2+cz0z8Cnf73X/eJ76wvLpRwgchPkO/xz4E=; h=Date:From:To:Subject:In-Reply-To:From; b=fDR4ILereFpWIKbK7kzBNxzTW7WB+g1fKCTWc1Z/CpNSvjRkZqz+BYV3TqogYK4kC lRse9StUh8G4wGKgIOyrCjWPzsmPtrLkaJMIyXb/v0Y51gW0vOknIdJXzGEJUndjPy Ih60yHwlasQ2Qgx5NfC9pIUfCHYBCKaT3UWIat7Y= Date: Fri, 14 Jan 2022 14:09:19 -0800 From: Andrew Morton To: akpm@linux-foundation.org, ddstreet@ieee.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, zackary.liu.pro@gmail.com Subject: [patch 116/146] zpool: remove the list of pools_head Message-ID: <20220114220919.syBRmxBpv%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 8933B4000F X-Stat-Signature: 3s7wxtcy7zzqw3bk5bxqbu7pjq1e8c65 Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=fDR4ILer; dmarc=none; spf=pass (imf07.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam06 X-HE-Tag: 1642198161-788745 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Zhaoyu Liu Subject: zpool: remove the list of pools_head The list of pools_head is no longer needed because the caller has been deleted in commit 479305fd7172 ("zpool: remove zpool_evict()"). 
Link: https://lkml.kernel.org/r/20211215163727.GA17196@pc Signed-off-by: Zhaoyu Liu Cc: Dan Streetman Signed-off-by: Andrew Morton --- mm/zpool.c | 12 ------------ 1 file changed, 12 deletions(-) --- a/mm/zpool.c~zpool-remove-the-list-of-pools_head +++ a/mm/zpool.c @@ -24,16 +24,11 @@ struct zpool { const struct zpool_ops *ops; bool evictable; bool can_sleep_mapped; - - struct list_head list; }; static LIST_HEAD(drivers_head); static DEFINE_SPINLOCK(drivers_lock); -static LIST_HEAD(pools_head); -static DEFINE_SPINLOCK(pools_lock); - /** * zpool_register_driver() - register a zpool implementation. * @driver: driver to register @@ -195,10 +190,6 @@ struct zpool *zpool_create_pool(const ch pr_debug("created pool type %s\n", type); - spin_lock(&pools_lock); - list_add(&zpool->list, &pools_head); - spin_unlock(&pools_lock); - return zpool; } @@ -217,9 +208,6 @@ void zpool_destroy_pool(struct zpool *zp { pr_debug("destroying pool type %s\n", zpool->driver->type); - spin_lock(&pools_lock); - list_del(&zpool->list); - spin_unlock(&pools_lock); zpool->driver->destroy(zpool->pool); zpool_put_driver(zpool->driver); kfree(zpool); From patchwork Fri Jan 14 22:09:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714143 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0E3FFC433F5 for ; Fri, 14 Jan 2022 22:09:29 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 99D956B016C; Fri, 14 Jan 2022 17:09:28 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 94E446B016D; Fri, 14 Jan 2022 17:09:28 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 83D1B6B016E; Fri, 14 Jan 2022 17:09:28 -0500 (EST) X-Delivered-To: 
linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0160.hostedemail.com [216.40.44.160]) by kanga.kvack.org (Postfix) with ESMTP id 73C666B016C for ; Fri, 14 Jan 2022 17:09:28 -0500 (EST) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 3676780E7AD8 for ; Fri, 14 Jan 2022 22:09:28 +0000 (UTC) X-FDA: 79030284816.29.B51CBE1 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf03.hostedemail.com (Postfix) with ESMTP id AA96020006 for ; Fri, 14 Jan 2022 22:09:27 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 25CF0CE2384; Fri, 14 Jan 2022 22:09:25 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 046ADC36AEC; Fri, 14 Jan 2022 22:09:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198163; bh=chqW46Vo2JD2NNMvF3sjlrNHvbfLhnBMRF+ydxRlpy8=; h=Date:From:To:Subject:In-Reply-To:From; b=T7FAG8Bfnr3p+XepYjkOZfbFqa1aNWxifETuENZLktATsjc3NLw8YSKoJjF+M42KG TjeYv1SBQssmCSZ9YERMbxrmlt5Kifrh1GVvyXQgXnWFzuh4w5yXhX0o477e0wLQ3m mT6X3tkcq+2A4JCo3ozJpAQnI/wThygn0sFuDGmY= Date: Fri, 14 Jan 2022 14:09:22 -0800 From: Andrew Morton To: akpm@linux-foundation.org, axboe@kernel.dk, bvanassche@acm.org, linux-mm@kvack.org, mcgrof@kernel.org, minchan@kernel.org, mm-commits@vger.kernel.org, ngupta@vflare.org, senozhatsky@chromium.org, torvalds@linux-foundation.org Subject: [patch 117/146] zram: use ATTRIBUTE_GROUPS Message-ID: <20220114220922.EyQYZ4NNz%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam10 X-Rspamd-Queue-Id: AA96020006 X-Stat-Signature: yason7ua5ux8sxeugtzgdnykokeqrius Authentication-Results: 
imf03.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=T7FAG8Bf; dmarc=none; spf=pass (imf03.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1642198167-319332 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Luis Chamberlain Subject: zram: use ATTRIBUTE_GROUPS Embrace ATTRIBUTE_GROUPS to avoid boiler plate code. This should not introduce any functional changes. Link: https://lkml.kernel.org/r/20211028203600.2157356-1-mcgrof@kernel.org Signed-off-by: Luis Chamberlain Reviewed-by: Bart Van Assche Reviewed-by: Sergey Senozhatsky Cc: Minchan Kim Cc: Nitin Gupta Cc: Jens Axboe Signed-off-by: Andrew Morton --- drivers/block/zram/zram_drv.c | 11 ++--------- 1 file changed, 2 insertions(+), 9 deletions(-) --- a/drivers/block/zram/zram_drv.c~zram-use-attribute_groups +++ a/drivers/block/zram/zram_drv.c @@ -1903,14 +1903,7 @@ static struct attribute *zram_disk_attrs NULL, }; -static const struct attribute_group zram_disk_attr_group = { - .attrs = zram_disk_attrs, -}; - -static const struct attribute_group *zram_disk_attr_groups[] = { - &zram_disk_attr_group, - NULL, -}; +ATTRIBUTE_GROUPS(zram_disk); /* * Allocate and initialize new zram device. 
the function returns @@ -1982,7 +1975,7 @@ static int zram_add(void) blk_queue_max_write_zeroes_sectors(zram->disk->queue, UINT_MAX); blk_queue_flag_set(QUEUE_FLAG_STABLE_WRITES, zram->disk->queue); - ret = device_add_disk(NULL, zram->disk, zram_disk_attr_groups); + ret = device_add_disk(NULL, zram->disk, zram_disk_groups); if (ret) goto out_cleanup_disk; From patchwork Fri Jan 14 22:09:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714144 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8012DC4332F for ; Fri, 14 Jan 2022 22:09:30 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 189196B016D; Fri, 14 Jan 2022 17:09:30 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 139FE6B016F; Fri, 14 Jan 2022 17:09:30 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id F41CE6B0170; Fri, 14 Jan 2022 17:09:29 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0067.hostedemail.com [216.40.44.67]) by kanga.kvack.org (Postfix) with ESMTP id E4F996B016D for ; Fri, 14 Jan 2022 17:09:29 -0500 (EST) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id A9142180E9362 for ; Fri, 14 Jan 2022 22:09:29 +0000 (UTC) X-FDA: 79030284858.26.8120C94 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf11.hostedemail.com (Postfix) with ESMTP id D671F4000A for ; Fri, 14 Jan 2022 22:09:28 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by 
sin.source.kernel.org (Postfix) with ESMTPS id 9FD50CE2498; Fri, 14 Jan 2022 22:09:26 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 19E26C36AE9; Fri, 14 Jan 2022 22:09:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198166; bh=40c4mcsd+CSi9CF5yfo+Yph8I7LpRIVF0s/UlOw6gGk=; h=Date:From:To:Subject:In-Reply-To:From; b=ejdpZ9+z13uj5HtCRt7ST5GrC0YiQenHndbiTBn3DJL394SK19RMDi4uJgK+xlpqY atME2XC7A1FSPJ3pKupyTDm0TB10aw+lEt3igXOUnkGUFrYocSkXILblsxaFPSApUc Q0BJ+LEickDs1NU/GxKvfu6V8mM1J47HLrxXbqvY= Date: Fri, 14 Jan 2022 14:09:25 -0800 From: Andrew Morton To: akpm@linux-foundation.org, fuqf0919@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 118/146] mm: fix some comment errors Message-ID: <20220114220925.I63gWTTtt%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: D671F4000A X-Stat-Signature: 6mn1m4oqturdmi11n7wbno38qahng4t8 Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=ejdpZ9+z; spf=pass (imf11.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198168-91093 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Quanfa Fu Subject: mm: fix some comment errors Link: https://lkml.kernel.org/r/20211101040208.460810-1-fuqf0919@gmail.com Signed-off-by: Quanfa Fu Signed-off-by: Andrew Morton --- mm/khugepaged.c | 2 +- mm/memory-failure.c | 2 +- mm/slab_common.c | 2 +- mm/swap.c | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) --- a/mm/khugepaged.c~writeback-fix-some-comment-errors +++ a/mm/khugepaged.c @@ -1303,7 +1303,7 @@ static int 
khugepaged_scan_pmd(struct mm /* * Record which node the original page is from and save this * information to khugepaged_node_load[]. - * Khupaged will allocate hugepage from the node has the max + * Khugepaged will allocate hugepage from the node has the max * hit record. */ node = page_to_nid(page); --- a/mm/memory-failure.c~writeback-fix-some-comment-errors +++ a/mm/memory-failure.c @@ -1306,7 +1306,7 @@ static int __get_unpoison_page(struct pa * * get_hwpoison_page() takes a page refcount of an error page to handle memory * error on it, after checking that the error page is in a well-defined state - * (defined as a page-type we can successfully handle the memor error on it, + * (defined as a page-type we can successfully handle the memory error on it, * such as LRU page and hugetlb page). * * Memory error handling could be triggered at any time on any type of page, --- a/mm/slab_common.c~writeback-fix-some-comment-errors +++ a/mm/slab_common.c @@ -819,7 +819,7 @@ void __init setup_kmalloc_cache_index_ta if (KMALLOC_MIN_SIZE >= 64) { /* - * The 96 byte size cache is not used if the alignment + * The 96 byte sized cache is not used if the alignment * is 64 byte. */ for (i = 64 + 8; i <= 96; i += 8) --- a/mm/swap.c~writeback-fix-some-comment-errors +++ a/mm/swap.c @@ -882,7 +882,7 @@ void lru_cache_disable(void) * all online CPUs so any calls of lru_cache_disabled wrapped by * local_lock or preemption disabled would be ordered by that. * The atomic operation doesn't need to have stronger ordering - * requirements because that is enforeced by the scheduling + * requirements because that is enforced by the scheduling * guarantees. 
*/ __lru_add_drain_all(true); From patchwork Fri Jan 14 22:09:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714145 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3D165C433F5 for ; Fri, 14 Jan 2022 22:09:35 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id CF82D6B016F; Fri, 14 Jan 2022 17:09:34 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id CA8856B0171; Fri, 14 Jan 2022 17:09:34 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B99BE6B0172; Fri, 14 Jan 2022 17:09:34 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id AA4AE6B016F for ; Fri, 14 Jan 2022 17:09:34 -0500 (EST) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 6BB0C181CC1BF for ; Fri, 14 Jan 2022 22:09:34 +0000 (UTC) X-FDA: 79030285068.22.9C454AD Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf20.hostedemail.com (Postfix) with ESMTP id C33251C0007 for ; Fri, 14 Jan 2022 22:09:33 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 0BC54CE2497; Fri, 14 Jan 2022 22:09:31 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0F3A8C36AE9; Fri, 14 Jan 2022 22:09:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198169; 
bh=sJj+3KSGx0lT8xZbK1n+Apk8RaMnsSo4xCzroKRdGoc=; h=Date:From:To:Subject:In-Reply-To:From; b=2AEDeP6k2XFvv1PUg1DeQD6jPiPE/9JuFzLfuhIX/vFYUokjcTpvqRx9sssIcZ1ls 0D9faQcNwG+q0CYHsiyiFUUSRyG6EKNmqSWN+1ifTOUlQvx2yt2kJJjuY431p9hSXl YiW0HHB20b3zI+pv/8GadKriA89Sb1qymnaYa1OI= Date: Fri, 14 Jan 2022 14:09:28 -0800 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, liuting.0x7c00@bytedance.com, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 119/146] mm: make some vars and functions static or __init Message-ID: <20220114220928.xzpniNG1e%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: C33251C0007 X-Stat-Signature: g1iqwofsj5614h3r78nmw1wr8cckn1o7 Authentication-Results: imf20.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=2AEDeP6k; spf=pass (imf20.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-Rspamd-Server: rspam07 X-HE-Tag: 1642198173-750003 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Ting Liu Subject: mm: make some vars and functions static or __init "page_idle_ops" is a global variable, but it is only used within a single file, so it should be static. "page_ext_ops" is a variable used only during the kernel initialization phase, and the other functions are also used only during the kernel initialization phase, so they should be marked __init or __initdata so that their memory can be reclaimed. 
Link: https://lkml.kernel.org/r/20211217095023.67293-1-liuting.0x7c00@bytedance.com Signed-off-by: Ting Liu Signed-off-by: Andrew Morton --- include/linux/page_idle.h | 1 - mm/page_ext.c | 4 ++-- mm/page_owner.c | 4 ++-- 3 files changed, 4 insertions(+), 5 deletions(-) --- a/include/linux/page_idle.h~mm-make-some-vars-and-functions-static-or-__init +++ a/include/linux/page_idle.h @@ -13,7 +13,6 @@ * If there is not enough space to store Idle and Young bits in page flags, use * page ext flags instead. */ -extern struct page_ext_operations page_idle_ops; static inline bool folio_test_young(struct folio *folio) { --- a/mm/page_ext.c~mm-make-some-vars-and-functions-static-or-__init +++ a/mm/page_ext.c @@ -64,12 +64,12 @@ static bool need_page_idle(void) { return true; } -struct page_ext_operations page_idle_ops = { +static struct page_ext_operations page_idle_ops __initdata = { .need = need_page_idle, }; #endif -static struct page_ext_operations *page_ext_ops[] = { +static struct page_ext_operations *page_ext_ops[] __initdata = { #ifdef CONFIG_PAGE_OWNER &page_owner_ops, #endif --- a/mm/page_owner.c~mm-make-some-vars-and-functions-static-or-__init +++ a/mm/page_owner.c @@ -46,7 +46,7 @@ static int __init early_page_owner_param } early_param("page_owner", early_page_owner_param); -static bool need_page_owner(void) +static __init bool need_page_owner(void) { return page_owner_enabled; } @@ -75,7 +75,7 @@ static noinline void register_early_stac early_handle = create_dummy_stack(); } -static void init_page_owner(void) +static __init void init_page_owner(void) { if (!page_owner_enabled) return; From patchwork Fri Jan 14 22:09:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714146 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by 
smtp.lore.kernel.org (Postfix) with ESMTP id 58A6CC433F5 for ; Fri, 14 Jan 2022 22:09:38 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DBD126B0171; Fri, 14 Jan 2022 17:09:37 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id D6C156B0173; Fri, 14 Jan 2022 17:09:37 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C81F36B0174; Fri, 14 Jan 2022 17:09:37 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0182.hostedemail.com [216.40.44.182]) by kanga.kvack.org (Postfix) with ESMTP id B803D6B0171 for ; Fri, 14 Jan 2022 17:09:37 -0500 (EST) Received: from smtpin30.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 77541824C454 for ; Fri, 14 Jan 2022 22:09:37 +0000 (UTC) X-FDA: 79030285194.30.A21FB16 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf27.hostedemail.com (Postfix) with ESMTP id ABA7940002 for ; Fri, 14 Jan 2022 22:09:36 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 38C88CE2498; Fri, 14 Jan 2022 22:09:34 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 258DFC36AEC; Fri, 14 Jan 2022 22:09:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198172; bh=+vVz2eWdeL20DFjkOdFxoyvdzR1B/GkjtVkxXYkjPlE=; h=Date:From:To:Subject:In-Reply-To:From; b=zNvO1u0NxyKCpbUwYG+KTu/hHBuVwb8mn57U1ALhN3RqC94yB3YdW9f13JnVx/r6A pY2IFSfdjgRFlGLQIZEnfWlv/YpPBmnYWYGPVGzn6jfEC0AnztGPNEL/VeTJCjiuBW nu6pCqMdD7kECkiIC4nBRLx6cetIF+JfLT/IG+JM= Date: Fri, 14 Jan 2022 14:09:31 -0800 From: Andrew Morton To: akpm@linux-foundation.org, apopple@nvidia.com, Felix.Kuehling@amd.com, jgg@nvidia.com, jglisse@redhat.com, 
jhubbard@nvidia.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, rcampbell@nvidia.com, torvalds@linux-foundation.org, ziy@nvidia.com Subject: [patch 120/146] mm/hmm.c: allow VM_MIXEDMAP to work with hmm_range_fault Message-ID: <20220114220931.285H-925c%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: ABA7940002 X-Stat-Signature: 9bsgzr3cbni7z3d7ji7o7scf5id5jkuh Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=zNvO1u0N; spf=pass (imf27.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198176-843258 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Alistair Popple Subject: mm/hmm.c: allow VM_MIXEDMAP to work with hmm_range_fault hmm_range_fault() can be used instead of get_user_pages() for devices which allow faulting. However, unlike get_user_pages(), it will return an error when used on a VM_MIXEDMAP range. To make hmm_range_fault() more closely match get_user_pages(), remove this restriction. This requires dealing with the !ARCH_HAS_PTE_SPECIAL case in hmm_vma_handle_pte(). Rather than replicating the logic of vm_normal_page(), call it directly and do a check for the zero pfn similar to what get_user_pages() currently does. Also add a test to the hmm selftests to verify functionality. 
Link: https://lkml.kernel.org/r/20211104012001.2555676-1-apopple@nvidia.com Fixes: da4c3c735ea4 ("mm/hmm/mirror: helper to snapshot CPU page table") Signed-off-by: Alistair Popple Reviewed-by: Jason Gunthorpe Cc: Jerome Glisse Cc: John Hubbard Cc: Zi Yan Cc: Ralph Campbell Cc: Felix Kuehling Signed-off-by: Andrew Morton --- lib/test_hmm.c | 24 +++++++++++++ mm/hmm.c | 5 +- tools/testing/selftests/vm/hmm-tests.c | 42 +++++++++++++++++++++++ 3 files changed, 69 insertions(+), 2 deletions(-) --- a/lib/test_hmm.c~mm-hmmc-allow-vm_mixedmap-to-work-with-hmm_range_fault +++ a/lib/test_hmm.c @@ -1086,9 +1086,33 @@ static long dmirror_fops_unlocked_ioctl( return 0; } +static int dmirror_fops_mmap(struct file *file, struct vm_area_struct *vma) +{ + unsigned long addr; + + for (addr = vma->vm_start; addr < vma->vm_end; addr += PAGE_SIZE) { + struct page *page; + int ret; + + page = alloc_page(GFP_KERNEL | __GFP_ZERO); + if (!page) + return -ENOMEM; + + ret = vm_insert_page(vma, addr, page); + if (ret) { + __free_page(page); + return ret; + } + put_page(page); + } + + return 0; +} + static const struct file_operations dmirror_fops = { .open = dmirror_fops_open, .release = dmirror_fops_release, + .mmap = dmirror_fops_mmap, .unlocked_ioctl = dmirror_fops_unlocked_ioctl, .llseek = default_llseek, .owner = THIS_MODULE, --- a/mm/hmm.c~mm-hmmc-allow-vm_mixedmap-to-work-with-hmm_range_fault +++ a/mm/hmm.c @@ -300,7 +300,8 @@ static int hmm_vma_handle_pte(struct mm_ * Since each architecture defines a struct page for the zero page, just * fall through and treat it like a normal page. 
*/ - if (pte_special(pte) && !pte_devmap(pte) && + if (!vm_normal_page(walk->vma, addr, pte) && + !pte_devmap(pte) && !is_zero_pfn(pte_pfn(pte))) { if (hmm_pte_need_fault(hmm_vma_walk, pfn_req_flags, 0)) { pte_unmap(ptep); @@ -518,7 +519,7 @@ static int hmm_vma_walk_test(unsigned lo struct hmm_range *range = hmm_vma_walk->range; struct vm_area_struct *vma = walk->vma; - if (!(vma->vm_flags & (VM_IO | VM_PFNMAP | VM_MIXEDMAP)) && + if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)) && vma->vm_flags & VM_READ) return 0; --- a/tools/testing/selftests/vm/hmm-tests.c~mm-hmmc-allow-vm_mixedmap-to-work-with-hmm_range_fault +++ a/tools/testing/selftests/vm/hmm-tests.c @@ -1251,6 +1251,48 @@ TEST_F(hmm, anon_teardown) /* * Test memory snapshot without faulting in pages accessed by the device. */ +TEST_F(hmm, mixedmap) +{ + struct hmm_buffer *buffer; + unsigned long npages; + unsigned long size; + unsigned char *m; + int ret; + + npages = 1; + size = npages << self->page_shift; + + buffer = malloc(sizeof(*buffer)); + ASSERT_NE(buffer, NULL); + + buffer->fd = -1; + buffer->size = size; + buffer->mirror = malloc(npages); + ASSERT_NE(buffer->mirror, NULL); + + + /* Reserve a range of addresses. */ + buffer->ptr = mmap(NULL, size, + PROT_READ | PROT_WRITE, + MAP_PRIVATE, + self->fd, 0); + ASSERT_NE(buffer->ptr, MAP_FAILED); + + /* Simulate a device snapshotting CPU pagetables. */ + ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_SNAPSHOT, buffer, npages); + ASSERT_EQ(ret, 0); + ASSERT_EQ(buffer->cpages, npages); + + /* Check what the device saw. */ + m = buffer->mirror; + ASSERT_EQ(m[0], HMM_DMIRROR_PROT_READ); + + hmm_buffer_free(buffer); +} + +/* + * Test memory snapshot without faulting in pages accessed by the device. 
+ */ TEST_F(hmm2, snapshot) { struct hmm_buffer *buffer; From patchwork Fri Jan 14 22:09:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714148 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3BCDFC4332F for ; Fri, 14 Jan 2022 22:09:43 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id F42256B0178; Fri, 14 Jan 2022 17:09:42 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id EF37B6B0177; Fri, 14 Jan 2022 17:09:41 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DE0E86B0178; Fri, 14 Jan 2022 17:09:41 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0030.hostedemail.com [216.40.44.30]) by kanga.kvack.org (Postfix) with ESMTP id CD9C96B0175 for ; Fri, 14 Jan 2022 17:09:41 -0500 (EST) Received: from smtpin10.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 8DED5181B3643 for ; Fri, 14 Jan 2022 22:09:41 +0000 (UTC) X-FDA: 79030285362.10.DDF1C96 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf11.hostedemail.com (Postfix) with ESMTP id BF9A240009 for ; Fri, 14 Jan 2022 22:09:39 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 4C6F8CE2497; Fri, 14 Jan 2022 22:09:37 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5E875C36AE9; Fri, 14 Jan 2022 22:09:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198175; 
bh=3gYqyKxjvDZrzDNDzCvo30JEP86e0uaxOGrSIHxv/6E=; h=Date:From:To:Subject:In-Reply-To:From; b=c9hIlN+SYOon7MRBwUs8K5jbFM/VAXEy2hf7PazAknI0AG8JyoKtKuSB387u4PgzH zidhb5orBBsa8AE8JvcILv0jfFmIDYBB7C7rOCWZT1tDQehKfnftzGjqeTii5hANdA ZOh1mbHKf5qc2Qx27wU0hUwOmK4E2uMuPzXaRPDo= Date: Fri, 14 Jan 2022 14:09:34 -0800 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, songmuchun@bytedance.com, torvalds@linux-foundation.org, xhao@linux.alibaba.com Subject: [patch 121/146] mm/damon: unified access_check function naming rules Message-ID: <20220114220934.ZfuKoQEI5%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: BF9A240009 X-Stat-Signature: ehf849d59qtfczgmswut7yaf56nrfunz Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=c9hIlN+S; spf=pass (imf11.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198179-844229 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Xin Hao Subject: mm/damon: unified access_check function naming rules Patch series "mm/damon: Do some small changes", v4. 
This patch (of 4): In damon/paddr.c, the names of two functions start with a double underscore: static void __damon_pa_prepare_access_check(struct damon_ctx *ctx, struct damon_region *r) static void __damon_pa_check_access(struct damon_ctx *ctx, struct damon_region *r) In damon/vaddr.c, there are two functions that serve the same purpose but lack the prefix: static void damon_va_prepare_access_check(struct damon_ctx *ctx, struct mm_struct *mm, struct damon_region *r) static void damon_va_check_access(struct damon_ctx *ctx, struct mm_struct *mm, struct damon_region *r) It makes sense to keep the naming consistent, which also makes these functions harder to confuse with the functions that call them. Link: https://lkml.kernel.org/r/cover.1636989871.git.xhao@linux.alibaba.com Link: https://lkml.kernel.org/r/529054aed932a42b9c09fc9977ad4574b9e7b0bd.1636989871.git.xhao@linux.alibaba.com Signed-off-by: Xin Hao Reviewed-by: SeongJae Park Cc: Muchun Song Signed-off-by: Andrew Morton --- mm/damon/vaddr.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) --- a/mm/damon/vaddr.c~mm-damon-unified-access_check-function-naming-rules +++ a/mm/damon/vaddr.c @@ -410,7 +410,7 @@ static void damon_va_mkold(struct mm_str * Functions for the access checking of the regions */ -static void damon_va_prepare_access_check(struct damon_ctx *ctx, +static void __damon_va_prepare_access_check(struct damon_ctx *ctx, struct mm_struct *mm, struct damon_region *r) { r->sampling_addr = damon_rand(r->ar.start, r->ar.end); @@ -429,7 +429,7 @@ void damon_va_prepare_access_checks(stru if (!mm) continue; damon_for_each_region(r, t) - damon_va_prepare_access_check(ctx, mm, r); + __damon_va_prepare_access_check(ctx, mm, r); mmput(mm); } } @@ -515,7 +515,7 @@ static bool damon_va_young(struct mm_str * mm 'mm_struct' for the given virtual address space * r the region to be checked */ -static void damon_va_check_access(struct damon_ctx *ctx, +static void __damon_va_check_access(struct damon_ctx *ctx, struct mm_struct *mm, struct damon_region *r) 
{ static struct mm_struct *last_mm; @@ -551,7 +551,7 @@ unsigned int damon_va_check_accesses(str if (!mm) continue; damon_for_each_region(r, t) { - damon_va_check_access(ctx, mm, r); + __damon_va_check_access(ctx, mm, r); max_nr_accesses = max(r->nr_accesses, max_nr_accesses); } mmput(mm); From patchwork Fri Jan 14 22:09:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714147 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1C1D9C433EF for ; Fri, 14 Jan 2022 22:09:42 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A627B6B0173; Fri, 14 Jan 2022 17:09:41 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id A10CF6B0175; Fri, 14 Jan 2022 17:09:41 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8D8B26B0176; Fri, 14 Jan 2022 17:09:41 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0069.hostedemail.com [216.40.44.69]) by kanga.kvack.org (Postfix) with ESMTP id 7F9046B0173 for ; Fri, 14 Jan 2022 17:09:41 -0500 (EST) Received: from smtpin28.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 4BD43998C8 for ; Fri, 14 Jan 2022 22:09:41 +0000 (UTC) X-FDA: 79030285362.28.45FFE67 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf31.hostedemail.com (Postfix) with ESMTP id EE0D920006 for ; Fri, 14 Jan 2022 22:09:40 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id E6EC2B8262F; Fri, 14 Jan 2022 
22:09:39 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6B071C36AE9; Fri, 14 Jan 2022 22:09:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198178; bh=w7pxM9LCTRt/1PblGv0j5a5eklnr95NNigWSslQVvdU=; h=Date:From:To:Subject:In-Reply-To:From; b=q1hTNVQXqpwrKRnOLbplTzfNpqQYRUtK48rIcthVqyif2neaGYHYZhRZomc21gHvh IM2rszrOK78hTBzlTryIBkO7eM+kNxFcWP/5EZwELRAYh8GWHC0c2SPbc4eSggr128 lHdFTch9X/O8DF3ZlsLzC14g+LPyzdllHPBAVF2Y= Date: Fri, 14 Jan 2022 14:09:37 -0800 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, songmuchun@bytedance.com, torvalds@linux-foundation.org, xhao@linux.alibaba.com Subject: [patch 122/146] mm/damon: add 'age' of region tracepoint support Message-ID: <20220114220937.or_UptqWt%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 MIME-Version: 1.0 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: EE0D920006 X-Stat-Signature: 7w6ry1cg4mw738gxc8br3cws89ppoxwo Authentication-Results: imf31.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=q1hTNVQX; spf=pass (imf31.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198180-317808 X-Bogosity: Ham, tests=bogofilter, spamicity=0.006327, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Xin Hao Subject: mm/damon: add 'age' of region tracepoint support In DAMON, we can get age information by analyzing changes in nr_accesses. However, sampling over a short time is not effective: we have to obtain enough data for analysis through a long trace, which also means consuming more CPU resources and storage space. 
Now that the region has a new 'age' variable, we only need to observe the change of the age value through a short trace. For example, if age has increased to 141 while nr_accesses shows a value of 0 at the same time, we can conclude that the region has had a very low nr_accesses value for a long time. Link: https://lkml.kernel.org/r/b9def1262af95e0dc1d0caea447886434db01161.1636989871.git.xhao@linux.alibaba.com Signed-off-by: Xin Hao Reviewed-by: SeongJae Park Cc: Muchun Song Signed-off-by: Andrew Morton --- include/trace/events/damon.h | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) --- a/include/trace/events/damon.h~mm-damon-add-age-of-region-tracepoint-support +++ a/include/trace/events/damon.h @@ -22,6 +22,7 @@ TRACE_EVENT(damon_aggregated, __field(unsigned long, start) __field(unsigned long, end) __field(unsigned int, nr_accesses) + __field(unsigned int, age) ), TP_fast_assign( @@ -30,11 +31,13 @@ TRACE_EVENT(damon_aggregated, __entry->start = r->ar.start; __entry->end = r->ar.end; __entry->nr_accesses = r->nr_accesses; + __entry->age = r->age; ), - TP_printk("target_id=%lu nr_regions=%u %lu-%lu: %u", + TP_printk("target_id=%lu nr_regions=%u %lu-%lu: %u %u", __entry->target_id, __entry->nr_regions, - __entry->start, __entry->end, __entry->nr_accesses) + __entry->start, __entry->end, + __entry->nr_accesses, __entry->age) ); #endif /* _TRACE_DAMON_H */ From patchwork Fri Jan 14 22:09:40 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714149 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 29FA2C433F5 for ; Fri, 14 Jan 2022 22:09:45 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id AD78F6B0177; Fri, 14 Jan 2022 17:09:44 -0500 (EST) Received: by kanga.kvack.org 
(Postfix, from userid 40) id A5FE36B0179; Fri, 14 Jan 2022 17:09:44 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 94E836B017A; Fri, 14 Jan 2022 17:09:44 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0121.hostedemail.com [216.40.44.121]) by kanga.kvack.org (Postfix) with ESMTP id 83CD96B0177 for ; Fri, 14 Jan 2022 17:09:44 -0500 (EST) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 4E4AB7C37A for ; Fri, 14 Jan 2022 22:09:44 +0000 (UTC) X-FDA: 79030285488.07.F2A3196 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf03.hostedemail.com (Postfix) with ESMTP id DE78A2000A for ; Fri, 14 Jan 2022 22:09:43 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id E6C83B825F5; Fri, 14 Jan 2022 22:09:42 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 727F3C36AE5; Fri, 14 Jan 2022 22:09:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198181; bh=2A3XM+2UG12YFp8tc/9YQ+vPnJHFwu+rf8Z18zCW6+8=; h=Date:From:To:Subject:In-Reply-To:From; b=ZddWNWtWUwf4PmALBRLCaJPGzEix6KczzKBT0xjCL8R14380kArlTHoA6ctPf9osx OnDBDGfMaZDYh/aXrRAhbxj1AdxVWckhP6K7Lxr0lWKUNd0N9MkQCvFZTMSXm+1uYw y9UUlOvgATrhVCcMN2G0iMMsPcE+oQOkoq2iPPOw= Date: Fri, 14 Jan 2022 14:09:40 -0800 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, songmuchun@bytedance.com, torvalds@linux-foundation.org, xhao@linux.alibaba.com Subject: [patch 123/146] mm/damon/core: use abs() instead of diff_of() Message-ID: <20220114220940.B19d9XhL-%akpm@linux-foundation.org> In-Reply-To: 
<20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: DE78A2000A X-Stat-Signature: w6jrm4x7oujoe8xmyrqbf6wayimyiwyy Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=ZddWNWtW; spf=pass (imf03.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-Rspamd-Server: rspam07 X-HE-Tag: 1642198183-308598 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Xin Hao Subject: mm/damon/core: use abs() instead of diff_of() In kernel, we can use abs(a - b) to get the absolute value, So there is no need to redefine a new one. Link: https://lkml.kernel.org/r/b24e7b82d9efa90daf150d62dea171e19390ad0b.1636989871.git.xhao@linux.alibaba.com Signed-off-by: Xin Hao Reviewed-by: Muchun Song Reviewed-by: SeongJae Park Signed-off-by: Andrew Morton --- mm/damon/core.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) --- a/mm/damon/core.c~mm-damon-core-using-function-abs-instead-of-diff_of +++ a/mm/damon/core.c @@ -750,8 +750,6 @@ static void damon_merge_two_regions(stru damon_destroy_region(r, t); } -#define diff_of(a, b) (a > b ? 
a - b : b - a) - /* * Merge adjacent regions having similar access frequencies * @@ -765,13 +763,13 @@ static void damon_merge_regions_of(struc struct damon_region *r, *prev = NULL, *next; damon_for_each_region_safe(r, next, t) { - if (diff_of(r->nr_accesses, r->last_nr_accesses) > thres) + if (abs(r->nr_accesses - r->last_nr_accesses) > thres) r->age = 0; else r->age++; if (prev && prev->ar.end == r->ar.start && - diff_of(prev->nr_accesses, r->nr_accesses) <= thres && + abs(prev->nr_accesses - r->nr_accesses) <= thres && sz_damon_region(prev) + sz_damon_region(r) <= sz_limit) damon_merge_two_regions(t, prev, r); else From patchwork Fri Jan 14 22:09:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714150 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 93B4BC433EF for ; Fri, 14 Jan 2022 22:09:50 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 223A16B0179; Fri, 14 Jan 2022 17:09:50 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 1D3B36B017B; Fri, 14 Jan 2022 17:09:50 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 09B106B017C; Fri, 14 Jan 2022 17:09:50 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0040.hostedemail.com [216.40.44.40]) by kanga.kvack.org (Postfix) with ESMTP id E97036B0179 for ; Fri, 14 Jan 2022 17:09:49 -0500 (EST) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id ACA1D94FCA for ; Fri, 14 Jan 2022 22:09:49 +0000 (UTC) X-FDA: 79030285698.21.E915836 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by 
imf24.hostedemail.com (Postfix) with ESMTP id DFF6818000B for ; Fri, 14 Jan 2022 22:09:48 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 7027DCE2497; Fri, 14 Jan 2022 22:09:46 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 83EC2C36AEC; Fri, 14 Jan 2022 22:09:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198184; bh=46TBjWG4o8z2/SEH28qPH2s6LB7lB1ohY3MvyDrGeyk=; h=Date:From:To:Subject:In-Reply-To:From; b=JX2VHpe+1qtTjA8JtA3Oc9k8KVFn/HIUYBcg/qS52ei9d84kkJd2zFhh/yY0v/7Tm jqFTKzMrDifF0OJWlwkqhSTxTuFrdEbTjUn1ttlYaC6ulPeV7pqLn3yAj+OQjBNwIH I0P92ap+aZxxMD6FE4mAFJGbK3e2aY5Wj2smVs98= Date: Fri, 14 Jan 2022 14:09:44 -0800 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, songmuchun@bytedance.com, torvalds@linux-foundation.org, xhao@linux.alibaba.com Subject: [patch 124/146] mm/damon: remove some unneeded function definitions in damon.h Message-ID: <20220114220944.YoUxxwx7S%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: DFF6818000B X-Stat-Signature: jqchbejweabghgpf77o4mc56ftcc4ayo Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=JX2VHpe+; spf=pass (imf24.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198188-583312 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Xin Hao Subject: mm/damon: remove some unneeded function definitions in 
damon.h In damon.h, some function prototypes for the VA and PA primitives are used only in their own files, so there is no need to declare them in the header file; removing them makes the header cleaner. If other files later need these functions, the prototypes can be added to damon.h at that time. [sj@kernel.org: remove unnecessary function prototype position changes] Link: https://lkml.kernel.org/r/20211118114827.20052-1-sj@kernel.org Link: https://lkml.kernel.org/r/45fd5b3ef6cce8e28dbc1c92f9dc845ccfc949d7.1636989871.git.xhao@linux.alibaba.com Signed-off-by: Xin Hao Signed-off-by: SeongJae Park Reviewed-by: SeongJae Park Cc: Muchun Song Signed-off-by: Andrew Morton --- include/linux/damon.h | 21 --------------------- mm/damon/paddr.c | 11 ++++++----- mm/damon/vaddr.c | 18 ++++++++++-------- 3 files changed, 16 insertions(+), 34 deletions(-) --- a/include/linux/damon.h~mm-damon-remove-some-no-need-func-definitions-in-damonh-file +++ a/include/linux/damon.h @@ -461,34 +461,13 @@ int damon_stop(struct damon_ctx **ctxs, #endif /* CONFIG_DAMON */ #ifdef CONFIG_DAMON_VADDR - -/* Monitoring primitives for virtual memory address spaces */ -void damon_va_init(struct damon_ctx *ctx); -void damon_va_update(struct damon_ctx *ctx); -void damon_va_prepare_access_checks(struct damon_ctx *ctx); -unsigned int damon_va_check_accesses(struct damon_ctx *ctx); bool damon_va_target_valid(void *t); -void damon_va_cleanup(struct damon_ctx *ctx); -int damon_va_apply_scheme(struct damon_ctx *context, struct damon_target *t, - struct damon_region *r, struct damos *scheme); -int damon_va_scheme_score(struct damon_ctx *context, struct damon_target *t, - struct damon_region *r, struct damos *scheme); void damon_va_set_primitives(struct damon_ctx *ctx); - #endif /* CONFIG_DAMON_VADDR */ #ifdef CONFIG_DAMON_PADDR - -/* Monitoring primitives for the physical memory address space */ -void damon_pa_prepare_access_checks(struct damon_ctx *ctx); -unsigned int damon_pa_check_accesses(struct damon_ctx *ctx); bool 
damon_pa_target_valid(void *t); -int damon_pa_apply_scheme(struct damon_ctx *context, struct damon_target *t, - struct damon_region *r, struct damos *scheme); -int damon_pa_scheme_score(struct damon_ctx *context, struct damon_target *t, - struct damon_region *r, struct damos *scheme); void damon_pa_set_primitives(struct damon_ctx *ctx); - #endif /* CONFIG_DAMON_PADDR */ #endif /* _DAMON_H */ --- a/mm/damon/paddr.c~mm-damon-remove-some-no-need-func-definitions-in-damonh-file +++ a/mm/damon/paddr.c @@ -73,7 +73,7 @@ static void __damon_pa_prepare_access_ch damon_pa_mkold(r->sampling_addr); } -void damon_pa_prepare_access_checks(struct damon_ctx *ctx) +static void damon_pa_prepare_access_checks(struct damon_ctx *ctx) { struct damon_target *t; struct damon_region *r; @@ -192,7 +192,7 @@ static void __damon_pa_check_access(stru last_addr = r->sampling_addr; } -unsigned int damon_pa_check_accesses(struct damon_ctx *ctx) +static unsigned int damon_pa_check_accesses(struct damon_ctx *ctx) { struct damon_target *t; struct damon_region *r; @@ -213,7 +213,7 @@ bool damon_pa_target_valid(void *t) return true; } -int damon_pa_apply_scheme(struct damon_ctx *ctx, struct damon_target *t, +static int damon_pa_apply_scheme(struct damon_ctx *ctx, struct damon_target *t, struct damon_region *r, struct damos *scheme) { unsigned long addr; @@ -246,8 +246,9 @@ int damon_pa_apply_scheme(struct damon_c return 0; } -int damon_pa_scheme_score(struct damon_ctx *context, struct damon_target *t, - struct damon_region *r, struct damos *scheme) +static int damon_pa_scheme_score(struct damon_ctx *context, + struct damon_target *t, struct damon_region *r, + struct damos *scheme) { switch (scheme->action) { case DAMOS_PAGEOUT: --- a/mm/damon/vaddr.c~mm-damon-remove-some-no-need-func-definitions-in-damonh-file +++ a/mm/damon/vaddr.c @@ -272,7 +272,7 @@ static void __damon_va_init_regions(stru } /* Initialize '->regions_list' of every target (task) */ -void damon_va_init(struct damon_ctx *ctx) +static 
void damon_va_init(struct damon_ctx *ctx) { struct damon_target *t; @@ -292,7 +292,8 @@ void damon_va_init(struct damon_ctx *ctx * * Returns true if it is. */ -static bool damon_intersect(struct damon_region *r, struct damon_addr_range *re) +static bool damon_intersect(struct damon_region *r, + struct damon_addr_range *re) { return !(r->ar.end <= re->start || re->end <= r->ar.start); } @@ -356,7 +357,7 @@ static void damon_va_apply_three_regions /* * Update regions for current memory mappings */ -void damon_va_update(struct damon_ctx *ctx) +static void damon_va_update(struct damon_ctx *ctx) { struct damon_addr_range three_regions[3]; struct damon_target *t; @@ -418,7 +419,7 @@ static void __damon_va_prepare_access_ch damon_va_mkold(mm, r->sampling_addr); } -void damon_va_prepare_access_checks(struct damon_ctx *ctx) +static void damon_va_prepare_access_checks(struct damon_ctx *ctx) { struct damon_target *t; struct mm_struct *mm; @@ -539,7 +540,7 @@ static void __damon_va_check_access(stru last_addr = r->sampling_addr; } -unsigned int damon_va_check_accesses(struct damon_ctx *ctx) +static unsigned int damon_va_check_accesses(struct damon_ctx *ctx) { struct damon_target *t; struct mm_struct *mm; @@ -603,7 +604,7 @@ out: } #endif /* CONFIG_ADVISE_SYSCALLS */ -int damon_va_apply_scheme(struct damon_ctx *ctx, struct damon_target *t, +static int damon_va_apply_scheme(struct damon_ctx *ctx, struct damon_target *t, struct damon_region *r, struct damos *scheme) { int madv_action; @@ -633,8 +634,9 @@ int damon_va_apply_scheme(struct damon_c return damos_madvise(t, r, madv_action); } -int damon_va_scheme_score(struct damon_ctx *context, struct damon_target *t, - struct damon_region *r, struct damos *scheme) +static int damon_va_scheme_score(struct damon_ctx *context, + struct damon_target *t, struct damon_region *r, + struct damos *scheme) { switch (scheme->action) { From patchwork Fri Jan 14 22:09:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714151 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id BDA3DC433FE for ; Fri, 14 Jan 2022 22:09:51 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DBD7B6B017E; Fri, 14 Jan 2022 17:09:50 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id CF8AD6B017D; Fri, 14 Jan 2022 17:09:50 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B231D6B017E; Fri, 14 Jan 2022 17:09:50 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0153.hostedemail.com [216.40.44.153]) by kanga.kvack.org (Postfix) with ESMTP id 9E9C36B017B for ; Fri, 14 Jan 2022 17:09:50 -0500 (EST) Received: from smtpin04.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 5E3551809A350 for ; Fri, 14 Jan 2022 22:09:50 +0000 (UTC) X-FDA: 79030285740.04.93C731A Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf28.hostedemail.com (Postfix) with ESMTP id E3DB2C000F for ; Fri, 14 Jan 2022 22:09:49 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 17C1EB825F5; Fri, 14 Jan 2022 22:09:49 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 936EEC36AE9; Fri, 14 Jan 2022 22:09:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198187; bh=vc1hBj5TydsfWWItX3y4oi2NRvlqaxJOEeG4Ak3AMyE=; h=Date:From:To:Subject:In-Reply-To:From; b=ZF/NUC5iIsaX+0YOAh/LLDjN0ySgpdvqj+TyXD0hjeY0YZX3VG4A7zK7vPwJcvAXr 
mySzKXSaqCHxAPyWpnVvLkDtQiX3iLuZR8kqI3jr+Biyf0JIyWeyzDVLhUo6Xm6lQJ yLookao6ObrxjXiNSBO5HbnsH37o7xTs896FW4S8= Date: Fri, 14 Jan 2022 14:09:47 -0800 From: Andrew Morton To: akpm@linux-foundation.org, hanyihao@vivo.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, songmuchun@bytedance.com, torvalds@linux-foundation.org Subject: [patch 125/146] mm/damon/vaddr: remove swap_ranges() and replace it with swap() Message-ID: <20220114220947.nloFzSIKt%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam10 X-Rspamd-Queue-Id: E3DB2C000F X-Stat-Signature: k9pjwryugw7rwi7dfrzrn5dkswsqhk34 Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="ZF/NUC5i"; dmarc=none; spf=pass (imf28.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1642198189-93892 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Yihao Han Subject: mm/damon/vaddr: remove swap_ranges() and replace it with swap() Remove 'swap_ranges()' and replace it with the macro 'swap()' defined in 'include/linux/minmax.h' to simplify code and improve efficiency Link: https://lkml.kernel.org/r/20211111115355.2808-1-hanyihao@vivo.com Signed-off-by: Yihao Han Reviewed-by: SeongJae Park Reviewed-by: Muchun Song Signed-off-by: Andrew Morton --- mm/damon/vaddr.c | 16 +++------------- 1 file changed, 3 insertions(+), 13 deletions(-) --- a/mm/damon/vaddr.c~mm-damon-vaddr-remove-swap_ranges-and-replace-it-with-swap +++ a/mm/damon/vaddr.c @@ -98,16 +98,6 @@ static unsigned long sz_range(struct dam return r->end - r->start; } -static void swap_ranges(struct damon_addr_range *r1, - struct damon_addr_range *r2) -{ - struct damon_addr_range tmp; - - tmp = 
*r1; - *r1 = *r2; - *r2 = tmp; -} - /* * Find three regions separated by two biggest unmapped regions * @@ -146,9 +136,9 @@ static int __damon_va_three_regions(stru gap.start = last_vma->vm_end; gap.end = vma->vm_start; if (sz_range(&gap) > sz_range(&second_gap)) { - swap_ranges(&gap, &second_gap); + swap(gap, second_gap); if (sz_range(&second_gap) > sz_range(&first_gap)) - swap_ranges(&second_gap, &first_gap); + swap(second_gap, first_gap); } next: last_vma = vma; @@ -159,7 +149,7 @@ next: /* Sort the two biggest gaps by address */ if (first_gap.start > second_gap.start) - swap_ranges(&first_gap, &second_gap); + swap(first_gap, second_gap); /* Store the result */ regions[0].start = ALIGN(start, DAMON_MIN_REGION); From patchwork Fri Jan 14 22:09:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714152 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 33FD6C433F5 for ; Fri, 14 Jan 2022 22:09:56 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C0F276B017D; Fri, 14 Jan 2022 17:09:55 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id BC0006B017F; Fri, 14 Jan 2022 17:09:55 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A86C66B0180; Fri, 14 Jan 2022 17:09:55 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 964B16B017D for ; Fri, 14 Jan 2022 17:09:55 -0500 (EST) Received: from smtpin14.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 54D40181E4348 for ; Fri, 14 Jan 2022 22:09:55 +0000 (UTC) X-FDA: 
79030285950.14.B310012 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf10.hostedemail.com (Postfix) with ESMTP id C75C5C0004 for ; Fri, 14 Jan 2022 22:09:54 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 8C349CE2497; Fri, 14 Jan 2022 22:09:52 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A032AC36AEC; Fri, 14 Jan 2022 22:09:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198190; bh=aIhc77Uq5yJjcD900SeC3Syb5vcahaAxAaDKlZ9jVxU=; h=Date:From:To:Subject:In-Reply-To:From; b=WfpBeHStwYgSF78FqoSJ4pOmrbdYf+5a5lGWgfr4Re4Ao1NS3I5ZtPGP7hjf281Cu 8ZvLEJIELJXhMPbTai/27F9dh3DTcsjRVr64RXMQjDu59PhzpOf6lXRQxt7JHaMxie d9Av2A1DxYtEgpndz5iU/BF6w626Vb5j4hVYwUvA= Date: Fri, 14 Jan 2022 14:09:50 -0800 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org, xhao@linux.alibaba.com Subject: [patch 126/146] mm/damon/schemes: add the validity judgment of thresholds Message-ID: <20220114220950.xLR4KeO6m%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: C75C5C0004 X-Stat-Signature: dst4mzki8qtspgbo1xap3ky46ognbgu7 Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=WfpBeHSt; spf=pass (imf10.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198194-289744 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Xin 
Hao Subject: mm/damon/schemes: add the validity judgment of thresholds In the dbgfs "schemes" interface, I ran a test like this: # cd /sys/kernel/debug/damon # echo "2 1 2 1 10 1 3 10 1 1 1 1 1 1 1 1 2 3" > schemes # cat schemes # 2 1 2 1 10 1 3 10 1 1 1 1 1 1 1 1 2 3 0 0 This input is unreasonable: it sets the values of "<min_sz, max_sz>, <min_nr_a, max_nr_a>, <min_age, max_age>, <wmarks.high, wmarks.mid, wmarks.low>" to "<2, 1>, <2, 1>, <10, 1>, <1, 2, 3>", yet it is accepted. So add a validity check for these threshold values. Link: https://lkml.kernel.org/r/d78360e52158d786fcbf20bc62c96785742e76d3.1637239568.git.xhao@linux.alibaba.com Signed-off-by: Xin Hao Reviewed-by: SeongJae Park Signed-off-by: Andrew Morton --- mm/damon/dbgfs.c | 7 +++++++ 1 file changed, 7 insertions(+) --- a/mm/damon/dbgfs.c~mm-damon-schemes-add-the-validity-judgment-of-thresholds +++ a/mm/damon/dbgfs.c @@ -213,6 +213,13 @@ static struct damos **str_to_schemes(con if (!damos_action_valid(action)) goto fail; + if (min_sz > max_sz || min_nr_a > max_nr_a || min_age > max_age) + goto fail; + + if (wmarks.high < wmarks.mid || wmarks.high < wmarks.low || + wmarks.mid < wmarks.low) + goto fail; + pos += parsed; scheme = damon_new_scheme(min_sz, max_sz, min_nr_a, max_nr_a, min_age, max_age, action, &quota, &wmarks); From patchwork Fri Jan 14 22:09:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714154 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 99E77C4332F for ; Fri, 14 Jan 2022 22:10:02 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 30E656B0181; Fri, 14 Jan 2022 17:10:02 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 2BF906B0183; Fri, 14 Jan 2022 17:10:02 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 
63042) id 186066B0184; Fri, 14 Jan 2022 17:10:02 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0250.hostedemail.com [216.40.44.250]) by kanga.kvack.org (Postfix) with ESMTP id 08C4F6B0181 for ; Fri, 14 Jan 2022 17:10:02 -0500 (EST) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id C62B4181B3643 for ; Fri, 14 Jan 2022 22:10:01 +0000 (UTC) X-FDA: 79030286202.13.F7CA812 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf22.hostedemail.com (Postfix) with ESMTP id D93D2C000A for ; Fri, 14 Jan 2022 22:09:57 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 90DC5CE2384; Fri, 14 Jan 2022 22:09:55 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A7485C36AED; Fri, 14 Jan 2022 22:09:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198193; bh=aNp5XdsonTy6nase51ja7DvCUQtbDcFpa9p+eXvLDog=; h=Date:From:To:Subject:In-Reply-To:From; b=frB06DtmviCcWb5o11I6baKFYWJUfhp4qmYXIX4PxtMlXFFB42ugTVCYpoS4/Xr93 iLWAcF3DJJN+D5MLARkMozoGRuOFYn2d/Gt/NLkoEmUWhnhoe0gqLIT8ZCWATVjpTo U405MW4HNHqbFAVfYL9MUOXMIhnMuZYEEEQifXoQ= Date: Fri, 14 Jan 2022 14:09:53 -0800 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org, xhao@linux.alibaba.com Subject: [patch 127/146] mm/damon: move damon_rand() definition into damon.h Message-ID: <20220114220953.RcbO-TzCE%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam10 X-Rspamd-Queue-Id: D93D2C000A X-Stat-Signature: 8ueq7xrm34diff81ejridbf7s3hdnb53 
Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=frB06Dtm; dmarc=none; spf=pass (imf22.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1642198197-282652 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Xin Hao Subject: mm/damon: move damon_rand() definition into damon.h damon_rand() is called in three files: damon/core.c, damon/paddr.c and damon/vaddr.c. There is no need to define it twice (in core.c and in prmtv-common.h), so move the definition to damon.h. Link: https://lkml.kernel.org/r/20211202075859.51341-1-xhao@linux.alibaba.com Signed-off-by: Xin Hao Reviewed-by: SeongJae Park Signed-off-by: Andrew Morton --- include/linux/damon.h | 4 ++++ mm/damon/core.c | 4 ---- mm/damon/prmtv-common.h | 4 ---- 3 files changed, 4 insertions(+), 8 deletions(-) --- a/include/linux/damon.h~mm-damon-move-damon_rand-definition-into-damonh +++ a/include/linux/damon.h @@ -11,12 +11,16 @@ #include #include #include +#include /* Minimal region size. Every damon_region is aligned by this. */ #define DAMON_MIN_REGION PAGE_SIZE /* Max priority score for DAMON-based operation schemes */ #define DAMOS_MAX_SCORE (99) +/* Get a random number in [l, r) */ +#define damon_rand(l, r) (l + prandom_u32_max(r - l)) + /** * struct damon_addr_range - Represents an address region of [@start, @end). * @start: Start address of the region (inclusive). 
--- a/mm/damon/core.c~mm-damon-move-damon_rand-definition-into-damonh +++ a/mm/damon/core.c @@ -11,7 +11,6 @@ #include #include #include -#include #include #include @@ -23,9 +22,6 @@ #define DAMON_MIN_REGION 1 #endif -/* Get a random number in [l, r) */ -#define damon_rand(l, r) (l + prandom_u32_max(r - l)) - static DEFINE_MUTEX(damon_lock); static int nr_running_ctxs; --- a/mm/damon/prmtv-common.h~mm-damon-move-damon_rand-definition-into-damonh +++ a/mm/damon/prmtv-common.h @@ -6,10 +6,6 @@ */ #include -#include - -/* Get a random number in [l, r) */ -#define damon_rand(l, r) (l + prandom_u32_max(r - l)) struct page *damon_get_page(unsigned long pfn); From patchwork Fri Jan 14 22:09:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714153 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 823C8C433EF for ; Fri, 14 Jan 2022 22:10:00 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 160EE6B017F; Fri, 14 Jan 2022 17:10:00 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 10FEE6B0181; Fri, 14 Jan 2022 17:10:00 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id F40C16B0182; Fri, 14 Jan 2022 17:09:59 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0008.hostedemail.com [216.40.44.8]) by kanga.kvack.org (Postfix) with ESMTP id E24276B017F for ; Fri, 14 Jan 2022 17:09:59 -0500 (EST) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id B0C3A1823021C for ; Fri, 14 Jan 2022 22:09:59 +0000 (UTC) X-FDA: 79030286118.18.A1993AC Received: from ams.source.kernel.org (ams.source.kernel.org 
[145.40.68.75]) by imf07.hostedemail.com (Postfix) with ESMTP id 5E77140005 for ; Fri, 14 Jan 2022 22:09:59 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 4B5DFB82A3A; Fri, 14 Jan 2022 22:09:58 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C04C8C36AEC; Fri, 14 Jan 2022 22:09:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198197; bh=ylEN43liLL2VFwNC4OUZNmfMB4yZIT74JoAXom6JfQI=; h=Date:From:To:Subject:In-Reply-To:From; b=lyWyKKMGMxRHkoeUq3Fx5DJs0tf9V1KIq2yxYa2eT734/mukiN0sCrp7dBjDfZx1y cnNkbxGBwj4ll3jQwgroK7DuK3NIhYfw3+P0YVZQpOvdpQmflqA04skTJVJ9YOZAxQ +ohPlJs73PzwSxzW63dE7Fb2+iro+RW0/uTiMB8M= Date: Fri, 14 Jan 2022 14:09:56 -0800 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org, xhao@linux.alibaba.com Subject: [patch 128/146] mm/damon: modify damon_rand() macro to static inline function Message-ID: <20220114220956.eFS-asqP9%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 5E77140005 X-Stat-Signature: herd876m7ujd8hp1rqfbsj3n6zy79fqy Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=lyWyKKMG; spf=pass (imf07.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198199-406361 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Xin Hao Subject: mm/damon: modify damon_rand() macro to static inline function 
damon_rand() should not be implemented as a macro, because the macro evaluates its first argument twice. Example: damon_rand(a++, b); The value of 'a' will be incremented twice. This is obviously unreasonable, so fix it by converting the macro to a static inline function. Link: https://lkml.kernel.org/r/110ffcd4e420c86c42b41ce2bc9f0fe6a4f32cd3.1638795127.git.xhao@linux.alibaba.com Fixes: b9a6ac4e4ede ("mm/damon: adaptively adjust regions") Signed-off-by: Xin Hao Reported-by: Andrew Morton Reviewed-by: SeongJae Park Signed-off-by: Andrew Morton --- include/linux/damon.h | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) --- a/include/linux/damon.h~mm-damon-modify-damon_rand-macro-to-static-inline-function +++ a/include/linux/damon.h @@ -19,7 +19,10 @@ #define DAMOS_MAX_SCORE (99) /* Get a random number in [l, r) */ -#define damon_rand(l, r) (l + prandom_u32_max(r - l)) +static inline unsigned long damon_rand(unsigned long l, unsigned long r) +{ + return l + prandom_u32_max(r - l); +} /** * struct damon_addr_range - Represents an address region of [@start, @end). From patchwork Fri Jan 14 22:09:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714155 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 74401C4332F for ; Fri, 14 Jan 2022 22:10:04 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 0284A6B0183; Fri, 14 Jan 2022 17:10:04 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id ECAF26B0185; Fri, 14 Jan 2022 17:10:03 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DBB3D6B0186; Fri, 14 Jan 2022 17:10:03 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0097.hostedemail.com [216.40.44.97]) by kanga.kvack.org (Postfix) with ESMTP id C9A8E6B0183 for ; 
Fri, 14 Jan 2022 17:10:03 -0500 (EST) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 9B1B68A9A6 for ; Fri, 14 Jan 2022 22:10:03 +0000 (UTC) X-FDA: 79030286286.21.F7D771F Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf25.hostedemail.com (Postfix) with ESMTP id 7DA14A0005 for ; Fri, 14 Jan 2022 22:10:02 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 4EA9FB825F5; Fri, 14 Jan 2022 22:10:01 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D04ADC36AE5; Fri, 14 Jan 2022 22:09:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198200; bh=ROyu1EzDKGTKYt+cupcXpS7/dD/1Se9+KtdbmMl1boE=; h=Date:From:To:Subject:In-Reply-To:From; b=fzKaxu6HTVVGGQc4KY0adod/WLnwSu8x1xaV34LGqaYKTFtTwGPqWBx58CNCC3a3l dNxs4fRx8EZnGhOxqcIX9h8sIuxYWb4TnhTPCexdsSjlNIJ03wfOmweb135owKy69f UKqEwsfh3kraJxvf0obl/XPod4hyE3a4hao3bxmw= Date: Fri, 14 Jan 2022 14:09:59 -0800 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 129/146] mm/damon: convert macro functions to static inline functions Message-ID: <20220114220959.Y9bKQcLDC%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 7DA14A0005 X-Stat-Signature: 4yuzahxemstruswhhht1tuhmcn4k617i Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=fzKaxu6H; dmarc=none; spf=pass (imf25.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org 
X-Rspamd-Server: rspam03 X-HE-Tag: 1642198202-937974 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon: convert macro functions to static inline functions Patch series "mm/damon: Misc cleanups". This patchset contains miscellaneous cleanups for DAMON's macro functions and documentation. This patch (of 6): This commit converts macro functions in DAMON to static inline functions, for better type checking, code documentation, etc[1]. [1] https://lore.kernel.org/linux-mm/20211202151213.6ec830863342220da4141bc5@linux-foundation.org/ Link: https://lkml.kernel.org/r/20211209131806.19317-1-sj@kernel.org Link: https://lkml.kernel.org/r/20211209131806.19317-2-sj@kernel.org Signed-off-by: SeongJae Park Cc: Jonathan Corbet Signed-off-by: Andrew Morton --- include/linux/damon.h | 18 ++++++++++++------ mm/damon/core.c | 5 ++++- mm/damon/vaddr.c | 6 ++++-- 3 files changed, 20 insertions(+), 9 deletions(-) --- a/include/linux/damon.h~mm-damon-convert-macro-functions-to-static-inline-functions +++ a/include/linux/damon.h @@ -399,14 +399,20 @@ struct damon_ctx { struct list_head schemes; }; -#define damon_next_region(r) \ - (container_of(r->list.next, struct damon_region, list)) +static inline struct damon_region *damon_next_region(struct damon_region *r) +{ + return container_of(r->list.next, struct damon_region, list); +} -#define damon_prev_region(r) \ - (container_of(r->list.prev, struct damon_region, list)) +static inline struct damon_region *damon_prev_region(struct damon_region *r) +{ + return container_of(r->list.prev, struct damon_region, list); +} -#define damon_last_region(t) \ - (list_last_entry(&t->regions_list, struct damon_region, list)) +static inline struct damon_region *damon_last_region(struct damon_target *t) +{ + return list_last_entry(&t->regions_list, struct damon_region, list); +} #define 
damon_for_each_region(r, t) \ list_for_each_entry(r, &t->regions_list, list) --- a/mm/damon/core.c~mm-damon-convert-macro-functions-to-static-inline-functions +++ a/mm/damon/core.c @@ -729,7 +729,10 @@ static void kdamond_apply_schemes(struct } } -#define sz_damon_region(r) (r->ar.end - r->ar.start) +static inline unsigned long sz_damon_region(struct damon_region *r) +{ + return r->ar.end - r->ar.start; +} /* * Merge two adjacent regions into one region --- a/mm/damon/vaddr.c~mm-damon-convert-macro-functions-to-static-inline-functions +++ a/mm/damon/vaddr.c @@ -26,8 +26,10 @@ * 't->id' should be the pointer to the relevant 'struct pid' having reference * count. Caller must put the returned task, unless it is NULL. */ -#define damon_get_task_struct(t) \ - (get_pid_task((struct pid *)t->id, PIDTYPE_PID)) +static inline struct task_struct *damon_get_task_struct(struct damon_target *t) +{ + return get_pid_task((struct pid *)t->id, PIDTYPE_PID); +} /* * Get the mm_struct of the given target From patchwork Fri Jan 14 22:10:05 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714156 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id CA571C433F5 for ; Fri, 14 Jan 2022 22:10:11 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 5A6A96B0185; Fri, 14 Jan 2022 17:10:11 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 57C716B0187; Fri, 14 Jan 2022 17:10:11 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 444C16B0188; Fri, 14 Jan 2022 17:10:11 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (relay031.a.hostedemail.com [64.99.140.31]) by kanga.kvack.org (Postfix) with ESMTP id 
321736B0185 for ; Fri, 14 Jan 2022 17:10:11 -0500 (EST) Received: from smtpin01.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id F373C1E01 for ; Fri, 14 Jan 2022 22:10:10 +0000 (UTC) X-FDA: 79030286622.01.8026F9C Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf28.hostedemail.com (Postfix) with ESMTP id 10565C0008 for ; Fri, 14 Jan 2022 22:10:09 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id E30D5CE24A6; Fri, 14 Jan 2022 22:10:07 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DFA40C36AEC; Fri, 14 Jan 2022 22:10:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198206; bh=foC+vPwr0CUjmkhtolXcG3WMcxw4VlfduBgFEN0YHFs=; h=Date:From:To:Subject:In-Reply-To:From; b=KWeAlBWjRA/CJFZTRFHKoQWXFvcGgnNN9TCU8++z9bE4jP8UgbP7Gj2Nt0OaOsNVl Id2KCEVc/q6qwBevqQOO4RaPgVSlChaTJxBO1PNw7R0LxFOe9gRZfyUQ2/OPLFJQO5 ITEU1CQMWcfn3mzuljee0M36FRM272wn3imme+n4= Date: Fri, 14 Jan 2022 14:10:05 -0800 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 131/146] Docs/admin-guide/mm/damon/usage: remove redundant information Message-ID: <20220114221005.k_LcZXMlh%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 MIME-Version: 1.0 X-Rspamd-Server: rspam10 X-Rspamd-Queue-Id: 10565C0008 X-Stat-Signature: kizznjhqhtb3k7hsw774ffft8pnan3ec Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=KWeAlBWj; dmarc=none; spf=pass (imf28.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as 
permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1642198209-387046 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: Docs/admin-guide/mm/damon/usage: remove redundant information The DAMON usage document mentions the DAMON user space tool and the programming interface twice. This commit integrates those mentions and removes the unnecessary part. Link: https://lkml.kernel.org/r/20211209131806.19317-4-sj@kernel.org Signed-off-by: SeongJae Park Cc: Jonathan Corbet Signed-off-by: Andrew Morton --- Documentation/admin-guide/mm/damon/usage.rst | 42 ++++++++--------- 1 file changed, 21 insertions(+), 21 deletions(-) --- a/Documentation/admin-guide/mm/damon/usage.rst~docs-admin-guide-mm-damon-usage-remove-redundant-information +++ a/Documentation/admin-guide/mm/damon/usage.rst @@ -7,30 +7,30 @@ Detailed Usages DAMON provides below three interfaces for different users. - *DAMON user space tool.* - This is for privileged people such as system administrators who want a - just-working human-friendly interface. Using this, users can use the DAMON’s - major features in a human-friendly way. It may not be highly tuned for - special cases, though. It supports both virtual and physical address spaces - monitoring. + `This `_ is for privileged people such as + system administrators who want a just-working human-friendly interface. + Using this, users can use the DAMON’s major features in a human-friendly way. + It may not be highly tuned for special cases, though. It supports both + virtual and physical address spaces monitoring. For more detail, please + refer to its `usage document + `_. - *debugfs interface.* - This is for privileged user space programmers who want more optimized use of - DAMON. Using this, users can use DAMON’s major features by reading - from and writing to special debugfs files. 
Therefore, you can write and use - your personalized DAMON debugfs wrapper programs that reads/writes the - debugfs files instead of you. The DAMON user space tool is also a reference - implementation of such programs. It supports both virtual and physical - address spaces monitoring. + :ref:`This ` is for privileged user space programmers who + want more optimized use of DAMON. Using this, users can use DAMON’s major + features by reading from and writing to special debugfs files. Therefore, + you can write and use your personalized DAMON debugfs wrapper programs that + reads/writes the debugfs files instead of you. The `DAMON user space tool + `_ is one example of such programs. It + supports both virtual and physical address spaces monitoring. - *Kernel Space Programming Interface.* - This is for kernel space programmers. Using this, users can utilize every - feature of DAMON most flexibly and efficiently by writing kernel space - DAMON application programs for you. You can even extend DAMON for various - address spaces. + :doc:`This ` is for kernel space programmers. Using this, + users can utilize every feature of DAMON most flexibly and efficiently by + writing kernel space DAMON application programs for you. You can even extend + DAMON for various address spaces. For detail, please refer to the interface + :doc:`document `. -Nevertheless, you could write your own user space tool using the debugfs -interface. A reference implementation is available at -https://github.com/awslabs/damo. If you are a kernel programmer, you could -refer to :doc:`/vm/damon/api` for the kernel space programming interface. For -the reason, this document describes only the debugfs interface + +.. 
_debugfs_interface: debugfs Interface ================= From patchwork Fri Jan 14 22:10:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714157 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 62ECAC4332F for ; Fri, 14 Jan 2022 22:10:13 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DE4B66B0187; Fri, 14 Jan 2022 17:10:12 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id D940D6B0189; Fri, 14 Jan 2022 17:10:12 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C5B6B6B018A; Fri, 14 Jan 2022 17:10:12 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0008.hostedemail.com [216.40.44.8]) by kanga.kvack.org (Postfix) with ESMTP id AC8476B0187 for ; Fri, 14 Jan 2022 17:10:12 -0500 (EST) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 76D3193F2F for ; Fri, 14 Jan 2022 22:10:12 +0000 (UTC) X-FDA: 79030286664.11.AF6E927 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf18.hostedemail.com (Postfix) with ESMTP id 753DB1C0002 for ; Fri, 14 Jan 2022 22:10:11 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 6E81CB8262F; Fri, 14 Jan 2022 22:10:10 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E2B25C36AE9; Fri, 14 Jan 2022 22:10:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198209; 
bh=19E+CtDAg/34H9fR7SjNVgJCCYloRDmlLaUNNNmjuM8=; h=Date:From:To:Subject:In-Reply-To:From; b=1nxwDgsKlTMgT0rqZAUytGHY9Mmru5rUNYw8tLou2DHaG62TfkQ/ofxLsE8cOPYoY gG5jvaJCtWhGEIlKw55oJx7fXzYMpdLg9sLPCLdvolxm+CmpNaVnqVj3xd1POOItKP Hzcw0t/5it/C0o+Ee9KbsUUWMjrI910yJkLq4vlo= Date: Fri, 14 Jan 2022 14:10:08 -0800 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 132/146] Docs/admin-guide/mm/damon/usage: mention tracepoint at the beginning Message-ID: <20220114221008.1h5lucSiN%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 753DB1C0002 X-Stat-Signature: 5j1ktnzf1udf338xrr5gc6xse7erxr61 Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=1nxwDgsK; spf=pass (imf18.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198211-179793 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: Docs/admin-guide/mm/damon/usage: mention tracepoint at the beginning To get detailed monitoring results from the user space, users need to use the damon_aggregated tracepoint. This commit adds a brief mention of it at the beginning of the usage document. 
Link: https://lkml.kernel.org/r/20211209131806.19317-5-sj@kernel.org Signed-off-by: SeongJae Park Cc: Jonathan Corbet Signed-off-by: Andrew Morton --- Documentation/admin-guide/mm/damon/usage.rst | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) --- a/Documentation/admin-guide/mm/damon/usage.rst~docs-admin-guide-mm-damon-usage-mention-tracepoint-at-the-beginning +++ a/Documentation/admin-guide/mm/damon/usage.rst @@ -21,7 +21,10 @@ DAMON provides below three interfaces fo you can write and use your personalized DAMON debugfs wrapper programs that reads/writes the debugfs files instead of you. The `DAMON user space tool `_ is one example of such programs. It - supports both virtual and physical address spaces monitoring. + supports both virtual and physical address spaces monitoring. Note that this + interface provides only simple :ref:`statistics ` for the + monitoring results. For detailed monitoring results, DAMON provides a + :ref:`tracepoint `. - *Kernel Space Programming Interface.* :doc:`This ` is for kernel space programmers. Using this, users can utilize every feature of DAMON most flexibly and efficiently by @@ -215,6 +218,8 @@ If the value is higher than ````, the scheme is activated. +.. _damos_stats: + Statistics ~~~~~~~~~~ @@ -268,6 +273,8 @@ the monitoring is turned on. If you wri an error code such as ``-EBUSY`` will be returned. +.. 
_tracepoint: + Tracepoint for Monitoring Results ================================= From patchwork Fri Jan 14 22:10:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714158 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 02D85C433F5 for ; Fri, 14 Jan 2022 22:10:16 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 901866B0189; Fri, 14 Jan 2022 17:10:15 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 889596B018B; Fri, 14 Jan 2022 17:10:15 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7528B6B018C; Fri, 14 Jan 2022 17:10:15 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0080.hostedemail.com [216.40.44.80]) by kanga.kvack.org (Postfix) with ESMTP id 66E176B0189 for ; Fri, 14 Jan 2022 17:10:15 -0500 (EST) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 278B9180C3C4C for ; Fri, 14 Jan 2022 22:10:15 +0000 (UTC) X-FDA: 79030286790.07.35A2E1F Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf18.hostedemail.com (Postfix) with ESMTP id 7903B1C000B for ; Fri, 14 Jan 2022 22:10:14 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 77CE7B8262F; Fri, 14 Jan 2022 22:10:13 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E89C9C36AE5; Fri, 14 Jan 2022 22:10:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; 
s=korg; t=1642198212; bh=bqJOBv3bz137w2s4IpkvD2KdNttFcPuPeJVeiFVERwY=; h=Date:From:To:Subject:In-Reply-To:From; b=W3uv4CXTxJtVns/FsnMteprPuEzzvGEAHMq3KB3nSD3NoOGYKqvgpnRUvbxh0rBng UdBdw800XiP7VxqNKy/jSs/gNfQDT6iM9UdLXEoz3ByIw6bYwXfWE4+4cyAFX0P/gq HpvPTD/d3Uo73JfD2mQuhxRgKn2Pw9VO1LEh2Kiw= Date: Fri, 14 Jan 2022 14:10:11 -0800 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 133/146] Docs/admin-guide/mm/damon/usage: update for kdamond_pid and (mk|rm)_contexts Message-ID: <20220114221011.W00hsuTTA%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 7903B1C000B X-Stat-Signature: tw5r779wcsa837jqmb5u7hbohi9mz96i Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=W3uv4CXT; spf=pass (imf18.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198214-31307 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: Docs/admin-guide/mm/damon/usage: update for kdamond_pid and (mk|rm)_contexts The DAMON debugfs usage document is missing descriptions for 'kdamond_pid', 'mk_contexts', and 'rm_contexts' debugfs files. This commit adds those. 
Link: https://lkml.kernel.org/r/20211209131806.19317-6-sj@kernel.org Signed-off-by: SeongJae Park Cc: Jonathan Corbet Signed-off-by: Andrew Morton --- Documentation/admin-guide/mm/damon/usage.rst | 52 ++++++++++++++++- 1 file changed, 49 insertions(+), 3 deletions(-) --- a/Documentation/admin-guide/mm/damon/usage.rst~docs-admin-guide-mm-damon-usage-update-for-kdamond_pid-and-mkrm_contexts +++ a/Documentation/admin-guide/mm/damon/usage.rst @@ -38,9 +38,9 @@ DAMON provides below three interfaces fo debugfs Interface ================= -DAMON exports five files, ``attrs``, ``target_ids``, ``init_regions``, -``schemes`` and ``monitor_on`` under its debugfs directory, -``/damon/``. +DAMON exports eight files, ``attrs``, ``target_ids``, ``init_regions``, +``schemes``, ``monitor_on``, ``kdamond_pid``, ``mk_contexts`` and +``rm_contexts`` under its debugfs directory, ``/damon/``. Attributes @@ -273,6 +273,52 @@ the monitoring is turned on. If you wri an error code such as ``-EBUSY`` will be returned. +Monitoring Thread PID +--------------------- + +DAMON does requested monitoring with a kernel thread called ``kdamond``. You +can get the pid of the thread by reading the ``kdamond_pid`` file. When the +monitoring is turned off, reading the file returns ``none``. :: + + # cd /damon + # cat monitor_on + off + # cat kdamond_pid + none + # echo on > monitor_on + # cat kdamond_pid + 18594 + + +Using Multiple Monitoring Threads +--------------------------------- + +One ``kdamond`` thread is created for each monitoring context. You can create +and remove monitoring contexts for multiple ``kdamond`` required use case using +the ``mk_contexts`` and ``rm_contexts`` files. + +Writing the name of the new context to the ``mk_contexts`` file creates a +directory of the name on the DAMON debugfs directory. The directory will have +DAMON debugfs files for the context. 
:: + + # cd /damon + # ls foo + # ls: cannot access 'foo': No such file or directory + # echo foo > mk_contexts + # ls foo + # attrs init_regions kdamond_pid schemes target_ids + +If the context is not needed anymore, you can remove it and the corresponding +directory by putting the name of the context to the ``rm_contexts`` file. :: + + # echo foo > rm_contexts + # ls foo + # ls: cannot access 'foo': No such file or directory + +Note that ``mk_contexts``, ``rm_contexts``, and ``monitor_on`` files are in the +root directory only. + + .. _tracepoint: Tracepoint for Monitoring Results From patchwork Fri Jan 14 22:10:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714159 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id EE591C4332F for ; Fri, 14 Jan 2022 22:10:17 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 8406C6B018B; Fri, 14 Jan 2022 17:10:17 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 7C7146B018D; Fri, 14 Jan 2022 17:10:17 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 68D756B018E; Fri, 14 Jan 2022 17:10:17 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (relay033.a.hostedemail.com [64.99.140.33]) by kanga.kvack.org (Postfix) with ESMTP id 583C86B018B for ; Fri, 14 Jan 2022 17:10:17 -0500 (EST) Received: from smtpin02.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 21CD621ED3 for ; Fri, 14 Jan 2022 22:10:17 +0000 (UTC) X-FDA: 79030286874.02.F92B620 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf13.hostedemail.com (Postfix) with ESMTP id 9060E2000E for ; 
Fri, 14 Jan 2022 22:10:16 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 7CFD9B8262F; Fri, 14 Jan 2022 22:10:15 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id F1162C36AE9; Fri, 14 Jan 2022 22:10:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198215; bh=0wpDmZGhi9LlL0lT+cuhNOqCuaItfO0G1UgkHf7/l5U=; h=Date:From:To:Subject:In-Reply-To:From; b=Bl8MD/34u6Vs4uMu7KpuR8P03BGSPRVxWmlLgQwH2fBGNlbW6A3Zf+10iU3yGKJ2N hpo3IAcNeRISuBTCc5/jZklv4jTlt8+QpWgZ6wJwnrvNDL1T7ubFfLDlpWgsBglZ28 4Vhgy5ATdLpJp2hpglxBVMNm/pB/lyhHIPl/KfGg= Date: Fri, 14 Jan 2022 14:10:14 -0800 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 134/146] mm/damon: remove a mistakenly added comment for a future feature Message-ID: <20220114221014.MeoVZWm-2%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 9060E2000E X-Stat-Signature: zesrt6ib4bckw65sj88si63z5ckjwqi5 Authentication-Results: imf13.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="Bl8MD/34"; spf=pass (imf13.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198216-72663 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon: remove a mistakenly added comment for a future feature Due to a mistake in patches reordering, a comment for a future feature called 
'arbitrary monitoring target support'[1], which is still under development, was added. Because it only introduces confusion and we don't have a plan to post the patches soon, this commit removes the mistakenly added part. [1] https://lore.kernel.org/linux-mm/20201215115448.25633-3-sjpark@amazon.com/ Link: https://lkml.kernel.org/r/20211209131806.19317-7-sj@kernel.org Fixes: 1f366e421c8f ("mm/damon/core: implement DAMON-based Operation Schemes (DAMOS)") Signed-off-by: SeongJae Park Cc: Jonathan Corbet Signed-off-by: Andrew Morton --- include/linux/damon.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/include/linux/damon.h~mm-damon-remove-a-mistakenly-added-comment-for-a-future-feature +++ a/include/linux/damon.h @@ -281,7 +281,7 @@ struct damon_ctx; * as an integer in [0, &DAMOS_MAX_SCORE]. * @apply_scheme is called from @kdamond when a region for user provided * DAMON-based operation scheme is found. It should apply the scheme's action - * to the region. This is not used for &DAMON_ARBITRARY_TARGET case. + * to the region. * @target_valid should check whether the target is still valid for the * monitoring. * @cleanup is called from @kdamond just before its termination. 
From patchwork Fri Jan 14 22:10:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714160 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 35AC8C433F5 for ; Fri, 14 Jan 2022 22:10:22 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C0FB26B018D; Fri, 14 Jan 2022 17:10:21 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id BC0286B018F; Fri, 14 Jan 2022 17:10:21 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A868E6B0190; Fri, 14 Jan 2022 17:10:21 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0152.hostedemail.com [216.40.44.152]) by kanga.kvack.org (Postfix) with ESMTP id 993936B018D for ; Fri, 14 Jan 2022 17:10:21 -0500 (EST) Received: from smtpin14.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 6310E998CE for ; Fri, 14 Jan 2022 22:10:21 +0000 (UTC) X-FDA: 79030287042.14.969281D Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf26.hostedemail.com (Postfix) with ESMTP id C831014000E for ; Fri, 14 Jan 2022 22:10:20 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 642BBB8262E; Fri, 14 Jan 2022 22:10:19 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0B8A0C36AEC; Fri, 14 Jan 2022 22:10:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198218; bh=p3sRxf5KPrCsfdnLLYtlaAUPCoyKYoRrNnN/gDyVi5Q=; 
h=Date:From:To:Subject:In-Reply-To:From; b=brLFC7zqfiR+AEIfjjY4IbSoZBQOhgJH57wsYTNTd8vjYftbbZzWZfZdGbePQnVBg vm9ELWtxClaIzLmWkAdo9GKXG55VyepUsjUShlnbnDquew9vhyDS9LkC2bwnOnamqr oxRoRDmDTVczDJIrXix3QBY4aUNIIUordIkHcx08= Date: Fri, 14 Jan 2022 14:10:17 -0800 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 135/146] mm/damon/schemes: account scheme actions that successfully applied Message-ID: <20220114221017.9wTeZ8OM6%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: C831014000E X-Stat-Signature: 5gaucyhrd9uetjahheaicmtkfbteqnuf Authentication-Results: imf26.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=brLFC7zq; spf=pass (imf26.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198220-722426 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/schemes: account scheme actions that successfully applied Patch series "mm/damon/schemes: Extend stats for better online analysis and tuning". To help online access pattern analysis and tuning of DAMON-based Operation Schemes (DAMOS), DAMOS provides simple statistics for each scheme. Introduction of DAMOS time/space quota further made the tuning easier by making the risk management easier. However, that also made understanding of the working schemes a little bit more difficult. For an example, progress of a given scheme can now be throttled by not only the aggressiveness of the target access pattern, but also the time/space quotas. 
So, when a scheme shows unexpectedly slow progress, it is difficult to tell from the currently provided statistics what is throttling it. This patchset extends the statistics with metrics that can help such online scheme analysis and tuning (patches 1-2), exports them to users (patches 3 and 5), and adds documentation (patches 4 and 6). This patch (of 6): DAMON-based operation schemes (DAMOS) stats provide only the number and the total size of the regions to which the scheme's action was tried to be applied. Because the action can fail for some reasons, the currently provided information is sometimes not useful or convenient enough for scheme profiling and tuning. To improve this situation, this commit extends the DAMOS stats to also provide the number and the total size of the regions to which the action was successfully applied. Link: https://lkml.kernel.org/r/20211210150016.35349-1-sj@kernel.org Link: https://lkml.kernel.org/r/20211210150016.35349-2-sj@kernel.org Signed-off-by: SeongJae Park Signed-off-by: Andrew Morton --- include/linux/damon.h | 28 +++++++++++++++++++++------- mm/damon/core.c | 13 ++++++++----- mm/damon/dbgfs.c | 2 +- mm/damon/paddr.c | 13 +++++++------ mm/damon/vaddr.c | 30 ++++++++++++++++-------------- 5 files changed, 53 insertions(+), 33 deletions(-) --- a/include/linux/damon.h~mm-damon-schemes-account-scheme-actions-that-successfully-applied +++ a/include/linux/damon.h @@ -193,6 +193,20 @@ struct damos_watermarks { }; /** + * struct damos_stat - Statistics on a given scheme. + * @nr_tried: Total number of regions that the scheme is tried to be applied. + * @sz_tried: Total size of regions that the scheme is tried to be applied. + * @nr_applied: Total number of regions that the scheme is applied. + * @sz_applied: Total size of regions that the scheme is applied. 
+ */ +struct damos_stat { + unsigned long nr_tried; + unsigned long sz_tried; + unsigned long nr_applied; + unsigned long sz_applied; +}; + +/** * struct damos - Represents a Data Access Monitoring-based Operation Scheme. * @min_sz_region: Minimum size of target regions. * @max_sz_region: Maximum size of target regions. @@ -203,8 +217,7 @@ struct damos_watermarks { * @action: &damo_action to be applied to the target regions. * @quota: Control the aggressiveness of this scheme. * @wmarks: Watermarks for automated (in)activation of this scheme. - * @stat_count: Total number of regions that this scheme is applied. - * @stat_sz: Total size of regions that this scheme is applied. + * @stat: Statistics of this scheme. * @list: List head for siblings. * * For each aggregation interval, DAMON finds regions which fit in the @@ -235,8 +248,7 @@ struct damos { enum damos_action action; struct damos_quota quota; struct damos_watermarks wmarks; - unsigned long stat_count; - unsigned long stat_sz; + struct damos_stat stat; struct list_head list; }; @@ -281,7 +293,8 @@ struct damon_ctx; * as an integer in [0, &DAMOS_MAX_SCORE]. * @apply_scheme is called from @kdamond when a region for user provided * DAMON-based operation scheme is found. It should apply the scheme's action - * to the region. + * to the region and return bytes of the region that the action is successfully + * applied. * @target_valid should check whether the target is still valid for the * monitoring. * @cleanup is called from @kdamond just before its termination. 
@@ -295,8 +308,9 @@ struct damon_primitive { int (*get_scheme_score)(struct damon_ctx *context, struct damon_target *t, struct damon_region *r, struct damos *scheme); - int (*apply_scheme)(struct damon_ctx *context, struct damon_target *t, - struct damon_region *r, struct damos *scheme); + unsigned long (*apply_scheme)(struct damon_ctx *context, + struct damon_target *t, struct damon_region *r, + struct damos *scheme); bool (*target_valid)(void *target); void (*cleanup)(struct damon_ctx *context); }; --- a/mm/damon/core.c~mm-damon-schemes-account-scheme-actions-that-successfully-applied +++ a/mm/damon/core.c @@ -102,8 +102,7 @@ struct damos *damon_new_scheme( scheme->min_age_region = min_age_region; scheme->max_age_region = max_age_region; scheme->action = action; - scheme->stat_count = 0; - scheme->stat_sz = 0; + scheme->stat = (struct damos_stat){}; INIT_LIST_HEAD(&scheme->list); scheme->quota.ms = quota->ms; @@ -574,6 +573,7 @@ static void damon_do_apply_schemes(struc struct damos_quota *quota = &s->quota; unsigned long sz = r->ar.end - r->ar.start; struct timespec64 begin, end; + unsigned long sz_applied = 0; if (!s->wmarks.activated) continue; @@ -627,7 +627,7 @@ static void damon_do_apply_schemes(struc damon_split_region_at(c, t, r, sz); } ktime_get_coarse_ts64(&begin); - c->primitive.apply_scheme(c, t, r, s); + sz_applied = c->primitive.apply_scheme(c, t, r, s); ktime_get_coarse_ts64(&end); quota->total_charged_ns += timespec64_to_ns(&end) - timespec64_to_ns(&begin); @@ -641,8 +641,11 @@ static void damon_do_apply_schemes(struc r->age = 0; update_stat: - s->stat_count++; - s->stat_sz += sz; + s->stat.nr_tried++; + s->stat.sz_tried += sz; + if (sz_applied) + s->stat.nr_applied++; + s->stat.sz_applied += sz_applied; } } --- a/mm/damon/dbgfs.c~mm-damon-schemes-account-scheme-actions-that-successfully-applied +++ a/mm/damon/dbgfs.c @@ -117,7 +117,7 @@ static ssize_t sprint_schemes(struct dam s->quota.weight_age, s->wmarks.metric, s->wmarks.interval, 
s->wmarks.high, s->wmarks.mid, s->wmarks.low, - s->stat_count, s->stat_sz); + s->stat.nr_tried, s->stat.sz_tried); if (!rc) return -ENOMEM; --- a/mm/damon/paddr.c~mm-damon-schemes-account-scheme-actions-that-successfully-applied +++ a/mm/damon/paddr.c @@ -213,14 +213,15 @@ bool damon_pa_target_valid(void *t) return true; } -static int damon_pa_apply_scheme(struct damon_ctx *ctx, struct damon_target *t, - struct damon_region *r, struct damos *scheme) +static unsigned long damon_pa_apply_scheme(struct damon_ctx *ctx, + struct damon_target *t, struct damon_region *r, + struct damos *scheme) { - unsigned long addr; + unsigned long addr, applied; LIST_HEAD(page_list); if (scheme->action != DAMOS_PAGEOUT) - return -EINVAL; + return 0; for (addr = r->ar.start; addr < r->ar.end; addr += PAGE_SIZE) { struct page *page = damon_get_page(PHYS_PFN(addr)); @@ -241,9 +242,9 @@ static int damon_pa_apply_scheme(struct put_page(page); } } - reclaim_pages(&page_list); + applied = reclaim_pages(&page_list); cond_resched(); - return 0; + return applied * PAGE_SIZE; } static int damon_pa_scheme_score(struct damon_ctx *context, --- a/mm/damon/vaddr.c~mm-damon-schemes-account-scheme-actions-that-successfully-applied +++ a/mm/damon/vaddr.c @@ -572,32 +572,34 @@ bool damon_va_target_valid(void *target) } #ifndef CONFIG_ADVISE_SYSCALLS -static int damos_madvise(struct damon_target *target, struct damon_region *r, - int behavior) +static unsigned long damos_madvise(struct damon_target *target, + struct damon_region *r, int behavior) { - return -EINVAL; + return 0; } #else -static int damos_madvise(struct damon_target *target, struct damon_region *r, - int behavior) +static unsigned long damos_madvise(struct damon_target *target, + struct damon_region *r, int behavior) { struct mm_struct *mm; - int ret = -ENOMEM; + unsigned long start = PAGE_ALIGN(r->ar.start); + unsigned long len = PAGE_ALIGN(r->ar.end - r->ar.start); + unsigned long applied; mm = damon_get_mm(target); if (!mm) - goto out; + 
return 0; - ret = do_madvise(mm, PAGE_ALIGN(r->ar.start), - PAGE_ALIGN(r->ar.end - r->ar.start), behavior); + applied = do_madvise(mm, start, len, behavior) ? 0 : len; mmput(mm); -out: - return ret; + + return applied; } #endif /* CONFIG_ADVISE_SYSCALLS */ -static int damon_va_apply_scheme(struct damon_ctx *ctx, struct damon_target *t, - struct damon_region *r, struct damos *scheme) +static unsigned long damon_va_apply_scheme(struct damon_ctx *ctx, + struct damon_target *t, struct damon_region *r, + struct damos *scheme) { int madv_action; @@ -620,7 +622,7 @@ static int damon_va_apply_scheme(struct case DAMOS_STAT: return 0; default: - return -EINVAL; + return 0; } return damos_madvise(t, r, madv_action); From patchwork Fri Jan 14 22:10:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714161 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 63800C433F5 for ; Fri, 14 Jan 2022 22:10:26 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id EA6426B018F; Fri, 14 Jan 2022 17:10:25 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id E55A06B0191; Fri, 14 Jan 2022 17:10:25 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D1E4E6B0192; Fri, 14 Jan 2022 17:10:25 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0252.hostedemail.com [216.40.44.252]) by kanga.kvack.org (Postfix) with ESMTP id C04276B018F for ; Fri, 14 Jan 2022 17:10:25 -0500 (EST) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 8449F998C6 for ; Fri, 14 Jan 2022 22:10:25 +0000 (UTC) X-FDA: 79030287210.22.816AC7A 
Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf10.hostedemail.com (Postfix) with ESMTP id E5146C0006 for ; Fri, 14 Jan 2022 22:10:24 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id B8D58CE2497; Fri, 14 Jan 2022 22:10:22 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0A0B9C36AE9; Fri, 14 Jan 2022 22:10:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198221; bh=CVm+MwUZ4sjp+DltbKcOTE15F9vbCL+4dpAHq0TNcEk=; h=Date:From:To:Subject:In-Reply-To:From; b=t+JaDgjN4UbDdpbSbe9Kh4GrlJBfkEUUOPabyr6CvTBL+DF4PN5pe/tV5lStwXa18 32IzEVjkaY3YTwUUuAlW2lp6cwYG970sDu9lVJQh/Cy7td9PEciOoUNvOiLup6M6P7 3+UvW1SUi1k1VXlp9yCNwesmKCSG8zGgDV6tY6as= Date: Fri, 14 Jan 2022 14:10:20 -0800 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 136/146] mm/damon/schemes: account how many times quota limit has exceeded Message-ID: <20220114221020.IMbqpzsPF%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam11 X-Rspamd-Queue-Id: E5146C0006 X-Stat-Signature: icfyaa1c5j7moo1q58j15h598aioin5d Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=t+JaDgjN; dmarc=none; spf=pass (imf10.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1642198224-856081 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/schemes: 
account how many times quota limit has exceeded If the time/space quotas of a given DAMON-based operation scheme are too small, the scheme could show unexpectedly slow progress. However, there is no good way to notice this at runtime. This commit extends the DAMOS stats to report how many times the quota limits were exceeded, so that users can easily notice the case and tune the scheme. Link: https://lkml.kernel.org/r/20211210150016.35349-3-sj@kernel.org Signed-off-by: SeongJae Park Signed-off-by: Andrew Morton --- include/linux/damon.h | 2 ++ mm/damon/core.c | 2 ++ 2 files changed, 4 insertions(+) --- a/include/linux/damon.h~mm-damon-schemes-account-how-many-times-quota-limit-has-exceeded +++ a/include/linux/damon.h @@ -198,12 +198,14 @@ struct damos_watermarks { * @sz_tried: Total size of regions that the scheme is tried to be applied. * @nr_applied: Total number of regions that the scheme is applied. * @sz_applied: Total size of regions that the scheme is applied. + * @qt_exceeds: Total number of times the quota of the scheme has exceeded. 
*/ struct damos_stat { unsigned long nr_tried; unsigned long sz_tried; unsigned long nr_applied; unsigned long sz_applied; + unsigned long qt_exceeds; }; /** --- a/mm/damon/core.c~mm-damon-schemes-account-how-many-times-quota-limit-has-exceeded +++ a/mm/damon/core.c @@ -693,6 +693,8 @@ static void kdamond_apply_schemes(struct if (time_after_eq(jiffies, quota->charged_from + msecs_to_jiffies( quota->reset_interval))) { + if (quota->esz && quota->charged_sz >= quota->esz) + s->stat.qt_exceeds++; quota->total_charged_sz += quota->charged_sz; quota->charged_from = jiffies; quota->charged_sz = 0; From patchwork Fri Jan 14 22:10:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714162 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 479CFC433FE for ; Fri, 14 Jan 2022 22:10:29 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D6D916B0191; Fri, 14 Jan 2022 17:10:28 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id D1D7E6B0193; Fri, 14 Jan 2022 17:10:28 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C0C896B0194; Fri, 14 Jan 2022 17:10:28 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0143.hostedemail.com [216.40.44.143]) by kanga.kvack.org (Postfix) with ESMTP id B1B006B0191 for ; Fri, 14 Jan 2022 17:10:28 -0500 (EST) Received: from smtpin20.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 7446C1826B6C5 for ; Fri, 14 Jan 2022 22:10:28 +0000 (UTC) X-FDA: 79030287336.20.19912E5 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf09.hostedemail.com (Postfix) with 
ESMTP id E3EFA140003 for ; Fri, 14 Jan 2022 22:10:26 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 8E8EDCE2498; Fri, 14 Jan 2022 22:10:24 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id F3F6BC36AE5; Fri, 14 Jan 2022 22:10:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198224; bh=zJkWwEa3WLIXIkRvusgvNXLDqDO48S+ZkshdlA8Upgc=; h=Date:From:To:Subject:In-Reply-To:From; b=qy4SB9RXkWIjaf2VTCQBOIv0/knExwEuhg+418TJ6ge6k5jZAJDbeQ2RQ5x1aHqQ6 KYytWenZy9eIfwxToUYvPvMXXEu+UPv5rBtLm9i5y0dQ0KahmdTWuYMohLhx8szemE Dw9XeBkXzDjO+PfL7D+bLlabBuUw28LQ914M8svE= Date: Fri, 14 Jan 2022 14:10:23 -0800 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 137/146] mm/damon/reclaim: provide reclamation statistics Message-ID: <20220114221023.4HlH4s90J%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: E3EFA140003 X-Stat-Signature: n61erdfbc9mei13ftht4mabg663zs4fx Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=qy4SB9RX; dmarc=none; spf=pass (imf09.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-HE-Tag: 1642198226-239610 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/reclaim: provide reclamation statistics This commit implements new DAMON_RECLAIM parameters for statistics reporting. 
Those can be used for understanding how DAMON_RECLAIM is working, and for tuning the other parameters. Link: https://lkml.kernel.org/r/20211210150016.35349-4-sj@kernel.org Signed-off-by: SeongJae Park Signed-off-by: Andrew Morton --- mm/damon/reclaim.c | 46 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+) --- a/mm/damon/reclaim.c~mm-damon-reclaim-provide-reclamation-statistics +++ a/mm/damon/reclaim.c @@ -185,6 +185,36 @@ module_param(monitor_region_end, ulong, static int kdamond_pid __read_mostly = -1; module_param(kdamond_pid, int, 0400); +/* + * Number of memory regions that tried to be reclaimed. + */ +static unsigned long nr_reclaim_tried_regions __read_mostly; +module_param(nr_reclaim_tried_regions, ulong, 0400); + +/* + * Total bytes of memory regions that tried to be reclaimed. + */ +static unsigned long bytes_reclaim_tried_regions __read_mostly; +module_param(bytes_reclaim_tried_regions, ulong, 0400); + +/* + * Number of memory regions that successfully be reclaimed. + */ +static unsigned long nr_reclaimed_regions __read_mostly; +module_param(nr_reclaimed_regions, ulong, 0400); + +/* + * Total bytes of memory regions that successfully be reclaimed. 
+ */ +static unsigned long bytes_reclaimed_regions __read_mostly; +module_param(bytes_reclaimed_regions, ulong, 0400); + +/* + * Number of times that the time/space quota limits have exceeded + */ +static unsigned long nr_quota_exceeds __read_mostly; +module_param(nr_quota_exceeds, ulong, 0400); + static struct damon_ctx *ctx; static struct damon_target *target; @@ -333,6 +363,21 @@ static void damon_reclaim_timer_fn(struc } static DECLARE_DELAYED_WORK(damon_reclaim_timer, damon_reclaim_timer_fn); +static int damon_reclaim_after_aggregation(struct damon_ctx *c) +{ + struct damos *s; + + /* update the stats parameter */ + damon_for_each_scheme(s, c) { + nr_reclaim_tried_regions = s->stat.nr_tried; + bytes_reclaim_tried_regions = s->stat.sz_tried; + nr_reclaimed_regions = s->stat.nr_applied; + bytes_reclaimed_regions = s->stat.sz_applied; + nr_quota_exceeds = s->stat.qt_exceeds; + } + return 0; +} + static int __init damon_reclaim_init(void) { ctx = damon_new_ctx(); @@ -340,6 +385,7 @@ static int __init damon_reclaim_init(voi return -ENOMEM; damon_pa_set_primitives(ctx); + ctx->callback.after_aggregation = damon_reclaim_after_aggregation; /* 4242 means nothing but fun */ target = damon_new_target(4242); From patchwork Fri Jan 14 22:10:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714163 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8AC27C433EF for ; Fri, 14 Jan 2022 22:10:30 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 24A336B0193; Fri, 14 Jan 2022 17:10:30 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 1F9BA6B0195; Fri, 14 Jan 2022 17:10:30 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 
63042) id 0E9DB6B0196; Fri, 14 Jan 2022 17:10:30 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0030.hostedemail.com [216.40.44.30]) by kanga.kvack.org (Postfix) with ESMTP id EFCAD6B0193 for ; Fri, 14 Jan 2022 17:10:29 -0500 (EST) Received: from smtpin19.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id B956E1826B6AD for ; Fri, 14 Jan 2022 22:10:29 +0000 (UTC) X-FDA: 79030287378.19.195DD1A Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf14.hostedemail.com (Postfix) with ESMTP id 4408910000C for ; Fri, 14 Jan 2022 22:10:29 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 41F5AB825F5; Fri, 14 Jan 2022 22:10:28 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E43CFC36AE5; Fri, 14 Jan 2022 22:10:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198227; bh=+Liij4OSwUtj1wepAGX3ZoZnxl68z6mZqipSVj0FaI0=; h=Date:From:To:Subject:In-Reply-To:From; b=YsMNIwKowDHDrfcRGFZEs0nLwy95ZEpipbXDUTpImAobT7tJxPzm3HS22oi7bF9Te awHwYcuCKMGHbFEcKd/adYlqSAuF2Ir9zA6RtlQwu0XzsHC5GY0M2vN8oPiJkqjTS+ DxHWMkJBru3F0X/FiJ0N8v3X1OUPI/SPtqEdxsD8= Date: Fri, 14 Jan 2022 14:10:26 -0800 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 138/146] Docs/admin-guide/mm/damon/reclaim: document statistics parameters Message-ID: <20220114221026.JpjZy9r0m%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam11 X-Rspamd-Queue-Id: 4408910000C X-Stat-Signature: 5c4w7fa9qf4s7cqutypg5s5hx3o8914w Authentication-Results: 
imf14.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=YsMNIwKo; dmarc=none; spf=pass (imf14.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1642198229-13673 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: Docs/admin-guide/mm/damon/reclaim: document statistics parameters This commit adds descriptions for the DAMON_RECLAIM statistics parameters. Link: https://lkml.kernel.org/r/20211210150016.35349-5-sj@kernel.org Signed-off-by: SeongJae Park Signed-off-by: Andrew Morton --- Documentation/admin-guide/mm/damon/reclaim.rst | 25 +++++++++++++++ 1 file changed, 25 insertions(+) --- a/Documentation/admin-guide/mm/damon/reclaim.rst~docs-admin-guide-mm-damon-reclaim-document-statistics-parameters +++ a/Documentation/admin-guide/mm/damon/reclaim.rst @@ -208,6 +208,31 @@ PID of the DAMON thread. If DAMON_RECLAIM is enabled, this becomes the PID of the worker thread. Else, -1. +nr_reclaim_tried_regions +------------------------ + +Number of memory regions that tried to be reclaimed by DAMON_RECLAIM. + +bytes_reclaim_tried_regions +--------------------------- + +Total bytes of memory regions that tried to be reclaimed by DAMON_RECLAIM. + +nr_reclaimed_regions +-------------------- + +Number of memory regions that successfully be reclaimed by DAMON_RECLAIM. + +bytes_reclaimed_regions +----------------------- + +Total bytes of memory regions that successfully be reclaimed by DAMON_RECLAIM. + +nr_quota_exceeds +---------------- + +Number of times that the time/space quota limits have exceeded. 
+ Example ======= From patchwork Fri Jan 14 22:10:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714164 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id A3E04C433EF for ; Fri, 14 Jan 2022 22:10:33 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 3399B6B0195; Fri, 14 Jan 2022 17:10:33 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 2E4386B0197; Fri, 14 Jan 2022 17:10:33 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1D3A76B0198; Fri, 14 Jan 2022 17:10:33 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0059.hostedemail.com [216.40.44.59]) by kanga.kvack.org (Postfix) with ESMTP id 0D0276B0195 for ; Fri, 14 Jan 2022 17:10:33 -0500 (EST) Received: from smtpin30.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id C399882F4BEE for ; Fri, 14 Jan 2022 22:10:32 +0000 (UTC) X-FDA: 79030287504.30.B072678 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf30.hostedemail.com (Postfix) with ESMTP id 583AA80004 for ; Fri, 14 Jan 2022 22:10:32 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 58185B825F5; Fri, 14 Jan 2022 22:10:31 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DC5E6C36AE9; Fri, 14 Jan 2022 22:10:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198230; 
bh=f7ebOWbgDygOOHI24vYnOZH5qj2LJ+FCPcXNn+80hAo=; h=Date:From:To:Subject:In-Reply-To:From; b=0hXGqa4fhLbkBmmE4Bin2do8gxX7DQxs4nIcMzj0Jpl3hy0OuCNkriK83uOxJjx1E JiR/vmiNGCrYIO1UE+Gs8BNDdJizQPnuqImDo+SUWoOpWCQAfogtW2/FkBuOlNKoqs EGFbCw+y3Uu+AK33Zp5uexBSXXzYmyA77cjrxUeY= Date: Fri, 14 Jan 2022 14:10:29 -0800 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 139/146] mm/damon/dbgfs: support all DAMOS stats Message-ID: <20220114221029.jCyyux7ns%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 583AA80004 X-Stat-Signature: jmpd3gj84fgudggbq37wt5711pph6tki Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=0hXGqa4f; spf=pass (imf30.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198232-525258 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/dbgfs: support all DAMOS stats Currently, the DAMON debugfs interface does not expose the DAMON-based Operation Schemes (DAMOS) stats for the regions to which schemes were successfully applied or for the number of times the time/space quota limits were exceeded. This commit adds that support. 
Link: https://lkml.kernel.org/r/20211210150016.35349-6-sj@kernel.org Signed-off-by: SeongJae Park Signed-off-by: Andrew Morton --- mm/damon/dbgfs.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) --- a/mm/damon/dbgfs.c~mm-damon-dbgfs-support-all-damos-stats +++ a/mm/damon/dbgfs.c @@ -105,7 +105,7 @@ static ssize_t sprint_schemes(struct dam damon_for_each_scheme(s, c) { rc = scnprintf(&buf[written], len - written, - "%lu %lu %u %u %u %u %d %lu %lu %lu %u %u %u %d %lu %lu %lu %lu %lu %lu\n", + "%lu %lu %u %u %u %u %d %lu %lu %lu %u %u %u %d %lu %lu %lu %lu %lu %lu %lu %lu %lu\n", s->min_sz_region, s->max_sz_region, s->min_nr_accesses, s->max_nr_accesses, s->min_age_region, s->max_age_region, @@ -117,7 +117,9 @@ static ssize_t sprint_schemes(struct dam s->quota.weight_age, s->wmarks.metric, s->wmarks.interval, s->wmarks.high, s->wmarks.mid, s->wmarks.low, - s->stat.nr_tried, s->stat.sz_tried); + s->stat.nr_tried, s->stat.sz_tried, + s->stat.nr_applied, s->stat.sz_applied, + s->stat.qt_exceeds); if (!rc) return -ENOMEM; From patchwork Fri Jan 14 22:10:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714165 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8A708C433F5 for ; Fri, 14 Jan 2022 22:10:36 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1D6526B0197; Fri, 14 Jan 2022 17:10:36 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 15EFB6B0199; Fri, 14 Jan 2022 17:10:36 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 025716B019A; Fri, 14 Jan 2022 17:10:35 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0134.hostedemail.com 
[216.40.44.134]) by kanga.kvack.org (Postfix) with ESMTP id E8AEC6B0197 for ; Fri, 14 Jan 2022 17:10:35 -0500 (EST) Received: from smtpin27.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id B2D0082F12DC for ; Fri, 14 Jan 2022 22:10:35 +0000 (UTC) X-FDA: 79030287630.27.A29E7C8 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf11.hostedemail.com (Postfix) with ESMTP id 1C94040003 for ; Fri, 14 Jan 2022 22:10:34 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 2992BB8262F; Fri, 14 Jan 2022 22:10:34 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C3E03C36AEC; Fri, 14 Jan 2022 22:10:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198233; bh=BwaypfJBGN7aih/LesEouJ+/BAb2lC33TQCh8P5CKoo=; h=Date:From:To:Subject:In-Reply-To:From; b=wmzgu228bQvN9EfQIPJQJlfn9xiNcugYrZir/OnWKlYp8mNNqHPZOrpXHo9Nc/Uht +TJsM+Eu4Eb77KDDMeN1ylVTwrdZNLwcku3teeO8YBh410U/vrav5g46g9TBE9ROob DMx0MRY08327gWHUSLFCk+3dIQc37/d4geSLvXm0= Date: Fri, 14 Jan 2022 14:10:32 -0800 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 140/146] Docs/admin-guide/mm/damon/usage: update for schemes statistics Message-ID: <20220114221032.36H2uejCs%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 1C94040003 X-Stat-Signature: hyun6b9k1ermcujhwsg5mtahkngnwaq5 Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=wmzgu228; spf=pass (imf11.hostedemail.com: domain of akpm@linux-foundation.org 
designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198234-1015 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: Docs/admin-guide/mm/damon/usage: update for schemes statistics This commit updates the DAMON debugfs interface usage document for the statistics of the regions to which each scheme was successfully applied and the number of times the time/space quota limits were exceeded. Link: https://lkml.kernel.org/r/20211210150016.35349-7-sj@kernel.org Signed-off-by: SeongJae Park Signed-off-by: Andrew Morton --- Documentation/admin-guide/mm/damon/usage.rst | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) --- a/Documentation/admin-guide/mm/damon/usage.rst~docs-admin-guide-mm-damon-usage-update-for-schemes-statistics +++ a/Documentation/admin-guide/mm/damon/usage.rst @@ -223,12 +223,13 @@ is activated. Statistics ~~~~~~~~~~ -It also counts the total number and bytes of regions that each scheme is -applied. This statistics can be used for online analysis or tuning of the -schemes. +It also counts the total number and bytes of regions that each scheme is tried +to be applied, the two numbers for the regions that each scheme is successfully +applied, and the total number of the quota limit exceeds. This statistics can +be used for online analysis or tuning of the schemes. The statistics can be shown by reading the ``schemes`` file. Reading the file -will show each scheme you entered in each line, and the two numbers for the +will show each scheme you entered in each line, and the five numbers for the statistics will be added at the end of each line.
Example From patchwork Fri Jan 14 22:10:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714166 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C6C30C433EF for ; Fri, 14 Jan 2022 22:10:41 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 5CCCD6B0199; Fri, 14 Jan 2022 17:10:41 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 57BD16B019B; Fri, 14 Jan 2022 17:10:41 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 41E6D6B019C; Fri, 14 Jan 2022 17:10:41 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0009.hostedemail.com [216.40.44.9]) by kanga.kvack.org (Postfix) with ESMTP id 32BA66B0199 for ; Fri, 14 Jan 2022 17:10:41 -0500 (EST) Received: from smtpin23.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id E895582F12DC for ; Fri, 14 Jan 2022 22:10:40 +0000 (UTC) X-FDA: 79030287840.23.AB10B20 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf03.hostedemail.com (Postfix) with ESMTP id 294F72000D for ; Fri, 14 Jan 2022 22:10:39 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id D9AB0CE19A9; Fri, 14 Jan 2022 22:10:37 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D9CB3C36AEC; Fri, 14 Jan 2022 22:10:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198236; bh=494o6UPR1dCRBxmjCxiev9rGNpIsWEH9ubGKHFmP2PY=; 
h=Date:From:To:Subject:In-Reply-To:From; b=xpZtTSVwVN2jvF3K5q2T/UeGARS9OwAKj7e7845ceWp0KknJXJEzjJUQ6ewNnG5su LSf9k08jvSdGavFyzjE3ObZDoBECA5DRzXEPQoumhpZ478bvn3fMm1Sa9JGWVMYk3O 80sqpK//+K794iYYs3UJhD10dAw+UP5vRPtSDYok= Date: Fri, 14 Jan 2022 14:10:35 -0800 From: Andrew Morton To: akpm@linux-foundation.org, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, rdunlap@infradead.org, sfr@canb.auug.org.au, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 141/146] mm/damon: add access checking for hugetlb pages Message-ID: <20220114221035.BB8CjJjza%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=xpZtTSVw; dmarc=none; spf=pass (imf03.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Stat-Signature: 9q8c51mq4rmd63gtp3j7jwuayu6obu7z X-Rspamd-Queue-Id: 294F72000D X-Rspamd-Server: rspam12 X-HE-Tag: 1642198239-590019 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Baolin Wang Subject: mm/damon: add access checking for hugetlb pages A process's VMAs can be mapped by hugetlb pages, but DAMON does not yet implement access checking for hugetlb PTEs, so the actual access count is not reported when a process's VMAs are mapped by hugetlb, as shown below.
damon_aggregated: target_id=18446614368406014464 nr_regions=12 4194304-5476352: 0 545
damon_aggregated: target_id=18446614368406014464 nr_regions=12 140662370467840-140662372970496: 0 545
damon_aggregated: target_id=18446614368406014464 nr_regions=12 140662372970496-140662375460864: 0 545
damon_aggregated: target_id=18446614368406014464 nr_regions=12 140662375460864-140662377951232: 0 545
damon_aggregated: target_id=18446614368406014464 nr_regions=12 140662377951232-140662380449792: 0 545
damon_aggregated: target_id=18446614368406014464 nr_regions=12 140662380449792-140662382944256: 0 545
......

This patch therefore adds hugetlb access checking support. With it applied, the access counts of VMAs mapped by hugetlb are reported, as shown below.

damon_aggregated: target_id=18446613056935405824 nr_regions=12 140296486649856-140296489914368: 1 3
damon_aggregated: target_id=18446613056935405824 nr_regions=12 140296489914368-140296492978176: 1 3
damon_aggregated: target_id=18446613056935405824 nr_regions=12 140296492978176-140296495439872: 1 3
damon_aggregated: target_id=18446613056935405824 nr_regions=12 140296495439872-140296498311168: 1 3
damon_aggregated: target_id=18446613056935405824 nr_regions=12 140296498311168-140296501198848: 1 3
damon_aggregated: target_id=18446613056935405824 nr_regions=12 140296501198848-140296504320000: 1 3
damon_aggregated: target_id=18446613056935405824 nr_regions=12 140296504320000-140296507568128: 1 2
......
[baolin.wang@linux.alibaba.com: fix unused var warning] Link: https://lkml.kernel.org/r/1aaf9c11-0d8e-b92d-5c92-46e50a6e8d4e@linux.alibaba.com [baolin.wang@linux.alibaba.com: v3] Link: https://lkml.kernel.org/r/486927ecaaaecf2e3a7fbe0378ec6e1c58b50747.1640852276.git.baolin.wang@linux.alibaba.com Link: https://lkml.kernel.org/r/6afcbd1fda5f9c7c24f320d26a98188c727ceec3.1639623751.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang Reviewed-by: SeongJae Park Cc: Mike Kravetz Cc: Randy Dunlap Cc: Stephen Rothwell Signed-off-by: Andrew Morton --- mm/damon/vaddr.c | 96 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 96 insertions(+) --- a/mm/damon/vaddr.c~mm-damon-add-access-checking-for-hugetlb-pages +++ a/mm/damon/vaddr.c @@ -388,8 +388,65 @@ out: return 0; } +#ifdef CONFIG_HUGETLB_PAGE +static void damon_hugetlb_mkold(pte_t *pte, struct mm_struct *mm, + struct vm_area_struct *vma, unsigned long addr) +{ + bool referenced = false; + pte_t entry = huge_ptep_get(pte); + struct page *page = pte_page(entry); + + if (!page) + return; + + get_page(page); + + if (pte_young(entry)) { + referenced = true; + entry = pte_mkold(entry); + huge_ptep_set_access_flags(vma, addr, pte, entry, + vma->vm_flags & VM_WRITE); + } + +#ifdef CONFIG_MMU_NOTIFIER + if (mmu_notifier_clear_young(mm, addr, + addr + huge_page_size(hstate_vma(vma)))) + referenced = true; +#endif /* CONFIG_MMU_NOTIFIER */ + + if (referenced) + set_page_young(page); + + set_page_idle(page); + put_page(page); +} + +static int damon_mkold_hugetlb_entry(pte_t *pte, unsigned long hmask, + unsigned long addr, unsigned long end, + struct mm_walk *walk) +{ + struct hstate *h = hstate_vma(walk->vma); + spinlock_t *ptl; + pte_t entry; + + ptl = huge_pte_lock(h, walk->mm, pte); + entry = huge_ptep_get(pte); + if (!pte_present(entry)) + goto out; + + damon_hugetlb_mkold(pte, walk->mm, walk->vma, addr); + +out: + spin_unlock(ptl); + return 0; +} +#else +#define damon_mkold_hugetlb_entry NULL +#endif /* 
CONFIG_HUGETLB_PAGE */ + static const struct mm_walk_ops damon_mkold_ops = { .pmd_entry = damon_mkold_pmd_entry, + .hugetlb_entry = damon_mkold_hugetlb_entry, }; static void damon_va_mkold(struct mm_struct *mm, unsigned long addr) @@ -484,8 +541,47 @@ out: return 0; } +#ifdef CONFIG_HUGETLB_PAGE +static int damon_young_hugetlb_entry(pte_t *pte, unsigned long hmask, + unsigned long addr, unsigned long end, + struct mm_walk *walk) +{ + struct damon_young_walk_private *priv = walk->private; + struct hstate *h = hstate_vma(walk->vma); + struct page *page; + spinlock_t *ptl; + pte_t entry; + + ptl = huge_pte_lock(h, walk->mm, pte); + entry = huge_ptep_get(pte); + if (!pte_present(entry)) + goto out; + + page = pte_page(entry); + if (!page) + goto out; + + get_page(page); + + if (pte_young(entry) || !page_is_idle(page) || + mmu_notifier_test_young(walk->mm, addr)) { + *priv->page_sz = huge_page_size(h); + priv->young = true; + } + + put_page(page); + +out: + spin_unlock(ptl); + return 0; +} +#else +#define damon_young_hugetlb_entry NULL +#endif /* CONFIG_HUGETLB_PAGE */ + static const struct mm_walk_ops damon_young_ops = { .pmd_entry = damon_young_pmd_entry, + .hugetlb_entry = damon_young_hugetlb_entry, }; static bool damon_va_young(struct mm_struct *mm, unsigned long addr, From patchwork Fri Jan 14 22:10:38 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714167 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3BF5EC433FE for ; Fri, 14 Jan 2022 22:10:43 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C5CAC6B019B; Fri, 14 Jan 2022 17:10:42 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id B6FFB6B019D; Fri, 14 Jan 2022 17:10:42 -0500 (EST) X-Delivered-To: 
int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A5DA36B019E; Fri, 14 Jan 2022 17:10:42 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0087.hostedemail.com [216.40.44.87]) by kanga.kvack.org (Postfix) with ESMTP id 934116B019B for ; Fri, 14 Jan 2022 17:10:42 -0500 (EST) Received: from smtpin23.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 5FF591826B6AD for ; Fri, 14 Jan 2022 22:10:42 +0000 (UTC) X-FDA: 79030287924.23.3C41D0B Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf26.hostedemail.com (Postfix) with ESMTP id C97DE140008 for ; Fri, 14 Jan 2022 22:10:41 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 93A55CE24A5; Fri, 14 Jan 2022 22:10:39 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id F1294C36AE5; Fri, 14 Jan 2022 22:10:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198239; bh=v1bwwHVFMHKlbRbNAi/180jRFambaRbnWJ42e8609zc=; h=Date:From:To:Subject:In-Reply-To:From; b=Z7CbX80CYjdVYnzWi5Ai/mCrTEitVYqHK35US8nVKve/7hsD8e5s5CdP186xX57LX GfwTnPeI3KPt2UNZMSJCpxX2eMBL6zSGyjW0IH0bKI9rXZJMXmnWrKU+J39PtyGQld 98VtigBzk0KEfskWcEQuBVvWNoHdaJaF/tOBXikQ= Date: Fri, 14 Jan 2022 14:10:38 -0800 From: Andrew Morton To: akpm@linux-foundation.org, guoqing.jiang@linux.dev, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 142/146] mm/damon: move the implementation of damon_insert_region to damon.h Message-ID: <20220114221038.0qYlWfSgE%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 
C97DE140008 X-Stat-Signature: crjp5se1hmhm9swq8x5x44hzmdqfiyg7 Authentication-Results: imf26.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Z7CbX80C; dmarc=none; spf=pass (imf26.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam06 X-HE-Tag: 1642198241-170372 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Guoqing Jiang Subject: mm/damon: move the implementation of damon_insert_region to damon.h An inline function is usually declared static, with 'inline' sitting between the storage class and the type, and is implemented in a header file if it is used by multiple files. This change also fixes a compile issue seen when backporting DAMON to 5.10. mm/damon/vaddr.c: In function `damon_va_evenly_split_region': ./include/linux/damon.h:425:13: error: inlining failed in call to `always_inline' `damon_insert_region': function body not available 425 | inline void damon_insert_region(struct damon_region *r, | ^~~~~~~~~~~~~~~~~~~ mm/damon/vaddr.c:86:3: note: called from here 86 | damon_insert_region(n, r, next, t); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Link: https://lkml.kernel.org/r/20211223085703.6142-1-guoqing.jiang@linux.dev Signed-off-by: Guoqing Jiang Reviewed-by: SeongJae Park Signed-off-by: Andrew Morton --- include/linux/damon.h | 13 +++++++++++-- mm/damon/core.c | 11 ----------- 2 files changed, 11 insertions(+), 13 deletions(-) --- a/include/linux/damon.h~mm-damon-move-the-implementation-of-damon_insert_region-to-damonh +++ a/include/linux/damon.h @@ -451,9 +451,18 @@ static inline struct damon_region *damon #ifdef CONFIG_DAMON struct damon_region *damon_new_region(unsigned long start, unsigned long end); -inline void damon_insert_region(struct damon_region *r, + +/* + * Add a region between two other regions + */ +static inline void
damon_insert_region(struct damon_region *r, struct damon_region *prev, struct damon_region *next, - struct damon_target *t); + struct damon_target *t) +{ + __list_add(&r->list, &prev->list, &next->list); + t->nr_regions++; +} + void damon_add_region(struct damon_region *r, struct damon_target *t); void damon_destroy_region(struct damon_region *r, struct damon_target *t); --- a/mm/damon/core.c~mm-damon-move-the-implementation-of-damon_insert_region-to-damonh +++ a/mm/damon/core.c @@ -49,17 +49,6 @@ struct damon_region *damon_new_region(un return region; } -/* - * Add a region between two other regions - */ -inline void damon_insert_region(struct damon_region *r, - struct damon_region *prev, struct damon_region *next, - struct damon_target *t) -{ - __list_add(&r->list, &prev->list, &next->list); - t->nr_regions++; -} - void damon_add_region(struct damon_region *r, struct damon_target *t) { list_add_tail(&r->list, &t->regions_list); From patchwork Fri Jan 14 22:10:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714168 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 93597C4332F for ; Fri, 14 Jan 2022 22:10:45 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 2C2556B019D; Fri, 14 Jan 2022 17:10:45 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 220666B019F; Fri, 14 Jan 2022 17:10:45 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0ECB86B01A0; Fri, 14 Jan 2022 17:10:45 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0164.hostedemail.com [216.40.44.164]) by kanga.kvack.org (Postfix) with ESMTP id EF8BD6B019D for ; Fri, 14 Jan 2022 17:10:44 -0500 
(EST) Received: from smtpin03.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id B33E5998C3 for ; Fri, 14 Jan 2022 22:10:44 +0000 (UTC) X-FDA: 79030288008.03.7B6FB5F Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf05.hostedemail.com (Postfix) with ESMTP id 52C94100007 for ; Fri, 14 Jan 2022 22:10:44 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 592BBB825F5; Fri, 14 Jan 2022 22:10:43 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 00D5DC36AEC; Fri, 14 Jan 2022 22:10:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198242; bh=B52i9MSp+eHr0i0/0dXWUWYTfW+JhUWQgTL4pNV8vRk=; h=Date:From:To:Subject:In-Reply-To:From; b=gxsNJIk205z7B/ML9iGBwC1nf/QLPuwNw9NETV2lPYcQ58yvQ9WuwnJDWvjvVHUme I6dK87tzel0vH+HuKaUudUpjHdmg50F4bJZOd0PBEOT/4AVHuqJBHQpd1QnB17M1O2 Lcj7sqWYrjGQCXwND/Jr2RGBbGneeC8nQn6ot+Vw= Date: Fri, 14 Jan 2022 14:10:41 -0800 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 143/146] mm/damon/dbgfs: remove an unnecessary variable Message-ID: <20220114221041.whTITHaup%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 52C94100007 X-Stat-Signature: 3ta7gmpcfxx991ga8im84cjoq5c5y1te Authentication-Results: imf05.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=gxsNJIk2; dmarc=none; spf=pass (imf05.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam08 X-HE-Tag: 1642198244-205000 
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/dbgfs: remove an unnecessary variable Patch series "mm/damon: Hide unnecessary information disclosures". DAMON exposes some unnecessary information, including kernel pointers, in the kernel log and a tracepoint. This patchset hides such information. The first patch, however, is only a trivial cleanup. This patch (of 4): This commit removes an unnecessary variable in dbgfs_target_ids_write(). Link: https://lkml.kernel.org/r/20211229131016.23641-1-sj@kernel.org Link: https://lkml.kernel.org/r/20211229131016.23641-2-sj@kernel.org Fixes: 4bc05954d007 ("mm/damon: implement a debugfs-based user space interface") Signed-off-by: SeongJae Park Signed-off-by: Andrew Morton --- mm/damon/dbgfs.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) --- a/mm/damon/dbgfs.c~mm-damon-dbgfs-remove-a-unnecessary-variable +++ a/mm/damon/dbgfs.c @@ -364,7 +364,7 @@ static ssize_t dbgfs_target_ids_write(st struct damon_ctx *ctx = file->private_data; struct damon_target *t, *next_t; bool id_is_pid = true; - char *kbuf, *nrs; + char *kbuf; unsigned long *targets; ssize_t nr_targets; ssize_t ret; @@ -374,14 +374,13 @@ static ssize_t dbgfs_target_ids_write(st if (IS_ERR(kbuf)) return PTR_ERR(kbuf); - nrs = kbuf; if (!strncmp(kbuf, "paddr\n", count)) { id_is_pid = false; /* target id is meaningless here, but we set it just for fun */ scnprintf(kbuf, count, "42 "); } - targets = str_to_target_ids(nrs, count, &nr_targets); + targets = str_to_target_ids(kbuf, count, &nr_targets); if (!targets) { ret = -ENOMEM; goto out; From patchwork Fri Jan 14 22:10:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714169 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 91FA5C433EF for ; Fri, 14 Jan 2022 22:10:47 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 2719F6B019F; Fri, 14 Jan 2022 17:10:47 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 247A36B01A1; Fri, 14 Jan 2022 17:10:47 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 138EB6B01A2; Fri, 14 Jan 2022 17:10:47 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0036.hostedemail.com [216.40.44.36]) by kanga.kvack.org (Postfix) with ESMTP id F42256B019F for ; Fri, 14 Jan 2022 17:10:46 -0500 (EST) Received: from smtpin30.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id AF31E18272F05 for ; Fri, 14 Jan 2022 22:10:46 +0000 (UTC) X-FDA: 79030288092.30.C10C33A Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf03.hostedemail.com (Postfix) with ESMTP id 67FC920008 for ; Fri, 14 Jan 2022 22:10:46 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 70B85B8262E; Fri, 14 Jan 2022 22:10:45 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id EF39AC36AE9; Fri, 14 Jan 2022 22:10:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198245; bh=XI8c08BaJs1j2iC89Rj9atI+5ZEcHjbWcd51QKBOcdM=; h=Date:From:To:Subject:In-Reply-To:From; b=V2rlK6Or4GzGREhY/V8e7LYp0yP76zhIpdBbIVKxMftV4C2KR39Pdx+bvP9N87IVp 9BIQiTwAKBQ2PBa8klwZcrrHQE9TR6902Y4pLVwf0+QLzEKGIwYG3m0KtlF/sDM0+A XyWNVUcsOpIkX4w/kgf+R0aLY4RmIbv9m1PQVCUM= Date: Fri, 14 Jan 2022 14:10:44 -0800 From: Andrew Morton To: 
akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 144/146] mm/damon/vaddr: use pr_debug() for damon_va_three_regions() failure logging Message-ID: <20220114221044.otM3UPlhQ%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 67FC920008 X-Stat-Signature: oiaw5ryjgn4z5f6nqa5gw7rfdwc96krg Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=V2rlK6Or; spf=pass (imf03.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198246-887221 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/vaddr: use pr_debug() for damon_va_three_regions() failure logging A failure of 'damon_va_three_regions()' is logged using 'pr_err()'. However, the function can fail in legitimate situations. To avoid surprising users and to keep the kernel log clean, this commit prints the message using 'pr_debug()' instead.
Link: https://lkml.kernel.org/r/20211229131016.23641-3-sj@kernel.org Signed-off-by: SeongJae Park Signed-off-by: Andrew Morton --- mm/damon/vaddr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/damon/vaddr.c~mm-damon-vaddr-use-pr_debug-for-damon_va_three_regions-failure-logging +++ a/mm/damon/vaddr.c @@ -238,7 +238,7 @@ static void __damon_va_init_regions(stru int i; if (damon_va_three_regions(t, regions)) { - pr_err("Failed to get three regions of target %lu\n", t->id); + pr_debug("Failed to get three regions of target %lu\n", t->id); return; } From patchwork Fri Jan 14 22:10:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714170 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7B494C433F5 for ; Fri, 14 Jan 2022 22:10:51 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 160356B01A1; Fri, 14 Jan 2022 17:10:51 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 10F556B01A3; Fri, 14 Jan 2022 17:10:51 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id EF2B06B01A4; Fri, 14 Jan 2022 17:10:50 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0079.hostedemail.com [216.40.44.79]) by kanga.kvack.org (Postfix) with ESMTP id DFE066B01A1 for ; Fri, 14 Jan 2022 17:10:50 -0500 (EST) Received: from smtpin17.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 9FD59182693C9 for ; Fri, 14 Jan 2022 22:10:50 +0000 (UTC) X-FDA: 79030288260.17.D94F7EB Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf18.hostedemail.com (Postfix) with ESMTP id 3C0751C0003 for ; Fri, 14 
Jan 2022 22:10:50 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 43322B8262F; Fri, 14 Jan 2022 22:10:49 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DF42CC36AE9; Fri, 14 Jan 2022 22:10:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198248; bh=Crjmo7tHv8SQwDHJ1SZziUwYsbzz7L7y0cmGnpncxfA=; h=Date:From:To:Subject:In-Reply-To:From; b=c/gJFmuEn1wdW4awCCkTj7naIguVxYfuYwz09LwswAMCBgATn6hyCgsGUpVJoxxzd hyRdTl3lPDrN3VMaY3KGglbk7bDuHl/USQfrA7wjKYLHUOsUgmPG7RaUUiFOMWPRio mW88gLPXmddS21tUTzZ0XzndjwMGC73Xr2QpzNtA= Date: Fri, 14 Jan 2022 14:10:47 -0800 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 145/146] mm/damon/vaddr: hide kernel pointer from damon_va_three_regions() failure log Message-ID: <20220114221047.5nwaCB_o6%akpm@linux-foundation.org> In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 3C0751C0003 X-Stat-Signature: c6tcmzqu3oimzwjq1t3aebtgc8ogzw6c Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="c/gJFmuE"; spf=pass (imf18.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1642198250-867326 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/vaddr: hide kernel pointer from damon_va_three_regions() failure log The failure log message for 'damon_va_three_regions()' prints the target id, 
which is a 'struct pid' pointer in this case. To avoid exposing the kernel pointer via the log, this commit makes the log use the index of the target in the context's targets list instead. Link: https://lkml.kernel.org/r/20211229131016.23641-4-sj@kernel.org Signed-off-by: SeongJae Park Signed-off-by: Andrew Morton --- mm/damon/vaddr.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) --- a/mm/damon/vaddr.c~mm-damon-vaddr-hide-kernel-pointer-from-damon_va_three_regions-failure-log +++ a/mm/damon/vaddr.c @@ -232,13 +232,19 @@ static int damon_va_three_regions(struct static void __damon_va_init_regions(struct damon_ctx *ctx, struct damon_target *t) { + struct damon_target *ti; struct damon_region *r; struct damon_addr_range regions[3]; unsigned long sz = 0, nr_pieces; - int i; + int i, tidx = 0; if (damon_va_three_regions(t, regions)) { - pr_debug("Failed to get three regions of target %lu\n", t->id); + damon_for_each_target(ti, ctx) { + if (ti == t) + break; + tidx++; + } + pr_debug("Failed to get three regions of %dth target\n", tidx); return; } From patchwork Fri Jan 14 22:10:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12714171 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 38B51C433FE for ; Fri, 14 Jan 2022 22:10:56 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C0E556B01A3; Fri, 14 Jan 2022 17:10:55 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id BBD5E6B01A5; Fri, 14 Jan 2022 17:10:55 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AAC416B01A6; Fri, 14 Jan 2022 17:10:55 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com
(relay037.a.hostedemail.com [64.99.140.37]) by kanga.kvack.org (Postfix) with ESMTP id 9C9D06B01A3 for ; Fri, 14 Jan 2022 17:10:55 -0500 (EST)
Received: from smtpin01.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 5B4AD2D8 for ; Fri, 14 Jan 2022 22:10:55 +0000 (UTC)
X-FDA: 79030288470.01.FBA57BC
Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf08.hostedemail.com (Postfix) with ESMTP id C950B160008 for ; Fri, 14 Jan 2022 22:10:54 +0000 (UTC)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 95460CE2384; Fri, 14 Jan 2022 22:10:52 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id D7957C36AE9; Fri, 14 Jan 2022 22:10:50 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1642198251; bh=ldj3WX+CtmHXViybAkVKj1BnaVhm96z7FxJvSl8RpVk=; h=Date:From:To:Subject:In-Reply-To:From; b=DWzsDz0IJtlWEFLEOr6KneXtbQqPjm4cOLDkmCVGy+aWEOAV5hAK89CFKfcj42ZuE PL39hjWgrDQbW/cmOpsnfu76T9Jns+t3JpGAD3kiSzJtiVQ9+DR2oz99J2WF9E4p1D EmCNk51m9+vRxK3rmvjfRXWl0Op6SI4ewnHu0z7o=
Date: Fri, 14 Jan 2022 14:10:50 -0800
From: Andrew Morton
To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org
Subject: [patch 146/146] mm/damon: hide kernel pointer from tracepoint event
Message-ID: <20220114221050.qcGC9jBnT%akpm@linux-foundation.org>
In-Reply-To: <20220114140222.6b14f0061194d3200000c52d@linux-foundation.org>
User-Agent: s-nail v14.8.16
X-Stat-Signature: u59tjxhxjkmfk3d3tsjg57jdf5xyg3i4
Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=DWzsDz0I; dmarc=none; spf=pass (imf08.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.73.55 as permitted
sender) smtp.mailfrom=akpm@linux-foundation.org
X-Rspamd-Server: rspam09
X-Rspamd-Queue-Id: C950B160008
X-HE-Tag: 1642198254-466343
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: SeongJae Park
Subject: mm/damon: hide kernel pointer from tracepoint event

DAMON's virtual address space monitoring primitive uses the 'struct pid *'
of the target process as its monitoring target id.  The kernel address is
exposed as-is to user space via the DAMON tracepoint, 'damon_aggregated'.

Though normally only privileged users are allowed to access the tracepoint,
it would be better to avoid unnecessarily exposing kernel pointers this
way.  Because the trace result only needs to distinguish each target, we
don't need to use the pointer as-is.

This commit makes the tracepoint use the index of the target in the
context's targets list as its id, to hide the kernel space address.

Link: https://lkml.kernel.org/r/20211229131016.23641-5-sj@kernel.org
Signed-off-by: SeongJae Park
Signed-off-by: Andrew Morton
---

 include/trace/events/damon.h |    8 ++++----
 mm/damon/core.c              |    4 +++-
 2 files changed, 7 insertions(+), 5 deletions(-)

--- a/include/trace/events/damon.h~mm-damon-hide-kernel-pointer-from-tracepoint-event
+++ a/include/trace/events/damon.h
@@ -11,10 +11,10 @@

 TRACE_EVENT(damon_aggregated,

-	TP_PROTO(struct damon_target *t, struct damon_region *r,
-		unsigned int nr_regions),
+	TP_PROTO(struct damon_target *t, unsigned int target_id,
+		struct damon_region *r, unsigned int nr_regions),

-	TP_ARGS(t, r, nr_regions),
+	TP_ARGS(t, target_id, r, nr_regions),

 	TP_STRUCT__entry(
 		__field(unsigned long, target_id)
@@ -26,7 +26,7 @@ TRACE_EVENT(damon_aggregated,
 	),

 	TP_fast_assign(
-		__entry->target_id = t->id;
+		__entry->target_id = target_id;
 		__entry->nr_regions = nr_regions;
 		__entry->start = r->ar.start;
 		__entry->end = r->ar.end;
--- a/mm/damon/core.c~mm-damon-hide-kernel-pointer-from-tracepoint-event
+++ a/mm/damon/core.c
@@ -514,15 +514,17 @@ static bool kdamond_aggregate_interval_p
 static void kdamond_reset_aggregated(struct damon_ctx *c)
 {
 	struct damon_target *t;
+	unsigned int ti = 0;	/* target's index */

 	damon_for_each_target(t, c) {
 		struct damon_region *r;

 		damon_for_each_region(r, t) {
-			trace_damon_aggregated(t, r, damon_nr_regions(t));
+			trace_damon_aggregated(t, ti, r, damon_nr_regions(t));
 			r->last_nr_accesses = r->nr_accesses;
 			r->nr_accesses = 0;
 		}
+		ti++;
 	}
 }

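Both DAMON patches above replace a pointer-valued target id with the target's position in the context's targets list. The user-space sketch below illustrates that idea outside the kernel. It is not kernel code: 'struct target', target_index() and format_failure() are hypothetical names invented for this illustration, and a plain singly linked list stands in for the kernel's list helpers and damon_for_each_target().

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for 'struct damon_target'; not the kernel's type. */
struct target {
	struct target *next;
};

/*
 * Return the zero-based position of 't' in the list starting at 'head',
 * or -1 if 't' is not on the list.
 */
static int target_index(const struct target *head, const struct target *t)
{
	const struct target *ti;
	int idx = 0;

	for (ti = head; ti; ti = ti->next) {
		if (ti == t)
			return idx;
		idx++;
	}
	return -1;
}

/* Format a log line that identifies the target by index, not by address. */
static int format_failure(char *buf, size_t len,
			  const struct target *head, const struct target *t)
{
	return snprintf(buf, len, "Failed to get three regions of %dth target",
			target_index(head, t));
}
```

One difference from the patch is deliberate: the sketch returns -1 when the target is not on the list, whereas the loop in __damon_va_init_regions() simply leaves tidx equal to the number of targets walked. Either way, the resulting message carries no kernel address.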