From patchwork Fri Nov 5 20:34:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605369 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0A16DC433F5 for ; Fri, 5 Nov 2021 20:34:40 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 893C760E09 for ; Fri, 5 Nov 2021 20:34:39 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 893C760E09 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 268616B006C; Fri, 5 Nov 2021 16:34:39 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 2179C940009; Fri, 5 Nov 2021 16:34:39 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 107B5940008; Fri, 5 Nov 2021 16:34:39 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0031.hostedemail.com [216.40.44.31]) by kanga.kvack.org (Postfix) with ESMTP id 033096B006C for ; Fri, 5 Nov 2021 16:34:39 -0400 (EDT) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id B881E8249980 for ; Fri, 5 Nov 2021 20:34:38 +0000 (UTC) X-FDA: 78776029836.09.EB825F8 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf05.hostedemail.com (Postfix) with ESMTP id D6382508FA69 for ; Fri, 5 Nov 2021 20:34:20 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 4EF57611CE; Fri, 5 Nov 2021 20:34:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144477; bh=y0hh9s32gA0j1+uUPy4IiJlbyl3LxBF0a4Mwnq0as0M=; h=Date:From:To:Subject:In-Reply-To:From; b=yfHQKq8tiTOqNty6NmO+RucY7JcgAQVoMtdgZqszPcOo/wFCX+vr9Yt2m0Q4yenXW igH0GlSOm1c4ZzMCmgPgGSeV7PCO4OKIGQdJTOC6ZBcbLUxsADiFHz3GHeb48CYxYJ iML5I7HzxD7YlMCLGD63LUBdEQf3kEWknc5+FJU4= Date: Fri, 05 Nov 2021 13:34:36 -0700 From: Andrew Morton To: akpm@linux-foundation.org, colin.king@canonical.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 001/262] scripts/spelling.txt: add more spellings to spelling.txt Message-ID: <20211105203436.hC7p6A2-9%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: D6382508FA69 X-Stat-Signature: 94qpcx7d8apmzzps9nshtfw1wrbp9x54 Authentication-Results: imf05.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=yfHQKq8t; dmarc=none; spf=pass (imf05.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144460-619314 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000306, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Colin Ian King Subject: scripts/spelling.txt: add more spellings to spelling.txt Some of the more common spelling mistakes and typos that I've found while fixing up spelling mistakes in the kernel in the past few months. 
Link: https://lkml.kernel.org/r/20210907072941.7033-1-colin.king@canonical.com Signed-off-by: Colin Ian King Signed-off-by: Andrew Morton --- scripts/spelling.txt | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) --- a/scripts/spelling.txt~scripts-spellingtxt-add-more-spellings-to-spellingtxt +++ a/scripts/spelling.txt @@ -178,6 +178,7 @@ assum||assume assumtpion||assumption asuming||assuming asycronous||asynchronous +asychronous||asynchronous asynchnous||asynchronous asynchromous||asynchronous asymetric||asymmetric @@ -241,6 +242,7 @@ beter||better betweeen||between bianries||binaries bitmast||bitmask +bitwiedh||bitwidth boardcast||broadcast borad||board boundry||boundary @@ -265,7 +267,10 @@ calucate||calculate calulate||calculate cancelation||cancellation cancle||cancel +cant||can't +cant'||can't canot||cannot +cann't||can't capabilites||capabilities capabilties||capabilities capabilty||capability @@ -501,6 +506,7 @@ disble||disable disgest||digest disired||desired dispalying||displaying +dissable||disable diplay||display directon||direction direcly||directly @@ -595,6 +601,7 @@ exceded||exceeded exceds||exceeds exceeed||exceed excellant||excellent +exchnage||exchange execeeded||exceeded execeeds||exceeds exeed||exceed @@ -938,6 +945,7 @@ migrateable||migratable milliseonds||milliseconds minium||minimum minimam||minimum +minimun||minimum miniumum||minimum minumum||minimum misalinged||misaligned @@ -956,6 +964,7 @@ mmnemonic||mnemonic mnay||many modfiy||modify modifer||modifier +modul||module modulues||modules momery||memory memomry||memory @@ -1154,6 +1163,7 @@ programable||programmable programers||programmers programm||program programms||programs +progres||progress progresss||progress prohibitted||prohibited prohibitting||prohibiting @@ -1328,6 +1338,7 @@ servive||service setts||sets settting||setting shapshot||snapshot +shoft||shift shotdown||shutdown shoud||should shouldnt||shouldn't @@ -1439,6 +1450,7 @@ syfs||sysfs symetric||symmetric synax||syntax 
synchonized||synchronized +synchronization||synchronization synchronuously||synchronously syncronize||synchronize syncronized||synchronized @@ -1521,6 +1533,7 @@ unexpexted||unexpected unfortunatelly||unfortunately unifiy||unify uniterrupted||uninterrupted +uninterruptable||uninterruptible unintialized||uninitialized unitialized||uninitialized unkmown||unknown @@ -1553,6 +1566,7 @@ unuseful||useless unvalid||invalid upate||update upsupported||unsupported +useable||usable usefule||useful usefull||useful usege||usage @@ -1574,6 +1588,7 @@ varient||variant vaule||value verbse||verbose veify||verify +verfication||verification veriosn||version verisons||versions verison||version @@ -1586,6 +1601,7 @@ visiters||visitors vitual||virtual vunerable||vulnerable wakeus||wakeups +was't||wasn't wathdog||watchdog wating||waiting wiat||wait From patchwork Fri Nov 5 20:34:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605647 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D2B05C433EF for ; Fri, 5 Nov 2021 20:42:07 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 864D860E09 for ; Fri, 5 Nov 2021 20:42:07 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 864D860E09 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 1A3BA94008A; Fri, 5 Nov 2021 16:42:06 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 025EE94007C; Fri, 5 Nov 2021 16:42:05 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org 
(Postfix, from userid 63042) id D039F94008A; Fri, 5 Nov 2021 16:42:05 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0087.hostedemail.com [216.40.44.87]) by kanga.kvack.org (Postfix) with ESMTP id 688B294007C for ; Fri, 5 Nov 2021 16:42:05 -0400 (EDT) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 205B78249980 for ; Fri, 5 Nov 2021 20:42:05 +0000 (UTC) X-FDA: 78776048610.18.BDABBAB Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf11.hostedemail.com (Postfix) with ESMTP id 9EDE3F0000B9 for ; Fri, 5 Nov 2021 20:42:04 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 367C061212; Fri, 5 Nov 2021 20:34:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144480; bh=YmJN9k+AvaTAJe4caCznDxTBuY6tu4WmKoSVS/t4vgQ=; h=Date:From:To:Subject:In-Reply-To:From; b=un60l7yN+1dnfo+YkSDfwBnaT1fCOEGvpe5rkC/o9ExTd4HVTlAlYkEZ++ODKmYN7 nBn3EKSez9jC13TZPefO39vW31qR1jfq5pkMA/ZcNrJUsrifuuab6jjHyuSSqlnM7g Eq9u6/tX+aLtsC64lARw/jBB6I8EOFUggaYeC+ZI= Date: Fri, 05 Nov 2021 13:34:39 -0700 From: Andrew Morton To: akpm@linux-foundation.org, colin.king@canonical.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, sven@narfation.org, torvalds@linux-foundation.org Subject: [patch 002/262] scripts/spelling.txt: fix "mistake" version of "synchronization" Message-ID: <20211105203439.Tb9TpW8XV%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=un60l7yN; dmarc=none; spf=pass (imf11.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 9EDE3F0000B9 X-Stat-Signature: 
oycqachcdrhw8esjwchgnce5wkj46onr X-HE-Tag: 1636144924-443237 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Sven Eckelmann Subject: scripts/spelling.txt: fix "mistake" version of "synchronization" If both the "mistake" version and the "correction" version are the same, a warning message is created by checkpatch which is impossible to fix. But it was noticed that Colin Ian King created a commit e6c0a0889b80 ("ALSA: aloop: Fix spelling mistake "synchronization" -> "synchronization"") which suggests that this spelling mistake was fixed by replacing the word "synchronization" with itself. But the actual diff shows that the mistake in the code was "sychronization". It is rather likely that the "mistake" in spelling.txt should have been the latter. Link: https://lkml.kernel.org/r/20210926065529.6880-1-sven@narfation.org Fixes: 2e74c9433ba8 ("scripts/spelling.txt: add more spellings to spelling.txt") Signed-off-by: Sven Eckelmann Reviewed-by: Colin Ian King Signed-off-by: Andrew Morton --- scripts/spelling.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/scripts/spelling.txt~scripts-spellingtxt-fix-mistake-version-of-synchronization +++ a/scripts/spelling.txt @@ -1450,7 +1450,7 @@ syfs||sysfs symetric||symmetric synax||syntax synchonized||synchronized -synchronization||synchronization +sychronization||synchronization synchronuously||synchronously syncronize||synchronize syncronized||synchronized From patchwork Fri Nov 5 20:34:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605657 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 28535C433F5 for ; Fri, 5 Nov 2021 
20:42:23 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D401861212 for ; Fri, 5 Nov 2021 20:42:22 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org D401861212 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 8A0B794008F; Fri, 5 Nov 2021 16:42:21 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 84FB894007C; Fri, 5 Nov 2021 16:42:21 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 73CF494008F; Fri, 5 Nov 2021 16:42:21 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0223.hostedemail.com [216.40.44.223]) by kanga.kvack.org (Postfix) with ESMTP id 61D1B94007C for ; Fri, 5 Nov 2021 16:42:21 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 27AF4184B452D for ; Fri, 5 Nov 2021 20:42:21 +0000 (UTC) X-FDA: 78776049282.29.1CE59A9 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf04.hostedemail.com (Postfix) with ESMTP id 4823F5000300 for ; Fri, 5 Nov 2021 20:42:12 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 225636120D; Fri, 5 Nov 2021 20:34:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144483; bh=ZDnk8aUALmlN+b8RX2OiAAKavfx1qgWqQH9RlcuqhZw=; h=Date:From:To:Subject:In-Reply-To:From; b=Ad+hjpy6NBGqaPr7it7DbnokfdqnNPCOrnpXn+Mp44M26BpF6Ut9iZZM0m9MngtJ9 qWIBBGJelbr9Nh81lZNY4RYPbbdUrei6IgVqMiHP+K5n3c+/aRY2K1ee7BsQRoAenf 156FPbnN/A+N6bvx7KPBo+5L8yNxvCXYX1LYgv74= Date: Fri, 05 Nov 2021 13:34:42 -0700 From: Andrew Morton To: akpm@linux-foundation.org, bp@suse.de, linux-mm@kvack.org, 
maz@misterjones.org, mm-commits@vger.kernel.org, rabin@rab.in, torvalds@linux-foundation.org, weidonghui@allwinnertech.com, will@kernel.org Subject: [patch 003/262] scripts/decodecode: fix faulting instruction no print when opps.file is DOS format Message-ID: <20211105203442.uBhwndE3D%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf04.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Ad+hjpy6; dmarc=none; spf=pass (imf04.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 4823F5000300 X-Stat-Signature: ki54bkqamue99c736gapnzt1qfthzer9 X-HE-Tag: 1636144932-990605 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: weidonghui Subject: scripts/decodecode: fix faulting instruction no print when opps.file is DOS format If oops.file is in DOS format (CRLF line endings), the faulting instruction cannot be printed: / # ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- / # ./scripts/decodecode < oops.file [ 0.734345] Code: d0002881 912f9c21 94067e68 d2800001 (b900003f) aarch64-linux-gnu-strip: '/tmp/tmp.5Y9eybnnSi.o': No such file aarch64-linux-gnu-objdump: '/tmp/tmp.5Y9eybnnSi.o': No such file All code ======== 0: d0002881 adrp x1, 0x512000 4: 912f9c21 add x1, x1, #0xbe7 8: 94067e68 bl 0x19f9a8 c: d2800001 mov x1, #0x0 // #0 10: b900003f str wzr, [x1] Code starting with the faulting instruction =========================================== Background: The compilation environment is Ubuntu, and the test environment is Windows. Most logs are generated in the Windows environment. As a result, CR (carriage return) characters inevitably appear in the logs, which breaks decodecode in the Ubuntu environment. 
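For illustration only (not part of the patch): a minimal shell sketch of why a trailing CR breaks the conversion of the Code: line. The opcode words are taken from the oops above; the names dos_line, to_words, broken and fixed are hypothetical. The sketch removes the CR with tr(1) instead of the sed 's/\r//' expression used in the patch, on the assumption that the \r escape may not be recognized by every sed implementation.

```shell
#!/bin/sh
# Hypothetical sample: the opcode words from the oops above, ending in a
# carriage return as they would after passing through a DOS-format log.
dos_line=$(printf 'd0002881 912f9c21 94067e68 d2800001 (b900003f)\r')

# The substitutions scripts/decodecode applied before the patch:
# drop the (...) or <...> marker, then join the words with ",0x".
to_words() {
	sed -e 's/ [<(]/ /;s/[>)] / /;s/ /,0x/g; s/[>)]$//'
}

# Without CR removal, the final "[>)]$" cannot match because the CR,
# not ")", is the last character, so ")" plus CR survive in the output.
broken=$(printf '%s\n' "$dos_line" | to_words)

# With the CR removed first (tr here; the patch uses sed 's/\r//').
fixed=$(printf '%s\n' "$dos_line" | tr -d '\r' | to_words)

printf 'fixed: %s\n' "$fixed"
```

In this sketch, broken still ends in ")" followed by a CR, while fixed is a clean comma-separated list of 0x-prefixed words suitable for an assembler directive.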
The repaired effect is as follows: / # ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- / # ./scripts/decodecode < oops.file [ 0.734345] Code: d0002881 912f9c21 94067e68 d2800001 (b900003f) All code ======== 0: d0002881 adrp x1, 0x512000 4: 912f9c21 add x1, x1, #0xbe7 8: 94067e68 bl 0x19f9a8 c: d2800001 mov x1, #0x0 // #0 10:* b900003f str wzr, [x1] <-- trapping instruction Code starting with the faulting instruction =========================================== 0: b900003f str wzr, [x1] Link: https://lkml.kernel.org/r/20211008064712.926-1-weidonghui@allwinnertech.com Signed-off-by: weidonghui Acked-by: Borislav Petkov Cc: Marc Zyngier Cc: Will Deacon Cc: Rabin Vincent Signed-off-by: Andrew Morton --- scripts/decodecode | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/scripts/decodecode~scripts-decodecode-fix-faulting-instruction-no-print-when-oppsfile-is-dos-format +++ a/scripts/decodecode @@ -126,7 +126,7 @@ if [ $marker -ne 0 ]; then fi echo Code starting with the faulting instruction > $T.aa echo =========================================== >> $T.aa -code=`echo $code | sed -e 's/ [<(]/ /;s/[>)] / /;s/ /,0x/g; s/[>)]$//'` +code=`echo $code | sed -e 's/\r//;s/ [<(]/ /;s/[>)] / /;s/ /,0x/g; s/[>)]$//'` echo -n " .$type 0x" > $T.s echo $code >> $T.s disas $T 0 From patchwork Fri Nov 5 20:34:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605371 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EADB0C433F5 for ; Fri, 5 Nov 2021 20:34:48 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 859BD61244 for ; Fri, 5 Nov 2021 20:34:48 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 859BD61244 
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 2DE606B0072; Fri, 5 Nov 2021 16:34:48 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 28C67940009; Fri, 5 Nov 2021 16:34:48 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1A443940008; Fri, 5 Nov 2021 16:34:48 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0085.hostedemail.com [216.40.44.85]) by kanga.kvack.org (Postfix) with ESMTP id 0E02F6B0072 for ; Fri, 5 Nov 2021 16:34:48 -0400 (EDT) Received: from smtpin35.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id CAD301849A3B5 for ; Fri, 5 Nov 2021 20:34:47 +0000 (UTC) X-FDA: 78776030214.35.29263A5 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf27.hostedemail.com (Postfix) with ESMTP id 746F070000AE for ; Fri, 5 Nov 2021 20:34:47 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 457FA61216; Fri, 5 Nov 2021 20:34:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144486; bh=OJn0JxWHNIGTV3bVVu6WatIl/Ul4OL6osGpyinM+A8U=; h=Date:From:To:Subject:In-Reply-To:From; b=Hz+9DuYxzzONEBqyqV+r6mBOt9B2RmLMSvcVD1+Za+rYbNe7QY6kEzoLDEA4QbjNQ aOf42tnYSULFlShAIUBx1K26m8+jkZFB8HMG4gY93Y8ZMp0Vm32mzuT/YKcoe7ZujA NI369sokqMh2lpGEIcDHSQGH9smqfhiC2QItrSi0= Date: Fri, 05 Nov 2021 13:34:45 -0700 From: Andrew Morton To: akpm@linux-foundation.org, cymi20@fudan.edu.cn, gechangwei@live.cn, ghe@suse.com, jlbec@evilplan.org, joseph.qi@linux.alibaba.com, junxiao.bi@oracle.com, linux-mm@kvack.org, mark@fasheh.com, mm-commits@vger.kernel.org, piaojun@huawei.com, tanxin.ctf@gmail.com, torvalds@linux-foundation.org, wen.gang.wang@oracle.com, 
xiyuyang19@fudan.edu.cn Subject: [patch 004/262] ocfs2: fix handle refcount leak in two exception handling paths Message-ID: <20211105203445.DaSjbdbf2%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 746F070000AE X-Stat-Signature: 386ipzyig56pga3aw8w6eoksh14q88xj Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Hz+9DuYx; dmarc=none; spf=pass (imf27.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144487-745624 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Chenyuan Mi Subject: ocfs2: fix handle refcount leak in two exception handling paths The reference counting issue happens in two exception handling paths of ocfs2_replay_truncate_records(). When executing these two exception handling paths, the function forgets to decrease the refcount of handle increased by ocfs2_start_trans(), causing a refcount leak. Fix this issue by using ocfs2_commit_trans() to decrease the refcount of handle in two handling paths. 
Link: https://lkml.kernel.org/r/20210908102055.10168-1-cymi20@fudan.edu.cn Signed-off-by: Chenyuan Mi Signed-off-by: Xiyu Yang Signed-off-by: Xin Tan Reviewed-by: Joseph Qi Cc: Wengang Wang Cc: Mark Fasheh Cc: Joel Becker Cc: Junxiao Bi Cc: Changwei Ge Cc: Gang He Cc: Jun Piao Signed-off-by: Andrew Morton --- fs/ocfs2/alloc.c | 2 ++ 1 file changed, 2 insertions(+) --- a/fs/ocfs2/alloc.c~ocfs2-fix-handle-refcount-leak-in-two-exception-handling-paths +++ a/fs/ocfs2/alloc.c @@ -5940,6 +5940,7 @@ static int ocfs2_replay_truncate_records status = ocfs2_journal_access_di(handle, INODE_CACHE(tl_inode), tl_bh, OCFS2_JOURNAL_ACCESS_WRITE); if (status < 0) { + ocfs2_commit_trans(osb, handle); mlog_errno(status); goto bail; } @@ -5964,6 +5965,7 @@ static int ocfs2_replay_truncate_records data_alloc_bh, start_blk, num_clusters); if (status < 0) { + ocfs2_commit_trans(osb, handle); mlog_errno(status); goto bail; } From patchwork Fri Nov 5 20:34:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605377 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1923CC4332F for ; Fri, 5 Nov 2021 20:35:00 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C140561263 for ; Fri, 5 Nov 2021 20:34:59 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org C140561263 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 37CCA940009; Fri, 5 Nov 2021 16:34:59 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 2DC596B0078; Fri, 5 Nov 2021 16:34:59 -0400 (EDT) 
X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1C9C3940009; Fri, 5 Nov 2021 16:34:59 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0118.hostedemail.com [216.40.44.118]) by kanga.kvack.org (Postfix) with ESMTP id 01D776B0075 for ; Fri, 5 Nov 2021 16:34:59 -0400 (EDT) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id A052E184B3AAE for ; Fri, 5 Nov 2021 20:34:58 +0000 (UTC) X-FDA: 78776030676.21.5626781 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf21.hostedemail.com (Postfix) with ESMTP id 5D50CD036A4D for ; Fri, 5 Nov 2021 20:34:53 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id ADB6361244; Fri, 5 Nov 2021 20:34:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144490; bh=CHabFx2ki0R5M6Lp1Muc1RwNq5BS3257ezZr562iHcw=; h=Date:From:To:Subject:In-Reply-To:From; b=KwBE9X6k3dGJZnXYARk4MsUk1JkmYxnBaXFTWmaM35x13dLY0Rk6YK494vmRpy4Bc azOiRNgM9bi3dHSxdoV8xmR9O4f8dhAO2CwlIB4B3Z56hquxzFMaWWdCcJunCZTR8a bB251X+JKdRc5Qy9QMwlvQ/aGd/5Jbp9ibmpwWzc= Date: Fri, 05 Nov 2021 13:34:49 -0700 From: Andrew Morton To: akpm@linux-foundation.org, gechangwei@live.cn, ghe@suse.com, jlbec@evilplan.org, joseph.qi@linux.alibaba.com, junxiao.bi@oracle.com, linux-mm@kvack.org, mark@fasheh.com, mm-commits@vger.kernel.org, piaojun@huawei.com, torvalds@linux-foundation.org, vvidic@valentin-vidic.from.hr Subject: [patch 005/262] ocfs2: cleanup journal init and shutdown Message-ID: <20211105203449.XAP-gt26N%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=KwBE9X6k; dmarc=none; spf=pass (imf21.hostedemail.com: domain of 
akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 5D50CD036A4D X-Stat-Signature: 9r11f3wkomtgxfug71ocfaeqctiw5ypf X-HE-Tag: 1636144493-353756 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Valentin Vidic Subject: ocfs2: cleanup journal init and shutdown Allocate and free struct ocfs2_journal in ocfs2_journal_init and ocfs2_journal_shutdown. Init and release of system inodes reference the journal, so reorder the calls to make sure they work correctly. Link: https://lkml.kernel.org/r/20211009145006.3478-1-vvidic@valentin-vidic.from.hr Signed-off-by: Valentin Vidic Reviewed-by: Joseph Qi Cc: Mark Fasheh Cc: Joel Becker Cc: Junxiao Bi Cc: Changwei Ge Cc: Gang He Cc: Jun Piao Signed-off-by: Andrew Morton --- fs/ocfs2/inode.c | 4 ++-- fs/ocfs2/journal.c | 28 ++++++++++++++++++++++------ fs/ocfs2/journal.h | 3 +-- fs/ocfs2/super.c | 40 +++------------------------------------- 4 files changed, 28 insertions(+), 47 deletions(-) --- a/fs/ocfs2/inode.c~ocfs2-cleanup-journal-init-and-shutdown +++ a/fs/ocfs2/inode.c @@ -125,7 +125,6 @@ struct inode *ocfs2_iget(struct ocfs2_su struct inode *inode = NULL; struct super_block *sb = osb->sb; struct ocfs2_find_inode_args args; - journal_t *journal = OCFS2_SB(sb)->journal->j_journal; trace_ocfs2_iget_begin((unsigned long long)blkno, flags, sysfile_type); @@ -172,10 +171,11 @@ struct inode *ocfs2_iget(struct ocfs2_su * part of the transaction - the inode could have been reclaimed and * now it is reread from disk. 
*/ - if (journal) { + if (osb->journal) { transaction_t *transaction; tid_t tid; struct ocfs2_inode_info *oi = OCFS2_I(inode); + journal_t *journal = osb->journal->j_journal; read_lock(&journal->j_state_lock); if (journal->j_running_transaction) --- a/fs/ocfs2/journal.c~ocfs2-cleanup-journal-init-and-shutdown +++ a/fs/ocfs2/journal.c @@ -810,19 +810,34 @@ void ocfs2_set_journal_params(struct ocf write_unlock(&journal->j_state_lock); } -int ocfs2_journal_init(struct ocfs2_journal *journal, int *dirty) +int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty) { int status = -1; struct inode *inode = NULL; /* the journal inode */ journal_t *j_journal = NULL; + struct ocfs2_journal *journal = NULL; struct ocfs2_dinode *di = NULL; struct buffer_head *bh = NULL; - struct ocfs2_super *osb; int inode_lock = 0; - BUG_ON(!journal); - - osb = journal->j_osb; + /* initialize our journal structure */ + journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL); + if (!journal) { + mlog(ML_ERROR, "unable to alloc journal\n"); + status = -ENOMEM; + goto done; + } + osb->journal = journal; + journal->j_osb = osb; + + atomic_set(&journal->j_num_trans, 0); + init_rwsem(&journal->j_trans_barrier); + init_waitqueue_head(&journal->j_checkpointed); + spin_lock_init(&journal->j_lock); + journal->j_trans_id = 1UL; + INIT_LIST_HEAD(&journal->j_la_cleanups); + INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery); + journal->j_state = OCFS2_JOURNAL_FREE; /* already have the inode for our journal */ inode = ocfs2_get_system_file_inode(osb, JOURNAL_SYSTEM_INODE, @@ -1028,9 +1043,10 @@ void ocfs2_journal_shutdown(struct ocfs2 journal->j_state = OCFS2_JOURNAL_FREE; -// up_write(&journal->j_trans_barrier); done: iput(inode); + kfree(journal); + osb->journal = NULL; } static void ocfs2_clear_journal_error(struct super_block *sb, --- a/fs/ocfs2/journal.h~ocfs2-cleanup-journal-init-and-shutdown +++ a/fs/ocfs2/journal.h @@ -167,8 +167,7 @@ int ocfs2_compute_replay_slots(struct oc * 
ocfs2_start_checkpoint - Kick the commit thread to do a checkpoint. */ void ocfs2_set_journal_params(struct ocfs2_super *osb); -int ocfs2_journal_init(struct ocfs2_journal *journal, - int *dirty); +int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty); void ocfs2_journal_shutdown(struct ocfs2_super *osb); int ocfs2_journal_wipe(struct ocfs2_journal *journal, int full); --- a/fs/ocfs2/super.c~ocfs2-cleanup-journal-init-and-shutdown +++ a/fs/ocfs2/super.c @@ -1894,8 +1894,6 @@ static void ocfs2_dismount_volume(struct /* This will disable recovery and flush any recovery work. */ ocfs2_recovery_exit(osb); - ocfs2_journal_shutdown(osb); - ocfs2_sync_blockdev(sb); ocfs2_purge_refcount_trees(osb); @@ -1918,6 +1916,8 @@ static void ocfs2_dismount_volume(struct ocfs2_release_system_inodes(osb); + ocfs2_journal_shutdown(osb); + /* * If we're dismounting due to mount error, mount.ocfs2 will clean * up heartbeat. If we're a local mount, there is no heartbeat. @@ -2016,7 +2016,6 @@ static int ocfs2_initialize_super(struct int i, cbits, bbits; struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data; struct inode *inode = NULL; - struct ocfs2_journal *journal; struct ocfs2_super *osb; u64 total_blocks; @@ -2197,33 +2196,6 @@ static int ocfs2_initialize_super(struct get_random_bytes(&osb->s_next_generation, sizeof(u32)); - /* FIXME - * This should be done in ocfs2_journal_init(), but unknown - * ordering issues will cause the filesystem to crash. - * If anyone wants to figure out what part of the code - * refers to osb->journal before ocfs2_journal_init() is run, - * be my guest. 
- */ - /* initialize our journal structure */ - - journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL); - if (!journal) { - mlog(ML_ERROR, "unable to alloc journal\n"); - status = -ENOMEM; - goto bail; - } - osb->journal = journal; - journal->j_osb = osb; - - atomic_set(&journal->j_num_trans, 0); - init_rwsem(&journal->j_trans_barrier); - init_waitqueue_head(&journal->j_checkpointed); - spin_lock_init(&journal->j_lock); - journal->j_trans_id = (unsigned long) 1; - INIT_LIST_HEAD(&journal->j_la_cleanups); - INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery); - journal->j_state = OCFS2_JOURNAL_FREE; - INIT_WORK(&osb->dquot_drop_work, ocfs2_drop_dquot_refs); init_llist_head(&osb->dquot_drop_list); @@ -2404,7 +2376,7 @@ static int ocfs2_check_volume(struct ocf * ourselves. */ /* Init our journal object. */ - status = ocfs2_journal_init(osb->journal, &dirty); + status = ocfs2_journal_init(osb, &dirty); if (status < 0) { mlog(ML_ERROR, "Could not initialize journal!\n"); goto finally; @@ -2513,12 +2485,6 @@ static void ocfs2_delete_osb(struct ocfs kfree(osb->osb_orphan_wipes); kfree(osb->slot_recovery_generations); - /* FIXME - * This belongs in journal shutdown, but because we have to - * allocate osb->journal at the start of ocfs2_initialize_osb(), - * we free it here. 
- */ - kfree(osb->journal); kfree(osb->local_alloc_copy); kfree(osb->uuid_str); kfree(osb->vol_label); From patchwork Fri Nov 5 20:34:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605373 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7C114C43217 for ; Fri, 5 Nov 2021 20:34:55 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 21B3A6127A for ; Fri, 5 Nov 2021 20:34:55 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 21B3A6127A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id B8FCA6B0073; Fri, 5 Nov 2021 16:34:54 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B3EE8940008; Fri, 5 Nov 2021 16:34:54 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A06996B0075; Fri, 5 Nov 2021 16:34:54 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0217.hostedemail.com [216.40.44.217]) by kanga.kvack.org (Postfix) with ESMTP id 8D65E6B0073 for ; Fri, 5 Nov 2021 16:34:54 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 4534A184AAA81 for ; Fri, 5 Nov 2021 20:34:54 +0000 (UTC) X-FDA: 78776030508.22.27231E7 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf01.hostedemail.com (Postfix) with ESMTP id 3792B508FA72 for ; Fri, 5 Nov 2021 20:34:42 +0000 (UTC) Received: by mail.kernel.org (Postfix) with 
ESMTPSA id DD8DB611C0; Fri, 5 Nov 2021 20:34:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144493; bh=5thybw1wSxumef2TYW475T3X2CO6HWfi9biq1yI7RbY=; h=Date:From:To:Subject:In-Reply-To:From; b=KcO+dbqGdPTjAngV7jDh7QHTAaH3ERXuBl7dQr7PitnCK4FBB48DhniyYI0W3/SUO Fbd2u5ToKcroXbi/fXApWUgFTpiYRDA4ir5gLDIJhnhUA+2v+hVDK5xrhRzm6JOU4T 9jCRToQFjL8BLQtR/+8SkRmMrIFDN2b4hmIVOHx0= Date: Fri, 05 Nov 2021 13:34:52 -0700 From: Andrew Morton To: akpm@linux-foundation.org, colin.king@canonical.com, gechangwei@live.cn, ghe@suse.com, jlbec@evilplan.org, joseph.qi@linux.alibaba.com, junxiao.bi@oracle.com, linux-mm@kvack.org, mark@fasheh.com, mm-commits@vger.kernel.org, piaojun@huawei.com, torvalds@linux-foundation.org Subject: [patch 006/262] ocfs2/dlm: remove redundant assignment of variable ret Message-ID: <20211105203452.W9C94pmyD%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=KcO+dbqG; dmarc=none; spf=pass (imf01.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 3792B508FA72 X-Stat-Signature: yo1s6i9ib8hmf6e97gg1knhunyxqao4w X-HE-Tag: 1636144482-24242 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000015, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Colin Ian King Subject: ocfs2/dlm: remove redundant assignment of variable ret The variable ret is being assigned a value that is never read, it is updated later on with a different value. The assignment is redundant and can be removed. 
Addresses-Coverity: ("Unused value") Link: https://lkml.kernel.org/r/20211007233452.30815-1-colin.king@canonical.com Signed-off-by: Colin Ian King Reviewed-by: Andrew Morton Reviewed-by: Joseph Qi Cc: Mark Fasheh Cc: Joel Becker Cc: Junxiao Bi Cc: Changwei Ge Cc: Gang He Cc: Jun Piao Signed-off-by: Andrew Morton --- fs/ocfs2/dlm/dlmrecovery.c | 1 - 1 file changed, 1 deletion(-) --- a/fs/ocfs2/dlm/dlmrecovery.c~ocfs2-dlm-remove-redundant-assignment-of-variable-ret +++ a/fs/ocfs2/dlm/dlmrecovery.c @@ -2698,7 +2698,6 @@ static int dlm_send_begin_reco_message(s continue; } retry: - ret = -EINVAL; mlog(0, "attempting to send begin reco msg to %d\n", nodenum); ret = o2net_send_message(DLM_BEGIN_RECO_MSG, dlm->key, From patchwork Fri Nov 5 20:34:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605375 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CCDBEC433F5 for ; Fri, 5 Nov 2021 20:34:58 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 813E861262 for ; Fri, 5 Nov 2021 20:34:58 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 813E861262 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 2403E940008; Fri, 5 Nov 2021 16:34:58 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 1F1E96B0075; Fri, 5 Nov 2021 16:34:58 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 09199940008; Fri, 5 Nov 2021 16:34:58 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: 
from forelay.hostedemail.com (smtprelay0025.hostedemail.com [216.40.44.25]) by kanga.kvack.org (Postfix) with ESMTP id ED85D6B0074 for ; Fri, 5 Nov 2021 16:34:57 -0400 (EDT) Received: from smtpin03.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 9E15476B6D for ; Fri, 5 Nov 2021 20:34:57 +0000 (UTC) X-FDA: 78776030634.03.6F1760A Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf21.hostedemail.com (Postfix) with ESMTP id 5163ED036A62 for ; Fri, 5 Nov 2021 20:34:52 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 0121661279; Fri, 5 Nov 2021 20:34:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144496; bh=OXcNBKV2broFZ2R+f4CXMqo8D5nn4AOws8wB04GUjkE=; h=Date:From:To:Subject:In-Reply-To:From; b=aq5TiriTSiQUAQxt/aMCjx08b8hUq28tiYF/q/DYCygh9+Ca+LgEiOg3+c1v4t9CV y0ROTc0GcO+qIJirHv1uhsVRREL04uN53ZEo/BRbtiRU2CMh2jxnNG8lXpmsL5+km9 VYwpYkaNXyMMMJJUD+5M+DPcJHrCWIcp6ipxtqTg= Date: Fri, 05 Nov 2021 13:34:55 -0700 From: Andrew Morton To: akpm@linux-foundation.org, gechangwei@live.cn, ghe@suse.com, jack@suse.cz, jlbec@evilplan.org, joseph.qi@linux.alibaba.com, junxiao.bi@oracle.com, linux-mm@kvack.org, mark@fasheh.com, mm-commits@vger.kernel.org, piaojun@huawei.com, stable@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 007/262] ocfs2: fix data corruption on truncate Message-ID: <20211105203455.hOZux5v3f%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=aq5TiriT; dmarc=none; spf=pass (imf21.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 5163ED036A62 X-Stat-Signature: 
bsz6g1syjq9aaccm5cobtbzu8ruysnbj X-HE-Tag: 1636144492-843044 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Jan Kara Subject: ocfs2: fix data corruption on truncate Patch series "ocfs2: Truncate data corruption fix". As further testing has shown, commit 5314454ea3f ("ocfs2: fix data corruption after conversion from inline format") didn't fix all the data corruption issues the customer started observing after 6dbf7bb55598 ("fs: Don't invalidate page buffers in block_write_full_page()"). This time I have tracked them down to two bugs in ocfs2 truncation code. One bug (truncating page cache before clearing tail cluster and setting i_size) could cause data corruption even before 6dbf7bb55598, but before that commit it needed a race with a page fault; after 6dbf7bb55598 it started to be pretty deterministic. Another bug (zeroing pages beyond old i_size) used to be a harmless inefficiency before commit 6dbf7bb55598. But after commit 6dbf7bb55598, in combination with the first bug, it resulted in deterministic data corruption. Although fixing only the first problem is needed to stop data corruption, I've fixed both issues to make the code more robust. This patch (of 2): ocfs2_truncate_file() did unmap and invalidate page cache pages before zeroing partial tail cluster and setting i_size. Thus some pages could be left (and likely were left if the cluster zeroing happened) in the page cache beyond i_size after truncate finished, letting the user possibly see stale data once the file was extended again. Also the tail cluster zeroing was not guaranteed to finish before truncate finished, causing possible stale data exposure. The problem started to be particularly easy to hit after commit 6dbf7bb55598 "fs: Don't invalidate page buffers in block_write_full_page()" stopped invalidation of pages beyond i_size from page writeback path.
Fix these problems by unmapping and invalidating pages in the page cache after the i_size is reduced and tail cluster is zeroed out. Link: https://lkml.kernel.org/r/20211025150008.29002-1-jack@suse.cz Link: https://lkml.kernel.org/r/20211025151332.11301-1-jack@suse.cz Fixes: ccd979bdbce9 ("[PATCH] OCFS2: The Second Oracle Cluster Filesystem") Signed-off-by: Jan Kara Reviewed-by: Joseph Qi Cc: Mark Fasheh Cc: Joel Becker Cc: Junxiao Bi Cc: Changwei Ge Cc: Gang He Cc: Jun Piao Cc: Signed-off-by: Andrew Morton --- fs/ocfs2/file.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) --- a/fs/ocfs2/file.c~ocfs2-fix-data-corruption-on-truncate +++ a/fs/ocfs2/file.c @@ -476,10 +476,11 @@ int ocfs2_truncate_file(struct inode *in * greater than page size, so we have to truncate them * anyway. */ - unmap_mapping_range(inode->i_mapping, new_i_size + PAGE_SIZE - 1, 0, 1); - truncate_inode_pages(inode->i_mapping, new_i_size); if (OCFS2_I(inode)->ip_dyn_features & OCFS2_INLINE_DATA_FL) { + unmap_mapping_range(inode->i_mapping, + new_i_size + PAGE_SIZE - 1, 0, 1); + truncate_inode_pages(inode->i_mapping, new_i_size); status = ocfs2_truncate_inline(inode, di_bh, new_i_size, i_size_read(inode), 1); if (status) @@ -498,6 +499,9 @@ int ocfs2_truncate_file(struct inode *in goto bail_unlock_sem; } + unmap_mapping_range(inode->i_mapping, new_i_size + PAGE_SIZE - 1, 0, 1); + truncate_inode_pages(inode->i_mapping, new_i_size); + status = ocfs2_commit_truncate(osb, inode, di_bh); if (status < 0) { mlog_errno(status); From patchwork Fri Nov 5 20:34:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605379 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0DBCEC433F5 for ; Fri, 5 Nov 2021 20:35:02 +0000 
(UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B63B2611C0 for ; Fri, 5 Nov 2021 20:35:01 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org B63B2611C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id EC8D094000A; Fri, 5 Nov 2021 16:35:00 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E26026B007B; Fri, 5 Nov 2021 16:35:00 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CEDBE94000A; Fri, 5 Nov 2021 16:35:00 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0020.hostedemail.com [216.40.44.20]) by kanga.kvack.org (Postfix) with ESMTP id C09396B0078 for ; Fri, 5 Nov 2021 16:35:00 -0400 (EDT) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 87F8A8249980 for ; Fri, 5 Nov 2021 20:35:00 +0000 (UTC) X-FDA: 78776030760.13.1F7E718 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf14.hostedemail.com (Postfix) with ESMTP id EBDC96001997 for ; Fri, 5 Nov 2021 20:35:00 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 29F0F6125F; Fri, 5 Nov 2021 20:34:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144499; bh=7Y5od5AlGs+vMq5IUdTUsBVDcqUkL6+i0coxbAtZ45k=; h=Date:From:To:Subject:In-Reply-To:From; b=CrcFNJVwDy616GzA3J1OZ96xu3SBkrLpm4IQc3Zn1CsuL+Xc830X4phXXB3qiWVIw oqOwLZxTPGZ0OAqotAd76O/Nz/E1ec4kMirVuKzp+deY/zm1FqR6DEE/uhSlM89kyN Tdpy+iwd/PHHN7ibvrHSjxrBJ94K8tvmgK4oR8Yo= Date: Fri, 05 Nov 2021 13:34:58 -0700 From: Andrew Morton To: akpm@linux-foundation.org, gechangwei@live.cn, ghe@suse.com, jack@suse.cz, jlbec@evilplan.org, 
joseph.qi@linux.alibaba.com, junxiao.bi@oracle.com, linux-mm@kvack.org, mark@fasheh.com, mm-commits@vger.kernel.org, piaojun@huawei.com, torvalds@linux-foundation.org Subject: [patch 008/262] ocfs2: do not zero pages beyond i_size Message-ID: <20211105203458.Q4tA6BQfK%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=CrcFNJVw; dmarc=none; spf=pass (imf14.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: EBDC96001997 X-Stat-Signature: c7x7negdouzmzjkkq714i1tnbrynm9mb X-HE-Tag: 1636144500-704404 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Jan Kara Subject: ocfs2: do not zero pages beyond i_size ocfs2_zero_range_for_truncate() can try to zero pages beyond current inode size despite the fact that underlying blocks should be already zeroed out and writeback will skip writing such pages anyway. Avoid the pointless work. Link: https://lkml.kernel.org/r/20211025151332.11301-2-jack@suse.cz Signed-off-by: Jan Kara Reviewed-by: Joseph Qi Cc: Changwei Ge Cc: Gang He Cc: Joel Becker Cc: Jun Piao Cc: Junxiao Bi Cc: Mark Fasheh Signed-off-by: Andrew Morton --- fs/ocfs2/alloc.c | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-) --- a/fs/ocfs2/alloc.c~ocfs2-do-not-zero-pages-beyond-i_size +++ a/fs/ocfs2/alloc.c @@ -6923,13 +6923,12 @@ static int ocfs2_grab_eof_pages(struct i } /* - * Zero the area past i_size but still within an allocated - * cluster. This avoids exposing nonzero data on subsequent file - * extends. + * Zero partial cluster for a hole punch or truncate. 
This avoids exposing + * nonzero data on subsequent file extends. * * We need to call this before i_size is updated on the inode because * otherwise block_write_full_page() will skip writeout of pages past - * i_size. The new_i_size parameter is passed for this reason. + * i_size. */ int ocfs2_zero_range_for_truncate(struct inode *inode, handle_t *handle, u64 range_start, u64 range_end) @@ -6947,6 +6946,15 @@ int ocfs2_zero_range_for_truncate(struct if (!ocfs2_sparse_alloc(OCFS2_SB(sb))) return 0; + /* + * Avoid zeroing pages fully beyond current i_size. It is pointless as + * underlying blocks of those pages should be already zeroed out and + * page writeback will skip them anyway. + */ + range_end = min_t(u64, range_end, i_size_read(inode)); + if (range_start >= range_end) + return 0; + pages = kcalloc(ocfs2_pages_per_cluster(sb), sizeof(struct page *), GFP_NOFS); if (pages == NULL) { @@ -6955,9 +6963,6 @@ int ocfs2_zero_range_for_truncate(struct goto out; } - if (range_start == range_end) - goto out; - ret = ocfs2_extent_map_get_blocks(inode, range_start >> sb->s_blocksize_bits, &phys, NULL, &ext_flags); From patchwork Fri Nov 5 20:35:01 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605387 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5BBC1C433FE for ; Fri, 5 Nov 2021 20:35:15 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 188DD6127A for ; Fri, 5 Nov 2021 20:35:15 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 188DD6127A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass 
smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 3A0BB94000E; Fri, 5 Nov 2021 16:35:14 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 3511F94000D; Fri, 5 Nov 2021 16:35:14 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 217B294000E; Fri, 5 Nov 2021 16:35:14 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0224.hostedemail.com [216.40.44.224]) by kanga.kvack.org (Postfix) with ESMTP id 102E694000D for ; Fri, 5 Nov 2021 16:35:14 -0400 (EDT) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id CCE03184D7B13 for ; Fri, 5 Nov 2021 20:35:13 +0000 (UTC) X-FDA: 78776031306.06.73BA619 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf16.hostedemail.com (Postfix) with ESMTP id 27F9CF000091 for ; Fri, 5 Nov 2021 20:35:05 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 4B96A61242; Fri, 5 Nov 2021 20:35:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144502; bh=2hc4vY4Tu2uTNEDtYnPWoJC+AGdi9+R26xbGbgeX9Hg=; h=Date:From:To:Subject:In-Reply-To:From; b=GqbnmHlQUlOu1xWZrXSNdHpdpEPdcJKXcFgduDtyqQjKfTImcOXJq0FNJiW4+bCd2 H9uvxQeEhYe9No+wWeMNE/ZJ7sjP6fFuHDthepv/f+KWOVvR8vJYTPqAl0HnaMZSbu Zp22ogdInNj1mbFGsCeFsOnB66lw5at+yEDGT4ug= Date: Fri, 05 Nov 2021 13:35:01 -0700 From: Andrew Morton To: akpm@linux-foundation.org, arnd@arndb.de, christian.brauner@ubuntu.com, jamorris@linux.microsoft.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, mszeredi@redhat.com, serge@hallyn.com, torvalds@linux-foundation.org, viro@zeniv.linux.org.uk Subject: [patch 009/262] fs/posix_acl.c: avoid -Wempty-body warning Message-ID: <20211105203501.yNYlpMbop%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail 
v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 27F9CF000091 X-Stat-Signature: w71pcz17z4j5y8kqc4eobs81yy6yr91p Authentication-Results: imf16.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=GqbnmHlQ; spf=pass (imf16.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144505-859756 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Arnd Bergmann Subject: fs/posix_acl.c: avoid -Wempty-body warning The fallthrough comment for an ignored cmpxchg() return value produces a harmless warning with 'make W=1': fs/posix_acl.c: In function 'get_acl': fs/posix_acl.c:127:36: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body] 127 | /* fall through */ ; | ^ Simplify it as a step towards a clean W=1 build. As all architectures define cmpxchg() as a statement expression these days, it is no longer necessary to evaluate its return code, and the if() can just be dropped. Link: https://lkml.kernel.org/r/20210927102410.1863853-1-arnd@kernel.org Link: https://lore.kernel.org/all/20210322132103.qiun2rjilnlgztxe@wittgenstein/ Signed-off-by: Arnd Bergmann Reviewed-by: Christian Brauner Cc: Alexander Viro Cc: James Morris Cc: Serge Hallyn Cc: Miklos Szeredi Signed-off-by: Andrew Morton --- fs/posix_acl.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) --- a/fs/posix_acl.c~posix-acl-avoid-wempty-body-warning +++ a/fs/posix_acl.c @@ -134,8 +134,7 @@ struct posix_acl *get_acl(struct inode * * to just call ->get_acl to fetch the ACL ourself. (This is going to * be an unlikely race.) */ - if (cmpxchg(p, ACL_NOT_CACHED, sentinel) != ACL_NOT_CACHED) - /* fall through */ ; + cmpxchg(p, ACL_NOT_CACHED, sentinel); /* * Normally, the ACL returned by ->get_acl will be cached.
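For readers outside the kernel tree, the fs/posix_acl.c change above can be pictured with a rough userspace sketch. It is not the kernel code: it assumes C11 atomics as a stand-in for cmpxchg(), and DEMO_NOT_CACHED and demo_claim_slot() are hypothetical names made up for this illustration. The point it shows is only that a compare-and-swap whose result is ignored can be written as a plain statement, with no empty-bodied if() around it.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical marker value, standing in for the kernel's ACL_NOT_CACHED. */
#define DEMO_NOT_CACHED ((uintptr_t)-1)

/*
 * Try to install a sentinel into the slot, but only if the slot still
 * holds DEMO_NOT_CACHED. The outcome is deliberately ignored: if another
 * thread won the race, the slot simply keeps whatever that thread stored.
 */
static void demo_claim_slot(_Atomic uintptr_t *slot, uintptr_t sentinel)
{
	uintptr_t expected = DEMO_NOT_CACHED;

	/* Plain statement; no if() with an empty body is needed. */
	(void)atomic_compare_exchange_strong(slot, &expected, sentinel);
}
```

The (void) cast is a userspace habit for marking an intentionally unused result; the patch itself just calls cmpxchg() as a bare statement.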
From patchwork Fri Nov 5 20:35:04 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605381 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B79E8C433EF for ; Fri, 5 Nov 2021 20:35:07 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6DB16611EE for ; Fri, 5 Nov 2021 20:35:07 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 6DB16611EE Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 1560194000B; Fri, 5 Nov 2021 16:35:07 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 105D76B007D; Fri, 5 Nov 2021 16:35:07 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 01D6494000B; Fri, 5 Nov 2021 16:35:06 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0102.hostedemail.com [216.40.44.102]) by kanga.kvack.org (Postfix) with ESMTP id E1AD76B007B for ; Fri, 5 Nov 2021 16:35:06 -0400 (EDT) Received: from smtpin28.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id A1B93184B83C5 for ; Fri, 5 Nov 2021 20:35:06 +0000 (UTC) X-FDA: 78776031012.28.FA94FC2 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf02.hostedemail.com (Postfix) with ESMTP id EB4C4700170D for ; Fri, 5 Nov 2021 20:35:00 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 51595611C0; Fri, 5 Nov 2021 20:35:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144505; bh=lvO/Xal88SvK8JgJPCjeMxe0PNEcLisn+UlciSN2QV4=; h=Date:From:To:Subject:In-Reply-To:From; b=HFvUNp7RjRAYZUooYaWxSUfF10d/wE3dmxoWA036/wdk3ANexAZuYk5fWYrJnrPwa OgfFN/eQlOj+vTPzFBOznm1otk98dZmXvee9DExqi5DWQNenhmuCtz0lTd3rzBQq2T qHzqFgj9fqBrufb/zYkrf31/kQP8Bo88UR69Iwgk= Date: Fri, 05 Nov 2021 13:35:04 -0700 From: Andrew Morton To: akpm@linux-foundation.org, andriy.shevchenko@linux.intel.com, justin.he@arm.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, rdunlap@infradead.org, torvalds@linux-foundation.org, viro@zeniv.linux.org.uk Subject: [patch 010/262] d_path: fix Kernel doc validator complaining Message-ID: <20211105203504.mi-qYxPAX%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf02.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=HFvUNp7R; dmarc=none; spf=pass (imf02.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: EB4C4700170D X-Stat-Signature: sro66mm91xcncreqqhet6km8rf61uq9m X-HE-Tag: 1636144500-564686 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Jia He Subject: d_path: fix Kernel doc validator complaining Kernel doc validator complains: Function parameter or member 'p' not described in 'prepend_name' Excess function parameter 'buffer' description in 'prepend_name' Link: https://lkml.kernel.org/r/20211011005614.26189-1-justin.he@arm.com Fixes: ad08ae586586 ("d_path: introduce struct prepend_buffer") Signed-off-by: Jia He Reviewed-by: Andy Shevchenko Acked-by: Randy Dunlap Cc: Al Viro Signed-off-by: Andrew Morton --- fs/d_path.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) --- 
a/fs/d_path.c~d_path-fix-kernel-doc-validator-complaining +++ a/fs/d_path.c @@ -77,9 +77,8 @@ static bool prepend(struct prepend_buffe /** * prepend_name - prepend a pathname in front of current buffer pointer - * @buffer: buffer pointer - * @buflen: allocated length of the buffer - * @name: name string and length qstr structure + * @p: prepend buffer which contains buffer pointer and allocated length + * @name: name string and length qstr structure * * With RCU path tracing, it may race with d_move(). Use READ_ONCE() to * make sure that either the old or the new name pointer and length are @@ -141,8 +140,7 @@ static int __prepend_path(const struct d * prepend_path - Prepend path string to a buffer * @path: the dentry/vfsmount to report * @root: root vfsmnt/dentry - * @buffer: pointer to the end of the buffer - * @buflen: pointer to buffer length + * @p: prepend buffer which contains buffer pointer and allocated length * * The function will first try to write out the pathname without taking any * lock other than the RCU read lock to make sure that dentries won't go away. 
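As a small illustration of what the kernel-doc validator checks in the fs/d_path.c hunks above, the sketch below documents a function whose @-tags match its actual parameter names one for one. It is a simplified userspace stand-in, not the code from fs/d_path.c: struct demo_prepend_buffer and demo_prepend() are hypothetical names, and the overflow handling is reduced to a simple bounds check.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's struct prepend_buffer. */
struct demo_prepend_buffer {
	char *buf;	/* current write position; moves toward the buffer start */
	int len;	/* bytes still free in front of buf */
};

/**
 * demo_prepend - copy bytes in front of the current buffer pointer
 * @p: prepend buffer which contains buffer pointer and remaining length
 * @str: bytes to copy (need not be NUL-terminated)
 * @len: number of bytes to copy from @str
 *
 * Every parameter in the prototype has exactly one matching @-line, and
 * no @-line names a parameter that does not exist. A mismatch in either
 * direction is what the validator reports.
 *
 * Return: true on success, false if @len bytes do not fit.
 */
static bool demo_prepend(struct demo_prepend_buffer *p, const char *str, int len)
{
	if (len < 0 || p->len < len)
		return false;

	p->len -= len;
	p->buf -= len;
	memcpy(p->buf, str, (size_t)len);
	return true;
}
```

A caller would point buf at the end of its storage, set len to the storage size, and prepend components from last to first, starting with the terminating NUL.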
From patchwork Fri Nov 5 20:35:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605383 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DC5F4C433F5 for ; Fri, 5 Nov 2021 20:35:10 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 8D1DF61252 for ; Fri, 5 Nov 2021 20:35:10 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 8D1DF61252 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 3051E6B007D; Fri, 5 Nov 2021 16:35:10 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 2B47994000C; Fri, 5 Nov 2021 16:35:10 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1A2EE6B0080; Fri, 5 Nov 2021 16:35:10 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0080.hostedemail.com [216.40.44.80]) by kanga.kvack.org (Postfix) with ESMTP id 0BE696B007D for ; Fri, 5 Nov 2021 16:35:10 -0400 (EDT) Received: from smtpin08.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id B6100777ED for ; Fri, 5 Nov 2021 20:35:09 +0000 (UTC) X-FDA: 78776031138.08.F6CE78D Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf04.hostedemail.com (Postfix) with ESMTP id C9454500030D for ; Fri, 5 Nov 2021 20:35:00 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 541C3611EE; Fri, 5 Nov 2021 20:35:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; 
d=linux-foundation.org; s=korg; t=1636144508; bh=Mc3Jmd7eT/qvokZGg0khVUvvsM7oKTWlA9em/Tlk+Cg=; h=Date:From:To:Subject:In-Reply-To:From; b=OgIoVJFh8lLRyJpuuU/8DrC0PBUoHxmAoJ6Ury9MHWGuq5Odoew8OqqDng966vp93 2KUEmzf2OHm7d8c6ymhLzFbQFUnKwX7RhZ563n6cXsNyyE+qXX2IZnWGapnAT1CjwL LcHymS78gbR7q9REI+IFaCzceC/64Hqb94SQTKsQ= Date: Fri, 05 Nov 2021 13:35:07 -0700 From: Andrew Morton To: akpm@linux-foundation.org, cl@linux.com, iamjoonsoo.kim@lge.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, penberg@kernel.org, rientjes@google.com, torvalds@linux-foundation.org, vbabka@suse.cz, willy@infradead.org Subject: [patch 011/262] mm: move kvmalloc-related functions to slab.h Message-ID: <20211105203507.gvGl5bZmw%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf04.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=OgIoVJFh; dmarc=none; spf=pass (imf04.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: C9454500030D X-Stat-Signature: oxzqdyxcomw4ennoyf8p5aqhuiqyz4gg X-HE-Tag: 1636144500-306778 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: "Matthew Wilcox (Oracle)" Subject: mm: move kvmalloc-related functions to slab.h Not all files in the kernel should include mm.h. Migrating callers from kmalloc to kvmalloc is easier if the kvmalloc functions are in slab.h. 
[akpm@linux-foundation.org: move the new kvrealloc() also] [akpm@linux-foundation.org: drivers/hwmon/occ/p9_sbe.c needs slab.h] Link: https://lkml.kernel.org/r/20210622215757.3525604-1-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Acked-by: Pekka Enberg Cc: Christoph Lameter Cc: David Rientjes Cc: Joonsoo Kim Cc: Vlastimil Babka Signed-off-by: Andrew Morton --- drivers/hwmon/occ/p9_sbe.c | 1 + drivers/of/kexec.c | 1 + include/linux/mm.h | 34 ---------------------------------- include/linux/slab.h | 34 ++++++++++++++++++++++++++++++++++ 4 files changed, 36 insertions(+), 34 deletions(-) --- a/drivers/hwmon/occ/p9_sbe.c~mm-move-kvmalloc-related-functions-to-slabh +++ a/drivers/hwmon/occ/p9_sbe.c @@ -3,6 +3,7 @@ #include #include +#include #include #include #include --- a/drivers/of/kexec.c~mm-move-kvmalloc-related-functions-to-slabh +++ a/drivers/of/kexec.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #define RNG_SEED_SIZE 128 --- a/include/linux/mm.h~mm-move-kvmalloc-related-functions-to-slabh +++ a/include/linux/mm.h @@ -799,40 +799,6 @@ static inline int is_vmalloc_or_module_a } #endif -extern void *kvmalloc_node(size_t size, gfp_t flags, int node); -static inline void *kvmalloc(size_t size, gfp_t flags) -{ - return kvmalloc_node(size, flags, NUMA_NO_NODE); -} -static inline void *kvzalloc_node(size_t size, gfp_t flags, int node) -{ - return kvmalloc_node(size, flags | __GFP_ZERO, node); -} -static inline void *kvzalloc(size_t size, gfp_t flags) -{ - return kvmalloc(size, flags | __GFP_ZERO); -} - -static inline void *kvmalloc_array(size_t n, size_t size, gfp_t flags) -{ - size_t bytes; - - if (unlikely(check_mul_overflow(n, size, &bytes))) - return NULL; - - return kvmalloc(bytes, flags); -} - -static inline void *kvcalloc(size_t n, size_t size, gfp_t flags) -{ - return kvmalloc_array(n, size, flags | __GFP_ZERO); -} - -extern void *kvrealloc(const void *p, size_t oldsize, size_t newsize, - gfp_t flags); -extern void kvfree(const 
void *addr); -extern void kvfree_sensitive(const void *addr, size_t len); - static inline int head_compound_mapcount(struct page *head) { return atomic_read(compound_mapcount_ptr(head)) + 1; --- a/include/linux/slab.h~mm-move-kvmalloc-related-functions-to-slabh +++ a/include/linux/slab.h @@ -732,6 +732,40 @@ static inline void *kzalloc_node(size_t return kmalloc_node(size, flags | __GFP_ZERO, node); } +extern void *kvmalloc_node(size_t size, gfp_t flags, int node); +static inline void *kvmalloc(size_t size, gfp_t flags) +{ + return kvmalloc_node(size, flags, NUMA_NO_NODE); +} +static inline void *kvzalloc_node(size_t size, gfp_t flags, int node) +{ + return kvmalloc_node(size, flags | __GFP_ZERO, node); +} +static inline void *kvzalloc(size_t size, gfp_t flags) +{ + return kvmalloc(size, flags | __GFP_ZERO); +} + +static inline void *kvmalloc_array(size_t n, size_t size, gfp_t flags) +{ + size_t bytes; + + if (unlikely(check_mul_overflow(n, size, &bytes))) + return NULL; + + return kvmalloc(bytes, flags); +} + +static inline void *kvcalloc(size_t n, size_t size, gfp_t flags) +{ + return kvmalloc_array(n, size, flags | __GFP_ZERO); +} + +extern void *kvrealloc(const void *p, size_t oldsize, size_t newsize, + gfp_t flags); +extern void kvfree(const void *addr); +extern void kvfree_sensitive(const void *addr, size_t len); + unsigned int kmem_cache_size(struct kmem_cache *s); void __init kmem_cache_init_late(void); From patchwork Fri Nov 5 20:35:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605385 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F0532C433EF for ; Fri, 5 Nov 2021 20:35:13 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org 
(Postfix) with ESMTP id 9263A61252 for ; Fri, 5 Nov 2021 20:35:13 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 9263A61252 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 39FD894000C; Fri, 5 Nov 2021 16:35:13 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 34F8A94000D; Fri, 5 Nov 2021 16:35:13 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 23EBF94000C; Fri, 5 Nov 2021 16:35:13 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0161.hostedemail.com [216.40.44.161]) by kanga.kvack.org (Postfix) with ESMTP id 1300594000D for ; Fri, 5 Nov 2021 16:35:13 -0400 (EDT) Received: from smtpin19.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id BBB4882499A8 for ; Fri, 5 Nov 2021 20:35:12 +0000 (UTC) X-FDA: 78776031180.19.9362CA7 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf10.hostedemail.com (Postfix) with ESMTP id E099E60019B0 for ; Fri, 5 Nov 2021 20:35:00 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 5E2A46125F; Fri, 5 Nov 2021 20:35:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144511; bh=DCh7yJy6z4y7qRQcZROwHfmik69LdedH9RpmhvtpYrE=; h=Date:From:To:Subject:In-Reply-To:From; b=EDyQkK1xNAfJDzrAKp3XAnJgA9PLBfou5PIVij6W6cAn134gCZrfJRGgw2/o5+c6b BMYx0DNHD94rH6XbS9o7Cau55JjY3YvkmlTzyft3W8cHK+txx9Suf8E74YW9XG6aYZ NIUn9MQzhhMrlBwntJTW+QyG4hYorWPB2pSGFLag= Date: Fri, 05 Nov 2021 13:35:10 -0700 From: Andrew Morton To: akpm@linux-foundation.org, cl@linux.com, iamjoonsoo.kim@lge.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, penberg@kernel.org, rientjes@google.com, shi_lei@massclouds.com, 
torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 012/262] mm/slab.c: remove useless lines in enable_cpucache() Message-ID: <20211105203510.kUPi_1np4%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: E099E60019B0 X-Stat-Signature: yfe35smq5p8bkyycwqnapbe6funfj7og Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=EDyQkK1x; spf=pass (imf10.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144500-755956 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Shi Lei Subject: mm/slab.c: remove useless lines in enable_cpucache() These lines are useless, so remove them. Link: https://lkml.kernel.org/r/20210930034845.2539-1-shi_lei@massclouds.com Fixes: 10befea91b61 ("mm: memcg/slab: use a single set of kmem_caches for all allocations") Signed-off-by: Shi Lei Reviewed-by: Vlastimil Babka Acked-by: David Rientjes Cc: Christoph Lameter Cc: Pekka Enberg Cc: Joonsoo Kim Signed-off-by: Andrew Morton --- mm/slab.c | 3 --- 1 file changed, 3 deletions(-) --- a/mm/slab.c~mm-remove-useless-lines-in-enable_cpucache +++ a/mm/slab.c @@ -3900,8 +3900,6 @@ static int enable_cpucache(struct kmem_c if (err) goto end; - if (limit && shared && batchcount) - goto skip_setup; /* * The head array serves three purposes: * - create a LIFO ordering, i.e. 
return objects that are cache-warm @@ -3944,7 +3942,6 @@ static int enable_cpucache(struct kmem_c limit = 32; #endif batchcount = (limit + 1) / 2; -skip_setup: err = do_tune_cpucache(cachep, limit, batchcount, shared, gfp); end: if (err) From patchwork Fri Nov 5 20:35:14 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605395 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 99298C433F5 for ; Fri, 5 Nov 2021 20:35:29 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4E01361242 for ; Fri, 5 Nov 2021 20:35:29 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 4E01361242 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id DB1B4940011; Fri, 5 Nov 2021 16:35:28 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D60EF940007; Fri, 5 Nov 2021 16:35:28 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C5096940011; Fri, 5 Nov 2021 16:35:28 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0193.hostedemail.com [216.40.44.193]) by kanga.kvack.org (Postfix) with ESMTP id B4D2E940007 for ; Fri, 5 Nov 2021 16:35:28 -0400 (EDT) Received: from smtpin25.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 7A037777F6 for ; Fri, 5 Nov 2021 20:35:28 +0000 (UTC) X-FDA: 78776031936.25.A575AA6 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by 
imf31.hostedemail.com (Postfix) with ESMTP id C6C3F104AAC7 for ; Fri, 5 Nov 2021 20:35:19 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 6CEDF61252; Fri, 5 Nov 2021 20:35:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144514; bh=zOIB/S9H+ITionS7Amebd4LYXa7vQtVFbpQay9Vx7tk=; h=Date:From:To:Subject:In-Reply-To:From; b=lRbualA7B2zEAnD1pjQySBJD15PL+I/enRD46ryjWe/P+FFg4zRwtjsjEoBu18No5 +OeyWFAXqOtr4DITg9ztkZxsCB/ue8Rco4TIMvV3YNjtAIqui7Fh/chHAtoq+9zSAB t/OA6qWrEdt0MDroIo6ZFOZlMyYJvFJOwsQcMZc8= Date: Fri, 05 Nov 2021 13:35:14 -0700 From: Andrew Morton To: akpm@linux-foundation.org, cl@linux.com, iamjoonsoo.kim@lge.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, penberg@kernel.org, rientjes@google.com, shakeelb@google.com, torvalds@linux-foundation.org, vbabka@suse.cz, wangkefeng.wang@huawei.com, willy@infradead.org Subject: [patch 013/262] slub: add back check for free nonslab objects Message-ID: <20211105203514.l8x6qswFB%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf31.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=lRbualA7; dmarc=none; spf=pass (imf31.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: C6C3F104AAC7 X-Stat-Signature: xqrpygwfyyhosrd6gcs8d3zjxpcatxkp X-HE-Tag: 1636144519-875570 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:
From: Kefeng Wang
Subject: slub: add back check for free nonslab objects

After commit f227f0faf63b ("slub: fix unreclaimable slab stat for bulk free"), the check for freeing a nonslab page was replaced by VM_BUG_ON_PAGE, which only checks when CONFIG_DEBUG_VM is enabled. That config may impact performance, so it is only used for debugging. Commit 0937502af7c9 ("slub: Add check for kfree() of non slab objects.") added the check, which is needed in all configs to catch invalid frees; these can indicate real problems such as memory corruption, use after free and double free. So replace VM_BUG_ON_PAGE with WARN_ON_ONCE, and print the object address to help debug the issue.

Link: https://lkml.kernel.org/r/20210930070214.61499-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang Cc: Matthew Wilcox Cc: Shakeel Butt Cc: Vlastimil Babka Cc: Christoph Lameter Cc: Pekka Enberg Cc: David Rientjes Cc: Joonsoo Kim Signed-off-by: Andrew Morton --- mm/slub.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) --- a/mm/slub.c~slub-add-back-check-for-free-nonslab-objects +++ a/mm/slub.c @@ -3522,7 +3522,9 @@ static inline void free_nonslab_page(str { unsigned int order = compound_order(page); - VM_BUG_ON_PAGE(!PageCompound(page), page); + if (WARN_ON_ONCE(!PageCompound(page))) + pr_warn_once("object pointer: 0x%p\n", object); + kfree_hook(object); mod_lruvec_page_state(page, NR_SLAB_UNRECLAIMABLE_B, -(PAGE_SIZE << order)); __free_pages(page, order);
From patchwork Fri Nov 5 20:35:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605389 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A9BDAC433FE for ; Fri, 5 Nov 2021 20:35:20 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4879E611C0 for ; Fri, 5 Nov 2021 20:35:20 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 4879E611C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id E271D94000D; Fri, 5 Nov 2021 16:35:19 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id DD72E940007; Fri, 5 Nov 2021 16:35:19 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C9E8694000D; Fri, 5 Nov 2021 16:35:19 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0049.hostedemail.com [216.40.44.49]) by kanga.kvack.org (Postfix) with ESMTP id B95CA940007 for ; Fri, 5 Nov 2021 16:35:19 -0400 (EDT) Received: from smtpin38.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 7684376BBB for ; Fri, 5 Nov 2021 20:35:19 +0000 (UTC) X-FDA: 78776031558.38.D03AFBE Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf27.hostedemail.com (Postfix) with ESMTP id 0821270000B9 for ; Fri, 5 Nov 2021 20:35:18 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 0366B61242; Fri, 5 Nov 2021 20:35:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144518; bh=r+rOVlDuZFH89DBSvCSq623tdfXh5HZE2Pri2pUwdx0=; h=Date:From:To:Subject:In-Reply-To:From; b=kazDEnq4UE6urEDotHbh9urj4l82xxaY1yCKmuSmHm6pAfVRAzttf8z2zwTefPVww 4eqO4EhmUpH5JtHpt6sFaY10cWr8DCk0bUOtcIYMNTPQHp1DCezCpLz2TzvpRVXIm0 fh7Io6TslE2cjEDziFynHd5mMDEtCMInofO+ubYY= Date: Fri, 05 Nov 2021 13:35:17 -0700 From: Andrew Morton To: akpm@linux-foundation.org, cl@linux.com, guro@fb.com, iamjoonsoo.kim@lge.com, jannh@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, penberg@kernel.org, rientjes@google.com, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 014/262] mm, slub: change percpu partial accounting from objects to pages Message-ID: <20211105203517.sQ2746FAk%akpm@linux-foundation.org> In-Reply-To: 
<20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=kazDEnq4; dmarc=none; spf=pass (imf27.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 0821270000B9 X-Stat-Signature: ch37u3weduesyedsh95u4ewjbnduq8a8 X-HE-Tag: 1636144518-210139 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:
From: Vlastimil Babka
Subject: mm, slub: change percpu partial accounting from objects to pages

With CONFIG_SLUB_CPU_PARTIAL enabled, SLUB keeps a percpu list of partial slabs that can be promoted to cpu slab when the previous one is depleted, without accessing the shared partial list. A slab can be added to this list by 1) refill of an empty list from get_partial_node() - once we really have to access the shared partial list, we acquire multiple slabs to amortize the cost of locking, and 2) first free to a previously full slab - instead of putting the slab on a shared partial list, we can more cheaply freeze it and put it on the per-cpu list.

To control how large a percpu partial list can grow for a kmem cache, set_cpu_partial() calculates a target number of free objects on each cpu's percpu partial list, and this can also be set by the sysfs file cpu_partial. However, the tracking of the actual number of objects is imprecise, in order to limit overhead from cpu X freeing an object to a slab on the percpu partial list of cpu Y.
Basically, the percpu partial slabs form a singly linked list, and when we add a new slab to the list with current head "oldpage", we set in the struct page of the slab we're adding:

    page->pages = oldpage->pages + 1; // this is precise
    page->pobjects = oldpage->pobjects + (page->objects - page->inuse);
    page->next = oldpage;

Thus the real number of free objects in the slab (objects - inuse) is only determined at the moment of adding the slab to the percpu partial list, and further freeing doesn't update the pobjects counter nor propagate it to the current list head. As Jann reports [1], this can easily lead to large inaccuracies, where the target number of objects (up to 30 by default) can translate to the same number of (empty) slab pages on the list. In case 2) above, we put a slab with 1 free object on the list, thus only increase page->pobjects by 1, even if there are subsequent frees on the same slab. Jann has noticed this in practice and so did we [2] when investigating a significant increase of kmemcg usage after switching from SLAB to SLUB.

While this is no longer a problem in kmemcg context thanks to the accounting rewrite in 5.9, the memory waste is still not ideal and it's questionable whether it makes sense to perform free object count based control when object counts can easily become so inaccurate. So this patch converts the accounting to be based on number of pages only (which is precise) and removes the page->pobjects field completely. This is also ultimately simpler.

To retain the existing set_cpu_partial() heuristic, first calculate the target number of objects as previously, but then convert it to a target number of pages by assuming the pages will be half-filled on average. This assumption might obviously also be inaccurate in practice, but it cannot degrade to the actual number of pages being equal to the target number of objects.

We could also skip the intermediate step with target number of objects and rewrite the heuristic in terms of pages.
However we still have the sysfs file cpu_partial which uses number of objects and could break existing users if it suddenly becomes number of pages, so this patch doesn't do that.

In practice, after this patch the heuristics limit the size of the percpu partial list to at most 2 pages. In case of a reported regression (which would mean some workload has benefited from the previous imprecise object based counting), we can tune the heuristics to get a better compromise within the new scheme, while still avoiding the unexpectedly long percpu partial lists.

[1] https://lore.kernel.org/linux-mm/CAG48ez2Qx5K1Cab-m8BdSibp6wLTip6ro4=-umR7BLsEgjEYzA@mail.gmail.com/
[2] https://lore.kernel.org/all/2f0f46e8-2535-410a-1859-e9cfa4e57c18@suse.cz/

========== Evaluation ==========

Mel was kind enough to run v1 through mmtests machinery for netperf (localhost) and hackbench; for the most significant results see below. So there are some apparent regressions, especially with hackbench, which I think ultimately boil down to having shorter percpu partial lists on average and some benchmarks benefiting from longer ones. Monitoring slab usage also indicated less memory usage by slab. Based on that, the following patch will bump the defaults to allow longer percpu partial lists than after this patch.

However the goal is certainly not such that we would limit the percpu partial lists to 30 pages just because previously a specific alloc/free pattern could lead to the limit of 30 objects translating to a limit of 30 pages - that would make little sense. This is a correctness patch, and if a workload benefits from larger lists, the sysfs tuning knobs are still there to allow that.
Netperf

  2-socket Intel(R) Xeon(R) Gold 5218R CPU @ 2.10GHz (20 cores, 40 threads per socket), 384GB RAM
  TCP-RR:
    hmean  before 127045.79 after 121092.94 ( -4.69%, worse)
    stddev before   2634.37 after   1254.08
  UDP-RR:
    hmean  before 166985.45 after 160668.94 ( -3.78%, worse)
    stddev before   4059.69 after   1943.63

  2-socket Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz (20 cores, 40 threads per socket), 512GB RAM
  TCP-RR:
    hmean  before  84173.25 after  76914.72 ( -8.62%, worse)
  UDP-RR:
    hmean  before  93571.12 after  96428.69 (  3.05%, better)
    stddev before  23118.54 after  16828.14

  2-socket Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz (12 cores, 24 threads per socket), 64GB RAM
  TCP-RR:
    hmean  before  49984.92 after  48922.27 ( -2.13%, worse)
    stddev before   6248.15 after   4740.51
  UDP-RR:
    hmean  before  61854.31 after  68761.81 ( 11.17%, better)
    stddev before   4093.54 after   5898.91

  other machines - within 2%

Hackbench

  (results before and after the patch, negative % means worse)

  2-socket AMD EPYC 7713 (64 cores, 128 threads per core), 256GB RAM

  hackbench-process-sockets
  Amean   1     0.5380    0.5583 (  -3.78%)
  Amean   4     0.7510    0.8150 (  -8.52%)
  Amean   7     0.7930    0.9533 ( -20.22%)
  Amean   12    0.7853    1.1313 ( -44.06%)
  Amean   21    1.1520    1.4993 ( -30.15%)
  Amean   30    1.6223    1.9237 ( -18.57%)
  Amean   48    2.6767    2.9903 ( -11.72%)
  Amean   79    4.0257    5.1150 ( -27.06%)
  Amean   110   5.5193    7.4720 ( -35.38%)
  Amean   141   7.2207    9.9840 ( -38.27%)
  Amean   172   8.4770   12.1963 ( -43.88%)
  Amean   203   9.6473   14.3137 ( -48.37%)
  Amean   234  11.3960   18.7917 ( -64.90%)
  Amean   265  13.9627   22.4607 ( -60.86%)
  Amean   296  14.9163   26.0483 ( -74.63%)

  hackbench-thread-sockets
  Amean   1     0.5597    0.5877 (  -5.00%)
  Amean   4     0.7913    0.8960 ( -13.23%)
  Amean   7     0.8190    1.0017 ( -22.30%)
  Amean   12    0.9560    1.1727 ( -22.66%)
  Amean   21    1.7587    1.5660 (  10.96%)
  Amean   30    2.4477    1.9807 (  19.08%)
  Amean   48    3.4573    3.0630 (  11.41%)
  Amean   79    4.7903    5.1733 (  -8.00%)
  Amean   110   6.1370    7.4220 ( -20.94%)
  Amean   141   7.5777    9.2617 ( -22.22%)
  Amean   172   9.2280   11.0907 ( -20.18%)
  Amean   203  10.2793   13.3470 ( -29.84%)
  Amean   234  11.2410   17.1070 ( -52.18%)
  Amean   265  12.5970   23.3323 ( -85.22%)
  Amean   296  17.1540   24.2857 ( -41.57%)

  2-socket Intel(R) Xeon(R) Gold 5218R CPU @ 2.10GHz (20 cores, 40 threads per socket), 384GB RAM

  hackbench-process-sockets
  Amean   1     0.5760    0.4793 (  16.78%)
  Amean   4     0.9430    0.9707 (  -2.93%)
  Amean   7     1.5517    1.8843 ( -21.44%)
  Amean   12    2.4903    2.7267 (  -9.49%)
  Amean   21    3.9560    4.2877 (  -8.38%)
  Amean   30    5.4613    5.8343 (  -6.83%)
  Amean   48    8.5337    9.2937 (  -8.91%)
  Amean   79   14.0670   15.2630 (  -8.50%)
  Amean   110  19.2253   21.2467 ( -10.51%)
  Amean   141  23.7557   25.8550 (  -8.84%)
  Amean   172  28.4407   29.7603 (  -4.64%)
  Amean   203  33.3407   33.9927 (  -1.96%)
  Amean   234  38.3633   39.1150 (  -1.96%)
  Amean   265  43.4420   43.8470 (  -0.93%)
  Amean   296  48.3680   48.9300 (  -1.16%)

  hackbench-thread-sockets
  Amean   1     0.6080    0.6493 (  -6.80%)
  Amean   4     1.0000    1.0513 (  -5.13%)
  Amean   7     1.6607    2.0260 ( -22.00%)
  Amean   12    2.7637    2.9273 (  -5.92%)
  Amean   21    5.0613    4.5153 (  10.79%)
  Amean   30    6.3340    6.1140 (   3.47%)
  Amean   48    9.0567    9.5577 (  -5.53%)
  Amean   79   14.5657   15.7983 (  -8.46%)
  Amean   110  19.6213   21.6333 ( -10.25%)
  Amean   141  24.1563   26.2697 (  -8.75%)
  Amean   172  28.9687   30.2187 (  -4.32%)
  Amean   203  33.9763   34.6970 (  -2.12%)
  Amean   234  38.8647   39.3207 (  -1.17%)
  Amean   265  44.0813   44.1507 (  -0.16%)
  Amean   296  49.2040   49.4330 (  -0.47%)

  2-socket Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz (20 cores, 40 threads per socket), 512GB RAM

  hackbench-process-sockets
  Amean   1     0.5027    0.5017 (   0.20%)
  Amean   4     1.1053    1.2033 (  -8.87%)
  Amean   7     1.8760    2.1820 ( -16.31%)
  Amean   12    2.9053    3.1810 (  -9.49%)
  Amean   21    4.6777    4.9920 (  -6.72%)
  Amean   30    6.5180    6.7827 (  -4.06%)
  Amean   48   10.0710   10.5227 (  -4.48%)
  Amean   79   16.4250   17.5053 (  -6.58%)
  Amean   110  22.6203   24.4617 (  -8.14%)
  Amean   141  28.0967   31.0363 ( -10.46%)
  Amean   172  34.4030   36.9233 (  -7.33%)
  Amean   203  40.5933   43.0850 (  -6.14%)
  Amean   234  46.6477   48.7220 (  -4.45%)
  Amean   265  53.0530   53.9597 (  -1.71%)
  Amean   296  59.2760   59.9213 (  -1.09%)

  hackbench-thread-sockets
  Amean   1     0.5363    0.5330 (   0.62%)
  Amean   4     1.1647    1.2157 (  -4.38%)
  Amean   7     1.9237    2.2833 ( -18.70%)
  Amean   12    2.9943    3.3110 ( -10.58%)
  Amean   21    4.9987    5.1880 (  -3.79%)
  Amean   30    6.7583    7.0043 (  -3.64%)
  Amean   48   10.4547   10.8353 (  -3.64%)
  Amean   79   16.6707   17.6790 (  -6.05%)
  Amean   110  22.8207   24.4403 (  -7.10%)
  Amean   141  28.7090   31.0533 (  -8.17%)
  Amean   172  34.9387   36.8260 (  -5.40%)
  Amean   203  41.1567   43.0450 (  -4.59%)
  Amean   234  47.3790   48.5307 (  -2.43%)
  Amean   265  53.9543   54.6987 (  -1.38%)
  Amean   296  60.0820   60.2163 (  -0.22%)

  1-socket Intel(R) Xeon(R) CPU E3-1240 v5 @ 3.50GHz (4 cores, 8 threads), 32 GB RAM

  hackbench-process-sockets
  Amean   1     1.4760    1.5773 (  -6.87%)
  Amean   3     3.9370    4.0910 (  -3.91%)
  Amean   5     6.6797    6.9357 (  -3.83%)
  Amean   7     9.3367    9.7150 (  -4.05%)
  Amean   12   15.7627   16.1400 (  -2.39%)
  Amean   18   23.5360   23.6890 (  -0.65%)
  Amean   24   31.0663   31.3137 (  -0.80%)
  Amean   30   38.7283   39.0037 (  -0.71%)
  Amean   32   41.3417   41.6097 (  -0.65%)

  hackbench-thread-sockets
  Amean   1     1.5250    1.6043 (  -5.20%)
  Amean   3     4.0897    4.2603 (  -4.17%)
  Amean   5     6.7760    7.0933 (  -4.68%)
  Amean   7     9.4817    9.9157 (  -4.58%)
  Amean   12   15.9610   16.3937 (  -2.71%)
  Amean   18   23.9543   24.3417 (  -1.62%)
  Amean   24   31.4400   31.7217 (  -0.90%)
  Amean   30   39.2457   39.5467 (  -0.77%)
  Amean   32   41.8267   42.1230 (  -0.71%)

  2-socket Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz (12 cores, 24 threads per socket), 64GB RAM

  hackbench-process-sockets
  Amean   1     1.0347    1.0880 (  -5.15%)
  Amean   4     1.7267    1.8527 (  -7.30%)
  Amean   7     2.6707    2.8110 (  -5.25%)
  Amean   12    4.1617    4.3383 (  -4.25%)
  Amean   21    7.0070    7.2600 (  -3.61%)
  Amean   30    9.9187   10.2397 (  -3.24%)
  Amean   48   15.6710   16.3923 (  -4.60%)
  Amean   79   24.7743   26.1247 (  -5.45%)
  Amean   110  34.3000   35.9307 (  -4.75%)
  Amean   141  44.2043   44.8010 (  -1.35%)
  Amean   172  54.2430   54.7260 (  -0.89%)
  Amean   192  60.6557   60.9777 (  -0.53%)

  hackbench-thread-sockets
  Amean   1     1.0610    1.1353 (  -7.01%)
  Amean   4     1.7543    1.9140 (  -9.10%)
  Amean   7     2.7840    2.9573 (  -6.23%)
  Amean   12    4.3813    4.4937 (  -2.56%)
  Amean   21    7.3460    7.5350 (  -2.57%)
  Amean   30   10.2313   10.5190 (  -2.81%)
  Amean   48   15.9700   16.5940 (  -3.91%)
  Amean   79   25.3973   26.6637 (  -4.99%)
  Amean   110  35.1087   36.4797 (  -3.91%)
  Amean   141  45.8220   46.3053 (  -1.05%)
  Amean   172  55.4917   55.7320 (  -0.43%)
  Amean   192  62.7490   62.5410 (   0.33%)

Link: https://lkml.kernel.org/r/20211012134651.11258-1-vbabka@suse.cz Signed-off-by: Vlastimil Babka Reported-by: Jann Horn Cc: Roman Gushchin Cc: Christoph Lameter Cc: Pekka Enberg Cc: David Rientjes Cc: Joonsoo Kim Signed-off-by: Andrew Morton --- include/linux/mm_types.h | 2 include/linux/slub_def.h | 13 ----- mm/slub.c | 89 ++++++++++++++++++++++++------------- 3 files changed, 61 insertions(+), 43 deletions(-) --- a/include/linux/mm_types.h~mm-slub-change-percpu-partial-accounting-from-objects-to-pages +++ a/include/linux/mm_types.h @@ -124,10 +124,8 @@ struct page { struct page *next; #ifdef CONFIG_64BIT int pages; /* Nr of pages left */ - int pobjects; /* Approximate count */ #else short int pages; - short int pobjects; #endif }; }; --- a/include/linux/slub_def.h~mm-slub-change-percpu-partial-accounting-from-objects-to-pages +++ a/include/linux/slub_def.h @@ -99,6 +99,8 @@ struct kmem_cache { #ifdef CONFIG_SLUB_CPU_PARTIAL /* Number of per cpu partial objects to keep around */ unsigned int cpu_partial; + /* Number of per cpu partial pages to keep around */ + unsigned int cpu_partial_pages; #endif struct kmem_cache_order_objects oo; @@ -141,17 +143,6 @@ struct kmem_cache { struct kmem_cache_node *node[MAX_NUMNODES]; }; -#ifdef CONFIG_SLUB_CPU_PARTIAL -#define slub_cpu_partial(s) ((s)->cpu_partial) -#define slub_set_cpu_partial(s, n) \ -({ \ - slub_cpu_partial(s) = (n); \ -}) -#else -#define slub_cpu_partial(s) (0) -#define slub_set_cpu_partial(s, n) -#endif /* CONFIG_SLUB_CPU_PARTIAL */ - #ifdef CONFIG_SYSFS #define SLAB_SUPPORTS_SYSFS void sysfs_slab_unlink(struct kmem_cache *); --- a/mm/slub.c~mm-slub-change-percpu-partial-accounting-from-objects-to-pages +++ a/mm/slub.c @@ -414,6 +414,29 @@ static inline unsigned int oo_objects(st return x.x & OO_MASK; } +#ifdef
CONFIG_SLUB_CPU_PARTIAL +static void slub_set_cpu_partial(struct kmem_cache *s, unsigned int nr_objects) +{ + unsigned int nr_pages; + + s->cpu_partial = nr_objects; + + /* + * We take the number of objects but actually limit the number of + * pages on the per cpu partial list, in order to limit excessive + * growth of the list. For simplicity we assume that the pages will + * be half-full. + */ + nr_pages = DIV_ROUND_UP(nr_objects * 2, oo_objects(s->oo)); + s->cpu_partial_pages = nr_pages; +} +#else +static inline void +slub_set_cpu_partial(struct kmem_cache *s, unsigned int nr_objects) +{ +} +#endif /* CONFIG_SLUB_CPU_PARTIAL */ + /* * Per slab locking using the pagelock */ @@ -2052,7 +2075,7 @@ static inline void remove_partial(struct */ static inline void *acquire_slab(struct kmem_cache *s, struct kmem_cache_node *n, struct page *page, - int mode, int *objects) + int mode) { void *freelist; unsigned long counters; @@ -2068,7 +2091,6 @@ static inline void *acquire_slab(struct freelist = page->freelist; counters = page->counters; new.counters = counters; - *objects = new.objects - new.inuse; if (mode) { new.inuse = page->objects; new.freelist = NULL; @@ -2106,9 +2128,8 @@ static void *get_partial_node(struct kme { struct page *page, *page2; void *object = NULL; - unsigned int available = 0; unsigned long flags; - int objects; + unsigned int partial_pages = 0; /* * Racy check. 
If we mistakenly see no partial slabs then we @@ -2126,11 +2147,10 @@ static void *get_partial_node(struct kme if (!pfmemalloc_match(page, gfpflags)) continue; - t = acquire_slab(s, n, page, object == NULL, &objects); + t = acquire_slab(s, n, page, object == NULL); if (!t) break; - available += objects; if (!object) { *ret_page = page; stat(s, ALLOC_FROM_PARTIAL); @@ -2138,10 +2158,15 @@ static void *get_partial_node(struct kme } else { put_cpu_partial(s, page, 0); stat(s, CPU_PARTIAL_NODE); + partial_pages++; } +#ifdef CONFIG_SLUB_CPU_PARTIAL if (!kmem_cache_has_cpu_partial(s) - || available > slub_cpu_partial(s) / 2) + || partial_pages > s->cpu_partial_pages / 2) break; +#else + break; +#endif } spin_unlock_irqrestore(&n->list_lock, flags); @@ -2546,14 +2571,13 @@ static void put_cpu_partial(struct kmem_ struct page *page_to_unfreeze = NULL; unsigned long flags; int pages = 0; - int pobjects = 0; local_lock_irqsave(&s->cpu_slab->lock, flags); oldpage = this_cpu_read(s->cpu_slab->partial); if (oldpage) { - if (drain && oldpage->pobjects > slub_cpu_partial(s)) { + if (drain && oldpage->pages >= s->cpu_partial_pages) { /* * Partial array is full. Move the existing set to the * per node partial list. Postpone the actual unfreezing @@ -2562,16 +2586,13 @@ static void put_cpu_partial(struct kmem_ page_to_unfreeze = oldpage; oldpage = NULL; } else { - pobjects = oldpage->pobjects; pages = oldpage->pages; } } pages++; - pobjects += page->objects - page->inuse; page->pages = pages; - page->pobjects = pobjects; page->next = oldpage; this_cpu_write(s->cpu_slab->partial, page); @@ -3991,6 +4012,8 @@ static void set_min_partial(struct kmem_ static void set_cpu_partial(struct kmem_cache *s) { #ifdef CONFIG_SLUB_CPU_PARTIAL + unsigned int nr_objects; + /* * cpu_partial determined the maximum number of objects kept in the * per cpu partial lists of a processor. @@ -4000,24 +4023,22 @@ static void set_cpu_partial(struct kmem_ * filled up again with minimal effort. 
The slab will never hit the * per node partial lists and therefore no locking will be required. * - * This setting also determines - * - * A) The number of objects from per cpu partial slabs dumped to the - * per node list when we reach the limit. - * B) The number of objects in cpu partial slabs to extract from the - * per node list when we run out of per cpu objects. We only fetch - * 50% to keep some capacity around for frees. + * For backwards compatibility reasons, this is determined as number + * of objects, even though we now limit maximum number of pages, see + * slub_set_cpu_partial() */ if (!kmem_cache_has_cpu_partial(s)) - slub_set_cpu_partial(s, 0); + nr_objects = 0; else if (s->size >= PAGE_SIZE) - slub_set_cpu_partial(s, 2); + nr_objects = 2; else if (s->size >= 1024) - slub_set_cpu_partial(s, 6); + nr_objects = 6; else if (s->size >= 256) - slub_set_cpu_partial(s, 13); + nr_objects = 13; else - slub_set_cpu_partial(s, 30); + nr_objects = 30; + + slub_set_cpu_partial(s, nr_objects); #endif } @@ -5392,7 +5413,12 @@ SLAB_ATTR(min_partial); static ssize_t cpu_partial_show(struct kmem_cache *s, char *buf) { - return sysfs_emit(buf, "%u\n", slub_cpu_partial(s)); + unsigned int nr_partial = 0; +#ifdef CONFIG_SLUB_CPU_PARTIAL + nr_partial = s->cpu_partial; +#endif + + return sysfs_emit(buf, "%u\n", nr_partial); } static ssize_t cpu_partial_store(struct kmem_cache *s, const char *buf, @@ -5463,12 +5489,12 @@ static ssize_t slabs_cpu_partial_show(st page = slub_percpu_partial(per_cpu_ptr(s->cpu_slab, cpu)); - if (page) { + if (page) pages += page->pages; - objects += page->pobjects; - } } + /* Approximate half-full pages , see slub_set_cpu_partial() */ + objects = (pages * oo_objects(s->oo)) / 2; len += sysfs_emit_at(buf, len, "%d(%d)", objects, pages); #ifdef CONFIG_SMP @@ -5476,9 +5502,12 @@ static ssize_t slabs_cpu_partial_show(st struct page *page; page = slub_percpu_partial(per_cpu_ptr(s->cpu_slab, cpu)); - if (page) + if (page) { + pages = 
READ_ONCE(page->pages); + objects = (pages * oo_objects(s->oo)) / 2; len += sysfs_emit_at(buf, len, " C%d=%d(%d)", - cpu, page->pobjects, page->pages); + cpu, objects, pages); + } } #endif len += sysfs_emit_at(buf, len, "\n"); From patchwork Fri Nov 5 20:35:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605391 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 08FE2C433F5 for ; Fri, 5 Nov 2021 20:35:24 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A83E1611EE for ; Fri, 5 Nov 2021 20:35:23 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org A83E1611EE Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 462C794000F; Fri, 5 Nov 2021 16:35:23 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 41226940007; Fri, 5 Nov 2021 16:35:23 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 328FF94000F; Fri, 5 Nov 2021 16:35:23 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0126.hostedemail.com [216.40.44.126]) by kanga.kvack.org (Postfix) with ESMTP id 21BD4940007 for ; Fri, 5 Nov 2021 16:35:23 -0400 (EDT) Received: from smtpin25.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id D9FBD18233660 for ; Fri, 5 Nov 2021 20:35:22 +0000 (UTC) X-FDA: 78776031684.25.24B749D Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf23.hostedemail.com 
(Postfix) with ESMTP id 9D4199000381 for ; Fri, 5 Nov 2021 20:35:09 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 4002B611C0; Fri, 5 Nov 2021 20:35:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144521; bh=BFrRHDItB033KGMI0DsbXXAPXn9nInuCSxGMm1myxbE=; h=Date:From:To:Subject:In-Reply-To:From; b=EXlsT492wsfhEbZ0AHTE9uYvWR6tXaHE8ux+ob+fhZCBXdmSz+evmGHDv8V+XWhEp e65KzXtme/bJUxW96oF3Idoh/Xi4HKVdP4aSTjkfXBZcL5r2mJFA3JzG3pggxzlMry k8gV3EpLwGVqZTcu+sF8f3ibmn91kjVkLargTAe0= Date: Fri, 05 Nov 2021 13:35:20 -0700 From: Andrew Morton To: akpm@linux-foundation.org, cl@linux.com, guro@fb.com, iamjoonsoo.kim@lge.com, jannh@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, penberg@kernel.org, rientjes@google.com, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 015/262] mm/slub: increase default cpu partial list sizes Message-ID: <20211105203520.Q1wOR3Iwr%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 9D4199000381 X-Stat-Signature: o8r3e31ufjhjs957d3pxknoom3bsbh9j Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=EXlsT492; spf=pass (imf23.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144509-589868 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Vlastimil Babka Subject: mm/slub: increase default cpu partial list sizes The defaults are determined based on object size and can go up to 30 for objects smaller than 256 bytes. Before the previous patch changed the accounting, this could have made cpu partial list contain up to 30 pages. 
After that patch, it could contain only up to 2 pages with default allocation order. Very short lists limit the usefulness of the whole concept of cpu partial lists, so this patch aims at a more reasonable default under the new accounting. The defaults are quadrupled, except for object size >= PAGE_SIZE where it's doubled. This makes the lists grow up to 10 pages in practice. A quick test of booting a kernel under virtme with 4GB RAM and 8 vcpus shows the following slab memory usage after boot: Before previous patch (using page->pobjects): Slab: 36732 kB SReclaimable: 14836 kB SUnreclaim: 21896 kB After previous patch (using page->pages): Slab: 34720 kB SReclaimable: 13716 kB SUnreclaim: 21004 kB After this patch (using page->pages, higher defaults): Slab: 35252 kB SReclaimable: 13944 kB SUnreclaim: 21308 kB In the same setup, I also ran 5 times: hackbench -l 16000 -g 16 Differences in time were in the noise, but we can compare slub stats as given by slabinfo -r skbuff_head_cache (the other cache heavily used by hackbench, kmalloc-cg-512 looks similar). Negligible stats left out for brevity.
Before previous patch (using page->pobjects): Objects: 1408, Memory Total: 401408 Used : 304128 Slab Perf Counter Alloc Free %Al %Fr -------------------------------------------------- Fastpath 469952498 5946606 91 1 Slowpath 42053573 506059465 8 98 Page Alloc 41093 41044 0 0 Add partial 18 21229327 0 4 Remove partial 20039522 36051 3 0 Cpu partial list 4686640 24767229 0 4 RemoteObj/SlabFrozen 16 124027841 0 24 Total 512006071 512006071 Flushes 18 Slab Deactivation Occurrences % ------------------------------------------------- Slab empty 4993 0% Deactivation bypass 24767229 99% Refilled from foreign frees 21972674 88% After previous patch (using page->pages): Objects: 480, Memory Total: 131072 Used : 103680 Slab Perf Counter Alloc Free %Al %Fr -------------------------------------------------- Fastpath 473016294 5405653 92 1 Slowpath 38989777 506600418 7 98 Page Alloc 32717 32701 0 0 Add partial 3 22749164 0 4 Remove partial 11371127 32474 2 0 Cpu partial list 11686226 23090059 2 4 RemoteObj/SlabFrozen 2 67541803 0 13 Total 512006071 512006071 Flushes 3 Slab Deactivation Occurrences % ------------------------------------------------- Slab empty 227 0% Deactivation bypass 23090059 99% Refilled from foreign frees 27585695 119% After this patch (using page->pages, higher defaults): Objects: 896, Memory Total: 229376 Used : 193536 Slab Perf Counter Alloc Free %Al %Fr -------------------------------------------------- Fastpath 473799295 4980278 92 0 Slowpath 38206776 507025793 7 99 Page Alloc 32295 32267 0 0 Add partial 11 23291143 0 4 Remove partial 5815764 31278 1 0 Cpu partial list 18119280 23967320 3 4 RemoteObj/SlabFrozen 10 76974794 0 15 Total 512006071 512006071 Flushes 11 Slab Deactivation Occurrences % ------------------------------------------------- Slab empty 989 0% Deactivation bypass 23967320 99% Refilled from foreign frees 32358473 135% As expected, memory usage dropped significantly with change of accounting, increasing the defaults increased it, but 
not as much. The number of page allocation/frees dropped significantly with the new accounting, but didn't increase with the higher defaults. Interestingly, the number of fastpath allocations increased, as well as allocations from the cpu partial list, even though it's shorter. Link: https://lkml.kernel.org/r/20211012134651.11258-2-vbabka@suse.cz Signed-off-by: Vlastimil Babka Cc: Christoph Lameter Cc: David Rientjes Cc: Jann Horn Cc: Joonsoo Kim Cc: Pekka Enberg Cc: Roman Gushchin Signed-off-by: Andrew Morton --- mm/slub.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) --- a/mm/slub.c~mm-slub-increase-default-cpu-partial-list-sizes +++ a/mm/slub.c @@ -4030,13 +4030,13 @@ static void set_cpu_partial(struct kmem_ if (!kmem_cache_has_cpu_partial(s)) nr_objects = 0; else if (s->size >= PAGE_SIZE) - nr_objects = 2; - else if (s->size >= 1024) nr_objects = 6; + else if (s->size >= 1024) + nr_objects = 24; else if (s->size >= 256) - nr_objects = 13; + nr_objects = 52; else - nr_objects = 30; + nr_objects = 120; slub_set_cpu_partial(s, nr_objects); #endif From patchwork Fri Nov 5 20:35:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605393 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 07A27C433F5 for ; Fri, 5 Nov 2021 20:35:27 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id AE9AC6125F for ; Fri, 5 Nov 2021 20:35:26 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org AE9AC6125F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org
(Postfix) id 54E90940010; Fri, 5 Nov 2021 16:35:26 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 4FD63940007; Fri, 5 Nov 2021 16:35:26 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 445D2940010; Fri, 5 Nov 2021 16:35:26 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0194.hostedemail.com [216.40.44.194]) by kanga.kvack.org (Postfix) with ESMTP id 33421940007 for ; Fri, 5 Nov 2021 16:35:26 -0400 (EDT) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id D9D971850785C for ; Fri, 5 Nov 2021 20:35:25 +0000 (UTC) X-FDA: 78776031810.21.89416AE Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf06.hostedemail.com (Postfix) with ESMTP id 66A73801A8AE for ; Fri, 5 Nov 2021 20:35:25 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 6F01F611EE; Fri, 5 Nov 2021 20:35:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144524; bh=tbBuV/n/ElbD/PPaV5EnZhU0chfTNAvsAxLFDxCDgms=; h=Date:From:To:Subject:In-Reply-To:From; b=xcGqaprqW3DECF2kihUk3FDHeh/7zHtJGs4JF8nBfuBvaKJf1S1Akb6U8T/ZvHJKj HOCqCkXe6rXbQIEDDrTwpHAjOkDVPQaMAXGhhMWyKAFkeBfZzOmfKJIUgHwW2RKyYo CCOnRF/4AYvD4mMWHeB0OP17PEnjzyKr64iHiS0s= Date: Fri, 05 Nov 2021 13:35:24 -0700 From: Andrew Morton To: 42.hyeyoo@gmail.com, akpm@linux-foundation.org, cl@linux.com, iamjoonsoo.kim@lge.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, penberg@kernel.org, rientjes@google.com, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 016/262] mm, slub: use prefetchw instead of prefetch Message-ID: <20211105203524.9wSyvNnow%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf06.hostedemail.com; dkim=pass 
header.d=linux-foundation.org header.s=korg header.b=xcGqaprq; dmarc=none; spf=pass (imf06.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 66A73801A8AE X-Stat-Signature: wbanadagmdbwt1ufbukmcoux7upoqnh1 X-HE-Tag: 1636144525-958531 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Hyeonggon Yoo <42.hyeyoo@gmail.com> Subject: mm, slub: use prefetchw instead of prefetch Commit 0ad9500e16fe ("slub: prefetch next freelist pointer in slab_alloc()") introduced prefetch_freepointer() because when other cpu(s) free objects into a page that the current cpu owns, the freelist link is hot on the cpu(s) which freed the objects and possibly very cold on the current cpu. But if the freelist link chain is hot on the cpu(s) which freed the objects, it's better to invalidate that chain because they're not going to access it again within a short time. So use prefetchw instead of prefetch. On supported architectures like x86 and arm, it invalidates other copied instances of a cache line when prefetching it.
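For readers less familiar with write-intent prefetching, the idea above can be sketched in userspace roughly as follows. This is only an illustration, not the kernel code: sketch_prefetchw(), struct sketch_cache, sketch_get_freepointer() and sketch_alloc() are hypothetical names, and the sketch assumes GCC or Clang, whose __builtin_prefetch() takes a second argument of 1 to request the line in anticipation of a write.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's prefetchw(); assumes GCC or Clang. */
static inline void sketch_prefetchw(const void *addr)
{
	/* Second argument 1 = prefetch in anticipation of a write. */
	__builtin_prefetch(addr, 1);
}

/* Hypothetical cache: each free object stores the next pointer at 'offset'. */
struct sketch_cache {
	size_t offset;	/* byte offset of the free pointer inside an object */
	void *freelist;	/* first free object, or NULL */
};

/* Read the free pointer embedded in a free object. */
static void *sketch_get_freepointer(const struct sketch_cache *s, void *object)
{
	assert(object != NULL);
	return *(void **)((char *)object + s->offset);
}

/* Pop one object and prefetch the next object's free pointer for write. */
static void *sketch_alloc(struct sketch_cache *s)
{
	void *object = s->freelist;

	if (!object)
		return NULL;

	s->freelist = sketch_get_freepointer(s, object);
	if (s->freelist)
		sketch_prefetchw((char *)s->freelist + s->offset);
	return object;
}
```

The prefetch is purely a hint: the sketch behaves the same without it, which is why the change in the patch can only show up as a performance difference such as the one measured below.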
Before: Time: 91.677 Performance counter stats for 'hackbench -g 100 -l 10000': 1462938.07 msec cpu-clock # 15.908 CPUs utilized 18072550 context-switches # 12.354 K/sec 1018814 cpu-migrations # 696.416 /sec 104558 page-faults # 71.471 /sec 1580035699271 cycles # 1.080 GHz (54.51%) 2003670016013 instructions # 1.27 insn per cycle (54.31%) 5702204863 branch-misses (54.28%) 643368500985 cache-references # 439.778 M/sec (54.26%) 18475582235 cache-misses # 2.872 % of all cache refs (54.28%) 642206796636 L1-dcache-loads # 438.984 M/sec (46.87%) 18215813147 L1-dcache-load-misses # 2.84% of all L1-dcache accesses (46.83%) 653842996501 dTLB-loads # 446.938 M/sec (46.63%) 3227179675 dTLB-load-misses # 0.49% of all dTLB cache accesses (46.85%) 537531951350 iTLB-loads # 367.433 M/sec (54.33%) 114750630 iTLB-load-misses # 0.02% of all iTLB cache accesses (54.37%) 630135543177 L1-icache-loads # 430.733 M/sec (46.80%) 22923237620 L1-icache-load-misses # 3.64% of all L1-icache accesses (46.76%) 91.964452802 seconds time elapsed 43.416742000 seconds user 1422.441123000 seconds sys After: Time: 90.220 Performance counter stats for 'hackbench -g 100 -l 10000': 1437418.48 msec cpu-clock # 15.880 CPUs utilized 17694068 context-switches # 12.310 K/sec 958257 cpu-migrations # 666.651 /sec 100604 page-faults # 69.989 /sec 1583259429428 cycles # 1.101 GHz (54.57%) 2004002484935 instructions # 1.27 insn per cycle (54.37%) 5594202389 branch-misses (54.36%) 643113574524 cache-references # 447.409 M/sec (54.39%) 18233791870 cache-misses # 2.835 % of all cache refs (54.37%) 640205852062 L1-dcache-loads # 445.386 M/sec (46.75%) 17968160377 L1-dcache-load-misses # 2.81% of all L1-dcache accesses (46.79%) 651747432274 dTLB-loads # 453.415 M/sec (46.59%) 3127124271 dTLB-load-misses # 0.48% of all dTLB cache accesses (46.75%) 535395273064 iTLB-loads # 372.470 M/sec (54.38%) 113500056 iTLB-load-misses # 0.02% of all iTLB cache accesses (54.35%) 628871845924 L1-icache-loads # 437.501 M/sec (46.80%) 
22585641203 L1-icache-load-misses # 3.59% of all L1-icache accesses (46.79%) 90.514819303 seconds time elapsed 43.877656000 seconds user 1397.176001000 seconds sys Link: https://lkml.org/lkml/2021/10/8/598 Link: https://lkml.kernel.org/r/20211011144331.70084-1-42.hyeyoo@gmail.com Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Acked-by: Vlastimil Babka Cc: Christoph Lameter Cc: Pekka Enberg Cc: David Rientjes Cc: Joonsoo Kim Signed-off-by: Andrew Morton --- mm/slub.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/slub.c~mm-slub-use-prefetchw-instead-of-prefetch +++ a/mm/slub.c @@ -354,7 +354,7 @@ static inline void *get_freepointer(stru static void prefetch_freepointer(const struct kmem_cache *s, void *object) { - prefetch(object + s->offset); + prefetchw(object + s->offset); } static inline void *get_freepointer_safe(struct kmem_cache *s, void *object) From patchwork Fri Nov 5 20:35:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605397 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F17E8C433EF for ; Fri, 5 Nov 2021 20:35:30 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A800B61262 for ; Fri, 5 Nov 2021 20:35:30 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org A800B61262 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 6AC1A940012; Fri, 5 Nov 2021 16:35:29 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 65E28940007; Fri, 5 Nov 2021 16:35:29 -0400 (EDT) X-Delivered-To:
int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 526BF940012; Fri, 5 Nov 2021 16:35:29 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0089.hostedemail.com [216.40.44.89]) by kanga.kvack.org (Postfix) with ESMTP id 41B0C940007 for ; Fri, 5 Nov 2021 16:35:29 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id EC03218233660 for ; Fri, 5 Nov 2021 20:35:28 +0000 (UTC) X-FDA: 78776031936.29.5B6D410 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf18.hostedemail.com (Postfix) with ESMTP id 873BD4002085 for ; Fri, 5 Nov 2021 20:35:28 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 83F3D611C0; Fri, 5 Nov 2021 20:35:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144527; bh=rwqbZ7Jw7ad1fBjerU310bTskbYP0lg49TmFynugLVI=; h=Date:From:To:Subject:In-Reply-To:From; b=19F2cLj/MDo6vNgdKRTKiZNN48aR/yBIFv2kS54SM4699UdAtaj96qDN/czScBGT5 gfrhKVMBiqR0nI9KJ87wRba8y8PNDu2J/5qHk6vROgi4LEuIiwwrsAbUj5DgkXzdOJ 61eZ54V71snpFH2fDg3EKSP3z5rUWyiA6IgI/2wk= Date: Fri, 05 Nov 2021 13:35:27 -0700 From: Andrew Morton To: akpm@linux-foundation.org, bigeasy@linutronix.de, david@redhat.com, linux-mm@kvack.org, mgorman@techsingularity.net, mm-commits@vger.kernel.org, peterz@infradead.org, tglx@linutronix.de, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 017/262] mm: disable NUMA_BALANCING_DEFAULT_ENABLED and TRANSPARENT_HUGEPAGE on PREEMPT_RT Message-ID: <20211105203527.xgYSnEg-G%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 873BD4002085 X-Stat-Signature: dzg3pady1kss4ecefe4d8e15wwfud111 Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg 
header.b="19F2cLj/"; dmarc=none; spf=pass (imf18.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144528-463060 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Sebastian Andrzej Siewior Subject: mm: disable NUMA_BALANCING_DEFAULT_ENABLED and TRANSPARENT_HUGEPAGE on PREEMPT_RT TRANSPARENT_HUGEPAGE: There are potential non-deterministic delays to an RT thread if a critical memory region is not THP-aligned and a non-RT buffer is located in the same hugepage-aligned region. It's also possible for an unrelated thread to migrate pages belonging to an RT task incurring unexpected page faults due to memory defragmentation even if khugepaged is disabled. Regular HUGEPAGEs are not affected by this and can be used. NUMA_BALANCING: There is a non-deterministic delay to mark PTEs PROT_NONE to gather NUMA fault samples, increased page faults of regions even if mlocked and non-deterministic delays when migrating pages. [Mel Gorman worded 99% of the commit description].
Link: https://lore.kernel.org/all/20200304091159.GN3818@techsingularity.net/ Link: https://lore.kernel.org/all/20211026165100.ahz5bkx44lrrw5pt@linutronix.de/ Link: https://lkml.kernel.org/r/20211028143327.hfbxjze7palrpfgp@linutronix.de Signed-off-by: Sebastian Andrzej Siewior Acked-by: Mel Gorman Reviewed-by: David Hildenbrand Cc: Vlastimil Babka Cc: Peter Zijlstra Cc: Thomas Gleixner Signed-off-by: Andrew Morton --- init/Kconfig | 2 +- mm/Kconfig | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) --- a/init/Kconfig~mm-disable-numa_balancing_default_enabled-and-transparent_hugepage-on-preempt_rt +++ a/init/Kconfig @@ -901,7 +901,7 @@ config NUMA_BALANCING bool "Memory placement aware NUMA scheduler" depends on ARCH_SUPPORTS_NUMA_BALANCING depends on !ARCH_WANT_NUMA_VARIABLE_LOCALITY - depends on SMP && NUMA && MIGRATION + depends on SMP && NUMA && MIGRATION && !PREEMPT_RT help This option adds support for automatic NUMA aware memory/task placement. The mechanism is quite primitive and is based on migrating memory when --- a/mm/Kconfig~mm-disable-numa_balancing_default_enabled-and-transparent_hugepage-on-preempt_rt +++ a/mm/Kconfig @@ -371,7 +371,7 @@ config NOMMU_INITIAL_TRIM_EXCESS config TRANSPARENT_HUGEPAGE bool "Transparent Hugepage Support" - depends on HAVE_ARCH_TRANSPARENT_HUGEPAGE + depends on HAVE_ARCH_TRANSPARENT_HUGEPAGE && !PREEMPT_RT select COMPACTION select XARRAY_MULTI help From patchwork Fri Nov 5 20:35:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605399 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F0078C433F5 for ; Fri, 5 Nov 2021 20:35:32 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP 
id A86C3611EE for ; Fri, 5 Nov 2021 20:35:32 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org A86C3611EE Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 2912B940013; Fri, 5 Nov 2021 16:35:32 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 23F32940007; Fri, 5 Nov 2021 16:35:32 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0E0AD940013; Fri, 5 Nov 2021 16:35:32 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0191.hostedemail.com [216.40.44.191]) by kanga.kvack.org (Postfix) with ESMTP id EEC2B940007 for ; Fri, 5 Nov 2021 16:35:31 -0400 (EDT) Received: from smtpin02.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id B6BDE777F4 for ; Fri, 5 Nov 2021 20:35:31 +0000 (UTC) X-FDA: 78776032062.02.F8E450A Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf28.hostedemail.com (Postfix) with ESMTP id 7126090000B4 for ; Fri, 5 Nov 2021 20:35:31 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 8C90E6125F; Fri, 5 Nov 2021 20:35:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144530; bh=A2CJN584gHMdu1tguBc0hQwk713kctjwbfDr8H1GB68=; h=Date:From:To:Subject:In-Reply-To:From; b=BXrM17EjaEgi1vR3sdL2nyJ/yvo4K5S7JTsM6WCaOcJgm8N/agyRJTKE6QsVuqmNF mgfKHKxs4Piqt5FIaOkx2+ziVRwYIsTL1G+PKyblK9ZlUD41rTUJtLlMTp56lbPVUZ avM+NLwQZop33L1MaGQoMmvX87PPkm1XHZdvruYQ= Date: Fri, 05 Nov 2021 13:35:30 -0700 From: Andrew Morton To: akpm@linux-foundation.org, dan.j.williams@intel.com, hch@lst.de, linux-mm@kvack.org, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, torvalds@linux-foundation.org Subject: [patch 018/262] mm: don't 
include in Message-ID: <20211105203530.rAInkI_6I%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=BXrM17Ej; dmarc=none; spf=pass (imf28.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 7126090000B4 X-Stat-Signature: ybtpnpu6oms8yq61ucnb6gofjaoqnswk X-HE-Tag: 1636144531-638207 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Christoph Hellwig Subject: mm: don't include in Not required at all, and having this causes a huge kernel rebuild as soon as something in dax.h changes. Link: https://lkml.kernel.org/r/20210921082253.1859794-1-hch@lst.de Signed-off-by: Christoph Hellwig Reviewed-by: Naoya Horiguchi Reviewed-by: Dan Williams Signed-off-by: Andrew Morton --- include/linux/mempolicy.h | 1 - mm/memory-failure.c | 1 + 2 files changed, 1 insertion(+), 1 deletion(-) --- a/include/linux/mempolicy.h~mm-dont-include-linux-daxh-in-linux-mempolicyh +++ a/include/linux/mempolicy.h @@ -8,7 +8,6 @@ #include #include -#include #include #include #include --- a/mm/memory-failure.c~mm-dont-include-linux-daxh-in-linux-mempolicyh +++ a/mm/memory-failure.c @@ -39,6 +39,7 @@ #include #include #include +#include #include #include #include From patchwork Fri Nov 5 20:35:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605401 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with 
ESMTP id 52054C433F5 for ; Fri, 5 Nov 2021 20:35:38 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 0AE68611EE for ; Fri, 5 Nov 2021 20:35:38 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 0AE68611EE Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id A2EDD940014; Fri, 5 Nov 2021 16:35:37 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9DEA7940007; Fri, 5 Nov 2021 16:35:37 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 91B87940014; Fri, 5 Nov 2021 16:35:37 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0079.hostedemail.com [216.40.44.79]) by kanga.kvack.org (Postfix) with ESMTP id 837B7940007 for ; Fri, 5 Nov 2021 16:35:37 -0400 (EDT) Received: from smtpin17.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id EF89B7798A for ; Fri, 5 Nov 2021 20:35:36 +0000 (UTC) X-FDA: 78776032314.17.4E77252 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf14.hostedemail.com (Postfix) with ESMTP id B9B1E60019B1 for ; Fri, 5 Nov 2021 20:35:35 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id A3ACF61242; Fri, 5 Nov 2021 20:35:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144534; bh=MencArMCGulqOIgXbLG7I3T8o3/dVJOjikBHte+gtVw=; h=Date:From:To:Subject:In-Reply-To:From; b=hpUa3FSOzjmDUXKbhpQAaylvFPmQrOVcAblfgun13e+Q6y5ihMVnQrGqCw17BWplx uOcg/NlPvKOjH9SO/HXOmCofV7ljyJnZuQKarcroRoUVpWTR31WsNh975c12Rwc/Sc p8pjkTL3R4rQrbhAK9AxQ7izUuRJXwaRXqX96kpM= Date: Fri, 05 Nov 2021 13:35:33 -0700 From: Andrew Morton To: akpm@linux-foundation.org, 
andreyknvl@gmail.com, bigeasy@linutronix.de, dvyukov@google.com, elver@google.com, glider@google.com, gustavoars@kernel.org, jiangshanlai@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, ryabinin.a.a@gmail.com, skhan@linuxfoundation.org, tarasmadan@google.com, tglx@linutronix.de, tj@kernel.org, torvalds@linux-foundation.org, vinmenon@codeaurora.org, vjitta@codeaurora.org, walter-zh.wu@mediatek.com Subject: [patch 019/262] lib/stackdepot: include gfp.h Message-ID: <20211105203533.ZUinXe5lF%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=hpUa3FSO; dmarc=none; spf=pass (imf14.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: B9B1E60019B1 X-Stat-Signature: eg4hrty4yf63r4ewpbd935uqejirmn4o X-HE-Tag: 1636144535-935835 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Marco Elver Subject: lib/stackdepot: include gfp.h Patch series "stackdepot, kasan, workqueue: Avoid expanding stackdepot slabs when holding raw_spin_lock", v2. Shuah Khan reported [1]: | When CONFIG_PROVE_RAW_LOCK_NESTING=y and CONFIG_KASAN are enabled, | kasan_record_aux_stack() runs into "BUG: Invalid wait context" when | it tries to allocate memory attempting to acquire spinlock in page | allocation code while holding workqueue pool raw_spinlock. | | There are several instances of this problem when block layer tries | to __queue_work(). 
Call trace from one of these instances is below: | | kblockd_mod_delayed_work_on() | mod_delayed_work_on() | __queue_delayed_work() | __queue_work() (rcu_read_lock, raw_spin_lock pool->lock held) | insert_work() | kasan_record_aux_stack() | kasan_save_stack() | stack_depot_save() | alloc_pages() | __alloc_pages() | get_page_from_freelist() | rm_queue() | rm_queue_pcplist() | local_lock_irqsave(&pagesets.lock, flags); | [ BUG: Invalid wait context triggered ] PROVE_RAW_LOCK_NESTING is pointing out that (on RT kernels) the locking rules are being violated. More generally, memory is being allocated from a non-preemptive context (raw_spin_lock'd c-s) where it is not allowed. To properly fix this, we must prevent stackdepot from replenishing its "stack slab" pool if memory allocations cannot be done in the current context: it's a bug to use either GFP_ATOMIC or GFP_NOWAIT in certain non-preemptive contexts, including raw_spin_locks (see gfp.h and ab00db216c9c7). The only downside is that saving a stack trace may fail if: stackdepot runs out of space AND the same stack trace has not been recorded before. I expect this to be unlikely, and a simple experiment (boot the kernel) didn't result in any failure to record stack trace from insert_work(). The series includes a few minor fixes to stackdepot that I noticed in preparing the series. It then introduces __stack_depot_save(), which exposes the option to force stackdepot to not allocate any memory. Finally, KASAN is changed to use the new stackdepot interface and provide kasan_record_aux_stack_noalloc(), which is then used by workqueue code. [1] https://lkml.kernel.org/r/20210902200134.25603-1-skhan@linuxfoundation.org This patch (of 6): refers to gfp_t, but doesn't include gfp.h. Fix it by including .
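The failure condition described in the cover letter (out of space AND trace not recorded before) can be illustrated with a small userspace model. This is a sketch under stated assumptions, not stackdepot's implementation: struct sketch_depot, sketch_depot_save() and the can_alloc parameter are hypothetical names, a linear array stands in for the real storage, and realloc() stands in for replenishing the pool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_ENTRIES 4

/* Hypothetical record of one stack trace. */
struct sketch_record {
	unsigned long entries[SKETCH_MAX_ENTRIES];
	unsigned int nr_entries;
};

/* Hypothetical depot: a growable array instead of the real hash table. */
struct sketch_depot {
	struct sketch_record *records;
	unsigned int used;
	unsigned int capacity;
};

/* Returns a 1-based handle, or 0 if the trace could not be saved. */
static unsigned int sketch_depot_save(struct sketch_depot *d,
				      const unsigned long *entries,
				      unsigned int nr_entries, bool can_alloc)
{
	unsigned int i;

	assert(nr_entries <= SKETCH_MAX_ENTRIES);

	/* A trace recorded earlier is found without touching the allocator. */
	for (i = 0; i < d->used; i++) {
		const struct sketch_record *r = &d->records[i];

		if (r->nr_entries == nr_entries &&
		    !memcmp(r->entries, entries, nr_entries * sizeof(*entries)))
			return i + 1;
	}

	if (d->used == d->capacity) {
		unsigned int new_cap;
		struct sketch_record *grown;

		/* Out of space: grow only if the caller's context permits it. */
		if (!can_alloc)
			return 0;

		new_cap = d->capacity ? d->capacity * 2 : 1;
		grown = realloc(d->records, new_cap * sizeof(*grown));
		if (!grown)
			return 0;
		d->records = grown;
		d->capacity = new_cap;
	}

	memcpy(d->records[d->used].entries, entries,
	       nr_entries * sizeof(*entries));
	d->records[d->used].nr_entries = nr_entries;
	return ++d->used;
}
```

In this model a caller that must not allocate passes can_alloc = false and still gets a valid handle whenever the trace is already known or spare capacity exists; only the combination of a full depot and a new trace yields 0.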
Link: https://lkml.kernel.org/r/20210913112609.2651084-1-elver@google.com Link: https://lkml.kernel.org/r/20210913112609.2651084-2-elver@google.com Signed-off-by: Marco Elver Tested-by: Shuah Khan Acked-by: Sebastian Andrzej Siewior Reviewed-by: Andrey Konovalov Cc: Tejun Heo Cc: Lai Jiangshan Cc: Walter Wu Cc: Thomas Gleixner Cc: Andrey Ryabinin Cc: Alexander Potapenko Cc: Dmitry Vyukov Cc: Vijayanand Jitta Cc: Vinayak Menon Cc: "Gustavo A. R. Silva" Cc: Taras Madan Signed-off-by: Andrew Morton --- include/linux/stackdepot.h | 2 ++ 1 file changed, 2 insertions(+) --- a/include/linux/stackdepot.h~lib-stackdepot-include-gfph +++ a/include/linux/stackdepot.h @@ -11,6 +11,8 @@ #ifndef _LINUX_STACKDEPOT_H #define _LINUX_STACKDEPOT_H +#include + typedef u32 depot_stack_handle_t; depot_stack_handle_t stack_depot_save(unsigned long *entries, From patchwork Fri Nov 5 20:35:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605403 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id ADCC1C433FE for ; Fri, 5 Nov 2021 20:35:39 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6666C611EE for ; Fri, 5 Nov 2021 20:35:39 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 6666C611EE Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id F39C5940015; Fri, 5 Nov 2021 16:35:38 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id EC079940007; Fri, 5 Nov 2021 16:35:38 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by 
kanga.kvack.org (Postfix, from userid 63042) id DD7E0940015; Fri, 5 Nov 2021 16:35:38 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0157.hostedemail.com [216.40.44.157]) by kanga.kvack.org (Postfix) with ESMTP id C8F86940007 for ; Fri, 5 Nov 2021 16:35:38 -0400 (EDT) Received: from smtpin28.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 8394718523518 for ; Fri, 5 Nov 2021 20:35:38 +0000 (UTC) X-FDA: 78776032356.28.FFBD264 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf03.hostedemail.com (Postfix) with ESMTP id 5E3EC30000B2 for ; Fri, 5 Nov 2021 20:35:31 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 0434561252; Fri, 5 Nov 2021 20:35:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144537; bh=+7UKRVqmEaOgoDhdjdGy+rjCLEa9T0ZjVP5xXx9RZW8=; h=Date:From:To:Subject:In-Reply-To:From; b=muLXTV2y+UG2iY78yME2yAXMk3GrnhRaq6tGbHHdpZJ7W56/L/bAaBVJtjScgTfbU HeTN9CbxoFikxH0AMig1xxTNx8PoTaOIEtlF2Lq2sILHl83V00bVCC5IzJrd+GNR8C 2qbAtD7/7+b+qPaAqeTGp5MxTlRrtIQp+v8vCdTA= Date: Fri, 05 Nov 2021 13:35:36 -0700 From: Andrew Morton To: akpm@linux-foundation.org, andreyknvl@gmail.com, bigeasy@linutronix.de, dvyukov@google.com, elver@google.com, glider@google.com, gustavoars@kernel.org, jiangshanlai@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, ryabinin.a.a@gmail.com, skhan@linuxfoundation.org, tarasmadan@google.com, tglx@linutronix.de, tj@kernel.org, torvalds@linux-foundation.org, vinmenon@codeaurora.org, vjitta@codeaurora.org, walter-zh.wu@mediatek.com Subject: [patch 020/262] lib/stackdepot: remove unused function argument Message-ID: <20211105203536.z6Oj6ZDrj%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 5E3EC30000B2 X-Stat-Signature: 
y5pp9did8kwjcuuz3y9ihrrcdknws36f Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=muLXTV2y; dmarc=none; spf=pass (imf03.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144531-614123 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Marco Elver Subject: lib/stackdepot: remove unused function argument alloc_flags in depot_alloc_stack() is no longer used; remove it. Link: https://lkml.kernel.org/r/20210913112609.2651084-3-elver@google.com Signed-off-by: Marco Elver Tested-by: Shuah Khan Acked-by: Sebastian Andrzej Siewior Reviewed-by: Andrey Konovalov Cc: Alexander Potapenko Cc: Andrey Ryabinin Cc: Dmitry Vyukov Cc: "Gustavo A. R. Silva" Cc: Lai Jiangshan Cc: Taras Madan Cc: Tejun Heo Cc: Thomas Gleixner Cc: Vijayanand Jitta Cc: Vinayak Menon Cc: Walter Wu Signed-off-by: Andrew Morton --- lib/stackdepot.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) --- a/lib/stackdepot.c~lib-stackdepot-remove-unused-function-argument +++ a/lib/stackdepot.c @@ -102,8 +102,8 @@ static bool init_stack_slab(void **preal } /* Allocation of a new stack in raw storage */ -static struct stack_record *depot_alloc_stack(unsigned long *entries, int size, - u32 hash, void **prealloc, gfp_t alloc_flags) +static struct stack_record * +depot_alloc_stack(unsigned long *entries, int size, u32 hash, void **prealloc) { struct stack_record *stack; size_t required_size = struct_size(stack, entries, size); @@ -309,9 +309,8 @@ depot_stack_handle_t stack_depot_save(un found = find_stack(*bucket, entries, nr_entries, hash); if (!found) { - struct stack_record *new = - depot_alloc_stack(entries, nr_entries, - hash, &prealloc, alloc_flags); + struct stack_record *new = depot_alloc_stack(entries, nr_entries, hash, 
&prealloc); + if (new) { new->next = *bucket; /* From patchwork Fri Nov 5 20:35:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605405 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5DAB3C433EF for ; Fri, 5 Nov 2021 20:35:43 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id ECF3E61262 for ; Fri, 5 Nov 2021 20:35:42 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org ECF3E61262 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 8CF2D940016; Fri, 5 Nov 2021 16:35:42 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 85817940007; Fri, 5 Nov 2021 16:35:42 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6E0E5940017; Fri, 5 Nov 2021 16:35:42 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0156.hostedemail.com [216.40.44.156]) by kanga.kvack.org (Postfix) with ESMTP id 5D809940016 for ; Fri, 5 Nov 2021 16:35:42 -0400 (EDT) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 21EDD18527CF7 for ; Fri, 5 Nov 2021 20:35:42 +0000 (UTC) X-FDA: 78776032524.18.F816ABD Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf08.hostedemail.com (Postfix) with ESMTP id D183E30000BA for ; Fri, 5 Nov 2021 20:35:29 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 554EB611EE; Fri, 5 Nov 2021 20:35:40 +0000 (UTC) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144540; bh=54yLhZ4FdY7G6NZnyqkmhy/fP815gr1eOuvYTayGcj0=; h=Date:From:To:Subject:In-Reply-To:From; b=xcxsLCinNDloHvX78B7ljRsDROIdTyfgdarYFJbjheybPVbWbD9yVVLM+iVSLU2Zw VFX2KNKu07LoJQjF2ZHn2q2sxCwlcOyMIT0C3sf7i8dGkmIAk4K387J4iwH9laeU4S QC2dAIkCsszqekWhb1wbYNgUmeke9EfJfH+Ikqy4= Date: Fri, 05 Nov 2021 13:35:39 -0700 From: Andrew Morton To: akpm@linux-foundation.org, andreyknvl@gmail.com, bigeasy@linutronix.de, dvyukov@google.com, elver@google.com, glider@google.com, gustavoars@kernel.org, jiangshanlai@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, ryabinin.a.a@gmail.com, skhan@linuxfoundation.org, tarasmadan@google.com, tglx@linutronix.de, tj@kernel.org, torvalds@linux-foundation.org, vinmenon@codeaurora.org, vjitta@codeaurora.org, walter-zh.wu@mediatek.com Subject: [patch 021/262] lib/stackdepot: introduce __stack_depot_save() Message-ID: <20211105203539.x75nvkjw2%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=xcxsLCin; dmarc=none; spf=pass (imf08.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: D183E30000BA X-Stat-Signature: ho4zttwf57usur15bmwu3p78ixej9ubr X-HE-Tag: 1636144529-449401 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Marco Elver Subject: lib/stackdepot: introduce __stack_depot_save() Add __stack_depot_save(), which provides more fine-grained control over stackdepot's memory allocation behaviour, in case stackdepot runs out of "stack slabs". 
Normally stackdepot uses alloc_pages() in case it runs out of space; passing can_alloc==false to __stack_depot_save() prohibits this, at the cost of more likely failure to record a stack trace. Link: https://lkml.kernel.org/r/20210913112609.2651084-4-elver@google.com Signed-off-by: Marco Elver Tested-by: Shuah Khan Acked-by: Sebastian Andrzej Siewior Reviewed-by: Andrey Konovalov Cc: Alexander Potapenko Cc: Andrey Ryabinin Cc: Dmitry Vyukov Cc: "Gustavo A. R. Silva" Cc: Lai Jiangshan Cc: Taras Madan Cc: Tejun Heo Cc: Thomas Gleixner Cc: Vijayanand Jitta Cc: Vinayak Menon Cc: Walter Wu Signed-off-by: Andrew Morton --- include/linux/stackdepot.h | 4 +++ lib/stackdepot.c | 43 ++++++++++++++++++++++++++++++----- 2 files changed, 41 insertions(+), 6 deletions(-) --- a/include/linux/stackdepot.h~lib-stackdepot-introduce-__stack_depot_save +++ a/include/linux/stackdepot.h @@ -15,6 +15,10 @@ typedef u32 depot_stack_handle_t; +depot_stack_handle_t __stack_depot_save(unsigned long *entries, + unsigned int nr_entries, + gfp_t gfp_flags, bool can_alloc); + depot_stack_handle_t stack_depot_save(unsigned long *entries, unsigned int nr_entries, gfp_t gfp_flags); --- a/lib/stackdepot.c~lib-stackdepot-introduce-__stack_depot_save +++ a/lib/stackdepot.c @@ -248,17 +248,28 @@ unsigned int stack_depot_fetch(depot_sta EXPORT_SYMBOL_GPL(stack_depot_fetch); /** - * stack_depot_save - Save a stack trace from an array + * __stack_depot_save - Save a stack trace from an array * * @entries: Pointer to storage array * @nr_entries: Size of the storage array * @alloc_flags: Allocation gfp flags + * @can_alloc: Allocate stack slabs (increased chance of failure if false) + * + * Saves a stack trace from @entries array of size @nr_entries. If @can_alloc is + * %true, is allowed to replenish the stack slab pool in case no space is left + * (allocates using GFP flags of @alloc_flags). If @can_alloc is %false, avoids + * any allocations and will fail if no space is left to store the stack trace. 
+ * + * Context: Any context, but setting @can_alloc to %false is required if + * alloc_pages() cannot be used from the current context. Currently + * this is the case from contexts where neither %GFP_ATOMIC nor + * %GFP_NOWAIT can be used (NMI, raw_spin_lock). * - * Return: The handle of the stack struct stored in depot + * Return: The handle of the stack struct stored in depot, 0 on failure. */ -depot_stack_handle_t stack_depot_save(unsigned long *entries, - unsigned int nr_entries, - gfp_t alloc_flags) +depot_stack_handle_t __stack_depot_save(unsigned long *entries, + unsigned int nr_entries, + gfp_t alloc_flags, bool can_alloc) { struct stack_record *found = NULL, **bucket; depot_stack_handle_t retval = 0; @@ -291,7 +302,7 @@ depot_stack_handle_t stack_depot_save(un * The smp_load_acquire() here pairs with smp_store_release() to * |next_slab_inited| in depot_alloc_stack() and init_stack_slab(). */ - if (unlikely(!smp_load_acquire(&next_slab_inited))) { + if (unlikely(can_alloc && !smp_load_acquire(&next_slab_inited))) { /* * Zero out zone modifiers, as we don't have specific zone * requirements. Keep the flags related to allocation in atomic @@ -339,6 +350,26 @@ exit: fast_exit: return retval; } +EXPORT_SYMBOL_GPL(__stack_depot_save); + +/** + * stack_depot_save - Save a stack trace from an array + * + * @entries: Pointer to storage array + * @nr_entries: Size of the storage array + * @alloc_flags: Allocation gfp flags + * + * Context: Contexts where allocations via alloc_pages() are allowed. + * See __stack_depot_save() for more details. + * + * Return: The handle of the stack struct stored in depot, 0 on failure. 
+ */ +depot_stack_handle_t stack_depot_save(unsigned long *entries, + unsigned int nr_entries, + gfp_t alloc_flags) +{ + return __stack_depot_save(entries, nr_entries, alloc_flags, true); +} EXPORT_SYMBOL_GPL(stack_depot_save); static inline int in_irqentry_text(unsigned long ptr) From patchwork Fri Nov 5 20:35:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605407 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8D9EBC433F5 for ; Fri, 5 Nov 2021 20:35:46 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 3E7C961252 for ; Fri, 5 Nov 2021 20:35:46 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 3E7C961252 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id D62D1940017; Fri, 5 Nov 2021 16:35:45 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D12E0940007; Fri, 5 Nov 2021 16:35:45 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C031E940017; Fri, 5 Nov 2021 16:35:45 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0220.hostedemail.com [216.40.44.220]) by kanga.kvack.org (Postfix) with ESMTP id B19E4940007 for ; Fri, 5 Nov 2021 16:35:45 -0400 (EDT) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 623C48249980 for ; Fri, 5 Nov 2021 20:35:45 +0000 (UTC) X-FDA: 78776032650.11.6F0D702 Received: from mail.kernel.org 
(mail.kernel.org [198.145.29.99]) by imf19.hostedemail.com (Postfix) with ESMTP id A16A9B0000A5 for ; Fri, 5 Nov 2021 20:35:37 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id B4A1C61262; Fri, 5 Nov 2021 20:35:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144544; bh=mC+rC11iZXaH81twid8kOtXT8SB99zVqXQ9S6eCijb0=; h=Date:From:To:Subject:In-Reply-To:From; b=v5g7g8NLGAJ7sOADvKnLEJR+ODRIg5NtjbJrKmGiN2xrRZZLWijAri5aCs4KaXyvu Bk0bk/BFfuwdrbR5nEkOr/aFS6+keT9TIPoJZJF7Me6iteAPJCkwbQEAoPCIQx9vvm CHfWtn1Yn14pTAB3JvQQK2T6jLOcMqqthf6XMTRE= Date: Fri, 05 Nov 2021 13:35:43 -0700 From: Andrew Morton To: akpm@linux-foundation.org, andreyknvl@gmail.com, bigeasy@linutronix.de, dvyukov@google.com, elver@google.com, glider@google.com, gustavoars@kernel.org, jiangshanlai@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, ryabinin.a.a@gmail.com, skhan@linuxfoundation.org, tarasmadan@google.com, tglx@linutronix.de, tj@kernel.org, torvalds@linux-foundation.org, vinmenon@codeaurora.org, vjitta@codeaurora.org, walter-zh.wu@mediatek.com Subject: [patch 022/262] kasan: common: provide can_alloc in kasan_save_stack() Message-ID: <20211105203543.HfM3mZlSs%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: A16A9B0000A5 X-Stat-Signature: 6auy5imdxbz7iygj8bxe8czup5caihrb Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=v5g7g8NL; dmarc=none; spf=pass (imf19.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144537-373660 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Marco Elver Subject: kasan: common: provide 
can_alloc in kasan_save_stack() Add another argument, can_alloc, to kasan_save_stack() which is passed as-is to __stack_depot_save(). No functional change intended. Link: https://lkml.kernel.org/r/20210913112609.2651084-5-elver@google.com Signed-off-by: Marco Elver Tested-by: Shuah Khan Acked-by: Sebastian Andrzej Siewior Reviewed-by: Andrey Konovalov Cc: Alexander Potapenko Cc: Andrey Ryabinin Cc: Dmitry Vyukov Cc: "Gustavo A. R. Silva" Cc: Lai Jiangshan Cc: Taras Madan Cc: Tejun Heo Cc: Thomas Gleixner Cc: Vijayanand Jitta Cc: Vinayak Menon Cc: Walter Wu Signed-off-by: Andrew Morton --- mm/kasan/common.c | 6 +++--- mm/kasan/generic.c | 2 +- mm/kasan/kasan.h | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) --- a/mm/kasan/common.c~kasan-common-provide-can_alloc-in-kasan_save_stack +++ a/mm/kasan/common.c @@ -30,20 +30,20 @@ #include "kasan.h" #include "../slab.h" -depot_stack_handle_t kasan_save_stack(gfp_t flags) +depot_stack_handle_t kasan_save_stack(gfp_t flags, bool can_alloc) { unsigned long entries[KASAN_STACK_DEPTH]; unsigned int nr_entries; nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 0); nr_entries = filter_irq_stacks(entries, nr_entries); - return stack_depot_save(entries, nr_entries, flags); + return __stack_depot_save(entries, nr_entries, flags, can_alloc); } void kasan_set_track(struct kasan_track *track, gfp_t flags) { track->pid = current->pid; - track->stack = kasan_save_stack(flags); + track->stack = kasan_save_stack(flags, true); } #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS) --- a/mm/kasan/generic.c~kasan-common-provide-can_alloc-in-kasan_save_stack +++ a/mm/kasan/generic.c @@ -345,7 +345,7 @@ void kasan_record_aux_stack(void *addr) return; alloc_meta->aux_stack[1] = alloc_meta->aux_stack[0]; - alloc_meta->aux_stack[0] = kasan_save_stack(GFP_NOWAIT); + alloc_meta->aux_stack[0] = kasan_save_stack(GFP_NOWAIT, true); } void kasan_set_free_info(struct kmem_cache *cache, --- 
a/mm/kasan/kasan.h~kasan-common-provide-can_alloc-in-kasan_save_stack +++ a/mm/kasan/kasan.h @@ -251,7 +251,7 @@ void kasan_report_invalid_free(void *obj struct page *kasan_addr_to_page(const void *addr); -depot_stack_handle_t kasan_save_stack(gfp_t flags); +depot_stack_handle_t kasan_save_stack(gfp_t flags, bool can_alloc); void kasan_set_track(struct kasan_track *track, gfp_t flags); void kasan_set_free_info(struct kmem_cache *cache, void *object, u8 tag); struct kasan_track *kasan_get_free_track(struct kmem_cache *cache, From patchwork Fri Nov 5 20:35:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605409 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E5C32C433FE for ; Fri, 5 Nov 2021 20:35:49 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 8A176611C0 for ; Fri, 5 Nov 2021 20:35:49 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 8A176611C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 2B4F3940018; Fri, 5 Nov 2021 16:35:49 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 2673C940007; Fri, 5 Nov 2021 16:35:49 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 17AE9940018; Fri, 5 Nov 2021 16:35:49 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0006.hostedemail.com [216.40.44.6]) by kanga.kvack.org (Postfix) with ESMTP id 09430940007 for ; Fri, 5 Nov 2021 16:35:49 -0400 (EDT) Received: from 
smtpin17.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id C9D707530E for ; Fri, 5 Nov 2021 20:35:48 +0000 (UTC) X-FDA: 78776032818.17.14C4F48 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf11.hostedemail.com (Postfix) with ESMTP id 5ED73F000201 for ; Fri, 5 Nov 2021 20:35:48 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 1C0DA611EE; Fri, 5 Nov 2021 20:35:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144547; bh=IjxZzy+5Jjm2z5NmidHT1xCLtqR0xj3arDGEfT2tB/o=; h=Date:From:To:Subject:In-Reply-To:From; b=1a+ceE147I3cV2+XGnIIQoKSVlEEIowOg36l/53/j4FJThhXZRjnHik25stMwGqAm 1d/3cPKN5HIhgGTEvb3Q1/+Vkbpa3HD5f9LXheIiETM1NefnjcfzAi6y7V93sn52Xg nNYaNB6C0o8gXwmt6eonhKguY7X41CwgPMVMKUyI= Date: Fri, 05 Nov 2021 13:35:46 -0700 From: Andrew Morton To: akpm@linux-foundation.org, andreyknvl@gmail.com, bigeasy@linutronix.de, dvyukov@google.com, elver@google.com, glider@google.com, gustavoars@kernel.org, jiangshanlai@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, ryabinin.a.a@gmail.com, skhan@linuxfoundation.org, tarasmadan@google.com, tglx@linutronix.de, tj@kernel.org, torvalds@linux-foundation.org, vinmenon@codeaurora.org, vjitta@codeaurora.org, walter-zh.wu@mediatek.com Subject: [patch 023/262] kasan: generic: introduce kasan_record_aux_stack_noalloc() Message-ID: <20211105203546.DpiM4aEip%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 5ED73F000201 X-Stat-Signature: 7iuq9pwq8xdazoyqp7biubcifezdz8jo Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=1a+ceE14; dmarc=none; spf=pass (imf11.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 
1636144548-94377 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Marco Elver Subject: kasan: generic: introduce kasan_record_aux_stack_noalloc() Introduce a variant of kasan_record_aux_stack() that does not do any memory allocation through stackdepot. This will permit using it in contexts that cannot allocate any memory. Link: https://lkml.kernel.org/r/20210913112609.2651084-6-elver@google.com Signed-off-by: Marco Elver Tested-by: Shuah Khan Acked-by: Sebastian Andrzej Siewior Reviewed-by: Andrey Konovalov Cc: Alexander Potapenko Cc: Andrey Ryabinin Cc: Dmitry Vyukov Cc: "Gustavo A. R. Silva" Cc: Lai Jiangshan Cc: Taras Madan Cc: Tejun Heo Cc: Thomas Gleixner Cc: Vijayanand Jitta Cc: Vinayak Menon Cc: Walter Wu Signed-off-by: Andrew Morton --- include/linux/kasan.h | 2 ++ mm/kasan/generic.c | 14 ++++++++++++-- 2 files changed, 14 insertions(+), 2 deletions(-) --- a/include/linux/kasan.h~kasan-generic-introduce-kasan_record_aux_stack_noalloc +++ a/include/linux/kasan.h @@ -370,12 +370,14 @@ static inline void kasan_unpoison_task_s void kasan_cache_shrink(struct kmem_cache *cache); void kasan_cache_shutdown(struct kmem_cache *cache); void kasan_record_aux_stack(void *ptr); +void kasan_record_aux_stack_noalloc(void *ptr); #else /* CONFIG_KASAN_GENERIC */ static inline void kasan_cache_shrink(struct kmem_cache *cache) {} static inline void kasan_cache_shutdown(struct kmem_cache *cache) {} static inline void kasan_record_aux_stack(void *ptr) {} +static inline void kasan_record_aux_stack_noalloc(void *ptr) {} #endif /* CONFIG_KASAN_GENERIC */ --- a/mm/kasan/generic.c~kasan-generic-introduce-kasan_record_aux_stack_noalloc +++ a/mm/kasan/generic.c @@ -328,7 +328,7 @@ DEFINE_ASAN_SET_SHADOW(f3); DEFINE_ASAN_SET_SHADOW(f5); DEFINE_ASAN_SET_SHADOW(f8); -void kasan_record_aux_stack(void *addr) +static void __kasan_record_aux_stack(void *addr, bool can_alloc) 
{ struct page *page = kasan_addr_to_page(addr); struct kmem_cache *cache; @@ -345,7 +345,17 @@ void kasan_record_aux_stack(void *addr) return; alloc_meta->aux_stack[1] = alloc_meta->aux_stack[0]; - alloc_meta->aux_stack[0] = kasan_save_stack(GFP_NOWAIT, true); + alloc_meta->aux_stack[0] = kasan_save_stack(GFP_NOWAIT, can_alloc); +} + +void kasan_record_aux_stack(void *addr) +{ + return __kasan_record_aux_stack(addr, true); +} + +void kasan_record_aux_stack_noalloc(void *addr) +{ + return __kasan_record_aux_stack(addr, false); } void kasan_set_free_info(struct kmem_cache *cache, From patchwork Fri Nov 5 20:35:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605411 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8A7A4C433EF for ; Fri, 5 Nov 2021 20:35:53 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4394A611C0 for ; Fri, 5 Nov 2021 20:35:53 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 4394A611C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id D3FDE940019; Fri, 5 Nov 2021 16:35:52 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id CEDB1940007; Fri, 5 Nov 2021 16:35:52 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BB3C0940019; Fri, 5 Nov 2021 16:35:52 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0117.hostedemail.com [216.40.44.117]) by kanga.kvack.org (Postfix) with ESMTP id ADBAE940007 
for ; Fri, 5 Nov 2021 16:35:52 -0400 (EDT) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 6B0318249980 for ; Fri, 5 Nov 2021 20:35:52 +0000 (UTC) X-FDA: 78776032944.18.8E6C80B Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf28.hostedemail.com (Postfix) with ESMTP id F403C90000AF for ; Fri, 5 Nov 2021 20:35:51 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id A0F356125F; Fri, 5 Nov 2021 20:35:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144551; bh=6hwGJeuqg+C82l2O5CY/QOwkqZrGCqqopInDMa9RhMo=; h=Date:From:To:Subject:In-Reply-To:From; b=B1ePJecY1HVeXBNLPtbLMIantay4e/WF5zdF2xcgeqvgqVpE1tvxFjfnZvaWRL0t6 zUILTx9CJq/Mnct3M/iHnOH3wj4ncBPBtpeXwXFT9gB9Oc9XSAxN2AZebO78C1/s3q DD1J+Q9CCGApn0MAQwDUyNtd6aXVzy/x6lIiEkC0= Date: Fri, 05 Nov 2021 13:35:50 -0700 From: Andrew Morton To: akpm@linux-foundation.org, andreyknvl@gmail.com, bigeasy@linutronix.de, dvyukov@google.com, elver@google.com, glider@google.com, gustavoars@kernel.org, jiangshanlai@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, ryabinin.a.a@gmail.com, skhan@linuxfoundation.org, tarasmadan@google.com, tglx@linutronix.de, tj@kernel.org, torvalds@linux-foundation.org, vinmenon@codeaurora.org, vjitta@codeaurora.org, walter-zh.wu@mediatek.com Subject: [patch 024/262] workqueue, kasan: avoid alloc_pages() when recording stack Message-ID: <20211105203550.82Ko3s-BI%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: F403C90000AF X-Stat-Signature: rc6idtg5zfyb7pgnw9eby5n1hdxsqxpa Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=B1ePJecY; spf=pass (imf28.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) 
smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144551-183641 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Marco Elver Subject: workqueue, kasan: avoid alloc_pages() when recording stack Shuah Khan reported: | When CONFIG_PROVE_RAW_LOCK_NESTING=y and CONFIG_KASAN are enabled, | kasan_record_aux_stack() runs into "BUG: Invalid wait context" when | it tries to allocate memory attempting to acquire spinlock in page | allocation code while holding workqueue pool raw_spinlock. | | There are several instances of this problem when block layer tries | to __queue_work(). Call trace from one of these instances is below: | | kblockd_mod_delayed_work_on() | mod_delayed_work_on() | __queue_delayed_work() | __queue_work() (rcu_read_lock, raw_spin_lock pool->lock held) | insert_work() | kasan_record_aux_stack() | kasan_save_stack() | stack_depot_save() | alloc_pages() | __alloc_pages() | get_page_from_freelist() | rm_queue() | rm_queue_pcplist() | local_lock_irqsave(&pagesets.lock, flags); | [ BUG: Invalid wait context triggered ] The default kasan_record_aux_stack() calls stack_depot_save() with GFP_NOWAIT, which in turn can then call alloc_pages(GFP_NOWAIT, ...). In general, however, it is not even possible to use either GFP_ATOMIC or GFP_NOWAIT in certain non-preemptive contexts, including raw_spin_locks (see gfp.h and ab00db216c9c7). Fix it by using kasan_record_aux_stack_noalloc(), which instructs stackdepot not to expand stack storage via alloc_pages() if it runs out. While there is an increased risk of failing to insert the stack trace, this is typically unlikely, especially if the same insertion had already succeeded previously (stack depot hit). For frequent calls from the same location, it therefore becomes extremely unlikely that kasan_record_aux_stack_noalloc() fails.
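The trade-off described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the stackdepot implementation: every name in it (model_depot, model_depot_save(), the MODEL_* constants) is hypothetical, the hash table is replaced by a linear scan, and bumping nr_slabs stands in for alloc_pages(). It shows that with can_alloc == false a new trace fails once the pool is full, while a trace that was stored earlier still succeeds as a depot hit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MODEL_SLOTS_PER_SLAB 2
#define MODEL_MAX_SLABS      4
#define MODEL_MAX_DEPTH      4

struct model_record {
	unsigned long entries[MODEL_MAX_DEPTH];
	unsigned int nr_entries;
};

struct model_depot {
	struct model_record slots[MODEL_MAX_SLABS * MODEL_SLOTS_PER_SLAB];
	unsigned int used;     /* records stored so far */
	unsigned int nr_slabs; /* "slabs" currently backing the depot */
};

/* Returns a 1-based handle, or 0 on failure. */
static unsigned int model_depot_save(struct model_depot *d,
				     const unsigned long *entries,
				     unsigned int nr_entries, bool can_alloc)
{
	unsigned int i;

	if (nr_entries == 0 || nr_entries > MODEL_MAX_DEPTH)
		return 0;

	/* Depot hit: an identical trace is already stored, no space needed. */
	for (i = 0; i < d->used; i++) {
		if (d->slots[i].nr_entries == nr_entries &&
		    !memcmp(d->slots[i].entries, entries,
			    nr_entries * sizeof(*entries)))
			return i + 1;
	}

	/* Pool exhausted: grow only if the caller permits allocation. */
	if (d->used == d->nr_slabs * MODEL_SLOTS_PER_SLAB) {
		if (!can_alloc || d->nr_slabs == MODEL_MAX_SLABS)
			return 0;
		d->nr_slabs++; /* stands in for alloc_pages() */
	}

	memcpy(d->slots[d->used].entries, entries,
	       nr_entries * sizeof(*entries));
	d->slots[d->used].nr_entries = nr_entries;
	return ++d->used;
}
```

Starting from a depot with one slab (two slots), two distinct traces can be saved with can_alloc == false; a third distinct trace then returns 0, re-saving the first trace still returns its original handle, and the third trace succeeds once can_alloc == true is passed.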
Link: https://lkml.kernel.org/r/20210902200134.25603-1-skhan@linuxfoundation.org Link: https://lkml.kernel.org/r/20210913112609.2651084-7-elver@google.com Signed-off-by: Marco Elver Reported-by: Shuah Khan Tested-by: Shuah Khan Acked-by: Sebastian Andrzej Siewior Acked-by: Tejun Heo Reviewed-by: Andrey Konovalov Cc: Alexander Potapenko Cc: Andrey Ryabinin Cc: Dmitry Vyukov Cc: "Gustavo A. R. Silva" Cc: Lai Jiangshan Cc: Taras Madan Cc: Thomas Gleixner Cc: Vijayanand Jitta Cc: Vinayak Menon Cc: Walter Wu Signed-off-by: Andrew Morton --- kernel/workqueue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/kernel/workqueue.c~workqueue-kasan-avoid-alloc_pages-when-recording-stack +++ a/kernel/workqueue.c @@ -1350,7 +1350,7 @@ static void insert_work(struct pool_work struct worker_pool *pool = pwq->pool; /* record the work call stack in order to print it in KASAN reports */ - kasan_record_aux_stack(work); + kasan_record_aux_stack_noalloc(work); /* we own @work, set data and link */ set_work_pwq(work, pwq, extra_flags); From patchwork Fri Nov 5 20:35:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605413 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7F2B9C433FE for ; Fri, 5 Nov 2021 20:35:56 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 2564F61242 for ; Fri, 5 Nov 2021 20:35:56 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 2564F61242 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id B8DE794001A; Fri, 5 Nov 2021 
16:35:55 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id AF056940007; Fri, 5 Nov 2021 16:35:55 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A090A94001A; Fri, 5 Nov 2021 16:35:55 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0053.hostedemail.com [216.40.44.53]) by kanga.kvack.org (Postfix) with ESMTP id 9252D940007 for ; Fri, 5 Nov 2021 16:35:55 -0400 (EDT) Received: from smtpin25.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 4CCAA8249980 for ; Fri, 5 Nov 2021 20:35:55 +0000 (UTC) X-FDA: 78776033070.25.C0BEEDC Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf11.hostedemail.com (Postfix) with ESMTP id 022CDF0000A8 for ; Fri, 5 Nov 2021 20:35:54 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id E53F8611C0; Fri, 5 Nov 2021 20:35:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144554; bh=gX87Amf+PVnytwD7lQr6QutoFY5VcSy+n6KZz8x7OyM=; h=Date:From:To:Subject:In-Reply-To:From; b=1706hhnPNBVGScm7AIY7L1lLMQC6lbVfC9UQQMKtcwGTj8CbTKo05gDMaiQ6cia7X 4Z/Db2KWpykaJxX/e5SYD1eCPVAxNZkaXVqJVw+OVDXfvdInVsWxnzq4oNYy0Nj8yi WcHBQurkQtcnt1amPinOvlP3hdJwPS98rJFeTUzY= Date: Fri, 05 Nov 2021 13:35:53 -0700 From: Andrew Morton To: akpm@linux-foundation.org, andreyknvl@gmail.com, dvyukov@google.com, elver@google.com, glider@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, ryabinin.a.a@gmail.com, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 025/262] kasan: fix tag for large allocations when using CONFIG_SLAB Message-ID: <20211105203553.dTwQoHYqA%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 022CDF0000A8 X-Stat-Signature: 
9rpmech64jcxkya7p7jompfd88xyq1eb Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=1706hhnP; spf=pass (imf11.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144554-831894 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: "Matthew Wilcox (Oracle)" Subject: kasan: fix tag for large allocations when using CONFIG_SLAB If an object is allocated on a tail page of a multi-page slab, kasan will get the wrong tag because page->s_mem is NULL for tail pages. I'm not quite sure what the user-visible effect of this might be. Link: https://lkml.kernel.org/r/20211001024105.3217339-1-willy@infradead.org Fixes: 7f94ffbc4c6a ("kasan: add hooks implementation for tag-based mode") Signed-off-by: Matthew Wilcox (Oracle) Acked-by: Marco Elver Reviewed-by: Andrey Konovalov Cc: Andrey Ryabinin Cc: Alexander Potapenko Cc: Dmitry Vyukov Signed-off-by: Andrew Morton --- mm/kasan/common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/kasan/common.c~kasan-fix-tag-for-large-allocations-when-using-config_slab +++ a/mm/kasan/common.c @@ -298,7 +298,7 @@ static inline u8 assign_tag(struct kmem_ /* For caches that either have a constructor or SLAB_TYPESAFE_BY_RCU: */ #ifdef CONFIG_SLAB /* For SLAB assign tags based on the object index in the freelist. 
*/ - return (u8)obj_to_index(cache, virt_to_page(object), (void *)object); + return (u8)obj_to_index(cache, virt_to_head_page(object), (void *)object); #else /* * For SLUB assign a random tag during slab creation, otherwise reuse From patchwork Fri Nov 5 20:35:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605415 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9B759C433EF for ; Fri, 5 Nov 2021 20:35:59 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4DE1861242 for ; Fri, 5 Nov 2021 20:35:59 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 4DE1861242 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id E4D226B006C; Fri, 5 Nov 2021 16:35:58 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id DFDDB94001B; Fri, 5 Nov 2021 16:35:58 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D1459940007; Fri, 5 Nov 2021 16:35:58 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0026.hostedemail.com [216.40.44.26]) by kanga.kvack.org (Postfix) with ESMTP id C236D6B006C for ; Fri, 5 Nov 2021 16:35:58 -0400 (EDT) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 8464C777ED for ; Fri, 5 Nov 2021 20:35:58 +0000 (UTC) X-FDA: 78776033196.13.324277F Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf07.hostedemail.com 
(Postfix) with ESMTP id 2B53810000AB for ; Fri, 5 Nov 2021 20:35:58 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 15805611EE; Fri, 5 Nov 2021 20:35:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144557; bh=5z2+sfRC8ejjpEzOSr92a4X3PtVD+qQDarEhEA6O8+M=; h=Date:From:To:Subject:In-Reply-To:From; b=szMOgNXbW0wH5634+zpp3IPREpaTyErXfkkNOL/KGBUP778okP7ODUFndtlFQYsS4 WC58wytpPY9xxRLqleO2TbNnH6SZ8UEhDa3ewuGh9G4r5dhN6/lFVy4y5pHYwHtlUF 3RFXYtGnvMWRB65dw9jbNR49VP6Rg35FaaYMFKj8= Date: Fri, 05 Nov 2021 13:35:56 -0700 From: Andrew Morton To: akpm@linux-foundation.org, andreyknvl@gmail.com, catalin.marinas@arm.com, elver@google.com, eugenis@google.com, glider@google.com, linux-mm@kvack.org, mark.rutland@arm.com, mm-commits@vger.kernel.org, pcc@google.com, robin.murphy@arm.com, torvalds@linux-foundation.org, will@kernel.org Subject: [patch 026/262] kasan: test: add memcpy test that avoids out-of-bounds write Message-ID: <20211105203556.oqhUL-8Gp%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=szMOgNXb; dmarc=none; spf=pass (imf07.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 2B53810000AB X-Stat-Signature: obidu17zbwehfy5q1mtnortkom9mgii3 X-HE-Tag: 1636144558-966833 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Peter Collingbourne Subject: kasan: test: add memcpy test that avoids out-of-bounds write With HW tag-based KASAN, error checks are performed implicitly by the load and store instructions in the memcpy implementation. 
A failed check results in tag checks being disabled and execution will keep going. As a result, under HW tag-based KASAN, prior to commit 1b0668be62cf ("kasan: test: disable kmalloc_memmove_invalid_size for HW_TAGS"), this memcpy would end up corrupting memory until it hits an inaccessible page and causes a kernel panic. This is a pre-existing issue that was revealed by commit 285133040e6c ("arm64: Import latest memcpy()/memmove() implementation") which changed the memcpy implementation from using signed comparisons (incorrectly, resulting in the memcpy being terminated early for negative sizes) to using unsigned comparisons. It is unclear how this could be handled by memcpy itself in a reasonable way. One possibility would be to add an exception handler that would force memcpy to return if a tag check fault is detected -- this would make the behavior roughly similar to generic and SW tag-based KASAN. However, this wouldn't solve the problem for asynchronous mode and also makes memcpy behavior inconsistent with manually copying data. This test was added as a part of a series that taught KASAN to detect negative sizes in memory operations, see commit 8cceeff48f23 ("kasan: detect negative size in memory operation function"). Therefore we should keep testing for negative sizes with generic and SW tag-based KASAN. But there is some value in testing small memcpy overflows, so let's add another test with memcpy that does not destabilize the kernel by performing out-of-bounds writes, and run it in all modes. 
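As a rough illustration of why a "negative" size behaves so differently under the two comparison styles, the following user-space sketch models only the loop condition. The helper names are hypothetical, the real arm64 assembly is not reproduced, and the cast from size_t to ptrdiff_t assumes the usual two's-complement behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: bytes a copy loop would move if it tested the
 * remaining count as a signed value. A size of (size_t)-2 is seen as -2,
 * so the loop stops immediately in this model.
 */
static inline size_t copied_with_signed_compare(size_t n)
{
    ptrdiff_t remaining = (ptrdiff_t)n; /* assumption: two's complement */

    return remaining > 0 ? (size_t)remaining : 0;
}

/*
 * Hypothetical model: bytes a copy loop would move if it tested the
 * remaining count as an unsigned value. (size_t)-2 is a huge count.
 */
static inline size_t copied_with_unsigned_compare(size_t n)
{
    return n;
}

/* Small self-check of the two models for the size used by the test. */
static inline void negative_size_demo(void)
{
    size_t invalid_size = (size_t)-2;

    assert(invalid_size == SIZE_MAX - 1);
    assert(copied_with_signed_compare(invalid_size) == 0);
    assert(copied_with_unsigned_compare(invalid_size) == SIZE_MAX - 1);
}
```

In this model, a small positive size gives the same result either way; only the wrapped-around value separates the two.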
Link: https://linux-review.googlesource.com/id/I048d1e6a9aff766c4a53f989fb0c83de68923882 Link: https://lkml.kernel.org/r/20210910211356.3603758-1-pcc@google.com Signed-off-by: Peter Collingbourne Reviewed-by: Andrey Konovalov Acked-by: Marco Elver Cc: Robin Murphy Cc: Will Deacon Cc: Catalin Marinas Cc: Mark Rutland Cc: Evgenii Stepanov Cc: Alexander Potapenko Signed-off-by: Andrew Morton --- lib/test_kasan.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) --- a/lib/test_kasan.c~kasan-test-add-memcpy-test-that-avoids-out-of-bounds-write +++ a/lib/test_kasan.c @@ -493,7 +493,7 @@ static void kmalloc_oob_in_memset(struct kfree(ptr); } -static void kmalloc_memmove_invalid_size(struct kunit *test) +static void kmalloc_memmove_negative_size(struct kunit *test) { char *ptr; size_t size = 64; @@ -515,6 +515,21 @@ static void kmalloc_memmove_invalid_size kfree(ptr); } +static void kmalloc_memmove_invalid_size(struct kunit *test) +{ + char *ptr; + size_t size = 64; + volatile size_t invalid_size = size; + + ptr = kmalloc(size, GFP_KERNEL); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr); + + memset((char *)ptr, 0, 64); + KUNIT_EXPECT_KASAN_FAIL(test, + memmove((char *)ptr, (char *)ptr + 4, invalid_size)); + kfree(ptr); +} + static void kmalloc_uaf(struct kunit *test) { char *ptr; @@ -1129,6 +1144,7 @@ static struct kunit_case kasan_kunit_tes KUNIT_CASE(kmalloc_oob_memset_4), KUNIT_CASE(kmalloc_oob_memset_8), KUNIT_CASE(kmalloc_oob_memset_16), + KUNIT_CASE(kmalloc_memmove_negative_size), KUNIT_CASE(kmalloc_memmove_invalid_size), KUNIT_CASE(kmalloc_uaf), KUNIT_CASE(kmalloc_uaf_memset), From patchwork Fri Nov 5 20:35:59 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605417 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by 
smtp.lore.kernel.org (Postfix) with ESMTP id CBE95C433EF for ; Fri, 5 Nov 2021 20:36:02 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 810A0611C0 for ; Fri, 5 Nov 2021 20:36:02 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 810A0611C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 2182B94001B; Fri, 5 Nov 2021 16:36:02 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 1C731940007; Fri, 5 Nov 2021 16:36:02 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0DDB194001B; Fri, 5 Nov 2021 16:36:02 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0179.hostedemail.com [216.40.44.179]) by kanga.kvack.org (Postfix) with ESMTP id F332C940007 for ; Fri, 5 Nov 2021 16:36:01 -0400 (EDT) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 90FEC8249980 for ; Fri, 5 Nov 2021 20:36:01 +0000 (UTC) X-FDA: 78776033322.24.98ADAB4 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf31.hostedemail.com (Postfix) with ESMTP id E9EB0104AADE for ; Fri, 5 Nov 2021 20:35:52 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 4DAA061252; Fri, 5 Nov 2021 20:36:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144560; bh=POl68rlaiFepkgXhJo1XMNxFC/jPqk7URJQ1Z3yPjBA=; h=Date:From:To:Subject:In-Reply-To:From; b=XJ4gy82mwUyGEXNia1N/d2ASstsEB0cMZs6tHKBay6Gvw49eIqad8/PAnRZ+Pqks+ LvbY5IKR/D3H7v1WJB/yKbAKOnABK6+cc5wbB04lyQm+eC0X6bfvkC+5z14L/7WCX7 j/a2PxB8GTN5d9Juhd7sBMWOxa76ZMM8TtbP4gzw= Date: Fri, 05 Nov 2021 13:35:59 -0700 From: Andrew Morton To: 
aarcange@redhat.com, akpm@linux-foundation.org, hughd@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, torvalds@linux-foundation.org, vbabka@suse.cz, willy@infradead.org Subject: [patch 027/262] mm/smaps: fix shmem pte hole swap calculation Message-ID: <20211105203559.qOWoE4JSh%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf31.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=XJ4gy82m; dmarc=none; spf=pass (imf31.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: E9EB0104AADE X-Stat-Signature: ps3n319zmenqksb1ifnux5ei16hb7znw X-HE-Tag: 1636144552-53031 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:
From: Peter Xu
Subject: mm/smaps: fix shmem pte hole swap calculation

Patch series "mm/smaps: Fixes and optimizations on shmem swap handling".

This patch (of 3):

The shmem swap calculation on privately writable mappings uses the wrong parameters, as spotted by Vlastimil. Fix them. The bug was introduced in commit 48131e03ca4e, when shmem_swap_usage was reworked into shmem_partial_swap_usage.

Test program:

==================
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>

#define SIZE_2M (2UL << 20) /* 2 MiB */

int main(void)
{
    char *buffer, *p;
    int i, fd;

    fd = memfd_create("test", 0);
    assert(fd >= 0);

    /* isize==2M*3, fill in pages, swap them out */
    ftruncate(fd, SIZE_2M * 3);
    buffer = mmap(NULL, SIZE_2M * 3, PROT_READ | PROT_WRITE,
                  MAP_SHARED, fd, 0);
    assert(buffer != MAP_FAILED);
    for (i = 0, p = buffer; i < SIZE_2M * 3 / 4096; i++) {
        *p = 1;
        p += 4096;
    }
    madvise(buffer, SIZE_2M * 3, MADV_PAGEOUT);
    munmap(buffer, SIZE_2M * 3);

    /*
     * Remap with private+writable mappings on part of the inode (<= 2M*3),
     * while the size must also be >= 2M*2 to make sure there's a none pmd so
     * smaps_pte_hole will be triggered.
     */
    buffer = mmap(NULL, SIZE_2M * 2, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE, fd, 0);
    printf("pid=%d, buffer=%p\n", getpid(), buffer);

    /* Check /proc/$PID/smaps_rollup, should see 4MB swap */
    sleep(1000000);
    return 0;
}
==================

Before the patch, smaps_rollup shows less than 4MB of swap, and the exact number varies with the alignment of the buffer that mmap() returns. After this patch, it shows 4MB.

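For illustration, here is a user-space model of the address-to-index conversion that the fix in the diff below relies on: shmem_partial_swap_usage() is handed page-cache indices rather than raw virtual addresses. The struct and function names are hypothetical stand-ins, 4 KiB pages are assumed, and the hugetlb special case of the kernel's linear_page_index() is left out.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical stand-in for the two vm_area_struct fields used here. */
struct vma_model {
    uint64_t vm_start; /* first virtual address of the mapping */
    uint64_t vm_pgoff; /* file offset of vm_start, in pages */
};

/*
 * Model of the non-hugetlb path of linear_page_index(): translate a
 * virtual address inside the mapping into a page index within the file.
 */
static inline uint64_t page_index(const struct vma_model *vma, uint64_t addr)
{
    assert(addr >= vma->vm_start);
    return ((addr - vma->vm_start) >> MODEL_PAGE_SHIFT) + vma->vm_pgoff;
}

/* Bytes covered by the index range [page_index(addr), page_index(end)). */
static inline uint64_t hole_bytes(const struct vma_model *vma,
                                  uint64_t addr, uint64_t end)
{
    return (page_index(vma, end) - page_index(vma, addr)) << MODEL_PAGE_SHIFT;
}
```

With a mapping that starts at file offset 0, a 4 MiB hole maps to the index range [0, 1024) no matter where mmap() placed the buffer, which fits the stable 4MB result reported after the patch.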
Link: https://lkml.kernel.org/r/20210917164756.8586-1-peterx@redhat.com Link: https://lkml.kernel.org/r/20210917164756.8586-2-peterx@redhat.com Fixes: 48131e03ca4e ("mm, proc: reduce cost of /proc/pid/smaps for unpopulated shmem mappings") Signed-off-by: Peter Xu Reported-by: Vlastimil Babka Reviewed-by: Vlastimil Babka Cc: Andrea Arcangeli Cc: Hugh Dickins Cc: Matthew Wilcox Signed-off-by: Andrew Morton --- fs/proc/task_mmu.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) --- a/fs/proc/task_mmu.c~mm-smaps-fix-shmem-pte-hole-swap-calculation +++ a/fs/proc/task_mmu.c @@ -478,9 +478,11 @@ static int smaps_pte_hole(unsigned long __always_unused int depth, struct mm_walk *walk) { struct mem_size_stats *mss = walk->private; + struct vm_area_struct *vma = walk->vma; - mss->swap += shmem_partial_swap_usage( - walk->vma->vm_file->f_mapping, addr, end); + mss->swap += shmem_partial_swap_usage(walk->vma->vm_file->f_mapping, + linear_page_index(vma, addr), + linear_page_index(vma, end)); return 0; } From patchwork Fri Nov 5 20:36:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605419 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CF33EC433F5 for ; Fri, 5 Nov 2021 20:36:05 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9B6EB61242 for ; Fri, 5 Nov 2021 20:36:05 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 9B6EB61242 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 3792E94001C; Fri, 5 Nov 2021 16:36:05 -0400 (EDT) Received: 
by kanga.kvack.org (Postfix, from userid 40) id 350D0940007; Fri, 5 Nov 2021 16:36:05 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 219C194001C; Fri, 5 Nov 2021 16:36:05 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0098.hostedemail.com [216.40.44.98]) by kanga.kvack.org (Postfix) with ESMTP id 0D725940007 for ; Fri, 5 Nov 2021 16:36:05 -0400 (EDT) Received: from smtpin01.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id BFD9E70737 for ; Fri, 5 Nov 2021 20:36:04 +0000 (UTC) X-FDA: 78776033490.01.0F94E62 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf04.hostedemail.com (Postfix) with ESMTP id BA2275000300 for ; Fri, 5 Nov 2021 20:35:55 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 5685F611C0; Fri, 5 Nov 2021 20:36:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144563; bh=wkTASAhiipzxDonZhbRjk6v7zkBquZBk9Uduri6FL+Q=; h=Date:From:To:Subject:In-Reply-To:From; b=Ss7q6Nfj1FVdCH3Xj99rcQ4c2NVRBN8LeE0X3a0Vu3ttpk/qngLiYDRcXN20GnpRL xUnVY+wOUdoO0HP1P0EYgZOnsQLym65WL7q+IEPRNpSk6x2HabFdS/k1snC+FG9rvn Fa4S86W741MJJDQ6dvHWBHj8TO1A+hGS4OszvkPE= Date: Fri, 05 Nov 2021 13:36:02 -0700 From: Andrew Morton To: aarcange@redhat.com, akpm@linux-foundation.org, hughd@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, torvalds@linux-foundation.org, vbabka@suse.cz, willy@infradead.org Subject: [patch 028/262] mm/smaps: use vma->vm_pgoff directly when counting partial swap Message-ID: <20211105203602.PAtMTScF1%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf04.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Ss7q6Nfj; dmarc=none; spf=pass 
(imf04.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: BA2275000300 X-Stat-Signature: wub8nxbgdzyr4wkenaftpf9cybs53k8x X-HE-Tag: 1636144555-285326 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Peter Xu Subject: mm/smaps: use vma->vm_pgoff directly when counting partial swap As it's trying to cover the whole vma anyways, use direct vm_pgoff value and vma_pages() rather than linear_page_index. Link: https://lkml.kernel.org/r/20210917164756.8586-3-peterx@redhat.com Signed-off-by: Peter Xu Reviewed-by: Vlastimil Babka Cc: Hugh Dickins Cc: Andrea Arcangeli Cc: Matthew Wilcox Signed-off-by: Andrew Morton --- mm/shmem.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) --- a/mm/shmem.c~mm-smaps-use-vma-vm_pgoff-directly-when-counting-partial-swap +++ a/mm/shmem.c @@ -856,9 +856,8 @@ unsigned long shmem_swap_usage(struct vm return swapped << PAGE_SHIFT; /* Here comes the more involved part */ - return shmem_partial_swap_usage(mapping, - linear_page_index(vma, vma->vm_start), - linear_page_index(vma, vma->vm_end)); + return shmem_partial_swap_usage(mapping, vma->vm_pgoff, + vma->vm_pgoff + vma_pages(vma)); } /* From patchwork Fri Nov 5 20:36:05 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605421 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 08BAEC433FE for ; Fri, 5 Nov 2021 20:36:09 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id ADCDD61242 for ; Fri, 5 Nov 
2021 20:36:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org ADCDD61242 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 54DBB94001D; Fri, 5 Nov 2021 16:36:08 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 4FCBA940007; Fri, 5 Nov 2021 16:36:08 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4187F94001E; Fri, 5 Nov 2021 16:36:08 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0187.hostedemail.com [216.40.44.187]) by kanga.kvack.org (Postfix) with ESMTP id 2B4E594001D for ; Fri, 5 Nov 2021 16:36:08 -0400 (EDT) Received: from smtpin10.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id D818C1856A1CD for ; Fri, 5 Nov 2021 20:36:07 +0000 (UTC) X-FDA: 78776033574.10.B39BE82 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf04.hostedemail.com (Postfix) with ESMTP id E45BE500030E for ; Fri, 5 Nov 2021 20:35:58 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 60F43611EE; Fri, 5 Nov 2021 20:36:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144566; bh=yLA0kwAv/1UTmsV1tgpDpmVs8y0uomeZVun4V7iw8pg=; h=Date:From:To:Subject:In-Reply-To:From; b=bKOHh8tlmaoCSWaWJs/U3mb1IfFB1367Z2/1LUAgELqzzy5LiPyfFhBj0r9WMRZ/6 APAphjidByIPQgdnsi2GjFge2Jum3T9vwPfuVeYtuJ8/LMANQO0IJmwS5qVyMmgqPw DT4AYgAjQiKbivURVH7o6gMbN+C2zZ80+QBwafGA= Date: Fri, 05 Nov 2021 13:36:05 -0700 From: Andrew Morton To: aarcange@redhat.com, akpm@linux-foundation.org, hughd@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, torvalds@linux-foundation.org, vbabka@suse.cz, willy@infradead.org Subject: [patch 029/262] mm/smaps: 
simplify shmem handling of pte holes Message-ID: <20211105203605.TfGCVx9PE%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: E45BE500030E X-Stat-Signature: yd9sg71xnh1wsnr9yaoa1rdx8xwnnmkp Authentication-Results: imf04.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=bKOHh8tl; spf=pass (imf04.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144558-790339 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:
From: Peter Xu
Subject: mm/smaps: simplify shmem handling of pte holes

First, the check_shmem_swap variable is not actually necessary, because it is always set together with the pte_hole hook; checking either one would work.

Meanwhile, the check within smaps_pte_entry() is not easy to follow. For example, the pte_none() check is not needed, because "!pte_present && !is_swap_pte" means the same thing. While at it, use the pte_hole() helper rather than duplicating the page cache lookup.

Still keep the CONFIG_SHMEM part so the code can be optimized to a nop for !SHMEM.

There will be a very slight functional change in smaps_pte_entry(): for !SHMEM we'll return early for pte_none (before checking page==NULL), but that's even nicer.

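The claim that the pte_none() check is redundant can be checked with a small truth table. The sketch below is a user-space model, not kernel code: the enum and the model_* helpers are hypothetical, and model_is_swap_pte() follows the kernel's definition of is_swap_pte() as "neither none nor present".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical three-state model of a pte for this argument. */
enum pte_kind { PTE_KIND_NONE, PTE_KIND_PRESENT, PTE_KIND_SWAP };

static inline bool model_pte_none(enum pte_kind k)
{
    return k == PTE_KIND_NONE;
}

static inline bool model_pte_present(enum pte_kind k)
{
    return k == PTE_KIND_PRESENT;
}

/* Mirrors is_swap_pte(): a pte that is neither none nor present. */
static inline bool model_is_swap_pte(enum pte_kind k)
{
    return !model_pte_none(k) && !model_pte_present(k);
}

/*
 * The final else branch is reached when the pte is neither present nor
 * a swap pte. Check that this is the same as pte_none() in every state.
 */
static inline void check_pte_none_equivalence(void)
{
    for (int k = PTE_KIND_NONE; k <= PTE_KIND_SWAP; k++) {
        enum pte_kind kind = (enum pte_kind)k;
        bool reached_else = !model_pte_present(kind) &&
                            !model_is_swap_pte(kind);

        assert(reached_else == model_pte_none(kind));
    }
}
```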
Link: https://lkml.kernel.org/r/20210917164756.8586-4-peterx@redhat.com Signed-off-by: Peter Xu Cc: Hugh Dickins Cc: Matthew Wilcox Cc: Vlastimil Babka Cc: Andrea Arcangeli Signed-off-by: Andrew Morton --- fs/proc/task_mmu.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-) --- a/fs/proc/task_mmu.c~mm-smaps-simplify-shmem-handling-of-pte-holes +++ a/fs/proc/task_mmu.c @@ -397,7 +397,6 @@ struct mem_size_stats { u64 pss_shmem; u64 pss_locked; u64 swap_pss; - bool check_shmem_swap; }; static void smaps_page_accumulate(struct mem_size_stats *mss, @@ -490,6 +489,16 @@ static int smaps_pte_hole(unsigned long #define smaps_pte_hole NULL #endif /* CONFIG_SHMEM */ +static void smaps_pte_hole_lookup(unsigned long addr, struct mm_walk *walk) +{ +#ifdef CONFIG_SHMEM + if (walk->ops->pte_hole) { + /* depth is not used */ + smaps_pte_hole(addr, addr + PAGE_SIZE, 0, walk); + } +#endif +} + static void smaps_pte_entry(pte_t *pte, unsigned long addr, struct mm_walk *walk) { @@ -518,12 +527,8 @@ static void smaps_pte_entry(pte_t *pte, } } else if (is_pfn_swap_entry(swpent)) page = pfn_swap_entry_to_page(swpent); - } else if (unlikely(IS_ENABLED(CONFIG_SHMEM) && mss->check_shmem_swap - && pte_none(*pte))) { - page = xa_load(&vma->vm_file->f_mapping->i_pages, - linear_page_index(vma, addr)); - if (xa_is_value(page)) - mss->swap += PAGE_SIZE; + } else { + smaps_pte_hole_lookup(addr, walk); return; } @@ -737,8 +742,6 @@ static void smap_gather_stats(struct vm_ return; #ifdef CONFIG_SHMEM - /* In case of smaps_rollup, reset the value from previous vma */ - mss->check_shmem_swap = false; if (vma->vm_file && shmem_mapping(vma->vm_file->f_mapping)) { /* * For shared or readonly shmem mappings we know that all @@ -756,7 +759,6 @@ static void smap_gather_stats(struct vm_ !(vma->vm_flags & VM_WRITE))) { mss->swap += shmem_swapped; } else { - mss->check_shmem_swap = true; ops = &smaps_shmem_walk_ops; } } From patchwork Fri Nov 5 20:36:09 2021 Content-Type: 
text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605423 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F347AC433EF for ; Fri, 5 Nov 2021 20:36:11 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A3FB761252 for ; Fri, 5 Nov 2021 20:36:11 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org A3FB761252 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 4AFFE94001E; Fri, 5 Nov 2021 16:36:11 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 43890940007; Fri, 5 Nov 2021 16:36:11 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3278F94001E; Fri, 5 Nov 2021 16:36:11 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0235.hostedemail.com [216.40.44.235]) by kanga.kvack.org (Postfix) with ESMTP id 2198C940007 for ; Fri, 5 Nov 2021 16:36:11 -0400 (EDT) Received: from smtpin30.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id C502A18562F62 for ; Fri, 5 Nov 2021 20:36:10 +0000 (UTC) X-FDA: 78776033700.30.2106F4D Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf26.hostedemail.com (Postfix) with ESMTP id 2CF6620019E2 for ; Fri, 5 Nov 2021 20:36:11 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 7AB5E611C0; Fri, 5 Nov 2021 20:36:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144569; 
bh=0d6uitbGLF/OFI0q7D4CCJ+ru0RJkFAdz8ZbF1SChaI=; h=Date:From:To:Subject:In-Reply-To:From; b=VK9w1Ki85O9htaQwpGSCOlwnTWjr9XRth1E+UFUpLXIll0zqI2QzFHLOdIeOYP6xp g0m3yjfZe8Ck0fC+Cy11F7igbNbQfIp0fqnjCJqnjpnnN++hoKrsZU+kNcrXbjHN16 2GZ79jTJskreyIYTs2vycuyIew/qzSPpK+yHVU2g= Date: Fri, 05 Nov 2021 13:36:09 -0700 From: Andrew Morton To: akpm@linux-foundation.org, anshuman.khandual@arm.com, christophe.leroy@csgroup.eu, gerald.schaefer@linux.ibm.com, gshan@redhat.com, guoren@linux.alibaba.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 030/262] mm: debug_vm_pgtable: don't use __P000 directly Message-ID: <20211105203609.pRPXEXXRp%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 2CF6620019E2 X-Stat-Signature: swo9hy1ydfswescecam1g66qumuski33 Authentication-Results: imf26.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=VK9w1Ki8; dmarc=none; spf=pass (imf26.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144571-666277 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:
From: Guo Ren
Subject: mm: debug_vm_pgtable: don't use __P000 directly

The __Pxxx/__Sxxx macros are only meant for initializing protection_map[]. All other users in Linux should read the values from the protection_map[] array, because many architectures re-initialize protection_map[] (e.g. x86-mem_encrypt, m68k-motorola, mips, arm, sparc). Using __P000 directly is therefore not rigorous.

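To show how the macro names line up with array slots, here is a minimal user-space sketch of the index computation. The MODEL_* constants repeat the values of the low vm_flags bits (VM_READ, VM_WRITE, VM_EXEC, VM_SHARED); the function name is hypothetical and nothing here touches the real protection_map[].

```c
#include <assert.h>

/* Low vm_flags bits that select a protection_map[] slot. */
#define MODEL_VM_READ   0x1UL
#define MODEL_VM_WRITE  0x2UL
#define MODEL_VM_EXEC   0x4UL
#define MODEL_VM_SHARED 0x8UL

/*
 * Hypothetical helper: the slot of the 16-entry protection_map[] that
 * corresponds to a given set of vm_flags. __P000 is the initial value of
 * slot 0 (private, no access) and __S000 that of slot 8 (shared, no access).
 */
static inline unsigned int protection_map_index(unsigned long vm_flags)
{
    unsigned int idx = (unsigned int)(vm_flags &
        (MODEL_VM_READ | MODEL_VM_WRITE | MODEL_VM_EXEC | MODEL_VM_SHARED));

    assert(idx < 16);
    return idx;
}
```

Reading protection_map[protection_map_index(flags)] at run time picks up whatever an architecture stored there during boot, which a direct use of the __P000 macro would miss.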
Link: https://lkml.kernel.org/r/20210924060821.1138281-1-guoren@kernel.org Signed-off-by: Guo Ren Reviewed-by: Andrew Morton Reviewed-by: Anshuman Khandual Cc: Gavin Shan Cc: Christophe Leroy Cc: Gerald Schaefer Signed-off-by: Andrew Morton --- mm/debug_vm_pgtable.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) --- a/mm/debug_vm_pgtable.c~mm-debug_vm_pgtable-dont-use-__p000-directly +++ a/mm/debug_vm_pgtable.c @@ -1104,13 +1104,14 @@ static int __init init_args(struct pgtab /* * Initialize the debugging data. * - * __P000 (or even __S000) will help create page table entries with - * PROT_NONE permission as required for pxx_protnone_tests(). + * protection_map[0] (or even protection_map[8]) will help create + * page table entries with PROT_NONE permission as required for + * pxx_protnone_tests(). */ memset(args, 0, sizeof(*args)); args->vaddr = get_random_vaddr(); args->page_prot = vm_get_page_prot(VMFLAGS); - args->page_prot_none = __P000; + args->page_prot_none = protection_map[0]; args->is_contiguous_page = false; args->pud_pfn = ULONG_MAX; args->pmd_pfn = ULONG_MAX; From patchwork Fri Nov 5 20:36:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605425 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 141C8C433EF for ; Fri, 5 Nov 2021 20:36:15 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id BC9DB61242 for ; Fri, 5 Nov 2021 20:36:14 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org BC9DB61242 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by 
kanga.kvack.org (Postfix) id 610B994001F; Fri, 5 Nov 2021 16:36:14 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 598F2940007; Fri, 5 Nov 2021 16:36:14 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4608894001F; Fri, 5 Nov 2021 16:36:14 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0022.hostedemail.com [216.40.44.22]) by kanga.kvack.org (Postfix) with ESMTP id 36D72940007 for ; Fri, 5 Nov 2021 16:36:14 -0400 (EDT) Received: from smtpin03.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id EBD068249980 for ; Fri, 5 Nov 2021 20:36:13 +0000 (UTC) X-FDA: 78776033826.03.EA0D1F8 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf11.hostedemail.com (Postfix) with ESMTP id 82F9BF0000AA for ; Fri, 5 Nov 2021 20:36:13 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 859466127A; Fri, 5 Nov 2021 20:36:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144572; bh=mEPLmSbWpuTn0UUQaBDXnDGbCyw0rQUP9Zk5kzFyRZs=; h=Date:From:To:Subject:In-Reply-To:From; b=mpkse5GWmCUTDqXyqVRSGUCPaZ2Bzlphsjn+R9VOESLgkzAS2xrAyzCJ+YJBgkmqZ 6bfurd0DNIoaCyeiPqPgFFyn2MUS+rT5jWUKzljDgEHw0ExEWlJCPCDF3n8Qzb1ptm jNkRWEKLbJseezSoF8Q4H1rMC/S1GxFBNaKGfuBs= Date: Fri, 05 Nov 2021 13:36:12 -0700 From: Andrew Morton To: akpm@linux-foundation.org, andreyknvl@gmail.com, dvyukov@google.com, glider@google.com, keescook@chromium.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, ryabinin.a.a@gmail.com, torvalds@linux-foundation.org Subject: [patch 031/262] kasan: test: bypass __alloc_size checks Message-ID: <20211105203612.J6qbXKwVl%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf11.hostedemail.com; dkim=pass 
header.d=linux-foundation.org header.s=korg header.b=mpkse5GW; dmarc=none; spf=pass (imf11.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 82F9BF0000AA X-Stat-Signature: d9thuxfykt7oemigq6c8cit7e9yqkjm5 X-HE-Tag: 1636144573-999033 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Kees Cook Subject: kasan: test: bypass __alloc_size checks Intentional overflows, as performed by the KASAN tests, are detected at compile time[1] (instead of only at run-time) with the addition of __alloc_size. Fix this by forcing the compiler into not being able to trust the size used following the kmalloc()s. [1] https://lore.kernel.org/lkml/20211005184717.65c6d8eb39350395e387b71f@linux-foundation.org Link: https://lkml.kernel.org/r/20211006181544.1670992-1-keescook@chromium.org Signed-off-by: Kees Cook Cc: Andrey Ryabinin Cc: Alexander Potapenko Cc: Andrey Konovalov Cc: Dmitry Vyukov Signed-off-by: Andrew Morton --- lib/test_kasan.c | 8 +++++++- lib/test_kasan_module.c | 2 ++ 2 files changed, 9 insertions(+), 1 deletion(-) --- a/lib/test_kasan.c~kasan-test-bypass-__alloc_size-checks +++ a/lib/test_kasan.c @@ -440,6 +440,7 @@ static void kmalloc_oob_memset_2(struct ptr = kmalloc(size, GFP_KERNEL); KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr); + OPTIMIZER_HIDE_VAR(size); KUNIT_EXPECT_KASAN_FAIL(test, memset(ptr + size - 1, 0, 2)); kfree(ptr); } @@ -452,6 +453,7 @@ static void kmalloc_oob_memset_4(struct ptr = kmalloc(size, GFP_KERNEL); KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr); + OPTIMIZER_HIDE_VAR(size); KUNIT_EXPECT_KASAN_FAIL(test, memset(ptr + size - 3, 0, 4)); kfree(ptr); } @@ -464,6 +466,7 @@ static void kmalloc_oob_memset_8(struct ptr = kmalloc(size, GFP_KERNEL); KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr); + OPTIMIZER_HIDE_VAR(size); 
KUNIT_EXPECT_KASAN_FAIL(test, memset(ptr + size - 7, 0, 8)); kfree(ptr); } @@ -476,6 +479,7 @@ static void kmalloc_oob_memset_16(struct ptr = kmalloc(size, GFP_KERNEL); KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr); + OPTIMIZER_HIDE_VAR(size); KUNIT_EXPECT_KASAN_FAIL(test, memset(ptr + size - 15, 0, 16)); kfree(ptr); } @@ -488,6 +492,7 @@ static void kmalloc_oob_in_memset(struct ptr = kmalloc(size, GFP_KERNEL); KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr); + OPTIMIZER_HIDE_VAR(size); KUNIT_EXPECT_KASAN_FAIL(test, memset(ptr, 0, size + KASAN_GRANULE_SIZE)); kfree(ptr); @@ -497,7 +502,7 @@ static void kmalloc_memmove_negative_siz { char *ptr; size_t size = 64; - volatile size_t invalid_size = -2; + size_t invalid_size = -2; /* * Hardware tag-based mode doesn't check memmove for negative size. @@ -510,6 +515,7 @@ static void kmalloc_memmove_negative_siz KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr); memset((char *)ptr, 0, 64); + OPTIMIZER_HIDE_VAR(invalid_size); KUNIT_EXPECT_KASAN_FAIL(test, memmove((char *)ptr, (char *)ptr + 4, invalid_size)); kfree(ptr); --- a/lib/test_kasan_module.c~kasan-test-bypass-__alloc_size-checks +++ a/lib/test_kasan_module.c @@ -35,6 +35,8 @@ static noinline void __init copy_user_te return; } + OPTIMIZER_HIDE_VAR(size); + pr_info("out-of-bounds in copy_from_user()\n"); unused = copy_from_user(kmem, usermem, size + 1); From patchwork Fri Nov 5 20:36:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605427 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D9E96C433EF for ; Fri, 5 Nov 2021 20:36:18 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 8D3D5611EE for ; Fri, 5 Nov 2021 20:36:18 +0000 (UTC) 
DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 8D3D5611EE Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 266D7940020; Fri, 5 Nov 2021 16:36:18 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 2157E940007; Fri, 5 Nov 2021 16:36:18 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 104C5940020; Fri, 5 Nov 2021 16:36:18 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0139.hostedemail.com [216.40.44.139]) by kanga.kvack.org (Postfix) with ESMTP id F3AED940007 for ; Fri, 5 Nov 2021 16:36:17 -0400 (EDT) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id C100418565647 for ; Fri, 5 Nov 2021 20:36:17 +0000 (UTC) X-FDA: 78776033994.21.6D4E38F Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf04.hostedemail.com (Postfix) with ESMTP id B186D5000301 for ; Fri, 5 Nov 2021 20:36:08 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id BDA9D611C0; Fri, 5 Nov 2021 20:36:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144576; bh=AR/knvV7sZbF+P9B2GhmwtHpsmXX+YC0xLG1nzE6cYw=; h=Date:From:To:Subject:In-Reply-To:From; b=NvFehpFFVOHqL0N0aPIsTe79qe6o9c4yferioqNC+KE9FJw28aQGDe2f0UKseKscO xXTvHnxS7dVR/6FFtj+2BbjczUNY9Zy6ZNpgZw1N67Om7478rt9S/si9aKAmsGimPF yrb17qyLZxIUi4K1Q8QkpG9ijV2xd2T3jNwRiAUQ= Date: Fri, 05 Nov 2021 13:36:15 -0700 From: Andrew Morton To: akpm@linux-foundation.org, alex.bou9@gmail.com, apw@canonical.com, cl@linux.com, danielmicay@gmail.com, dennis@kernel.org, dwaipayanray1@gmail.com, gustavoars@kernel.org, iamjoonsoo.kim@lge.com, ira.weiny@intel.com, jhubbard@nvidia.com, 
jingxiangfeng@huawei.com, joe@perches.com, jrdr.linux@gmail.com, keescook@chromium.org, linux-mm@kvack.org, lkp@intel.com, lukas.bulwahn@gmail.com, mm-commits@vger.kernel.org, mporter@kernel.crashing.org, nathan@kernel.org, ndesaulniers@google.com, ojeda@kernel.org, penberg@kernel.org, rdunlap@infradead.org, rientjes@google.com, tj@kernel.org, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 032/262] rapidio: avoid bogus __alloc_size warning Message-ID: <20211105203615.KtXTy0MDX%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: B186D5000301 X-Stat-Signature: undmqgu7ateduh8ecbi7ttd9zo7n7iqo Authentication-Results: imf04.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=NvFehpFF; spf=pass (imf04.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144568-746032 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Kees Cook Subject: rapidio: avoid bogus __alloc_size warning Patch series "Add __alloc_size()", v3. GCC and Clang both use the "alloc_size" attribute to assist with bounds checking around the use of allocation functions. Add the attribute, adjust the Makefile to silence needless warnings, and add the hints to the allocators where possible. These changes have been in use for a while now in GrapheneOS. This patch (of 8): After adding __alloc_size attributes to the allocators, GCC 9.3 (but not later) may incorrectly evaluate the arguments to check_copy_size(), getting seemingly confused by the size being returned from array_size(). Instead, perform the calculation once, which both makes the code more readable and avoids the bug in GCC. 
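A userspace sketch of the "perform the calculation once" pattern, not the driver code itself. The names array_size_sketch() and dup_array_sketch() are hypothetical, and the saturating multiply only approximates the behaviour of the kernel's array_size() (returning SIZE_MAX on overflow). It assumes a hosted C environment with malloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for the kernel's array_size(): returns
 * SIZE_MAX when elem_size * count would overflow size_t.
 */
static size_t array_size_sketch(size_t elem_size, size_t count)
{
	if (elem_size != 0 && count > SIZE_MAX / elem_size)
		return SIZE_MAX;
	return elem_size * count;
}

/*
 * Hypothetical helper: duplicate 'count' elements from 'src'.
 * The byte count is computed once and the same variable feeds both
 * the allocation and the copy, so the two sizes cannot diverge.
 */
static void *dup_array_sketch(const void *src, size_t elem_size, size_t count)
{
	size_t size = array_size_sketch(elem_size, count);
	void *dst;

	assert(src != NULL);
	if (size == SIZE_MAX)
		return NULL;	/* treat saturation as an allocation failure */

	dst = malloc(size);
	if (!dst)
		return NULL;

	memcpy(dst, src, size);
	return dst;
}
```

A caller would pass the element size and count, for example dup_array_sketch(in, sizeof(in[0]), 4), and free() the result. Keeping one size variable also gives the compiler a single value to reason about when it checks the copy against the allocation.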
In file included from arch/x86/include/asm/preempt.h:7, from include/linux/preempt.h:78, from include/linux/spinlock.h:55, from include/linux/mm_types.h:9, from include/linux/buildid.h:5, from include/linux/module.h:14, from drivers/rapidio/devices/rio_mport_cdev.c:13: In function 'check_copy_size', inlined from 'copy_from_user' at include/linux/uaccess.h:191:6, inlined from 'rio_mport_transfer_ioctl' at drivers/rapidio/devices/rio_mport_cdev.c:983:6: include/linux/thread_info.h:213:4: error: call to '__bad_copy_to' declared with attribute error: copy destination size is too small 213 | __bad_copy_to(); | ^~~~~~~~~~~~~~~ But the allocation size and the copy size are identical: transfer = vmalloc(array_size(sizeof(*transfer), transaction.count)); if (!transfer) return -ENOMEM; if (unlikely(copy_from_user(transfer, (void __user *)(uintptr_t)transaction.block, array_size(sizeof(*transfer), transaction.count)))) { Link: https://lkml.kernel.org/r/20210930222704.2631604-1-keescook@chromium.org Link: https://lkml.kernel.org/r/20210930222704.2631604-2-keescook@chromium.org Link: https://lore.kernel.org/linux-mm/202109091134.FHnRmRxu-lkp@intel.com/ Signed-off-by: Kees Cook Reviewed-by: John Hubbard Reported-by: kernel test robot Cc: Matt Porter Cc: Alexandre Bounine Cc: Jing Xiangfeng Cc: Ira Weiny Cc: Souptick Joarder Cc: Gustavo A. R. 
Silva Cc: Andy Whitcroft Cc: Christoph Lameter Cc: Daniel Micay Cc: David Rientjes Cc: Dennis Zhou Cc: Dwaipayan Ray Cc: Joe Perches Cc: Joonsoo Kim Cc: Lukas Bulwahn Cc: Miguel Ojeda Cc: Nathan Chancellor Cc: Nick Desaulniers Cc: Pekka Enberg Cc: Randy Dunlap Cc: Tejun Heo Cc: Vlastimil Babka Signed-off-by: Andrew Morton --- drivers/rapidio/devices/rio_mport_cdev.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) --- a/drivers/rapidio/devices/rio_mport_cdev.c~rapidio-avoid-bogus-__alloc_size-warning +++ a/drivers/rapidio/devices/rio_mport_cdev.c @@ -965,6 +965,7 @@ static int rio_mport_transfer_ioctl(stru struct rio_transfer_io *transfer; enum dma_data_direction dir; int i, ret = 0; + size_t size; if (unlikely(copy_from_user(&transaction, arg, sizeof(transaction)))) return -EFAULT; @@ -976,13 +977,14 @@ static int rio_mport_transfer_ioctl(stru priv->md->properties.transfer_mode) == 0) return -ENODEV; - transfer = vmalloc(array_size(sizeof(*transfer), transaction.count)); + size = array_size(sizeof(*transfer), transaction.count); + transfer = vmalloc(size); if (!transfer) return -ENOMEM; if (unlikely(copy_from_user(transfer, (void __user *)(uintptr_t)transaction.block, - array_size(sizeof(*transfer), transaction.count)))) { + size))) { ret = -EFAULT; goto out_free; } @@ -994,8 +996,7 @@ static int rio_mport_transfer_ioctl(stru transaction.sync, dir, &transfer[i]); if (unlikely(copy_to_user((void __user *)(uintptr_t)transaction.block, - transfer, - array_size(sizeof(*transfer), transaction.count)))) + transfer, size))) ret = -EFAULT; out_free: From patchwork Fri Nov 5 20:36:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605429 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with 
ESMTP id E0320C433EF for ; Fri, 5 Nov 2021 20:36:22 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 8B537611C0 for ; Fri, 5 Nov 2021 20:36:22 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 8B537611C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 28DEE940021; Fri, 5 Nov 2021 16:36:22 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 23D09940007; Fri, 5 Nov 2021 16:36:22 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 12B73940021; Fri, 5 Nov 2021 16:36:22 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0001.hostedemail.com [216.40.44.1]) by kanga.kvack.org (Postfix) with ESMTP id 04859940007 for ; Fri, 5 Nov 2021 16:36:22 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id C066C1856565A for ; Fri, 5 Nov 2021 20:36:21 +0000 (UTC) X-FDA: 78776034162.29.CF9D3CB Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf06.hostedemail.com (Postfix) with ESMTP id 64460801A8AE for ; Fri, 5 Nov 2021 20:36:21 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id C7293611EE; Fri, 5 Nov 2021 20:36:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144580; bh=TW2hIdei19ESGZqyhS5HK/bJ0sILzn3n1UhvzV3EMxs=; h=Date:From:To:Subject:In-Reply-To:From; b=doRtpdzuuPb3KnrUOjeRPCp/ys67o/WLYEFR2jBG4O+40rzwF2Ctln1vbHPeKdrBT pBwCD1hnU5YGllfuw8nLjfBuKv4Wun2UY2mjHetVnjFHhGyHOEVK3CUitkqltODqRF Qd5sESR2Hg0mXVWwLxkuCzxQQzP0HOr/iPN60A30= Date: Fri, 05 Nov 2021 13:36:19 -0700 From: Andrew Morton To: akpm@linux-foundation.org, 
alex.bou9@gmail.com, apw@canonical.com, cl@linux.com, danielmicay@gmail.com, dennis@kernel.org, dwaipayanray1@gmail.com, gustavoars@kernel.org, iamjoonsoo.kim@lge.com, ira.weiny@intel.com, jhubbard@nvidia.com, jingxiangfeng@huawei.com, joe@perches.com, jrdr.linux@gmail.com, keescook@chromium.org, linux-mm@kvack.org, lkp@intel.com, lukas.bulwahn@gmail.com, mm-commits@vger.kernel.org, mporter@kernel.crashing.org, nathan@kernel.org, ndesaulniers@google.com, ojeda@kernel.org, penberg@kernel.org, rdunlap@infradead.org, rientjes@google.com, tj@kernel.org, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 033/262] Compiler Attributes: add __alloc_size() for better bounds checking Message-ID: <20211105203619.7llAC4ARX%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf06.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=doRtpdzu; dmarc=none; spf=pass (imf06.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 64460801A8AE X-Stat-Signature: 9x8fwsdqfpksax9jdeg4n47qh8ngfjb7 X-HE-Tag: 1636144581-830713 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Kees Cook Subject: Compiler Attributes: add __alloc_size() for better bounds checking GCC and Clang can use the "alloc_size" attribute to better inform the results of __builtin_object_size() (for compile-time constant values). Clang can additionally use alloc_size to inform the results of __builtin_dynamic_object_size() (for run-time values). 
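A minimal userspace sketch of how the attribute feeds __builtin_object_size(), assuming GCC or Clang. my_alloc_sketch() and known_size_sketch() are hypothetical names and are not part of this patch. The function is marked noinline so that the attribute, rather than inlining, is what can tell the compiler the size. Whether the size is actually resolved depends on the compiler and optimization level, so no particular result is claimed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical allocator wrapper. alloc_size(1) says the returned
 * object is as large as the first argument; malloc says the returned
 * pointer does not alias any other live object.
 */
static __attribute__((alloc_size(1), malloc, noinline))
void *my_alloc_sketch(size_t size)
{
	return malloc(size);
}

/*
 * Ask the compiler how large it believes the allocation is.
 * For type 0, __builtin_object_size() yields the known size, or
 * (size_t)-1 when the compiler cannot determine it (commonly the
 * case without optimization).
 */
static size_t known_size_sketch(void)
{
	char *p = my_alloc_sketch(32);
	size_t n;

	assert(p != NULL);	/* sketch only: abort if malloc fails */
	n = __builtin_object_size(p, 0);
	free(p);
	return n;
}
```

Fortified copy helpers compare a value like this against the requested copy length, which is why a known allocation size can turn an overflow into a compile-time diagnostic.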
Because GCC sees the frequent use of struct_size() as an allocator size argument, and notices it can return SIZE_MAX (the overflow indication), it complains about these call sites overflowing (since SIZE_MAX is greater than the default -Walloc-size-larger-than=PTRDIFF_MAX). This isn't helpful since we already know a SIZE_MAX will be caught at run-time (this was an intentional design). To deal with this, we must disable this check as it is both a false positive and redundant. (Clang does not have this warning option.) Unfortunately, just checking the -Wno-alloc-size-larger-than is not sufficient to make the __alloc_size attribute behave correctly under older GCC versions. The attribute itself must be disabled in those situations too, as there appears to be no way to reliably silence the SIZE_MAX constant expression cases for GCC versions less than 9.1: In file included from ./include/linux/resource_ext.h:11, from ./include/linux/pci.h:40, from drivers/net/ethernet/intel/ixgbe/ixgbe.h:9, from drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c:4: In function 'kmalloc_node', inlined from 'ixgbe_alloc_q_vector' at ./include/linux/slab.h:743:9: ./include/linux/slab.h:618:9: error: argument 1 value '18446744073709551615' exceeds maximum object size 9223372036854775807 [-Werror=alloc-size-larger-than=] return __kmalloc_node(size, flags, node); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ./include/linux/slab.h: In function 'ixgbe_alloc_q_vector': ./include/linux/slab.h:455:7: note: in a call to allocation function '__kmalloc_node' declared here void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_slab_alignment __malloc; ^~~~~~~~~~~~~~ Specifically: -Wno-alloc-size-larger-than is not correctly handled by GCC < 9.1 https://godbolt.org/z/hqsfG7q84 (doesn't disable) https://godbolt.org/z/P9jdrPTYh (doesn't admit to not knowing about option) https://godbolt.org/z/465TPMWKb (only warns when other warnings appear) -Walloc-size-larger-than=18446744073709551615 is not handled by GCC < 
8.2 https://godbolt.org/z/73hh1EPxz (ignores numeric value) Since anything marked with __alloc_size would also qualify for marking with __malloc, just include __malloc along with it to avoid redundant markings. (Suggested by Linus Torvalds.) Finally, make sure checkpatch.pl doesn't get confused about finding the __alloc_size attribute on functions. (Thanks to Joe Perches.) Link: https://lkml.kernel.org/r/20210930222704.2631604-3-keescook@chromium.org Signed-off-by: Kees Cook Tested-by: Randy Dunlap Cc: Andy Whitcroft Cc: Christoph Lameter Cc: Daniel Micay Cc: David Rientjes Cc: Dennis Zhou Cc: Dwaipayan Ray Cc: Joe Perches Cc: Joonsoo Kim Cc: Lukas Bulwahn Cc: Pekka Enberg Cc: Tejun Heo Cc: Vlastimil Babka Cc: Alexandre Bounine Cc: Gustavo A. R. Silva Cc: Ira Weiny Cc: Jing Xiangfeng Cc: John Hubbard Cc: kernel test robot Cc: Matt Porter Cc: Miguel Ojeda Cc: Nathan Chancellor Cc: Nick Desaulniers Cc: Souptick Joarder Signed-off-by: Andrew Morton --- Makefile | 15 +++++++++++++++ include/linux/compiler-gcc.h | 8 ++++++++ include/linux/compiler_attributes.h | 10 ++++++++++ include/linux/compiler_types.h | 12 ++++++++++++ scripts/checkpatch.pl | 3 ++- 5 files changed, 47 insertions(+), 1 deletion(-) --- a/include/linux/compiler_attributes.h~compiler-attributes-add-__alloc_size-for-better-bounds-checking +++ a/include/linux/compiler_attributes.h @@ -34,6 +34,15 @@ #define __aligned_largest __attribute__((__aligned__)) /* + * Note: do not use this directly. Instead, use __alloc_size() since it is conditionally + * available and includes other attributes. + * + * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-alloc_005fsize-function-attribute + * clang: https://clang.llvm.org/docs/AttributeReference.html#alloc-size + */ +#define __alloc_size__(x, ...) 
__attribute__((__alloc_size__(x, ## __VA_ARGS__))) + +/* * Note: users of __always_inline currently do not write "inline" themselves, * which seems to be required by gcc to apply the attribute according * to its docs (and also "warning: always_inline function might not be @@ -153,6 +162,7 @@ /* * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-malloc-function-attribute + * clang: https://clang.llvm.org/docs/AttributeReference.html#malloc */ #define __malloc __attribute__((__malloc__)) --- a/include/linux/compiler-gcc.h~compiler-attributes-add-__alloc_size-for-better-bounds-checking +++ a/include/linux/compiler-gcc.h @@ -144,3 +144,11 @@ #else #define __diag_GCC_8(s) #endif + +/* + * Prior to 9.1, -Wno-alloc-size-larger-than (and therefore the "alloc_size" + * attribute) do not work, and must be disabled. + */ +#if GCC_VERSION < 90100 +#undef __alloc_size__ +#endif --- a/include/linux/compiler_types.h~compiler-attributes-add-__alloc_size-for-better-bounds-checking +++ a/include/linux/compiler_types.h @@ -250,6 +250,18 @@ struct ftrace_likely_data { # define __cficanonical #endif +/* + * Any place that could be marked with the "alloc_size" attribute is also + * a place to be marked with the "malloc" attribute. Do this as part of the + * __alloc_size macro to avoid redundant attributes and to avoid missing a + * __malloc marking. + */ +#ifdef __alloc_size__ +# define __alloc_size(x, ...) __alloc_size__(x, ## __VA_ARGS__) __malloc +#else +# define __alloc_size(x, ...) __malloc +#endif + #ifndef asm_volatile_goto #define asm_volatile_goto(x...) asm goto(x) #endif --- a/Makefile~compiler-attributes-add-__alloc_size-for-better-bounds-checking +++ a/Makefile @@ -1008,6 +1008,21 @@ ifdef CONFIG_CC_IS_GCC KBUILD_CFLAGS += -Wno-maybe-uninitialized endif +ifdef CONFIG_CC_IS_GCC +# The allocators already balk at large sizes, so silence the compiler +# warnings for bounds checks involving those possible values. 
While +# -Wno-alloc-size-larger-than would normally be used here, earlier versions +# of gcc (<9.1) weirdly don't handle the option correctly when _other_ +# warnings are produced (?!). Using -Walloc-size-larger-than=SIZE_MAX +# doesn't work (as it is documented to), silently resolving to "0" prior to +# version 9.1 (and producing an error more recently). Numeric values larger +# than PTRDIFF_MAX also don't work prior to version 9.1, which are silently +# ignored, continuing to default to PTRDIFF_MAX. So, left with no other +# choice, we must perform a versioned check to disable this warning. +# https://lore.kernel.org/lkml/20210824115859.187f272f@canb.auug.org.au +KBUILD_CFLAGS += $(call cc-ifversion, -ge, 0901, -Wno-alloc-size-larger-than) +endif + # disable invalid "can't wrap" optimizations for signed / pointers KBUILD_CFLAGS += -fno-strict-overflow --- a/scripts/checkpatch.pl~compiler-attributes-add-__alloc_size-for-better-bounds-checking +++ a/scripts/checkpatch.pl @@ -489,7 +489,8 @@ our $Attribute = qr{ ____cacheline_aligned| ____cacheline_aligned_in_smp| ____cacheline_internodealigned_in_smp| - __weak + __weak| + __alloc_size\s*\(\s*\d+\s*(?:,\s*\d+\s*)?\) }x; our $Modifier; our $Inline = qr{inline|__always_inline|noinline|__inline|__inline__}; From patchwork Fri Nov 5 20:36:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605431 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C96B3C433FE for ; Fri, 5 Nov 2021 20:36:26 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 7F94B6125F for ; Fri, 5 Nov 2021 20:36:26 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 7F94B6125F 
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 21B0A940022; Fri, 5 Nov 2021 16:36:26 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 178EB940007; Fri, 5 Nov 2021 16:36:26 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 08FAD940022; Fri, 5 Nov 2021 16:36:26 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0107.hostedemail.com [216.40.44.107]) by kanga.kvack.org (Postfix) with ESMTP id EDC2D940007 for ; Fri, 5 Nov 2021 16:36:25 -0400 (EDT) Received: from smtpin25.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id AA01B18562F62 for ; Fri, 5 Nov 2021 20:36:25 +0000 (UTC) X-FDA: 78776034330.25.61E36DE Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf12.hostedemail.com (Postfix) with ESMTP id 4298E10000AC for ; Fri, 5 Nov 2021 20:36:25 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id AE399611C0; Fri, 5 Nov 2021 20:36:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144584; bh=ar8dKobxy8HRi6pY7W2uldNAH6IQmSYxxrtr3Y0BwT8=; h=Date:From:To:Subject:In-Reply-To:From; b=efn4Kzo3CIcZWxzANExckdbfzmWZCxga+S6AxvDyZsyYWnrC6CRxe79orfa1sSFka Ges9r0aoQWou4/0LSekFqwNjodIzdg8hedgJjB7R7yjMP9KAe7TqTbJ6DR5rhc2VNy 3b0Jibstdo1R3VOnw0JqwxuC4lI6fIw4AGu9+uCQ= Date: Fri, 05 Nov 2021 13:36:23 -0700 From: Andrew Morton To: akpm@linux-foundation.org, alex.bou9@gmail.com, apw@canonical.com, cl@linux.com, danielmicay@gmail.com, dennis@kernel.org, dwaipayanray1@gmail.com, gustavoars@kernel.org, iamjoonsoo.kim@lge.com, ira.weiny@intel.com, jhubbard@nvidia.com, jingxiangfeng@huawei.com, joe@perches.com, jrdr.linux@gmail.com, keescook@chromium.org, 
linux-mm@kvack.org, lkp@intel.com, lukas.bulwahn@gmail.com, mm-commits@vger.kernel.org, mporter@kernel.crashing.org, nathan@kernel.org, ndesaulniers@google.com, ojeda@kernel.org, penberg@kernel.org, rdunlap@infradead.org, rientjes@google.com, tj@kernel.org, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 034/262] slab: clean up function prototypes Message-ID: <20211105203623.pP9FyETfE%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 4298E10000AC X-Stat-Signature: pmqzrcknz63fg48k5z4678z7u3rtigig Authentication-Results: imf12.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=efn4Kzo3; spf=pass (imf12.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144585-434989 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Kees Cook Subject: slab: clean up function prototypes Based on feedback from Joe Perches and Linus Torvalds, regularize the slab function prototypes before making attribute changes. Link: https://lkml.kernel.org/r/20210930222704.2631604-4-keescook@chromium.org Signed-off-by: Kees Cook Cc: Christoph Lameter Cc: Pekka Enberg Cc: David Rientjes Cc: Joonsoo Kim Cc: Vlastimil Babka Cc: Alexandre Bounine Cc: Andy Whitcroft Cc: Daniel Micay Cc: Dennis Zhou Cc: Dwaipayan Ray Cc: Gustavo A. R. 
Silva Cc: Ira Weiny Cc: Jing Xiangfeng Cc: Joe Perches Cc: John Hubbard Cc: kernel test robot Cc: Lukas Bulwahn Cc: Matt Porter Cc: Miguel Ojeda Cc: Nathan Chancellor Cc: Nick Desaulniers Cc: Randy Dunlap Cc: Souptick Joarder Cc: Tejun Heo Signed-off-by: Andrew Morton --- include/linux/slab.h | 68 ++++++++++++++++++++--------------------- 1 file changed, 34 insertions(+), 34 deletions(-) --- a/include/linux/slab.h~slab-clean-up-function-prototypes +++ a/include/linux/slab.h @@ -152,8 +152,8 @@ struct kmem_cache *kmem_cache_create_use slab_flags_t flags, unsigned int useroffset, unsigned int usersize, void (*ctor)(void *)); -void kmem_cache_destroy(struct kmem_cache *); -int kmem_cache_shrink(struct kmem_cache *); +void kmem_cache_destroy(struct kmem_cache *s); +int kmem_cache_shrink(struct kmem_cache *s); /* * Please use this macro to create slab caches. Simply specify the @@ -181,11 +181,11 @@ int kmem_cache_shrink(struct kmem_cache /* * Common kmalloc functions provided by all allocators */ -void * __must_check krealloc(const void *, size_t, gfp_t); -void kfree(const void *); -void kfree_sensitive(const void *); -size_t __ksize(const void *); -size_t ksize(const void *); +void * __must_check krealloc(const void *objp, size_t new_size, gfp_t flags); +void kfree(const void *objp); +void kfree_sensitive(const void *objp); +size_t __ksize(const void *objp); +size_t ksize(const void *objp); #ifdef CONFIG_PRINTK bool kmem_valid_obj(void *object); void kmem_dump_obj(void *object); @@ -426,8 +426,8 @@ static __always_inline unsigned int __km #endif /* !CONFIG_SLOB */ void *__kmalloc(size_t size, gfp_t flags) __assume_kmalloc_alignment __malloc; -void *kmem_cache_alloc(struct kmem_cache *, gfp_t flags) __assume_slab_alignment __malloc; -void kmem_cache_free(struct kmem_cache *, void *); +void *kmem_cache_alloc(struct kmem_cache *s, gfp_t flags) __assume_slab_alignment __malloc; +void kmem_cache_free(struct kmem_cache *s, void *objp); /* * Bulk allocation and freeing 
operations. These are accelerated in an @@ -436,8 +436,8 @@ void kmem_cache_free(struct kmem_cache * * * Note that interrupts must be enabled when calling these functions. */ -void kmem_cache_free_bulk(struct kmem_cache *, size_t, void **); -int kmem_cache_alloc_bulk(struct kmem_cache *, gfp_t, size_t, void **); +void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p); +int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size, void **p); /* * Caller must not use kfree_bulk() on memory not originally allocated @@ -450,7 +450,8 @@ static __always_inline void kfree_bulk(s #ifdef CONFIG_NUMA void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_kmalloc_alignment __malloc; -void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node) __assume_slab_alignment __malloc; +void *kmem_cache_alloc_node(struct kmem_cache *s, gfp_t flags, int node) __assume_slab_alignment + __malloc; #else static __always_inline void *__kmalloc_node(size_t size, gfp_t flags, int node) { @@ -464,25 +465,24 @@ static __always_inline void *kmem_cache_ #endif #ifdef CONFIG_TRACING -extern void *kmem_cache_alloc_trace(struct kmem_cache *, gfp_t, size_t) __assume_slab_alignment __malloc; +extern void *kmem_cache_alloc_trace(struct kmem_cache *s, gfp_t flags, size_t size) + __assume_slab_alignment __malloc; #ifdef CONFIG_NUMA -extern void *kmem_cache_alloc_node_trace(struct kmem_cache *s, - gfp_t gfpflags, - int node, size_t size) __assume_slab_alignment __malloc; +extern void *kmem_cache_alloc_node_trace(struct kmem_cache *s, gfp_t gfpflags, + int node, size_t size) __assume_slab_alignment __malloc; #else -static __always_inline void * -kmem_cache_alloc_node_trace(struct kmem_cache *s, - gfp_t gfpflags, - int node, size_t size) +static __always_inline void *kmem_cache_alloc_node_trace(struct kmem_cache *s, + gfp_t gfpflags, int node, + size_t size) { return kmem_cache_alloc_trace(s, gfpflags, size); } #endif /* CONFIG_NUMA */ #else /* 
CONFIG_TRACING */ -static __always_inline void *kmem_cache_alloc_trace(struct kmem_cache *s, - gfp_t flags, size_t size) +static __always_inline void *kmem_cache_alloc_trace(struct kmem_cache *s, gfp_t flags, + size_t size) { void *ret = kmem_cache_alloc(s, flags); @@ -490,10 +490,8 @@ static __always_inline void *kmem_cache_ return ret; } -static __always_inline void * -kmem_cache_alloc_node_trace(struct kmem_cache *s, - gfp_t gfpflags, - int node, size_t size) +static __always_inline void *kmem_cache_alloc_node_trace(struct kmem_cache *s, gfp_t gfpflags, + int node, size_t size) { void *ret = kmem_cache_alloc_node(s, gfpflags, node); @@ -502,13 +500,14 @@ kmem_cache_alloc_node_trace(struct kmem_ } #endif /* CONFIG_TRACING */ -extern void *kmalloc_order(size_t size, gfp_t flags, unsigned int order) __assume_page_alignment __malloc; +extern void *kmalloc_order(size_t size, gfp_t flags, unsigned int order) __assume_page_alignment + __malloc; #ifdef CONFIG_TRACING -extern void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order) __assume_page_alignment __malloc; +extern void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order) + __assume_page_alignment __malloc; #else -static __always_inline void * -kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order) +static __always_inline void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order) { return kmalloc_order(size, flags, order); } @@ -638,8 +637,8 @@ static inline void *kmalloc_array(size_t * @new_size: new size of a single member of the array * @flags: the type of memory to allocate (see kmalloc) */ -static __must_check inline void * -krealloc_array(void *p, size_t new_n, size_t new_size, gfp_t flags) +static inline void * __must_check krealloc_array(void *p, size_t new_n, size_t new_size, + gfp_t flags) { size_t bytes; @@ -668,7 +667,7 @@ static inline void *kcalloc(size_t n, si * allocator where we care about the real place the memory allocation * request comes 
from. */ -extern void *__kmalloc_track_caller(size_t, gfp_t, unsigned long); +extern void *__kmalloc_track_caller(size_t size, gfp_t flags, unsigned long caller); #define kmalloc_track_caller(size, flags) \ __kmalloc_track_caller(size, flags, _RET_IP_) @@ -691,7 +690,8 @@ static inline void *kcalloc_node(size_t #ifdef CONFIG_NUMA -extern void *__kmalloc_node_track_caller(size_t, gfp_t, int, unsigned long); +extern void *__kmalloc_node_track_caller(size_t size, gfp_t flags, int node, + unsigned long caller); #define kmalloc_node_track_caller(size, flags, node) \ __kmalloc_node_track_caller(size, flags, node, \ _RET_IP_) From patchwork Fri Nov 5 20:36:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605433 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E51D9C433FE for ; Fri, 5 Nov 2021 20:36:30 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 90AED611EE for ; Fri, 5 Nov 2021 20:36:30 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 90AED611EE Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 2DBD2940023; Fri, 5 Nov 2021 16:36:30 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 2678C940007; Fri, 5 Nov 2021 16:36:30 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0B71D940023; Fri, 5 Nov 2021 16:36:30 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0069.hostedemail.com [216.40.44.69]) by 
kanga.kvack.org (Postfix) with ESMTP id EA41B940007 for ; Fri, 5 Nov 2021 16:36:29 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id A866A8249980 for ; Fri, 5 Nov 2021 20:36:29 +0000 (UTC) X-FDA: 78776034498.29.45F8B39 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf12.hostedemail.com (Postfix) with ESMTP id 459D410000AF for ; Fri, 5 Nov 2021 20:36:29 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id A46F961262; Fri, 5 Nov 2021 20:36:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144588; bh=akYYhH0iJKPVWEkI9sNv9gAHAI6WfT9k0FloR+hb1yM=; h=Date:From:To:Subject:In-Reply-To:From; b=nocvR7TCxybtWS0lBGkBE4jGuDdrHA6OfJyh4MznRQF7PBJxokVSHgqV8IsFjbM1s jsVsKWkBFf9cFTuYDEhUSdLZN9RU4cPBCbLJWoHxFuRAy8KQhvsXx1NOIC6ALV1Euf s9zfsrWTQQTaZvthrkjBkwc3diAYVT5zVrHqg1Io= Date: Fri, 05 Nov 2021 13:36:27 -0700 From: Andrew Morton To: akpm@linux-foundation.org, alex.bou9@gmail.com, apw@canonical.com, cl@linux.com, danielmicay@gmail.com, dennis@kernel.org, dwaipayanray1@gmail.com, gustavoars@kernel.org, iamjoonsoo.kim@lge.com, ira.weiny@intel.com, jhubbard@nvidia.com, jingxiangfeng@huawei.com, joe@perches.com, jrdr.linux@gmail.com, keescook@chromium.org, linux-mm@kvack.org, lkp@intel.com, lukas.bulwahn@gmail.com, mm-commits@vger.kernel.org, mporter@kernel.crashing.org, nathan@kernel.org, ndesaulniers@google.com, ojeda@kernel.org, penberg@kernel.org, rdunlap@infradead.org, rientjes@google.com, tj@kernel.org, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 035/262] slab: add __alloc_size attributes for better bounds checking Message-ID: <20211105203627.piXTTUwEo%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf12.hostedemail.com; dkim=pass header.d=linux-foundation.org 
header.s=korg header.b=nocvR7TC; dmarc=none; spf=pass (imf12.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 459D410000AF X-Stat-Signature: azrhpypgsgigh4u5qy3ywza3dnnfwhdn X-HE-Tag: 1636144589-111623 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Kees Cook Subject: slab: add __alloc_size attributes for better bounds checking As already done in GrapheneOS, add the __alloc_size attribute for regular kmalloc interfaces, to provide additional hinting for better bounds checking, assisting CONFIG_FORTIFY_SOURCE and other compiler optimizations. Link: https://lkml.kernel.org/r/20210930222704.2631604-5-keescook@chromium.org Signed-off-by: Kees Cook Co-developed-by: Daniel Micay Signed-off-by: Daniel Micay Reviewed-by: Nick Desaulniers Cc: Christoph Lameter Cc: Pekka Enberg Cc: David Rientjes Cc: Joonsoo Kim Cc: Vlastimil Babka Cc: Andy Whitcroft Cc: Dennis Zhou Cc: Dwaipayan Ray Cc: Joe Perches Cc: Lukas Bulwahn Cc: Miguel Ojeda Cc: Nathan Chancellor Cc: Tejun Heo Cc: Alexandre Bounine Cc: Gustavo A. R. 
Silva Cc: Ira Weiny Cc: Jing Xiangfeng Cc: John Hubbard Cc: kernel test robot Cc: Matt Porter Cc: Randy Dunlap Cc: Souptick Joarder Signed-off-by: Andrew Morton --- include/linux/slab.h | 61 ++++++++++++++++++++++------------------- 1 file changed, 33 insertions(+), 28 deletions(-) --- a/include/linux/slab.h~slab-add-__alloc_size-attributes-for-better-bounds-checking +++ a/include/linux/slab.h @@ -181,7 +181,7 @@ int kmem_cache_shrink(struct kmem_cache /* * Common kmalloc functions provided by all allocators */ -void * __must_check krealloc(const void *objp, size_t new_size, gfp_t flags); +void * __must_check krealloc(const void *objp, size_t new_size, gfp_t flags) __alloc_size(2); void kfree(const void *objp); void kfree_sensitive(const void *objp); size_t __ksize(const void *objp); @@ -425,7 +425,7 @@ static __always_inline unsigned int __km #define kmalloc_index(s) __kmalloc_index(s, true) #endif /* !CONFIG_SLOB */ -void *__kmalloc(size_t size, gfp_t flags) __assume_kmalloc_alignment __malloc; +void *__kmalloc(size_t size, gfp_t flags) __assume_kmalloc_alignment __alloc_size(1); void *kmem_cache_alloc(struct kmem_cache *s, gfp_t flags) __assume_slab_alignment __malloc; void kmem_cache_free(struct kmem_cache *s, void *objp); @@ -449,11 +449,12 @@ static __always_inline void kfree_bulk(s } #ifdef CONFIG_NUMA -void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_kmalloc_alignment __malloc; +void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_kmalloc_alignment + __alloc_size(1); void *kmem_cache_alloc_node(struct kmem_cache *s, gfp_t flags, int node) __assume_slab_alignment __malloc; #else -static __always_inline void *__kmalloc_node(size_t size, gfp_t flags, int node) +static __always_inline __alloc_size(1) void *__kmalloc_node(size_t size, gfp_t flags, int node) { return __kmalloc(size, flags); } @@ -466,23 +467,23 @@ static __always_inline void *kmem_cache_ #ifdef CONFIG_TRACING extern void *kmem_cache_alloc_trace(struct kmem_cache *s, 
gfp_t flags, size_t size) - __assume_slab_alignment __malloc; + __assume_slab_alignment __alloc_size(3); #ifdef CONFIG_NUMA extern void *kmem_cache_alloc_node_trace(struct kmem_cache *s, gfp_t gfpflags, - int node, size_t size) __assume_slab_alignment __malloc; + int node, size_t size) __assume_slab_alignment + __alloc_size(4); #else -static __always_inline void *kmem_cache_alloc_node_trace(struct kmem_cache *s, - gfp_t gfpflags, int node, - size_t size) +static __always_inline __alloc_size(4) void *kmem_cache_alloc_node_trace(struct kmem_cache *s, + gfp_t gfpflags, int node, size_t size) { return kmem_cache_alloc_trace(s, gfpflags, size); } #endif /* CONFIG_NUMA */ #else /* CONFIG_TRACING */ -static __always_inline void *kmem_cache_alloc_trace(struct kmem_cache *s, gfp_t flags, - size_t size) +static __always_inline __alloc_size(3) void *kmem_cache_alloc_trace(struct kmem_cache *s, + gfp_t flags, size_t size) { void *ret = kmem_cache_alloc(s, flags); @@ -501,19 +502,20 @@ static __always_inline void *kmem_cache_ #endif /* CONFIG_TRACING */ extern void *kmalloc_order(size_t size, gfp_t flags, unsigned int order) __assume_page_alignment - __malloc; + __alloc_size(1); #ifdef CONFIG_TRACING extern void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order) - __assume_page_alignment __malloc; + __assume_page_alignment __alloc_size(1); #else -static __always_inline void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order) +static __always_inline __alloc_size(1) void *kmalloc_order_trace(size_t size, gfp_t flags, + unsigned int order) { return kmalloc_order(size, flags, order); } #endif -static __always_inline void *kmalloc_large(size_t size, gfp_t flags) +static __always_inline __alloc_size(1) void *kmalloc_large(size_t size, gfp_t flags) { unsigned int order = get_order(size); return kmalloc_order_trace(size, flags, order); @@ -573,7 +575,7 @@ static __always_inline void *kmalloc_lar * Try really hard to succeed the allocation but fail * 
eventually. */ -static __always_inline void *kmalloc(size_t size, gfp_t flags) +static __always_inline __alloc_size(1) void *kmalloc(size_t size, gfp_t flags) { if (__builtin_constant_p(size)) { #ifndef CONFIG_SLOB @@ -595,7 +597,7 @@ static __always_inline void *kmalloc(siz return __kmalloc(size, flags); } -static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node) +static __always_inline __alloc_size(1) void *kmalloc_node(size_t size, gfp_t flags, int node) { #ifndef CONFIG_SLOB if (__builtin_constant_p(size) && @@ -619,7 +621,7 @@ static __always_inline void *kmalloc_nod * @size: element size. * @flags: the type of memory to allocate (see kmalloc). */ -static inline void *kmalloc_array(size_t n, size_t size, gfp_t flags) +static inline __alloc_size(1, 2) void *kmalloc_array(size_t n, size_t size, gfp_t flags) { size_t bytes; @@ -637,8 +639,10 @@ static inline void *kmalloc_array(size_t * @new_size: new size of a single member of the array * @flags: the type of memory to allocate (see kmalloc) */ -static inline void * __must_check krealloc_array(void *p, size_t new_n, size_t new_size, - gfp_t flags) +static inline __alloc_size(2, 3) void * __must_check krealloc_array(void *p, + size_t new_n, + size_t new_size, + gfp_t flags) { size_t bytes; @@ -654,7 +658,7 @@ static inline void * __must_check kreall * @size: element size. * @flags: the type of memory to allocate (see kmalloc). */ -static inline void *kcalloc(size_t n, size_t size, gfp_t flags) +static inline __alloc_size(1, 2) void *kcalloc(size_t n, size_t size, gfp_t flags) { return kmalloc_array(n, size, flags | __GFP_ZERO); } @@ -667,12 +671,13 @@ static inline void *kcalloc(size_t n, si * allocator where we care about the real place the memory allocation * request comes from. 
*/ -extern void *__kmalloc_track_caller(size_t size, gfp_t flags, unsigned long caller); +extern void *__kmalloc_track_caller(size_t size, gfp_t flags, unsigned long caller) + __alloc_size(1); #define kmalloc_track_caller(size, flags) \ __kmalloc_track_caller(size, flags, _RET_IP_) -static inline void *kmalloc_array_node(size_t n, size_t size, gfp_t flags, - int node) +static inline __alloc_size(1, 2) void *kmalloc_array_node(size_t n, size_t size, gfp_t flags, + int node) { size_t bytes; @@ -683,7 +688,7 @@ static inline void *kmalloc_array_node(s return __kmalloc_node(bytes, flags, node); } -static inline void *kcalloc_node(size_t n, size_t size, gfp_t flags, int node) +static inline __alloc_size(1, 2) void *kcalloc_node(size_t n, size_t size, gfp_t flags, int node) { return kmalloc_array_node(n, size, flags | __GFP_ZERO, node); } @@ -691,7 +696,7 @@ static inline void *kcalloc_node(size_t #ifdef CONFIG_NUMA extern void *__kmalloc_node_track_caller(size_t size, gfp_t flags, int node, - unsigned long caller); + unsigned long caller) __alloc_size(1); #define kmalloc_node_track_caller(size, flags, node) \ __kmalloc_node_track_caller(size, flags, node, \ _RET_IP_) @@ -716,7 +721,7 @@ static inline void *kmem_cache_zalloc(st * @size: how many bytes of memory are required. * @flags: the type of memory to allocate (see kmalloc). */ -static inline void *kzalloc(size_t size, gfp_t flags) +static inline __alloc_size(1) void *kzalloc(size_t size, gfp_t flags) { return kmalloc(size, flags | __GFP_ZERO); } @@ -727,7 +732,7 @@ static inline void *kzalloc(size_t size, * @flags: the type of memory to allocate (see kmalloc). 
* @node: memory node from which to allocate */ -static inline void *kzalloc_node(size_t size, gfp_t flags, int node) +static inline __alloc_size(1) void *kzalloc_node(size_t size, gfp_t flags, int node) { return kmalloc_node(size, flags | __GFP_ZERO, node); } From patchwork Fri Nov 5 20:36:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605435 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A4EDFC433FE for ; Fri, 5 Nov 2021 20:36:34 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 5F00E6125F for ; Fri, 5 Nov 2021 20:36:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 5F00E6125F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id F0FD8940024; Fri, 5 Nov 2021 16:36:33 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E99D6940007; Fri, 5 Nov 2021 16:36:33 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D61AF940024; Fri, 5 Nov 2021 16:36:33 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0174.hostedemail.com [216.40.44.174]) by kanga.kvack.org (Postfix) with ESMTP id BAFE1940007 for ; Fri, 5 Nov 2021 16:36:33 -0400 (EDT) Received: from smtpin31.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 83E9F76BA0 for ; Fri, 5 Nov 2021 20:36:33 +0000 (UTC) X-FDA: 78776034666.31.F091F30 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) 
by imf29.hostedemail.com (Postfix) with ESMTP id 179D5900025D for ; Fri, 5 Nov 2021 20:36:32 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 828C961252; Fri, 5 Nov 2021 20:36:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144592; bh=myepLjtfWp/8bvq/A1IB+XDx9kY5s7Ss424hxk2FKrI=; h=Date:From:To:Subject:In-Reply-To:From; b=IorBU4yU2jDlI9/eF11uKl2lEPTzCStlsIr/yJnlKI0VLFLJ3rUq6SPDM8/fnbMZD AyJe4x6IDCUDLfYAya48c7GEkJNjwCLLjwldNstzgjdMtzxGof7NUGgh49qavDeDrN j/xLD5ApkKtGLqJRKL4cMNphwnD8HOOXN6QXzWzI= Date: Fri, 05 Nov 2021 13:36:31 -0700 From: Andrew Morton To: akpm@linux-foundation.org, alex.bou9@gmail.com, apw@canonical.com, cl@linux.com, danielmicay@gmail.com, dennis@kernel.org, dwaipayanray1@gmail.com, gustavoars@kernel.org, iamjoonsoo.kim@lge.com, ira.weiny@intel.com, jhubbard@nvidia.com, jingxiangfeng@huawei.com, joe@perches.com, jrdr.linux@gmail.com, keescook@chromium.org, linux-mm@kvack.org, lkp@intel.com, lukas.bulwahn@gmail.com, mm-commits@vger.kernel.org, mporter@kernel.crashing.org, nathan@kernel.org, ndesaulniers@google.com, ojeda@kernel.org, penberg@kernel.org, rdunlap@infradead.org, rientjes@google.com, tj@kernel.org, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 036/262] mm/kvmalloc: add __alloc_size attributes for better bounds checking Message-ID: <20211105203631.9gBOpeNth%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=IorBU4yU; dmarc=none; spf=pass (imf29.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 179D5900025D X-Stat-Signature: 9fgo1unwh4ed9fu47fe9z5hasidh9pny X-HE-Tag: 1636144592-272182 X-Bogosity: Ham, tests=bogofilter, 
spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Kees Cook
Subject: mm/kvmalloc: add __alloc_size attributes for better bounds checking

As already done in GrapheneOS, add the __alloc_size attribute for regular
kvmalloc interfaces, to provide additional hinting for better bounds
checking, assisting CONFIG_FORTIFY_SOURCE and other compiler
optimizations.

Link: https://lkml.kernel.org/r/20210930222704.2631604-6-keescook@chromium.org
Signed-off-by: Kees Cook
Co-developed-by: Daniel Micay
Signed-off-by: Daniel Micay
Reviewed-by: Nick Desaulniers
Cc: Christoph Lameter
Cc: Pekka Enberg
Cc: David Rientjes
Cc: Joonsoo Kim
Cc: Vlastimil Babka
Cc: Andy Whitcroft
Cc: Dennis Zhou
Cc: Dwaipayan Ray
Cc: Joe Perches
Cc: Lukas Bulwahn
Cc: Miguel Ojeda
Cc: Nathan Chancellor
Cc: Tejun Heo
Cc: Alexandre Bounine
Cc: Gustavo A. R. Silva
Cc: Ira Weiny
Cc: Jing Xiangfeng
Cc: John Hubbard
Cc: kernel test robot
Cc: Matt Porter
Cc: Randy Dunlap
Cc: Souptick Joarder
Signed-off-by: Andrew Morton
---

 include/linux/slab.h | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

--- a/include/linux/slab.h~mm-kvmalloc-add-__alloc_size-attributes-for-better-bounds-checking
+++ a/include/linux/slab.h
@@ -737,21 +737,21 @@ static inline __alloc_size(1) void *kzal
 	return kmalloc_node(size, flags | __GFP_ZERO, node);
 }
 
-extern void *kvmalloc_node(size_t size, gfp_t flags, int node);
-static inline void *kvmalloc(size_t size, gfp_t flags)
+extern void *kvmalloc_node(size_t size, gfp_t flags, int node) __alloc_size(1);
+static inline __alloc_size(1) void *kvmalloc(size_t size, gfp_t flags)
 {
 	return kvmalloc_node(size, flags, NUMA_NO_NODE);
 }
-static inline void *kvzalloc_node(size_t size, gfp_t flags, int node)
+static inline __alloc_size(1) void *kvzalloc_node(size_t size, gfp_t flags, int node)
 {
 	return kvmalloc_node(size, flags | __GFP_ZERO, node);
 }
-static inline void *kvzalloc(size_t size, gfp_t flags)
+static inline __alloc_size(1) void *kvzalloc(size_t size, gfp_t flags)
 {
 	return kvmalloc(size, flags | __GFP_ZERO);
 }
 
-static inline void *kvmalloc_array(size_t n, size_t size, gfp_t flags)
+static inline __alloc_size(1, 2) void *kvmalloc_array(size_t n, size_t size, gfp_t flags)
 {
 	size_t bytes;
 
@@ -761,13 +761,13 @@ static inline void *kvmalloc_array(size_
 	return kvmalloc(bytes, flags);
 }
 
-static inline void *kvcalloc(size_t n, size_t size, gfp_t flags)
+static inline __alloc_size(1, 2) void *kvcalloc(size_t n, size_t size, gfp_t flags)
 {
 	return kvmalloc_array(n, size, flags | __GFP_ZERO);
 }
 
-extern void *kvrealloc(const void *p, size_t oldsize, size_t newsize,
-		gfp_t flags);
+extern void *kvrealloc(const void *p, size_t oldsize, size_t newsize, gfp_t flags)
+		      __alloc_size(3);
 extern void kvfree(const void *addr);
 extern void kvfree_sensitive(const void *addr, size_t len);
 

From patchwork Fri Nov 5 20:36:34 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605437
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B8B45C433EF for ; Fri, 5 Nov 2021 20:36:38 +0000 (UTC)
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6F20861263 for ; Fri, 5 Nov 2021 20:36:38 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 6F20861263
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org
Received: by kanga.kvack.org (Postfix) id 0930C940025; Fri, 5 Nov 2021 16:36:38 -0400 (EDT)
Received: by kanga.kvack.org (Postfix, from userid 40) id 019A3940007; Fri, 5 Nov 2021 16:36:37 -0400 (EDT)
X-Delivered-To:
int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E23DC940025; Fri, 5 Nov 2021 16:36:37 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0111.hostedemail.com [216.40.44.111]) by kanga.kvack.org (Postfix) with ESMTP id CCBA2940007 for ; Fri, 5 Nov 2021 16:36:37 -0400 (EDT) Received: from smtpin30.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 7C9FA27710 for ; Fri, 5 Nov 2021 20:36:37 +0000 (UTC) X-FDA: 78776034834.30.8F36087 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf21.hostedemail.com (Postfix) with ESMTP id 4A843D036A44 for ; Fri, 5 Nov 2021 20:36:32 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 5537F6125F; Fri, 5 Nov 2021 20:36:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144596; bh=vGbSSE8wZa8f5EBkGTx5DhlCtmkjYXrvnFWjJTaDngQ=; h=Date:From:To:Subject:In-Reply-To:From; b=APELzXTv03FNAkmqsWQ8mmSS2aVDvfyrxGYfQVJxty2TkA1XtW986eXqkTft3IoSA CfkxyAN2VA9ei1TEyd62LoQyxzH5Di7RqxAI2G9Vdq2lLMnSFBKpunaEfQubL8SeeN JCzqk4Z70PzYCwCrtYLGJ/uuWrY+4ZDQSaGs4v48= Date: Fri, 05 Nov 2021 13:36:34 -0700 From: Andrew Morton To: akpm@linux-foundation.org, alex.bou9@gmail.com, apw@canonical.com, cl@linux.com, danielmicay@gmail.com, dennis@kernel.org, dwaipayanray1@gmail.com, gustavoars@kernel.org, iamjoonsoo.kim@lge.com, ira.weiny@intel.com, jhubbard@nvidia.com, jingxiangfeng@huawei.com, joe@perches.com, jrdr.linux@gmail.com, keescook@chromium.org, linux-mm@kvack.org, lkp@intel.com, lukas.bulwahn@gmail.com, mm-commits@vger.kernel.org, mporter@kernel.crashing.org, nathan@kernel.org, ndesaulniers@google.com, ojeda@kernel.org, penberg@kernel.org, rdunlap@infradead.org, rientjes@google.com, tj@kernel.org, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 037/262] mm/vmalloc: add __alloc_size attributes for better bounds checking 
Message-ID: <20211105203634.o9OEm5iFe%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 4A843D036A44 X-Stat-Signature: kx5m8qcozz9krd76gacg18aqri5sirpz Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=APELzXTv; dmarc=none; spf=pass (imf21.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144592-381333 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Kees Cook Subject: mm/vmalloc: add __alloc_size attributes for better bounds checking As already done in GrapheneOS, add the __alloc_size attribute for appropriate vmalloc allocator interfaces, to provide additional hinting for better bounds checking, assisting CONFIG_FORTIFY_SOURCE and other compiler optimizations. Link: https://lkml.kernel.org/r/20210930222704.2631604-7-keescook@chromium.org Signed-off-by: Kees Cook Co-developed-by: Daniel Micay Signed-off-by: Daniel Micay Cc: Andy Whitcroft Cc: Christoph Lameter Cc: David Rientjes Cc: Dennis Zhou Cc: Dwaipayan Ray Cc: Joe Perches Cc: Joonsoo Kim Cc: Lukas Bulwahn Cc: Miguel Ojeda Cc: Nathan Chancellor Cc: Nick Desaulniers Cc: Pekka Enberg Cc: Tejun Heo Cc: Vlastimil Babka Cc: Alexandre Bounine Cc: Gustavo A. R. 
Silva
Cc: Ira Weiny
Cc: Jing Xiangfeng
Cc: John Hubbard
Cc: kernel test robot
Cc: Matt Porter
Cc: Randy Dunlap
Cc: Souptick Joarder
Signed-off-by: Andrew Morton
---

 include/linux/vmalloc.h | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

--- a/include/linux/vmalloc.h~mm-vmalloc-add-__alloc_size-attributes-for-better-bounds-checking
+++ a/include/linux/vmalloc.h
@@ -136,21 +136,21 @@ static inline void vmalloc_init(void)
 static inline unsigned long vmalloc_nr_pages(void) { return 0; }
 #endif
 
-extern void *vmalloc(unsigned long size);
-extern void *vzalloc(unsigned long size);
-extern void *vmalloc_user(unsigned long size);
-extern void *vmalloc_node(unsigned long size, int node);
-extern void *vzalloc_node(unsigned long size, int node);
-extern void *vmalloc_32(unsigned long size);
-extern void *vmalloc_32_user(unsigned long size);
-extern void *__vmalloc(unsigned long size, gfp_t gfp_mask);
+extern void *vmalloc(unsigned long size) __alloc_size(1);
+extern void *vzalloc(unsigned long size) __alloc_size(1);
+extern void *vmalloc_user(unsigned long size) __alloc_size(1);
+extern void *vmalloc_node(unsigned long size, int node) __alloc_size(1);
+extern void *vzalloc_node(unsigned long size, int node) __alloc_size(1);
+extern void *vmalloc_32(unsigned long size) __alloc_size(1);
+extern void *vmalloc_32_user(unsigned long size) __alloc_size(1);
+extern void *__vmalloc(unsigned long size, gfp_t gfp_mask) __alloc_size(1);
 extern void *__vmalloc_node_range(unsigned long size, unsigned long align,
 			unsigned long start, unsigned long end, gfp_t gfp_mask,
 			pgprot_t prot, unsigned long vm_flags, int node,
-			const void *caller);
+			const void *caller) __alloc_size(1);
 void *__vmalloc_node(unsigned long size, unsigned long align, gfp_t gfp_mask,
-		int node, const void *caller);
-void *vmalloc_no_huge(unsigned long size);
+		int node, const void *caller) __alloc_size(1);
+void *vmalloc_no_huge(unsigned long size) __alloc_size(1);
 
 extern void vfree(const
void *addr); extern void vfree_atomic(const void *addr); From patchwork Fri Nov 5 20:36:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605439 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 524A5C433FE for ; Fri, 5 Nov 2021 20:36:42 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 0C6BD61262 for ; Fri, 5 Nov 2021 20:36:42 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 0C6BD61262 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id A0749940026; Fri, 5 Nov 2021 16:36:41 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 98F6F940007; Fri, 5 Nov 2021 16:36:41 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 85CE9940027; Fri, 5 Nov 2021 16:36:41 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0223.hostedemail.com [216.40.44.223]) by kanga.kvack.org (Postfix) with ESMTP id 70E99940026 for ; Fri, 5 Nov 2021 16:36:41 -0400 (EDT) Received: from smtpin31.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 27C5A1856566F for ; Fri, 5 Nov 2021 20:36:41 +0000 (UTC) X-FDA: 78776035002.31.4F9E736 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf15.hostedemail.com (Postfix) with ESMTP id E81F2D0000A3 for ; Fri, 5 Nov 2021 20:36:29 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 3F23461288; Fri, 5 Nov 2021 20:36:39 
+0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144600; bh=nj+jXMcQkw55wlrLcahnKl+uxAyVUt0et9kd1SJC76E=; h=Date:From:To:Subject:In-Reply-To:From; b=Bz85E1W9j9QqzZil6HTJMghEdlNO2gb2lPn3Flxw37rQ/lpCD86cgb/deHL3vPI3X OWVRVmePL4iw5B2SjpVeXlmGDgVho3llnrVXHaRISX9gnLcWFDdoLHOf10ES42s7pA vyMJqitvzedtGZa9vP8DVA67jCGDcs5agiHjr7dY= Date: Fri, 05 Nov 2021 13:36:38 -0700 From: Andrew Morton To: akpm@linux-foundation.org, alex.bou9@gmail.com, apw@canonical.com, cl@linux.com, danielmicay@gmail.com, dennis@kernel.org, dwaipayanray1@gmail.com, gustavoars@kernel.org, iamjoonsoo.kim@lge.com, ira.weiny@intel.com, jhubbard@nvidia.com, jingxiangfeng@huawei.com, joe@perches.com, jrdr.linux@gmail.com, keescook@chromium.org, linux-mm@kvack.org, lkp@intel.com, lukas.bulwahn@gmail.com, mm-commits@vger.kernel.org, mporter@kernel.crashing.org, nathan@kernel.org, ndesaulniers@google.com, ojeda@kernel.org, penberg@kernel.org, rdunlap@infradead.org, rientjes@google.com, tj@kernel.org, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 038/262] mm/page_alloc: add __alloc_size attributes for better bounds checking Message-ID: <20211105203638.pMNMEqAVm%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: E81F2D0000A3 X-Stat-Signature: zn3r5tdqc4u8jmxxosa14u8kyrir6rbu Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Bz85E1W9; dmarc=none; spf=pass (imf15.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144589-571146 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Kees Cook Subject: mm/page_alloc: add __alloc_size attributes for better 
bounds checking

As already done in GrapheneOS, add the __alloc_size attribute for
appropriate page allocator interfaces, to provide additional hinting for
better bounds checking, assisting CONFIG_FORTIFY_SOURCE and other compiler
optimizations.

Link: https://lkml.kernel.org/r/20210930222704.2631604-8-keescook@chromium.org
Signed-off-by: Kees Cook
Co-developed-by: Daniel Micay
Signed-off-by: Daniel Micay
Cc: Andy Whitcroft
Cc: Christoph Lameter
Cc: David Rientjes
Cc: Dennis Zhou
Cc: Dwaipayan Ray
Cc: Joe Perches
Cc: Joonsoo Kim
Cc: Lukas Bulwahn
Cc: Miguel Ojeda
Cc: Nathan Chancellor
Cc: Nick Desaulniers
Cc: Pekka Enberg
Cc: Tejun Heo
Cc: Vlastimil Babka
Cc: Alexandre Bounine
Cc: Gustavo A. R. Silva
Cc: Ira Weiny
Cc: Jing Xiangfeng
Cc: John Hubbard
Cc: kernel test robot
Cc: Matt Porter
Cc: Randy Dunlap
Cc: Souptick Joarder
Signed-off-by: Andrew Morton
---

 include/linux/gfp.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/include/linux/gfp.h~mm-page_alloc-add-__alloc_size-attributes-for-better-bounds-checking
+++ a/include/linux/gfp.h
@@ -608,9 +608,9 @@ static inline struct page *alloc_pages(g
 extern unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order);
 extern unsigned long get_zeroed_page(gfp_t gfp_mask);
 
-void *alloc_pages_exact(size_t size, gfp_t gfp_mask);
+void *alloc_pages_exact(size_t size, gfp_t gfp_mask) __alloc_size(1);
 void free_pages_exact(void *virt, size_t size);
-void * __meminit alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask);
+__meminit void *alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask) __alloc_size(1);
 
 #define __get_free_page(gfp_mask) \
 		__get_free_pages((gfp_mask), 0)

From patchwork Fri Nov 5 20:36:42 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605441
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from
mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 58CFAC433F5 for ; Fri, 5 Nov 2021 20:36:46 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 0E951611EE for ; Fri, 5 Nov 2021 20:36:46 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 0E951611EE Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id A094594000E; Fri, 5 Nov 2021 16:36:45 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9B657940007; Fri, 5 Nov 2021 16:36:45 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8A5B694000E; Fri, 5 Nov 2021 16:36:45 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0049.hostedemail.com [216.40.44.49]) by kanga.kvack.org (Postfix) with ESMTP id 7434B940007 for ; Fri, 5 Nov 2021 16:36:45 -0400 (EDT) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 40E9518404017 for ; Fri, 5 Nov 2021 20:36:45 +0000 (UTC) X-FDA: 78776035128.26.244F13D Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf21.hostedemail.com (Postfix) with ESMTP id E0978D036A44 for ; Fri, 5 Nov 2021 20:36:39 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 188D161252; Fri, 5 Nov 2021 20:36:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144604; bh=07XHwbCqaKPeuz8WUdfDRiYFEofoYlspfb5ynFhIWvw=; h=Date:From:To:Subject:In-Reply-To:From; b=fk8GKHsh5KBEGCrZWMRfAGK/Fr2+xeX48uK3Wd8sOP0GHoU/0Y10u8FtHjw4wL20t ebvaOvUERkSvkALZQ9fOkxkuYkAY7CcOgBLSRQgkVWdZqccnmWHl9RYTkLfmGrLo6k 68a01QWZOXZv2ZCVE67Jj+ler1sEUeFcxqbx1VSY= Date: 
Fri, 05 Nov 2021 13:36:42 -0700 From: Andrew Morton To: akpm@linux-foundation.org, alex.bou9@gmail.com, apw@canonical.com, cl@linux.com, danielmicay@gmail.com, dennis@kernel.org, dwaipayanray1@gmail.com, gustavoars@kernel.org, iamjoonsoo.kim@lge.com, ira.weiny@intel.com, jhubbard@nvidia.com, jingxiangfeng@huawei.com, joe@perches.com, jrdr.linux@gmail.com, keescook@chromium.org, linux-mm@kvack.org, lkp@intel.com, lukas.bulwahn@gmail.com, mm-commits@vger.kernel.org, mporter@kernel.crashing.org, nathan@kernel.org, ndesaulniers@google.com, ojeda@kernel.org, penberg@kernel.org, rdunlap@infradead.org, rientjes@google.com, tj@kernel.org, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 039/262] percpu: add __alloc_size attributes for better bounds checking Message-ID: <20211105203642.Mqd-16juk%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=fk8GKHsh; dmarc=none; spf=pass (imf21.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: E0978D036A44 X-Stat-Signature: z1ra4efx1gdggqjyncg9sg7c1rfiqu6q X-HE-Tag: 1636144599-152496 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Kees Cook Subject: percpu: add __alloc_size attributes for better bounds checking As already done in GrapheneOS, add the __alloc_size attribute for appropriate percpu allocator interfaces, to provide additional hinting for better bounds checking, assisting CONFIG_FORTIFY_SOURCE and other compiler optimizations. Note that due to the implementation of the percpu API, this is unlikely to ever actually provide compile-time checking beyond very simple non-SMP builds. 
But, since they are technically allocators, mark them as such. Link: https://lkml.kernel.org/r/20210930222704.2631604-9-keescook@chromium.org Signed-off-by: Kees Cook Co-developed-by: Daniel Micay Signed-off-by: Daniel Micay Acked-by: Dennis Zhou Cc: Tejun Heo Cc: Christoph Lameter Cc: Andy Whitcroft Cc: David Rientjes Cc: Dwaipayan Ray Cc: Joe Perches Cc: Joonsoo Kim Cc: Lukas Bulwahn Cc: Miguel Ojeda Cc: Nathan Chancellor Cc: Nick Desaulniers Cc: Pekka Enberg Cc: Vlastimil Babka Cc: Alexandre Bounine Cc: Gustavo A. R. Silva Cc: Ira Weiny Cc: Jing Xiangfeng Cc: John Hubbard Cc: kernel test robot Cc: Matt Porter Cc: Randy Dunlap Cc: Souptick Joarder Signed-off-by: Andrew Morton --- include/linux/percpu.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) --- a/include/linux/percpu.h~percpu-add-__alloc_size-attributes-for-better-bounds-checking +++ a/include/linux/percpu.h @@ -123,7 +123,7 @@ extern int __init pcpu_page_first_chunk( pcpu_fc_populate_pte_fn_t populate_pte_fn); #endif -extern void __percpu *__alloc_reserved_percpu(size_t size, size_t align); +extern void __percpu *__alloc_reserved_percpu(size_t size, size_t align) __alloc_size(1); extern bool __is_kernel_percpu_address(unsigned long addr, unsigned long *can_addr); extern bool is_kernel_percpu_address(unsigned long addr); @@ -131,8 +131,8 @@ extern bool is_kernel_percpu_address(uns extern void __init setup_per_cpu_areas(void); #endif -extern void __percpu *__alloc_percpu_gfp(size_t size, size_t align, gfp_t gfp); -extern void __percpu *__alloc_percpu(size_t size, size_t align); +extern void __percpu *__alloc_percpu_gfp(size_t size, size_t align, gfp_t gfp) __alloc_size(1); +extern void __percpu *__alloc_percpu(size_t size, size_t align) __alloc_size(1); extern void free_percpu(void __percpu *__pdata); extern phys_addr_t per_cpu_ptr_to_phys(void *addr); From patchwork Fri Nov 5 20:36:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605443 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 160DDC433F5 for ; Fri, 5 Nov 2021 20:36:49 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D6F9D611EE for ; Fri, 5 Nov 2021 20:36:48 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org D6F9D611EE Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 74945940027; Fri, 5 Nov 2021 16:36:48 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 6D18A940007; Fri, 5 Nov 2021 16:36:48 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 52400940027; Fri, 5 Nov 2021 16:36:48 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0160.hostedemail.com [216.40.44.160]) by kanga.kvack.org (Postfix) with ESMTP id 3BFA0940007 for ; Fri, 5 Nov 2021 16:36:48 -0400 (EDT) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id F363375977 for ; Fri, 5 Nov 2021 20:36:47 +0000 (UTC) X-FDA: 78776035296.07.565955E Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf02.hostedemail.com (Postfix) with ESMTP id 52BF37001A05 for ; Fri, 5 Nov 2021 20:36:42 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id A965561262; Fri, 5 Nov 2021 20:36:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144606; bh=K30jLwJu2H2IwN4Vev+J4MYWEvdqbHjCEuX/VAadIw4=; 
h=Date:From:To:Subject:In-Reply-To:From; b=sZ86veP/gmd7mV3YY8JLyGqeM2FrNJHKbYhAXDwGwWRROuns0MQXzqLo2FgN/jZ2R TIZ1CL+hRtEmSZR0mMxfJ01MhNJp7wP32kJfRTNnSINOozELZNPhzcFs9mqzNcQcKK 780vQJTFjw/gqajIGMtRWAiBLPckZfX5xdBmhI0U= Date: Fri, 05 Nov 2021 13:36:46 -0700 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, vbabka@suse.cz, zhangyinan2019@email.szu.edu.cn Subject: [patch 040/262] mm/page_ext.c: fix a comment Message-ID: <20211105203646.2UGG_XIty%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf02.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="sZ86veP/"; dmarc=none; spf=pass (imf02.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 52BF37001A05 X-Stat-Signature: bdmo3bzeqx4nnbd3i5mzdetq41o379bg X-HE-Tag: 1636144602-263732 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Yinan Zhang Subject: mm/page_ext.c: fix a comment I noticed that the preceding conditional is #ifndef CONFIG_SPARSEMEM, so the comment on the matching #else should be CONFIG_SPARSEMEM rather than CONFIG_FLATMEM.
Link: https://lkml.kernel.org/r/20211008140312.6492-1-zhangyinan2019@email.szu.edu.cn Signed-off-by: Yinan Zhang Acked-by: Vlastimil Babka Signed-off-by: Andrew Morton --- mm/page_ext.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/page_ext.c~mm-fix-a-comment +++ a/mm/page_ext.c @@ -201,7 +201,7 @@ fail: panic("Out of memory"); } -#else /* CONFIG_FLATMEM */ +#else /* CONFIG_SPARSEMEM */ struct page_ext *lookup_page_ext(const struct page *page) { From patchwork Fri Nov 5 20:36:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605445 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0EE91C433EF for ; Fri, 5 Nov 2021 20:36:52 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B5A0D611EE for ; Fri, 5 Nov 2021 20:36:51 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org B5A0D611EE Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 59B98940028; Fri, 5 Nov 2021 16:36:51 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 52299940007; Fri, 5 Nov 2021 16:36:51 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3C338940028; Fri, 5 Nov 2021 16:36:51 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0032.hostedemail.com [216.40.44.32]) by kanga.kvack.org (Postfix) with ESMTP id 2E021940007 for ; Fri, 5 Nov 2021 16:36:51 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com 
[10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id DCBDF75807 for ; Fri, 5 Nov 2021 20:36:50 +0000 (UTC) X-FDA: 78776035380.29.DDFFB92 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf25.hostedemail.com (Postfix) with ESMTP id F2C92B000188 for ; Fri, 5 Nov 2021 20:36:40 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 99ECB6127A; Fri, 5 Nov 2021 20:36:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144609; bh=uhBELLkR39FKXoBKx2krq9Lq6sztERzbvXjBlZH47K0=; h=Date:From:To:Subject:In-Reply-To:From; b=gY+GAuJAyVtKLlQlVJNdAF49s9cSxTpohchxSe3IT5jw+eLTzWc1InXVTMIj2UDGH kxL2sWLXtkCyixW4USENl/ykcBobeLuS2fU6nMEGWUkxf/4q3EhnEKsnWb4VN27VPo h1CuxsiNzXFi9vQYXBNY6zcTpyjOkZrSy8SRprbk= Date: Fri, 05 Nov 2021 13:36:49 -0700 From: Andrew Morton To: akpm@linux-foundation.org, dhowells@redhat.com, jlayton@kernel.org, kent.overstreet@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 041/262] mm: stop filemap_read() from grabbing a superfluous page Message-ID: <20211105203649.fsgWgNvjd%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=gY+GAuJA; dmarc=none; spf=pass (imf25.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: F2C92B000188 X-Stat-Signature: tadd4zgx8cnkwiaawxryhq3jas8ut9w6 X-HE-Tag: 1636144600-812850 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: David Howells Subject: mm: stop filemap_read() from grabbing a superfluous page Under some circumstances, 
filemap_read() will allocate sufficient pages to read to the end of the file, call readahead/readpages on them and copy the data over - and then it will allocate another page at the EOF and call readpage on that and then ignore it. This is unnecessary and a waste of time and resources. filemap_read() *does* check for this, but only after it has already done the allocation and I/O. Fix this by checking before calling filemap_get_pages() also. Link: https://lkml.kernel.org/r/163472463105.3126792.7056099385135786492.stgit@warthog.procyon.org.uk Link: https://lore.kernel.org/r/160588481358.3465195.16552616179674485179.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/163456863216.2614702.6384850026368833133.stgit@warthog.procyon.org.uk/ Signed-off-by: David Howells Acked-by: Jeff Layton Reviewed-by: Matthew Wilcox (Oracle) Cc: Kent Overstreet Signed-off-by: Andrew Morton --- mm/filemap.c | 3 +++ 1 file changed, 3 insertions(+) --- a/mm/filemap.c~mm-stop-filemap_read-from-grabbing-a-superfluous-page +++ a/mm/filemap.c @@ -2625,6 +2625,9 @@ ssize_t filemap_read(struct kiocb *iocb, if ((iocb->ki_flags & IOCB_WAITQ) && already_read) iocb->ki_flags |= IOCB_NOWAIT; + if (unlikely(iocb->ki_pos >= i_size_read(inode))) + break; + error = filemap_get_pages(iocb, iter, &pvec); if (error < 0) break; From patchwork Fri Nov 5 20:36:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605447 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1F561C433FE for ; Fri, 5 Nov 2021 20:36:55 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id E3FEF61252 for ; Fri, 5 Nov 2021 20:36:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 
mail.kernel.org E3FEF61252 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 80A9894000F; Fri, 5 Nov 2021 16:36:54 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 7977C940007; Fri, 5 Nov 2021 16:36:54 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5E71F94000F; Fri, 5 Nov 2021 16:36:54 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0202.hostedemail.com [216.40.44.202]) by kanga.kvack.org (Postfix) with ESMTP id 45794940007 for ; Fri, 5 Nov 2021 16:36:54 -0400 (EDT) Received: from smtpin15.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id EEFEB1856A8C9 for ; Fri, 5 Nov 2021 20:36:53 +0000 (UTC) X-FDA: 78776035506.15.12A4C42 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf23.hostedemail.com (Postfix) with ESMTP id E0FB890000A6 for ; Fri, 5 Nov 2021 20:36:40 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 9C37E611C0; Fri, 5 Nov 2021 20:36:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144612; bh=TYOkewgMcv8aVJYVpBRuZQ4keM8guxXKpPSzDyaFzRY=; h=Date:From:To:Subject:In-Reply-To:From; b=Q3sV0JYtwM3fSC0fuqAzP5m/OcF6RsVLq+eZWpyXtLVnD8itFdbIw6O2kWETUyHMg rpOUyQWl0nVf6PU/PSoyyAOtjMz0Zr6Ls6xwK5bnj6Y4o5DwtSSS3p/StulwiiGWlW n9p7uReLFdXAFM3kp8TgqvEy/ZxpFjGc6CeDEiWQ= Date: Fri, 05 Nov 2021 13:36:52 -0700 From: Andrew Morton To: akpm@linux-foundation.org, hch@lst.de, jack@suse.cz, linux-mm@kvack.org, miquel.raynal@bootlin.com, mm-commits@vger.kernel.org, richard@nod.at, torvalds@linux-foundation.org, vigneshr@ti.com Subject: [patch 042/262] mm: export bdi_unregister Message-ID: 
<20211105203652._au34liRg%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Q3sV0JYt; dmarc=none; spf=pass (imf23.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: E0FB890000A6 X-Stat-Signature: ryw1z4wwp6sh997zor1dspgfiwfkuic8 X-HE-Tag: 1636144600-202404 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Christoph Hellwig Subject: mm: export bdi_unregister Patch series "simplify bdi unregistration". This series simplifies the BDI code to get rid of the magic auto-unregister feature that hid a recent block layer refcounting bug. This patch (of 5): To wind down the magic auto-unregister semantics we'll need to push this into modular code.
Link: https://lkml.kernel.org/r/20211021124441.668816-1-hch@lst.de Link: https://lkml.kernel.org/r/20211021124441.668816-2-hch@lst.de Signed-off-by: Christoph Hellwig Reviewed-by: Jan Kara Cc: Miquel Raynal Cc: Richard Weinberger Cc: Vignesh Raghavendra Signed-off-by: Andrew Morton --- mm/backing-dev.c | 1 + 1 file changed, 1 insertion(+) --- a/mm/backing-dev.c~mm-export-bdi_unregister +++ a/mm/backing-dev.c @@ -958,6 +958,7 @@ void bdi_unregister(struct backing_dev_i bdi->owner = NULL; } } +EXPORT_SYMBOL(bdi_unregister); static void release_bdi(struct kref *ref) { From patchwork Fri Nov 5 20:36:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605449 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2DD8AC433FE for ; Fri, 5 Nov 2021 20:36:58 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id E8E71611EE for ; Fri, 5 Nov 2021 20:36:57 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org E8E71611EE Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 88FE3940010; Fri, 5 Nov 2021 16:36:57 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 809EC940007; Fri, 5 Nov 2021 16:36:57 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6AAA6940010; Fri, 5 Nov 2021 16:36:57 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0024.hostedemail.com [216.40.44.24]) by kanga.kvack.org (Postfix) with ESMTP id 5596F940007 for ; Fri, 5 
Nov 2021 16:36:57 -0400 (EDT) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 14C0D8249980 for ; Fri, 5 Nov 2021 20:36:57 +0000 (UTC) X-FDA: 78776035674.12.969092F Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf08.hostedemail.com (Postfix) with ESMTP id E3F4430000B4 for ; Fri, 5 Nov 2021 20:36:44 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id B52CA611C0; Fri, 5 Nov 2021 20:36:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144616; bh=eTuZqzSlv2o5SvGn4sWCrct+xzq3D9/H3R/OlwuiCRI=; h=Date:From:To:Subject:In-Reply-To:From; b=ew6W1QYPpN5JauFr8k2f8M/svicGXD69dr6aS/l+wSP2nHIWZht3pcIxgJezUdwZJ vVdT77DvAVoyvwZY2qqPY3Lc1D2X/opkkWCmfiq9Mw+dhZVVvtz5jqL2Pt8NeSgOXS X/0rTqgrJ8JoaT1kvbE1xK9wwp/LzQfdD3w74J3c= Date: Fri, 05 Nov 2021 13:36:55 -0700 From: Andrew Morton To: akpm@linux-foundation.org, hch@lst.de, jack@suse.cz, linux-mm@kvack.org, miquel.raynal@bootlin.com, mm-commits@vger.kernel.org, richard@nod.at, torvalds@linux-foundation.org, vigneshr@ti.com Subject: [patch 043/262] mtd: call bdi_unregister explicitly Message-ID: <20211105203655.7f7DgwdJ4%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: E3F4430000B4 X-Stat-Signature: ebc91w6mpakcoq9uyinupgt96i6b7k36 Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=ew6W1QYP; spf=pass (imf08.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144604-121663 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Christoph Hellwig Subject: mtd: call 
bdi_unregister explicitly Call bdi_unregister explicitly instead of relying on the automatic unregistration. Link: https://lkml.kernel.org/r/20211021124441.668816-3-hch@lst.de Signed-off-by: Christoph Hellwig Reviewed-by: Jan Kara Cc: Miquel Raynal Cc: Richard Weinberger Cc: Vignesh Raghavendra Signed-off-by: Andrew Morton --- drivers/mtd/mtdcore.c | 1 + 1 file changed, 1 insertion(+) --- a/drivers/mtd/mtdcore.c~mtd-call-bdi_unregister-explicitly +++ a/drivers/mtd/mtdcore.c @@ -2409,6 +2409,7 @@ static void __exit cleanup_mtd(void) if (proc_mtd) remove_proc_entry("mtd", NULL); class_unregister(&mtd_class); + bdi_unregister(mtd_bdi); bdi_put(mtd_bdi); idr_destroy(&mtd_idr); } From patchwork Fri Nov 5 20:36:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605451 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 29250C4332F for ; Fri, 5 Nov 2021 20:37:01 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D6CCC611C0 for ; Fri, 5 Nov 2021 20:37:00 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org D6CCC611C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 79848940029; Fri, 5 Nov 2021 16:37:00 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 71DEF940007; Fri, 5 Nov 2021 16:37:00 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5E5CC940029; Fri, 5 Nov 2021 16:37:00 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com 
(smtprelay0191.hostedemail.com [216.40.44.191]) by kanga.kvack.org (Postfix) with ESMTP id 50570940007 for ; Fri, 5 Nov 2021 16:37:00 -0400 (EDT) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 1F19C1856A8C8 for ; Fri, 5 Nov 2021 20:37:00 +0000 (UTC) X-FDA: 78776035800.21.FBDFD8A Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf14.hostedemail.com (Postfix) with ESMTP id 964C76001986 for ; Fri, 5 Nov 2021 20:37:00 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id D9EDF61242; Fri, 5 Nov 2021 20:36:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144619; bh=3V0DVgLklK+DGdkQfrAXN3hgTsfYdiu9kND4RGYZZwQ=; h=Date:From:To:Subject:In-Reply-To:From; b=anFzLvbsvWXpWiDh6UeSUzV+ekyMFYobQleZO/gzuLlyHl68H4W+wjDukFNeS9chf deU5KLVNAiPiVLH0kxULAZ7wgbM+spzlL7GlxJHRHTfAx7TgxSMimziNNMZ403yVuz 63HnCsqcaW+zSOJv24GNUmWFGiRISvsGT1VcZmMA= Date: Fri, 05 Nov 2021 13:36:58 -0700 From: Andrew Morton To: akpm@linux-foundation.org, hch@lst.de, jack@suse.cz, linux-mm@kvack.org, miquel.raynal@bootlin.com, mm-commits@vger.kernel.org, richard@nod.at, torvalds@linux-foundation.org, vigneshr@ti.com Subject: [patch 044/262] fs: explicitly unregister per-superblock BDIs Message-ID: <20211105203658.rCpMjLKAt%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 964C76001986 X-Stat-Signature: 79hj47gjnz8ojqa3n8udnernbuf5mcr1 Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=anFzLvbs; spf=pass (imf14.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144620-849439 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: 
owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Christoph Hellwig Subject: fs: explicitly unregister per-superblock BDIs Add a new SB_I_ flag to mark superblocks that have an ephemeral bdi associated with them, and unregister it when the superblock is shut down. Link: https://lkml.kernel.org/r/20211021124441.668816-4-hch@lst.de Signed-off-by: Christoph Hellwig Reviewed-by: Jan Kara Cc: Miquel Raynal Cc: Richard Weinberger Cc: Vignesh Raghavendra Signed-off-by: Andrew Morton --- fs/super.c | 3 +++ include/linux/fs.h | 1 + 2 files changed, 4 insertions(+) --- a/fs/super.c~fs-explicitly-unregister-per-superblock-bdis +++ a/fs/super.c @@ -476,6 +476,8 @@ void generic_shutdown_super(struct super spin_unlock(&sb_lock); up_write(&sb->s_umount); if (sb->s_bdi != &noop_backing_dev_info) { + if (sb->s_iflags & SB_I_PERSB_BDI) + bdi_unregister(sb->s_bdi); bdi_put(sb->s_bdi); sb->s_bdi = &noop_backing_dev_info; } @@ -1562,6 +1564,7 @@ int super_setup_bdi_name(struct super_bl } WARN_ON(sb->s_bdi != &noop_backing_dev_info); sb->s_bdi = bdi; + sb->s_iflags |= SB_I_PERSB_BDI; return 0; } --- a/include/linux/fs.h~fs-explicitly-unregister-per-superblock-bdis +++ a/include/linux/fs.h @@ -1443,6 +1443,7 @@ extern int send_sigurg(struct fown_struc #define SB_I_UNTRUSTED_MOUNTER 0x00000040 #define SB_I_SKIP_SYNC 0x00000100 /* Skip superblock at global sync */ +#define SB_I_PERSB_BDI 0x00000200 /* has a per-sb bdi */ /* Possible states of 'frozen' field */ enum { From patchwork Fri Nov 5 20:37:01 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605453 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3E7A6C433EF for ; Fri, 5 Nov 2021 20:37:04 +0000 (UTC) Received: 
from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 08F92611EE for ; Fri, 5 Nov 2021 20:37:04 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 08F92611EE Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 9DF6594002A; Fri, 5 Nov 2021 16:37:03 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9690D940007; Fri, 5 Nov 2021 16:37:03 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7E2C094002A; Fri, 5 Nov 2021 16:37:03 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0171.hostedemail.com [216.40.44.171]) by kanga.kvack.org (Postfix) with ESMTP id 6DEFC940007 for ; Fri, 5 Nov 2021 16:37:03 -0400 (EDT) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 38E30777F4 for ; Fri, 5 Nov 2021 20:37:03 +0000 (UTC) X-FDA: 78776036010.05.D757863 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf17.hostedemail.com (Postfix) with ESMTP id DFECFF0003A0 for ; Fri, 5 Nov 2021 20:37:02 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id DCEA461242; Fri, 5 Nov 2021 20:37:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144622; bh=whFe3f09wOkOblZuiOhkmdNFJjqy8eyUdcc/AyCZSxo=; h=Date:From:To:Subject:In-Reply-To:From; b=W7KgDrzY4E4V56cknMYQW+oYGS+Ui83/U8ArReRajRz5R2dy3hKbp45LdXnEBPB6x aJe4ZZXZJWb2D3gAbRoHSZIQeM4oLT5fxF3O4voxnG1pz1QG9kSHYwa9LOZHk4zqex zdn5ssdCo4DQE8E2k7dZSUZuDpJRUz1ot6mzqg1E= Date: Fri, 05 Nov 2021 13:37:01 -0700 From: Andrew Morton To: akpm@linux-foundation.org, hch@lst.de, jack@suse.cz, linux-mm@kvack.org, miquel.raynal@bootlin.com, 
mm-commits@vger.kernel.org, richard@nod.at, torvalds@linux-foundation.org, vigneshr@ti.com Subject: [patch 045/262] mm: don't automatically unregister bdis Message-ID: <20211105203701.M3N9WiZBX%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: DFECFF0003A0 X-Stat-Signature: bh8sjxgoirm3gnzb674813g66t8hezei Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=W7KgDrzY; spf=pass (imf17.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144622-975026 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Christoph Hellwig Subject: mm: don't automatically unregister bdis All BDI users now unregister explicitly. 
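For illustration only, the resulting contract can be modeled in plain userspace C. This is a sketch, not kernel code: struct toy_bdi, toy_bdi_register(), toy_bdi_unregister(), toy_bdi_put() and the toy_leaked_registrations counter are hypothetical stand-ins, and the counter plays the role that a one-time warning would play in the kernel. What it tries to show is that dropping the last reference no longer unregisters on the caller's behalf; a caller that forgets to unregister is flagged instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace model; these names are not kernel APIs. */
struct toy_bdi {
	int refcnt;       /* number of outstanding references */
	bool registered;  /* true between register and unregister */
};

/* Counts objects released while still registered (stand-in for a warning). */
static int toy_leaked_registrations;

/* Allocates an unregistered object holding one reference; NULL on failure. */
static struct toy_bdi *toy_bdi_alloc(void)
{
	struct toy_bdi *bdi = calloc(1, sizeof(*bdi));

	if (bdi)
		bdi->refcnt = 1;
	return bdi;
}

static void toy_bdi_register(struct toy_bdi *bdi)
{
	bdi->registered = true;
}

static void toy_bdi_unregister(struct toy_bdi *bdi)
{
	bdi->registered = false;
}

/* Drops one reference; returns true if this call freed the object. */
static bool toy_bdi_put(struct toy_bdi *bdi)
{
	assert(bdi->refcnt > 0);
	if (--bdi->refcnt > 0)
		return false;

	/* Release path: no implicit unregister, only a complaint. */
	if (bdi->registered)
		toy_leaked_registrations++;
	free(bdi);
	return true;
}
```

A caller following the new rule would call toy_bdi_unregister() before its final toy_bdi_put(), which mirrors the bdi_unregister()/bdi_put() ordering that the mtd and per-superblock patches earlier in this series add.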
Link: https://lkml.kernel.org/r/20211021124441.668816-5-hch@lst.de
Signed-off-by: Christoph Hellwig
Reviewed-by: Jan Kara
Cc: Miquel Raynal
Cc: Richard Weinberger
Cc: Vignesh Raghavendra
Signed-off-by: Andrew Morton
---

 mm/backing-dev.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/mm/backing-dev.c~mm-dont-automatically-unregister-bdis
+++ a/mm/backing-dev.c
@@ -965,8 +965,7 @@ static void release_bdi(struct kref *ref
 	struct backing_dev_info *bdi =
 		container_of(ref, struct backing_dev_info, refcnt);
 
-	if (test_bit(WB_registered, &bdi->wb.state))
-		bdi_unregister(bdi);
+	WARN_ON_ONCE(test_bit(WB_registered, &bdi->wb.state));
 	WARN_ON_ONCE(bdi->dev);
 	wb_exit(&bdi->wb);
 	kfree(bdi);

From patchwork Fri Nov 5 20:37:04 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605455
Date: Fri, 05 Nov 2021 13:37:04 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, hch@lst.de, jack@suse.cz, linux-mm@kvack.org, miquel.raynal@bootlin.com, mm-commits@vger.kernel.org, richard@nod.at, torvalds@linux-foundation.org, vigneshr@ti.com
Subject: [patch 046/262] mm: simplify bdi refcounting
Message-ID: <20211105203704.IPHQ9qf-p%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Christoph Hellwig
Subject: mm: simplify bdi refcounting

Move grabbing and releasing the bdi refcount out of the common
wb_init/wb_exit helpers into code that is only used for the non-default
memcg driven bdi_writeback structures.

[hch@lst.de: add comment]
  Link: https://lkml.kernel.org/r/20211027074207.GA12793@lst.de
[akpm@linux-foundation.org: fix typo]
Link: https://lkml.kernel.org/r/20211021124441.668816-6-hch@lst.de
Signed-off-by: Christoph Hellwig
Reviewed-by: Jan Kara
Cc: Miquel Raynal
Cc: Richard Weinberger
Cc: Vignesh Raghavendra
Signed-off-by: Andrew Morton
---

 include/linux/backing-dev-defs.h |    3 +++
 mm/backing-dev.c                 |   13 +++++--------
 2 files changed, 8 insertions(+), 8 deletions(-)

--- a/include/linux/backing-dev-defs.h~mm-simplify-bdi-refcounting
+++ a/include/linux/backing-dev-defs.h
@@ -103,6 +103,9 @@ struct wb_completion {
  * change as blkcg is disabled and enabled higher up in the hierarchy, a wb
  * is tested for blkcg after lookup and removed from index on mismatch so
  * that a new wb for the combination can be created.
+ *
+ * Each bdi_writeback that is not embedded into the backing_dev_info must hold
+ * a reference to the parent backing_dev_info. See cgwb_create() for details.
  */
 struct bdi_writeback {
 	struct backing_dev_info *bdi;	/* our parent bdi */
--- a/mm/backing-dev.c~mm-simplify-bdi-refcounting
+++ a/mm/backing-dev.c
@@ -291,8 +291,6 @@ static int wb_init(struct bdi_writeback
 
 	memset(wb, 0, sizeof(*wb));
 
-	if (wb != &bdi->wb)
-		bdi_get(bdi);
 	wb->bdi = bdi;
 	wb->last_old_flush = jiffies;
 	INIT_LIST_HEAD(&wb->b_dirty);
@@ -316,7 +314,7 @@ static int wb_init(struct bdi_writeback
 
 	err = fprop_local_init_percpu(&wb->completions, gfp);
 	if (err)
-		goto out_put_bdi;
+		return err;
 
 	for (i = 0; i < NR_WB_STAT_ITEMS; i++) {
 		err = percpu_counter_init(&wb->stat[i], 0, gfp);
@@ -330,9 +328,6 @@ out_destroy_stat:
 	while (i--)
 		percpu_counter_destroy(&wb->stat[i]);
 	fprop_local_destroy_percpu(&wb->completions);
-out_put_bdi:
-	if (wb != &bdi->wb)
-		bdi_put(bdi);
 	return err;
 }
 
@@ -373,8 +368,6 @@ static void wb_exit(struct bdi_writeback
 		percpu_counter_destroy(&wb->stat[i]);
 
 	fprop_local_destroy_percpu(&wb->completions);
-	if (wb != &wb->bdi->wb)
-		bdi_put(wb->bdi);
 }
 
 #ifdef CONFIG_CGROUP_WRITEBACK
@@ -397,6 +390,7 @@ static void cgwb_release_workfn(struct w
 	struct bdi_writeback *wb = container_of(work, struct bdi_writeback,
 						release_work);
 	struct blkcg *blkcg = css_to_blkcg(wb->blkcg_css);
+	struct backing_dev_info *bdi = wb->bdi;
 
 	mutex_lock(&wb->bdi->cgwb_release_mutex);
 	wb_shutdown(wb);
@@ -416,6 +410,7 @@ static void cgwb_release_workfn(struct w
 
 	percpu_ref_exit(&wb->refcnt);
 	wb_exit(wb);
+	bdi_put(bdi);
 	WARN_ON_ONCE(!list_empty(&wb->b_attached));
 	kfree_rcu(wb, rcu);
 }
@@ -497,6 +492,7 @@ static int cgwb_create(struct backing_de
 	INIT_LIST_HEAD(&wb->b_attached);
 	INIT_WORK(&wb->release_work, cgwb_release_workfn);
 	set_bit(WB_registered, &wb->state);
+	bdi_get(bdi);
 
 	/*
 	 * The root wb determines the registered state of the whole bdi and
@@ -528,6 +524,7 @@ static int cgwb_create(struct backing_de
 	goto out_put;
 
 err_fprop_exit:
+	bdi_put(bdi);
 	fprop_local_destroy_percpu(&wb->memcg_completions);
 err_ref_exit:
 	percpu_ref_exit(&wb->refcnt);

From patchwork Fri Nov 5 20:37:07 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605457
Date: Fri, 05 Nov 2021 13:37:07 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, asml.silence@gmail.com, axboe@kernel.dk, clm@fb.com, david@fromorbit.com, jack@suse.cz, josef@redhat.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org
Subject: [patch 047/262] mm: don't read i_size of inode unless we need it
Message-ID: <20211105203707.MxPJku2CF%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Jens Axboe
Subject: mm: don't read i_size of inode unless we need it

We always go through i_size_read(), and we rarely end up needing it.
Push the read to down where we need to check it, which avoids it for
most cases.

It looks like we can even remove this check entirely, which might be
worth pursuing. But at least this takes it out of the hot path.
Link: https://lkml.kernel.org/r/6b67981f-57d4-c80e-bc07-6020aa601381@kernel.dk
Signed-off-by: Jens Axboe
Acked-by: Chris Mason
Cc: Josef Bacik
Cc: Dave Chinner
Cc: Pavel Begunkov
Cc: Jan Kara
Signed-off-by: Andrew Morton
---

 mm/filemap.c |    7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

--- a/mm/filemap.c~mm-dont-read-i_size-of-inode-unless-we-need-it
+++ a/mm/filemap.c
@@ -2740,9 +2740,7 @@ generic_file_read_iter(struct kiocb *ioc
 		struct file *file = iocb->ki_filp;
 		struct address_space *mapping = file->f_mapping;
 		struct inode *inode = mapping->host;
-		loff_t size;
 
-		size = i_size_read(inode);
 		if (iocb->ki_flags & IOCB_NOWAIT) {
 			if (filemap_range_needs_writeback(mapping, iocb->ki_pos,
 						iocb->ki_pos + count - 1))
@@ -2774,8 +2772,9 @@ generic_file_read_iter(struct kiocb *ioc
 		 * the rest of the read. Buffered reads will not work for
 		 * DAX files, so don't bother trying.
 		 */
-		if (retval < 0 || !count || iocb->ki_pos >= size ||
-		    IS_DAX(inode))
+		if (retval < 0 || !count || IS_DAX(inode))
+			return retval;
+		if (iocb->ki_pos >= i_size_read(inode))
 			return retval;
 	}
 

From patchwork Fri Nov 5 20:37:10 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605459
Date: Fri, 05 Nov 2021 13:37:10 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, hughd@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, stable@vger.kernel.org, syzbot+c87be4f669d920c76330@syzkaller.appspotmail.com, torvalds@linux-foundation.org, willy@infradead.org
Subject: [patch 048/262] mm/filemap.c: remove bogus VM_BUG_ON
Message-ID: <20211105203710.7ygSi-fQI%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: "Matthew Wilcox (Oracle)"
Subject: mm/filemap.c: remove bogus VM_BUG_ON

It is not safe to check page->index without holding the page lock. It
can be changed if the page is moved between the swap cache and the page
cache for a shmem file, for example. There is a VM_BUG_ON below which
checks page->index is correct after taking the page lock.

Link: https://lkml.kernel.org/r/20210818144932.940640-1-willy@infradead.org
Fixes: 5c211ba29deb ("mm: add and use find_lock_entries")
Signed-off-by: Matthew Wilcox (Oracle)
Reported-by:
Cc: Hugh Dickins
Cc:
Signed-off-by: Andrew Morton
---

 mm/filemap.c |    1 -
 1 file changed, 1 deletion(-)

--- a/mm/filemap.c~mm-remove-bogus-vm_bug_on
+++ a/mm/filemap.c
@@ -2093,7 +2093,6 @@ unsigned find_lock_entries(struct addres
 		if (!xa_is_value(page)) {
 			if (page->index < start)
 				goto put;
-			VM_BUG_ON_PAGE(page->index != xas.xa_index, page);
 			if (page->index + thp_nr_pages(page) - 1 > end)
 				goto put;
 			if (!trylock_page(page))

From patchwork Fri Nov 5 20:37:13 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605461
Date: Fri, 05 Nov 2021 13:37:13 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, axboe@kernel.dk, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, willy@infradead.org
Subject: [patch 049/262] mm: move more expensive part of XA setup out of mapping check
Message-ID: <20211105203713.0Ahhj4i8_%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Jens Axboe
Subject: mm: move more expensive part of XA setup out of mapping check

The fast path here is not needing any writeback, yet we spend time
setting up the xarray lookup data upfront. Move the part that actually
needs to iterate the address space mapping into a separate helper,
saving ~30% of the time here.
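
The shape of this change can be sketched in plain C outside the kernel. The names below (`struct cache`, `cache_range_busy()`, `cache_needs_flush()`) are hypothetical and are not part of the patch; they only mirror its structure under the assumption that a cheap summary counter can rule out most calls. The cheap checks stay in the caller, and the state needed only for the scan lives in a helper that the fast path never enters.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an address space: one dirty flag per page. */
struct cache {
	size_t nr_pages;
	size_t nr_dirty;	/* cheap summary, playing the role of the mapping tags */
	const bool *dirty;	/* per-page state, only touched on the slow path */
};

/* Slow path: iterate only when the cheap checks could not rule it out. */
static bool cache_range_busy(const struct cache *c, size_t first, size_t last)
{
	if (last < first)
		return false;
	for (size_t i = first; i <= last && i < c->nr_pages; i++)
		if (c->dirty[i])
			return true;
	return false;
}

/* Fast path: returns before any scan state is set up when nothing is dirty. */
static bool cache_needs_flush(const struct cache *c, size_t first, size_t last)
{
	if (!c->nr_pages || !c->nr_dirty)
		return false;
	return cache_range_busy(c, first, last);
}
```

This sketch says nothing about the ~30% figure quoted above, which comes from the original measurement of the kernel function.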
Link: https://lkml.kernel.org/r/49f67983-b802-8929-edab-d807f745c9ca@kernel.dk
Signed-off-by: Jens Axboe
Cc: Matthew Wilcox
Signed-off-by: Andrew Morton
---

 mm/filemap.c |   43 +++++++++++++++++++++++++------------------
 1 file changed, 25 insertions(+), 18 deletions(-)

--- a/mm/filemap.c~mm-move-more-expensive-part-of-xa-setup-out-of-mapping-check
+++ a/mm/filemap.c
@@ -639,6 +639,30 @@ static bool mapping_needs_writeback(stru
 	return mapping->nrpages;
 }
 
+static bool filemap_range_has_writeback(struct address_space *mapping,
+					loff_t start_byte, loff_t end_byte)
+{
+	XA_STATE(xas, &mapping->i_pages, start_byte >> PAGE_SHIFT);
+	pgoff_t max = end_byte >> PAGE_SHIFT;
+	struct page *page;
+
+	if (end_byte < start_byte)
+		return false;
+
+	rcu_read_lock();
+	xas_for_each(&xas, page, max) {
+		if (xas_retry(&xas, page))
+			continue;
+		if (xa_is_value(page))
+			continue;
+		if (PageDirty(page) || PageLocked(page) || PageWriteback(page))
+			break;
+	}
+	rcu_read_unlock();
+	return page != NULL;
+
+}
+
 /**
  * filemap_range_needs_writeback - check if range potentially needs writeback
  * @mapping: address space within which to check
@@ -656,29 +680,12 @@ static bool mapping_needs_writeback(stru
 bool filemap_range_needs_writeback(struct address_space *mapping,
 				   loff_t start_byte, loff_t end_byte)
 {
-	XA_STATE(xas, &mapping->i_pages, start_byte >> PAGE_SHIFT);
-	pgoff_t max = end_byte >> PAGE_SHIFT;
-	struct page *page;
-
 	if (!mapping_needs_writeback(mapping))
 		return false;
 	if (!mapping_tagged(mapping, PAGECACHE_TAG_DIRTY) &&
 	    !mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
 		return false;
-	if (end_byte < start_byte)
-		return false;
-
-	rcu_read_lock();
-	xas_for_each(&xas, page, max) {
-		if (xas_retry(&xas, page))
-			continue;
-		if (xa_is_value(page))
-			continue;
-		if (PageDirty(page) || PageLocked(page) || PageWriteback(page))
-			break;
-	}
-	rcu_read_unlock();
-	return page != NULL;
+	return filemap_range_has_writeback(mapping, start_byte, end_byte);
 }
 EXPORT_SYMBOL_GPL(filemap_range_needs_writeback);
 

From patchwork Fri Nov 5 20:37:16 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605463
Date: Fri, 05 Nov 2021 13:37:16 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, imbrenda@linux.ibm.com, jack@suse.cz, jhubbard@nvidia.com, kirill.shutemov@linux.intel.com, linmiaohe@huawei.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org
Subject: [patch 050/262] mm/gup: further simplify __gup_device_huge()
Message-ID: <20211105203716.qKGfFlN1h%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: John Hubbard
Subject: mm/gup: further simplify __gup_device_huge()

commit 6401c4eb57f9 ("mm: gup: fix potential pgmap refcnt leak in
__gup_device_huge()") simplified the return paths, but didn't go quite
far enough, as discussed in [1].

Remove the "ret" variable entirely, because there is enough information
already available to provide the return value.
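
The reasoning can be illustrated outside the kernel. In this minimal sketch, `walk_range()`, `STEP` and the `fail_at` parameter are hypothetical stand-ins that do not appear in the patch. The loop can only be left with `addr == end` when no iteration took the early `break`, so the final comparison carries the same information as the removed flag.

```c
#include <stdbool.h>

/* Hypothetical step size standing in for PAGE_SIZE. */
#define STEP 4096UL

/*
 * Walk [addr, end) in STEP increments. fail_at simulates an address at
 * which grabbing a page fails. Returns true only if the whole range was
 * walked. Assumes addr < end and that both are multiples of STEP.
 */
static bool walk_range(unsigned long addr, unsigned long end,
		       unsigned long fail_at)
{
	do {
		if (addr == fail_at)
			break;	/* early exit leaves addr != end */
	} while (addr += STEP, addr != end);

	return addr == end;
}
```

The early exit never advances `addr`, so a failure on the last page of the range is still reported as a failure.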
[1] https://lore.kernel.org/r/CAHk-=wgQTRX=5SkCmS+zfmpqubGHGJvXX_HgnPG8JSpHKHBMeg@mail.gmail.com

Link: https://lkml.kernel.org/r/20210904004224.86391-1-jhubbard@nvidia.com
Signed-off-by: John Hubbard
Suggested-by: Linus Torvalds
Reviewed-by: Jan Kara
Cc: Miaohe Lin
Cc: Claudio Imbrenda
Cc: Kirill A. Shutemov
Signed-off-by: Andrew Morton
---

 mm/gup.c |    5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

--- a/mm/gup.c~mm-gup-further-simplify-__gup_device_huge
+++ a/mm/gup.c
@@ -2228,7 +2228,6 @@ static int __gup_device_huge(unsigned lo
 {
 	int nr_start = *nr;
 	struct dev_pagemap *pgmap = NULL;
-	int ret = 1;
 
 	do {
 		struct page *page = pfn_to_page(pfn);
@@ -2236,14 +2235,12 @@ static int __gup_device_huge(unsigned lo
 		pgmap = get_dev_pagemap(pfn, pgmap);
 		if (unlikely(!pgmap)) {
 			undo_dev_pagemap(nr, nr_start, flags, pages);
-			ret = 0;
 			break;
 		}
 		SetPageReferenced(page);
 		pages[*nr] = page;
 		if (unlikely(!try_grab_page(page, flags))) {
 			undo_dev_pagemap(nr, nr_start, flags, pages);
-			ret = 0;
 			break;
 		}
 		(*nr)++;
@@ -2251,7 +2248,7 @@ static int __gup_device_huge(unsigned lo
 	} while (addr += PAGE_SIZE, addr != end);
 
 	put_dev_pagemap(pgmap);
-	return ret;
+	return addr == end;
 }
 
 static int __gup_device_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,

From patchwork Fri Nov 5 20:37:19 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605465
Date: Fri, 05 Nov 2021 13:37:19 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, david@redhat.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, vulab@iscas.ac.cn
Subject: [patch 051/262] mm/swapfile: remove needless request_queue NULL pointer check
Message-ID: <20211105203719.-ssTsYg32%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Xu Wang
Subject: mm/swapfile: remove needless request_queue NULL pointer check

The request_queue pointer returned from bdev_get_queue() shall never be
NULL, so the null check is unnecessary, just remove it.

Link: https://lkml.kernel.org/r/20210917082111.33923-1-vulab@iscas.ac.cn
Signed-off-by: Xu Wang
Acked-by: David Hildenbrand
Signed-off-by: Andrew Morton
---

 mm/swapfile.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/swapfile.c~mm-swapfile-remove-needless-request_queue-null-pointer-check
+++ a/mm/swapfile.c
@@ -3118,7 +3118,7 @@ static bool swap_discardable(struct swap
 {
 	struct request_queue *q = bdev_get_queue(si->bdev);
 
-	if (!q || !blk_queue_discard(q))
+	if (!blk_queue_discard(q))
 		return false;
 
 	return true;

From patchwork Fri Nov 5 20:37:22 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605467
Date: Fri, 05 Nov 2021 13:37:22 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, aquini@redhat.com, hughd@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org
Subject: [patch 052/262] mm/swapfile: fix an integer overflow in swap_show()
Message-ID: <20211105203722.TlSr_OQ9B%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Rafael Aquini
Subject: mm/swapfile: fix an integer overflow in swap_show()

This one is just a minor nuisance for people going through /proc/swaps
if any of their swapareas is bigger than, or equal to 1073741824 pages
(4TB).

seq_printf() format string casts as uint the conversion from pages to
KB, and that will overflow in the aforementioned case.

Albeit being almost unthinkable that someone would actually set up such
big of a single swaparea, there is a ticket recently filed against RHEL:
https://bugzilla.redhat.com/show_bug.cgi?id=2008812

Given that all other codesites that use format strings for the same swap
pages-to-KB conversion do cast it as ulong, this patch just follows
suit.
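
The arithmetic behind the overflow can be reproduced in user space. This is a minimal sketch, assuming 4 KiB pages (a `PAGE_SHIFT` of 12) and using fixed-width types as stand-ins for the kernel's `unsigned int` and for `unsigned long` on a 64-bit build. The helper names are hypothetical and not part of the patch.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, so pages-to-KB is a left shift by 2. */
#define PAGE_SHIFT 12

/* Old behaviour: the pages-to-KB result is stored in a 32-bit variable. */
static uint32_t pages_to_kb_u32(uint64_t pages)
{
	return (uint32_t)(pages << (PAGE_SHIFT - 10));
}

/* New behaviour: a 64-bit result, as with unsigned long on 64-bit Linux. */
static uint64_t pages_to_kb_u64(uint64_t pages)
{
	return pages << (PAGE_SHIFT - 10);
}
```

With 1073741824 pages the 32-bit helper wraps to 0, while the 64-bit helper returns 4294967296 KB, which is the 4TB threshold named in the description. One page fewer still fits: 1073741823 pages give 4294967292 KB in both helpers.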
Link: https://lkml.kernel.org/r/20211006184011.2579054-1-aquini@redhat.com
Signed-off-by: Rafael Aquini
Cc: Hugh Dickins
Signed-off-by: Andrew Morton
---

 mm/swapfile.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/swapfile.c~mm-swapfile-fix-an-integer-overflow-in-swap_show
+++ a/mm/swapfile.c
@@ -2763,7 +2763,7 @@ static int swap_show(struct seq_file *sw
 	struct swap_info_struct *si = v;
 	struct file *file;
 	int len;
-	unsigned int bytes, inuse;
+	unsigned long bytes, inuse;
 
 	if (si == SEQ_START_TOKEN) {
 		seq_puts(swap, "Filename\t\t\t\tType\t\tSize\t\tUsed\t\tPriority\n");
@@ -2775,7 +2775,7 @@ static int swap_show(struct seq_file *sw
 
 	file = si->swap_file;
 	len = seq_file_path(swap, file, " \t\n\\");
-	seq_printf(swap, "%*s%s\t%u\t%s%u\t%s%d\n",
+	seq_printf(swap, "%*s%s\t%lu\t%s%lu\t%s%d\n",
 			len < 40 ? 40 - len : 1, " ",
 			S_ISBLK(file_inode(file)->i_mode) ?
 				"partition" : "file\t",

From patchwork Fri Nov 5 20:37:25 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605469
Date: Fri, 05 Nov 2021 13:37:25 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, anthony.yznaga@oracle.com, linux-mm@kvack.org, mgorman@techsingularity.net, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, willy@infradead.org
Subject: [patch 053/262] mm: optimise put_pages_list()
Message-ID: <20211105203725.zKys-1BiC%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>
Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=kBetjkGa; spf=pass (imf01.hostedemail.com: domain of akpm@linux-foundation.org 
designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144634-303313 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: "Matthew Wilcox (Oracle)" Subject: mm: optimise put_pages_list() Instead of calling put_page() one page at a time, drop a reference on each page, pop off the list any page whose refcount did not reach zero, and pass the remainder to free_unref_page_list(). This should be a speed improvement, but I have no measurements to support that. Current callers do not care about performance, but I hope to add some which do. Link: https://lkml.kernel.org/r/20211007192138.561673-1-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Anthony Yznaga Cc: Mel Gorman Signed-off-by: Andrew Morton --- mm/swap.c | 23 ++++++++++++++++------- 1 file changed, 16 insertions(+), 7 deletions(-) --- a/mm/swap.c~mm-optimise-put_pages_list +++ a/mm/swap.c @@ -134,18 +134,27 @@ EXPORT_SYMBOL(__put_page); * put_pages_list() - release a list of pages * @pages: list of pages threaded on page->lru * - * Release a list of pages which are strung together on page.lru. Currently - * used by read_cache_pages() and related error recovery code. + * Release a list of pages which are strung together on page.lru. 
*/ void put_pages_list(struct list_head *pages) { - while (!list_empty(pages)) { - struct page *victim; + struct page *page, *next; - victim = lru_to_page(pages); - list_del(&victim->lru); - put_page(victim); + list_for_each_entry_safe(page, next, pages, lru) { + if (!put_page_testzero(page)) { + list_del(&page->lru); + continue; + } + if (PageHead(page)) { + list_del(&page->lru); + __put_compound_page(page); + continue; + } + /* Cannot be PageLRU because it's passed to us using the lru */ + __ClearPageWaiters(page); } + + free_unref_page_list(pages); } EXPORT_SYMBOL(put_pages_list); From patchwork Fri Nov 5 20:37:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605471 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 977D8C433EF for ; Fri, 5 Nov 2021 20:37:31 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4BD8160E05 for ; Fri, 5 Nov 2021 20:37:31 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 4BD8160E05 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id D8A6F940032; Fri, 5 Nov 2021 16:37:30 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D10AB940007; Fri, 5 Nov 2021 16:37:30 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BB1B7940032; Fri, 5 Nov 2021 16:37:30 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0085.hostedemail.com [216.40.44.85]) by kanga.kvack.org (Postfix) with ESMTP id 
A5927940007 for ; Fri, 5 Nov 2021 16:37:30 -0400 (EDT) Received: from smtpin20.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 7592527776 for ; Fri, 5 Nov 2021 20:37:30 +0000 (UTC) X-FDA: 78776037060.20.8AE4418 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf27.hostedemail.com (Postfix) with ESMTP id 2838A70000AB for ; Fri, 5 Nov 2021 20:37:30 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 2628A60720; Fri, 5 Nov 2021 20:37:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144649; bh=4eOvPhGrgkxeO9IzUCUsx6OFT8uD8g534bxI7hddIGQ=; h=Date:From:To:Subject:In-Reply-To:From; b=ajt+6bLrxFljrO5kKf1ZOoa1ZZ6Q2ZGWAtyKifFzuZBQnpYCdGmYhjryX6nByczWQ XGw44A1hEyVSonvl5JJPc8lD1K9ipgO+8ORQv326djPnBnAPAK3BI1iJVaSeu9zcsX 66fmM9/bDaAfo1rkQ4MBKwV/QV1t+S0in62UTRJE= Date: Fri, 05 Nov 2021 13:37:28 -0700 From: Andrew Morton To: akpm@linux-foundation.org, david@redhat.com, hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@kernel.org, mm-commits@vger.kernel.org, peterx@redhat.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 054/262] mm/memcg: drop swp_entry_t* in mc_handle_file_pte() Message-ID: <20211105203728.UuCSJCA-O%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 2838A70000AB X-Stat-Signature: ibk5py1dgbb9q8o3dcza88gxot1z8jsk Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=ajt+6bLr; dmarc=none; spf=pass (imf27.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144650-933631 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: 
owner-majordomo@kvack.org List-ID: From: Peter Xu Subject: mm/memcg: drop swp_entry_t* in mc_handle_file_pte() After the rework of f5df8635c5a3 ("mm: use find_get_incore_page in memcontrol", 2020-10-13) it's unused. Link: https://lkml.kernel.org/r/20210916193014.80129-1-peterx@redhat.com Signed-off-by: Peter Xu Reviewed-by: Muchun Song Reviewed-by: David Hildenbrand Cc: Johannes Weiner Cc: Michal Hocko Cc: Matthew Wilcox Signed-off-by: Andrew Morton --- mm/memcontrol.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/mm/memcontrol.c~mm-memcg-drop-swp_entry_t-in-mc_handle_file_pte +++ a/mm/memcontrol.c @@ -5545,7 +5545,7 @@ static struct page *mc_handle_swap_pte(s #endif static struct page *mc_handle_file_pte(struct vm_area_struct *vma, - unsigned long addr, pte_t ptent, swp_entry_t *entry) + unsigned long addr, pte_t ptent) { if (!vma->vm_file) /* anonymous vma */ return NULL; @@ -5718,7 +5718,7 @@ static enum mc_target_type get_mctgt_typ else if (is_swap_pte(ptent)) page = mc_handle_swap_pte(vma, ptent, &ent); else if (pte_none(ptent)) - page = mc_handle_file_pte(vma, addr, ptent, &ent); + page = mc_handle_file_pte(vma, addr, ptent); if (!page && !ent.val) return ret; From patchwork Fri Nov 5 20:37:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605473 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BED80C433EF for ; Fri, 5 Nov 2021 20:37:34 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 734E461242 for ; Fri, 5 Nov 2021 20:37:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 734E461242 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) 
header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 107A2940033; Fri, 5 Nov 2021 16:37:34 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 09080940034; Fri, 5 Nov 2021 16:37:33 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DFC5D940033; Fri, 5 Nov 2021 16:37:33 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0116.hostedemail.com [216.40.44.116]) by kanga.kvack.org (Postfix) with ESMTP id C951A940007 for ; Fri, 5 Nov 2021 16:37:33 -0400 (EDT) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 8B8561856A8FE for ; Fri, 5 Nov 2021 20:37:33 +0000 (UTC) X-FDA: 78776037312.05.04F7BD1 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf28.hostedemail.com (Postfix) with ESMTP id 27A7A90000AD for ; Fri, 5 Nov 2021 20:37:33 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 1FCC66056B; Fri, 5 Nov 2021 20:37:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144652; bh=sbSbvXulVG76Pdn5K0qar18qXE+PpMEkpUnJ5PUvKEs=; h=Date:From:To:Subject:In-Reply-To:From; b=UyOKve9TzyKbcNUVOz5tUwWrg5OAtOfUrq1WEHZ/xMbonwK95eQQuHgQHjNfRMnH5 epM/UrSuKdW5NOei3ia6/vqy5/LpXQYun47zGnRndJ+qHVnkPfeLeRjxiAy+jiVStL mU3mOTBJmoVfaoE7hlIC0JDxoYOI+1hJpJ92C8a0= Date: Fri, 05 Nov 2021 13:37:31 -0700 From: Andrew Morton To: akpm@linux-foundation.org, hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@kernel.org, mkoutny@suse.com, mm-commits@vger.kernel.org, shakeelb@google.com, torvalds@linux-foundation.org Subject: [patch 055/262] memcg: flush stats only if updated Message-ID: <20211105203731.uHUWGR8SE%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: 
s-nail v14.8.16 MIME-Version: 1.0 Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=UyOKve9T; dmarc=none; spf=pass (imf28.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 27A7A90000AD X-Stat-Signature: diqjbziepdggif5bsfkqnoya4s17brr6 X-HE-Tag: 1636144653-437636 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Shakeel Butt Subject: memcg: flush stats only if updated At the moment, the kernel flushes the memcg stats on every refault and also on every reclaim iteration. Although rstat maintains per-cpu update trees, on a flush the kernel still has to walk the rstat update tree of every cpu to check whether there is anything to flush. This patch adds tracking on the stats update side so that the flush side can skip the flush if there are no updates. The stats update codepath is very performance sensitive for many workloads and benchmarks. So, we cannot follow what commit aa48e47e3906 ("memcg: infrastructure to flush memcg stats") did, which was to trigger an async flush through queue_work() and which caused a lot of performance regression reports. That was reverted by commit 1f828223b799 ("memcg: flush lruvec stats in the refault"). In this patch we keep the stats update codepath minimal and let the stats reader side flush the stats only when the updates exceed a specific threshold. For now the threshold is (nr_cpus * MEMCG_CHARGE_BATCH). To evaluate the impact of this patch, an 8 GiB tmpfs file was created on a system with swap-on-zram and the file was pushed to swap through the memory.force_empty interface. On reading the whole file, the memcg stat flush in the refault code path is triggered. 
With this patch, we observed 63% reduction in the read time of 8 GiB file. Link: https://lkml.kernel.org/r/20211001190040.48086-1-shakeelb@google.com Signed-off-by: Shakeel Butt Acked-by: Johannes Weiner Cc: Michal Hocko Reviewed-by: "Michal Koutný" Signed-off-by: Andrew Morton --- mm/memcontrol.c | 78 ++++++++++++++++++++++++++++++++-------------- 1 file changed, 55 insertions(+), 23 deletions(-) --- a/mm/memcontrol.c~memcg-flush-stats-only-if-updated +++ a/mm/memcontrol.c @@ -103,11 +103,6 @@ static bool do_memsw_account(void) return !cgroup_subsys_on_dfl(memory_cgrp_subsys) && !cgroup_memory_noswap; } -/* memcg and lruvec stats flushing */ -static void flush_memcg_stats_dwork(struct work_struct *w); -static DECLARE_DEFERRABLE_WORK(stats_flush_dwork, flush_memcg_stats_dwork); -static DEFINE_SPINLOCK(stats_flush_lock); - #define THRESHOLDS_EVENTS_TARGET 128 #define SOFTLIMIT_EVENTS_TARGET 1024 @@ -635,6 +630,56 @@ mem_cgroup_largest_soft_limit_node(struc return mz; } +/* + * memcg and lruvec stats flushing + * + * Many codepaths leading to stats update or read are performance sensitive and + * adding stats flushing in such codepaths is not desirable. So, to optimize the + * flushing the kernel does: + * + * 1) Periodically and asynchronously flush the stats every 2 seconds to not let + * rstat update tree grow unbounded. + * + * 2) Flush the stats synchronously on reader side only when there are more than + * (MEMCG_CHARGE_BATCH * nr_cpus) update events. Though this optimization + * will let stats be out of sync by atmost (MEMCG_CHARGE_BATCH * nr_cpus) but + * only for 2 seconds due to (1). 
+ */ +static void flush_memcg_stats_dwork(struct work_struct *w); +static DECLARE_DEFERRABLE_WORK(stats_flush_dwork, flush_memcg_stats_dwork); +static DEFINE_SPINLOCK(stats_flush_lock); +static DEFINE_PER_CPU(unsigned int, stats_updates); +static atomic_t stats_flush_threshold = ATOMIC_INIT(0); + +static inline void memcg_rstat_updated(struct mem_cgroup *memcg) +{ + cgroup_rstat_updated(memcg->css.cgroup, smp_processor_id()); + if (!(__this_cpu_inc_return(stats_updates) % MEMCG_CHARGE_BATCH)) + atomic_inc(&stats_flush_threshold); +} + +static void __mem_cgroup_flush_stats(void) +{ + if (!spin_trylock(&stats_flush_lock)) + return; + + cgroup_rstat_flush_irqsafe(root_mem_cgroup->css.cgroup); + atomic_set(&stats_flush_threshold, 0); + spin_unlock(&stats_flush_lock); +} + +void mem_cgroup_flush_stats(void) +{ + if (atomic_read(&stats_flush_threshold) > num_online_cpus()) + __mem_cgroup_flush_stats(); +} + +static void flush_memcg_stats_dwork(struct work_struct *w) +{ + mem_cgroup_flush_stats(); + queue_delayed_work(system_unbound_wq, &stats_flush_dwork, 2UL*HZ); +} + /** * __mod_memcg_state - update cgroup memory statistics * @memcg: the memory cgroup @@ -647,7 +692,7 @@ void __mod_memcg_state(struct mem_cgroup return; __this_cpu_add(memcg->vmstats_percpu->state[idx], val); - cgroup_rstat_updated(memcg->css.cgroup, smp_processor_id()); + memcg_rstat_updated(memcg); } /* idx can be of type enum memcg_stat_item or node_stat_item. 
*/ @@ -675,10 +720,12 @@ void __mod_memcg_lruvec_state(struct lru memcg = pn->memcg; /* Update memcg */ - __mod_memcg_state(memcg, idx, val); + __this_cpu_add(memcg->vmstats_percpu->state[idx], val); /* Update lruvec */ __this_cpu_add(pn->lruvec_stats_percpu->state[idx], val); + + memcg_rstat_updated(memcg); } /** @@ -780,7 +827,7 @@ void __count_memcg_events(struct mem_cgr return; __this_cpu_add(memcg->vmstats_percpu->events[idx], count); - cgroup_rstat_updated(memcg->css.cgroup, smp_processor_id()); + memcg_rstat_updated(memcg); } static unsigned long memcg_events(struct mem_cgroup *memcg, int event) @@ -5341,21 +5388,6 @@ static void mem_cgroup_css_reset(struct memcg_wb_domain_size_changed(memcg); } -void mem_cgroup_flush_stats(void) -{ - if (!spin_trylock(&stats_flush_lock)) - return; - - cgroup_rstat_flush_irqsafe(root_mem_cgroup->css.cgroup); - spin_unlock(&stats_flush_lock); -} - -static void flush_memcg_stats_dwork(struct work_struct *w) -{ - mem_cgroup_flush_stats(); - queue_delayed_work(system_unbound_wq, &stats_flush_dwork, 2UL*HZ); -} - static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css, int cpu) { struct mem_cgroup *memcg = mem_cgroup_from_css(css); From patchwork Fri Nov 5 20:37:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605475 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 66C7FC433EF for ; Fri, 5 Nov 2021 20:37:38 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 1C8346056B for ; Fri, 5 Nov 2021 20:37:38 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 1C8346056B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) 
header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id AA1C7940034; Fri, 5 Nov 2021 16:37:37 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A2B16940007; Fri, 5 Nov 2021 16:37:37 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8F1C5940034; Fri, 5 Nov 2021 16:37:37 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0048.hostedemail.com [216.40.44.48]) by kanga.kvack.org (Postfix) with ESMTP id 7B8ED940007 for ; Fri, 5 Nov 2021 16:37:37 -0400 (EDT) Received: from smtpin14.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 8B26727776 for ; Fri, 5 Nov 2021 20:37:36 +0000 (UTC) X-FDA: 78776037312.14.552C730 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf01.hostedemail.com (Postfix) with ESMTP id 54031508FA5B for ; Fri, 5 Nov 2021 20:37:24 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 24DD161252; Fri, 5 Nov 2021 20:37:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144655; bh=MHyJACwCe3CuEGC+Njn6Dps9KgUr3ex5b66zNaCxUbM=; h=Date:From:To:Subject:In-Reply-To:From; b=P7b7U40F+TZIYKwCkm6KH3IMlmqrp79nN2Bdn6IHKby0XiRce+OR4u4lSrKZYDT2/ 2F4RaSA97it4bV5dXD42kt3g8+Xb/I9pShEvQ7/PWUN/7Da5IPyk8q0vAUV+Mfh+96 Vzv0BBuHCCMQ7+TNciG/Ltn6lccHqsOQS0QabpoI= Date: Fri, 05 Nov 2021 13:37:34 -0700 From: Andrew Morton To: akpm@linux-foundation.org, hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@kernel.org, mkoutny@suse.com, mm-commits@vger.kernel.org, shakeelb@google.com, torvalds@linux-foundation.org Subject: [patch 056/262] memcg: unify memcg stat flushing Message-ID: <20211105203734.G_ioCWCJv%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: 
s-nail v14.8.16 MIME-Version: 1.0 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 54031508FA5B X-Stat-Signature: 5cidc1twdm6mix99pw3acc1sb5ui79je Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=P7b7U40F; spf=pass (imf01.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144644-773771 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Shakeel Butt Subject: memcg: unify memcg stat flushing The memcg stats can be flushed in multiple contexts and potentially in parallel too. For example, multiple parallel user space readers of memcg stats will contend with each other on the rstat locks. There is no need for that. We just need one flusher and everyone else can benefit. In addition, after aa48e47e3906 ("memcg: infrastructure to flush memcg stats") the kernel periodically flushes the memcg stats from the root, so the other flushers will potentially have much less work to do. 
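The "one flusher, everyone else benefits" idea, combined with the update threshold from the previous patch, can be sketched in userspace C11 roughly as follows. This is only an illustration under stated assumptions, not the kernel implementation: BATCH_ASSUMED and NR_CPUS_ASSUMED are made-up stand-ins for MEMCG_CHARGE_BATCH and num_online_cpus(), a single plain counter replaces the per-cpu stats_updates, an atomic_flag replaces the spinlock trylock, and flushes_done merely counts where the expensive rstat flush would run:

```c
#include <assert.h>	/* handy when exercising the sketch */
#include <stdatomic.h>
#include <stdbool.h>

#define BATCH_ASSUMED	32	/* stand-in for MEMCG_CHARGE_BATCH */
#define NR_CPUS_ASSUMED	4	/* stand-in for num_online_cpus() */

static atomic_int flush_threshold;	/* counts batches of updates */
static atomic_flag flush_lock = ATOMIC_FLAG_INIT;
static unsigned int updates;		/* per-cpu in the kernel; one counter here */
static int flushes_done;		/* counts simulated flushes */

/* Update side: cheap, only bumps the shared counter once per batch. */
static void stat_updated(void)
{
	if (!(++updates % BATCH_ASSUMED))
		atomic_fetch_add(&flush_threshold, 1);
}

/* Reader side: flush only if enough updates piled up and nobody else is flushing. */
static bool try_flush(void)
{
	if (atomic_load(&flush_threshold) <= NR_CPUS_ASSUMED)
		return false;		/* too few updates to be worth a flush */
	if (atomic_flag_test_and_set(&flush_lock))
		return false;		/* another flusher is already at work */

	flushes_done++;			/* the expensive flush would go here */
	atomic_store(&flush_threshold, 0);
	atomic_flag_clear(&flush_lock);
	return true;
}
```

A reader that loses the race simply returns and uses whatever the winning flusher has already propagated, which is the trade-off the patch accepts in exchange for less lock contention.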
Link: https://lkml.kernel.org/r/20211001190040.48086-2-shakeelb@google.com Signed-off-by: Shakeel Butt Acked-by: Johannes Weiner Cc: Michal Hocko Cc: "Michal Koutný" Signed-off-by: Andrew Morton --- mm/memcontrol.c | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) --- a/mm/memcontrol.c~memcg-unify-memcg-stat-flushing +++ a/mm/memcontrol.c @@ -660,12 +660,14 @@ static inline void memcg_rstat_updated(s static void __mem_cgroup_flush_stats(void) { - if (!spin_trylock(&stats_flush_lock)) + unsigned long flag; + + if (!spin_trylock_irqsave(&stats_flush_lock, flag)) return; cgroup_rstat_flush_irqsafe(root_mem_cgroup->css.cgroup); atomic_set(&stats_flush_threshold, 0); - spin_unlock(&stats_flush_lock); + spin_unlock_irqrestore(&stats_flush_lock, flag); } void mem_cgroup_flush_stats(void) @@ -1461,7 +1463,7 @@ static char *memory_stat_format(struct m * * Current memory state: */ - cgroup_rstat_flush(memcg->css.cgroup); + mem_cgroup_flush_stats(); for (i = 0; i < ARRAY_SIZE(memory_stats); i++) { u64 size; @@ -3565,8 +3567,7 @@ static unsigned long mem_cgroup_usage(st unsigned long val; if (mem_cgroup_is_root(memcg)) { - /* mem_cgroup_threshold() calls here from irqsafe context */ - cgroup_rstat_flush_irqsafe(memcg->css.cgroup); + mem_cgroup_flush_stats(); val = memcg_page_state(memcg, NR_FILE_PAGES) + memcg_page_state(memcg, NR_ANON_MAPPED); if (swap) @@ -3947,7 +3948,7 @@ static int memcg_numa_stat_show(struct s int nid; struct mem_cgroup *memcg = mem_cgroup_from_seq(m); - cgroup_rstat_flush(memcg->css.cgroup); + mem_cgroup_flush_stats(); for (stat = stats; stat < stats + ARRAY_SIZE(stats); stat++) { seq_printf(m, "%s=%lu", stat->name, @@ -4019,7 +4020,7 @@ static int memcg_stat_show(struct seq_fi BUILD_BUG_ON(ARRAY_SIZE(memcg1_stat_names) != ARRAY_SIZE(memcg1_stats)); - cgroup_rstat_flush(memcg->css.cgroup); + mem_cgroup_flush_stats(); for (i = 0; i < ARRAY_SIZE(memcg1_stats); i++) { unsigned long nr; @@ -4522,7 +4523,7 @@ void 
mem_cgroup_wb_stats(struct bdi_writ struct mem_cgroup *memcg = mem_cgroup_from_css(wb->memcg_css); struct mem_cgroup *parent; - cgroup_rstat_flush_irqsafe(memcg->css.cgroup); + mem_cgroup_flush_stats(); *pdirty = memcg_page_state(memcg, NR_FILE_DIRTY); *pwriteback = memcg_page_state(memcg, NR_WRITEBACK); @@ -6405,7 +6406,7 @@ static int memory_numa_stat_show(struct int i; struct mem_cgroup *memcg = mem_cgroup_from_seq(m); - cgroup_rstat_flush(memcg->css.cgroup); + mem_cgroup_flush_stats(); for (i = 0; i < ARRAY_SIZE(memory_stats); i++) { int nid; From patchwork Fri Nov 5 20:37:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605477 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EE3D9C433FE for ; Fri, 5 Nov 2021 20:37:40 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A62D160174 for ; Fri, 5 Nov 2021 20:37:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org A62D160174 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 48866940035; Fri, 5 Nov 2021 16:37:40 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 411C9940007; Fri, 5 Nov 2021 16:37:40 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2B0FC940035; Fri, 5 Nov 2021 16:37:40 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0062.hostedemail.com [216.40.44.62]) by kanga.kvack.org (Postfix) with ESMTP id 1792F940007 for ; Fri, 5 Nov 2021 16:37:40 
-0400 (EDT) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id CE70518562FFF for ; Fri, 5 Nov 2021 20:37:39 +0000 (UTC) X-FDA: 78776037438.12.9F8C976 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf16.hostedemail.com (Postfix) with ESMTP id 2B9DEF000097 for ; Fri, 5 Nov 2021 20:37:31 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 42314611C0; Fri, 5 Nov 2021 20:37:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144658; bh=xFZJjqfbB2coAlfjMLpnv1ffrcNXhB2MQsxAUgr8ouk=; h=Date:From:To:Subject:In-Reply-To:From; b=IxkflBrzC0/Q4D0X7SABuqdZg/FqnT8ahpTqk5NqL/xaTnTaxRexf6nUd5/DSv/Zn NT2UvMhpX7HUQ3okYGBb3nmwPrCOiuNNoeiBZwVDcUDdYDVoz4OLh7BYWF0h6Afe0b WzRWiWJx2zWGNel0p/enqRgCqOX+/FLjmR2paeOs= Date: Fri, 05 Nov 2021 13:37:37 -0700 From: Andrew Morton To: akpm@linux-foundation.org, atomlin@redhat.com, guro@fb.com, hannes@cmpxchg.org, linux-mm@kvack.org, longman@redhat.com, mhocko@kernel.org, mm-commits@vger.kernel.org, shakeelb@google.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, vbabka@suse.cz, vdavydov.dev@gmail.com Subject: [patch 057/262] mm/memcg: remove obsolete memcg_free_kmem() Message-ID: <20211105203737.zbyywaHFG%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf16.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=IxkflBrz; dmarc=none; spf=pass (imf16.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 2B9DEF000097 X-Stat-Signature: q93ok96y6eu7757t5agktioqty1t6eaa X-HE-Tag: 1636144651-858925 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk 
X-Loop: owner-majordomo@kvack.org List-ID: From: Waiman Long Subject: mm/memcg: remove obsolete memcg_free_kmem() Since commit d648bcc7fe65 ("mm: kmem: make memcg_kmem_enabled() irreversible"), the only thing memcg_free_kmem() does is call memcg_offline_kmem() when the memcg is still online, which can happen when online_css() fails due to -ENOMEM. However, the name memcg_free_kmem() is confusing, and it is clearer and more straightforward to call memcg_offline_kmem() directly from mem_cgroup_css_free(). Link: https://lkml.kernel.org/r/20211005202450.11775-1-longman@redhat.com Signed-off-by: Waiman Long Suggested-by: Roman Gushchin Reviewed-by: Aaron Tomlin Reviewed-by: Shakeel Butt Reviewed-by: Roman Gushchin Cc: Johannes Weiner Cc: Michal Hocko Cc: Vladimir Davydov Cc: Vlastimil Babka Cc: Muchun Song Signed-off-by: Andrew Morton --- mm/memcontrol.c | 14 +++----------- 1 file changed, 3 insertions(+), 11 deletions(-) --- a/mm/memcontrol.c~mm-memcg-remove-obsolete-memcg_free_kmem +++ a/mm/memcontrol.c @@ -3704,13 +3704,6 @@ static void memcg_offline_kmem(struct me memcg_free_cache_id(kmemcg_id); } - -static void memcg_free_kmem(struct mem_cgroup *memcg) -{ - /* css_alloc() failed, offlining didn't happen */ - if (unlikely(memcg->kmem_state == KMEM_ONLINE)) - memcg_offline_kmem(memcg); -} #else static int memcg_online_kmem(struct mem_cgroup *memcg) { @@ -3719,9 +3712,6 @@ static int memcg_online_kmem(struct mem_ static void memcg_offline_kmem(struct mem_cgroup *memcg) { } -static void memcg_free_kmem(struct mem_cgroup *memcg) -{ -} #endif /* CONFIG_MEMCG_KMEM */ static int memcg_update_kmem_max(struct mem_cgroup *memcg, @@ -5356,7 +5346,9 @@ static void mem_cgroup_css_free(struct c cancel_work_sync(&memcg->high_work); mem_cgroup_remove_from_trees(memcg); free_shrinker_info(memcg); - memcg_free_kmem(memcg); + + /* Need to offline kmem if online_css() fails */ + memcg_offline_kmem(memcg); mem_cgroup_free(memcg); } From patchwork Fri Nov 5 20:37:40 2021 Content-Type: 
text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605479 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C596BC433F5 for ; Fri, 5 Nov 2021 20:37:43 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 77D73611C4 for ; Fri, 5 Nov 2021 20:37:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 77D73611C4 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 1A126940036; Fri, 5 Nov 2021 16:37:43 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 102F7940007; Fri, 5 Nov 2021 16:37:43 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id F0F53940036; Fri, 5 Nov 2021 16:37:42 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0240.hostedemail.com [216.40.44.240]) by kanga.kvack.org (Postfix) with ESMTP id D9B6B940007 for ; Fri, 5 Nov 2021 16:37:42 -0400 (EDT) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id A2C713DDA8 for ; Fri, 5 Nov 2021 20:37:42 +0000 (UTC) X-FDA: 78776037564.21.9CD5793 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf31.hostedemail.com (Postfix) with ESMTP id E8C80104AAF0 for ; Fri, 5 Nov 2021 20:37:33 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 6AC7860174; Fri, 5 Nov 2021 20:37:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144661; 
bh=KQCjP6opqOl5LwSsFk+8X1PT7ooH+zkY2Dt982p+fhg=; h=Date:From:To:Subject:In-Reply-To:From; b=kxG4FMjT6TW5QBMs6UptBr3oBZqm9faCID51wSykgnvbDtberUX6XV4UgVHNMfgur RVDBHCx46kBaS7g4zZZLR5IT+U4vOKfDPOuuDqDWEIR/b3urBXx8NEvxDeOtbh+Sno 5NQQz5kHdXypUj79I+fLREDUQtZ3kaXYQhNzyRzk= Date: Fri, 05 Nov 2021 13:37:40 -0700 From: Andrew Morton To: akpm@linux-foundation.org, gustavoars@kernel.org, keescook@chromium.org, len.baker@gmx.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 058/262] mm/list_lru.c: prefer struct_size over open coded arithmetic Message-ID: <20211105203740.j6OjstKjD%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: E8C80104AAF0 X-Stat-Signature: hpnn6dghkbyaymjbch3g5dddse119txn Authentication-Results: imf31.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=kxG4FMjT; spf=pass (imf31.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144653-364245 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Len Baker Subject: mm/list_lru.c: prefer struct_size over open coded arithmetic As noted in the "Deprecated Interfaces, Language Features, Attributes, and Conventions" documentation [1], size calculations (especially multiplication) should not be performed in memory allocator (or similar) function arguments due to the risk of them overflowing. This could lead to values wrapping around and a smaller allocation being made than the caller was expecting. Using those allocations could lead to linear overflows of heap memory and other misbehaviors. 
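The wrap-around risk described above can be shown with a small userspace sketch. It is not the kernel's struct_size() macro. The struct lrus_example type and the helpers naive_size() and checked_size() are hypothetical names used only for illustration. The checked variant saturates to SIZE_MAX on overflow, which is also how the kernel helper is documented to behave, so a following allocation fails instead of silently coming out too small:

```c
#include <assert.h>	/* handy when exercising the sketch */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: a header followed by a flexible array of pointers. */
struct lrus_example {
	size_t nr;
	void *lru[];	/* flexible array member */
};

/* Open-coded arithmetic: silently wraps around when count is huge. */
static size_t naive_size(size_t count)
{
	return sizeof(struct lrus_example) + count * sizeof(void *);
}

/* Checked arithmetic: returns SIZE_MAX instead of a wrapped value. */
static size_t checked_size(size_t count)
{
	/* Largest count for which header + count * element still fits. */
	size_t max_count = (SIZE_MAX - sizeof(struct lrus_example)) / sizeof(void *);

	if (count > max_count)
		return SIZE_MAX;
	return sizeof(struct lrus_example) + count * sizeof(void *);
}
```

For a count just past SIZE_MAX / sizeof(void *), naive_size() wraps to the size of the bare header, so an allocator would happily return a buffer with no room for the array, whereas checked_size() reports SIZE_MAX.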
So, use the struct_size() helper to do the arithmetic instead of the argument "size + count * size" in the kvmalloc() functions. Also, take the opportunity to refactor the memcpy() call to use the flex_array_size() helper. This code was detected with the help of Coccinelle and audited and fixed manually. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments Link: https://lkml.kernel.org/r/20211017105929.9284-1-len.baker@gmx.com Signed-off-by: Len Baker Cc: Kees Cook Cc: "Gustavo A. R. Silva" Signed-off-by: Andrew Morton --- mm/list_lru.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) --- a/mm/list_lru.c~mm-list_lruc-prefer-struct_size-over-open-coded-arithmetic +++ a/mm/list_lru.c @@ -354,8 +354,7 @@ static int memcg_init_list_lru_node(stru struct list_lru_memcg *memcg_lrus; int size = memcg_nr_cache_ids; - memcg_lrus = kvmalloc(sizeof(*memcg_lrus) + - size * sizeof(void *), GFP_KERNEL); + memcg_lrus = kvmalloc(struct_size(memcg_lrus, lru, size), GFP_KERNEL); if (!memcg_lrus) return -ENOMEM; @@ -389,7 +388,7 @@ static int memcg_update_list_lru_node(st old = rcu_dereference_protected(nlru->memcg_lrus, lockdep_is_held(&list_lrus_mutex)); - new = kvmalloc(sizeof(*new) + new_size * sizeof(void *), GFP_KERNEL); + new = kvmalloc(struct_size(new, lru, new_size), GFP_KERNEL); if (!new) return -ENOMEM; @@ -398,7 +397,7 @@ static int memcg_update_list_lru_node(st return -ENOMEM; } - memcpy(&new->lru, &old->lru, old_size * sizeof(void *)); + memcpy(&new->lru, &old->lru, flex_array_size(new, lru, old_size)); /* * The locking below allows readers that hold nlru->lock avoid taking From patchwork Fri Nov 5 20:37:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605481 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from 
mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F06B2C433EF for ; Fri, 5 Nov 2021 20:37:46 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id AAAF060174 for ; Fri, 5 Nov 2021 20:37:46 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org AAAF060174 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 46077940037; Fri, 5 Nov 2021 16:37:46 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 3E8AF940007; Fri, 5 Nov 2021 16:37:46 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2B0C1940037; Fri, 5 Nov 2021 16:37:46 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0044.hostedemail.com [216.40.44.44]) by kanga.kvack.org (Postfix) with ESMTP id 157C6940007 for ; Fri, 5 Nov 2021 16:37:46 -0400 (EDT) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id CC3FE1856A99C for ; Fri, 5 Nov 2021 20:37:45 +0000 (UTC) X-FDA: 78776037690.11.53E564B Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf31.hostedemail.com (Postfix) with ESMTP id 19D9B104AAC1 for ; Fri, 5 Nov 2021 20:37:36 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 6D72C611C0; Fri, 5 Nov 2021 20:37:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144664; bh=3jo0PsGxHwQQciMl73+65yoyer6IErrgt1+PUuPanXg=; h=Date:From:To:Subject:In-Reply-To:From; b=Pg/8w3lKzjuYEvP3xmBWqDov+e2Bs+yYKq9uzJ7K34hcOMSE0r4HDUuCsWy0vsdPT Y+kiX7vWM0ZmWhlQ48BtWT1iw2R9yv/LrOnBMWKL7p503b0W5w0w31JUufnnHK3SNG aNk/PlfmNwiNQfS8tUAv6wQ6MgdIFWZng3otaDN0= Date: 
Fri, 05 Nov 2021 13:37:44 -0700 From: Andrew Morton To: akpm@linux-foundation.org, arnd@arndb.de, guro@fb.com, hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, shakeelb@google.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, vvs@virtuozzo.com Subject: [patch 059/262] memcg, kmem: further deprecate kmem.limit_in_bytes Message-ID: <20211105203744.FeAxCRHkX%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf31.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="Pg/8w3lK"; dmarc=none; spf=pass (imf31.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 19D9B104AAC1 X-Stat-Signature: 1x4q5fsjjichuttxrbroxucm5o68fbuz X-HE-Tag: 1636144656-766501 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Shakeel Butt Subject: memcg, kmem: further deprecate kmem.limit_in_bytes The deprecation process of kmem.limit_in_bytes started with commit 0158115f702 ("memcg, kmem: deprecate kmem.limit_in_bytes"), which also explains in detail the motivation behind the deprecation. To summarize, the motivation is the unexpected behavior on hitting the kmem limit. This patch moves the deprecation process to the next stage by disallowing setting the kmem limit. In the future we might just remove the kmem.limit_in_bytes file completely.
[akpm@linux-foundation.org: s/ENOTSUPP/EOPNOTSUPP/] [arnd@arndb.de: mark cancel_charge() inline] Link: https://lkml.kernel.org/r/20211022070542.679839-1-arnd@kernel.org Link: https://lkml.kernel.org/r/20211019153408.2916808-1-shakeelb@google.com Signed-off-by: Shakeel Butt Signed-off-by: Arnd Bergmann Acked-by: Roman Gushchin Acked-by: Michal Hocko Reviewed-by: Muchun Song Cc: Vasily Averin Cc: Johannes Weiner Signed-off-by: Andrew Morton --- Documentation/admin-guide/cgroup-v1/memory.rst | 11 ---- mm/memcontrol.c | 39 +-------------- 2 files changed, 7 insertions(+), 43 deletions(-) --- a/Documentation/admin-guide/cgroup-v1/memory.rst~memcg-kmem-further-deprecate-kmemlimit_in_bytes +++ a/Documentation/admin-guide/cgroup-v1/memory.rst @@ -87,10 +87,8 @@ Brief summary of control files. memory.oom_control set/show oom controls. memory.numa_stat show the number of memory usage per numa node - memory.kmem.limit_in_bytes set/show hard limit for kernel memory - This knob is deprecated and shouldn't be - used. It is planned that this be removed in - the foreseeable future. + memory.kmem.limit_in_bytes This knob is deprecated and writing to + it will return -ENOTSUPP. memory.kmem.usage_in_bytes show current kernel memory allocation memory.kmem.failcnt show the number of kernel memory usage hits limits @@ -518,11 +516,6 @@ will be charged as a new owner of it. charged file caches. Some out-of-use page caches may keep charged until memory pressure happens. If you want to avoid that, force_empty will be useful. - Also, note that when memory.kmem.limit_in_bytes is set the charges due to - kernel pages will still be seen. This is not considered a failure and the - write will still return success. In this case, it is expected that - memory.kmem.usage_in_bytes == memory.usage_in_bytes. 
- 5.2 stat file ------------- --- a/mm/memcontrol.c~memcg-kmem-further-deprecate-kmemlimit_in_bytes +++ a/mm/memcontrol.c @@ -2771,8 +2771,7 @@ static inline int try_charge(struct mem_ return try_charge_memcg(memcg, gfp_mask, nr_pages); } -#if defined(CONFIG_MEMCG_KMEM) || defined(CONFIG_MMU) -static void cancel_charge(struct mem_cgroup *memcg, unsigned int nr_pages) +static inline void cancel_charge(struct mem_cgroup *memcg, unsigned int nr_pages) { if (mem_cgroup_is_root(memcg)) return; @@ -2781,7 +2780,6 @@ static void cancel_charge(struct mem_cgr if (do_memsw_account()) page_counter_uncharge(&memcg->memsw, nr_pages); } -#endif static void commit_charge(struct page *page, struct mem_cgroup *memcg) { @@ -3000,7 +2998,6 @@ static void obj_cgroup_uncharge_pages(st static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp, unsigned int nr_pages) { - struct page_counter *counter; struct mem_cgroup *memcg; int ret; @@ -3010,21 +3007,8 @@ static int obj_cgroup_charge_pages(struc if (ret) goto out; - if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && - !page_counter_try_charge(&memcg->kmem, nr_pages, &counter)) { - - /* - * Enforce __GFP_NOFAIL allocation because callers are not - * prepared to see failures and likely do not have any failure - * handling code. 
- */ - if (gfp & __GFP_NOFAIL) { - page_counter_charge(&memcg->kmem, nr_pages); - goto out; - } - cancel_charge(memcg, nr_pages); - ret = -ENOMEM; - } + if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) + page_counter_charge(&memcg->kmem, nr_pages); out: css_put(&memcg->css); @@ -3714,17 +3698,6 @@ static void memcg_offline_kmem(struct me } #endif /* CONFIG_MEMCG_KMEM */ -static int memcg_update_kmem_max(struct mem_cgroup *memcg, - unsigned long max) -{ - int ret; - - mutex_lock(&memcg_max_mutex); - ret = page_counter_set_max(&memcg->kmem, max); - mutex_unlock(&memcg_max_mutex); - return ret; -} - static int memcg_update_tcp_max(struct mem_cgroup *memcg, unsigned long max) { int ret; @@ -3790,10 +3763,8 @@ static ssize_t mem_cgroup_write(struct k ret = mem_cgroup_resize_max(memcg, nr_pages, true); break; case _KMEM: - pr_warn_once("kmem.limit_in_bytes is deprecated and will be removed. " - "Please report your usecase to linux-mm@kvack.org if you " - "depend on this functionality.\n"); - ret = memcg_update_kmem_max(memcg, nr_pages); + /* kmem.limit_in_bytes is deprecated. 
*/ + ret = -EOPNOTSUPP; break; case _TCP: ret = memcg_update_tcp_max(memcg, nr_pages); From patchwork Fri Nov 5 20:37:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605483 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DF0B7C433EF for ; Fri, 5 Nov 2021 20:37:49 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9882860174 for ; Fri, 5 Nov 2021 20:37:49 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 9882860174 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 39DFC940038; Fri, 5 Nov 2021 16:37:49 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 2D8F4940007; Fri, 5 Nov 2021 16:37:49 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 179D2940038; Fri, 5 Nov 2021 16:37:49 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0180.hostedemail.com [216.40.44.180]) by kanga.kvack.org (Postfix) with ESMTP id 0353D940007 for ; Fri, 5 Nov 2021 16:37:49 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id BA6A77798B for ; Fri, 5 Nov 2021 20:37:48 +0000 (UTC) X-FDA: 78776037816.29.A85F23F Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf10.hostedemail.com (Postfix) with ESMTP id F37C160019A7 for ; Fri, 5 Nov 2021 20:37:36 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 8343B61252; 
Fri, 5 Nov 2021 20:37:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144667; bh=I3HMw03PdXao4xhAgGi9zClkcaC8ZMkFxxofAO2HuRM=; h=Date:From:To:Subject:In-Reply-To:From; b=mTTLJT5F0Y5SX9PJeCHjp5vC9WYuqPFed4A40PljPf0nfBRaIyP297c+VGB5eogFB C6ma27BZMh90v39YG08LCSg4Ca+K88mPunOEgZBXBBZvMEQ5lCHZvRx/zNiMKn4BrR igE4aXNtZaLW3CExg4cEXDYN2OyTmG3sOdwc/cjw= Date: Fri, 05 Nov 2021 13:37:47 -0700 From: Andrew Morton To: akpm@linux-foundation.org, guro@fb.com, hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@kernel.org, mm-commits@vger.kernel.org, shakeelb@google.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 060/262] mm: list_lru: remove holding lru lock Message-ID: <20211105203747.qUrOzAFsm%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: F37C160019A7 X-Stat-Signature: 5pfk1iunp17ntz5574foyeqr8aae4fzw Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=mTTLJT5F; dmarc=none; spf=pass (imf10.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144656-435388 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Muchun Song Subject: mm: list_lru: remove holding lru lock Since commit e5bc3af7734f ("rcu: Consolidate PREEMPT and !PREEMPT synchronize_rcu()"), a spinlock critical section also serves as an RCU read-side critical section, which already allows readers that hold nlru->lock to avoid taking the RCU read lock. So there is no need to hold nlru->lock around the pointer update; just remove it.
Link: https://lkml.kernel.org/r/20211025124534.56345-1-songmuchun@bytedance.com Signed-off-by: Muchun Song Cc: Johannes Weiner Cc: Matthew Wilcox (Oracle) Cc: Michal Hocko Cc: Roman Gushchin Cc: Shakeel Butt Signed-off-by: Andrew Morton --- mm/list_lru.c | 11 ----------- 1 file changed, 11 deletions(-) --- a/mm/list_lru.c~mm-list_lru-remove-holding-lru-lock +++ a/mm/list_lru.c @@ -398,18 +398,7 @@ static int memcg_update_list_lru_node(st } memcpy(&new->lru, &old->lru, flex_array_size(new, lru, old_size)); - - /* - * The locking below allows readers that hold nlru->lock avoid taking - * rcu_read_lock (see list_lru_from_memcg_idx). - * - * Since list_lru_{add,del} may be called under an IRQ-safe lock, - * we have to use IRQ-safe primitives here to avoid deadlock. - */ - spin_lock_irq(&nlru->lock); rcu_assign_pointer(nlru->memcg_lrus, new); - spin_unlock_irq(&nlru->lock); - kvfree_rcu(old, rcu); return 0; } From patchwork Fri Nov 5 20:37:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605485 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 23F16C433FE for ; Fri, 5 Nov 2021 20:37:53 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D1F21611C0 for ; Fri, 5 Nov 2021 20:37:52 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org D1F21611C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 6D1F7940039; Fri, 5 Nov 2021 16:37:52 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 65A0D940007; Fri, 5 Nov 2021 16:37:52 -0400 (EDT) 
X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4FC2E940039; Fri, 5 Nov 2021 16:37:52 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0148.hostedemail.com [216.40.44.148]) by kanga.kvack.org (Postfix) with ESMTP id 3A922940007 for ; Fri, 5 Nov 2021 16:37:52 -0400 (EDT) Received: from smtpin02.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 00FC96EDE9 for ; Fri, 5 Nov 2021 20:37:52 +0000 (UTC) X-FDA: 78776037984.02.153071F Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf27.hostedemail.com (Postfix) with ESMTP id 99BF970000AE for ; Fri, 5 Nov 2021 20:37:51 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 9018460174; Fri, 5 Nov 2021 20:37:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144670; bh=mQfLVeFky/wjvihXxWdCm3YwQNNOIc8OFRailCxRgaY=; h=Date:From:To:Subject:In-Reply-To:From; b=XLIS7WnvT6OGa1Pj4ukL0J8ai9VXu5qY0TFBftGYk9DTQtUM60R1L2UgOxR3F1ooT 2rQUVwadEO91wHaxx1nUF0wDjPbv+HN3R1tRaGCHKhq0pCrc9BXxQRqIpnxbYhpMgU wp0AAXyGotM8M+khITkxWppPtt6YTgI0ydMhZpRM= Date: Fri, 05 Nov 2021 13:37:50 -0700 From: Andrew Morton To: akpm@linux-foundation.org, guro@fb.com, hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@kernel.org, mm-commits@vger.kernel.org, shakeelb@google.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 061/262] mm: list_lru: fix the return value of list_lru_count_one() Message-ID: <20211105203750.6jGE4g6rW%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 99BF970000AE X-Stat-Signature: bp3dnd9d7fprhi9egjzkkyrszftctbyc Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=XLIS7Wnv; 
dmarc=none; spf=pass (imf27.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144671-649686 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Muchun Song Subject: mm: list_lru: fix the return value of list_lru_count_one() Since commit 2788cf0c401c ("memcg: reparent list_lrus and free kmemcg_id on css offline"), ->nr_items can be negative during memory cgroup reparenting. In this case, list_lru_count_one() returns an unexpectedly huge value, which can surprise users. So far this has not affected any users, but it is better to let list_lru_count_one() return zero when ->nr_items is negative. Link: https://lkml.kernel.org/r/20211025124910.56433-1-songmuchun@bytedance.com Signed-off-by: Muchun Song Cc: Johannes Weiner Cc: Matthew Wilcox (Oracle) Cc: Michal Hocko Cc: Roman Gushchin Cc: Shakeel Butt Signed-off-by: Andrew Morton --- mm/list_lru.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) --- a/mm/list_lru.c~mm-list_lru-fix-the-return-value-of-list_lru_count_one +++ a/mm/list_lru.c @@ -176,13 +176,16 @@ unsigned long list_lru_count_one(struct { struct list_lru_node *nlru = &lru->node[nid]; struct list_lru_one *l; - unsigned long count; + long count; rcu_read_lock(); l = list_lru_from_memcg_idx(nlru, memcg_cache_id(memcg)); count = READ_ONCE(l->nr_items); rcu_read_unlock(); + if (unlikely(count < 0)) + count = 0; + return count; } EXPORT_SYMBOL_GPL(list_lru_count_one); From patchwork Fri Nov 5 20:37:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605487 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org
[198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1BD4AC433EF for ; Fri, 5 Nov 2021 20:37:56 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C847E611C4 for ; Fri, 5 Nov 2021 20:37:55 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org C847E611C4 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 5E58894003A; Fri, 5 Nov 2021 16:37:55 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 56FA6940007; Fri, 5 Nov 2021 16:37:55 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 463B094003B; Fri, 5 Nov 2021 16:37:55 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0034.hostedemail.com [216.40.44.34]) by kanga.kvack.org (Postfix) with ESMTP id 2E2B094003A for ; Fri, 5 Nov 2021 16:37:55 -0400 (EDT) Received: from smtpin19.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id E823E7798E for ; Fri, 5 Nov 2021 20:37:54 +0000 (UTC) X-FDA: 78776037984.19.D7E4570 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf17.hostedemail.com (Postfix) with ESMTP id 99F98F00039B for ; Fri, 5 Nov 2021 20:37:54 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 9E132611C0; Fri, 5 Nov 2021 20:37:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144673; bh=FZcD3yo9NxFRA7vGHyNmjobiIoJIRxzisn67CwJuB+0=; h=Date:From:To:Subject:In-Reply-To:From; b=YbDLvOYq77twMFEiUh7N0MnH8Ia5H5ZzKUv/RjfHe0xmFLPIfMF6HsDS4OlRbfay/ rhPqK9yChT6bdtYowHnkEIPuRuPpjnKzUw5pjM8tuMxyNa+kDTUiq7o/glW5XbnnyZ oPQHtafHsoiZ2P3ojSDFbWp/qQRlndWDZvZ2Q9Tc= Date: Fri, 05 Nov 2021 13:37:53 -0700 From: 
Andrew Morton To: akpm@linux-foundation.org, guro@fb.com, hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@kernel.org, mm-commits@vger.kernel.org, shakeelb@google.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 062/262] mm: memcontrol: remove kmemcg_id reparenting Message-ID: <20211105203753.q3Y-GvB-j%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=YbDLvOYq; dmarc=none; spf=pass (imf17.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 99F98F00039B X-Stat-Signature: y9odaspwpd53a5pzho5i4kq1mhzswkun X-HE-Tag: 1636144674-285289 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Muchun Song Subject: mm: memcontrol: remove kmemcg_id reparenting Since slab objects and kmem pages are charged to the object cgroup instead of the memory cgroup, memcg_reparent_objcgs() will reparent this cgroup and all its descendants to its parent cgroup. This already makes subsequent list_lru_add() calls add elements to the parent's list. So it is unnecessary to change the kmemcg_id of an offline cgroup to its parent's id; doing so just wastes CPU cycles. Remove the redundant code.
Link: https://lkml.kernel.org/r/20211025125102.56533-1-songmuchun@bytedance.com Signed-off-by: Muchun Song Acked-by: Roman Gushchin Cc: Johannes Weiner Cc: Matthew Wilcox (Oracle) Cc: Michal Hocko Cc: Shakeel Butt Signed-off-by: Andrew Morton --- mm/memcontrol.c | 19 ++++--------------- 1 file changed, 4 insertions(+), 15 deletions(-) --- a/mm/memcontrol.c~mm-memcontrol-remove-kmemcg_id-reparenting +++ a/mm/memcontrol.c @@ -3650,8 +3650,7 @@ static int memcg_online_kmem(struct mem_ static void memcg_offline_kmem(struct mem_cgroup *memcg) { - struct cgroup_subsys_state *css; - struct mem_cgroup *parent, *child; + struct mem_cgroup *parent; int kmemcg_id; if (memcg->kmem_state != KMEM_ONLINE) @@ -3669,21 +3668,11 @@ static void memcg_offline_kmem(struct me BUG_ON(kmemcg_id < 0); /* - * Change kmemcg_id of this cgroup and all its descendants to the - * parent's id, and then move all entries from this cgroup's list_lrus - * to ones of the parent. After we have finished, all list_lrus - * corresponding to this cgroup are guaranteed to remain empty. The - * ordering is imposed by list_lru_node->lock taken by + * After we have finished memcg_reparent_objcgs(), all list_lrus + * corresponding to this cgroup are guaranteed to remain empty. + * The ordering is imposed by list_lru_node->lock taken by * memcg_drain_all_list_lrus(). 
*/ - rcu_read_lock(); /* can be called from css_free w/o cgroup_mutex */ - css_for_each_descendant_pre(css, &memcg->css) { - child = mem_cgroup_from_css(css); - BUG_ON(child->kmemcg_id != kmemcg_id); - child->kmemcg_id = parent->kmemcg_id; - } - rcu_read_unlock(); - memcg_drain_all_list_lrus(kmemcg_id, parent); memcg_free_cache_id(kmemcg_id); From patchwork Fri Nov 5 20:37:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605489 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5D0A2C433F5 for ; Fri, 5 Nov 2021 20:37:59 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 1161060174 for ; Fri, 5 Nov 2021 20:37:59 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 1161060174 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 944B594003B; Fri, 5 Nov 2021 16:37:58 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 8CC4E940007; Fri, 5 Nov 2021 16:37:58 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 792D894003B; Fri, 5 Nov 2021 16:37:58 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0178.hostedemail.com [216.40.44.178]) by kanga.kvack.org (Postfix) with ESMTP id 66933940007 for ; Fri, 5 Nov 2021 16:37:58 -0400 (EDT) Received: from smtpin30.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 2E2998249980 for ; Fri, 5 Nov 2021 20:37:58 +0000 (UTC) X-FDA: 
78776038236.30.E4CFBB8 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf20.hostedemail.com (Postfix) with ESMTP id 98763D0000B2 for ; Fri, 5 Nov 2021 20:37:48 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id B3A006056B; Fri, 5 Nov 2021 20:37:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144677; bh=9h0IigcCmrTEt+4eoM3rVMbgJmqxLhcDzX7zddFmjw4=; h=Date:From:To:Subject:In-Reply-To:From; b=s4SNBDfe1Fiq/fnw/c7NL3wC7RbI1zJsdlzPQ/kHBwkdiBc1+IWJ6buYX+HBNigcD 9a2KTu6MWR3zQOaK1Y+6dus98KjHYJs8Wf9GZenum7QbDKqRcE+Q4T/fxnBqu6YLel DweUYEQCSd1psGM6ZL3dvnsO2x8oPrDafggpHbsY= Date: Fri, 05 Nov 2021 13:37:56 -0700 From: Andrew Morton To: akpm@linux-foundation.org, guro@fb.com, hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@kernel.org, mm-commits@vger.kernel.org, shakeelb@google.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 063/262] mm: memcontrol: remove the kmem states Message-ID: <20211105203756.ePVAGAdsI%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 98763D0000B2 X-Stat-Signature: 4s6js8j84qr7w6jr6q6skk7u8jjgiyzk Authentication-Results: imf20.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=s4SNBDfe; spf=pass (imf20.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144668-298171 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Muchun Song Subject: mm: memcontrol: remove the kmem states Now the kmem states are only used to indicate whether the kmem is offline. However, we can set ->kmemcg_id to -1 to indicate that the kmem is offline instead.
Finally, we can remove the kmem states to simplify the code. Link: https://lkml.kernel.org/r/20211025125259.56624-1-songmuchun@bytedance.com Signed-off-by: Muchun Song Acked-by: Roman Gushchin Cc: Michal Hocko Cc: Shakeel Butt Cc: Matthew Wilcox (Oracle) Cc: Johannes Weiner Signed-off-by: Andrew Morton --- include/linux/memcontrol.h | 7 ------- mm/memcontrol.c | 7 ++----- 2 files changed, 2 insertions(+), 12 deletions(-) --- a/include/linux/memcontrol.h~mm-memcontrol-remove-the-kmem-states +++ a/include/linux/memcontrol.h @@ -180,12 +180,6 @@ struct mem_cgroup_thresholds { struct mem_cgroup_threshold_ary *spare; }; -enum memcg_kmem_state { - KMEM_NONE, - KMEM_ALLOCATED, - KMEM_ONLINE, -}; - #if defined(CONFIG_SMP) struct memcg_padding { char x[0]; @@ -318,7 +312,6 @@ struct mem_cgroup { #ifdef CONFIG_MEMCG_KMEM int kmemcg_id; - enum memcg_kmem_state kmem_state; struct obj_cgroup __rcu *objcg; struct list_head objcg_list; /* list of inherited objcgs */ #endif --- a/mm/memcontrol.c~mm-memcontrol-remove-the-kmem-states +++ a/mm/memcontrol.c @@ -3626,7 +3626,6 @@ static int memcg_online_kmem(struct mem_ return 0; BUG_ON(memcg->kmemcg_id >= 0); - BUG_ON(memcg->kmem_state); memcg_id = memcg_alloc_cache_id(); if (memcg_id < 0) @@ -3643,7 +3642,6 @@ static int memcg_online_kmem(struct mem_ static_branch_enable(&memcg_kmem_enabled_key); memcg->kmemcg_id = memcg_id; - memcg->kmem_state = KMEM_ONLINE; return 0; } @@ -3653,11 +3651,9 @@ static void memcg_offline_kmem(struct me struct mem_cgroup *parent; int kmemcg_id; - if (memcg->kmem_state != KMEM_ONLINE) + if (memcg->kmemcg_id == -1) return; - memcg->kmem_state = KMEM_ALLOCATED; - parent = parent_mem_cgroup(memcg); if (!parent) parent = root_mem_cgroup; @@ -3676,6 +3672,7 @@ static void memcg_offline_kmem(struct me memcg_drain_all_list_lrus(kmemcg_id, parent); memcg_free_cache_id(kmemcg_id); + memcg->kmemcg_id = -1; } #else static int memcg_online_kmem(struct mem_cgroup *memcg) From patchwork Fri Nov 5 20:37:59 2021 
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605491 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B4246C433EF for ; Fri, 5 Nov 2021 20:38:02 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6669160174 for ; Fri, 5 Nov 2021 20:38:02 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 6669160174 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 0901F94003C; Fri, 5 Nov 2021 16:38:02 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 01839940007; Fri, 5 Nov 2021 16:38:01 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E2DD894003C; Fri, 5 Nov 2021 16:38:01 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id CA845940007 for ; Fri, 5 Nov 2021 16:38:01 -0400 (EDT) Received: from smtpin27.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 94A2B75317 for ; Fri, 5 Nov 2021 20:38:01 +0000 (UTC) X-FDA: 78776038362.27.08836A5 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf23.hostedemail.com (Postfix) with ESMTP id 6951690000A0 for ; Fri, 5 Nov 2021 20:37:48 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 13A21611C0; Fri, 5 Nov 2021 20:38:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; 
t=1636144680; bh=jS1FeamgCnYd09CMNrrQ01Ee83jnhCb8nkd2pCHOsvw=; h=Date:From:To:Subject:In-Reply-To:From; b=V+Azqit2Gp4N09ipAO9Nf4eK2sqpum3tEYgt1N2zrHMQiVsRk7NySEwwHXS3Gnrmj NIyHLzHBI4VSEf3NiUkvCOInHC+9WLrkiYekFrgBwRH76sAIdmSqM0iMpnRYxUV+se oyhiCRlv23J8mwrHNqI5SiRXeMDQof4C4Cj1/zmk= Date: Fri, 05 Nov 2021 13:37:59 -0700 From: Andrew Morton To: akpm@linux-foundation.org, guro@fb.com, hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@kernel.org, mm-commits@vger.kernel.org, shakeelb@google.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 064/262] mm: list_lru: only add memcg-aware lrus to the global lru list Message-ID: <20211105203759.WevBHqcIp%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 6951690000A0 X-Stat-Signature: hepq3rtinsgq41esrcdram7jyjmoxbf9 Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=V+Azqit2; dmarc=none; spf=pass (imf23.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144668-371375 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Muchun Song Subject: mm: list_lru: only add memcg-aware lrus to the global lru list A non-memcg-aware lru is always skipped when traversing the global lru list, which is not efficient. Instead, add only memcg-aware lrus to the global lru list so that traversal is more efficient.
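As a rough illustration of the idea (not the kernel code), the filtering can be moved from traversal time to registration time. The sketch below is user-space C; struct demo_lru, demo_memcg_lrus, demo_lru_register() and demo_update_all() are hypothetical names that only mirror the shape of list_lru_register() and memcg_update_all_list_lrus().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct list_lru; not the kernel type. */
struct demo_lru {
	bool memcg_aware;
	struct demo_lru *next;	/* link in the global list */
	int updates;		/* how often a traversal touched this lru */
};

/* Only memcg-aware lrus are ever linked here. */
static struct demo_lru *demo_memcg_lrus;

static void demo_lru_register(struct demo_lru *lru)
{
	assert(lru != NULL);

	/* Filter once, at registration, instead of on every traversal. */
	if (!lru->memcg_aware)
		return;

	lru->next = demo_memcg_lrus;
	demo_memcg_lrus = lru;
}

/* Walks the global list; no per-entry memcg_aware check is needed. */
static int demo_update_all(void)
{
	struct demo_lru *lru;
	int visited = 0;

	for (lru = demo_memcg_lrus; lru; lru = lru->next) {
		lru->updates++;
		visited++;
	}
	return visited;
}
```

With this layout a traversal visits exactly the lrus it has work for, so the early "not memcg-aware" returns in the per-lru helpers become unnecessary.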
Link: https://lkml.kernel.org/r/20211025124353.55781-1-songmuchun@bytedance.com Signed-off-by: Muchun Song Cc: Johannes Weiner Cc: Matthew Wilcox (Oracle) Cc: Michal Hocko Cc: Roman Gushchin Cc: Shakeel Butt Signed-off-by: Andrew Morton --- mm/list_lru.c | 35 ++++++++++++++++------------------- 1 file changed, 16 insertions(+), 19 deletions(-) --- a/mm/list_lru.c~mm-list_lru-only-add-memcg-aware-lrus-to-the-global-lru-list +++ a/mm/list_lru.c @@ -15,18 +15,29 @@ #include "slab.h" #ifdef CONFIG_MEMCG_KMEM -static LIST_HEAD(list_lrus); +static LIST_HEAD(memcg_list_lrus); static DEFINE_MUTEX(list_lrus_mutex); +static inline bool list_lru_memcg_aware(struct list_lru *lru) +{ + return lru->memcg_aware; +} + static void list_lru_register(struct list_lru *lru) { + if (!list_lru_memcg_aware(lru)) + return; + mutex_lock(&list_lrus_mutex); - list_add(&lru->list, &list_lrus); + list_add(&lru->list, &memcg_list_lrus); mutex_unlock(&list_lrus_mutex); } static void list_lru_unregister(struct list_lru *lru) { + if (!list_lru_memcg_aware(lru)) + return; + mutex_lock(&list_lrus_mutex); list_del(&lru->list); mutex_unlock(&list_lrus_mutex); @@ -37,11 +48,6 @@ static int lru_shrinker_id(struct list_l return lru->shrinker_id; } -static inline bool list_lru_memcg_aware(struct list_lru *lru) -{ - return lru->memcg_aware; -} - static inline struct list_lru_one * list_lru_from_memcg_idx(struct list_lru_node *nlru, int idx) { @@ -457,9 +463,6 @@ static int memcg_update_list_lru(struct { int i; - if (!list_lru_memcg_aware(lru)) - return 0; - for_each_node(i) { if (memcg_update_list_lru_node(&lru->node[i], old_size, new_size)) @@ -482,9 +485,6 @@ static void memcg_cancel_update_list_lru { int i; - if (!list_lru_memcg_aware(lru)) - return; - for_each_node(i) memcg_cancel_update_list_lru_node(&lru->node[i], old_size, new_size); @@ -497,7 +497,7 @@ int memcg_update_all_list_lrus(int new_s int old_size = memcg_nr_cache_ids; mutex_lock(&list_lrus_mutex); - list_for_each_entry(lru, &list_lrus, 
list) { + list_for_each_entry(lru, &memcg_list_lrus, list) { ret = memcg_update_list_lru(lru, old_size, new_size); if (ret) goto fail; @@ -506,7 +506,7 @@ out: mutex_unlock(&list_lrus_mutex); return ret; fail: - list_for_each_entry_continue_reverse(lru, &list_lrus, list) + list_for_each_entry_continue_reverse(lru, &memcg_list_lrus, list) memcg_cancel_update_list_lru(lru, old_size, new_size); goto out; } @@ -543,9 +543,6 @@ static void memcg_drain_list_lru(struct { int i; - if (!list_lru_memcg_aware(lru)) - return; - for_each_node(i) memcg_drain_list_lru_node(lru, i, src_idx, dst_memcg); } @@ -555,7 +552,7 @@ void memcg_drain_all_list_lrus(int src_i struct list_lru *lru; mutex_lock(&list_lrus_mutex); - list_for_each_entry(lru, &list_lrus, list) + list_for_each_entry(lru, &memcg_list_lrus, list) memcg_drain_list_lru(lru, src_idx, dst_memcg); mutex_unlock(&list_lrus_mutex); } From patchwork Fri Nov 5 20:38:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605493 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 17F96C4332F for ; Fri, 5 Nov 2021 20:38:06 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id BF19C6056B for ; Fri, 5 Nov 2021 20:38:05 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org BF19C6056B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 525F1940007; Fri, 5 Nov 2021 16:38:05 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 4859694003D; Fri, 5 Nov 2021 16:38:05 -0400 (EDT) X-Delivered-To: 
int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 34E56940007; Fri, 5 Nov 2021 16:38:05 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0057.hostedemail.com [216.40.44.57]) by kanga.kvack.org (Postfix) with ESMTP id 18D1094003D for ; Fri, 5 Nov 2021 16:38:05 -0400 (EDT) Received: from smtpin14.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id D4F5777996 for ; Fri, 5 Nov 2021 20:38:04 +0000 (UTC) X-FDA: 78776038488.14.D1A1607 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf07.hostedemail.com (Postfix) with ESMTP id 6D9BA10000A6 for ; Fri, 5 Nov 2021 20:38:04 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 3BCEE60174; Fri, 5 Nov 2021 20:38:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144683; bh=oYskhWGQjMRbogBIr+BmH6a6+9MqQrXa1LGipJld3Kc=; h=Date:From:To:Subject:In-Reply-To:From; b=1uXlQCd6CfVEJRqAvdXg8xud5nqwKNSFEdF8lM+ki2CBmt0n1Ju2DcJHpNwBg+1Sm TdmFzSYY9ysMO2myveEAwmAD2XjkeBHr9mglo0nnfjtlZe+4mHIlTKpcHcir7yoNPb vb5Y2u+tUvjR8at+MBUvE7O/maRgEHECrgiF0NR0= Date: Fri, 05 Nov 2021 13:38:02 -0700 From: Andrew Morton To: akpm@linux-foundation.org, guro@fb.com, hannes@cmpxchg.org, linux-mm@kvack.org, mgorman@techsingularity.net, mhocko@suse.com, mm-commits@vger.kernel.org, penguin-kernel@i-love.sakura.ne.jp, shakeelb@google.com, stable@vger.kernel.org, torvalds@linux-foundation.org, urezki@gmail.com, vbabka@suse.cz, vdavydov.dev@gmail.com, vvs@virtuozzo.com Subject: [patch 065/262] mm, oom: pagefault_out_of_memory: don't force global OOM for dying tasks Message-ID: <20211105203802.Yoox9Lj1y%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg 
header.b=1uXlQCd6; dmarc=none; spf=pass (imf07.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 6D9BA10000A6 X-Stat-Signature: hm7mrqq67iq6ocw4o3uyqp71j9uizpb8 X-HE-Tag: 1636144684-951250 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Vasily Averin Subject: mm, oom: pagefault_out_of_memory: don't force global OOM for dying tasks Patch series "memcg: prohibit unconditional exceeding the limit of dying tasks", v3. Memory cgroup charging allows killed or exiting tasks to exceed the hard limit. This can be misused to trigger a global OOM from inside a memcg-limited container. On the other hand, if memcg fails an allocation called from inside the #PF handler, it triggers a global OOM from inside pagefault_out_of_memory(). To prevent these problems this patchset: a) removes execution of out_of_memory() from pagefault_out_of_memory(), because nobody can explain why it is necessary. b) allows memcg to fail allocation of dying/killed tasks. This patch (of 3): Any allocation failure during the #PF path will return with VM_FAULT_OOM, which in turn results in pagefault_out_of_memory(), which in turn executes out_of_memory() and can kill a random task. An allocation might fail when the current task is the oom victim and there are no memory reserves left. The OOM killer is already handled at the page allocator level for the global OOM and at the charging level for the memcg one. Both have much more information about the scope of allocation/charge request. This means that either the OOM killer has been invoked properly and didn't lead to the allocation success or it has been skipped because it couldn't have been invoked. In both cases triggering it from here is pointless and even harmful.
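For illustration only, the order of checks in pagefault_out_of_memory() with this patch applied can be sketched in user-space C. The names demo_pf_state, demo_pf_oom_action and demo_pagefault_oom() are hypothetical; the three booleans merely model the results of mem_cgroup_oom_synchronize(), fatal_signal_pending(current) and mutex_trylock(&oom_lock).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the inputs; none of these are kernel interfaces. */
struct demo_pf_state {
	bool memcg_handled;	/* memcg OOM handling dealt with the failure */
	bool fatal_signal;	/* the faulting task has already been killed */
	bool oom_lock_free;	/* oom_lock could be taken without blocking */
};

enum demo_pf_oom_action {
	DEMO_MEMCG_HANDLED,	/* nothing more to do */
	DEMO_LET_TASK_DIE,	/* new early return added by this patch */
	DEMO_OOM_IN_PROGRESS,	/* someone else holds oom_lock */
	DEMO_GLOBAL_OOM,	/* fall through to the global OOM killer */
};

static enum demo_pf_oom_action
demo_pagefault_oom(const struct demo_pf_state *s)
{
	assert(s != NULL);

	if (s->memcg_handled)
		return DEMO_MEMCG_HANDLED;
	/* A dying task will free its memory soon; do not pick another victim. */
	if (s->fatal_signal)
		return DEMO_LET_TASK_DIE;
	if (!s->oom_lock_free)
		return DEMO_OOM_IN_PROGRESS;
	return DEMO_GLOBAL_OOM;
}
```

The point of the ordering is that the fatal-signal check sits before the global OOM killer is even considered, so a killed task never causes a second victim to be selected from this path.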
It makes much more sense to let the killed task die rather than to wake up an eternally hungry oom-killer and send him to choose a fatter victim for breakfast. Link: https://lkml.kernel.org/r/0828a149-786e-7c06-b70a-52d086818ea3@virtuozzo.com Signed-off-by: Vasily Averin Suggested-by: Michal Hocko Acked-by: Michal Hocko Cc: Johannes Weiner Cc: Mel Gorman Cc: Roman Gushchin Cc: Shakeel Butt Cc: Tetsuo Handa Cc: Uladzislau Rezki Cc: Vladimir Davydov Cc: Vlastimil Babka Cc: Signed-off-by: Andrew Morton --- mm/oom_kill.c | 3 +++ 1 file changed, 3 insertions(+) --- a/mm/oom_kill.c~mm-oom-pagefault_out_of_memory-dont-force-global-oom-for-dying-tasks +++ a/mm/oom_kill.c @@ -1137,6 +1137,9 @@ void pagefault_out_of_memory(void) if (mem_cgroup_oom_synchronize(true)) return; + if (fatal_signal_pending(current)) + return; + if (!mutex_trylock(&oom_lock)) return; out_of_memory(&oc); From patchwork Fri Nov 5 20:38:06 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605495 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1A6CDC433EF for ; Fri, 5 Nov 2021 20:38:09 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C886C6056B for ; Fri, 5 Nov 2021 20:38:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org C886C6056B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 684A894003E; Fri, 5 Nov 2021 16:38:08 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 60A8D94003D; Fri, 5 Nov 2021 16:38:08 -0400 (EDT) X-Delivered-To: 
int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4ABAC94003E; Fri, 5 Nov 2021 16:38:08 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0202.hostedemail.com [216.40.44.202]) by kanga.kvack.org (Postfix) with ESMTP id 3624994003D for ; Fri, 5 Nov 2021 16:38:08 -0400 (EDT) Received: from smtpin30.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 0206D77994 for ; Fri, 5 Nov 2021 20:38:08 +0000 (UTC) X-FDA: 78776038656.30.6A5E268 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf01.hostedemail.com (Postfix) with ESMTP id E360F508E4B5 for ; Fri, 5 Nov 2021 20:37:55 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 7DFD8611C4; Fri, 5 Nov 2021 20:38:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144686; bh=awhbnFa5DLKRc1G7pIbv55FH3BSDx0kVnCgWrcVGYNw=; h=Date:From:To:Subject:In-Reply-To:From; b=PeSpsNWxSd+mapn3dEaSiDXHTG1IlJCN92ONB5fNCqnpEk/nIxMaIdc4OVkYXyFL/ 0kjw5YzPjXhnr/a+SQg80qnKy6HX690Db4mhmDRI7+t7lQyCOThYJQP4uDFE5igT7T 1PBAlKq1g4XM16HcOMdFoGrnwnoxmKAozP42voZ0= Date: Fri, 05 Nov 2021 13:38:06 -0700 From: Andrew Morton To: akpm@linux-foundation.org, guro@fb.com, hannes@cmpxchg.org, linux-mm@kvack.org, mgorman@techsingularity.net, mhocko@suse.com, mm-commits@vger.kernel.org, penguin-kernel@i-love.sakura.ne.jp, shakeelb@google.com, stable@vger.kernel.org, torvalds@linux-foundation.org, urezki@gmail.com, vbabka@suse.cz, vdavydov.dev@gmail.com, vvs@virtuozzo.com Subject: [patch 066/262] mm, oom: do not trigger out_of_memory from the #PF Message-ID: <20211105203806.gt45tfz6b%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: E360F508E4B5 X-Stat-Signature: xj8nat56sfdjqwc3xbgjw1tmrobcwwgh Authentication-Results: 
imf01.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=PeSpsNWx; spf=pass (imf01.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144675-45997 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Michal Hocko Subject: mm, oom: do not trigger out_of_memory from the #PF Any allocation failure during the #PF path will return with VM_FAULT_OOM which in turn results in pagefault_out_of_memory. This can happen for 2 different reasons. a) Memcg is out of memory and we rely on mem_cgroup_oom_synchronize to perform the memcg OOM handling or b) normal allocation fails. The latter is quite problematic because allocation paths already trigger out_of_memory and the page allocator tries really hard to not fail allocations. Anyway, if the OOM killer has been already invoked there is no reason to invoke it again from the #PF path, especially when the OOM condition might be gone by that time and we have no way to find out other than to allocate. Moreover if the allocation failed and the OOM killer hasn't been invoked then we are unlikely to do the right thing from the #PF context because we have already lost the allocation context and restrictions and therefore might oom kill a task from a different NUMA domain. This all suggests that there is no legitimate reason to trigger out_of_memory from pagefault_out_of_memory so drop it. Just to be sure that no #PF path returns with VM_FAULT_OOM without allocation print a warning that this is happening before we restart the #PF. [VvS: #PF allocation can hit into limit of cgroup v1 kmem controller. This is a local problem related to memcg, however, it causes unnecessary global OOM kills that are repeated over and over again and escalate into a real disaster.
This has been broken since kmem accounting has been introduced for cgroup v1 (3.8). There was no kmem specific reclaim for the separate limit so the only way to handle kmem hard limit was to return with ENOMEM. In upstream the problem will be fixed by removing the outdated kmem limit, however stable and LTS kernels cannot do it and are still affected. This patch fixes the problem and should be backported into stable/LTS.] Link: https://lkml.kernel.org/r/f5fd8dd8-0ad4-c524-5f65-920b01972a42@virtuozzo.com Signed-off-by: Michal Hocko Signed-off-by: Vasily Averin Acked-by: Michal Hocko Cc: Johannes Weiner Cc: Mel Gorman Cc: Roman Gushchin Cc: Shakeel Butt Cc: Tetsuo Handa Cc: Uladzislau Rezki Cc: Vladimir Davydov Cc: Vlastimil Babka Cc: Signed-off-by: Andrew Morton --- mm/oom_kill.c | 22 ++++++++-------------- 1 file changed, 8 insertions(+), 14 deletions(-) --- a/mm/oom_kill.c~mm-oom-do-not-trigger-out_of_memory-from-the-pf +++ a/mm/oom_kill.c @@ -1120,19 +1120,15 @@ bool out_of_memory(struct oom_control *o } /* - * The pagefault handler calls here because it is out of memory, so kill a - * memory-hogging task. If oom_lock is held by somebody else, a parallel oom - * killing is already in progress so do nothing. + * The pagefault handler calls here because some allocation has failed. We have + * to take care of the memcg OOM here because this is the only safe context without + * any locks held but let the oom killer triggered from the allocation context care + * about the global OOM. 
*/ void pagefault_out_of_memory(void) { - struct oom_control oc = { - .zonelist = NULL, - .nodemask = NULL, - .memcg = NULL, - .gfp_mask = 0, - .order = 0, - }; + static DEFINE_RATELIMIT_STATE(pfoom_rs, DEFAULT_RATELIMIT_INTERVAL, + DEFAULT_RATELIMIT_BURST); if (mem_cgroup_oom_synchronize(true)) return; @@ -1140,10 +1136,8 @@ void pagefault_out_of_memory(void) if (fatal_signal_pending(current)) return; - if (!mutex_trylock(&oom_lock)) - return; - out_of_memory(&oc); - mutex_unlock(&oom_lock); + if (__ratelimit(&pfoom_rs)) + pr_warn("Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF\n"); } SYSCALL_DEFINE2(process_mrelease, int, pidfd, unsigned int, flags) From patchwork Fri Nov 5 20:38:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605497 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3F9F5C433FE for ; Fri, 5 Nov 2021 20:38:13 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id E8E366056B for ; Fri, 5 Nov 2021 20:38:12 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org E8E366056B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 857F794003F; Fri, 5 Nov 2021 16:38:12 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 7DF1A94003D; Fri, 5 Nov 2021 16:38:12 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 65DE594003F; Fri, 5 Nov 2021 16:38:12 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com 
(smtprelay0028.hostedemail.com [216.40.44.28]) by kanga.kvack.org (Postfix) with ESMTP id AF6C294003D for ; Fri, 5 Nov 2021 16:38:11 -0400 (EDT) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 6D56F8249980 for ; Fri, 5 Nov 2021 20:38:11 +0000 (UTC) X-FDA: 78776038782.13.14FA35A Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf29.hostedemail.com (Postfix) with ESMTP id E61429000256 for ; Fri, 5 Nov 2021 20:38:10 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id B872A611C0; Fri, 5 Nov 2021 20:38:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144690; bh=t5wHbGfIuJW7vsDOUJJHIHgqWF5JBs2z11pDnbCKrQ0=; h=Date:From:To:Subject:In-Reply-To:From; b=cRM87EKqt0tRqXyf//aIvYRgAh5V4iOMfoNhyvLJ/x2Yj5K+FIkMwjWrmVep9Mz8E IFJsG0ZXMr8Y0oFaNzmnRXgluFHio5rUTFBsjvAqd/1NRfJ4DHnvB9PP5FssdtLaLT mOeeUr2F6af/jFAVgDFDZO74JavCzckNTBCqnzNE= Date: Fri, 05 Nov 2021 13:38:09 -0700 From: Andrew Morton To: akpm@linux-foundation.org, guro@fb.com, hannes@cmpxchg.org, linux-mm@kvack.org, mgorman@techsingularity.net, mhocko@suse.com, mm-commits@vger.kernel.org, penguin-kernel@i-love.sakura.ne.jp, shakeelb@google.com, stable@vger.kernel.org, torvalds@linux-foundation.org, urezki@gmail.com, vbabka@suse.cz, vdavydov.dev@gmail.com, vvs@virtuozzo.com Subject: [patch 067/262] memcg: prohibit unconditional exceeding the limit of dying tasks Message-ID: <20211105203809.1Zku99VL8%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: E61429000256 X-Stat-Signature: 8rp7pycjw5pj5apz9z3i4fmhf8bozrnb Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=cRM87EKq; spf=pass (imf29.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as 
permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144690-193493 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Vasily Averin Subject: memcg: prohibit unconditional exceeding the limit of dying tasks Memory cgroup charging allows killed or exiting tasks to exceed the hard limit. It is assumed that the amount of the memory charged by those tasks is bound and most of the memory will get released while the task is exiting. This is resembling a heuristic for the global OOM situation when tasks get access to memory reserves. There is no global memory shortage at the memcg level so the memcg heuristic is more relieved. The above assumption is overly optimistic though. E.g. vmalloc can scale to really large requests and the heuristic would allow that. We used to have an early break in the vmalloc allocator for killed tasks but this has been reverted by commit b8c8a338f75e ("Revert "vmalloc: back off when the current task is killed""). There are likely other similar code paths which do not check for fatal signals in an allocation&charge loop. Also there are some kernel objects charged to a memcg which are not bound to a process life time. It has been observed that it is not really hard to trigger these bypasses and cause global OOM situation. One potential way to address these runaways would be to limit the amount of excess (similar to the global OOM with limited oom reserves). This is certainly possible but it is not really clear how much of an excess is desirable and still protects from global OOMs as that would have to consider the overall memcg configuration. This patch is addressing the problem by removing the heuristic altogether. Bypass is only allowed for requests which either cannot fail or where the failure is not desirable while excess should be still limited (e.g. atomic requests). 
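A simplified user-space sketch of the resulting policy follows, assuming a charge that would exceed the limit: a dying task gets one pass through the OOM stage and then fails with -ENOMEM instead of being force-charged, while a request that cannot fail is still charged over the limit. The struct demo_req, DEMO_MAX_OOM_ROUNDS and demo_try_charge() are hypothetical; the round cap stands in for the memcg OOM killer no longer making progress, and reclaim is reduced to a counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_OOM_ROUNDS 16	/* hypothetical: OOM killer stops helping */

/* Hypothetical request model; not kernel state. */
struct demo_req {
	bool nofail;	/* stands in for __GFP_NOFAIL */
	bool dying;	/* stands in for task_is_dying() */
	int fits_after;	/* OOM rounds until the charge fits, -1 for never */
};

/*
 * Returns 0 when the charge goes through (within the limit or forced),
 * -ENOMEM when the caller has to back off. *oom_rounds reports how many
 * times the OOM stage was entered.
 */
static int demo_try_charge(const struct demo_req *req, int *oom_rounds)
{
	bool passed_oom = false;
	int rounds = 0;

	assert(req != NULL && oom_rounds != NULL);

	for (;;) {
		if (req->fits_after >= 0 && rounds >= req->fits_after) {
			*oom_rounds = rounds;
			return 0;	/* charge fits under the limit */
		}
		/* Avoid an endless loop for tasks bypassed by the OOM killer. */
		if (passed_oom && req->dying)
			break;
		if (rounds >= DEMO_MAX_OOM_ROUNDS)
			break;		/* no further progress possible */
		rounds++;		/* one pass through the OOM stage */
		passed_oom = true;
	}

	*oom_rounds = rounds;
	return req->nofail ? 0 : -ENOMEM;	/* only nofail is forced */
}
```

Note that the dying task is not failed immediately: it still gets one round in which reclaim and the OOM stage may bring usage back under the limit.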
Implementation-wise, a killed or dying task fails to charge if it has passed the OOM killer stage. That should give all forms of reclaim a chance to restore the limit before the failure (ENOMEM) and tell the caller to back off. In addition, this patch renames the should_force_charge() helper to task_is_dying() because its use is no longer associated with forced charging. This patch depends on pagefault_out_of_memory() not triggering out_of_memory(), because otherwise a memcg failure can unwind to VM_FAULT_OOM and invoke the global OOM killer. Link: https://lkml.kernel.org/r/8f5cebbb-06da-4902-91f0-6566fc4b4203@virtuozzo.com Signed-off-by: Vasily Averin Suggested-by: Michal Hocko Acked-by: Michal Hocko Cc: Johannes Weiner Cc: Vladimir Davydov Cc: Roman Gushchin Cc: Uladzislau Rezki Cc: Vlastimil Babka Cc: Shakeel Butt Cc: Mel Gorman Cc: Tetsuo Handa Cc: Signed-off-by: Andrew Morton --- mm/memcontrol.c | 27 ++++++++------------------- 1 file changed, 8 insertions(+), 19 deletions(-) --- a/mm/memcontrol.c~memcg-prohibit-unconditional-exceeding-the-limit-of-dying-tasks +++ a/mm/memcontrol.c @@ -234,7 +234,7 @@ enum res_type { iter != NULL; \ iter = mem_cgroup_iter(NULL, iter, NULL)) -static inline bool should_force_charge(void) +static inline bool task_is_dying(void) { return tsk_is_oom_victim(current) || fatal_signal_pending(current) || (current->flags & PF_EXITING); @@ -1624,7 +1624,7 @@ static bool mem_cgroup_out_of_memory(str * A few threads which were not waiting at mutex_lock_killable() can * fail to bail out. Therefore, check again after holding oom_lock.
*/ - ret = should_force_charge() || out_of_memory(&oc); + ret = task_is_dying() || out_of_memory(&oc); unlock: mutex_unlock(&oom_lock); @@ -2579,6 +2579,7 @@ static int try_charge_memcg(struct mem_c struct page_counter *counter; enum oom_status oom_status; unsigned long nr_reclaimed; + bool passed_oom = false; bool may_swap = true; bool drained = false; unsigned long pflags; @@ -2614,15 +2615,6 @@ retry: goto force; /* - * Unlike in global OOM situations, memcg is not in a physical - * memory shortage. Allow dying and OOM-killed tasks to - * bypass the last charges so that they can exit quickly and - * free their memory. - */ - if (unlikely(should_force_charge())) - goto force; - - /* * Prevent unbounded recursion when reclaim operations need to * allocate memory. This might exceed the limits temporarily, * but we prefer facilitating memory reclaim and getting back @@ -2679,8 +2671,9 @@ retry: if (gfp_mask & __GFP_RETRY_MAYFAIL) goto nomem; - if (fatal_signal_pending(current)) - goto force; + /* Avoid endless loop for tasks bypassed by the oom killer */ + if (passed_oom && task_is_dying()) + goto nomem; /* * keep retrying as long as the memcg oom killer is able to make @@ -2689,14 +2682,10 @@ retry: */ oom_status = mem_cgroup_oom(mem_over_limit, gfp_mask, get_order(nr_pages * PAGE_SIZE)); - switch (oom_status) { - case OOM_SUCCESS: + if (oom_status == OOM_SUCCESS) { + passed_oom = true; nr_retries = MAX_RECLAIM_RETRIES; goto retry; - case OOM_FAILED: - goto force; - default: - goto nomem; } nomem: if (!(gfp_mask & __GFP_NOFAIL)) From patchwork Fri Nov 5 20:38:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605499 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8C1F6C433F5 for ; 
Fri, 5 Nov 2021 20:38:15 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 41E3660174 for ; Fri, 5 Nov 2021 20:38:15 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 41E3660174 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id CEEB7940040; Fri, 5 Nov 2021 16:38:14 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id C744194003D; Fri, 5 Nov 2021 16:38:14 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B627A940040; Fri, 5 Nov 2021 16:38:14 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0224.hostedemail.com [216.40.44.224]) by kanga.kvack.org (Postfix) with ESMTP id 9F8C394003D for ; Fri, 5 Nov 2021 16:38:14 -0400 (EDT) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 710F37798F for ; Fri, 5 Nov 2021 20:38:14 +0000 (UTC) X-FDA: 78776038908.21.1C9D422 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf31.hostedemail.com (Postfix) with ESMTP id 707AC104AAC7 for ; Fri, 5 Nov 2021 20:38:05 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id F238D611C4; Fri, 5 Nov 2021 20:38:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144693; bh=FgwI/W7sBWzuvMF+2tMW3dFamQwob5KnsnD+Bqc9pgg=; h=Date:From:To:Subject:In-Reply-To:From; b=18AlemLY2u6wkSB3c/k0j5tRcNYC9RolUQxHIWAZ/nwb7AEgLzNEYk+2mlrN0Y0eO Rkz3hajjhriKXuOVm9BYkHvS2IZlhnLRjRo5DQiIb1NmllKThu5PgxnOQ5IUyDPJmY AkE9R+FYPCgVb9QAvsJVWWL249YLj/0vV3u19utQ= Date: Fri, 05 Nov 2021 13:38:12 -0700 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, 
liupeng256@huawei.com, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 068/262] mm/mmap.c: fix a data race of mm->total_vm Message-ID: <20211105203812.6Y1n-n47F%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf31.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=18AlemLY; dmarc=none; spf=pass (imf31.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 707AC104AAC7 X-Stat-Signature: owrmkm9emunahxhnqsg56ei45c7tcsw8 X-HE-Tag: 1636144685-33648 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Peng Liu Subject: mm/mmap.c: fix a data race of mm->total_vm Variable mm->total_vm could be accessed concurrently during mmaping and system accounting as noticed by KCSAN, BUG: KCSAN: data-race in __acct_update_integrals / mmap_region read-write to 0xffffa40267bd14c8 of 8 bytes by task 15609 on cpu 3: mmap_region+0x6dc/0x1400 do_mmap+0x794/0xca0 vm_mmap_pgoff+0xdf/0x150 ksys_mmap_pgoff+0xe1/0x380 do_syscall_64+0x37/0x50 entry_SYSCALL_64_after_hwframe+0x44/0xa9 read to 0xffffa40267bd14c8 of 8 bytes by interrupt on cpu 2: __acct_update_integrals+0x187/0x1d0 acct_account_cputime+0x3c/0x40 update_process_times+0x5c/0x150 tick_sched_timer+0x184/0x210 __run_hrtimer+0x119/0x3b0 hrtimer_interrupt+0x350/0xaa0 __sysvec_apic_timer_interrupt+0x7b/0x220 asm_call_irq_on_stack+0x12/0x20 sysvec_apic_timer_interrupt+0x4d/0x80 asm_sysvec_apic_timer_interrupt+0x12/0x20 smp_call_function_single+0x192/0x2b0 perf_install_in_context+0x29b/0x4a0 __se_sys_perf_event_open+0x1a98/0x2550 __x64_sys_perf_event_open+0x63/0x70 do_syscall_64+0x37/0x50 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reported by 
Kernel Concurrency Sanitizer on: CPU: 2 PID: 15610 Comm: syz-executor.3 Not tainted 5.10.0+ #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 vm_stat_account(), which is called by mmap_region(), increases total_vm while __acct_update_integrals() may read total_vm at the same time. This is a data race, which leads to undefined behaviour. To avoid a potentially bad read or write, both accesses are marked with READ_ONCE() and WRITE_ONCE(). Link: https://lkml.kernel.org/r/20210913105550.1569419-1-liupeng256@huawei.com Signed-off-by: Peng Liu Signed-off-by: Andrew Morton --- kernel/tsacct.c | 2 +- mm/mmap.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) --- a/kernel/tsacct.c~mm-mmapc-fix-a-data-race-of-mm-total_vm +++ a/kernel/tsacct.c @@ -137,7 +137,7 @@ static void __acct_update_integrals(stru * the rest of the math is done in xacct_add_tsk. */ tsk->acct_rss_mem1 += delta * get_mm_rss(tsk->mm) >> 10; - tsk->acct_vm_mem1 += delta * tsk->mm->total_vm >> 10; + tsk->acct_vm_mem1 += delta * READ_ONCE(tsk->mm->total_vm) >> 10; } /** --- a/mm/mmap.c~mm-mmapc-fix-a-data-race-of-mm-total_vm +++ a/mm/mmap.c @@ -3332,7 +3332,7 @@ bool may_expand_vm(struct mm_struct *mm, void vm_stat_account(struct mm_struct *mm, vm_flags_t flags, long npages) { - mm->total_vm += npages; + WRITE_ONCE(mm->total_vm, READ_ONCE(mm->total_vm)+npages); if (is_exec_mapping(flags)) mm->exec_vm += npages; From patchwork Fri Nov 5 20:38:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605501 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 439D8C433EF for ; Fri, 5 Nov 2021 20:38:18 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by
mail.kernel.org (Postfix) with ESMTP id 0D09160174 for ; Fri, 5 Nov 2021 20:38:18 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 0D09160174 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 9415D940041; Fri, 5 Nov 2021 16:38:17 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 8F13594003D; Fri, 5 Nov 2021 16:38:17 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7E0B7940041; Fri, 5 Nov 2021 16:38:17 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0249.hostedemail.com [216.40.44.249]) by kanga.kvack.org (Postfix) with ESMTP id 6089194003D for ; Fri, 5 Nov 2021 16:38:17 -0400 (EDT) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 1DF961801E236 for ; Fri, 5 Nov 2021 20:38:17 +0000 (UTC) X-FDA: 78776039034.07.9FA14A8 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf22.hostedemail.com (Postfix) with ESMTP id BC781190C for ; Fri, 5 Nov 2021 20:38:16 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id CC7446056B; Fri, 5 Nov 2021 20:38:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144696; bh=7aqVEQ2BNP1wlsrk4fse+g/i3lItD/S7wyUeyrYH/SY=; h=Date:From:To:Subject:In-Reply-To:From; b=KiO5BLES9XX+fQuWzlTDf16CBIte6Ybo1cu+8Vul0l4Q+OQERWsWMZLEBZvpiVlvl pwQL1iSjFQsvdMpCuNHlDg42RmWwNZE0DwzwQLtZpA4KRbS3nlTXt4o+16ZSkf5+iw EFq4sJ865rmo5k5TPiDF7szyZGpbPp34DYTR2GLk= Date: Fri, 05 Nov 2021 13:38:15 -0700 From: Andrew Morton To: akpm@linux-foundation.org, eb@emlix.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 069/262] mm: use 
__pfn_to_section() instead of open coding it Message-ID: <20211105203815.dJG5uSPY1%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: BC781190C X-Stat-Signature: k3bnky7ys94cyfrgpaj3r118u16bse6b Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=KiO5BLES; spf=pass (imf22.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144696-446767 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Rolf Eike Beer Subject: mm: use __pfn_to_section() instead of open coding it It is defined in the same file just a few lines above. Link: https://lkml.kernel.org/r/4598487.Rc0NezkW7i@mobilepool36.emlix.com Signed-off-by: Rolf Eike Beer Reviewed-by: Andrew Morton Signed-off-by: Andrew Morton --- include/linux/mmzone.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/include/linux/mmzone.h~mm-use-__pfn_to_section-instead-of-open-coding-it +++ a/include/linux/mmzone.h @@ -1481,7 +1481,7 @@ static inline int pfn_valid(unsigned lon if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS) return 0; - ms = __nr_to_section(pfn_to_section_nr(pfn)); + ms = __pfn_to_section(pfn); if (!valid_section(ms)) return 0; /* @@ -1496,7 +1496,7 @@ static inline int pfn_in_present_section { if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS) return 0; - return present_section(__nr_to_section(pfn_to_section_nr(pfn))); + return present_section(__pfn_to_section(pfn)); } static inline unsigned long next_present_section_nr(unsigned long section_nr) From patchwork Fri Nov 5 20:38:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew 
Morton X-Patchwork-Id: 12605503 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2E061C433F5 for ; Fri, 5 Nov 2021 20:38:21 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D562F6056B for ; Fri, 5 Nov 2021 20:38:20 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org D562F6056B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 6F957940042; Fri, 5 Nov 2021 16:38:20 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 65B3394003D; Fri, 5 Nov 2021 16:38:20 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4ACF2940042; Fri, 5 Nov 2021 16:38:20 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0210.hostedemail.com [216.40.44.210]) by kanga.kvack.org (Postfix) with ESMTP id 3495494003D for ; Fri, 5 Nov 2021 16:38:20 -0400 (EDT) Received: from smtpin17.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id EF28B1802EFDD for ; Fri, 5 Nov 2021 20:38:19 +0000 (UTC) X-FDA: 78776039160.17.854C555 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf16.hostedemail.com (Postfix) with ESMTP id 68CA5F000097 for ; Fri, 5 Nov 2021 20:38:11 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id B2179611C0; Fri, 5 Nov 2021 20:38:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144698; bh=ATPv8/v4Nqm+039u6nxvoHetifbhpBxKrXQN0Sbsr2o=; h=Date:From:To:Subject:In-Reply-To:From; 
b=lIY8ZrQVk5rqK+ZedOqf+z2Qqp16S4s4W1wkwqumB89hebWjVxiOuntTWMNqZn5JJ 3K2LcxA0cSefESaJlajGzp0naUwMjOmAZN8TbvxuLU4x/6rg57AolreHVs4AlNV7vc VDMs14ulayNxgSN/O/zBaZWR1vUcDbMWkXnP85Ic= Date: Fri, 05 Nov 2021 13:38:18 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit.kachhap@arm.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, Vincenzo.Frascino@arm.com Subject: [patch 070/262] mm/memory.c: avoid unnecessary kernel/user pointer conversion Message-ID: <20211105203818.RLrQbH0Iv%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf16.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=lIY8ZrQV; dmarc=none; spf=pass (imf16.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 68CA5F000097 X-Stat-Signature: 88mr9pa6snhbqxcpskf3nifz4ho94ckd X-HE-Tag: 1636144691-890923 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Amit Daniel Kachhap Subject: mm/memory.c: avoid unnecessary kernel/user pointer conversion Annotating a pointer from __user to kernel and then back again might confuse sparse. In copy_huge_page_from_user() it can be avoided by removing the intermediate variable since it is never used. Link: https://lkml.kernel.org/r/20210914150820.19326-1-amit.kachhap@arm.com Signed-off-by: Amit Daniel Kachhap Acked-by: Kirill A. 
Shutemov Cc: Vincenzo Frascino Signed-off-by: Andrew Morton --- mm/memory.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) --- a/mm/memory.c~mm-memory-avoid-unnecessary-kernel-user-pointer-conversion +++ a/mm/memory.c @@ -5421,7 +5421,6 @@ long copy_huge_page_from_user(struct pag unsigned int pages_per_huge_page, bool allow_pagefault) { - void *src = (void *)usr_src; void *page_kaddr; unsigned long i, rc = 0; unsigned long ret_val = pages_per_huge_page * PAGE_SIZE; @@ -5434,8 +5433,7 @@ long copy_huge_page_from_user(struct pag else page_kaddr = kmap_atomic(subpage); rc = copy_from_user(page_kaddr, - (const void __user *)(src + i * PAGE_SIZE), - PAGE_SIZE); + usr_src + i * PAGE_SIZE, PAGE_SIZE); if (allow_pagefault) kunmap(subpage); else From patchwork Fri Nov 5 20:38:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605505 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9E366C433F5 for ; Fri, 5 Nov 2021 20:38:24 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 54DE56056B for ; Fri, 5 Nov 2021 20:38:24 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 54DE56056B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id DFD44940043; Fri, 5 Nov 2021 16:38:23 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D84F394003D; Fri, 5 Nov 2021 16:38:23 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C4FA3940043; Fri, 5 Nov 2021 16:38:23 -0400 (EDT) 
X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0124.hostedemail.com [216.40.44.124]) by kanga.kvack.org (Postfix) with ESMTP id AD04094003D for ; Fri, 5 Nov 2021 16:38:23 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 6375F18037D3A for ; Fri, 5 Nov 2021 20:38:23 +0000 (UTC) X-FDA: 78776039286.22.A5B8F2B Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf28.hostedemail.com (Postfix) with ESMTP id EC6CD90000B4 for ; Fri, 5 Nov 2021 20:38:22 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id C457C60174; Fri, 5 Nov 2021 20:38:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144702; bh=/5P1kFsDt2rnxtQraJN9QoS2RlQR3nyp2CXKI0+oaHw=; h=Date:From:To:Subject:In-Reply-To:From; b=KeV+hYAGxCpYKI9ECVB5KlYlEoNOivfvMPUE1KHHntJs8ZCyAClWepw7phznmJEYE 9c/vnj6gymOID0vKwUcEd66y0PIwneU4Fq7bYprnEyGYFgSRLIfkWqslNPS227eGYj 43N+dWR3JO/jYVdnhRqjh8nSAEDIU3PLWjV38WEY= Date: Fri, 05 Nov 2021 13:38:21 -0700 From: Andrew Morton To: aarcange@redhat.com, akpm@linux-foundation.org, andrew.cooper3@citrix.com, dave.hansen@linux.intel.com, linux-mm@kvack.org, luto@kernel.org, mm-commits@vger.kernel.org, namit@vmware.com, npiggin@gmail.com, peterz@infradead.org, tglx@linutronix.de, torvalds@linux-foundation.org, will@kernel.org, yuzhao@google.com Subject: [patch 071/262] mm/memory.c: use correct VMA flags when freeing page-tables Message-ID: <20211105203821.mV5HWKCGr%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=KeV+hYAG; dmarc=none; spf=pass (imf28.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org 
X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: EC6CD90000B4 X-Stat-Signature: dhbmiy7iyp6xpqi1uhw78tx9f1hs771w X-HE-Tag: 1636144702-512037 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Nadav Amit Subject: mm/memory.c: use correct VMA flags when freeing page-tables Consistent use of the mmu_gather interface requires a call to tlb_start_vma() and tlb_end_vma() for each VMA. free_pgtables() does not follow this pattern. Certain architectures need tlb_start_vma() to be called in order for tlb_update_vma_flags() to update the VMA flags (tlb->vma_exec and tlb->vma_huge), which are later used to issue the proper TLB flush. Since tlb_start_vma() is not called, the wrong VMA flags can be used when the flush is performed. Specifically, the munmap syscall would call unmap_region(), which unmaps the VMAs and then frees the page-tables. A flush is needed after the page-tables are removed to prevent page-walk caches from holding stale entries, but this flush would use the flags of the last VMA that was flushed, which does not appear to be right. Use tlb_start_vma() and tlb_end_vma() to prevent this from happening. This might lead to unnecessary calls to flush_cache_range() on certain architectures. If needed, a new flag can be added to mmu_gather to indicate that the flush is not needed.
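The stale-flag problem described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: struct vma_model, struct gather_model, model_start_vma() and model_flush_sees_exec() are hypothetical names invented for the illustration, and the model only mimics the idea that tlb_start_vma() latches the flags of the VMA being processed into tlb->vma_exec and tlb->vma_huge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel structures. */
struct vma_model {
	bool exec;	/* models an executable mapping */
	bool huge;	/* models a hugetlb mapping */
};

struct gather_model {
	bool vma_exec;	/* models tlb->vma_exec */
	bool vma_huge;	/* models tlb->vma_huge */
};

/* Models tlb_start_vma(): latch the flags of the VMA about to be processed. */
static void model_start_vma(struct gather_model *tlb,
			    const struct vma_model *vma)
{
	assert(tlb != NULL && vma != NULL);
	tlb->vma_exec = vma->exec;
	tlb->vma_huge = vma->huge;
}

/*
 * Returns the exec flag a later flush would see after the page tables of
 * `freed` are torn down. `last_unmapped` is the VMA whose flags were latched
 * during the earlier unmap phase; `bracketed` selects whether the free phase
 * calls model_start_vma() itself, as the patch does.
 */
static bool model_flush_sees_exec(const struct vma_model *last_unmapped,
				  const struct vma_model *freed,
				  bool bracketed)
{
	struct gather_model tlb = { false, false };

	model_start_vma(&tlb, last_unmapped);	/* unmap phase */
	if (bracketed)
		model_start_vma(&tlb, freed);	/* free phase, with the fix */
	return tlb.vma_exec;
}
```

In this model, if a non-executable VMA was the last one flushed and an executable VMA is the one whose page tables are being freed, the unbracketed variant reports the non-executable flags of the earlier VMA, while the bracketed variant reports the flags of the VMA actually being freed.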
Link: https://lkml.kernel.org/r/20211021122322.592822-1-namit@vmware.com Signed-off-by: Nadav Amit Cc: Andrea Arcangeli Cc: Andrew Cooper Cc: Andy Lutomirski Cc: Dave Hansen Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Will Deacon Cc: Yu Zhao Cc: Nick Piggin Signed-off-by: Andrew Morton --- mm/memory.c | 4 ++++ 1 file changed, 4 insertions(+) --- a/mm/memory.c~mm-use-correct-vma-flags-when-freeing-page-tables +++ a/mm/memory.c @@ -412,6 +412,8 @@ void free_pgtables(struct mmu_gather *tl unlink_anon_vmas(vma); unlink_file_vma(vma); + tlb_start_vma(tlb, vma); + if (is_vm_hugetlb_page(vma)) { hugetlb_free_pgd_range(tlb, addr, vma->vm_end, floor, next ? next->vm_start : ceiling); @@ -429,6 +431,8 @@ void free_pgtables(struct mmu_gather *tl free_pgd_range(tlb, addr, vma->vm_end, floor, next ? next->vm_start : ceiling); } + + tlb_end_vma(tlb, vma); vma = next; } } From patchwork Fri Nov 5 20:38:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605507 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E3424C433EF for ; Fri, 5 Nov 2021 20:38:27 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 94F38611C0 for ; Fri, 5 Nov 2021 20:38:27 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 94F38611C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 28CD5940044; Fri, 5 Nov 2021 16:38:27 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 2135A94003D; Fri, 5 Nov 2021 16:38:27 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org 
Received: by kanga.kvack.org (Postfix, from userid 63042) id 08CA6940044; Fri, 5 Nov 2021 16:38:27 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0023.hostedemail.com [216.40.44.23]) by kanga.kvack.org (Postfix) with ESMTP id E8AA194003D for ; Fri, 5 Nov 2021 16:38:26 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id A8D3882499A8 for ; Fri, 5 Nov 2021 20:38:26 +0000 (UTC) X-FDA: 78776039412.29.06EB4AB Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf01.hostedemail.com (Postfix) with ESMTP id 7D5D2508E4BC for ; Fri, 5 Nov 2021 20:38:14 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 1D11E611C4; Fri, 5 Nov 2021 20:38:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144705; bh=MLR4U50RT2yuKzbSmLF7KMClNgDVmIHCRhN6LqHDA00=; h=Date:From:To:Subject:In-Reply-To:From; b=oGr7Qywh7TabM8X8doTvLkWtMHIxFonRL7FdwNdL9oSp0OOIX+QAIMKmUd+O78gH0 /FD2Nm9SUNHOyPw95R8I/8EoiSuGk4RuYz0MJROiXbZAnxXblTz75RzVejjuAdVdnR 5d0O8YcWfc7LqLL6DqCLLNOL1rH1qwyapmHdrOsE= Date: Fri, 05 Nov 2021 13:38:24 -0700 From: Andrew Morton To: aarcange@redhat.com, akpm@linux-foundation.org, apopple@nvidia.com, axelrasmussen@google.com, david@redhat.com, hughd@google.com, jglisse@redhat.com, kirill@shutemov.name, liam.howlett@oracle.com, linmiaohe@huawei.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, rppt@linux.vnet.ibm.com, shy828301@gmail.com, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 072/262] mm/shmem: unconditionally set pte dirty in mfill_atomic_install_pte Message-ID: <20211105203824.F5P-pkCc0%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 7D5D2508E4BC X-Stat-Signature: 
i9pf5zqi4tb48abtmzffpuzjefh4w5kd Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=oGr7Qywh; spf=pass (imf01.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144694-837865 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Peter Xu Subject: mm/shmem: unconditionally set pte dirty in mfill_atomic_install_pte Patch series "mm: A few cleanup patches around zap, shmem and uffd", v4. IMHO all of them are nice cleanups to the existing code in their own right; they are all small and self-contained. They will be needed by the upcoming uffd-wp series. This patch (of 4): Previously the pte was marked dirty only conditionally, because there is one shmem special case where we use SetPageDirty() instead. However, that is not necessary, and it should be easier and cleaner to do it unconditionally in mfill_atomic_install_pte(). The most recent discussion about this is here, where Hugh explained the history of SetPageDirty() and why it's possible that it's not required at all: https://lore.kernel.org/lkml/alpine.LSU.2.11.2104121657050.1097@eggly.anvils/ Currently mfill_atomic_install_pte() has three callers: 1. shmem_mfill_atomic_pte 2. mcopy_atomic_pte 3. mcontinue_atomic_pte After the change: case (1) should have its SetPageDirty replaced by the dirty bit on pte (so we unify them together, finally), case (2) should have no functional change at all as it has page_in_cache==false, case (3) may add a dirty bit to the pte. However, since case (3) is UFFDIO_CONTINUE for shmem, it's almost 100% sure the page is dirty after all, because UFFDIO_CONTINUE normally requires another process to modify the page cache and kick the faulted thread, so this should not make a real difference either.
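The effect of making the dirty bit unconditional can be checked with a small userspace model. This is a sketch under stated assumptions, not kernel code: dirty_before() transcribes only the condition visible in the lines this patch removes from mfill_atomic_install_pte(), dirty_after() models the unconditional pte_mkdirty(), and the initial value of writable, which is computed outside the hunk, is treated here as a plain input. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Dirty decision before the patch, transcribed from the removed lines. */
static bool dirty_before(bool page_in_cache, bool vm_shared, bool writable)
{
	if (page_in_cache && !vm_shared)
		writable = false;
	return writable || !page_in_cache;
}

/* Dirty decision after the patch: the pte is always marked dirty. */
static bool dirty_after(bool page_in_cache, bool vm_shared, bool writable)
{
	(void)page_in_cache;
	(void)vm_shared;
	(void)writable;
	return true;
}

/* Counts the input combinations for which the patch changes the outcome. */
static int count_changed_cases(void)
{
	int changed = 0;

	for (int bits = 0; bits < 8; bits++) {
		bool in_cache = bits & 4;
		bool shared = bits & 2;
		bool writable = bits & 1;

		/* After the patch every combination ends up dirty. */
		assert(dirty_after(in_cache, shared, writable));

		if (dirty_before(in_cache, shared, writable) !=
		    dirty_after(in_cache, shared, writable))
			changed++;
	}
	return changed;
}
```

In this model the outcome only changes when page_in_cache is true, which is consistent with case (2) seeing no functional change because it has page_in_cache==false.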
This should make it much easier to follow which cases set the dirty bit for uffd, as we now simply set it for all uffd-related ioctls. Meanwhile, SetPageDirty() gets no special handling where there is no need for it. Link: https://lkml.kernel.org/r/20210915181456.10739-1-peterx@redhat.com Link: https://lkml.kernel.org/r/20210915181456.10739-2-peterx@redhat.com Signed-off-by: Peter Xu Reviewed-by: Axel Rasmussen Cc: Hugh Dickins Cc: Andrea Arcangeli Cc: Liam Howlett Cc: Mike Rapoport Cc: Yang Shi Cc: David Hildenbrand Cc: "Kirill A . Shutemov" Cc: Jerome Glisse Cc: Alistair Popple Cc: Miaohe Lin Cc: Matthew Wilcox Signed-off-by: Andrew Morton --- mm/shmem.c | 1 - mm/userfaultfd.c | 3 +-- 2 files changed, 1 insertion(+), 3 deletions(-) --- a/mm/shmem.c~mm-shmem-unconditionally-set-pte-dirty-in-mfill_atomic_install_pte +++ a/mm/shmem.c @@ -2423,7 +2423,6 @@ int shmem_mfill_atomic_pte(struct mm_str shmem_recalc_inode(inode); spin_unlock_irq(&info->lock); - SetPageDirty(page); unlock_page(page); return 0; out_delete_from_cache: --- a/mm/userfaultfd.c~mm-shmem-unconditionally-set-pte-dirty-in-mfill_atomic_install_pte +++ a/mm/userfaultfd.c @@ -69,10 +69,9 @@ int mfill_atomic_install_pte(struct mm_s pgoff_t offset, max_off; _dst_pte = mk_pte(page, dst_vma->vm_page_prot); + _dst_pte = pte_mkdirty(_dst_pte); if (page_in_cache && !vm_shared) writable = false; - if (writable || !page_in_cache) - _dst_pte = pte_mkdirty(_dst_pte); if (writable) { if (wp_copy) _dst_pte = pte_mkuffd_wp(_dst_pte); From patchwork Fri Nov 5 20:38:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605509 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 23BCBC433F5 for ; Fri, 5 Nov 2021 20:38:31 +0000 (UTC) Received: from
kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id CCCF3611C0 for ; Fri, 5 Nov 2021 20:38:30 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org CCCF3611C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 6F86F940045; Fri, 5 Nov 2021 16:38:30 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 6A85C94003D; Fri, 5 Nov 2021 16:38:30 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4D51D940045; Fri, 5 Nov 2021 16:38:30 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0029.hostedemail.com [216.40.44.29]) by kanga.kvack.org (Postfix) with ESMTP id 3A2AF94003D for ; Fri, 5 Nov 2021 16:38:30 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 0AF7582499A8 for ; Fri, 5 Nov 2021 20:38:30 +0000 (UTC) X-FDA: 78776039580.22.2FEB889 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf29.hostedemail.com (Postfix) with ESMTP id AA4229000254 for ; Fri, 5 Nov 2021 20:38:29 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 7716C6056B; Fri, 5 Nov 2021 20:38:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144709; bh=Wf0tVOb4tt0xQGd9r0b57aM920yhQT4PyHh3AT5AcIE=; h=Date:From:To:Subject:In-Reply-To:From; b=qQLBbBomFeH2QKTzQN4EXHNpWmrQGjp8NPzu0QmZnGockSaFhdq8BaikfYLqhOC+r RqB0w8DYYZSK5Q24XwHTnSEA0Hyy3OnW+UFPZru96txdPCLcklF6N4gyWlSN84drIz XaKa3hR6jgghli2gK0HqfEUf/cpJX5DDWlSpwRk4= Date: Fri, 05 Nov 2021 13:38:28 -0700 From: Andrew Morton To: aarcange@redhat.com, akpm@linux-foundation.org, apopple@nvidia.com, axelrasmussen@google.com, david@redhat.com, 
hughd@google.com, jglisse@redhat.com, kirill@shutemov.name, liam.howlett@oracle.com, linmiaohe@huawei.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, rppt@linux.vnet.ibm.com, shy828301@gmail.com, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 073/262] mm: clear vmf->pte after pte_unmap_same() returns Message-ID: <20211105203828.bHFJC7Quc%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: AA4229000254 X-Stat-Signature: hd3pas9jees7y189kyy7hqcnkakz19zm Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=qQLBbBom; dmarc=none; spf=pass (imf29.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144709-514824 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Peter Xu Subject: mm: clear vmf->pte after pte_unmap_same() returns pte_unmap_same() will always unmap the pte pointer. After the unmap, vmf->pte is no longer valid, so we should clear it. This was safe only because no one accesses vmf->pte after pte_unmap_same() returns, since the only caller of pte_unmap_same() (so far) is do_swap_page(), where vmf->pte will in most cases be overwritten very soon. Pass vmf directly into pte_unmap_same(), which also avoids the long parameter list and should be a nice cleanup. Link: https://lkml.kernel.org/r/20210915181533.11188-1-peterx@redhat.com Signed-off-by: Peter Xu Reviewed-by: Miaohe Lin Reviewed-by: David Hildenbrand Reviewed-by: Liam Howlett Acked-by: Hugh Dickins Cc: Alistair Popple Cc: Andrea Arcangeli Cc: Axel Rasmussen Cc: Jerome Glisse Cc: "Kirill A .
Shutemov" Cc: Matthew Wilcox Cc: Mike Rapoport Cc: Yang Shi Signed-off-by: Andrew Morton --- mm/memory.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) --- a/mm/memory.c~mm-clear-vmf-pte-after-pte_unmap_same-returns +++ a/mm/memory.c @@ -2728,19 +2728,19 @@ EXPORT_SYMBOL_GPL(apply_to_existing_page * proceeding (but do_wp_page is only called after already making such a check; * and do_anonymous_page can safely check later on). */ -static inline int pte_unmap_same(struct mm_struct *mm, pmd_t *pmd, - pte_t *page_table, pte_t orig_pte) +static inline int pte_unmap_same(struct vm_fault *vmf) { int same = 1; #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPTION) if (sizeof(pte_t) > sizeof(unsigned long)) { - spinlock_t *ptl = pte_lockptr(mm, pmd); + spinlock_t *ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd); spin_lock(ptl); - same = pte_same(*page_table, orig_pte); + same = pte_same(*vmf->pte, vmf->orig_pte); spin_unlock(ptl); } #endif - pte_unmap(page_table); + pte_unmap(vmf->pte); + vmf->pte = NULL; return same; } @@ -3492,7 +3492,7 @@ vm_fault_t do_swap_page(struct vm_fault vm_fault_t ret = 0; void *shadow = NULL; - if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte)) + if (!pte_unmap_same(vmf)) goto out; entry = pte_to_swp_entry(vmf->orig_pte); From patchwork Fri Nov 5 20:38:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605511 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 99691C433EF for ; Fri, 5 Nov 2021 20:38:34 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4FCB7611C0 for ; Fri, 5 Nov 2021 20:38:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 4FCB7611C0 
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id E98F7940046; Fri, 5 Nov 2021 16:38:33 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E484D94003D; Fri, 5 Nov 2021 16:38:33 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D1233940046; Fri, 5 Nov 2021 16:38:33 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0007.hostedemail.com [216.40.44.7]) by kanga.kvack.org (Postfix) with ESMTP id BC1E194003D for ; Fri, 5 Nov 2021 16:38:33 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 84725779A7 for ; Fri, 5 Nov 2021 20:38:33 +0000 (UTC) X-FDA: 78776039706.29.E706513 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf06.hostedemail.com (Postfix) with ESMTP id 166C0801A8A0 for ; Fri, 5 Nov 2021 20:38:32 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id D755C611C4; Fri, 5 Nov 2021 20:38:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144712; bh=+WQottrag5nM12JSj6ClJ577YghmMslIUJIreMa+8hE=; h=Date:From:To:Subject:In-Reply-To:From; b=aks1TfTdLKfd2PWRKIfA/48wndRqE/kYqMzAj084YJS1oow+SOZaBFuUJv2RhuZLa lwSIcMJfGPLIwG8LYtzPg3WJ0mkYQNzaQkJp4Of1LuCPHF3Axjb72Kh3BqorKPJTdC sxVpVHHb11niWYPT6kaq95K15Sx4cL+ssS2UpsVg= Date: Fri, 05 Nov 2021 13:38:31 -0700 From: Andrew Morton To: aarcange@redhat.com, akpm@linux-foundation.org, apopple@nvidia.com, axelrasmussen@google.com, david@redhat.com, hughd@google.com, jglisse@redhat.com, kirill@shutemov.name, liam.howlett@oracle.com, linmiaohe@huawei.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, rppt@linux.vnet.ibm.com, 
shy828301@gmail.com, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 074/262] mm: drop first_index/last_index in zap_details Message-ID: <20211105203831.QjFEazYcI%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf06.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=aks1TfTd; dmarc=none; spf=pass (imf06.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 166C0801A8A0 X-Stat-Signature: rbruuuwwe3tqe6q6wtkwj5tggenghe68 X-HE-Tag: 1636144712-884515 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Peter Xu Subject: mm: drop first_index/last_index in zap_details The first_index/last_index parameters in zap_details are actually only used in unmap_mapping_range_tree(). Meanwhile, this function is only called by unmap_mapping_pages() once. Instead of passing these two variables through the whole stack of page zapping code, remove them from zap_details and let them simply be parameters of unmap_mapping_range_tree(), which is inlined. Link: https://lkml.kernel.org/r/20210915181535.11238-1-peterx@redhat.com Signed-off-by: Peter Xu Reviewed-by: Alistair Popple Reviewed-by: David Hildenbrand Reviewed-by: Liam Howlett Acked-by: Hugh Dickins Cc: Andrea Arcangeli Cc: Axel Rasmussen Cc: Jerome Glisse Cc: "Kirill A .
Shutemov" Cc: Matthew Wilcox Cc: Miaohe Lin Cc: Mike Rapoport Cc: Yang Shi Signed-off-by: Andrew Morton --- include/linux/mm.h | 2 -- mm/memory.c | 31 ++++++++++++++++++------------- 2 files changed, 18 insertions(+), 15 deletions(-) --- a/include/linux/mm.h~mm-drop-first_index-last_index-in-zap_details +++ a/include/linux/mm.h @@ -1688,8 +1688,6 @@ extern void user_shm_unlock(size_t, stru */ struct zap_details { struct address_space *check_mapping; /* Check page->mapping if set */ - pgoff_t first_index; /* Lowest page->index to unmap */ - pgoff_t last_index; /* Highest page->index to unmap */ struct page *single_page; /* Locked page to be unmapped */ }; --- a/mm/memory.c~mm-drop-first_index-last_index-in-zap_details +++ a/mm/memory.c @@ -3325,20 +3325,20 @@ static void unmap_mapping_range_vma(stru } static inline void unmap_mapping_range_tree(struct rb_root_cached *root, + pgoff_t first_index, + pgoff_t last_index, struct zap_details *details) { struct vm_area_struct *vma; pgoff_t vba, vea, zba, zea; - vma_interval_tree_foreach(vma, root, - details->first_index, details->last_index) { - + vma_interval_tree_foreach(vma, root, first_index, last_index) { vba = vma->vm_pgoff; vea = vba + vma_pages(vma) - 1; - zba = details->first_index; + zba = first_index; if (zba < vba) zba = vba; - zea = details->last_index; + zea = last_index; if (zea > vea) zea = vea; @@ -3364,18 +3364,22 @@ void unmap_mapping_page(struct page *pag { struct address_space *mapping = page->mapping; struct zap_details details = { }; + pgoff_t first_index; + pgoff_t last_index; VM_BUG_ON(!PageLocked(page)); VM_BUG_ON(PageTail(page)); + first_index = page->index; + last_index = page->index + thp_nr_pages(page) - 1; + details.check_mapping = mapping; - details.first_index = page->index; - details.last_index = page->index + thp_nr_pages(page) - 1; details.single_page = page; i_mmap_lock_write(mapping); if (unlikely(!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root))) - unmap_mapping_range_tree(&mapping->i_mmap, 
&details); + unmap_mapping_range_tree(&mapping->i_mmap, first_index, + last_index, &details); i_mmap_unlock_write(mapping); } @@ -3395,16 +3399,17 @@ void unmap_mapping_pages(struct address_ pgoff_t nr, bool even_cows) { struct zap_details details = { }; + pgoff_t first_index = start; + pgoff_t last_index = start + nr - 1; details.check_mapping = even_cows ? NULL : mapping; - details.first_index = start; - details.last_index = start + nr - 1; - if (details.last_index < details.first_index) - details.last_index = ULONG_MAX; + if (last_index < first_index) + last_index = ULONG_MAX; i_mmap_lock_write(mapping); if (unlikely(!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root))) - unmap_mapping_range_tree(&mapping->i_mmap, &details); + unmap_mapping_range_tree(&mapping->i_mmap, first_index, + last_index, &details); i_mmap_unlock_write(mapping); } EXPORT_SYMBOL_GPL(unmap_mapping_pages); From patchwork Fri Nov 5 20:38:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605513 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0EABBC433EF for ; Fri, 5 Nov 2021 20:38:38 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B8BCA6056B for ; Fri, 5 Nov 2021 20:38:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org B8BCA6056B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 4FA90940047; Fri, 5 Nov 2021 16:38:37 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 4AA3394003D; Fri, 5 Nov 2021 16:38:37 -0400 (EDT) X-Delivered-To: 
int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3261D940047; Fri, 5 Nov 2021 16:38:37 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0208.hostedemail.com [216.40.44.208]) by kanga.kvack.org (Postfix) with ESMTP id 1DC6994003D for ; Fri, 5 Nov 2021 16:38:37 -0400 (EDT) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id D522877997 for ; Fri, 5 Nov 2021 20:38:36 +0000 (UTC) X-FDA: 78776039832.12.CF2826A Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf04.hostedemail.com (Postfix) with ESMTP id DE88C5000303 for ; Fri, 5 Nov 2021 20:38:27 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 30E3E61252; Fri, 5 Nov 2021 20:38:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144715; bh=5qowX4AnX4c065/IuBa9/8Cm26bFDC1XWPKyQFwp1Rg=; h=Date:From:To:Subject:In-Reply-To:From; b=SGsP5y7KaXlI2sPLGGaBT0LsIJHBOifZv0WlObjBa6ijXcmh4UdNLi2XLVts0jwlP VNoN5BBbXT9Uob3WHckvsBf7KtP1/8ANmBc+OzHc7A2Fh6tATLNiwCGdO8IfzR2xDw d9mLxCsmimdQ64/AXwoi99kAaUejjQX1HQs1Q2sg= Date: Fri, 05 Nov 2021 13:38:34 -0700 From: Andrew Morton To: aarcange@redhat.com, akpm@linux-foundation.org, apopple@nvidia.com, axelrasmussen@google.com, david@redhat.com, hughd@google.com, jglisse@redhat.com, kirill@shutemov.name, liam.howlett@oracle.com, linmiaohe@huawei.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, rppt@linux.vnet.ibm.com, shy828301@gmail.com, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 075/262] mm: add zap_skip_check_mapping() helper Message-ID: <20211105203834.7SPeSSP3b%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: DE88C5000303 X-Stat-Signature: 
f3ed6gf8atopj58ho1pqi6teeyat4xy5 Authentication-Results: imf04.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=SGsP5y7K; spf=pass (imf04.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144707-233054 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Peter Xu Subject: mm: add zap_skip_check_mapping() helper Use the helper for the checks. Rename "check_mapping" into "zap_mapping" because "check_mapping" looks like a bool but in fact it stores the mapping itself. When it's set, we check the mapping (it must be non-NULL). When it's cleared we skip the check, which works like the old way. Move the duplicated comments to the helper too. Link: https://lkml.kernel.org/r/20210915181538.11288-1-peterx@redhat.com Signed-off-by: Peter Xu Reviewed-by: Alistair Popple Cc: Andrea Arcangeli Cc: Axel Rasmussen Cc: David Hildenbrand Cc: Hugh Dickins Cc: Jerome Glisse Cc: "Kirill A . Shutemov" Cc: Liam Howlett Cc: Matthew Wilcox Cc: Miaohe Lin Cc: Mike Rapoport Cc: Yang Shi Signed-off-by: Andrew Morton --- include/linux/mm.h | 16 +++++++++++++++- mm/memory.c | 29 ++++++----------------------- 2 files changed, 21 insertions(+), 24 deletions(-) --- a/include/linux/mm.h~mm-add-zap_skip_check_mapping-helper +++ a/include/linux/mm.h @@ -1687,10 +1687,24 @@ extern void user_shm_unlock(size_t, stru * Parameter block passed down to zap_pte_range in exceptional cases. */ struct zap_details { - struct address_space *check_mapping; /* Check page->mapping if set */ + struct address_space *zap_mapping; /* Check page->mapping if set */ struct page *single_page; /* Locked page to be unmapped */ }; +/* + * We set details->zap_mappings when we want to unmap shared but keep private + * pages. 
Return true if skip zapping this page, false otherwise. + */ +static inline bool +zap_skip_check_mapping(struct zap_details *details, struct page *page) +{ + if (!details || !page) + return false; + + return details->zap_mapping && + (details->zap_mapping != page_rmapping(page)); +} + struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr, pte_t pte); struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr, --- a/mm/memory.c~mm-add-zap_skip_check_mapping-helper +++ a/mm/memory.c @@ -1337,16 +1337,8 @@ again: struct page *page; page = vm_normal_page(vma, addr, ptent); - if (unlikely(details) && page) { - /* - * unmap_shared_mapping_pages() wants to - * invalidate cache without truncating: - * unmap shared but keep private pages. - */ - if (details->check_mapping && - details->check_mapping != page_rmapping(page)) - continue; - } + if (unlikely(zap_skip_check_mapping(details, page))) + continue; ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm); tlb_remove_tlb_entry(tlb, pte, addr); @@ -1379,17 +1371,8 @@ again: is_device_exclusive_entry(entry)) { struct page *page = pfn_swap_entry_to_page(entry); - if (unlikely(details && details->check_mapping)) { - /* - * unmap_shared_mapping_pages() wants to - * invalidate cache without truncating: - * unmap shared but keep private pages. 
- */ - if (details->check_mapping != - page_rmapping(page)) - continue; - } - + if (unlikely(zap_skip_check_mapping(details, page))) + continue; pte_clear_not_present_full(mm, addr, pte, tlb->fullmm); rss[mm_counter(page)]--; @@ -3373,7 +3356,7 @@ void unmap_mapping_page(struct page *pag first_index = page->index; last_index = page->index + thp_nr_pages(page) - 1; - details.check_mapping = mapping; + details.zap_mapping = mapping; details.single_page = page; i_mmap_lock_write(mapping); @@ -3402,7 +3385,7 @@ void unmap_mapping_pages(struct address_ pgoff_t first_index = start; pgoff_t last_index = start + nr - 1; - details.check_mapping = even_cows ? NULL : mapping; + details.zap_mapping = even_cows ? NULL : mapping; if (last_index < first_index) last_index = ULONG_MAX; From patchwork Fri Nov 5 20:38:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605515 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 76DE3C433F5 for ; Fri, 5 Nov 2021 20:38:41 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 2CA1C60FBF for ; Fri, 5 Nov 2021 20:38:41 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 2CA1C60FBF Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id BB343940048; Fri, 5 Nov 2021 16:38:40 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B1469940049; Fri, 5 Nov 2021 16:38:40 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9B7A4940048; Fri, 5 Nov 2021 
16:38:40 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0019.hostedemail.com [216.40.44.19]) by kanga.kvack.org (Postfix) with ESMTP id 8733D94003D for ; Fri, 5 Nov 2021 16:38:40 -0400 (EDT) Received: from smtpin38.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 3EAB5779A8 for ; Fri, 5 Nov 2021 20:38:40 +0000 (UTC) X-FDA: 78776040000.38.B06A13F Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf27.hostedemail.com (Postfix) with ESMTP id C838470000AE for ; Fri, 5 Nov 2021 20:38:39 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id B011C6056B; Fri, 5 Nov 2021 20:38:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144719; bh=TO7fQlFAUIMUtuvraGiofhQAac+hReuX+JTU+yDcl9Q=; h=Date:From:To:Subject:In-Reply-To:From; b=af2C+O++JVJl5mi5WysOFszD3BcxsWyGnEsovuUNXlogqS6c5Idtf4/ZVKBa8lG+x guMFeDs7Skw3nVTlO5uk8zkQ5gR7tJYQ4bcSFw3/N4UCqcQcTPa0MJIh3Q6ndq8qJL N+BX83gmPECzSUvdCHLZP0DJUrElfFSb8pzgX5p0= Date: Fri, 05 Nov 2021 13:38:38 -0700 From: Andrew Morton To: akpm@linux-foundation.org, david@redhat.com, hannes@cmpxchg.org, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mhocko@kernel.org, mika.penttila@nextfour.com, mm-commits@vger.kernel.org, songmuchun@bytedance.com, tglx@linutronix.de, torvalds@linux-foundation.org, vbabka@suse.cz, vdavydov.dev@gmail.com, zhengqi.arch@bytedance.com Subject: [patch 076/262] mm: introduce pmd_install() helper Message-ID: <20211105203838.xYsnl2UZw%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: C838470000AE X-Stat-Signature: s4c5nqfsk7zh3rbzw4xxeuwx4riiudd6 Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=af2C+O++; dmarc=none; spf=pass (imf27.hostedemail.com: 
domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144719-201489 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Qi Zheng Subject: mm: introduce pmd_install() helper Patch series "Do some code cleanups related to mm", v3. This patch (of 2): Currently the same few lines are repeated three times in the code. Deduplicate them with the newly introduced pmd_install() helper. Link: https://lkml.kernel.org/r/20210901102722.47686-1-zhengqi.arch@bytedance.com Link: https://lkml.kernel.org/r/20210901102722.47686-2-zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng Reviewed-by: David Hildenbrand Reviewed-by: Muchun Song Acked-by: Kirill A. Shutemov Cc: Thomas Gleixner Cc: Johannes Weiner Cc: Michal Hocko Cc: Vladimir Davydov Cc: Mika Penttila Cc: Vlastimil Babka Signed-off-by: Andrew Morton --- mm/filemap.c | 11 ++--------- mm/internal.h | 1 + mm/memory.c | 34 ++++++++++++++++------------------ 3 files changed, 19 insertions(+), 27 deletions(-) --- a/mm/filemap.c~mm-introduce-pmd_install-helper +++ a/mm/filemap.c @@ -3211,15 +3211,8 @@ static bool filemap_map_pmd(struct vm_fa } } - if (pmd_none(*vmf->pmd)) { - vmf->ptl = pmd_lock(mm, vmf->pmd); - if (likely(pmd_none(*vmf->pmd))) { - mm_inc_nr_ptes(mm); - pmd_populate(mm, vmf->pmd, vmf->prealloc_pte); - vmf->prealloc_pte = NULL; - } - spin_unlock(vmf->ptl); - } + if (pmd_none(*vmf->pmd)) + pmd_install(mm, vmf->pmd, &vmf->prealloc_pte); /* See comment in handle_pte_fault() */ if (pmd_devmap_trans_unstable(vmf->pmd)) { --- a/mm/internal.h~mm-introduce-pmd_install-helper +++ a/mm/internal.h @@ -38,6 +38,7 @@ vm_fault_t do_swap_page(struct vm_fault void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma, unsigned long floor, unsigned long ceiling); +void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t
*pte); static inline bool can_madv_lru_vma(struct vm_area_struct *vma) { --- a/mm/memory.c~mm-introduce-pmd_install-helper +++ a/mm/memory.c @@ -437,9 +437,20 @@ void free_pgtables(struct mmu_gather *tl } } +void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte) +{ + spinlock_t *ptl = pmd_lock(mm, pmd); + + if (likely(pmd_none(*pmd))) { /* Has another populated it ? */ + mm_inc_nr_ptes(mm); + pmd_populate(mm, pmd, *pte); + *pte = NULL; + } + spin_unlock(ptl); +} + int __pte_alloc(struct mm_struct *mm, pmd_t *pmd) { - spinlock_t *ptl; pgtable_t new = pte_alloc_one(mm); if (!new) return -ENOMEM; @@ -459,13 +470,7 @@ int __pte_alloc(struct mm_struct *mm, pm */ smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */ - ptl = pmd_lock(mm, pmd); - if (likely(pmd_none(*pmd))) { /* Has another populated it ? */ - mm_inc_nr_ptes(mm); - pmd_populate(mm, pmd, new); - new = NULL; - } - spin_unlock(ptl); + pmd_install(mm, pmd, &new); if (new) pte_free(mm, new); return 0; @@ -4028,17 +4033,10 @@ vm_fault_t finish_fault(struct vm_fault return ret; } - if (vmf->prealloc_pte) { - vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd); - if (likely(pmd_none(*vmf->pmd))) { - mm_inc_nr_ptes(vma->vm_mm); - pmd_populate(vma->vm_mm, vmf->pmd, vmf->prealloc_pte); - vmf->prealloc_pte = NULL; - } - spin_unlock(vmf->ptl); - } else if (unlikely(pte_alloc(vma->vm_mm, vmf->pmd))) { + if (vmf->prealloc_pte) + pmd_install(vma->vm_mm, vmf->pmd, &vmf->prealloc_pte); + else if (unlikely(pte_alloc(vma->vm_mm, vmf->pmd))) return VM_FAULT_OOM; - } } /* See comment in handle_pte_fault() */ From patchwork Fri Nov 5 20:38:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605517 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with 
ESMTP id B2C55C433EF for ; Fri, 5 Nov 2021 20:38:44 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6A0D260FBF for ; Fri, 5 Nov 2021 20:38:44 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 6A0D260FBF Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 0925794003D; Fri, 5 Nov 2021 16:38:44 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id F3D1E940049; Fri, 5 Nov 2021 16:38:43 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DB03F94003D; Fri, 5 Nov 2021 16:38:43 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0158.hostedemail.com [216.40.44.158]) by kanga.kvack.org (Postfix) with ESMTP id BE047940049 for ; Fri, 5 Nov 2021 16:38:43 -0400 (EDT) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 888321813C150 for ; Fri, 5 Nov 2021 20:38:43 +0000 (UTC) X-FDA: 78776040084.26.C5746D3 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf03.hostedemail.com (Postfix) with ESMTP id 349B430000B0 for ; Fri, 5 Nov 2021 20:38:36 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id F059E611C4; Fri, 5 Nov 2021 20:38:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144722; bh=S4N4QLiF/9emJ6Vh5KF8q2lQz32gWn0CCQ7ZowG4u8g=; h=Date:From:To:Subject:In-Reply-To:From; b=XP9MhSCsqUP4fC62Sid088hIskoH1zYJKj+Q1SKkyShMWBc82rdoAN4GYYitIS3SV v5BZrZOLbLeeNUMwFuKXIc9/jL3F3VFBiTPqIbra+7aJWlyYxC31vqjkqmHIEVwMwO 1wQiSb7hb+pGtfXXv1wHFQOHZw3Znj2Mq4JEXjDk= Date: Fri, 05 Nov 2021 13:38:41 -0700 From: Andrew Morton To: akpm@linux-foundation.org, 
david@redhat.com, hannes@cmpxchg.org, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mhocko@kernel.org, mika.penttila@nextfour.com, mm-commits@vger.kernel.org, songmuchun@bytedance.com, tglx@linutronix.de, torvalds@linux-foundation.org, vbabka@suse.cz, vdavydov.dev@gmail.com, zhengqi.arch@bytedance.com Subject: [patch 077/262] mm: remove redundant smp_wmb() Message-ID: <20211105203841.NJQh4O_0B%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 349B430000B0 X-Stat-Signature: 48uk69d57awbkxz9bdob5kdqopggj781 Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=XP9MhSCs; dmarc=none; spf=pass (imf03.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144716-198627 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Qi Zheng Subject: mm: remove redundant smp_wmb() The smp_wmb() in __pte_alloc() ensures that all pte setup is visible before the pte is made visible to other CPUs by being put into page tables. We only need this when the pte is actually populated, so move it to pmd_install(). __pte_alloc_kernel(), __p4d_alloc(), __pud_alloc() and __pmd_alloc() are similar to this case. We can also defer smp_wmb() to the place where the pmd entry is really populated by the preallocated pte. There are two kinds of users of a preallocated pte: one is filemap & finish_fault(), the other is THP. The former does not need another smp_wmb() because the smp_wmb() has been done by pmd_install(). Fortunately, the latter also does not need another smp_wmb() because there is already a smp_wmb() before populating the new pte when the THP uses a preallocated pte to split a huge pmd.
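The ordering argument above can be illustrated outside the kernel. What follows is only a user-space sketch under stated assumptions, not the kernel implementation: struct table, slot, slot_lock, table_install(), table_alloc() and table_read_first() are hypothetical names made up for this illustration, a C11 release fence stands in for smp_wmb(), and the reader uses an acquire load where the kernel relies on data-dependent loads. As in pmd_install(), the barrier is issued only on the path that actually publishes the new table, and the caller's pointer is cleared so that a table which lost the race can be freed by the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a freshly allocated page table. */
struct table {
	int entries[4];
};

/* Plays the role of the pmd entry: NULL means "not populated yet". */
static _Atomic(struct table *) slot;
/* Plays the role of the page table lock. */
static atomic_flag slot_lock = ATOMIC_FLAG_INIT;

/*
 * Analogue of pmd_install(): publish *new if the slot is still empty.
 * On success *new is set to NULL, so the caller knows it was consumed.
 */
static void table_install(struct table **new)
{
	while (atomic_flag_test_and_set_explicit(&slot_lock,
						 memory_order_acquire))
		;	/* spin until the lock is ours */

	if (atomic_load_explicit(&slot, memory_order_relaxed) == NULL) {
		/*
		 * Stand-in for smp_wmb(): make the table contents visible
		 * before the pointer to the table. Only needed when we
		 * really publish.
		 */
		atomic_thread_fence(memory_order_release);
		atomic_store_explicit(&slot, *new, memory_order_relaxed);
		*new = NULL;
	}

	atomic_flag_clear_explicit(&slot_lock, memory_order_release);
}

/* Analogue of __pte_alloc(): allocate, set up, try to install. */
static int table_alloc(void)
{
	struct table *new = calloc(1, sizeof(*new));

	if (!new)
		return -1;
	new->entries[0] = 42;	/* setup that must be visible to readers */

	table_install(&new);
	free(new);		/* no-op if the table was installed */
	return 0;
}

/* Lockless reader: returns -1 while the slot is still empty. */
static int table_read_first(void)
{
	struct table *t = atomic_load_explicit(&slot, memory_order_acquire);

	return t ? t->entries[0] : -1;
}
```

Calling table_alloc() a second time takes the "another has populated it" path: no fence is issued and the unused table is freed, which mirrors the reason the barrier can move from the allocation site to the point of population.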
Link: https://lkml.kernel.org/r/20210901102722.47686-3-zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng Reviewed-by: Muchun Song Acked-by: David Hildenbrand Acked-by: Kirill A. Shutemov Cc: Johannes Weiner Cc: Michal Hocko Cc: Mika Penttila Cc: Thomas Gleixner Cc: Vladimir Davydov Cc: Vlastimil Babka Signed-off-by: Andrew Morton --- mm/memory.c | 52 ++++++++++++++++++------------------------ mm/sparse-vmemmap.c | 2 - 2 files changed, 24 insertions(+), 30 deletions(-) --- a/mm/memory.c~mm-remove-redundant-smp_wmb +++ a/mm/memory.c @@ -443,6 +443,20 @@ void pmd_install(struct mm_struct *mm, p if (likely(pmd_none(*pmd))) { /* Has another populated it ? */ mm_inc_nr_ptes(mm); + /* + * Ensure all pte setup (eg. pte page lock and page clearing) are + * visible before the pte is made visible to other CPUs by being + * put into page tables. + * + * The other side of the story is the pointer chasing in the page + * table walking code (when walking the page table without locking; + * ie. most of the time). Fortunately, these data accesses consist + * of a chain of data-dependent loads, meaning most CPUs (alpha + * being the notable exception) will already guarantee loads are + * seen in-order. See the alpha page table accessors for the + * smp_rmb() barriers in page table walking code. + */ + smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */ pmd_populate(mm, pmd, *pte); *pte = NULL; } @@ -455,21 +469,6 @@ int __pte_alloc(struct mm_struct *mm, pm if (!new) return -ENOMEM; - /* - * Ensure all pte setup (eg. pte page lock and page clearing) are - * visible before the pte is made visible to other CPUs by being - * put into page tables. - * - * The other side of the story is the pointer chasing in the page - * table walking code (when walking the page table without locking; - * ie. most of the time). 
Fortunately, these data accesses consist - * of a chain of data-dependent loads, meaning most CPUs (alpha - * being the notable exception) will already guarantee loads are - * seen in-order. See the alpha page table accessors for the - * smp_rmb() barriers in page table walking code. - */ - smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */ - pmd_install(mm, pmd, &new); if (new) pte_free(mm, new); @@ -482,10 +481,9 @@ int __pte_alloc_kernel(pmd_t *pmd) if (!new) return -ENOMEM; - smp_wmb(); /* See comment in __pte_alloc */ - spin_lock(&init_mm.page_table_lock); if (likely(pmd_none(*pmd))) { /* Has another populated it ? */ + smp_wmb(); /* See comment in pmd_install() */ pmd_populate_kernel(&init_mm, pmd, new); new = NULL; } @@ -3849,7 +3847,6 @@ static vm_fault_t __do_fault(struct vm_f vmf->prealloc_pte = pte_alloc_one(vma->vm_mm); if (!vmf->prealloc_pte) return VM_FAULT_OOM; - smp_wmb(); /* See comment in __pte_alloc() */ } ret = vma->vm_ops->fault(vmf); @@ -3920,7 +3917,6 @@ vm_fault_t do_set_pmd(struct vm_fault *v vmf->prealloc_pte = pte_alloc_one(vma->vm_mm); if (!vmf->prealloc_pte) return VM_FAULT_OOM; - smp_wmb(); /* See comment in __pte_alloc() */ } vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd); @@ -4145,7 +4141,6 @@ static vm_fault_t do_fault_around(struct vmf->prealloc_pte = pte_alloc_one(vmf->vma->vm_mm); if (!vmf->prealloc_pte) return VM_FAULT_OOM; - smp_wmb(); /* See comment in __pte_alloc() */ } return vmf->vma->vm_ops->map_pages(vmf, start_pgoff, end_pgoff); @@ -4819,13 +4814,13 @@ int __p4d_alloc(struct mm_struct *mm, pg if (!new) return -ENOMEM; - smp_wmb(); /* See comment in __pte_alloc */ - spin_lock(&mm->page_table_lock); - if (pgd_present(*pgd)) /* Another has populated it */ + if (pgd_present(*pgd)) { /* Another has populated it */ p4d_free(mm, new); - else + } else { + smp_wmb(); /* See comment in pmd_install() */ pgd_populate(mm, pgd, new); + } spin_unlock(&mm->page_table_lock); return 0; } @@ -4842,11 +4837,10 @@ int 
__pud_alloc(struct mm_struct *mm, p4 if (!new) return -ENOMEM; - smp_wmb(); /* See comment in __pte_alloc */ - spin_lock(&mm->page_table_lock); if (!p4d_present(*p4d)) { mm_inc_nr_puds(mm); + smp_wmb(); /* See comment in pmd_install() */ p4d_populate(mm, p4d, new); } else /* Another has populated it */ pud_free(mm, new); @@ -4867,14 +4861,14 @@ int __pmd_alloc(struct mm_struct *mm, pu if (!new) return -ENOMEM; - smp_wmb(); /* See comment in __pte_alloc */ - ptl = pud_lock(mm, pud); if (!pud_present(*pud)) { mm_inc_nr_pmds(mm); + smp_wmb(); /* See comment in pmd_install() */ pud_populate(mm, pud, new); - } else /* Another has populated it */ + } else { /* Another has populated it */ pmd_free(mm, new); + } spin_unlock(ptl); return 0; } --- a/mm/sparse-vmemmap.c~mm-remove-redundant-smp_wmb +++ a/mm/sparse-vmemmap.c @@ -76,7 +76,7 @@ static int split_vmemmap_huge_pmd(pmd_t set_pte_at(&init_mm, addr, pte, entry); } - /* Make pte visible before pmd. See comment in __pte_alloc(). */ + /* Make pte visible before pmd. See comment in pmd_install(). 
*/ smp_wmb(); pmd_populate_kernel(&init_mm, pmd, pgtable); From patchwork Fri Nov 5 20:38:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605519 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E9104C433F5 for ; Fri, 5 Nov 2021 20:38:47 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9777C611C4 for ; Fri, 5 Nov 2021 20:38:47 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 9777C611C4 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 378BE94004A; Fri, 5 Nov 2021 16:38:47 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 2FE2D940049; Fri, 5 Nov 2021 16:38:47 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1C4A994004A; Fri, 5 Nov 2021 16:38:47 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0026.hostedemail.com [216.40.44.26]) by kanga.kvack.org (Postfix) with ESMTP id 09BEE940049 for ; Fri, 5 Nov 2021 16:38:47 -0400 (EDT) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id BDC65182067DF for ; Fri, 5 Nov 2021 20:38:46 +0000 (UTC) X-FDA: 78776040252.06.F7DAC43 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf20.hostedemail.com (Postfix) with ESMTP id 17ACBD0000B9 for ; Fri, 5 Nov 2021 20:38:36 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 3DD576056B; Fri, 5 Nov 2021 20:38:45 
+0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144725; bh=uXpFjnZl1TDxU4RkcF/ZAleUQ1ZM9H9lOcFfdZvNkxE=; h=Date:From:To:Subject:In-Reply-To:From; b=U51nPDVjoMRL3OeAJwbEQW4JbqoAwrZPp8LPrqA8TNC9/19Vc1CsMWeJcPRO6tZi0 kwCz29PSg7ycis2XixtcOasloN9E5ZcQobDJbjkVQLq6dZSUQ/HAC/WatAKEk60jNc 3jYu34eN5QqY//MkXASklbzamvAdHGuvxrOidOGA= Date: Fri, 05 Nov 2021 13:38:44 -0700 From: Andrew Morton To: akpm@linux-foundation.org, carl.waldspurger@nutanix.com, corbet@lwn.net, david@redhat.com, florian.schmidt@nutanix.com, ivan.teterevkov@nutanix.com, jonathan.davies@nutanix.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, tiberiu.georgescu@nutanix.com, torvalds@linux-foundation.org Subject: [patch 078/262] Documentation: update pagemap with shmem exceptions Message-ID: <20211105203844.xnHgY8CWB%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 17ACBD0000B9 X-Stat-Signature: qwmake5ugi4uhijkxcp5rpgj3p5si6k9 Authentication-Results: imf20.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=U51nPDVj; spf=pass (imf20.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144716-523564 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Tiberiu A Georgescu Subject: Documentation: update pagemap with shmem exceptions This patch follows the discussions on previous documentation patch threads [1][2]. It presents the exception case of shared memory management from the pagemap's point of view. It briefly describes what is missing, why it is missing and alternatives to the pagemap for page info retrieval in user space. 
In short, the kernel does not keep track of PTEs for swapped out shared pages within the processes that reference them. Thus, /proc/pid/pagemap cannot report the swap destination of the shared memory pages, instead setting the pagemap entry to zero for both non-allocated and swapped out pages. This can create confusion for users who need information on swapped out pages. The reasons why maintaining the PTEs of all swapped out shared pages among all processes while maintaining similar performance is not a trivial task, or a desirable change, have been discussed extensively [1][3][4][5]. There are also arguments for why this arguably missing information should eventually be exposed to the user in either a future pagemap patch, or by an alternative tool. [1]: https://marc.info/?m=162878395426774 [2]: https://lore.kernel.org/lkml/20210920164931.175411-1-tiberiu.georgescu@nutanix.com/ [3]: https://lore.kernel.org/lkml/20210730160826.63785-1-tiberiu.georgescu@nutanix.com/ [4]: https://lore.kernel.org/lkml/20210807032521.7591-1-peterx@redhat.com/ [5]: https://lore.kernel.org/lkml/20210715201651.212134-1-peterx@redhat.com/ Mention the currently missing information in the pagemap and alternatives on how to retrieve it, in case someone stumbles upon unexpected behaviour.
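As a rough illustration of the user-space alternatives mentioned above, the following sketch combines lseek(SEEK_DATA) and mincore() to tell a hole, an in-memory page and a swapped out page apart. It is an example under assumptions, not part of the patch: enum page_state, classify_page() and demo_written_page_state() are hypothetical names, and the demo uses a memfd as the shared memory object, which assumes a Linux system whose libc provides memfd_create().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

enum page_state {
	PAGE_HOLE,		/* none/non-allocated */
	PAGE_IN_MEMORY,		/* present, including swap cache */
	PAGE_SWAPPED_OUT,	/* allocated but not resident */
	PAGE_UNKNOWN,		/* a system call failed */
};

/*
 * Classify page number idx of the shared file fd, which is mapped at base.
 * lseek(SEEK_DATA) separates holes from allocated pages; mincore() then
 * separates resident pages from non-resident ones.
 */
static enum page_state classify_page(int fd, char *base, size_t idx,
				     size_t page_size)
{
	off_t start = (off_t)(idx * page_size);
	unsigned char vec = 0;
	off_t data;

	data = lseek(fd, start, SEEK_DATA);
	if (data < 0)	/* ENXIO: no data at or after this offset */
		return errno == ENXIO ? PAGE_HOLE : PAGE_UNKNOWN;
	if (data >= start + (off_t)page_size)
		return PAGE_HOLE;	/* next data lies beyond this page */

	if (mincore(base + start, page_size, &vec) != 0)
		return PAGE_UNKNOWN;

	return (vec & 1) ? PAGE_IN_MEMORY : PAGE_SWAPPED_OUT;
}

/* Demo: map a 4-page memfd, touch page 1 only, and classify that page. */
static enum page_state demo_written_page_state(void)
{
	size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
	size_t npages = 4;
	enum page_state state = PAGE_UNKNOWN;
	char *base;
	int fd;

	assert(page_size > 0);

	fd = memfd_create("pagemap-demo", 0);
	if (fd < 0)
		return PAGE_UNKNOWN;

	if (ftruncate(fd, (off_t)(npages * page_size)) == 0) {
		base = mmap(NULL, npages * page_size, PROT_READ | PROT_WRITE,
			    MAP_SHARED, fd, 0);
		if (base != MAP_FAILED) {
			base[page_size] = 1;	/* allocate page 1 */
			state = classify_page(fd, base, 1, page_size);
			munmap(base, npages * page_size);
		}
	}

	close(fd);
	return state;
}
```

For an existing anonymous shared mapping of another process, the file descriptor would instead come from opening the matching entry under /proc/pid/map_files/, as the documentation text in this patch notes. Neither call recovers bits such as SOFT_DIRTY, which are lost once the page is swapped out.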
Link: https://lkml.kernel.org/r/20210923064618.157046-1-tiberiu.georgescu@nutanix.com Link: https://lkml.kernel.org/r/20210923064618.157046-2-tiberiu.georgescu@nutanix.com Signed-off-by: Tiberiu A Georgescu Reviewed-by: Ivan Teterevkov Reviewed-by: Florian Schmidt Reviewed-by: Carl Waldspurger Reviewed-by: Jonathan Davies Reviewed-by: Peter Xu Reviewed-by: David Hildenbrand Cc: Jonathan Corbet Signed-off-by: Andrew Morton --- Documentation/admin-guide/mm/pagemap.rst | 22 +++++++++++++++++++++ 1 file changed, 22 insertions(+) --- a/Documentation/admin-guide/mm/pagemap.rst~documentation-update-pagemap-with-shmem-exceptions +++ a/Documentation/admin-guide/mm/pagemap.rst @@ -196,6 +196,28 @@ you can go through every map in the proc in kpagecount, and tally up the number of pages that are only referenced once. +Exceptions for Shared Memory +============================ + +Page table entries for shared pages are cleared when the pages are zapped or +swapped out. This makes swapped out pages indistinguishable from never-allocated +ones. + +In kernel space, the swap location can still be retrieved from the page cache. +However, values stored only on the normal PTE get lost irretrievably when the +page is swapped out (i.e. SOFT_DIRTY). + +In user space, whether the page is present, swapped or none can be deduced with +the help of lseek and/or mincore system calls. + +lseek() can differentiate between accessed pages (present or swapped out) and +holes (none/non-allocated) by specifying the SEEK_DATA flag on the file where +the pages are backed. For anonymous shared pages, the file can be found in +``/proc/pid/map_files/``. + +mincore() can differentiate between pages in memory (present, including swap +cache) and out of memory (swapped out or none/non-allocated). 
+ Other notes =========== From patchwork Fri Nov 5 20:38:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605521 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2D805C433EF for ; Fri, 5 Nov 2021 20:38:51 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D25AE611C0 for ; Fri, 5 Nov 2021 20:38:50 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org D25AE611C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 7B9DD94004B; Fri, 5 Nov 2021 16:38:50 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 769D8940049; Fri, 5 Nov 2021 16:38:50 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6350594004B; Fri, 5 Nov 2021 16:38:50 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0197.hostedemail.com [216.40.44.197]) by kanga.kvack.org (Postfix) with ESMTP id 4DBB2940049 for ; Fri, 5 Nov 2021 16:38:50 -0400 (EDT) Received: from smtpin28.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 055B38249980 for ; Fri, 5 Nov 2021 20:38:50 +0000 (UTC) X-FDA: 78776040420.28.C84703C Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf29.hostedemail.com (Postfix) with ESMTP id 808A19000257 for ; Fri, 5 Nov 2021 20:38:49 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 6F75560FBF; Fri, 5 Nov 2021 20:38:48 +0000 (UTC) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144728; bh=mFBz9py+Ba4qdlLlHRD26epOPMvKKwJgs393gYq4odQ=; h=Date:From:To:Subject:In-Reply-To:From; b=dkWsM78jBRaCoHolLP2DbaAHJHMuWDgGhm+Q+0dxwEwsVySBHjX4NTNkaljxo8B1I ITBofBUwupUU061Xrip6WWtBFHXmzCUwKzBeway8qGm1tG5tC7l8qoaszy5m8pt6zI mzOF6HpFASDsDdVg9M3Q9psWdm7G7CZ1ZWuvPuw0= Date: Fri, 05 Nov 2021 13:38:48 -0700 From: Andrew Morton To: akpm@linux-foundation.org, anton@ozlabs.org, benh@kernel.crashing.org, linux-mm@kvack.org, luto@kernel.org, mm-commits@vger.kernel.org, npiggin@gmail.com, paulus@ozlabs.org, rdunlap@infradead.org, torvalds@linux-foundation.org Subject: [patch 079/262] lazy tlb: introduce lazy mm refcount helper functions Message-ID: <20211105203848.Ggf8ZZOJe%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=dkWsM78j; dmarc=none; spf=pass (imf29.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 808A19000257 X-Stat-Signature: g5qq66um4t3y516w7i9cwsuob4cz3nje X-HE-Tag: 1636144729-504304 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Nicholas Piggin Subject: lazy tlb: introduce lazy mm refcount helper functions Patch series "shoot lazy tlbs", v4. On a 16-socket 192-core POWER8 system, a context switching benchmark with as many software threads as CPUs (so each switch will go in and out of idle), upstream can achieve a rate of about 1 million context switches per second. After this series it goes up to 118 million. This patch (of 4): Add explicit _lazy_tlb annotated functions for lazy mm refcounting. 
This makes lazy mm references more obvious, and allows explicit refcounting to be removed if it is not used. If a kernel thread's current lazy tlb mm happens to be the one it wants to use, then kthread_use_mm() cleverly transfers the mm refcount from the lazy tlb mm reference to the returned reference. If the lazy tlb mm reference is no longer identical to a normal reference, this trick does not work, so that is changed to be explicit about the two references. [npiggin@gmail.com: fix a refcounting bug in kthread_use_mm] Link: https://lkml.kernel.org/r/1623125298.bx63h3mopj.astroid@bobo.none Link: https://lkml.kernel.org/r/20210605014216.446867-1-npiggin@gmail.com Link: https://lkml.kernel.org/r/20210605014216.446867-2-npiggin@gmail.com Signed-off-by: Nicholas Piggin Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Andy Lutomirski Cc: Anton Blanchard Cc: Randy Dunlap Signed-off-by: Andrew Morton --- arch/arm/mach-rpc/ecard.c | 2 +- arch/powerpc/kernel/smp.c | 2 +- arch/powerpc/mm/book3s64/radix_tlb.c | 4 ++-- fs/exec.c | 4 ++-- include/linux/sched/mm.h | 11 +++++++++++ kernel/cpu.c | 2 +- kernel/exit.c | 2 +- kernel/kthread.c | 21 +++++++++++++-------- kernel/sched/core.c | 15 ++++++++------- 9 files changed, 40 insertions(+), 23 deletions(-) --- a/arch/arm/mach-rpc/ecard.c~lazy-tlb-introduce-lazy-mm-refcount-helper-functions +++ a/arch/arm/mach-rpc/ecard.c @@ -253,7 +253,7 @@ static int ecard_init_mm(void) current->mm = mm; current->active_mm = mm; activate_mm(active_mm, mm); - mmdrop(active_mm); + mmdrop_lazy_tlb(active_mm); ecard_init_pgtables(mm); return 0; } --- a/arch/powerpc/kernel/smp.c~lazy-tlb-introduce-lazy-mm-refcount-helper-functions +++ a/arch/powerpc/kernel/smp.c @@ -1582,7 +1582,7 @@ void start_secondary(void *unused) if (IS_ENABLED(CONFIG_PPC32)) setup_kup(); - mmgrab(&init_mm); + mmgrab_lazy_tlb(&init_mm); current->active_mm = &init_mm; smp_store_cpu_info(cpu); --- 
a/arch/powerpc/mm/book3s64/radix_tlb.c~lazy-tlb-introduce-lazy-mm-refcount-helper-functions +++ a/arch/powerpc/mm/book3s64/radix_tlb.c @@ -786,10 +786,10 @@ void exit_lazy_flush_tlb(struct mm_struc if (current->active_mm == mm) { WARN_ON_ONCE(current->mm != NULL); /* Is a kernel thread and is using mm as the lazy tlb */ - mmgrab(&init_mm); + mmgrab_lazy_tlb(&init_mm); current->active_mm = &init_mm; switch_mm_irqs_off(mm, &init_mm, current); - mmdrop(mm); + mmdrop_lazy_tlb(mm); } /* --- a/fs/exec.c~lazy-tlb-introduce-lazy-mm-refcount-helper-functions +++ a/fs/exec.c @@ -1028,9 +1028,9 @@ static int exec_mmap(struct mm_struct *m setmax_mm_hiwater_rss(&tsk->signal->maxrss, old_mm); mm_update_next_owner(old_mm); mmput(old_mm); - return 0; + } else { + mmdrop_lazy_tlb(active_mm); } - mmdrop(active_mm); return 0; } --- a/include/linux/sched/mm.h~lazy-tlb-introduce-lazy-mm-refcount-helper-functions +++ a/include/linux/sched/mm.h @@ -49,6 +49,17 @@ static inline void mmdrop(struct mm_stru __mmdrop(mm); } +/* Helpers for lazy TLB mm refcounting */ +static inline void mmgrab_lazy_tlb(struct mm_struct *mm) +{ + mmgrab(mm); +} + +static inline void mmdrop_lazy_tlb(struct mm_struct *mm) +{ + mmdrop(mm); +} + /** * mmget() - Pin the address space associated with a &struct mm_struct. * @mm: The address space to pin. 
--- a/kernel/cpu.c~lazy-tlb-introduce-lazy-mm-refcount-helper-functions +++ a/kernel/cpu.c @@ -613,7 +613,7 @@ static int finish_cpu(unsigned int cpu) */ if (mm != &init_mm) idle->active_mm = &init_mm; - mmdrop(mm); + mmdrop_lazy_tlb(mm); return 0; } --- a/kernel/exit.c~lazy-tlb-introduce-lazy-mm-refcount-helper-functions +++ a/kernel/exit.c @@ -475,7 +475,7 @@ static void exit_mm(void) __set_current_state(TASK_RUNNING); mmap_read_lock(mm); } - mmgrab(mm); + mmgrab_lazy_tlb(mm); BUG_ON(mm != current->active_mm); /* more a memory barrier than a real lock */ task_lock(current); --- a/kernel/kthread.c~lazy-tlb-introduce-lazy-mm-refcount-helper-functions +++ a/kernel/kthread.c @@ -1350,14 +1350,19 @@ void kthread_use_mm(struct mm_struct *mm WARN_ON_ONCE(!(tsk->flags & PF_KTHREAD)); WARN_ON_ONCE(tsk->mm); + /* + * It's possible that tsk->active_mm == mm here, but we must + * still mmgrab(mm) and mmdrop_lazy_tlb(active_mm), because lazy + * mm may not have its own refcount (see mmgrab/drop_lazy_tlb()). + */ + mmgrab(mm); + task_lock(tsk); /* Hold off tlb flush IPIs while switching mm's */ local_irq_disable(); active_mm = tsk->active_mm; - if (active_mm != mm) { - mmgrab(mm); + if (active_mm != mm) tsk->active_mm = mm; - } tsk->mm = mm; membarrier_update_current_mm(mm); switch_mm_irqs_off(active_mm, mm, tsk); @@ -1374,12 +1379,9 @@ void kthread_use_mm(struct mm_struct *mm * memory barrier after storing to tsk->mm, before accessing * user-space memory. A full memory barrier for membarrier * {PRIVATE,GLOBAL}_EXPEDITED is implicitly provided by - * mmdrop(), or explicitly with smp_mb(). + * mmdrop_lazy_tlb(). 
*/ - if (active_mm != mm) - mmdrop(active_mm); - else - smp_mb(); + mmdrop_lazy_tlb(active_mm); to_kthread(tsk)->oldfs = force_uaccess_begin(); } @@ -1411,10 +1413,13 @@ void kthread_unuse_mm(struct mm_struct * local_irq_disable(); tsk->mm = NULL; membarrier_update_current_mm(NULL); + mmgrab_lazy_tlb(mm); /* active_mm is still 'mm' */ enter_lazy_tlb(mm, tsk); local_irq_enable(); task_unlock(tsk); + + mmdrop(mm); } EXPORT_SYMBOL_GPL(kthread_unuse_mm); --- a/kernel/sched/core.c~lazy-tlb-introduce-lazy-mm-refcount-helper-functions +++ a/kernel/sched/core.c @@ -4831,13 +4831,14 @@ static struct rq *finish_task_switch(str * rq->curr, before returning to userspace, so provide them here: * * - a full memory barrier for {PRIVATE,GLOBAL}_EXPEDITED, implicitly - * provided by mmdrop(), + * provided by mmdrop_lazy_tlb(), * - a sync_core for SYNC_CORE. */ if (mm) { membarrier_mm_sync_core_before_usermode(mm); - mmdrop(mm); + mmdrop_lazy_tlb(mm); } + if (unlikely(prev_state == TASK_DEAD)) { if (prev->sched_class->task_dead) prev->sched_class->task_dead(prev); @@ -4900,9 +4901,9 @@ context_switch(struct rq *rq, struct tas /* * kernel -> kernel lazy + transfer active - * user -> kernel lazy + mmgrab() active + * user -> kernel lazy + mmgrab_lazy_tlb() active * - * kernel -> user switch + mmdrop() active + * kernel -> user switch + mmdrop_lazy_tlb() active * user -> user switch */ if (!next->mm) { // to kernel @@ -4910,7 +4911,7 @@ context_switch(struct rq *rq, struct tas next->active_mm = prev->active_mm; if (prev->mm) // from user - mmgrab(prev->active_mm); + mmgrab_lazy_tlb(prev->active_mm); else prev->active_mm = NULL; } else { // to user @@ -4926,7 +4927,7 @@ context_switch(struct rq *rq, struct tas switch_mm_irqs_off(prev->active_mm, next->mm, next); if (!prev->mm) { // from kernel - /* will mmdrop() in finish_task_switch(). */ + /* will mmdrop_lazy_tlb() in finish_task_switch(). 
*/ rq->prev_mm = prev->active_mm; prev->active_mm = NULL; } @@ -9442,7 +9443,7 @@ void __init sched_init(void) /* * The boot idle thread does lazy MMU switching as well: */ - mmgrab(&init_mm); + mmgrab_lazy_tlb(&init_mm); enter_lazy_tlb(&init_mm, current); /* From patchwork Fri Nov 5 20:38:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605523 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 15BCDC433EF for ; Fri, 5 Nov 2021 20:38:54 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id BF0AF60174 for ; Fri, 5 Nov 2021 20:38:53 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org BF0AF60174 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 6330094004C; Fri, 5 Nov 2021 16:38:53 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 5E587940049; Fri, 5 Nov 2021 16:38:53 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4AACE94004C; Fri, 5 Nov 2021 16:38:53 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0207.hostedemail.com [216.40.44.207]) by kanga.kvack.org (Postfix) with ESMTP id 2D314940049 for ; Fri, 5 Nov 2021 16:38:53 -0400 (EDT) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id E5CB8779A6 for ; Fri, 5 Nov 2021 20:38:52 +0000 (UTC) X-FDA: 78776040504.06.29D5F09 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) 
by imf02.hostedemail.com (Postfix) with ESMTP id 45CDC7001A05 for ; Fri, 5 Nov 2021 20:38:47 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 8A608611C0; Fri, 5 Nov 2021 20:38:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144731; bh=PiwF7oMDAByYs1eamiVwxptlyfzBEUPJOm8BcTrHjF4=; h=Date:From:To:Subject:In-Reply-To:From; b=ubVkohP+TeIEt65JUxTaupiuFPhaU0hesOWTOyahGTD9Hedg+Dt6Ba9I32VB6xTL6 mwL3fx2TqoDULkz/dxAa2iFm78piaKYEiZcwMV2eUIN50lwHLWL512rE2qJV0gJk7c ifoAVKUechPAXwvRHtaVMz/zPHdI6pqTiQfvs9H4= Date: Fri, 05 Nov 2021 13:38:51 -0700 From: Andrew Morton To: akpm@linux-foundation.org, anton@ozlabs.org, benh@kernel.crashing.org, linux-mm@kvack.org, luto@kernel.org, mm-commits@vger.kernel.org, npiggin@gmail.com, paulus@ozlabs.org, rdunlap@infradead.org, torvalds@linux-foundation.org Subject: [patch 080/262] lazy tlb: allow lazy tlb mm refcounting to be configurable Message-ID: <20211105203851.Ise-VKP1_%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 45CDC7001A05 X-Stat-Signature: 579kiimugddfk78mnuonu49eytda7iqd Authentication-Results: imf02.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=ubVkohP+; spf=pass (imf02.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144727-77765 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Nicholas Piggin Subject: lazy tlb: allow lazy tlb mm refcounting to be configurable Add CONFIG_MMU_LAZY_TLB_REFCOUNT which enables refcounting of the lazy tlb mm when it is context switched.
This can be disabled by architectures that don't require this refcounting if they clean up lazy tlb mms when the last refcount is dropped. Currently this is always enabled, which is what existing code does, so the patch is effectively a no-op. Rename rq->prev_mm to rq->prev_lazy_mm, because that's what it is. [akpm@linux-foundation.org: fix comment] [npiggin@gmail.com: update comments] Link: https://lkml.kernel.org/r/1623121605.j47gdpccep.astroid@bobo.none Link: https://lkml.kernel.org/r/20210605014216.446867-3-npiggin@gmail.com Signed-off-by: Nicholas Piggin Cc: Andy Lutomirski Cc: Anton Blanchard Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Randy Dunlap Signed-off-by: Andrew Morton --- arch/Kconfig | 14 ++++++++++++++ include/linux/sched/mm.h | 14 ++++++++++++-- kernel/sched/core.c | 22 ++++++++++++++++++---- kernel/sched/sched.h | 4 +++- 4 files changed, 47 insertions(+), 7 deletions(-) --- a/arch/Kconfig~lazy-tlb-allow-lazy-tlb-mm-refcounting-to-be-configurable +++ a/arch/Kconfig @@ -428,6 +428,20 @@ config ARCH_WANT_IRQS_OFF_ACTIVATE_MM irqs disabled over activate_mm. Architectures that do IPI based TLB shootdowns should enable this. +# Use normal mm refcounting for MMU_LAZY_TLB kernel thread references. +# MMU_LAZY_TLB_REFCOUNT=n can improve the scalability of context switching +# to/from kernel threads when the same mm is running on a lot of CPUs (a large +# multi-threaded application), by reducing contention on the mm refcount. +# +# This can be disabled if the architecture ensures no CPUs are using an mm as a +# "lazy tlb" beyond its final refcount (i.e., by the time __mmdrop frees the mm +# or its kernel page tables). This could be arranged by arch_exit_mmap(), or +# final exit(2) TLB flush, for example. arch code must also ensure the +# _lazy_tlb variants of mmgrab/mmdrop are used when dropping the lazy reference +# to a kthread ->active_mm (non-arch code has been converted already). 
+config MMU_LAZY_TLB_REFCOUNT + def_bool y + config ARCH_HAVE_NMI_SAFE_CMPXCHG bool --- a/include/linux/sched/mm.h~lazy-tlb-allow-lazy-tlb-mm-refcounting-to-be-configurable +++ a/include/linux/sched/mm.h @@ -52,12 +52,22 @@ static inline void mmdrop(struct mm_stru /* Helpers for lazy TLB mm refcounting */ static inline void mmgrab_lazy_tlb(struct mm_struct *mm) { - mmgrab(mm); + if (IS_ENABLED(CONFIG_MMU_LAZY_TLB_REFCOUNT)) + mmgrab(mm); } static inline void mmdrop_lazy_tlb(struct mm_struct *mm) { - mmdrop(mm); + if (IS_ENABLED(CONFIG_MMU_LAZY_TLB_REFCOUNT)) { + mmdrop(mm); + } else { + /* + * mmdrop_lazy_tlb must provide a full memory barrier, see the + * membarrier comment in finish_task_switch which relies on + * this. + */ + smp_mb(); + } } /** --- a/kernel/sched/core.c~lazy-tlb-allow-lazy-tlb-mm-refcounting-to-be-configurable +++ a/kernel/sched/core.c @@ -4772,7 +4772,7 @@ static struct rq *finish_task_switch(str __releases(rq->lock) { struct rq *rq = this_rq(); - struct mm_struct *mm = rq->prev_mm; + struct mm_struct *mm = NULL; long prev_state; /* @@ -4791,7 +4791,10 @@ static struct rq *finish_task_switch(str current->comm, current->pid, preempt_count())) preempt_count_set(FORK_PREEMPT_COUNT); - rq->prev_mm = NULL; +#ifdef CONFIG_MMU_LAZY_TLB_REFCOUNT + mm = rq->prev_lazy_mm; + rq->prev_lazy_mm = NULL; +#endif /* * A task struct has one reference for the use as "current". @@ -4927,9 +4930,20 @@ context_switch(struct rq *rq, struct tas switch_mm_irqs_off(prev->active_mm, next->mm, next); if (!prev->mm) { // from kernel - /* will mmdrop_lazy_tlb() in finish_task_switch(). */ - rq->prev_mm = prev->active_mm; +#ifdef CONFIG_MMU_LAZY_TLB_REFCOUNT + /* Will mmdrop_lazy_tlb() in finish_task_switch(). 
*/ + rq->prev_lazy_mm = prev->active_mm; prev->active_mm = NULL; +#else + /* + * Without MMU_LAZY_TLB_REFCOUNT there is no lazy + * tracking (because no rq->prev_lazy_mm) in + * finish_task_switch, so no mmdrop_lazy_tlb(), so no + * memory barrier for membarrier (see the membarrier + * comment in finish_task_switch()). Do it here. + */ + smp_mb(); +#endif } } --- a/kernel/sched/sched.h~lazy-tlb-allow-lazy-tlb-mm-refcounting-to-be-configurable +++ a/kernel/sched/sched.h @@ -977,7 +977,9 @@ struct rq { struct task_struct *idle; struct task_struct *stop; unsigned long next_balance; - struct mm_struct *prev_mm; +#ifdef CONFIG_MMU_LAZY_TLB_REFCOUNT + struct mm_struct *prev_lazy_mm; +#endif unsigned int clock_update_flags; u64 clock; From patchwork Fri Nov 5 20:38:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605525 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4E5A8C433EF for ; Fri, 5 Nov 2021 20:38:57 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 0941C60174 for ; Fri, 5 Nov 2021 20:38:56 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 0941C60174 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 91C7594004D; Fri, 5 Nov 2021 16:38:56 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 8C924940049; Fri, 5 Nov 2021 16:38:56 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 790D594004D; Fri, 5 Nov 2021 16:38:56 -0400 (EDT) X-Delivered-To: 
linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0102.hostedemail.com [216.40.44.102]) by kanga.kvack.org (Postfix) with ESMTP id 651F7940049 for ; Fri, 5 Nov 2021 16:38:56 -0400 (EDT) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 32D6177995 for ; Fri, 5 Nov 2021 20:38:56 +0000 (UTC) X-FDA: 78776040672.21.C26DF8C Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf06.hostedemail.com (Postfix) with ESMTP id B5409801A8BF for ; Fri, 5 Nov 2021 20:38:55 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id A11626056B; Fri, 5 Nov 2021 20:38:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144734; bh=X1whOE334UN16Uo/oejX6lTyL1rxi9kcrg7LmwDGZcc=; h=Date:From:To:Subject:In-Reply-To:From; b=zMx7hY0/1T3dnBEUx2NveiAL6TSyzGSu5Le5gmsNH9xsjTm9xftWn4njUxGbileBf e2FFCtNuFdOr0GRUQpkcPDJjczJT8V8m2tDzfvLIALMG6hRM0yJmXxA8uZrCQhkmQt Qwn839Qw6dp34tm+TNTbPe2ewpi7jFxRez5NNJEY= Date: Fri, 05 Nov 2021 13:38:54 -0700 From: Andrew Morton To: akpm@linux-foundation.org, anton@ozlabs.org, benh@kernel.crashing.org, linux-mm@kvack.org, luto@kernel.org, mm-commits@vger.kernel.org, npiggin@gmail.com, paulus@ozlabs.org, rdunlap@infradead.org, torvalds@linux-foundation.org Subject: [patch 081/262] lazy tlb: shoot lazies, a non-refcounting lazy tlb option Message-ID: <20211105203854.-FBJlVYYh%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: B5409801A8BF X-Stat-Signature: hjgrw7j6xhbjpwhht71ciwmypgw6y7a8 Authentication-Results: imf06.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="zMx7hY0/"; spf=pass (imf06.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none 
X-HE-Tag: 1636144735-922281 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Nicholas Piggin Subject: lazy tlb: shoot lazies, a non-refcounting lazy tlb option On big systems, the mm refcount can become highly contended when doing a lot of context switching with threaded applications (particularly switching between the idle thread and an application thread). Abandoning lazy tlb slows switching down quite a bit in the important user->idle->user cases, so instead implement a non-refcounted scheme that causes __mmdrop() to IPI all CPUs in the mm_cpumask and shoot down any remaining lazy ones. Shootdown IPIs are of some concern, but they have not been observed to be a big problem with this scheme (the powerpc implementation generated 314 additional interrupts on a 144 CPU system during a kernel compile). There are a number of strategies that could be employed to reduce IPIs if they turn out to be a problem for some workload. [npiggin@gmail.com: update comments] Link: https://lkml.kernel.org/r/1623121901.mszkmmum0n.astroid@bobo.none Link: https://lkml.kernel.org/r/20210605014216.446867-4-npiggin@gmail.com Signed-off-by: Nicholas Piggin Cc: Anton Blanchard Cc: Andy Lutomirski Cc: Randy Dunlap Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Signed-off-by: Andrew Morton --- arch/Kconfig | 14 +++++++++++++ kernel/fork.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 65 insertions(+) --- a/arch/Kconfig~lazy-tlb-shoot-lazies-a-non-refcounting-lazy-tlb-option +++ a/arch/Kconfig @@ -441,6 +441,20 @@ config ARCH_WANT_IRQS_OFF_ACTIVATE_MM # to a kthread ->active_mm (non-arch code has been converted already). config MMU_LAZY_TLB_REFCOUNT def_bool y + depends on !MMU_LAZY_TLB_SHOOTDOWN + +# This option allows MMU_LAZY_TLB_REFCOUNT=n.
It ensures no CPUs are using an +# mm as a lazy tlb beyond its last reference count, by shooting down these +# users before the mm is deallocated. __mmdrop() first IPIs all CPUs that may +# be using the mm as a lazy tlb, so that they may switch themselves to using +# init_mm for their active mm. mm_cpumask(mm) is used to determine which CPUs +# may be using mm as a lazy tlb mm. +# +# To implement this, an arch must ensure mm_cpumask(mm) contains at least all +# possible CPUs in which the mm is lazy, and it must meet the requirements for +# MMU_LAZY_TLB_REFCOUNT=n (see above). +config MMU_LAZY_TLB_SHOOTDOWN + bool config ARCH_HAVE_NMI_SAFE_CMPXCHG bool --- a/kernel/fork.c~lazy-tlb-shoot-lazies-a-non-refcounting-lazy-tlb-option +++ a/kernel/fork.c @@ -686,6 +686,53 @@ static void check_mm(struct mm_struct *m #define allocate_mm() (kmem_cache_alloc(mm_cachep, GFP_KERNEL)) #define free_mm(mm) (kmem_cache_free(mm_cachep, (mm))) +static void do_shoot_lazy_tlb(void *arg) +{ + struct mm_struct *mm = arg; + + if (current->active_mm == mm) { + WARN_ON_ONCE(current->mm); + current->active_mm = &init_mm; + switch_mm(mm, &init_mm, current); + } +} + +static void do_check_lazy_tlb(void *arg) +{ + struct mm_struct *mm = arg; + + WARN_ON_ONCE(current->active_mm == mm); +} + +static void shoot_lazy_tlbs(struct mm_struct *mm) +{ + if (IS_ENABLED(CONFIG_MMU_LAZY_TLB_SHOOTDOWN)) { + /* + * IPI overheads have not been found to be expensive, but they could + * be reduced in a number of possible ways, for example (in + * roughly increasing order of complexity): + * - A batch of mms requiring IPIs could be gathered and freed + * at once. + * - CPUs could store their active mm somewhere that can be + * remotely checked without a lock, to filter out + * false-positives in the cpumask. + * - After mm_users or mm_count reaches zero, switching away + * from the mm could clear mm_cpumask to reduce some IPIs + * (some batching or delaying would help).
+ * - A delayed freeing and RCU-like quiescing sequence based on + * mm switching to avoid IPIs completely. + */ + on_each_cpu_mask(mm_cpumask(mm), do_shoot_lazy_tlb, (void *)mm, 1); + if (IS_ENABLED(CONFIG_DEBUG_VM)) + on_each_cpu(do_check_lazy_tlb, (void *)mm, 1); + } else { + /* + * In this case, lazy tlb mms are refcounted and would not reach + * __mmdrop until all CPUs have switched away and mmdrop()ed. + */ + } +} + /* * Called when the last reference to the mm * is dropped: either by a lazy thread or by @@ -695,6 +742,10 @@ void __mmdrop(struct mm_struct *mm) { BUG_ON(mm == &init_mm); WARN_ON_ONCE(mm == current->mm); + + /* Ensure no CPUs are using this as their lazy tlb mm */ + shoot_lazy_tlbs(mm); + WARN_ON_ONCE(mm == current->active_mm); mm_free_pgd(mm); destroy_context(mm); From patchwork Fri Nov 5 20:38:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605527 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 324C0C433FE for ; Fri, 5 Nov 2021 20:39:00 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D9BD860174 for ; Fri, 5 Nov 2021 20:38:59 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org D9BD860174 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 6F7EF94004E; Fri, 5 Nov 2021 16:38:59 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 6A714940049; Fri, 5 Nov 2021 16:38:59 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4101894004E; Fri,
5 Nov 2021 16:38:59 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0020.hostedemail.com [216.40.44.20]) by kanga.kvack.org (Postfix) with ESMTP id 29DEE940049 for ; Fri, 5 Nov 2021 16:38:59 -0400 (EDT) Received: from smtpin08.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id E65401802610F for ; Fri, 5 Nov 2021 20:38:58 +0000 (UTC) X-FDA: 78776040756.08.E4D1EEB Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf02.hostedemail.com (Postfix) with ESMTP id 4D162700177B for ; Fri, 5 Nov 2021 20:38:53 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id A7EB060174; Fri, 5 Nov 2021 20:38:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144738; bh=y1OO1QFMPYTHtzTSjbHoiur6l/+GGZAAa02xHkG0VW0=; h=Date:From:To:Subject:In-Reply-To:From; b=DM/9sJ0JB28ik53F4D81KszLNvpc+BY31pGGUF27jSx6J41SmkGq6BmD07e+i5vqP LXeYzoX8Z0RjVBKSovjWtkMYjsOaqOvUzZU2RbLOst/P9jn3ipiHBsnrgl0BHlvReX 0ZhnMX5YTCpkd4M0urFzButna37sUn88HnloKu/s= Date: Fri, 05 Nov 2021 13:38:57 -0700 From: Andrew Morton To: akpm@linux-foundation.org, anton@ozlabs.org, benh@kernel.crashing.org, linux-mm@kvack.org, luto@kernel.org, mm-commits@vger.kernel.org, npiggin@gmail.com, paulus@ozlabs.org, rdunlap@infradead.org, torvalds@linux-foundation.org Subject: [patch 082/262] powerpc/64s: enable MMU_LAZY_TLB_SHOOTDOWN Message-ID: <20211105203857.gaAdLZ-Vh%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 4D162700177B X-Stat-Signature: z3mb465ku4goxhpe6k9zobsenj34zeix Authentication-Results: imf02.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="DM/9sJ0J"; dmarc=none; spf=pass (imf02.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) 
smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144733-415452 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Nicholas Piggin Subject: powerpc/64s: enable MMU_LAZY_TLB_SHOOTDOWN On a 16-socket 192-core POWER8 system, a context switching benchmark with as many software threads as CPUs (so each switch will go in and out of idle), upstream can achieve a rate of about 1 million context switches per second. After this patch it goes up to 118 million. No real data for real world workloads unfortunately. I think it's always been a "known" cacheline, it just showed up badly on will-it-scale tests recently when Anton was doing a sweep of low hanging scalability issues on big systems. We have some very big systems running certain in-memory databases that get into very high contention conditions on mutexes that push context switch rates right up and with idle times pretty high, which would get a lot of parallel context switching between user and idle thread, we might be getting a bit of this contention there. It's not something at the top of profiles though. And on multi-threaded workloads like this, the normal refcounting of the user mm still has fundamental contention. It's tricky to get the change tested on these workloads (machine time is very limited and I can't drive the software). I suspect it could also show in things that do high net or disk IO rates (enough to need a lot of cores), and do some user processing steps along the way. You'd potentially get a lot of idle switching. This infrastructure could be beneficial to other architectures. The cacheline is going to bounce in the same situations on other archs, so I would say yes. Rik at one stage had some patches to try to avoid it for x86 some years ago, I don't know what happened to those.
The way powerpc has to maintain mm_cpumask for its TLB flushing makes it relatively easy to do this shootdown, and we decided the additional IPIs were less of a concern than the bouncing. Others have different concerns, but I tried to make it generic and add comments explaining what other archs can do, or possibly different ways it might be achieved. Link: https://lkml.kernel.org/r/20210605014216.446867-5-npiggin@gmail.com Signed-off-by: Nicholas Piggin Cc: Andy Lutomirski Cc: Anton Blanchard Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Randy Dunlap Signed-off-by: Andrew Morton --- arch/powerpc/Kconfig | 1 + 1 file changed, 1 insertion(+) --- a/arch/powerpc/Kconfig~powerpc-64s-enable-mmu_lazy_tlb_shootdown +++ a/arch/powerpc/Kconfig @@ -249,6 +249,7 @@ config PPC select IRQ_FORCED_THREADING select MMU_GATHER_PAGE_SIZE select MMU_GATHER_RCU_TABLE_FREE + select MMU_LAZY_TLB_SHOOTDOWN if PPC_BOOK3S_64 select MODULES_USE_ELF_RELA select NEED_DMA_MAP_STATE if PPC64 || NOT_COHERENT_CACHE select NEED_SG_DMA_LENGTH From patchwork Fri Nov 5 20:39:00 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605529 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3A100C433EF for ; Fri, 5 Nov 2021 20:39:03 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id E9B5860FBF for ; Fri, 5 Nov 2021 20:39:02 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org E9B5860FBF Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 8A3C194004F; Fri, 5 Nov 2021 16:39:02 -0400 
(EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 83B45940049; Fri, 5 Nov 2021 16:39:02 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6817294004F; Fri, 5 Nov 2021 16:39:02 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0048.hostedemail.com [216.40.44.48]) by kanga.kvack.org (Postfix) with ESMTP id 51D18940049 for ; Fri, 5 Nov 2021 16:39:02 -0400 (EDT) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 11FF0181C9673 for ; Fri, 5 Nov 2021 20:39:02 +0000 (UTC) X-FDA: 78776040882.09.7784C8F Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf24.hostedemail.com (Postfix) with ESMTP id ABF5CB0000B2 for ; Fri, 5 Nov 2021 20:39:01 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id AC56C60174; Fri, 5 Nov 2021 20:39:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144740; bh=z8Gm1/mAEeURZen3baG2cZYKbHFRwQEKqQkCaN+4no0=; h=Date:From:To:Subject:In-Reply-To:From; b=UrXQFGyuC6gC4SVQdJJdUX+FJeOXWJzbBHn3CnXRtScCCQtSMQB0Kzg2MZIkCA1Es r3kNlio5HtTw06dbg9U8m5LnOwJSzDiD+d+Z+39VFPVFghu3UqTxwRfxjqcQ6LZNLP wHSif08jrdm85Zzpyubqzo6dBy9o876poSEt1xh8= Date: Fri, 05 Nov 2021 13:39:00 -0700 From: Andrew Morton To: akpm@linux-foundation.org, dave.hansen@linux.intel.com, david@redhat.com, linux-mm@kvack.org, lukas.bulwahn@gmail.com, mhocko@suse.com, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 083/262] memory: remove unused CONFIG_MEM_BLOCK_SIZE Message-ID: <20211105203900.bo9SeDCE2%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: ABF5CB0000B2 X-Stat-Signature: 43qd3fbjgcqyhmzyq973abzugfi5595j Authentication-Results: imf24.hostedemail.com; 
dkim=pass header.d=linux-foundation.org header.s=korg header.b=UrXQFGyu; spf=pass (imf24.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144741-777427 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Lukas Bulwahn Subject: memory: remove unused CONFIG_MEM_BLOCK_SIZE Commit 3947be1969a9 ("[PATCH] memory hotplug: sysfs and add/remove functions") defines CONFIG_MEM_BLOCK_SIZE, but this has never been utilized anywhere. It is a good practice to keep the CONFIG_* defines exclusively for the Kbuild system. So, drop this unused definition. This issue was noticed due to running ./scripts/checkkconfigsymbols.py. Link: https://lkml.kernel.org/r/20211006120354.7468-1-lukas.bulwahn@gmail.com Signed-off-by: Lukas Bulwahn Reviewed-by: David Hildenbrand Cc: Michal Hocko Cc: Dave Hansen Signed-off-by: Andrew Morton --- include/linux/memory.h | 1 - 1 file changed, 1 deletion(-) --- a/include/linux/memory.h~memory-remove-unused-config_mem_block_size +++ a/include/linux/memory.h @@ -140,7 +140,6 @@ typedef int (*walk_memory_blocks_func_t) extern int walk_memory_blocks(unsigned long start, unsigned long size, void *arg, walk_memory_blocks_func_t func); extern int for_each_memory_block(void *arg, walk_memory_blocks_func_t func); -#define CONFIG_MEM_BLOCK_SIZE (PAGES_PER_SECTION< X-Patchwork-Id: 12605531 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 906ADC433EF for ; Fri, 5 Nov 2021 20:39:06 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 5F6E760174 for ; Fri, 5 Nov 2021 20:39:06 +0000 (UTC) 
DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 5F6E760174 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 08D36940050; Fri, 5 Nov 2021 16:39:06 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 03D70940049; Fri, 5 Nov 2021 16:39:05 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E4727940050; Fri, 5 Nov 2021 16:39:05 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0005.hostedemail.com [216.40.44.5]) by kanga.kvack.org (Postfix) with ESMTP id D25A4940049 for ; Fri, 5 Nov 2021 16:39:05 -0400 (EDT) Received: from smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 96A9618232A17 for ; Fri, 5 Nov 2021 20:39:05 +0000 (UTC) X-FDA: 78776041050.16.7043927 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf23.hostedemail.com (Postfix) with ESMTP id DBE4F90000A6 for ; Fri, 5 Nov 2021 20:38:51 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id AE0016056B; Fri, 5 Nov 2021 20:39:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144743; bh=kjdyWl6bfgEGl5f491e/pYQgwifhEzyB6NH8Q0LTYHo=; h=Date:From:To:Subject:In-Reply-To:From; b=H3Fp7F0cggfptGIjCRflLkByF630/2Zs0To0wwU92GG5Uw81sqKWjwemk6pRpuUMf vM0C85M9cBTKXiQqjNgRsVrWUKPXjdkSFUb0wqdE+RtGvnbhdeekjG4QDNZ6wloODB AzjckXuyg0eH3XpJ5dbI6nSP5SFtD57nyhTr1gcs= Date: Fri, 05 Nov 2021 13:39:03 -0700 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, liu.song11@zte.com.cn, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 084/262] mm/mprotect.c: avoid repeated assignment in do_mprotect_pkey() Message-ID: 
<20211105203903.w6-SbZ0O3%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: DBE4F90000A6 X-Stat-Signature: 8d8dcfk5fmwwdbai56kjxozgf9wnnaaw Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=H3Fp7F0c; dmarc=none; spf=pass (imf23.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144731-580430 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Liu Song Subject: mm/mprotect.c: avoid repeated assignment in do_mprotect_pkey() After adjustment, the repeated assignment of "prev" is avoided, and the readability of the code is improved. Link: https://lkml.kernel.org/r/20211012152444.4127-1-fishland@aliyun.com Reviewed-by: Andrew Morton Signed-off-by: Liu Song Signed-off-by: Andrew Morton --- mm/mprotect.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) --- a/mm/mprotect.c~mm-mprotectc-avoid-repeated-assignment-in-do_mprotect_pkey +++ a/mm/mprotect.c @@ -563,7 +563,7 @@ static int do_mprotect_pkey(unsigned lon error = -ENOMEM; if (!vma) goto out; - prev = vma->vm_prev; + if (unlikely(grows & PROT_GROWSDOWN)) { if (vma->vm_start >= end) goto out; @@ -581,8 +581,11 @@ static int do_mprotect_pkey(unsigned lon goto out; } } + if (start > vma->vm_start) prev = vma; + else + prev = vma->vm_prev; for (nstart = start ; ; ) { unsigned long mask_off_old_flags; From patchwork Fri Nov 5 20:39:06 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605533 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from 
mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 18951C433F5 for ; Fri, 5 Nov 2021 20:39:10 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B14DD6056B for ; Fri, 5 Nov 2021 20:39:09 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org B14DD6056B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 4F98A940051; Fri, 5 Nov 2021 16:39:09 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 4A9D2940049; Fri, 5 Nov 2021 16:39:09 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3998F940051; Fri, 5 Nov 2021 16:39:09 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0117.hostedemail.com [216.40.44.117]) by kanga.kvack.org (Postfix) with ESMTP id 296D1940049 for ; Fri, 5 Nov 2021 16:39:09 -0400 (EDT) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id E9936182D6989 for ; Fri, 5 Nov 2021 20:39:08 +0000 (UTC) X-FDA: 78776041176.11.A68D712 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf18.hostedemail.com (Postfix) with ESMTP id 7DBCB4002085 for ; Fri, 5 Nov 2021 20:39:08 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id DB004611C4; Fri, 5 Nov 2021 20:39:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144747; bh=t1NfH4zixgoDx6FI8US0iGgt2wM5VMISQnRde6+NfsA=; h=Date:From:To:Subject:In-Reply-To:From; b=BDdn1xA9U2qAlw+dms1yGBtjUpfBTp1t7c9z0Ac0VJv9iMPORNSTmOERRwswYilYU AruW2vsWKhKiNH4/1y8b2afvVJaJ8JoNgECA02Ib2O0abGj8PceqE9cpro8QHxKgvl L19k/U6zqH4g0LmEFqucND+KmNWLUIAmm4mU4jUA= Date: 
Fri, 05 Nov 2021 13:39:06 -0700 From: Andrew Morton To: akpm@linux-foundation.org, bgeffon@google.com, catalin.marinas@arm.com, chenwandun@huawei.com, dan.carpenter@oracle.com, dan.j.williams@intel.com, dave.jiang@intel.com, dima@arista.com, hughd@google.com, jgg@ziepe.ca, jhubbard@nvidia.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, linux@armlinux.org.uk, luto@kernel.org, mike.kravetz@oracle.com, minchan@kernel.org, mingo@redhat.com, mm-commits@vger.kernel.org, rcampbell@nvidia.com, tglx@linutronix.de, torvalds@linux-foundation.org, tsbogend@alpha.franken.de, vbabka@suse.cz, viro@zeniv.linux.org.uk, vishal.l.verma@intel.com, wangkefeng.wang@huawei.com, weiyongjun1@huawei.com, will@kernel.org Subject: [patch 085/262] mm/mremap: don't account pages in vma_to_resize() Message-ID: <20211105203906.uzcmwCsFU%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=BDdn1xA9; dmarc=none; spf=pass (imf18.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 7DBCB4002085 X-Stat-Signature: ddgu9ins3e6hgf78naya4m7ac1dxqyyh X-HE-Tag: 1636144748-382038 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Dmitry Safonov Subject: mm/mremap: don't account pages in vma_to_resize() All this vm_unacct_memory(charged) dance seems to complicate life without good reason. Furthermore, it does not always seem to be done right on the error paths in mremap_to().
And worse than that: this `charged' difference is sometimes double-accounted for growing MREMAP_DONTUNMAP mremap()s in move_vma(): if (security_vm_enough_memory_mm(mm, new_len >> PAGE_SHIFT)) Let's not do this. Account memory in mremap() fast-path for growing VMAs or in move_vma() for actually moving things. The same simpler way as it's done by vm_stat_account(), but with a difference to call security_vm_enough_memory_mm() before copying/adjusting VMA. Originally noticed by Chen Wandun: https://lkml.kernel.org/r/20210717101942.120607-1-chenwandun@huawei.com Link: https://lkml.kernel.org/r/20210721131320.522061-1-dima@arista.com Fixes: e346b3813067 ("mm/mremap: add MREMAP_DONTUNMAP to mremap()") Signed-off-by: Dmitry Safonov Acked-by: Brian Geffon Cc: Alexander Viro Cc: Andy Lutomirski Cc: Catalin Marinas Cc: Chen Wandun Cc: Dan Carpenter Cc: Dan Williams Cc: Dave Jiang Cc: Hugh Dickins Cc: Ingo Molnar Cc: Jason Gunthorpe Cc: John Hubbard Cc: Kefeng Wang Cc: "Kirill A. Shutemov" Cc: Mike Kravetz Cc: Minchan Kim Cc: Ralph Campbell Cc: Russell King Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Vishal Verma Cc: Vlastimil Babka Cc: Wei Yongjun Cc: Will Deacon Signed-off-by: Andrew Morton --- mm/mremap.c | 50 ++++++++++++++++++++++---------------------------- 1 file changed, 22 insertions(+), 28 deletions(-) --- a/mm/mremap.c~mm-mremap-dont-account-pages-in-vma_to_resize +++ a/mm/mremap.c @@ -565,6 +565,7 @@ static unsigned long move_vma(struct vm_ bool *locked, unsigned long flags, struct vm_userfaultfd_ctx *uf, struct list_head *uf_unmap) { + long to_account = new_len - old_len; struct mm_struct *mm = vma->vm_mm; struct vm_area_struct *new_vma; unsigned long vm_flags = vma->vm_flags; @@ -583,6 +584,9 @@ static unsigned long move_vma(struct vm_ if (mm->map_count >= sysctl_max_map_count - 3) return -ENOMEM; + if (unlikely(flags & MREMAP_DONTUNMAP)) + to_account = new_len; + if (vma->vm_ops && vma->vm_ops->may_split) { if (vma->vm_start != old_addr) err = 
vma->vm_ops->may_split(vma, old_addr); @@ -604,8 +608,8 @@ static unsigned long move_vma(struct vm_ if (err) return err; - if (unlikely(flags & MREMAP_DONTUNMAP && vm_flags & VM_ACCOUNT)) { - if (security_vm_enough_memory_mm(mm, new_len >> PAGE_SHIFT)) + if (vm_flags & VM_ACCOUNT) { + if (security_vm_enough_memory_mm(mm, to_account >> PAGE_SHIFT)) return -ENOMEM; } @@ -613,8 +617,8 @@ static unsigned long move_vma(struct vm_ new_vma = copy_vma(&vma, new_addr, new_len, new_pgoff, &need_rmap_locks); if (!new_vma) { - if (unlikely(flags & MREMAP_DONTUNMAP && vm_flags & VM_ACCOUNT)) - vm_unacct_memory(new_len >> PAGE_SHIFT); + if (vm_flags & VM_ACCOUNT) + vm_unacct_memory(to_account >> PAGE_SHIFT); return -ENOMEM; } @@ -708,8 +712,7 @@ static unsigned long move_vma(struct vm_ } static struct vm_area_struct *vma_to_resize(unsigned long addr, - unsigned long old_len, unsigned long new_len, unsigned long flags, - unsigned long *p) + unsigned long old_len, unsigned long new_len, unsigned long flags) { struct mm_struct *mm = current->mm; struct vm_area_struct *vma; @@ -768,13 +771,6 @@ static struct vm_area_struct *vma_to_res (new_len - old_len) >> PAGE_SHIFT)) return ERR_PTR(-ENOMEM); - if (vma->vm_flags & VM_ACCOUNT) { - unsigned long charged = (new_len - old_len) >> PAGE_SHIFT; - if (security_vm_enough_memory_mm(mm, charged)) - return ERR_PTR(-ENOMEM); - *p = charged; - } - return vma; } @@ -787,7 +783,6 @@ static unsigned long mremap_to(unsigned struct mm_struct *mm = current->mm; struct vm_area_struct *vma; unsigned long ret = -EINVAL; - unsigned long charged = 0; unsigned long map_flags = 0; if (offset_in_page(new_addr)) @@ -830,7 +825,7 @@ static unsigned long mremap_to(unsigned old_len = new_len; } - vma = vma_to_resize(addr, old_len, new_len, flags, &charged); + vma = vma_to_resize(addr, old_len, new_len, flags); if (IS_ERR(vma)) { ret = PTR_ERR(vma); goto out; @@ -853,7 +848,7 @@ static unsigned long mremap_to(unsigned ((addr - vma->vm_start) >> PAGE_SHIFT), 
map_flags); if (IS_ERR_VALUE(ret)) - goto out1; + goto out; /* We got a new mapping */ if (!(flags & MREMAP_FIXED)) @@ -862,12 +857,6 @@ static unsigned long mremap_to(unsigned ret = move_vma(vma, addr, old_len, new_len, new_addr, locked, flags, uf, uf_unmap); - if (!(offset_in_page(ret))) - goto out; - -out1: - vm_unacct_memory(charged); - out: return ret; } @@ -899,7 +888,6 @@ SYSCALL_DEFINE5(mremap, unsigned long, a struct mm_struct *mm = current->mm; struct vm_area_struct *vma; unsigned long ret = -EINVAL; - unsigned long charged = 0; bool locked = false; bool downgraded = false; struct vm_userfaultfd_ctx uf = NULL_VM_UFFD_CTX; @@ -981,7 +969,7 @@ SYSCALL_DEFINE5(mremap, unsigned long, a /* * Ok, we need to grow.. */ - vma = vma_to_resize(addr, old_len, new_len, flags, &charged); + vma = vma_to_resize(addr, old_len, new_len, flags); if (IS_ERR(vma)) { ret = PTR_ERR(vma); goto out; @@ -992,10 +980,18 @@ SYSCALL_DEFINE5(mremap, unsigned long, a if (old_len == vma->vm_end - addr) { /* can we just expand the current mapping? 
*/ if (vma_expandable(vma, new_len - old_len)) { - int pages = (new_len - old_len) >> PAGE_SHIFT; + long pages = (new_len - old_len) >> PAGE_SHIFT; + + if (vma->vm_flags & VM_ACCOUNT) { + if (security_vm_enough_memory_mm(mm, pages)) { + ret = -ENOMEM; + goto out; + } + } if (vma_adjust(vma, vma->vm_start, addr + new_len, vma->vm_pgoff, NULL)) { + vm_unacct_memory(pages); ret = -ENOMEM; goto out; } @@ -1034,10 +1030,8 @@ SYSCALL_DEFINE5(mremap, unsigned long, a &locked, flags, &uf, &uf_unmap); } out: - if (offset_in_page(ret)) { - vm_unacct_memory(charged); + if (offset_in_page(ret)) locked = false; - } if (downgraded) mmap_read_unlock(current->mm); else From patchwork Fri Nov 5 20:39:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605535 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 10986C433F5 for ; Fri, 5 Nov 2021 20:39:13 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id BEF7961252 for ; Fri, 5 Nov 2021 20:39:12 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org BEF7961252 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 60BC9940052; Fri, 5 Nov 2021 16:39:12 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 5BCC3940049; Fri, 5 Nov 2021 16:39:12 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 45BCE940052; Fri, 5 Nov 2021 16:39:12 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com 
(smtprelay0242.hostedemail.com [216.40.44.242]) by kanga.kvack.org (Postfix) with ESMTP id 2C8F4940049 for ; Fri, 5 Nov 2021 16:39:12 -0400 (EDT) Received: from smtpin30.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id E03A67798F for ; Fri, 5 Nov 2021 20:39:11 +0000 (UTC) X-FDA: 78776041302.30.A88B1D2 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf07.hostedemail.com (Postfix) with ESMTP id 7721D10000AB for ; Fri, 5 Nov 2021 20:39:11 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 83BF66056B; Fri, 5 Nov 2021 20:39:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144750; bh=p6w+94iaxs7+r544PijRic/QocSyQwil4huhbWvkmJM=; h=Date:From:To:Subject:In-Reply-To:From; b=jbwechSTOKscOUsBf5zT7zlLyDCobpI1l2xFl3Ky0D1ss/M63QU48lkPaVV/2eA1R +ZUqNqx19ewOG/8k+Li1CsMeSSNw9e+I0uIVktWyKGHwTHR/pDycahOm7Skkl+74RN pYwZuoDSUvhPJ+/yrK7P9Ro7N0e/w5QaEOLQe+kU= Date: Fri, 05 Nov 2021 13:39:10 -0700 From: Andrew Morton To: akpm@linux-foundation.org, chris@chris-wilson.co.uk, daniel.vetter@ffwll.ch, joonas.lahtinen@linux.intel.com, linux-mm@kvack.org, lucas.demarchi@intel.com, mm-commits@vger.kernel.org, peterz@infradead.org, torvalds@linux-foundation.org Subject: [patch 086/262] include/linux/io-mapping.h: remove fallback for writecombine Message-ID: <20211105203910.DxiEtivZD%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=jbwechST; dmarc=none; spf=pass (imf07.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 7721D10000AB X-Stat-Signature: nac1zr7zdhqpzjrmhj41yzdno4gjse9o X-HE-Tag: 1636144751-940251 X-Bogosity: Ham, 
tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Lucas De Marchi Subject: include/linux/io-mapping.h: remove fallback for writecombine The fallback was introduced in commit 80c33624e472 ("io-mapping: Fixup for different names of writecombine") to fix the build on microblaze. 5 years later, it seems all archs now provide a pgprot_writecombine(), so just remove the other possible fallbacks. For microblaze, pgprot_writecombine() is available since commit 97ccedd793ac ("microblaze: Provide pgprot_device/writecombine macros for nommu"). This is build-tested on microblaze with a hack to always build mm/io-mapping.o and without DIYing on an x86-only macro (_PAGE_CACHE_MASK) Link: https://lkml.kernel.org/r/20211020204838.1142908-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi Cc: Chris Wilson Cc: Daniel Vetter Cc: Joonas Lahtinen Cc: Peter Zijlstra Signed-off-by: Andrew Morton --- include/linux/io-mapping.h | 6 ------ 1 file changed, 6 deletions(-) --- a/include/linux/io-mapping.h~io-mapping-remove-fallback-for-writecombine +++ a/include/linux/io-mapping.h @@ -132,13 +132,7 @@ io_mapping_init_wc(struct io_mapping *io iomap->base = base; iomap->size = size; -#if defined(pgprot_noncached_wc) /* archs can't agree on a name ... 
*/ - iomap->prot = pgprot_noncached_wc(PAGE_KERNEL); -#elif defined(pgprot_writecombine) iomap->prot = pgprot_writecombine(PAGE_KERNEL); -#else - iomap->prot = pgprot_noncached(PAGE_KERNEL); -#endif return iomap; } From patchwork Fri Nov 5 20:39:13 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605537 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1AC88C433FE for ; Fri, 5 Nov 2021 20:39:16 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C61E16056B for ; Fri, 5 Nov 2021 20:39:15 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org C61E16056B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 6807C940053; Fri, 5 Nov 2021 16:39:15 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 6301F940049; Fri, 5 Nov 2021 16:39:15 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4AA5C940053; Fri, 5 Nov 2021 16:39:15 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0125.hostedemail.com [216.40.44.125]) by kanga.kvack.org (Postfix) with ESMTP id 39735940049 for ; Fri, 5 Nov 2021 16:39:15 -0400 (EDT) Received: from smtpin20.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id DF4777799E for ; Fri, 5 Nov 2021 20:39:14 +0000 (UTC) X-FDA: 78776041428.20.85D5B31 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf13.hostedemail.com (Postfix) with ESMTP 
id 7454D104AAE9 for ; Fri, 5 Nov 2021 20:39:05 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 8C997611C4; Fri, 5 Nov 2021 20:39:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144753; bh=rtwuP7hVZi1gtDgBjBy0ZFmr7NoKm/yrpYBknmJAcgI=; h=Date:From:To:Subject:In-Reply-To:From; b=jevt8ezyR/NAC9Jdo0allCuWrkP2GRGiwnMFKr9jEeKZmSW/p3/SiOXNPd4d124Ri sp51P1xiyL75YJUJGS+Z549WSh3eEqvEUm27Zcts5Xqup5XLdtXTZqVUTwtZc/eC9k Ewpi7iLOkPPTf4f1SX4cWI57lGSPc4Zd1IGdp1nk= Date: Fri, 05 Nov 2021 13:39:13 -0700 From: Andrew Morton To: akpm@linux-foundation.org, axelrasmussen@google.com, ligang.bdlg@bytedance.com, linux-mm@kvack.org, mingo@redhat.com, mm-commits@vger.kernel.org, rostedt@goodmis.org, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 087/262] mm: mmap_lock: remove redundant newline in TP_printk Message-ID: <20211105203913.NmX2MeX7r%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 7454D104AAE9 X-Stat-Signature: 6kuzjnoujhdz58az5iq3h5zfunf1s61t Authentication-Results: imf13.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=jevt8ezy; dmarc=none; spf=pass (imf13.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144745-270980 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Gang Li Subject: mm: mmap_lock: remove redundant newline in TP_printk The ftrace core adds a newline automatically when printing, so a trailing newline in TP_printk creates a blank line.
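For illustration only, the effect described above can be sketched in userspace C. This is not kernel code: emit_trace_line() is a hypothetical helper that merely mimics a trace core appending its own newline after each formatted event, and it assumes nothing about the real ftrace internals.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for a trace core (an assumption of this sketch):
 * copy the already-formatted event text into buf and append exactly one
 * newline, as the commit message says ftrace does when printing.
 * Returns the number of '\n' characters in the resulting line.
 */
static int emit_trace_line(char *buf, size_t len, const char *event_text)
{
	int newlines = 0;

	assert(len > 0);
	snprintf(buf, len, "%s\n", event_text);
	for (const char *p = buf; *p; p++)
		if (*p == '\n')
			newlines++;
	return newlines;
}
```

An event text that already ends in a newline yields two newlines, i.e. a blank line after the event; without the trailing newline the helper reports exactly one.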
Link: https://lkml.kernel.org/r/20211009071105.69544-1-ligang.bdlg@bytedance.com Signed-off-by: Gang Li Acked-by: Vlastimil Babka Reviewed-by: Steven Rostedt (VMware) Cc: Ingo Molnar Cc: Axel Rasmussen Signed-off-by: Andrew Morton --- include/trace/events/mmap_lock.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) --- a/include/trace/events/mmap_lock.h~mm-mmap_lock-remove-redundant-newline-in-tp_printk +++ a/include/trace/events/mmap_lock.h @@ -32,7 +32,7 @@ TRACE_EVENT_FN(mmap_lock_start_locking, ), TP_printk( - "mm=%p memcg_path=%s write=%s\n", + "mm=%p memcg_path=%s write=%s", __entry->mm, __get_str(memcg_path), __entry->write ? "true" : "false" @@ -63,7 +63,7 @@ TRACE_EVENT_FN(mmap_lock_acquire_returne ), TP_printk( - "mm=%p memcg_path=%s write=%s success=%s\n", + "mm=%p memcg_path=%s write=%s success=%s", __entry->mm, __get_str(memcg_path), __entry->write ? "true" : "false", @@ -92,7 +92,7 @@ TRACE_EVENT_FN(mmap_lock_released, ), TP_printk( - "mm=%p memcg_path=%s write=%s\n", + "mm=%p memcg_path=%s write=%s", __entry->mm, __get_str(memcg_path), __entry->write ? 
"true" : "false" From patchwork Fri Nov 5 20:39:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605539 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DA309C433EF for ; Fri, 5 Nov 2021 20:39:18 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 8B6506056B for ; Fri, 5 Nov 2021 20:39:18 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 8B6506056B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 2B0CB940054; Fri, 5 Nov 2021 16:39:18 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 25F7A940049; Fri, 5 Nov 2021 16:39:18 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 14E77940054; Fri, 5 Nov 2021 16:39:18 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0171.hostedemail.com [216.40.44.171]) by kanga.kvack.org (Postfix) with ESMTP id 078E3940049 for ; Fri, 5 Nov 2021 16:39:18 -0400 (EDT) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id BF8C118147E44 for ; Fri, 5 Nov 2021 20:39:17 +0000 (UTC) X-FDA: 78776041554.24.91798B6 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf14.hostedemail.com (Postfix) with ESMTP id 3641A6001997 for ; Fri, 5 Nov 2021 20:39:18 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 7D73961252; Fri, 5 Nov 2021 20:39:16 +0000 (UTC) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144756; bh=t2MGmxM33Ivo+QEkc8LhpU9OZll/v//8O0tpkrCJWaQ=; h=Date:From:To:Subject:In-Reply-To:From; b=ysWuXp0/FCL1LrSCPabwIQqJfRPescr8nmqA1GDlfvW+ZUK6D9DgXQ4DUVheSh0aQ hLuDkL3W23mcY/LTxnI0czSjgk659X9HpYpCueaYG9ggSK8OiEQGqjh6k13R3cfqQe 7o6ehGLKa/C13I7AWilCWXn1csp+ewd7ZzQLNCwk= Date: Fri, 05 Nov 2021 13:39:16 -0700 From: Andrew Morton To: akpm@linux-foundation.org, axelrasmussen@google.com, ligang.bdlg@bytedance.com, linux-mm@kvack.org, mingo@redhat.com, mm-commits@vger.kernel.org, rostedt@goodmis.org, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 088/262] mm: mmap_lock: use DECLARE_EVENT_CLASS and DEFINE_EVENT_FN Message-ID: <20211105203916.x6uvI3JuF%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 3641A6001997 X-Stat-Signature: ipjy4ae39m5cfso4ip1k9q4p9xjnhnsz Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="ysWuXp0/"; dmarc=none; spf=pass (imf14.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144758-917134 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Gang Li Subject: mm: mmap_lock: use DECLARE_EVENT_CLASS and DEFINE_EVENT_FN By using DECLARE_EVENT_CLASS and DEFINE_EVENT_FN, we can remove a lot of duplicate code.
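As a rough userspace analogy of the class/event split (not the kernel's tracing macros), the idea can be sketched as follows. DECLARE_DEMO_CLASS and DEFINE_DEMO_EVENT are hypothetical names invented for this sketch: the class carries the shared formatting once, and each event is a thin named wrapper around it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Userspace analogy only: these macros are hypothetical and do not
 * reproduce the kernel's DECLARE_EVENT_CLASS / DEFINE_EVENT_FN.
 *
 * The "class" defines the shared formatting logic exactly once.
 */
#define DECLARE_DEMO_CLASS(cls)						\
	static int cls##_format(char *buf, size_t len, const char *event, \
				const char *memcg_path, bool write)	\
	{								\
		assert(len > 0);					\
		return snprintf(buf, len, "%s: memcg_path=%s write=%s",	\
				event, memcg_path,			\
				write ? "true" : "false");		\
	}

/* Each "event" is a thin named wrapper that reuses the class. */
#define DEFINE_DEMO_EVENT(cls, name)					\
	static int trace_##name(char *buf, size_t len,			\
				const char *memcg_path, bool write)	\
	{								\
		return cls##_format(buf, len, #name, memcg_path, write); \
	}

DECLARE_DEMO_CLASS(mmap_lock)
DEFINE_DEMO_EVENT(mmap_lock, mmap_lock_start_locking)
DEFINE_DEMO_EVENT(mmap_lock, mmap_lock_released)
```

Adding a third event with the same fields would then cost one DEFINE_DEMO_EVENT line instead of another full copy of the formatting code, which is the kind of saving the patch is after.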
Link: https://lkml.kernel.org/r/20211009071243.70286-1-ligang.bdlg@bytedance.com Signed-off-by: Gang Li Acked-by: Vlastimil Babka Reviewed-by: Steven Rostedt (VMware) Cc: Axel Rasmussen Cc: Ingo Molnar Signed-off-by: Andrew Morton --- include/trace/events/mmap_lock.h | 44 +++++++---------------------- 1 file changed, 12 insertions(+), 32 deletions(-) --- a/include/trace/events/mmap_lock.h~mm-mmap_lock-use-declare_event_class-and-define_event_fn +++ a/include/trace/events/mmap_lock.h @@ -13,7 +13,7 @@ struct mm_struct; extern int trace_mmap_lock_reg(void); extern void trace_mmap_lock_unreg(void); -TRACE_EVENT_FN(mmap_lock_start_locking, +DECLARE_EVENT_CLASS(mmap_lock, TP_PROTO(struct mm_struct *mm, const char *memcg_path, bool write), @@ -36,11 +36,19 @@ TRACE_EVENT_FN(mmap_lock_start_locking, __entry->mm, __get_str(memcg_path), __entry->write ? "true" : "false" - ), - - trace_mmap_lock_reg, trace_mmap_lock_unreg + ) ); +#define DEFINE_MMAP_LOCK_EVENT(name) \ + DEFINE_EVENT_FN(mmap_lock, name, \ + TP_PROTO(struct mm_struct *mm, const char *memcg_path, \ + bool write), \ + TP_ARGS(mm, memcg_path, write), \ + trace_mmap_lock_reg, trace_mmap_lock_unreg) + +DEFINE_MMAP_LOCK_EVENT(mmap_lock_start_locking); +DEFINE_MMAP_LOCK_EVENT(mmap_lock_released); + TRACE_EVENT_FN(mmap_lock_acquire_returned, TP_PROTO(struct mm_struct *mm, const char *memcg_path, bool write, @@ -71,34 +79,6 @@ TRACE_EVENT_FN(mmap_lock_acquire_returne ), trace_mmap_lock_reg, trace_mmap_lock_unreg -); - -TRACE_EVENT_FN(mmap_lock_released, - - TP_PROTO(struct mm_struct *mm, const char *memcg_path, bool write), - - TP_ARGS(mm, memcg_path, write), - - TP_STRUCT__entry( - __field(struct mm_struct *, mm) - __string(memcg_path, memcg_path) - __field(bool, write) - ), - - TP_fast_assign( - __entry->mm = mm; - __assign_str(memcg_path, memcg_path); - __entry->write = write; - ), - - TP_printk( - "mm=%p memcg_path=%s write=%s", - __entry->mm, - __get_str(memcg_path), - __entry->write ? 
"true" : "false" - ), - - trace_mmap_lock_reg, trace_mmap_lock_unreg ); #endif /* _TRACE_MMAP_LOCK_H */ From patchwork Fri Nov 5 20:39:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605541 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3467BC433EF for ; Fri, 5 Nov 2021 20:39:22 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id DB02860FBF for ; Fri, 5 Nov 2021 20:39:21 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org DB02860FBF Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 74595940055; Fri, 5 Nov 2021 16:39:21 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 6CC4A940049; Fri, 5 Nov 2021 16:39:21 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5BBC7940055; Fri, 5 Nov 2021 16:39:21 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0048.hostedemail.com [216.40.44.48]) by kanga.kvack.org (Postfix) with ESMTP id 48D99940049 for ; Fri, 5 Nov 2021 16:39:21 -0400 (EDT) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 1839F779AA for ; Fri, 5 Nov 2021 20:39:21 +0000 (UTC) X-FDA: 78776041722.12.D30C402 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf28.hostedemail.com (Postfix) with ESMTP id 8288C90000B2 for ; Fri, 5 Nov 2021 20:39:20 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA 
id 8686C611C0; Fri, 5 Nov 2021 20:39:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144759; bh=aBcUhrqpaJfBMDAJwgABLlQ6kUsrlKOTpt0nhKo1fx0=; h=Date:From:To:Subject:In-Reply-To:From; b=Y0MQvbuGHcRN6vld3QDLCbDu1mu3cq2sDehCX0Y7bZbjkklgJtlhWwxPn60laXbBn 1FaxwcRTa3T0xvm+OY4fTmHbIQk8aZKnYjT381aJrMjNjvSEFmdux6oiTDsQpJyeS2 6c3cRPxEOYYygiXcKOoOKO996oXhPb92vHVI40C4= Date: Fri, 05 Nov 2021 13:39:19 -0700 From: Andrew Morton To: akpm@linux-foundation.org, hch@lst.de, linux-mm@kvack.org, mm-commits@vger.kernel.org, songmuchun@bytedance.com, torvalds@linux-foundation.org, urezki@gmail.com, vvs@virtuozzo.com Subject: [patch 089/262] mm/vmalloc: repair warn_alloc()s in __vmalloc_area_node() Message-ID: <20211105203919.fOpfiPRMV%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Y0MQvbuG; dmarc=none; spf=pass (imf28.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 8288C90000B2 X-Stat-Signature: o39ea65npzd3n7mxsxtq3b3pi1j94yn7 X-HE-Tag: 1636144760-661069 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Vasily Averin Subject: mm/vmalloc: repair warn_alloc()s in __vmalloc_area_node() Commit f255935b9767 ("mm: cleanup the gfp_mask handling in __vmalloc_area_node") added __GFP_NOWARN to gfp_mask unconditionally however it disabled all output inside warn_alloc() call. This patch saves original gfp_mask and provides it to all warn_alloc() calls. 
Link: https://lkml.kernel.org/r/f4f3187b-9684-e426-565d-827c2a9bbb0e@virtuozzo.com Fixes: f255935b9767 ("mm: cleanup the gfp_mask handling in __vmalloc_area_node") Signed-off-by: Vasily Averin Reviewed-by: Christoph Hellwig Reviewed-by: Muchun Song Cc: Uladzislau Rezki (Sony) Signed-off-by: Andrew Morton --- mm/vmalloc.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) --- a/mm/vmalloc.c~mm-vmalloc-repair-warn_allocs-in-__vmalloc_area_node +++ a/mm/vmalloc.c @@ -2887,6 +2887,7 @@ static void *__vmalloc_area_node(struct int node) { const gfp_t nested_gfp = (gfp_mask & GFP_RECLAIM_MASK) | __GFP_ZERO; + const gfp_t orig_gfp_mask = gfp_mask; unsigned long addr = (unsigned long)area->addr; unsigned long size = get_vm_area_size(area); unsigned long array_size; @@ -2907,7 +2908,7 @@ static void *__vmalloc_area_node(struct } if (!area->pages) { - warn_alloc(gfp_mask, NULL, + warn_alloc(orig_gfp_mask, NULL, "vmalloc error: size %lu, failed to allocated page array size %lu", nr_small_pages * PAGE_SIZE, array_size); free_vm_area(area); @@ -2927,7 +2928,7 @@ static void *__vmalloc_area_node(struct * allocation request, free them via __vfree() if any. 
*/ if (area->nr_pages != nr_small_pages) { - warn_alloc(gfp_mask, NULL, + warn_alloc(orig_gfp_mask, NULL, "vmalloc error: size %lu, page order %u, failed to allocate pages", area->nr_pages * PAGE_SIZE, page_order); goto fail; @@ -2935,7 +2936,7 @@ static void *__vmalloc_area_node(struct if (vmap_pages_range(addr, addr + size, prot, area->pages, page_shift) < 0) { - warn_alloc(gfp_mask, NULL, + warn_alloc(orig_gfp_mask, NULL, "vmalloc error: size %lu, failed to map pages", area->nr_pages * PAGE_SIZE); goto fail; From patchwork Fri Nov 5 20:39:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605543 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 32E61C433F5 for ; Fri, 5 Nov 2021 20:39:25 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id DE7736056B for ; Fri, 5 Nov 2021 20:39:24 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org DE7736056B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 76D42940056; Fri, 5 Nov 2021 16:39:24 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 68291940049; Fri, 5 Nov 2021 16:39:24 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5205F940056; Fri, 5 Nov 2021 16:39:24 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0233.hostedemail.com [216.40.44.233]) by kanga.kvack.org (Postfix) with ESMTP id 3182A940049 for ; Fri, 5 Nov 2021 16:39:24 -0400 (EDT) Received: from 
smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id E72217798B for ; Fri, 5 Nov 2021 20:39:23 +0000 (UTC) X-FDA: 78776041806.24.8B69061 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf17.hostedemail.com (Postfix) with ESMTP id 98DA2F0003A3 for ; Fri, 5 Nov 2021 20:39:23 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 914A06056B; Fri, 5 Nov 2021 20:39:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144762; bh=w5HZz/nvaFNsq3igD47VFQavMrp+gHMdvsVCo8mPFbE=; h=Date:From:To:Subject:In-Reply-To:From; b=DiYUdMdWnAa/FcSkNwBMfMl+/77/yCJWzVDgCFNNurneaTphJVv5r4DqimGt7VxDe PUeaOOg03YWykBP35Ah57dvdsv9L/U4Riq7bH0zTZwImMMJbFnCsBCk1AJ5tbCj5Nr ToQNkNi3YizPXWXRA999NCYX/6D10EUFlQ6sviN0= Date: Fri, 05 Nov 2021 13:39:22 -0700 From: Andrew Morton To: akpm@linux-foundation.org, andreyknvl@gmail.com, david@redhat.com, hch@lst.de, keescook@chromium.org, linux-mm@kvack.org, mgorman@suse.de, mm-commits@vger.kernel.org, peterz@infradead.org, torvalds@linux-foundation.org, urezki@gmail.com, will@kernel.org Subject: [patch 090/262] mm/vmalloc: don't allow VM_NO_GUARD on vmap() Message-ID: <20211105203922.X-KS3cTcm%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 98DA2F0003A3 X-Stat-Signature: qm174tpdhkx3nu9wfhcfnse84decfszf Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=DiYUdMdW; dmarc=none; spf=pass (imf17.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144763-227793 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Peter 
Zijlstra Subject: mm/vmalloc: don't allow VM_NO_GUARD on vmap() The vmalloc guard pages are added on top of each allocation, thereby isolating any two allocations from one another. The top guard of the lower allocation is the bottom guard of the higher allocation etc. Therefore VM_NO_GUARD is dangerous; it breaks the basic premise of isolating separate allocations. There are only two in-tree users of this flag, neither of which uses it through the exported interface. Ensure it stays this way. Link: https://lkml.kernel.org/r/YUMfdA36fuyZ+/xt@hirez.programming.kicks-ass.net Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Christoph Hellwig Reviewed-by: David Hildenbrand Acked-by: Will Deacon Acked-by: Kees Cook Cc: Andrey Konovalov Cc: Mel Gorman Cc: Uladzislau Rezki Signed-off-by: Andrew Morton --- include/linux/vmalloc.h | 2 +- mm/vmalloc.c | 7 +++++++ 2 files changed, 8 insertions(+), 1 deletion(-) --- a/include/linux/vmalloc.h~mm-vmalloc-dont-allow-vm_no_guard-on-vmap +++ a/include/linux/vmalloc.h @@ -22,7 +22,7 @@ struct notifier_block; /* in notifier.h #define VM_USERMAP 0x00000008 /* suitable for remap_vmalloc_range */ #define VM_DMA_COHERENT 0x00000010 /* dma_alloc_coherent */ #define VM_UNINITIALIZED 0x00000020 /* vm_struct is not fully initialized */ -#define VM_NO_GUARD 0x00000040 /* don't add guard page */ +#define VM_NO_GUARD 0x00000040 /* ***DANGEROUS*** don't add guard page */ #define VM_KASAN 0x00000080 /* has allocated kasan shadow memory */ #define VM_FLUSH_RESET_PERMS 0x00000100 /* reset direct map and flush TLB on unmap, can't be freed in atomic context */ #define VM_MAP_PUT_PAGES 0x00000200 /* put pages and free array in vfree */ --- a/mm/vmalloc.c~mm-vmalloc-dont-allow-vm_no_guard-on-vmap +++ a/mm/vmalloc.c @@ -2743,6 +2743,13 @@ void *vmap(struct page **pages, unsigned might_sleep(); + /* + * Your top guard is someone else's bottom guard. Not having a top + * guard compromises someone else's mappings too.
+ */ + if (WARN_ON_ONCE(flags & VM_NO_GUARD)) + flags &= ~VM_NO_GUARD; + if (count > totalram_pages()) return NULL; From patchwork Fri Nov 5 20:39:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605545 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 07485C433EF for ; Fri, 5 Nov 2021 20:39:28 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B21CC6056B for ; Fri, 5 Nov 2021 20:39:27 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org B21CC6056B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 57077940057; Fri, 5 Nov 2021 16:39:27 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 4D14E940049; Fri, 5 Nov 2021 16:39:27 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 34A5A940057; Fri, 5 Nov 2021 16:39:27 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0254.hostedemail.com [216.40.44.254]) by kanga.kvack.org (Postfix) with ESMTP id 26598940049 for ; Fri, 5 Nov 2021 16:39:27 -0400 (EDT) Received: from smtpin01.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id DED6018352D2C for ; Fri, 5 Nov 2021 20:39:26 +0000 (UTC) X-FDA: 78776042016.01.FB8A987 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf11.hostedemail.com (Postfix) with ESMTP id 8D076F0000AE for ; Fri, 5 Nov 2021 20:39:26 +0000 (UTC) Received: by mail.kernel.org 
(Postfix) with ESMTPSA id A57F560FBF; Fri, 5 Nov 2021 20:39:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144765; bh=XIZU7H+XkC2VvekaNszqpL0dVHiRIrj4XkVzD6+V0uA=; h=Date:From:To:Subject:In-Reply-To:From; b=gV2z0ikWBROu0+0dL7EHYZxxbMihigUrZdbrbmxMjLRNhuyrNCv2mfim8HcIdeuch NMODOEkXsuNnBU2H7073bbMgVyPvVt3poru0pScf+TrMTjAvmX7moBkgkuFODPhk09 GFtOuCv526ojlo3iXia64hIuRcswBh9ChTMgfjmE= Date: Fri, 05 Nov 2021 13:39:25 -0700 From: Andrew Morton To: akpm@linux-foundation.org, edumazet@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, urezki@gmail.com Subject: [patch 091/262] mm/vmalloc: make show_numa_info() aware of hugepage mappings Message-ID: <20211105203925.w8uD6f-BW%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=gV2z0ikW; dmarc=none; spf=pass (imf11.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 8D076F0000AE X-Stat-Signature: 9bsysfj8buba6azxd5daq8mxgpgfjrxb X-HE-Tag: 1636144766-912216 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Eric Dumazet Subject: mm/vmalloc: make show_numa_info() aware of hugepage mappings show_numa_info() can be slightly faster, by skipping over hugepages directly. 
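The idea can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: count_pages_per_node() and the nid_of_page array are hypothetical stand-ins for the loop over v->pages[] and page_to_nid(), and the sketch assumes nr_pages is a multiple of the step and that all small pages of one high-order chunk sit on the same node.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the per-node counting loop.
 * nid_of_page[i] plays the role of page_to_nid(v->pages[i]).
 * Assumes nr_pages is a multiple of (1 << page_order) and that every
 * small page of one high-order chunk belongs to the same node.
 */
static void count_pages_per_node(const int *nid_of_page, unsigned int nr_pages,
				 unsigned int page_order,
				 unsigned int *counters, size_t nr_nodes)
{
	unsigned int step = 1U << page_order;
	unsigned int nr;
	size_t i;

	for (i = 0; i < nr_nodes; i++)
		counters[i] = 0;

	/* Look only at the first small page of each chunk, credit the whole chunk. */
	for (nr = 0; nr < nr_pages; nr += step) {
		int nid = nid_of_page[nr];

		assert(nid >= 0 && (size_t)nid < nr_nodes);
		counters[nid] += step;
	}
}
```

With page_order 0 the loop degenerates to the page-by-page walk, so both variants report the same totals; a higher order only reduces the number of lookups.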
Link: https://lkml.kernel.org/r/20211001172725.105824-1-eric.dumazet@gmail.com Signed-off-by: Eric Dumazet Cc: Uladzislau Rezki (Sony) Signed-off-by: Andrew Morton --- mm/vmalloc.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) --- a/mm/vmalloc.c~mm-vmalloc-make-show_numa_info-aware-of-hugepage-mappings +++ a/mm/vmalloc.c @@ -3864,6 +3864,7 @@ static void show_numa_info(struct seq_fi { if (IS_ENABLED(CONFIG_NUMA)) { unsigned int nr, *counters = m->private; + unsigned int step = 1U << vm_area_page_order(v); if (!counters) return; @@ -3875,9 +3876,8 @@ static void show_numa_info(struct seq_fi memset(counters, 0, nr_node_ids * sizeof(unsigned int)); - for (nr = 0; nr < v->nr_pages; nr++) - counters[page_to_nid(v->pages[nr])]++; - + for (nr = 0; nr < v->nr_pages; nr += step) + counters[page_to_nid(v->pages[nr])] += step; for_each_node_state(nr, N_HIGH_MEMORY) if (counters[nr]) seq_printf(m, " N%u=%u", nr, counters[nr]); From patchwork Fri Nov 5 20:39:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605547 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 49680C433EF for ; Fri, 5 Nov 2021 20:39:31 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 11A8B611C4 for ; Fri, 5 Nov 2021 20:39:31 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 11A8B611C4 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id B140E940058; Fri, 5 Nov 2021 16:39:30 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id AC475940049; Fri, 5 Nov 
2021 16:39:30 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 98B1C940058; Fri, 5 Nov 2021 16:39:30 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 8347D940049 for ; Fri, 5 Nov 2021 16:39:30 -0400 (EDT) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 4332E82499A8 for ; Fri, 5 Nov 2021 20:39:30 +0000 (UTC) X-FDA: 78776042100.09.67936B7 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf12.hostedemail.com (Postfix) with ESMTP id 7E4BF10000B0 for ; Fri, 5 Nov 2021 20:39:29 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 94C986056B; Fri, 5 Nov 2021 20:39:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144768; bh=195WpJFUXCAzHcNg6yr8SaCAS75bTx0VLBndn3JQVuA=; h=Date:From:To:Subject:In-Reply-To:From; b=DoU5EyvSPtmibve1UJHVhmbEKohWk+ViF20gplNgqiwKcRTm28YrSIH4hKzl6Ng/N 798WwQD88VBANkrJccoGDZ11LkSLPFao6S/VnJMlozmsxzj+QTSrZfXci4b1mQ53G1 tbTRWgAdyG6SFYU7aPglxzXcvJFhwOBUbXJhSI/g= Date: Fri, 05 Nov 2021 13:39:28 -0700 From: Andrew Morton To: akpm@linux-foundation.org, edumazet@google.com, linux-mm@kvack.org, lpf.vector@gmail.com, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, urezki@gmail.com Subject: [patch 092/262] mm/vmalloc: make sure to dump unpurged areas in /proc/vmallocinfo Message-ID: <20211105203928.VUhnNzshg%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf12.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=DoU5EyvS; dmarc=none; spf=pass (imf12.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) 
smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 7E4BF10000B0 X-Stat-Signature: cxg8wyeucidt7x4ijuuptp1tx1br9bna X-HE-Tag: 1636144769-438743 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Eric Dumazet Subject: mm/vmalloc: make sure to dump unpurged areas in /proc/vmallocinfo If last va found in vmap_area_list does not have a vm pointer, vmallocinfo.s_show() returns 0, and show_purge_info() is not called as it should. Link: https://lkml.kernel.org/r/20211001170815.73321-1-eric.dumazet@gmail.com Fixes: dd3b8353bae7 ("mm/vmalloc: do not keep unpurged areas in the busy tree") Signed-off-by: Eric Dumazet Cc: Uladzislau Rezki (Sony) Cc: Pengfei Li Signed-off-by: Andrew Morton --- mm/vmalloc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) --- a/mm/vmalloc.c~mm-vmalloc-make-sure-to-dump-unpurged-areas-in-proc-vmallocinfo +++ a/mm/vmalloc.c @@ -3913,7 +3913,7 @@ static int s_show(struct seq_file *m, vo (void *)va->va_start, (void *)va->va_end, va->va_end - va->va_start); - return 0; + goto final; } v = va->vm; @@ -3954,6 +3954,7 @@ static int s_show(struct seq_file *m, vo /* * As a final step, dump "unpurged" areas. 
*/ +final: if (list_is_last(&va->list, &vmap_area_list)) show_purge_info(m); From patchwork Fri Nov 5 20:39:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605549 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AA782C433F5 for ; Fri, 5 Nov 2021 20:39:34 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 530E76056B for ; Fri, 5 Nov 2021 20:39:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 530E76056B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id E49AB940059; Fri, 5 Nov 2021 16:39:33 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id DD26E940049; Fri, 5 Nov 2021 16:39:33 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C2544940059; Fri, 5 Nov 2021 16:39:33 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0099.hostedemail.com [216.40.44.99]) by kanga.kvack.org (Postfix) with ESMTP id AF27E940049 for ; Fri, 5 Nov 2021 16:39:33 -0400 (EDT) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 728AD779B2 for ; Fri, 5 Nov 2021 20:39:33 +0000 (UTC) X-FDA: 78776042310.05.5ED3762 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf24.hostedemail.com (Postfix) with ESMTP id 018FEB0000A9 for ; Fri, 5 Nov 2021 20:39:32 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id C4B8960FBF; Fri, 5 Nov 
2021 20:39:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144772; bh=fT44FlyRT7p/dbHo6OdXQwsW5i2vIKS+q5AvhxobN74=; h=Date:From:To:Subject:In-Reply-To:From; b=Jx049YCgO2v/wjvk8NNEA5chsTEO4PtuiLD+5W2LZhFl3TLLtTzvg8fcZGpRoeMKN Vzpk04MAt2s6wFiCD1lAV0IPlLOTs2gLNG+1bhMsKV8F2DkE8XTyS240xYhRas5JAl EGxLAGdvG3b9VByWEHOymbKmPE0eLJnTXzncNVyI= Date: Fri, 05 Nov 2021 13:39:31 -0700 From: Andrew Morton To: akpm@linux-foundation.org, david@redhat.com, hch@infradead.org, hdanton@sina.com, linux-mm@kvack.org, mgorman@suse.de, mhocko@suse.com, mm-commits@vger.kernel.org, npiggin@gmail.com, oleksiy.avramchenko@sonymobile.com, pifang@redhat.com, rostedt@goodmis.org, torvalds@linux-foundation.org, urezki@gmail.com, willy@infradead.org Subject: [patch 093/262] mm/vmalloc: do not adjust the search size for alignment overhead Message-ID: <20211105203931.vjj4onVoM%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Jx049YCg; dmarc=none; spf=pass (imf24.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 018FEB0000A9 X-Stat-Signature: ukq4dcpmrcpyfju7n5mgto145bfbfjxi X-HE-Tag: 1636144772-725573 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: "Uladzislau Rezki (Sony)" Subject: mm/vmalloc: do not adjust the search size for alignment overhead We used to include an alignment overhead into a search length, in that case we guarantee that a found area will definitely fit after applying a specific alignment that user specifies. 
On the other hand, we do not guarantee that an area has the lowest address if the alignment is >= PAGE_SIZE. This means that when a user specifies a special alignment together with a range that corresponds exactly to the requested size, the allocation fails. This is what happens to KASAN: it wants the free block that exactly matches a specified range while onlining memory banks: [root@vm-0 fedora]# echo online > /sys/devices/system/memory/memory82/state [root@vm-0 fedora]# echo online > /sys/devices/system/memory/memory83/state [root@vm-0 fedora]# echo online > /sys/devices/system/memory/memory85/state [root@vm-0 fedora]# echo online > /sys/devices/system/memory/memory84/state [ 223.858115] vmap allocation for size 16777216 failed: use vmalloc= to increase size [ 223.859415] bash: vmalloc: allocation failure: 16777216 bytes, mode:0x6000c0(GFP_KERNEL), nodemask=(null),cpuset=/,mems_allowed=0 [ 223.860992] CPU: 4 PID: 1644 Comm: bash Kdump: loaded Not tainted 4.18.0-339.el8.x86_64+debug #1 [ 223.862149] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 223.863580] Call Trace: [ 223.863946] dump_stack+0x8e/0xd0 [ 223.864420] warn_alloc.cold.90+0x8a/0x1b2 [ 223.864990] ? zone_watermark_ok_safe+0x300/0x300 [ 223.865626] ? slab_free_freelist_hook+0x85/0x1a0 [ 223.866264] ? __get_vm_area_node+0x240/0x2c0 [ 223.866858] ? kfree+0xdd/0x570 [ 223.867309] ? kmem_cache_alloc_node_trace+0x157/0x230 [ 223.868028] ? notifier_call_chain+0x90/0x160 [ 223.868625] __vmalloc_node_range+0x465/0x840 [ 223.869230] ? mark_held_locks+0xb7/0x120 Fix it by making sure that find_vmap_lowest_match() returns the lowest start address for any given alignment value, i.e. for alignments bigger than PAGE_SIZE the algorithm rolls back toward parent nodes, checking right sub-trees, if the leftmost free block did not fit due to alignment overhead.
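The failure mode can be sketched in plain userspace C. This is a minimal illustration, not kernel code: align_up(), fits_padded() and fits_aligned() are names made up for this example. fits_padded() models the old "size + align - 1" search length, while fits_aligned() only loosely mirrors the kind of check is_within_this_va() performs.

```c
#include <assert.h>
#include <stdbool.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
static unsigned long align_up(unsigned long addr, unsigned long align)
{
	assert(align && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/* Old-style filter: the block must hold size plus worst-case alignment padding. */
static bool fits_padded(unsigned long start, unsigned long end,
			unsigned long size, unsigned long align)
{
	return end - start >= size + align - 1;
}

/* Direct check: align the start first, then see whether size still fits. */
static bool fits_aligned(unsigned long start, unsigned long end,
			 unsigned long size, unsigned long align)
{
	unsigned long nva_start = align_up(start, align);

	/* Reject wrap-around caused by a big size or alignment. */
	if (nva_start < start || nva_start + size < nva_start)
		return false;

	return nva_start + size <= end;
}
```

For a 16 MiB request with 16 MiB alignment against a free block of exactly 16 MiB that is already aligned, fits_padded() rejects the block because it demands almost 32 MiB, while fits_aligned() accepts it. That is the exact-fit case described above.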
Link: https://lkml.kernel.org/r/20211004142829.22222-1-urezki@gmail.com Fixes: 68ad4a330433 ("mm/vmalloc.c: keep track of free blocks for vmap allocation") Signed-off-by: Uladzislau Rezki (Sony) Reported-by: Ping Fang Tested-by: David Hildenbrand Reviewed-by: David Hildenbrand Cc: Mel Gorman Cc: Christoph Hellwig Cc: Matthew Wilcox Cc: Nicholas Piggin Cc: Hillf Danton Cc: Michal Hocko Cc: Oleksiy Avramchenko Cc: Steven Rostedt Signed-off-by: Andrew Morton --- mm/vmalloc.c | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-) --- a/mm/vmalloc.c~mm-vmalloc-do-not-adjust-the-search-size-for-alignment-overhead +++ a/mm/vmalloc.c @@ -1195,18 +1195,14 @@ find_vmap_lowest_match(unsigned long siz { struct vmap_area *va; struct rb_node *node; - unsigned long length; /* Start from the root. */ node = free_vmap_area_root.rb_node; - /* Adjust the search size for alignment overhead. */ - length = size + align - 1; - while (node) { va = rb_entry(node, struct vmap_area, rb_node); - if (get_subtree_max_size(node->rb_left) >= length && + if (get_subtree_max_size(node->rb_left) >= size && vstart < va->va_start) { node = node->rb_left; } else { @@ -1216,9 +1212,9 @@ find_vmap_lowest_match(unsigned long siz /* * Does not make sense to go deeper towards the right * sub-tree if it does not have a free block that is - * equal or bigger to the requested search length. + * equal or bigger to the requested search size. */ - if (get_subtree_max_size(node->rb_right) >= length) { + if (get_subtree_max_size(node->rb_right) >= size) { node = node->rb_right; continue; } @@ -1226,15 +1222,23 @@ find_vmap_lowest_match(unsigned long siz /* * OK. We roll back and find the first right sub-tree, * that will satisfy the search criteria. It can happen - * only once due to "vstart" restriction. + * due to "vstart" restriction or an alignment overhead + * that is bigger then PAGE_SIZE. 
*/ while ((node = rb_parent(node))) { va = rb_entry(node, struct vmap_area, rb_node); if (is_within_this_va(va, size, align, vstart)) return va; - if (get_subtree_max_size(node->rb_right) >= length && + if (get_subtree_max_size(node->rb_right) >= size && vstart <= va->va_start) { + /* + * Shift the vstart forward. Please note, we update it with + * parent's start address adding "1" because we do not want + * to enter same sub-tree after it has already been checked + * and no suitable free block found there. + */ + vstart = va->va_start + 1; node = node->rb_right; break; } From patchwork Fri Nov 5 20:39:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605551 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C3A22C43219 for ; Fri, 5 Nov 2021 20:39:37 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 775386056B for ; Fri, 5 Nov 2021 20:39:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 775386056B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 1788294005A; Fri, 5 Nov 2021 16:39:37 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 1294A940049; Fri, 5 Nov 2021 16:39:37 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id F318294005A; Fri, 5 Nov 2021 16:39:36 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0025.hostedemail.com [216.40.44.25]) by kanga.kvack.org (Postfix) with ESMTP id E3C81940049 for ; 
Fri, 5 Nov 2021 16:39:36 -0400 (EDT) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id A5E3C184B3FBC for ; Fri, 5 Nov 2021 20:39:36 +0000 (UTC) X-FDA: 78776042352.07.68AF504 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf30.hostedemail.com (Postfix) with ESMTP id 166D7E001982 for ; Fri, 5 Nov 2021 20:39:18 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 15989611C0; Fri, 5 Nov 2021 20:39:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144775; bh=oVPilQfPvdjYagrAhjbUOoJejKExAkojVOJLLRWnlY4=; h=Date:From:To:Subject:In-Reply-To:From; b=QfWcHr3h435YIPuwCEtGHlmpM37BsD7UenRpPRZlEltC05D6xVUm+JsRdz0l1rGH2 sBsGTPDgG3FP3yVu081fxUGMZCT0CngTJabfKGCz67Y9h5p9Els18ehhQm2EaHf1hu AD+9WLDom8i6lZxw4O88SXPNVlsXFWdXOXzV00To= Date: Fri, 05 Nov 2021 13:39:34 -0700 From: Andrew Morton To: akpm@linux-foundation.org, david@redhat.com, hch@infradead.org, hdanton@sina.com, linux-mm@kvack.org, mgorman@suse.de, mhocko@suse.com, mm-commits@vger.kernel.org, npiggin@gmail.com, oleksiy.avramchenko@sonymobile.com, pifang@redhat.com, rostedt@goodmis.org, torvalds@linux-foundation.org, urezki@gmail.com, willy@infradead.org Subject: [patch 094/262] mm/vmalloc: check various alignments when debugging Message-ID: <20211105203934.cGNvYDsTr%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=QfWcHr3h; dmarc=none; spf=pass (imf30.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 166D7E001982 X-Stat-Signature: rqz9b67zdg6sgs33pbb18y5hw64bkujf X-HE-Tag: 1636144758-928059 X-Bogosity: Ham, tests=bogofilter, 
spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: "Uladzislau Rezki (Sony)" Subject: mm/vmalloc: check various alignments when debugging Previously we did not guarantee a free block with the lowest start address for allocations with alignment >= PAGE_SIZE, because an alignment overhead was included in the search length, as below: length = size + align - 1; Doing so ensured that the found block would still fit after applying an alignment adjustment. Now there is no such limitation, i.e. any alignment that a user wants to apply will result in the lowest address of the returned free area. Link: https://lkml.kernel.org/r/20211004142829.22222-2-urezki@gmail.com Signed-off-by: Uladzislau Rezki (Sony) Cc: Christoph Hellwig Cc: David Hildenbrand Cc: Hillf Danton Cc: Matthew Wilcox Cc: Mel Gorman Cc: Michal Hocko Cc: Nicholas Piggin Cc: Oleksiy Avramchenko Cc: Ping Fang Cc: Steven Rostedt Signed-off-by: Andrew Morton --- mm/vmalloc.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) --- a/mm/vmalloc.c~mm-vmalloc-check-various-alignments-when-debugging +++ a/mm/vmalloc.c @@ -1269,7 +1269,7 @@ find_vmap_lowest_linear_match(unsigned l } static void -find_vmap_lowest_match_check(unsigned long size) +find_vmap_lowest_match_check(unsigned long size, unsigned long align) { struct vmap_area *va_1, *va_2; unsigned long vstart; @@ -1278,8 +1278,8 @@ find_vmap_lowest_match_check(unsigned lo get_random_bytes(&rnd, sizeof(rnd)); vstart = VMALLOC_START + rnd; - va_1 = find_vmap_lowest_match(size, 1, vstart); - va_2 = find_vmap_lowest_linear_match(size, 1, vstart); + va_1 = find_vmap_lowest_match(size, align, vstart); + va_2 = find_vmap_lowest_linear_match(size, align, vstart); if (va_1 != va_2) pr_emerg("not lowest: t: 0x%p, l: 0x%p, v: 0x%lx\n", @@ -1458,7 +1458,7 @@ __alloc_vmap_area(unsigned long size, un return vend; #if DEBUG_AUGMENT_LOWEST_MATCH_CHECK - find_vmap_lowest_match_check(size); +
 	find_vmap_lowest_match_check(size, align);
 #endif
 
 	return nva_start_addr;

From patchwork Fri Nov 5 20:39:37 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605553
Date: Fri, 05 Nov 2021 13:39:37 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, penguin-kernel@i-love.sakura.ne.jp, torvalds@linux-foundation.org, urezki@gmail.com, vdavydov.dev@gmail.com, vvs@virtuozzo.com
Subject: [patch 095/262] vmalloc: back off when the current task is OOM-killed
Message-ID: <20211105203937.PoPS-YPxz%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Vasily Averin
Subject: vmalloc: back off when the current task is OOM-killed

A huge vmalloc allocation on a heavily loaded node can lead to a global memory shortage. The task that called vmalloc can have the worst badness and be selected by the OOM killer, but the fatal signal it receives does not interrupt the allocation cycle. vmalloc repeats page allocations again and again, exacerbating the crisis and consuming the memory freed up by other killed tasks.

After the allocation procedure completes successfully, the fatal signal is processed and the task is finally destroyed. However, this may not release the consumed memory, since the allocated object may have a lifetime unrelated to the completed task. In the worst case, the host can panic with "Out of memory and no killable processes..."

This patch allows the OOM killer to break the vmalloc cycle, which makes OOM handling more effective and avoids a host panic. It does not check the OOM condition directly, however; it breaks the page allocation cycle when a fatal signal has been received.

This may expose hidden problems when a caller does not handle vmalloc failures, or when the rollback after a failed vmalloc itself calls vmalloc. However, all of these scenarios are incorrect: vmalloc does not guarantee successful allocation and has never been called with __GFP_NOFAIL, and therefore it either should not be used for any rollbacks, or callers should handle such errors correctly without critical failures.

Link: https://lkml.kernel.org/r/83efc664-3a65-2adb-d7c4-2885784cf109@virtuozzo.com
Signed-off-by: Vasily Averin
Acked-by: Michal Hocko
Cc: Johannes Weiner
Cc: Vladimir Davydov
Cc: Tetsuo Handa
Cc: Uladzislau Rezki (Sony)
Signed-off-by: Andrew Morton
---

 mm/vmalloc.c | 3 +++
 1 file changed, 3 insertions(+)

--- a/mm/vmalloc.c~vmalloc-back-off-when-the-current-task-is-oom-killed
+++ a/mm/vmalloc.c
@@ -2871,6 +2871,9 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 	/* High-order pages or fallback path if "bulk" fails. */
 
 	while (nr_allocated < nr_pages) {
+		if (fatal_signal_pending(current))
+			break;
+
 		if (nid == NUMA_NO_NODE)
 			page = alloc_pages(gfp, order);
 		else

From patchwork Fri Nov 5 20:39:41 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605555
Date: Fri, 05 Nov 2021 13:39:41 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, andreyknvl@gmail.com, catalin.marinas@arm.com, dvyukov@google.com, elver@google.com, gregkh@linuxfoundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, ryabinin.a.a@gmail.com, torvalds@linux-foundation.org, wangkefeng.wang@huawei.com, will@kernel.org
Subject: [patch 096/262] vmalloc: choose a better start address in vm_area_register_early()
Message-ID: <20211105203941.j3BCCvNFv%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Kefeng Wang
Subject: vmalloc: choose a better start address in vm_area_register_early()

The percpu embedded first chunk allocator is the first option, but it can fail on ARM64, e.g.:

"percpu: max_distance=0x5fcfdc640000 too large for vmalloc space 0x781fefff0000"
"percpu: max_distance=0x600000540000 too large for vmalloc space 0x7dffb7ff0000"
"percpu: max_distance=0x5fff9adb0000 too large for vmalloc space 0x5dffb7ff0000"

We then hit "WARNING: CPU: 15 PID: 461 at vmalloc.c:3087 pcpu_get_vm_areas+0x488/0x838" and the system cannot boot successfully.

Let's implement the page mapping percpu first chunk allocator as a fallback to the embedding allocator to increase the robustness of the system.

Also fix a crash when both NEED_PER_CPU_PAGE_FIRST_CHUNK and KASAN_VMALLOC are enabled.

Tested on ARM64 qemu with cmdline "percpu_alloc=page".

This patch (of 3):

There are some fixed locations in the vmalloc area that are reserved on ARM (see iotable_init()) and ARM64 (see map_kernel()). However, pcpu_page_first_chunk() calls vm_area_register_early(), which chooses VMALLOC_START as the start address of the vmap area. That address can conflict with the reserved locations above and then trigger a BUG_ON in vm_area_add_early().

Let's choose a suitable start address by traversing the vmlist.

Link: https://lkml.kernel.org/r/20210910053354.26721-1-wangkefeng.wang@huawei.com
Link: https://lkml.kernel.org/r/20210910053354.26721-2-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang
Reviewed-by: Catalin Marinas
Cc: Will Deacon
Cc: Andrey Ryabinin
Cc: Andrey Konovalov
Cc: Dmitry Vyukov
Cc: Marco Elver
Cc: Greg Kroah-Hartman
Signed-off-by: Andrew Morton
---

 mm/vmalloc.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

--- a/mm/vmalloc.c~vmalloc-choose-a-better-start-address-in-vm_area_register_early
+++ a/mm/vmalloc.c
@@ -2276,15 +2276,21 @@ void __init vm_area_add_early(struct vm_
  */
 void __init vm_area_register_early(struct vm_struct *vm, size_t align)
 {
-	static size_t vm_init_off __initdata;
-	unsigned long addr;
+	unsigned long addr = ALIGN(VMALLOC_START, align);
+	struct vm_struct *cur, **p;
 
-	addr = ALIGN(VMALLOC_START + vm_init_off, align);
-	vm_init_off = PFN_ALIGN(addr + vm->size) - VMALLOC_START;
+	BUG_ON(vmap_initialized);
 
-	vm->addr = (void *)addr;
+	for (p = &vmlist; (cur = *p) != NULL; p = &cur->next) {
+		if ((unsigned long)cur->addr - addr >= vm->size)
+			break;
+		addr = ALIGN((unsigned long)cur->addr + cur->size, align);
+	}
 
-	vm_area_add_early(vm);
+	BUG_ON(addr > VMALLOC_END - vm->size);
+	vm->addr = (void *)addr;
+	vm->next = *p;
+	*p = vm;
 }
 
 static void vmap_init_free_space(void)

From patchwork Fri Nov 5 20:39:44 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605557
Date: Fri, 05 Nov 2021 13:39:44 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, andreyknvl@gmail.com, catalin.marinas@arm.com, dvyukov@google.com, elver@google.com, gregkh@linuxfoundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, ryabinin.a.a@gmail.com, torvalds@linux-foundation.org, wangkefeng.wang@huawei.com, will@kernel.org
Subject: [patch 097/262] arm64: support page mapping percpu first chunk allocator
Message-ID: <20211105203944.5iPfK6pI6%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Kefeng Wang
Subject: arm64: support page mapping percpu first chunk allocator

The percpu embedded first chunk allocator is the first option, but it can fail on ARM64, e.g.:

"percpu: max_distance=0x5fcfdc640000 too large for vmalloc space 0x781fefff0000"
"percpu: max_distance=0x600000540000 too large for vmalloc space 0x7dffb7ff0000"
"percpu: max_distance=0x5fff9adb0000 too large for vmalloc space 0x5dffb7ff0000"

We then hit "WARNING: CPU: 15 PID: 461 at vmalloc.c:3087 pcpu_get_vm_areas+0x488/0x838" and the system cannot boot successfully.

Let's implement the page mapping percpu first chunk allocator as a fallback to the embedding allocator to increase the robustness of the system.

Link: https://lkml.kernel.org/r/20210910053354.26721-3-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang
Reviewed-by: Catalin Marinas
Cc: Andrey Konovalov
Cc: Andrey Ryabinin
Cc: Dmitry Vyukov
Cc: Greg Kroah-Hartman
Cc: Marco Elver
Cc: Will Deacon
Signed-off-by: Andrew Morton
---

 arch/arm64/Kconfig       |    4 +
 drivers/base/arch_numa.c |   82 ++++++++++++++++++++++++++++++++-----
 2 files changed, 76 insertions(+), 10 deletions(-)

--- a/arch/arm64/Kconfig~arm64-support-page-mapping-percpu-first-chunk-allocator
+++ a/arch/arm64/Kconfig
@@ -1042,6 +1042,10 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
 	def_bool y
 	depends on NUMA
 
+config NEED_PER_CPU_PAGE_FIRST_CHUNK
+	def_bool y
+	depends on NUMA
+
 source "kernel/Kconfig.hz"
 
 config ARCH_SPARSEMEM_ENABLE
--- a/drivers/base/arch_numa.c~arm64-support-page-mapping-percpu-first-chunk-allocator
+++ a/drivers/base/arch_numa.c
@@ -14,6 +14,7 @@
 #include
 
 #include
+#include
 
 struct pglist_data *node_data[MAX_NUMNODES] __read_mostly;
 EXPORT_SYMBOL(node_data);
@@ -168,22 +169,83 @@ static void __init pcpu_fc_free(void *pt
 	memblock_free_early(__pa(ptr), size);
 }
 
+#ifdef CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK
+static void __init pcpu_populate_pte(unsigned long addr)
+{
+	pgd_t *pgd = pgd_offset_k(addr);
+	p4d_t *p4d;
+	pud_t *pud;
+	pmd_t *pmd;
+
+	p4d = p4d_offset(pgd, addr);
+	if (p4d_none(*p4d)) {
+		pud_t *new;
+
+		new = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
+		if (!new)
+			goto err_alloc;
+		p4d_populate(&init_mm, p4d, new);
+	}
+
+	pud = pud_offset(p4d, addr);
+	if (pud_none(*pud)) {
+		pmd_t *new;
+
+		new = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
+		if (!new)
+			goto err_alloc;
+		pud_populate(&init_mm, pud, new);
+	}
+
+	pmd = pmd_offset(pud, addr);
+	if (!pmd_present(*pmd)) {
+		pte_t *new;
+
+		new = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
+		if (!new)
+			goto err_alloc;
+		pmd_populate_kernel(&init_mm, pmd, new);
+	}
+
+	return;
+
+err_alloc:
+	panic("%s: Failed to allocate %lu bytes align=%lx from=%lx\n",
+	      __func__, PAGE_SIZE, PAGE_SIZE, PAGE_SIZE);
+}
+#endif
+
 void __init setup_per_cpu_areas(void)
 {
 	unsigned long delta;
 	unsigned int cpu;
-	int rc;
+	int rc = -EINVAL;
 
-	/*
-	 * Always reserve area for module percpu variables. That's
-	 * what the legacy allocator did.
-	 */
-	rc = pcpu_embed_first_chunk(PERCPU_MODULE_RESERVE,
-				    PERCPU_DYNAMIC_RESERVE, PAGE_SIZE,
-				    pcpu_cpu_distance,
-				    pcpu_fc_alloc, pcpu_fc_free);
+	if (pcpu_chosen_fc != PCPU_FC_PAGE) {
+		/*
+		 * Always reserve area for module percpu variables. That's
+		 * what the legacy allocator did.
+		 */
+		rc = pcpu_embed_first_chunk(PERCPU_MODULE_RESERVE,
+					    PERCPU_DYNAMIC_RESERVE, PAGE_SIZE,
+					    pcpu_cpu_distance,
+					    pcpu_fc_alloc, pcpu_fc_free);
+#ifdef CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK
+		if (rc < 0)
+			pr_warn("PERCPU: %s allocator failed (%d), falling back to page size\n",
+				pcpu_fc_names[pcpu_chosen_fc], rc);
+#endif
+	}
+
+#ifdef CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK
+	if (rc < 0)
+		rc = pcpu_page_first_chunk(PERCPU_MODULE_RESERVE,
+					   pcpu_fc_alloc,
+					   pcpu_fc_free,
+					   pcpu_populate_pte);
+#endif
 	if (rc < 0)
-		panic("Failed to initialize percpu areas.");
+		panic("Failed to initialize percpu areas (err=%d).", rc);
 
 	delta = (unsigned long)pcpu_base_addr - (unsigned long)__per_cpu_start;
 	for_each_possible_cpu(cpu)

From patchwork Fri Nov 5 20:39:47 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605559
Date: Fri, 05 Nov 2021 13:39:47 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, andreyknvl@gmail.com, catalin.marinas@arm.com, dvyukov@google.com, elver@google.com, gregkh@linuxfoundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, ryabinin.a.a@gmail.com, torvalds@linux-foundation.org, wangkefeng.wang@huawei.com, will@kernel.org
Subject: [patch 098/262] kasan: arm64: fix pcpu_page_first_chunk crash with KASAN_VMALLOC
Message-ID: <20211105203947.iTqM9LZ0I%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Kefeng Wang
Subject: kasan: arm64: fix pcpu_page_first_chunk crash with KASAN_VMALLOC

With KASAN_VMALLOC and NEED_PER_CPU_PAGE_FIRST_CHUNK both enabled, the kernel crashes:

  Unable to handle kernel paging request at virtual address ffff7000028f2000
  ...
  swapper pgtable: 64k pages, 48-bit VAs, pgdp=0000000042440000
  [ffff7000028f2000] pgd=000000063e7c0003, p4d=000000063e7c0003, pud=000000063e7c0003, pmd=000000063e7b0003, pte=0000000000000000
  Internal error: Oops: 96000007 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper Not tainted 5.13.0-rc4-00003-gc6e6e28f3f30-dirty #62
  Hardware name: linux,dummy-virt (DT)
  pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO BTYPE=--)
  pc : kasan_check_range+0x90/0x1a0
  lr : memcpy+0x88/0xf4
  sp : ffff80001378fe20
  ...
  Call trace:
   kasan_check_range+0x90/0x1a0
   pcpu_page_first_chunk+0x3f0/0x568
   setup_per_cpu_areas+0xb8/0x184
   start_kernel+0x8c/0x328

The vm area used in vm_area_register_early() has no KASAN shadow memory. Let's add a new kasan_populate_early_vm_area_shadow() function to populate the vm area shadow memory and fix the issue.
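
As a rough illustration of the range computation that kasan_populate_early_vm_area_shadow() performs in the diff that follows, here is a minimal userspace sketch. It is not kernel code. SHADOW_SCALE_SHIFT, SHADOW_OFFSET, PAGE_SZ, mem_to_shadow() and early_vm_area_shadow() are illustrative names and values assumed here (generic KASAN with one shadow byte per 8 bytes of memory, 64k pages as in the oops above); the real values depend on the kernel configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define SHADOW_SCALE_SHIFT 3                     /* 1 shadow byte covers 8 bytes */
#define SHADOW_OFFSET      0xdfff800000000000ULL /* hypothetical shadow offset */
#define PAGE_SZ            0x10000ULL            /* 64k pages */

struct shadow_range {
	uint64_t start; /* inclusive, page aligned */
	uint64_t end;   /* exclusive, page aligned */
};

/* Map a memory address to the address of its shadow byte. */
static uint64_t mem_to_shadow(uint64_t addr)
{
	return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}

/* Round x down / up to a multiple of a (a must be a power of two). */
static uint64_t align_down(uint64_t x, uint64_t a)
{
	return x & ~(a - 1);
}

static uint64_t align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Compute the page-aligned shadow range that has to be populated so that
 * every byte of [start, start + size) has backing shadow memory.
 */
static struct shadow_range early_vm_area_shadow(uint64_t start, uint64_t size)
{
	struct shadow_range r;

	r.start = align_down(mem_to_shadow(start), PAGE_SZ);
	r.end = align_up(mem_to_shadow(start + size), PAGE_SZ);
	return r;
}
```

With these assumed constants, a 0x2000-byte area at 0xffff800010001000 yields the shadow range [0xffff700002000000, 0xffff700002010000), that is, a single 64k shadow page.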

[wangkefeng.wang@huawei.com: fix redefinition of 'kasan_populate_early_vm_area_shadow']
Link: https://lkml.kernel.org/r/20211011123211.3936196-1-wangkefeng.wang@huawei.com
Link: https://lkml.kernel.org/r/20210910053354.26721-4-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang
Acked-by: Marco Elver [KASAN]
Acked-by: Andrey Konovalov [KASAN]
Acked-by: Catalin Marinas
Cc: Andrey Ryabinin
Cc: Dmitry Vyukov
Cc: Greg Kroah-Hartman
Cc: Will Deacon
Signed-off-by: Andrew Morton
---

 arch/arm64/mm/kasan_init.c |   16 ++++++++++++++++
 include/linux/kasan.h      |    6 ++++++
 mm/kasan/shadow.c          |    5 +++++
 mm/vmalloc.c               |    1 +
 4 files changed, 28 insertions(+)

--- a/arch/arm64/mm/kasan_init.c~kasan-arm64-fix-pcpu_page_first_chunk-crash-with-kasan_vmalloc
+++ a/arch/arm64/mm/kasan_init.c
@@ -287,6 +287,22 @@ static void __init kasan_init_depth(void
 	init_task.kasan_depth = 0;
 }
 
+#ifdef CONFIG_KASAN_VMALLOC
+void __init kasan_populate_early_vm_area_shadow(void *start, unsigned long size)
+{
+	unsigned long shadow_start, shadow_end;
+
+	if (!is_vmalloc_or_module_addr(start))
+		return;
+
+	shadow_start = (unsigned long)kasan_mem_to_shadow(start);
+	shadow_start = ALIGN_DOWN(shadow_start, PAGE_SIZE);
+	shadow_end = (unsigned long)kasan_mem_to_shadow(start + size);
+	shadow_end = ALIGN(shadow_end, PAGE_SIZE);
+	kasan_map_populate(shadow_start, shadow_end, NUMA_NO_NODE);
+}
+#endif
+
 void __init kasan_init(void)
 {
 	kasan_init_shadow();
--- a/include/linux/kasan.h~kasan-arm64-fix-pcpu_page_first_chunk-crash-with-kasan_vmalloc
+++ a/include/linux/kasan.h
@@ -436,6 +436,8 @@ void kasan_release_vmalloc(unsigned long
 			   unsigned long free_region_start,
 			   unsigned long free_region_end);
 
+void kasan_populate_early_vm_area_shadow(void *start, unsigned long size);
+
 #else /* CONFIG_KASAN_VMALLOC */
 
 static inline int kasan_populate_vmalloc(unsigned long start,
@@ -453,6 +455,10 @@ static inline void kasan_release_vmalloc
 					 unsigned long free_region_start,
 					 unsigned long free_region_end) {}
 
+static inline void kasan_populate_early_vm_area_shadow(void *start,
+						       unsigned long size)
+{ }
+
 #endif /* CONFIG_KASAN_VMALLOC */
 
 #if (defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)) && \
--- a/mm/kasan/shadow.c~kasan-arm64-fix-pcpu_page_first_chunk-crash-with-kasan_vmalloc
+++ a/mm/kasan/shadow.c
@@ -254,6 +254,11 @@ core_initcall(kasan_memhotplug_init);
 
 #ifdef CONFIG_KASAN_VMALLOC
 
+void __init __weak kasan_populate_early_vm_area_shadow(void *start,
+						       unsigned long size)
+{
+}
+
 static int kasan_populate_vmalloc_pte(pte_t *ptep, unsigned long addr,
 				      void *unused)
 {
--- a/mm/vmalloc.c~kasan-arm64-fix-pcpu_page_first_chunk-crash-with-kasan_vmalloc
+++ a/mm/vmalloc.c
@@ -2291,6 +2291,7 @@ void __init vm_area_register_early(struc
 	vm->addr = (void *)addr;
 	vm->next = *p;
 	*p = vm;
+	kasan_populate_early_vm_area_shadow(vm->addr, vm->size);
 }
 
 static void vmap_init_free_space(void)

From patchwork Fri Nov 5 20:39:50 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605561
Date: Fri, 05 Nov 2021 13:39:50 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, david@fromorbit.com, hch@infradead.org, idryomov@gmail.com, jlayton@kernel.org, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, neilb@suse.de, torvalds@linux-foundation.org, urezki@gmail.com
Subject: [patch 099/262] mm/vmalloc: be more explicit about supported gfp flags
Message-ID: <20211105203950.AJ1Cnteeh%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Michal Hocko
Subject: mm/vmalloc: be more explicit about supported gfp flags

The core of the vmalloc allocator, __vmalloc_area_node, doesn't say anything about the gfp mask argument. Not all gfp flags are supported, though. Be more explicit about the constraints.

Link: https://lkml.kernel.org/r/20211020082545.4830-1-mhocko@kernel.org
Signed-off-by: Michal Hocko
Cc: Dave Chinner
Cc: Neil Brown
Cc: Christoph Hellwig
Cc: Uladzislau Rezki
Cc: Ilya Dryomov
Cc: Jeff Layton
Signed-off-by: Andrew Morton
---

 mm/vmalloc.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

--- a/mm/vmalloc.c~mm-vmalloc-be-more-explicit-about-supported-gfp-flags
+++ a/mm/vmalloc.c
@@ -2983,8 +2983,16 @@ fail:
  * @caller: caller's return address
  *
  * Allocate enough pages to cover @size from the page level
- * allocator with @gfp_mask flags. Map them into contiguous
- * kernel virtual space, using a pagetable protection of @prot.
+ * allocator with @gfp_mask flags. Please note that the full set of gfp
+ * flags are not supported. GFP_KERNEL would be a preferred allocation mode
+ * but GFP_NOFS and GFP_NOIO are supported as well. Zone modifiers are not
+ * supported. From the reclaim modifiers__GFP_DIRECT_RECLAIM is required (aka
+ * GFP_NOWAIT is not supported) and only __GFP_NOFAIL is supported (aka
+ * __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported).
+ * __GFP_NOWARN can be used to suppress error messages about failures.
+ *
+ * Map them into contiguous kernel virtual space, using a pagetable
+ * protection of @prot.
  *
  * Return: the address of the area or %NULL on failure
  */

From patchwork Fri Nov 5 20:39:53 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605563
Date: Fri, 05 Nov 2021 13:39:53 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, chenwandun@huawei.com, edumazet@google.com, guohanjun@huawei.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, npiggin@gmail.com, shakeelb@google.com, torvalds@linux-foundation.org, urezki@gmail.com, wangkefeng.wang@huawei.com
Subject: [patch 100/262] mm/vmalloc: introduce alloc_pages_bulk_array_mempolicy to accelerate memory allocation
Message-ID: <20211105203953.Ri4SHiHS7%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Chen Wandun
Subject: mm/vmalloc: introduce alloc_pages_bulk_array_mempolicy to accelerate memory allocation

"mm/vmalloc: fix numa spreading for large hash tables" will cause significant performance regressions in some situations, as Andrew mentioned in [1].
The main case is vmalloc: by default, vmalloc allocates pages with NUMA_NO_NODE, which results in pages being allocated one by one.

To solve this, __alloc_pages_bulk and mempolicy should be considered at the same time.

1) If a node is specified in the memory allocation request, all pages are allocated with __alloc_pages_bulk.

2) If memory is allocated by interleaving, calculate how many pages should be allocated on each node, and use __alloc_pages_bulk to allocate the pages on each node.

[1]: https://lore.kernel.org/lkml/CALvZod4G3SzP3kWxQYn0fj+VgG-G3yWXz=gz17+3N57ru1iajw@mail.gmail.com/t/#m750c8e3231206134293b089feaa090590afa0f60

[akpm@linux-foundation.org: coding style fixes]
[akpm@linux-foundation.org: make two functions static]
[akpm@linux-foundation.org: fix CONFIG_NUMA=n build]
Link: https://lkml.kernel.org/r/20211021080744.874701-3-chenwandun@huawei.com
Signed-off-by: Chen Wandun
Reviewed-by: Uladzislau Rezki (Sony)
Cc: Eric Dumazet
Cc: Shakeel Butt
Cc: Nicholas Piggin
Cc: Kefeng Wang
Cc: Hanjun Guo
Signed-off-by: Andrew Morton
---

 include/linux/gfp.h |    4 ++
 mm/mempolicy.c      |   82 ++++++++++++++++++++++++++++++++++++++++++
 mm/vmalloc.c        |   20 ++++++++--
 3 files changed, 102 insertions(+), 4 deletions(-)

--- a/include/linux/gfp.h~mm-vmalloc-introduce-alloc_pages_bulk_array_mempolicy-to-accelerate-memory-allocation
+++ a/include/linux/gfp.h
@@ -535,6 +535,10 @@ unsigned long __alloc_pages_bulk(gfp_t g
 				struct list_head *page_list,
 				struct page **page_array);
 
+unsigned long alloc_pages_bulk_array_mempolicy(gfp_t gfp,
+				unsigned long nr_pages,
+				struct page **page_array);
+
 /* Bulk allocate order-0 pages */
 static inline unsigned long
 alloc_pages_bulk_list(gfp_t gfp, unsigned long nr_pages, struct list_head *list)
--- a/mm/mempolicy.c~mm-vmalloc-introduce-alloc_pages_bulk_array_mempolicy-to-accelerate-memory-allocation
+++ a/mm/mempolicy.c
@@ -2196,6 +2196,88 @@ struct page *alloc_pages(gfp_t gfp, unsi
 }
 EXPORT_SYMBOL(alloc_pages);
 
+static unsigned long alloc_pages_bulk_array_interleave(gfp_t gfp,
+		struct mempolicy *pol, unsigned long nr_pages,
+		struct page **page_array)
+{
+	int nodes;
+	unsigned long nr_pages_per_node;
+	int delta;
+	int i;
+	unsigned long nr_allocated;
+	unsigned long total_allocated = 0;
+
+	nodes = nodes_weight(pol->nodes);
+	nr_pages_per_node = nr_pages / nodes;
+	delta = nr_pages - nodes * nr_pages_per_node;
+
+	for (i = 0; i < nodes; i++) {
+		if (delta) {
+			nr_allocated = __alloc_pages_bulk(gfp,
+					interleave_nodes(pol), NULL,
+					nr_pages_per_node + 1, NULL,
+					page_array);
+			delta--;
+		} else {
+			nr_allocated = __alloc_pages_bulk(gfp,
+					interleave_nodes(pol), NULL,
+					nr_pages_per_node, NULL, page_array);
+		}
+
+		page_array += nr_allocated;
+		total_allocated += nr_allocated;
+	}
+
+	return total_allocated;
+}
+
+static unsigned long alloc_pages_bulk_array_preferred_many(gfp_t gfp, int nid,
+		struct mempolicy *pol, unsigned long nr_pages,
+		struct page **page_array)
+{
+	gfp_t preferred_gfp;
+	unsigned long nr_allocated = 0;
+
+	preferred_gfp = gfp | __GFP_NOWARN;
+	preferred_gfp &= ~(__GFP_DIRECT_RECLAIM | __GFP_NOFAIL);
+
+	nr_allocated = __alloc_pages_bulk(preferred_gfp, nid, &pol->nodes,
+					  nr_pages, NULL, page_array);
+
+	if (nr_allocated < nr_pages)
+		nr_allocated += __alloc_pages_bulk(gfp, numa_node_id(), NULL,
+				nr_pages - nr_allocated, NULL,
+				page_array + nr_allocated);
+	return nr_allocated;
+}
+
+/* alloc pages bulk and mempolicy should be considered at the
+ * same time in some situation such as vmalloc.
+ *
+ * It can accelerate memory allocation especially interleaving
+ * allocate memory.
+ */
+unsigned long alloc_pages_bulk_array_mempolicy(gfp_t gfp,
+		unsigned long nr_pages, struct page **page_array)
+{
+	struct mempolicy *pol = &default_policy;
+
+	if (!in_interrupt() && !(gfp & __GFP_THISNODE))
+		pol = get_task_policy(current);
+
+	if (pol->mode == MPOL_INTERLEAVE)
+		return alloc_pages_bulk_array_interleave(gfp, pol,
+							 nr_pages, page_array);
+
+	if (pol->mode == MPOL_PREFERRED_MANY)
+		return alloc_pages_bulk_array_preferred_many(gfp,
+				numa_node_id(), pol, nr_pages, page_array);
+
+	return __alloc_pages_bulk(gfp, policy_node(gfp, pol, numa_node_id()),
+				  policy_nodemask(gfp, pol), nr_pages, NULL,
+				  page_array);
+}
+
 int vma_dup_policy(struct vm_area_struct *src, struct vm_area_struct *dst)
 {
 	struct mempolicy *pol = mpol_dup(vma_policy(src));
--- a/mm/vmalloc.c~mm-vmalloc-introduce-alloc_pages_bulk_array_mempolicy-to-accelerate-memory-allocation
+++ a/mm/vmalloc.c
@@ -2843,7 +2843,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 	 * to fails, fallback to a single page allocator that is
 	 * more permissive.
 	 */
-	if (!order && nid != NUMA_NO_NODE) {
+	if (!order) {
 		while (nr_allocated < nr_pages) {
 			unsigned int nr, nr_pages_request;
 
@@ -2855,8 +2855,20 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 			 */
 			nr_pages_request = min(100U, nr_pages - nr_allocated);
 
-			nr = alloc_pages_bulk_array_node(gfp, nid,
-				nr_pages_request, pages + nr_allocated);
+			/* memory allocation should consider mempolicy, we can't
+			 * wrongly use nearest node when nid == NUMA_NO_NODE,
+			 * otherwise memory may be allocated in only one node,
+			 * but mempolcy want to alloc memory by interleaving.
+			 */
+			if (IS_ENABLED(CONFIG_NUMA) && nid == NUMA_NO_NODE)
+				nr = alloc_pages_bulk_array_mempolicy(gfp,
+							nr_pages_request,
+							pages + nr_allocated);
+
+			else
+				nr = alloc_pages_bulk_array_node(gfp, nid,
+							nr_pages_request,
+							pages + nr_allocated);
 
 			nr_allocated += nr;
 			cond_resched();
@@ -2868,7 +2880,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 			if (nr != nr_pages_request)
 				break;
 		}
-	} else if (order)
+	} else
 		/*
 		 * Compound pages required for remap_vmalloc_page if
 		 * high-order pages.

From patchwork Fri Nov 5 20:39:56 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605565
Date: Fri, 05 Nov 2021 13:39:56 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, deng.changcheng@zte.com.cn, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, urezki@gmail.com, zealci@zte.com.cn
Subject: [patch 101/262] lib/test_vmalloc.c: use swap() to make code cleaner
Message-ID: <20211105203956.bvvtKSnFq%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Changcheng Deng
Subject: lib/test_vmalloc.c: use swap() to make code cleaner

Use swap() in order to make code
cleaner.  Issue found by coccinelle.

Link: https://lkml.kernel.org/r/20211028111443.15744-1-deng.changcheng@zte.com.cn
Signed-off-by: Changcheng Deng
Reported-by: Zeal Robot
Reviewed-by: Uladzislau Rezki (Sony)
Signed-off-by: Andrew Morton
---

 lib/test_vmalloc.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

--- a/lib/test_vmalloc.c~lib-test_vmallocc-use-swap-to-make-code-cleaner
+++ a/lib/test_vmalloc.c
@@ -393,7 +393,7 @@ static struct test_driver {
 static void shuffle_array(int *arr, int n)
 {
 	unsigned int rnd;
-	int i, j, x;
+	int i, j;
 
 	for (i = n - 1; i > 0; i--) {
 		get_random_bytes(&rnd, sizeof(rnd));
@@ -402,9 +402,7 @@ static void shuffle_array(int *arr, int
 		j = rnd % i;
 
 		/* Swap indexes. */
-		x = arr[i];
-		arr[i] = arr[j];
-		arr[j] = x;
+		swap(arr[i], arr[j]);
 	}
 }
 

From patchwork Fri Nov 5 20:39:59 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605567
Date: Fri, 05 Nov 2021 13:39:59 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, edumazet@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, npiggin@gmail.com, torvalds@linux-foundation.org
Subject: [patch 102/262] mm/large system hash: avoid possible NULL deref in alloc_large_system_hash
Message-ID: <20211105203959.pBhGbu8m4%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Eric Dumazet
Subject: mm/large system hash: avoid possible NULL deref in alloc_large_system_hash

If __vmalloc() returns NULL, is_vm_area_hugepages(NULL) will fault when CONFIG_HAVE_ARCH_HUGE_VMALLOC=y.

Link: https://lkml.kernel.org/r/20210915212530.2321545-1-eric.dumazet@gmail.com
Fixes: 121e6f3258fe ("mm/vmalloc: hugepage vmalloc mappings")
Signed-off-by: Eric Dumazet
Reviewed-by: Andrew Morton
Cc: Nicholas Piggin
Signed-off-by: Andrew Morton
---

 mm/page_alloc.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/mm/page_alloc.c~mm-large-system-hash-avoid-possible-null-deref-in-alloc_large_system_hash
+++ a/mm/page_alloc.c
@@ -8762,7 +8762,8 @@ void *__init alloc_large_system_hash(con
 		} else if (get_order(size) >= MAX_ORDER || hashdist) {
 			table = __vmalloc(size, gfp_flags);
 			virt = true;
-			huge = is_vm_area_hugepages(table);
+			if (table)
+				huge = is_vm_area_hugepages(table);
 		} else {
 			/*
 			 * If bucketsize is not a power-of-two, we may free

From patchwork Fri Nov 5 20:40:02 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605569
Date: Fri, 05 Nov 2021 13:40:02 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, david@redhat.com, linmiaohe@huawei.com, linux-mm@kvack.org, mgorman@techsingularity.net, mm-commits@vger.kernel.org, peterz@infradead.org, sfr@canb.auug.org.au, torvalds@linux-foundation.org, vbabka@suse.cz
Subject: [patch 103/262] mm/page_alloc.c: remove meaningless VM_BUG_ON() in pindex_to_order()
Message-ID: <20211105204002.Dy-gv70Y4%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Miaohe Lin
Subject: mm/page_alloc.c: remove meaningless VM_BUG_ON() in pindex_to_order()

Patch series "Cleanups and fixup for page_alloc", v2.

This series contains cleanups that remove a meaningless VM_BUG_ON(), use helpers to simplify the code, and remove an obsolete comment.  It also avoids allocating highmem pages via alloc_pages_exact[_nid].  More details can be found in the respective changelogs.

This patch (of 5):

It is meaningless to VM_BUG_ON() that order != pageblock_order just after setting order to pageblock_order.  Remove it.
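
For illustration, the redundancy can be sketched in plain C.  This is a minimal user-space approximation, not the kernel source: the constant values, the PAGEBLOCK_ORDER stand-in for pageblock_order, the assert() standing in for VM_BUG_ON(), and the order_to_pindex() counterpart are all assumptions made for the example.

```c
#include <assert.h>

/* Assumed values for illustration; the real ones depend on the kernel config. */
#define MIGRATE_PCPTYPES        3
#define PAGE_ALLOC_COSTLY_ORDER 3
#define PAGEBLOCK_ORDER         9	/* hypothetical stand-in for pageblock_order */

/* Assumed counterpart: map (migratetype, order) to a per-cpu list index. */
static unsigned int order_to_pindex(int migratetype, int order)
{
	int base = order;

	if (order > PAGE_ALLOC_COSTLY_ORDER) {
		/* Here a check is useful: order comes from the caller. */
		assert(order == PAGEBLOCK_ORDER);
		base = PAGE_ALLOC_COSTLY_ORDER + 1;
	}
	return MIGRATE_PCPTYPES * base + migratetype;
}

/* Reverse mapping, written the way the patch leaves it. */
static int pindex_to_order(unsigned int pindex)
{
	int order = pindex / MIGRATE_PCPTYPES;

	if (order > PAGE_ALLOC_COSTLY_ORDER)
		order = PAGEBLOCK_ORDER;
	/*
	 * An assert(order == PAGEBLOCK_ORDER) inside the branch above
	 * could never fire: order was assigned that value one line earlier.
	 */
	return order;
}
```

With these assumed values, order_to_pindex(2, PAGEBLOCK_ORDER) yields 14, and pindex_to_order(14) maps it back to PAGEBLOCK_ORDER.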
Link: https://lkml.kernel.org/r/20210902121242.41607-1-linmiaohe@huawei.com
Link: https://lkml.kernel.org/r/20210902121242.41607-2-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin
Acked-by: Mel Gorman
Reviewed-by: David Hildenbrand
Cc: Vlastimil Babka
Cc: Stephen Rothwell
Cc: Peter Zijlstra
Signed-off-by: Andrew Morton
---

 mm/page_alloc.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

--- a/mm/page_alloc.c~mm-page_allocc-remove-meaningless-vm_bug_on-in-pindex_to_order
+++ a/mm/page_alloc.c
@@ -677,10 +677,8 @@ static inline int pindex_to_order(unsign
 	int order = pindex / MIGRATE_PCPTYPES;
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	if (order > PAGE_ALLOC_COSTLY_ORDER) {
+	if (order > PAGE_ALLOC_COSTLY_ORDER)
 		order = pageblock_order;
-		VM_BUG_ON(order != pageblock_order);
-	}
 #else
 	VM_BUG_ON(order > PAGE_ALLOC_COSTLY_ORDER);
 #endif

From patchwork Fri Nov 5 20:40:05 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605571
Date: Fri, 05 Nov 2021 13:40:05 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, david@redhat.com, linmiaohe@huawei.com, linux-mm@kvack.org, mgorman@techsingularity.net, mm-commits@vger.kernel.org, peterz@infradead.org, sfr@canb.auug.org.au, torvalds@linux-foundation.org, vbabka@suse.cz
Subject: [patch 104/262] mm/page_alloc.c: simplify the code by using macro K()
Message-ID: <20211105204005.x38Vwho4S%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Miaohe Lin
Subject: mm/page_alloc.c: simplify the code by using macro K()

Use the helper macro K() to convert a page count to the corresponding size in kilobytes.  Minor readability improvement.

Link: https://lkml.kernel.org/r/20210902121242.41607-3-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin
Acked-by: Mel Gorman
Reviewed-by: David Hildenbrand
Cc: Peter Zijlstra
Cc: Stephen Rothwell
Cc: Vlastimil Babka
Signed-off-by: Andrew Morton
---

 mm/page_alloc.c |   12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

--- a/mm/page_alloc.c~mm-page_allocc-simplify-the-code-by-using-macro-k
+++ a/mm/page_alloc.c
@@ -8130,8 +8130,7 @@ unsigned long free_reserved_area(void *s
 	}
 
 	if (pages && s)
-		pr_info("Freeing %s memory: %ldK\n",
-			s, pages << (PAGE_SHIFT - 10));
+		pr_info("Freeing %s memory: %ldK\n", s, K(pages));
 
 	return pages;
 }
@@ -8176,14 +8175,13 @@ void __init mem_init_print_info(void)
 		", %luK highmem"
 #endif
 		")\n",
-		nr_free_pages() << (PAGE_SHIFT - 10),
-		physpages << (PAGE_SHIFT - 10),
+		K(nr_free_pages()), K(physpages),
 		codesize >> 10, datasize >> 10, rosize >> 10,
 		(init_data_size + init_code_size) >> 10, bss_size >> 10,
-		(physpages - totalram_pages() - totalcma_pages) << (PAGE_SHIFT - 10),
-		totalcma_pages << (PAGE_SHIFT - 10)
+		K(physpages - totalram_pages() - totalcma_pages),
+		K(totalcma_pages)
 #ifdef CONFIG_HIGHMEM
-		, totalhigh_pages() << (PAGE_SHIFT - 10)
+		, K(totalhigh_pages())
 #endif
 		);
 }

From patchwork Fri Nov 5 20:40:08 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605573
Date: Fri, 05 Nov 2021 13:40:08 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, david@redhat.com, linmiaohe@huawei.com, linux-mm@kvack.org, mgorman@techsingularity.net, mm-commits@vger.kernel.org, peterz@infradead.org, sfr@canb.auug.org.au, torvalds@linux-foundation.org, vbabka@suse.cz
Subject: [patch 105/262] mm/page_alloc.c: fix obsolete comment in free_pcppages_bulk()
Message-ID: <20211105204008.9CZvnKIC_%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Miaohe Lin
Subject: mm/page_alloc.c: fix obsolete comment in free_pcppages_bulk()

The last two paragraphs of the comment, about "all pages pinned" and pages_scanned, are obsolete.  In addition, there are PAGE_ALLOC_COSTLY_ORDER + 1 + NR_PCP_THP orders in pcp, so the same-order assumption no longer holds.
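
A rough sketch of why the same-order assumption fails, under assumed values: with PAGE_ALLOC_COSTLY_ORDER = 3 and NR_PCP_THP = 1 the pcp lists cover five orders, so entries of different sizes can sit side by side and a count expressed in pages is not a count of list entries.  The struct and helper names below are hypothetical, and order 9 is only a stand-in for pageblock_order.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration; the real ones depend on the kernel config. */
#define PAGE_ALLOC_COSTLY_ORDER 3
#define NR_PCP_THP              1
#define NR_PCP_ORDERS (PAGE_ALLOC_COSTLY_ORDER + 1 + NR_PCP_THP)

/* Hypothetical stand-in for one entry on a per-cpu page list. */
struct pcp_entry {
	unsigned int order;
};

/* Number of base pages covered by n list entries of possibly mixed order. */
static unsigned long pcp_pages_covered(const struct pcp_entry *entries, size_t n)
{
	unsigned long pages = 0;
	size_t i;

	for (i = 0; i < n; i++)
		pages += 1UL << entries[i].order;
	return pages;
}
```

Four entries of order 0, 0, 3 and 9 cover 1 + 1 + 8 + 512 = 522 pages, which is why the comment can only promise that count is a number of pages.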
Link: https://lkml.kernel.org/r/20210902121242.41607-4-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin
Acked-by: Mel Gorman
Cc: David Hildenbrand
Cc: Peter Zijlstra
Cc: Stephen Rothwell
Cc: Vlastimil Babka
Signed-off-by: Andrew Morton
---

 mm/page_alloc.c |    8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

--- a/mm/page_alloc.c~mm-page_allocc-fix-obsolete-comment-in-free_pcppages_bulk
+++ a/mm/page_alloc.c
@@ -1428,14 +1428,8 @@ static inline void prefetch_buddy(struct
 
 /*
  * Frees a number of pages from the PCP lists
- * Assumes all pages on list are in same zone, and of same order.
+ * Assumes all pages on list are in same zone.
  * count is the number of pages to free.
- *
- * If the zone was previously in an "all pages pinned" state then look to
- * see if this freeing clears that state.
- *
- * And clear the zone's pages_scanned counter, to hold off the "all pages are
- * pinned" detection logic.
  */
 static void free_pcppages_bulk(struct zone *zone, int count,
 					struct per_cpu_pages *pcp)

From patchwork Fri Nov 5 20:40:11 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605575
Date: Fri, 05 Nov 2021 13:40:11 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, david@redhat.com, linmiaohe@huawei.com, linux-mm@kvack.org, mgorman@techsingularity.net, mm-commits@vger.kernel.org, peterz@infradead.org, sfr@canb.auug.org.au, torvalds@linux-foundation.org, vbabka@suse.cz
Subject: [patch 106/262] mm/page_alloc.c: use helper function zone_spans_pfn()
Message-ID: <20211105204011.W9Nu5EuuH%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Miaohe Lin
Subject: mm/page_alloc.c: use helper function zone_spans_pfn()

Use the helper function zone_spans_pfn() to check whether a pfn is within a zone, which simplifies the code slightly.

Link: https://lkml.kernel.org/r/20210902121242.41607-5-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin
Acked-by: Mel Gorman
Reviewed-by: David Hildenbrand
Cc: Peter Zijlstra
Cc: Stephen Rothwell
Cc: Vlastimil Babka
Signed-off-by: Andrew Morton
---

 mm/page_alloc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/page_alloc.c~mm-page_allocc-use-helper-function-zone_spans_pfn
+++ a/mm/page_alloc.c
@@ -1583,7 +1583,7 @@ static void __meminit init_reserved_page
 	for (zid = 0; zid < MAX_NR_ZONES; zid++) {
 		struct zone *zone = &pgdat->node_zones[zid];
 
-		if (pfn >= zone->zone_start_pfn && pfn < zone_end_pfn(zone))
+		if (zone_spans_pfn(zone, pfn))
 			break;
 	}
 	__init_single_page(pfn_to_page(pfn), pfn, zid, nid);

From patchwork Fri Nov 5 20:40:15 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605577
Date: Fri, 05 Nov 2021 13:40:15 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, david@redhat.com, linmiaohe@huawei.com, linux-mm@kvack.org, mgorman@techsingularity.net, mm-commits@vger.kernel.org, peterz@infradead.org, sfr@canb.auug.org.au, torvalds@linux-foundation.org, vbabka@suse.cz
Subject: [patch 107/262] mm/page_alloc.c: avoid allocating highmem pages via alloc_pages_exact[_nid] Message-ID: <20211105204015.RExQsUpA5%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 21495104AAD6 X-Stat-Signature: e5e1h7o7ny3xfxfbcx9a19wusn1oufyw Authentication-Results: imf31.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=oz491ubQ; dmarc=none; spf=pass (imf31.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144808-188574 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Miaohe Lin Subject: mm/page_alloc.c: avoid allocating highmem pages via alloc_pages_exact[_nid] Don't use alloc_pages_exact[_nid]() with __GFP_HIGHMEM, because page_address() cannot represent highmem pages without kmap(). Newly allocated pages would leak, as page_address() will return NULL for highmem pages here. This works today only because no caller specifies __GFP_HIGHMEM.
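
The guard described above can be sketched in user space roughly as follows. This is only an illustration of the "warn once, then strip the unsupported flags" pattern. The DEMO_* names and bit values are hypothetical and are not the kernel's real __GFP_* definitions, and the warning is a plain fprintf() rather than WARN_ON_ONCE().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical flag bits for illustration; not the kernel's values. */
#define DEMO_GFP_KERNEL  0x01u
#define DEMO_GFP_HIGHMEM 0x02u
#define DEMO_GFP_COMP    0x40u

/*
 * Strip the flags that an "exact" allocator cannot honour, warning
 * only the first time a caller passes one of them.
 */
static unsigned int demo_sanitize_exact_gfp(unsigned int gfp_mask)
{
	static int warned;
	const unsigned int forbidden = DEMO_GFP_COMP | DEMO_GFP_HIGHMEM;

	if (gfp_mask & forbidden) {
		if (!warned) {
			fprintf(stderr, "demo: unsupported gfp flags 0x%x\n",
				gfp_mask & forbidden);
			warned = 1;
		}
		gfp_mask &= ~forbidden;
	}

	/* Postcondition: no forbidden bit survives. */
	assert((gfp_mask & forbidden) == 0);
	return gfp_mask;
}
```

Calling demo_sanitize_exact_gfp(DEMO_GFP_KERNEL | DEMO_GFP_HIGHMEM) yields DEMO_GFP_KERNEL, so a misbehaving caller still gets a usable mask instead of an address that page_address() cannot represent.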
Link: https://lkml.kernel.org/r/20210902121242.41607-6-linmiaohe@huawei.com Signed-off-by: Miaohe Lin Reviewed-by: David Hildenbrand Cc: Mel Gorman Cc: Peter Zijlstra Cc: Stephen Rothwell Cc: Vlastimil Babka Signed-off-by: Andrew Morton --- mm/page_alloc.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) --- a/mm/page_alloc.c~mm-page_allocc-avoid-allocating-highmem-pages-via-alloc_pages_exact +++ a/mm/page_alloc.c @@ -5610,8 +5610,8 @@ void *alloc_pages_exact(size_t size, gfp unsigned int order = get_order(size); unsigned long addr; - if (WARN_ON_ONCE(gfp_mask & __GFP_COMP)) - gfp_mask &= ~__GFP_COMP; + if (WARN_ON_ONCE(gfp_mask & (__GFP_COMP | __GFP_HIGHMEM))) + gfp_mask &= ~(__GFP_COMP | __GFP_HIGHMEM); addr = __get_free_pages(gfp_mask, order); return make_alloc_exact(addr, order, size); @@ -5635,8 +5635,8 @@ void * __meminit alloc_pages_exact_nid(i unsigned int order = get_order(size); struct page *p; - if (WARN_ON_ONCE(gfp_mask & __GFP_COMP)) - gfp_mask &= ~__GFP_COMP; + if (WARN_ON_ONCE(gfp_mask & (__GFP_COMP | __GFP_HIGHMEM))) + gfp_mask &= ~(__GFP_COMP | __GFP_HIGHMEM); p = alloc_pages_node(nid, gfp_mask, order); if (!p) From patchwork Fri Nov 5 20:40:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605579 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6F632C433F5 for ; Fri, 5 Nov 2021 20:40:21 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 259DA611C4 for ; Fri, 5 Nov 2021 20:40:21 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 259DA611C4 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: 
mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id B6120940068; Fri, 5 Nov 2021 16:40:20 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B1025940049; Fri, 5 Nov 2021 16:40:20 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9FFBC940068; Fri, 5 Nov 2021 16:40:20 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0129.hostedemail.com [216.40.44.129]) by kanga.kvack.org (Postfix) with ESMTP id 8D4A2940049 for ; Fri, 5 Nov 2021 16:40:20 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 4FD3418562F8C for ; Fri, 5 Nov 2021 20:40:20 +0000 (UTC) X-FDA: 78776044200.22.2AA79CF Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf17.hostedemail.com (Postfix) with ESMTP id 8C19DF00039E for ; Fri, 5 Nov 2021 20:40:19 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 92CDC60FBF; Fri, 5 Nov 2021 20:40:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144818; bh=YU7hP/JtUSwR4tnYDgs1/WcSM6xX8jJ9zMNKGyU4QxA=; h=Date:From:To:Subject:In-Reply-To:From; b=BsSyeuLHuhGxXa30Bat/K9AXfoyuGu7PEORd+GOILNG7UGyixMqL0Xcs0AYs8MZ9Y 1X6x2an9JtBgnDEAllOWgMRLFynOzf/84nIC3f5jLLsoyrn5atRhVCxioCjBmEsBIy JvqPh2RJpHJa4ckwRbnbOGvcVL2c2I87CAFQUHgI= Date: Fri, 05 Nov 2021 13:40:18 -0700 From: Andrew Morton To: akpm@linux-foundation.org, anshuman.khandual@arm.com, bharata@amd.com, kamezawa.hiroyu@jp.fujitsu.com, krupa.ramakrishnan@amd.com, lee.schermerhorn@hp.com, linux-mm@kvack.org, mgorman@suse.de, mm-commits@vger.kernel.org, Sadagopan.Srinivasan@amd.com, torvalds@linux-foundation.org Subject: [patch 108/262] mm/page_alloc: print node fallback order Message-ID: <20211105204018.ngy5d_MGm%akpm@linux-foundation.org> In-Reply-To: 
<20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=BsSyeuLH; dmarc=none; spf=pass (imf17.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 8C19DF00039E X-Stat-Signature: 6wh36tafr689mu6t4sk1mjiqb3r7r4a6 X-HE-Tag: 1636144819-918586 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Bharata B Rao Subject: mm/page_alloc: print node fallback order Patch series "Fix NUMA nodes fallback list ordering". For a NUMA system that has multiple nodes at same distance from other nodes, the fallback list generation prefers same node order for them instead of round-robin thereby penalizing one node over others. This series fixes it. More description of the problem and the fix is present in the patch description. This patch (of 2): Print information message about the allocation fallback order for each NUMA node during boot. No functional changes here. This makes it easier to illustrate the problem in the node fallback list generation, which the next patch fixes. 
Link: https://lkml.kernel.org/r/20210830121603.1081-1-bharata@amd.com Link: https://lkml.kernel.org/r/20210830121603.1081-2-bharata@amd.com Signed-off-by: Bharata B Rao Acked-by: Mel Gorman Reviewed-by: Anshuman Khandual Cc: KAMEZAWA Hiroyuki Cc: Lee Schermerhorn Cc: Krupa Ramakrishnan Cc: Sadagopan Srinivasan Signed-off-by: Andrew Morton --- mm/page_alloc.c | 4 ++++ 1 file changed, 4 insertions(+) --- a/mm/page_alloc.c~mm-page_alloc-print-node-fallback-order +++ a/mm/page_alloc.c @@ -6262,6 +6262,10 @@ static void build_zonelists(pg_data_t *p build_zonelists_in_node_order(pgdat, node_order, nr_nodes); build_thisnode_zonelists(pgdat); + pr_info("Fallback order for Node %d: ", local_node); + for (node = 0; node < nr_nodes; node++) + pr_cont("%d ", node_order[node]); + pr_cont("\n"); } #ifdef CONFIG_HAVE_MEMORYLESS_NODES From patchwork Fri Nov 5 20:40:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605581 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5E2F1C433EF for ; Fri, 5 Nov 2021 20:40:24 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 12223611C0 for ; Fri, 5 Nov 2021 20:40:24 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 12223611C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id A292B940069; Fri, 5 Nov 2021 16:40:23 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9B120940049; Fri, 5 Nov 2021 16:40:23 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, 
from userid 63042) id 8A21D940069; Fri, 5 Nov 2021 16:40:23 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0095.hostedemail.com [216.40.44.95]) by kanga.kvack.org (Postfix) with ESMTP id 78AB4940049 for ; Fri, 5 Nov 2021 16:40:23 -0400 (EDT) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 428A68249980 for ; Fri, 5 Nov 2021 20:40:23 +0000 (UTC) X-FDA: 78776044326.11.1C540A0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf31.hostedemail.com (Postfix) with ESMTP id 821E2104AAED for ; Fri, 5 Nov 2021 20:40:14 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id DE4A161252; Fri, 5 Nov 2021 20:40:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144822; bh=cqUpwh1vL29lqlqzPR/43fIRuS1KNjozAK1PsVth1+c=; h=Date:From:To:Subject:In-Reply-To:From; b=YOPwTKHK4yBcM6coe5G5uBmtB0K4mvpYRf9yH+OdEtvnIVrWq4IQTwiolBI4XIz/s zPFV/f1xZWERYQOm3OuiNzk4YdJc0uFgI2wH/l3Utg5YTvXoFmjM4pbI+eUgSLLczV 7+fXdw5gMlkIgLjFIpAXXG+QB1kltVzNV5DhXFhY= Date: Fri, 05 Nov 2021 13:40:21 -0700 From: Andrew Morton To: akpm@linux-foundation.org, anshuman.khandual@arm.com, bharata@amd.com, kamezawa.hiroyu@jp.fujitsu.com, krupa.ramakrishnan@amd.com, lee.schermerhorn@hp.com, linux-mm@kvack.org, mgorman@suse.de, mm-commits@vger.kernel.org, Sadagopan.Srinivasan@amd.com, torvalds@linux-foundation.org Subject: [patch 109/262] mm/page_alloc: use accumulated load when building node fallback list Message-ID: <20211105204021.LOWopzfhI%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf31.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=YOPwTKHK; dmarc=none; spf=pass (imf31.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted 
sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 821E2104AAED X-Stat-Signature: nn5cpgzhgr1z73tnihg4fxam1m4e3rcn X-HE-Tag: 1636144814-175373 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

From: Krupa Ramakrishnan
Subject: mm/page_alloc: use accumulated load when building node fallback list

In build_zonelists(), when the fallback list is built for the nodes, the
node load gets reinitialized during each iteration. This results in
nodes with same distances occupying the same slot in different node
fallback lists rather than appearing in the intended round-robin
manner. This results in one node getting picked for allocation more
compared to other nodes with the same distance.

As an example, consider a 4 node system with the following distance
matrix.

Node 0  1  2  3
----------------
0    10 12 32 32
1    12 10 32 32
2    32 32 10 12
3    32 32 12 10

For this case, the node fallback list gets built like this:

Node	Fallback list
---------------------
0	0 1 2 3
1	1 0 3 2
2	2 3 0 1
3	3 2 0 1 <-- Unexpected fallback order

In the fallback list for nodes 2 and 3, the nodes 0 and 1 appear in the
same order which results in more allocations getting satisfied from node
0 compared to node 1.

The effect of this on remote memory bandwidth as seen by stream
benchmark is shown below:

Case 1: Bandwidth from cores on nodes 2 & 3 to memory on nodes 0 & 1
	(numactl -m 0,1 ./stream_lowOverhead ... --cores )
Case 2: Bandwidth from cores on nodes 0 & 1 to memory on nodes 2 & 3
	(numactl -m 2,3 ./stream_lowOverhead ... --cores )

----------------------------------------
		BANDWIDTH (MB/s)
    TEST	Case 1		Case 2
----------------------------------------
    COPY	57479.6		110791.8
   SCALE	55372.9		105685.9
     ADD	50460.6		96734.2
  TRIADD	50397.6		97119.1
----------------------------------------

The bandwidth drop in Case 1 occurs because most of the allocations get
satisfied by node 0 as it appears first in the fallback order for both
nodes 2 and 3.

This can be fixed by accumulating the node load in build_zonelists()
rather than reinitializing it during each iteration. With this the
nodes with the same distance rightly get assigned in the round robin
manner.

In fact this was how it was originally until the commit f0c0b2b808f2
("change zonelist order: zonelist order selection logic") dropped the
load accumulation and resorted to initializing the load during each
iteration. While zonelist ordering was removed by commit c9bff3eebc09
("mm, page_alloc: rip out ZONELIST_ORDER_ZONE"), the change to the node
load accumulation in build_zonelists() remained. So essentially this
patch reverts back to the accumulated node load logic.

After this fix, the fallback order gets built like this:

Node Fallback list
------------------
0    0 1 2 3
1    1 0 3 2
2    2 3 0 1
3    3 2 1 0 <-- Note the change here

The bandwidth in Case 1 improves and matches Case 2 as shown below.

----------------------------------------
		BANDWIDTH (MB/s)
    TEST	Case 1		Case 2
----------------------------------------
    COPY	110438.9	110107.2
   SCALE	105930.5	105817.5
     ADD	97005.1		96159.8
  TRIADD	97441.5		96757.1
----------------------------------------

The correctness of the fallback list generation has been verified for
the above node configuration where the node 3 starts as memory-less node
and comes up online only during memory hotplug.
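
The ordering effect described in this changelog can be explored with a small user-space model. The sketch below is not the kernel code: the DEMO_* names, the scale factor, and the assumption that every node has CPUs and memory are all simplifications made for illustration. It uses the 4-node distance matrix from the changelog and lets the caller choose between overwriting the node load (the pre-patch behaviour) and accumulating it (the behaviour this patch restores).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

#define DEMO_NR_NODES 4
/* Assumed scale so that load only breaks ties between equal distances. */
#define DEMO_LOAD_SCALE (DEMO_NR_NODES * DEMO_NR_NODES)

/* Distance matrix taken from the changelog. */
static const int demo_distance[DEMO_NR_NODES][DEMO_NR_NODES] = {
	{ 10, 12, 32, 32 },
	{ 12, 10, 32, 32 },
	{ 32, 32, 10, 12 },
	{ 32, 32, 12, 10 },
};

static int demo_node_load[DEMO_NR_NODES];

/* Pick the unused node with the lowest (distance, load) score. */
static int demo_next_best_node(int local, bool used[DEMO_NR_NODES])
{
	int best = -1, min_val = INT_MAX;

	if (!used[local]) {		/* the local node always comes first */
		used[local] = true;
		return local;
	}

	for (int n = 0; n < DEMO_NR_NODES; n++) {
		int val;

		if (used[n])
			continue;
		val = demo_distance[local][n];
		val += (n < local);	/* slightly prefer the "next" node */
		val += 1;		/* model: every node has CPUs */
		val = val * DEMO_LOAD_SCALE + demo_node_load[n];
		if (val < min_val) {
			min_val = val;
			best = n;
		}
	}

	if (best >= 0)
		used[best] = true;
	return best;
}

/* Build the fallback order of every node; order[i] is node i's list. */
static void demo_build_all_orders(bool accumulate,
				  int order[DEMO_NR_NODES][DEMO_NR_NODES])
{
	memset(demo_node_load, 0, sizeof(demo_node_load));

	for (int local = 0; local < DEMO_NR_NODES; local++) {
		bool used[DEMO_NR_NODES] = { false };
		int load = DEMO_NR_NODES, prev = local, nr = 0, node;

		while ((node = demo_next_best_node(local, used)) >= 0) {
			/* Penalize the first node of each distance group. */
			if (demo_distance[local][node] !=
			    demo_distance[local][prev]) {
				if (accumulate)
					demo_node_load[node] += load;
				else
					demo_node_load[node] = load;
			}
			order[local][nr++] = node;
			prev = node;
			load--;
		}
		assert(nr == DEMO_NR_NODES);
	}
}
```

In this model, demo_build_all_orders(false, order) gives node 3 the list 3 2 0 1, while demo_build_all_orders(true, order) gives 3 2 1 0, matching the two fallback tables above. The only difference between the two runs is the `=` versus `+=` on the node load.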

[bharata@amd.com: Added changelog, review, test validation]
Link: https://lkml.kernel.org/r/20210830121603.1081-3-bharata@amd.com
Fixes: f0c0b2b808f2 ("change zonelist order: zonelist order selection logic")
Signed-off-by: Krupa Ramakrishnan
Co-developed-by: Sadagopan Srinivasan
Signed-off-by: Sadagopan Srinivasan
Signed-off-by: Bharata B Rao
Acked-by: Mel Gorman
Reviewed-by: Anshuman Khandual
Cc: KAMEZAWA Hiroyuki
Cc: Lee Schermerhorn
Signed-off-by: Andrew Morton
---

 mm/page_alloc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/page_alloc.c~mm-page_alloc-use-accumulated-load-when-building-node-fallback-list
+++ a/mm/page_alloc.c
@@ -6253,7 +6253,7 @@ static void build_zonelists(pg_data_t *p
 		 */
 		if (node_distance(local_node, node) !=
 		    node_distance(local_node, prev_node))
-			node_load[node] = load;
+			node_load[node] += load;
 
 		node_order[nr_nodes++] = node;
 		prev_node = node;

From patchwork Fri Nov 5 20:40:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605583 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 04E21C433F5 for ; Fri, 5 Nov 2021 20:40:28 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B05D861262 for ; Fri, 5 Nov 2021 20:40:27 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org B05D861262 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 4CCCB94006A; Fri, 5 Nov 2021 16:40:27 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 45922940049; Fri, 5 Nov 2021 16:40:27 -0400 (EDT) X-Delivered-To:
int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3218694006A; Fri, 5 Nov 2021 16:40:27 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0168.hostedemail.com [216.40.44.168]) by kanga.kvack.org (Postfix) with ESMTP id 0991A940049 for ; Fri, 5 Nov 2021 16:40:27 -0400 (EDT) Received: from smtpin02.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id C6D198249980 for ; Fri, 5 Nov 2021 20:40:26 +0000 (UTC) X-FDA: 78776044452.02.EEB45F5 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf22.hostedemail.com (Postfix) with ESMTP id 615C51908 for ; Fri, 5 Nov 2021 20:40:26 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 2744A6126A; Fri, 5 Nov 2021 20:40:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144825; bh=P0QIEuxGBYHx79EtCWiPsmTJpCCbBMmKsyWIt75zDBw=; h=Date:From:To:Subject:In-Reply-To:From; b=uL37d0FQNyr4n0/ZBjjuoTmeEgW6KOpJtQi7pclLf4qJrD5saD9WmhfXXwsFYIIzo KNyuGXYyNUKEU9tA84ycwqGUKJDrMZJyYI604TdL/jZ+vAZMx4jyajbY1FKtpYmq6p QYwh5GBtp5UwqmIhnzApvVbiInoAmiZEZf/lnpZY= Date: Fri, 05 Nov 2021 13:40:24 -0700 From: Andrew Morton To: akpm@linux-foundation.org, dalias@libc.org, geert+renesas@glider.be, gonsolo@gmail.com, juri.lelli@redhat.com, linux-mm@kvack.org, matt@codeblueprint.co.uk, mgorman@suse.de, mingo@redhat.com, mm-commits@vger.kernel.org, peterz@infradead.org, torvalds@linux-foundation.org, vbabka@suse.cz, vincent.guittot@linaro.org, ysato@users.osdn.me Subject: [patch 110/262] mm: move node_reclaim_distance to fix NUMA without SMP Message-ID: <20211105204024.xjOD4DN7W%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 615C51908 X-Stat-Signature: thensxu5deauc1gj1h588o36tbxus3an Authentication-Results: 
imf22.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=uL37d0FQ; spf=pass (imf22.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144826-698752 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Geert Uytterhoeven Subject: mm: move node_reclaim_distance to fix NUMA without SMP Patch series "Fix NUMA without SMP". SuperH is the only architecture which still supports NUMA without SMP, for good reasons (various memories scattered around the address space, each with varying latencies). This series fixes two build errors due to variables and functions used by the NUMA code being provided by SMP-only source files or sections. This patch (of 2): If CONFIG_NUMA=y, but CONFIG_SMP=n (e.g. sh/migor_defconfig): sh4-linux-gnu-ld: mm/page_alloc.o: in function `get_page_from_freelist': page_alloc.c:(.text+0x2c24): undefined reference to `node_reclaim_distance' Fix this by moving the declaration of node_reclaim_distance from an SMP-only to a generic file. 
Link: https://lkml.kernel.org/r/cover.1631781495.git.geert+renesas@glider.be Link: https://lkml.kernel.org/r/6432666a648dde85635341e6c918cee97c97d264.1631781495.git.geert+renesas@glider.be Fixes: a55c7454a8c887b2 ("sched/topology: Improve load balancing on AMD EPYC systems") Signed-off-by: Geert Uytterhoeven Suggested-by: Matt Fleming Acked-by: Mel Gorman Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Juri Lelli Cc: Vincent Guittot Cc: Vlastimil Babka Cc: Yoshinori Sato Cc: Rich Felker Cc: Gon Solo Cc: Geert Uytterhoeven Signed-off-by: Andrew Morton --- kernel/sched/topology.c | 1 - mm/page_alloc.c | 2 ++ 2 files changed, 2 insertions(+), 1 deletion(-) --- a/kernel/sched/topology.c~mm-move-node_reclaim_distance-to-fix-numa-without-smp +++ a/kernel/sched/topology.c @@ -1481,7 +1481,6 @@ static int sched_domains_curr_level; int sched_max_numa_distance; static int *sched_domains_numa_distance; static struct cpumask ***sched_domains_numa_masks; -int __read_mostly node_reclaim_distance = RECLAIM_DISTANCE; static unsigned long __read_mostly *sched_numa_onlined_nodes; #endif --- a/mm/page_alloc.c~mm-move-node_reclaim_distance-to-fix-numa-without-smp +++ a/mm/page_alloc.c @@ -3960,6 +3960,8 @@ bool zone_watermark_ok_safe(struct zone } #ifdef CONFIG_NUMA +int __read_mostly node_reclaim_distance = RECLAIM_DISTANCE; + static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone) { return node_distance(zone_to_nid(local_zone), zone_to_nid(zone)) <= From patchwork Fri Nov 5 20:40:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605585 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4B46BC433FE for ; Fri, 5 Nov 2021 20:40:31 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org 
[205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 02FED61262 for ; Fri, 5 Nov 2021 20:40:30 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 02FED61262 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 8F06794006B; Fri, 5 Nov 2021 16:40:30 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 89FEC940049; Fri, 5 Nov 2021 16:40:30 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7682C94006B; Fri, 5 Nov 2021 16:40:30 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0215.hostedemail.com [216.40.44.215]) by kanga.kvack.org (Postfix) with ESMTP id 6373F940049 for ; Fri, 5 Nov 2021 16:40:30 -0400 (EDT) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 2B686779B2 for ; Fri, 5 Nov 2021 20:40:30 +0000 (UTC) X-FDA: 78776044620.24.36137F6 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf03.hostedemail.com (Postfix) with ESMTP id DD17F30000B0 for ; Fri, 5 Nov 2021 20:40:22 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 818D5611C0; Fri, 5 Nov 2021 20:40:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144828; bh=vvUHKsxEpicZ251+SkDu9X2UOc8I4Uix0HUuSo0JoGo=; h=Date:From:To:Subject:In-Reply-To:From; b=pv9mhkGAGSDKHPHF6jPbmaoAZ489HrAAou1PAU+459ZseTAT6iXCQblck1fmD6Ndw lG6xy5XnwEsbKmwZwdBiJa/ITzrzYUxSkxFL8a8KIk2HB00aEY/HFdsd2yJQmR1O/a ftXZsGjC7CAKzH5gHw1ecbrWTPJAnLHjRVPpYYBE= Date: Fri, 05 Nov 2021 13:40:28 -0700 From: Andrew Morton To: akpm@linux-foundation.org, dalias@libc.org, geert+renesas@glider.be, gonsolo@gmail.com, juri.lelli@redhat.com, linux-mm@kvack.org, 
matt@codeblueprint.co.uk, mgorman@suse.de, mingo@redhat.com, mm-commits@vger.kernel.org, peterz@infradead.org, torvalds@linux-foundation.org, vbabka@suse.cz, vincent.guittot@linaro.org, ysato@users.osdn.me Subject: [patch 111/262] mm: move fold_vm_numa_events() to fix NUMA without SMP Message-ID: <20211105204028.gK19bresm%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=pv9mhkGA; dmarc=none; spf=pass (imf03.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: DD17F30000B0 X-Stat-Signature: mwfcebpt4ea58m4tc8yeeifsbsqfzsak X-HE-Tag: 1636144822-649056 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Geert Uytterhoeven Subject: mm: move fold_vm_numa_events() to fix NUMA without SMP If CONFIG_NUMA=y, but CONFIG_SMP=n (e.g. sh/migor_defconfig): sh4-linux-gnu-ld: mm/vmstat.o: in function `vmstat_start': vmstat.c:(.text+0x97c): undefined reference to `fold_vm_numa_events' sh4-linux-gnu-ld: drivers/base/node.o: in function `node_read_vmstat': node.c:(.text+0x140): undefined reference to `fold_vm_numa_events' sh4-linux-gnu-ld: drivers/base/node.o: in function `node_read_numastat': node.c:(.text+0x1d0): undefined reference to `fold_vm_numa_events' Fix this by moving fold_vm_numa_events() outside the SMP-only section. 
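
The link failure and its fix come down to which preprocessor section a definition lives in. The following user-space sketch is a loose model of that situation, under the assumption of hypothetical DEMO_CONFIG_* macros standing in for the Kconfig symbols and plain arrays standing in for the per-CPU and zone counters. The folding loop is a non-atomic simplification of what the kernel does with xchg().

```c
#include <assert.h>
#include <stddef.h>

/* Model of a NUMA=y, SMP=n build: only the NUMA macro is defined. */
#define DEMO_CONFIG_NUMA 1
/* DEMO_CONFIG_SMP is deliberately left undefined. */

#define DEMO_NR_CPUS   2
#define DEMO_NR_EVENTS 3

static unsigned long demo_cpu_events[DEMO_NR_CPUS][DEMO_NR_EVENTS];
static unsigned long demo_zone_events[DEMO_NR_EVENTS];

#ifdef DEMO_CONFIG_NUMA
/*
 * Defined in its own NUMA-only section, outside any SMP-only block,
 * so callers still find it when SMP is disabled.
 */
static void demo_fold_numa_events(void)
{
	for (size_t item = 0; item < DEMO_NR_EVENTS; item++) {
		unsigned long sum = 0;

		for (size_t cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
			/* Drain the per-CPU counter into the total. */
			sum += demo_cpu_events[cpu][item];
			demo_cpu_events[cpu][item] = 0;
		}
		demo_zone_events[item] += sum;
	}
}
#endif /* DEMO_CONFIG_NUMA */

#ifdef DEMO_CONFIG_SMP
/*
 * SMP-only helpers would go here. Had demo_fold_numa_events() been
 * nested in this block, it would not exist in a NUMA=y, SMP=n build
 * and its callers would fail to link.
 */
#endif /* DEMO_CONFIG_SMP */
```

After setting a few per-CPU counters and calling demo_fold_numa_events(), the per-CPU values are zero and their sum has been added to demo_zone_events.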
Link: https://lkml.kernel.org/r/9d16ccdd9ef32803d7100c84f737de6a749314fb.1631781495.git.geert+renesas@glider.be Fixes: f19298b9516c1a03 ("mm/vmstat: convert NUMA statistics to basic NUMA counters") Signed-off-by: Geert Uytterhoeven Acked-by: Mel Gorman Cc: Gon Solo Cc: Ingo Molnar Cc: Juri Lelli Cc: Matt Fleming Cc: Peter Zijlstra Cc: Rich Felker Cc: Vincent Guittot Cc: Vlastimil Babka Cc: Yoshinori Sato Signed-off-by: Andrew Morton --- mm/vmstat.c | 56 +++++++++++++++++++++++++------------------------- 1 file changed, 28 insertions(+), 28 deletions(-) --- a/mm/vmstat.c~mm-move-fold_vm_numa_events-to-fix-numa-without-smp +++ a/mm/vmstat.c @@ -165,6 +165,34 @@ atomic_long_t vm_numa_event[NR_VM_NUMA_E EXPORT_SYMBOL(vm_zone_stat); EXPORT_SYMBOL(vm_node_stat); +#ifdef CONFIG_NUMA +static void fold_vm_zone_numa_events(struct zone *zone) +{ + unsigned long zone_numa_events[NR_VM_NUMA_EVENT_ITEMS] = { 0, }; + int cpu; + enum numa_stat_item item; + + for_each_online_cpu(cpu) { + struct per_cpu_zonestat *pzstats; + + pzstats = per_cpu_ptr(zone->per_cpu_zonestats, cpu); + for (item = 0; item < NR_VM_NUMA_EVENT_ITEMS; item++) + zone_numa_events[item] += xchg(&pzstats->vm_numa_event[item], 0); + } + + for (item = 0; item < NR_VM_NUMA_EVENT_ITEMS; item++) + zone_numa_event_add(zone_numa_events[item], zone, item); +} + +void fold_vm_numa_events(void) +{ + struct zone *zone; + + for_each_populated_zone(zone) + fold_vm_zone_numa_events(zone); +} +#endif + #ifdef CONFIG_SMP int calculate_pressure_threshold(struct zone *zone) @@ -771,34 +799,6 @@ static int fold_diff(int *zone_diff, int return changes; } -#ifdef CONFIG_NUMA -static void fold_vm_zone_numa_events(struct zone *zone) -{ - unsigned long zone_numa_events[NR_VM_NUMA_EVENT_ITEMS] = { 0, }; - int cpu; - enum numa_stat_item item; - - for_each_online_cpu(cpu) { - struct per_cpu_zonestat *pzstats; - - pzstats = per_cpu_ptr(zone->per_cpu_zonestats, cpu); - for (item = 0; item < NR_VM_NUMA_EVENT_ITEMS; item++) - 
zone_numa_events[item] += xchg(&pzstats->vm_numa_event[item], 0); - } - - for (item = 0; item < NR_VM_NUMA_EVENT_ITEMS; item++) - zone_numa_event_add(zone_numa_events[item], zone, item); -} - -void fold_vm_numa_events(void) -{ - struct zone *zone; - - for_each_populated_zone(zone) - fold_vm_zone_numa_events(zone); -} -#endif - /* * Update the zone counters for the current cpu. * From patchwork Fri Nov 5 20:40:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605587 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3367EC433EF for ; Fri, 5 Nov 2021 20:40:34 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id DB47861279 for ; Fri, 5 Nov 2021 20:40:33 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org DB47861279 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 7427694006C; Fri, 5 Nov 2021 16:40:33 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 6D5BC940049; Fri, 5 Nov 2021 16:40:33 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5B8EB94006C; Fri, 5 Nov 2021 16:40:33 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0058.hostedemail.com [216.40.44.58]) by kanga.kvack.org (Postfix) with ESMTP id 4808D940049 for ; Fri, 5 Nov 2021 16:40:33 -0400 (EDT) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 08457779B2 for ; Fri, 5 Nov 
2021 20:40:33 +0000 (UTC) X-FDA: 78776044704.26.112E618 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf09.hostedemail.com (Postfix) with ESMTP id 7E958300010B for ; Fri, 5 Nov 2021 20:40:32 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id A4DC561284; Fri, 5 Nov 2021 20:40:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144831; bh=SjisieUTqCjhiTEM6vcVSU74YHf/mekk/+PYp9VXgEo=; h=Date:From:To:Subject:In-Reply-To:From; b=oyPyt1LDMiicWc3y85oiLsYtU8YS9nKuXvWpWUZgSbACn9Rv6SlrKQKInoMPakLdC QiMZ4AB/jXy+0WAwjwfiZQUeuonqw2tUJR6eIsbWWoxBCHM3QSMA69nWdNcYpt+0ax 6Dd9Mo6ZWTSomWdGTg5GV35XPOYG0jRKw/bREsd4= Date: Fri, 05 Nov 2021 13:40:31 -0700 From: Andrew Morton To: akpm@linux-foundation.org, edumazet@google.com, hughd@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 112/262] mm/page_alloc.c: do not acquire zone lock in is_free_buddy_page() Message-ID: <20211105204031.LpK4bsEk0%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 7E958300010B X-Stat-Signature: i5wceg4wpt6zhisozckzf8bx8i9h3wos Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=oyPyt1LD; spf=pass (imf09.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144832-12977 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Eric Dumazet Subject: mm/page_alloc.c: do not acquire zone lock in is_free_buddy_page() Grabbing zone lock in is_free_buddy_page() gives a wrong sense of safety, and has potential performance implications when zone is experiencing lock contention. 
In any case, if a caller needs a stable result, it should grab zone lock before calling this function. Link: https://lkml.kernel.org/r/20210922152833.4023972-1-eric.dumazet@gmail.com Signed-off-by: Eric Dumazet Acked-by: Hugh Dickins Signed-off-by: Andrew Morton --- mm/page_alloc.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) --- a/mm/page_alloc.c~mm-do-not-acquire-zone-lock-in-is_free_buddy_page +++ a/mm/page_alloc.c @@ -9356,21 +9356,21 @@ void __offline_isolated_pages(unsigned l } #endif +/* + * This function returns a stable result only if called under zone lock. + */ bool is_free_buddy_page(struct page *page) { - struct zone *zone = page_zone(page); unsigned long pfn = page_to_pfn(page); - unsigned long flags; unsigned int order; - spin_lock_irqsave(&zone->lock, flags); for (order = 0; order < MAX_ORDER; order++) { struct page *page_head = page - (pfn & ((1 << order) - 1)); - if (PageBuddy(page_head) && buddy_order(page_head) >= order) + if (PageBuddy(page_head) && + buddy_order_unsafe(page_head) >= order) break; } - spin_unlock_irqrestore(&zone->lock, flags); return order < MAX_ORDER; } From patchwork Fri Nov 5 20:40:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605589 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9AD37C433F5 for ; Fri, 5 Nov 2021 20:40:37 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4F3696126A for ; Fri, 5 Nov 2021 20:40:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 4F3696126A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass 
smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id DF93394006D; Fri, 5 Nov 2021 16:40:36 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id DA969940049; Fri, 5 Nov 2021 16:40:36 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C4A0D94006D; Fri, 5 Nov 2021 16:40:36 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0216.hostedemail.com [216.40.44.216]) by kanga.kvack.org (Postfix) with ESMTP id B19D0940049 for ; Fri, 5 Nov 2021 16:40:36 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 6A2EB1856B121 for ; Fri, 5 Nov 2021 20:40:36 +0000 (UTC) X-FDA: 78776044872.22.E947DA1 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf01.hostedemail.com (Postfix) with ESMTP id 42686508FA62 for ; Fri, 5 Nov 2021 20:40:24 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id D685561279; Fri, 5 Nov 2021 20:40:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144835; bh=JchrKjC5sGRpdLNmoyAD18ldsqt4V8K71+Ld+uTLGEY=; h=Date:From:To:Subject:In-Reply-To:From; b=jb+omwY4Eyu2SbDoCZIIVcG4RMfsEVsFvDZaW3gf4dh/SU9V8pqA6Of9UirtoSQvT 7/tsnva4vCQK1406V2XLZAcjQvP5j8okVqRlgvX/nhOkvLUoz1XW6TJhxFs8oNbXD1 y51wndePTDobX4SENx/aR6DJ/pWpSIeSsadfIldo= Date: Fri, 05 Nov 2021 13:40:34 -0700 From: Andrew Morton To: akpm@linux-foundation.org, feng.tang@intel.com, hannes@cmpxchg.org, linux-mm@kvack.org, lizefan.x@bytedance.com, mgorman@techsingularity.net, mhocko@suse.com, mm-commits@vger.kernel.org, rientjes@google.com, tj@kernel.org, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 113/262] mm/page_alloc: detect allocation forbidden by cpuset and bail out early Message-ID: <20211105204034.elkqRsCEe%akpm@linux-foundation.org> In-Reply-To: 
<20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 42686508FA62 X-Stat-Signature: ux1oj8xn1oo6ajmzqzfd3w8u8ruec54h Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=jb+omwY4; spf=pass (imf01.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144824-76037 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Feng Tang Subject: mm/page_alloc: detect allocation forbidden by cpuset and bail out early There was a report that starting an Ubuntu in docker while using cpuset to bind it to movable nodes (a node only has movable zone, like a node for hotplug or a Persistent Memory node in normal usage) will fail due to memory allocation failure, and then OOM is involved and many other innocent processes got killed. It can be reproduced with command: $docker run -it --rm --cpuset-mems 4 ubuntu:latest bash -c "grep Mems_allowed /proc/self/status" (node 4 is a movable node) runc:[2:INIT] invoked oom-killer: gfp_mask=0x500cc2(GFP_HIGHUSER|__GFP_ACCOUNT), order=0, oom_score_adj=0 CPU: 8 PID: 8291 Comm: runc:[2:INIT] Tainted: G W I E 5.8.2-0.g71b519a-default #1 openSUSE Tumbleweed (unreleased) Hardware name: Dell Inc. 
PowerEdge R640/0PHYDR, BIOS 2.6.4 04/09/2020 Call Trace: dump_stack+0x6b/0x88 dump_header+0x4a/0x1e2 oom_kill_process.cold+0xb/0x10 out_of_memory.part.0+0xaf/0x230 out_of_memory+0x3d/0x80 __alloc_pages_slowpath.constprop.0+0x954/0xa20 __alloc_pages_nodemask+0x2d3/0x300 pipe_write+0x322/0x590 new_sync_write+0x196/0x1b0 vfs_write+0x1c3/0x1f0 ksys_write+0xa7/0xe0 do_syscall_64+0x52/0xd0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Mem-Info: active_anon:392832 inactive_anon:182 isolated_anon:0 active_file:68130 inactive_file:151527 isolated_file:0 unevictable:2701 dirty:0 writeback:7 slab_reclaimable:51418 slab_unreclaimable:116300 mapped:45825 shmem:735 pagetables:2540 bounce:0 free:159849484 free_pcp:73 free_cma:0 Node 4 active_anon:1448kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB all_unreclaimable? no Node 4 Movable free:130021408kB min:9140kB low:139160kB high:269180kB reserved_highatomic:0KB active_anon:1448kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:130023424kB managed:130023424kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:292kB local_pcp:84kB free_cma:0kB lowmem_reserve[]: 0 0 0 0 0 Node 4 Movable: 1*4kB (M) 0*8kB 0*16kB 1*32kB (M) 0*64kB 0*128kB 1*256kB (M) 1*512kB (M) 1*1024kB (M) 0*2048kB 31743*4096kB (M) = 130021156kB oom-kill:constraint=CONSTRAINT_CPUSET,nodemask=(null),cpuset=docker-9976a269caec812c134fa317f27487ee36e1129beba7278a463dd53e5fb9997b.scope,mems_allowed=4,global_oom,task_memcg=/system.slice/containerd.service,task=containerd,pid=4100,uid=0 Out of memory: Killed process 4100 (containerd) total-vm:4077036kB, anon-rss:51184kB, file-rss:26016kB, shmem-rss:0kB, UID:0 pgtables:676kB oom_score_adj:0 oom_reaper: reaped process 8248 (docker), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB oom_reaper: reaped 
process 2054 (node_exporter), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
oom_reaper: reaped process 1452 (systemd-journal), now anon-rss:0kB, file-rss:8564kB, shmem-rss:4kB
oom_reaper: reaped process 2146 (munin-node), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
oom_reaper: reaped process 8291 (runc:[2:INIT]), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

The reason is that, in this case, the target cpuset nodes have only a movable zone, while creating an OS in docker sometimes needs to allocate memory from non-movable zones (dma/dma32/normal), for example with GFP_HIGHUSER. The cpuset limit forbids that allocation, so out-of-memory killing is invoked even though normal nodes and movable nodes both have plenty of free memory.

The OOM killer cannot help to resolve the situation as there is no usable memory for the request in the cpuset scope. The only reasonable measure to take is to fail the allocation right away and have the caller deal with it.

So add a check for cases like this in the slowpath of allocation, and bail out early returning NULL for the allocation.

As page allocation is one of the hottest paths in the kernel, an unconditional check would hurt all users with a sane cpuset configuration. So add a static branch check and detect the abnormal config in cpuset memory binding setup, so that the extra check cost in page allocation is not paid by everyone.
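A minimal user-space sketch of the two-stage check described above, not the kernel code. A plain bool stands in for the static key, and struct node_model, any_zone_usable(), note_mems_allowed() and should_bail_out() are hypothetical names introduced only for illustration. The __GFP_HARDWALL test done by the real slowpath is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names are kernel APIs. */
enum zone_idx { Z_DMA, Z_DMA32, Z_NORMAL, Z_MOVABLE, Z_COUNT };

struct node_model {
	unsigned int populated_zones;	/* bit i set: zone i has memory */
};

/* Stands in for the static key: set once when a bad config is seen. */
static bool insane_config_seen;

/* True if some allowed node has a populated zone with index <= highest. */
static bool any_zone_usable(const struct node_model *nodes, size_t nr_nodes,
			    unsigned long allowed, enum zone_idx highest)
{
	unsigned int zone_mask = (1u << (highest + 1)) - 1;

	for (size_t nid = 0; nid < nr_nodes; nid++) {
		if (!(allowed & (1ul << nid)))
			continue;
		if (nodes[nid].populated_zones & zone_mask)
			return true;
	}
	return false;
}

/* Setup time: inspect the node mask when the binding is written. */
static void note_mems_allowed(const struct node_model *nodes, size_t nr_nodes,
			      unsigned long allowed)
{
	if (allowed && !any_zone_usable(nodes, nr_nodes, allowed, Z_NORMAL))
		insane_config_seen = true;
}

/* Slow path: returns true when the allocation should fail right away. */
static bool should_bail_out(const struct node_model *nodes, size_t nr_nodes,
			    unsigned long allowed, enum zone_idx highest)
{
	if (!insane_config_seen)	/* the only cost for sane setups */
		return false;
	return !any_zone_usable(nodes, nr_nodes, allowed, highest);
}
```

The point of the split is that the expensive zone walk in should_bail_out() only runs after some binding has already been flagged at setup time.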
[thanks to Michal Hocko and David Rientjes for suggesting not handling it inside OOM code, adding cpuset check, refining comments] Link: https://lkml.kernel.org/r/1632481657-68112-1-git-send-email-feng.tang@intel.com Signed-off-by: Feng Tang Suggested-by: Michal Hocko Acked-by: Michal Hocko Cc: David Rientjes Cc: Tejun Heo Cc: Zefan Li Cc: Johannes Weiner Cc: Mel Gorman Cc: Vlastimil Babka Signed-off-by: Andrew Morton --- include/linux/cpuset.h | 17 +++++++++++++++++ include/linux/mmzone.h | 22 ++++++++++++++++++++++ kernel/cgroup/cpuset.c | 23 +++++++++++++++++++++++ mm/page_alloc.c | 13 +++++++++++++ 4 files changed, 75 insertions(+) --- a/include/linux/cpuset.h~mm-page_alloc-detect-allocation-forbidden-by-cpuset-and-bail-out-early +++ a/include/linux/cpuset.h @@ -34,6 +34,8 @@ */ extern struct static_key_false cpusets_pre_enable_key; extern struct static_key_false cpusets_enabled_key; +extern struct static_key_false cpusets_insane_config_key; + static inline bool cpusets_enabled(void) { return static_branch_unlikely(&cpusets_enabled_key); @@ -51,6 +53,19 @@ static inline void cpuset_dec(void) static_branch_dec_cpuslocked(&cpusets_pre_enable_key); } +/* + * This will get enabled whenever a cpuset configuration is considered + * unsupportable in general. E.g. movable only node which cannot satisfy + * any non movable allocations (see update_nodemask). Page allocator + * needs to make additional checks for those configurations and this + * check is meant to guard those checks without any overhead for sane + * configurations.
+ */ +static inline bool cpusets_insane_config(void) +{ + return static_branch_unlikely(&cpusets_insane_config_key); +} + extern int cpuset_init(void); extern void cpuset_init_smp(void); extern void cpuset_force_rebuild(void); @@ -167,6 +182,8 @@ static inline void set_mems_allowed(node static inline bool cpusets_enabled(void) { return false; } +static inline bool cpusets_insane_config(void) { return false; } + static inline int cpuset_init(void) { return 0; } static inline void cpuset_init_smp(void) {} --- a/include/linux/mmzone.h~mm-page_alloc-detect-allocation-forbidden-by-cpuset-and-bail-out-early +++ a/include/linux/mmzone.h @@ -1220,6 +1220,28 @@ static inline struct zoneref *first_zone #define for_each_zone_zonelist(zone, z, zlist, highidx) \ for_each_zone_zonelist_nodemask(zone, z, zlist, highidx, NULL) +/* Whether the 'nodes' are all movable nodes */ +static inline bool movable_only_nodes(nodemask_t *nodes) +{ + struct zonelist *zonelist; + struct zoneref *z; + int nid; + + if (nodes_empty(*nodes)) + return false; + + /* + * We can chose arbitrary node from the nodemask to get a + * zonelist as they are interlinked. We just need to find + * at least one zone that can satisfy kernel allocations. + */ + nid = first_node(*nodes); + zonelist = &NODE_DATA(nid)->node_zonelists[ZONELIST_FALLBACK]; + z = first_zones_zonelist(zonelist, ZONE_NORMAL, nodes); + return (!z->zone) ? true : false; +} + + #ifdef CONFIG_SPARSEMEM #include #endif --- a/kernel/cgroup/cpuset.c~mm-page_alloc-detect-allocation-forbidden-by-cpuset-and-bail-out-early +++ a/kernel/cgroup/cpuset.c @@ -69,6 +69,13 @@ DEFINE_STATIC_KEY_FALSE(cpusets_pre_enable_key); DEFINE_STATIC_KEY_FALSE(cpusets_enabled_key); +/* + * There could be abnormal cpuset configurations for cpu or memory + * node binding, add this key to provide a quick low-cost judgement + * of the situation. + */ +DEFINE_STATIC_KEY_FALSE(cpusets_insane_config_key); + /* See "Frequency meter" comments, below. 
*/ struct fmeter { @@ -372,6 +379,17 @@ static DECLARE_WORK(cpuset_hotplug_work, static DECLARE_WAIT_QUEUE_HEAD(cpuset_attach_wq); +static inline void check_insane_mems_config(nodemask_t *nodes) +{ + if (!cpusets_insane_config() && + movable_only_nodes(nodes)) { + static_branch_enable(&cpusets_insane_config_key); + pr_info("Unsupported (movable nodes only) cpuset configuration detected (nmask=%*pbl)!\n" + "Cpuset allocations might fail even with a lot of memory available.\n", + nodemask_pr_args(nodes)); + } +} + /* * Cgroup v2 behavior is used on the "cpus" and "mems" control files when * on default hierarchy or when the cpuset_v2_mode flag is set by mounting @@ -1870,6 +1888,8 @@ static int update_nodemask(struct cpuset if (retval < 0) goto done; + check_insane_mems_config(&trialcs->mems_allowed); + spin_lock_irq(&callback_lock); cs->mems_allowed = trialcs->mems_allowed; spin_unlock_irq(&callback_lock); @@ -3173,6 +3193,9 @@ update_tasks: cpus_updated = !cpumask_equal(&new_cpus, cs->effective_cpus); mems_updated = !nodes_equal(new_mems, cs->effective_mems); + if (mems_updated) + check_insane_mems_config(&new_mems); + if (is_in_v2_mode()) hotplug_update_tasks(cs, &new_cpus, &new_mems, cpus_updated, mems_updated); --- a/mm/page_alloc.c~mm-page_alloc-detect-allocation-forbidden-by-cpuset-and-bail-out-early +++ a/mm/page_alloc.c @@ -4910,6 +4910,19 @@ retry_cpuset: if (!ac->preferred_zoneref->zone) goto nopage; + /* + * Check for insane configurations where the cpuset doesn't contain + * any suitable zone to satisfy the request - e.g. non-movable + * GFP_HIGHUSER allocations from MOVABLE nodes only. 
+ */ + if (cpusets_insane_config() && (gfp_mask & __GFP_HARDWALL)) { + struct zoneref *z = first_zones_zonelist(ac->zonelist, + ac->highest_zoneidx, + &cpuset_current_mems_allowed); + if (!z->zone) + goto nopage; + } + if (alloc_flags & ALLOC_KSWAPD) wake_all_kswapds(order, gfp_mask, ac); From patchwork Fri Nov 5 20:40:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605591 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5171EC433F5 for ; Fri, 5 Nov 2021 20:40:40 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 05862611C0 for ; Fri, 5 Nov 2021 20:40:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 05862611C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 9637D94006E; Fri, 5 Nov 2021 16:40:39 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9194394006F; Fri, 5 Nov 2021 16:40:39 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6F26E94006E; Fri, 5 Nov 2021 16:40:39 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0106.hostedemail.com [216.40.44.106]) by kanga.kvack.org (Postfix) with ESMTP id 60A0E940049 for ; Fri, 5 Nov 2021 16:40:39 -0400 (EDT) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 2884455D92 for ; Fri, 5 Nov 2021 20:40:39 +0000 (UTC) X-FDA: 78776044998.12.3D992FC Received: from mail.kernel.org 
(mail.kernel.org [198.145.29.99]) by imf15.hostedemail.com (Postfix) with ESMTP id E4AFED0000A8 for ; Fri, 5 Nov 2021 20:40:27 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id ED23D6126A; Fri, 5 Nov 2021 20:40:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144838; bh=3FgqD949W3pnE6WwT33oOxGUVWU25dyD4e9aK8K98rk=; h=Date:From:To:Subject:In-Reply-To:From; b=sRL8ipK2akfBj2RPlmrdLBIsrAumU6LpgzzM/R6OgsaJLwlI9jSdtvg5Sw2UeMkns bp/PFltqb3jky0exsp0GecHKAnBNCCECEt/oiWdZ01qDTwkqdyBaU4KfnMsT7DzeOj BjqX/ruMZLGqZfuf3q/zpnw8UbpF2c3GJcmcH9fU= Date: Fri, 05 Nov 2021 13:40:37 -0700 From: Andrew Morton To: akpm@linux-foundation.org, liangcaifan19@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, zhang.lyra@gmail.com Subject: [patch 114/262] mm/page_alloc.c: show watermark_boost of zone in zoneinfo Message-ID: <20211105204037.aNvfZGhYc%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=sRL8ipK2; dmarc=none; spf=pass (imf15.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: E4AFED0000A8 X-Stat-Signature: u1wzmr1yuabfi67m97yaoanuzo51huo6 X-HE-Tag: 1636144827-278843 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Liangcai Fan Subject: mm/page_alloc.c: show watermark_boost of zone in zoneinfo min/low/high_wmark_pages(z) is defined as (z->_watermark[WMARK_MIN/LOW/HIGH] + z->watermark_boost). 
If kswapd is frequently woken up because min/low/high_wmark_pages increased, printing watermark_boost quickly shows whether watermark_boost or _watermark[WMARK_MIN/LOW/HIGH] caused the increase. Link: https://lkml.kernel.org/r/1632472566-12246-1-git-send-email-liangcaifan19@gmail.com Signed-off-by: Liangcai Fan Cc: Chunyan Zhang Signed-off-by: Andrew Morton --- mm/page_alloc.c | 2 ++ mm/vmstat.c | 2 ++ 2 files changed, 4 insertions(+) --- a/mm/page_alloc.c~mm-show-watermark_boost-of-zone-in-zoneinfo +++ a/mm/page_alloc.c @@ -5993,6 +5993,7 @@ void show_free_areas(unsigned int filter printk(KERN_CONT "%s" " free:%lukB" + " boost:%lukB" " min:%lukB" " low:%lukB" " high:%lukB" @@ -6013,6 +6014,7 @@ void show_free_areas(unsigned int filter "\n", zone->name, K(zone_page_state(zone, NR_FREE_PAGES)), + K(zone->watermark_boost), K(min_wmark_pages(zone)), K(low_wmark_pages(zone)), K(high_wmark_pages(zone)), --- a/mm/vmstat.c~mm-show-watermark_boost-of-zone-in-zoneinfo +++ a/mm/vmstat.c @@ -1656,6 +1656,7 @@ static void zoneinfo_show_print(struct s } seq_printf(m, "\n pages free %lu" + "\n boost %lu" "\n min %lu" "\n low %lu" "\n high %lu" @@ -1664,6 +1665,7 @@ static void zoneinfo_show_print(struct s "\n managed %lu" "\n cma %lu", zone_page_state(zone, NR_FREE_PAGES), + zone->watermark_boost, min_wmark_pages(zone), low_wmark_pages(zone), high_wmark_pages(zone), From patchwork Fri Nov 5 20:40:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605593 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 84199C433FE for ; Fri, 5 Nov 2021 20:40:43 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id
29EE56126A for ; Fri, 5 Nov 2021 20:40:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 29EE56126A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id BFBEF94006F; Fri, 5 Nov 2021 16:40:42 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id BAF06940049; Fri, 5 Nov 2021 16:40:42 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A9C4594006F; Fri, 5 Nov 2021 16:40:42 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0157.hostedemail.com [216.40.44.157]) by kanga.kvack.org (Postfix) with ESMTP id 9AE36940049 for ; Fri, 5 Nov 2021 16:40:42 -0400 (EDT) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 60D461856B121 for ; Fri, 5 Nov 2021 20:40:42 +0000 (UTC) X-FDA: 78776045124.26.AC2B96C Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf27.hostedemail.com (Postfix) with ESMTP id 0690F70000B0 for ; Fri, 5 Nov 2021 20:40:41 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id DF383611C0; Fri, 5 Nov 2021 20:40:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144841; bh=f0pmK+QffkqWkpYdf0i/C/k78iRf3FkcX2CSByEiaMk=; h=Date:From:To:Subject:In-Reply-To:From; b=N52eVlPPc1wg3mSPA4w8V8K4QX1JuNyvlWmyK2ySGjw5Qd8XjAf15uqeiRC+ENimm sVDkyK9/F7iAf5SmpUK1Xro5uxqqftn5SsZeasUpDow0Mj8h6EmED/JRc/nxAUGvJG Svr3En7Q9E19ySz8mq67xKNPngk/18+PWTEpuQmc= Date: Fri, 05 Nov 2021 13:40:40 -0700 From: Andrew Morton To: akpm@linux-foundation.org, benh@kernel.crashing.org, christophe.leroy@csgroup.eu, gerald.schaefer@linux.ibm.com, hca@linux.ibm.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, paulus@ozlabs.org, 
torvalds@linux-foundation.org, wangkefeng.wang@huawei.com Subject: [patch 115/262] mm: create a new system state and fix core_kernel_text() Message-ID: <20211105204040.TmGWbPVoa%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=N52eVlPP; dmarc=none; spf=pass (imf27.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 0690F70000B0 X-Stat-Signature: 1nxj14md5ypwn5fi3r3fiqaruczg9kbr X-HE-Tag: 1636144841-917601 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Christophe Leroy Subject: mm: create a new system state and fix core_kernel_text()

core_kernel_text() considers init memory valid until system_state is at least SYSTEM_RUNNING. But init memory is freed a few lines before SYSTEM_RUNNING is set, so there is a small period of time when core_kernel_text() is wrong.

Create an intermediate system state called SYSTEM_FREEING_INITMEM that is set before init memory starts being freed, and use it in core_kernel_text() to report init memory as invalid earlier.
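The window can be illustrated with a small stand-alone model. This is only a sketch: the MODEL_* enum mirrors the ordering of the states named in the diff, and init_text_valid_old() / init_text_valid_new() are hypothetical helpers, not kernel functions.

```c
#include <stdbool.h>

/* Same relative ordering as the states named in the patch. */
enum system_states_model {
	MODEL_BOOTING,
	MODEL_SCHEDULING,
	MODEL_FREEING_INITMEM,
	MODEL_RUNNING,
};

/* Old rule: init text counts as valid until RUNNING is reached. */
static bool init_text_valid_old(enum system_states_model state)
{
	return state < MODEL_RUNNING;
}

/* New rule: init text stops being valid once freeing has started. */
static bool init_text_valid_new(enum system_states_model state)
{
	return state < MODEL_FREEING_INITMEM;
}
```

The two rules disagree only while init memory is being freed, which is exactly the period the patch is meant to cover.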
Link: https://lkml.kernel.org/r/9ecfdee7dd4d741d172cb93ff1d87f1c58127c9a.1633001016.git.christophe.leroy@csgroup.eu Signed-off-by: Christophe Leroy Cc: Gerald Schaefer Cc: Kefeng Wang Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Heiko Carstens Signed-off-by: Andrew Morton --- include/linux/kernel.h | 1 + init/main.c | 2 ++ kernel/extable.c | 2 +- 3 files changed, 4 insertions(+), 1 deletion(-) --- a/include/linux/kernel.h~mm-create-a-new-system-state-and-fix-core_kernel_text +++ a/include/linux/kernel.h @@ -248,6 +248,7 @@ extern bool early_boot_irqs_disabled; extern enum system_states { SYSTEM_BOOTING, SYSTEM_SCHEDULING, + SYSTEM_FREEING_INITMEM, SYSTEM_RUNNING, SYSTEM_HALT, SYSTEM_POWER_OFF, --- a/init/main.c~mm-create-a-new-system-state-and-fix-core_kernel_text +++ a/init/main.c @@ -1506,6 +1506,8 @@ static int __ref kernel_init(void *unuse kernel_init_freeable(); /* need to finish all async __init code before freeing the memory */ async_synchronize_full(); + + system_state = SYSTEM_FREEING_INITMEM; kprobe_free_init_mem(); ftrace_free_init_mem(); kgdb_free_init_mem(); --- a/kernel/extable.c~mm-create-a-new-system-state-and-fix-core_kernel_text +++ a/kernel/extable.c @@ -76,7 +76,7 @@ int notrace core_kernel_text(unsigned lo addr < (unsigned long)_etext) return 1; - if (system_state < SYSTEM_RUNNING && + if (system_state < SYSTEM_FREEING_INITMEM && init_kernel_text(addr)) return 1; return 0; From patchwork Fri Nov 5 20:40:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605595 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 78C80C433EF for ; Fri, 5 Nov 2021 20:40:46 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with 
ESMTP id 2E73861262 for ; Fri, 5 Nov 2021 20:40:46 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 2E73861262 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id BFD89940070; Fri, 5 Nov 2021 16:40:45 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id BACA9940049; Fri, 5 Nov 2021 16:40:45 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AC343940070; Fri, 5 Nov 2021 16:40:45 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0020.hostedemail.com [216.40.44.20]) by kanga.kvack.org (Postfix) with ESMTP id 99E80940049 for ; Fri, 5 Nov 2021 16:40:45 -0400 (EDT) Received: from smtpin25.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 5B20A1856A9A5 for ; Fri, 5 Nov 2021 20:40:45 +0000 (UTC) X-FDA: 78776045250.25.C7FFCB8 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf18.hostedemail.com (Postfix) with ESMTP id EAED44002099 for ; Fri, 5 Nov 2021 20:40:44 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id ECE306127B; Fri, 5 Nov 2021 20:40:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144844; bh=uqY2/gVbPyS5qLzegr4HW5humpqDu9VdY0HZpdQttqk=; h=Date:From:To:Subject:In-Reply-To:From; b=R9vy29+ykrMm9N8JHadqqy7zOXn7kQZHSvtIIvOyhO7GLQzRfd69a9ZND5+PBFF5a Q1gg92urT/Di9nC70Rdglh0W/+H74wYxXc3rlhzWz/DmIgV3WgmVuKQ3wRYiMwRjWh IZjRFWJOs2MBqq3JBhjlblu3UK3wWBHj2zcduAnw= Date: Fri, 05 Nov 2021 13:40:43 -0700 From: Andrew Morton To: akpm@linux-foundation.org, benh@kernel.crashing.org, christophe.leroy@csgroup.eu, gerald.schaefer@linux.ibm.com, hca@linux.ibm.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, paulus@ozlabs.org, 
torvalds@linux-foundation.org, wangkefeng.wang@huawei.com Subject: [patch 116/262] mm: make generic arch_is_kernel_initmem_freed() do what it says Message-ID: <20211105204043.z5Hpw2DQg%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: EAED44002099 X-Stat-Signature: 68ejjtx1uhizsokonzqzy8ajypuu44zj Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=R9vy29+y; dmarc=none; spf=pass (imf18.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144844-128593 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Christophe Leroy Subject: mm: make generic arch_is_kernel_initmem_freed() do what it says

Commit 7a5da02de8d6 ("locking/lockdep: check for freed initmem in static_obj()") added arch_is_kernel_initmem_freed(), which is supposed to report whether an object is part of already freed init memory.

For the time being, the generic version of arch_is_kernel_initmem_freed() always reports 'false', although free_initmem() is generically called on all architectures.

Therefore, change the generic version of arch_is_kernel_initmem_freed() to check whether free_initmem() has been called. If so, then check if a given address falls into init memory.
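The resulting logic, sketched in user space under stated assumptions: struct init_region and initmem_freed_model() are hypothetical names, the region is treated as half-open [begin, end), and the state enum only mirrors the ordering used by the series.

```c
#include <stdbool.h>
#include <stdint.h>

enum state_model { ST_BOOTING, ST_SCHEDULING, ST_FREEING_INITMEM, ST_RUNNING };

/* Hypothetical stand-in for the init section bounds: [begin, end). */
struct init_region {
	uintptr_t begin;
	uintptr_t end;
};

/*
 * Before freeing starts nothing can be "freed initmem". Afterwards,
 * any address inside the init region is reported as freed.
 */
static bool initmem_freed_model(enum state_model state,
				const struct init_region *init,
				uintptr_t addr)
{
	if (state < ST_FREEING_INITMEM)
		return false;

	return addr >= init->begin && addr < init->end;
}
```

The state test comes first so that callers running early in boot never pay for, or get confused by, the range check.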
To ease the use of system_state, move it out of line into its only caller which is lockdep.c Link: https://lkml.kernel.org/r/1d40783e676e07858be97d881f449ee7ea8adfb1.1633001016.git.christophe.leroy@csgroup.eu Signed-off-by: Christophe Leroy Cc: Gerald Schaefer Cc: Kefeng Wang Cc: Benjamin Herrenschmidt Cc: Heiko Carstens Cc: Paul Mackerras Signed-off-by: Andrew Morton --- include/asm-generic/sections.h | 14 -------------- kernel/locking/lockdep.c | 15 +++++++++++++++ 2 files changed, 15 insertions(+), 14 deletions(-) --- a/include/asm-generic/sections.h~mm-make-generic-arch_is_kernel_initmem_freed-do-what-it-says +++ a/include/asm-generic/sections.h @@ -80,20 +80,6 @@ static inline int arch_is_kernel_data(un } #endif -/* - * Check if an address is part of freed initmem. This is needed on architectures - * with virt == phys kernel mapping, for code that wants to check if an address - * is part of a static object within [_stext, _end]. After initmem is freed, - * memory can be allocated from it, and such allocations would then have - * addresses within the range [_stext, _end]. - */ -#ifndef arch_is_kernel_initmem_freed -static inline int arch_is_kernel_initmem_freed(unsigned long addr) -{ - return 0; -} -#endif - /** * memory_contains - checks if an object is contained within a memory region * @begin: virtual address of the beginning of the memory region --- a/kernel/locking/lockdep.c~mm-make-generic-arch_is_kernel_initmem_freed-do-what-it-says +++ a/kernel/locking/lockdep.c @@ -788,6 +788,21 @@ static int very_verbose(struct lock_clas * Is this the address of a static object: */ #ifdef __KERNEL__ +/* + * Check if an address is part of freed initmem. After initmem is freed, + * memory can be allocated from it, and such allocations would then have + * addresses within the range [_stext, _end]. 
+ */ +#ifndef arch_is_kernel_initmem_freed +static int arch_is_kernel_initmem_freed(unsigned long addr) +{ + if (system_state < SYSTEM_FREEING_INITMEM) + return 0; + + return init_section_contains((void *)addr, 1); +} +#endif + static int static_obj(const void *obj) { unsigned long start = (unsigned long) &_stext, From patchwork Fri Nov 5 20:40:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605597 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8BF11C433FE for ; Fri, 5 Nov 2021 20:40:49 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 44B7F6126A for ; Fri, 5 Nov 2021 20:40:49 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 44B7F6126A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id D0DBF940071; Fri, 5 Nov 2021 16:40:48 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id CBDE2940049; Fri, 5 Nov 2021 16:40:48 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BAC4A940071; Fri, 5 Nov 2021 16:40:48 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0067.hostedemail.com [216.40.44.67]) by kanga.kvack.org (Postfix) with ESMTP id A6E07940049 for ; Fri, 5 Nov 2021 16:40:48 -0400 (EDT) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 7318470016 for ; Fri, 5 Nov 2021 20:40:48 +0000 (UTC) X-FDA: 78776045376.06.B1C4736 Received: 
from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf09.hostedemail.com (Postfix) with ESMTP id 047AE3000117 for ; Fri, 5 Nov 2021 20:40:47 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id E63FC61262; Fri, 5 Nov 2021 20:40:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144847; bh=xVtEAAcaucpAD7AVdd2plSCAMlOUcuPZqZOvzfzdmoE=; h=Date:From:To:Subject:In-Reply-To:From; b=fxdccbG6+eK0gIwoT8Lso4FQL8vu2KIooCwOqc2egO7O0jpMQKrpj8+RCS3dVnCCZ DzvaFVVcfm8j+8z08vGvzwVarsFAMWtvNdr8Vr6FsMtNu2RI6IxFSgxubgdTHbAVAT GawHk+gyBlyvSXwIC9UxRHrg3XZ1wjQu4YwlMN0c= Date: Fri, 05 Nov 2021 13:40:46 -0700 From: Andrew Morton To: akpm@linux-foundation.org, benh@kernel.crashing.org, christophe.leroy@csgroup.eu, gerald.schaefer@linux.ibm.com, hca@linux.ibm.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, paulus@ozlabs.org, torvalds@linux-foundation.org, wangkefeng.wang@huawei.com Subject: [patch 117/262] powerpc: use generic version of arch_is_kernel_initmem_freed() Message-ID: <20211105204046.cC6SA8eua%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=fxdccbG6; dmarc=none; spf=pass (imf09.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 047AE3000117 X-Stat-Signature: 33zod94ism1mns6jcqg3aywbw1k5azju X-HE-Tag: 1636144847-202642 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Christophe Leroy Subject: powerpc: use generic version of arch_is_kernel_initmem_freed() Generic version of arch_is_kernel_initmem_freed() now does the same as powerpc version. 
Remove the powerpc version. Link: https://lkml.kernel.org/r/c53764eb45d41491e2b21da2e7812239897dbebb.1633001016.git.christophe.leroy@csgroup.eu Signed-off-by: Christophe Leroy Cc: Kefeng Wang Cc: Benjamin Herrenschmidt Cc: Gerald Schaefer Cc: Heiko Carstens Cc: Paul Mackerras Signed-off-by: Andrew Morton --- arch/powerpc/include/asm/sections.h | 13 ------------- 1 file changed, 13 deletions(-) --- a/arch/powerpc/include/asm/sections.h~powerpc-use-generic-version-of-arch_is_kernel_initmem_freed +++ a/arch/powerpc/include/asm/sections.h @@ -6,21 +6,8 @@ #include #include -#define arch_is_kernel_initmem_freed arch_is_kernel_initmem_freed - #include -extern bool init_mem_is_free; - -static inline int arch_is_kernel_initmem_freed(unsigned long addr) -{ - if (!init_mem_is_free) - return 0; - - return addr >= (unsigned long)__init_begin && - addr < (unsigned long)__init_end; -} - extern char __head_end[]; #ifdef __powerpc64__ From patchwork Fri Nov 5 20:40:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605599 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 60CD9C433EF for ; Fri, 5 Nov 2021 20:40:52 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 15ED4611C0 for ; Fri, 5 Nov 2021 20:40:52 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 15ED4611C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id A50D1940072; Fri, 5 Nov 2021 16:40:51 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9FFE9940049; Fri, 5 Nov 2021 
16:40:51 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8C7F7940072; Fri, 5 Nov 2021 16:40:51 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0166.hostedemail.com [216.40.44.166]) by kanga.kvack.org (Postfix) with ESMTP id 770D6940049 for ; Fri, 5 Nov 2021 16:40:51 -0400 (EDT) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 40708779B1 for ; Fri, 5 Nov 2021 20:40:51 +0000 (UTC) X-FDA: 78776045502.05.0DE8FFA Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf05.hostedemail.com (Postfix) with ESMTP id 75B4C508FA48 for ; Fri, 5 Nov 2021 20:40:33 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id F226E6126A; Fri, 5 Nov 2021 20:40:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144850; bh=qx+wnyrQEAhzRmfure/hImz/Fd4gS4JblO0Wae+xy04=; h=Date:From:To:Subject:In-Reply-To:From; b=kFaEeiFtkBoL0sevdIwDsEVOzAcbsQkIZmpVIw1o57F0ocSYMIXgvkD/Wy9BJYM4H Hc3wP7yuocf2smtQooKMHOSdLM27/MruO5XdCoQvQuwz+sAQVUI+zHMwPAV3fJ502J sRVz+ttI3RPYc4Np53Xa+edASB/6v8goieFWxKus= Date: Fri, 05 Nov 2021 13:40:49 -0700 From: Andrew Morton To: akpm@linux-foundation.org, benh@kernel.crashing.org, christophe.leroy@csgroup.eu, gerald.schaefer@linux.ibm.com, hca@linux.ibm.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, paulus@ozlabs.org, torvalds@linux-foundation.org, wangkefeng.wang@huawei.com Subject: [patch 118/262] s390: use generic version of arch_is_kernel_initmem_freed() Message-ID: <20211105204049.JWzd6dfQk%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 75B4C508FA48 X-Stat-Signature: 5rajof6zhicpzdmzwermoewbndjewuwz Authentication-Results: imf05.hostedemail.com; dkim=pass 
header.d=linux-foundation.org header.s=korg header.b=kFaEeiFt; dmarc=none; spf=pass (imf05.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144833-476330 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Christophe Leroy Subject: s390: use generic version of arch_is_kernel_initmem_freed() Generic version of arch_is_kernel_initmem_freed() now does the same as s390 version. Remove the s390 version. Link: https://lkml.kernel.org/r/b6feb5dfe611a322de482762fc2df3a9eece70c7.1633001016.git.christophe.leroy@csgroup.eu Signed-off-by: Christophe Leroy Acked-by: Heiko Carstens Cc: Gerald Schaefer Cc: Kefeng Wang Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Signed-off-by: Andrew Morton --- arch/s390/include/asm/sections.h | 12 ------------ arch/s390/mm/init.c | 3 --- 2 files changed, 15 deletions(-) --- a/arch/s390/include/asm/sections.h~s390-use-generic-version-of-arch_is_kernel_initmem_freed +++ a/arch/s390/include/asm/sections.h @@ -2,20 +2,8 @@ #ifndef _S390_SECTIONS_H #define _S390_SECTIONS_H -#define arch_is_kernel_initmem_freed arch_is_kernel_initmem_freed - #include -extern bool initmem_freed; - -static inline int arch_is_kernel_initmem_freed(unsigned long addr) -{ - if (!initmem_freed) - return 0; - return addr >= (unsigned long)__init_begin && - addr < (unsigned long)__init_end; -} - /* * .boot.data section contains variables "shared" between the decompressor and * the decompressed kernel. 
The decompressor will store values in them, and --- a/arch/s390/mm/init.c~s390-use-generic-version-of-arch_is_kernel_initmem_freed +++ a/arch/s390/mm/init.c @@ -58,8 +58,6 @@ unsigned long empty_zero_page, zero_page EXPORT_SYMBOL(empty_zero_page); EXPORT_SYMBOL(zero_page_mask); -bool initmem_freed; - static void __init setup_zero_pages(void) { unsigned int order; @@ -214,7 +212,6 @@ void __init mem_init(void) void free_initmem(void) { - initmem_freed = true; __set_memory((unsigned long)_sinittext, (unsigned long)(_einittext - _sinittext) >> PAGE_SHIFT, SET_MEMORY_RW | SET_MEMORY_NX); From patchwork Fri Nov 5 20:40:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605601 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4A26DC433EF for ; Fri, 5 Nov 2021 20:40:55 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 084B061279 for ; Fri, 5 Nov 2021 20:40:55 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 084B061279 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 9B36D940073; Fri, 5 Nov 2021 16:40:54 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9620D940049; Fri, 5 Nov 2021 16:40:54 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 85136940073; Fri, 5 Nov 2021 16:40:54 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0088.hostedemail.com [216.40.44.88]) by kanga.kvack.org (Postfix) with ESMTP id 
6FB15940049 for ; Fri, 5 Nov 2021 16:40:54 -0400 (EDT) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 3AF3655D92 for ; Fri, 5 Nov 2021 20:40:54 +0000 (UTC) X-FDA: 78776045628.18.6EBAC23 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf11.hostedemail.com (Postfix) with ESMTP id E1118F0000B0 for ; Fri, 5 Nov 2021 20:40:53 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id EDC1661262; Fri, 5 Nov 2021 20:40:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144853; bh=kRVAicR0RFgPFMeoHCQB41ZDmeTS5LoBsuX8BL9G2Hc=; h=Date:From:To:Subject:In-Reply-To:From; b=B4PjbjfbxO/+w4CIcqtNASyrqv8uh3sRp60VN+FZ8ZjPYJdzkIp2Tuo3+skNO4jiI EudpciNS5BHW6mKYFgiKI5vonNF7j+YnltqPQEIAof73hr2tpDtzj29P8Fik5eb3OM ybvNE80xKW3tdw5x5dg49PzRNQrMbUPek6xKaoME= Date: Fri, 05 Nov 2021 13:40:52 -0700 From: Andrew Morton To: akpm@linux-foundation.org, bigeasy@linutronix.de, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterz@infradead.org, tglx@linutronix.de, torvalds@linux-foundation.org Subject: [patch 119/262] mm: page_alloc: use migrate_disable() in drain_local_pages_wq() Message-ID: <20211105204052.cwfJA8ksK%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=B4Pjbjfb; dmarc=none; spf=pass (imf11.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: E1118F0000B0 X-Stat-Signature: w7h1sgx6za9f19z3u16ks9ehjcawamyb X-HE-Tag: 1636144853-622513 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: 
Sebastian Andrzej Siewior
Subject: mm: page_alloc: use migrate_disable() in drain_local_pages_wq()

drain_local_pages_wq() disables preemption to avoid CPU migration during CPU hotplug and can't use cpus_read_lock().

Using migrate_disable() works here, too. The scheduler won't take the CPU offline until the task has left the migrate-disable section. The problem with disabled preemption here is that drain_local_pages() acquires locks which are turned into sleeping locks on PREEMPT_RT and can't be acquired with preemption disabled.

Use migrate_disable() in drain_local_pages_wq().

Link: https://lkml.kernel.org/r/20211015210933.viw6rjvo64qtqxn4@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior
Cc: Thomas Gleixner
Cc: Peter Zijlstra
Signed-off-by: Andrew Morton
---

 mm/page_alloc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/page_alloc.c~mm-page_alloc-use-migrate_disable-in-drain_local_pages_wq
+++ a/mm/page_alloc.c
@@ -3141,9 +3141,9 @@ static void drain_local_pages_wq(struct * cpu which is alright but we also have to make sure to not move to * a different one.
*/ - preempt_disable(); + migrate_disable(); drain_local_pages(drain->zone); - preempt_enable(); + migrate_enable(); } /* From patchwork Fri Nov 5 20:40:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605603 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4ABD7C433FE for ; Fri, 5 Nov 2021 20:40:58 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 032C5611C0 for ; Fri, 5 Nov 2021 20:40:57 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 032C5611C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 91601940074; Fri, 5 Nov 2021 16:40:57 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 8C5FC940049; Fri, 5 Nov 2021 16:40:57 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7B50E940074; Fri, 5 Nov 2021 16:40:57 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0189.hostedemail.com [216.40.44.189]) by kanga.kvack.org (Postfix) with ESMTP id 6A377940049 for ; Fri, 5 Nov 2021 16:40:57 -0400 (EDT) Received: from smtpin01.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 34E6218494E94 for ; Fri, 5 Nov 2021 20:40:57 +0000 (UTC) X-FDA: 78776045796.01.A0FF9FB Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf15.hostedemail.com (Postfix) with ESMTP id E5577D0000AC for ; Fri, 5 Nov 2021 20:40:45 +0000 (UTC) Received: by mail.kernel.org 
(Postfix) with ESMTPSA id E905E61262; Fri, 5 Nov 2021 20:40:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144856; bh=NOCP50HYKMSoBePJ/HR1LDAyXYJlPTqLctXom3vTXos=; h=Date:From:To:Subject:In-Reply-To:From; b=Qqr7byP2j+c7GM/LyIh4imPabe34S8W3py0u1RlTX8IeHjhXhJojV3ZXkfbn88T1R nmjx+enlzFMNgNMhs+eeL3k/fYLmCqMmkUOz8qn8tdZ1RiJylJfN9SHUhnqZwhNFzE XZ/qZ1KyFn3UbOb7qH8VRGHuKSTBLte+jCAAtU28= Date: Fri, 05 Nov 2021 13:40:55 -0700 From: Andrew Morton To: akpm@linux-foundation.org, bobo.shaobowang@huawei.com, david@redhat.com, huawei.libin@huawei.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, weiyongjun1@huawei.com Subject: [patch 120/262] mm/page_alloc: use clamp() to simplify code Message-ID: <20211105204055.6P3hBKyWJ%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Qqr7byP2; dmarc=none; spf=pass (imf15.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: E5577D0000AC X-Stat-Signature: 39if89uryth8s43co1ykzoy34afch8k4 X-HE-Tag: 1636144845-131535 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Wang ShaoBo Subject: mm/page_alloc: use clamp() to simplify code This patch uses clamp() to simplify code in init_per_zone_wmark_min(). 
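For readers following along outside the kernel tree, the equivalence this cleanup relies on can be sketched in plain userspace C. This is only an illustration under stated assumptions: clamp_int(), bound_open_coded(), bound_with_clamp() and check_equivalence() are hypothetical names introduced here, and clamp_int() is a simple stand-in for the kernel's type-checked clamp() macro, not its actual definition. The bounds 128 and 262144 are the ones that appear in the patch.

```c
#include <assert.h>

/* Hypothetical userspace stand-in for the kernel's clamp() macro. */
static int clamp_int(int val, int lo, int hi)
{
	if (val < lo)
		return lo;
	if (val > hi)
		return hi;
	return val;
}

/* Open-coded bounds check, mirroring the shape of the removed lines. */
static int bound_open_coded(int val)
{
	int out = val;

	if (out < 128)
		out = 128;
	if (out > 262144)
		out = 262144;
	return out;
}

/* The same computation expressed as a single clamp call. */
static int bound_with_clamp(int val)
{
	return clamp_int(val, 128, 262144);
}

/* Compare both forms on values below, inside and above the range. */
static void check_equivalence(void)
{
	static const int samples[] = { 0, 127, 128, 4096, 262144, 262145 };
	unsigned int i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		assert(bound_open_coded(samples[i]) ==
		       bound_with_clamp(samples[i]));
}
```

Calling check_equivalence() from a small test program should pass silently if the two forms agree on the sampled values. The sketch covers only the bounding step; the comparison against user_min_free_kbytes and the pr_warn() branch are left out.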
Link: https://lkml.kernel.org/r/20211021034830.1049150-1-bobo.shaobowang@huawei.com Signed-off-by: Wang ShaoBo Reviewed-by: David Hildenbrand Cc: Wei Yongjun Cc: Li Bin Signed-off-by: Andrew Morton --- mm/page_alloc.c | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-) --- a/mm/page_alloc.c~mm-page_alloc-use-clamp-to-simplify-code +++ a/mm/page_alloc.c @@ -8477,16 +8477,12 @@ int __meminit init_per_zone_wmark_min(vo lowmem_kbytes = nr_free_buffer_pages() * (PAGE_SIZE >> 10); new_min_free_kbytes = int_sqrt(lowmem_kbytes * 16); - if (new_min_free_kbytes > user_min_free_kbytes) { - min_free_kbytes = new_min_free_kbytes; - if (min_free_kbytes < 128) - min_free_kbytes = 128; - if (min_free_kbytes > 262144) - min_free_kbytes = 262144; - } else { + if (new_min_free_kbytes > user_min_free_kbytes) + min_free_kbytes = clamp(new_min_free_kbytes, 128, 262144); + else pr_warn("min_free_kbytes is not updated to %d because user defined value %d is preferred\n", new_min_free_kbytes, user_min_free_kbytes); - } + setup_per_zone_wmarks(); refresh_zone_stat_thresholds(); setup_per_zone_lowmem_reserve(); From patchwork Fri Nov 5 20:40:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605605 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6B0D8C433EF for ; Fri, 5 Nov 2021 20:41:01 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 228686126A for ; Fri, 5 Nov 2021 20:41:01 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 228686126A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org 
Received: by kanga.kvack.org (Postfix) id AEA42940075; Fri, 5 Nov 2021 16:41:00 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A9A12940049; Fri, 5 Nov 2021 16:41:00 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9D7DF940075; Fri, 5 Nov 2021 16:41:00 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0007.hostedemail.com [216.40.44.7]) by kanga.kvack.org (Postfix) with ESMTP id 8A110940049 for ; Fri, 5 Nov 2021 16:41:00 -0400 (EDT) Received: from smtpin02.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 58C821813C166 for ; Fri, 5 Nov 2021 20:41:00 +0000 (UTC) X-FDA: 78776045880.02.0EE35C4 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf09.hostedemail.com (Postfix) with ESMTP id F22FD300010D for ; Fri, 5 Nov 2021 20:40:59 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id E3685611C0; Fri, 5 Nov 2021 20:40:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144859; bh=FJ117JC7W34jlEpL/A9WwXWOZvE/GIu5k4dPVvRpWhg=; h=Date:From:To:Subject:In-Reply-To:From; b=LqRfLvajSIolyrWcrw1gj03FmGbXqEKxHp8HVRzSc1zAMiq18vwCxMzwIH4Q/q0eM GnnElNvUXF3vUAx2oegpiDCHknrBsaXCSFTN6rviddYuZcg7q1lL4unRHiTj4od0/8 yMTxBDcIUnvE4CPIriYSbVuuEYo74FUWs6DGVE+M= Date: Fri, 05 Nov 2021 13:40:58 -0700 From: Andrew Morton To: akpm@linux-foundation.org, elver@google.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, n-horiguchi@ah.jp.nec.com, oliver.sang@intel.com, torvalds@linux-foundation.org, will@kernel.org Subject: [patch 121/262] mm: fix data race in PagePoisoned() Message-ID: <20211105204058.ifx3dDphm%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 
F22FD300010D X-Stat-Signature: unnnuzfid9rjnpk4eky3cda1sig63wgs Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=LqRfLvaj; spf=pass (imf09.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144859-699468 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Marco Elver Subject: mm: fix data race in PagePoisoned() PagePoisoned() accesses page->flags which can be updated concurrently: | BUG: KCSAN: data-race in next_uptodate_page / unlock_page | | write (marked) to 0xffffea00050f37c0 of 8 bytes by task 1872 on cpu 1: | instrument_atomic_write include/linux/instrumented.h:87 [inline] | clear_bit_unlock_is_negative_byte include/asm-generic/bitops/instrumented-lock.h:74 [inline] | unlock_page+0x102/0x1b0 mm/filemap.c:1465 | filemap_map_pages+0x6c6/0x890 mm/filemap.c:3057 | ... | read to 0xffffea00050f37c0 of 8 bytes by task 1873 on cpu 0: | PagePoisoned include/linux/page-flags.h:204 [inline] | PageReadahead include/linux/page-flags.h:382 [inline] | next_uptodate_page+0x456/0x830 mm/filemap.c:2975 | ... | CPU: 0 PID: 1873 Comm: systemd-udevd Not tainted 5.11.0-rc4-00001-gf9ce0be71d1f #1 To avoid the compiler tearing or otherwise optimizing the access, use READ_ONCE() to access flags. Link: https://lore.kernel.org/all/20210826144157.GA26950@xsang-OptiPlex-9020/ Link: https://lkml.kernel.org/r/20210913113542.2658064-1-elver@google.com Reported-by: kernel test robot Signed-off-by: Marco Elver Acked-by: Kirill A. 
Shutemov Acked-by: Will Deacon Cc: Marco Elver Cc: Naoya Horiguchi Signed-off-by: Andrew Morton --- include/linux/page-flags.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/include/linux/page-flags.h~mm-fix-data-race-in-pagepoisoned +++ a/include/linux/page-flags.h @@ -215,7 +215,7 @@ static __always_inline int PageCompound( #define PAGE_POISON_PATTERN -1l static inline int PagePoisoned(const struct page *page) { - return page->flags == PAGE_POISON_PATTERN; + return READ_ONCE(page->flags) == PAGE_POISON_PATTERN; } #ifdef CONFIG_DEBUG_VM From patchwork Fri Nov 5 20:41:01 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605607 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 179ACC433FE for ; Fri, 5 Nov 2021 20:41:04 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D99F36127B for ; Fri, 5 Nov 2021 20:41:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org D99F36127B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 8042B940076; Fri, 5 Nov 2021 16:41:03 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 7B488940049; Fri, 5 Nov 2021 16:41:03 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6C9BA940076; Fri, 5 Nov 2021 16:41:03 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0023.hostedemail.com [216.40.44.23]) by kanga.kvack.org (Postfix) with ESMTP id 5898A940049 for ; Fri, 5 Nov 2021 16:41:03 
-0400 (EDT) Received: from smtpin19.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 234888249980 for ; Fri, 5 Nov 2021 20:41:03 +0000 (UTC) X-FDA: 78776046006.19.EAFB6F2 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf27.hostedemail.com (Postfix) with ESMTP id BCE3D70000B4 for ; Fri, 5 Nov 2021 20:41:02 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id DE1F86126A; Fri, 5 Nov 2021 20:41:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144862; bh=jOuwn8Guu8VA46qSiznSZUNIlcH+vRKwNXz6m3zXcJw=; h=Date:From:To:Subject:In-Reply-To:From; b=Vvk+XD7m9HKDmD0xTclQgKcVv8ETRrKeaW6DWXXyDdTyMlxCxsZiphnvxQc/9Acza iQ7sE/a4LmuBdEAsProEjcF9jp2WqgV7+hDu0CEyXhp5Q2UhEOmrgDdH88mgFtPmHQ YiDUd8xBjWvPcwe3AvF+yGFp56GB/ITEmnitkOKY= Date: Fri, 05 Nov 2021 13:41:01 -0700 From: Andrew Morton To: akpm@linux-foundation.org, anshuman.khandual@arm.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, rikard.falkeborn@gmail.com, torvalds@linux-foundation.org Subject: [patch 122/262] mm/memory_failure: constify static mm_walk_ops Message-ID: <20211105204101.8OnxznkNy%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: BCE3D70000B4 X-Stat-Signature: aodpwaj3169ra5nmcobr531weumspd7q Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Vvk+XD7m; spf=pass (imf27.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144862-609674 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Rikard Falkeborn Subject: mm/memory_failure: 
constify static mm_walk_ops The only usage of hwp_walk_ops is to pass its address to walk_page_range() which takes a pointer to const mm_walk_ops as argument. Make it const to allow the compiler to put it in read-only memory. Link: https://lkml.kernel.org/r/20211014075042.17174-3-rikard.falkeborn@gmail.com Signed-off-by: Rikard Falkeborn Acked-by: Naoya Horiguchi Reviewed-by: Anshuman Khandual Signed-off-by: Andrew Morton --- mm/memory-failure.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/memory-failure.c~mm-memory_failure-constify-static-mm_walk_ops +++ a/mm/memory-failure.c @@ -674,7 +674,7 @@ static int hwpoison_hugetlb_range(pte_t #define hwpoison_hugetlb_range NULL #endif -static struct mm_walk_ops hwp_walk_ops = { +static const struct mm_walk_ops hwp_walk_ops = { .pmd_entry = hwpoison_pte_range, .hugetlb_entry = hwpoison_hugetlb_range, }; From patchwork Fri Nov 5 20:41:04 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605609 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id ECC16C433F5 for ; Fri, 5 Nov 2021 20:41:08 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A1678611C0 for ; Fri, 5 Nov 2021 20:41:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org A1678611C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 40C7D940077; Fri, 5 Nov 2021 16:41:08 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 3BBA6940049; Fri, 5 Nov 2021 16:41:08 -0400 (EDT) X-Delivered-To: 
int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2ABA7940077; Fri, 5 Nov 2021 16:41:08 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0063.hostedemail.com [216.40.44.63]) by kanga.kvack.org (Postfix) with ESMTP id 1A3C9940049 for ; Fri, 5 Nov 2021 16:41:08 -0400 (EDT) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id CEBDC82499A8 for ; Fri, 5 Nov 2021 20:41:07 +0000 (UTC) X-FDA: 78776046174.09.79CE71C Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf29.hostedemail.com (Postfix) with ESMTP id 324609000249 for ; Fri, 5 Nov 2021 20:41:06 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 13E1E61262; Fri, 5 Nov 2021 20:41:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144865; bh=JqSLIBsH1Cz2U9q66HxhbRVhkxNwgyS7ZboI8qEI7mA=; h=Date:From:To:Subject:In-Reply-To:From; b=tg8QQgqn9FhxYam80qz2rmcJY7SQ8VNnmUi8Y2gMwLuFZ+x4G2cRRoD/iz3s6pmEg mpdX194ICo3i58hTtfRLtPGJBFSLshDREyp+9Avi3hMI4ivpjvdq8aFhLKVgmACylC 2laaOKW6AlaJcnzxtZvrTTtOdBAF9EkryQFYMOjA= Date: Fri, 05 Nov 2021 13:41:04 -0700 From: Andrew Morton To: akpm@linux-foundation.org, hughd@google.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, osalvador@suse.de, peterx@redhat.com, shy828301@gmail.com, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 123/262] mm: filemap: coding style cleanup for filemap_map_pmd() Message-ID: <20211105204104.BrlKrYlIS%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=tg8QQgqn; dmarc=none; spf=pass (imf29.hostedemail.com: domain of akpm@linux-foundation.org 
designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 324609000249 X-Stat-Signature: a7cstq14jiu91aua36asw1dpaeaotmtz X-HE-Tag: 1636144866-50011 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Yang Shi Subject: mm: filemap: coding style cleanup for filemap_map_pmd() Patch series "Solve silent data loss caused by poisoned page cache (shmem/tmpfs)", v5. When discussing the patch that splits page cache THP in order to offline the poisoned page, Noaya mentioned there is a bigger problem [1] that prevents this from working since the page cache page will be truncated if uncorrectable errors happen. By looking this deeper it turns out this approach (truncating poisoned page) may incur silent data loss for all non-readonly filesystems if the page is dirty. It may be worse for in-memory filesystem, e.g. shmem/tmpfs since the data blocks are actually gone. To solve this problem we could keep the poisoned dirty page in page cache then notify the users on any later access, e.g. page fault, read/write, etc. The clean page could be truncated as is since they can be reread from disk later on. The consequence is the filesystems may find poisoned page and manipulate it as healthy page since all the filesystems actually don't check if the page is poisoned or not in all the relevant paths except page fault. In general, we need make the filesystems be aware of poisoned page before we could keep the poisoned page in page cache in order to solve the data loss problem. To make filesystems be aware of poisoned page we should consider: - The page should be not written back: clearing dirty flag could prevent from writeback. 
- The page should not be dropped (it shows as a clean page) by drop caches or other callers: the refcount pin from hwpoison could prevent from invalidating (called by cache drop, inode cache shrinking, etc), but it doesn't avoid invalidation in DIO path. - The page should be able to get truncated/hole punched/unlinked: it works as it is. - Notify users when the page is accessed, e.g. read/write, page fault and other paths (compression, encryption, etc). The scope of the last one is huge since almost all filesystems need do it once a page is returned from page cache lookup. There are a couple of options to do it: 1. Check hwpoison flag for every path, the most straightforward way. 2. Return NULL for poisoned page from page cache lookup, the most callsites check if NULL is returned, this should have least work I think. But the error handling in filesystems just return -ENOMEM, the error code will incur confusion to the users obviously. 3. To improve #2, we could return error pointer, e.g. ERR_PTR(-EIO), but this will involve significant amount of code change as well since all the paths need check if the pointer is ERR or not just like option #1. I did prototype for both #1 and #3, but it seems #3 may require more changes than #1. For #3 ERR_PTR will be returned so all the callers need to check the return value otherwise invalid pointer may be dereferenced, but not all callers really care about the content of the page, for example, partial truncate which just sets the truncated range in one page to 0. So for such paths it needs additional modification if ERR_PTR is returned. And if the callers have their own way to handle the problematic pages we need to add a new FGP flag to tell FGP functions to return the pointer to the page. It may happen very rarely, but once it happens the consequence (data corruption) could be very bad and it is very hard to debug. It seems this problem had been slightly discussed before, but seems no action was taken at that time. 
[2]

As the aforementioned investigation shows, it takes a huge amount of work to solve the potential data loss for all filesystems. But it is much easier for in-memory filesystems, and such filesystems actually suffer more than others, since even the data blocks are gone due to truncating. So this patchset starts from shmem/tmpfs by taking option #1.

TODO:
* The unpoison has been broken since commit 0ed950d1f281 ("mm,hwpoison: make get_hwpoison_page() call get_any_page()"), and this patch series makes the refcount check for unpoisoning a shmem page fail.
* Expand to other filesystems. But I haven't heard feedback from filesystem developers yet.

Patch breakdown:
Patch #1: cleanup, depended on by patch #2
Patch #2: fix THP with hwpoisoned subpage(s) PMD map bug
Patch #3: coding style cleanup
Patch #4: refactor and preparation.
Patch #5: keep the poisoned page in page cache and handle such case for all the paths.
Patch #6: the previous patches unblock page cache THP split, so this patch adds page cache THP split support.

This patch (of 4):

A minor cleanup to the indent.

Link: https://lkml.kernel.org/r/20211020210755.23964-1-shy828301@gmail.com
Link: https://lkml.kernel.org/r/20211020210755.23964-4-shy828301@gmail.com
Signed-off-by: Yang Shi
Reviewed-by: Naoya Horiguchi
Cc: Hugh Dickins
Cc: Kirill A. Shutemov
Cc: Matthew Wilcox
Cc: Oscar Salvador
Cc: Peter Xu
Signed-off-by: Andrew Morton
--- mm/filemap.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) --- a/mm/filemap.c~mm-filemap-coding-style-cleanup-for-filemap_map_pmd +++ a/mm/filemap.c @@ -3203,12 +3203,12 @@ static bool filemap_map_pmd(struct vm_fa } if (pmd_none(*vmf->pmd) && PageTransHuge(page)) { - vm_fault_t ret = do_set_pmd(vmf, page); - if (!ret) { - /* The page is mapped successfully, reference consumed. */ - unlock_page(page); - return true; - } + vm_fault_t ret = do_set_pmd(vmf, page); + if (!ret) { + /* The page is mapped successfully, reference consumed.
*/ + unlock_page(page); + return true; + } } if (pmd_none(*vmf->pmd)) From patchwork Fri Nov 5 20:41:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605611 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9DBC4C433EF for ; Fri, 5 Nov 2021 20:41:11 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 57673611C0 for ; Fri, 5 Nov 2021 20:41:11 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 57673611C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id EE239940078; Fri, 5 Nov 2021 16:41:10 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E925E940049; Fri, 5 Nov 2021 16:41:10 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D8066940078; Fri, 5 Nov 2021 16:41:10 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0249.hostedemail.com [216.40.44.249]) by kanga.kvack.org (Postfix) with ESMTP id C825B940049 for ; Fri, 5 Nov 2021 16:41:10 -0400 (EDT) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 9048B82499A8 for ; Fri, 5 Nov 2021 20:41:10 +0000 (UTC) X-FDA: 78776046300.21.7DE6E72 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf25.hostedemail.com (Postfix) with ESMTP id 8ADD8B00018B for ; Fri, 5 Nov 2021 20:40:59 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 3531361279; Fri, 5 Nov 2021 
20:41:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144868; bh=UsuK52LmUMrPFArYKzCNECASXKGEtg5HkHkL69bbWlg=; h=Date:From:To:Subject:In-Reply-To:From; b=D3ve0zjibpfQNun5KN88gC4YeqGtcoTVzo0vHGK0vON1DDzBbeY4QoK+y45JeZLH5 KA6KwNhv15abxYxq0oDKSbphHKpDDBMBZ50AwaKeyx2MSe65739UMEttLGlwmT4iFX nWLP77rLO/ZaSOdiAKxqFgsFiwOIQuZhgWINdEWw= Date: Fri, 05 Nov 2021 13:41:07 -0700 From: Andrew Morton To: akpm@linux-foundation.org, hughd@google.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, osalvador@suse.de, peterx@redhat.com, shy828301@gmail.com, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 124/262] mm: hwpoison: refactor refcount check handling Message-ID: <20211105204107.FgtW39TxN%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 8ADD8B00018B X-Stat-Signature: 77e7xy7ye91wwtgjs9gwjutgurseq5tc Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=D3ve0zji; spf=pass (imf25.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144859-363969 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

From: Yang Shi
Subject: mm: hwpoison: refactor refcount check handling

Memory failure will report failure if the page still has an extra pinned refcount, other than the one from hwpoison, after the handler is done. The check is not actually necessary for all handlers, so move it into the specific handlers. This makes the following patch, which keeps the shmem page in page cache, easier.
There may be expected extra pin for some cases, for example, when the page is dirty and in swapcache. Link: https://lkml.kernel.org/r/20211020210755.23964-5-shy828301@gmail.com Signed-off-by: Yang Shi Signed-off-by: Naoya Horiguchi Suggested-by: Naoya Horiguchi Cc: Hugh Dickins Cc: Kirill A. Shutemov Cc: Matthew Wilcox Cc: Oscar Salvador Cc: Peter Xu Signed-off-by: Andrew Morton --- mm/memory-failure.c | 93 ++++++++++++++++++++++++++++-------------- 1 file changed, 64 insertions(+), 29 deletions(-) --- a/mm/memory-failure.c~mm-hwpoison-refactor-refcount-check-handling +++ a/mm/memory-failure.c @@ -807,12 +807,44 @@ static int truncate_error_page(struct pa return ret; } +struct page_state { + unsigned long mask; + unsigned long res; + enum mf_action_page_type type; + + /* Callback ->action() has to unlock the relevant page inside it. */ + int (*action)(struct page_state *ps, struct page *p); +}; + +/* + * Return true if page is still referenced by others, otherwise return + * false. + * + * The extra_pins is true when one extra refcount is expected. + */ +static bool has_extra_refcount(struct page_state *ps, struct page *p, + bool extra_pins) +{ + int count = page_count(p) - 1; + + if (extra_pins) + count -= 1; + + if (count > 0) { + pr_err("Memory failure: %#lx: %s still referenced by %d users\n", + page_to_pfn(p), action_page_types[ps->type], count); + return true; + } + + return false; +} + /* * Error hit kernel page. * Do nothing, try to be lucky and not touch this instead. For a few cases we * could be more sophisticated. */ -static int me_kernel(struct page *p, unsigned long pfn) +static int me_kernel(struct page_state *ps, struct page *p) { unlock_page(p); return MF_IGNORED; @@ -821,9 +853,9 @@ static int me_kernel(struct page *p, uns /* * Page in unknown state. Do nothing. 
*/ -static int me_unknown(struct page *p, unsigned long pfn) +static int me_unknown(struct page_state *ps, struct page *p) { - pr_err("Memory failure: %#lx: Unknown page state\n", pfn); + pr_err("Memory failure: %#lx: Unknown page state\n", page_to_pfn(p)); unlock_page(p); return MF_FAILED; } @@ -831,7 +863,7 @@ static int me_unknown(struct page *p, un /* * Clean (or cleaned) page cache page. */ -static int me_pagecache_clean(struct page *p, unsigned long pfn) +static int me_pagecache_clean(struct page_state *ps, struct page *p) { int ret; struct address_space *mapping; @@ -868,9 +900,13 @@ static int me_pagecache_clean(struct pag * * Open: to take i_rwsem or not for this? Right now we don't. */ - ret = truncate_error_page(p, pfn, mapping); + ret = truncate_error_page(p, page_to_pfn(p), mapping); out: unlock_page(p); + + if (has_extra_refcount(ps, p, false)) + ret = MF_FAILED; + return ret; } @@ -879,7 +915,7 @@ out: * Issues: when the error hit a hole page the error is not properly * propagated. */ -static int me_pagecache_dirty(struct page *p, unsigned long pfn) +static int me_pagecache_dirty(struct page_state *ps, struct page *p) { struct address_space *mapping = page_mapping(p); @@ -923,7 +959,7 @@ static int me_pagecache_dirty(struct pag mapping_set_error(mapping, -EIO); } - return me_pagecache_clean(p, pfn); + return me_pagecache_clean(ps, p); } /* @@ -945,9 +981,10 @@ static int me_pagecache_dirty(struct pag * Clean swap cache pages can be directly isolated. A later page fault will * bring in the known good data from disk. */ -static int me_swapcache_dirty(struct page *p, unsigned long pfn) +static int me_swapcache_dirty(struct page_state *ps, struct page *p) { int ret; + bool extra_pins = false; ClearPageDirty(p); /* Trigger EIO in shmem: */ @@ -955,10 +992,17 @@ static int me_swapcache_dirty(struct pag ret = delete_from_lru_cache(p) ? 
MF_FAILED : MF_DELAYED; unlock_page(p); + + if (ret == MF_DELAYED) + extra_pins = true; + + if (has_extra_refcount(ps, p, extra_pins)) + ret = MF_FAILED; + return ret; } -static int me_swapcache_clean(struct page *p, unsigned long pfn) +static int me_swapcache_clean(struct page_state *ps, struct page *p) { int ret; @@ -966,6 +1010,10 @@ static int me_swapcache_clean(struct pag ret = delete_from_lru_cache(p) ? MF_FAILED : MF_RECOVERED; unlock_page(p); + + if (has_extra_refcount(ps, p, false)) + ret = MF_FAILED; + return ret; } @@ -975,7 +1023,7 @@ static int me_swapcache_clean(struct pag * - Error on hugepage is contained in hugepage unit (not in raw page unit.) * To narrow down kill region to one page, we need to break up pmd. */ -static int me_huge_page(struct page *p, unsigned long pfn) +static int me_huge_page(struct page_state *ps, struct page *p) { int res; struct page *hpage = compound_head(p); @@ -986,7 +1034,7 @@ static int me_huge_page(struct page *p, mapping = page_mapping(hpage); if (mapping) { - res = truncate_error_page(hpage, pfn, mapping); + res = truncate_error_page(hpage, page_to_pfn(p), mapping); unlock_page(hpage); } else { res = MF_FAILED; @@ -1004,6 +1052,9 @@ static int me_huge_page(struct page *p, } } + if (has_extra_refcount(ps, p, false)) + res = MF_FAILED; + return res; } @@ -1029,14 +1080,7 @@ static int me_huge_page(struct page *p, #define slab (1UL << PG_slab) #define reserved (1UL << PG_reserved) -static struct page_state { - unsigned long mask; - unsigned long res; - enum mf_action_page_type type; - - /* Callback ->action() has to unlock the relevant page inside it. 
*/ - int (*action)(struct page *p, unsigned long pfn); -} error_states[] = { +static struct page_state error_states[] = { { reserved, reserved, MF_MSG_KERNEL, me_kernel }, /* * free pages are specially detected outside this table: @@ -1096,19 +1140,10 @@ static int page_action(struct page_state unsigned long pfn) { int result; - int count; /* page p should be unlocked after returning from ps->action(). */ - result = ps->action(p, pfn); + result = ps->action(ps, p); - count = page_count(p) - 1; - if (ps->action == me_swapcache_dirty && result == MF_DELAYED) - count--; - if (count > 0) { - pr_err("Memory failure: %#lx: %s still referenced by %d users\n", - pfn, action_page_types[ps->type], count); - result = MF_FAILED; - } action_result(pfn, ps->type, result); /* Could do more checks here if page looks ok */ From patchwork Fri Nov 5 20:41:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605613 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0BF53C4321E for ; Fri, 5 Nov 2021 20:41:14 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B48C06127A for ; Fri, 5 Nov 2021 20:41:13 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org B48C06127A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 52070940079; Fri, 5 Nov 2021 16:41:13 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 4CFD1940049; Fri, 5 Nov 2021 16:41:13 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 
63042) id 37014940079; Fri, 5 Nov 2021 16:41:13 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0191.hostedemail.com [216.40.44.191]) by kanga.kvack.org (Postfix) with ESMTP id 26929940049 for ; Fri, 5 Nov 2021 16:41:13 -0400 (EDT) Received: from smtpin14.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id E12905D718 for ; Fri, 5 Nov 2021 20:41:12 +0000 (UTC) X-FDA: 78776046426.14.6E69A17 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf29.hostedemail.com (Postfix) with ESMTP id 7CE0C9000249 for ; Fri, 5 Nov 2021 20:41:12 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 5EA1E6126A; Fri, 5 Nov 2021 20:41:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144871; bh=BBy4UbTkdz2d0HQd6+h5t8JQhHE+UceAIabf+mWFg9g=; h=Date:From:To:Subject:In-Reply-To:From; b=AQUQHoVBBuJWBheKoTxw14qziCQylWTLVlDum5aEXPXl+JKCYfMrXOahSe+qv13MU IR8ZaxLyHGscx6UC4X8xJ0w63u43t1+ODgq7ftmC2GnoeMfgCUH2o/G/4xNNAh8vKZ oAjVRdSs/bSafky79g05bWY6bMRLOhmwJQNcfOTA= Date: Fri, 05 Nov 2021 13:41:10 -0700 From: Andrew Morton To: akpm@linux-foundation.org, arnd@arndb.de, hughd@google.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, osalvador@suse.de, peterx@redhat.com, shy828301@gmail.com, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 125/262] mm: shmem: don't truncate page if memory failure happens Message-ID: <20211105204110.FGZfj6Q-n%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 7CE0C9000249 X-Stat-Signature: dryirmgw5bh5pp5khmkkkbh3r8u1uh8z Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=AQUQHoVB; dmarc=none; spf=pass (imf29.hostedemail.com: 
domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144872-260819 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

From: Yang Shi
Subject: mm: shmem: don't truncate page if memory failure happens

The current behavior of memory failure is to truncate the page cache page regardless of whether it is dirty or clean. If the page is dirty, a later access will get obsolete data from disk without any notification to the users. This may cause silent data loss. It is even worse for shmem: since shmem is an in-memory filesystem, truncating the page cache means discarding data blocks. A later read would return all zeros.

The right approach is to keep the corrupted page in page cache; any later access would return an error for syscalls or SIGBUS for page faults, until the file is truncated, hole punched or removed. Regular storage-backed filesystems would be more complicated, so this patch focuses on shmem. This also unblocks the support for soft offlining shmem THP.

[arnd@arndb.de: fix uninitialized variable use in me_pagecache_clean()]
Link: https://lkml.kernel.org/r/20211022064748.4173718-1-arnd@kernel.org
Link: https://lkml.kernel.org/r/20211020210755.23964-6-shy828301@gmail.com
Signed-off-by: Yang Shi
Signed-off-by: Arnd Bergmann
Cc: Hugh Dickins
Cc: Kirill A.
Shutemov Cc: Matthew Wilcox Cc: Naoya Horiguchi Cc: Oscar Salvador Cc: Peter Xu Signed-off-by: Andrew Morton --- mm/memory-failure.c | 14 +++++++++++--- mm/shmem.c | 38 +++++++++++++++++++++++++++++++++++--- mm/userfaultfd.c | 5 +++++ 3 files changed, 51 insertions(+), 6 deletions(-) --- a/mm/memory-failure.c~mm-shmem-dont-truncate-page-if-memory-failure-happens +++ a/mm/memory-failure.c @@ -58,6 +58,7 @@ #include #include #include +#include #include "internal.h" #include "ras/ras_event.h" @@ -867,6 +868,7 @@ static int me_pagecache_clean(struct pag { int ret; struct address_space *mapping; + bool extra_pins; delete_from_lru_cache(p); @@ -896,17 +898,23 @@ static int me_pagecache_clean(struct pag } /* + * The shmem page is kept in page cache instead of truncating + * so is expected to have an extra refcount after error-handling. + */ + extra_pins = shmem_mapping(mapping); + + /* * Truncation is a bit tricky. Enable it per file system for now. * * Open: to take i_rwsem or not for this? Right now we don't. 
*/ ret = truncate_error_page(p, page_to_pfn(p), mapping); + if (has_extra_refcount(ps, p, extra_pins)) + ret = MF_FAILED; + out: unlock_page(p); - if (has_extra_refcount(ps, p, false)) - ret = MF_FAILED; - return ret; } --- a/mm/shmem.c~mm-shmem-dont-truncate-page-if-memory-failure-happens +++ a/mm/shmem.c @@ -2454,6 +2454,7 @@ shmem_write_begin(struct file *file, str struct inode *inode = mapping->host; struct shmem_inode_info *info = SHMEM_I(inode); pgoff_t index = pos >> PAGE_SHIFT; + int ret = 0; /* i_rwsem is held by caller */ if (unlikely(info->seals & (F_SEAL_GROW | @@ -2464,7 +2465,15 @@ shmem_write_begin(struct file *file, str return -EPERM; } - return shmem_getpage(inode, index, pagep, SGP_WRITE); + ret = shmem_getpage(inode, index, pagep, SGP_WRITE); + + if (*pagep && PageHWPoison(*pagep)) { + unlock_page(*pagep); + put_page(*pagep); + ret = -EIO; + } + + return ret; } static int @@ -2551,6 +2560,12 @@ static ssize_t shmem_file_read_iter(stru if (sgp == SGP_CACHE) set_page_dirty(page); unlock_page(page); + + if (PageHWPoison(page)) { + put_page(page); + error = -EIO; + break; + } } /* @@ -3112,7 +3127,8 @@ static const char *shmem_get_link(struct page = find_get_page(inode->i_mapping, 0); if (!page) return ERR_PTR(-ECHILD); - if (!PageUptodate(page)) { + if (PageHWPoison(page) || + !PageUptodate(page)) { put_page(page); return ERR_PTR(-ECHILD); } @@ -3120,6 +3136,11 @@ static const char *shmem_get_link(struct error = shmem_getpage(inode, 0, &page, SGP_READ); if (error) return ERR_PTR(error); + if (page && PageHWPoison(page)) { + unlock_page(page); + put_page(page); + return ERR_PTR(-ECHILD); + } unlock_page(page); } set_delayed_call(done, shmem_put_link, page); @@ -3770,6 +3791,13 @@ static void shmem_destroy_inodecache(voi kmem_cache_destroy(shmem_inode_cachep); } +/* Keep the page in page cache instead of truncating it */ +static int shmem_error_remove_page(struct address_space *mapping, + struct page *page) +{ + return 0; +} + const struct 
address_space_operations shmem_aops = { .writepage = shmem_writepage, .set_page_dirty = __set_page_dirty_no_writeback, @@ -3780,7 +3808,7 @@ const struct address_space_operations sh #ifdef CONFIG_MIGRATION .migratepage = migrate_page, #endif - .error_remove_page = generic_error_remove_page, + .error_remove_page = shmem_error_remove_page, }; EXPORT_SYMBOL(shmem_aops); @@ -4191,6 +4219,10 @@ struct page *shmem_read_mapping_page_gfp page = ERR_PTR(error); else unlock_page(page); + + if (PageHWPoison(page)) + page = ERR_PTR(-EIO); + return page; #else /* --- a/mm/userfaultfd.c~mm-shmem-dont-truncate-page-if-memory-failure-happens +++ a/mm/userfaultfd.c @@ -232,6 +232,11 @@ static int mcontinue_atomic_pte(struct m goto out; } + if (PageHWPoison(page)) { + ret = -EIO; + goto out_release; + } + ret = mfill_atomic_install_pte(dst_mm, dst_pmd, dst_vma, dst_addr, page, false, wp_copy); if (ret) From patchwork Fri Nov 5 20:41:14 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605615 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 19998C43219 for ; Fri, 5 Nov 2021 20:41:17 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C851C61279 for ; Fri, 5 Nov 2021 20:41:16 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org C851C61279 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 6A584940049; Fri, 5 Nov 2021 16:41:16 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 654C594007A; Fri, 5 Nov 2021 16:41:16 -0400 (EDT) X-Delivered-To: 
int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 542E7940049; Fri, 5 Nov 2021 16:41:16 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0225.hostedemail.com [216.40.44.225]) by kanga.kvack.org (Postfix) with ESMTP id 43DCF94007A for ; Fri, 5 Nov 2021 16:41:16 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 0248E8249980 for ; Fri, 5 Nov 2021 20:41:16 +0000 (UTC) X-FDA: 78776046552.22.AC74E6F Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf01.hostedemail.com (Postfix) with ESMTP id D2020508FA48 for ; Fri, 5 Nov 2021 20:41:03 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 898FC611C0; Fri, 5 Nov 2021 20:41:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144874; bh=2sbxwZv0ZvcIIZZ5F7BXAbHE67nAS2XyR/CDD0yDCU8=; h=Date:From:To:Subject:In-Reply-To:From; b=YxVD7wu5cyEWNbVpoM05FULHPkuQ9l/RzGdzsj/MGjyf9LpGkIvhhyO7+BfvkQudS m2fzDKFxWifm6cdS+adZdL1GboPNQPtNZ1FmEeF9DZrciPw5sqLJCXifJK7kaD+yFW Tw+KbqbvLwx2xTQsEGkgPJCYsyAnKUjw+DdL5ggE= Date: Fri, 05 Nov 2021 13:41:14 -0700 From: Andrew Morton To: akpm@linux-foundation.org, hughd@google.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, osalvador@suse.de, peterx@redhat.com, shy828301@gmail.com, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 126/262] mm: hwpoison: handle non-anonymous THP correctly Message-ID: <20211105204114.oqF4Oa0js%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: D2020508FA48 X-Stat-Signature: j8p9gcoyrksn6majjic6mh7ykijjh1iy Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg 
header.b=YxVD7wu5; spf=pass (imf01.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144863-261509 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Yang Shi Subject: mm: hwpoison: handle non-anonymous THP correctly Currently hwpoison doesn't handle non-anonymous THP, but since v4.8 THP support for tmpfs and read-only file cache has been added. They could be offlined by split THP, just like anonymous THP. Link: https://lkml.kernel.org/r/20211020210755.23964-7-shy828301@gmail.com Signed-off-by: Yang Shi Acked-by: Naoya Horiguchi Cc: Hugh Dickins Cc: Kirill A. Shutemov Cc: Matthew Wilcox Cc: Oscar Salvador Cc: Peter Xu Signed-off-by: Andrew Morton --- mm/memory-failure.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) --- a/mm/memory-failure.c~mm-hwpoison-handle-non-anonymous-thp-correctly +++ a/mm/memory-failure.c @@ -1444,14 +1444,11 @@ static int identify_page_state(unsigned static int try_to_split_thp_page(struct page *page, const char *msg) { lock_page(page); - if (!PageAnon(page) || unlikely(split_huge_page(page))) { + if (unlikely(split_huge_page(page))) { unsigned long pfn = page_to_pfn(page); unlock_page(page); - if (!PageAnon(page)) - pr_info("%s: %#lx: non anonymous thp\n", msg, pfn); - else - pr_info("%s: %#lx: thp split failed\n", msg, pfn); + pr_info("%s: %#lx: thp split failed\n", msg, pfn); put_page(page); return -EBUSY; } From patchwork Fri Nov 5 20:41:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605617 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by 
smtp.lore.kernel.org (Postfix) with ESMTP id 102ECC433F5 for ; Fri, 5 Nov 2021 20:41:20 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B8AB86127A for ; Fri, 5 Nov 2021 20:41:19 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org B8AB86127A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 5926A94007B; Fri, 5 Nov 2021 16:41:19 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 5441D94007A; Fri, 5 Nov 2021 16:41:19 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3BDD894007B; Fri, 5 Nov 2021 16:41:19 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0223.hostedemail.com [216.40.44.223]) by kanga.kvack.org (Postfix) with ESMTP id 28A0094007A for ; Fri, 5 Nov 2021 16:41:19 -0400 (EDT) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id EBB4C18046C72 for ; Fri, 5 Nov 2021 20:41:18 +0000 (UTC) X-FDA: 78776046636.12.1F397B1 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf13.hostedemail.com (Postfix) with ESMTP id 89900104AAEE for ; Fri, 5 Nov 2021 20:41:09 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id A3F516126A; Fri, 5 Nov 2021 20:41:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144877; bh=bF0s3PiCRC3VcsPIZ/ETjwjY/SOtP8deMZuBhk2qp44=; h=Date:From:To:Subject:In-Reply-To:From; b=bOyjedehA1edoOql9tGF0gXPbopigpAoWamzd34jhgP3/OWvULYoG4cgkeT4lmwwX qF6D5KuZP5KDvwNQn9rhrPDSnJsod2LeNvA8vstvrXRyD47IP+EMTI5tyJIvomI7ZS ssDCqOkRGkCsXY9RU52wS1Lw2Ew5P4j5Cy+xEBYE= Date: Fri, 05 Nov 2021 13:41:17 -0700 From: Andrew Morton 
To: akpm@linux-foundation.org, david@redhat.com, jhubbard@nvidia.com, linux-mm@kvack.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, peterx@redhat.com, songmuchun@bytedance.com, torvalds@linux-foundation.org Subject: [patch 127/262] mm/hugetlb: drop __unmap_hugepage_range definition from hugetlb.h Message-ID: <20211105204117.MUFO1Wxn5%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 89900104AAEE X-Stat-Signature: dh9c59ktjxars3ucx3hiwj8unnz3gi4m Authentication-Results: imf13.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=bOyjedeh; spf=pass (imf13.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144869-626100 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Peter Xu Subject: mm/hugetlb: drop __unmap_hugepage_range definition from hugetlb.h Remove __unmap_hugepage_range() from the header file, because it is only used in hugetlb.c. 
Link: https://lkml.kernel.org/r/20210917165108.9341-1-peterx@redhat.com Signed-off-by: Peter Xu Suggested-by: Mike Kravetz Reviewed-by: Mike Kravetz Reviewed-by: John Hubbard Reviewed-by: Muchun Song Reviewed-by: David Hildenbrand Signed-off-by: Andrew Morton --- include/linux/hugetlb.h | 10 ---------- mm/hugetlb.c | 6 +++--- 2 files changed, 3 insertions(+), 13 deletions(-) --- a/include/linux/hugetlb.h~mm-hugetlb-drop-__unmap_hugepage_range-definition-from-hugetlbh +++ a/include/linux/hugetlb.h @@ -143,9 +143,6 @@ void __unmap_hugepage_range_final(struct struct vm_area_struct *vma, unsigned long start, unsigned long end, struct page *ref_page); -void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, - unsigned long start, unsigned long end, - struct page *ref_page); void hugetlb_report_meminfo(struct seq_file *); int hugetlb_report_node_meminfo(char *buf, int len, int nid); void hugetlb_show_meminfo(void); @@ -382,13 +379,6 @@ static inline void __unmap_hugepage_rang struct vm_area_struct *vma, unsigned long start, unsigned long end, struct page *ref_page) { - BUG(); -} - -static inline void __unmap_hugepage_range(struct mmu_gather *tlb, - struct vm_area_struct *vma, unsigned long start, - unsigned long end, struct page *ref_page) -{ BUG(); } --- a/mm/hugetlb.c~mm-hugetlb-drop-__unmap_hugepage_range-definition-from-hugetlbh +++ a/mm/hugetlb.c @@ -4426,9 +4426,9 @@ again: return ret; } -void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, - unsigned long start, unsigned long end, - struct page *ref_page) +static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, + unsigned long start, unsigned long end, + struct page *ref_page) { struct mm_struct *mm = vma->vm_mm; unsigned long address; From patchwork Fri Nov 5 20:41:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605619 
Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D5D1EC433F5 for ; Fri, 5 Nov 2021 20:41:23 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 726A06126A for ; Fri, 5 Nov 2021 20:41:23 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 726A06126A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 0D91694007A; Fri, 5 Nov 2021 16:41:23 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 087E694007C; Fri, 5 Nov 2021 16:41:23 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id EA05394007A; Fri, 5 Nov 2021 16:41:22 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0002.hostedemail.com [216.40.44.2]) by kanga.kvack.org (Postfix) with ESMTP id D582794007C for ; Fri, 5 Nov 2021 16:41:22 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 9F9A01855931E for ; Fri, 5 Nov 2021 20:41:22 +0000 (UTC) X-FDA: 78776046804.29.29BE85D Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf01.hostedemail.com (Postfix) with ESMTP id 695AC508FA64 for ; Fri, 5 Nov 2021 20:41:10 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 05346611C0; Fri, 5 Nov 2021 20:41:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144881; bh=d1qjORPaX5VHzrZw4UVxUevVVCj0RjbXre/+HHENDmA=; h=Date:From:To:Subject:In-Reply-To:From; 
b=XVIq5Z9iDcnrbSUna4lVx1fB5+2RT/V73ZTPGpAPsGkxW06SMvWIWwqoM0sWEUzMl snv8WqyzgI/ekpDAT8MuHSfKSUKUPsuK5QWIsUr87cmnojZVOBhp85tDMywhgSmHSY xiozl3LCAqyF8g9rYMlnJXxUtzGik+oMQac+vYKw= Date: Fri, 05 Nov 2021 13:41:20 -0700 From: Andrew Morton To: akpm@linux-foundation.org, aneesh.kumar@linux.ibm.com, david@redhat.com, linux-mm@kvack.org, mhocko@suse.com, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, naoya.horiguchi@linux.dev, nghialm78@gmail.com, osalvador@suse.de, rientjes@google.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, ziy@nvidia.com Subject: [patch 128/262] hugetlb: add demote hugetlb page sysfs interfaces Message-ID: <20211105204120.C6nNZAF1p%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=XVIq5Z9i; dmarc=none; spf=pass (imf01.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 695AC508FA64 X-Stat-Signature: n1a6pbwnsz31ncrzfcx5pjqjs66oj78a X-HE-Tag: 1636144870-925016 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Mike Kravetz Subject: hugetlb: add demote hugetlb page sysfs interfaces Patch series "hugetlb: add demote/split page functionality", v4. The concurrent use of multiple hugetlb page sizes on a single system is becoming more common. One of the reasons is better TLB support for gigantic page sizes on x86 hardware. In addition, hugetlb pages are being used to back VMs in hosting environments. When using hugetlb pages to back VMs, it is often desirable to preallocate hugetlb pools. This avoids the delay and uncertainty of allocating hugetlb pages at VM startup. 
In addition, preallocating huge pages minimizes the issue of memory fragmentation that increases the longer the system is up and running.

In such environments, a combination of larger and smaller hugetlb pages is preallocated in anticipation of backing VMs of various sizes. Over time, the preallocated pool of smaller hugetlb pages may become depleted while larger hugetlb pages still remain. In such situations, it is desirable to convert larger hugetlb pages to smaller hugetlb pages.

Converting larger to smaller hugetlb pages can be accomplished today by first freeing the larger page to the buddy allocator and then allocating the smaller pages. For example, to convert 50 1GB pages on x86:

    gb_pages=`cat .../hugepages-1048576kB/nr_hugepages`
    m2_pages=`cat .../hugepages-2048kB/nr_hugepages`
    echo $(($gb_pages - 50)) > .../hugepages-1048576kB/nr_hugepages
    echo $(($m2_pages + 25600)) > .../hugepages-2048kB/nr_hugepages

On an idle system this operation is fairly reliable and results are as expected. The number of 2MB pages is increased as expected and the time of the operation is a second or two.

However, when there is activity on the system the following issues arise:

1) This process can take quite some time, especially if allocation of the smaller pages is not immediate and requires migration/compaction.
2) There is no guarantee that the total size of smaller pages allocated will match the size of the larger page which was freed. This is because the area freed by the larger page could quickly be fragmented.

In a test environment with a load that continually fills the page cache with clean pages, results such as the following can be observed:

    Unexpected number of 2MB pages allocated: Expected 25600, have 19944
    real    0m42.092s
    user    0m0.008s
    sys     0m41.467s

To address these issues, introduce the concept of hugetlb page demotion. Demotion provides a means of 'in place' splitting of a hugetlb page to pages of a smaller size.
This avoids freeing pages to buddy and then trying to allocate from buddy.

Page demotion is controlled via sysfs files that reside in the per-hugetlb page size and per node directories.

- demote_size
    Target page size for demotion, a smaller huge page size. File can be written to choose a smaller huge page size if multiple are available.
- demote
    Writable number of hugetlb pages to be demoted

To demote 50 1GB huge pages, one would:

    cat .../hugepages-1048576kB/free_hugepages   /* optional, verify free pages */
    cat .../hugepages-1048576kB/demote_size      /* optional, verify target size */
    echo 50 > .../hugepages-1048576kB/demote

Only hugetlb pages which are free at the time of the request can be demoted. Demotion does not add to the complexity of surplus pages and honors reserved huge pages. Therefore, when a value is written to the sysfs demote file, that value is only the maximum number of pages which will be demoted. It is possible fewer will actually be demoted. The recently introduced per-hstate mutex is used to synchronize demote operations with other operations that modify hugetlb pools.

Real world use cases
--------------------
The above scenario describes a real world use case where hugetlb pages are used to back VMs on x86. Both issues of long allocation times and not necessarily getting the expected number of smaller huge pages after a free and allocate cycle have been experienced. The occurrence of these issues is dependent on other activity within the host and cannot be predicted.

This patch (of 5):

Two new sysfs files are added to demote hugetlb pages. These files are both per-hugetlb page size and per node. Files are:

  demote_size - The size in kB that pages are demoted to. (read-write)
  demote - The number of huge pages to demote. (write-only)

By default, demote_size is the next smaller huge page size. Any valid huge page size smaller than the current huge page size may be written to this file. When huge pages are demoted, they are demoted to this size.
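
As a rough illustration of the default described above, the choice of demote_size and the resulting page count can be sketched in userspace C. This is only a sketch, not the kernel implementation: the function names default_demote_order() and pages_per_demote() are hypothetical, and 4kB base pages are assumed, so that a 2MB huge page is order 9 and a 1GB huge page is order 18.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative userspace sketch, not kernel code. An "order" is log2 of
 * the number of base pages in a huge page. With 4kB base pages (an
 * assumption of this sketch), 2MB is order 9 and 1GB is order 18.
 */

/*
 * Return the default demote order for a huge page of the given order:
 * the largest order in orders[] that is still smaller than it, or 0 if
 * no smaller huge page size exists.
 */
unsigned int default_demote_order(const unsigned int *orders, size_t n,
				  unsigned int order)
{
	unsigned int demote_order = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (orders[i] < order && orders[i] > demote_order)
			demote_order = orders[i];
	}
	return demote_order;
}

/* Number of demote_order pages obtained by splitting one page of order. */
unsigned long pages_per_demote(unsigned int order, unsigned int demote_order)
{
	assert(demote_order < order);
	return 1UL << (order - demote_order);
}
```

With the orders { 9, 18 }, default_demote_order() returns 9 for an order-18 page and 0 for an order-9 page, and pages_per_demote(18, 9) is 512, so demoting 50 1GB pages would yield at most 50 * 512 = 25600 2MB pages.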
Writing a value to demote will result in an attempt to demote that number of hugetlb pages to an appropriate number of demote_size pages. NOTE: Demote interfaces are only provided for huge page sizes if there is a smaller target demote huge page size. For example, on x86 1GB huge pages will have demote interfaces. 2MB huge pages will not have demote interfaces. This patch does not provide full demote functionality. It only provides the sysfs interfaces. It also provides documentation for the new interfaces. [mike.kravetz@oracle.com: n_mask initialization does not need to be protected by the mutex] Link: https://lkml.kernel.org/r/0530e4ef-2492-5186-f919-5db68edea654@oracle.com Link: https://lkml.kernel.org/r/20211007181918.136982-2-mike.kravetz@oracle.com Link: https://lkml.kernel.org/r/20211007181918.136982-2-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz Reviewed-by: Oscar Salvador Cc: David Hildenbrand Cc: Michal Hocko Cc: Zi Yan Cc: Muchun Song Cc: Naoya Horiguchi Cc: David Rientjes Cc: "Aneesh Kumar K . V" Cc: Nghia Le Cc: Mike Kravetz Signed-off-by: Andrew Morton --- Documentation/admin-guide/mm/hugetlbpage.rst | 30 +++ include/linux/hugetlb.h | 1 mm/hugetlb.c | 155 ++++++++++++++++- 3 files changed, 183 insertions(+), 3 deletions(-) --- a/Documentation/admin-guide/mm/hugetlbpage.rst~hugetlb-add-demote-hugetlb-page-sysfs-interfaces +++ a/Documentation/admin-guide/mm/hugetlbpage.rst @@ -234,8 +234,12 @@ will exist, of the form:: hugepages-${size}kB -Inside each of these directories, the same set of files will exist:: +Inside each of these directories, the set of files contained in ``/proc`` +will exist. In addition, two additional interfaces for demoting huge +pages may exist:: + demote + demote_size nr_hugepages nr_hugepages_mempolicy nr_overcommit_hugepages @@ -243,7 +247,29 @@ Inside each of these directories, the sa resv_hugepages surplus_hugepages -which function as described above for the default huge page-sized case. 
+The demote interfaces provide the ability to split a huge page into +smaller huge pages. For example, the x86 architecture supports both +1GB and 2MB huge pages sizes. A 1GB huge page can be split into 512 +2MB huge pages. Demote interfaces are not available for the smallest +huge page size. The demote interfaces are: + +demote_size + is the size of demoted pages. When a page is demoted a corresponding + number of huge pages of demote_size will be created. By default, + demote_size is set to the next smaller huge page size. If there are + multiple smaller huge page sizes, demote_size can be set to any of + these smaller sizes. Only huge page sizes less than the current huge + pages size are allowed. + +demote + is used to demote a number of huge pages. A user with root privileges + can write to this file. It may not be possible to demote the + requested number of huge pages. To determine how many pages were + actually demoted, compare the value of nr_hugepages before and after + writing to the demote interface. demote is a write only interface. + +The interfaces which are the same as in ``/proc`` (all except demote and +demote_size) function as described above for the default huge page-sized case. .. 
_mem_policy_and_hp_alloc: --- a/include/linux/hugetlb.h~hugetlb-add-demote-hugetlb-page-sysfs-interfaces +++ a/include/linux/hugetlb.h @@ -586,6 +586,7 @@ struct hstate { int next_nid_to_alloc; int next_nid_to_free; unsigned int order; + unsigned int demote_order; unsigned long mask; unsigned long max_huge_pages; unsigned long nr_huge_pages; --- a/mm/hugetlb.c~hugetlb-add-demote-hugetlb-page-sysfs-interfaces +++ a/mm/hugetlb.c @@ -2986,7 +2986,7 @@ free: static void __init hugetlb_init_hstates(void) { - struct hstate *h; + struct hstate *h, *h2; for_each_hstate(h) { if (minimum_order > huge_page_order(h)) @@ -2995,6 +2995,22 @@ static void __init hugetlb_init_hstates( /* oversize hugepages were init'ed in early boot */ if (!hstate_is_gigantic(h)) hugetlb_hstate_alloc_pages(h); + + /* + * Set demote order for each hstate. Note that + * h->demote_order is initially 0. + * - We can not demote gigantic pages if runtime freeing + * is not supported, so skip this. + */ + if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported()) + continue; + for_each_hstate(h2) { + if (h2 == h) + continue; + if (h2->order < h->order && + h2->order > h->demote_order) + h->demote_order = h2->order; + } } VM_BUG_ON(minimum_order == UINT_MAX); } @@ -3235,9 +3251,31 @@ out: return 0; } +static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed) + __must_hold(&hugetlb_lock) +{ + int rc = 0; + + lockdep_assert_held(&hugetlb_lock); + + /* We should never get here if no demote order */ + if (!h->demote_order) { + pr_warn("HugeTLB: NULL demote order passed to demote_pool_huge_page.\n"); + return -EINVAL; /* internal error */ + } + + /* + * TODO - demote fucntionality will be added in subsequent patch + */ + return rc; +} + #define HSTATE_ATTR_RO(_name) \ static struct kobj_attribute _name##_attr = __ATTR_RO(_name) +#define HSTATE_ATTR_WO(_name) \ + static struct kobj_attribute _name##_attr = __ATTR_WO(_name) + #define HSTATE_ATTR(_name) \ static struct kobj_attribute 
_name##_attr = \ __ATTR(_name, 0644, _name##_show, _name##_store) @@ -3433,6 +3471,105 @@ static ssize_t surplus_hugepages_show(st } HSTATE_ATTR_RO(surplus_hugepages); +static ssize_t demote_store(struct kobject *kobj, + struct kobj_attribute *attr, const char *buf, size_t len) +{ + unsigned long nr_demote; + unsigned long nr_available; + nodemask_t nodes_allowed, *n_mask; + struct hstate *h; + int err = 0; + int nid; + + err = kstrtoul(buf, 10, &nr_demote); + if (err) + return err; + h = kobj_to_hstate(kobj, &nid); + + if (nid != NUMA_NO_NODE) { + init_nodemask_of_node(&nodes_allowed, nid); + n_mask = &nodes_allowed; + } else { + n_mask = &node_states[N_MEMORY]; + } + + /* Synchronize with other sysfs operations modifying huge pages */ + mutex_lock(&h->resize_lock); + spin_lock_irq(&hugetlb_lock); + + while (nr_demote) { + /* + * Check for available pages to demote each time thorough the + * loop as demote_pool_huge_page will drop hugetlb_lock. + * + * NOTE: demote_pool_huge_page does not yet drop hugetlb_lock + * but will when full demote functionality is added in a later + * patch. 
+ */ + if (nid != NUMA_NO_NODE) + nr_available = h->free_huge_pages_node[nid]; + else + nr_available = h->free_huge_pages; + nr_available -= h->resv_huge_pages; + if (!nr_available) + break; + + err = demote_pool_huge_page(h, n_mask); + if (err) + break; + + nr_demote--; + } + + spin_unlock_irq(&hugetlb_lock); + mutex_unlock(&h->resize_lock); + + if (err) + return err; + return len; +} +HSTATE_ATTR_WO(demote); + +static ssize_t demote_size_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf) +{ + int nid; + struct hstate *h = kobj_to_hstate(kobj, &nid); + unsigned long demote_size = (PAGE_SIZE << h->demote_order) / SZ_1K; + + return sysfs_emit(buf, "%lukB\n", demote_size); +} + +static ssize_t demote_size_store(struct kobject *kobj, + struct kobj_attribute *attr, + const char *buf, size_t count) +{ + struct hstate *h, *demote_hstate; + unsigned long demote_size; + unsigned int demote_order; + int nid; + + demote_size = (unsigned long)memparse(buf, NULL); + + demote_hstate = size_to_hstate(demote_size); + if (!demote_hstate) + return -EINVAL; + demote_order = demote_hstate->order; + + /* demote order must be smaller than hstate order */ + h = kobj_to_hstate(kobj, &nid); + if (demote_order >= h->order) + return -EINVAL; + + /* resize_lock synchronizes access to demote size and writes */ + mutex_lock(&h->resize_lock); + h->demote_order = demote_order; + mutex_unlock(&h->resize_lock); + + return count; +} +HSTATE_ATTR(demote_size); + static struct attribute *hstate_attrs[] = { &nr_hugepages_attr.attr, &nr_overcommit_hugepages_attr.attr, @@ -3449,6 +3586,16 @@ static const struct attribute_group hsta .attrs = hstate_attrs, }; +static struct attribute *hstate_demote_attrs[] = { + &demote_size_attr.attr, + &demote_attr.attr, + NULL, +}; + +static const struct attribute_group hstate_demote_attr_group = { + .attrs = hstate_demote_attrs, +}; + static int hugetlb_sysfs_add_hstate(struct hstate *h, struct kobject *parent, struct kobject **hstate_kobjs, const 
struct attribute_group *hstate_attr_group) @@ -3466,6 +3613,12 @@ static int hugetlb_sysfs_add_hstate(stru hstate_kobjs[hi] = NULL; } + if (h->demote_order) { + if (sysfs_create_group(hstate_kobjs[hi], + &hstate_demote_attr_group)) + pr_warn("HugeTLB unable to create demote interfaces for %s\n", h->name); + } + return retval; } From patchwork Fri Nov 5 20:41:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605621 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1BA13C433FE for ; Fri, 5 Nov 2021 20:41:27 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C81F36127B for ; Fri, 5 Nov 2021 20:41:26 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org C81F36127B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 6F41994007D; Fri, 5 Nov 2021 16:41:26 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 684F394007E; Fri, 5 Nov 2021 16:41:26 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 433DA94007D; Fri, 5 Nov 2021 16:41:26 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0164.hostedemail.com [216.40.44.164]) by kanga.kvack.org (Postfix) with ESMTP id 2B18B94007C for ; Fri, 5 Nov 2021 16:41:26 -0400 (EDT) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id DACBB1856B121 for ; Fri, 5 Nov 2021 20:41:25 +0000 (UTC) X-FDA: 
78776047056.05.09F8F2C Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf30.hostedemail.com (Postfix) with ESMTP id 4505FE00199D for ; Fri, 5 Nov 2021 20:41:08 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 496F56127A; Fri, 5 Nov 2021 20:41:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144884; bh=c+ThdpFnztAIEpIqVaCNN3ZX83OuglcMdNEaYcHWmLQ=; h=Date:From:To:Subject:In-Reply-To:From; b=JIlMKsDNIChuCe3fKB66HA8/6VtlnApTKfR4TBIqJd9ay+8JxlrivUdXgKcehrdU+ ivQdoqBxurUhih2nqFHKrChnrzJ8NztwZtUcbv8AAxwVUwy/xQRKxJUUz7LmIZbQti pQ8JXofuo5CI3nbGpTL+dGRBfx53mvtRFYb6s2x4= Date: Fri, 05 Nov 2021 13:41:23 -0700 From: Andrew Morton To: akpm@linux-foundation.org, aneesh.kumar@linux.ibm.com, david@redhat.com, linux-mm@kvack.org, mhocko@suse.com, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, naoya.horiguchi@linux.dev, nghialm78@gmail.com, osalvador@suse.de, rientjes@google.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, ziy@nvidia.com Subject: [patch 129/262] mm/cma: add cma_pages_valid to determine if pages are in CMA Message-ID: <20211105204123.zxB_XA3PS%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 4505FE00199D X-Stat-Signature: ekui739yrztwsfpa6ucqawze5mytphkm Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=JIlMKsDN; spf=pass (imf30.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144868-348900 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Mike Kravetz Subject: mm/cma: add cma_pages_valid to determine if pages are in CMA Add new interface 
cma_pages_valid() which indicates if the specified pages are part of a CMA region. This interface will be used in a subsequent patch by hugetlb code. In order to keep the same amount of DEBUG information, a pr_debug() call was added to cma_pages_valid(). In the case where the page passed to cma_release is not in cma region, the debug message will be printed from cma_pages_valid as opposed to cma_release. Link: https://lkml.kernel.org/r/20211007181918.136982-3-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz Acked-by: David Hildenbrand Reviewed-by: Oscar Salvador Cc: "Aneesh Kumar K . V" Cc: David Rientjes Cc: Michal Hocko Cc: Muchun Song Cc: Naoya Horiguchi Cc: Nghia Le Cc: Zi Yan Signed-off-by: Andrew Morton --- include/linux/cma.h | 1 + mm/cma.c | 24 ++++++++++++++++++++---- 2 files changed, 21 insertions(+), 4 deletions(-) --- a/include/linux/cma.h~mm-cma-add-cma_pages_valid-to-determine-if-pages-are-in-cma +++ a/include/linux/cma.h @@ -46,6 +46,7 @@ extern int cma_init_reserved_mem(phys_ad struct cma **res_cma); extern struct page *cma_alloc(struct cma *cma, unsigned long count, unsigned int align, bool no_warn); +extern bool cma_pages_valid(struct cma *cma, const struct page *pages, unsigned long count); extern bool cma_release(struct cma *cma, const struct page *pages, unsigned long count); extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data); --- a/mm/cma.c~mm-cma-add-cma_pages_valid-to-determine-if-pages-are-in-cma +++ a/mm/cma.c @@ -524,6 +524,25 @@ out: return page; } +bool cma_pages_valid(struct cma *cma, const struct page *pages, + unsigned long count) +{ + unsigned long pfn; + + if (!cma || !pages) + return false; + + pfn = page_to_pfn(pages); + + if (pfn < cma->base_pfn || pfn >= cma->base_pfn + cma->count) { + pr_debug("%s(page %p, count %lu)\n", __func__, + (void *)pages, count); + return false; + } + + return true; +} + /** * cma_release() - release allocated pages * @cma: Contiguous memory region for which the 
allocation is performed. @@ -539,16 +558,13 @@ bool cma_release(struct cma *cma, const { unsigned long pfn; - if (!cma || !pages) + if (!cma_pages_valid(cma, pages, count)) return false; pr_debug("%s(page %p, count %lu)\n", __func__, (void *)pages, count); pfn = page_to_pfn(pages); - if (pfn < cma->base_pfn || pfn >= cma->base_pfn + cma->count) - return false; - VM_BUG_ON(pfn + count > cma->base_pfn + cma->count); free_contig_range(pfn, count); From patchwork Fri Nov 5 20:41:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605623 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 40CA9C433F5 for ; Fri, 5 Nov 2021 20:41:30 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id EC2AF61279 for ; Fri, 5 Nov 2021 20:41:29 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org EC2AF61279 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 7B6D294007E; Fri, 5 Nov 2021 16:41:29 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 7404F94007C; Fri, 5 Nov 2021 16:41:29 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 62C8394007E; Fri, 5 Nov 2021 16:41:29 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0246.hostedemail.com [216.40.44.246]) by kanga.kvack.org (Postfix) with ESMTP id 4F9C894007C for ; Fri, 5 Nov 2021 16:41:29 -0400 (EDT) Received: from smtpin04.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by 
forelay01.hostedemail.com (Postfix) with ESMTP id 18E611856B710 for ; Fri, 5 Nov 2021 20:41:29 +0000 (UTC) X-FDA: 78776047098.04.3629828 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf31.hostedemail.com (Postfix) with ESMTP id 4CAA5104AAC5 for ; Fri, 5 Nov 2021 20:41:20 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 9755A611C0; Fri, 5 Nov 2021 20:41:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144888; bh=WAdS+HsSroFl3okG/ivzMwmUhi8nGIz/ZoEw5Rg6l0M=; h=Date:From:To:Subject:In-Reply-To:From; b=aTWuImBoSQmoT+utCMggHSJkJUGF2KY/6dFlqjIf9TA/uMpIyzO9QgFg0LONG1rRi LKo1d4SUNNpX8mxrzm921t6CxSziqjeEjwuslPUsucGvg79CYx8gmbzpGVgq0TNhph pJXMW4/PGXx+S9iaF8La7MoGiobFS7PlVV5Lk4q8= Date: Fri, 05 Nov 2021 13:41:27 -0700 From: Andrew Morton To: akpm@linux-foundation.org, aneesh.kumar@linux.ibm.com, david@redhat.com, linux-mm@kvack.org, mhocko@suse.com, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, naoya.horiguchi@linux.dev, nghialm78@gmail.com, osalvador@suse.de, rientjes@google.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, ziy@nvidia.com Subject: [patch 130/262] hugetlb: be sure to free demoted CMA pages to CMA Message-ID: <20211105204127.2cYhr-b8M%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf31.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=aTWuImBo; dmarc=none; spf=pass (imf31.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 4CAA5104AAC5 X-Stat-Signature: xbhrsh3rawr48hoxyumfhoxa1eacfcke X-HE-Tag: 1636144880-211774 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: 
Mike Kravetz
Subject: hugetlb: be sure to free demoted CMA pages to CMA

When huge page demotion is fully implemented, gigantic pages can be demoted to a smaller huge page size. For example, on x86 a 1G page can be demoted to 512 2M pages. However, gigantic pages can potentially be allocated from CMA. If a gigantic page which was allocated from CMA is demoted, the corresponding demoted pages need to be returned to CMA.

Use the new interface cma_pages_valid() to determine if a non-gigantic hugetlb page should be freed to CMA. Also, clear the mapping field of these pages as expected by cma_release.

This also requires a change to CMA region creation for gigantic pages. CMA uses a per-region bitmap to track allocations. When setting up the region, you specify how many pages each bit represents. Currently, only gigantic pages are allocated/freed from CMA, so the region is set up such that one bit represents a gigantic page size allocation. With demote, a gigantic page (allocation) could be split into smaller size pages, and these smaller size pages will be freed to CMA. Since the per-region bitmap needs to be set up to represent the smallest allocation/free size, it now needs to be set to the smallest huge page size which can be freed to CMA.

Unfortunately, we set up the CMA region for huge pages before we set up huge page sizes (hstates). So, technically we do not know the smallest huge page size, as this can change via command line options and architecture specific code. Therefore, at region setup time we use HUGETLB_PAGE_ORDER as the smallest possible huge page size that can be given back to CMA. It is possible that this value is sub-optimal for some architectures/config options. If needed, this can be addressed in follow-on work.

Link: https://lkml.kernel.org/r/20211007181918.136982-4-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz
Cc: "Aneesh Kumar K . 
V" Cc: David Hildenbrand Cc: David Rientjes Cc: Michal Hocko Cc: Muchun Song Cc: Naoya Horiguchi Cc: Nghia Le Cc: Oscar Salvador Cc: Zi Yan Signed-off-by: Andrew Morton --- mm/hugetlb.c | 41 +++++++++++++++++++++++++++++++++++++++-- 1 file changed, 39 insertions(+), 2 deletions(-) --- a/mm/hugetlb.c~hugetlb-be-sure-to-free-demoted-cma-pages-to-cma +++ a/mm/hugetlb.c @@ -50,6 +50,16 @@ struct hstate hstates[HUGE_MAX_HSTATE]; #ifdef CONFIG_CMA static struct cma *hugetlb_cma[MAX_NUMNODES]; +static bool hugetlb_cma_page(struct page *page, unsigned int order) +{ + return cma_pages_valid(hugetlb_cma[page_to_nid(page)], page, + 1 << order); +} +#else +static bool hugetlb_cma_page(struct page *page, unsigned int order) +{ + return false; +} #endif static unsigned long hugetlb_cma_size __initdata; @@ -1272,6 +1282,7 @@ static void destroy_compound_gigantic_pa atomic_set(compound_pincount_ptr(page), 0); for (i = 1; i < nr_pages; i++, p = mem_map_next(p, page, i)) { + p->mapping = NULL; clear_compound_head(p); set_page_refcounted(p); } @@ -1476,7 +1487,13 @@ static void __update_and_free_page(struc 1 << PG_active | 1 << PG_private | 1 << PG_writeback); } - if (hstate_is_gigantic(h)) { + + /* + * Non-gigantic pages demoted from CMA allocated gigantic pages + * need to be given back to CMA in free_gigantic_page. + */ + if (hstate_is_gigantic(h) || + hugetlb_cma_page(page, huge_page_order(h))) { destroy_compound_gigantic_page(page, huge_page_order(h)); free_gigantic_page(page, huge_page_order(h)); } else { @@ -3001,9 +3018,13 @@ static void __init hugetlb_init_hstates( * h->demote_order is initially 0. * - We can not demote gigantic pages if runtime freeing * is not supported, so skip this. + * - If CMA allocation is possible, we can not demote + * HUGETLB_PAGE_ORDER or smaller size pages. 
*/ if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported()) continue; + if (hugetlb_cma_size && h->order <= HUGETLB_PAGE_ORDER) + continue; for_each_hstate(h2) { if (h2 == h) continue; @@ -3555,6 +3576,8 @@ static ssize_t demote_size_store(struct if (!demote_hstate) return -EINVAL; demote_order = demote_hstate->order; + if (demote_order < HUGETLB_PAGE_ORDER) + return -EINVAL; /* demote order must be smaller than hstate order */ h = kobj_to_hstate(kobj, &nid); @@ -6543,6 +6566,7 @@ void __init hugetlb_cma_reserve(int orde if (hugetlb_cma_size < (PAGE_SIZE << order)) { pr_warn("hugetlb_cma: cma area should be at least %lu MiB\n", (PAGE_SIZE << order) / SZ_1M); + hugetlb_cma_size = 0; return; } @@ -6563,7 +6587,13 @@ void __init hugetlb_cma_reserve(int orde size = round_up(size, PAGE_SIZE << order); snprintf(name, sizeof(name), "hugetlb%d", nid); - res = cma_declare_contiguous_nid(0, size, 0, PAGE_SIZE << order, + /* + * Note that 'order per bit' is based on smallest size that + * may be returned to CMA allocator in the case of + * huge page demotion. + */ + res = cma_declare_contiguous_nid(0, size, 0, + PAGE_SIZE << HUGETLB_PAGE_ORDER, 0, false, name, &hugetlb_cma[nid], nid); if (res) { @@ -6579,6 +6609,13 @@ void __init hugetlb_cma_reserve(int orde if (reserved >= hugetlb_cma_size) break; } + + if (!reserved) + /* + * hugetlb_cma_size is used to determine if allocations from + * cma are possible. Set to zero if no cma regions are set up. 
+ */ + hugetlb_cma_size = 0; } void __init hugetlb_cma_check(void) From patchwork Fri Nov 5 20:41:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605625 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 992D7C433FE for ; Fri, 5 Nov 2021 20:41:33 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 52C726127A for ; Fri, 5 Nov 2021 20:41:33 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 52C726127A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id E1DDC94007F; Fri, 5 Nov 2021 16:41:32 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id DCDAA94007C; Fri, 5 Nov 2021 16:41:32 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C96FE94007F; Fri, 5 Nov 2021 16:41:32 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0168.hostedemail.com [216.40.44.168]) by kanga.kvack.org (Postfix) with ESMTP id BA15794007C for ; Fri, 5 Nov 2021 16:41:32 -0400 (EDT) Received: from smtpin36.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 870461856A1F1 for ; Fri, 5 Nov 2021 20:41:32 +0000 (UTC) X-FDA: 78776047224.36.E0DBC72 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf05.hostedemail.com (Postfix) with ESMTP id 922BD508FA51 for ; Fri, 5 Nov 2021 20:41:14 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id E0E626126A; Fri, 5 Nov 2021 
20:41:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144891; bh=i/0RMlXO+8bjYoyojLaGCSIhofnbaW7NQgdiunK9Hwk=; h=Date:From:To:Subject:In-Reply-To:From; b=E67qznZwHL5LaczOlp59Wq6ZyUF5QXUEifSy6V7YmK0qzriljr/vpxa4RfW6nYuk6 Qtx/920KBgG5ZH64Te+y6ZgwXHumaamZofn/VxYrrooIEUcD0y0jSYNcwTsPahEyLs jusa/rVGmvTO79pmwPOv1jtXc6zUlx44q7+Pmneg= Date: Fri, 05 Nov 2021 13:41:30 -0700 From: Andrew Morton To: akpm@linux-foundation.org, aneesh.kumar@linux.ibm.com, david@redhat.com, linux-mm@kvack.org, mhocko@suse.com, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, naoya.horiguchi@linux.dev, nghialm78@gmail.com, osalvador@suse.de, rientjes@google.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, ziy@nvidia.com Subject: [patch 131/262] hugetlb: add demote bool to gigantic page routines Message-ID: <20211105204130.goC-uFnUp%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 922BD508FA51 X-Stat-Signature: iiw77fwpe74ii536sidjbrwrztsghs6k Authentication-Results: imf05.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=E67qznZw; dmarc=none; spf=pass (imf05.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144874-800329 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Mike Kravetz Subject: hugetlb: add demote bool to gigantic page routines The routines remove_hugetlb_page and destroy_compound_gigantic_page will remove a gigantic page and make the set of base pages ready to be returned to a lower level allocator. In the process of doing this, they make all base pages reference counted. 
The routine prep_compound_gigantic_page creates a gigantic page from a set of base pages. It assumes that all these base pages are reference counted.

During demotion, a gigantic page will be split into huge pages of a smaller size. This logically involves calling remove_hugetlb_page and destroy_compound_gigantic_page, followed by prep_compound*_page for each smaller huge page.

When pages are reference counted (ref count >= 0), additional speculative ref counts could be taken as described in previous commits [1] and [2]. This could result in errors while demoting a huge page. Quite a bit of code would need to be created to handle all possible issues.

Instead of dealing with the possibility of speculative ref counts, avoid the possibility by keeping ref counts at zero during the demote process. Add a boolean 'demote' to the routines remove_hugetlb_page, destroy_compound_gigantic_page and prep_compound_gigantic_page. If the boolean is set, the remove and destroy routines will not reference count pages and the prep routine will not expect reference counted pages.

'*_for_demote' wrappers of the routines will be added in a subsequent patch where this functionality is used.

[1] https://lore.kernel.org/linux-mm/20210622021423.154662-3-mike.kravetz@oracle.com/
[2] https://lore.kernel.org/linux-mm/20210809184832.18342-3-mike.kravetz@oracle.com/

Link: https://lkml.kernel.org/r/20211007181918.136982-5-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz
Reviewed-by: Oscar Salvador
Cc: "Aneesh Kumar K .
V" Cc: David Hildenbrand Cc: David Rientjes Cc: Michal Hocko Cc: Muchun Song Cc: Naoya Horiguchi Cc: Nghia Le Cc: Zi Yan Signed-off-by: Andrew Morton --- mm/hugetlb.c | 54 +++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 43 insertions(+), 11 deletions(-) --- a/mm/hugetlb.c~hugetlb-add-demote-bool-to-gigantic-page-routines +++ a/mm/hugetlb.c @@ -1271,8 +1271,8 @@ static int hstate_next_node_to_free(stru nr_nodes--) #ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE -static void destroy_compound_gigantic_page(struct page *page, - unsigned int order) +static void __destroy_compound_gigantic_page(struct page *page, + unsigned int order, bool demote) { int i; int nr_pages = 1 << order; @@ -1284,7 +1284,8 @@ static void destroy_compound_gigantic_pa for (i = 1; i < nr_pages; i++, p = mem_map_next(p, page, i)) { p->mapping = NULL; clear_compound_head(p); - set_page_refcounted(p); + if (!demote) + set_page_refcounted(p); } set_compound_order(page, 0); @@ -1292,6 +1293,12 @@ static void destroy_compound_gigantic_pa __ClearPageHead(page); } +static void destroy_compound_gigantic_page(struct page *page, + unsigned int order) +{ + __destroy_compound_gigantic_page(page, order, false); +} + static void free_gigantic_page(struct page *page, unsigned int order) { /* @@ -1364,12 +1371,15 @@ static inline void destroy_compound_giga /* * Remove hugetlb page from lists, and update dtor so that page appears - * as just a compound page. A reference is held on the page. + * as just a compound page. + * + * A reference is held on the page, except in the case of demote. * * Must be called with hugetlb lock held. 
*/ -static void remove_hugetlb_page(struct hstate *h, struct page *page, - bool adjust_surplus) +static void __remove_hugetlb_page(struct hstate *h, struct page *page, + bool adjust_surplus, + bool demote) { int nid = page_to_nid(page); @@ -1407,8 +1417,12 @@ static void remove_hugetlb_page(struct h * * This handles the case where more than one ref is held when and * after update_and_free_page is called. + * + * In the case of demote we do not ref count the page as it will soon + * be turned into a page of smaller size. */ - set_page_refcounted(page); + if (!demote) + set_page_refcounted(page); if (hstate_is_gigantic(h)) set_compound_page_dtor(page, NULL_COMPOUND_DTOR); else @@ -1418,6 +1432,12 @@ static void remove_hugetlb_page(struct h h->nr_huge_pages_node[nid]--; } +static void remove_hugetlb_page(struct hstate *h, struct page *page, + bool adjust_surplus) +{ + __remove_hugetlb_page(h, page, adjust_surplus, false); +} + static void add_hugetlb_page(struct hstate *h, struct page *page, bool adjust_surplus) { @@ -1681,7 +1701,8 @@ static void prep_new_huge_page(struct hs spin_unlock_irq(&hugetlb_lock); } -static bool prep_compound_gigantic_page(struct page *page, unsigned int order) +static bool __prep_compound_gigantic_page(struct page *page, unsigned int order, + bool demote) { int i, j; int nr_pages = 1 << order; @@ -1719,10 +1740,16 @@ static bool prep_compound_gigantic_page( * the set of pages can not be converted to a gigantic page. * The caller who allocated the pages should then discard the * pages using the appropriate free interface. + * + * In the case of demote, the ref count will be zero. 
*/ - if (!page_ref_freeze(p, 1)) { - pr_warn("HugeTLB page can not be used due to unexpected inflated ref count\n"); - goto out_error; + if (!demote) { + if (!page_ref_freeze(p, 1)) { + pr_warn("HugeTLB page can not be used due to unexpected inflated ref count\n"); + goto out_error; + } + } else { + VM_BUG_ON_PAGE(page_count(p), p); } set_page_count(p, 0); set_compound_head(p, page); @@ -1747,6 +1774,11 @@ out_error: return false; } +static bool prep_compound_gigantic_page(struct page *page, unsigned int order) +{ + return __prep_compound_gigantic_page(page, order, false); +} + /* * PageHuge() only returns true for hugetlbfs pages, but not for normal or * transparent huge pages. See the PageTransHuge() documentation for more From patchwork Fri Nov 5 20:41:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605627 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F098BC4332F for ; Fri, 5 Nov 2021 20:41:38 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A6C506126A for ; Fri, 5 Nov 2021 20:41:38 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org A6C506126A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 48126940080; Fri, 5 Nov 2021 16:41:38 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 430B594007C; Fri, 5 Nov 2021 16:41:38 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 31F38940080; Fri, 5 Nov 2021 16:41:38 -0400 (EDT) X-Delivered-To: 
linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0084.hostedemail.com [216.40.44.84]) by kanga.kvack.org (Postfix) with ESMTP id 1C9A894007C for ; Fri, 5 Nov 2021 16:41:38 -0400 (EDT) Received: from smtpin19.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 7B7AC1856B713 for ; Fri, 5 Nov 2021 20:41:37 +0000 (UTC) X-FDA: 78776047350.19.EF52400 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf23.hostedemail.com (Postfix) with ESMTP id A2F7B900009B for ; Fri, 5 Nov 2021 20:41:22 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 366666127A; Fri, 5 Nov 2021 20:41:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144894; bh=A+jdg3x08GCxHqS871r9qo1Z5AFVwFsEiGtCKL0DdqU=; h=Date:From:To:Subject:In-Reply-To:From; b=og/aO/Wqtv00kndSua3bc786AAoEO6sHGbqB+mt6Ya7ADHT08GSb2ebmin1a+6t4i JcMOQlqsG9l7fJBgvZuh2mqJhCOKOKM9lzSNbSI/fAt0yS26gZU2YdfHsQHYeeS9vF 33ZB2enjlUkjXBLsIRgeVOCiLWyFLS6iMGa3KHDw= Date: Fri, 05 Nov 2021 13:41:33 -0700 From: Andrew Morton To: akpm@linux-foundation.org, aneesh.kumar@linux.ibm.com, david@redhat.com, linux-mm@kvack.org, mhocko@suse.com, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, naoya.horiguchi@linux.dev, nghialm78@gmail.com, osalvador@suse.de, rientjes@google.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, ziy@nvidia.com Subject: [patch 132/262] hugetlb: add hugetlb demote page support Message-ID: <20211105204133.Or52A3rtJ%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: A2F7B900009B X-Stat-Signature: cehycgh1q8kadbowkknf31jq8h6aezbk Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="og/aO/Wq"; spf=pass (imf23.hostedemail.com: domain of akpm@linux-foundation.org designates 
198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144882-473231 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Mike Kravetz Subject: hugetlb: add hugetlb demote page support Demote page functionality will split a huge page into a number of huge pages of a smaller size. For example, on x86 a 1GB huge page can be demoted into 512 2M huge pages. Demotion is done 'in place' by simply splitting the huge page. Added '*_for_demote' wrappers for remove_hugetlb_page, destroy_compound_hugetlb_page and prep_compound_gigantic_page for use by demote code. [mike.kravetz@oracle.com: v4] Link: https://lkml.kernel.org/r/6ca29b8e-527c-d6ec-900e-e6a43e4f8b73@oracle.com Link: https://lkml.kernel.org/r/20211007181918.136982-6-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz Reviewed-by: Oscar Salvador Cc: "Aneesh Kumar K . V" Cc: David Hildenbrand Cc: David Rientjes Cc: Michal Hocko Cc: Muchun Song Cc: Naoya Horiguchi Cc: Nghia Le Cc: Zi Yan Signed-off-by: Andrew Morton --- mm/hugetlb.c | 100 +++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 92 insertions(+), 8 deletions(-) --- a/mm/hugetlb.c~hugetlb-add-hugetlb-demote-page-support +++ a/mm/hugetlb.c @@ -1270,7 +1270,7 @@ static int hstate_next_node_to_free(stru ((node = hstate_next_node_to_free(hs, mask)) || 1); \ nr_nodes--) -#ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE +/* used to demote non-gigantic_huge pages as well */ static void __destroy_compound_gigantic_page(struct page *page, unsigned int order, bool demote) { @@ -1293,6 +1293,13 @@ static void __destroy_compound_gigantic_ __ClearPageHead(page); } +static void destroy_compound_hugetlb_page_for_demote(struct page *page, + unsigned int order) +{ + __destroy_compound_gigantic_page(page, order, true); +} + +#ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE static void destroy_compound_gigantic_page(struct page 
*page, unsigned int order) { @@ -1438,6 +1445,12 @@ static void remove_hugetlb_page(struct h __remove_hugetlb_page(h, page, adjust_surplus, false); } +static void remove_hugetlb_page_for_demote(struct hstate *h, struct page *page, + bool adjust_surplus) +{ + __remove_hugetlb_page(h, page, adjust_surplus, true); +} + static void add_hugetlb_page(struct hstate *h, struct page *page, bool adjust_surplus) { @@ -1779,6 +1792,12 @@ static bool prep_compound_gigantic_page( return __prep_compound_gigantic_page(page, order, false); } +static bool prep_compound_gigantic_page_for_demote(struct page *page, + unsigned int order) +{ + return __prep_compound_gigantic_page(page, order, true); +} + /* * PageHuge() only returns true for hugetlbfs pages, but not for normal or * transparent huge pages. See the PageTransHuge() documentation for more @@ -3304,9 +3323,72 @@ out: return 0; } +static int demote_free_huge_page(struct hstate *h, struct page *page) +{ + int i, nid = page_to_nid(page); + struct hstate *target_hstate; + int rc = 0; + + target_hstate = size_to_hstate(PAGE_SIZE << h->demote_order); + + remove_hugetlb_page_for_demote(h, page, false); + spin_unlock_irq(&hugetlb_lock); + + rc = alloc_huge_page_vmemmap(h, page); + if (rc) { + /* Allocation of vmemmmap failed, we can not demote page */ + spin_lock_irq(&hugetlb_lock); + set_page_refcounted(page); + add_hugetlb_page(h, page, false); + return rc; + } + + /* + * Use destroy_compound_hugetlb_page_for_demote for all huge page + * sizes as it will not ref count pages. + */ + destroy_compound_hugetlb_page_for_demote(page, huge_page_order(h)); + + /* + * Taking target hstate mutex synchronizes with set_max_huge_pages. + * Without the mutex, pages added to target hstate could be marked + * as surplus. + * + * Note that we already hold h->resize_lock. To prevent deadlock, + * use the convention of always taking larger size hstate mutex first. 
+ */ + mutex_lock(&target_hstate->resize_lock); + for (i = 0; i < pages_per_huge_page(h); + i += pages_per_huge_page(target_hstate)) { + if (hstate_is_gigantic(target_hstate)) + prep_compound_gigantic_page_for_demote(page + i, + target_hstate->order); + else + prep_compound_page(page + i, target_hstate->order); + set_page_private(page + i, 0); + set_page_refcounted(page + i); + prep_new_huge_page(target_hstate, page + i, nid); + put_page(page + i); + } + mutex_unlock(&target_hstate->resize_lock); + + spin_lock_irq(&hugetlb_lock); + + /* + * Not absolutely necessary, but for consistency update max_huge_pages + * based on pool changes for the demoted page. + */ + h->max_huge_pages--; + target_hstate->max_huge_pages += pages_per_huge_page(h); + + return rc; +} + static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed) __must_hold(&hugetlb_lock) { + int nr_nodes, node; + struct page *page; int rc = 0; lockdep_assert_held(&hugetlb_lock); @@ -3317,9 +3399,15 @@ static int demote_pool_huge_page(struct return -EINVAL; /* internal error */ } - /* - * TODO - demote fucntionality will be added in subsequent patch - */ + for_each_node_mask_to_free(h, nr_nodes, node, nodes_allowed) { + if (!list_empty(&h->hugepage_freelists[node])) { + page = list_entry(h->hugepage_freelists[node].next, + struct page, lru); + rc = demote_free_huge_page(h, page); + break; + } + } + return rc; } @@ -3554,10 +3642,6 @@ static ssize_t demote_store(struct kobje /* * Check for available pages to demote each time thorough the * loop as demote_pool_huge_page will drop hugetlb_lock. - * - * NOTE: demote_pool_huge_page does not yet drop hugetlb_lock - * but will when full demote functionality is added in a later - * patch. 
*/ if (nid != NUMA_NO_NODE) nr_available = h->free_huge_pages_node[nid]; From patchwork Fri Nov 5 20:41:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605629 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AA0CCC433EF for ; Fri, 5 Nov 2021 20:41:40 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 636CF611C0 for ; Fri, 5 Nov 2021 20:41:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 636CF611C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 5465B940081; Fri, 5 Nov 2021 16:41:39 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 4F4E394007C; Fri, 5 Nov 2021 16:41:39 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1E80A940081; Fri, 5 Nov 2021 16:41:39 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0131.hostedemail.com [216.40.44.131]) by kanga.kvack.org (Postfix) with ESMTP id 049C994007C for ; Fri, 5 Nov 2021 16:41:39 -0400 (EDT) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id BCA1D82499A8 for ; Fri, 5 Nov 2021 20:41:38 +0000 (UTC) X-FDA: 78776047476.06.80B946E Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf20.hostedemail.com (Postfix) with ESMTP id 172EAD0000A7 for ; Fri, 5 Nov 2021 20:41:28 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 608FC61262; Fri, 5 Nov 
2021 20:41:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144897; bh=+EIMaYi6ms3T0TFdzwjLTv7AeIMRwHSwaaCVCc5YJac=; h=Date:From:To:Subject:In-Reply-To:From; b=GKXpojr8O1lasW3573w4Vjd2rClXubWsH6JzF+gBiTPYU6ldEdvXOCU5CHeBbpZdj MslLTb7Uo7l52Jsqu9HSWSYiOoWaJ1LVI4Myc3UqJv1+hgNbT5TgIT0K4WbpI/YGzl mmDBTZeu5ryPVXpP50M2RCHxLjOimtFe8cmEW12o= Date: Fri, 05 Nov 2021 13:41:36 -0700 From: Andrew Morton To: akpm@linux-foundation.org, liangcaifan19@gmail.com, linux-mm@kvack.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, zhang.lyra@gmail.com Subject: [patch 133/262] mm: khugepaged: recalculate min_free_kbytes after stopping khugepaged Message-ID: <20211105204136.NtGWXjIhw%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf20.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=GKXpojr8; dmarc=none; spf=pass (imf20.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 172EAD0000A7 X-Stat-Signature: n4gc9gj41qryw9bjwhi3stmnkad8bx8a X-HE-Tag: 1636144888-486853 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Liangcai Fan Subject: mm: khugepaged: recalculate min_free_kbytes after stopping khugepaged When transparent huge pages are initialized, min_free_kbytes is calculated according to what khugepaged expects. So when transparent huge pages are disabled, min_free_kbytes should be recalculated instead of keeping the higher value set by khugepaged.
Link: https://lkml.kernel.org/r/1633937809-16558-1-git-send-email-liangcaifan19@gmail.com Signed-off-by: Liangcai Fan Signed-off-by: Chunyan Zhang Cc: Mike Kravetz Signed-off-by: Andrew Morton --- include/linux/mm.h | 1 + mm/khugepaged.c | 10 ++++++++-- mm/page_alloc.c | 7 ++++++- 3 files changed, 15 insertions(+), 3 deletions(-) --- a/include/linux/mm.h~mm-khugepaged-recalculate-min_free_kbytes-after-stopping-khugepaged +++ a/include/linux/mm.h @@ -2453,6 +2453,7 @@ extern void memmap_init_range(unsigned l unsigned long, unsigned long, enum meminit_context, struct vmem_altmap *, int migratetype); extern void setup_per_zone_wmarks(void); +extern void calculate_min_free_kbytes(void); extern int __meminit init_per_zone_wmark_min(void); extern void mem_init(void); extern void __init mmap_init(void); --- a/mm/khugepaged.c~mm-khugepaged-recalculate-min_free_kbytes-after-stopping-khugepaged +++ a/mm/khugepaged.c @@ -2299,6 +2299,11 @@ static void set_recommended_min_free_kby int nr_zones = 0; unsigned long recommended_min; + if (!khugepaged_enabled()) { + calculate_min_free_kbytes(); + goto update_wmarks; + } + for_each_populated_zone(zone) { /* * We don't need to worry about fragmentation of @@ -2334,6 +2339,8 @@ static void set_recommended_min_free_kby min_free_kbytes = recommended_min; } + +update_wmarks: setup_per_zone_wmarks(); } @@ -2355,12 +2362,11 @@ int start_stop_khugepaged(void) if (!list_empty(&khugepaged_scan.mm_head)) wake_up_interruptible(&khugepaged_wait); - - set_recommended_min_free_kbytes(); } else if (khugepaged_thread) { kthread_stop(khugepaged_thread); khugepaged_thread = NULL; } + set_recommended_min_free_kbytes(); fail: mutex_unlock(&khugepaged_mutex); return err; --- a/mm/page_alloc.c~mm-khugepaged-recalculate-min_free_kbytes-after-stopping-khugepaged +++ a/mm/page_alloc.c @@ -8469,7 +8469,7 @@ void setup_per_zone_wmarks(void) * 8192MB: 11584k * 16384MB: 16384k */ -int __meminit init_per_zone_wmark_min(void) +void calculate_min_free_kbytes(void) 
{ unsigned long lowmem_kbytes; int new_min_free_kbytes; @@ -8483,6 +8483,11 @@ int __meminit init_per_zone_wmark_min(vo pr_warn("min_free_kbytes is not updated to %d because user defined value %d is preferred\n", new_min_free_kbytes, user_min_free_kbytes); +} + +int __meminit init_per_zone_wmark_min(void) +{ + calculate_min_free_kbytes(); setup_per_zone_wmarks(); refresh_zone_stat_thresholds(); setup_per_zone_lowmem_reserve(); From patchwork Fri Nov 5 20:41:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605631 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 356EAC433EF for ; Fri, 5 Nov 2021 20:41:43 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id DBF30611C0 for ; Fri, 5 Nov 2021 20:41:42 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org DBF30611C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 5BBC8940082; Fri, 5 Nov 2021 16:41:42 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 5432294007C; Fri, 5 Nov 2021 16:41:42 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 36DC1940082; Fri, 5 Nov 2021 16:41:42 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0098.hostedemail.com [216.40.44.98]) by kanga.kvack.org (Postfix) with ESMTP id 28B3D94007C for ; Fri, 5 Nov 2021 16:41:42 -0400 (EDT) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com 
(Postfix) with ESMTP id DC3FB7145C for ; Fri, 5 Nov 2021 20:41:41 +0000 (UTC) X-FDA: 78776047602.07.0975FC6 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf01.hostedemail.com (Postfix) with ESMTP id B1A9F508FA46 for ; Fri, 5 Nov 2021 20:41:29 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 6A32061262; Fri, 5 Nov 2021 20:41:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144900; bh=L/JJ4cHzBqKp18XNOpEweITRlYqPkFB15G3DqOrEHqM=; h=Date:From:To:Subject:In-Reply-To:From; b=TVeUXomFzWJSto9ot1zb5N+mvnYcgSjTm5tmsZVo0It8dZ4JpV8RghebJpWuzpamP pRnRFVu+vB3E6nTYxejd8igIRGkbUPxnmiyDcJjquT6imbG6/P+ZDtOi7cqS9m++oo JWlosLz+yyPZwWfqyx+ieg5bJmJBcpNSJg/AtcwY= Date: Fri, 05 Nov 2021 13:41:40 -0700 From: Andrew Morton To: akpm@linux-foundation.org, almasrymina@google.com, ckennelly@google.com, kenchen@google.com, kirill@shutemov.name, linux-mm@kvack.org, mhocko@suse.com, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 134/262] mm, hugepages: add mremap() support for hugepage backed vma Message-ID: <20211105204140.j9t0aBIw9%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=TVeUXomF; dmarc=none; spf=pass (imf01.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: B1A9F508FA46 X-Stat-Signature: zqasdopzn4afthw6eg4113n6jkgtc87h X-HE-Tag: 1636144889-561123 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Mina Almasry Subject: mm, hugepages: add mremap() support for hugepage backed vma Support 
mremap() for hugepage backed vma segment by simply repositioning page table entries. The page table entries are repositioned to the new virtual address on mremap(). Hugetlb mremap() support is of course generic; my motivating use case is a library (hugepage_text), which reloads the ELF text of executables in hugepages. This significantly increases the execution performance of said executables. Restricts the mremap operation on hugepages to up to the size of the original mapping as the underlying hugetlb reservation is not yet capable of handling remapping to a larger size. During the mremap() operation we detect pmd_share'd mappings and we unshare those during the mremap(). On access and fault the sharing is established again. Link: https://lkml.kernel.org/r/20211013195825.3058275-1-almasrymina@google.com Signed-off-by: Mina Almasry Reviewed-by: Mike Kravetz Cc: Ken Chen Cc: Chris Kennelly Cc: Michal Hocko Cc: Vlastimil Babka Cc: Kirill Shutemov Signed-off-by: Andrew Morton --- include/linux/hugetlb.h | 19 ++++++ mm/hugetlb.c | 111 +++++++++++++++++++++++++++++++++++--- mm/mremap.c | 36 +++++++++++- 3 files changed, 157 insertions(+), 9 deletions(-) --- a/include/linux/hugetlb.h~mm-hugepages-add-mremap-support-for-hugepage-backed-vma +++ a/include/linux/hugetlb.h @@ -124,6 +124,7 @@ struct hugepage_subpool *hugepage_new_su void hugepage_put_subpool(struct hugepage_subpool *spool); void reset_vma_resv_huge_pages(struct vm_area_struct *vma); +void clear_vma_resv_huge_pages(struct vm_area_struct *vma); int hugetlb_sysctl_handler(struct ctl_table *, int, void *, size_t *, loff_t *); int hugetlb_overcommit_handler(struct ctl_table *, int, void *, size_t *, loff_t *); @@ -132,6 +133,10 @@ int hugetlb_treat_movable_handler(struct int hugetlb_mempolicy_sysctl_handler(struct ctl_table *, int, void *, size_t *, loff_t *); +int move_hugetlb_page_tables(struct vm_area_struct *vma, + struct vm_area_struct *new_vma, + unsigned long old_addr, unsigned long new_addr, + unsigned 
long len); int copy_hugetlb_page_range(struct mm_struct *, struct mm_struct *, struct vm_area_struct *); long follow_hugetlb_page(struct mm_struct *, struct vm_area_struct *, struct page **, struct vm_area_struct **, @@ -215,6 +220,10 @@ static inline void reset_vma_resv_huge_p { } +static inline void clear_vma_resv_huge_pages(struct vm_area_struct *vma) +{ +} + static inline unsigned long hugetlb_total_pages(void) { return 0; @@ -260,6 +269,16 @@ static inline int copy_hugetlb_page_rang { BUG(); return 0; +} + +static inline int move_hugetlb_page_tables(struct vm_area_struct *vma, + struct vm_area_struct *new_vma, + unsigned long old_addr, + unsigned long new_addr, + unsigned long len) +{ + BUG(); + return 0; } static inline void hugetlb_report_meminfo(struct seq_file *m) --- a/mm/hugetlb.c~mm-hugepages-add-mremap-support-for-hugepage-backed-vma +++ a/mm/hugetlb.c @@ -1014,6 +1014,35 @@ void reset_vma_resv_huge_pages(struct vm vma->vm_private_data = (void *)0; } +/* + * Reset and decrement one ref on hugepage private reservation. + * Called with mm->mmap_sem writer semaphore held. + * This function should be only used by move_vma() and operate on + * same sized vma. It should never come here with last ref on the + * reservation. + */ +void clear_vma_resv_huge_pages(struct vm_area_struct *vma) +{ + /* + * Clear the old hugetlb private page reservation. + * It has already been transferred to new_vma. + * + * During a mremap() operation of a hugetlb vma we call move_vma() + * which copies vma into new_vma and unmaps vma. After the copy + * operation both new_vma and vma share a reference to the resv_map + * struct, and at that point vma is about to be unmapped. We don't + * want to return the reservation to the pool at unmap of vma because + * the reservation still lives on in new_vma, so simply decrement the + * ref here and remove the resv_map reference from this vma. 
+ */ + struct resv_map *reservations = vma_resv_map(vma); + + if (reservations && is_vma_resv_set(vma, HPAGE_RESV_OWNER)) + kref_put(&reservations->refs, resv_map_release); + + reset_vma_resv_huge_pages(vma); +} + /* Returns true if the VMA has associated reserve pages */ static bool vma_has_reserves(struct vm_area_struct *vma, long chg) { @@ -4718,6 +4747,82 @@ again: return ret; } +static void move_huge_pte(struct vm_area_struct *vma, unsigned long old_addr, + unsigned long new_addr, pte_t *src_pte) +{ + struct hstate *h = hstate_vma(vma); + struct mm_struct *mm = vma->vm_mm; + pte_t *dst_pte, pte; + spinlock_t *src_ptl, *dst_ptl; + + dst_pte = huge_pte_offset(mm, new_addr, huge_page_size(h)); + dst_ptl = huge_pte_lock(h, mm, dst_pte); + src_ptl = huge_pte_lockptr(h, mm, src_pte); + + /* + * We don't have to worry about the ordering of src and dst ptlocks + * because exclusive mmap_sem (or the i_mmap_lock) prevents deadlock. + */ + if (src_ptl != dst_ptl) + spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING); + + pte = huge_ptep_get_and_clear(mm, old_addr, src_pte); + set_huge_pte_at(mm, new_addr, dst_pte, pte); + + if (src_ptl != dst_ptl) + spin_unlock(src_ptl); + spin_unlock(dst_ptl); +} + +int move_hugetlb_page_tables(struct vm_area_struct *vma, + struct vm_area_struct *new_vma, + unsigned long old_addr, unsigned long new_addr, + unsigned long len) +{ + struct hstate *h = hstate_vma(vma); + struct address_space *mapping = vma->vm_file->f_mapping; + unsigned long sz = huge_page_size(h); + struct mm_struct *mm = vma->vm_mm; + unsigned long old_end = old_addr + len; + unsigned long old_addr_copy; + pte_t *src_pte, *dst_pte; + struct mmu_notifier_range range; + + mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, mm, old_addr, + old_end); + adjust_range_if_pmd_sharing_possible(vma, &range.start, &range.end); + mmu_notifier_invalidate_range_start(&range); + /* Prevent race with file truncation */ + i_mmap_lock_write(mapping); + for (; old_addr < old_end; 
old_addr += sz, new_addr += sz) { + src_pte = huge_pte_offset(mm, old_addr, sz); + if (!src_pte) + continue; + if (huge_pte_none(huge_ptep_get(src_pte))) + continue; + + /* old_addr arg to huge_pmd_unshare() is a pointer and so the + * arg may be modified. Pass a copy instead to preserve the + * value in old_addr. + */ + old_addr_copy = old_addr; + + if (huge_pmd_unshare(mm, vma, &old_addr_copy, src_pte)) + continue; + + dst_pte = huge_pte_alloc(mm, new_vma, new_addr, sz); + if (!dst_pte) + break; + + move_huge_pte(vma, old_addr, new_addr, src_pte); + } + i_mmap_unlock_write(mapping); + flush_tlb_range(vma, old_end - len, old_end); + mmu_notifier_invalidate_range_end(&range); + + return len + old_addr - old_end; +} + static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long start, unsigned long end, struct page *ref_page) @@ -6257,12 +6362,6 @@ void adjust_range_if_pmd_sharing_possibl * sharing is possible. For hugetlbfs, this prevents removal of any page * table entries associated with the address space. This is important as we * are setting up sharing based on existing page table entries (mappings). - * - * NOTE: This routine is only called from huge_pte_alloc. Some callers of - * huge_pte_alloc know that sharing is not possible and do not take - * i_mmap_rwsem as a performance optimization. This is handled by the - * if !vma_shareable check at the beginning of the routine. i_mmap_rwsem is - * only required for subsequent processing. 
*/ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, pud_t *pud) --- a/mm/mremap.c~mm-hugepages-add-mremap-support-for-hugepage-backed-vma +++ a/mm/mremap.c @@ -489,6 +489,10 @@ unsigned long move_page_tables(struct vm old_end = old_addr + len; flush_cache_range(vma, old_addr, old_end); + if (is_vm_hugetlb_page(vma)) + return move_hugetlb_page_tables(vma, new_vma, old_addr, + new_addr, len); + mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm, old_addr, old_end); mmu_notifier_invalidate_range_start(&range); @@ -646,6 +650,10 @@ static unsigned long move_vma(struct vm_ mremap_userfaultfd_prep(new_vma, uf); } + if (is_vm_hugetlb_page(vma)) { + clear_vma_resv_huge_pages(vma); + } + /* Conceal VM_ACCOUNT so old reservation is not undone */ if (vm_flags & VM_ACCOUNT && !(flags & MREMAP_DONTUNMAP)) { vma->vm_flags &= ~VM_ACCOUNT; @@ -739,9 +747,6 @@ static struct vm_area_struct *vma_to_res (vma->vm_flags & (VM_DONTEXPAND | VM_PFNMAP))) return ERR_PTR(-EINVAL); - if (is_vm_hugetlb_page(vma)) - return ERR_PTR(-EINVAL); - /* We can't remap across vm area boundaries */ if (old_len > vma->vm_end - addr) return ERR_PTR(-EFAULT); @@ -937,6 +942,31 @@ SYSCALL_DEFINE5(mremap, unsigned long, a if (mmap_write_lock_killable(current->mm)) return -EINTR; + vma = find_vma(mm, addr); + if (!vma || vma->vm_start > addr) { + ret = -EFAULT; + goto out; + } + + if (is_vm_hugetlb_page(vma)) { + struct hstate *h __maybe_unused = hstate_vma(vma); + + old_len = ALIGN(old_len, huge_page_size(h)); + new_len = ALIGN(new_len, huge_page_size(h)); + + /* addrs must be huge page aligned */ + if (addr & ~huge_page_mask(h)) + goto out; + if (new_addr & ~huge_page_mask(h)) + goto out; + + /* + * Don't allow remap expansion, because the underlying hugetlb + * reservation is not yet capable of handling split reservation.
+ */ + if (new_len > old_len) + goto out; + } if (flags & (MREMAP_FIXED | MREMAP_DONTUNMAP)) { ret = mremap_to(addr, old_len, new_addr, new_len, From patchwork Fri Nov 5 20:41:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605645 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 87AE4C433EF for ; Fri, 5 Nov 2021 20:42:05 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 40B7560E09 for ; Fri, 5 Nov 2021 20:42:05 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 40B7560E09 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 14CD5940089; Fri, 5 Nov 2021 16:42:04 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 0B20594007C; Fri, 5 Nov 2021 16:42:04 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DCCFE940089; Fri, 5 Nov 2021 16:42:03 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0110.hostedemail.com [216.40.44.110]) by kanga.kvack.org (Postfix) with ESMTP id C4EE294007C for ; Fri, 5 Nov 2021 16:42:03 -0400 (EDT) Received: from smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 8A4927145C for ; Fri, 5 Nov 2021 20:42:03 +0000 (UTC) X-FDA: 78776048526.16.EEC6CDC Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf27.hostedemail.com (Postfix) with ESMTP id 1019D70000B6 for ; Fri, 5 Nov 2021 20:42:02 +0000 (UTC) Received: 
by mail.kernel.org (Postfix) with ESMTPSA id 88E24611C0; Fri, 5 Nov 2021 20:41:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144903; bh=IjKaODHm5jlR7vHGhb6Gr56ynBNOZY2in062Nku8NEs=; h=Date:From:To:Subject:In-Reply-To:From; b=PBgVqq3C4iB1gIwYa0qIF91DXjHfYZsvnBkPuSAEr1TF/EQKyCEsT74eiYXs98Na3 /KxDMkQ2J2kI2tbORgoEztEpd2uCR3DqxZj+80VPM7J5Ff0M3Rd/ZrTx24WUq7gq+9 va8493evBfH505uebhgkC8kaIi08arzkPh4lqS4c= Date: Fri, 05 Nov 2021 13:41:43 -0700 From: Andrew Morton To: akpm@linux-foundation.org, almasrymina@google.com, ckennelly@google.com, kenchen@google.com, kirill@shutemov.name, linux-mm@kvack.org, mhocko@suse.com, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, vbabka@suse.cz, wanjiabing@vivo.com Subject: [patch 135/262] mm, hugepages: add hugetlb vma mremap() test Message-ID: <20211105204143.Ouc6aj2g2%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 1019D70000B6 X-Stat-Signature: 779y71ajt7s41gepwrck5wgsift34y11 Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=PBgVqq3C; spf=pass (imf27.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144922-739482 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Mina Almasry Subject: mm, hugepages: add hugetlb vma mremap() test [almasrymina@google.com: v8] Link: https://lkml.kernel.org/r/20211014200542.4126947-2-almasrymina@google.com [wanjiabing@vivo.com: remove duplicated include in hugepage-mremap] Link: https://lkml.kernel.org/r/20211021122944.8857-1-wanjiabing@vivo.com Link: 
https://lkml.kernel.org/r/20211013195825.3058275-2-almasrymina@google.com Signed-off-by: Mina Almasry Signed-off-by: Wan Jiabing Acked-by: Mike Kravetz Cc: Ken Chen Cc: Chris Kennelly Cc: Michal Hocko Cc: Vlastimil Babka Cc: Kirill Shutemov Signed-off-by: Andrew Morton --- tools/testing/selftests/vm/.gitignore | 1 tools/testing/selftests/vm/Makefile | 1 tools/testing/selftests/vm/hugepage-mremap.c | 160 +++++++++++++++++ tools/testing/selftests/vm/run_vmtests.sh | 11 + 4 files changed, 173 insertions(+) --- a/tools/testing/selftests/vm/.gitignore~mm-hugepages-add-hugetlb-vma-mremap-test +++ a/tools/testing/selftests/vm/.gitignore @@ -1,5 +1,6 @@ # SPDX-License-Identifier: GPL-2.0-only hugepage-mmap +hugepage-mremap hugepage-shm khugepaged map_hugetlb --- /dev/null +++ a/tools/testing/selftests/vm/hugepage-mremap.c @@ -0,0 +1,160 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * hugepage-mremap: + * + * Example of remapping huge page memory in a user application using the + * mremap system call. Code assumes a hugetlbfs filesystem is mounted + * at '/huge'. The code will use 1GB worth of huge pages.
+ */ + +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include /* Definition of O_* constants */ +#include /* Definition of SYS_* constants */ +#include +#include +#include + +#define LENGTH (1UL * 1024 * 1024 * 1024) + +#define PROTECTION (PROT_READ | PROT_WRITE | PROT_EXEC) +#define FLAGS (MAP_SHARED | MAP_ANONYMOUS) + +static void check_bytes(char *addr) +{ + printf("First hex is %x\n", *((unsigned int *)addr)); +} + +static void write_bytes(char *addr) +{ + unsigned long i; + + for (i = 0; i < LENGTH; i++) + *(addr + i) = (char)i; +} + +static int read_bytes(char *addr) +{ + unsigned long i; + + check_bytes(addr); + for (i = 0; i < LENGTH; i++) + if (*(addr + i) != (char)i) { + printf("Mismatch at %lu\n", i); + return 1; + } + return 0; +} + +static void register_region_with_uffd(char *addr, size_t len) +{ + long uffd; /* userfaultfd file descriptor */ + struct uffdio_api uffdio_api; + struct uffdio_register uffdio_register; + + /* Create and enable userfaultfd object. */ + + uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK); + if (uffd == -1) { + perror("userfaultfd"); + exit(1); + } + + uffdio_api.api = UFFD_API; + uffdio_api.features = 0; + if (ioctl(uffd, UFFDIO_API, &uffdio_api) == -1) { + perror("ioctl-UFFDIO_API"); + exit(1); + } + + /* Create a private anonymous mapping. The memory will be + * demand-zero paged--that is, not yet allocated. When we + * actually touch the memory, it will be allocated via + * the userfaultfd. + */ + + addr = mmap(NULL, len, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + if (addr == MAP_FAILED) { + perror("mmap"); + exit(1); + } + + printf("Address returned by mmap() = %p\n", addr); + + /* Register the memory range of the mapping we just created for + * handling by the userfaultfd object. In mode, we request to track + * missing pages (i.e., pages that have not yet been faulted in). 
+ */ + + uffdio_register.range.start = (unsigned long)addr; + uffdio_register.range.len = len; + uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING; + if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) == -1) { + perror("ioctl-UFFDIO_REGISTER"); + exit(1); + } +} + +int main(void) +{ + int ret = 0; + + int fd = open("/huge/test", O_CREAT | O_RDWR, 0755); + + if (fd < 0) { + perror("Open failed"); + exit(1); + } + + /* mmap to a PUD aligned address to hopefully trigger pmd sharing. */ + unsigned long suggested_addr = 0x7eaa40000000; + void *haddr = mmap((void *)suggested_addr, LENGTH, PROTECTION, + MAP_HUGETLB | MAP_SHARED | MAP_POPULATE, fd, 0); + printf("Map haddr: Returned address is %p\n", haddr); + if (haddr == MAP_FAILED) { + perror("mmap1"); + exit(1); + } + + /* mmap again to a dummy address to hopefully trigger pmd sharing. */ + suggested_addr = 0x7daa40000000; + void *daddr = mmap((void *)suggested_addr, LENGTH, PROTECTION, + MAP_HUGETLB | MAP_SHARED | MAP_POPULATE, fd, 0); + printf("Map daddr: Returned address is %p\n", daddr); + if (daddr == MAP_FAILED) { + perror("mmap3"); + exit(1); + } + + suggested_addr = 0x7faa40000000; + void *vaddr = + mmap((void *)suggested_addr, LENGTH, PROTECTION, FLAGS, -1, 0); + printf("Map vaddr: Returned address is %p\n", vaddr); + if (vaddr == MAP_FAILED) { + perror("mmap2"); + exit(1); + } + + register_region_with_uffd(haddr, LENGTH); + + void *addr = mremap(haddr, LENGTH, LENGTH, + MREMAP_MAYMOVE | MREMAP_FIXED, vaddr); + if (addr == MAP_FAILED) { + perror("mremap"); + exit(1); + } + + printf("Mremap: Returned address is %p\n", addr); + check_bytes(addr); + write_bytes(addr); + ret = read_bytes(addr); + + munmap(addr, LENGTH); + + return ret; +} --- a/tools/testing/selftests/vm/Makefile~mm-hugepages-add-hugetlb-vma-mremap-test +++ a/tools/testing/selftests/vm/Makefile @@ -29,6 +29,7 @@ TEST_GEN_FILES = compaction_test TEST_GEN_FILES += gup_test TEST_GEN_FILES += hmm-tests TEST_GEN_FILES += hugepage-mmap +TEST_GEN_FILES 
+= hugepage-mremap TEST_GEN_FILES += hugepage-shm TEST_GEN_FILES += khugepaged TEST_GEN_FILES += madv_populate --- a/tools/testing/selftests/vm/run_vmtests.sh~mm-hugepages-add-hugetlb-vma-mremap-test +++ a/tools/testing/selftests/vm/run_vmtests.sh @@ -108,6 +108,17 @@ else echo "[PASS]" fi +echo "-----------------------" +echo "running hugepage-mremap" +echo "-----------------------" +./hugepage-mremap +if [ $? -ne 0 ]; then + echo "[FAIL]" + exitcode=1 +else + echo "[PASS]" +fi + echo "NOTE: The above hugetlb tests provide minimal coverage. Use" echo " https://github.com/libhugetlbfs/libhugetlbfs.git for" echo " hugetlb regression testing." From patchwork Fri Nov 5 20:41:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605633 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 47005C433FE for ; Fri, 5 Nov 2021 20:41:49 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id EF9896127B for ; Fri, 5 Nov 2021 20:41:48 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org EF9896127B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 98A77940083; Fri, 5 Nov 2021 16:41:48 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 939DF94007C; Fri, 5 Nov 2021 16:41:48 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 829B0940083; Fri, 5 Nov 2021 16:41:48 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0252.hostedemail.com 
[216.40.44.252]) by kanga.kvack.org (Postfix) with ESMTP id 703F294007C for ; Fri, 5 Nov 2021 16:41:48 -0400 (EDT) Received: from smtpin04.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 2BDAF1856B704 for ; Fri, 5 Nov 2021 20:41:48 +0000 (UTC) X-FDA: 78776047896.04.CA40A39 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf09.hostedemail.com (Postfix) with ESMTP id A837F3000111 for ; Fri, 5 Nov 2021 20:41:47 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id B5C376126A; Fri, 5 Nov 2021 20:41:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144907; bh=zAh9YP41Hnd95XRg1LG6DwFJ1J4g/tjtFKFlfMPhm6U=; h=Date:From:To:Subject:In-Reply-To:From; b=sDjA1h8f3Jyoub5Gs+QAUdjHsr1JwsMKYiVAhYdeRrHVVop2tKqYU1qO1GnZbocB1 DFY79EJBFaDYqcNK+5nQZYyflD5Nh9BWs56oxJHrXFT/dUntiz67zXeVnlYTPIk5s5 MKailleCqx+ecwK5ao+cEBtmNImg0oSiYiYfPlMc= Date: Fri, 05 Nov 2021 13:41:46 -0700 From: Andrew Morton To: akpm@linux-foundation.org, baolin.wang@linux.alibaba.com, corbet@lwn.net, guro@fb.com, linux-mm@kvack.org, mhocko@kernel.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 136/262] hugetlb: support node specified when using cma for gigantic hugepages Message-ID: <20211105204146.Fr7FCT9Hz%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: A837F3000111 X-Stat-Signature: pton6eub9zfx5uhsdrmnr38s9x19f75w Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=sDjA1h8f; spf=pass (imf09.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144907-175610 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, 
version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Baolin Wang Subject: hugetlb: support node specified when using cma for gigantic hugepages

Currently the CMA area used for runtime allocation of gigantic hugepages is balanced across all online nodes, but we also want to be able to specify the CMA size per node, or for only one node in some cases, similar to patch [1].

For example, on some multi-node systems each node can have a different amount of memory, so allocating the same size of CMA on every node is not suitable for the low-memory nodes. Meanwhile, some workloads, such as DPDK mentioned by Zhenguo in patch [1], only need hugepages on one node.

In addition, some machines have multiple types of memory, such as DRAM and PMEM (persistent memory). On such a system we may want to place all the hugepages on the DRAM node only, or to specify the proportion between the DRAM node and the PMEM node, in order to tune the performance of the workloads.

Thus this patch adds a node format to the 'hugetlb_cma' parameter to support specifying the CMA size per node. For example:

hugetlb_cma=0:5G,2:5G

allocates a 5G CMA area on node 0 and another on node 2. Users should then use the node-specific sysfs file to allocate gigantic hugepages on a node for which a CMA size was specified.
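For illustration only, the node format described above can be sketched in userspace C. This is not the kernel code: sketch_memparse() is a hypothetical stand-in for the kernel's memparse() that only understands K, M and G suffixes, SKETCH_MAX_NODES is an assumed small stand-in for MAX_NUMNODES, and the function name sketch_parse_hugetlb_cma() is made up for this example. The loop follows the same shape as the parser in the diff: read a number, treat it as a node id if a ':' follows, otherwise treat the whole argument as a plain size.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define SKETCH_MAX_NODES 8	/* assumption: stand-in for MAX_NUMNODES */

/* Hypothetical userspace stand-in for memparse(); K, M and G only. */
static unsigned long long sketch_memparse(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	switch (**end) {
	case 'G': case 'g':
		v <<= 30;
		(*end)++;
		break;
	case 'M': case 'm':
		v <<= 20;
		(*end)++;
		break;
	case 'K': case 'k':
		v <<= 10;
		(*end)++;
		break;
	default:
		break;
	}
	return v;
}

/*
 * Parse either "nn[KMG]" or "<node>:nn[KMG][,<node>:nn[KMG]]".
 * Per-node sizes are stored in per_node[]; the total size is returned.
 */
static unsigned long long sketch_parse_hugetlb_cma(const char *arg,
		unsigned long long per_node[SKETCH_MAX_NODES])
{
	unsigned long long total = 0, size;
	unsigned long tmp;
	const char *s = arg;
	char *end;
	int count = 0;

	assert(arg != NULL);

	while (*s) {
		if (sscanf(s, "%lu%n", &tmp, &count) != 1)
			break;

		/* No node prefix: the whole argument is a plain size. */
		if (s[count] != ':')
			return sketch_memparse(arg, &end);

		if (tmp >= SKETCH_MAX_NODES)
			break;

		size = sketch_memparse(s + count + 1, &end);
		per_node[tmp] = size;
		total += size;

		/* Continue only if another "<node>:size" entry follows. */
		if (*end != ',')
			break;
		s = end + 1;
	}
	return total;
}
```

Calling sketch_parse_hugetlb_cma("0:5G,2:5G", sizes) with a zeroed array fills entries 0 and 2 and leaves the others untouched, while a plain "4G" returns the total without touching any per-node entry, which corresponds to the existing balanced allocation path.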
[1] https://lkml.kernel.org/r/20211005054729.86457-1-yaozhenguo1@gmail.com Link: https://lkml.kernel.org/r/bb790775ca60bb8f4b26956bb3f6988f74e075c7.1634261144.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang Reviewed-by: Mike Kravetz Cc: Michal Hocko Cc: Roman Gushchin Cc: Jonathan Corbet Signed-off-by: Andrew Morton --- Documentation/admin-guide/kernel-parameters.txt | 6 mm/hugetlb.c | 86 ++++++++++++-- 2 files changed, 81 insertions(+), 11 deletions(-) --- a/Documentation/admin-guide/kernel-parameters.txt~hugetlb-support-node-specified-when-using-cma-for-gigantic-hugepages +++ a/Documentation/admin-guide/kernel-parameters.txt @@ -1587,8 +1587,10 @@ registers. Default set by CONFIG_HPET_MMAP_DEFAULT. hugetlb_cma= [HW,CMA] The size of a CMA area used for allocation - of gigantic hugepages. - Format: nn[KMGTPE] + of gigantic hugepages. Or using node format, the size + of a CMA area per node can be specified. + Format: nn[KMGTPE] or (node format) + :nn[KMGTPE][,:nn[KMGTPE]] Reserve a CMA area of given size and allocate gigantic hugepages using the CMA allocator. 
If enabled, the --- a/mm/hugetlb.c~hugetlb-support-node-specified-when-using-cma-for-gigantic-hugepages +++ a/mm/hugetlb.c @@ -50,6 +50,7 @@ struct hstate hstates[HUGE_MAX_HSTATE]; #ifdef CONFIG_CMA static struct cma *hugetlb_cma[MAX_NUMNODES]; +static unsigned long hugetlb_cma_size_in_node[MAX_NUMNODES] __initdata; static bool hugetlb_cma_page(struct page *page, unsigned int order) { return cma_pages_valid(hugetlb_cma[page_to_nid(page)], page, @@ -6762,7 +6763,38 @@ static bool cma_reserve_called __initdat static int __init cmdline_parse_hugetlb_cma(char *p) { - hugetlb_cma_size = memparse(p, &p); + int nid, count = 0; + unsigned long tmp; + char *s = p; + + while (*s) { + if (sscanf(s, "%lu%n", &tmp, &count) != 1) + break; + + if (s[count] == ':') { + nid = tmp; + if (nid < 0 || nid >= MAX_NUMNODES) + break; + + s += count + 1; + tmp = memparse(s, &s); + hugetlb_cma_size_in_node[nid] = tmp; + hugetlb_cma_size += tmp; + + /* + * Skip the separator if have one, otherwise + * break the parsing. 
+ */ + if (*s == ',') + s++; + else + break; + } else { + hugetlb_cma_size = memparse(p, &p); + break; + } + } + return 0; } @@ -6771,6 +6803,7 @@ early_param("hugetlb_cma", cmdline_parse void __init hugetlb_cma_reserve(int order) { unsigned long size, reserved, per_node; + bool node_specific_cma_alloc = false; int nid; cma_reserve_called = true; @@ -6778,6 +6811,31 @@ void __init hugetlb_cma_reserve(int orde if (!hugetlb_cma_size) return; + for (nid = 0; nid < MAX_NUMNODES; nid++) { + if (hugetlb_cma_size_in_node[nid] == 0) + continue; + + if (!node_state(nid, N_ONLINE)) { + pr_warn("hugetlb_cma: invalid node %d specified\n", nid); + hugetlb_cma_size -= hugetlb_cma_size_in_node[nid]; + hugetlb_cma_size_in_node[nid] = 0; + continue; + } + + if (hugetlb_cma_size_in_node[nid] < (PAGE_SIZE << order)) { + pr_warn("hugetlb_cma: cma area of node %d should be at least %lu MiB\n", + nid, (PAGE_SIZE << order) / SZ_1M); + hugetlb_cma_size -= hugetlb_cma_size_in_node[nid]; + hugetlb_cma_size_in_node[nid] = 0; + } else { + node_specific_cma_alloc = true; + } + } + + /* Validate the CMA size again in case some invalid nodes specified. */ + if (!hugetlb_cma_size) + return; + if (hugetlb_cma_size < (PAGE_SIZE << order)) { pr_warn("hugetlb_cma: cma area should be at least %lu MiB\n", (PAGE_SIZE << order) / SZ_1M); @@ -6785,20 +6843,30 @@ void __init hugetlb_cma_reserve(int orde return; } - /* - * If 3 GB area is requested on a machine with 4 numa nodes, - * let's allocate 1 GB on first three nodes and ignore the last one. - */ - per_node = DIV_ROUND_UP(hugetlb_cma_size, nr_online_nodes); - pr_info("hugetlb_cma: reserve %lu MiB, up to %lu MiB per node\n", - hugetlb_cma_size / SZ_1M, per_node / SZ_1M); + if (!node_specific_cma_alloc) { + /* + * If 3 GB area is requested on a machine with 4 numa nodes, + * let's allocate 1 GB on first three nodes and ignore the last one. 
+ */ + per_node = DIV_ROUND_UP(hugetlb_cma_size, nr_online_nodes); + pr_info("hugetlb_cma: reserve %lu MiB, up to %lu MiB per node\n", + hugetlb_cma_size / SZ_1M, per_node / SZ_1M); + } reserved = 0; for_each_node_state(nid, N_ONLINE) { int res; char name[CMA_MAX_NAME]; - size = min(per_node, hugetlb_cma_size - reserved); + if (node_specific_cma_alloc) { + if (hugetlb_cma_size_in_node[nid] == 0) + continue; + + size = hugetlb_cma_size_in_node[nid]; + } else { + size = min(per_node, hugetlb_cma_size - reserved); + } + size = round_up(size, PAGE_SIZE << order); snprintf(name, sizeof(name), "hugetlb%d", nid); From patchwork Fri Nov 5 20:41:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605635 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0D37FC433EF for ; Fri, 5 Nov 2021 20:41:52 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id CF5136126A for ; Fri, 5 Nov 2021 20:41:51 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org CF5136126A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 7DDC7940084; Fri, 5 Nov 2021 16:41:51 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 7668394007C; Fri, 5 Nov 2021 16:41:51 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6058D940084; Fri, 5 Nov 2021 16:41:51 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0177.hostedemail.com [216.40.44.177]) by kanga.kvack.org 
(Postfix) with ESMTP id 4DAC694007C for ; Fri, 5 Nov 2021 16:41:51 -0400 (EDT) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 1C722718E1 for ; Fri, 5 Nov 2021 20:41:51 +0000 (UTC) X-FDA: 78776048022.09.62FFE06 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf22.hostedemail.com (Postfix) with ESMTP id B8BC4191B for ; Fri, 5 Nov 2021 20:41:50 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id BF70C61262; Fri, 5 Nov 2021 20:41:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144910; bh=BSW36IfBi42ma3jGEns2UC6IdWxbZwxD8teozAaJBUM=; h=Date:From:To:Subject:In-Reply-To:From; b=ODbcsuiWu+Foww5pmzvZSG8cowDFKTfb3G+guTK4UKy+pSGa5HuIlYIcFr4ByRnPY En2i4a09DhNxifPJp25MmykvQ1iNaVeIKUZHBMEAaNwlzBYYiWW0Dg5Bv8iA2B8bK2 Q3+uAbSaHK1HnD5VyaeAOF1FjDk1q4O6M0YjJC8o= Date: Fri, 05 Nov 2021 13:41:49 -0700 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, ran.jianping@zte.com.cn, shuah@kernel.org, torvalds@linux-foundation.org, zealci@zte.com.cn Subject: [patch 137/262] mm: remove duplicate include in hugepage-mremap.c Message-ID: <20211105204149.yGUgSZG33%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: B8BC4191B X-Stat-Signature: 1rswobzcu6e135dd6c5y98ng3khrfxqu Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=ODbcsuiW; dmarc=none; spf=pass (imf22.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144910-515902 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Ran 
Jianping Subject: mm: remove duplicate include in hugepage-mremap.c Remove the duplicate include of 'unistd.h' in 'tools/testing/selftests/vm/hugepage-mremap.c'. It is also included on line 23. Link: https://lkml.kernel.org/r/20211018102336.869726-1-ran.jianping@zte.com.cn Signed-off-by: Ran Jianping Reported-by: Zeal Robot Cc: Shuah Khan Signed-off-by: Andrew Morton --- tools/testing/selftests/vm/hugepage-mremap.c | 1 - 1 file changed, 1 deletion(-) --- a/tools/testing/selftests/vm/hugepage-mremap.c~mm-remove-duplicate-include-in-hugepage-mremapc +++ a/tools/testing/selftests/vm/hugepage-mremap.c @@ -15,7 +15,6 @@ #include #include /* Definition of O_* constants */ #include /* Definition of SYS_* constants */ -#include #include #include From patchwork Fri Nov 5 20:41:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605637 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 22BD1C433EF for ; Fri, 5 Nov 2021 20:41:55 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id CA7386126A for ; Fri, 5 Nov 2021 20:41:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org CA7386126A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 67D0C940085; Fri, 5 Nov 2021 16:41:54 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 62CA094007C; Fri, 5 Nov 2021 16:41:54 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 51D9D940085; Fri, 5 Nov 2021 16:41:54 -0400 
(EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0160.hostedemail.com [216.40.44.160]) by kanga.kvack.org (Postfix) with ESMTP id 407FC94007C for ; Fri, 5 Nov 2021 16:41:54 -0400 (EDT) Received: from smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 02B0F1856A1F1 for ; Fri, 5 Nov 2021 20:41:54 +0000 (UTC) X-FDA: 78776048148.16.A8FDD29 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf18.hostedemail.com (Postfix) with ESMTP id A019C400209A for ; Fri, 5 Nov 2021 20:41:53 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id B54BB6127A; Fri, 5 Nov 2021 20:41:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144913; bh=LB85HUFu8ll1X2iiu5LzCoBLih9nDO9SXo2caohty+0=; h=Date:From:To:Subject:In-Reply-To:From; b=tSx8jdf7iIyNLcWIdFcvD6aBD6I0IIYA7DoKkLCOCr0ATkqtuj4q/9oOE/Vcr+sFx WLE189IrXvwmSJQdy/Y/qc/VagPSnnTnSlV43Ebgu9mtexTwTnMY7AmDerhgCVhyfi ji/NbBIQmMfUGYBm2rtmV0kjjWbOjrs+v25aLfhk= Date: Fri, 05 Nov 2021 13:41:52 -0700 From: Andrew Morton To: akpm@linux-foundation.org, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, mhocko@kernel.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 138/262] hugetlb_cgroup: remove unused hugetlb_cgroup_from_counter macro Message-ID: <20211105204152.MEVe-4gQM%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=tSx8jdf7; dmarc=none; spf=pass (imf18.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: A019C400209A X-Stat-Signature: y8uotoej9w4yixr1gtyh3nntf7tgou7s X-HE-Tag: 
1636144913-703213 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Baolin Wang Subject: hugetlb_cgroup: remove unused hugetlb_cgroup_from_counter macro Patch series "Some cleanups and improvements for hugetlb". This patchset does some cleanups and improvements for hugetlb and hugetlb_cgroup. This patch (of 4): Since commit 726b7bbe ("hugetlb_cgroup: fix illegal access to memory"), the hugetlb_cgroup_from_counter() macro is no longer used, so remove it. Link: https://lkml.kernel.org/r/cover.1634797639.git.baolin.wang@linux.alibaba.com Link: https://lkml.kernel.org/r/f03b29b801fa9942466ab15334ec09988e124ae6.1634797639.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang Reviewed-by: Mike Kravetz Cc: Michal Hocko Signed-off-by: Andrew Morton --- mm/hugetlb_cgroup.c | 3 --- 1 file changed, 3 deletions(-) --- a/mm/hugetlb_cgroup.c~hugetlb_cgroup-remove-unused-hugetlb_cgroup_from_counter-macro +++ a/mm/hugetlb_cgroup.c @@ -27,9 +27,6 @@ #define MEMFILE_IDX(val) (((val) >> 16) & 0xffff) #define MEMFILE_ATTR(val) ((val) & 0xffff) -#define hugetlb_cgroup_from_counter(counter, idx) \ - container_of(counter, struct hugetlb_cgroup, hugepage[idx]) - static struct hugetlb_cgroup *root_h_cgroup __read_mostly; static inline struct page_counter * From patchwork Fri Nov 5 20:41:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605639 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1DDEFC433EF for ; Fri, 5 Nov 2021 20:41:58 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C06E66127A for ; Fri, 5 Nov 2021 
20:41:57 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org C06E66127A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 5E204940086; Fri, 5 Nov 2021 16:41:57 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 5902A94007C; Fri, 5 Nov 2021 16:41:57 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4CF49940086; Fri, 5 Nov 2021 16:41:57 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0105.hostedemail.com [216.40.44.105]) by kanga.kvack.org (Postfix) with ESMTP id 3DB6294007C for ; Fri, 5 Nov 2021 16:41:57 -0400 (EDT) Received: from smtpin39.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id EFB931856B704 for ; Fri, 5 Nov 2021 20:41:56 +0000 (UTC) X-FDA: 78776048232.39.9D90B42 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf05.hostedemail.com (Postfix) with ESMTP id 20793508FA5D for ; Fri, 5 Nov 2021 20:41:39 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id AA4F461262; Fri, 5 Nov 2021 20:41:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144915; bh=wBVtpUZVtf2kVw+I46otS9bS+X/Pz+en6b3B9FC13vM=; h=Date:From:To:Subject:In-Reply-To:From; b=PDejLvKN42gFXcPh7W8eHTpFlyYFyiOOpmKcHZbQPp/mjkO5Gxv0ijZlcjhq1FgGY Qpzqiw7//+ECx0RKMuiFsrjGufp0NMhEzkTJCTolGnVOmZIkw565DcMbNIaawG7Toq 0/9SUFJqrN7yRduD88xK17Ij6iYGyf6I1bjGUDNo= Date: Fri, 05 Nov 2021 13:41:55 -0700 From: Andrew Morton To: akpm@linux-foundation.org, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, mhocko@kernel.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 139/262] hugetlb: replace the obsolete 
hugetlb_instantiation_mutex in the comments Message-ID: <20211105204155.cbkgqZqnW%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 20793508FA5D X-Stat-Signature: yb4qztkowuph6bubhyzrc9ntznk3xy36 Authentication-Results: imf05.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=PDejLvKN; spf=pass (imf05.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144899-505529 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Baolin Wang Subject: hugetlb: replace the obsolete hugetlb_instantiation_mutex in the comments After commit 8382d914ebf7 ("mm, hugetlb: improve page-fault scalability"), the hugetlb_instantiation_mutex lock was replaced by hugetlb_fault_mutex_table to serialize faults on the same logical page. Thus update the comments that still refer to the obsolete hugetlb_instantiation_mutex. Link: https://lkml.kernel.org/r/4b3febeae37455ff7b74aa0aad16cc6909cf0926.1634797639.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang Reviewed-by: Mike Kravetz Cc: Michal Hocko Signed-off-by: Andrew Morton --- mm/hugetlb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/hugetlb.c~hugetlb-replace-the-obsolete-hugetlb_instantiation_mutex-in-the-comments +++ a/mm/hugetlb.c @@ -5014,7 +5014,7 @@ static void unmap_ref_private(struct mm_ /* * Hugetlb_cow() should be called with page lock of the original hugepage held. - * Called with hugetlb_instantiation_mutex held and pte_page locked so we + * Called with hugetlb_fault_mutex_table held and pte_page locked so we * cannot race with other handlers or page migration. * Keep the pte_same checks anyway to make transition from the mutex easier. 
  */

From patchwork Fri Nov 5 20:41:58 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605641
Date: Fri, 05 Nov 2021 13:41:58 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, mhocko@kernel.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, torvalds@linux-foundation.org
Subject: [patch 140/262] hugetlb: remove redundant validation in has_same_uncharge_info()
Message-ID: <20211105204158.dBpgUQHuO%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Baolin Wang
Subject: hugetlb: remove redundant validation in has_same_uncharge_info()

The callers of has_same_uncharge_info() have already accessed the original file_region and the new file_region, so neither can be NULL at this point. So we can remove the file_region validation in has_same_uncharge_info() to simplify the code.

Link: https://lkml.kernel.org/r/97fc68d3f8d34f63c204645e10d7a718997e50b7.1634797639.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang
Reviewed-by: Mike Kravetz
Cc: Michal Hocko
Signed-off-by: Andrew Morton
---

 mm/hugetlb.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/mm/hugetlb.c~hugetlb-remove-redundant-validation-in-has_same_uncharge_info
+++ a/mm/hugetlb.c
@@ -332,8 +332,7 @@ static bool has_same_uncharge_info(struc
 				   struct file_region *org)
 {
 #ifdef CONFIG_CGROUP_HUGETLB
-	return rg && org &&
-	       rg->reservation_counter == org->reservation_counter &&
+	return rg->reservation_counter == org->reservation_counter &&
 	       rg->css == org->css;

 #else

From patchwork Fri Nov 5 20:42:01 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605643
Date: Fri, 05 Nov 2021 13:42:01 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, mhocko@kernel.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, torvalds@linux-foundation.org
Subject: [patch 141/262] hugetlb: remove redundant VM_BUG_ON() in add_reservation_in_range()
Message-ID: <20211105204201.Udvk59sOV%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Baolin Wang
Subject: hugetlb: remove redundant VM_BUG_ON() in add_reservation_in_range()

When calling hugetlb_resv_map_add(), we have already guaranteed that the parameter 'to' is always larger than 'from', so hugetlb_resv_map_add() never returns a negative value. Thus remove the redundant VM_BUG_ON().

Link: https://lkml.kernel.org/r/2b565552f3d06753da1e8dda439c0d96d6d9a5a3.1634797639.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang
Reviewed-by: Mike Kravetz
Cc: Michal Hocko
Signed-off-by: Andrew Morton
---

 mm/hugetlb.c | 1 -
 1 file changed, 1 deletion(-)

--- a/mm/hugetlb.c~hugetlb-remove-redundant-vm_bug_on-in-add_reservation_in_range
+++ a/mm/hugetlb.c
@@ -445,7 +445,6 @@ static long add_reservation_in_range(str
 		add += hugetlb_resv_map_add(resv, rg, last_accounted_offset,
 					    t, h, h_cg, regions_needed);

-	VM_BUG_ON(add < 0);
 	return add;
 }

From patchwork Fri Nov 5 20:42:04 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605987
Date: Fri, 05 Nov 2021 13:42:04 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, linux-mm@kvack.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, osalvador@suse.de, pasha.tatashin@soleen.com, songmuchun@bytedance.com, torvalds@linux-foundation.org, willy@infradead.org
Subject: [patch 142/262] hugetlb: remove unnecessary set_page_count in prep_compound_gigantic_page
Message-ID: <20211105204204.OdjvHE3wI%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Mike Kravetz
Subject: hugetlb: remove unnecessary set_page_count in prep_compound_gigantic_page

In commit 7118fc2906e29 ("hugetlb: address ref count racing in prep_compound_gigantic_page"), page_ref_freeze is used to atomically zero the ref count of tail pages iff they are 1. The unconditional call to set_page_count(0) was left in the code. This call comes after page_ref_freeze, so it is really a no-op.

Remove the redundant and unnecessary set_page_count call.

Link: https://lkml.kernel.org/r/20211026220635.35187-1-mike.kravetz@oracle.com
Fixes: 7118fc2906e29 ("hugetlb: address ref count racing in prep_compound_gigantic_page")
Signed-off-by: Mike Kravetz
Suggested-by: Pasha Tatashin
Reviewed-by: Pasha Tatashin
Reviewed-by: Matthew Wilcox (Oracle)
Reviewed-by: Oscar Salvador
Reviewed-by: Muchun Song
Signed-off-by: Andrew Morton
---

 mm/hugetlb.c | 1 -
 1 file changed, 1 deletion(-)

--- a/mm/hugetlb.c~hugetlb-remove-unnecessary-set_page_count-in-prep_compound_gigantic_page
+++ a/mm/hugetlb.c
@@ -1792,7 +1792,6 @@ static bool __prep_compound_gigantic_pag
 		} else {
 			VM_BUG_ON_PAGE(page_count(p), p);
 		}
-		set_page_count(p, 0);
 		set_compound_head(p, page);
 	}
 	atomic_set(compound_mapcount_ptr(page), -1);

From patchwork Fri Nov 5 20:42:07 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605649
Date: Fri, 05 Nov 2021 13:42:07 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, axelrasmussen@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, shuah@kernel.org, torvalds@linux-foundation.org
Subject: [patch 143/262] userfaultfd/selftests: don't rely on GNU extensions for random numbers
Message-ID: <20211105204207.leuiYoT1g%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Axel Rasmussen
Subject: userfaultfd/selftests: don't rely on GNU extensions for random numbers

Patch series "Small userfaultfd selftest fixups", v2.

This patch (of 3):

Two arguments for doing this:

First, and maybe most importantly, the resulting code is significantly shorter / simpler.

Second, we avoid using GNU libc extensions. Why does this matter? It makes testing userfaultfd with the selftest easier, e.g., on distros which use something other than glibc (e.g., Alpine, which uses musl); basically, it makes the test more portable.

Link: https://lkml.kernel.org/r/20210930212309.4001967-2-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen
Reviewed-by: Peter Xu
Cc: Shuah Khan
Signed-off-by: Andrew Morton
---

 tools/testing/selftests/vm/userfaultfd.c | 26 +++------------------
 1 file changed, 4 insertions(+), 22 deletions(-)

--- a/tools/testing/selftests/vm/userfaultfd.c~userfaultfd-selftests-dont-rely-on-gnu-extensions-for-random-numbers
+++ a/tools/testing/selftests/vm/userfaultfd.c
@@ -57,6 +57,7 @@
 #include
 #include
 #include
+#include

 #include "../kselftest.h"

@@ -518,22 +519,10 @@ static void continue_range(int ufd, __u6
 static void *locking_thread(void *arg)
 {
 	unsigned long cpu = (unsigned long) arg;
-	struct random_data rand;
 	unsigned long page_nr = *(&(page_nr)); /* uninitialized warning */
-	int32_t rand_nr;
 	unsigned long long count;
-	char randstate[64];
-	unsigned int seed;

-	if (bounces & BOUNCE_RANDOM) {
-		seed = (unsigned int) time(NULL) - bounces;
-		if (!(bounces & BOUNCE_RACINGFAULTS))
-			seed += cpu;
-		bzero(&rand, sizeof(rand));
-		bzero(&randstate, sizeof(randstate));
-		if (initstate_r(seed, randstate, sizeof(randstate), &rand))
-			err("initstate_r failed");
-	} else {
+	if (!(bounces & BOUNCE_RANDOM)) {
 		page_nr = -bounces;
 		if (!(bounces & BOUNCE_RACINGFAULTS))
 			page_nr += cpu * nr_pages_per_cpu;
@@ -541,15 +530,8 @@ static void *locking_thread(void *arg)

 	while (!finished) {
 		if (bounces & BOUNCE_RANDOM) {
-			if (random_r(&rand, &rand_nr))
-				err("random_r failed");
-			page_nr = rand_nr;
-			if (sizeof(page_nr) > sizeof(rand_nr)) {
-				if (random_r(&rand, &rand_nr))
-					err("random_r failed");
-				page_nr |= (((unsigned long) rand_nr) << 16) <<
-					   16;
-			}
+			if (getrandom(&page_nr, sizeof(page_nr), 0) != sizeof(page_nr))
+				err("getrandom failed");
 		} else
 			page_nr += 1;
 		page_nr %= nr_pages;

From patchwork Fri Nov 5 20:42:10 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605651
Date: Fri, 05 Nov 2021 13:42:10 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, axelrasmussen@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, shuah@kernel.org, torvalds@linux-foundation.org
Subject: [patch 144/262] userfaultfd/selftests: fix feature support detection
Message-ID: <20211105204210.vJUiuN2bw%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Axel Rasmussen
Subject: userfaultfd/selftests: fix feature support detection

Before any tests are run, in set_test_type, we decide what feature(s) we are going to be testing, based upon our command line arguments. However, the supported features are not just a function of the memory type being used, so this is broken.

For instance, consider writeprotect support. It is "normally" supported for anonymous memory, but furthermore it requires that the kernel has CONFIG_HAVE_ARCH_USERFAULTFD_WP. So, it is *not* supported at all on aarch64, for example.

So, this commit fixes this by querying the kernel for the set of features it supports in set_test_type, by opening a userfaultfd and issuing a UFFDIO_API ioctl. Based upon the reported features, we toggle what tests are enabled.

Link: https://lkml.kernel.org/r/20210930212309.4001967-3-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen
Reviewed-by: Peter Xu
Cc: Shuah Khan
Signed-off-by: Andrew Morton
---

 tools/testing/selftests/vm/userfaultfd.c | 54 ++++++++++++---------
 1 file changed, 31 insertions(+), 23 deletions(-)

--- a/tools/testing/selftests/vm/userfaultfd.c~userfaultfd-selftests-fix-feature-support-detection
+++ a/tools/testing/selftests/vm/userfaultfd.c
@@ -346,6 +346,16 @@ static struct uffd_test_ops hugetlb_uffd

 static struct uffd_test_ops *uffd_test_ops;

+static inline uint64_t uffd_minor_feature(void)
+{
+	if (test_type == TEST_HUGETLB && map_shared)
+		return UFFD_FEATURE_MINOR_HUGETLBFS;
+	else if (test_type == TEST_SHMEM)
+		return UFFD_FEATURE_MINOR_SHMEM;
+	else
+		return 0;
+}
+
 static void userfaultfd_open(uint64_t *features)
 {
 	struct uffdio_api uffdio_api;
@@ -406,7 +416,7 @@ static void uffd_test_ctx_clear(void)
 	munmap_area((void **)&area_dst_alias);
 }

-static void uffd_test_ctx_init_ext(uint64_t *features)
+static void uffd_test_ctx_init(uint64_t features)
 {
 	unsigned long nr, cpu;

@@ -415,7 +425,7 @@ static void uffd_test_ctx_init_ext(uint6
 	uffd_test_ops->allocate_area((void **)&area_src);
 	uffd_test_ops->allocate_area((void **)&area_dst);

-	userfaultfd_open(features);
+	userfaultfd_open(&features);

 	count_verify = malloc(nr_pages * sizeof(unsigned long long));
 	if (!count_verify)
@@ -463,11 +473,6 @@ static void uffd_test_ctx_init_ext(uint6
 			err("pipe");
 }

-static inline void uffd_test_ctx_init(uint64_t features)
-{
-	uffd_test_ctx_init_ext(&features);
-}
-
 static int my_bcmp(char *str1, char *str2, size_t n)
 {
 	unsigned long i;
@@ -1208,7 +1213,6 @@ static int userfaultfd_minor_test(void)
 	void *expected_page;
 	char c;
 	struct uffd_stats stats = { 0 };
-	uint64_t req_features, features_out;

 	if (!test_uffdio_minor)
 		return 0;
@@ -1216,21 +1220,7 @@ static int userfaultfd_minor_test(void)
 	printf("testing minor faults: ");
 	fflush(stdout);

-	if (test_type == TEST_HUGETLB)
-		req_features = UFFD_FEATURE_MINOR_HUGETLBFS;
-	else if (test_type == TEST_SHMEM)
-		req_features = UFFD_FEATURE_MINOR_SHMEM;
-	else
-		return 1;
-
-	features_out = req_features;
-	uffd_test_ctx_init_ext(&features_out);
-	/* If kernel reports required features aren't supported, skip test. */
-	if ((features_out & req_features) != req_features) {
-		printf("skipping test due to lack of feature support\n");
-		fflush(stdout);
-		return 0;
-	}
+	uffd_test_ctx_init(uffd_minor_feature());

 	uffdio_register.range.start = (unsigned long)area_dst_alias;
 	uffdio_register.range.len = nr_pages * page_size;
@@ -1591,6 +1581,8 @@ unsigned long default_huge_page_size(voi

 static void set_test_type(const char *type)
 {
+	uint64_t features = UFFD_API_FEATURES;
+
 	if (!strcmp(type, "anon")) {
 		test_type = TEST_ANON;
 		uffd_test_ops = &anon_uffd_test_ops;
@@ -1624,6 +1616,22 @@ static void set_test_type(const char *ty
 	if ((unsigned long) area_count(NULL, 0) + sizeof(unsigned long long) * 2
 	    > page_size)
 		err("Impossible to run this test");
+
+	/*
+	 * Whether we can test certain features depends not just on test type,
+	 * but also on whether or not this particular kernel supports the
+	 * feature.
+	 */
+
+	userfaultfd_open(&features);
+
+	test_uffdio_wp = test_uffdio_wp &&
+		(features & UFFD_FEATURE_PAGEFAULT_FLAG_WP);
+	test_uffdio_minor = test_uffdio_minor &&
+		(features & uffd_minor_feature());
+
+	close(uffd);
+	uffd = -1;
 }

 static void sigalrm(int sig)

From patchwork Fri Nov 5 20:42:13 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605653
Date: Fri, 05 Nov 2021 13:42:13 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, axelrasmussen@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, shuah@kernel.org, torvalds@linux-foundation.org
Subject: [patch 145/262] userfaultfd/selftests: fix calculation of expected ioctls
Message-ID: <20211105204213.RZkItPVQK%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Axel Rasmussen
Subject: userfaultfd/selftests: fix calculation of expected ioctls

Today, we assert that the ioctls the kernel reports as supported for a registration match a precomputed list. We decide which ioctls are supported by examining the memory type.
Then, in several locations we "fix up" this list by adding or removing things this initial decision got wrong.

What ioctls the kernel reports is actually a function of several things:
 - The memory type
 - Kernel feature support (e.g., no writeprotect on aarch64)
 - The registration type (e.g., CONTINUE only supported for MINOR mode)

So, we can't fully compute this at the start, in set_test_type. It varies per test, depending on what registration mode(s) those tests use.

Instead, introduce a new function which computes the correct list. This centralizes the add/remove of ioctls depending on these function inputs in one place, so we don't have to repeat ourselves in various tests.

Not only is the resulting code a bit shorter, but it fixes a real bug in the existing code: previously, we would incorrectly require the writeprotect ioctl to be present on aarch64, where it isn't actually supported.

Link: https://lkml.kernel.org/r/20210930212309.4001967-4-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen
Reviewed-by: Peter Xu
Cc: Shuah Khan
Signed-off-by: Andrew Morton
---

 tools/testing/selftests/vm/userfaultfd.c | 77 ++++++++++-----------
 1 file changed, 38 insertions(+), 39 deletions(-)

--- a/tools/testing/selftests/vm/userfaultfd.c~userfaultfd-selftests-fix-calculation-of-expected-ioctls
+++ a/tools/testing/selftests/vm/userfaultfd.c
@@ -308,37 +308,24 @@ static void shmem_alias_mapping(__u64 *s
 }

 struct uffd_test_ops {
-	unsigned long expected_ioctls;
 	void (*allocate_area)(void **alloc_area);
 	void (*release_pages)(char *rel_area);
 	void (*alias_mapping)(__u64 *start, size_t len, unsigned long offset);
 };

-#define SHMEM_EXPECTED_IOCTLS ((1 << _UFFDIO_WAKE) | \
-				 (1 << _UFFDIO_COPY) | \
-				 (1 << _UFFDIO_ZEROPAGE))
-
-#define ANON_EXPECTED_IOCTLS ((1 << _UFFDIO_WAKE) | \
-			      (1 << _UFFDIO_COPY) | \
-			      (1 << _UFFDIO_ZEROPAGE) | \
-			      (1 << _UFFDIO_WRITEPROTECT))
-
 static struct uffd_test_ops anon_uffd_test_ops = {
-	.expected_ioctls = ANON_EXPECTED_IOCTLS,
 	.allocate_area = anon_allocate_area,
 	.release_pages = anon_release_pages,
 	.alias_mapping = noop_alias_mapping,
 };

 static struct uffd_test_ops shmem_uffd_test_ops = {
-	.expected_ioctls = SHMEM_EXPECTED_IOCTLS,
 	.allocate_area = shmem_allocate_area,
 	.release_pages = shmem_release_pages,
 	.alias_mapping = shmem_alias_mapping,
 };

 static struct uffd_test_ops hugetlb_uffd_test_ops = {
-	.expected_ioctls = UFFD_API_RANGE_IOCTLS_BASIC & ~(1 << _UFFDIO_CONTINUE),
 	.allocate_area = hugetlb_allocate_area,
 	.release_pages = hugetlb_release_pages,
 	.alias_mapping = hugetlb_alias_mapping,
@@ -356,6 +343,33 @@ static inline uint64_t uffd_minor_featur
 		return 0;
 }

+static uint64_t get_expected_ioctls(uint64_t mode)
+{
+	uint64_t ioctls = UFFD_API_RANGE_IOCTLS;
+
+	if (test_type == TEST_HUGETLB)
+		ioctls &= ~(1 << _UFFDIO_ZEROPAGE);
+
+	if (!((mode & UFFDIO_REGISTER_MODE_WP) && test_uffdio_wp))
+		ioctls &= ~(1 << _UFFDIO_WRITEPROTECT);
+
+	if (!((mode & UFFDIO_REGISTER_MODE_MINOR) && test_uffdio_minor))
+		ioctls &= ~(1 << _UFFDIO_CONTINUE);
+
+	return ioctls;
+}
+
+static void assert_expected_ioctls_present(uint64_t mode, uint64_t ioctls)
+{
+	uint64_t expected = get_expected_ioctls(mode);
+	uint64_t actual = ioctls & expected;
+
+	if (actual != expected) {
+		err("missing ioctl(s): expected %"PRIx64" actual: %"PRIx64,
+		    expected, actual);
+	}
+}
+
 static void userfaultfd_open(uint64_t *features)
 {
 	struct uffdio_api uffdio_api;
@@ -1017,11 +1031,9 @@ static int __uffdio_zeropage(int ufd, un
 {
 	struct uffdio_zeropage uffdio_zeropage;
 	int ret;
-	unsigned long has_zeropage;
+	bool has_zeropage = get_expected_ioctls(0) & (1 << _UFFDIO_ZEROPAGE);
 	__s64 res;

-	has_zeropage = uffd_test_ops->expected_ioctls & (1 << _UFFDIO_ZEROPAGE);
-
 	if (offset >= nr_pages * page_size)
 		err("unexpected offset %lu", offset);
 	uffdio_zeropage.range.start = (unsigned long) area_dst + offset;
@@ -1061,7 +1073,6 @@ static int uffdio_zeropage(int ufd, unsi
 static int userfaultfd_zeropage_test(void)
 {
 	struct uffdio_register uffdio_register;
-	unsigned long expected_ioctls;

 	printf("testing UFFDIO_ZEROPAGE: ");
 	fflush(stdout);
@@ -1076,9 +1087,8 @@ static int userfaultfd_zeropage_test(voi
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
 		err("register failure");

-	expected_ioctls = uffd_test_ops->expected_ioctls;
-	if ((uffdio_register.ioctls & expected_ioctls) != expected_ioctls)
-		err("unexpected missing ioctl for anon memory");
+	assert_expected_ioctls_present(
+		uffdio_register.mode, uffdio_register.ioctls);

 	if (uffdio_zeropage(uffd, 0))
 		if (my_bcmp(area_dst, zeropage, page_size))
@@ -1091,7 +1101,6 @@ static int userfaultfd_zeropage_test(voi
 static int userfaultfd_events_test(void)
 {
 	struct uffdio_register uffdio_register;
-	unsigned long expected_ioctls;
 	pthread_t uffd_mon;
 	int err, features;
 	pid_t pid;
@@ -1115,9 +1124,8 @@ static int userfaultfd_events_test(void)
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
 		err("register failure");

-	expected_ioctls = uffd_test_ops->expected_ioctls;
-	if ((uffdio_register.ioctls & expected_ioctls) != expected_ioctls)
-		err("unexpected missing ioctl for anon memory");
+	assert_expected_ioctls_present(
+		uffdio_register.mode, uffdio_register.ioctls);

 	if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, &stats))
 		err("uffd_poll_thread create");
@@ -1145,7 +1153,6 @@ static int userfaultfd_events_test(void)
 static int userfaultfd_sig_test(void)
 {
 	struct uffdio_register uffdio_register;
-	unsigned long expected_ioctls;
 	unsigned long userfaults;
 	pthread_t uffd_mon;
 	int err, features;
@@ -1169,9 +1176,8 @@ static int userfaultfd_sig_test(void)
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
 		err("register failure");

-	expected_ioctls = uffd_test_ops->expected_ioctls;
-	if ((uffdio_register.ioctls & expected_ioctls) != expected_ioctls)
-		err("unexpected missing ioctl for anon memory");
+	assert_expected_ioctls_present(
+		uffdio_register.mode, uffdio_register.ioctls);

 	if (faulting_process(1))
 		err("faulting process failed");
@@ -1206,7
+1212,6 @@ static int userfaultfd_sig_test(void) static int userfaultfd_minor_test(void) { struct uffdio_register uffdio_register; - unsigned long expected_ioctls; unsigned long p; pthread_t uffd_mon; uint8_t expected_byte; @@ -1228,10 +1233,8 @@ static int userfaultfd_minor_test(void) if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register)) err("register failure"); - expected_ioctls = uffd_test_ops->expected_ioctls; - expected_ioctls |= 1 << _UFFDIO_CONTINUE; - if ((uffdio_register.ioctls & expected_ioctls) != expected_ioctls) - err("unexpected missing ioctl(s)"); + assert_expected_ioctls_present( + uffdio_register.mode, uffdio_register.ioctls); /* * After registering with UFFD, populate the non-UFFD-registered side of @@ -1428,8 +1431,6 @@ static int userfaultfd_stress(void) pthread_attr_setstacksize(&attr, 16*1024*1024); while (bounces--) { - unsigned long expected_ioctls; - printf("bounces: %d, mode:", bounces); if (bounces & BOUNCE_RANDOM) printf(" rnd"); @@ -1457,10 +1458,8 @@ static int userfaultfd_stress(void) uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP; if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register)) err("register failure"); - expected_ioctls = uffd_test_ops->expected_ioctls; - if ((uffdio_register.ioctls & expected_ioctls) != - expected_ioctls) - err("unexpected missing ioctl for anon memory"); + assert_expected_ioctls_present( + uffdio_register.mode, uffdio_register.ioctls); if (area_dst_alias) { uffdio_register.range.start = (unsigned long) From patchwork Fri Nov 5 20:42:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605981 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B22FEC433F5 for ; Fri, 5 Nov 2021 20:52:06 +0000 (UTC) Received: from kanga.kvack.org 
(kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 494F860296 for ; Fri, 5 Nov 2021 20:52:06 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 494F860296 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id BC843940101; Fri, 5 Nov 2021 16:52:05 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B797A940100; Fri, 5 Nov 2021 16:52:05 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A6828940101; Fri, 5 Nov 2021 16:52:05 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0043.hostedemail.com [216.40.44.43]) by kanga.kvack.org (Postfix) with ESMTP id 961579400FA for ; Fri, 5 Nov 2021 16:52:05 -0400 (EDT) Received: from smtpin17.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 598DF8249980 for ; Fri, 5 Nov 2021 20:52:05 +0000 (UTC) X-FDA: 78776073810.17.D90AF16 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf24.hostedemail.com (Postfix) with ESMTP id 0D9F2B0000B6 for ; Fri, 5 Nov 2021 20:52:04 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 8C99760ED5; Fri, 5 Nov 2021 20:42:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144936; bh=seXuwMfGtoXC/UWxzkXUJ6Vlx9b8fDOck7LH0paQwVI=; h=Date:From:To:Subject:In-Reply-To:From; b=gl4ES7SEtcCvNzyFjHfuzPz9c6vK+Fq1gkgbDysXXuZ5URzCOAX7AT70xi/dOF978 gOtdGiMDjWWSKsMJVNp4SB9/mMQQCOja0+rohaWJyG8XYi0IVZQW/nTNsADDIpvLuW NLRpiRy5f+eeb+wMo3/ekAMXzl4BVq6HvbxuHZww= Date: Fri, 05 Nov 2021 13:42:16 -0700 From: Andrew Morton To: akpm@linux-foundation.org, david@redhat.com, linmiaohe@huawei.com, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, 
torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 146/262] mm/page_isolation: fix potential missing call to unset_migratetype_isolate() Message-ID: <20211105204216.y2EnaPqCQ%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 0D9F2B0000B6 X-Stat-Signature: hfbk53o4mitrcn7ecyrxw1d3z5txhw6z Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=gl4ES7SE; spf=pass (imf24.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145524-301327 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Miaohe Lin Subject: mm/page_isolation: fix potential missing call to unset_migratetype_isolate() In start_isolate_page_range() undo path, pfn_to_online_page() just checks the first pfn in a pageblock while __first_valid_page() will traverse the pageblock until the first online pfn is found. So we may miss the call to unset_migratetype_isolate() in undo path and pages will remain isolated unexpectedly. Fix this by calling undo_isolate_page_range() and this will also help to simplify the code further. Note we shouldn't ever trigger it because MAX_ORDER-1 aligned pfn ranges shouldn't contain memory holes now. 
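To illustrate the mismatch described above, here is a small userspace toy model (not kernel code). Every name in it (online[], isolated[], block_has_online_pfn(), first_pfn_online(), leaked_blocks()) is hypothetical. The two lookup helpers only stand in for __first_valid_page() and pfn_to_online_page(), and the sketch assumes the new undo path uses the same whole-block lookup as the isolate path. For simplicity it undoes the whole toy range rather than only [start_pfn, pfn).

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy sizes; real pageblocks are far larger. All names are hypothetical. */
#define BLOCK_PAGES 4
#define NR_BLOCKS   3
#define NR_PFNS     (BLOCK_PAGES * NR_BLOCKS)

static bool online[NR_PFNS];     /* which pfns are online */
static bool isolated[NR_BLOCKS]; /* which pageblocks are marked isolated */

/* Stand-in for __first_valid_page(): scan the block for any online pfn. */
static bool block_has_online_pfn(size_t pfn)
{
	for (size_t i = 0; i < BLOCK_PAGES; i++)
		if (online[pfn + i])
			return true;
	return false;
}

/* Stand-in for pfn_to_online_page(): look only at the pfn it is given. */
static bool first_pfn_online(size_t pfn)
{
	return online[pfn];
}

/* Isolate every block in [start, end) that contains an online pfn. */
static void isolate_range(size_t start, size_t end)
{
	for (size_t pfn = start; pfn < end; pfn += BLOCK_PAGES)
		if (block_has_online_pfn(pfn))
			isolated[pfn / BLOCK_PAGES] = true;
}

/* Undo with either the old (first pfn only) or new (whole block) lookup. */
static void undo_range(size_t start, size_t end, bool old_lookup)
{
	for (size_t pfn = start; pfn < end; pfn += BLOCK_PAGES) {
		bool found = old_lookup ? first_pfn_online(pfn)
					: block_has_online_pfn(pfn);
		if (found)
			isolated[pfn / BLOCK_PAGES] = false;
	}
}

/* Isolate, then undo, the toy range; return how many blocks stay isolated. */
size_t leaked_blocks(bool old_lookup)
{
	size_t leaked = 0;

	for (size_t b = 0; b < NR_BLOCKS; b++)
		isolated[b] = false;
	for (size_t pfn = 0; pfn < NR_PFNS; pfn++)
		online[pfn] = true;
	online[BLOCK_PAGES] = false; /* first pfn of block 1 is a hole */

	isolate_range(0, NR_PFNS);
	undo_range(0, NR_PFNS, old_lookup);

	for (size_t b = 0; b < NR_BLOCKS; b++)
		leaked += isolated[b];
	return leaked;
}
```

In this model leaked_blocks(true) reports one pageblock left isolated, because the block whose first pfn is offline is skipped on undo, while leaked_blocks(false) reports none.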
Link: https://lkml.kernel.org/r/20210914114348.15569-1-linmiaohe@huawei.com
Fixes: 2ce13640b3f4 ("mm: __first_valid_page skip over offline pages")
Signed-off-by: Miaohe Lin
Reviewed-by: David Hildenbrand
Cc: Michal Hocko
Cc: Vlastimil Babka
Signed-off-by: Andrew Morton
---

 mm/page_isolation.c |   20 +++-----------------
 1 file changed, 3 insertions(+), 17 deletions(-)

--- a/mm/page_isolation.c~mm-page_isolation-fix-potential-missing-call-to-unset_migratetype_isolate
+++ a/mm/page_isolation.c
@@ -183,7 +183,6 @@ int start_isolate_page_range(unsigned lo
 			     unsigned migratetype, int flags)
 {
 	unsigned long pfn;
-	unsigned long undo_pfn;
 	struct page *page;
 
 	BUG_ON(!IS_ALIGNED(start_pfn, pageblock_nr_pages));
@@ -193,25 +192,12 @@ int start_isolate_page_range(unsigned lo
 	     pfn < end_pfn;
 	     pfn += pageblock_nr_pages) {
 		page = __first_valid_page(pfn, pageblock_nr_pages);
-		if (page) {
-			if (set_migratetype_isolate(page, migratetype, flags)) {
-				undo_pfn = pfn;
-				goto undo;
-			}
+		if (page && set_migratetype_isolate(page, migratetype, flags)) {
+			undo_isolate_page_range(start_pfn, pfn, migratetype);
+			return -EBUSY;
 		}
 	}
 	return 0;
-undo:
-	for (pfn = start_pfn;
-	     pfn < undo_pfn;
-	     pfn += pageblock_nr_pages) {
-		struct page *page = pfn_to_online_page(pfn);
-		if (!page)
-			continue;
-		unset_migratetype_isolate(page, migratetype);
-	}
-
-	return -EBUSY;
 }
 
 /*

From patchwork Fri Nov 5 20:42:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605655 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EE038C433EF for ; Fri, 5 Nov 2021 20:42:21 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9444B61058 for ; Fri, 5 Nov 2021 20:42:21 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 9444B61058 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 36F1D94008E; Fri, 5 Nov 2021 16:42:21 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 31E9A94007C; Fri, 5 Nov 2021 16:42:21 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 20E8194008E; Fri, 5 Nov 2021 16:42:21 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0106.hostedemail.com [216.40.44.106]) by kanga.kvack.org (Postfix) with ESMTP id 1377A94007C for ; Fri, 5 Nov 2021 16:42:21 -0400 (EDT) Received: from smtpin27.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id CF3388249980 for ; Fri, 5 Nov 2021 20:42:20 +0000 (UTC) X-FDA: 78776049240.27.15198C8 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf13.hostedemail.com (Postfix) with ESMTP id 650A3104AAD7 for ; Fri, 5 Nov 2021 20:42:11 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 86D1161352; Fri, 5 Nov 2021 20:42:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144939; bh=fBXL4EL1sBKDDtj6+h126HKRVsfx+U+pPEhhzlUTRp4=; h=Date:From:To:Subject:In-Reply-To:From; b=EdStPXyH+Fjx4fGrziLDi2zLuR0+72/ILO7krauKKyACZO593LcpNbwA8X4AkOpBA NUK9+lU6rFhZy6UHcw+OxLXO71SpK+zA4C/5nMgaIYkbiaPvy1+PP4PFPlhvcqQeio Agz1pNblDW/+nNXJttL3AgPfxys7TUTzTZg4EK/Q= Date: Fri, 05 Nov 2021 13:42:19 -0700 From: Andrew Morton To: akpm@linux-foundation.org, david@redhat.com, iamjoonsoo.kim@lge.com, jhubbard@nvidia.com, linmiaohe@huawei.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 147/262] mm/page_isolation: guard 
against possible putback unisolated page Message-ID: <20211105204219.IUM80dBhI%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf13.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=EdStPXyH; dmarc=none; spf=pass (imf13.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 650A3104AAD7 X-Stat-Signature: terd4jgyb7393rsfdommaiu5fk5tm3fx X-HE-Tag: 1636144931-57861 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Miaohe Lin Subject: mm/page_isolation: guard against possible putback unisolated page Isolating a free page in an isolated pageblock is expected to always work as watermarks don't apply here. But if __isolate_free_page() failed, due to condition changes, the page will be left on the free list. And the page will be put back to free list again via __putback_isolated_page(). This may trigger VM_BUG_ON_PAGE() on page->flags checking in __free_one_page() if PageReported is set. Or we will corrupt the free list because list_add() will be called for pages already on another list. Add a VM_WARN_ON() to complain about this change. 
Link: https://lkml.kernel.org/r/20210914114508.23725-1-linmiaohe@huawei.com
Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock")
Signed-off-by: Miaohe Lin
Reviewed-by: David Hildenbrand
Acked-by: Vlastimil Babka
Cc: John Hubbard
Cc: Joonsoo Kim
Signed-off-by: Andrew Morton
---

 mm/page_isolation.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

--- a/mm/page_isolation.c~mm-page_isolation-guard-against-possible-putback-unisolated-page
+++ a/mm/page_isolation.c
@@ -94,8 +94,13 @@ static void unset_migratetype_isolate(st
 			buddy = page + (buddy_pfn - pfn);
 
 			if (!is_migrate_isolate_page(buddy)) {
-				__isolate_free_page(page, order);
-				isolated_page = true;
+				isolated_page = !!__isolate_free_page(page, order);
+				/*
+				 * Isolating a free page in an isolated pageblock
+				 * is expected to always work as watermarks don't
+				 * apply here.
+				 */
+				VM_WARN_ON(!isolated_page);
 			}
 		}
 	}

From patchwork Fri Nov 5 20:42:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605659 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1EF62C433FE for ; Fri, 5 Nov 2021 20:42:25 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id E338F6120D for ; Fri, 5 Nov 2021 20:42:24 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org E338F6120D Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 51FD194007C; Fri, 5 Nov 2021 16:42:24 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 481D4940090; Fri, 5 Nov 2021 16:42:24
-0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 300B894007C; Fri, 5 Nov 2021 16:42:24 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0025.hostedemail.com [216.40.44.25]) by kanga.kvack.org (Postfix) with ESMTP id 0F121940090 for ; Fri, 5 Nov 2021 16:42:24 -0400 (EDT) Received: from smtpin10.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id C4C1775305 for ; Fri, 5 Nov 2021 20:42:23 +0000 (UTC) X-FDA: 78776049366.10.F245ABE Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf15.hostedemail.com (Postfix) with ESMTP id 8F7BDD0000A5 for ; Fri, 5 Nov 2021 20:42:12 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 8F6C361058; Fri, 5 Nov 2021 20:42:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144942; bh=jwJ8qn72re8tjT6QT5XHOWry11YP2w2kKC9FzKZOqp8=; h=Date:From:To:Subject:In-Reply-To:From; b=HqPYaWdyDG+C+B/8RyTo3Bs2P8mRqFZojHN6sgpfhFS/b07D6MnoZzD574hC1igty BTb5aCfvM0y0dhs9A/B0NxNgJhj+n7UZZmyGCN195ISmnbP6/EbpsQquFZ2YdsBYud gbvm6oVIsHcga5Vta2Xx+1dSLpggZ28OMd1PXL6w= Date: Fri, 05 Nov 2021 13:42:22 -0700 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, shy828301@gmail.com, songkai01@inspur.com, torvalds@linux-foundation.org Subject: [patch 148/262] mm/vmscan.c: fix -Wunused-but-set-variable warning Message-ID: <20211105204222.LKDG_epUU%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=HqPYaWdy; dmarc=none; spf=pass (imf15.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org 
X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 8F7BDD0000A5 X-Stat-Signature: ku81zb5mjiuh4d3p7peswhzxn96mxt6j X-HE-Tag: 1636144932-151792 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

From: Kai Song
Subject: mm/vmscan.c: fix -Wunused-but-set-variable warning

We fix the following warning when building kernel with W=1:

mm/vmscan.c:1362:6: warning: variable 'err' set but not used [-Wunused-but-set-variable]

Link: https://lkml.kernel.org/r/20210924181218.21165-1-songkai01@inspur.com
Signed-off-by: Kai Song
Reviewed-by: Yang Shi
Signed-off-by: Andrew Morton
---

 mm/vmscan.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/mm/vmscan.c~mm-vmscanc-fix-wunused-but-set-variable-warning
+++ a/mm/vmscan.c
@@ -1337,7 +1337,6 @@ static unsigned int demote_page_list(str
 {
 	int target_nid = next_demotion_node(pgdat->node_id);
 	unsigned int nr_succeeded;
-	int err;
 
 	if (list_empty(demote_pages))
 		return 0;
@@ -1346,7 +1345,7 @@ static unsigned int demote_page_list(str
 		return 0;
 
 	/* Demotion ignores all cpuset and mempolicy settings */
-	err = migrate_pages(demote_pages, alloc_demote_page, NULL,
+	migrate_pages(demote_pages, alloc_demote_page, NULL,
 			    target_nid, MIGRATE_ASYNC, MR_DEMOTION,
 			    &nr_succeeded);
 

From patchwork Fri Nov 5 20:42:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605667 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9AC03C433FE for ; Fri, 5 Nov 2021 20:42:43 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 319A36127C for ; Fri, 5 Nov 2021 20:42:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter
v1.4.1 mail.kernel.org 319A36127C Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 8D80B940090; Fri, 5 Nov 2021 16:42:41 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 85433940095; Fri, 5 Nov 2021 16:42:41 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 63FB5940090; Fri, 5 Nov 2021 16:42:41 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0238.hostedemail.com [216.40.44.238]) by kanga.kvack.org (Postfix) with ESMTP id 498B8940093 for ; Fri, 5 Nov 2021 16:42:41 -0400 (EDT) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 03733184B452D for ; Fri, 5 Nov 2021 20:42:41 +0000 (UTC) X-FDA: 78776050080.09.E8118E7 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf04.hostedemail.com (Postfix) with ESMTP id EC90650000BD for ; Fri, 5 Nov 2021 20:42:31 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 0B67D61212; Fri, 5 Nov 2021 20:42:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144946; bh=OhwZO/6VR+8M5IxU7Vyo63sXZ0uh/tmUQX+2DrSa1RU=; h=Date:From:To:Subject:In-Reply-To:From; b=CINN7TJJsYPB6tovjDs0hbn/ZOxouCLT+ndaCGZcazYK3PZWpYgm8psLidnpLvZ5d 97Nto/ECMUlXr4jvheQBuYZA9y6vYw6nvbUBxnKWb2NIG2kP2RJ/y/dFx/m1ERC2Jp jM5ymdLwxKeSkjNgIKrMh2LpDTe1wAhdGJqnP/QY= Date: Fri, 05 Nov 2021 13:42:25 -0700 From: Andrew Morton To: adilger.kernel@dilger.ca, akpm@linux-foundation.org, corbet@lwn.net, david@fromorbit.com, djwong@kernel.org, hannes@cmpxchg.org, linux-mm@kvack.org, mgorman@techsingularity.net, mhocko@suse.com, mm-commits@vger.kernel.org, neilb@suse.de, riel@surriel.com, torvalds@linux-foundation.org, 
tytso@mit.edu, vbabka@suse.cz, willy@infradead.org Subject: [patch 149/262] mm/vmscan: throttle reclaim until some writeback completes if congested Message-ID: <20211105204225.iIh99P9cn%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf04.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=CINN7TJJ; dmarc=none; spf=pass (imf04.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: EC90650000BD X-Stat-Signature: igsgfjhns3rk34t8fnjt5ecgi56u9zew X-HE-Tag: 1636144951-145403 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

From: Mel Gorman
Subject: mm/vmscan: throttle reclaim until some writeback completes if congested

Patch series "Remove dependency on congestion_wait in mm/", v5.

This series removes all calls to congestion_wait in mm/ and deletes
wait_iff_congested. It's not a clever implementation but congestion_wait
has been broken for a long time
(https://lore.kernel.org/linux-mm/45d8b7a6-8548-65f5-cccf-9f451d4ae3d4@kernel.dk/).

Even if congestion throttling worked, it was never a great idea. While
excessive dirty/writeback pages at the tail of the LRU is one reason
that reclaim may be slow, there is also the problem of too many pages
being isolated and reclaim failing for other reasons (elevated
references, too many pages isolated, excessive LRU contention etc).

This series replaces the "congestion" throttling with 3 different types.

o If there are too many dirty/writeback pages, sleep until a timeout
  or enough pages get cleaned
o If too many pages are isolated, sleep until enough isolated pages
  are either reclaimed or put back on the LRU
o If no progress is being made, direct reclaim tasks sleep until
  another task makes progress with acceptable efficiency.

This was initially tested with a mix of workloads that used to trigger
corner cases that no longer work. A new test case was created called
"stutterp" (pagereclaim-stutterp-noreaders in mmtests) using a freshly
created XFS filesystem. Note that it may be necessary to increase the
timeout of ssh if executing remotely as ssh itself can get throttled and
the connection may time out.

stutterp varies the number of "worker" processes from 4 up to NR_CPUS*4
to check the impact as the number of direct reclaimers increases. It has
four types of worker.

o One "anon latency" worker creates small mappings with mmap() and
  times how long it takes to fault the mapping reading it 4K at a time
o X file writers which is fio randomly writing X files where the total
  size of the files adds up to the allowed dirty_ratio. fio is allowed
  to run for a warmup period to allow some file-backed pages to
  accumulate. The duration of the warmup is based on the best-case
  linear write speed of the storage.
o Y file readers which is fio randomly reading small files
o Z anon memory hogs which continually map (100-dirty_ratio)% of memory
o Total estimated WSS = (100+dirty_ratio) percentage of memory

X+Y+Z+1 == NR_WORKERS varying from 4 up to NR_CPUS*4

The intent is to maximise the total WSS with a mix of file and anon
memory where some anonymous memory must be swapped and there is a high
likelihood of dirty/writeback pages reaching the end of the LRU.

The test can be configured to have no background readers to stress
dirty/writeback pages. The results below are based on having zero
readers.

The short summary of the results is that the series works and stalls
until some event occurs but the timeouts may need adjustment.

The test results are not broken down by patch as the series should be
treated as one block that replaces a broken throttling mechanism with a
working one.

Finally, three machines were tested but I'm reporting the worst set of
results. The other two machines had much better latencies for example.

First the results of the "anon latency" latency

stutterp
                               5.15.0-rc1                5.15.0-rc1
                                  vanilla    mm-reclaimcongest-v5r4
Amean   mmap-4       31.4003 (     0.00%)    2661.0198 ( -8374.52%)
Amean   mmap-7       38.1641 (     0.00%)     149.2891 (  -291.18%)
Amean   mmap-12      60.0981 (     0.00%)     187.8105 (  -212.51%)
Amean   mmap-21     161.2699 (     0.00%)     213.9107 (   -32.64%)
Amean   mmap-30     174.5589 (     0.00%)     377.7548 (  -116.41%)
Amean   mmap-48    8106.8160 (     0.00%)    1070.5616 (    86.79%)
Stddev  mmap-4       41.3455 (     0.00%)   27573.9676 (-66591.66%)
Stddev  mmap-7       53.5556 (     0.00%)    4608.5860 ( -8505.23%)
Stddev  mmap-12     171.3897 (     0.00%)    5559.4542 ( -3143.75%)
Stddev  mmap-21    1506.6752 (     0.00%)    5746.2507 (  -281.39%)
Stddev  mmap-30     557.5806 (     0.00%)    7678.1624 ( -1277.05%)
Stddev  mmap-48   61681.5718 (     0.00%)   14507.2830 (    76.48%)
Max-90  mmap-4       31.4243 (     0.00%)      83.1457 (  -164.59%)
Max-90  mmap-7       41.0410 (     0.00%)      41.0720 (    -0.08%)
Max-90  mmap-12      66.5255 (     0.00%)      53.9073 (    18.97%)
Max-90  mmap-21     146.7479 (     0.00%)     105.9540 (    27.80%)
Max-90  mmap-30     193.9513 (     0.00%)      64.3067 (    66.84%)
Max-90  mmap-48     277.9137 (     0.00%)     591.0594 (  -112.68%)
Max     mmap-4     1913.8009 (     0.00%)  299623.9695 (-15555.96%)
Max     mmap-7     2423.9665 (     0.00%)  204453.1708 ( -8334.65%)
Max     mmap-12    6845.6573 (     0.00%)  221090.3366 ( -3129.64%)
Max     mmap-21   56278.6508 (     0.00%)  213877.3496 (  -280.03%)
Max     mmap-30   19716.2990 (     0.00%)  216287.6229 (  -997.00%)
Max     mmap-48  477923.9400 (     0.00%)  245414.8238 (    48.65%)

For most thread counts, the time to mmap() is unfortunately increased.
In earlier versions of the series, this was lower but a large number of
throttling events were reaching their timeout, increasing the amount of
inefficient scanning of the LRU. There is no prioritisation of reclaim
tasks making progress based on each task's rate of page allocation
versus progress of reclaim. The variance is also impacted for high
worker counts but in all cases, the differences in latency are not
statistically significant due to very large maximum outliers. Max-90
shows that 90% of the stalls are comparable but the Max results show the
massive outliers which are increased due to stalling.

It is expected that this will be very machine dependent. Due to the test
design, reclaim is difficult so allocations stall and there are
variances depending on whether THPs can be allocated or not. The amount
of memory will affect exactly how bad the corner cases are and how often
they trigger. The warmup period calculation is not ideal as it's based
on linear writes whereas fio is randomly writing multiple files from
multiple tasks so the start state of the test is variable. For example,
these are the latencies on a single-socket machine that had more memory

Amean   mmap-4       42.2287 (     0.00%)      49.6838 *   -17.65%*
Amean   mmap-7      216.4326 (     0.00%)      47.4451 *    78.08%*
Amean   mmap-12    2412.0588 (     0.00%)      51.7497 (    97.85%)
Amean   mmap-21    5546.2548 (     0.00%)      51.8862 (    99.06%)
Amean   mmap-30    1085.3121 (     0.00%)      72.1004 (    93.36%)

The overall system CPU usage and elapsed time is as follows

                      5.15.0-rc3             5.15.0-rc3
                         vanilla mm-reclaimcongest-v5r4
Duration User            6989.03                 983.42
Duration System          7308.12                 799.68
Duration Elapsed         2277.67                2092.98

The patches reduce system CPU usage by 89% as the vanilla kernel is
rarely stalling.

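The percentages quoted in these tables appear to be computed relative to the vanilla value, with a positive number meaning the patched kernel's figure is lower. A minimal sketch of that arithmetic, assuming this convention; pct_gain() is a hypothetical helper and not part of mmtests:

```c
/*
 * Relative gain in percent against a baseline.
 * Positive: the patched value is lower than the baseline.
 * Negative: the patched value is higher than the baseline.
 * Assumes baseline is non-zero.
 */
double pct_gain(double baseline, double patched)
{
	return (baseline - patched) / baseline * 100.0;
}
```

For instance, pct_gain(7308.12, 799.68) is roughly 89, in line with the quoted system CPU reduction, and pct_gain(8106.8160, 1070.5616) is roughly 86.79, in line with the Amean mmap-48 row. Small differences in the last digits are possible because the inputs here are the rounded values as printed.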
The high-level /proc/vmstats show

                                  5.15.0-rc1             5.15.0-rc1
                                     vanilla mm-reclaimcongest-v5r2
Ops Direct pages scanned       1056608451.00           503594991.00
Ops Kswapd pages scanned        109795048.00           147289810.00
Ops Kswapd pages reclaimed       63269243.00            31036005.00
Ops Direct pages reclaimed       10803973.00             6328887.00
Ops Kswapd efficiency %                57.62                  21.07
Ops Kswapd velocity                 48204.98               57572.86
Ops Direct efficiency %                 1.02                   1.26
Ops Direct velocity                463898.83              196845.97

Kswapd scanned less pages but the detailed pattern is different. The
vanilla kernel scans slowly over time whereas the patches exhibit burst
patterns of scan activity. Direct reclaim scanning is reduced by 52% due
to stalling.

The pattern for stealing pages is also slightly different. Both kernels
exhibit spikes but the vanilla kernel when reclaiming shows pages being
reclaimed over a period of time whereas the patches tend to reclaim in
spikes. The difference is that vanilla is not throttling and instead
scanning constantly finding some pages over time whereas the patched
kernel throttles and reclaims in spikes.

Ops Percentage direct scans            90.59                  77.37

For direct reclaim, vanilla scanned 90.59% of pages whereas with the
patches, 77.37% were direct reclaim due to throttling.

Ops Page writes by reclaim        2613590.00             1687131.00

Page writes from reclaim context are reduced.

Ops Page writes anon              2932752.00             1917048.00

And there is less swapping.

Ops Page reclaim immediate      996248528.00           107664764.00

The number of pages encountered at the tail of the LRU tagged for
immediate reclaim but still dirty/writeback is reduced by 89%.

Ops Slabs scanned                  164284.00              153608.00

Slab scan activity is similar.

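The derived rows above are consistent with simple ratios of the raw counters. The sketch below assumes that "efficiency %" is pages reclaimed over pages scanned and that "Percentage direct scans" is direct scans over direct plus kswapd scans. Both function names are hypothetical and this is not the mmtests implementation.

```c
/* Pages reclaimed per page scanned, as a percentage (0 if nothing scanned). */
double efficiency_pct(double reclaimed, double scanned)
{
	return scanned > 0.0 ? reclaimed / scanned * 100.0 : 0.0;
}

/* Share of all reclaim scanning that was done by direct reclaim. */
double direct_scan_share_pct(double direct_scanned, double kswapd_scanned)
{
	double total = direct_scanned + kswapd_scanned;

	return total > 0.0 ? direct_scanned / total * 100.0 : 0.0;
}
```

Fed with the vanilla counters, efficiency_pct(63269243, 109795048) comes out at about 57.62 and direct_scan_share_pct(1056608451, 109795048) at about 90.59. With the patched counters the same helpers give about 21.07 and 77.37, which agrees with the reported rows.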
ftrace was used to gather stall activity

Vanilla
-------
      1 writeback_wait_iff_congested: usec_timeout=100000 usec_delayed=16000
      2 writeback_wait_iff_congested: usec_timeout=100000 usec_delayed=12000
      8 writeback_wait_iff_congested: usec_timeout=100000 usec_delayed=8000
     29 writeback_wait_iff_congested: usec_timeout=100000 usec_delayed=4000
  82394 writeback_wait_iff_congested: usec_timeout=100000 usec_delayed=0

The vast majority of wait_iff_congested calls do not stall at all. What
is likely happening is that cond_resched() reschedules the task for a
short period when the BDI is not registering congestion (which it never
will in this test setup).

      1 writeback_congestion_wait: usec_timeout=100000 usec_delayed=120000
      2 writeback_congestion_wait: usec_timeout=100000 usec_delayed=132000
      4 writeback_congestion_wait: usec_timeout=100000 usec_delayed=112000
    380 writeback_congestion_wait: usec_timeout=100000 usec_delayed=108000
    778 writeback_congestion_wait: usec_timeout=100000 usec_delayed=104000

congestion_wait, if called, always exceeds the timeout as there is no
trigger to wake it up.

Bottom line: Vanilla will throttle but it's not effective.

Patch series
------------

Kswapd throttle activity was always due to scanning pages tagged for
immediate reclaim at the tail of the LRU

      1 usec_timeout=100000 usect_delayed=72000 reason=VMSCAN_THROTTLE_WRITEBACK
      4 usec_timeout=100000 usect_delayed=20000 reason=VMSCAN_THROTTLE_WRITEBACK
      5 usec_timeout=100000 usect_delayed=12000 reason=VMSCAN_THROTTLE_WRITEBACK
      6 usec_timeout=100000 usect_delayed=16000 reason=VMSCAN_THROTTLE_WRITEBACK
     11 usec_timeout=100000 usect_delayed=100000 reason=VMSCAN_THROTTLE_WRITEBACK
     11 usec_timeout=100000 usect_delayed=8000 reason=VMSCAN_THROTTLE_WRITEBACK
     94 usec_timeout=100000 usect_delayed=0 reason=VMSCAN_THROTTLE_WRITEBACK
    112 usec_timeout=100000 usect_delayed=4000 reason=VMSCAN_THROTTLE_WRITEBACK

The majority of events did not stall or stalled for a short period.
Roughly 16% of stalls reached the timeout before expiry. For direct
reclaim, the number of stalls for each reason was

   6624 reason=VMSCAN_THROTTLE_ISOLATED
  93246 reason=VMSCAN_THROTTLE_NOPROGRESS
  96934 reason=VMSCAN_THROTTLE_WRITEBACK

The most common reason to stall was excessive pages tagged for immediate
reclaim at the tail of the LRU, followed by a failure to make forward
progress. A relatively small number were due to too many pages isolated
from the LRU by parallel threads.

For VMSCAN_THROTTLE_ISOLATED, the breakdown of delays was

      9 usec_timeout=20000 usect_delayed=4000 reason=VMSCAN_THROTTLE_ISOLATED
     12 usec_timeout=20000 usect_delayed=16000 reason=VMSCAN_THROTTLE_ISOLATED
     83 usec_timeout=20000 usect_delayed=20000 reason=VMSCAN_THROTTLE_ISOLATED
   6520 usec_timeout=20000 usect_delayed=0 reason=VMSCAN_THROTTLE_ISOLATED

Most did not stall at all. A small number reached the timeout.

For VMSCAN_THROTTLE_NOPROGRESS, the breakdown of stalls was all over
the map

      1 usec_timeout=500000 usect_delayed=324000 reason=VMSCAN_THROTTLE_NOPROGRESS
      1 usec_timeout=500000 usect_delayed=332000 reason=VMSCAN_THROTTLE_NOPROGRESS
      1 usec_timeout=500000 usect_delayed=348000 reason=VMSCAN_THROTTLE_NOPROGRESS
      1 usec_timeout=500000 usect_delayed=360000 reason=VMSCAN_THROTTLE_NOPROGRESS
      2 usec_timeout=500000 usect_delayed=228000 reason=VMSCAN_THROTTLE_NOPROGRESS
      2 usec_timeout=500000 usect_delayed=260000 reason=VMSCAN_THROTTLE_NOPROGRESS
      2 usec_timeout=500000 usect_delayed=340000 reason=VMSCAN_THROTTLE_NOPROGRESS
      2 usec_timeout=500000 usect_delayed=364000 reason=VMSCAN_THROTTLE_NOPROGRESS
      2 usec_timeout=500000 usect_delayed=372000 reason=VMSCAN_THROTTLE_NOPROGRESS
      2 usec_timeout=500000 usect_delayed=428000 reason=VMSCAN_THROTTLE_NOPROGRESS
      2 usec_timeout=500000 usect_delayed=460000 reason=VMSCAN_THROTTLE_NOPROGRESS
      2 usec_timeout=500000 usect_delayed=464000 reason=VMSCAN_THROTTLE_NOPROGRESS
      3 usec_timeout=500000 usect_delayed=244000 reason=VMSCAN_THROTTLE_NOPROGRESS
      3 usec_timeout=500000 usect_delayed=252000 reason=VMSCAN_THROTTLE_NOPROGRESS
      3 usec_timeout=500000 usect_delayed=272000 reason=VMSCAN_THROTTLE_NOPROGRESS
      4 usec_timeout=500000 usect_delayed=188000 reason=VMSCAN_THROTTLE_NOPROGRESS
      4 usec_timeout=500000 usect_delayed=268000 reason=VMSCAN_THROTTLE_NOPROGRESS
      4 usec_timeout=500000 usect_delayed=328000 reason=VMSCAN_THROTTLE_NOPROGRESS
      4 usec_timeout=500000 usect_delayed=380000 reason=VMSCAN_THROTTLE_NOPROGRESS
      4 usec_timeout=500000 usect_delayed=392000 reason=VMSCAN_THROTTLE_NOPROGRESS
      4 usec_timeout=500000 usect_delayed=432000 reason=VMSCAN_THROTTLE_NOPROGRESS
      5 usec_timeout=500000 usect_delayed=204000 reason=VMSCAN_THROTTLE_NOPROGRESS
      5 usec_timeout=500000 usect_delayed=220000 reason=VMSCAN_THROTTLE_NOPROGRESS
      5 usec_timeout=500000 usect_delayed=412000 reason=VMSCAN_THROTTLE_NOPROGRESS
      5 usec_timeout=500000 usect_delayed=436000 reason=VMSCAN_THROTTLE_NOPROGRESS
      6 usec_timeout=500000 usect_delayed=488000 reason=VMSCAN_THROTTLE_NOPROGRESS
      7 usec_timeout=500000 usect_delayed=212000 reason=VMSCAN_THROTTLE_NOPROGRESS
      7 usec_timeout=500000 usect_delayed=300000 reason=VMSCAN_THROTTLE_NOPROGRESS
      7 usec_timeout=500000 usect_delayed=316000 reason=VMSCAN_THROTTLE_NOPROGRESS
      7 usec_timeout=500000 usect_delayed=472000 reason=VMSCAN_THROTTLE_NOPROGRESS
      8 usec_timeout=500000 usect_delayed=248000 reason=VMSCAN_THROTTLE_NOPROGRESS
      8 usec_timeout=500000 usect_delayed=356000 reason=VMSCAN_THROTTLE_NOPROGRESS
      8 usec_timeout=500000 usect_delayed=456000 reason=VMSCAN_THROTTLE_NOPROGRESS
      9 usec_timeout=500000 usect_delayed=124000 reason=VMSCAN_THROTTLE_NOPROGRESS
      9 usec_timeout=500000 usect_delayed=376000 reason=VMSCAN_THROTTLE_NOPROGRESS
      9 usec_timeout=500000 usect_delayed=484000 reason=VMSCAN_THROTTLE_NOPROGRESS
     10 usec_timeout=500000 usect_delayed=172000 reason=VMSCAN_THROTTLE_NOPROGRESS
     10 usec_timeout=500000 usect_delayed=420000 reason=VMSCAN_THROTTLE_NOPROGRESS
     10 usec_timeout=500000 usect_delayed=452000 reason=VMSCAN_THROTTLE_NOPROGRESS
     11 usec_timeout=500000 usect_delayed=256000 reason=VMSCAN_THROTTLE_NOPROGRESS
     12 usec_timeout=500000 usect_delayed=112000 reason=VMSCAN_THROTTLE_NOPROGRESS
     12 usec_timeout=500000 usect_delayed=116000 reason=VMSCAN_THROTTLE_NOPROGRESS
     12 usec_timeout=500000 usect_delayed=144000 reason=VMSCAN_THROTTLE_NOPROGRESS
     12 usec_timeout=500000 usect_delayed=152000 reason=VMSCAN_THROTTLE_NOPROGRESS
     12 usec_timeout=500000 usect_delayed=264000 reason=VMSCAN_THROTTLE_NOPROGRESS
     12 usec_timeout=500000 usect_delayed=384000 reason=VMSCAN_THROTTLE_NOPROGRESS
     12 usec_timeout=500000 usect_delayed=424000 reason=VMSCAN_THROTTLE_NOPROGRESS
     12 usec_timeout=500000 usect_delayed=492000 reason=VMSCAN_THROTTLE_NOPROGRESS
     13 usec_timeout=500000 usect_delayed=184000 reason=VMSCAN_THROTTLE_NOPROGRESS
     13 usec_timeout=500000 usect_delayed=444000 reason=VMSCAN_THROTTLE_NOPROGRESS
     14 usec_timeout=500000 usect_delayed=308000 reason=VMSCAN_THROTTLE_NOPROGRESS
     14 usec_timeout=500000 usect_delayed=440000 reason=VMSCAN_THROTTLE_NOPROGRESS
     14 usec_timeout=500000 usect_delayed=476000 reason=VMSCAN_THROTTLE_NOPROGRESS
     16 usec_timeout=500000 usect_delayed=140000 reason=VMSCAN_THROTTLE_NOPROGRESS
     17 usec_timeout=500000 usect_delayed=232000 reason=VMSCAN_THROTTLE_NOPROGRESS
     17 usec_timeout=500000 usect_delayed=240000 reason=VMSCAN_THROTTLE_NOPROGRESS
     17 usec_timeout=500000 usect_delayed=280000 reason=VMSCAN_THROTTLE_NOPROGRESS
     18 usec_timeout=500000 usect_delayed=404000 reason=VMSCAN_THROTTLE_NOPROGRESS
     20 usec_timeout=500000 usect_delayed=148000 reason=VMSCAN_THROTTLE_NOPROGRESS
     20 usec_timeout=500000 usect_delayed=216000 reason=VMSCAN_THROTTLE_NOPROGRESS
     20 usec_timeout=500000 usect_delayed=468000 reason=VMSCAN_THROTTLE_NOPROGRESS
     21 usec_timeout=500000 usect_delayed=448000 reason=VMSCAN_THROTTLE_NOPROGRESS
     23 usec_timeout=500000 usect_delayed=168000 reason=VMSCAN_THROTTLE_NOPROGRESS
     23 usec_timeout=500000 usect_delayed=296000 reason=VMSCAN_THROTTLE_NOPROGRESS
     25 usec_timeout=500000 usect_delayed=132000 reason=VMSCAN_THROTTLE_NOPROGRESS
     25 usec_timeout=500000 usect_delayed=352000 reason=VMSCAN_THROTTLE_NOPROGRESS
     26 usec_timeout=500000 usect_delayed=180000 reason=VMSCAN_THROTTLE_NOPROGRESS
     27 usec_timeout=500000 usect_delayed=284000 reason=VMSCAN_THROTTLE_NOPROGRESS
     28 usec_timeout=500000 usect_delayed=164000 reason=VMSCAN_THROTTLE_NOPROGRESS
     29 usec_timeout=500000 usect_delayed=136000 reason=VMSCAN_THROTTLE_NOPROGRESS
     30 usec_timeout=500000 usect_delayed=200000 reason=VMSCAN_THROTTLE_NOPROGRESS
     30 usec_timeout=500000 usect_delayed=400000 reason=VMSCAN_THROTTLE_NOPROGRESS
     31 usec_timeout=500000 usect_delayed=196000 reason=VMSCAN_THROTTLE_NOPROGRESS
     32 usec_timeout=500000 usect_delayed=156000 reason=VMSCAN_THROTTLE_NOPROGRESS
     33 usec_timeout=500000 usect_delayed=224000 reason=VMSCAN_THROTTLE_NOPROGRESS
     35 usec_timeout=500000 usect_delayed=128000 reason=VMSCAN_THROTTLE_NOPROGRESS
     35 usec_timeout=500000 usect_delayed=176000 reason=VMSCAN_THROTTLE_NOPROGRESS
     36 usec_timeout=500000 usect_delayed=368000 reason=VMSCAN_THROTTLE_NOPROGRESS
     36 usec_timeout=500000 usect_delayed=496000 reason=VMSCAN_THROTTLE_NOPROGRESS
     37 usec_timeout=500000 usect_delayed=312000 reason=VMSCAN_THROTTLE_NOPROGRESS
     38 usec_timeout=500000 usect_delayed=304000 reason=VMSCAN_THROTTLE_NOPROGRESS
     40 usec_timeout=500000 usect_delayed=288000 reason=VMSCAN_THROTTLE_NOPROGRESS
     43 usec_timeout=500000 usect_delayed=408000 reason=VMSCAN_THROTTLE_NOPROGRESS
     55 usec_timeout=500000 usect_delayed=416000 reason=VMSCAN_THROTTLE_NOPROGRESS
     56 usec_timeout=500000 usect_delayed=76000 reason=VMSCAN_THROTTLE_NOPROGRESS
     58 usec_timeout=500000 usect_delayed=120000 reason=VMSCAN_THROTTLE_NOPROGRESS
     59 usec_timeout=500000 usect_delayed=208000 reason=VMSCAN_THROTTLE_NOPROGRESS
     61 usec_timeout=500000 usect_delayed=68000 reason=VMSCAN_THROTTLE_NOPROGRESS
     71 usec_timeout=500000 usect_delayed=192000 reason=VMSCAN_THROTTLE_NOPROGRESS
     71 usec_timeout=500000 usect_delayed=480000 reason=VMSCAN_THROTTLE_NOPROGRESS
     79 usec_timeout=500000 usect_delayed=60000 reason=VMSCAN_THROTTLE_NOPROGRESS
     82 usec_timeout=500000 usect_delayed=320000 reason=VMSCAN_THROTTLE_NOPROGRESS
     82 usec_timeout=500000 usect_delayed=92000 reason=VMSCAN_THROTTLE_NOPROGRESS
     85 usec_timeout=500000 usect_delayed=64000 reason=VMSCAN_THROTTLE_NOPROGRESS
     85 usec_timeout=500000 usect_delayed=80000 reason=VMSCAN_THROTTLE_NOPROGRESS
     88 usec_timeout=500000 usect_delayed=84000 reason=VMSCAN_THROTTLE_NOPROGRESS
     90 usec_timeout=500000 usect_delayed=160000 reason=VMSCAN_THROTTLE_NOPROGRESS
     90 usec_timeout=500000 usect_delayed=292000 reason=VMSCAN_THROTTLE_NOPROGRESS
     94 usec_timeout=500000 usect_delayed=56000 reason=VMSCAN_THROTTLE_NOPROGRESS
    118 usec_timeout=500000 usect_delayed=88000 reason=VMSCAN_THROTTLE_NOPROGRESS
    119 usec_timeout=500000 usect_delayed=72000 reason=VMSCAN_THROTTLE_NOPROGRESS
    126 usec_timeout=500000 usect_delayed=108000 reason=VMSCAN_THROTTLE_NOPROGRESS
    146 usec_timeout=500000 usect_delayed=52000 reason=VMSCAN_THROTTLE_NOPROGRESS
    148 usec_timeout=500000 usect_delayed=36000 reason=VMSCAN_THROTTLE_NOPROGRESS
    148 usec_timeout=500000 usect_delayed=48000 reason=VMSCAN_THROTTLE_NOPROGRESS
    159 usec_timeout=500000 usect_delayed=28000 reason=VMSCAN_THROTTLE_NOPROGRESS
    178 usec_timeout=500000 usect_delayed=44000 reason=VMSCAN_THROTTLE_NOPROGRESS
    183 usec_timeout=500000 usect_delayed=40000 reason=VMSCAN_THROTTLE_NOPROGRESS
    237 usec_timeout=500000 usect_delayed=100000 reason=VMSCAN_THROTTLE_NOPROGRESS
    266 usec_timeout=500000 usect_delayed=32000 reason=VMSCAN_THROTTLE_NOPROGRESS
    313 usec_timeout=500000 usect_delayed=24000 reason=VMSCAN_THROTTLE_NOPROGRESS
    347 usec_timeout=500000 usect_delayed=96000 reason=VMSCAN_THROTTLE_NOPROGRESS
    470 usec_timeout=500000 usect_delayed=20000 reason=VMSCAN_THROTTLE_NOPROGRESS
    559 usec_timeout=500000 usect_delayed=16000 reason=VMSCAN_THROTTLE_NOPROGRESS
    964 usec_timeout=500000 usect_delayed=12000 reason=VMSCAN_THROTTLE_NOPROGRESS
   2001 usec_timeout=500000 usect_delayed=104000 reason=VMSCAN_THROTTLE_NOPROGRESS
   2447 usec_timeout=500000 usect_delayed=8000 reason=VMSCAN_THROTTLE_NOPROGRESS
   7888 usec_timeout=500000 usect_delayed=4000 reason=VMSCAN_THROTTLE_NOPROGRESS
  22727 usec_timeout=500000 usect_delayed=0 reason=VMSCAN_THROTTLE_NOPROGRESS
  51305 usec_timeout=500000 usect_delayed=500000 reason=VMSCAN_THROTTLE_NOPROGRESS

The full timeout is often hit but a large number also do not stall at
all. The remainder slept a little, allowing other reclaim tasks to make
progress.

While this timeout could be further increased, it could also negatively
impact worst-case behaviour when there is no prioritisation of what task
should make progress.

For VMSCAN_THROTTLE_WRITEBACK, the breakdown was

      1 usec_timeout=100000 usect_delayed=44000 reason=VMSCAN_THROTTLE_WRITEBACK
      2 usec_timeout=100000 usect_delayed=76000 reason=VMSCAN_THROTTLE_WRITEBACK
      3 usec_timeout=100000 usect_delayed=80000 reason=VMSCAN_THROTTLE_WRITEBACK
      5 usec_timeout=100000 usect_delayed=48000 reason=VMSCAN_THROTTLE_WRITEBACK
      5 usec_timeout=100000 usect_delayed=84000 reason=VMSCAN_THROTTLE_WRITEBACK
      6 usec_timeout=100000 usect_delayed=72000 reason=VMSCAN_THROTTLE_WRITEBACK
      7 usec_timeout=100000 usect_delayed=88000 reason=VMSCAN_THROTTLE_WRITEBACK
     11 usec_timeout=100000 usect_delayed=56000 reason=VMSCAN_THROTTLE_WRITEBACK
     12 usec_timeout=100000 usect_delayed=64000 reason=VMSCAN_THROTTLE_WRITEBACK
     16 usec_timeout=100000 usect_delayed=92000 reason=VMSCAN_THROTTLE_WRITEBACK
     24 usec_timeout=100000 usect_delayed=68000 reason=VMSCAN_THROTTLE_WRITEBACK
     28 usec_timeout=100000 usect_delayed=32000 reason=VMSCAN_THROTTLE_WRITEBACK
     30 usec_timeout=100000 usect_delayed=60000 reason=VMSCAN_THROTTLE_WRITEBACK
     30 usec_timeout=100000 usect_delayed=96000 reason=VMSCAN_THROTTLE_WRITEBACK
     32 usec_timeout=100000 usect_delayed=52000 reason=VMSCAN_THROTTLE_WRITEBACK
     42 usec_timeout=100000 usect_delayed=40000 reason=VMSCAN_THROTTLE_WRITEBACK
     77 usec_timeout=100000 usect_delayed=28000 reason=VMSCAN_THROTTLE_WRITEBACK
     99 usec_timeout=100000 usect_delayed=36000 reason=VMSCAN_THROTTLE_WRITEBACK
    137 usec_timeout=100000 usect_delayed=24000 reason=VMSCAN_THROTTLE_WRITEBACK
    190 usec_timeout=100000 usect_delayed=20000 reason=VMSCAN_THROTTLE_WRITEBACK
    339 usec_timeout=100000 usect_delayed=16000 reason=VMSCAN_THROTTLE_WRITEBACK
    518 usec_timeout=100000 usect_delayed=12000 reason=VMSCAN_THROTTLE_WRITEBACK
    852 usec_timeout=100000 usect_delayed=8000 reason=VMSCAN_THROTTLE_WRITEBACK
   3359 usec_timeout=100000 usect_delayed=4000 reason=VMSCAN_THROTTLE_WRITEBACK
   7147 usec_timeout=100000 usect_delayed=0 reason=VMSCAN_THROTTLE_WRITEBACK
  83962 usec_timeout=100000 usect_delayed=100000 reason=VMSCAN_THROTTLE_WRITEBACK

The majority hit the timeout in direct reclaim context although a
sizable number did not stall at all. This is very different to kswapd
where only a tiny percentage of stalls due to writeback reached the
timeout.

Bottom line, the throttling appears to work and the wakeup events may
limit worst-case stalls. There might be some grounds for adjusting
timeouts but it's likely futile as the worst-case scenarios depend on
the workload, memory size and the speed of the storage. A better
approach to improve the series further would be to prioritise tasks
based on their rate of allocation, with the caveat that it may be very
expensive to track.

This patch (of 5):

Page reclaim throttles on wait_iff_congested under the following
conditions:

o kswapd is encountering pages under writeback and marked for immediate
  reclaim implying that pages are cycling through the LRU faster than
  pages can be cleaned.

o Direct reclaim will stall if all dirty pages are backed by congested
  inodes.

wait_iff_congested is almost completely broken with few exceptions.
This patch adds a new node-based waitqueue and tracks the number of
throttled tasks and pages written back since throttling started.
If enough pages belonging to the node are written back then the
throttled tasks will wake early. If not, the throttled tasks sleep
until the timeout expires.

[neilb@suse.de: Uninterruptible sleep and simpler wakeups]
[hdanton@sina.com: Avoid race when reclaim starts]
[vbabka@suse.cz: vmstat irq-safe api, clarifications]
Link: https://lkml.kernel.org/r/20211022144651.19914-1-mgorman@techsingularity.net
Link: https://lkml.kernel.org/r/20211022144651.19914-2-mgorman@techsingularity.net
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
Cc: NeilBrown
Cc: "Theodore Ts'o"
Cc: Andreas Dilger
Cc: "Darrick J. Wong"
Cc: Matthew Wilcox
Cc: Michal Hocko
Cc: Dave Chinner
Cc: Rik van Riel
Cc: Johannes Weiner
Cc: Jonathan Corbet
Signed-off-by: Andrew Morton
---

 include/linux/backing-dev.h      |    1
 include/linux/mmzone.h           |   13 ++++
 include/trace/events/vmscan.h    |   34 ++++++++++++
 include/trace/events/writeback.h |    7 --
 mm/backing-dev.c                 |   48 ----------------
 mm/filemap.c                     |    1
 mm/internal.h                    |   11 +++
 mm/page_alloc.c                  |    5 +
 mm/vmscan.c                      |   82 ++++++++++++++++++++++++-----
 mm/vmstat.c                      |    1
 10 files changed, 135 insertions(+), 68 deletions(-)

--- a/include/linux/backing-dev.h~mm-vmscan-throttle-reclaim-until-some-writeback-completes-if-congested
+++ a/include/linux/backing-dev.h
@@ -154,7 +154,6 @@ static inline int wb_congested(struct bd
 }
 
 long congestion_wait(int sync, long timeout);
-long wait_iff_congested(int sync, long timeout);
 
 static inline bool mapping_can_writeback(struct address_space *mapping)
 {
--- a/include/linux/mmzone.h~mm-vmscan-throttle-reclaim-until-some-writeback-completes-if-congested
+++ a/include/linux/mmzone.h
@@ -199,6 +199,7 @@ enum node_stat_item {
 	NR_VMSCAN_IMMEDIATE,	/* Prioritise for reclaim when writeback ends */
 	NR_DIRTIED,		/* page dirtyings since bootup */
 	NR_WRITTEN,		/* page writings since bootup */
+	NR_THROTTLED_WRITTEN,	/* NR_WRITTEN while reclaim throttled */
 	NR_KERNEL_MISC_RECLAIMABLE,	/* reclaimable non-slab kernel pages */
 	NR_FOLL_PIN_ACQUIRED,	/* via: 
pin_user_page(), gup flag: FOLL_PIN */ NR_FOLL_PIN_RELEASED, /* pages returned via unpin_user_page() */ @@ -272,6 +273,11 @@ enum lru_list { NR_LRU_LISTS }; +enum vmscan_throttle_state { + VMSCAN_THROTTLE_WRITEBACK, + NR_VMSCAN_THROTTLE, +}; + #define for_each_lru(lru) for (lru = 0; lru < NR_LRU_LISTS; lru++) #define for_each_evictable_lru(lru) for (lru = 0; lru <= LRU_ACTIVE_FILE; lru++) @@ -841,6 +847,13 @@ typedef struct pglist_data { int node_id; wait_queue_head_t kswapd_wait; wait_queue_head_t pfmemalloc_wait; + + /* workqueues for throttling reclaim for different reasons. */ + wait_queue_head_t reclaim_wait[NR_VMSCAN_THROTTLE]; + + atomic_t nr_writeback_throttled;/* nr of writeback-throttled tasks */ + unsigned long nr_reclaim_start; /* nr pages written while throttled + * when throttling started. */ struct task_struct *kswapd; /* Protected by mem_hotplug_begin/end() */ int kswapd_order; --- a/include/trace/events/vmscan.h~mm-vmscan-throttle-reclaim-until-some-writeback-completes-if-congested +++ a/include/trace/events/vmscan.h @@ -27,6 +27,14 @@ {RECLAIM_WB_ASYNC, "RECLAIM_WB_ASYNC"} \ ) : "RECLAIM_WB_NONE" +#define _VMSCAN_THROTTLE_WRITEBACK (1 << VMSCAN_THROTTLE_WRITEBACK) + +#define show_throttle_flags(flags) \ + (flags) ? __print_flags(flags, "|", \ + {_VMSCAN_THROTTLE_WRITEBACK, "VMSCAN_THROTTLE_WRITEBACK"} \ + ) : "VMSCAN_THROTTLE_NONE" + + #define trace_reclaim_flags(file) ( \ (file ? 
RECLAIM_WB_FILE : RECLAIM_WB_ANON) | \ (RECLAIM_WB_ASYNC) \ @@ -454,6 +462,32 @@ DEFINE_EVENT(mm_vmscan_direct_reclaim_en TP_ARGS(nr_reclaimed) ); +TRACE_EVENT(mm_vmscan_throttled, + + TP_PROTO(int nid, int usec_timeout, int usec_delayed, int reason), + + TP_ARGS(nid, usec_timeout, usec_delayed, reason), + + TP_STRUCT__entry( + __field(int, nid) + __field(int, usec_timeout) + __field(int, usec_delayed) + __field(int, reason) + ), + + TP_fast_assign( + __entry->nid = nid; + __entry->usec_timeout = usec_timeout; + __entry->usec_delayed = usec_delayed; + __entry->reason = 1U << reason; + ), + + TP_printk("nid=%d usec_timeout=%d usect_delayed=%d reason=%s", + __entry->nid, + __entry->usec_timeout, + __entry->usec_delayed, + show_throttle_flags(__entry->reason)) +); #endif /* _TRACE_VMSCAN_H */ /* This part must be outside protection */ --- a/include/trace/events/writeback.h~mm-vmscan-throttle-reclaim-until-some-writeback-completes-if-congested +++ a/include/trace/events/writeback.h @@ -763,13 +763,6 @@ DEFINE_EVENT(writeback_congest_waited_te TP_ARGS(usec_timeout, usec_delayed) ); -DEFINE_EVENT(writeback_congest_waited_template, writeback_wait_iff_congested, - - TP_PROTO(unsigned int usec_timeout, unsigned int usec_delayed), - - TP_ARGS(usec_timeout, usec_delayed) -); - DECLARE_EVENT_CLASS(writeback_single_inode_template, TP_PROTO(struct inode *inode, --- a/mm/backing-dev.c~mm-vmscan-throttle-reclaim-until-some-writeback-completes-if-congested +++ a/mm/backing-dev.c @@ -1038,51 +1038,3 @@ long congestion_wait(int sync, long time return ret; } EXPORT_SYMBOL(congestion_wait); - -/** - * wait_iff_congested - Conditionally wait for a backing_dev to become uncongested or a pgdat to complete writes - * @sync: SYNC or ASYNC IO - * @timeout: timeout in jiffies - * - * In the event of a congested backing_dev (any backing_dev) this waits - * for up to @timeout jiffies for either a BDI to exit congestion of the - * given @sync queue or a write to complete. 
- * - * The return value is 0 if the sleep is for the full timeout. Otherwise, - * it is the number of jiffies that were still remaining when the function - * returned. return_value == timeout implies the function did not sleep. - */ -long wait_iff_congested(int sync, long timeout) -{ - long ret; - unsigned long start = jiffies; - DEFINE_WAIT(wait); - wait_queue_head_t *wqh = &congestion_wqh[sync]; - - /* - * If there is no congestion, yield if necessary instead - * of sleeping on the congestion queue - */ - if (atomic_read(&nr_wb_congested[sync]) == 0) { - cond_resched(); - - /* In case we scheduled, work out time remaining */ - ret = timeout - (jiffies - start); - if (ret < 0) - ret = 0; - - goto out; - } - - /* Sleep until uncongested or a write happens */ - prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE); - ret = io_schedule_timeout(timeout); - finish_wait(wqh, &wait); - -out: - trace_writeback_wait_iff_congested(jiffies_to_usecs(timeout), - jiffies_to_usecs(jiffies - start)); - - return ret; -} -EXPORT_SYMBOL(wait_iff_congested); --- a/mm/filemap.c~mm-vmscan-throttle-reclaim-until-some-writeback-completes-if-congested +++ a/mm/filemap.c @@ -1612,6 +1612,7 @@ void end_page_writeback(struct page *pag smp_mb__after_atomic(); wake_up_page(page, PG_writeback); + acct_reclaim_writeback(page); put_page(page); } EXPORT_SYMBOL(end_page_writeback); --- a/mm/internal.h~mm-vmscan-throttle-reclaim-until-some-writeback-completes-if-congested +++ a/mm/internal.h @@ -34,6 +34,17 @@ void page_writeback_init(void); +void __acct_reclaim_writeback(pg_data_t *pgdat, struct page *page, + int nr_throttled); +static inline void acct_reclaim_writeback(struct page *page) +{ + pg_data_t *pgdat = page_pgdat(page); + int nr_throttled = atomic_read(&pgdat->nr_writeback_throttled); + + if (nr_throttled) + __acct_reclaim_writeback(pgdat, page, nr_throttled); +} + vm_fault_t do_swap_page(struct vm_fault *vmf); void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma, --- 
a/mm/page_alloc.c~mm-vmscan-throttle-reclaim-until-some-writeback-completes-if-congested +++ a/mm/page_alloc.c @@ -7408,6 +7408,8 @@ static void pgdat_init_kcompactd(struct static void __meminit pgdat_init_internals(struct pglist_data *pgdat) { + int i; + pgdat_resize_init(pgdat); pgdat_init_split_queue(pgdat); @@ -7416,6 +7418,9 @@ static void __meminit pgdat_init_interna init_waitqueue_head(&pgdat->kswapd_wait); init_waitqueue_head(&pgdat->pfmemalloc_wait); + for (i = 0; i < NR_VMSCAN_THROTTLE; i++) + init_waitqueue_head(&pgdat->reclaim_wait[i]); + pgdat_page_ext_init(pgdat); lruvec_init(&pgdat->__lruvec); } --- a/mm/vmscan.c~mm-vmscan-throttle-reclaim-until-some-writeback-completes-if-congested +++ a/mm/vmscan.c @@ -1006,6 +1006,64 @@ static void handle_write_error(struct ad unlock_page(page); } +static void +reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason, + long timeout) +{ + wait_queue_head_t *wqh = &pgdat->reclaim_wait[reason]; + long ret; + DEFINE_WAIT(wait); + + /* + * Do not throttle IO workers, kthreads other than kswapd or + * workqueues. They may be required for reclaim to make + * forward progress (e.g. journalling workqueues or kthreads). + */ + if (!current_is_kswapd() && + current->flags & (PF_IO_WORKER|PF_KTHREAD)) + return; + + if (atomic_inc_return(&pgdat->nr_writeback_throttled) == 1) { + WRITE_ONCE(pgdat->nr_reclaim_start, + node_page_state(pgdat, NR_THROTTLED_WRITTEN)); + } + + prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE); + ret = schedule_timeout(timeout); + finish_wait(wqh, &wait); + atomic_dec(&pgdat->nr_writeback_throttled); + + trace_mm_vmscan_throttled(pgdat->node_id, jiffies_to_usecs(timeout), + jiffies_to_usecs(timeout - ret), + reason); +} + +/* + * Account for pages written if tasks are throttled waiting on dirty + * pages to clean. If enough pages have been cleaned since throttling + * started then wakeup the throttled tasks. 
+ */ +void __acct_reclaim_writeback(pg_data_t *pgdat, struct page *page, + int nr_throttled) +{ + unsigned long nr_written; + + inc_node_page_state(page, NR_THROTTLED_WRITTEN); + + /* + * This is an inaccurate read as the per-cpu deltas may not + * be synchronised. However, given that the system is + * writeback throttled, it is not worth taking the penalty + * of getting an accurate count. At worst, the throttle + * timeout guarantees forward progress. + */ + nr_written = node_page_state(pgdat, NR_THROTTLED_WRITTEN) - + READ_ONCE(pgdat->nr_reclaim_start); + + if (nr_written > SWAP_CLUSTER_MAX * nr_throttled) + wake_up(&pgdat->reclaim_wait[VMSCAN_THROTTLE_WRITEBACK]); +} + /* possible outcome of pageout() */ typedef enum { /* failed to write page out, page is locked */ @@ -1411,9 +1469,8 @@ retry: /* * The number of dirty pages determines if a node is marked - * reclaim_congested which affects wait_iff_congested. kswapd - * will stall and start writing pages if the tail of the LRU - * is all dirty unqueued pages. + * reclaim_congested. kswapd will stall and start writing + * pages if the tail of the LRU is all dirty unqueued pages. */ page_check_dirty_writeback(page, &dirty, &writeback); if (dirty || writeback) @@ -3179,19 +3236,19 @@ again: * If kswapd scans pages marked for immediate * reclaim and under writeback (nr_immediate), it * implies that pages are cycling through the LRU - * faster than they are written so also forcibly stall. + * faster than they are written so forcibly stall + * until some pages complete writeback. */ if (sc->nr.immediate) - congestion_wait(BLK_RW_ASYNC, HZ/10); + reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK, HZ/10); } /* - * Tag a node/memcg as congested if all the dirty pages - * scanned were backed by a congested BDI and - * wait_iff_congested will stall. + * Tag a node/memcg as congested if all the dirty pages were marked + * for writeback and immediate reclaim (counted in nr.congested). 
* * Legacy memcg will stall in page writeback so avoid forcibly - * stalling in wait_iff_congested(). + * stalling in reclaim_throttle(). */ if ((current_is_kswapd() || (cgroup_reclaim(sc) && writeback_throttling_sane(sc))) && @@ -3199,15 +3256,15 @@ again: set_bit(LRUVEC_CONGESTED, &target_lruvec->flags); /* - * Stall direct reclaim for IO completions if underlying BDIs - * and node is congested. Allow kswapd to continue until it + * Stall direct reclaim for IO completions if the lruvec is + * node is congested. Allow kswapd to continue until it * starts encountering unqueued dirty pages or cycling through * the LRU too quickly. */ if (!current_is_kswapd() && current_may_throttle() && !sc->hibernation_mode && test_bit(LRUVEC_CONGESTED, &target_lruvec->flags)) - wait_iff_congested(BLK_RW_ASYNC, HZ/10); + reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK, HZ/10); if (should_continue_reclaim(pgdat, sc->nr_reclaimed - nr_reclaimed, sc)) @@ -4285,6 +4342,7 @@ static int kswapd(void *p) WRITE_ONCE(pgdat->kswapd_order, 0); WRITE_ONCE(pgdat->kswapd_highest_zoneidx, MAX_NR_ZONES); + atomic_set(&pgdat->nr_writeback_throttled, 0); for ( ; ; ) { bool ret; --- a/mm/vmstat.c~mm-vmscan-throttle-reclaim-until-some-writeback-completes-if-congested +++ a/mm/vmstat.c @@ -1225,6 +1225,7 @@ const char * const vmstat_text[] = { "nr_vmscan_immediate_reclaim", "nr_dirtied", "nr_written", + "nr_throttled_written", "nr_kernel_misc_reclaimable", "nr_foll_pin_acquired", "nr_foll_pin_released", From patchwork Fri Nov 5 20:42:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605661 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 33CECC433FE for ; Fri, 5 Nov 2021 20:42:32 +0000 (UTC) Received: from 
kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id DF20361216 for ; Fri, 5 Nov 2021 20:42:31 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org DF20361216 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 8509C940091; Fri, 5 Nov 2021 16:42:31 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 800B5940090; Fri, 5 Nov 2021 16:42:31 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 716C4940091; Fri, 5 Nov 2021 16:42:31 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0122.hostedemail.com [216.40.44.122]) by kanga.kvack.org (Postfix) with ESMTP id 6147B940090 for ; Fri, 5 Nov 2021 16:42:31 -0400 (EDT) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 1E96776AEE for ; Fri, 5 Nov 2021 20:42:31 +0000 (UTC) X-FDA: 78776049702.11.A59DC60 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf19.hostedemail.com (Postfix) with ESMTP id 5960CB0000BE for ; Fri, 5 Nov 2021 20:42:23 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 6A6A26120D; Fri, 5 Nov 2021 20:42:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144949; bh=waDO+7uAcViGltZ2gJRZmyPHaTt2qcSaomoLk2KCndY=; h=Date:From:To:Subject:In-Reply-To:From; b=ceu+ab1ONEQabof7tX/2qwx+VdnvgBxeGoXwUVP1INjaMBCJ0nI2v2nboSgUmINn9 U4xGum2t1Z2KSqO9qXE1SEaVLKN2OiSKjjfrvyQB25UqNNOQHk8fOVOvXbwZwMjEfQ QJRiEaNkFvlGARPSDXecUQBLWfdu93TGljrj2fGE= Date: Fri, 05 Nov 2021 13:42:29 -0700 From: Andrew Morton To: adilger.kernel@dilger.ca, akpm@linux-foundation.org, corbet@lwn.net, david@fromorbit.com, djwong@kernel.org, 
hannes@cmpxchg.org, linux-mm@kvack.org, mgorman@techsingularity.net, mhocko@suse.com, mm-commits@vger.kernel.org, neilb@suse.de, riel@surriel.com, torvalds@linux-foundation.org, tytso@mit.edu, vbabka@suse.cz, willy@infradead.org Subject: [patch 150/262] mm/vmscan: throttle reclaim and compaction when too may pages are isolated Message-ID: <20211105204229.eXBgYXP95%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=ceu+ab1O; dmarc=none; spf=pass (imf19.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 5960CB0000BE X-Stat-Signature: dsziimst58byn3pebhkwa8j1yyjquo6i X-HE-Tag: 1636144943-624622 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Mel Gorman Subject: mm/vmscan: throttle reclaim and compaction when too may pages are isolated Page reclaim throttles on congestion if too many parallel reclaim instances have isolated too many pages. This makes no sense, excessive parallelisation has nothing to do with writeback or congestion. This patch creates an additional workqueue to sleep on when too many pages are isolated. The throttled tasks are woken when the number of isolated pages is reduced or a timeout occurs. There may be some false positive wakeups for GFP_NOIO/GFP_NOFS callers but the tasks will throttle again if necessary. [shy828301@gmail.com: Wake up from compaction context] [vbabka@suse.cz: Account number of throttled tasks only for writeback] Link: https://lkml.kernel.org/r/20211022144651.19914-3-mgorman@techsingularity.net Signed-off-by: Mel Gorman Acked-by: Vlastimil Babka Cc: Andreas Dilger Cc: "Darrick J . 
Wong" Cc: Dave Chinner Cc: Johannes Weiner Cc: Jonathan Corbet Cc: Matthew Wilcox Cc: Michal Hocko Cc: NeilBrown Cc: Rik van Riel Cc: "Theodore Ts'o" Signed-off-by: Andrew Morton --- include/linux/mmzone.h | 1 + include/trace/events/vmscan.h | 4 +++- mm/compaction.c | 10 ++++++++-- mm/internal.h | 11 +++++++++++ mm/vmscan.c | 22 ++++++++++++++++------ 5 files changed, 39 insertions(+), 9 deletions(-) --- a/include/linux/mmzone.h~mm-vmscan-throttle-reclaim-and-compaction-when-too-may-pages-are-isolated +++ a/include/linux/mmzone.h @@ -275,6 +275,7 @@ enum lru_list { enum vmscan_throttle_state { VMSCAN_THROTTLE_WRITEBACK, + VMSCAN_THROTTLE_ISOLATED, NR_VMSCAN_THROTTLE, }; --- a/include/trace/events/vmscan.h~mm-vmscan-throttle-reclaim-and-compaction-when-too-may-pages-are-isolated +++ a/include/trace/events/vmscan.h @@ -28,10 +28,12 @@ ) : "RECLAIM_WB_NONE" #define _VMSCAN_THROTTLE_WRITEBACK (1 << VMSCAN_THROTTLE_WRITEBACK) +#define _VMSCAN_THROTTLE_ISOLATED (1 << VMSCAN_THROTTLE_ISOLATED) #define show_throttle_flags(flags) \ (flags) ? 
__print_flags(flags, "|", \ - {_VMSCAN_THROTTLE_WRITEBACK, "VMSCAN_THROTTLE_WRITEBACK"} \ + {_VMSCAN_THROTTLE_WRITEBACK, "VMSCAN_THROTTLE_WRITEBACK"}, \ + {_VMSCAN_THROTTLE_ISOLATED, "VMSCAN_THROTTLE_ISOLATED"} \ ) : "VMSCAN_THROTTLE_NONE" --- a/mm/compaction.c~mm-vmscan-throttle-reclaim-and-compaction-when-too-may-pages-are-isolated +++ a/mm/compaction.c @@ -761,6 +761,8 @@ isolate_freepages_range(struct compact_c /* Similar to reclaim, but different enough that they don't share logic */ static bool too_many_isolated(pg_data_t *pgdat) { + bool too_many; + unsigned long active, inactive, isolated; inactive = node_page_state(pgdat, NR_INACTIVE_FILE) + @@ -770,7 +772,11 @@ static bool too_many_isolated(pg_data_t isolated = node_page_state(pgdat, NR_ISOLATED_FILE) + node_page_state(pgdat, NR_ISOLATED_ANON); - return isolated > (inactive + active) / 2; + too_many = isolated > (inactive + active) / 2; + if (!too_many) + wake_throttle_isolated(pgdat); + + return too_many; } /** @@ -822,7 +828,7 @@ isolate_migratepages_block(struct compac if (cc->mode == MIGRATE_ASYNC) return -EAGAIN; - congestion_wait(BLK_RW_ASYNC, HZ/10); + reclaim_throttle(pgdat, VMSCAN_THROTTLE_ISOLATED, HZ/10); if (fatal_signal_pending(current)) return -EINTR; --- a/mm/internal.h~mm-vmscan-throttle-reclaim-and-compaction-when-too-may-pages-are-isolated +++ a/mm/internal.h @@ -45,6 +45,15 @@ static inline void acct_reclaim_writebac __acct_reclaim_writeback(pgdat, page, nr_throttled); } +static inline void wake_throttle_isolated(pg_data_t *pgdat) +{ + wait_queue_head_t *wqh; + + wqh = &pgdat->reclaim_wait[VMSCAN_THROTTLE_ISOLATED]; + if (waitqueue_active(wqh)) + wake_up(wqh); +} + vm_fault_t do_swap_page(struct vm_fault *vmf); void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma, @@ -121,6 +130,8 @@ extern unsigned long highest_memmap_pfn; */ extern int isolate_lru_page(struct page *page); extern void putback_lru_page(struct page *page); +extern void reclaim_throttle(pg_data_t 
*pgdat, enum vmscan_throttle_state reason, + long timeout); /* * in mm/rmap.c: --- a/mm/vmscan.c~mm-vmscan-throttle-reclaim-and-compaction-when-too-may-pages-are-isolated +++ a/mm/vmscan.c @@ -1006,12 +1006,12 @@ static void handle_write_error(struct ad unlock_page(page); } -static void -reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason, +void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason, long timeout) { wait_queue_head_t *wqh = &pgdat->reclaim_wait[reason]; long ret; + bool acct_writeback = (reason == VMSCAN_THROTTLE_WRITEBACK); DEFINE_WAIT(wait); /* @@ -1023,7 +1023,8 @@ reclaim_throttle(pg_data_t *pgdat, enum current->flags & (PF_IO_WORKER|PF_KTHREAD)) return; - if (atomic_inc_return(&pgdat->nr_writeback_throttled) == 1) { + if (acct_writeback && + atomic_inc_return(&pgdat->nr_writeback_throttled) == 1) { WRITE_ONCE(pgdat->nr_reclaim_start, node_page_state(pgdat, NR_THROTTLED_WRITTEN)); } @@ -1031,7 +1032,9 @@ reclaim_throttle(pg_data_t *pgdat, enum prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE); ret = schedule_timeout(timeout); finish_wait(wqh, &wait); - atomic_dec(&pgdat->nr_writeback_throttled); + + if (acct_writeback) + atomic_dec(&pgdat->nr_writeback_throttled); trace_mm_vmscan_throttled(pgdat->node_id, jiffies_to_usecs(timeout), jiffies_to_usecs(timeout - ret), @@ -2175,6 +2178,7 @@ static int too_many_isolated(struct pgli struct scan_control *sc) { unsigned long inactive, isolated; + bool too_many; if (current_is_kswapd()) return 0; @@ -2198,7 +2202,13 @@ static int too_many_isolated(struct pgli if ((sc->gfp_mask & (__GFP_IO | __GFP_FS)) == (__GFP_IO | __GFP_FS)) inactive >>= 3; - return isolated > inactive; + too_many = isolated > inactive; + + /* Wake up tasks throttled due to too_many_isolated. */ + if (!too_many) + wake_throttle_isolated(pgdat); + + return too_many; } /* @@ -2307,8 +2317,8 @@ shrink_inactive_list(unsigned long nr_to return 0; /* wait a bit for the reclaimer. 
*/ - msleep(100); stalled = true; + reclaim_throttle(pgdat, VMSCAN_THROTTLE_ISOLATED, HZ/10); /* We are about to die and free our memory. Return now. */ if (fatal_signal_pending(current)) From patchwork Fri Nov 5 20:42:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605739 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 54321C4332F for ; Fri, 5 Nov 2021 20:43:36 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 0D3776135F for ; Fri, 5 Nov 2021 20:43:36 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 0D3776135F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id D59989400A1; Fri, 5 Nov 2021 16:43:34 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D08B9940093; Fri, 5 Nov 2021 16:43:34 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BD1BE9400A1; Fri, 5 Nov 2021 16:43:34 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0101.hostedemail.com [216.40.44.101]) by kanga.kvack.org (Postfix) with ESMTP id AE7A0940093 for ; Fri, 5 Nov 2021 16:43:34 -0400 (EDT) Received: from smtpin04.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 7167C18481CCD for ; Fri, 5 Nov 2021 20:43:34 +0000 (UTC) X-FDA: 78776052348.04.D347A97 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf21.hostedemail.com (Postfix) with ESMTP id 1D87DD036A5D for ; 
Fri, 5 Nov 2021 20:42:29 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id C4CB461216; Fri, 5 Nov 2021 20:42:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144953; bh=yV+UpPil/Eei/Q8jjLU3kcevaHT2ui9lS3AaoXw304Y=; h=Date:From:To:Subject:In-Reply-To:From; b=jzWIfqQveXgCRwoRAy3TJBB43YOvIPMejD2HYokr+lTUSYKPv9/daUO7+0GMRGTPF n6ZJy2/DPzhDNrAeXXjVi1ceBZW3cqblaHXod9+MfURh7Gn3aaQKeyvCXpJhdaei6p K6nTufA5IJsnKfvMwIpO7LjDk1XKP04d4QKYzBGo= Date: Fri, 05 Nov 2021 13:42:32 -0700 From: Andrew Morton To: adilger.kernel@dilger.ca, akpm@linux-foundation.org, corbet@lwn.net, david@fromorbit.com, djwong@kernel.org, hannes@cmpxchg.org, linux-mm@kvack.org, mgorman@techsingularity.net, mhocko@suse.com, mm-commits@vger.kernel.org, neilb@suse.de, riel@surriel.com, torvalds@linux-foundation.org, tytso@mit.edu, vbabka@suse.cz, willy@infradead.org Subject: [patch 151/262] mm/vmscan: throttle reclaim when no progress is being made Message-ID: <20211105204232.0b_He_hZC%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 1D87DD036A5D X-Stat-Signature: afrj4ujq85x5b719ws5tr3xu47k3ngic Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=jzWIfqQv; dmarc=none; spf=pass (imf21.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144949-556901 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Mel Gorman Subject: mm/vmscan: throttle reclaim when no progress is being made Memcg reclaim throttles on congestion if no reclaim progress is made. This makes little sense, it might be due to writeback or a host of other factors. 
For !memcg reclaim, it's messy. Direct reclaim primarily is throttled in the page allocator if it is failing to make progress. Kswapd throttles if too many pages are under writeback and marked for immediate reclaim. This patch explicitly throttles if reclaim is failing to make progress. [vbabka@suse.cz: Remove redundant code] Link: https://lkml.kernel.org/r/20211022144651.19914-4-mgorman@techsingularity.net Signed-off-by: Mel Gorman Acked-by: Vlastimil Babka Cc: Andreas Dilger Cc: "Darrick J . Wong" Cc: Dave Chinner Cc: Johannes Weiner Cc: Jonathan Corbet Cc: Matthew Wilcox Cc: Michal Hocko Cc: NeilBrown Cc: Rik van Riel Cc: "Theodore Ts'o" Signed-off-by: Andrew Morton --- include/linux/mmzone.h | 1 + include/trace/events/vmscan.h | 4 +++- mm/memcontrol.c | 10 +--------- mm/vmscan.c | 28 ++++++++++++++++++++++++++++ 4 files changed, 33 insertions(+), 10 deletions(-) --- a/include/linux/mmzone.h~mm-vmscan-throttle-reclaim-when-no-progress-is-being-made +++ a/include/linux/mmzone.h @@ -276,6 +276,7 @@ enum lru_list { enum vmscan_throttle_state { VMSCAN_THROTTLE_WRITEBACK, VMSCAN_THROTTLE_ISOLATED, + VMSCAN_THROTTLE_NOPROGRESS, NR_VMSCAN_THROTTLE, }; --- a/include/trace/events/vmscan.h~mm-vmscan-throttle-reclaim-when-no-progress-is-being-made +++ a/include/trace/events/vmscan.h @@ -29,11 +29,13 @@ #define _VMSCAN_THROTTLE_WRITEBACK (1 << VMSCAN_THROTTLE_WRITEBACK) #define _VMSCAN_THROTTLE_ISOLATED (1 << VMSCAN_THROTTLE_ISOLATED) +#define _VMSCAN_THROTTLE_NOPROGRESS (1 << VMSCAN_THROTTLE_NOPROGRESS) #define show_throttle_flags(flags) \ (flags) ? 
__print_flags(flags, "|", \ {_VMSCAN_THROTTLE_WRITEBACK, "VMSCAN_THROTTLE_WRITEBACK"}, \ - {_VMSCAN_THROTTLE_ISOLATED, "VMSCAN_THROTTLE_ISOLATED"} \ + {_VMSCAN_THROTTLE_ISOLATED, "VMSCAN_THROTTLE_ISOLATED"}, \ + {_VMSCAN_THROTTLE_NOPROGRESS, "VMSCAN_THROTTLE_NOPROGRESS"} \ ) : "VMSCAN_THROTTLE_NONE" --- a/mm/memcontrol.c~mm-vmscan-throttle-reclaim-when-no-progress-is-being-made +++ a/mm/memcontrol.c @@ -3487,19 +3487,11 @@ static int mem_cgroup_force_empty(struct /* try to free all pages in this cgroup */ while (nr_retries && page_counter_read(&memcg->memory)) { - int progress; - if (signal_pending(current)) return -EINTR; - progress = try_to_free_mem_cgroup_pages(memcg, 1, - GFP_KERNEL, true); - if (!progress) { + if (!try_to_free_mem_cgroup_pages(memcg, 1, GFP_KERNEL, true)) nr_retries--; - /* maybe some writeback is necessary */ - congestion_wait(BLK_RW_ASYNC, HZ/10); - } - } return 0; --- a/mm/vmscan.c~mm-vmscan-throttle-reclaim-when-no-progress-is-being-made +++ a/mm/vmscan.c @@ -3322,6 +3322,33 @@ static inline bool compaction_ready(stru return zone_watermark_ok_safe(zone, 0, watermark, sc->reclaim_idx); } +static void consider_reclaim_throttle(pg_data_t *pgdat, struct scan_control *sc) +{ + /* If reclaim is making progress, wake any throttled tasks. */ + if (sc->nr_reclaimed) { + wait_queue_head_t *wqh; + + wqh = &pgdat->reclaim_wait[VMSCAN_THROTTLE_NOPROGRESS]; + if (waitqueue_active(wqh)) + wake_up(wqh); + + return; + } + + /* + * Do not throttle kswapd on NOPROGRESS as it will throttle on + * VMSCAN_THROTTLE_WRITEBACK if there are too many pages under + * writeback and marked for immediate reclaim at the tail of + * the LRU. + */ + if (current_is_kswapd()) + return; + + /* Throttle if making no progress at high prioities. */ + if (sc->priority < DEF_PRIORITY - 2) + reclaim_throttle(pgdat, VMSCAN_THROTTLE_NOPROGRESS, HZ/10); +} + /* * This is the direct reclaim path, for page-allocating processes. 
We only * try to reclaim pages from zones which will satisfy the caller's allocation @@ -3406,6 +3433,7 @@ static void shrink_zones(struct zonelist continue; last_pgdat = zone->zone_pgdat; shrink_node(zone->zone_pgdat, sc); + consider_reclaim_throttle(zone->zone_pgdat, sc); } /* From patchwork Fri Nov 5 20:42:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605663 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A1B45C433EF for ; Fri, 5 Nov 2021 20:42:38 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 59E676126A for ; Fri, 5 Nov 2021 20:42:38 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 59E676126A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id EBAEC940092; Fri, 5 Nov 2021 16:42:37 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E69E2940090; Fri, 5 Nov 2021 16:42:37 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D57B1940092; Fri, 5 Nov 2021 16:42:37 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0116.hostedemail.com [216.40.44.116]) by kanga.kvack.org (Postfix) with ESMTP id C38F9940090 for ; Fri, 5 Nov 2021 16:42:37 -0400 (EDT) Received: from smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 87C2877986 for ; Fri, 5 Nov 2021 20:42:37 +0000 (UTC) X-FDA: 78776049954.16.CF74216 Received: from mail.kernel.org 
(mail.kernel.org [198.145.29.99]) by imf09.hostedemail.com (Postfix) with ESMTP id 3805C300011A for ; Fri, 5 Nov 2021 20:42:37 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 0338161244; Fri, 5 Nov 2021 20:42:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144956; bh=9XDyNz+ALPkfGq63s6Ydkx70208Aq7xhLnT620tqUcU=; h=Date:From:To:Subject:In-Reply-To:From; b=RCjDvp3JZ9a26sWX6vS+Gxz056PN2jFC57lfIHmxJjqFCqsaHofn3H4Zds/j23kQt S0T4ffFn6EtaoiSfB2SlaL7+oI2u1E4z84O2XECt2dRCnE9oduU9yPKvXR78k2qM0t ouxvIB5hAwvhZwONyfYN1FO8EN4CZPZ4Vpn0YZhU= Date: Fri, 05 Nov 2021 13:42:35 -0700 From: Andrew Morton To: adilger.kernel@dilger.ca, akpm@linux-foundation.org, corbet@lwn.net, david@fromorbit.com, djwong@kernel.org, hannes@cmpxchg.org, linux-mm@kvack.org, mgorman@techsingularity.net, mhocko@suse.com, mm-commits@vger.kernel.org, neilb@suse.de, riel@surriel.com, torvalds@linux-foundation.org, tytso@mit.edu, vbabka@suse.cz, willy@infradead.org Subject: [patch 152/262] mm/writeback: throttle based on page writeback instead of congestion Message-ID: <20211105204235.GNTkEv6QI%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 3805C300011A X-Stat-Signature: 8b981qxyu33u4wwxt5ix4kb583d4njy5 Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=RCjDvp3J; dmarc=none; spf=pass (imf09.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144957-540424 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Mel Gorman Subject: mm/writeback: throttle based on page writeback instead of congestion do_writepages throttles on congestion if the 
writepages() fails due to a lack of memory but congestion_wait() is partially broken as the congestion state is not updated for all BDIs. This patch stalls waiting for a number of pages to complete writeback that are located on the local node. The main weakness is that there is no correlation between the location of the inode's pages and locality but that is still better than congestion_wait. Link: https://lkml.kernel.org/r/20211022144651.19914-5-mgorman@techsingularity.net Signed-off-by: Mel Gorman Acked-by: Vlastimil Babka Cc: Andreas Dilger Cc: "Darrick J . Wong" Cc: Dave Chinner Cc: Johannes Weiner Cc: Jonathan Corbet Cc: Matthew Wilcox Cc: Michal Hocko Cc: NeilBrown Cc: Rik van Riel Cc: "Theodore Ts'o" Signed-off-by: Andrew Morton --- mm/page-writeback.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) --- a/mm/page-writeback.c~mm-writeback-throttle-based-on-page-writeback-instead-of-congestion +++ a/mm/page-writeback.c @@ -2366,8 +2366,15 @@ int do_writepages(struct address_space * ret = generic_writepages(mapping, wbc); if ((ret != -ENOMEM) || (wbc->sync_mode != WB_SYNC_ALL)) break; - cond_resched(); - congestion_wait(BLK_RW_ASYNC, HZ/50); + + /* + * Lacking an allocation context or the locality or writeback + * state of any of the inode's pages, throttle based on + * writeback activity on the local node. It's as good a + * guess as any. 
+ */ + reclaim_throttle(NODE_DATA(numa_node_id()), + VMSCAN_THROTTLE_WRITEBACK, HZ/50); } /* * Usually few pages are written by now from those we've just submitted From patchwork Fri Nov 5 20:42:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605665 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 205E5C433EF for ; Fri, 5 Nov 2021 20:42:42 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id CC74B6127B for ; Fri, 5 Nov 2021 20:42:41 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org CC74B6127B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 6A475940094; Fri, 5 Nov 2021 16:42:41 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 679F3940093; Fri, 5 Nov 2021 16:42:41 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 56A74940094; Fri, 5 Nov 2021 16:42:41 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 4006D940090 for ; Fri, 5 Nov 2021 16:42:41 -0400 (EDT) Received: from smtpin08.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 0591418549A8A for ; Fri, 5 Nov 2021 20:42:41 +0000 (UTC) X-FDA: 78776050122.08.3270AE6 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf10.hostedemail.com (Postfix) with ESMTP id 11E7F60019A2 for ; Fri, 5 Nov 2021 20:42:28 
+0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 58F956124F; Fri, 5 Nov 2021 20:42:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144959; bh=b6483NYSqR3dKYJJj84fbe6Ur/a/+uYhb4Hz8lG2ymI=; h=Date:From:To:Subject:In-Reply-To:From; b=HwD5rm9Qd6o28YSqqHLnwKhP4SsLGPlmuQB5wuaLGU9LmJowMfPBnjQowgmUDg4y5 d1pIiLB0h9oFILdNse1zC3vw8Z1wJJayu4CYj6yYh+c+IUwQINH1MaqDQrh2gGSXls bcF4fRN3Y3Y81cSstEdtlDzgc8tDUx3Tevb+x10w= Date: Fri, 05 Nov 2021 13:42:38 -0700 From: Andrew Morton To: adilger.kernel@dilger.ca, akpm@linux-foundation.org, corbet@lwn.net, david@fromorbit.com, djwong@kernel.org, hannes@cmpxchg.org, linux-mm@kvack.org, mgorman@techsingularity.net, mhocko@suse.com, mm-commits@vger.kernel.org, neilb@suse.de, riel@surriel.com, torvalds@linux-foundation.org, tytso@mit.edu, vbabka@suse.cz, willy@infradead.org Subject: [patch 153/262] mm/page_alloc: remove the throttling logic from the page allocator Message-ID: <20211105204238.5PGt0tLHK%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 11E7F60019A2 X-Stat-Signature: dmkkpngdg7185dbejhnw6kqfr5i3a91a Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=HwD5rm9Q; dmarc=none; spf=pass (imf10.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144948-761982 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Mel Gorman Subject: mm/page_alloc: remove the throttling logic from the page allocator The page allocator stalls based on the number of pages that are waiting for writeback to start but this should now be redundant. 
shrink_inactive_list() will wake flusher threads if the pages at the LRU tail are unqueued dirty pages so the flusher should be active. If it fails to make progress due to pages under writeback not being completed quickly then it should stall on VMSCAN_THROTTLE_WRITEBACK. Link: https://lkml.kernel.org/r/20211022144651.19914-6-mgorman@techsingularity.net Signed-off-by: Mel Gorman Acked-by: Vlastimil Babka Cc: Andreas Dilger Cc: "Darrick J . Wong" Cc: Dave Chinner Cc: Johannes Weiner Cc: Jonathan Corbet Cc: Matthew Wilcox Cc: Michal Hocko Cc: NeilBrown Cc: Rik van Riel Cc: "Theodore Ts'o" Signed-off-by: Andrew Morton --- mm/page_alloc.c | 21 +-------------------- 1 file changed, 1 insertion(+), 20 deletions(-) --- a/mm/page_alloc.c~mm-page_alloc-remove-the-throttling-logic-from-the-page-allocator +++ a/mm/page_alloc.c @@ -4791,30 +4791,11 @@ should_reclaim_retry(gfp_t gfp_mask, uns trace_reclaim_retry_zone(z, order, reclaimable, available, min_wmark, *no_progress_loops, wmark); if (wmark) { - /* - * If we didn't make any progress and have a lot of - * dirty + writeback pages then we should wait for - * an IO to complete to slow down the reclaim and - * prevent from pre mature OOM - */ - if (!did_some_progress) { - unsigned long write_pending; - - write_pending = zone_page_state_snapshot(zone, - NR_ZONE_WRITE_PENDING); - - if (2 * write_pending > reclaimable) { - congestion_wait(BLK_RW_ASYNC, HZ/10); - return true; - } - } - ret = true; - goto out; + break; } } -out: /* * Memory allocation/reclaim might be called from a WQ context and the * current implementation of the WQ concurrency control doesn't From patchwork Fri Nov 5 20:42:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605669 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by 
smtp.lore.kernel.org (Postfix) with ESMTP id EBFE2C433EF for ; Fri, 5 Nov 2021 20:42:45 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9C1D661284 for ; Fri, 5 Nov 2021 20:42:45 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 9C1D661284 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id CC354940095; Fri, 5 Nov 2021 16:42:44 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id C9443940093; Fri, 5 Nov 2021 16:42:44 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B0FFE940095; Fri, 5 Nov 2021 16:42:44 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0046.hostedemail.com [216.40.44.46]) by kanga.kvack.org (Postfix) with ESMTP id 9F711940093 for ; Fri, 5 Nov 2021 16:42:44 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 6404E77816 for ; Fri, 5 Nov 2021 20:42:44 +0000 (UTC) X-FDA: 78776050248.29.D4D11C5 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf25.hostedemail.com (Postfix) with ESMTP id 5F02DB000188 for ; Fri, 5 Nov 2021 20:42:34 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id C7D5E6127B; Fri, 5 Nov 2021 20:42:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144963; bh=UUiJ5Jix7iDlD0yzGQm3ssS1FmGI2cHnWSpg5gDBJ+8=; h=Date:From:To:Subject:In-Reply-To:From; b=V6VWizDLMuJUqnHRMCWsNcNalpoezLlDd4bCL8MW1PJoEGWfPfqONpuha1GcShUZM UNmTCQsfr3wwFqrEuXEXHuEGugo6WLS68JgpXxqAVcZ+DZYt/iE1w9z9ZrU3VBH8+/ 3YGQ4GfwUO8VW8+6UQptrC70ZHW4vowCM5PsNPl8= Date: Fri, 05 Nov 2021 13:42:42 -0700 From: Andrew Morton To: 
adilger.kernel@dilger.ca, akpm@linux-foundation.org, corbet@lwn.net, david@fromorbit.com, djwong@kernel.org, hannes@cmpxchg.org, linux-mm@kvack.org, mgorman@techsingularity.net, mhocko@suse.com, mm-commits@vger.kernel.org, neilb@suse.de, riel@surriel.com, torvalds@linux-foundation.org, tytso@mit.edu, vbabka@suse.cz, willy@infradead.org Subject: [patch 154/262] mm/vmscan: centralise timeout values for reclaim_throttle Message-ID: <20211105204242.Zvjqz2kef%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 5F02DB000188 X-Stat-Signature: fty9y5b9bx86k89prxf4ke9yagetrnyo Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=V6VWizDL; spf=pass (imf25.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144954-501704 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Mel Gorman Subject: mm/vmscan: centralise timeout values for reclaim_throttle Neil Brown raised concerns about callers of reclaim_throttle specifying a timeout value. The original timeout values to congestion_wait() were probably pulled out of thin air or copy&pasted from somewhere else. This patch centralises the timeout values and selects a timeout based on the reason for reclaim throttling. These figures are also pulled out of the same thin air but better values may be derived. Running a workload that is throttling for inappropriate periods and tracing mm_vmscan_throttled can be used to pick a more appropriate value. Excessive throttling would pick a lower timeout whereas excessive CPU usage in reclaim context would select a larger timeout. 
Ideally a large value would always be used and the wakeups would occur before a timeout but that requires careful testing. Link: https://lkml.kernel.org/r/20211022144651.19914-7-mgorman@techsingularity.net Signed-off-by: Mel Gorman Acked-by: Vlastimil Babka Cc: Andreas Dilger Cc: "Darrick J . Wong" Cc: Dave Chinner Cc: Johannes Weiner Cc: Jonathan Corbet Cc: Matthew Wilcox Cc: Michal Hocko Cc: NeilBrown Cc: Rik van Riel Cc: "Theodore Ts'o" Signed-off-by: Andrew Morton --- mm/compaction.c | 2 - mm/internal.h | 3 -- mm/page-writeback.c | 2 - mm/vmscan.c | 50 +++++++++++++++++++++++++++++++----------- 4 files changed, 40 insertions(+), 17 deletions(-) --- a/mm/compaction.c~mm-vmscan-centralise-timeout-values-for-reclaim_throttle +++ a/mm/compaction.c @@ -828,7 +828,7 @@ isolate_migratepages_block(struct compac if (cc->mode == MIGRATE_ASYNC) return -EAGAIN; - reclaim_throttle(pgdat, VMSCAN_THROTTLE_ISOLATED, HZ/10); + reclaim_throttle(pgdat, VMSCAN_THROTTLE_ISOLATED); if (fatal_signal_pending(current)) return -EINTR; --- a/mm/internal.h~mm-vmscan-centralise-timeout-values-for-reclaim_throttle +++ a/mm/internal.h @@ -130,8 +130,7 @@ extern unsigned long highest_memmap_pfn; */ extern int isolate_lru_page(struct page *page); extern void putback_lru_page(struct page *page); -extern void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason, - long timeout); +extern void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason); /* * in mm/rmap.c: --- a/mm/page-writeback.c~mm-vmscan-centralise-timeout-values-for-reclaim_throttle +++ a/mm/page-writeback.c @@ -2374,7 +2374,7 @@ int do_writepages(struct address_space * * guess as any. 
*/ reclaim_throttle(NODE_DATA(numa_node_id()), - VMSCAN_THROTTLE_WRITEBACK, HZ/50); + VMSCAN_THROTTLE_WRITEBACK); } /* * Usually few pages are written by now from those we've just submitted --- a/mm/vmscan.c~mm-vmscan-centralise-timeout-values-for-reclaim_throttle +++ a/mm/vmscan.c @@ -1006,12 +1006,10 @@ static void handle_write_error(struct ad unlock_page(page); } -void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason, - long timeout) +void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason) { wait_queue_head_t *wqh = &pgdat->reclaim_wait[reason]; - long ret; - bool acct_writeback = (reason == VMSCAN_THROTTLE_WRITEBACK); + long timeout, ret; DEFINE_WAIT(wait); /* @@ -1023,17 +1021,43 @@ void reclaim_throttle(pg_data_t *pgdat, current->flags & (PF_IO_WORKER|PF_KTHREAD)) return; - if (acct_writeback && - atomic_inc_return(&pgdat->nr_writeback_throttled) == 1) { - WRITE_ONCE(pgdat->nr_reclaim_start, - node_page_state(pgdat, NR_THROTTLED_WRITTEN)); + /* + * These figures are pulled out of thin air. + * VMSCAN_THROTTLE_ISOLATED is a transient condition based on too many + * parallel reclaimers which is a short-lived event so the timeout is + * short. Failing to make progress or waiting on writeback are + * potentially long-lived events so use a longer timeout. This is shaky + * logic as a failure to make progress could be due to anything from + * writeback to a slow device to excessive references pages at the tail + * of the inactive LRU. 
+ */ + switch(reason) { + case VMSCAN_THROTTLE_WRITEBACK: + timeout = HZ/10; + + if (atomic_inc_return(&pgdat->nr_writeback_throttled) == 1) { + WRITE_ONCE(pgdat->nr_reclaim_start, + node_page_state(pgdat, NR_THROTTLED_WRITTEN)); + } + + break; + case VMSCAN_THROTTLE_NOPROGRESS: + timeout = HZ/10; + break; + case VMSCAN_THROTTLE_ISOLATED: + timeout = HZ/50; + break; + default: + WARN_ON_ONCE(1); + timeout = HZ; + break; } prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE); ret = schedule_timeout(timeout); finish_wait(wqh, &wait); - if (acct_writeback) + if (reason == VMSCAN_THROTTLE_WRITEBACK) atomic_dec(&pgdat->nr_writeback_throttled); trace_mm_vmscan_throttled(pgdat->node_id, jiffies_to_usecs(timeout), @@ -2318,7 +2342,7 @@ shrink_inactive_list(unsigned long nr_to /* wait a bit for the reclaimer. */ stalled = true; - reclaim_throttle(pgdat, VMSCAN_THROTTLE_ISOLATED, HZ/10); + reclaim_throttle(pgdat, VMSCAN_THROTTLE_ISOLATED); /* We are about to die and free our memory. Return now. */ if (fatal_signal_pending(current)) @@ -3250,7 +3274,7 @@ again: * until some pages complete writeback. */ if (sc->nr.immediate) - reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK, HZ/10); + reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK); } /* @@ -3274,7 +3298,7 @@ again: if (!current_is_kswapd() && current_may_throttle() && !sc->hibernation_mode && test_bit(LRUVEC_CONGESTED, &target_lruvec->flags)) - reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK, HZ/10); + reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK); if (should_continue_reclaim(pgdat, sc->nr_reclaimed - nr_reclaimed, sc)) @@ -3346,7 +3370,7 @@ static void consider_reclaim_throttle(pg /* Throttle if making no progress at high prioities. 
*/ if (sc->priority < DEF_PRIORITY - 2) - reclaim_throttle(pgdat, VMSCAN_THROTTLE_NOPROGRESS, HZ/10); + reclaim_throttle(pgdat, VMSCAN_THROTTLE_NOPROGRESS); } /* From patchwork Fri Nov 5 20:42:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605677 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 82ABEC433EF for ; Fri, 5 Nov 2021 20:43:02 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id BCCFB61284 for ; Fri, 5 Nov 2021 20:43:01 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org BCCFB61284 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 5BA24940099; Fri, 5 Nov 2021 16:43:01 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 568A1940093; Fri, 5 Nov 2021 16:43:01 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 458AC940099; Fri, 5 Nov 2021 16:43:01 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0250.hostedemail.com [216.40.44.250]) by kanga.kvack.org (Postfix) with ESMTP id 3620A940093 for ; Fri, 5 Nov 2021 16:43:01 -0400 (EDT) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id EB3A91856B71B for ; Fri, 5 Nov 2021 20:43:00 +0000 (UTC) X-FDA: 78776050920.13.7430FB4 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf31.hostedemail.com (Postfix) with ESMTP id 47067104AAF4 for ; Fri, 5 Nov 2021 20:42:52 
+0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 2F4FA6127A; Fri, 5 Nov 2021 20:42:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144966; bh=qAtNb1UtwO11oGQLWQsynswt809Y6PArBjBPkNKUvj8=; h=Date:From:To:Subject:In-Reply-To:From; b=ZJQiOQQPFe8Cv17p9K6vdp6NdV7uGhjRPWXvL4/p1jjsIDzX3aJnpIsF1IWl+jufP iuirKqPnPzKRNyQWaDxHRnY4j24Wdj3uNS4lJ8hsnhfyc/ATlv62C5eWCxL2VmH7N+ Zmysvk97Zx/IgR9AyB/AZQRsCMGYoYM+rDorJpZA= Date: Fri, 05 Nov 2021 13:42:45 -0700 From: Andrew Morton To: adilger.kernel@dilger.ca, akpm@linux-foundation.org, corbet@lwn.net, david@fromorbit.com, djwong@kernel.org, hannes@cmpxchg.org, linux-mm@kvack.org, mgorman@techsingularity.net, mhocko@suse.com, mm-commits@vger.kernel.org, neilb@suse.de, riel@surriel.com, torvalds@linux-foundation.org, tytso@mit.edu, vbabka@suse.cz, willy@infradead.org Subject: [patch 155/262] mm/vmscan: increase the timeout if page reclaim is not making progress Message-ID: <20211105204245.WSzho92J2%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf31.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=ZJQiOQQP; dmarc=none; spf=pass (imf31.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 47067104AAF4 X-Stat-Signature: tw1oi9u69198ogchmjos55fzz536cb8x X-HE-Tag: 1636144972-573727 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Mel Gorman Subject: mm/vmscan: increase the timeout if page reclaim is not making progress

Tracing of the stutterp workload showed the following delays:

      1 usect_delayed=124000 reason=VMSCAN_THROTTLE_NOPROGRESS
      1 usect_delayed=128000 reason=VMSCAN_THROTTLE_NOPROGRESS
      1 usect_delayed=176000 reason=VMSCAN_THROTTLE_NOPROGRESS
      1 usect_delayed=536000 reason=VMSCAN_THROTTLE_NOPROGRESS
      1 usect_delayed=544000 reason=VMSCAN_THROTTLE_NOPROGRESS
      1 usect_delayed=556000 reason=VMSCAN_THROTTLE_NOPROGRESS
      1 usect_delayed=624000 reason=VMSCAN_THROTTLE_NOPROGRESS
      1 usect_delayed=716000 reason=VMSCAN_THROTTLE_NOPROGRESS
      1 usect_delayed=772000 reason=VMSCAN_THROTTLE_NOPROGRESS
      2 usect_delayed=512000 reason=VMSCAN_THROTTLE_NOPROGRESS
     16 usect_delayed=120000 reason=VMSCAN_THROTTLE_NOPROGRESS
     53 usect_delayed=116000 reason=VMSCAN_THROTTLE_NOPROGRESS
    116 usect_delayed=112000 reason=VMSCAN_THROTTLE_NOPROGRESS
   5907 usect_delayed=108000 reason=VMSCAN_THROTTLE_NOPROGRESS
  71741 usect_delayed=104000 reason=VMSCAN_THROTTLE_NOPROGRESS

All the throttling hit the full timeout, and then there were wakeup delays, meaning that the wakeups are premature as no other reclaimer such as kswapd has made progress. This patch increases the maximum timeout.

Link: https://lkml.kernel.org/r/20211022144651.19914-8-mgorman@techsingularity.net
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
Cc: Andreas Dilger
Cc: "Darrick J .
Wong" Cc: Dave Chinner Cc: Johannes Weiner Cc: Jonathan Corbet Cc: Matthew Wilcox Cc: Michal Hocko Cc: NeilBrown Cc: Rik van Riel Cc: "Theodore Ts'o" Signed-off-by: Andrew Morton --- mm/vmscan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/vmscan.c~mm-vmscan-increase-the-timeout-if-page-reclaim-is-not-making-progress +++ a/mm/vmscan.c @@ -1042,7 +1042,7 @@ void reclaim_throttle(pg_data_t *pgdat, break; case VMSCAN_THROTTLE_NOPROGRESS: - timeout = HZ/10; + timeout = HZ/2; break; case VMSCAN_THROTTLE_ISOLATED: timeout = HZ/50; From patchwork Fri Nov 5 20:42:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605671 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 183A7C43217 for ; Fri, 5 Nov 2021 20:42:52 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C30C261058 for ; Fri, 5 Nov 2021 20:42:51 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org C30C261058 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 67B75940096; Fri, 5 Nov 2021 16:42:51 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 62B5B940093; Fri, 5 Nov 2021 16:42:51 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 51B3C940096; Fri, 5 Nov 2021 16:42:51 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0143.hostedemail.com [216.40.44.143]) by kanga.kvack.org (Postfix) with ESMTP id 42FCE940093 for ; Fri, 5 Nov 2021 16:42:51 -0400 
(EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 06E7A1856C04C for ; Fri, 5 Nov 2021 20:42:51 +0000 (UTC) X-FDA: 78776050542.22.E079A31 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf04.hostedemail.com (Postfix) with ESMTP id 1E9BC500030D for ; Fri, 5 Nov 2021 20:42:42 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 71D406127C; Fri, 5 Nov 2021 20:42:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144969; bh=pAeVSQE+ghUrkBGpE6kpmbVNqnlEmUQ8UwhxuJRCBAE=; h=Date:From:To:Subject:In-Reply-To:From; b=i9NTeDdeUyWHI15/x7jR2PdVd/DaDPNCfT299892+cNmcK1wni0yivTOXd+IGZKrz 8FZpImO6jbH6r2jgA9ZR7C0bHixmhFhB+r/2jMuedC6fkwnHKYf+gjStkV8WR6dU8w Dllv8zIgabzrKb2Xwo+5AFg9Pne8jLbAoUDvBBxo= Date: Fri, 05 Nov 2021 13:42:49 -0700 From: Andrew Morton To: adilger.kernel@dilger.ca, akpm@linux-foundation.org, corbet@lwn.net, david@fromorbit.com, djwong@kernel.org, hannes@cmpxchg.org, linux-mm@kvack.org, mgorman@techsingularity.net, mhocko@suse.com, mm-commits@vger.kernel.org, neilb@suse.de, riel@surriel.com, torvalds@linux-foundation.org, tytso@mit.edu, vbabka@suse.cz, willy@infradead.org Subject: [patch 156/262] mm/vmscan: delay waking of tasks throttled on NOPROGRESS Message-ID: <20211105204249.YEXCXrK3w%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 1E9BC500030D X-Stat-Signature: 7m3qo5816dfb3nrhweokxfebhbg5r6wq Authentication-Results: imf04.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=i9NTeDde; spf=pass (imf04.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144962-408756 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, 
version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Mel Gorman Subject: mm/vmscan: delay waking of tasks throttled on NOPROGRESS

Tracing indicates that tasks throttled on NOPROGRESS are woken prematurely, resulting in occasional massive spikes in direct reclaim activity. This patch wakes tasks throttled on NOPROGRESS only if reclaim efficiency exceeds roughly 12%, that is, if more than one eighth of the scanned pages were reclaimed (sc->nr_reclaimed > sc->nr_scanned >> 3).

Link: https://lkml.kernel.org/r/20211022144651.19914-9-mgorman@techsingularity.net
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
Cc: Andreas Dilger
Cc: "Darrick J . Wong"
Cc: Dave Chinner
Cc: Johannes Weiner
Cc: Jonathan Corbet
Cc: Matthew Wilcox
Cc: Michal Hocko
Cc: NeilBrown
Cc: Rik van Riel
Cc: "Theodore Ts'o"
Signed-off-by: Andrew Morton
--- mm/vmscan.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) --- a/mm/vmscan.c~mm-vmscan-delay-waking-of-tasks-throttled-on-noprogress +++ a/mm/vmscan.c @@ -3348,8 +3348,11 @@ static inline bool compaction_ready(stru static void consider_reclaim_throttle(pg_data_t *pgdat, struct scan_control *sc) { - /* If reclaim is making progress, wake any throttled tasks. */ - if (sc->nr_reclaimed) { + /* + * If reclaim is making progress greater than 12% efficiency then + * wake all the NOPROGRESS throttled tasks.
+ */ + if (sc->nr_reclaimed > (sc->nr_scanned >> 3)) { wait_queue_head_t *wqh; wqh = &pgdat->reclaim_wait[VMSCAN_THROTTLE_NOPROGRESS]; From patchwork Fri Nov 5 20:42:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605673 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 86C3DC433F5 for ; Fri, 5 Nov 2021 20:42:55 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 3840A61262 for ; Fri, 5 Nov 2021 20:42:55 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 3840A61262 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id C6EC6940097; Fri, 5 Nov 2021 16:42:54 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id C1EFC940093; Fri, 5 Nov 2021 16:42:54 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id ABF9B940097; Fri, 5 Nov 2021 16:42:54 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0133.hostedemail.com [216.40.44.133]) by kanga.kvack.org (Postfix) with ESMTP id 9DBC6940093 for ; Fri, 5 Nov 2021 16:42:54 -0400 (EDT) Received: from smtpin17.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 5C54A1856B718 for ; Fri, 5 Nov 2021 20:42:54 +0000 (UTC) X-FDA: 78776050668.17.4304D20 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf11.hostedemail.com (Postfix) with ESMTP id ED4F1F0000B5 for ; Fri, 5 Nov 2021 20:42:53 +0000 (UTC) Received: by 
mail.kernel.org (Postfix) with ESMTPSA id D3CAA61058; Fri, 5 Nov 2021 20:42:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144973; bh=g1888CIvImU0pakxVSLIfwqeSvRyGlBTDwFLNnrr2w8=; h=Date:From:To:Subject:In-Reply-To:From; b=WjAHkCZGg/hkNv2b3F3CBtSRkTnjCT0S2QSJgYl0H24Tq35ct/KaXlIl84x2i4nM0 3ZMxYJb32gOdPjMYX1UOqVgVcnQ3lnGvtepCGphEQOm/Z6j1Nrkqxj5f8Un7Lw4Rju PLPozgSxEKBDVg1HO9ktKzSQmmI9Od2gIKmGyY6M= Date: Fri, 05 Nov 2021 13:42:52 -0700 From: Andrew Morton To: akpm@linux-foundation.org, alexs@kernel.org, guro@fb.com, hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, richard.weiyang@gmail.com, shakeelb@google.com, songmuchun@bytedance.com, songyuanzheng@huawei.com, torvalds@linux-foundation.org, willy@infradead.org Subject: [patch 157/262] mm/vmpressure: fix data-race with memcg->socket_pressure Message-ID: <20211105204252.CP8W1bmFQ%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=WjAHkCZG; dmarc=none; spf=pass (imf11.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: ED4F1F0000B5 X-Stat-Signature: 4jqfqrzgm3oarhnqxyxyb5a1cmcxof8x X-HE-Tag: 1636144973-100731 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Yuanzheng Song Subject: mm/vmpressure: fix data-race with memcg->socket_pressure

BUG: KCSAN: data-race in __sk_mem_reduce_allocated / vmpressure

write to 0xffff8881286f4938 of 8 bytes by task 24550 on cpu 3:
 vmpressure+0x218/0x230 mm/vmpressure.c:307
 shrink_node_memcgs+0x2b9/0x410 mm/vmscan.c:2658
 shrink_node+0x9d2/0x11d0 mm/vmscan.c:2769
 shrink_zones+0x29f/0x470 mm/vmscan.c:2972
 do_try_to_free_pages+0x193/0x6e0 mm/vmscan.c:3027
 try_to_free_mem_cgroup_pages+0x1c0/0x3f0 mm/vmscan.c:3345
 reclaim_high mm/memcontrol.c:2440 [inline]
 mem_cgroup_handle_over_high+0x18b/0x4d0 mm/memcontrol.c:2624
 tracehook_notify_resume include/linux/tracehook.h:197 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:164 [inline]
 exit_to_user_mode_prepare+0x110/0x170 kernel/entry/common.c:191
 syscall_exit_to_user_mode+0x16/0x30 kernel/entry/common.c:266
 ret_from_fork+0x15/0x30 arch/x86/entry/entry_64.S:289

read to 0xffff8881286f4938 of 8 bytes by interrupt on cpu 1:
 mem_cgroup_under_socket_pressure include/linux/memcontrol.h:1483 [inline]
 sk_under_memory_pressure include/net/sock.h:1314 [inline]
 __sk_mem_reduce_allocated+0x1d2/0x270 net/core/sock.c:2696
 __sk_mem_reclaim+0x44/0x50 net/core/sock.c:2711
 sk_mem_reclaim include/net/sock.h:1490 [inline]
 ......
 net_rx_action+0x17a/0x480 net/core/dev.c:6864
 __do_softirq+0x12c/0x2af kernel/softirq.c:298
 run_ksoftirqd+0x13/0x20 kernel/softirq.c:653
 smpboot_thread_fn+0x33f/0x510 kernel/smpboot.c:165
 kthread+0x1fc/0x220 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296

A data race occurs when mem_cgroup_under_socket_pressure() reads memcg->socket_pressure while vmpressure() writes it at the same time. Fix it by using READ_ONCE() for the read and WRITE_ONCE() for the write.
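
For readers who want to see the pattern in isolation, the following is a minimal userspace sketch, not the kernel code. The READ_ONCE()/WRITE_ONCE() macros below are simplified volatile-cast approximations of the kernel macros (they assume the GCC/Clang __typeof__ extension), and struct memcg_sketch, ticks_before(), mark_pressure() and under_pressure() are hypothetical names standing in for struct mem_cgroup, time_before(), the store in vmpressure() and the check in mem_cgroup_under_socket_pressure().

```c
/*
 * Simplified stand-ins for the kernel's READ_ONCE()/WRITE_ONCE().
 * Assumption: volatile casts are enough for this illustration; the
 * real kernel macros are more elaborate.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical stand-in for the single field of interest. */
struct memcg_sketch {
	unsigned long socket_pressure;	/* deadline, in ticks */
};

/* Wraparound-tolerant "a is before b", in the spirit of time_before(). */
static int ticks_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Writer side: assert pressure until now + duration, with a marked store. */
static void mark_pressure(struct memcg_sketch *m, unsigned long now,
			  unsigned long duration)
{
	WRITE_ONCE(m->socket_pressure, now + duration);
}

/* Reader side: a single marked load of the shared deadline. */
static int under_pressure(struct memcg_sketch *m, unsigned long now)
{
	return ticks_before(now, READ_ONCE(m->socket_pressure));
}
```

The marked accesses do not add locking or ordering; they only document that the field is accessed concurrently on purpose and ask the compiler to perform each access once.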
Link: https://lkml.kernel.org/r/20211025082843.671690-1-songyuanzheng@huawei.com Signed-off-by: Yuanzheng Song Reviewed-by: Muchun Song Cc: Shakeel Butt Cc: Roman Gushchin Cc: Johannes Weiner Cc: Michal Hocko Cc: Matthew Wilcox (Oracle) Cc: Alex Shi Cc: Wei Yang Signed-off-by: Andrew Morton --- include/linux/memcontrol.h | 2 +- mm/vmpressure.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) --- a/include/linux/memcontrol.h~mm-vmpressure-fix-data-race-with-memcg-socket_pressure +++ a/include/linux/memcontrol.h @@ -1606,7 +1606,7 @@ static inline bool mem_cgroup_under_sock if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && memcg->tcpmem_pressure) return true; do { - if (time_before(jiffies, memcg->socket_pressure)) + if (time_before(jiffies, READ_ONCE(memcg->socket_pressure))) return true; } while ((memcg = parent_mem_cgroup(memcg))); return false; --- a/mm/vmpressure.c~mm-vmpressure-fix-data-race-with-memcg-socket_pressure +++ a/mm/vmpressure.c @@ -308,7 +308,7 @@ void vmpressure(gfp_t gfp, struct mem_cg * asserted for a second in which subsequent * pressure events can occur. 
*/ - memcg->socket_pressure = jiffies + HZ; + WRITE_ONCE(memcg->socket_pressure, jiffies + HZ); } } } From patchwork Fri Nov 5 20:42:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605675 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8ED85C433EF for ; Fri, 5 Nov 2021 20:42:58 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 465996126A for ; Fri, 5 Nov 2021 20:42:58 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 465996126A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id DCEB9940098; Fri, 5 Nov 2021 16:42:57 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D7DE6940093; Fri, 5 Nov 2021 16:42:57 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C468A940098; Fri, 5 Nov 2021 16:42:57 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0131.hostedemail.com [216.40.44.131]) by kanga.kvack.org (Postfix) with ESMTP id B6291940093 for ; Fri, 5 Nov 2021 16:42:57 -0400 (EDT) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 757488249980 for ; Fri, 5 Nov 2021 20:42:57 +0000 (UTC) X-FDA: 78776050752.26.5F7C419 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf29.hostedemail.com (Postfix) with ESMTP id 0BC019000274 for ; Fri, 5 Nov 2021 20:42:56 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA 
id 1041A61262; Fri, 5 Nov 2021 20:42:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144976; bh=NdXlU74NTWYD8cSkdCUS09bFpTGLqdSV68+IKtH5ugk=; h=Date:From:To:Subject:In-Reply-To:From; b=d8ng3W/M/+h3BbP9KdZ+4ExvETS2A0LaElp0oViz3Bnh+vyTg//42SKg+mRJXbn0f zfLAHZLEa7qJ2n0Ino118R1rvHSC8LVjo4kswv6/VR1D9+GmucN3TrBZOCYWbjkdtH 4AOGVv7eP6Pt01zF0Pctj4hCOQ1wuYZHscmKoJPU= Date: Fri, 05 Nov 2021 13:42:55 -0700 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, nixiaoming@huawei.com, tangbin@cmss.chinamobile.com, torvalds@linux-foundation.org, weizhenliang@huawei.com, zhangshengju@cmss.chinamobile.com Subject: [patch 158/262] tools/vm/page_owner_sort.c: count and sort by mem Message-ID: <20211105204255.iNP0h-4F-%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 0BC019000274 X-Stat-Signature: nmgesicer4n8akxx3zqtd4huqyw9d48u Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="d8ng3W/M"; spf=pass (imf29.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144976-471175 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Zhenliang Wei Subject: tools/vm/page_owner_sort.c: count and sort by mem

When viewing page owner information, we may care more about the total memory a stack accounts for than about the number of times the stack appears. Therefore, the following adjustments are made:

1. Add statistics on the total number of pages.

2. Add the optional parameter "-m" to make the program sort by memory (total pages).
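
As a rough illustration of these two adjustments: a record of order N covers 2^N pages, so the page total for a stack is the sum of 1 << order over its occurrences, and sorting by memory means ordering records by that total. The sketch below is not the code from this patch. The patch extracts the order with a POSIX regex, whereas this sketch uses plain strstr()/strtol() parsing as a simplification, and struct owner_stat, pages_from_header() and by_pages_desc() are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: one distinct stack and its running totals. */
struct owner_stat {
	int times;	/* how often the stack was seen */
	long pages;	/* total pages, summing 2^order per occurrence */
};

/*
 * Return the page count for a header such as
 * "Page allocated via order 3, ...", or 0 if no valid order is found.
 * Assumption: the order is written as "order <digits>,".
 */
static long pages_from_header(const char *line)
{
	const char *p = strstr(line, "order ");
	char *end;
	long order;

	if (!p)
		return 0;
	p += strlen("order ");
	order = strtol(p, &end, 10);
	if (end == p || *end != ',' || order < 0 || order > 30)
		return 0;
	return 1L << order;
}

/* qsort() comparator: largest page total first, avoiding subtraction. */
static int by_pages_desc(const void *a, const void *b)
{
	const struct owner_stat *x = a, *y = b;

	return (y->pages > x->pages) - (y->pages < x->pages);
}
```

Comparing with two relational tests instead of returning a difference is a design choice of the sketch; it keeps the comparator well defined even for very large totals.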
The general output of page_owner is as follows:

    Page allocated via order XXX, ...
    PFN XXX ...
    // Detailed stack

    Page allocated via order XXX, ...
    PFN XXX ...
    // Detailed stack

The original page_owner_sort ignores PFN rows, puts the remaining rows in buf, counts how many times each buf occurs, and finally sorts the bufs by that count. General output:

    XXX times:
    Page allocated via order XXX, ...
    // Detailed stack

Now, we use a regexp to extract the page order value from the buf and count the total pages for the buf. General output:

    XXX times, XXX pages:
    Page allocated via order XXX, ...
    // Detailed stack

By default, the output is still sorted by the number of times each buf occurs. To sort by the number of pages of each buf, use the new -m parameter.

Link: https://lkml.kernel.org/r/1631678242-41033-1-git-send-email-weizhenliang@huawei.com
Signed-off-by: Zhenliang Wei
Cc: Tang Bin
Cc: Zhang Shengju
Cc: Zhenliang Wei
Cc: Xiaoming Ni
Signed-off-by: Andrew Morton
--- Documentation/vm/page_owner.rst | 23 +++++++ tools/vm/page_owner_sort.c | 94 +++++++++++++++++++++++++++--- 2 files changed, 107 insertions(+), 10 deletions(-) --- a/Documentation/vm/page_owner.rst~tools-vm-page_owner_sortc-count-and-sort-by-mem +++ a/Documentation/vm/page_owner.rst @@ -85,5 +85,26 @@ Usage cat /sys/kernel/debug/page_owner > page_owner_full.txt ./page_owner_sort page_owner_full.txt sorted_page_owner.txt + The general output of ``page_owner_full.txt`` is as follows: + + Page allocated via order XXX, ... + PFN XXX ... + // Detailed stack + + Page allocated via order XXX, ... + PFN XXX ... + // Detailed stack + + The ``page_owner_sort`` tool ignores ``PFN`` rows, puts the remaining rows + in buf, uses regexp to extract the page order value, counts the times + and pages of buf, and finally sorts them according to the times. + See the result about who allocated each page - in the ``sorted_page_owner.txt``. + in the ``sorted_page_owner.txt``. General output: + + XXX times, XXX pages: + Page allocated via order XXX, ...
+ // Detailed stack + + By default, ``page_owner_sort`` is sorted according to the times of buf. + If you want to sort by the pages nums of buf, use the ``-m`` parameter. --- a/tools/vm/page_owner_sort.c~tools-vm-page_owner_sortc-count-and-sort-by-mem +++ a/tools/vm/page_owner_sort.c @@ -5,6 +5,8 @@ * Example use: * cat /sys/kernel/debug/page_owner > page_owner_full.txt * ./page_owner_sort page_owner_full.txt sorted_page_owner.txt + * Or sort by total memory: + * ./page_owner_sort -m page_owner_full.txt sorted_page_owner.txt * * See Documentation/vm/page_owner.rst */ @@ -16,14 +18,18 @@ #include #include #include +#include +#include struct block_list { char *txt; int len; int num; + int page_num; }; - +static int sort_by_memory; +static regex_t order_pattern; static struct block_list *list; static int list_size; static int max_size; @@ -59,12 +65,50 @@ static int compare_num(const void *p1, c return l2->num - l1->num; } +static int compare_page_num(const void *p1, const void *p2) +{ + const struct block_list *l1 = p1, *l2 = p2; + + return l2->page_num - l1->page_num; +} + +static int get_page_num(char *buf) +{ + int err, val_len, order_val; + char order_str[4] = {0}; + char *endptr; + regmatch_t pmatch[2]; + + err = regexec(&order_pattern, buf, 2, pmatch, REG_NOTBOL); + if (err != 0 || pmatch[1].rm_so == -1) { + printf("no order pattern in %s\n", buf); + return 0; + } + val_len = pmatch[1].rm_eo - pmatch[1].rm_so; + if (val_len > 2) /* max_order should not exceed 2 digits */ + goto wrong_order; + + memcpy(order_str, buf + pmatch[1].rm_so, val_len); + + errno = 0; + order_val = strtol(order_str, &endptr, 10); + if (errno != 0 || endptr == order_str || *endptr != '\0') + goto wrong_order; + + return 1 << order_val; + +wrong_order: + printf("wrong order in follow buf:\n%s\n", buf); + return 0; +} + static void add_list(char *buf, int len) { if (list_size != 0 && len == list[list_size-1].len && memcmp(buf, list[list_size-1].txt, len) == 0) { list[list_size-1].num++; + 
list[list_size-1].page_num += get_page_num(buf); return; } if (list_size == max_size) { @@ -74,6 +118,7 @@ static void add_list(char *buf, int len) list[list_size].txt = malloc(len+1); list[list_size].len = len; list[list_size].num = 1; + list[list_size].page_num = get_page_num(buf); memcpy(list[list_size].txt, buf, len); list[list_size].txt[len] = 0; list_size++; @@ -85,6 +130,13 @@ static void add_list(char *buf, int len) #define BUF_SIZE (128 * 1024) +static void usage(void) +{ + printf("Usage: ./page_owner_sort [-m] \n" + "-m Sort by total memory. If this option is unset, sort by times\n" + ); +} + int main(int argc, char **argv) { FILE *fin, *fout; @@ -92,21 +144,39 @@ int main(int argc, char **argv) int ret, i, count; struct block_list *list2; struct stat st; + int err; + int opt; - if (argc < 3) { - printf("Usage: ./program \n"); - perror("open: "); + while ((opt = getopt(argc, argv, "m")) != -1) + switch (opt) { + case 'm': + sort_by_memory = 1; + break; + default: + usage(); + exit(1); + } + + if (optind >= (argc - 1)) { + usage(); exit(1); } - fin = fopen(argv[1], "r"); - fout = fopen(argv[2], "w"); + fin = fopen(argv[optind], "r"); + fout = fopen(argv[optind + 1], "w"); if (!fin || !fout) { - printf("Usage: ./program \n"); + usage(); perror("open: "); exit(1); } + err = regcomp(&order_pattern, "order\\s*([0-9]*),", REG_EXTENDED|REG_NEWLINE); + if (err != 0 || order_pattern.re_nsub != 1) { + printf("%s: Invalid pattern 'order\\s*([0-9]*),' code %d\n", + argv[0], err); + exit(1); + } + fstat(fileno(fin), &st); max_size = st.st_size / 100; /* hack ... 
*/ @@ -145,13 +215,19 @@ int main(int argc, char **argv) list2[count++] = list[i]; } else { list2[count-1].num += list[i].num; + list2[count-1].page_num += list[i].page_num; } } - qsort(list2, count, sizeof(list[0]), compare_num); + if (sort_by_memory) + qsort(list2, count, sizeof(list[0]), compare_page_num); + else + qsort(list2, count, sizeof(list[0]), compare_num); for (i = 0; i < count; i++) - fprintf(fout, "%d times:\n%s\n", list2[i].num, list2[i].txt); + fprintf(fout, "%d times, %d pages:\n%s\n", + list2[i].num, list2[i].page_num, list2[i].txt); + regfree(&order_pattern); return 0; } From patchwork Fri Nov 5 20:42:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605755 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0C3E6C433F5 for ; Fri, 5 Nov 2021 20:44:03 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B2F736126A for ; Fri, 5 Nov 2021 20:44:02 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org B2F736126A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id D338E9400A9; Fri, 5 Nov 2021 16:44:00 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id CE670940093; Fri, 5 Nov 2021 16:44:00 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B33B99400A9; Fri, 5 Nov 2021 16:44:00 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0252.hostedemail.com [216.40.44.252]) by kanga.kvack.org (Postfix) with ESMTP id 
A39AB940093 for ; Fri, 5 Nov 2021 16:44:00 -0400 (EDT) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 6D9308249980 for ; Fri, 5 Nov 2021 20:44:00 +0000 (UTC) X-FDA: 78776053440.12.77BAC2E Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf25.hostedemail.com (Postfix) with ESMTP id 6E1B3B000188 for ; Fri, 5 Nov 2021 20:42:50 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 1F3086126A; Fri, 5 Nov 2021 20:42:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144979; bh=jgYtiR6rG07Cr+X9pE4VeeKB+UpDrZA5xCBhF+Cah3U=; h=Date:From:To:Subject:In-Reply-To:From; b=Gd9IP5qDXi090Y7mtwMwRH9xLZcuJTpwXUbGR+Cn6JwWG9tglX7wOpnKEWJj6SSJF nPFqm92tdHj1+7WhHZy447ySuvNdXANTFPouXLr4Wa8Ed8ZgO6VTKaacDHkCjOIgNt tG+79vX3OFzl89C1B1RYPeZXPXoZGHu2Ff9O72cc= Date: Fri, 05 Nov 2021 13:42:58 -0700 From: Andrew Morton To: akpm@linux-foundation.org, changbin.du@intel.com, chansen3@cisco.com, koct9i@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, torvalds@linux-foundation.org, wangbin224@huawei.com Subject: [patch 159/262] tools/vm/page-types.c: make walk_file() aware of address range option Message-ID: <20211105204258.-SJZLHkr6%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 6E1B3B000188 X-Stat-Signature: wsr87kpqjsm3ny9874amnzgbrxpypugm Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Gd9IP5qD; spf=pass (imf25.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144970-225727 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk 
X-Loop: owner-majordomo@kvack.org List-ID: From: Naoya Horiguchi Subject: tools/vm/page-types.c: make walk_file() aware of address range option

Patch series "tools/vm/page-types.c: a few improvements".

This patchset makes a few improvements to tools/vm/page-types.c. Patch 1/3 makes the -a option (specify address range) work with -f (file cache mode). Patches 2/3 and 3/3 fix minor formatting issues in this tool. These changes should make life a little easier for the users of this tool. Please see the individual patches for more details about the specific issues.

This patch (of 3):

The -a|--addr option is used to limit the address range to be scanned for page status. It currently works for the physical address space (default mode) and for the virtual address space (with the -p option), but not for the file address space (with the -f option). So make walk_file() aware of the -a option.

Link: https://lkml.kernel.org/r/20211004061325.1525902-1-naoya.horiguchi@linux.dev
Link: https://lkml.kernel.org/r/20211004061325.1525902-2-naoya.horiguchi@linux.dev
Signed-off-by: Naoya Horiguchi
Cc: Konstantin Khlebnikov
Cc: Christian Hansen
Cc: Changbin Du
Cc: Bin Wang
Signed-off-by: Andrew Morton
--- tools/vm/page-types.c | 24 ++++++++++++++++++------ 1 file changed, 18 insertions(+), 6 deletions(-) --- a/tools/vm/page-types.c~tools-vm-page-typesc-make-walk_file-aware-of-address-range-option +++ a/tools/vm/page-types.c @@ -967,22 +967,19 @@ static struct sigaction sigbus_action = .sa_flags = SA_SIGINFO, }; -static void walk_file(const char *name, const struct stat *st) +static void walk_file_range(const char *name, int fd, + unsigned long off, unsigned long end) { uint8_t vec[PAGEMAP_BATCH]; uint64_t buf[PAGEMAP_BATCH], flags; uint64_t cgroup = 0; uint64_t mapcnt = 0; unsigned long nr_pages, pfn, i; - off_t off, end = st->st_size; - int fd; ssize_t len; void *ptr; int first = 1; - fd = checked_open(name, O_RDONLY|O_NOATIME|O_NOFOLLOW); - - for (off = 0; off < end; off += len) { + for (; off < end; off += len) { nr_pages = (end - off
+ page_size - 1) / page_size; if (nr_pages > PAGEMAP_BATCH) nr_pages = PAGEMAP_BATCH; @@ -1043,6 +1040,21 @@ got_sigbus: flags, cgroup, mapcnt, buf[i]); } } +} + +static void walk_file(const char *name, const struct stat *st) +{ + int i; + int fd; + + fd = checked_open(name, O_RDONLY|O_NOATIME|O_NOFOLLOW); + + if (!nr_addr_ranges) + add_addr_range(0, st->st_size / page_size); + + for (i = 0; i < nr_addr_ranges; i++) + walk_file_range(name, fd, opt_offset[i] * page_size, + (opt_offset[i] + opt_size[i]) * page_size); close(fd); } From patchwork Fri Nov 5 20:43:01 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605679 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C99C9C433EF for ; Fri, 5 Nov 2021 20:43:04 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 7F5B761355 for ; Fri, 5 Nov 2021 20:43:04 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 7F5B761355 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 235F694009A; Fri, 5 Nov 2021 16:43:04 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 1BE0B940093; Fri, 5 Nov 2021 16:43:04 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0D6DC94009A; Fri, 5 Nov 2021 16:43:04 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0197.hostedemail.com [216.40.44.197]) by kanga.kvack.org (Postfix) with ESMTP id F0BDD940093 for ; Fri, 5 Nov 2021 16:43:03 -0400 (EDT) 
Received: from smtpin02.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id B7CF11856C04B for ; Fri, 5 Nov 2021 20:43:03 +0000 (UTC) X-FDA: 78776051046.02.657973A Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf17.hostedemail.com (Postfix) with ESMTP id 55FA9F00039B for ; Fri, 5 Nov 2021 20:43:03 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 459A46128E; Fri, 5 Nov 2021 20:43:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144982; bh=ShfnrBTr9wSsfW0j2EnGKceQ9kQk/11wpBiFqD1/xWA=; h=Date:From:To:Subject:In-Reply-To:From; b=I7/Js934+CygG4oqWKKAkSDQxRX7vQHDOstqkx92ptwODznchw7e74vgWBH4fdCx3 AA0zcbMe0lbwr+Bf1HKFUOmM8REfjthU9xth/Dl4VzM/89ZhyAn39p/JLjqtHxMKcl CGP3VFPulzr72jE04Q2t+/Lx4QmGd9g6t8fX7VBU= Date: Fri, 05 Nov 2021 13:43:01 -0700 From: Andrew Morton To: akpm@linux-foundation.org, changbin.du@intel.com, chansen3@cisco.com, koct9i@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, torvalds@linux-foundation.org, wangbin224@huawei.com Subject: [patch 160/262] tools/vm/page-types.c: move show_file() to summary output Message-ID: <20211105204301.A1DZ4so80%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 55FA9F00039B X-Stat-Signature: zzbksxj1cu5czfdykaheey34szs1j34g Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="I7/Js934"; spf=pass (imf17.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144983-638400 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Naoya Horiguchi 
Subject: tools/vm/page-types.c: move show_file() to summary output

Currently, file info from show_file() is printed within the page list as shown below, which makes the page list slightly inconvenient to consume from other scripts (it may need additional filtering).

$ ./page-types -f page-types.c -l
foffset offset  len     flags
page-types.c    Inode: 15108680 Size: 30953 (8 pages)
Modify: Sat Oct 2 23:11:20 2021 (2399 seconds ago)
Access: Sat Oct 2 23:11:28 2021 (2391 seconds ago)
0       d9f59e  1       ___U_lA____________________________________
1       1031eb5 1       __RU_l_____________________________________
2       13bf717 1       __RU_l_____________________________________
3       13ac333 1       ___U_lA____________________________________
4       d9f59f  1       __RU_l_____________________________________
5       183fd49 1       ___U_lA____________________________________
6       13cbf69 1       ___U_lA____________________________________
7       d9ef05  1       ___U_lA____________________________________

             flags      page-count       MB  symbolic-flags                               long-symbolic-flags
0x000000000000002c               3        0  __RU_l_____________________________________  referenced,uptodate,lru
0x0000000000000068               5        0  ___U_lA____________________________________  uptodate,lru,active
             total               8        0

With this patch, file info is printed in the summary part instead:

$ ./page-types -f page-types.c -l
foffset offset  len     flags
0       d9f59e  1       ___U_lA_____________________________________
1       1031eb5 1       __RU_l______________________________________
2       13bf717 1       __RU_l______________________________________
3       13ac333 1       ___U_lA_____________________________________
4       d9f59f  1       __RU_l______________________________________
5       183fd49 1       ___U_lA_____________________________________
6       13cbf69 1       ___U_lA_____________________________________

page-types.c    Inode: 15108680 Size: 30953 (8 pages)
Modify: Sat Oct 2 23:11:20 2021 (2435 seconds ago)
Access: Sat Oct 2 23:11:28 2021 (2427 seconds ago)

             flags      page-count       MB  symbolic-flags                                long-symbolic-flags
0x000000000000002c               3        0  __RU_l______________________________________  referenced,uptodate,lru
0x0000000000000068               4        0  ___U_lA_____________________________________  uptodate,lru,active
             total               7        0

Link: https://lkml.kernel.org/r/20211004061325.1525902-3-naoya.horiguchi@linux.dev
Signed-off-by: Naoya Horiguchi
Cc: Bin Wang
Cc: Changbin Du
Cc: Christian Hansen
Cc: Konstantin Khlebnikov
Signed-off-by: Andrew Morton
---

 tools/vm/page-types.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

--- a/tools/vm/page-types.c~tools-vm-page-typesc-move-show_file-to-summary-output
+++ a/tools/vm/page-types.c
@@ -1034,7 +1034,6 @@ got_sigbus:
 			if (first && opt_list) {
 				first = 0;
 				flush_page_range();
-				show_file(name, st);
 			}
 			add_page(off / page_size + i, pfn,
 				 flags, cgroup, mapcnt, buf[i]);
@@ -1074,10 +1073,10 @@ int walk_tree(const char *name, const st
 	return 0;
 }
 
+struct stat st;
+
 static void walk_page_cache(void)
 {
-	struct stat st;
-
 	kpageflags_fd = checked_open(opt_kpageflags, O_RDONLY);
 	pagemap_fd = checked_open("/proc/self/pagemap", O_RDONLY);
 	sigaction(SIGBUS, &sigbus_action, NULL);
@@ -1374,6 +1373,11 @@ int main(int argc, char *argv[])
 	if (opt_list)
 		printf("\n\n");
 
+	if (opt_file) {
+		show_file(opt_file, &st);
+		printf("\n");
+	}
+
 	show_summary();
 
 	if (opt_list_mapcnt)

From patchwork Fri Nov 5 20:43:04 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605681
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 96B95C433FE for ; Fri, 5 Nov 2021 20:43:13 +0000 (UTC)
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 48A0161355 for ; Fri, 5 Nov 2021 20:43:13 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 48A0161355
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none)
header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id E453594009B; Fri, 5 Nov 2021 16:43:12 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id DF2A5940093; Fri, 5 Nov 2021 16:43:12 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D09DA94009B; Fri, 5 Nov 2021 16:43:12 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0195.hostedemail.com [216.40.44.195]) by kanga.kvack.org (Postfix) with ESMTP id C0EA9940093 for ; Fri, 5 Nov 2021 16:43:12 -0400 (EDT) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 882DC779B0 for ; Fri, 5 Nov 2021 20:43:12 +0000 (UTC) X-FDA: 78776051424.24.5D78A56 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf15.hostedemail.com (Postfix) with ESMTP id 727ABD0000A9 for ; Fri, 5 Nov 2021 20:42:55 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 540D261355; Fri, 5 Nov 2021 20:43:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144985; bh=I2DItkYvN5bnUgDVb2gxsHA4B4GOX2xKFZqeBeAEkHY=; h=Date:From:To:Subject:In-Reply-To:From; b=h5GqdKZ6A6sZ8rn+RoZMkM1IUptaslU9dd0bDvYFp4Czxkg5mPZpmGOWCqwTWWxl0 O+9qBz5eLWztDvhWstGtsfdZPonJaj31MutEneDNRhjTg/U0+MNALbqWxOy6RxvOxu 1YKMDQLXtccMuRAmb+/tl9YL7q1QXgrHmzeSZRKs= Date: Fri, 05 Nov 2021 13:43:04 -0700 From: Andrew Morton To: akpm@linux-foundation.org, changbin.du@intel.com, chansen3@cisco.com, koct9i@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, naoya.horiguchi@nec.com, torvalds@linux-foundation.org, wangbin224@huawei.com Subject: [patch 161/262] tools/vm/page-types.c: print file offset in hexadecimal Message-ID: <20211105204304.q6OqMQ7kF%akpm@linux-foundation.org> In-Reply-To: 
<20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=h5GqdKZ6; dmarc=none; spf=pass (imf15.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 727ABD0000A9 X-Stat-Signature: 9yhgz8r9s5cka75ybnbr6f4d34bqgzkd X-HE-Tag: 1636144975-410684 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Naoya Horiguchi Subject: tools/vm/page-types.c: print file offset in hexadecimal In page list mode (with -l and -L option), virtual address and physical address are printed in hexadecimal, but file offset is not, which is confusing, so let's align it. Link: https://lkml.kernel.org/r/20211004061325.1525902-4-naoya.horiguchi@linux.dev Signed-off-by: Naoya Horiguchi Cc: Bin Wang Cc: Changbin Du Cc: Christian Hansen Cc: Konstantin Khlebnikov Signed-off-by: Andrew Morton --- tools/vm/page-types.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/tools/vm/page-types.c~tools-vm-page-typesc-print-file-offset-in-hexadecimal +++ a/tools/vm/page-types.c @@ -390,7 +390,7 @@ static void show_page_range(unsigned lon if (opt_pid) printf("%lx\t", voff); if (opt_file) - printf("%lu\t", voff); + printf("%lx\t", voff); if (opt_list_cgroup) printf("@%llu\t", (unsigned long long)cgroup0); if (opt_list_mapcnt) @@ -418,7 +418,7 @@ static void show_page(unsigned long voff if (opt_pid) printf("%lx\t", voffset); if (opt_file) - printf("%lu\t", voffset); + printf("%lx\t", voffset); if (opt_list_cgroup) printf("@%llu\t", (unsigned long long)cgroup); if (opt_list_mapcnt) From patchwork Fri Nov 5 20:43:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605683 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B3595C433EF for ; Fri, 5 Nov 2021 20:43:14 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 643D061355 for ; Fri, 5 Nov 2021 20:43:14 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 643D061355 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id BB28F94009C; Fri, 5 Nov 2021 16:43:13 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B5B72940093; Fri, 5 Nov 2021 16:43:13 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9FD3094009C; Fri, 5 Nov 2021 16:43:13 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0117.hostedemail.com [216.40.44.117]) by kanga.kvack.org (Postfix) with ESMTP id 8F030940093 for ; Fri, 5 Nov 2021 16:43:13 -0400 (EDT) Received: from smtpin36.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 545C1779B8 for ; Fri, 5 Nov 2021 20:43:13 +0000 (UTC) X-FDA: 78776051466.36.8784AF9 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf27.hostedemail.com (Postfix) with ESMTP id 5F7E670009CE for ; Fri, 5 Nov 2021 20:43:09 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 5FACF61351; Fri, 5 Nov 2021 20:43:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144988; bh=WOVh97OTdGCAQNLoT+0RB6o5sJyaUskNsPPKcjdv91Y=; 
h=Date:From:To:Subject:In-Reply-To:From; b=ZBdyRlzE+tDCzJUVU+Y1MAz6wl7B6dsaoROHbE2zUiINFnkTqf52lmM1Fd8U9XKSR 0P9iL4wvz5RDQnnurEFChIIsteGb4cx4JS6oJMgpKseISesyaV+BDf68TGKJj5EPfl 9lwWIBDs+K4epUz2qiwvOkb7ROOLxkv0M/v1JLfk= Date: Fri, 05 Nov 2021 13:43:07 -0700 From: Andrew Morton To: akpm@linux-foundation.org, christophe.leroy@csgroup.eu, jgross@suse.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, rppt@linux.ibm.com, Shahab.Vahedi@synopsys.com, torvalds@linux-foundation.org Subject: [patch 162/262] arch_numa: simplify numa_distance allocation Message-ID: <20211105204307.tnUYssaTr%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 5F7E670009CE X-Stat-Signature: kfuzgkhi5f97jr4fu53je7f3zztcejuj Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=ZBdyRlzE; spf=pass (imf27.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144989-231094 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Mike Rapoport Subject: arch_numa: simplify numa_distance allocation Patch series "memblock: cleanup memblock_free interface", v2. This is the fix for memblock freeing APIs mismatch [1]. The first patch is a cleanup of numa_distance allocation in arch_numa I've spotted during the conversion. The second patch is a fix for Xen memory freeing on some of the error paths. 
[1] https://lore.kernel.org/all/CAHk-=wj9k4LZTz+svCxLYs5Y1=+yKrbAUArH1+ghyG3OLd8VVg@mail.gmail.com This patch (of 6): Memory allocation of numa_distance uses memblock_phys_alloc_range() without actual range limits, converts the returned physical address to virtual and then only uses the virtual address for further initialization. Simplify this by replacing memblock_phys_alloc_range() with memblock_alloc(). Link: https://lkml.kernel.org/r/20210930185031.18648-1-rppt@kernel.org Link: https://lkml.kernel.org/r/20210930185031.18648-2-rppt@kernel.org Signed-off-by: Mike Rapoport Cc: Christophe Leroy Cc: Juergen Gross Cc: Shahab Vahedi Signed-off-by: Andrew Morton --- drivers/base/arch_numa.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) --- a/drivers/base/arch_numa.c~arch_numa-simplify-numa_distance-allocation +++ a/drivers/base/arch_numa.c @@ -337,15 +337,13 @@ void __init numa_free_distance(void) static int __init numa_alloc_distance(void) { size_t size; - u64 phys; int i, j; size = nr_node_ids * nr_node_ids * sizeof(numa_distance[0]); - phys = memblock_phys_alloc_range(size, PAGE_SIZE, 0, PFN_PHYS(max_pfn)); - if (WARN_ON(!phys)) + numa_distance = memblock_alloc(size, PAGE_SIZE); + if (WARN_ON(!numa_distance)) return -ENOMEM; - numa_distance = __va(phys); numa_distance_cnt = nr_node_ids; /* fill with the default distances */ From patchwork Fri Nov 5 20:43:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605767 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F241BC433FE for ; Fri, 5 Nov 2021 20:44:13 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C36E361357 for ; Fri, 5 Nov 2021 20:44:13 +0000 (UTC) 
DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org C36E361357 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 542869400B0; Fri, 5 Nov 2021 16:44:13 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 4F120940093; Fri, 5 Nov 2021 16:44:13 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 406C09400B0; Fri, 5 Nov 2021 16:44:13 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0169.hostedemail.com [216.40.44.169]) by kanga.kvack.org (Postfix) with ESMTP id 33463940093 for ; Fri, 5 Nov 2021 16:44:13 -0400 (EDT) Received: from smtpin08.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id F1E45779B0 for ; Fri, 5 Nov 2021 20:44:12 +0000 (UTC) X-FDA: 78776053944.08.57499DD Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf04.hostedemail.com (Postfix) with ESMTP id ADB68500031F for ; Fri, 5 Nov 2021 20:43:03 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 602586128E; Fri, 5 Nov 2021 20:43:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144991; bh=eM5N+pm1njP5X63INnwBElbqxszRLEi2R6ykway0H8I=; h=Date:From:To:Subject:In-Reply-To:From; b=zwvHcDi7wRZBEX9/2stJt81QJ0NQgqdrOtrX9V6WjW80F37V8x41WgZzp96NdweAH eL8vSUOinhIo4kqZRVPWfDh8lDfJq9tTsgCCjHhkOwhO3/pz8IO4CdoL9UAsze8UWt 5A7yF1FypTwnjWeFl90qMXNq/+nt9ydNGpqy/+50= Date: Fri, 05 Nov 2021 13:43:10 -0700 From: Andrew Morton To: akpm@linux-foundation.org, christophe.leroy@csgroup.eu, jgross@suse.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, rppt@linux.ibm.com, Shahab.Vahedi@synopsys.com, torvalds@linux-foundation.org Subject: [patch 163/262] xen/x86: free_p2m_page: use 
 memblock_free_ptr() to free a virtual pointer
Message-ID: <20211105204310.GVV96aqm6%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>
User-Agent: s-nail v14.8.16
Authentication-Results: imf04.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=zwvHcDi7; dmarc=none; spf=pass (imf04.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org
X-Rspamd-Server: rspam04
X-Rspamd-Queue-Id: ADB68500031F
X-Stat-Signature: ky5te53tgo1prp6mstm16e9iihmewrce
X-HE-Tag: 1636144983-520205
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Mike Rapoport
Subject: xen/x86: free_p2m_page: use memblock_free_ptr() to free a virtual pointer

free_p2m_page() wrongly passes a virtual pointer to memblock_free(), which treats it as a physical address. Call memblock_free_ptr() instead, which takes the virtual address of the memory to free.
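
To illustrate the class of bug described above, here is a minimal user-space sketch; it is not kernel code. It assumes a toy linear mapping (virt = phys + TOY_PAGE_OFFSET), and every toy_* name and both TOY_* constants are hypothetical, made up for this example only. toy_free() stands in for an interface that expects a physical base address, and toy_free_ptr() for one that accepts a virtual pointer and converts it before freeing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy linear map: virt = phys + TOY_PAGE_OFFSET. */
#define TOY_PAGE_OFFSET ((uintptr_t)0x1000000)
#define TOY_PAGE_SIZE   ((size_t)4096)

typedef uintptr_t toy_phys_addr_t;

/* Record of the last range handed to toy_free(), for inspection. */
static toy_phys_addr_t toy_last_freed_base;
static size_t toy_last_freed_size;

/* Toy counterpart of a virtual-to-physical conversion. */
static toy_phys_addr_t toy_pa(const void *virt)
{
	return (uintptr_t)virt - TOY_PAGE_OFFSET;
}

/* Toy counterpart of a physical-to-virtual conversion. */
static void *toy_va(toy_phys_addr_t phys)
{
	return (void *)(phys + TOY_PAGE_OFFSET);
}

/* Expects a PHYSICAL base address; it cannot tell if it got a virtual one. */
static void toy_free(toy_phys_addr_t base, size_t size)
{
	assert(size > 0);
	toy_last_freed_base = base;
	toy_last_freed_size = size;
}

/* Expects a VIRTUAL pointer and converts it before freeing. */
static void toy_free_ptr(void *ptr, size_t size)
{
	if (ptr)
		toy_free(toy_pa(ptr), size);
}

/* Returns 0 if the buggy and the fixed call behave as described. */
int toy_demo(void)
{
	const toy_phys_addr_t phys = 0x2000;
	void *p = toy_va(phys);

	/* Buggy pattern: a virtual pointer cast to an integer and passed
	 * where a physical address is expected; the wrong range is freed. */
	toy_free((toy_phys_addr_t)p, TOY_PAGE_SIZE);
	if (toy_last_freed_base == phys)
		return -1;

	/* Fixed pattern: the pointer-taking variant does the conversion. */
	toy_free_ptr(p, TOY_PAGE_SIZE);
	if (toy_last_freed_base != phys || toy_last_freed_size != TOY_PAGE_SIZE)
		return -1;

	return 0;
}
```

In this model the cast compiles without complaint, which is why such a mismatch is easy to miss; only the pointer-taking variant frees the intended range.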
Link: https://lkml.kernel.org/r/20210930185031.18648-3-rppt@kernel.org Signed-off-by: Mike Rapoport Reviewed-by: Juergen Gross Cc: Christophe Leroy Cc: Shahab Vahedi Signed-off-by: Andrew Morton --- arch/x86/xen/p2m.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/arch/x86/xen/p2m.c~xen-x86-free_p2m_page-use-memblock_free_ptr-to-free-a-virtual-pointer +++ a/arch/x86/xen/p2m.c @@ -197,7 +197,7 @@ static void * __ref alloc_p2m_page(void) static void __ref free_p2m_page(void *p) { if (unlikely(!slab_is_available())) { - memblock_free((unsigned long)p, PAGE_SIZE); + memblock_free_ptr(p, PAGE_SIZE); return; } From patchwork Fri Nov 5 20:43:13 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605685 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 049F8C433EF for ; Fri, 5 Nov 2021 20:43:17 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id AFB1A61356 for ; Fri, 5 Nov 2021 20:43:16 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org AFB1A61356 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 4CC9D94009D; Fri, 5 Nov 2021 16:43:16 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 4B19D940093; Fri, 5 Nov 2021 16:43:16 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 345D094009D; Fri, 5 Nov 2021 16:43:16 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0074.hostedemail.com [216.40.44.74]) by 
kanga.kvack.org (Postfix) with ESMTP id 25279940093 for ; Fri, 5 Nov 2021 16:43:16 -0400 (EDT) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id E1E39779AF for ; Fri, 5 Nov 2021 20:43:15 +0000 (UTC) X-FDA: 78776051550.24.F107C4A Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf19.hostedemail.com (Postfix) with ESMTP id 1EA9FB0000AC for ; Fri, 5 Nov 2021 20:43:07 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 6934B61357; Fri, 5 Nov 2021 20:43:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144994; bh=+o0HbT/J9cKxynm+xi0lEs5z0vBe4932r+O+sGWnhIM=; h=Date:From:To:Subject:In-Reply-To:From; b=ouzbfGNvO4MgGByB4hfDF25YLfERjdrqeTcOcFAYSPT5FQ/1BibutX3E6cAI53IfH l0eIs2hBlCH53MhD40MEKa+BWk0dRwSQUtLMTlpHJrLOcXwvnkdHjNWzd34S1BOfag OdACEi+x+/FqifYdsdUSzxEN6yLOxymaKn13GnjU= Date: Fri, 05 Nov 2021 13:43:13 -0700 From: Andrew Morton To: akpm@linux-foundation.org, christophe.leroy@csgroup.eu, jgross@suse.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, rppt@linux.ibm.com, Shahab.Vahedi@synopsys.com, torvalds@linux-foundation.org Subject: [patch 164/262] memblock: drop memblock_free_early_nid() and memblock_free_early() Message-ID: <20211105204313.z0c1HtZfl%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 1EA9FB0000AC X-Stat-Signature: 6y6kznh74uiq3pwq5mhp7xtwtziyynfd Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=ouzbfGNv; dmarc=none; spf=pass (imf19.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636144987-208853 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org 
Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Mike Rapoport Subject: memblock: drop memblock_free_early_nid() and memblock_free_early() memblock_free_early_nid() is unused and memblock_free_early() is an alias for memblock_free(). Replace calls to memblock_free_early() with calls to memblock_free() and remove memblock_free_early() and memblock_free_early_nid(). Link: https://lkml.kernel.org/r/20210930185031.18648-4-rppt@kernel.org Signed-off-by: Mike Rapoport Cc: Christophe Leroy Cc: Juergen Gross Cc: Shahab Vahedi Signed-off-by: Andrew Morton --- arch/mips/mm/init.c | 2 +- arch/powerpc/platforms/pseries/svm.c | 3 +-- arch/s390/kernel/smp.c | 2 +- drivers/base/arch_numa.c | 2 +- drivers/s390/char/sclp_early.c | 2 +- include/linux/memblock.h | 12 ------------ kernel/dma/swiotlb.c | 2 +- lib/cpumask.c | 2 +- mm/percpu.c | 8 ++++---- mm/sparse.c | 2 +- 10 files changed, 12 insertions(+), 25 deletions(-) --- a/arch/mips/mm/init.c~memblock-drop-memblock_free_early_nid-and-memblock_free_early +++ a/arch/mips/mm/init.c @@ -529,7 +529,7 @@ static void * __init pcpu_fc_alloc(unsig static void __init pcpu_fc_free(void *ptr, size_t size) { - memblock_free_early(__pa(ptr), size); + memblock_free(__pa(ptr), size); } void __init setup_per_cpu_areas(void) --- a/arch/powerpc/platforms/pseries/svm.c~memblock-drop-memblock_free_early_nid-and-memblock_free_early +++ a/arch/powerpc/platforms/pseries/svm.c @@ -56,8 +56,7 @@ void __init svm_swiotlb_init(void) return; - memblock_free_early(__pa(vstart), - PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT)); + memblock_free(__pa(vstart), PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT)); panic("SVM: Cannot allocate SWIOTLB buffer"); } --- a/arch/s390/kernel/smp.c~memblock-drop-memblock_free_early_nid-and-memblock_free_early +++ a/arch/s390/kernel/smp.c @@ -880,7 +880,7 @@ void __init smp_detect_cpus(void) /* Add CPUs present at boot */ __smp_rescan_cpus(info, true); - memblock_free_early((unsigned long)info, sizeof(*info)); + 
memblock_free((unsigned long)info, sizeof(*info)); } /* --- a/drivers/base/arch_numa.c~memblock-drop-memblock_free_early_nid-and-memblock_free_early +++ a/drivers/base/arch_numa.c @@ -166,7 +166,7 @@ static void * __init pcpu_fc_alloc(unsig static void __init pcpu_fc_free(void *ptr, size_t size) { - memblock_free_early(__pa(ptr), size); + memblock_free(__pa(ptr), size); } #ifdef CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK --- a/drivers/s390/char/sclp_early.c~memblock-drop-memblock_free_early_nid-and-memblock_free_early +++ a/drivers/s390/char/sclp_early.c @@ -139,7 +139,7 @@ int __init sclp_early_get_core_info(stru } sclp_fill_core_info(info, sccb); out: - memblock_free_early((unsigned long)sccb, length); + memblock_free((unsigned long)sccb, length); return rc; } --- a/include/linux/memblock.h~memblock-drop-memblock_free_early_nid-and-memblock_free_early +++ a/include/linux/memblock.h @@ -441,18 +441,6 @@ static inline void *memblock_alloc_node( MEMBLOCK_ALLOC_ACCESSIBLE, nid); } -static inline void memblock_free_early(phys_addr_t base, - phys_addr_t size) -{ - memblock_free(base, size); -} - -static inline void memblock_free_early_nid(phys_addr_t base, - phys_addr_t size, int nid) -{ - memblock_free(base, size); -} - static inline void memblock_free_late(phys_addr_t base, phys_addr_t size) { __memblock_free_late(base, size); --- a/kernel/dma/swiotlb.c~memblock-drop-memblock_free_early_nid-and-memblock_free_early +++ a/kernel/dma/swiotlb.c @@ -247,7 +247,7 @@ swiotlb_init(int verbose) return; fail_free_mem: - memblock_free_early(__pa(tlb), bytes); + memblock_free(__pa(tlb), bytes); fail: pr_warn("Cannot allocate buffer"); } --- a/lib/cpumask.c~memblock-drop-memblock_free_early_nid-and-memblock_free_early +++ a/lib/cpumask.c @@ -188,7 +188,7 @@ EXPORT_SYMBOL(free_cpumask_var); */ void __init free_bootmem_cpumask_var(cpumask_var_t mask) { - memblock_free_early(__pa(mask), cpumask_size()); + memblock_free(__pa(mask), cpumask_size()); } #endif --- 
a/mm/percpu.c~memblock-drop-memblock_free_early_nid-and-memblock_free_early +++ a/mm/percpu.c @@ -2472,7 +2472,7 @@ struct pcpu_alloc_info * __init pcpu_all */ void __init pcpu_free_alloc_info(struct pcpu_alloc_info *ai) { - memblock_free_early(__pa(ai), ai->__ai_size); + memblock_free(__pa(ai), ai->__ai_size); } /** @@ -3134,7 +3134,7 @@ out_free_areas: out_free: pcpu_free_alloc_info(ai); if (areas) - memblock_free_early(__pa(areas), areas_size); + memblock_free(__pa(areas), areas_size); return rc; } #endif /* BUILD_EMBED_FIRST_CHUNK */ @@ -3256,7 +3256,7 @@ enomem: free_fn(page_address(pages[j]), PAGE_SIZE); rc = -ENOMEM; out_free_ar: - memblock_free_early(__pa(pages), pages_size); + memblock_free(__pa(pages), pages_size); pcpu_free_alloc_info(ai); return rc; } @@ -3286,7 +3286,7 @@ static void * __init pcpu_dfl_fc_alloc(u static void __init pcpu_dfl_fc_free(void *ptr, size_t size) { - memblock_free_early(__pa(ptr), size); + memblock_free(__pa(ptr), size); } void __init setup_per_cpu_areas(void) --- a/mm/sparse.c~memblock-drop-memblock_free_early_nid-and-memblock_free_early +++ a/mm/sparse.c @@ -451,7 +451,7 @@ static void *sparsemap_buf_end __meminit static inline void __meminit sparse_buffer_free(unsigned long size) { WARN_ON(!sparsemap_buf || size == 0); - memblock_free_early(__pa(sparsemap_buf), size); + memblock_free(__pa(sparsemap_buf), size); } static void __init sparse_buffer_init(unsigned long size, int nid) From patchwork Fri Nov 5 20:43:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605983 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C378FC433FE for ; Fri, 5 Nov 2021 20:52:07 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by 
mail.kernel.org (Postfix) with ESMTP id 74E6860C51 for ; Fri, 5 Nov 2021 20:52:07 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 74E6860C51 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id E6BFC9400FA; Fri, 5 Nov 2021 16:52:05 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D98BC940103; Fri, 5 Nov 2021 16:52:05 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B02689400FA; Fri, 5 Nov 2021 16:52:05 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0020.hostedemail.com [216.40.44.20]) by kanga.kvack.org (Postfix) with ESMTP id 97B5E940100 for ; Fri, 5 Nov 2021 16:52:05 -0400 (EDT) Received: from smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 596FB1856C2C1 for ; Fri, 5 Nov 2021 20:52:05 +0000 (UTC) X-FDA: 78776073810.16.1AA5D3D Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf31.hostedemail.com (Postfix) with ESMTP id A8C63104AAFA for ; Fri, 5 Nov 2021 20:51:56 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 515DA61351; Fri, 5 Nov 2021 20:43:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636144997; bh=jmyhl32xkoAq0TuNyYb3SnywAkRCbzI4LXkENqNH3EU=; h=Date:From:To:Subject:In-Reply-To:From; b=X4Omg7Y8PJWQJzen1OxC4tSkM5wxSjQ49zS39uLIktoqOfRa8kBEscF8ZIo/yTMa2 H4u/iC9dO0w1ApE4gaLwUAKhnxNQrP2xZbdGA8j1NCgE7ISs4dK0dJfiZzX+FBeYxS k8hNP8y7UzEwAJW0LAWonmVnC1ueAkDcq7SPWN4o= Date: Fri, 05 Nov 2021 13:43:16 -0700 From: Andrew Morton To: akpm@linux-foundation.org, christophe.leroy@csgroup.eu, jgross@suse.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, rppt@linux.ibm.com, 
 Shahab.Vahedi@synopsys.com, torvalds@linux-foundation.org
Subject: [patch 165/262] memblock: stop aliasing __memblock_free_late with memblock_free_late
Message-ID: <20211105204316.KFtzxQkwE%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>
User-Agent: s-nail v14.8.16
X-Rspamd-Server: rspam01
X-Rspamd-Queue-Id: A8C63104AAFA
X-Stat-Signature: 5diofraxa1q7pefyj8cdwu4f669gm43x
Authentication-Results: imf31.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=X4Omg7Y8; dmarc=none; spf=pass (imf31.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org
X-HE-Tag: 1636145516-336337
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Mike Rapoport
Subject: memblock: stop aliasing __memblock_free_late with memblock_free_late

memblock_free_late() is a NOP wrapper for __memblock_free_late(), so there is no point in keeping this indirection. Drop the wrapper and rename __memblock_free_late() to memblock_free_late().
Link: https://lkml.kernel.org/r/20210930185031.18648-5-rppt@kernel.org Signed-off-by: Mike Rapoport Cc: Christophe Leroy Cc: Juergen Gross Cc: Shahab Vahedi Signed-off-by: Andrew Morton --- include/linux/memblock.h | 7 +------ mm/memblock.c | 8 ++++---- 2 files changed, 5 insertions(+), 10 deletions(-) --- a/include/linux/memblock.h~memblock-stop-aliasing-__memblock_free_late-with-memblock_free_late +++ a/include/linux/memblock.h @@ -133,7 +133,7 @@ void __next_mem_range_rev(u64 *idx, int struct memblock_type *type_b, phys_addr_t *out_start, phys_addr_t *out_end, int *out_nid); -void __memblock_free_late(phys_addr_t base, phys_addr_t size); +void memblock_free_late(phys_addr_t base, phys_addr_t size); #ifdef CONFIG_HAVE_MEMBLOCK_PHYS_MAP static inline void __next_physmem_range(u64 *idx, struct memblock_type *type, @@ -441,11 +441,6 @@ static inline void *memblock_alloc_node( MEMBLOCK_ALLOC_ACCESSIBLE, nid); } -static inline void memblock_free_late(phys_addr_t base, phys_addr_t size) -{ - __memblock_free_late(base, size); -} - /* * Set the allocation direction to bottom-up or top-down. 
*/ --- a/mm/memblock.c~memblock-stop-aliasing-__memblock_free_late-with-memblock_free_late +++ a/mm/memblock.c @@ -366,14 +366,14 @@ void __init memblock_discard(void) addr = __pa(memblock.reserved.regions); size = PAGE_ALIGN(sizeof(struct memblock_region) * memblock.reserved.max); - __memblock_free_late(addr, size); + memblock_free_late(addr, size); } if (memblock.memory.regions != memblock_memory_init_regions) { addr = __pa(memblock.memory.regions); size = PAGE_ALIGN(sizeof(struct memblock_region) * memblock.memory.max); - __memblock_free_late(addr, size); + memblock_free_late(addr, size); } memblock_memory = NULL; @@ -1589,7 +1589,7 @@ void * __init memblock_alloc_try_nid( } /** - * __memblock_free_late - free pages directly to buddy allocator + * memblock_free_late - free pages directly to buddy allocator * @base: phys starting address of the boot memory block * @size: size of the boot memory block in bytes * @@ -1597,7 +1597,7 @@ void * __init memblock_alloc_try_nid( * down, but we are still initializing the system. Pages are released directly * to the buddy allocator. 
*/ -void __init __memblock_free_late(phys_addr_t base, phys_addr_t size) +void __init memblock_free_late(phys_addr_t base, phys_addr_t size) { phys_addr_t cursor, end; From patchwork Fri Nov 5 20:43:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605991 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 25329C433EF for ; Fri, 5 Nov 2021 20:52:20 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A32A860E73 for ; Fri, 5 Nov 2021 20:52:19 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org A32A860E73 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 3FFFE940104; Fri, 5 Nov 2021 16:52:19 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 3D6CE940102; Fri, 5 Nov 2021 16:52:19 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 14132940106; Fri, 5 Nov 2021 16:52:19 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0201.hostedemail.com [216.40.44.201]) by kanga.kvack.org (Postfix) with ESMTP id 044FF940102 for ; Fri, 5 Nov 2021 16:52:19 -0400 (EDT) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id B5E3C8249980 for ; Fri, 5 Nov 2021 20:52:18 +0000 (UTC) X-FDA: 78776074356.07.8CA15E5 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf09.hostedemail.com (Postfix) with ESMTP id 4BA01300010B for ; Fri, 5 Nov 2021 
20:52:18 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 4184461356; Fri, 5 Nov 2021 20:43:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145000; bh=gd/LVIsGjp+Yx4SMIZxOuuTBkmjQFnxJkin/mYoamQE=; h=Date:From:To:Subject:In-Reply-To:From; b=2rGAznqsu+A1t++OxrLBPHcWtbj1S1ZEibz8XcwKM90BK5ddENoe0VXXg01Q6Wjgq IWKh+tv75mwAeAmsPebBMg9hcSJLdT7EXkDEVQpKXd/cJ/txeBw0hibWswuttdRn+W ywqAOufJm9Lmh9k8wCGDEd2/7WOa8JICV4nNvChg= Date: Fri, 05 Nov 2021 13:43:19 -0700 From: Andrew Morton To: akpm@linux-foundation.org, christophe.leroy@csgroup.eu, jgross@suse.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, rppt@linux.ibm.com, Shahab.Vahedi@synopsys.com, torvalds@linux-foundation.org Subject: [patch 166/262] memblock: rename memblock_free to memblock_phys_free Message-ID: <20211105204319.0_eBL3GRR%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 4BA01300010B X-Stat-Signature: dabfrrxb6rgoy1rexd1h73ktkp8hor8k Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=2rGAznqs; spf=pass (imf09.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145538-59004 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Mike Rapoport Subject: memblock: rename memblock_free to memblock_phys_free Since memblock_free() operates on a physical range, make its name reflect it and rename it to memblock_phys_free(), so it will be a logical counterpart to memblock_phys_alloc(). 
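To make the naming distinction concrete, here is a minimal userspace sketch (not kernel code). The mock_* names, the mock_ram buffer and the offset-based mock_pa() translation are assumptions made for illustration only. The sketch follows the patch in one respect: one entry point takes a physical base and size, and the pointer-based helper translates a virtual pointer and forwards to it.

```c
/* Userspace mock, NOT kernel code: it only illustrates the naming split. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;	/* stand-in for the kernel type */

/* Hypothetical "RAM": a physical address is an offset into this buffer. */
unsigned char mock_ram[4096];

/* Record of the last range handed to the physical free routine. */
phys_addr_t last_freed_base;
phys_addr_t last_freed_size;

/* Hypothetical stand-in for __pa(): virtual pointer -> physical address. */
phys_addr_t mock_pa(const void *vaddr)
{
	return (phys_addr_t)((const unsigned char *)vaddr - mock_ram);
}

/* Takes a physical range, like memblock_phys_free() after the rename. */
int mock_memblock_phys_free(phys_addr_t base, phys_addr_t size)
{
	last_freed_base = base;
	last_freed_size = size;
	return 0;
}

/* Takes a virtual pointer, translates it, then forwards the range. */
void mock_memblock_free_ptr(void *ptr, size_t size)
{
	if (ptr)
		mock_memblock_phys_free(mock_pa(ptr), size);
}
```

A caller that already holds a physical address would use the first function directly, while a caller holding a pointer would use the second and never spell out the address translation itself.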
The callers are updated with the below semantic patch: @@ expression addr; expression size; @@ - memblock_free(addr, size); + memblock_phys_free(addr, size); Link: https://lkml.kernel.org/r/20210930185031.18648-6-rppt@kernel.org Signed-off-by: Mike Rapoport Cc: Christophe Leroy Cc: Juergen Gross Cc: Shahab Vahedi Signed-off-by: Andrew Morton --- arch/alpha/kernel/core_irongate.c | 3 ++- arch/arc/mm/init.c | 2 +- arch/arm/mach-hisi/platmcpm.c | 2 +- arch/arm/mm/init.c | 2 +- arch/arm64/mm/mmu.c | 4 ++-- arch/mips/mm/init.c | 2 +- arch/mips/sgi-ip30/ip30-setup.c | 6 +++--- arch/powerpc/kernel/dt_cpu_ftrs.c | 4 ++-- arch/powerpc/kernel/paca.c | 8 ++++---- arch/powerpc/kernel/setup-common.c | 2 +- arch/powerpc/kernel/setup_64.c | 2 +- arch/powerpc/platforms/powernv/pci-ioda.c | 2 +- arch/powerpc/platforms/pseries/svm.c | 3 ++- arch/riscv/kernel/setup.c | 5 +++-- arch/s390/kernel/setup.c | 8 ++++---- arch/s390/kernel/smp.c | 4 ++-- arch/s390/kernel/uv.c | 2 +- arch/s390/mm/kasan_init.c | 2 +- arch/sh/boards/mach-ap325rxa/setup.c | 2 +- arch/sh/boards/mach-ecovec24/setup.c | 4 ++-- arch/sh/boards/mach-kfr2r09/setup.c | 2 +- arch/sh/boards/mach-migor/setup.c | 2 +- arch/sh/boards/mach-se/7724/setup.c | 4 ++-- arch/sparc/kernel/smp_64.c | 2 +- arch/um/kernel/mem.c | 2 +- arch/x86/kernel/setup.c | 4 ++-- arch/x86/mm/init.c | 2 +- arch/x86/xen/mmu_pv.c | 6 +++--- arch/x86/xen/setup.c | 6 +++--- drivers/base/arch_numa.c | 2 +- drivers/firmware/efi/memmap.c | 2 +- drivers/of/kexec.c | 3 +-- drivers/of/of_reserved_mem.c | 5 +++-- drivers/s390/char/sclp_early.c | 2 +- drivers/usb/early/xhci-dbc.c | 10 +++++----- drivers/xen/swiotlb-xen.c | 2 +- include/linux/memblock.h | 2 +- init/initramfs.c | 2 +- kernel/dma/swiotlb.c | 2 +- lib/cpumask.c | 2 +- mm/cma.c | 2 +- mm/memblock.c | 8 ++++---- mm/memory_hotplug.c | 2 +- mm/percpu.c | 8 ++++---- mm/sparse.c | 2 +- 45 files changed, 79 insertions(+), 76 deletions(-) --- 
a/arch/alpha/kernel/core_irongate.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/alpha/kernel/core_irongate.c @@ -233,7 +233,8 @@ albacore_init_arch(void) unsigned long size; size = initrd_end - initrd_start; - memblock_free(__pa(initrd_start), PAGE_ALIGN(size)); + memblock_phys_free(__pa(initrd_start), + PAGE_ALIGN(size)); if (!move_initrd(pci_mem)) printk("irongate_init_arch: initrd too big " "(%ldK)\ndisabling initrd\n", --- a/arch/arc/mm/init.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/arc/mm/init.c @@ -173,7 +173,7 @@ static void __init highmem_init(void) #ifdef CONFIG_HIGHMEM unsigned long tmp; - memblock_free(high_mem_start, high_mem_sz); + memblock_phys_free(high_mem_start, high_mem_sz); for (tmp = min_high_pfn; tmp < max_high_pfn; tmp++) free_highmem_page(pfn_to_page(tmp)); #endif --- a/arch/arm64/mm/mmu.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/arm64/mm/mmu.c @@ -738,8 +738,8 @@ void __init paging_init(void) cpu_replace_ttbr1(lm_alias(swapper_pg_dir)); init_mm.pgd = swapper_pg_dir; - memblock_free(__pa_symbol(init_pg_dir), - __pa_symbol(init_pg_end) - __pa_symbol(init_pg_dir)); + memblock_phys_free(__pa_symbol(init_pg_dir), + __pa_symbol(init_pg_end) - __pa_symbol(init_pg_dir)); memblock_allow_resize(); } --- a/arch/arm/mach-hisi/platmcpm.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/arm/mach-hisi/platmcpm.c @@ -339,7 +339,7 @@ err_fabric: err_sysctrl: iounmap(relocation); err_reloc: - memblock_free(hip04_boot_method[0], hip04_boot_method[1]); + memblock_phys_free(hip04_boot_method[0], hip04_boot_method[1]); err: return ret; } --- a/arch/arm/mm/init.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/arm/mm/init.c @@ -158,7 +158,7 @@ phys_addr_t __init arm_memblock_steal(ph panic("Failed to steal %pa bytes at %pS\n", &size, (void *)_RET_IP_); - memblock_free(phys, size); + memblock_phys_free(phys, size); memblock_remove(phys, size); return phys; --- 
a/arch/mips/mm/init.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/mips/mm/init.c @@ -529,7 +529,7 @@ static void * __init pcpu_fc_alloc(unsig static void __init pcpu_fc_free(void *ptr, size_t size) { - memblock_free(__pa(ptr), size); + memblock_phys_free(__pa(ptr), size); } void __init setup_per_cpu_areas(void) --- a/arch/mips/sgi-ip30/ip30-setup.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/mips/sgi-ip30/ip30-setup.c @@ -69,10 +69,10 @@ static void __init ip30_mem_init(void) total_mem += size; if (addr >= IP30_REAL_MEMORY_START) - memblock_free(addr, size); + memblock_phys_free(addr, size); else if ((addr + size) > IP30_REAL_MEMORY_START) - memblock_free(IP30_REAL_MEMORY_START, - size - IP30_MAX_PROM_MEMORY); + memblock_phys_free(IP30_REAL_MEMORY_START, + size - IP30_MAX_PROM_MEMORY); } pr_info("Detected %luMB of physical memory.\n", MEM_SHIFT(total_mem)); } --- a/arch/powerpc/kernel/dt_cpu_ftrs.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/powerpc/kernel/dt_cpu_ftrs.c @@ -1095,8 +1095,8 @@ static int __init dt_cpu_ftrs_scan_callb cpufeatures_setup_finished(); - memblock_free(__pa(dt_cpu_features), - sizeof(struct dt_cpu_feature)*nr_dt_cpu_features); + memblock_phys_free(__pa(dt_cpu_features), + sizeof(struct dt_cpu_feature) * nr_dt_cpu_features); return 0; } --- a/arch/powerpc/kernel/paca.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/powerpc/kernel/paca.c @@ -322,8 +322,8 @@ void __init free_unused_pacas(void) new_ptrs_size = sizeof(struct paca_struct *) * nr_cpu_ids; if (new_ptrs_size < paca_ptrs_size) - memblock_free(__pa(paca_ptrs) + new_ptrs_size, - paca_ptrs_size - new_ptrs_size); + memblock_phys_free(__pa(paca_ptrs) + new_ptrs_size, + paca_ptrs_size - new_ptrs_size); paca_nr_cpu_ids = nr_cpu_ids; paca_ptrs_size = new_ptrs_size; @@ -331,8 +331,8 @@ void __init free_unused_pacas(void) #ifdef CONFIG_PPC_BOOK3S_64 if (early_radix_enabled()) { /* Ugly fixup, see new_slb_shadow() */ - 
memblock_free(__pa(paca_ptrs[boot_cpuid]->slb_shadow_ptr), - sizeof(struct slb_shadow)); + memblock_phys_free(__pa(paca_ptrs[boot_cpuid]->slb_shadow_ptr), + sizeof(struct slb_shadow)); paca_ptrs[boot_cpuid]->slb_shadow_ptr = NULL; } #endif --- a/arch/powerpc/kernel/setup_64.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/powerpc/kernel/setup_64.c @@ -812,7 +812,7 @@ static void * __init pcpu_alloc_bootmem( static void __init pcpu_free_bootmem(void *ptr, size_t size) { - memblock_free(__pa(ptr), size); + memblock_phys_free(__pa(ptr), size); } static int pcpu_cpu_distance(unsigned int from, unsigned int to) --- a/arch/powerpc/kernel/setup-common.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/powerpc/kernel/setup-common.c @@ -825,7 +825,7 @@ static void __init smp_setup_pacas(void) set_hard_smp_processor_id(cpu, cpu_to_phys_id[cpu]); } - memblock_free(__pa(cpu_to_phys_id), nr_cpu_ids * sizeof(u32)); + memblock_phys_free(__pa(cpu_to_phys_id), nr_cpu_ids * sizeof(u32)); cpu_to_phys_id = NULL; } #endif --- a/arch/powerpc/platforms/powernv/pci-ioda.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/powerpc/platforms/powernv/pci-ioda.c @@ -2981,7 +2981,7 @@ static void __init pnv_pci_init_ioda_phb if (!phb->hose) { pr_err(" Can't allocate PCI controller for %pOF\n", np); - memblock_free(__pa(phb), sizeof(struct pnv_phb)); + memblock_phys_free(__pa(phb), sizeof(struct pnv_phb)); return; } --- a/arch/powerpc/platforms/pseries/svm.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/powerpc/platforms/pseries/svm.c @@ -56,7 +56,8 @@ void __init svm_swiotlb_init(void) return; - memblock_free(__pa(vstart), PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT)); + memblock_phys_free(__pa(vstart), + PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT)); panic("SVM: Cannot allocate SWIOTLB buffer"); } --- a/arch/riscv/kernel/setup.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/riscv/kernel/setup.c @@ -230,13 +230,14 @@ static 
void __init init_resources(void) /* Clean-up any unused pre-allocated resources */ if (res_idx >= 0) - memblock_free(__pa(mem_res), (res_idx + 1) * sizeof(*mem_res)); + memblock_phys_free(__pa(mem_res), + (res_idx + 1) * sizeof(*mem_res)); return; error: /* Better an empty resource tree than an inconsistent one */ release_child_resources(&iomem_resource); - memblock_free(__pa(mem_res), mem_res_sz); + memblock_phys_free(__pa(mem_res), mem_res_sz); } --- a/arch/s390/kernel/setup.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/s390/kernel/setup.c @@ -693,7 +693,7 @@ static void __init reserve_crashkernel(v } if (register_memory_notifier(&kdump_mem_nb)) { - memblock_free(crash_base, crash_size); + memblock_phys_free(crash_base, crash_size); return; } @@ -748,7 +748,7 @@ static void __init free_mem_detect_info( get_mem_detect_reserved(&start, &size); if (size) - memblock_free(start, size); + memblock_phys_free(start, size); } static const char * __init get_mem_info_source(void) @@ -793,7 +793,7 @@ static void __init check_initrd(void) if (initrd_data.start && initrd_data.size && !memblock_is_region_memory(initrd_data.start, initrd_data.size)) { pr_err("The initial RAM disk does not fit into the memory\n"); - memblock_free(initrd_data.start, initrd_data.size); + memblock_phys_free(initrd_data.start, initrd_data.size); initrd_start = initrd_end = 0; } #endif @@ -890,7 +890,7 @@ static void __init setup_randomness(void if (stsi(vmms, 3, 2, 2) == 0 && vmms->count) add_device_randomness(&vmms->vm, sizeof(vmms->vm[0]) * vmms->count); - memblock_free((unsigned long) vmms, PAGE_SIZE); + memblock_phys_free((unsigned long)vmms, PAGE_SIZE); } /* --- a/arch/s390/kernel/smp.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/s390/kernel/smp.c @@ -723,7 +723,7 @@ void __init smp_save_dump_cpus(void) /* Get the CPU registers */ smp_save_cpu_regs(sa, addr, is_boot_cpu, page); } - memblock_free(page, PAGE_SIZE); + memblock_phys_free(page, PAGE_SIZE); 
diag_amode31_ops.diag308_reset(); pcpu_set_smt(0); } @@ -880,7 +880,7 @@ void __init smp_detect_cpus(void) /* Add CPUs present at boot */ __smp_rescan_cpus(info, true); - memblock_free((unsigned long)info, sizeof(*info)); + memblock_phys_free((unsigned long)info, sizeof(*info)); } /* --- a/arch/s390/kernel/uv.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/s390/kernel/uv.c @@ -64,7 +64,7 @@ void __init setup_uv(void) } if (uv_init(uv_stor_base, uv_info.uv_base_stor_len)) { - memblock_free(uv_stor_base, uv_info.uv_base_stor_len); + memblock_phys_free(uv_stor_base, uv_info.uv_base_stor_len); goto fail; } --- a/arch/s390/mm/kasan_init.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/s390/mm/kasan_init.c @@ -399,5 +399,5 @@ void __init kasan_copy_shadow_mapping(vo void __init kasan_free_early_identity(void) { - memblock_free(pgalloc_pos, pgalloc_freeable - pgalloc_pos); + memblock_phys_free(pgalloc_pos, pgalloc_freeable - pgalloc_pos); } --- a/arch/sh/boards/mach-ap325rxa/setup.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/sh/boards/mach-ap325rxa/setup.c @@ -560,7 +560,7 @@ static void __init ap325rxa_mv_mem_reser if (!phys) panic("Failed to allocate CEU memory\n"); - memblock_free(phys, size); + memblock_phys_free(phys, size); memblock_remove(phys, size); ceu_dma_membase = phys; --- a/arch/sh/boards/mach-ecovec24/setup.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/sh/boards/mach-ecovec24/setup.c @@ -1502,7 +1502,7 @@ static void __init ecovec_mv_mem_reserve if (!phys) panic("Failed to allocate CEU0 memory\n"); - memblock_free(phys, size); + memblock_phys_free(phys, size); memblock_remove(phys, size); ceu0_dma_membase = phys; @@ -1510,7 +1510,7 @@ static void __init ecovec_mv_mem_reserve if (!phys) panic("Failed to allocate CEU1 memory\n"); - memblock_free(phys, size); + memblock_phys_free(phys, size); memblock_remove(phys, size); ceu1_dma_membase = phys; } --- 
a/arch/sh/boards/mach-kfr2r09/setup.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/sh/boards/mach-kfr2r09/setup.c @@ -633,7 +633,7 @@ static void __init kfr2r09_mv_mem_reserv if (!phys) panic("Failed to allocate CEU memory\n"); - memblock_free(phys, size); + memblock_phys_free(phys, size); memblock_remove(phys, size); ceu_dma_membase = phys; --- a/arch/sh/boards/mach-migor/setup.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/sh/boards/mach-migor/setup.c @@ -633,7 +633,7 @@ static void __init migor_mv_mem_reserve( if (!phys) panic("Failed to allocate CEU memory\n"); - memblock_free(phys, size); + memblock_phys_free(phys, size); memblock_remove(phys, size); ceu_dma_membase = phys; --- a/arch/sh/boards/mach-se/7724/setup.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/sh/boards/mach-se/7724/setup.c @@ -966,7 +966,7 @@ static void __init ms7724se_mv_mem_reser if (!phys) panic("Failed to allocate CEU0 memory\n"); - memblock_free(phys, size); + memblock_phys_free(phys, size); memblock_remove(phys, size); ceu0_dma_membase = phys; @@ -974,7 +974,7 @@ static void __init ms7724se_mv_mem_reser if (!phys) panic("Failed to allocate CEU1 memory\n"); - memblock_free(phys, size); + memblock_phys_free(phys, size); memblock_remove(phys, size); ceu1_dma_membase = phys; } --- a/arch/sparc/kernel/smp_64.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/sparc/kernel/smp_64.c @@ -1567,7 +1567,7 @@ static void * __init pcpu_alloc_bootmem( static void __init pcpu_free_bootmem(void *ptr, size_t size) { - memblock_free(__pa(ptr), size); + memblock_phys_free(__pa(ptr), size); } static int __init pcpu_cpu_distance(unsigned int from, unsigned int to) --- a/arch/um/kernel/mem.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/um/kernel/mem.c @@ -47,7 +47,7 @@ void __init mem_init(void) */ brk_end = (unsigned long) UML_ROUND_UP(sbrk(0)); map_memory(brk_end, __pa(brk_end), uml_reserved - brk_end, 1, 1, 0); - 
memblock_free(__pa(brk_end), uml_reserved - brk_end); + memblock_phys_free(__pa(brk_end), uml_reserved - brk_end); uml_reserved = brk_end; /* this will put all low memory onto the freelists */ --- a/arch/x86/kernel/setup.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/x86/kernel/setup.c @@ -322,7 +322,7 @@ static void __init reserve_initrd(void) relocate_initrd(); - memblock_free(ramdisk_image, ramdisk_end - ramdisk_image); + memblock_phys_free(ramdisk_image, ramdisk_end - ramdisk_image); } #else @@ -521,7 +521,7 @@ static void __init reserve_crashkernel(v } if (crash_base >= (1ULL << 32) && reserve_crashkernel_low()) { - memblock_free(crash_base, crash_size); + memblock_phys_free(crash_base, crash_size); return; } --- a/arch/x86/mm/init.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/x86/mm/init.c @@ -618,7 +618,7 @@ static void __init memory_map_top_down(u */ addr = memblock_phys_alloc_range(PMD_SIZE, PMD_SIZE, map_start, map_end); - memblock_free(addr, PMD_SIZE); + memblock_phys_free(addr, PMD_SIZE); real_end = addr + PMD_SIZE; /* step_size need to be small so pgt_buf from BRK could cover it */ --- a/arch/x86/xen/mmu_pv.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/x86/xen/mmu_pv.c @@ -1025,7 +1025,7 @@ static void __init xen_free_ro_pages(uns for (; vaddr < vaddr_end; vaddr += PAGE_SIZE) make_lowmem_page_readwrite(vaddr); - memblock_free(paddr, size); + memblock_phys_free(paddr, size); } static void __init xen_cleanmfnmap_free_pgtbl(void *pgtbl, bool unpin) @@ -1151,7 +1151,7 @@ static void __init xen_pagetable_p2m_fre xen_cleanhighmap(addr, addr + size); size = PAGE_ALIGN(xen_start_info->nr_pages * sizeof(unsigned long)); - memblock_free(__pa(addr), size); + memblock_phys_free(__pa(addr), size); } else { xen_cleanmfnmap(addr); } @@ -1955,7 +1955,7 @@ void __init xen_relocate_p2m(void) pfn_end = p2m_pfn_end; } - memblock_free(PFN_PHYS(pfn), PAGE_SIZE * (pfn_end - pfn)); + memblock_phys_free(PFN_PHYS(pfn), 
PAGE_SIZE * (pfn_end - pfn)); while (pfn < pfn_end) { if (pfn == p2m_pfn) { pfn = p2m_pfn_end; --- a/arch/x86/xen/setup.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/arch/x86/xen/setup.c @@ -153,7 +153,7 @@ static void __init xen_del_extra_mem(uns break; } } - memblock_free(PFN_PHYS(start_pfn), PFN_PHYS(n_pfns)); + memblock_phys_free(PFN_PHYS(start_pfn), PFN_PHYS(n_pfns)); } /* @@ -719,7 +719,7 @@ static void __init xen_reserve_xen_mfnli return; xen_relocate_p2m(); - memblock_free(start, size); + memblock_phys_free(start, size); } /** @@ -885,7 +885,7 @@ char * __init xen_memory_setup(void) xen_phys_memcpy(new_area, start, size); pr_info("initrd moved from [mem %#010llx-%#010llx] to [mem %#010llx-%#010llx]\n", start, start + size, new_area, new_area + size); - memblock_free(start, size); + memblock_phys_free(start, size); boot_params.hdr.ramdisk_image = new_area; boot_params.ext_ramdisk_image = new_area >> 32; } --- a/drivers/base/arch_numa.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/drivers/base/arch_numa.c @@ -166,7 +166,7 @@ static void * __init pcpu_fc_alloc(unsig static void __init pcpu_fc_free(void *ptr, size_t size) { - memblock_free(__pa(ptr), size); + memblock_phys_free(__pa(ptr), size); } #ifdef CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK --- a/drivers/firmware/efi/memmap.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/drivers/firmware/efi/memmap.c @@ -35,7 +35,7 @@ void __init __efi_memmap_free(u64 phys, if (slab_is_available()) memblock_free_late(phys, size); else - memblock_free(phys, size); + memblock_phys_free(phys, size); } else if (flags & EFI_MEMMAP_SLAB) { struct page *p = pfn_to_page(PHYS_PFN(phys)); unsigned int order = get_order(size); --- a/drivers/of/kexec.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/drivers/of/kexec.c @@ -171,8 +171,7 @@ int ima_free_kexec_buffer(void) if (ret) return ret; - return memblock_free(addr, size); - + return memblock_phys_free(addr, size); } /** --- 
a/drivers/of/of_reserved_mem.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/drivers/of/of_reserved_mem.c @@ -46,7 +46,7 @@ static int __init early_init_dt_alloc_re if (nomap) { err = memblock_mark_nomap(base, size); if (err) - memblock_free(base, size); + memblock_phys_free(base, size); kmemleak_ignore_phys(base); } @@ -284,7 +284,8 @@ void __init fdt_init_reserved_mem(void) if (nomap) memblock_clear_nomap(rmem->base, rmem->size); else - memblock_free(rmem->base, rmem->size); + memblock_phys_free(rmem->base, + rmem->size); } } } --- a/drivers/s390/char/sclp_early.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/drivers/s390/char/sclp_early.c @@ -139,7 +139,7 @@ int __init sclp_early_get_core_info(stru } sclp_fill_core_info(info, sccb); out: - memblock_free((unsigned long)sccb, length); + memblock_phys_free((unsigned long)sccb, length); return rc; } --- a/drivers/usb/early/xhci-dbc.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/drivers/usb/early/xhci-dbc.c @@ -185,7 +185,7 @@ static void __init xdbc_free_ring(struct if (!seg) return; - memblock_free(seg->dma, PAGE_SIZE); + memblock_phys_free(seg->dma, PAGE_SIZE); ring->segment = NULL; } @@ -665,10 +665,10 @@ int __init early_xdbc_setup_hardware(voi xdbc_free_ring(&xdbc.in_ring); if (xdbc.table_dma) - memblock_free(xdbc.table_dma, PAGE_SIZE); + memblock_phys_free(xdbc.table_dma, PAGE_SIZE); if (xdbc.out_dma) - memblock_free(xdbc.out_dma, PAGE_SIZE); + memblock_phys_free(xdbc.out_dma, PAGE_SIZE); xdbc.table_base = NULL; xdbc.out_buf = NULL; @@ -987,8 +987,8 @@ free_and_quit: xdbc_free_ring(&xdbc.evt_ring); xdbc_free_ring(&xdbc.out_ring); xdbc_free_ring(&xdbc.in_ring); - memblock_free(xdbc.table_dma, PAGE_SIZE); - memblock_free(xdbc.out_dma, PAGE_SIZE); + memblock_phys_free(xdbc.table_dma, PAGE_SIZE); + memblock_phys_free(xdbc.out_dma, PAGE_SIZE); writel(0, &xdbc.xdbc_reg->control); early_iounmap(xdbc.xhci_base, xdbc.xhci_length); --- 
a/drivers/xen/swiotlb-xen.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/drivers/xen/swiotlb-xen.c @@ -241,7 +241,7 @@ retry: */ rc = xen_swiotlb_fixup(start, nslabs); if (rc) { - memblock_free(__pa(start), PAGE_ALIGN(bytes)); + memblock_phys_free(__pa(start), PAGE_ALIGN(bytes)); if (nslabs > 1024 && repeat--) { /* Min is 2MB */ nslabs = max(1024UL, ALIGN(nslabs >> 1, IO_TLB_SEGSIZE)); --- a/include/linux/memblock.h~memblock-rename-memblock_free-to-memblock_phys_free +++ a/include/linux/memblock.h @@ -103,7 +103,7 @@ void memblock_allow_resize(void); int memblock_add_node(phys_addr_t base, phys_addr_t size, int nid); int memblock_add(phys_addr_t base, phys_addr_t size); int memblock_remove(phys_addr_t base, phys_addr_t size); -int memblock_free(phys_addr_t base, phys_addr_t size); +int memblock_phys_free(phys_addr_t base, phys_addr_t size); int memblock_reserve(phys_addr_t base, phys_addr_t size); #ifdef CONFIG_HAVE_MEMBLOCK_PHYS_MAP int memblock_physmem_add(phys_addr_t base, phys_addr_t size); --- a/init/initramfs.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/init/initramfs.c @@ -607,7 +607,7 @@ void __weak __init free_initrd_mem(unsig unsigned long aligned_start = ALIGN_DOWN(start, PAGE_SIZE); unsigned long aligned_end = ALIGN(end, PAGE_SIZE); - memblock_free(__pa(aligned_start), aligned_end - aligned_start); + memblock_phys_free(__pa(aligned_start), aligned_end - aligned_start); #endif free_reserved_area((void *)start, (void *)end, POISON_FREE_INITMEM, --- a/kernel/dma/swiotlb.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/kernel/dma/swiotlb.c @@ -247,7 +247,7 @@ swiotlb_init(int verbose) return; fail_free_mem: - memblock_free(__pa(tlb), bytes); + memblock_phys_free(__pa(tlb), bytes); fail: pr_warn("Cannot allocate buffer"); } --- a/lib/cpumask.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/lib/cpumask.c @@ -188,7 +188,7 @@ EXPORT_SYMBOL(free_cpumask_var); */ void __init free_bootmem_cpumask_var(cpumask_var_t 
mask) { - memblock_free(__pa(mask), cpumask_size()); + memblock_phys_free(__pa(mask), cpumask_size()); } #endif --- a/mm/cma.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/mm/cma.c @@ -378,7 +378,7 @@ int __init cma_declare_contiguous_nid(ph return 0; free_mem: - memblock_free(base, size); + memblock_phys_free(base, size); err: pr_err("Failed to reserve %ld MiB\n", (unsigned long)size / SZ_1M); return ret; --- a/mm/memblock.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/mm/memblock.c @@ -806,18 +806,18 @@ int __init_memblock memblock_remove(phys void __init_memblock memblock_free_ptr(void *ptr, size_t size) { if (ptr) - memblock_free(__pa(ptr), size); + memblock_phys_free(__pa(ptr), size); } /** - * memblock_free - free boot memory block + * memblock_phys_free - free boot memory block * @base: phys starting address of the boot memory block * @size: size of the boot memory block in bytes * * Free boot memory block previously allocated by memblock_alloc_xx() API. * The freeing memory will not be released to the buddy allocator. */ -int __init_memblock memblock_free(phys_addr_t base, phys_addr_t size) +int __init_memblock memblock_phys_free(phys_addr_t base, phys_addr_t size) { phys_addr_t end = base + size - 1; @@ -1937,7 +1937,7 @@ static void __init free_memmap(unsigned * memmap array. 
*/ if (pg < pgend) - memblock_free(pg, pgend - pg); + memblock_phys_free(pg, pgend - pg); } /* --- a/mm/memory_hotplug.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/mm/memory_hotplug.c @@ -2204,7 +2204,7 @@ static int __ref try_remove_memory(u64 s arch_remove_memory(start, size, altmap); if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) { - memblock_free(start, size); + memblock_phys_free(start, size); memblock_remove(start, size); } --- a/mm/percpu.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/mm/percpu.c @@ -2472,7 +2472,7 @@ struct pcpu_alloc_info * __init pcpu_all */ void __init pcpu_free_alloc_info(struct pcpu_alloc_info *ai) { - memblock_free(__pa(ai), ai->__ai_size); + memblock_phys_free(__pa(ai), ai->__ai_size); } /** @@ -3134,7 +3134,7 @@ out_free_areas: out_free: pcpu_free_alloc_info(ai); if (areas) - memblock_free(__pa(areas), areas_size); + memblock_phys_free(__pa(areas), areas_size); return rc; } #endif /* BUILD_EMBED_FIRST_CHUNK */ @@ -3256,7 +3256,7 @@ enomem: free_fn(page_address(pages[j]), PAGE_SIZE); rc = -ENOMEM; out_free_ar: - memblock_free(__pa(pages), pages_size); + memblock_phys_free(__pa(pages), pages_size); pcpu_free_alloc_info(ai); return rc; } @@ -3286,7 +3286,7 @@ static void * __init pcpu_dfl_fc_alloc(u static void __init pcpu_dfl_fc_free(void *ptr, size_t size) { - memblock_free(__pa(ptr), size); + memblock_phys_free(__pa(ptr), size); } void __init setup_per_cpu_areas(void) --- a/mm/sparse.c~memblock-rename-memblock_free-to-memblock_phys_free +++ a/mm/sparse.c @@ -451,7 +451,7 @@ static void *sparsemap_buf_end __meminit static inline void __meminit sparse_buffer_free(unsigned long size) { WARN_ON(!sparsemap_buf || size == 0); - memblock_free(__pa(sparsemap_buf), size); + memblock_phys_free(__pa(sparsemap_buf), size); } static void __init sparse_buffer_init(unsigned long size, int nid) From patchwork Fri Nov 5 20:43:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605733 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D090EC433F5 for ; Fri, 5 Nov 2021 20:43:26 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6D3D26135F for ; Fri, 5 Nov 2021 20:43:26 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 6D3D26135F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 1238E94009E; Fri, 5 Nov 2021 16:43:26 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 0D921940093; Fri, 5 Nov 2021 16:43:26 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id F04A294009E; Fri, 5 Nov 2021 16:43:25 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0218.hostedemail.com [216.40.44.218]) by kanga.kvack.org (Postfix) with ESMTP id E0272940093 for ; Fri, 5 Nov 2021 16:43:25 -0400 (EDT) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id AB524779AF for ; Fri, 5 Nov 2021 20:43:25 +0000 (UTC) X-FDA: 78776052096.05.DFA5288 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf16.hostedemail.com (Postfix) with ESMTP id 2A797F000091 for ; Fri, 5 Nov 2021 20:43:16 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 4E57061357; Fri, 5 Nov 2021 20:43:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145003; bh=Ik5DIXWDhYSn16+wbSqIiEbsj37oyXYJUYcZWjW86qQ=; 
h=Date:From:To:Subject:In-Reply-To:From; b=ax9PJzqlC5KqeYbiJGnCAOvAoCGyEuxPA+DhYUst1OmQUBLib3oDw3oIr10OwzHBy pmVpTdkWIFqPRes9rwkzsuLxhZIREfIqFAFjYd/UVkJTMc6VFhYRrZDDPriq+fBfGN 1fdERJH9d2Yw46VXGMYJHjYRxinootDpWlaLPsZI= Date: Fri, 05 Nov 2021 13:43:22 -0700 From: Andrew Morton To: akpm@linux-foundation.org, christophe.leroy@csgroup.eu, jgross@suse.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, rppt@linux.ibm.com, sfr@canb.auug.org.au, Shahab.Vahedi@synopsys.com, torvalds@linux-foundation.org Subject: [patch 167/262] memblock: use memblock_free for freeing virtual pointers Message-ID: <20211105204322.Xml2PUq-V%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 2A797F000091 X-Stat-Signature: 6qmjmiqak7t57otcisb7jmmdbepi4x3c Authentication-Results: imf16.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=ax9PJzql; spf=pass (imf16.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144996-547073 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Mike Rapoport Subject: memblock: use memblock_free for freeing virtual pointers Rename memblock_free_ptr() to memblock_free() and use memblock_free() when freeing a virtual pointer so that memblock_free() will be a counterpart of memblock_alloc() The callers are updated with the below semantic patch and manual addition of (void *) casting to pointers that are represented by unsigned long variables. 
@@ identifier vaddr; expression size; @@ ( - memblock_phys_free(__pa(vaddr), size); + memblock_free(vaddr, size); | - memblock_free_ptr(vaddr, size); + memblock_free(vaddr, size); ) [sfr@canb.auug.org.au: fixup] Link: https://lkml.kernel.org/r/20211018192940.3d1d532f@canb.auug.org.au Link: https://lkml.kernel.org/r/20210930185031.18648-7-rppt@kernel.org Signed-off-by: Mike Rapoport Signed-off-by: Stephen Rothwell Cc: Christophe Leroy Cc: Juergen Gross Cc: Shahab Vahedi Signed-off-by: Andrew Morton --- arch/alpha/kernel/core_irongate.c | 3 +-- arch/mips/mm/init.c | 2 +- arch/powerpc/kernel/dt_cpu_ftrs.c | 4 ++-- arch/powerpc/kernel/setup-common.c | 2 +- arch/powerpc/kernel/setup_64.c | 2 +- arch/powerpc/platforms/powernv/pci-ioda.c | 2 +- arch/powerpc/platforms/pseries/svm.c | 3 +-- arch/riscv/kernel/setup.c | 5 ++--- arch/sparc/kernel/smp_64.c | 2 +- arch/um/kernel/mem.c | 2 +- arch/x86/kernel/setup_percpu.c | 2 +- arch/x86/mm/kasan_init_64.c | 4 ++-- arch/x86/mm/numa.c | 2 +- arch/x86/mm/numa_emulation.c | 2 +- arch/x86/xen/mmu_pv.c | 2 +- arch/x86/xen/p2m.c | 2 +- drivers/base/arch_numa.c | 4 ++-- drivers/macintosh/smu.c | 2 +- drivers/xen/swiotlb-xen.c | 2 +- include/linux/memblock.h | 2 +- init/initramfs.c | 2 +- init/main.c | 4 ++-- kernel/dma/swiotlb.c | 2 +- kernel/printk/printk.c | 4 ++-- lib/bootconfig.c | 2 +- lib/cpumask.c | 2 +- mm/memblock.c | 6 +++--- mm/percpu.c | 8 ++++---- mm/sparse.c | 2 +- 29 files changed, 40 insertions(+), 43 deletions(-) --- a/arch/alpha/kernel/core_irongate.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/arch/alpha/kernel/core_irongate.c @@ -233,8 +233,7 @@ albacore_init_arch(void) unsigned long size; size = initrd_end - initrd_start; - memblock_phys_free(__pa(initrd_start), - PAGE_ALIGN(size)); + memblock_free((void *)initrd_start, PAGE_ALIGN(size)); if (!move_initrd(pci_mem)) printk("irongate_init_arch: initrd too big " "(%ldK)\ndisabling initrd\n", --- 
a/arch/mips/mm/init.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/arch/mips/mm/init.c @@ -529,7 +529,7 @@ static void * __init pcpu_fc_alloc(unsig static void __init pcpu_fc_free(void *ptr, size_t size) { - memblock_phys_free(__pa(ptr), size); + memblock_free(ptr, size); } void __init setup_per_cpu_areas(void) --- a/arch/powerpc/kernel/dt_cpu_ftrs.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/arch/powerpc/kernel/dt_cpu_ftrs.c @@ -1095,8 +1095,8 @@ static int __init dt_cpu_ftrs_scan_callb cpufeatures_setup_finished(); - memblock_phys_free(__pa(dt_cpu_features), - sizeof(struct dt_cpu_feature) * nr_dt_cpu_features); + memblock_free(dt_cpu_features, + sizeof(struct dt_cpu_feature) * nr_dt_cpu_features); return 0; } --- a/arch/powerpc/kernel/setup_64.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/arch/powerpc/kernel/setup_64.c @@ -812,7 +812,7 @@ static void * __init pcpu_alloc_bootmem( static void __init pcpu_free_bootmem(void *ptr, size_t size) { - memblock_phys_free(__pa(ptr), size); + memblock_free(ptr, size); } static int pcpu_cpu_distance(unsigned int from, unsigned int to) --- a/arch/powerpc/kernel/setup-common.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/arch/powerpc/kernel/setup-common.c @@ -825,7 +825,7 @@ static void __init smp_setup_pacas(void) set_hard_smp_processor_id(cpu, cpu_to_phys_id[cpu]); } - memblock_phys_free(__pa(cpu_to_phys_id), nr_cpu_ids * sizeof(u32)); + memblock_free(cpu_to_phys_id, nr_cpu_ids * sizeof(u32)); cpu_to_phys_id = NULL; } #endif --- a/arch/powerpc/platforms/powernv/pci-ioda.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/arch/powerpc/platforms/powernv/pci-ioda.c @@ -2981,7 +2981,7 @@ static void __init pnv_pci_init_ioda_phb if (!phb->hose) { pr_err(" Can't allocate PCI controller for %pOF\n", np); - memblock_phys_free(__pa(phb), sizeof(struct pnv_phb)); + memblock_free(phb, sizeof(struct pnv_phb)); return; } --- 
a/arch/powerpc/platforms/pseries/svm.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/arch/powerpc/platforms/pseries/svm.c @@ -56,8 +56,7 @@ void __init svm_swiotlb_init(void) return; - memblock_phys_free(__pa(vstart), - PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT)); + memblock_free(vstart, PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT)); panic("SVM: Cannot allocate SWIOTLB buffer"); } --- a/arch/riscv/kernel/setup.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/arch/riscv/kernel/setup.c @@ -230,14 +230,13 @@ static void __init init_resources(void) /* Clean-up any unused pre-allocated resources */ if (res_idx >= 0) - memblock_phys_free(__pa(mem_res), - (res_idx + 1) * sizeof(*mem_res)); + memblock_free(mem_res, (res_idx + 1) * sizeof(*mem_res)); return; error: /* Better an empty resource tree than an inconsistent one */ release_child_resources(&iomem_resource); - memblock_phys_free(__pa(mem_res), mem_res_sz); + memblock_free(mem_res, mem_res_sz); } --- a/arch/sparc/kernel/smp_64.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/arch/sparc/kernel/smp_64.c @@ -1567,7 +1567,7 @@ static void * __init pcpu_alloc_bootmem( static void __init pcpu_free_bootmem(void *ptr, size_t size) { - memblock_phys_free(__pa(ptr), size); + memblock_free(ptr, size); } static int __init pcpu_cpu_distance(unsigned int from, unsigned int to) --- a/arch/um/kernel/mem.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/arch/um/kernel/mem.c @@ -47,7 +47,7 @@ void __init mem_init(void) */ brk_end = (unsigned long) UML_ROUND_UP(sbrk(0)); map_memory(brk_end, __pa(brk_end), uml_reserved - brk_end, 1, 1, 0); - memblock_phys_free(__pa(brk_end), uml_reserved - brk_end); + memblock_free((void *)brk_end, uml_reserved - brk_end); uml_reserved = brk_end; /* this will put all low memory onto the freelists */ --- a/arch/x86/kernel/setup_percpu.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/arch/x86/kernel/setup_percpu.c @@ -135,7 +135,7 @@ 
static void * __init pcpu_fc_alloc(unsig static void __init pcpu_fc_free(void *ptr, size_t size) { - memblock_free_ptr(ptr, size); + memblock_free(ptr, size); } static int __init pcpu_cpu_distance(unsigned int from, unsigned int to) --- a/arch/x86/mm/kasan_init_64.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/arch/x86/mm/kasan_init_64.c @@ -49,7 +49,7 @@ static void __init kasan_populate_pmd(pm p = early_alloc(PMD_SIZE, nid, false); if (p && pmd_set_huge(pmd, __pa(p), PAGE_KERNEL)) return; - memblock_free_ptr(p, PMD_SIZE); + memblock_free(p, PMD_SIZE); } p = early_alloc(PAGE_SIZE, nid, true); @@ -85,7 +85,7 @@ static void __init kasan_populate_pud(pu p = early_alloc(PUD_SIZE, nid, false); if (p && pud_set_huge(pud, __pa(p), PAGE_KERNEL)) return; - memblock_free_ptr(p, PUD_SIZE); + memblock_free(p, PUD_SIZE); } p = early_alloc(PAGE_SIZE, nid, true); --- a/arch/x86/mm/numa.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/arch/x86/mm/numa.c @@ -355,7 +355,7 @@ void __init numa_reset_distance(void) /* numa_distance could be 1LU marking allocation failure, test cnt */ if (numa_distance_cnt) - memblock_free_ptr(numa_distance, size); + memblock_free(numa_distance, size); numa_distance_cnt = 0; numa_distance = NULL; /* enable table creation */ } --- a/arch/x86/mm/numa_emulation.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/arch/x86/mm/numa_emulation.c @@ -517,7 +517,7 @@ void __init numa_emulation(struct numa_m } /* free the copied physical distance table */ - memblock_free_ptr(phys_dist, phys_size); + memblock_free(phys_dist, phys_size); return; no_emu: --- a/arch/x86/xen/mmu_pv.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/arch/x86/xen/mmu_pv.c @@ -1151,7 +1151,7 @@ static void __init xen_pagetable_p2m_fre xen_cleanhighmap(addr, addr + size); size = PAGE_ALIGN(xen_start_info->nr_pages * sizeof(unsigned long)); - memblock_phys_free(__pa(addr), size); + memblock_free((void *)addr, size); } else { 
xen_cleanmfnmap(addr); } --- a/arch/x86/xen/p2m.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/arch/x86/xen/p2m.c @@ -197,7 +197,7 @@ static void * __ref alloc_p2m_page(void) static void __ref free_p2m_page(void *p) { if (unlikely(!slab_is_available())) { - memblock_free_ptr(p, PAGE_SIZE); + memblock_free(p, PAGE_SIZE); return; } --- a/drivers/base/arch_numa.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/drivers/base/arch_numa.c @@ -166,7 +166,7 @@ static void * __init pcpu_fc_alloc(unsig static void __init pcpu_fc_free(void *ptr, size_t size) { - memblock_phys_free(__pa(ptr), size); + memblock_free(ptr, size); } #ifdef CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK @@ -326,7 +326,7 @@ void __init numa_free_distance(void) size = numa_distance_cnt * numa_distance_cnt * sizeof(numa_distance[0]); - memblock_free_ptr(numa_distance, size); + memblock_free(numa_distance, size); numa_distance_cnt = 0; numa_distance = NULL; } --- a/drivers/macintosh/smu.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/drivers/macintosh/smu.c @@ -570,7 +570,7 @@ fail_msg_node: fail_db_node: of_node_put(smu->db_node); fail_bootmem: - memblock_free_ptr(smu, sizeof(struct smu_device)); + memblock_free(smu, sizeof(struct smu_device)); smu = NULL; fail_np: of_node_put(np); --- a/drivers/xen/swiotlb-xen.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/drivers/xen/swiotlb-xen.c @@ -241,7 +241,7 @@ retry: */ rc = xen_swiotlb_fixup(start, nslabs); if (rc) { - memblock_phys_free(__pa(start), PAGE_ALIGN(bytes)); + memblock_free(start, PAGE_ALIGN(bytes)); if (nslabs > 1024 && repeat--) { /* Min is 2MB */ nslabs = max(1024UL, ALIGN(nslabs >> 1, IO_TLB_SEGSIZE)); --- a/include/linux/memblock.h~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/include/linux/memblock.h @@ -118,7 +118,7 @@ int memblock_mark_nomap(phys_addr_t base int memblock_clear_nomap(phys_addr_t base, phys_addr_t size); void memblock_free_all(void); -void 
memblock_free_ptr(void *ptr, size_t size); +void memblock_free(void *ptr, size_t size); void reset_node_managed_pages(pg_data_t *pgdat); void reset_all_zones_managed_pages(void); --- a/init/initramfs.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/init/initramfs.c @@ -607,7 +607,7 @@ void __weak __init free_initrd_mem(unsig unsigned long aligned_start = ALIGN_DOWN(start, PAGE_SIZE); unsigned long aligned_end = ALIGN(end, PAGE_SIZE); - memblock_phys_free(__pa(aligned_start), aligned_end - aligned_start); + memblock_free((void *)aligned_start, aligned_end - aligned_start); #endif free_reserved_area((void *)start, (void *)end, POISON_FREE_INITMEM, --- a/init/main.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/init/main.c @@ -382,7 +382,7 @@ static char * __init xbc_make_cmdline(co ret = xbc_snprint_cmdline(new_cmdline, len + 1, root); if (ret < 0 || ret > len) { pr_err("Failed to print extra kernel cmdline.\n"); - memblock_free_ptr(new_cmdline, len + 1); + memblock_free(new_cmdline, len + 1); return NULL; } @@ -925,7 +925,7 @@ static void __init print_unknown_bootopt end += sprintf(end, " %s", *p); pr_notice("Unknown command line parameters:%s\n", unknown_options); - memblock_free_ptr(unknown_options, len); + memblock_free(unknown_options, len); } asmlinkage __visible void __init __no_sanitize_address start_kernel(void) --- a/kernel/dma/swiotlb.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/kernel/dma/swiotlb.c @@ -247,7 +247,7 @@ swiotlb_init(int verbose) return; fail_free_mem: - memblock_phys_free(__pa(tlb), bytes); + memblock_free(tlb, bytes); fail: pr_warn("Cannot allocate buffer"); } --- a/kernel/printk/printk.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/kernel/printk/printk.c @@ -1166,9 +1166,9 @@ void __init setup_log_buf(int early) return; err_free_descs: - memblock_free_ptr(new_descs, new_descs_size); + memblock_free(new_descs, new_descs_size); err_free_log_buf: - 
memblock_free_ptr(new_log_buf, new_log_buf_len); + memblock_free(new_log_buf, new_log_buf_len); } static bool __read_mostly ignore_loglevel; --- a/lib/bootconfig.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/lib/bootconfig.c @@ -792,7 +792,7 @@ void __init xbc_destroy_all(void) xbc_data = NULL; xbc_data_size = 0; xbc_node_num = 0; - memblock_free_ptr(xbc_nodes, sizeof(struct xbc_node) * XBC_NODE_MAX); + memblock_free(xbc_nodes, sizeof(struct xbc_node) * XBC_NODE_MAX); xbc_nodes = NULL; brace_index = 0; } --- a/lib/cpumask.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/lib/cpumask.c @@ -188,7 +188,7 @@ EXPORT_SYMBOL(free_cpumask_var); */ void __init free_bootmem_cpumask_var(cpumask_var_t mask) { - memblock_phys_free(__pa(mask), cpumask_size()); + memblock_free(mask, cpumask_size()); } #endif --- a/mm/memblock.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/mm/memblock.c @@ -472,7 +472,7 @@ static int __init_memblock memblock_doub kfree(old_array); else if (old_array != memblock_memory_init_regions && old_array != memblock_reserved_init_regions) - memblock_free_ptr(old_array, old_alloc_size); + memblock_free(old_array, old_alloc_size); /* * Reserve the new array if that comes from the memblock. Otherwise, we @@ -796,14 +796,14 @@ int __init_memblock memblock_remove(phys } /** - * memblock_free_ptr - free boot memory allocation + * memblock_free - free boot memory allocation * @ptr: starting address of the boot memory allocation * @size: size of the boot memory block in bytes * * Free boot memory block previously allocated by memblock_alloc_xx() API. * The freeing memory will not be released to the buddy allocator. 
*/ -void __init_memblock memblock_free_ptr(void *ptr, size_t size) +void __init_memblock memblock_free(void *ptr, size_t size) { if (ptr) memblock_phys_free(__pa(ptr), size); --- a/mm/percpu.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/mm/percpu.c @@ -2472,7 +2472,7 @@ struct pcpu_alloc_info * __init pcpu_all */ void __init pcpu_free_alloc_info(struct pcpu_alloc_info *ai) { - memblock_phys_free(__pa(ai), ai->__ai_size); + memblock_free(ai, ai->__ai_size); } /** @@ -3134,7 +3134,7 @@ out_free_areas: out_free: pcpu_free_alloc_info(ai); if (areas) - memblock_phys_free(__pa(areas), areas_size); + memblock_free(areas, areas_size); return rc; } #endif /* BUILD_EMBED_FIRST_CHUNK */ @@ -3256,7 +3256,7 @@ enomem: free_fn(page_address(pages[j]), PAGE_SIZE); rc = -ENOMEM; out_free_ar: - memblock_phys_free(__pa(pages), pages_size); + memblock_free(pages, pages_size); pcpu_free_alloc_info(ai); return rc; } @@ -3286,7 +3286,7 @@ static void * __init pcpu_dfl_fc_alloc(u static void __init pcpu_dfl_fc_free(void *ptr, size_t size) { - memblock_phys_free(__pa(ptr), size); + memblock_free(ptr, size); } void __init setup_per_cpu_areas(void) --- a/mm/sparse.c~memblock-use-memblock_free-for-freeing-virtual-pointers +++ a/mm/sparse.c @@ -451,7 +451,7 @@ static void *sparsemap_buf_end __meminit static inline void __meminit sparse_buffer_free(unsigned long size) { WARN_ON(!sparsemap_buf || size == 0); - memblock_phys_free(__pa(sparsemap_buf), size); + memblock_free(sparsemap_buf, size); } static void __init sparse_buffer_init(unsigned long size, int nid) From patchwork Fri Nov 5 20:43:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605735 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 
5EB20C43219 for ; Fri, 5 Nov 2021 20:43:28 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 159C161357 for ; Fri, 5 Nov 2021 20:43:28 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 159C161357 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id A4DE49400A0; Fri, 5 Nov 2021 16:43:27 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9FAEE940093; Fri, 5 Nov 2021 16:43:27 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 89FB594009F; Fri, 5 Nov 2021 16:43:27 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0019.hostedemail.com [216.40.44.19]) by kanga.kvack.org (Postfix) with ESMTP id 7A8EE940093 for ; Fri, 5 Nov 2021 16:43:27 -0400 (EDT) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 48AD31828EE04 for ; Fri, 5 Nov 2021 20:43:27 +0000 (UTC) X-FDA: 78776052054.06.4CF076E Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf16.hostedemail.com (Postfix) with ESMTP id A2991F00008E for ; Fri, 5 Nov 2021 20:43:18 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 59D626135E; Fri, 5 Nov 2021 20:43:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145006; bh=yelT7RDN9BELjHO/lKSQkIlOI+NNDDjK6q3o/D/CS6o=; h=Date:From:To:Subject:In-Reply-To:From; b=t7QVdXzr6Vl9NffqXrFw8GVXcr8LIufoa5R08gu15KMIlad3x90GpFj3vJXVtSrkR 3q6opN8XgYdasurcGXkjq5zd4GkdZDRgS5ydSRXR1uOCMCiWbuOR/QPBMiA5ZUmrOS r2ZHsf4rzyXVS75dIcHDDqE2W1gTItJzFIVrI/vI= Date: Fri, 05 Nov 2021 13:43:25 -0700 From: Andrew Morton To: akpm@linux-foundation.org, 
linux-mm@kvack.org, mgorman@suse.de, mhocko@suse.com, mm-commits@vger.kernel.org, rientjes@google.com, sultan@kerneltoast.com, torvalds@linux-foundation.org Subject: [patch 168/262] mm: mark the OOM reaper thread as freezable Message-ID: <20211105204325.8b8llOGWT%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: A2991F00008E X-Stat-Signature: d7orxnt59wet5y1dgcmouni6hamxqggj Authentication-Results: imf16.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=t7QVdXzr; spf=pass (imf16.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636144998-857327 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Sultan Alsawaf Subject: mm: mark the OOM reaper thread as freezable The OOM reaper alters user address space which might theoretically alter the snapshot if reaping is allowed to happen after the freezer quiescent state. To this end, the reaper kthread uses wait_event_freezable() while waiting for any work so that it cannot run while the system freezes. However, the current implementation doesn't respect the freezer because all kernel threads are created with the PF_NOFREEZE flag, so they are automatically excluded from freezing operations. This means that the OOM reaper can race with system snapshotting if it has work to do while the system is being frozen. Fix this by adding a set_freezable() call which will clear the PF_NOFREEZE flag and thus make the OOM reaper visible to the freezer. Please note that the OOM reaper altering the snapshot this way is mostly a theoretical concern and has not been observed in practice. 
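For illustration only, the flag logic described above can be modelled in user space. This is a minimal sketch, not kernel code: struct toy_task, the toy_* helpers and the bit value chosen for TOY_PF_NOFREEZE are all hypothetical stand-ins for task_struct, set_freezable() and the freezer's task scan.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's PF_NOFREEZE task flag. */
#define TOY_PF_NOFREEZE 0x1u

/* Hypothetical, heavily reduced stand-in for struct task_struct. */
struct toy_task {
	unsigned int flags;
	bool frozen;
};

/* Models kernel thread creation: the thread starts out exempt from freezing. */
static struct toy_task toy_kthread_create(void)
{
	struct toy_task t = { .flags = TOY_PF_NOFREEZE, .frozen = false };

	return t;
}

/* Models set_freezable(): drop the exemption so the freezer can see the task. */
static void toy_set_freezable(struct toy_task *t)
{
	t->flags &= ~TOY_PF_NOFREEZE;
}

/*
 * Models one freezer pass over a task. Exempt tasks are skipped and keep
 * running; returns true only if the task was actually frozen.
 */
static bool toy_try_freeze(struct toy_task *t)
{
	if (t->flags & TOY_PF_NOFREEZE)
		return false;
	t->frozen = true;
	return true;
}
```

In this model a thread that never calls toy_set_freezable() is skipped by toy_try_freeze() however it waits for work, which mirrors why wait_event_freezable() alone did not make the reaper freezable.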
Link: https://lkml.kernel.org/r/20210921165758.6154-1-sultan@kerneltoast.com Link: https://lkml.kernel.org/r/20210918233920.9174-1-sultan@kerneltoast.com Fixes: aac453635549 ("mm, oom: introduce oom reaper") Signed-off-by: Sultan Alsawaf Acked-by: Michal Hocko Cc: David Rientjes Cc: Mel Gorman Signed-off-by: Andrew Morton --- mm/oom_kill.c | 2 ++ 1 file changed, 2 insertions(+) --- a/mm/oom_kill.c~mm-mark-the-oom-reaper-thread-as-freezable +++ a/mm/oom_kill.c @@ -641,6 +641,8 @@ done: static int oom_reaper(void *unused) { + set_freezable(); + while (true) { struct task_struct *tsk = NULL; From patchwork Fri Nov 5 20:43:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605743 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C6387C433F5 for ; Fri, 5 Nov 2021 20:43:42 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6C0FD61361 for ; Fri, 5 Nov 2021 20:43:42 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 6C0FD61361 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id C4A369400A3; Fri, 5 Nov 2021 16:43:41 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id BD37E940093; Fri, 5 Nov 2021 16:43:41 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A9A059400A4; Fri, 5 Nov 2021 16:43:41 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0250.hostedemail.com [216.40.44.250]) by kanga.kvack.org (Postfix) with ESMTP id 
99D4D9400A3 for ; Fri, 5 Nov 2021 16:43:41 -0400 (EDT) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 60B161838A433 for ; Fri, 5 Nov 2021 20:43:41 +0000 (UTC) X-FDA: 78776052600.26.61D6CB9 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf15.hostedemail.com (Postfix) with ESMTP id E6BC8D00009B for ; Fri, 5 Nov 2021 20:43:29 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 5120A6135A; Fri, 5 Nov 2021 20:43:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145009; bh=eHuZNUSxL5XVjdMVR+BFeUHDNtcM1afJd8NqYHx/rOQ=; h=Date:From:To:Subject:In-Reply-To:From; b=wHbF6HRYDYksrcSi/5LYrlICPb9Gx/iFE+nJm7mGYuodmb0ZifHumugXISvYRER4z E2pNqUKpFNPz1NXmR4jO2GTGw0WZZC/Lu8pRMcj5dzXalGFfOo87sDgZK/0c2pzrt4 BjU4qs4Hcy51sBEgLkARtWuad63G+wVXboAecnng= Date: Fri, 05 Nov 2021 13:43:28 -0700 From: Andrew Morton To: akpm@linux-foundation.org, benh@kernel.crashing.org, corbet@lwn.net, dan.carpenter@oracle.com, linux-mm@kvack.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, mpe@ellerman.id.au, nathan@kernel.org, paulus@samba.org, rppt@kernel.org, torvalds@linux-foundation.org, willy@infradead.org, yaozhenguo1@gmail.com Subject: [patch 169/262] hugetlbfs: extend the definition of hugepages parameter to support node allocation Message-ID: <20211105204328.Cn_I_2Bho%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=wHbF6HRY; dmarc=none; spf=pass (imf15.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: E6BC8D00009B X-Stat-Signature: ereuhwuubd9hzedr6penbuwtrnegztrw X-HE-Tag: 1636145009-150024 
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Zhenguo Yao Subject: hugetlbfs: extend the definition of hugepages parameter to support node allocation We can specify the number of hugepages to allocate at boot, but at present the hugepages are balanced across all nodes. In some scenarios, we only need hugepages on one node. For example: DPDK needs hugepages which are on the same node as the NIC. If DPDK needs four hugepages of 1G size on node1 and the system has 16 NUMA nodes, we must reserve 64 hugepages on the kernel cmdline, even though only four hugepages are used. The others have to be freed after boot. If the system memory is low (for example: 64G), this is an impossible task. So extend the hugepages parameter to support specifying hugepages on a specific node. For example, add the following parameter: hugepagesz=1G hugepages=0:1,1:3 It will allocate 1 hugepage on node0 and 3 hugepages on node1. 
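As a rough illustration of the node format, here is a user-space sketch that splits a string such as "0:1,1:3" into per-node counts. It is not the kernel's command line parser: toy_parse_hugepages_nodes() and TOY_MAX_NODES are hypothetical names, and the error handling (reject the whole string on the first malformed or out-of-range entry) is an assumption made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node limit for the sketch; the kernel uses MAX_NUMNODES. */
#define TOY_MAX_NODES 16

/*
 * Parse "node:count[,node:count...]" into counts[].
 * Returns the total number of pages requested, or -1 if an entry is
 * malformed or names a node >= TOY_MAX_NODES. On failure counts[] may
 * hold the entries parsed before the error.
 */
static long toy_parse_hugepages_nodes(const char *s,
				      unsigned int counts[TOY_MAX_NODES])
{
	long total = 0;

	memset(counts, 0, TOY_MAX_NODES * sizeof(counts[0]));

	while (*s) {
		unsigned long node, count;
		char *end;

		/* Node id: digits only, followed by ':' */
		if (!isdigit((unsigned char)*s))
			return -1;
		node = strtoul(s, &end, 10);
		if (*end != ':' || node >= TOY_MAX_NODES)
			return -1;
		s = end + 1;

		/* Page count: digits only */
		if (!isdigit((unsigned char)*s))
			return -1;
		count = strtoul(s, &end, 10);
		counts[node] += (unsigned int)count;
		total += (long)count;

		/* Entries are separated by ','; anything else must be the end */
		if (*end == ',')
			s = end + 1;
		else if (*end == '\0')
			s = end;
		else
			return -1;
	}
	return total;
}
```

With this format the DPDK case above needs only hugepages=1:4, that is four 1G pages in total, instead of the 16 * 4 = 64 pages that a balanced allocation across 16 nodes requires to get four pages onto node1.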
Link: https://lkml.kernel.org/r/20211005054729.86457-1-yaozhenguo1@gmail.com Signed-off-by: Zhenguo Yao Reviewed-by: Mike Kravetz Cc: Zhenguo Yao Cc: Dan Carpenter Cc: Nathan Chancellor Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Jonathan Corbet Cc: Mike Rapoport Cc: Matthew Wilcox (Oracle) Signed-off-by: Andrew Morton --- Documentation/admin-guide/kernel-parameters.txt | 8 Documentation/admin-guide/mm/hugetlbpage.rst | 12 + arch/powerpc/mm/hugetlbpage.c | 9 include/linux/hugetlb.h | 6 mm/hugetlb.c | 153 +++++++++++--- 5 files changed, 155 insertions(+), 33 deletions(-) --- a/arch/powerpc/mm/hugetlbpage.c~hugetlbfs-extend-the-definition-of-hugepages-parameter-to-support-node-allocation +++ a/arch/powerpc/mm/hugetlbpage.c @@ -229,17 +229,22 @@ static int __init pseries_alloc_bootmem_ m->hstate = hstate; return 1; } + +bool __init hugetlb_node_alloc_supported(void) +{ + return false; +} #endif -int __init alloc_bootmem_huge_page(struct hstate *h) +int __init alloc_bootmem_huge_page(struct hstate *h, int nid) { #ifdef CONFIG_PPC_BOOK3S_64 if (firmware_has_feature(FW_FEATURE_LPAR) && !radix_enabled()) return pseries_alloc_bootmem_huge_page(h); #endif - return __alloc_bootmem_huge_page(h); + return __alloc_bootmem_huge_page(h, nid); } #ifndef CONFIG_PPC_BOOK3S_64 --- a/Documentation/admin-guide/kernel-parameters.txt~hugetlbfs-extend-the-definition-of-hugepages-parameter-to-support-node-allocation +++ a/Documentation/admin-guide/kernel-parameters.txt @@ -1601,9 +1601,11 @@ the number of pages of hugepagesz to be allocated. If this is the first HugeTLB parameter on the command line, it specifies the number of pages to allocate for - the default huge page size. See also - Documentation/admin-guide/mm/hugetlbpage.rst. - Format: + the default huge page size. If using node format, the + number of pages to allocate per-node can be specified. + See also Documentation/admin-guide/mm/hugetlbpage.rst. 
+ Format: <integer> or (node format) + <node>:<integer>[,<node>:<integer>] hugepagesz= [HW] The size of the HugeTLB pages. This is used in --- a/Documentation/admin-guide/mm/hugetlbpage.rst~hugetlbfs-extend-the-definition-of-hugepages-parameter-to-support-node-allocation +++ a/Documentation/admin-guide/mm/hugetlbpage.rst @@ -128,7 +128,9 @@ hugepages implicitly specifies the number of huge pages of default size to allocate. If the number of huge pages of default size is implicitly specified, it can not be overwritten by a hugepagesz,hugepages - parameter pair for the default size. + parameter pair for the default size. This parameter also has a + node format. The node format specifies the number of huge pages + to allocate on specific nodes. For example, on an architecture with 2M default huge page size:: @@ -138,6 +140,14 @@ hugepages indicating that the hugepages=512 parameter is ignored. If a hugepages parameter is preceded by an invalid hugepagesz parameter, it will be ignored. + + Node format example:: + + hugepagesz=2M hugepages=0:1,1:2 + + It will allocate 1 2M hugepage on node0 and 2 2M hugepages on node1. + If the node number is invalid, the parameter will be ignored. + default_hugepagesz Specify the default huge page size. This parameter can only be specified once on the command line. 
default_hugepagesz can --- a/include/linux/hugetlb.h~hugetlbfs-extend-the-definition-of-hugepages-parameter-to-support-node-allocation +++ a/include/linux/hugetlb.h @@ -615,6 +615,7 @@ struct hstate { unsigned long nr_overcommit_huge_pages; struct list_head hugepage_activelist; struct list_head hugepage_freelists[MAX_NUMNODES]; + unsigned int max_huge_pages_node[MAX_NUMNODES]; unsigned int nr_huge_pages_node[MAX_NUMNODES]; unsigned int free_huge_pages_node[MAX_NUMNODES]; unsigned int surplus_huge_pages_node[MAX_NUMNODES]; @@ -647,8 +648,9 @@ void restore_reserve_on_error(struct hst unsigned long address, struct page *page); /* arch callback */ -int __init __alloc_bootmem_huge_page(struct hstate *h); -int __init alloc_bootmem_huge_page(struct hstate *h); +int __init __alloc_bootmem_huge_page(struct hstate *h, int nid); +int __init alloc_bootmem_huge_page(struct hstate *h, int nid); +bool __init hugetlb_node_alloc_supported(void); void __init hugetlb_add_hstate(unsigned order); bool __init arch_hugetlb_valid_size(unsigned long size); --- a/mm/hugetlb.c~hugetlbfs-extend-the-definition-of-hugepages-parameter-to-support-node-allocation +++ a/mm/hugetlb.c @@ -77,6 +77,7 @@ static struct hstate * __initdata parsed static unsigned long __initdata default_hstate_max_huge_pages; static bool __initdata parsed_valid_hugepagesz = true; static bool __initdata parsed_default_hugepagesz; +static unsigned int default_hugepages_in_node[MAX_NUMNODES] __initdata; /* * Protects updates to hugepage_freelists, hugepage_activelist, nr_huge_pages, @@ -2963,33 +2964,39 @@ out_subpool_put: return ERR_PTR(-ENOSPC); } -int alloc_bootmem_huge_page(struct hstate *h) +int alloc_bootmem_huge_page(struct hstate *h, int nid) __attribute__ ((weak, alias("__alloc_bootmem_huge_page"))); -int __alloc_bootmem_huge_page(struct hstate *h) +int __alloc_bootmem_huge_page(struct hstate *h, int nid) { - struct huge_bootmem_page *m; + struct huge_bootmem_page *m = NULL; /* initialize for clang */ int nr_nodes, 
node; + if (nid >= nr_online_nodes) + return 0; + /* do node specific alloc */ + if (nid != NUMA_NO_NODE) { + m = memblock_alloc_try_nid_raw(huge_page_size(h), huge_page_size(h), + 0, MEMBLOCK_ALLOC_ACCESSIBLE, nid); + if (!m) + return 0; + goto found; + } + /* allocate from next node when distributing huge pages */ for_each_node_mask_to_alloc(h, nr_nodes, node, &node_states[N_MEMORY]) { - void *addr; - - addr = memblock_alloc_try_nid_raw( + m = memblock_alloc_try_nid_raw( huge_page_size(h), huge_page_size(h), 0, MEMBLOCK_ALLOC_ACCESSIBLE, node); - if (addr) { - /* - * Use the beginning of the huge page to store the - * huge_bootmem_page struct (until gather_bootmem - * puts them into the mem_map). - */ - m = addr; - goto found; - } + /* + * Use the beginning of the huge page to store the + * huge_bootmem_page struct (until gather_bootmem + * puts them into the mem_map). + */ + if (!m) + return 0; + goto found; } - return 0; found: - BUG_ON(!IS_ALIGNED(virt_to_phys(m), huge_page_size(h))); /* Put them into a private list first because mem_map is not up yet */ INIT_LIST_HEAD(&m->list); list_add(&m->list, &huge_boot_pages); @@ -3029,12 +3036,61 @@ static void __init gather_bootmem_preall cond_resched(); } } +static void __init hugetlb_hstate_alloc_pages_onenode(struct hstate *h, int nid) +{ + unsigned long i; + char buf[32]; + + for (i = 0; i < h->max_huge_pages_node[nid]; ++i) { + if (hstate_is_gigantic(h)) { + if (!alloc_bootmem_huge_page(h, nid)) + break; + } else { + struct page *page; + gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE; + + page = alloc_fresh_huge_page(h, gfp_mask, nid, + &node_states[N_MEMORY], NULL); + if (!page) + break; + put_page(page); /* free it into the hugepage allocator */ + } + cond_resched(); + } + if (i == h->max_huge_pages_node[nid]) + return; + + string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32); + pr_warn("HugeTLB: allocating %u of page size %s failed node%d. 
Only allocated %lu hugepages.\n", + h->max_huge_pages_node[nid], buf, nid, i); + h->max_huge_pages -= (h->max_huge_pages_node[nid] - i); + h->max_huge_pages_node[nid] = i; +} static void __init hugetlb_hstate_alloc_pages(struct hstate *h) { unsigned long i; nodemask_t *node_alloc_noretry; + bool node_specific_alloc = false; + + /* skip gigantic hugepages allocation if hugetlb_cma enabled */ + if (hstate_is_gigantic(h) && hugetlb_cma_size) { + pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n"); + return; + } + + /* do node specific alloc */ + for (i = 0; i < nr_online_nodes; i++) { + if (h->max_huge_pages_node[i] > 0) { + hugetlb_hstate_alloc_pages_onenode(h, i); + node_specific_alloc = true; + } + } + if (node_specific_alloc) + return; + + /* below will do all node balanced alloc */ if (!hstate_is_gigantic(h)) { /* * Bit mask controlling how hard we retry per-node allocations. @@ -3055,11 +3111,7 @@ static void __init hugetlb_hstate_alloc_ for (i = 0; i < h->max_huge_pages; ++i) { if (hstate_is_gigantic(h)) { - if (hugetlb_cma_size) { - pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n"); - goto free; - } - if (!alloc_bootmem_huge_page(h)) + if (!alloc_bootmem_huge_page(h, NUMA_NO_NODE)) break; } else if (!alloc_pool_huge_page(h, &node_states[N_MEMORY], @@ -3075,7 +3127,6 @@ static void __init hugetlb_hstate_alloc_ h->max_huge_pages, buf, i); h->max_huge_pages = i; } -free: kfree(node_alloc_noretry); } @@ -3990,6 +4041,10 @@ static int __init hugetlb_init(void) } default_hstate.max_huge_pages = default_hstate_max_huge_pages; + + for (i = 0; i < nr_online_nodes; i++) + default_hstate.max_huge_pages_node[i] = + default_hugepages_in_node[i]; } } @@ -4050,6 +4105,10 @@ void __init hugetlb_add_hstate(unsigned parsed_hstate = h; } +bool __init __weak hugetlb_node_alloc_supported(void) +{ + return true; +} /* * hugepages command line processing * hugepages normally follows a valid hugepagsz or default_hugepagsz @@ -4061,6 
+4120,10 @@ static int __init hugepages_setup(char * { unsigned long *mhp; static unsigned long *last_mhp; + int node = NUMA_NO_NODE; + int count; + unsigned long tmp; + char *p = s; if (!parsed_valid_hugepagesz) { pr_warn("HugeTLB: hugepages=%s does not follow a valid hugepagesz, ignoring\n", s); @@ -4084,8 +4147,40 @@ static int __init hugepages_setup(char * return 0; } - if (sscanf(s, "%lu", mhp) <= 0) - *mhp = 0; + while (*p) { + count = 0; + if (sscanf(p, "%lu%n", &tmp, &count) != 1) + goto invalid; + /* Parameter is node format */ + if (p[count] == ':') { + if (!hugetlb_node_alloc_supported()) { + pr_warn("HugeTLB: architecture can't support node specific alloc, ignoring!\n"); + return 0; + } + node = tmp; + p += count + 1; + if (node < 0 || node >= nr_online_nodes) + goto invalid; + /* Parse hugepages */ + if (sscanf(p, "%lu%n", &tmp, &count) != 1) + goto invalid; + if (!hugetlb_max_hstate) + default_hugepages_in_node[node] = tmp; + else + parsed_hstate->max_huge_pages_node[node] = tmp; + *mhp += tmp; + /* Go to parse next node*/ + if (p[count] == ',') + p += count + 1; + else + break; + } else { + if (p != s) + goto invalid; + *mhp = tmp; + break; + } + } /* * Global state is always initialized later in hugetlb_init. 
@@ -4098,6 +4193,10 @@ static int __init hugepages_setup(char * last_mhp = mhp; return 1; + +invalid: + pr_warn("HugeTLB: Invalid hugepages parameter %s\n", p); + return 0; } __setup("hugepages=", hugepages_setup); @@ -4159,6 +4258,7 @@ __setup("hugepagesz=", hugepagesz_setup) static int __init default_hugepagesz_setup(char *s) { unsigned long size; + int i; parsed_valid_hugepagesz = false; if (parsed_default_hugepagesz) { @@ -4187,6 +4287,9 @@ static int __init default_hugepagesz_set */ if (default_hstate_max_huge_pages) { default_hstate.max_huge_pages = default_hstate_max_huge_pages; + for (i = 0; i < nr_online_nodes; i++) + default_hstate.max_huge_pages_node[i] = + default_hugepages_in_node[i]; if (hstate_is_gigantic(&default_hstate)) hugetlb_hstate_alloc_pages(&default_hstate); default_hstate_max_huge_pages = 0; From patchwork Fri Nov 5 20:43:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605737 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0FAA2C433EF for ; Fri, 5 Nov 2021 20:43:35 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B81B16135E for ; Fri, 5 Nov 2021 20:43:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org B81B16135E Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 5697E94009F; Fri, 5 Nov 2021 16:43:34 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 51886940093; Fri, 5 Nov 2021 16:43:34 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from 
userid 63042) id 3E0F794009F; Fri, 5 Nov 2021 16:43:34 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0236.hostedemail.com [216.40.44.236]) by kanga.kvack.org (Postfix) with ESMTP id 30947940093 for ; Fri, 5 Nov 2021 16:43:34 -0400 (EDT) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id F3054184B28F1 for ; Fri, 5 Nov 2021 20:43:33 +0000 (UTC) X-FDA: 78776052306.09.249A559 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf09.hostedemail.com (Postfix) with ESMTP id 66E003000110 for ; Fri, 5 Nov 2021 20:43:33 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 7D27D61357; Fri, 5 Nov 2021 20:43:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145012; bh=JwWtMJ9zkt/tlGDYqm+z8K1z+UxPVs0nEijRXcLv8OY=; h=Date:From:To:Subject:In-Reply-To:From; b=EHuuL7SM22tqfvAX7GnrF99bn1QtdLIoYK/X1yYxzNt35/pyScQshqpV8pcJyjxyc DzLNzGTOwJakuJIJ9HbhAhnPGfemcfE9+KGw0xsp7l+Nzd22OuVjZgwYG6tYOJLOVS GxEADXf/XFkmJsl4V+OQwCOmhwgGP4GBy0lj1PJ8= Date: Fri, 05 Nov 2021 13:43:32 -0700 From: Andrew Morton To: akpm@linux-foundation.org, jhubbard@nvidia.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, o451686892@gmail.com, torvalds@linux-foundation.org, ying.huang@intel.com Subject: [patch 170/262] mm/migrate: de-duplicate migrate_reason strings Message-ID: <20211105204332.dOk2Us1-H%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 66E003000110 X-Stat-Signature: xxonokxb7ck4jumx6u8617od4x3rdcwb Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=EHuuL7SM; dmarc=none; spf=pass (imf09.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) 
smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636145013-359320 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: John Hubbard Subject: mm/migrate: de-duplicate migrate_reason strings In order to remove the need to manually keep three different files in synch, provide a common definition of the mapping between enum migrate_reason, and the associated strings for each enum item. 1. Use the tracing system's mapping of enums to strings, by redefining and reusing the MIGRATE_REASON and supporting macros, and using that to populate the string array in mm/debug.c. 2. Move enum migrate_reason to migrate_mode.h. This is not strictly necessary for this patch, but migrate mode and migrate reason go together, so this will slightly clarify things. Link: https://lkml.kernel.org/r/20210922041755.141817-2-jhubbard@nvidia.com Signed-off-by: John Hubbard Reviewed-by: Weizhao Ouyang Cc: "Huang, Ying" Signed-off-by: Andrew Morton --- include/linux/migrate.h | 19 +------------------ include/linux/migrate_mode.h | 13 +++++++++++++ mm/debug.c | 20 +++++++++++--------- 3 files changed, 25 insertions(+), 27 deletions(-) --- a/include/linux/migrate.h~mm-migrate-de-duplicate-migrate_reason-strings +++ a/include/linux/migrate.h @@ -19,24 +19,7 @@ struct migration_target_control; */ #define MIGRATEPAGE_SUCCESS 0 -/* - * Keep sync with: - * - macro MIGRATE_REASON in include/trace/events/migrate.h - * - migrate_reason_names[MR_TYPES] in mm/debug.c - */ -enum migrate_reason { - MR_COMPACTION, - MR_MEMORY_FAILURE, - MR_MEMORY_HOTPLUG, - MR_SYSCALL, /* also applies to cpusets */ - MR_MEMPOLICY_MBIND, - MR_NUMA_MISPLACED, - MR_CONTIG_RANGE, - MR_LONGTERM_PIN, - MR_DEMOTION, - MR_TYPES -}; - +/* Defined in mm/debug.c: */ extern const char *migrate_reason_names[MR_TYPES]; #ifdef CONFIG_MIGRATION --- a/include/linux/migrate_mode.h~mm-migrate-de-duplicate-migrate_reason-strings +++ 
a/include/linux/migrate_mode.h @@ -19,4 +19,17 @@ enum migrate_mode { MIGRATE_SYNC_NO_COPY, }; +enum migrate_reason { + MR_COMPACTION, + MR_MEMORY_FAILURE, + MR_MEMORY_HOTPLUG, + MR_SYSCALL, /* also applies to cpusets */ + MR_MEMPOLICY_MBIND, + MR_NUMA_MISPLACED, + MR_CONTIG_RANGE, + MR_LONGTERM_PIN, + MR_DEMOTION, + MR_TYPES +}; + #endif /* MIGRATE_MODE_H_INCLUDED */ --- a/mm/debug.c~mm-migrate-de-duplicate-migrate_reason-strings +++ a/mm/debug.c @@ -16,17 +16,19 @@ #include #include "internal.h" +#include + +/* + * Define EM() and EMe() so that MIGRATE_REASON from trace/events/migrate.h can + * be used to populate migrate_reason_names[]. + */ +#undef EM +#undef EMe +#define EM(a, b) b, +#define EMe(a, b) b const char *migrate_reason_names[MR_TYPES] = { - "compaction", - "memory_failure", - "memory_hotplug", - "syscall_or_cpuset", - "mempolicy_mbind", - "numa_misplaced", - "contig_range", - "longterm_pin", - "demotion", + MIGRATE_REASON }; const struct trace_print_flags pageflag_names[] = { From patchwork Fri Nov 5 20:43:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605989 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C4B22C433F5 for ; Fri, 5 Nov 2021 20:52:12 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 79CF460C51 for ; Fri, 5 Nov 2021 20:52:12 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 79CF460C51 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 11B84940103; Fri, 5 Nov 2021 16:52:07 -0400 (EDT) Received: by 
kanga.kvack.org (Postfix, from userid 40) id 0CD52940102; Fri, 5 Nov 2021 16:52:07 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C89CB940103; Fri, 5 Nov 2021 16:52:06 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0175.hostedemail.com [216.40.44.175]) by kanga.kvack.org (Postfix) with ESMTP id AEE23940102 for ; Fri, 5 Nov 2021 16:52:06 -0400 (EDT) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 676E01856C2EB for ; Fri, 5 Nov 2021 20:52:06 +0000 (UTC) X-FDA: 78776073810.26.E02076C Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf15.hostedemail.com (Postfix) with ESMTP id 09B2BD0000A9 for ; Fri, 5 Nov 2021 20:51:54 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id AD9F96135E; Fri, 5 Nov 2021 20:43:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145015; bh=S3ylJVi4Y3zA8dfZ57nqli1UzSqoSQsY+ZY63liPn/g=; h=Date:From:To:Subject:In-Reply-To:From; b=iUFsHqbw+l433P5GE87zdjzZjkg10jFVbQdKy8Qt9zrM+wt3wYvyMU3fStA/TJ3F9 pVhsvuOQCSSR0CEIBU48KmTx9TLdz40pq/uPKEJzI5kfuSKwkWypoPGwfwT4VH52vJ mk9U94GqGII/1ggKnRY/1ELomBSmo9n7y2gyvF4o= Date: Fri, 05 Nov 2021 13:43:35 -0700 From: Andrew Morton To: akpm@linux-foundation.org, dave.hansen@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, shy828301@gmail.com, torvalds@linux-foundation.org, ying.huang@intel.com Subject: [patch 171/262] mm: migrate: make demotion knob depend on migration Message-ID: <20211105204335.h7ZAr40SM%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 09B2BD0000A9 X-Stat-Signature: q8ai9nhx7uybf349ysub4g6pkxk78u8o Authentication-Results: imf15.hostedemail.com; dkim=pass 
header.d=linux-foundation.org header.s=korg header.b=iUFsHqbw; spf=pass (imf15.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145514-627936 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Yang Shi Subject: mm: migrate: make demotion knob depend on migration Memory demotion calls migrate_pages() to do its work and is controlled by a knob, but the knob does not depend on CONFIG_MIGRATION. The knob can be turned on even when MIGRATION is disabled. This does not cause a crash, since migrate_pages() just returns -ENOSYS, but it is not optimal to go through the demotion path and then retry regular swap every time. It also makes little sense to expose the knob to users when !MIGRATION. Move the related code from mempolicy.[h|c] to migrate.[h|c]. 
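The effect of tying the knob to the migration config option can be sketched in plain userspace C. This is an illustration only, not kernel code: SKETCH_CONFIG_MIGRATION and reclaim_should_demote() are hypothetical names standing in for the Kconfig symbol and for a reclaim-path caller. When the option is off, numa_demotion_enabled is the constant false, so the demotion branch can never be taken.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for CONFIG_MIGRATION (build with -DSKETCH_CONFIG_MIGRATION). */
#ifdef SKETCH_CONFIG_MIGRATION
/* Migration built in: the knob is a real variable that a sysfs store could flip. */
static bool numa_demotion_enabled = false;
#else
/* Migration compiled out: the knob is a constant and cannot be turned on. */
#define numa_demotion_enabled false
#endif

/* Hypothetical caller: should reclaim try demotion before falling back to swap? */
static bool reclaim_should_demote(int nr_lower_tier_nodes)
{
	assert(nr_lower_tier_nodes >= 0);
	return numa_demotion_enabled && nr_lower_tier_nodes > 0;
}
```

Built without SKETCH_CONFIG_MIGRATION, reclaim_should_demote() always returns false regardless of its argument, which mirrors why the knob should not be visible when !MIGRATION.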
Link: https://lkml.kernel.org/r/20211015005559.246709-1-shy828301@gmail.com Signed-off-by: Yang Shi Acked-by: "Huang, Ying" Cc: Dave Hansen Signed-off-by: Andrew Morton --- include/linux/mempolicy.h | 4 -- include/linux/migrate.h | 4 ++ mm/mempolicy.c | 61 ------------------------------------ mm/migrate.c | 61 ++++++++++++++++++++++++++++++++++++ 4 files changed, 65 insertions(+), 65 deletions(-) --- a/include/linux/mempolicy.h~mm-migrate-make-demotion-knob-depend-on-migration +++ a/include/linux/mempolicy.h @@ -183,8 +183,6 @@ extern bool vma_migratable(struct vm_are extern int mpol_misplaced(struct page *, struct vm_area_struct *, unsigned long); extern void mpol_put_task_policy(struct task_struct *); -extern bool numa_demotion_enabled; - static inline bool mpol_is_preferred_many(struct mempolicy *pol) { return (pol->mode == MPOL_PREFERRED_MANY); @@ -300,8 +298,6 @@ static inline nodemask_t *policy_nodemas return NULL; } -#define numa_demotion_enabled false - static inline bool mpol_is_preferred_many(struct mempolicy *pol) { return false; --- a/include/linux/migrate.h~mm-migrate-make-demotion-knob-depend-on-migration +++ a/include/linux/migrate.h @@ -40,6 +40,8 @@ extern int migrate_huge_page_move_mappin struct page *newpage, struct page *page); extern int migrate_page_move_mapping(struct address_space *mapping, struct page *newpage, struct page *page, int extra_count); + +extern bool numa_demotion_enabled; #else static inline void putback_movable_pages(struct list_head *l) {} @@ -65,6 +67,8 @@ static inline int migrate_huge_page_move { return -ENOSYS; } + +#define numa_demotion_enabled false #endif /* CONFIG_MIGRATION */ #ifdef CONFIG_COMPACTION --- a/mm/mempolicy.c~mm-migrate-make-demotion-knob-depend-on-migration +++ a/mm/mempolicy.c @@ -3057,64 +3057,3 @@ void mpol_to_str(char *buffer, int maxle p += scnprintf(p, buffer + maxlen - p, ":%*pbl", nodemask_pr_args(&nodes)); } - -bool numa_demotion_enabled = false; - -#ifdef CONFIG_SYSFS -static ssize_t 
numa_demotion_enabled_show(struct kobject *kobj, - struct kobj_attribute *attr, char *buf) -{ - return sysfs_emit(buf, "%s\n", - numa_demotion_enabled? "true" : "false"); -} - -static ssize_t numa_demotion_enabled_store(struct kobject *kobj, - struct kobj_attribute *attr, - const char *buf, size_t count) -{ - if (!strncmp(buf, "true", 4) || !strncmp(buf, "1", 1)) - numa_demotion_enabled = true; - else if (!strncmp(buf, "false", 5) || !strncmp(buf, "0", 1)) - numa_demotion_enabled = false; - else - return -EINVAL; - - return count; -} - -static struct kobj_attribute numa_demotion_enabled_attr = - __ATTR(demotion_enabled, 0644, numa_demotion_enabled_show, - numa_demotion_enabled_store); - -static struct attribute *numa_attrs[] = { - &numa_demotion_enabled_attr.attr, - NULL, -}; - -static const struct attribute_group numa_attr_group = { - .attrs = numa_attrs, -}; - -static int __init numa_init_sysfs(void) -{ - int err; - struct kobject *numa_kobj; - - numa_kobj = kobject_create_and_add("numa", mm_kobj); - if (!numa_kobj) { - pr_err("failed to create numa kobject\n"); - return -ENOMEM; - } - err = sysfs_create_group(numa_kobj, &numa_attr_group); - if (err) { - pr_err("failed to register numa group\n"); - goto delete_obj; - } - return 0; - -delete_obj: - kobject_put(numa_kobj); - return err; -} -subsys_initcall(numa_init_sysfs); -#endif --- a/mm/migrate.c~mm-migrate-make-demotion-knob-depend-on-migration +++ a/mm/migrate.c @@ -3306,3 +3306,64 @@ static int __init migrate_on_reclaim_ini } late_initcall(migrate_on_reclaim_init); #endif /* CONFIG_HOTPLUG_CPU */ + +bool numa_demotion_enabled = false; + +#ifdef CONFIG_SYSFS +static ssize_t numa_demotion_enabled_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf) +{ + return sysfs_emit(buf, "%s\n", + numa_demotion_enabled ? 
"true" : "false"); +} + +static ssize_t numa_demotion_enabled_store(struct kobject *kobj, + struct kobj_attribute *attr, + const char *buf, size_t count) +{ + if (!strncmp(buf, "true", 4) || !strncmp(buf, "1", 1)) + numa_demotion_enabled = true; + else if (!strncmp(buf, "false", 5) || !strncmp(buf, "0", 1)) + numa_demotion_enabled = false; + else + return -EINVAL; + + return count; +} + +static struct kobj_attribute numa_demotion_enabled_attr = + __ATTR(demotion_enabled, 0644, numa_demotion_enabled_show, + numa_demotion_enabled_store); + +static struct attribute *numa_attrs[] = { + &numa_demotion_enabled_attr.attr, + NULL, +}; + +static const struct attribute_group numa_attr_group = { + .attrs = numa_attrs, +}; + +static int __init numa_init_sysfs(void) +{ + int err; + struct kobject *numa_kobj; + + numa_kobj = kobject_create_and_add("numa", mm_kobj); + if (!numa_kobj) { + pr_err("failed to create numa kobject\n"); + return -ENOMEM; + } + err = sysfs_create_group(numa_kobj, &numa_attr_group); + if (err) { + pr_err("failed to register numa group\n"); + goto delete_obj; + } + return 0; + +delete_obj: + kobject_put(numa_kobj); + return err; +} +subsys_initcall(numa_init_sysfs); +#endif From patchwork Fri Nov 5 20:43:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605741 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4E1CDC433FE for ; Fri, 5 Nov 2021 20:43:41 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 01D0161362 for ; Fri, 5 Nov 2021 20:43:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 01D0161362 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) 
header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 9B08B9400A2; Fri, 5 Nov 2021 16:43:40 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 96001940093; Fri, 5 Nov 2021 16:43:40 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 877B59400A2; Fri, 5 Nov 2021 16:43:40 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0176.hostedemail.com [216.40.44.176]) by kanga.kvack.org (Postfix) with ESMTP id 75945940093 for ; Fri, 5 Nov 2021 16:43:40 -0400 (EDT) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 3AE1377A42 for ; Fri, 5 Nov 2021 20:43:40 +0000 (UTC) X-FDA: 78776052600.21.F65870D Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf15.hostedemail.com (Postfix) with ESMTP id B259ED0000AF for ; Fri, 5 Nov 2021 20:43:28 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id C10D061357; Fri, 5 Nov 2021 20:43:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145019; bh=A1b8T3O1T0WocifxUVe3XDB6mxulsM/WPJ8ZXiLYRy4=; h=Date:From:To:Subject:In-Reply-To:From; b=u3faRA7HpsYo4pkZtNx87bc/4o+AqbWN73KVDdAIiVS6eVjJqnAN8cNKfwRz2Jd/u DSZoRnJjoTvgrTfc1eFw15gVqYWB2k08X9WJlgjI2+IXsosUJF9WfPTSvURQbGXIVK zeu37hWkUUD+cHQGDgjGOR4f/vyQRpQrAp5vtAyY= Date: Fri, 05 Nov 2021 13:43:38 -0700 From: Andrew Morton To: akpm@linux-foundation.org, davis.george@siemens.com, erosca@de.adit-jv.com, koct9i@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, skhan@linuxfoundation.org, torvalds@linux-foundation.org Subject: [patch 172/262] selftests/vm/transhuge-stress: fix ram size thinko Message-ID: <20211105204338.945CWO140%akpm@linux-foundation.org> In-Reply-To: 
<20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=u3faRA7H; dmarc=none; spf=pass (imf15.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: B259ED0000AF X-Stat-Signature: 41sk93sztao3y9t99pzt56ncs8zdgenf X-HE-Tag: 1636145008-885406 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: "George G. Davis" Subject: selftests/vm/transhuge-stress: fix ram size thinko When executing transhuge-stress with an argument to specify the virtual memory size for testing, the ram size is reported as 0, e.g. transhuge-stress 384 thp-mmap: allocate 192 transhuge pages, using 384 MiB virtual memory and 0 MiB of ram thp-mmap: 0.184 s/loop, 0.957 ms/page, 2090.265 MiB/s 192 succeed, 0 failed This appears to be due to a thinko in commit 0085d61fe05e ("selftests/vm/transhuge-stress: stress test for memory compaction"), where, at a guess, the intent was to base "xyz MiB of ram" on `ram` size. Here are results after using `ram` size: thp-mmap: allocate 192 transhuge pages, using 384 MiB virtual memory and 14 MiB of ram Link: https://lkml.kernel.org/r/20210825135843.29052-1-george_davis@mentor.com Fixes: 0085d61fe05e ("selftests/vm/transhuge-stress: stress test for memory compaction") Signed-off-by: George G. 
Davis Cc: Konstantin Khlebnikov Cc: Eugeniu Rosca Cc: Shuah Khan Signed-off-by: Andrew Morton --- tools/testing/selftests/vm/transhuge-stress.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/tools/testing/selftests/vm/transhuge-stress.c~selftests-vm-transhuge-stress-fix-ram-size-thinko +++ a/tools/testing/selftests/vm/transhuge-stress.c @@ -79,7 +79,7 @@ int main(int argc, char **argv) warnx("allocate %zd transhuge pages, using %zd MiB virtual memory" " and %zd MiB of ram", len >> HPAGE_SHIFT, len >> 20, - len >> (20 + HPAGE_SHIFT - PAGE_SHIFT - 1)); + ram >> (20 + HPAGE_SHIFT - PAGE_SHIFT - 1)); pagemap_fd = open("/proc/self/pagemap", O_RDONLY); if (pagemap_fd < 0) From patchwork Fri Nov 5 20:43:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605745 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0020AC433F5 for ; Fri, 5 Nov 2021 20:43:45 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B1EAD61361 for ; Fri, 5 Nov 2021 20:43:44 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org B1EAD61361 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id A74489400A4; Fri, 5 Nov 2021 16:43:43 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A2222940093; Fri, 5 Nov 2021 16:43:43 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8C4559400A4; Fri, 5 Nov 2021 16:43:43 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com 
(smtprelay0024.hostedemail.com [216.40.44.24]) by kanga.kvack.org (Postfix) with ESMTP id 7AAB7940093 for ; Fri, 5 Nov 2021 16:43:43 -0400 (EDT) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 3E77E1801AB52 for ; Fri, 5 Nov 2021 20:43:43 +0000 (UTC) X-FDA: 78776052726.12.BB65319 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf15.hostedemail.com (Postfix) with ESMTP id D7663D00009F for ; Fri, 5 Nov 2021 20:43:31 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id EC72C6135A; Fri, 5 Nov 2021 20:43:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145022; bh=01xYxtqImsOiJdGOR+iFnMW064Fm72Egvi6gnHV65QI=; h=Date:From:To:Subject:In-Reply-To:From; b=uxEBy+B/LrODO6wjoDnwgO3cBVz3/B9nlun14cgQFYi+rvb9by+A/U9imx3Famte2 Yg4qYJrQXmYLiqtnlJ2lpTznOsHq8ssZ5MdHdpFI4YpOXfzNK+vLDeUVkC5P+ysDzL 3HIzh8CFzL+VLnqRmkVXWmTj5A2tbavwjsBYZWQo= Date: Fri, 05 Nov 2021 13:43:41 -0700 From: Andrew Morton To: akpm@linux-foundation.org, cfijalkovich@google.com, hughd@google.com, linux-mm@kvack.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, rongwei.wang@linux.alibaba.com, shy828301@gmail.com, song@kernel.org, stable@vger.kernel.org, torvalds@linux-foundation.org, william.kucharski@oracle.com, willy@infradead.org, xuyu@linux.alibaba.com Subject: [patch 173/262] mm, thp: lock filemap when truncating page cache Message-ID: <20211105204341.14d-EdxgW%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="uxEBy+B/"; dmarc=none; spf=pass (imf15.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: D7663D00009F 
X-Stat-Signature: 91p3z9rmfqb8ctrp3h5oouwn9y54ab59 X-HE-Tag: 1636145011-845013 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Rongwei Wang Subject: mm, thp: lock filemap when truncating page cache Patch series "fix two bugs for file THP". This patch (of 2): Transparent huge pages support read-only non-shmem files. A file-backed THP is collapsed by khugepaged and truncated when the file is written (for shared libraries). However, there is a race when multiple writers truncate the same page cache concurrently. In that case, subpage(s) of a file THP can be revealed by find_get_entry in truncate_inode_pages_range, which triggers the PageTail BUG_ON in truncate_inode_page, as follows. page:000000009e420ff2 refcount:1 mapcount:0 mapping:0000000000000000 index:0x7ff pfn:0x50c3ff head:0000000075ff816d order:9 compound_mapcount:0 compound_pincount:0 flags: 0x37fffe0000010815(locked|uptodate|lru|arch_1|head) raw: 37fffe0000000000 fffffe0013108001 dead000000000122 dead000000000400 raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000 head: 37fffe0000010815 fffffe001066bd48 ffff000404183c20 0000000000000000 head: 0000000000000600 0000000000000000 00000001ffffffff ffff000c0345a000 page dumped because: VM_BUG_ON_PAGE(PageTail(page)) ------------[ cut here ]------------ kernel BUG at mm/truncate.c:213! Internal error: Oops - BUG: 0 [#1] SMP Modules linked in: xfs(E) libcrc32c(E) rfkill(E) ... CPU: 14 PID: 11394 Comm: check_madvise_d Kdump: ... 
Hardware name: ECS, BIOS 0.0.0 02/06/2015 pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--) pc : truncate_inode_page+0x64/0x70 lr : truncate_inode_page+0x64/0x70 sp : ffff80001b60b900 x29: ffff80001b60b900 x28: 00000000000007ff x27: ffff80001b60b9a0 x26: 0000000000000000 x25: 000000000000000f x24: ffff80001b60b9a0 x23: ffff80001b60ba18 x22: ffff0001e0999ea8 x21: ffff0000c21db300 x20: ffffffffffffffff x19: fffffe001310ffc0 x18: 0000000000000020 x17: 0000000000000000 x16: 0000000000000000 x15: ffff0000c21db960 x14: 3030306666666620 x13: 6666666666666666 x12: 3130303030303030 x11: ffff8000117b69b8 x10: 00000000ffff8000 x9 : ffff80001012690c x8 : 0000000000000000 x7 : ffff8000114f69b8 x6 : 0000000000017ffd x5 : ffff0007fffbcbc8 x4 : ffff80001b60b5c0 x3 : 0000000000000001 x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000 Call trace: truncate_inode_page+0x64/0x70 truncate_inode_pages_range+0x550/0x7e4 truncate_pagecache+0x58/0x80 do_dentry_open+0x1e4/0x3c0 vfs_open+0x38/0x44 do_open+0x1f0/0x310 path_openat+0x114/0x1dc do_filp_open+0x84/0x134 do_sys_openat2+0xbc/0x164 __arm64_sys_openat+0x74/0xc0 el0_svc_common.constprop.0+0x88/0x220 do_el0_svc+0x30/0xa0 el0_svc+0x20/0x30 el0_sync_handler+0x1a4/0x1b0 el0_sync+0x180/0x1c0 Code: aa0103e0 900061e1 910ec021 9400d300 (d4210000) ---[ end trace f70cdb42cb7c2d42 ]--- Kernel panic - not syncing: Oops - BUG: Fatal exception This patch locks the filemap before entering truncate_pagecache(), so that the same page cache is not truncated concurrently. 
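The serialization idea can be sketched as a userspace analogy. This is not the kernel implementation: struct sketch_mapping, sketch_truncate_locked() and sketch_open_for_write() are hypothetical names, and a pthread mutex stands in for the filemap invalidate lock. Unlike the patch, which checks filemap_nr_thps() before taking the lock, the sketch does the check under the lock so that it stays free of data races in plain C.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an address_space that caches huge pages. */
struct sketch_mapping {
	pthread_mutex_t invalidate_lock;	/* plays the role of the invalidate lock */
	size_t nr_pages;			/* pages currently in the "page cache" */
	size_t nr_thps;				/* how many of those are huge pages */
};

/* Drop every cached page. Caller must hold invalidate_lock. */
static void sketch_truncate_locked(struct sketch_mapping *m)
{
	m->nr_pages = 0;
	m->nr_thps = 0;
}

/*
 * Called by each writer that opens the file. Only one caller at a time can
 * be inside the truncation. Returns 1 if this caller truncated, 0 if there
 * was nothing left to truncate.
 */
static int sketch_open_for_write(struct sketch_mapping *m)
{
	int truncated = 0;

	assert(m != NULL);
	pthread_mutex_lock(&m->invalidate_lock);
	if (m->nr_thps) {
		sketch_truncate_locked(m);
		truncated = 1;
	}
	pthread_mutex_unlock(&m->invalidate_lock);
	return truncated;
}
```

A second writer that arrives while the first holds the lock waits, then finds nr_thps already zero and skips the truncation, so no caller ever walks a half-truncated cache.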
Link: https://lkml.kernel.org/r/20211025092134.18562-1-rongwei.wang@linux.alibaba.com Link: https://lkml.kernel.org/r/20211025092134.18562-2-rongwei.wang@linux.alibaba.com Fixes: eb6ecbed0aa2 ("mm, thp: relax the VM_DENYWRITE constraint on file-backed THPs") Signed-off-by: Xu Yu Signed-off-by: Rongwei Wang Suggested-by: Matthew Wilcox (Oracle) Tested-by: Song Liu Cc: Collin Fijalkovich Cc: Hugh Dickins Cc: Mike Kravetz Cc: William Kucharski Cc: Yang Shi Cc: Signed-off-by: Andrew Morton --- fs/open.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) --- a/fs/open.c~mm-thp-lock-filemap-when-truncating-page-cache +++ a/fs/open.c @@ -856,8 +856,11 @@ static int do_dentry_open(struct file *f * of THPs into the page cache will fail. */ smp_mb(); - if (filemap_nr_thps(inode->i_mapping)) + if (filemap_nr_thps(inode->i_mapping)) { + filemap_invalidate_lock(inode->i_mapping); truncate_pagecache(inode, 0); + filemap_invalidate_unlock(inode->i_mapping); + } } return 0; From patchwork Fri Nov 5 20:43:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605749 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D749DC433F5 for ; Fri, 5 Nov 2021 20:43:57 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 8C7C261371 for ; Fri, 5 Nov 2021 20:43:57 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 8C7C261371 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 2D0919400A6; Fri, 5 Nov 2021 16:43:57 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from 
userid 40) id 2803B940093; Fri, 5 Nov 2021 16:43:57 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 16F659400A6; Fri, 5 Nov 2021 16:43:57 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0241.hostedemail.com [216.40.44.241]) by kanga.kvack.org (Postfix) with ESMTP id 0634A940093 for ; Fri, 5 Nov 2021 16:43:57 -0400 (EDT) Received: from smtpin27.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id BF73A77A42 for ; Fri, 5 Nov 2021 20:43:56 +0000 (UTC) X-FDA: 78776053272.27.2FD8E40 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf08.hostedemail.com (Postfix) with ESMTP id 796A530000B5 for ; Fri, 5 Nov 2021 20:43:34 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 2C51061362; Fri, 5 Nov 2021 20:43:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145025; bh=r9OzvJHxM88WnMJ5qg3JF+Fqwhxp7zsPajgRsydPK2w=; h=Date:From:To:Subject:In-Reply-To:From; b=J5Fb1RBCwLWb5il9M0P0V92y6s/S09842O6bKvTwul2NhhXzry/oXO5y21AzeUMIe sffk7rKHMkLOt5ZFRIkPjiPMLLjGqg+hKU4sz406LidHLyXRIlw04ruAILd/VNl6AN y8OAaQuEKiO493ldv2tEwDcPod7tZPsjg6QAYeoA= Date: Fri, 05 Nov 2021 13:43:44 -0700 From: Andrew Morton To: akpm@linux-foundation.org, cfijalkovich@google.com, hughd@google.com, linux-mm@kvack.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org, rongwei.wang@linux.alibaba.com, shy828301@gmail.com, song@kernel.org, stable@vger.kernel.org, torvalds@linux-foundation.org, william.kucharski@oracle.com, willy@infradead.org, xuyu@linux.alibaba.com Subject: [patch 174/262] mm, thp: fix incorrect unmap behavior for private pages Message-ID: <20211105204344.v9quKumgv%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 
796A530000B5 X-Stat-Signature: i6pddxup3p5c41ppgtswrc9m5gjmru5o Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=J5Fb1RBC; spf=pass (imf08.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145014-190392 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Rongwei Wang Subject: mm, thp: fix incorrect unmap behavior for private pages When truncating the page cache of a file THP, the private pages of a process should not be unmapped. When this incorrect behavior hits a dynamic shared library (DSO), the processes using that library dump core. A simple test for a DSO (the prerequisite is that the DSO is mapped with file THP): int main(int argc, char *argv[]) { int fd; fd = open(argv[1], O_WRONLY); if (fd < 0) { perror("open"); } close(fd); return 0; } The test only opens the target DSO and does nothing else, yet this is enough to make one or more processes dump core. This patch fixes the bug. 
Link: https://lkml.kernel.org/r/20211025092134.18562-3-rongwei.wang@linux.alibaba.com Fixes: eb6ecbed0aa2 ("mm, thp: relax the VM_DENYWRITE constraint on file-backed THPs") Signed-off-by: Rongwei Wang Tested-by: Xu Yu Cc: Matthew Wilcox (Oracle) Cc: Song Liu Cc: William Kucharski Cc: Hugh Dickins Cc: Yang Shi Cc: Mike Kravetz Cc: Collin Fijalkovich Cc: Signed-off-by: Andrew Morton --- fs/open.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) --- a/fs/open.c~mm-thp-fix-incorrect-unmap-behavior-for-private-pages +++ a/fs/open.c @@ -857,8 +857,17 @@ static int do_dentry_open(struct file *f */ smp_mb(); if (filemap_nr_thps(inode->i_mapping)) { + struct address_space *mapping = inode->i_mapping; + filemap_invalidate_lock(inode->i_mapping); - truncate_pagecache(inode, 0); + /* + * unmap_mapping_range just need to be called once + * here, because the private pages is not need to be + * unmapped mapping (e.g. data segment of dynamic + * shared libraries here). + */ + unmap_mapping_range(mapping, 0, 0, 0); + truncate_inode_pages(mapping, 0); filemap_invalidate_unlock(inode->i_mapping); } } From patchwork Fri Nov 5 20:43:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605751 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 18EA2C433EF for ; Fri, 5 Nov 2021 20:43:59 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id DB64C6136A for ; Fri, 5 Nov 2021 20:43:58 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org DB64C6136A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org 
Received: by kanga.kvack.org (Postfix) id 989479400A8; Fri, 5 Nov 2021 16:43:57 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 89B51940093; Fri, 5 Nov 2021 16:43:57 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7D88C9400A7; Fri, 5 Nov 2021 16:43:57 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0181.hostedemail.com [216.40.44.181]) by kanga.kvack.org (Postfix) with ESMTP id 6CB85940093 for ; Fri, 5 Nov 2021 16:43:57 -0400 (EDT) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 3405077816 for ; Fri, 5 Nov 2021 20:43:57 +0000 (UTC) X-FDA: 78776053314.24.9D2DF90 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf28.hostedemail.com (Postfix) with ESMTP id 0F09790000A5 for ; Fri, 5 Nov 2021 20:43:48 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 413416135A; Fri, 5 Nov 2021 20:43:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145028; bh=J9KwG6Wm/mjzVYdnRDnygEce6+0CjcxTUg3/kbFl698=; h=Date:From:To:Subject:In-Reply-To:From; b=BOUliP9UgAhw9xlvin+k+WhXlZEfFAGP5i3FhgSGH6Re4fJUaVtzOLxhYuAoj8crz thnPhfyxaiFWxKczYFh6qsQsTnkut1OAtth7eQKWdK7KIoBP3PcnL3QoduPv0nZLSc NBRj6Sv/GVmc9XDxw+qbeoMyau+Wfa5AWwohhzvs= Date: Fri, 05 Nov 2021 13:43:47 -0700 From: Andrew Morton To: akpm@linux-foundation.org, linf@wangsu.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 175/262] mm/readahead.c: fix incorrect comments for get_init_ra_size Message-ID: <20211105204347.ezf3f-ajG%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 0F09790000A5 X-Stat-Signature: dcsdkw8ggmi4w8xmcz39ebin85kymqot 
Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=BOUliP9U; spf=pass (imf28.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145028-682087 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Lin Feng Subject: mm/readahead.c: fix incorrect comments for get_init_ra_size The values returned by get_init_ra_size() are not as intuitive as the existing comment suggests. Update the comment to reflect what the function actually returns. Link: https://lkml.kernel.org/r/20211019104812.135602-1-linf@wangsu.com Signed-off-by: Lin Feng Signed-off-by: Andrew Morton --- mm/readahead.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/readahead.c~mm-readaheadc-fix-incorrect-comments-for-get_init_ra_size +++ a/mm/readahead.c @@ -309,7 +309,7 @@ void force_page_cache_ra(struct readahea * Set the initial window size, round to next power of 2 and square * for small size, x 4 for medium, and x 2 for large * for 128k (32 page) max ra - * 1-8 page = 32k initial, > 8 page = 128k initial + * 1-2 page = 16k, 3-4 page 32k, 5-8 page = 64k, > 8 page = 128k initial */ static unsigned long get_init_ra_size(unsigned long size, unsigned long max) { From patchwork Fri Nov 5 20:43:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605747 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 53E54C433EF for ; Fri, 5 Nov 2021 20:43:53 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 1F9246136A 
for ; Fri, 5 Nov 2021 20:43:53 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 1F9246136A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id BAA1C9400A5; Fri, 5 Nov 2021 16:43:52 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B59F8940093; Fri, 5 Nov 2021 16:43:52 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A71119400A5; Fri, 5 Nov 2021 16:43:52 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0036.hostedemail.com [216.40.44.36]) by kanga.kvack.org (Postfix) with ESMTP id 96E80940093 for ; Fri, 5 Nov 2021 16:43:52 -0400 (EDT) Received: from smtpin19.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 57B5F184B28F1 for ; Fri, 5 Nov 2021 20:43:52 +0000 (UTC) X-FDA: 78776053020.19.236185D Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf09.hostedemail.com (Postfix) with ESMTP id 07FA1300011D for ; Fri, 5 Nov 2021 20:43:51 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 17CA961360; Fri, 5 Nov 2021 20:43:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145031; bh=W5xUEmp0/AmZRI/AUD9aU2KG3gvz5TSEBKt6VoBvGcQ=; h=Date:From:To:Subject:In-Reply-To:From; b=2SvDvKEuGg6DvyT2JzWawYZOubDXe37ZeF6X2LLjFs0v1uLgJnYEedF40n9KkpiYJ iIaAO3RXp6o/cdWi+T2qTxBMSQ2/tW84ZqnFUlbe9C+0Md71EFiTEiY7OA4dx5Ojr5 nNzSvHffXX0QM4X2/E2/yh4BOioOfgeH2BKM3ghQ= Date: Fri, 05 Nov 2021 13:43:50 -0700 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, wangkefeng.wang@huawei.com Subject: [patch 176/262] mm: nommu: kill arch_get_unmapped_area() Message-ID: 
<20211105204350.YecbdIUJT%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 07FA1300011D X-Stat-Signature: qo7uf3jqdwgxhzura3c5dfcet9cbgm79 Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=2SvDvKEu; dmarc=none; spf=pass (imf09.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636145031-477904 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Kefeng Wang Subject: mm: nommu: kill arch_get_unmapped_area() On nommu, arch_get_unmapped_area() is never called, so remove it. Link: https://lkml.kernel.org/r/20210910061906.36299-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang Signed-off-by: Andrew Morton --- mm/nommu.c | 6 ------ 1 file changed, 6 deletions(-) --- a/mm/nommu.c~mm-nommu-kill-arch_get_unmapped_area +++ a/mm/nommu.c @@ -1639,12 +1639,6 @@ int remap_vmalloc_range(struct vm_area_s } EXPORT_SYMBOL(remap_vmalloc_range); -unsigned long arch_get_unmapped_area(struct file *file, unsigned long addr, - unsigned long len, unsigned long pgoff, unsigned long flags) -{ - return -ENOMEM; -} - vm_fault_t filemap_fault(struct vm_fault *vmf) { BUG(); From patchwork Fri Nov 5 20:43:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605985 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 931FDC433EF for ; Fri, 5 Nov 2021 20:52:09 +0000 (UTC) Received: from kanga.kvack.org 
(kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 48DDF60C51 for ; Fri, 5 Nov 2021 20:52:09 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 48DDF60C51 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 2CF91940100; Fri, 5 Nov 2021 16:52:06 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 20939940102; Fri, 5 Nov 2021 16:52:06 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0045D940100; Fri, 5 Nov 2021 16:52:05 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0209.hostedemail.com [216.40.44.209]) by kanga.kvack.org (Postfix) with ESMTP id D1CC8940102 for ; Fri, 5 Nov 2021 16:52:05 -0400 (EDT) Received: from smtpin03.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 86E4175C8A for ; Fri, 5 Nov 2021 20:52:05 +0000 (UTC) X-FDA: 78776073810.03.E0DF260 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf07.hostedemail.com (Postfix) with ESMTP id 1BD5D10004C7 for ; Fri, 5 Nov 2021 20:52:04 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 2B6E861360; Fri, 5 Nov 2021 20:43:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145034; bh=ldbdqgySLyV6aEtd+JWP45gJRv0VXjxE+LXHchfO3L4=; h=Date:From:To:Subject:In-Reply-To:From; b=l3dOgDYTi8qwKON5BElbkFNbg3jzzq2rUlb3ZFbBEYGK6J9K4SQy1BUQ9dfKT/HJq VLow75tyO/Qqm6R68wibywYQ1zHrLNo8QNr817ujFm2f1YIGMxb39fJ/ukao+0Jt8J aYawter3KoDDPTubd7zFN1BOYTEp5aAa2/cmNs+U= Date: Fri, 05 Nov 2021 13:43:53 -0700 From: Andrew Morton To: akpm@linux-foundation.org, aneesh.kumar@linux.ibm.com, hughd@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, 
pasha.tatashin@soleen.com, shuah@kernel.org, torvalds@linux-foundation.org, tyhicks@linux.microsoft.com, zhansayabagdaulet@gmail.com Subject: [patch 177/262] selftest/vm: fix ksm selftest to run with different NUMA topologies Message-ID: <20211105204353.rj6UONMEl%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=l3dOgDYT; dmarc=none; spf=pass (imf07.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 1BD5D10004C7 X-Stat-Signature: qfffxu7kzdbhgnpjwqkprx5wb9tbmjyd X-HE-Tag: 1636145524-468017 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: "Aneesh Kumar K.V" Subject: selftest/vm: fix ksm selftest to run with different NUMA topologies Platforms can have non-contiguous NUMA nodes like below #numactl -H available: 2 nodes (0,8) ..... node distances: node 0 8 0: 10 40 8: 40 10 #numactl -H available: 1 nodes (1) .... node distances: node 1 1: 10 Hence update the test to not assume the presence of Node 0 and 1 and also use numa_num_configured_nodes() instead of numa_max_node for finding whether to skip the test. 
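As a rough illustration of the wrap-around search for a node with memory, here is a minimal standalone sketch. It does not use libnuma: MAX_NODE and the node_size table are hypothetical stand-ins for numa_max_node() and numa_node_size(), filled in to mimic the first topology above (only nodes 0 and 8 have memory), and the function names are chosen for this sketch only.

```c
/* Hypothetical stand-in for numa_max_node(): the highest node id. */
#define MAX_NODE 8

/*
 * Hypothetical stand-in for numa_node_size(): memory per node id.
 * Only nodes 0 and 8 are populated; the rest are implicitly 0.
 */
static const long node_size[MAX_NODE + 1] = {
	[0] = 4096,
	[8] = 4096,
};

/*
 * Walk the node ids after 'node', wrapping around at MAX_NODE, and
 * return the first id that has memory. The loop never revisits 'node'
 * itself, so if no other node has memory the last id probed is returned.
 */
static int next_mem_node(int node)
{
	int mem_node = 0;
	int i;

	for (i = node + 1; i <= MAX_NODE + node; i++) {
		mem_node = i % (MAX_NODE + 1);
		if (node_size[mem_node] > 0)
			break;
	}
	return mem_node;
}

/* Starting the search from MAX_NODE makes the walk begin at node 0. */
static int first_mem_node(void)
{
	return next_mem_node(MAX_NODE);
}
```

With this table, first_mem_node() returns 0 and next_mem_node(0) returns 8, so two allocations would land on the two nodes that actually have memory rather than on the assumed nodes 0 and 1.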
Link: https://lkml.kernel.org/r/20210914141414.350759-1-aneesh.kumar@linux.ibm.com Fixes: 82e717ad3501 ("selftests: vm: add KSM merging across nodes test") Signed-off-by: Aneesh Kumar K.V Reviewed-by: Pasha Tatashin Cc: Zhansaya Bagdauletkyzy Cc: Pavel Tatashin Cc: Tyler Hicks Cc: Hugh Dickins Cc: Shuah Khan Signed-off-by: Andrew Morton --- tools/testing/selftests/vm/ksm_tests.c | 29 ++++++++++++++++++++--- 1 file changed, 26 insertions(+), 3 deletions(-) --- a/tools/testing/selftests/vm/ksm_tests.c~selftest-vm-fix-ksm-selftest-to-run-with-different-numa-topologies +++ a/tools/testing/selftests/vm/ksm_tests.c @@ -354,12 +354,34 @@ err_out: return KSFT_FAIL; } +static int get_next_mem_node(int node) +{ + + long node_size; + int mem_node = 0; + int i, max_node = numa_max_node(); + + for (i = node + 1; i <= max_node + node; i++) { + mem_node = i % (max_node + 1); + node_size = numa_node_size(mem_node, NULL); + if (node_size > 0) + break; + } + return mem_node; +} + +static int get_first_mem_node(void) +{ + return get_next_mem_node(numa_max_node()); +} + static int check_ksm_numa_merge(int mapping, int prot, int timeout, bool merge_across_nodes, size_t page_size) { void *numa1_map_ptr, *numa2_map_ptr; struct timespec start_time; int page_count = 2; + int first_node; if (clock_gettime(CLOCK_MONOTONIC_RAW, &start_time)) { perror("clock_gettime"); @@ -370,7 +392,7 @@ static int check_ksm_numa_merge(int mapp perror("NUMA support not enabled"); return KSFT_SKIP; } - if (numa_max_node() < 1) { + if (numa_num_configured_nodes() <= 1) { printf("At least 2 NUMA nodes must be available\n"); return KSFT_SKIP; } @@ -378,8 +400,9 @@ static int check_ksm_numa_merge(int mapp return KSFT_FAIL; /* allocate 2 pages in 2 different NUMA nodes and fill them with the same data */ - numa1_map_ptr = numa_alloc_onnode(page_size, 0); - numa2_map_ptr = numa_alloc_onnode(page_size, 1); + first_node = get_first_mem_node(); + numa1_map_ptr = numa_alloc_onnode(page_size, first_node); + numa2_map_ptr 
= numa_alloc_onnode(page_size, get_next_mem_node(first_node)); if (!numa1_map_ptr || !numa2_map_ptr) { perror("numa_alloc_onnode"); return KSFT_FAIL; From patchwork Fri Nov 5 20:43:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605753 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 028C2C433FE for ; Fri, 5 Nov 2021 20:44:01 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A89EE61371 for ; Fri, 5 Nov 2021 20:44:00 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org A89EE61371 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 913869400A7; Fri, 5 Nov 2021 16:43:58 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 8C2CC940093; Fri, 5 Nov 2021 16:43:58 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6EEB19400A9; Fri, 5 Nov 2021 16:43:58 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 522BB940093 for ; Fri, 5 Nov 2021 16:43:58 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 19AE3184875A7 for ; Fri, 5 Nov 2021 20:43:58 +0000 (UTC) X-FDA: 78776053356.22.E156C28 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf08.hostedemail.com (Postfix) with ESMTP id C597130000AA for ; Fri, 5 Nov 2021 20:43:45 +0000 (UTC) 
Received: by mail.kernel.org (Postfix) with ESMTPSA id 339436136A; Fri, 5 Nov 2021 20:43:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145037; bh=gEfqlLcFkmfYsYfMe813v1IrjynanjwWLqYY7B16zMk=; h=Date:From:To:Subject:In-Reply-To:From; b=hoq4aRx0pEIov68MFssHKAdUs72VYxX+xbHY9lgXz9kJU+Vz5jpq+Zypzol97tLLb pbbQrs0dHzrm04DTb6uzMwoyzItsOycKSuWNB1AeyxUCU6dI8SS7fCon8Erj0icmkv pWHlsagX84qQgWoLiYs3oqr/Dbm8c6GokJV49uVM= Date: Fri, 05 Nov 2021 13:43:56 -0700 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, pedrodemargomes@gmail.com, torvalds@linux-foundation.org, zhansayabagdaulet@gmail.com Subject: [patch 178/262] selftests: vm: add KSM huge pages merging time test Message-ID: <20211105204356.UebjPpi5N%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: C597130000AA X-Stat-Signature: 9iptejtjyb175xdkma6fope8ogcsj97h Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=hoq4aRx0; spf=pass (imf08.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145025-912517 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Pedro Demarchi Gomes Subject: selftests: vm: add KSM huge pages merging time test Add test case of KSM merging time using mostly huge pages Link: https://lkml.kernel.org/r/20211013044045.360251-1-pedrodemargomes@gmail.com Signed-off-by: Pedro Demarchi Gomes Cc: Zhansaya Bagdauletkyzy Signed-off-by: Andrew Morton --- tools/testing/selftests/vm/ksm_tests.c | 125 ++++++++++++++++++++++- 1 file changed, 124 insertions(+), 1 deletion(-) --- 
a/tools/testing/selftests/vm/ksm_tests.c~selftests-vm-add-ksm-huge-pages-merging-time-test +++ a/tools/testing/selftests/vm/ksm_tests.c @@ -5,6 +5,10 @@ #include #include #include +#include +#include +#include +#include #include "../kselftest.h" #include "../../../../include/vdso/time64.h" @@ -18,6 +22,15 @@ #define KSM_MERGE_ACROSS_NODES_DEFAULT true #define MB (1ul << 20) +#define PAGE_SHIFT 12 +#define HPAGE_SHIFT 21 + +#define PAGE_SIZE (1 << PAGE_SHIFT) +#define HPAGE_SIZE (1 << HPAGE_SHIFT) + +#define PAGEMAP_PRESENT(ent) (((ent) & (1ull << 63)) != 0) +#define PAGEMAP_PFN(ent) ((ent) & ((1ull << 55) - 1)) + struct ksm_sysfs { unsigned long max_page_sharing; unsigned long merge_across_nodes; @@ -34,6 +47,7 @@ enum ksm_test_name { CHECK_KSM_ZERO_PAGE_MERGE, CHECK_KSM_NUMA_MERGE, KSM_MERGE_TIME, + KSM_MERGE_TIME_HUGE_PAGES, KSM_COW_TIME }; @@ -100,6 +114,9 @@ static void print_help(void) " -P evaluate merging time and speed.\n" " For this test, the size of duplicated memory area (in MiB)\n" " must be provided using -s option\n" + " -H evaluate merging time and speed of area allocated mostly with huge pages\n" + " For this test, the size of duplicated memory area (in MiB)\n" + " must be provided using -s option\n" " -C evaluate the time required to break COW of merged pages.\n\n"); printf(" -a: specify the access protections of pages.\n" @@ -439,6 +456,101 @@ err_out: return KSFT_FAIL; } +int64_t allocate_transhuge(void *ptr, int pagemap_fd) +{ + uint64_t ent[2]; + + /* drop pmd */ + if (mmap(ptr, HPAGE_SIZE, PROT_READ | PROT_WRITE, + MAP_FIXED | MAP_ANONYMOUS | + MAP_NORESERVE | MAP_PRIVATE, -1, 0) != ptr) + errx(2, "mmap transhuge"); + + if (madvise(ptr, HPAGE_SIZE, MADV_HUGEPAGE)) + err(2, "MADV_HUGEPAGE"); + + /* allocate transparent huge page */ + *(volatile void **)ptr = ptr; + + if (pread(pagemap_fd, ent, sizeof(ent), + (uintptr_t)ptr >> (PAGE_SHIFT - 3)) != sizeof(ent)) + err(2, "read pagemap"); + + if (PAGEMAP_PRESENT(ent[0]) && PAGEMAP_PRESENT(ent[1]) 
&& + PAGEMAP_PFN(ent[0]) + 1 == PAGEMAP_PFN(ent[1]) && + !(PAGEMAP_PFN(ent[0]) & ((1 << (HPAGE_SHIFT - PAGE_SHIFT)) - 1))) + return PAGEMAP_PFN(ent[0]); + + return -1; +} + +static int ksm_merge_hugepages_time(int mapping, int prot, int timeout, size_t map_size) +{ + void *map_ptr, *map_ptr_orig; + struct timespec start_time, end_time; + unsigned long scan_time_ns; + int pagemap_fd, n_normal_pages, n_huge_pages; + + map_size *= MB; + size_t len = map_size; + + len -= len % HPAGE_SIZE; + map_ptr_orig = mmap(NULL, len + HPAGE_SIZE, PROT_READ | PROT_WRITE, + MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE, -1, 0); + map_ptr = map_ptr_orig + HPAGE_SIZE - (uintptr_t)map_ptr_orig % HPAGE_SIZE; + + if (map_ptr_orig == MAP_FAILED) + err(2, "initial mmap"); + + if (madvise(map_ptr, len + HPAGE_SIZE, MADV_HUGEPAGE)) + err(2, "MADV_HUGEPAGE"); + + pagemap_fd = open("/proc/self/pagemap", O_RDONLY); + if (pagemap_fd < 0) + err(2, "open pagemap"); + + n_normal_pages = 0; + n_huge_pages = 0; + for (void *p = map_ptr; p < map_ptr + len; p += HPAGE_SIZE) { + if (allocate_transhuge(p, pagemap_fd) < 0) + n_normal_pages++; + else + n_huge_pages++; + } + printf("Number of normal pages: %d\n", n_normal_pages); + printf("Number of huge pages: %d\n", n_huge_pages); + + memset(map_ptr, '*', len); + + if (clock_gettime(CLOCK_MONOTONIC_RAW, &start_time)) { + perror("clock_gettime"); + goto err_out; + } + if (ksm_merge_pages(map_ptr, map_size, start_time, timeout)) + goto err_out; + if (clock_gettime(CLOCK_MONOTONIC_RAW, &end_time)) { + perror("clock_gettime"); + goto err_out; + } + + scan_time_ns = (end_time.tv_sec - start_time.tv_sec) * NSEC_PER_SEC + + (end_time.tv_nsec - start_time.tv_nsec); + + printf("Total size: %lu MiB\n", map_size / MB); + printf("Total time: %ld.%09ld s\n", scan_time_ns / NSEC_PER_SEC, + scan_time_ns % NSEC_PER_SEC); + printf("Average speed: %.3f MiB/s\n", (map_size / MB) / + ((double)scan_time_ns / NSEC_PER_SEC)); + + munmap(map_ptr_orig, len + HPAGE_SIZE); + return 
KSFT_PASS; + +err_out: + printf("Not OK\n"); + munmap(map_ptr_orig, len + HPAGE_SIZE); + return KSFT_FAIL; +} + static int ksm_merge_time(int mapping, int prot, int timeout, size_t map_size) { void *map_ptr; @@ -564,7 +676,7 @@ int main(int argc, char *argv[]) bool merge_across_nodes = KSM_MERGE_ACROSS_NODES_DEFAULT; long size_MB = 0; - while ((opt = getopt(argc, argv, "ha:p:l:z:m:s:MUZNPC")) != -1) { + while ((opt = getopt(argc, argv, "ha:p:l:z:m:s:MUZNPCH")) != -1) { switch (opt) { case 'a': prot = str_to_prot(optarg); @@ -618,6 +730,9 @@ int main(int argc, char *argv[]) case 'P': test_name = KSM_MERGE_TIME; break; + case 'H': + test_name = KSM_MERGE_TIME_HUGE_PAGES; + break; case 'C': test_name = KSM_COW_TIME; break; @@ -670,6 +785,14 @@ int main(int argc, char *argv[]) ret = ksm_merge_time(MAP_PRIVATE | MAP_ANONYMOUS, prot, ksm_scan_limit_sec, size_MB); break; + case KSM_MERGE_TIME_HUGE_PAGES: + if (size_MB == 0) { + printf("Option '-s' is required.\n"); + return KSFT_FAIL; + } + ret = ksm_merge_hugepages_time(MAP_PRIVATE | MAP_ANONYMOUS, prot, + ksm_scan_limit_sec, size_MB); + break; case KSM_COW_TIME: ret = ksm_cow_time(MAP_PRIVATE | MAP_ANONYMOUS, prot, ksm_scan_limit_sec, page_size); From patchwork Fri Nov 5 20:43:59 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605757 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2E569C433EF for ; Fri, 5 Nov 2021 20:44:05 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D879161288 for ; Fri, 5 Nov 2021 20:44:04 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org D879161288 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) 
header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id DCEF69400AC; Fri, 5 Nov 2021 16:44:01 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D309D940093; Fri, 5 Nov 2021 16:44:01 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BAC9F9400AC; Fri, 5 Nov 2021 16:44:01 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0157.hostedemail.com [216.40.44.157]) by kanga.kvack.org (Postfix) with ESMTP id A89D9940093 for ; Fri, 5 Nov 2021 16:44:01 -0400 (EDT) Received: from smtpin02.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 6CD71779B8 for ; Fri, 5 Nov 2021 20:44:01 +0000 (UTC) X-FDA: 78776053482.02.6745165 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf02.hostedemail.com (Postfix) with ESMTP id CFB487001A05 for ; Fri, 5 Nov 2021 20:43:55 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 2BB3B61357; Fri, 5 Nov 2021 20:44:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145040; bh=ciS8U1tUGH4nWOs2ja52ZMQjueYLkoBeqBu+ryMcCuw=; h=Date:From:To:Subject:In-Reply-To:From; b=hbxLh4Kn+4JKLoUEJzYSm5uoPMHJchgXZqU5u0ePXSL2d8kOE4mTJtv0Epcqnox5Z w27lUA7Goqs9vFawNIFvh841r7t4dUw41tPH2/HbfJugusT0dNy7JLMqbVwwYI8kUD F2ytR68FDqa/WhSmDD/bUC2nhQIQldDxHKnRJ6ls= Date: Fri, 05 Nov 2021 13:43:59 -0700 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, liushixin2@huawei.com, mm-commits@vger.kernel.org, paulmck@kernel.org, torvalds@linux-foundation.org Subject: [patch 179/262] mm/vmstat: annotate data race for zone->free_area[order].nr_free Message-ID: <20211105204359.98NCTdblm%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail 
v14.8.16 Authentication-Results: imf02.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=hbxLh4Kn; dmarc=none; spf=pass (imf02.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: CFB487001A05 X-Stat-Signature: 8xjtobsd17kxxyms6d3n8gkw5k5g34f9 X-HE-Tag: 1636145035-863360 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000005, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Liu Shixin Subject: mm/vmstat: annotate data race for zone->free_area[order].nr_free KCSAN reports a data-race on v5.10 which also exists on mainline: ================================================================== BUG: KCSAN: data-race in extfrag_for_order+0x33/0x2d0 race at unknown origin, with read to 0xffff9ee9bfffab48 of 8 bytes by task 34 on cpu 1: extfrag_for_order+0x33/0x2d0 kcompactd+0x5f0/0xce0 kthread+0x1f9/0x220 ret_from_fork+0x22/0x30 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 34 Comm: kcompactd0 Not tainted 5.10.0+ #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 ================================================================== Access to zone->free_area[order].nr_free in extfrag_for_order()/frag_show_print() is lockless. That's intentional and the stats are a rough estimate anyway. Annotate them with data_race(). [liushixin2@huawei.com: add comments] Link: https://lkml.kernel.org/r/20210918084655.2696522-1-liushixin2@huawei.com Link: https://lkml.kernel.org/r/20210908015606.3999871-1-liushixin2@huawei.com Signed-off-by: Liu Shixin Cc: "Paul E . 
McKenney" Signed-off-by: Andrew Morton --- mm/vmstat.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) --- a/mm/vmstat.c~mm-vmstat-annotate-data-race-for-zone-free_areanr_free +++ a/mm/vmstat.c @@ -1070,8 +1070,13 @@ static void fill_contig_page_info(struct for (order = 0; order < MAX_ORDER; order++) { unsigned long blocks; - /* Count number of free blocks */ - blocks = zone->free_area[order].nr_free; + /* + * Count number of free blocks. + * + * Access to nr_free is lockless as nr_free is used only for + * diagnostic purposes. Use data_race to avoid KCSAN warning. + */ + blocks = data_race(zone->free_area[order].nr_free); info->free_blocks_total += blocks; /* Count free base pages */ @@ -1446,7 +1451,11 @@ static void frag_show_print(struct seq_f seq_printf(m, "Node %d, zone %8s ", pgdat->node_id, zone->name); for (order = 0; order < MAX_ORDER; ++order) - seq_printf(m, "%6lu ", zone->free_area[order].nr_free); + /* + * Access to nr_free is lockless as nr_free is used only for + * printing purposes. Use data_race to avoid KCSAN warning. 
+ */ + seq_printf(m, "%6lu ", data_race(zone->free_area[order].nr_free)); seq_putc(m, '\n'); } From patchwork Fri Nov 5 20:44:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605759 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 70709C433F5 for ; Fri, 5 Nov 2021 20:44:07 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 21F7F6126A for ; Fri, 5 Nov 2021 20:44:07 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 21F7F6126A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 9B2429400AD; Fri, 5 Nov 2021 16:44:04 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 95ECA940093; Fri, 5 Nov 2021 16:44:04 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8773B9400AD; Fri, 5 Nov 2021 16:44:04 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0193.hostedemail.com [216.40.44.193]) by kanga.kvack.org (Postfix) with ESMTP id 71890940093 for ; Fri, 5 Nov 2021 16:44:04 -0400 (EDT) Received: from smtpin23.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 386EA182587FA for ; Fri, 5 Nov 2021 20:44:04 +0000 (UTC) X-FDA: 78776053608.23.0008EE2 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf23.hostedemail.com (Postfix) with ESMTP id 38C5590000B4 for ; Fri, 5 Nov 2021 20:43:51 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 
14A836126A; Fri, 5 Nov 2021 20:44:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145043; bh=o41G0KB57z8NSrtBDoNXTpJD2BS1KUFIOHIwDviooeg=; h=Date:From:To:Subject:In-Reply-To:From; b=0NjwxFjRlIC58kkiMgaZmI3kj9325f9q458laeKNJWXzRpW+0ifJvNrxTZ4PvWpe4 oHdjsUMzZ+afIPA9ZT+0Mo+aqt4M6E1JMrun3y43TOriJx3mUyt/pi77fgYLWmA5rM adEMzbuVKIudTM5yjiXdMMr44VIVtz9gVUpbtrdQ= Date: Fri, 05 Nov 2021 13:44:02 -0700 From: Andrew Morton To: akpm@linux-foundation.org, linf@wangsu.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 180/262] mm: vmstat.c: make extfrag_index show more pretty Message-ID: <20211105204402.BiH6mxuKf%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 38C5590000B4 X-Stat-Signature: pgk3fjjwinigjc3cphaf5dacastwigzs Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=0NjwxFjR; dmarc=none; spf=pass (imf23.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636145031-294584 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000008, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Lin Feng Subject: mm: vmstat.c: make extfrag_index show more pretty fragmentation_index may return -1000, and the corresponding formatted value shown by seq_printf carries a negative sign, whereas positive formatted values carry no sign, so the output becomes unaligned.
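The width problem described above can be reproduced outside the kernel. The following is a minimal userspace sketch, not kernel code: the helper name format_frag_index() and its "padded" flag are hypothetical and exist only for this illustration, and snprintf() stands in for seq_printf(). It assumes the index is either -1000 or a non-negative value scaled by 1000.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Userspace illustration only. format_frag_index() is a hypothetical helper;
 * the kernel prints the value directly with seq_printf().
 *
 * padded == 0 uses the old "%d.%03d" format,
 * padded != 0 uses the new "%2d.%03d" format with a minimum field width of 2.
 *
 * Returns the number of characters written (as snprintf() does).
 */
int format_frag_index(char *buf, size_t len, int index, int padded)
{
	if (padded)
		return snprintf(buf, len, "%2d.%03d", index / 1000, index % 1000);

	return snprintf(buf, len, "%d.%03d", index / 1000, index % 1000);
}
```

With the old format, an index of -1000 yields the six-character string "-1.000" while an index of 931 yields the five-character string "0.931", so columns drift. With the minimum field width of 2, 931 becomes " 0.931" and both values occupy six characters.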
before: Node 0, zone DMA -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 Node 0, zone DMA32 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 Node 0, zone Normal -1.000 -1.000 -1.000 -1.000 0.931 0.966 0.983 0.992 0.996 0.998 0.999 after this patch: Node 0, zone DMA -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 Node 0, zone DMA32 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 Node 0, zone Normal -1.000 -1.000 -1.000 -1.000 0.931 0.966 0.983 0.992 0.996 0.998 0.999 Link: https://lkml.kernel.org/r/20211019103241.134797-1-linf@wangsu.com Signed-off-by: Lin Feng Signed-off-by: Andrew Morton --- mm/vmstat.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/vmstat.c~mm-vmstatc-make-extfrag_index-show-more-pretty +++ a/mm/vmstat.c @@ -2191,7 +2191,7 @@ static void extfrag_show_print(struct se for (order = 0; order < MAX_ORDER; ++order) { fill_contig_page_info(zone, order, &info); index = __fragmentation_index(order, &info); - seq_printf(m, "%d.%03d ", index / 1000, index % 1000); + seq_printf(m, "%2d.%03d ", index / 1000, index % 1000); } seq_putc(m, '\n'); From patchwork Fri Nov 5 20:44:05 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605761 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 83028C433F5 for ; Fri, 5 Nov 2021 20:44:09 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 3DC076136A for ; Fri, 5 Nov 2021 20:44:09 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 3DC076136A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) 
header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id EBA8C9400AE; Fri, 5 Nov 2021 16:44:07 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E6855940093; Fri, 5 Nov 2021 16:44:07 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CB8F79400AE; Fri, 5 Nov 2021 16:44:07 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0204.hostedemail.com [216.40.44.204]) by kanga.kvack.org (Postfix) with ESMTP id BFAAA940093 for ; Fri, 5 Nov 2021 16:44:07 -0400 (EDT) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 8919E1836C629 for ; Fri, 5 Nov 2021 20:44:07 +0000 (UTC) X-FDA: 78776053734.11.4270258 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf07.hostedemail.com (Postfix) with ESMTP id D202010000BA for ; Fri, 5 Nov 2021 20:44:06 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id E6EAA61288; Fri, 5 Nov 2021 20:44:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145046; bh=t6FLKMCloq5z7jo7GCHVsIdIemkBh7Nvclx1jJpApUw=; h=Date:From:To:Subject:In-Reply-To:From; b=mvM7+VObqP9KaUxOW7ULUSXaABczcqrJOHgLo8J2kFTGbYHDlwDZMu4FrrI2eW4iT 3kp6RkQX/0ro2saed6RI4FCk62RLMXKqih1IKdhdCcY24mEApsZXWQlyEuY0UIivfc c4zDutDS1etYvZc27yp8ZNRfWelrWIFsvj6c9P64= Date: Fri, 05 Nov 2021 13:44:05 -0700 From: Andrew Morton To: akpm@linux-foundation.org, david@redhat.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, skhan@linuxfoundation.org, torvalds@linux-foundation.org Subject: [patch 181/262] selftests/vm: make MADV_POPULATE_(READ|WRITE) use in-tree headers Message-ID: <20211105204405.ezVAWOhIi%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: 
s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: D202010000BA X-Stat-Signature: o7mhprxtb3cdb6huc7k8mqzcfwekcxs7 Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=mvM7+VOb; spf=pass (imf07.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145046-213447 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: David Hildenbrand Subject: selftests/vm: make MADV_POPULATE_(READ|WRITE) use in-tree headers The madv_populate selftest currently builds with a warning when the local installed headers (via the distribution) don't include MADV_POPULATE_READ and MADV_POPULATE_WRITE. The warning is correct, because the test cannot locate the necessary header. Reason is that the in-tree installed headers (usr/include) have a "linux" instead of a "sys" subdirectory. Including "linux/mman.h" instead of "sys/mman.h" doesn't work (e.g., mmap() and madvise() are not defined that way). The only thing that seems to work is including "linux/mman.h" in addition to "sys/mman.h". We can get rid of our availability check and simplify. Link: https://lkml.kernel.org/r/20211015165758.41374-1-david@redhat.com Signed-off-by: David Hildenbrand Reported-by: Shuah Khan Signed-off-by: Andrew Morton --- tools/testing/selftests/vm/madv_populate.c | 15 +-------------- 1 file changed, 1 insertion(+), 14 deletions(-) --- a/tools/testing/selftests/vm/madv_populate.c~selftests-vm-make-madv_populate_readwrite-use-in-tree-headers +++ a/tools/testing/selftests/vm/madv_populate.c @@ -14,12 +14,11 @@ #include #include #include +#include #include #include "../kselftest.h" -#if defined(MADV_POPULATE_READ) && defined(MADV_POPULATE_WRITE) - /* * For now, we're using 2 MiB of private anonymous memory for all tests. 
*/ @@ -328,15 +327,3 @@ int main(int argc, char **argv) err, ksft_test_num()); return ksft_exit_pass(); } - -#else /* defined(MADV_POPULATE_READ) && defined(MADV_POPULATE_WRITE) */ - -#warning "missing MADV_POPULATE_READ or MADV_POPULATE_WRITE definition" - -int main(int argc, char **argv) -{ - ksft_print_header(); - ksft_exit_skip("MADV_POPULATE_READ or MADV_POPULATE_WRITE not defined\n"); -} - -#endif /* defined(MADV_POPULATE_READ) && defined(MADV_POPULATE_WRITE) */ From patchwork Fri Nov 5 20:44:08 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605763 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6B1A6C433EF for ; Fri, 5 Nov 2021 20:44:11 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 345446126A for ; Fri, 5 Nov 2021 20:44:11 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 345446126A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 827DE9400AF; Fri, 5 Nov 2021 16:44:10 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 7B112940093; Fri, 5 Nov 2021 16:44:10 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6786F9400AF; Fri, 5 Nov 2021 16:44:10 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0234.hostedemail.com [216.40.44.234]) by kanga.kvack.org (Postfix) with ESMTP id 511BD940093 for ; Fri, 5 Nov 2021 16:44:10 -0400 (EDT) Received: from smtpin01.hostedemail.com (10.5.19.251.rfc1918.com 
[10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 108D4184906A0 for ; Fri, 5 Nov 2021 20:44:10 +0000 (UTC) X-FDA: 78776053902.01.3F2D46F Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf31.hostedemail.com (Postfix) with ESMTP id 5E6CF104AAD7 for ; Fri, 5 Nov 2021 20:44:01 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id CE0A96136F; Fri, 5 Nov 2021 20:44:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145049; bh=97KRsmTUQyuBTEoOSPs1/gYmy+83Xx9BulB9F8xEZGI=; h=Date:From:To:Subject:In-Reply-To:From; b=yQ3QgPXhl4Lv6Om0jUp7ZipHY9jOTmyDVk44pm+Zli0H+y2LzQuac27bL2Lx+DNQF 8o8xHEXWaxe+bE5TZ45MPXFRqLozm/NL6BTY4bjRl70JvFwEx7b4BXkyL4zBf4L/1a 2hcbDYyBvkbjcWgwa338+ZUFiwNKqsTxM3mVfa+U= Date: Fri, 05 Nov 2021 13:44:08 -0700 From: Andrew Morton To: akpm@linux-foundation.org, david@redhat.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, songmuchun@bytedance.com, tangyizhou@huawei.com, torvalds@linux-foundation.org Subject: [patch 182/262] mm/memory_hotplug: add static qualifier for online_policy_to_str() Message-ID: <20211105204408.3LUaERGDf%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 5E6CF104AAD7 X-Stat-Signature: ri55f9emsxbt8obxf5rcy5ynz8rqqca8 Authentication-Results: imf31.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=yQ3QgPXh; dmarc=none; spf=pass (imf31.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636145041-762504 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Tang Yizhou Subject: mm/memory_hotplug: add static qualifier for online_policy_to_str() online_policy_to_str is only 
used in memory_hotplug.c and should be defined as static. Link: https://lkml.kernel.org/r/20210913024534.26161-1-tangyizhou@huawei.com Signed-off-by: Tang Yizhou Reviewed-by: Muchun Song Reviewed-by: David Hildenbrand Signed-off-by: Andrew Morton --- mm/memory_hotplug.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/memory_hotplug.c~mm-memory_hotplug-add-static-qualifier-for-online_policy_to_str +++ a/mm/memory_hotplug.c @@ -57,7 +57,7 @@ enum { ONLINE_POLICY_AUTO_MOVABLE, }; -const char *online_policy_to_str[] = { +static const char * const online_policy_to_str[] = { [ONLINE_POLICY_CONTIG_ZONES] = "contig-zones", [ONLINE_POLICY_AUTO_MOVABLE] = "auto-movable", }; From patchwork Fri Nov 5 20:44:11 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605765 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D1C59C433EF for ; Fri, 5 Nov 2021 20:44:15 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9331E61288 for ; Fri, 5 Nov 2021 20:44:15 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 9331E61288 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id B5B499400B1; Fri, 5 Nov 2021 16:44:13 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B0AB9940093; Fri, 5 Nov 2021 16:44:13 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A22AB9400B1; Fri, 5 Nov 2021 16:44:13 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com 
(smtprelay0074.hostedemail.com [216.40.44.74]) by kanga.kvack.org (Postfix) with ESMTP id 93F0D940093 for ; Fri, 5 Nov 2021 16:44:13 -0400 (EDT) Received: from smtpin17.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 637D21849A3A0 for ; Fri, 5 Nov 2021 20:44:13 +0000 (UTC) X-FDA: 78776053986.17.1B80BB9 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf10.hostedemail.com (Postfix) with ESMTP id 648E96001995 for ; Fri, 5 Nov 2021 20:44:01 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id CB00861288; Fri, 5 Nov 2021 20:44:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145052; bh=Cg3gAnv0Nq31DQ40uPeNA/2Iv5l670RIiuarOxllnus=; h=Date:From:To:Subject:In-Reply-To:From; b=b4YYe7ySGVcnu928gyaOfgjSsoahgcClhpm8Z7A8ripx6tiKBcYpe5bRM0NycDzhN WiVw4zYPcu1wKvHWAkn1T4cmBB2Ykhx5MOF3mFoF+ABkgb/zIQlnsojeXBTU6hfGSx kMLZKeGiZUb6/hQDmEeXanksC9W46F+QPZkN7xzY= Date: Fri, 05 Nov 2021 13:44:11 -0700 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, david@redhat.com, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, osalvador@suse.de, rppt@linux.ibm.com, torvalds@linux-foundation.org Subject: [patch 183/262] memory-hotplug.rst: fix two instances of "movablecore" that should be "movable_node" Message-ID: <20211105204411.-wGjo2Oc-%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 648E96001995 X-Stat-Signature: s9rwgecf3byni9ruaxamgng9yyr137nf Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=b4YYe7yS; dmarc=none; spf=pass (imf10.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636145041-668349 X-Bogosity: Ham, tests=bogofilter, 
spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: David Hildenbrand Subject: memory-hotplug.rst: fix two instances of "movablecore" that should be "movable_node" Patch series "memory-hotplug.rst: document the "auto-movable" online policy". Now that the memory-hotplug.rst overhaul is upstream, add proper documentation for the "auto-movable" online policy, documenting all new toggles and options. Along the way, include two fixes for the original overhaul. This patch (of 3): We really want to refer to the "movable_node" kernel command line parameter here. Link: https://lkml.kernel.org/r/20210930144117.23641-2-david@redhat.com Fixes: ac3332c44767 ("memory-hotplug.rst: complete admin-guide overhaul") Signed-off-by: David Hildenbrand Acked-by: Mike Rapoport Cc: Jonathan Corbet Cc: Michal Hocko Cc: Oscar Salvador Signed-off-by: Andrew Morton --- Documentation/admin-guide/mm/memory-hotplug.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/Documentation/admin-guide/mm/memory-hotplug.rst~memory-hotplugrst-fix-two-instances-of-movablecore-that-should-be-movable_node +++ a/Documentation/admin-guide/mm/memory-hotplug.rst @@ -166,7 +166,7 @@ Or alternatively:: % echo 1 > /sys/devices/system/memory/memoryXXX/online The kernel will select the target zone automatically, usually defaulting to -``ZONE_NORMAL`` unless ``movablecore=1`` has been specified on the kernel +``ZONE_NORMAL`` unless ``movable_node`` has been specified on the kernel command line or if the memory block would intersect the ZONE_MOVABLE already. One can explicitly request to associate an offline memory block with @@ -393,7 +393,7 @@ command line parameters are relevant: ======================== ======================================================= ``memhp_default_state`` configure auto-onlining by essentially setting ``/sys/devices/system/memory/auto_online_blocks``. -``movablecore`` configure automatic zone selection of the kernel.
When +``movable_node`` configure automatic zone selection in the kernel. When set, the kernel will default to ZONE_MOVABLE, unless other zones can be kept contiguous. ======================== ======================================================= From patchwork Fri Nov 5 20:44:14 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605769 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0A101C433EF for ; Fri, 5 Nov 2021 20:44:18 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C462E61288 for ; Fri, 5 Nov 2021 20:44:17 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org C462E61288 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id C4BB59400B2; Fri, 5 Nov 2021 16:44:16 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B8298940093; Fri, 5 Nov 2021 16:44:16 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A4ABA9400B2; Fri, 5 Nov 2021 16:44:16 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0112.hostedemail.com [216.40.44.112]) by kanga.kvack.org (Postfix) with ESMTP id 8CDA3940093 for ; Fri, 5 Nov 2021 16:44:16 -0400 (EDT) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 4DA1E184906A0 for ; Fri, 5 Nov 2021 20:44:16 +0000 (UTC) X-FDA: 78776054112.09.E4F4993 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by 
imf28.hostedemail.com (Postfix) with ESMTP id DEEB390000AC for ; Fri, 5 Nov 2021 20:44:15 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id DD5656126A; Fri, 5 Nov 2021 20:44:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145055; bh=7ExTBo8yYOKPEJWgDS9yxNslGfYJtk79DkBlhAcDYfg=; h=Date:From:To:Subject:In-Reply-To:From; b=WDHw0mpk8ZZH9pJS9z1D1XqcLkmLvvfhmoxsy3SVWrUQt7XxtTML/MquqaWEDU80U fqxSlKcMM7CSwrC2weIq3EGKowrNCVwg3LlPuN1K3EuFHU9xvIn8F2YfQ5tvJlabJN Gx16elWg5db41vBfkYyD+NqRmr4efCBXEF2t8JRU= Date: Fri, 05 Nov 2021 13:44:14 -0700 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, david@redhat.com, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, osalvador@suse.de, rppt@linux.ibm.com, torvalds@linux-foundation.org Subject: [patch 184/262] memory-hotplug.rst: fix wrong /sys/module/memory_hotplug/parameters/ path Message-ID: <20211105204414.4-fXcG8Jq%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=WDHw0mpk; dmarc=none; spf=pass (imf28.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: DEEB390000AC X-Stat-Signature: uaw1jau5h1t9k44fuh9yubyzqw9b3u43 X-HE-Tag: 1636145055-318591 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: David Hildenbrand Subject: memory-hotplug.rst: fix wrong /sys/module/memory_hotplug/parameters/ path We accidentally added a superfluous "s".
Link: https://lkml.kernel.org/r/20210930144117.23641-3-david@redhat.com Fixes: ac3332c44767 ("memory-hotplug.rst: complete admin-guide overhaul") Signed-off-by: David Hildenbrand Acked-by: Mike Rapoport Cc: Jonathan Corbet Cc: Michal Hocko Cc: Oscar Salvador Signed-off-by: Andrew Morton --- Documentation/admin-guide/mm/memory-hotplug.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/Documentation/admin-guide/mm/memory-hotplug.rst~memory-hotplugrst-fix-wrong-sys-module-memory_hotplug-parameters-path +++ a/Documentation/admin-guide/mm/memory-hotplug.rst @@ -410,7 +410,7 @@ them with ``memory_hotplug.`` such as:: and they can be observed (and some even modified at runtime) via:: - /sys/modules/memory_hotplug/parameters/ + /sys/module/memory_hotplug/parameters/ The following module parameters are currently defined: From patchwork Fri Nov 5 20:44:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605771 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 887DDC433FE for ; Fri, 5 Nov 2021 20:44:20 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 38A0A6126A for ; Fri, 5 Nov 2021 20:44:20 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 38A0A6126A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id C9AE7940093; Fri, 5 Nov 2021 16:44:19 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id BA7789400B3; Fri, 5 Nov 2021 16:44:19 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org 
(Postfix, from userid 63042) id ABDD4940093; Fri, 5 Nov 2021 16:44:19 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0207.hostedemail.com [216.40.44.207]) by kanga.kvack.org (Postfix) with ESMTP id 9A4BA9400B3 for ; Fri, 5 Nov 2021 16:44:19 -0400 (EDT) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 5EDA3181CBC0E for ; Fri, 5 Nov 2021 20:44:19 +0000 (UTC) X-FDA: 78776054322.05.4F8D4EC Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf12.hostedemail.com (Postfix) with ESMTP id EBF5710000B2 for ; Fri, 5 Nov 2021 20:44:18 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id E5B3761357; Fri, 5 Nov 2021 20:44:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145058; bh=+0n8sn5d6g8O902G+ly3Hfra9pP40vyitzbWnMmfvOU=; h=Date:From:To:Subject:In-Reply-To:From; b=OfjL5e+5dBdq9iCg/RhCI0pCBrK+G88bL/Ft4998aP57zlU2sHpS5+ZTFBEmwog8G flTtWSRFiall+F6/GBzIlEEcfWZtqgoYfT6puJ3SnXqHxHT94Bm8cNxea2o6h5HH+c E/v062EnVoM4N+z/gvgH/83XbPz3yDuP2qepTgdo= Date: Fri, 05 Nov 2021 13:44:17 -0700 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, david@redhat.com, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, osalvador@suse.de, rppt@linux.ibm.com, torvalds@linux-foundation.org Subject: [patch 185/262] memory-hotplug.rst: document the "auto-movable" online policy Message-ID: <20211105204417.KQ4istcQ9%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf12.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=OfjL5e+5; dmarc=none; spf=pass (imf12.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 
X-Rspamd-Queue-Id: EBF5710000B2 X-Stat-Signature: wk541eknkxj3dzt5ibky7misnyqeoo8b X-HE-Tag: 1636145058-4944 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: David Hildenbrand Subject: memory-hotplug.rst: document the "auto-movable" online policy In commit e83a437faa62 ("mm/memory_hotplug: introduce "auto-movable" online policy") we introduced a new memory online policy to automatically select a zone for memory blocks to be onlined. We added a way to set the active online policy and tunables for the auto-movable online policy. In follow-up commits we tweaked the "auto-movable" policy to also consider memory device details when selecting zones for memory blocks to be onlined. Let's document the new toggles and how the two online policies we have work. [david@redhat.com: updates] Link: https://lkml.kernel.org/r/20211011082058.6076-4-david@redhat.com Link: https://lkml.kernel.org/r/20210930144117.23641-4-david@redhat.com Signed-off-by: David Hildenbrand Acked-by: Mike Rapoport Cc: Jonathan Corbet Cc: Michal Hocko Cc: Oscar Salvador Signed-off-by: Andrew Morton --- Documentation/admin-guide/mm/memory-hotplug.rst | 141 ++++++++++++-- 1 file changed, 121 insertions(+), 20 deletions(-) --- a/Documentation/admin-guide/mm/memory-hotplug.rst~memory-hotplugrst-document-the-auto-movable-online-policy +++ a/Documentation/admin-guide/mm/memory-hotplug.rst @@ -165,9 +165,8 @@ Or alternatively:: % echo 1 > /sys/devices/system/memory/memoryXXX/online -The kernel will select the target zone automatically, usually defaulting to -``ZONE_NORMAL`` unless ``movable_node`` has been specified on the kernel -command line or if the memory block would intersect the ZONE_MOVABLE already. +The kernel will select the target zone automatically, depending on the +configured ``online_policy``. 
One can explicitly request to associate an offline memory block with ZONE_MOVABLE by:: @@ -198,6 +197,9 @@ Auto-onlining can be enabled by writing % echo online > /sys/devices/system/memory/auto_online_blocks +Similarly to manual onlining, with ``online`` the kernel will select the +target zone automatically, depending on the configured ``online_policy``. + Modifying the auto-online behavior will only affect all subsequently added memory blocks only. @@ -393,11 +395,16 @@ command line parameters are relevant: ======================== ======================================================= ``memhp_default_state`` configure auto-onlining by essentially setting ``/sys/devices/system/memory/auto_online_blocks``. -``movable_node`` configure automatic zone selection in the kernel. When - set, the kernel will default to ZONE_MOVABLE, unless - other zones can be kept contiguous. +``movable_node`` configure automatic zone selection in the kernel when + using the ``contig-zones`` online policy. When + set, the kernel will default to ZONE_MOVABLE when + onlining a memory block, unless other zones can be kept + contiguous. ======================== ======================================================= +See Documentation/admin-guide/kernel-parameters.txt for a more generic +description of these command line parameters. + Module Parameters ------------------ @@ -414,20 +421,114 @@ and they can be observed (and some even The following module parameters are currently defined: -======================== ======================================================= -``memmap_on_memory`` read-write: Allocate memory for the memmap from the - added memory block itself. Even if enabled, actual - support depends on various other system properties and - should only be regarded as a hint whether the behavior - would be desired. 
- - While allocating the memmap from the memory block - itself makes memory hotplug less likely to fail and - keeps the memmap on the same NUMA node in any case, it - can fragment physical memory in a way that huge pages - in bigger granularity cannot be formed on hotplugged - memory. -======================== ======================================================= +================================ =============================================== +``memmap_on_memory`` read-write: Allocate memory for the memmap from + the added memory block itself. Even if enabled, + actual support depends on various other system + properties and should only be regarded as a + hint whether the behavior would be desired. + + While allocating the memmap from the memory + block itself makes memory hotplug less likely + to fail and keeps the memmap on the same NUMA + node in any case, it can fragment physical + memory in a way that huge pages in bigger + granularity cannot be formed on hotplugged + memory. +``online_policy`` read-write: Set the basic policy used for + automatic zone selection when onlining memory + blocks without specifying a target zone. + ``contig-zones`` has been the kernel default + before this parameter was added. After an + online policy was configured and memory was + online, the policy should not be changed + anymore. + + When set to ``contig-zones``, the kernel will + try keeping zones contiguous. If a memory block + intersects multiple zones or no zone, the + behavior depends on the ``movable_node`` kernel + command line parameter: default to ZONE_MOVABLE + if set, default to the applicable kernel zone + (usually ZONE_NORMAL) if not set. + + When set to ``auto-movable``, the kernel will + try onlining memory blocks to ZONE_MOVABLE if + possible according to the configuration and + memory device details. 
With this policy, one + can avoid zone imbalances when eventually + hotplugging a lot of memory later and still + wanting to be able to hotunplug as much as + possible reliably, very desirable in + virtualized environments. This policy ignores + the ``movable_node`` kernel command line + parameter and isn't really applicable in + environments that require it (e.g., bare metal + with hotunpluggable nodes) where hotplugged + memory might be exposed via the + firmware-provided memory map early during boot + to the system instead of getting detected, + added and onlined later during boot (such as + done by virtio-mem or by some hypervisors + implementing emulated DIMMs). As one example, a + hotplugged DIMM will be onlined either + completely to ZONE_MOVABLE or completely to + ZONE_NORMAL, not a mixture. + As another example, as many memory blocks + belonging to a virtio-mem device will be + onlined to ZONE_MOVABLE as possible, + special-casing units of memory blocks that can + only get hotunplugged together. *This policy + does not protect from setups that are + problematic with ZONE_MOVABLE and does not + change the zone of memory blocks dynamically + after they were onlined.* +``auto_movable_ratio`` read-write: Set the maximum MOVABLE:KERNEL + memory ratio in % for the ``auto-movable`` + online policy. Whether the ratio applies only + for the system across all NUMA nodes or also + per NUMA nodes depends on the + ``auto_movable_numa_aware`` configuration. + + All accounting is based on present memory pages + in the zones combined with accounting per + memory device. Memory dedicated to the CMA + allocator is accounted as MOVABLE, although + residing on one of the kernel zones. The + possible ratio depends on the actual workload. + The kernel default is "301" %, for example, + allowing for hotplugging 24 GiB to a 8 GiB VM + and automatically onlining all hotplugged + memory to ZONE_MOVABLE in many setups. 
The + additional 1% deals with some pages being not + present, for example, because of some firmware + allocations. + + Note that ZONE_NORMAL memory provided by one + memory device does not allow for more + ZONE_MOVABLE memory for a different memory + device. As one example, onlining memory of a + hotplugged DIMM to ZONE_NORMAL will not allow + for another hotplugged DIMM to get onlined to + ZONE_MOVABLE automatically. In contrast, memory + hotplugged by a virtio-mem device that got + onlined to ZONE_NORMAL will allow for more + ZONE_MOVABLE memory within *the same* + virtio-mem device. +``auto_movable_numa_aware`` read-write: Configure whether the + ``auto_movable_ratio`` in the ``auto-movable`` + online policy also applies per NUMA + node in addition to the whole system across all + NUMA nodes. The kernel default is "Y". + + Disabling NUMA awareness can be helpful when + dealing with NUMA nodes that should be + completely hotunpluggable, onlining the memory + completely to ZONE_MOVABLE automatically if + possible. + + Parameter availability depends on CONFIG_NUMA. 
+================================ =============================================== ZONE_MOVABLE ============ From patchwork Fri Nov 5 20:44:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605773 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 10120C433FE for ; Fri, 5 Nov 2021 20:44:24 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id BB1276135A for ; Fri, 5 Nov 2021 20:44:23 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org BB1276135A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 5B5FA9400B4; Fri, 5 Nov 2021 16:44:23 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 565C49400B3; Fri, 5 Nov 2021 16:44:23 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 456089400B4; Fri, 5 Nov 2021 16:44:23 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0100.hostedemail.com [216.40.44.100]) by kanga.kvack.org (Postfix) with ESMTP id 396689400B3 for ; Fri, 5 Nov 2021 16:44:23 -0400 (EDT) Received: from smtpin27.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id EF23E1802EAD7 for ; Fri, 5 Nov 2021 20:44:22 +0000 (UTC) X-FDA: 78776054364.27.D1BC8A6 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf20.hostedemail.com (Postfix) with ESMTP id 62F59D0000B5 for ; Fri, 5 Nov 2021 20:44:13 +0000 (UTC) Received: by mail.kernel.org (Postfix) with 
ESMTPSA id 3E17F6126A; Fri, 5 Nov 2021 20:44:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145062; bh=kEqxJVXpyBcCpQjpy6DUOkt3Fq/ds2a/WHBz96lmeIE=; h=Date:From:To:Subject:In-Reply-To:From; b=WSSsqjpJUvpmo2bWGIJJJPQfZTX5f3HdBDICg4ZD6hYvttgjv9VlBTbAijqdN1TUk fIK/RWV8w+ba1mgi7i3pBKeSMfUxwLBDC4UAFnLwMMUlfn72MkSEs2MM6+sSuEAton cYH7EuElhlUEibv/ktS3dllm2dAtR8tSvZQYHEDA= Date: Fri, 05 Nov 2021 13:44:20 -0700 From: Andrew Morton To: akpm@linux-foundation.org, alexs@kernel.org, benh@kernel.crashing.org, bp@alien8.de, corbet@lwn.net, dave.hansen@linux.intel.com, david@redhat.com, gregkh@linuxfoundation.org, hpa@zytor.com, jasowang@redhat.com, linux-mm@kvack.org, luto@kernel.org, mhocko@suse.com, mingo@redhat.com, mm-commits@vger.kernel.org, mpe@ellerman.id.au, mst@redhat.com, osalvador@suse.de, paulus@samba.org, peterz@infradead.org, rafael@kernel.org, rppt@kernel.org, shuah@kernel.org, tglx@linutronix.de, torvalds@linux-foundation.org Subject: [patch 186/262] mm/memory_hotplug: remove CONFIG_X86_64_ACPI_NUMA dependency from CONFIG_MEMORY_HOTPLUG Message-ID: <20211105204420.EwguDZQuI%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 62F59D0000B5 X-Stat-Signature: yohrwdiqw6frobyo7nkjbdbxgfzyrhmt Authentication-Results: imf20.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=WSSsqjpJ; dmarc=none; spf=pass (imf20.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636145053-552919 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: David Hildenbrand Subject: mm/memory_hotplug: remove CONFIG_X86_64_ACPI_NUMA dependency from CONFIG_MEMORY_HOTPLUG Patch series 
"mm/memory_hotplug: Kconfig and 32 bit cleanups". Some cleanups around CONFIG_MEMORY_HOTPLUG, including removing 32 bit leftovers of memory hotplug support. This patch (of 6): SPARSEMEM is the only possible memory model for x86-64, FLATMEM is not possible: config ARCH_FLATMEM_ENABLE def_bool y depends on X86_32 && !NUMA And X86_64_ACPI_NUMA (obviously) only supports x86-64: config X86_64_ACPI_NUMA def_bool y depends on X86_64 && NUMA && ACPI && PCI Let's just remove the CONFIG_X86_64_ACPI_NUMA dependency, as it does no longer make sense. Link: https://lkml.kernel.org/r/20210929143600.49379-2-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Oscar Salvador Cc: Jonathan Corbet Cc: Alex Shi Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: "H. Peter Anvin" Cc: Dave Hansen Cc: Andy Lutomirski Cc: Peter Zijlstra Cc: Greg Kroah-Hartman Cc: "Rafael J. Wysocki" Cc: "Michael S. Tsirkin" Cc: Jason Wang Cc: Shuah Khan Cc: Michal Hocko Cc: Mike Rapoport Signed-off-by: Andrew Morton --- mm/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/Kconfig~mm-memory_hotplug-remove-config_x86_64_acpi_numa-dependency-from-config_memory_hotplug +++ a/mm/Kconfig @@ -123,7 +123,7 @@ config ARCH_ENABLE_MEMORY_HOTPLUG config MEMORY_HOTPLUG bool "Allow for memory hot-add" select MEMORY_ISOLATION - depends on SPARSEMEM || X86_64_ACPI_NUMA + depends on SPARSEMEM depends on ARCH_ENABLE_MEMORY_HOTPLUG depends on 64BIT || BROKEN select NUMA_KEEP_MEMINFO if NUMA From patchwork Fri Nov 5 20:44:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605775 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9348BC433FE for 
; Fri, 5 Nov 2021 20:44:28 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 441F96128E for ; Fri, 5 Nov 2021 20:44:28 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 441F96128E Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id D52E39400B5; Fri, 5 Nov 2021 16:44:27 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id C43589400B3; Fri, 5 Nov 2021 16:44:27 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B0C8B9400B5; Fri, 5 Nov 2021 16:44:27 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0075.hostedemail.com [216.40.44.75]) by kanga.kvack.org (Postfix) with ESMTP id 1B5D29400B3 for ; Fri, 5 Nov 2021 16:44:27 -0400 (EDT) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id CDD1275C8F for ; Fri, 5 Nov 2021 20:44:26 +0000 (UTC) X-FDA: 78776054532.21.4D7EA10 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf24.hostedemail.com (Postfix) with ESMTP id 5DF54B0000A6 for ; Fri, 5 Nov 2021 20:44:26 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id DF21761288; Fri, 5 Nov 2021 20:44:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145065; bh=kxc7zV5BC7FsapY56+bOPTUUalhr1hnNjCKvMlImxC0=; h=Date:From:To:Subject:In-Reply-To:From; b=bFCz2YytBItl+x3YIAJ8lXFGLjTFOen913HCYRKjd16w5A4f19nNChaXVCoOY5ixJ zb0ntfShGbmOuKkScML3rT9JE6mzgUwg8kO7oHJKa9tnUzufe8L19GupvijYpT9ssJ tvT4X8y2RY9+LHOcD3CejDSmZR5GgME1TISQFPm4= Date: Fri, 05 Nov 2021 13:44:24 -0700 From: Andrew Morton To: akpm@linux-foundation.org, alexs@kernel.org, 
benh@kernel.crashing.org, bp@alien8.de, corbet@lwn.net, dave.hansen@linux.intel.com, david@redhat.com, gregkh@linuxfoundation.org, hpa@zytor.com, jasowang@redhat.com, linux-mm@kvack.org, luto@kernel.org, mhocko@suse.com, mingo@redhat.com, mm-commits@vger.kernel.org, mpe@ellerman.id.au, mst@redhat.com, osalvador@suse.de, paulus@samba.org, peterz@infradead.org, rafael@kernel.org, rppt@kernel.org, skhan@linuxfoundation.org, tglx@linutronix.de, torvalds@linux-foundation.org Subject: [patch 187/262] mm/memory_hotplug: remove CONFIG_MEMORY_HOTPLUG_SPARSE Message-ID: <20211105204424.n9o6HdHrO%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=bFCz2Yyt; dmarc=none; spf=pass (imf24.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 5DF54B0000A6 X-Stat-Signature: 3m8o96q5skk5dfh53ro9ax576tr36i3u X-HE-Tag: 1636145066-179625 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: David Hildenbrand Subject: mm/memory_hotplug: remove CONFIG_MEMORY_HOTPLUG_SPARSE CONFIG_MEMORY_HOTPLUG depends on CONFIG_SPARSEMEM, so there is no need for CONFIG_MEMORY_HOTPLUG_SPARSE anymore; adjust all instances to use CONFIG_MEMORY_HOTPLUG and remove CONFIG_MEMORY_HOTPLUG_SPARSE. Link: https://lkml.kernel.org/r/20210929143600.49379-3-david@redhat.com Signed-off-by: David Hildenbrand Acked-by: Shuah Khan [kselftest] Acked-by: Greg Kroah-Hartman Acked-by: Oscar Salvador Cc: Alex Shi Cc: Andy Lutomirski Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Dave Hansen Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Jason Wang Cc: Jonathan Corbet Cc: Michael Ellerman Cc: "Michael S. 
Tsirkin" Cc: Michal Hocko Cc: Mike Rapoport Cc: Paul Mackerras Cc: Peter Zijlstra Cc: "Rafael J. Wysocki" Cc: Thomas Gleixner Signed-off-by: Andrew Morton --- arch/powerpc/include/asm/machdep.h | 2 - arch/powerpc/kernel/setup_64.c | 2 - arch/powerpc/platforms/powernv/setup.c | 4 +- arch/powerpc/platforms/pseries/setup.c | 2 - drivers/base/Makefile | 2 - drivers/base/node.c | 9 ++---- drivers/virtio/Kconfig | 2 - include/linux/memory.h | 24 ++++++---------- include/linux/node.h | 4 +- lib/Kconfig.debug | 2 - mm/Kconfig | 4 -- mm/memory_hotplug.c | 2 - tools/testing/selftests/memory-hotplug/config | 1 13 files changed, 24 insertions(+), 36 deletions(-) --- a/arch/powerpc/include/asm/machdep.h~mm-memory_hotplug-remove-config_memory_hotplug_sparse +++ a/arch/powerpc/include/asm/machdep.h @@ -32,7 +32,7 @@ struct machdep_calls { void (*iommu_save)(void); void (*iommu_restore)(void); #endif -#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE +#ifdef CONFIG_MEMORY_HOTPLUG unsigned long (*memory_block_size)(void); #endif #endif /* CONFIG_PPC64 */ --- a/arch/powerpc/kernel/setup_64.c~mm-memory_hotplug-remove-config_memory_hotplug_sparse +++ a/arch/powerpc/kernel/setup_64.c @@ -912,7 +912,7 @@ void __init setup_per_cpu_areas(void) } #endif -#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE +#ifdef CONFIG_MEMORY_HOTPLUG unsigned long memory_block_size_bytes(void) { if (ppc_md.memory_block_size) --- a/arch/powerpc/platforms/powernv/setup.c~mm-memory_hotplug-remove-config_memory_hotplug_sparse +++ a/arch/powerpc/platforms/powernv/setup.c @@ -440,7 +440,7 @@ static void pnv_kexec_cpu_down(int crash } #endif /* CONFIG_KEXEC_CORE */ -#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE +#ifdef CONFIG_MEMORY_HOTPLUG static unsigned long pnv_memory_block_size(void) { /* @@ -553,7 +553,7 @@ define_machine(powernv) { #ifdef CONFIG_KEXEC_CORE .kexec_cpu_down = pnv_kexec_cpu_down, #endif -#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE +#ifdef CONFIG_MEMORY_HOTPLUG .memory_block_size = pnv_memory_block_size, #endif }; --- 
a/arch/powerpc/platforms/pseries/setup.c~mm-memory_hotplug-remove-config_memory_hotplug_sparse +++ a/arch/powerpc/platforms/pseries/setup.c @@ -1089,7 +1089,7 @@ define_machine(pseries) { .machine_kexec = pSeries_machine_kexec, .kexec_cpu_down = pseries_kexec_cpu_down, #endif -#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE +#ifdef CONFIG_MEMORY_HOTPLUG .memory_block_size = pseries_memory_block_size, #endif }; --- a/drivers/base/Makefile~mm-memory_hotplug-remove-config_memory_hotplug_sparse +++ a/drivers/base/Makefile @@ -13,7 +13,7 @@ obj-y += power/ obj-$(CONFIG_ISA_BUS_API) += isa.o obj-y += firmware_loader/ obj-$(CONFIG_NUMA) += node.o -obj-$(CONFIG_MEMORY_HOTPLUG_SPARSE) += memory.o +obj-$(CONFIG_MEMORY_HOTPLUG) += memory.o ifeq ($(CONFIG_SYSFS),y) obj-$(CONFIG_MODULES) += module.o endif --- a/drivers/base/node.c~mm-memory_hotplug-remove-config_memory_hotplug_sparse +++ a/drivers/base/node.c @@ -629,7 +629,7 @@ static void node_device_release(struct d { struct node *node = to_node(dev); -#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_HUGETLBFS) +#if defined(CONFIG_MEMORY_HOTPLUG) && defined(CONFIG_HUGETLBFS) /* * We schedule the work only when a memory section is * onlined/offlined on this node. 
When we come here, @@ -782,7 +782,7 @@ int unregister_cpu_under_node(unsigned i return 0; } -#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE +#ifdef CONFIG_MEMORY_HOTPLUG static int __ref get_nid_for_pfn(unsigned long pfn) { #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT @@ -958,10 +958,9 @@ static int node_memory_callback(struct n return NOTIFY_OK; } #endif /* CONFIG_HUGETLBFS */ -#endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */ +#endif /* CONFIG_MEMORY_HOTPLUG */ -#if !defined(CONFIG_MEMORY_HOTPLUG_SPARSE) || \ - !defined(CONFIG_HUGETLBFS) +#if !defined(CONFIG_MEMORY_HOTPLUG) || !defined(CONFIG_HUGETLBFS) static inline int node_memory_callback(struct notifier_block *self, unsigned long action, void *arg) { --- a/drivers/virtio/Kconfig~mm-memory_hotplug-remove-config_memory_hotplug_sparse +++ a/drivers/virtio/Kconfig @@ -98,7 +98,7 @@ config VIRTIO_MEM default m depends on X86_64 depends on VIRTIO - depends on MEMORY_HOTPLUG_SPARSE + depends on MEMORY_HOTPLUG depends on MEMORY_HOTREMOVE depends on CONTIG_ALLOC help --- a/include/linux/memory.h~mm-memory_hotplug-remove-config_memory_hotplug_sparse +++ a/include/linux/memory.h @@ -110,7 +110,7 @@ struct mem_section; #define SLAB_CALLBACK_PRI 1 #define IPC_CALLBACK_PRI 10 -#ifndef CONFIG_MEMORY_HOTPLUG_SPARSE +#ifndef CONFIG_MEMORY_HOTPLUG static inline void memory_dev_init(void) { return; @@ -126,7 +126,14 @@ static inline int memory_notify(unsigned { return 0; } -#else +static inline int hotplug_memory_notifier(notifier_fn_t fn, int pri) +{ + return 0; +} +/* These aren't inline functions due to a GCC bug. 
*/ +#define register_hotmemory_notifier(nb) ({ (void)(nb); 0; }) +#define unregister_hotmemory_notifier(nb) ({ (void)(nb); }) +#else /* CONFIG_MEMORY_HOTPLUG */ extern int register_memory_notifier(struct notifier_block *nb); extern void unregister_memory_notifier(struct notifier_block *nb); int create_memory_block_devices(unsigned long start, unsigned long size, @@ -148,9 +155,6 @@ struct memory_group *memory_group_find_b typedef int (*walk_memory_groups_func_t)(struct memory_group *, void *); int walk_dynamic_memory_groups(int nid, walk_memory_groups_func_t func, struct memory_group *excluded, void *arg); -#endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */ - -#ifdef CONFIG_MEMORY_HOTPLUG #define hotplug_memory_notifier(fn, pri) ({ \ static __meminitdata struct notifier_block fn##_mem_nb =\ { .notifier_call = fn, .priority = pri };\ @@ -158,15 +162,7 @@ int walk_dynamic_memory_groups(int nid, }) #define register_hotmemory_notifier(nb) register_memory_notifier(nb) #define unregister_hotmemory_notifier(nb) unregister_memory_notifier(nb) -#else -static inline int hotplug_memory_notifier(notifier_fn_t fn, int pri) -{ - return 0; -} -/* These aren't inline functions due to a GCC bug. */ -#define register_hotmemory_notifier(nb) ({ (void)(nb); 0; }) -#define unregister_hotmemory_notifier(nb) ({ (void)(nb); }) -#endif +#endif /* CONFIG_MEMORY_HOTPLUG */ /* * Kernel text modification mutex, used for code patching. 
Users of this lock --- a/include/linux/node.h~mm-memory_hotplug-remove-config_memory_hotplug_sparse +++ a/include/linux/node.h @@ -85,7 +85,7 @@ struct node { struct device dev; struct list_head access_list; -#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_HUGETLBFS) +#if defined(CONFIG_MEMORY_HOTPLUG) && defined(CONFIG_HUGETLBFS) struct work_struct node_work; #endif #ifdef CONFIG_HMEM_REPORTING @@ -98,7 +98,7 @@ struct memory_block; extern struct node *node_devices[]; typedef void (*node_registration_func_t)(struct node *); -#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_NUMA) +#if defined(CONFIG_MEMORY_HOTPLUG) && defined(CONFIG_NUMA) void link_mem_sections(int nid, unsigned long start_pfn, unsigned long end_pfn, enum meminit_context context); --- a/lib/Kconfig.debug~mm-memory_hotplug-remove-config_memory_hotplug_sparse +++ a/lib/Kconfig.debug @@ -877,7 +877,7 @@ config DEBUG_MEMORY_INIT config MEMORY_NOTIFIER_ERROR_INJECT tristate "Memory hotplug notifier error injection module" - depends on MEMORY_HOTPLUG_SPARSE && NOTIFIER_ERROR_INJECTION + depends on MEMORY_HOTPLUG && NOTIFIER_ERROR_INJECTION help This option provides the ability to inject artificial errors to memory hotplug notifier chain callbacks. 
It is controlled through --- a/mm/Kconfig~mm-memory_hotplug-remove-config_memory_hotplug_sparse +++ a/mm/Kconfig @@ -128,10 +128,6 @@ config MEMORY_HOTPLUG depends on 64BIT || BROKEN select NUMA_KEEP_MEMINFO if NUMA -config MEMORY_HOTPLUG_SPARSE - def_bool y - depends on SPARSEMEM && MEMORY_HOTPLUG - config MEMORY_HOTPLUG_DEFAULT_ONLINE bool "Online the newly added memory blocks by default" depends on MEMORY_HOTPLUG --- a/mm/memory_hotplug.c~mm-memory_hotplug-remove-config_memory_hotplug_sparse +++ a/mm/memory_hotplug.c @@ -220,7 +220,6 @@ static void release_memory_resource(stru kfree(res); } -#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE static int check_pfn_span(unsigned long pfn, unsigned long nr_pages, const char *reason) { @@ -1163,7 +1162,6 @@ failed_addition: mem_hotplug_done(); return ret; } -#endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */ static void reset_node_present_pages(pg_data_t *pgdat) { --- a/tools/testing/selftests/memory-hotplug/config~mm-memory_hotplug-remove-config_memory_hotplug_sparse +++ a/tools/testing/selftests/memory-hotplug/config @@ -1,5 +1,4 @@ CONFIG_MEMORY_HOTPLUG=y -CONFIG_MEMORY_HOTPLUG_SPARSE=y CONFIG_NOTIFIER_ERROR_INJECTION=y CONFIG_MEMORY_NOTIFIER_ERROR_INJECT=m CONFIG_MEMORY_HOTREMOVE=y From patchwork Fri Nov 5 20:44:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605777 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E0BF1C433F5 for ; Fri, 5 Nov 2021 20:44:31 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 989906135A for ; Fri, 5 Nov 2021 20:44:31 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 989906135A Authentication-Results: mail.kernel.org; dmarc=none (p=none 
dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 28CE49400B7; Fri, 5 Nov 2021 16:44:31 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 0FBD89400B3; Fri, 5 Nov 2021 16:44:31 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id F2DA29400B6; Fri, 5 Nov 2021 16:44:30 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0073.hostedemail.com [216.40.44.73]) by kanga.kvack.org (Postfix) with ESMTP id DDF159400B3 for ; Fri, 5 Nov 2021 16:44:30 -0400 (EDT) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 97EBB77816 for ; Fri, 5 Nov 2021 20:44:30 +0000 (UTC) X-FDA: 78776054700.13.ADE6D32 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf24.hostedemail.com (Postfix) with ESMTP id 089C4B0000A9 for ; Fri, 5 Nov 2021 20:44:29 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 904A361357; Fri, 5 Nov 2021 20:44:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145069; bh=3A4nZhu029sBU0tdcSR7GUPafBaYK4IeubSP84Q2m/Y=; h=Date:From:To:Subject:In-Reply-To:From; b=o7a6hnO3KVY4qHDb3PBMW02C5XxcppbniN8Q3BvkSf/8+qrXeUYiyfsAREkSnTPBZ 6Clxu1YiI75voca6J1CONSQMrlQsPnquuTgeft9tbZIR8SISdkkDJxeAdWLzSrB3hR P/DErO47g8E9z3bWNFKLQq6og4zhMb9tytDxE860= Date: Fri, 05 Nov 2021 13:44:28 -0700 From: Andrew Morton To: akpm@linux-foundation.org, alexs@kernel.org, benh@kernel.crashing.org, bp@alien8.de, corbet@lwn.net, dave.hansen@linux.intel.com, david@redhat.com, gregkh@linuxfoundation.org, hpa@zytor.com, jasowang@redhat.com, linux-mm@kvack.org, luto@kernel.org, mhocko@suse.com, mingo@redhat.com, mm-commits@vger.kernel.org, mpe@ellerman.id.au, mst@redhat.com, osalvador@suse.de, paulus@samba.org, 
peterz@infradead.org, rafael@kernel.org, rppt@kernel.org, shuah@kernel.org, tglx@linutronix.de, torvalds@linux-foundation.org Subject: [patch 188/262] mm/memory_hotplug: restrict CONFIG_MEMORY_HOTPLUG to 64 bit Message-ID: <20211105204428.ikPIxSeQt%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=o7a6hnO3; dmarc=none; spf=pass (imf24.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 089C4B0000A9 X-Stat-Signature: r8575f1r95g7o3pxnuj4mfwj3emcjrw3 X-HE-Tag: 1636145069-551242 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: David Hildenbrand Subject: mm/memory_hotplug: restrict CONFIG_MEMORY_HOTPLUG to 64 bit 32 bit support is broken in various ways: for example, we can online memory that should actually go to ZONE_HIGHMEM to ZONE_MOVABLE or in some cases even to one of the other kernel zones. We marked it BROKEN in commit b59d02ed0869 ("mm/memory_hotplug: disable the functionality for 32b") almost one year ago. According to that commit it might be broken at least since 2017. Further, there is hardly a sane use case nowadays. Let's just depend completely on 64bit, dropping the "BROKEN" dependency to make clear that we are not going to support it again. Next, we'll remove some HIGHMEM leftovers from memory hotplug code to clean up. Link: https://lkml.kernel.org/r/20210929143600.49379-4-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Oscar Salvador Cc: Alex Shi Cc: Andy Lutomirski Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Dave Hansen Cc: Greg Kroah-Hartman Cc: "H. 
Peter Anvin" Cc: Ingo Molnar Cc: Jason Wang Cc: Jonathan Corbet Cc: Michael Ellerman Cc: "Michael S. Tsirkin" Cc: Michal Hocko Cc: Mike Rapoport Cc: Paul Mackerras Cc: Peter Zijlstra Cc: "Rafael J. Wysocki" Cc: Shuah Khan Cc: Thomas Gleixner Signed-off-by: Andrew Morton --- mm/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/Kconfig~mm-memory_hotplug-restrict-config_memory_hotplug-to-64-bit +++ a/mm/Kconfig @@ -125,7 +125,7 @@ config MEMORY_HOTPLUG select MEMORY_ISOLATION depends on SPARSEMEM depends on ARCH_ENABLE_MEMORY_HOTPLUG - depends on 64BIT || BROKEN + depends on 64BIT select NUMA_KEEP_MEMINFO if NUMA config MEMORY_HOTPLUG_DEFAULT_ONLINE From patchwork Fri Nov 5 20:44:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605779 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4F000C433FE for ; Fri, 5 Nov 2021 20:44:35 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id F3C4D6128E for ; Fri, 5 Nov 2021 20:44:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org F3C4D6128E Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 878709400B6; Fri, 5 Nov 2021 16:44:34 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 78A139400B3; Fri, 5 Nov 2021 16:44:34 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5B4D39400B6; Fri, 5 Nov 2021 16:44:34 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com 
(smtprelay0214.hostedemail.com [216.40.44.214]) by kanga.kvack.org (Postfix) with ESMTP id 44D069400B3 for ; Fri, 5 Nov 2021 16:44:34 -0400 (EDT) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 0A243180CD60A for ; Fri, 5 Nov 2021 20:44:34 +0000 (UTC) X-FDA: 78776054868.06.4A4EFD8 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf29.hostedemail.com (Postfix) with ESMTP id 951A19000254 for ; Fri, 5 Nov 2021 20:44:33 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 294826135A; Fri, 5 Nov 2021 20:44:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145072; bh=yAO+VlDBu97wORVA0p9i4rg/qgwwwfY8LfcvjMSRHBQ=; h=Date:From:To:Subject:In-Reply-To:From; b=vZK0XaJVV6GccvbzdFKCIicutoqxT2Czwkd3Fz2qBgGA7RiSaqu+QfWMq05UtQi72 mBYdpZm0/9PEZRJ/nE2lYQN39k7FMX3VG1i0ddqXeOyLmrypmXfrMjnb94N9ywhHLh Gn0Ab/zuXL6Z6/SYkOHxVn5rUbgfRKHKeWLM1oIE= Date: Fri, 05 Nov 2021 13:44:31 -0700 From: Andrew Morton To: akpm@linux-foundation.org, alexs@kernel.org, benh@kernel.crashing.org, bp@alien8.de, corbet@lwn.net, dave.hansen@linux.intel.com, david@redhat.com, gregkh@linuxfoundation.org, hpa@zytor.com, jasowang@redhat.com, linux-mm@kvack.org, luto@kernel.org, mhocko@suse.com, mingo@redhat.com, mm-commits@vger.kernel.org, mpe@ellerman.id.au, mst@redhat.com, osalvador@suse.de, paulus@samba.org, peterz@infradead.org, rafael@kernel.org, rppt@kernel.org, shuah@kernel.org, tglx@linutronix.de, torvalds@linux-foundation.org Subject: [patch 189/262] mm/memory_hotplug: remove HIGHMEM leftovers Message-ID: <20211105204431.s4Dse8GID%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 MIME-Version: 1.0 Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=vZK0XaJV; dmarc=none; spf=pass (imf29.hostedemail.com: 
domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 951A19000254 X-Stat-Signature: uirusjmgregxbh4itrxpq45cj159os4w X-HE-Tag: 1636145073-72154 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: David Hildenbrand Subject: mm/memory_hotplug: remove HIGHMEM leftovers We don't support CONFIG_MEMORY_HOTPLUG on 32 bit and consequently not HIGHMEM. Let's remove any leftover code -- including the unused "status_change_nid_high" field part of the memory notifier. Link: https://lkml.kernel.org/r/20210929143600.49379-5-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Oscar Salvador Cc: Alex Shi Cc: Andy Lutomirski Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Dave Hansen Cc: Greg Kroah-Hartman Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Jason Wang Cc: Jonathan Corbet Cc: Michael Ellerman Cc: "Michael S. Tsirkin" Cc: Michal Hocko Cc: Mike Rapoport Cc: Paul Mackerras Cc: Peter Zijlstra Cc: "Rafael J. Wysocki" Cc: Shuah Khan Cc: Thomas Gleixner Signed-off-by: Andrew Morton --- Documentation/core-api/memory-hotplug.rst | 3 Documentation/translations/zh_CN/core-api/memory-hotplug.rst | 4 - include/linux/memory.h | 1 mm/memory_hotplug.c | 36 ---------- 4 files changed, 2 insertions(+), 42 deletions(-) --- a/Documentation/core-api/memory-hotplug.rst~mm-memory_hotplug-remove-highmem-leftovers +++ a/Documentation/core-api/memory-hotplug.rst @@ -57,7 +57,6 @@ The third argument (arg) passes a pointe unsigned long start_pfn; unsigned long nr_pages; int status_change_nid_normal; - int status_change_nid_high; int status_change_nid; } @@ -65,8 +64,6 @@ The third argument (arg) passes a pointe - nr_pages is # of pages of online/offline memory. 
- status_change_nid_normal is set node id when N_NORMAL_MEMORY of nodemask is (will be) set/clear, if this is -1, then nodemask status is not changed. -- status_change_nid_high is set node id when N_HIGH_MEMORY of nodemask - is (will be) set/clear, if this is -1, then nodemask status is not changed. - status_change_nid is set node id when N_MEMORY of nodemask is (will be) set/clear. It means a new(memoryless) node gets new memory by online and a node loses all memory. If this is -1, then nodemask status is not changed. --- a/Documentation/translations/zh_CN/core-api/memory-hotplug.rst~mm-memory_hotplug-remove-highmem-leftovers +++ a/Documentation/translations/zh_CN/core-api/memory-hotplug.rst @@ -63,7 +63,6 @@ memory_notify结构体的指针:: unsigned long start_pfn; unsigned long nr_pages; int status_change_nid_normal; - int status_change_nid_high; int status_change_nid; } @@ -74,9 +73,6 @@ memory_notify结构体的指针:: - status_change_nid_normal是当nodemask的N_NORMAL_MEMORY被设置/清除时设置节 点id,如果是-1,则nodemask状态不改变。 -- status_change_nid_high是当nodemask的N_HIGH_MEMORY被设置/清除时设置的节点 - id,如果这个值为-1,那么nodemask状态不会改变。 - - status_change_nid是当nodemask的N_MEMORY被(将)设置/清除时设置的节点id。这 意味着一个新的(没上线的)节点通过联机获得新的内存,而一个节点失去了所有的内 存。如果这个值为-1,那么nodemask的状态就不会改变。 --- a/include/linux/memory.h~mm-memory_hotplug-remove-highmem-leftovers +++ a/include/linux/memory.h @@ -96,7 +96,6 @@ struct memory_notify { unsigned long start_pfn; unsigned long nr_pages; int status_change_nid_normal; - int status_change_nid_high; int status_change_nid; }; --- a/mm/memory_hotplug.c~mm-memory_hotplug-remove-highmem-leftovers +++ a/mm/memory_hotplug.c @@ -21,7 +21,6 @@ #include #include #include -#include #include #include #include @@ -585,10 +584,6 @@ void generic_online_page(struct page *pa debug_pagealloc_map_pages(page, 1 << order); __free_pages_core(page, order); totalram_pages_add(1UL << order); -#ifdef CONFIG_HIGHMEM - if (PageHighMem(page)) - totalhigh_pages_add(1UL << order); -#endif } EXPORT_SYMBOL_GPL(generic_online_page); @@ 
-625,16 +620,11 @@ static void node_states_check_changes_on arg->status_change_nid = NUMA_NO_NODE; arg->status_change_nid_normal = NUMA_NO_NODE; - arg->status_change_nid_high = NUMA_NO_NODE; if (!node_state(nid, N_MEMORY)) arg->status_change_nid = nid; if (zone_idx(zone) <= ZONE_NORMAL && !node_state(nid, N_NORMAL_MEMORY)) arg->status_change_nid_normal = nid; -#ifdef CONFIG_HIGHMEM - if (zone_idx(zone) <= ZONE_HIGHMEM && !node_state(nid, N_HIGH_MEMORY)) - arg->status_change_nid_high = nid; -#endif } static void node_states_set_node(int node, struct memory_notify *arg) @@ -642,9 +632,6 @@ static void node_states_set_node(int nod if (arg->status_change_nid_normal >= 0) node_set_state(node, N_NORMAL_MEMORY); - if (arg->status_change_nid_high >= 0) - node_set_state(node, N_HIGH_MEMORY); - if (arg->status_change_nid >= 0) node_set_state(node, N_MEMORY); } @@ -1801,7 +1788,6 @@ static void node_states_check_changes_of arg->status_change_nid = NUMA_NO_NODE; arg->status_change_nid_normal = NUMA_NO_NODE; - arg->status_change_nid_high = NUMA_NO_NODE; /* * Check whether node_states[N_NORMAL_MEMORY] will be changed. @@ -1816,24 +1802,9 @@ static void node_states_check_changes_of if (zone_idx(zone) <= ZONE_NORMAL && nr_pages >= present_pages) arg->status_change_nid_normal = zone_to_nid(zone); -#ifdef CONFIG_HIGHMEM /* - * node_states[N_HIGH_MEMORY] contains nodes which - * have normal memory or high memory. - * Here we add the present_pages belonging to ZONE_HIGHMEM. - * If the zone is within the range of [0..ZONE_HIGHMEM), and - * we determine that the zones in that range become empty, - * we need to clear the node for N_HIGH_MEMORY. - */ - present_pages += pgdat->node_zones[ZONE_HIGHMEM].present_pages; - if (zone_idx(zone) <= ZONE_HIGHMEM && nr_pages >= present_pages) - arg->status_change_nid_high = zone_to_nid(zone); -#endif - - /* - * We have accounted the pages from [0..ZONE_NORMAL), and - * in case of CONFIG_HIGHMEM the pages from ZONE_HIGHMEM - * as well. 
+ * We have accounted the pages from [0..ZONE_NORMAL); ZONE_HIGHMEM + * does not apply as we don't support 32bit. * Here we count the possible pages from ZONE_MOVABLE. * If after having accounted all the pages, we see that the nr_pages * to be offlined is over or equal to the accounted pages, @@ -1851,9 +1822,6 @@ static void node_states_clear_node(int n if (arg->status_change_nid_normal >= 0) node_clear_state(node, N_NORMAL_MEMORY); - if (arg->status_change_nid_high >= 0) - node_clear_state(node, N_HIGH_MEMORY); - if (arg->status_change_nid >= 0) node_clear_state(node, N_MEMORY); } From patchwork Fri Nov 5 20:44:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605781 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C7064C433EF for ; Fri, 5 Nov 2021 20:44:39 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 7B02261362 for ; Fri, 5 Nov 2021 20:44:39 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 7B02261362 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 0FC349400B8; Fri, 5 Nov 2021 16:44:39 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 0AC939400B3; Fri, 5 Nov 2021 16:44:39 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id EB50A9400B8; Fri, 5 Nov 2021 16:44:38 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0193.hostedemail.com [216.40.44.193]) by kanga.kvack.org (Postfix) with ESMTP id 
D86689400B3 for ; Fri, 5 Nov 2021 16:44:38 -0400 (EDT) Received: from smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id A28371836C629 for ; Fri, 5 Nov 2021 20:44:38 +0000 (UTC) X-FDA: 78776055036.16.22BA358 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf31.hostedemail.com (Postfix) with ESMTP id DD39F104AAF4 for ; Fri, 5 Nov 2021 20:44:28 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id CF09661288; Fri, 5 Nov 2021 20:44:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145076; bh=f1Hb9dh5T1kfv5Bilwovktw+IWTAU0b2YfU/nNtWtkk=; h=Date:From:To:Subject:In-Reply-To:From; b=zqIheX0GLyIRw6PtXItOeB/GHhaYU2Rn9fGgRlu+/OhWaQ3a0rLXCZmrMD5HmZk4r YsGb0V/J7FWaFf3aHJ2FRJ4M7aI+tNAyN0AbdQ9Y7B4ZaQJJ991U1qDooYytGos3ii ZaVB9VqYTZuqtWeHxKRpCegR/qJVwDIQZba7ijAY= Date: Fri, 05 Nov 2021 13:44:35 -0700 From: Andrew Morton To: akpm@linux-foundation.org, alexs@kernel.org, benh@kernel.crashing.org, bp@alien8.de, corbet@lwn.net, dave.hansen@linux.intel.com, david@redhat.com, gregkh@linuxfoundation.org, hpa@zytor.com, jasowang@redhat.com, linux-mm@kvack.org, luto@kernel.org, mhocko@suse.com, mingo@redhat.com, mm-commits@vger.kernel.org, mpe@ellerman.id.au, mst@redhat.com, osalvador@suse.de, paulus@samba.org, peterz@infradead.org, rafael@kernel.org, rppt@kernel.org, shuah@kernel.org, tglx@linutronix.de, torvalds@linux-foundation.org Subject: [patch 190/262] mm/memory_hotplug: remove stale function declarations Message-ID: <20211105204435.9aBC2O7r5%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: DD39F104AAF4 X-Stat-Signature: bo8xph6gct8wqeqmmuxar19oe64k9afb Authentication-Results: imf31.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=zqIheX0G; spf=pass (imf31.hostedemail.com: 
domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145068-142683 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: David Hildenbrand Subject: mm/memory_hotplug: remove stale function declarations These functions no longer exist. Link: https://lkml.kernel.org/r/20210929143600.49379-6-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Oscar Salvador Cc: Alex Shi Cc: Andy Lutomirski Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Dave Hansen Cc: Greg Kroah-Hartman Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Jason Wang Cc: Jonathan Corbet Cc: Michael Ellerman Cc: "Michael S. Tsirkin" Cc: Michal Hocko Cc: Mike Rapoport Cc: Paul Mackerras Cc: Peter Zijlstra Cc: "Rafael J. Wysocki" Cc: Shuah Khan Cc: Thomas Gleixner Signed-off-by: Andrew Morton --- include/linux/memory_hotplug.h | 3 --- 1 file changed, 3 deletions(-) --- a/include/linux/memory_hotplug.h~mm-memory_hotplug-remove-stale-function-declarations +++ a/include/linux/memory_hotplug.h @@ -98,9 +98,6 @@ static inline void zone_seqlock_init(str { seqlock_init(&zone->span_seqlock); } -extern int zone_grow_free_lists(struct zone *zone, unsigned long new_nr_pages); -extern int zone_grow_waitqueues(struct zone *zone, unsigned long nr_pages); -extern int add_one_highpage(struct page *page, int pfn, int bad_ppro); extern void adjust_present_page_count(struct page *page, struct memory_group *group, long nr_pages); From patchwork Fri Nov 5 20:44:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605783 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by 
smtp.lore.kernel.org (Postfix) with ESMTP id 948B5C433EF for ; Fri, 5 Nov 2021 20:44:42 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 47C6961362 for ; Fri, 5 Nov 2021 20:44:42 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 47C6961362 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id C94589400B9; Fri, 5 Nov 2021 16:44:41 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id C1D169400B3; Fri, 5 Nov 2021 16:44:41 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A96759400B9; Fri, 5 Nov 2021 16:44:41 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0021.hostedemail.com [216.40.44.21]) by kanga.kvack.org (Postfix) with ESMTP id 955CA9400B3 for ; Fri, 5 Nov 2021 16:44:41 -0400 (EDT) Received: from smtpin20.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 56453182067DF for ; Fri, 5 Nov 2021 20:44:41 +0000 (UTC) X-FDA: 78776055162.20.AA62294 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf31.hostedemail.com (Postfix) with ESMTP id 94DC4104AAD7 for ; Fri, 5 Nov 2021 20:44:32 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 704786128E; Fri, 5 Nov 2021 20:44:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145080; bh=o1ufznwnjDUUvGC62OCbSNe4ZqKRKpd6DpUBq68ilCA=; h=Date:From:To:Subject:In-Reply-To:From; b=Q1Cfq/BUdFaZIvkgm8h9cT7RT7KSyjOslPeR9wM0R6k5lgnSgGgmDyy3TJdu349Dx ZdY/RnuL+AGAeRTR/z3+P1dKXH3QwA4432lg7hZpDH4kS+YWmTEN3+gxPbUWTp6UHP vQPr1ZWYLoGknFJzMjrmsCWJTAE3gJFFsICoLhOE= Date: Fri, 05 Nov 2021 13:44:39 -0700 From: Andrew Morton To: 
akpm@linux-foundation.org, alexs@kernel.org, benh@kernel.crashing.org, bp@alien8.de, corbet@lwn.net, dave.hansen@linux.intel.com, david@redhat.com, gregkh@linuxfoundation.org, hpa@zytor.com, jasowang@redhat.com, linux-mm@kvack.org, luto@kernel.org, mhocko@suse.com, mingo@redhat.com, mm-commits@vger.kernel.org, mpe@ellerman.id.au, mst@redhat.com, osalvador@suse.de, paulus@samba.org, peterz@infradead.org, rafael@kernel.org, rppt@kernel.org, shuah@kernel.org, tglx@linutronix.de, torvalds@linux-foundation.org Subject: [patch 191/262] x86: remove memory hotplug support on X86_32 Message-ID: <20211105204439.BdMlAxnGp%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf31.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="Q1Cfq/BU"; dmarc=none; spf=pass (imf31.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 94DC4104AAD7 X-Stat-Signature: rk7onpiwh9f6nax3ty55nqczwwz531jm X-HE-Tag: 1636145072-612809 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: David Hildenbrand Subject: x86: remove memory hotplug support on X86_32 CONFIG_MEMORY_HOTPLUG has been marked BROKEN for over a year and we just restricted it to 64 bit. Let's remove the unused x86 32bit implementation and simplify the Kconfig. Link: https://lkml.kernel.org/r/20210929143600.49379-7-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Oscar Salvador Cc: Alex Shi Cc: Andy Lutomirski Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Dave Hansen Cc: Greg Kroah-Hartman Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Jason Wang Cc: Jonathan Corbet Cc: Michael Ellerman Cc: "Michael S. 
Tsirkin" Cc: Michal Hocko Cc: Mike Rapoport Cc: Paul Mackerras Cc: Peter Zijlstra Cc: "Rafael J. Wysocki" Cc: Shuah Khan Cc: Thomas Gleixner Signed-off-by: Andrew Morton --- arch/x86/Kconfig | 6 +++--- arch/x86/mm/init_32.c | 31 ------------------------------- 2 files changed, 3 insertions(+), 34 deletions(-) --- a/arch/x86/Kconfig~x86-remove-memory-hotplug-support-on-x86_32 +++ a/arch/x86/Kconfig @@ -62,7 +62,7 @@ config X86 select ARCH_32BIT_OFF_T if X86_32 select ARCH_CLOCKSOURCE_INIT select ARCH_ENABLE_HUGEPAGE_MIGRATION if X86_64 && HUGETLB_PAGE && MIGRATION - select ARCH_ENABLE_MEMORY_HOTPLUG if X86_64 || (X86_32 && HIGHMEM) + select ARCH_ENABLE_MEMORY_HOTPLUG if X86_64 select ARCH_ENABLE_MEMORY_HOTREMOVE if MEMORY_HOTPLUG select ARCH_ENABLE_SPLIT_PMD_PTLOCK if (PGTABLE_LEVELS > 2) && (X86_64 || X86_PAE) select ARCH_ENABLE_THP_MIGRATION if X86_64 && TRANSPARENT_HUGEPAGE @@ -1614,7 +1614,7 @@ config ARCH_SELECT_MEMORY_MODEL config ARCH_MEMORY_PROBE bool "Enable sysfs memory/probe interface" - depends on X86_64 && MEMORY_HOTPLUG + depends on MEMORY_HOTPLUG help This option enables a sysfs memory/probe interface for testing. See Documentation/admin-guide/mm/memory-hotplug.rst for more information. @@ -2394,7 +2394,7 @@ endmenu config ARCH_HAS_ADD_PAGES def_bool y - depends on X86_64 && ARCH_ENABLE_MEMORY_HOTPLUG + depends on ARCH_ENABLE_MEMORY_HOTPLUG config ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE def_bool y --- a/arch/x86/mm/init_32.c~x86-remove-memory-hotplug-support-on-x86_32 +++ a/arch/x86/mm/init_32.c @@ -779,37 +779,6 @@ void __init mem_init(void) test_wp_bit(); } -#ifdef CONFIG_MEMORY_HOTPLUG -int arch_add_memory(int nid, u64 start, u64 size, - struct mhp_params *params) -{ - unsigned long start_pfn = start >> PAGE_SHIFT; - unsigned long nr_pages = size >> PAGE_SHIFT; - int ret; - - /* - * The page tables were already mapped at boot so if the caller - * requests a different mapping type then we must change all the - * pages with __set_memory_prot(). 
- */ - if (params->pgprot.pgprot != PAGE_KERNEL.pgprot) { - ret = __set_memory_prot(start, nr_pages, params->pgprot); - if (ret) - return ret; - } - - return __add_pages(nid, start_pfn, nr_pages, params); -} - -void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap) -{ - unsigned long start_pfn = start >> PAGE_SHIFT; - unsigned long nr_pages = size >> PAGE_SHIFT; - - __remove_pages(start_pfn, nr_pages, altmap); -} -#endif - int kernel_set_to_readonly __read_mostly; static void mark_nxdata_nx(void) From patchwork Fri Nov 5 20:44:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605785 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 50F44C433EF for ; Fri, 5 Nov 2021 20:44:46 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 01AC561288 for ; Fri, 5 Nov 2021 20:44:45 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 01AC561288 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 800469400BA; Fri, 5 Nov 2021 16:44:45 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 789819400B3; Fri, 5 Nov 2021 16:44:45 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 629DE9400BA; Fri, 5 Nov 2021 16:44:45 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0243.hostedemail.com [216.40.44.243]) by kanga.kvack.org (Postfix) with ESMTP id 507989400B3 for ; Fri, 5 Nov 2021 16:44:45 -0400 (EDT) Received: from 
smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 17C237533C for ; Fri, 5 Nov 2021 20:44:45 +0000 (UTC) X-FDA: 78776055330.16.9CFB177 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf18.hostedemail.com (Postfix) with ESMTP id 92A1B4002095 for ; Fri, 5 Nov 2021 20:44:44 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 3A8B861362; Fri, 5 Nov 2021 20:44:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145083; bh=j83n6yVFIQx4Yb2IDvFFc3Wk9vLnKLD6aw+p2WAF1d4=; h=Date:From:To:Subject:In-Reply-To:From; b=bOvWEKW4c3GykJUoLRVQsi4WYHUPZ3UAEXMlelbjm2TBHkO4pMfHd3sB6STeOxJyh 6z4ul6POovpL7/Kx8fF6FP8KxWVspl8l7oq7XicWWtmzhIez38UnkJ57EYhC7NMXeT 1+6eUV21595jFXo+ovG3qP1PgixjHiTWZZcGa2yQ= Date: Fri, 05 Nov 2021 13:44:42 -0700 From: Andrew Morton To: akpm@linux-foundation.org, aneesh.kumar@linux.ibm.com, arnd@arndb.de, borntraeger@de.ibm.com, chenhuacai@kernel.org, david@redhat.com, ebiederm@xmission.com, geert@linux-m68k.org, gor@linux.ibm.com, hca@linux.ibm.com, Jianyong.Wu@arm.com, jiaxun.yang@flygoat.com, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, osalvador@suse.de, rppt@kernel.org, shahab@synopsys.com, torvalds@linux-foundation.org, tsbogend@alpha.franken.de, vgupta@kernel.org Subject: [patch 192/262] mm/memory_hotplug: handle memblock_add_node() failures in add_memory_resource() Message-ID: <20211105204442.CRZzL-cG-%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=bOvWEKW4; dmarc=none; spf=pass (imf18.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 92A1B4002095 
X-Stat-Signature: pgbgbxarj9og6eytu9waja61ou5o7rob X-HE-Tag: 1636145084-756680 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: David Hildenbrand Subject: mm/memory_hotplug: handle memblock_add_node() failures in add_memory_resource() Patch series "mm/memory_hotplug: full support for add_memory_driver_managed() with CONFIG_ARCH_KEEP_MEMBLOCK", v2. Architectures that require CONFIG_ARCH_KEEP_MEMBLOCK=y, such as arm64, don't cleanly support add_memory_driver_managed() yet. Most prominently, kexec_file can still end up placing kexec images on such driver-managed memory, resulting in undesired behavior such as kexec images located on memory that is not part of the firmware-provided memory map. Teaching kexec to not place images on driver-managed memory is especially relevant for virtio-mem. Details can be found in commit 7b7b27214bba ("mm/memory_hotplug: introduce add_memory_driver_managed()"). Extend memblock with a new flag and set it from memory hotplug code when applicable. This is required to fully support virtio-mem on arm64, and it also makes kexec_file behave as it does on x86-64. This patch (of 2): If memblock_add_node() fails, we're most probably running out of memory. While this is unlikely to happen, it can happen, and having memory added without a memblock can be problematic for architectures that use memblock to detect valid memory. Let's fail gracefully instead of silently ignoring the error. Link: https://lkml.kernel.org/r/20211004093605.5830-1-david@redhat.com Link: https://lkml.kernel.org/r/20211004093605.5830-2-david@redhat.com Signed-off-by: David Hildenbrand Cc: Mike Rapoport Cc: Michal Hocko Cc: Oscar Salvador Cc: Jianyong Wu Cc: "Aneesh Kumar K . 
V" Cc: Vineet Gupta Cc: Geert Uytterhoeven Cc: Huacai Chen Cc: Jiaxun Yang Cc: Thomas Bogendoerfer Cc: Heiko Carstens Cc: Vasily Gorbik Cc: Christian Borntraeger Cc: Eric Biederman Cc: Arnd Bergmann Cc: Shahab Vahedi Signed-off-by: Andrew Morton --- mm/memory_hotplug.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) --- a/mm/memory_hotplug.c~mm-memory_hotplug-handle-memblock_add_node-failures-in-add_memory_resource +++ a/mm/memory_hotplug.c @@ -1369,8 +1369,11 @@ int __ref add_memory_resource(int nid, s mem_hotplug_begin(); - if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) - memblock_add_node(start, size, nid); + if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) { + ret = memblock_add_node(start, size, nid); + if (ret) + goto error_mem_hotplug_end; + } ret = __try_online_node(nid, false); if (ret < 0) @@ -1443,6 +1446,7 @@ error: rollback_node_hotadd(nid); if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) memblock_remove(start, size); +error_mem_hotplug_end: mem_hotplug_done(); return ret; } From patchwork Fri Nov 5 20:44:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605787 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AC6A5C433F5 for ; Fri, 5 Nov 2021 20:44:49 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 610086128E for ; Fri, 5 Nov 2021 20:44:49 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 610086128E Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id E8FDA9400BB; Fri, 5 Nov 2021 16:44:48 -0400 (EDT) Received: by 
kanga.kvack.org (Postfix, from userid 40) id E18A79400B3; Fri, 5 Nov 2021 16:44:48 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CE16D9400BB; Fri, 5 Nov 2021 16:44:48 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0207.hostedemail.com [216.40.44.207]) by kanga.kvack.org (Postfix) with ESMTP id B56999400B3 for ; Fri, 5 Nov 2021 16:44:48 -0400 (EDT) Received: from smtpin25.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 797F218019240 for ; Fri, 5 Nov 2021 20:44:48 +0000 (UTC) X-FDA: 78776055456.25.AD8CBE6 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf30.hostedemail.com (Postfix) with ESMTP id E6C53E00199D for ; Fri, 5 Nov 2021 20:44:30 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id A400261357; Fri, 5 Nov 2021 20:44:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145087; bh=HkoYWztKitzLI7Po5xZgHrrfDx7kzFW8kNlNzd+PrAo=; h=Date:From:To:Subject:In-Reply-To:From; b=SiPOoC29Lop84lj+FSYQypUQE58fcAgcr29tfiTg6E9JjwPqbWAhui/s7obRdjTWl gJ1QInYZDNV5Ude50o9ibrBLjZbKMB98lgbgq3FUHpxiBmoqD2VyZfKw+wCR9FLjpU k+xy8CKZLCNS2eWH5W2NGDCmbfTZ30Z2oJ9SJ/W4= Date: Fri, 05 Nov 2021 13:44:46 -0700 From: Andrew Morton To: akpm@linux-foundation.org, aneesh.kumar@linux.ibm.com, arnd@arndb.de, borntraeger@de.ibm.com, chenhuacai@kernel.org, david@redhat.com, ebiederm@xmission.com, geert@linux-m68k.org, gor@linux.ibm.com, hca@linux.ibm.com, Jianyong.Wu@arm.com, jiaxun.yang@flygoat.com, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, osalvador@suse.de, rppt@linux.ibm.com, shahab@synopsys.com, torvalds@linux-foundation.org, tsbogend@alpha.franken.de, vgupta@kernel.org Subject: [patch 193/262] memblock: improve MEMBLOCK_HOTPLUG documentation Message-ID: <20211105204446.kz5JSc_-i%akpm@linux-foundation.org> 
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=SiPOoC29; dmarc=none; spf=pass (imf30.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: E6C53E00199D X-Stat-Signature: x4tb6nbb33dkefam5di8kjqwkex7zxk7 X-HE-Tag: 1636145070-396399 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: David Hildenbrand Subject: memblock: improve MEMBLOCK_HOTPLUG documentation The description of MEMBLOCK_HOTPLUG is currently short and consequently misleading: we're actually dealing with a memory region that might get hotunplugged later (i.e., the platform+firmware supports it), yet it is indicated in the firmware-provided memory map as system RAM that will just get used by the system for any purpose when not taking special care. The firmware marked this memory region as hot(un)plugged (e.g., hotplugged before reboot), implying that it might get hotunplugged again later. Whether we consider this information depends on the "movable_node" kernel commandline parameter: only with "movable_node" set will we try to keep this memory hotunpluggable, for example, by not serving early allocations from this memory region and by letting the buddy manage it using ZONE_MOVABLE. Let's make this clearer by extending the documentation. Note: kexec *has to* indicate this memory to the second kernel. With "movable_node" set, we don't want to place kexec-images on this memory. Without "movable_node" set, we don't care and can place kexec-images on this memory. 
In both cases, after successful memory hotunplug, kexec has to be re-armed to update the memory map for the second kernel and to place the kexec-images somewhere else. Link: https://lkml.kernel.org/r/20211004093605.5830-3-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Mike Rapoport Cc: "Aneesh Kumar K . V" Cc: Arnd Bergmann Cc: Christian Borntraeger Cc: Eric Biederman Cc: Geert Uytterhoeven Cc: Heiko Carstens Cc: Huacai Chen Cc: Jianyong Wu Cc: Jiaxun Yang Cc: Michal Hocko Cc: Oscar Salvador Cc: Shahab Vahedi Cc: Thomas Bogendoerfer Cc: Vasily Gorbik Cc: Vineet Gupta Signed-off-by: Andrew Morton --- include/linux/memblock.h | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) --- a/include/linux/memblock.h~memblock-improve-memblock_hotplug-documentation +++ a/include/linux/memblock.h @@ -28,7 +28,11 @@ extern unsigned long long max_possible_p /** * enum memblock_flags - definition of memory region attributes * @MEMBLOCK_NONE: no special request - * @MEMBLOCK_HOTPLUG: hotpluggable region + * @MEMBLOCK_HOTPLUG: memory region indicated in the firmware-provided memory + * map during early boot as hot(un)pluggable system RAM (e.g., memory range + * that might get hotunplugged later). With "movable_node" set on the kernel + * commandline, try keeping this memory region hotunpluggable. Does not apply + * to memblocks added ("hotplugged") after early boot. 
* @MEMBLOCK_MIRROR: mirrored region * @MEMBLOCK_NOMAP: don't add to kernel direct mapping and treat as * reserved in the memory map; refer to memblock_mark_nomap() description From patchwork Fri Nov 5 20:44:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605789 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2B507C433EF for ; Fri, 5 Nov 2021 20:44:53 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D46DD6128E for ; Fri, 5 Nov 2021 20:44:52 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org D46DD6128E Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 62B099400BC; Fri, 5 Nov 2021 16:44:52 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 58D149400B3; Fri, 5 Nov 2021 16:44:52 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 42F869400BC; Fri, 5 Nov 2021 16:44:52 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0005.hostedemail.com [216.40.44.5]) by kanga.kvack.org (Postfix) with ESMTP id 3038D9400B3 for ; Fri, 5 Nov 2021 16:44:52 -0400 (EDT) Received: from smtpin28.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id EC5E482499A8 for ; Fri, 5 Nov 2021 20:44:51 +0000 (UTC) X-FDA: 78776055582.28.B5EC347 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf28.hostedemail.com (Postfix) with ESMTP id 7CB0590000BC for ; Fri, 5 Nov 2021 
20:44:51 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 3964C61288; Fri, 5 Nov 2021 20:44:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145090; bh=ayQD1p0jWUOaJYCWhaVnRKEStX2T3lJ/4OS7MVKIsJc=; h=Date:From:To:Subject:In-Reply-To:From; b=UWNvGCdBLyxnsfIPudbpGYJBzEbEUrrWbN9JFXMY/mgF0n1ZapFDb/mhja83ky+jq J4WbLU1lntOvUgYgr717nzXsh8Ois89ZuhyNmkpLkL1swWuNCmEwLWLcnLHA6N0xit Mupqdm+zXnyfE2NiAUqu26rtUI/zS02/D6b1/Ugk= Date: Fri, 05 Nov 2021 13:44:49 -0700 From: Andrew Morton To: akpm@linux-foundation.org, aneesh.kumar@linux.ibm.com, arnd@arndb.de, borntraeger@de.ibm.com, chenhuacai@kernel.org, david@redhat.com, ebiederm@xmission.com, geert@linux-m68k.org, gor@linux.ibm.com, hca@linux.ibm.com, Jianyong.Wu@arm.com, jiaxun.yang@flygoat.com, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, osalvador@suse.de, rppt@linux.ibm.com, shahab@synopsys.com, torvalds@linux-foundation.org, tsbogend@alpha.franken.de, vgupta@kernel.org Subject: [patch 194/262] memblock: allow to specify flags with memblock_add_node() Message-ID: <20211105204449.huGy9m-tX%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 7CB0590000BC X-Stat-Signature: j9gz5113gw5xycq6934q34aarxztdtip Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=UWNvGCdB; spf=pass (imf28.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145091-190324 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: David Hildenbrand Subject: memblock: allow to specify flags with memblock_add_node() We want to specify flags when hotplugging memory. 
Let's prepare to pass flags to memblock_add_node() by adjusting all existing users. Note that when hotplugging memory the system is already up and running and we might have concurrent memblock users: for example, while we're hotplugging memory, kexec_file code might search for suitable memory regions to place kexec images. It's important to add the memory directly to memblock via a single call with the right flags, instead of adding the memory first and applying flags later: otherwise, concurrent memblock users might temporarily stumble over memblocks with wrong flags, which will be important in a follow-up patch that introduces a new flag to properly handle add_memory_driver_managed(). Link: https://lkml.kernel.org/r/20211004093605.5830-4-david@redhat.com Acked-by: Geert Uytterhoeven Acked-by: Heiko Carstens Signed-off-by: David Hildenbrand Acked-by: Shahab Vahedi [arch/arc] Reviewed-by: Mike Rapoport Cc: "Aneesh Kumar K . V" Cc: Arnd Bergmann Cc: Christian Borntraeger Cc: Eric Biederman Cc: Huacai Chen Cc: Jianyong Wu Cc: Jiaxun Yang Cc: Michal Hocko Cc: Oscar Salvador Cc: Thomas Bogendoerfer Cc: Vasily Gorbik Cc: Vineet Gupta Signed-off-by: Andrew Morton --- arch/arc/mm/init.c | 4 ++-- arch/ia64/mm/contig.c | 2 +- arch/ia64/mm/init.c | 2 +- arch/m68k/mm/mcfmmu.c | 3 ++- arch/m68k/mm/motorola.c | 6 ++++-- arch/mips/loongson64/init.c | 4 +++- arch/mips/sgi-ip27/ip27-memory.c | 3 ++- arch/s390/kernel/setup.c | 3 ++- include/linux/memblock.h | 3 ++- include/linux/mm.h | 2 +- mm/memblock.c | 9 +++++---- mm/memory_hotplug.c | 2 +- 12 files changed, 26 insertions(+), 17 deletions(-) --- a/arch/arc/mm/init.c~memblock-allow-to-specify-flags-with-memblock_add_node +++ a/arch/arc/mm/init.c @@ -59,13 +59,13 @@ void __init early_init_dt_add_memory_arc low_mem_sz = size; in_use = 1; - memblock_add_node(base, size, 0); + memblock_add_node(base, size, 0, MEMBLOCK_NONE); } else { #ifdef CONFIG_HIGHMEM high_mem_start = base; high_mem_sz = size; in_use = 1; - memblock_add_node(base, 
size, 1); + memblock_add_node(base, size, 1, MEMBLOCK_NONE); memblock_reserve(base, size); #endif } --- a/arch/ia64/mm/contig.c~memblock-allow-to-specify-flags-with-memblock_add_node +++ a/arch/ia64/mm/contig.c @@ -153,7 +153,7 @@ find_memory (void) efi_memmap_walk(find_max_min_low_pfn, NULL); max_pfn = max_low_pfn; - memblock_add_node(0, PFN_PHYS(max_low_pfn), 0); + memblock_add_node(0, PFN_PHYS(max_low_pfn), 0, MEMBLOCK_NONE); find_initrd(); --- a/arch/ia64/mm/init.c~memblock-allow-to-specify-flags-with-memblock_add_node +++ a/arch/ia64/mm/init.c @@ -378,7 +378,7 @@ int __init register_active_ranges(u64 st #endif if (start < end) - memblock_add_node(__pa(start), end - start, nid); + memblock_add_node(__pa(start), end - start, nid, MEMBLOCK_NONE); return 0; } --- a/arch/m68k/mm/mcfmmu.c~memblock-allow-to-specify-flags-with-memblock_add_node +++ a/arch/m68k/mm/mcfmmu.c @@ -174,7 +174,8 @@ void __init cf_bootmem_alloc(void) m68k_memory[0].addr = _rambase; m68k_memory[0].size = _ramend - _rambase; - memblock_add_node(m68k_memory[0].addr, m68k_memory[0].size, 0); + memblock_add_node(m68k_memory[0].addr, m68k_memory[0].size, 0, + MEMBLOCK_NONE); /* compute total pages in system */ num_pages = PFN_DOWN(_ramend - _rambase); --- a/arch/m68k/mm/motorola.c~memblock-allow-to-specify-flags-with-memblock_add_node +++ a/arch/m68k/mm/motorola.c @@ -410,7 +410,8 @@ void __init paging_init(void) min_addr = m68k_memory[0].addr; max_addr = min_addr + m68k_memory[0].size; - memblock_add_node(m68k_memory[0].addr, m68k_memory[0].size, 0); + memblock_add_node(m68k_memory[0].addr, m68k_memory[0].size, 0, + MEMBLOCK_NONE); for (i = 1; i < m68k_num_memory;) { if (m68k_memory[i].addr < min_addr) { printk("Ignoring memory chunk at 0x%lx:0x%lx before the first chunk\n", @@ -421,7 +422,8 @@ void __init paging_init(void) (m68k_num_memory - i) * sizeof(struct m68k_mem_info)); continue; } - memblock_add_node(m68k_memory[i].addr, m68k_memory[i].size, i); + memblock_add_node(m68k_memory[i].addr, 
m68k_memory[i].size, i, + MEMBLOCK_NONE); addr = m68k_memory[i].addr + m68k_memory[i].size; if (addr > max_addr) max_addr = addr; --- a/arch/mips/loongson64/init.c~memblock-allow-to-specify-flags-with-memblock_add_node +++ a/arch/mips/loongson64/init.c @@ -77,7 +77,9 @@ void __init szmem(unsigned int node) (u32)node_id, mem_type, mem_start, mem_size); pr_info(" start_pfn:0x%llx, end_pfn:0x%llx, num_physpages:0x%lx\n", start_pfn, end_pfn, num_physpages); - memblock_add_node(PFN_PHYS(start_pfn), PFN_PHYS(node_psize), node); + memblock_add_node(PFN_PHYS(start_pfn), + PFN_PHYS(node_psize), node, + MEMBLOCK_NONE); break; case SYSTEM_RAM_RESERVED: pr_info("Node%d: mem_type:%d, mem_start:0x%llx, mem_size:0x%llx MB\n", --- a/arch/mips/sgi-ip27/ip27-memory.c~memblock-allow-to-specify-flags-with-memblock_add_node +++ a/arch/mips/sgi-ip27/ip27-memory.c @@ -341,7 +341,8 @@ static void __init szmem(void) continue; } memblock_add_node(PFN_PHYS(slot_getbasepfn(node, slot)), - PFN_PHYS(slot_psize), node); + PFN_PHYS(slot_psize), node, + MEMBLOCK_NONE); } } } --- a/arch/s390/kernel/setup.c~memblock-allow-to-specify-flags-with-memblock_add_node +++ a/arch/s390/kernel/setup.c @@ -593,7 +593,8 @@ static void __init setup_resources(void) * part of the System RAM resource. 
*/ if (crashk_res.end) { - memblock_add_node(crashk_res.start, resource_size(&crashk_res), 0); + memblock_add_node(crashk_res.start, resource_size(&crashk_res), + 0, MEMBLOCK_NONE); memblock_reserve(crashk_res.start, resource_size(&crashk_res)); insert_resource(&iomem_resource, &crashk_res); } --- a/include/linux/memblock.h~memblock-allow-to-specify-flags-with-memblock_add_node +++ a/include/linux/memblock.h @@ -104,7 +104,8 @@ static inline void memblock_discard(void #endif void memblock_allow_resize(void); -int memblock_add_node(phys_addr_t base, phys_addr_t size, int nid); +int memblock_add_node(phys_addr_t base, phys_addr_t size, int nid, + enum memblock_flags flags); int memblock_add(phys_addr_t base, phys_addr_t size); int memblock_remove(phys_addr_t base, phys_addr_t size); int memblock_phys_free(phys_addr_t base, phys_addr_t size); --- a/include/linux/mm.h~memblock-allow-to-specify-flags-with-memblock_add_node +++ a/include/linux/mm.h @@ -2425,7 +2425,7 @@ static inline unsigned long get_num_phys * unsigned long max_zone_pfns[MAX_NR_ZONES] = {max_dma, max_normal_pfn, * max_highmem_pfn}; * for_each_valid_physical_page_range() - * memblock_add_node(base, size, nid) + * memblock_add_node(base, size, nid, MEMBLOCK_NONE) * free_area_init(max_zone_pfns); */ void free_area_init(unsigned long *max_zone_pfn); --- a/mm/memblock.c~memblock-allow-to-specify-flags-with-memblock_add_node +++ a/mm/memblock.c @@ -655,6 +655,7 @@ repeat: * @base: base address of the new region * @size: size of the new region * @nid: nid of the new region + * @flags: flags of the new region * * Add new memblock region [@base, @base + @size) to the "memory" * type. See memblock_add_range() description for mode details @@ -663,14 +664,14 @@ repeat: * 0 on success, -errno on failure. 
*/ int __init_memblock memblock_add_node(phys_addr_t base, phys_addr_t size, - int nid) + int nid, enum memblock_flags flags) { phys_addr_t end = base + size - 1; - memblock_dbg("%s: [%pa-%pa] nid=%d %pS\n", __func__, - &base, &end, nid, (void *)_RET_IP_); + memblock_dbg("%s: [%pa-%pa] nid=%d flags=%x %pS\n", __func__, + &base, &end, nid, flags, (void *)_RET_IP_); - return memblock_add_range(&memblock.memory, base, size, nid, 0); + return memblock_add_range(&memblock.memory, base, size, nid, flags); } /** --- a/mm/memory_hotplug.c~memblock-allow-to-specify-flags-with-memblock_add_node +++ a/mm/memory_hotplug.c @@ -1370,7 +1370,7 @@ int __ref add_memory_resource(int nid, s mem_hotplug_begin(); if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) { - ret = memblock_add_node(start, size, nid); + ret = memblock_add_node(start, size, nid, MEMBLOCK_NONE); if (ret) goto error_mem_hotplug_end; } From patchwork Fri Nov 5 20:44:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605791 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C4591C433EF for ; Fri, 5 Nov 2021 20:44:56 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 807CA6135A for ; Fri, 5 Nov 2021 20:44:56 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 807CA6135A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 148B49400BD; Fri, 5 Nov 2021 16:44:56 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 0D3139400B3; Fri, 5 Nov 2021 16:44:56 -0400 (EDT) X-Delivered-To: 
int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id EB4FD9400BD; Fri, 5 Nov 2021 16:44:55 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0197.hostedemail.com [216.40.44.197]) by kanga.kvack.org (Postfix) with ESMTP id D594E9400B3 for ; Fri, 5 Nov 2021 16:44:55 -0400 (EDT) Received: from smtpin15.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 9D9F57533C for ; Fri, 5 Nov 2021 20:44:55 +0000 (UTC) X-FDA: 78776055750.15.69D4B18 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf01.hostedemail.com (Postfix) with ESMTP id 78FB0508FA6F for ; Fri, 5 Nov 2021 20:44:43 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id CD9426128E; Fri, 5 Nov 2021 20:44:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145094; bh=Rn5SlIwAS4wvslkdavyy2tGQShoefOyqFODzo5hvCs0=; h=Date:From:To:Subject:In-Reply-To:From; b=MOHrVBk8Ti3bIXoq4TYXNh9Agyv1l6C6A9kT7FlJQ0D2ktyDJlQqp829OeCEBU26r Jdj9Rv42MGIuNnnie5QNfBxbjWHdVCro7jgD3ar39fJ9wuXfdiLyVuEZcm7iEtsaaM 44rgduYFPIquG9gcUmkUg12RmyIKsHJw9RCFjHNs= Date: Fri, 05 Nov 2021 13:44:53 -0700 From: Andrew Morton To: akpm@linux-foundation.org, aneesh.kumar@linux.ibm.com, arnd@arndb.de, borntraeger@de.ibm.com, chenhuacai@kernel.org, david@redhat.com, ebiederm@xmission.com, geert@linux-m68k.org, gor@linux.ibm.com, hca@linux.ibm.com, Jianyong.Wu@arm.com, jiaxun.yang@flygoat.com, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, osalvador@suse.de, rppt@linux.ibm.com, shahab@synopsys.com, torvalds@linux-foundation.org, tsbogend@alpha.franken.de, vgupta@kernel.org Subject: [patch 195/262] memblock: add MEMBLOCK_DRIVER_MANAGED to mimic IORESOURCE_SYSRAM_DRIVER_MANAGED Message-ID: <20211105204453.u02-BcuF_%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> 
User-Agent: s-nail v14.8.16 Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=MOHrVBk8; dmarc=none; spf=pass (imf01.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 78FB0508FA6F X-Stat-Signature: qkc3b1nu535xwfmu1rwynguomsbfyfmo X-HE-Tag: 1636145083-569013 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: David Hildenbrand Subject: memblock: add MEMBLOCK_DRIVER_MANAGED to mimic IORESOURCE_SYSRAM_DRIVER_MANAGED Let's add a flag that corresponds to IORESOURCE_SYSRAM_DRIVER_MANAGED, indicating that we're dealing with a memory region that is never indicated in the firmware-provided memory map, but always detected and added by a driver. Similar to MEMBLOCK_HOTPLUG, most infrastructure has to treat such memory regions like ordinary MEMBLOCK_NONE memory regions -- for example, when selecting memory regions to add to the vmcore for dumping in the crashkernel via for_each_mem_range(). However, especially kexec_file is not supposed to select such memblocks via for_each_free_mem_range() / for_each_free_mem_range_reverse() to place kexec images, similar to how we handle IORESOURCE_SYSRAM_DRIVER_MANAGED without CONFIG_ARCH_KEEP_MEMBLOCK. We'll make sure that memory hotplug code sets the flag where applicable (IORESOURCE_SYSRAM_DRIVER_MANAGED) next. This prepares architectures that need CONFIG_ARCH_KEEP_MEMBLOCK, such as arm64, for virtio-mem support. Note that kexec *must not* indicate this memory to the second kernel and *must not* place kexec-images on this memory. Let's add a comment to kexec_walk_memblock(), documenting how we handle MEMBLOCK_DRIVER_MANAGED now just like using IORESOURCE_SYSRAM_DRIVER_MANAGED in locate_mem_hole_callback() for kexec_walk_resources(). 
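The flag filtering described above can be sketched as a small userspace model. This is only an illustration under stated assumptions: struct model_region, model_should_skip() and model_count_visited() are hypothetical names that do not exist in the kernel, and the model covers only the NOMAP and DRIVER_MANAGED checks, leaving out the node-id, movable_node/hotplug and mirror handling of the real should_skip_region(). The flag values are the ones from the enum in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag values as defined by enum memblock_flags in this patch. */
enum memblock_flags {
	MEMBLOCK_NONE		= 0x0,	/* No special request */
	MEMBLOCK_HOTPLUG	= 0x1,	/* hotpluggable region */
	MEMBLOCK_MIRROR		= 0x2,	/* mirrored region */
	MEMBLOCK_NOMAP		= 0x4,	/* don't add to kernel direct mapping */
	MEMBLOCK_DRIVER_MANAGED	= 0x8,	/* always detected via a driver */
};

/* Hypothetical, stripped-down stand-in for struct memblock_region. */
struct model_region {
	unsigned long long base;
	unsigned long long size;
	unsigned int flags;
};

/*
 * Simplified model of the skip decision: a region carrying NOMAP or
 * DRIVER_MANAGED is skipped unless the caller asked for that flag
 * explicitly. The real function performs additional checks.
 */
bool model_should_skip(const struct model_region *m, unsigned int flags)
{
	/* skip nomap memory unless we were asked for it explicitly */
	if (!(flags & MEMBLOCK_NOMAP) && (m->flags & MEMBLOCK_NOMAP))
		return true;

	/* skip driver-managed memory unless we were asked for it explicitly */
	if (!(flags & MEMBLOCK_DRIVER_MANAGED) &&
	    (m->flags & MEMBLOCK_DRIVER_MANAGED))
		return true;

	return false;
}

/* Count how many regions a walker using @flags would visit. */
size_t model_count_visited(const struct model_region *regions, size_t n,
			   unsigned int flags)
{
	size_t i, visited = 0;

	assert(regions != NULL || n == 0);

	for (i = 0; i < n; i++)
		if (!model_should_skip(&regions[i], flags))
			visited++;

	return visited;
}
```

In this model, a walker that passes MEMBLOCK_NONE (as kexec_walk_memblock() does) never sees a driver-managed region, while one that passes MEMBLOCK_HOTPLUG | MEMBLOCK_DRIVER_MANAGED (as for_each_mem_range() does after this patch) still includes it.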
Also note that MEMBLOCK_HOTPLUG cannot be reused due to different semantics: MEMBLOCK_HOTPLUG: memory is indicated as "System RAM" in the firmware-provided memory map and added to the system early during boot; kexec *has to* indicate this memory to the second kernel and can place kexec-images on this memory. After memory hotunplug, kexec has to be re-armed. We mostly ignore this flag when "movable_node" is not set on the kernel command line, because then we're told to not care about hotunpluggability of such memory regions. MEMBLOCK_DRIVER_MANAGED: memory is not indicated as "System RAM" in the firmware-provided memory map; this memory is always detected and added to the system by a driver; memory might not actually be physically hotunpluggable. kexec *must not* indicate this memory to the second kernel and *must not* place kexec-images on this memory. Link: https://lkml.kernel.org/r/20211004093605.5830-5-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Mike Rapoport Cc: "Aneesh Kumar K . 
V" Cc: Arnd Bergmann Cc: Christian Borntraeger Cc: Eric Biederman Cc: Geert Uytterhoeven Cc: Heiko Carstens Cc: Huacai Chen Cc: Jianyong Wu Cc: Jiaxun Yang Cc: Michal Hocko Cc: Oscar Salvador Cc: Shahab Vahedi Cc: Thomas Bogendoerfer Cc: Vasily Gorbik Cc: Vineet Gupta Signed-off-by: Andrew Morton --- include/linux/memblock.h | 16 ++++++++++++++-- kernel/kexec_file.c | 5 +++++ mm/memblock.c | 4 ++++ 3 files changed, 23 insertions(+), 2 deletions(-) --- a/include/linux/memblock.h~memblock-add-memblock_driver_managed-to-mimic-ioresource_sysram_driver_managed +++ a/include/linux/memblock.h @@ -37,12 +37,17 @@ extern unsigned long long max_possible_p * @MEMBLOCK_NOMAP: don't add to kernel direct mapping and treat as * reserved in the memory map; refer to memblock_mark_nomap() description * for further details + * @MEMBLOCK_DRIVER_MANAGED: memory region that is always detected and added + * via a driver, and never indicated in the firmware-provided memory map as + * system RAM. This corresponds to IORESOURCE_SYSRAM_DRIVER_MANAGED in the + * kernel resource tree. 
*/ enum memblock_flags { MEMBLOCK_NONE = 0x0, /* No special request */ MEMBLOCK_HOTPLUG = 0x1, /* hotpluggable region */ MEMBLOCK_MIRROR = 0x2, /* mirrored region */ MEMBLOCK_NOMAP = 0x4, /* don't add to kernel direct mapping */ + MEMBLOCK_DRIVER_MANAGED = 0x8, /* always detected via a driver */ }; /** @@ -213,7 +218,8 @@ static inline void __next_physmem_range( */ #define for_each_mem_range(i, p_start, p_end) \ __for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE, \ - MEMBLOCK_HOTPLUG, p_start, p_end, NULL) + MEMBLOCK_HOTPLUG | MEMBLOCK_DRIVER_MANAGED, \ + p_start, p_end, NULL) /** * for_each_mem_range_rev - reverse iterate through memblock areas from @@ -224,7 +230,8 @@ static inline void __next_physmem_range( */ #define for_each_mem_range_rev(i, p_start, p_end) \ __for_each_mem_range_rev(i, &memblock.memory, NULL, NUMA_NO_NODE, \ - MEMBLOCK_HOTPLUG, p_start, p_end, NULL) + MEMBLOCK_HOTPLUG | MEMBLOCK_DRIVER_MANAGED,\ + p_start, p_end, NULL) /** * for_each_reserved_mem_range - iterate over all reserved memblock areas @@ -254,6 +261,11 @@ static inline bool memblock_is_nomap(str return m->flags & MEMBLOCK_NOMAP; } +static inline bool memblock_is_driver_managed(struct memblock_region *m) +{ + return m->flags & MEMBLOCK_DRIVER_MANAGED; +} + int memblock_search_pfn_nid(unsigned long pfn, unsigned long *start_pfn, unsigned long *end_pfn); void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn, --- a/kernel/kexec_file.c~memblock-add-memblock_driver_managed-to-mimic-ioresource_sysram_driver_managed +++ a/kernel/kexec_file.c @@ -556,6 +556,11 @@ static int kexec_walk_memblock(struct ke if (kbuf->image->type == KEXEC_TYPE_CRASH) return func(&crashk_res, kbuf); + /* + * Using MEMBLOCK_NONE will properly skip MEMBLOCK_DRIVER_MANAGED. See + * IORESOURCE_SYSRAM_DRIVER_MANAGED handling in + * locate_mem_hole_callback(). 
+ */ if (kbuf->top_down) { for_each_free_mem_range_reverse(i, NUMA_NO_NODE, MEMBLOCK_NONE, &mstart, &mend, NULL) { --- a/mm/memblock.c~memblock-add-memblock_driver_managed-to-mimic-ioresource_sysram_driver_managed +++ a/mm/memblock.c @@ -982,6 +982,10 @@ static bool should_skip_region(struct me if (!(flags & MEMBLOCK_NOMAP) && memblock_is_nomap(m)) return true; + /* skip driver-managed memory unless we were asked for it explicitly */ + if (!(flags & MEMBLOCK_DRIVER_MANAGED) && memblock_is_driver_managed(m)) + return true; + return false; } From patchwork Fri Nov 5 20:44:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605793 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 530E7C433F5 for ; Fri, 5 Nov 2021 20:45:00 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id F36A26136A for ; Fri, 5 Nov 2021 20:44:59 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org F36A26136A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 7D8E59400BE; Fri, 5 Nov 2021 16:44:59 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 761EA9400B3; Fri, 5 Nov 2021 16:44:59 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5DC899400BE; Fri, 5 Nov 2021 16:44:59 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0023.hostedemail.com [216.40.44.23]) by kanga.kvack.org (Postfix) with ESMTP id 48E089400B3 for ; Fri, 5 Nov 2021 16:44:59 -0400 
(EDT) Received: from smtpin27.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 0B3B176E3C for ; Fri, 5 Nov 2021 20:44:59 +0000 (UTC) X-FDA: 78776055876.27.7EDEE05 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf01.hostedemail.com (Postfix) with ESMTP id F0B09508E4AE for ; Fri, 5 Nov 2021 20:44:46 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 63BE06135A; Fri, 5 Nov 2021 20:44:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145098; bh=3mBCB+iqeU6b/8es3ptnTWBLGO4nOm6OCH8chmvnUK4=; h=Date:From:To:Subject:In-Reply-To:From; b=zY1sf7CrJae+W5AvjBeyDnGUAiz5KQ88tSHgyyhSuMGvvBgOEPZ5d6+cVrsnYVXZt T6ydQ5De9/QSwltDlswPgGlgNAWRqTWdo8IoAOcZaM1jQjeU9ouZiJGB/LfPcKXfNk rC215vxycc2JLwZs+yRCYxmV1yhyKXQkclxTbu1U= Date: Fri, 05 Nov 2021 13:44:56 -0700 From: Andrew Morton To: akpm@linux-foundation.org, aneesh.kumar@linux.ibm.com, arnd@arndb.de, borntraeger@de.ibm.com, chenhuacai@kernel.org, david@redhat.com, ebiederm@xmission.com, geert@linux-m68k.org, gor@linux.ibm.com, hca@linux.ibm.com, Jianyong.Wu@arm.com, jiaxun.yang@flygoat.com, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, osalvador@suse.de, rppt@kernel.org, shahab@synopsys.com, torvalds@linux-foundation.org, tsbogend@alpha.franken.de, vgupta@kernel.org Subject: [patch 196/262] mm/memory_hotplug: indicate MEMBLOCK_DRIVER_MANAGED with IORESOURCE_SYSRAM_DRIVER_MANAGED Message-ID: <20211105204456.Xa1aJk5Q_%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: F0B09508E4AE X-Stat-Signature: qiafnaj931irqbayarmt9rfeuhg8wxge Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=zY1sf7Cr; spf=pass (imf01.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as 
permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145086-602225 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: David Hildenbrand Subject: mm/memory_hotplug: indicate MEMBLOCK_DRIVER_MANAGED with IORESOURCE_SYSRAM_DRIVER_MANAGED Let's communicate driver-managed regions to memblock, to properly teach kexec_file with CONFIG_ARCH_KEEP_MEMBLOCK to not place images on these memory regions. Link: https://lkml.kernel.org/r/20211004093605.5830-6-david@redhat.com Signed-off-by: David Hildenbrand Cc: "Aneesh Kumar K . V" Cc: Arnd Bergmann Cc: Christian Borntraeger Cc: Eric Biederman Cc: Geert Uytterhoeven Cc: Heiko Carstens Cc: Huacai Chen Cc: Jianyong Wu Cc: Jiaxun Yang Cc: Michal Hocko Cc: Mike Rapoport Cc: Oscar Salvador Cc: Shahab Vahedi Cc: Thomas Bogendoerfer Cc: Vasily Gorbik Cc: Vineet Gupta Signed-off-by: Andrew Morton --- mm/memory_hotplug.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) --- a/mm/memory_hotplug.c~mm-memory_hotplug-indicate-memblock_driver_managed-with-ioresource_sysram_driver_managed +++ a/mm/memory_hotplug.c @@ -1342,6 +1342,7 @@ bool mhp_supports_memmap_on_memory(unsig int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags) { struct mhp_params params = { .pgprot = pgprot_mhp(PAGE_KERNEL) }; + enum memblock_flags memblock_flags = MEMBLOCK_NONE; struct vmem_altmap mhp_altmap = {}; struct memory_group *group = NULL; u64 start, size; @@ -1370,7 +1371,9 @@ int __ref add_memory_resource(int nid, s mem_hotplug_begin(); if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) { - ret = memblock_add_node(start, size, nid, MEMBLOCK_NONE); + if (res->flags & IORESOURCE_SYSRAM_DRIVER_MANAGED) + memblock_flags = MEMBLOCK_DRIVER_MANAGED; + ret = memblock_add_node(start, size, nid, memblock_flags); if (ret) goto error_mem_hotplug_end; } From patchwork Fri Nov 5 20:45:00 2021 Content-Type: 
text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605795 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 60683C433EF for ; Fri, 5 Nov 2021 20:45:03 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 12E976136A for ; Fri, 5 Nov 2021 20:45:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 12E976136A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 8EA369400BF; Fri, 5 Nov 2021 16:45:02 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 872EE9400B3; Fri, 5 Nov 2021 16:45:02 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 73ADF9400BF; Fri, 5 Nov 2021 16:45:02 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0186.hostedemail.com [216.40.44.186]) by kanga.kvack.org (Postfix) with ESMTP id 5B1B19400B3 for ; Fri, 5 Nov 2021 16:45:02 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 2237D7533C for ; Fri, 5 Nov 2021 20:45:02 +0000 (UTC) X-FDA: 78776056044.22.2F08839 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf12.hostedemail.com (Postfix) with ESMTP id BD19510000AE for ; Fri, 5 Nov 2021 20:45:01 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id CD0916136A; Fri, 5 Nov 2021 20:45:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145101; 
bh=idL0yaFmY8TUVLDiAyvWIBsNSZtEO/ZbWdvGTnAYvG0=; h=Date:From:To:Subject:In-Reply-To:From; b=purpFdKL9jSqV02q7QhJHpwT0YC5oCFkTXmung7fTYtGKZHYzQ1pVIictLj0G/lkQ ZU5ZGPVQfGfAcMEFffMIJWv/lWkzfjtctuUFlWyNg2KqStBY/EBJwGQY+BHI/VCRfy ndP7edNlym1wYtEfn6zniON96q11mVjIKTOe5jpA= Date: Fri, 05 Nov 2021 13:45:00 -0700 From: Andrew Morton To: akpm@linux-foundation.org, apopple@nvidia.com, jglisse@redhat.com, jhubbard@nvidia.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, rcampbell@nvidia.com, torvalds@linux-foundation.org Subject: [patch 197/262] mm/rmap.c: avoid double faults migrating device private pages Message-ID: <20211105204500.E7gZ9uZtc%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf12.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=purpFdKL; dmarc=none; spf=pass (imf12.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: BD19510000AE X-Stat-Signature: c63hzkpka9r9wpz5ih9ryotaje7dsxag X-HE-Tag: 1636145101-972694 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Alistair Popple Subject: mm/rmap.c: avoid double faults migrating device private pages During migration special page table entries are installed for each page being migrated. These entries store the pfn and associated permissions of ptes mapping the page being migrated. Device-private pages use special swap pte entries to distinguish read-only vs. writeable pages which the migration code checks when creating migration entries. Normally this follows a fast path in migrate_vma_collect_pmd() which correctly copies the permissions of device-private pages over to migration entries when migrating pages back to the CPU. 
However the slow-path falls back to using try_to_migrate() which
unconditionally creates read-only migration entries for device-private
pages. This leads to unnecessary double faults on the CPU as the new
pages are always mapped read-only even when they could be mapped
writeable.

Fix this by correctly copying device-private permissions in
try_to_migrate_one().

Link: https://lkml.kernel.org/r/20211018045247.3128058-1-apopple@nvidia.com
Signed-off-by: Alistair Popple
Reported-by: Ralph Campbell
Reviewed-by: John Hubbard
Cc: Jerome Glisse
Signed-off-by: Andrew Morton
---

 mm/rmap.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/mm/rmap.c~mm-rmapc-avoid-double-faults-migrating-device-private-pages
+++ a/mm/rmap.c
@@ -1807,6 +1807,7 @@ static bool try_to_migrate_one(struct pa
 		update_hiwater_rss(mm);

 		if (is_zone_device_page(page)) {
+			unsigned long pfn = page_to_pfn(page);
 			swp_entry_t entry;
 			pte_t swp_pte;

@@ -1815,8 +1816,11 @@ static bool try_to_migrate_one(struct pa
 			 * pte. do_swap_page() will wait until the migration
 			 * pte is removed and then restart fault handling.
 			 */
-			entry = make_readable_migration_entry(
-							page_to_pfn(page));
+			entry = pte_to_swp_entry(pteval);
+			if (is_writable_device_private_entry(entry))
+				entry = make_writable_migration_entry(pfn);
+			else
+				entry = make_readable_migration_entry(pfn);
 			swp_pte = swp_entry_to_pte(entry);

 			/*

From patchwork Fri Nov 5 20:45:03 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605799
Date: Fri, 05 Nov 2021 13:45:03 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, henryburns@google.com, linmiaohe@huawei.com, linux-mm@kvack.org, minchan@kernel.org, mm-commits@vger.kernel.org, senozhatsky@chromium.org, torvalds@linux-foundation.org
Subject: [patch 198/262] mm/zsmalloc.c: close race window between zs_pool_dec_isolated() and zs_unregister_migration()
Message-ID: <20211105204503.cVz6sXG8I%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Miaohe Lin
Subject: mm/zsmalloc.c: close race window between zs_pool_dec_isolated() and zs_unregister_migration()

There is one possible race window between zs_pool_dec_isolated() and
zs_unregister_migration() because wait_for_isolated_drain() checks the
isolated count without holding class->lock and there is no ordering inside
zs_pool_dec_isolated(). Thus the race window below is possible:

zs_pool_dec_isolated                    zs_unregister_migration
  check pool->destroying != 0
                                          pool->destroying = true;
                                          smp_mb();
                                          wait_for_isolated_drain()
                                            wait for pool->isolated_pages == 0
  atomic_long_dec(&pool->isolated_pages);
  atomic_long_read(&pool->isolated_pages) == 0

Since we observe pool->destroying (false) before the atomic_long_dec() of
pool->isolated_pages, the wakeup of pool->migration_wait is missed.

Fix this by ensuring that the check of pool->destroying happens after the
atomic_long_dec(&pool->isolated_pages).

Link: https://lkml.kernel.org/r/20210708115027.7557-1-linmiaohe@huawei.com
Fixes: 701d678599d0 ("mm/zsmalloc.c: fix race condition in zs_destroy_pool")
Signed-off-by: Miaohe Lin
Cc: Minchan Kim
Cc: Sergey Senozhatsky
Cc: Henry Burns
Signed-off-by: Andrew Morton
---

 mm/zsmalloc.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/mm/zsmalloc.c~mm-zsmallocc-close-race-window-between-zs_pool_dec_isolated-and-zs_unregister_migration
+++ a/mm/zsmalloc.c
@@ -1830,10 +1830,11 @@ static inline void zs_pool_dec_isolated(
 	VM_BUG_ON(atomic_long_read(&pool->isolated_pages) <= 0);
 	atomic_long_dec(&pool->isolated_pages);
 	/*
-	 * There's no possibility of racing, since wait_for_isolated_drain()
-	 * checks the isolated count under &class->lock after enqueuing
-	 * on migration_wait.
+	 * Checking pool->destroying must happen after atomic_long_dec()
+	 * for pool->isolated_pages above. Paired with the smp_mb() in
+	 * zs_unregister_migration().
 	 */
+	smp_mb__after_atomic();
 	if (atomic_long_read(&pool->isolated_pages) == 0 && pool->destroying)
 		wake_up_all(&pool->migration_wait);
 }

From patchwork Fri Nov 5 20:45:06 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605803
Date: Fri, 05 Nov 2021 13:45:06 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, ira.weiny@intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterz@infradead.org, prathu.baronia@oneplus.com, rdunlap@infradead.org, tglx@linutronix.de, torvalds@linux-foundation.org, willy@infradead.org
Subject: [patch 199/262] mm/highmem: remove deprecated kmap_atomic
Message-ID: <20211105204506.UaIZqqzAR%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Ira Weiny
Subject: mm/highmem: remove deprecated kmap_atomic

kmap_atomic() is being deprecated in favor of kmap_local_page(). Replace
the uses of kmap_atomic() within the highmem code.

On profiling clear_huge_page() using ftrace an improvement of 62% was
observed on the below setup.

Setup:-
Below data has been collected on Qualcomm's SM7250 SoC THP enabled
(kernel v4.19.113) with only CPU-0(Cortex-A55) and CPU-7(Cortex-A76)
switched on and set to max frequency, also DDR set to perf governor.

FTRACE Data:-

Base data:-
Number of iterations: 48
Mean of allocation time: 349.5 us
std deviation: 74.5 us

v4 data:-
Number of iterations: 48
Mean of allocation time: 131 us
std deviation: 32.7 us

A simple userspace experiment that allocates 100MB (BUF_SZ) of pages and
writes to them gave us good insight: we observed an improvement of 42% in
allocation and writing timings.
-------------------------------------------------------------
Test code snippet
-------------------------------------------------------------
      clock_start();
      buf = malloc(BUF_SZ); /* Allocate 100 MB of memory */

        for(i=0; i < BUF_SZ_PAGES; i++)
        {
                *((int *)(buf + (i*PAGE_SIZE))) = 1;
        }
      clock_end();
-------------------------------------------------------------

Malloc test timings for 100MB anon allocation:-

Base data:-
Number of iterations: 100
Mean of allocation time: 31831 us
std deviation: 4286 us

v4 data:-
Number of iterations: 100
Mean of allocation time: 18193 us
std deviation: 4915 us

[willy@infradead.org: fix zero_user_segments()]
  Link: https://lkml.kernel.org/r/YYVhHCJcm2DM2G9u@casper.infradead.org
Link: https://lkml.kernel.org/r/20210204073255.20769-2-prathu.baronia@oneplus.com
Signed-off-by: Ira Weiny
Signed-off-by: Prathu Baronia
Cc: Thomas Gleixner
Cc: Matthew Wilcox
Cc: Peter Zijlstra
Cc: Randy Dunlap
Signed-off-by: Andrew Morton
---

 include/linux/highmem.h |   28 ++++++++++++++--------------
 mm/highmem.c            |    6 +++---
 2 files changed, 17 insertions(+), 17 deletions(-)

--- a/include/linux/highmem.h~mm-highmem-remove-deprecated-kmap_atomic
+++ a/include/linux/highmem.h
@@ -143,9 +143,9 @@ static inline void invalidate_kernel_vma
 #ifndef clear_user_highpage
 static inline void clear_user_highpage(struct page *page, unsigned long vaddr)
 {
-	void *addr = kmap_atomic(page);
+	void *addr = kmap_local_page(page);
 	clear_user_page(addr, vaddr, page);
-	kunmap_atomic(addr);
+	kunmap_local(addr);
 }
 #endif

@@ -177,9 +177,9 @@ alloc_zeroed_user_highpage_movable(struc

 static inline void clear_highpage(struct page *page)
 {
-	void *kaddr = kmap_atomic(page);
+	void *kaddr = kmap_local_page(page);
 	clear_page(kaddr);
-	kunmap_atomic(kaddr);
+	kunmap_local(kaddr);
 }

 #ifndef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
@@ -202,7 +202,7 @@ static inline void zero_user_segments(st
 		unsigned start1, unsigned end1,
 		unsigned start2, unsigned end2)
 {
-	void *kaddr = kmap_atomic(page);
+	void *kaddr = kmap_local_page(page);
 	unsigned int i;

 	BUG_ON(end1 > page_size(page) || end2 > page_size(page));
@@ -213,7 +213,7 @@ static inline void zero_user_segments(st
 	if (end2 > start2)
 		memset(kaddr + start2, 0, end2 - start2);

-	kunmap_atomic(kaddr);
+	kunmap_local(kaddr);
 	for (i = 0; i < compound_nr(page); i++)
 		flush_dcache_page(page + i);
 }
@@ -238,11 +238,11 @@ static inline void copy_user_highpage(st
 {
 	char *vfrom, *vto;

-	vfrom = kmap_atomic(from);
-	vto = kmap_atomic(to);
+	vfrom = kmap_local_page(from);
+	vto = kmap_local_page(to);
 	copy_user_page(vto, vfrom, vaddr, to);
-	kunmap_atomic(vto);
-	kunmap_atomic(vfrom);
+	kunmap_local(vto);
+	kunmap_local(vfrom);
 }

 #endif
@@ -253,11 +253,11 @@ static inline void copy_highpage(struct
 {
 	char *vfrom, *vto;

-	vfrom = kmap_atomic(from);
-	vto = kmap_atomic(to);
+	vfrom = kmap_local_page(from);
+	vto = kmap_local_page(to);
 	copy_page(vto, vfrom);
-	kunmap_atomic(vto);
-	kunmap_atomic(vfrom);
+	kunmap_local(vto);
+	kunmap_local(vfrom);
 }

 #endif
--- a/mm/highmem.c~mm-highmem-remove-deprecated-kmap_atomic
+++ a/mm/highmem.c
@@ -383,7 +383,7 @@ void zero_user_segments(struct page *pag
 		unsigned this_end = min_t(unsigned, end1, PAGE_SIZE);

 		if (end1 > start1) {
-			kaddr = kmap_atomic(page + i);
+			kaddr = kmap_local_page(page + i);
 			memset(kaddr + start1, 0, this_end - start1);
 		}
 		end1 -= this_end;
@@ -398,7 +398,7 @@ void zero_user_segments(struct page *pag

 		if (end2 > start2) {
 			if (!kaddr)
-				kaddr = kmap_atomic(page + i);
+				kaddr = kmap_local_page(page + i);
 			memset(kaddr + start2, 0, this_end - start2);
 		}
 		end2 -= this_end;
@@ -406,7 +406,7 @@ void zero_user_segments(struct page *pag
 	}

 	if (kaddr) {
-		kunmap_atomic(kaddr);
+		kunmap_local(kaddr);
 		flush_dcache_page(page + i);
 	}


From patchwork Fri Nov 5 20:45:09 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605797
Date: Fri, 05 Nov 2021 13:45:09 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, jaewon31.kim@samsung.com, linux-mm@kvack.org, minchan@kernel.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, ytk.lee@samsung.com
Subject: [patch 200/262] zram_drv: allow reclaim on bio_alloc
Message-ID: <20211105204509.9t6bFXpJT%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Jaewon Kim
Subject: zram_drv: allow reclaim on bio_alloc

read_from_bdev_async() is not called in atomic context, so GFP_NOIO can
be used instead of GFP_ATOMIC. With GFP_NOIO, if there are reclaimable
pages, we can avoid allocation failure and page fault failure.

Link: https://lkml.kernel.org/r/20210908005241.28062-1-jaewon31.kim@samsung.com
Signed-off-by: Jaewon Kim
Reported-by: Yong-Taek Lee
Acked-by: Minchan Kim
Signed-off-by: Andrew Morton
---

 drivers/block/zram/zram_drv.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/block/zram/zram_drv.c~zram_drv-allow-reclaim-on-bio_alloc
+++ a/drivers/block/zram/zram_drv.c
@@ -587,7 +587,7 @@ static int read_from_bdev_async(struct z
 {
 	struct bio *bio;

-	bio = bio_alloc(GFP_ATOMIC, 1);
+	bio = bio_alloc(GFP_NOIO, 1);
 	if (!bio)
 		return -ENOMEM;


From patchwork Fri Nov 5 20:45:12 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605801
Date: Fri, 05 Nov 2021 13:45:12 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, dan.carpenter@oracle.com, linux-mm@kvack.org, minchan@kernel.org, mm-commits@vger.kernel.org, senozhatsky@chromium.org, torvalds@linux-foundation.org
Subject: [patch 201/262] zram: off by one in read_block_state()
Message-ID: <20211105204512.OEBIB2O7B%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Dan Carpenter
Subject: zram: off by one in read_block_state()

snprintf() returns the number of bytes it would have printed if there
were enough space, not counting the NUL terminator. That means that if
"count == copied" then this has already overflowed by one character.

This bug likely isn't super harmful in real life.

Link: https://lkml.kernel.org/r/20210916130404.GA25094@kili
Fixes: c0265342bff4 ("zram: introduce zram memory tracking")
Signed-off-by: Dan Carpenter
Cc: Minchan Kim
Cc: Sergey Senozhatsky
Signed-off-by: Andrew Morton
---

 drivers/block/zram/zram_drv.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/block/zram/zram_drv.c~zram-off-by-one-in-read_block_state
+++ a/drivers/block/zram/zram_drv.c
@@ -910,7 +910,7 @@ static ssize_t read_block_state(struct f
 			zram_test_flag(zram, index, ZRAM_HUGE) ? 'h' : '.',
 			zram_test_flag(zram, index, ZRAM_IDLE) ? 'i' : '.');

-		if (count < copied) {
+		if (count <= copied) {
 			zram_slot_unlock(zram, index);
 			break;
 		}

From patchwork Fri Nov 5 20:45:15 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605805
Date: Fri, 05 Nov 2021 13:45:15 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, bgeffon@google.com, corbet@lwn.net, jsbarnes@google.com, linux-mm@kvack.org, minchan@kernel.org, mm-commits@vger.kernel.org, ngupta@vflare.org, senozhatsky@chromium.org, suleiman@google.com, torvalds@linux-foundation.org
Subject: [patch 202/262] zram: introduce an aged idle interface
Message-ID: <20211105204515.5k1FyKyar%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Brian Geffon
Subject: zram: introduce an aged idle interface

This change introduces an aged idle interface to the existing idle sysfs
file for zram.

When CONFIG_ZRAM_MEMORY_TRACKING is enabled, the idle file now also
accepts an integer argument. This integer is the age (in seconds) of
pages to mark as idle. The idle file still supports 'all' as it always
has. This new approach allows for much more control over which pages get
marked as idle.

[bgeffon@google.com: use IS_ENABLED and cleanup comment]
  Link: https://lkml.kernel.org/r/20210924161128.1508015-1-bgeffon@google.com
[bgeffon@google.com: Sergey's cleanup suggestions]
  Link: https://lkml.kernel.org/r/20210929143056.13067-1-bgeffon@google.com
Link: https://lkml.kernel.org/r/20210923130115.1344361-1-bgeffon@google.com
Signed-off-by: Brian Geffon
Acked-by: Minchan Kim
Reviewed-by: Sergey Senozhatsky
Cc: Nitin Gupta
Cc: Jonathan Corbet
Cc: Suleiman Souhlal
Cc: Jesse Barnes
Signed-off-by: Andrew Morton
---

 Documentation/admin-guide/blockdev/zram.rst |    8 ++
 drivers/block/zram/zram_drv.c               |   62 +++++++++++++-----
 2 files changed, 54 insertions(+), 16 deletions(-)

--- a/Documentation/admin-guide/blockdev/zram.rst~zram-introduce-an-aged-idle-interface
+++ a/Documentation/admin-guide/blockdev/zram.rst
@@ -328,6 +328,14 @@ as idle::
 From now on, any pages on zram are idle pages. The idle mark
 will be removed until someone requests access of the block.
 IOW, unless there is access request, those pages are still idle pages.
+Additionally, when CONFIG_ZRAM_MEMORY_TRACKING is enabled pages can be
+marked as idle based on how long (in seconds) it's been since they were
+last accessed::
+
+	echo 86400 > /sys/block/zramX/idle
+
+In this example all pages which haven't been accessed in more than 86400
+seconds (one day) will be marked idle.

 Admin can request writeback of those idle pages at right timing via::

--- a/drivers/block/zram/zram_drv.c~zram-introduce-an-aged-idle-interface
+++ a/drivers/block/zram/zram_drv.c
@@ -291,22 +291,16 @@ static ssize_t mem_used_max_store(struct
 	return len;
 }

-static ssize_t idle_store(struct device *dev,
-		struct device_attribute *attr, const char *buf, size_t len)
+/*
+ * Mark all pages which are older than or equal to cutoff as IDLE.
+ * Callers should hold the zram init lock in read mode
+ */
+static void mark_idle(struct zram *zram, ktime_t cutoff)
 {
-	struct zram *zram = dev_to_zram(dev);
+	int is_idle = 1;
 	unsigned long nr_pages = zram->disksize >> PAGE_SHIFT;
 	int index;

-	if (!sysfs_streq(buf, "all"))
-		return -EINVAL;
-
-	down_read(&zram->init_lock);
-	if (!init_done(zram)) {
-		up_read(&zram->init_lock);
-		return -EINVAL;
-	}
-
 	for (index = 0; index < nr_pages; index++) {
 		/*
 		 * Do not mark ZRAM_UNDER_WB slot as ZRAM_IDLE to close race.
@@ -314,14 +308,50 @@ static ssize_t idle_store(struct device
 		 */
 		zram_slot_lock(zram, index);
 		if (zram_allocated(zram, index) &&
-				!zram_test_flag(zram, index, ZRAM_UNDER_WB))
-			zram_set_flag(zram, index, ZRAM_IDLE);
+				!zram_test_flag(zram, index, ZRAM_UNDER_WB)) {
+#ifdef CONFIG_ZRAM_MEMORY_TRACKING
+			is_idle = !cutoff || ktime_after(cutoff, zram->table[index].ac_time);
+#endif
+			if (is_idle)
+				zram_set_flag(zram, index, ZRAM_IDLE);
+		}
 		zram_slot_unlock(zram, index);
 	}
+}

-	up_read(&zram->init_lock);
+static ssize_t idle_store(struct device *dev,
+		struct device_attribute *attr, const char *buf, size_t len)
+{
+	struct zram *zram = dev_to_zram(dev);
+	ktime_t cutoff_time = 0;
+	ssize_t rv = -EINVAL;

-	return len;
+	if (!sysfs_streq(buf, "all")) {
+		/*
+		 * If it did not parse as 'all' try to treat it as an integer when
+		 * we have memory tracking enabled.
+		 */
+		u64 age_sec;
+
+		if (IS_ENABLED(CONFIG_ZRAM_MEMORY_TRACKING) && !kstrtoull(buf, 0, &age_sec))
+			cutoff_time = ktime_sub(ktime_get_boottime(),
+					ns_to_ktime(age_sec * NSEC_PER_SEC));
+		else
+			goto out;
+	}
+
+	down_read(&zram->init_lock);
+	if (!init_done(zram))
+		goto out_unlock;
+
+	/* A cutoff_time of 0 marks everything as idle, this is the "all" behavior */
+	mark_idle(zram, cutoff_time);
+	rv = len;
+
+out_unlock:
+	up_read(&zram->init_lock);
+out:
+	return rv;
 }

 #ifdef CONFIG_ZRAM_WRITEBACK

From patchwork Fri Nov 5 20:45:18 2021
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605807
Date: Fri, 05 Nov 2021 13:45:18 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, cl@linux.com, iamjoonsoo.kim@lge.com, jmorris@namei.org, joel@jms.id.au, keescook@chromium.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, penberg@kernel.org, rientjes@google.com, serge@hallyn.com, steve@sk2.org, torvalds@linux-foundation.org, vbabka@suse.cz
Subject: [patch 203/262] mm: remove HARDENED_USERCOPY_FALLBACK
Message-ID: <20211105204518._op8lD-Ug%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>

From: Stephen Kitt
Subject: mm: remove HARDENED_USERCOPY_FALLBACK

This has served its purpose and is no longer used. All usercopy
violations appear to have been handled by now; any remaining instances
(or new bugs) will cause copies to be rejected.

This isn't a direct revert of commit 2d891fbc3bb6 ("usercopy: Allow
strict enforcement of whitelists"); since usercopy_fallback is
effectively 0, the fallback handling is removed too.

This also removes the usercopy_fallback module parameter on slab_common.

Link: https://github.com/KSPP/linux/issues/153
Link: https://lkml.kernel.org/r/20210921061149.1091163-1-steve@sk2.org
Signed-off-by: Stephen Kitt
Suggested-by: Kees Cook
Acked-by: Kees Cook
Reviewed-by: Joel Stanley [defconfig change]
Acked-by: David Rientjes
Cc: Christoph Lameter
Cc: Pekka Enberg
Cc: Joonsoo Kim
Cc: Vlastimil Babka
Cc: James Morris
Cc: "Serge E. Hallyn"
Signed-off-by: Andrew Morton
---

 arch/powerpc/configs/skiroot_defconfig |    1 -
 include/linux/slab.h                   |    2 --
 mm/slab.c                              |   13 -------------
 mm/slab_common.c                       |    8 --------
 mm/slub.c                              |   14 --------------
 security/Kconfig                       |   14 --------------
 6 files changed, 52 deletions(-)

--- a/arch/powerpc/configs/skiroot_defconfig~mm-remove-hardened_usercopy_fallback
+++ a/arch/powerpc/configs/skiroot_defconfig
@@ -275,7 +275,6 @@ CONFIG_NLS_UTF8=y
 CONFIG_ENCRYPTED_KEYS=y
 CONFIG_SECURITY=y
 CONFIG_HARDENED_USERCOPY=y
-# CONFIG_HARDENED_USERCOPY_FALLBACK is not set
 CONFIG_HARDENED_USERCOPY_PAGESPAN=y
 CONFIG_FORTIFY_SOURCE=y
 CONFIG_SECURITY_LOCKDOWN_LSM=y
--- a/include/linux/slab.h~mm-remove-hardened_usercopy_fallback
+++ a/include/linux/slab.h
@@ -142,8 +142,6 @@ struct mem_cgroup;
 void __init kmem_cache_init(void);
 bool slab_is_available(void);

-extern bool usercopy_fallback;
-
 struct kmem_cache *kmem_cache_create(const char *name, unsigned int size,
 			unsigned int align, slab_flags_t flags,
 			void (*ctor)(void *));
--- a/mm/slab.c~mm-remove-hardened_usercopy_fallback
+++ a/mm/slab.c
@@ -4204,19 +4204,6 @@ void __check_heap_object(const void *ptr
 	    n <= cachep->useroffset - offset + cachep->usersize)
 		return;

-	/*
-	 * If the copy is still within the allocated object, produce
-	 * a warning instead of rejecting the copy. This is intended
-	 * to be a temporary method to find any missing usercopy
-	 * whitelists.
- */ - if (usercopy_fallback && - offset <= cachep->object_size && - n <= cachep->object_size - offset) { - usercopy_warn("SLAB object", cachep->name, to_user, offset, n); - return; - } - usercopy_abort("SLAB object", cachep->name, to_user, offset, n); } #endif /* CONFIG_HARDENED_USERCOPY */ --- a/mm/slab_common.c~mm-remove-hardened_usercopy_fallback +++ a/mm/slab_common.c @@ -37,14 +37,6 @@ LIST_HEAD(slab_caches); DEFINE_MUTEX(slab_mutex); struct kmem_cache *kmem_cache; -#ifdef CONFIG_HARDENED_USERCOPY -bool usercopy_fallback __ro_after_init = - IS_ENABLED(CONFIG_HARDENED_USERCOPY_FALLBACK); -module_param(usercopy_fallback, bool, 0400); -MODULE_PARM_DESC(usercopy_fallback, - "WARN instead of reject usercopy whitelist violations"); -#endif - static LIST_HEAD(slab_caches_to_rcu_destroy); static void slab_caches_to_rcu_destroy_workfn(struct work_struct *work); static DECLARE_WORK(slab_caches_to_rcu_destroy_work, --- a/mm/slub.c~mm-remove-hardened_usercopy_fallback +++ a/mm/slub.c @@ -4489,7 +4489,6 @@ void __check_heap_object(const void *ptr { struct kmem_cache *s; unsigned int offset; - size_t object_size; bool is_kfence = is_kfence_address(ptr); ptr = kasan_reset_tag(ptr); @@ -4522,19 +4521,6 @@ void __check_heap_object(const void *ptr n <= s->useroffset - offset + s->usersize) return; - /* - * If the copy is still within the allocated object, produce - * a warning instead of rejecting the copy. This is intended - * to be a temporary method to find any missing usercopy - * whitelists. - */ - object_size = slab_ksize(s); - if (usercopy_fallback && - offset <= object_size && n <= object_size - offset) { - usercopy_warn("SLUB object", s->name, to_user, offset, n); - return; - } - usercopy_abort("SLUB object", s->name, to_user, offset, n); } #endif /* CONFIG_HARDENED_USERCOPY */ --- a/security/Kconfig~mm-remove-hardened_usercopy_fallback +++ a/security/Kconfig @@ -163,20 +163,6 @@ config HARDENED_USERCOPY or are part of the kernel text. 
This kills entire classes of heap overflow exploits and similar kernel memory exposures. -config HARDENED_USERCOPY_FALLBACK - bool "Allow usercopy whitelist violations to fallback to object size" - depends on HARDENED_USERCOPY - default y - help - This is a temporary option that allows missing usercopy whitelists - to be discovered via a WARN() to the kernel log, instead of - rejecting the copy, falling back to non-whitelisted hardened - usercopy that checks the slab allocation size instead of the - whitelist size. This option will be removed once it seems like - all missing usercopy whitelists have been identified and fixed. - Booting with "slab_common.usercopy_fallback=Y/N" can change - this setting. - config HARDENED_USERCOPY_PAGESPAN bool "Refuse to copy allocations that span multiple pages" depends on HARDENED_USERCOPY From patchwork Fri Nov 5 20:45:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605811 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 405B6C433F5 for ; Fri, 5 Nov 2021 20:45:30 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id E5E2B6128E for ; Fri, 5 Nov 2021 20:45:29 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org E5E2B6128E Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 652829400C7; Fri, 5 Nov 2021 16:45:29 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 5B3F09400C1; Fri, 5 Nov 2021 16:45:29 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org 
(Postfix, from userid 63042) id 455F29400C7; Fri, 5 Nov 2021 16:45:29 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0212.hostedemail.com [216.40.44.212]) by kanga.kvack.org (Postfix) with ESMTP id 2F90B9400C1 for ; Fri, 5 Nov 2021 16:45:29 -0400 (EDT) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id E44991856A8E6 for ; Fri, 5 Nov 2021 20:45:28 +0000 (UTC) X-FDA: 78776057262.05.322E27D Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf08.hostedemail.com (Postfix) with ESMTP id 91BA130000AA for ; Fri, 5 Nov 2021 20:45:16 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 47A2761362; Fri, 5 Nov 2021 20:45:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145122; bh=tcauLH4ijKH/o2arIcCMmtzCCODlg33d8vndq8P6M8Q=; h=Date:From:To:Subject:In-Reply-To:From; b=lCRW1DGlPoD+eHOIJ2FbuqBVVl2vxXXsGyUEE3F2AOkGKDAQCix7guSOpkRYY2MBP lGnd5MitopV48RxohUejxhgQSmWJwqM6ysVSjaZ54VPyOBvSIskklSbcIwqogOEYBZ ypDgK30GIWSmVhl6LQ9zbnUzSfyDhFkPKdiKtJYk= Date: Fri, 05 Nov 2021 13:45:21 -0700 From: Andrew Morton To: akpm@linux-foundation.org, davem@davemloft.net, horms@verge.net.au, kuba@kernel.org, linux-mm@kvack.org, liumh1@shanghaitech.edu.cn, marcelo.leitner@gmail.com, mm-commits@vger.kernel.org, pshelar@ovn.org, torvalds@linux-foundation.org, ulf.hansson@linaro.org, vyasevich@gmail.com, willy@infradead.org Subject: [patch 204/262] include/linux/mm.h: move nr_free_buffer_pages from swap.h to mm.h Message-ID: <20211105204521.CXrs5KDI0%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=lCRW1DGl; dmarc=none; spf=pass (imf08.hostedemail.com: domain of akpm@linux-foundation.org designates 
198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 91BA130000AA X-Stat-Signature: mcn7nnqpjcs3f6r3e6f8ht1ck9topnqs X-HE-Tag: 1636145116-922699 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Mianhan Liu Subject: include/linux/mm.h: move nr_free_buffer_pages from swap.h to mm.h nr_free_buffer_pages could be exposed through mm.h instead of swap.h. The advantage of this change is that it can reduce the obsolete includes. For example, net/ipv4/tcp.c wouldn't need swap.h any more since it has already included mm.h. Similarly, after checking all the other files, it turns out that tcp.c, udp.c, meter.c, ... follow the same rule, so these files can have swap.h removed too. Moreover, after preprocessing all the files that use nr_free_buffer_pages, it turns out that those files have already included mm.h. Thus, we can move nr_free_buffer_pages from swap.h to mm.h safely. This change will not affect the compilation of other files. Link: https://lkml.kernel.org/r/20210912133640.1624-1-liumh1@shanghaitech.edu.cn Signed-off-by: Mianhan Liu Cc: Jakub Kicinski CC: Ulf Hansson Cc: "David S .
Miller" Cc: Simon Horman Cc: Pravin B Shelar Cc: Vlad Yasevich Cc: Marcelo Ricardo Leitner Cc: Matthew Wilcox Signed-off-by: Andrew Morton --- drivers/mmc/core/mmc_test.c | 1 - include/linux/mm.h | 2 ++ include/linux/swap.h | 1 - net/ipv4/tcp.c | 1 - net/ipv4/udp.c | 1 - net/netfilter/ipvs/ip_vs_ctl.c | 1 - net/openvswitch/meter.c | 1 - net/sctp/protocol.c | 1 - 8 files changed, 2 insertions(+), 7 deletions(-) --- a/drivers/mmc/core/mmc_test.c~include-linux-mmh-move-nr_free_buffer_pages-from-swaph-to-mmh +++ a/drivers/mmc/core/mmc_test.c @@ -10,7 +10,6 @@ #include #include -#include /* For nr_free_buffer_pages() */ #include #include --- a/include/linux/mm.h~include-linux-mmh-move-nr_free_buffer_pages-from-swaph-to-mmh +++ a/include/linux/mm.h @@ -875,6 +875,8 @@ void put_pages_list(struct list_head *pa void split_page(struct page *page, unsigned int order); void copy_huge_page(struct page *dst, struct page *src); +unsigned long nr_free_buffer_pages(void); + /* * Compound pages have a destructor function. Provide a * prototype for that function and accessor functions. 
--- a/include/linux/swap.h~include-linux-mmh-move-nr_free_buffer_pages-from-swaph-to-mmh +++ a/include/linux/swap.h @@ -335,7 +335,6 @@ void workingset_update_node(struct xa_no /* linux/mm/page_alloc.c */ extern unsigned long totalreserve_pages; -extern unsigned long nr_free_buffer_pages(void); /* Definition of global_zone_page_state not available yet */ #define nr_free_pages() global_zone_page_state(NR_FREE_PAGES) --- a/net/ipv4/tcp.c~include-linux-mmh-move-nr_free_buffer_pages-from-swaph-to-mmh +++ a/net/ipv4/tcp.c @@ -260,7 +260,6 @@ #include #include #include -#include #include #include #include --- a/net/ipv4/udp.c~include-linux-mmh-move-nr_free_buffer_pages-from-swaph-to-mmh +++ a/net/ipv4/udp.c @@ -78,7 +78,6 @@ #include #include #include -#include #include #include #include --- a/net/netfilter/ipvs/ip_vs_ctl.c~include-linux-mmh-move-nr_free_buffer_pages-from-swaph-to-mmh +++ a/net/netfilter/ipvs/ip_vs_ctl.c @@ -24,7 +24,6 @@ #include #include #include -#include #include #include --- a/net/openvswitch/meter.c~include-linux-mmh-move-nr_free_buffer_pages-from-swaph-to-mmh +++ a/net/openvswitch/meter.c @@ -12,7 +12,6 @@ #include #include #include -#include #include #include --- a/net/sctp/protocol.c~include-linux-mmh-move-nr_free_buffer_pages-from-swaph-to-mmh +++ a/net/sctp/protocol.c @@ -33,7 +33,6 @@ #include #include #include -#include #include #include #include From patchwork Fri Nov 5 20:45:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605809 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2694CC433F5 for ; Fri, 5 Nov 2021 20:45:28 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id CF24D6136A for ; 
Fri, 5 Nov 2021 20:45:27 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org CF24D6136A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 605319400C6; Fri, 5 Nov 2021 16:45:27 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 58C3D9400C1; Fri, 5 Nov 2021 16:45:27 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 456879400C6; Fri, 5 Nov 2021 16:45:27 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0124.hostedemail.com [216.40.44.124]) by kanga.kvack.org (Postfix) with ESMTP id 2E3399400C1 for ; Fri, 5 Nov 2021 16:45:27 -0400 (EDT) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id E34D01852461B for ; Fri, 5 Nov 2021 20:45:26 +0000 (UTC) X-FDA: 78776057052.21.9A2A497 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf11.hostedemail.com (Postfix) with ESMTP id 779ADF0000BD for ; Fri, 5 Nov 2021 20:45:26 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 76A1961288; Fri, 5 Nov 2021 20:45:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145125; bh=t9AJWdeuoV/panzK5rsoQ9jN8YuQ9jkqzEFSK7xh5t8=; h=Date:From:To:Subject:In-Reply-To:From; b=BEWZUCgSgOhTnoAUlT/WYeDNlcRfK40hFn1b4VhmggJodfXiVU5cqqCe7dWAUMRnG x9vASEibGgovpQTxyfuRm2GidKJghux3BvIl2W+6XhGNUQ581gBJ6u+cCW7YcI9LKw BoKq1C6i76NY1sJU1poQvi1y76UTW3B5bVSnwdvY= Date: Fri, 05 Nov 2021 13:45:25 -0700 From: Andrew Morton To: akpm@linux-foundation.org, dvyukov@google.com, elver@google.com, glider@google.com, jannh@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, nogikh@google.com, tarasmadan@google.com, torvalds@linux-foundation.org 
Subject: [patch 205/262] stacktrace: move filter_irq_stacks() to kernel/stacktrace.c Message-ID: <20211105204525.6a059SKtu%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 779ADF0000BD X-Stat-Signature: xenr4xnyne5fz3811h6aphnjgad3d9xd Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=BEWZUCgS; spf=pass (imf11.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145126-160026 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Marco Elver Subject: stacktrace: move filter_irq_stacks() to kernel/stacktrace.c filter_irq_stacks() has little to do with the stackdepot implementation, except that it is usually used by users (such as KASAN) of stackdepot to reduce the stack trace. However, filter_irq_stacks() itself is not useful without a stack trace as obtained by stack_trace_save() and friends. Therefore, move filter_irq_stacks() to kernel/stacktrace.c, so that new users of filter_irq_stacks() do not have to start depending on STACKDEPOT only for filter_irq_stacks(). 
Link: https://lkml.kernel.org/r/20210923104803.2620285-1-elver@google.com Signed-off-by: Marco Elver Acked-by: Dmitry Vyukov Cc: Alexander Potapenko Cc: Jann Horn Cc: Aleksandr Nogikh Cc: Taras Madan Signed-off-by: Andrew Morton --- include/linux/stackdepot.h | 2 -- include/linux/stacktrace.h | 1 + kernel/stacktrace.c | 30 ++++++++++++++++++++++++++++++ lib/stackdepot.c | 24 ------------------------ 4 files changed, 31 insertions(+), 26 deletions(-) --- a/include/linux/stackdepot.h~stacktrace-move-filter_irq_stacks-to-kernel-stacktracec +++ a/include/linux/stackdepot.h @@ -25,8 +25,6 @@ depot_stack_handle_t stack_depot_save(un unsigned int stack_depot_fetch(depot_stack_handle_t handle, unsigned long **entries); -unsigned int filter_irq_stacks(unsigned long *entries, unsigned int nr_entries); - #ifdef CONFIG_STACKDEPOT int stack_depot_init(void); #else --- a/include/linux/stacktrace.h~stacktrace-move-filter_irq_stacks-to-kernel-stacktracec +++ a/include/linux/stacktrace.h @@ -21,6 +21,7 @@ unsigned int stack_trace_save_tsk(struct unsigned int stack_trace_save_regs(struct pt_regs *regs, unsigned long *store, unsigned int size, unsigned int skipnr); unsigned int stack_trace_save_user(unsigned long *store, unsigned int size); +unsigned int filter_irq_stacks(unsigned long *entries, unsigned int nr_entries); /* Internal interfaces. 
Do not use in generic code */ #ifdef CONFIG_ARCH_STACKWALK --- a/kernel/stacktrace.c~stacktrace-move-filter_irq_stacks-to-kernel-stacktracec +++ a/kernel/stacktrace.c @@ -13,6 +13,7 @@ #include #include #include +#include /** * stack_trace_print - Print the entries in the stack trace @@ -373,3 +374,32 @@ unsigned int stack_trace_save_user(unsig #endif /* CONFIG_USER_STACKTRACE_SUPPORT */ #endif /* !CONFIG_ARCH_STACKWALK */ + +static inline bool in_irqentry_text(unsigned long ptr) +{ + return (ptr >= (unsigned long)&__irqentry_text_start && + ptr < (unsigned long)&__irqentry_text_end) || + (ptr >= (unsigned long)&__softirqentry_text_start && + ptr < (unsigned long)&__softirqentry_text_end); +} + +/** + * filter_irq_stacks - Find first IRQ stack entry in trace + * @entries: Pointer to stack trace array + * @nr_entries: Number of entries in the storage array + * + * Return: Number of trace entries until IRQ stack starts. + */ +unsigned int filter_irq_stacks(unsigned long *entries, unsigned int nr_entries) +{ + unsigned int i; + + for (i = 0; i < nr_entries; i++) { + if (in_irqentry_text(entries[i])) { + /* Include the irqentry function into the stack. 
*/ + return i + 1; + } + } + return nr_entries; +} +EXPORT_SYMBOL_GPL(filter_irq_stacks); --- a/lib/stackdepot.c~stacktrace-move-filter_irq_stacks-to-kernel-stacktracec +++ a/lib/stackdepot.c @@ -20,7 +20,6 @@ */ #include -#include #include #include #include @@ -371,26 +370,3 @@ depot_stack_handle_t stack_depot_save(un return __stack_depot_save(entries, nr_entries, alloc_flags, true); } EXPORT_SYMBOL_GPL(stack_depot_save); - -static inline int in_irqentry_text(unsigned long ptr) -{ - return (ptr >= (unsigned long)&__irqentry_text_start && - ptr < (unsigned long)&__irqentry_text_end) || - (ptr >= (unsigned long)&__softirqentry_text_start && - ptr < (unsigned long)&__softirqentry_text_end); -} - -unsigned int filter_irq_stacks(unsigned long *entries, - unsigned int nr_entries) -{ - unsigned int i; - - for (i = 0; i < nr_entries; i++) { - if (in_irqentry_text(entries[i])) { - /* Include the irqentry function into the stack. */ - return i + 1; - } - } - return nr_entries; -} -EXPORT_SYMBOL_GPL(filter_irq_stacks); From patchwork Fri Nov 5 20:45:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605813 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id ADC33C433EF for ; Fri, 5 Nov 2021 20:45:32 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 618DD61288 for ; Fri, 5 Nov 2021 20:45:32 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 618DD61288 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 4F3189400C8; Fri, 5 Nov 2021 16:45:30 -0400 (EDT) 
Received: by kanga.kvack.org (Postfix, from userid 40) id 458289400C1; Fri, 5 Nov 2021 16:45:30 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2A9179400C8; Fri, 5 Nov 2021 16:45:30 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0096.hostedemail.com [216.40.44.96]) by kanga.kvack.org (Postfix) with ESMTP id 0F0AD9400C1 for ; Fri, 5 Nov 2021 16:45:30 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id C5ED48249980 for ; Fri, 5 Nov 2021 20:45:29 +0000 (UTC) X-FDA: 78776057178.22.E36F9AC Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf31.hostedemail.com (Postfix) with ESMTP id 29EEF104AAE9 for ; Fri, 5 Nov 2021 20:45:21 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 86E746136F; Fri, 5 Nov 2021 20:45:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145128; bh=YW4baP5h7WgOWo6IqXQTCxde/fSUi9MZWni9ucF3ZcU=; h=Date:From:To:Subject:In-Reply-To:From; b=2kUUrmnjmqQgrniWzu+3YhFDF0ATUZiYIzMeVhFTeIydpNwBUHYooH2emrOoWGhuF hSjtKE8Yt24TuOIeFR6Biacw9W1jIRn3T+PnV480CHLnsYFgYWBLUF9WY58DTf305q n40S2ZfcImRyUuyyVKR8F1l7jvvfucXnMkDAMKDU= Date: Fri, 05 Nov 2021 13:45:28 -0700 From: Andrew Morton To: akpm@linux-foundation.org, dvyukov@google.com, elver@google.com, glider@google.com, jannh@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, nogikh@google.com, tarasmadan@google.com, torvalds@linux-foundation.org Subject: [patch 206/262] kfence: count unexpectedly skipped allocations Message-ID: <20211105204528.1mF27G-Uh%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 29EEF104AAE9 X-Stat-Signature: 9esa33o13puabk8g745d3nt84g1xuejd Authentication-Results: 
imf31.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=2kUUrmnj; dmarc=none; spf=pass (imf31.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636145121-525739 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Marco Elver Subject: kfence: count unexpectedly skipped allocations Maintain a counter to count allocations that are skipped due to being incompatible (oversized, incompatible gfp flags) or no capacity. This is to compute the fraction of allocations that could not be serviced by KFENCE, which we expect to be rare. Link: https://lkml.kernel.org/r/20210923104803.2620285-2-elver@google.com Signed-off-by: Marco Elver Reviewed-by: Dmitry Vyukov Acked-by: Alexander Potapenko Cc: Aleksandr Nogikh Cc: Jann Horn Cc: Taras Madan Signed-off-by: Andrew Morton --- mm/kfence/core.c | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-) --- a/mm/kfence/core.c~kfence-count-unexpectedly-skipped-allocations +++ a/mm/kfence/core.c @@ -112,6 +112,8 @@ enum kfence_counter_id { KFENCE_COUNTER_FREES, KFENCE_COUNTER_ZOMBIES, KFENCE_COUNTER_BUGS, + KFENCE_COUNTER_SKIP_INCOMPAT, + KFENCE_COUNTER_SKIP_CAPACITY, KFENCE_COUNTER_COUNT, }; static atomic_long_t counters[KFENCE_COUNTER_COUNT]; @@ -121,6 +123,8 @@ static const char *const counter_names[] [KFENCE_COUNTER_FREES] = "total frees", [KFENCE_COUNTER_ZOMBIES] = "zombie allocations", [KFENCE_COUNTER_BUGS] = "total bugs", + [KFENCE_COUNTER_SKIP_INCOMPAT] = "skipped allocations (incompatible)", + [KFENCE_COUNTER_SKIP_CAPACITY] = "skipped allocations (capacity)", }; static_assert(ARRAY_SIZE(counter_names) == KFENCE_COUNTER_COUNT); @@ -271,8 +275,10 @@ static void *kfence_guarded_alloc(struct list_del_init(&meta->list); } raw_spin_unlock_irqrestore(&kfence_freelist_lock, flags); 
- if (!meta) + if (!meta) { + atomic_long_inc(&counters[KFENCE_COUNTER_SKIP_CAPACITY]); return NULL; + } if (unlikely(!raw_spin_trylock_irqsave(&meta->lock, flags))) { /* @@ -740,8 +746,10 @@ void *__kfence_alloc(struct kmem_cache * * Perform size check before switching kfence_allocation_gate, so that * we don't disable KFENCE without making an allocation. */ - if (size > PAGE_SIZE) + if (size > PAGE_SIZE) { + atomic_long_inc(&counters[KFENCE_COUNTER_SKIP_INCOMPAT]); return NULL; + } /* * Skip allocations from non-default zones, including DMA. We cannot @@ -749,8 +757,10 @@ void *__kfence_alloc(struct kmem_cache * * properties (e.g. reside in DMAable memory). */ if ((flags & GFP_ZONEMASK) || - (s->flags & (SLAB_CACHE_DMA | SLAB_CACHE_DMA32))) + (s->flags & (SLAB_CACHE_DMA | SLAB_CACHE_DMA32))) { + atomic_long_inc(&counters[KFENCE_COUNTER_SKIP_INCOMPAT]); return NULL; + } /* * allocation_gate only needs to become non-zero, so it doesn't make From patchwork Fri Nov 5 20:45:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605815 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DBA50C433EF for ; Fri, 5 Nov 2021 20:45:34 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 91B7861357 for ; Fri, 5 Nov 2021 20:45:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 91B7861357 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 877D99400C9; Fri, 5 Nov 2021 16:45:33 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 7D6CF9400C1; 
Fri, 5 Nov 2021 16:45:33 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 652229400C9; Fri, 5 Nov 2021 16:45:33 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0068.hostedemail.com [216.40.44.68]) by kanga.kvack.org (Postfix) with ESMTP id 4D9F69400C1 for ; Fri, 5 Nov 2021 16:45:33 -0400 (EDT) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 151E08249980 for ; Fri, 5 Nov 2021 20:45:33 +0000 (UTC) X-FDA: 78776057430.05.B5D06AE Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf29.hostedemail.com (Postfix) with ESMTP id 9A4229000258 for ; Fri, 5 Nov 2021 20:45:32 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 9D3056128E; Fri, 5 Nov 2021 20:45:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145131; bh=DRG75zfE3378R4xgpBv2EVNG7Ikk13xTsEpPXU3L7LE=; h=Date:From:To:Subject:In-Reply-To:From; b=jKXZOiYmZlTzbeC5jGTbVUzb906ytKBpBVkmCzwGwj0pylIAUMTfSCPb4NS5yovjU EUM/41TmAiPYXPWsnbo6bjcAQMLKAaTm8Ha2xfUYNfuKXzJSnGtpqlVdK5MuNW9ILn AzpDWrLqK6DvDtFNb80LQA5jFJrWC/+kSBxAjPLs= Date: Fri, 05 Nov 2021 13:45:31 -0700 From: Andrew Morton To: akpm@linux-foundation.org, dvyukov@google.com, elver@google.com, glider@google.com, jannh@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, nogikh@google.com, tarasmadan@google.com, torvalds@linux-foundation.org Subject: [patch 207/262] kfence: move saving stack trace of allocations into __kfence_alloc() Message-ID: <20211105204531.sPcIqaPpM%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=jKXZOiYm; dmarc=none; spf=pass (imf29.hostedemail.com: domain of 
akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 9A4229000258 X-Stat-Signature: 57ojhqp13wntfu537ksit5xjraba5gct X-HE-Tag: 1636145132-979731 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Marco Elver Subject: kfence: move saving stack trace of allocations into __kfence_alloc() Move the saving of the stack trace of allocations into __kfence_alloc(), so that the stack entries array can be used outside of kfence_guarded_alloc() and we avoid potentially unwinding the stack multiple times. Link: https://lkml.kernel.org/r/20210923104803.2620285-3-elver@google.com Signed-off-by: Marco Elver Reviewed-by: Dmitry Vyukov Acked-by: Alexander Potapenko Cc: Aleksandr Nogikh Cc: Jann Horn Cc: Taras Madan Signed-off-by: Andrew Morton --- mm/kfence/core.c | 35 ++++++++++++++++++++++++----------- 1 file changed, 24 insertions(+), 11 deletions(-) --- a/mm/kfence/core.c~kfence-move-saving-stack-trace-of-allocations-into-__kfence_alloc +++ a/mm/kfence/core.c @@ -187,19 +187,26 @@ static inline unsigned long metadata_to_ * Update the object's metadata state, including updating the alloc/free stacks * depending on the state transition. */ -static noinline void metadata_update_state(struct kfence_metadata *meta, - enum kfence_object_state next) +static noinline void +metadata_update_state(struct kfence_metadata *meta, enum kfence_object_state next, + unsigned long *stack_entries, size_t num_stack_entries) { struct kfence_track *track = next == KFENCE_OBJECT_FREED ? &meta->free_track : &meta->alloc_track; lockdep_assert_held(&meta->lock); - /* - * Skip over 1 (this) functions; noinline ensures we do not accidentally - * skip over the caller by never inlining. 
- */ - track->num_stack_entries = stack_trace_save(track->stack_entries, KFENCE_STACK_DEPTH, 1); + if (stack_entries) { + memcpy(track->stack_entries, stack_entries, + num_stack_entries * sizeof(stack_entries[0])); + } else { + /* + * Skip over 1 (this) functions; noinline ensures we do not + * accidentally skip over the caller by never inlining. + */ + num_stack_entries = stack_trace_save(track->stack_entries, KFENCE_STACK_DEPTH, 1); + } + track->num_stack_entries = num_stack_entries; track->pid = task_pid_nr(current); track->cpu = raw_smp_processor_id(); track->ts_nsec = local_clock(); /* Same source as printk timestamps. */ @@ -261,7 +268,8 @@ static __always_inline void for_each_can } } -static void *kfence_guarded_alloc(struct kmem_cache *cache, size_t size, gfp_t gfp) +static void *kfence_guarded_alloc(struct kmem_cache *cache, size_t size, gfp_t gfp, + unsigned long *stack_entries, size_t num_stack_entries) { struct kfence_metadata *meta = NULL; unsigned long flags; @@ -320,7 +328,7 @@ static void *kfence_guarded_alloc(struct addr = (void *)meta->addr; /* Update remaining metadata. */ - metadata_update_state(meta, KFENCE_OBJECT_ALLOCATED); + metadata_update_state(meta, KFENCE_OBJECT_ALLOCATED, stack_entries, num_stack_entries); /* Pairs with READ_ONCE() in kfence_shutdown_cache(). */ WRITE_ONCE(meta->cache, cache); meta->size = size; @@ -400,7 +408,7 @@ static void kfence_guarded_free(void *ad memzero_explicit(addr, meta->size); /* Mark the object as freed. */ - metadata_update_state(meta, KFENCE_OBJECT_FREED); + metadata_update_state(meta, KFENCE_OBJECT_FREED, NULL, 0); raw_spin_unlock_irqrestore(&meta->lock, flags); @@ -742,6 +750,9 @@ void kfence_shutdown_cache(struct kmem_c void *__kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags) { + unsigned long stack_entries[KFENCE_STACK_DEPTH]; + size_t num_stack_entries; + /* * Perform size check before switching kfence_allocation_gate, so that * we don't disable KFENCE without making an allocation. 
@@ -786,7 +797,9 @@ void *__kfence_alloc(struct kmem_cache * if (!READ_ONCE(kfence_enabled)) return NULL; - return kfence_guarded_alloc(s, size, flags); + num_stack_entries = stack_trace_save(stack_entries, KFENCE_STACK_DEPTH, 0); + + return kfence_guarded_alloc(s, size, flags, stack_entries, num_stack_entries); } size_t kfence_ksize(const void *addr) From patchwork Fri Nov 5 20:45:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605817 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7AA91C433EF for ; Fri, 5 Nov 2021 20:45:37 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 214E461357 for ; Fri, 5 Nov 2021 20:45:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 214E461357 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 9AEC49400CA; Fri, 5 Nov 2021 16:45:36 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 90F839400C1; Fri, 5 Nov 2021 16:45:36 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 69E0D9400CA; Fri, 5 Nov 2021 16:45:36 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0123.hostedemail.com [216.40.44.123]) by kanga.kvack.org (Postfix) with ESMTP id 4F0F09400C1 for ; Fri, 5 Nov 2021 16:45:36 -0400 (EDT) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 11427184D7FB1 for ; Fri, 5 Nov 2021 20:45:36 +0000 
(UTC) X-FDA: 78776057430.09.03418FF Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf20.hostedemail.com (Postfix) with ESMTP id 6461CD0000B3 for ; Fri, 5 Nov 2021 20:45:26 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id A118361362; Fri, 5 Nov 2021 20:45:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145135; bh=iQgnqs6q5pdfGHSS+AgoPIn8tbCKbd1duNmianb7uQM=; h=Date:From:To:Subject:In-Reply-To:From; b=AjZxSvEsfVgTl4XEYGPBjytpgpRSrnN3ztejhTuV6Co/wSXq5uAAkPRriLIqgMMC1 NAsGqgHrW6seZhZKlLtmSkMy7aK96xtQSApG83v6ut1AQxegxYmasQSbFuamAlDW8G 5Lmz023t0yCGR55Ohjvdx6B/kdCDM8evGsjNl8fQ= Date: Fri, 05 Nov 2021 13:45:34 -0700 From: Andrew Morton To: akpm@linux-foundation.org, dvyukov@google.com, elver@google.com, glider@google.com, jannh@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, nogikh@google.com, tarasmadan@google.com, torvalds@linux-foundation.org Subject: [patch 208/262] kfence: limit currently covered allocations when pool nearly full Message-ID: <20211105204534.H8_8p7p0l%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf20.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=AjZxSvEs; dmarc=none; spf=pass (imf20.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 6461CD0000B3 X-Stat-Signature: gkqab9scozh3s19bp1n77caxrm3d3r67 X-HE-Tag: 1636145126-465208 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Marco Elver Subject: kfence: limit currently covered allocations when pool nearly full One of KFENCE's main design principles is that with increasing uptime, allocation coverage increases 
sufficiently to detect previously undetected bugs. We have observed that frequent long-lived allocations of the same source (e.g. pagecache) tend to permanently fill up the KFENCE pool with increasing system uptime, thus breaking the above requirement. The workaround thus far has been to increase the sample interval and/or the KFENCE pool size, but neither is a reliable solution. To ensure diverse coverage of allocations, limit currently covered allocations of the same source once pool utilization reaches 75% (configurable via `kfence.skip_covered_thresh`) or above. The effect is retaining reasonable allocation coverage when the pool is close to full. A side-effect is that this also limits frequent long-lived allocations of the same source filling up the pool permanently. Uniqueness of an allocation for coverage purposes is based on its (partial) allocation stack trace (the source). A Counting Bloom filter is used to check if an allocation is covered; if the allocation is currently covered, the allocation is skipped by KFENCE. Testing was done using: (a) a synthetic workload that performs frequent long-lived allocations (default config values; sample_interval=1; num_objects=63), and (b) normal desktop workloads on an otherwise idle machine where the problem was first reported after a few days of uptime (default config values). In both test cases the sampled allocation rate no longer drops to zero at any point. In the case of (b) we observe (after 2 days of uptime) 15% unique allocations in the pool, 77% pool utilization, with 20% "skipped allocations (covered)".
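For readers unfamiliar with Counting Bloom filters, the policy described above can be sketched in userspace C roughly as follows. This is only an illustration under stated assumptions, not the kernel code: the names (covered_add, covered_contains, should_skip, next_hash) and the table size are made up for the example, plain ints replace atomic_t so the sketch is single-threaded only, and next_hash() is a simple multiplicative hash standing in for hash_32().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative parameters (assumptions): 2 hash rounds, 2^8 counters. */
#define COVERED_HNUM  2
#define COVERED_ORDER 8
#define COVERED_SIZE  (1u << COVERED_ORDER)
#define COVERED_MASK  (COVERED_SIZE - 1)

/* One counter per slot; a plain int here, so not safe for concurrent use. */
static int covered[COVERED_SIZE];

/* Multiplicative hash keeping the top 'bits' bits; stand-in for hash_32(). */
static uint32_t next_hash(uint32_t h, unsigned int bits)
{
	return (uint32_t)(h * 0x61C88647u) >> (32 - bits);
}

/* Add (val = 1) or remove (val = -1) one allocation with this stack hash. */
static void covered_add(uint32_t hash, int val)
{
	for (int i = 0; i < COVERED_HNUM; i++) {
		covered[hash & COVERED_MASK] += val;
		hash = next_hash(hash, COVERED_ORDER);
	}
}

/* True if every slot for this hash is non-zero (may be a false positive). */
static bool covered_contains(uint32_t hash)
{
	for (int i = 0; i < COVERED_HNUM; i++) {
		if (!covered[hash & COVERED_MASK])
			return false;
		hash = next_hash(hash, COVERED_ORDER);
	}
	return true;
}

/* True once more than thresh_pct percent of the pool is allocated. */
static bool should_skip(unsigned long allocated, unsigned long num_objects,
			unsigned long thresh_pct)
{
	return allocated > (num_objects * thresh_pct) / 100;
}
```

An allocation would be skipped only when both should_skip() and covered_contains() return true. Because counters are decremented on free, a source becomes eligible for sampling again once its covered objects are gone, which a plain (non-counting) Bloom filter could not express.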
[elver@google.com: simplify and just use hash_32(), use more random stack_hash_seed] Link: https://lkml.kernel.org/r/YU3MRGaCaJiYht5g@elver.google.com [elver@google.com: fix 32 bit] Link: https://lkml.kernel.org/r/20210923104803.2620285-4-elver@google.com Signed-off-by: Marco Elver Reviewed-by: Dmitry Vyukov Acked-by: Alexander Potapenko Cc: Aleksandr Nogikh Cc: Jann Horn Cc: Taras Madan Signed-off-by: Andrew Morton --- mm/kfence/core.c | 109 ++++++++++++++++++++++++++++++++++++++++++- mm/kfence/kfence.h | 2 2 files changed, 109 insertions(+), 2 deletions(-) --- a/mm/kfence/core.c~kfence-limit-currently-covered-allocations-when-pool-nearly-full +++ a/mm/kfence/core.c @@ -10,12 +10,15 @@ #include #include #include +#include #include +#include #include #include #include #include #include +#include #include #include #include @@ -82,6 +85,10 @@ static const struct kernel_param_ops sam }; module_param_cb(sample_interval, &sample_interval_param_ops, &kfence_sample_interval, 0600); +/* Pool usage% threshold when currently covered allocations are skipped. */ +static unsigned long kfence_skip_covered_thresh __read_mostly = 75; +module_param_named(skip_covered_thresh, kfence_skip_covered_thresh, ulong, 0644); + /* The pool of pages used for guard pages and objects. */ char *__kfence_pool __ro_after_init; EXPORT_SYMBOL(__kfence_pool); /* Export for test modules. */ @@ -105,6 +112,32 @@ DEFINE_STATIC_KEY_FALSE(kfence_allocatio /* Gates the allocation, ensuring only one succeeds in a given period. */ atomic_t kfence_allocation_gate = ATOMIC_INIT(1); +/* + * A Counting Bloom filter of allocation coverage: limits currently covered + * allocations of the same source filling up the pool. 
+ * + * Assuming a range of 15%-85% unique allocations in the pool at any point in + * time, the below parameters provide a probability of 0.02-0.33 for false + * positive hits respectively: + * + * P(alloc_traces) = (1 - e^(-HNUM * (alloc_traces / SIZE))) ^ HNUM + */ +#define ALLOC_COVERED_HNUM 2 +#define ALLOC_COVERED_ORDER (const_ilog2(CONFIG_KFENCE_NUM_OBJECTS) + 2) +#define ALLOC_COVERED_SIZE (1 << ALLOC_COVERED_ORDER) +#define ALLOC_COVERED_HNEXT(h) hash_32(h, ALLOC_COVERED_ORDER) +#define ALLOC_COVERED_MASK (ALLOC_COVERED_SIZE - 1) +static atomic_t alloc_covered[ALLOC_COVERED_SIZE]; + +/* Stack depth used to determine uniqueness of an allocation. */ +#define UNIQUE_ALLOC_STACK_DEPTH ((size_t)8) + +/* + * Randomness for stack hashes, making the same collisions across reboots and + * different machines less likely. + */ +static u32 stack_hash_seed __ro_after_init; + /* Statistics counters for debugfs. */ enum kfence_counter_id { KFENCE_COUNTER_ALLOCATED, @@ -114,6 +147,7 @@ enum kfence_counter_id { KFENCE_COUNTER_BUGS, KFENCE_COUNTER_SKIP_INCOMPAT, KFENCE_COUNTER_SKIP_CAPACITY, + KFENCE_COUNTER_SKIP_COVERED, KFENCE_COUNTER_COUNT, }; static atomic_long_t counters[KFENCE_COUNTER_COUNT]; @@ -125,11 +159,57 @@ static const char *const counter_names[] [KFENCE_COUNTER_BUGS] = "total bugs", [KFENCE_COUNTER_SKIP_INCOMPAT] = "skipped allocations (incompatible)", [KFENCE_COUNTER_SKIP_CAPACITY] = "skipped allocations (capacity)", + [KFENCE_COUNTER_SKIP_COVERED] = "skipped allocations (covered)", }; static_assert(ARRAY_SIZE(counter_names) == KFENCE_COUNTER_COUNT); /* === Internals ============================================================ */ +static inline bool should_skip_covered(void) +{ + unsigned long thresh = (CONFIG_KFENCE_NUM_OBJECTS * kfence_skip_covered_thresh) / 100; + + return atomic_long_read(&counters[KFENCE_COUNTER_ALLOCATED]) > thresh; +} + +static u32 get_alloc_stack_hash(unsigned long *stack_entries, size_t num_entries) +{ + num_entries = min(num_entries,
UNIQUE_ALLOC_STACK_DEPTH); + num_entries = filter_irq_stacks(stack_entries, num_entries); + return jhash(stack_entries, num_entries * sizeof(stack_entries[0]), stack_hash_seed); +} + +/* + * Adds (or subtracts) count @val for allocation stack trace hash + * @alloc_stack_hash from Counting Bloom filter. + */ +static void alloc_covered_add(u32 alloc_stack_hash, int val) +{ + int i; + + for (i = 0; i < ALLOC_COVERED_HNUM; i++) { + atomic_add(val, &alloc_covered[alloc_stack_hash & ALLOC_COVERED_MASK]); + alloc_stack_hash = ALLOC_COVERED_HNEXT(alloc_stack_hash); + } +} + +/* + * Returns true if the allocation stack trace hash @alloc_stack_hash is + * currently contained (non-zero count) in Counting Bloom filter. + */ +static bool alloc_covered_contains(u32 alloc_stack_hash) +{ + int i; + + for (i = 0; i < ALLOC_COVERED_HNUM; i++) { + if (!atomic_read(&alloc_covered[alloc_stack_hash & ALLOC_COVERED_MASK])) + return false; + alloc_stack_hash = ALLOC_COVERED_HNEXT(alloc_stack_hash); + } + + return true; +} + static bool kfence_protect(unsigned long addr) { return !KFENCE_WARN_ON(!kfence_protect_page(ALIGN_DOWN(addr, PAGE_SIZE), true)); @@ -269,7 +349,8 @@ static __always_inline void for_each_can } static void *kfence_guarded_alloc(struct kmem_cache *cache, size_t size, gfp_t gfp, - unsigned long *stack_entries, size_t num_stack_entries) + unsigned long *stack_entries, size_t num_stack_entries, + u32 alloc_stack_hash) { struct kfence_metadata *meta = NULL; unsigned long flags; @@ -332,6 +413,8 @@ static void *kfence_guarded_alloc(struct /* Pairs with READ_ONCE() in kfence_shutdown_cache(). */ WRITE_ONCE(meta->cache, cache); meta->size = size; + meta->alloc_stack_hash = alloc_stack_hash; + for_each_canary(meta, set_canary_byte); /* Set required struct page fields. */ @@ -344,6 +427,8 @@ static void *kfence_guarded_alloc(struct raw_spin_unlock_irqrestore(&meta->lock, flags); + alloc_covered_add(alloc_stack_hash, 1); + /* Memory initialization. 
*/ /* @@ -412,6 +497,8 @@ static void kfence_guarded_free(void *ad raw_spin_unlock_irqrestore(&meta->lock, flags); + alloc_covered_add(meta->alloc_stack_hash, -1); + /* Protect to detect use-after-frees. */ kfence_protect((unsigned long)addr); @@ -677,6 +764,7 @@ void __init kfence_init(void) if (!kfence_sample_interval) return; + stack_hash_seed = (u32)random_get_entropy(); if (!kfence_init_pool()) { pr_err("%s failed\n", __func__); return; @@ -752,6 +840,7 @@ void *__kfence_alloc(struct kmem_cache * { unsigned long stack_entries[KFENCE_STACK_DEPTH]; size_t num_stack_entries; + u32 alloc_stack_hash; /* * Perform size check before switching kfence_allocation_gate, so that @@ -799,7 +888,23 @@ void *__kfence_alloc(struct kmem_cache * num_stack_entries = stack_trace_save(stack_entries, KFENCE_STACK_DEPTH, 0); - return kfence_guarded_alloc(s, size, flags, stack_entries, num_stack_entries); + /* + * Do expensive check for coverage of allocation in slow-path after + * allocation_gate has already become non-zero, even though it might + * mean not making any allocation within a given sample interval. + * + * This ensures reasonable allocation coverage when the pool is almost + * full, including avoiding long-lived allocations of the same source + * filling up the pool (e.g. pagecache allocations). + */ + alloc_stack_hash = get_alloc_stack_hash(stack_entries, num_stack_entries); + if (should_skip_covered() && alloc_covered_contains(alloc_stack_hash)) { + atomic_long_inc(&counters[KFENCE_COUNTER_SKIP_COVERED]); + return NULL; + } + + return kfence_guarded_alloc(s, size, flags, stack_entries, num_stack_entries, + alloc_stack_hash); } size_t kfence_ksize(const void *addr) --- a/mm/kfence/kfence.h~kfence-limit-currently-covered-allocations-when-pool-nearly-full +++ a/mm/kfence/kfence.h @@ -87,6 +87,8 @@ struct kfence_metadata { /* Allocation and free stack information. */ struct kfence_track alloc_track; struct kfence_track free_track; + /* For updating alloc_covered on frees. 
*/ + u32 alloc_stack_hash; }; extern struct kfence_metadata kfence_metadata[CONFIG_KFENCE_NUM_OBJECTS]; From patchwork Fri Nov 5 20:45:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605819 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 57DA7C433F5 for ; Fri, 5 Nov 2021 20:45:40 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 0E70561288 for ; Fri, 5 Nov 2021 20:45:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 0E70561288 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 7FFDE9400CB; Fri, 5 Nov 2021 16:45:39 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 73ADB9400C1; Fri, 5 Nov 2021 16:45:39 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5B7839400CC; Fri, 5 Nov 2021 16:45:39 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0119.hostedemail.com [216.40.44.119]) by kanga.kvack.org (Postfix) with ESMTP id 421FB9400C1 for ; Fri, 5 Nov 2021 16:45:39 -0400 (EDT) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id F3320768D2 for ; Fri, 5 Nov 2021 20:45:38 +0000 (UTC) X-FDA: 78776057598.13.08B96EE Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf03.hostedemail.com (Postfix) with ESMTP id CB0F430000B7 for ; Fri, 5 Nov 2021 20:45:31 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA 
id A17A861357; Fri, 5 Nov 2021 20:45:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145137; bh=3laThTs8az9HfqbyXz9b288O9taJO3LrkKCUHkrL5Ng=; h=Date:From:To:Subject:In-Reply-To:From; b=zIjNYy/Piknx2M/n+dYwDNtwryCpIjhV+PpOp74c5hmtxN6qhsH2AnY2BcDbzM1gL VywLlblc7eY6xLcMaRGDirTxo9ZPIZdTBbSsBd/zxTSrGYk8w83E+M1+SN9It4gJBH DENVG2kpxhLBLRRXb53RA/LShK74sb8Fe4BSnobc= Date: Fri, 05 Nov 2021 13:45:37 -0700 From: Andrew Morton To: akpm@linux-foundation.org, dvyukov@google.com, elver@google.com, glider@google.com, jannh@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, nogikh@google.com, tarasmadan@google.com, torvalds@linux-foundation.org Subject: [patch 209/262] kfence: add note to documentation about skipping covered allocations Message-ID: <20211105204537.mUNMkL-XE%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: CB0F430000B7 X-Stat-Signature: b54hp7fnqcy5ffx485zp54pb4ywn5zhm Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="zIjNYy/P"; dmarc=none; spf=pass (imf03.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636145131-986099 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Marco Elver Subject: kfence: add note to documentation about skipping covered allocations Add a note briefly mentioning the new policy about "skipping currently covered allocations if pool close to full." Since this has a notable impact on KFENCE's bug-detection ability on systems with large uptimes, it is worth pointing out the feature. 
Link: https://lkml.kernel.org/r/20210923104803.2620285-5-elver@google.com Signed-off-by: Marco Elver Reviewed-by: Dmitry Vyukov Acked-by: Alexander Potapenko Cc: Aleksandr Nogikh Cc: Jann Horn Cc: Taras Madan Signed-off-by: Andrew Morton --- Documentation/dev-tools/kfence.rst | 11 +++++++++++ 1 file changed, 11 insertions(+) --- a/Documentation/dev-tools/kfence.rst~kfence-add-note-to-documentation-about-skipping-covered-allocations +++ a/Documentation/dev-tools/kfence.rst @@ -269,6 +269,17 @@ tail of KFENCE's freelist, so that the l first, and the chances of detecting use-after-frees of recently freed objects is increased. +If pool utilization reaches 75% (default) or above, to reduce the risk of the +pool eventually being fully occupied by allocated objects yet ensure diverse +coverage of allocations, KFENCE limits currently covered allocations of the +same source from further filling up the pool. The "source" of an allocation is +based on its partial allocation stack trace. A side-effect is that this also +limits frequent long-lived allocations (e.g. pagecache) of the same source +filling up the pool permanently, which is the most common risk for the pool +becoming full and the sampled allocation rate dropping to zero. The threshold +at which to start limiting currently covered allocations can be configured via +the boot parameter ``kfence.skip_covered_thresh`` (pool usage%). 
+ Interface --------- From patchwork Fri Nov 5 20:45:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605821 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5F2CDC433F5 for ; Fri, 5 Nov 2021 20:45:43 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 12A3A61288 for ; Fri, 5 Nov 2021 20:45:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 12A3A61288 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 8C2AD9400CC; Fri, 5 Nov 2021 16:45:42 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 84AB89400C1; Fri, 5 Nov 2021 16:45:42 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6C55C9400CC; Fri, 5 Nov 2021 16:45:42 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0018.hostedemail.com [216.40.44.18]) by kanga.kvack.org (Postfix) with ESMTP id 564619400C1 for ; Fri, 5 Nov 2021 16:45:42 -0400 (EDT) Received: from smtpin27.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 196CC768D2 for ; Fri, 5 Nov 2021 20:45:42 +0000 (UTC) X-FDA: 78776057682.27.5BEA5A8 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf02.hostedemail.com (Postfix) with ESMTP id 76A897001A05 for ; Fri, 5 Nov 2021 20:45:36 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id B43036135A; Fri, 5 Nov 2021 20:45:40 +0000 (UTC) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145141; bh=AksBvyK2+J34v4VU0G9t1Lm0wLcMVV2rPzybMUopnnY=; h=Date:From:To:Subject:In-Reply-To:From; b=eytZhVn53WoDNgyMUiTnmjfTPo13aa30BkLk2zUIVcgpgcmUYVthZcZuNZGZagcmV qWRcUiVdBELmqrn7QaC6UYL6COkPYWOA/5YbhnQVaDrpFs+XgU+KM/TtN96q/Pv2EG bF8J4/M6FlZgWeqxueZVjaconyfDJTsfGS0ELkxc= Date: Fri, 05 Nov 2021 13:45:40 -0700 From: Andrew Morton To: akpm@linux-foundation.org, davidgow@google.com, dvyukov@google.com, elver@google.com, glider@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, nogikh@google.com, tarasmadan@google.com, torvalds@linux-foundation.org Subject: [patch 210/262] kfence: test: use kunit_skip() to skip tests Message-ID: <20211105204540.jMRK9-vcN%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf02.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=eytZhVn5; dmarc=none; spf=pass (imf02.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 76A897001A05 X-Stat-Signature: m14jnghpcnbscw6du3nnkqow6neyn655 X-HE-Tag: 1636145136-409203 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Marco Elver Subject: kfence: test: use kunit_skip() to skip tests Use the new kunit_skip() to skip tests if requirements were not met. It makes it easier to see in KUnit's summary if there were skipped tests. 
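The "requires" pattern this patch introduces can be approximated outside the kernel as below. This is a hedged userspace sketch, not KUnit: struct test_ctx, TEST_REQUIRES and test_needs_feature are hypothetical names, and the skip is modelled with a plain return from the test function, whereas how kunit_skip() itself ends the test is not shown in this patch. The point illustrated is that stringifying the condition with #cond gives a self-describing skip reason.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for struct kunit: records whether a test skipped. */
struct test_ctx {
	bool skipped;
	bool ran_body;
	const char *reason;
};

/*
 * Skip the calling test function unless 'cond' holds. The condition's
 * source text is stringified into the reason. Assumption: returning
 * from the test function is an acceptable way to model the skip.
 */
#define TEST_REQUIRES(test, cond) do {					\
	if (!(cond)) {							\
		(test)->skipped = true;					\
		(test)->reason = "Test requires: " #cond;		\
		return;							\
	}								\
} while (0)

/* Example test that only makes sense when a feature is enabled. */
static void test_needs_feature(struct test_ctx *test, bool feature_enabled)
{
	TEST_REQUIRES(test, feature_enabled);
	test->ran_body = true;
}
```

Compared with a silent early return, recording the reason lets a test runner report skipped tests separately from passed ones, which is the visibility benefit the commit message describes.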
Link: https://lkml.kernel.org/r/20210922182541.1372400-1-elver@google.com Signed-off-by: Marco Elver Reviewed-by: David Gow Cc: Alexander Potapenko Cc: Dmitry Vyukov Cc: Aleksandr Nogikh Cc: Taras Madan Signed-off-by: Andrew Morton --- mm/kfence/kfence_test.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) --- a/mm/kfence/kfence_test.c~kfence-test-use-kunit_skip-to-skip-tests +++ a/mm/kfence/kfence_test.c @@ -32,6 +32,11 @@ #define arch_kfence_test_address(addr) (addr) #endif +#define KFENCE_TEST_REQUIRES(test, cond) do { \ + if (!(cond)) \ + kunit_skip((test), "Test requires: " #cond); \ +} while (0) + /* Report as observed from console. */ static struct { spinlock_t lock; @@ -555,8 +560,7 @@ static void test_init_on_free(struct kun }; int i; - if (!IS_ENABLED(CONFIG_INIT_ON_FREE_DEFAULT_ON)) - return; + KFENCE_TEST_REQUIRES(test, IS_ENABLED(CONFIG_INIT_ON_FREE_DEFAULT_ON)); /* Assume it hasn't been disabled on command line. */ setup_test_cache(test, size, 0, NULL); @@ -603,10 +607,8 @@ static void test_gfpzero(struct kunit *t char *buf1, *buf2; int i; - if (CONFIG_KFENCE_SAMPLE_INTERVAL > 100) { - kunit_warn(test, "skipping ... would take too long\n"); - return; - } + /* Skip if we think it'd take too long. 
*/ + KFENCE_TEST_REQUIRES(test, CONFIG_KFENCE_SAMPLE_INTERVAL <= 100); setup_test_cache(test, size, 0, NULL); buf1 = test_alloc(test, size, GFP_KERNEL, ALLOCATE_ANY); From patchwork Fri Nov 5 20:45:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605823 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6BF32C433FE for ; Fri, 5 Nov 2021 20:45:46 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 15B2061288 for ; Fri, 5 Nov 2021 20:45:46 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 15B2061288 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 939099400CD; Fri, 5 Nov 2021 16:45:45 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 8C0979400C1; Fri, 5 Nov 2021 16:45:45 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 73A6B9400CD; Fri, 5 Nov 2021 16:45:45 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0076.hostedemail.com [216.40.44.76]) by kanga.kvack.org (Postfix) with ESMTP id 5F5539400C1 for ; Fri, 5 Nov 2021 16:45:45 -0400 (EDT) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 2E42518503B44 for ; Fri, 5 Nov 2021 20:45:45 +0000 (UTC) X-FDA: 78776057850.21.1DAB7F5 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf29.hostedemail.com (Postfix) with ESMTP id C3AF9900024E for ; Fri, 5 Nov 2021 
20:45:44 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id C4AA561362; Fri, 5 Nov 2021 20:45:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145144; bh=JZS3vNkx31TQfI2LSWPr+r21IGp5W3lvWrSWb4PRtbs=; h=Date:From:To:Subject:In-Reply-To:From; b=bBcLqCGUSBb06xDyZpwl6SyCq1DVz8QGnu2H5zortOvyUDqRKKCycYW4pNE9zsVn9 KJga81YCUMrnu6FoHwQeKwwc3VVwdzjy6RvTvlnGZ3ABfhOdqtUxmbRhPnWY5Hz7+/ EBhVQl/9sKybgWSwy5dFhYdeKNESYTnGmOIDWjew= Date: Fri, 05 Nov 2021 13:45:43 -0700 From: Andrew Morton To: akpm@linux-foundation.org, dvyukov@google.com, elver@google.com, glider@google.com, jannh@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 211/262] kfence: shorten critical sections of alloc/free Message-ID: <20211105204543.QFRD7k0_w%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=bBcLqCGU; dmarc=none; spf=pass (imf29.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: C3AF9900024E X-Stat-Signature: 3zjp31jtdg4t8141oy1xwmbhctf563d4 X-HE-Tag: 1636145144-455805 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Marco Elver Subject: kfence: shorten critical sections of alloc/free Initializing memory and setting/checking the canary bytes is relatively expensive, and doing so in the meta->lock critical sections extends the duration with preemption and interrupts disabled unnecessarily. 
Any reads to meta->addr and meta->size in kfence_guarded_alloc() and kfence_guarded_free() don't require locking meta->lock as long as the object is removed from the freelist: only kfence_guarded_alloc() sets meta->addr and meta->size after removing it from the freelist, which requires a preceding kfence_guarded_free() returning it to the list or the initial state. Therefore move reads to meta->addr and meta->size, including expensive memory initialization using them, out of meta->lock critical sections. Link: https://lkml.kernel.org/r/20210930153706.2105471-1-elver@google.com Signed-off-by: Marco Elver Acked-by: Alexander Potapenko Cc: Dmitry Vyukov Cc: Jann Horn Signed-off-by: Andrew Morton --- mm/kfence/core.c | 38 +++++++++++++++++++++----------------- 1 file changed, 21 insertions(+), 17 deletions(-) --- a/mm/kfence/core.c~kfence-shorten-critical-sections-of-alloc-free +++ a/mm/kfence/core.c @@ -309,12 +309,19 @@ static inline bool set_canary_byte(u8 *a /* Check canary byte at @addr. */ static inline bool check_canary_byte(u8 *addr) { + struct kfence_metadata *meta; + unsigned long flags; + if (likely(*addr == KFENCE_CANARY_PATTERN(addr))) return true; atomic_long_inc(&counters[KFENCE_COUNTER_BUGS]); - kfence_report_error((unsigned long)addr, false, NULL, addr_to_metadata((unsigned long)addr), - KFENCE_ERROR_CORRUPTION); + + meta = addr_to_metadata((unsigned long)addr); + raw_spin_lock_irqsave(&meta->lock, flags); + kfence_report_error((unsigned long)addr, false, NULL, meta, KFENCE_ERROR_CORRUPTION); + raw_spin_unlock_irqrestore(&meta->lock, flags); + return false; } @@ -324,8 +331,6 @@ static __always_inline void for_each_can const unsigned long pageaddr = ALIGN_DOWN(meta->addr, PAGE_SIZE); unsigned long addr; - lockdep_assert_held(&meta->lock); - /* * We'll iterate over each canary byte per-side until fn() returns * false. 
However, we'll still iterate over the canary bytes to the @@ -414,8 +419,9 @@ static void *kfence_guarded_alloc(struct WRITE_ONCE(meta->cache, cache); meta->size = size; meta->alloc_stack_hash = alloc_stack_hash; + raw_spin_unlock_irqrestore(&meta->lock, flags); - for_each_canary(meta, set_canary_byte); + alloc_covered_add(alloc_stack_hash, 1); /* Set required struct page fields. */ page = virt_to_page(meta->addr); @@ -425,11 +431,8 @@ static void *kfence_guarded_alloc(struct if (IS_ENABLED(CONFIG_SLAB)) page->s_mem = addr; - raw_spin_unlock_irqrestore(&meta->lock, flags); - - alloc_covered_add(alloc_stack_hash, 1); - /* Memory initialization. */ + for_each_canary(meta, set_canary_byte); /* * We check slab_want_init_on_alloc() ourselves, rather than letting @@ -454,6 +457,7 @@ static void kfence_guarded_free(void *ad { struct kcsan_scoped_access assert_page_exclusive; unsigned long flags; + bool init; raw_spin_lock_irqsave(&meta->lock, flags); @@ -481,6 +485,13 @@ static void kfence_guarded_free(void *ad meta->unprotected_page = 0; } + /* Mark the object as freed. */ + metadata_update_state(meta, KFENCE_OBJECT_FREED, NULL, 0); + init = slab_want_init_on_free(meta->cache); + raw_spin_unlock_irqrestore(&meta->lock, flags); + + alloc_covered_add(meta->alloc_stack_hash, -1); + /* Check canary bytes for memory corruption. */ for_each_canary(meta, check_canary_byte); @@ -489,16 +500,9 @@ static void kfence_guarded_free(void *ad * data is still there, and after a use-after-free is detected, we * unprotect the page, so the data is still accessible. */ - if (!zombie && unlikely(slab_want_init_on_free(meta->cache))) + if (!zombie && unlikely(init)) memzero_explicit(addr, meta->size); - /* Mark the object as freed. */ - metadata_update_state(meta, KFENCE_OBJECT_FREED, NULL, 0); - - raw_spin_unlock_irqrestore(&meta->lock, flags); - - alloc_covered_add(meta->alloc_stack_hash, -1); - /* Protect to detect use-after-frees. 
*/ kfence_protect((unsigned long)addr); From patchwork Fri Nov 5 20:45:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605825 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 598BAC433EF for ; Fri, 5 Nov 2021 20:45:49 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 108936135A for ; Fri, 5 Nov 2021 20:45:49 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 108936135A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 9381C9400CE; Fri, 5 Nov 2021 16:45:48 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 8C03B9400C1; Fri, 5 Nov 2021 16:45:48 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7882B9400CE; Fri, 5 Nov 2021 16:45:48 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0176.hostedemail.com [216.40.44.176]) by kanga.kvack.org (Postfix) with ESMTP id 5E1F19400C1 for ; Fri, 5 Nov 2021 16:45:48 -0400 (EDT) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 2DCDD779AB for ; Fri, 5 Nov 2021 20:45:48 +0000 (UTC) X-FDA: 78776057976.07.D475A00 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf14.hostedemail.com (Postfix) with ESMTP id 7E76C60019AA for ; Fri, 5 Nov 2021 20:45:48 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id CA4246128E; Fri, 5 Nov 2021 20:45:46 +0000 (UTC) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145147; bh=WE4XHpK0g3pEVXXIopuNPVyZxEFnzZJ0u0qegDwMaGA=; h=Date:From:To:Subject:In-Reply-To:From; b=JHNHL0N0/PvKk+15nrTowRTPvv1G+C9uXvPFROv3eZeUis5nnlC1py6iK8UG4Ejae 8YE27DC49+7lBJ9EpjyWBC5OV1y9y8Q7cnA56NAESYkgAFIKqDEDJ5cW0aTWjJLPlO qWDaokgc6INihyjYg4UEF2/3xn9g+9PSMVdO0ur8= Date: Fri, 05 Nov 2021 13:45:46 -0700 From: Andrew Morton To: akpm@linux-foundation.org, dvyukov@google.com, elver@google.com, glider@google.com, jannh@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 212/262] kfence: always use static branches to guard kfence_alloc() Message-ID: <20211105204546.j705HR0Zm%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=JHNHL0N0; dmarc=none; spf=pass (imf14.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 7E76C60019AA X-Stat-Signature: cwnqmobx5nwn6fg4o1zcqsgs1548nmip X-HE-Tag: 1636145148-736882 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Marco Elver Subject: kfence: always use static branches to guard kfence_alloc() Regardless of KFENCE mode (CONFIG_KFENCE_STATIC_KEYS: either using static keys to gate allocations, or using a simple dynamic branch), always use a static branch to avoid the dynamic branch in kfence_alloc() if KFENCE was disabled at boot. For CONFIG_KFENCE_STATIC_KEYS=n, this now avoids the dynamic branch if KFENCE was disabled at boot. To simplify, also unify the location where kfence_allocation_gate is read-checked to just be inline in kfence_alloc().
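The resulting two-stage check can be modelled in userspace C along these lines. This is a sketch of the control flow only, under several assumptions: static keys patch the instruction stream at runtime and cannot be reproduced in portable C, so an ordinary bool (alloc_key_enabled) stands in for kfence_allocation_key, which means this model still executes the very load-and-compare the real static branch avoids. The names sampled_alloc and slow_alloc are hypothetical, and the way the slow path closes the gate is an assumption about code not shown in this patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the static key; in the kernel this is a patched branch. */
static bool alloc_key_enabled;

/* Zero means the gate is open; a timer (not modelled) would reopen it. */
static atomic_int alloc_gate = 1;

/* Dummy object handed out by the slow path. */
static char sample_object;

/*
 * Slow path. Assumption: only the first caller after the gate opens
 * gets an object; every later caller sees a non-zero previous value.
 */
static void *slow_alloc(void)
{
	if (atomic_fetch_add(&alloc_gate, 1) > 0)
		return NULL;
	return &sample_object;
}

/* Fast path: check the key first, then the gate, then take the slow path. */
static void *sampled_alloc(void)
{
	if (!alloc_key_enabled)
		return NULL;
	if (atomic_load(&alloc_gate))
		return NULL;
	return slow_alloc();
}
```

With the key disabled, the gate is never read at all, which mirrors the stated goal of avoiding the dynamic branch when KFENCE was disabled at boot.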

Link: https://lkml.kernel.org/r/20211019102524.2807208-1-elver@google.com
Signed-off-by: Marco Elver
Cc: Alexander Potapenko
Cc: Dmitry Vyukov
Cc: Jann Horn
Signed-off-by: Andrew Morton
---

 include/linux/kfence.h |   21 +++++++++++----------
 mm/kfence/core.c       |   16 +++++++---------
 2 files changed, 18 insertions(+), 19 deletions(-)

--- a/include/linux/kfence.h~kfence-always-use-static-branches-to-guard-kfence_alloc
+++ a/include/linux/kfence.h
@@ -14,6 +14,9 @@
 
 #ifdef CONFIG_KFENCE
 
+#include
+#include
+
 /*
  * We allocate an even number of pages, as it simplifies calculations to map
  * address to metadata indices; effectively, the very first page serves as an
@@ -22,13 +25,8 @@
 #define KFENCE_POOL_SIZE ((CONFIG_KFENCE_NUM_OBJECTS + 1) * 2 * PAGE_SIZE)
 extern char *__kfence_pool;
 
-#ifdef CONFIG_KFENCE_STATIC_KEYS
-#include
 DECLARE_STATIC_KEY_FALSE(kfence_allocation_key);
-#else
-#include
 extern atomic_t kfence_allocation_gate;
-#endif
 
 /**
  * is_kfence_address() - check if an address belongs to KFENCE pool
@@ -116,13 +114,16 @@ void *__kfence_alloc(struct kmem_cache *
  */
 static __always_inline void *kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags)
 {
-#ifdef CONFIG_KFENCE_STATIC_KEYS
-	if (static_branch_unlikely(&kfence_allocation_key))
+#if defined(CONFIG_KFENCE_STATIC_KEYS) || CONFIG_KFENCE_SAMPLE_INTERVAL == 0
+	if (!static_branch_unlikely(&kfence_allocation_key))
+		return NULL;
 #else
-	if (unlikely(!atomic_read(&kfence_allocation_gate)))
+	if (!static_branch_likely(&kfence_allocation_key))
+		return NULL;
 #endif
-		return __kfence_alloc(s, size, flags);
-	return NULL;
+	if (likely(atomic_read(&kfence_allocation_gate)))
+		return NULL;
+	return __kfence_alloc(s, size, flags);
 }
 
 /**
--- a/mm/kfence/core.c~kfence-always-use-static-branches-to-guard-kfence_alloc
+++ a/mm/kfence/core.c
@@ -104,10 +104,11 @@ struct kfence_metadata kfence_metadata[C
 static struct list_head kfence_freelist = LIST_HEAD_INIT(kfence_freelist);
 static DEFINE_RAW_SPINLOCK(kfence_freelist_lock); /* Lock protecting freelist. */
 
-#ifdef CONFIG_KFENCE_STATIC_KEYS
-/* The static key to set up a KFENCE allocation. */
+/*
+ * The static key to set up a KFENCE allocation; or if static keys are not used
+ * to gate allocations, to avoid a load and compare if KFENCE is disabled.
+ */
 DEFINE_STATIC_KEY_FALSE(kfence_allocation_key);
-#endif
 
 /* Gates the allocation, ensuring only one succeeds in a given period. */
 atomic_t kfence_allocation_gate = ATOMIC_INIT(1);
@@ -774,6 +775,8 @@ void __init kfence_init(void)
 		return;
 	}
 
+	if (!IS_ENABLED(CONFIG_KFENCE_STATIC_KEYS))
+		static_branch_enable(&kfence_allocation_key);
 	WRITE_ONCE(kfence_enabled, true);
 	queue_delayed_work(system_unbound_wq, &kfence_timer, 0);
 	pr_info("initialized - using %lu bytes for %d objects at 0x%p-0x%p\n", KFENCE_POOL_SIZE,
@@ -866,12 +869,7 @@ void *__kfence_alloc(struct kmem_cache *
 		return NULL;
 	}
 
-	/*
-	 * allocation_gate only needs to become non-zero, so it doesn't make
-	 * sense to continue writing to it and pay the associated contention
-	 * cost, in case we have a large number of concurrent allocations.
- */ - if (atomic_read(&kfence_allocation_gate) || atomic_inc_return(&kfence_allocation_gate) > 1) + if (atomic_inc_return(&kfence_allocation_gate) > 1) return NULL; #ifdef CONFIG_KFENCE_STATIC_KEYS /* From patchwork Fri Nov 5 20:45:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605829 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2FDFAC433FE for ; Fri, 5 Nov 2021 20:45:58 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id DA49F61288 for ; Fri, 5 Nov 2021 20:45:57 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org DA49F61288 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 69EFC9400D0; Fri, 5 Nov 2021 16:45:57 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 636719400D1; Fri, 5 Nov 2021 16:45:57 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 454F89400D0; Fri, 5 Nov 2021 16:45:57 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0213.hostedemail.com [216.40.44.213]) by kanga.kvack.org (Postfix) with ESMTP id 2C5679400C1 for ; Fri, 5 Nov 2021 16:45:57 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id E9EBB76B62 for ; Fri, 5 Nov 2021 20:45:56 +0000 (UTC) X-FDA: 78776058312.22.ABA5147 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf05.hostedemail.com (Postfix) with ESMTP id 
01864508FA50 for ; Fri, 5 Nov 2021 20:45:38 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id DD8BB6135A; Fri, 5 Nov 2021 20:45:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145150; bh=7ZJA7MgHrNxIdcFZRAMa17flxOE2HjkgQ9lSst71BAo=; h=Date:From:To:Subject:In-Reply-To:From; b=lg3yat1FYZjdZcQ9s26P7ZwMYp4g6lVSV5CSHr6qf7QkUtYYEF8n8vPm8ld+jaAJQ NBTnNHN85PVrPG76Xjpc7SQsyyD5iWl15FFKywmS8fJg0+5CmkOlMbyow3GCOEJVcm Q/IEAHUSegIn3NKsA5JHlmVippOKwlASRPVmRQsk= Date: Fri, 05 Nov 2021 13:45:49 -0700 From: Andrew Morton To: akpm@linux-foundation.org, dvyukov@google.com, elver@google.com, glider@google.com, jannh@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 213/262] kfence: default to dynamic branch instead of static keys mode Message-ID: <20211105204549.5RJKPBc9C%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf05.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=lg3yat1F; dmarc=none; spf=pass (imf05.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 01864508FA50 X-Stat-Signature: qx895c4jdh8w35px43kjxuo86rz5x1xh X-HE-Tag: 1636145138-209436 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Marco Elver Subject: kfence: default to dynamic branch instead of static keys mode We have observed that on very large machines with newer CPUs, the static key/branch switching delay is on the order of milliseconds. This is due to the required broadcast IPIs, which simply does not scale well to hundreds of CPUs (cores). 
If done too frequently, this can adversely affect tail latencies of
various workloads.

One workaround is to increase the sample interval to several seconds,
while decreasing sampled allocation coverage, but the problem still exists
and could still increase tail latencies.

As already noted in the Kconfig help text, there are trade-offs: at lower
sample intervals the dynamic branch results in better performance;
however, at very large sample intervals, the static keys mode can result
in better performance -- careful benchmarking is recommended.

Our initial benchmarking showed that with large enough sample intervals
and workloads stressing the allocator, the static keys mode was slightly
better. Evaluating and observing the possible system-wide side-effects of
the static-key-switching induced broadcast IPIs, however, was a blind spot
(in particular on large machines with 100s of cores).

Therefore, a major downside of the static keys mode is, unfortunately,
that it is hard to predict performance on new system architectures and
topologies, and equally hard to draw conclusions about the performance of
new workloads based on a limited set of benchmarks.

Most distributions will simply select the defaults, while targeting a
large variety of different workloads and system architectures. As such,
the better default is CONFIG_KFENCE_STATIC_KEYS=n, and re-enabling it is
only recommended after careful evaluation.

For reference, on x86-64 the condition in kfence_alloc() generates exactly
2 instructions in the kmem_cache_alloc() fast-path:

 | ...
 | cmpl   $0x0,0x1a8021c(%rip)  # ffffffff82d560d0
 | je     ffffffff812d6003
 | ...

which, given kfence_allocation_gate is infrequently modified, should be
well predicted by most CPUs.
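
For illustration only (not part of the patch): a rough back-of-the-envelope sketch of how the sample interval relates to static key switching. It assumes two switches per sample interval (one enable, one disable), which is an assumption about the static keys mode not stated in this message, and takes the per-switch delay as a caller-supplied, hypothetical value. The function names are made up for this sketch.

```c
/*
 * Static-key switches per second in static keys mode.
 * Assumption: one enable plus one disable per sample interval.
 */
static double switches_per_second(double sample_interval_ms)
{
	return 2.0 * 1000.0 / sample_interval_ms;
}

/*
 * Fraction of wall-clock time during which a switch is in flight,
 * for a hypothetical per-switch delay given in milliseconds.
 */
static double switch_time_fraction(double sample_interval_ms, double switch_delay_ms)
{
	return switches_per_second(sample_interval_ms) * switch_delay_ms / 1000.0;
}
```

Under these assumptions, a 100 ms interval with a 1 ms switch delay gives 20 switches per second and roughly 2% of wall-clock time with a switch in flight; raising the interval to 2000 ms lowers that to about 0.1%, at the price of fewer sampled allocations. The numbers say nothing about how the broadcast IPIs affect other CPUs, which is the part that is hard to predict.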

Link: https://lkml.kernel.org/r/20211019102524.2807208-2-elver@google.com
Signed-off-by: Marco Elver
Cc: Alexander Potapenko
Cc: Dmitry Vyukov
Cc: Jann Horn
Signed-off-by: Andrew Morton
---

 Documentation/dev-tools/kfence.rst |   12 ++++++++----
 lib/Kconfig.kfence                 |   26 +++++++++++++++-----------
 2 files changed, 23 insertions(+), 15 deletions(-)

--- a/Documentation/dev-tools/kfence.rst~kfence-default-to-dynamic-branch-instead-of-static-keys-mode
+++ a/Documentation/dev-tools/kfence.rst
@@ -231,10 +231,14 @@ Guarded allocations are set up based on
 of the sample interval, the next allocation through the main allocator (SLAB or
 SLUB) returns a guarded allocation from the KFENCE object pool (allocation
 sizes up to PAGE_SIZE are supported). At this point, the timer is reset, and
-the next allocation is set up after the expiration of the interval. To "gate" a
-KFENCE allocation through the main allocator's fast-path without overhead,
-KFENCE relies on static branches via the static keys infrastructure. The static
-branch is toggled to redirect the allocation to KFENCE.
+the next allocation is set up after the expiration of the interval.
+
+When using ``CONFIG_KFENCE_STATIC_KEYS=y``, KFENCE allocations are "gated"
+through the main allocator's fast-path by relying on static branches via the
+static keys infrastructure. The static branch is toggled to redirect the
+allocation to KFENCE. Depending on sample interval, target workloads, and
+system architecture, this may perform better than the simple dynamic branch.
+Careful benchmarking is recommended.
 
 KFENCE objects each reside on a dedicated page, at either the left or right
 page boundaries selected at random. The pages to the left and right of the
--- a/lib/Kconfig.kfence~kfence-default-to-dynamic-branch-instead-of-static-keys-mode
+++ a/lib/Kconfig.kfence
@@ -25,17 +25,6 @@ menuconfig KFENCE
 
 if KFENCE
 
-config KFENCE_STATIC_KEYS
-	bool "Use static keys to set up allocations"
-	default y
-	depends on JUMP_LABEL # To ensure performance, require jump labels
-	help
-	  Use static keys (static branches) to set up KFENCE allocations. Using
-	  static keys is normally recommended, because it avoids a dynamic
-	  branch in the allocator's fast path. However, with very low sample
-	  intervals, or on systems that do not support jump labels, a dynamic
-	  branch may still be an acceptable performance trade-off.
-
 config KFENCE_SAMPLE_INTERVAL
 	int "Default sample interval in milliseconds"
 	default 100
@@ -56,6 +45,21 @@ config KFENCE_NUM_OBJECTS
 	  pages are required; with one containing the object and two adjacent
 	  ones used as guard pages.
 
+config KFENCE_STATIC_KEYS
+	bool "Use static keys to set up allocations" if EXPERT
+	depends on JUMP_LABEL
+	help
+	  Use static keys (static branches) to set up KFENCE allocations. This
+	  option is only recommended when using very large sample intervals, or
+	  performance has carefully been evaluated with this option.
+
+	  Using static keys comes with trade-offs that need to be carefully
+	  evaluated given target workloads and system architectures. Notably,
+	  enabling and disabling static keys invoke IPI broadcasts, the latency
+	  and impact of which is much harder to predict than a dynamic branch.
+
+	  Say N if you are unsure.
+ config KFENCE_STRESS_TEST_FAULTS int "Stress testing of fault handling and error reporting" if EXPERT default 0 From patchwork Fri Nov 5 20:45:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605827 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1DB87C433EF for ; Fri, 5 Nov 2021 20:45:55 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id E71ED61288 for ; Fri, 5 Nov 2021 20:45:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org E71ED61288 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 7152A9400CF; Fri, 5 Nov 2021 16:45:54 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 69DD69400C1; Fri, 5 Nov 2021 16:45:54 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 53E649400CF; Fri, 5 Nov 2021 16:45:54 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0253.hostedemail.com [216.40.44.253]) by kanga.kvack.org (Postfix) with ESMTP id 4220A9400C1 for ; Fri, 5 Nov 2021 16:45:54 -0400 (EDT) Received: from smtpin23.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 0516F77987 for ; Fri, 5 Nov 2021 20:45:54 +0000 (UTC) X-FDA: 78776058228.23.210D6BA Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf01.hostedemail.com (Postfix) with ESMTP id EFCE1508FA6D for ; Fri, 5 Nov 2021 20:45:41 +0000 (UTC) Received: by mail.kernel.org (Postfix) 
with ESMTPSA id D354061288; Fri, 5 Nov 2021 20:45:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145153; bh=rL52ULhCpJPKDrV0HcLGjKlhZHRqkX1Z7SGLcEe0MKo=; h=Date:From:To:Subject:In-Reply-To:From; b=osFsDVIrxFYg/hLzLU0UDf9bESj2tduIVq5UYWfoiysND2mJlPCJhuksYmOtExcdf XgJs2TpxPAXTHJ0fcRIrZLwyj77VmwZ0L6kdvoHOdGmizFeOUyz3+ewAqYFUQupoIk IviNQYM3PykVl3BGaKzAhEsWid/bR1wIeLH5YAd0= Date: Fri, 05 Nov 2021 13:45:52 -0700 From: Andrew Morton To: akpm@linux-foundation.org, geert@linux-m68k.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sjpark@amazon.de, torvalds@linux-foundation.org Subject: [patch 214/262] mm/damon: grammar s/works/work/ Message-ID: <20211105204552.aPJCALJw7%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: EFCE1508FA6D X-Stat-Signature: rkfe3yw983jbaapruujo9p14bqd67rf7 Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=osFsDVIr; dmarc=none; spf=pass (imf01.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636145141-411288 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Geert Uytterhoeven Subject: mm/damon: grammar s/works/work/ Correct a singular versus plural grammar mistake in the help text for the DAMON_VADDR config symbol. 
Link: https://lkml.kernel.org/r/20210914073451.3883834-1-geert@linux-m68k.org Fixes: 3f49584b262cf8f4 ("mm/damon: implement primitives for the virtual memory address spaces") Signed-off-by: Geert Uytterhoeven Reviewed-by: SeongJae Park Signed-off-by: Andrew Morton --- mm/damon/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/damon/Kconfig~mm-damon-grammar-s-works-work +++ a/mm/damon/Kconfig @@ -30,7 +30,7 @@ config DAMON_VADDR select PAGE_IDLE_FLAG help This builds the default data access monitoring primitives for DAMON - that works for virtual address spaces. + that work for virtual address spaces. config DAMON_VADDR_KUNIT_TEST bool "Test for DAMON primitives" if !KUNIT_ALL_TESTS From patchwork Fri Nov 5 20:45:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605831 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AD8B4C433F5 for ; Fri, 5 Nov 2021 20:45:59 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 67C636128E for ; Fri, 5 Nov 2021 20:45:59 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 67C636128E Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id B5F159400D1; Fri, 5 Nov 2021 16:45:57 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id AE5519400C1; Fri, 5 Nov 2021 16:45:57 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8533C9400D2; Fri, 5 Nov 2021 16:45:57 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from 
forelay.hostedemail.com (smtprelay0163.hostedemail.com [216.40.44.163]) by kanga.kvack.org (Postfix) with ESMTP id 513B69400C1 for ; Fri, 5 Nov 2021 16:45:57 -0400 (EDT) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 0F7EC1819609D for ; Fri, 5 Nov 2021 20:45:57 +0000 (UTC) X-FDA: 78776058312.26.DD1AA67 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf23.hostedemail.com (Postfix) with ESMTP id ED4AC90000B4 for ; Fri, 5 Nov 2021 20:45:43 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id B29296128E; Fri, 5 Nov 2021 20:45:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145155; bh=a5iguRVVsaUX5MP+JBas7djnZS060/ku0HoRGXwJopY=; h=Date:From:To:Subject:In-Reply-To:From; b=A8Jjy1nU1XIswAhkdICGoC4dyiB1WwU77YLPcJA1LM+GkKedMX8ie2xqzsAvzJq90 bGA8Av4bEdgR4Pdt9NUbI82DUZZijBARQgHZDX+Lu4OSeTdqW+FoxKMmZlhjEmg0vg g4C1skzVni31OoxDL105dSk6n+eZegPyHxVxTU0k= Date: Fri, 05 Nov 2021 13:45:55 -0700 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sjpark@amazon.de, torvalds@linux-foundation.org Subject: [patch 215/262] Documentation/vm: move user guides to admin-guide/mm/ Message-ID: <20211105204555.ggENH4V6V%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: ED4AC90000B4 X-Stat-Signature: cwiuim68jdtyb1hsmmxcfnit5eyp3aj4 Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=A8Jjy1nU; dmarc=none; spf=pass (imf23.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636145143-215903 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: 
bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: Documentation/vm: move user guides to admin-guide/mm/ Most memory management user guide documents are in 'admin-guide/mm/', but two of those are in 'vm/'. This commit moves the two docs into 'admin-guide/mm' for easier documents finding. Link: https://lkml.kernel.org/r/20210917123958.3819-2-sj@kernel.org Signed-off-by: SeongJae Park Signed-off-by: Andrew Morton --- Documentation/admin-guide/mm/index.rst | 2 ++ .../{vm => admin-guide/mm}/swap_numa.rst | 0 .../{vm => admin-guide/mm}/zswap.rst | 0 Documentation/vm/index.rst | 26 ++++--------------- 4 files changed, 7 insertions(+), 21 deletions(-) rename Documentation/{vm => admin-guide/mm}/swap_numa.rst (100%) rename Documentation/{vm => admin-guide/mm}/zswap.rst (100%) diff --git a/Documentation/admin-guide/mm/index.rst b/Documentation/admin-guide/mm/index.rst index cbd19d5e625f..c21b5823f126 100644 --- a/Documentation/admin-guide/mm/index.rst +++ b/Documentation/admin-guide/mm/index.rst @@ -37,5 +37,7 @@ the Linux memory management. numaperf pagemap soft-dirty + swap_numa transhuge userfaultfd + zswap diff --git a/Documentation/vm/swap_numa.rst b/Documentation/admin-guide/mm/swap_numa.rst similarity index 100% rename from Documentation/vm/swap_numa.rst rename to Documentation/admin-guide/mm/swap_numa.rst diff --git a/Documentation/vm/zswap.rst b/Documentation/admin-guide/mm/zswap.rst similarity index 100% rename from Documentation/vm/zswap.rst rename to Documentation/admin-guide/mm/zswap.rst diff --git a/Documentation/vm/index.rst b/Documentation/vm/index.rst index b51f0d8992f8..6f5ffef4b716 100644 --- a/Documentation/vm/index.rst +++ b/Documentation/vm/index.rst @@ -3,27 +3,11 @@ Linux Memory Management Documentation ===================================== This is a collection of documents about the Linux memory management (mm) -subsystem. If you are looking for advice on simply allocating memory, -see the :ref:`memory_allocation`. 
- -User guides for MM features -=========================== - -The following documents provide guides for controlling and tuning -various features of the Linux memory management - -.. toctree:: - :maxdepth: 1 - - swap_numa - zswap - -Kernel developers MM documentation -================================== - -The below documents describe MM internals with different level of -details ranging from notes and mailing list responses to elaborate -descriptions of data structures and algorithms. +subsystem internals with different level of details ranging from notes and +mailing list responses for elaborating descriptions of data structures and +algorithms. If you are looking for advice on simply allocating memory, see the +:ref:`memory_allocation`. For controlling and tuning guides, see the +:doc:`admin guide <../admin-guide/mm/index>`. .. toctree:: :maxdepth: 1 From patchwork Fri Nov 5 20:45:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605833 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B4237C433FE for ; Fri, 5 Nov 2021 20:46:01 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 857A66128E for ; Fri, 5 Nov 2021 20:46:01 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 857A66128E Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 4596A9400D2; Fri, 5 Nov 2021 16:46:00 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 36C889400C1; Fri, 5 Nov 2021 16:46:00 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org 
Received: by kanga.kvack.org (Postfix, from userid 63042) id 232F19400D2; Fri, 5 Nov 2021 16:46:00 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0241.hostedemail.com [216.40.44.241]) by kanga.kvack.org (Postfix) with ESMTP id 076739400C1 for ; Fri, 5 Nov 2021 16:46:00 -0400 (EDT) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id C59F1181B0496 for ; Fri, 5 Nov 2021 20:45:59 +0000 (UTC) X-FDA: 78776058564.05.E1004BC Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf13.hostedemail.com (Postfix) with ESMTP id 66329104AAEF for ; Fri, 5 Nov 2021 20:45:50 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 97DE961288; Fri, 5 Nov 2021 20:45:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145158; bh=arL/+cT2vlehXCChqicxg5TPamvvbL+JCxEAl9Uo+/0=; h=Date:From:To:Subject:In-Reply-To:From; b=v220vfvDTHGQXE3mKza+W4s91dCPqNsN1sh8ceRe0CXo0nQexv32vXuz/CobMAhNT wq4f02ME5PUml//SMRvJia2VawD8kpcjR1e4KE0OguNO4cAxsPnNq2roQxsaJzV2C3 eA4DLJzgxQLAQVnzHbD8DxBERkiQ4vP1l+VxEuoE= Date: Fri, 05 Nov 2021 13:45:58 -0700 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, sjpark@amazon.de, torvalds@linux-foundation.org Subject: [patch 216/262] MAINTAINERS: update SeongJae's email address Message-ID: <20211105204558.9bWvBTLQv%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 66329104AAEF X-Stat-Signature: brrhqco73tmmfrhi8oeqsjco9x374qfy Authentication-Results: imf13.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=v220vfvD; dmarc=none; spf=pass (imf13.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted 
sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636145150-216844 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: MAINTAINERS: update SeongJae's email address This commit updates SeongJae's email address in MAINTAINERS file to his preferred one. Link: https://lkml.kernel.org/r/20210917123958.3819-3-sj@kernel.org Signed-off-by: SeongJae Park Cc: Jonathan Corbet Cc: SeongJae Park Signed-off-by: Andrew Morton --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/MAINTAINERS~maintainers-update-seongjaes-email-address +++ a/MAINTAINERS @@ -5161,7 +5161,7 @@ F: net/ax25/ax25_timer.c F: net/ax25/sysctl_net_ax25.c DATA ACCESS MONITOR -M: SeongJae Park +M: SeongJae Park L: linux-mm@kvack.org S: Maintained F: Documentation/admin-guide/mm/damon/ From patchwork Fri Nov 5 20:46:01 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605835 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7EED5C433EF for ; Fri, 5 Nov 2021 20:46:05 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4DB096128E for ; Fri, 5 Nov 2021 20:46:05 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 4DB096128E Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id D566B9400D3; Fri, 5 Nov 2021 16:46:04 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D05649400C1; Fri, 5 Nov 2021 16:46:04 -0400 
(EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BF4839400D3; Fri, 5 Nov 2021 16:46:04 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0058.hostedemail.com [216.40.44.58]) by kanga.kvack.org (Postfix) with ESMTP id AA9269400C1 for ; Fri, 5 Nov 2021 16:46:04 -0400 (EDT) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 708C182499A8 for ; Fri, 5 Nov 2021 20:46:04 +0000 (UTC) X-FDA: 78776058648.09.A9004F7 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf30.hostedemail.com (Postfix) with ESMTP id 687D5E001995 for ; Fri, 5 Nov 2021 20:45:45 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 8E02D61288; Fri, 5 Nov 2021 20:46:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145161; bh=dLpCXweS9BQ/c81WVkbFG2RJg1JdzJeDs/+gsEXIuDg=; h=Date:From:To:Subject:In-Reply-To:From; b=boyF2jlCyzlmAtDI3gmK1TnpLE7rJTUZe7Wfwobaut5Kxt54+WYDMArNUSyOhTCh9 Tg7DnkmVJsfTQyGLtuaLDkWulCekjmOxRx/nWbnZt999BF0TqhZBQD4uJx5gQyCUks aRZHZFur3jUDBGVU12kZakLNLQ6pU5ct/75S0Zh0= Date: Fri, 05 Nov 2021 13:46:01 -0700 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, linux-mm@kvack.org, mm-commits@vger.kernel.org, sjpark@amazon.de, torvalds@linux-foundation.org Subject: [patch 217/262] docs/vm/damon: remove broken reference Message-ID: <20211105204601.1bYw-dhvK%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 687D5E001995 X-Stat-Signature: 57xtxzn37w81pbu7f4gm9c6mrw1nkykk Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=boyF2jlC; spf=pass (imf30.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as 
permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145145-804187 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: docs/vm/damon: remove broken reference Building DAMON documents warns for a reference to nonexisting doc, as below: $ time make htmldocs [...] Documentation/vm/damon/index.rst:24: WARNING: toctree contains reference to nonexisting document 'vm/damon/plans' This commit fixes the warning by removing the wrong reference. Link: https://lkml.kernel.org/r/20210917123958.3819-4-sj@kernel.org Signed-off-by: SeongJae Park Cc: Jonathan Corbet Signed-off-by: Andrew Morton --- Documentation/vm/damon/index.rst | 1 - 1 file changed, 1 deletion(-) --- a/Documentation/vm/damon/index.rst~docs-vm-damon-remove-broken-reference +++ a/Documentation/vm/damon/index.rst @@ -27,4 +27,3 @@ workloads and systems. faq design api - plans From patchwork Fri Nov 5 20:46:04 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605837 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 462B9C433EF for ; Fri, 5 Nov 2021 20:46:07 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id F1CDB61288 for ; Fri, 5 Nov 2021 20:46:06 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org F1CDB61288 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 2F56D9400D4; Fri, 5 Nov 2021 16:46:06 -0400 (EDT) Received: by 
kanga.kvack.org (Postfix, from userid 40) id 2A6409400C1; Fri, 5 Nov 2021 16:46:06 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0F86C9400D4; Fri, 5 Nov 2021 16:46:06 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0144.hostedemail.com [216.40.44.144]) by kanga.kvack.org (Postfix) with ESMTP id E9BCF9400C1 for ; Fri, 5 Nov 2021 16:46:05 -0400 (EDT) Received: from smtpin01.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id B13F682499A8 for ; Fri, 5 Nov 2021 20:46:05 +0000 (UTC) X-FDA: 78776058732.01.9AD04A8 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf01.hostedemail.com (Postfix) with ESMTP id 9B1B2508FA63 for ; Fri, 5 Nov 2021 20:45:53 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 7D12861288; Fri, 5 Nov 2021 20:46:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145164; bh=1hDB9nfGh9/MCnfFpWEWPrr61S0fnTZleNIJlYsPn/E=; h=Date:From:To:Subject:In-Reply-To:From; b=wAF7Dtnb60hWUbTbWrxbDeXc6ruxcGj58CiJApHLh1g9F+h/KAj7hS/WZtNiwde3b WUgWA1A8+/UmZIQF/ptKaMO+6TfSo8bgDJvjsgGE7BEO6jAFRHlGOmGzNDY4RFwW5u v649Au/IV/xJKRYlWq68ZepqGyMyM34zrdGqD6GU= Date: Fri, 05 Nov 2021 13:46:04 -0700 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, linux-mm@kvack.org, mm-commits@vger.kernel.org, sjpark@amazon.de, torvalds@linux-foundation.org Subject: [patch 218/262] include/linux/damon.h: fix kernel-doc comments for 'damon_callback' Message-ID: <20211105204604.SNmwZ955x%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=wAF7Dtnb; dmarc=none; spf=pass (imf01.hostedemail.com: domain of akpm@linux-foundation.org 
designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 9B1B2508FA63 X-Stat-Signature: a4k791jfpfibbupo53zdmnerbajri1rh X-HE-Tag: 1636145153-421038 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: include/linux/damon.h: fix kernel-doc comments for 'damon_callback' A few Kernel-doc comments in 'damon.h' are broken. This commit fixes those. Link: https://lkml.kernel.org/r/20210917123958.3819-5-sj@kernel.org Signed-off-by: SeongJae Park Cc: Jonathan Corbet Signed-off-by: Andrew Morton --- include/linux/damon.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) --- a/include/linux/damon.h~include-linux-damonh-fix-kernel-doc-comments-for-damon_callback +++ a/include/linux/damon.h @@ -62,7 +62,7 @@ struct damon_target { struct damon_ctx; /** - * struct damon_primitive Monitoring primitives for given use cases. + * struct damon_primitive - Monitoring primitives for given use cases. * * @init: Initialize primitive-internal data structures. * @update: Update primitive-internal data structures. @@ -108,8 +108,8 @@ struct damon_primitive { void (*cleanup)(struct damon_ctx *context); }; -/* - * struct damon_callback Monitoring events notification callbacks. +/** + * struct damon_callback - Monitoring events notification callbacks. * * @before_start: Called before starting the monitoring. * @after_sampling: Called after each sampling. 
From patchwork Fri Nov 5 20:46:06 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605839 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BAE2CC433EF for ; Fri, 5 Nov 2021 20:46:09 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 89D4E61362 for ; Fri, 5 Nov 2021 20:46:09 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 89D4E61362 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 0FAF79400D5; Fri, 5 Nov 2021 16:46:09 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 0AA409400C1; Fri, 5 Nov 2021 16:46:09 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id EDB679400D5; Fri, 5 Nov 2021 16:46:08 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0141.hostedemail.com [216.40.44.141]) by kanga.kvack.org (Postfix) with ESMTP id D53959400C1 for ; Fri, 5 Nov 2021 16:46:08 -0400 (EDT) Received: from smtpin31.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 9C1B91849531F for ; Fri, 5 Nov 2021 20:46:08 +0000 (UTC) X-FDA: 78776058816.31.3A1221F Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf28.hostedemail.com (Postfix) with ESMTP id 4ABF190000BA for ; Fri, 5 Nov 2021 20:46:08 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 6C1EE61288; Fri, 5 Nov 2021 20:46:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145167; bh=NRDUh2MdEyFRB0KledZvV54NdO0zcH9XWhGWSwoPEzw=; h=Date:From:To:Subject:In-Reply-To:From; b=uAnuQqciTDUsVLwlmw4Xxg+vscxCFLrsD7iWs4cP6vHp3QbS91+hHrk6yvCGF79oE Oofa0f/Qr7JuWfGJsoQUBsNRb6aFe02e7E+GwsOEQSLCGV45K0sbHTMJU/CJPUQ2jq WPcPTLL13FxannVHewoe5eGeZFcWVEIM6tn1tswg= Date: Fri, 05 Nov 2021 13:46:06 -0700 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, sjpark@amazon.de, torvalds@linux-foundation.org Subject: [patch 219/262] mm/damon/core: print kdamond start log in debug mode only Message-ID: <20211105204606.buFORSv6G%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=uAnuQqci; dmarc=none; spf=pass (imf28.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 4ABF190000BA X-Stat-Signature: nuxwm1ttz61czdzrbruuhbrgu7my3w6x X-HE-Tag: 1636145168-992183 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/core: print kdamond start log in debug mode only Logging of kdamond startup is using 'pr_info()' unnecessarily. This commit makes it to use 'pr_debug()' instead. 
Link: https://lkml.kernel.org/r/20210917123958.3819-6-sj@kernel.org Signed-off-by: SeongJae Park Cc: Jonathan Corbet Cc: SeongJae Park Signed-off-by: Andrew Morton --- mm/damon/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/damon/core.c~mm-damon-core-print-kdamond-start-log-in-debug-mode-only +++ a/mm/damon/core.c @@ -653,7 +653,7 @@ static int kdamond_fn(void *data) unsigned long sz_limit = 0; mutex_lock(&ctx->kdamond_lock); - pr_info("kdamond (%d) starts\n", ctx->kdamond->pid); + pr_debug("kdamond (%d) starts\n", ctx->kdamond->pid); mutex_unlock(&ctx->kdamond_lock); if (ctx->primitive.init) From patchwork Fri Nov 5 20:46:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605841 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BB431C433EF for ; Fri, 5 Nov 2021 20:46:12 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 887CD6128E for ; Fri, 5 Nov 2021 20:46:12 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 887CD6128E Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 199639400D6; Fri, 5 Nov 2021 16:46:12 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 147329400C1; Fri, 5 Nov 2021 16:46:12 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id F03CC9400D6; Fri, 5 Nov 2021 16:46:11 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0170.hostedemail.com [216.40.44.170]) by kanga.kvack.org 
(Postfix) with ESMTP id DD3D09400C1 for ; Fri, 5 Nov 2021 16:46:11 -0400 (EDT) Received: from smtpin03.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id A27FB18264875 for ; Fri, 5 Nov 2021 20:46:11 +0000 (UTC) X-FDA: 78776058942.03.75A4074 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf05.hostedemail.com (Postfix) with ESMTP id BAC00508FA46 for ; Fri, 5 Nov 2021 20:45:53 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 5ECFF6135A; Fri, 5 Nov 2021 20:46:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145170; bh=2YHM+jk7eYYqSgWjZAtGj9Hr2ACNlZNXejG7tpC+Ucg=; h=Date:From:To:Subject:In-Reply-To:From; b=nJJKS9kfjnHhin/PzuOP2TVtHPWsNiY4aiwoJQsC1v5uuxO/Bf9LGoftRUXlaD3vx MtzLOZqXSx0vtB1crzjHi728VnG+Xkluh7zq4b92ExGnfUrpPpP0pWKMKWWPqgJi4X L8dZhAwxOSYc6qIFIse4IJyxhCopPzRd43jxQz2U= Date: Fri, 05 Nov 2021 13:46:09 -0700 From: Andrew Morton To: akpm@linux-foundation.org, changbin.du@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, sjpark@amazon.de, torvalds@linux-foundation.org Subject: [patch 220/262] mm/damon: remove unnecessary do_exit() from kdamond Message-ID: <20211105204609.O_S_2ZhvC%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf05.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=nJJKS9kf; dmarc=none; spf=pass (imf05.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: BAC00508FA46 X-Stat-Signature: 3u3493tohyhpu1pniy1yo9j88rfmtmgp X-HE-Tag: 1636145153-725652 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Changbin Du 
Subject: mm/damon: remove unnecessary do_exit() from kdamond Just return from the kthread function. Link: https://lkml.kernel.org/r/20210927232421.17694-1-changbin.du@gmail.com Signed-off-by: Changbin Du Cc: SeongJae Park Signed-off-by: Andrew Morton --- mm/damon/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/damon/core.c~mm-damon-remove-unnecessary-do_exit-from-kdamond +++ a/mm/damon/core.c @@ -714,7 +714,7 @@ static int kdamond_fn(void *data) nr_running_ctxs--; mutex_unlock(&damon_lock); - do_exit(0); + return 0; } #include "core-test.h" From patchwork Fri Nov 5 20:46:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605849 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4EA1DC433EF for ; Fri, 5 Nov 2021 20:46:30 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 0417A6135A for ; Fri, 5 Nov 2021 20:46:30 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 0417A6135A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 8C2669400DA; Fri, 5 Nov 2021 16:46:29 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 84B919400DB; Fri, 5 Nov 2021 16:46:29 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 628E79400DA; Fri, 5 Nov 2021 16:46:29 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0242.hostedemail.com [216.40.44.242]) by kanga.kvack.org (Postfix) with ESMTP id 4C6449400C1 for ; Fri, 5 Nov 
2021 16:46:29 -0400 (EDT) Received: from smtpin01.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 18C3818494E60 for ; Fri, 5 Nov 2021 20:46:29 +0000 (UTC) X-FDA: 78776059740.01.3BA8C81 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf02.hostedemail.com (Postfix) with ESMTP id 1D6517001A05 for ; Fri, 5 Nov 2021 20:46:21 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 4E9736128E; Fri, 5 Nov 2021 20:46:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145173; bh=3i89PQ7nij2ffc9bwDtJ1DYgHE4O3FbG1Cw9fmK7GFc=; h=Date:From:To:Subject:In-Reply-To:From; b=hDQxVMyS4wZ7X3HjpX8l5qkBWgibA6/WkpQZ0VOnKSv2dIcBPwg9CXD7rFaiTnf5s I80ATNyICfMWs8uovq7oCuI3PMVY4T516aYCxzr4/eGhzW8/MfCstjibMskvxsyMsN 45G0MnDhWXD8UKzuH5Sy5DP0Dh9jZVAD4+l/ZK4E= Date: Fri, 05 Nov 2021 13:46:12 -0700 From: Andrew Morton To: akpm@linux-foundation.org, changbin.du@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 221/262] mm/damon: needn't hold kdamond_lock to print pid of kdamond Message-ID: <20211105204612.z0ZosYrZ6%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 1D6517001A05 X-Stat-Signature: 4pxfp5fhf9eogj34wcmf8gcc55ajfa6k Authentication-Results: imf02.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=hDQxVMyS; spf=pass (imf02.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145181-497597 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Changbin Du Subject: mm/damon: needn't hold kdamond_lock to 
print pid of kdamond Just get the pid by 'current->pid'. Meanwhile, to be symmetrical make the 'starts' and 'finishes' logs both use debug level. Link: https://lkml.kernel.org/r/20210927232432.17750-1-changbin.du@gmail.com Signed-off-by: Changbin Du Reviewed-by: SeongJae Park Signed-off-by: Andrew Morton --- mm/damon/core.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) --- a/mm/damon/core.c~mm-damon-neednt-hold-kdamond_lock-to-print-pid-of-kdamond +++ a/mm/damon/core.c @@ -652,9 +652,7 @@ static int kdamond_fn(void *data) unsigned int max_nr_accesses = 0; unsigned long sz_limit = 0; - mutex_lock(&ctx->kdamond_lock); - pr_debug("kdamond (%d) starts\n", ctx->kdamond->pid); - mutex_unlock(&ctx->kdamond_lock); + pr_debug("kdamond (%d) starts\n", current->pid); if (ctx->primitive.init) ctx->primitive.init(ctx); @@ -705,7 +703,7 @@ static int kdamond_fn(void *data) if (ctx->primitive.cleanup) ctx->primitive.cleanup(ctx); - pr_debug("kdamond (%d) finishes\n", ctx->kdamond->pid); + pr_debug("kdamond (%d) finishes\n", current->pid); mutex_lock(&ctx->kdamond_lock); ctx->kdamond = NULL; mutex_unlock(&ctx->kdamond_lock); From patchwork Fri Nov 5 20:46:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605843 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 618B9C433F5 for ; Fri, 5 Nov 2021 20:46:18 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 30C0061288 for ; Fri, 5 Nov 2021 20:46:18 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 30C0061288 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass 
smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id BCFF49400D7; Fri, 5 Nov 2021 16:46:17 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B7EC19400C1; Fri, 5 Nov 2021 16:46:17 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A6E0C9400D7; Fri, 5 Nov 2021 16:46:17 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0003.hostedemail.com [216.40.44.3]) by kanga.kvack.org (Postfix) with ESMTP id 93F269400C1 for ; Fri, 5 Nov 2021 16:46:17 -0400 (EDT) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 5A32076B46 for ; Fri, 5 Nov 2021 20:46:17 +0000 (UTC) X-FDA: 78776059194.24.E269FF4 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf31.hostedemail.com (Postfix) with ESMTP id A8D5A104AACB for ; Fri, 5 Nov 2021 20:46:08 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 348D261288; Fri, 5 Nov 2021 20:46:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145176; bh=zlvH2jY7naVEqZ/kfzMKr2KBE+X/tSwBozn9f+5T/R0=; h=Date:From:To:Subject:In-Reply-To:From; b=wLTEs5XeCgXNx5pzdy5f6hxbKsDlP9iO698eXSHWF7TKFGe1HgsfoF83+vJSqwyqX GMzrgCnDIHLSu4Txd+JrAnOCFgzCtgO/P3nefiSJviIZoC0HgHDOcmmYIzQmX/nSMK Rxf+o3+JstepaeO//p/8yFWWf84eszINy3NP5qm8= Date: Fri, 05 Nov 2021 13:46:15 -0700 From: Andrew Morton To: akpm@linux-foundation.org, colin.king@canonical.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 222/262] mm/damon/core: nullify pointer ctx->kdamond with a NULL Message-ID: <20211105204615.IkBgRaxHP%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: A8D5A104AACB X-Stat-Signature: 
7ueb996fhmfr38o8e9kse3gct16u9eh4 Authentication-Results: imf31.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=wLTEs5Xe; dmarc=none; spf=pass (imf31.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636145168-113541 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Colin Ian King Subject: mm/damon/core: nullify pointer ctx->kdamond with a NULL Currently a plain integer is being used to nullify the pointer ctx->kdamond. Use NULL instead. Cleans up sparse warning: mm/damon/core.c:317:40: warning: Using plain integer as NULL pointer Link: https://lkml.kernel.org/r/20210925215908.181226-1-colin.king@canonical.com Signed-off-by: Colin Ian King Reviewed-by: SeongJae Park Signed-off-by: Andrew Morton --- mm/damon/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/damon/core.c~mm-damon-core-nullify-pointer-ctx-kdamond-with-a-null +++ a/mm/damon/core.c @@ -314,7 +314,7 @@ static int __damon_start(struct damon_ct nr_running_ctxs); if (IS_ERR(ctx->kdamond)) { err = PTR_ERR(ctx->kdamond); - ctx->kdamond = 0; + ctx->kdamond = NULL; } } mutex_unlock(&ctx->kdamond_lock); From patchwork Fri Nov 5 20:46:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605845 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4F4A2C433EF for ; Fri, 5 Nov 2021 20:46:22 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 068F561357 for ; Fri, 5 Nov 2021 20:46:22 +0000 (UTC) DMARC-Filter: 
OpenDMARC Filter v1.4.1 mail.kernel.org 068F561357 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 8E8A59400D8; Fri, 5 Nov 2021 16:46:21 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 871859400C1; Fri, 5 Nov 2021 16:46:21 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 73A629400D8; Fri, 5 Nov 2021 16:46:21 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0095.hostedemail.com [216.40.44.95]) by kanga.kvack.org (Postfix) with ESMTP id 5F3C49400C1 for ; Fri, 5 Nov 2021 16:46:21 -0400 (EDT) Received: from smtpin01.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 225255C09A for ; Fri, 5 Nov 2021 20:46:21 +0000 (UTC) X-FDA: 78776059404.01.ECD47AD Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf11.hostedemail.com (Postfix) with ESMTP id AB819F0000BF for ; Fri, 5 Nov 2021 20:46:20 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 63A6F6135A; Fri, 5 Nov 2021 20:46:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145179; bh=q82RoE9izSEsXms1RkG5GEtUpT2Gy2iCl4EoOF7jce0=; h=Date:From:To:Subject:In-Reply-To:From; b=1MQUy3dDOOJo2wJ8C+JuKjEFS+gGdiBKmu/MM3U+eQWeYV3ku133mCFjariicQ6sy qLNP83e/b5kZaSpVEVxIQBd3Cbw7PYrHlrXHIHMZsaM8ndfaKcY7Jgzz4QrpU4YUIq /4BLBqhpX2Q6afV0Mn13PTwISq0k5iAn6+5BsmMU= Date: Fri, 05 Nov 2021 13:46:18 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, 
rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 223/262] mm/damon/core: account age of target regions Message-ID: <20211105204618.Vn7J7CSp_%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: AB819F0000BF X-Stat-Signature: ow16nns9uzxgm578hb67n1jnz1x5p9s5 Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=1MQUy3dD; spf=pass (imf11.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145180-191062 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/core: account age of target regions

Patch series "Implement Data Access Monitoring-based Memory Operation Schemes".

Introduction
============

DAMON[1] can be used as a primitive for data access aware memory management optimizations. For that, users who want such optimizations should run DAMON, read the monitoring results, analyze them, plan a new memory management scheme, and apply the new scheme themselves. Such efforts will be inevitable for some complicated optimizations.

However, in many other cases, the users would simply want the system to apply a memory management action to a memory region of a specific size having a specific access frequency for a specific time. For example, "page out a memory region larger than 100 MiB that has been only rarely accessed for more than 2 minutes", or "Do not use THP for a memory region larger than 2 MiB rarely accessed for more than 1 second".

To make this work easier and avoid redundant effort, this patchset implements a new DAMON feature called Data Access Monitoring-based Operation Schemes (DAMOS). Using the feature, users can describe such schemes in a simple way and ask DAMON to execute them on its own.

[1] https://damonitor.github.io

Evaluations
===========

DAMOS is accurate and useful for memory management optimizations. An experimental DAMON-based operation scheme for THP, 'ethp', removes 76.15% of THP memory overheads while preserving 51.25% of THP speedup. Another experimental DAMON-based 'proactive reclamation' implementation, 'prcl', reduces 93.38% of resident sets and 23.63% of system memory footprint while incurring only 1.22% runtime overhead in the best case (parsec3/freqmine).

NOTE that the experimental THP optimization and proactive reclamation are not for production use but only proofs of concept.

Please refer to the showcase web site's evaluation document[1] for the detailed evaluation setup and results.

[1] https://damonitor.github.io/doc/html/v34/vm/damon/eval.html

Long-term Support Trees
-----------------------

For people who want to test DAMON on LTS kernels, there are two more trees, based on the two latest LTS kernels respectively and containing the 'damon/master' backports.

- For v5.4.y: https://git.kernel.org/sj/h/damon/for-v5.4.y
- For v5.10.y: https://git.kernel.org/sj/h/damon/for-v5.10.y

Sequence Of Patches
===================

The 1st patch accounts the age of each region. The 2nd patch implements the core of the DAMON-based operation schemes feature. The 3rd patch makes the default monitoring primitives for virtual address spaces support the schemes. From this point, kernel space users can use DAMOS. The 4th patch exports the feature to user space via the debugfs interface. The 5th patch implements a scheme statistics feature for easier tuning of the schemes and runtime access pattern analysis, and the 6th patch adds selftests for these changes.
Finally, the 7th patch documents this new feature.

This patch (of 7):

DAMON can be used for data access pattern aware memory management optimizations. For that, users should run DAMON, read the monitoring results, analyze them, plan a new memory management scheme, and apply the new scheme themselves. This is not too hard, but it still requires some effort. For complicated cases, this effort is inevitable.

That said, in many cases, users would simply want to apply an action to a memory region of a specific size having a specific access frequency for a specific time. For example, "page out a memory region larger than 100 MiB but having a low access frequency for more than 10 minutes", or "Use THP for a memory region larger than 2 MiB having a high access frequency for more than 2 seconds".

For such optimizations, users will need to first account the age of each region themselves. To reduce such efforts, this commit implements simple age accounting for each region in DAMON. For each aggregation step, DAMON compares the access frequency with that from the last aggregation and resets the age of the region if the change is significant. Otherwise, the age is incremented. Also, when regions are merged, the region size-weighted average of the ages is set as the age of the new merged region.

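The age accounting described above can be illustrated outside the kernel with a minimal userspace sketch. It is not the kernel code: 'struct region', 'region_update_age()' and 'region_merge()' are hypothetical names chosen for this illustration, and only the reset-or-increment rule and the size-weighted average follow the description in this commit message.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for 'struct damon_region'. */
struct region {
	unsigned long start;		/* inclusive start address */
	unsigned long end;		/* exclusive end address */
	unsigned int nr_accesses;	/* access frequency of this aggregation */
	unsigned int last_nr_accesses;	/* access frequency of the last one */
	unsigned int age;		/* in aggregation intervals */
};

/* Absolute difference of two unsigned values. */
static unsigned int diff_of(unsigned int a, unsigned int b)
{
	return a > b ? a - b : b - a;
}

static unsigned long region_size(const struct region *r)
{
	return r->end - r->start;
}

/*
 * Reset the age if the access frequency changed by more than 'thres'
 * since the last aggregation; otherwise the region grows one step older.
 */
static void region_update_age(struct region *r, unsigned int thres)
{
	if (diff_of(r->nr_accesses, r->last_nr_accesses) > thres)
		r->age = 0;
	else
		r->age++;
}

/*
 * Merge 'r' into its left neighbour 'l'.  Both the access frequency and
 * the age of the merged region are region size-weighted averages.
 */
static void region_merge(struct region *l, const struct region *r)
{
	unsigned long sz_l = region_size(l), sz_r = region_size(r);

	assert(sz_l + sz_r > 0);
	l->nr_accesses = (l->nr_accesses * sz_l + r->nr_accesses * sz_r) /
			(sz_l + sz_r);
	l->age = (l->age * sz_l + r->age * sz_r) / (sz_l + sz_r);
	l->end = r->end;
}
```

With a threshold of 2, a region whose frequency stays at 10 ages by one per aggregation, while a jump from 10 to 20 resets its age to zero. Merging a 300-byte region of age 10 with a 100-byte region of age 2 gives age (10 * 300 + 2 * 100) / 400 = 8, so the larger region dominates the result.
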
Link: https://lkml.kernel.org/r/20211001125604.29660-1-sj@kernel.org Link: https://lkml.kernel.org/r/20211001125604.29660-2-sj@kernel.org Signed-off-by: SeongJae Park Cc: Jonathan Cameron Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: Jonathan Corbet Cc: David Hildenbrand Cc: David Woodhouse Cc: Marco Elver Cc: Leonard Foerster Cc: Greg Thelen Cc: Markus Boehme Cc: David Rienjes Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- include/linux/damon.h | 10 ++++++++++ mm/damon/core.c | 13 +++++++++++++ 2 files changed, 23 insertions(+) --- a/include/linux/damon.h~mm-damon-core-account-age-of-target-regions +++ a/include/linux/damon.h @@ -31,12 +31,22 @@ struct damon_addr_range { * @sampling_addr: Address of the sample for the next access check. * @nr_accesses: Access frequency of this region. * @list: List head for siblings. + * @age: Age of this region. + * + * @age is initially zero, increased for each aggregation interval, and reset + * to zero again if the access frequency is significantly changed. If two + * regions are merged into a new region, both @nr_accesses and @age of the new + * region are set as region size-weighted average of those of the two regions. */ struct damon_region { struct damon_addr_range ar; unsigned long sampling_addr; unsigned int nr_accesses; struct list_head list; + + unsigned int age; +/* private: Internal value for age calculation. 
*/ + unsigned int last_nr_accesses; }; /** --- a/mm/damon/core.c~mm-damon-core-account-age-of-target-regions +++ a/mm/damon/core.c @@ -45,6 +45,9 @@ struct damon_region *damon_new_region(un region->nr_accesses = 0; INIT_LIST_HEAD(&region->list); + region->age = 0; + region->last_nr_accesses = 0; + return region; } @@ -444,6 +447,7 @@ static void kdamond_reset_aggregated(str damon_for_each_region(r, t) { trace_damon_aggregated(t, r, damon_nr_regions(t)); + r->last_nr_accesses = r->nr_accesses; r->nr_accesses = 0; } } @@ -461,6 +465,7 @@ static void damon_merge_two_regions(stru l->nr_accesses = (l->nr_accesses * sz_l + r->nr_accesses * sz_r) / (sz_l + sz_r); + l->age = (l->age * sz_l + r->age * sz_r) / (sz_l + sz_r); l->ar.end = r->ar.end; damon_destroy_region(r, t); } @@ -480,6 +485,11 @@ static void damon_merge_regions_of(struc struct damon_region *r, *prev = NULL, *next; damon_for_each_region_safe(r, next, t) { + if (diff_of(r->nr_accesses, r->last_nr_accesses) > thres) + r->age = 0; + else + r->age++; + if (prev && prev->ar.end == r->ar.start && diff_of(prev->nr_accesses, r->nr_accesses) <= thres && sz_damon_region(prev) + sz_damon_region(r) <= sz_limit) @@ -527,6 +537,9 @@ static void damon_split_region_at(struct r->ar.end = new->ar.start; + new->age = r->age; + new->last_nr_accesses = r->last_nr_accesses; + damon_insert_region(new, r, damon_next_region(r), t); } From patchwork Fri Nov 5 20:46:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605847 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C6081C433F5 for ; Fri, 5 Nov 2021 20:46:25 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 7CB3161357 for ; Fri, 5 Nov 
2021 20:46:25 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 7CB3161357 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 0873D9400D9; Fri, 5 Nov 2021 16:46:25 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 00C8E9400C1; Fri, 5 Nov 2021 16:46:24 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DEFC29400D9; Fri, 5 Nov 2021 16:46:24 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0181.hostedemail.com [216.40.44.181]) by kanga.kvack.org (Postfix) with ESMTP id C9B919400C1 for ; Fri, 5 Nov 2021 16:46:24 -0400 (EDT) Received: from smtpin27.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 916C45CBBF for ; Fri, 5 Nov 2021 20:46:24 +0000 (UTC) X-FDA: 78776059488.27.75D3D58 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf07.hostedemail.com (Postfix) with ESMTP id 0EC4B10000AB for ; Fri, 5 Nov 2021 20:46:23 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id CEB5B61288; Fri, 5 Nov 2021 20:46:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145183; bh=9V0eb5QqQoKtGcihqckm2W6CG5Yvlezrz3F8KXc895M=; h=Date:From:To:Subject:In-Reply-To:From; b=ovnfW0b691xxGhMCFtM9bZW1uyhUvaAzdY4l5SLI8Tssy/vYscPHH7MsPzhD7SHvA GMD352y7E2eEsEw2VHFM5M3O1ShobYLZotiO+TWy2wroAFU8XnWuRXJVU8IeyUW3rW wfGHG9QNZv876MGS0RpfeD/JTCYqgkMFocntycok= Date: Fri, 05 Nov 2021 13:46:22 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, 
markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 224/262] mm/damon/core: implement DAMON-based Operation Schemes (DAMOS) Message-ID: <20211105204622.GKwY0kwNC%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=ovnfW0b6; dmarc=none; spf=pass (imf07.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 0EC4B10000AB X-Stat-Signature: zfaksi3twpu34ihkcadh3b89ds1e3794 X-HE-Tag: 1636145183-44404 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/core: implement DAMON-based Operation Schemes (DAMOS)

In many cases, users might use DAMON for simple data access aware memory management optimizations such as applying an operation scheme to a memory region of a specific size having a specific access frequency for a specific time. For example, "page out a memory region larger than 100 MiB but having a low access frequency for more than 10 minutes", or "Use THP for a memory region larger than 2 MiB having a high access frequency for more than 2 seconds".

The simplest form of a solution would be doing offline data access pattern profiling using DAMON and modifying the application source code or system configuration based on the profiling results. Alternatively, one could develop a daemon consisting of two modules (one for access monitoring and the other for applying memory management actions via mlock(), madvise(), sysctl, etc).

To spare users from implementing such simple data access monitoring-based operation schemes themselves, this commit makes DAMON handle such schemes directly. With this commit, users can simply specify their desired schemes to DAMON. DAMON will then automatically apply the schemes to the user-specified target processes.

Each scheme is composed of conditions for filtering the target memory regions and the desired memory management action for the target. Specifically, the format is::

    <min/max size> <min/max access frequency> <min/max age> <action>

The filtering conditions are the size of the memory region, the number of accesses to the region monitored by DAMON, and the age of the region. The age of a region is incremented periodically, but it is reset when the region's addresses or access frequency have changed significantly, or when the action of a scheme has been applied to it. For the action, the current implementation supports a few madvise()-like hints: ``WILLNEED``, ``COLD``, ``PAGEOUT``, ``HUGEPAGE``, and ``NOHUGEPAGE``.

Because DAMON supports various address spaces, and applying the actions to a monitoring target region depends on the type of the target address space, the application code should be implemented by each primitive and registered with the framework. Note that this commit implements only the framework part. The following commit will implement the action applications for the virtual address spaces primitives.
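As a rough illustration of the filtering conditions described in this commit message, the following userspace C sketch checks whether a region satisfies a scheme's bounds, treating both the minimums and the maximums as inclusive. The names `region_sketch`, `scheme_sketch`, and `scheme_matches` are hypothetical and exist only for this example; the in-kernel framework uses its own structures and helpers.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a monitored region. */
struct region_sketch {
	unsigned long start;
	unsigned long end;
	unsigned int nr_accesses;
	unsigned int age;
};

/* Hypothetical, simplified stand-in for the filtering part of a scheme. */
struct scheme_sketch {
	unsigned long min_sz, max_sz;
	unsigned int min_nr_accesses, max_nr_accesses;
	unsigned int min_age, max_age;
};

/*
 * Return true if the region fits every condition of the scheme.
 * Both the minimums and the maximums are inclusive.
 */
bool scheme_matches(const struct scheme_sketch *s,
		    const struct region_sketch *r)
{
	unsigned long sz = r->end - r->start;

	if (sz < s->min_sz || s->max_sz < sz)
		return false;
	if (r->nr_accesses < s->min_nr_accesses ||
	    s->max_nr_accesses < r->nr_accesses)
		return false;
	if (r->age < s->min_age || s->max_age < r->age)
		return false;
	return true;
}
```

A region that fails any single condition is skipped, so a scheme only acts on regions that fit the size, access count, and age bounds at the same time.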
Link: https://lkml.kernel.org/r/20211001125604.29660-3-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: David Hildenbrand Cc: David Rienjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- include/linux/damon.h | 66 ++++++++++++++++++++++++ mm/damon/core.c | 109 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 175 insertions(+) --- a/include/linux/damon.h~mm-damon-core-implement-damon-based-operation-schemes-damos +++ a/include/linux/damon.h @@ -69,6 +69,48 @@ struct damon_target { struct list_head list; }; +/** + * enum damos_action - Represents an action of a Data Access Monitoring-based + * Operation Scheme. + * + * @DAMOS_WILLNEED: Call ``madvise()`` for the region with MADV_WILLNEED. + * @DAMOS_COLD: Call ``madvise()`` for the region with MADV_COLD. + * @DAMOS_PAGEOUT: Call ``madvise()`` for the region with MADV_PAGEOUT. + * @DAMOS_HUGEPAGE: Call ``madvise()`` for the region with MADV_HUGEPAGE. + * @DAMOS_NOHUGEPAGE: Call ``madvise()`` for the region with MADV_NOHUGEPAGE. + */ +enum damos_action { + DAMOS_WILLNEED, + DAMOS_COLD, + DAMOS_PAGEOUT, + DAMOS_HUGEPAGE, + DAMOS_NOHUGEPAGE, +}; + +/** + * struct damos - Represents a Data Access Monitoring-based Operation Scheme. + * @min_sz_region: Minimum size of target regions. + * @max_sz_region: Maximum size of target regions. + * @min_nr_accesses: Minimum ``->nr_accesses`` of target regions. + * @max_nr_accesses: Maximum ``->nr_accesses`` of target regions. + * @min_age_region: Minimum age of target regions. + * @max_age_region: Maximum age of target regions. + * @action: &damo_action to be applied to the target regions. + * @list: List head for siblings. + * + * Note that both the minimums and the maximums are inclusive. 
+ */ +struct damos { + unsigned long min_sz_region; + unsigned long max_sz_region; + unsigned int min_nr_accesses; + unsigned int max_nr_accesses; + unsigned int min_age_region; + unsigned int max_age_region; + enum damos_action action; + struct list_head list; +}; + struct damon_ctx; /** @@ -79,6 +121,7 @@ struct damon_ctx; * @prepare_access_checks: Prepare next access check of target regions. * @check_accesses: Check the accesses to target regions. * @reset_aggregated: Reset aggregated accesses monitoring results. + * @apply_scheme: Apply a DAMON-based operation scheme. * @target_valid: Determine if the target is valid. * @cleanup: Clean up the context. * @@ -104,6 +147,9 @@ struct damon_ctx; * of its update. The value will be used for regions adjustment threshold. * @reset_aggregated should reset the access monitoring results that aggregated * by @check_accesses. + * @apply_scheme is called from @kdamond when a region for user provided + * DAMON-based operation scheme is found. It should apply the scheme's action + * to the region. This is not used for &DAMON_ARBITRARY_TARGET case. * @target_valid should check whether the target is still valid for the * monitoring. * @cleanup is called from @kdamond just before its termination. @@ -114,6 +160,8 @@ struct damon_primitive { void (*prepare_access_checks)(struct damon_ctx *context); unsigned int (*check_accesses)(struct damon_ctx *context); void (*reset_aggregated)(struct damon_ctx *context); + int (*apply_scheme)(struct damon_ctx *context, struct damon_target *t, + struct damon_region *r, struct damos *scheme); bool (*target_valid)(void *target); void (*cleanup)(struct damon_ctx *context); }; @@ -192,6 +240,7 @@ struct damon_callback { * @min_nr_regions: The minimum number of adaptive monitoring regions. * @max_nr_regions: The maximum number of adaptive monitoring regions. * @adaptive_targets: Head of monitoring targets (&damon_target) list. + * @schemes: Head of schemes (&damos) list. 
*/ struct damon_ctx { unsigned long sample_interval; @@ -213,6 +262,7 @@ struct damon_ctx { unsigned long min_nr_regions; unsigned long max_nr_regions; struct list_head adaptive_targets; + struct list_head schemes; }; #define damon_next_region(r) \ @@ -233,6 +283,12 @@ struct damon_ctx { #define damon_for_each_target_safe(t, next, ctx) \ list_for_each_entry_safe(t, next, &(ctx)->adaptive_targets, list) +#define damon_for_each_scheme(s, ctx) \ + list_for_each_entry(s, &(ctx)->schemes, list) + +#define damon_for_each_scheme_safe(s, next, ctx) \ + list_for_each_entry_safe(s, next, &(ctx)->schemes, list) + #ifdef CONFIG_DAMON struct damon_region *damon_new_region(unsigned long start, unsigned long end); @@ -242,6 +298,14 @@ inline void damon_insert_region(struct d void damon_add_region(struct damon_region *r, struct damon_target *t); void damon_destroy_region(struct damon_region *r, struct damon_target *t); +struct damos *damon_new_scheme( + unsigned long min_sz_region, unsigned long max_sz_region, + unsigned int min_nr_accesses, unsigned int max_nr_accesses, + unsigned int min_age_region, unsigned int max_age_region, + enum damos_action action); +void damon_add_scheme(struct damon_ctx *ctx, struct damos *s); +void damon_destroy_scheme(struct damos *s); + struct damon_target *damon_new_target(unsigned long id); void damon_add_target(struct damon_ctx *ctx, struct damon_target *t); void damon_free_target(struct damon_target *t); @@ -255,6 +319,8 @@ int damon_set_targets(struct damon_ctx * int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int, unsigned long aggr_int, unsigned long primitive_upd_int, unsigned long min_nr_reg, unsigned long max_nr_reg); +int damon_set_schemes(struct damon_ctx *ctx, + struct damos **schemes, ssize_t nr_schemes); int damon_nr_running_ctxs(void); int damon_start(struct damon_ctx **ctxs, int nr_ctxs); --- a/mm/damon/core.c~mm-damon-core-implement-damon-based-operation-schemes-damos +++ a/mm/damon/core.c @@ -85,6 +85,50 @@ void 
damon_destroy_region(struct damon_r damon_free_region(r); } +struct damos *damon_new_scheme( + unsigned long min_sz_region, unsigned long max_sz_region, + unsigned int min_nr_accesses, unsigned int max_nr_accesses, + unsigned int min_age_region, unsigned int max_age_region, + enum damos_action action) +{ + struct damos *scheme; + + scheme = kmalloc(sizeof(*scheme), GFP_KERNEL); + if (!scheme) + return NULL; + scheme->min_sz_region = min_sz_region; + scheme->max_sz_region = max_sz_region; + scheme->min_nr_accesses = min_nr_accesses; + scheme->max_nr_accesses = max_nr_accesses; + scheme->min_age_region = min_age_region; + scheme->max_age_region = max_age_region; + scheme->action = action; + INIT_LIST_HEAD(&scheme->list); + + return scheme; +} + +void damon_add_scheme(struct damon_ctx *ctx, struct damos *s) +{ + list_add_tail(&s->list, &ctx->schemes); +} + +static void damon_del_scheme(struct damos *s) +{ + list_del(&s->list); +} + +static void damon_free_scheme(struct damos *s) +{ + kfree(s); +} + +void damon_destroy_scheme(struct damos *s) +{ + damon_del_scheme(s); + damon_free_scheme(s); +} + /* * Construct a damon_target struct * @@ -156,6 +200,7 @@ struct damon_ctx *damon_new_ctx(void) ctx->max_nr_regions = 1000; INIT_LIST_HEAD(&ctx->adaptive_targets); + INIT_LIST_HEAD(&ctx->schemes); return ctx; } @@ -175,7 +220,13 @@ static void damon_destroy_targets(struct void damon_destroy_ctx(struct damon_ctx *ctx) { + struct damos *s, *next_s; + damon_destroy_targets(ctx); + + damon_for_each_scheme_safe(s, next_s, ctx) + damon_destroy_scheme(s); + kfree(ctx); } @@ -251,6 +302,30 @@ int damon_set_attrs(struct damon_ctx *ct } /** + * damon_set_schemes() - Set data access monitoring based operation schemes. + * @ctx: monitoring context + * @schemes: array of the schemes + * @nr_schemes: number of entries in @schemes + * + * This function should not be called while the kdamond of the context is + * running. + * + * Return: 0 if success, or negative error code otherwise. 
+ */ +int damon_set_schemes(struct damon_ctx *ctx, struct damos **schemes, + ssize_t nr_schemes) +{ + struct damos *s, *next; + ssize_t i; + + damon_for_each_scheme_safe(s, next, ctx) + damon_destroy_scheme(s); + for (i = 0; i < nr_schemes; i++) + damon_add_scheme(ctx, schemes[i]); + return 0; +} + +/** * damon_nr_running_ctxs() - Return number of currently running contexts. */ int damon_nr_running_ctxs(void) @@ -453,6 +528,39 @@ static void kdamond_reset_aggregated(str } } +static void damon_do_apply_schemes(struct damon_ctx *c, + struct damon_target *t, + struct damon_region *r) +{ + struct damos *s; + unsigned long sz; + + damon_for_each_scheme(s, c) { + sz = r->ar.end - r->ar.start; + if (sz < s->min_sz_region || s->max_sz_region < sz) + continue; + if (r->nr_accesses < s->min_nr_accesses || + s->max_nr_accesses < r->nr_accesses) + continue; + if (r->age < s->min_age_region || s->max_age_region < r->age) + continue; + if (c->primitive.apply_scheme) + c->primitive.apply_scheme(c, t, r, s); + r->age = 0; + } +} + +static void kdamond_apply_schemes(struct damon_ctx *c) +{ + struct damon_target *t; + struct damon_region *r; + + damon_for_each_target(t, c) { + damon_for_each_region(r, t) + damon_do_apply_schemes(c, t, r); + } +} + #define sz_damon_region(r) (r->ar.end - r->ar.start) /* @@ -693,6 +801,7 @@ static int kdamond_fn(void *data) if (ctx->callback.after_aggregation && ctx->callback.after_aggregation(ctx)) set_kdamond_stop(ctx); + kdamond_apply_schemes(ctx); kdamond_reset_aggregated(ctx); kdamond_split_regions(ctx); if (ctx->primitive.reset_aggregated) From patchwork Fri Nov 5 20:46:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605851 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with 
ESMTP id 28ED5C433EF for ; Fri, 5 Nov 2021 20:46:32 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D009B6135A for ; Fri, 5 Nov 2021 20:46:31 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org D009B6135A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 089FF9400DB; Fri, 5 Nov 2021 16:46:30 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id F27999400C1; Fri, 5 Nov 2021 16:46:29 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DEF799400DB; Fri, 5 Nov 2021 16:46:29 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0240.hostedemail.com [216.40.44.240]) by kanga.kvack.org (Postfix) with ESMTP id C87979400C1 for ; Fri, 5 Nov 2021 16:46:29 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 92B4518494E60 for ; Fri, 5 Nov 2021 20:46:29 +0000 (UTC) X-FDA: 78776059698.22.F974199 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf18.hostedemail.com (Postfix) with ESMTP id 99CE0400208A for ; Fri, 5 Nov 2021 20:46:27 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 489FE61357; Fri, 5 Nov 2021 20:46:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145186; bh=6J4MTwLxORuqJZBB6UwiJPJwGRtEupa/QMnQOxGNzxQ=; h=Date:From:To:Subject:In-Reply-To:From; b=zbr0Nxi65yEl5M+kX0lGvPeddFtdDzZ9VdHlKMR+XRX6uE6OxLykQj8qa6m6/FK9U /6b98ijik+ulUqmGGsARkmjw2gzOjmfoSHDM1mIeLcUS2DJtkXf40y5WAH8jDub3Xu 43N3JGikE1SK+yj7msLLCfCdmJdw5qDX/7SHilro= Date: Fri, 05 Nov 2021 13:46:25 -0700 From: Andrew Morton To: akpm@linux-foundation.org, 
amit@kernel.org, benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 225/262] mm/damon/vaddr: support DAMON-based Operation Schemes Message-ID: <20211105204625.Q_yfBYv9Z%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 99CE0400208A X-Stat-Signature: uwbtkyjo3x3g7s3gecc41c8u89nyakb3 Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=zbr0Nxi6; spf=pass (imf18.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145187-558778 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

From: SeongJae Park
Subject: mm/damon/vaddr: support DAMON-based Operation Schemes

This commit makes DAMON's default primitives for virtual address spaces support DAMON-based Operation Schemes (DAMOS) by implementing the action application function and registering it with the monitoring context. The implementation simply maps each DAMOS action to the related 'madvise()' hint. That is, 'madvise(MADV_WILLNEED)' is called for the 'WILLNEED' DAMOS action, and similarly for the other actions ('COLD', 'PAGEOUT', 'HUGEPAGE', 'NOHUGEPAGE'). As a result, kernel space DAMON users can now use the DAMON-based optimizations with only a small amount of code.
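The action-to-hint mapping described in this commit message can be sketched in userspace C as follows. This is only an illustration under stated assumptions: `action_sketch` and `action_to_madvise_hint` are hypothetical names, and the fallback `#define` values are assumed to match the generic Linux UAPI header for libc versions that do not expose them. The kernel code in this patch works on the target's `mm_struct` instead.

```c
#include <sys/mman.h>

/*
 * Fallback definitions for libc headers that do not expose these hints.
 * The values are assumed to match the generic Linux UAPI header.
 */
#ifndef MADV_WILLNEED
#define MADV_WILLNEED 3
#endif
#ifndef MADV_HUGEPAGE
#define MADV_HUGEPAGE 14
#endif
#ifndef MADV_NOHUGEPAGE
#define MADV_NOHUGEPAGE 15
#endif
#ifndef MADV_COLD
#define MADV_COLD 20
#endif
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21
#endif

/* Hypothetical userspace mirror of the DAMOS actions named above. */
enum action_sketch {
	SKETCH_WILLNEED,
	SKETCH_COLD,
	SKETCH_PAGEOUT,
	SKETCH_HUGEPAGE,
	SKETCH_NOHUGEPAGE,
};

/* Map an action to its madvise() hint, or return -1 for an unknown action. */
int action_to_madvise_hint(enum action_sketch action)
{
	switch (action) {
	case SKETCH_WILLNEED:
		return MADV_WILLNEED;
	case SKETCH_COLD:
		return MADV_COLD;
	case SKETCH_PAGEOUT:
		return MADV_PAGEOUT;
	case SKETCH_HUGEPAGE:
		return MADV_HUGEPAGE;
	case SKETCH_NOHUGEPAGE:
		return MADV_NOHUGEPAGE;
	default:
		return -1;
	}
}
```

Keeping the mapping in one switch statement means an unknown action is rejected in a single place before any hint is issued.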
Link: https://lkml.kernel.org/r/20211001125604.29660-4-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: David Hildenbrand Cc: David Rienjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- include/linux/damon.h | 2 + mm/damon/vaddr.c | 56 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 58 insertions(+) --- a/include/linux/damon.h~mm-damon-vaddr-support-damon-based-operation-schemes +++ a/include/linux/damon.h @@ -337,6 +337,8 @@ void damon_va_prepare_access_checks(stru unsigned int damon_va_check_accesses(struct damon_ctx *ctx); bool damon_va_target_valid(void *t); void damon_va_cleanup(struct damon_ctx *ctx); +int damon_va_apply_scheme(struct damon_ctx *context, struct damon_target *t, + struct damon_region *r, struct damos *scheme); void damon_va_set_primitives(struct damon_ctx *ctx); #endif /* CONFIG_DAMON_VADDR */ --- a/mm/damon/vaddr.c~mm-damon-vaddr-support-damon-based-operation-schemes +++ a/mm/damon/vaddr.c @@ -7,6 +7,7 @@ #define pr_fmt(fmt) "damon-va: " fmt +#include #include #include #include @@ -658,6 +659,60 @@ bool damon_va_target_valid(void *target) return false; } +#ifndef CONFIG_ADVISE_SYSCALLS +static int damos_madvise(struct damon_target *target, struct damon_region *r, + int behavior) +{ + return -EINVAL; +} +#else +static int damos_madvise(struct damon_target *target, struct damon_region *r, + int behavior) +{ + struct mm_struct *mm; + int ret = -ENOMEM; + + mm = damon_get_mm(target); + if (!mm) + goto out; + + ret = do_madvise(mm, PAGE_ALIGN(r->ar.start), + PAGE_ALIGN(r->ar.end - r->ar.start), behavior); + mmput(mm); +out: + return ret; +} +#endif /* CONFIG_ADVISE_SYSCALLS */ + +int damon_va_apply_scheme(struct damon_ctx *ctx, struct damon_target *t, + struct damon_region *r, struct damos *scheme) +{ + int madv_action; + + switch (scheme->action) { + 
case DAMOS_WILLNEED: + madv_action = MADV_WILLNEED; + break; + case DAMOS_COLD: + madv_action = MADV_COLD; + break; + case DAMOS_PAGEOUT: + madv_action = MADV_PAGEOUT; + break; + case DAMOS_HUGEPAGE: + madv_action = MADV_HUGEPAGE; + break; + case DAMOS_NOHUGEPAGE: + madv_action = MADV_NOHUGEPAGE; + break; + default: + pr_warn("Wrong action %d\n", scheme->action); + return -EINVAL; + } + + return damos_madvise(t, r, madv_action); +} + void damon_va_set_primitives(struct damon_ctx *ctx) { ctx->primitive.init = damon_va_init; @@ -667,6 +722,7 @@ void damon_va_set_primitives(struct damo ctx->primitive.reset_aggregated = NULL; ctx->primitive.target_valid = damon_va_target_valid; ctx->primitive.cleanup = NULL; + ctx->primitive.apply_scheme = damon_va_apply_scheme; } #include "vaddr-test.h" From patchwork Fri Nov 5 20:46:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605853 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 49DC6C433EF for ; Fri, 5 Nov 2021 20:46:34 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 0214361372 for ; Fri, 5 Nov 2021 20:46:33 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 0214361372 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id B36849400DC; Fri, 5 Nov 2021 16:46:31 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A46799400C1; Fri, 5 Nov 2021 16:46:31 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 899139400DC; Fri, 
5 Nov 2021 16:46:31 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0009.hostedemail.com [216.40.44.9]) by kanga.kvack.org (Postfix) with ESMTP id 752059400C1 for ; Fri, 5 Nov 2021 16:46:31 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 33C828249980 for ; Fri, 5 Nov 2021 20:46:31 +0000 (UTC) X-FDA: 78776059782.22.997B45A Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf29.hostedemail.com (Postfix) with ESMTP id CB0869000163 for ; Fri, 5 Nov 2021 20:46:30 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id B02D661288; Fri, 5 Nov 2021 20:46:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145190; bh=LFEHl9z3ZAXUQgDCOG7TY3iEc6FLNBG/6gZmwtReQs0=; h=Date:From:To:Subject:In-Reply-To:From; b=16+H1EWlMV34/AOotkfgDaEVGHCiO9yiTK2XlwfGxyW3fDgBqGBm1YF3o8U8UIkZ3 3Fub9Q99FKjm0mH8qAtN24auNeUtGYoc30x/gDeWoYPFV3xet9Uu0vtcmcHDVhcKqZ m+fX+1soh+Hnd/Z2bfzQLT2+iOMZAjcaJUdO6TGw= Date: Fri, 05 Nov 2021 13:46:29 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 226/262] mm/damon/dbgfs: support DAMON-based Operation Schemes Message-ID: <20211105204629.fuuKRB3ev%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: CB0869000163 X-Stat-Signature: ydhb9k4zgprit7987h7y4wm341ua7t6o Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=linux-foundation.org 
header.s=korg header.b=16+H1EWl; spf=pass (imf29.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145190-126165 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

From: SeongJae Park
Subject: mm/damon/dbgfs: support DAMON-based Operation Schemes

This commit makes 'damon-dbgfs' support the data access monitoring-oriented memory management schemes. Users can read and update the schemes using the ``/damon/schemes`` file. The format is::

    <min/max size> <min/max access frequency> <min/max age> <action>

Link: https://lkml.kernel.org/r/20211001125604.29660-5-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: David Hildenbrand Cc: David Rienjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- mm/damon/dbgfs.c | 165 ++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 162 insertions(+), 3 deletions(-) --- a/mm/damon/dbgfs.c~mm-damon-dbgfs-support-damon-based-operation-schemes +++ a/mm/damon/dbgfs.c @@ -98,6 +98,159 @@ out: return ret; } +static ssize_t sprint_schemes(struct damon_ctx *c, char *buf, ssize_t len) +{ + struct damos *s; + int written = 0; + int rc; + + damon_for_each_scheme(s, c) { + rc = scnprintf(&buf[written], len - written, + "%lu %lu %u %u %u %u %d\n", + s->min_sz_region, s->max_sz_region, + s->min_nr_accesses, s->max_nr_accesses, + s->min_age_region, s->max_age_region, + s->action); + if (!rc) + return -ENOMEM; + + written += rc; + } + return written; +} + +static ssize_t dbgfs_schemes_read(struct file *file, char __user *buf, + size_t count, loff_t *ppos) +{ + struct damon_ctx *ctx = file->private_data; + char *kbuf; + ssize_t len; + + kbuf = kmalloc(count, GFP_KERNEL); + if (!kbuf) + return -ENOMEM; + +
mutex_lock(&ctx->kdamond_lock); + len = sprint_schemes(ctx, kbuf, count); + mutex_unlock(&ctx->kdamond_lock); + if (len < 0) + goto out; + len = simple_read_from_buffer(buf, count, ppos, kbuf, len); + +out: + kfree(kbuf); + return len; +} + +static void free_schemes_arr(struct damos **schemes, ssize_t nr_schemes) +{ + ssize_t i; + + for (i = 0; i < nr_schemes; i++) + kfree(schemes[i]); + kfree(schemes); +} + +static bool damos_action_valid(int action) +{ + switch (action) { + case DAMOS_WILLNEED: + case DAMOS_COLD: + case DAMOS_PAGEOUT: + case DAMOS_HUGEPAGE: + case DAMOS_NOHUGEPAGE: + return true; + default: + return false; + } +} + +/* + * Converts a string into an array of struct damos pointers + * + * Returns an array of struct damos pointers that converted if the conversion + * success, or NULL otherwise. + */ +static struct damos **str_to_schemes(const char *str, ssize_t len, + ssize_t *nr_schemes) +{ + struct damos *scheme, **schemes; + const int max_nr_schemes = 256; + int pos = 0, parsed, ret; + unsigned long min_sz, max_sz; + unsigned int min_nr_a, max_nr_a, min_age, max_age; + unsigned int action; + + schemes = kmalloc_array(max_nr_schemes, sizeof(scheme), + GFP_KERNEL); + if (!schemes) + return NULL; + + *nr_schemes = 0; + while (pos < len && *nr_schemes < max_nr_schemes) { + ret = sscanf(&str[pos], "%lu %lu %u %u %u %u %u%n", + &min_sz, &max_sz, &min_nr_a, &max_nr_a, + &min_age, &max_age, &action, &parsed); + if (ret != 7) + break; + if (!damos_action_valid(action)) { + pr_err("wrong action %d\n", action); + goto fail; + } + + pos += parsed; + scheme = damon_new_scheme(min_sz, max_sz, min_nr_a, max_nr_a, + min_age, max_age, action); + if (!scheme) + goto fail; + + schemes[*nr_schemes] = scheme; + *nr_schemes += 1; + } + return schemes; +fail: + free_schemes_arr(schemes, *nr_schemes); + return NULL; +} + +static ssize_t dbgfs_schemes_write(struct file *file, const char __user *buf, + size_t count, loff_t *ppos) +{ + struct damon_ctx *ctx = 
file->private_data; + char *kbuf; + struct damos **schemes; + ssize_t nr_schemes = 0, ret = count; + int err; + + kbuf = user_input_str(buf, count, ppos); + if (IS_ERR(kbuf)) + return PTR_ERR(kbuf); + + schemes = str_to_schemes(kbuf, ret, &nr_schemes); + if (!schemes) { + ret = -EINVAL; + goto out; + } + + mutex_lock(&ctx->kdamond_lock); + if (ctx->kdamond) { + ret = -EBUSY; + goto unlock_out; + } + + err = damon_set_schemes(ctx, schemes, nr_schemes); + if (err) + ret = err; + else + nr_schemes = 0; +unlock_out: + mutex_unlock(&ctx->kdamond_lock); + free_schemes_arr(schemes, nr_schemes); +out: + kfree(kbuf); + return ret; +} + static inline bool targetid_is_pid(const struct damon_ctx *ctx) { return ctx->primitive.target_valid == damon_va_target_valid; @@ -279,6 +432,12 @@ static const struct file_operations attr .write = dbgfs_attrs_write, }; +static const struct file_operations schemes_fops = { + .open = damon_dbgfs_open, + .read = dbgfs_schemes_read, + .write = dbgfs_schemes_write, +}; + static const struct file_operations target_ids_fops = { .open = damon_dbgfs_open, .read = dbgfs_target_ids_read, @@ -292,10 +451,10 @@ static const struct file_operations kdam static void dbgfs_fill_ctx_dir(struct dentry *dir, struct damon_ctx *ctx) { - const char * const file_names[] = {"attrs", "target_ids", + const char * const file_names[] = {"attrs", "schemes", "target_ids", "kdamond_pid"}; - const struct file_operations *fops[] = {&attrs_fops, &target_ids_fops, - &kdamond_pid_fops}; + const struct file_operations *fops[] = {&attrs_fops, &schemes_fops, + &target_ids_fops, &kdamond_pid_fops}; int i; for (i = 0; i < ARRAY_SIZE(file_names); i++) From patchwork Fri Nov 5 20:46:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605855 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from 
mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6F3E2C433EF for ; Fri, 5 Nov 2021 20:46:36 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 2677C6120A for ; Fri, 5 Nov 2021 20:46:36 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 2677C6120A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 3E2DE9400DD; Fri, 5 Nov 2021 16:46:35 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 369979400C1; Fri, 5 Nov 2021 16:46:35 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1BC529400DD; Fri, 5 Nov 2021 16:46:35 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0131.hostedemail.com [216.40.44.131]) by kanga.kvack.org (Postfix) with ESMTP id 022989400C1 for ; Fri, 5 Nov 2021 16:46:35 -0400 (EDT) Received: from smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id BDD3C181D12F7 for ; Fri, 5 Nov 2021 20:46:34 +0000 (UTC) X-FDA: 78776059908.16.8D5A622 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf02.hostedemail.com (Postfix) with ESMTP id 199B37001706 for ; Fri, 5 Nov 2021 20:46:28 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 1DBE36128E; Fri, 5 Nov 2021 20:46:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145193; bh=kdYmezbHxFKYL/QTbmlL5Nh6pnrnbJS7B6Xmpegz1Fw=; h=Date:From:To:Subject:In-Reply-To:From; b=sQ78fkLjb97cEgIFnvTP6ZPlnRSqzv9W4y/GUJKNztDovWer+glAMTI+ML0zyDApn 1/HUj6KzfjWEfiL0lanTSOLNmqwKJAuCsxJ1vwL7I8tWx2nH49YvjnEX/8UUKDiSaR wALgiEWpONAQbuON3wb5PmYxwXGTJvkxPpBKSVj0= Date: 
Fri, 05 Nov 2021 13:46:32 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 227/262] mm/damon/schemes: implement statistics feature Message-ID: <20211105204632.n8fr4lDi2%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf02.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=sQ78fkLj; dmarc=none; spf=pass (imf02.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 199B37001706 X-Stat-Signature: yrk7h4sjgcjsy1qyb8nuyj3z4afsxo5f X-HE-Tag: 1636145188-580806 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

From: SeongJae Park
Subject: mm/damon/schemes: implement statistics feature

To tune the DAMON-based operation schemes, it is helpful to know how many regions, and how large, are affected by each scheme. Those stats can be used not only for tuning but also for monitoring the working set size and the number of regions, provided the scheme does not change the program behavior too much.

For this reason, this commit implements statistics for the schemes. The total number and size of the regions to which each scheme is applied are exported to users via '->stat_count' and '->stat_sz' of 'struct damos'. Admins can also check the numbers by reading the 'schemes' debugfs file. The last two integers of each line now represent the stats.
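As an illustration of reading those two trailing integers, the userspace C sketch below parses one line of the 'schemes' file, assuming the nine-field layout of the `scnprintf()` format string used in this patch ("%lu %lu %u %u %u %u %d %lu %lu"). The names `scheme_line_sketch` and `parse_scheme_line` are hypothetical and not part of DAMON.

```c
#include <stdio.h>

/* Hypothetical holder for one line read from the 'schemes' debugfs file. */
struct scheme_line_sketch {
	unsigned long min_sz, max_sz;
	unsigned int min_nr_accesses, max_nr_accesses;
	unsigned int min_age, max_age;
	int action;
	unsigned long stat_count;	/* number of regions the scheme was applied to */
	unsigned long stat_sz;		/* total size of those regions */
};

/* Parse one line; return 0 on success, or -1 if it lacks the nine fields. */
int parse_scheme_line(const char *line, struct scheme_line_sketch *out)
{
	int n = sscanf(line, "%lu %lu %u %u %u %u %d %lu %lu",
		       &out->min_sz, &out->max_sz,
		       &out->min_nr_accesses, &out->max_nr_accesses,
		       &out->min_age, &out->max_age,
		       &out->action, &out->stat_count, &out->stat_sz);

	return n == 9 ? 0 : -1;
}
```

A line without the two stat fields is rejected here, which makes it easy to notice output from a kernel that predates this commit.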
To allow collecting the stats without changing the program behavior, this commit also adds a new scheme action, 'DAMOS_STAT'. Note that 'DAMOS_STAT' not only performs no memory operation, but also does not reset the age of regions.

Link: https://lkml.kernel.org/r/20211001125604.29660-6-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: David Hildenbrand Cc: David Rienjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- include/linux/damon.h | 10 +++++++++- mm/damon/core.c | 7 ++++++- mm/damon/dbgfs.c | 5 +++-- mm/damon/vaddr.c | 2 ++ 4 files changed, 20 insertions(+), 4 deletions(-) --- a/include/linux/damon.h~mm-damon-schemes-implement-statistics-feature +++ a/include/linux/damon.h @@ -78,6 +78,7 @@ struct damon_target { * @DAMOS_PAGEOUT: Call ``madvise()`` for the region with MADV_PAGEOUT. * @DAMOS_HUGEPAGE: Call ``madvise()`` for the region with MADV_HUGEPAGE. * @DAMOS_NOHUGEPAGE: Call ``madvise()`` for the region with MADV_NOHUGEPAGE. + * @DAMOS_STAT: Do nothing but count the stat. */ enum damos_action { DAMOS_WILLNEED, @@ -85,6 +86,7 @@ enum damos_action { DAMOS_PAGEOUT, DAMOS_HUGEPAGE, DAMOS_NOHUGEPAGE, + DAMOS_STAT, /* Do nothing but only record the stat */ }; /** @@ -96,9 +98,13 @@ enum damos_action { * @min_age_region: Minimum age of target regions. * @max_age_region: Maximum age of target regions. * @action: &damo_action to be applied to the target regions. + * @stat_count: Total number of regions that this scheme is applied. + * @stat_sz: Total size of regions that this scheme is applied. * @list: List head for siblings. * - * Note that both the minimums and the maximums are inclusive. + * For each aggregation interval, DAMON applies @action to monitoring target + * regions fit in the condition and updates the statistics.
Note that both + * the minimums and the maximums are inclusive. */ struct damos { unsigned long min_sz_region; @@ -108,6 +114,8 @@ struct damos { unsigned int min_age_region; unsigned int max_age_region; enum damos_action action; + unsigned long stat_count; + unsigned long stat_sz; struct list_head list; }; --- a/mm/damon/core.c~mm-damon-schemes-implement-statistics-feature +++ a/mm/damon/core.c @@ -103,6 +103,8 @@ struct damos *damon_new_scheme( scheme->min_age_region = min_age_region; scheme->max_age_region = max_age_region; scheme->action = action; + scheme->stat_count = 0; + scheme->stat_sz = 0; INIT_LIST_HEAD(&scheme->list); return scheme; @@ -544,9 +546,12 @@ static void damon_do_apply_schemes(struc continue; if (r->age < s->min_age_region || s->max_age_region < r->age) continue; + s->stat_count++; + s->stat_sz += sz; if (c->primitive.apply_scheme) c->primitive.apply_scheme(c, t, r, s); - r->age = 0; + if (s->action != DAMOS_STAT) + r->age = 0; } } --- a/mm/damon/dbgfs.c~mm-damon-schemes-implement-statistics-feature +++ a/mm/damon/dbgfs.c @@ -106,11 +106,11 @@ static ssize_t sprint_schemes(struct dam damon_for_each_scheme(s, c) { rc = scnprintf(&buf[written], len - written, - "%lu %lu %u %u %u %u %d\n", + "%lu %lu %u %u %u %u %d %lu %lu\n", s->min_sz_region, s->max_sz_region, s->min_nr_accesses, s->max_nr_accesses, s->min_age_region, s->max_age_region, - s->action); + s->action, s->stat_count, s->stat_sz); if (!rc) return -ENOMEM; @@ -159,6 +159,7 @@ static bool damos_action_valid(int actio case DAMOS_PAGEOUT: case DAMOS_HUGEPAGE: case DAMOS_NOHUGEPAGE: + case DAMOS_STAT: return true; default: return false; --- a/mm/damon/vaddr.c~mm-damon-schemes-implement-statistics-feature +++ a/mm/damon/vaddr.c @@ -705,6 +705,8 @@ int damon_va_apply_scheme(struct damon_c case DAMOS_NOHUGEPAGE: madv_action = MADV_NOHUGEPAGE; break; + case DAMOS_STAT: + return 0; default: pr_warn("Wrong action %d\n", scheme->action); return -EINVAL; From patchwork Fri Nov 5 20:46:36 2021 
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605859 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A876EC433EF for ; Fri, 5 Nov 2021 20:46:46 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 601806126A for ; Fri, 5 Nov 2021 20:46:46 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 601806126A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id E40A49400DF; Fri, 5 Nov 2021 16:46:45 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id DEFC89400C1; Fri, 5 Nov 2021 16:46:45 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CDE1D9400DF; Fri, 5 Nov 2021 16:46:45 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0213.hostedemail.com [216.40.44.213]) by kanga.kvack.org (Postfix) with ESMTP id B56979400C1 for ; Fri, 5 Nov 2021 16:46:45 -0400 (EDT) Received: from smtpin14.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 7BF6A8249980 for ; Fri, 5 Nov 2021 20:46:45 +0000 (UTC) X-FDA: 78776060370.14.400E642 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf20.hostedemail.com (Postfix) with ESMTP id BEA2FD0000BD for ; Fri, 5 Nov 2021 20:46:35 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 7DA396120A; Fri, 5 Nov 2021 20:46:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; 
t=1636145197; bh=X1wWSguvt0r17lGh2MG5wB182QPXV4yBUf97OZieb0A=; h=Date:From:To:Subject:In-Reply-To:From; b=OvCD3QAtNJZfPmdakaopAyVFcClQ1UzeMTBzpztoaRHadfdy0ehTbTo0Q9Vaw8JrP 9SyVZ9OaWWgEStWtSbdzlDOXShWaOzBQn6i0y7TeGqDEo0Jn8xN5v/GwnZ1qr+Puwz 4ytlXQbADw36yfK17MqAa7Aj6XhE5dj4C6RCzyM8= Date: Fri, 05 Nov 2021 13:46:36 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 228/262] selftests/damon: add 'schemes' debugfs tests Message-ID: <20211105204636.V-75NEmx0%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf20.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=OvCD3QAt; dmarc=none; spf=pass (imf20.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: BEA2FD0000BD X-Stat-Signature: up5z8y41qmerggaduwqwo9n8qou7kmub X-HE-Tag: 1636145195-633375 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: selftests/damon: add 'schemes' debugfs tests This commit adds simple selftests for the 'schemes' debugfs file of DAMON. 
Link: https://lkml.kernel.org/r/20211001125604.29660-7-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: David Hildenbrand Cc: David Rienjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- tools/testing/selftests/damon/debugfs_attrs.sh | 13 +++++++++++++ 1 file changed, 13 insertions(+) --- a/tools/testing/selftests/damon/debugfs_attrs.sh~selftests-damon-add-schemes-debugfs-tests +++ a/tools/testing/selftests/damon/debugfs_attrs.sh @@ -57,6 +57,19 @@ test_write_fail "$file" "1 2 3 5 4" "$or test_content "$file" "$orig_content" "1 2 3 4 5" "successfully written" echo "$orig_content" > "$file" +# Test schemes file +# ================= + +file="$DBGFS/schemes" +orig_content=$(cat "$file") + +test_write_succ "$file" "1 2 3 4 5 6 4" \ + "$orig_content" "valid input" +test_write_fail "$file" "1 2 +3 4 5 6 3" "$orig_content" "multi lines" +test_write_succ "$file" "" "$orig_content" "disabling" +echo "$orig_content" > "$file" + # Test target_ids file # ==================== From patchwork Fri Nov 5 20:46:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605857 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DFF59C433FE for ; Fri, 5 Nov 2021 20:46:42 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9AE866136A for ; Fri, 5 Nov 2021 20:46:42 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 9AE866136A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: 
mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 342E69400DE; Fri, 5 Nov 2021 16:46:42 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 2F2919400C1; Fri, 5 Nov 2021 16:46:42 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1BB2D9400DE; Fri, 5 Nov 2021 16:46:42 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0194.hostedemail.com [216.40.44.194]) by kanga.kvack.org (Postfix) with ESMTP id 062DE9400C1 for ; Fri, 5 Nov 2021 16:46:42 -0400 (EDT) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id BB10E8249980 for ; Fri, 5 Nov 2021 20:46:41 +0000 (UTC) X-FDA: 78776060328.05.C8DC38C Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf17.hostedemail.com (Postfix) with ESMTP id 43C5AF0003B8 for ; Fri, 5 Nov 2021 20:46:41 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id DB46061288; Fri, 5 Nov 2021 20:46:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145200; bh=i5KSwLkZjkyCOOiADrAkxZ/QTlmqwFS8Uohf6C0QEas=; h=Date:From:To:Subject:In-Reply-To:From; b=h6NySITDANK3Tkxi73/gteFnR3d4gPWPZaBv5l5XG0p4R5upjPv7XWzqJKoxZ3SEX /veD3QB0qu0QButXYL0UWkCnvQh1F1U6oa8wa4iKaMfj7xK2wQ+HyM6dudtYWrOO6x 3kkxv4bbwZ2LMFIs5C+plAiaJ20rSH8aXc5urfZk= Date: Fri, 05 Nov 2021 13:46:39 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 229/262] Docs/admin-guide/mm/damon: document DAMON-based 
Operation Schemes Message-ID: <20211105204639.Cba_-vH9D%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 43C5AF0003B8 X-Stat-Signature: 8m98ua4dam8k6aejuxj8k8j6q69h9z1m Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=h6NySITD; dmarc=none; spf=pass (imf17.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636145201-942334 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: Docs/admin-guide/mm/damon: document DAMON-based Operation Schemes This commit adds a description of DAMON-based operation schemes to the DAMON documents. Link: https://lkml.kernel.org/r/20211001125604.29660-8-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: David Hildenbrand Cc: David Rienjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- Documentation/admin-guide/mm/damon/start.rst | 11 +++ Documentation/admin-guide/mm/damon/usage.rst | 51 ++++++++++++++++- 2 files changed, 60 insertions(+), 2 deletions(-) --- a/Documentation/admin-guide/mm/damon/start.rst~docs-admin-guide-mm-damon-document-damon-based-operation-schemes +++ a/Documentation/admin-guide/mm/damon/start.rst @@ -108,6 +108,17 @@ the results as separate image files. :: You can view the visualizations of this example workload at [1]_. Visualizations of other realistic workloads are available at [2]_ [3]_ [4]_. 
+ +Data Access Pattern Aware Memory Management +=========================================== + +Below three commands make every memory region of size >=4K that doesn't +accessed for >=60 seconds in your workload to be swapped out. :: + + $ echo "#min-size max-size min-acc max-acc min-age max-age action" > scheme + $ echo "4K max 0 0 60s max pageout" >> scheme + $ damo schemes -c my_thp_scheme + .. [1] https://damonitor.github.io/doc/html/v17/admin-guide/mm/damon/start.html#visualizing-recorded-patterns .. [2] https://damonitor.github.io/test/result/visual/latest/rec.heatmap.1.png.html .. [3] https://damonitor.github.io/test/result/visual/latest/rec.wss_sz.png.html --- a/Documentation/admin-guide/mm/damon/usage.rst~docs-admin-guide-mm-damon-document-damon-based-operation-schemes +++ a/Documentation/admin-guide/mm/damon/usage.rst @@ -34,8 +34,8 @@ the reason, this document describes only debugfs Interface ================= -DAMON exports three files, ``attrs``, ``target_ids``, and ``monitor_on`` under -its debugfs directory, ``/damon/``. +DAMON exports four files, ``attrs``, ``target_ids``, ``schemes`` and +``monitor_on`` under its debugfs directory, ``/damon/``. Attributes @@ -74,6 +74,53 @@ check it again:: Note that setting the target ids doesn't start the monitoring. +Schemes +------- + +For usual DAMON-based data access aware memory management optimizations, users +would simply want the system to apply a memory management action to a memory +region of a specific size having a specific access frequency for a specific +time. DAMON receives such formalized operation schemes from the user and +applies those to the target processes. It also counts the total number and +size of regions that each scheme is applied. This statistics can be used for +online analysis or tuning of the schemes. + +Users can get and set the schemes by reading from and writing to ``schemes`` +debugfs file. Reading the file also shows the statistics of each scheme. 
To +the file, each of the schemes should be represented in each line in below form: + + min-size max-size min-acc max-acc min-age max-age action + +Note that the ranges are closed interval. Bytes for the size of regions +(``min-size`` and ``max-size``), number of monitored accesses per aggregate +interval for access frequency (``min-acc`` and ``max-acc``), number of +aggregate intervals for the age of regions (``min-age`` and ``max-age``), and a +predefined integer for memory management actions should be used. The supported +numbers and their meanings are as below. + + - 0: Call ``madvise()`` for the region with ``MADV_WILLNEED`` + - 1: Call ``madvise()`` for the region with ``MADV_COLD`` + - 2: Call ``madvise()`` for the region with ``MADV_PAGEOUT`` + - 3: Call ``madvise()`` for the region with ``MADV_HUGEPAGE`` + - 4: Call ``madvise()`` for the region with ``MADV_NOHUGEPAGE`` + - 5: Do nothing but count the statistics + +You can disable schemes by simply writing an empty string to the file. For +example, below commands applies a scheme saying "If a memory region of size in +[4KiB, 8KiB] is showing accesses per aggregate interval in [0, 5] for aggregate +interval in [10, 20], page out the region", check the entered scheme again, and +finally remove the scheme. :: + + # cd /damon + # echo "4096 8192 0 5 10 20 2" > schemes + # cat schemes + 4096 8192 0 5 10 20 2 0 0 + # echo > schemes + +The last two integers in the 4th line of above example is the total number and +the total size of the regions that the scheme is applied. 
+ + Turning On/Off -------------- From patchwork Fri Nov 5 20:46:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12606011 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 00677C433F5 for ; Fri, 5 Nov 2021 21:02:09 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9126B60D07 for ; Fri, 5 Nov 2021 21:02:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 9126B60D07 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 356A06B00FF; Fri, 5 Nov 2021 17:02:08 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 2DF11940020; Fri, 5 Nov 2021 17:02:08 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1A78494000D; Fri, 5 Nov 2021 17:02:08 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0113.hostedemail.com [216.40.44.113]) by kanga.kvack.org (Postfix) with ESMTP id 04F166B00FF for ; Fri, 5 Nov 2021 17:02:08 -0400 (EDT) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id BD97B76BA8 for ; Fri, 5 Nov 2021 21:02:07 +0000 (UTC) X-FDA: 78776099094.24.EA9CE6F Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf29.hostedemail.com (Postfix) with ESMTP id 8B990900026F for ; Fri, 5 Nov 2021 21:02:05 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 6489E61357; Fri, 5 Nov 2021 20:46:43 +0000 (UTC) DKIM-Signature: 
v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145204; bh=VOIunbdNmcWFyd8uMrc34AqP7A1NbkwVd0pIGYCJJ68=; h=Date:From:To:Subject:In-Reply-To:From; b=CewqAxPoSvYQeZUHf6wlrx+cKYxokwBTievw36dDmD6V3VDuTz+Nm4mVr90k1JREx G+sTCB6NTqrYe/PWrJCDC8quwpnFkrZpdgPKI4g/befAfUJH8uV8d1ohrB5xasEP13 xowvOYU4qcfkxp+nJ/ve7P7/dw14hTUJYD478dM4= Date: Fri, 05 Nov 2021 13:46:42 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, brendanhiggins@google.com, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 230/262] mm/damon/dbgfs: allow users to set initial monitoring target regions Message-ID: <20211105204642.8cBaVVOlp%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 8B990900026F X-Stat-Signature: iqijc8pukkiceonaiqcmsdukqnuc1p3i Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=CewqAxPo; spf=pass (imf29.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636146125-360723 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/dbgfs: allow users to set initial monitoring target regions Patch series "DAMON: Support Physical Memory Address Space Monitoring". DAMON currently supports only virtual address space monitoring. 
It can be easily extended for various use cases and address spaces by configuring its monitoring primitives layer to use appropriate primitives implementations, though. This patchset implements monitoring primitives for the physical address space monitoring using the structure. The first 3 patches allow user space users to manually set the monitoring regions. The 1st patch implements the feature in the 'damon-dbgfs'. Then, patches for adding a unit test (the 2nd patch) and updating the documentation (the 3rd patch) follow. The following 4 patches implement the physical address space monitoring primitives. The 4th patch makes some primitive functions for the virtual address spaces primitives reusable. The 5th patch implements the physical address space monitoring primitives. The 6th patch links the primitives to the 'damon-dbgfs'. Finally, the 7th patch documents the new features. This patch (of 7): Some 'damon-dbgfs' users would want to monitor only a part of the entire virtual memory address space. The program interface users in the kernel space could use '->before_start()' callback or set the regions inside the context struct as they want, but 'damon-dbgfs' users cannot. For this reason, this commit introduces a new debugfs file called 'init_regions'. 'damon-dbgfs' users can specify which initial monitoring target address regions they want by writing special input to the file. The input should describe each region in each line in the below form: <target id> <start address> <end address> Note that the regions will be updated to cover entire memory mapped regions after a 'regions update interval' has passed. If you want the regions to not be updated after the initial setting, you could set the interval to a very long time, say, a few decades. 
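The input rules can be illustrated with a small user space sketch. It assumes only what the "%lu %lu %lu" parsing in this patch implies: each region is given as a target id, a start address and an end address, the start must be below the end, and a region of a target must not begin before the end of the previously given region of the same target. check_init_regions() and MAX_TARGETS are hypothetical names for this sketch, and unlike the kernel code the sketch does not check whether the target id is actually registered.

```c
#include <stdio.h>

#define MAX_TARGETS 16	/* arbitrary cap, an assumption of this sketch */

/*
 * Check text in "<target id> <start address> <end address>" form.
 * Returns the number of accepted regions, or -1 on an invalid region.
 */
static int check_init_regions(const char *str)
{
	unsigned long ids[MAX_TARGETS], last_end[MAX_TARGETS];
	unsigned long id, start, end;
	int nr_ids = 0, nr_regions = 0, pos = 0, parsed, i;

	while (sscanf(&str[pos], "%lu %lu %lu%n",
		      &id, &start, &end, &parsed) == 3) {
		if (start >= end)
			return -1;	/* empty or inverted range */

		/* Find the slot that tracks this target, if any. */
		for (i = 0; i < nr_ids && ids[i] != id; i++)
			;
		if (i == nr_ids) {
			if (nr_ids == MAX_TARGETS)
				return -1;
			ids[nr_ids++] = id;
		} else if (last_end[i] > start) {
			return -1;	/* overlaps or precedes the previous region */
		}
		last_end[i] = end;
		nr_regions++;
		pos += parsed;
	}
	return nr_regions;
}
```

For instance, "2 10 20\n2 20 30\n" would be accepted because the second region starts exactly where the first one ends, while "2 10 20\n2 14 26\n" would be rejected as overlapping.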
Link: https://lkml.kernel.org/r/20211012205711.29216-1-sj@kernel.org Link: https://lkml.kernel.org/r/20211012205711.29216-2-sj@kernel.org Signed-off-by: SeongJae Park Cc: Jonathan Cameron Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: Jonathan Corbet Cc: David Hildenbrand Cc: David Woodhouse Cc: Marco Elver Cc: Leonard Foerster Cc: Greg Thelen Cc: Markus Boehme Cc: David Rienjes Cc: Shakeel Butt Cc: Shuah Khan Cc: Brendan Higgins Signed-off-by: Andrew Morton --- mm/damon/dbgfs.c | 156 ++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 154 insertions(+), 2 deletions(-) --- a/mm/damon/dbgfs.c~mm-damon-dbgfs-allow-users-to-set-initial-monitoring-target-regions +++ a/mm/damon/dbgfs.c @@ -394,6 +394,152 @@ out: return ret; } +static ssize_t sprint_init_regions(struct damon_ctx *c, char *buf, ssize_t len) +{ + struct damon_target *t; + struct damon_region *r; + int written = 0; + int rc; + + damon_for_each_target(t, c) { + damon_for_each_region(r, t) { + rc = scnprintf(&buf[written], len - written, + "%lu %lu %lu\n", + t->id, r->ar.start, r->ar.end); + if (!rc) + return -ENOMEM; + written += rc; + } + } + return written; +} + +static ssize_t dbgfs_init_regions_read(struct file *file, char __user *buf, + size_t count, loff_t *ppos) +{ + struct damon_ctx *ctx = file->private_data; + char *kbuf; + ssize_t len; + + kbuf = kmalloc(count, GFP_KERNEL); + if (!kbuf) + return -ENOMEM; + + mutex_lock(&ctx->kdamond_lock); + if (ctx->kdamond) { + mutex_unlock(&ctx->kdamond_lock); + len = -EBUSY; + goto out; + } + + len = sprint_init_regions(ctx, kbuf, count); + mutex_unlock(&ctx->kdamond_lock); + if (len < 0) + goto out; + len = simple_read_from_buffer(buf, count, ppos, kbuf, len); + +out: + kfree(kbuf); + return len; +} + +static int add_init_region(struct damon_ctx *c, + unsigned long target_id, struct damon_addr_range *ar) +{ + struct damon_target *t; + struct damon_region *r, *prev; + unsigned long id; + int rc = -EINVAL; + + if (ar->start >= ar->end) + return -EINVAL; 
+ + damon_for_each_target(t, c) { + id = t->id; + if (targetid_is_pid(c)) + id = (unsigned long)pid_vnr((struct pid *)id); + if (id == target_id) { + r = damon_new_region(ar->start, ar->end); + if (!r) + return -ENOMEM; + damon_add_region(r, t); + if (damon_nr_regions(t) > 1) { + prev = damon_prev_region(r); + if (prev->ar.end > r->ar.start) { + damon_destroy_region(r, t); + return -EINVAL; + } + } + rc = 0; + } + } + return rc; +} + +static int set_init_regions(struct damon_ctx *c, const char *str, ssize_t len) +{ + struct damon_target *t; + struct damon_region *r, *next; + int pos = 0, parsed, ret; + unsigned long target_id; + struct damon_addr_range ar; + int err; + + damon_for_each_target(t, c) { + damon_for_each_region_safe(r, next, t) + damon_destroy_region(r, t); + } + + while (pos < len) { + ret = sscanf(&str[pos], "%lu %lu %lu%n", + &target_id, &ar.start, &ar.end, &parsed); + if (ret != 3) + break; + err = add_init_region(c, target_id, &ar); + if (err) + goto fail; + pos += parsed; + } + + return 0; + +fail: + damon_for_each_target(t, c) { + damon_for_each_region_safe(r, next, t) + damon_destroy_region(r, t); + } + return err; +} + +static ssize_t dbgfs_init_regions_write(struct file *file, + const char __user *buf, size_t count, + loff_t *ppos) +{ + struct damon_ctx *ctx = file->private_data; + char *kbuf; + ssize_t ret = count; + int err; + + kbuf = user_input_str(buf, count, ppos); + if (IS_ERR(kbuf)) + return PTR_ERR(kbuf); + + mutex_lock(&ctx->kdamond_lock); + if (ctx->kdamond) { + ret = -EBUSY; + goto unlock_out; + } + + err = set_init_regions(ctx, kbuf, ret); + if (err) + ret = err; + +unlock_out: + mutex_unlock(&ctx->kdamond_lock); + kfree(kbuf); + return ret; +} + static ssize_t dbgfs_kdamond_pid_read(struct file *file, char __user *buf, size_t count, loff_t *ppos) { @@ -445,6 +591,12 @@ static const struct file_operations targ .write = dbgfs_target_ids_write, }; +static const struct file_operations init_regions_fops = { + .open = 
damon_dbgfs_open, + .read = dbgfs_init_regions_read, + .write = dbgfs_init_regions_write, +}; + static const struct file_operations kdamond_pid_fops = { .open = damon_dbgfs_open, .read = dbgfs_kdamond_pid_read, @@ -453,9 +605,9 @@ static const struct file_operations kdam static void dbgfs_fill_ctx_dir(struct dentry *dir, struct damon_ctx *ctx) { const char * const file_names[] = {"attrs", "schemes", "target_ids", - "kdamond_pid"}; + "init_regions", "kdamond_pid"}; const struct file_operations *fops[] = {&attrs_fops, &schemes_fops, - &target_ids_fops, &kdamond_pid_fops}; + &target_ids_fops, &init_regions_fops, &kdamond_pid_fops}; int i; for (i = 0; i < ARRAY_SIZE(file_names); i++) From patchwork Fri Nov 5 20:46:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605861 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 933CAC433EF for ; Fri, 5 Nov 2021 20:46:49 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4B65F6120A for ; Fri, 5 Nov 2021 20:46:49 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 4B65F6120A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id C91279400E0; Fri, 5 Nov 2021 16:46:48 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id C1A699400C1; Fri, 5 Nov 2021 16:46:48 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AE1F59400E0; Fri, 5 Nov 2021 16:46:48 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com 
(smtprelay0178.hostedemail.com [216.40.44.178]) by kanga.kvack.org (Postfix) with ESMTP id 9ACF99400C1 for ; Fri, 5 Nov 2021 16:46:48 -0400 (EDT) Received: from smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 5F5BC55411 for ; Fri, 5 Nov 2021 20:46:48 +0000 (UTC) X-FDA: 78776060496.16.17E13F5 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf02.hostedemail.com (Postfix) with ESMTP id BFF6C7001A08 for ; Fri, 5 Nov 2021 20:46:42 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id CF39561288; Fri, 5 Nov 2021 20:46:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145207; bh=y+5mSAl9bOTvn9SHMFyqOP82kYoIRhlo3opLhCCgEC4=; h=Date:From:To:Subject:In-Reply-To:From; b=HEv4jqP6vSwL3ZkUuZ+UYpbMBheAWDascRQGr++dVy1jBzW9495EIm5tKapN4g04k JTaqAdeZUwfI4RKN3JyG1NPoeOAHqMUvJ/xh9Q26mKeMgkY3TZweTJPiqgCAL+sOOh nMOsph/xxLELeCu+geUn1S3pixuRgbTymuSLH22Q= Date: Fri, 05 Nov 2021 13:46:46 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, brendanhiggins@google.com, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 231/262] mm/damon/dbgfs-test: add a unit test case for 'init_regions' Message-ID: <20211105204646.fYQGh1BC3%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf02.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=HEv4jqP6; dmarc=none; spf=pass (imf02.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) 
smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: BFF6C7001A08 X-Stat-Signature: 9xkhg4dbfcef63ckp8cj3o66p44o4py9 X-HE-Tag: 1636145202-553403 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/dbgfs-test: add a unit test case for 'init_regions' This commit adds another test case for the new feature, 'init_regions'. Link: https://lkml.kernel.org/r/20211012205711.29216-3-sj@kernel.org Signed-off-by: SeongJae Park Reviewed-by: Brendan Higgins Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: David Hildenbrand Cc: David Rienjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- mm/damon/dbgfs-test.h | 54 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 54 insertions(+) --- a/mm/damon/dbgfs-test.h~mm-damon-dbgfs-test-add-a-unit-test-case-for-init_regions +++ a/mm/damon/dbgfs-test.h @@ -109,9 +109,63 @@ static void damon_dbgfs_test_set_targets dbgfs_destroy_ctx(ctx); } +static void damon_dbgfs_test_set_init_regions(struct kunit *test) +{ + struct damon_ctx *ctx = damon_new_ctx(); + unsigned long ids[] = {1, 2, 3}; + /* Each line represents one region in `` `` */ + char * const valid_inputs[] = {"2 10 20\n 2 20 30\n2 35 45", + "2 10 20\n", + "2 10 20\n1 39 59\n1 70 134\n 2 20 25\n", + ""}; + /* Reading the file again will show sorted, clean output */ + char * const valid_expects[] = {"2 10 20\n2 20 30\n2 35 45\n", + "2 10 20\n", + "1 39 59\n1 70 134\n2 10 20\n2 20 25\n", + ""}; + char * const invalid_inputs[] = {"4 10 20\n", /* target not exists */ + "2 10 20\n 2 14 26\n", /* regions overlap */ + "1 10 20\n2 30 40\n 1 5 8"}; /* not sorted by address */ + char *input, *expect; + int i, rc; + char buf[256]; + + damon_set_targets(ctx, ids, 3); + + 
/* Put valid inputs and check the results */ + for (i = 0; i < ARRAY_SIZE(valid_inputs); i++) { + input = valid_inputs[i]; + expect = valid_expects[i]; + + rc = set_init_regions(ctx, input, strnlen(input, 256)); + KUNIT_EXPECT_EQ(test, rc, 0); + + memset(buf, 0, 256); + sprint_init_regions(ctx, buf, 256); + + KUNIT_EXPECT_STREQ(test, (char *)buf, expect); + } + /* Put invlid inputs and check the return error code */ + for (i = 0; i < ARRAY_SIZE(invalid_inputs); i++) { + input = invalid_inputs[i]; + pr_info("input: %s\n", input); + rc = set_init_regions(ctx, input, strnlen(input, 256)); + KUNIT_EXPECT_EQ(test, rc, -EINVAL); + + memset(buf, 0, 256); + sprint_init_regions(ctx, buf, 256); + + KUNIT_EXPECT_STREQ(test, (char *)buf, ""); + } + + damon_set_targets(ctx, NULL, 0); + damon_destroy_ctx(ctx); +} + static struct kunit_case damon_test_cases[] = { KUNIT_CASE(damon_dbgfs_test_str_to_target_ids), KUNIT_CASE(damon_dbgfs_test_set_targets), + KUNIT_CASE(damon_dbgfs_test_set_init_regions), {}, }; From patchwork Fri Nov 5 20:46:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605863 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 25C2AC433EF for ; Fri, 5 Nov 2021 20:46:53 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D11FB6128E for ; Fri, 5 Nov 2021 20:46:52 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org D11FB6128E Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 5DB6A9400E1; Fri, 5 Nov 2021 16:46:52 -0400 (EDT) Received: by 
kanga.kvack.org (Postfix, from userid 40) id 58AD59400C1; Fri, 5 Nov 2021 16:46:52 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4045A9400E1; Fri, 5 Nov 2021 16:46:52 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0005.hostedemail.com [216.40.44.5]) by kanga.kvack.org (Postfix) with ESMTP id 2E5B09400C1 for ; Fri, 5 Nov 2021 16:46:52 -0400 (EDT) Received: from smtpin15.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id E9D8E55FA4 for ; Fri, 5 Nov 2021 20:46:51 +0000 (UTC) X-FDA: 78776060622.15.67DF43C Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf21.hostedemail.com (Postfix) with ESMTP id BC901D036A53 for ; Fri, 5 Nov 2021 20:46:46 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 458F26126A; Fri, 5 Nov 2021 20:46:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145210; bh=RcTA1AudXgXH9EMTrgmQoWw5+SCP6irYLWzXAWwE4GY=; h=Date:From:To:Subject:In-Reply-To:From; b=XiK25w98qyvz3HgDp69ZqPvFamWBI8AN+BXPwGvvfxPdE145ikuPJFuV84tlqj8aw JWYeBWXLq1powsv2ZWjjGnxD83JhvA2T1amKPW6mG+yQDPdq4qS9OiqUYuGFRMXhz/ 1THNiUrzR2iQe/vRccK0c4KFXLLBR9/TnZqvb/Tg= Date: Fri, 05 Nov 2021 13:46:49 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, brendanhiggins@google.com, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 232/262] Docs/admin-guide/mm/damon: document 'init_regions' feature Message-ID: <20211105204649.5UJA8W9k3%akpm@linux-foundation.org> In-Reply-To: 
<20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=XiK25w98; dmarc=none; spf=pass (imf21.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: BC901D036A53 X-Stat-Signature: eg8kkji3gpsetiu4in56c8pm3yy89tns X-HE-Tag: 1636145206-801919 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: Docs/admin-guide/mm/damon: document 'init_regions' feature This commit adds description of the 'init_regions' feature in the DAMON usage document. Link: https://lkml.kernel.org/r/20211012205711.29216-4-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: Brendan Higgins Cc: David Hildenbrand Cc: David Rienjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- Documentation/admin-guide/mm/damon/usage.rst | 41 ++++++++++++++++- 1 file changed, 39 insertions(+), 2 deletions(-) --- a/Documentation/admin-guide/mm/damon/usage.rst~docs-admin-guide-mm-damon-document-init_regions-feature +++ a/Documentation/admin-guide/mm/damon/usage.rst @@ -34,8 +34,9 @@ the reason, this document describes only debugfs Interface ================= -DAMON exports four files, ``attrs``, ``target_ids``, ``schemes`` and -``monitor_on`` under its debugfs directory, ``/damon/``. +DAMON exports five files, ``attrs``, ``target_ids``, ``init_regions``, +``schemes`` and ``monitor_on`` under its debugfs directory, +``/damon/``. 
 Attributes
@@ -74,6 +75,42 @@ check it again::
 
 Note that setting the target ids doesn't start the monitoring.
 
+Initial Monitoring Target Regions
+---------------------------------
+
+In the case of debugfs-based monitoring, DAMON automatically sets and updates
+the monitoring target regions so that the entire memory mappings of target
+processes can be covered.  However, users may want to limit the monitoring
+region to specific address ranges, such as the heap, the stack, or a specific
+file-mapped area.  Some users may also know the initial access pattern of their
+workloads and therefore want to set optimal initial regions for the 'adaptive
+regions adjustment'.
+
+In such cases, users can explicitly set the initial monitoring target regions
+as they want, by writing proper values to the ``init_regions`` file.  Each line
+of the input should represent one region in the form below::
+
+    <target id> <start address> <end address>
+
+The ``target id`` should already be in the ``target_ids`` file, and the regions
+should be passed in address order.  For example, the commands below set a couple
+of address ranges, ``1-100`` and ``100-200``, as the initial monitoring target
+regions of process 42, and another couple of address ranges, ``20-40`` and
+``50-100``, as those of process 4242::
+
+    # cd /damon
+    # echo "42 1 100
+            42 100 200
+            4242 20 40
+            4242 50 100" > init_regions
+
+Note that this sets the initial monitoring target regions only.  In the case of
+virtual memory monitoring, DAMON will automatically update the boundaries of the
+regions after one ``regions update interval``.  Therefore, users should set the
+``regions update interval`` large enough in this case, if they don't want the
+update.
+ + Schemes ------- From patchwork Fri Nov 5 20:46:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605865 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BBFFDC433F5 for ; Fri, 5 Nov 2021 20:46:56 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6E3906120A for ; Fri, 5 Nov 2021 20:46:56 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 6E3906120A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 0838B9400E2; Fri, 5 Nov 2021 16:46:56 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 00C3F9400C1; Fri, 5 Nov 2021 16:46:55 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E3D829400E2; Fri, 5 Nov 2021 16:46:55 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id D20839400C1 for ; Fri, 5 Nov 2021 16:46:55 -0400 (EDT) Received: from smtpin19.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 868DF1856B71B for ; Fri, 5 Nov 2021 20:46:55 +0000 (UTC) X-FDA: 78776060706.19.BF9BA53 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf15.hostedemail.com (Postfix) with ESMTP id 27A5ED0000B4 for ; Fri, 5 Nov 2021 20:46:44 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id BC8A06128E; Fri, 5 Nov 2021 20:46:53 +0000 (UTC) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145214; bh=X2dtMA5XR8MkX8/Diz0HAs67rNqo/HUxQu7F3sk06ik=; h=Date:From:To:Subject:In-Reply-To:From; b=vuwdJsymLa+I8Drylttwy2hiO+Fb4DYztqsgdy2QNGwKPQdmOU2EcYdZ61o188jII hCoIb8XoEprlNkNPHKRmStJocOWxhTXu4/Q1prxA+AMMvfHhIxs7sFNQSdhJFuVkrD CSAt3NEjVFf7AZ0HvWmF+Y1so2RtbO12yufDDkXI= Date: Fri, 05 Nov 2021 13:46:53 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, brendanhiggins@google.com, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 233/262] mm/damon/vaddr: separate commonly usable functions Message-ID: <20211105204653.y2eZmAAOY%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 27A5ED0000B4 X-Stat-Signature: beazip7u6yof71gfsbp6wwgcdbt7a1fx Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=vuwdJsym; dmarc=none; spf=pass (imf15.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636145204-621121 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/vaddr: separate commonly usable functions This commit moves the functions in the default virtual address space monitoring primitives that are also usable for other address spaces, such as the physical address space, into a common source file and header. 
Those will be reused by the physical address space monitoring primitives which will be implemented by the following commit. [sj@kernel.org: include 'highmem.h' to fix a build failure] Link: https://lkml.kernel.org/r/20211014110848.5204-1-sj@kernel.org Link: https://lkml.kernel.org/r/20211012205711.29216-5-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: Brendan Higgins Cc: David Hildenbrand Cc: David Rienjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- mm/damon/Makefile | 2 mm/damon/prmtv-common.c | 87 +++++++++++++++++++++++++++++++++++++ mm/damon/prmtv-common.h | 17 +++++++ mm/damon/vaddr.c | 88 +------------------------------------- 4 files changed, 108 insertions(+), 86 deletions(-) --- a/mm/damon/Makefile~mm-damon-vaddr-separate-commonly-usable-functions +++ a/mm/damon/Makefile @@ -1,5 +1,5 @@ # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_DAMON) := core.o -obj-$(CONFIG_DAMON_VADDR) += vaddr.o +obj-$(CONFIG_DAMON_VADDR) += prmtv-common.o vaddr.o obj-$(CONFIG_DAMON_DBGFS) += dbgfs.o --- /dev/null +++ a/mm/damon/prmtv-common.c @@ -0,0 +1,87 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Common Primitives for Data Access Monitoring + * + * Author: SeongJae Park + */ + +#include +#include +#include +#include + +#include "prmtv-common.h" + +/* + * Get an online page for a pfn if it's in the LRU list. Otherwise, returns + * NULL. + * + * The body of this function is stolen from the 'page_idle_get_page()'. We + * steal rather than reuse it because the code is quite simple. 
+ */ +struct page *damon_get_page(unsigned long pfn) +{ + struct page *page = pfn_to_online_page(pfn); + + if (!page || !PageLRU(page) || !get_page_unless_zero(page)) + return NULL; + + if (unlikely(!PageLRU(page))) { + put_page(page); + page = NULL; + } + return page; +} + +void damon_ptep_mkold(pte_t *pte, struct mm_struct *mm, unsigned long addr) +{ + bool referenced = false; + struct page *page = damon_get_page(pte_pfn(*pte)); + + if (!page) + return; + + if (pte_young(*pte)) { + referenced = true; + *pte = pte_mkold(*pte); + } + +#ifdef CONFIG_MMU_NOTIFIER + if (mmu_notifier_clear_young(mm, addr, addr + PAGE_SIZE)) + referenced = true; +#endif /* CONFIG_MMU_NOTIFIER */ + + if (referenced) + set_page_young(page); + + set_page_idle(page); + put_page(page); +} + +void damon_pmdp_mkold(pmd_t *pmd, struct mm_struct *mm, unsigned long addr) +{ +#ifdef CONFIG_TRANSPARENT_HUGEPAGE + bool referenced = false; + struct page *page = damon_get_page(pmd_pfn(*pmd)); + + if (!page) + return; + + if (pmd_young(*pmd)) { + referenced = true; + *pmd = pmd_mkold(*pmd); + } + +#ifdef CONFIG_MMU_NOTIFIER + if (mmu_notifier_clear_young(mm, addr, + addr + ((1UL) << HPAGE_PMD_SHIFT))) + referenced = true; +#endif /* CONFIG_MMU_NOTIFIER */ + + if (referenced) + set_page_young(page); + + set_page_idle(page); + put_page(page); +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ +} --- /dev/null +++ a/mm/damon/prmtv-common.h @@ -0,0 +1,17 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Common Primitives for Data Access Monitoring + * + * Author: SeongJae Park + */ + +#include +#include + +/* Get a random number in [l, r) */ +#define damon_rand(l, r) (l + prandom_u32_max(r - l)) + +struct page *damon_get_page(unsigned long pfn); + +void damon_ptep_mkold(pte_t *pte, struct mm_struct *mm, unsigned long addr); +void damon_pmdp_mkold(pmd_t *pmd, struct mm_struct *mm, unsigned long addr); --- a/mm/damon/vaddr.c~mm-damon-vaddr-separate-commonly-usable-functions +++ a/mm/damon/vaddr.c @@ -8,25 +8,19 @@ 
#define pr_fmt(fmt) "damon-va: " fmt #include -#include +#include #include -#include #include -#include #include #include -#include -#include -#include + +#include "prmtv-common.h" #ifdef CONFIG_DAMON_VADDR_KUNIT_TEST #undef DAMON_MIN_REGION #define DAMON_MIN_REGION 1 #endif -/* Get a random number in [l, r) */ -#define damon_rand(l, r) (l + prandom_u32_max(r - l)) - /* * 't->id' should be the pointer to the relevant 'struct pid' having reference * count. Caller must put the returned task, unless it is NULL. @@ -373,82 +367,6 @@ void damon_va_update(struct damon_ctx *c } } -/* - * Get an online page for a pfn if it's in the LRU list. Otherwise, returns - * NULL. - * - * The body of this function is stolen from the 'page_idle_get_page()'. We - * steal rather than reuse it because the code is quite simple. - */ -static struct page *damon_get_page(unsigned long pfn) -{ - struct page *page = pfn_to_online_page(pfn); - - if (!page || !PageLRU(page) || !get_page_unless_zero(page)) - return NULL; - - if (unlikely(!PageLRU(page))) { - put_page(page); - page = NULL; - } - return page; -} - -static void damon_ptep_mkold(pte_t *pte, struct mm_struct *mm, - unsigned long addr) -{ - bool referenced = false; - struct page *page = damon_get_page(pte_pfn(*pte)); - - if (!page) - return; - - if (pte_young(*pte)) { - referenced = true; - *pte = pte_mkold(*pte); - } - -#ifdef CONFIG_MMU_NOTIFIER - if (mmu_notifier_clear_young(mm, addr, addr + PAGE_SIZE)) - referenced = true; -#endif /* CONFIG_MMU_NOTIFIER */ - - if (referenced) - set_page_young(page); - - set_page_idle(page); - put_page(page); -} - -static void damon_pmdp_mkold(pmd_t *pmd, struct mm_struct *mm, - unsigned long addr) -{ -#ifdef CONFIG_TRANSPARENT_HUGEPAGE - bool referenced = false; - struct page *page = damon_get_page(pmd_pfn(*pmd)); - - if (!page) - return; - - if (pmd_young(*pmd)) { - referenced = true; - *pmd = pmd_mkold(*pmd); - } - -#ifdef CONFIG_MMU_NOTIFIER - if (mmu_notifier_clear_young(mm, addr, - addr + 
((1UL) << HPAGE_PMD_SHIFT))) - referenced = true; -#endif /* CONFIG_MMU_NOTIFIER */ - - if (referenced) - set_page_young(page); - - set_page_idle(page); - put_page(page); -#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ -} - static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr, unsigned long next, struct mm_walk *walk) { From patchwork Fri Nov 5 20:46:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605867 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E5F24C433FE for ; Fri, 5 Nov 2021 20:46:59 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 94BCA6126A for ; Fri, 5 Nov 2021 20:46:59 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 94BCA6126A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 343129400E3; Fri, 5 Nov 2021 16:46:59 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 2F20F9400C1; Fri, 5 Nov 2021 16:46:59 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1E1539400E3; Fri, 5 Nov 2021 16:46:59 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0119.hostedemail.com [216.40.44.119]) by kanga.kvack.org (Postfix) with ESMTP id 0B2E29400C1 for ; Fri, 5 Nov 2021 16:46:59 -0400 (EDT) Received: from smtpin15.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id C5FF01853B976 for ; Fri, 5 Nov 2021 20:46:58 +0000 (UTC) X-FDA: 
78776060916.15.1A470E4 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf26.hostedemail.com (Postfix) with ESMTP id 1D7DC20019F4 for ; Fri, 5 Nov 2021 20:46:59 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 2FC736120A; Fri, 5 Nov 2021 20:46:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145217; bh=Id5Qvr76rrcds6Fw3RnqoNggvyfMao2NX5X58SshUlE=; h=Date:From:To:Subject:In-Reply-To:From; b=WNJhaN8/4k3q/vubzu8hQXWRYbYJDTtxD7ozn7IWnkJXl81cn8zTFJQ4xI5DhDj6t pF7Qt58Fz9SvAlUtTp742Zj0lnyIRVNdp/ayMzrqWccFbkWMjEw0ck2rJo2rdDUPIr BUlxfmIKRebpTs/602KGxu2uxBkfYxmGwkAGmH80= Date: Fri, 05 Nov 2021 13:46:56 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, brendanhiggins@google.com, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 234/262] mm/damon: implement primitives for physical address space monitoring Message-ID: <20211105204656.ThpMcjpCQ%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 1D7DC20019F4 X-Stat-Signature: hkrxjkz1g61cbby4bz8zwuswwn8uzsfg Authentication-Results: imf26.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="WNJhaN8/"; spf=pass (imf26.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145219-125469 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae 
Park Subject: mm/damon: implement primitives for physical address space monitoring This commit implements the monitoring primitives for the physical memory address space. Internally, it uses the PTE Accessed bit, similar to that of the virtual address spaces monitoring primitives. It supports only user memory pages, as idle pages tracking does. If the monitoring target physical memory address range contains non-user memory pages, access check of the pages will do nothing but simply treat the pages as not accessed. Link: https://lkml.kernel.org/r/20211012205711.29216-6-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: Brendan Higgins Cc: David Hildenbrand Cc: David Rienjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- include/linux/damon.h | 10 + mm/damon/Kconfig | 8 + mm/damon/Makefile | 1 mm/damon/paddr.c | 224 ++++++++++++++++++++++++++++++++++++++++ 4 files changed, 243 insertions(+) --- a/include/linux/damon.h~mm-damon-implement-primitives-for-physical-address-space-monitoring +++ a/include/linux/damon.h @@ -351,4 +351,14 @@ void damon_va_set_primitives(struct damo #endif /* CONFIG_DAMON_VADDR */ +#ifdef CONFIG_DAMON_PADDR + +/* Monitoring primitives for the physical memory address space */ +void damon_pa_prepare_access_checks(struct damon_ctx *ctx); +unsigned int damon_pa_check_accesses(struct damon_ctx *ctx); +bool damon_pa_target_valid(void *t); +void damon_pa_set_primitives(struct damon_ctx *ctx); + +#endif /* CONFIG_DAMON_PADDR */ + #endif /* _DAMON_H */ --- a/mm/damon/Kconfig~mm-damon-implement-primitives-for-physical-address-space-monitoring +++ a/mm/damon/Kconfig @@ -32,6 +32,14 @@ config DAMON_VADDR This builds the default data access monitoring primitives for DAMON that work for virtual address spaces. 
+config DAMON_PADDR + bool "Data access monitoring primitives for the physical address space" + depends on DAMON && MMU + select PAGE_IDLE_FLAG + help + This builds the default data access monitoring primitives for DAMON + that works for the physical address space. + config DAMON_VADDR_KUNIT_TEST bool "Test for DAMON primitives" if !KUNIT_ALL_TESTS depends on DAMON_VADDR && KUNIT=y --- a/mm/damon/Makefile~mm-damon-implement-primitives-for-physical-address-space-monitoring +++ a/mm/damon/Makefile @@ -2,4 +2,5 @@ obj-$(CONFIG_DAMON) := core.o obj-$(CONFIG_DAMON_VADDR) += prmtv-common.o vaddr.o +obj-$(CONFIG_DAMON_PADDR) += prmtv-common.o paddr.o obj-$(CONFIG_DAMON_DBGFS) += dbgfs.o --- /dev/null +++ a/mm/damon/paddr.c @@ -0,0 +1,224 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * DAMON Primitives for The Physical Address Space + * + * Author: SeongJae Park + */ + +#define pr_fmt(fmt) "damon-pa: " fmt + +#include +#include +#include +#include + +#include "prmtv-common.h" + +static bool __damon_pa_mkold(struct page *page, struct vm_area_struct *vma, + unsigned long addr, void *arg) +{ + struct page_vma_mapped_walk pvmw = { + .page = page, + .vma = vma, + .address = addr, + }; + + while (page_vma_mapped_walk(&pvmw)) { + addr = pvmw.address; + if (pvmw.pte) + damon_ptep_mkold(pvmw.pte, vma->vm_mm, addr); + else + damon_pmdp_mkold(pvmw.pmd, vma->vm_mm, addr); + } + return true; +} + +static void damon_pa_mkold(unsigned long paddr) +{ + struct page *page = damon_get_page(PHYS_PFN(paddr)); + struct rmap_walk_control rwc = { + .rmap_one = __damon_pa_mkold, + .anon_lock = page_lock_anon_vma_read, + }; + bool need_lock; + + if (!page) + return; + + if (!page_mapped(page) || !page_rmapping(page)) { + set_page_idle(page); + goto out; + } + + need_lock = !PageAnon(page) || PageKsm(page); + if (need_lock && !trylock_page(page)) + goto out; + + rmap_walk(page, &rwc); + + if (need_lock) + unlock_page(page); + +out: + put_page(page); +} + +static void 
__damon_pa_prepare_access_check(struct damon_ctx *ctx, + struct damon_region *r) +{ + r->sampling_addr = damon_rand(r->ar.start, r->ar.end); + + damon_pa_mkold(r->sampling_addr); +} + +void damon_pa_prepare_access_checks(struct damon_ctx *ctx) +{ + struct damon_target *t; + struct damon_region *r; + + damon_for_each_target(t, ctx) { + damon_for_each_region(r, t) + __damon_pa_prepare_access_check(ctx, r); + } +} + +struct damon_pa_access_chk_result { + unsigned long page_sz; + bool accessed; +}; + +static bool __damon_pa_young(struct page *page, struct vm_area_struct *vma, + unsigned long addr, void *arg) +{ + struct damon_pa_access_chk_result *result = arg; + struct page_vma_mapped_walk pvmw = { + .page = page, + .vma = vma, + .address = addr, + }; + + result->accessed = false; + result->page_sz = PAGE_SIZE; + while (page_vma_mapped_walk(&pvmw)) { + addr = pvmw.address; + if (pvmw.pte) { + result->accessed = pte_young(*pvmw.pte) || + !page_is_idle(page) || + mmu_notifier_test_young(vma->vm_mm, addr); + } else { +#ifdef CONFIG_TRANSPARENT_HUGEPAGE + result->accessed = pmd_young(*pvmw.pmd) || + !page_is_idle(page) || + mmu_notifier_test_young(vma->vm_mm, addr); + result->page_sz = ((1UL) << HPAGE_PMD_SHIFT); +#else + WARN_ON_ONCE(1); +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ + } + if (result->accessed) { + page_vma_mapped_walk_done(&pvmw); + break; + } + } + + /* If accessed, stop walking */ + return !result->accessed; +} + +static bool damon_pa_young(unsigned long paddr, unsigned long *page_sz) +{ + struct page *page = damon_get_page(PHYS_PFN(paddr)); + struct damon_pa_access_chk_result result = { + .page_sz = PAGE_SIZE, + .accessed = false, + }; + struct rmap_walk_control rwc = { + .arg = &result, + .rmap_one = __damon_pa_young, + .anon_lock = page_lock_anon_vma_read, + }; + bool need_lock; + + if (!page) + return false; + + if (!page_mapped(page) || !page_rmapping(page)) { + if (page_is_idle(page)) + result.accessed = false; + else + result.accessed = true; + 
put_page(page); + goto out; + } + + need_lock = !PageAnon(page) || PageKsm(page); + if (need_lock && !trylock_page(page)) { + put_page(page); + return false; + } + + rmap_walk(page, &rwc); + + if (need_lock) + unlock_page(page); + put_page(page); + +out: + *page_sz = result.page_sz; + return result.accessed; +} + +static void __damon_pa_check_access(struct damon_ctx *ctx, + struct damon_region *r) +{ + static unsigned long last_addr; + static unsigned long last_page_sz = PAGE_SIZE; + static bool last_accessed; + + /* If the region is in the last checked page, reuse the result */ + if (ALIGN_DOWN(last_addr, last_page_sz) == + ALIGN_DOWN(r->sampling_addr, last_page_sz)) { + if (last_accessed) + r->nr_accesses++; + return; + } + + last_accessed = damon_pa_young(r->sampling_addr, &last_page_sz); + if (last_accessed) + r->nr_accesses++; + + last_addr = r->sampling_addr; +} + +unsigned int damon_pa_check_accesses(struct damon_ctx *ctx) +{ + struct damon_target *t; + struct damon_region *r; + unsigned int max_nr_accesses = 0; + + damon_for_each_target(t, ctx) { + damon_for_each_region(r, t) { + __damon_pa_check_access(ctx, r); + max_nr_accesses = max(r->nr_accesses, max_nr_accesses); + } + } + + return max_nr_accesses; +} + +bool damon_pa_target_valid(void *t) +{ + return true; +} + +void damon_pa_set_primitives(struct damon_ctx *ctx) +{ + ctx->primitive.init = NULL; + ctx->primitive.update = NULL; + ctx->primitive.prepare_access_checks = damon_pa_prepare_access_checks; + ctx->primitive.check_accesses = damon_pa_check_accesses; + ctx->primitive.reset_aggregated = NULL; + ctx->primitive.target_valid = damon_pa_target_valid; + ctx->primitive.cleanup = NULL; + ctx->primitive.apply_scheme = NULL; +} From patchwork Fri Nov 5 20:47:00 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605869 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9BBF5C433F5 for ; Fri, 5 Nov 2021 20:47:03 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4838A6126A for ; Fri, 5 Nov 2021 20:47:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 4838A6126A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id DA2FD9400E4; Fri, 5 Nov 2021 16:47:02 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D52749400C1; Fri, 5 Nov 2021 16:47:02 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C1C0E9400E4; Fri, 5 Nov 2021 16:47:02 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0200.hostedemail.com [216.40.44.200]) by kanga.kvack.org (Postfix) with ESMTP id A73FE9400C1 for ; Fri, 5 Nov 2021 16:47:02 -0400 (EDT) Received: from smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 5F9EF1853B976 for ; Fri, 5 Nov 2021 20:47:02 +0000 (UTC) X-FDA: 78776061084.16.02DD817 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf13.hostedemail.com (Postfix) with ESMTP id E23CA104AAC1 for ; Fri, 5 Nov 2021 20:46:52 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id A9F0061288; Fri, 5 Nov 2021 20:47:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145221; bh=ZqvSLOptRtXgrw/ZXj/yS/IxnbTa3Q+1eGWCzJgRhQw=; h=Date:From:To:Subject:In-Reply-To:From; b=z7NY+RqMIhDkoSPy2nr2uZjqrIlch0n63CBgB9gWaF5KvfJCKzAkrPv9qV9KMzbaM 
DjlgnSHEK7eYYd/iwNQ9PUG+31aUCCkPY0ZxvmVT7Ec+Gkt7a6gooXiERL7dppeUvH S0FeuloFi49Bg5i+YwtzLzz09fPl35bJ2ajh1z2E= Date: Fri, 05 Nov 2021 13:47:00 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, brendanhiggins@google.com, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 235/262] mm/damon/dbgfs: support physical memory monitoring Message-ID: <20211105204700.1_it9abqc%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: E23CA104AAC1 X-Stat-Signature: spmc1nbqitd1jdigw4m3mfjtrb6gu6hq Authentication-Results: imf13.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=z7NY+RqM; dmarc=none; spf=pass (imf13.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636145212-585758 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/dbgfs: support physical memory monitoring This commit makes 'damon-dbgfs' support physical memory monitoring, in addition to virtual memory monitoring. Users can start physical memory monitoring by writing a special keyword, 'paddr', to the 'target_ids' debugfs file. DAMON then checks for the keyword and configures the monitoring context to run with the primitives for the physical address space. Unlike virtual memory monitoring, the monitoring target region is not set automatically. 
Therefore, users should also set the monitoring target address region using the 'init_regions' debugfs file. Also, note that physical memory monitoring is not terminated automatically. The user should explicitly turn off the monitoring by writing 'off' to the 'monitor_on' debugfs file. Link: https://lkml.kernel.org/r/20211012205711.29216-7-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: Brendan Higgins Cc: David Hildenbrand Cc: David Rienjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- mm/damon/Kconfig | 2 +- mm/damon/dbgfs.c | 21 ++++++++++++++++++--- 2 files changed, 19 insertions(+), 4 deletions(-) --- a/mm/damon/dbgfs.c~mm-damon-dbgfs-support-physical-memory-monitoring +++ a/mm/damon/dbgfs.c @@ -339,6 +339,7 @@ static ssize_t dbgfs_target_ids_write(st const char __user *buf, size_t count, loff_t *ppos) { struct damon_ctx *ctx = file->private_data; + bool id_is_pid = true; char *kbuf, *nrs; unsigned long *targets; ssize_t nr_targets; @@ -351,6 +352,11 @@ static ssize_t dbgfs_target_ids_write(st return PTR_ERR(kbuf); nrs = kbuf; + if (!strncmp(kbuf, "paddr\n", count)) { + id_is_pid = false; + /* target id is meaningless here, but we set it just for fun */ + scnprintf(kbuf, count, "42 "); + } targets = str_to_target_ids(nrs, ret, &nr_targets); if (!targets) { @@ -358,7 +364,7 @@ static ssize_t dbgfs_target_ids_write(st goto out; } - if (targetid_is_pid(ctx)) { + if (id_is_pid) { for (i = 0; i < nr_targets; i++) { targets[i] = (unsigned long)find_get_pid( (int)targets[i]); @@ -372,15 +378,24 @@ static ssize_t dbgfs_target_ids_write(st mutex_lock(&ctx->kdamond_lock); if (ctx->kdamond) { - if (targetid_is_pid(ctx)) + if (id_is_pid) dbgfs_put_pids(targets, nr_targets); ret = -EBUSY; goto unlock_out; } + /* remove targets with previously-set primitive */ + 
damon_set_targets(ctx, NULL, 0); + + /* Configure the context for the address space type */ + if (id_is_pid) + damon_va_set_primitives(ctx); + else + damon_pa_set_primitives(ctx); + err = damon_set_targets(ctx, targets, nr_targets); if (err) { - if (targetid_is_pid(ctx)) + if (id_is_pid) dbgfs_put_pids(targets, nr_targets); ret = err; } --- a/mm/damon/Kconfig~mm-damon-dbgfs-support-physical-memory-monitoring +++ a/mm/damon/Kconfig @@ -54,7 +54,7 @@ config DAMON_VADDR_KUNIT_TEST config DAMON_DBGFS bool "DAMON debugfs interface" - depends on DAMON_VADDR && DEBUG_FS + depends on DAMON_VADDR && DAMON_PADDR && DEBUG_FS help This builds the debugfs interface for DAMON. The user space admins can use the interface for arbitrary data access monitoring. From patchwork Fri Nov 5 20:47:03 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605871 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 21B41C433EF for ; Fri, 5 Nov 2021 20:47:07 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id CB15360240 for ; Fri, 5 Nov 2021 20:47:06 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org CB15360240 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 629B79400E5; Fri, 5 Nov 2021 16:47:06 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 5D9289400C1; Fri, 5 Nov 2021 16:47:06 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4A1F69400E5; Fri, 5 Nov 2021 16:47:06 -0400 (EDT) 
X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0140.hostedemail.com [216.40.44.140]) by kanga.kvack.org (Postfix) with ESMTP id 357359400C1 for ; Fri, 5 Nov 2021 16:47:06 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id E2B301819609D for ; Fri, 5 Nov 2021 20:47:05 +0000 (UTC) X-FDA: 78776061210.22.5596F40 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf15.hostedemail.com (Postfix) with ESMTP id 7955ED0000A8 for ; Fri, 5 Nov 2021 20:46:54 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 2C82B60174; Fri, 5 Nov 2021 20:47:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145224; bh=M5GLFS5tB6wqOdP/GMOSX3S4NFfMPBQx++niQv3FIyA=; h=Date:From:To:Subject:In-Reply-To:From; b=XryPuswlY1CW1XvtarrtsIxBRptDdiQfuMugTb0HeHwgKL/y1QFv4A46fxuqbr8Sc iEik3a6cnbpus0cUgExybBWYShzw4tnwVK9ttbOzgZQaQTU4d5O3DAUHjalVgrftEo me2zj5buzdLjTWA2cjlPHqe7x1fRqAGnQO7MRW+w= Date: Fri, 05 Nov 2021 13:47:03 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, brendanhiggins@google.com, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 236/262] Docs/DAMON: document physical memory monitoring support Message-ID: <20211105204703.vXHvH1Bl-%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 MIME-Version: 1.0 Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=XryPuswl; dmarc=none; spf=pass (imf15.hostedemail.com: domain of 
akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 7955ED0000A8 X-Stat-Signature: 35bqen6ppmp68p4t5k63d9waxkw1hm19 X-HE-Tag: 1636145214-659616 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: Docs/DAMON: document physical memory monitoring support This commit updates the DAMON documents for the physical memory address space monitoring support. Link: https://lkml.kernel.org/r/20211012205711.29216-8-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: Brendan Higgins Cc: David Hildenbrand Cc: David Rienjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- Documentation/admin-guide/mm/damon/usage.rst | 25 +++++++++++--- Documentation/vm/damon/design.rst | 29 ++++++++++------- Documentation/vm/damon/faq.rst | 5 +- 3 files changed, 40 insertions(+), 19 deletions(-) --- a/Documentation/admin-guide/mm/damon/usage.rst~docs-damon-document-physical-memory-monitoring-support +++ a/Documentation/admin-guide/mm/damon/usage.rst @@ -10,15 +10,16 @@ DAMON provides below three interfaces fo This is for privileged people such as system administrators who want a just-working human-friendly interface. Using this, users can use the DAMON’s major features in a human-friendly way. It may not be highly tuned for - special cases, though. It supports only virtual address spaces monitoring. + special cases, though. It supports both virtual and physical address spaces + monitoring. - *debugfs interface.* This is for privileged user space programmers who want more optimized use of DAMON. 
Using this, users can use DAMON’s major features by reading from and writing to special debugfs files. Therefore, you can write and use your personalized DAMON debugfs wrapper programs that reads/writes the debugfs files instead of you. The DAMON user space tool is also a reference - implementation of such programs. It supports only virtual address spaces - monitoring. + implementation of such programs. It supports both virtual and physical + address spaces monitoring. - *Kernel Space Programming Interface.* This is for kernel space programmers. Using this, users can utilize every feature of DAMON most flexibly and efficiently by writing kernel space @@ -72,20 +73,34 @@ check it again:: # cat target_ids 42 4242 +Users can also monitor the physical memory address space of the system by +writing a special keyword, "``paddr\n``" to the file. Because physical address +space monitoring doesn't support multiple targets, reading the file will show a +fake value, ``42``, as below:: + + # cd /damon + # echo paddr > target_ids + # cat target_ids + 42 + Note that setting the target ids doesn't start the monitoring. Initial Monitoring Target Regions --------------------------------- -In case of the debugfs based monitoring, DAMON automatically sets and updates -the monitoring target regions so that entire memory mappings of target +In case of the virtual address space monitoring, DAMON automatically sets and +updates the monitoring target regions so that entire memory mappings of target processes can be covered. However, users can want to limit the monitoring region to specific address ranges, such as the heap, the stack, or specific file-mapped area. Or, some users can know the initial access pattern of their workloads and therefore want to set optimal initial regions for the 'adaptive regions adjustment'. +In contrast, DAMON do not automatically sets and updates the monitoring target +regions in case of physical memory monitoring. 
Therefore, users should set the +monitoring target regions by themselves. + In such cases, users can explicitly set the initial monitoring target regions as they want, by writing proper values to the ``init_regions`` file. Each line of the input should represent one region in below form.:: --- a/Documentation/vm/damon/design.rst~docs-damon-document-physical-memory-monitoring-support +++ a/Documentation/vm/damon/design.rst @@ -35,13 +35,17 @@ two parts: 1. Identification of the monitoring target address range for the address space. 2. Access check of specific address range in the target space. -DAMON currently provides the implementation of the primitives for only the -virtual address spaces. Below two subsections describe how it works. +DAMON currently provides the implementations of the primitives for the physical +and virtual address spaces. Below two subsections describe how those work. VMA-based Target Address Range Construction ------------------------------------------- +This is only for the virtual address space primitives implementation. That for +the physical address space simply asks users to manually set the monitoring +target address ranges. + Only small parts in the super-huge virtual address space of the processes are mapped to the physical memory and accessed. Thus, tracking the unmapped address regions is just wasteful. However, because DAMON can deal with some @@ -71,15 +75,18 @@ to make a reasonable trade-off. Below s PTE Accessed-bit Based Access Check ----------------------------------- -The implementation for the virtual address space uses PTE Accessed-bit for -basic access checks. It finds the relevant PTE Accessed bit from the address -by walking the page table for the target task of the address. In this way, the -implementation finds and clears the bit for next sampling target address and -checks whether the bit set again after one sampling period. 
This could disturb -other kernel subsystems using the Accessed bits, namely Idle page tracking and -the reclaim logic. To avoid such disturbances, DAMON makes it mutually -exclusive with Idle page tracking and uses ``PG_idle`` and ``PG_young`` page -flags to solve the conflict with the reclaim logic, as Idle page tracking does. +Both of the implementations for physical and virtual address spaces use PTE +Accessed-bit for basic access checks. Only one difference is the way of +finding the relevant PTE Accessed bit(s) from the address. While the +implementation for the virtual address walks the page table for the target task +of the address, the implementation for the physical address walks every page +table having a mapping to the address. In this way, the implementations find +and clear the bit(s) for next sampling target address and checks whether the +bit(s) set again after one sampling period. This could disturb other kernel +subsystems using the Accessed bits, namely Idle page tracking and the reclaim +logic. To avoid such disturbances, DAMON makes it mutually exclusive with Idle +page tracking and uses ``PG_idle`` and ``PG_young`` page flags to solve the +conflict with the reclaim logic, as Idle page tracking does. Address Space Independent Core Mechanisms --- a/Documentation/vm/damon/faq.rst~docs-damon-document-physical-memory-monitoring-support +++ a/Documentation/vm/damon/faq.rst @@ -36,10 +36,9 @@ constructions and actual access checks c DAMON core by the users. In this way, DAMON users can monitor any address space with any access check technique. -Nonetheless, DAMON provides vma tracking and PTE Accessed bit check based +Nonetheless, DAMON provides vma/rmap tracking and PTE Accessed bit check based implementations of the address space dependent functions for the virtual memory -by default, for a reference and convenient use. In near future, we will -provide those for physical memory address space. 
+and the physical memory by default, for a reference and convenient use. Can I simply monitor page granularity? From patchwork Fri Nov 5 20:47:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605873 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D7AA4C433EF for ; Fri, 5 Nov 2021 20:47:09 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9C0776056B for ; Fri, 5 Nov 2021 20:47:09 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 9C0776056B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 38FE39400E6; Fri, 5 Nov 2021 16:47:09 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 341BA9400C1; Fri, 5 Nov 2021 16:47:09 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1E1789400E6; Fri, 5 Nov 2021 16:47:09 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0225.hostedemail.com [216.40.44.225]) by kanga.kvack.org (Postfix) with ESMTP id 05A209400C1 for ; Fri, 5 Nov 2021 16:47:09 -0400 (EDT) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id BF8738249980 for ; Fri, 5 Nov 2021 20:47:08 +0000 (UTC) X-FDA: 78776061336.06.F51EA85 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf22.hostedemail.com (Postfix) with ESMTP id 6C4DC1907 for ; Fri, 5 Nov 2021 20:47:08 +0000 (UTC) Received: by mail.kernel.org (Postfix) with 
ESMTPSA id 7895A60240; Fri, 5 Nov 2021 20:47:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145227; bh=87zvzF9/N7d3ulr8MqQTzSdhjGE1/wQ5xaFcOURjg6M=; h=Date:From:To:Subject:In-Reply-To:From; b=yhYZLuUGkR0hfixY3Qvz1PsUH2TXx7BNvRn2rpO2Lqj3YR1BXcrfFEg9nQRz+38G7 RyEKt5uoSzUBd7E3XszwyBTdpr2KKYNNpD9hwd0aQ4MtmDSY2OP2LJRS5lAVSntZBD /QQtf/VnII7vk4YOi3XAyn8k/u1tPUDGT9CuwYqc= Date: Fri, 05 Nov 2021 13:47:07 -0700 From: Andrew Morton To: akpm@linux-foundation.org, anshuman.khandual@arm.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, rikard.falkeborn@gmail.com, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 237/262] mm/damon/vaddr: constify static mm_walk_ops Message-ID: <20211105204707.YEomvIVb6%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 6C4DC1907 X-Stat-Signature: iesqzyp8ajygja31meh8yg9udknps8wq Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=yhYZLuUG; spf=pass (imf22.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145228-376991 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Rikard Falkeborn Subject: mm/damon/vaddr: constify static mm_walk_ops The only usage of these structs is to pass their addresses to walk_page_range(), which takes a pointer to const mm_walk_ops as argument. Make them const to allow the compiler to put them in read-only memory. 
Link: https://lkml.kernel.org/r/20211014075042.17174-2-rikard.falkeborn@gmail.com Signed-off-by: Rikard Falkeborn Reviewed-by: SeongJae Park Reviewed-by: Anshuman Khandual Signed-off-by: Andrew Morton --- mm/damon/vaddr.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/mm/damon/vaddr.c~mm-damon-vaddr-constify-static-mm_walk_ops +++ a/mm/damon/vaddr.c @@ -394,7 +394,7 @@ out: return 0; } -static struct mm_walk_ops damon_mkold_ops = { +static const struct mm_walk_ops damon_mkold_ops = { .pmd_entry = damon_mkold_pmd_entry, }; @@ -490,7 +490,7 @@ out: return 0; } -static struct mm_walk_ops damon_young_ops = { +static const struct mm_walk_ops damon_young_ops = { .pmd_entry = damon_young_pmd_entry, }; From patchwork Fri Nov 5 20:47:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605881 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5085BC433F5 for ; Fri, 5 Nov 2021 20:47:27 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 09A286056B for ; Fri, 5 Nov 2021 20:47:27 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 09A286056B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 95E1E9400EB; Fri, 5 Nov 2021 16:47:26 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 90C339400C1; Fri, 5 Nov 2021 16:47:26 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7FB449400EA; Fri, 5 Nov 2021 16:47:26 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org 
Received: from forelay.hostedemail.com (smtprelay0043.hostedemail.com [216.40.44.43]) by kanga.kvack.org (Postfix) with ESMTP id 6CF4F9400C1 for ; Fri, 5 Nov 2021 16:47:26 -0400 (EDT) Received: from smtpin14.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 34D3018522C44 for ; Fri, 5 Nov 2021 20:47:26 +0000 (UTC) X-FDA: 78776062092.14.3C128EF Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf19.hostedemail.com (Postfix) with ESMTP id A2505B00009D for ; Fri, 5 Nov 2021 20:47:16 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 641F36056B; Fri, 5 Nov 2021 20:47:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145230; bh=BTrcbZ1I8eoCCI4Xw4uQdbdDJW5qGKGFPbszge1bsac=; h=Date:From:To:Subject:In-Reply-To:From; b=a/T3lkjZycItE1lZQgrzsWQrO+MwLoOLnAyiD3vmWnyKaOdqiqDbjvM5L52Zgbe6S OY6H8baqG/f+JaU/pdXGYlmNfzACgLhT+pSPyU4NfJ8hEyKJNgYbX2mXfjZV1YcG7O uMPkbVK2Ra+TBlqwj7DqnX5lA38bSryYT/Ch7uKQ= Date: Fri, 05 Nov 2021 13:47:09 -0700 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, rongwei.wang@linux.alibaba.com, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 238/262] mm/damon/dbgfs: remove unnecessary variables Message-ID: <20211105204709.sPppLLlKi%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: A2505B00009D X-Stat-Signature: 7jetpyupbpyt64k8af519sbf9ozpe5t1 Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="a/T3lkjZ"; spf=pass (imf19.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145236-350140 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: 
owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Rongwei Wang Subject: mm/damon/dbgfs: remove unnecessary variables In some functions, it's unnecessary to declare 'err' and 'ret' variables at the same time. This patch mainly to simplify the issue of such declarations by reusing one variable. Link: https://lkml.kernel.org/r/20211014073014.35754-1-sj@kernel.org Signed-off-by: Rongwei Wang Signed-off-by: SeongJae Park Signed-off-by: Andrew Morton --- mm/damon/dbgfs.c | 66 +++++++++++++++++++++------------------------ 1 file changed, 31 insertions(+), 35 deletions(-) --- a/mm/damon/dbgfs.c~mm-damon-dbgfs-remove-unnecessary-variables +++ a/mm/damon/dbgfs.c @@ -69,8 +69,7 @@ static ssize_t dbgfs_attrs_write(struct struct damon_ctx *ctx = file->private_data; unsigned long s, a, r, minr, maxr; char *kbuf; - ssize_t ret = count; - int err; + ssize_t ret; kbuf = user_input_str(buf, count, ppos); if (IS_ERR(kbuf)) @@ -88,9 +87,9 @@ static ssize_t dbgfs_attrs_write(struct goto unlock_out; } - err = damon_set_attrs(ctx, s, a, r, minr, maxr); - if (err) - ret = err; + ret = damon_set_attrs(ctx, s, a, r, minr, maxr); + if (!ret) + ret = count; unlock_out: mutex_unlock(&ctx->kdamond_lock); out: @@ -220,14 +219,13 @@ static ssize_t dbgfs_schemes_write(struc struct damon_ctx *ctx = file->private_data; char *kbuf; struct damos **schemes; - ssize_t nr_schemes = 0, ret = count; - int err; + ssize_t nr_schemes = 0, ret; kbuf = user_input_str(buf, count, ppos); if (IS_ERR(kbuf)) return PTR_ERR(kbuf); - schemes = str_to_schemes(kbuf, ret, &nr_schemes); + schemes = str_to_schemes(kbuf, count, &nr_schemes); if (!schemes) { ret = -EINVAL; goto out; @@ -239,11 +237,12 @@ static ssize_t dbgfs_schemes_write(struc goto unlock_out; } - err = damon_set_schemes(ctx, schemes, nr_schemes); - if (err) - ret = err; - else + ret = damon_set_schemes(ctx, schemes, nr_schemes); + if (!ret) { + ret = count; nr_schemes = 0; + } + unlock_out: 
mutex_unlock(&ctx->kdamond_lock); free_schemes_arr(schemes, nr_schemes); @@ -343,9 +342,8 @@ static ssize_t dbgfs_target_ids_write(st char *kbuf, *nrs; unsigned long *targets; ssize_t nr_targets; - ssize_t ret = count; + ssize_t ret; int i; - int err; kbuf = user_input_str(buf, count, ppos); if (IS_ERR(kbuf)) @@ -358,7 +356,7 @@ static ssize_t dbgfs_target_ids_write(st scnprintf(kbuf, count, "42 "); } - targets = str_to_target_ids(nrs, ret, &nr_targets); + targets = str_to_target_ids(nrs, count, &nr_targets); if (!targets) { ret = -ENOMEM; goto out; @@ -393,11 +391,12 @@ static ssize_t dbgfs_target_ids_write(st else damon_pa_set_primitives(ctx); - err = damon_set_targets(ctx, targets, nr_targets); - if (err) { + ret = damon_set_targets(ctx, targets, nr_targets); + if (ret) { if (id_is_pid) dbgfs_put_pids(targets, nr_targets); - ret = err; + } else { + ret = count; } unlock_out: @@ -715,8 +714,7 @@ static ssize_t dbgfs_mk_context_write(st { char *kbuf; char *ctx_name; - ssize_t ret = count; - int err; + ssize_t ret; kbuf = user_input_str(buf, count, ppos); if (IS_ERR(kbuf)) @@ -734,9 +732,9 @@ static ssize_t dbgfs_mk_context_write(st } mutex_lock(&damon_dbgfs_lock); - err = dbgfs_mk_context(ctx_name); - if (err) - ret = err; + ret = dbgfs_mk_context(ctx_name); + if (!ret) + ret = count; mutex_unlock(&damon_dbgfs_lock); out: @@ -805,8 +803,7 @@ static ssize_t dbgfs_rm_context_write(st const char __user *buf, size_t count, loff_t *ppos) { char *kbuf; - ssize_t ret = count; - int err; + ssize_t ret; char *ctx_name; kbuf = user_input_str(buf, count, ppos); @@ -825,9 +822,9 @@ static ssize_t dbgfs_rm_context_write(st } mutex_lock(&damon_dbgfs_lock); - err = dbgfs_rm_context(ctx_name); - if (err) - ret = err; + ret = dbgfs_rm_context(ctx_name); + if (!ret) + ret = count; mutex_unlock(&damon_dbgfs_lock); out: @@ -851,9 +848,8 @@ static ssize_t dbgfs_monitor_on_read(str static ssize_t dbgfs_monitor_on_write(struct file *file, const char __user *buf, size_t count, loff_t 
*ppos) { - ssize_t ret = count; + ssize_t ret; char *kbuf; - int err; kbuf = user_input_str(buf, count, ppos); if (IS_ERR(kbuf)) @@ -866,14 +862,14 @@ static ssize_t dbgfs_monitor_on_write(st } if (!strncmp(kbuf, "on", count)) - err = damon_start(dbgfs_ctxs, dbgfs_nr_ctxs); + ret = damon_start(dbgfs_ctxs, dbgfs_nr_ctxs); else if (!strncmp(kbuf, "off", count)) - err = damon_stop(dbgfs_ctxs, dbgfs_nr_ctxs); + ret = damon_stop(dbgfs_ctxs, dbgfs_nr_ctxs); else - err = -EINVAL; + ret = -EINVAL; - if (err) - ret = err; + if (!ret) + ret = count; kfree(kbuf); return ret; } From patchwork Fri Nov 5 20:47:13 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605875 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D5432C433FE for ; Fri, 5 Nov 2021 20:47:16 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 60BF660FBF for ; Fri, 5 Nov 2021 20:47:16 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 60BF660FBF Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id F01579400E7; Fri, 5 Nov 2021 16:47:15 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id EB17F9400C1; Fri, 5 Nov 2021 16:47:15 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DA0F69400E7; Fri, 5 Nov 2021 16:47:15 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0204.hostedemail.com [216.40.44.204]) by kanga.kvack.org (Postfix) with ESMTP id C4F379400C1 for ; Fri, 5 
Nov 2021 16:47:15 -0400 (EDT) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 9338077994 for ; Fri, 5 Nov 2021 20:47:15 +0000 (UTC) X-FDA: 78776061630.11.B173ED1 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf26.hostedemail.com (Postfix) with ESMTP id D5CB720019EB for ; Fri, 5 Nov 2021 20:47:15 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id D4DF060174; Fri, 5 Nov 2021 20:47:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145234; bh=X7jEzMW55rkoCXUv68eLTCv6jb09tFNAmn1jtUlFabI=; h=Date:From:To:Subject:In-Reply-To:From; b=2WLCBndRV1UgTgJZXAdQYF6hyqC8ByzlWIeEF+8Gbxroj0EynvESzBW3UWbjw3fXI afIxqweNXuP09qT0RgZRIdHqFuGIf+loO434JMqSEG89nhTKapNYetsejNsOZv1mNT cwiCRcqPf78GI0ZiU7Pt5GOLdDGsyg2zjpQR+TMw= Date: Fri, 05 Nov 2021 13:47:13 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 239/262] mm/damon/paddr: support the pageout scheme Message-ID: <20211105204713.jQcouN_Jl%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: D5CB720019EB X-Stat-Signature: w4nu4i883uo49dtkaicwuh88iqban3sd Authentication-Results: imf26.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=2WLCBndR; dmarc=none; spf=pass (imf26.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636145235-314004 
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/paddr: support the pageout scheme Introduction ============ This patchset 1) makes the engine for general data access pattern-oriented memory management (DAMOS) be more useful for production environments, and 2) implements a static kernel module for lightweight proactive reclamation using the engine. Proactive Reclamation --------------------- On general memory over-committed systems, proactively reclaiming cold pages helps saving memory and reducing latency spikes that incurred by the direct reclaim or the CPU consumption of kswapd, while incurring only minimal performance degradation[2]. A Free Pages Reporting[8] based memory over-commit virtualization system would be one more specific use case. In the system, the guest VMs reports their free memory to host, and the host reallocates the reported memory to other guests. As a result, the system's memory utilization can be maximized. However, the guests could be not so memory-frugal, because some kernel subsystems and user-space applications are designed to use as much memory as available. Then, guests would report only small amount of free memory to host, results in poor memory utilization. Running the proactive reclamation in such guests could help mitigating this problem. Google has also implemented this idea and using it in their data center. They further proposed upstreaming it in LSFMM'19, and "the general consensus was that, while this sort of proactive reclaim would be useful for a number of users, the cost of this particular solution was too high to consider merging it upstream"[3]. The cost mainly comes from the coldness tracking. Roughly speaking, the implementation periodically scans the 'Accessed' bit of each page. 
For this reason, the overhead increases linearly as the size of the memory and the scanning frequency grow. As a result, Google is known to dedicate one CPU to the work. That's a reasonable option for someone like Google, but it wouldn't be for some others.

DAMON and DAMOS: An engine for data access pattern-oriented memory management
-----------------------------------------------------------------------------

DAMON[4] is a framework for general data access monitoring. Its adaptive monitoring overhead control feature minimizes its monitoring overhead. It also lets clients configure the upper bound of the overhead, regardless of the size of the monitoring target memory. While monitoring 70 GiB memory of a production system every 5 milliseconds, it consumes less than 1% of a single CPU's time. For this, it could sacrifice some of the quality of the monitoring results. Nevertheless, the lower bound of the quality is configurable, and it uses a best-effort algorithm for better quality. Our test results[5] show the quality is practical enough. From the production system monitoring, we were able to find a 4 KiB region in the 70 GiB memory that shows the highest access frequency.

We normally don't monitor the data access pattern just for fun but to improve something like memory management. Proactive reclamation is one such usage. For such general cases, DAMON provides a feature called DAMON-based Operation Schemes (DAMOS)[6]. It makes DAMON an engine for general data access pattern-oriented memory management. Using this, clients can ask DAMON to find memory regions with a specific data access pattern and apply some memory management action (e.g., page out, move to head of the LRU list, use huge page, ...). We call such a request a 'scheme'.

Proactive Reclamation on top of DAMON/DAMOS
-------------------------------------------

Therefore, by using DAMON for the cold pages detection, the proactive reclamation's monitoring overhead issue can be solved.
Actually, we previously implemented a version of proactive reclamation using DAMOS and achieved noticeable improvements with our evaluation setup[5]. Nevertheless, it is more of a proof of concept than something for production use. It supports only virtual address spaces of processes, and requires additional tuning effort for the given workloads and hardware. For the tuning, we introduced a simple auto-tuning user space tool[8]. Google is also known to use a similar ML-based approach for their fleets[2]. But making it just work with intuitive knobs in the kernel would be helpful for general users.

To this end, this patchset improves DAMOS to be ready for such production usages, and implements another version of the proactive reclamation, namely DAMON_RECLAIM, on top of it.

DAMOS Improvements: Aggressiveness Control, Prioritization, and Watermarks
--------------------------------------------------------------------------

First of all, the current version of DAMOS supports only virtual address spaces. This patchset makes it support the physical address space for the page out action.

The next major problem of the current version of DAMOS is the lack of aggressiveness control, which can result in arbitrary overhead. For example, if huge memory regions having the data access pattern of interest are found, applying the requested action to all of the regions could incur significant overhead. It can be controlled by tuning the target data access pattern with manual or automated approaches[2,7]. But some people would prefer the kernel to just work with only intuitive tuning or default values.

For such cases, this patchset implements a safeguard, namely the time/size quota. Using this, clients can specify the maximum time that can be spent applying the action, and/or the maximum size of memory regions to which the action can be applied, within a user-specified time duration. A followup question is, to which memory regions should the action be applied within the limits?
We implement a simple regions prioritization mechanism for each action and
make DAMOS apply the action to high priority regions first.  It also allows
clients to tune the prioritization mechanism to use different weights for
size, access frequency, and age of memory regions.  This means we could use
not only LRU but also LFU or some fancy algorithms like CAR[9] with
lightweight overhead.

Though DAMON is lightweight, someone would want to remove even the cold pages
monitoring overhead when it is unnecessary.  Currently, it should be manually
turned on and off by clients, but some clients would simply want to turn it on
and off based on some metrics like free memory ratio or memory fragmentation.
For such cases, this patchset implements a watermarks-based automatic
activation feature.  It allows the clients to configure the metric of their
interest, and three watermarks of the metric.  If the metric is higher than
the high watermark or lower than the low watermark, the scheme is deactivated.
If the metric is lower than the mid watermark but higher than the low
watermark, the scheme is activated.

DAMON-based Reclaim
-------------------

Using the improved version of DAMOS, this patchset implements a static kernel
module called 'damon_reclaim'.  It finds memory regions that have not been
accessed for a specific time duration and pages them out.  Consuming too much
CPU for the paging out operations, or doing pageout too frequently, can be
critical for systems configuring their swap devices with software-defined
in-memory block devices like zram/zswap or total number of writes limited
devices like SSDs, respectively.  To avoid the problems, the time/size quotas
can be configured.  Under the quotas, it pages out memory regions that have
not been accessed for longer first.
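
The watermarks rule described above can be illustrated with a minimal
user-space sketch.  This is not the code of this patchset: the struct and
function names are hypothetical, and keeping the previous state while the
metric sits between the mid and the high watermarks is an assumption, since
the description above does not say what happens in that range.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the three watermarks of one metric. */
struct wmarks_sketch {
	unsigned long high;
	unsigned long mid;
	unsigned long low;
};

/*
 * Decide whether the scheme should be active for the given metric value.
 * 'was_active' is the current state; it is kept when the metric is between
 * the mid and the high watermarks (an assumption of this sketch).
 */
static bool wmarks_scheme_active(const struct wmarks_sketch *w,
				 unsigned long metric, bool was_active)
{
	assert(w->low <= w->mid && w->mid <= w->high);

	/* Above high or below low: deactivate. */
	if (metric > w->high || metric < w->low)
		return false;
	/* Below mid (and not below low): activate. */
	if (metric < w->mid)
		return true;
	/* Between mid and high: no change. */
	return was_active;
}
```

With watermarks of, say, 500/400/200 for a free memory ratio metric, a value
of 300 activates the scheme, while 600 or 100 deactivates it.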
Also, to remove the monitoring overhead under peaceful situations, and to fall
back to the LRU-list based page granularity reclamation when it doesn't make
progress, the three watermarks-based activation mechanism is used, with the
free memory ratio as the watermark metric.

For convenient configuration, it provides several module parameters.  Using
these, sysadmins can enable/disable it, and tune its parameters including the
coldness identification time threshold, the time/size quotas and the three
watermarks.

Evaluation
==========

In short, DAMON_RECLAIM with 50ms/s time quota and regions prioritization on
v5.15-rc5 Linux kernel with ZRAM swap device achieves 38.58% memory saving with
only 1.94% runtime overhead.  For this, DAMON_RECLAIM consumes only 4.97% of
single CPU time.

Setup
-----

We evaluate DAMON_RECLAIM to show how each of the DAMOS improvements takes
effect.  For this, we measure DAMON_RECLAIM's CPU consumption, entire system
memory footprint, total number of major page faults, and runtime of 24
realistic workloads in PARSEC3 and SPLASH-2X benchmark suites on my QEMU/KVM
based virtual machine.  The virtual machine runs on an i3.metal AWS instance,
has 130GiB memory, and runs a Linux kernel built on the latest -mm tree[1] plus
this patchset.  It also utilizes a 4 GiB ZRAM swap device.  We repeat the
measurement 5 times and use averages.

[1] https://github.com/hnaz/linux-mm/tree/v5.15-rc5-mmots-2021-10-13-19-55

Detailed Results
----------------

The results are summarized in the table below.

With a coldness identification threshold of 5 seconds, DAMON_RECLAIM without
the time quota-based speed limit achieves 47.21% memory saving, but incurs a
4.59% runtime slowdown to the workloads on average.  For this, DAMON_RECLAIM
consumes about 11.28% of single CPU time.

Applying time quotas of 200ms/s, 50ms/s, and 10ms/s without the regions
prioritization reduces the slowdown to 4.89%, 2.65%, and 1.5%, respectively.
Time quota of 200ms/s (20%) makes no real change compared to the quota
unapplied version, because the quota unapplied version consumes only 11.28% CPU
time.  DAMON_RECLAIM's CPU utilization is also similarly reduced: 11.24%,
5.51%, and 2.01% of single CPU time.  That is, the overhead is proportional to
the speed limit.  Nevertheless, it also reduces the memory saving because it
becomes less aggressive.  In detail, the three variants show 48.76%, 37.83%,
and 7.85% memory saving, respectively.

Applying the regions prioritization (page out regions that have not been
accessed for longer first within the time quota) further reduces the
performance degradation.  Runtime slowdowns and total number of major page
fault increases have been 4.89%/218,690% -> 4.39%/166,136% (200ms/s),
2.65%/111,886% -> 1.94%/59,053% (50ms/s), and 1.5%/34,973.40% ->
2.08%/8,781.75% (10ms/s).  The runtime under the 10ms/s time quota has
increased with prioritization, but apparently that's under the margin of
error.

    time quota   prioritization   memory_saving   cpu_util   slowdown   pgmajfaults overhead
    N            N                47.21%          11.28%     4.59%      194,802%
    200ms/s      N                48.76%          11.24%     4.89%      218,690%
    50ms/s       N                37.83%          5.51%      2.65%      111,886%
    10ms/s       N                7.85%           2.01%      1.5%       34,793.40%
    200ms/s      Y                50.08%          10.38%     4.39%      166,136%
    50ms/s       Y                38.58%          4.97%      1.94%      59,053%
    10ms/s       Y                3.63%           1.73%      2.08%      8,781.75%

Baseline and Complete Git Trees
===============================

The patches are based on the latest -mm tree
(v5.15-rc5-mmots-2021-10-13-19-55).  You can also clone the complete git tree
from:

    $ git clone git://github.com/sjp38/linux -b damon_reclaim/patches/v1

The web view is also available:
https://git.kernel.org/pub/scm/linux/kernel/git/sj/linux.git/tag/?h=damon_reclaim/patches/v1

Sequence Of Patches
===================

The first patch makes DAMOS support the physical address space for the page out
action.  Following five patches (patches 2-6) implement the time/size quotas.
Next four patches (patches 7-10) implement the memory regions prioritization
within the limit.
Then, three following patches (patches 11-13) implement the watermarks-based schemes activation. Finally, the last two patches (patches 14-15) implement and document the DAMON-based reclamation using the advanced DAMOS. [1] https://www.kernel.org/doc/html/v5.15-rc1/vm/damon/index.html [2] https://research.google/pubs/pub48551/ [3] https://lwn.net/Articles/787611/ [4] https://damonitor.github.io [5] https://damonitor.github.io/doc/html/latest/vm/damon/eval.html [6] https://lore.kernel.org/linux-mm/20211001125604.29660-1-sj@kernel.org/ [7] https://github.com/awslabs/damoos [8] https://www.kernel.org/doc/html/latest/vm/free_page_reporting.html [9] https://www.usenix.org/conference/fast-04/car-clock-adaptive-replacement This patch (of 15): This commit makes the DAMON primitives for physical address space support the pageout action for DAMON-based Operation Schemes. With this commit, hence, users can easily implement system-level data access-aware reclamations using DAMOS. [sj@kernel.org: fix missing-prototype build warning] Link: https://lkml.kernel.org/r/20211025064220.13904-1-sj@kernel.org Link: https://lkml.kernel.org/r/20211019150731.16699-1-sj@kernel.org Link: https://lkml.kernel.org/r/20211019150731.16699-2-sj@kernel.org Signed-off-by: SeongJae Park Cc: Jonathan Cameron Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: Jonathan Corbet Cc: David Hildenbrand Cc: David Woodhouse Cc: Marco Elver Cc: Leonard Foerster Cc: Greg Thelen Cc: Markus Boehme Cc: David Rientjes Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- include/linux/damon.h | 2 ++ mm/damon/paddr.c | 37 ++++++++++++++++++++++++++++++++++++- 2 files changed, 38 insertions(+), 1 deletion(-) --- a/include/linux/damon.h~mm-damon-paddr-support-the-pageout-scheme +++ a/include/linux/damon.h @@ -357,6 +357,8 @@ void damon_va_set_primitives(struct damo void damon_pa_prepare_access_checks(struct damon_ctx *ctx); unsigned int damon_pa_check_accesses(struct damon_ctx *ctx); bool 
damon_pa_target_valid(void *t); +int damon_pa_apply_scheme(struct damon_ctx *context, struct damon_target *t, + struct damon_region *r, struct damos *scheme); void damon_pa_set_primitives(struct damon_ctx *ctx); #endif /* CONFIG_DAMON_PADDR */ --- a/mm/damon/paddr.c~mm-damon-paddr-support-the-pageout-scheme +++ a/mm/damon/paddr.c @@ -11,7 +11,9 @@ #include #include #include +#include +#include "../internal.h" #include "prmtv-common.h" static bool __damon_pa_mkold(struct page *page, struct vm_area_struct *vma, @@ -211,6 +213,39 @@ bool damon_pa_target_valid(void *t) return true; } +int damon_pa_apply_scheme(struct damon_ctx *ctx, struct damon_target *t, + struct damon_region *r, struct damos *scheme) +{ + unsigned long addr; + LIST_HEAD(page_list); + + if (scheme->action != DAMOS_PAGEOUT) + return -EINVAL; + + for (addr = r->ar.start; addr < r->ar.end; addr += PAGE_SIZE) { + struct page *page = damon_get_page(PHYS_PFN(addr)); + + if (!page) + continue; + + ClearPageReferenced(page); + test_and_clear_page_young(page); + if (isolate_lru_page(page)) { + put_page(page); + continue; + } + if (PageUnevictable(page)) { + putback_lru_page(page); + } else { + list_add(&page->lru, &page_list); + put_page(page); + } + } + reclaim_pages(&page_list); + cond_resched(); + return 0; +} + void damon_pa_set_primitives(struct damon_ctx *ctx) { ctx->primitive.init = NULL; @@ -220,5 +255,5 @@ void damon_pa_set_primitives(struct damo ctx->primitive.reset_aggregated = NULL; ctx->primitive.target_valid = damon_pa_target_valid; ctx->primitive.cleanup = NULL; - ctx->primitive.apply_scheme = NULL; + ctx->primitive.apply_scheme = damon_pa_apply_scheme; } From patchwork Fri Nov 5 20:47:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605877 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org 
(mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 132A1C433FE for ; Fri, 5 Nov 2021 20:47:20 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B922261059 for ; Fri, 5 Nov 2021 20:47:19 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org B922261059 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 5B2BB9400E8; Fri, 5 Nov 2021 16:47:19 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 562BB9400C1; Fri, 5 Nov 2021 16:47:19 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 451A99400E8; Fri, 5 Nov 2021 16:47:19 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0222.hostedemail.com [216.40.44.222]) by kanga.kvack.org (Postfix) with ESMTP id 31FD39400C1 for ; Fri, 5 Nov 2021 16:47:19 -0400 (EDT) Received: from smtpin23.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id EEF9B77994 for ; Fri, 5 Nov 2021 20:47:18 +0000 (UTC) X-FDA: 78776061756.23.7413121 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf23.hostedemail.com (Postfix) with ESMTP id D5CD49000386 for ; Fri, 5 Nov 2021 20:47:05 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 4CBE860240; Fri, 5 Nov 2021 20:47:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145237; bh=0/VO9mZme1WGl9HFW/lP6xGAQOzN6ikx7PmGkXLrNoQ=; h=Date:From:To:Subject:In-Reply-To:From; b=IIrfZGBMDil1oT+8zAAB97rQYu8fCnmum5UPNqil6ZW4wvwH2kUBIHCVFbKfGsPvk qLfYAHdIs+h24gmlj5CiM5xT0IoRZRujnG3c8niJ5PydHR6MqfT/zRs5iWaEJIgTKW 4PnLrUmIO10axJ3w/mU17UmN30TU3P6tEc3FGUNI= Date: Fri, 05 Nov 2021 
13:47:16 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 240/262] mm/damon/schemes: implement size quota for schemes application speed control Message-ID: <20211105204716.KEDy5p9T4%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=IIrfZGBM; dmarc=none; spf=pass (imf23.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: D5CD49000386 X-Stat-Signature: ap4dpezsd3ihb7o8ze1n9aizb6a9brz1 X-HE-Tag: 1636145225-672357 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/schemes: implement size quota for schemes application speed control There could be arbitrarily large memory regions fulfilling the target data access pattern of a DAMON-based operation scheme. In the case, applying the action of the scheme could incur too high overhead. To provide an intuitive way for avoiding it, this commit implements a feature called size quota. If the quota is set, DAMON tries to apply the action only up to the given amount of memory regions within a given time window. 
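
As an illustration only, the size quota and its charge window can be modeled
in user space as below.  This is a minimal sketch under stated assumptions,
not the code of this commit: the names are hypothetical, time is a plain
millisecond counter instead of jiffies, and the alignment of the remaining
quota to DAMON_MIN_REGION that the real code performs is left out.

```c
#include <assert.h>

/* Hypothetical user-space model of a size quota. */
struct quota_sketch {
	unsigned long sz;		/* max bytes per window; 0 means unset */
	unsigned long reset_interval;	/* window length in milliseconds */
	unsigned long charged_sz;	/* bytes charged in the current window */
	unsigned long charged_from;	/* start time of the current window */
};

/* Start a new charge window once reset_interval has passed. */
static void quota_refresh(struct quota_sketch *q, unsigned long now_ms)
{
	if (!q->sz)
		return;
	if (now_ms - q->charged_from >= q->reset_interval) {
		q->charged_from = now_ms;
		q->charged_sz = 0;
	}
}

/*
 * Return how many of the 'want' bytes the action may be applied to now,
 * and charge that amount to the current window.
 */
static unsigned long quota_charge(struct quota_sketch *q, unsigned long want)
{
	unsigned long left;

	if (!q->sz)
		return want;	/* no quota set: nothing to limit */

	assert(q->charged_sz <= q->sz);
	left = q->sz - q->charged_sz;
	if (want > left)
		want = left;
	q->charged_sz += want;
	return want;
}
```

A caller would invoke quota_refresh() once per pass over the regions and
quota_charge() once per region that fulfills the target access pattern,
applying the action only to the returned number of bytes.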
Link: https://lkml.kernel.org/r/20211019150731.16699-3-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: David Hildenbrand Cc: David Rientjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- include/linux/damon.h | 36 +++++++++++++++++++++--- mm/damon/core.c | 60 ++++++++++++++++++++++++++++++++++------ mm/damon/dbgfs.c | 4 ++ 3 files changed, 87 insertions(+), 13 deletions(-) --- a/include/linux/damon.h~mm-damon-schemes-implement-size-quota-for-schemes-application-speed-control +++ a/include/linux/damon.h @@ -90,6 +90,26 @@ enum damos_action { }; /** + * struct damos_quota - Controls the aggressiveness of the given scheme. + * @sz: Maximum bytes of memory that the action can be applied. + * @reset_interval: Charge reset interval in milliseconds. + * + * To avoid consuming too much CPU time or IO resources for applying the + * &struct damos->action to large memory, DAMON allows users to set a size + * quota. The quota can be set by writing non-zero values to &sz. If the size + * quota is set, DAMON tries to apply the action only up to &sz bytes within + * &reset_interval. + */ +struct damos_quota { + unsigned long sz; + unsigned long reset_interval; + +/* private: For charging the quota */ + unsigned long charged_sz; + unsigned long charged_from; +}; + +/** * struct damos - Represents a Data Access Monitoring-based Operation Scheme. * @min_sz_region: Minimum size of target regions. * @max_sz_region: Maximum size of target regions. @@ -98,13 +118,20 @@ enum damos_action { * @min_age_region: Minimum age of target regions. * @max_age_region: Maximum age of target regions. * @action: &damo_action to be applied to the target regions. + * @quota: Control the aggressiveness of this scheme. * @stat_count: Total number of regions that this scheme is applied. 
* @stat_sz: Total size of regions that this scheme is applied. * @list: List head for siblings. * - * For each aggregation interval, DAMON applies @action to monitoring target - * regions fit in the condition and updates the statistics. Note that both - * the minimums and the maximums are inclusive. + * For each aggregation interval, DAMON finds regions which fit in the + * condition (&min_sz_region, &max_sz_region, &min_nr_accesses, + * &max_nr_accesses, &min_age_region, &max_age_region) and applies &action to + * those. To avoid consuming too much CPU time or IO resources for the + * &action, &quota is used. + * + * After applying the &action to each region, &stat_count and &stat_sz is + * updated to reflect the number of regions and total size of regions that the + * &action is applied. */ struct damos { unsigned long min_sz_region; @@ -114,6 +141,7 @@ struct damos { unsigned int min_age_region; unsigned int max_age_region; enum damos_action action; + struct damos_quota quota; unsigned long stat_count; unsigned long stat_sz; struct list_head list; @@ -310,7 +338,7 @@ struct damos *damon_new_scheme( unsigned long min_sz_region, unsigned long max_sz_region, unsigned int min_nr_accesses, unsigned int max_nr_accesses, unsigned int min_age_region, unsigned int max_age_region, - enum damos_action action); + enum damos_action action, struct damos_quota *quota); void damon_add_scheme(struct damon_ctx *ctx, struct damos *s); void damon_destroy_scheme(struct damos *s); --- a/mm/damon/core.c~mm-damon-schemes-implement-size-quota-for-schemes-application-speed-control +++ a/mm/damon/core.c @@ -89,7 +89,7 @@ struct damos *damon_new_scheme( unsigned long min_sz_region, unsigned long max_sz_region, unsigned int min_nr_accesses, unsigned int max_nr_accesses, unsigned int min_age_region, unsigned int max_age_region, - enum damos_action action) + enum damos_action action, struct damos_quota *quota) { struct damos *scheme; @@ -107,6 +107,11 @@ struct damos *damon_new_scheme(
scheme->stat_sz = 0; INIT_LIST_HEAD(&scheme->list); + scheme->quota.sz = quota->sz; + scheme->quota.reset_interval = quota->reset_interval; + scheme->quota.charged_sz = 0; + scheme->quota.charged_from = 0; + return scheme; } @@ -530,15 +535,25 @@ static void kdamond_reset_aggregated(str } } +static void damon_split_region_at(struct damon_ctx *ctx, + struct damon_target *t, struct damon_region *r, + unsigned long sz_r); + static void damon_do_apply_schemes(struct damon_ctx *c, struct damon_target *t, struct damon_region *r) { struct damos *s; - unsigned long sz; damon_for_each_scheme(s, c) { - sz = r->ar.end - r->ar.start; + struct damos_quota *quota = &s->quota; + unsigned long sz = r->ar.end - r->ar.start; + + /* Check the quota */ + if (quota->sz && quota->charged_sz >= quota->sz) + continue; + + /* Check the target regions condition */ if (sz < s->min_sz_region || s->max_sz_region < sz) continue; if (r->nr_accesses < s->min_nr_accesses || @@ -546,22 +561,51 @@ static void damon_do_apply_schemes(struc continue; if (r->age < s->min_age_region || s->max_age_region < r->age) continue; - s->stat_count++; - s->stat_sz += sz; - if (c->primitive.apply_scheme) + + /* Apply the scheme */ + if (c->primitive.apply_scheme) { + if (quota->sz && quota->charged_sz + sz > quota->sz) { + sz = ALIGN_DOWN(quota->sz - quota->charged_sz, + DAMON_MIN_REGION); + if (!sz) + goto update_stat; + damon_split_region_at(c, t, r, sz); + } c->primitive.apply_scheme(c, t, r, s); + quota->charged_sz += sz; + } if (s->action != DAMOS_STAT) r->age = 0; + +update_stat: + s->stat_count++; + s->stat_sz += sz; } } static void kdamond_apply_schemes(struct damon_ctx *c) { struct damon_target *t; - struct damon_region *r; + struct damon_region *r, *next_r; + struct damos *s; + + damon_for_each_scheme(s, c) { + struct damos_quota *quota = &s->quota; + + if (!quota->sz) + continue; + + /* New charge window starts */ + if (time_after_eq(jiffies, quota->charged_from + + msecs_to_jiffies( + 
quota->reset_interval))) { + quota->charged_from = jiffies; + quota->charged_sz = 0; + } + } damon_for_each_target(t, c) { - damon_for_each_region(r, t) + damon_for_each_region_safe(r, next_r, t) damon_do_apply_schemes(c, t, r); } } --- a/mm/damon/dbgfs.c~mm-damon-schemes-implement-size-quota-for-schemes-application-speed-control +++ a/mm/damon/dbgfs.c @@ -188,6 +188,8 @@ static struct damos **str_to_schemes(con *nr_schemes = 0; while (pos < len && *nr_schemes < max_nr_schemes) { + struct damos_quota quota = {}; + ret = sscanf(&str[pos], "%lu %lu %u %u %u %u %u%n", &min_sz, &max_sz, &min_nr_a, &max_nr_a, &min_age, &max_age, &action, &parsed); @@ -200,7 +202,7 @@ static struct damos **str_to_schemes(con pos += parsed; scheme = damon_new_scheme(min_sz, max_sz, min_nr_a, max_nr_a, - min_age, max_age, action); + min_age, max_age, action, &quota); if (!scheme) goto fail; From patchwork Fri Nov 5 20:47:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605879 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AA0E5C433EF for ; Fri, 5 Nov 2021 20:47:23 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 5CB9060E94 for ; Fri, 5 Nov 2021 20:47:23 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 5CB9060E94 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id DF22E9400E9; Fri, 5 Nov 2021 16:47:22 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id DA0539400C1; Fri, 5 Nov 2021 16:47:22 -0400 (EDT) X-Delivered-To:
int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C41F89400E9; Fri, 5 Nov 2021 16:47:22 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0226.hostedemail.com [216.40.44.226]) by kanga.kvack.org (Postfix) with ESMTP id B45799400C1 for ; Fri, 5 Nov 2021 16:47:22 -0400 (EDT) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 6F18D184B4AFA for ; Fri, 5 Nov 2021 20:47:22 +0000 (UTC) X-FDA: 78776061924.11.DEB702D Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf30.hostedemail.com (Postfix) with ESMTP id D5109E001995 for ; Fri, 5 Nov 2021 20:47:04 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id B53CB60174; Fri, 5 Nov 2021 20:47:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145241; bh=dz8aWE2IgrDbRc52OFX/xMjZBZws4fLOuSCZThcNkDg=; h=Date:From:To:Subject:In-Reply-To:From; b=o7ppkFP80dFgJVgKL0qg2B0G04Vhtg80KuLGbQdln8xqk5FFS3KFlFUP4ZA1Grws4 hFkEHGfYx3xbReHvfBvIvU1p9xlYk16m5XNC5554CW4QTPc6z40vJWXqSQU0Q4wUoV PCLc65XKpfhKEfddwcO+19SEHDabKZIhATXPwdsQ= Date: Fri, 05 Nov 2021 13:47:20 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 241/262] mm/damon/schemes: skip already charged targets and regions Message-ID: <20211105204720.97KHLUzyC%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: D5109E001995 X-Stat-Signature: 
988wi1f5i8oux91f94urtgr19jmw34m3 Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=o7ppkFP8; dmarc=none; spf=pass (imf30.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636145224-202034 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/schemes: skip already charged targets and regions If DAMOS has stopped applying the action in the middle of a group of memory regions due to its size quota, it starts the work again from the beginning of the address space in the next charge window. If there is a huge memory region at the beginning of the address space and it always fulfills the scheme's target data access pattern, the action will be applied only to that region. This commit mitigates the case by skipping, at the beginning of the next charge window, the memory regions that were charged in the current charge window.
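
The idea of resuming after the already charged regions can be illustrated
with a simplified user-space model.  This is only a sketch under stated
assumptions, not the code of this commit: it handles a single target whose
regions are sorted and non-overlapping, every region is assumed to fulfill
the access pattern, the names are hypothetical, and it counts bytes instead
of splitting regions and applying a real action.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical region covering the address range [start, end). */
struct region_sketch {
	unsigned long start;
	unsigned long end;
};

/*
 * Model one charge window: "apply" the action to at most 'quota' bytes,
 * starting after the part handled in earlier windows.  '*resume_addr' is
 * the address to continue from; it is reset to 0 once the end of the
 * address space has been reached.  Returns the number of bytes handled.
 */
static unsigned long apply_window(const struct region_sketch *r, size_t nr,
				  unsigned long quota,
				  unsigned long *resume_addr)
{
	unsigned long charged = 0;
	size_t i;

	for (i = 0; i < nr && charged < quota; i++) {
		unsigned long start = r[i].start;
		unsigned long sz;

		assert(r[i].start < r[i].end);
		/* Fully handled in an earlier window: skip it. */
		if (r[i].end <= *resume_addr)
			continue;
		/* Partly handled: skip only the already charged head. */
		if (start < *resume_addr)
			start = *resume_addr;
		sz = r[i].end - start;
		if (sz > quota - charged)
			sz = quota - charged;
		charged += sz;
		*resume_addr = start + sz;
	}
	/* Whole address space covered: start over in the next window. */
	if (i == nr && (nr == 0 || *resume_addr >= r[nr - 1].end))
		*resume_addr = 0;
	return charged;
}
```

Without the resume address, each window would spend its whole quota on the
first region again; with it, consecutive windows walk through the address
space.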
Link: https://lkml.kernel.org/r/20211019150731.16699-4-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: David Hildenbrand Cc: David Rientjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- include/linux/damon.h | 5 +++++ mm/damon/core.c | 37 +++++++++++++++++++++++++++++++++++++ 2 files changed, 42 insertions(+) --- a/include/linux/damon.h~mm-damon-schemes-skip-already-charged-targets-and-regions +++ a/include/linux/damon.h @@ -107,6 +107,8 @@ struct damos_quota { /* private: For charging the quota */ unsigned long charged_sz; unsigned long charged_from; + struct damon_target *charge_target_from; + unsigned long charge_addr_from; }; /** @@ -307,6 +309,9 @@ struct damon_ctx { #define damon_prev_region(r) \ (container_of(r->list.prev, struct damon_region, list)) +#define damon_last_region(t) \ + (list_last_entry(&t->regions_list, struct damon_region, list)) + #define damon_for_each_region(r, t) \ list_for_each_entry(r, &t->regions_list, list) --- a/mm/damon/core.c~mm-damon-schemes-skip-already-charged-targets-and-regions +++ a/mm/damon/core.c @@ -111,6 +111,8 @@ struct damos *damon_new_scheme( scheme->quota.reset_interval = quota->reset_interval; scheme->quota.charged_sz = 0; scheme->quota.charged_from = 0; + scheme->quota.charge_target_from = NULL; + scheme->quota.charge_addr_from = 0; return scheme; } @@ -553,6 +555,37 @@ static void damon_do_apply_schemes(struc if (quota->sz && quota->charged_sz >= quota->sz) continue; + /* Skip previously charged regions */ + if (quota->charge_target_from) { + if (t != quota->charge_target_from) + continue; + if (r == damon_last_region(t)) { + quota->charge_target_from = NULL; + quota->charge_addr_from = 0; + continue; + } + if (quota->charge_addr_from && + r->ar.end <= quota->charge_addr_from) + continue; + + if (quota->charge_addr_from && 
r->ar.start < + quota->charge_addr_from) { + sz = ALIGN_DOWN(quota->charge_addr_from - + r->ar.start, DAMON_MIN_REGION); + if (!sz) { + if (r->ar.end - r->ar.start <= + DAMON_MIN_REGION) + continue; + sz = DAMON_MIN_REGION; + } + damon_split_region_at(c, t, r, sz); + r = damon_next_region(r); + sz = r->ar.end - r->ar.start; + } + quota->charge_target_from = NULL; + quota->charge_addr_from = 0; + } + /* Check the target regions condition */ if (sz < s->min_sz_region || s->max_sz_region < sz) continue; @@ -573,6 +606,10 @@ static void damon_do_apply_schemes(struc } c->primitive.apply_scheme(c, t, r, s); quota->charged_sz += sz; + if (quota->sz && quota->charged_sz >= quota->sz) { + quota->charge_target_from = t; + quota->charge_addr_from = r->ar.end + 1; + } } if (s->action != DAMOS_STAT) r->age = 0; From patchwork Fri Nov 5 20:47:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605883 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8CD10C433FE for ; Fri, 5 Nov 2021 20:47:28 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 429FE6056B for ; Fri, 5 Nov 2021 20:47:28 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 429FE6056B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id DCA0D9400C1; Fri, 5 Nov 2021 16:47:26 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D79279400EA; Fri, 5 Nov 2021 16:47:26 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 
C679F9400C1; Fri, 5 Nov 2021 16:47:26 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 91CA09400EA for ; Fri, 5 Nov 2021 16:47:26 -0400 (EDT) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 500208249980 for ; Fri, 5 Nov 2021 20:47:26 +0000 (UTC) X-FDA: 78776062092.24.067265E Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf12.hostedemail.com (Postfix) with ESMTP id 66C5710000B0 for ; Fri, 5 Nov 2021 20:47:25 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 29BF860FBF; Fri, 5 Nov 2021 20:47:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145244; bh=43n69Oi6XLnLLtBTXtXAlvj3k0wVNQ3MKEnvr51CoPU=; h=Date:From:To:Subject:In-Reply-To:From; b=hAOqPnXdWocEmuVs2Tvx5v7n8wRC16qoHDr2pMnV9nreZAWBB77awrKRjrHFAsT9m plCNuWpTsGTV2amZ2J5bWM9R0s629Hw6ytljkXcdP/+Bw+MWbfjOFRaKqe5v+9pQ3F KvgjqNeOuuyEjIUVl69WliL6UiX2dBuxckO7qYNI= Date: Fri, 05 Nov 2021 13:47:23 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 242/262] mm/damon/schemes: implement time quota Message-ID: <20211105204723.yxHbbIXWq%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 66C5710000B0 X-Stat-Signature: 8ho1cykh6cxw41tp48jop37ywh3x7hyh Authentication-Results: imf12.hostedemail.com; dkim=pass header.d=linux-foundation.org 
header.s=korg header.b=hAOqPnXd; spf=pass (imf12.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145245-628525 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/schemes: implement time quota The size quota feature of DAMOS is useful for IO resource-critical systems, but not so intuitive for CPU time-critical systems. Systems using zram or zswap-like swap devices would be examples. To provide another intuitive way for such systems, this commit implements a time-based quota for DAMON-based Operation Schemes. If the quota is set, DAMOS tries to use only up to the user-defined quota of CPU time within a given time window. Link: https://lkml.kernel.org/r/20211019150731.16699-5-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: David Hildenbrand Cc: David Rientjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- include/linux/damon.h | 25 +++++++++++++++++----- mm/damon/core.c | 45 +++++++++++++++++++++++++++++++++++----- 2 files changed, 60 insertions(+), 10 deletions(-) --- a/include/linux/damon.h~mm-damon-schemes-implement-time-quota +++ a/include/linux/damon.h @@ -91,20 +91,35 @@ enum damos_action { /** * struct damos_quota - Controls the aggressiveness of the given scheme. + * @ms: Maximum milliseconds that the scheme can use. * @sz: Maximum bytes of memory that the action can be applied. * @reset_interval: Charge reset interval in milliseconds. * * To avoid consuming too much CPU time or IO resources for applying the - * &struct damos->action to large memory, DAMON allows users to set a size - * quota.
The quota can be set by writing non-zero values to &sz. If the size - * quota is set, DAMON tries to apply the action only up to &sz bytes within - * &reset_interval. + * &struct damos->action to large memory, DAMON allows users to set time and/or + * size quotas. The quotas can be set by writing non-zero values to &ms and + * &sz, respectively. If the time quota is set, DAMON tries to use only up to + * &ms milliseconds within &reset_interval for applying the action. If the + * size quota is set, DAMON tries to apply the action only up to &sz bytes + * within &reset_interval. + * + * Internally, the time quota is transformed to a size quota using estimated + * throughput of the scheme's action. DAMON then compares it against &sz and + * uses smaller one as the effective quota. */ struct damos_quota { + unsigned long ms; unsigned long sz; unsigned long reset_interval; -/* private: For charging the quota */ +/* private: */ + /* For throughput estimation */ + unsigned long total_charged_sz; + unsigned long total_charged_ns; + + unsigned long esz; /* Effective size quota in bytes */ + + /* For charging the quota */ unsigned long charged_sz; unsigned long charged_from; struct damon_target *charge_target_from; --- a/mm/damon/core.c~mm-damon-schemes-implement-time-quota +++ a/mm/damon/core.c @@ -107,8 +107,12 @@ struct damos *damon_new_scheme( scheme->stat_sz = 0; INIT_LIST_HEAD(&scheme->list); + scheme->quota.ms = quota->ms; scheme->quota.sz = quota->sz; scheme->quota.reset_interval = quota->reset_interval; + scheme->quota.total_charged_sz = 0; + scheme->quota.total_charged_ns = 0; + scheme->quota.esz = 0; scheme->quota.charged_sz = 0; scheme->quota.charged_from = 0; scheme->quota.charge_target_from = NULL; @@ -550,9 +554,10 @@ static void damon_do_apply_schemes(struc damon_for_each_scheme(s, c) { struct damos_quota *quota = &s->quota; unsigned long sz = r->ar.end - r->ar.start; + struct timespec64 begin, end; /* Check the quota */ - if (quota->sz && quota->charged_sz 
>= quota->sz) + if (quota->esz && quota->charged_sz >= quota->esz) continue; /* Skip previously charged regions */ @@ -597,16 +602,21 @@ static void damon_do_apply_schemes(struc /* Apply the scheme */ if (c->primitive.apply_scheme) { - if (quota->sz && quota->charged_sz + sz > quota->sz) { - sz = ALIGN_DOWN(quota->sz - quota->charged_sz, + if (quota->esz && + quota->charged_sz + sz > quota->esz) { + sz = ALIGN_DOWN(quota->esz - quota->charged_sz, DAMON_MIN_REGION); if (!sz) goto update_stat; damon_split_region_at(c, t, r, sz); } + ktime_get_coarse_ts64(&begin); c->primitive.apply_scheme(c, t, r, s); + ktime_get_coarse_ts64(&end); + quota->total_charged_ns += timespec64_to_ns(&end) - + timespec64_to_ns(&begin); quota->charged_sz += sz; - if (quota->sz && quota->charged_sz >= quota->sz) { + if (quota->esz && quota->charged_sz >= quota->esz) { quota->charge_target_from = t; quota->charge_addr_from = r->ar.end + 1; } @@ -620,6 +630,29 @@ update_stat: } } +/* Shouldn't be called if quota->ms and quota->sz are zero */ +static void damos_set_effective_quota(struct damos_quota *quota) +{ + unsigned long throughput; + unsigned long esz; + + if (!quota->ms) { + quota->esz = quota->sz; + return; + } + + if (quota->total_charged_ns) + throughput = quota->total_charged_sz * 1000000 / + quota->total_charged_ns; + else + throughput = PAGE_SIZE * 1024; + esz = throughput * quota->ms; + + if (quota->sz && quota->sz < esz) + esz = quota->sz; + quota->esz = esz; +} + static void kdamond_apply_schemes(struct damon_ctx *c) { struct damon_target *t; @@ -629,15 +662,17 @@ static void kdamond_apply_schemes(struct damon_for_each_scheme(s, c) { struct damos_quota *quota = &s->quota; - if (!quota->sz) + if (!quota->ms && !quota->sz) continue; /* New charge window starts */ if (time_after_eq(jiffies, quota->charged_from + msecs_to_jiffies( quota->reset_interval))) { + quota->total_charged_sz += quota->charged_sz; quota->charged_from = jiffies; quota->charged_sz = 0; + 
damos_set_effective_quota(quota); } } From patchwork Fri Nov 5 20:47:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605885 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C0101C433EF for ; Fri, 5 Nov 2021 20:47:30 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6F2FE6056B for ; Fri, 5 Nov 2021 20:47:30 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 6F2FE6056B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 9D2B99400EC; Fri, 5 Nov 2021 16:47:29 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 90D039400EA; Fri, 5 Nov 2021 16:47:29 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 786A49400EC; Fri, 5 Nov 2021 16:47:29 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0186.hostedemail.com [216.40.44.186]) by kanga.kvack.org (Postfix) with ESMTP id 5F6A29400EA for ; Fri, 5 Nov 2021 16:47:29 -0400 (EDT) Received: from smtpin01.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 2EFA577998 for ; Fri, 5 Nov 2021 20:47:29 +0000 (UTC) X-FDA: 78776062260.01.963078D Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf07.hostedemail.com (Postfix) with ESMTP id D7DFC10000AC for ; Fri, 5 Nov 2021 20:47:28 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 8B1E460240; Fri, 5 Nov 2021 20:47:27 +0000 (UTC) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145248; bh=8I7lkhEPQq4e+L/qSplU2QVURMjtUbuSOP1m2V3MBQM=; h=Date:From:To:Subject:In-Reply-To:From; b=MrY455RTkvX7y1uTm5GZaWVgaoyDuZX5lAFkg+qRVpztGPZdYwD3olOwEJ28yFj8E kzABJBqNJdO15/Sq95lGEQE+b+NprL445JSeJpQsYVtIqtuvhzxllahPINNLoejVEA G5N7gLry+IHfxYbP1e0hGeMtckzmfd+6viglu0Tc= Date: Fri, 05 Nov 2021 13:47:27 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 243/262] mm/damon/dbgfs: support quotas of schemes Message-ID: <20211105204727.-oOh-u_xL%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: D7DFC10000AC X-Stat-Signature: rgkrt4p6h19k55f1xkirtdpcs4npcx3e Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=MrY455RT; spf=pass (imf07.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145248-345391 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/dbgfs: support quotas of schemes This commit makes the debugfs interface of DAMON support the scheme quotas by changing the format of the input for the schemes file.
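As a rough user-space illustration of the new input format (not the kernel code itself), the sketch below parses one scheme line with the same ten-field sscanf() conversion string that this patch uses in str_to_schemes(). The struct scheme_line type and the parse_scheme_line() helper are hypothetical names introduced only for this example; the three trailing fields correspond to the time quota (ms), the size quota (sz) and the reset interval.

```c
#include <stdio.h>

/* Hypothetical user-space mirror of the fields one scheme line carries. */
struct scheme_line {
	unsigned long min_sz, max_sz;
	unsigned int min_nr_a, max_nr_a;
	unsigned int min_age, max_age;
	unsigned int action;
	/* The three fields this patch appends to the input format. */
	unsigned long quota_ms, quota_sz, quota_reset_interval;
};

/*
 * Parse one scheme from str.  Returns the number of characters consumed,
 * or -1 if fewer than ten fields could be converted.
 */
static int parse_scheme_line(const char *str, struct scheme_line *s)
{
	int parsed = 0;
	int ret;

	ret = sscanf(str, "%lu %lu %u %u %u %u %u %lu %lu %lu%n",
		     &s->min_sz, &s->max_sz, &s->min_nr_a, &s->max_nr_a,
		     &s->min_age, &s->max_age, &s->action,
		     &s->quota_ms, &s->quota_sz, &s->quota_reset_interval,
		     &parsed);
	if (ret != 10)
		return -1;
	return parsed;
}
```

With this sketch, a line in the old seven-field format such as "1 2 3 4 5 6 4" is rejected, while "1 2 3 4 5 6 4 0 0 0" is accepted with all three quota fields set to zero.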
Link: https://lkml.kernel.org/r/20211019150731.16699-6-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: David Hildenbrand Cc: David Rientjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- mm/damon/dbgfs.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) --- a/mm/damon/dbgfs.c~mm-damon-dbgfs-support-quotas-of-schemes +++ a/mm/damon/dbgfs.c @@ -105,11 +105,14 @@ static ssize_t sprint_schemes(struct dam damon_for_each_scheme(s, c) { rc = scnprintf(&buf[written], len - written, - "%lu %lu %u %u %u %u %d %lu %lu\n", + "%lu %lu %u %u %u %u %d %lu %lu %lu %lu %lu\n", s->min_sz_region, s->max_sz_region, s->min_nr_accesses, s->max_nr_accesses, s->min_age_region, s->max_age_region, - s->action, s->stat_count, s->stat_sz); + s->action, + s->quota.ms, s->quota.sz, + s->quota.reset_interval, + s->stat_count, s->stat_sz); if (!rc) return -ENOMEM; @@ -190,10 +193,11 @@ static struct damos **str_to_schemes(con while (pos < len && *nr_schemes < max_nr_schemes) { struct damos_quota quota = {}; - ret = sscanf(&str[pos], "%lu %lu %u %u %u %u %u%n", + ret = sscanf(&str[pos], "%lu %lu %u %u %u %u %u %lu %lu %lu%n", &min_sz, &max_sz, &min_nr_a, &max_nr_a, - &min_age, &max_age, &action, &parsed); - if (ret != 7) + &min_age, &max_age, &action, &quota.ms, + &quota.sz, &quota.reset_interval, &parsed); + if (ret != 10) break; if (!damos_action_valid(action)) { pr_err("wrong action %d\n", action); From patchwork Fri Nov 5 20:47:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605887 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix)
with ESMTP id 91C95C433EF for ; Fri, 5 Nov 2021 20:47:33 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4B13860174 for ; Fri, 5 Nov 2021 20:47:33 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 4B13860174 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id DF24F9400EA; Fri, 5 Nov 2021 16:47:32 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D2AC09400ED; Fri, 5 Nov 2021 16:47:32 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BA4469400EA; Fri, 5 Nov 2021 16:47:32 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0178.hostedemail.com [216.40.44.178]) by kanga.kvack.org (Postfix) with ESMTP id 9BF539400ED for ; Fri, 5 Nov 2021 16:47:32 -0400 (EDT) Received: from smtpin14.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 6B69376B5F for ; Fri, 5 Nov 2021 20:47:32 +0000 (UTC) X-FDA: 78776062344.14.674DBEC Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf30.hostedemail.com (Postfix) with ESMTP id EC8D1E001982 for ; Fri, 5 Nov 2021 20:47:14 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id E15AC60E94; Fri, 5 Nov 2021 20:47:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145251; bh=EsQBqjdXT/iiW3DxABnHMdk4BlisVjhosSxs4vcmSPE=; h=Date:From:To:Subject:In-Reply-To:From; b=imE8xzOWX46SGYXImPEu8rjyh4lp54umcFmlcFI147F20SnOEoi8pQcvIgqCbjKn2 HxD512PrUBypUc3N0/3cef/yQKP0nK5jgKuK4J6QjGLYLU1NWbibMRyZr+ZDFT0NCk B8UTVjwL+RaFXnmnIvBe+ck5Esr+aeB2cB5fAbDQ= Date: Fri, 05 Nov 2021 13:47:30 -0700 From: Andrew Morton To: akpm@linux-foundation.org, 
amit@kernel.org, benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 244/262] mm/damon/selftests: support schemes quotas Message-ID: <20211105204730.qXjD5vPnk%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=imE8xzOW; dmarc=none; spf=pass (imf30.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: EC8D1E001982 X-Stat-Signature: gc1d7ezgsdjx6krqxfsi8tgfboxgbss6 X-HE-Tag: 1636145234-557839 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/selftests: support schemes quotas This commit updates DAMON selftests to support updated schemes debugfs file format for the quotas. 
Link: https://lkml.kernel.org/r/20211019150731.16699-7-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: David Hildenbrand Cc: David Rientjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- tools/testing/selftests/damon/debugfs_attrs.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/tools/testing/selftests/damon/debugfs_attrs.sh~mm-damon-selftests-support-schemes-quotas +++ a/tools/testing/selftests/damon/debugfs_attrs.sh @@ -63,10 +63,10 @@ echo "$orig_content" > "$file" file="$DBGFS/schemes" orig_content=$(cat "$file") -test_write_succ "$file" "1 2 3 4 5 6 4" \ +test_write_succ "$file" "1 2 3 4 5 6 4 0 0 0" \ "$orig_content" "valid input" test_write_fail "$file" "1 2 -3 4 5 6 3" "$orig_content" "multi lines" +3 4 5 6 3 0 0 0" "$orig_content" "multi lines" test_write_succ "$file" "" "$orig_content" "disabling" echo "$orig_content" > "$file" From patchwork Fri Nov 5 20:47:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605889 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5ADF6C433FE for ; Fri, 5 Nov 2021 20:47:37 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 0C25B60240 for ; Fri, 5 Nov 2021 20:47:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 0C25B60240 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 9D2C89400EE; Fri, 
5 Nov 2021 16:47:36 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 981409400ED; Fri, 5 Nov 2021 16:47:36 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 84A8E9400EE; Fri, 5 Nov 2021 16:47:36 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0207.hostedemail.com [216.40.44.207]) by kanga.kvack.org (Postfix) with ESMTP id 7094E9400ED for ; Fri, 5 Nov 2021 16:47:36 -0400 (EDT) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 0A09875807 for ; Fri, 5 Nov 2021 20:47:36 +0000 (UTC) X-FDA: 78776062512.18.EB9EF81 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf09.hostedemail.com (Postfix) with ESMTP id 7D4F73000111 for ; Fri, 5 Nov 2021 20:47:35 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 4FFD76120A; Fri, 5 Nov 2021 20:47:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145254; bh=iVWClm8D244JDc83yqWW7m3SpEsJsx3nFSCz2fKkO3Y=; h=Date:From:To:Subject:In-Reply-To:From; b=hSQfgJGoY/CV2Keai59WdgKEqYQWS6GowQ6y7DFc2ZgFs9yPOOodG0/PO7qklqFWL bL26WRtzR8H1GApfHcDHhrOLD4hQ41o02dLkInSpGzrd0qO70yAb3SOFrUuKMHWwmh GJHGdAmBrUTTxkv3JORkYlUGCL2BhqjhnOTVam/c= Date: Fri, 05 Nov 2021 13:47:33 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 245/262] mm/damon/schemes: prioritize regions within the quotas Message-ID: <20211105204733.kCxAdY2b-%akpm@linux-foundation.org> In-Reply-To: 
<20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=hSQfgJGo; dmarc=none; spf=pass (imf09.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 7D4F73000111 X-Stat-Signature: pitk14x6n1nw3y3ieyrdbtx4u3p1uex1 X-HE-Tag: 1636145255-352081 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/schemes: prioritize regions within the quotas This commit makes DAMON apply schemes to regions having higher priority first, if it cannot apply schemes to all regions due to the quotas. The prioritization function should be implemented in the monitoring primitives. Those would commonly calculate the priority of the region using attributes of regions, namely 'size', 'nr_accesses', and 'age'. For example, some primitive would calculate the priority of each region using a weighted sum of 'nr_accesses' and 'age' of the region. The optimal weights would depend on given environments, so this commit makes those customizable. Nevertheless, the score calculation functions are only encouraged to respect the weights, not mandated.
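The selection of regions under a quota can be sketched in user space as follows. This is only an illustration modeled on the histogram loop this patch adds to kdamond_apply_schemes(): it sums region sizes per score, then walks down from the highest score until the accumulated size reaches the quota. The struct scored_region type and the pick_min_score() helper are hypothetical names, and the clamping of out-of-range scores is a defensive assumption that the patch itself does not make.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SCORE 99	/* same value as DAMOS_MAX_SCORE in the patch */

/* Hypothetical stand-in for a region that already passed the scheme's filters. */
struct scored_region {
	unsigned int score;	/* priority in [0, MAX_SCORE] */
	unsigned long sz;	/* region size in bytes */
};

/*
 * Return the lowest score that still has to be accepted so that the
 * accumulated size of accepted regions reaches quota_sz.  Returns 0 when
 * all regions together stay under the quota.
 */
static unsigned int pick_min_score(const struct scored_region *regions,
				   size_t nr_regions, unsigned long quota_sz)
{
	unsigned long histogram[MAX_SCORE + 1];
	unsigned long cumulated_sz = 0;
	unsigned int score, max_score = 0;
	size_t i;

	/* Fill up the score histogram: total bytes per score value. */
	memset(histogram, 0, sizeof(histogram));
	for (i = 0; i < nr_regions; i++) {
		score = regions[i].score;
		if (score > MAX_SCORE)
			score = MAX_SCORE;	/* assumption: clamp bad input */
		histogram[score] += regions[i].sz;
		if (score > max_score)
			max_score = score;
	}

	/* Walk from the highest score down until the quota is covered. */
	for (score = max_score; ; score--) {
		cumulated_sz += histogram[score];
		if (cumulated_sz >= quota_sz || !score)
			break;
	}
	return score;
}
```

For example, with regions of (score 90, 4096 bytes), (score 50, 8192 bytes) and (score 10, 4096 bytes), a quota of 8192 bytes gives a minimum score of 50, so the score-10 region is skipped in that charge window.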
Link: https://lkml.kernel.org/r/20211019150731.16699-8-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: David Hildenbrand Cc: David Rientjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- include/linux/damon.h | 26 ++++++++++++++++ mm/damon/core.c | 62 +++++++++++++++++++++++++++++++++++----- 2 files changed, 81 insertions(+), 7 deletions(-) --- a/include/linux/damon.h~mm-damon-schemes-prioritize-regions-within-the-quotas +++ a/include/linux/damon.h @@ -14,6 +14,8 @@ /* Minimal region size. Every damon_region is aligned by this. */ #define DAMON_MIN_REGION PAGE_SIZE +/* Max priority score for DAMON-based operation schemes */ +#define DAMOS_MAX_SCORE (99) /** * struct damon_addr_range - Represents an address region of [@start, @end). @@ -95,6 +97,10 @@ enum damos_action { * @sz: Maximum bytes of memory that the action can be applied. * @reset_interval: Charge reset interval in milliseconds. * + * @weight_sz: Weight of the region's size for prioritization. + * @weight_nr_accesses: Weight of the region's nr_accesses for prioritization. + * @weight_age: Weight of the region's age for prioritization. + * * To avoid consuming too much CPU time or IO resources for applying the * &struct damos->action to large memory, DAMON allows users to set time and/or * size quotas. The quotas can be set by writing non-zero values to &ms and @@ -106,12 +112,22 @@ enum damos_action { * Internally, the time quota is transformed to a size quota using estimated * throughput of the scheme's action. DAMON then compares it against &sz and * uses smaller one as the effective quota. + * + * For selecting regions within the quota, DAMON prioritizes current scheme's + * target memory regions using the &struct damon_primitive->get_scheme_score. 
+ * You could customize the prioritization logic by setting &weight_sz, + * &weight_nr_accesses, and &weight_age, because monitoring primitives are + * encouraged to respect those. */ struct damos_quota { unsigned long ms; unsigned long sz; unsigned long reset_interval; + unsigned int weight_sz; + unsigned int weight_nr_accesses; + unsigned int weight_age; + /* private: */ /* For throughput estimation */ unsigned long total_charged_sz; @@ -124,6 +140,10 @@ struct damos_quota { unsigned long charged_from; struct damon_target *charge_target_from; unsigned long charge_addr_from; + + /* For prioritization */ + unsigned long histogram[DAMOS_MAX_SCORE + 1]; + unsigned int min_score; }; /** @@ -174,6 +194,7 @@ struct damon_ctx; * @prepare_access_checks: Prepare next access check of target regions. * @check_accesses: Check the accesses to target regions. * @reset_aggregated: Reset aggregated accesses monitoring results. + * @get_scheme_score: Get the score of a region for a scheme. * @apply_scheme: Apply a DAMON-based operation scheme. * @target_valid: Determine if the target is valid. * @cleanup: Clean up the context. @@ -200,6 +221,8 @@ struct damon_ctx; * of its update. The value will be used for regions adjustment threshold. * @reset_aggregated should reset the access monitoring results that aggregated * by @check_accesses. + * @get_scheme_score should return the priority score of a region for a scheme + * as an integer in [0, &DAMOS_MAX_SCORE]. * @apply_scheme is called from @kdamond when a region for user provided * DAMON-based operation scheme is found. It should apply the scheme's action * to the region. This is not used for &DAMON_ARBITRARY_TARGET case. 
@@ -213,6 +236,9 @@ struct damon_primitive { void (*prepare_access_checks)(struct damon_ctx *context); unsigned int (*check_accesses)(struct damon_ctx *context); void (*reset_aggregated)(struct damon_ctx *context); + int (*get_scheme_score)(struct damon_ctx *context, + struct damon_target *t, struct damon_region *r, + struct damos *scheme); int (*apply_scheme)(struct damon_ctx *context, struct damon_target *t, struct damon_region *r, struct damos *scheme); bool (*target_valid)(void *target); --- a/mm/damon/core.c~mm-damon-schemes-prioritize-regions-within-the-quotas +++ a/mm/damon/core.c @@ -12,6 +12,7 @@ #include #include #include +#include #define CREATE_TRACE_POINTS #include @@ -110,6 +111,9 @@ struct damos *damon_new_scheme( scheme->quota.ms = quota->ms; scheme->quota.sz = quota->sz; scheme->quota.reset_interval = quota->reset_interval; + scheme->quota.weight_sz = quota->weight_sz; + scheme->quota.weight_nr_accesses = quota->weight_nr_accesses; + scheme->quota.weight_age = quota->weight_age; scheme->quota.total_charged_sz = 0; scheme->quota.total_charged_ns = 0; scheme->quota.esz = 0; @@ -545,6 +549,28 @@ static void damon_split_region_at(struct struct damon_target *t, struct damon_region *r, unsigned long sz_r); +static bool __damos_valid_target(struct damon_region *r, struct damos *s) +{ + unsigned long sz; + + sz = r->ar.end - r->ar.start; + return s->min_sz_region <= sz && sz <= s->max_sz_region && + s->min_nr_accesses <= r->nr_accesses && + r->nr_accesses <= s->max_nr_accesses && + s->min_age_region <= r->age && r->age <= s->max_age_region; +} + +static bool damos_valid_target(struct damon_ctx *c, struct damon_target *t, + struct damon_region *r, struct damos *s) +{ + bool ret = __damos_valid_target(r, s); + + if (!ret || !s->quota.esz || !c->primitive.get_scheme_score) + return ret; + + return c->primitive.get_scheme_score(c, t, r, s) >= s->quota.min_score; +} + static void damon_do_apply_schemes(struct damon_ctx *c, struct damon_target *t, struct 
damon_region *r) @@ -591,13 +617,7 @@ static void damon_do_apply_schemes(struc quota->charge_addr_from = 0; } - /* Check the target regions condition */ - if (sz < s->min_sz_region || s->max_sz_region < sz) - continue; - if (r->nr_accesses < s->min_nr_accesses || - s->max_nr_accesses < r->nr_accesses) - continue; - if (r->age < s->min_age_region || s->max_age_region < r->age) + if (!damos_valid_target(c, t, r, s)) continue; /* Apply the scheme */ @@ -661,6 +681,8 @@ static void kdamond_apply_schemes(struct damon_for_each_scheme(s, c) { struct damos_quota *quota = &s->quota; + unsigned long cumulated_sz; + unsigned int score, max_score = 0; if (!quota->ms && !quota->sz) continue; @@ -674,6 +696,32 @@ static void kdamond_apply_schemes(struct quota->charged_sz = 0; damos_set_effective_quota(quota); } + + if (!c->primitive.get_scheme_score) + continue; + + /* Fill up the score histogram */ + memset(quota->histogram, 0, sizeof(quota->histogram)); + damon_for_each_target(t, c) { + damon_for_each_region(r, t) { + if (!__damos_valid_target(r, s)) + continue; + score = c->primitive.get_scheme_score( + c, t, r, s); + quota->histogram[score] += + r->ar.end - r->ar.start; + if (score > max_score) + max_score = score; + } + } + + /* Set the min score limit */ + for (cumulated_sz = 0, score = max_score; ; score--) { + cumulated_sz += quota->histogram[score]; + if (cumulated_sz >= quota->esz || !score) + break; + } + quota->min_score = score; } damon_for_each_target(t, c) { From patchwork Fri Nov 5 20:47:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605891 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 741A2C433EF for ; Fri, 5 Nov 2021 20:47:40 +0000 (UTC) Received: from kanga.kvack.org 
(kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 2825D60174 for ; Fri, 5 Nov 2021 20:47:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 2825D60174 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id C1C939400ED; Fri, 5 Nov 2021 16:47:39 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id BA3689400EF; Fri, 5 Nov 2021 16:47:39 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A1DF79400F0; Fri, 5 Nov 2021 16:47:39 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0132.hostedemail.com [216.40.44.132]) by kanga.kvack.org (Postfix) with ESMTP id 8FCF69400EF for ; Fri, 5 Nov 2021 16:47:39 -0400 (EDT) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 50EA676B5F for ; Fri, 5 Nov 2021 20:47:39 +0000 (UTC) X-FDA: 78776062638.13.49C1BB4 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf23.hostedemail.com (Postfix) with ESMTP id 2C02890000B4 for ; Fri, 5 Nov 2021 20:47:26 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id B4C3A60240; Fri, 5 Nov 2021 20:47:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145258; bh=iyLxisym76ZgZIlZBYxW8VfmRFVonZ+VSUjI5fXd/G0=; h=Date:From:To:Subject:In-Reply-To:From; b=JW0EZ9rEk6hhtJMoxcvtn8x0LkrdIhD5zGWj4qwRGJvnoqCBn9vdrYCC0VA0HECUs bgvMOoz/6pIBCKtQkFK7hiybw+kkR5haxji8BJDKRy843ts88C6867uDcWbQCHCQpA El2wr9qGB+DagfB2Vjf9hYS22dRbbnIIICGPkyQs= Date: Fri, 05 Nov 2021 13:47:37 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, 
elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 246/262] mm/damon/vaddr,paddr: support pageout prioritization Message-ID: <20211105204737.OyQNtK_57%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=JW0EZ9rE; dmarc=none; spf=pass (imf23.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 2C02890000B4 X-Stat-Signature: infah5c7r1ehsibe7rygz1zxhkqewic8 X-HE-Tag: 1636145246-191021 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/vaddr,paddr: support pageout prioritization This commit makes the default monitoring primitives for virtual address spaces and the physical address space support memory region prioritization for the 'PAGEOUT' DAMOS action. It calculates the hotness of each region as a weighted sum of 'nr_accesses' and 'age' of the region and gets the priority score as the reverse of the hotness, so that cold regions can be paged out first.
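To make the arithmetic concrete, here is a user-space sketch of the score calculation, modeled on the damon_pageout_score() function added by this patch. The pageout_score() name and its flat parameter list are hypothetical (the kernel function reads these values from its context, region and scheme structures), the intervals are assumed to be in microseconds, which is consistent with the division by 1000000 in the patch, and the early return for a zero sample count is a guard added here that the patch does not have.

```c
#define MAX_SUBSCORE	100	/* DAMON_MAX_SUBSCORE in the patch */
#define MAX_AGE_IN_LOG	32	/* DAMON_MAX_AGE_IN_LOG in the patch */
#define MAX_SCORE	99	/* DAMOS_MAX_SCORE in the patch */

/*
 * Return the pageout priority of a region in [0, MAX_SCORE]; colder
 * regions get higher scores.
 */
static int pageout_score(unsigned int nr_accesses, unsigned int age,
			 unsigned long sample_interval_us,
			 unsigned long aggr_interval_us,
			 unsigned int freq_weight, unsigned int age_weight)
{
	unsigned long max_nr_accesses, age_in_sec;
	int freq_subscore, age_in_log, age_subscore;
	unsigned int hotness;

	/* Guard against a zero divisor; not part of the patch (assumption). */
	if (!sample_interval_us || aggr_interval_us < sample_interval_us)
		return MAX_SCORE;

	/* Access frequency, scaled to [0, MAX_SUBSCORE]. */
	max_nr_accesses = aggr_interval_us / sample_interval_us;
	freq_subscore = nr_accesses * MAX_SUBSCORE / max_nr_accesses;

	/* Age in seconds, then roughly log2 of it, capped at MAX_AGE_IN_LOG. */
	age_in_sec = (unsigned long)age * aggr_interval_us / 1000000;
	for (age_in_log = 0; age_in_log < MAX_AGE_IN_LOG && age_in_sec;
	     age_in_log++, age_in_sec >>= 1)
		;

	/* If frequency is 0, a higher age means the region is colder. */
	if (freq_subscore == 0)
		age_in_log *= -1;

	/* Shift [-MAX_AGE_IN_LOG, MAX_AGE_IN_LOG] into [0, MAX_SUBSCORE]. */
	age_in_log += MAX_AGE_IN_LOG;
	age_subscore = age_in_log * MAX_SUBSCORE / MAX_AGE_IN_LOG / 2;

	/* Weighted average of the two subscores. */
	hotness = freq_weight * (unsigned int)freq_subscore +
		  age_weight * (unsigned int)age_subscore;
	if (freq_weight + age_weight)
		hotness /= freq_weight + age_weight;

	/* Scale to [0, MAX_SCORE] and return coldness. */
	hotness = hotness * MAX_SCORE / MAX_SUBSCORE;
	return MAX_SCORE - (int)hotness;
}
```

With a 5 ms sampling interval, a 100 ms aggregation interval and equal weights, a region that has never been accessed and has age 0 scores 75 in this sketch, the same idle region after 600 aggregation intervals (60 seconds) scores 80, and a region accessed in every sample scores 25 at age 0.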
Link: https://lkml.kernel.org/r/20211019150731.16699-9-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: David Hildenbrand Cc: David Rientjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- include/linux/damon.h | 4 +++ mm/damon/paddr.c | 14 +++++++++++ mm/damon/prmtv-common.c | 46 ++++++++++++++++++++++++++++++++++++++ mm/damon/prmtv-common.h | 3 ++ mm/damon/vaddr.c | 15 ++++++++++++ 5 files changed, 82 insertions(+) --- a/include/linux/damon.h~mm-damon-vaddrpaddr-support-pageout-prioritization +++ a/include/linux/damon.h @@ -421,6 +421,8 @@ bool damon_va_target_valid(void *t); void damon_va_cleanup(struct damon_ctx *ctx); int damon_va_apply_scheme(struct damon_ctx *context, struct damon_target *t, struct damon_region *r, struct damos *scheme); +int damon_va_scheme_score(struct damon_ctx *context, struct damon_target *t, + struct damon_region *r, struct damos *scheme); void damon_va_set_primitives(struct damon_ctx *ctx); #endif /* CONFIG_DAMON_VADDR */ @@ -433,6 +435,8 @@ unsigned int damon_pa_check_accesses(str bool damon_pa_target_valid(void *t); int damon_pa_apply_scheme(struct damon_ctx *context, struct damon_target *t, struct damon_region *r, struct damos *scheme); +int damon_pa_scheme_score(struct damon_ctx *context, struct damon_target *t, + struct damon_region *r, struct damos *scheme); void damon_pa_set_primitives(struct damon_ctx *ctx); #endif /* CONFIG_DAMON_PADDR */ --- a/mm/damon/paddr.c~mm-damon-vaddrpaddr-support-pageout-prioritization +++ a/mm/damon/paddr.c @@ -246,6 +246,19 @@ int damon_pa_apply_scheme(struct damon_c return 0; } +int damon_pa_scheme_score(struct damon_ctx *context, struct damon_target *t, + struct damon_region *r, struct damos *scheme) +{ + switch (scheme->action) { + case DAMOS_PAGEOUT: + return damon_pageout_score(context, r, scheme); + default: 
+ break; + } + + return DAMOS_MAX_SCORE; +} + void damon_pa_set_primitives(struct damon_ctx *ctx) { ctx->primitive.init = NULL; @@ -256,4 +269,5 @@ void damon_pa_set_primitives(struct damo ctx->primitive.target_valid = damon_pa_target_valid; ctx->primitive.cleanup = NULL; ctx->primitive.apply_scheme = damon_pa_apply_scheme; + ctx->primitive.get_scheme_score = damon_pa_scheme_score; } --- a/mm/damon/prmtv-common.c~mm-damon-vaddrpaddr-support-pageout-prioritization +++ a/mm/damon/prmtv-common.c @@ -85,3 +85,49 @@ void damon_pmdp_mkold(pmd_t *pmd, struct put_page(page); #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ } + +#define DAMON_MAX_SUBSCORE (100) +#define DAMON_MAX_AGE_IN_LOG (32) + +int damon_pageout_score(struct damon_ctx *c, struct damon_region *r, + struct damos *s) +{ + unsigned int max_nr_accesses; + int freq_subscore; + unsigned int age_in_sec; + int age_in_log, age_subscore; + unsigned int freq_weight = s->quota.weight_nr_accesses; + unsigned int age_weight = s->quota.weight_age; + int hotness; + + max_nr_accesses = c->aggr_interval / c->sample_interval; + freq_subscore = r->nr_accesses * DAMON_MAX_SUBSCORE / max_nr_accesses; + + age_in_sec = (unsigned long)r->age * c->aggr_interval / 1000000; + for (age_in_log = 0; age_in_log < DAMON_MAX_AGE_IN_LOG && age_in_sec; + age_in_log++, age_in_sec >>= 1) + ; + + /* If frequency is 0, higher age means it's colder */ + if (freq_subscore == 0) + age_in_log *= -1; + + /* + * Now age_in_log is in [-DAMON_MAX_AGE_IN_LOG, DAMON_MAX_AGE_IN_LOG]. + * Scale it to be in [0, 100] and set it as age subscore. 
+ */ + age_in_log += DAMON_MAX_AGE_IN_LOG; + age_subscore = age_in_log * DAMON_MAX_SUBSCORE / + DAMON_MAX_AGE_IN_LOG / 2; + + hotness = (freq_weight * freq_subscore + age_weight * age_subscore); + if (freq_weight + age_weight) + hotness /= freq_weight + age_weight; + /* + * Transform it to fit in [0, DAMOS_MAX_SCORE] + */ + hotness = hotness * DAMOS_MAX_SCORE / DAMON_MAX_SUBSCORE; + + /* Return coldness of the region */ + return DAMOS_MAX_SCORE - hotness; +} --- a/mm/damon/prmtv-common.h~mm-damon-vaddrpaddr-support-pageout-prioritization +++ a/mm/damon/prmtv-common.h @@ -15,3 +15,6 @@ struct page *damon_get_page(unsigned lon void damon_ptep_mkold(pte_t *pte, struct mm_struct *mm, unsigned long addr); void damon_pmdp_mkold(pmd_t *pmd, struct mm_struct *mm, unsigned long addr); + +int damon_pageout_score(struct damon_ctx *c, struct damon_region *r, + struct damos *s); --- a/mm/damon/vaddr.c~mm-damon-vaddrpaddr-support-pageout-prioritization +++ a/mm/damon/vaddr.c @@ -633,6 +633,20 @@ int damon_va_apply_scheme(struct damon_c return damos_madvise(t, r, madv_action); } +int damon_va_scheme_score(struct damon_ctx *context, struct damon_target *t, + struct damon_region *r, struct damos *scheme) +{ + + switch (scheme->action) { + case DAMOS_PAGEOUT: + return damon_pageout_score(context, r, scheme); + default: + break; + } + + return DAMOS_MAX_SCORE; +} + void damon_va_set_primitives(struct damon_ctx *ctx) { ctx->primitive.init = damon_va_init; @@ -643,6 +657,7 @@ void damon_va_set_primitives(struct damo ctx->primitive.target_valid = damon_va_target_valid; ctx->primitive.cleanup = NULL; ctx->primitive.apply_scheme = damon_va_apply_scheme; + ctx->primitive.get_scheme_score = damon_va_scheme_score; } #include "vaddr-test.h" From patchwork Fri Nov 5 20:47:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605893 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 
(2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CB764C433EF for ; Fri, 5 Nov 2021 20:47:43 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 7C3E360E94 for ; Fri, 5 Nov 2021 20:47:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 7C3E360E94 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 1BCC49400F0; Fri, 5 Nov 2021 16:47:43 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 16E439400F1; Fri, 5 Nov 2021 16:47:43 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id EB1689400F0; Fri, 5 Nov 2021 16:47:42 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0143.hostedemail.com [216.40.44.143]) by kanga.kvack.org (Postfix) with ESMTP id DA7919400EF for ; Fri, 5 Nov 2021 16:47:42 -0400 (EDT) Received: from smtpin20.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id A1F978249980 for ; Fri, 5 Nov 2021 20:47:42 +0000 (UTC) X-FDA: 78776062764.20.48DC6CE Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf14.hostedemail.com (Postfix) with ESMTP id 1B62F6001995 for ; Fri, 5 Nov 2021 20:47:43 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 2175C60174; Fri, 5 Nov 2021 20:47:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145261; bh=R7pz+DGQP3/Gs78g52C3nu9K6Pi5OhRxNJqhGYfmBl8=; h=Date:From:To:Subject:In-Reply-To:From; b=G083ygnnRA/IVuoM0rcb5br+ZN6nCwYQrUvNyIz5FcRhZqsDMxYUnkJrJlXm2niFW 
FtqXAUq1WKJjtkbmstOuS492df27fZdJWwc6hUoPIvvCFys4s+4+didNo2obbpUA24 8AnOKQZFU/MhXhA1is3hj402GI2l0e03pR81lwA4= Date: Fri, 05 Nov 2021 13:47:40 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 247/262] mm/damon/dbgfs: support prioritization weights Message-ID: <20211105204740.8BGnsIb__%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 1B62F6001995 X-Stat-Signature: ancar667efyru6ato64hy65azf3xbdks Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=G083ygnn; spf=pass (imf14.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145263-647915 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/dbgfs: support prioritization weights This commit allows DAMON debugfs interface users set the prioritization weights by putting three more numbers to the 'schemes' file. 
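For illustration only, a minimal userspace sketch of how a 13-field 'schemes' line could be parsed once the three weights are appended. The format string is the one used in the diff below; the struct scheme_input type and the parse_scheme_line() helper are hypothetical names introduced here, not part of the patch.

```c
#include <stdio.h>

/* Hypothetical container for one parsed 'schemes' line (not in the patch). */
struct scheme_input {
	unsigned long min_sz, max_sz;
	unsigned int min_nr_a, max_nr_a, min_age, max_age, action;
	unsigned long quota_ms, quota_sz, quota_reset_interval;
	/* The three prioritization weights added by this change. */
	unsigned int weight_sz, weight_nr_accesses, weight_age;
};

/*
 * Parse one scheme line.  Returns the number of characters consumed, or -1
 * if fewer than 13 fields were converted (for example, the old 10-field
 * format).  %n stores the consumed length and is not counted by sscanf().
 */
int parse_scheme_line(const char *str, struct scheme_input *in)
{
	int parsed = 0;
	int ret;

	ret = sscanf(str, "%lu %lu %u %u %u %u %u %lu %lu %lu %u %u %u%n",
		     &in->min_sz, &in->max_sz, &in->min_nr_a, &in->max_nr_a,
		     &in->min_age, &in->max_age, &in->action,
		     &in->quota_ms, &in->quota_sz, &in->quota_reset_interval,
		     &in->weight_sz, &in->weight_nr_accesses, &in->weight_age,
		     &parsed);
	if (ret != 13)
		return -1;
	return parsed;
}
```

With this sketch, an old-style line such as "1 2 3 4 5 6 4 0 0 0" is rejected, which matches the 'ret != 13' check in the diff.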
Link: https://lkml.kernel.org/r/20211019150731.16699-10-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: David Hildenbrand Cc: David Rientjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- mm/damon/dbgfs.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) --- a/mm/damon/dbgfs.c~mm-damon-dbgfs-support-prioritization-weights +++ a/mm/damon/dbgfs.c @@ -105,13 +105,16 @@ static ssize_t sprint_schemes(struct dam damon_for_each_scheme(s, c) { rc = scnprintf(&buf[written], len - written, - "%lu %lu %u %u %u %u %d %lu %lu %lu %lu %lu\n", + "%lu %lu %u %u %u %u %d %lu %lu %lu %u %u %u %lu %lu\n", s->min_sz_region, s->max_sz_region, s->min_nr_accesses, s->max_nr_accesses, s->min_age_region, s->max_age_region, s->action, s->quota.ms, s->quota.sz, s->quota.reset_interval, + s->quota.weight_sz, + s->quota.weight_nr_accesses, + s->quota.weight_age, s->stat_count, s->stat_sz); if (!rc) return -ENOMEM; @@ -193,11 +196,14 @@ static struct damos **str_to_schemes(con while (pos < len && *nr_schemes < max_nr_schemes) { struct damos_quota quota = {}; - ret = sscanf(&str[pos], "%lu %lu %u %u %u %u %u %lu %lu %lu%n", + ret = sscanf(&str[pos], + "%lu %lu %u %u %u %u %u %lu %lu %lu %u %u %u%n", &min_sz, &max_sz, &min_nr_a, &max_nr_a, &min_age, &max_age, &action, &quota.ms, - &quota.sz, &quota.reset_interval, &parsed); - if (ret != 10) + &quota.sz, &quota.reset_interval, + &quota.weight_sz, &quota.weight_nr_accesses, + &quota.weight_age, &parsed); + if (ret != 13) break; if (!damos_action_valid(action)) { pr_err("wrong action %d\n", action); From patchwork Fri Nov 5 20:47:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605895 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 393B7C433FE for ; Fri, 5 Nov 2021 20:47:47 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id DEA2E61059 for ; Fri, 5 Nov 2021 20:47:46 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org DEA2E61059 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 7ADEA9400F1; Fri, 5 Nov 2021 16:47:46 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 75F669400EF; Fri, 5 Nov 2021 16:47:46 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 64D7B9400F1; Fri, 5 Nov 2021 16:47:46 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 557099400EF for ; Fri, 5 Nov 2021 16:47:46 -0400 (EDT) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 22F651856A9AF for ; Fri, 5 Nov 2021 20:47:46 +0000 (UTC) X-FDA: 78776062932.06.2BA40B1 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf26.hostedemail.com (Postfix) with ESMTP id 7A40720019EC for ; Fri, 5 Nov 2021 20:47:46 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 8C4EB6120A; Fri, 5 Nov 2021 20:47:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145265; bh=R4r/odyeX8hiRT8ZlOOfKeCMIVXfEblvWhWF7Ydjfe0=; h=Date:From:To:Subject:In-Reply-To:From; b=Xy3ys5cRWOtOqYCkhzhGTK803N/1njMPhcUgDiTbxBJ/Ifj8SOTy7r56rsokiEd7c 
yi4bdjmIPyCgHd/C8cgFMCH63A2YB81XHcwGzxMqO6Gqkt9ilfU+37ybBt1YzazBdW UmcYyhc6uPoHsEpnCe/d0ZfFEDHciFl1hzyw4AJA= Date: Fri, 05 Nov 2021 13:47:44 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 248/262] tools/selftests/damon: update for regions prioritization of schemes Message-ID: <20211105204744.xk_yNocPQ%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 7A40720019EC X-Stat-Signature: wckbkmxdbbw34qopokiiiq951y6pj373 Authentication-Results: imf26.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=Xy3ys5cR; dmarc=none; spf=pass (imf26.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636145266-358394 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: tools/selftests/damon: update for regions prioritization of schemes This commit updates the DAMON selftests for 'schemes' debugfs file, as the file format is updated. 
Link: https://lkml.kernel.org/r/20211019150731.16699-11-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: David Hildenbrand Cc: David Rientjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- tools/testing/selftests/damon/debugfs_attrs.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/tools/testing/selftests/damon/debugfs_attrs.sh~tools-selftests-damon-update-for-regions-prioritization-of-schemes +++ a/tools/testing/selftests/damon/debugfs_attrs.sh @@ -63,10 +63,10 @@ echo "$orig_content" > "$file" file="$DBGFS/schemes" orig_content=$(cat "$file") -test_write_succ "$file" "1 2 3 4 5 6 4 0 0 0" \ +test_write_succ "$file" "1 2 3 4 5 6 4 0 0 0 1 2 3" \ "$orig_content" "valid input" test_write_fail "$file" "1 2 -3 4 5 6 3 0 0 0" "$orig_content" "multi lines" +3 4 5 6 3 0 0 0 1 2 3" "$orig_content" "multi lines" test_write_succ "$file" "" "$orig_content" "disabling" echo "$orig_content" > "$file" From patchwork Fri Nov 5 20:47:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605897 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CE282C433F5 for ; Fri, 5 Nov 2021 20:47:50 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 7A73360E94 for ; Fri, 5 Nov 2021 20:47:50 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 7A73360E94 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: 
by kanga.kvack.org (Postfix) id 16DAF9400F2; Fri, 5 Nov 2021 16:47:50 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 11C7E9400EF; Fri, 5 Nov 2021 16:47:50 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0336A9400F2; Fri, 5 Nov 2021 16:47:49 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0216.hostedemail.com [216.40.44.216]) by kanga.kvack.org (Postfix) with ESMTP id E7F3F9400EF for ; Fri, 5 Nov 2021 16:47:49 -0400 (EDT) Received: from smtpin35.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id A8A23779A5 for ; Fri, 5 Nov 2021 20:47:49 +0000 (UTC) X-FDA: 78776063058.35.085A9F4 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf27.hostedemail.com (Postfix) with ESMTP id 2EA0E70000AB for ; Fri, 5 Nov 2021 20:47:49 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id EA7B560174; Fri, 5 Nov 2021 20:47:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145268; bh=I7Mr47Thoczs1zhINP/24nEapbVQXKgGH0IYKoruAcc=; h=Date:From:To:Subject:In-Reply-To:From; b=ci4IDlebwPNo8mJnfa7YyllRtXjBeVfB5IA4lxpcOKXb9ow3yCtapCPzTFoCwR9ZQ EDHngqhRkKaeyU2A/Uy0v+3EofE2RinXjbkWKoSUbHytdiaCu3vVUAyJndZ5hzGhJ5 qf0ALmtDXtrLVeloXwu3pORbAXGKXT3gbYTNa44k= Date: Fri, 05 Nov 2021 13:47:47 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 249/262] mm/damon/schemes: activate schemes based on a watermarks mechanism Message-ID: 
<20211105204747.cqLATVL8w%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 2EA0E70000AB X-Stat-Signature: qmzgztgx1zu3kifhz7nns9eaprmwj1yb Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=ci4IDleb; spf=pass (imf27.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145269-556700 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/schemes: activate schemes based on a watermarks mechanism DAMON-based operation schemes need to be manually turned on and off. In some use cases, however, the condition for turning a scheme on and off would depend on the system's situation. For example, schemes for proactive page reclamation would need to be turned on when some memory pressure is detected, and turned off when the system has enough free memory. For easier control of scheme activation based on the system situation, this commit introduces a watermarks-based mechanism. The client can describe the watermark metric (e.g., amount of free memory in the system), watermark check interval, and three watermarks, namely high, mid, and low. If the scheme is deactivated, it only gets the metric and compares that to the three watermarks for every check interval. If the metric is higher than the high watermark, the scheme is deactivated. If the metric is between the mid watermark and the low watermark, the scheme is activated. If the metric is lower than the low watermark, the scheme is deactivated again. This is to allow users to fall back to traditional page-granularity mechanisms. 
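As a rough illustration of the rule above, here is a small userspace sketch of the activation decision. It follows the comparison order used by damos_wmark_wait_us() in the diff below, so a scheme that is already active stays active while the metric is between the mid and high watermarks. The struct wmarks type and the wmarks_update() name are hypothetical stand-ins, and the watermark values used in the note after the block are made up.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the scheme's watermark state. */
struct wmarks {
	unsigned long high;
	unsigned long mid;
	unsigned long low;
	bool activated;
};

/*
 * Update and return the activation state for one metric reading.
 *
 * - metric above high or below low: deactivate.
 * - metric in [mid, high] while inactive: stay inactive.
 * - otherwise (metric in [low, mid), or already active): activate.
 */
bool wmarks_update(struct wmarks *w, unsigned long metric)
{
	if (metric > w->high || metric < w->low) {
		w->activated = false;
		return false;
	}

	/* Inactive and not below the middle watermark yet: keep waiting. */
	if (metric >= w->mid && !w->activated)
		return false;

	w->activated = true;
	return true;
}
```

With high=500, mid=400, low=50 on a free memory rate in [0, 1000], the readings 800, 450, 300, 450, 600, 20 would give inactive, inactive, active, active, inactive, inactive under this sketch.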
Link: https://lkml.kernel.org/r/20211019150731.16699-12-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: David Hildenbrand Cc: David Rientjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- include/linux/damon.h | 52 +++++++++++++++++++++ mm/damon/core.c | 97 +++++++++++++++++++++++++++++++++++++++- mm/damon/dbgfs.c | 5 +- 3 files changed, 151 insertions(+), 3 deletions(-) --- a/include/linux/damon.h~mm-damon-schemes-activate-schemes-based-on-a-watermarks-mechanism +++ a/include/linux/damon.h @@ -147,6 +147,45 @@ struct damos_quota { }; /** + * enum damos_wmark_metric - Represents the watermark metric. + * + * @DAMOS_WMARK_NONE: Ignore the watermarks of the given scheme. + * @DAMOS_WMARK_FREE_MEM_RATE: Free memory rate of the system in [0,1000]. + */ +enum damos_wmark_metric { + DAMOS_WMARK_NONE, + DAMOS_WMARK_FREE_MEM_RATE, +}; + +/** + * struct damos_watermarks - Controls when a given scheme should be activated. + * @metric: Metric for the watermarks. + * @interval: Watermarks check time interval in microseconds. + * @high: High watermark. + * @mid: Middle watermark. + * @low: Low watermark. + * + * If &metric is &DAMOS_WMARK_NONE, the scheme is always active. Being active + * means DAMON does monitoring and applying the action of the scheme to + * appropriate memory regions. Else, DAMON checks &metric of the system for at + * least every &interval microseconds and works as below. + * + * If &metric is higher than &high, the scheme is inactivated. If &metric is + * between &mid and &low, the scheme is activated. If &metric is lower than + * &low, the scheme is inactivated. 
+ */ +struct damos_watermarks { + enum damos_wmark_metric metric; + unsigned long interval; + unsigned long high; + unsigned long mid; + unsigned long low; + +/* private: */ + bool activated; +}; + +/** * struct damos - Represents a Data Access Monitoring-based Operation Scheme. * @min_sz_region: Minimum size of target regions. * @max_sz_region: Maximum size of target regions. @@ -156,6 +195,7 @@ struct damos_quota { * @max_age_region: Maximum age of target regions. * @action: &damo_action to be applied to the target regions. * @quota: Control the aggressiveness of this scheme. + * @wmarks: Watermarks for automated (in)activation of this scheme. * @stat_count: Total number of regions that this scheme is applied. * @stat_sz: Total size of regions that this scheme is applied. * @list: List head for siblings. @@ -166,6 +206,14 @@ struct damos_quota { * those. To avoid consuming too much CPU time or IO resources for the * &action, &quota is used. * + * To do the work only when needed, schemes can be activated for specific + * system situations using &wmarks. If all schemes that registered to the + * monitoring context are inactive, DAMON stops monitoring either, and just + * repeatedly checks the watermarks. + * + * If all schemes that registered to a &struct damon_ctx are inactive, DAMON + * stops monitoring and just repeatedly checks the watermarks. + * * After applying the &action to each region, &stat_count and &stat_sz is * updated to reflect the number of regions and total size of regions that the * &action is applied. 
@@ -179,6 +227,7 @@ struct damos { unsigned int max_age_region; enum damos_action action; struct damos_quota quota; + struct damos_watermarks wmarks; unsigned long stat_count; unsigned long stat_sz; struct list_head list; @@ -384,7 +433,8 @@ struct damos *damon_new_scheme( unsigned long min_sz_region, unsigned long max_sz_region, unsigned int min_nr_accesses, unsigned int max_nr_accesses, unsigned int min_age_region, unsigned int max_age_region, - enum damos_action action, struct damos_quota *quota); + enum damos_action action, struct damos_quota *quota, + struct damos_watermarks *wmarks); void damon_add_scheme(struct damon_ctx *ctx, struct damos *s); void damon_destroy_scheme(struct damos *s); --- a/mm/damon/core.c~mm-damon-schemes-activate-schemes-based-on-a-watermarks-mechanism +++ a/mm/damon/core.c @@ -10,6 +10,7 @@ #include #include #include +#include #include #include #include @@ -90,7 +91,8 @@ struct damos *damon_new_scheme( unsigned long min_sz_region, unsigned long max_sz_region, unsigned int min_nr_accesses, unsigned int max_nr_accesses, unsigned int min_age_region, unsigned int max_age_region, - enum damos_action action, struct damos_quota *quota) + enum damos_action action, struct damos_quota *quota, + struct damos_watermarks *wmarks) { struct damos *scheme; @@ -122,6 +124,13 @@ struct damos *damon_new_scheme( scheme->quota.charge_target_from = NULL; scheme->quota.charge_addr_from = 0; + scheme->wmarks.metric = wmarks->metric; + scheme->wmarks.interval = wmarks->interval; + scheme->wmarks.high = wmarks->high; + scheme->wmarks.mid = wmarks->mid; + scheme->wmarks.low = wmarks->low; + scheme->wmarks.activated = true; + return scheme; } @@ -582,6 +591,9 @@ static void damon_do_apply_schemes(struc unsigned long sz = r->ar.end - r->ar.start; struct timespec64 begin, end; + if (!s->wmarks.activated) + continue; + /* Check the quota */ if (quota->esz && quota->charged_sz >= quota->esz) continue; @@ -684,6 +696,9 @@ static void kdamond_apply_schemes(struct 
unsigned long cumulated_sz; unsigned int score, max_score = 0; + if (!s->wmarks.activated) + continue; + if (!quota->ms && !quota->sz) continue; @@ -924,6 +939,83 @@ static bool kdamond_need_stop(struct dam return true; } +static unsigned long damos_wmark_metric_value(enum damos_wmark_metric metric) +{ + struct sysinfo i; + + switch (metric) { + case DAMOS_WMARK_FREE_MEM_RATE: + si_meminfo(&i); + return i.freeram * 1000 / i.totalram; + default: + break; + } + return -EINVAL; +} + +/* + * Returns zero if the scheme is active. Else, returns time to wait for next + * watermark check in micro-seconds. + */ +static unsigned long damos_wmark_wait_us(struct damos *scheme) +{ + unsigned long metric; + + if (scheme->wmarks.metric == DAMOS_WMARK_NONE) + return 0; + + metric = damos_wmark_metric_value(scheme->wmarks.metric); + /* higher than high watermark or lower than low watermark */ + if (metric > scheme->wmarks.high || scheme->wmarks.low > metric) { + if (scheme->wmarks.activated) + pr_debug("inactivate a scheme (%d) for %s wmark\n", + scheme->action, + metric > scheme->wmarks.high ? 
+ "high" : "low"); + scheme->wmarks.activated = false; + return scheme->wmarks.interval; + } + + /* inactive and higher than middle watermark */ + if ((scheme->wmarks.high >= metric && metric >= scheme->wmarks.mid) && + !scheme->wmarks.activated) + return scheme->wmarks.interval; + + if (!scheme->wmarks.activated) + pr_debug("activate a scheme (%d)\n", scheme->action); + scheme->wmarks.activated = true; + return 0; +} + +static void kdamond_usleep(unsigned long usecs) +{ + if (usecs > 100 * 1000) + schedule_timeout_interruptible(usecs_to_jiffies(usecs)); + else + usleep_range(usecs, usecs + 1); +} + +/* Returns negative error code if it's not activated but should return */ +static int kdamond_wait_activation(struct damon_ctx *ctx) +{ + struct damos *s; + unsigned long wait_time; + unsigned long min_wait_time = 0; + + while (!kdamond_need_stop(ctx)) { + damon_for_each_scheme(s, ctx) { + wait_time = damos_wmark_wait_us(s); + if (!min_wait_time || wait_time < min_wait_time) + min_wait_time = wait_time; + } + if (!min_wait_time) + return 0; + + kdamond_usleep(min_wait_time); + } + return -EBUSY; +} + static void set_kdamond_stop(struct damon_ctx *ctx) { mutex_lock(&ctx->kdamond_lock); @@ -952,6 +1044,9 @@ static int kdamond_fn(void *data) sz_limit = damon_region_sz_limit(ctx); while (!kdamond_need_stop(ctx)) { + if (kdamond_wait_activation(ctx)) + continue; + if (ctx->primitive.prepare_access_checks) ctx->primitive.prepare_access_checks(ctx); if (ctx->callback.after_sampling && --- a/mm/damon/dbgfs.c~mm-damon-schemes-activate-schemes-based-on-a-watermarks-mechanism +++ a/mm/damon/dbgfs.c @@ -195,6 +195,9 @@ static struct damos **str_to_schemes(con *nr_schemes = 0; while (pos < len && *nr_schemes < max_nr_schemes) { struct damos_quota quota = {}; + struct damos_watermarks wmarks = { + .metric = DAMOS_WMARK_NONE, + }; ret = sscanf(&str[pos], "%lu %lu %u %u %u %u %u %lu %lu %lu %u %u %u%n", @@ -212,7 +215,7 @@ static struct damos **str_to_schemes(con pos += parsed; scheme 
= damon_new_scheme(min_sz, max_sz, min_nr_a, max_nr_a, - min_age, max_age, action, &quota); + min_age, max_age, action, &quota, &wmarks); if (!scheme) goto fail; From patchwork Fri Nov 5 20:47:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605899 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2733EC433F5 for ; Fri, 5 Nov 2021 20:47:54 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id CF0896120A for ; Fri, 5 Nov 2021 20:47:53 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org CF0896120A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 650779400F3; Fri, 5 Nov 2021 16:47:53 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 5FEEF9400EF; Fri, 5 Nov 2021 16:47:53 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4EDA69400F3; Fri, 5 Nov 2021 16:47:53 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0058.hostedemail.com [216.40.44.58]) by kanga.kvack.org (Postfix) with ESMTP id 3E3019400EF for ; Fri, 5 Nov 2021 16:47:53 -0400 (EDT) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 0CCEF8249980 for ; Fri, 5 Nov 2021 20:47:53 +0000 (UTC) X-FDA: 78776063226.18.37ADDC1 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf06.hostedemail.com (Postfix) with ESMTP id 963C3801A88D for ; Fri, 5 Nov 2021 20:47:52 +0000 (UTC) 
Received: by mail.kernel.org (Postfix) with ESMTPSA id 587B860E94; Fri, 5 Nov 2021 20:47:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145271; bh=g5R111MaiTao7h+LlM/Zwl+Ic8bQclUWNUBM4EpIHMQ=; h=Date:From:To:Subject:In-Reply-To:From; b=bqahBoHo194gwMChctji44L/Z3kWofxZeE99wJeOhcpxCDFgtu72+v3jhV4qXjVpj Z0PSvx4Lve50DBr3I7mPcloxJGc3AL+jRR7tdzDEVpiKEzEK7AJHIByGdcNmbIJp4Y Ey5PG0KgfLibDJwTm46Oiggvgu3b1krQ+dgfUyA0= Date: Fri, 05 Nov 2021 13:47:50 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 250/262] mm/damon/dbgfs: support watermarks Message-ID: <20211105204750.AryAbRGhe%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf06.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=bqahBoHo; dmarc=none; spf=pass (imf06.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 963C3801A88D X-Stat-Signature: 4j7nprswi7967sfthwjuz116f1gdfib8 X-HE-Tag: 1636145272-817668 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon/dbgfs: support watermarks This commit updates DAMON debugfs interface to support the watermarks based schemes activation. For this, now 'schemes' file receives five more values. 
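As a hedged illustration, the five extra values follow the existing 13 fields in the order metric, interval, high, mid, low (per the sscanf() format in the diff below). The sketch appends them to an existing scheme string in userspace; struct wmarks_input and append_wmarks() are hypothetical names, and the sample values used in the note after the block are the ones in the updated selftest.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical holder for the five watermark fields (not in the patch). */
struct wmarks_input {
	unsigned int metric;	/* e.g. 1 for the free memory rate metric */
	unsigned long interval;	/* check interval in microseconds */
	unsigned long high;
	unsigned long mid;
	unsigned long low;
};

/*
 * Write "<base> <metric> <interval> <high> <mid> <low>" into buf.
 * Returns the resulting string length, or -1 if buf is too small.
 */
int append_wmarks(char *buf, size_t len, const char *base,
		  const struct wmarks_input *w)
{
	int n = snprintf(buf, len, "%s %u %lu %lu %lu %lu", base,
			 w->metric, w->interval, w->high, w->mid, w->low);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}
```

Appending { 1, 100, 3, 2, 1 } to "1 2 3 4 5 6 4 0 0 0 1 2 3" yields the 18-field line used by the selftest.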
Link: https://lkml.kernel.org/r/20211019150731.16699-13-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: David Hildenbrand Cc: David Rientjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- mm/damon/dbgfs.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) --- a/mm/damon/dbgfs.c~mm-damon-dbgfs-support-watermarks +++ a/mm/damon/dbgfs.c @@ -105,7 +105,7 @@ static ssize_t sprint_schemes(struct dam damon_for_each_scheme(s, c) { rc = scnprintf(&buf[written], len - written, - "%lu %lu %u %u %u %u %d %lu %lu %lu %u %u %u %lu %lu\n", + "%lu %lu %u %u %u %u %d %lu %lu %lu %u %u %u %d %lu %lu %lu %lu %lu %lu\n", s->min_sz_region, s->max_sz_region, s->min_nr_accesses, s->max_nr_accesses, s->min_age_region, s->max_age_region, @@ -115,6 +115,8 @@ static ssize_t sprint_schemes(struct dam s->quota.weight_sz, s->quota.weight_nr_accesses, s->quota.weight_age, + s->wmarks.metric, s->wmarks.interval, + s->wmarks.high, s->wmarks.mid, s->wmarks.low, s->stat_count, s->stat_sz); if (!rc) return -ENOMEM; @@ -195,18 +197,18 @@ static struct damos **str_to_schemes(con *nr_schemes = 0; while (pos < len && *nr_schemes < max_nr_schemes) { struct damos_quota quota = {}; - struct damos_watermarks wmarks = { - .metric = DAMOS_WMARK_NONE, - }; + struct damos_watermarks wmarks; ret = sscanf(&str[pos], - "%lu %lu %u %u %u %u %u %lu %lu %lu %u %u %u%n", + "%lu %lu %u %u %u %u %u %lu %lu %lu %u %u %u %u %lu %lu %lu %lu%n", &min_sz, &max_sz, &min_nr_a, &max_nr_a, &min_age, &max_age, &action, &quota.ms, &quota.sz, &quota.reset_interval, &quota.weight_sz, &quota.weight_nr_accesses, - &quota.weight_age, &parsed); - if (ret != 13) + &quota.weight_age, &wmarks.metric, + &wmarks.interval, &wmarks.high, &wmarks.mid, + &wmarks.low, &parsed); + if (ret != 18) break; if (!damos_action_valid(action)) { pr_err("wrong action %d\n", 
action); From patchwork Fri Nov 5 20:47:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605901 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 71AEDC433F5 for ; Fri, 5 Nov 2021 20:47:57 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 2D8FC60FBF for ; Fri, 5 Nov 2021 20:47:57 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 2D8FC60FBF Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id BCCD19400F4; Fri, 5 Nov 2021 16:47:56 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B7D3C9400EF; Fri, 5 Nov 2021 16:47:56 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A6B869400F4; Fri, 5 Nov 2021 16:47:56 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0126.hostedemail.com [216.40.44.126]) by kanga.kvack.org (Postfix) with ESMTP id 9538E9400EF for ; Fri, 5 Nov 2021 16:47:56 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 6113B779A3 for ; Fri, 5 Nov 2021 20:47:56 +0000 (UTC) X-FDA: 78776063352.29.E7F9168 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf30.hostedemail.com (Postfix) with ESMTP id D3534E001990 for ; Fri, 5 Nov 2021 20:47:38 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id BF51860174; Fri, 5 Nov 2021 20:47:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145275; bh=XetGG75b6rThKS+EWmp6D+wuYIkXABjZysKVOhjS3sU=; h=Date:From:To:Subject:In-Reply-To:From; b=efHjmXCLn4RL4xNylRllLbiWKXO7/KFJ5TqWBnOOi4r4RSrlVtmKdeHei03PQZVlP kj/8ygoSgaoX6ODLx+gMesA3TyxRD2/GAA/DjD56hTbka3rxydMYubBmLzfNSHwbUJ CTDsjhKH9oGe2Qy8+7gPV9DwwXQC3W2/bCMinV0U= Date: Fri, 05 Nov 2021 13:47:54 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 251/262] selftests/damon: support watermarks Message-ID: <20211105204754.3AoRsMkYE%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: D3534E001990 X-Stat-Signature: 7owsmy7rh9on7me18rzc47rrqdmp7niy Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=efHjmXCL; spf=pass (imf30.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145258-852820 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: selftests/damon: support watermarks This commit updates DAMON selftests for 'schemes' debugfs file to reflect the changes in the format. 
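To make the format change concrete, the user-space sketch below mirrors the sscanf() format string that the dbgfs patch uses for one scheme line: 13 fields before the change, 18 after it, with the five new fields being the watermark metric, interval, high, mid and low values. The names `scheme_input` and `parse_scheme_line` are hypothetical and exist only for this illustration; the kernel parses into its own `damos_quota` and `damos_watermarks` structures.

```c
#include <stdio.h>

/* Hypothetical user-space mirror of the 18 fields in one 'schemes' line. */
struct scheme_input {
	unsigned long min_sz, max_sz;
	unsigned int min_nr_a, max_nr_a, min_age, max_age, action;
	unsigned long quota_ms, quota_sz, quota_reset_interval;
	unsigned int weight_sz, weight_nr_accesses, weight_age;
	unsigned int wmark_metric;
	unsigned long wmark_interval, wmark_high, wmark_mid, wmark_low;
};

/*
 * Parse one scheme line with the same conversion sequence as the patch.
 * Returns the number of characters consumed, or -1 when the line does not
 * carry all 18 fields (for example, an old 13-field line).
 */
static int parse_scheme_line(const char *str, struct scheme_input *s)
{
	int parsed = 0;
	int ret;

	ret = sscanf(str,
		"%lu %lu %u %u %u %u %u %lu %lu %lu %u %u %u %u %lu %lu %lu %lu%n",
		&s->min_sz, &s->max_sz, &s->min_nr_a, &s->max_nr_a,
		&s->min_age, &s->max_age, &s->action,
		&s->quota_ms, &s->quota_sz, &s->quota_reset_interval,
		&s->weight_sz, &s->weight_nr_accesses, &s->weight_age,
		&s->wmark_metric, &s->wmark_interval,
		&s->wmark_high, &s->wmark_mid, &s->wmark_low, &parsed);
	/* %n stores the consumed length but is not counted in the return value */
	if (ret != 18)
		return -1;
	return parsed;
}
```

With this sketch, the selftest's new valid input "1 2 3 4 5 6 4 0 0 0 1 2 3 1 100 3 2 1" parses successfully, while the old 13-field input "1 2 3 4 5 6 4 0 0 0 1 2 3" is rejected.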
Link: https://lkml.kernel.org/r/20211019150731.16699-14-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: David Hildenbrand Cc: David Rientjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- tools/testing/selftests/damon/debugfs_attrs.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/tools/testing/selftests/damon/debugfs_attrs.sh~selftests-damon-support-watermarks +++ a/tools/testing/selftests/damon/debugfs_attrs.sh @@ -63,10 +63,10 @@ echo "$orig_content" > "$file" file="$DBGFS/schemes" orig_content=$(cat "$file") -test_write_succ "$file" "1 2 3 4 5 6 4 0 0 0 1 2 3" \ +test_write_succ "$file" "1 2 3 4 5 6 4 0 0 0 1 2 3 1 100 3 2 1" \ "$orig_content" "valid input" test_write_fail "$file" "1 2 -3 4 5 6 3 0 0 0 1 2 3" "$orig_content" "multi lines" +3 4 5 6 3 0 0 0 1 2 3 1 100 3 2 1" "$orig_content" "multi lines" test_write_succ "$file" "" "$orig_content" "disabling" echo "$orig_content" > "$file" From patchwork Fri Nov 5 20:47:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605903 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1E102C433F5 for ; Fri, 5 Nov 2021 20:48:01 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B48A760FBF for ; Fri, 5 Nov 2021 20:48:00 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org B48A760FBF Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org 
Received: by kanga.kvack.org (Postfix) id 4EF229400F5; Fri, 5 Nov 2021 16:48:00 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 49F629400EF; Fri, 5 Nov 2021 16:48:00 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 367689400F5; Fri, 5 Nov 2021 16:48:00 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0138.hostedemail.com [216.40.44.138]) by kanga.kvack.org (Postfix) with ESMTP id 23F609400EF for ; Fri, 5 Nov 2021 16:48:00 -0400 (EDT) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id D5C138249980 for ; Fri, 5 Nov 2021 20:47:59 +0000 (UTC) X-FDA: 78776063478.26.4084374 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf11.hostedemail.com (Postfix) with ESMTP id 75138F0000B0 for ; Fri, 5 Nov 2021 20:47:59 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 390F560E94; Fri, 5 Nov 2021 20:47:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145278; bh=phuAd4uflVdFoL/s3a/K4ACiVtKmm+RIavgG7d0c89Y=; h=Date:From:To:Subject:In-Reply-To:From; b=E/pMiYGnNGfeBcsN2d6XMQZNMUgZ/yefkCvay26tLHK/y7oSOzuuPLfr2DUsc96rd HTde148DrWriCfLglU0L1s2GjFFUGJTCtz4nvsboOocTahFvzk4HLRvydWA7LDiyad HvM0fdgQ/vq1Kt/7FCgu7exmLsCtyKgQblco8Tuk= Date: Fri, 05 Nov 2021 13:47:57 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org, yangyingliang@huawei.com Subject: [patch 252/262] mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM) Message-ID: 
<20211105204757.6AUMFKAtz%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 75138F0000B0 X-Stat-Signature: 6iojs4zcywxfq6mwut76akxsickdjeyo Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b="E/pMiYGn"; dmarc=none; spf=pass (imf11.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1636145279-125823 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM) This commit implements a new kernel subsystem that finds cold memory regions using DAMON and reclaims those immediately. It is intended to be used as proactive lightweight reclamation logic for light memory pressure. For heavy memory pressure, it could be inactivated and fall back to the traditional page-scanning based reclamation. It's implemented on top of the DAMON framework to use the DAMON-based Operation Schemes (DAMOS) feature. It utilizes all the DAMOS features including speed limit, prioritization, and watermarks. It could be enabled and tuned at boot time via the kernel boot parameter, and at run time via its module parameters ('/sys/module/damon_reclaim/parameters/') interface. 
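As a rough illustration of the watermarks feature mentioned above, the user-space sketch below models how a free memory rate (per thousand) could be checked against high, mid and low watermarks. The names `wmarks`, `free_mem_rate` and `reclaim_active` are hypothetical and are not the kernel's API. Keeping the previous state while the rate sits between the middle and high watermarks is an assumption of this sketch, not something this commit message states.

```c
#include <stdbool.h>

/* Hypothetical mirror of the three watermark parameters (per thousand). */
struct wmarks {
	unsigned long high;
	unsigned long mid;
	unsigned long low;
};

/* Free memory rate per thousand; e.g. 250 means 25% of memory is free. */
static unsigned long free_mem_rate(unsigned long free_bytes,
				   unsigned long total_bytes)
{
	if (!total_bytes)
		return 0;
	/* widen before multiplying so the product cannot overflow on 32-bit */
	return (unsigned long)((unsigned long long)free_bytes * 1000 /
			       total_bytes);
}

/*
 * Decide whether reclamation should be active for the given rate.
 * Above 'high' or below 'low' it is inactive; below 'mid' (and not below
 * 'low') it is active. Between 'mid' and 'high' the previous state is kept
 * (assumption of this sketch).
 */
static bool reclaim_active(const struct wmarks *w, unsigned long rate,
			   bool was_active)
{
	if (rate > w->high || rate < w->low)
		return false;
	if (rate >= w->mid)
		return was_active;
	return true;
}
```

With watermarks of 500, 400 and 200, a rate of 300 activates the sketch, while rates of 600 or 100 deactivate it; a rate of 450 leaves the previous state unchanged.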
[yangyingliang@huawei.com: fix error return code in damon_reclaim_turn()] Link: https://lkml.kernel.org/r/20211025124500.2758060-1-yangyingliang@huawei.com Link: https://lkml.kernel.org/r/20211019150731.16699-15-sj@kernel.org Signed-off-by: SeongJae Park Signed-off-by: Yang Yingliang Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: David Hildenbrand Cc: David Rientjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- mm/damon/Kconfig | 12 + mm/damon/Makefile | 1 mm/damon/reclaim.c | 356 +++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 369 insertions(+) --- a/mm/damon/Kconfig~mm-damon-introduce-damon-based-reclamation-damon_reclaim +++ a/mm/damon/Kconfig @@ -73,4 +73,16 @@ config DAMON_DBGFS_KUNIT_TEST If unsure, say N. +config DAMON_RECLAIM + bool "Build DAMON-based reclaim (DAMON_RECLAIM)" + depends on DAMON_PADDR + help + This builds the DAMON-based reclamation subsystem. It finds pages + that are not accessed for a long time (cold) using DAMON and reclaims + those. + + This is suggested to be used as a proactive and lightweight + reclamation under light memory pressure, while the traditional page + scanning-based reclamation is used for heavy pressure. 
+ endmenu --- a/mm/damon/Makefile~mm-damon-introduce-damon-based-reclamation-damon_reclaim +++ a/mm/damon/Makefile @@ -4,3 +4,4 @@ obj-$(CONFIG_DAMON) := core.o obj-$(CONFIG_DAMON_VADDR) += prmtv-common.o vaddr.o obj-$(CONFIG_DAMON_PADDR) += prmtv-common.o paddr.o obj-$(CONFIG_DAMON_DBGFS) += dbgfs.o +obj-$(CONFIG_DAMON_RECLAIM) += reclaim.o --- /dev/null +++ a/mm/damon/reclaim.c @@ -0,0 +1,356 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * DAMON-based page reclamation + * + * Author: SeongJae Park + */ + +#define pr_fmt(fmt) "damon-reclaim: " fmt + +#include +#include +#include +#include +#include + +#ifdef MODULE_PARAM_PREFIX +#undef MODULE_PARAM_PREFIX +#endif +#define MODULE_PARAM_PREFIX "damon_reclaim." + +/* + * Enable or disable DAMON_RECLAIM. + * + * You can enable DAMON_RECLAIM by setting the value of this parameter as ``Y``. + * Setting it as ``N`` disables DAMON_RECLAIM. Note that DAMON_RECLAIM could + * do no real monitoring and reclamation due to the watermarks-based activation + * condition. Refer to below descriptions for the watermarks parameter for + * this. + */ +static bool enabled __read_mostly; +module_param(enabled, bool, 0600); + +/* + * Time threshold for cold memory regions identification in microseconds. + * + * If a memory region is not accessed for this or longer time, DAMON_RECLAIM + * identifies the region as cold, and reclaims. 120 seconds by default. + */ +static unsigned long min_age __read_mostly = 120000000; +module_param(min_age, ulong, 0600); + +/* + * Limit of time for trying the reclamation in milliseconds. + * + * DAMON_RECLAIM tries to use only up to this time within a time window + * (quota_reset_interval_ms) for trying reclamation of cold pages. This can be + * used for limiting CPU consumption of DAMON_RECLAIM. If the value is zero, + * the limit is disabled. + * + * 10 ms by default. 
+ */ +static unsigned long quota_ms __read_mostly = 10; +module_param(quota_ms, ulong, 0600); + +/* + * Limit of size of memory for the reclamation in bytes. + * + * DAMON_RECLAIM charges amount of memory which it tried to reclaim within a + * time window (quota_reset_interval_ms) and makes no more than this limit is + * tried. This can be used for limiting consumption of CPU and IO. If this + * value is zero, the limit is disabled. + * + * 128 MiB by default. + */ +static unsigned long quota_sz __read_mostly = 128 * 1024 * 1024; +module_param(quota_sz, ulong, 0600); + +/* + * The time/size quota charge reset interval in milliseconds. + * + * The charge reset interval for the quota of time (quota_ms) and size + * (quota_sz). That is, DAMON_RECLAIM does not try reclamation for more than + * quota_ms milliseconds or quota_sz bytes within quota_reset_interval_ms + * milliseconds. + * + * 1 second by default. + */ +static unsigned long quota_reset_interval_ms __read_mostly = 1000; +module_param(quota_reset_interval_ms, ulong, 0600); + +/* + * The watermarks check time interval in microseconds. + * + * Minimal time to wait before checking the watermarks, when DAMON_RECLAIM is + * enabled but inactive due to its watermarks rule. 5 seconds by default. + */ +static unsigned long wmarks_interval __read_mostly = 5000000; +module_param(wmarks_interval, ulong, 0600); + +/* + * Free memory rate (per thousand) for the high watermark. + * + * If free memory of the system in bytes per thousand bytes is higher than + * this, DAMON_RECLAIM becomes inactive, so it does nothing but periodically + * checks the watermarks. 500 (50%) by default. + */ +static unsigned long wmarks_high __read_mostly = 500; +module_param(wmarks_high, ulong, 0600); + +/* + * Free memory rate (per thousand) for the middle watermark. 
+ * + * If free memory of the system in bytes per thousand bytes is between this and + * the low watermark, DAMON_RECLAIM becomes active, so starts the monitoring + * and the reclaiming. 400 (40%) by default. + */ +static unsigned long wmarks_mid __read_mostly = 400; +module_param(wmarks_mid, ulong, 0600); + +/* + * Free memory rate (per thousand) for the low watermark. + * + * If free memory of the system in bytes per thousand bytes is lower than this, + * DAMON_RECLAIM becomes inactive, so it does nothing but periodically checks + * the watermarks. In the case, the system falls back to the LRU-based page + * granularity reclamation logic. 200 (20%) by default. + */ +static unsigned long wmarks_low __read_mostly = 200; +module_param(wmarks_low, ulong, 0600); + +/* + * Sampling interval for the monitoring in microseconds. + * + * The sampling interval of DAMON for the cold memory monitoring. Please refer + * to the DAMON documentation for more detail. 5 ms by default. + */ +static unsigned long sample_interval __read_mostly = 5000; +module_param(sample_interval, ulong, 0600); + +/* + * Aggregation interval for the monitoring in microseconds. + * + * The aggregation interval of DAMON for the cold memory monitoring. Please + * refer to the DAMON documentation for more detail. 100 ms by default. + */ +static unsigned long aggr_interval __read_mostly = 100000; +module_param(aggr_interval, ulong, 0600); + +/* + * Minimum number of monitoring regions. + * + * The minimal number of monitoring regions of DAMON for the cold memory + * monitoring. This can be used to set lower-bound of the monitoring quality. + * But, setting this too high could result in increased monitoring overhead. + * Please refer to the DAMON documentation for more detail. 10 by default. + */ +static unsigned long min_nr_regions __read_mostly = 10; +module_param(min_nr_regions, ulong, 0600); + +/* + * Maximum number of monitoring regions. 
+ * + * The maximum number of monitoring regions of DAMON for the cold memory + * monitoring. This can be used to set upper-bound of the monitoring overhead. + * However, setting this too low could result in bad monitoring quality. + * Please refer to the DAMON documentation for more detail. 1000 by default. + */ +static unsigned long max_nr_regions __read_mostly = 1000; +module_param(max_nr_regions, ulong, 0600); + +/* + * Start of the target memory region in physical address. + * + * The start physical address of memory region that DAMON_RECLAIM will do work + * against. By default, biggest System RAM is used as the region. + */ +static unsigned long monitor_region_start __read_mostly; +module_param(monitor_region_start, ulong, 0600); + +/* + * End of the target memory region in physical address. + * + * The end physical address of memory region that DAMON_RECLAIM will do work + * against. By default, biggest System RAM is used as the region. + */ +static unsigned long monitor_region_end __read_mostly; +module_param(monitor_region_end, ulong, 0600); + +/* + * PID of the DAMON thread + * + * If DAMON_RECLAIM is enabled, this becomes the PID of the worker thread. + * Else, -1. + */ +static int kdamond_pid __read_mostly = -1; +module_param(kdamond_pid, int, 0400); + +static struct damon_ctx *ctx; +static struct damon_target *target; + +struct damon_reclaim_ram_walk_arg { + unsigned long start; + unsigned long end; +}; + +static int walk_system_ram(struct resource *res, void *arg) +{ + struct damon_reclaim_ram_walk_arg *a = arg; + + if (a->end - a->start < res->end - res->start) { + a->start = res->start; + a->end = res->end; + } + return 0; +} + +/* + * Find biggest 'System RAM' resource and store its start and end address in + * @start and @end, respectively. If no System RAM is found, returns false. 
+ */ +static bool get_monitoring_region(unsigned long *start, unsigned long *end) +{ + struct damon_reclaim_ram_walk_arg arg = {}; + + walk_system_ram_res(0, ULONG_MAX, &arg, walk_system_ram); + if (arg.end <= arg.start) + return false; + + *start = arg.start; + *end = arg.end; + return true; +} + +static struct damos *damon_reclaim_new_scheme(void) +{ + struct damos_watermarks wmarks = { + .metric = DAMOS_WMARK_FREE_MEM_RATE, + .interval = wmarks_interval, + .high = wmarks_high, + .mid = wmarks_mid, + .low = wmarks_low, + }; + struct damos_quota quota = { + /* + * Do not try reclamation for more than quota_ms milliseconds + * or quota_sz bytes within quota_reset_interval_ms. + */ + .ms = quota_ms, + .sz = quota_sz, + .reset_interval = quota_reset_interval_ms, + /* Within the quota, page out older regions first. */ + .weight_sz = 0, + .weight_nr_accesses = 0, + .weight_age = 1 + }; + struct damos *scheme = damon_new_scheme( + /* Find regions having PAGE_SIZE or larger size */ + PAGE_SIZE, ULONG_MAX, + /* and not accessed at all */ + 0, 0, + /* for min_age or more micro-seconds, and */ + min_age / aggr_interval, UINT_MAX, + /* page out those, as soon as found */ + DAMOS_PAGEOUT, + /* under the quota. */ + &quota, + /* (De)activate this according to the watermarks. 
*/ + &wmarks); + + return scheme; +} + +static int damon_reclaim_turn(bool on) +{ + struct damon_region *region; + struct damos *scheme; + int err; + + if (!on) { + err = damon_stop(&ctx, 1); + if (!err) + kdamond_pid = -1; + return err; + } + + err = damon_set_attrs(ctx, sample_interval, aggr_interval, 0, + min_nr_regions, max_nr_regions); + if (err) + return err; + + if (monitor_region_start > monitor_region_end) + return -EINVAL; + if (!monitor_region_start && !monitor_region_end && + !get_monitoring_region(&monitor_region_start, + &monitor_region_end)) + return -EINVAL; + /* DAMON will free this on its own when finish monitoring */ + region = damon_new_region(monitor_region_start, monitor_region_end); + if (!region) + return -ENOMEM; + damon_add_region(region, target); + + /* Will be freed by 'damon_set_schemes()' below */ + scheme = damon_reclaim_new_scheme(); + if (!scheme) { + err = -ENOMEM; + goto free_region_out; + } + err = damon_set_schemes(ctx, &scheme, 1); + if (err) + goto free_scheme_out; + + err = damon_start(&ctx, 1); + if (!err) { + kdamond_pid = ctx->kdamond->pid; + return 0; + } + +free_scheme_out: + damon_destroy_scheme(scheme); +free_region_out: + damon_destroy_region(region, target); + return err; +} + +#define ENABLE_CHECK_INTERVAL_MS 1000 +static struct delayed_work damon_reclaim_timer; +static void damon_reclaim_timer_fn(struct work_struct *work) +{ + static bool last_enabled; + bool now_enabled; + + now_enabled = enabled; + if (last_enabled != now_enabled) { + if (!damon_reclaim_turn(now_enabled)) + last_enabled = now_enabled; + else + enabled = last_enabled; + } + + schedule_delayed_work(&damon_reclaim_timer, + msecs_to_jiffies(ENABLE_CHECK_INTERVAL_MS)); +} +static DECLARE_DELAYED_WORK(damon_reclaim_timer, damon_reclaim_timer_fn); + +static int __init damon_reclaim_init(void) +{ + ctx = damon_new_ctx(); + if (!ctx) + return -ENOMEM; + + damon_pa_set_primitives(ctx); + + /* 4242 means nothing but fun */ + target = damon_new_target(4242); 
+ if (!target) { + damon_destroy_ctx(ctx); + return -ENOMEM; + } + damon_add_target(ctx, target); + + schedule_delayed_work(&damon_reclaim_timer, 0); + return 0; +} + +module_init(damon_reclaim_init); From patchwork Fri Nov 5 20:48:01 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605905 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A0B50C433EF for ; Fri, 5 Nov 2021 20:48:04 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4A51A6120A for ; Fri, 5 Nov 2021 20:48:04 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 4A51A6120A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id DC9039400F6; Fri, 5 Nov 2021 16:48:03 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id DA0E69400EF; Fri, 5 Nov 2021 16:48:03 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C321E9400F7; Fri, 5 Nov 2021 16:48:03 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0080.hostedemail.com [216.40.44.80]) by kanga.kvack.org (Postfix) with ESMTP id AE48B9400F6 for ; Fri, 5 Nov 2021 16:48:03 -0400 (EDT) Received: from smtpin10.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 6E0AB181843D1 for ; Fri, 5 Nov 2021 20:48:03 +0000 (UTC) X-FDA: 78776063646.10.80D6F42 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf03.hostedemail.com (Postfix) with ESMTP id 
108C13000096 for ; Fri, 5 Nov 2021 20:47:55 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 9FA3960174; Fri, 5 Nov 2021 20:48:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145282; bh=BJe1/d4rFaMZZyg8mkSALeidZ2fUubKkcT6qUwOfgeA=; h=Date:From:To:Subject:In-Reply-To:From; b=wQSMjVOeuvhwjuuLXXxlDxTdgMXx8mlMLfeJtHfudSfDJTntkIk2d/PHa2NZquWzx Th1Z2XxXEZNvDZYMA6dNORbNvh5iedbPmuNLPdHB8KdR3DJZJDo/Ta8bWlt9dqVvhM VsQoWYghnxWgChJZqp2m7t4lGgngQ/9aPM20FAQU= Date: Fri, 05 Nov 2021 13:48:01 -0700 From: Andrew Morton To: akpm@linux-foundation.org, amit@kernel.org, benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com, dwmw@amazon.com, elver@google.com, foersleo@amazon.de, gthelen@google.com, Jonathan.Cameron@huawei.com, linux-mm@kvack.org, markubo@amazon.de, mm-commits@vger.kernel.org, rientjes@google.com, shakeelb@google.com, shuah@kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 253/262] Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM Message-ID: <20211105204801.-uH66wZFO%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 108C13000096 X-Stat-Signature: 1np335ffbwtnstbwdeaqoeay15uutddk Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=wQSMjVOe; spf=pass (imf03.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145275-666242 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM This commit adds an admin-guide document for DAMON-based Reclamation. 
Link: https://lkml.kernel.org/r/20211019150731.16699-16-sj@kernel.org Signed-off-by: SeongJae Park Cc: Amit Shah Cc: Benjamin Herrenschmidt Cc: David Hildenbrand Cc: David Rientjes Cc: David Woodhouse Cc: Greg Thelen Cc: Jonathan Cameron Cc: Jonathan Corbet Cc: Leonard Foerster Cc: Marco Elver Cc: Markus Boehme Cc: Shakeel Butt Cc: Shuah Khan Signed-off-by: Andrew Morton --- Documentation/admin-guide/mm/damon/index.rst | 1 Documentation/admin-guide/mm/damon/reclaim.rst | 235 +++++++++++++++ 2 files changed, 236 insertions(+) --- a/Documentation/admin-guide/mm/damon/index.rst~documentation-admin-guide-mm-damon-add-a-document-for-damon_reclaim +++ a/Documentation/admin-guide/mm/damon/index.rst @@ -13,3 +13,4 @@ optimize those. start usage + reclaim --- /dev/null +++ a/Documentation/admin-guide/mm/damon/reclaim.rst @@ -0,0 +1,235 @@ +.. SPDX-License-Identifier: GPL-2.0 + +======================= +DAMON-based Reclamation +======================= + +DAMON-based Reclamation (DAMON_RECLAIM) is a static kernel module that is aimed to +be used for proactive and lightweight reclamation under light memory pressure. +It doesn't aim to replace the LRU-list based page granularity reclamation, but +to be selectively used for different levels of memory pressure and requirements. + +Where Proactive Reclamation is Required? +======================================== + +On general memory over-committed systems, proactively reclaiming cold pages +helps save memory and reduce latency spikes that are incurred by the direct +reclaim of the process or CPU consumption of kswapd, while incurring only +minimal performance degradation [1]_ [2]_ . + +Free Pages Reporting [3]_ based memory over-commit virtualization systems are +good examples of the cases. In such systems, the guest VMs report their free +memory to the host, and the host reallocates the reported memory to other guests. +As a result, the memory of the systems is fully utilized. 
However, the +guests might not be so memory-frugal, mainly because some kernel subsystems and +user-space applications are designed to use as much memory as available. Then, +guests could report only a small amount of memory as free to the host, resulting in +a memory utilization drop of the systems. Running the proactive reclamation in +guests could mitigate this problem. + +How It Works? +============= + +DAMON_RECLAIM finds memory regions that have not been accessed for a specific time +duration and pages them out. To avoid it consuming too much CPU for the paging out +operation, a speed limit can be configured. Under the speed limit, it pages +out memory regions that have not been accessed for a longer time first. System +administrators can also configure under what situation this scheme should be +automatically activated and deactivated with three memory pressure watermarks. + +Interface: Module Parameters +============================ + +To use this feature, you should first ensure your system is running on a kernel +that is built with ``CONFIG_DAMON_RECLAIM=y``. + +To let sysadmins enable or disable it and tune for the given system, +DAMON_RECLAIM utilizes module parameters. That is, you can put +``damon_reclaim.<parameter>=<value>`` on the kernel boot command line or write +proper values to ``/sys/module/damon_reclaim/parameters/<parameter>`` files. + +Note that the parameter values except ``enabled`` are applied only when +DAMON_RECLAIM starts. Therefore, if you want to apply new parameter values at +runtime and DAMON_RECLAIM is already enabled, you should disable and re-enable +it via the ``enabled`` parameter file. Writing of the new values to the proper +parameter files should be done before the re-enablement. + +Below is the description of each parameter. + +enabled +------- + +Enable or disable DAMON_RECLAIM. + +You can enable DAMON_RECLAIM by setting the value of this parameter as ``Y``. +Setting it as ``N`` disables DAMON_RECLAIM. 
Note that DAMON_RECLAIM could do +no real monitoring and reclamation due to the watermarks-based activation +condition. Refer to the descriptions of the watermarks parameters below for this. + +min_age +------- + +Time threshold for cold memory regions identification in microseconds. + +If a memory region is not accessed for this or longer time, DAMON_RECLAIM +identifies the region as cold, and reclaims it. + +120 seconds by default. + +quota_ms +-------- + +Limit of time for the reclamation in milliseconds. + +DAMON_RECLAIM tries to use only up to this time within a time window +(quota_reset_interval_ms) for trying reclamation of cold pages. This can be +used for limiting CPU consumption of DAMON_RECLAIM. If the value is zero, the +limit is disabled. + +10 ms by default. + +quota_sz +-------- + +Limit of size of memory for the reclamation in bytes. + +DAMON_RECLAIM charges the amount of memory which it tried to reclaim within a time +window (quota_reset_interval_ms) and makes sure that no more than this limit is tried. +This can be used for limiting consumption of CPU and IO. If this value is +zero, the limit is disabled. + +128 MiB by default. + +quota_reset_interval_ms +----------------------- + +The time/size quota charge reset interval in milliseconds. + +The charge reset interval for the quota of time (quota_ms) and size +(quota_sz). That is, DAMON_RECLAIM does not try reclamation for more than +quota_ms milliseconds or quota_sz bytes within quota_reset_interval_ms +milliseconds. + +1 second by default. + +wmarks_interval +--------------- + +Minimal time to wait before checking the watermarks, when DAMON_RECLAIM is +enabled but inactive due to its watermarks rule. + +wmarks_high +----------- + +Free memory rate (per thousand) for the high watermark. + +If free memory of the system in bytes per thousand bytes is higher than this, +DAMON_RECLAIM becomes inactive, so it does nothing but periodically checks +the watermarks. 
+ +wmarks_mid +---------- + +Free memory rate (per thousand) for the middle watermark. + +If free memory of the system in bytes per thousand bytes is between this and +the low watermark, DAMON_RECLAIM becomes active, so starts the monitoring and +the reclaiming. + +wmarks_low +---------- + +Free memory rate (per thousand) for the low watermark. + +If free memory of the system in bytes per thousand bytes is lower than this, +DAMON_RECLAIM becomes inactive, so it does nothing but periodically checks the +watermarks. In the case, the system falls back to the LRU-list based page +granularity reclamation logic. + +sample_interval +--------------- + +Sampling interval for the monitoring in microseconds. + +The sampling interval of DAMON for the cold memory monitoring. Please refer to +the DAMON documentation (:doc:`usage`) for more detail. + +aggr_interval +------------- + +Aggregation interval for the monitoring in microseconds. + +The aggregation interval of DAMON for the cold memory monitoring. Please +refer to the DAMON documentation (:doc:`usage`) for more detail. + +min_nr_regions +-------------- + +Minimum number of monitoring regions. + +The minimal number of monitoring regions of DAMON for the cold memory +monitoring. This can be used to set lower-bound of the monitoring quality. +But, setting this too high could result in increased monitoring overhead. +Please refer to the DAMON documentation (:doc:`usage`) for more detail. + +max_nr_regions +-------------- + +Maximum number of monitoring regions. + +The maximum number of monitoring regions of DAMON for the cold memory +monitoring. This can be used to set upper-bound of the monitoring overhead. +However, setting this too low could result in bad monitoring quality. Please +refer to the DAMON documentation (:doc:`usage`) for more detail. + +monitor_region_start +-------------------- + +Start of target memory region in physical address. 
+ +The start physical address of the memory region that DAMON_RECLAIM will do work +against. That is, DAMON_RECLAIM will find cold memory regions in this region +and reclaim them. By default, the biggest System RAM is used as the region. + +monitor_region_end +------------------ + +End of target memory region in physical address. + +The end physical address of the memory region that DAMON_RECLAIM will do work +against. That is, DAMON_RECLAIM will find cold memory regions in this region +and reclaim them. By default, the biggest System RAM is used as the region. + +kdamond_pid +----------- + +PID of the DAMON thread. + +If DAMON_RECLAIM is enabled, this becomes the PID of the worker thread. Else, +-1. + +Example +======= + +The runtime example commands below make DAMON_RECLAIM find memory regions that +have not been accessed for 30 seconds or more and page them out. The reclamation is limited +to at most 1 GiB per second to avoid DAMON_RECLAIM consuming too +much CPU time for the paging out operation. It also asks DAMON_RECLAIM to do +nothing if the system's free memory rate is more than 50%, but start the real +work if it becomes lower than 40%. If DAMON_RECLAIM doesn't make progress and +therefore the free memory rate becomes lower than 20%, it asks DAMON_RECLAIM to +do nothing again, so that we can fall back to the LRU-list based page +granularity reclamation. :: + + # cd /sys/module/damon_reclaim/parameters + # echo 30000000 > min_age + # echo $((1 * 1024 * 1024 * 1024)) > quota_sz + # echo 1000 > quota_reset_interval_ms + # echo 500 > wmarks_high + # echo 400 > wmarks_mid + # echo 200 > wmarks_low + # echo Y > enabled + +.. [1] https://research.google/pubs/pub48551/ +.. [2] https://lwn.net/Articles/787611/ +.. 
[3] https://www.kernel.org/doc/html/latest/vm/free_page_reporting.html From patchwork Fri Nov 5 20:48:04 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605907 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0E115C433F5 for ; Fri, 5 Nov 2021 20:48:07 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D0FAF60FBF for ; Fri, 5 Nov 2021 20:48:06 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org D0FAF60FBF Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 7D4F19400F7; Fri, 5 Nov 2021 16:48:06 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 786269400EF; Fri, 5 Nov 2021 16:48:06 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 69AF29400F7; Fri, 5 Nov 2021 16:48:06 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0050.hostedemail.com [216.40.44.50]) by kanga.kvack.org (Postfix) with ESMTP id 58D769400EF for ; Fri, 5 Nov 2021 16:48:06 -0400 (EDT) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 217B11856AD5B for ; Fri, 5 Nov 2021 20:48:06 +0000 (UTC) X-FDA: 78776063772.07.C139080 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf17.hostedemail.com (Postfix) with ESMTP id C776CF00039E for ; Fri, 5 Nov 2021 20:48:05 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id DE9EF60E94; Fri, 5 Nov 2021 
20:48:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145285; bh=sGFNxz2N1YWJUJLa6sSjdrPx0R2KpQItJ3kc8xQtrno=; h=Date:From:To:Subject:In-Reply-To:From; b=iO0wvjhyBQRCrgBN2bSKGMhc7ZjHPe3bcw1fr5Xe47hwqxKXYMgvWaj9djxjLcUBa gBCEiqsNMEkBf95GPiW2D5Y5fp5kuV0U+Xj8La4ANGY961GCuot3YCVPYVV0NWXTXN e+o1o2UZYOnK8N5Sa/UH+QRpinyPJmIezXW14+IU= Date: Fri, 05 Nov 2021 13:48:04 -0700 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org, xhao@linux.alibaba.com Subject: [patch 254/262] mm/damon: remove unnecessary variable initialization Message-ID: <20211105204804.WWEtMEnCD%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=iO0wvjhy; dmarc=none; spf=pass (imf17.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: C776CF00039E X-Stat-Signature: hyjsyrxnt1df9iaa49awdfwrpas9df4u X-HE-Tag: 1636145285-864929 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Xin Hao Subject: mm/damon: remove unnecessary variable initialization Patch series "mm/damon: Fix some small bugs", v4. 
This patch (of 2):

In 'damon_va_apply_three_regions', there is no need to initialize variable
'i' to 0.

Link: https://lkml.kernel.org/r/b7df8d3dad0943a37e01f60c441b1968b2b20354.1634720326.git.xhao@linux.alibaba.com
Link: https://lkml.kernel.org/r/cover.1634720326.git.xhao@linux.alibaba.com
Signed-off-by: Xin Hao
Reviewed-by: SeongJae Park
Signed-off-by: Andrew Morton
---

 mm/damon/vaddr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/damon/vaddr.c~mm-damon-remove-unnecessary-variable-initialization
+++ a/mm/damon/vaddr.c
@@ -306,7 +306,7 @@ static void damon_va_apply_three_regions
 		struct damon_addr_range bregions[3])
 {
 	struct damon_region *r, *next;
-	unsigned int i = 0;
+	unsigned int i;
 
 	/* Remove regions which are not in the three big regions now */
 	damon_for_each_region_safe(r, next, t) {

From patchwork Fri Nov 5 20:48:07 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605909
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1D969C433EF for ; Fri, 5 Nov 2021 20:48:10 +0000 (UTC)
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C669D60E94 for ; Fri, 5 Nov 2021 20:48:09 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org C669D60E94
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org
Received: by kanga.kvack.org (Postfix) id 738789400F8; Fri, 5 Nov 2021 16:48:09 -0400 (EDT)
Received: by kanga.kvack.org (Postfix, from userid 40) id 6E8B29400EF; Fri, 5 Nov 2021 16:48:09 -0400 (EDT)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042) id
5D7579400F8; Fri, 5 Nov 2021 16:48:09 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0119.hostedemail.com [216.40.44.119]) by kanga.kvack.org (Postfix) with ESMTP id 4FEF79400EF for ; Fri, 5 Nov 2021 16:48:09 -0400 (EDT) Received: from smtpin28.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 174718249980 for ; Fri, 5 Nov 2021 20:48:09 +0000 (UTC) X-FDA: 78776063898.28.ECD78D4 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf06.hostedemail.com (Postfix) with ESMTP id BE888801A8BF for ; Fri, 5 Nov 2021 20:48:08 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id CC3A760FBF; Fri, 5 Nov 2021 20:48:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145288; bh=8zNn0MTxAyxMFfsR2o/RpV1VmQ5LkBRO4BH3td/iDdk=; h=Date:From:To:Subject:In-Reply-To:From; b=IYW5oQV4FQVoInok6fvQeiuXjqjkirgW+Jrb7HTCHRTkFb/9uuLPabxOX44McEj3z ObtdTdzn7mMUHRHtulxIC3XqrQ0AyO+YPggEHlkZXwtFx/en3RG4YLddPz3uQlw66v RFByHfdgSbvM93BiYZKMoY9YKRAng3XyRZpdiPn0= Date: Fri, 05 Nov 2021 13:48:07 -0700 From: Andrew Morton To: akpm@linux-foundation.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org, xhao@linux.alibaba.com Subject: [patch 255/262] mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on Message-ID: <20211105204807.WLs3d5z72%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: BE888801A8BF X-Stat-Signature: mw9basssfqxe9y5izk7sfo8k1uy8ji89 Authentication-Results: imf06.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=IYW5oQV4; dmarc=none; spf=pass (imf06.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org 
X-HE-Tag: 1636145288-12279
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Xin Hao
Subject: mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on

I tested the monitor_on interface while the ctx->adaptive_targets list was
empty, like this.

    # cat /sys/kernel/debug/damon/target_ids
    #
    # echo on > /sys/kernel/debug/damon/monitor_on
    # damon: kdamond (5390) starts

Although the ctx->adaptive_targets list is empty, kthread_run is still
called and the kdamond.x thread is still created, which is meaningless.

So add a check in 'dbgfs_monitor_on_write' that returns -EINVAL if the
ctx->adaptive_targets list is empty.

Link: https://lkml.kernel.org/r/0a60a6e8ec9d71989e0848a4dc3311996ca3b5d4.1634720326.git.xhao@linux.alibaba.com
Signed-off-by: Xin Hao
Reviewed-by: SeongJae Park
Signed-off-by: Andrew Morton
---

 include/linux/damon.h | 1 +
 mm/damon/core.c | 5 +++++
 mm/damon/dbgfs.c | 15 ++++++++++++---
 3 files changed, 18 insertions(+), 3 deletions(-)

--- a/include/linux/damon.h~mm-damon-dbgfs-add-adaptive_targets-list-check-before-enable-monitor_on
+++ a/include/linux/damon.h
@@ -440,6 +440,7 @@ void damon_destroy_scheme(struct damos *
 
 struct damon_target *damon_new_target(unsigned long id);
 void damon_add_target(struct damon_ctx *ctx, struct damon_target *t);
+bool damon_targets_empty(struct damon_ctx *ctx);
 void damon_free_target(struct damon_target *t);
 void damon_destroy_target(struct damon_target *t);
 unsigned int damon_nr_regions(struct damon_target *t);
--- a/mm/damon/core.c~mm-damon-dbgfs-add-adaptive_targets-list-check-before-enable-monitor_on
+++ a/mm/damon/core.c
@@ -180,6 +180,11 @@ void damon_add_target(struct damon_ctx *
 	list_add_tail(&t->list, &ctx->adaptive_targets);
 }
 
+bool damon_targets_empty(struct damon_ctx *ctx)
+{
+	return list_empty(&ctx->adaptive_targets);
+}
+
 static void damon_del_target(struct damon_target *t)
 {
 	list_del(&t->list);
--- a/mm/damon/dbgfs.c~mm-damon-dbgfs-add-adaptive_targets-list-check-before-enable-monitor_on
+++ a/mm/damon/dbgfs.c
@@ -878,12 +878,21 @@ static ssize_t dbgfs_monitor_on_write(st
 		return -EINVAL;
 	}
 
-	if (!strncmp(kbuf, "on", count))
+	if (!strncmp(kbuf, "on", count)) {
+		int i;
+
+		for (i = 0; i < dbgfs_nr_ctxs; i++) {
+			if (damon_targets_empty(dbgfs_ctxs[i])) {
+				kfree(kbuf);
+				return -EINVAL;
+			}
+		}
 		ret = damon_start(dbgfs_ctxs, dbgfs_nr_ctxs);
-	else if (!strncmp(kbuf, "off", count))
+	} else if (!strncmp(kbuf, "off", count)) {
 		ret = damon_stop(dbgfs_ctxs, dbgfs_nr_ctxs);
-	else
+	} else {
 		ret = -EINVAL;
+	}
 
 	if (!ret)
 		ret = count;

From patchwork Fri Nov 5 20:48:10 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605911
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 17E47C433F5 for ; Fri, 5 Nov 2021 20:48:13 +0000 (UTC)
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C17BD60E94 for ; Fri, 5 Nov 2021 20:48:12 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org C17BD60E94
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org
Received: by kanga.kvack.org (Postfix) id 676A99400F9; Fri, 5 Nov 2021 16:48:12 -0400 (EDT)
Received: by kanga.kvack.org (Postfix, from userid 40) id 5FE2D9400EF; Fri, 5 Nov 2021 16:48:12 -0400 (EDT)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042) id 4C68F9400F9; Fri, 5 Nov 2021 16:48:12 -0400 (EDT)
X-Delivered-To: linux-mm@kvack.org
Received: from forelay.hostedemail.com
(smtprelay0096.hostedemail.com [216.40.44.96]) by kanga.kvack.org (Postfix) with ESMTP id 3EF719400EF for ; Fri, 5 Nov 2021 16:48:12 -0400 (EDT) Received: from smtpin15.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 0DC5F1856C07C for ; Fri, 5 Nov 2021 20:48:12 +0000 (UTC) X-FDA: 78776064024.15.DABD981 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf08.hostedemail.com (Postfix) with ESMTP id C729B30000B1 for ; Fri, 5 Nov 2021 20:47:59 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id B3B136120A; Fri, 5 Nov 2021 20:48:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145290; bh=qu2rlHwySO0EP1Ru+ktgVkit7Y4aha8BJc6JMSG9pH8=; h=Date:From:To:Subject:In-Reply-To:From; b=C1cOudhPy+P2wf4Mppkwu3UDePA9tf4tHRHeOi4/b7iIEMt1gVosshttioxZ+vaQO UBNS0HEWyOEWGVj7lzCD4gXeC7jFLMX6DrnzADeoYI55TDgd0+90DdOVCnAJ9FDk+X 34EnUkJwMvjTLfhLr8t32s8U/oZR99e7VAxB4hIU= Date: Fri, 05 Nov 2021 13:48:10 -0700 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 256/262] Docs/admin-guide/mm/damon/start: fix wrong example commands Message-ID: <20211105204810.2vA7rsIAo%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: C729B30000B1 X-Stat-Signature: uqb9cz5rt9s1das4qi9fphtm3a898bwh Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=C1cOudhP; spf=pass (imf08.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145279-936030 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: 
owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: SeongJae Park
Subject: Docs/admin-guide/mm/damon/start: fix wrong example commands

Patch series "Fix trivial nits in Documentation/admin-guide/mm".

This patchset fixes trivial nits in admin guide documents for DAMON and
pagemap.

This patch (of 4):

Some of the example commands in DAMON getting started guide are outdated,
missing sudo, or just wrong. This commit fixes those.

Link: https://lkml.kernel.org/r/20211022090311.3856-2-sj@kernel.org
Signed-off-by: SeongJae Park
Cc: Jonathan Corbet
Cc: Peter Xu
Signed-off-by: Andrew Morton
---

 Documentation/admin-guide/mm/damon/start.rst | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

--- a/Documentation/admin-guide/mm/damon/start.rst~docs-admin-guide-mm-damon-start-fix-wrong-example-commands
+++ a/Documentation/admin-guide/mm/damon/start.rst
@@ -19,7 +19,7 @@ your workload. ::
     # mount -t debugfs none /sys/kernel/debug/
     # git clone https://github.com/awslabs/damo
     # ./damo/damo record $(pidof )
-    # ./damo/damo report heat --plot_ascii
+    # ./damo/damo report heats --heatmap stdout
 
 The final command draws the access heatmap of ````. The heatmap
 shows which memory region (x-axis) is accessed when (y-axis) and how frequently
@@ -94,9 +94,9 @@ Visualizing Recorded Patterns
 The following three commands visualize the recorded access patterns and save
 the results as separate image files. ::
 
-    $ damo report heats --heatmap access_pattern_heatmap.png
-    $ damo report wss --range 0 101 1 --plot wss_dist.png
-    $ damo report wss --range 0 101 1 --sortby time --plot wss_chron_change.png
+    $ sudo damo report heats --heatmap access_pattern_heatmap.png
+    $ sudo damo report wss --range 0 101 1 --plot wss_dist.png
+    $ sudo damo report wss --range 0 101 1 --sortby time --plot wss_chron_change.png
 
 - ``access_pattern_heatmap.png`` will visualize the data access pattern in a
   heatmap, showing which memory region (y-axis) got accessed when (x-axis)
@@ -115,9 +115,9 @@ Data Access Pattern Aware Memory Managem
 Below three commands make every memory region of size >=4K that doesn't
 accessed for >=60 seconds in your workload to be swapped out. ::
 
-    $ echo "#min-size max-size min-acc max-acc min-age max-age action" > scheme
-    $ echo "4K max 0 0 60s max pageout" >> scheme
-    $ damo schemes -c my_thp_scheme
+    $ echo "#min-size max-size min-acc max-acc min-age max-age action" > test_scheme
+    $ echo "4K max 0 0 60s max pageout" >> test_scheme
+    $ damo schemes -c test_scheme
 
 .. [1] https://damonitor.github.io/doc/html/v17/admin-guide/mm/damon/start.html#visualizing-recorded-patterns
 .. 
[2] https://damonitor.github.io/test/result/visual/latest/rec.heatmap.1.png.html From patchwork Fri Nov 5 20:48:13 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605913 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 03DB6C433F5 for ; Fri, 5 Nov 2021 20:48:16 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B08F960E94 for ; Fri, 5 Nov 2021 20:48:15 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org B08F960E94 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 4C7ED9400EF; Fri, 5 Nov 2021 16:48:15 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 477A09400FA; Fri, 5 Nov 2021 16:48:15 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 368D99400EF; Fri, 5 Nov 2021 16:48:15 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0201.hostedemail.com [216.40.44.201]) by kanga.kvack.org (Postfix) with ESMTP id 25B219400FA for ; Fri, 5 Nov 2021 16:48:15 -0400 (EDT) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id DB28A779AB for ; Fri, 5 Nov 2021 20:48:14 +0000 (UTC) X-FDA: 78776064108.09.27A3ABC Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf29.hostedemail.com (Postfix) with ESMTP id 8E8549000256 for ; Fri, 5 Nov 2021 20:48:14 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 9688560FBF; Fri, 5 
Nov 2021 20:48:13 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145293; bh=WYYjCw+9p5AVStszW+MdgDmdHCd9cLV2dMuI+dseqgU=; h=Date:From:To:Subject:In-Reply-To:From; b=yjboW2SCyIqZfWSLJaJCfXlKdpH8c5AhvRVLnqE4jhTUTwWP9RDxZ1DVjCm7OmWt8 tfl19TuETII5gCgm6cjD+a1RkLsOJv6j4D5RoSjK08QaPVlZ3keom/U9XRYJyv4m7a IkSxTG/SgHLjtg1bHy6AE8eACG4R7hSK4eHyJF7U=
Date: Fri, 05 Nov 2021 13:48:13 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, corbet@lwn.net, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, sj@kernel.org, torvalds@linux-foundation.org
Subject: [patch 257/262] Docs/admin-guide/mm/damon/start: fix a wrong link
Message-ID: <20211105204813.v-9etVvIl%akpm@linux-foundation.org>
In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org>
User-Agent: s-nail v14.8.16
X-Rspamd-Server: rspam01
X-Rspamd-Queue-Id: 8E8549000256
X-Stat-Signature: bswoyoop4cwkaos4ssbk8ek1t5z4q4nq
Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=yjboW2SC; dmarc=none; spf=pass (imf29.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org
X-HE-Tag: 1636145294-278064
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: SeongJae Park
Subject: Docs/admin-guide/mm/damon/start: fix a wrong link

The 'Getting Started' document of DAMON links to DAMON's user interface
document where it refers to the detailed usage of its user space tool.
This commit fixes the link.

Link: https://lkml.kernel.org/r/20211022090311.3856-3-sj@kernel.org
Signed-off-by: SeongJae Park
Cc: Jonathan Corbet
Cc: Peter Xu
Signed-off-by: Andrew Morton
---

 Documentation/admin-guide/mm/damon/start.rst | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/Documentation/admin-guide/mm/damon/start.rst~docs-admin-guide-mm-damon-start-fix-a-wrong-link
+++ a/Documentation/admin-guide/mm/damon/start.rst
@@ -6,7 +6,9 @@ Getting Started
 
 This document briefly describes how you can use DAMON by demonstrating its
 default user space tool. Please note that this document describes only a part
-of its features for brevity. Please refer to :doc:`usage` for more details.
+of its features for brevity. Please refer to the usage `doc
+`_ of the tool for more
+details.
 
 
 TL; DR

From patchwork Fri Nov 5 20:48:16 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605915
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0867EC433F5 for ; Fri, 5 Nov 2021 20:48:19 +0000 (UTC)
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id AD86E6120A for ; Fri, 5 Nov 2021 20:48:18 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org AD86E6120A
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org
Received: by kanga.kvack.org (Postfix) id 47A009400FB; Fri, 5 Nov 2021 16:48:18 -0400 (EDT)
Received: by kanga.kvack.org (Postfix, from userid 40) id 429C89400FA; Fri, 5 Nov 2021 16:48:18 -0400 (EDT)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042) id 319D79400FB; Fri, 5 Nov 2021
16:48:18 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0032.hostedemail.com [216.40.44.32]) by kanga.kvack.org (Postfix) with ESMTP id 1F7379400FA for ; Fri, 5 Nov 2021 16:48:18 -0400 (EDT) Received: from smtpin10.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id DADB7180357D2 for ; Fri, 5 Nov 2021 20:48:17 +0000 (UTC) X-FDA: 78776064234.10.6EB430E Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf03.hostedemail.com (Postfix) with ESMTP id 9224C30000AB for ; Fri, 5 Nov 2021 20:48:10 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 81CF460E94; Fri, 5 Nov 2021 20:48:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145296; bh=m7PoQp+X/QeFEK8trap/sxYsqTQOJZ238Z/ukyiuB0A=; h=Date:From:To:Subject:In-Reply-To:From; b=JMLfC7YF5rKjQoIiRPRiVmGBPU/xlHmVWe58K1XzP1rRnlgM5mrNcp52MWJyce88o CynxhanAx+5nVxjkEB7dXM0ppJUH5iY8WLiHvliuHNpwCkZcyl4CkdLtrLZJr1L5YG XRiq6HcsDbB6DJwhh6rwcCqqDtKjJm0VEBwT7Too= Date: Fri, 05 Nov 2021 13:48:16 -0700 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 258/262] Docs/admin-guide/mm/damon/start: simplify the content Message-ID: <20211105204816.NLJTYQk79%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 9224C30000AB X-Stat-Signature: ghgg5mwbpngpzqt8zsb7efw8us4iuia4 Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=JMLfC7YF; spf=pass (imf03.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145290-421300 X-Bogosity: 
Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: SeongJae Park
Subject: Docs/admin-guide/mm/damon/start: simplify the content

Information in 'TL; DR' section of 'Getting Started' is duplicated in
other parts of the doc. It is also asking readers to visit the access
pattern visualizations gallery web site to show the results of example
visualization commands, while the users of the commands can use terminal
output.

To make the doc simple, this commit removes the duplicated 'TL; DR'
section and replaces the visualization example commands with versions
using terminal outputs.

Link: https://lkml.kernel.org/r/20211022090311.3856-4-sj@kernel.org
Signed-off-by: SeongJae Park
Cc: Jonathan Corbet
Cc: Peter Xu
Signed-off-by: Andrew Morton
---

 Documentation/admin-guide/mm/damon/start.rst | 111 +++++++++--------
 1 file changed, 59 insertions(+), 52 deletions(-)

--- a/Documentation/admin-guide/mm/damon/start.rst~docs-admin-guide-mm-damon-start-simplify-the-content
+++ a/Documentation/admin-guide/mm/damon/start.rst
@@ -11,38 +11,6 @@ of its features for brevity. Please ref
 details.
 
 
-TL; DR
-======
-
-Follow the commands below to monitor and visualize the memory access pattern of
-your workload. ::
-
-    # # build the kernel with CONFIG_DAMON_*=y, install it, and reboot
-    # mount -t debugfs none /sys/kernel/debug/
-    # git clone https://github.com/awslabs/damo
-    # ./damo/damo record $(pidof )
-    # ./damo/damo report heats --heatmap stdout
-
-The final command draws the access heatmap of ````. The heatmap
-shows which memory region (x-axis) is accessed when (y-axis) and how frequently
-(number; the higher the more accesses have been observed). ::
-
-    111111111111111111111111111111111111111111111111111111110000
-    111121111111111111111111111111211111111111111111111111110000
-    000000000000000000000000000000000000000000000000001555552000
-    000000000000000000000000000000000000000000000222223555552000
-    000000000000000000000000000000000000000011111677775000000000
-    000000000000000000000000000000000000000488888000000000000000
-    000000000000000000000000000000000177888400000000000000000000
-    000000000000000000000000000046666522222100000000000000000000
-    000000000000000000000014444344444300000000000000000000000000
-    000000000000000002222245555510000000000000000000000000000000
-    # access_frequency: 0 1 2 3 4 5 6 7 8 9
-    # x-axis: space (140286319947776-140286426374096: 101.496 MiB)
-    # y-axis: time (605442256436361-605479951866441: 37.695430s)
-    # resolution: 60x10 (1.692 MiB and 3.770s for each character)
-
-
 Prerequisites
 =============
 
@@ -93,22 +61,66 @@ pattern in the ``damon.data`` file.
 Visualizing Recorded Patterns
 =============================
 
-The following three commands visualize the recorded access patterns and save
-the results as separate image files. ::
-
-    $ sudo damo report heats --heatmap access_pattern_heatmap.png
-    $ sudo damo report wss --range 0 101 1 --plot wss_dist.png
-    $ sudo damo report wss --range 0 101 1 --sortby time --plot wss_chron_change.png
-
-- ``access_pattern_heatmap.png`` will visualize the data access pattern in a
-  heatmap, showing which memory region (y-axis) got accessed when (x-axis)
-  and how frequently (color).
-- ``wss_dist.png`` will show the distribution of the working set size.
-- ``wss_chron_change.png`` will show how the working set size has
-  chronologically changed.
+You can visualize the pattern in a heatmap, showing which memory region
+(x-axis) got accessed when (y-axis) and how frequently (number).::
 
-You can view the visualizations of this example workload at [1]_.
-Visualizations of other realistic workloads are available at [2]_ [3]_ [4]_.
+    $ sudo damo report heats --heatmap stdout
+    22222222222222222222222222222222222222211111111111111111111111111111111111111100
+    44444444444444444444444444444444444444434444444444444444444444444444444444443200
+    44444444444444444444444444444444444444433444444444444444444444444444444444444200
+    33333333333333333333333333333333333333344555555555555555555555555555555555555200
+    33333333333333333333333333333333333344444444444444444444444444444444444444444200
+    22222222222222222222222222222222222223355555555555555555555555555555555555555200
+    00000000000000000000000000000000000000288888888888888888888888888888888888888400
+    00000000000000000000000000000000000000288888888888888888888888888888888888888400
+    33333333333333333333333333333333333333355555555555555555555555555555555555555200
+    88888888888888888888888888888888888888600000000000000000000000000000000000000000
+    88888888888888888888888888888888888888600000000000000000000000000000000000000000
+    33333333333333333333333333333333333333444444444444444444444444444444444444443200
+    00000000000000000000000000000000000000288888888888888888888888888888888888888400
+    [...]
+    # access_frequency: 0 1 2 3 4 5 6 7 8 9
+    # x-axis: space (139728247021568-139728453431248: 196.848 MiB)
+    # y-axis: time (15256597248362-15326899978162: 1 m 10.303 s)
+    # resolution: 80x40 (2.461 MiB and 1.758 s for each character)
+
+You can also visualize the distribution of the working set size, sorted by the
+size.::
+
+    $ sudo damo report wss --range 0 101 10
+    #
+    # target_id 18446632103789443072
+    # avr: 107.708 MiB
+    0 0 B | |
+    10 95.328 MiB |**************************** |
+    20 95.332 MiB |**************************** |
+    30 95.340 MiB |**************************** |
+    40 95.387 MiB |**************************** |
+    50 95.387 MiB |**************************** |
+    60 95.398 MiB |**************************** |
+    70 95.398 MiB |**************************** |
+    80 95.504 MiB |**************************** |
+    90 190.703 MiB |********************************************************* |
+    100 196.875 MiB |***********************************************************|
+
+Using ``--sortby`` option with the above command, you can show how the working
+set size has chronologically changed.::
+
+    $ sudo damo report wss --range 0 101 10 --sortby time
+    #
+    # target_id 18446632103789443072
+    # avr: 107.708 MiB
+    0 3.051 MiB | |
+    10 190.703 MiB |***********************************************************|
+    20 95.336 MiB |***************************** |
+    30 95.328 MiB |***************************** |
+    40 95.387 MiB |***************************** |
+    50 95.332 MiB |***************************** |
+    60 95.320 MiB |***************************** |
+    70 95.398 MiB |***************************** |
+    80 95.398 MiB |***************************** |
+    90 95.340 MiB |***************************** |
+    100 95.398 MiB |***************************** |
 
 
 Data Access Pattern Aware Memory Management
@@ -120,8 +132,3 @@ accessed for >=60 seconds in your worklo
     $ echo "#min-size max-size min-acc max-acc min-age max-age action" > test_scheme
     $ echo "4K max 0 0 60s max pageout" >> test_scheme
     $ damo schemes -c test_scheme
-
-.. [1] https://damonitor.github.io/doc/html/v17/admin-guide/mm/damon/start.html#visualizing-recorded-patterns
-.. [2] https://damonitor.github.io/test/result/visual/latest/rec.heatmap.1.png.html
-.. [3] https://damonitor.github.io/test/result/visual/latest/rec.wss_sz.png.html
-.. [4] https://damonitor.github.io/test/result/visual/latest/rec.wss_time.png.html

From patchwork Fri Nov 5 20:48:19 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 12605973
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0994DC433F5 for ; Fri, 5 Nov 2021 20:48:22 +0000 (UTC)
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B50FA60E94 for ; Fri, 5 Nov 2021 20:48:21 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org B50FA60E94
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org
Received: by kanga.kvack.org (Postfix) id 564179400FC; Fri, 5 Nov 2021 16:48:21 -0400 (EDT)
Received: by kanga.kvack.org (Postfix, from userid 40) id 514119400FA; Fri, 5 Nov 2021 16:48:21 -0400 (EDT)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042) id 3B4119400FC; Fri, 5 Nov 2021 16:48:21 -0400 (EDT)
X-Delivered-To: linux-mm@kvack.org
Received: from forelay.hostedemail.com (smtprelay0067.hostedemail.com [216.40.44.67]) by kanga.kvack.org (Postfix) with ESMTP id 2C94A9400FA for ; Fri, 5 Nov 2021 16:48:21 -0400 (EDT)
Received: from smtpin15.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id
DBBFB8249980 for ; Fri, 5 Nov 2021 20:48:20 +0000 (UTC) X-FDA: 78776064360.15.629E717 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf23.hostedemail.com (Postfix) with ESMTP id B891B9000093 for ; Fri, 5 Nov 2021 20:48:07 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 7A24A60FBF; Fri, 5 Nov 2021 20:48:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145299; bh=VSPMYTHGdAu3WDcsLpdIv2hrbxnR3UuNEYV8zv3sHmg=; h=Date:From:To:Subject:In-Reply-To:From; b=PkiALZ5487mW6S+8d/nWGz2JEO9j4H8z0lAsk/umvVuEWBdFbd6c5xKeONRdzWH/8 5uE6dxk0uVMcp6bJtD+lCbBSrluR3VHQvCh2HJKwjYlmfd0enOZA5hEab8Db073Ulv DsFsPGACRHtGK8MZ1gHFeKaS7hmPzZ9HJ6rvGFP0= Date: Fri, 05 Nov 2021 13:48:19 -0700 From: Andrew Morton To: akpm@linux-foundation.org, corbet@lwn.net, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 259/262] Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions Message-ID: <20211105204819.v6jIpVfd6%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=PkiALZ54; dmarc=none; spf=pass (imf23.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: B891B9000093 X-Stat-Signature: p89j5tqr9uqqm7wwkgerzjht8jcbmxhq X-HE-Tag: 1636145287-217084 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: SeongJae Park Subject: Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions Some descriptions of page flags in 'pagemap.rst' are written in assumption of none-rst, which respects every new line, as 
below: 7 - SLAB page is managed by the SLAB/SLOB/SLUB/SLQB kernel memory allocator When compound page is used, SLUB/SLQB will only set this flag on the head Because rst ignores the new line between the first sentence and second sentence, resulting html looks a little bit weird, as below. 7 - SLAB page is managed by the SLAB/SLOB/SLUB/SLQB kernel memory allocator When ^ compound page is used, SLUB/SLQB will only set this flag on the head page; SLOB will not flag it at all. This commit makes it more natural and consistent with other parts in the rendered version. Link: https://lkml.kernel.org/r/20211022090311.3856-5-sj@kernel.org Signed-off-by: SeongJae Park Cc: Jonathan Corbet Cc: Peter Xu Signed-off-by: Andrew Morton --- Documentation/admin-guide/mm/pagemap.rst | 53 ++++++++++----------- 1 file changed, 27 insertions(+), 26 deletions(-) --- a/Documentation/admin-guide/mm/pagemap.rst~docs-admin-guide-mm-pagemap-wordsmith-page-flags-descriptions +++ a/Documentation/admin-guide/mm/pagemap.rst @@ -90,13 +90,14 @@ Short descriptions to the page flags ==================================== 0 - LOCKED - page is being locked for exclusive access, e.g. by undergoing read/write IO + The page is being locked for exclusive access, e.g. by undergoing read/write + IO. 7 - SLAB - page is managed by the SLAB/SLOB/SLUB/SLQB kernel memory allocator + The page is managed by the SLAB/SLOB/SLUB/SLQB kernel memory allocator. When compound page is used, SLUB/SLQB will only set this flag on the head page; SLOB will not flag it at all. 10 - BUDDY - a free memory block managed by the buddy system allocator + A free memory block managed by the buddy system allocator. The buddy system organizes free memory in blocks of various orders. An order N block has 2^N physically contiguous pages, with the BUDDY flag set for and _only_ for the first page. @@ -112,65 +113,65 @@ Short descriptions to the page flags 16 - COMPOUND_TAIL A compound page tail (see description above). 
17 - HUGE - this is an integral part of a HugeTLB page + This is an integral part of a HugeTLB page. 19 - HWPOISON - hardware detected memory corruption on this page: don't touch the data! + Hardware detected memory corruption on this page: don't touch the data! 20 - NOPAGE - no page frame exists at the requested address + No page frame exists at the requested address. 21 - KSM - identical memory pages dynamically shared between one or more processes + Identical memory pages dynamically shared between one or more processes. 22 - THP - contiguous pages which construct transparent hugepages + Contiguous pages which construct transparent hugepages. 23 - OFFLINE - page is logically offline + The page is logically offline. 24 - ZERO_PAGE - zero page for pfn_zero or huge_zero page + Zero page for pfn_zero or huge_zero page. 25 - IDLE - page has not been accessed since it was marked idle (see + The page has not been accessed since it was marked idle (see :ref:`Documentation/admin-guide/mm/idle_page_tracking.rst `). Note that this flag may be stale in case the page was accessed via a PTE. To make sure the flag is up-to-date one has to read ``/sys/kernel/mm/page_idle/bitmap`` first. 26 - PGTABLE - page is in use as a page table + The page is in use as a page table. IO related page flags --------------------- 1 - ERROR - IO error occurred + IO error occurred. 3 - UPTODATE - page has up-to-date data + The page has up-to-date data. ie. for file backed page: (in-memory data revision >= on-disk one) 4 - DIRTY - page has been written to, hence contains new data + The page has been written to, hence contains new data. i.e. for file backed page: (in-memory data revision > on-disk one) 8 - WRITEBACK - page is being synced to disk + The page is being synced to disk. LRU related page flags ---------------------- 5 - LRU - page is in one of the LRU lists + The page is in one of the LRU lists. 6 - ACTIVE - page is in the active LRU list + The page is in the active LRU list. 
18 - UNEVICTABLE - page is in the unevictable (non-)LRU list It is somehow pinned and + The page is in the unevictable (non-)LRU list It is somehow pinned and not a candidate for LRU page reclaims, e.g. ramfs pages, - shmctl(SHM_LOCK) and mlock() memory segments + shmctl(SHM_LOCK) and mlock() memory segments. 2 - REFERENCED - page has been referenced since last LRU list enqueue/requeue + The page has been referenced since last LRU list enqueue/requeue. 9 - RECLAIM - page will be reclaimed soon after its pageout IO completed + The page will be reclaimed soon after its pageout IO completed. 11 - MMAP - a memory mapped page + A memory mapped page. 12 - ANON - a memory mapped page that is not part of a file + A memory mapped page that is not part of a file. 13 - SWAPCACHE - page is mapped to swap space, i.e. has an associated swap entry + The page is mapped to swap space, i.e. has an associated swap entry. 14 - SWAPBACKED - page is backed by swap/RAM + The page is backed by swap/RAM. The page-types tool in the tools/vm directory can be used to query the above flags. 
From patchwork Fri Nov 5 20:48:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605975 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B77E3C433FE for ; Fri, 5 Nov 2021 20:48:24 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6E73160174 for ; Fri, 5 Nov 2021 20:48:24 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 6E73160174 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 035169400FD; Fri, 5 Nov 2021 16:48:24 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id F26239400FA; Fri, 5 Nov 2021 16:48:23 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DEF1B9400FD; Fri, 5 Nov 2021 16:48:23 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0024.hostedemail.com [216.40.44.24]) by kanga.kvack.org (Postfix) with ESMTP id CDE909400FA for ; Fri, 5 Nov 2021 16:48:23 -0400 (EDT) Received: from smtpin17.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 97CC6779B1 for ; Fri, 5 Nov 2021 20:48:23 +0000 (UTC) X-FDA: 78776064528.17.125170A Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf19.hostedemail.com (Postfix) with ESMTP id E0DDBB0000AE for ; Fri, 5 Nov 2021 20:48:15 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 65A3F60FBF; Fri, 5 Nov 2021 20:48:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; 
d=linux-foundation.org; s=korg; t=1636145302; bh=WOwthI6oXUs6Iyt1L3v7EFYMr6E5szWJy3emswO6zzY=; h=Date:From:To:Subject:In-Reply-To:From; b=bapDyuhzp5dSanqGs3F82S0t/KW+ewub7TXTVwhDlDLC4ZHBIBmBOylk7uy8bYjUK NZUGko4Eva+0o+NqNRvNoQH6cDjbP6D2Gv8fIfRIKe4jjJ1EhjdFGROhO81v8QvlMW MQQjy2NkxD5XSpDdBqc9v7AL8cguibdQhMkmA12k= Date: Fri, 05 Nov 2021 13:48:22 -0700 From: Andrew Morton To: akpm@linux-foundation.org, changbin.du@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 260/262] mm/damon: simplify stop mechanism Message-ID: <20211105204822.UmpUEwvxA%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: E0DDBB0000AE X-Stat-Signature: 1au6hf4mjjg1ru1q96sj4xck69nb9ncb Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=bapDyuhz; spf=pass (imf19.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145295-574750 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Changbin Du Subject: mm/damon: simplify stop mechanism A kernel thread can exit gracefully with kthread_stop(). So we don't need a new flag 'kdamond_stop'. And to make sure the task struct is not freed when accessing it, get reference to it before termination. 
Link: https://lkml.kernel.org/r/20211027130517.4404-1-changbin.du@gmail.com Signed-off-by: Changbin Du Reviewed-by: SeongJae Park Signed-off-by: Andrew Morton --- include/linux/damon.h | 1 mm/damon/core.c | 51 +++++++++++----------------------------- 2 files changed, 15 insertions(+), 37 deletions(-) --- a/include/linux/damon.h~mm-damon-simplify-stop-mechanism +++ a/include/linux/damon.h @@ -381,7 +381,6 @@ struct damon_ctx { /* public: */ struct task_struct *kdamond; - bool kdamond_stop; struct mutex kdamond_lock; struct damon_primitive primitive; --- a/mm/damon/core.c~mm-damon-simplify-stop-mechanism +++ a/mm/damon/core.c @@ -390,17 +390,6 @@ static unsigned long damon_region_sz_lim return sz; } -static bool damon_kdamond_running(struct damon_ctx *ctx) -{ - bool running; - - mutex_lock(&ctx->kdamond_lock); - running = ctx->kdamond != NULL; - mutex_unlock(&ctx->kdamond_lock); - - return running; -} - static int kdamond_fn(void *data); /* @@ -418,7 +407,6 @@ static int __damon_start(struct damon_ct mutex_lock(&ctx->kdamond_lock); if (!ctx->kdamond) { err = 0; - ctx->kdamond_stop = false; ctx->kdamond = kthread_run(kdamond_fn, ctx, "kdamond.%d", nr_running_ctxs); if (IS_ERR(ctx->kdamond)) { @@ -474,13 +462,15 @@ int damon_start(struct damon_ctx **ctxs, */ static int __damon_stop(struct damon_ctx *ctx) { + struct task_struct *tsk; + mutex_lock(&ctx->kdamond_lock); - if (ctx->kdamond) { - ctx->kdamond_stop = true; + tsk = ctx->kdamond; + if (tsk) { + get_task_struct(tsk); mutex_unlock(&ctx->kdamond_lock); - while (damon_kdamond_running(ctx)) - usleep_range(ctx->sample_interval, - ctx->sample_interval * 2); + kthread_stop(tsk); + put_task_struct(tsk); return 0; } mutex_unlock(&ctx->kdamond_lock); @@ -925,12 +915,8 @@ static bool kdamond_need_update_primitiv static bool kdamond_need_stop(struct damon_ctx *ctx) { struct damon_target *t; - bool stop; - mutex_lock(&ctx->kdamond_lock); - stop = ctx->kdamond_stop; - mutex_unlock(&ctx->kdamond_lock); - if (stop) + if 
(kthread_should_stop()) return true; if (!ctx->primitive.target_valid) @@ -1021,13 +1007,6 @@ static int kdamond_wait_activation(struc return -EBUSY; } -static void set_kdamond_stop(struct damon_ctx *ctx) -{ - mutex_lock(&ctx->kdamond_lock); - ctx->kdamond_stop = true; - mutex_unlock(&ctx->kdamond_lock); -} - /* * The monitoring daemon that runs as a kernel thread */ @@ -1038,17 +1017,18 @@ static int kdamond_fn(void *data) struct damon_region *r, *next; unsigned int max_nr_accesses = 0; unsigned long sz_limit = 0; + bool done = false; pr_debug("kdamond (%d) starts\n", current->pid); if (ctx->primitive.init) ctx->primitive.init(ctx); if (ctx->callback.before_start && ctx->callback.before_start(ctx)) - set_kdamond_stop(ctx); + done = true; sz_limit = damon_region_sz_limit(ctx); - while (!kdamond_need_stop(ctx)) { + while (!kdamond_need_stop(ctx) && !done) { if (kdamond_wait_activation(ctx)) continue; @@ -1056,7 +1036,7 @@ static int kdamond_fn(void *data) ctx->primitive.prepare_access_checks(ctx); if (ctx->callback.after_sampling && ctx->callback.after_sampling(ctx)) - set_kdamond_stop(ctx); + done = true; usleep_range(ctx->sample_interval, ctx->sample_interval + 1); @@ -1069,7 +1049,7 @@ static int kdamond_fn(void *data) sz_limit); if (ctx->callback.after_aggregation && ctx->callback.after_aggregation(ctx)) - set_kdamond_stop(ctx); + done = true; kdamond_apply_schemes(ctx); kdamond_reset_aggregated(ctx); kdamond_split_regions(ctx); @@ -1088,9 +1068,8 @@ static int kdamond_fn(void *data) damon_destroy_region(r, t); } - if (ctx->callback.before_terminate && - ctx->callback.before_terminate(ctx)) - set_kdamond_stop(ctx); + if (ctx->callback.before_terminate) + ctx->callback.before_terminate(ctx); if (ctx->primitive.cleanup) ctx->primitive.cleanup(ctx); From patchwork Fri Nov 5 20:48:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605977 Return-Path: 
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8E3A1C433EF for ; Fri, 5 Nov 2021 20:48:27 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 44FC160174 for ; Fri, 5 Nov 2021 20:48:27 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 44FC160174 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id D7B209400FE; Fri, 5 Nov 2021 16:48:26 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D2A089400FA; Fri, 5 Nov 2021 16:48:26 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BF1809400FE; Fri, 5 Nov 2021 16:48:26 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0067.hostedemail.com [216.40.44.67]) by kanga.kvack.org (Postfix) with ESMTP id B06339400FA for ; Fri, 5 Nov 2021 16:48:26 -0400 (EDT) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 7B14B779B0 for ; Fri, 5 Nov 2021 20:48:26 +0000 (UTC) X-FDA: 78776064612.06.E9EA4AE Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf19.hostedemail.com (Postfix) with ESMTP id D234DB0000A3 for ; Fri, 5 Nov 2021 20:48:18 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 3F34960174; Fri, 5 Nov 2021 20:48:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145305; bh=xsJdjv6ng8sEpgv7jHSwgU/wsI4SvLhKvGQZ2i0Y5jM=; h=Date:From:To:Subject:In-Reply-To:From; b=MXF1LXatueH6X10jjg5ap07oYGkfIgLLKb2Kn8Zhw3gYPSkFgJdViogpsRKqn+5sf 
fN6RRrbpB3mVWyrYFguHyKmScd1Ao8aoDwYEFd9cswbx9XmzI+rWEtkWaGoXidS08J C0TtGxESwDN8WMS89j6+A62Ok/v4dm/UV7T13S5s= Date: Fri, 05 Nov 2021 13:48:24 -0700 From: Andrew Morton To: akpm@linux-foundation.org, colin.i.king@gmail.com, colin.i.king@googlemail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 261/262] mm/damon: fix a few spelling mistakes in comments and a pr_debug message Message-ID: <20211105204824.qNbqcpySO%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: D234DB0000A3 X-Stat-Signature: qpmgo1imziq9dycrd3j5r6ckraconntq Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=MXF1LXat; spf=pass (imf19.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1636145298-966449 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Colin Ian King Subject: mm/damon: fix a few spelling mistakes in comments and a pr_debug message There are a few spelling mistakes in the code. Fix these. 
Link: https://lkml.kernel.org/r/20211028184157.614544-1-colin.i.king@gmail.com Signed-off-by: Colin Ian King Reviewed-by: SeongJae Park Signed-off-by: Andrew Morton --- mm/damon/core.c | 2 +- mm/damon/dbgfs-test.h | 2 +- mm/damon/vaddr-test.h | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) --- a/mm/damon/core.c~mm-damon-fix-a-few-spelling-mistakes-in-comments-and-a-pr_debug-message +++ a/mm/damon/core.c @@ -959,7 +959,7 @@ static unsigned long damos_wmark_wait_us /* higher than high watermark or lower than low watermark */ if (metric > scheme->wmarks.high || scheme->wmarks.low > metric) { if (scheme->wmarks.activated) - pr_debug("inactivate a scheme (%d) for %s wmark\n", + pr_debug("deactivate a scheme (%d) for %s wmark\n", scheme->action, metric > scheme->wmarks.high ? "high" : "low"); --- a/mm/damon/dbgfs-test.h~mm-damon-fix-a-few-spelling-mistakes-in-comments-and-a-pr_debug-message +++ a/mm/damon/dbgfs-test.h @@ -145,7 +145,7 @@ static void damon_dbgfs_test_set_init_re KUNIT_EXPECT_STREQ(test, (char *)buf, expect); } - /* Put invlid inputs and check the return error code */ + /* Put invalid inputs and check the return error code */ for (i = 0; i < ARRAY_SIZE(invalid_inputs); i++) { input = invalid_inputs[i]; pr_info("input: %s\n", input); --- a/mm/damon/vaddr-test.h~mm-damon-fix-a-few-spelling-mistakes-in-comments-and-a-pr_debug-message +++ a/mm/damon/vaddr-test.h @@ -233,7 +233,7 @@ static void damon_test_apply_three_regio * and 70-100) has totally freed and mapped to different area (30-32 and * 65-68). The target regions which were in the old second and third big * regions should now be removed and new target regions covering the new second - * and third big regions should be crated. + * and third big regions should be created. 
*/ static void damon_test_apply_three_regions4(struct kunit *test) { From patchwork Fri Nov 5 20:48:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12605979 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 73751C433EF for ; Fri, 5 Nov 2021 20:48:30 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 2505060174 for ; Fri, 5 Nov 2021 20:48:30 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 2505060174 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id BF32A9400FF; Fri, 5 Nov 2021 16:48:29 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id BA30D9400FA; Fri, 5 Nov 2021 16:48:29 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A921F9400FF; Fri, 5 Nov 2021 16:48:29 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0238.hostedemail.com [216.40.44.238]) by kanga.kvack.org (Postfix) with ESMTP id 9BEF29400FA for ; Fri, 5 Nov 2021 16:48:29 -0400 (EDT) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 5DBDD180279BC for ; Fri, 5 Nov 2021 20:48:29 +0000 (UTC) X-FDA: 78776064738.12.142C1DD Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf26.hostedemail.com (Postfix) with ESMTP id BE30520019DC for ; Fri, 5 Nov 2021 20:48:29 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 2D47060174; Fri, 5 Nov 2021 
20:48:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1636145308; bh=RAT4aEdVL5c3F3z9sf/D12jBWmrVR201Sw4cXQgfHzY=; h=Date:From:To:Subject:In-Reply-To:From; b=BgRIC4e1uFTx+cf8wdU70Vd6Cz136CJLG22ykCmb8kyT0FI47DD2xOvU9y+V1BD2D WvP3jP326/C1t6rzuyLE+dXn2YYA1Q+WZgtm/B63IAsy0f074Og9Xq7q0NqQnVSYww 33fxFlfFSbX+0Zp/wRQFD0MwF1Tl8uuehtS0WDlw= Date: Fri, 05 Nov 2021 13:48:27 -0700 From: Andrew Morton To: akpm@linux-foundation.org, changbin.du@gmail.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, sj@kernel.org, torvalds@linux-foundation.org Subject: [patch 262/262] mm/damon: remove return value from before_terminate callback Message-ID: <20211105204827.Ep_rdwILj%akpm@linux-foundation.org> In-Reply-To: <20211105133408.cccbb98b71a77d5e8430aba1@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf26.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=BgRIC4e1; dmarc=none; spf=pass (imf26.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: BE30520019DC X-Stat-Signature: bjrcn4eieihpn7q4a3ndb4y1f8enbwsx X-HE-Tag: 1636145309-453704 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Changbin Du Subject: mm/damon: remove return value from before_terminate callback Since the return value of 'before_terminate' callback is never used, we make it have no return value. 
Link: https://lkml.kernel.org/r/20211029005023.8895-1-changbin.du@gmail.com
Signed-off-by: Changbin Du
Reviewed-by: SeongJae Park
Signed-off-by: Andrew Morton
---

 include/linux/damon.h | 2 +-
 mm/damon/dbgfs.c      | 5 ++---
 2 files changed, 3 insertions(+), 4 deletions(-)

--- a/include/linux/damon.h~mm-damon-remove-return-value-from-before_terminate-callback
+++ a/include/linux/damon.h
@@ -322,7 +322,7 @@ struct damon_callback {
 	int (*before_start)(struct damon_ctx *context);
 	int (*after_sampling)(struct damon_ctx *context);
 	int (*after_aggregation)(struct damon_ctx *context);
-	int (*before_terminate)(struct damon_ctx *context);
+	void (*before_terminate)(struct damon_ctx *context);
 };
 
 /**
--- a/mm/damon/dbgfs.c~mm-damon-remove-return-value-from-before_terminate-callback
+++ a/mm/damon/dbgfs.c
@@ -645,18 +645,17 @@ static void dbgfs_fill_ctx_dir(struct de
 		debugfs_create_file(file_names[i], 0600, dir, ctx, fops[i]);
 }
 
-static int dbgfs_before_terminate(struct damon_ctx *ctx)
+static void dbgfs_before_terminate(struct damon_ctx *ctx)
 {
 	struct damon_target *t, *next;
 
 	if (!targetid_is_pid(ctx))
-		return 0;
+		return;
 
 	damon_for_each_target_safe(t, next, ctx) {
 		put_pid((struct pid *)t->id);
 		damon_destroy_target(t);
 	}
-	return 0;
 }
 
 static struct damon_ctx *dbgfs_new_ctx(void)