From patchwork Wed Jul 26 15:07:05 2023
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 13328262
Date: Wed, 26 Jul 2023 11:07:05 -0400
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton
Cc: Vlastimil Babka, Mel Gorman, Roman Gushchin, Rik van Riel,
	Joonsoo Kim, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2] mm: page_alloc: consume available CMA space first
Message-ID: <20230726150705.GA1365610@cmpxchg.org>
References: <20230726145304.1319046-1-hannes@cmpxchg.org>
In-Reply-To: <20230726145304.1319046-1-hannes@cmpxchg.org>

On a memcache setup with heavy anon usage and no swap, we routinely
see premature OOM kills with multiple gigabytes of free space left:

    Node 0 Normal free:4978632kB [...] free_cma:4893276kB

This free space turns out to be CMA. We set CMA regions aside for
potential hugetlb users on all of our machines, figuring that even if
there aren't any, the memory is available to userspace allocations.

When the OOMs trigger, it's from unmovable and reclaimable allocations
that aren't allowed to dip into CMA. The non-CMA regions meanwhile are
dominated by the anon pages.

Movable pages can be migrated out of CMA when necessary, but we don't
have a mechanism to migrate them *into* CMA to make room for unmovable
allocations. The only recourse we have for these pages is reclaim,
which due to a lack of swap is unavailable in our case.

Because we have more options for CMA pages, change the policy to
always fill up CMA first. This reduces the risk of premature OOMs.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/page_alloc.c | 53 +++++++++++++++++++------------------------------
 1 file changed, 20 insertions(+), 33 deletions(-)

I realized shortly after sending the first version that the code can be
further simplified by removing __rmqueue_cma_fallback() altogether.

Build, boot and runtime tested that CMA is indeed used up first.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7d3460c7a480..b257f9651ce9 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1634,17 +1634,6 @@ static int fallbacks[MIGRATE_TYPES][MIGRATE_PCPTYPES - 1] = {
 	[MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE, MIGRATE_MOVABLE },
 };
 
-#ifdef CONFIG_CMA
-static __always_inline struct page *__rmqueue_cma_fallback(struct zone *zone,
-					unsigned int order)
-{
-	return __rmqueue_smallest(zone, order, MIGRATE_CMA);
-}
-#else
-static inline struct page *__rmqueue_cma_fallback(struct zone *zone,
-					unsigned int order) { return NULL; }
-#endif
-
 /*
  * Move the free pages in a range to the freelist tail of the requested type.
  * Note that start_page and end_pages are not aligned on a pageblock
@@ -2124,29 +2113,27 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
 {
 	struct page *page;
 
-	if (IS_ENABLED(CONFIG_CMA)) {
-		/*
-		 * Balance movable allocations between regular and CMA areas by
-		 * allocating from CMA when over half of the zone's free memory
-		 * is in the CMA area.
-		 */
-		if (alloc_flags & ALLOC_CMA &&
-		    zone_page_state(zone, NR_FREE_CMA_PAGES) >
-		    zone_page_state(zone, NR_FREE_PAGES) / 2) {
-			page = __rmqueue_cma_fallback(zone, order);
-			if (page)
-				return page;
-		}
+#ifdef CONFIG_CMA
+	/*
+	 * Use up CMA first. Movable pages can be migrated out of CMA
+	 * if necessary, but they cannot migrate into it to make room
+	 * for unmovables elsewhere. The only recourse for them is
+	 * then reclaim, which might be unavailable without swap. We
+	 * want to reduce the risk of OOM with free CMA space left.
+	 */
+	if (alloc_flags & ALLOC_CMA) {
+		page = __rmqueue_smallest(zone, order, MIGRATE_CMA);
+		if (page)
+			return page;
 	}
-retry:
-	page = __rmqueue_smallest(zone, order, migratetype);
-	if (unlikely(!page)) {
-		if (alloc_flags & ALLOC_CMA)
-			page = __rmqueue_cma_fallback(zone, order);
-
-		if (!page && __rmqueue_fallback(zone, order, migratetype,
-						alloc_flags))
-			goto retry;
+#endif
+
+	for (;;) {
+		page = __rmqueue_smallest(zone, order, migratetype);
+		if (page)
+			break;
+		if (!__rmqueue_fallback(zone, order, migratetype, alloc_flags))
+			break;
 	}
 	return page;
 }