From patchwork Thu Jan 11 18:33:20 2024
X-Patchwork-Submitter: Kairui Song
X-Patchwork-Id: 13517707
From: Kairui Song <ryncsn@gmail.com>
To: linux-mm@kvack.org
Cc: Andrew Morton, Yu Zhao, Chris Li, Matthew Wilcox,
	linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH v2 2/3] mm, lru_gen: move pages in bulk when aging
Date: Fri, 12 Jan 2024 02:33:20 +0800
Message-ID: <20240111183321.19984-3-ryncsn@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240111183321.19984-1-ryncsn@gmail.com>
References: <20240111183321.19984-1-ryncsn@gmail.com>
Reply-To: Kairui Song
From: Kairui Song

Another overhead of aging is page moving. In most cases, pages are
moved to the same gen after folio_inc_gen() is called, especially the
protected pages, so it is better to move them in bulk.

This also has a good effect on LRU order. Currently, when MGLRU ages,
it walks the LRU backwards and moves the protected pages to the tail
of the newer gen one by one, which actually reverses their order in
the LRU. Moving them in batches helps preserve their order, though
only within a small scope, due to the scan limit of MAX_LRU_BATCH
pages. (A simplified sketch of the bulk-move pattern and of the new
list walk is appended after the diff.)

After this commit, a slight performance gain can be seen (with
CONFIG_DEBUG_LIST=n):

Test 1: memcached in a 4G memcg on an EPYC 7K62, with:

  memcached -u nobody -m 16384 -s /tmp/memcached.socket \
    -a 0766 -t 16 -B binary &

  memtier_benchmark -S /tmp/memcached.socket \
    -P memcache_binary -n allkeys \
    --key-minimum=1 --key-maximum=16000000 -d 1024 \
    --ratio=1:0 --key-pattern=P:P -c 2 -t 16 --pipeline 8 -x 6

Average result of 18 test runs:

Before:           44017.78 Ops/sec
After patch 1-2:  44810.01 Ops/sec (+1.8%)

Test 2: MySQL in a 6G memcg, with:

  echo 'set GLOBAL innodb_buffer_pool_size=16106127360;' | \
    mysql -u USER -h localhost --password=PASS

  sysbench /usr/share/sysbench/oltp_read_only.lua \
    --mysql-user=USER --mysql-password=PASS --mysql-db=sb \
    --tables=48 --table-size=2000000 --threads=16 --time=1800 \
    --report-interval=5 run

QPS of 6 test runs:

Before:
134126.83 134352.13 134045.19 133985.12 134787.47 134554.43

After patch 1-2 (+0.4%):
134913.38 134695.35 134891.31 134662.66 135090.32 134901.14

Only about 10% of CPU time is spent in kernel space for the MySQL
test, so the improvement there is very trivial. There could be a
higher performance gain when pages are being protected aggressively.

Signed-off-by: Kairui Song
---
 mm/vmscan.c | 84 ++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 71 insertions(+), 13 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 185d53607c7e..57b6549946c3 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3120,9 +3120,46 @@ static int folio_update_gen(struct folio *folio, int gen)
  */
 struct gen_update_batch {
 	int delta[MAX_NR_GENS];
+	struct folio *head, *tail;
 };
 
-static void lru_gen_update_batch(struct lruvec *lruvec, int type, int zone,
+static inline void lru_gen_inc_bulk_finish(struct lru_gen_folio *lrugen,
+					   int bulk_gen, bool type, int zone,
+					   struct gen_update_batch *batch)
+{
+	if (!batch->head)
+		return;
+
+	list_bulk_move_tail(&lrugen->folios[bulk_gen][type][zone],
+			    &batch->head->lru,
+			    &batch->tail->lru);
+
+	batch->head = NULL;
+}
+
+/*
+ * When aging, protected pages will go to the tail of the same higher
+ * gen, so they can be moved in batches. Besides reducing overhead, this
+ * also avoids changing their LRU order in a small scope.
+ */
+static inline void lru_gen_try_inc_bulk(struct lru_gen_folio *lrugen, struct folio *folio,
+					int bulk_gen, int gen, bool type, int zone,
+					struct gen_update_batch *batch)
+{
+	/*
+	 * If the folio is not moving to the bulk_gen, it raced with promotion,
+	 * so it needs to go to the head of another LRU.
+	 */
+	if (bulk_gen != gen)
+		list_move(&folio->lru, &lrugen->folios[gen][type][zone]);
+
+	if (!batch->head)
+		batch->tail = folio;
+
+	batch->head = folio;
+}
+
+static void lru_gen_update_batch(struct lruvec *lruvec, int bulk_gen, int type, int zone,
 				 struct gen_update_batch *batch)
 {
 	int gen;
@@ -3130,6 +3167,8 @@ static void lru_gen_update_batch(struct lruvec *lruvec, int type, int zone,
 	struct lru_gen_folio *lrugen = &lruvec->lrugen;
 	enum lru_list lru = type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
 
+	lru_gen_inc_bulk_finish(lrugen, bulk_gen, type, zone, batch);
+
 	for (gen = 0; gen < MAX_NR_GENS; gen++) {
 		int delta = batch->delta[gen];
 
@@ -3714,6 +3753,7 @@ static bool inc_min_seq(struct lruvec *lruvec, int type, bool can_swap)
 	struct gen_update_batch batch = { };
 	struct lru_gen_folio *lrugen = &lruvec->lrugen;
 	int new_gen, old_gen = lru_gen_from_seq(lrugen->min_seq[type]);
+	int bulk_gen = (old_gen + 1) % MAX_NR_GENS;
 
 	if (type == LRU_GEN_ANON && !can_swap)
 		goto done;
@@ -3721,24 +3761,33 @@ static bool inc_min_seq(struct lruvec *lruvec, int type, bool can_swap)
 	/* prevent cold/hot inversion if force_scan is true */
 	for (zone = 0; zone < MAX_NR_ZONES; zone++) {
 		struct list_head *head = &lrugen->folios[old_gen][type][zone];
+		struct folio *prev = NULL;
 
-		while (!list_empty(head)) {
-			struct folio *folio = lru_to_folio(head);
+		if (!list_empty(head))
+			prev = lru_to_folio(head);
+		while (prev) {
+			struct folio *folio = prev;
 
 			VM_WARN_ON_ONCE_FOLIO(folio_test_unevictable(folio), folio);
 			VM_WARN_ON_ONCE_FOLIO(folio_test_active(folio), folio);
 			VM_WARN_ON_ONCE_FOLIO(folio_is_file_lru(folio) != type, folio);
 			VM_WARN_ON_ONCE_FOLIO(folio_zonenum(folio) != zone, folio);
 
+			if (unlikely(list_is_first(&folio->lru, head)))
+				prev = NULL;
+			else
+				prev = lru_to_folio(&folio->lru);
+
 			new_gen = folio_inc_gen(lruvec, folio, false, &batch);
-			list_move_tail(&folio->lru, &lrugen->folios[new_gen][type][zone]);
+			lru_gen_try_inc_bulk(lrugen, folio, bulk_gen, new_gen, type, zone, &batch);
 
 			if (!--remaining) {
-				lru_gen_update_batch(lruvec, type, zone, &batch);
+				lru_gen_update_batch(lruvec, bulk_gen, type, zone, &batch);
 				return false;
 			}
 		}
-		lru_gen_update_batch(lruvec, type, zone, &batch);
+
+		lru_gen_update_batch(lruvec, bulk_gen, type, zone, &batch);
 	}
 done:
 	reset_ctrl_pos(lruvec, type, true);
@@ -4258,7 +4307,7 @@ void lru_gen_soft_reclaim(struct mem_cgroup *memcg, int nid)
  ******************************************************************************/
 
 static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_control *sc,
-		       int tier_idx, struct gen_update_batch *batch)
+		       int tier_idx, int bulk_gen, struct gen_update_batch *batch)
 {
 	bool success;
 	int gen = folio_lru_gen(folio);
@@ -4301,7 +4350,7 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c
 		int hist = lru_hist_from_seq(lrugen->min_seq[type]);
 
 		gen = folio_inc_gen(lruvec, folio, false, batch);
-		list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
+		lru_gen_try_inc_bulk(lrugen, folio, bulk_gen, gen, type, zone, batch);
 
 		WRITE_ONCE(lrugen->protected[hist][type][tier - 1],
 			   lrugen->protected[hist][type][tier - 1] + delta);
@@ -4311,7 +4360,7 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c
 	/* ineligible */
 	if (zone > sc->reclaim_idx || skip_cma(folio, sc)) {
 		gen = folio_inc_gen(lruvec, folio, false, batch);
-		list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
+		lru_gen_try_inc_bulk(lrugen, folio, bulk_gen, gen, type, zone, batch);
 		return true;
 	}
 
@@ -4385,11 +4434,16 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
 		LIST_HEAD(moved);
 		int skipped_zone = 0;
 		struct gen_update_batch batch = { };
+		int bulk_gen = (gen + 1) % MAX_NR_GENS;
 		int zone = (sc->reclaim_idx + i) % MAX_NR_ZONES;
 		struct list_head *head = &lrugen->folios[gen][type][zone];
+		struct folio *prev = NULL;
 
-		while (!list_empty(head)) {
-			struct folio *folio = lru_to_folio(head);
+		if (!list_empty(head))
+			prev = lru_to_folio(head);
+
+		while (prev) {
+			struct folio *folio = prev;
 			int delta = folio_nr_pages(folio);
 
 			VM_WARN_ON_ONCE_FOLIO(folio_test_unevictable(folio), folio);
@@ -4398,8 +4452,12 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
 			VM_WARN_ON_ONCE_FOLIO(folio_zonenum(folio) != zone, folio);
 
 			scanned += delta;
+			if (unlikely(list_is_first(&folio->lru, head)))
+				prev = NULL;
+			else
+				prev = lru_to_folio(&folio->lru);
 
-			if (sort_folio(lruvec, folio, sc, tier, &batch))
+			if (sort_folio(lruvec, folio, sc, tier, bulk_gen, &batch))
 				sorted += delta;
 			else if (isolate_folio(lruvec, folio, sc)) {
 				list_add(&folio->lru, list);
@@ -4419,7 +4477,7 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
 			skipped += skipped_zone;
 		}
 
-		lru_gen_update_batch(lruvec, type, zone, &batch);
+		lru_gen_update_batch(lruvec, bulk_gen, type, zone, &batch);
 
 		if (!remaining || isolated >= MIN_LRU_BATCH)
 			break;
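
For illustration only (not part of the patch): below is a minimal
userspace sketch of the bulk-move idea described in the commit log.
The node type and the list helpers, including bulk_move_tail(), are
simplified stand-ins invented for this sketch; they only mimic the
semantics of the kernel's struct list_head and list_bulk_move_tail().
The point is that a contiguous run of entries remembered as a
[head, tail] range can be spliced to the tail of the target list in
one operation, keeping the entries' relative order, whereas moving
them one by one during a backwards walk would reverse it.

/*
 * Userspace sketch of the batching done by lru_gen_try_inc_bulk() and
 * lru_gen_inc_bulk_finish(): record the run boundaries while walking
 * the source list backwards, then splice the whole run at once.
 */
#include <stdio.h>

struct node {
	struct node *prev, *next;
	int id;				/* stands in for a folio */
};

static void list_init(struct node *sentinel)
{
	sentinel->prev = sentinel->next = sentinel;
}

static void list_add_tail(struct node *sentinel, struct node *n)
{
	n->prev = sentinel->prev;
	n->next = sentinel;
	sentinel->prev->next = n;
	sentinel->prev = n;
}

/* Splice the already-linked run [first, last] to the tail of dst. */
static void bulk_move_tail(struct node *dst, struct node *first, struct node *last)
{
	/* unlink the run from its current list */
	first->prev->next = last->next;
	last->next->prev = first->prev;

	/* attach it before dst's sentinel, preserving internal order */
	first->prev = dst->prev;
	last->next = dst;
	first->prev->next = first;
	dst->prev = last;
}

int main(void)
{
	struct node old_gen, new_gen, nodes[5];
	struct node *batch_head = NULL, *batch_tail = NULL;

	list_init(&old_gen);
	list_init(&new_gen);
	for (int i = 0; i < 5; i++) {
		nodes[i].id = i;
		list_add_tail(&old_gen, &nodes[i]);
	}

	/*
	 * Walk old_gen from its tail, like inc_min_seq() does, but only
	 * record the run boundaries instead of moving each node.
	 */
	for (struct node *n = old_gen.prev; n != &old_gen; n = n->prev) {
		if (!batch_head)
			batch_tail = n;	/* first node seen becomes the run's tail */
		batch_head = n;		/* later nodes extend the run toward the head */
	}

	if (batch_head)
		bulk_move_tail(&new_gen, batch_head, batch_tail);

	printf("new_gen order:");
	for (struct node *n = new_gen.next; n != &new_gen; n = n->next)
		printf(" %d", n->id);
	printf("\n");	/* prints 0 1 2 3 4: original order kept */
	return 0;
}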
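
Also for illustration only: the scan loops in inc_min_seq() and
scan_folios() switch from while (!list_empty(head)) to a walk over
saved prev pointers because a batched folio now stays on the old list
until the final bulk move, so lru_to_folio(head) would keep returning
the same folio. The sketch below shows just that traversal pattern;
the node type is again a toy stand-in and process() is a hypothetical
placeholder for the sort/batch decision, not a kernel function.

/*
 * "Fetch prev before touching the current node" traversal: the next
 * node to visit is chosen before the current one may be moved away or
 * intentionally left in place.
 */
#include <stdio.h>

struct node {
	struct node *prev, *next;
	int id;
};

static void process(struct node *n)
{
	/* placeholder for the sort_folio()/lru_gen_try_inc_bulk() decision */
	printf("visited %d\n", n->id);
}

static void walk_backwards(struct node *head)	/* head is the list sentinel */
{
	struct node *prev = NULL;

	if (head->prev != head)		/* !list_empty(head) */
		prev = head->prev;	/* tail-most node, as lru_to_folio(head) */

	while (prev) {
		struct node *n = prev;

		/* pick the next node before n is possibly moved away */
		if (n->prev == head)	/* list_is_first(): n is the head-most node */
			prev = NULL;
		else
			prev = n->prev;

		process(n);
	}
}

int main(void)
{
	struct node head, nodes[3];

	head.prev = head.next = &head;
	for (int i = 0; i < 3; i++) {	/* build: head <-> 0 <-> 1 <-> 2 <-> head */
		nodes[i].id = i;
		nodes[i].prev = head.prev;
		nodes[i].next = &head;
		head.prev->next = &nodes[i];
		head.prev = &nodes[i];
	}

	walk_backwards(&head);	/* visits 2, 1, 0 even if nodes stay in place */
	return 0;
}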