From patchwork Tue Jun 22 07:49:25 2021
X-Patchwork-Submitter: Gavin Shan
X-Patchwork-Id: 12336461
From: Gavin Shan <gshan@redhat.com>
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, alexander.duyck@gmail.com,
 david@redhat.com, mst@redhat.com, akpm@linux-foundation.org,
 anshuman.khandual@arm.com, catalin.marinas@arm.com, will@kernel.org,
 shan.gavin@gmail.com
Subject: [PATCH v2 2/3] mm/page_reporting: Allow driver to specify threshold
Date: Tue, 22 Jun 2021 15:49:25 +0800
Message-Id: <20210622074926.333223-3-gshan@redhat.com>
In-Reply-To: <20210622074926.333223-1-gshan@redhat.com>
References: <20210622074926.333223-1-gshan@redhat.com>
MIME-Version: 1.0
The page reporting threshold is currently fixed to @pageblock_order, so
page reporting is never triggered unless a freed page joins a free area
of at least that order. The situation becomes worse when system memory
is heavily fragmented. For example, the following configuration is used
on ARM64 when the 64KB base page size is enabled:

  PAGE_SIZE:       64KB
  HPAGE_SIZE:      512MB
  pageblock_order: 13 (512MB)
  MAX_ORDER:       14

In this specific case, page reporting isn't triggered until a freed page
comes up with a 512MB free area. That's hard to meet, especially when
the system memory becomes heavily fragmented.

This allows the driver to specify the threshold when the page reporting
device is registered. The threshold falls back to @pageblock_order if
it's not specified by the driver. The existing users (hv_balloon and
virtio_balloon) don't specify a threshold, so @pageblock_order is still
taken as their page reporting order and no functional change is
introduced.
Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 include/linux/page_reporting.h |  3 +++
 mm/page_reporting.c            | 14 ++++++++++----
 mm/page_reporting.h            | 10 ++--------
 3 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/include/linux/page_reporting.h b/include/linux/page_reporting.h
index 3b99e0ec24f2..fe648dfa3a7c 100644
--- a/include/linux/page_reporting.h
+++ b/include/linux/page_reporting.h
@@ -18,6 +18,9 @@ struct page_reporting_dev_info {
 
 	/* Current state of page reporting */
 	atomic_t state;
+
+	/* Minimal order of page reporting */
+	unsigned int order;
 };
 
 /* Tear-down and bring-up for page reporting devices */
diff --git a/mm/page_reporting.c b/mm/page_reporting.c
index df9c5054e1b4..27670360bae6 100644
--- a/mm/page_reporting.c
+++ b/mm/page_reporting.c
@@ -47,7 +47,7 @@ __page_reporting_request(struct page_reporting_dev_info *prdev)
 }
 
 /* notify prdev of free page reporting request */
-void __page_reporting_notify(void)
+void __page_reporting_notify(unsigned int order)
 {
 	struct page_reporting_dev_info *prdev;
 
@@ -58,7 +58,7 @@ void __page_reporting_notify(void)
 	 */
 	rcu_read_lock();
 	prdev = rcu_dereference(pr_dev_info);
-	if (likely(prdev))
+	if (likely(prdev && order >= prdev->order))
 		__page_reporting_request(prdev);
 
 	rcu_read_unlock();
@@ -229,7 +229,7 @@ page_reporting_process_zone(struct page_reporting_dev_info *prdev,
 
 	/* Generate minimum watermark to be able to guarantee progress */
 	watermark = low_wmark_pages(zone) +
-		    (PAGE_REPORTING_CAPACITY << PAGE_REPORTING_MIN_ORDER);
+		    (PAGE_REPORTING_CAPACITY << prdev->order);
 
 	/*
 	 * Cancel request if insufficient free memory or if we failed
@@ -239,7 +239,7 @@ page_reporting_process_zone(struct page_reporting_dev_info *prdev,
 		return err;
 
 	/* Process each free list starting from lowest order/mt */
-	for (order = PAGE_REPORTING_MIN_ORDER; order < MAX_ORDER; order++) {
+	for (order = prdev->order; order < MAX_ORDER; order++) {
 		for (mt = 0; mt < MIGRATE_TYPES; mt++) {
 			/* We do not pull pages from the isolate free list */
 			if (is_migrate_isolate(mt))
@@ -324,6 +324,12 @@ int page_reporting_register(struct page_reporting_dev_info *prdev)
 		goto err_out;
 	}
 
+	/*
+	 * We need to choose the minimal order of page reporting if it's
+	 * not specified by the driver.
+	 */
+	prdev->order = prdev->order ? prdev->order : pageblock_order;
+
 	/* initialize state and work structures */
 	atomic_set(&prdev->state, PAGE_REPORTING_IDLE);
 	INIT_DELAYED_WORK(&prdev->work, &page_reporting_process);
diff --git a/mm/page_reporting.h b/mm/page_reporting.h
index 2c385dd4ddbd..d9f972e72649 100644
--- a/mm/page_reporting.h
+++ b/mm/page_reporting.h
@@ -10,11 +10,9 @@
 #include <linux/pgtable.h>
 #include <linux/scatterlist.h>
 
-#define PAGE_REPORTING_MIN_ORDER	pageblock_order
-
 #ifdef CONFIG_PAGE_REPORTING
 DECLARE_STATIC_KEY_FALSE(page_reporting_enabled);
-void __page_reporting_notify(void);
+void __page_reporting_notify(unsigned int order);
 
 static inline bool page_reported(struct page *page)
 {
@@ -37,12 +35,8 @@ static inline void page_reporting_notify_free(unsigned int order)
 	if (!static_branch_unlikely(&page_reporting_enabled))
 		return;
 
-	/* Determine if we have crossed reporting threshold */
-	if (order < PAGE_REPORTING_MIN_ORDER)
-		return;
-
 	/* This will add a few cycles, but should be called infrequently */
-	__page_reporting_notify();
+	__page_reporting_notify(order);
 }
 #else /* CONFIG_PAGE_REPORTING */
 #define page_reported(_page)	false
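For reference, a hypothetical driver would use the new field like this (a kernel-context sketch against the interface changed above; the foo_* names and the order value are illustrative, not part of this patch):

```c
#include <linux/page_reporting.h>
#include <linux/scatterlist.h>

/* Hypothetical report callback: hand the free ranges in @sgl back to
 * the hypervisor or device, then return 0 on success. */
static int foo_report(struct page_reporting_dev_info *prdev,
		      struct scatterlist *sgl, unsigned int nents)
{
	/* ... device-specific reporting ... */
	return 0;
}

static struct page_reporting_dev_info foo_prdev = {
	.report = foo_report,
	/*
	 * Report free areas of order 5 and up (2MB with 64KB base pages).
	 * Leaving .order as 0 keeps the old pageblock_order behaviour.
	 */
	.order = 5,
};

/* In the driver's probe path: */
	err = page_reporting_register(&foo_prdev);
```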