From patchwork Tue Sep 8 20:10:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 11764389 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0BC56618 for ; Tue, 8 Sep 2020 20:11:16 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B8A1420936 for ; Tue, 8 Sep 2020 20:11:15 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="fmJwwx/y" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B8A1420936 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id DC4FC6B0071; Tue, 8 Sep 2020 16:11:14 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id D25E26B0072; Tue, 8 Sep 2020 16:11:14 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BEF076B0073; Tue, 8 Sep 2020 16:11:14 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0180.hostedemail.com [216.40.44.180]) by kanga.kvack.org (Postfix) with ESMTP id A348E6B0071 for ; Tue, 8 Sep 2020 16:11:14 -0400 (EDT) Received: from smtpin20.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 6639A180AD80F for ; Tue, 8 Sep 2020 20:11:14 +0000 (UTC) X-FDA: 77240988468.20.ink00_571508b270d7 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin20.hostedemail.com (Postfix) with ESMTP id 2E38E180C07AB for ; Tue, 8 Sep 2020 20:11:14 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,david@redhat.com,,RULES_HIT:30029:30051:30054:30064:30090,0,RBL:207.211.31.81:@redhat.com:.lbl8.mailshell.net-62.18.0.100 66.10.201.10;04yrqc4ipd8sex7ju8igffba184dkycpuh38zbdbwcy6t7e4zas8ts4rr3tsi7u.pfeuqh16q1z6d8h6wmxq35e1ffi64roy8mmrdpnkwuj9ja6hgywxk5jnp5jsaus.r-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:25,LUA_SUMMARY:none X-HE-Tag: ink00_571508b270d7 X-Filterd-Recvd-Size: 9925 Received: from us-smtp-delivery-1.mimecast.com (us-smtp-2.mimecast.com [207.211.31.81]) by imf19.hostedemail.com (Postfix) with ESMTP for ; Tue, 8 Sep 2020 20:11:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1599595871; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XWGa3qSfN9Odfx6sHlIPwO9T/8fSKtz1FdQi3pBhRQ0=; b=fmJwwx/yLiiDaUw2tz0cH2djPZWOd+UwB8ClaG1Np5FgNH+2tb/EQ0iWHsA7/KCicJuQQP dYlWRgt0RtIHuBcd7pZ44DNLhCMqNz6q4QyOm/1F026N2OFG59n8eMBQCnwfilGycJExm5 IugipTAEMIM2vc6wsut1Myf7OpYpoQY= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-217-ugEopWE8Oy6O8LRJQsXXwg-1; Tue, 08 Sep 2020 16:11:07 -0400 X-MC-Unique: ugEopWE8Oy6O8LRJQsXXwg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 4701F1007465; Tue, 8 Sep 2020 20:11:04 +0000 (UTC) Received: from t480s.redhat.com (ovpn-115-46.ams2.redhat.com [10.36.115.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4E9B75D9E8; Tue, 8 Sep 2020 20:10:56 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: virtualization@lists.linux-foundation.org, linux-mm@kvack.org, linux-hyperv@vger.kernel.org, xen-devel@lists.xenproject.org, linux-acpi@vger.kernel.org, linux-nvdimm@lists.01.org, linux-s390@vger.kernel.org, Andrew Morton , David Hildenbrand , Michal Hocko , Dan Williams , Jason Gunthorpe , Kees Cook , Ard Biesheuvel , Thomas Gleixner , "K. Y. Srinivasan" , Haiyang Zhang , Stephen Hemminger , Wei Liu , Boris Ostrovsky , Juergen Gross , Stefano Stabellini , =?utf-8?q?Roger_Pau_Monn=C3=A9?= , Julien Grall , Pankaj Gupta , Baoquan He , Wei Yang Subject: [PATCH v2 4/7] mm/memory_hotplug: MEMHP_MERGE_RESOURCE to specify merging of System RAM resources Date: Tue, 8 Sep 2020 22:10:09 +0200 Message-Id: <20200908201012.44168-5-david@redhat.com> In-Reply-To: <20200908201012.44168-1-david@redhat.com> References: <20200908201012.44168-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Rspamd-Queue-Id: 2E38E180C07AB X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam02 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Some add_memory*() users add memory in small, contiguous memory blocks. Examples include virtio-mem, hyper-v balloon, and the XEN balloon. This can quickly result in a lot of memory resources, whereby the actual resource boundaries are not of interest (e.g., it might be relevant for DIMMs, exposed via /proc/iomem to user space). We really want to merge added resources in this scenario where possible. Let's provide a flag (MEMHP_MERGE_RESOURCE) to specify that a resource either created within add_memory*() or passed via add_memory_resource() shall be marked mergeable and merged with applicable siblings. To implement that, we need a kernel/resource interface to mark selected System RAM resources mergeable (IORESOURCE_SYSRAM_MERGEABLE) and trigger merging. Note: We really want to merge after the whole operation succeeded, not directly when adding a resource to the resource tree (it would break add_memory_resource() and require splitting resources again when the operation failed - e.g., due to -ENOMEM). Cc: Andrew Morton Cc: Michal Hocko Cc: Dan Williams Cc: Jason Gunthorpe Cc: Kees Cook Cc: Ard Biesheuvel Cc: Thomas Gleixner Cc: "K. Y. Srinivasan" Cc: Haiyang Zhang Cc: Stephen Hemminger Cc: Wei Liu Cc: Boris Ostrovsky Cc: Juergen Gross Cc: Stefano Stabellini Cc: Roger Pau Monné Cc: Julien Grall Cc: Pankaj Gupta Cc: Baoquan He Cc: Wei Yang Signed-off-by: David Hildenbrand --- include/linux/ioport.h | 4 +++ include/linux/memory_hotplug.h | 9 +++++ kernel/resource.c | 60 ++++++++++++++++++++++++++++++++++ mm/memory_hotplug.c | 7 ++++ 4 files changed, 80 insertions(+) diff --git a/include/linux/ioport.h b/include/linux/ioport.h index d7620d7c941a0..7e61389dcb017 100644 --- a/include/linux/ioport.h +++ b/include/linux/ioport.h @@ -60,6 +60,7 @@ struct resource { /* IORESOURCE_SYSRAM specific bits. */ #define IORESOURCE_SYSRAM_DRIVER_MANAGED 0x02000000 /* Always detected via a driver. */ +#define IORESOURCE_SYSRAM_MERGEABLE 0x04000000 /* Resource can be merged. */ #define IORESOURCE_EXCLUSIVE 0x08000000 /* Userland may not map this resource */ @@ -253,6 +254,9 @@ extern void __release_region(struct resource *, resource_size_t, extern void release_mem_region_adjustable(struct resource *, resource_size_t, resource_size_t); #endif +#ifdef CONFIG_MEMORY_HOTPLUG +extern void merge_system_ram_resource(struct resource *res); +#endif /* Wrappers for managed devices */ struct device; diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h index 5cd48332ce119..feb4aac03f2eb 100644 --- a/include/linux/memory_hotplug.h +++ b/include/linux/memory_hotplug.h @@ -68,6 +68,15 @@ struct mhp_params { pgprot_t pgprot; }; +/* Flags used for add_memory() and friends. */ + +/* + * Allow merging of the added System RAM resource with adjacent, mergeable + * resources. After a successful call to add_memory_resource() with this flag + * set, the resource pointer must no longer be used as it might be stale. + */ +#define MEMHP_MERGE_RESOURCE 1 + /* * Zone resizing functions * diff --git a/kernel/resource.c b/kernel/resource.c index 36b3552210120..7a91b935f4c20 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -1363,6 +1363,66 @@ void release_mem_region_adjustable(struct resource *parent, } #endif /* CONFIG_MEMORY_HOTREMOVE */ +#ifdef CONFIG_MEMORY_HOTPLUG +static bool system_ram_resources_mergeable(struct resource *r1, + struct resource *r2) +{ + /* We assume either r1 or r2 is IORESOURCE_SYSRAM_MERGEABLE. */ + return r1->flags == r2->flags && r1->end + 1 == r2->start && + r1->name == r2->name && r1->desc == r2->desc && + !r1->child && !r2->child; +} + +/* + * merge_system_ram_resource - mark the System RAM resource mergeable and try to + * merge it with adjacent, mergeable resources + * @res: resource descriptor + * + * This interface is intended for memory hotplug, whereby lots of contiguous + * system ram resources are added (e.g., via add_memory*()) by a driver, and + * the actual resource boundaries are not of interest (e.g., it might be + * relevant for DIMMs). Only resources that are marked mergeable, that have the + * same parent, and that don't have any children are considered. All mergeable + * resources must be immutable during the request. + * + * Note: + * - The caller has to make sure that no pointers to resources that are + * marked mergeable are used anymore after this call - the resource might + * be freed and the pointer might be stale! + * - release_mem_region_adjustable() will split on demand on memory hotunplug + */ +void merge_system_ram_resource(struct resource *res) +{ + const unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; + struct resource *cur; + + if (WARN_ON_ONCE((res->flags & flags) != flags)) + return; + + write_lock(&resource_lock); + res->flags |= IORESOURCE_SYSRAM_MERGEABLE; + + /* Try to merge with next item in the list. */ + cur = res->sibling; + if (cur && system_ram_resources_mergeable(res, cur)) { + res->end = cur->end; + res->sibling = cur->sibling; + free_resource(cur); + } + + /* Try to merge with previous item in the list. */ + cur = res->parent->child; + while (cur && cur->sibling != res) + cur = cur->sibling; + if (cur && system_ram_resources_mergeable(cur, res)) { + cur->end = res->end; + cur->sibling = res->sibling; + free_resource(res); + } + write_unlock(&resource_lock); +} +#endif /* CONFIG_MEMORY_HOTPLUG */ + /* * Managed region resource */ diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 64b07f006bc10..f62c2a46df8b1 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1103,6 +1103,13 @@ int __ref add_memory_resource(int nid, struct resource *res, /* device_online() will take the lock when calling online_pages() */ mem_hotplug_done(); + /* + * In case we're allowed to merge the resource, flag it and trigger + * merging now that adding succeeded. + */ + if (flags & MEMHP_MERGE_RESOURCE) + merge_system_ram_resource(res); + /* online pages if requested */ if (memhp_default_online_type != MMOP_OFFLINE) walk_memory_blocks(start, size, NULL, online_memory_block);