From patchwork Mon Feb 4 14:44:04 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Michael S. Tsirkin"
X-Patchwork-Id: 10795831
Date: Mon, 4 Feb 2019 09:44:04 -0500
From: "Michael S. Tsirkin"
To: qemu-devel@nongnu.org
Message-ID: <20190204142638.27021-24-mst@redhat.com>
References: <20190204142638.27021-1-mst@redhat.com>
Content-Disposition: inline
In-Reply-To: <20190204142638.27021-1-mst@redhat.com>
X-Mailer: git-send-email 2.17.1.1206.gb667731e2e.dirty
Subject: [Qemu-devel] [PULL 23/25] mmap-alloc: fix hugetlbfs misaligned
 length in ppc64
Cc: Peter Maydell, Peter Crosthwaite, Greg Kurz,
 Murilo Opsfelder Araujo, Paolo Bonzini, Richard Henderson

From: Murilo Opsfelder Araujo

The commit 7197fb4058bcb68986bae2bb2c04d6370f3e7218 ("util/mmap-alloc:
fix hugetlb support on ppc64") fixed Huge TLB mappings on ppc64.

However, we still need to consider the underlying huge page size
during munmap() because it requires that both address and length be a
multiple of the underlying huge page size for Huge TLB mappings.
Quote from "Huge page (Huge TLB) mappings" paragraph under NOTES
section of the munmap(2) manual:

  "For munmap(), addr and length must both be a multiple of the
  underlying huge page size."

On ppc64, the munmap() in qemu_ram_munmap() does not work for Huge TLB
mappings because the mapped segment can be aligned with the underlying
huge page size, not aligned with the native system page size, as
returned by getpagesize().

This has the side effect of not releasing huge pages back to the pool
after a hugetlbfs file-backed memory device is hot-unplugged.

This patch fixes the situation in qemu_ram_mmap() and
qemu_ram_munmap() by considering the underlying page size on ppc64.
After this patch, memory hot-unplug releases huge pages back to the
pool.

Fixes: 7197fb4058bcb68986bae2bb2c04d6370f3e7218
Signed-off-by: Murilo Opsfelder Araujo
Reviewed-by: Michael S. Tsirkin
Signed-off-by: Michael S. Tsirkin
Reviewed-by: Greg Kurz
---
 include/qemu/mmap-alloc.h |  2 +-
 exec.c                    |  4 ++--
 util/mmap-alloc.c         | 22 ++++++++++++++++------
 util/oslib-posix.c        |  2 +-
 4 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/include/qemu/mmap-alloc.h b/include/qemu/mmap-alloc.h
index 50385e3f81..ef04f0ed5b 100644
--- a/include/qemu/mmap-alloc.h
+++ b/include/qemu/mmap-alloc.h
@@ -9,6 +9,6 @@ size_t qemu_mempath_getpagesize(const char *mem_path);
 
 void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared);
 
-void qemu_ram_munmap(void *ptr, size_t size);
+void qemu_ram_munmap(int fd, void *ptr, size_t size);
 
 #endif
diff --git a/exec.c b/exec.c
index 25f3938a27..03dd673d36 100644
--- a/exec.c
+++ b/exec.c
@@ -1873,7 +1873,7 @@ static void *file_ram_alloc(RAMBlock *block,
     if (mem_prealloc) {
         os_mem_prealloc(fd, area, memory, smp_cpus, errp);
         if (errp && *errp) {
-            qemu_ram_munmap(area, memory);
+            qemu_ram_munmap(fd, area, memory);
             return NULL;
         }
     }
@@ -2394,7 +2394,7 @@ static void reclaim_ramblock(RAMBlock *block)
         xen_invalidate_map_cache_entry(block->host);
 #ifndef _WIN32
     } else if (block->fd >= 0) {
-        qemu_ram_munmap(block->host, block->max_length);
+        qemu_ram_munmap(block->fd, block->host, block->max_length);
         close(block->fd);
 #endif
     } else {
diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
index f71ea038c8..8565885420 100644
--- a/util/mmap-alloc.c
+++ b/util/mmap-alloc.c
@@ -80,6 +80,7 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
     int flags;
     int guardfd;
     size_t offset;
+    size_t pagesize;
     size_t total;
     void *guardptr;
     void *ptr;
@@ -100,7 +101,8 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
      * anonymous memory is OK.
      */
     flags = MAP_PRIVATE;
-    if (fd == -1 || qemu_fd_getpagesize(fd) == getpagesize()) {
+    pagesize = qemu_fd_getpagesize(fd);
+    if (fd == -1 || pagesize == getpagesize()) {
         guardfd = -1;
         flags |= MAP_ANONYMOUS;
     } else {
@@ -109,6 +111,7 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
     }
 #else
     guardfd = -1;
+    pagesize = getpagesize();
     flags = MAP_PRIVATE | MAP_ANONYMOUS;
 #endif
 
@@ -120,7 +123,7 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
 
     assert(is_power_of_2(align));
     /* Always align to host page size */
-    assert(align >= getpagesize());
+    assert(align >= pagesize);
 
     flags = MAP_FIXED;
     flags |= fd == -1 ? MAP_ANONYMOUS : 0;
@@ -143,17 +146,24 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
      * a guard page guarding against potential buffer overflows.
      */
     total -= offset;
-    if (total > size + getpagesize()) {
-        munmap(ptr + size + getpagesize(), total - size - getpagesize());
+    if (total > size + pagesize) {
+        munmap(ptr + size + pagesize, total - size - pagesize);
     }
 
     return ptr;
 }
 
-void qemu_ram_munmap(void *ptr, size_t size)
+void qemu_ram_munmap(int fd, void *ptr, size_t size)
 {
+    size_t pagesize;
+
     if (ptr) {
         /* Unmap both the RAM block and the guard page */
-        munmap(ptr, size + getpagesize());
+#if defined(__powerpc64__) && defined(__linux__)
+        pagesize = qemu_fd_getpagesize(fd);
+#else
+        pagesize = getpagesize();
+#endif
+        munmap(ptr, size + pagesize);
     }
 }
diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index 4ce1ba9ca4..37c5854b9c 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -226,7 +226,7 @@ void qemu_vfree(void *ptr)
 void qemu_anon_ram_free(void *ptr, size_t size)
 {
     trace_qemu_anon_ram_free(ptr, size);
-    qemu_ram_munmap(ptr, size);
+    qemu_ram_munmap(-1, ptr, size);
 }
 
 void qemu_set_block(int fd)
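
Note on the length arithmetic: the sketch below is not part of the patch.
It only illustrates why "size + getpagesize()" can be rejected by munmap()
on a Huge TLB mapping while "size + pagesize" (with pagesize taken from
the backing file) is accepted. The helper names ram_unmap_length() and
hugetlb_length_ok() are hypothetical, and the 64 KiB / 16 MiB figures in
the usage note are assumed example values, not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: the length handed to munmap() for a RAM block
 * plus its trailing guard page, mirroring "size + pagesize" in
 * qemu_ram_munmap().
 */
size_t ram_unmap_length(size_t size, size_t guard_pagesize)
{
    return size + guard_pagesize;
}

/*
 * Hypothetical helper: munmap(2) requires the length of a Huge TLB
 * mapping to be a multiple of the underlying huge page size.
 */
bool hugetlb_length_ok(size_t length, size_t hugepagesize)
{
    return hugepagesize != 0 && length % hugepagesize == 0;
}
```

For example, assuming a 64 KiB native page size and 16 MiB huge pages, a
64 MiB block gives 64 MiB + 64 KiB with the old code, which is not a
multiple of 16 MiB, and 64 MiB + 16 MiB with the new code, which is.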