From patchwork Fri Jan 3 21:24:14 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexander Duyck
X-Patchwork-Id: 11317461
Subject: [PATCH v16 QEMU 4/3 RFC] memory: Add support for MADV_FREE as
 mechanism to lazy discard pages
From: Alexander Duyck
To: virtio-dev@lists.oasis-open.org, kvm@vger.kernel.org, mst@redhat.com,
 linux-kernel@vger.kernel.org, willy@infradead.org, mhocko@kernel.org,
 linux-mm@kvack.org, akpm@linux-foundation.org,
 mgorman@techsingularity.net, vbabka@suse.cz
Cc: yang.zhang.wz@gmail.com, nitesh@redhat.com, konrad.wilk@oracle.com,
 david@redhat.com, pagupta@redhat.com, riel@surriel.com,
 lcapitulino@redhat.com, dave.hansen@intel.com, wei.w.wang@intel.com,
 aarcange@redhat.com, pbonzini@redhat.com, dan.j.williams@intel.com,
 alexander.h.duyck@linux.intel.com, osalvador@suse.de
Date: Fri, 03 Jan 2020 13:24:14 -0800
Message-ID: <20200103212339.29849.99817.stgit@localhost.localdomain>
In-Reply-To: <20200103210509.29237.18426.stgit@localhost.localdomain>
References:
 <20200103210509.29237.18426.stgit@localhost.localdomain>
User-Agent: StGit/0.17.1-dirty

From: Alexander Duyck

Add support for the MADV_FREE advice argument when discarding pages.
Specifically, add an option to perform a lazy discard for use with free
page reporting. This lets us avoid expensive page zeroing when the system
is not under memory pressure.

To enable this I extended ram_block_discard_range() with an extra "lazy"
parameter and renamed it to __ram_block_discard_range(). The original name
is now a wrapper that passes lazy=false, so existing callers keep their
behavior. A second wrapper, ram_block_free_range(), passes lazy=true, and
the page reporting code in virtio-balloon now calls it.

MADV_FREE is only used when the range does not also need fallocate();
otherwise, and on builds where MADV_FREE is not defined, the code keeps
using MADV_DONTNEED.

Signed-off-by: Alexander Duyck
---
 exec.c                     |   39 +++++++++++++++++++++++++++------------
 hw/virtio/virtio-balloon.c |    2 +-
 include/exec/cpu-common.h  |    1 +
 3 files changed, 29 insertions(+), 13 deletions(-)

diff --git a/exec.c b/exec.c
index ffdb5185353b..14eda993058c 100644
--- a/exec.c
+++ b/exec.c
@@ -3843,15 +3843,8 @@ int qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque)
     return ret;
 }
 
-/*
- * Unmap pages of memory from start to start+length such that
- * they a) read as 0, b) Trigger whatever fault mechanism
- *   the OS provides for postcopy.
- * The pages must be unmapped by the end of the function.
- * Returns: 0 on success, none-0 on failure
- *
- */
-int ram_block_discard_range(RAMBlock *rb, uint64_t start, size_t length)
+static int __ram_block_discard_range(RAMBlock *rb, uint64_t start,
+                                     size_t length, bool lazy)
 {
     int ret = -1;
 
@@ -3904,13 +3897,18 @@ int ram_block_discard_range(RAMBlock *rb, uint64_t start, size_t length)
 #endif
         }
         if (need_madvise) {
-            /* For normal RAM this causes it to be unmapped,
+#ifdef CONFIG_MADVISE
+#ifdef MADV_FREE
+            int advice = (lazy && !need_fallocate) ? MADV_FREE : MADV_DONTNEED;
+#else
+            int advice = MADV_DONTNEED;
+#endif
+            /* For normal RAM this causes it to be lazy freed or unmapped,
              * for shared memory it causes the local mapping to disappear
              * and to fall back on the file contents (which we just
              * fallocate'd away).
              */
-#if defined(CONFIG_MADVISE)
-            ret = madvise(host_startaddr, length, MADV_DONTNEED);
+            ret = madvise(host_startaddr, length, advice);
             if (ret) {
                 ret = -errno;
                 error_report("ram_block_discard_range: Failed to discard range "
@@ -3938,6 +3936,23 @@ err:
     return ret;
 }
 
+/*
+ * Unmap pages of memory from start to start+length such that
+ * they a) read as 0, b) Trigger whatever fault mechanism
+ *   the OS provides for postcopy.
+ * The pages must be unmapped by the end of the function.
+ * Returns: 0 on success, none-0 on failure
+ *
+ */
+int ram_block_discard_range(RAMBlock *rb, uint64_t start, size_t length)
+{
+    return __ram_block_discard_range(rb, start, length, false);
+}
+
+int ram_block_free_range(RAMBlock *rb, uint64_t start, size_t length)
+{
+    return __ram_block_discard_range(rb, start, length, true);
+}
 bool ramblock_is_pmem(RAMBlock *rb)
 {
     return rb->flags & RAM_PMEM;
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 47f253d016db..b904bdde8b1b 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -346,7 +346,7 @@ static void virtio_balloon_handle_report(VirtIODevice *vdev, VirtQueue *vq)
             if ((ram_offset | size) & (rb_page_size - 1))
                 continue;
 
-            ram_block_discard_range(rb, ram_offset, size);
+            ram_block_free_range(rb, ram_offset, size);
         }
 
         virtqueue_push(vq, elem, 0);
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 81753bbb3431..2bbd26784c63 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -104,6 +104,7 @@ typedef int (RAMBlockIterFunc)(RAMBlock *rb, void *opaque);
 
 int qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque);
 int ram_block_discard_range(RAMBlock *rb, uint64_t start, size_t length);
+int ram_block_free_range(RAMBlock *rb, uint64_t start, size_t length);
 
 #endif
 
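
For anyone who wants to try the advice selection from the exec.c hunk
outside of QEMU, here is a minimal standalone sketch. It is not part of
the patch. The names pick_discard_advice() and discard_range() are
hypothetical, the file_backed flag stands in for need_fallocate, and the
runtime retry with MADV_DONTNEED when the kernel rejects MADV_FREE with
EINVAL is my own assumption; the patch only selects the advice at build
time.

```c
#define _DEFAULT_SOURCE /* expose madvise() and MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not in QEMU): pick the madvise() advice the same
 * way the patch does. MADV_FREE is chosen only for a lazy request on
 * memory that is not file-backed, and only if the headers define it.
 */
static int pick_discard_advice(bool lazy, bool file_backed)
{
#ifdef MADV_FREE
    return (lazy && !file_backed) ? MADV_FREE : MADV_DONTNEED;
#else
    (void)lazy;
    (void)file_backed;
    return MADV_DONTNEED;
#endif
}

/*
 * Hypothetical helper (not in QEMU): discard [addr, addr + len).
 * Returns 0 on success or -errno on failure. Assumption beyond the
 * patch: if the running kernel rejects the lazy advice with EINVAL,
 * retry once with MADV_DONTNEED.
 */
static int discard_range(void *addr, size_t len, bool lazy, bool file_backed)
{
    int advice = pick_discard_advice(lazy, file_backed);

    assert(addr != NULL && len > 0);

    if (madvise(addr, len, advice) == 0) {
        return 0;
    }
    if (advice != MADV_DONTNEED && errno == EINVAL) {
        if (madvise(addr, len, MADV_DONTNEED) == 0) {
            return 0;
        }
    }
    return -errno;
}
```

Calling discard_range(buf, len, true, false) on a private anonymous
mapping requests the lazy free. Until the kernel reclaims the pages, a
read may still return the old contents; after reclaim it returns zeroes.
With lazy set to false the range reads back as zeroes right away, which
is the behavior existing ram_block_discard_range() callers rely on.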