From patchwork Tue Feb 11 22:53:18 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexander Duyck
X-Patchwork-Id: 11377075
Subject: [PATCH v17 QEMU 4/3 RFC] memory: Add support for MADV_FREE as mechanism to lazy discard pages
From: Alexander Duyck
To: virtio-dev@lists.oasis-open.org, kvm@vger.kernel.org, mst@redhat.com,
 david@redhat.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 akpm@linux-foundation.org
Cc: yang.zhang.wz@gmail.com, pagupta@redhat.com, konrad.wilk@oracle.com,
 nitesh@redhat.com, riel@surriel.com, willy@infradead.org,
 lcapitulino@redhat.com, dave.hansen@intel.com, wei.w.wang@intel.com,
 aarcange@redhat.com, pbonzini@redhat.com, dan.j.williams@intel.com,
 mhocko@kernel.org, mgorman@techsingularity.net,
 alexander.h.duyck@linux.intel.com, vbabka@suse.cz, osalvador@suse.de
Date: Tue, 11 Feb 2020 14:53:18 -0800
Message-ID: <20200211225220.30596.80416.stgit@localhost.localdomain>
In-Reply-To: <20200211224416.29318.44077.stgit@localhost.localdomain>
References: <20200211224416.29318.44077.stgit@localhost.localdomain>
User-Agent: StGit/0.17.1-dirty

From: Alexander Duyck

Add support for the MADV_FREE advice argument when discarding pages.
Specifically we add an option to perform a lazy discard for use with free
page reporting as this allows us to avoid expensive page zeroing in the
case that the system is not under memory pressure.

To enable this I simply extended the ram_block_discard_range function to
add an extra parameter for "lazy" freeing. I then renamed the function,
wrapped it in a function defined using the original name and defaulting
lazy to false. From there I created a second wrapper for
ram_block_free_range and updated the page reporting code to use that.

Signed-off-by: Alexander Duyck
---
 exec.c                     |   39 +++++++++++++++++++++++++++------------
 hw/virtio/virtio-balloon.c |    2 +-
 include/exec/cpu-common.h  |    1 +
 3 files changed, 29 insertions(+), 13 deletions(-)

diff --git a/exec.c b/exec.c
index 67e520d18ea5..2266574eb06e 100644
--- a/exec.c
+++ b/exec.c
@@ -3881,15 +3881,8 @@ int qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque)
     return ret;
 }
 
-/*
- * Unmap pages of memory from start to start+length such that
- * they a) read as 0, b) Trigger whatever fault mechanism
- *   the OS provides for postcopy.
- * The pages must be unmapped by the end of the function.
- * Returns: 0 on success, none-0 on failure
- *
- */
-int ram_block_discard_range(RAMBlock *rb, uint64_t start, size_t length)
+static int __ram_block_discard_range(RAMBlock *rb, uint64_t start,
+                                     size_t length, bool lazy)
 {
     int ret = -1;
 
@@ -3941,13 +3934,18 @@ int ram_block_discard_range(RAMBlock *rb, uint64_t start, size_t length)
 #endif
         }
         if (need_madvise) {
-            /* For normal RAM this causes it to be unmapped,
+#ifdef CONFIG_MADVISE
+#ifdef MADV_FREE
+            int advice = (lazy && !need_fallocate) ? MADV_FREE : MADV_DONTNEED;
+#else
+            int advice = MADV_DONTNEED;
+#endif
+            /* For normal RAM this causes it to be lazy freed or unmapped,
              * for shared memory it causes the local mapping to disappear
              * and to fall back on the file contents (which we just
              * fallocate'd away).
              */
-#if defined(CONFIG_MADVISE)
-            ret = madvise(host_startaddr, length, MADV_DONTNEED);
+            ret = madvise(host_startaddr, length, advice);
             if (ret) {
                 ret = -errno;
                 error_report("ram_block_discard_range: Failed to discard range "
@@ -3975,6 +3973,23 @@ err:
     return ret;
 }
 
+/*
+ * Unmap pages of memory from start to start+length such that
+ * they a) read as 0, b) Trigger whatever fault mechanism
+ *   the OS provides for postcopy.
+ * The pages must be unmapped by the end of the function.
+ * Returns: 0 on success, none-0 on failure
+ *
+ */
+int ram_block_discard_range(RAMBlock *rb, uint64_t start, size_t length)
+{
+    return __ram_block_discard_range(rb, start, length, false);
+}
+
+int ram_block_free_range(RAMBlock *rb, uint64_t start, size_t length)
+{
+    return __ram_block_discard_range(rb, start, length, true);
+}
 bool ramblock_is_pmem(RAMBlock *rb)
 {
     return rb->flags & RAM_PMEM;
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 5faafd2f62ac..7df92af73792 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -346,7 +346,7 @@ static void virtio_balloon_handle_report(VirtIODevice *vdev, VirtQueue *vq)
             if ((ram_offset | size) & (rb_page_size - 1))
                 continue;
 
-            ram_block_discard_range(rb, ram_offset, size);
+            ram_block_free_range(rb, ram_offset, size);
         }
 
         virtqueue_push(vq, elem, 0);
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 81753bbb3431..2bbd26784c63 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -104,6 +104,7 @@ typedef int (RAMBlockIterFunc)(RAMBlock *rb, void *opaque);
 
 int qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque);
 int ram_block_discard_range(RAMBlock *rb, uint64_t start, size_t length);
+int ram_block_free_range(RAMBlock *rb, uint64_t start, size_t length);
 
 #endif
 
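
Illustrative note, not part of the patch: the difference between the two
advice values can be seen outside QEMU with a small standalone sketch. It
assumes Linux and a private anonymous mapping. The names choose_advice,
discard_range, range_is_zero and demo_discard are hypothetical helpers made
up for this example; only the advice-selection expression is taken from the
hunk above. MADV_DONTNEED guarantees that the range reads back as zeroes,
while MADV_FREE leaves each page holding either its old contents or zeroes
until the kernel reclaims it or the page is written again.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS and MADV_FREE on glibc */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Same selection as the patch: lazy freeing only when requested, only when
 * no fallocate is needed, and only if the headers define MADV_FREE. */
int choose_advice(bool lazy, bool need_fallocate)
{
#ifdef MADV_FREE
    return (lazy && !need_fallocate) ? MADV_FREE : MADV_DONTNEED;
#else
    (void)lazy;
    (void)need_fallocate;
    return MADV_DONTNEED;
#endif
}

/* Hypothetical stand-in for __ram_block_discard_range(), limited to
 * anonymous memory. Returns 0 on success, -errno on failure. */
int discard_range(void *addr, size_t len, bool lazy)
{
    if (madvise(addr, len, choose_advice(lazy, false)) != 0) {
        return -errno;
    }
    return 0;
}

bool range_is_zero(const unsigned char *p, size_t len)
{
    assert(p != NULL);
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}

/* Returns 0 if both discard modes behave as described, -1 otherwise. */
int demo_discard(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) {
        return -1;
    }
    size_t len = 4 * (size_t)page;
    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int result = -1;

    if (buf == MAP_FAILED) {
        return -1;
    }

    /* Eager discard: the range must read back as zeroes right away. */
    memset(buf, 0xAB, len);
    if (discard_range(buf, len, false) != 0 || !range_is_zero(buf, len)) {
        goto out;
    }

    /* Lazy discard: skipped quietly if this kernel rejects MADV_FREE. */
    memset(buf, 0xCD, len);
    if (discard_range(buf, len, true) == 0) {
        for (size_t off = 0; off < len; off += (size_t)page) {
            /* Each page holds either its old data or zeroes. */
            if (buf[off] != 0xCD && buf[off] != 0x00) {
                goto out;
            }
        }
        buf[0] = 0x5A; /* a write makes the page live again */
        if (buf[0] != 0x5A) {
            goto out;
        }
    }
    result = 0;
out:
    munmap(buf, len);
    return result;
}
```

Calling demo_discard() from a test program should return 0 on a Linux host.
The indeterminate contents after MADV_FREE are why the patch keeps
ram_block_discard_range() on MADV_DONTNEED: its comment promises pages that
read as 0, and only the free page reporting path is moved to the lazy
variant.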