From patchwork Sat Sep 1 11:28:19 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Fengguang Wu
X-Patchwork-Id: 10584933
Message-Id: <20180901124811.469883138@intel.com>
User-Agent: quilt/0.63-1
Date: Sat, 01 Sep 2018 19:28:19 +0800
From: Fengguang Wu
To: Andrew Morton
Cc: Linux Memory Management List, Fengguang Wu
Cc: Peng DongX
Cc: Liu Jingqi
Cc: Dong Eddie
Cc: Dave Hansen
Cc: Huang Ying
Cc: Brendan Gregg
Cc: kvm@vger.kernel.org
Cc: LKML
Subject: [RFC][PATCH 1/5] [PATCH 1/5] kvm: register in task_struct
References: <20180901112818.126790961@intel.com>
Content-Disposition: inline; filename=0001-kvm-register-in-task_struct.patch

The added pointer will be used by the /proc/PID/idle_bitmap code to
quickly identify a QEMU task and walk its EPT/NPT accordingly. For
virtual machines, the accessed (A) bits are set in the guest page tables
and in EPT/NPT, rather than in the QEMU task's page table.

This costs 8 bytes in task_struct, which is wasted for the majority of
tasks that are not virtual machines. The alternative is to add only a
flag, and let the idle_bitmap code find the corresponding VM in the kvm
vm_list.

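To make the trade-off concrete, here is a minimal userspace sketch of the
flag-only alternative. It is not code from this series: struct vm_model
and find_vm_by_pid() are hypothetical stand-ins, and the only detail taken
from the patch is that each VM records the creating task's pid
(kvm->userspace_pid). A real kernel version would also need to hold
whatever lock protects vm_list while walking it.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct kvm: only what this sketch needs. */
struct vm_model {
	int userspace_pid;	/* mirrors kvm->userspace_pid, set at VM creation */
	struct vm_model *next;	/* stand-in for the vm_list linkage */
};

/*
 * Flag-only alternative: instead of following a pointer stored in
 * task_struct, search the list of VMs for the one created by this pid.
 * The cost is linear in the number of VMs on the host.
 */
static struct vm_model *find_vm_by_pid(struct vm_model *vm_list, int pid)
{
	struct vm_model *vm;

	for (vm = vm_list; vm; vm = vm->next)
		if (vm->userspace_pid == pid)
			return vm;

	return NULL;	/* not a VM task */
}
```

The pointer in task_struct makes the lookup constant-time at the cost of
8 bytes per task; the list walk saves those bytes but pays on every open
of the idle bitmap.
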
Signed-off-by: Fengguang Wu
---
 include/linux/sched.h | 10 ++++++++++
 virt/kvm/kvm_main.c   |  1 +
 2 files changed, 11 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 43731fe51c97..26c8549bbc28 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -38,6 +38,7 @@ struct cfs_rq;
 struct fs_struct;
 struct futex_pi_state;
 struct io_context;
+struct kvm;
 struct mempolicy;
 struct nameidata;
 struct nsproxy;
@@ -1179,6 +1180,9 @@ struct task_struct {
 	/* Used by LSM modules for access restriction: */
 	void *security;
 #endif
+#if IS_ENABLED(CONFIG_KVM)
+	struct kvm *kvm;
+#endif
 
 	/*
 	 * New fields for task_struct should be added above here, so that
@@ -1898,4 +1902,10 @@ static inline void rseq_syscall(struct pt_regs *regs)
 
 #endif
 
+#if IS_ENABLED(CONFIG_KVM)
+static inline struct kvm *task_kvm(struct task_struct *t) { return t->kvm; }
+#else
+static inline struct kvm *task_kvm(struct task_struct *t) { return NULL; }
+#endif
+
 #endif
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 8b47507faab5..0c483720de8d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3892,6 +3892,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
 	if (type == KVM_EVENT_CREATE_VM) {
 		add_uevent_var(env, "EVENT=create");
 		kvm->userspace_pid = task_pid_nr(current);
+		current->kvm = kvm;
 	} else if (type == KVM_EVENT_DESTROY_VM) {
 		add_uevent_var(env, "EVENT=destroy");
 	}

From patchwork Sat Sep 1 11:28:20 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Fengguang Wu
X-Patchwork-Id: 10584931
Message-Id: <20180901124811.530300789@intel.com>
User-Agent: quilt/0.63-1
Date: Sat, 01 Sep 2018 19:28:20 +0800
From: Fengguang Wu
To: Andrew Morton
Cc: Linux Memory Management List, Huang Ying, Brendan Gregg, Fengguang Wu
Cc: Peng DongX
Cc: Liu Jingqi
Cc: Dong Eddie
Cc: Dave Hansen
Cc: kvm@vger.kernel.org
Cc: LKML
Subject: [RFC][PATCH 2/5] [PATCH 2/5] proc: introduce /proc/PID/idle_bitmap
References: <20180901112818.126790961@intel.com>
Content-Disposition: inline; filename=0002-proc-introduce-proc-PID-idle_bitmap.patch

This is similar to /sys/kernel/mm/page_idle/bitmap, documented in
Documentation/admin-guide/mm/idle_page_tracking.rst, but indexed by
process virtual address.

When using the global PFN-indexed idle bitmap, we find two kinds of
overhead:

- To track a task's working set, Brendan Gregg ended up writing wss-v1
  for small tasks and wss-v2 for large tasks:

  https://github.com/brendangregg/wss

  That's because VAs may point to random PAs throughout the physical
  address space. So we either query /proc/pid/pagemap first and then
  access lots of random PFNs in the bitmap (with lots of syscalls), or
  write+read the whole system idle bitmap beforehand.

- Walking page tables by PFN has much more overhead than walking a page
  table in its natural order:
  - rmap queries
  - more locking
  - random memory reads/writes

This interface provides a cheap path for the majority of pages, which
are mapped non-shared. To walk 1TB of memory made up of active 4k pages,
scanning the per-task idle bitmap costs 2s of system time, versus 15s
for the global idle bitmap. That is a ~7x speedup. The gap widens
further if we consider

- the extra /proc/pid/pagemap walk
- that natural page table walks can skip all 512 PTEs under a PMD if the
  PMD is idle

OTOH, the per-task idle bitmap is not suitable in some situations:

- it is not accurate for shared pages
- it doesn't work with non-mapped file pages
- it doesn't perform well for sparse page tables (pointed out by Huang Ying)

So it is more about complementing the existing global idle bitmap.

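As an illustration of "indexed by process virtual address", the offset
arithmetic can be sketched in userspace C. This is a sketch under stated
assumptions, not part of the series: it takes the one-bit-per-page layout
and the 8-byte read granularity from the ept_idle.h hunk in patch 3/5
(BITMAP_BYTE2PVA_SHIFT = 3 + PAGE_SHIFT, IDLE_BITMAP_CHUNK_SIZE =
sizeof(u64)) and assumes 4k pages. The macro and function names below are
hypothetical. How a page's position maps to a bit inside the u64 is left
to the later patches and is not assumed here.

```c
#include <stdint.h>

#define PAGE_SHIFT_ASSUMED	12			/* assumption: 4k pages */
#define BYTE2PVA_SHIFT		(3 + PAGE_SHIFT_ASSUMED)	/* one bitmap byte covers 8 pages */
#define CHUNK_BYTES		8			/* reads must be multiples of sizeof(u64) */

/* File offset (in bytes) of the 8-byte chunk covering virtual address va. */
static uint64_t idle_bitmap_offset(uint64_t va)
{
	uint64_t byte = va >> BYTE2PVA_SHIFT;

	return byte - (byte % CHUNK_BYTES);
}

/* Position (0..63) of va's page among the 64 pages covered by its chunk. */
static unsigned int idle_bitmap_page_in_chunk(uint64_t va)
{
	return (unsigned int)((va >> PAGE_SHIFT_ASSUMED) % (CHUNK_BYTES * 8));
}

/* Bitmap bytes needed to cover range_bytes, rounded up to whole chunks. */
static uint64_t idle_bitmap_bytes(uint64_t range_bytes)
{
	uint64_t page_size = 1ULL << PAGE_SHIFT_ASSUMED;
	uint64_t pages = (range_bytes + page_size - 1) >> PAGE_SHIFT_ASSUMED;
	uint64_t bytes = (pages + 7) / 8;

	return (bytes + CHUNK_BYTES - 1) / CHUNK_BYTES * CHUNK_BYTES;
}
```

Under these assumptions, the 1TB range from the measurement above is 2^28
pages, so idle_bitmap_bytes(1ULL << 40) comes to 32MiB of bitmap.
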
CC: Huang Ying
CC: Brendan Gregg
Signed-off-by: Fengguang Wu
---
 fs/proc/base.c     |  2 ++
 fs/proc/internal.h |  1 +
 fs/proc/task_mmu.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 66 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index aaffc0c30216..d81322b5b8d2 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2942,6 +2942,7 @@ static const struct pid_entry tgid_base_stuff[] = {
 	REG("smaps", S_IRUGO, proc_pid_smaps_operations),
 	REG("smaps_rollup", S_IRUGO, proc_pid_smaps_rollup_operations),
 	REG("pagemap", S_IRUSR, proc_pagemap_operations),
+	REG("idle_bitmap", S_IRUSR|S_IWUSR, proc_mm_idle_operations),
 #endif
 #ifdef CONFIG_SECURITY
 	DIR("attr", S_IRUGO|S_IXUGO, proc_attr_dir_inode_operations, proc_attr_dir_operations),
@@ -3327,6 +3328,7 @@ static const struct pid_entry tid_base_stuff[] = {
 	REG("smaps", S_IRUGO, proc_tid_smaps_operations),
 	REG("smaps_rollup", S_IRUGO, proc_pid_smaps_rollup_operations),
 	REG("pagemap", S_IRUSR, proc_pagemap_operations),
+	REG("idle_bitmap", S_IRUSR|S_IWUSR, proc_mm_idle_operations),
 #endif
 #ifdef CONFIG_SECURITY
 	DIR("attr", S_IRUGO|S_IXUGO, proc_attr_dir_inode_operations, proc_attr_dir_operations),
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index da3dbfa09e79..732a502acc27 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -305,6 +305,7 @@ extern const struct file_operations proc_pid_smaps_rollup_operations;
 extern const struct file_operations proc_tid_smaps_operations;
 extern const struct file_operations proc_clear_refs_operations;
 extern const struct file_operations proc_pagemap_operations;
+extern const struct file_operations proc_mm_idle_operations;
 
 extern unsigned long task_vsize(struct mm_struct *);
 extern unsigned long task_statm(struct mm_struct *,
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index dfd73a4616ce..376406a9cf45 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1564,6 +1564,69 @@ const struct file_operations proc_pagemap_operations = {
 	.open = pagemap_open,
 	.release = pagemap_release,
 };
+
+/* will be filled when kvm_ept_idle module loads */
+struct file_operations proc_ept_idle_operations = {
+};
+EXPORT_SYMBOL_GPL(proc_ept_idle_operations);
+
+static ssize_t mm_idle_read(struct file *file, char __user *buf,
+			    size_t count, loff_t *ppos)
+{
+	struct task_struct *task = file->private_data;
+	ssize_t ret = -ESRCH;
+
+	// TODO: implement mm_walk for normal tasks
+
+	if (task_kvm(task)) {
+		if (proc_ept_idle_operations.read)
+			return proc_ept_idle_operations.read(file, buf, count, ppos);
+	}
+
+	return ret;
+}
+
+
+static int mm_idle_open(struct inode *inode, struct file *file)
+{
+	struct task_struct *task = get_proc_task(inode);
+
+	if (!task)
+		return -ESRCH;
+
+	file->private_data = task;
+
+	if (task_kvm(task)) {
+		if (proc_ept_idle_operations.open)
+			return proc_ept_idle_operations.open(inode, file);
+	}
+
+	return 0;
+}
+
+static int mm_idle_release(struct inode *inode, struct file *file)
+{
+	struct task_struct *task = file->private_data;
+
+	if (!task)
+		return 0;
+
+	if (task_kvm(task)) {
+		if (proc_ept_idle_operations.release)
+			return proc_ept_idle_operations.release(inode, file);
+	}
+
+	put_task_struct(task);
+	return 0;
+}
+
+const struct file_operations proc_mm_idle_operations = {
+	.llseek = mem_lseek, /* borrow this */
+	.read = mm_idle_read,
+	.open = mm_idle_open,
+	.release = mm_idle_release,
+};
+
 #endif /* CONFIG_PROC_PAGE_MONITOR */
 
 #ifdef CONFIG_NUMA

From patchwork Sat Sep 1 11:28:21 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Fengguang Wu
X-Patchwork-Id: 10584943
Message-Id: <20180901124811.591511876@intel.com>
User-Agent: quilt/0.63-1
Date: Sat, 01 Sep 2018 19:28:21 +0800
From: Fengguang Wu
To: Andrew Morton
Cc: Linux Memory Management List, Peng DongX, Fengguang Wu
Cc: Liu Jingqi
Cc: Dong Eddie
Cc: Dave Hansen
Cc: Huang Ying
Cc: Brendan Gregg
Cc: kvm@vger.kernel.org
Cc: LKML
Subject: [RFC][PATCH 3/5] [PATCH 3/5] kvm-ept-idle: HVA indexed EPT read
References: <20180901112818.126790961@intel.com>
Content-Disposition: inline; filename=0003-kvm-ept-idle-HVA-indexed-EPT-read.patch

For virtual machines, "accessed" bits are set in the guest page tables
and in EPT/NPT. So for a qemu-kvm process, convert HVA to GFN to GPA,
then do the EPT/NPT walks. Thanks to the linear HVA-GPA mapping within a
memslot, the conversion can be done efficiently, outside of the loops
for page table walks.

In this manner, we provide a uniform interface for both virtual machines
and normal processes.

The usage scenario is per-task/VM working set tracking and migration.
This is very convenient for applying policies at task/vma and VM
granularity.

Signed-off-by: Peng DongX
Signed-off-by: Fengguang Wu
---
 arch/x86/kvm/ept_idle.c | 118 ++++++++++++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/ept_idle.h |  24 ++++++++++
 2 files changed, 142 insertions(+)
 create mode 100644 arch/x86/kvm/ept_idle.c
 create mode 100644 arch/x86/kvm/ept_idle.h

diff --git a/arch/x86/kvm/ept_idle.c b/arch/x86/kvm/ept_idle.c
new file mode 100644
index 000000000000..5b97dd01011b
--- /dev/null
+++ b/arch/x86/kvm/ept_idle.c
@@ -0,0 +1,118 @@
+// SPDX-License-Identifier: GPL-2.0
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "ept_idle.h"
+
+
+// mindless copy from kvm_handle_hva_range().
+// TODO: handle order and hole.
+static int ept_idle_walk_hva_range(struct ept_idle_ctrl *eic,
+				   unsigned long start,
+				   unsigned long end)
+{
+	struct kvm_memslots *slots;
+	struct kvm_memory_slot *memslot;
+	int ret = 0;
+
+	slots = kvm_memslots(eic->kvm);
+	kvm_for_each_memslot(memslot, slots) {
+		unsigned long hva_start, hva_end;
+		gfn_t gfn_start, gfn_end;
+
+		hva_start = max(start, memslot->userspace_addr);
+		hva_end = min(end, memslot->userspace_addr +
+			      (memslot->npages << PAGE_SHIFT));
+		if (hva_start >= hva_end)
+			continue;
+		/*
+		 * {gfn(page) | page intersects with [hva_start, hva_end)} =
+		 * {gfn_start, gfn_start+1, ..., gfn_end-1}.
+		 */
+		gfn_start = hva_to_gfn_memslot(hva_start, memslot);
+		gfn_end = hva_to_gfn_memslot(hva_end + PAGE_SIZE - 1, memslot);
+
+		ret = ept_idle_walk_gfn_range(eic, gfn_start, gfn_end);
+		if (ret)
+			return ret;
+	}
+
+	return ret;
+}
+
+static ssize_t ept_idle_read(struct file *file, char *buf,
+			     size_t count, loff_t *ppos)
+{
+	struct task_struct *task = file->private_data;
+	struct ept_idle_ctrl *eic;
+	unsigned long hva_start = *ppos << BITMAP_BYTE2PVA_SHIFT;
+	unsigned long hva_end = hva_start + (count << BITMAP_BYTE2PVA_SHIFT);
+	int ret;
+
+	if (*ppos % IDLE_BITMAP_CHUNK_SIZE ||
+	    count % IDLE_BITMAP_CHUNK_SIZE)
+		return -EINVAL;
+
+	eic = kzalloc(sizeof(*eic), GFP_KERNEL);
+	if (!eic)
+		return -EBUSY;
+
+	eic->buf = buf;
+	eic->buf_size = count;
+	eic->kvm = task_kvm(task);
+	if (!eic->kvm) {
+		ret = -EINVAL;
+		goto out_free;
+	}
+
+	ret = ept_idle_walk_hva_range(eic, hva_start, hva_end);
+	if (ret)
+		goto out_free;
+
+	ret = eic->bytes_copied;
+	*ppos += ret;
+out_free:
+	kfree(eic);
+
+	return ret;
+}
+
+static int ept_idle_open(struct inode *inode, struct file *file)
+{
+	if (!try_module_get(THIS_MODULE))
+		return -EBUSY;
+
+	return 0;
+}
+
+static int ept_idle_release(struct inode *inode, struct file *file)
+{
+	module_put(THIS_MODULE);
+	return 0;
+}
+
+extern struct file_operations proc_ept_idle_operations;
+
+static int ept_idle_entry(void)
+{
+	proc_ept_idle_operations.owner = THIS_MODULE;
+	proc_ept_idle_operations.read = ept_idle_read;
+	proc_ept_idle_operations.open = ept_idle_open;
+	proc_ept_idle_operations.release = ept_idle_release;
+
+	return 0;
+}
+
+static void ept_idle_exit(void)
+{
+	memset(&proc_ept_idle_operations, 0, sizeof(proc_ept_idle_operations));
+}
+
+MODULE_LICENSE("GPL");
+module_init(ept_idle_entry);
+module_exit(ept_idle_exit);
diff --git a/arch/x86/kvm/ept_idle.h b/arch/x86/kvm/ept_idle.h
new file mode 100644
index 000000000000..e0b9dcecf50b
--- /dev/null
+++ b/arch/x86/kvm/ept_idle.h
@@ -0,0 +1,24 @@
+#ifndef _EPT_IDLE_H
+#define _EPT_IDLE_H
+
+#define IDLE_BITMAP_CHUNK_SIZE sizeof(u64)
+#define IDLE_BITMAP_CHUNK_BITS (IDLE_BITMAP_CHUNK_SIZE * BITS_PER_BYTE)
+
+#define BITMAP_BYTE2PVA_SHIFT (3 + PAGE_SHIFT)
+
+#define EPT_IDLE_KBUF_FULL 1
+#define EPT_IDLE_KBUF_BYTES 8000
+#define EPT_IDLE_KBUF_BITS (EPT_IDLE_KBUF_BYTES * 8)
+
+struct ept_idle_ctrl {
+	struct kvm *kvm;
+
+	u64 kbuf[EPT_IDLE_KBUF_BITS / IDLE_BITMAP_CHUNK_BITS];
+	int bits_read;
+
+	void __user *buf;
+	int buf_size;
+	int bytes_copied;
+};
+
+#endif

From patchwork Sat Sep 1 11:28:22 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Fengguang Wu
X-Patchwork-Id: 10584947
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 846685A4 for ; Sun, 2 Sep 2018 02:21:27 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 727EB29FC7 for ; Sun, 2 Sep 2018 02:21:27 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6675529FCA; Sun, 2 Sep 2018 02:21:27 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-1.9 required=2.0
	tests=BAYES_00,DATE_IN_PAST_12_24, MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE autolearn=unavailable version=3.3.1
Message-Id: <20180901124811.644382292@intel.com>
User-Agent: quilt/0.63-1
Date: Sat, 01 Sep 2018 19:28:22 +0800
From: Fengguang Wu
To: Andrew Morton
cc: Linux Memory Management List, Dave Hansen, Fengguang Wu
cc: Peng DongX
cc: Liu Jingqi
cc: Dong Eddie
cc: Huang Ying
CC: Brendan Gregg
cc: kvm@vger.kernel.org
Cc: LKML
Subject: [RFC][PATCH 4/5] [PATCH 4/5] kvm-ept-idle: EPT page table walk for A bits
References: <20180901112818.126790961@intel.com>
MIME-Version: 1.0
Content-Disposition: inline; filename=0004-kvm-ept-idle-EPT-page-table-walk-for-A-bits.patch
Lines: 305
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:
X-Virus-Scanned: ClamAV using ClamSMTP

This borrows host page table
walk macros/functions to do the EPT walk, so it depends on the EPT and the
host page tables using the same number of levels.

Dave Hansen raised the concern that the hottest pages may stay cached in the
TLB and therefore rarely get their accessed bits set. The solution would be
to invalidate the TLB for the mm being walked after each round of scanning.

Warning: read() also clears the accessed bit, in order to avoid one more
page table walk for write(). That may not be desirable for some use cases,
so we could avoid clearing the accessed bit when the file is opened in
read-only mode.

The interface should be further improved to

1) report holes and huge pages in one go
2) represent huge pages and sparse page tables efficiently

(1) can be trivially fixed by extending the bitmap to more bits per
PAGE_SIZE.

(2) would need fundamental changes to the interface. It seems existing
solutions for sparse files, like SEEK_HOLE/SEEK_DATA and the FIEMAP ioctl,
may not serve this situation well. The most efficient way could be to fill
the user space read() buffer with an array of small extents:

	struct idle_extent {
		unsigned type : 4;
		unsigned nr   : 4;
	};

where type can be one of

	4K_HOLE
	4K_IDLE
	4K_ACCESSED
	2M_HOLE
	2M_IDLE
	2M_ACCESSED
	1G_OR_LARGER_PAGE
	...

There can be up to 16 types, so more page sizes can be defined. The above
names are only meant to make the typical case easy to understand. It's also
possible that PAGE_SIZE is not 4K, or that a PMD represents 4M pages, in
which case we would change the type names to more suitable ones like
PTE_HOLE, PMD_ACCESSED. Since this is page table walking, user space had
better know the exact page sizes: both the accessed bit and page migration
are tied to the real page size.

Anyone interested in adding PTE_DIRTY or more types?

The main problem with such an extent reporting interface is that the number
of bytes returned by read() (variable extents) will not match the advance
of the file position (fixed VA indexes), which is not POSIX compliant.
Simple cp/cat may still work, as they don't lseek based on the read()
return value.
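As a rough userspace illustration of the extent idea above (a sketch only,
not part of this patch): the numeric type codes, the helper names, the rule
that nr counts 1..15 pages, and the 4K/2M page sizes are all assumptions
made for the example. It packs the two nibbles by hand rather than using the
bit-field struct, because bit-field layout is implementation-defined in C
and a struct of two "unsigned" bit-fields is usually larger than one byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Type names follow the commit message; the numeric values are assumed. */
enum idle_type {
	PTE_HOLE, PTE_IDLE, PTE_ACCESSED,
	PMD_HOLE, PMD_IDLE, PMD_ACCESSED,
};

#define IDLE_EXTENT_MAX_NR	15	/* largest value a 4-bit nr can hold */
#define PTE_SHIFT_ASSUMED	12	/* 4K pages, assumed */
#define PMD_SHIFT_ASSUMED	21	/* 2M pages, assumed */

/* One extent per byte: type in the low nibble, nr in the high nibble. */
static inline uint8_t idle_extent_pack(unsigned int type, unsigned int nr)
{
	return (uint8_t)((type & 0xf) | ((nr & 0xf) << 4));
}

static inline unsigned int idle_extent_get_type(uint8_t e)
{
	return e & 0xf;
}

static inline unsigned int idle_extent_get_nr(uint8_t e)
{
	return e >> 4;
}

/*
 * Emit extents describing npages consecutive pages of one type.
 * Returns the number of extents written (at most cap).
 */
static inline size_t idle_extent_encode_run(uint8_t *out, size_t cap,
					    unsigned int type,
					    unsigned long npages)
{
	size_t n = 0;

	while (npages && n < cap) {
		unsigned int nr = npages > IDLE_EXTENT_MAX_NR ?
				  IDLE_EXTENT_MAX_NR : (unsigned int)npages;

		out[n++] = idle_extent_pack(type, nr);
		npages -= nr;
	}
	return n;
}

/* Size in bytes of the address range described by n extents. */
static inline uint64_t idle_extent_span(const uint8_t *buf, size_t n)
{
	uint64_t bytes = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned int type = idle_extent_get_type(buf[i]);
		unsigned int shift = type >= (unsigned int)PMD_HOLE ?
				     PMD_SHIFT_ASSUMED : PTE_SHIFT_ASSUMED;

		bytes += (uint64_t)idle_extent_get_nr(buf[i]) << shift;
	}
	return bytes;
}
```

For example, a run of 40 idle 4K pages becomes three one-byte extents
(15 + 15 + 10 pages), where the bitmap format would need 40 bits. A stream
of such extents is also exactly where the POSIX mismatch described above
arises: the number of bytes returned no longer says how much address space
was covered.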
If that's really a concern, we may use ioctl() instead.

CC: Dave Hansen
Signed-off-by: Fengguang Wu
---
 arch/x86/kvm/ept_idle.c | 211 ++++++++++++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/ept_idle.h |  55 +++++++++++++
 2 files changed, 266 insertions(+)

diff --git a/arch/x86/kvm/ept_idle.c b/arch/x86/kvm/ept_idle.c
index 5b97dd01011b..8a233ab8656d 100644
--- a/arch/x86/kvm/ept_idle.c
+++ b/arch/x86/kvm/ept_idle.c
@@ -9,6 +9,217 @@
 #include "ept_idle.h"

+static int add_to_idle_bitmap(struct ept_idle_ctrl *eic,
+			      int idle, unsigned long addr_range)
+{
+	int nbits = addr_range >> PAGE_SHIFT;
+	int bits_left = EPT_IDLE_KBUF_BITS - eic->bits_read;
+	int ret = 0;
+
+	if (nbits >= bits_left) {
+		ret = EPT_IDLE_KBUF_FULL;
+		nbits = bits_left;
+	}
+
+	// TODO: this assumes u64 == unsigned long
+	if (!idle)
+		__bitmap_clear((unsigned long *)eic->kbuf, eic->bits_read, nbits);
+	eic->bits_read += nbits;
+
+	return ret;
+}
+
+static int ept_pte_range(struct ept_idle_ctrl *eic,
+			 pmd_t *pmd, unsigned long addr, unsigned long end)
+{
+	pte_t *pte;
+	int err = 0;
+	int idle;
+
+	pte = pte_offset_kernel(pmd, addr);
+	do {
+		if (!ept_pte_present(*pte) ||
+		    !ept_pte_accessed(*pte))
+			idle = 1;
+		else {
+			idle = 0;
+			pte_clear_flags(*pte, _PAGE_EPT_ACCESSED);
+		}
+
+		err = add_to_idle_bitmap(eic, idle, PAGE_SIZE);
+		if (err)
+			break;
+	} while (pte++, addr += PAGE_SIZE, addr != end);
+
+	return err;
+}
+
+static int ept_pmd_range(struct ept_idle_ctrl *eic,
+			 pud_t *pud, unsigned long addr, unsigned long end)
+{
+	pmd_t *pmd;
+	unsigned long next;
+	int err = 0;
+	int idle;
+
+	pmd = pmd_offset(pud, addr);
+	do {
+		next = pmd_addr_end(addr, end);
+		idle = -1;
+		if (!ept_pmd_present(*pmd) ||
+		    !ept_pmd_accessed(*pmd)) {
+			idle = 1;
+		} else if (pmd_large(*pmd)) {
+			idle = 0;
+			pmd_clear_flags(*pmd, _PAGE_EPT_ACCESSED);
+		}
+		if (idle >= 0)
+			err = add_to_idle_bitmap(eic, idle, next - addr);
+		else
+			err = ept_pte_range(eic, pmd, addr, next);
+		if (err)
+			break;
+	} while (pmd++, addr = next, addr != end);
+
+	return err;
+}
+
+static int ept_pud_range(struct ept_idle_ctrl *eic,
+			 p4d_t *p4d, unsigned long addr, unsigned long end)
+{
+	pud_t *pud;
+	unsigned long next;
+	int err = 0;
+	int idle;
+
+	pud = pud_offset(p4d, addr);
+	do {
+		next = pud_addr_end(addr, end);
+		idle = -1;
+		if (!ept_pud_present(*pud) ||
+		    !ept_pud_accessed(*pud)) {
+			idle = 1;
+		} else if (pud_large(*pud)) {
+			idle = 0;
+			pud_clear_flags(*pud, _PAGE_EPT_ACCESSED);
+		}
+		if (idle >= 0)
+			err = add_to_idle_bitmap(eic, idle, next - addr);
+		else
+			err = ept_pmd_range(eic, pud, addr, next);
+		if (err)
+			break;
+	} while (pud++, addr = next, addr != end);
+
+	return err;
+}
+
+static int ept_p4d_range(struct ept_idle_ctrl *eic,
+			 pgd_t *pgd, unsigned long addr, unsigned long end)
+{
+	p4d_t *p4d;
+	unsigned long next;
+	int err = 0;
+
+	p4d = p4d_offset(pgd, addr);
+	do {
+		next = p4d_addr_end(addr, end);
+		if (!ept_p4d_present(*p4d))
+			err = add_to_idle_bitmap(eic, 1, next - addr);
+		else
+			err = ept_pud_range(eic, p4d, addr, next);
+		if (err)
+			break;
+	} while (p4d++, addr = next, addr != end);
+
+	return err;
+}
+
+static int ept_page_range(struct ept_idle_ctrl *eic,
+			  pgd_t *ept_root, unsigned long addr, unsigned long end)
+{
+	pgd_t *pgd;
+	unsigned long next;
+	int err = 0;
+
+	BUG_ON(addr >= end);
+	pgd = pgd_offset_pgd(ept_root, addr);
+	do {
+		next = pgd_addr_end(addr, end);
+		if (!ept_pgd_present(*pgd))
+			err = add_to_idle_bitmap(eic, 1, next - addr);
+		else
+			err = ept_p4d_range(eic, pgd, addr, next);
+		if (err)
+			break;
+	} while (pgd++, addr = next, addr != end);
+
+	return err;
+}
+
+static void init_ept_idle_ctrl_buffer(struct ept_idle_ctrl *eic)
+{
+	eic->bits_read = 0;
+	memset(eic->kbuf, 0xff, EPT_IDLE_KBUF_BYTES);
+}
+
+static int ept_idle_walk_gfn_range(struct ept_idle_ctrl *eic,
+				   unsigned long gfn_start,
+				   unsigned long gfn_end)
+{
+	struct kvm_vcpu *vcpu = kvm_get_vcpu(eic->kvm, 0);
+	struct kvm_mmu *mmu = &vcpu->arch.mmu;
+	pgd_t *ept_root;
+	unsigned long gpa_start = gfn_to_gpa(gfn_start);
+	unsigned long gpa_end = gfn_to_gpa(gfn_end);
+	unsigned long gpa_addr;
+	int bytes_read;
+	int ret = -EINVAL;
+
+	init_ept_idle_ctrl_buffer(eic);
+
+	spin_lock(&eic->kvm->mmu_lock);
+	if (mmu->base_role.ad_disabled) {
+		printk(KERN_NOTICE "CPU does not support EPT A/D bits tracking\n");
+		goto out_unlock;
+	}
+
+	if (mmu->shadow_root_level != 4 + (!!pgtable_l5_enabled())) {
+		printk(KERN_NOTICE "Unsupported EPT level %d\n", mmu->shadow_root_level);
+		goto out_unlock;
+	}
+
+	if (!VALID_PAGE(mmu->root_hpa)) {
+		goto out_unlock;
+	}
+
+	ept_root = __va(mmu->root_hpa);
+
+	for (gpa_addr = gpa_start;
+	     gpa_addr < gpa_end;
+	     gpa_addr += EPT_IDLE_KBUF_BITS << PAGE_SHIFT) {
+		ept_page_range(eic, ept_root, gpa_addr, gpa_end);
+		spin_unlock(&eic->kvm->mmu_lock);
+
+		bytes_read = eic->bits_read / 8;
+
+		ret = copy_to_user(eic->buf, eic->kbuf, bytes_read);
+		if (ret)
+			return -EFAULT;
+
+		eic->buf += bytes_read;
+		eic->bytes_copied += bytes_read;
+		if (eic->bytes_copied >= eic->buf_size)
+			return 0;
+
+		init_ept_idle_ctrl_buffer(eic);
+		cond_resched();
+		spin_lock(&eic->kvm->mmu_lock);
+	}
+out_unlock:
+	spin_unlock(&eic->kvm->mmu_lock);
+	return ret;
+}

 // mindless copy from kvm_handle_hva_range().
 // TODO: handle order and hole.
diff --git a/arch/x86/kvm/ept_idle.h b/arch/x86/kvm/ept_idle.h
index e0b9dcecf50b..b715ec7b7513 100644
--- a/arch/x86/kvm/ept_idle.h
+++ b/arch/x86/kvm/ept_idle.h
@@ -1,6 +1,61 @@
 #ifndef _EPT_IDLE_H
 #define _EPT_IDLE_H

+#define _PAGE_BIT_EPT_ACCESSED	8
+#define _PAGE_EPT_ACCESSED	(_AT(pteval_t, 1) << _PAGE_BIT_EPT_ACCESSED)
+
+#define _PAGE_EPT_PRESENT	(_AT(pteval_t, 7))
+
+static inline int ept_pte_present(pte_t a)
+{
+	return pte_flags(a) & _PAGE_EPT_PRESENT;
+}
+
+static inline int ept_pmd_present(pmd_t a)
+{
+	return pmd_flags(a) & _PAGE_EPT_PRESENT;
+}
+
+static inline int ept_pud_present(pud_t a)
+{
+	return pud_flags(a) & _PAGE_EPT_PRESENT;
+}
+
+static inline int ept_p4d_present(p4d_t a)
+{
+	return p4d_flags(a) & _PAGE_EPT_PRESENT;
+}
+
+static inline int ept_pgd_present(pgd_t a)
+{
+	return pgd_flags(a) & _PAGE_EPT_PRESENT;
+}
+
+static inline int ept_pte_accessed(pte_t a)
+{
+	return pte_flags(a) & _PAGE_EPT_ACCESSED;
+}
+
+static inline int ept_pmd_accessed(pmd_t a)
+{
+	return pmd_flags(a) & _PAGE_EPT_ACCESSED;
+}
+
+static inline int ept_pud_accessed(pud_t a)
+{
+	return pud_flags(a) & _PAGE_EPT_ACCESSED;
+}
+
+static inline int ept_p4d_accessed(p4d_t a)
+{
+	return p4d_flags(a) & _PAGE_EPT_ACCESSED;
+}
+
+static inline int ept_pgd_accessed(pgd_t a)
+{
+	return pgd_flags(a) & _PAGE_EPT_ACCESSED;
+}
+
 #define IDLE_BITMAP_CHUNK_SIZE	sizeof(u64)
 #define IDLE_BITMAP_CHUNK_BITS	(IDLE_BITMAP_CHUNK_SIZE * BITS_PER_BYTE)


From patchwork Sat Sep 1 11:28:23 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Fengguang Wu
X-Patchwork-Id: 10584941
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7762D5A4 for ; Sun, 2 Sep 2018 02:21:21 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id
	6652C29FC7 for ; Sun, 2 Sep 2018 02:21:21 +0000 (UTC)
Message-Id: <20180901124811.703808090@intel.com>
User-Agent: quilt/0.63-1
Date: Sat, 01 Sep 2018 19:28:23 +0800
From: Fengguang Wu
To: Andrew Morton
cc: Linux Memory Management List, Fengguang Wu
cc: Peng DongX
cc: Liu Jingqi
cc: Dong Eddie
CC: Dave Hansen
cc: Huang Ying
CC: Brendan Gregg
cc: kvm@vger.kernel.org
Cc: LKML
Subject: [RFC][PATCH 5/5] [PATCH 5/5] kvm-ept-idle: enable module
References: <20180901112818.126790961@intel.com>
MIME-Version: 1.0
Content-Disposition: inline; filename=0005-kvm-ept-idle-enable-module.patch
Lines: 47
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000303, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:
X-Virus-Scanned: ClamAV using ClamSMTP

Signed-off-by: Fengguang Wu
---
 arch/x86/kvm/Kconfig  | 11 +++++++++++
 arch/x86/kvm/Makefile |  4 ++++
 2 files changed, 15 insertions(+)

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 1bbec387d289..4c6dec47fac6 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -96,6 +96,17 @@ config KVM_MMU_AUDIT
 	 This option adds a R/W kVM module parameter 'mmu_audit', which allows
 	 auditing of KVM MMU events at runtime.

+config KVM_EPT_IDLE
+	tristate "KVM EPT idle page tracking"
+	depends on KVM_INTEL
+	depends on PROC_PAGE_MONITOR
+	---help---
+	  Provides support for walking EPT to get the A bits on Intel
+	  processors equipped with the VT extensions.
+
+	  To compile this as a module, choose M here: the module
+	  will be called kvm-ept-idle.
+
 # OK, it's a little counter-intuitive to do this, but it puts it neatly under
 # the virtualization menu.
 source drivers/vhost/Kconfig
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index dc4f2fdf5e57..5cad0590205d 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -19,6 +19,10 @@ kvm-y += x86.o mmu.o emulate.o i8259.o irq.o lapic.o \
 kvm-intel-y		+= vmx.o pmu_intel.o
 kvm-amd-y		+= svm.o pmu_amd.o

+kvm-ept-idle-y		+= ept_idle.o
+
 obj-$(CONFIG_KVM)	+= kvm.o
 obj-$(CONFIG_KVM_INTEL)	+= kvm-intel.o
 obj-$(CONFIG_KVM_AMD)	+= kvm-amd.o
+
+obj-$(CONFIG_KVM_EPT_IDLE)	+= kvm-ept-idle.o
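
For anyone trying the series from user space, the offset arithmetic used by
ept_idle_read() can be sketched as below. This is a hedged illustration, not
code from the patches: the helper names are made up, 4K pages are assumed,
and the bit order within each byte (page k of a window in byte k / 8, bit
k % 8) is an assumption that follows from the kernel filling the buffer with
unsigned long bitmap operations on little-endian x86-64. What does come from
the patches: one bitmap byte covers 8 pages (BITMAP_BYTE2PVA_SHIFT), file
offset and read size must be multiples of 8 bytes (IDLE_BITMAP_CHUNK_SIZE),
and a set bit means idle because the kernel buffer starts as all ones and
bits are cleared for accessed pages.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_ASSUMED	12				/* 4K pages, assumed */
#define BYTE2PVA_SHIFT		(3 + PAGE_SHIFT_ASSUMED)	/* mirrors BITMAP_BYTE2PVA_SHIFT */
#define CHUNK_SIZE		8ULL				/* mirrors IDLE_BITMAP_CHUNK_SIZE */

/*
 * Compute the file offset and read length, both multiples of 8 bytes,
 * whose bitmap covers the virtual address range [va_start, va_end).
 * Returns 0 on success, -1 for an empty range.
 */
static inline int idle_bitmap_window(uint64_t va_start, uint64_t va_end,
				     uint64_t *off, uint64_t *len)
{
	uint64_t first, last;

	if (va_start >= va_end)
		return -1;

	first = va_start >> BYTE2PVA_SHIFT;		/* byte holding the first page */
	last = ((va_end - 1) >> BYTE2PVA_SHIFT) + 1;	/* one past the last byte */

	*off = first & ~(CHUNK_SIZE - 1);				/* round down */
	*len = ((last + CHUNK_SIZE - 1) & ~(CHUNK_SIZE - 1)) - *off;	/* round up */
	return 0;
}

/*
 * Return 1 if the page containing va is reported idle in a buffer that
 * was read starting at file offset off, 0 if it was accessed.
 */
static inline int idle_bitmap_test(const uint8_t *buf, uint64_t off,
				   uint64_t va)
{
	uint64_t bit = (va >> PAGE_SHIFT_ASSUMED) - off * 8;

	return (buf[bit / 8] >> (bit % 8)) & 1;
}
```

A caller would pass the resulting off and len to pread() on the bitmap file
and then query individual pages with idle_bitmap_test(). Keep in mind that,
as the 4/5 changelog warns, read() also clears the accessed bits it reports.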