From patchwork Sun Jan 20 23:39:30 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ahmed Abd El Mawgood X-Patchwork-Id: 10772531 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DAB4417E9 for ; Sun, 20 Jan 2019 23:41:24 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CA4E229CB1 for ; Sun, 20 Jan 2019 23:41:24 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id BBF3D29CC0; Sun, 20 Jan 2019 23:41:24 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id F0E7329CB1 for ; Sun, 20 Jan 2019 23:41:23 +0000 (UTC) Received: (qmail 25867 invoked by uid 550); 20 Jan 2019 23:41:17 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 25808 invoked from network); 20 Jan 2019 23:41:17 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mena-vt-edu.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=vs+bNFu0yF1CxUiI9KZwr438wTX73TGD5F830H4iAIA=; b=oQMSxgeWwvX3fxnLAJIhrPzeISSSa757Vq1K2z2xXFfbGMJNm+mNFmdOPlzRygbHlM 5zGbjPbW/hSEvWy7F/KCTZPeDnl5hL7P+Eii3cu15r4wxdMTY/0ArkUF1lrhPq5GTHH5 41MAsV9b64OSoAX8RNfLkmyk8/yZ/7splCiI41pnzF41mDlZ23BPaAhUKQYcLqO5RvrO liU9DxfBLkv4OPAlj3EIOvJxvWAiT60VeLXoQLD+nF5QpktvP0XtDD2hWckJMVWt8cYH W9VbmZW81UdFbBnNEYvZOG3QTqFEgACaCl4c0Xu8L/9eZnihyWQvBa2cjjWc3oeG0thY CXSQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=vs+bNFu0yF1CxUiI9KZwr438wTX73TGD5F830H4iAIA=; b=jySDNmX8jihA59tRNZvebb6yD3hd+e0BwrgLQqqkwqrPctmjEBnlM8fCMaluQK5A5q /fmFKOvYsGxRXpEirEshdN75Nxdig04lBR6FZVZmi7qNdnRzBzZhNJl852VaWJHclHXM wMcTWHomLYYMmRzOBXhNll1hX5byokf0Js8ygDAz8BTjEERcw/uqVAOaCa0yOLPrcgvg 93GmXLbmFJfneAkcEzk6oR+8/0Y0WkUVsVC2l/F/DLt5YDTJI4O3bcVyez8HdK4y4qYK ZovG1JFZL/uXhzRq4RVa6ydrBA47k4GmlcmV7HjOTmMLRhifbK1fqreTTgNH6D3O9JuZ 5APQ== X-Gm-Message-State: AJcUukfu/Nv+P6Huq5Ubwt7syViJXPuaWhsu2iw0uRiJ61eLrDm41Nvk hubcFS4QL5hfKRZp5MW4tLQsIw== X-Google-Smtp-Source: ALg8bN5FZlmXzIT2hTz+cxotHPasI5Ywv/1TMW+F1EmUBuKhIt3uhvjkzNB2etOELjfHhmG1GUQ/rQ== X-Received: by 2002:adf:f605:: with SMTP id t5mr24033912wrp.229.1548027665620; Sun, 20 Jan 2019 15:41:05 -0800 (PST) From: Ahmed Abd El Mawgood To: Paolo Bonzini , rkrcmar@redhat.com, Jonathan Corbet , Thomas Gleixner , Ingo Molnar , Borislav Petkov , hpa@zytor.com, x86@kernel.org, kvm@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, ahmedsoliman0x666@gmail.com, ovich00@gmail.com, kernel-hardening@lists.openwall.com, nigel.edwards@hpe.com, Boris Lukashev , Igor Stoppa Cc: Ahmed Abd El Mawgood Subject: [RESEND PATCH V8 01/11] KVM: State whether memory should be freed in 
kvm_free_memslot Date: Mon, 21 Jan 2019 01:39:30 +0200 Message-Id: <20190120233940.15282-2-ahmedsoliman@mena.vt.edu> X-Mailer: git-send-email 2.19.2 In-Reply-To: <20190120233940.15282-1-ahmedsoliman@mena.vt.edu> References: <20190120233940.15282-1-ahmedsoliman@mena.vt.edu> MIME-Version: 1.0 X-Virus-Scanned: ClamAV using ClamSMTP The conditions upon which kvm_free_memslot are kind of ad-hock, it will be hard to extend memslot with allocatable data that needs to be freed, so I replaced the current mechanism by clear flag that states if the memory slot should be freed. Signed-off-by: Ahmed Abd El Mawgood --- virt/kvm/kvm_main.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 1f888a103f..2f37b4b6a2 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -548,9 +548,10 @@ static void kvm_destroy_dirty_bitmap(struct kvm_memory_slot *memslot) * Free any memory in @free but not in @dont. */ static void kvm_free_memslot(struct kvm *kvm, struct kvm_memory_slot *free, - struct kvm_memory_slot *dont) + struct kvm_memory_slot *dont, + enum kvm_mr_change change) { - if (!dont || free->dirty_bitmap != dont->dirty_bitmap) + if (change == KVM_MR_DELETE) kvm_destroy_dirty_bitmap(free); kvm_arch_free_memslot(kvm, free, dont); @@ -566,7 +567,7 @@ static void kvm_free_memslots(struct kvm *kvm, struct kvm_memslots *slots) return; kvm_for_each_memslot(memslot, slots) - kvm_free_memslot(kvm, memslot, NULL); + kvm_free_memslot(kvm, memslot, NULL, KVM_MR_DELETE); kvfree(slots); } @@ -1061,14 +1062,14 @@ int __kvm_set_memory_region(struct kvm *kvm, kvm_arch_commit_memory_region(kvm, mem, &old, &new, change); - kvm_free_memslot(kvm, &old, &new); + kvm_free_memslot(kvm, &old, &new, change); kvfree(old_memslots); return 0; out_slots: kvfree(slots); out_free: - kvm_free_memslot(kvm, &new, &old); + kvm_free_memslot(kvm, &new, &old, change); out: return r; } From patchwork Sun Jan 20 23:39:31 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ahmed Abd El Mawgood X-Patchwork-Id: 10772533 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A96C013BF for ; Sun, 20 Jan 2019 23:41:32 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 978DE29CB1 for ; Sun, 20 Jan 2019 23:41:32 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8B07329CC0; Sun, 20 Jan 2019 23:41:32 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 238BA29CB9 for ; Sun, 20 Jan 2019 23:41:30 +0000 (UTC) Received: (qmail 26176 invoked by uid 550); 20 Jan 2019 23:41:21 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 26106 invoked from network); 20 Jan 2019 23:41:20 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=mena-vt-edu.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=JVFrpeC80OLPbUMIpkWyZ48VgZSdWjNoszlgtYxl1bA=; b=Iyzp/Z0tl0zUVSzsrk15rnenCu3gmeS9BbVxPnOR2X0A/wDa67TG2ErOw86zlPJST0 ANFCzpKMRXGco1+Q+3Xcmq6VS5TIB1TJJIBddBx70n1IFKfzNY5MOmuXo2QN65LKCHnj ep/Xjc3Z/aKRTl59oK6xkKe2rVc/4jCaxo4NH6VqNDXMq3RlS1SbKVFhGGgNi0VwBB7M pEP9qVG9dyo/IQNHMP515X2fAyKiTuXAIWzb7siMHqd6STQsv7KzjPufW5NceEhq4S7P ZcXRbcoQ989ppL5tObtjpUtor1QP/joOnD9UK/zK6PKLelzhLR+l6+O9aDF/pqCptwT6 dkOw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=JVFrpeC80OLPbUMIpkWyZ48VgZSdWjNoszlgtYxl1bA=; b=PrG011+9M+CW4EivH82Z+sKoPuwmkLpOZ9Uh9TUkqPS3PVZopWJu4FdLdEtQDyVyOS whY33J/o3Oryrl7dzttBNSPHC0Y4CVF/4uJxymNyJVhEwGlM9rcfBlWfbYSA6tENuppQ OezaJWmCrT2RO05j3q+hqO6L4Mg83fRW9OrEts2Ul2PmT1RuyhHjnfaUD+rs4yFzluyo fH/PzipcE0kL9LXRII0b7wrtqa0BlvUVK2lqqOaVZC6J8fb0QP3twaCb0KQR9wrE8HHt rRfhjM1YUfoiCFCvCK9V4eiw/fmQvFogwH9BHVzmiZmMe5Wif8MYYPB9qRm2acMgqrEH jqwg== X-Gm-Message-State: AJcUukc0z7eiVBZWrLD+FdBILJJtrnY4leyU4PDZw1wd0sHX9G7ZPvou CpFI8mwGqBiUMG0neidkeJHFcQ== X-Google-Smtp-Source: ALg8bN7yd4LLsUEEVxWZ6AaI+n5mCvGMx2xoXPiVv6mpYsuArzhbuv6KL8o8fpyT9SQ/Ftboiosjgg== X-Received: by 2002:adf:ff09:: with SMTP id k9mr24408212wrr.97.1548027669143; Sun, 20 Jan 2019 15:41:09 -0800 (PST) From: Ahmed Abd El Mawgood To: Paolo Bonzini , rkrcmar@redhat.com, Jonathan Corbet , Thomas Gleixner , Ingo Molnar , Borislav Petkov , hpa@zytor.com, x86@kernel.org, kvm@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, ahmedsoliman0x666@gmail.com, ovich00@gmail.com, kernel-hardening@lists.openwall.com, nigel.edwards@hpe.com, Boris Lukashev , Igor Stoppa Cc: Ahmed Abd El Mawgood Subject: [RESEND PATCH V8 02/11] KVM: X86: Add arbitrary data pointer in kvm memslot iterator functions Date: Mon, 21 Jan 2019 01:39:31 +0200 Message-Id: <20190120233940.15282-3-ahmedsoliman@mena.vt.edu> X-Mailer: git-send-email 2.19.2 In-Reply-To: <20190120233940.15282-1-ahmedsoliman@mena.vt.edu> References: <20190120233940.15282-1-ahmedsoliman@mena.vt.edu> MIME-Version: 1.0 X-Virus-Scanned: ClamAV using ClamSMTP This will help sharing data into the slot_level_handler callback. In my case I need to a share a counter for the pages traversed to use it in some bitmap. Being able to send arbitrary memory pointer into the slot_level_handler callback made it easy. Signed-off-by: Ahmed Abd El Mawgood --- arch/x86/kvm/mmu.c | 65 ++++++++++++++++++++++++++-------------------- 1 file changed, 37 insertions(+), 28 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index ce770b4462..098df7d135 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -1525,7 +1525,7 @@ static bool spte_write_protect(u64 *sptep, bool pt_protect) static bool __rmap_write_protect(struct kvm *kvm, struct kvm_rmap_head *rmap_head, - bool pt_protect) + bool pt_protect, void *data) { u64 *sptep; struct rmap_iterator iter; @@ -1564,7 +1564,8 @@ static bool wrprot_ad_disabled_spte(u64 *sptep) * - W bit on ad-disabled SPTEs. * Returns true iff any D or W bits were cleared. 
*/ -static bool __rmap_clear_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head) +static bool __rmap_clear_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head, + void *data) { u64 *sptep; struct rmap_iterator iter; @@ -1590,7 +1591,8 @@ static bool spte_set_dirty(u64 *sptep) return mmu_spte_update(sptep, spte); } -static bool __rmap_set_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head) +static bool __rmap_set_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head, + void *data) { u64 *sptep; struct rmap_iterator iter; @@ -1622,7 +1624,7 @@ static void kvm_mmu_write_protect_pt_masked(struct kvm *kvm, while (mask) { rmap_head = __gfn_to_rmap(slot->base_gfn + gfn_offset + __ffs(mask), PT_PAGE_TABLE_LEVEL, slot); - __rmap_write_protect(kvm, rmap_head, false); + __rmap_write_protect(kvm, rmap_head, false, NULL); /* clear the first set bit */ mask &= mask - 1; @@ -1648,7 +1650,7 @@ void kvm_mmu_clear_dirty_pt_masked(struct kvm *kvm, while (mask) { rmap_head = __gfn_to_rmap(slot->base_gfn + gfn_offset + __ffs(mask), PT_PAGE_TABLE_LEVEL, slot); - __rmap_clear_dirty(kvm, rmap_head); + __rmap_clear_dirty(kvm, rmap_head, NULL); /* clear the first set bit */ mask &= mask - 1; @@ -1701,7 +1703,8 @@ bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm, for (i = PT_PAGE_TABLE_LEVEL; i <= PT_MAX_HUGEPAGE_LEVEL; ++i) { rmap_head = __gfn_to_rmap(gfn, i, slot); - write_protected |= __rmap_write_protect(kvm, rmap_head, true); + write_protected |= __rmap_write_protect(kvm, rmap_head, true, + NULL); } return write_protected; @@ -1715,7 +1718,8 @@ static bool rmap_write_protect(struct kvm_vcpu *vcpu, u64 gfn) return kvm_mmu_slot_gfn_write_protect(vcpu->kvm, slot, gfn); } -static bool kvm_zap_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head) +static bool kvm_zap_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head, + void *data) { u64 *sptep; struct rmap_iterator iter; @@ -1735,7 +1739,7 @@ static int kvm_unmap_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head, struct kvm_memory_slot *slot, gfn_t gfn, int level, unsigned long data) { - return kvm_zap_rmapp(kvm, rmap_head); + return kvm_zap_rmapp(kvm, rmap_head, NULL); } static int kvm_set_pte_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head, @@ -5552,13 +5556,15 @@ void kvm_mmu_uninit_vm(struct kvm *kvm) } /* The return value indicates if tlb flush on all vcpus is needed. */ -typedef bool (*slot_level_handler) (struct kvm *kvm, struct kvm_rmap_head *rmap_head); +typedef bool (*slot_level_handler) (struct kvm *kvm, + struct kvm_rmap_head *rmap_head, void *data); /* The caller should hold mmu-lock before calling this function. 
*/ static __always_inline bool slot_handle_level_range(struct kvm *kvm, struct kvm_memory_slot *memslot, slot_level_handler fn, int start_level, int end_level, - gfn_t start_gfn, gfn_t end_gfn, bool lock_flush_tlb) + gfn_t start_gfn, gfn_t end_gfn, bool lock_flush_tlb, + void *data) { struct slot_rmap_walk_iterator iterator; bool flush = false; @@ -5566,7 +5572,7 @@ slot_handle_level_range(struct kvm *kvm, struct kvm_memory_slot *memslot, for_each_slot_rmap_range(memslot, start_level, end_level, start_gfn, end_gfn, &iterator) { if (iterator.rmap) - flush |= fn(kvm, iterator.rmap); + flush |= fn(kvm, iterator.rmap, data); if (need_resched() || spin_needbreak(&kvm->mmu_lock)) { if (flush && lock_flush_tlb) { @@ -5588,36 +5594,36 @@ slot_handle_level_range(struct kvm *kvm, struct kvm_memory_slot *memslot, static __always_inline bool slot_handle_level(struct kvm *kvm, struct kvm_memory_slot *memslot, slot_level_handler fn, int start_level, int end_level, - bool lock_flush_tlb) + bool lock_flush_tlb, void *data) { return slot_handle_level_range(kvm, memslot, fn, start_level, end_level, memslot->base_gfn, memslot->base_gfn + memslot->npages - 1, - lock_flush_tlb); + lock_flush_tlb, data); } static __always_inline bool slot_handle_all_level(struct kvm *kvm, struct kvm_memory_slot *memslot, - slot_level_handler fn, bool lock_flush_tlb) + slot_level_handler fn, bool lock_flush_tlb, void *data) { return slot_handle_level(kvm, memslot, fn, PT_PAGE_TABLE_LEVEL, - PT_MAX_HUGEPAGE_LEVEL, lock_flush_tlb); + PT_MAX_HUGEPAGE_LEVEL, lock_flush_tlb, data); } static __always_inline bool slot_handle_large_level(struct kvm *kvm, struct kvm_memory_slot *memslot, - slot_level_handler fn, bool lock_flush_tlb) + slot_level_handler fn, bool lock_flush_tlb, void *data) { return slot_handle_level(kvm, memslot, fn, PT_PAGE_TABLE_LEVEL + 1, - PT_MAX_HUGEPAGE_LEVEL, lock_flush_tlb); + PT_MAX_HUGEPAGE_LEVEL, lock_flush_tlb, data); } static __always_inline bool slot_handle_leaf(struct kvm *kvm, struct kvm_memory_slot *memslot, - slot_level_handler fn, bool lock_flush_tlb) + slot_level_handler fn, bool lock_flush_tlb, void *data) { return slot_handle_level(kvm, memslot, fn, PT_PAGE_TABLE_LEVEL, - PT_PAGE_TABLE_LEVEL, lock_flush_tlb); + PT_PAGE_TABLE_LEVEL, lock_flush_tlb, data); } void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end) @@ -5645,7 +5651,7 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end) flush |= slot_handle_level_range(kvm, memslot, kvm_zap_rmapp, PT_PAGE_TABLE_LEVEL, PT_MAX_HUGEPAGE_LEVEL, start, - end - 1, flush_tlb); + end - 1, flush_tlb, NULL); } } @@ -5657,9 +5663,10 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end) } static bool slot_rmap_write_protect(struct kvm *kvm, - struct kvm_rmap_head *rmap_head) + struct kvm_rmap_head *rmap_head, + void *data) { - return __rmap_write_protect(kvm, rmap_head, false); + return __rmap_write_protect(kvm, rmap_head, false, data); } void kvm_mmu_slot_remove_write_access(struct kvm *kvm, @@ -5669,7 +5676,7 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, spin_lock(&kvm->mmu_lock); flush = slot_handle_all_level(kvm, memslot, slot_rmap_write_protect, - false); + false, NULL); spin_unlock(&kvm->mmu_lock); /* @@ -5696,7 +5703,8 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, } static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm, - struct kvm_rmap_head *rmap_head) + struct kvm_rmap_head *rmap_head, + void *data) { u64 *sptep; struct rmap_iterator iter; @@ -5740,7 +5748,7 @@ void 
kvm_mmu_zap_collapsible_sptes(struct kvm *kvm, /* FIXME: const-ify all uses of struct kvm_memory_slot. */ spin_lock(&kvm->mmu_lock); slot_handle_leaf(kvm, (struct kvm_memory_slot *)memslot, - kvm_mmu_zap_collapsible_spte, true); + kvm_mmu_zap_collapsible_spte, true, NULL); spin_unlock(&kvm->mmu_lock); } @@ -5750,7 +5758,7 @@ void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm, bool flush; spin_lock(&kvm->mmu_lock); - flush = slot_handle_leaf(kvm, memslot, __rmap_clear_dirty, false); + flush = slot_handle_leaf(kvm, memslot, __rmap_clear_dirty, false, NULL); spin_unlock(&kvm->mmu_lock); lockdep_assert_held(&kvm->slots_lock); @@ -5774,7 +5782,7 @@ void kvm_mmu_slot_largepage_remove_write_access(struct kvm *kvm, spin_lock(&kvm->mmu_lock); flush = slot_handle_large_level(kvm, memslot, slot_rmap_write_protect, - false); + false, NULL); spin_unlock(&kvm->mmu_lock); /* see kvm_mmu_slot_remove_write_access */ @@ -5792,7 +5800,8 @@ void kvm_mmu_slot_set_dirty(struct kvm *kvm, bool flush; spin_lock(&kvm->mmu_lock); - flush = slot_handle_all_level(kvm, memslot, __rmap_set_dirty, false); + flush = slot_handle_all_level(kvm, memslot, __rmap_set_dirty, false, + NULL); spin_unlock(&kvm->mmu_lock); lockdep_assert_held(&kvm->slots_lock); From patchwork Sun Jan 20 23:39:32 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ahmed Abd El Mawgood X-Patchwork-Id: 10772535 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8E48F13BF for ; Sun, 20 Jan 2019 23:41:41 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7D01C29CB1 for ; Sun, 20 Jan 2019 23:41:41 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6D86C29CC0; Sun, 20 Jan 2019 23:41:41 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 9D54729CB1 for ; Sun, 20 Jan 2019 23:41:40 +0000 (UTC) Received: (qmail 27677 invoked by uid 550); 20 Jan 2019 23:41:25 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 26572 invoked from network); 20 Jan 2019 23:41:25 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mena-vt-edu.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=juz3/8ubyO70o/WwTakdDVvsVYAiN+8d71XTlXhc8d0=; b=gnuyvR2bnuLilcd7+9IZF+J+o8w0XrNB/1Xkz7HCGSUq9qPW+NOUi/WjYSQqOIbFY6 zLE2d9gMubbXqFn/RyND6TvDIfdb/L+z0gY/VXI5CSIrtblIhzSkrwfO9Z3V+IZ3pvOf OgWoTmijDHQtTrwB5UKFvw62XDi3VPUnfQJz+ZgdTCd6RSzSyBC7iwapfZxpAHiNfiBX pBvS382M8jVHkBZVPVELstfsb3pvL6o7J+IpQFEUIL3qZGuiRXIiQ8eWV08tqCEpZ3k+ StfsIH+p6gK7RZjheXduzukbYDmmTJja/NEELPCmFu5nVOPQKZ38nUuJFdcfezgwc4gg o3Aw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to 
:references:mime-version:content-transfer-encoding; bh=juz3/8ubyO70o/WwTakdDVvsVYAiN+8d71XTlXhc8d0=; b=CoGKnhZq+WGvXmwRSCay7GPK86I9xEbi/SJRcHpBHnK5f3b/lsC5pOlPrWZK7/EDy2 QBIN0Vd9J6FzWpQIiDO642Gk0Hads9a/e2oRObgUFoDY3Zp33/eS1cFKyfvzVkITnCwL 6aVMhQ4RHlrCltKI48RVvFWBLn6ADmY5ToBlG0DZyE+ghGvmdsgqGfbr/GG7jwDg0fsD pq6RkQMVgNW6h6TQBcqQuyzvwQMlMS01+YPI8yLtLq/tAeHUUGjS2rUMp1QqKybsnKxN JTVDLr7VY9X00ev4yRmIzoFi2aRCF3mII2ZiYKKBgaqN0d+t7bTgz7QiUMss3y3g0YFi zd9g== X-Gm-Message-State: AJcUukcSKQoBspBURygPfqwUVfXHorWyOnjyiK+YriDhZ83xXNh/J4Ff V2iMjYGi84BBKVMQpJGjABR5qg== X-Google-Smtp-Source: ALg8bN4jEjW9am+Xzj8QcDzzmQ0pxFo+cktjow7q+YKP8Hur7vhk7wc+PrdAJwB7Juu+8w+lON7BAA== X-Received: by 2002:a1c:2c6:: with SMTP id 189mr22057938wmc.21.1548027673915; Sun, 20 Jan 2019 15:41:13 -0800 (PST) From: Ahmed Abd El Mawgood To: Paolo Bonzini , rkrcmar@redhat.com, Jonathan Corbet , Thomas Gleixner , Ingo Molnar , Borislav Petkov , hpa@zytor.com, x86@kernel.org, kvm@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, ahmedsoliman0x666@gmail.com, ovich00@gmail.com, kernel-hardening@lists.openwall.com, nigel.edwards@hpe.com, Boris Lukashev , Igor Stoppa Cc: Ahmed Abd El Mawgood Subject: [RESEND PATCH V8 03/11] KVM: X86: Add helper function to convert SPTE to GFN Date: Mon, 21 Jan 2019 01:39:32 +0200 Message-Id: <20190120233940.15282-4-ahmedsoliman@mena.vt.edu> X-Mailer: git-send-email 2.19.2 In-Reply-To: <20190120233940.15282-1-ahmedsoliman@mena.vt.edu> References: <20190120233940.15282-1-ahmedsoliman@mena.vt.edu> MIME-Version: 1.0 X-Virus-Scanned: ClamAV using ClamSMTP Signed-off-by: Ahmed Abd El Mawgood --- arch/x86/kvm/mmu.c | 7 +++++++ arch/x86/kvm/mmu.h | 1 + 2 files changed, 8 insertions(+) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 098df7d135..bbfe3f2863 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -1053,6 +1053,13 @@ static gfn_t kvm_mmu_page_get_gfn(struct kvm_mmu_page *sp, int index) return sp->gfn + (index << ((sp->role.level - 1) * PT64_LEVEL_BITS)); } +gfn_t spte_to_gfn(u64 *spte) +{ + struct kvm_mmu_page *sp; + + sp = page_header(__pa(spte)); + return kvm_mmu_page_get_gfn(sp, spte - sp->spt); +} static void kvm_mmu_page_set_gfn(struct kvm_mmu_page *sp, int index, gfn_t gfn) { diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index c7b333147c..49d7f2f002 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -211,4 +211,5 @@ void kvm_mmu_gfn_allow_lpage(struct kvm_memory_slot *slot, gfn_t gfn); bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm, struct kvm_memory_slot *slot, u64 gfn); int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu); +gfn_t spte_to_gfn(u64 *sptep); #endif From patchwork Sun Jan 20 23:39:33 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ahmed Abd El Mawgood X-Patchwork-Id: 10772537 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C747413BF for ; Sun, 20 Jan 2019 23:41:50 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B3D0B29CB9 for ; Sun, 20 Jan 2019 23:41:50 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A49EE29CB1; Sun, 20 Jan 2019 23:41:50 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 
tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id CA3C929CB1 for ; Sun, 20 Jan 2019 23:41:49 +0000 (UTC) Received: (qmail 28035 invoked by uid 550); 20 Jan 2019 23:41:29 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 27981 invoked from network); 20 Jan 2019 23:41:29 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mena-vt-edu.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=S5NVvoOgAgdZo8wmN11QDMw/7RKccGkERCz5dFRhs3I=; b=BdDxLGM1dmo0GQ+oCKzyAyW/P0iJv04mOqd5KTQC7iP2RnAShme5aGa8+eljPHQD5R 9UmVk6y1JS7PQ1SLhctISAowhr2r0ah2arCKYzHwcPIMBUI450ApL7L8k0GqISPqpioD UxI4E6VVT0p3jaEWvnULiJntikx2nuzcq1kWXgGHNp3baPwmjmhoIgkO7KuC6y813yud zvXyPUwiIOeN+gu9q5I5gKTTvdXNundyTG6tPOqDVEezJVRkPGQ+ow29IgF9qiH/lwsP cnQMPZD9X9blbOT7wc7fTFkK8bEjOwn4Wu1NeUueXVV60brUtY0RSwlBTv9y0t/tG3Nh GlJw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=S5NVvoOgAgdZo8wmN11QDMw/7RKccGkERCz5dFRhs3I=; b=p4aNogBUdgsn0wjA8bdn6Bbzt6V6M5SBKHYXmfGNOdir6nad4h9SFyPk5sq/Fa64u8 xrHe0NEsqg3XHPv58w7Q2nawkNg52K830urmFVm7390wfifS9a42yPepyF4qpjCSTSWm 66SWsqRkE7zZCqNQm8ERhmYlYIuiL9NgyEFpSjEE6kbK7+xoQB3WmRDqIXCy6AcqiIuG 2EAi+n6cpB845NYRKBwJPxnvhO8oWifl8NxUXxsDUsZk/cyv/iwIKGy/NMJy69e3Y5qn J2g5P/OakL8C2ztH+GCs1+Ny7kbdGrrNAKNCg3j3XfubdLZTw+0+8ExNQsgCRFP83DuS dBjA== X-Gm-Message-State: AJcUukfsM2gr4R4QuOWaSRYLmO2+unwU9TrbAbfMTEqVgSb/RvT9y0JH Rr/aXNEvpf8RcAgqv1t3iM0uDw== X-Google-Smtp-Source: ALg8bN4kc5ObGqv6NoNo2QHDN6CCtu/mF6luBJqOGiE2VJKAp52yGhiJxl9kbMkt/ISKNQ1uC0ZBkw== X-Received: by 2002:a1c:7409:: with SMTP id p9mr23272500wmc.136.1548027677700; Sun, 20 Jan 2019 15:41:17 -0800 (PST) From: Ahmed Abd El Mawgood To: Paolo Bonzini , rkrcmar@redhat.com, Jonathan Corbet , Thomas Gleixner , Ingo Molnar , Borislav Petkov , hpa@zytor.com, x86@kernel.org, kvm@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, ahmedsoliman0x666@gmail.com, ovich00@gmail.com, kernel-hardening@lists.openwall.com, nigel.edwards@hpe.com, Boris Lukashev , Igor Stoppa Cc: Ahmed Abd El Mawgood Subject: [RESEND PATCH V8 04/11] KVM: Document Memory ROE Date: Mon, 21 Jan 2019 01:39:33 +0200 Message-Id: <20190120233940.15282-5-ahmedsoliman@mena.vt.edu> X-Mailer: git-send-email 2.19.2 In-Reply-To: <20190120233940.15282-1-ahmedsoliman@mena.vt.edu> References: <20190120233940.15282-1-ahmedsoliman@mena.vt.edu> MIME-Version: 1.0 X-Virus-Scanned: ClamAV using ClamSMTP ROE version documented here is implemented in the next 2 patches Signed-off-by: Ahmed Abd El Mawgood --- Documentation/virtual/kvm/hypercalls.txt | 40 ++++++++++++++++++++++++ 1 file changed, 40 insertions(+) diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt index da24c138c8..a31f316ce6 100644 --- a/Documentation/virtual/kvm/hypercalls.txt +++ b/Documentation/virtual/kvm/hypercalls.txt @@ -141,3 +141,43 @@ a0 corresponds to the APIC ID in the third argument (a2), bit 1 corresponds to the APIC ID a2+1, and so on. 
Returns the number of CPUs to which the IPIs were delivered successfully. + +7. KVM_HC_ROE +---------------- +Architecture: x86 +Status: active +Purpose: Hypercall used to apply Read-Only Enforcement to guest memory and +registers +Usage 1: + a0: ROE_VERSION + +Returns non-signed number that represents the current version of ROE +implementation current version. + +Usage 2: + + a0: ROE_MPROTECT (requires version >= 1) + a1: Start address aligned to page boundary. + a2: Number of pages to be protected. + +This configuration lets a guest kernel have part of its read/write memory +converted into read-only. This action is irreversible. +Upon successful run, the number of pages protected is returned. + +Usage 3: + a0: ROE_MPROTECT_CHUNK (requires version >= 2) + a1: Start address aligned to page boundary. + a2: Number of bytes to be protected. +This configuration lets a guest kernel have part of its read/write memory +converted into read-only with bytes granularity. ROE_MPROTECT_CHUNK is +relatively slow compared to ROE_MPROTECT. This action is irreversible. +Upon successful run, the number of bytes protected is returned. + +Error codes: + -KVM_ENOSYS: system call being triggered from ring 3 or it is not + implemented. + -EINVAL: error based on given parameters. + +Notes: KVM_HC_ROE can not be triggered from guest Ring 3 (user mode). The +reason is that user mode malicious software can make use of it to enforce read +only protection on an arbitrary memory page thus crashing the kernel. From patchwork Sun Jan 20 23:39:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ahmed Abd El Mawgood X-Patchwork-Id: 10772539 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DCC5217E9 for ; Sun, 20 Jan 2019 23:42:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CAD3B29CB1 for ; Sun, 20 Jan 2019 23:42:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id BCAEF29CC0; Sun, 20 Jan 2019 23:42:00 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 4461C29CB1 for ; Sun, 20 Jan 2019 23:41:59 +0000 (UTC) Received: (qmail 28258 invoked by uid 550); 20 Jan 2019 23:41:32 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 28186 invoked from network); 20 Jan 2019 23:41:31 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mena-vt-edu.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ba3tDASNTotjNQKnGm55/6Aw1NBRDm2GVVzFiCk5Tjc=; b=xt1IaSpMNL7ziYvM+rZnBGX/PKTiPlj5tS9UtgSyiPJn92VBAFynaLivIruQFoWrCb LWwmR1ZBO4JmpORKmpVZMKLjGdeji5S4uPPgk/3NMH/2MRqC/jA7/CI6b0fUhpdTkHfJ V5my0sqbH76wXLJ5UwMA8Uxu1T/CddVtzUkb/HNFRtxIwdOgLeYRDOcPntoayD4/9vgn 
VLkI1KDpyRjLH7DfuaWSnV/UHNEm9SG4NRSDK+ctqa/B0J5XaZtSW7FPOPc7vqvGrbzt IadnMIC/hexahdFXpeqoIqD3dhm2k/Ax8jV8mj26WBPQ+KgBU/Szwfkjkp+fYwqJRBOP 3EnQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ba3tDASNTotjNQKnGm55/6Aw1NBRDm2GVVzFiCk5Tjc=; b=JQlL4kuUnKmn0snW7xBstPydP+YA4reFBgO3ACsaCLZvcsPrNEaNuVDK70knfcKzuw TT6/Nqvu8kTuUgXAHSGnyf4j8riGW67+ya6kOmqnToLO71CpzeG4GgbtYmK5Di1G9qMW drEQS06vIOHuEcYeOaEmVXGKPnp1Z6otlA9inHuPYR0YhSAgFIk6DaIcwoDCbuB33Ot8 B0rQRG9lu0ImCcpFJIb5IVeriUHoQE3r+oBEf2PoP3lT6Lj0+s1Sx2govzkknnueTFBX WtZJXh1412Gcmtgj+pnS8Cn+1U2smcFRRO4ja7mR+uQ1ewf1+T3Cma79/uiZbbc21en7 FbeQ== X-Gm-Message-State: AJcUukfAU/mO87LC1SH0WTrWt65o03TXq3+8HjzJYwDNZZbSfF+fndfa /5MU6BnCRWB84M/Kh97qvlFSuA== X-Google-Smtp-Source: ALg8bN7ul2pdzsg6zMf4tyvqYVChZ72Zkyxhd1WnH7clqi7jYMFuxYuMyxLoCiPtpUytkSGt+Q8+Aw== X-Received: by 2002:a5d:444a:: with SMTP id x10mr26249203wrr.162.1548027680378; Sun, 20 Jan 2019 15:41:20 -0800 (PST) From: Ahmed Abd El Mawgood To: Paolo Bonzini , rkrcmar@redhat.com, Jonathan Corbet , Thomas Gleixner , Ingo Molnar , Borislav Petkov , hpa@zytor.com, x86@kernel.org, kvm@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, ahmedsoliman0x666@gmail.com, ovich00@gmail.com, kernel-hardening@lists.openwall.com, nigel.edwards@hpe.com, Boris Lukashev , Igor Stoppa Cc: Ahmed Abd El Mawgood Subject: [RESEND PATCH V8 05/11] KVM: Create architecture independent ROE skeleton Date: Mon, 21 Jan 2019 01:39:34 +0200 Message-Id: <20190120233940.15282-6-ahmedsoliman@mena.vt.edu> X-Mailer: git-send-email 2.19.2 In-Reply-To: <20190120233940.15282-1-ahmedsoliman@mena.vt.edu> References: <20190120233940.15282-1-ahmedsoliman@mena.vt.edu> MIME-Version: 1.0 X-Virus-Scanned: ClamAV using ClamSMTP This patch introduces a hypercall that can help against a subset of kernel rootkits. It works by placing read-only protection in the shadow PTEs. The resulting protection is also kept in a bitmap for each kvm_memory_slot and is used as a reference when updating SPTEs. The whole goal is to protect the guest kernel's static data from modification even if an attacker is running in guest ring 0; for this reason there is no hypercall to revert the effect of the Memory ROE hypercall. This patch doesn't implement an integrity check on the guest TLB, so an obvious attack on the current implementation would involve remapping guest virtual addresses to different guest physical addresses, but there are plans to fix that. For this patch to work on a given arch/, one would need to implement two architecture-specific functions, kvm_roe_arch_commit_protection() and kvm_roe_arch_is_userspace(), and to have kvm_roe() invoked using the appropriate hypercall mechanism.
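
As an illustration only (not part of this patch), a minimal guest-side sketch of the call sequence could look as follows. It assumes an x86 guest where kvm_hypercall1()/kvm_hypercall3() from <linux/kvm_para.h> are available and that KVM_HC_ROE, ROE_VERSION and ROE_MPROTECT are the constants introduced by this series; the helper function itself is hypothetical:

#include <linux/errno.h>
#include <linux/kvm_para.h>

/* Ask the host to make one page of guest memory read-only (irreversible). */
static int roe_protect_one_page(unsigned long page_aligned_va)
{
	long ver, ret;

	/* Usage 1: a0 = ROE_VERSION, query the host's ROE version. */
	ver = kvm_hypercall1(KVM_HC_ROE, ROE_VERSION);
	if (ver < 1)
		return -ENOSYS; /* ROE_MPROTECT not supported by this host */

	/* Usage 2: a0 = ROE_MPROTECT, a1 = page-aligned start, a2 = page count. */
	ret = kvm_hypercall3(KVM_HC_ROE, ROE_MPROTECT, page_aligned_va, 1);

	/* On success the number of pages protected (here 1) is returned. */
	return ret == 1 ? 0 : (int)ret;
}

Because the protection is irreversible by design, a guest would typically issue such a call once, for its static read-only data, late in boot and from ring 0 (the hypercall is rejected when issued from guest user mode).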
Signed-off-by: Ahmed Abd El Mawgood --- include/kvm/roe.h | 16 ++++ include/linux/kvm_host.h | 1 + include/uapi/linux/kvm_para.h | 4 + virt/kvm/kvm_main.c | 19 +++-- virt/kvm/roe.c | 136 ++++++++++++++++++++++++++++++++++ virt/kvm/roe_generic.h | 19 +++++ 6 files changed, 190 insertions(+), 5 deletions(-) create mode 100644 include/kvm/roe.h create mode 100644 virt/kvm/roe.c create mode 100644 virt/kvm/roe_generic.h diff --git a/include/kvm/roe.h b/include/kvm/roe.h new file mode 100644 index 0000000000..6a86866623 --- /dev/null +++ b/include/kvm/roe.h @@ -0,0 +1,16 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef __KVM_ROE_H__ +#define __KVM_ROE_H__ +/* + * KVM Read Only Enforcement + * Copyright (c) 2018 Ahmed Abd El Mawgood + * + * Author Ahmed Abd El Mawgood + * + */ +void kvm_roe_arch_commit_protection(struct kvm *kvm, + struct kvm_memory_slot *slot); +int kvm_roe(struct kvm_vcpu *vcpu, u64 a0, u64 a1, u64 a2, u64 a3); +bool kvm_roe_arch_is_userspace(struct kvm_vcpu *vcpu); +#endif diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index c38cc5eb7e..a627c6e81a 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -297,6 +297,7 @@ static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu) struct kvm_memory_slot { gfn_t base_gfn; unsigned long npages; + unsigned long *roe_bitmap; unsigned long *dirty_bitmap; struct kvm_arch_memory_slot arch; unsigned long userspace_addr; diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h index 6c0ce49931..e6004e0750 100644 --- a/include/uapi/linux/kvm_para.h +++ b/include/uapi/linux/kvm_para.h @@ -28,7 +28,11 @@ #define KVM_HC_MIPS_CONSOLE_OUTPUT 8 #define KVM_HC_CLOCK_PAIRING 9 #define KVM_HC_SEND_IPI 10 +#define KVM_HC_ROE 11 +/* ROE Functionality parameters */ +#define ROE_VERSION 0 +#define ROE_MPROTECT 1 /* * hypercalls use architecture specific */ diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 2f37b4b6a2..88b5fbcbb0 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -61,6 +61,7 @@ #include "coalesced_mmio.h" #include "async_pf.h" #include "vfio.h" +#include "roe_generic.h" #define CREATE_TRACE_POINTS #include @@ -551,9 +552,10 @@ static void kvm_free_memslot(struct kvm *kvm, struct kvm_memory_slot *free, struct kvm_memory_slot *dont, enum kvm_mr_change change) { - if (change == KVM_MR_DELETE) + if (change == KVM_MR_DELETE) { + kvm_roe_free(free); kvm_destroy_dirty_bitmap(free); - + } kvm_arch_free_memslot(kvm, free, dont); free->npages = 0; @@ -1018,6 +1020,8 @@ int __kvm_set_memory_region(struct kvm *kvm, if (kvm_create_dirty_bitmap(&new) < 0) goto out_free; } + if (kvm_roe_init(&new) < 0) + goto out_free; slots = kvzalloc(sizeof(struct kvm_memslots), GFP_KERNEL); if (!slots) @@ -1348,13 +1352,18 @@ static bool memslot_is_readonly(struct kvm_memory_slot *slot) return slot->flags & KVM_MEM_READONLY; } +static bool gfn_is_readonly(struct kvm_memory_slot *slot, gfn_t gfn) +{ + return gfn_is_full_roe(slot, gfn) || memslot_is_readonly(slot); +} + static unsigned long __gfn_to_hva_many(struct kvm_memory_slot *slot, gfn_t gfn, gfn_t *nr_pages, bool write) { if (!slot || slot->flags & KVM_MEMSLOT_INVALID) return KVM_HVA_ERR_BAD; - if (memslot_is_readonly(slot) && write) + if (gfn_is_readonly(slot, gfn) && write) return KVM_HVA_ERR_RO_BAD; if (nr_pages) @@ -1402,7 +1411,7 @@ unsigned long gfn_to_hva_memslot_prot(struct kvm_memory_slot *slot, unsigned long hva = __gfn_to_hva_many(slot, gfn, NULL, false); if (!kvm_is_error_hva(hva) && writable) - *writable = 
!memslot_is_readonly(slot); + *writable = !gfn_is_readonly(slot, gfn); return hva; } @@ -1640,7 +1649,7 @@ kvm_pfn_t __gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn, } /* Do not map writable pfn in the readonly memslot. */ - if (writable && memslot_is_readonly(slot)) { + if (writable && gfn_is_readonly(slot, gfn)) { *writable = false; writable = NULL; } diff --git a/virt/kvm/roe.c b/virt/kvm/roe.c new file mode 100644 index 0000000000..33d3a4f507 --- /dev/null +++ b/virt/kvm/roe.c @@ -0,0 +1,136 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * KVM Read Only Enforcement + * Copyright (c) 2018 Ahmed Abd El Mawgood + * + * Author: Ahmed Abd El Mawgood + * + */ +#include +#include +#include +#include + +int kvm_roe_init(struct kvm_memory_slot *slot) +{ + slot->roe_bitmap = kvzalloc(BITS_TO_LONGS(slot->npages) * + sizeof(unsigned long), GFP_KERNEL); + if (!slot->roe_bitmap) + return -ENOMEM; + return 0; + +} + +void kvm_roe_free(struct kvm_memory_slot *slot) +{ + kvfree(slot->roe_bitmap); +} + +static void kvm_roe_protect_slot(struct kvm *kvm, struct kvm_memory_slot *slot, + gfn_t gfn, u64 npages) +{ + int i; + + for (i = gfn - slot->base_gfn; i < gfn + npages - slot->base_gfn; i++) + set_bit(i, slot->roe_bitmap); + kvm_roe_arch_commit_protection(kvm, slot); +} + + +static int __kvm_roe_protect_range(struct kvm *kvm, gpa_t gpa, u64 npages) +{ + struct kvm_memory_slot *slot; + gfn_t gfn = gpa >> PAGE_SHIFT; + int count = 0; + + while (npages != 0) { + slot = gfn_to_memslot(kvm, gfn); + if (!slot) { + gfn += 1; + npages -= 1; + continue; + } + if (gfn + npages > slot->base_gfn + slot->npages) { + u64 _npages = slot->base_gfn + slot->npages - gfn; + + kvm_roe_protect_slot(kvm, slot, gfn, _npages); + gfn += _npages; + count += _npages; + npages -= _npages; + } else { + kvm_roe_protect_slot(kvm, slot, gfn, npages); + count += npages; + npages = 0; + } + } + if (count == 0) + return -EINVAL; + return count; +} + +static int kvm_roe_protect_range(struct kvm *kvm, gpa_t gpa, u64 npages) +{ + int r; + + mutex_lock(&kvm->slots_lock); + r = __kvm_roe_protect_range(kvm, gpa, npages); + mutex_unlock(&kvm->slots_lock); + return r; +} + + +static int kvm_roe_full_protect_range(struct kvm_vcpu *vcpu, u64 gva, + u64 npages) +{ + struct kvm *kvm = vcpu->kvm; + gpa_t gpa; + u64 hva; + u64 count = 0; + int i; + int status; + + if (gva & ~PAGE_MASK) + return -EINVAL; + // We need to make sure that there will be no overflow + if ((npages << PAGE_SHIFT) >> PAGE_SHIFT != npages || npages == 0) + return -EINVAL; + for (i = 0; i < npages; i++) { + gpa = kvm_mmu_gva_to_gpa_system(vcpu, gva + (i << PAGE_SHIFT), + NULL); + hva = gfn_to_hva(kvm, gpa >> PAGE_SHIFT); + if (kvm_is_error_hva(hva)) + continue; + if (!access_ok(hva, 1 << PAGE_SHIFT)) + continue; + status = kvm_roe_protect_range(vcpu->kvm, gpa, 1); + if (status > 0) + count += status; + } + if (count == 0) + return -EINVAL; + return count; +} + +int kvm_roe(struct kvm_vcpu *vcpu, u64 a0, u64 a1, u64 a2, u64 a3) +{ + int ret; + /* + * First we need to make sure that we are running from something that + * isn't usermode + */ + if (kvm_roe_arch_is_userspace(vcpu)) + return -KVM_ENOSYS; + switch (a0) { + case ROE_VERSION: + ret = 1; //current version + break; + case ROE_MPROTECT: + ret = kvm_roe_full_protect_range(vcpu, a1, a2); + break; + default: + ret = -EINVAL; + } + return ret; +} +EXPORT_SYMBOL_GPL(kvm_roe); diff --git a/virt/kvm/roe_generic.h b/virt/kvm/roe_generic.h new file mode 100644 index 0000000000..36e5b52c5b --- /dev/null +++ 
b/virt/kvm/roe_generic.h @@ -0,0 +1,19 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef __KVM_ROE_GENERIC_H__ +#define __KVM_ROE_GENERIC_H__ +/* + * KVM Read Only Enforcement + * Copyright (c) 2018 Ahmed Abd El Mawgood + * + * Author Ahmed Abd El Mawgood + * + */ + +void kvm_roe_free(struct kvm_memory_slot *slot); +int kvm_roe_init(struct kvm_memory_slot *slot); +static inline bool gfn_is_full_roe(struct kvm_memory_slot *slot, gfn_t gfn) +{ + return test_bit(gfn - slot->base_gfn, slot->roe_bitmap); +} +#endif From patchwork Sun Jan 20 23:39:35 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ahmed Abd El Mawgood X-Patchwork-Id: 10772541 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E7AD417E9 for ; Sun, 20 Jan 2019 23:42:12 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D645A29CB1 for ; Sun, 20 Jan 2019 23:42:12 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CA75529CC0; Sun, 20 Jan 2019 23:42:12 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 13E0629CB1 for ; Sun, 20 Jan 2019 23:42:10 +0000 (UTC) Received: (qmail 28636 invoked by uid 550); 20 Jan 2019 23:41:36 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 28595 invoked from network); 20 Jan 2019 23:41:36 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mena-vt-edu.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=kwyRxsWyGV0P+PJEruTJ74ZAlIR7fu4S0vOirK73SPI=; b=KkONDRwn5zOGvo/lfcIR6JaHYCQJ71+ENCW4DtseaUyh4SnPhS0gIkzY3aU6lyCmTn DSXsguW/uTAQZJr6pxKkJalWxbNESmmDS/Ho3WmACgd7QJ1wXOdvAw63eUcrHguBDaKI fW+3W9VSoc+mxJ57CfMiltcRDmClj1OuUeh/3U8JSzXLZ6Q3/+DeAoMy0WAz9+xr9kiy Ag2FI28FDU24JFAMBSKCfTEW8/XGeoB5oKFeh4KWUb0IL+aR+qJ5qXQNsO5CYhUpvG6I HyGyTIkZFuaO51oYxNju5GSjZmGprxiqTDpGK8enpkz10vso2Wj8eyO2SP/24mB8/605 wOLA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=kwyRxsWyGV0P+PJEruTJ74ZAlIR7fu4S0vOirK73SPI=; b=C2Bz6C6qrJAjj5PwWM4uzxBL05ihql83rHGN59lWgSsq/S+1ip3anpGAXrmhkflXyT lt+xV+a5knqebh2n9F6lJIy5Kkl/JUyStTSuskIx7bgsqaDRWGNc7Xe5LiWz/Bg889eh 2BP1DCGEz+uPXlWDhSmQwyoDMnD28BJu732QM7hiwa0R5qS/8AzNmaje+sOkLme9BUrv QwPvEzv+Nbx9QSrjPtZZtLQ3HZtkLUVCklf7j6WpAP6Qwh3z9/AbON3JtEXEiBqrd4Db /smhxFRuhQnEG+XOxPGzlzUBDBmMWF4EtSZhtDr35d6k/HHktxPGnL4c5TpGjrKNPgq1 X0PQ== X-Gm-Message-State: AJcUukcCqkVjDuYAdYtL2xDjFzb1Lhku8ylHjlNxYrs0JfiTQqAE/205 h58MnbDYEy1MGaemwFxY/0vccg== X-Google-Smtp-Source: ALg8bN7l5YRAtdBSvTda12xjTOSl11wcupPyJIPFdE7W1HlRGYjbrYV8Tw1H81ERJ7v59wnikiw3jw== X-Received: by 2002:adf:fbc8:: with SMTP id d8mr25164144wrs.318.1548027684715; Sun, 
20 Jan 2019 15:41:24 -0800 (PST) From: Ahmed Abd El Mawgood To: Paolo Bonzini , rkrcmar@redhat.com, Jonathan Corbet , Thomas Gleixner , Ingo Molnar , Borislav Petkov , hpa@zytor.com, x86@kernel.org, kvm@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, ahmedsoliman0x666@gmail.com, ovich00@gmail.com, kernel-hardening@lists.openwall.com, nigel.edwards@hpe.com, Boris Lukashev , Igor Stoppa Cc: Ahmed Abd El Mawgood Subject: [RESEND PATCH V8 06/11] KVM: X86: Enable ROE for x86 Date: Mon, 21 Jan 2019 01:39:35 +0200 Message-Id: <20190120233940.15282-7-ahmedsoliman@mena.vt.edu> X-Mailer: git-send-email 2.19.2 In-Reply-To: <20190120233940.15282-1-ahmedsoliman@mena.vt.edu> References: <20190120233940.15282-1-ahmedsoliman@mena.vt.edu> MIME-Version: 1.0 X-Virus-Scanned: ClamAV using ClamSMTP This patch implements kvm_roe_arch_commit_protection and kvm_roe_arch_is_userspace for x86, and invoke kvm_roe via the appropriate vmcall. Signed-off-by: Ahmed Abd El Mawgood --- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kvm/Makefile | 4 +- arch/x86/kvm/mmu.c | 71 +++++----------------- arch/x86/kvm/mmu.h | 30 +++++++++- arch/x86/kvm/roe.c | 101 ++++++++++++++++++++++++++++++++ arch/x86/kvm/roe_arch.h | 28 +++++++++ arch/x86/kvm/x86.c | 11 ++-- 7 files changed, 183 insertions(+), 64 deletions(-) create mode 100644 arch/x86/kvm/roe.c create mode 100644 arch/x86/kvm/roe_arch.h diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 4660ce90de..797d838c3e 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1239,7 +1239,7 @@ void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask, u64 acc_track_mask, u64 me_mask); void kvm_mmu_reset_context(struct kvm_vcpu *vcpu); -void kvm_mmu_slot_remove_write_access(struct kvm *kvm, +void kvm_mmu_slot_apply_write_access(struct kvm *kvm, struct kvm_memory_slot *memslot); void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm, const struct kvm_memory_slot *memslot); diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile index 69b3a7c300..39f7766afe 100644 --- a/arch/x86/kvm/Makefile +++ b/arch/x86/kvm/Makefile @@ -9,7 +9,9 @@ CFLAGS_vmx.o := -I. KVM := ../../../virt/kvm kvm-y += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o \ - $(KVM)/eventfd.o $(KVM)/irqchip.o $(KVM)/vfio.o + $(KVM)/eventfd.o $(KVM)/irqchip.o $(KVM)/vfio.o \ + $(KVM)/roe.o roe.o + kvm-$(CONFIG_KVM_ASYNC_PF) += $(KVM)/async_pf.o kvm-y += x86.o mmu.o emulate.o i8259.o irq.o lapic.o \ diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index bbfe3f2863..2e3a43076e 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -23,7 +23,7 @@ #include "x86.h" #include "kvm_cache_regs.h" #include "cpuid.h" - +#include "roe_arch.h" #include #include #include @@ -1343,8 +1343,8 @@ static void pte_list_remove(struct kvm_rmap_head *rmap_head, u64 *sptep) __pte_list_remove(sptep, rmap_head); } -static struct kvm_rmap_head *__gfn_to_rmap(gfn_t gfn, int level, - struct kvm_memory_slot *slot) +struct kvm_rmap_head *__gfn_to_rmap(gfn_t gfn, int level, + struct kvm_memory_slot *slot) { unsigned long idx; @@ -1394,16 +1394,6 @@ static void rmap_remove(struct kvm *kvm, u64 *spte) __pte_list_remove(spte, rmap_head); } -/* - * Used by the following functions to iterate through the sptes linked by a - * rmap. All fields are private and not assumed to be used outside. 
- */ -struct rmap_iterator { - /* private fields */ - struct pte_list_desc *desc; /* holds the sptep if not NULL */ - int pos; /* index of the sptep */ -}; - /* * Iteration must be started by this function. This should also be used after * removing/dropping sptes from the rmap link because in such cases the @@ -1411,8 +1401,7 @@ struct rmap_iterator { * * Returns sptep if found, NULL otherwise. */ -static u64 *rmap_get_first(struct kvm_rmap_head *rmap_head, - struct rmap_iterator *iter) +u64 *rmap_get_first(struct kvm_rmap_head *rmap_head, struct rmap_iterator *iter) { u64 *sptep; @@ -1438,7 +1427,7 @@ static u64 *rmap_get_first(struct kvm_rmap_head *rmap_head, * * Returns sptep if found, NULL otherwise. */ -static u64 *rmap_get_next(struct rmap_iterator *iter) +u64 *rmap_get_next(struct rmap_iterator *iter) { u64 *sptep; @@ -1513,7 +1502,7 @@ static void drop_large_spte(struct kvm_vcpu *vcpu, u64 *sptep) * * Return true if tlb need be flushed. */ -static bool spte_write_protect(u64 *sptep, bool pt_protect) +bool spte_write_protect(u64 *sptep, bool pt_protect) { u64 spte = *sptep; @@ -1531,8 +1520,7 @@ static bool spte_write_protect(u64 *sptep, bool pt_protect) } static bool __rmap_write_protect(struct kvm *kvm, - struct kvm_rmap_head *rmap_head, - bool pt_protect, void *data) + struct kvm_rmap_head *rmap_head, bool pt_protect) { u64 *sptep; struct rmap_iterator iter; @@ -1631,7 +1619,7 @@ static void kvm_mmu_write_protect_pt_masked(struct kvm *kvm, while (mask) { rmap_head = __gfn_to_rmap(slot->base_gfn + gfn_offset + __ffs(mask), PT_PAGE_TABLE_LEVEL, slot); - __rmap_write_protect(kvm, rmap_head, false, NULL); + __rmap_write_protect(kvm, rmap_head, false); /* clear the first set bit */ mask &= mask - 1; @@ -1701,22 +1689,6 @@ int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu) return 0; } -bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm, - struct kvm_memory_slot *slot, u64 gfn) -{ - struct kvm_rmap_head *rmap_head; - int i; - bool write_protected = false; - - for (i = PT_PAGE_TABLE_LEVEL; i <= PT_MAX_HUGEPAGE_LEVEL; ++i) { - rmap_head = __gfn_to_rmap(gfn, i, slot); - write_protected |= __rmap_write_protect(kvm, rmap_head, true, - NULL); - } - - return write_protected; -} - static bool rmap_write_protect(struct kvm_vcpu *vcpu, u64 gfn) { struct kvm_memory_slot *slot; @@ -5562,10 +5534,6 @@ void kvm_mmu_uninit_vm(struct kvm *kvm) kvm_page_track_unregister_notifier(kvm, node); } -/* The return value indicates if tlb flush on all vcpus is needed. */ -typedef bool (*slot_level_handler) (struct kvm *kvm, - struct kvm_rmap_head *rmap_head, void *data); - /* The caller should hold mmu-lock before calling this function. 
*/ static __always_inline bool slot_handle_level_range(struct kvm *kvm, struct kvm_memory_slot *memslot, @@ -5609,9 +5577,8 @@ slot_handle_level(struct kvm *kvm, struct kvm_memory_slot *memslot, lock_flush_tlb, data); } -static __always_inline bool -slot_handle_all_level(struct kvm *kvm, struct kvm_memory_slot *memslot, - slot_level_handler fn, bool lock_flush_tlb, void *data) +bool slot_handle_all_level(struct kvm *kvm, struct kvm_memory_slot *memslot, + slot_level_handler fn, bool lock_flush_tlb, void *data) { return slot_handle_level(kvm, memslot, fn, PT_PAGE_TABLE_LEVEL, PT_MAX_HUGEPAGE_LEVEL, lock_flush_tlb, data); @@ -5673,21 +5640,15 @@ static bool slot_rmap_write_protect(struct kvm *kvm, struct kvm_rmap_head *rmap_head, void *data) { - return __rmap_write_protect(kvm, rmap_head, false, data); + return __rmap_write_protect(kvm, rmap_head, false); } -void kvm_mmu_slot_remove_write_access(struct kvm *kvm, - struct kvm_memory_slot *memslot) +void kvm_mmu_slot_apply_write_access(struct kvm *kvm, + struct kvm_memory_slot *memslot) { - bool flush; - - spin_lock(&kvm->mmu_lock); - flush = slot_handle_all_level(kvm, memslot, slot_rmap_write_protect, - false, NULL); - spin_unlock(&kvm->mmu_lock); - + bool flush = protect_all_levels(kvm, memslot); /* - * kvm_mmu_slot_remove_write_access() and kvm_vm_ioctl_get_dirty_log() + * kvm_mmu_slot_apply_write_access() and kvm_vm_ioctl_get_dirty_log() * which do tlb flush out of mmu-lock should be serialized by * kvm->slots_lock otherwise tlb flush would be missed. */ @@ -5792,7 +5753,7 @@ void kvm_mmu_slot_largepage_remove_write_access(struct kvm *kvm, false, NULL); spin_unlock(&kvm->mmu_lock); - /* see kvm_mmu_slot_remove_write_access */ + /* see kvm_mmu_slot_apply_write_access*/ lockdep_assert_held(&kvm->slots_lock); if (flush) diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index 49d7f2f002..35b46a6a0a 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -4,7 +4,7 @@ #include #include "kvm_cache_regs.h" - +#include "roe_arch.h" #define PT64_PT_BITS 9 #define PT64_ENT_PER_PAGE (1 << PT64_PT_BITS) #define PT32_PT_BITS 10 @@ -43,6 +43,24 @@ #define PT32_ROOT_LEVEL 2 #define PT32E_ROOT_LEVEL 3 +#define for_each_rmap_spte(_rmap_head_, _iter_, _spte_) \ + for (_spte_ = rmap_get_first(_rmap_head_, _iter_); \ + _spte_; _spte_ = rmap_get_next(_iter_)) + +/* + * Used by the following functions to iterate through the sptes linked by a + * rmap. All fields are private and not assumed to be used outside. + */ +struct rmap_iterator { + /* private fields */ + struct pte_list_desc *desc; /* holds the sptep if not NULL */ + int pos; /* index of the sptep */ +}; + +u64 *rmap_get_first(struct kvm_rmap_head *rmap_head, + struct rmap_iterator *iter); +u64 *rmap_get_next(struct rmap_iterator *iter); +bool spte_write_protect(u64 *sptep, bool pt_protect); static inline u64 rsvd_bits(int s, int e) { if (e < s) @@ -203,13 +221,19 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, return -(u32)fault & errcode; } +/* The return value indicates if tlb flush on all vcpus is needed. 
*/ +typedef bool (*slot_level_handler) (struct kvm *kvm, + struct kvm_rmap_head *rmap_head, void *data); + void kvm_mmu_invalidate_zap_all_pages(struct kvm *kvm); void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end); void kvm_mmu_gfn_disallow_lpage(struct kvm_memory_slot *slot, gfn_t gfn); void kvm_mmu_gfn_allow_lpage(struct kvm_memory_slot *slot, gfn_t gfn); -bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm, - struct kvm_memory_slot *slot, u64 gfn); int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu); gfn_t spte_to_gfn(u64 *sptep); +bool slot_handle_all_level(struct kvm *kvm, struct kvm_memory_slot *memslot, + slot_level_handler fn, bool lock_flush_tlb, void *data); +struct kvm_rmap_head *__gfn_to_rmap(gfn_t gfn, int level, + struct kvm_memory_slot *slot); #endif diff --git a/arch/x86/kvm/roe.c b/arch/x86/kvm/roe.c new file mode 100644 index 0000000000..f787106be8 --- /dev/null +++ b/arch/x86/kvm/roe.c @@ -0,0 +1,101 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * KVM Read Only Enforcement + * Copyright (c) 2018 Ahmed Abd El Mawgood + * + * Author: Ahmed Abd El Mawgood + * + */ +#include +#include +#include + + +#include +#include "kvm_cache_regs.h" +#include "mmu.h" +#include "roe_arch.h" + +static bool __rmap_write_protect_roe(struct kvm *kvm, + struct kvm_rmap_head *rmap_head, bool pt_protect, + struct kvm_memory_slot *memslot) +{ + u64 *sptep; + struct rmap_iterator iter; + bool prot; + bool flush = false; + + for_each_rmap_spte(rmap_head, &iter, sptep) { + int idx = spte_to_gfn(sptep) - memslot->base_gfn; + + prot = !test_bit(idx, memslot->roe_bitmap) && pt_protect; + flush |= spte_write_protect(sptep, prot); + } + return flush; +} + +bool kvm_mmu_slot_gfn_write_protect_roe(struct kvm *kvm, + struct kvm_memory_slot *slot, u64 gfn) +{ + struct kvm_rmap_head *rmap_head; + int i; + bool write_protected = false; + + for (i = PT_PAGE_TABLE_LEVEL; i <= PT_MAX_HUGEPAGE_LEVEL; ++i) { + rmap_head = __gfn_to_rmap(gfn, i, slot); + write_protected |= __rmap_write_protect_roe(kvm, rmap_head, + true, slot); + } + return write_protected; +} + +static bool slot_rmap_apply_protection(struct kvm *kvm, + struct kvm_rmap_head *rmap_head, void *data) +{ + struct kvm_memory_slot *memslot = (struct kvm_memory_slot *) data; + bool prot_mask = !(memslot->flags & KVM_MEM_READONLY); + + return __rmap_write_protect_roe(kvm, rmap_head, prot_mask, memslot); +} + +bool roe_protect_all_levels(struct kvm *kvm, struct kvm_memory_slot *memslot) +{ + bool flush; + + spin_lock(&kvm->mmu_lock); + flush = slot_handle_all_level(kvm, memslot, slot_rmap_apply_protection, + false, memslot); + spin_unlock(&kvm->mmu_lock); + return flush; +} + +void kvm_roe_arch_commit_protection(struct kvm *kvm, + struct kvm_memory_slot *slot) +{ + kvm_mmu_slot_apply_write_access(kvm, slot); + kvm_arch_flush_shadow_memslot(kvm, slot); +} +EXPORT_SYMBOL_GPL(kvm_roe_arch_commit_protection); + +bool kvm_roe_arch_is_userspace(struct kvm_vcpu *vcpu) +{ + u64 rflags; + u64 cr0 = kvm_read_cr0(vcpu); + u64 iopl; + + // first checking we are not in protected mode + if ((cr0 & 1) == 0) + return false; + /* + * we don't need to worry about comments in __get_regs + * because we are sure that this function will only be + * triggered at the end of a hypercall instruction. 
+ */ + rflags = kvm_get_rflags(vcpu); + iopl = (rflags >> 12) & 3; + if (iopl != 3) + return false; + return true; +} +EXPORT_SYMBOL_GPL(kvm_roe_arch_is_userspace); diff --git a/arch/x86/kvm/roe_arch.h b/arch/x86/kvm/roe_arch.h new file mode 100644 index 0000000000..17a8b79d36 --- /dev/null +++ b/arch/x86/kvm/roe_arch.h @@ -0,0 +1,28 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef __KVM_ROE_HARCH_H__ +#define __KVM_ROE_HARCH_H__ +/* + * KVM Read Only Enforcement + * Copyright (c) 2018 Ahmed Abd El Mawgood + * + * Author: Ahmed Abd El Mawgood + * + */ +#include "mmu.h" + +bool roe_protect_all_levels(struct kvm *kvm, struct kvm_memory_slot *memslot); + +static inline bool protect_all_levels(struct kvm *kvm, + struct kvm_memory_slot *memslot) +{ + return roe_protect_all_levels(kvm, memslot); +} +bool kvm_mmu_slot_gfn_write_protect_roe(struct kvm *kvm, + struct kvm_memory_slot *slot, u64 gfn); +static inline bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm, + struct kvm_memory_slot *slot, u64 gfn) +{ + return kvm_mmu_slot_gfn_write_protect_roe(kvm, slot, gfn); +} +#endif diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 02c8e095a2..19b0f2307e 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -20,6 +20,7 @@ */ #include +#include #include "irq.h" #include "mmu.h" #include "i8254.h" @@ -4469,7 +4470,7 @@ int kvm_vm_ioctl_clear_dirty_log(struct kvm *kvm, struct kvm_clear_dirty_log *lo /* * All the TLBs can be flushed out of mmu lock, see the comments in - * kvm_mmu_slot_remove_write_access(). + * kvm_mmu_slot_apply_write_access(). */ lockdep_assert_held(&kvm->slots_lock); if (flush) @@ -7025,7 +7026,6 @@ static int kvm_pv_clock_pairing(struct kvm_vcpu *vcpu, gpa_t paddr, return ret; } #endif - /* * kvm_pv_kick_cpu_op: Kick a vcpu. 
* @@ -7097,6 +7097,9 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu) ret = kvm_pv_send_ipi(vcpu->kvm, a0, a1, a2, a3, op_64_bit); break; #endif + case KVM_HC_ROE: + ret = kvm_roe(vcpu, a0, a1, a2, a3); + break; default: ret = -KVM_ENOSYS; break; @@ -9360,8 +9363,8 @@ static void kvm_mmu_slot_apply_flags(struct kvm *kvm, struct kvm_memory_slot *new) { /* Still write protect RO slot */ + kvm_mmu_slot_apply_write_access(kvm, new); if (new->flags & KVM_MEM_READONLY) { - kvm_mmu_slot_remove_write_access(kvm, new); return; } @@ -9399,7 +9402,7 @@ static void kvm_mmu_slot_apply_flags(struct kvm *kvm, if (kvm_x86_ops->slot_enable_log_dirty) kvm_x86_ops->slot_enable_log_dirty(kvm, new); else - kvm_mmu_slot_remove_write_access(kvm, new); + kvm_mmu_slot_apply_write_access(kvm, new); } else { if (kvm_x86_ops->slot_disable_log_dirty) kvm_x86_ops->slot_disable_log_dirty(kvm, new); From patchwork Sun Jan 20 23:39:36 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ahmed Abd El Mawgood X-Patchwork-Id: 10772543 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0D98013BF for ; Sun, 20 Jan 2019 23:42:26 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EC82229CC0 for ; Sun, 20 Jan 2019 23:42:25 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id DCE2E29CD5; Sun, 20 Jan 2019 23:42:25 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 3BBA129CC0 for ; Sun, 20 Jan 2019 23:42:24 +0000 (UTC) Received: (qmail 29999 invoked by uid 550); 20 Jan 2019 23:41:40 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 29926 invoked from network); 20 Jan 2019 23:41:40 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mena-vt-edu.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=FG9I5pcqQ6PBpmnFSV+YSdqpqF1gFPiVklkeK1UqFdY=; b=zankkbHaqdHGoJFjC87tGlsiQRSC4DJuXnNPwZxmUsRW6xZwNY6eS0h+rqEFfUYzKZ F6oZEwFH+j/FXhGE5Eq2cHQJ3c9fonjji/qHOSd63FCKpAJD+gpZ1F5wkKNv+PbdN1tD kI8KgNbqbVsv+CYXgDlYnIjjtZUz1uZpTTQnFQj0a7qTDzjOCqcM5p4JiYnjEe599w1E 0oPp1YxAVOjYe+YdFRpUONFtv01J2U/+ijKyYVEC8Kx2CBNSHNVpsapmjmtw/MoVfMNg qnQs0FKzJX3cu2W3cDTTwyUgDUgPllt1INKjYYrl/Q0rVkmnzo7MkwHOUxM+bm1rYNim H8kQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=FG9I5pcqQ6PBpmnFSV+YSdqpqF1gFPiVklkeK1UqFdY=; b=jjneXW/QtgCUN+ispPlr/33+6D0Fo9Npo1R0Im4LJ3H30i8HG119FmQjcLb6Xq0l40 um8b/jFHx08mLyA/2iLHrLkERhOdDwEQllSZnwLMhlDZwiv1Z0aX6kQ3ehbnAgiXa6Yj YmE4bfsJ6NNtPROeMQoYRHt3OVDtI817qIPBl+J8u0ZV+xuoMfT0aic2lhzeWGv7JZZe rpDysUMqCdgWCXYo5Axxie5ffF+4LSmO09u6gNIrRYLVDjHPC9bcnIoaLugfyRHuAyrY 
/z7DkOvz0zslRXcxiTJbHRVoLirY21mVDoG/4/LSYgfN267hGQPfhXR9BR5/BL1RJY4T EU0A== X-Gm-Message-State: AJcUukfFDAYiJubpkkS/DilLue0f9ZoWfhUYio8aDlqBRUYY6SQe+qzT n6vxfxqdCkJASfmCxgjmqrWxbw== X-Google-Smtp-Source: ALg8bN7aQrlCDWjbF2HnN6Ed8pW95Eay2sLkhNpmE9H65vAHIdL1UDsem4gNkZRJJM1oY56Jg40w/w== X-Received: by 2002:a1c:c58d:: with SMTP id v135mr23505842wmf.88.1548027688429; Sun, 20 Jan 2019 15:41:28 -0800 (PST) From: Ahmed Abd El Mawgood To: Paolo Bonzini , rkrcmar@redhat.com, Jonathan Corbet , Thomas Gleixner , Ingo Molnar , Borislav Petkov , hpa@zytor.com, x86@kernel.org, kvm@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, ahmedsoliman0x666@gmail.com, ovich00@gmail.com, kernel-hardening@lists.openwall.com, nigel.edwards@hpe.com, Boris Lukashev , Igor Stoppa Cc: Ahmed Abd El Mawgood Subject: [RESEND PATCH V8 07/11] KVM: Add support for byte granular memory ROE Date: Mon, 21 Jan 2019 01:39:36 +0200 Message-Id: <20190120233940.15282-8-ahmedsoliman@mena.vt.edu> X-Mailer: git-send-email 2.19.2 In-Reply-To: <20190120233940.15282-1-ahmedsoliman@mena.vt.edu> References: <20190120233940.15282-1-ahmedsoliman@mena.vt.edu> MIME-Version: 1.0 X-Virus-Scanned: ClamAV using ClamSMTP This patch documents and implements ROE_MPROTECT_CHUNK, a part of ROE hypercall designed to protect regions of a memory page with byte granularity. This feature provides a key primitive to protect against attacks involving pages remapping. Signed-off-by: Ahmed Abd El Mawgood --- include/linux/kvm_host.h | 24 ++++ include/uapi/linux/kvm_para.h | 1 + virt/kvm/kvm_main.c | 24 +++- virt/kvm/roe.c | 212 ++++++++++++++++++++++++++++++++-- virt/kvm/roe_generic.h | 6 + 5 files changed, 253 insertions(+), 14 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index a627c6e81a..9acf5f54ac 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -294,10 +294,34 @@ static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu) */ #define KVM_MEM_MAX_NR_PAGES ((1UL << 31) - 1) +/* + * This structure is used to hold memory areas that are to be protected in a + * memory frame with mixed page permissions. + **/ +struct protected_chunk { + gpa_t gpa; + u64 size; + struct list_head list; +}; + +static inline bool kvm_roe_range_overlap(struct protected_chunk *chunk, + gpa_t gpa, int len) { + /* + * https://stackoverflow.com/questions/325933/ + * determine-whether-two-date-ranges-overlap + * Assuming that it works, that link ^ provides a solution that is + * better than anything I would ever come up with. 
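+ * Concretely: the chunk spans [chunk->gpa, chunk->gpa + chunk->size - 1]
+ * and the access spans [gpa, gpa + len - 1]; two inclusive ranges
+ * [a1, a2] and [b1, b2] overlap exactly when a1 <= b2 && b1 <= a2,
+ * which is the check performed below.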
+ */ + return (gpa <= chunk->gpa + chunk->size - 1) && + (gpa + len - 1 >= chunk->gpa); +} + struct kvm_memory_slot { gfn_t base_gfn; unsigned long npages; unsigned long *roe_bitmap; + unsigned long *partial_roe_bitmap; + struct list_head *prot_list; unsigned long *dirty_bitmap; struct kvm_arch_memory_slot arch; unsigned long userspace_addr; diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h index e6004e0750..4a84f974bc 100644 --- a/include/uapi/linux/kvm_para.h +++ b/include/uapi/linux/kvm_para.h @@ -33,6 +33,7 @@ /* ROE Functionality parameters */ #define ROE_VERSION 0 #define ROE_MPROTECT 1 +#define ROE_MPROTECT_CHUNK 2 /* * hypercalls use architecture specific */ diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 88b5fbcbb0..819033f475 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1354,18 +1354,19 @@ static bool memslot_is_readonly(struct kvm_memory_slot *slot) static bool gfn_is_readonly(struct kvm_memory_slot *slot, gfn_t gfn) { - return gfn_is_full_roe(slot, gfn) || memslot_is_readonly(slot); + return gfn_is_full_roe(slot, gfn) || + gfn_is_partial_roe(slot, gfn) || + memslot_is_readonly(slot); } + static unsigned long __gfn_to_hva_many(struct kvm_memory_slot *slot, gfn_t gfn, gfn_t *nr_pages, bool write) { if (!slot || slot->flags & KVM_MEMSLOT_INVALID) return KVM_HVA_ERR_BAD; - if (gfn_is_readonly(slot, gfn) && write) return KVM_HVA_ERR_RO_BAD; - if (nr_pages) *nr_pages = slot->npages - (gfn - slot->base_gfn); @@ -1927,14 +1928,29 @@ int kvm_vcpu_read_guest_atomic(struct kvm_vcpu *vcpu, gpa_t gpa, return __kvm_read_guest_atomic(slot, gfn, data, offset, len); } EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest_atomic); +static u64 roe_gfn_to_hva(struct kvm_memory_slot *slot, gfn_t gfn, int offset, + int len) +{ + u64 addr; + if (!slot) + return KVM_HVA_ERR_RO_BAD; + if (kvm_roe_check_range(slot, gfn, offset, len)) + return KVM_HVA_ERR_RO_BAD; + if (memslot_is_readonly(slot)) + return KVM_HVA_ERR_RO_BAD; + if (gfn_is_full_roe(slot, gfn)) + return KVM_HVA_ERR_RO_BAD; + addr = __gfn_to_hva_many(slot, gfn, NULL, false); + return addr; +} static int __kvm_write_guest_page(struct kvm_memory_slot *memslot, gfn_t gfn, const void *data, int offset, int len) { int r; unsigned long addr; - addr = gfn_to_hva_memslot(memslot, gfn); + addr = roe_gfn_to_hva(memslot, gfn, offset, len); if (kvm_is_error_hva(addr)) return -EFAULT; r = __copy_to_user((void __user *)addr + offset, data, len); diff --git a/virt/kvm/roe.c b/virt/kvm/roe.c index 33d3a4f507..4393a6a6a2 100644 --- a/virt/kvm/roe.c +++ b/virt/kvm/roe.c @@ -11,34 +11,89 @@ #include #include #include +#include "roe_generic.h" int kvm_roe_init(struct kvm_memory_slot *slot) { slot->roe_bitmap = kvzalloc(BITS_TO_LONGS(slot->npages) * sizeof(unsigned long), GFP_KERNEL); if (!slot->roe_bitmap) - return -ENOMEM; + goto fail1; + slot->partial_roe_bitmap = kvzalloc(BITS_TO_LONGS(slot->npages) * + sizeof(unsigned long), GFP_KERNEL); + if (!slot->partial_roe_bitmap) + goto fail2; + slot->prot_list = kvzalloc(sizeof(struct list_head), GFP_KERNEL); + if (!slot->prot_list) + goto fail3; + INIT_LIST_HEAD(slot->prot_list); return 0; +fail3: + kvfree(slot->partial_roe_bitmap); +fail2: + kvfree(slot->roe_bitmap); +fail1: + return -ENOMEM; + +} + +static bool kvm_roe_protected_range(struct kvm_memory_slot *slot, gpa_t gpa, + int len) +{ + struct list_head *pos; + struct protected_chunk *cur_chunk; + + list_for_each(pos, slot->prot_list) { + cur_chunk = list_entry(pos, struct protected_chunk, list); + if 
(kvm_roe_range_overlap(cur_chunk, gpa, len)) + return true; + } + return false; +} + +bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset, + int len) +{ + gpa_t gpa = (gfn << PAGE_SHIFT) + offset; + if (!gfn_is_partial_roe(slot, gfn)) + return false; + return kvm_roe_protected_range(slot, gpa, len); } + void kvm_roe_free(struct kvm_memory_slot *slot) { + struct protected_chunk *pos, *n; + struct list_head *head = slot->prot_list; + kvfree(slot->roe_bitmap); + kvfree(slot->partial_roe_bitmap); + list_for_each_entry_safe(pos, n, head, list) { + list_del(&pos->list); + kvfree(pos); + } + kvfree(slot->prot_list); } static void kvm_roe_protect_slot(struct kvm *kvm, struct kvm_memory_slot *slot, - gfn_t gfn, u64 npages) + gfn_t gfn, u64 npages, bool partial) { int i; + void *bitmap; + if (partial) + bitmap = slot->partial_roe_bitmap; + else + bitmap = slot->roe_bitmap; for (i = gfn - slot->base_gfn; i < gfn + npages - slot->base_gfn; i++) - set_bit(i, slot->roe_bitmap); + set_bit(i, bitmap); kvm_roe_arch_commit_protection(kvm, slot); } -static int __kvm_roe_protect_range(struct kvm *kvm, gpa_t gpa, u64 npages) +static int __kvm_roe_protect_range(struct kvm *kvm, gpa_t gpa, u64 npages, + bool partial) { struct kvm_memory_slot *slot; gfn_t gfn = gpa >> PAGE_SHIFT; @@ -54,12 +109,12 @@ static int __kvm_roe_protect_range(struct kvm *kvm, gpa_t gpa, u64 npages) if (gfn + npages > slot->base_gfn + slot->npages) { u64 _npages = slot->base_gfn + slot->npages - gfn; - kvm_roe_protect_slot(kvm, slot, gfn, _npages); + kvm_roe_protect_slot(kvm, slot, gfn, _npages, partial); gfn += _npages; count += _npages; npages -= _npages; } else { - kvm_roe_protect_slot(kvm, slot, gfn, npages); + kvm_roe_protect_slot(kvm, slot, gfn, npages, partial); count += npages; npages = 0; } @@ -69,12 +124,13 @@ static int __kvm_roe_protect_range(struct kvm *kvm, gpa_t gpa, u64 npages) return count; } -static int kvm_roe_protect_range(struct kvm *kvm, gpa_t gpa, u64 npages) +static int kvm_roe_protect_range(struct kvm *kvm, gpa_t gpa, u64 npages, + bool partial) { int r; mutex_lock(&kvm->slots_lock); - r = __kvm_roe_protect_range(kvm, gpa, npages); + r = __kvm_roe_protect_range(kvm, gpa, npages, partial); mutex_unlock(&kvm->slots_lock); return r; } @@ -103,7 +159,7 @@ static int kvm_roe_full_protect_range(struct kvm_vcpu *vcpu, u64 gva, continue; if (!access_ok(hva, 1 << PAGE_SHIFT)) continue; - status = kvm_roe_protect_range(vcpu->kvm, gpa, 1); + status = kvm_roe_protect_range(vcpu->kvm, gpa, 1, false); if (status > 0) count += status; } @@ -112,6 +168,139 @@ static int kvm_roe_full_protect_range(struct kvm_vcpu *vcpu, u64 gva, return count; } +static int kvm_roe_insert_chunk_next(struct list_head *pos, u64 gpa, u64 size) +{ + struct protected_chunk *chunk; + + chunk = kvzalloc(sizeof(struct protected_chunk), GFP_KERNEL); + chunk->gpa = gpa; + chunk->size = size; + INIT_LIST_HEAD(&chunk->list); + list_add(&chunk->list, pos); + return size; +} + +static int kvm_roe_expand_chunk(struct protected_chunk *pos, u64 gpa, u64 size) +{ + u64 old_ptr = pos->gpa; + u64 old_size = pos->size; + + if (gpa < old_ptr) + pos->gpa = gpa; + if (gpa + size > old_ptr + old_size) + pos->size = gpa + size - pos->gpa; + return size; +} + +static bool kvm_roe_merge_chunks(struct protected_chunk *chunk) +{ + /*attempt merging 2 consecutive given the first one*/ + struct protected_chunk *next = list_next_entry(chunk, list); + + if (!kvm_roe_range_overlap(chunk, next->gpa, next->size)) + return false; + kvm_roe_expand_chunk(chunk, 
next->gpa, next->size); + list_del(&next->list); + kvfree(next); + return true; +} + +static int __kvm_roe_insert_chunk(struct kvm_memory_slot *slot, u64 gpa, + u64 size) +{ + /* kvm->slots_lock must be acquired*/ + struct protected_chunk *pos; + struct list_head *head = slot->prot_list; + + if (list_empty(head)) + return kvm_roe_insert_chunk_next(head, gpa, size); + /* + * pos here will never get deleted maybe the next one will + * that is why list_for_each_entry_safe is completely unsafe + */ + list_for_each_entry(pos, head, list) { + if (kvm_roe_range_overlap(pos, gpa, size)) { + int ret = kvm_roe_expand_chunk(pos, gpa, size); + + while (head != pos->list.next) + if (!kvm_roe_merge_chunks(pos)) + break; + return ret; + } + if (pos->gpa > gpa) { + struct protected_chunk *prev; + + prev = list_prev_entry(pos, list); + return kvm_roe_insert_chunk_next(&prev->list, gpa, + size); + } + } + pos = list_last_entry(head, struct protected_chunk, list); + + return kvm_roe_insert_chunk_next(&pos->list, gpa, size); +} + +static int kvm_roe_insert_chunk(struct kvm *kvm, u64 gpa, u64 size) +{ + struct kvm_memory_slot *slot; + gfn_t gfn = gpa >> PAGE_SHIFT; + int ret; + + mutex_lock(&kvm->slots_lock); + slot = gfn_to_memslot(kvm, gfn); + ret = __kvm_roe_insert_chunk(slot, gpa, size); + mutex_unlock(&kvm->slots_lock); + return ret; +} + +static int kvm_roe_partial_page_protect(struct kvm_vcpu *vcpu, u64 gva, + u64 size) +{ + gpa_t gpa = kvm_mmu_gva_to_gpa_system(vcpu, gva, NULL); + + kvm_roe_protect_range(vcpu->kvm, gpa, 1, true); + return kvm_roe_insert_chunk(vcpu->kvm, gpa, size); +} + +static int kvm_roe_partial_protect(struct kvm_vcpu *vcpu, u64 gva, u64 size) +{ + u64 gva_start = gva; + u64 gva_end = gva+size; + u64 gpn_start = gva_start >> PAGE_SHIFT; + u64 gpn_end = gva_end >> PAGE_SHIFT; + u64 _size; + int count = 0; + // We need to make sure that there will be no overflow or zero size + if (gva_end <= gva_start) + return -EINVAL; + + // protect the partial page at the start + if (gpn_end > gpn_start) + _size = PAGE_SIZE - (gva_start & PAGE_MASK) + 1; + else + _size = size; + size -= _size; + count += kvm_roe_partial_page_protect(vcpu, gva_start, _size); + // full protect in the middle pages + if (gpn_end - gpn_start > 1) { + int ret; + u64 _gva = (gpn_start + 1) << PAGE_SHIFT; + u64 npages = gpn_end - gpn_start - 1; + + size -= npages << PAGE_SHIFT; + ret = kvm_roe_full_protect_range(vcpu, _gva, npages); + if (ret > 0) + count += ret << PAGE_SHIFT; + } + // protect the partial page at the end + if (size != 0) + count += kvm_roe_partial_page_protect(vcpu, + gpn_end << PAGE_SHIFT, size); + if (count == 0) + return -EINVAL; + return count; +} + int kvm_roe(struct kvm_vcpu *vcpu, u64 a0, u64 a1, u64 a2, u64 a3) { int ret; @@ -123,11 +312,14 @@ int kvm_roe(struct kvm_vcpu *vcpu, u64 a0, u64 a1, u64 a2, u64 a3) return -KVM_ENOSYS; switch (a0) { case ROE_VERSION: - ret = 1; //current version + ret = 2; //current version break; case ROE_MPROTECT: ret = kvm_roe_full_protect_range(vcpu, a1, a2); break; + case ROE_MPROTECT_CHUNK: + ret = kvm_roe_partial_protect(vcpu, a1, a2); + break; default: ret = -EINVAL; } diff --git a/virt/kvm/roe_generic.h b/virt/kvm/roe_generic.h index 36e5b52c5b..ad121372f2 100644 --- a/virt/kvm/roe_generic.h +++ b/virt/kvm/roe_generic.h @@ -12,8 +12,14 @@ void kvm_roe_free(struct kvm_memory_slot *slot); int kvm_roe_init(struct kvm_memory_slot *slot); +bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset, + int len); static inline bool 
gfn_is_full_roe(struct kvm_memory_slot *slot, gfn_t gfn) { return test_bit(gfn - slot->base_gfn, slot->roe_bitmap); } +static inline bool gfn_is_partial_roe(struct kvm_memory_slot *slot, gfn_t gfn) +{ + return test_bit(gfn - slot->base_gfn, slot->partial_roe_bitmap); +} #endif From patchwork Sun Jan 20 23:39:37 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ahmed Abd El Mawgood X-Patchwork-Id: 10772545 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DB83F13BF for ; Sun, 20 Jan 2019 23:42:38 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CB12729CB1 for ; Sun, 20 Jan 2019 23:42:38 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id BEE5B29CC0; Sun, 20 Jan 2019 23:42:38 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id D73B529CB1 for ; Sun, 20 Jan 2019 23:42:37 +0000 (UTC) Received: (qmail 30295 invoked by uid 550); 20 Jan 2019 23:41:43 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 30209 invoked from network); 20 Jan 2019 23:41:43 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mena-vt-edu.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ibxTSM7aAfMxp0mh34e5XoCF9bu27Z34+6gCZTrPT34=; b=lcJa/9GoZ/Y6WfC0bylBfI5oCVTUbuuQaazcglM7FxSXk53oJ3EwA7gQ6DK+WxiFMb Tlrgwvc736fHcfg+YAswmtczDWnSSBMIcGSSXYg6i0MRH/ZIX2y9cpGSxmZ+J63hsT4/ QTQXtg0gQ8OgWOCw2BrEXgDSkegRbpVoC0R+6nlx8MG5HL4zfx/R+JZ6gMYPKE+l1tLJ AXI8c4Qwl9aiXZE+hXoOGirioCNsW/YzcD/4+02Q9kHE3d1EMNxtT20zPoj7Mcuv0039 pK40Sk4UGAAo1eF1IXEqWJ3/lN+mYFSzRQBsmUcXjgxuNNvUwayW8QCEFaj6ijFFXHa+ HtKQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ibxTSM7aAfMxp0mh34e5XoCF9bu27Z34+6gCZTrPT34=; b=p4HlvkB05UIOe+3Pgq0Hz38tLD4KPgooMZttLc+605VlZWYIyFbr9Y6jNiqE5IEM3L 5MhUF8d68GiDssMdp6VlzBwyDCfYcgU7or/D2wmmv7ZLUHeiJlZ3HXf0GsdCSxq1Sed+ qL66sfk6XFN1nSG3+TbC8dSzSS/Z4rkytxg0Fhl1rvpOEADf5UhNw01ALP5w/wr3aLnj Uo7ELIt59uunyWeTvKUMSOm9PWZyLOm3XMrHBrgf0MUE1mBVDPqKFwwIGzWOctStoSZT BvQDZbv2z1gDjBzUTwVrz3w8WHELyCnIta5IFtsjcnzeMMpQNd/JshppewsRXRm2Sa58 4E8A== X-Gm-Message-State: AJcUukfpH03jnS5S8TFeY96vSPg/zEdBdY6piCcf9GXXvHbQadsaXVWo 4Ha18pcgtf1WQ+d/TKrQgn4F5g== X-Google-Smtp-Source: ALg8bN6p9RWZDpkENLmzoXwCtF9v6nAyN1W721G5G6yv2ozcu7srm3C8FwlQtG+MKjShLdYSf4N48A== X-Received: by 2002:a1c:9a4c:: with SMTP id c73mr23491864wme.35.1548027691743; Sun, 20 Jan 2019 15:41:31 -0800 (PST) From: Ahmed Abd El Mawgood To: Paolo Bonzini , rkrcmar@redhat.com, Jonathan Corbet , Thomas Gleixner , Ingo Molnar , Borislav Petkov , hpa@zytor.com, x86@kernel.org, kvm@vger.kernel.org, 
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, ahmedsoliman0x666@gmail.com, ovich00@gmail.com, kernel-hardening@lists.openwall.com, nigel.edwards@hpe.com, Boris Lukashev , Igor Stoppa Cc: Ahmed Abd El Mawgood Subject: [RESEND PATCH V8 08/11] KVM: X86: Port ROE_MPROTECT_CHUNK to x86 Date: Mon, 21 Jan 2019 01:39:37 +0200 Message-Id: <20190120233940.15282-9-ahmedsoliman@mena.vt.edu> X-Mailer: git-send-email 2.19.2 In-Reply-To: <20190120233940.15282-1-ahmedsoliman@mena.vt.edu> References: <20190120233940.15282-1-ahmedsoliman@mena.vt.edu> MIME-Version: 1.0 X-Virus-Scanned: ClamAV using ClamSMTP Apply d->memslot->partial_roe_bitmap to shadow page table entries too. Signed-off-by: Ahmed Abd El Mawgood --- arch/x86/kvm/roe.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/roe.c b/arch/x86/kvm/roe.c index f787106be8..700f69823b 100644 --- a/arch/x86/kvm/roe.c +++ b/arch/x86/kvm/roe.c @@ -25,11 +25,14 @@ static bool __rmap_write_protect_roe(struct kvm *kvm, struct rmap_iterator iter; bool prot; bool flush = false; + void *full_bmp = memslot->roe_bitmap; + void *part_bmp = memslot->partial_roe_bitmap; for_each_rmap_spte(rmap_head, &iter, sptep) { int idx = spte_to_gfn(sptep) - memslot->base_gfn; - prot = !test_bit(idx, memslot->roe_bitmap) && pt_protect; + prot = !(test_bit(idx, full_bmp) || test_bit(idx, part_bmp)); + prot = prot && pt_protect; flush |= spte_write_protect(sptep, prot); } return flush; From patchwork Sun Jan 20 23:39:38 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ahmed Abd El Mawgood X-Patchwork-Id: 10772547 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2BB9917E9 for ; Sun, 20 Jan 2019 23:42:50 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1A4AA29CB1 for ; Sun, 20 Jan 2019 23:42:50 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0E04429CD1; Sun, 20 Jan 2019 23:42:50 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id E1FCF29CB1 for ; Sun, 20 Jan 2019 23:42:48 +0000 (UTC) Received: (qmail 30516 invoked by uid 550); 20 Jan 2019 23:41:46 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 30452 invoked from network); 20 Jan 2019 23:41:45 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mena-vt-edu.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=FcIt9r2n/mEgqCNb5UT29WUXJF2aCPaNHWPRCZT6MPQ=; b=1c9y8M8u1GlScRssrmufW1/Y7fhxLH4q1jt/CA5XKaXFsNAdtbu0w9xJ2Qx9oc50kG jNFY65vUmtflsLNaOdQyDTK3ARCroDpnifxvPaAW79sXXeb4IJqUXimuiBGpaUlYHwq7 cZu2ZqpZUh9Yg10KJKG9SEzo7YGdtIOzH8maEzfl3X5BYCRRcQpqurbS2ftuZ5VuFZYG UacqMhRQougz3i0vLt1pouYosIW92Ax47uLaYXP0teBu7W5AHJ36K+5RXmYlxJBPFWs7 
k9pEtFLWsg+tSppbCbX1eqBgb8Kw/rb1Rb1MaEkCfsEQM56SMpRCHdZIg7MV3aRHZoW5 FT5g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=FcIt9r2n/mEgqCNb5UT29WUXJF2aCPaNHWPRCZT6MPQ=; b=ZFeWp5Ij7nv5X0zU7BEzFcBYk4SvkiXPghw02PcCKDb8GlZpEGFnjhw7hycxmYmSnW zDZvU/CPSyTnXx2+GmJD81gVjlnG8xX8kUfwVb86iwMbP92q7u+mQGFV6VY4hycNC1g3 dHL6gCUCGfpvOdbxnroFTv3sK+grpHSBxtTlxCLlMPgHXV6iunsNndG7rwKT4/DD33Sb ZqaW93/XQ11MV/yyF6oPY559TRPvSmWfhGZXtzQfUQ0G43QygVrd8k7uUm8XXpFM2ojg fr8xNTymxN6qF2ZnoQMk7e7sr+nK6SgoLWxyqIvPMx/1E1M/syB/HsCc+cnvd7xSIpp2 3pYg== X-Gm-Message-State: AJcUukfcChHDPqQZSr1c5vjP9y2oLz0QmUvLlrixCA6BuDTQmx+Ilyix UbSb8sULqsylkXCV/7Yk9+V/CQ== X-Google-Smtp-Source: ALg8bN4DTv7SHsMAYfU64me5hoPQjfKcJBRkYkXoG2pmxBtf2ZQEC5d/5sRI+8BCLbZJXWgDqNZGlQ== X-Received: by 2002:a5d:470b:: with SMTP id y11mr25418979wrq.16.1548027694264; Sun, 20 Jan 2019 15:41:34 -0800 (PST) From: Ahmed Abd El Mawgood To: Paolo Bonzini , rkrcmar@redhat.com, Jonathan Corbet , Thomas Gleixner , Ingo Molnar , Borislav Petkov , hpa@zytor.com, x86@kernel.org, kvm@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, ahmedsoliman0x666@gmail.com, ovich00@gmail.com, kernel-hardening@lists.openwall.com, nigel.edwards@hpe.com, Boris Lukashev , Igor Stoppa Cc: Ahmed Abd El Mawgood Subject: [RESEND PATCH V8 09/11] KVM: Add new exit reason For ROE violations Date: Mon, 21 Jan 2019 01:39:38 +0200 Message-Id: <20190120233940.15282-10-ahmedsoliman@mena.vt.edu> X-Mailer: git-send-email 2.19.2 In-Reply-To: <20190120233940.15282-1-ahmedsoliman@mena.vt.edu> References: <20190120233940.15282-1-ahmedsoliman@mena.vt.edu> MIME-Version: 1.0 X-Virus-Scanned: ClamAV using ClamSMTP The problem is that qemu will not be able to detect ROE violations, so one option would be create host API to tell if a given page is ROE protected, or create ROE violation exit reason. 
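As a minimal userspace-side sketch (not part of this patch): assuming the KVM_EXIT_ROE value introduced below, and given that the diff still fills in the mmio fields of struct kvm_run before exiting, a VMM's run loop could report the violating access roughly as follows. The handle_vcpu_exit() helper and the stop-the-guest policy are illustrative assumptions only.

#include <stdio.h>
#include <linux/kvm.h>

/* Hypothetical VMM helper, not part of the series. */
static int handle_vcpu_exit(struct kvm_run *run)
{
	switch (run->exit_reason) {
	case KVM_EXIT_ROE:
		/*
		 * The faulting guest physical address and access size are
		 * still available through the mmio fields of struct kvm_run.
		 */
		fprintf(stderr, "ROE violation: %u-byte write at GPA 0x%llx\n",
			run->mmio.len,
			(unsigned long long)run->mmio.phys_addr);
		return -1; /* sample policy: stop running this vcpu */
	case KVM_EXIT_MMIO:
		/* normal MMIO emulation would be dispatched here */
		return 0;
	default:
		return 0;
	}
}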
Signed-off-by: Ahmed Abd El Mawgood --- arch/x86/kvm/x86.c | 10 +++++++++- include/kvm/roe.h | 12 ++++++++++++ include/uapi/linux/kvm.h | 2 +- virt/kvm/kvm_main.c | 1 + virt/kvm/roe.c | 2 +- virt/kvm/roe_generic.h | 9 +-------- 6 files changed, 25 insertions(+), 11 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 19b0f2307e..368e3d99fd 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -5409,6 +5409,7 @@ static int emulator_read_write(struct x86_emulate_ctxt *ctxt, const struct read_write_emulator_ops *ops) { struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt); + struct kvm_memory_slot *slot; gpa_t gpa; int rc; @@ -5450,7 +5451,14 @@ static int emulator_read_write(struct x86_emulate_ctxt *ctxt, vcpu->run->mmio.len = min(8u, vcpu->mmio_fragments[0].len); vcpu->run->mmio.is_write = vcpu->mmio_is_write = ops->write; - vcpu->run->exit_reason = KVM_EXIT_MMIO; + slot = kvm_vcpu_gfn_to_memslot(vcpu, gpa >> PAGE_SHIFT); + if (slot && ops->write && (kvm_roe_check_range(slot, gpa>>PAGE_SHIFT, + gpa - (gpa & PAGE_MASK), bytes) || + gfn_is_full_roe(slot, gpa>>PAGE_SHIFT))) + vcpu->run->exit_reason = KVM_EXIT_ROE; + else + vcpu->run->exit_reason = KVM_EXIT_MMIO; + vcpu->run->mmio.phys_addr = gpa; return ops->read_write_exit_mmio(vcpu, gpa, val, bytes); diff --git a/include/kvm/roe.h b/include/kvm/roe.h index 6a86866623..3121a67753 100644 --- a/include/kvm/roe.h +++ b/include/kvm/roe.h @@ -13,4 +13,16 @@ void kvm_roe_arch_commit_protection(struct kvm *kvm, struct kvm_memory_slot *slot); int kvm_roe(struct kvm_vcpu *vcpu, u64 a0, u64 a1, u64 a2, u64 a3); bool kvm_roe_arch_is_userspace(struct kvm_vcpu *vcpu); +bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset, + int len); +static inline bool gfn_is_full_roe(struct kvm_memory_slot *slot, gfn_t gfn) +{ + return test_bit(gfn - slot->base_gfn, slot->roe_bitmap); + +} +static inline bool gfn_is_partial_roe(struct kvm_memory_slot *slot, gfn_t gfn) +{ + return test_bit(gfn - slot->base_gfn, slot->partial_roe_bitmap); +} + #endif diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 6d4ea4b6c9..0a386bb5f2 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -235,7 +235,7 @@ struct kvm_hyperv_exit { #define KVM_EXIT_S390_STSI 25 #define KVM_EXIT_IOAPIC_EOI 26 #define KVM_EXIT_HYPERV 27 - +#define KVM_EXIT_ROE 28 /* For KVM_EXIT_INTERNAL_ERROR */ /* Emulate instruction failed. 
*/ #define KVM_INTERNAL_ERROR_EMULATION 1 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 819033f475..d92d300539 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -62,6 +62,7 @@ #include "async_pf.h" #include "vfio.h" #include "roe_generic.h" +#include #define CREATE_TRACE_POINTS #include diff --git a/virt/kvm/roe.c b/virt/kvm/roe.c index 4393a6a6a2..9540473f89 100644 --- a/virt/kvm/roe.c +++ b/virt/kvm/roe.c @@ -60,7 +60,7 @@ bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset, return false; return kvm_roe_protected_range(slot, gpa, len); } - +EXPORT_SYMBOL_GPL(kvm_roe_check_range); void kvm_roe_free(struct kvm_memory_slot *slot) { diff --git a/virt/kvm/roe_generic.h b/virt/kvm/roe_generic.h index ad121372f2..f1ce4a8aec 100644 --- a/virt/kvm/roe_generic.h +++ b/virt/kvm/roe_generic.h @@ -14,12 +14,5 @@ void kvm_roe_free(struct kvm_memory_slot *slot); int kvm_roe_init(struct kvm_memory_slot *slot); bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset, int len); -static inline bool gfn_is_full_roe(struct kvm_memory_slot *slot, gfn_t gfn) -{ - return test_bit(gfn - slot->base_gfn, slot->roe_bitmap); -} -static inline bool gfn_is_partial_roe(struct kvm_memory_slot *slot, gfn_t gfn) -{ - return test_bit(gfn - slot->base_gfn, slot->partial_roe_bitmap); -} + #endif From patchwork Sun Jan 20 23:39:39 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ahmed Abd El Mawgood X-Patchwork-Id: 10772549 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 93FCD17E9 for ; Sun, 20 Jan 2019 23:43:01 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7F07A29CB1 for ; Sun, 20 Jan 2019 23:43:01 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6C55A29CC0; Sun, 20 Jan 2019 23:43:01 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 9321529CB1 for ; Sun, 20 Jan 2019 23:43:00 +0000 (UTC) Received: (qmail 32107 invoked by uid 550); 20 Jan 2019 23:41:53 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 32046 invoked from network); 20 Jan 2019 23:41:52 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mena-vt-edu.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=IhG1ga+e4jMCr/1icaT0qzsbDovza4q20X40VnRy5Mw=; b=vPY8CKyF4BZgHUvB9f+hTiC4cuY5UAnYYG9Lexcqx7DTbCm925fQ9AMbHHWoxa/BPW OpAktjU2Y9R7ncJshPKtn6/mWocevKPSQqT2VND91cqmRabjphUqCDeF0eKTHRrgo54E ia1NVcOtQxFjM907lZSXRSHtSy86MjLjmZueYrWCOsXvynd8UnmC4LeCytW2BvPbyCVI xSIiAavK5pn4MgtrZdPKEKnfXTxvInD7mwV8hjFLxns9e1SLnfJU1G0rWTrY5MhDVW8b cwUIEqse0kiTlNHvk8yi0Ky3IV0u9yYkxzw6wLrUzIixsD4rzup9+nBdpUkuqhHJuelH nEFw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=IhG1ga+e4jMCr/1icaT0qzsbDovza4q20X40VnRy5Mw=; b=juqA6ReHVwNvcFMeu7Wbv3Xr92arP7n46xuBXXJTxvfQ/BVACsyode9qlUpKZAoGDc S90AkoRU2UXBaaI0xkBY6kv9kKzbC+9cNCONYpFBI1hUeC5H+x6fYXOfiYNbbT8s+SN0 ZdX8gG5/o/CpVjb3y0SMsUJsWpTDskotDKxoPlz+ch1X+W9iEjiLVjZ7Ryna8ErAIBXN 2xV93SHjPIEoKGs3JgAXD0bxkJ7Yq6Skwzf+PbQlIHZX2vWkJ1VZ62Mupyu7u7fcM/Qu PYm+CcACP0F4WyhWyRqfGhAj4etT5z57vMnomX1PEwA/U22OqxR0qKgW98ql4Sth5ieo FCmQ== X-Gm-Message-State: AJcUukcSNS9n5yfgHbKPZUcy6qfQU53mLEFJruKbjXz1c97rUiVqz8LP BhoZWeccIuLxPg+kcdkY9m22hQ== X-Google-Smtp-Source: ALg8bN4Y+xaNp3oVkt+57vV4ZLidw9dYYAcjBRI3aruZ8x1uvjg7qcZTEuQvUQ7ijUbKgU31CHGC4g== X-Received: by 2002:a5d:60cc:: with SMTP id x12mr24587573wrt.193.1548027701035; Sun, 20 Jan 2019 15:41:41 -0800 (PST) From: Ahmed Abd El Mawgood To: Paolo Bonzini , rkrcmar@redhat.com, Jonathan Corbet , Thomas Gleixner , Ingo Molnar , Borislav Petkov , hpa@zytor.com, x86@kernel.org, kvm@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, ahmedsoliman0x666@gmail.com, ovich00@gmail.com, kernel-hardening@lists.openwall.com, nigel.edwards@hpe.com, Boris Lukashev , Igor Stoppa Cc: Ahmed Abd El Mawgood Subject: [RESEND PATCH V8 10/11] KVM: Log ROE violations in system log Date: Mon, 21 Jan 2019 01:39:39 +0200 Message-Id: <20190120233940.15282-11-ahmedsoliman@mena.vt.edu> X-Mailer: git-send-email 2.19.2 In-Reply-To: <20190120233940.15282-1-ahmedsoliman@mena.vt.edu> References: <20190120233940.15282-1-ahmedsoliman@mena.vt.edu> MIME-Version: 1.0 X-Virus-Scanned: ClamAV using ClamSMTP Signed-off-by: Ahmed Abd El Mawgood --- virt/kvm/kvm_main.c | 3 ++- virt/kvm/roe.c | 25 +++++++++++++++++++++++++ virt/kvm/roe_generic.h | 3 ++- 3 files changed, 29 insertions(+), 2 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index d92d300539..b3dc7255b0 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1945,13 +1945,14 @@ static u64 roe_gfn_to_hva(struct kvm_memory_slot *slot, gfn_t gfn, int offset, addr = __gfn_to_hva_many(slot, gfn, NULL, false); return addr; } + static int __kvm_write_guest_page(struct kvm_memory_slot *memslot, gfn_t gfn, const void *data, int offset, int len) { int r; unsigned long addr; - addr = roe_gfn_to_hva(memslot, gfn, offset, len); + kvm_roe_check_and_log(memslot, gfn, data, offset, len); if (kvm_is_error_hva(addr)) return -EFAULT; r = __copy_to_user((void __user *)addr + offset, data, len); diff --git a/virt/kvm/roe.c b/virt/kvm/roe.c index 9540473f89..e424b45e1c 100644 --- a/virt/kvm/roe.c +++ b/virt/kvm/roe.c @@ -76,6 +76,31 @@ void kvm_roe_free(struct kvm_memory_slot *slot) kvfree(slot->prot_list); } +static void kvm_warning_roe_violation(u64 addr, const void *data, int len) +{ + int i; + const char *d = data; + char *buf = kvmalloc(len * 3 + 1, GFP_KERNEL); + + for (i = 0; i < len; i++) + sprintf(buf+3*i, " %02x", d[i]); + pr_warn("ROE violation:\n"); + pr_warn("\tAttempt to write %d bytes at address 0x%08llx\n", len, addr); + pr_warn("\tData: %s\n", buf); + kvfree(buf); +} + +void kvm_roe_check_and_log(struct kvm_memory_slot *memslot, gfn_t gfn, + const void *data, int offset, int len) +{ + if (!memslot) + return; + if (!gfn_is_full_roe(memslot, gfn) && + !kvm_roe_check_range(memslot, gfn, offset, len)) + return; + kvm_warning_roe_violation((gfn << PAGE_SHIFT) + offset, data, len); +} + static void kvm_roe_protect_slot(struct kvm *kvm, struct kvm_memory_slot *slot, gfn_t gfn, u64 
npages, bool partial) { diff --git a/virt/kvm/roe_generic.h b/virt/kvm/roe_generic.h index f1ce4a8aec..6c5f0cf381 100644 --- a/virt/kvm/roe_generic.h +++ b/virt/kvm/roe_generic.h @@ -14,5 +14,6 @@ void kvm_roe_free(struct kvm_memory_slot *slot); int kvm_roe_init(struct kvm_memory_slot *slot); bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset, int len); - +void kvm_roe_check_and_log(struct kvm_memory_slot *memslot, gfn_t gfn, + const void *data, int offset, int len); #endif From patchwork Sun Jan 20 23:39:40 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ahmed Abd El Mawgood X-Patchwork-Id: 10772551 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 455B613BF for ; Sun, 20 Jan 2019 23:43:13 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 332F229CB1 for ; Sun, 20 Jan 2019 23:43:13 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2663729CC0; Sun, 20 Jan 2019 23:43:13 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.wl.linuxfoundation.org (Postfix) with SMTP id 8070029CB1 for ; Sun, 20 Jan 2019 23:43:11 +0000 (UTC) Received: (qmail 32380 invoked by uid 550); 20 Jan 2019 23:41:56 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 32281 invoked from network); 20 Jan 2019 23:41:55 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mena-vt-edu.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=cHRFL44h2OEM1lGSXPFKxqS991ECPuCJqpPKgvIlapQ=; b=1K05vbsbiCFg1uDMFOk6XT6m4OZTKGiD/ZqJKFxK5KMAqMLoaEfcotniRiGyZkp6k8 bQH7NeJOcciWncnukc4dIIDAZP0vHyfWHgidmxQX0E8E/X4xfgjYkghTiqxibK/NMAfK 7f/ddrBKo8XB/scx7PQrslNBw7bT9kI33S6ysDMsq2VHyEX04DCNxdxdVaARwebRIa2K kWVsjT8orert5hb6mAvvO+MMwOcSa6++VfblQRjkXuNJDPmFPfMZGYbjdod4Q5lsZjDI CANrz1MgyNFK1dGyIE52Ns7wGy7KObs+vtG/O9PN75UQ8ok8S3iOxccEviGs2MrfgIcK mvew== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=cHRFL44h2OEM1lGSXPFKxqS991ECPuCJqpPKgvIlapQ=; b=KTKKSyqdOs63weT5g1lN6VCgkKwD3s0aV8CXXdWkc4SjbPwXkJcRA3xwJnoXEHtv+G uKKy63pmTQpXLEiyBWwtQt5oJwaNeDedM1ah+w31HEcjOC1awD+2SodtFzPLZaOPbAz1 CzOFWu3Q39ed+n+EXUzSFC97S8h+Wh9znDXs3kDpH0IuJFYQrbvpOxvBWbW7sLV2UJTP vAlb756TUCY7xTkSyqYkLaIV3QM7C9nbDpd586P2unoLHBvTmJXPjExiInCWHWhgGUI0 uDHZVpx5A/n92EBzisSUPPDttwuYfWbsANMVjhIeFzxp24hmSgf+o/wXg/qhN+maUzqf bZBg== X-Gm-Message-State: AJcUukcH/aSXs0PgV6YKXw31oCrdfp4560AcDXNP5FNyNln0Vh3eq5bk ai1ZC+xZ4iu6sMtR7kP5zWk7AQ== X-Google-Smtp-Source: ALg8bN4CAzW3Z7hD2q8jAG4vnDTlcFzSVnLGfjSxq1S5eVp+69Eb3PEdTH6rN1KzAkxGPHYjjHsBiA== X-Received: by 2002:a1c:cc19:: with SMTP id h25mr22437444wmb.80.1548027703919; Sun, 20 Jan 2019 
15:41:43 -0800 (PST) From: Ahmed Abd El Mawgood To: Paolo Bonzini , rkrcmar@redhat.com, Jonathan Corbet , Thomas Gleixner , Ingo Molnar , Borislav Petkov , hpa@zytor.com, x86@kernel.org, kvm@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, ahmedsoliman0x666@gmail.com, ovich00@gmail.com, kernel-hardening@lists.openwall.com, nigel.edwards@hpe.com, Boris Lukashev , Igor Stoppa Cc: Ahmed Abd El Mawgood Subject: [RESEND PATCH V8 11/11] KVM: ROE: Store protected chunks in red black tree Date: Mon, 21 Jan 2019 01:39:40 +0200 Message-Id: <20190120233940.15282-12-ahmedsoliman@mena.vt.edu> X-Mailer: git-send-email 2.19.2 In-Reply-To: <20190120233940.15282-1-ahmedsoliman@mena.vt.edu> References: <20190120233940.15282-1-ahmedsoliman@mena.vt.edu> MIME-Version: 1.0 X-Virus-Scanned: ClamAV using ClamSMTP The old way of storing protected chunks was a linked list. That made linear overhead when searching for chunks. When reaching 2000 chunk, The time taken two read the last chunk was about 10 times slower than the first chunk. This patch stores the chunks as tree for faster search. Signed-off-by: Ahmed Abd El Mawgood --- include/linux/kvm_host.h | 36 ++++++- virt/kvm/roe.c | 228 +++++++++++++++++++++++++++------------ virt/kvm/roe_generic.h | 3 + 3 files changed, 197 insertions(+), 70 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 9acf5f54ac..5f4bec0662 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -9,6 +9,7 @@ #include #include #include +#include #include #include #include @@ -301,7 +302,7 @@ static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu) struct protected_chunk { gpa_t gpa; u64 size; - struct list_head list; + struct rb_node node; }; static inline bool kvm_roe_range_overlap(struct protected_chunk *chunk, @@ -316,12 +317,43 @@ static inline bool kvm_roe_range_overlap(struct protected_chunk *chunk, (gpa + len - 1 >= chunk->gpa); } +static inline int kvm_roe_range_cmp_position(struct protected_chunk *chunk, + gpa_t gpa, int len) { + /* + * returns -1 if the gpa and len are smaller than chunk. 
+ * returns 0 if they overlap or strictly adjacent + * returns 1 if gpa and len are bigger than the chunk + */ + + if (gpa + len <= chunk->gpa) + return -1; + if (gpa >= chunk->gpa + chunk->size) + return 1; + return 0; +} + +static inline int kvm_roe_range_cmp_mergability(struct protected_chunk *chunk, + gpa_t gpa, int len) { + /* + * returns -1 if the gpa and len are smaller than chunk and not adjacent + * to it + * returns 0 if they overlap or strictly adjacent + * returns 1 if gpa and len are bigger than the chunk and not adjacent + * to it + */ + if (gpa + len < chunk->gpa) + return -1; + if (gpa > chunk->gpa + chunk->size) + return 1; + return 0; + +} struct kvm_memory_slot { gfn_t base_gfn; unsigned long npages; unsigned long *roe_bitmap; unsigned long *partial_roe_bitmap; - struct list_head *prot_list; + struct rb_root *prot_root; unsigned long *dirty_bitmap; struct kvm_arch_memory_slot arch; unsigned long userspace_addr; diff --git a/virt/kvm/roe.c b/virt/kvm/roe.c index e424b45e1c..15297c0e57 100644 --- a/virt/kvm/roe.c +++ b/virt/kvm/roe.c @@ -23,10 +23,10 @@ int kvm_roe_init(struct kvm_memory_slot *slot) sizeof(unsigned long), GFP_KERNEL); if (!slot->partial_roe_bitmap) goto fail2; - slot->prot_list = kvzalloc(sizeof(struct list_head), GFP_KERNEL); - if (!slot->prot_list) + slot->prot_root = kvzalloc(sizeof(struct rb_root), GFP_KERNEL); + if (!slot->prot_root) goto fail3; - INIT_LIST_HEAD(slot->prot_list); + *slot->prot_root = RB_ROOT; return 0; fail3: kvfree(slot->partial_roe_bitmap); @@ -40,12 +40,19 @@ int kvm_roe_init(struct kvm_memory_slot *slot) static bool kvm_roe_protected_range(struct kvm_memory_slot *slot, gpa_t gpa, int len) { - struct list_head *pos; - struct protected_chunk *cur_chunk; - - list_for_each(pos, slot->prot_list) { - cur_chunk = list_entry(pos, struct protected_chunk, list); - if (kvm_roe_range_overlap(cur_chunk, gpa, len)) + struct rb_node *node = slot->prot_root->rb_node; + + while (node) { + struct protected_chunk *cur_chunk; + int cmp; + + cur_chunk = rb_entry(node, struct protected_chunk, node); + cmp = kvm_roe_range_cmp_position(cur_chunk, gpa, len); + if (cmp < 0)/*target chunk is before current node*/ + node = node->rb_left; + else if (cmp > 0)/*target chunk is after current node*/ + node = node->rb_right; + else return true; } return false; @@ -62,18 +69,24 @@ bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset, } EXPORT_SYMBOL_GPL(kvm_roe_check_range); -void kvm_roe_free(struct kvm_memory_slot *slot) +static void kvm_roe_destroy_tree(struct rb_node *node) { - struct protected_chunk *pos, *n; - struct list_head *head = slot->prot_list; + struct protected_chunk *cur_chunk; + + if (!node) + return; + kvm_roe_destroy_tree(node->rb_left); + kvm_roe_destroy_tree(node->rb_right); + cur_chunk = rb_entry(node, struct protected_chunk, node); + kvfree(cur_chunk); +} +void kvm_roe_free(struct kvm_memory_slot *slot) +{ kvfree(slot->roe_bitmap); kvfree(slot->partial_roe_bitmap); - list_for_each_entry_safe(pos, n, head, list) { - list_del(&pos->list); - kvfree(pos); - } - kvfree(slot->prot_list); + kvm_roe_destroy_tree(slot->prot_root->rb_node); + kvfree(slot->prot_root); } static void kvm_warning_roe_violation(u64 addr, const void *data, int len) @@ -193,40 +206,119 @@ static int kvm_roe_full_protect_range(struct kvm_vcpu *vcpu, u64 gva, return count; } -static int kvm_roe_insert_chunk_next(struct list_head *pos, u64 gpa, u64 size) -{ - struct protected_chunk *chunk; - - chunk = kvzalloc(sizeof(struct protected_chunk), GFP_KERNEL); 
- chunk->gpa = gpa; - chunk->size = size; - INIT_LIST_HEAD(&chunk->list); - list_add(&chunk->list, pos); - return size; -} - -static int kvm_roe_expand_chunk(struct protected_chunk *pos, u64 gpa, u64 size) +static u64 kvm_roe_expand_chunk(struct protected_chunk *pos, u64 gpa, u64 size) { u64 old_ptr = pos->gpa; u64 old_size = pos->size; + u64 ret = 0; - if (gpa < old_ptr) + if (gpa < old_ptr) { pos->gpa = gpa; - if (gpa + size > old_ptr + old_size) + ret |= KVM_ROE_MERGE_LEFT; + } + if (gpa + size > old_ptr + old_size) { pos->size = gpa + size - pos->gpa; - return size; + ret |= KVM_ROE_MERGE_RIGHT; + } + return ret; } +static void kvm_roe_merge_left(struct rb_root *root, struct rb_node *start) +{ + struct rb_root fake_root; + struct protected_chunk *target, *first; + struct rb_node *node, *stop; + u64 i, count = 0; -static bool kvm_roe_merge_chunks(struct protected_chunk *chunk) + if (!start->rb_left) + return; + fake_root = (struct rb_root) {start->rb_left}; + stop = rb_prev(rb_first(&fake_root)); + /* Back traverse till no node can be merged*/ + target = container_of(start, struct protected_chunk, node); + for (node = rb_last(&fake_root); node != stop; node = rb_prev(node)) { + struct protected_chunk *pos; + + pos = container_of(node, struct protected_chunk, node); + if (kvm_roe_range_cmp_mergability(target, pos->gpa, pos->size)) + break; + count += 1; + } + if (!count) + return; + /* merging step*/ + node = rb_next(node); + first = container_of(node, struct protected_chunk, node); + kvm_roe_expand_chunk(target, first->gpa, first->size); + /* forward traverse and delete all in between*/ + for (i = 0; i < count; i++) { + struct protected_chunk *pos; + + pos = container_of(node, struct protected_chunk, node); + rb_erase(node, root); + kvfree(pos); + node = rb_next(node); + } +} + +static void kvm_roe_merge_right(struct rb_root *root, struct rb_node *start) { - /*attempt merging 2 consecutive given the first one*/ - struct protected_chunk *next = list_next_entry(chunk, list); + struct rb_root fake_root; + struct protected_chunk *target, *first; + struct rb_node *node, *stop; + u64 i, count = 0; - if (!kvm_roe_range_overlap(chunk, next->gpa, next->size)) - return false; - kvm_roe_expand_chunk(chunk, next->gpa, next->size); - list_del(&next->list); - kvfree(next); + if (!start->rb_right) + return; + fake_root = (struct rb_root) {start->rb_right}; + stop = rb_next(rb_last(&fake_root)); + /* Forward traverse till no node can be merged*/ + target = container_of(start, struct protected_chunk, node); + for (node = rb_first(&fake_root); node != stop; node = rb_next(node)) { + struct protected_chunk *pos; + + pos = container_of(node, struct protected_chunk, node); + if (kvm_roe_range_cmp_mergability(target, pos->gpa, pos->size)) + break; + count += 1; + } + if (!count) + return; + /* merging step*/ + node = rb_prev(node); + first = container_of(node, struct protected_chunk, node); + kvm_roe_expand_chunk(target, first->gpa, first->size); + /* Backward traverse and delete all in between*/ + for (i = 0; i < count; i++) { + struct protected_chunk *pos; + + pos = container_of(node, struct protected_chunk, node); + rb_erase(node, root); + kvfree(pos); + node = rb_prev(node); + } +} + +static bool kvm_roe_merge_chunks(struct rb_root *root, struct rb_node *target, + u64 gpa, u64 size) +{ + /* + * attempt merging all adjacent chunks after inserting a chunk that is + * adjacent or inersecting with an existing chunk + */ + struct protected_chunk *cur; + u64 merge; + + cur = container_of(target, struct 
protected_chunk, node); + merge = kvm_roe_expand_chunk(cur, gpa, size); + /* + * We will not have to worry about the parent node while merging + * If it was mergeable with the new to be inserted chunk we wouldn't + * have gone deeper. + **/ + if (merge & KVM_ROE_MERGE_LEFT) + kvm_roe_merge_left(root, target); + if (merge & KVM_ROE_MERGE_RIGHT) + kvm_roe_merge_right(root, target); return true; } @@ -234,35 +326,35 @@ static int __kvm_roe_insert_chunk(struct kvm_memory_slot *slot, u64 gpa, u64 size) { /* kvm->slots_lock must be acquired*/ - struct protected_chunk *pos; - struct list_head *head = slot->prot_list; - - if (list_empty(head)) - return kvm_roe_insert_chunk_next(head, gpa, size); - /* - * pos here will never get deleted maybe the next one will - * that is why list_for_each_entry_safe is completely unsafe - */ - list_for_each_entry(pos, head, list) { - if (kvm_roe_range_overlap(pos, gpa, size)) { - int ret = kvm_roe_expand_chunk(pos, gpa, size); - - while (head != pos->list.next) - if (!kvm_roe_merge_chunks(pos)) - break; - return ret; - } - if (pos->gpa > gpa) { - struct protected_chunk *prev; - - prev = list_prev_entry(pos, list); - return kvm_roe_insert_chunk_next(&prev->list, gpa, - size); + struct rb_node **new = &(slot->prot_root->rb_node), *parent = NULL; + struct protected_chunk *insert_me; + bool merge = false; + + while (*new) { + struct protected_chunk *chunk; + int cmp; + + chunk = container_of(*new, struct protected_chunk, node); + cmp = kvm_roe_range_cmp_mergability(chunk, gpa, size); + parent = *new; + if (cmp < 0) { + new = &((*new)->rb_left); + } else if (cmp > 0) { + new = &((*new)->rb_right); + } else { + merge = true; + kvm_roe_merge_chunks(slot->prot_root, *new, gpa, size); + break; } } - pos = list_last_entry(head, struct protected_chunk, list); - - return kvm_roe_insert_chunk_next(&pos->list, gpa, size); + if (merge) + return size; + insert_me = kvzalloc(sizeof(struct protected_chunk), GFP_KERNEL); + insert_me->gpa = gpa; + insert_me->size = size; + rb_link_node(&insert_me->node, parent, new); + rb_insert_color(&insert_me->node, slot->prot_root); + return size; } static int kvm_roe_insert_chunk(struct kvm *kvm, u64 gpa, u64 size) diff --git a/virt/kvm/roe_generic.h b/virt/kvm/roe_generic.h index 6c5f0cf381..8e42c9795c 100644 --- a/virt/kvm/roe_generic.h +++ b/virt/kvm/roe_generic.h @@ -10,6 +10,9 @@ * */ +#define KVM_ROE_MERGE_LEFT 0x1 +#define KVM_ROE_MERGE_RIGHT 0x2 + void kvm_roe_free(struct kvm_memory_slot *slot); int kvm_roe_init(struct kvm_memory_slot *slot); bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset,
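To illustrate how a guest might exercise the interface end to end, here is a hedged guest-kernel sketch. KVM_HC_ROE, ROE_MPROTECT and ROE_MPROTECT_CHUNK are the values added by this series; the demo module, the buffer name and the use of kvm_hypercall3() as the call mechanism are illustrative assumptions, not code from the patches. The call is made from the guest kernel, since the series rejects ROE hypercalls issued from guest user space (see kvm_roe_arch_is_userspace() above).

#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <asm/kvm_para.h>

/* Hypothetical demo buffer; protected with byte granularity after init. */
static char roe_demo_buf[64] = "write-once configuration";

static int __init roe_demo_init(void)
{
	long ret;

	/* a0 = ROE function, a1 = guest virtual address, a2 = size in bytes */
	ret = kvm_hypercall3(KVM_HC_ROE, ROE_MPROTECT_CHUNK,
			     (unsigned long)roe_demo_buf,
			     sizeof(roe_demo_buf));
	pr_info("roe_demo: ROE_MPROTECT_CHUNK returned %ld\n", ret);
	return 0;
}
module_init(roe_demo_init);

MODULE_LICENSE("GPL");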