From patchwork Sun Jan 6 19:23:45 2019
X-Patchwork-Submitter: Ahmed Abd El Mawgood
X-Patchwork-Id: 10749649
From: Ahmed Abd El Mawgood
To: Paolo Bonzini , rkrcmar@redhat.com, Jonathan Corbet , Thomas Gleixner , Ingo Molnar , Borislav Petkov , hpa@zytor.com, x86@kernel.org, kvm@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, ahmedsoliman0x666@gmail.com, ovich00@gmail.com, kernel-hardening@lists.openwall.com, nigel.edwards@hpe.com, Boris Lukashev , Igor Stoppa
Cc: Ahmed Abd El Mawgood
Subject: [PATCH V8 11/11] KVM: ROE: Store protected chunks in red black tree
Date: Sun, 6 Jan 2019 21:23:45 +0200
Message-Id: <20190106192345.13578-12-ahmedsoliman@mena.vt.edu>
In-Reply-To: <20190106192345.13578-1-ahmedsoliman@mena.vt.edu>
References: <20190106192345.13578-1-ahmedsoliman@mena.vt.edu>

The old way of storing protected chunks was a linked list, which made
searching for a chunk a linear-time operation.
When it reached 2000 chunks, reading the last chunk took about 10 times
longer than reading the first. This patch stores the chunks in a
red-black tree for faster search.

Signed-off-by: Ahmed Abd El Mawgood
---
 include/linux/kvm_host.h |  36 ++++++-
 virt/kvm/roe.c           | 228 +++++++++++++++++++++++++++------------
 virt/kvm/roe_generic.h   |   3 +
 3 files changed, 197 insertions(+), 70 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 9acf5f54ac..5f4bec0662 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -9,6 +9,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -301,7 +302,7 @@ static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu)
 struct protected_chunk {
 	gpa_t gpa;
 	u64 size;
-	struct list_head list;
+	struct rb_node node;
 };
 
 static inline bool kvm_roe_range_overlap(struct protected_chunk *chunk,
@@ -316,12 +317,43 @@ static inline bool kvm_roe_range_overlap(struct protected_chunk *chunk,
 		(gpa + len - 1 >= chunk->gpa);
 }
 
+static inline int kvm_roe_range_cmp_position(struct protected_chunk *chunk,
+		gpa_t gpa, int len)
+{
+	/*
+	 * returns -1 if the whole range lies before the chunk
+	 * returns 0 if the ranges overlap
+	 * returns 1 if the whole range lies after the chunk
+	 */
+	if (gpa + len <= chunk->gpa)
+		return -1;
+	if (gpa >= chunk->gpa + chunk->size)
+		return 1;
+	return 0;
+}
+
+static inline int kvm_roe_range_cmp_mergability(struct protected_chunk *chunk,
+		gpa_t gpa, int len)
+{
+	/*
+	 * returns -1 if the whole range lies before the chunk and is not
+	 * adjacent to it
+	 * returns 0 if the ranges overlap or are exactly adjacent
+	 * returns 1 if the whole range lies after the chunk and is not
+	 * adjacent to it
+	 */
+	if (gpa + len < chunk->gpa)
+		return -1;
+	if (gpa > chunk->gpa + chunk->size)
+		return 1;
+	return 0;
+}
+
 struct kvm_memory_slot {
 	gfn_t base_gfn;
 	unsigned long npages;
 	unsigned long *roe_bitmap;
 	unsigned long *partial_roe_bitmap;
-	struct list_head *prot_list;
+	struct rb_root *prot_root;
 	unsigned long *dirty_bitmap;
 	struct kvm_arch_memory_slot arch;
 	unsigned long userspace_addr;
diff --git a/virt/kvm/roe.c b/virt/kvm/roe.c
index e424b45e1c..15297c0e57 100644
--- a/virt/kvm/roe.c
+++ b/virt/kvm/roe.c
@@ -23,10 +23,10 @@ int kvm_roe_init(struct kvm_memory_slot *slot)
 			sizeof(unsigned long), GFP_KERNEL);
 	if (!slot->partial_roe_bitmap)
 		goto fail2;
-	slot->prot_list = kvzalloc(sizeof(struct list_head), GFP_KERNEL);
-	if (!slot->prot_list)
+	slot->prot_root = kvzalloc(sizeof(struct rb_root), GFP_KERNEL);
+	if (!slot->prot_root)
 		goto fail3;
-	INIT_LIST_HEAD(slot->prot_list);
+	*slot->prot_root = RB_ROOT;
 	return 0;
 fail3:
 	kvfree(slot->partial_roe_bitmap);
@@ -40,12 +40,19 @@ int kvm_roe_init(struct kvm_memory_slot *slot)
 static bool kvm_roe_protected_range(struct kvm_memory_slot *slot,
 		gpa_t gpa, int len)
 {
-	struct list_head *pos;
-	struct protected_chunk *cur_chunk;
-
-	list_for_each(pos, slot->prot_list) {
-		cur_chunk = list_entry(pos, struct protected_chunk, list);
-		if (kvm_roe_range_overlap(cur_chunk, gpa, len))
+	struct rb_node *node = slot->prot_root->rb_node;
+
+	while (node) {
+		struct protected_chunk *cur_chunk;
+		int cmp;
+
+		cur_chunk = rb_entry(node, struct protected_chunk, node);
+		cmp = kvm_roe_range_cmp_position(cur_chunk, gpa, len);
+		if (cmp < 0) /* target range is before current node */
+			node = node->rb_left;
+		else if (cmp > 0) /* target range is after current node */
+			node = node->rb_right;
+		else
 			return true;
 	}
 	return false;
@@ -62,18 +69,24 @@ bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset,
 }
 EXPORT_SYMBOL_GPL(kvm_roe_check_range);
 
-void kvm_roe_free(struct kvm_memory_slot *slot)
+static void kvm_roe_destroy_tree(struct rb_node *node)
 {
-	struct protected_chunk *pos, *n;
-	struct list_head *head = slot->prot_list;
+	struct protected_chunk *cur_chunk;
+
+	if (!node)
+		return;
+	kvm_roe_destroy_tree(node->rb_left);
+	kvm_roe_destroy_tree(node->rb_right);
+	cur_chunk = rb_entry(node, struct protected_chunk, node);
+	kvfree(cur_chunk);
+}
 
+void kvm_roe_free(struct kvm_memory_slot *slot)
+{
 	kvfree(slot->roe_bitmap);
 	kvfree(slot->partial_roe_bitmap);
-	list_for_each_entry_safe(pos, n, head, list) {
-		list_del(&pos->list);
-		kvfree(pos);
-	}
-	kvfree(slot->prot_list);
+	kvm_roe_destroy_tree(slot->prot_root->rb_node);
+	kvfree(slot->prot_root);
 }
 
 static void kvm_warning_roe_violation(u64 addr, const void *data, int len)
@@ -193,40 +206,119 @@ static int kvm_roe_full_protect_range(struct kvm_vcpu *vcpu, u64 gva,
 	return count;
 }
 
-static int kvm_roe_insert_chunk_next(struct list_head *pos, u64 gpa, u64 size)
-{
-	struct protected_chunk *chunk;
-
-	chunk = kvzalloc(sizeof(struct protected_chunk), GFP_KERNEL);
-	chunk->gpa = gpa;
-	chunk->size = size;
-	INIT_LIST_HEAD(&chunk->list);
-	list_add(&chunk->list, pos);
-	return size;
-}
-
-static int kvm_roe_expand_chunk(struct protected_chunk *pos, u64 gpa, u64 size)
+static u64 kvm_roe_expand_chunk(struct protected_chunk *pos, u64 gpa, u64 size)
 {
 	u64 old_ptr = pos->gpa;
 	u64 old_size = pos->size;
+	u64 ret = 0;
 
-	if (gpa < old_ptr)
+	if (gpa < old_ptr) {
 		pos->gpa = gpa;
-	if (gpa + size > old_ptr + old_size)
+		pos->size = old_ptr + old_size - gpa;
+		ret |= KVM_ROE_MERGE_LEFT;
+	}
+	if (gpa + size > old_ptr + old_size) {
 		pos->size = gpa + size - pos->gpa;
-	return size;
+		ret |= KVM_ROE_MERGE_RIGHT;
+	}
+	return ret;
 }
 
-static bool kvm_roe_merge_chunks(struct protected_chunk *chunk)
-{
-	/*attempt merging 2 consecutive given the first one*/
-	struct protected_chunk *next = list_next_entry(chunk, list);
-
-	if (!kvm_roe_range_overlap(chunk, next->gpa, next->size))
-		return false;
-	kvm_roe_expand_chunk(chunk, next->gpa, next->size);
-	list_del(&next->list);
-	kvfree(next);
+static void kvm_roe_merge_left(struct rb_root *root, struct rb_node *start)
+{
+	struct rb_root fake_root;
+	struct protected_chunk *target, *first;
+	struct rb_node *node, *stop;
+	u64 i, count = 0;
+
+	if (!start->rb_left)
+		return;
+	fake_root = (struct rb_root) {start->rb_left};
+	stop = rb_prev(rb_first(&fake_root));
+	/* Backward traverse till no node can be merged */
+	target = container_of(start, struct protected_chunk, node);
+	for (node = rb_last(&fake_root); node != stop; node = rb_prev(node)) {
+		struct protected_chunk *pos;
+
+		pos = container_of(node, struct protected_chunk, node);
+		if (kvm_roe_range_cmp_mergability(target, pos->gpa, pos->size))
+			break;
+		count += 1;
+	}
+	if (!count)
+		return;
+	/* merging step */
+	node = rb_next(node);
+	first = container_of(node, struct protected_chunk, node);
+	kvm_roe_expand_chunk(target, first->gpa, first->size);
+	/* forward traverse and delete all in between */
+	for (i = 0; i < count; i++) {
+		struct protected_chunk *pos;
+		struct rb_node *next = rb_next(node);
+
+		pos = container_of(node, struct protected_chunk, node);
+		rb_erase(node, root);
+		kvfree(pos);
+		node = next;
+	}
+}
+
+static void kvm_roe_merge_right(struct rb_root *root, struct rb_node *start)
+{
+	struct rb_root fake_root;
+	struct protected_chunk *target, *first;
+	struct rb_node *node, *stop;
+	u64 i, count = 0;
+
+	if (!start->rb_right)
+		return;
+	fake_root = (struct rb_root) {start->rb_right};
+	stop = rb_next(rb_last(&fake_root));
+	/* Forward traverse till no node can be merged */
+	target = container_of(start, struct protected_chunk, node);
+	for (node = rb_first(&fake_root); node != stop; node = rb_next(node)) {
+		struct protected_chunk *pos;
+
+		pos = container_of(node, struct protected_chunk, node);
+		if (kvm_roe_range_cmp_mergability(target, pos->gpa, pos->size))
+			break;
+		count += 1;
+	}
+	if (!count)
+		return;
+	/* merging step */
+	node = rb_prev(node);
+	first = container_of(node, struct protected_chunk, node);
+	kvm_roe_expand_chunk(target, first->gpa, first->size);
+	/* backward traverse and delete all in between */
+	for (i = 0; i < count; i++) {
+		struct protected_chunk *pos;
+		struct rb_node *prev = rb_prev(node);
+
+		pos = container_of(node, struct protected_chunk, node);
+		rb_erase(node, root);
+		kvfree(pos);
+		node = prev;
+	}
+}
+
+static bool kvm_roe_merge_chunks(struct rb_root *root, struct rb_node *target,
+		u64 gpa, u64 size)
+{
+	/*
+	 * attempt merging all adjacent chunks after inserting a chunk that is
+	 * adjacent to or intersecting with an existing chunk
+	 */
+	struct protected_chunk *cur;
+	u64 merge;
+
+	cur = container_of(target, struct protected_chunk, node);
+	merge = kvm_roe_expand_chunk(cur, gpa, size);
+	/*
+	 * We will not have to worry about the parent node while merging.
+	 * If it were mergeable with the newly inserted chunk we wouldn't
+	 * have gone deeper.
+	 */
+	if (merge & KVM_ROE_MERGE_LEFT)
+		kvm_roe_merge_left(root, target);
+	if (merge & KVM_ROE_MERGE_RIGHT)
+		kvm_roe_merge_right(root, target);
 	return true;
 }
 
@@ -234,35 +326,35 @@ static int __kvm_roe_insert_chunk(struct kvm_memory_slot *slot, u64 gpa,
 		u64 size)
 {
 	/* kvm->slots_lock must be acquired */
-	struct protected_chunk *pos;
-	struct list_head *head = slot->prot_list;
-
-	if (list_empty(head))
-		return kvm_roe_insert_chunk_next(head, gpa, size);
-	/*
-	 * pos here will never get deleted maybe the next one will
-	 * that is why list_for_each_entry_safe is completely unsafe
-	 */
-	list_for_each_entry(pos, head, list) {
-		if (kvm_roe_range_overlap(pos, gpa, size)) {
-			int ret = kvm_roe_expand_chunk(pos, gpa, size);
-
-			while (head != pos->list.next)
-				if (!kvm_roe_merge_chunks(pos))
-					break;
-			return ret;
-		}
-		if (pos->gpa > gpa) {
-			struct protected_chunk *prev;
-
-			prev = list_prev_entry(pos, list);
-			return kvm_roe_insert_chunk_next(&prev->list, gpa,
-					size);
+	struct rb_node **new = &(slot->prot_root->rb_node), *parent = NULL;
+	struct protected_chunk *insert_me;
+	bool merge = false;
+
+	while (*new) {
+		struct protected_chunk *chunk;
+		int cmp;
+
+		chunk = container_of(*new, struct protected_chunk, node);
+		cmp = kvm_roe_range_cmp_mergability(chunk, gpa, size);
+		parent = *new;
+		if (cmp < 0) {
+			new = &((*new)->rb_left);
+		} else if (cmp > 0) {
+			new = &((*new)->rb_right);
+		} else {
+			merge = true;
+			kvm_roe_merge_chunks(slot->prot_root, *new, gpa, size);
+			break;
 		}
 	}
-	pos = list_last_entry(head, struct protected_chunk, list);
-
-	return kvm_roe_insert_chunk_next(&pos->list, gpa, size);
+	if (merge)
+		return size;
+	insert_me = kvzalloc(sizeof(struct protected_chunk), GFP_KERNEL);
+	if (!insert_me)
+		return -ENOMEM;
+	insert_me->gpa = gpa;
+	insert_me->size = size;
+	rb_link_node(&insert_me->node, parent, new);
+	rb_insert_color(&insert_me->node, slot->prot_root);
+	return size;
 }
 
 static int kvm_roe_insert_chunk(struct kvm *kvm, u64 gpa, u64 size)
diff --git a/virt/kvm/roe_generic.h b/virt/kvm/roe_generic.h
index 6c5f0cf381..8e42c9795c 100644
--- a/virt/kvm/roe_generic.h
+++ b/virt/kvm/roe_generic.h
@@ -10,6 +10,9 @@
  *
  */
 
+#define KVM_ROE_MERGE_LEFT 0x1
+#define KVM_ROE_MERGE_RIGHT 0x2
+
 void kvm_roe_free(struct kvm_memory_slot *slot);
 int kvm_roe_init(struct kvm_memory_slot *slot);
 bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset,