From patchwork Wed Sep 12 22:55:37 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Edgecombe, Rick P"
X-Patchwork-Id: 10598431
From: Rick Edgecombe
To: tglx@linutronix.de, mingo@redhat.com, hpa@zytor.com, x86@kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 kernel-hardening@lists.openwall.com, daniel@iogearbox.net, jannh@google.com,
 keescook@chromium.org, alexei.starovoitov@gmail.com
Cc: kristen@linux.intel.com, dave.hansen@intel.com, arjan@linux.intel.com,
 Rick Edgecombe
Subject: [PATCH v5 1/4] vmalloc: Add __vmalloc_node_try_addr function
Date: Wed, 12 Sep 2018 15:55:37 -0700
Message-Id: <1536792940-8294-2-git-send-email-rick.p.edgecombe@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1536792940-8294-1-git-send-email-rick.p.edgecombe@intel.com>
References: <1536792940-8294-1-git-send-email-rick.p.edgecombe@intel.com>

Create __vmalloc_node_try_addr function that tries to allocate at a specific
address and supports caller specified behavior for whether any lazy purging
happens if there is a collision.

This new function draws from the __vmalloc_node_range implementation. Attempts
to merge the two into a single allocator resulted in logic that was difficult
to follow, so they are left separate.
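
The collision rule described above can be illustrated in user space. This is
only a minimal sketch, not the kernel code: struct area, enum range_status and
check_fixed_range() are hypothetical names, the areas are held in a plain
array rather than the vmap_area tree and list, and a permitted purge is
assumed to release every lazily freed area.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vmap_area: a half-open range [start, end). */
struct area {
	unsigned long start;
	unsigned long end;
	bool lazy_free;		/* freed, but still waiting for the lazy purge */
};

/* Hypothetical result codes; the patch reports these cases via ERR_PTR(). */
enum range_status {
	RANGE_OK,		/* the requested range is usable */
	RANGE_BUSY,		/* a live area overlaps (-EBUSY in the patch) */
	RANGE_NEEDS_PURGE,	/* only lazy areas overlap (-EUCLEAN in the patch) */
	RANGE_OVERFLOW		/* addr + size wraps (-EOVERFLOW in the patch) */
};

/*
 * Classify a request for [addr, addr + size) against the existing areas.
 * Assumption: when try_purge is set, purging removes all lazy areas, so a
 * range blocked only by lazy areas becomes usable.
 */
static enum range_status check_fixed_range(const struct area *areas, size_t n,
					    unsigned long addr,
					    unsigned long size, bool try_purge)
{
	unsigned long end = addr + size;
	bool need_purge = false;
	size_t i;

	assert(size != 0);
	if (end < addr)
		return RANGE_OVERFLOW;

	for (i = 0; i < n; i++) {
		/* Skip areas that do not intersect the requested range. */
		if (areas[i].start >= end || areas[i].end <= addr)
			continue;
		/* A live area always blocks the allocation. */
		if (!areas[i].lazy_free)
			return RANGE_BUSY;
		need_purge = true;
	}

	if (need_purge && !try_purge)
		return RANGE_NEEDS_PURGE;
	return RANGE_OK;
}
```

A caller can use the RANGE_NEEDS_PURGE result to remember an address that
would have fit after a purge, which is how the second patch in this series
uses the EUCLEAN return value.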

Signed-off-by: Rick Edgecombe
---
 include/linux/vmalloc.h |   3 +
 mm/vmalloc.c            | 177 +++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 179 insertions(+), 1 deletion(-)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 398e9c9..c7712c8 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -82,6 +82,9 @@ extern void *__vmalloc_node_range(unsigned long size, unsigned long align,
 			unsigned long start, unsigned long end, gfp_t gfp_mask,
 			pgprot_t prot, unsigned long vm_flags, int node,
 			const void *caller);
+extern void *__vmalloc_node_try_addr(unsigned long addr, unsigned long size,
+			gfp_t gfp_mask, pgprot_t prot, unsigned long vm_flags,
+			int node, int try_purge, const void *caller);
 #ifndef CONFIG_MMU
 extern void *__vmalloc_node_flags(unsigned long size, int node, gfp_t flags);
 static inline void *__vmalloc_node_flags_caller(unsigned long size, int node,
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index a728fc4..1954458 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1709,6 +1709,181 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 	return NULL;
 }
 
+static bool pvm_find_next_prev(unsigned long end,
+			       struct vmap_area **pnext,
+			       struct vmap_area **pprev);
+
+/* Try to allocate a region of KVA of the specified address and size. */
+static struct vmap_area *try_alloc_vmap_area(unsigned long addr,
+				unsigned long size, int node, gfp_t gfp_mask,
+				int try_purge)
+{
+	struct vmap_area *va;
+	struct vmap_area *cur_va = NULL;
+	struct vmap_area *first_before = NULL;
+	int need_purge = 0;
+	int blocked = 0;
+	int purged = 0;
+	unsigned long addr_end;
+
+	WARN_ON(!size);
+	WARN_ON(offset_in_page(size));
+
+	addr_end = addr + size;
+	if (addr > addr_end)
+		return ERR_PTR(-EOVERFLOW);
+
+	might_sleep();
+
+	va = kmalloc_node(sizeof(struct vmap_area),
+			gfp_mask & GFP_RECLAIM_MASK, node);
+	if (unlikely(!va))
+		return ERR_PTR(-ENOMEM);
+
+	/*
+	 * Only scan the relevant parts containing pointers to other objects
+	 * to avoid false negatives.
+	 */
+	kmemleak_scan_area(&va->rb_node, SIZE_MAX, gfp_mask & GFP_RECLAIM_MASK);
+
+retry:
+	spin_lock(&vmap_area_lock);
+
+	pvm_find_next_prev(addr, &cur_va, &first_before);
+
+	if (!cur_va)
+		goto found;
+
+	/*
+	 * If there is no VA that starts before the target address, start the
+	 * check from the closest VA in order to cover the case where the
+	 * allocation overlaps at the end.
+	 */
+	if (first_before && addr < first_before->va_end)
+		cur_va = first_before;
+
+	/* Linearly search through to make sure there is a hole */
+	while (cur_va->va_start < addr_end) {
+		if (cur_va->va_end > addr) {
+			if (cur_va->flags & VM_LAZY_FREE) {
+				need_purge = 1;
+			} else {
+				blocked = 1;
+				break;
+			}
+		}
+
+		if (list_is_last(&cur_va->list, &vmap_area_list))
+			break;
+
+		cur_va = list_next_entry(cur_va, list);
+	}
+
+	/*
+	 * If a non-lazy free va blocks the allocation, or
+	 * we are not supposed to purge, but we need to, the
+	 * allocation fails.
+	 */
+	if (blocked || (need_purge && !try_purge))
+		goto fail;
+
+	if (try_purge && need_purge) {
+		/* if purged once before, give up */
+		if (purged)
+			goto fail;
+
+		/*
+		 * If the va blocking the allocation is set to
+		 * be purged then purge all vmap_areas that are
+		 * set to be purged since this will flush the TLBs
+		 * anyway.
+		 */
+		spin_unlock(&vmap_area_lock);
+		purge_vmap_area_lazy();
+		need_purge = 0;
+		purged = 1;
+		goto retry;
+	}
+
+found:
+	va->va_start = addr;
+	va->va_end = addr_end;
+	va->flags = 0;
+	__insert_vmap_area(va);
+	spin_unlock(&vmap_area_lock);
+
+	return va;
+fail:
+	spin_unlock(&vmap_area_lock);
+	kfree(va);
+	if (need_purge && !blocked)
+		return ERR_PTR(-EUCLEAN);
+	return ERR_PTR(-EBUSY);
+}
+
+/**
+ * __vmalloc_node_try_addr - try to alloc at a specific address
+ * @addr:	address to try
+ * @size:	size to try
+ * @gfp_mask:	flags for the page level allocator
+ * @prot:	protection mask for the allocated pages
+ * @vm_flags:	additional vm area flags (e.g. %VM_NO_GUARD)
+ * @node:	node to use for allocation or NUMA_NO_NODE
+ * @try_purge:	try to purge if needed to fulfill an allocation
+ * @caller:	caller's return address
+ *
+ * Try to allocate at the specific address. If it succeeds the address is
+ * returned. If it fails an EBUSY ERR_PTR is returned. If try_purge is
+ * zero, it will return an EUCLEAN ERR_PTR if it could have allocated if it
+ * was allowed to purge. It may trigger TLB flushes if a purge is needed,
+ * and try_purge is set.
+ */
+void *__vmalloc_node_try_addr(unsigned long addr, unsigned long size,
+			gfp_t gfp_mask, pgprot_t prot, unsigned long vm_flags,
+			int node, int try_purge, const void *caller)
+{
+	struct vmap_area *va;
+	struct vm_struct *area;
+	void *alloc_addr;
+	unsigned long real_size = size;
+
+	size = PAGE_ALIGN(size);
+	if (!size || (size >> PAGE_SHIFT) > totalram_pages)
+		return NULL;
+
+	WARN_ON(in_interrupt());
+
+	if (!(vm_flags & VM_NO_GUARD))
+		size += PAGE_SIZE;
+
+	va = try_alloc_vmap_area(addr, size, node, gfp_mask, try_purge);
+	if (IS_ERR(va))
+		goto fail;
+
+	area = kzalloc_node(sizeof(*area), gfp_mask & GFP_RECLAIM_MASK, node);
+	if (unlikely(!area)) {
+		warn_alloc(gfp_mask, NULL, "kmalloc: allocation failure");
+		return ERR_PTR(-ENOMEM);
+	}
+
+	setup_vmalloc_vm(area, va, vm_flags, caller);
+
+	alloc_addr = __vmalloc_area_node(area, gfp_mask, prot, node);
+	if (!alloc_addr) {
+		warn_alloc(gfp_mask, NULL,
+			"vmalloc: allocation failure: %lu bytes", real_size);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	clear_vm_uninitialized_flag(area);
+
+	kmemleak_vmalloc(area, real_size, gfp_mask);
+
+	return alloc_addr;
+fail:
+	return va;
+}
+
 /**
  * __vmalloc_node_range - allocate virtually contiguous memory
  * @size: allocation size
@@ -2355,7 +2530,6 @@ void free_vm_area(struct vm_struct *area)
 }
 EXPORT_SYMBOL_GPL(free_vm_area);
 
-#ifdef CONFIG_SMP
 static struct vmap_area *node_to_va(struct rb_node *n)
 {
 	return rb_entry_safe(n, struct vmap_area, rb_node);
@@ -2403,6 +2577,7 @@ static bool pvm_find_next_prev(unsigned long end,
 	return true;
 }
 
+#ifdef CONFIG_SMP
 /**
  * pvm_determine_end - find the highest aligned address between two vmap_areas
  * @pnext: in/out arg for the next vmap_area

From patchwork Wed Sep 12 22:55:38 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Edgecombe, Rick P"
X-Patchwork-Id: 10598433
From: Rick Edgecombe
To: tglx@linutronix.de, mingo@redhat.com, hpa@zytor.com, x86@kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 kernel-hardening@lists.openwall.com, daniel@iogearbox.net, jannh@google.com,
 keescook@chromium.org, alexei.starovoitov@gmail.com
Cc: kristen@linux.intel.com, dave.hansen@intel.com, arjan@linux.intel.com,
 Rick Edgecombe
Subject: [PATCH v5 2/4] x86/modules: Increase randomization for modules
Date: Wed, 12 Sep 2018 15:55:38 -0700
Message-Id: <1536792940-8294-3-git-send-email-rick.p.edgecombe@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1536792940-8294-1-git-send-email-rick.p.edgecombe@intel.com>
References: <1536792940-8294-1-git-send-email-rick.p.edgecombe@intel.com>

This changes the behavior of the KASLR logic for allocating memory for the text
sections of loadable modules. It randomizes the location of each module text
section with about 17 bits of entropy in typical use. This is enabled on X86_64
only. For 32 bit, the behavior is unchanged.

It refactors existing code around module randomization somewhat. There are now
three different behaviors for x86 module_alloc depending on config.
RANDOMIZE_BASE=n, and RANDOMIZE_BASE=y ARCH=x86_64, and RANDOMIZE_BASE=y
ARCH=i386. The refactor of the existing code is to try to clearly show what
those behaviors are without having three separate versions or threading the
behaviors in a bunch of little spots. The reason it is not enabled on 32 bit
yet is because the module space is much smaller and simulations haven't been
run to see how it performs.

The new algorithm breaks the module space in two, a random area and a backup
area. It first tries to allocate at a number of randomly located starting pages
inside the random section without purging any lazy free vmap areas and
triggering the associated TLB flush. If this fails, it will try again a number
of times allowing for purges if needed. It also saves any position that could
have succeeded if it was allowed to purge, which doubles the chances of finding
a spot that would fit. Finally if those both fail to find a position it will
allocate in the backup area. The backup area base will be offset in the same
way as the current algorithm does for the base area, 1024 possible locations.
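
The "about 17 bits" figure can be checked with a short calculation. The sketch
below is an illustration under stated assumptions, not part of the patch: it
assumes 4 KiB pages, a MODULE_ALIGN equal to the page size, and a
MODULES_VADDR of 0xffffffffc0000000 (the patch itself only shows MODULES_END).
All SKETCH_ and sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; only the MODULES_END value appears in the patch itself. */
#define SKETCH_PAGE_SIZE	UINT64_C(0x1000)
#define SKETCH_MODULES_VADDR	UINT64_C(0xffffffffc0000000)
#define SKETCH_MODULES_END	UINT64_C(0xffffffffff000000)
#define SKETCH_MODULES_LEN	(SKETCH_MODULES_END - SKETCH_MODULES_VADDR)

/* Round x up to the next page boundary, like the kernel's PAGE_ALIGN(). */
static uint64_t sketch_page_align(uint64_t x)
{
	return (x + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Mirrors MODULES_RAND_LEN: two thirds of the module space, page aligned. */
static uint64_t sketch_modules_rand_len(void)
{
	return sketch_page_align((SKETCH_MODULES_LEN / 3) * 2);
}

/* Number of candidate start pages for an area of va_size bytes. */
static uint64_t sketch_nr_positions(uint64_t va_size)
{
	uint64_t nr_max, nr_rnd;

	assert(va_size > 0 && va_size <= SKETCH_MODULES_LEN);
	/* Positions at which the area still ends inside the module space. */
	nr_max = (SKETCH_MODULES_LEN - va_size) / SKETCH_PAGE_SIZE + 1;
	/* Positions that start inside the random area. */
	nr_rnd = sketch_modules_rand_len() / SKETCH_PAGE_SIZE;
	return nr_max < nr_rnd ? nr_max : nr_rnd;
}

/* Whole bits of entropy for n equally likely positions: floor(log2(n)). */
static unsigned int sketch_entropy_bits(uint64_t n)
{
	unsigned int bits = 0;

	assert(n >= 1);
	while ((n >>= 1) != 0)
		bits++;
	return bits;
}
```

Under these assumptions the random area is 0x2a000000 bytes (672 MiB), which
gives 172032 page-aligned start positions for any module small enough not to
be limited by the end of the module space. That is 17 whole bits, or roughly
17.4 bits when the fractional part is counted.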

Signed-off-by: Rick Edgecombe
---
 arch/x86/include/asm/pgtable_64_types.h |   7 ++
 arch/x86/kernel/module.c                | 165 +++++++++++++++++++++++++++-----
 2 files changed, 149 insertions(+), 23 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 04edd2d..5e26369 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -143,6 +143,13 @@ extern unsigned int ptrs_per_p4d;
 #define MODULES_END		_AC(0xffffffffff000000, UL)
 #define MODULES_LEN		(MODULES_END - MODULES_VADDR)
 
+/*
+ * Dedicate the first part of the module space to a randomized area when KASLR
+ * is in use. Leave the remaining part for a fallback if we are unable to
+ * allocate in the random area.
+ */
+#define MODULES_RAND_LEN	PAGE_ALIGN((MODULES_LEN/3)*2)
+
 #define ESPFIX_PGD_ENTRY	_AC(-2, UL)
 #define ESPFIX_BASE_ADDR	(ESPFIX_PGD_ENTRY << P4D_SHIFT)
 
diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index f58336a..d50a0a0 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -48,34 +48,151 @@ do { \
 } while (0)
 #endif
 
-#ifdef CONFIG_RANDOMIZE_BASE
+#if defined(CONFIG_X86_64) && defined(CONFIG_RANDOMIZE_BASE)
+static inline unsigned long get_modules_rand_len(void)
+{
+	return MODULES_RAND_LEN;
+}
+#else
+static inline unsigned long get_modules_rand_len(void)
+{
+	BUILD_BUG();
+	return 0;
+}
+
+inline bool kaslr_enabled(void);
+#endif
+
+static inline int kaslr_randomize_each_module(void)
+{
+	return IS_ENABLED(CONFIG_RANDOMIZE_BASE)
+		&& IS_ENABLED(CONFIG_X86_64)
+		&& kaslr_enabled();
+}
+
+static inline int kaslr_randomize_base(void)
+{
+	return IS_ENABLED(CONFIG_RANDOMIZE_BASE)
+		&& !IS_ENABLED(CONFIG_X86_64)
+		&& kaslr_enabled();
+}
+
 static unsigned long module_load_offset;
+static const unsigned long NR_NO_PURGE = 5000;
+static const unsigned long NR_TRY_PURGE = 5000;
 
 /* Mutex protects the module_load_offset. */
 static DEFINE_MUTEX(module_kaslr_mutex);
 
 static unsigned long int get_module_load_offset(void)
 {
-	if (kaslr_enabled()) {
-		mutex_lock(&module_kaslr_mutex);
-		/*
-		 * Calculate the module_load_offset the first time this
-		 * code is called. Once calculated it stays the same until
-		 * reboot.
-		 */
-		if (module_load_offset == 0)
-			module_load_offset =
-				(get_random_int() % 1024 + 1) * PAGE_SIZE;
-		mutex_unlock(&module_kaslr_mutex);
-	}
+	mutex_lock(&module_kaslr_mutex);
+	/*
+	 * Calculate the module_load_offset the first time this
+	 * code is called. Once calculated it stays the same until
+	 * reboot.
+	 */
+	if (module_load_offset == 0)
+		module_load_offset = (get_random_int() % 1024 + 1) * PAGE_SIZE;
+	mutex_unlock(&module_kaslr_mutex);
+
 	return module_load_offset;
 }
-#else
-static unsigned long int get_module_load_offset(void)
+
+static unsigned long get_module_vmalloc_start(void)
 {
-	return 0;
+	if (kaslr_randomize_each_module())
+		return MODULES_VADDR + get_modules_rand_len()
+			+ get_module_load_offset();
+	else if (kaslr_randomize_base())
+		return MODULES_VADDR + get_module_load_offset();
+
+	return MODULES_VADDR;
+}
+
+static void *try_module_alloc(unsigned long addr, unsigned long size,
+					int try_purge)
+{
+	const unsigned long vm_flags = 0;
+
+	return __vmalloc_node_try_addr(addr, size, GFP_KERNEL, PAGE_KERNEL_EXEC,
+					vm_flags, NUMA_NO_NODE, try_purge,
+					__builtin_return_address(0));
+}
+
+/*
+ * Find a random address to try that won't obviously not fit. Random areas are
+ * allowed to overflow into the backup area
+ */
+static unsigned long get_rand_module_addr(unsigned long size)
+{
+	unsigned long nr_max_pos = (MODULES_LEN - size) / MODULE_ALIGN + 1;
+	unsigned long nr_rnd_pos = get_modules_rand_len() / MODULE_ALIGN;
+	unsigned long nr_pos = min(nr_max_pos, nr_rnd_pos);
+
+	unsigned long module_position_nr = get_random_long() % nr_pos;
+	unsigned long offset = module_position_nr * MODULE_ALIGN;
+
+	return MODULES_VADDR + offset;
+}
+
+/*
+ * Try to allocate in the random area. First 5000 times without purging, then
+ * 5000 times with purging. If these fail, return NULL.
+ */
+static void *try_module_randomize_each(unsigned long size)
+{
+	void *p = NULL;
+	unsigned int i;
+	unsigned long last_lazy_free_blocked = 0;
+
+	/* This will have a guard page */
+	unsigned long va_size = PAGE_ALIGN(size) + PAGE_SIZE;
+
+	if (!kaslr_randomize_each_module())
+		return NULL;
+
+	/* Make sure there is at least one address that might fit. */
+	if (va_size < PAGE_ALIGN(size) || va_size > MODULES_LEN)
+		return NULL;
+
+	/* Try to find a spot that doesn't need a lazy purge */
+	for (i = 0; i < NR_NO_PURGE; i++) {
+		unsigned long addr = get_rand_module_addr(va_size);
+
+		/* First try to avoid having to purge */
+		p = try_module_alloc(addr, size, 0);
+
+		/*
+		 * Save the last value that was blocked by a
+		 * lazy purge area.
+		 */
+		if (IS_ERR(p) && PTR_ERR(p) == -EUCLEAN)
+			last_lazy_free_blocked = addr;
+		else if (!IS_ERR(p))
+			return p;
+	}
+
+	/* Try the most recent spot that could be used after a lazy purge */
+	if (last_lazy_free_blocked) {
+		p = try_module_alloc(last_lazy_free_blocked, size, 1);
+
+		if (!IS_ERR(p))
+			return p;
+	}
+
+	/* Look for more spots and allow lazy purges */
+	for (i = 0; i < NR_TRY_PURGE; i++) {
+		unsigned long addr = get_rand_module_addr(va_size);
+
+		/* Give up and allow for purges */
+		p = try_module_alloc(addr, size, 1);
+
+		if (!IS_ERR(p))
+			return p;
+	}
+	return NULL;
 }
-#endif
 
 void *module_alloc(unsigned long size)
 {
@@ -84,16 +201,18 @@ void *module_alloc(unsigned long size)
 	if (PAGE_ALIGN(size) > MODULES_LEN)
 		return NULL;
 
-	p = __vmalloc_node_range(size, MODULE_ALIGN,
-				    MODULES_VADDR + get_module_load_offset(),
-				    MODULES_END, GFP_KERNEL,
-				    PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
-				    __builtin_return_address(0));
+	p = try_module_randomize_each(size);
+
+	if (!p)
+		p = __vmalloc_node_range(size, MODULE_ALIGN,
+				get_module_vmalloc_start(), MODULES_END,
+				GFP_KERNEL, PAGE_KERNEL_EXEC, 0,
+				NUMA_NO_NODE, __builtin_return_address(0));
+
 	if (p && (kasan_module_alloc(p, size) < 0)) {
 		vfree(p);
 		return NULL;
 	}
-
 	return p;
 }
 

From patchwork Wed Sep 12 22:55:39 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Edgecombe, Rick P"
X-Patchwork-Id: 10598435
From: Rick Edgecombe
To: tglx@linutronix.de, mingo@redhat.com, hpa@zytor.com, x86@kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 kernel-hardening@lists.openwall.com, daniel@iogearbox.net, jannh@google.com,
 keescook@chromium.org, alexei.starovoitov@gmail.com
Cc: kristen@linux.intel.com, dave.hansen@intel.com, arjan@linux.intel.com,
 Rick Edgecombe
Subject: [PATCH v5 3/4] vmalloc: Add debugfs modfraginfo
Date: Wed, 12 Sep 2018 15:55:39 -0700
Message-Id: <1536792940-8294-4-git-send-email-rick.p.edgecombe@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1536792940-8294-1-git-send-email-rick.p.edgecombe@intel.com>
References: <1536792940-8294-1-git-send-email-rick.p.edgecombe@intel.com>

Add debugfs file "modfraginfo" for providing info on module
space fragmentation. This can be used for determining if loadable module
randomization is causing any problems for extreme module loading situations,
like huge numbers of modules or extremely large modules.

Sample output when KASLR is enabled and X86_64 is configured:
	Largest free space:	897912 kB
	  Total free space:	1025424 kB
Allocations in backup area:	0

Sample output when just X86_64:
	Largest free space:	897912 kB
	  Total free space:	1025424 kB

Signed-off-by: Rick Edgecombe
---
 mm/vmalloc.c | 102 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 101 insertions(+), 1 deletion(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 1954458..a44b902 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -33,6 +34,7 @@
 #include
 #include
+#include
 #include
 #include
@@ -2919,7 +2921,105 @@ static int __init proc_vmalloc_init(void)
 		proc_create_seq("vmallocinfo", 0400, NULL, &vmalloc_op);
 	return 0;
 }
-module_init(proc_vmalloc_init);
+#else
+static int __init proc_vmalloc_init(void)
+{
+	return 0;
+}
+#endif
+
+#if defined(CONFIG_RANDOMIZE_BASE) && defined(CONFIG_X86_64)
+static inline unsigned long is_in_backup(unsigned long addr)
+{
+	return addr >= MODULES_VADDR + MODULES_RAND_LEN;
+}
+#else
+static inline unsigned long is_in_backup(unsigned long addr)
+{
+	return 0;
+}
+inline bool kaslr_enabled(void);
 #endif
+
+#if defined(CONFIG_DEBUG_FS) && defined(CONFIG_X86_64)
+static int modulefraginfo_debug_show(struct seq_file *m, void *v)
+{
+	unsigned long last_end = MODULES_VADDR;
+	unsigned long total_free = 0;
+	unsigned long largest_free = 0;
+	unsigned long backup_cnt = 0;
+	unsigned long gap;
+	struct vmap_area *prev, *cur = NULL;
+
+	spin_lock(&vmap_area_lock);
+
+	if (!pvm_find_next_prev(MODULES_VADDR, &cur, &prev) || !cur)
+		goto done;
+
+	for (; cur->va_end <= MODULES_END; cur = list_next_entry(cur, list)) {
+		/* Don't count areas that are marked to be lazily freed */
+		if (!(cur->flags & VM_LAZY_FREE)) {
+			backup_cnt += is_in_backup(cur->va_start);
+			gap = cur->va_start - last_end;
+			if (gap > largest_free)
+				largest_free = gap;
+			total_free += gap;
+			last_end = cur->va_end;
+		}
+
+		if (list_is_last(&cur->list, &vmap_area_list))
+			break;
+	}
+
+done:
+	gap = (MODULES_END - last_end);
+	if (gap > largest_free)
+		largest_free = gap;
+	total_free += gap;
+
+	spin_unlock(&vmap_area_lock);
+
+	seq_printf(m, "\tLargest free space:\t%lu kB\n", largest_free / 1024);
+	seq_printf(m, "\t Total free space:\t%lu kB\n", total_free / 1024);
+
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_enabled())
+		seq_printf(m, "Allocations in backup area:\t%lu\n", backup_cnt);
+
+	return 0;
+}
+
+static int proc_module_frag_debug_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, modulefraginfo_debug_show, NULL);
+}
+
+static const struct file_operations debug_module_frag_operations = {
+	.open = proc_module_frag_debug_open,
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = single_release,
+};
+
+static void __init debug_modfrag_init(void)
+{
+	debugfs_create_file("modfraginfo", 0400, NULL, NULL,
+		&debug_module_frag_operations);
+}
+#else /* defined(CONFIG_DEBUG_FS) && defined(CONFIG_X86_64) */
+static void __init debug_modfrag_init(void)
+{
+}
+#endif
+
+#if defined(CONFIG_DEBUG_FS) || defined(CONFIG_PROC_FS)
+static int __init info_vmalloc_init(void)
+{
+	proc_vmalloc_init();
+	debug_modfrag_init();
+	return 0;
+}
+
+module_init(info_vmalloc_init);
+#endif

From patchwork Wed Sep 12 22:55:40 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Edgecombe, Rick P"
X-Patchwork-Id: 10598437
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AF49F14F9
	for ; Wed, 12 Sep 2018 22:55:49 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org
	(localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9BC432AC76 for ; Wed, 12 Sep 2018 22:55:49 +0000 (UTC)
From: Rick Edgecombe
To: tglx@linutronix.de, mingo@redhat.com, hpa@zytor.com, x86@kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 kernel-hardening@lists.openwall.com, daniel@iogearbox.net, jannh@google.com,
 keescook@chromium.org, alexei.starovoitov@gmail.com
Cc: kristen@linux.intel.com, dave.hansen@intel.com, arjan@linux.intel.com,
 Rick Edgecombe
Subject: [PATCH v5 4/4] Kselftest for module text allocation benchmarking
Date: Wed, 12 Sep 2018 15:55:40 -0700
Message-Id: <1536792940-8294-5-git-send-email-rick.p.edgecombe@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1536792940-8294-1-git-send-email-rick.p.edgecombe@intel.com>
References: <1536792940-8294-1-git-send-email-rick.p.edgecombe@intel.com>
MIME-Version: 1.0

This adds a test module in lib/, and a script in kselftest that does
benchmarking on the allocation of memory in the module space. Performance here
would have some small impact on kernel module insertions, BPF JIT insertions
and kprobes. In the case of KASLR features for the module space, this module
can be used to measure the allocation performance of different configurations.
This module needs to be compiled into the kernel because module_alloc is not
exported.

With some modification to the code, as explained in the comments, it can be
enabled to measure TLB flushes as well.

There are two tests in the module. One allocates until failure in order to
test module capacity and the other times allocations in the module area.
They both use module sizes that roughly approximate the distribution of in-tree
X86_64 modules.

You can control the number of modules used in the tests like this:
echo m1000>/sys/kernel/debug/mod_alloc_test

Run the test for module capacity like:
echo t1>/sys/kernel/debug/mod_alloc_test

The other test will measure the allocation time, and for CONFIG_X86_64 and
CONFIG_RANDOMIZE_BASE, also give data on how often the "backup area" is used.
Run the test for allocation time and backup area usage like:
echo t2>/sys/kernel/debug/mod_alloc_test

The output will be something like this:
num		all(ns)		last(ns)
1000		1083		1099
Last module in backup count = 0
Total modules in backup = 0
>1 module in backup count = 0

To run a suite of allocation time tests for a collection of module numbers you
can run:
tools/testing/selftests/bpf/test_mod_alloc.sh

Signed-off-by: Rick Edgecombe
---
 lib/Kconfig.debug                             |  10 +
 lib/Makefile                                  |   1 +
 lib/test_mod_alloc.c                          | 446 ++++++++++++++++++++++++++
 tools/testing/selftests/bpf/test_mod_alloc.sh |  29 ++
 4 files changed, 486 insertions(+)
 create mode 100644 lib/test_mod_alloc.c
 create mode 100755 tools/testing/selftests/bpf/test_mod_alloc.sh

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 4966c4f..c6c147c 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1883,6 +1883,16 @@ config TEST_BPF
 
 	  If unsure, say N.
 
+config TEST_MOD_ALLOC
+	bool "Tests for module allocator/vmalloc"
+	help
+	  This builds the "test_mod_alloc" module that performs performance
+	  and functional tests on the module text section allocator. The module
+	  uses X86_64 module text sizes for simulations, for other architectures
+	  it will be less accurate.
+
+	  If unsure, say N.
+
 config FIND_BIT_BENCHMARK
 	tristate "Test find_bit functions"
 	help
diff --git a/lib/Makefile b/lib/Makefile
index ca3f7eb..3d5923e 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -58,6 +58,7 @@ UBSAN_SANITIZE_test_ubsan.o := y
 obj-$(CONFIG_TEST_KSTRTOX) += test-kstrtox.o
 obj-$(CONFIG_TEST_LIST_SORT) += test_list_sort.o
 obj-$(CONFIG_TEST_LKM) += test_module.o
+obj-$(CONFIG_TEST_MOD_ALLOC) += test_mod_alloc.o
 obj-$(CONFIG_TEST_OVERFLOW) += test_overflow.o
 obj-$(CONFIG_TEST_RHASHTABLE) += test_rhashtable.o
 obj-$(CONFIG_TEST_SORT) += test_sort.o
diff --git a/lib/test_mod_alloc.c b/lib/test_mod_alloc.c
new file mode 100644
index 0000000..71c146e
--- /dev/null
+++ b/lib/test_mod_alloc.c
@@ -0,0 +1,446 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+
+struct mod { int filesize; int coresize; int initsize; };
+
+/* ==== Begin optional logging ==== */
+
+/*
+ * Note: for more accurate test results add this to mm/vmalloc.c:
+ * void debug_purge_vmap_area_lazy(void)
+ * {
+ *	purge_vmap_area_lazy();
+ * }
+ * and replace the below with:
+ * extern void debug_purge_vmap_area_lazy(void);
+ */
+static void debug_purge_vmap_area_lazy(void)
+{
+}
+
+
+/*
+ * Note: In order to get an accurate count for the tlb flushes triggered in
+ * vmalloc, create a counter in vmalloc.c: with this method signature and export
+ * it. Then replace the below with: __purge_vmap_area_lazy
+ * extern unsigned long get_tlb_flushes_vmalloc(void);
+ */
+static unsigned long get_tlb_flushes_vmalloc(void)
+{
+	return 0;
+}
+
+/* ==== End optional logging ==== */
+
+
+#define MAX_ALLOC_CNT 20000
+#define ITERS 1000
+
+struct vm_alloc {
+	void *core;
+	unsigned long core_size;
+	void *init;
+};
+
+struct check_alloc {
+	unsigned long start;
+	unsigned long vm_end;
+	unsigned long real_end;
+};
+
+static struct vm_alloc *allocs_vm;
+
+static struct check_alloc *check_allocs;
+static int check_alloc_cnt;
+
+static long mod_cnt = 100;
+
+/* This may be different for non-x86 */
+static unsigned long calc_end(void *start, unsigned long size)
+{
+	unsigned long startl = (unsigned long) start;
+
+	return startl + PAGE_ALIGN(size) + PAGE_SIZE;
+}
+
+static void reset_cur_allocs(void)
+{
+	check_alloc_cnt = 0;
+}
+
+static void add_check_alloc(void *start, unsigned long size)
+{
+	check_allocs[check_alloc_cnt].start = (unsigned long) start;
+	check_allocs[check_alloc_cnt].vm_end = calc_end(start, size);
+	check_allocs[check_alloc_cnt].real_end = size;
+	check_alloc_cnt++;
+}
+
+static int do_check(void *ptr, unsigned long size)
+{
+	int i;
+	unsigned long start = (unsigned long) ptr;
+	unsigned long end = calc_end(ptr, size);
+	unsigned long sum = 0;
+	unsigned long addr;
+
+	if (!start)
+		return 1;
+
+	for (i = 0; i < check_alloc_cnt; i++) {
+		struct check_alloc *cur_alloc = &(check_allocs[i]);
+
+		/* overlap end */
+		if (start >= cur_alloc->start && start < cur_alloc->vm_end) {
+			pr_info("overlap end\n");
+			return 1;
+		}
+
+		/* overlap start */
+		if (end > cur_alloc->start && end <= cur_alloc->vm_end) {
+			pr_info("overlap start\n");
+			return 1;
+		}
+
+		/* overlap whole thing */
+		if (start <= cur_alloc->start && end > cur_alloc->vm_end) {
+			pr_info("overlap whole thing\n");
+			return 1;
+		}
+
+		/* inside */
+		if (start >= cur_alloc->start && end < cur_alloc->vm_end) {
+			pr_info("inside\n");
+			return 1;
+		}
+
+		/* bounds */
+		if (start < MODULES_VADDR ||
+			end > MODULES_VADDR + MODULES_LEN) {
+			pr_info("out of bounds\n");
+			return 1;
+		}
+		for (addr = cur_alloc->start;
+			addr < cur_alloc->real_end;
+			addr += PAGE_SIZE) {
+			sum += *((unsigned long *) addr);
+		}
+		if (sum != 0)
+			pr_info("Memory was not zeroed\n");
+
+		kasan_check_read((void *)cur_alloc->start,
+			cur_alloc->vm_end - cur_alloc->start - PAGE_SIZE);
+	}
+	return 0;
+}
+
+
+static const int core_hist[10] = {1, 5, 21, 46, 141, 245, 597, 2224, 1875, 0};
+static const int init_hist[10] = {0, 0, 0, 0, 10, 19, 70, 914, 3906, 236};
+static const int file_hist[10] = {6, 20, 55, 86, 286, 551, 918, 2024, 1028,
+						181};
+
+static const int bins[10] = {5000000, 2000000, 1000000, 500000, 200000, 100000,
+						50000, 20000, 10000, 5000};
+/*
+ * Rough approximation of the X86_64 module size distribution.
+ */
+static int get_mod_rand_size(const int *hist)
+{
+	int area_under = get_random_long() % 5155;
+	int i;
+	int last_bin = bins[0] + 1;
+	int sum = 0;
+
+	for (i = 0; i <= 9; i++) {
+		sum += hist[i];
+		if (area_under <= sum)
+			return bins[i]
+				+ (get_random_long() % (last_bin - bins[i]));
+		last_bin = bins[i];
+	}
+	return 4096;
+}
+
+static struct mod get_rand_module(void)
+{
+	struct mod ret;
+
+	ret.coresize = get_mod_rand_size(core_hist);
+	ret.initsize = get_mod_rand_size(init_hist);
+	ret.filesize = get_mod_rand_size(file_hist);
+	return ret;
+}
+
+static void do_test_alloc_fail(void)
+{
+	struct vm_alloc *cur_alloc;
+	struct mod cur_mod;
+	void *file;
+	int mod_n, free_mod_n;
+	unsigned long fail = 0;
+	int iter;
+
+	for (iter = 0; iter < ITERS; iter++) {
+		pr_info("Running iteration: %d\n", iter);
+		memset(allocs_vm, 0, mod_cnt * sizeof(struct vm_alloc));
+		reset_cur_allocs();
+		debug_purge_vmap_area_lazy();
+		for (mod_n = 0; mod_n < mod_cnt; mod_n++) {
+			cur_mod = get_rand_module();
+			cur_alloc = &allocs_vm[mod_n];
+
+			/* Allocate */
+			file = vmalloc(cur_mod.filesize);
+			cur_alloc->core = module_alloc(cur_mod.coresize);
+
+			/* Check core allocation position is good */
+			if (do_check(cur_alloc->core, cur_mod.coresize)) {
+				pr_info("Check failed core:%d\n", mod_n);
+				break;
+			}
+			/* Add core position for future checking */
+			add_check_alloc(cur_alloc->core, cur_mod.coresize);
+
+			cur_alloc->init = module_alloc(cur_mod.initsize);
+
+			/* Check init position */
+			if (do_check(cur_alloc->init, cur_mod.initsize)) {
+				pr_info("Check failed init:%d\n", mod_n);
+				break;
+			}
+
+			/* Clean up everything except core */
+			if (!cur_alloc->core || !cur_alloc->init) {
+				fail++;
+				vfree(file);
+				if (cur_alloc->init)
+					vfree(cur_alloc->init);
+				break;
+			}
+			vfree(cur_alloc->init);
+			vfree(file);
+		}
+
+		/* Clean up core sizes */
+		for (free_mod_n = 0; free_mod_n < mod_n; free_mod_n++) {
+			cur_alloc = &allocs_vm[free_mod_n];
+			if (cur_alloc->core)
+				vfree(cur_alloc->core);
+		}
+	}
+	pr_info("Failures(%ld modules):%lu\n", mod_cnt, fail);
+}
+
+#if defined(CONFIG_X86_64) && defined(CONFIG_RANDOMIZE_BASE)
+static int is_in_backup(void *addr)
+{
+	return (unsigned long)addr >= MODULES_VADDR + MODULES_RAND_LEN;
+}
+#else
+static int is_in_backup(void *addr)
+{
+	return 0;
+}
+#endif
+
+static void do_test_last_perf(void)
+{
+	struct vm_alloc *cur_alloc;
+	struct mod cur_mod;
+	void *file;
+	int mod_n, mon_n_free;
+	unsigned long fail = 0;
+	int iter;
+	ktime_t start, diff;
+	ktime_t total_last = 0;
+	ktime_t total_all = 0;
+
+	/*
+	 * The number of last core allocations for each iteration that were
+	 * allocated in the backup area.
+	 */
+	int last_in_bk = 0;
+
+	/*
+	 * The total number of core allocations that were in the backup area for
+	 * all iterations.
+	 */
+	int total_in_bk = 0;
+
+	/* The number of iterations where the count was more than 1 */
+	int cnt_more_than_1 = 0;
+
+	/*
+	 * The number of core allocations that were in the backup area for the
+	 * current iteration.
+	 */
+	int cur_in_bk = 0;
+
+	unsigned long before_tlbs;
+	unsigned long tlb_cnt_total;
+	unsigned long tlb_cur;
+	unsigned long total_tlbs = 0;
+
+	pr_info("Starting %d iterations of %ld modules\n", ITERS, mod_cnt);
+
+	for (iter = 0; iter < ITERS; iter++) {
+		debug_purge_vmap_area_lazy();
+		before_tlbs = get_tlb_flushes_vmalloc();
+		memset(allocs_vm, 0, mod_cnt * sizeof(struct vm_alloc));
+		tlb_cnt_total = 0;
+		cur_in_bk = 0;
+		for (mod_n = 0; mod_n < mod_cnt; mod_n++) {
+			/* allocate how the module allocator allocates */
+
+			cur_mod = get_rand_module();
+			cur_alloc = &allocs_vm[mod_n];
+			file = vmalloc(cur_mod.filesize);
+
+			tlb_cur = get_tlb_flushes_vmalloc();
+
+			start = ktime_get();
+			cur_alloc->core = module_alloc(cur_mod.coresize);
+			diff = ktime_get() - start;
+
+			cur_alloc->init = module_alloc(cur_mod.initsize);
+
+			/* Collect metrics */
+			if (is_in_backup(cur_alloc->core)) {
+				cur_in_bk++;
+				if (mod_n == mod_cnt - 1)
+					last_in_bk++;
+			}
+			total_all += diff;
+
+			if (mod_n == mod_cnt - 1)
+				total_last += diff;
+
+			tlb_cnt_total += get_tlb_flushes_vmalloc() - tlb_cur;
+
+			/* If there is a failure, quit. init/core freed later */
+			if (!cur_alloc->core || !cur_alloc->init) {
+				fail++;
+				vfree(file);
+				break;
+			}
+			/* Init sections do not last long so free here */
+			vfree(cur_alloc->init);
+			cur_alloc->init = NULL;
+			vfree(file);
+		}
+
+		/* Collect per iteration metrics */
+		total_in_bk += cur_in_bk;
+		if (cur_in_bk > 1)
+			cnt_more_than_1++;
+		total_tlbs += get_tlb_flushes_vmalloc() - before_tlbs;
+
+		/* Free whatever is still allocated from this iteration */
+		for (mon_n_free = 0; mon_n_free < mod_cnt; mon_n_free++) {
+			cur_alloc = &allocs_vm[mon_n_free];
+			vfree(cur_alloc->init);
+			vfree(cur_alloc->core);
+		}
+	}
+
+	if (fail)
+		pr_info("There was an alloc failure, results invalid!\n");
+
+	pr_info("num\t\tall(ns)\t\tlast(ns)\n");
+	pr_info("%ld\t\t%llu\t\t%llu\n", mod_cnt,
+		total_all / (ITERS * mod_cnt),
+		total_last / ITERS);
+
+	if (IS_ENABLED(CONFIG_X86_64) && IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+		pr_info("Last module in backup count = %d\n", last_in_bk);
+		pr_info("Total modules in backup = %d\n", total_in_bk);
+		pr_info(">1 module in backup count = %d\n", cnt_more_than_1);
+	}
+	/*
+	 * This will usually hide info when the instrumentation is not in place.
+	 */
+	if (tlb_cnt_total)
+		pr_info("TLB Flushes: %lu\n", tlb_cnt_total);
+}
+
+static void do_test(int test)
+{
+	switch (test) {
+	case 1:
+		do_test_alloc_fail();
+		break;
+	case 2:
+		do_test_last_perf();
+		break;
+	default:
+		pr_info("Unknown test\n");
+	}
+}
+
+static ssize_t device_file_write(struct file *filp, const char __user *user_buf,
+				size_t count, loff_t *offp)
+{
+	char buf[100];
+	long iter;
+	long new_mod_cnt;
+
+	if (count >= sizeof(buf) - 1) {
+		pr_info("Command too long\n");
+		return count;
+	}
+
+	copy_from_user(buf, user_buf, count);
+	buf[count] = 0;
+	if (buf[0] == 'm') {
+		kstrtol(buf+1, 10, &new_mod_cnt);
+		if (new_mod_cnt > 0 && new_mod_cnt <= MAX_ALLOC_CNT) {
+			pr_info("New module count: %ld\n", new_mod_cnt);
+			mod_cnt = new_mod_cnt;
+			if (allocs_vm)
+				vfree(allocs_vm);
+			allocs_vm = vmalloc(sizeof(struct vm_alloc) * mod_cnt);
+
+			if (check_allocs)
+				vfree(check_allocs);
+			check_allocs = vmalloc(sizeof(struct check_alloc)
+							* mod_cnt);
+		} else
+			pr_info("more than %d not supported\n", MAX_ALLOC_CNT);
+	} else if (buf[0] == 't') {
+		kstrtol(buf + 1, 10, &iter);
+		do_test(iter);
+	} else {
+		pr_info("Unknown command\n");
+	}
+
+	return count;
+}
+
+static const char *dv_name = "mod_alloc_test";
+static const struct file_operations test_mod_alloc_fops = {
+	.owner = THIS_MODULE,
+	.write = device_file_write,
+};
+
+static int __init mod_alloc_test_init(void)
+{
+	debugfs_create_file(dv_name, 0400, NULL, NULL, &test_mod_alloc_fops);
+
+	return 0;
+}
+
+MODULE_LICENSE("GPL");
+
+module_init(mod_alloc_test_init);
diff --git a/tools/testing/selftests/bpf/test_mod_alloc.sh b/tools/testing/selftests/bpf/test_mod_alloc.sh
new file mode 100755
index 0000000..e9aea57
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_mod_alloc.sh
@@ -0,0 +1,29 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+UNMOUNT_DEBUG_FS=0
+if ! mount | grep -q debugfs; then
+	if mount -t debugfs none /sys/kernel/debug/; then
+		UNMOUNT_DEBUG_FS=1
+	else
+		echo "Could not mount debug fs."
+		exit 1
+	fi
+fi
+
+if [ ! -e /sys/kernel/debug/mod_alloc_test ]; then
+	echo "Test module not found, did you build kernel with TEST_MOD_ALLOC?"
+	exit 1
+fi
+
+echo "Beginning module_alloc performance tests."
+
+for i in `seq 1000 1000 8000`; do
+	echo m$i>/sys/kernel/debug/mod_alloc_test
+	echo t2>/sys/kernel/debug/mod_alloc_test
+done
+
+echo "Module_alloc performance tests ended."
+
+if [ $UNMOUNT_DEBUG_FS -eq 1 ]; then
+	umount /sys/kernel/debug/
+fi
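
For readers who want to experiment with the numbers reported by modfraginfo (patch 3/4) outside the kernel, the gap accounting in modulefraginfo_debug_show() can be approximated in user space. The following is only a sketch under stated assumptions: struct area, struct frag_info and frag_scan() are hypothetical names that do not appear in the patches, the areas are assumed to be sorted by start address, and the locking and the VM_LAZY_FREE filtering done by the kernel code are left out.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct vmap_area: one allocated [start, end) range. */
struct area {
	unsigned long start;
	unsigned long end;
};

/* Hypothetical result type holding the two values modfraginfo reports. */
struct frag_info {
	unsigned long largest_free;
	unsigned long total_free;
};

/*
 * Walk areas (sorted by start address) that end inside
 * [region_start, region_end) and accumulate the gaps between them,
 * then account for the tail gap after the last area that was visited.
 */
static struct frag_info frag_scan(const struct area *areas, size_t n,
				  unsigned long region_start,
				  unsigned long region_end)
{
	struct frag_info info = { 0, 0 };
	unsigned long last_end = region_start;
	unsigned long gap;
	size_t i;

	for (i = 0; i < n && areas[i].end <= region_end; i++) {
		gap = areas[i].start - last_end;
		if (gap > info.largest_free)
			info.largest_free = gap;
		info.total_free += gap;
		last_end = areas[i].end;
	}

	/* Free space between the last area and the end of the region. */
	gap = region_end - last_end;
	if (gap > info.largest_free)
		info.largest_free = gap;
	info.total_free += gap;

	return info;
}
```

Calling frag_scan() with two areas, 0x1000-0x2000 and 0x4000-0x5000, in a region spanning 0x0-0x8000 sees gaps of 0x1000, 0x2000 and 0x3000 bytes. The values are in bytes, whereas the debugfs file divides them by 1024 before printing.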