From patchwork Tue Jun 23 01:18:02 2020
X-Patchwork-Submitter: Saravana Kannan
X-Patchwork-Id: 11619539
Date: Mon, 22 Jun 2020 18:18:02 -0700
Message-Id: <20200623011803.91232-1-saravanak@google.com>
Subject: [PATCH v2] arm64/module: Optimize module load time by optimizing PLT counting
From: Saravana Kannan
To: Catalin Marinas, Will Deacon
Cc: linux-kernel@vger.kernel.org, kernel-team@android.com,
 linux-arm-kernel@lists.infradead.org, Saravana Kannan, Ard Biesheuvel

When loading a module, module_frob_arch_sections() tries to figure out the
number of PLTs
that'll be needed to handle all the RELAs. While doing this, it tries to
dedupe PLT allocations for multiple R_AARCH64_CALL26 relocations to the
same symbol. It does the same for R_AARCH64_JUMP26 relocations.

To make checks for duplicates easier/faster, it sorts the relocation list
by type, symbol and addend. That way, to check for a duplicate relocation,
it just needs to compare with the previous entry.

However, sorting the entire relocation array is unnecessary and expensive
(O(n log n)) because there are a lot of other relocation types that don't
need deduping or can't be deduped.

So this commit partitions the array into entries that need deduping and
those that don't, and then sorts just the part that needs deduping. When
CONFIG_RANDOMIZE_BASE is disabled, the sorting is skipped entirely because
PLTs are not allocated for R_AARCH64_CALL26 and R_AARCH64_JUMP26 in that
configuration.

This gives a significant reduction in module load time for modules with a
large number of relocations, with no measurable impact on modules with a
small number of relocations.
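
The partition-then-sort idea described above can be sketched in userspace C.
This is only an illustration under stated assumptions, not the kernel code:
struct rela, the RELA_* type codes and count_unique_branches() are
hypothetical stand-ins, qsort() replaces the kernel's sort(), the partition
is a simple single-pass variant, and the rule that a branch to a symbol in
the same section needs no PLT is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for Elf64_Rela; not the kernel's layout. */
struct rela {
	unsigned int type;	/* relocation type */
	unsigned int sym;	/* symbol index */
	long addend;
};

/* Hypothetical type codes, for illustration only. */
enum { RELA_OTHER = 0, RELA_JUMP = 1, RELA_CALL = 2 };

static int is_branch(const struct rela *r)
{
	return r->type == RELA_JUMP || r->type == RELA_CALL;
}

/* Order by type, then symbol, then addend, so duplicates become adjacent. */
static int cmp_rela(const void *a, const void *b)
{
	const struct rela *x = a, *y = b;

	if (x->type != y->type)
		return x->type < y->type ? -1 : 1;
	if (x->sym != y->sym)
		return x->sym < y->sym ? -1 : 1;
	if (x->addend != y->addend)
		return x->addend < y->addend ? -1 : 1;
	return 0;
}

/* Move branch relocations to the front; return how many there are. */
static size_t partition_branches(struct rela *r, size_t n)
{
	size_t front = 0;

	for (size_t i = 0; i < n; i++) {
		if (is_branch(&r[i])) {
			struct rela tmp = r[front];

			r[front] = r[i];
			r[i] = tmp;
			front++;
		}
	}
	return front;
}

/* Count distinct branch relocations while sorting only the branch part. */
size_t count_unique_branches(struct rela *r, size_t n)
{
	size_t nbranch = partition_branches(r, n);
	size_t unique = 0;

	assert(nbranch <= n);
	qsort(r, nbranch, sizeof(*r), cmp_rela);

	/* After sorting, a duplicate always equals its predecessor. */
	for (size_t i = 0; i < nbranch; i++)
		if (i == 0 || cmp_rela(&r[i], &r[i - 1]) != 0)
			unique++;
	return unique;
}
```

With k branch entries out of n, the partition pass is O(n) and the sort is
O(k log k), which is where the saving comes from when most relocations are
of other types.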

In my test setup with CONFIG_RANDOMIZE_BASE enabled, these were the
results for a few downstream modules:

Module		Size (MB)
wlan		14
video codec	3.8
drm		1.8
IPA		2.5
audio		1.2
gpu		1.8

Without this patch:
Module		Number of entries sorted	Module load time (ms)
wlan		243739				283
video codec	74029				138
drm		53837				67
IPA		42800				90
audio		21326				27
gpu		20967				32

Total time to load all these modules: 637 ms

With this patch:
Module		Number of entries sorted	Module load time (ms)
wlan		22454				61
video codec	10150				47
drm		13014				40
IPA		8097				63
audio		4606				16
gpu		6527				20

Total time to load all these modules: 247 ms

Time saved during boot for just these 6 modules: 390 ms

Cc: Ard Biesheuvel
Signed-off-by: Saravana Kannan
Acked-by: Will Deacon
---

v1 -> v2:
- Provided more details in the commit text
- Pulled in Will's comments on the coding style
- Pulled in Ard's suggestion about skipping jumps with the same section
  index (parts of Will's suggested code)

 arch/arm64/kernel/module-plts.c | 46 ++++++++++++++++++++++++++++++---
 1 file changed, 43 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c
index 65b08a74aec6..0ce3a28e3347 100644
--- a/arch/arm64/kernel/module-plts.c
+++ b/arch/arm64/kernel/module-plts.c
@@ -253,6 +253,40 @@ static unsigned int count_plts(Elf64_Sym *syms, Elf64_Rela *rela, int num,
 	return ret;
 }
 
+static bool branch_rela_needs_plt(Elf64_Sym *syms, Elf64_Rela *rela,
+				  Elf64_Word dstidx)
+{
+
+	Elf64_Sym *s = syms + ELF64_R_SYM(rela->r_info);
+
+	if (s->st_shndx == dstidx)
+		return false;
+
+	return ELF64_R_TYPE(rela->r_info) == R_AARCH64_JUMP26 ||
+	       ELF64_R_TYPE(rela->r_info) == R_AARCH64_CALL26;
+}
+
+/* Group branch PLT relas at the front end of the array. */
+static int partition_branch_plt_relas(Elf64_Sym *syms, Elf64_Rela *rela,
+				      int numrels, Elf64_Word dstidx)
+{
+	int i = 0, j = numrels - 1;
+
+	if (!IS_ENABLED(CONFIG_RANDOMIZE_BASE))
+		return 0;
+
+	while (i < j) {
+		if (branch_rela_needs_plt(syms, &rela[i], dstidx))
+			i++;
+		else if (branch_rela_needs_plt(syms, &rela[j], dstidx))
+			swap(rela[i], rela[j]);
+		else
+			j--;
+	}
+
+	return i;
+}
+
 int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 			      char *secstrings, struct module *mod)
 {
@@ -290,7 +324,7 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 
 	for (i = 0; i < ehdr->e_shnum; i++) {
 		Elf64_Rela *rels = (void *)ehdr + sechdrs[i].sh_offset;
-		int numrels = sechdrs[i].sh_size / sizeof(Elf64_Rela);
+		int nents, numrels = sechdrs[i].sh_size / sizeof(Elf64_Rela);
 		Elf64_Shdr *dstsec = sechdrs + sechdrs[i].sh_info;
 
 		if (sechdrs[i].sh_type != SHT_RELA)
@@ -300,8 +334,14 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 		if (!(dstsec->sh_flags & SHF_EXECINSTR))
 			continue;
 
-		/* sort by type, symbol index and addend */
-		sort(rels, numrels, sizeof(Elf64_Rela), cmp_rela, NULL);
+		/*
+		 * sort branch relocations requiring a PLT by type, symbol index
+		 * and addend
+		 */
+		nents = partition_branch_plt_relas(syms, rels, numrels,
+						   sechdrs[i].sh_info);
+		if (nents)
+			sort(rels, nents, sizeof(Elf64_Rela), cmp_rela, NULL);
 
 		if (!str_has_prefix(secstrings + dstsec->sh_name, ".init"))
 			core_plts += count_plts(syms, rels, numrels,