From patchwork Wed Nov 9 13:41:25 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nick Alcock
X-Patchwork-Id: 13037552
From: Nick Alcock
To: mcgrof@kernel.org, masahiroy@kernel.org
Cc: linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org,
 arnd@arndb.de, akpm@linux-foundation.org, eugene.loh@oracle.com,
 kris.van.hees@oracle.com
Subject: [PATCH v9 1/8] kbuild: bring back tristate.conf
Date: Wed, 9 Nov 2022 13:41:25 +0000
Message-Id: <20221109134132.9052-2-nick.alcock@oracle.com>
In-Reply-To: <20221109134132.9052-1-nick.alcock@oracle.com>
References: <20221109134132.9052-1-nick.alcock@oracle.com>

tristate.conf was dropped because it is not needed to build a
modules.builtin (although dropping it introduces a few false positives
into modules.builtin support), and doing so avoids one round of
recursion through the build tree to build it. But kallmodsyms support
requires building a mapping from object file name to built-in module
name for all builtin modules: this seems to me impossible to accomplish
without parsing all makefiles under the influence of tristate.conf,
since the makefiles are the only place this mapping is recorded.

So bring it back for this purpose.

(Thanks to the refactoring in the 5.16 timeframe, this is basically a
reimplementation of commit 8b41fc4454e36fbfdbb23f940d023d4dece2de29
rather than a simple reversion.)

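For illustration only (this is not part of the patch): a minimal
standalone sketch of the kind of line tristate.conf is expected to
contain, assuming that only tristate symbols set to y or m are emitted
and that the value is upcased. The enum, struct and function names are
hypothetical stand-ins, not Kconfig's real types.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for Kconfig's symbol types (assumption). */
enum sketch_type { SKETCH_BOOL, SKETCH_TRISTATE, SKETCH_STRING };

struct sketch_sym {
	const char *name;	/* symbol name without the CONFIG_ prefix */
	enum sketch_type type;
	const char *val;	/* "y", "m", "n", or a string value */
};

/*
 * Format one tristate.conf-style line into buf.
 *
 * Returns the number of characters snprintf() would write, or 0 if the
 * symbol is skipped (not a tristate, or set to "n").
 */
int sketch_format_tristate(char *buf, size_t len,
			   const struct sketch_sym *sym)
{
	if (sym->type != SKETCH_TRISTATE)
		return 0;
	if (strcmp(sym->val, "n") == 0)
		return 0;

	/* Upcase the first character of the value: "y" -> 'Y', "m" -> 'M'. */
	return snprintf(buf, len, "CONFIG_%s=%c\n", sym->name,
			(char)toupper((unsigned char)sym->val[0]));
}
```

Under these assumptions a tristate symbol FOO set to "y" is formatted
as CONFIG_FOO=Y and one set to "m" as CONFIG_FOO=M, while bool and
string symbols produce no line at all. A makefile that includes such a
file can then tell built-in modules apart by the uppercase value.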
Signed-off-by: Nick Alcock
Reviewed-by: Victor Erminpour
Reviewed-by: Kris Van Hees
---

Notes:
    v7: rewrite in terms of the new confdata refactoring
    v8: adjust for changes in 5.17 merge window

 Documentation/kbuild/kconfig.rst |  5 ++++
 Makefile                         |  2 +-
 scripts/kconfig/confdata.c       | 41 +++++++++++++++++++++++++++-----
 3 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/Documentation/kbuild/kconfig.rst b/Documentation/kbuild/kconfig.rst
index 5967c79c3baa..e2c78760d442 100644
--- a/Documentation/kbuild/kconfig.rst
+++ b/Documentation/kbuild/kconfig.rst
@@ -162,6 +162,11 @@ KCONFIG_AUTOCONFIG
 This environment variable can be set to specify the path & name of the
 "auto.conf" file. Its default value is "include/config/auto.conf".
 
+KCONFIG_TRISTATE
+----------------
+This environment variable can be set to specify the path & name of the
+"tristate.conf" file. Its default value is "include/config/tristate.conf".
+
 KCONFIG_AUTOHEADER
 ------------------
 This environment variable can be set to specify the path & name of the
diff --git a/Makefile b/Makefile
index d148a55bfd0f..5d26447fecb8 100644
--- a/Makefile
+++ b/Makefile
@@ -793,7 +793,7 @@ $(KCONFIG_CONFIG):
 #
 # Do not use $(call cmd,...) here. That would suppress prompts from syncconfig,
 # so you cannot notice that Kconfig is waiting for the user input.
-%/config/auto.conf %/config/auto.conf.cmd %/generated/autoconf.h %/generated/rustc_cfg: $(KCONFIG_CONFIG)
+%/config/auto.conf %/config/auto.conf.cmd %/generated/autoconf.h %/generated/rustc_cfg %/tristate.conf: $(KCONFIG_CONFIG)
 	$(Q)$(kecho) "  SYNC    $@"
 	$(Q)$(MAKE) -f $(srctree)/Makefile syncconfig
 else # !may-sync-config
diff --git a/scripts/kconfig/confdata.c b/scripts/kconfig/confdata.c
index b7c9f1dd5e42..160d12b69957 100644
--- a/scripts/kconfig/confdata.c
+++ b/scripts/kconfig/confdata.c
@@ -223,6 +223,13 @@ static const char *conf_get_rustccfg_name(void)
 	return name ? name : "include/generated/rustc_cfg";
 }
 
+static const char *conf_get_tristate_name(void)
+{
+	char *name = getenv("KCONFIG_TRISTATE");
+
+	return name ? name : "include/config/tristate.conf";
+}
+
 static int conf_set_sym_val(struct symbol *sym, int def, int def_flags, char *p)
 {
 	char *p2;
@@ -670,8 +677,12 @@ static char *escape_string_value(const char *in)
 
 enum output_n { OUTPUT_N, OUTPUT_N_AS_UNSET, OUTPUT_N_NONE };
 
+#define PRINT_ESCAPE		0x01
+#define PRINT_UPCASE		0x02
+#define PRINT_TRISTATE_ONLY	0x04
+
 static void __print_symbol(FILE *fp, struct symbol *sym, enum output_n output_n,
-			   bool escape_string)
+			   int flags)
 {
 	const char *val;
 	char *escaped = NULL;
@@ -679,6 +690,9 @@ static void __print_symbol(FILE *fp, struct symbol *sym, enum output_n output_n,
 	if (sym->type == S_UNKNOWN)
 		return;
 
+	if (flags & PRINT_TRISTATE_ONLY && sym->type != S_TRISTATE)
+		return;
+
 	val = sym_get_string_value(sym);
 
 	if ((sym->type == S_BOOLEAN || sym->type == S_TRISTATE) &&
@@ -688,29 +702,38 @@ static void __print_symbol(FILE *fp, struct symbol *sym, enum output_n output_n,
 		return;
 	}
 
-	if (sym->type == S_STRING && escape_string) {
+	if (sym->type == S_STRING && flags & PRINT_ESCAPE) {
 		escaped = escape_string_value(val);
 		val = escaped;
 	}
 
-	fprintf(fp, "%s%s=%s\n", CONFIG_, sym->name, val);
+	if (flags & PRINT_UPCASE)
+		fprintf(fp, "%s%s=%c\n", CONFIG_, sym->name, (char)toupper(*val));
+	else
+		fprintf(fp, "%s%s=%s\n", CONFIG_, sym->name, val);
 
 	free(escaped);
 }
 
 static void print_symbol_for_dotconfig(FILE *fp, struct symbol *sym)
 {
-	__print_symbol(fp, sym, OUTPUT_N_AS_UNSET, true);
+	__print_symbol(fp, sym, OUTPUT_N_AS_UNSET, PRINT_ESCAPE);
 }
 
 static void print_symbol_for_autoconf(FILE *fp, struct symbol *sym)
 {
-	__print_symbol(fp, sym, OUTPUT_N_NONE, false);
+	__print_symbol(fp, sym, OUTPUT_N_NONE, 0);
+}
+
+static void print_symbol_for_tristate(FILE *fp, struct symbol *sym)
+{
+	__print_symbol(fp, sym, OUTPUT_N_NONE, PRINT_ESCAPE | PRINT_UPCASE |
+		       PRINT_TRISTATE_ONLY);
 }
 
 void print_symbol_for_listconfig(struct symbol *sym)
 {
-	__print_symbol(stdout, sym, OUTPUT_N, true);
+	__print_symbol(stdout, sym, OUTPUT_N, PRINT_ESCAPE);
 }
 
 static void print_symbol_for_c(FILE *fp, struct symbol *sym)
@@ -1207,6 +1230,12 @@ int conf_write_autoconf(int overwrite)
 	if (ret)
 		return ret;
 
+	ret = __conf_write_autoconf(conf_get_tristate_name(),
+				    print_symbol_for_tristate,
+				    &comment_style_pound);
+	if (ret)
+		return ret;
+
 	/*
 	 * Create include/config/auto.conf. This must be the last step because
 	 * Kbuild has a dependency on auto.conf and this marks the successful

From patchwork Wed Nov 9 13:41:26 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nick Alcock
X-Patchwork-Id: 13037549
From: Nick Alcock
To: mcgrof@kernel.org, masahiroy@kernel.org
Cc: linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org,
 arnd@arndb.de, akpm@linux-foundation.org, eugene.loh@oracle.com,
 kris.van.hees@oracle.com
Subject: [PATCH v9 2/8] kbuild: add modules_thick.builtin
Date: Wed, 9 Nov 2022 13:41:26 +0000
Message-Id: <20221109134132.9052-3-nick.alcock@oracle.com>
In-Reply-To: <20221109134132.9052-1-nick.alcock@oracle.com>
References: <20221109134132.9052-1-nick.alcock@oracle.com>

This is similar to modules.builtin, and is constructed in much the same
way that file used to be built before commit
8b41fc4454e36fbfdbb23f940d023d4dece2de29: via tristate.conf inclusion
and recursive concatenation up the tree. Unlike modules.builtin,
modules_thick.builtin gives the names of the object files that make up
modules that are composed of more than one object file, using a syntax
similar to that of makefiles, e.g.:

crypto/crypto.o: crypto/api.o crypto/cipher.o crypto/compress.o crypto/memneq.o
crypto/crypto_algapi.o: crypto/algapi.o crypto/proc.o crypto/scatterwalk.o
crypto/aead.o:
crypto/geniv.o:

(where the latter two are single-file modules).

An upcoming commit will use this mapping to populate /proc/kallmodsyms.

A parser is included that yields a stream of (module, objfile name[])
mappings: it's a bit baroque, but then parsing text files in C is quite
painful, and I'd rather put the complexity in here than in its callers.
The parser is not built in this commit, nor does it have any callers
yet; nor is any rule added that causes modules_thick.builtin to actually
be constructed. (Again, see a later commit for that.)

I am not wedded to the approach used to construct this file, but I don't
see any other way to do it despite spending a week or so trying to tie
it into Kbuild without using a separate Makefile.modbuiltin: unlike the
names of builtin modules (which are also recorded in the source files
themselves via MODULE_*() macros) the mapping from object file name to
built-in module name is not recorded anywhere but in the makefiles
themselves, so we have to at least reparse them with something to
indicate the builtin-ness of each module (i.e., tristate.conf) if we are
to figure out which modules are built-in and which are not.

Signed-off-by: Nick Alcock
Reviewed-by: Kris Van Hees
Signed-off-by: Luis Chamberlain
---

Notes:
    v9: move modules_thick.builtin generation into the top-level Kbuild

 .gitignore                  |   1 +
 Documentation/dontdiff      |   1 +
 Kbuild                      |  16 +++
 Makefile                    |   2 +-
 scripts/Kbuild.include      |   6 ++
 scripts/Makefile.modbuiltin |  56 ++++++++++
 scripts/modules_thick.c     | 200 ++++++++++++++++++++++++++++++++++++
 scripts/modules_thick.h     |  48 +++++++++
 8 files changed, 329 insertions(+), 1 deletion(-)
 create mode 100644 scripts/Makefile.modbuiltin
 create mode 100644 scripts/modules_thick.c
 create mode 100644 scripts/modules_thick.h

diff --git a/.gitignore b/.gitignore
index 5da004814678..f129bf52cbd4 100644
--- a/.gitignore
+++ b/.gitignore
@@ -52,6 +52,7 @@
 *.zst
 Module.symvers
 modules.order
+modules_thick.builtin
 
 #
 # Top-level generic files
diff --git a/Documentation/dontdiff b/Documentation/dontdiff
index 352ff53a2306..077d43b9675d 100644
--- a/Documentation/dontdiff
+++ b/Documentation/dontdiff
@@ -183,6 +183,7 @@ modules.builtin
 modules.builtin.modinfo
 modules.nsdeps
 modules.order
+modules_thick.builtin
 modversions.h*
 nconf
 nconf-cfg
diff --git a/Kbuild b/Kbuild
index 464b34a08f51..a84e8312c174 100644
--- a/Kbuild
+++ b/Kbuild
@@ -97,3 +97,19 @@ obj-$(CONFIG_SAMPLES)	+= samples/
 obj-$(CONFIG_NET)	+= net/
 obj-y			+= virt/
 obj-y			+= $(ARCH_DRIVERS)
+
+# Generate modules_thick.builtin if needed.
+#
+# modules_thick.builtin maps from kernel modules (or rather the object file
+# names they would have had had they not been built in) to their constituent
+# object files: we can use this to determine which modules any given object
+# file is part of. (We cannot eliminate the slight redundancy here without
+# double-expansion.)
+
+modthickbuiltin-files := $(addsuffix modules_thick.builtin, $(filter %/,$(obj-y)))
+
+$(modthickbuiltin-files): include/config/tristate.conf
+	$(Q)$(MAKE) $(modbuiltin)=$(patsubst %/modules_thick.builtin,%,$@) builtin-file=modules_thick.builtin
+
+modules_thick.builtin: $(modthickbuiltin-files) $(obj-y)
+	$(Q)$(AWK) '!x[$$0]++' $(patsubst %/built-in.a, %/$@, $(filter %/built-in.a,$(obj-y))) > $@
diff --git a/Makefile b/Makefile
index 5d26447fecb8..21117f9d4202 100644
--- a/Makefile
+++ b/Makefile
@@ -2008,7 +2008,7 @@ clean: $(clean-dirs)
 		-o -name '*.lex.c' -o -name '*.tab.[ch]' \
 		-o -name '*.asn1.[ch]' \
 		-o -name '*.symtypes' -o -name 'modules.order' \
-		-o -name '.tmp_*' \
+		-o -name '.tmp_*' -o -name modules_thick.builtin \
 		-o -name '*.c.[012]*.*' \
 		-o -name '*.ll' \
 		-o -name '*.gcno' \
diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
index 2bc08ace38a3..28d5eb7a9b61 100644
--- a/scripts/Kbuild.include
+++ b/scripts/Kbuild.include
@@ -78,6 +78,12 @@ endef
 # $(Q)$(MAKE) $(build)=dir
 build := -f $(srctree)/scripts/Makefile.build obj
 
+###
+# Shorthand for $(Q)$(MAKE) -f scripts/Makefile.modbuiltin obj=
+# Usage:
+# $(Q)$(MAKE) $(modbuiltin)=dir
+modbuiltin := -f $(srctree)/scripts/Makefile.modbuiltin obj
+
 ###
 # Shorthand for $(Q)$(MAKE) -f scripts/Makefile.dtbinst obj=
 # Usage:
diff --git a/scripts/Makefile.modbuiltin b/scripts/Makefile.modbuiltin
new file mode 100644
index 000000000000..a27b692ea795
--- /dev/null
+++ b/scripts/Makefile.modbuiltin
@@ -0,0 +1,56 @@
+# SPDX-License-Identifier: GPL-2.0
+# ==========================================================================
+# Generating modules_thick.builtin
+# ==========================================================================
+
+src := $(obj)
+
+PHONY := __modbuiltin
+__modbuiltin:
+
+include include/config/auto.conf
+# tristate.conf sets tristate variables to uppercase 'Y' or 'M'
+# That way, we get the list of built-in modules in obj-Y
+include include/config/tristate.conf
+
+include scripts/Kbuild.include
+
+ifdef building_out_of_srctree
+# Create output directory if not already present
+_dummy := $(shell [ -d $(obj) ] || mkdir -p $(obj))
+endif
+
+# The filename Kbuild has precedence over Makefile
+kbuild-dir := $(if $(filter /%,$(src)),$(src),$(srctree)/$(src))
+kbuild-file := $(if $(wildcard $(kbuild-dir)/Kbuild),$(kbuild-dir)/Kbuild,$(kbuild-dir)/Makefile)
+include $(kbuild-file)
+
+include scripts/Makefile.lib
+
+modthickbuiltin-subdirs := $(patsubst %,%/modules_thick.builtin, $(subdir-ym))
+modthickbuiltin-target := $(obj)/modules_thick.builtin
+
+__modbuiltin: $(obj)/$(builtin-file) $(subdir-ym)
+	@:
+
+$(modthickbuiltin-target): $(subdir-ym) FORCE
+	$(Q) rm -f $@
+	$(Q) $(foreach mod-o, $(filter %.o,$(obj-Y)),\
+		printf "%s:" $(addprefix $(obj)/,$(mod-o)) >> $@; \
+		printf " %s" $(sort $(strip $(addprefix $(obj)/,$($(mod-o:.o=-objs)) \
+			$($(mod-o:.o=-y)) $($(mod-o:.o=-Y))))) >> $@; \
+		printf "\n" >> $@; ) \
+	cat /dev/null $(modthickbuiltin-subdirs) >> $@;
+
+PHONY += FORCE
+
+FORCE:
+
+# Descending
+# ---------------------------------------------------------------------------
+
+PHONY += $(subdir-ym)
+$(subdir-ym):
+	$(Q)$(MAKE) $(modbuiltin)=$@ builtin-file=$(builtin-file)
+
+.PHONY: $(PHONY)
diff --git a/scripts/modules_thick.c b/scripts/modules_thick.c
new file mode 100644
index 000000000000..9a15e99c1330
--- /dev/null
+++ b/scripts/modules_thick.c
@@ -0,0 +1,200 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * A simple modules_thick reader.
+ *
+ * (C) 2014, 2021 Oracle, Inc. All rights reserved.
+ * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + */ + +#include +#include +#include +#include + +#include "modules_thick.h" + +/* + * Read a modules_thick.builtin file and translate it into a stream of + * name / module-name pairs. + */ + +/* + * Construct a modules_thick.builtin iterator. + */ +struct modules_thick_iter * +modules_thick_iter_new(const char *modules_thick_file) +{ + struct modules_thick_iter *i; + + i = calloc(1, sizeof(struct modules_thick_iter)); + if (i == NULL) + return NULL; + + i->f = fopen(modules_thick_file, "r"); + + if (i->f == NULL) { + fprintf(stderr, "Cannot open builtin module file %s: %s\n", + modules_thick_file, strerror(errno)); + return NULL; + } + + return i; +} + +/* + * Iterate, returning a new null-terminated array of object file names, and a + * new dynamically-allocated module name. (The module name passed in is freed.) + * + * The array of object file names should be freed by the caller: the strings it + * points to are owned by the iterator, and should not be freed. + */ + +char ** __attribute__((__nonnull__)) +modules_thick_iter_next(struct modules_thick_iter *i, char **module_name) +{ + size_t npaths = 1; + char **module_paths; + char *last_slash; + char *last_dot; + char *trailing_linefeed; + char *object_name = i->line; + char *dash; + int composite = 0; + + /* + * Read in all module entries, computing the suffixless, pathless name + * of the module and building the next arrayful of object file names for + * return. + * + * Modules can consist of multiple files: in this case, the portion + * before the colon is the path to the module (as before): the portion + * after the colon is a space-separated list of files that should be + * considered part of this module. 
In this case, the portion before the + * name is an "object file" that does not actually exist: it is merged + * into built-in.a without ever being written out. + * + * All module names have - translated to _, to match what is done to the + * names of the same things when built as modules. + */ + + /* + * Reinvocation of exhausted iterator. Return NULL, once. + */ +retry: + if (getline(&i->line, &i->line_size, i->f) < 0) { + if (ferror(i->f)) { + fprintf(stderr, "Error reading from modules_thick file:" + " %s\n", strerror(errno)); + exit(1); + } + rewind(i->f); + return NULL; + } + + if (i->line[0] == '\0') + goto retry; + + /* + * Slice the line in two at the colon, if any. If there is anything + * past the ': ', this is a composite module. (We allow for no colon + * for robustness, even though one should always be present.) + */ + if (strchr(i->line, ':') != NULL) { + char *name_start; + + object_name = strchr(i->line, ':'); + *object_name = '\0'; + object_name++; + name_start = object_name + strspn(object_name, " \n"); + if (*name_start != '\0') { + composite = 1; + object_name = name_start; + } + } + + /* + * Figure out the module name. + */ + last_slash = strrchr(i->line, '/'); + last_slash = (!last_slash) ? i->line : + last_slash + 1; + free(*module_name); + *module_name = strdup(last_slash); + dash = *module_name; + + while (dash != NULL) { + dash = strchr(dash, '-'); + if (dash != NULL) + *dash = '_'; + } + + last_dot = strrchr(*module_name, '.'); + if (last_dot != NULL) + *last_dot = '\0'; + + trailing_linefeed = strchr(object_name, '\n'); + if (trailing_linefeed != NULL) + *trailing_linefeed = '\0'; + + /* + * Multifile separator? Object file names explicitly stated: + * slice them up and shuffle them in. + * + * The array size may be an overestimate if any object file + * names start or end with spaces (very unlikely) but cannot be + * an underestimate. (Check for it anyway.) 
+ */ + if (composite) { + char *one_object; + + for (npaths = 0, one_object = object_name; + one_object != NULL; + npaths++, one_object = strchr(one_object + 1, ' ')); + } + + module_paths = malloc((npaths + 1) * sizeof(char *)); + if (!module_paths) { + fprintf(stderr, "%s: out of memory on module %s\n", __func__, + *module_name); + exit(1); + } + + if (composite) { + char *one_object; + size_t i = 0; + + while ((one_object = strsep(&object_name, " ")) != NULL) { + if (i >= npaths) { + fprintf(stderr, "%s: num_objs overflow on module " + "%s: this is a bug.\n", __func__, + *module_name); + exit(1); + } + + module_paths[i++] = one_object; + } + } else + module_paths[0] = i->line; /* untransformed module name */ + + module_paths[npaths] = NULL; + + return module_paths; +} + +/* + * Free an iterator. Can be called while iteration is underway, so even + * state that is freed at the end of iteration must be freed here too. + */ +void +modules_thick_iter_free(struct modules_thick_iter *i) +{ + if (i == NULL) + return; + fclose(i->f); + free(i->line); + free(i); +} diff --git a/scripts/modules_thick.h b/scripts/modules_thick.h new file mode 100644 index 000000000000..f5edcaf9550c --- /dev/null +++ b/scripts/modules_thick.h @@ -0,0 +1,48 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * A simple modules_thick reader. + * + * (C) 2014, 2021 Oracle, Inc. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + */ + +#ifndef _LINUX_MODULES_THICK_H +#define _LINUX_MODULES_THICK_H + +#include +#include + +/* + * modules_thick.builtin iteration state. + */ +struct modules_thick_iter { + FILE *f; + char *line; + size_t line_size; +}; + +/* + * Construct a modules_thick.builtin iterator. 
+ */ +struct modules_thick_iter * +modules_thick_iter_new(const char *modules_thick_file); + +/* + * Iterate, returning a new null-terminated array of object file names, and a + * new dynamically-allocated module name. (The module name passed in is freed.) + * + * The array of object file names should be freed by the caller: the strings it + * points to are owned by the iterator, and should not be freed. + */ + +char ** __attribute__((__nonnull__)) +modules_thick_iter_next(struct modules_thick_iter *i, char **module_name); + +void +modules_thick_iter_free(struct modules_thick_iter *i); + +#endif From patchwork Wed Nov 9 13:41:27 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nick Alcock X-Patchwork-Id: 13037548 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EEEBAC433FE for ; Wed, 9 Nov 2022 13:52:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229537AbiKINwy (ORCPT ); Wed, 9 Nov 2022 08:52:54 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40056 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229518AbiKINww (ORCPT ); Wed, 9 Nov 2022 08:52:52 -0500 Received: from mx0a-00069f02.pphosted.com (mx0a-00069f02.pphosted.com [205.220.165.32]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B5DF21CB24; Wed, 9 Nov 2022 05:52:51 -0800 (PST) Received: from pps.filterd (m0246627.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 2A9Dk6D6027859; Wed, 9 Nov 2022 13:52:38 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding : content-type : mime-version; s=corp-2022-7-12; 
From: Nick Alcock
To: mcgrof@kernel.org, masahiroy@kernel.org
Cc: linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org, arnd@arndb.de, akpm@linux-foundation.org, eugene.loh@oracle.com, kris.van.hees@oracle.com
Subject: [PATCH v9 3/8] kbuild: generate an address ranges map at vmlinux link time
Date: Wed, 9 Nov 2022 13:41:27 +0000
Message-Id: <20221109134132.9052-4-nick.alcock@oracle.com>
X-Mailer: git-send-email 2.38.0.266.g481848f278
In-Reply-To: <20221109134132.9052-1-nick.alcock@oracle.com>
References:
 <20221109134132.9052-1-nick.alcock@oracle.com>

This emits a new file, .tmp_vmlinux.ranges, which maps address range/size
pairs in vmlinux to the object files which make them up, e.g., in part:

0x0000000000000000 0x30 arch/x86/kernel/cpu/common.o
0x0000000000001000 0x1000 arch/x86/events/intel/ds.o
0x0000000000002000 0x4000 arch/x86/kernel/irq_64.o
0x0000000000006000 0x5000 arch/x86/kernel/process.o
0x000000000000b000 0x1000 arch/x86/kernel/cpu/common.o
0x000000000000c000 0x5000 arch/x86/mm/cpu_entry_area.o
0x0000000000011000 0x10 arch/x86/kernel/espfix_64.o
0x0000000000011010 0x2 arch/x86/kernel/cpu/common.o
[...]

In my simple tests this seems to work with clang too, but I'm not sure how
stable the format of clang's linker mapfiles is: if it turns out not to
work in some versions, the mapfile-massaging awk script added here might
need some adjustment.

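As a rough illustration of the file format shown above, a consumer could
parse each line of .tmp_vmlinux.ranges into an address, a size and an
object file path.  The sketch below is not part of this patch: struct
range_entry and parse_range_line() are hypothetical names, and it assumes
every line has exactly the three whitespace-separated fields shown in the
example and that paths are shorter than 256 bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one line of .tmp_vmlinux.ranges. */
struct range_entry {
	unsigned long long addr;	/* start address of the range */
	unsigned long long size;	/* size of the range in bytes */
	char obj[256];			/* object file path */
};

/*
 * Parse one "address size objfile" line.
 * Returns 0 on success, -1 if the line does not have all three fields.
 */
static int parse_range_line(const char *line, struct range_entry *out)
{
	assert(line != NULL && out != NULL);

	/* %llx accepts an optional 0x prefix; %255s stops at whitespace. */
	if (sscanf(line, "%llx %llx %255s", &out->addr, &out->size,
		   out->obj) != 3)
		return -1;
	return 0;
}
```

A reader would typically call parse_range_line() on each line it gets back
from getline() and skip any line for which it returns -1, such as the
"[...]" elision marker in the example above.
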
Signed-off-by: Nick Alcock
Reviewed-by: Kris Van Hees
---

Notes:
    v6: use ${wl} where appropriate to avoid failure on UML

 scripts/link-vmlinux.sh | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 918470d768e9..287a2b2c4d46 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -101,7 +101,7 @@ vmlinux_link()
 		${ld} ${ldflags} -o ${output}	\
 		${wl}--whole-archive ${objs} ${wl}--no-whole-archive	\
 		${wl}--start-group ${libs} ${wl}--end-group	\
-		$@ ${ldlibs}
+		${wl}-Map=.tmp_vmlinux.map $@ ${ldlibs}
 }
 
 # generate .BTF typeinfo from DWARF debuginfo
@@ -144,6 +144,19 @@ kallsyms()
 {
 	local kallsymopt;
 
+	# read the linker map to identify ranges of addresses:
+	#   - for each *.o file, report address, size, pathname
+	#       - most such lines will have four fields
+	#       - but sometimes there is a line break after the first field
+	#   - start reading at "Linker script and memory map"
+	#   - stop reading at ".brk"
+	${AWK} '
+	    /\.o$/ && start==1 { print $(NF-2), $(NF-1), $NF }
+	    /^Linker script and memory map/ { start = 1 }
+	    /^\.brk/ { exit(0) }
+	' .tmp_vmlinux.map | sort > .tmp_vmlinux.ranges
+
+	# get kallsyms options
 	if is_enabled CONFIG_KALLSYMS_ALL; then
 		kallsymopt="${kallsymopt} --all-symbols"
 	fi

From patchwork Wed Nov 9 13:41:28 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nick Alcock
X-Patchwork-Id: 13037550
From: Nick Alcock
To: mcgrof@kernel.org, masahiroy@kernel.org
Cc: linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org, arnd@arndb.de, akpm@linux-foundation.org, eugene.loh@oracle.com, kris.van.hees@oracle.com
Subject: [PATCH v9 4/8] kallsyms: introduce sections needed to map symbols to built-in modules
Date: Wed, 9 Nov 2022 13:41:28 +0000
Message-Id: <20221109134132.9052-5-nick.alcock@oracle.com>
X-Mailer: git-send-email 2.38.0.266.g481848f278
In-Reply-To: <20221109134132.9052-1-nick.alcock@oracle.com>
References: <20221109134132.9052-1-nick.alcock@oracle.com>

The mapping consists of three new symbols, computed by integrating the
information in the (just-added) .tmp_vmlinux.ranges and
modules_thick.builtin: taken together, they map address ranges
(corresponding to object files on the input) to the names of zero or more
modules containing those address ranges.

 - kallsyms_module_addresses/kallsyms_module_offsets encodes the
   address/offset of each object file (derived from the linker map), in
   exactly the same way as kallsyms_addresses/kallsyms_offsets does for
   symbols.  There is no size: instead, the object files are assumed to
   tile the address space.  (This is slightly more space-efficient than
   using a size.)  Non-text-section addresses are skipped: for now, all
   the users of this interface only need module/non-module information
   for instruction pointer addresses, not absolute-addressed symbols and
   the like.  This restriction can easily be lifted in future.  (Regarding
   the name: right now the entries correspond pretty closely to object
   files, so we could call the section kallsyms_objfiles or something, but
   the optimizer added in the next commit will change this.)

 - kallsyms_mod_objnames encodes the name of each module in a modified
   form of strtab: notably, if an object file appears in *multiple*
   modules, all of which are built in, this is encoded via a zero byte, a
   one-byte module count, then a series of that many null-terminated
   strings.  As a special case, the table starts with a single zero byte
   which does *not* represent the start of a multi-module list.  (The name
   is "objnames" because in an upcoming commit it will store some object
   file names too.)

 - kallsyms_modules connects the two, encoding a table associated 1:1
   with kallsyms_module_addresses / kallsyms_module_offsets, pointing at
   an offset in kallsyms_mod_objnames describing which module (or modules,
   for a multi-module list) the code occupying this address range is part
   of.  If an address range is part of no module (always built-in) it
   points at 0 (the null byte at the start of the kallsyms_mod_objnames
   list).

There is no optimization yet: kallsyms_modules and kallsyms_mod_objnames
will almost certainly contain many duplicate entries, and
kallsyms_module_{addresses,offsets} may contain consecutive entries that
point to the same place.  The size hit is fairly substantial as a result,
though still much less than a naive implementation mapping each symbol to
a module name would be: 50KiB or so.

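To make the name-table layout described above concrete, here is a minimal
sketch of how one entry might be decoded, given an offset taken from
kallsyms_modules.  It only illustrates the encoding as this message
describes it and is not the reader used by the kernel:
mod_objnames_decode() is a hypothetical helper, and it assumes the table
is well formed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Decode the entry at offset `off` in a name table laid out as described
 * above.  Stores up to `max` pointers to module names in `names` and
 * returns the number of modules the entry names (0 for "no module").
 * Hypothetical helper: not part of the patch.
 */
static size_t mod_objnames_decode(const char *table, size_t off,
				  const char **names, size_t max)
{
	const char *p;
	size_t n, i;

	assert(table != NULL && names != NULL);

	/* Offset 0 is the leading zero byte: always built in, no module. */
	if (off == 0)
		return 0;

	p = table + off;
	if (*p != '\0') {
		/* Single module: one null-terminated name. */
		if (max > 0)
			names[0] = p;
		return 1;
	}

	/* Multi-module list: zero byte, one-byte count, then the names. */
	n = (unsigned char)p[1];
	p += 2;
	for (i = 0; i < n; i++) {
		if (i < max)
			names[i] = p;
		p += strlen(p) + 1;
	}
	return n;
}
```

The leading zero byte is what lets offset 0 mean "no module" without
being mistaken for the start of a multi-module list: the decoder checks
for offset 0 before it ever looks at the byte stored there.
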
Since this commit is the first user of modules_thick.builtin, introduce
rules to actually build it when CONFIG_KALLMODSYMS is set (similarly to
modules.order, it is named in the top-level makefile purely for
documentation purposes, then reiterated in the makefile where it is
actually built, in this case the top-level Kbuild).

Since it's also the first user of the new Kconfig symbol to enable
compiling-out of /proc/kallmodsyms support, introduce that symbol too.

Signed-off-by: Nick Alcock
Reviewed-by: Kris Van Hees
---

Notes:
    v9: Rename .kallsyms_module_names to .kallsyms_mod_objnames now that it
        contains object file names too.  Adjustments to the Kconfig wording;
        adjustments to modules_thick.builtin rules.  Adjust to getopt_long
        use in scripts/kallsyms.

 Kbuild             |   6 +
 Makefile           |   5 +-
 init/Kconfig       |   9 ++
 scripts/Makefile   |   6 +
 scripts/kallsyms.c | 375 ++++++++++++++++++++++++++++++++++++++++++++-
 5 files changed, 392 insertions(+), 9 deletions(-)

diff --git a/Kbuild b/Kbuild
index a84e8312c174..2bd36178ae64 100644
--- a/Kbuild
+++ b/Kbuild
@@ -113,3 +113,9 @@ $(modthickbuiltin-files): include/config/tristate.conf
 
 modules_thick.builtin: $(modthickbuiltin-files) $(obj-y)
 	$(Q)$(AWK) '!x[$$0]++' $(patsubst %/built-in.a, %/$@, $(filter %/built-in.a,$(obj-y))) > $@
+
+ifdef CONFIG_KALLMODSYMS
+ifdef need-builtin
+extra-y += modules_thick.builtin
+endif
+endif
diff --git a/Makefile b/Makefile
index 21117f9d4202..71f4d7abd6ed 100644
--- a/Makefile
+++ b/Makefile
@@ -1232,7 +1232,7 @@ vmlinux.o modules.builtin.modinfo modules.builtin: vmlinux_o
 	@:
 
 PHONY += vmlinux
-vmlinux: vmlinux.o $(KBUILD_LDS) modpost
+vmlinux: vmlinux.o $(KBUILD_LDS) modules_thick.builtin modpost
 	$(Q)$(MAKE) -f $(srctree)/scripts/Makefile.vmlinux
 
 # The actual objects are generated when descending,
@@ -1562,6 +1562,9 @@ __modinst_pre:
 
 endif # CONFIG_MODULES
 
+modules_thick.builtin: $(build-dir)
+	@:
+
 ###
 # Cleaning is done on three levels.
# make clean Delete most generated files diff --git a/init/Kconfig b/init/Kconfig index abf65098f1b6..70cc2e67bef7 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1570,6 +1570,15 @@ config POSIX_TIMERS If unsure say y. +config KALLMODSYMS + default y + bool "Enable support for /proc/kallmodsyms" if EXPERT + depends on KALLSYMS + help + This option enables the /proc/kallmodsyms file, which unambiguously + maps built-in kernel symbols and their associated object files and + modules to addresses. + config PRINTK default y bool "Enable support for printk" if EXPERT diff --git a/scripts/Makefile b/scripts/Makefile index 1575af84d557..acd46bfeedc3 100644 --- a/scripts/Makefile +++ b/scripts/Makefile @@ -32,6 +32,12 @@ ifdef CONFIG_BUILDTIME_MCOUNT_SORT HOSTCFLAGS_sorttable.o += -DMCOUNT_SORT_ENABLED endif +kallsyms-objs := kallsyms.o + +ifdef CONFIG_KALLMODSYMS +kallsyms-objs += modules_thick.o +endif + # The following programs are only built on demand hostprogs += unifdef diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c index 03fa07ad45d9..6b9654a151fb 100644 --- a/scripts/kallsyms.c +++ b/scripts/kallsyms.c @@ -5,7 +5,10 @@ * This software may be used and distributed according to the terms * of the GNU General Public License, incorporated herein by reference. * - * Usage: nm -n vmlinux | scripts/kallsyms [--all-symbols] > symbols.S + * Usage: nm -n vmlinux + * | scripts/kallsyms [--all-symbols] [--absolute-percpu] + * [--base-relative] [--builtin=modules_thick.builtin] + * > symbols.S * * Table compression uses all the unused char codes on the symbols and * maps these to the most used substrings (tokens). 
For instance, it might @@ -25,6 +28,10 @@ #include <string.h> #include <ctype.h> #include <limits.h> +#include <assert.h> +#include "modules_thick.h" + +#include "../include/generated/autoconf.h" #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof(arr[0])) @@ -84,11 +91,118 @@ static int token_profit[0x10000]; static unsigned char best_table[256][2]; static unsigned char best_table_len[256]; +#ifdef CONFIG_KALLMODSYMS +static unsigned int strhash(const char *s) +{ + /* fnv32 hash */ + unsigned int hash = 2166136261U; + + for (; *s; s++) + hash = (hash ^ *s) * 0x01000193; + return hash; +} + +#define OBJ2MOD_BITS 10 +#define OBJ2MOD_N (1 << OBJ2MOD_BITS) +#define OBJ2MOD_MASK (OBJ2MOD_N - 1) +struct obj2mod_elem { + char *obj; + char *mods; /* sorted module name strtab */ + size_t nmods; /* number of modules in "mods" */ + size_t mods_size; /* size of all mods together */ + int mod_offset; /* offset of module name in .kallsyms_mod_objnames */ + struct obj2mod_elem *obj2mod_next; +}; + +/* + * Map from object files to obj2mod entries (a unique mapping). + */ + +static struct obj2mod_elem *obj2mod[OBJ2MOD_N]; +static size_t num_objfiles; + +/* + * An ordered list of address ranges and the objfile that occupies that range. + */ +struct addrmap_entry { + char *obj; + unsigned long long addr; + unsigned long long end_addr; + struct obj2mod_elem *objfile; +}; +static struct addrmap_entry *addrmap; +static int addrmap_num, addrmap_alloced; + +static void obj2mod_init(void) +{ + memset(obj2mod, 0, sizeof(obj2mod)); +} + +static struct obj2mod_elem *obj2mod_get(const char *obj) +{ + int i = strhash(obj) & OBJ2MOD_MASK; + struct obj2mod_elem *elem; + + for (elem = obj2mod[i]; elem; elem = elem->obj2mod_next) { + if (strcmp(elem->obj, obj) == 0) + return elem; + } + return NULL; +} + +/* + * Note that a given object file is found in some module, interning it in the + * obj2mod hash. Should not be called more than once for any given (module, + * object) pair.
+ */ +static void obj2mod_add(char *obj, char *mod) +{ + int i = strhash(obj) & OBJ2MOD_MASK; + struct obj2mod_elem *elem; + + elem = obj2mod_get(obj); + if (!elem) { + elem = malloc(sizeof(struct obj2mod_elem)); + if (!elem) + goto oom; + memset(elem, 0, sizeof(struct obj2mod_elem)); + elem->obj = strdup(obj); + if (!elem->obj) + goto oom; + elem->mods = strdup(mod); + if (!elem->mods) + goto oom; + + elem->obj2mod_next = obj2mod[i]; + obj2mod[i] = elem; + num_objfiles++; + } else { + elem->mods = realloc(elem->mods, elem->mods_size + + strlen(mod) + 1); + if (!elem->mods) + goto oom; + strcpy(elem->mods + elem->mods_size, mod); + } + + elem->mods_size += strlen(mod) + 1; + elem->nmods++; + if (elem->nmods > 255) { + fprintf(stderr, "kallsyms: %s: too many modules associated with this object file\n", + obj); + exit(EXIT_FAILURE); + } + return; +oom: + fprintf(stderr, "kallsyms: out of memory\n"); + exit(1); +} +#endif /* CONFIG_KALLMODSYMS */ static void usage(void) { fprintf(stderr, "Usage: kallsyms [--all-symbols] [--absolute-percpu] " - "[--base-relative] in.map > out.S\n"); + "[--base-relative] [--builtin=modules_thick.builtin] " + " in.map > out.S\n"); exit(1); } @@ -112,10 +226,16 @@ static bool is_ignored_symbol(const char *name, char type) "kallsyms_offsets", "kallsyms_relative_base", "kallsyms_num_syms", + "kallsyms_num_modules", "kallsyms_names", "kallsyms_markers", "kallsyms_token_table", "kallsyms_token_index", + "kallsyms_module_offsets", + "kallsyms_module_addresses", + "kallsyms_modules", + "kallsyms_mod_objnames", + "kallsyms_mod_objnames_len", /* Exclude linker generated symbols which vary between passes */ "_SDA_BASE_", /* ppc */ "_SDA2_BASE_", /* ppc */ @@ -262,8 +382,8 @@ static struct sym_entry *read_symbol(FILE *in) return sym; } -static int symbol_in_range(const struct sym_entry *s, - const struct addr_range *ranges, int entries) +static int addr_in_range(unsigned long long addr, + const struct addr_range *ranges, int entries) { size_t i; 
const struct addr_range *ar; @@ -271,7 +391,7 @@ static int symbol_in_range(const struct sym_entry *s, for (i = 0; i < entries; ++i) { ar = &ranges[i]; - if (s->addr >= ar->start && s->addr <= ar->end) + if (addr >= ar->start && addr <= ar->end) return 1; } @@ -285,8 +405,8 @@ static int symbol_valid(const struct sym_entry *s) /* if --all-symbols is not specified, then symbols outside the text * and inittext sections are discarded */ if (!all_symbols) { - if (symbol_in_range(s, text_ranges, - ARRAY_SIZE(text_ranges)) == 0) + if (addr_in_range(s->addr, text_ranges, + ARRAY_SIZE(text_ranges)) == 0) return 0; /* Corner case. Discard any symbols with the same value as * _etext _einittext; they can move between pass 1 and 2 when @@ -378,6 +498,121 @@ static void output_address(unsigned long long addr) printf("\tPTR\t_text - %#llx\n", _text - addr); } +#ifdef CONFIG_KALLMODSYMS +/* Output the .kallmodsyms_mod_objnames symbol content. */ +static void output_kallmodsyms_mod_objnames(void) +{ + struct obj2mod_elem *elem; + size_t offset = 1; + size_t i; + + /* + * Traverse and emit, updating mod_offset accordingly. Emit a single \0 + * at the start, to encode non-modular objfiles. + */ + output_label("kallsyms_mod_objnames"); + printf("\t.byte\t0\n"); + for (i = 0; i < OBJ2MOD_N; i++) { + for (elem = obj2mod[i]; elem; + elem = elem->obj2mod_next) { + const char *onemod; + size_t i; + + elem->mod_offset = offset; + onemod = elem->mods; + + /* + * Technically this is a waste of space: we could just + * as well implement multimodule entries by pointing one + * byte further back, to the trailing \0 of the previous + * entry, but doing it this way makes it more obvious + * when an entry is a multimodule entry. 
+ */ + if (elem->nmods != 1) { + printf("\t.byte\t0\n"); + printf("\t.byte\t%zi\n", elem->nmods); + offset += 2; + } + + for (i = elem->nmods; i > 0; i--) { + printf("\t.asciz\t\"%s\"\n", onemod); + offset += strlen(onemod) + 1; + onemod += strlen(onemod) + 1; + } + } + } + printf("\n"); + output_label("kallsyms_mod_objnames_len"); + printf("\t.long\t%zi\n", offset); +} + +static void output_kallmodsyms_objfiles(void) +{ + size_t i = 0; + size_t emitted_offsets = 0; + size_t emitted_objfiles = 0; + + if (base_relative) + output_label("kallsyms_module_offsets"); + else + output_label("kallsyms_module_addresses"); + + for (i = 0; i < addrmap_num; i++) { + long long offset; + int overflow; + + if (base_relative) { + if (!absolute_percpu) { + offset = addrmap[i].addr - relative_base; + overflow = (offset < 0 || offset > UINT_MAX); + } else { + offset = relative_base - addrmap[i].addr - 1; + overflow = (offset < INT_MIN || offset >= 0); + } + if (overflow) { + fprintf(stderr, "kallsyms failure: " + "objfile %s at address %#llx out of range in relative mode\n", + addrmap[i].objfile ? addrmap[i].objfile->obj : + "in always-built-in object", addrmap[i].addr); + exit(EXIT_FAILURE); + } + printf("\t.long\t0x%x\n", (int)offset); + } else + printf("\tPTR\t%#llx\n", addrmap[i].addr); + emitted_offsets++; + } + + output_label("kallsyms_modules"); + + for (i = 0; i < addrmap_num; i++) { + struct obj2mod_elem *elem = addrmap[i].objfile; + /* + * Address range cites no modular object file: point at 0, the + * built-in module. + */ + if (addrmap[i].objfile == NULL) { + printf("\t.long\t0x0\n"); + emitted_objfiles++; + continue; + } + + /* + * Zero offset is the initial \0, there to catch uninitialized + * obj2mod entries, and is forbidden.
+ */ + assert(elem->mod_offset != 0); + + printf("\t.long\t0x%x\n", elem->mod_offset); + emitted_objfiles++; + } + + assert(emitted_offsets == emitted_objfiles); + output_label("kallsyms_num_modules"); + printf("\t.long\t%zi\n", emitted_objfiles); + printf("\n"); +} +#endif /* CONFIG_KALLMODSYMS */ + /* uncompress a compressed symbol. When this function is called, the best table * might still be compressed itself, so the function needs to be recursive */ static int expand_symbol(const unsigned char *data, int len, char *result) @@ -477,6 +712,11 @@ static void write_src(void) printf("\n"); } +#ifdef CONFIG_KALLMODSYMS + output_kallmodsyms_mod_objnames(); + output_kallmodsyms_objfiles(); +#endif + output_label("kallsyms_num_syms"); printf("\t.long\t%u\n", table_cnt); printf("\n"); @@ -784,7 +1024,7 @@ static void make_percpus_absolute(void) unsigned int i; for (i = 0; i < table_cnt; i++) - if (symbol_in_range(table[i], &percpu_range, 1)) { + if (addr_in_range(table[i]->addr, &percpu_range, 1)) { /* * Keep the 'A' override for percpu symbols to * ensure consistent behavior compared to older @@ -811,13 +1051,123 @@ static void record_relative_base(void) } } +#ifdef CONFIG_KALLMODSYMS +/* + * Read the linker map. + */ +static void read_linker_map(void) +{ + unsigned long long addr, size; + char *obj; + FILE *f = fopen(".tmp_vmlinux.ranges", "r"); + + if (!f) { + fprintf(stderr, "Cannot open '.tmp_vmlinux.ranges'.\n"); + exit(1); + } + + addrmap_num = 0; + addrmap_alloced = 4096; + addrmap = malloc(sizeof(*addrmap) * addrmap_alloced); + if (!addrmap) + goto oom; + + /* + * For each address range, add to addrmap the address and the objfile + * entry to which the range maps. Only add entries relating to text + * ranges. + * + * Ranges that do not correspond to a built-in module, but to an + * always-built-in object file, have no obj2mod_elem and point at NULL + * instead. Their obj member is still filled out. 
+ */ + + while (fscanf(f, "%llx %llx %ms\n", &addr, &size, &obj) == 3) { + struct obj2mod_elem *elem = obj2mod_get(obj); + + if (addr == 0 || size == 0 || + !addr_in_range(addr, text_ranges, ARRAY_SIZE(text_ranges))) { + free(obj); + continue; + } + + if (addrmap_num >= addrmap_alloced) { + addrmap_alloced *= 2; + addrmap = realloc(addrmap, + sizeof(*addrmap) * addrmap_alloced); + if (!addrmap) + goto oom; + } + + addrmap[addrmap_num].addr = addr; + addrmap[addrmap_num].end_addr = addr + size; + addrmap[addrmap_num].objfile = elem; + addrmap[addrmap_num].obj = obj; + addrmap_num++; + } + fclose(f); + return; + +oom: + fprintf(stderr, "kallsyms: out of memory\n"); + exit(1); +} + +/* + * Read "modules_thick.builtin" (the list of built-in modules). Construct the + * obj2mod hash to track objfile -> module mappings. Read ".tmp_vmlinux.ranges" + * (the linker map) and build addrmap[], which maps address ranges to built-in + * module names (using obj2mod). + */ +static void read_modules(const char *modules_builtin) +{ + struct modules_thick_iter *i; + char *module_name = NULL; + char **module_paths; + + obj2mod_init(); + /* + * Iterate over all modules in modules_thick.builtin and add each. + */ + i = modules_thick_iter_new(modules_builtin); + if (i == NULL) { + fprintf(stderr, "Cannot iterate over builtin modules.\n"); + exit(1); + } + + while ((module_paths = modules_thick_iter_next(i, &module_name))) { + char **walk = module_paths; + while (*walk) { + obj2mod_add(*walk, module_name); + walk++; + } + free(module_paths); + } + + free(module_name); + modules_thick_iter_free(i); + + /* + * Read linker map. 
+ */ + read_linker_map(); +} +#else +static void read_modules(const char *unused) {} +#endif /* CONFIG_KALLMODSYMS */ + int main(int argc, char **argv) { + char *modules_builtin = "modules_thick.builtin"; + while (1) { + static int has_modules_builtin; + static struct option long_options[] = { {"all-symbols", no_argument, &all_symbols, 1}, {"absolute-percpu", no_argument, &absolute_percpu, 1}, {"base-relative", no_argument, &base_relative, 1}, + {"builtin", required_argument, &has_modules_builtin, 1}, {}, }; @@ -827,12 +1177,21 @@ int main(int argc, char **argv) break; if (c != 0) usage(); + + if (has_modules_builtin) { + modules_builtin = strdup(optarg); + if (!modules_builtin) { + fprintf(stderr, "Out of memory parsing args\n"); + exit(1); + } + } } if (optind >= argc) usage(); read_map(argv[optind]); + read_modules(modules_builtin); shrink_table(); if (absolute_percpu) make_percpus_absolute(); From patchwork Wed Nov 9 13:41:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nick Alcock X-Patchwork-Id: 13037554 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C0CD3C4332F for ; Wed, 9 Nov 2022 14:02:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230158AbiKIOCs (ORCPT ); Wed, 9 Nov 2022 09:02:48 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50128 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230164AbiKIOCr (ORCPT ); Wed, 9 Nov 2022 09:02:47 -0500 Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 98BC7280; Wed, 9 Nov 2022 06:02:45 -0800 (PST) Received: from pps.filterd (m0246632.ppops.net [127.0.0.1]) by 
mx0b-00069f02.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 2A9Dlo1u018000; Wed, 9 Nov 2022 14:02:37 GMT
From: Nick Alcock
To: mcgrof@kernel.org, masahiroy@kernel.org
Cc: linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org, arnd@arndb.de, akpm@linux-foundation.org, eugene.loh@oracle.com, kris.van.hees@oracle.com
Subject: [PATCH v9 5/8] kallsyms: optimize .kallsyms_modules*
Date: Wed, 9 Nov 2022 13:41:29 +0000
Message-Id: <20221109134132.9052-6-nick.alcock@oracle.com>
X-Mailer: git-send-email 2.38.0.266.g481848f278
In-Reply-To: <20221109134132.9052-1-nick.alcock@oracle.com>
References: <20221109134132.9052-1-nick.alcock@oracle.com>
MIME-Version: 1.0
Precedence: bulk

These symbols are terribly inefficiently stored at the moment. Add a simple optimizer which fuses obj2mod_elem entries and uses this to implement three cheap optimizations:

 - duplicate names are eliminated from .kallsyms_module_names.

 - entries in .kallsyms_modules which point at single-file modules which also appear in a multi-module list are redirected to point inside that list, and the single-file entry is dropped from .kallsyms_module_names.
   Thus, modules which contain some object files shared with other modules and some object files exclusive to them do not double up the module name. (There might still be some duplication between multiple multi-module lists, but this is an extremely marginal size effect, and resolving it would require an extra layer of lookup tables which would be even more complex, and incompressible to boot.)

 - Entries in .kallsyms_modules that would contain the same value after the above optimizations are fused together, along with their corresponding .kallsyms_module_addresses/offsets entries. Due to this fusion process, and because object files can be split apart into multiple parts by the linker for hot/cold partitioning and the like, entries in .kallsyms_module_addresses/offsets no longer correspond 1:1 to object files, but rather to some contiguous range of addresses which is guaranteed to belong to a single built-in module, but which may well stretch over multiple object files.

The optimizer's time complexity is at most O(n log n) in the number of objfiles (it is dominated by sorting them), so, given the relatively low number of objfiles, its runtime overhead is in the noise.

Optimization reduces the overhead of the kallmodsyms tables by about 7500 items, dropping the .tmp_kallsyms2.o object file size by about 33KiB, leaving it 8672 bytes larger than before: an increase of 0.4%. The vmlinux size is not yet affected because the variables are not used and are eliminated by the linker. Once they are used (after the next commit but one), the size impact of all of this on the final kernel is minimal: in my testing, vmlinux grew by 0.17% (10824 bytes), and the compressed vmlinux only grew by 0.08% (7552 bytes). Though this is very configuration-dependent, it seems likely to scale roughly with the kernel as a whole. (The next commit changes these numbers a bit, but not much.)
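The duplicate-name elimination described above can be sketched as follows. This is a simplified illustration rather than the patch's optimize_obj2mod(): struct entry, fnv32(), cmp_entry() and dedup() are hypothetical names, each entry holds a single module name, and ties between equal hashes are broken with strcmp() so that identical names always end up adjacent after sorting. Sorting dominates the cost of this sketch, which is O(n log n) for n entries.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, much-simplified stand-in for struct obj2mod_elem. */
struct entry {
	const char *mod;      /* module name */
	unsigned int hash;    /* FNV-style hash of mod, filled in by dedup() */
	struct entry *xref;   /* entry whose name should be reused, or NULL */
};

/* Same 32-bit FNV-style mixing step as strhash() in the patch. */
static unsigned int fnv32(const char *s)
{
	unsigned int hash = 2166136261U;

	for (; *s; s++)
		hash = (hash ^ (unsigned char)*s) * 0x01000193;
	return hash;
}

/* Order by hash first (cheap), then by name so equal names are adjacent. */
static int cmp_entry(const void *a, const void *b)
{
	const struct entry *ea = *(struct entry * const *)a;
	const struct entry *eb = *(struct entry * const *)b;

	if (ea->hash != eb->hash)
		return ea->hash < eb->hash ? -1 : 1;
	return strcmp(ea->mod, eb->mod);
}

/*
 * Point every duplicate at one representative entry via xref and return
 * the number of distinct names, i.e. how many names would still need to
 * be emitted.
 */
static size_t dedup(struct entry *ents, size_t n)
{
	struct entry **sorted;
	struct entry *last = NULL;
	size_t i, uniq = 0;

	assert(ents != NULL || n == 0);
	if (n == 0)
		return 0;

	sorted = malloc(n * sizeof(*sorted));
	if (!sorted)
		exit(EXIT_FAILURE);

	for (i = 0; i < n; i++) {
		ents[i].hash = fnv32(ents[i].mod);
		ents[i].xref = NULL;
		sorted[i] = &ents[i];
	}
	qsort(sorted, n, sizeof(*sorted), cmp_entry);

	for (i = 0; i < n; i++) {
		/* Compare hashes first; only fall back to strcmp on a match. */
		if (last && last->hash == sorted[i]->hash &&
		    strcmp(last->mod, sorted[i]->mod) == 0) {
			sorted[i]->xref = last;
			continue;
		}
		last = sorted[i];
		uniq++;
	}

	free(sorted);
	return uniq;
}
```

Given the names ext4, fat, ext4, fat and xfs, dedup() reports three distinct names and leaves two entries with xref set. Which of two identical entries becomes the representative is unspecified here, since qsort() is not a stable sort; an emitter only needs to follow xref before writing a name.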
Signed-off-by: Nick Alcock Reviewed-by: Kris Van Hees --- Notes: v9: Fix a bug in optimize_obj2mod that prevented proper reuse of module names for object files appearing in both multimodule modules and single-module modules. Adjustments to allow for objfile support. Tiny style fixes. scripts/kallsyms.c | 297 +++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 288 insertions(+), 9 deletions(-) diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c index 6b9654a151fb..f89f569eb3c9 100644 --- a/scripts/kallsyms.c +++ b/scripts/kallsyms.c @@ -102,6 +102,17 @@ static unsigned int strhash(const char *s) return hash; } +static unsigned int memhash(char *s, size_t len) +{ + /* fnv32 hash */ + unsigned int hash = 2166136261U; + size_t i; + + for (i = 0; i < len; i++) + hash = (hash ^ *(s + i)) * 0x01000193; + return hash; +} + #define OBJ2MOD_BITS 10 #define OBJ2MOD_N (1 << OBJ2MOD_BITS) #define OBJ2MOD_MASK (OBJ2MOD_N - 1) @@ -111,14 +122,35 @@ struct obj2mod_elem { size_t nmods; /* number of modules in "mods" */ size_t mods_size; /* size of all mods together */ int mod_offset; /* offset of module name in .kallsyms_mod_objnames */ + + /* + * Hash values of all module names in this elem, combined: used for + * rapid comparisons. Populated quite late, at optimize_obj2mod time. + */ + unsigned int modhash; + + /* + * If set at emission time, this points at another obj2mod entry that + * contains the module name we need (possibly at a slightly later + * offset, if the entry is for an objfile that appears in many modules). + */ + struct obj2mod_elem *xref; + + /* + * Chain links for object -> module and module->object mappings. + */ struct obj2mod_elem *obj2mod_next; + struct obj2mod_elem *mod2obj_next; }; /* - * Map from object files to obj2mod entries (a unique mapping). + * Map from object files to obj2mod entries (a unique mapping), and vice versa + * (not unique, but entries for objfiles in more than one module in this hash + * are ignored). 
*/ static struct obj2mod_elem *obj2mod[OBJ2MOD_N]; +static struct obj2mod_elem *mod2obj[OBJ2MOD_N]; static size_t num_objfiles; /* @@ -162,6 +194,8 @@ static void obj2mod_add(char *obj, char *mod) elem = obj2mod_get(obj); if (!elem) { + int j = strhash(mod) & OBJ2MOD_MASK; + elem = malloc(sizeof(struct obj2mod_elem)); if (!elem) goto oom; @@ -175,8 +209,15 @@ static void obj2mod_add(char *obj, char *mod) elem->obj2mod_next = obj2mod[i]; obj2mod[i] = elem; + elem->mod2obj_next = mod2obj[j]; + mod2obj[j] = elem; num_objfiles++; } else { + /* + * objfile appears in multiple modules. mod2obj for this entry + * will be ignored from now on, except insofar as it is needed + * to maintain the hash chain. + */ elem->mods = realloc(elem->mods, elem->mods_size + strlen(mod) + 1); if (!elem->mods) @@ -196,6 +237,162 @@ static void obj2mod_add(char *obj, char *mod) fprintf(stderr, "kallsyms: out of memory\n"); exit(1); } + +static int qstrcmp(const void *a, const void *b) +{ + return strcmp((const char *) a, (const char *) b); +} + +static int qmodhash(const void *a, const void *b) +{ + struct obj2mod_elem * const *el_a = a; + struct obj2mod_elem * const *el_b = b; + if ((*el_a)->modhash < (*el_b)->modhash) + return -1; + else if ((*el_a)->modhash > (*el_b)->modhash) + return 1; + return 0; +} + +/* + * Associate all object files in obj2mod which refer to the same module with a + * single obj2mod entry for emission, preferring to point into the module list + * in a multi-module objfile. + */ +static void optimize_obj2mod(void) +{ + size_t i; + size_t n = 0; + struct obj2mod_elem *elem; + struct obj2mod_elem *dedup; + + /* An array of all obj2mod_elems, later sorted by hashval. */ + struct obj2mod_elem **uniq; + struct obj2mod_elem *last; + + /* + * Canonicalize all module lists by sorting them, then compute their + * hash values. 
+ */ + uniq = malloc(sizeof(struct obj2mod_elem *) * num_objfiles); + if (uniq == NULL) + goto oom; + + for (i = 0; i < OBJ2MOD_N; i++) { + for (elem = obj2mod[i]; elem; elem = elem->obj2mod_next) { + if (elem->nmods >= 2) { + char **sorter; + char *walk; + char *tmp_mods; + size_t j; + + tmp_mods = malloc(elem->mods_size); + sorter = malloc(sizeof(char *) * elem->nmods); + if (sorter == NULL || tmp_mods == NULL) + goto oom; + memcpy(tmp_mods, elem->mods, elem->mods_size); + + for (j = 0, walk = tmp_mods; j < elem->nmods; + j++) { + sorter[j] = walk; + walk += strlen(walk) + 1; + } + qsort(sorter, elem->nmods, sizeof (char *), + qstrcmp); + for (j = 0, walk = elem->mods; j < elem->nmods; + j++) { + strcpy(walk, sorter[j]); + walk += strlen(walk) + 1; + } + free(tmp_mods); + free(sorter); + } + + uniq[n] = elem; + uniq[n]->modhash = memhash(elem->mods, elem->mods_size); + n++; + } + } + + qsort(uniq, num_objfiles, sizeof (struct obj2mod_elem *), qmodhash); + + /* + * Work over multimodule entries. These must be emitted into + * .kallsyms_mod_objnames as a unit, but we can still optimize by + * reusing some other identical entry. Single-file modules are amenable + * to the same optimization, but we avoid doing it for now so that we + * can prefer to point them directly inside a multimodule entry. + */ + for (i = 0, last = NULL; i < num_objfiles; i++) { + const char *onemod; + size_t j; + + if (uniq[i]->nmods < 2) + continue; + + /* Duplicate multimodule. Reuse the first we saw. */ + if (last != NULL && last->modhash == uniq[i]->modhash && + memcmp(uniq[i]->mods, last->mods, + uniq[i]->mods_size) == 0) { + uniq[i]->xref = last; + continue; + } + + /* + * Single-module entries relating to modules also emitted as + * part of this multimodule entry can refer to it: later, we + * will hunt down the right specific module name within this + * multimodule entry and point directly to it. 
+ */ + onemod = uniq[i]->mods; + for (j = uniq[i]->nmods; j > 0; j--) { + int h = strhash(onemod) & OBJ2MOD_MASK; + + for (dedup = mod2obj[h]; dedup; + dedup = dedup->mod2obj_next) { + if (dedup->nmods > 1) + continue; + + if (strcmp(dedup->mods, onemod) != 0) + continue; + dedup->xref = uniq[i]; + assert(uniq[i]->xref == NULL); + } + onemod += strlen(onemod) + 1; + } + + last = uniq[i]; + } + + /* + * Now traverse all single-module entries, xreffing every one that + * relates to a given module to the first one we saw that refers to that + * module. + */ + for (i = 0, last = NULL; i < num_objfiles; i++) { + if (uniq[i]->nmods > 1) + continue; + + if (uniq[i]->xref != NULL) + continue; + + /* Duplicate module name. Reuse the first we saw. */ + if (last != NULL && last->modhash == uniq[i]->modhash && + memcmp(uniq[i]->mods, last->mods, uniq[i]->mods_size) == 0) { + uniq[i]->xref = last; + assert(last->xref == NULL); + continue; + } + last = uniq[i]; + } + + free(uniq); + + return; +oom: + fprintf(stderr, "kallsyms: out of memory optimizing module list\n"); + exit(EXIT_FAILURE); +} #endif /* CONFIG_KALLMODSYMS */ static void usage(void) @@ -507,8 +704,8 @@ static void output_kallmodsyms_mod_objnames(void) size_t i; /* - * Traverse and emit, updating mod_offset accordingly. Emit a single \0 - * at the start, to encode non-modular objfiles. + * Traverse and emit, chasing xref and updating mod_offset accordingly. + * Emit a single \0 at the start, to encode non-modular objfiles. */ output_label("kallsyms_mod_objnames"); printf("\t.byte\t0\n"); @@ -517,9 +714,25 @@ static void output_kallmodsyms_mod_objnames(void) elem = elem->obj2mod_next) { const char *onemod; size_t i; + struct obj2mod_elem *out_elem = elem; - elem->mod_offset = offset; - onemod = elem->mods; + /* + * Single-module ref to a multimodule: will be emitted + * as a whole, so avoid emitting pieces of it (which + * would go unreferenced in any case). 
+ */ + if (elem->xref && + elem->nmods == 1 && elem->xref->nmods > 1) + continue; + + if (elem->xref) + out_elem = elem->xref; + + if (out_elem->mod_offset != 0) + continue; /* Already emitted. */ + + out_elem->mod_offset = offset; + onemod = out_elem->mods; /* * Technically this is a waste of space: we could just @@ -528,13 +741,14 @@ static void output_kallmodsyms_mod_objnames(void) * entry, but doing it this way makes it more obvious * when an entry is a multimodule entry. */ - if (elem->nmods != 1) { + if (out_elem->nmods != 1) { printf("\t.byte\t0\n"); - printf("\t.byte\t%zi\n", elem->nmods); + printf("\t.byte\t%zi\n", out_elem->nmods); offset += 2; } - for (i = elem->nmods; i > 0; i--) { + for (i = out_elem->nmods; i > 0; i--) { + printf("/* 0x%lx */\n", offset); printf("\t.asciz\t\"%s\"\n", onemod); offset += strlen(onemod) + 1; onemod += strlen(onemod) + 1; @@ -561,6 +775,13 @@ static void output_kallmodsyms_objfiles(void) long long offset; int overflow; + /* + * Fuse consecutive address ranges citing the same object file + * into one. + */ + if (i > 0 && addrmap[i-1].objfile == addrmap[i].objfile) + continue; + if (base_relative) { if (!absolute_percpu) { offset = addrmap[i].addr - relative_base; @@ -586,6 +807,13 @@ static void output_kallmodsyms_objfiles(void) for (i = 0; i < addrmap_num; i++) { struct obj2mod_elem *elem = addrmap[i].objfile; + int orig_nmods; + const char *orig_modname; + int mod_offset; + + if (i > 0 && addrmap[i-1].objfile == addrmap[i].objfile) + continue; + /* * Address range cites no modular object file: point at 0, the * built-in module. @@ -596,13 +824,63 @@ static void output_kallmodsyms_objfiles(void) continue; } + orig_nmods = elem->nmods; + orig_modname = elem->mods; + + /* + * Chase down xrefs, if need be. There can only be one layer of + * these: from single-module entry to other single-module + * entry, or from single- or multi-module entry to another + * multi-module entry. 
Single -> single and multi -> multi + * always points at the start of the xref target, so its offset + * can be used as is. + */ + if (elem->xref) + elem = elem->xref; + + if (elem->nmods == 1 || orig_nmods > 1) { + + if (elem->nmods == 1) + printf("/* 0x%llx--0x%llx: module %s */\n", + addrmap[i].addr, addrmap[i].end_addr, + elem->mods); + else + printf("/* 0x%llx--0x%llx: multimodule */\n", + addrmap[i].addr, addrmap[i].end_addr); + + mod_offset = elem->mod_offset; + } else { + /* + * If this is a reference from a single-module entry to + * a multi-module entry, hunt down the offset to this + * specific module's name (which is guaranteed to be + * present: see optimize_obj2mod). + */ + + size_t j = elem->nmods; + const char *onemod = elem->mods; + mod_offset = elem->mod_offset; + + for (; j > 0; j--) { + if (strcmp(orig_modname, onemod) == 0) + break; + onemod += strlen(onemod) + 1; + } + assert(j > 0); + /* + * +2 to skip the null byte and count at the start of + * the multimodule entry. + */ + mod_offset += onemod - elem->mods + 2; + } + /* * Zero offset is the initial \0, there to catch uninitialized * obj2mod entries, and is forbidden. */ assert(elem->mod_offset != 0); - printf("\t.long\t0x%x\n", elem->mod_offset); + printf("\t.long\t0x%x\n", mod_offset); emitted_objfiles++; } @@ -1146,6 +1424,7 @@ static void read_modules(const char *modules_builtin) free(module_name); modules_thick_iter_free(i); + optimize_obj2mod(); /* * Read linker map. 
From patchwork Wed Nov 9 13:41:30 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nick Alcock
X-Patchwork-Id: 13037551
From: Nick Alcock
To: mcgrof@kernel.org, masahiroy@kernel.org
Cc: linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org, arnd@arndb.de, akpm@linux-foundation.org, eugene.loh@oracle.com, kris.van.hees@oracle.com
Subject: [PATCH v9 6/8] kallsyms: distinguish text symbols fully using object file names
Date: Wed, 9 Nov 2022 13:41:30 +0000
Message-Id: <20221109134132.9052-7-nick.alcock@oracle.com>
X-Mailer: git-send-email 2.38.0.266.g481848f278
In-Reply-To: <20221109134132.9052-1-nick.alcock@oracle.com>
References: <20221109134132.9052-1-nick.alcock@oracle.com>

The commits before this one allow you to distinguish identically-named text symbols located in different built-in object files from each other; but identically-named symbols can appear in any object files at all, including object files that cannot be built as modules. We already have nearly all the machinery to disambiguate these symbols as well. Since any given object file can contain at most one definition of a given symbol, it suffices to name the object files containing any symbol which is otherwise ambiguous. (No others need be named, saving a bunch of space). We associate address ranges with object file names using a new .kallsyms_objfiles section just like the previously-added .kallsyms_modules section. But that's not quite enough. Even the object file name is ambiguous in some cases: e.g. there are a lot of files named "core.o" in the kernel. We could just store the full pathname for every object file, but this is needlessly wasteful: doing this eats more than 50KiB in object file names alone, and nearly all the content of every name is repeated for many names.
But if we store the object file names in the same section as the built-in module names, drop the .o, and store minimal path suffixes, we can save almost all that space. (For example, "core.o" would be stored as "core" unless there are ambiguous symbols in two different object files both named "core", in which case they'd be named "sched/core" and "futex/core", etc, possibly re-extending to "kernel/sched/core" if still ambiguous). We do this by a repeated-rehashing process. First, we compute a hash value for symbol\0modhash for every symbol (the modhash is ignored if this is a built-in symbol). Any two symbols with the same such hash are identically-named: add the maximally-shortened (one-component, .o-stripped) object file name for all such symbols, and rehash, this time hashing symbol\0objname\0modhash. Any two symbols which still have the same hash are still ambiguous: lengthen the name given to one of the symbols' object files and repeat. Eventually, all the ambiguity will go away. (We do have to take care not to re-lengthen anything we already lengthened in any given hashing round.) This involves multiple sorting passes but the impact on compilation time appears to be nearly zero, and the impact on space in the running kernel is noticeable: only a few dozen names need lengthening, so we can completely ignore the overhead from storing repeated path components because there are hardly any of them. But that's not all. We can also do similar optimization tricks to what was done with .kallsyms_modules, reusing module names and names of already-emitted object files: so any given object file name only appears once in the strtab, and can be cited by many address ranges and even by module entries. 
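The minimal-path-suffix idea described above can be sketched in isolation. The following is an illustrative stand-alone sketch, not the code in this patch: the helper names strip_suffix, shortest_name and lengthen_name are hypothetical, and unlike the patch this sketch only strips a suffix when the final dot lies in the last path component.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a malloc'd copy of PATH with its final
 * ".suffix" removed.  The dot only counts as a suffix separator if it
 * appears in the last path component.  Returns NULL on allocation failure.
 */
static char *strip_suffix(const char *path)
{
	const char *dot = strrchr(path, '.');
	const char *slash = strrchr(path, '/');
	size_t len = strlen(path);
	char *out;

	if (dot && (!slash || dot > slash))
		len = (size_t)(dot - path);

	out = malloc(len + 1);
	if (!out)
		return NULL;
	memcpy(out, path, len);
	out[len] = '\0';
	return out;
}

/*
 * Maximal shortening: the last path component of the desuffixed name.
 * The result points into DESUFFIXED rather than being a new allocation.
 */
static const char *shortest_name(const char *desuffixed)
{
	const char *slash = strrchr(desuffixed, '/');

	return (slash && slash[1] != '\0') ? slash + 1 : desuffixed;
}

/*
 * Extend SHORT_NAME (a pointer into DESUFFIXED) leftwards by one path
 * component.  Returns SHORT_NAME unchanged once it already covers the
 * whole of DESUFFIXED, i.e. when no further lengthening is possible.
 */
static const char *lengthen_name(const char *desuffixed,
				 const char *short_name)
{
	const char *p;

	if (short_name == desuffixed)
		return short_name;

	/* short_name[-1] is the '/' that ends the preceding component. */
	p = short_name - 1;
	while (p > desuffixed && p[-1] != '/')
		p--;
	return p;
}
```

Starting from "kernel/sched/core.o", strip_suffix yields "kernel/sched/core", shortest_name yields "core", and successive lengthen_name calls yield "sched/core" and then "kernel/sched/core", after which the name can grow no further. Keeping the short name as a pointer into the desuffixed string means each lengthening step is a pointer adjustment with no extra allocation.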
Put all this together and the net overhead of this in my testing is about 3KiB of new object file names in the .kallsyms_mod_objnames table and 6KiB for the .kallsyms_objfiles table (mostly zeroes: in future maybe we can find a way to elide some of those, but 6KiB is small enough that it's not worth taking too much effort). No ambiguous textual symbols remain outside actual modules (which can still contain identically-named symbols in different object files because kallsyms doesn't run over them so none of these tables can be built for them. At least, it doesn't yet.) Signed-off-by: Nick Alcock Reviewed-by: Kris Van Hees --- Notes: v9: new. scripts/kallsyms.c | 559 +++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 546 insertions(+), 13 deletions(-) diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c index f89f569eb3c9..ffb69a8f6ff8 100644 --- a/scripts/kallsyms.c +++ b/scripts/kallsyms.c @@ -113,6 +113,9 @@ static unsigned int memhash(char *s, size_t len) return hash; } +/* + * Object file -> module and shortened object file name tracking machinery. + */ #define OBJ2MOD_BITS 10 #define OBJ2MOD_N (1 << OBJ2MOD_BITS) #define OBJ2MOD_MASK (OBJ2MOD_N - 1) @@ -143,15 +146,40 @@ struct obj2mod_elem { struct obj2mod_elem *mod2obj_next; }; +/* + * Shortened object file names. These are only ever consulted after checking + * the obj2mod hashes: names that already exist in there are used directly from + * there (pointed to via the mod_xref field) rather than being re-emitted. + * Entries that do not exist there are added to the end of the mod_objnames + * list. 
+ */ +struct obj2short_elem { + const char *obj; + char *desuffixed; /* objname sans suffix */ + const char *short_obj; /* shortened at / and suffix */ + int short_offset; /* offset of short name in .kallsyms_mod_objnames */ + int last_rehash; /* used during disambiguate_hash_syms */ + + struct obj2mod_elem *mod_xref; + struct obj2short_elem *short_xref; + struct obj2short_elem *short_next; +}; + /* * Map from object files to obj2mod entries (a unique mapping), and vice versa * (not unique, but entries for objfiles in more than one module in this hash - * are ignored). + * are ignored); also map from object file names to shortened names for them + * (also unique: there is no point storing both longer and shorter forms of one + * name, so if a longer name is needed we consistently use it instead of the + * shorter form.) + * + * obj2short is populated very late, at disambiguate_syms time. */ static struct obj2mod_elem *obj2mod[OBJ2MOD_N]; static struct obj2mod_elem *mod2obj[OBJ2MOD_N]; -static size_t num_objfiles; +static struct obj2short_elem *obj2short[OBJ2MOD_N]; +static size_t num_objfiles, num_shortnames; /* * An ordered list of address ranges and the objfile that occupies that range. @@ -165,6 +193,9 @@ struct addrmap_entry { static struct addrmap_entry *addrmap; static int addrmap_num, addrmap_alloced; +static void disambiguate_syms(void); +static void optimize_objnames(void); + static void obj2mod_init(void) { memset(obj2mod, 0, sizeof(obj2mod)); @@ -182,6 +213,18 @@ static struct obj2mod_elem *obj2mod_get(const char *obj) return NULL; } +static struct obj2short_elem *obj2short_get(const char *obj) +{ + int i = strhash(obj) & OBJ2MOD_MASK; + struct obj2short_elem *elem; + + for (elem = obj2short[i]; elem; elem = elem->short_next) { + if (strcmp(elem->obj, obj) == 0) + return elem; + } + return NULL; +} + /* * Note that a given object file is found in some module, interning it in the * obj2mod hash. 
Should not be called more than once for any given (module, @@ -254,6 +297,12 @@ static int qmodhash(const void *a, const void *b) return 0; } +static int qobj2short(const void *a, const void *b) +{ + return strcmp((*(struct obj2short_elem **)a)->short_obj, + (*(struct obj2short_elem **)b)->short_obj); +} + /* * Associate all object files in obj2mod which refer to the same module with a * single obj2mod entry for emission, preferring to point into the module list @@ -393,6 +442,336 @@ static void optimize_obj2mod(void) fprintf(stderr, "kallsyms: out of memory optimizing module list\n"); exit(EXIT_FAILURE); } + +/* + * Associate all short-name entries in obj2short that refer to the same short + * name with a single entry for emission, either (preferably) a module that + * shares that name or (alternatively) the first obj2short entry referencing + * that name. + */ +static void optimize_objnames(void) +{ + size_t i; + size_t num_objnames = 0; + struct obj2short_elem *elem; + struct obj2short_elem **uniq; + struct obj2short_elem *last; + + uniq = malloc(sizeof(struct obj2short_elem *) * num_shortnames); + if (uniq == NULL) { + fprintf(stderr, "kallsyms: out of memory optimizing object file name list\n"); + exit(EXIT_FAILURE); + } + + /* + * Much like optimize_obj2mod, except there is no need to canonicalize + * anything or handle multimodule entries, and we need to chase down + * possible entries in mod2obj first (so as not to duplicate them in the + * final kallsyms_mod_objnames strtab). 
+ */ + for (i = 0; i < OBJ2MOD_N; i++) + for (elem = obj2short[i]; elem; elem = elem->short_next) + uniq[num_objnames++] = elem; + + qsort(uniq, num_objnames, sizeof(struct obj2short_elem *), qobj2short); + + for (i = 0, last = NULL; i < num_objnames; i++) { + int h = strhash(uniq[i]->short_obj) & OBJ2MOD_MASK; + struct obj2mod_elem *mod_elem; + + for (mod_elem = mod2obj[h]; mod_elem; + mod_elem = mod_elem->mod2obj_next) { + /* + * mod_elem entries are only valid if they are for + * single-module objfiles: see obj2mod_add + */ + if (mod_elem->nmods > 1) + continue; + + if (strcmp(mod_elem->mods, uniq[i]->short_obj) != 0) + continue; + uniq[i]->mod_xref = mod_elem; + break; + } + + /* + * Only look for a short_xref match if we don't already have one + * in mod_xref. (This means that multiple objfiles with the + * same short name that is also a module name all chain directly + * to the module name via mod_xref, rather than going through a + * chain of short_xrefs.) + */ + if (uniq[i]->mod_xref) + continue; + + if (last != NULL && strcmp(last->short_obj, + uniq[i]->short_obj) == 0) { + uniq[i]->short_xref = last; + continue; + } + + last = uniq[i]; + } + + free(uniq); +} + +/* + * Used inside disambiguate_syms to identify colliding symbols. We spot this by + * hashing symbol\0modhash (or just the symbol name if this is in the core + * kernel) and seeing if that collides. (This means we don't need to bother + * canonicalizing the module list, since optimize_obj2mod already did it for + * us.) + * + * If that collides, we try disambiguating by adding ever-longer pieces of the + * object file name before the modhash until we no longer collide. The result + * of this repeated addition becomes the obj2short hashtab. 
+ */ +struct sym_maybe_collides { + struct sym_entry *sym; + struct addrmap_entry *addr; + struct obj2short_elem *short_objname; + unsigned int symhash; +}; + +static int qsymhash(const void *a, const void *b) +{ + const struct sym_maybe_collides *el_a = a; + const struct sym_maybe_collides *el_b = b; + if (el_a->symhash < el_b->symhash) + return -1; + else if (el_a->symhash > el_b->symhash) + return 1; + return 0; +} + +static int find_addrmap(const void *a, const void *b) +{ + const struct sym_entry *sym = a; + const struct addrmap_entry *map = b; + + if (sym->addr < map->addr) + return -1; + else if (sym->addr >= map->end_addr) + return 1; + return 0; +} + +/* + * Allocate or lengthen an object file name for a symbol that needs it. + */ +static int lengthen_short_name(struct sym_maybe_collides *sym, int hash_cycle) +{ + struct obj2short_elem *short_objname = obj2short_get(sym->addr->obj); + + if (!short_objname) { + int i = strhash(sym->addr->obj) & OBJ2MOD_MASK; + char *p; + + short_objname = malloc(sizeof(struct obj2short_elem)); + if (short_objname == NULL) + goto oom; + + /* + * New symbol: try maximal shortening, which is just the object + * file name (no directory) with the suffix removed (the suffix + * is useless for disambiguation since it is almost always .o). + * + * Add a bit of paranoia to allow for names starting with /, + * ending with ., and names with no suffix. (At least two of + * these are most unlikely, but possible.) 
+ */ + + memset(short_objname, 0, sizeof(struct obj2short_elem)); + short_objname->obj = sym->addr->obj; + + p = strrchr(sym->addr->obj, '.'); + if (p) + short_objname->desuffixed = strndup(sym->addr->obj, + p - sym->addr->obj); + else + short_objname->desuffixed = strdup(sym->addr->obj); + + if (short_objname->desuffixed == NULL) + goto oom; + + p = strrchr(short_objname->desuffixed, '/'); + if (p && p[1] != 0) + short_objname->short_obj = p + 1; + else + short_objname->short_obj = short_objname->desuffixed; + + short_objname->short_next = obj2short[i]; + short_objname->last_rehash = hash_cycle; + obj2short[i] = short_objname; + + num_shortnames++; + return 1; + } + + /* + * Objname already lengthened by a previous symbol clash: do nothing + * until we rehash again. + */ + if (short_objname->last_rehash == hash_cycle) + return 0; + short_objname->last_rehash = hash_cycle; + + /* + * Existing symbol: lengthen the objname we already have. + */ + + if (short_objname->desuffixed == short_objname->short_obj) { + fprintf(stderr, "Cannot disambiguate %s: objname %s is " + "max-length but still colliding\n", + sym->sym->sym, short_objname->short_obj); + return 0; + } + + /* + * Allow for absolute paths, where the first byte is '/'. + */ + + if (short_objname->desuffixed >= short_objname->short_obj - 2) + short_objname->short_obj = short_objname->desuffixed; + else { + for (short_objname->short_obj -= 2; + short_objname->short_obj > short_objname->desuffixed && + *short_objname->short_obj != '/'; + short_objname->short_obj--); + + if (*short_objname->short_obj == '/') + short_objname->short_obj++; + } + return 1; + oom: + fprintf(stderr, "Out of memory disambiguating syms\n"); + exit(EXIT_FAILURE); +} + +/* + * Do one round of disambiguation-check symbol hashing, factoring in the current + * set of applicable shortened object file names for those symbols that need + * them. 
+ */ +static void disambiguate_hash_syms(struct sym_maybe_collides *syms) +{ + size_t i; + for (i = 0; i < table_cnt; i++) { + struct obj2short_elem *short_objname = NULL; + char *tmp, *p; + size_t tmp_size; + + if (syms[i].sym == NULL) { + syms[i].symhash = 0; + continue; + } + + short_objname = obj2short_get(syms[i].addr->obj); + + tmp_size = strlen((char *) &(syms[i].sym->sym[1])) + 1; + + if (short_objname) + tmp_size += strlen(short_objname->short_obj) + 1; + + if (syms[i].addr->objfile) + tmp_size += sizeof(syms[i].addr->objfile->modhash); + + tmp = malloc(tmp_size); + if (tmp == NULL) { + fprintf(stderr, "Out of memory disambiguating syms\n"); + exit(EXIT_FAILURE); + } + + p = stpcpy(tmp, (char *) &(syms[i].sym->sym[1])); + p++; + if (short_objname) { + p = stpcpy(p, short_objname->short_obj); + p++; + } + if (syms[i].addr->objfile) + memcpy(p, &(syms[i].addr->objfile->modhash), + sizeof(syms[i].addr->objfile->modhash)); + + syms[i].symhash = memhash(tmp, tmp_size); + free(tmp); + } + + qsort(syms, table_cnt, sizeof (struct sym_maybe_collides), qsymhash); +} + +/* + * Figure out which object file names are necessary to disambiguate all symbols + * in the linked kernel: transform them for minimum length while retaining + * disambiguity: point to them in obj2short. + */ +static void disambiguate_syms(void) +{ + size_t i; + int retry; + int hash_cycle = 0; + unsigned int lasthash; + struct sym_maybe_collides *syms; + + syms = calloc(table_cnt, sizeof(struct sym_maybe_collides)); + + if (syms == NULL) + goto oom; + + /* + * Initial table population: symbol-dependent things not affected by + * disambiguation rounds. + */ + for (i = 0; i < table_cnt; i++) { + struct addrmap_entry *addr; + + /* + * Only bother doing anything for function symbols. 
+ */ + if (table[i]->sym[0] != 't' && table[i]->sym[0] != 'T' && + table[i]->sym[0] != 'w' && table[i]->sym[0] != 'W') + continue; + + addr = bsearch(table[i], addrmap, addrmap_num, + sizeof(struct addrmap_entry), find_addrmap); + + /* + * Some function symbols (section start symbols, discarded + * non-text-range symbols, etc) don't appear in the linker map + * at all. + */ + if (addr == NULL) + continue; + + syms[i].sym = table[i]; + syms[i].addr = addr; + } + + do { + hash_cycle++; + retry = 0; + lasthash = 0; + disambiguate_hash_syms(syms); + + for (i = 0; i < table_cnt; i++) { + if (syms[i].sym == NULL) + continue; + if (syms[i].symhash == lasthash) { + if (lengthen_short_name(&syms[i], hash_cycle)) + retry = 1; + } + lasthash = syms[i].symhash; + } + } while (retry); + + free(syms); + return; + oom: + fprintf(stderr, "kallsyms: out of memory disambiguating syms\n"); + exit(EXIT_FAILURE); + +} + #endif /* CONFIG_KALLMODSYMS */ static void usage(void) @@ -424,6 +803,7 @@ static bool is_ignored_symbol(const char *name, char type) "kallsyms_relative_base", "kallsyms_num_syms", "kallsyms_num_modules", + "kallsyms_num_objfiles", "kallsyms_names", "kallsyms_markers", "kallsyms_token_table", @@ -431,6 +811,7 @@ static bool is_ignored_symbol(const char *name, char type) "kallsyms_module_offsets", "kallsyms_module_addresses", "kallsyms_modules", + "kallsyms_objfiles", "kallsyms_mod_objnames", "kallsyms_mod_objnames_len", /* Exclude linker generated symbols which vary between passes */ @@ -700,6 +1081,7 @@ static void output_address(unsigned long long addr) static void output_kallmodsyms_mod_objnames(void) { struct obj2mod_elem *elem; + struct obj2short_elem *short_elem; size_t offset = 1; size_t i; @@ -755,15 +1137,75 @@ static void output_kallmodsyms_mod_objnames(void) } } } + + /* + * Module names are done; now emit objfile names that don't match + * module names.
They go in the same section to enable deduplication + * between (maximally-shortened) objfile names and module names. + * (This is another reason why objfile names drop the suffix.) + */ + for (i = 0; i < OBJ2MOD_N; i++) { + for (short_elem = obj2short[i]; short_elem; + short_elem = short_elem->short_next) { + + /* Already emitted? */ + if (short_elem->mod_xref) + continue; + + if (short_elem->short_xref) + short_elem = short_elem->short_xref; + + if (short_elem->short_offset != 0) + continue; + + printf("/* 0x%lx: shortened from %s */\n", offset, + short_elem->obj); + + short_elem->short_offset = offset; + printf("\t.asciz\t\"%s\"\n", short_elem->short_obj); + offset += strlen(short_elem->short_obj) + 1; + } + } + printf("\n"); output_label("kallsyms_mod_objnames_len"); printf("\t.long\t%zi\n", offset); } +/* + * Return 1 if this address range cites the same built-in module and objfile + * name as the previous one. + */ +static int same_kallmodsyms_range(int i) +{ + struct obj2short_elem *last_short; + struct obj2short_elem *this_short; + if (i == 0) + return 0; + + last_short = obj2short_get(addrmap[i-1].obj); + this_short = obj2short_get(addrmap[i].obj); + + if (addrmap[i-1].objfile == addrmap[i].objfile) { + + if ((last_short == NULL && this_short != NULL) || + (last_short != NULL && this_short == NULL)) + return 0; + + if (last_short == NULL && this_short == NULL) + return 1; + + if (strcmp(last_short->short_obj, this_short->short_obj) == 0) + return 1; + } + return 0; +} + static void output_kallmodsyms_objfiles(void) { size_t i = 0; size_t emitted_offsets = 0; + size_t emitted_modules = 0; size_t emitted_objfiles = 0; if (base_relative) @@ -775,12 +1217,15 @@ static void output_kallmodsyms_objfiles(void) long long offset; int overflow; - /* - * Fuse consecutive address ranges citing the same object file - * into one. 
- */ - if (i > 0 && addrmap[i-1].objfile == addrmap[i].objfile) - continue; + printf("/* 0x%llx--0x%llx: %s */\n", addrmap[i].addr, + addrmap[i].end_addr, addrmap[i].obj); + + /* + * Fuse consecutive address ranges citing the same built-in + * module and objfile name into one. + */ + if (same_kallmodsyms_range(i)) + continue; if (base_relative) { if (!absolute_percpu) { @@ -807,11 +1252,12 @@ static void output_kallmodsyms_objfiles(void) for (i = 0; i < addrmap_num; i++) { struct obj2mod_elem *elem = addrmap[i].objfile; + struct obj2mod_elem *orig_elem = NULL; int orig_nmods; const char *orig_modname; int mod_offset; - if (i > 0 && addrmap[i-1].objfile == addrmap[i].objfile) + if (same_kallmodsyms_range(i)) continue; /* @@ -819,8 +1265,10 @@ static void output_kallmodsyms_objfiles(void) * built-in module. */ if (addrmap[i].objfile == NULL) { + printf("/* 0x%llx--0x%llx: %s: built-in */\n", + addrmap[i].addr, addrmap[i].end_addr, addrmap[i].obj); printf("\t.long\t0x0\n"); - emitted_objfiles++; + emitted_modules++; continue; } @@ -835,8 +1283,10 @@ static void output_kallmodsyms_objfiles(void) * always points at the start of the xref target, so its offset * can be used as is. */ - if (elem->xref) + if (elem->xref) { + orig_elem = elem; elem = elem->xref; + } if (elem->nmods == 1 || orig_nmods > 1) { @@ -872,6 +1322,19 @@ static void output_kallmodsyms_objfiles(void) * the multimodule entry. */ mod_offset += onemod - elem->mods + 2; + + /* + * If this was the result of an xref chase, store this + * mod_offset in the original entry so we can just reuse + * it if an objfile shares this name. 
+ */ + + printf("/* 0x%llx--0x%llx: %s: single-module ref to %s in multimodule at %x */\n", + addrmap[i].addr, addrmap[i].end_addr, + orig_elem->mods, onemod, elem->mod_offset); + + if (orig_elem) + orig_elem->mod_offset = mod_offset; } /* @@ -881,12 +1344,68 @@ static void output_kallmodsyms_objfiles(void) assert(elem->mod_offset != 0); printf("\t.long\t0x%x\n", mod_offset); - emitted_objfiles++; + emitted_modules++; } - assert(emitted_offsets == emitted_objfiles); + assert(emitted_offsets == emitted_modules); output_label("kallsyms_num_modules"); + printf("\t.long\t%zi\n", emitted_modules); + + output_label("kallsyms_objfiles"); + + for (i = 0; i < addrmap_num; i++) { + struct obj2short_elem *elem; + int mod_offset; + + if (same_kallmodsyms_range(i)) + continue; + + /* + * No corresponding objfile name: no disambiguation needed; + * point at 0. + */ + elem = obj2short_get(addrmap[i].obj); + + if (elem == NULL) { + printf("/* 0x%llx--0x%llx: %s: unambiguous */\n", + addrmap[i].addr, addrmap[i].end_addr, + addrmap[i].obj); + printf("\t.long\t0x0\n"); + emitted_objfiles++; + continue; + } + + /* + * Maybe the name is also used for a module: if it is, it cannot + * be a multimodule. + */ + + if (elem->mod_xref) { + assert(elem->mod_xref->nmods == 1); + mod_offset = elem->mod_xref->mod_offset; + printf("/* 0x%llx--0x%llx: %s: shortened as %s, references module */\n", + addrmap[i].addr, addrmap[i].end_addr, + addrmap[i].obj, elem->short_obj); + } else { + /* + * A name only used for objfiles. Chase down xrefs to + * reuse existing entries. 
+ */ + if (elem->short_xref) + elem = elem->short_xref; + + mod_offset = elem->short_offset; + printf("/* 0x%llx--0x%llx: %s: shortened as %s */\n", + addrmap[i].addr, addrmap[i].end_addr, + addrmap[i].obj, elem->short_obj); + } + printf("\t.long\t0x%x\n", mod_offset); + emitted_objfiles++; + } + assert(emitted_offsets == emitted_objfiles); + output_label("kallsyms_num_objfiles"); printf("\t.long\t%zi\n", emitted_objfiles); + printf("\n"); } #endif /* CONFIG_KALLMODSYMS */ @@ -1430,6 +1949,20 @@ static void read_modules(const char *modules_builtin) * Read linker map. */ read_linker_map(); + + /* + * Now the modules are sorted out and we know their address ranges, use + * the modhashes computed in optimize_obj2mod to identify any symbols + * that are still ambiguous and set up the minimal representation of + * their objfile name to disambiguate them. + */ + disambiguate_syms(); + + /* + * Now we have objfile names, optimize the objfile list. + */ + optimize_objnames(); + } #else static void read_modules(const char *unused) {} From patchwork Wed Nov 9 13:41:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nick Alcock X-Patchwork-Id: 13037591 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 04A6FC433FE for ; Wed, 9 Nov 2022 14:25:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231523AbiKIOZO (ORCPT ); Wed, 9 Nov 2022 09:25:14 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39488 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231148AbiKIOY7 (ORCPT ); Wed, 9 Nov 2022 09:24:59 -0500 Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS 
id 76BA4326DA; Wed, 9 Nov 2022 06:22:24 -0800 (PST) Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 2A9EFXLj006707; Wed, 9 Nov 2022 14:22:11 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding : content-type : mime-version; s=corp-2022-7-12; bh=I1d1bqXhzhDNq6Kusl94sLc+gKjQzctiUPQdP4RW9Ac=; b=eX/M1MLxtdBOa0F6QsnDiWLJ8eZdHFZKQ7Xknd0ljxNa/1juCA1R+gS9Ba/nukymQXmV PxO66eLOu7hqIY4I7yp0IOD6YLkqeU0DhpOfHtXLOzZFVme/VtVcwcvb3g4fKMC9sSsK 57Augnz0pBi46xHRjRjyCeONoB4eLgk3ZoFaimAnaJyKfK9Kcrupyl25JFJWaActTU7U wk3NflAJpt3FhKMdupGZ3aPeXmKi53raBulK8moz4BPAMPdKZrH/SRtsa1ocj8XlQCWo woToDw/obJSFSK/bFxXoW4p2CLdRbJy21PvhMedyxjcdSIjsAAb0nX5zyx3JABGi8Xju eQ== Received: from phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta03.appoci.oracle.com [138.1.37.129]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 3krdfkg3uu-34 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 09 Nov 2022 14:22:10 +0000 Received: from pps.filterd (phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (8.17.1.5/8.17.1.5) with ESMTP id 2A9CY2Ze017805; Wed, 9 Nov 2022 13:42:43 GMT Received: from nam10-bn7-obe.outbound.protection.outlook.com (mail-bn7nam10lp2103.outbound.protection.outlook.com [104.47.70.103]) by phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 3kpctdn9ek-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 09 Nov 2022 13:42:42 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; 
b=ArqE5SqIsuTgkvKHCaQDRR/vLvELbM38b6dAwsdj9Iy7pdvFzVISLN0nJw60MtC+F9KLS4YVp/3pywpecefl6GEe2r17fmKw3ujU++1Q3pcbff7XZosqwt3TkYx2azxev/d3sdGQyER8Yb+yNSnDA1u2phelD1S9k45520Oetwa2TBmc7G0dxAqmRk4ApjVBy7X8LnAcvVcAgPt5rE9F/QBEDUeSXTEOuh2v0LJQJmbNTj/xvQdqi4HV02RslNf64PS03q8Fd/Fw7pFd1Ey+NlDprvD25uh5CWphN1Y154N0H/oj0gEAPlbHLr7xOCCifAoXW09GiH5R1iu6cPDiBQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=I1d1bqXhzhDNq6Kusl94sLc+gKjQzctiUPQdP4RW9Ac=; b=au8/oDJNWc7uGbDcqNPYJQ9/sEUGK4McC5991WPLe/jMd2x73YUctG1TUXTYJjSmUdQT7iQCWgqoDvdyoBo5d0WHe3YmhWoJzxRsFWOVzU3d6UrFFB+QsOyNT+1zqSKCRsVGTGhYbim/+Jd5C4uBXEgbA5XVsmBI5Ki6DuS1s5ahNdjN9X+dapNAmKK7RUBVwpk/iP1D1KCMVhJBXf0nP89EDiavPWeYP5jFHaukqSXzcPjPvaeJnkzUclq6kdmYun6rDQZGAp4U1rXqmvqfSRe0pPANerezpKihFzhxp1/zmCTEOI7Kgo3oxbc6GcMXrwM39Aw7Op+1G40keCdJ0w== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=I1d1bqXhzhDNq6Kusl94sLc+gKjQzctiUPQdP4RW9Ac=; b=Ir+jygVsxdL/DczIjEgXNfR0RGbsNr5H4fby0UpgRfeiN7gzAQJT5o0ddGdjiPPzIqFl3MoZzHkRNpEQRn8lweCXQS6Gs5HZsx2prumKaJVy2diNLlJT4+sK93Coi2DqHDAjiYtmFET4y2y+hoCM9gFMf1V9S9v319cBse28I3A= Received: from DS0PR10MB6798.namprd10.prod.outlook.com (2603:10b6:8:13c::20) by BLAPR10MB4817.namprd10.prod.outlook.com (2603:10b6:208:321::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5791.27; Wed, 9 Nov 2022 13:42:40 +0000 Received: from DS0PR10MB6798.namprd10.prod.outlook.com ([fe80::d361:ae7a:f995:2bb2]) by 
DS0PR10MB6798.namprd10.prod.outlook.com ([fe80::d361:ae7a:f995:2bb2%3]) with mapi id 15.20.5791.027; Wed, 9 Nov 2022 13:42:40 +0000 From: Nick Alcock To: mcgrof@kernel.org, masahiroy@kernel.org Cc: linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org, arnd@arndb.de, akpm@linux-foundation.org, eugene.loh@oracle.com, kris.van.hees@oracle.com Subject: [PATCH v9 7/8] kallsyms: add /proc/kallmodsyms for text symbol disambiguation Date: Wed, 9 Nov 2022 13:41:31 +0000 Message-Id: <20221109134132.9052-8-nick.alcock@oracle.com> X-Mailer: git-send-email 2.38.0.266.g481848f278 In-Reply-To: <20221109134132.9052-1-nick.alcock@oracle.com> References: <20221109134132.9052-1-nick.alcock@oracle.com> X-ClientProxiedBy: LNXP123CA0004.GBRP123.PROD.OUTLOOK.COM (2603:10a6:600:d2::16) To DS0PR10MB6798.namprd10.prod.outlook.com (2603:10b6:8:13c::20) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR10MB6798:EE_|BLAPR10MB4817:EE_ X-MS-Office365-Filtering-Correlation-Id: 8ab987ce-8a7b-4d6d-7685-08dac258453d X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: Xz6EwKElrdiPx/D+D6kFTGxTO3aN3mNrGowomGLaRL33ITNSTkB1wFY7OH/UYtJksBFRpSiBBYp89knGvIYcoELn34lBg5Ui7J+RTix/giWnn6M1hyVxQfqI0Di7xOUZZvgZiEfj2NuodIvOOd/Ix8roQGaAG7k/ZhxLmgLogw+Tj2cskJsyetRY7meHX1/ubqJryyklGGE1b3o99SJzV+XQ6LaFhU+f+VF7XnFMYUdJvOSBHmFe9PorpNDuB0Ymv8bODA67nZJyi5MromtfNtzMJq6bCEi5qoc+nAy1kV/1ut2ryPX+m/jI6GcQ17GWBGAbaMnQdy5Fj5orHUprUWELzTmuaqjxC3u2Om7h6vP2oCwsEayyRCI4FLdaGk/7WeqGeN/Qi704rI3XK+FI4kJIcX07boecrLAvmTmxqQrx6Flksm7hpmh0Pix+HA15mVF2yqo61qG885nyiXw6c85vb5CFoGu1M0KnG5NvusjEJ0M4PEx7H1OLHhgsaK1nH5HdDv9Vx/oMJarlLTdk4xAEH1htLh6+/H04W+t5kuAjK4li3FZwH3Z3Hkyzrt1lqADVoO5F1BF6TZGnX1cyNkK13OodRaNte/TwLeO6fe81CwOv38g6EL0XuCemo86nYrq6K9mfjIPMDSBregf7kLDOviVe4+pUw2mikkYbRIVBqQfhik9U3ssNmxnx1znTT7L/exg6uxEZbrqu8wGNnQ== X-Forefront-Antispam-Report: 
CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR10MB6798.namprd10.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230022)(346002)(366004)(136003)(376002)(39860400002)(396003)(451199015)(2616005)(66556008)(186003)(83380400001)(1076003)(38100700002)(6512007)(5660300002)(8936002)(30864003)(2906002)(6506007)(4326008)(6486002)(478600001)(6666004)(107886003)(66476007)(8676002)(66946007)(316002)(44832011)(41300700001)(86362001)(36756003);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: HVBQFLSkfDzEGgmEbgc1DzjRB2xdj1jCr/c08x848BU64f9GQLbBYUYyud2a0ObpGdDtclHhiee5jG1mSWOZnYbmXUOBBvXWu2oejgCUDJCO1uhHYgVPrsuEfGGuy7VxCYo5deNzZ1URagXQT9FQGeEHfm1L6hn0eLt05f3MW3Oj/02mYaCid5fAYtkOINMYzy31ZcftK7vBLsa7czIs0JatA82S6g6mNMh30JLKAPT5I0JmKvyJ8gwmoGB/0BXJ3CX4Jzo/D8QZcOXC0Y3xSnPUlRgG8d0p/PpoyifHDqIDPYRL3P6wP6aa/8jxhDRpf1JdAs8DgEUbzX7HRM1iWq6w9RP9owRsKJO7synzmwetbvWMAM0snRXuDhAoTyqX2MMtI55xse6Q4SIfo/lQOANZFCBYOZnjn4YzhLLrtaNfeG6RcfgJevVegsO7MwgRbj9vzf+U3o6qhzlTZfdKAuyHHFY0W0fLXbkPN4smdC40DkD7JiaXlZDSWIca4NLBQnM4X+2RWiC4oo8vc2uBVAV0+8HVkaiAoGyp7a6XqF4X8zNNlZvkwZ9OhE2LR34AdVxysMdamtOeMKIjyxz9SvERwqC6naV3gNDTRcbrCE7XJMQtweIFekiz7o7q6GQitE/SXeEhXDegrPwzzUkg7i6q8bKS3U7tfjzeyjQNlnd9Ra3pDE6RpPEFBAOoEsRJ1oJAtTHAJEm4NX0FpKRxn587Ma7JN7YVXJvwQBKOL+yCRsPHtSSwQU6JskmknVQm6eIv9wq2c4FBCcA3xMtNBKY7S7oKVtAr6h2WqeGrc/ejsRmS+IFztnbaO2423Xtz/gohEVLLmEQ5iYA7nHMN52DwQPt1WC8PbVqM9ZGCq3zf38uDyKAYo57kU44GF6GDnIkzSIOFrj/3FzD4DOW8TQi9SkPg/Og4dNW5A6XyPQG15neEzUGvoc4sYFRqlvAcJFyIVeefBVxVeagpj1Al5yY0/FeRQpyfhoa3LTdYUAeK2nmY66i3RklMb7DDQbQo0m2/J9Ql/yQ4YrttRqxMuijHKp+5gYwhDiF6pKZfr0uX5h85IzvUH565EIO+5hjZGDii4CwxdZFN2OSQAF+rRncQdCgmJl+UHHRe+LGjdJ+6MQdpd1OfMbDZnkl8Gzk2qGqUcNccED5MSkRW0zqvJk3aW0ABEu04X1T08wTSOkrduNNzcSbDft+5mHUU1Cq2lQ0RucEW/Jyi5teXgj5yIfzl8H0awoCUwBK8iTwlZsBAN1DjR6klSSr6uBCqeiIBbls4LmXugnAi1czznsrf0tKa1ClV80yEG8sCIWakqxPpabgBS4qP2M9q5L+mOqsIdSiNFeqBbWmG8mv3CuRZ8Stz6YMh+gzRhdi0A5spG+mveObwqGSE3vCv4XcWIJl1dYnevEt3Os+t4LgbbmW+sDFSjFfQB5ppCpnTIWLKj
6YYAGfktlFVLz7fNYG9OuUoYijCKeGL0h1vy1XuOKyXa2+Y26hVVaw7elhMmRO8Ob4tY+Jqj3V5J4QqLXwqEtepifsO+XfCmrFLBkfgPcGq5XlBXL6JQ/1A92anEU8nWWOMPQSVzBYWrFuttVxn4nPdmuaal0YRv2Tnysgq+s/SUg== X-OriginatorOrg: oracle.com X-MS-Exchange-CrossTenant-Network-Message-Id: 8ab987ce-8a7b-4d6d-7685-08dac258453d X-MS-Exchange-CrossTenant-AuthSource: DS0PR10MB6798.namprd10.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 09 Nov 2022 13:42:40.2944 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: l84Km0OqCdW0jXxt+U0fcwOQZKNSqhs2bhdOkC452GxK8fLeBxeI3qrzW0YF575y0x2gE62x5N7f+47imvbOvQ== X-MS-Exchange-Transport-CrossTenantHeadersStamped: BLAPR10MB4817 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-11-09_06,2022-11-09_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 mlxscore=0 spamscore=0 bulkscore=0 suspectscore=0 adultscore=0 malwarescore=0 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2210170000 definitions=main-2211090104 X-Proofpoint-ORIG-GUID: kl124VuEc4rHVQfU_TC6A3VRXi9WX96S X-Proofpoint-GUID: kl124VuEc4rHVQfU_TC6A3VRXi9WX96S Precedence: bulk List-ID:

Use the tables added in the previous commits to introduce a new /proc/kallmodsyms, in which [module names] are also given for things that *could* have been modular had they not been built in to the kernel. So symbols that are part of, say, ext4 are reported as [ext4] even if ext4 happens to be built in to the kernel in this configuration. This helps disambiguate symbols with identical names when some are in built-in modules and some are not, but if symbols are still ambiguous, {object file names} are added as needed to disambiguate them.
The object file names are only shown if they would prevent ambiguity, and are minimized by chopping off as many leading path components as possible without (symbol, module, objfile) combinations becoming ambiguous again. (Not every symbol with an {object file name} is necessarily ambiguous, but at least one symbol in any such object file would have been ambiguous if the object file was not mentioned.)

Symbols that are part of multiple modules at the same time are shown with [multiple] [module names]: consumers will have to be ready to handle such lines. Also, kernel symbols for built-in modules will be sorted by address, as usual for the core kernel, so they will probably appear interspersed with other symbols that are part of different modules and with non-modular always-built-in symbols, which, as usual, have no square-bracketed module denotation. This differs from /proc/kallsyms; even though /proc/kallsyms shows the same symbols as /proc/kallmodsyms in the same order, the only modules it names are loadable ones, which are necessarily in single contiguous blocks and thus shown contiguously.

The result looks like this ([...] marks where lines are omitted for brevity):

ffffffff97606e50 t not_visible
ffffffff97606e70 T perf_msr_probe
ffffffff97606f80 t test_msr [rapl]
ffffffff97606fa0 t __rapl_pmu_event_start [rapl]
[...]
ffffffffa6007350 t rapl_pmu_event_stop [rapl]
ffffffffa6007440 t rapl_pmu_event_del [rapl]
ffffffffa6007460 t rapl_hrtimer_handle [rapl]
ffffffffa6007500 t rapl_pmu_event_read [rapl]
ffffffffa6007520 t rapl_pmu_event_init [rapl]
ffffffffa6007630 t rapl_cpu_offline [rapl]
ffffffffa6007710 t amd_pmu_event_map {core.o}
ffffffffa6007750 t amd_pmu_add_event {core.o}
ffffffffa6007760 t amd_put_event_constraints_f17h {core.o}

The [rapl] notation is emitted even if rapl is built into the kernel (but, obviously, not if it's not in the .config at all, or is in a loadable module that is not loaded). The {core.o} is an object file name.
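Since consumers have to cope with zero or more [module] tags and an optional {objfile} tag on each line, here is a minimal userspace sketch of a line parser for the format described above. It is not part of this patch: struct kms_entry, kms_parse_line(), kms_copy_delimited() and the KMS_* limits are hypothetical names chosen for illustration, and the sketch assumes symbol names contain no whitespace.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define KMS_MAX_MODS 8		/* illustrative limit, not taken from the patch */
#define KMS_NAME_LEN 128	/* illustrative limit, not taken from the patch */

/* One parsed /proc/kallmodsyms line (hypothetical layout). */
struct kms_entry {
	unsigned long long addr;
	char type;
	char name[KMS_NAME_LEN];
	char mods[KMS_MAX_MODS][KMS_NAME_LEN];	/* the [module] tags, in order */
	int nmods;
	char objfile[KMS_NAME_LEN];		/* empty if no {objfile} tag */
};

/*
 * *p points at an opening bracket or brace.  Copy the text up to the
 * matching 'closer' into dst and advance *p past it.  Returns 0 on
 * success, -1 if the closer is missing or the text does not fit.
 */
static int kms_copy_delimited(const char **p, char closer,
			      char *dst, size_t dstlen)
{
	const char *start = *p + 1;
	const char *end = strchr(start, closer);
	size_t len;

	if (end == NULL)
		return -1;
	len = (size_t)(end - start);
	if (len >= dstlen)
		return -1;
	memcpy(dst, start, len);
	dst[len] = '\0';
	*p = end + 1;
	return 0;
}

/* Parse "ADDR TYPE NAME [mod]... {objfile}".  Returns 0 on success. */
static int kms_parse_line(const char *line, struct kms_entry *e)
{
	const char *p;
	char *endp;
	size_t len;

	assert(line != NULL && e != NULL);
	memset(e, 0, sizeof(*e));

	/* Hexadecimal address, followed by exactly one space. */
	e->addr = strtoull(line, &endp, 16);
	if (endp == line || *endp != ' ')
		return -1;
	p = endp + 1;

	/* Single-character symbol type, followed by one space. */
	e->type = *p;
	if (e->type == '\0' || p[1] != ' ')
		return -1;
	p += 2;

	/* Symbol name runs up to the next whitespace. */
	len = strcspn(p, " \t\n");
	if (len == 0 || len >= sizeof(e->name))
		return -1;
	memcpy(e->name, p, len);
	e->name[len] = '\0';
	p += len;

	/* Any number of [module] tags, and possibly one {objfile} tag. */
	for (;;) {
		p += strspn(p, " \t\n");
		if (*p == '[') {
			if (e->nmods == KMS_MAX_MODS ||
			    kms_copy_delimited(&p, ']', e->mods[e->nmods],
					       KMS_NAME_LEN) < 0)
				return -1;
			e->nmods++;
		} else if (*p == '{') {
			if (kms_copy_delimited(&p, '}', e->objfile,
					       KMS_NAME_LEN) < 0)
				return -1;
		} else {
			break;
		}
	}
	return *p == '\0' ? 0 : -1;
}
```

A caller would read /proc/kallmodsyms line by line (for example with fgets()) and pass each line to kms_parse_line(); a symbol is multimodule when nmods is greater than 1, and needed objfile disambiguation when objfile is non-empty.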
Further down, we see what happens when object files are reused by multiple modules, all of which are built in to the kernel, and some of which have symbols that would be ambiguous without an object file name attached in addition to the module names:

ffffffff97d7aed0 t liquidio_pcie_mmio_enabled [liquidio]
ffffffff97d7aef0 t liquidio_pcie_resume [liquidio]
ffffffff97d7af00 t liquidio_ptp_adjtime [liquidio]
ffffffff97d7af50 t liquidio_ptp_enable [liquidio]
ffffffff97d7af70 t liquidio_get_stats64 [liquidio]
ffffffff97d7b0f0 t liquidio_fix_features [liquidio]
ffffffff97d7b1c0 t liquidio_get_port_parent_id [liquidio]
[...]
ffffffff97d824c0 t lio_vf_rep_modinit [liquidio]
ffffffff97d824f0 t lio_vf_rep_modexit [liquidio]
ffffffff97d82520 t lio_ethtool_get_channels [liquidio] [liquidio_vf]
ffffffff97d82600 t lio_ethtool_get_ringparam [liquidio] [liquidio_vf]
ffffffff97d826a0 t lio_get_msglevel [liquidio] [liquidio_vf]
ffffffff97d826c0 t lio_vf_set_msglevel [liquidio] [liquidio_vf]
ffffffff97d826e0 t lio_get_pauseparam [liquidio] [liquidio_vf]
ffffffff97d82710 t lio_get_ethtool_stats [liquidio] [liquidio_vf]
ffffffff97d82e70 t lio_vf_get_ethtool_stats [liquidio] [liquidio_vf]
[...]
ffffffff97d91a80 t cn23xx_vf_mbox_thread [liquidio] [liquidio_vf] {cn23xx_vf_device.o}
ffffffff97d91aa0 t cpumask_weight.constprop.0 [liquidio] [liquidio_vf] {cn23xx_vf_device.o}
ffffffff97d91ac0 t cn23xx_vf_msix_interrupt_handler [liquidio] [liquidio_vf] {cn23xx_vf_device.o}
ffffffff97d91bd0 t cn23xx_vf_get_oq_ticks [liquidio] [liquidio_vf] {cn23xx_vf_device.o}
ffffffff97d91c00 t cn23xx_vf_ask_pf_to_do_flr [liquidio] [liquidio_vf] {cn23xx_vf_device.o}
ffffffff97d91c70 t cn23xx_octeon_pfvf_handshake [liquidio] [liquidio_vf] {cn23xx_vf_device.o}
ffffffff97d91e20 t cn23xx_setup_octeon_vf_device [liquidio] [liquidio_vf] {cn23xx_vf_device.o}
ffffffff97d92060 t octeon_mbox_read [liquidio] [liquidio_vf]
ffffffff97d92230 t octeon_mbox_write [liquidio] [liquidio_vf]
[...]
ffffffff97d946b0 t octeon_alloc_soft_command_resp [liquidio] [liquidio_vf]
ffffffff97d947e0 t octnet_send_nic_data_pkt [liquidio] [liquidio_vf]
ffffffff97d94820 t octnet_send_nic_ctrl_pkt [liquidio] [liquidio_vf]
ffffffff97d94ab0 t liquidio_get_stats64 [liquidio_vf]
ffffffff97d94c10 t liquidio_fix_features [liquidio_vf]
ffffffff97d94cd0 t wait_for_pending_requests [liquidio_vf]

Like /proc/kallsyms, the output is driven by address, so keeps the curious property of /proc/kallsyms that symbols may appear repeatedly with different addresses: but now, unlike in /proc/kallsyms, we can see that those symbols appear repeatedly because they are *different symbols* that ultimately belong to different modules or different object files within a module, all of which are built in to the kernel.

As with /proc/kallsyms, non-root usage produces addresses that are all zero.

I am not wedded to the name or format of /proc/kallmodsyms, but felt it best to split it out of /proc/kallsyms to avoid breaking existing kallsyms parsers. This is currently driven by a new config option, but now that kallmodsyms data uses very little space, this option might be something people don't want to bother with: maybe we can just control it via CONFIG_KALLSYMS or something.

Internally, this uses a new kallsyms_builtin_module_address() almost identical to kallsyms_sym_address() to get the address corresponding to a given .kallsyms_modules index, and a new get_builtin_modobj_idx quite similar to get_symbol_pos to determine the index in the .kallsyms_modules and .kallsyms_objfiles arrays that relate to a given address. We save a little time by exploiting the fact that all callers will only ever traverse this list from start to end by allowing them to pass in the previous index returned from this function as a hint: thus very few bsearches are actually needed.
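To illustrate the hint idea in isolation, here is a simplified userspace sketch of a range lookup that tries the caller's previous result before falling back to a binary search. It is not the patch's get_builtin_modobj_idx(): find_range(), NO_RANGE and the plain starts[] array are hypothetical stand-ins, and the hint check here is deliberately simpler than the one in the diff. As in the patch, the last array entry is assumed to mark the end of the covered region rather than the start of a range.

```c
#include <assert.h>
#include <stddef.h>

#define NO_RANGE ((size_t)-1)	/* hypothetical "not found" value */

/*
 * starts[] holds n sorted range start addresses.  starts[n - 1] is a
 * sentinel marking the end of the last range, so there are n - 1 ranges.
 * Returns the index of the range containing addr, or NO_RANGE.
 */
static size_t find_range(const unsigned long *starts, size_t n,
			 unsigned long addr, size_t hint)
{
	size_t low, high, mid;

	assert(starts != NULL);
	if (n < 2)
		return NO_RANGE;

	/*
	 * Fast path: when addresses are visited in ascending order, the
	 * previous answer is usually still right, so no search is needed.
	 * A hint of NO_RANGE fails the bounds check and is ignored.
	 */
	if (hint < n - 1 &&
	    addr >= starts[hint] && addr < starts[hint + 1])
		return hint;

	low = 0;
	high = n - 1;
	if (addr < starts[low] || addr >= starts[high])
		return NO_RANGE;

	/* Invariant: starts[low] <= addr < starts[high]. */
	while (high - low > 1) {
		mid = low + (high - low) / 2;
		if (starts[mid] <= addr)
			low = mid;
		else
			high = mid;
	}
	return low;
}
```

A caller iterating over symbols in address order would keep the last returned index and pass it back in as the hint for the next symbol, starting from NO_RANGE. A stale or invalid hint only costs one extra comparison before the binary search runs.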
(In theory this could change to just walk straight down kallsyms_module_addresses/offsets and not bother bsearching at all, but doing it this way is hardly any slower and much more robust.) We explicitly filter out displaying modules for non-text symbols (perhaps this could be lifted for initialized data symbols in future). There might be occasional incorrect module or objfile names for section start/end symbols. The display process is complicated a little by the weird format of the .kallsyms_mod_objnames table: we have to look for multimodule entries and print them as space-separated lists of module names. Signed-off-by: Nick Alcock Reviewed-by: Kris Van Hees --- Notes: v9: add objfile support. Commit message adjustments. kernel/kallsyms.c | 277 ++++++++++++++++++++++++++++++++++--- kernel/kallsyms_internal.h | 14 ++ 2 files changed, 274 insertions(+), 17 deletions(-) diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c index 60c20f301a6b..9667962173f1 100644 --- a/kernel/kallsyms.c +++ b/kernel/kallsyms.c @@ -187,6 +187,25 @@ static bool cleanup_symbol_name(char *s) return false; } +#ifdef CONFIG_KALLMODSYMS +static unsigned long kallsyms_builtin_module_address(int idx) +{ + if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE)) + return kallsyms_module_addresses[idx]; + + /* values are unsigned offsets if --absolute-percpu is not in effect */ + if (!IS_ENABLED(CONFIG_KALLSYMS_ABSOLUTE_PERCPU)) + return kallsyms_relative_base + (u32)kallsyms_module_offsets[idx]; + + /* ...otherwise, positive offsets are absolute values */ + if (kallsyms_module_offsets[idx] >= 0) + return kallsyms_module_offsets[idx]; + + /* ...and negative offsets are relative to kallsyms_relative_base - 1 */ + return kallsyms_relative_base - 1 - kallsyms_module_offsets[idx]; +} +#endif + /* Lookup the address for this symbol. Returns 0 if not found. 
*/ unsigned long kallsyms_lookup_name(const char *name) { @@ -293,6 +312,54 @@ static unsigned long get_symbol_pos(unsigned long addr, return low; } +/* + * The caller passes in an address, and we return an index to the corresponding + * builtin module index in .kallsyms_modules and .kallsyms_objfiles, or + * (unsigned long) -1 if none match. + * + * The hint_idx, if set, is a hint as to the possible return value, to handle + * the common case in which consecutive runs of addresses relate to the same + * index. + */ +#ifdef CONFIG_KALLMODSYMS +static unsigned long get_builtin_modobj_idx(unsigned long addr, unsigned long hint_idx) +{ + unsigned long low, high, mid; + + if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE)) + BUG_ON(!kallsyms_module_addresses); + else + BUG_ON(!kallsyms_module_offsets); + + /* + * Do a binary search on the sorted kallsyms_modules array. The last + * entry in this array indicates the end of the text section, not an + * object file. + */ + low = 0; + high = kallsyms_num_modules - 1; + + if (hint_idx > low && hint_idx < (high - 1) && + addr >= kallsyms_builtin_module_address(hint_idx) && + addr < kallsyms_builtin_module_address(hint_idx + 1)) + return hint_idx; + + if (addr >= kallsyms_builtin_module_address(low) + && addr < kallsyms_builtin_module_address(high)) { + while (high - low > 1) { + mid = low + (high - low) / 2; + if (kallsyms_builtin_module_address(mid) <= addr) + low = mid; + else + high = mid; + } + return low; + } + + return (unsigned long) -1; +} +#endif + /* * Lookup an address but don't bother to find any names. 
*/ @@ -564,6 +631,9 @@ struct kallsym_iter { char type; char name[KSYM_NAME_LEN]; char module_name[MODULE_NAME_LEN]; + const char *builtin_module_names; + const char *builtin_objfile_name; + unsigned long hint_builtin_modobj_idx; int exported; int show_value; }; @@ -594,6 +664,9 @@ static int get_ksymbol_mod(struct kallsym_iter *iter) &iter->value, &iter->type, iter->name, iter->module_name, &iter->exported); + iter->builtin_module_names = NULL; + iter->builtin_objfile_name = NULL; + if (ret < 0) { iter->pos_mod_end = iter->pos; return 0; @@ -613,6 +686,9 @@ static int get_ksymbol_ftrace_mod(struct kallsym_iter *iter) &iter->value, &iter->type, iter->name, iter->module_name, &iter->exported); + iter->builtin_module_names = NULL; + iter->builtin_objfile_name = NULL; + if (ret < 0) { iter->pos_ftrace_mod_end = iter->pos; return 0; @@ -627,6 +703,8 @@ static int get_ksymbol_bpf(struct kallsym_iter *iter) strlcpy(iter->module_name, "bpf", MODULE_NAME_LEN); iter->exported = 0; + iter->builtin_module_names = NULL; + iter->builtin_objfile_name = NULL; ret = bpf_get_kallsym(iter->pos - iter->pos_ftrace_mod_end, &iter->value, &iter->type, iter->name); @@ -647,23 +725,74 @@ static int get_ksymbol_kprobe(struct kallsym_iter *iter) { strlcpy(iter->module_name, "__builtin__kprobes", MODULE_NAME_LEN); iter->exported = 0; + iter->builtin_module_names = NULL; + iter->builtin_objfile_name = NULL; return kprobe_get_kallsym(iter->pos - iter->pos_bpf_end, &iter->value, &iter->type, iter->name) < 0 ? 0 : 1; } /* Returns space to next name. 
*/ -static unsigned long get_ksymbol_core(struct kallsym_iter *iter) +static unsigned long get_ksymbol_core(struct kallsym_iter *iter, int kallmodsyms) { unsigned off = iter->nameoff; - iter->module_name[0] = '\0'; + iter->exported = 0; iter->value = kallsyms_sym_address(iter->pos); iter->type = kallsyms_get_symbol_type(off); + iter->module_name[0] = '\0'; + iter->builtin_module_names = NULL; + iter->builtin_objfile_name = NULL; + off = kallsyms_expand_symbol(off, iter->name, ARRAY_SIZE(iter->name)); +#ifdef CONFIG_KALLMODSYMS + if (kallmodsyms) { + unsigned long modobj_idx = (unsigned long) -1; + + if (kallsyms_module_offsets) + modobj_idx = + get_builtin_modobj_idx(iter->value, + iter->hint_builtin_modobj_idx); + /* + * This is a built-in module iff the tables of built-in modules + * (address->module name mappings), object files (ditto), and + * module/objfile names are known, and if the address was found + * there, and if the corresponding module index is nonzero, and + * iff this is a text (or weak) symbol. All other cases mean + * off the end of the binary or in a non-modular range in + * between one or more modules. + * + * The same rules are true for kallsyms_objfiles, except that + * zero entries are much more common because we only record + * object file names if we need them to disambiguate one or more + * symbols: see scripts/kallsyms.c:disambiguate_syms. + * + * (Also guard against corrupt kallsyms_modules or + * kallsyms_objfiles arrays pointing off the end of + * kallsyms_mod_objnames.) 
+ */ + if (kallsyms_modules != NULL && kallsyms_mod_objnames != NULL && + kallsyms_objfiles != NULL && + (iter->type == 't' || iter->type == 'T' || + iter->type == 'w' || iter->type == 'W') && + modobj_idx != (unsigned long) -1) { + + if (kallsyms_modules[modobj_idx] != 0 && + kallsyms_modules[modobj_idx] < kallsyms_mod_objnames_len) + iter->builtin_module_names = + &kallsyms_mod_objnames[kallsyms_modules[modobj_idx]]; + + if (kallsyms_objfiles[modobj_idx] != 0 && + kallsyms_objfiles[modobj_idx] < kallsyms_mod_objnames_len) + iter->builtin_objfile_name = + &kallsyms_mod_objnames[kallsyms_objfiles[modobj_idx]]; + } + iter->hint_builtin_modobj_idx = modobj_idx; + } +#endif return off - iter->nameoff; } @@ -709,7 +838,7 @@ static int update_iter_mod(struct kallsym_iter *iter, loff_t pos) } /* Returns false if pos at or past end of file. */ -static int update_iter(struct kallsym_iter *iter, loff_t pos) +static int update_iter(struct kallsym_iter *iter, loff_t pos, int kallmodsyms) { /* Module symbols can be accessed randomly. 
*/ if (pos >= kallsyms_num_syms) @@ -719,7 +848,7 @@ static int update_iter(struct kallsym_iter *iter, loff_t pos) if (pos != iter->pos) reset_iter(iter, pos); - iter->nameoff += get_ksymbol_core(iter); + iter->nameoff += get_ksymbol_core(iter, kallmodsyms); iter->pos++; return 1; @@ -729,14 +858,14 @@ static void *s_next(struct seq_file *m, void *p, loff_t *pos) { (*pos)++; - if (!update_iter(m->private, *pos)) + if (!update_iter(m->private, *pos, 0)) return NULL; return p; } static void *s_start(struct seq_file *m, loff_t *pos) { - if (!update_iter(m->private, *pos)) + if (!update_iter(m->private, *pos, 0)) return NULL; return m->private; } @@ -745,7 +874,7 @@ static void s_stop(struct seq_file *m, void *p) { } -static int s_show(struct seq_file *m, void *p) +static int s_show_internal(struct seq_file *m, void *p, int kallmodsyms) { void *value; struct kallsym_iter *iter = m->private; @@ -756,23 +885,82 @@ static int s_show(struct seq_file *m, void *p) value = iter->show_value ? (void *)iter->value : NULL; - if (iter->module_name[0]) { + /* + * Real module, or built-in module and /proc/kallsyms being shown. + */ + if (iter->module_name[0] != '\0' || + (iter->builtin_module_names != NULL && kallmodsyms != 0)) { char type; /* - * Label it "global" if it is exported, - * "local" if not exported. + * Label it "global" if it is exported, "local" if not exported. */ type = iter->exported ? toupper(iter->type) : tolower(iter->type); - seq_printf(m, "%px %c %s\t[%s]\n", value, - type, iter->name, iter->module_name); - } else - seq_printf(m, "%px %c %s\n", value, +#ifdef CONFIG_KALLMODSYMS + if (kallmodsyms) { + /* + * /proc/kallmodsyms, built as a module. + */ + if (iter->builtin_module_names == NULL) + seq_printf(m, "%px %c %s\t[%s]", value, + type, iter->name, + iter->module_name); + /* + * /proc/kallmodsyms, single-module symbol. 
+ */ + else if (*iter->builtin_module_names != '\0') + seq_printf(m, "%px %c %s\t[%s]", value, + type, iter->name, + iter->builtin_module_names); + /* + * /proc/kallmodsyms, multimodule symbol. Formatted + * as \0MODULE_COUNTmodule-1\0module-2\0, where + * MODULE_COUNT is a single byte, 2 or higher. + */ + else { + size_t i = *(char *)(iter->builtin_module_names + 1); + const char *walk = iter->builtin_module_names + 2; + + seq_printf(m, "%px %c %s\t[%s]", value, + type, iter->name, walk); + + while (--i > 0) { + walk += strlen(walk) + 1; + seq_printf(m, " [%s]", walk); + } + } + /* + * Possibly there is an objfile name too, if needed to + * disambiguate at least one symbol. + */ + if (iter->builtin_objfile_name) + seq_printf(m, " {%s.o}", iter->builtin_objfile_name); + + seq_printf(m, "\n"); + } else /* !kallmodsyms */ +#endif /* CONFIG_KALLMODSYMS */ + seq_printf(m, "%px %c %s\t[%s]\n", value, + type, iter->name, iter->module_name); + } else { + seq_printf(m, "%px %c %s", value, iter->type, iter->name); +#ifdef CONFIG_KALLMODSYMS + if (kallmodsyms) { + if (iter->builtin_objfile_name) + seq_printf(m, "\t{%s.o}", iter->builtin_objfile_name); + } +#endif /* CONFIG_KALLMODSYMS */ + seq_printf(m, "\n"); + } return 0; } +static int s_show(struct seq_file *m, void *p) +{ + return s_show_internal(m, p, 0); +} + static const struct seq_operations kallsyms_op = { .start = s_start, .next = s_next, @@ -780,6 +968,36 @@ static const struct seq_operations kallsyms_op = { .show = s_show }; +#ifdef CONFIG_KALLMODSYMS +static int s_mod_show(struct seq_file *m, void *p) +{ + return s_show_internal(m, p, 1); +} + +static void *s_mod_next(struct seq_file *m, void *p, loff_t *pos) +{ + (*pos)++; + + if (!update_iter(m->private, *pos, 1)) + return NULL; + return p; +} + +static void *s_mod_start(struct seq_file *m, loff_t *pos) +{ + if (!update_iter(m->private, *pos, 1)) + return NULL; + return m->private; +} + +static const struct seq_operations kallmodsyms_op = { + .start = 
s_mod_start, + .next = s_mod_next, + .stop = s_stop, + .show = s_mod_show +}; +#endif + #ifdef CONFIG_BPF_SYSCALL struct bpf_iter__ksym { @@ -905,7 +1123,8 @@ bool kallsyms_show_value(const struct cred *cred) } } -static int kallsyms_open(struct inode *inode, struct file *file) +static int kallsyms_open_internal(struct inode *inode, struct file *file, + const struct seq_operations *ops) { /* * We keep iterator in m->private, since normal case is to @@ -913,7 +1132,7 @@ static int kallsyms_open(struct inode *inode, struct file *file) * using get_symbol_offset for every symbol. */ struct kallsym_iter *iter; - iter = __seq_open_private(file, &kallsyms_op, sizeof(*iter)); + iter = __seq_open_private(file, ops, sizeof(*iter)); if (!iter) return -ENOMEM; reset_iter(iter, 0); @@ -926,6 +1145,18 @@ static int kallsyms_open(struct inode *inode, struct file *file) return 0; } +static int kallsyms_open(struct inode *inode, struct file *file) +{ + return kallsyms_open_internal(inode, file, &kallsyms_op); +} + +#ifdef CONFIG_KALLMODSYMS +static int kallmodsyms_open(struct inode *inode, struct file *file) +{ + return kallsyms_open_internal(inode, file, &kallmodsyms_op); +} +#endif + #ifdef CONFIG_KGDB_KDB const char *kdb_walk_kallsyms(loff_t *pos) { @@ -936,7 +1167,7 @@ const char *kdb_walk_kallsyms(loff_t *pos) reset_iter(&kdb_walk_kallsyms_iter, 0); } while (1) { - if (!update_iter(&kdb_walk_kallsyms_iter, *pos)) + if (!update_iter(&kdb_walk_kallsyms_iter, *pos, 0)) return NULL; ++*pos; /* Some debugging symbols have no name. Ignore them. 
*/ @@ -953,9 +1184,21 @@ static const struct proc_ops kallsyms_proc_ops = { .proc_release = seq_release_private, }; +#ifdef CONFIG_KALLMODSYMS +static const struct proc_ops kallmodsyms_proc_ops = { + .proc_open = kallmodsyms_open, + .proc_read = seq_read, + .proc_lseek = seq_lseek, + .proc_release = seq_release_private, +}; +#endif + static int __init kallsyms_init(void) { proc_create("kallsyms", 0444, NULL, &kallsyms_proc_ops); +#ifdef CONFIG_KALLMODSYMS + proc_create("kallmodsyms", 0444, NULL, &kallmodsyms_proc_ops); +#endif return 0; } device_initcall(kallsyms_init); diff --git a/kernel/kallsyms_internal.h b/kernel/kallsyms_internal.h index 2d0c6f2f0243..0ee6d97b732e 100644 --- a/kernel/kallsyms_internal.h +++ b/kernel/kallsyms_internal.h @@ -22,8 +22,22 @@ __section(".rodata") __attribute__((weak)); extern const unsigned long kallsyms_relative_base __section(".rodata") __attribute__((weak)); +extern const unsigned long kallsyms_num_modules +__section(".rodata") __attribute__((weak)); + +extern const unsigned long kallsyms_num_objfiles +__section(".rodata") __attribute__((weak)); + +extern const unsigned long kallsyms_mod_objnames_len +__section(".rodata") __attribute__((weak)); + extern const char kallsyms_token_table[] __weak; extern const u16 kallsyms_token_index[] __weak; +extern const unsigned long kallsyms_module_addresses[] __weak; +extern const int kallsyms_module_offsets[] __weak; +extern const u32 kallsyms_modules[] __weak; +extern const u32 kallsyms_objfiles[] __weak; +extern const char kallsyms_mod_objnames[] __weak; extern const unsigned int kallsyms_markers[] __weak; From patchwork Wed Nov 9 13:41:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nick Alcock X-Patchwork-Id: 13037555 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by 
	smtp.lore.kernel.org (Postfix) with ESMTP id 4D019C43217 for ; Wed, 9 Nov 2022 14:09:52 +0000 (UTC)
From: Nick Alcock
To: mcgrof@kernel.org, masahiroy@kernel.org
Cc: linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org, arnd@arndb.de, akpm@linux-foundation.org, eugene.loh@oracle.com, kris.van.hees@oracle.com
Subject: [PATCH v9 8/8] perf: proof-of-concept kallmodsyms support
Date: Wed, 9 Nov 2022 13:41:32 +0000
Message-Id: <20221109134132.9052-9-nick.alcock@oracle.com>
In-Reply-To: <20221109134132.9052-1-nick.alcock@oracle.com>
References: <20221109134132.9052-1-nick.alcock@oracle.com>

This is only very partial: it adds support to 'perf kallsyms', allowing
you to say, e.g.

% ./perf kallsyms __scsi_device_lookup
__scsi_device_lookup: scsi_mod (built-in) 0xffffffff9b901de0-0xffffffff9b901e30 (0xffffffff9b901de0-0xffffffff9b901e30)

and get told that this is in a built-in module. We also handle symbols
that are in multiple modules at once:

% ./perf kallsyms lio_set_fecparam
lio_set_fecparam: liquidio, liquidio_vf (built-in) 0xffffffff9b934da0-0xffffffff9b934e10 (0xffffffff9b934da0-0xffffffff9b934e10)

We do this the simplistic way, by augmenting symbols with a module array
field, the members of which are atoms in a new machine.modules red-black
tree (structured the same, and managed with the same code, as the
existing transient tree that is used to compare two /proc/modules files
against each other).
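
For readers who want to see the module-array idea in isolation, here is a
minimal standalone sketch. It is not the patch's code. It assumes the
module list of a kallmodsyms line looks like "[liquidio] [liquidio_vf]",
which is what the parsing code in this patch expects. The names intern(),
scan_modules() and parse_modules() are hypothetical, and a small
fixed-size table stands in for the machine.modules red-black tree. It
counts the names in one pass, fills a NULL-terminated array in a second
pass, and stores each name only once so that symbols in the same module
share a pointer.

```c
#define _POSIX_C_SOURCE 200809L	/* for strndup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for machine.modules: a small fixed-size intern
 * table rather than a red-black tree, so equal names share one copy.
 */
#define MAX_ATOMS 64
static char *atoms[MAX_ATOMS];
static size_t natoms;

/* Return the shared copy of name[0..len), adding it if it is new. */
static const char *intern(const char *name, size_t len)
{
	size_t i;

	for (i = 0; i < natoms; i++)
		if (strlen(atoms[i]) == len && memcmp(atoms[i], name, len) == 0)
			return atoms[i];
	if (natoms == MAX_ATOMS)
		return NULL;
	atoms[natoms] = strndup(name, len);
	if (atoms[natoms] == NULL)
		return NULL;
	return atoms[natoms++];
}

/*
 * Walk the "[name]" entries in list.  With out == NULL just count them;
 * otherwise store one interned name per entry.  Returns the number of
 * entries handled, or (size_t)-1 if interning fails.
 */
static size_t scan_modules(const char *list, const char **out)
{
	size_t n = 0;
	const char *p = list;

	while ((p = strchr(p, '[')) != NULL) {
		const char *end = strchr(++p, ']');

		if (end == NULL)
			break;		/* unterminated entry: stop */
		if (end > p) {		/* skip empty "[]" */
			if (out != NULL) {
				out[n] = intern(p, (size_t)(end - p));
				if (out[n] == NULL)
					return (size_t)-1;
			}
			n++;
		}
		p = end + 1;
	}
	return n;
}

/*
 * Turn "[liquidio] [liquidio_vf]" into a NULL-terminated array of
 * interned names.  Returns NULL for an empty list or on allocation
 * failure.  The caller frees the array only, never the names.
 */
static const char **parse_modules(const char *list)
{
	size_t nmods = scan_modules(list, NULL);	/* pass 1: count */
	const char **mods;

	if (nmods == 0)
		return NULL;
	mods = calloc(nmods + 1, sizeof(*mods));
	if (mods == NULL)
		return NULL;
	if (scan_modules(list, mods) != nmods) {	/* pass 2: fill */
		free(mods);
		return NULL;
	}
	assert(mods[nmods] == NULL);	/* calloc left the terminator */
	return mods;
}
```

In this sketch, parse_modules("[liquidio] [liquidio_vf]") and
parse_modules("[liquidio]") return arrays whose first elements are the
same pointer. Sharing names this way is why freeing a symbol only needs
to free the array and can leave the names to whoever owns the tree.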

kallsyms symbols no longer carry a [module name] around with them that
needs cutting off whenever it's used (all users that relied on this have
been adjusted, I hope) but instead have that name hived off into the
per-symbol module array, with a new 'built_in' field to tell users
whether this is a built-in module or not. Since we cannot use the
presence of '[' to detect modules any more, we do it at kallmodsyms read
time by spotting _end and considering it to denote the end of the core
kernel and the start of the modular range. (I *think* this works on all
arches.)

Signed-off-by: Nick Alcock
Reviewed-by: Kris Van Hees
---
 tools/perf/builtin-kallsyms.c |  35 +++++-
 tools/perf/util/event.c       |  14 ++-
 tools/perf/util/machine.c     |   6 +-
 tools/perf/util/machine.h     |   1 +
 tools/perf/util/symbol.c      | 207 +++++++++++++++++++++++++---------
 tools/perf/util/symbol.h      |  12 +-
 6 files changed, 211 insertions(+), 64 deletions(-)

diff --git a/tools/perf/builtin-kallsyms.c b/tools/perf/builtin-kallsyms.c
index c08ee81529e8..6bcec2522d2d 100644
--- a/tools/perf/builtin-kallsyms.c
+++ b/tools/perf/builtin-kallsyms.c
@@ -35,10 +35,37 @@ static int __cmd_kallsyms(int argc, const char **argv)
 			continue;
 		}
-		printf("%s: %s %s %#" PRIx64 "-%#" PRIx64 " (%#" PRIx64 "-%#" PRIx64")\n",
-			symbol->name, map->dso->short_name, map->dso->long_name,
-			map->unmap_ip(map, symbol->start), map->unmap_ip(map, symbol->end),
-			symbol->start, symbol->end);
+		if (!symbol->modules) {
+			printf("%s: %s %s %#" PRIx64 "-%#" PRIx64 " (%#" PRIx64 "-%#" PRIx64")\n",
+				symbol->name, map->dso->short_name, map->dso->long_name,
+				map->unmap_ip(map, symbol->start), map->unmap_ip(map, symbol->end),
+				symbol->start, symbol->end);
+		} else {
+			if (!symbol->built_in)
+				printf("%s: %s %s %#" PRIx64 "-%#" PRIx64 " (%#" PRIx64 "-%#" PRIx64")\n",
+					symbol->name, map->dso->short_name, map->dso->long_name,
+					map->unmap_ip(map, symbol->start), map->unmap_ip(map, symbol->end),
+					symbol->start, symbol->end);
+			else if (symbol->modules[1] == 0)
+				printf("%s: %s (built-in) %#" PRIx64 "-%#" PRIx64 " (%#" PRIx64 "-%#" PRIx64")\n",
+					symbol->name, symbol->modules[0], map->unmap_ip(map, symbol->start),
+					map->unmap_ip(map, symbol->end), symbol->start, symbol->end);
+			else { /* Symbol in multiple modules at once */
+				char **mod;
+
+				printf("%s: ", symbol->name);
+
+				for (mod = symbol->modules; *mod; mod++) {
+					if (mod != symbol->modules)
+						printf(", ");
+					printf("%s", *mod);
+				}
+
+				printf (" (built-in) %#" PRIx64 "-%#" PRIx64 " (%#" PRIx64 "-%#" PRIx64")\n",
+					map->unmap_ip(map, symbol->start), map->unmap_ip(map, symbol->end),
+					symbol->start, symbol->end);
+			}
+		}
 	}
 	machine__delete(machine);
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 1fa14598b916..a344b35f7e38 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -97,16 +97,28 @@ static int find_symbol_cb(void *arg, const char *name, char type,
 			  u64 start)
 {
 	struct process_symbol_args *args = arg;
+	char *chop, *tmp_alloc = NULL;
+	const char *tmp = name;
+
+	if ((chop = strchr(name, '\t')) != NULL) {
+		tmp_alloc = strndup(name, chop - name);
+		if (tmp_alloc == NULL)
+			return -ENOMEM;
+		tmp = tmp_alloc;
+	}
 	/*
 	 * Must be a function or at least an alias, as in PARISC64, where "_text" is
 	 * an 'A' to the same address as "_stext".
 	 */
 	if (!(kallsyms__is_function(type) ||
-	      type == 'A') || strcmp(name, args->name))
+	      type == 'A') || strcmp(tmp, args->name)) {
+		free(tmp_alloc);
 		return 0;
+	}
 	args->start = start;
+	free(tmp_alloc);
 	return 1;
 }
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 76316e459c3d..2be5a3c1a267 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -173,7 +173,7 @@ struct machine *machine__new_kallsyms(void)
 	 * ask for not using the kcore parsing code, once this one is fixed
 	 * to create a map per module.
 	 */
-	if (machine && machine__load_kallsyms(machine, "/proc/kallsyms") <= 0) {
+	if (machine && machine__load_kallsyms(machine, "/proc/kallmodsyms") <= 0) {
 		machine__delete(machine);
 		machine = NULL;
 	}
@@ -237,6 +237,7 @@ void machine__exit(struct machine *machine)
 	zfree(&machine->mmap_name);
 	zfree(&machine->current_tid);
 	zfree(&machine->kallsyms_filename);
+	modules__delete_modules(&machine->modules);
 	for (i = 0; i < THREADS__TABLE_SIZE; i++) {
 		struct threads *threads = &machine->threads[i];
@@ -1410,7 +1411,8 @@ int machines__create_kernel_maps(struct machines *machines, pid_t pid)
 int machine__load_kallsyms(struct machine *machine, const char *filename)
 {
 	struct map *map = machine__kernel_map(machine);
-	int ret = __dso__load_kallsyms(map->dso, filename, map, true);
+	int ret = __dso__load_kallsyms(map->dso, filename, map, &machine->modules,
+				       true);
 	if (ret > 0) {
 		dso__set_loaded(map->dso);
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 74935dfaa937..393063840cd1 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -55,6 +55,7 @@ struct machine {
 	struct dsos dsos;
 	struct maps *kmaps;
 	struct map *vmlinux_map;
+	struct rb_root modules;
 	u64 kernel_start;
 	pid_t *current_tid;
 	size_t current_tid_sz;
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index a3a165ae933a..aab7ffdd0573 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -41,10 +41,16 @@
 #include
 #include
-static int dso__load_kernel_sym(struct dso *dso, struct map *map);
+static int dso__load_kernel_sym(struct dso *dso, struct map *map,
+				struct rb_root *modules);
 static int dso__load_guest_kernel_sym(struct dso *dso, struct map *map);
 static bool symbol__is_idle(const char *name);
+static int read_proc_modules(const char *filename, struct rb_root *modules);
+static struct module_info *find_module(const char *name,
+				       struct rb_root *modules);
+static void add_module(struct module_info *mi, struct rb_root *modules);
+
 int vmlinux_path__nr_entries;
 char **vmlinux_path;
@@ -85,6 +91,12 @@ static enum dso_binary_type binary_type_symtab[] = {
 #define DSO_BINARY_TYPE__SYMTAB_CNT ARRAY_SIZE(binary_type_symtab)
+struct module_info {
+	struct rb_node rb_node;
+	char *name;
+	u64 start;
+};
+
 static bool symbol_type__filter(char symbol_type)
 {
 	symbol_type = toupper(symbol_type);
@@ -234,15 +246,10 @@ void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms)
 		 * kernel text segment and beginning of first module's text
 		 * segment is very big. Therefore do not fill this gap and do
 		 * not assign it to the kernel dso map (kallsyms).
-		 *
-		 * In kallsyms, it determines module symbols using '[' character
-		 * like in:
-		 * ffffffffc1937000 T hdmi_driver_init [snd_hda_codec_hdmi]
 		 */
 		if (prev->end == prev->start) {
 			/* Last kernel/module symbol mapped to end of page */
-			if (is_kallsyms && (!strchr(prev->name, '[') !=
-					    !strchr(curr->name, '[')))
+			if (is_kallsyms && prev->built_in != curr->built_in)
 				prev->end = roundup(prev->end + 4096, 4096);
 			else
 				prev->end = curr->start;
@@ -301,6 +308,8 @@ struct symbol *symbol__new(u64 start, u64 len, u8 binding, u8 type, const char *
 	sym->type = type;
 	sym->binding = binding;
 	sym->namelen = namelen - 1;
+	sym->modules = NULL;
+	sym->built_in = 0;
 	pr_debug4("%s: %s %#" PRIx64 "-%#" PRIx64 "\n",
 		  __func__, name, start, sym->end);
@@ -318,6 +327,7 @@ void symbol__delete(struct symbol *sym)
 			annotation__exit(notes);
 		}
 	}
+	free(sym->modules);
 	free(((void *)sym) - symbol_conf.priv_size);
 }
@@ -716,12 +726,37 @@ static bool symbol__is_idle(const char *name)
 	return strlist__has_entry(idle_symbols_list, name);
 }
-static int map__process_kallsym_symbol(void *arg, const char *name,
+struct process_kallsym_symbol_arg {
+	struct dso *dso;
+	struct rb_root *modules;
+	int seen_end;
+};
+
+static int map__process_kallsym_symbol(void *arg_, const char *name,
 				       char type, u64 start)
 {
 	struct symbol *sym;
-	struct dso *dso = arg;
+	struct process_kallsym_symbol_arg *arg = arg_;
+	struct dso *dso = arg->dso;
 	struct rb_root_cached *root = &dso->symbols;
+	struct rb_root *modules = arg->modules;
+	char *module;
+	const char *modulep;
+	int counting = 1;
+	size_t nmods = 0;
+	char **mods = NULL;
+	char **modp = NULL;
+
+	/*
+	 * Split off the modules part.
+	 */
+	if ((module = strchr(name, '\t')) != NULL) {
+		*module = 0;
+		module++;
+	}
+
+	if (strcmp(name, "_end") == 0)
+		arg->seen_end = 1;
 	if (!symbol_type__filter(type))
 		return 0;
@@ -731,18 +766,88 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
 		return 0;
 	/*
-	 * module symbols are not sorted so we add all
-	 * symbols, setting length to 0, and rely on
-	 * symbols__fixup_end() to fix it up.
+	 * non-builtin module symbols are not sorted so we add all symbols,
+	 * setting length to 0, and rely on symbols__fixup_end() to fix it up.
 	 */
 	sym = symbol__new(start, 0, kallsyms2elf_binding(type), kallsyms2elf_type(type), name);
 	if (sym == NULL)
 		return -ENOMEM;
+
+	sym->built_in = !arg->seen_end;
+
+	/*
+	 * Pass over the modules list twice: once to count the number of
+	 * modules this symbol is part of and allocate an array to store their
+	 * names, then again to fill it out.
+	 *
+	 * Arguably inefficient, due to one allocation per built-in symbol, even
+	 * though many symbols will have the same mods array. In practice,
+	 * it's just too small a waste to matter. The module names are pointers
+	 * into the machine->modules rb-tree (lazily populated here).
+	 */
+
+fill:
+	modulep = module;
+	while (modulep && (modulep = strchr(modulep, '[')) != NULL) {
+		struct module_info *mi;
+		const char *end_bra = strchr(modulep, ']');
+
+		modulep++;
+		if (end_bra == NULL || end_bra <= modulep)
+			continue;
+
+		if (counting) {
+			nmods++;
+			continue;
+		}
+
+		/*
+		 * Fill-out phase.
+		 */
+
+		*modp = strndup(modulep, end_bra - modulep);
+		if (*modp == NULL) {
+			free(mods);
+			return -ENOMEM;
+		}
+
+		mi = find_module(*modp, modules);
+		if (!mi) {
+			mi = zalloc(sizeof(struct module_info));
+
+			if (!mi) {
+				free (*modp);
+				free (mods);
+				return -ENOMEM;
+			}
+			mi->name = *modp;
+		}
+		else {
+			free(*modp);
+			*modp = mi->name;
+		}
+
+		modp++;
+	}
+
+	if (counting && nmods > 0) {
+		mods = calloc(nmods + 1, sizeof (char *));
+		if (mods == NULL)
+			return -ENOMEM;
+		modp = mods;
+
+		counting = 0;
+		goto fill;
+	}
+
+	sym->modules = mods;
+
 	/*
 	 * We will pass the symbols to the filter later, in
-	 * map__split_kallsyms, when we have split the maps per module
+	 * map__split_kallsyms, when we have split the maps per
+	 * (non-built-in) module
 	 */
-	__symbols__insert(root, sym, !strchr(name, '['));
+	__symbols__insert(root, sym, !arg->seen_end);
 	return 0;
 }
@@ -752,9 +857,11 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
  * so that we can in the next step set the symbol ->end address and then
  * call kernel_maps__split_kallsyms.
  */
-static int dso__load_all_kallsyms(struct dso *dso, const char *filename)
+static int dso__load_all_kallsyms(struct dso *dso, const char *filename,
+				  struct rb_root *modules)
 {
-	return kallsyms__parse(filename, dso, map__process_kallsym_symbol);
+	struct process_kallsym_symbol_arg arg = {dso, modules, 0};
+	return kallsyms__parse(filename, &arg, map__process_kallsym_symbol);
 }
 static int maps__split_kallsyms_for_kcore(struct maps *kmaps, struct dso *dso)
@@ -766,22 +873,14 @@ static int maps__split_kallsyms_for_kcore(struct maps *kmaps, struct dso *dso)
 	struct rb_root_cached *root = &dso->symbols;
 	struct rb_node *next = rb_first_cached(root);
-	if (!kmaps)
-		return -1;
-
 	*root = RB_ROOT_CACHED;
 	while (next) {
-		char *module;
-
 		pos = rb_entry(next, struct symbol, rb_node);
 		next = rb_next(&pos->rb_node);
 		rb_erase_cached(&pos->rb_node, &old_root);
 		RB_CLEAR_NODE(&pos->rb_node);
-		module = strchr(pos->name, '\t');
-		if (module)
-			*module = '\0';
 		curr_map = maps__find(kmaps, pos->start);
@@ -830,19 +929,19 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
 	x86_64 = machine__is(machine, "x86_64");
 	while (next) {
-		char *module;
-
 		pos = rb_entry(next, struct symbol, rb_node);
 		next = rb_next(&pos->rb_node);
-		module = strchr(pos->name, '\t');
-		if (module) {
+		if (!pos->built_in && pos->modules) {
 			if (!symbol_conf.use_modules)
 				goto discard_symbol;
-			*module++ = '\0';
-
-			if (strcmp(curr_map->dso->short_name, module)) {
+			/*
+			 * Non-built-in symbols can only be in one module at
+			 * once.
+			 */
+			assert(pos->modules[1] == NULL);
+			if (strcmp(curr_map->dso->short_name, pos->modules[0])) {
 				if (curr_map != initial_map &&
 				    dso->kernel == DSO_SPACE__KERNEL_GUEST &&
 				    machine__is_default_guest(machine)) {
@@ -856,12 +955,12 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
 					dso__set_loaded(curr_map->dso);
 				}
-				curr_map = maps__find_by_name(kmaps, module);
+				curr_map = maps__find_by_name(kmaps, pos->modules[0]);
 				if (curr_map == NULL) {
 					pr_debug("%s/proc/{kallsyms,modules} "
 						 "inconsistency while looking "
 						 "for \"%s\" module!\n",
-						 machine->root_dir, module);
+						 machine->root_dir, pos->modules[0]);
 					curr_map = initial_map;
 					goto discard_symbol;
 				}
@@ -971,12 +1070,6 @@ bool symbol__restricted_filename(const char *filename,
 	return restricted;
 }
-struct module_info {
-	struct rb_node rb_node;
-	char *name;
-	u64 start;
-};
-
 static void add_module(struct module_info *mi, struct rb_root *modules)
 {
 	struct rb_node **p = &modules->rb_node;
@@ -995,7 +1088,7 @@ static void add_module(struct module_info *mi, struct rb_root *modules)
 	rb_insert_color(&mi->rb_node, modules);
 }
-static void delete_modules(struct rb_root *modules)
+void modules__delete_modules(struct rb_root *modules)
 {
 	struct module_info *mi;
 	struct rb_node *next = rb_first(modules);
@@ -1060,7 +1153,7 @@ static int read_proc_modules(const char *filename, struct rb_root *modules)
 		return -1;
 	if (modules__parse(filename, modules, __read_proc_modules)) {
-		delete_modules(modules);
+		modules__delete_modules(modules);
 		return -1;
 	}
@@ -1101,9 +1194,9 @@ int compare_proc_modules(const char *from, const char *to)
 	if (!from_node && !to_node)
 		ret = 0;
-	delete_modules(&to_modules);
+	modules__delete_modules(&to_modules);
 out_delete_from:
-	delete_modules(&from_modules);
+	modules__delete_modules(&from_modules);
 	return ret;
 }
@@ -1133,7 +1226,7 @@ static int do_validate_kcore_modules(const char *filename, struct maps *kmaps)
 		}
 	}
 out:
-	delete_modules(&modules);
+	modules__delete_modules(&modules);
 	return err;
 }
@@ -1467,18 +1560,20 @@ static int kallsyms__delta(struct kmap *kmap, const char *filename, u64 *delta)
 }
 int __dso__load_kallsyms(struct dso *dso, const char *filename,
-			 struct map *map, bool no_kcore)
+			 struct map *map, struct rb_root *modules,
+			 bool no_kcore)
 {
 	struct kmap *kmap = map__kmap(map);
 	u64 delta = 0;
-	if (symbol__restricted_filename(filename, "/proc/kallsyms"))
+	if (symbol__restricted_filename(filename, "/proc/kallsyms") ||
+	    symbol__restricted_filename(filename, "/proc/kallmodsyms"))
 		return -1;
 	if (!kmap || !kmap->kmaps)
 		return -1;
-	if (dso__load_all_kallsyms(dso, filename) < 0)
+	if (dso__load_all_kallsyms(dso, filename, modules) < 0)
 		return -1;
 	if (kallsyms__delta(kmap, filename, &delta))
@@ -1499,9 +1594,9 @@ int __dso__load_kallsyms(struct dso *dso, const char *filename,
 }
 int dso__load_kallsyms(struct dso *dso, const char *filename,
-		       struct map *map)
+		       struct map *map, struct rb_root *modules)
 {
-	return __dso__load_kallsyms(dso, filename, map, false);
+	return __dso__load_kallsyms(dso, filename, map, modules, false);
 }
 static int dso__load_perf_map(const char *map_path, struct dso *dso)
@@ -1814,12 +1909,13 @@ int dso__load(struct dso *dso, struct map *map)
 		dso->symtab_type == DSO_BINARY_TYPE__GUEST_KMODULE_COMP;
 	if (dso->kernel && !kmod) {
+		machine = map__kmaps(map)->machine;
+
 		if (dso->kernel == DSO_SPACE__KERNEL)
-			ret = dso__load_kernel_sym(dso, map);
+			ret = dso__load_kernel_sym(dso, map, &machine->modules);
 		else if (dso->kernel == DSO_SPACE__KERNEL_GUEST)
 			ret = dso__load_guest_kernel_sym(dso, map);
-		machine = map__kmaps(map)->machine;
 		if (machine__is(machine, "x86_64"))
 			machine__map_x86_64_entry_trampolines(machine, dso);
 		goto out;
@@ -2220,7 +2316,8 @@ static char *dso__find_kallsyms(struct dso *dso, struct map *map)
 	return strdup(path);
 }
-static int dso__load_kernel_sym(struct dso *dso, struct map *map)
+static int dso__load_kernel_sym(struct dso *dso, struct map *map,
+				struct rb_root *modules)
 {
 	int err;
 	const char *kallsyms_filename = NULL;
@@ -2282,7 +2379,7 @@ static int dso__load_kernel_sym(struct dso *dso, struct map *map)
 	kallsyms_filename = kallsyms_allocated_filename;
 do_kallsyms:
-	err = dso__load_kallsyms(dso, kallsyms_filename, map);
+	err = dso__load_kallsyms(dso, kallsyms_filename, map, modules);
 	if (err > 0)
 		pr_debug("Using %s for symbols\n", kallsyms_filename);
 	free(kallsyms_allocated_filename);
@@ -2323,11 +2420,11 @@ static int dso__load_guest_kernel_sym(struct dso *dso, struct map *map)
 		if (!kallsyms_filename)
 			return -1;
 	} else {
-		sprintf(path, "%s/proc/kallsyms", machine->root_dir);
+		sprintf(path, "%s/proc/kallmodsyms", machine->root_dir);
 		kallsyms_filename = path;
 	}
-	err = dso__load_kallsyms(dso, kallsyms_filename, map);
+	err = dso__load_kallsyms(dso, kallsyms_filename, map, &machine->modules);
 	if (err > 0)
 		pr_debug("Using %s for symbols\n", kallsyms_filename);
 	if (err > 0 && !dso__is_kcore(dso)) {
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 0b893dcc8ea6..9ca218e09acf 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -66,6 +66,11 @@ struct symbol {
 	u8 annotate2:1;
 	/** Architecture specific. Unused except on PPC where it holds st_other. */
 	u8 arch_sym;
+	/** Null-terminated array of pointers to names of containing modules in the
+	    modules red-black tree. May be NULL for none. */
+	char **modules;
+	/** Set if this symbol is built in to the core kernel. */
+	int built_in;
 	/** The name of length namelen associated with the symbol. */
 	char name[];
 };
@@ -137,8 +142,9 @@ int dso__load_vmlinux(struct dso *dso, struct map *map,
 		      const char *vmlinux, bool vmlinux_allocated);
 int dso__load_vmlinux_path(struct dso *dso, struct map *map);
 int __dso__load_kallsyms(struct dso *dso, const char *filename, struct map *map,
-			 bool no_kcore);
-int dso__load_kallsyms(struct dso *dso, const char *filename, struct map *map);
+			 struct rb_root *modules, bool no_kcore);
+int dso__load_kallsyms(struct dso *dso, const char *filename, struct map *map,
+		       struct rb_root *modules);
 void dso__insert_symbol(struct dso *dso, struct symbol *sym);
@@ -161,6 +167,8 @@ int sysfs__read_build_id(const char *filename, struct build_id *bid);
 int modules__parse(const char *filename, void *arg,
 		   int (*process_module)(void *arg, const char *name,
 					 u64 start, u64 size));
+void modules__delete_modules(struct rb_root *modules);
+
 int filename__read_debuglink(const char *filename, char *debuglink,
 			     size_t size);