[v6] scripts/link-vmlinux.sh: Add alias to duplicate symbols for kallsyms

In the kernel, symbols in the core image or in modules often share
identical names.
While this poses no problem for the kernel binary itself, it makes
tracing or probing a specific symbol difficult with tools such as kprobe,
because the name alone does not identify which symbol is meant.

This patch introduces a solution, referred to as "kas_alias".
During the kernel build, all objects, both core kernel components and
modules, are scanned to collect symbol information.
Then, for every duplicate symbol, an alias is added whose name is the
original name plus a suffix derived from the source file and line
number.
These aliases make it possible to refer to each duplicate unambiguously.

The procedure is as follows.
During the kernel build, the kernel image and all module object files
are searched for symbols that share the same name.
For the kernel core image, a new nm data file is created, to which an
alias for each duplicate symbol is added.
For module objects, the ELF symbol table is modified to add an alias for
each duplicate symbol.
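
As an illustration of the duplicate search described above, here is a
minimal sketch, not the code in scripts/kas_alias.py. It assumes input
lines in the usual "address type name" layout printed by nm; the helper
names parse_nm_lines and find_duplicates are hypothetical.

```python
from collections import defaultdict


def parse_nm_lines(lines):
    """Parse 'address type name' lines; skip anything else (e.g. undefined symbols)."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue
        address, sym_type, name = parts
        symbols.append((int(address, 16), sym_type, name))
    return symbols


def find_duplicates(symbols):
    """Return {name: [(address, type), ...]} for names seen more than once."""
    by_name = defaultdict(list)
    for address, sym_type, name in symbols:
        by_name[name].append((address, sym_type))
    return {name: entries for name, entries in by_name.items() if len(entries) > 1}


nm_output = [
    "ffffffff963cd2a0 t device_show",
    "ffffffff96454b60 t device_show",
    "ffffffff966e1700 T device_show_ulong",
    "                 U printk",
]
duplicates = find_duplicates(parse_nm_lines(nm_output))
for name, entries in duplicates.items():
    print(name, [hex(address) for address, _ in entries])
```

Only names that occur more than once are returned, so unique symbols
such as device_show_ulong would receive no alias.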

Taking the symbol "device_show" as an example, you can expect output
like the following:

 ~ # cat /proc/kallsyms | grep " device_show"
 ffffffff963cd2a0 t device_show
 ffffffff963cd2a0 t device_show@drivers_pci_pci_sysfs_c_49
 ffffffff96454b60 t device_show
 ffffffff96454b60 t device_show@drivers_virtio_virtio_c_16
 ffffffff966e1700 T device_show_ulong
 ffffffff966e1740 T device_show_int
 ffffffff966e1770 T device_show_bool
 ffffffffc04e10a0 t device_show [mmc_core]
 ffffffffc04e10a0 t device_show@drivers_mmc_core_sdio_bus_c_45  [mmc_core]
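
The alias names in the output above can be reproduced with a small
sketch like the following. The normalization rule used here (every
character outside [A-Za-z0-9_] becomes '_') is an assumption inferred
from the sample output, and make_alias is a hypothetical helper, not a
function taken from the patch.

```python
import re


def make_alias(name, source_file, line_number):
    """Build 'name@normalized_path_line' from a symbol name and its source location."""
    # Assumed rule: replace anything that is not a letter, digit or '_' with '_'.
    tag = re.sub(r"[^A-Za-z0-9_]", "_", source_file)
    return f"{name}@{tag}_{line_number}"


print(make_alias("device_show", "drivers/pci/pci-sysfs.c", 49))
print(make_alias("device_show", "drivers/mmc/core/sdio_bus.c", 45))
```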

Signed-off-by: Alessandro Carminati (Red Hat) <alessandro.carminati@gmail.com>

NOTE1:
The symbol name duplication caused by compat_binfmt_elf.c including
binfmt_elf.c is a corner case that is inherently challenging for the
addr2line approach.
Attempting to conceal this limitation would be counterproductive.

compat_binfmt_elf.c directly includes binfmt_elf.c, so for every function
and data object defined through that inclusion, addr2line can only
report binfmt_elf.c as the source.

My position is that, rather than producing a more complicated pipeline
to handle this corner case, it is better to fix the compat_binfmt_elf.c
anomaly.

This patch does not deal with the two potentially problematic symbols
defined by compat_binfmt_elf.c.
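
To make the limitation concrete, the sketch below shows how two
distinct symbols end up with the same alias when they resolve to the
same file and line. The symbol name example_fn, the addresses and the
line number are made up for illustration, and find_alias_collisions is
a hypothetical helper.

```python
from collections import defaultdict


def find_alias_collisions(resolved):
    """Group (address, name, file, line) records by alias key; keep keys with >1 address."""
    groups = defaultdict(set)
    for address, name, source_file, line in resolved:
        groups[(name, source_file, line)].add(address)
    return {key: sorted(addrs) for key, addrs in groups.items() if len(addrs) > 1}


# Illustrative records: two copies of the same function, both attributed
# to fs/binfmt_elf.c, plus an unrelated symbol.
resolved = [
    (0x1000, "example_fn", "fs/binfmt_elf.c", 100),
    (0x2000, "example_fn", "fs/binfmt_elf.c", 100),
    (0x3000, "device_show", "drivers/pci/pci-sysfs.c", 49),
]
collisions = find_alias_collisions(resolved)
for (name, source_file, line), addresses in collisions.items():
    print(name, source_file, line, [hex(a) for a in addresses])
```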

NOTE2:
The current implementation does not offer a solution for out-of-tree modules.
My stance is that these modules fall outside the scope, but I welcome any
comments or feedback regarding this matter.

Changes from v1:
* Integrated changes requested by Masami to exclude symbols with prefixes
  "_cfi" and "_pfx".
* Introduced a small framework to handle patterns that need to be excluded
  from the alias production.
* Excluded other symbols using the framework.
* Introduced the ability to discriminate between text and data symbols.
* Added two new config symbols in this version:
  CONFIG_KALLSYMS_ALIAS_DATA, which enables aliases for data symbols, and
  CONFIG_KALLSYMS_ALIAS_DATA_ALL, which disables all filters and provides
  an alias for each duplicated symbol.
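
A pattern-based exclusion framework of the kind listed above could look
roughly like this sketch. The regular expressions assume the excluded
symbols are spelled "__cfi_<name>" and "__pfx_<name>"; both the
patterns and the needs_alias helper are illustrative assumptions, not
the filter shipped with the patch.

```python
import re

# Assumed spellings of the excluded prefixes; adjust to the real symbol names.
EXCLUDE_PATTERNS = [
    re.compile(r"^__cfi_"),
    re.compile(r"^__pfx_"),
]


def needs_alias(name, patterns=EXCLUDE_PATTERNS):
    """Return False when the symbol name matches any exclusion pattern."""
    return not any(pattern.search(name) for pattern in patterns)


candidates = ["device_show", "__pfx_device_show", "__cfi_device_show"]
print([name for name in candidates if needs_alias(name)])
```

Keeping the patterns in a list means that excluding another family of
symbols only requires adding one more entry.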

https://lore.kernel.org/all/20230711151925.1092080-1-alessandro.carminati@gmail.com/

Changes from v2:
* Alias tags are created by querying DWARF information from the vmlinux.
* The filename + line number is normalized and appended to the original
  name.
* The tag begins with '@' to indicate the symbol source.
* Not a change, but worth mentioning: since the alias is added to the
  existing list, the old duplicated name is preserved, and the livepatch
  way of dealing with duplicates is maintained.
* Acknowledging the existence of scenarios where inlined functions
  declared in header files may result in multiple copies due to compiler
  behavior, though it is not actionable as it does not pose an operational
  issue.
* Highlighting a single exception where the same name refers to different
  functions: the case of "compat_binfmt_elf.c," which directly includes
  "binfmt_elf.c" producing identical function copies in two separate
  modules.

https://lore.kernel.org/all/20230714150326.1152359-1-alessandro.carminati@gmail.com/

Changes from v3:
* kas_alias was rewritten in Python to create a more concise and
  maintainable codebase.
* The automatic discovery of vmlinux and addr2line previously performed
  by kas_alias has been replaced with explicit command-line switches for
  specifying them.
* addr2line has been added into the main Makefile.
* A new command-line switch has been introduced, enabling users to extend
  the alias to global data names.

https://lore.kernel.org/all/20230828080423.3539686-1-alessandro.carminati@gmail.com/

Changes from v4:
* Fixed the O=<build dir> build issue
* The tool halts execution upon encountering major issues, thereby ensuring
  the pipeline is interrupted.
* Added a command-line option to specify the source directory.
* Minor code adjustments.
* Tested on mips32 and i386

https://lore.kernel.org/all/20230919193948.465340-1-alessandro.carminati@gmail.com/

Changes from v5:
* Regex filter extended to all symbols
* Alias creation extended to module objects
* Code cleaned and commented
* kas_alias verbose execution via KAS_ALIAS_DEBUG env variable
* CONFIG_KALLSYMS_ALIAS_SRCLINE selects KBUILD_BUILTIN to ensure no races
  during modules build
* Tested on x86_64, aarch64 and i386

https://lore.kernel.org/all/20230927173516.1456594-1-alessandro.carminati@gmail.com/
---
 Makefile                  |  14 +-
 init/Kconfig              |  22 ++
 scripts/Makefile.modfinal |  10 +-
 scripts/kas_alias.py      | 545 ++++++++++++++++++++++++++++++++++++++
 scripts/link-vmlinux.sh   |  26 +-
 5 files changed, 613 insertions(+), 4 deletions(-)
 create mode 100755 scripts/kas_alias.py

Message ID	20231024201157.748254-1-alessandro.carminati@gmail.com
State	Superseded
From	"Alessandro Carminati (Red Hat)" <alessandro.carminati@gmail.com>
To	Masahiro Yamada <masahiroy@kernel.org>
Cc	Masami Hiramatsu <mhiramat@kernel.org>, Steven Rostedt <rostedt@goodmis.org>,
	Daniel Bristot de Oliveira <bristot@kernel.org>, Josh Poimboeuf <jpoimboe@kernel.org>,
	Luis Chamberlain <mcgrof@kernel.org>, Nathan Chancellor <nathan@kernel.org>,
	Nick Desaulniers <ndesaulniers@google.com>, Nicolas Schier <nicolas@fjasle.eu>,
	Alexander Lobakin <aleksander.lobakin@intel.com>, Nick Alcock <nick.alcock@oracle.com>,
	Kris Van Hees <kris.van.hees@oracle.com>, Eugene Loh <eugene.loh@oracle.com>,
	Francis Laniel <flaniel@linux.microsoft.com>, Viktor Malik <vmalik@redhat.com>,
	Petr Mladek <pmladek@suse.com>, Tom Rix <trix@redhat.com>,
	Alessandro Carminati <alessandro.carminati@gmail.com>,
	linux-kbuild@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-trace-kernel@vger.kernel.org, llvm@lists.linux.dev
Date	Tue, 24 Oct 2023 20:11:57 +0000
