diff mbox series

symbols: discard stray file symbols

Message ID 2412a7a0-bdcd-4647-8ea2-8d2a927dcde3@suse.com (mailing list archive)
State New
Headers show
Series symbols: discard stray file symbols | expand

Commit Message

Jan Beulich April 16, 2025, 9 a.m. UTC
By observation GNU ld 2.25 may emit file symbols for .data.read_mostly
when linking xen.efi. Due to the nature of file symbols in COFF symbol
tables (see the code comment) the symbols_offsets[] entries for such
symbols would cause assembler warnings regarding value truncation. Of
course the resulting entries would also be both meaningless and useless.
Add a heuristic to get rid of them, really taking effect only when
--all-symbols is specified (otherwise these symbols are discarded
anyway).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
Factor 2 may in principle still be too small: We zap what looks like
real file symbols already in read_symbol(), so table_cnt doesn't really
reflect the number of symbol table entries encountered. It has proven to
work for me in practice though, with still some leeway left.
diff mbox series

Patch

--- a/xen/tools/symbols.c
+++ b/xen/tools/symbols.c
@@ -213,6 +213,16 @@  static int symbol_valid(struct sym_entry
 	if (strstr((char *)s->sym + offset, "_compiled."))
 		return 0;
 
+	/* At least GNU ld 2.25 may emit bogus file symbols referencing a
+	 * section name while linking xen.efi. In COFF symbol tables the
+	 * "value" of file symbols is a link (symbol table index) to the next
+	 * file symbol. Since file (and other) symbols (can) come with one
+	 * (or in principle more) auxiliary symbol table entries, the value in
+	 * this heuristic is bounded to twice the number of symbols we have
+	 * found. See also read_symbol() as to the '?' checked for here. */
+	if (s->sym[0] == '?' && s->sym[1] == '.' && s->addr < table_cnt * 2)
+		return 0;
+
 	return 1;
 }