diff mbox series

[v2,dwarves,2/2] dwarf_loader: use libdw__lock for

Message ID 20241114155822.898466-3-alan.maguire@oracle.com (mailing list archive)
State New
Headers show
Series Check DW_OP_[GNU_]entry_value for possible parameter matching | expand

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Alan Maguire Nov. 14, 2024, 3:58 p.m. UTC
Eduard noticed [1] intermittent segmentation faults triggered by caching
done internally in dwarf_getlocation(s).  A binary tree of location
information is cached in the CU, and if multiple threads access it
concurrently we can get segmentation faults.  It is possible to
compile elfutils with experimental thread-safe support, but safer for
now to add locking to pahole.

No additional overhead in pahole encoding was observed
as a result of these changes.

Reported-by: Eduard Zingerman <eddyz87@gmail.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Suggested-by: Jiri Olsa <olsajiri@gmail.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>

[1] https://lore.kernel.org/dwarves/080794545d8eb3df3d6eba90ac621111ab7171f5.camel@gmail.com/
---
 dwarf_loader.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

Comments

Eduard Zingerman Nov. 14, 2024, 6:01 p.m. UTC | #1
On Thu, 2024-11-14 at 15:58 +0000, Alan Maguire wrote:
> Eduard noticed [1] intermittent segmentation faults triggered by caching
> done internally in dwarf_getlocation(s).  A binary tree of location
> information is cached in the CU, and if multiple threads access it
> concurrently we can get segmentation faults.  It is possible to
> compile elfutils with experimental thread-safe support, but safer for
> now to add locking to pahole.
> 
> No additional overhead in pahole encoding was observed
> as a result of these changes.
> 
> Reported-by: Eduard Zingerman <eddyz87@gmail.com>
> Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
> Suggested-by: Jiri Olsa <olsajiri@gmail.com>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> 
> [1] https://lore.kernel.org/dwarves/080794545d8eb3df3d6eba90ac621111ab7171f5.camel@gmail.com/
> ---

Looking at the libdw code, both dwarf_getlocations() and
dwarf_getlocation() might end up calling __libdw_intern_expression(),
where race condition occurs. So I think that this lock positioning
should be safe. In theory, the locking could be pushed down,
to only occur around __dwarf_getlocations / dwarf_getlocation,
but that would complicate the code a bit.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>

[...]
diff mbox series

Patch

diff --git a/dwarf_loader.c b/dwarf_loader.c
index bc862b5..9299c1f 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -450,7 +450,14 @@  static bool attr_type(Dwarf_Die *die, uint32_t attr_name, Dwarf_Off *offset)
 static int attr_location(Dwarf_Die *die, Dwarf_Op **expr, size_t *exprlen)
 {
 	Dwarf_Attribute attr;
+	int ret = 1;
+
 	if (dwarf_attr(die, DW_AT_location, &attr) != NULL) {
+		/* use libdw__lock as dwarf_getlocation(s) has concurrency
+		 * issues when libdw is not compiled with experimental
+		 * --enable-thread-safety
+		 */
+		pthread_mutex_lock(&libdw__lock);
 		if (dwarf_getlocation(&attr, expr, exprlen) == 0) {
 			/* DW_OP_addrx needs additional lookup for real addr. */
 			if (*exprlen != 0 && expr[0]->atom == DW_OP_addrx) {
@@ -462,11 +469,12 @@  static int attr_location(Dwarf_Die *die, Dwarf_Op **expr, size_t *exprlen)
 
 				expr[0]->number = address;
 			}
-			return 0;
+			ret = 0;
 		}
+		pthread_mutex_unlock(&libdw__lock);
 	}
 
-	return 1;
+	return ret;
 }
 
 /* The struct dwarf_tag has a fixed size while the 'struct tag' is just the base
@@ -1193,6 +1201,10 @@  static int parameter__reg(Dwarf_Attribute *attr, int expected_reg)
 	int loc_num = -1;
 	int ret = -1;
 
+	/* use libdw__lock as dwarf_getlocation(s) has concurrency issues
+	 * when libdw is not compiled with experimental --enable-thread-safety
+	 */
+	pthread_mutex_lock(&libdw__lock);
 	while ((offset = __dwarf_getlocations(attr, offset, &base, &start, &end, &expr, &exprlen)) > 0) {
 		loc_num++;
 
@@ -1228,6 +1240,7 @@  static int parameter__reg(Dwarf_Attribute *attr, int expected_reg)
 		}
 	}
 out:
+	pthread_mutex_unlock(&libdw__lock);
 	return ret;
 }