From patchwork Thu Aug 29 17:42:30 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrii Nakryiko
X-Patchwork-Id: 13783518
From: Andrii Nakryiko
To: bpf@vger.kernel.org
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, adobriyan@gmail.com,
 shakeel.butt@linux.dev, hannes@cmpxchg.org, ak@linux.intel.com,
 osandov@osandov.com, song@kernel.org, jannh@google.com,
 linux-fsdevel@vger.kernel.org, willy@infradead.org,
 Andrii Nakryiko, Eduard Zingerman
Subject: [PATCH v7 bpf-next 08/10] bpf: decouple stack_map_get_build_id_offset() from perf_callchain_entry
Date: Thu, 29 Aug 2024 10:42:30 -0700
Message-ID: <20240829174232.3133883-9-andrii@kernel.org>
X-Mailer: git-send-email 2.43.5
In-Reply-To: <20240829174232.3133883-1-andrii@kernel.org>
References: <20240829174232.3133883-1-andrii@kernel.org>

Change stack_map_get_build_id_offset(), which is used to convert stack
trace IP addresses into build ID+offset pairs. Right now this function
accepts an array of u64s as an input, and uses an array of
struct bpf_stack_build_id as an output.

This is problematic because the u64 array is coming from
perf_callchain_entry, which is (non-sleepable) RCU protected, so once we
allow sleepable build ID fetching, this all breaks down.

But it's actually pretty easy to make stack_map_get_build_id_offset()
work with an array of struct bpf_stack_build_id as both input and output,
which is what this patch does, eliminating the dependency on
perf_callchain_entry. We require the caller to fill out
bpf_stack_build_id.ip fields (all others can be left uninitialized), and
update them in place as we do build ID resolution.

We make sure to READ_ONCE() and cache locally the current IP value, as we
use it in a few places to find the matching VMA and so on. Given this data
is directly accessible and modifiable by the user's BPF code, we should
make sure to have a consistent view of it.

Reviewed-by: Eduard Zingerman
Signed-off-by: Andrii Nakryiko
---
 kernel/bpf/stackmap.c | 49 +++++++++++++++++++++++++++++--------------
 1 file changed, 33 insertions(+), 16 deletions(-)

diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index 770ae8e88016..6457222b0b46 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -124,8 +124,18 @@ static struct bpf_map *stack_map_alloc(union bpf_attr *attr)
 	return ERR_PTR(err);
 }
 
+/*
+ * Expects all id_offs[i].ip values to be set to correct initial IPs.
+ * They will be subsequently:
+ *   - either adjusted in place to a file offset, if build ID fetching
+ *     succeeds; in this case id_offs[i].build_id is set to correct build ID,
+ *     and id_offs[i].status is set to BPF_STACK_BUILD_ID_VALID;
+ *   - or IP will be kept intact, if build ID fetching failed; in this case
+ *     id_offs[i].build_id is zeroed out and id_offs[i].status is set to
+ *     BPF_STACK_BUILD_ID_IP.
+ */
 static void stack_map_get_build_id_offset(struct bpf_stack_build_id *id_offs,
-					  u64 *ips, u32 trace_nr, bool user)
+					  u32 trace_nr, bool user)
 {
 	int i;
 	struct mmap_unlock_irq_work *work = NULL;
@@ -142,30 +152,28 @@ static void stack_map_get_build_id_offset(struct bpf_stack_build_id *id_offs,
 		/* cannot access current->mm, fall back to ips */
 		for (i = 0; i < trace_nr; i++) {
 			id_offs[i].status = BPF_STACK_BUILD_ID_IP;
-			id_offs[i].ip = ips[i];
 			memset(id_offs[i].build_id, 0, BUILD_ID_SIZE_MAX);
 		}
 		return;
 	}
 
 	for (i = 0; i < trace_nr; i++) {
-		if (range_in_vma(prev_vma, ips[i], ips[i])) {
+		u64 ip = READ_ONCE(id_offs[i].ip);
+
+		if (range_in_vma(prev_vma, ip, ip)) {
 			vma = prev_vma;
-			memcpy(id_offs[i].build_id, prev_build_id,
-			       BUILD_ID_SIZE_MAX);
+			memcpy(id_offs[i].build_id, prev_build_id, BUILD_ID_SIZE_MAX);
 			goto build_id_valid;
 		}
-		vma = find_vma(current->mm, ips[i]);
+		vma = find_vma(current->mm, ip);
 		if (!vma || build_id_parse_nofault(vma, id_offs[i].build_id, NULL)) {
 			/* per entry fall back to ips */
 			id_offs[i].status = BPF_STACK_BUILD_ID_IP;
-			id_offs[i].ip = ips[i];
 			memset(id_offs[i].build_id, 0, BUILD_ID_SIZE_MAX);
 			continue;
 		}
 build_id_valid:
-		id_offs[i].offset = (vma->vm_pgoff << PAGE_SHIFT) + ips[i]
-			- vma->vm_start;
+		id_offs[i].offset = (vma->vm_pgoff << PAGE_SHIFT) + ip - vma->vm_start;
 		id_offs[i].status = BPF_STACK_BUILD_ID_VALID;
 		prev_vma = vma;
 		prev_build_id = id_offs[i].build_id;
@@ -216,7 +224,7 @@ static long __bpf_get_stackid(struct bpf_map *map,
 	struct bpf_stack_map *smap = container_of(map, struct bpf_stack_map, map);
 	struct stack_map_bucket *bucket, *new_bucket, *old_bucket;
 	u32 skip = flags & BPF_F_SKIP_FIELD_MASK;
-	u32 hash, id, trace_nr, trace_len;
+	u32 hash, id, trace_nr, trace_len, i;
 	bool user = flags & BPF_F_USER_STACK;
 	u64 *ips;
 	bool hash_matches;
@@ -238,15 +246,18 @@ static long __bpf_get_stackid(struct bpf_map *map,
 		return id;
 
 	if (stack_map_use_build_id(map)) {
+		struct bpf_stack_build_id *id_offs;
+
 		/* for build_id+offset, pop a bucket before slow cmp */
 		new_bucket = (struct stack_map_bucket *)
 			pcpu_freelist_pop(&smap->freelist);
 		if (unlikely(!new_bucket))
 			return -ENOMEM;
 		new_bucket->nr = trace_nr;
-		stack_map_get_build_id_offset(
-			(struct bpf_stack_build_id *)new_bucket->data,
-			ips, trace_nr, user);
+		id_offs = (struct bpf_stack_build_id *)new_bucket->data;
+		for (i = 0; i < trace_nr; i++)
+			id_offs[i].ip = ips[i];
+		stack_map_get_build_id_offset(id_offs, trace_nr, user);
 		trace_len = trace_nr * sizeof(struct bpf_stack_build_id);
 		if (hash_matches && bucket->nr == trace_nr &&
 		    memcmp(bucket->data, new_bucket->data, trace_len) == 0) {
@@ -445,10 +456,16 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
 	copy_len = trace_nr * elem_size;
 
 	ips = trace->ip + skip;
-	if (user && user_build_id)
-		stack_map_get_build_id_offset(buf, ips, trace_nr, user);
-	else
+	if (user && user_build_id) {
+		struct bpf_stack_build_id *id_offs = buf;
+		u32 i;
+
+		for (i = 0; i < trace_nr; i++)
+			id_offs[i].ip = ips[i];
+		stack_map_get_build_id_offset(buf, trace_nr, user);
+	} else {
 		memcpy(buf, ips, copy_len);
+	}
 
 	if (size > copy_len)
 		memset(buf + copy_len, 0, size - copy_len);
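
Illustration only, not part of the patch: the in-place contract described in
the commit message can be sketched in plain userspace C. Everything below is
a simplified stand-in and an assumption made for the example: struct id_off
mirrors the layout of struct bpf_stack_build_id, struct fake_mapping and
find_mapping() are hypothetical replacements for a VMA and find_vma(), the
STATUS_* values mirror BPF_STACK_BUILD_ID_*, and a volatile read
approximates READ_ONCE().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BUILD_ID_SIZE 20	/* mirrors the kernel's 20-byte build ID */

/* Stand-ins for BPF_STACK_BUILD_ID_VALID / BPF_STACK_BUILD_ID_IP. */
enum { STATUS_VALID = 1, STATUS_IP = 2 };

/* Same shape as struct bpf_stack_build_id: ip and offset share storage. */
struct id_off {
	int32_t status;
	unsigned char build_id[BUILD_ID_SIZE];
	union {
		uint64_t offset;
		uint64_t ip;
	};
};

/* Hypothetical stand-in for a file-backed VMA with a known build ID. */
struct fake_mapping {
	uint64_t start, end;	/* covers [start, end) */
	uint64_t file_off;	/* plays the role of vm_pgoff << PAGE_SHIFT */
	unsigned char build_id[BUILD_ID_SIZE];
};

static const struct fake_mapping *find_mapping(const struct fake_mapping *maps,
					       size_t nr_maps, uint64_t ip)
{
	for (size_t i = 0; i < nr_maps; i++)
		if (ip >= maps[i].start && ip < maps[i].end)
			return &maps[i];
	return NULL;
}

/* Caller side: only .ip has to be initialized before resolution. */
static void fill_ips(struct id_off *id_offs, const uint64_t *ips, uint32_t nr)
{
	for (uint32_t i = 0; i < nr; i++)
		id_offs[i].ip = ips[i];
}

/* Resolver side: one array serves as both input and output. */
static void resolve_in_place(struct id_off *id_offs, uint32_t nr,
			     const struct fake_mapping *maps, size_t nr_maps)
{
	for (uint32_t i = 0; i < nr; i++) {
		/* Read the IP exactly once and work on the local copy. */
		uint64_t ip = *(const volatile uint64_t *)&id_offs[i].ip;
		const struct fake_mapping *m = find_mapping(maps, nr_maps, ip);

		if (!m) {
			/* Fallback: keep the IP, zero the build ID. */
			id_offs[i].status = STATUS_IP;
			memset(id_offs[i].build_id, 0, BUILD_ID_SIZE);
			continue;
		}
		memcpy(id_offs[i].build_id, m->build_id, BUILD_ID_SIZE);
		/* Overwrites .ip, since it shares storage with .offset. */
		id_offs[i].offset = m->file_off + ip - m->start;
		id_offs[i].status = STATUS_VALID;
	}
}
```

With a single mapping covering [0x1000, 0x3000) at file offset 0x4000, an
entry filled with IP 0x1234 would come back as STATUS_VALID with offset
0x4234, while an entry with IP 0x9000 would come back as STATUS_IP with its
IP unchanged. Because ip and offset share storage, the local copy of the IP
has to be taken before the entry is written.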