From patchwork Thu Jan 9 06:18:57 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hou Tao X-Patchwork-Id: 13932080 X-Patchwork-Delegate: bpf@iogearbox.net Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 83CAB2147F8; Thu, 9 Jan 2025 06:07:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736402825; cv=none; b=Ihb92m9jr3nc2buN6Fbgk8vyn20YpKb3tpzG7jweg5DOBM0esCjqZu2EfVzYF5qPfuJEiadvj/ynlyOKUPyXK5D7YZhzfcgtLPwtdtW8apiRt5Pg3fveKfzmhPMvH3neliKDpDNMcsKoh9Yl8vxXV6drqvhAGG4DGtzr8CJh6CM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736402825; c=relaxed/simple; bh=M20n8oEPRFgP6hJG00kr3BZfyGsN6BOq4MAUzZqbdhU=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=HgOOf1XrKoSBGrZkIWX7dzHwGy99gnFlK4lg2QtWtOhlxEk8ZnzGC45z/vfov67Jw1LrsvI/w/WIStGvUq23RYu6PlBFp4wEha6a1ncFIC7Bs5FgtFBmh0hYu+LI7ShctxIXHjT2C9VaspdeHtFyHvr4Ubg3x1+cIIk9WlUAdGw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4YTDnW5lf8z4f3jqr; Thu, 9 Jan 2025 14:06:39 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id CEB381A06DC; Thu, 9 Jan 2025 14:06:54 +0800 (CST) 
Received: from huaweicloud.com (unknown [10.175.124.27]) by APP4 (Coremail) with SMTP id gCh0CgAni196Z39nvD3QAQ--.4010S5; Thu, 09 Jan 2025 14:06:54 +0800 (CST) From: Hou Tao To: bpf@vger.kernel.org, netdev@vger.kernel.org Cc: Martin KaFai Lau , Alexei Starovoitov , Andrii Nakryiko , Eduard Zingerman , Song Liu , Hao Luo , Yonghong Song , Daniel Borkmann , KP Singh , Stanislav Fomichev , Jiri Olsa , John Fastabend , Sebastian Andrzej Siewior , houtao1@huawei.com, xukuohai@huawei.com Subject: [PATCH bpf-next v2 1/5] bpf: Free special fields after unlock in htab_lru_map_delete_node() Date: Thu, 9 Jan 2025 14:18:57 +0800 Message-Id: <20250109061901.2620825-2-houtao@huaweicloud.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20250109061901.2620825-1-houtao@huaweicloud.com> References: <20250109061901.2620825-1-houtao@huaweicloud.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgAni196Z39nvD3QAQ--.4010S5 X-Coremail-Antispam: 1UD129KBjvJXoW7KFyDGw4rKw48Gw1rtw1kKrg_yoW8GF4Upa n5Gay3Ga18ZF1qkayrtF4vgryrCw45Gw47KrW8GFyYy3W7Za4DW3W5GF93KFyaqrWkZrna qrZ0qr98tFyUurDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPFb4IE77IF4wAFF20E14v26rWj6s0DM7CY07I20VC2zVCF04k2 6cxKx2IYs7xG6rWj6s0DM7CIcVAFz4kK6r1j6r18M28IrcIa0xkI8VA2jI8067AKxVWUGw A2048vs2IY020Ec7CjxVAFwI0_Gr0_Xr1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0rcxS w2x7M28EF7xvwVC0I7IYx2IY67AKxVW7JVWDJwA2z4x0Y4vE2Ix0cI8IcVCY1x0267AKxV W8Jr0_Cr1UM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v2 6rxl6s0DM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMc Ij6xIIjxv20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_ Jr0_Gr1lF7xvr2IYc2Ij64vIr41lFIxGxcIEc7CjxVA2Y2ka0xkIwI1lc7CjxVAaw2AFwI 0_GFv_Wryl42xK82IYc2Ij64vIr41l4I8I3I0E4IkC6x0Yz7v_Jr0_Gr1lx2IqxVAqx4xG 67AKxVWUJVWUGwC20s026x8GjcxK67AKxVWUGVWUWwC2zVAF1VAY17CE14v26r4a6rW5MI IYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_Jr0_JF4lIxAIcVC0I7IYx2IY6xkF7I0E 
14v26r4j6F4UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVWUJV W8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnIWIevJa73UjIFyTuYvjxU3cTm DUUUU X-CM-SenderInfo: xkrx3t3r6k3tpzhluzxrxghudrp/ X-Patchwork-Delegate: bpf@iogearbox.net From: Hou Tao When bpf_timer is used in an LRU hash map, calling check_and_free_fields() in htab_lru_map_delete_node() invokes bpf_timer_cancel_and_free() to free the bpf_timer. If the timer is running on another CPU and PREEMPT_RT is enabled, hrtimer_cancel() invokes hrtimer_cancel_wait_running(), which tries to acquire a spin-lock. However, htab_lru_map_delete_node() already holds a raw-spin-lock, so this violates the lockdep rule and may trigger the "BUG: scheduling while atomic" warning. Fix the issue by moving the invocation of check_and_free_fields() out of the bucket lock. Signed-off-by: Hou Tao --- kernel/bpf/hashtab.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c index 40095dda891d3..963cccb01daae 100644 --- a/kernel/bpf/hashtab.c +++ b/kernel/bpf/hashtab.c @@ -824,13 +824,14 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node) hlist_nulls_for_each_entry_rcu(l, n, head, hash_node) if (l == tgt_l) { hlist_nulls_del_rcu(&l->hash_node); - check_and_free_fields(htab, l); bpf_map_dec_elem_count(&htab->map); break; } htab_unlock_bucket(htab, b, tgt_l->hash, flags); + if (l == tgt_l) + check_and_free_fields(htab, l); return l == tgt_l; } From patchwork Thu Jan 9 06:18:58 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hou Tao X-Patchwork-Id: 13932079 X-Patchwork-Delegate: bpf@iogearbox.net Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 83D412147F9; Thu, 9 Jan 2025 06:07:03 +0000 
(UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736402825; cv=none; b=XwJCPMzp5vmLFJZeEYyJFDz4OyIB+aCpWOnUvqlkCUJ0PPAlQXgkpeCPnYdeD8MAoCVN/jlSq7xzIhBwxAqFk86eQtBGWbl/YI/5zpD3PWMSNdgHSGhaBs0J+D2HZ3jB6anOhqgCDp+twelYPWdwOSCXf3pewe+71ALyLwD9kAw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736402825; c=relaxed/simple; bh=nbp0Q7q1gCbD5AWkKWqKPqpK5Z6sb+YCTvS4T7koG80=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=M8U/XLsijrt2zN6Wvan7gmyGwcOWhlLuOL9OsZFEgBbn7XVPd5hTtHjf/UUPNFY9Y2Xdjg9uFc+nud4wca9cawFRfqtoqyhtcePwbvJcwSvwDQKCCG1ceoW5yBElq2APVDLuCoNQWOhZrAuyiV78W5bifO4lKjjdcsTQIDTVUwo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4YTDnQ0YbSz4f3jLp; Thu, 9 Jan 2025 14:06:34 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 79D101A1799; Thu, 9 Jan 2025 14:06:55 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.124.27]) by APP4 (Coremail) with SMTP id gCh0CgAni196Z39nvD3QAQ--.4010S6; Thu, 09 Jan 2025 14:06:55 +0800 (CST) From: Hou Tao To: bpf@vger.kernel.org, netdev@vger.kernel.org Cc: Martin KaFai Lau , Alexei Starovoitov , Andrii Nakryiko , Eduard Zingerman , Song Liu , Hao Luo , Yonghong Song , Daniel Borkmann , KP Singh , Stanislav Fomichev , Jiri Olsa , John Fastabend , Sebastian Andrzej Siewior , houtao1@huawei.com, xukuohai@huawei.com Subject: [PATCH bpf-next v2 
2/5] bpf: Bail out early in __htab_map_lookup_and_delete_elem() Date: Thu, 9 Jan 2025 14:18:58 +0800 Message-Id: <20250109061901.2620825-3-houtao@huaweicloud.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20250109061901.2620825-1-houtao@huaweicloud.com> References: <20250109061901.2620825-1-houtao@huaweicloud.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgAni196Z39nvD3QAQ--.4010S6 X-Coremail-Antispam: 1UD129KBjvJXoW7ZFW5tw1rXry8tw1fGw1xKrg_yoW5JFyxpF Z3KrWxWry8ursIqa4ftw1jkayrJ34jyw48Ka4DJFyrCF13Zryvqw13AF93GFy3Gr92yr4r trZ2qF1fK3y2qrDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUP2b4IE77IF4wAFF20E14v26rWj6s0DM7CY07I20VC2zVCF04k2 6cxKx2IYs7xG6rWj6s0DM7CIcVAFz4kK6r1j6r18M28IrcIa0xkI8VA2jI8067AKxVWUXw A2048vs2IY020Ec7CjxVAFwI0_Xr0E3s1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0rcxS w2x7M28EF7xvwVC0I7IYx2IY67AKxVW7JVWDJwA2z4x0Y4vE2Ix0cI8IcVCY1x0267AKxV W8Jr0_Cr1UM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v2 6rxl6s0DM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMc Ij6xIIjxv20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_ Jr0_Gr1lF7xvr2IYc2Ij64vIr41lFIxGxcIEc7CjxVA2Y2ka0xkIwI1lc7CjxVAaw2AFwI 0_GFv_Wryl42xK82IYc2Ij64vIr41l4I8I3I0E4IkC6x0Yz7v_Jr0_Gr1lx2IqxVAqx4xG 67AKxVWUJVWUGwC20s026x8GjcxK67AKxVWUGVWUWwC2zVAF1VAY17CE14v26r4a6rW5MI IYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_Jr0_JF4lIxAIcVC0I7IYx2IY6xkF7I0E 14v26F4j6r4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Jr 0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8JVW8JrUvcSsGvfC2KfnxnUUI43ZEXa7IU0I3 85UUUUU== X-CM-SenderInfo: xkrx3t3r6k3tpzhluzxrxghudrp/ X-Patchwork-Delegate: bpf@iogearbox.net From: Hou Tao Use a goto statement to bail out early when the target element is not found, instead of using a large else branch to handle the more likely case. This change doesn't affect functionality and simply makes the code cleaner. 
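The early bail-out shape described above can be illustrated outside the kernel. The following is only a minimal userspace sketch under stated assumptions: `struct table`, `table_lock()`, `table_unlock()`, `lookup()` and `lookup_and_delete()` are hypothetical stand-ins invented for illustration, not the hashtab.c code, and the "lock" is just a depth counter so the example stays single-threaded and self-contained.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NSLOTS 4

/* Hypothetical miniature table; it only mimics the control flow. */
struct slot {
	int used;
	int key;
	int value;
};

struct table {
	struct slot slots[NSLOTS];
	int lock_depth;	/* stand-in for a real bucket lock */
};

static void table_lock(struct table *t)
{
	t->lock_depth++;
}

static void table_unlock(struct table *t)
{
	assert(t->lock_depth > 0);
	t->lock_depth--;
}

/* Return the slot holding @key, or NULL if the key is absent. */
static struct slot *lookup(struct table *t, int key)
{
	size_t i;

	for (i = 0; i < NSLOTS; i++)
		if (t->slots[i].used && t->slots[i].key == key)
			return &t->slots[i];
	return NULL;
}

/* The unlikely "not found" case jumps straight to the unlock label,
 * so the common path is not nested inside an else branch.
 */
static int lookup_and_delete(struct table *t, int key, int *value)
{
	struct slot *s;
	int ret = 0;

	table_lock(t);
	s = lookup(t, key);
	if (!s) {
		ret = -ENOENT;
		goto out_unlock;
	}

	*value = s->value;
	s->used = 0;

out_unlock:
	table_unlock(t);
	return ret;
}
```

Both paths leave through the same unlock site, which is what allows a later change to add post-unlock work in one place.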
Signed-off-by: Hou Tao --- kernel/bpf/hashtab.c | 51 ++++++++++++++++++++++---------------------- 1 file changed, 26 insertions(+), 25 deletions(-) diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c index 963cccb01daae..6545ef40e128a 100644 --- a/kernel/bpf/hashtab.c +++ b/kernel/bpf/hashtab.c @@ -1635,37 +1635,38 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key, l = lookup_elem_raw(head, hash, key, key_size); if (!l) { ret = -ENOENT; - } else { - if (is_percpu) { - u32 roundup_value_size = round_up(map->value_size, 8); - void __percpu *pptr; - int off = 0, cpu; + goto out_unlock; + } - pptr = htab_elem_get_ptr(l, key_size); - for_each_possible_cpu(cpu) { - copy_map_value_long(&htab->map, value + off, per_cpu_ptr(pptr, cpu)); - check_and_init_map_value(&htab->map, value + off); - off += roundup_value_size; - } - } else { - u32 roundup_key_size = round_up(map->key_size, 8); + if (is_percpu) { + u32 roundup_value_size = round_up(map->value_size, 8); + void __percpu *pptr; + int off = 0, cpu; - if (flags & BPF_F_LOCK) - copy_map_value_locked(map, value, l->key + - roundup_key_size, - true); - else - copy_map_value(map, value, l->key + - roundup_key_size); - /* Zeroing special fields in the temp buffer */ - check_and_init_map_value(map, value); + pptr = htab_elem_get_ptr(l, key_size); + for_each_possible_cpu(cpu) { + copy_map_value_long(&htab->map, value + off, per_cpu_ptr(pptr, cpu)); + check_and_init_map_value(&htab->map, value + off); + off += roundup_value_size; } + } else { + u32 roundup_key_size = round_up(map->key_size, 8); - hlist_nulls_del_rcu(&l->hash_node); - if (!is_lru_map) - free_htab_elem(htab, l); + if (flags & BPF_F_LOCK) + copy_map_value_locked(map, value, l->key + + roundup_key_size, + true); + else + copy_map_value(map, value, l->key + + roundup_key_size); + /* Zeroing special fields in the temp buffer */ + check_and_init_map_value(map, value); } + hlist_nulls_del_rcu(&l->hash_node); + if (!is_lru_map) + 
free_htab_elem(htab, l); +out_unlock: htab_unlock_bucket(htab, b, hash, bflags); if (is_lru_map && l) From patchwork Thu Jan 9 06:18:59 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hou Tao X-Patchwork-Id: 13932077 X-Patchwork-Delegate: bpf@iogearbox.net Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 55C6D43169; Thu, 9 Jan 2025 06:06:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736402821; cv=none; b=oJyzgqNJY2AEwkUr7upzrnAkX1rkFzA0V3AvtYoYOsaA3wOuQL8gNHbYFThbYBnf1a2l+vSMBVWQdc+f9QvLlrO2VSe8BA46wPGDiu58pa6QEiSInTZFq2to21XAmVzWRAvxMXDdBqx/v1niitPFRY946tT6lVzrrcEf5csMGlU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736402821; c=relaxed/simple; bh=SKJQGbJxuETkb4XV2d3Je3W09hUwJSEILdjytyLB7qU=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=JdhY5nfDgUp8t/1lQShr3n2x6m73Bn89fyQNZLeI1Iznqc2FldLsArsQqPgSlxJmUgD1fxeUgHZO6Hih9SEQdodhEg7AnNV+AiMblypwyPNPbHGmdkgVijzLP2i4NEbpayLZbmfucNSonPfXaKV2T0o+khaxS7hxacw0zBpILGg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4YTDnR6q4Vz4f3jRG; Thu, 9 Jan 2025 14:06:35 +0800 (CST) Received: from mail02.huawei.com (unknown 
[10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 26F1D1A0BFF; Thu, 9 Jan 2025 14:06:56 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.124.27]) by APP4 (Coremail) with SMTP id gCh0CgAni196Z39nvD3QAQ--.4010S7; Thu, 09 Jan 2025 14:06:55 +0800 (CST) From: Hou Tao To: bpf@vger.kernel.org, netdev@vger.kernel.org Cc: Martin KaFai Lau , Alexei Starovoitov , Andrii Nakryiko , Eduard Zingerman , Song Liu , Hao Luo , Yonghong Song , Daniel Borkmann , KP Singh , Stanislav Fomichev , Jiri Olsa , John Fastabend , Sebastian Andrzej Siewior , houtao1@huawei.com, xukuohai@huawei.com Subject: [PATCH bpf-next v2 3/5] bpf: Free element after unlock in __htab_map_lookup_and_delete_elem() Date: Thu, 9 Jan 2025 14:18:59 +0800 Message-Id: <20250109061901.2620825-4-houtao@huaweicloud.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20250109061901.2620825-1-houtao@huaweicloud.com> References: <20250109061901.2620825-1-houtao@huaweicloud.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgAni196Z39nvD3QAQ--.4010S7 X-Coremail-Antispam: 1UD129KBjvJXoW7AFy7JryfXrWDZw43tF4xJFb_yoW8Gry7pF Z5KrW2ga1kWrnYv343Ja1vkrWUGw1rXw1UGF1kG34rtFn8Wr97Gw12vF92qF13Xr1vyFZ5 XFW2yw15t3y5CrDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUP2b4IE77IF4wAFF20E14v26rWj6s0DM7CY07I20VC2zVCF04k2 6cxKx2IYs7xG6rWj6s0DM7CIcVAFz4kK6r1j6r18M28IrcIa0xkI8VA2jI8067AKxVWUWw A2048vs2IY020Ec7CjxVAFwI0_Xr0E3s1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0rcxS w2x7M28EF7xvwVC0I7IYx2IY67AKxVW7JVWDJwA2z4x0Y4vE2Ix0cI8IcVCY1x0267AKxV W8Jr0_Cr1UM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v2 6rxl6s0DM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMc Ij6xIIjxv20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_ Jr0_Gr1lF7xvr2IYc2Ij64vIr41lFIxGxcIEc7CjxVA2Y2ka0xkIwI1lc7CjxVAaw2AFwI 0_GFv_Wryl42xK82IYc2Ij64vIr41l4I8I3I0E4IkC6x0Yz7v_Jr0_Gr1lx2IqxVAqx4xG 
67AKxVWUJVWUGwC20s026x8GjcxK67AKxVWUGVWUWwC2zVAF1VAY17CE14v26r4a6rW5MI IYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_Jr0_JF4lIxAIcVC0I7IYx2IY6xkF7I0E 14v26F4j6r4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Jr 0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8JVW8JrUvcSsGvfC2KfnxnUUI43ZEXa7IU04x RDUUUUU== X-CM-SenderInfo: xkrx3t3r6k3tpzhluzxrxghudrp/ X-Patchwork-Delegate: bpf@iogearbox.net From: Hou Tao The freeing of special fields in a map value may acquire a spin-lock (e.g., when freeing a bpf_timer). However, the lookup_and_delete_elem procedure already holds a raw-spin-lock, which violates the lockdep rule. The running context of __htab_map_lookup_and_delete_elem() has already disabled migration, so it is OK to invoke free_htab_elem() after unlocking the bucket lock. Fix the potential problem by freeing the element after unlocking the bucket lock in __htab_map_lookup_and_delete_elem(). Signed-off-by: Hou Tao --- kernel/bpf/hashtab.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c index 6545ef40e128a..4a9eeb7aef855 100644 --- a/kernel/bpf/hashtab.c +++ b/kernel/bpf/hashtab.c @@ -1663,14 +1663,16 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key, check_and_init_map_value(map, value); } hlist_nulls_del_rcu(&l->hash_node); - if (!is_lru_map) - free_htab_elem(htab, l); out_unlock: htab_unlock_bucket(htab, b, hash, bflags); - if (is_lru_map && l) - htab_lru_push_free(htab, l); + if (l) { + if (is_lru_map) + htab_lru_push_free(htab, l); + else + free_htab_elem(htab, l); + } return ret; } From patchwork Thu Jan 9 06:19:00 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hou Tao X-Patchwork-Id: 13932082 X-Patchwork-Delegate: bpf@iogearbox.net Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client 
certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 21234214A62; Thu, 9 Jan 2025 06:07:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736402828; cv=none; b=cbWsVwX/r+Jc2FguBYQ/OXKDSrLjWtBFm8IHuo+1Qo8eGZ5uQb/3hTCRY8pyUGMmC9p1YR6D8tNNN46/Xr012K5AxQA+UVADDrFi0Y62H3sc+UUVw5pmYgDkpU90ZA1R7nxIlHGSOq+fr4F7hVzRg8izinF8uVC9pWOqB/t8njE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736402828; c=relaxed/simple; bh=iSrAso/Zw9Pn2Wy5gI3bsnDMPAuSoBK1jgFCgX+rdUU=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=OuTp45y5XyVmR9Q1lKFo+xu5nysrV1QB/ZC73LlqsJF7Zlqw+VI9wnmpoQuEfMeoU4xIHX6dqrTLB/Ctibm1xbDXM+XqBLKz7BDAOgXF5Iio/w3n5DD7eEl3cK2i1R3UaIl+YKsff2pSH++53gbdEDWkKysLdtZd4y6+w4fIgTo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4YTDnY5PMKz4f3jqr; Thu, 9 Jan 2025 14:06:41 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id C315C1A17A4; Thu, 9 Jan 2025 14:06:56 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.124.27]) by APP4 (Coremail) with SMTP id gCh0CgAni196Z39nvD3QAQ--.4010S8; Thu, 09 Jan 2025 14:06:56 +0800 (CST) From: Hou Tao To: bpf@vger.kernel.org, netdev@vger.kernel.org Cc: Martin KaFai Lau , Alexei Starovoitov , Andrii Nakryiko , Eduard Zingerman , Song Liu , Hao Luo , Yonghong Song , Daniel Borkmann , KP Singh , Stanislav Fomichev , Jiri 
Olsa , John Fastabend , Sebastian Andrzej Siewior , houtao1@huawei.com, xukuohai@huawei.com Subject: [PATCH bpf-next v2 4/5] bpf: Cancel the running bpf_timer through kworker Date: Thu, 9 Jan 2025 14:19:00 +0800 Message-Id: <20250109061901.2620825-5-houtao@huaweicloud.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20250109061901.2620825-1-houtao@huaweicloud.com> References: <20250109061901.2620825-1-houtao@huaweicloud.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgAni196Z39nvD3QAQ--.4010S8 X-Coremail-Antispam: 1UD129KBjvJXoW3JFW8Xw48Cw4kJr1xury3urg_yoW7Aw4DpF WfKry7Kr1kWr1qvrsFvF1kGa48Cws3Gw17Grn7Kr15ZF13Ww1vqFWI9F1a9F45Crn3ArZa vr40v39akwn8u37anT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPSb4IE77IF4wAFF20E14v26rWj6s0DM7CY07I20VC2zVCF04k2 6cxKx2IYs7xG6rWj6s0DM7CIcVAFz4kK6r1j6r18M28IrcIa0xkI8VA2jI8067AKxVWUAV Cq3wA2048vs2IY020Ec7CjxVAFwI0_Xr0E3s1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0 rcxSw2x7M28EF7xvwVC0I7IYx2IY67AKxVW7JVWDJwA2z4x0Y4vE2Ix0cI8IcVCY1x0267 AKxVW8Jr0_Cr1UM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E 14v26rxl6s0DM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7 xfMcIj6xIIjxv20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Y z7v_Jr0_Gr1lF7xvr2IYc2Ij64vIr41lFIxGxcIEc7CjxVA2Y2ka0xkIwI1lc7CjxVAaw2 AFwI0_GFv_Wryl42xK82IYc2Ij64vIr41l4I8I3I0E4IkC6x0Yz7v_Jr0_Gr1lx2IqxVAq x4xG67AKxVWUJVWUGwC20s026x8GjcxK67AKxVWUGVWUWwC2zVAF1VAY17CE14v26r4a6r W5MIIYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_Jr0_JF4lIxAIcVC0I7IYx2IY6xkF 7I0E14v26F4j6r4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI 0_Jr0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8JVW8JrUvcSsGvfC2KfnxnUUI43ZEXa7I U0sqXPUUUUU== X-CM-SenderInfo: xkrx3t3r6k3tpzhluzxrxghudrp/ X-Patchwork-Delegate: bpf@iogearbox.net From: Hou Tao During the update procedure, when overwriting an element in a pre-allocated htab, the freeing of old_element is protected by the bucket lock. 
The reason why the bucket lock is necessary is that the old_element has already been stashed in htab->extra_elems after alloc_htab_elem() returns. If the old_element were freed after the bucket lock is unlocked, the stashed element could be reused by a concurrent update procedure, and the freeing of old_element would run concurrently with its reuse.

However, the invocation of check_and_free_fields() may acquire a spin-lock, which violates the lockdep rule because its caller already holds a raw-spin-lock (the bucket lock). The following warning is reported when such a race happens:

  BUG: scheduling while atomic: test_progs/676/0x00000003
  3 locks held by test_progs/676:
  #0: ffffffff864b0240 (rcu_read_lock_trace){....}-{0:0}, at: bpf_prog_test_run_syscall+0x2c0/0x830
  #1: ffff88810e961188 (&htab->lockdep_key){....}-{2:2}, at: htab_map_update_elem+0x306/0x1500
  #2: ffff8881f4eac1b8 (&base->softirq_expiry_lock){....}-{2:2}, at: hrtimer_cancel_wait_running+0xe9/0x1b0
  Modules linked in: bpf_testmod(O)
  Preemption disabled at:
  [] htab_map_update_elem+0x293/0x1500
  CPU: 0 UID: 0 PID: 676 Comm: test_progs Tainted: G ... 6.12.0+ #11
  Tainted: [W]=WARN, [O]=OOT_MODULE
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)...
  Call Trace:
  dump_stack_lvl+0x57/0x70
  dump_stack+0x10/0x20
  __schedule_bug+0x120/0x170
  __schedule+0x300c/0x4800
  schedule_rtlock+0x37/0x60
  rtlock_slowlock_locked+0x6d9/0x54c0
  rt_spin_lock+0x168/0x230
  hrtimer_cancel_wait_running+0xe9/0x1b0
  hrtimer_cancel+0x24/0x30
  bpf_timer_delete_work+0x1d/0x40
  bpf_timer_cancel_and_free+0x5e/0x80
  bpf_obj_free_fields+0x262/0x4a0
  check_and_free_fields+0x1d0/0x280
  htab_map_update_elem+0x7fc/0x1500
  bpf_prog_9f90bc20768e0cb9_overwrite_cb+0x3f/0x43
  bpf_prog_ea601c4649694dbd_overwrite_timer+0x5d/0x7e
  bpf_prog_test_run_syscall+0x322/0x830
  __sys_bpf+0x135d/0x3ca0
  __x64_sys_bpf+0x75/0xb0
  x64_sys_call+0x1b5/0xa10
  do_syscall_64+0x3b/0xc0
  entry_SYSCALL_64_after_hwframe+0x4b/0x53
  ...

It seems feasible to break the reuse and refill of per-cpu extra_elems into two independent parts: reuse the per-cpu extra_elems with the bucket lock held, and refill the old_element as per-cpu extra_elems after the bucket lock is unlocked. However, that would make concurrent overwrite procedures on the same CPU return an unexpected -E2BIG error when the map is full.

Therefore, the patch fixes the lock problem by breaking the cancelling of bpf_timer into two steps:

1) use hrtimer_try_to_cancel() and check its return value
2) if the timer is running, use hrtimer_cancel() through a kworker to cancel it again

Considering that the current implementation of hrtimer_cancel() will try to spin on the current CPU or acquire an already-held softirq_expiry_lock when the timer is running, the steps above are reasonable. However, the approach also has a downside: when the timer is running, the cancelling of the timer is delayed when releasing the last map uref. The delay is also fixable (e.g., break the cancelling of bpf timer into two parts: one part in locked scope, another one in unlocked scope), so it can be revised later if necessary.

It is a bit hard to decide the right Fixes tag. One reason is that the problem depends on PREEMPT_RT, which is enabled in v6.12. Considering that softirq_expiry_lock has existed since v5.4 and bpf_timer was introduced in v5.15, the bpf_timer commit is used in the Fixes tag and an extra Depends-on tag is added to state the dependency on PREEMPT_RT.

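The two-step cancellation can be modelled in plain userspace C to make the control flow easier to follow. This is only a sketch under stated assumptions: every `model_*` name and `struct model_timer` is hypothetical and invented for illustration, nothing here is a kernel API, and the `freed`/`deferred` flags merely stand in for kfree_rcu() and queue_work(). The one thing taken from the kernel is the return convention of hrtimer_try_to_cancel(): 1 if an active timer was cancelled, 0 if it was not active, -1 if its callback is currently running.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a timer; not a kernel structure. */
enum model_state { MODEL_IDLE, MODEL_QUEUED, MODEL_RUNNING };

struct model_timer {
	enum model_state state;
	bool freed;	/* stands in for kfree_rcu() having been called */
	bool deferred;	/* stands in for queue_work() having been called */
};

/* Non-blocking cancel attempt that mirrors the return convention of
 * hrtimer_try_to_cancel(): 1, 0, or -1 when the callback is running.
 */
static int model_try_to_cancel(struct model_timer *t)
{
	if (t->state == MODEL_RUNNING)
		return -1;
	if (t->state == MODEL_QUEUED) {
		t->state = MODEL_IDLE;
		return 1;
	}
	return 0;
}

/* Step 1: may be called with a non-sleepable lock held, so it never
 * waits. A running callback is handed over to worker context instead.
 */
static void model_cancel_and_free(struct model_timer *t, bool in_timer_cb)
{
	if (!in_timer_cb && model_try_to_cancel(t) >= 0) {
		t->freed = true;
		return;
	}
	t->deferred = true;
}

/* Step 2: runs later in worker context, where blocking is allowed.
 * Setting the state to idle models waiting for the callback to finish.
 */
static void model_delete_work(struct model_timer *t)
{
	assert(t->deferred);
	t->state = MODEL_IDLE;
	t->freed = true;
}
```

In this model, a queued or idle timer is released immediately in step 1, while a running one is released only once the deferred work executes, which corresponds to the delayed cancellation noted as a downside in the text.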
Fixes: b00628b1c7d5 ("bpf: Introduce bpf timers.") Depends-on: v6.12 with PREEMPT_RT enabled Reported-by: Sebastian Andrzej Siewior Closes: https://lore.kernel.org/bpf/20241106084527.4gPrMnHt@linutronix.de Signed-off-by: Hou Tao --- kernel/bpf/helpers.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index bcda671feafd9..7330bd4ee6818 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -1591,12 +1591,19 @@ void bpf_timer_cancel_and_free(void *val) * bpf_timer_cancel_and_free(timer2) bpf_timer_cancel_and_free(timer1) * * To avoid these issues, punt to workqueue context when we are in a - * timer callback. + * timer callback. When the timer is running on other CPUs, also using + * workqueue context to cancel the timer. */ - if (this_cpu_read(hrtimer_running)) - queue_work(system_unbound_wq, &t->cb.delete_work); - else - bpf_timer_delete_work(&t->cb.delete_work); + if (!this_cpu_read(hrtimer_running) && hrtimer_try_to_cancel(&t->timer) >= 0) { + kfree_rcu(t, cb.rcu); + return; + } + + /* The timer is running on current or other CPU. Use a kworker to wait + * for the completion of the timer instead of spinning on current CPU + * or trying to acquire a sleepable lock to wait for its completion. 
+ */ + queue_work(system_unbound_wq, &t->cb.delete_work); } /* This function is called by map_delete/update_elem for individual element and From patchwork Thu Jan 9 06:19:01 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hou Tao X-Patchwork-Id: 13932078 X-Patchwork-Delegate: bpf@iogearbox.net Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8C85D25949C; Thu, 9 Jan 2025 06:07:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736402822; cv=none; b=hyAQhhPW152w97AmvsjzCihF+/xkeb6CuUwMohEQ/hbQ2UKIx0wxiyXalfQLsC1xdD++TaguU2fUikh7HBu7GSZSzj+khaMxSNQLOE86Qx8oLN4SmGBXHliRgeGl36JaZifqNSeKvTru2VW4QFLoNfbUdYc1A20PsH1Trrdfcx4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736402822; c=relaxed/simple; bh=FI+NkTda/WuOdm2ZoJy+YlzC8D7pemqKRf1RPO8NzYQ=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=CRjdu0Mzh6vQAn0pGgpNHuOKYyr0LFOpgYcr9XU2Z79mNKqp62iHBQdDX3HyKOILCJF2A0qDAX5yYahVGXubdzQkn3eC+7PAZDWl5VGhatn1LbO6kIm6nS3E4kcYf3qtQuXswL/ruAkgwsy1Gf6MHKpGpcvDGuFUBTX4fYkJHrk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4YTDnT1gP7z4f3jXV; Thu, 9 Jan 2025 14:06:37 +0800 (CST) Received: 
from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 6BDA41A0B7C; Thu, 9 Jan 2025 14:06:57 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.124.27]) by APP4 (Coremail) with SMTP id gCh0CgAni196Z39nvD3QAQ--.4010S9; Thu, 09 Jan 2025 14:06:57 +0800 (CST) From: Hou Tao To: bpf@vger.kernel.org, netdev@vger.kernel.org Cc: Martin KaFai Lau , Alexei Starovoitov , Andrii Nakryiko , Eduard Zingerman , Song Liu , Hao Luo , Yonghong Song , Daniel Borkmann , KP Singh , Stanislav Fomichev , Jiri Olsa , John Fastabend , Sebastian Andrzej Siewior , houtao1@huawei.com, xukuohai@huawei.com Subject: [PATCH bpf-next v2 5/5] selftests/bpf: Add test case for the freeing of bpf_timer Date: Thu, 9 Jan 2025 14:19:01 +0800 Message-Id: <20250109061901.2620825-6-houtao@huaweicloud.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20250109061901.2620825-1-houtao@huaweicloud.com> References: <20250109061901.2620825-1-houtao@huaweicloud.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgAni196Z39nvD3QAQ--.4010S9 X-Coremail-Antispam: 1UD129KBjvJXoWxKr1kZF4xuF1ruF15XFW7CFg_yoWxuw1Upa yrK345Kr4rXw47Ww48tFn7GrWfKrs5XFyxGry0gw1UZr1Iqws5tF92gFy5tFW3CFWDWryS vF4FkFZ8GrZrJrJanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPvb4IE77IF4wAFF20E14v26rWj6s0DM7CY07I20VC2zVCF04k2 6cxKx2IYs7xG6rWj6s0DM7CIcVAFz4kK6r1j6r18M28IrcIa0xkI8VA2jI8067AKxVWUAV Cq3wA2048vs2IY020Ec7CjxVAFwI0_Xr0E3s1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0 rcxSw2x7M28EF7xvwVC0I7IYx2IY67AKxVW7JVWDJwA2z4x0Y4vE2Ix0cI8IcVCY1x0267 AKxVW8Jr0_Cr1UM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E 14v26rxl6s0DM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7 xfMcIj6xIIjxv20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Y z7v_Jr0_Gr1lF7xvr2IYc2Ij64vIr41lFIxGxcIEc7CjxVA2Y2ka0xkIwI1lc7CjxVAaw2 AFwI0_GFv_Wryl42xK82IYc2Ij64vIr41l4I8I3I0E4IkC6x0Yz7v_Jr0_Gr1lx2IqxVAq 
x4xG67AKxVWUJVWUGwC20s026x8GjcxK67AKxVWUGVWUWwC2zVAF1VAY17CE14v26r4a6r W5MIIYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_JFI_Gr1lIxAIcVC0I7IYx2IY6xkF 7I0E14v26r4UJVWxJr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14 v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr1j6F4UJbIYCTnIWIevJa73UjIFyTuY vjxUI-eODUUUU X-CM-SenderInfo: xkrx3t3r6k3tpzhluzxrxghudrp/ X-Patchwork-Delegate: bpf@iogearbox.net From: Hou Tao The main purpose of the test is to demonstrate the lock problem when freeing a bpf_timer under PREEMPT_RT. When bpf_timer_cancel_and_free() frees a bpf_timer that is running on another CPU, hrtimer_cancel() will try to acquire a spin-lock (namely softirq_expiry_lock), but the freeing procedure already holds a raw-spin-lock. The test first creates two threads: one to start timers and the other to free timers. The start-timers thread starts the timers and then wakes up the free-timers thread to free them once the starts complete. After freeing, the free-timers thread wakes up the start-timers thread to complete the current iteration. A loop of 10 iterations is used. Signed-off-by: Hou Tao --- .../selftests/bpf/prog_tests/free_timer.c | 165 ++++++++++++++++++ .../testing/selftests/bpf/progs/free_timer.c | 71 ++++++++ 2 files changed, 236 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/free_timer.c create mode 100644 tools/testing/selftests/bpf/progs/free_timer.c diff --git a/tools/testing/selftests/bpf/prog_tests/free_timer.c b/tools/testing/selftests/bpf/prog_tests/free_timer.c new file mode 100644 index 0000000000000..b7b77a6b29799 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/free_timer.c @@ -0,0 +1,165 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (C) 2025. 
Huawei Technologies Co., Ltd */ +#define _GNU_SOURCE +#include +#include +#include + +#include "free_timer.skel.h" + +struct run_ctx { + struct bpf_program *start_prog; + struct bpf_program *overwrite_prog; + pthread_barrier_t notify; + int loop; + bool start; + bool stop; +}; + +static void start_threads(struct run_ctx *ctx) +{ + ctx->start = true; +} + +static void stop_threads(struct run_ctx *ctx) +{ + ctx->stop = true; + /* Guarantee the order between ->stop and ->start */ + __atomic_store_n(&ctx->start, true, __ATOMIC_RELEASE); +} + +static int wait_for_start(struct run_ctx *ctx) +{ + while (!__atomic_load_n(&ctx->start, __ATOMIC_ACQUIRE)) + usleep(10); + + return ctx->stop; +} + +static void *overwrite_timer_fn(void *arg) +{ + struct run_ctx *ctx = arg; + int loop, fd, err; + cpu_set_t cpuset; + long ret = 0; + + /* Pin on CPU 0 */ + CPU_ZERO(&cpuset); + CPU_SET(0, &cpuset); + pthread_setaffinity_np(pthread_self(), sizeof(cpuset), &cpuset); + + /* Is the thread being stopped ? */ + err = wait_for_start(ctx); + if (err) + return NULL; + + fd = bpf_program__fd(ctx->overwrite_prog); + loop = ctx->loop; + while (loop-- > 0) { + LIBBPF_OPTS(bpf_test_run_opts, opts); + + /* Wait for start thread to complete */ + pthread_barrier_wait(&ctx->notify); + + /* Overwrite timers */ + err = bpf_prog_test_run_opts(fd, &opts); + if (err) + ret |= 1; + else if (opts.retval) + ret |= 2; + + /* Notify start thread to start timers */ + pthread_barrier_wait(&ctx->notify); + } + + return (void *)ret; +} + +static void *start_timer_fn(void *arg) +{ + struct run_ctx *ctx = arg; + int loop, fd, err; + cpu_set_t cpuset; + long ret = 0; + + /* Pin on CPU 1 */ + CPU_ZERO(&cpuset); + CPU_SET(1, &cpuset); + pthread_setaffinity_np(pthread_self(), sizeof(cpuset), &cpuset); + + /* Is the thread being stopped ? 
*/ + err = wait_for_start(ctx); + if (err) + return NULL; + + fd = bpf_program__fd(ctx->start_prog); + loop = ctx->loop; + while (loop-- > 0) { + LIBBPF_OPTS(bpf_test_run_opts, opts); + + /* Run the prog to start timer */ + err = bpf_prog_test_run_opts(fd, &opts); + if (err) + ret |= 4; + else if (opts.retval) + ret |= 8; + + /* Notify overwrite thread to do overwrite */ + pthread_barrier_wait(&ctx->notify); + + /* Wait for overwrite thread to complete */ + pthread_barrier_wait(&ctx->notify); + } + + return (void *)ret; +} + +void test_free_timer(void) +{ + struct free_timer *skel; + struct bpf_program *prog; + struct run_ctx ctx; + pthread_t tid[2]; + void *ret; + int err; + + skel = free_timer__open_and_load(); + if (!ASSERT_OK_PTR(skel, "open_load")) + return; + + memset(&ctx, 0, sizeof(ctx)); + + prog = bpf_object__find_program_by_name(skel->obj, "start_timer"); + if (!ASSERT_OK_PTR(prog, "find start prog")) + goto out; + ctx.start_prog = prog; + + prog = bpf_object__find_program_by_name(skel->obj, "overwrite_timer"); + if (!ASSERT_OK_PTR(prog, "find overwrite prog")) + goto out; + ctx.overwrite_prog = prog; + + pthread_barrier_init(&ctx.notify, NULL, 2); + ctx.loop = 10; + + err = pthread_create(&tid[0], NULL, start_timer_fn, &ctx); + if (!ASSERT_OK(err, "create start_timer")) + goto out; + + err = pthread_create(&tid[1], NULL, overwrite_timer_fn, &ctx); + if (!ASSERT_OK(err, "create overwrite_timer")) { + stop_threads(&ctx); + goto out; + } + + start_threads(&ctx); + + ret = NULL; + err = pthread_join(tid[0], &ret); + ASSERT_EQ(err | (long)ret, 0, "start_timer"); + ret = NULL; + err = pthread_join(tid[1], &ret); + ASSERT_EQ(err | (long)ret, 0, "overwrite_timer"); +out: + free_timer__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/free_timer.c b/tools/testing/selftests/bpf/progs/free_timer.c new file mode 100644 index 0000000000000..4501ae8fc4143 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/free_timer.c @@ -0,0 +1,71 @@ +// 
SPDX-License-Identifier: GPL-2.0 +/* Copyright (C) 2025. Huawei Technologies Co., Ltd */ +#include +#include +#include +#include + +#define MAX_ENTRIES 8 + +struct map_value { + struct bpf_timer timer; +}; + +struct { + __uint(type, BPF_MAP_TYPE_HASH); + __type(key, int); + __type(value, struct map_value); + __uint(max_entries, MAX_ENTRIES); +} map SEC(".maps"); + +static int timer_cb(void *map, void *key, struct map_value *value) +{ + volatile int sum = 0; + int i; + + bpf_for(i, 0, 1024 * 1024) sum += i; + + return 0; +} + +static int start_cb(int key) +{ + struct map_value *value; + + value = bpf_map_lookup_elem(&map, (void *)&key); + if (!value) + return 0; + + bpf_timer_init(&value->timer, &map, CLOCK_MONOTONIC); + bpf_timer_set_callback(&value->timer, timer_cb); + /* Hope 100us will be enough to wake-up and run the overwrite thread */ + bpf_timer_start(&value->timer, 100000, BPF_F_TIMER_CPU_PIN); + + return 0; +} + +static int overwrite_cb(int key) +{ + struct map_value zero = {}; + + /* Free the timer which may run on other CPU */ + bpf_map_update_elem(&map, (void *)&key, &zero, BPF_ANY); + + return 0; +} + +SEC("syscall") +int BPF_PROG(start_timer) +{ + bpf_loop(MAX_ENTRIES, start_cb, NULL, 0); + return 0; +} + +SEC("syscall") +int BPF_PROG(overwrite_timer) +{ + bpf_loop(MAX_ENTRIES, overwrite_cb, NULL, 0); + return 0; +} + +char _license[] SEC("license") = "GPL";