From patchwork Wed Dec 14 10:38:56 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tonghao Zhang
X-Patchwork-Id: 13072993
X-Patchwork-Delegate: bpf@iogearbox.net
From: xiangxia.m.yue@gmail.com
To: bpf@vger.kernel.org
Cc: Tonghao Zhang, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
 Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend, KP Singh,
 Stanislav Fomichev, Hao Luo, Jiri Olsa, Hou Tao
Subject: [bpf-next 1/2] bpf: hash map, avoid deadlock with suitable hash mask
Date: Wed, 14 Dec 2022 18:38:56 +0800
Message-Id: <20221214103857.69082-1-xiangxia.m.yue@gmail.com>
X-Mailer: git-send-email 2.30.1 (Apple Git-130)
X-Mailing-List: bpf@vger.kernel.org

From: Tonghao Zhang

A deadlock may still occur when the hash map is accessed in both NMI and
non-NMI context, because in NMI context we may still access the same
bucket but with a different map_locked index.

For example, on the same CPU and with .max_entries = 2, we update the
hash map with key = 4 while a BPF program running in NMI context
(nmi_handle()) updates the hash map with key = 20. The two updates hit
the same bucket index but use different map_locked indexes.

To fix this issue, mask the hash with the smaller of
HASHTAB_MAP_LOCK_MASK and htab->n_buckets - 1, so that keys that hash to
the same bucket always use the same map_locked index.

Signed-off-by: Tonghao Zhang
Cc: Alexei Starovoitov
Cc: Daniel Borkmann
Cc: Andrii Nakryiko
Cc: Martin KaFai Lau
Cc: Song Liu
Cc: Yonghong Song
Cc: John Fastabend
Cc: KP Singh
Cc: Stanislav Fomichev
Cc: Hao Luo
Cc: Jiri Olsa
Cc: Hou Tao
Acked-by: Yonghong Song
Acked-by: Hou Tao
---
 kernel/bpf/hashtab.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 5aa2b5525f79..8b25036a8690 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -152,7 +152,7 @@ static inline int htab_lock_bucket(const struct bpf_htab *htab,
 {
 	unsigned long flags;
 
-	hash = hash & HASHTAB_MAP_LOCK_MASK;
+	hash = hash & min(HASHTAB_MAP_LOCK_MASK, htab->n_buckets -1);
 
 	preempt_disable();
 	if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
@@ -171,7 +171,7 @@ static inline void htab_unlock_bucket(const struct bpf_htab *htab,
 				      struct bucket *b, u32 hash,
 				      unsigned long flags)
 {
-	hash = hash & HASHTAB_MAP_LOCK_MASK;
+	hash = hash & min(HASHTAB_MAP_LOCK_MASK, htab->n_buckets -1);
 	raw_spin_unlock_irqrestore(&b->raw_lock, flags);
 	__this_cpu_dec(*(htab->map_locked[hash]));
 	preempt_enable();

From patchwork Wed Dec 14 10:38:57 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tonghao Zhang
X-Patchwork-Id: 13072994
X-Patchwork-Delegate: bpf@iogearbox.net
From: xiangxia.m.yue@gmail.com
To: bpf@vger.kernel.org
Cc: Tonghao Zhang, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
 Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend, KP Singh,
 Stanislav Fomichev, Hao Luo, Jiri Olsa, Hou Tao
Subject: [bpf-next 2/2] selftests/bpf: add test cases for htab map
Date: Wed, 14 Dec 2022 18:38:57 +0800
Message-Id: <20221214103857.69082-2-xiangxia.m.yue@gmail.com>
X-Mailer: git-send-email 2.30.1 (Apple Git-130)
In-Reply-To: <20221214103857.69082-1-xiangxia.m.yue@gmail.com>
References: <20221214103857.69082-1-xiangxia.m.yue@gmail.com>
X-Mailing-List: bpf@vger.kernel.org

From: Tonghao Zhang

This test shows how to reproduce the deadlock in a special case.

Signed-off-by: Tonghao Zhang
Cc: Alexei Starovoitov
Cc: Daniel Borkmann
Cc: Andrii Nakryiko
Cc: Martin KaFai Lau
Cc: Song Liu
Cc: Yonghong Song
Cc: John Fastabend
Cc: KP Singh
Cc: Stanislav Fomichev
Cc: Hao Luo
Cc: Jiri Olsa
Cc: Hou Tao
---
 .../selftests/bpf/prog_tests/htab_deadlock.c | 74 +++++++++++++++++++
 .../selftests/bpf/progs/htab_deadlock.c      | 30 ++++++++
 2 files changed, 104 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/htab_deadlock.c
 create mode 100644 tools/testing/selftests/bpf/progs/htab_deadlock.c

diff --git a/tools/testing/selftests/bpf/prog_tests/htab_deadlock.c b/tools/testing/selftests/bpf/prog_tests/htab_deadlock.c
new file mode 100644
index 000000000000..7dce4c2fe4f5
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/htab_deadlock.c
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 DiDi Global Inc. */
+#define _GNU_SOURCE
+#include <pthread.h>
+#include <sched.h>
+#include <test_progs.h>
+
+#include "htab_deadlock.skel.h"
+
+static int perf_event_open(void)
+{
+	struct perf_event_attr attr = {0};
+	int pfd;
+
+	/* create perf event */
+	attr.size = sizeof(attr);
+	attr.type = PERF_TYPE_HARDWARE;
+	attr.config = PERF_COUNT_HW_CPU_CYCLES;
+	attr.freq = 1;
+	attr.sample_freq = 1000;
+	pfd = syscall(__NR_perf_event_open, &attr, -1, 0, -1, PERF_FLAG_FD_CLOEXEC);
+
+	return pfd >= 0 ? pfd : -errno;
+}
+
+void test_htab_deadlock(void)
+{
+	unsigned int val = 0, key = 20;
+	struct bpf_link *link = NULL;
+	struct htab_deadlock *skel;
+	cpu_set_t cpus;
+	int err;
+	int pfd;
+	int i;
+
+	skel = htab_deadlock__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "skel_open_and_load"))
+		return;
+
+	err = htab_deadlock__attach(skel);
+	if (!ASSERT_OK(err, "skel_attach"))
+		goto clean_skel;
+
+	/* NMI events. */
+	pfd = perf_event_open();
+	if (pfd < 0) {
+		if (pfd == -ENOENT || pfd == -EOPNOTSUPP) {
+			printf("%s:SKIP:no PERF_COUNT_HW_CPU_CYCLES\n", __func__);
+			test__skip();
+			goto clean_skel;
+		}
+		if (!ASSERT_GE(pfd, 0, "perf_event_open"))
+			goto clean_skel;
+	}
+
+	link = bpf_program__attach_perf_event(skel->progs.bpf_perf_event, pfd);
+	if (!ASSERT_OK_PTR(link, "attach_perf_event"))
+		goto clean_pfd;
+
+	/* Pinned on CPU 0 */
+	CPU_ZERO(&cpus);
+	CPU_SET(0, &cpus);
+	pthread_setaffinity_np(pthread_self(), sizeof(cpus), &cpus);
+
+	for (i = 0; i < 100000; i++)
+		bpf_map_update_elem(bpf_map__fd(skel->maps.htab),
+				    &key, &val, BPF_ANY);
+
+	bpf_link__destroy(link);
+clean_pfd:
+	close(pfd);
+clean_skel:
+	htab_deadlock__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/htab_deadlock.c b/tools/testing/selftests/bpf/progs/htab_deadlock.c
new file mode 100644
index 000000000000..c4bd1567f882
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/htab_deadlock.c
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 DiDi Global Inc. */
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+struct {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__uint(max_entries, 2);
+	__uint(map_flags, BPF_F_ZERO_SEED);
+	__uint(key_size, sizeof(unsigned int));
+	__uint(value_size, sizeof(unsigned int));
+} htab SEC(".maps");
+
+SEC("fentry/nmi_handle")
+int bpf_nmi_handle(struct pt_regs *regs)
+{
+	unsigned int val = 0, key = 4;
+
+	bpf_map_update_elem(&htab, &key, &val, BPF_ANY);
+	return 0;
+}
+
+SEC("perf_event")
+int bpf_perf_event(struct pt_regs *regs)
+{
+	return 0;
+}
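
Note on the index arithmetic in patch 1/2: the following user-space sketch
is not part of the series. It only illustrates, under stated assumptions,
why masking with min(HASHTAB_MAP_LOCK_MASK, n_buckets - 1) keeps the
map_locked index consistent within a bucket. LOCK_COUNT = 8 is assumed to
match HASHTAB_MAP_LOCK_COUNT in kernel/bpf/hashtab.c, the helper names are
hypothetical, and the hash values 0x4 and 0x6 used in the note below are
made-up raw hashes, not the real jhash results for keys 4 and 20.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror HASHTAB_MAP_LOCK_COUNT / HASHTAB_MAP_LOCK_MASK. */
#define LOCK_COUNT 8u
#define LOCK_MASK  (LOCK_COUNT - 1u)

/* Bucket selection: n_buckets is assumed to be a power of two. */
static uint32_t bucket_index(uint32_t hash, uint32_t n_buckets)
{
	assert(n_buckets != 0 && (n_buckets & (n_buckets - 1u)) == 0);
	return hash & (n_buckets - 1u);
}

/* map_locked index before the patch: always uses the full lock mask. */
static uint32_t lock_index_old(uint32_t hash)
{
	return hash & LOCK_MASK;
}

/* map_locked index after the patch: never uses more bits than the bucket. */
static uint32_t lock_index_new(uint32_t hash, uint32_t n_buckets)
{
	uint32_t bucket_mask = n_buckets - 1u;
	uint32_t mask = LOCK_MASK < bucket_mask ? LOCK_MASK : bucket_mask;

	return hash & mask;
}
```

With n_buckets = 2, the hypothetical hashes 0x4 and 0x6 both select bucket 0.
The old computation gives map_locked indexes 4 and 6, so the per-CPU
re-entrancy counter does not catch the nested update and the NMI path would
spin on the bucket lock already held on that CPU. The new computation gives
index 0 for both, so the nested update is detected before the bucket lock is
taken.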