From patchwork Fri Jan 31 19:28:40 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 13955693
X-Patchwork-Delegate: bpf@iogearbox.net
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org,
 alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org,
 edumazet@google.com, xiyou.wangcong@gmail.com, cong.wang@bytedance.com,
 jhs@mojatatu.com, sinquersw@gmail.com, toke@redhat.com, jiri@resnulli.us,
 stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br,
 yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com,
 ming.lei@redhat.com, kernel-team@meta.com
Subject: [PATCH bpf-next v3 01/18] bpf: Make every prog keep a copy of ctx_arg_info
Date: Fri, 31 Jan 2025 11:28:40 -0800
Message-ID: <20250131192912.133796-2-ameryhung@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250131192912.133796-1-ameryhung@gmail.com>
References: <20250131192912.133796-1-ameryhung@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

Currently, ctx_arg_info is read-only in the view of the verifier since
it is shared among programs of the same attach type. Make each program
have their own copy of ctx_arg_info so that we can use it to store
program specific information.

In the next patch where we support acquiring a referenced kptr through a
struct_ops argument tagged with "__ref", ctx_arg_info->ref_obj_id will
be used to store the unique reference object id of the argument. This
avoids creating a requirement in the verifier that "__ref" tagged
arguments must be the first set of references acquired [0].
[0] https://lore.kernel.org/bpf/20241220195619.2022866-2-amery.hung@gmail.com/

Signed-off-by: Amery Hung
Acked-by: Eduard Zingerman
---
 include/linux/bpf.h   |  7 +++++--
 kernel/bpf/bpf_iter.c | 13 ++++++-------
 kernel/bpf/verifier.c | 25 +++++++++++++++----------
 3 files changed, 26 insertions(+), 19 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index f3f50e29d639..f4df39e8c735 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1507,7 +1507,7 @@ struct bpf_prog_aux {
 	u32 max_rdonly_access;
 	u32 max_rdwr_access;
 	struct btf *attach_btf;
-	const struct bpf_ctx_arg_aux *ctx_arg_info;
+	struct bpf_ctx_arg_aux *ctx_arg_info;
 	void __percpu *priv_stack_ptr;
 	struct mutex dst_mutex; /* protects dst_* pointers below, *after* prog becomes visible */
 	struct bpf_prog *dst_prog;
@@ -1945,6 +1945,9 @@ static inline void bpf_struct_ops_desc_release(struct bpf_struct_ops_desc *st_op
 
 #endif
 
+int bpf_prog_ctx_arg_info_init(struct bpf_prog *prog,
+			       const struct bpf_ctx_arg_aux *info, u32 cnt);
+
 #if defined(CONFIG_CGROUP_BPF) && defined(CONFIG_BPF_LSM)
 int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
 				    int cgroup_atype);
@@ -2546,7 +2549,7 @@ struct bpf_iter__bpf_map_elem {
 
 int bpf_iter_reg_target(const struct bpf_iter_reg *reg_info);
 void bpf_iter_unreg_target(const struct bpf_iter_reg *reg_info);
-bool bpf_iter_prog_supported(struct bpf_prog *prog);
+int bpf_iter_prog_supported(struct bpf_prog *prog);
 const struct bpf_func_proto *
 bpf_iter_get_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog);
 int bpf_iter_link_attach(const union bpf_attr *attr, bpfptr_t uattr, struct bpf_prog *prog);
diff --git a/kernel/bpf/bpf_iter.c b/kernel/bpf/bpf_iter.c
index 106735145948..380e9a7cac75 100644
--- a/kernel/bpf/bpf_iter.c
+++ b/kernel/bpf/bpf_iter.c
@@ -335,7 +335,7 @@ static void cache_btf_id(struct bpf_iter_target_info *tinfo,
 	tinfo->btf_id = prog->aux->attach_btf_id;
 }
 
-bool bpf_iter_prog_supported(struct bpf_prog *prog)
+int bpf_iter_prog_supported(struct bpf_prog *prog)
 {
 	const char *attach_fname = prog->aux->attach_func_name;
 	struct bpf_iter_target_info *tinfo = NULL, *iter;
@@ -344,7 +344,7 @@ bool bpf_iter_prog_supported(struct bpf_prog *prog)
 	int prefix_len = strlen(prefix);
 
 	if (strncmp(attach_fname, prefix, prefix_len))
-		return false;
+		return -EINVAL;
 
 	mutex_lock(&targets_mutex);
 	list_for_each_entry(iter, &targets, list) {
@@ -360,12 +360,11 @@ bool bpf_iter_prog_supported(struct bpf_prog *prog)
 	}
 	mutex_unlock(&targets_mutex);
 
-	if (tinfo) {
-		prog->aux->ctx_arg_info_size = tinfo->reg_info->ctx_arg_info_size;
-		prog->aux->ctx_arg_info = tinfo->reg_info->ctx_arg_info;
-	}
+	if (!tinfo)
+		return -EINVAL;
 
-	return tinfo != NULL;
+	return bpf_prog_ctx_arg_info_init(prog, tinfo->reg_info->ctx_arg_info,
+					  tinfo->reg_info->ctx_arg_info_size);
 }
 
 const struct bpf_func_proto *
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9971c03adfd5..a41ba019780f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -22377,6 +22377,18 @@ static void print_verification_stats(struct bpf_verifier_env *env)
 		env->peak_states, env->longest_mark_read_walk);
 }
 
+int bpf_prog_ctx_arg_info_init(struct bpf_prog *prog,
+			       const struct bpf_ctx_arg_aux *info, u32 cnt)
+{
+	prog->aux->ctx_arg_info = kcalloc(cnt, sizeof(*info), GFP_KERNEL);
+	if (!prog->aux->ctx_arg_info)
+		return -ENOMEM;
+
+	memcpy(prog->aux->ctx_arg_info, info, sizeof(*info) * cnt);
+	prog->aux->ctx_arg_info_size = cnt;
+	return 0;
+}
+
 static int check_struct_ops_btf_id(struct bpf_verifier_env *env)
 {
 	const struct btf_type *t, *func_proto;
@@ -22457,17 +22469,12 @@ static int check_struct_ops_btf_id(struct bpf_verifier_env *env)
 		return -EACCES;
 	}
 
-	/* btf_ctx_access() used this to provide argument type info */
-	prog->aux->ctx_arg_info =
-		st_ops_desc->arg_info[member_idx].info;
-	prog->aux->ctx_arg_info_size =
-		st_ops_desc->arg_info[member_idx].cnt;
-
 	prog->aux->attach_func_proto = func_proto;
 	prog->aux->attach_func_name = mname;
 	env->ops = st_ops->verifier_ops;
 
-	return 0;
+	return bpf_prog_ctx_arg_info_init(prog, st_ops_desc->arg_info[member_idx].info,
+					  st_ops_desc->arg_info[member_idx].cnt);
 }
 
 #define SECURITY_PREFIX "security_"
@@ -22917,9 +22924,7 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
 		prog->aux->attach_btf_trace = true;
 		return 0;
 	} else if (prog->expected_attach_type == BPF_TRACE_ITER) {
-		if (!bpf_iter_prog_supported(prog))
-			return -EINVAL;
-		return 0;
+		return bpf_iter_prog_supported(prog);
 	}
 
 	if (prog->type == BPF_PROG_TYPE_LSM) {

From patchwork Fri Jan 31 19:28:41 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 13955694
X-Patchwork-Delegate: bpf@iogearbox.net
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org,
 alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org,
 edumazet@google.com, xiyou.wangcong@gmail.com, cong.wang@bytedance.com,
 jhs@mojatatu.com, sinquersw@gmail.com, toke@redhat.com, jiri@resnulli.us,
 stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br,
 yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com,
 ming.lei@redhat.com, kernel-team@meta.com
Subject: [PATCH bpf-next v3 02/18] bpf: Support getting referenced kptr from struct_ops argument
Date: Fri, 31 Jan 2025 11:28:41 -0800
Message-ID: <20250131192912.133796-3-ameryhung@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250131192912.133796-1-ameryhung@gmail.com>
References: <20250131192912.133796-1-ameryhung@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

From: Amery Hung

Allows struct_ops programs to acquire referenced kptrs from arguments
by directly reading the argument.

The verifier will acquire a reference for a struct_ops argument tagged
with "__ref" in the stub function at the beginning of the main program.
The user will be able to access the referenced kptr directly by reading
the context as long as it has not been released by the program.

This new mechanism to acquire referenced kptr (compared to the existing
"kfunc with KF_ACQUIRE") is introduced for ergonomic and semantic reasons.
In the first use case, Qdisc_ops, an skb is passed to .enqueue in the
first argument. This mechanism provides a natural way for users to get a
referenced kptr in the .enqueue struct_ops programs and makes sure that a
qdisc will always enqueue or drop the skb.

Signed-off-by: Amery Hung
Acked-by: Eduard Zingerman
---
 include/linux/bpf.h         |  3 +++
 kernel/bpf/bpf_struct_ops.c | 26 ++++++++++++++++++++------
 kernel/bpf/btf.c            |  1 +
 kernel/bpf/verifier.c       | 35 ++++++++++++++++++++++++++++++++---
 4 files changed, 56 insertions(+), 9 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index f4df39e8c735..15164787ce7f 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -968,6 +968,7 @@ struct bpf_insn_access_aux {
 		struct {
 			struct btf *btf;
 			u32 btf_id;
+			u32 ref_obj_id;
 		};
 	};
 	struct bpf_verifier_log *log; /* for verbose logs */
@@ -1481,6 +1482,8 @@ struct bpf_ctx_arg_aux {
 	enum bpf_reg_type reg_type;
 	struct btf *btf;
 	u32 btf_id;
+	u32 ref_obj_id;
+	bool refcounted;
 };
 
 struct btf_mod_pair {
diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
index 9b7f3b9c5262..68df8d8b6db3 100644
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -146,6 +146,7 @@ void bpf_struct_ops_image_free(void *image)
 }
 
 #define MAYBE_NULL_SUFFIX "__nullable"
+#define REFCOUNTED_SUFFIX "__ref"
 
 /* Prepare argument info for every nullable argument of a member of a
  * struct_ops type.
@@ -174,11 +175,13 @@ static int prepare_arg_info(struct btf *btf,
 			    struct bpf_struct_ops_arg_info *arg_info)
 {
 	const struct btf_type *stub_func_proto, *pointed_type;
+	bool is_nullable = false, is_refcounted = false;
 	const struct btf_param *stub_args, *args;
 	struct bpf_ctx_arg_aux *info, *info_buf;
 	u32 nargs, arg_no, info_cnt = 0;
 	char ksym[KSYM_SYMBOL_LEN];
 	const char *stub_fname;
+	const char *suffix;
 	s32 stub_func_id;
 	u32 arg_btf_id;
 	int offset;
@@ -223,12 +226,19 @@ static int prepare_arg_info(struct btf *btf,
 	info = info_buf;
 	for (arg_no = 0; arg_no < nargs; arg_no++) {
 		/* Skip arguments that is not suffixed with
-		 * "__nullable".
+		 * "__nullable or __ref".
 		 */
-		if (!btf_param_match_suffix(btf, &stub_args[arg_no],
-					    MAYBE_NULL_SUFFIX))
+		is_nullable = btf_param_match_suffix(btf, &stub_args[arg_no],
+						     MAYBE_NULL_SUFFIX);
+		is_refcounted = btf_param_match_suffix(btf, &stub_args[arg_no],
+						       REFCOUNTED_SUFFIX);
+		if (!is_nullable && !is_refcounted)
 			continue;
 
+		if (is_nullable)
+			suffix = MAYBE_NULL_SUFFIX;
+		else if (is_refcounted)
+			suffix = REFCOUNTED_SUFFIX;
 		/* Should be a pointer to struct */
 		pointed_type = btf_type_resolve_ptr(btf,
 						    args[arg_no].type,
@@ -236,7 +246,7 @@ static int prepare_arg_info(struct btf *btf,
 		if (!pointed_type ||
 		    !btf_type_is_struct(pointed_type)) {
 			pr_warn("stub function %s has %s tagging to an unsupported type\n",
-				stub_fname, MAYBE_NULL_SUFFIX);
+				stub_fname, suffix);
 			goto err_out;
 		}
 
@@ -254,11 +264,15 @@ static int prepare_arg_info(struct btf *btf,
 		}
 
 		/* Fill the information of the new argument */
-		info->reg_type =
-			PTR_TRUSTED | PTR_TO_BTF_ID | PTR_MAYBE_NULL;
 		info->btf_id = arg_btf_id;
 		info->btf = btf;
 		info->offset = offset;
+		if (is_nullable) {
+			info->reg_type = PTR_TRUSTED | PTR_TO_BTF_ID | PTR_MAYBE_NULL;
+		} else if (is_refcounted) {
+			info->reg_type = PTR_TRUSTED | PTR_TO_BTF_ID;
+			info->refcounted = true;
+		}
 
 		info++;
 		info_cnt++;
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 9de6acddd479..fd3470fbd144 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -6677,6 +6677,7 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
 			info->reg_type = ctx_arg_info->reg_type;
 			info->btf = ctx_arg_info->btf ? : btf_vmlinux;
 			info->btf_id = ctx_arg_info->btf_id;
+			info->ref_obj_id = ctx_arg_info->refcounted ? ctx_arg_info->ref_obj_id : 0;
 			return true;
 		}
 	}
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index a41ba019780f..a0f51903e977 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1543,6 +1543,17 @@ static void release_reference_state(struct bpf_verifier_state *state, int idx)
 	return;
 }
 
+static bool find_reference_state(struct bpf_verifier_state *state, int ptr_id)
+{
+	int i;
+
+	for (i = 0; i < state->acquired_refs; i++)
+		if (state->refs[i].id == ptr_id)
+			return true;
+
+	return false;
+}
+
 static int release_lock_state(struct bpf_verifier_state *state, int type, int id, void *ptr)
 {
 	int i;
@@ -5981,7 +5992,8 @@ static int check_packet_access(struct bpf_verifier_env *env, u32 regno, int off,
 /* check access to 'struct bpf_context' fields. Supports fixed offsets only */
 static int check_ctx_access(struct bpf_verifier_env *env, int insn_idx, int off, int size,
 			    enum bpf_access_type t, enum bpf_reg_type *reg_type,
-			    struct btf **btf, u32 *btf_id, bool *is_retval, bool is_ldsx)
+			    struct btf **btf, u32 *btf_id, bool *is_retval, bool is_ldsx,
+			    u32 *ref_obj_id)
 {
 	struct bpf_insn_access_aux info = {
 		.reg_type = *reg_type,
@@ -6003,8 +6015,16 @@ static int check_ctx_access(struct bpf_verifier_env *env, int insn_idx, int off,
 		*is_retval = info.is_retval;
 
 		if (base_type(*reg_type) == PTR_TO_BTF_ID) {
+			if (info.ref_obj_id &&
+			    !find_reference_state(env->cur_state, info.ref_obj_id)) {
+				verbose(env, "invalid bpf_context access off=%d. Reference may already be released\n",
+					off);
+				return -EACCES;
+			}
+
 			*btf = info.btf;
 			*btf_id = info.btf_id;
+			*ref_obj_id = info.ref_obj_id;
 		} else {
 			env->insn_aux_data[insn_idx].ctx_field_size = info.ctx_field_size;
 		}
@@ -7367,7 +7387,7 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
 		struct bpf_retval_range range;
 		enum bpf_reg_type reg_type = SCALAR_VALUE;
 		struct btf *btf = NULL;
-		u32 btf_id = 0;
+		u32 btf_id = 0, ref_obj_id = 0;
 
 		if (t == BPF_WRITE && value_regno >= 0 &&
 		    is_pointer_value(env, value_regno)) {
@@ -7380,7 +7400,7 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
 			return err;
 
 		err = check_ctx_access(env, insn_idx, off, size, t, &reg_type, &btf,
-				       &btf_id, &is_retval, is_ldsx);
+				       &btf_id, &is_retval, is_ldsx, &ref_obj_id);
 		if (err)
 			verbose_linfo(env, insn_idx, "; ");
 		if (!err && t == BPF_READ && value_regno >= 0) {
@@ -7411,6 +7431,7 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
 				if (base_type(reg_type) == PTR_TO_BTF_ID) {
 					regs[value_regno].btf = btf;
 					regs[value_regno].btf_id = btf_id;
+					regs[value_regno].ref_obj_id = ref_obj_id;
 				}
 			}
 			regs[value_regno].type = reg_type;
@@ -22148,6 +22169,7 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog)
 {
 	bool pop_log = !(env->log.level & BPF_LOG_LEVEL2);
 	struct bpf_subprog_info *sub = subprog_info(env, subprog);
+	struct bpf_prog_aux *aux = env->prog->aux;
 	struct bpf_verifier_state *state;
 	struct bpf_reg_state *regs;
 	int ret, i;
@@ -22255,6 +22277,13 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog)
 		mark_reg_known_zero(env, regs, BPF_REG_1);
 	}
 
+	/* Acquire references for struct_ops program arguments tagged with "__ref" */
+	if (!subprog && env->prog->type == BPF_PROG_TYPE_STRUCT_OPS) {
+		for (i = 0; i < aux->ctx_arg_info_size; i++)
+			aux->ctx_arg_info[i].ref_obj_id = aux->ctx_arg_info[i].refcounted ?
+							  acquire_reference(env, 0) : 0;
+	}
+
 	ret = do_check(env);
 out:
 	/* check for NULL is necessary, since cur_state can be freed inside

From patchwork Fri Jan 31 19:28:42 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 13955695
X-Patchwork-Delegate: bpf@iogearbox.net
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org,
 alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org,
 edumazet@google.com, xiyou.wangcong@gmail.com, cong.wang@bytedance.com,
 jhs@mojatatu.com, sinquersw@gmail.com, toke@redhat.com, jiri@resnulli.us,
 stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br,
 yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com,
 ming.lei@redhat.com, kernel-team@meta.com
Subject: [PATCH bpf-next v3 03/18] selftests/bpf: Test referenced kptr arguments of struct_ops programs
Date: Fri, 31 Jan 2025 11:28:42 -0800
Message-ID: <20250131192912.133796-4-ameryhung@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250131192912.133796-1-ameryhung@gmail.com>
References: <20250131192912.133796-1-ameryhung@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

From: Amery Hung

Test referenced kptr acquired through struct_ops argument tagged with
"__ref".

The success case checks whether 1) a reference to the correct type is
acquired, and 2) the referenced kptr argument can be accessed in
multiple paths as long as it hasn't been released.
In the fail cases, we first confirm that a referenced kptr acquired
through a struct_ops argument is not allowed to be leaked. Then, we make
sure this new referenced kptr acquiring mechanism does not accidentally
allow referenced kptrs to flow into global subprograms through their
arguments.

Signed-off-by: Amery Hung
Acked-by: Eduard Zingerman
---
 .../prog_tests/test_struct_ops_refcounted.c   | 12 ++++++
 .../bpf/progs/struct_ops_refcounted.c         | 31 +++++++++++++++
 ...ruct_ops_refcounted_fail__global_subprog.c | 39 +++++++++++++++++++
 .../struct_ops_refcounted_fail__ref_leak.c    | 22 +++++++++++
 .../selftests/bpf/test_kmods/bpf_testmod.c    |  7 ++++
 .../selftests/bpf/test_kmods/bpf_testmod.h    |  2 +
 6 files changed, 113 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/test_struct_ops_refcounted.c
 create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_refcounted.c
 create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_refcounted_fail__global_subprog.c
 create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_refcounted_fail__ref_leak.c

diff --git a/tools/testing/selftests/bpf/prog_tests/test_struct_ops_refcounted.c b/tools/testing/selftests/bpf/prog_tests/test_struct_ops_refcounted.c
new file mode 100644
index 000000000000..e290a2f6db95
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/test_struct_ops_refcounted.c
@@ -0,0 +1,12 @@
+#include
+
+#include "struct_ops_refcounted.skel.h"
+#include "struct_ops_refcounted_fail__ref_leak.skel.h"
+#include "struct_ops_refcounted_fail__global_subprog.skel.h"
+
+void test_struct_ops_refcounted(void)
+{
+	RUN_TESTS(struct_ops_refcounted);
+	RUN_TESTS(struct_ops_refcounted_fail__ref_leak);
+	RUN_TESTS(struct_ops_refcounted_fail__global_subprog);
+}
diff --git a/tools/testing/selftests/bpf/progs/struct_ops_refcounted.c b/tools/testing/selftests/bpf/progs/struct_ops_refcounted.c
new file mode 100644
index 000000000000..76dcb6089d7f
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/struct_ops_refcounted.c
@@ -0,0 +1,31 @@
+#include
+#include
+#include "../test_kmods/bpf_testmod.h"
+#include "bpf_misc.h"
+
+char _license[] SEC("license") = "GPL";
+
+__attribute__((nomerge)) extern void bpf_task_release(struct task_struct *p) __ksym;
+
+/* This is a test BPF program that uses struct_ops to access a referenced
+ * kptr argument. This is a test for the verifier to ensure that it
+ * 1) recognizes the task as a referenced object (i.e., ref_obj_id > 0), and
+ * 2) the same reference can be acquired from multiple paths as long as it
+ *    has not been released.
+ */
+SEC("struct_ops/test_refcounted")
+int BPF_PROG(refcounted, int dummy, struct task_struct *task)
+{
+	if (dummy == 1)
+		bpf_task_release(task);
+	else
+		bpf_task_release(task);
+	return 0;
+}
+
+SEC(".struct_ops.link")
+struct bpf_testmod_ops testmod_refcounted = {
+	.test_refcounted = (void *)refcounted,
+};
+
+
diff --git a/tools/testing/selftests/bpf/progs/struct_ops_refcounted_fail__global_subprog.c b/tools/testing/selftests/bpf/progs/struct_ops_refcounted_fail__global_subprog.c
new file mode 100644
index 000000000000..ae074aa62852
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/struct_ops_refcounted_fail__global_subprog.c
@@ -0,0 +1,39 @@
+#include
+#include
+#include "../test_kmods/bpf_testmod.h"
+#include "bpf_misc.h"
+
+char _license[] SEC("license") = "GPL";
+
+extern void bpf_task_release(struct task_struct *p) __ksym;
+
+__noinline int subprog_release(__u64 *ctx __arg_ctx)
+{
+	struct task_struct *task = (struct task_struct *)ctx[1];
+	int dummy = (int)ctx[0];
+
+	bpf_task_release(task);
+
+	return dummy + 1;
+}
+
+/* Test that the verifier rejects a program that contains a global
+ * subprogram with referenced kptr arguments
+ */
+SEC("struct_ops/test_refcounted")
+__failure __log_level(2)
+__msg("Validating subprog_release() func#1...")
+__msg("invalid bpf_context access off=8. Reference may already be released")
+int refcounted_fail__global_subprog(unsigned long long *ctx)
+{
+	struct task_struct *task = (struct task_struct *)ctx[1];
+
+	bpf_task_release(task);
+
+	return subprog_release(ctx);
+}
+
+SEC(".struct_ops.link")
+struct bpf_testmod_ops testmod_ref_acquire = {
+	.test_refcounted = (void *)refcounted_fail__global_subprog,
+};
diff --git a/tools/testing/selftests/bpf/progs/struct_ops_refcounted_fail__ref_leak.c b/tools/testing/selftests/bpf/progs/struct_ops_refcounted_fail__ref_leak.c
new file mode 100644
index 000000000000..e945b1a04294
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/struct_ops_refcounted_fail__ref_leak.c
@@ -0,0 +1,22 @@
+#include
+#include
+#include "../test_kmods/bpf_testmod.h"
+#include "bpf_misc.h"
+
+char _license[] SEC("license") = "GPL";
+
+/* Test that the verifier rejects a program that acquires a referenced
+ * kptr through context without releasing the reference
+ */
+SEC("struct_ops/test_refcounted")
+__failure __msg("Unreleased reference id=1 alloc_insn=0")
+int BPF_PROG(refcounted_fail__ref_leak, int dummy,
+	     struct task_struct *task)
+{
+	return 0;
+}
+
+SEC(".struct_ops.link")
+struct bpf_testmod_ops testmod_ref_acquire = {
+	.test_refcounted = (void *)refcounted_fail__ref_leak,
+};
diff --git a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
index cc9dde507aba..802cbd871035 100644
--- a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
@@ -1176,10 +1176,17 @@ static int bpf_testmod_ops__test_maybe_null(int dummy,
 	return 0;
 }
 
+static int bpf_testmod_ops__test_refcounted(int dummy,
+					    struct task_struct *task__ref)
+{
+	return 0;
+}
+
 static struct bpf_testmod_ops __bpf_testmod_ops = {
 	.test_1 = bpf_testmod_test_1,
 	.test_2 = bpf_testmod_test_2,
 	.test_maybe_null = bpf_testmod_ops__test_maybe_null,
+	.test_refcounted = bpf_testmod_ops__test_refcounted,
 };
 
 struct bpf_struct_ops bpf_bpf_testmod_ops = {
diff --git a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.h b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.h
index 356803d1c10e..c57b2f9dab10 100644
--- a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.h
+++ b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.h
@@ -36,6 +36,8 @@ struct bpf_testmod_ops {
 	/* Used to test nullable arguments. */
 	int (*test_maybe_null)(int dummy, struct task_struct *task);
 	int (*unsupported_ops)(void);
+	/* Used to test ref_acquired arguments. */
+	int (*test_refcounted)(int dummy, struct task_struct *task);
 
 	/* The following fields are used to test shadow copies. */
 	char onebyte;

From patchwork Fri Jan 31 19:28:43 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 13955696
X-Patchwork-Delegate: bpf@iogearbox.net
ARC-Authentication-Results: i=1; 
smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=UcEBfJPh; arc=none smtp.client-ip=209.85.216.44 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="UcEBfJPh" Received: by mail-pj1-f44.google.com with SMTP id 98e67ed59e1d1-2ee786b3277so3197862a91.1; Fri, 31 Jan 2025 11:29:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1738351766; x=1738956566; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=SZH8+9xA7t+NvI67ZN2R2VonRxu9XPdlzrbFaeXMWx8=; b=UcEBfJPhra/92pSZBdtMCMfxnz32e3wrZlLm0n+INXiXyvGAQIplB7yANZH7XSBv06 3KbihvG5ZjA874EpW230ZyCgrOhGM0QMzc1xKcPTtwWSnJo6aKb8TS+qwWVIufTmeFRg T5OLkjc/Ib9MLdr/lvA7GK/viMSYbFX82hR1G2YJzsANfZ2VMfFwEqX6h0nQKtB+OeGf 1prRaCwideZ8n1t3t+oTqi5e01TOk6O2WEk/YH1fTM1pfzcLRSINJTLsqGPPPIouSZzF +RQH/JK0xVCBNkjBbBu9O3muaKV+u/c9yDOmY2H3F1rgBoncgYWGVnE9yKp1/zs+HLea owkg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1738351766; x=1738956566; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=SZH8+9xA7t+NvI67ZN2R2VonRxu9XPdlzrbFaeXMWx8=; b=MlV8kh8IcoG6PYjQgszDaJN59qNrfotgsaGO4HRBpBPzMm36iiHz6NNHOyeL2NSKoC iwSqjbapfUzv1nstu7UkfjCgIOTYS96asEcplADIR31RAA0vpdlOSWZqeL9hbkHt0BHv wSAXLKbgDzMexaraUipRpt+XXqUvP66dB3y91Ri0NUL7J6RD5GbJJoXcGzRIpcRL8naH ccEco2HM44KJhCUmX3I3mINJG0H2Y2fj/VfLzu2hB4cPo3O0S9fbeXEbPS1TNADf7uhd 
bkW3Pu662hZde1WBhfqBAhdOhIQq6zMefIR+Lxtbc11kfvPACvCe3WZ4utxt+ALK1cWQ vvng== X-Gm-Message-State: AOJu0YxSTCujHZxZn2kdGGHGrPlFvg5Fb3tlf1UxirevrQFVdTU+ZKPg /AEc9iHlydAvMqF03ECFSuBkDDUsKoRwtrTi3eZxo3gjmPbX14q77NDd53Ab39I= X-Gm-Gg: ASbGnctMEQp2UfqBjfqdSAA9u/N35WQCDvzSDITKda7bWg77c1H0HhwvhvS1kNh6Wr6 naHSZyor/vtO1Tmo6fAr+VO/3QQTdud+aM60tVyFfOlN6vEaAMHvwXuClZK5EMHNNDg5lTHncyW SjX7uVzdLm+3aDalX14Va90woRLq1o6vJu9saN1BD0MTFpIM9neLFl6POZimeqf8uQAbTCgJmes TTDUjkKpqalUlDRxp46AGpqQ6nlRUZynt+RYdIis4f26xHcOAVwJMnnarSDXMFoMdt/LUES4Ccj btBcOEFDZghnuAFWdS/OiyksxMq4WOYy65GKiA9zJk5TXR9ecTPJAYLAba0tiTtLaA== X-Google-Smtp-Source: AGHT+IHo8X/v7PwhwXN4Mn6YUUFPQw54ECamlgVdmI1BeAMioMj/lvyh7I6iv1+6HPCjotXyPEOS7w== X-Received: by 2002:a17:90b:5150:b0:2ee:c2b5:97a0 with SMTP id 98e67ed59e1d1-2f83ac72363mr17383985a91.25.1738351766066; Fri, 31 Jan 2025 11:29:26 -0800 (PST) Received: from localhost.localdomain (c-76-146-13-146.hsd1.wa.comcast.net. [76.146.13.146]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2f8489d3707sm4072471a91.23.2025.01.31.11.29.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 31 Jan 2025 11:29:25 -0800 (PST) From: Amery Hung To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org, alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org, edumazet@google.com, xiyou.wangcong@gmail.com, cong.wang@bytedance.com, jhs@mojatatu.com, sinquersw@gmail.com, toke@redhat.com, jiri@resnulli.us, stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br, yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com, ming.lei@redhat.com, kernel-team@meta.com Subject: [PATCH bpf-next v3 04/18] bpf: Allow struct_ops prog to return referenced kptr Date: Fri, 31 Jan 2025 11:28:43 -0800 Message-ID: <20250131192912.133796-5-ameryhung@gmail.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250131192912.133796-1-ameryhung@gmail.com> References: <20250131192912.133796-1-ameryhung@gmail.com> Precedence: bulk X-Mailing-List: 
bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net From: Amery Hung Allow a struct_ops program to return a referenced kptr if the struct_ops operator's return type is a struct pointer. To make sure the returned pointer continues to be valid in the kernel, several constraints are required: 1) The type of the pointer must matches the return type 2) The pointer originally comes from the kernel (not locally allocated) 3) The pointer is in its unmodified form Implementation wise, a referenced kptr first needs to be allowed to _leak_ in check_reference_leak() if it is in the return register. Then, in check_return_code(), constraints 1-3 are checked. During struct_ops registration, a check is also added to warn about operators with non-struct pointer return. In addition, since the first user, Qdisc_ops::dequeue, allows a NULL pointer to be returned when there is no skb to be dequeued, we will allow a scalar value with value equals to NULL to be returned. In the future when there is a struct_ops user that always expects a valid pointer to be returned from an operator, we may extend tagging to the return value. We can tell the verifier to only allow NULL pointer return if the return value is tagged with MAY_BE_NULL. 
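As a rough illustration of constraints 1-3 and the NULL special case, the
return-value rule can be modeled in plain userspace C. This is only a
sketch: struct ret_reg, ret_allowed() and the numeric type ids are
hypothetical stand-ins invented for this example, not verifier data
structures. The real checks live in check_return_code() and
__check_ptr_off_reg(), and the "local" flag below simplifies how the
verifier actually tells a locally allocated object apart.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of what the return register holds. */
enum ret_kind { RET_SCALAR, RET_KPTR };

struct ret_reg {
	enum ret_kind kind;
	long scalar;     /* value when kind == RET_SCALAR */
	int type_id;     /* pointee type when kind == RET_KPTR (made-up ids) */
	int ref_obj_id;  /* non-zero when the pointer is a referenced kptr */
	bool local;      /* true when allocated by the program itself */
	int off;         /* offset applied to the original pointer */
};

/* Return true if the modeled register may be returned from an operator
 * whose return type is a pointer to the struct identified by
 * expected_type_id.
 */
static bool ret_allowed(const struct ret_reg *r, int expected_type_id)
{
	if (r->kind == RET_SCALAR)
		return r->scalar == 0;           /* only NULL is accepted */
	if (r->local)
		return false;                    /* constraint 2 */
	return r->ref_obj_id != 0 &&
	       r->type_id == expected_type_id && /* constraint 1 */
	       r->off == 0;                      /* constraint 3 */
}
```

Under this model, an unmodified referenced pointer of the expected type
and a scalar zero are accepted, while a non-zero scalar, a pointer of
another type, a locally allocated object, or a pointer with a non-zero
offset are rejected.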

Signed-off-by: Amery Hung
Acked-by: Eduard Zingerman
---
 kernel/bpf/bpf_struct_ops.c | 12 +++++++++++-
 kernel/bpf/verifier.c       | 36 ++++++++++++++++++++++++++++++++----
 2 files changed, 43 insertions(+), 5 deletions(-)

diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
index 68df8d8b6db3..8df5e8045d07 100644
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -389,7 +389,7 @@ int bpf_struct_ops_desc_init(struct bpf_struct_ops_desc *st_ops_desc,
 	st_ops_desc->value_type = btf_type_by_id(btf, value_id);
 
 	for_each_member(i, t, member) {
-		const struct btf_type *func_proto;
+		const struct btf_type *func_proto, *ret_type;
 		void **stub_func_addr;
 		u32 moff;
 
@@ -426,6 +426,16 @@ int bpf_struct_ops_desc_init(struct bpf_struct_ops_desc *st_ops_desc,
 		if (!func_proto || bpf_struct_ops_supported(st_ops, moff))
 			continue;
 
+		if (func_proto->type) {
+			ret_type = btf_type_resolve_ptr(btf, func_proto->type, NULL);
+			if (ret_type && !__btf_type_is_struct(ret_type)) {
+				pr_warn("func ptr %s in struct %s returns non-struct pointer, which is not supported\n",
+					mname, st_ops->name);
+				err = -EOPNOTSUPP;
+				goto errout;
+			}
+		}
+
 		if (btf_distill_func_proto(log, btf,
 					   func_proto, mname,
 					   &st_ops->func_models[i])) {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index a0f51903e977..5bcf095e8d0c 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -10758,6 +10758,8 @@ record_func_key(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta,
 static int check_reference_leak(struct bpf_verifier_env *env, bool exception_exit)
 {
 	struct bpf_verifier_state *state = env->cur_state;
+	enum bpf_prog_type type = resolve_prog_type(env->prog);
+	struct bpf_reg_state *reg = reg_state(env, BPF_REG_0);
 	bool refs_lingering = false;
 	int i;
 
@@ -10767,6 +10769,12 @@ static int check_reference_leak(struct bpf_verifier_env *env, bool exception_exi
 	for (i = 0; i < state->acquired_refs; i++) {
 		if (state->refs[i].type != REF_TYPE_PTR)
 			continue;
+		/* Allow struct_ops programs to return a referenced kptr back to
+		 * kernel. Type checks are performed later in check_return_code.
+		 */
+		if (type == BPF_PROG_TYPE_STRUCT_OPS && !exception_exit &&
+		    reg->ref_obj_id == state->refs[i].id)
+			continue;
 		verbose(env, "Unreleased reference id=%d alloc_insn=%d\n",
 			state->refs[i].id, state->refs[i].insn_idx);
 		refs_lingering = true;
@@ -16405,13 +16413,14 @@ static int check_return_code(struct bpf_verifier_env *env, int regno, const char
 	const char *exit_ctx = "At program exit";
 	struct tnum enforce_attach_type_range = tnum_unknown;
 	const struct bpf_prog *prog = env->prog;
-	struct bpf_reg_state *reg;
+	struct bpf_reg_state *reg = reg_state(env, regno);
 	struct bpf_retval_range range = retval_range(0, 1);
 	enum bpf_prog_type prog_type = resolve_prog_type(env->prog);
 	int err;
 	struct bpf_func_state *frame = env->cur_state->frame[0];
 	const bool is_subprog = frame->subprogno;
 	bool return_32bit = false;
+	const struct btf_type *reg_type, *ret_type = NULL;
 
 	/* LSM and struct_ops func-ptr's return type could be "void" */
 	if (!is_subprog || frame->in_exception_callback_fn) {
@@ -16420,10 +16429,26 @@ static int check_return_code(struct bpf_verifier_env *env, int regno, const char
 			if (prog->expected_attach_type == BPF_LSM_CGROUP)
 				/* See below, can be 0 or 0-1 depending on hook. */
 				break;
-			fallthrough;
+			if (!prog->aux->attach_func_proto->type)
+				return 0;
+			break;
 		case BPF_PROG_TYPE_STRUCT_OPS:
 			if (!prog->aux->attach_func_proto->type)
 				return 0;
+
+			if (frame->in_exception_callback_fn)
+				break;
+
+			/* Allow a struct_ops program to return a referenced kptr if it
+			 * matches the operator's return type and is in its unmodified
+			 * form. A scalar zero (i.e., a null pointer) is also allowed.
+			 */
+			reg_type = reg->btf ? btf_type_by_id(reg->btf, reg->btf_id) : NULL;
+			ret_type = btf_type_resolve_ptr(prog->aux->attach_btf,
+							prog->aux->attach_func_proto->type,
+							NULL);
+			if (ret_type && ret_type == reg_type && reg->ref_obj_id)
+				return __check_ptr_off_reg(env, reg, regno, false);
 			break;
 		default:
 			break;
@@ -16445,8 +16470,6 @@ static int check_return_code(struct bpf_verifier_env *env, int regno, const char
 		return -EACCES;
 	}
 
-	reg = cur_regs(env) + regno;
-
 	if (frame->in_async_callback_fn) {
 		/* enforce return zero from async callbacks like timer */
 		exit_ctx = "At async callback return";
@@ -16545,6 +16568,11 @@ static int check_return_code(struct bpf_verifier_env *env, int regno, const char
 	case BPF_PROG_TYPE_NETFILTER:
 		range = retval_range(NF_DROP, NF_ACCEPT);
 		break;
+	case BPF_PROG_TYPE_STRUCT_OPS:
+		if (!ret_type)
+			return 0;
+		range = retval_range(0, 0);
+		break;
 	case BPF_PROG_TYPE_EXT:
 		/* freplace program can return anything as its return value
 		 * depends on the to-be-replaced kernel func or bpf program.

From patchwork Fri Jan 31 19:28:44 2025
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 13955697
X-Patchwork-Delegate: bpf@iogearbox.net
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org,
 alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org,
 edumazet@google.com, xiyou.wangcong@gmail.com, cong.wang@bytedance.com,
 jhs@mojatatu.com, sinquersw@gmail.com, toke@redhat.com, jiri@resnulli.us,
 stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br,
 yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com,
 ming.lei@redhat.com, kernel-team@meta.com
Subject: [PATCH bpf-next v3 05/18] selftests/bpf: Test returning referenced kptr from struct_ops programs
Date: Fri, 31 Jan 2025 11:28:44 -0800
Message-ID: <20250131192912.133796-6-ameryhung@gmail.com>
In-Reply-To: <20250131192912.133796-1-ameryhung@gmail.com>
References: <20250131192912.133796-1-ameryhung@gmail.com>

From: Amery Hung

Test struct_ops programs returning a referenced kptr. When the return type
of a struct_ops operator is a pointer to a struct, the verifier should
only allow programs that return a scalar NULL or a non-local kptr with the
correct type in its unmodified form.

Signed-off-by: Amery Hung
Acked-by: Eduard Zingerman
---
 .../prog_tests/test_struct_ops_kptr_return.c  | 16 +++++++++
 .../bpf/progs/struct_ops_kptr_return.c        | 30 ++++++++++++++++
 ...uct_ops_kptr_return_fail__invalid_scalar.c | 26 ++++++++++++++
 .../struct_ops_kptr_return_fail__local_kptr.c | 34 +++++++++++++++++++
 ...uct_ops_kptr_return_fail__nonzero_offset.c | 25 ++++++++++++++
 .../struct_ops_kptr_return_fail__wrong_type.c | 30 ++++++++++++++++
 .../selftests/bpf/test_kmods/bpf_testmod.c    |  8 +++++
 .../selftests/bpf/test_kmods/bpf_testmod.h    |  4 +++
 8 files changed, 173 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/test_struct_ops_kptr_return.c
 create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_kptr_return.c
 create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_kptr_return_fail__invalid_scalar.c
 create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_kptr_return_fail__local_kptr.c
 create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_kptr_return_fail__nonzero_offset.c
 create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_kptr_return_fail__wrong_type.c

diff --git a/tools/testing/selftests/bpf/prog_tests/test_struct_ops_kptr_return.c b/tools/testing/selftests/bpf/prog_tests/test_struct_ops_kptr_return.c
new file mode 100644
index 000000000000..467cc72a3588
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/test_struct_ops_kptr_return.c
@@ -0,0 +1,16 @@
+#include
+
+#include "struct_ops_kptr_return.skel.h"
+#include "struct_ops_kptr_return_fail__wrong_type.skel.h"
+#include "struct_ops_kptr_return_fail__invalid_scalar.skel.h"
+#include "struct_ops_kptr_return_fail__nonzero_offset.skel.h"
+#include "struct_ops_kptr_return_fail__local_kptr.skel.h"
+
+void test_struct_ops_kptr_return(void)
+{
+	RUN_TESTS(struct_ops_kptr_return);
+	RUN_TESTS(struct_ops_kptr_return_fail__wrong_type);
+	RUN_TESTS(struct_ops_kptr_return_fail__invalid_scalar);
+	RUN_TESTS(struct_ops_kptr_return_fail__nonzero_offset);
+	RUN_TESTS(struct_ops_kptr_return_fail__local_kptr);
+}
diff --git a/tools/testing/selftests/bpf/progs/struct_ops_kptr_return.c b/tools/testing/selftests/bpf/progs/struct_ops_kptr_return.c
new file mode 100644
index 000000000000..36386b3c23a1
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/struct_ops_kptr_return.c
@@ -0,0 +1,30 @@
+#include
+#include
+#include "../test_kmods/bpf_testmod.h"
+#include "bpf_misc.h"
+
+char _license[] SEC("license") = "GPL";
+
+void bpf_task_release(struct task_struct *p) __ksym;
+
+/* This tests struct_ops BPF programs returning referenced kptr. The verifier should
+ * allow a referenced kptr or a NULL pointer to be returned. A referenced kptr to task
+ * here is acquired automatically as the task argument is tagged with "__ref".
+ */
+SEC("struct_ops/test_return_ref_kptr")
+struct task_struct *BPF_PROG(kptr_return, int dummy,
+			     struct task_struct *task, struct cgroup *cgrp)
+{
+	if (dummy % 2) {
+		bpf_task_release(task);
+		return NULL;
+	}
+	return task;
+}
+
+SEC(".struct_ops.link")
+struct bpf_testmod_ops testmod_kptr_return = {
+	.test_return_ref_kptr = (void *)kptr_return,
+};
+
+
diff --git a/tools/testing/selftests/bpf/progs/struct_ops_kptr_return_fail__invalid_scalar.c b/tools/testing/selftests/bpf/progs/struct_ops_kptr_return_fail__invalid_scalar.c
new file mode 100644
index 000000000000..caeea158ef69
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/struct_ops_kptr_return_fail__invalid_scalar.c
@@ -0,0 +1,26 @@
+#include
+#include
+#include "../test_kmods/bpf_testmod.h"
+#include "bpf_misc.h"
+
+char _license[] SEC("license") = "GPL";
+
+struct cgroup *bpf_cgroup_acquire(struct cgroup *p) __ksym;
+void bpf_task_release(struct task_struct *p) __ksym;
+
+/* This tests struct_ops BPF programs returning referenced kptr. The verifier should
+ * reject programs returning a non-zero scalar value.
+ */
+SEC("struct_ops/test_return_ref_kptr")
+__failure __msg("At program exit the register R0 has smin=1 smax=1 should have been in [0, 0]")
+struct task_struct *BPF_PROG(kptr_return_fail__invalid_scalar, int dummy,
+			     struct task_struct *task, struct cgroup *cgrp)
+{
+	bpf_task_release(task);
+	return (struct task_struct *)1;
+}
+
+SEC(".struct_ops.link")
+struct bpf_testmod_ops testmod_kptr_return = {
+	.test_return_ref_kptr = (void *)kptr_return_fail__invalid_scalar,
+};
diff --git a/tools/testing/selftests/bpf/progs/struct_ops_kptr_return_fail__local_kptr.c b/tools/testing/selftests/bpf/progs/struct_ops_kptr_return_fail__local_kptr.c
new file mode 100644
index 000000000000..b8b4f05c3d7f
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/struct_ops_kptr_return_fail__local_kptr.c
@@ -0,0 +1,34 @@
+#include
+#include
+#include "../test_kmods/bpf_testmod.h"
+#include "bpf_experimental.h"
+#include "bpf_misc.h"
+
+char _license[] SEC("license") = "GPL";
+
+struct cgroup *bpf_cgroup_acquire(struct cgroup *p) __ksym;
+void bpf_task_release(struct task_struct *p) __ksym;
+
+/* This tests struct_ops BPF programs returning referenced kptr. The verifier should
+ * reject programs returning a local kptr.
+ */
+SEC("struct_ops/test_return_ref_kptr")
+__failure __msg("At program exit the register R0 is not a known value (ptr_or_null_)")
+struct task_struct *BPF_PROG(kptr_return_fail__local_kptr, int dummy,
+			     struct task_struct *task, struct cgroup *cgrp)
+{
+	struct task_struct *t;
+
+	bpf_task_release(task);
+
+	t = bpf_obj_new(typeof(*task));
+	if (!t)
+		return NULL;
+
+	return t;
+}
+
+SEC(".struct_ops.link")
+struct bpf_testmod_ops testmod_kptr_return = {
+	.test_return_ref_kptr = (void *)kptr_return_fail__local_kptr,
+};
diff --git a/tools/testing/selftests/bpf/progs/struct_ops_kptr_return_fail__nonzero_offset.c b/tools/testing/selftests/bpf/progs/struct_ops_kptr_return_fail__nonzero_offset.c
new file mode 100644
index 000000000000..7ddeb28c2329
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/struct_ops_kptr_return_fail__nonzero_offset.c
@@ -0,0 +1,25 @@
+#include
+#include
+#include "../test_kmods/bpf_testmod.h"
+#include "bpf_misc.h"
+
+char _license[] SEC("license") = "GPL";
+
+struct cgroup *bpf_cgroup_acquire(struct cgroup *p) __ksym;
+void bpf_task_release(struct task_struct *p) __ksym;
+
+/* This tests struct_ops BPF programs returning referenced kptr. The verifier should
+ * reject programs returning a modified referenced kptr.
+ */
+SEC("struct_ops/test_return_ref_kptr")
+__failure __msg("dereference of modified trusted_ptr_ ptr R0 off={{[0-9]+}} disallowed")
+struct task_struct *BPF_PROG(kptr_return_fail__nonzero_offset, int dummy,
+			     struct task_struct *task, struct cgroup *cgrp)
+{
+	return (struct task_struct *)&task->jobctl;
+}
+
+SEC(".struct_ops.link")
+struct bpf_testmod_ops testmod_kptr_return = {
+	.test_return_ref_kptr = (void *)kptr_return_fail__nonzero_offset,
+};
diff --git a/tools/testing/selftests/bpf/progs/struct_ops_kptr_return_fail__wrong_type.c b/tools/testing/selftests/bpf/progs/struct_ops_kptr_return_fail__wrong_type.c
new file mode 100644
index 000000000000..6a2dd5367802
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/struct_ops_kptr_return_fail__wrong_type.c
@@ -0,0 +1,30 @@
+#include
+#include
+#include "../test_kmods/bpf_testmod.h"
+#include "bpf_misc.h"
+
+char _license[] SEC("license") = "GPL";
+
+struct cgroup *bpf_cgroup_acquire(struct cgroup *p) __ksym;
+void bpf_task_release(struct task_struct *p) __ksym;
+
+/* This tests struct_ops BPF programs returning referenced kptr. The verifier should
+ * reject programs returning a referenced kptr of the wrong type.
+ */
+SEC("struct_ops/test_return_ref_kptr")
+__failure __msg("At program exit the register R0 is not a known value (ptr_or_null_)")
+struct task_struct *BPF_PROG(kptr_return_fail__wrong_type, int dummy,
+			     struct task_struct *task, struct cgroup *cgrp)
+{
+	struct task_struct *ret;
+
+	ret = (struct task_struct *)bpf_cgroup_acquire(cgrp);
+	bpf_task_release(task);
+
+	return ret;
+}
+
+SEC(".struct_ops.link")
+struct bpf_testmod_ops testmod_kptr_return = {
+	.test_return_ref_kptr = (void *)kptr_return_fail__wrong_type,
+};
diff --git a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
index 802cbd871035..89dc502de9d4 100644
--- a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
@@ -1182,11 +1182,19 @@ static int bpf_testmod_ops__test_refcounted(int dummy,
 	return 0;
 }
 
+static struct task_struct *
+bpf_testmod_ops__test_return_ref_kptr(int dummy, struct task_struct *task__ref,
+				      struct cgroup *cgrp)
+{
+	return NULL;
+}
+
 static struct bpf_testmod_ops __bpf_testmod_ops = {
 	.test_1 = bpf_testmod_test_1,
 	.test_2 = bpf_testmod_test_2,
 	.test_maybe_null = bpf_testmod_ops__test_maybe_null,
 	.test_refcounted = bpf_testmod_ops__test_refcounted,
+	.test_return_ref_kptr = bpf_testmod_ops__test_return_ref_kptr,
 };
 
 struct bpf_struct_ops bpf_bpf_testmod_ops = {
diff --git a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.h b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.h
index c57b2f9dab10..c9fab51f16e2 100644
--- a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.h
+++ b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.h
@@ -6,6 +6,7 @@
 #include
 
 struct task_struct;
+struct cgroup;
 
 struct bpf_testmod_test_read_ctx {
 	char *buf;
@@ -38,6 +39,9 @@ struct bpf_testmod_ops {
 	int (*unsupported_ops)(void);
 	/* Used to test ref_acquired arguments. */
 	int (*test_refcounted)(int dummy, struct task_struct *task);
+	/* Used to test returning referenced kptr. */
+	struct task_struct *(*test_return_ref_kptr)(int dummy, struct task_struct *task,
+						   struct cgroup *cgrp);
 
 	/* The following fields are used to test shadow copies. */
 	char onebyte;

From patchwork Fri Jan 31 19:28:45 2025
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 13955698
X-Patchwork-Delegate: bpf@iogearbox.net
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org,
 alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org,
 edumazet@google.com, xiyou.wangcong@gmail.com, cong.wang@bytedance.com,
 jhs@mojatatu.com, sinquersw@gmail.com, toke@redhat.com, jiri@resnulli.us,
 stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br,
 yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com,
 ming.lei@redhat.com, kernel-team@meta.com
Subject: [PATCH bpf-next v3 06/18] bpf: Prepare to reuse get_ctx_arg_idx
Date: Fri, 31 Jan 2025 11:28:45 -0800
Message-ID: <20250131192912.133796-7-ameryhung@gmail.com>
In-Reply-To: <20250131192912.133796-1-ameryhung@gmail.com>
References: <20250131192912.133796-1-ameryhung@gmail.com>

Rename get_ctx_arg_idx to btf_ctx_arg_idx, and allow others to call it.
No functional change.
Signed-off-by: Amery Hung --- include/linux/btf.h | 1 + kernel/bpf/btf.c | 6 +++--- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/include/linux/btf.h b/include/linux/btf.h index 2a08a2b55592..ce057c6b3947 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -519,6 +519,7 @@ bool btf_param_match_suffix(const struct btf *btf, const char *suffix); int btf_ctx_arg_offset(const struct btf *btf, const struct btf_type *func_proto, u32 arg_no); +u32 btf_ctx_arg_idx(struct btf *btf, const struct btf_type *func_proto, int off); struct bpf_verifier_log; diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index fd3470fbd144..ca5779f6961b 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -6370,8 +6370,8 @@ static bool is_int_ptr(struct btf *btf, const struct btf_type *t) return btf_type_is_int(t); } -static u32 get_ctx_arg_idx(struct btf *btf, const struct btf_type *func_proto, - int off) +u32 btf_ctx_arg_idx(struct btf *btf, const struct btf_type *func_proto, + int off) { const struct btf_param *args; const struct btf_type *t; @@ -6549,7 +6549,7 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type, tname, off); return false; } - arg = get_ctx_arg_idx(btf, t, off); + arg = btf_ctx_arg_idx(btf, t, off); args = (const struct btf_param *)(t + 1); /* if (t == NULL) Fall back to default BPF prog with * MAX_BPF_FUNC_REG_ARGS u64 arguments. 
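For readers following the series, the lookup that btf_ctx_arg_idx() exposes (turning a byte offset into the program context into an argument index) can be modelled in userspace. This is only a simplified sketch, not the kernel code: struct model_func_proto, slot_bytes() and model_ctx_arg_idx() are hypothetical names, and the model assumes each argument occupies a whole number of 8-byte context slots, with the return value stored after the last argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a BTF function prototype: only the
 * byte size of each argument and of the return value is modelled.
 */
struct model_func_proto {
	size_t nr_args;
	const uint32_t *arg_size;	/* size in bytes of each argument */
	uint32_t ret_size;		/* 0 models a void return type */
};

/* Assumption: every value occupies a whole number of 8-byte slots. */
static uint32_t slot_bytes(uint32_t size)
{
	return (size + 7u) & ~7u;
}

/* Map a byte offset into the context to an argument index.
 * Returns nr_args for the return-value slot and nr_args + 1 when the
 * offset lies past everything the prototype describes.
 */
static uint32_t model_ctx_arg_idx(const struct model_func_proto *p, uint32_t off)
{
	uint32_t offset = 0;
	size_t i;

	assert(p != NULL);
	for (i = 0; i < p->nr_args; i++) {
		offset += slot_bytes(p->arg_size[i]);
		if (off < offset)
			return (uint32_t)i;
	}

	offset += slot_bytes(p->ret_size);
	if (off < offset)
		return (uint32_t)p->nr_args;

	return (uint32_t)p->nr_args + 1;
}
```

With three pointer-sized arguments, as in an enqueue-style callback, offsets 0, 8 and 16 map to arguments 0, 1 and 2 in this model, and offset 24 maps to the return-value slot.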
From patchwork Fri Jan 31 19:28:46 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 13955699
X-Patchwork-Delegate: bpf@iogearbox.net
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org, alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org, edumazet@google.com, xiyou.wangcong@gmail.com, cong.wang@bytedance.com, jhs@mojatatu.com, sinquersw@gmail.com, toke@redhat.com, jiri@resnulli.us, stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br, yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com, ming.lei@redhat.com, kernel-team@meta.com
Subject: [PATCH bpf-next v3 07/18] bpf: Generalize finding member offset of struct_ops prog
Date: Fri, 31 Jan 2025 11:28:46 -0800
Message-ID: <20250131192912.133796-8-ameryhung@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250131192912.133796-1-ameryhung@gmail.com>
References: <20250131192912.133796-1-ameryhung@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

Generalize prog_ops_moff() so that we can use it to retrieve a struct_ops program's offset for different ops.
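To illustrate the member-offset technique in isolation, here is a small userspace sketch. Everything in it is hypothetical (struct model_ops, model_member_bit_offset[], model_prog_moff(), model_helper_allowed()); it only assumes what the patch shows, namely that a program records the index of the struct member it implements, that type information stores each member's bit offset, and that callers compare the resulting byte offset against offsetof().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ops table standing in for tcp_congestion_ops or Qdisc_ops. */
struct model_ops {
	int (*init)(void *priv);
	void (*release)(void *priv);
	int (*enqueue)(void *item, void *priv);
};

/* Hypothetical type info: the bit offset of each member, in declaration
 * order, similar in spirit to what BTF records for a struct.
 */
static const uint32_t model_member_bit_offset[] = {
	(uint32_t)(offsetof(struct model_ops, init) * 8),
	(uint32_t)(offsetof(struct model_ops, release) * 8),
	(uint32_t)(offsetof(struct model_ops, enqueue) * 8),
};

#define MODEL_NR_MEMBERS \
	(sizeof(model_member_bit_offset) / sizeof(model_member_bit_offset[0]))

/* Translate a member index (the role expected_attach_type plays in the
 * patch) into the member's byte offset within the ops struct.
 */
static uint32_t model_prog_moff(uint32_t member_idx)
{
	assert(member_idx < MODEL_NR_MEMBERS);
	return model_member_bit_offset[member_idx] / 8;
}

/* Per-operation policy in the style of the getsockopt/setsockopt checks:
 * every callback except release may use the helper.
 */
static int model_helper_allowed(uint32_t member_idx)
{
	return model_prog_moff(member_idx) != offsetof(struct model_ops, release);
}
```

Because the lookup depends only on the program's own attach type information rather than on a cached per-subsystem type, the same function can serve any struct_ops user, which is the point of moving it out of bpf_tcp_ca.c.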
Signed-off-by: Amery Hung Acked-by: Eduard Zingerman --- include/linux/bpf.h | 1 + kernel/bpf/bpf_struct_ops.c | 13 +++++++++++++ net/ipv4/bpf_tcp_ca.c | 23 ++--------------------- 3 files changed, 16 insertions(+), 21 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 15164787ce7f..6003ba36f6c5 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1892,6 +1892,7 @@ static inline void bpf_module_put(const void *data, struct module *owner) module_put(owner); } int bpf_struct_ops_link_create(union bpf_attr *attr); +u32 bpf_struct_ops_prog_moff(const struct bpf_prog *prog); #ifdef CONFIG_NET /* Define it here to avoid the use of forward declaration */ diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c index 8df5e8045d07..d3a76f0c5a82 100644 --- a/kernel/bpf/bpf_struct_ops.c +++ b/kernel/bpf/bpf_struct_ops.c @@ -1386,3 +1386,16 @@ void bpf_map_struct_ops_info_fill(struct bpf_map_info *info, struct bpf_map *map info->btf_vmlinux_id = btf_obj_id(st_map->btf); } + +u32 bpf_struct_ops_prog_moff(const struct bpf_prog *prog) +{ + const struct btf_member *m; + const struct btf_type *t; + u32 midx; + + t = btf_type_by_id(prog->aux->attach_btf, prog->aux->attach_btf_id); + midx = prog->expected_attach_type; + m = &btf_type_member(t)[midx]; + + return __btf_member_bit_offset(t, m) / 8; +} diff --git a/net/ipv4/bpf_tcp_ca.c b/net/ipv4/bpf_tcp_ca.c index 554804774628..415bd3b18eef 100644 --- a/net/ipv4/bpf_tcp_ca.c +++ b/net/ipv4/bpf_tcp_ca.c @@ -16,7 +16,6 @@ static struct bpf_struct_ops bpf_tcp_congestion_ops; static const struct btf_type *tcp_sock_type; static u32 tcp_sock_id, sock_id; -static const struct btf_type *tcp_congestion_ops_type; static int bpf_tcp_ca_init(struct btf *btf) { @@ -33,11 +32,6 @@ static int bpf_tcp_ca_init(struct btf *btf) tcp_sock_id = type_id; tcp_sock_type = btf_type_by_id(btf, tcp_sock_id); - type_id = btf_find_by_name_kind(btf, "tcp_congestion_ops", BTF_KIND_STRUCT); - if (type_id < 0) - return 
-EINVAL; - tcp_congestion_ops_type = btf_type_by_id(btf, type_id); - return 0; } @@ -135,19 +129,6 @@ static const struct bpf_func_proto bpf_tcp_send_ack_proto = { .arg2_type = ARG_ANYTHING, }; -static u32 prog_ops_moff(const struct bpf_prog *prog) -{ - const struct btf_member *m; - const struct btf_type *t; - u32 midx; - - midx = prog->expected_attach_type; - t = tcp_congestion_ops_type; - m = &btf_type_member(t)[midx]; - - return __btf_member_bit_offset(t, m) / 8; -} - static const struct bpf_func_proto * bpf_tcp_ca_get_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) @@ -166,7 +147,7 @@ bpf_tcp_ca_get_func_proto(enum bpf_func_id func_id, * setsockopt() to make further changes which * may potentially allocate new resources. */ - if (prog_ops_moff(prog) != + if (bpf_struct_ops_prog_moff(prog) != offsetof(struct tcp_congestion_ops, release)) return &bpf_sk_setsockopt_proto; return NULL; @@ -177,7 +158,7 @@ bpf_tcp_ca_get_func_proto(enum bpf_func_id func_id, * The bpf-tcp-cc already has a more powerful way * to read tcp_sock from the PTR_TO_BTF_ID. 
*/ - if (prog_ops_moff(prog) != + if (bpf_struct_ops_prog_moff(prog) != offsetof(struct tcp_congestion_ops, release)) return &bpf_sk_getsockopt_proto; return NULL;

From patchwork Fri Jan 31 19:28:47 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 13955700
X-Patchwork-Delegate: bpf@iogearbox.net
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org, alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org, edumazet@google.com, xiyou.wangcong@gmail.com, cong.wang@bytedance.com, jhs@mojatatu.com, sinquersw@gmail.com, toke@redhat.com, jiri@resnulli.us, stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br, yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com, ming.lei@redhat.com, kernel-team@meta.com
Subject: [PATCH bpf-next v3 08/18] bpf: net_sched: Support implementation of Qdisc_ops in bpf
Date: Fri, 31 Jan 2025 11:28:47 -0800
Message-ID: <20250131192912.133796-9-ameryhung@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250131192912.133796-1-ameryhung@gmail.com>
References: <20250131192912.133796-1-ameryhung@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

From: Amery Hung

Enable users to implement a classless qdisc using bpf. The last few patches in this series have prepared struct_ops to support the core operators in Qdisc_ops. Recent advancements in bpf, such as allocated objects, bpf list and bpf rbtree, have also provided powerful and flexible building blocks for realizing sophisticated scheduling algorithms. Therefore, in this patch, we start allowing a qdisc to be implemented using bpf struct_ops. Users can implement .enqueue, .dequeue, .init, .reset, and .destroy in Qdisc_ops in bpf and register the qdisc dynamically into the kernel.

We do not allow users to attach bpf qdiscs to classful qdiscs. This is to prevent accidentally breaking existing classful qdiscs if they rely on some data in the child qdisc. This restriction can potentially be lifted in the future. Note that we still allow a bpf qdisc to be attached to mq.

Co-developed-by: Cong Wang Signed-off-by: Cong Wang Signed-off-by: Amery Hung --- net/sched/Kconfig | 12 +++ net/sched/Makefile | 1 + net/sched/bpf_qdisc.c | 210 ++++++++++++++++++++++++++++++++++++++++ net/sched/sch_api.c | 14 ++- net/sched/sch_generic.c | 3 +- 5 files changed, 236 insertions(+), 4 deletions(-) create mode 100644 net/sched/bpf_qdisc.c diff --git a/net/sched/Kconfig b/net/sched/Kconfig index 8180d0c12fce..ccd0255da5a5 100644 --- a/net/sched/Kconfig +++ b/net/sched/Kconfig @@ -403,6 +403,18 @@ config NET_SCH_ETS If unsure, say N. +config NET_SCH_BPF + bool "BPF-based Qdisc" + depends on BPF_SYSCALL && BPF_JIT && DEBUG_INFO_BTF + help + This option allows BPF-based queueing disciplines. With BPF struct_ops, + users can implement supported operators in Qdisc_ops using BPF programs. + The queue holding skb can be built with BPF maps or graphs. + + Say Y here if you want to use BPF-based Qdisc. + + If unsure, say N.
+ menuconfig NET_SCH_DEFAULT bool "Allow override default queue discipline" help diff --git a/net/sched/Makefile b/net/sched/Makefile index 82c3f78ca486..904d784902d1 100644 --- a/net/sched/Makefile +++ b/net/sched/Makefile @@ -62,6 +62,7 @@ obj-$(CONFIG_NET_SCH_FQ_PIE) += sch_fq_pie.o obj-$(CONFIG_NET_SCH_CBS) += sch_cbs.o obj-$(CONFIG_NET_SCH_ETF) += sch_etf.o obj-$(CONFIG_NET_SCH_TAPRIO) += sch_taprio.o +obj-$(CONFIG_NET_SCH_BPF) += bpf_qdisc.o obj-$(CONFIG_NET_CLS_U32) += cls_u32.o obj-$(CONFIG_NET_CLS_ROUTE4) += cls_route.o diff --git a/net/sched/bpf_qdisc.c b/net/sched/bpf_qdisc.c new file mode 100644 index 000000000000..00f3232f4a98 --- /dev/null +++ b/net/sched/bpf_qdisc.c @@ -0,0 +1,210 @@ +#include +#include +#include +#include +#include +#include +#include + +static struct bpf_struct_ops bpf_Qdisc_ops; + +struct bpf_sk_buff_ptr { + struct sk_buff *skb; +}; + +static int bpf_qdisc_init(struct btf *btf) +{ + return 0; +} + +static const struct bpf_func_proto * +bpf_qdisc_get_func_proto(enum bpf_func_id func_id, + const struct bpf_prog *prog) +{ + /* Tail call is disabled since there is no guarantee that valid refcounted + * kptrs will always be passed to another bpf program with __ref arguments.
+ */ + switch (func_id) { + case BPF_FUNC_tail_call: + return NULL; + default: + return bpf_base_func_proto(func_id, prog); + } +} + +BTF_ID_LIST_SINGLE(bpf_sk_buff_ids, struct, sk_buff) +BTF_ID_LIST_SINGLE(bpf_sk_buff_ptr_ids, struct, bpf_sk_buff_ptr) + +static bool bpf_qdisc_is_valid_access(int off, int size, + enum bpf_access_type type, + const struct bpf_prog *prog, + struct bpf_insn_access_aux *info) +{ + struct btf *btf = prog->aux->attach_btf; + u32 arg; + + arg = btf_ctx_arg_idx(btf, prog->aux->attach_func_proto, off); + if (bpf_struct_ops_prog_moff(prog) == offsetof(struct Qdisc_ops, enqueue)) { + if (arg == 2 && type == BPF_READ) { + info->reg_type = PTR_TO_BTF_ID | PTR_TRUSTED; + info->btf = btf; + info->btf_id = bpf_sk_buff_ptr_ids[0]; + return true; + } + } + + return bpf_tracing_btf_ctx_access(off, size, type, prog, info); +} + +static int bpf_qdisc_btf_struct_access(struct bpf_verifier_log *log, + const struct bpf_reg_state *reg, + int off, int size) +{ + const struct btf_type *t, *skbt; + size_t end; + + skbt = btf_type_by_id(reg->btf, bpf_sk_buff_ids[0]); + t = btf_type_by_id(reg->btf, reg->btf_id); + if (t != skbt) { + bpf_log(log, "only read is supported\n"); + return -EACCES; + } + + switch (off) { + case offsetof(struct sk_buff, tstamp): + end = offsetofend(struct sk_buff, tstamp); + break; + case offsetof(struct sk_buff, priority): + end = offsetofend(struct sk_buff, priority); + break; + case offsetof(struct sk_buff, mark): + end = offsetofend(struct sk_buff, mark); + break; + case offsetof(struct sk_buff, queue_mapping): + end = offsetofend(struct sk_buff, queue_mapping); + break; + case offsetof(struct sk_buff, cb) + offsetof(struct qdisc_skb_cb, tc_classid): + end = offsetof(struct sk_buff, cb) + + offsetofend(struct qdisc_skb_cb, tc_classid); + break; + case offsetof(struct sk_buff, cb) + offsetof(struct qdisc_skb_cb, data[0]) ... 
+ offsetof(struct sk_buff, cb) + offsetof(struct qdisc_skb_cb, + data[QDISC_CB_PRIV_LEN - 1]): + end = offsetof(struct sk_buff, cb) + + offsetofend(struct qdisc_skb_cb, data[QDISC_CB_PRIV_LEN - 1]); + break; + case offsetof(struct sk_buff, tc_index): + end = offsetofend(struct sk_buff, tc_index); + break; + default: + bpf_log(log, "no write support to sk_buff at off %d\n", off); + return -EACCES; + } + + if (off + size > end) { + bpf_log(log, + "write access at off %d with size %d beyond the member of sk_buff ended at %zu\n", + off, size, end); + return -EACCES; + } + + return 0; +} + +static const struct bpf_verifier_ops bpf_qdisc_verifier_ops = { + .get_func_proto = bpf_qdisc_get_func_proto, + .is_valid_access = bpf_qdisc_is_valid_access, + .btf_struct_access = bpf_qdisc_btf_struct_access, +}; + +static int bpf_qdisc_init_member(const struct btf_type *t, + const struct btf_member *member, + void *kdata, const void *udata) +{ + const struct Qdisc_ops *uqdisc_ops; + struct Qdisc_ops *qdisc_ops; + u32 moff; + + uqdisc_ops = (const struct Qdisc_ops *)udata; + qdisc_ops = (struct Qdisc_ops *)kdata; + + moff = __btf_member_bit_offset(t, member) / 8; + switch (moff) { + case offsetof(struct Qdisc_ops, peek): + qdisc_ops->peek = qdisc_peek_dequeued; + return 0; + case offsetof(struct Qdisc_ops, id): + if (bpf_obj_name_cpy(qdisc_ops->id, uqdisc_ops->id, + sizeof(qdisc_ops->id)) <= 0) + return -EINVAL; + return 1; + } + + return 0; +} + +static int bpf_qdisc_reg(void *kdata, struct bpf_link *link) +{ + return register_qdisc(kdata); +} + +static void bpf_qdisc_unreg(void *kdata, struct bpf_link *link) +{ + return unregister_qdisc(kdata); +} + +static int Qdisc_ops__enqueue(struct sk_buff *skb__ref, struct Qdisc *sch, + struct sk_buff **to_free) +{ + return 0; +} + +static struct sk_buff *Qdisc_ops__dequeue(struct Qdisc *sch) +{ + return NULL; +} + +static struct sk_buff *Qdisc_ops__peek(struct Qdisc *sch) +{ + return NULL; +} + +static int Qdisc_ops__init(struct Qdisc *sch, 
struct nlattr *arg, + struct netlink_ext_ack *extack) +{ + return 0; +} + +static void Qdisc_ops__reset(struct Qdisc *sch) +{ +} + +static void Qdisc_ops__destroy(struct Qdisc *sch) +{ +} + +static struct Qdisc_ops __bpf_ops_qdisc_ops = { + .enqueue = Qdisc_ops__enqueue, + .dequeue = Qdisc_ops__dequeue, + .peek = Qdisc_ops__peek, + .init = Qdisc_ops__init, + .reset = Qdisc_ops__reset, + .destroy = Qdisc_ops__destroy, +}; + +static struct bpf_struct_ops bpf_Qdisc_ops = { + .verifier_ops = &bpf_qdisc_verifier_ops, + .reg = bpf_qdisc_reg, + .unreg = bpf_qdisc_unreg, + .init_member = bpf_qdisc_init_member, + .init = bpf_qdisc_init, + .name = "Qdisc_ops", + .cfi_stubs = &__bpf_ops_qdisc_ops, + .owner = THIS_MODULE, +}; + +static int __init bpf_qdisc_kfunc_init(void) +{ + return register_bpf_struct_ops(&bpf_Qdisc_ops, Qdisc_ops); +} +late_initcall(bpf_qdisc_kfunc_init); diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c index e3e91cf867eb..c8057e0692a6 100644 --- a/net/sched/sch_api.c +++ b/net/sched/sch_api.c @@ -25,6 +25,7 @@ #include #include #include +#include #include #include @@ -358,7 +359,7 @@ static struct Qdisc_ops *qdisc_lookup_ops(struct nlattr *kind) read_lock(&qdisc_mod_lock); for (q = qdisc_base; q; q = q->next) { if (nla_strcmp(kind, q->id) == 0) { - if (!try_module_get(q->owner)) + if (!bpf_try_module_get(q, q->owner)) q = NULL; break; } @@ -1200,6 +1201,13 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent, return -EINVAL; } + if (new && + !(parent->flags & TCQ_F_MQROOT) && + new->ops->owner == BPF_MODULE_OWNER) { + NL_SET_ERR_MSG(extack, "BPF qdisc not supported on a non root"); + return -EINVAL; + } + if (new && !(parent->flags & TCQ_F_MQROOT) && rcu_access_pointer(new->stab)) { @@ -1287,7 +1295,7 @@ static struct Qdisc *qdisc_create(struct net_device *dev, /* We will try again qdisc_lookup_ops, * so don't keep a reference. 
*/ - module_put(ops->owner); + bpf_module_put(ops, ops->owner); err = -EAGAIN; goto err_out; } @@ -1398,7 +1406,7 @@ static struct Qdisc *qdisc_create(struct net_device *dev, netdev_put(dev, &sch->dev_tracker); qdisc_free(sch); err_out2: - module_put(ops->owner); + bpf_module_put(ops, ops->owner); err_out: *errp = err; return NULL; diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index 14ab2f4c190a..e6fda9f20272 100644 --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -24,6 +24,7 @@ #include #include #include +#include #include #include #include @@ -1078,7 +1079,7 @@ static void __qdisc_destroy(struct Qdisc *qdisc) ops->destroy(qdisc); lockdep_unregister_key(&qdisc->root_lock_key); - module_put(ops->owner); + bpf_module_put(ops, ops->owner); netdev_put(dev, &qdisc->dev_tracker); trace_qdisc_destroy(qdisc);

From patchwork Fri Jan 31 19:28:48 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 13955701
X-Patchwork-Delegate: bpf@iogearbox.net
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org, alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org, edumazet@google.com, xiyou.wangcong@gmail.com, cong.wang@bytedance.com, jhs@mojatatu.com, sinquersw@gmail.com, toke@redhat.com, jiri@resnulli.us, stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br, yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com, ming.lei@redhat.com, kernel-team@meta.com
Subject: [PATCH bpf-next v3 09/18] bpf: net_sched: Add basic bpf qdisc kfuncs
Date: Fri, 31 Jan 2025 11:28:48 -0800
Message-ID: <20250131192912.133796-10-ameryhung@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250131192912.133796-1-ameryhung@gmail.com>
References: <20250131192912.133796-1-ameryhung@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

From: Amery Hung

Add basic kfuncs for working on skb in qdisc.

Both bpf_qdisc_skb_drop() and bpf_kfree_skb() can be used to release a reference to an skb. However, bpf_qdisc_skb_drop() can only be called in .enqueue, where a to_free skb list is available from the kernel to defer the release. bpf_kfree_skb() should be used elsewhere. It is also used in bpf_obj_free_fields() when cleaning up skb in maps and collections.

bpf_skb_get_hash() returns the flow hash of an skb, which can be used to build flow-based queueing algorithms.

Finally, allow users to create a read-only dynptr via bpf_dynptr_from_skb().
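The difference between the two release paths described above can be sketched with a small userspace model. This is not kernel or BPF code: struct model_skb and the model_*() functions are hypothetical, and the sketch assumes the deferred path simply links the buffer onto a caller-owned singly linked list that the caller frees once enqueue has returned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical minimal packet buffer; only the link field matters here. */
struct model_skb {
	struct model_skb *next;
	unsigned int len;
};

/* Deferred drop: push the buffer onto the caller-provided to_free list.
 * Nothing is freed here; the caller releases the whole list later.
 */
static void model_skb_drop(struct model_skb *skb, struct model_skb **to_free)
{
	assert(skb != NULL && to_free != NULL);
	skb->next = *to_free;
	*to_free = skb;
}

/* Immediate drop: release the buffer right away. */
static void model_kfree_skb(struct model_skb *skb)
{
	free(skb);
}

/* What the caller of enqueue does afterwards in this model: walk the
 * list and free each buffer. Returns the number of buffers released.
 */
static unsigned int model_free_list(struct model_skb **to_free)
{
	unsigned int n = 0;

	while (*to_free) {
		struct model_skb *skb = *to_free;

		*to_free = skb->next;
		model_kfree_skb(skb);
		n++;
	}
	return n;
}

/* Allocate a zeroed buffer of the given nominal length, or NULL. */
static struct model_skb *model_alloc_skb(unsigned int len)
{
	struct model_skb *skb = calloc(1, sizeof(*skb));

	if (skb)
		skb->len = len;
	return skb;
}
```

In this model the deferred path needs the list head, which only the enqueue caller supplies; that matches the restriction that bpf_qdisc_skb_drop() is limited to .enqueue while bpf_kfree_skb() is usable from the other operators.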
Signed-off-by: Amery Hung --- include/linux/bpf.h | 1 + kernel/bpf/bpf_struct_ops.c | 2 + net/sched/bpf_qdisc.c | 93 ++++++++++++++++++++++++++++++++++++- 3 files changed, 95 insertions(+), 1 deletion(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 6003ba36f6c5..bbca7b537cf8 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1810,6 +1810,7 @@ struct bpf_struct_ops { void *cfi_stubs; struct module *owner; const char *name; + const struct btf_type *type; struct btf_func_model func_models[BPF_STRUCT_OPS_MAX_NR_MEMBERS]; }; diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c index d3a76f0c5a82..1ee6d41d4948 100644 --- a/kernel/bpf/bpf_struct_ops.c +++ b/kernel/bpf/bpf_struct_ops.c @@ -460,6 +460,8 @@ int bpf_struct_ops_desc_init(struct bpf_struct_ops_desc *st_ops_desc, goto errout; } + st_ops->type = t; + return 0; errout: diff --git a/net/sched/bpf_qdisc.c b/net/sched/bpf_qdisc.c index 00f3232f4a98..e188616c86a4 100644 --- a/net/sched/bpf_qdisc.c +++ b/net/sched/bpf_qdisc.c @@ -111,6 +111,80 @@ static int bpf_qdisc_btf_struct_access(struct bpf_verifier_log *log, return 0; } +__bpf_kfunc_start_defs(); + +/* bpf_skb_get_hash - Get the flow hash of an skb. + * @skb: The skb to get the flow hash from. + */ +__bpf_kfunc u32 bpf_skb_get_hash(struct sk_buff *skb) +{ + return skb_get_hash(skb); +} + +/* bpf_kfree_skb - Release an skb's reference and drop it immediately. + * @skb: The skb whose reference to be released and dropped. + */ +__bpf_kfunc void bpf_kfree_skb(struct sk_buff *skb) +{ + kfree_skb(skb); +} + +/* bpf_qdisc_skb_drop - Drop an skb by adding it to a deferred free list. + * @skb: The skb whose reference to be released and dropped. + * @to_free_list: The list of skbs to be dropped. 
+ */ +__bpf_kfunc void bpf_qdisc_skb_drop(struct sk_buff *skb, + struct bpf_sk_buff_ptr *to_free_list) +{ + __qdisc_drop(skb, (struct sk_buff **)to_free_list); +} + +__bpf_kfunc_end_defs(); + +BTF_KFUNCS_START(qdisc_kfunc_ids) +BTF_ID_FLAGS(func, bpf_skb_get_hash, KF_TRUSTED_ARGS) +BTF_ID_FLAGS(func, bpf_kfree_skb, KF_RELEASE) +BTF_ID_FLAGS(func, bpf_qdisc_skb_drop, KF_RELEASE) +BTF_ID_FLAGS(func, bpf_dynptr_from_skb, KF_TRUSTED_ARGS) +BTF_KFUNCS_END(qdisc_kfunc_ids) + +BTF_SET_START(qdisc_common_kfunc_set) +BTF_ID(func, bpf_skb_get_hash) +BTF_ID(func, bpf_kfree_skb) +BTF_ID(func, bpf_dynptr_from_skb) +BTF_SET_END(qdisc_common_kfunc_set) + +BTF_SET_START(qdisc_enqueue_kfunc_set) +BTF_ID(func, bpf_qdisc_skb_drop) +BTF_SET_END(qdisc_enqueue_kfunc_set) + +static int bpf_qdisc_kfunc_filter(const struct bpf_prog *prog, u32 kfunc_id) +{ + if (bpf_Qdisc_ops.type != btf_type_by_id(prog->aux->attach_btf, + prog->aux->attach_btf_id)) + return 0; + + /* Skip the check when prog->attach_func_name is not yet available + * during check_cfg(). + */ + if (!btf_id_set8_contains(&qdisc_kfunc_ids, kfunc_id) || + !prog->aux->attach_func_name) + return 0; + + if (bpf_struct_ops_prog_moff(prog) == offsetof(struct Qdisc_ops, enqueue)) { + if (btf_id_set_contains(&qdisc_enqueue_kfunc_set, kfunc_id)) + return 0; + } + + return btf_id_set_contains(&qdisc_common_kfunc_set, kfunc_id) ? 
0 : -EACCES; +} + +static const struct btf_kfunc_id_set bpf_qdisc_kfunc_set = { + .owner = THIS_MODULE, + .set = &qdisc_kfunc_ids, + .filter = bpf_qdisc_kfunc_filter, +}; + static const struct bpf_verifier_ops bpf_qdisc_verifier_ops = { .get_func_proto = bpf_qdisc_get_func_proto, .is_valid_access = bpf_qdisc_is_valid_access, @@ -203,8 +277,25 @@ static struct bpf_struct_ops bpf_Qdisc_ops = { .owner = THIS_MODULE, }; +BTF_ID_LIST(bpf_sk_buff_dtor_ids) +BTF_ID(func, bpf_kfree_skb) + static int __init bpf_qdisc_kfunc_init(void) { - return register_bpf_struct_ops(&bpf_Qdisc_ops, Qdisc_ops); + int ret; + const struct btf_id_dtor_kfunc skb_kfunc_dtors[] = { + { + .btf_id = bpf_sk_buff_ids[0], + .kfunc_btf_id = bpf_sk_buff_dtor_ids[0] + }, + }; + + ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS, &bpf_qdisc_kfunc_set); + ret = ret ?: register_btf_id_dtor_kfuncs(skb_kfunc_dtors, + ARRAY_SIZE(skb_kfunc_dtors), + THIS_MODULE); + ret = ret ?: register_bpf_struct_ops(&bpf_Qdisc_ops, Qdisc_ops); + + return ret; } late_initcall(bpf_qdisc_kfunc_init); From patchwork Fri Jan 31 19:28:49 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Amery Hung X-Patchwork-Id: 13955702 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mail-pj1-f47.google.com (mail-pj1-f47.google.com [209.85.216.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F18E31F4280; Fri, 31 Jan 2025 19:29:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.47 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738351775; cv=none; b=aDyIcNBVXL/20+tb0oda4ty6l+eC0mDJNbJcOtgoI9tjxWN/acf7HpLA14teZTSBBMxxoXs/YzOl+EqHQS+MLTFpRG798bMX5b7ZNZj19sjQlTvH3USii39vkDeUnnvVonCULdqxUzmcJUjBwiTN0LgPtX0U8xrCm2eyhqjdJFw= ARC-Message-Signature: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1738351775; c=relaxed/simple; bh=yxQYFsT9AePiAzQhyVc7xoKjhDpHjJ27Ru0+BglOpZ0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=WXsVSdQykV/WiB4PtX3OiCFO0EGnJmGhK083gQHdZcC8e49RxG2s6vzv4E3V5Rh/rMGOmwMu6q/bpsWuVJSHNCIZnhEIEZywYXd5FMKyeN8CyqE4GAz0r+hieoFpu/m5tjnDz8uM+NNxIJ4FTQ5oqWuf6GXvbt4pn/xQWTcFsOU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=lEiU9z1c; arc=none smtp.client-ip=209.85.216.47 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="lEiU9z1c" Received: by mail-pj1-f47.google.com with SMTP id 98e67ed59e1d1-2ef87d24c2dso3234331a91.1; Fri, 31 Jan 2025 11:29:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1738351773; x=1738956573; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=UDXRG+l+8UkDA8rcyi4Z/G13+60E1Jk+sSAlFlqGB+E=; b=lEiU9z1cPbgr/WexfYh2rsIwyBSYpAFun+lmozKvwqhJSPciC94lFlAGdyhSaq5lgv Um4mIXwqLq0MtIjykdar/bFAwLHQLAJSGrexlNA7J5xbq9gEPz1ZFvi3iJQNRdtsxYUN PhcTKDRm0/9JzHF/MvChbDFER/L104mrwuebCQtjiIAT6qhMfRDyBs/xfyqIGhbkuoG2 VjkcKtckauUAcc1rBEf0o8hDYUAX9TBegpKWFWG+Ys/vW3nKBfTqfHvjgzMXQizI+4c2 kDdfH3ZvYIs2LUXLCU3E5NnaBtuczjPzSaKCsVovcj8gFe6t9lk++yUUtKg3Z5Q7XN6b 9vig== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1738351773; x=1738956573; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=UDXRG+l+8UkDA8rcyi4Z/G13+60E1Jk+sSAlFlqGB+E=; b=Amw+5eoLam9JrFlOq+G/WOUUuDd/Uf5vZOylJYgU1onQE+3izLPlgEzGkrOe2BJnoC +28a/ooRiJskKrXhrgYl/30W+QWOetp7uScHBEqM9YdT02ChaStA4q7zdwZbFdShQBgv y9mBjTfGE1cDmo/7jIQCzfp7h69pBZNP/agun5D0DtrQnoDUu7bjT7UBH/4TCIv0TCJx XPlanRYImmf1qbc+sW8RBaJGULGwH/OjFHo1V+4YnJzlOoKMMn07QxPI7/uFbSaYJmPx ew2aEUeIrv480o8AyRg+pE++6hl5UsWbWH+htgj/foFvVqs8giG8xPVZ411r2JNWbxUZ Q3NA== X-Gm-Message-State: AOJu0YxB5BBfMengEi2hq/MAV4+ai+HXMvvvrjZz5+tlcJHAUmc/nEEZ UTVmBy2xCSKb9LnRPhtba4/aRO9DKRhTueWuwjgri3jHr+W80WGZ5X2flUQ3Be0= X-Gm-Gg: ASbGncsQJsueUre6XN9WRswx9PHmUi20d24alMyIuxmCVqxrhKkTEWkpFGjt7hg+mmY F5YgXSJiqAY9jTcXWIMyVfvmErQnwulOEHxoFbYGY4Wel3hAo4pAkjfl1RFRkJqOjkJt4+Jk57G 25E+zcLXFJfa1U3gPUGRdM1oF1GG7Xf2ZEAOMORlX1OzD2iqXQBhfFow4T0mO/80zwcU4tnz/Ws 26/IAl7uDiD4l5gZYzwG5GXdsUs2ELkkbKuwMyct605l55iCyIUpYLd7UzEI3JQ79TkRm8qFhcb ieW+RhZaiKR3nAMnWM3N+RHMMRyriEffnWlbuBwedSP/kRGbgNsTMFj1qk08EnNMZw== X-Google-Smtp-Source: AGHT+IFoPrLEFTMuwKHCxW+85CuBvGB0pph0vStHjM2TDFKri4C5KnJ44CRE359EBL/DSbFPwVxleg== X-Received: by 2002:a17:90b:1f86:b0:2ee:aed2:c15c with SMTP id 98e67ed59e1d1-2f83ac8c3famr17781165a91.28.1738351772957; Fri, 31 Jan 2025 11:29:32 -0800 (PST) Received: from localhost.localdomain (c-76-146-13-146.hsd1.wa.comcast.net. 
[76.146.13.146]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2f8489d3707sm4072471a91.23.2025.01.31.11.29.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 31 Jan 2025 11:29:32 -0800 (PST) From: Amery Hung To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org, alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org, edumazet@google.com, xiyou.wangcong@gmail.com, cong.wang@bytedance.com, jhs@mojatatu.com, sinquersw@gmail.com, toke@redhat.com, jiri@resnulli.us, stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br, yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com, ming.lei@redhat.com, kernel-team@meta.com Subject: [PATCH bpf-next v3 10/18] bpf: Search and add kfuncs in struct_ops prologue and epilogue Date: Fri, 31 Jan 2025 11:28:49 -0800 Message-ID: <20250131192912.133796-11-ameryhung@gmail.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250131192912.133796-1-ameryhung@gmail.com> References: <20250131192912.133796-1-ameryhung@gmail.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net From: Amery Hung Currently, add_kfunc_call() is only invoked once, before the main verification loop. Therefore, the verifier cannot find the bpf_kfunc_btf_tab of a new kfunc call that does not appear in the user-defined struct_ops operators but is introduced by gen_prologue or gen_epilogue during do_misc_fixup(). Fix this by searching for kfuncs in the patching instruction buffer and adding them to prog->aux->kfunc_tab.
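The scan described above can be sketched in plain userspace C. This is a simplified model under stated assumptions, not the verifier code: struct insn, struct kfunc_tab, tab_add() and add_kfuncs_in_buf() are hypothetical names, the table is a fixed-size array, and entries are keyed on imm alone (the patch passes both insn->imm and insn->off to add_kfunc_call()). The opcode constants are the values from the BPF UAPI headers.

```c
#include <assert.h>
#include <stdint.h>

/* Opcode constants, values as in the BPF UAPI headers. */
#define BPF_JMP               0x05
#define BPF_CALL              0x80
#define BPF_PSEUDO_KFUNC_CALL 2

/* Hypothetical, simplified instruction: only the fields the scan reads. */
struct insn {
	uint8_t code;
	uint8_t src_reg;
	int16_t off;
	int32_t imm;
};

#define MAX_KFUNCS 8

/* Hypothetical stand-in for a per-program kfunc table. */
struct kfunc_tab {
	int32_t ids[MAX_KFUNCS];
	int nr;
};

/* A kfunc call is a BPF_CALL whose src_reg carries the kfunc marker. */
static int is_kfunc_call(const struct insn *insn)
{
	return insn->code == (BPF_JMP | BPF_CALL) &&
	       insn->src_reg == BPF_PSEUDO_KFUNC_CALL;
}

/* Record an id once. Returns 0 on success, -1 if the table is full. */
static int tab_add(struct kfunc_tab *tab, int32_t id)
{
	int i;

	for (i = 0; i < tab->nr; i++)
		if (tab->ids[i] == id)
			return 0;
	if (tab->nr == MAX_KFUNCS)
		return -1;
	tab->ids[tab->nr++] = id;
	return 0;
}

/* Walk a patch buffer and register every kfunc call found in it. */
static int add_kfuncs_in_buf(struct kfunc_tab *tab,
			     const struct insn *insn, int cnt)
{
	int i, ret;

	for (i = 0; i < cnt; i++, insn++) {
		if (!is_kfunc_call(insn))
			continue;
		ret = tab_add(tab, insn->imm);
		if (ret < 0)
			return ret;
	}
	return 0;
}
```

The point of the sketch is the ordering problem: instructions generated after the initial scan are invisible to the table unless the generated buffer is scanned as well.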
Signed-off-by: Amery Hung Acked-by: Eduard Zingerman --- kernel/bpf/verifier.c | 25 ++++++++++++++++++++++++- 1 file changed, 24 insertions(+), 1 deletion(-) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 5bcf095e8d0c..c11d105b3c6f 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -3215,6 +3215,21 @@ bpf_jit_find_kfunc_model(const struct bpf_prog *prog, return res ? &res->func_model : NULL; } +static int add_kfunc_in_insns(struct bpf_verifier_env *env, + struct bpf_insn *insn, int cnt) +{ + int i, ret; + + for (i = 0; i < cnt; i++, insn++) { + if (bpf_pseudo_kfunc_call(insn)) { + ret = add_kfunc_call(env, insn->imm, insn->off); + if (ret < 0) + return ret; + } + } + return 0; +} + static int add_subprog_and_kfunc(struct bpf_verifier_env *env) { struct bpf_subprog_info *subprog = env->subprog_info; @@ -20368,7 +20383,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) { struct bpf_subprog_info *subprogs = env->subprog_info; const struct bpf_verifier_ops *ops = env->ops; - int i, cnt, size, ctx_field_size, delta = 0, epilogue_cnt = 0; + int i, cnt, size, ctx_field_size, ret, delta = 0, epilogue_cnt = 0; const int insn_cnt = env->prog->len; struct bpf_insn *epilogue_buf = env->epilogue_buf; struct bpf_insn *insn_buf = env->insn_buf; @@ -20397,6 +20412,10 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) return -ENOMEM; env->prog = new_prog; delta += cnt - 1; + + ret = add_kfunc_in_insns(env, epilogue_buf, epilogue_cnt - 1); + if (ret < 0) + return ret; } } @@ -20417,6 +20436,10 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) env->prog = new_prog; delta += cnt - 1; + + ret = add_kfunc_in_insns(env, insn_buf, cnt - 1); + if (ret < 0) + return ret; } } From patchwork Fri Jan 31 19:28:50 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Amery Hung X-Patchwork-Id: 13955703 X-Patchwork-Delegate: bpf@iogearbox.net Received: 
from mail-pj1-f52.google.com (mail-pj1-f52.google.com [209.85.216.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 361801F4261; Fri, 31 Jan 2025 19:29:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.52 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738351776; cv=none; b=SzxzYFkTHa/FSOU6cV6Fo+1BUvxXOEnfJ7RSiOcN7N40S4HIC5Y6VPBLefrfwAi6HjMU8SF7C8BqBXIl0tcJ8jFZ2eF0JjoqFAhjgHOMQRKbE8zyR70Y5vrC1u1+KdwAk91SkPkV9NPtkVu8cUUOKVfvt5baF0V5ZSsoRRA6EKc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738351776; c=relaxed/simple; bh=oX7sfD2SSGnrQzUVIz968zYmlPPJL7x9U30Q9X0yB68=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ft3yR7VQg2dfgN6h00YD8XKfGqjTVa7Zafs8P4C4JBD5W1qiQL7Com7U1qrheZf4VRojFw1LzS23N2CEcw2JYp0mjRNZR08t8Ns2aKmBW+axVy4jtbIc3JhPKCqJzc5/xW6v4jIKWlKyP/gz1oClkAqZEZ4w8mJJ/PCFcL7EA+s= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=UzpryLj0; arc=none smtp.client-ip=209.85.216.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="UzpryLj0" Received: by mail-pj1-f52.google.com with SMTP id 98e67ed59e1d1-2ef28f07dbaso3241360a91.2; Fri, 31 Jan 2025 11:29:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1738351774; x=1738956574; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=IX3XRBZ19QDBVzJDe9UanSrzS5pPfBfZXfSGAVpWKCU=; b=UzpryLj0yhA62qfGT/9S5U/cb4mQNHoBuhjtiKRCr6NMAUvVuX9ScR/p7+WkgPIrDS OJONlGeCn/6ndUj0OgLeoT6ueH6NrQL92PFvnbhbIitDFVnaJEeI8L3rMkoj8j6/VoRm BKXvD++Bpfj2z3mybZksn5ND+OKpkI+Qdm/r6ToFyvm253+rWKumEPlSTXXQDbxgcgfB 0InthWbLsCe+XOTH7JC+LS0c+dOeGnQhZ4nBuH1tdWeZ9/3O2cPt+vUlwqzhFhfuLvoI nubpY2q+jXqRWKhyGwnYqoZLi/c7aDpbyVBJWgWaEh9VDavq1wkhBx6Oskd1ZC15EfOU Hsiw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1738351774; x=1738956574; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=IX3XRBZ19QDBVzJDe9UanSrzS5pPfBfZXfSGAVpWKCU=; b=B1qetl2m3NF2jEUrO9Sw9TDsRqn5YOcMu3gkEs/kH5BUqePikVCEnMrIreQuH05l+i NLHFhrskIloulv3eMBLElJa7jg/7mYfwNwjje01ZA1b21Ou072tr+OQhsEHj8Dtgyc6w b8suz9gFIX1QztQUhA3ew5UEJ8eRnQda7gHjkHT2mMjq6q4XASUC34h/ae2pMHGcJ7Rb qlbL/wkKIpYeXb/ingdn7cUPIt9ysRF0HyigPHEA38c6SdFin2quJi3TN6xhDC9Ne7vV cJdGjiOyOh0gfDVOsSIWSYysdIRYbe8m97pqU1ZkOGy9mh4sRcclJiE/tesgQGF7OhVo ZyAw== X-Gm-Message-State: AOJu0YywfBkZgHfJsyMm05755Rb07P8Awrc75LF+5r3Q970MAJUFBkWP SvM5SQOn11yFzrkMM2Tet7yZSfW5XpMudwomNTzC8DfPuYZwJVvVSODq9xda5DA= X-Gm-Gg: ASbGncs6CvY+xn5zu/0bmCMvydKY6+zFDf3dUgmUBx9O569TLucqAsXRl0BQ97624P3 ejz6Ifq11nWI/mIRpPtT64DLZqdCnEOGQNd/OYu6gBwCSTKH3yZ42Zpy2SiuYrJfwqUbEPUxqC4 5IWQe8Ob9/1rqXkHddtmCDBYhHdSZP8WxGPxwPb53IijWv+G5IhDQn5Ot/sFC1qmVw667+7jtmW hcpfJzS2XbYMMC1iVXCle9BXS2vDwYNNhUB1yw9S7u0ym9DJg31H6Z4ct9qJMPaVbuyCH2JzJpQ aSYWao0artn+PEx8mnGp0lXmZZQGPb+ckEKTxaIPoCre98k003kMutrsQmI18MXTYQ== X-Google-Smtp-Source: AGHT+IGHZvE5vtd8WKJND6Fdhu7IcyuwAHzmiULORvD049e6llceB4o5ZiiClGRmN9pYttPQQJ6z1g== X-Received: by 2002:a17:90b:1f8b:b0:2f5:63a:449c with SMTP id 98e67ed59e1d1-2f83ac5e4bbmr17557552a91.28.1738351774377; Fri, 31 Jan 2025 11:29:34 -0800 (PST) Received: from localhost.localdomain 
(c-76-146-13-146.hsd1.wa.comcast.net. [76.146.13.146]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2f8489d3707sm4072471a91.23.2025.01.31.11.29.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 31 Jan 2025 11:29:34 -0800 (PST) From: Amery Hung To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org, alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org, edumazet@google.com, xiyou.wangcong@gmail.com, cong.wang@bytedance.com, jhs@mojatatu.com, sinquersw@gmail.com, toke@redhat.com, jiri@resnulli.us, stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br, yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com, ming.lei@redhat.com, kernel-team@meta.com Subject: [PATCH bpf-next v3 11/18] bpf: net_sched: Add a qdisc watchdog timer Date: Fri, 31 Jan 2025 11:28:50 -0800 Message-ID: <20250131192912.133796-12-ameryhung@gmail.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250131192912.133796-1-ameryhung@gmail.com> References: <20250131192912.133796-1-ameryhung@gmail.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net From: Amery Hung Add a watchdog timer to bpf qdisc. The watchdog can be used to schedule the execution of a qdisc through the kfunc bpf_qdisc_watchdog_schedule(). It is useful for building traffic-shaping scheduling algorithms, where the time at which the next packet will be dequeued is known.
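As a worked example of "the time the next packet will be dequeued is known", the following userspace sketch computes that time for a simple fixed-rate shaper. Everything in it is an assumption made for illustration: struct shaper, tx_time_ns() and shaper_dequeue() do not exist in the patch, and the pacing rule (one packet may leave once the previous one has been serialized at rate_Bps bytes per second, with rate_Bps non-zero) is just one possible policy.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Time to serialize len bytes at rate_Bps bytes per second, in ns.
 * Assumes rate_Bps is non-zero.
 */
static uint64_t tx_time_ns(uint32_t len, uint64_t rate_Bps)
{
	return (uint64_t)len * NSEC_PER_SEC / rate_Bps;
}

/* Hypothetical shaper state kept by the scheduler. */
struct shaper {
	uint64_t rate_Bps;   /* configured rate, bytes per second */
	uint64_t next_tx_ns; /* earliest time the next packet may leave */
};

/* Decide at dequeue time. Returns 1 if a packet of len bytes may be sent
 * at now_ns and advances next_tx_ns. Returns 0 if it is too early and
 * stores in *expire_ns the time at which the timer should fire.
 */
static int shaper_dequeue(struct shaper *s, uint64_t now_ns, uint32_t len,
			  uint64_t *expire_ns)
{
	if (now_ns < s->next_tx_ns) {
		*expire_ns = s->next_tx_ns;
		return 0;
	}
	s->next_tx_ns = now_ns + tx_time_ns(len, s->rate_Bps);
	return 1;
}
```

At 1,000,000 bytes per second a 1500-byte packet occupies the link for 1,500,000 ns. In a bpf qdisc, the value left in expire_ns when shaper_dequeue() returns 0 is what a .dequeue program would presumably pass as the expire argument of bpf_qdisc_watchdog_schedule().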
Signed-off-by: Amery Hung --- include/linux/filter.h | 10 +++++ net/sched/bpf_qdisc.c | 92 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 102 insertions(+) diff --git a/include/linux/filter.h b/include/linux/filter.h index a3ea46281595..3ed6eb9e7c73 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -469,6 +469,16 @@ static inline bool insn_is_cast_user(const struct bpf_insn *insn) .off = 0, \ .imm = BPF_CALL_IMM(FUNC) }) +/* Kfunc call */ + +#define BPF_CALL_KFUNC(OFF, IMM) \ + ((struct bpf_insn) { \ + .code = BPF_JMP | BPF_CALL, \ + .dst_reg = 0, \ + .src_reg = BPF_PSEUDO_KFUNC_CALL, \ + .off = OFF, \ + .imm = IMM }) + /* Raw code statement block */ #define BPF_RAW_INSN(CODE, DST, SRC, OFF, IMM) \ diff --git a/net/sched/bpf_qdisc.c b/net/sched/bpf_qdisc.c index e188616c86a4..5abf11aa8340 100644 --- a/net/sched/bpf_qdisc.c +++ b/net/sched/bpf_qdisc.c @@ -8,6 +8,10 @@ static struct bpf_struct_ops bpf_Qdisc_ops; +struct bpf_sched_data { + struct qdisc_watchdog watchdog; +}; + struct bpf_sk_buff_ptr { struct sk_buff *skb; }; @@ -111,6 +115,46 @@ static int bpf_qdisc_btf_struct_access(struct bpf_verifier_log *log, return 0; } +BTF_ID_LIST(bpf_qdisc_init_prologue_ids) +BTF_ID(func, bpf_qdisc_init_prologue) + +static int bpf_qdisc_gen_prologue(struct bpf_insn *insn_buf, bool direct_write, + const struct bpf_prog *prog) +{ + struct bpf_insn *insn = insn_buf; + + if (bpf_struct_ops_prog_moff(prog) != offsetof(struct Qdisc_ops, init)) + return 0; + + *insn++ = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1); + *insn++ = BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, 0); + *insn++ = BPF_CALL_KFUNC(0, bpf_qdisc_init_prologue_ids[0]); + *insn++ = BPF_MOV64_REG(BPF_REG_1, BPF_REG_6); + *insn++ = prog->insnsi[0]; + + return insn - insn_buf; +} + +BTF_ID_LIST(bpf_qdisc_reset_destroy_epilogue_ids) +BTF_ID(func, bpf_qdisc_reset_destroy_epilogue) + +static int bpf_qdisc_gen_epilogue(struct bpf_insn *insn_buf, const struct bpf_prog *prog, + s16 ctx_stack_off) +{ + 
struct bpf_insn *insn = insn_buf; + + if (bpf_struct_ops_prog_moff(prog) != offsetof(struct Qdisc_ops, reset) && + bpf_struct_ops_prog_moff(prog) != offsetof(struct Qdisc_ops, destroy)) + return 0; + + *insn++ = BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_FP, ctx_stack_off); + *insn++ = BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, 0); + *insn++ = BPF_CALL_KFUNC(0, bpf_qdisc_reset_destroy_epilogue_ids[0]); + *insn++ = BPF_EXIT_INSN(); + + return insn - insn_buf; +} + __bpf_kfunc_start_defs(); /* bpf_skb_get_hash - Get the flow hash of an skb. @@ -139,6 +183,36 @@ __bpf_kfunc void bpf_qdisc_skb_drop(struct sk_buff *skb, __qdisc_drop(skb, (struct sk_buff **)to_free_list); } +/* bpf_qdisc_watchdog_schedule - Schedule a qdisc to a later time using a timer. + * @sch: The qdisc to be scheduled. + * @expire: The expiry time of the timer. + * @delta_ns: The slack range of the timer. + */ +__bpf_kfunc void bpf_qdisc_watchdog_schedule(struct Qdisc *sch, u64 expire, u64 delta_ns) +{ + struct bpf_sched_data *q = qdisc_priv(sch); + + qdisc_watchdog_schedule_range_ns(&q->watchdog, expire, delta_ns); +} + +/* bpf_qdisc_init_prologue - Hidden kfunc called in prologue of .init. 
*/ +__bpf_kfunc void bpf_qdisc_init_prologue(struct Qdisc *sch) +{ + struct bpf_sched_data *q = qdisc_priv(sch); + + qdisc_watchdog_init(&q->watchdog, sch); +} + +/* bpf_qdisc_reset_destroy_epilogue - Hidden kfunc called in epilogue of .reset + * and .destroy + */ +__bpf_kfunc void bpf_qdisc_reset_destroy_epilogue(struct Qdisc *sch) +{ + struct bpf_sched_data *q = qdisc_priv(sch); + + qdisc_watchdog_cancel(&q->watchdog); +} + __bpf_kfunc_end_defs(); BTF_KFUNCS_START(qdisc_kfunc_ids) @@ -146,6 +220,9 @@ BTF_ID_FLAGS(func, bpf_skb_get_hash, KF_TRUSTED_ARGS) BTF_ID_FLAGS(func, bpf_kfree_skb, KF_RELEASE) BTF_ID_FLAGS(func, bpf_qdisc_skb_drop, KF_RELEASE) BTF_ID_FLAGS(func, bpf_dynptr_from_skb, KF_TRUSTED_ARGS) +BTF_ID_FLAGS(func, bpf_qdisc_watchdog_schedule, KF_TRUSTED_ARGS) +BTF_ID_FLAGS(func, bpf_qdisc_init_prologue, KF_TRUSTED_ARGS) +BTF_ID_FLAGS(func, bpf_qdisc_reset_destroy_epilogue, KF_TRUSTED_ARGS) BTF_KFUNCS_END(qdisc_kfunc_ids) BTF_SET_START(qdisc_common_kfunc_set) @@ -156,8 +233,13 @@ BTF_SET_END(qdisc_common_kfunc_set) BTF_SET_START(qdisc_enqueue_kfunc_set) BTF_ID(func, bpf_qdisc_skb_drop) +BTF_ID(func, bpf_qdisc_watchdog_schedule) BTF_SET_END(qdisc_enqueue_kfunc_set) +BTF_SET_START(qdisc_dequeue_kfunc_set) +BTF_ID(func, bpf_qdisc_watchdog_schedule) +BTF_SET_END(qdisc_dequeue_kfunc_set) + static int bpf_qdisc_kfunc_filter(const struct bpf_prog *prog, u32 kfunc_id) { if (bpf_Qdisc_ops.type != btf_type_by_id(prog->aux->attach_btf, @@ -174,6 +256,9 @@ static int bpf_qdisc_kfunc_filter(const struct bpf_prog *prog, u32 kfunc_id) if (bpf_struct_ops_prog_moff(prog) == offsetof(struct Qdisc_ops, enqueue)) { if (btf_id_set_contains(&qdisc_enqueue_kfunc_set, kfunc_id)) return 0; + } else if (bpf_struct_ops_prog_moff(prog) == offsetof(struct Qdisc_ops, dequeue)) { + if (btf_id_set_contains(&qdisc_dequeue_kfunc_set, kfunc_id)) + return 0; } return btf_id_set_contains(&qdisc_common_kfunc_set, kfunc_id) ? 
0 : -EACCES; @@ -189,6 +274,8 @@ static const struct bpf_verifier_ops bpf_qdisc_verifier_ops = { .get_func_proto = bpf_qdisc_get_func_proto, .is_valid_access = bpf_qdisc_is_valid_access, .btf_struct_access = bpf_qdisc_btf_struct_access, + .gen_prologue = bpf_qdisc_gen_prologue, + .gen_epilogue = bpf_qdisc_gen_epilogue, }; static int bpf_qdisc_init_member(const struct btf_type *t, @@ -204,6 +291,11 @@ static int bpf_qdisc_init_member(const struct btf_type *t, moff = __btf_member_bit_offset(t, member) / 8; switch (moff) { + case offsetof(struct Qdisc_ops, priv_size): + if (uqdisc_ops->priv_size) + return -EINVAL; + qdisc_ops->priv_size = sizeof(struct bpf_sched_data); + return 1; case offsetof(struct Qdisc_ops, peek): qdisc_ops->peek = qdisc_peek_dequeued; return 0; From patchwork Fri Jan 31 19:28:51 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Amery Hung X-Patchwork-Id: 13955704 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mail-pj1-f41.google.com (mail-pj1-f41.google.com [209.85.216.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A30AA1F37A2; Fri, 31 Jan 2025 19:29:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738351778; cv=none; b=PIFzhKFZiRTMhtHHkz9bJvglEp0Eq0DD9CuVt9BkwwFUru6WzVrIlLqp45IYAEAcnoI7HHo5Mz11Tz8DqXia2uy3BUquDvhWZ/sIHsAJULPe3HOuE8cfOp98+x7DEuJAs0X4LSuv9RTpITOWzkj78w6ElMzK8FhgobcJHP1UACs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738351778; c=relaxed/simple; bh=9j9ayn9vddtSbb5PDOLCF9fL7FXTV/IwE8oNGDcu+/Y=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=nmUmygMa1KzUYXXHLgjChfArCPDieBs8iziz79fZ6m5C0VKTLlpdJ4Res4b/AwXSO5E+v8bysDBTAX4nfzLpJGA64oWkP/LIjdZQO9DqkKGGL3gjIsBwF+mI8/MuvT5z2LtJ9AihoKxpDDnMjbOwuRoh/ME1+3ZykkWeW0wG+sM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=ed1dTr87; arc=none smtp.client-ip=209.85.216.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ed1dTr87" Received: by mail-pj1-f41.google.com with SMTP id 98e67ed59e1d1-2eec9b3a1bbso3239358a91.3; Fri, 31 Jan 2025 11:29:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1738351776; x=1738956576; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=CeQdpMyy6nDYQ4+R5zYQB6sZCeY7xrydRKRRjxE/p3M=; b=ed1dTr87rKoCI+pr6JIVyd8giFnwgecnsHktMqT7joBQFBGWW+RTBfNZNMsk+StS8I uEQ83xYATr7D4UhX6HYSgpAZ+KlDw5vRP1JSiuOzMGitKfaMQ1P9pol0oit3SUgBPG2p MOWbzFthljrYXl8tbwzKeABDceJxRjR97PbbFjCsYfhFahFl2vxtZSvbJSvKg+HBWvd/ /3BCvpw9HfWuJN9rAou91nDYU+bzU+x1kEd/7byzd1lCUTHqBHHOC50VuXXaPCRuEv0u d3WndHVEoqKyRAeS+1a2IB9s2fvoa1G1iI6MjPdI3BBAWY80TbtRvWVzvKDdmOYXJxL6 pN9g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1738351776; x=1738956576; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=CeQdpMyy6nDYQ4+R5zYQB6sZCeY7xrydRKRRjxE/p3M=; b=HwyCUNXMUgfjs1r9pUuJ0R1nDcYvF2uN6cdrXFshMNcLo9j/9UVDdzKh6XvuGtbZxd 
nLLwOS62/2KUiWd9sOYjmnvRVHyzGLUU8zQgAfCTV9PeR5ga4WBllPT1tH7OQRxkSdhp TqC13f5l6QowCjhJH2ALFYNbT2MIkgBrua7lKWh563U9afkgOHTNMjlrdr4fvDQUCTVC T+4AWZ4tl3EBu5/J7b0xkDrPowdc5tYUcLdRfdQfKhwQV6Zb+YvFgaNqZH+VVNHzZcNr 7ep2UiKH6zmnVcsGCoeHJToBddGvGoDbU0T3KFDUV4qvhF35QEfmsM5cThFuBRBGPFNe N9aw== X-Gm-Message-State: AOJu0Yx9+96/L0bx8KXLoil65EleeHMNtM6A6fgb1TqMA6uaD+vCJHpH VV69RX6VpKWg0M9yDhsw2MdCHFh260d8iDLm501mTdJLS3naRRGsV05Qy3/ln+U= X-Gm-Gg: ASbGncvwYLXB618h55Lbu01YspM0sLz8BgIxb9TLRMpuA5yG4ngiUKVznvjRSLuxhof 429FQFNZxCGAwZ5XMH6fT5JY86n6LGYpAp/jBy2Z7DRY2cxKD/CcPryBjog7hHGyVDuPossClH/ mVl7FkOuXC0sQ08Etm4MJ6SwXQlUu3B784PTkpEf/6JmD3kTbZi8K9ydBGsMawoy1CcAanglwns 1Dmphs3HNp8ypR8yWGNVG6sgGk6k+VP9CYLRFyfAJKeOr02zhCiA+HQ9uLJ/b9YahTUcJm7n1hq hdgAWKMBfPZk7BjPjJHbghBqondnCdJNcvv6wNi6f0MUJ4L9B3J2bdwexwtUrtuVsg== X-Google-Smtp-Source: AGHT+IGLnp8kodW3NTulgGNV/t+SlNdynQwACA4YDsKRevCUawfcq1NaCpLJNnBArga9rD81fAjTpQ== X-Received: by 2002:a17:90b:5410:b0:2ee:96a5:721e with SMTP id 98e67ed59e1d1-2f83abdf0e5mr20817604a91.12.1738351775679; Fri, 31 Jan 2025 11:29:35 -0800 (PST) Received: from localhost.localdomain (c-76-146-13-146.hsd1.wa.comcast.net. 
[76.146.13.146]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2f8489d3707sm4072471a91.23.2025.01.31.11.29.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 31 Jan 2025 11:29:35 -0800 (PST) From: Amery Hung To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org, alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org, edumazet@google.com, xiyou.wangcong@gmail.com, cong.wang@bytedance.com, jhs@mojatatu.com, sinquersw@gmail.com, toke@redhat.com, jiri@resnulli.us, stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br, yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com, ming.lei@redhat.com, kernel-team@meta.com Subject: [PATCH bpf-next v3 12/18] bpf: net_sched: Support updating bstats Date: Fri, 31 Jan 2025 11:28:51 -0800 Message-ID: <20250131192912.133796-13-ameryhung@gmail.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250131192912.133796-1-ameryhung@gmail.com> References: <20250131192912.133796-1-ameryhung@gmail.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net From: Amery Hung Add a kfunc to update Qdisc bstats when an skb is dequeued. The kfunc is only available in .dequeue programs. Signed-off-by: Amery Hung --- net/sched/bpf_qdisc.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/net/sched/bpf_qdisc.c b/net/sched/bpf_qdisc.c index 5abf11aa8340..1f2819e41df8 100644 --- a/net/sched/bpf_qdisc.c +++ b/net/sched/bpf_qdisc.c @@ -213,6 +213,15 @@ __bpf_kfunc void bpf_qdisc_reset_destroy_epilogue(struct Qdisc *sch) qdisc_watchdog_cancel(&q->watchdog); } +/* bpf_qdisc_bstats_update - Update Qdisc basic statistics + * @sch: The qdisc from which an skb is dequeued. + * @skb: The skb to be dequeued. 
+ */ +__bpf_kfunc void bpf_qdisc_bstats_update(struct Qdisc *sch, const struct sk_buff *skb) +{ + bstats_update(&sch->bstats, skb); +} + __bpf_kfunc_end_defs(); BTF_KFUNCS_START(qdisc_kfunc_ids) @@ -223,6 +232,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_from_skb, KF_TRUSTED_ARGS) BTF_ID_FLAGS(func, bpf_qdisc_watchdog_schedule, KF_TRUSTED_ARGS) BTF_ID_FLAGS(func, bpf_qdisc_init_prologue, KF_TRUSTED_ARGS) BTF_ID_FLAGS(func, bpf_qdisc_reset_destroy_epilogue, KF_TRUSTED_ARGS) +BTF_ID_FLAGS(func, bpf_qdisc_bstats_update, KF_TRUSTED_ARGS) BTF_KFUNCS_END(qdisc_kfunc_ids) BTF_SET_START(qdisc_common_kfunc_set) @@ -238,6 +248,7 @@ BTF_SET_END(qdisc_enqueue_kfunc_set) BTF_SET_START(qdisc_dequeue_kfunc_set) BTF_ID(func, bpf_qdisc_watchdog_schedule) +BTF_ID(func, bpf_qdisc_bstats_update) BTF_SET_END(qdisc_dequeue_kfunc_set) static int bpf_qdisc_kfunc_filter(const struct bpf_prog *prog, u32 kfunc_id) From patchwork Fri Jan 31 19:28:52 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Amery Hung X-Patchwork-Id: 13955705 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mail-pj1-f44.google.com (mail-pj1-f44.google.com [209.85.216.44]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3FBD71F4735; Fri, 31 Jan 2025 19:29:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.44 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738351779; cv=none; b=r0EyHhvpGTbHYlgrreuNCjVhwQHblEUoiSHWJFryunD3qS5pf5PHBucm6welrqyEMNfpC9/eEd6QD8hcs5v8DHjCwwhRqVd3K5+lyFuDpMGGCs9u/7UJ4KIZ7UWonDZOTDXSpvUd3xIfgzDw76vSmBbVd1IabZIseJ5mvgaNIgs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738351779; c=relaxed/simple; bh=+EeiWRNMuZIxJQgj0Fh2A4CN804Wmkb1hmHSFeK94s4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: 
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org,
	alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org,
	edumazet@google.com, xiyou.wangcong@gmail.com, cong.wang@bytedance.com,
	jhs@mojatatu.com, sinquersw@gmail.com, toke@redhat.com, jiri@resnulli.us,
	stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br,
	yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com,
	ming.lei@redhat.com, kernel-team@meta.com
Subject: [PATCH bpf-next v3 13/18] bpf: net_sched: Support updating qstats
Date: Fri, 31 Jan 2025 11:28:52 -0800
Message-ID: <20250131192912.133796-14-ameryhung@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250131192912.133796-1-ameryhung@gmail.com>
References: <20250131192912.133796-1-ameryhung@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

Allow bpf qdisc programs to update Qdisc qstats directly with btf struct
access.

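For readers who want to see the write check from this patch in isolation,
here is a minimal user-space sketch. It assumes a made-up struct demo_qdisc
and a local demo_offsetofend() macro; neither is the kernel's definition, and
the layout is for illustration only. The idea it shows is the same: a write
is accepted only if it starts inside the qstats member and does not run past
the end of that member.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's struct layouts. */
struct demo_qstats {
	unsigned int qlen;
	unsigned int backlog;
	unsigned int drops;
};

struct demo_qdisc {
	unsigned int flags;
	unsigned int limit;
	struct demo_qstats qstats;
	unsigned long state;
};

/* Offset of the first byte past MEMBER (assumed local helper macro). */
#define demo_offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/*
 * Return 0 if a write of `size` bytes at byte offset `off` stays inside
 * demo_qdisc.qstats, -EACCES otherwise.
 */
int demo_qdisc_write_ok(size_t off, size_t size)
{
	const size_t start = offsetof(struct demo_qdisc, qstats);
	const size_t end = demo_offsetofend(struct demo_qdisc, qstats);

	/* Corresponds to the "case start ... end - 1" range and the default arm. */
	if (off < start || off >= end)
		return -EACCES;

	/* Corresponds to the "off + size > end" check. */
	if (off + size > end)
		return -EACCES;

	return 0;
}

/* Small self-check covering accepted and rejected writes. */
void demo_qdisc_write_selfcheck(void)
{
	const size_t qs = offsetof(struct demo_qdisc, qstats);

	assert(demo_qdisc_write_ok(qs, 4) == 0);            /* first field */
	assert(demo_qdisc_write_ok(qs + 8, 4) == 0);        /* last field */
	assert(demo_qdisc_write_ok(qs + 8, 8) == -EACCES);  /* runs past the end */
	assert(demo_qdisc_write_ok(offsetof(struct demo_qdisc, limit), 4) == -EACCES);
}
```

Calling demo_qdisc_write_selfcheck() from a test harness exercises both
outcomes. The sketch uses plain comparisons rather than a GNU case range so
that it builds with any C99 compiler.
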
Signed-off-by: Amery Hung
---
 net/sched/bpf_qdisc.c | 53 ++++++++++++++++++++++++++++++++++++-------
 1 file changed, 45 insertions(+), 8 deletions(-)

diff --git a/net/sched/bpf_qdisc.c b/net/sched/bpf_qdisc.c
index 1f2819e41df8..2427343d8a10 100644
--- a/net/sched/bpf_qdisc.c
+++ b/net/sched/bpf_qdisc.c
@@ -36,6 +36,7 @@ bpf_qdisc_get_func_proto(enum bpf_func_id func_id,
 	}
 }

+BTF_ID_LIST_SINGLE(bpf_qdisc_ids, struct, Qdisc)
 BTF_ID_LIST_SINGLE(bpf_sk_buff_ids, struct, sk_buff)
 BTF_ID_LIST_SINGLE(bpf_sk_buff_ptr_ids, struct, bpf_sk_buff_ptr)

@@ -60,20 +61,37 @@ static bool bpf_qdisc_is_valid_access(int off, int size,
 	return bpf_tracing_btf_ctx_access(off, size, type, prog, info);
 }

-static int bpf_qdisc_btf_struct_access(struct bpf_verifier_log *log,
-				       const struct bpf_reg_state *reg,
-				       int off, int size)
+static int bpf_qdisc_qdisc_access(struct bpf_verifier_log *log,
+				  const struct bpf_reg_state *reg,
+				  int off, int size)
 {
-	const struct btf_type *t, *skbt;
 	size_t end;

-	skbt = btf_type_by_id(reg->btf, bpf_sk_buff_ids[0]);
-	t = btf_type_by_id(reg->btf, reg->btf_id);
-	if (t != skbt) {
-		bpf_log(log, "only read is supported\n");
+	switch (off) {
+	case offsetof(struct Qdisc, qstats) ... offsetofend(struct Qdisc, qstats) - 1:
+		end = offsetofend(struct Qdisc, qstats);
+		break;
+	default:
+		bpf_log(log, "no write support to Qdisc at off %d\n", off);
+		return -EACCES;
+	}
+
+	if (off + size > end) {
+		bpf_log(log,
+			"write access at off %d with size %d beyond the member of Qdisc ended at %zu\n",
+			off, size, end);
 		return -EACCES;
 	}

+	return 0;
+}
+
+static int bpf_qdisc_sk_buff_access(struct bpf_verifier_log *log,
+				    const struct bpf_reg_state *reg,
+				    int off, int size)
+{
+	size_t end;
+
 	switch (off) {
 	case offsetof(struct sk_buff, tstamp):
 		end = offsetofend(struct sk_buff, tstamp);
@@ -115,6 +133,25 @@ static int bpf_qdisc_btf_struct_access(struct bpf_verifier_log *log,
 	return 0;
 }

+static int bpf_qdisc_btf_struct_access(struct bpf_verifier_log *log,
+				       const struct bpf_reg_state *reg,
+				       int off, int size)
+{
+	const struct btf_type *t, *skbt, *qdisct;
+
+	skbt = btf_type_by_id(reg->btf, bpf_sk_buff_ids[0]);
+	qdisct = btf_type_by_id(reg->btf, bpf_qdisc_ids[0]);
+	t = btf_type_by_id(reg->btf, reg->btf_id);
+
+	if (t == skbt)
+		return bpf_qdisc_sk_buff_access(log, reg, off, size);
+	else if (t == qdisct)
+		return bpf_qdisc_qdisc_access(log, reg, off, size);
+
+	bpf_log(log, "only read is supported\n");
+	return -EACCES;
+}
+
 BTF_ID_LIST(bpf_qdisc_init_prologue_ids)
 BTF_ID(func, bpf_qdisc_init_prologue)


From patchwork Fri Jan 31 19:28:53 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 13955706
X-Patchwork-Delegate: bpf@iogearbox.net
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org,
	alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org,
	edumazet@google.com, xiyou.wangcong@gmail.com, cong.wang@bytedance.com,
	jhs@mojatatu.com, sinquersw@gmail.com, toke@redhat.com, jiri@resnulli.us,
	stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br,
	yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com,
	ming.lei@redhat.com, kernel-team@meta.com
Subject: [PATCH bpf-next v3 14/18] bpf: net_sched: Allow writing to more Qdisc members
Date: Fri, 31 Jan 2025 11:28:53 -0800
Message-ID: <20250131192912.133796-15-ameryhung@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250131192912.133796-1-ameryhung@gmail.com>
References: <20250131192912.133796-1-ameryhung@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

Allow bpf qdisc to write to Qdisc->limit and Qdisc->q.qlen.

Signed-off-by: Amery Hung
---
 net/sched/bpf_qdisc.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/sched/bpf_qdisc.c b/net/sched/bpf_qdisc.c
index 2427343d8a10..b8f02ff8734e 100644
--- a/net/sched/bpf_qdisc.c
+++ b/net/sched/bpf_qdisc.c
@@ -68,6 +68,12 @@ static int bpf_qdisc_qdisc_access(struct bpf_verifier_log *log,
 	size_t end;

 	switch (off) {
+	case offsetof(struct Qdisc, limit):
+		end = offsetofend(struct Qdisc, limit);
+		break;
+	case offsetof(struct Qdisc, q) + offsetof(struct qdisc_skb_head, qlen):
+		end = offsetof(struct Qdisc, q) + offsetofend(struct qdisc_skb_head, qlen);
+		break;
 	case offsetof(struct Qdisc, qstats) ... offsetofend(struct Qdisc, qstats) - 1:
 		end = offsetofend(struct Qdisc, qstats);
 		break;

From patchwork Fri Jan 31 19:28:54 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 13955707
X-Patchwork-Delegate: bpf@iogearbox.net
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org,
	alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org,
	edumazet@google.com, xiyou.wangcong@gmail.com, cong.wang@bytedance.com,
	jhs@mojatatu.com, sinquersw@gmail.com, toke@redhat.com, jiri@resnulli.us,
	stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br,
	yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com,
	ming.lei@redhat.com, kernel-team@meta.com
Subject: [PATCH bpf-next v3 15/18] libbpf: Support creating and destroying qdisc
Date: Fri, 31 Jan 2025 11:28:54 -0800
Message-ID: <20250131192912.133796-16-ameryhung@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250131192912.133796-1-ameryhung@gmail.com>
References: <20250131192912.133796-1-ameryhung@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

Extend struct bpf_tc_hook with a handle, a qdisc name, and a new attach
point, BPF_TC_QDISC, so that users can add or remove any specified qdisc in
addition to clsact.

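The new fields are appended to the end of struct bpf_tc_hook and
bpf_tc_hook__last_field moves with them, which is what lets callers built
against the older, shorter struct keep working. A rough sketch of that
size-gated lookup follows. It assumes simplified demo_hook_v1/demo_hook_v2
structs and DEMO_HAS/DEMO_GET macros that only imitate the pattern; they are
not libbpf's definitions of struct bpf_tc_hook or OPTS_GET.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; libbpf's real struct and macros differ. */
struct demo_hook_v1 {		/* layout known to an older caller */
	size_t sz;
	int ifindex;
	unsigned int parent;
};

struct demo_hook_v2 {		/* layout after appending two fields */
	size_t sz;
	int ifindex;
	unsigned int parent;
	unsigned int handle;
	const char *qdisc;
};

#define DEMO_OFFSETOFEND(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* A field counts as present only if the caller's sz covers it completely. */
#define DEMO_HAS(h, field) \
	((h) && (h)->sz >= DEMO_OFFSETOFEND(struct demo_hook_v2, field))
#define DEMO_GET(h, field, fallback) \
	(DEMO_HAS(h, field) ? (h)->field : (fallback))

/* Handle requested by the caller, or 0 if the caller predates the field. */
unsigned int demo_hook_handle(const struct demo_hook_v2 *h)
{
	return DEMO_GET(h, handle, 0u);
}

/* Qdisc kind requested by the caller, or NULL if it was never provided. */
const char *demo_hook_kind(const struct demo_hook_v2 *h)
{
	return DEMO_GET(h, qdisc, (const char *)NULL);
}

/* Small self-check: the same storage read as a "new" and an "old" caller. */
void demo_hook_selfcheck(void)
{
	struct demo_hook_v2 h = {
		.sz = sizeof(struct demo_hook_v2),
		.ifindex = 1,
		.parent = 0xFFFFFFFFu,
		.handle = 0x8000000u,
		.qdisc = "bpf_fifo",
	};

	assert(demo_hook_handle(&h) == 0x8000000u);
	assert(strcmp(demo_hook_kind(&h), "bpf_fifo") == 0);

	/* Pretend the caller only knows the v1 layout. */
	h.sz = sizeof(struct demo_hook_v1);
	assert(demo_hook_handle(&h) == 0u);
	assert(demo_hook_kind(&h) == NULL);
}
```

In this sketch a missing qdisc name comes back as NULL, so a consumer of
demo_hook_kind() would want to check the result before passing it to
strlen() or a similar function.
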
Signed-off-by: Amery Hung
---
 tools/lib/bpf/libbpf.h  |  5 ++++-
 tools/lib/bpf/netlink.c | 20 +++++++++++++++++---
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 3020ee45303a..12c81e6da219 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -1270,6 +1270,7 @@ enum bpf_tc_attach_point {
 	BPF_TC_INGRESS = 1 << 0,
 	BPF_TC_EGRESS = 1 << 1,
 	BPF_TC_CUSTOM = 1 << 2,
+	BPF_TC_QDISC = 1 << 3,
 };

 #define BPF_TC_PARENT(a, b) \
@@ -1284,9 +1285,11 @@ struct bpf_tc_hook {
 	int ifindex;
 	enum bpf_tc_attach_point attach_point;
 	__u32 parent;
+	__u32 handle;
+	const char *qdisc;
 	size_t :0;
 };
-#define bpf_tc_hook__last_field parent
+#define bpf_tc_hook__last_field qdisc

 struct bpf_tc_opts {
 	size_t sz;
diff --git a/tools/lib/bpf/netlink.c b/tools/lib/bpf/netlink.c
index 68a2def17175..c997e69d507f 100644
--- a/tools/lib/bpf/netlink.c
+++ b/tools/lib/bpf/netlink.c
@@ -529,9 +529,9 @@ int bpf_xdp_query_id(int ifindex, int flags, __u32 *prog_id)
 }


-typedef int (*qdisc_config_t)(struct libbpf_nla_req *req);
+typedef int (*qdisc_config_t)(struct libbpf_nla_req *req, const struct bpf_tc_hook *hook);

-static int clsact_config(struct libbpf_nla_req *req)
+static int clsact_config(struct libbpf_nla_req *req, const struct bpf_tc_hook *hook)
 {
 	req->tc.tcm_parent = TC_H_CLSACT;
 	req->tc.tcm_handle = TC_H_MAKE(TC_H_CLSACT, 0);
@@ -539,6 +539,16 @@ static int clsact_config(struct libbpf_nla_req *req)
 	return nlattr_add(req, TCA_KIND, "clsact", sizeof("clsact"));
 }

+static int qdisc_config(struct libbpf_nla_req *req, const struct bpf_tc_hook *hook)
+{
+	const char *qdisc = OPTS_GET(hook, qdisc, NULL);
+
+	req->tc.tcm_parent = OPTS_GET(hook, parent, TC_H_ROOT);
+	req->tc.tcm_handle = OPTS_GET(hook, handle, 0);
+
+	return nlattr_add(req, TCA_KIND, qdisc, strlen(qdisc) + 1);
+}
+
 static int attach_point_to_config(struct bpf_tc_hook *hook,
 				  qdisc_config_t *config)
 {
@@ -552,6 +562,9 @@ static int attach_point_to_config(struct bpf_tc_hook *hook,
 		return 0;
 	case BPF_TC_CUSTOM:
 		return -EOPNOTSUPP;
+	case BPF_TC_QDISC:
+		*config = &qdisc_config;
+		return 0;
 	default:
 		return -EINVAL;
 	}
@@ -596,7 +609,7 @@ static int tc_qdisc_modify(struct bpf_tc_hook *hook, int cmd, int flags)
 	req.tc.tcm_family = AF_UNSPEC;
 	req.tc.tcm_ifindex = OPTS_GET(hook, ifindex, 0);

-	ret = config(&req);
+	ret = config(&req, hook);
 	if (ret < 0)
 		return ret;

@@ -639,6 +652,7 @@ int bpf_tc_hook_destroy(struct bpf_tc_hook *hook)
 	case BPF_TC_INGRESS:
 	case BPF_TC_EGRESS:
 		return libbpf_err(__bpf_tc_detach(hook, NULL, true));
+	case BPF_TC_QDISC:
 	case BPF_TC_INGRESS | BPF_TC_EGRESS:
 		return libbpf_err(tc_qdisc_delete(hook));
 	case BPF_TC_CUSTOM:

From patchwork Fri Jan 31 19:28:55 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 13955708
X-Patchwork-Delegate: bpf@iogearbox.net
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org,
	alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org,
	edumazet@google.com, xiyou.wangcong@gmail.com, cong.wang@bytedance.com,
	jhs@mojatatu.com, sinquersw@gmail.com, toke@redhat.com, jiri@resnulli.us,
	stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br,
	yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com,
	ming.lei@redhat.com, kernel-team@meta.com
Subject: [PATCH bpf-next v3 16/18] selftests/bpf: Add a basic fifo qdisc test
Date: Fri, 31 Jan 2025 11:28:55 -0800
Message-ID: <20250131192912.133796-17-ameryhung@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250131192912.133796-1-ameryhung@gmail.com>
References: <20250131192912.133796-1-ameryhung@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

This selftest includes a bare minimum fifo qdisc, which simply enqueues
sk_buffs into the back of a bpf list and dequeues from the front of the
list.

Signed-off-by: Amery Hung
---
 tools/testing/selftests/bpf/config            |   1 +
 .../selftests/bpf/prog_tests/bpf_qdisc.c      |  79 ++++++++++++
 .../selftests/bpf/progs/bpf_qdisc_common.h    |  27 ++++
 .../selftests/bpf/progs/bpf_qdisc_fifo.c      | 117 ++++++++++++++++++
 4 files changed, 224 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_qdisc_common.h
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_qdisc_fifo.c

diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config
index c378d5d07e02..6b0cab55bd2d 100644
--- a/tools/testing/selftests/bpf/config
+++ b/tools/testing/selftests/bpf/config
@@ -71,6 +71,7 @@ CONFIG_NET_IPGRE=y
 CONFIG_NET_IPGRE_DEMUX=y
 CONFIG_NET_IPIP=y
 CONFIG_NET_MPLS_GSO=y
+CONFIG_NET_SCH_BPF=y
 CONFIG_NET_SCH_FQ=y
 CONFIG_NET_SCH_INGRESS=y
 CONFIG_NET_SCHED=y
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c b/tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c
new file mode 100644
index 000000000000..f2efc69af348
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c
@@ -0,0 +1,79 @@
+#include
+#include
+#include
+
+#include "network_helpers.h"
+#include "bpf_qdisc_fifo.skel.h"
+
+#define LO_IFINDEX 1
+
+static const unsigned int total_bytes = 10 * 1024 * 1024;
+
+static void do_test(char *qdisc)
+{
+	DECLARE_LIBBPF_OPTS(bpf_tc_hook, hook, .ifindex = LO_IFINDEX,
+			    .attach_point = BPF_TC_QDISC,
+			    .parent = TC_H_ROOT,
+			    .handle = 0x8000000,
+			    .qdisc = qdisc);
+	int srv_fd = -1, cli_fd = -1;
+	int err;
+
+	err = bpf_tc_hook_create(&hook);
+	if (!ASSERT_OK(err, "attach qdisc"))
+		return;
+
+	srv_fd = start_server(AF_INET6, SOCK_STREAM, NULL, 0, 0);
+	if (!ASSERT_OK_FD(srv_fd, "start server"))
+		goto done;
+
+	cli_fd = connect_to_fd(srv_fd, 0);
+	if (!ASSERT_OK_FD(cli_fd, "connect to client"))
+		goto done;
+
+	err = send_recv_data(srv_fd, cli_fd, total_bytes);
+	ASSERT_OK(err, "send_recv_data");
+
+done:
+	if (srv_fd != -1)
+		close(srv_fd);
+	if (cli_fd != -1)
+		close(cli_fd);
+
+	bpf_tc_hook_destroy(&hook);
+}
+
+static void test_fifo(void)
+{
+	struct bpf_qdisc_fifo *fifo_skel;
+	struct bpf_link *link;
+
+	fifo_skel = bpf_qdisc_fifo__open_and_load();
+	if (!ASSERT_OK_PTR(fifo_skel, "bpf_qdisc_fifo__open_and_load"))
+		return;
+
+	link = bpf_map__attach_struct_ops(fifo_skel->maps.fifo);
+	if (!ASSERT_OK_PTR(link, "bpf_map__attach_struct_ops")) {
+		bpf_qdisc_fifo__destroy(fifo_skel);
+		return;
+	}
+
+	do_test("bpf_fifo");
+
+	bpf_link__destroy(link);
+	bpf_qdisc_fifo__destroy(fifo_skel);
+}
+
+void test_bpf_qdisc(void)
+{
+	struct netns_obj *netns;
+
+	netns = netns_new("bpf_qdisc_ns", true);
+	if (!ASSERT_OK_PTR(netns, "netns_new"))
+		return;
+
+	if (test__start_subtest("fifo"))
+		test_fifo();
+
+	netns_free(netns);
+}
diff --git a/tools/testing/selftests/bpf/progs/bpf_qdisc_common.h b/tools/testing/selftests/bpf/progs/bpf_qdisc_common.h
new file mode 100644
index 000000000000..62a778f94908
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/bpf_qdisc_common.h
@@ -0,0 +1,27 @@
+#ifndef _BPF_QDISC_COMMON_H
+#define _BPF_QDISC_COMMON_H
+
+#define NET_XMIT_SUCCESS 0x00
+#define NET_XMIT_DROP 0x01 /* skb dropped */
+#define NET_XMIT_CN 0x02 /* congestion notification */
+
+#define TC_PRIO_CONTROL 7
+#define TC_PRIO_MAX 15
+
+u32 bpf_skb_get_hash(struct sk_buff *p) __ksym;
+void bpf_kfree_skb(struct sk_buff *p) __ksym;
+void bpf_qdisc_skb_drop(struct sk_buff *p, struct bpf_sk_buff_ptr *to_free) __ksym;
+void bpf_qdisc_watchdog_schedule(struct Qdisc *sch, u64 expire, u64 delta_ns) __ksym;
+void bpf_qdisc_bstats_update(struct Qdisc *sch, const struct sk_buff *skb) __ksym;
+
+static struct qdisc_skb_cb *qdisc_skb_cb(const struct sk_buff *skb)
+{
+	return (struct qdisc_skb_cb *)skb->cb;
+}
+
+static inline unsigned int qdisc_pkt_len(const struct sk_buff *skb)
+{
+	return qdisc_skb_cb(skb)->pkt_len;
+}
+
+#endif
diff --git a/tools/testing/selftests/bpf/progs/bpf_qdisc_fifo.c b/tools/testing/selftests/bpf/progs/bpf_qdisc_fifo.c
new file mode 100644
index 000000000000..705e7da325da
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/bpf_qdisc_fifo.c
@@ -0,0 +1,117 @@
+#include
+#include "bpf_experimental.h"
+#include "bpf_qdisc_common.h"
+
+char _license[] SEC("license") = "GPL";
+
+struct skb_node {
+	struct sk_buff __kptr * skb;
+	struct bpf_list_node node;
+};
+
+#define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8)))
+
+private(A) struct bpf_spin_lock q_fifo_lock;
+private(A) struct bpf_list_head q_fifo __contains(skb_node, node);
+
+SEC("struct_ops/bpf_fifo_enqueue")
+int BPF_PROG(bpf_fifo_enqueue, struct sk_buff *skb, struct Qdisc *sch,
+	     struct bpf_sk_buff_ptr *to_free)
+{
+	struct skb_node *skbn;
+	u32 pkt_len;
+
+	if (sch->q.qlen == sch->limit)
+		goto drop;
+
+	skbn = bpf_obj_new(typeof(*skbn));
+	if (!skbn)
+		goto drop;
+
+	pkt_len = qdisc_pkt_len(skb);
+
+	sch->q.qlen++;
+	skb = bpf_kptr_xchg(&skbn->skb, skb);
+	if (skb)
+		bpf_qdisc_skb_drop(skb, to_free);
+
+	bpf_spin_lock(&q_fifo_lock);
+	bpf_list_push_back(&q_fifo, &skbn->node);
+	bpf_spin_unlock(&q_fifo_lock);
+
+	sch->qstats.backlog += pkt_len;
+	return NET_XMIT_SUCCESS;
+drop:
+	bpf_qdisc_skb_drop(skb, to_free);
+	return NET_XMIT_DROP;
+}
+
+SEC("struct_ops/bpf_fifo_dequeue")
+struct sk_buff *BPF_PROG(bpf_fifo_dequeue, struct Qdisc *sch)
+{
+	struct bpf_list_node *node;
+	struct sk_buff *skb = NULL;
+	struct skb_node *skbn;
+
+	bpf_spin_lock(&q_fifo_lock);
+	node = bpf_list_pop_front(&q_fifo);
+	bpf_spin_unlock(&q_fifo_lock);
+	if (!node)
+		return NULL;
+
+	skbn = container_of(node, struct skb_node, node);
+	skb = bpf_kptr_xchg(&skbn->skb, skb);
+	bpf_obj_drop(skbn);
+	if (!skb)
+		return NULL;
+
+	sch->qstats.backlog -= qdisc_pkt_len(skb);
+	bpf_qdisc_bstats_update(sch, skb);
+	sch->q.qlen--;
+
+	return skb;
+}
+
+SEC("struct_ops/bpf_fifo_init")
+int BPF_PROG(bpf_fifo_init, struct Qdisc *sch, struct nlattr *opt,
+	     struct netlink_ext_ack *extack)
+{
+	sch->limit = 1000;
+	return 0;
+}
+
+SEC("struct_ops/bpf_fifo_reset")
+void BPF_PROG(bpf_fifo_reset, struct Qdisc *sch)
+{
+	struct bpf_list_node *node;
+	struct skb_node *skbn;
+	int i;
+
+	bpf_for(i, 0, sch->q.qlen) {
+		struct sk_buff *skb = NULL;
+
+		bpf_spin_lock(&q_fifo_lock);
+		node = bpf_list_pop_front(&q_fifo);
+		bpf_spin_unlock(&q_fifo_lock);
+
+		if (!node)
+			break;
+
+		skbn = container_of(node, struct skb_node, node);
+		skb = bpf_kptr_xchg(&skbn->skb, skb);
+		if (skb)
+			bpf_kfree_skb(skb);
+		bpf_obj_drop(skbn);
+	}
+	sch->q.qlen = 0;
+}
+
+SEC(".struct_ops")
+struct Qdisc_ops fifo = {
+	.enqueue = (void *)bpf_fifo_enqueue,
+	.dequeue = (void *)bpf_fifo_dequeue,
+	.init = (void *)bpf_fifo_init,
+	.reset = (void *)bpf_fifo_reset,
+	.id = "bpf_fifo",
+};
+

From patchwork Fri Jan 31 19:28:56 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 13955709
X-Patchwork-Delegate: bpf@iogearbox.net
d=subspace.kernel.org; s=arc-20240116; t=1738351784; c=relaxed/simple; bh=ZA5ErMZT2ybXhxaR0RFOkuudpRYCUeueovEUbA/H59M=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=L7UyIQbK6+NRSMcPyuAwxNC6fw4Hlj/9VCcaIW46FSBcZ1oAnVY6e7jnGRZrzWAL2AnoDv37SJsYo1kPz0sUDhRrmMAEWd8J6CxBtZm+TPKr+9/om+nxlBqn1LuDUPz5ph1tKRFCSSCUYFLoMi7mEPxaZJ8FhgEpNBX6VtblOyg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=I7K9iUTV; arc=none smtp.client-ip=209.85.214.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="I7K9iUTV" Received: by mail-pl1-f171.google.com with SMTP id d9443c01a7336-2163dc5155fso43779545ad.0; Fri, 31 Jan 2025 11:29:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1738351781; x=1738956581; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=SbMTv6gRhfGF0R0tgTDhQ0HzlGo4XkSmvSiNW4A323U=; b=I7K9iUTVJDFKjhB9bEdrra2yi8VNPoitX2Vqa8knVefTgWaR1ph+dyqWClZqZvd0pV WYi/RmNi6ZjHQ49il2jXmN+Gte+PCH32a5Uv3bwFpCFk61rkVST1PPDVoRW1KUL9CLh4 Jv7ryvXUs8W9zqE4B50/qCwxvagA3YvlLqFh0Dfu9teH3hItBZCIi9S8wAOrfNn5SA0D Wp/JzU5Fg2H6ELS5Vrz7NQTTV2d4gGtBuh7m5CQZ9NS0LwaKePNtSLKJXE4IhxPEFP1/ r4MHDaXf0bj9Yt4HSdsy445UOb+Icyw6hzhFj6Nlxe4yemfTU0fX7CS/pGovr1H0Xo61 zxHQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1738351781; x=1738956581; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=SbMTv6gRhfGF0R0tgTDhQ0HzlGo4XkSmvSiNW4A323U=; b=kUFRXPcY2kScgXTI51ZzSR4lSNm6xImf6p3V2uaS1IRv4d7+WmkigFQWYk0RAnxVEi tlzKZUEsT1/qLWmiuNoItbSmU6gzOxBIiyYgIz9ACSEbHmNZj08YOwJXQfNzwybi0MEs 64Mk4cmZUzYg3w8iUcRb2o2h+3onpYLaQw5RkJeXxbffRxhYvv4L5mne9dsNsSGnFW0p bRSIXO9XANYtXV7+0PZUXL2BjIhVtZflRBVeueNO0yE5R8XAPqG8v+qSX7Sx1lQpx6T3 vdeGgpixWJjhfRKPz4Ktv8SrR7QO5wxb3XGIcoBCPQISP1Ec+FijD+JIYrFXjk3WM3Oy 16Kw== X-Gm-Message-State: AOJu0YzDO/jzKP6ctLnPtfG7qQhijlhRGJHoAxVRpWiQh0yd6A6jxYS6 dY75rhJXHkCBrYdKqfA/mwYMm4j43ue3s/AS4hiIaY7bv45DZO936uNSN/8522s= X-Gm-Gg: ASbGncuHwPMAkjMnDODu33UGc07W+ICMMM2B2vUhXVA6USK+qh7NBY5VDleaMECv//t jHQD7i8MdX9b+69KIHWSPheuSahKPvyvkJ9Onj7sJuhdL2vfgBRggfE0bQwXzaPG24Q32L6czNq mUj/Q2Z0TGak1nn8xBCesSOZbfpAXx9WGviw7X4dzdObfIdKk8whpvAUo5d//gc4rpF7HRXpN3S R2pRWz1LCeVwxgA2U/2JLvzXYsF0AfDO1KUUPu/JC8mxhSqW+zMFx5PXHXdRD2zPV4ZancheqJe CaQasNd5xLAA1CJTQqJWzvHtGGLTETTNWHSSMsf442kA6MzaZsVfPAuBhiqenaj2bw== X-Google-Smtp-Source: AGHT+IEOz1Ubp83PBkVr6zsGw/bZ5tgUaInS+DwuN4pjgioNkXnFY5EiCJRHnJDGp3MfNxj+1fJz4w== X-Received: by 2002:a17:902:da88:b0:215:a05d:fb05 with SMTP id d9443c01a7336-21dd7dd8924mr211482985ad.32.1738351781439; Fri, 31 Jan 2025 11:29:41 -0800 (PST) Received: from localhost.localdomain (c-76-146-13-146.hsd1.wa.comcast.net. 
[76.146.13.146]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2f8489d3707sm4072471a91.23.2025.01.31.11.29.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 31 Jan 2025 11:29:41 -0800 (PST)
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org, alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org, edumazet@google.com, xiyou.wangcong@gmail.com, cong.wang@bytedance.com, jhs@mojatatu.com, sinquersw@gmail.com, toke@redhat.com, jiri@resnulli.us, stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br, yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com, ming.lei@redhat.com, kernel-team@meta.com
Subject: [PATCH bpf-next v3 17/18] selftests/bpf: Add a bpf fq qdisc to selftest
Date: Fri, 31 Jan 2025 11:28:56 -0800
Message-ID: <20250131192912.133796-18-ameryhung@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250131192912.133796-1-ameryhung@gmail.com>
References: <20250131192912.133796-1-ameryhung@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Patchwork-Delegate: bpf@iogearbox.net

From: Amery Hung

This test implements a more sophisticated qdisc using bpf. The bpf fair-queueing (fq) qdisc gives each flow an equal chance to transmit data. It also respects the timestamp of skb for rate limiting.
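
For readers new to the scheme, the credit-based round robin behind "an equal chance to transmit" can be modelled in plain userspace C roughly as below. This is only an illustrative sketch, not the BPF program in this patch: struct toy_flow, struct toy_sched, toy_add_flow(), toy_rotate() and toy_pick_flow() are hypothetical names made up for the example, and the model leaves out timestamps, throttling and the separate new/old flow lists.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_FLOWS 8

/* Hypothetical, simplified flow: only a credit and a packet count. */
struct toy_flow {
	int credit;        /* bytes the flow may still send in this round */
	unsigned int qlen; /* packets waiting in the flow */
};

/* Round-robin ring of active flows (illustrative only). */
struct toy_sched {
	struct toy_flow *ring[TOY_MAX_FLOWS];
	size_t head;  /* index of the flow to serve next */
	size_t count; /* number of flows in the ring */
	int quantum;  /* credit added when a flow has run out */
};

static void toy_add_flow(struct toy_sched *s, struct toy_flow *f)
{
	assert(s->count < TOY_MAX_FLOWS);
	s->ring[(s->head + s->count) % TOY_MAX_FLOWS] = f;
	s->count++;
}

/* Move the flow at the head of the ring to its tail. */
static void toy_rotate(struct toy_sched *s)
{
	struct toy_flow *f = s->ring[s->head];

	s->head = (s->head + 1) % TOY_MAX_FLOWS;
	s->ring[(s->head + s->count - 1) % TOY_MAX_FLOWS] = f;
}

/*
 * Pick the flow that may send one packet of pkt_len bytes.
 * A flow without credit is refilled by one quantum and moved to the
 * tail, so the other flows are served before it gets another turn.
 * Returns NULL when no flow has packets queued.
 */
static struct toy_flow *toy_pick_flow(struct toy_sched *s, int pkt_len)
{
	size_t i, backlog = 0;

	assert(s->quantum > 0);
	for (i = 0; i < s->count; i++)
		backlog += s->ring[(s->head + i) % TOY_MAX_FLOWS]->qlen;
	if (!backlog)
		return NULL;

	for (;;) {
		struct toy_flow *f = s->ring[s->head];

		if (f->qlen && f->credit > 0) {
			f->credit -= pkt_len;
			f->qlen--;
			return f;
		}
		/* Out of credit: refill, then let the other flows go first. */
		if (f->credit <= 0)
			f->credit += s->quantum;
		toy_rotate(s);
	}
}
```

With two backlogged flows, a quantum of 100 and 100-byte packets, the model alternates strictly between the two flows until both are drained.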
Signed-off-by: Amery Hung
---
 .../selftests/bpf/prog_tests/bpf_qdisc.c      |  24 +
 .../selftests/bpf/progs/bpf_qdisc_fq.c        | 720 ++++++++++++++++++
 2 files changed, 744 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_qdisc_fq.c

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c b/tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c
index f2efc69af348..7e8e3170e6b6 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c
@@ -4,6 +4,7 @@
 
 #include "network_helpers.h"
 #include "bpf_qdisc_fifo.skel.h"
+#include "bpf_qdisc_fq.skel.h"
 
 #define LO_IFINDEX 1
 
@@ -64,6 +65,27 @@ static void test_fifo(void)
 	bpf_qdisc_fifo__destroy(fifo_skel);
 }
 
+static void test_fq(void)
+{
+	struct bpf_qdisc_fq *fq_skel;
+	struct bpf_link *link;
+
+	fq_skel = bpf_qdisc_fq__open_and_load();
+	if (!ASSERT_OK_PTR(fq_skel, "bpf_qdisc_fq__open_and_load"))
+		return;
+
+	link = bpf_map__attach_struct_ops(fq_skel->maps.fq);
+	if (!ASSERT_OK_PTR(link, "bpf_map__attach_struct_ops")) {
+		bpf_qdisc_fq__destroy(fq_skel);
+		return;
+	}
+
+	do_test("bpf_fq");
+
+	bpf_link__destroy(link);
+	bpf_qdisc_fq__destroy(fq_skel);
+}
+
 void test_bpf_qdisc(void)
 {
 	struct netns_obj *netns;
@@ -74,6 +96,8 @@ void test_bpf_qdisc(void)
 
 	if (test__start_subtest("fifo"))
 		test_fifo();
+	if (test__start_subtest("fq"))
+		test_fq();
 
 	netns_free(netns);
 }
diff --git a/tools/testing/selftests/bpf/progs/bpf_qdisc_fq.c b/tools/testing/selftests/bpf/progs/bpf_qdisc_fq.c
new file mode 100644
index 000000000000..9ac8cdb08e61
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/bpf_qdisc_fq.c
@@ -0,0 +1,720 @@
+#include <vmlinux.h>
+#include <errno.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_experimental.h"
+#include "bpf_qdisc_common.h"
+
+char _license[] SEC("license") = "GPL";
+
+#define NSEC_PER_USEC 1000L
+#define NSEC_PER_SEC 1000000000L
+
+#define NUM_QUEUE (1 << 20)
+
+struct fq_bpf_data {
+	u32 quantum;
+	u32 initial_quantum;
+	u32 flow_refill_delay;
+	u32 flow_plimit;
+	u64 horizon;
+	u32 orphan_mask;
+	u32 timer_slack;
+	u64 time_next_delayed_flow;
+	u64 unthrottle_latency_ns;
+	u8 horizon_drop;
+	u32 new_flow_cnt;
+	u32 old_flow_cnt;
+	u64 ktime_cache;
+};
+
+enum {
+	CLS_RET_PRIO = 0,
+	CLS_RET_NONPRIO = 1,
+	CLS_RET_ERR = 2,
+};
+
+struct skb_node {
+	u64 tstamp;
+	struct sk_buff __kptr * skb;
+	struct bpf_rb_node node;
+};
+
+struct fq_flow_node {
+	int credit;
+	u32 qlen;
+	u64 age;
+	u64 time_next_packet;
+	struct bpf_list_node list_node;
+	struct bpf_rb_node rb_node;
+	struct bpf_rb_root queue __contains(skb_node, node);
+	struct bpf_spin_lock lock;
+	struct bpf_refcount refcount;
+};
+
+struct dequeue_nonprio_ctx {
+	bool stop_iter;
+	u64 expire;
+	u64 now;
+};
+
+struct remove_flows_ctx {
+	bool gc_only;
+	u32 reset_cnt;
+	u32 reset_max;
+};
+
+struct unset_throttled_flows_ctx {
+	bool unset_all;
+	u64 now;
+};
+
+struct fq_stashed_flow {
+	struct fq_flow_node __kptr * flow;
+};
+
+struct {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__type(key, __u64);
+	__type(value, struct fq_stashed_flow);
+	__uint(max_entries, NUM_QUEUE);
+} fq_nonprio_flows SEC(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__type(key, __u64);
+	__type(value, struct fq_stashed_flow);
+	__uint(max_entries, 1);
+} fq_prio_flows SEC(".maps");
+
+#define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8)))
+
+private(A) struct bpf_spin_lock fq_delayed_lock;
+private(A) struct bpf_rb_root fq_delayed __contains(fq_flow_node, rb_node);
+
+private(B) struct bpf_spin_lock fq_new_flows_lock;
+private(B) struct bpf_list_head fq_new_flows __contains(fq_flow_node, list_node);
+
+private(C) struct bpf_spin_lock fq_old_flows_lock;
+private(C) struct bpf_list_head fq_old_flows __contains(fq_flow_node, list_node);
+
+private(D) struct fq_bpf_data q;
+
+/* Wrapper for bpf_kptr_xchg that expects NULL dst */
+static void bpf_kptr_xchg_back(void *map_val, void *ptr)
+{
+	void *ret;
+
+	ret = bpf_kptr_xchg(map_val, ptr);
+	if (ret)
+		bpf_obj_drop(ret);
+}
+
+static bool skbn_tstamp_less(struct bpf_rb_node *a, const struct bpf_rb_node *b)
+{
+	struct skb_node *skbn_a;
+	struct skb_node *skbn_b;
+
+	skbn_a = container_of(a, struct skb_node, node);
+	skbn_b = container_of(b, struct skb_node, node);
+
+	return skbn_a->tstamp < skbn_b->tstamp;
+}
+
+static bool fn_time_next_packet_less(struct bpf_rb_node *a, const struct bpf_rb_node *b)
+{
+	struct fq_flow_node *flow_a;
+	struct fq_flow_node *flow_b;
+
+	flow_a = container_of(a, struct fq_flow_node, rb_node);
+	flow_b = container_of(b, struct fq_flow_node, rb_node);
+
+	return flow_a->time_next_packet < flow_b->time_next_packet;
+}
+
+static void
+fq_flows_add_head(struct bpf_list_head *head, struct bpf_spin_lock *lock,
+		  struct fq_flow_node *flow, u32 *flow_cnt)
+{
+	bpf_spin_lock(lock);
+	bpf_list_push_front(head, &flow->list_node);
+	bpf_spin_unlock(lock);
+	*flow_cnt += 1;
+}
+
+static void
+fq_flows_add_tail(struct bpf_list_head *head, struct bpf_spin_lock *lock,
+		  struct fq_flow_node *flow, u32 *flow_cnt)
+{
+	bpf_spin_lock(lock);
+	bpf_list_push_back(head, &flow->list_node);
+	bpf_spin_unlock(lock);
+	*flow_cnt += 1;
+}
+
+static void
+fq_flows_remove_front(struct bpf_list_head *head, struct bpf_spin_lock *lock,
+		      struct bpf_list_node **node, u32 *flow_cnt)
+{
+	bpf_spin_lock(lock);
+	*node = bpf_list_pop_front(head);
+	bpf_spin_unlock(lock);
+	*flow_cnt -= 1;
+}
+
+static bool
+fq_flows_is_empty(struct bpf_list_head *head, struct bpf_spin_lock *lock)
+{
+	struct bpf_list_node *node;
+
+	bpf_spin_lock(lock);
+	node = bpf_list_pop_front(head);
+	if (node) {
+		bpf_list_push_front(head, node);
+		bpf_spin_unlock(lock);
+		return false;
+	}
+	bpf_spin_unlock(lock);
+
+	return true;
+}
+
+/* flow->age is used to denote the state of the flow (not-detached, detached, throttled)
+ * as well as the timestamp when the flow is detached.
+ *
+ * 0: not-detached
+ * 1 - (~0ULL-1): detached
+ * ~0ULL: throttled
+ */
+static void fq_flow_set_detached(struct fq_flow_node *flow)
+{
+	flow->age = bpf_jiffies64();
+}
+
+static bool fq_flow_is_detached(struct fq_flow_node *flow)
+{
+	return flow->age != 0 && flow->age != ~0ULL;
+}
+
+static bool sk_listener(struct sock *sk)
+{
+	return (1 << sk->__sk_common.skc_state) & (TCPF_LISTEN | TCPF_NEW_SYN_RECV);
+}
+
+static void fq_gc(void);
+
+static int fq_new_flow(void *flow_map, struct fq_stashed_flow **sflow, u64 hash)
+{
+	struct fq_stashed_flow tmp = {};
+	struct fq_flow_node *flow;
+	int ret;
+
+	flow = bpf_obj_new(typeof(*flow));
+	if (!flow)
+		return -ENOMEM;
+
+	flow->credit = q.initial_quantum;
+	flow->qlen = 0;
+	flow->age = 1;
+	flow->time_next_packet = 0;
+
+	ret = bpf_map_update_elem(flow_map, &hash, &tmp, 0);
+	if (ret == -ENOMEM || ret == -E2BIG) {
+		fq_gc();
+		bpf_map_update_elem(&fq_nonprio_flows, &hash, &tmp, 0);
+	}
+
+	*sflow = bpf_map_lookup_elem(flow_map, &hash);
+	if (!*sflow) {
+		bpf_obj_drop(flow);
+		return -ENOMEM;
+	}
+
+	bpf_kptr_xchg_back(&(*sflow)->flow, flow);
+	return 0;
+}
+
+static int
+fq_classify(struct sk_buff *skb, struct fq_stashed_flow **sflow)
+{
+	struct sock *sk = skb->sk;
+	int ret = CLS_RET_NONPRIO;
+	u64 hash = 0;
+
+	if ((skb->priority & TC_PRIO_MAX) == TC_PRIO_CONTROL) {
+		*sflow = bpf_map_lookup_elem(&fq_prio_flows, &hash);
+		ret = CLS_RET_PRIO;
+	} else {
+		if (!sk || sk_listener(sk)) {
+			hash = bpf_skb_get_hash(skb) & q.orphan_mask;
+			/* Avoid collision with an existing flow hash, which
+			 * only uses the lower 32 bits of hash, by setting the
+			 * upper half of hash to 1.
+			 */
+			hash |= (1ULL << 32);
+		} else if (sk->__sk_common.skc_state == TCP_CLOSE) {
+			hash = bpf_skb_get_hash(skb) & q.orphan_mask;
+			hash |= (1ULL << 32);
+		} else {
+			hash = sk->__sk_common.skc_hash;
+		}
+		*sflow = bpf_map_lookup_elem(&fq_nonprio_flows, &hash);
+	}
+
+	if (!*sflow)
+		ret = fq_new_flow(&fq_nonprio_flows, sflow, hash) < 0 ?
+		      CLS_RET_ERR : CLS_RET_NONPRIO;
+
+	return ret;
+}
+
+static bool fq_packet_beyond_horizon(struct sk_buff *skb)
+{
+	return (s64)skb->tstamp > (s64)(q.ktime_cache + q.horizon);
+}
+
+SEC("struct_ops/bpf_fq_enqueue")
+int BPF_PROG(bpf_fq_enqueue, struct sk_buff *skb, struct Qdisc *sch,
+	     struct bpf_sk_buff_ptr *to_free)
+{
+	struct fq_flow_node *flow = NULL, *flow_copy;
+	struct fq_stashed_flow *sflow;
+	u64 time_to_send, jiffies;
+	struct skb_node *skbn;
+	int ret;
+
+	if (sch->q.qlen >= sch->limit)
+		goto drop;
+
+	if (!skb->tstamp) {
+		time_to_send = q.ktime_cache = bpf_ktime_get_ns();
+	} else {
+		if (fq_packet_beyond_horizon(skb)) {
+			q.ktime_cache = bpf_ktime_get_ns();
+			if (fq_packet_beyond_horizon(skb)) {
+				if (q.horizon_drop)
+					goto drop;
+
+				skb->tstamp = q.ktime_cache + q.horizon;
+			}
+		}
+		time_to_send = skb->tstamp;
+	}
+
+	ret = fq_classify(skb, &sflow);
+	if (ret == CLS_RET_ERR)
+		goto drop;
+
+	flow = bpf_kptr_xchg(&sflow->flow, flow);
+	if (!flow)
+		goto drop;
+
+	if (ret == CLS_RET_NONPRIO) {
+		if (flow->qlen >= q.flow_plimit) {
+			bpf_kptr_xchg_back(&sflow->flow, flow);
+			goto drop;
+		}
+
+		if (fq_flow_is_detached(flow)) {
+			flow_copy = bpf_refcount_acquire(flow);
+
+			jiffies = bpf_jiffies64();
+			if ((s64)(jiffies - (flow_copy->age + q.flow_refill_delay)) > 0) {
+				if (flow_copy->credit < q.quantum)
+					flow_copy->credit = q.quantum;
+			}
+			flow_copy->age = 0;
+			fq_flows_add_tail(&fq_new_flows, &fq_new_flows_lock, flow_copy,
+					  &q.new_flow_cnt);
+		}
+	}
+
+	skbn = bpf_obj_new(typeof(*skbn));
+	if (!skbn) {
+		bpf_kptr_xchg_back(&sflow->flow, flow);
+		goto drop;
+	}
+
+	skbn->tstamp = skb->tstamp = time_to_send;
+
+	sch->qstats.backlog += qdisc_pkt_len(skb);
+
+	skb = bpf_kptr_xchg(&skbn->skb, skb);
+	if (skb)
+		bpf_qdisc_skb_drop(skb, to_free);
+
+	bpf_spin_lock(&flow->lock);
+	bpf_rbtree_add(&flow->queue, &skbn->node, skbn_tstamp_less);
+	bpf_spin_unlock(&flow->lock);
+
+	flow->qlen++;
+	bpf_kptr_xchg_back(&sflow->flow, flow);
+
+	sch->q.qlen++;
+	return NET_XMIT_SUCCESS;
+
+drop:
+	bpf_qdisc_skb_drop(skb, to_free);
+	sch->qstats.drops++;
+	return NET_XMIT_DROP;
+}
+
+static int fq_unset_throttled_flows(u32 index, struct unset_throttled_flows_ctx *ctx)
+{
+	struct bpf_rb_node *node = NULL;
+	struct fq_flow_node *flow;
+
+	bpf_spin_lock(&fq_delayed_lock);
+
+	node = bpf_rbtree_first(&fq_delayed);
+	if (!node) {
+		bpf_spin_unlock(&fq_delayed_lock);
+		return 1;
+	}
+
+	flow = container_of(node, struct fq_flow_node, rb_node);
+	if (!ctx->unset_all && flow->time_next_packet > ctx->now) {
+		q.time_next_delayed_flow = flow->time_next_packet;
+		bpf_spin_unlock(&fq_delayed_lock);
+		return 1;
+	}
+
+	node = bpf_rbtree_remove(&fq_delayed, &flow->rb_node);
+
+	bpf_spin_unlock(&fq_delayed_lock);
+
+	if (!node)
+		return 1;
+
+	flow = container_of(node, struct fq_flow_node, rb_node);
+	flow->age = 0;
+	fq_flows_add_tail(&fq_old_flows, &fq_old_flows_lock, flow, &q.old_flow_cnt);
+
+	return 0;
+}
+
+static void fq_flow_set_throttled(struct fq_flow_node *flow)
+{
+	flow->age = ~0ULL;
+
+	if (q.time_next_delayed_flow > flow->time_next_packet)
+		q.time_next_delayed_flow = flow->time_next_packet;
+
+	bpf_spin_lock(&fq_delayed_lock);
+	bpf_rbtree_add(&fq_delayed, &flow->rb_node, fn_time_next_packet_less);
+	bpf_spin_unlock(&fq_delayed_lock);
+}
+
+static void fq_check_throttled(u64 now)
+{
+	struct unset_throttled_flows_ctx ctx = {
+		.unset_all = false,
+		.now = now,
+	};
+	unsigned long sample;
+
+	if (q.time_next_delayed_flow > now)
+		return;
+
+	sample = (unsigned long)(now - q.time_next_delayed_flow);
+	q.unthrottle_latency_ns -= q.unthrottle_latency_ns >> 3;
+	q.unthrottle_latency_ns += sample >> 3;
+
+	q.time_next_delayed_flow = ~0ULL;
+	bpf_loop(NUM_QUEUE, fq_unset_throttled_flows, &ctx, 0);
+}
+
+static struct sk_buff*
+fq_dequeue_nonprio_flows(u32 index, struct dequeue_nonprio_ctx *ctx)
+{
+	u64 time_next_packet, time_to_send;
+	struct bpf_rb_node *rb_node;
+	struct sk_buff *skb = NULL;
+	struct bpf_list_head *head;
+	struct bpf_list_node *node;
+	struct bpf_spin_lock *lock;
+	struct fq_flow_node *flow;
+	struct skb_node *skbn;
+	bool is_empty;
+	u32 *cnt;
+
+	if (q.new_flow_cnt) {
+		head = &fq_new_flows;
+		lock = &fq_new_flows_lock;
+		cnt = &q.new_flow_cnt;
+	} else if (q.old_flow_cnt) {
+		head = &fq_old_flows;
+		lock = &fq_old_flows_lock;
+		cnt = &q.old_flow_cnt;
+	} else {
+		if (q.time_next_delayed_flow != ~0ULL)
+			ctx->expire = q.time_next_delayed_flow;
+		goto break_loop;
+	}
+
+	fq_flows_remove_front(head, lock, &node, cnt);
+	if (!node)
+		goto break_loop;
+
+	flow = container_of(node, struct fq_flow_node, list_node);
+	if (flow->credit <= 0) {
+		flow->credit += q.quantum;
+		fq_flows_add_tail(&fq_old_flows, &fq_old_flows_lock, flow, &q.old_flow_cnt);
+		return NULL;
+	}
+
+	bpf_spin_lock(&flow->lock);
+	rb_node = bpf_rbtree_first(&flow->queue);
+	if (!rb_node) {
+		bpf_spin_unlock(&flow->lock);
+		is_empty = fq_flows_is_empty(&fq_old_flows, &fq_old_flows_lock);
+		if (head == &fq_new_flows && !is_empty) {
+			fq_flows_add_tail(&fq_old_flows, &fq_old_flows_lock, flow, &q.old_flow_cnt);
+		} else {
+			fq_flow_set_detached(flow);
+			bpf_obj_drop(flow);
+		}
+		return NULL;
+	}
+
+	skbn = container_of(rb_node, struct skb_node, node);
+	time_to_send = skbn->tstamp;
+
+	time_next_packet = (time_to_send > flow->time_next_packet) ?
+		time_to_send : flow->time_next_packet;
+	if (ctx->now < time_next_packet) {
+		bpf_spin_unlock(&flow->lock);
+		flow->time_next_packet = time_next_packet;
+		fq_flow_set_throttled(flow);
+		return NULL;
+	}
+
+	rb_node = bpf_rbtree_remove(&flow->queue, rb_node);
+	bpf_spin_unlock(&flow->lock);
+
+	if (!rb_node)
+		goto add_flow_and_break;
+
+	skbn = container_of(rb_node, struct skb_node, node);
+	skb = bpf_kptr_xchg(&skbn->skb, skb);
+	bpf_obj_drop(skbn);
+
+	if (!skb)
+		goto add_flow_and_break;
+
+	flow->credit -= qdisc_skb_cb(skb)->pkt_len;
+	flow->qlen--;
+
+add_flow_and_break:
+	fq_flows_add_head(head, lock, flow, cnt);
+
+break_loop:
+	ctx->stop_iter = true;
+	return skb;
+}
+
+static struct sk_buff *fq_dequeue_prio(void)
+{
+	struct fq_flow_node *flow = NULL;
+	struct fq_stashed_flow *sflow;
+	struct bpf_rb_node *rb_node;
+	struct sk_buff *skb = NULL;
+	struct skb_node *skbn;
+	u64 hash = 0;
+
+	sflow = bpf_map_lookup_elem(&fq_prio_flows, &hash);
+	if (!sflow)
+		return NULL;
+
+	flow = bpf_kptr_xchg(&sflow->flow, flow);
+	if (!flow)
+		return NULL;
+
+	bpf_spin_lock(&flow->lock);
+	rb_node = bpf_rbtree_first(&flow->queue);
+	if (!rb_node) {
+		bpf_spin_unlock(&flow->lock);
+		goto out;
+	}
+
+	skbn = container_of(rb_node, struct skb_node, node);
+	rb_node = bpf_rbtree_remove(&flow->queue, &skbn->node);
+	bpf_spin_unlock(&flow->lock);
+
+	if (!rb_node)
+		goto out;
+
+	skbn = container_of(rb_node, struct skb_node, node);
+	skb = bpf_kptr_xchg(&skbn->skb, skb);
+	bpf_obj_drop(skbn);
+
+out:
+	bpf_kptr_xchg_back(&sflow->flow, flow);
+
+	return skb;
+}
+
+SEC("struct_ops/bpf_fq_dequeue")
+struct sk_buff *BPF_PROG(bpf_fq_dequeue, struct Qdisc *sch)
+{
+	struct dequeue_nonprio_ctx cb_ctx = {};
+	struct sk_buff *skb = NULL;
+	int i;
+
+	if (!sch->q.qlen)
+		goto out;
+
+	skb = fq_dequeue_prio();
+	if (skb)
+		goto dequeue;
+
+	q.ktime_cache = cb_ctx.now = bpf_ktime_get_ns();
+	fq_check_throttled(q.ktime_cache);
+	bpf_for(i, 0, sch->limit) {
+		skb = fq_dequeue_nonprio_flows(i, &cb_ctx);
+		if (cb_ctx.stop_iter)
+			break;
+	}
+
+	if (skb) {
+dequeue:
+		sch->q.qlen--;
+		sch->qstats.backlog -= qdisc_pkt_len(skb);
+		bpf_qdisc_bstats_update(sch, skb);
+		return skb;
+	}
+
+	if (cb_ctx.expire)
+		bpf_qdisc_watchdog_schedule(sch, cb_ctx.expire, q.timer_slack);
+out:
+	return NULL;
+}
+
+static int fq_remove_flows_in_list(u32 index, void *ctx)
+{
+	struct bpf_list_node *node;
+	struct fq_flow_node *flow;
+
+	bpf_spin_lock(&fq_new_flows_lock);
+	node = bpf_list_pop_front(&fq_new_flows);
+	bpf_spin_unlock(&fq_new_flows_lock);
+	if (!node) {
+		bpf_spin_lock(&fq_old_flows_lock);
+		node = bpf_list_pop_front(&fq_old_flows);
+		bpf_spin_unlock(&fq_old_flows_lock);
+		if (!node)
+			return 1;
+	}
+
+	flow = container_of(node, struct fq_flow_node, list_node);
+	bpf_obj_drop(flow);
+
+	return 0;
+}
+
+extern unsigned CONFIG_HZ __kconfig;
+
+/* limit number of collected flows per round */
+#define FQ_GC_MAX 8
+#define FQ_GC_AGE (3*CONFIG_HZ)
+
+static bool fq_gc_candidate(struct fq_flow_node *flow)
+{
+	u64 jiffies = bpf_jiffies64();
+
+	return fq_flow_is_detached(flow) &&
+	       ((s64)(jiffies - (flow->age + FQ_GC_AGE)) > 0);
+}
+
+static int
+fq_remove_flows(struct bpf_map *flow_map, u64 *hash,
+		struct fq_stashed_flow *sflow, struct remove_flows_ctx *ctx)
+{
+	if (sflow->flow &&
+	    (!ctx->gc_only || fq_gc_candidate(sflow->flow))) {
+		bpf_map_delete_elem(flow_map, hash);
+		ctx->reset_cnt++;
+	}
+
+	return ctx->reset_cnt < ctx->reset_max ? 0 : 1;
+}
+
+static void fq_gc(void)
+{
+	struct remove_flows_ctx cb_ctx = {
+		.gc_only = true,
+		.reset_cnt = 0,
+		.reset_max = FQ_GC_MAX,
+	};
+
+	bpf_for_each_map_elem(&fq_nonprio_flows, fq_remove_flows, &cb_ctx, 0);
+}
+
+SEC("struct_ops/bpf_fq_reset")
+void BPF_PROG(bpf_fq_reset, struct Qdisc *sch)
+{
+	struct unset_throttled_flows_ctx utf_ctx = {
+		.unset_all = true,
+	};
+	struct remove_flows_ctx rf_ctx = {
+		.gc_only = false,
+		.reset_cnt = 0,
+		.reset_max = NUM_QUEUE,
+	};
+	struct fq_stashed_flow *sflow;
+	u64 hash = 0;
+
+	sch->q.qlen = 0;
+	sch->qstats.backlog = 0;
+
+	bpf_for_each_map_elem(&fq_nonprio_flows, fq_remove_flows, &rf_ctx, 0);
+
+	rf_ctx.reset_cnt = 0;
+	bpf_for_each_map_elem(&fq_prio_flows, fq_remove_flows, &rf_ctx, 0);
+	fq_new_flow(&fq_prio_flows, &sflow, hash);
+
+	bpf_loop(NUM_QUEUE, fq_remove_flows_in_list, NULL, 0);
+	q.new_flow_cnt = 0;
+	q.old_flow_cnt = 0;
+
+	bpf_loop(NUM_QUEUE, fq_unset_throttled_flows, &utf_ctx, 0);
+
+	return;
+}
+
+SEC("struct_ops/bpf_fq_init")
+int BPF_PROG(bpf_fq_init, struct Qdisc *sch, struct nlattr *opt,
+	     struct netlink_ext_ack *extack)
+{
+	struct net_device *dev = sch->dev_queue->dev;
+	u32 psched_mtu = dev->mtu + dev->hard_header_len;
+	struct fq_stashed_flow *sflow;
+	u64 hash = 0;
+
+	if (fq_new_flow(&fq_prio_flows, &sflow, hash) < 0)
+		return -ENOMEM;
+
+	sch->limit = 10000;
+	q.initial_quantum = 10 * psched_mtu;
+	q.quantum = 2 * psched_mtu;
+	q.flow_refill_delay = 40;
+	q.flow_plimit = 100;
+	q.horizon = 10ULL * NSEC_PER_SEC;
+	q.horizon_drop = 1;
+	q.orphan_mask = 1024 - 1;
+	q.timer_slack = 10 * NSEC_PER_USEC;
+	q.time_next_delayed_flow = ~0ULL;
+	q.unthrottle_latency_ns = 0ULL;
+	q.new_flow_cnt = 0;
+	q.old_flow_cnt = 0;
+
+	return 0;
+}
+
+SEC(".struct_ops")
+struct Qdisc_ops fq = {
+	.enqueue = (void *)bpf_fq_enqueue,
+	.dequeue = (void *)bpf_fq_dequeue,
+	.reset = (void *)bpf_fq_reset,
+	.init = (void *)bpf_fq_init,
+	.id = "bpf_fq",
+};
From patchwork Fri Jan 31 19:28:57 2025 Content-Type:
text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Amery Hung X-Patchwork-Id: 13955710 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mail-pj1-f46.google.com (mail-pj1-f46.google.com [209.85.216.46]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6083D1F540E; Fri, 31 Jan 2025 19:29:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.46 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738351785; cv=none; b=OWJuz1yjoMNalJ/QBoX6MbufxOvHA8+qTfluc9qoV7lO086SeMAEZLZFnCfs+uApTWn2aAV703DZIVuyTv+AWikZaI6analLPt4GOfEIEgqQzbUWRzyD1FG8xo2TUqC/+A6Wpz1GF2zIrcvUhRQMpRroX3fK0FXifXNCft7zHRs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738351785; c=relaxed/simple; bh=mlGlyxbLFEcczK4pQFhxrC/qDivTpB0ocgBa+jCOWF8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=GXCt/NDGH9bYDdLaawzTx/JhN3zOi24kKZmI+NqBOOiX9MqsOU90RhoRt7X4mPISPphGrtfUoNizvnNVMap/jOSPEYCNt4cjvbwPKl58E5uv0BTvW0el0M50nGBSqbhmkZvu77LGeYDsmezLDFD5cy2DgG+O5aQ2U5YwHzug4OY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=J1foTLrv; arc=none smtp.client-ip=209.85.216.46 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="J1foTLrv" Received: by mail-pj1-f46.google.com with SMTP id 98e67ed59e1d1-2ee67e9287fso4220383a91.0; Fri, 31 Jan 2025 11:29:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=gmail.com; s=20230601; t=1738351782; x=1738956582; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=HZO+JZUveFKqFj0FRWk6w5VCR/pGW5PYXof0XoaAvX4=; b=J1foTLrv5PiDCz0XuwNcILhjiEC86Q13xD1UNHrE8rbOc9fDSaApDR3Tciv5zUXKgQ 2/KqPFlASBQpxQg0uxTtS42QKjDjx9sGPLHhSEzrUpsgETCVEP4vrhSObnbZ/RrS3NWZ jlRpPjRgBpFDDGxV7PF5ZbRtqwWcZnX6TgXDS7/gJaGWQ5ZLBbJH/KSixiom1veSNEZ3 wIWT2sJ9AekqbyU4butpPEU6nlxPfrIM1eFpqCbImDbZ88QmhWlD9VgZavoZUHfnvfuV n7Hhbqmt+lOoDPLiGcnD7ur45yIRsqNYNoHUviLc/Hs+lNXBUTpeYrCEjxWNvy2ogH6m Wa7Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1738351782; x=1738956582; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=HZO+JZUveFKqFj0FRWk6w5VCR/pGW5PYXof0XoaAvX4=; b=WexDRgMXYp+FqUILP33quGHKdrzuMzl+jkHdrXO4ChYaK8GWccM6Pp1XTOpmoVQE31 KWRoFx9VWi7bWGdghPklN9qHEX+CydloWBTMswMXqLcDcuC4CYpK/5JRx7sNkptl2ciq kAoicBrSO4egVYjBC3XpL+uWsmyx712eobA+0i+2L7ItEh20VystlJkCYI89OEkLkPDW 0QisBofNjUrxGPcWpRXUd6gbLuj0e8T82msBgHAIszeASY7Kp63PkTstevbWZd+Eh24g 7+LHz2xZ6jaxzZrTWSPB49cqoc8TLRNfNmrtdOM+hTdMewhnhKX5Blbt1w0aMTskZJM+ sq9g== X-Gm-Message-State: AOJu0YzvISWr0NI07H2Z+kV8FXJCw5A0dIVySfaJvkNyiZ8dD0Ef6MZC l1LcsHvOf+wCut3E1YcJQZDez9hrl3d7c0L2geJThk7fkEuZ4igz+B1bXnNQsbI= X-Gm-Gg: ASbGncuF2C+bsteTD3tfT+IIuQtRYSlmg0QTcIASZompHlOvbhhzqDQYpc+II56mmW0 AYmqaO4+ZJttV3nwDOnERz2uhBlUE79zhHhahlFNBpt8jfzUA7UY0aHH9qSPogTMfOzE3eqKF75 f2Npap9mvwmJoS+Eizk6naqPmN1CuixAPNiNyg7Cy7PQPOjnX2I1cO4W4X/J1xuSwGpQjZu3GcP SGeb+mijVXYburNqoR9D6WRFTjDJhrrfI9kLnrqa0qbGgqElUEhObqYmB2iV27yPFY8iu3aJgmJ 0st22uS2K4aKv/w75hnhgdWWKz3wrIDo0iMfoM2pf8rF2qnhAtqJ31bSxM11odQt7Q== X-Google-Smtp-Source: AGHT+IHzBRo8IJ8uBu2ZAke1h9oI55cpoyun0ISD0Ka/fo3p3Z0VUKxja2HfVfk/GULzuiO+g6OyWg== X-Received: by 2002:a17:90b:3a0e:b0:2f2:8bdd:cd8b with SMTP id 
98e67ed59e1d1-2f83ac7f028mr18265321a91.29.1738351782397; Fri, 31 Jan 2025 11:29:42 -0800 (PST)
Received: from localhost.localdomain (c-76-146-13-146.hsd1.wa.comcast.net. [76.146.13.146]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2f8489d3707sm4072471a91.23.2025.01.31.11.29.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 31 Jan 2025 11:29:42 -0800 (PST)
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org, alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org, edumazet@google.com, xiyou.wangcong@gmail.com, cong.wang@bytedance.com, jhs@mojatatu.com, sinquersw@gmail.com, toke@redhat.com, jiri@resnulli.us, stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br, yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com, ming.lei@redhat.com, kernel-team@meta.com
Subject: [PATCH bpf-next v3 18/18] selftests/bpf: Test attaching bpf qdisc to mq and non root
Date: Fri, 31 Jan 2025 11:28:57 -0800
Message-ID: <20250131192912.133796-19-ameryhung@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250131192912.133796-1-ameryhung@gmail.com>
References: <20250131192912.133796-1-ameryhung@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Patchwork-Delegate: bpf@iogearbox.net

Until we are certain that existing classful qdiscs work with bpf qdisc, make sure that a bpf qdisc cannot be attached to a non-root parent. Meanwhile, attaching to mq is allowed.
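
As a reading aid for the .parent = 0x00010001 and .handle = 0x8000000 values used by the tests in this patch: a traffic-control handle packs a 16-bit major and a 16-bit minor number into 32 bits, so 0x00010001 is class 1:1 under the qdisc with handle 1:. The userspace sketch below only illustrates that encoding. The toy_* helpers are hypothetical names made up for this example; the uapi header linux/pkt_sched.h provides the real TC_H_MAJ, TC_H_MIN, TC_H_MAKE and TC_H_ROOT macros.

```c
#include <stdint.h>

/* Same value that linux/pkt_sched.h defines as TC_H_ROOT. */
#define TOY_TC_H_ROOT 0xFFFFFFFFU

/* Build a handle from its major and minor parts, e.g. 1:1. */
static uint32_t toy_tc_handle(uint16_t major, uint16_t minor)
{
	return ((uint32_t)major << 16) | minor;
}

/* Upper 16 bits: the qdisc the handle belongs to. */
static uint16_t toy_tc_major(uint32_t handle)
{
	return (uint16_t)(handle >> 16);
}

/* Lower 16 bits: the class within that qdisc (0 for the qdisc itself). */
static uint16_t toy_tc_minor(uint32_t handle)
{
	return (uint16_t)(handle & 0xFFFF);
}

/* A parent equal to TC_H_ROOT means "attach at the root of the device". */
static int toy_tc_is_root(uint32_t parent)
{
	return parent == TOY_TC_H_ROOT;
}
```

Under this encoding the parent used by both new subtests is a class of another qdisc rather than the root, which is exactly the case the commit message describes.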
Signed-off-by: Amery Hung
---
 tools/testing/selftests/bpf/config            |   1 +
 .../selftests/bpf/prog_tests/bpf_qdisc.c      | 111 +++++++++++++++++-
 2 files changed, 110 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config
index 6b0cab55bd2d..3201a962b3dc 100644
--- a/tools/testing/selftests/bpf/config
+++ b/tools/testing/selftests/bpf/config
@@ -74,6 +74,7 @@ CONFIG_NET_MPLS_GSO=y
 CONFIG_NET_SCH_BPF=y
 CONFIG_NET_SCH_FQ=y
 CONFIG_NET_SCH_INGRESS=y
+CONFIG_NET_SCH_HTB=y
 CONFIG_NET_SCHED=y
 CONFIG_NETDEVSIM=y
 CONFIG_NETFILTER=y
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c b/tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c
index 7e8e3170e6b6..f3158170edff 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c
@@ -86,18 +86,125 @@ static void test_fq(void)
 	bpf_qdisc_fq__destroy(fq_skel);
 }
 
+static int netdevsim_write_cmd(const char *path, const char *cmd)
+{
+	FILE *fp;
+
+	fp = fopen(path, "w");
+	if (!ASSERT_OK_PTR(fp, "write_netdevsim_cmd"))
+		return -errno;
+
+	fprintf(fp, "%s", cmd);
+	fclose(fp);
+	return 0;
+}
+
+static void test_qdisc_attach_to_mq(void)
+{
+	DECLARE_LIBBPF_OPTS(bpf_tc_hook, hook,
+			    .attach_point = BPF_TC_QDISC,
+			    .parent = 0x00010001,
+			    .handle = 0x8000000,
+			    .qdisc = "bpf_fifo");
+	struct bpf_qdisc_fifo *fifo_skel;
+	struct bpf_link *link;
+	int err;
+
+	hook.ifindex = if_nametoindex("eni1np1");
+	if (!ASSERT_NEQ(hook.ifindex, 0, "if_nametoindex"))
+		return;
+
+	fifo_skel = bpf_qdisc_fifo__open_and_load();
+	if (!ASSERT_OK_PTR(fifo_skel, "bpf_qdisc_fifo__open_and_load"))
+		return;
+
+	link = bpf_map__attach_struct_ops(fifo_skel->maps.fifo);
+	if (!ASSERT_OK_PTR(link, "bpf_map__attach_struct_ops")) {
+		bpf_qdisc_fifo__destroy(fifo_skel);
+		return;
+	}
+
+	ASSERT_OK(system("tc qdisc add dev eni1np1 root handle 1: mq"), "create mq");
+
+	err = bpf_tc_hook_create(&hook);
+	ASSERT_OK(err, "attach qdisc");
+
+	bpf_tc_hook_destroy(&hook);
+
+	ASSERT_OK(system("tc qdisc delete dev eni1np1 root mq"), "delete mq");
+
+	bpf_link__destroy(link);
+	bpf_qdisc_fifo__destroy(fifo_skel);
+}
+
+static void test_qdisc_attach_to_non_root(void)
+{
+	DECLARE_LIBBPF_OPTS(bpf_tc_hook, hook, .ifindex = LO_IFINDEX,
+			    .attach_point = BPF_TC_QDISC,
+			    .parent = 0x00010001,
+			    .handle = 0x8000000,
+			    .qdisc = "bpf_fifo");
+	struct bpf_qdisc_fifo *fifo_skel;
+	struct bpf_link *link;
+	int err;
+
+	fifo_skel = bpf_qdisc_fifo__open_and_load();
+	if (!ASSERT_OK_PTR(fifo_skel, "bpf_qdisc_fifo__open_and_load"))
+		return;
+
+	link = bpf_map__attach_struct_ops(fifo_skel->maps.fifo);
+	if (!ASSERT_OK_PTR(link, "bpf_map__attach_struct_ops")) {
+		bpf_qdisc_fifo__destroy(fifo_skel);
+		return;
+	}
+
+	ASSERT_OK(system("tc qdisc add dev lo root handle 1: htb"), "create htb");
+	ASSERT_OK(system("tc class add dev lo parent 1: classid 1:1 htb rate 75Kbit"), "create htb class");
+
+	err = bpf_tc_hook_create(&hook);
+	ASSERT_ERR(err, "attach qdisc");
+
+	bpf_tc_hook_destroy(&hook);
+
+	ASSERT_OK(system("tc qdisc delete dev lo root htb"), "delete htb");
+
+	bpf_link__destroy(link);
+	bpf_qdisc_fifo__destroy(fifo_skel);
+}
+
 void test_bpf_qdisc(void)
 {
+	struct nstoken *nstoken = NULL;
 	struct netns_obj *netns;
+	int err;
 
-	netns = netns_new("bpf_qdisc_ns", true);
+	netns = netns_new("bpf_qdisc_ns", false);
 	if (!ASSERT_OK_PTR(netns, "netns_new"))
 		return;
 
+	err = netdevsim_write_cmd("/sys/bus/netdevsim/new_device", "1 1 4");
+	if (!ASSERT_OK(err, "create netdevsim")) {
+		netns_free(netns);
+		return;
+	}
+
+	ASSERT_OK(system("ip link set eni1np1 netns bpf_qdisc_ns"), "ip link set netdevsim");
+
+	nstoken = open_netns("bpf_qdisc_ns");
+	if (!ASSERT_OK_PTR(nstoken, "open_netns"))
+		goto out;
+
 	if (test__start_subtest("fifo"))
 		test_fifo();
 	if (test__start_subtest("fq"))
 		test_fq();
-
+	if (test__start_subtest("attach to mq"))
+		test_qdisc_attach_to_mq();
+	if (test__start_subtest("attach to non root"))
+		test_qdisc_attach_to_non_root();
+
+out:
+	err = netdevsim_write_cmd("/sys/bus/netdevsim/del_device", "1");
+	ASSERT_OK(err, "delete netdevsim");
 	netns_free(netns);
 }