From patchwork Wed Mar 19 21:53:48 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 14023198
X-Patchwork-Delegate: bpf@iogearbox.net
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org,
 alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org,
 edumazet@google.com, xiyou.wangcong@gmail.com, jhs@mojatatu.com,
 sinquersw@gmail.com, toke@redhat.com, juntong.deng@outlook.com,
 jiri@resnulli.us, stfomichev@gmail.com,
 ekarani.silvestre@ccc.ufcg.edu.br, yangpeihao@sjtu.edu.cn,
 yepeilin.cs@gmail.com, ameryhung@gmail.com, kernel-team@meta.com
Subject: [PATCH bpf-next v6 01/11] bpf: Add struct_ops context information to struct bpf_prog_aux
Date: Wed, 19 Mar 2025 14:53:48 -0700
Message-ID: <20250319215358.2287371-2-ameryhung@gmail.com>
In-Reply-To: <20250319215358.2287371-1-ameryhung@gmail.com>
References: <20250319215358.2287371-1-ameryhung@gmail.com>

From: Juntong Deng

Add struct_ops context information to struct bpf_prog_aux so that the
kfunc filter can use it.

The added context information currently consists of the struct_ops
member offset and a pointer to struct bpf_struct_ops.

Signed-off-by: Juntong Deng
Acked-by: Alexei Starovoitov
---
 include/linux/bpf.h   | 2 ++
 kernel/bpf/verifier.c | 8 ++++++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 973a88d9b52b..111bea4e507f 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1521,6 +1521,7 @@ struct bpf_prog_aux {
 	u32 real_func_cnt; /* includes hidden progs, only used for JIT and freeing progs */
 	u32 func_idx; /* 0 for non-func prog, the index in func array for func prog */
 	u32 attach_btf_id; /* in-kernel BTF type id to attach to */
+	u32 attach_st_ops_member_off;
 	u32 ctx_arg_info_size;
 	u32 max_rdonly_access;
 	u32 max_rdwr_access;
@@ -1566,6 +1567,7 @@ struct bpf_prog_aux {
 #endif
 	struct bpf_ksym ksym;
 	const struct bpf_prog_ops *ops;
+	const struct bpf_struct_ops *st_ops;
 	struct bpf_map **used_maps;
 	struct mutex used_maps_mutex; /* mutex for used_maps and used_map_cnt */
 	struct btf_mod_pair *used_btfs;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9f8cbd5c61bc..41fd93db8258 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -22736,7 +22736,7 @@ static int check_struct_ops_btf_id(struct bpf_verifier_env *env)
 	const struct btf_member *member;
 	struct bpf_prog *prog = env->prog;
 	bool has_refcounted_arg = false;
-	u32 btf_id, member_idx;
+	u32 btf_id, member_idx, member_off;
 	struct btf *btf;
 	const char *mname;
 	int i, err;
@@ -22787,7 +22787,8 @@ static int check_struct_ops_btf_id(struct bpf_verifier_env *env)
 		return -EINVAL;
 	}
 
-	err = bpf_struct_ops_supported(st_ops, __btf_member_bit_offset(t, member) / 8);
+	member_off = __btf_member_bit_offset(t, member) / 8;
+	err = bpf_struct_ops_supported(st_ops, member_off);
 	if (err) {
 		verbose(env, "attach to unsupported member %s of struct %s\n",
 			mname, st_ops->name);
@@ -22826,6 +22827,9 @@ static int check_struct_ops_btf_id(struct bpf_verifier_env *env)
 		}
 	}
 
+	prog->aux->st_ops = st_ops;
+	prog->aux->attach_st_ops_member_off = member_off;
+
 	prog->aux->attach_func_proto = func_proto;
 	prog->aux->attach_func_name = mname;
 	env->ops = st_ops->verifier_ops;
From patchwork Wed Mar 19 21:53:49 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 14023199
X-Patchwork-Delegate: bpf@iogearbox.net
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org,
 alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org,
 edumazet@google.com, xiyou.wangcong@gmail.com, jhs@mojatatu.com,
 sinquersw@gmail.com, toke@redhat.com, juntong.deng@outlook.com,
 jiri@resnulli.us, stfomichev@gmail.com,
 ekarani.silvestre@ccc.ufcg.edu.br, yangpeihao@sjtu.edu.cn,
 yepeilin.cs@gmail.com, ameryhung@gmail.com, kernel-team@meta.com
Subject: [PATCH bpf-next v6 02/11] bpf: Prepare to reuse get_ctx_arg_idx
Date: Wed, 19 Mar 2025 14:53:49 -0700
Message-ID: <20250319215358.2287371-3-ameryhung@gmail.com>
In-Reply-To: <20250319215358.2287371-1-ameryhung@gmail.com>
References: <20250319215358.2287371-1-ameryhung@gmail.com>

Rename get_ctx_arg_idx to btf_ctx_arg_idx, drop its static qualifier and
declare it in include/linux/btf.h so that other code can call it. No
functional change.

Signed-off-by: Amery Hung
Acked-by: Toke Høiland-Jørgensen
---
 include/linux/btf.h | 1 +
 kernel/bpf/btf.c    | 6 +++---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/linux/btf.h b/include/linux/btf.h
index ebc0c0c9b944..b2983706292f 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -522,6 +522,7 @@ bool btf_param_match_suffix(const struct btf *btf,
 			    const char *suffix);
 int btf_ctx_arg_offset(const struct btf *btf, const struct btf_type *func_proto,
 		       u32 arg_no);
+u32 btf_ctx_arg_idx(struct btf *btf, const struct btf_type *func_proto, int off);
 
 struct bpf_verifier_log;
 
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 519e3f5e9c10..9a4920828c30 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -6369,8 +6369,8 @@ static bool is_int_ptr(struct btf *btf, const struct btf_type *t)
 	return btf_type_is_int(t);
 }
 
-static u32 get_ctx_arg_idx(struct btf *btf, const struct btf_type *func_proto,
-			   int off)
+u32 btf_ctx_arg_idx(struct btf *btf, const struct btf_type *func_proto,
+		    int off)
 {
 	const struct btf_param *args;
 	const struct btf_type *t;
@@ -6649,7 +6649,7 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
 			tname, off);
 		return false;
 	}
-	arg = get_ctx_arg_idx(btf, t, off);
+	arg = btf_ctx_arg_idx(btf, t, off);
 	args = (const struct btf_param *)(t + 1);
 	/* if (t == NULL) Fall back to default BPF prog with
 	 * MAX_BPF_FUNC_REG_ARGS u64 arguments.
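
For readers following the rename, here is a small userspace sketch of the kind of mapping a helper like btf_ctx_arg_idx() performs: translating a byte offset in the program context into an argument index. The patch shows only the signature and two local declarations, so the slot layout below (a pointer takes 8 bytes, any other argument takes its size rounded up to a multiple of 8) is an assumption. struct model_arg, round_up8(), model_ctx_arg_idx() and model_enqueue_arg_idx() are hypothetical names for illustration, not kernel API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one function argument's type information. */
struct model_arg {
	int is_ptr;      /* non-zero if the argument is a pointer */
	uint32_t size;   /* size in bytes, used only for non-pointers */
};

/* Round n up to the next multiple of 8 (one u64 context slot). */
static uint32_t round_up8(uint32_t n)
{
	return (n + 7u) & ~7u;
}

/*
 * Map byte offset `off` in the context to an argument index.
 * Assumed layout: arguments are stored back to back, a pointer takes
 * 8 bytes and anything else takes its size rounded up to 8 bytes.
 * Without type information (args == NULL), every slot is one u64.
 * Offsets past the last argument yield nr_args.
 */
static uint32_t model_ctx_arg_idx(const struct model_arg *args,
				  uint32_t nr_args, uint32_t off)
{
	uint32_t end = 0;
	uint32_t i;

	if (!args)
		return off / 8;

	for (i = 0; i < nr_args; i++) {
		end += args[i].is_ptr ? 8 : round_up8(args[i].size);
		if (off < end)
			return i;
	}

	return nr_args;
}

/* Convenience wrapper: enqueue(skb, sch, to_free) takes three pointers. */
static uint32_t model_enqueue_arg_idx(uint32_t off)
{
	static const struct model_arg enqueue_args[] = {
		{ 1, 0 }, { 1, 0 }, { 1, 0 },
	};

	return model_ctx_arg_idx(enqueue_args, 3, off);
}
```

Under these assumptions, offset 16 in the context of Qdisc_ops.enqueue maps to index 2, which is the to_free argument that bpf_qdisc_is_valid_access() in patch 03 tests for with `arg == 2`.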
From patchwork Wed Mar 19 21:53:50 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 14023200
X-Patchwork-Delegate: bpf@iogearbox.net
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org,
 alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org,
 edumazet@google.com, xiyou.wangcong@gmail.com, jhs@mojatatu.com,
 sinquersw@gmail.com, toke@redhat.com, juntong.deng@outlook.com,
 jiri@resnulli.us, stfomichev@gmail.com,
 ekarani.silvestre@ccc.ufcg.edu.br, yangpeihao@sjtu.edu.cn,
 yepeilin.cs@gmail.com, ameryhung@gmail.com, kernel-team@meta.com
Subject: [PATCH bpf-next v6 03/11] bpf: net_sched: Support implementation of Qdisc_ops in bpf
Date: Wed, 19 Mar 2025 14:53:50 -0700
Message-ID: <20250319215358.2287371-4-ameryhung@gmail.com>
In-Reply-To: <20250319215358.2287371-1-ameryhung@gmail.com>
References: <20250319215358.2287371-1-ameryhung@gmail.com>

From: Amery Hung

Recent advances in bpf, such as allocated objects, bpf list and bpf
rbtree, provide powerful and flexible building blocks for realizing
sophisticated packet scheduling algorithms. As struct_ops now supports
the core operators in Qdisc_ops, allow a qdisc to be implemented using
bpf struct_ops. Users can implement Qdisc_ops.{enqueue, dequeue, init,
reset, destroy} in bpf and register the qdisc dynamically with the
kernel.

Co-developed-by: Cong Wang
Signed-off-by: Cong Wang
Signed-off-by: Amery Hung
Acked-by: Cong Wang
Acked-by: Toke Høiland-Jørgensen
---
 net/sched/Kconfig       |  12 +++
 net/sched/Makefile      |   1 +
 net/sched/bpf_qdisc.c   | 232 ++++++++++++++++++++++++++++++++++++++++
 net/sched/sch_api.c     |   7 +-
 net/sched/sch_generic.c |   3 +-
 5 files changed, 251 insertions(+), 4 deletions(-)
 create mode 100644 net/sched/bpf_qdisc.c

diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index 8180d0c12fce..ccd0255da5a5 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -403,6 +403,18 @@ config NET_SCH_ETS
 
 	  If unsure, say N.
 
+config NET_SCH_BPF
+	bool "BPF-based Qdisc"
+	depends on BPF_SYSCALL && BPF_JIT && DEBUG_INFO_BTF
+	help
+	  This option allows BPF-based queueing disciplines. With BPF struct_ops,
+	  users can implement supported operators in Qdisc_ops using BPF programs.
+	  The queue holding skb can be built with BPF maps or graphs.
+
+	  Say Y here if you want to use BPF-based Qdisc.
+
+	  If unsure, say N.
+
 menuconfig NET_SCH_DEFAULT
 	bool "Allow override default queue discipline"
 	help
diff --git a/net/sched/Makefile b/net/sched/Makefile
index 82c3f78ca486..904d784902d1 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -62,6 +62,7 @@ obj-$(CONFIG_NET_SCH_FQ_PIE)	+= sch_fq_pie.o
 obj-$(CONFIG_NET_SCH_CBS)	+= sch_cbs.o
 obj-$(CONFIG_NET_SCH_ETF)	+= sch_etf.o
 obj-$(CONFIG_NET_SCH_TAPRIO)	+= sch_taprio.o
+obj-$(CONFIG_NET_SCH_BPF)	+= bpf_qdisc.o
 
 obj-$(CONFIG_NET_CLS_U32)	+= cls_u32.o
 obj-$(CONFIG_NET_CLS_ROUTE4)	+= cls_route.o
diff --git a/net/sched/bpf_qdisc.c b/net/sched/bpf_qdisc.c
new file mode 100644
index 000000000000..7eca556a3782
--- /dev/null
+++ b/net/sched/bpf_qdisc.c
@@ -0,0 +1,232 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+static struct bpf_struct_ops bpf_Qdisc_ops;
+
+struct bpf_sk_buff_ptr {
+	struct sk_buff *skb;
+};
+
+static int bpf_qdisc_init(struct btf *btf)
+{
+	return 0;
+}
+
+BTF_ID_LIST_SINGLE(bpf_qdisc_ids, struct, Qdisc)
+BTF_ID_LIST_SINGLE(bpf_sk_buff_ids, struct, sk_buff)
+BTF_ID_LIST_SINGLE(bpf_sk_buff_ptr_ids, struct, bpf_sk_buff_ptr)
+
+static bool bpf_qdisc_is_valid_access(int off, int size,
+				      enum bpf_access_type type,
+				      const struct bpf_prog *prog,
+				      struct bpf_insn_access_aux *info)
+{
+	struct btf *btf = prog->aux->attach_btf;
+	u32 arg;
+
+	arg = btf_ctx_arg_idx(btf, prog->aux->attach_func_proto, off);
+	if (prog->aux->attach_st_ops_member_off == offsetof(struct Qdisc_ops, enqueue)) {
+		if (arg == 2 && type == BPF_READ) {
+			info->reg_type = PTR_TO_BTF_ID | PTR_TRUSTED;
+			info->btf = btf;
+			info->btf_id = bpf_sk_buff_ptr_ids[0];
+			return true;
+		}
+	}
+
+	return bpf_tracing_btf_ctx_access(off, size, type, prog, info);
+}
+
+static int bpf_qdisc_qdisc_access(struct bpf_verifier_log *log,
+				  const struct bpf_reg_state *reg,
+				  int off, size_t *end)
+{
+	switch (off) {
+	case offsetof(struct Qdisc, limit):
+		*end = offsetofend(struct Qdisc, limit);
+		break;
+	case offsetof(struct Qdisc, q) + offsetof(struct qdisc_skb_head, qlen):
+		*end = offsetof(struct Qdisc, q) + offsetofend(struct qdisc_skb_head, qlen);
+		break;
+	case offsetof(struct Qdisc, qstats) ... offsetofend(struct Qdisc, qstats) - 1:
+		*end = offsetofend(struct Qdisc, qstats);
+		break;
+	default:
+		return -EACCES;
+	}
+
+	return 0;
+}
+
+static int bpf_qdisc_sk_buff_access(struct bpf_verifier_log *log,
+				    const struct bpf_reg_state *reg,
+				    int off, size_t *end)
+{
+	switch (off) {
+	case offsetof(struct sk_buff, tstamp):
+		*end = offsetofend(struct sk_buff, tstamp);
+		break;
+	case offsetof(struct sk_buff, priority):
+		*end = offsetofend(struct sk_buff, priority);
+		break;
+	case offsetof(struct sk_buff, mark):
+		*end = offsetofend(struct sk_buff, mark);
+		break;
+	case offsetof(struct sk_buff, queue_mapping):
+		*end = offsetofend(struct sk_buff, queue_mapping);
+		break;
+	case offsetof(struct sk_buff, cb) + offsetof(struct qdisc_skb_cb, tc_classid):
+		*end = offsetof(struct sk_buff, cb) +
+		       offsetofend(struct qdisc_skb_cb, tc_classid);
+		break;
+	case offsetof(struct sk_buff, cb) + offsetof(struct qdisc_skb_cb, data[0]) ...
+	     offsetof(struct sk_buff, cb) + offsetof(struct qdisc_skb_cb,
+						     data[QDISC_CB_PRIV_LEN - 1]):
+		*end = offsetof(struct sk_buff, cb) +
+		       offsetofend(struct qdisc_skb_cb, data[QDISC_CB_PRIV_LEN - 1]);
+		break;
+	case offsetof(struct sk_buff, tc_index):
+		*end = offsetofend(struct sk_buff, tc_index);
+		break;
+	default:
+		return -EACCES;
+	}
+
+	return 0;
+}
+
+static int bpf_qdisc_btf_struct_access(struct bpf_verifier_log *log,
+				       const struct bpf_reg_state *reg,
+				       int off, int size)
+{
+	const struct btf_type *t, *skbt, *qdisct;
+	size_t end;
+	int err;
+
+	skbt = btf_type_by_id(reg->btf, bpf_sk_buff_ids[0]);
+	qdisct = btf_type_by_id(reg->btf, bpf_qdisc_ids[0]);
+	t = btf_type_by_id(reg->btf, reg->btf_id);
+
+	if (t == skbt) {
+		err = bpf_qdisc_sk_buff_access(log, reg, off, &end);
+	} else if (t == qdisct) {
+		err = bpf_qdisc_qdisc_access(log, reg, off, &end);
+	} else {
+		bpf_log(log, "only read is supported\n");
+		return -EACCES;
+	}
+
+	if (err) {
+		bpf_log(log, "no write support to %s at off %d\n",
+			btf_name_by_offset(reg->btf, t->name_off), off);
+		return -EACCES;
+	}
+
+	if (off + size > end) {
+		bpf_log(log,
+			"write access at off %d with size %d beyond the member of %s ended at %zu\n",
+			off, size, btf_name_by_offset(reg->btf, t->name_off), end);
+		return -EACCES;
+	}
+
+	return 0;
+}
+
+static const struct bpf_verifier_ops bpf_qdisc_verifier_ops = {
+	.get_func_proto = bpf_base_func_proto,
+	.is_valid_access = bpf_qdisc_is_valid_access,
+	.btf_struct_access = bpf_qdisc_btf_struct_access,
+};
+
+static int bpf_qdisc_init_member(const struct btf_type *t,
+				 const struct btf_member *member,
+				 void *kdata, const void *udata)
+{
+	const struct Qdisc_ops *uqdisc_ops;
+	struct Qdisc_ops *qdisc_ops;
+	u32 moff;
+
+	uqdisc_ops = (const struct Qdisc_ops *)udata;
+	qdisc_ops = (struct Qdisc_ops *)kdata;
+
+	moff = __btf_member_bit_offset(t, member) / 8;
+	switch (moff) {
+	case offsetof(struct Qdisc_ops, peek):
+		qdisc_ops->peek = qdisc_peek_dequeued;
+		return 0;
+	case offsetof(struct Qdisc_ops, id):
+		if (bpf_obj_name_cpy(qdisc_ops->id, uqdisc_ops->id,
+				     sizeof(qdisc_ops->id)) <= 0)
+			return -EINVAL;
+		return 1;
+	}
+
+	return 0;
+}
+
+static int bpf_qdisc_reg(void *kdata, struct bpf_link *link)
+{
+	return register_qdisc(kdata);
+}
+
+static void bpf_qdisc_unreg(void *kdata, struct bpf_link *link)
+{
+	return unregister_qdisc(kdata);
+}
+
+static int Qdisc_ops__enqueue(struct sk_buff *skb__ref, struct Qdisc *sch,
+			      struct sk_buff **to_free)
+{
+	return 0;
+}
+
+static struct sk_buff *Qdisc_ops__dequeue(struct Qdisc *sch)
+{
+	return NULL;
+}
+
+static int Qdisc_ops__init(struct Qdisc *sch, struct nlattr *arg,
+			   struct netlink_ext_ack *extack)
+{
+	return 0;
+}
+
+static void Qdisc_ops__reset(struct Qdisc *sch)
+{
+}
+
+static void Qdisc_ops__destroy(struct Qdisc *sch)
+{
+}
+
+static struct Qdisc_ops __bpf_ops_qdisc_ops = {
+	.enqueue = Qdisc_ops__enqueue,
+	.dequeue = Qdisc_ops__dequeue,
+	.init = Qdisc_ops__init,
+	.reset = Qdisc_ops__reset,
+	.destroy = Qdisc_ops__destroy,
+};
+
+static struct bpf_struct_ops bpf_Qdisc_ops = {
+	.verifier_ops = &bpf_qdisc_verifier_ops,
+	.reg = bpf_qdisc_reg,
+	.unreg = bpf_qdisc_unreg,
+	.init_member = bpf_qdisc_init_member,
+	.init = bpf_qdisc_init,
+	.name = "Qdisc_ops",
+	.cfi_stubs = &__bpf_ops_qdisc_ops,
+	.owner = THIS_MODULE,
+};
+
+static int __init bpf_qdisc_kfunc_init(void)
+{
+	return register_bpf_struct_ops(&bpf_Qdisc_ops, Qdisc_ops);
+}
+late_initcall(bpf_qdisc_kfunc_init);
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index e3e91cf867eb..1aad41b7d5a8 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -25,6 +25,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -358,7 +359,7 @@ static struct Qdisc_ops *qdisc_lookup_ops(struct nlattr *kind)
 		read_lock(&qdisc_mod_lock);
 		for (q = qdisc_base; q; q = q->next) {
 			if (nla_strcmp(kind, q->id) == 0) {
-				if (!try_module_get(q->owner))
+				if (!bpf_try_module_get(q, q->owner))
 					q = NULL;
 				break;
 			}
@@ -1287,7 +1288,7 @@ static struct Qdisc *qdisc_create(struct net_device *dev,
 			/* We will try again qdisc_lookup_ops,
 			 * so don't keep a reference.
 				 */
-				module_put(ops->owner);
+				bpf_module_put(ops, ops->owner);
 				err = -EAGAIN;
 				goto err_out;
 			}
@@ -1398,7 +1399,7 @@ static struct Qdisc *qdisc_create(struct net_device *dev,
 	netdev_put(dev, &sch->dev_tracker);
 	qdisc_free(sch);
 err_out2:
-	module_put(ops->owner);
+	bpf_module_put(ops, ops->owner);
 err_out:
 	*errp = err;
 	return NULL;
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 14ab2f4c190a..e6fda9f20272 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -24,6 +24,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1078,7 +1079,7 @@ static void __qdisc_destroy(struct Qdisc *qdisc)
 		ops->destroy(qdisc);
 
 	lockdep_unregister_key(&qdisc->root_lock_key);
-	module_put(ops->owner);
+	bpf_module_put(ops, ops->owner);
 	netdev_put(dev, &qdisc->dev_tracker);
 
 	trace_qdisc_destroy(qdisc);

From patchwork Wed Mar 19 21:53:51 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 14023201
X-Patchwork-Delegate: bpf@iogearbox.net
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org,
 alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org,
 edumazet@google.com, xiyou.wangcong@gmail.com, jhs@mojatatu.com,
 sinquersw@gmail.com, toke@redhat.com, juntong.deng@outlook.com,
 jiri@resnulli.us, stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br,
 yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com,
 kernel-team@meta.com
Subject: [PATCH bpf-next v6 04/11] bpf: net_sched: Add basic bpf qdisc kfuncs
Date: Wed, 19 Mar 2025 14:53:51 -0700
Message-ID: <20250319215358.2287371-5-ameryhung@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250319215358.2287371-1-ameryhung@gmail.com>
References: <20250319215358.2287371-1-ameryhung@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

From: Amery Hung

Add basic kfuncs for working on skbs in a qdisc.

Both bpf_qdisc_skb_drop() and bpf_kfree_skb() can be used to release
a reference to an skb. However, bpf_qdisc_skb_drop() can only be called
in .enqueue, where a to_free skb list is available from the kernel to
defer the release. bpf_kfree_skb() should be used elsewhere. It is also
used in bpf_obj_free_fields() when cleaning up skbs in maps and
collections.

bpf_skb_get_hash() returns the flow hash of an skb, which can be used
to build flow-based queueing algorithms.

Finally, allow users to create a read-only dynptr via
bpf_dynptr_from_skb().
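
For readers who want to see the calling rule above in isolation, here
is a minimal userspace sketch of it: bpf_qdisc_skb_drop() is allowed
only from .enqueue, while the other three kfuncs are allowed from any
operator. It is a model, not the kernel code. The enum names, flag
names and kfunc_allowed() are hypothetical and exist only for this
illustration; the kernel keys its check on BTF ids and struct member
offsets instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the Qdisc_ops members a program attaches to. */
enum model_op { OP_ENQUEUE, OP_DEQUEUE, OP_INIT, OP_RESET, OP_DESTROY, OP_COUNT };

/* Hypothetical stand-ins for the four kfuncs named in the commit message. */
enum model_kfunc {
	QKF_SKB_GET_HASH,
	QKF_KFREE_SKB,
	QKF_DYNPTR_FROM_SKB,
	QKF_QDISC_SKB_DROP,
	QKF_COUNT
};

#define CTX_COMMON  0u
#define CTX_ENQUEUE (1u << 0)

/* Extra kfunc group unlocked by each operator (none for most of them). */
static const unsigned int op_ctx_flags[OP_COUNT] = {
	[OP_ENQUEUE] = CTX_ENQUEUE,
	[OP_DEQUEUE] = CTX_COMMON,
	[OP_INIT]    = CTX_COMMON,
	[OP_RESET]   = CTX_COMMON,
	[OP_DESTROY] = CTX_COMMON,
};

/* Needs the to_free list, which only .enqueue receives. */
static bool kfunc_is_enqueue_only(enum model_kfunc kf)
{
	return kf == QKF_QDISC_SKB_DROP;
}

/* Usable from every operator. */
static bool kfunc_is_common(enum model_kfunc kf)
{
	return kf == QKF_SKB_GET_HASH || kf == QKF_KFREE_SKB ||
	       kf == QKF_DYNPTR_FROM_SKB;
}

/* Returns true when a program attached to `op` may call `kf`. */
static bool kfunc_allowed(enum model_op op, enum model_kfunc kf)
{
	assert(op < OP_COUNT && kf < QKF_COUNT);

	if ((op_ctx_flags[op] & CTX_ENQUEUE) && kfunc_is_enqueue_only(kf))
		return true;

	return kfunc_is_common(kf);
}
```

A per-operator flag table keeps the check to one lookup plus two set
tests, so a new operator-specific group only needs one more flag bit
and one more set.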
Signed-off-by: Amery Hung
Acked-by: Toke Høiland-Jørgensen
---
 net/sched/bpf_qdisc.c | 111 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 110 insertions(+), 1 deletion(-)

diff --git a/net/sched/bpf_qdisc.c b/net/sched/bpf_qdisc.c
index 7eca556a3782..d812a72ca032 100644
--- a/net/sched/bpf_qdisc.c
+++ b/net/sched/bpf_qdisc.c
@@ -8,6 +8,9 @@
 #include
 #include
 
+#define QDISC_OP_IDX(op) (offsetof(struct Qdisc_ops, op) / sizeof(void (*)(void)))
+#define QDISC_MOFF_IDX(moff) (moff / sizeof(void (*)(void)))
+
 static struct bpf_struct_ops bpf_Qdisc_ops;
 
 struct bpf_sk_buff_ptr {
@@ -139,6 +142,95 @@ static int bpf_qdisc_btf_struct_access(struct bpf_verifier_log *log,
 	return 0;
 }
 
+__bpf_kfunc_start_defs();
+
+/* bpf_skb_get_hash - Get the flow hash of an skb.
+ * @skb: The skb to get the flow hash from.
+ */
+__bpf_kfunc u32 bpf_skb_get_hash(struct sk_buff *skb)
+{
+	return skb_get_hash(skb);
+}
+
+/* bpf_kfree_skb - Release an skb's reference and drop it immediately.
+ * @skb: The skb whose reference to be released and dropped.
+ */
+__bpf_kfunc void bpf_kfree_skb(struct sk_buff *skb)
+{
+	kfree_skb(skb);
+}
+
+/* bpf_qdisc_skb_drop - Drop an skb by adding it to a deferred free list.
+ * @skb: The skb whose reference to be released and dropped.
+ * @to_free_list: The list of skbs to be dropped.
+ */
+__bpf_kfunc void bpf_qdisc_skb_drop(struct sk_buff *skb,
+				    struct bpf_sk_buff_ptr *to_free_list)
+{
+	__qdisc_drop(skb, (struct sk_buff **)to_free_list);
+}
+
+__bpf_kfunc_end_defs();
+
+BTF_KFUNCS_START(qdisc_kfunc_ids)
+BTF_ID_FLAGS(func, bpf_skb_get_hash, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_kfree_skb, KF_RELEASE)
+BTF_ID_FLAGS(func, bpf_qdisc_skb_drop, KF_RELEASE)
+BTF_ID_FLAGS(func, bpf_dynptr_from_skb, KF_TRUSTED_ARGS)
+BTF_KFUNCS_END(qdisc_kfunc_ids)
+
+BTF_SET_START(qdisc_common_kfunc_set)
+BTF_ID(func, bpf_skb_get_hash)
+BTF_ID(func, bpf_kfree_skb)
+BTF_ID(func, bpf_dynptr_from_skb)
+BTF_SET_END(qdisc_common_kfunc_set)
+
+BTF_SET_START(qdisc_enqueue_kfunc_set)
+BTF_ID(func, bpf_qdisc_skb_drop)
+BTF_SET_END(qdisc_enqueue_kfunc_set)
+
+enum qdisc_ops_kf_flags {
+	QDISC_OPS_KF_COMMON = 0,
+	QDISC_OPS_KF_ENQUEUE = 1 << 0,
+};
+
+static const u32 qdisc_ops_context_flags[] = {
+	[QDISC_OP_IDX(enqueue)] = QDISC_OPS_KF_ENQUEUE,
+	[QDISC_OP_IDX(dequeue)] = QDISC_OPS_KF_COMMON,
+	[QDISC_OP_IDX(init)] = QDISC_OPS_KF_COMMON,
+	[QDISC_OP_IDX(reset)] = QDISC_OPS_KF_COMMON,
+	[QDISC_OP_IDX(destroy)] = QDISC_OPS_KF_COMMON,
+};
+
+static int bpf_qdisc_kfunc_filter(const struct bpf_prog *prog, u32 kfunc_id)
+{
+	u32 moff, flags;
+
+	if (!btf_id_set8_contains(&qdisc_kfunc_ids, kfunc_id))
+		return 0;
+
+	if (prog->aux->st_ops != &bpf_Qdisc_ops)
+		return -EACCES;
+
+	moff = prog->aux->attach_st_ops_member_off;
+	flags = qdisc_ops_context_flags[QDISC_MOFF_IDX(moff)];
+
+	if ((flags & QDISC_OPS_KF_ENQUEUE) &&
+	    btf_id_set_contains(&qdisc_enqueue_kfunc_set, kfunc_id))
+		return 0;
+
+	if (btf_id_set_contains(&qdisc_common_kfunc_set, kfunc_id))
+		return 0;
+
+	return -EACCES;
+}
+
+static const struct btf_kfunc_id_set bpf_qdisc_kfunc_set = {
+	.owner = THIS_MODULE,
+	.set = &qdisc_kfunc_ids,
+	.filter = bpf_qdisc_kfunc_filter,
+};
+
 static const struct bpf_verifier_ops bpf_qdisc_verifier_ops = {
 	.get_func_proto = bpf_base_func_proto,
 	.is_valid_access = bpf_qdisc_is_valid_access,
@@ -225,8 +317,25 @@ static struct bpf_struct_ops bpf_Qdisc_ops = {
 	.owner = THIS_MODULE,
 };
 
+BTF_ID_LIST(bpf_sk_buff_dtor_ids)
+BTF_ID(func, bpf_kfree_skb)
+
 static int __init bpf_qdisc_kfunc_init(void)
 {
-	return register_bpf_struct_ops(&bpf_Qdisc_ops, Qdisc_ops);
+	int ret;
+	const struct btf_id_dtor_kfunc skb_kfunc_dtors[] = {
+		{
+			.btf_id = bpf_sk_buff_ids[0],
+			.kfunc_btf_id = bpf_sk_buff_dtor_ids[0]
+		},
+	};
+
+	ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS, &bpf_qdisc_kfunc_set);
+	ret = ret ?: register_btf_id_dtor_kfuncs(skb_kfunc_dtors,
+						 ARRAY_SIZE(skb_kfunc_dtors),
+						 THIS_MODULE);
+	ret = ret ?: register_bpf_struct_ops(&bpf_Qdisc_ops, Qdisc_ops);
+
+	return ret;
 }
 late_initcall(bpf_qdisc_kfunc_init);

From patchwork Wed Mar 19 21:53:52 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 14023202
X-Patchwork-Delegate: bpf@iogearbox.net
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org,
 alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org,
 edumazet@google.com, xiyou.wangcong@gmail.com, jhs@mojatatu.com,
 sinquersw@gmail.com, toke@redhat.com, juntong.deng@outlook.com,
 jiri@resnulli.us, stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br,
 yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com,
 kernel-team@meta.com
Subject: [PATCH bpf-next v6 05/11] bpf: net_sched: Add a qdisc watchdog timer
Date: Wed, 19 Mar 2025 14:53:52 -0700
Message-ID: <20250319215358.2287371-6-ameryhung@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250319215358.2287371-1-ameryhung@gmail.com>
References: <20250319215358.2287371-1-ameryhung@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

From: Amery Hung

Add a watchdog timer to bpf qdisc. The watchdog can be used to schedule
the execution of the qdisc through the kfunc
bpf_qdisc_watchdog_schedule(). It can be useful for building
traffic-shaping scheduling algorithms, where the time the next packet
will be dequeued is known.

The implementation relies on struct_ops gen_prologue/epilogue to patch
bpf programs provided by users. Operator-specific prologue/epilogue
kfuncs are introduced instead of watchdog kfuncs so that it is easier to
extend the prologue/epilogue in the future (writing C vs BPF bytecode).
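
As an illustration of the traffic-shaping case mentioned above, the
sketch below computes when the next packet may leave under a fixed
byte rate. It is plain userspace C, not a BPF program. struct shaper,
tx_time_ns() and shaper_may_send() are hypothetical names made up for
this example. In a real .dequeue program, the value written to
*expire_ns is what one would presumably pass as the expire argument of
bpf_qdisc_watchdog_schedule(), so the qdisc runs again at that time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical shaper state: a fixed rate and the earliest next send time. */
struct shaper {
	uint64_t rate_bytes_per_sec;
	uint64_t next_tx_ns;
};

/* Time in ns needed to transmit pkt_len bytes at the given rate. */
static uint64_t tx_time_ns(uint32_t pkt_len, uint64_t rate_bytes_per_sec)
{
	assert(rate_bytes_per_sec > 0);
	return (uint64_t)pkt_len * NSEC_PER_SEC / rate_bytes_per_sec;
}

/*
 * Returns true if a packet of pkt_len bytes may be sent at now_ns and
 * advances the next send time. Otherwise returns false and stores the
 * absolute time at which the caller should retry in *expire_ns.
 */
static bool shaper_may_send(struct shaper *s, uint64_t now_ns,
			    uint32_t pkt_len, uint64_t *expire_ns)
{
	if (now_ns < s->next_tx_ns) {
		*expire_ns = s->next_tx_ns;	/* too early: arm the timer */
		return false;
	}

	s->next_tx_ns = now_ns + tx_time_ns(pkt_len, s->rate_bytes_per_sec);
	return true;
}
```

Because the release time is known exactly, a timer is enough here and
no polling is needed, which is the property the commit message relies
on.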
Signed-off-by: Amery Hung
Acked-by: Toke Høiland-Jørgensen
---
 net/sched/bpf_qdisc.c | 106 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 105 insertions(+), 1 deletion(-)

diff --git a/net/sched/bpf_qdisc.c b/net/sched/bpf_qdisc.c
index d812a72ca032..5f4ab4877535 100644
--- a/net/sched/bpf_qdisc.c
+++ b/net/sched/bpf_qdisc.c
@@ -13,6 +13,10 @@
 
 static struct bpf_struct_ops bpf_Qdisc_ops;
 
+struct bpf_sched_data {
+	struct qdisc_watchdog watchdog;
+};
+
 struct bpf_sk_buff_ptr {
 	struct sk_buff *skb;
 };
@@ -142,6 +146,56 @@ static int bpf_qdisc_btf_struct_access(struct bpf_verifier_log *log,
 	return 0;
 }
 
+BTF_ID_LIST(bpf_qdisc_init_prologue_ids)
+BTF_ID(func, bpf_qdisc_init_prologue)
+
+static int bpf_qdisc_gen_prologue(struct bpf_insn *insn_buf, bool direct_write,
+				  const struct bpf_prog *prog)
+{
+	struct bpf_insn *insn = insn_buf;
+
+	if (prog->aux->attach_st_ops_member_off != offsetof(struct Qdisc_ops, init))
+		return 0;
+
+	/* r6 = r1; // r6 will be "u64 *ctx". r1 is "u64 *ctx".
+	 * r1 = r1[0]; // r1 will be "struct Qdisc *sch"
+	 * r0 = bpf_qdisc_init_prologue(r1);
+	 * r1 = r6; // r1 will be "u64 *ctx".
+	 */
+	*insn++ = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1);
+	*insn++ = BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, 0);
+	*insn++ = BPF_CALL_KFUNC(0, bpf_qdisc_init_prologue_ids[0]);
+	*insn++ = BPF_MOV64_REG(BPF_REG_1, BPF_REG_6);
+	*insn++ = prog->insnsi[0];
+
+	return insn - insn_buf;
+}
+
+BTF_ID_LIST(bpf_qdisc_reset_destroy_epilogue_ids)
+BTF_ID(func, bpf_qdisc_reset_destroy_epilogue)
+
+static int bpf_qdisc_gen_epilogue(struct bpf_insn *insn_buf, const struct bpf_prog *prog,
+				  s16 ctx_stack_off)
+{
+	struct bpf_insn *insn = insn_buf;
+
+	if (prog->aux->attach_st_ops_member_off != offsetof(struct Qdisc_ops, reset) &&
+	    prog->aux->attach_st_ops_member_off != offsetof(struct Qdisc_ops, destroy))
+		return 0;
+
+	/* r1 = stack[ctx_stack_off]; // r1 will be "u64 *ctx"
+	 * r1 = r1[0]; // r1 will be "struct Qdisc *sch"
+	 * r0 = bpf_qdisc_reset_destroy_epilogue(r1);
+	 * BPF_EXIT;
+	 */
+	*insn++ = BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_FP, ctx_stack_off);
+	*insn++ = BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, 0);
+	*insn++ = BPF_CALL_KFUNC(0, bpf_qdisc_reset_destroy_epilogue_ids[0]);
+	*insn++ = BPF_EXIT_INSN();
+
+	return insn - insn_buf;
+}
+
 __bpf_kfunc_start_defs();
 
 /* bpf_skb_get_hash - Get the flow hash of an skb.
@@ -170,6 +224,36 @@ __bpf_kfunc void bpf_qdisc_skb_drop(struct sk_buff *skb,
 	__qdisc_drop(skb, (struct sk_buff **)to_free_list);
 }
 
+/* bpf_qdisc_watchdog_schedule - Schedule a qdisc to a later time using a timer.
+ * @sch: The qdisc to be scheduled.
+ * @expire: The expiry time of the timer.
+ * @delta_ns: The slack range of the timer.
+ */
+__bpf_kfunc void bpf_qdisc_watchdog_schedule(struct Qdisc *sch, u64 expire, u64 delta_ns)
+{
+	struct bpf_sched_data *q = qdisc_priv(sch);
+
+	qdisc_watchdog_schedule_range_ns(&q->watchdog, expire, delta_ns);
+}
+
+/* bpf_qdisc_init_prologue - Hidden kfunc called in prologue of .init. */
+__bpf_kfunc void bpf_qdisc_init_prologue(struct Qdisc *sch)
+{
+	struct bpf_sched_data *q = qdisc_priv(sch);
+
+	qdisc_watchdog_init(&q->watchdog, sch);
+}
+
+/* bpf_qdisc_reset_destroy_epilogue - Hidden kfunc called in epilogue of .reset
+ * and .destroy
+ */
+__bpf_kfunc void bpf_qdisc_reset_destroy_epilogue(struct Qdisc *sch)
+{
+	struct bpf_sched_data *q = qdisc_priv(sch);
+
+	qdisc_watchdog_cancel(&q->watchdog);
+}
+
 __bpf_kfunc_end_defs();
 
 BTF_KFUNCS_START(qdisc_kfunc_ids)
@@ -177,6 +261,9 @@ BTF_ID_FLAGS(func, bpf_skb_get_hash, KF_TRUSTED_ARGS)
 BTF_ID_FLAGS(func, bpf_kfree_skb, KF_RELEASE)
 BTF_ID_FLAGS(func, bpf_qdisc_skb_drop, KF_RELEASE)
 BTF_ID_FLAGS(func, bpf_dynptr_from_skb, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_qdisc_watchdog_schedule, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_qdisc_init_prologue, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_qdisc_reset_destroy_epilogue, KF_TRUSTED_ARGS)
 BTF_KFUNCS_END(qdisc_kfunc_ids)
 
 BTF_SET_START(qdisc_common_kfunc_set)
@@ -187,16 +274,22 @@ BTF_SET_END(qdisc_common_kfunc_set)
 
 BTF_SET_START(qdisc_enqueue_kfunc_set)
 BTF_ID(func, bpf_qdisc_skb_drop)
+BTF_ID(func, bpf_qdisc_watchdog_schedule)
 BTF_SET_END(qdisc_enqueue_kfunc_set)
 
+BTF_SET_START(qdisc_dequeue_kfunc_set)
+BTF_ID(func, bpf_qdisc_watchdog_schedule)
+BTF_SET_END(qdisc_dequeue_kfunc_set)
+
 enum qdisc_ops_kf_flags {
 	QDISC_OPS_KF_COMMON = 0,
 	QDISC_OPS_KF_ENQUEUE = 1 << 0,
+	QDISC_OPS_KF_DEQUEUE = 1 << 1,
 };
 
 static const u32 qdisc_ops_context_flags[] = {
 	[QDISC_OP_IDX(enqueue)] = QDISC_OPS_KF_ENQUEUE,
-	[QDISC_OP_IDX(dequeue)] = QDISC_OPS_KF_COMMON,
+	[QDISC_OP_IDX(dequeue)] = QDISC_OPS_KF_DEQUEUE,
 	[QDISC_OP_IDX(init)] = QDISC_OPS_KF_COMMON,
 	[QDISC_OP_IDX(reset)] = QDISC_OPS_KF_COMMON,
 	[QDISC_OP_IDX(destroy)] = QDISC_OPS_KF_COMMON,
@@ -219,6 +312,10 @@ static int bpf_qdisc_kfunc_filter(const struct bpf_prog *prog, u32 kfunc_id)
 	    btf_id_set_contains(&qdisc_enqueue_kfunc_set, kfunc_id))
 		return 0;
 
+	if ((flags & QDISC_OPS_KF_DEQUEUE) &&
+	    btf_id_set_contains(&qdisc_dequeue_kfunc_set, kfunc_id))
+		return 0;
+
 	if (btf_id_set_contains(&qdisc_common_kfunc_set, kfunc_id))
 		return 0;
 
@@ -235,6 +332,8 @@ static const struct bpf_verifier_ops bpf_qdisc_verifier_ops = {
 	.get_func_proto = bpf_base_func_proto,
 	.is_valid_access = bpf_qdisc_is_valid_access,
 	.btf_struct_access = bpf_qdisc_btf_struct_access,
+	.gen_prologue = bpf_qdisc_gen_prologue,
+	.gen_epilogue = bpf_qdisc_gen_epilogue,
 };
 
 static int bpf_qdisc_init_member(const struct btf_type *t,
@@ -250,6 +349,11 @@ static int bpf_qdisc_init_member(const struct btf_type *t,
 
 	moff = __btf_member_bit_offset(t, member) / 8;
 	switch (moff) {
+	case offsetof(struct Qdisc_ops, priv_size):
+		if (uqdisc_ops->priv_size)
+			return -EINVAL;
+		qdisc_ops->priv_size = sizeof(struct bpf_sched_data);
+		return 1;
 	case offsetof(struct Qdisc_ops, peek):
 		qdisc_ops->peek = qdisc_peek_dequeued;
 		return 0;

From patchwork Wed Mar 19 21:53:53 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 14023203
X-Patchwork-Delegate: bpf@iogearbox.net
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org,
 alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org,
 edumazet@google.com, xiyou.wangcong@gmail.com, jhs@mojatatu.com,
 sinquersw@gmail.com, toke@redhat.com, juntong.deng@outlook.com,
 jiri@resnulli.us, stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br,
 yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com,
 kernel-team@meta.com
Subject: [PATCH bpf-next v6 06/11] bpf: net_sched: Support updating bstats
Date: Wed, 19 Mar 2025 14:53:53 -0700
Message-ID: <20250319215358.2287371-7-ameryhung@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250319215358.2287371-1-ameryhung@gmail.com>
References: <20250319215358.2287371-1-ameryhung@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

From: Amery Hung

Add a kfunc to update Qdisc bstats when an skb is dequeued. The kfunc is
only available in .dequeue programs.

Signed-off-by: Amery Hung
Acked-by: Toke Høiland-Jørgensen
---
 net/sched/bpf_qdisc.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/net/sched/bpf_qdisc.c b/net/sched/bpf_qdisc.c
index 5f4ab4877535..5aff83d7d1d8 100644
--- a/net/sched/bpf_qdisc.c
+++ b/net/sched/bpf_qdisc.c
@@ -254,6 +254,15 @@ __bpf_kfunc void bpf_qdisc_reset_destroy_epilogue(struct Qdisc *sch)
 	qdisc_watchdog_cancel(&q->watchdog);
 }
 
+/* bpf_qdisc_bstats_update - Update Qdisc basic statistics
+ * @sch: The qdisc from which an skb is dequeued.
+ * @skb: The skb to be dequeued.
+ */
+__bpf_kfunc void bpf_qdisc_bstats_update(struct Qdisc *sch, const struct sk_buff *skb)
+{
+	bstats_update(&sch->bstats, skb);
+}
+
 __bpf_kfunc_end_defs();
 
 BTF_KFUNCS_START(qdisc_kfunc_ids)
@@ -264,6 +273,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_from_skb, KF_TRUSTED_ARGS)
 BTF_ID_FLAGS(func, bpf_qdisc_watchdog_schedule, KF_TRUSTED_ARGS)
 BTF_ID_FLAGS(func, bpf_qdisc_init_prologue, KF_TRUSTED_ARGS)
 BTF_ID_FLAGS(func, bpf_qdisc_reset_destroy_epilogue, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_qdisc_bstats_update, KF_TRUSTED_ARGS)
 BTF_KFUNCS_END(qdisc_kfunc_ids)
 
 BTF_SET_START(qdisc_common_kfunc_set)
@@ -279,6 +289,7 @@ BTF_SET_END(qdisc_enqueue_kfunc_set)
 
 BTF_SET_START(qdisc_dequeue_kfunc_set)
 BTF_ID(func, bpf_qdisc_watchdog_schedule)
+BTF_ID(func, bpf_qdisc_bstats_update)
 BTF_SET_END(qdisc_dequeue_kfunc_set)
 
 enum qdisc_ops_kf_flags {

From patchwork Wed Mar 19 21:53:54 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 14023204
X-Patchwork-Delegate: bpf@iogearbox.net
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org,
 alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org,
 edumazet@google.com, xiyou.wangcong@gmail.com, jhs@mojatatu.com,
 sinquersw@gmail.com, toke@redhat.com, juntong.deng@outlook.com,
 jiri@resnulli.us, stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br,
 yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com,
 kernel-team@meta.com
Subject: [PATCH bpf-next v6 07/11] bpf: net_sched: Disable attaching bpf qdisc to non root
Date: Wed, 19 Mar 2025 14:53:54 -0700
Message-ID: <20250319215358.2287371-8-ameryhung@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250319215358.2287371-1-ameryhung@gmail.com>
References: <20250319215358.2287371-1-ameryhung@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

Do not allow users to attach bpf qdiscs to classful qdiscs. This is to
prevent accidentally breaking existing classful qdiscs if they rely on
some data in the child qdisc. This restriction can potentially be lifted
in the future. Note that we still allow a bpf qdisc to be attached to mq.

Signed-off-by: Amery Hung
Acked-by: Toke Høiland-Jørgensen
---
 net/sched/bpf_qdisc.c | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/net/sched/bpf_qdisc.c b/net/sched/bpf_qdisc.c
index 5aff83d7d1d8..cb158c8c433e 100644
--- a/net/sched/bpf_qdisc.c
+++ b/net/sched/bpf_qdisc.c
@@ -158,13 +158,19 @@ static int bpf_qdisc_gen_prologue(struct bpf_insn *insn_buf, bool direct_write,
 		return 0;
 
 	/* r6 = r1; // r6 will be "u64 *ctx". r1 is "u64 *ctx".
+	 * r2 = r1[16]; // r2 will be "struct netlink_ext_ack *extack"
 	 * r1 = r1[0]; // r1 will be "struct Qdisc *sch"
-	 * r0 = bpf_qdisc_init_prologue(r1);
+	 * r0 = bpf_qdisc_init_prologue(r1, r2);
+	 * if r0 == 0 goto pc+1;
+	 * BPF_EXIT;
 	 * r1 = r6; // r1 will be "u64 *ctx".
 	 */
 	*insn++ = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1);
+	*insn++ = BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, 16);
 	*insn++ = BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, 0);
 	*insn++ = BPF_CALL_KFUNC(0, bpf_qdisc_init_prologue_ids[0]);
+	*insn++ = BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1);
+	*insn++ = BPF_EXIT_INSN();
 	*insn++ = BPF_MOV64_REG(BPF_REG_1, BPF_REG_6);
 	*insn++ = prog->insnsi[0];
 
@@ -237,11 +243,26 @@ __bpf_kfunc void bpf_qdisc_watchdog_schedule(struct Qdisc *sch, u64 expire, u64
 }
 
 /* bpf_qdisc_init_prologue - Hidden kfunc called in prologue of .init. */
-__bpf_kfunc void bpf_qdisc_init_prologue(struct Qdisc *sch)
+__bpf_kfunc int bpf_qdisc_init_prologue(struct Qdisc *sch,
+					struct netlink_ext_ack *extack)
 {
 	struct bpf_sched_data *q = qdisc_priv(sch);
+	struct net_device *dev = qdisc_dev(sch);
+	struct Qdisc *p;
+
+	if (sch->parent != TC_H_ROOT) {
+		p = qdisc_lookup(dev, TC_H_MAJ(sch->parent));
+		if (!p)
+			return -ENOENT;
+
+		if (!(p->flags & TCQ_F_MQROOT)) {
+			NL_SET_ERR_MSG(extack, "BPF qdisc only supported on root or mq");
+			return -EINVAL;
+		}
+	}
 
 	qdisc_watchdog_init(&q->watchdog, sch);
+	return 0;
 }
 
 /* bpf_qdisc_reset_destroy_epilogue - Hidden kfunc called in epilogue of .reset

From patchwork Wed Mar 19 21:53:55 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 14023205
X-Patchwork-Delegate: bpf@iogearbox.net
Received: from mail-pl1-f181.google.com (mail-pl1-f181.google.com [209.85.214.181])
        (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
        (No client certificate requested)
        by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8992921D5BD;
        Wed, 19 Mar 2025 21:54:19 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.181 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742421261; cv=none; b=jC9y8sPEJuAc8ruCHkr97pAX7FcxtT0MJhoORJTd0sMH0SvgRKc1SZxYv5u7StMXYYKz+w0F++bFMrrNVhdXwsVrKo+Zlj30vfBlUpr53gZtaExN+6uSdSBE9+Wt5WG0JPMFPI7xyONEv/8AcfFJTdyYPczpYC6RA5PfXcO0jZA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742421261; c=relaxed/simple; bh=H65Rd6PJNUiZCQZxJba7/R0D+DJJvwvv5AQi47YKh/U=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Vm580HtVItJS5g3TTXHZSdNoG+I5HX4xMbZ+1+oSo6FukUmV1itOVxvy7VcoSYoN0OpgVyeLfp9OEN9lmss3+P5vPqrM879/ftX7E6QAf7YX2ez+rv6tbPhZB0KGlp3bZBpFgZHKZVfhdyRSyZsvBi5nhmWbzBGSjDO/OxepDyw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=jrd9xD86; arc=none smtp.client-ip=209.85.214.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="jrd9xD86" Received: by mail-pl1-f181.google.com with SMTP id d9443c01a7336-22423adf751so738525ad.2; Wed, 19 Mar 2025 14:54:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1742421259; x=1743026059; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=v8ejcrrrC6yV6s9uq4HHgS9blRXEVLqg2s5IbclXas8=; b=jrd9xD86kX05I0wwKvWLSi39KGb0e06z9h+pBTYJehtNV4wfHua7c5hg6X2vg+uXYD 9flL8qv8cDa0UvUv7Yi5pGBebA61pT5tGjDN7Fn0Keyze4eIobrT02Duyl1EkFQNeKd1 
TQEGloo4YoUIwPsZXUxKBzhJMGwvjYYyRDsRX8yLDn0RCHBDWOSzkigkv9weXFzfr2AS 89nAHQoWkUslunTd8253+oo+R5YMBQeMWHZCeBaEZDiO//ZlfUrAeXZL3z6SYIbttlOU tKOG0OCk/IeNI3fYIeDyrvEBhZdrI0fTgg8e550Jv5UbG7QJihZTkti/eihBitmOFYcC 0ydg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1742421259; x=1743026059; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=v8ejcrrrC6yV6s9uq4HHgS9blRXEVLqg2s5IbclXas8=; b=k6KQp0LGk40XPWR/I0oj5wDiAahUuQWiW39GWDySG5MGw2pDQ/Ly62eVo14NF4Xb/g xaUlfR30hDB4ENC/4KkcdSHD4iEV7OdcwUO4dSCZuxE891grzS6oPcTQIK6Osh2rOZ+v Otq2JalDR3TZUszgqE0f4pnQ17NIwfpPOBeRc9+QL+H6G0+ExZFYiv8hFcpvEwSVBY6X A1CC4n466ogj1LIGF/nDKnskPCRQBTgYxlMljrohZO7AbpljKDQA/w0rKHI//b8k+VLI zZOwferU27BeSjOtKFrc26B04dhc23/RVlstvvS8BIvrJl0gT8k2E8XoU5utS3R1aU5f DcfA== X-Gm-Message-State: AOJu0YzC9zQavoDc8xMITQmF6ryWyrvBIj1/UCALnxJ2ctnxUma9H9yy 2BtzrRPJqHD3fhVqkShChKLVzllJk8mIzXqZ1b3k7sws4khDlkBkd0KOLmKVUf0= X-Gm-Gg: ASbGncul//RnxpYNmyWdiUNgkyAnC7yoVwoffypf06xbDYWQaznNVWj6L7AhTn2Hir7 w0RKNw6w2Bdwd2KVfLYI8liKhoTemve+SQWbBWpT5gw7xDUThKRubXMYDSfRvgx0n661mVTem/M 9X7eYkc10MYfHUOmu/GYH+d8dsfxfv8pjdaRIGjgCtmLud821wPmhWibGGphmk38lULR5HGuabt pBI6QkL28ddub7M7Zx2qT0oYI7rIFuB4dO64H32oky35q1ECc1d+RvB+udL8UrJny52Wm0mTv89 4GhLAWK8puIzxlIl6ko8cgeN12GQLlr4G3k5lMB1zDBOG8M4YHxQ2O82SrwyXvqWzSiXHyjW4E0 kzzRlFJl5xz4kUKThTRlfzfLGlKwfQw== X-Google-Smtp-Source: AGHT+IEPplkzUeVG9njCT7k0txZiZVG8WI5gOjKnLJQEle+H+oESowMwmpttkNFdBA81hP/4L6ehXQ== X-Received: by 2002:a05:6a00:2291:b0:736:3c6a:be02 with SMTP id d2e1a72fcca58-7376d631cddmr5939387b3a.11.1742421258538; Wed, 19 Mar 2025 14:54:18 -0700 (PDT) Received: from localhost.localdomain (c-76-146-13-146.hsd1.wa.comcast.net. 
        [76.146.13.146]) by smtp.gmail.com with ESMTPSA id
        d2e1a72fcca58-737116b0e8asm12175596b3a.158.2025.03.19.14.54.17
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Wed, 19 Mar 2025 14:54:18 -0700 (PDT)
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org,
        alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org,
        edumazet@google.com, xiyou.wangcong@gmail.com, jhs@mojatatu.com,
        sinquersw@gmail.com, toke@redhat.com, juntong.deng@outlook.com,
        jiri@resnulli.us, stfomichev@gmail.com,
        ekarani.silvestre@ccc.ufcg.edu.br, yangpeihao@sjtu.edu.cn,
        yepeilin.cs@gmail.com, ameryhung@gmail.com, kernel-team@meta.com
Subject: [PATCH bpf-next v6 08/11] libbpf: Support creating and destroying qdisc
Date: Wed, 19 Mar 2025 14:53:55 -0700
Message-ID: <20250319215358.2287371-9-ameryhung@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250319215358.2287371-1-ameryhung@gmail.com>
References: <20250319215358.2287371-1-ameryhung@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Patchwork-Delegate: bpf@iogearbox.net

From: Amery Hung

Extend struct bpf_tc_hook with handle, qdisc name and a new attach type,
BPF_TC_QDISC, to allow users to add or remove any qdisc specified in
addition to clsact.

Signed-off-by: Amery Hung
Acked-by: Toke Høiland-Jørgensen
---
 tools/lib/bpf/libbpf.h  |  5 ++++-
 tools/lib/bpf/netlink.c | 20 +++++++++++++++++---
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index e0605403f977..fdcee6a71e0f 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -1283,6 +1283,7 @@ enum bpf_tc_attach_point {
 	BPF_TC_INGRESS = 1 << 0,
 	BPF_TC_EGRESS = 1 << 1,
 	BPF_TC_CUSTOM = 1 << 2,
+	BPF_TC_QDISC = 1 << 3,
 };
 
 #define BPF_TC_PARENT(a, b) \
@@ -1297,9 +1298,11 @@ struct bpf_tc_hook {
 	int ifindex;
 	enum bpf_tc_attach_point attach_point;
 	__u32 parent;
+	__u32 handle;
+	const char *qdisc;
 	size_t :0;
 };
-#define bpf_tc_hook__last_field parent
+#define bpf_tc_hook__last_field qdisc
 
 struct bpf_tc_opts {
 	size_t sz;
diff --git a/tools/lib/bpf/netlink.c b/tools/lib/bpf/netlink.c
index 68a2def17175..c997e69d507f 100644
--- a/tools/lib/bpf/netlink.c
+++ b/tools/lib/bpf/netlink.c
@@ -529,9 +529,9 @@ int bpf_xdp_query_id(int ifindex, int flags, __u32 *prog_id)
 }
 
 
-typedef int (*qdisc_config_t)(struct libbpf_nla_req *req);
+typedef int (*qdisc_config_t)(struct libbpf_nla_req *req, const struct bpf_tc_hook *hook);
 
-static int clsact_config(struct libbpf_nla_req *req)
+static int clsact_config(struct libbpf_nla_req *req, const struct bpf_tc_hook *hook)
 {
 	req->tc.tcm_parent = TC_H_CLSACT;
 	req->tc.tcm_handle = TC_H_MAKE(TC_H_CLSACT, 0);
@@ -539,6 +539,16 @@ static int clsact_config(struct libbpf_nla_req *req)
 	return nlattr_add(req, TCA_KIND, "clsact", sizeof("clsact"));
 }
 
+static int qdisc_config(struct libbpf_nla_req *req, const struct bpf_tc_hook *hook)
+{
+	const char *qdisc = OPTS_GET(hook, qdisc, NULL);
+
+	req->tc.tcm_parent = OPTS_GET(hook, parent, TC_H_ROOT);
+	req->tc.tcm_handle = OPTS_GET(hook, handle, 0);
+
+	return nlattr_add(req, TCA_KIND, qdisc, strlen(qdisc) + 1);
+}
+
 static int attach_point_to_config(struct bpf_tc_hook *hook,
 				  qdisc_config_t *config)
 {
@@ -552,6 +562,9 @@ static int attach_point_to_config(struct bpf_tc_hook *hook,
 		return 0;
 	case BPF_TC_CUSTOM:
 		return -EOPNOTSUPP;
+	case BPF_TC_QDISC:
+		*config = &qdisc_config;
+		return 0;
 	default:
 		return -EINVAL;
 	}
@@ -596,7 +609,7 @@ static int tc_qdisc_modify(struct bpf_tc_hook *hook, int cmd, int flags)
 	req.tc.tcm_family = AF_UNSPEC;
 	req.tc.tcm_ifindex = OPTS_GET(hook, ifindex, 0);
 
-	ret = config(&req);
+	ret = config(&req, hook);
 	if (ret < 0)
 		return ret;
 
@@ -639,6 +652,7 @@ int bpf_tc_hook_destroy(struct bpf_tc_hook *hook)
 	case BPF_TC_INGRESS:
 	case BPF_TC_EGRESS:
 		return libbpf_err(__bpf_tc_detach(hook, NULL, true));
+	case BPF_TC_QDISC:
 	case BPF_TC_INGRESS | BPF_TC_EGRESS:
 		return libbpf_err(tc_qdisc_delete(hook));
 	case BPF_TC_CUSTOM:

From patchwork Wed Mar 19 21:53:56 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 14023206
X-Patchwork-Delegate: bpf@iogearbox.net
Received: from mail-pl1-f169.google.com (mail-pl1-f169.google.com [209.85.214.169])
        (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
        (No client certificate requested)
        by smtp.subspace.kernel.org (Postfix) with ESMTPS id E196C220686;
        Wed, 19 Mar 2025 21:54:20 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.169
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
        t=1742421262; cv=none;
        b=iqkfIjVdIMi3ixtg0WcBJBZ4+7pLL7UBLSFEO+E6Mu4TQWXcl7tZ3tuJXxbBCIeYEzwl3hAPwW2Hk11EA6euCXBeUrxL3ZaM+X0Dt0vNGc2+OVVBOSNX2HXrTKzFG0pX2M3hm6udhWC5X5EP/+zhxzhWhjTH/xd6WwkG+xuRMyc=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
        s=arc-20240116; t=1742421262; c=relaxed/simple;
        bh=7oCmsNZ8473D2gAN3hzXmHGHBnHahfxBN0/mybaCK0g=;
        h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
        MIME-Version:Content-Type;
b=epXGobFjA4uleQ927sgaeG8vk+8vbspely7KCVMHAqLuySsEYqMOTagF8IeSF4WRbPFeQnHZH/CRYUdYaj3P8Dj6s7fMplJ2GE10aJKOGvevdk4L/yMrQP/4gbOPO/2VNY1JKuy/I6SnPMZN1raO0XDq6FC3B76vRTZljunuZsM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=HdR7SGM/; arc=none smtp.client-ip=209.85.214.169 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="HdR7SGM/" Received: by mail-pl1-f169.google.com with SMTP id d9443c01a7336-22438c356c8so770725ad.1; Wed, 19 Mar 2025 14:54:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1742421260; x=1743026060; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=KChyYDIA43QzHoDj7a8SpWlMHUhtVWuxUFkLRAF6kHM=; b=HdR7SGM/aYOczB42cUglIJxblMKchViwJbmbyO/tE2e7QCVOpbJ4clVY7p9WkDxAjt Xg8uLwXeHKRkkLV+vcgxBByaySl+89ZorXlBhkTXDDJCiGGNi27YNyEbGwIlBbewRwNJ lJTK47aiqdX9YpDwIySs5G61LfZp5/QYUBe407kN+6MzPbICnFeKLRxGwog/ygC0U7Vo PjcB86m2TYoLqbUWlKTUHBsA7iDLCOXlYnjVxQl2a2C32v7hwQiAYIRndgUuw7A2RiH4 glqJLYArvm77Jrj9gNT8yTHz9kxwc1Vyg8FRnJybAN6WFQimq3ghcw+Hg5i+DTPor3VW IdrQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1742421260; x=1743026060; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=KChyYDIA43QzHoDj7a8SpWlMHUhtVWuxUFkLRAF6kHM=; b=P7NP/XDXlwqYgk6PK72Dl9C8vffovu4G7ACe7UY5iOVpqiFAvh34sqo+iIGweSzbtz 
Kxq4B+70oz8bcWWt4IVbFRcLv2NUG8tVxM8RqccouKslre099cERwKLyfLc6zTz9GN/+ pArO9NbGHfQK0H3ygdgICeVD6X3x4dETM0WlpW4UPcxgzwRwl4SvQNSUdCreyWbRjL7l nQV67TkZAuSllOfunSfPMHwlgfJfxjni09/01StnIBG72WMrnnM+E+0QKi9YMVM7VXtZ QA1iGvaWvg87yVJdqMSQmG+Kh5N3mK18UQ6uF1MVrz5X53yddESLwWBo4dJo43/BOlRv YiwQ== X-Gm-Message-State: AOJu0YxFdXTwYFSsrHQ6onc/emyNoXXMpBzkLxhwQeGkVYGol0t61wzP AluzLVJnxz8Js9evj/c0LhjEcNAYFY5Xf2sO5PQ8nrmQCwzf7JiMnTw5XJwYCjs= X-Gm-Gg: ASbGnctBLjlY3u/0Qi3WvPKZVQd99C4DJx9mTuKi+8T+OPCEK25CGHGxWABenvaDJki sim7jLvDBHdCS8G5+/Wnp81aYYest9T4Gcm7lbPvNfWymM8Zbn850rKNGASOEhVLpFnPtTusWw3 SdCFi0cUXNDrP2XG1/YUPdycGSiu5mBzJfiuEB5icpSeuYGDnUn+BZFyZ7pmgJ85bhijts9Hwns 7GKVMIn/KOCdxWWYpfH/5K1A/FGt+8vRSXAeDG40bBTRiq3vtEjP52cnBdhY3l32LZRoomHCJkJ WylkuUOuXri2D1WGFksttjtyqAOLf+0CtKMyiiKqw5YZRFeuabUTff+NcZ2Bm8KNjEmByUbjSzx choyg3ZtntSFvBmXxrUE= X-Google-Smtp-Source: AGHT+IER4Wb+7sjIVeTNHgIin1FmwnOgqp5vlAgr+64pnKxRson9Vf15ODLw0A5H/Inu7ZpJd3u3VQ== X-Received: by 2002:a05:6a20:2d23:b0:1f5:5b2a:f641 with SMTP id adf61e73a8af0-1fd116ff312mr1438602637.28.1742421259961; Wed, 19 Mar 2025 14:54:19 -0700 (PDT) Received: from localhost.localdomain (c-76-146-13-146.hsd1.wa.comcast.net. 
        [76.146.13.146]) by smtp.gmail.com with ESMTPSA id
        d2e1a72fcca58-737116b0e8asm12175596b3a.158.2025.03.19.14.54.18
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Wed, 19 Mar 2025 14:54:19 -0700 (PDT)
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org,
        alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org,
        edumazet@google.com, xiyou.wangcong@gmail.com, jhs@mojatatu.com,
        sinquersw@gmail.com, toke@redhat.com, juntong.deng@outlook.com,
        jiri@resnulli.us, stfomichev@gmail.com,
        ekarani.silvestre@ccc.ufcg.edu.br, yangpeihao@sjtu.edu.cn,
        yepeilin.cs@gmail.com, ameryhung@gmail.com, kernel-team@meta.com
Subject: [PATCH bpf-next v6 09/11] selftests/bpf: Add a basic fifo qdisc test
Date: Wed, 19 Mar 2025 14:53:56 -0700
Message-ID: <20250319215358.2287371-10-ameryhung@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250319215358.2287371-1-ameryhung@gmail.com>
References: <20250319215358.2287371-1-ameryhung@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Patchwork-Delegate: bpf@iogearbox.net

From: Amery Hung

This selftest includes a bare minimum fifo qdisc, which simply enqueues
sk_buffs into the back of a bpf list and dequeues from the front of the
list.
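
For illustration only, and not part of this patch: the queueing behaviour
described above can be modelled in plain userspace C. This is a sketch under
stated assumptions. struct pkt, struct fifo, fifo_enqueue(), fifo_dequeue()
and FIFO_LIMIT are hypothetical names; the model uses no BPF kfuncs, spin
locks or sk_buffs and only mirrors the tail-insert, head-remove and
limit-check logic.

```c
#include <stddef.h>

/* Hypothetical cap for the model; the selftest itself sets sch->limit = 1000. */
#define FIFO_LIMIT 4

/* Stand-in for an sk_buff: only the fields the model needs. */
struct pkt {
	int id;
	unsigned int len;
	struct pkt *next;
};

/* Stand-in for the qdisc state: a singly linked list plus counters. */
struct fifo {
	struct pkt *head;
	struct pkt *tail;
	unsigned int qlen;	/* packets queued */
	unsigned int backlog;	/* bytes queued */
};

/* Append at the tail; return -1 (caller drops) once the limit is reached. */
static int fifo_enqueue(struct fifo *q, struct pkt *p)
{
	if (q->qlen == FIFO_LIMIT)
		return -1;

	p->next = NULL;
	if (q->tail)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;

	q->qlen++;
	q->backlog += p->len;
	return 0;
}

/* Remove from the head; return NULL when the queue is empty. */
static struct pkt *fifo_dequeue(struct fifo *q)
{
	struct pkt *p = q->head;

	if (!p)
		return NULL;

	q->head = p->next;
	if (!q->head)
		q->tail = NULL;
	p->next = NULL;

	q->qlen--;
	q->backlog -= p->len;
	return p;
}
```

A caller would zero-initialise a struct fifo, enqueue packets in arrival
order, and expect fifo_dequeue() to hand them back in that same order. As in
the description above, a packet offered to a full queue is rejected rather
than displacing an older one.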

Signed-off-by: Amery Hung
Acked-by: Toke Høiland-Jørgensen
---
 tools/testing/selftests/bpf/config            |   1 +
 .../selftests/bpf/prog_tests/bpf_qdisc.c      |  81 ++++++++++++
 .../selftests/bpf/progs/bpf_qdisc_common.h    |  29 +++++
 .../selftests/bpf/progs/bpf_qdisc_fifo.c      | 119 ++++++++++++++++++
 4 files changed, 230 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_qdisc_common.h
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_qdisc_fifo.c

diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config
index c378d5d07e02..6b0cab55bd2d 100644
--- a/tools/testing/selftests/bpf/config
+++ b/tools/testing/selftests/bpf/config
@@ -71,6 +71,7 @@ CONFIG_NET_IPGRE=y
 CONFIG_NET_IPGRE_DEMUX=y
 CONFIG_NET_IPIP=y
 CONFIG_NET_MPLS_GSO=y
+CONFIG_NET_SCH_BPF=y
 CONFIG_NET_SCH_FQ=y
 CONFIG_NET_SCH_INGRESS=y
 CONFIG_NET_SCHED=y
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c b/tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c
new file mode 100644
index 000000000000..1ec321eb089f
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c
@@ -0,0 +1,81 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include
+#include
+#include
+
+#include "network_helpers.h"
+#include "bpf_qdisc_fifo.skel.h"
+
+#define LO_IFINDEX 1
+
+static const unsigned int total_bytes = 10 * 1024 * 1024;
+
+static void do_test(char *qdisc)
+{
+	DECLARE_LIBBPF_OPTS(bpf_tc_hook, hook, .ifindex = LO_IFINDEX,
+			    .attach_point = BPF_TC_QDISC,
+			    .parent = TC_H_ROOT,
+			    .handle = 0x8000000,
+			    .qdisc = qdisc);
+	int srv_fd = -1, cli_fd = -1;
+	int err;
+
+	err = bpf_tc_hook_create(&hook);
+	if (!ASSERT_OK(err, "attach qdisc"))
+		return;
+
+	srv_fd = start_server(AF_INET6, SOCK_STREAM, NULL, 0, 0);
+	if (!ASSERT_OK_FD(srv_fd, "start server"))
+		goto done;
+
+	cli_fd = connect_to_fd(srv_fd, 0);
+	if (!ASSERT_OK_FD(cli_fd, "connect to client"))
+		goto done;
+
+	err = send_recv_data(srv_fd, cli_fd, total_bytes);
+	ASSERT_OK(err, "send_recv_data");
+
+done:
+	if (srv_fd != -1)
+		close(srv_fd);
+	if (cli_fd != -1)
+		close(cli_fd);
+
+	bpf_tc_hook_destroy(&hook);
+}
+
+static void test_fifo(void)
+{
+	struct bpf_qdisc_fifo *fifo_skel;
+	struct bpf_link *link;
+
+	fifo_skel = bpf_qdisc_fifo__open_and_load();
+	if (!ASSERT_OK_PTR(fifo_skel, "bpf_qdisc_fifo__open_and_load"))
+		return;
+
+	link = bpf_map__attach_struct_ops(fifo_skel->maps.fifo);
+	if (!ASSERT_OK_PTR(link, "bpf_map__attach_struct_ops")) {
+		bpf_qdisc_fifo__destroy(fifo_skel);
+		return;
+	}
+
+	do_test("bpf_fifo");
+
+	bpf_link__destroy(link);
+	bpf_qdisc_fifo__destroy(fifo_skel);
+}
+
+void test_bpf_qdisc(void)
+{
+	struct netns_obj *netns;
+
+	netns = netns_new("bpf_qdisc_ns", true);
+	if (!ASSERT_OK_PTR(netns, "netns_new"))
+		return;
+
+	if (test__start_subtest("fifo"))
+		test_fifo();
+
+	netns_free(netns);
+}
diff --git a/tools/testing/selftests/bpf/progs/bpf_qdisc_common.h b/tools/testing/selftests/bpf/progs/bpf_qdisc_common.h
new file mode 100644
index 000000000000..24a83cdec7cd
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/bpf_qdisc_common.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _BPF_QDISC_COMMON_H
+#define _BPF_QDISC_COMMON_H
+
+#define NET_XMIT_SUCCESS 0x00
+#define NET_XMIT_DROP 0x01 /* skb dropped */
+#define NET_XMIT_CN 0x02 /* congestion notification */
+
+#define TC_PRIO_CONTROL 7
+#define TC_PRIO_MAX 15
+
+u32 bpf_skb_get_hash(struct sk_buff *p) __ksym;
+void bpf_kfree_skb(struct sk_buff *p) __ksym;
+void bpf_qdisc_skb_drop(struct sk_buff *p, struct bpf_sk_buff_ptr *to_free) __ksym;
+void bpf_qdisc_watchdog_schedule(struct Qdisc *sch, u64 expire, u64 delta_ns) __ksym;
+void bpf_qdisc_bstats_update(struct Qdisc *sch, const struct sk_buff *skb) __ksym;
+
+static struct qdisc_skb_cb *qdisc_skb_cb(const struct sk_buff *skb)
+{
+	return (struct qdisc_skb_cb *)skb->cb;
+}
+
+static inline unsigned int qdisc_pkt_len(const struct sk_buff *skb)
+{
+	return qdisc_skb_cb(skb)->pkt_len;
+}
+
+#endif
diff --git a/tools/testing/selftests/bpf/progs/bpf_qdisc_fifo.c b/tools/testing/selftests/bpf/progs/bpf_qdisc_fifo.c
new file mode 100644
index 000000000000..a42024ce6c30
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/bpf_qdisc_fifo.c
@@ -0,0 +1,119 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include
+#include "bpf_experimental.h"
+#include "bpf_qdisc_common.h"
+
+char _license[] SEC("license") = "GPL";
+
+struct skb_node {
+	struct sk_buff __kptr * skb;
+	struct bpf_list_node node;
+};
+
+#define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8)))
+
+private(A) struct bpf_spin_lock q_fifo_lock;
+private(A) struct bpf_list_head q_fifo __contains(skb_node, node);
+
+SEC("struct_ops/bpf_fifo_enqueue")
+int BPF_PROG(bpf_fifo_enqueue, struct sk_buff *skb, struct Qdisc *sch,
+	     struct bpf_sk_buff_ptr *to_free)
+{
+	struct skb_node *skbn;
+	u32 pkt_len;
+
+	if (sch->q.qlen == sch->limit)
+		goto drop;
+
+	skbn = bpf_obj_new(typeof(*skbn));
+	if (!skbn)
+		goto drop;
+
+	pkt_len = qdisc_pkt_len(skb);
+
+	sch->q.qlen++;
+	skb = bpf_kptr_xchg(&skbn->skb, skb);
+	if (skb)
+		bpf_qdisc_skb_drop(skb, to_free);
+
+	bpf_spin_lock(&q_fifo_lock);
+	bpf_list_push_back(&q_fifo, &skbn->node);
+	bpf_spin_unlock(&q_fifo_lock);
+
+	sch->qstats.backlog += pkt_len;
+	return NET_XMIT_SUCCESS;
+drop:
+	bpf_qdisc_skb_drop(skb, to_free);
+	return NET_XMIT_DROP;
+}
+
+SEC("struct_ops/bpf_fifo_dequeue")
+struct sk_buff *BPF_PROG(bpf_fifo_dequeue, struct Qdisc *sch)
+{
+	struct bpf_list_node *node;
+	struct sk_buff *skb = NULL;
+	struct skb_node *skbn;
+
+	bpf_spin_lock(&q_fifo_lock);
+	node = bpf_list_pop_front(&q_fifo);
+	bpf_spin_unlock(&q_fifo_lock);
+	if (!node)
+		return NULL;
+
+	skbn = container_of(node, struct skb_node, node);
+	skb = bpf_kptr_xchg(&skbn->skb, skb);
+	bpf_obj_drop(skbn);
+	if (!skb)
+		return NULL;
+
+	sch->qstats.backlog -= qdisc_pkt_len(skb);
+	bpf_qdisc_bstats_update(sch, skb);
+	sch->q.qlen--;
+
+	return skb;
+}
+
+SEC("struct_ops/bpf_fifo_init")
+int BPF_PROG(bpf_fifo_init, struct Qdisc *sch, struct nlattr *opt,
+	     struct netlink_ext_ack *extack)
+{
+	sch->limit = 1000;
+	return 0;
+}
+
+SEC("struct_ops/bpf_fifo_reset")
+void BPF_PROG(bpf_fifo_reset, struct Qdisc *sch)
+{
+	struct bpf_list_node *node;
+	struct skb_node *skbn;
+	int i;
+
+	bpf_for(i, 0, sch->q.qlen) {
+		struct sk_buff *skb = NULL;
+
+		bpf_spin_lock(&q_fifo_lock);
+		node = bpf_list_pop_front(&q_fifo);
+		bpf_spin_unlock(&q_fifo_lock);
+
+		if (!node)
+			break;
+
+		skbn = container_of(node, struct skb_node, node);
+		skb = bpf_kptr_xchg(&skbn->skb, skb);
+		if (skb)
+			bpf_kfree_skb(skb);
+		bpf_obj_drop(skbn);
+	}
+	sch->q.qlen = 0;
+}
+
+SEC(".struct_ops")
+struct Qdisc_ops fifo = {
+	.enqueue = (void *)bpf_fifo_enqueue,
+	.dequeue = (void *)bpf_fifo_dequeue,
+	.init = (void *)bpf_fifo_init,
+	.reset = (void *)bpf_fifo_reset,
+	.id = "bpf_fifo",
+};
+

From patchwork Wed Mar 19 21:53:57 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 14023207
X-Patchwork-Delegate: bpf@iogearbox.net
Received: from mail-pl1-f178.google.com (mail-pl1-f178.google.com [209.85.214.178])
        (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
        (No client certificate requested)
        by smtp.subspace.kernel.org (Postfix) with ESMTPS id 51E97221543;
        Wed, 19 Mar 2025 21:54:22 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.178
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
        t=1742421264; cv=none;
        b=UOeSf3WgXbfmFmXYK2lzcc2b6spF69N+edqq00/Pc72bFVXmK42WtgrtHHayF+6QwoflVaoG2jOIsKDcjafp83TL3mvUduGq6zwmhkMQuo9o9TwcvBxUoz3F/7WRcCUreNsI28WiFupv2usCH6z9ipH8prKreP3LR3qem2VlA5w=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
        s=arc-20240116; t=1742421264; c=relaxed/simple;
        bh=8OLVn7hu1v5cstJbQMd6XNeQZdEFIb2Pt0G3MQbk5lA=;
h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Z7276p5DvjQEKYHCLs9BGw035RjiwsLlzyKdcGYhBepQdKsuezDXaiKII3Cm5Yx7Nh/olVWzVKOMP2pOpsckvL0CD5kdvsZya5uBS49zwVNt5T8NGGAo8ota5GF61+GnhlomjlcpprW5csSBFgnx1A/DVvMXsFGAQGSAl4yDbNg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=gA5tNiD4; arc=none smtp.client-ip=209.85.214.178 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="gA5tNiD4" Received: by mail-pl1-f178.google.com with SMTP id d9443c01a7336-2240b4de12bso979315ad.2; Wed, 19 Mar 2025 14:54:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1742421261; x=1743026061; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=PT85SIs/JkVmClXb4qMjRuR3aJ0Rmu+Fmom+AM2i/6Q=; b=gA5tNiD4RCuaBP3rHlf6E0YtRnH5R5zV6y+FoQwt3FmWE7QWkB5146xJSxhnRcbdvm JdEbqSkbOO4dL5XTvcgZwLPjk0dxr6gEEv1Iy2V/fpBkhnMfdUkUVOMaoPvId3EB6+bZ O/4iOWw9LijhBKD6Ug26SMhjNHv4h48+W6QUyRN5eKdnPFXn7PR34I1vCe6AKdSpspOm fiHL1vdl0zqbrWXbUgMe/qwg/sz5i3vfNNP0hd/0MjfKshzIEsnYKO3qr6GEyCTGboAy 7H/U7HpQjF4BL4GzYXux3HvxFATPqpaSC3qvgcvz+ZB6tRPBDbEm676pcxaGl8wxQdr/ XeFw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1742421261; x=1743026061; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=PT85SIs/JkVmClXb4qMjRuR3aJ0Rmu+Fmom+AM2i/6Q=; 
b=GYLb4haAXnt5wkT+IKtbF5cTPvo2qnJm5JFMf7Mn5sDMWmhDR7YDF8kuHDOxlK3gSR HErr+dpFHAk7rMcRfDrd0+ACysVG2JKJLINg9sRjfQS5/BKhn4/qcJ6EcQ5MNxrjS7FL 2ejA/ApcUng51RJyGl+guoKfIpRk/jDfKxGx5S8TSMebX8pd/wAC2thPSfRi50dFFoe4 svoK/UtkohtHDLp/J/P1ddjrPxB3py91nmodBSl2zJ/Y9qIti5fQhP8+wG0W2rxCP3Ia +knmRgf1oK7guUJOU/01EklEHE/v/A4LN3JpqhhULn4eGZdTpMtq+8oVR6gAt7zHL0FT u0SA== X-Gm-Message-State: AOJu0YxAXHPiBrTNhXJjNfZ3SB6OK1/3SA2oCevcjQLQ1IJsawdU7L/L spvFjW++MZxgZTBe+ESir18N9VfbPIJSAVmfFRLVxCLBCFXofs8hRDF3NZyFxc0= X-Gm-Gg: ASbGncvU+u1lNP2jiFJd+ALya/mIK0eb8VHBCMws8Zysv1u9UIAm0t3/DrPhoxTE3q1 EtI6LNKHhTFJ4RIoUIdqD4aALAM+N6mJLzSWt9oMrK/DdFC8wZZPwB6L08KVPt0bXoZLm7oansU wFpusNadCpViE16zx12DlIZDU/sQRFJMPMAavmQYUxDnbPgYmi8MSjfJQ0ryuWNoTrSZ9iNtDbT fPKUsJWjXRwHbWEL/tmh53RiuuISt2SrQ5yIsaUVQ1VTZNjmSg8zcqLThP+hYtkYHwP4Ke4fTPW LP04YuWBxmtn5EzbTgKMUz5AlBavwBbvhVIiKVR5XufJSjwnKN9Lfy/PTjfUpfIQ9vtSjtetvXu yAcFGpu1XJ5I8LxBcGvU= X-Google-Smtp-Source: AGHT+IHrwLEuUyMZxaTrIl7RJD8svbu/+7mMNk+wefABxRFPo3MQ8FKzsKxGyx0z6RZF5rHiynXEPw== X-Received: by 2002:a05:6a00:4651:b0:736:2a73:6756 with SMTP id d2e1a72fcca58-7376d6ff504mr7270788b3a.21.1742421261196; Wed, 19 Mar 2025 14:54:21 -0700 (PDT) Received: from localhost.localdomain (c-76-146-13-146.hsd1.wa.comcast.net. 
        [76.146.13.146]) by smtp.gmail.com with ESMTPSA id
        d2e1a72fcca58-737116b0e8asm12175596b3a.158.2025.03.19.14.54.20
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Wed, 19 Mar 2025 14:54:20 -0700 (PDT)
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org,
        alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org,
        edumazet@google.com, xiyou.wangcong@gmail.com, jhs@mojatatu.com,
        sinquersw@gmail.com, toke@redhat.com, juntong.deng@outlook.com,
        jiri@resnulli.us, stfomichev@gmail.com,
        ekarani.silvestre@ccc.ufcg.edu.br, yangpeihao@sjtu.edu.cn,
        yepeilin.cs@gmail.com, ameryhung@gmail.com, kernel-team@meta.com
Subject: [PATCH bpf-next v6 10/11] selftests/bpf: Add a bpf fq qdisc to selftest
Date: Wed, 19 Mar 2025 14:53:57 -0700
Message-ID: <20250319215358.2287371-11-ameryhung@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250319215358.2287371-1-ameryhung@gmail.com>
References: <20250319215358.2287371-1-ameryhung@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Patchwork-Delegate: bpf@iogearbox.net

From: Amery Hung

This test implements a more sophisticated qdisc using bpf. The bpf
fair-queueing (fq) qdisc gives each flow an equal chance to transmit
data. It also respects the timestamp of skb for rate limiting.

Signed-off-by: Amery Hung
Acked-by: Toke Høiland-Jørgensen
---
 .../selftests/bpf/prog_tests/bpf_qdisc.c      |  24 +
 .../selftests/bpf/progs/bpf_qdisc_fq.c        | 752 ++++++++++++++++++
 2 files changed, 776 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_qdisc_fq.c

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c b/tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c
index 1ec321eb089f..230d8f935303 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c
@@ -6,6 +6,7 @@
 
 #include "network_helpers.h"
 #include "bpf_qdisc_fifo.skel.h"
+#include "bpf_qdisc_fq.skel.h"
 
 #define LO_IFINDEX 1
 
@@ -66,6 +67,27 @@ static void test_fifo(void)
 	bpf_qdisc_fifo__destroy(fifo_skel);
 }
 
+static void test_fq(void)
+{
+	struct bpf_qdisc_fq *fq_skel;
+	struct bpf_link *link;
+
+	fq_skel = bpf_qdisc_fq__open_and_load();
+	if (!ASSERT_OK_PTR(fq_skel, "bpf_qdisc_fq__open_and_load"))
+		return;
+
+	link = bpf_map__attach_struct_ops(fq_skel->maps.fq);
+	if (!ASSERT_OK_PTR(link, "bpf_map__attach_struct_ops")) {
+		bpf_qdisc_fq__destroy(fq_skel);
+		return;
+	}
+
+	do_test("bpf_fq");
+
+	bpf_link__destroy(link);
+	bpf_qdisc_fq__destroy(fq_skel);
+}
+
 void test_bpf_qdisc(void)
 {
 	struct netns_obj *netns;
@@ -76,6 +98,8 @@ void test_bpf_qdisc(void)
 
 	if (test__start_subtest("fifo"))
 		test_fifo();
+	if (test__start_subtest("fq"))
+		test_fq();
 
 	netns_free(netns);
 }
diff --git a/tools/testing/selftests/bpf/progs/bpf_qdisc_fq.c b/tools/testing/selftests/bpf/progs/bpf_qdisc_fq.c
new file mode 100644
index 000000000000..663cc155fb92
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/bpf_qdisc_fq.c
@@ -0,0 +1,752 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* bpf_fq is intended for testing the bpf qdisc infrastructure and not a direct
+ * copy of sch_fq. bpf_fq implements the scheduling algorithm of sch_fq before
+ * 29f834aa326e ("net_sched: sch_fq: add 3 bands and WRR scheduling") was
+ * introduced. It gives each flow a fair chance to transmit packets in a
+ * round-robin fashion. Note that for flow pacing, bpf_fq currently only
+ * respects skb->tstamp but not skb->sk->sk_pacing_rate. In addition, if there
+ * are multiple bpf_fq instances, they will have a shared view of flows and
+ * configuration since some key data structure such as fq_prio_flows,
+ * fq_nonprio_flows, and fq_bpf_data are global.
+ *
+ * To use bpf_fq alone without running selftests, use the following commands.
+ *
+ * 1. Register bpf_fq to the kernel
+ *     bpftool struct_ops register bpf_qdisc_fq.bpf.o /sys/fs/bpf
+ * 2. Add bpf_fq to an interface
+ *     tc qdisc add dev root handle bpf_fq
+ * 3. Delete bpf_fq attached to the interface
+ *     tc qdisc delete dev root
+ * 4. Unregister bpf_fq
+ *     bpftool struct_ops unregister name fq
+ *
+ * The qdisc name, bpf_fq, used in tc commands is defined by Qdisc_ops.id.
+ * The struct_ops_map_name, fq, used in the bpftool command is the name of the
+ * Qdisc_ops.
+ *
+ * SEC(".struct_ops")
+ * struct Qdisc_ops fq = {
+ *         ...
+ *         .id = "bpf_fq",
+ * };
+ */
+
+#include
+#include
+#include
+#include "bpf_experimental.h"
+#include "bpf_qdisc_common.h"
+
+char _license[] SEC("license") = "GPL";
+
+#define NSEC_PER_USEC 1000L
+#define NSEC_PER_SEC 1000000000L
+
+#define NUM_QUEUE (1 << 20)
+
+struct fq_bpf_data {
+	u32 quantum;
+	u32 initial_quantum;
+	u32 flow_refill_delay;
+	u32 flow_plimit;
+	u64 horizon;
+	u32 orphan_mask;
+	u32 timer_slack;
+	u64 time_next_delayed_flow;
+	u64 unthrottle_latency_ns;
+	u8 horizon_drop;
+	u32 new_flow_cnt;
+	u32 old_flow_cnt;
+	u64 ktime_cache;
+};
+
+enum {
+	CLS_RET_PRIO = 0,
+	CLS_RET_NONPRIO = 1,
+	CLS_RET_ERR = 2,
+};
+
+struct skb_node {
+	u64 tstamp;
+	struct sk_buff __kptr * skb;
+	struct bpf_rb_node node;
+};
+
+struct fq_flow_node {
+	int credit;
+	u32 qlen;
+	u64 age;
+	u64 time_next_packet;
+	struct bpf_list_node list_node;
+	struct bpf_rb_node rb_node;
+	struct bpf_rb_root queue __contains(skb_node, node);
+	struct bpf_spin_lock lock;
+	struct bpf_refcount refcount;
+};
+
+struct dequeue_nonprio_ctx {
+	bool stop_iter;
+	u64 expire;
+	u64 now;
+};
+
+struct remove_flows_ctx {
+	bool gc_only;
+	u32 reset_cnt;
+	u32 reset_max;
+};
+
+struct unset_throttled_flows_ctx {
+	bool unset_all;
+	u64 now;
+};
+
+struct fq_stashed_flow {
+	struct fq_flow_node __kptr * flow;
+};
+
+struct {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__type(key, __u64);
+	__type(value, struct fq_stashed_flow);
+	__uint(max_entries, NUM_QUEUE);
+} fq_nonprio_flows SEC(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__type(key, __u64);
+	__type(value, struct fq_stashed_flow);
+	__uint(max_entries, 1);
+} fq_prio_flows SEC(".maps");
+
+#define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8)))
+
+private(A) struct bpf_spin_lock fq_delayed_lock;
+private(A) struct bpf_rb_root fq_delayed __contains(fq_flow_node, rb_node);
+
+private(B) struct bpf_spin_lock fq_new_flows_lock;
+private(B) struct bpf_list_head fq_new_flows __contains(fq_flow_node, list_node);
+
+private(C) struct bpf_spin_lock fq_old_flows_lock;
+private(C) struct bpf_list_head fq_old_flows __contains(fq_flow_node, list_node);
+
+private(D) struct fq_bpf_data q;
+
+/* Wrapper for bpf_kptr_xchg that expects NULL dst */
+static void bpf_kptr_xchg_back(void *map_val, void *ptr)
+{
+	void *ret;
+
+	ret = bpf_kptr_xchg(map_val, ptr);
+	if (ret)
+		bpf_obj_drop(ret);
+}
+
+static bool skbn_tstamp_less(struct bpf_rb_node *a, const struct bpf_rb_node *b)
+{
+	struct skb_node *skbn_a;
+	struct skb_node *skbn_b;
+
+	skbn_a = container_of(a, struct skb_node, node);
+	skbn_b = container_of(b, struct skb_node, node);
+
+	return skbn_a->tstamp < skbn_b->tstamp;
+}
+
+static bool fn_time_next_packet_less(struct bpf_rb_node *a, const struct bpf_rb_node *b)
+{
+	struct fq_flow_node *flow_a;
+	struct fq_flow_node *flow_b;
+
+	flow_a = container_of(a, struct fq_flow_node, rb_node);
+	flow_b = container_of(b, struct fq_flow_node, rb_node);
+
+	return flow_a->time_next_packet < flow_b->time_next_packet;
+}
+
+static void
+fq_flows_add_head(struct bpf_list_head *head, struct bpf_spin_lock *lock,
+		  struct fq_flow_node *flow, u32 *flow_cnt)
+{
+	bpf_spin_lock(lock);
+	bpf_list_push_front(head, &flow->list_node);
+	bpf_spin_unlock(lock);
+	*flow_cnt += 1;
+}
+
+static void
+fq_flows_add_tail(struct bpf_list_head *head, struct bpf_spin_lock *lock,
+		  struct fq_flow_node *flow, u32 *flow_cnt)
+{
+	bpf_spin_lock(lock);
+	bpf_list_push_back(head, &flow->list_node);
+	bpf_spin_unlock(lock);
+	*flow_cnt += 1;
+}
+
+static void
+fq_flows_remove_front(struct bpf_list_head *head, struct bpf_spin_lock *lock,
+		      struct bpf_list_node **node, u32 *flow_cnt)
+{
+	bpf_spin_lock(lock);
+	*node = bpf_list_pop_front(head);
+	bpf_spin_unlock(lock);
+	*flow_cnt -= 1;
+}
+
+static bool
+fq_flows_is_empty(struct bpf_list_head *head, struct bpf_spin_lock *lock)
+{
+	struct bpf_list_node *node;
+
+	bpf_spin_lock(lock);
+	node = bpf_list_pop_front(head);
+	if (node) {
+		bpf_list_push_front(head, node);
+		bpf_spin_unlock(lock);
+		return false;
+	}
+	bpf_spin_unlock(lock);
+
+	return true;
+}
+
+/* flow->age is used to denote the state of the flow (not-detached, detached, throttled)
+ * as well as the timestamp when the flow is detached.
+ *
+ * 0: not-detached
+ * 1 - (~0ULL-1): detached
+ * ~0ULL: throttled
+ */
+static void fq_flow_set_detached(struct fq_flow_node *flow)
+{
+	flow->age = bpf_jiffies64();
+}
+
+static bool fq_flow_is_detached(struct fq_flow_node *flow)
+{
+	return flow->age != 0 && flow->age != ~0ULL;
+}
+
+static bool sk_listener(struct sock *sk)
+{
+	return (1 << sk->__sk_common.skc_state) & (TCPF_LISTEN | TCPF_NEW_SYN_RECV);
+}
+
+static void fq_gc(void);
+
+static int fq_new_flow(void *flow_map, struct fq_stashed_flow **sflow, u64 hash)
+{
+	struct fq_stashed_flow tmp = {};
+	struct fq_flow_node *flow;
+	int ret;
+
+	flow = bpf_obj_new(typeof(*flow));
+	if (!flow)
+		return -ENOMEM;
+
+	flow->credit = q.initial_quantum,
+	flow->qlen = 0,
+	flow->age = 1,
+	flow->time_next_packet = 0,
+
+	ret = bpf_map_update_elem(flow_map, &hash, &tmp, 0);
+	if (ret == -ENOMEM || ret == -E2BIG) {
+		fq_gc();
+		bpf_map_update_elem(&fq_nonprio_flows, &hash, &tmp, 0);
+	}
+
+	*sflow = bpf_map_lookup_elem(flow_map, &hash);
+	if (!*sflow) {
+		bpf_obj_drop(flow);
+		return -ENOMEM;
+	}
+
+	bpf_kptr_xchg_back(&(*sflow)->flow, flow);
+	return 0;
+}
+
+static int
+fq_classify(struct sk_buff *skb, struct fq_stashed_flow **sflow)
+{
+	struct sock *sk = skb->sk;
+	int ret = CLS_RET_NONPRIO;
+	u64 hash = 0;
+
+	if ((skb->priority & TC_PRIO_MAX) == TC_PRIO_CONTROL) {
+		*sflow = bpf_map_lookup_elem(&fq_prio_flows, &hash);
+		ret = CLS_RET_PRIO;
+	} else {
+		if (!sk ||
sk_listener(sk)) { + hash = bpf_skb_get_hash(skb) & q.orphan_mask; + /* Avoid collision with an existing flow hash, which + * only uses the lower 32 bits of hash, by setting the + * upper half of hash to 1. + */ + hash |= (1ULL << 32); + } else if (sk->__sk_common.skc_state == TCP_CLOSE) { + hash = bpf_skb_get_hash(skb) & q.orphan_mask; + hash |= (1ULL << 32); + } else { + hash = sk->__sk_common.skc_hash; + } + *sflow = bpf_map_lookup_elem(&fq_nonprio_flows, &hash); + } + + if (!*sflow) + ret = fq_new_flow(&fq_nonprio_flows, sflow, hash) < 0 ? + CLS_RET_ERR : CLS_RET_NONPRIO; + + return ret; +} + +static bool fq_packet_beyond_horizon(struct sk_buff *skb) +{ + return (s64)skb->tstamp > (s64)(q.ktime_cache + q.horizon); +} + +SEC("struct_ops/bpf_fq_enqueue") +int BPF_PROG(bpf_fq_enqueue, struct sk_buff *skb, struct Qdisc *sch, + struct bpf_sk_buff_ptr *to_free) +{ + struct fq_flow_node *flow = NULL, *flow_copy; + struct fq_stashed_flow *sflow; + u64 time_to_send, jiffies; + struct skb_node *skbn; + int ret; + + if (sch->q.qlen >= sch->limit) + goto drop; + + if (!skb->tstamp) { + time_to_send = q.ktime_cache = bpf_ktime_get_ns(); + } else { + if (fq_packet_beyond_horizon(skb)) { + q.ktime_cache = bpf_ktime_get_ns(); + if (fq_packet_beyond_horizon(skb)) { + if (q.horizon_drop) + goto drop; + + skb->tstamp = q.ktime_cache + q.horizon; + } + } + time_to_send = skb->tstamp; + } + + ret = fq_classify(skb, &sflow); + if (ret == CLS_RET_ERR) + goto drop; + + flow = bpf_kptr_xchg(&sflow->flow, flow); + if (!flow) + goto drop; + + if (ret == CLS_RET_NONPRIO) { + if (flow->qlen >= q.flow_plimit) { + bpf_kptr_xchg_back(&sflow->flow, flow); + goto drop; + } + + if (fq_flow_is_detached(flow)) { + flow_copy = bpf_refcount_acquire(flow); + + jiffies = bpf_jiffies64(); + if ((s64)(jiffies - (flow_copy->age + q.flow_refill_delay)) > 0) { + if (flow_copy->credit < q.quantum) + flow_copy->credit = q.quantum; + } + flow_copy->age = 0; + fq_flows_add_tail(&fq_new_flows, 
&fq_new_flows_lock, flow_copy, + &q.new_flow_cnt); + } + } + + skbn = bpf_obj_new(typeof(*skbn)); + if (!skbn) { + bpf_kptr_xchg_back(&sflow->flow, flow); + goto drop; + } + + skbn->tstamp = skb->tstamp = time_to_send; + + sch->qstats.backlog += qdisc_pkt_len(skb); + + skb = bpf_kptr_xchg(&skbn->skb, skb); + if (skb) + bpf_qdisc_skb_drop(skb, to_free); + + bpf_spin_lock(&flow->lock); + bpf_rbtree_add(&flow->queue, &skbn->node, skbn_tstamp_less); + bpf_spin_unlock(&flow->lock); + + flow->qlen++; + bpf_kptr_xchg_back(&sflow->flow, flow); + + sch->q.qlen++; + return NET_XMIT_SUCCESS; + +drop: + bpf_qdisc_skb_drop(skb, to_free); + sch->qstats.drops++; + return NET_XMIT_DROP; +} + +static int fq_unset_throttled_flows(u32 index, struct unset_throttled_flows_ctx *ctx) +{ + struct bpf_rb_node *node = NULL; + struct fq_flow_node *flow; + + bpf_spin_lock(&fq_delayed_lock); + + node = bpf_rbtree_first(&fq_delayed); + if (!node) { + bpf_spin_unlock(&fq_delayed_lock); + return 1; + } + + flow = container_of(node, struct fq_flow_node, rb_node); + if (!ctx->unset_all && flow->time_next_packet > ctx->now) { + q.time_next_delayed_flow = flow->time_next_packet; + bpf_spin_unlock(&fq_delayed_lock); + return 1; + } + + node = bpf_rbtree_remove(&fq_delayed, &flow->rb_node); + + bpf_spin_unlock(&fq_delayed_lock); + + if (!node) + return 1; + + flow = container_of(node, struct fq_flow_node, rb_node); + flow->age = 0; + fq_flows_add_tail(&fq_old_flows, &fq_old_flows_lock, flow, &q.old_flow_cnt); + + return 0; +} + +static void fq_flow_set_throttled(struct fq_flow_node *flow) +{ + flow->age = ~0ULL; + + if (q.time_next_delayed_flow > flow->time_next_packet) + q.time_next_delayed_flow = flow->time_next_packet; + + bpf_spin_lock(&fq_delayed_lock); + bpf_rbtree_add(&fq_delayed, &flow->rb_node, fn_time_next_packet_less); + bpf_spin_unlock(&fq_delayed_lock); +} + +static void fq_check_throttled(u64 now) +{ + struct unset_throttled_flows_ctx ctx = { + .unset_all = false, + .now = now, + }; + 
unsigned long sample; + + if (q.time_next_delayed_flow > now) + return; + + sample = (unsigned long)(now - q.time_next_delayed_flow); + q.unthrottle_latency_ns -= q.unthrottle_latency_ns >> 3; + q.unthrottle_latency_ns += sample >> 3; + + q.time_next_delayed_flow = ~0ULL; + bpf_loop(NUM_QUEUE, fq_unset_throttled_flows, &ctx, 0); +} + +static struct sk_buff* +fq_dequeue_nonprio_flows(u32 index, struct dequeue_nonprio_ctx *ctx) +{ + u64 time_next_packet, time_to_send; + struct bpf_rb_node *rb_node; + struct sk_buff *skb = NULL; + struct bpf_list_head *head; + struct bpf_list_node *node; + struct bpf_spin_lock *lock; + struct fq_flow_node *flow; + struct skb_node *skbn; + bool is_empty; + u32 *cnt; + + if (q.new_flow_cnt) { + head = &fq_new_flows; + lock = &fq_new_flows_lock; + cnt = &q.new_flow_cnt; + } else if (q.old_flow_cnt) { + head = &fq_old_flows; + lock = &fq_old_flows_lock; + cnt = &q.old_flow_cnt; + } else { + if (q.time_next_delayed_flow != ~0ULL) + ctx->expire = q.time_next_delayed_flow; + goto break_loop; + } + + fq_flows_remove_front(head, lock, &node, cnt); + if (!node) + goto break_loop; + + flow = container_of(node, struct fq_flow_node, list_node); + if (flow->credit <= 0) { + flow->credit += q.quantum; + fq_flows_add_tail(&fq_old_flows, &fq_old_flows_lock, flow, &q.old_flow_cnt); + return NULL; + } + + bpf_spin_lock(&flow->lock); + rb_node = bpf_rbtree_first(&flow->queue); + if (!rb_node) { + bpf_spin_unlock(&flow->lock); + is_empty = fq_flows_is_empty(&fq_old_flows, &fq_old_flows_lock); + if (head == &fq_new_flows && !is_empty) { + fq_flows_add_tail(&fq_old_flows, &fq_old_flows_lock, flow, &q.old_flow_cnt); + } else { + fq_flow_set_detached(flow); + bpf_obj_drop(flow); + } + return NULL; + } + + skbn = container_of(rb_node, struct skb_node, node); + time_to_send = skbn->tstamp; + + time_next_packet = (time_to_send > flow->time_next_packet) ? 
+ time_to_send : flow->time_next_packet; + if (ctx->now < time_next_packet) { + bpf_spin_unlock(&flow->lock); + flow->time_next_packet = time_next_packet; + fq_flow_set_throttled(flow); + return NULL; + } + + rb_node = bpf_rbtree_remove(&flow->queue, rb_node); + bpf_spin_unlock(&flow->lock); + + if (!rb_node) + goto add_flow_and_break; + + skbn = container_of(rb_node, struct skb_node, node); + skb = bpf_kptr_xchg(&skbn->skb, skb); + bpf_obj_drop(skbn); + + if (!skb) + goto add_flow_and_break; + + flow->credit -= qdisc_skb_cb(skb)->pkt_len; + flow->qlen--; + +add_flow_and_break: + fq_flows_add_head(head, lock, flow, cnt); + +break_loop: + ctx->stop_iter = true; + return skb; +} + +static struct sk_buff *fq_dequeue_prio(void) +{ + struct fq_flow_node *flow = NULL; + struct fq_stashed_flow *sflow; + struct bpf_rb_node *rb_node; + struct sk_buff *skb = NULL; + struct skb_node *skbn; + u64 hash = 0; + + sflow = bpf_map_lookup_elem(&fq_prio_flows, &hash); + if (!sflow) + return NULL; + + flow = bpf_kptr_xchg(&sflow->flow, flow); + if (!flow) + return NULL; + + bpf_spin_lock(&flow->lock); + rb_node = bpf_rbtree_first(&flow->queue); + if (!rb_node) { + bpf_spin_unlock(&flow->lock); + goto out; + } + + skbn = container_of(rb_node, struct skb_node, node); + rb_node = bpf_rbtree_remove(&flow->queue, &skbn->node); + bpf_spin_unlock(&flow->lock); + + if (!rb_node) + goto out; + + skbn = container_of(rb_node, struct skb_node, node); + skb = bpf_kptr_xchg(&skbn->skb, skb); + bpf_obj_drop(skbn); + +out: + bpf_kptr_xchg_back(&sflow->flow, flow); + + return skb; +} + +SEC("struct_ops/bpf_fq_dequeue") +struct sk_buff *BPF_PROG(bpf_fq_dequeue, struct Qdisc *sch) +{ + struct dequeue_nonprio_ctx cb_ctx = {}; + struct sk_buff *skb = NULL; + int i; + + if (!sch->q.qlen) + goto out; + + skb = fq_dequeue_prio(); + if (skb) + goto dequeue; + + q.ktime_cache = cb_ctx.now = bpf_ktime_get_ns(); + fq_check_throttled(q.ktime_cache); + bpf_for(i, 0, sch->limit) { + skb = 
fq_dequeue_nonprio_flows(i, &cb_ctx); + if (cb_ctx.stop_iter) + break; + }; + + if (skb) { +dequeue: + sch->q.qlen--; + sch->qstats.backlog -= qdisc_pkt_len(skb); + bpf_qdisc_bstats_update(sch, skb); + return skb; + } + + if (cb_ctx.expire) + bpf_qdisc_watchdog_schedule(sch, cb_ctx.expire, q.timer_slack); +out: + return NULL; +} + +static int fq_remove_flows_in_list(u32 index, void *ctx) +{ + struct bpf_list_node *node; + struct fq_flow_node *flow; + + bpf_spin_lock(&fq_new_flows_lock); + node = bpf_list_pop_front(&fq_new_flows); + bpf_spin_unlock(&fq_new_flows_lock); + if (!node) { + bpf_spin_lock(&fq_old_flows_lock); + node = bpf_list_pop_front(&fq_old_flows); + bpf_spin_unlock(&fq_old_flows_lock); + if (!node) + return 1; + } + + flow = container_of(node, struct fq_flow_node, list_node); + bpf_obj_drop(flow); + + return 0; +} + +extern unsigned CONFIG_HZ __kconfig; + +/* limit number of collected flows per round */ +#define FQ_GC_MAX 8 +#define FQ_GC_AGE (3*CONFIG_HZ) + +static bool fq_gc_candidate(struct fq_flow_node *flow) +{ + u64 jiffies = bpf_jiffies64(); + + return fq_flow_is_detached(flow) && + ((s64)(jiffies - (flow->age + FQ_GC_AGE)) > 0); +} + +static int +fq_remove_flows(struct bpf_map *flow_map, u64 *hash, + struct fq_stashed_flow *sflow, struct remove_flows_ctx *ctx) +{ + if (sflow->flow && + (!ctx->gc_only || fq_gc_candidate(sflow->flow))) { + bpf_map_delete_elem(flow_map, hash); + ctx->reset_cnt++; + } + + return ctx->reset_cnt < ctx->reset_max ? 
0 : 1; +} + +static void fq_gc(void) +{ + struct remove_flows_ctx cb_ctx = { + .gc_only = true, + .reset_cnt = 0, + .reset_max = FQ_GC_MAX, + }; + + bpf_for_each_map_elem(&fq_nonprio_flows, fq_remove_flows, &cb_ctx, 0); +} + +SEC("struct_ops/bpf_fq_reset") +void BPF_PROG(bpf_fq_reset, struct Qdisc *sch) +{ + struct unset_throttled_flows_ctx utf_ctx = { + .unset_all = true, + }; + struct remove_flows_ctx rf_ctx = { + .gc_only = false, + .reset_cnt = 0, + .reset_max = NUM_QUEUE, + }; + struct fq_stashed_flow *sflow; + u64 hash = 0; + + sch->q.qlen = 0; + sch->qstats.backlog = 0; + + bpf_for_each_map_elem(&fq_nonprio_flows, fq_remove_flows, &rf_ctx, 0); + + rf_ctx.reset_cnt = 0; + bpf_for_each_map_elem(&fq_prio_flows, fq_remove_flows, &rf_ctx, 0); + fq_new_flow(&fq_prio_flows, &sflow, hash); + + bpf_loop(NUM_QUEUE, fq_remove_flows_in_list, NULL, 0); + q.new_flow_cnt = 0; + q.old_flow_cnt = 0; + + bpf_loop(NUM_QUEUE, fq_unset_throttled_flows, &utf_ctx, 0); +} + +SEC("struct_ops/bpf_fq_init") +int BPF_PROG(bpf_fq_init, struct Qdisc *sch, struct nlattr *opt, + struct netlink_ext_ack *extack) +{ + struct net_device *dev = sch->dev_queue->dev; + u32 psched_mtu = dev->mtu + dev->hard_header_len; + struct fq_stashed_flow *sflow; + u64 hash = 0; + + if (fq_new_flow(&fq_prio_flows, &sflow, hash) < 0) + return -ENOMEM; + + sch->limit = 10000; + q.initial_quantum = 10 * psched_mtu; + q.quantum = 2 * psched_mtu; + q.flow_refill_delay = 40; + q.flow_plimit = 100; + q.horizon = 10ULL * NSEC_PER_SEC; + q.horizon_drop = 1; + q.orphan_mask = 1024 - 1; + q.timer_slack = 10 * NSEC_PER_USEC; + q.time_next_delayed_flow = ~0ULL; + q.unthrottle_latency_ns = 0ULL; + q.new_flow_cnt = 0; + q.old_flow_cnt = 0; + + return 0; +} + +SEC(".struct_ops") +struct Qdisc_ops fq = { + .enqueue = (void *)bpf_fq_enqueue, + .dequeue = (void *)bpf_fq_dequeue, + .reset = (void *)bpf_fq_reset, + .init = (void *)bpf_fq_init, + .id = "bpf_fq", +}; From patchwork Wed Mar 19 21:53:58 2025 Content-Type: text/plain; 
 charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Amery Hung
X-Patchwork-Id: 14023208
X-Patchwork-Delegate: bpf@iogearbox.net
From: Amery Hung
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org,
 alexei.starovoitov@gmail.com, martin.lau@kernel.org, kuba@kernel.org,
 edumazet@google.com, xiyou.wangcong@gmail.com, jhs@mojatatu.com,
 sinquersw@gmail.com, toke@redhat.com, juntong.deng@outlook.com,
 jiri@resnulli.us, stfomichev@gmail.com, ekarani.silvestre@ccc.ufcg.edu.br,
 yangpeihao@sjtu.edu.cn, yepeilin.cs@gmail.com, ameryhung@gmail.com,
 kernel-team@meta.com
Subject: [PATCH bpf-next v6 11/11] selftests/bpf: Test attaching bpf qdisc to mq and non root
Date: Wed, 19 Mar 2025 14:53:58 -0700
Message-ID: <20250319215358.2287371-12-ameryhung@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250319215358.2287371-1-ameryhung@gmail.com>
References: <20250319215358.2287371-1-ameryhung@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

Until we are certain that existing classful qdiscs work with bpf qdisc, make
sure we don't allow attaching a bpf qdisc to non root. Meanwhile, attaching
to mq is allowed.

Signed-off-by: Amery Hung
Acked-by: Toke Høiland-Jørgensen
---
 tools/testing/selftests/bpf/config                 |  1 +
 .../selftests/bpf/prog_tests/bpf_qdisc.c           | 75 +++++++++++++++++++
 2 files changed, 76 insertions(+)

diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config
index 6b0cab55bd2d..3201a962b3dc 100644
--- a/tools/testing/selftests/bpf/config
+++ b/tools/testing/selftests/bpf/config
@@ -74,6 +74,7 @@ CONFIG_NET_MPLS_GSO=y
 CONFIG_NET_SCH_BPF=y
 CONFIG_NET_SCH_FQ=y
 CONFIG_NET_SCH_INGRESS=y
+CONFIG_NET_SCH_HTB=y
 CONFIG_NET_SCHED=y
 CONFIG_NETDEVSIM=y
 CONFIG_NETFILTER=y
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c b/tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c
index 230d8f935303..c9a54177c84e 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_qdisc.c
@@ -88,6 +88,77 @@ static void test_fq(void)
 	bpf_qdisc_fq__destroy(fq_skel);
 }
 
+static void test_qdisc_attach_to_mq(void)
+{
+	DECLARE_LIBBPF_OPTS(bpf_tc_hook, hook,
+			    .attach_point = BPF_TC_QDISC,
+			    .parent = TC_H_MAKE(1 << 16, 1),
+			    .handle = 0x11 << 16,
+			    .qdisc = "bpf_fifo");
+	struct bpf_qdisc_fifo *fifo_skel;
+	struct bpf_link *link;
+	int err;
+
+	fifo_skel = bpf_qdisc_fifo__open_and_load();
+	if (!ASSERT_OK_PTR(fifo_skel, "bpf_qdisc_fifo__open_and_load"))
+		return;
+
+	link = bpf_map__attach_struct_ops(fifo_skel->maps.fifo);
+	if (!ASSERT_OK_PTR(link, "bpf_map__attach_struct_ops")) {
+		bpf_qdisc_fifo__destroy(fifo_skel);
+		return;
+	}
+
+	SYS(out, "ip link add veth0 type veth peer veth1");
+	hook.ifindex = if_nametoindex("veth0");
+	SYS(out, "tc qdisc add dev veth0 root handle 1: mq");
+
+	err = bpf_tc_hook_create(&hook);
+	ASSERT_OK(err, "attach qdisc");
+
+	bpf_tc_hook_destroy(&hook);
+
+	SYS(out, "tc qdisc delete dev veth0 root mq");
+out:
+	bpf_link__destroy(link);
+	bpf_qdisc_fifo__destroy(fifo_skel);
+}
+
+static void test_qdisc_attach_to_non_root(void)
+{
+	DECLARE_LIBBPF_OPTS(bpf_tc_hook, hook, .ifindex = LO_IFINDEX,
+			    .attach_point = BPF_TC_QDISC,
+			    .parent = TC_H_MAKE(1 << 16, 1),
+			    .handle = 0x11 << 16,
+			    .qdisc = "bpf_fifo");
+	struct bpf_qdisc_fifo *fifo_skel;
+	struct bpf_link *link;
+	int err;
+
+	fifo_skel = bpf_qdisc_fifo__open_and_load();
+	if (!ASSERT_OK_PTR(fifo_skel, "bpf_qdisc_fifo__open_and_load"))
+		return;
+
+	link = bpf_map__attach_struct_ops(fifo_skel->maps.fifo);
+	if (!ASSERT_OK_PTR(link, "bpf_map__attach_struct_ops")) {
+		bpf_qdisc_fifo__destroy(fifo_skel);
+		return;
+	}
+
+	SYS(out, "tc qdisc add dev lo root handle 1: htb");
+	SYS(out_del_htb, "tc class add dev lo parent 1: classid 1:1 htb rate 75Kbit");
+
+	err = bpf_tc_hook_create(&hook);
+	if (!ASSERT_ERR(err, "attach qdisc"))
+		bpf_tc_hook_destroy(&hook);
+
+out_del_htb:
+	SYS(out, "tc qdisc delete dev lo root htb");
+out:
+	bpf_link__destroy(link);
+	bpf_qdisc_fifo__destroy(fifo_skel);
+}
+
 void test_bpf_qdisc(void)
 {
 	struct netns_obj *netns;
@@ -100,6 +171,10 @@ void test_bpf_qdisc(void)
 		test_fifo();
 	if (test__start_subtest("fq"))
 		test_fq();
+	if (test__start_subtest("attach to mq"))
+		test_qdisc_attach_to_mq();
+	if (test__start_subtest("attach to non root"))
+		test_qdisc_attach_to_non_root();
 
 	netns_free(netns);
 }
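
Note on the handles used by the two new subtests: both pass
.parent = TC_H_MAKE(1 << 16, 1) and .handle = 0x11 << 16, which in tc
notation should correspond to "parent 1:1 handle 11:". In the mq case 1:1
is a child slot of the mq root (handle 1:), while in the HTB case it is the
class created by "classid 1:1". The sketch below only illustrates that
arithmetic in plain userspace C. The DEMO_* macros are local copies modelled
on the TC_H_* helpers in <linux/pkt_sched.h>, and the demo_* functions are
hypothetical names introduced here; none of this is part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Local stand-ins for TC_H_MAJ_MASK, TC_H_MIN_MASK and TC_H_MAKE, renamed so
 * the sketch builds even where the kernel UAPI headers are not installed.
 */
#define DEMO_TC_H_MAJ_MASK 0xFFFF0000U
#define DEMO_TC_H_MIN_MASK 0x0000FFFFU
#define DEMO_TC_H_MAKE(maj, min) \
	(((maj) & DEMO_TC_H_MAJ_MASK) | ((min) & DEMO_TC_H_MIN_MASK))

/* Major number: the upper 16 bits, shown before the colon. */
static uint32_t demo_handle_major(uint32_t h)
{
	return (h & DEMO_TC_H_MAJ_MASK) >> 16;
}

/* Minor number: the lower 16 bits, shown after the colon. */
static uint32_t demo_handle_minor(uint32_t h)
{
	return h & DEMO_TC_H_MIN_MASK;
}

/* Render a handle as hex "major:minor", leaving the minor part empty when it
 * is zero (the usual way a qdisc handle such as "11:" is written).
 */
static const char *demo_handle_str(uint32_t h, char *buf, size_t len)
{
	if (demo_handle_minor(h))
		snprintf(buf, len, "%x:%x", (unsigned int)demo_handle_major(h),
			 (unsigned int)demo_handle_minor(h));
	else
		snprintf(buf, len, "%x:", (unsigned int)demo_handle_major(h));
	return buf;
}
```

Calling demo_handle_str() on DEMO_TC_H_MAKE(1 << 16, 1) and on 0x11 << 16
shows which parent and handle the hook in each subtest refers to; the major
and minor parts are printed in hex, so 0x11 appears as "11:".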