From patchwork Wed Feb 3 04:16:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12063369 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7A538C433E6 for ; Wed, 3 Feb 2021 04:17:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 32A5664E4D for ; Wed, 3 Feb 2021 04:17:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232636AbhBCERo (ORCPT ); Tue, 2 Feb 2021 23:17:44 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36562 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232274AbhBCERg (ORCPT ); Tue, 2 Feb 2021 23:17:36 -0500 Received: from mail-oi1-x22c.google.com (mail-oi1-x22c.google.com [IPv6:2607:f8b0:4864:20::22c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 72CFEC06174A; Tue, 2 Feb 2021 20:16:56 -0800 (PST) Received: by mail-oi1-x22c.google.com with SMTP id k25so25379454oik.13; Tue, 02 Feb 2021 20:16:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=5HRWSbLtPDag7SiNqYpgFvHK80P7pmdpmcgHM2bBEUw=; b=fs0M2UAX9+eW3VgxKgpghsMg3kB6UpHz0gUcL+wzjrAeKgj0pfNXi/ZryWb39mWE2i 
md4OGSsXyNXFryTstkF06LoXrSPmQ2qgCTYQTa6HL36wptCJq9EYuVbVUih8ZUrfOLBm b82nYl/9BLdqmIWPAXa3m8M/n48ZAdDH3/Zv61lzX27J/LyPd7EOAAbpRatFfFuqz7Rt 3UhGVXPCFWKjqj5Rykk3z//Q3sESHyRNxK1dPTjPr6JUqKZTrx1OjKTQlA2QcCStBeEB gp0czW4cIf1S1WKU5gKrHMfnIRhjDPnGTU8ubrEus+zSxB4ZnTLI3kBXZlIXg+HzevTj tAdQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=5HRWSbLtPDag7SiNqYpgFvHK80P7pmdpmcgHM2bBEUw=; b=cdndaVmSEAK7r3uoYmhZ0fzAh6fpJ0FDje0Wv1q6B+pe7mfSWxanKuRDdveM9/A+Tk mckRRoaK0ki+I+IUUrGn+b4Un713ahWOrkxwZtZuDmRU83M7UyGbImqosfzaSIA14CcB N2URshoTPtWNw13KAfXEjKsDoYJQukdLNE+pbZEvF3RLTUNPDcBCL8MsgA/9nZo0qRnM YEFGb24jqgJsm0Sxh+cwqf6YuaTLGNRWDRhh5ndi0GXf1Cj6kGInzQTtbHJQtv6zvWQN SLYthlrjmXkgf0Z1soa97WKxl3UGwe+rnQt/A5LdlzbbdEn0Z+mgbLnIEBSSFQ5Ceb6e 5VHA== X-Gm-Message-State: AOAM530ZuYFisg3YPMsJ6FGH2488YUym0Ahrz89qh6n8lVARCZT33bA+ c2N/vtLbgaEuk2ne1rFfWljEfnfA6F1UOA== X-Google-Smtp-Source: ABdhPJz2XhEJ8vp4kK1mKNUekKl0O/HIqh5Ksh/ptQpUwh5FRKJLQOnsQ0VOv+kZ+8rLvTd1T8K7/w== X-Received: by 2002:aca:de06:: with SMTP id v6mr831034oig.60.1612325815153; Tue, 02 Feb 2021 20:16:55 -0800 (PST) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:90c4:ffea:6079:8a0c]) by smtp.gmail.com with ESMTPSA id s10sm209978ool.35.2021.02.02.20.16.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Feb 2021 20:16:54 -0800 (PST) From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 01/19] bpf: rename BPF_STREAM_PARSER to BPF_SOCK_MAP Date: Tue, 2 Feb 2021 20:16:18 -0800 Message-Id: <20210203041636.38555-2-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: 
<20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang Before we add non-TCP support, BPF_STREAM_PARSER needs to be renamed, as it will no longer be specific to TCP and does not have to be a parser either. This patch renames BPF_STREAM_PARSER to BPF_SOCK_MAP so that sock_map.c can hopefully become protocol-independent. It also improves the Kconfig description to avoid confusion. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- include/linux/bpf.h | 4 ++-- include/linux/bpf_types.h | 2 +- include/net/tcp.h | 4 ++-- include/net/udp.h | 4 ++-- net/Kconfig | 13 ++++++------- net/core/Makefile | 2 +- net/ipv4/Makefile | 2 +- net/ipv4/tcp_bpf.c | 4 ++-- 8 files changed, 17 insertions(+), 18 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 321966fc35db..b5af6a4e9927 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1771,7 +1771,7 @@ static inline void bpf_map_offload_map_free(struct bpf_map *map) } #endif /* CONFIG_NET && CONFIG_BPF_SYSCALL */ -#if defined(CONFIG_BPF_STREAM_PARSER) +#if defined(CONFIG_BPF_SOCK_MAP) int sock_map_prog_update(struct bpf_map *map, struct bpf_prog *prog, struct bpf_prog *old, u32 which); int sock_map_get_from_fd(const union bpf_attr *attr, struct bpf_prog *prog); @@ -1804,7 +1804,7 @@ static inline int sock_map_update_elem_sys(struct bpf_map *map, void *key, void { return -EOPNOTSUPP; } -#endif /* CONFIG_BPF_STREAM_PARSER */ +#endif /* CONFIG_BPF_SOCK_MAP */ #if defined(CONFIG_INET) && defined(CONFIG_BPF_SYSCALL) void bpf_sk_reuseport_detach(struct sock *sk); diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h index 99f7fd657d87..6e27726ae578 100644 --- a/include/linux/bpf_types.h +++ b/include/linux/bpf_types.h @@ -103,7 +103,7 @@ BPF_MAP_TYPE(BPF_MAP_TYPE_HASH_OF_MAPS, htab_of_maps_map_ops) 
BPF_MAP_TYPE(BPF_MAP_TYPE_DEVMAP, dev_map_ops) BPF_MAP_TYPE(BPF_MAP_TYPE_DEVMAP_HASH, dev_map_hash_ops) BPF_MAP_TYPE(BPF_MAP_TYPE_SK_STORAGE, sk_storage_map_ops) -#if defined(CONFIG_BPF_STREAM_PARSER) +#if defined(CONFIG_BPF_SOCK_MAP) BPF_MAP_TYPE(BPF_MAP_TYPE_SOCKMAP, sock_map_ops) BPF_MAP_TYPE(BPF_MAP_TYPE_SOCKHASH, sock_hash_ops) #endif diff --git a/include/net/tcp.h b/include/net/tcp.h index 4bb42fb19711..be66571ad122 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -2207,14 +2207,14 @@ void tcp_update_ulp(struct sock *sk, struct proto *p, struct sk_msg; struct sk_psock; -#ifdef CONFIG_BPF_STREAM_PARSER +#ifdef CONFIG_BPF_SOCK_MAP struct proto *tcp_bpf_get_proto(struct sock *sk, struct sk_psock *psock); void tcp_bpf_clone(const struct sock *sk, struct sock *newsk); #else static inline void tcp_bpf_clone(const struct sock *sk, struct sock *newsk) { } -#endif /* CONFIG_BPF_STREAM_PARSER */ +#endif /* CONFIG_BPF_SOCK_MAP */ #ifdef CONFIG_NET_SOCK_MSG int tcp_bpf_sendmsg_redir(struct sock *sk, struct sk_msg *msg, u32 bytes, diff --git a/include/net/udp.h b/include/net/udp.h index 877832bed471..0ff921e6b866 100644 --- a/include/net/udp.h +++ b/include/net/udp.h @@ -511,9 +511,9 @@ static inline struct sk_buff *udp_rcv_segment(struct sock *sk, return segs; } -#ifdef CONFIG_BPF_STREAM_PARSER +#ifdef CONFIG_BPF_SOCK_MAP struct sk_psock; struct proto *udp_bpf_get_proto(struct sock *sk, struct sk_psock *psock); -#endif /* BPF_STREAM_PARSER */ +#endif /* CONFIG_BPF_SOCK_MAP */ #endif /* _UDP_H */ diff --git a/net/Kconfig b/net/Kconfig index f4c32d982af6..0cc0805a8127 100644 --- a/net/Kconfig +++ b/net/Kconfig @@ -305,20 +305,19 @@ config BPF_JIT /proc/sys/net/core/bpf_jit_harden (optional) /proc/sys/net/core/bpf_jit_kallsyms (optional) -config BPF_STREAM_PARSER - bool "enable BPF STREAM_PARSER" +config BPF_SOCK_MAP + bool "enable BPF socket maps" depends on INET depends on BPF_SYSCALL depends on CGROUP_BPF select STREAM_PARSER select NET_SOCK_MSG help - Enabling 
this allows a stream parser to be used with - BPF_MAP_TYPE_SOCKMAP. + Enabling this allows skb parser and verdict to be used with + BPF_MAP_TYPE_SOCKMAP or BPF_MAP_TYPE_SOCKHASH. - BPF_MAP_TYPE_SOCKMAP provides a map type to use with network sockets. - It can be used to enforce socket policy, implement socket redirects, - etc. + This provides a BPF map type to use with network sockets. It can + be used to enforce socket policy, implement socket redirects, etc. config NET_FLOW_LIMIT bool diff --git a/net/core/Makefile b/net/core/Makefile index 3e2c378e5f31..e7c1bdaadefd 100644 --- a/net/core/Makefile +++ b/net/core/Makefile @@ -28,7 +28,7 @@ obj-$(CONFIG_CGROUP_NET_PRIO) += netprio_cgroup.o obj-$(CONFIG_CGROUP_NET_CLASSID) += netclassid_cgroup.o obj-$(CONFIG_LWTUNNEL) += lwtunnel.o obj-$(CONFIG_LWTUNNEL_BPF) += lwt_bpf.o -obj-$(CONFIG_BPF_STREAM_PARSER) += sock_map.o +obj-$(CONFIG_BPF_SOCK_MAP) += sock_map.o obj-$(CONFIG_DST_CACHE) += dst_cache.o obj-$(CONFIG_HWBM) += hwbm.o obj-$(CONFIG_NET_DEVLINK) += devlink.o diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile index 5b77a46885b9..f72f84d1b982 100644 --- a/net/ipv4/Makefile +++ b/net/ipv4/Makefile @@ -62,7 +62,7 @@ obj-$(CONFIG_TCP_CONG_LP) += tcp_lp.o obj-$(CONFIG_TCP_CONG_YEAH) += tcp_yeah.o obj-$(CONFIG_TCP_CONG_ILLINOIS) += tcp_illinois.o obj-$(CONFIG_NET_SOCK_MSG) += tcp_bpf.o -obj-$(CONFIG_BPF_STREAM_PARSER) += udp_bpf.o +obj-$(CONFIG_BPF_SOCK_MAP) += udp_bpf.o obj-$(CONFIG_NETLABEL) += cipso_ipv4.o obj-$(CONFIG_XFRM) += xfrm4_policy.o xfrm4_state.o xfrm4_input.o \ diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index bc7d2a586e18..2252f1d90676 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -229,7 +229,7 @@ int tcp_bpf_sendmsg_redir(struct sock *sk, struct sk_msg *msg, } EXPORT_SYMBOL_GPL(tcp_bpf_sendmsg_redir); -#ifdef CONFIG_BPF_STREAM_PARSER +#ifdef CONFIG_BPF_SOCK_MAP static bool tcp_bpf_stream_read(const struct sock *sk) { struct sk_psock *psock; @@ -629,4 +629,4 @@ void 
tcp_bpf_clone(const struct sock *sk, struct sock *newsk) if (prot == &tcp_bpf_prots[family][TCP_BPF_BASE]) newsk->sk_prot = sk->sk_prot_creator; } -#endif /* CONFIG_BPF_STREAM_PARSER */ +#endif /* CONFIG_BPF_SOCK_MAP */ From patchwork Wed Feb 3 04:16:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12063371 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E21F5C43381 for ; Wed, 3 Feb 2021 04:17:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9E7A564F67 for ; Wed, 3 Feb 2021 04:17:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232657AbhBCERt (ORCPT ); Tue, 2 Feb 2021 23:17:49 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36564 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232617AbhBCERj (ORCPT ); Tue, 2 Feb 2021 23:17:39 -0500 Received: from mail-oi1-x22f.google.com (mail-oi1-x22f.google.com [IPv6:2607:f8b0:4864:20::22f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6B421C0613D6; Tue, 2 Feb 2021 20:16:57 -0800 (PST) Received: by mail-oi1-x22f.google.com with SMTP id k25so25379504oik.13; Tue, 02 Feb 2021 20:16:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; 
h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=QuxR8jr/6XGuWWyDoFOLUavtPH9v/4WH/NrMUIXz4tg=; b=qAJiezNkwFImSrEXjXp0t6oE/jNVXwJpuOhP1PR+nc7jEH1F8bE0diKR3zefu2bLPg pqMnHn5X0fuzuPcozi5Gu4WAKn2wRb04xy+5vDrnDR0pLN3ycF4hlxl0Z4BkPXWHcvU0 XenXvmin+EkNHkDJrIhJtzTiOeWFolEa6GJr9D/c+MaHlhQcwi8T0FHwefiLOeYEUSd/ hiVm2Ezurbwguhh0+UrQy00+VlRuGdbAbSt04knahUAXY/r/kKcz1P8mOk9b4siWlceJ KUyjDmAIwjEJc8cruALC3cuYGaqnExm5ENFrnn1k8IEiGP44jw4QKhIdJgfM8LlUMhpz 7ezg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=QuxR8jr/6XGuWWyDoFOLUavtPH9v/4WH/NrMUIXz4tg=; b=rfMfGKGt+NZhherPPoCDYecBVn1G0OqG/JsfOs3DYoyDz3PWIrJIERDo45PNLJ3lPz 4yZYc5yF5Yf6pziO5EiezQksT9qoxtLUD7b+bEqUXQbqEwGJlR0YT+VcsXfLFzojOac8 Yl4bhMIeDAjU+7+Uh9fdNTVSpapPpUEMLIkLFBmXVpBKWSnv9JU01p7ecVLiGoLV1wpz JDqN+45fjRiTo1ChntH4ZUc/T0pq4c0/Xv0w9PJao5Xd3UIvY0Q1x358TFkL7uXdhBEL PqpWToQ8DvXp9rk7Dw10uXbPwVGUFOLgnq/grn1GCy0Xm10vtQ6EEXroSnEm87U5MHDv JD6Q== X-Gm-Message-State: AOAM533kRWygwf63urpiBdkbHxDdIJPSRH82TZFqdsn/wuvPy6iul7rZ d7CTKabeucbMZTUxIbJnHkn/Lx/qkxYbLQ== X-Google-Smtp-Source: ABdhPJxDx+LvVm517EZ0sNCGHpBIi12aiMRdUHVTukP9EfFGymF+SMcr0fsHmF9y6S95cYEQuEAkAQ== X-Received: by 2002:aca:508f:: with SMTP id e137mr781235oib.32.1612325816659; Tue, 02 Feb 2021 20:16:56 -0800 (PST) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:90c4:ffea:6079:8a0c]) by smtp.gmail.com with ESMTPSA id s10sm209978ool.35.2021.02.02.20.16.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Feb 2021 20:16:56 -0800 (PST) From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 02/19] skmsg: get rid of struct sk_psock_parser 
Date: Tue, 2 Feb 2021 20:16:19 -0800 Message-Id: <20210203041636.38555-3-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: <20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang struct sk_psock_parser is embedded in sk_psock, but the separate struct is unnecessary because skb verdict also uses ->saved_data_ready. We can simply fold its fields into sk_psock. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- include/linux/skmsg.h | 16 +++++------- net/core/skmsg.c | 58 ++++++++++++++++--------------------------- net/core/sock_map.c | 8 +++--- 3 files changed, 31 insertions(+), 51 deletions(-) diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h index 8edbbf5f2f93..56d641df3b0c 100644 --- a/include/linux/skmsg.h +++ b/include/linux/skmsg.h @@ -70,12 +70,6 @@ struct sk_psock_link { void *link_raw; }; -struct sk_psock_parser { - struct strparser strp; - bool enabled; - void (*saved_data_ready)(struct sock *sk); -}; - struct sk_psock_work_state { struct sk_buff *skb; u32 len; @@ -90,7 +84,8 @@ struct sk_psock { u32 eval; struct sk_msg *cork; struct sk_psock_progs progs; - struct sk_psock_parser parser; + struct strparser strp; + bool bpf_running; struct sk_buff_head ingress_skb; struct list_head ingress_msg; unsigned long state; @@ -100,6 +95,7 @@ struct sk_psock { void (*saved_unhash)(struct sock *sk); void (*saved_close)(struct sock *sk, long timeout); void (*saved_write_space)(struct sock *sk); + void (*saved_data_ready)(struct sock *sk); struct proto *sk_proto; struct sk_psock_work_state work_state; struct work_struct work; @@ -400,8 +396,8 @@ static inline void sk_psock_put(struct sock *sk, struct sk_psock *psock) static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock) { - if 
(psock->parser.enabled) - psock->parser.saved_data_ready(sk); + if (psock->bpf_running) + psock->saved_data_ready(sk); else sk->sk_data_ready(sk); } @@ -440,6 +436,6 @@ static inline bool sk_psock_strp_enabled(struct sk_psock *psock) { if (!psock) return false; - return psock->parser.enabled; + return psock->bpf_running; } #endif /* _LINUX_SKMSG_H */ diff --git a/net/core/skmsg.c b/net/core/skmsg.c index 1261512d6807..f72fcb03d25c 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -653,7 +653,7 @@ static void sk_psock_destroy_deferred(struct work_struct *gc) /* Parser has been stopped */ if (psock->progs.skb_parser) - strp_done(&psock->parser.strp); + strp_done(&psock->strp); cancel_work_sync(&psock->work); @@ -750,14 +750,6 @@ static int sk_psock_bpf_run(struct sk_psock *psock, struct bpf_prog *prog, return bpf_prog_run_pin_on_cpu(prog, skb); } -static struct sk_psock *sk_psock_from_strp(struct strparser *strp) -{ - struct sk_psock_parser *parser; - - parser = container_of(strp, struct sk_psock_parser, strp); - return container_of(parser, struct sk_psock, parser); -} - static void sk_psock_skb_redirect(struct sk_buff *skb) { struct sk_psock *psock_other; @@ -899,7 +891,7 @@ static int sk_psock_strp_read_done(struct strparser *strp, int err) static int sk_psock_strp_parse(struct strparser *strp, struct sk_buff *skb) { - struct sk_psock *psock = sk_psock_from_strp(strp); + struct sk_psock *psock = container_of(strp, struct sk_psock, strp); struct bpf_prog *prog; int ret = skb->len; @@ -923,10 +915,10 @@ static void sk_psock_strp_data_ready(struct sock *sk) psock = sk_psock(sk); if (likely(psock)) { if (tls_sw_has_ctx_rx(sk)) { - psock->parser.saved_data_ready(sk); + psock->saved_data_ready(sk); } else { write_lock_bh(&sk->sk_callback_lock); - strp_data_ready(&psock->parser.strp); + strp_data_ready(&psock->strp); write_unlock_bh(&sk->sk_callback_lock); } } @@ -1009,57 +1001,49 @@ int sk_psock_init_strp(struct sock *sk, struct sk_psock *psock) .parse_msg = 
sk_psock_strp_parse, }; - psock->parser.enabled = false; - return strp_init(&psock->parser.strp, sk, &cb); + psock->bpf_running = false; + return strp_init(&psock->strp, sk, &cb); } void sk_psock_start_verdict(struct sock *sk, struct sk_psock *psock) { - struct sk_psock_parser *parser = &psock->parser; - - if (parser->enabled) + if (psock->bpf_running) return; - parser->saved_data_ready = sk->sk_data_ready; + psock->saved_data_ready = sk->sk_data_ready; sk->sk_data_ready = sk_psock_verdict_data_ready; sk->sk_write_space = sk_psock_write_space; - parser->enabled = true; + psock->bpf_running = true; } void sk_psock_start_strp(struct sock *sk, struct sk_psock *psock) { - struct sk_psock_parser *parser = &psock->parser; - - if (parser->enabled) + if (psock->bpf_running) return; - parser->saved_data_ready = sk->sk_data_ready; + psock->saved_data_ready = sk->sk_data_ready; sk->sk_data_ready = sk_psock_strp_data_ready; sk->sk_write_space = sk_psock_write_space; - parser->enabled = true; + psock->bpf_running = true; } void sk_psock_stop_strp(struct sock *sk, struct sk_psock *psock) { - struct sk_psock_parser *parser = &psock->parser; - - if (!parser->enabled) + if (!psock->bpf_running) return; - sk->sk_data_ready = parser->saved_data_ready; - parser->saved_data_ready = NULL; - strp_stop(&parser->strp); - parser->enabled = false; + sk->sk_data_ready = psock->saved_data_ready; + psock->saved_data_ready = NULL; + strp_stop(&psock->strp); + psock->bpf_running = false; } void sk_psock_stop_verdict(struct sock *sk, struct sk_psock *psock) { - struct sk_psock_parser *parser = &psock->parser; - - if (!parser->enabled) + if (!psock->bpf_running) return; - sk->sk_data_ready = parser->saved_data_ready; - parser->saved_data_ready = NULL; - parser->enabled = false; + sk->sk_data_ready = psock->saved_data_ready; + psock->saved_data_ready = NULL; + psock->bpf_running = false; } diff --git a/net/core/sock_map.c b/net/core/sock_map.c index d758fb83c884..37ff8e13e4cc 100644 --- 
a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -148,9 +148,9 @@ static void sock_map_del_link(struct sock *sk, struct bpf_map *map = link->map; struct bpf_stab *stab = container_of(map, struct bpf_stab, map); - if (psock->parser.enabled && stab->progs.skb_parser) + if (psock->bpf_running && stab->progs.skb_parser) strp_stop = true; - if (psock->parser.enabled && stab->progs.skb_verdict) + if (psock->bpf_running && stab->progs.skb_verdict) verdict_stop = true; list_del(&link->list); sk_psock_free_link(link); @@ -283,14 +283,14 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs, goto out_drop; write_lock_bh(&sk->sk_callback_lock); - if (skb_parser && skb_verdict && !psock->parser.enabled) { + if (skb_parser && skb_verdict && !psock->bpf_running) { ret = sk_psock_init_strp(sk, psock); if (ret) goto out_unlock_drop; psock_set_prog(&psock->progs.skb_verdict, skb_verdict); psock_set_prog(&psock->progs.skb_parser, skb_parser); sk_psock_start_strp(sk, psock); - } else if (!skb_parser && skb_verdict && !psock->parser.enabled) { + } else if (!skb_parser && skb_verdict && !psock->bpf_running) { psock_set_prog(&psock->progs.skb_verdict, skb_verdict); sk_psock_start_verdict(sk,psock); } From patchwork Wed Feb 3 04:16:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12063373 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org 
(Postfix) with ESMTP id BC179C433E0 for ; Wed, 3 Feb 2021 04:17:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 78EF564F61 for ; Wed, 3 Feb 2021 04:17:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232644AbhBCERq (ORCPT ); Tue, 2 Feb 2021 23:17:46 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36572 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232619AbhBCERj (ORCPT ); Tue, 2 Feb 2021 23:17:39 -0500 Received: from mail-ot1-x335.google.com (mail-ot1-x335.google.com [IPv6:2607:f8b0:4864:20::335]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ECF04C0613ED; Tue, 2 Feb 2021 20:16:58 -0800 (PST) Received: by mail-ot1-x335.google.com with SMTP id d1so22108716otl.13; Tue, 02 Feb 2021 20:16:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=CfV1Fm31xMUSASHcYoqG39qAmbNZS+l3nJ5YNxvPQIA=; b=TSkNM8uNe8ESvviJRGqbIRHnF9Gvv+ayrvhUa74TKINc/yUpCFOTxQo2to5vopHZs6 bvUFUnL1o0FoZNYnJQTJXm+2jecwlQjqClz8dfmI/zC0/QfH8O4ZjoyxNxzx61F0Zpzf VwcmAtHFXNTUmtutVqKgSzEsef16923zdlDAyQY9f2Hs65Qojq0Wdk7FFNaQY0Mwe/hg 0m9Qse9jdBKzozgg7Zaf3933O6FwhDHBiFffwp8+QBxZCfwaOA8C1Bfkw4k//ARAxj04 iR7RxezJJC15PUXV26uJZTC80RihHNME/7XXeZxb84rAOPnTh1cCsdzwUqVHTLq7Rq6A H6Ww== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=CfV1Fm31xMUSASHcYoqG39qAmbNZS+l3nJ5YNxvPQIA=; b=eMuMG6K3VtG+dgcciL2TYH5ewgMWyXPedoN66tb8AB15Ca1F7Ie0WMCVOrKsoSahOQ iAW0MdM5RDLz7sphqJr4hF/INKZFtQ1SBPhbTN09oto5SHKq1c498oFQq4YcM6L5Kxh3 a8fMR6lJCC6Ss6H6+hcUldYhuJh8OFzrkci0hnS4Gj36wY7UFVRe5F1lybq3B1Nw39mg 
c/9nxvyB0xH1D8FmzHT5ULAVXKFit1q6Vz//sKNHxp5If9r4SOgbPkJNZXG3FVujB88n ikb2qNXqz3+BQQMyIuw6kssmsSgYMKced2anYYh6cDyXNMGYV3i6UwH8/CMULDJFQZaY 7Gyg== X-Gm-Message-State: AOAM530XyBvKLhhy+i9SAOykZjWAPsIymK5+j/4ZRkIz/9PsXnGzFM5V MuVHgn0GszMDYBF6Q2Ji+1EUGxkPZzXEwQ== X-Google-Smtp-Source: ABdhPJzr2aYtgsW3vl3Cv4zEu+1IQjrzP5LXAMUHl1WS+FkUv0yJRmbGXlNbgPAWHq38zFEXTcIjkA== X-Received: by 2002:a9d:19aa:: with SMTP id k39mr770506otk.28.1612325818120; Tue, 02 Feb 2021 20:16:58 -0800 (PST) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:90c4:ffea:6079:8a0c]) by smtp.gmail.com with ESMTPSA id s10sm209978ool.35.2021.02.02.20.16.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Feb 2021 20:16:57 -0800 (PST) From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 03/19] skmsg: use skb ext instead of TCP_SKB_CB Date: Tue, 2 Feb 2021 20:16:20 -0800 Message-Id: <20210203041636.38555-4-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: <20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang Currently TCP_SKB_CB() is hard-coded in the skmsg code, so it certainly won't work for non-TCP protocols. We can move the BPF fields (flags and sk_redir) to an skb extension instead of playing with skb cb, which is harder to get right. The exception is ->data_end, which sk_skb_convert_ctx_access() uses to compute a compile-time constant offset. Fortunately, we can reuse the anonymous union that holds 'tcp_tsorted_anchor' and save/restore the overwritten part before/after a brief use. 
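The save/restore idea described above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: struct list_node, struct demo_skb, demo_run_with_data_end() and demo_prog() are hypothetical names, and the anonymous union merely stands in for the one that holds 'tcp_tsorted_anchor' in struct sk_buff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

/* Hypothetical stand-in for the relevant part of struct sk_buff. */
struct demo_skb {
	union {
		struct list_node anchor; /* long-lived owner of the storage */
		void *data_end;          /* short-lived borrower of the storage */
	};
	unsigned char *data;
	size_t headlen;
};

/*
 * Borrow the union storage for data_end while prog() runs, then put the
 * original anchor back so the long-lived user never sees the overwrite.
 */
static int demo_run_with_data_end(struct demo_skb *skb,
				  int (*prog)(struct demo_skb *))
{
	struct list_node saved;
	int ret;

	assert(skb && prog);
	saved = skb->anchor;                      /* save overwritten part */
	skb->data_end = skb->data + skb->headlen; /* brief use of the union */
	ret = prog(skb);
	skb->anchor = saved;                      /* restore */
	return ret;
}

/* Toy "program": reports how many bytes lie before data_end. */
static int demo_prog(struct demo_skb *skb)
{
	return (int)((unsigned char *)skb->data_end - skb->data);
}
```

The point of the sketch is that data_end stays at a fixed, compile-time offset inside the structure, while the field it temporarily overwrites is intact again once the program returns.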
Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- include/linux/skbuff.h | 4 ++++ include/linux/skmsg.h | 45 ++++++++++++++++++++++++++++++++++++++++++ include/net/tcp.h | 25 ----------------------- net/Kconfig | 1 + net/core/filter.c | 3 +-- net/core/skbuff.c | 7 +++++++ net/core/skmsg.c | 44 ++++++++++++++++++++++++++++------------- net/core/sock_map.c | 12 +++++------ 8 files changed, 94 insertions(+), 47 deletions(-) diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 46f901adf1a8..12a28268233a 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -755,6 +755,7 @@ struct sk_buff { void (*destructor)(struct sk_buff *skb); }; struct list_head tcp_tsorted_anchor; + void *data_end; }; #if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE) @@ -4166,6 +4167,9 @@ enum skb_ext_id { #endif #if IS_ENABLED(CONFIG_MPTCP) SKB_EXT_MPTCP, +#endif +#if IS_ENABLED(CONFIG_NET_SOCK_MSG) + SKB_EXT_BPF, #endif SKB_EXT_NUM, /* must be last */ }; diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h index 56d641df3b0c..e212b0d1ba35 100644 --- a/include/linux/skmsg.h +++ b/include/linux/skmsg.h @@ -438,4 +438,49 @@ static inline bool sk_psock_strp_enabled(struct sk_psock *psock) return false; return psock->bpf_running; } + +struct skb_bpf_ext { + __u32 flags; + struct sock *sk_redir; +}; + +static inline void bpf_compute_data_end_sk_skb(struct sk_buff *skb) +{ + skb->data_end = skb->data + skb_headlen(skb); +} + +#if IS_ENABLED(CONFIG_NET_SOCK_MSG) +static inline +bool skb_bpf_ext_ingress(const struct sk_buff *skb) +{ + struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF); + + return ext->flags & BPF_F_INGRESS; +} + +static inline +void skb_bpf_ext_set_ingress(const struct sk_buff *skb) +{ + struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF); + + ext->flags |= BPF_F_INGRESS; +} + +static inline +struct sock *skb_bpf_ext_redirect_fetch(struct sk_buff *skb) +{ + struct 
skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF); + + return ext->sk_redir; +} + +static inline +void skb_bpf_ext_redirect_clear(struct sk_buff *skb) +{ + struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF); + + ext->flags = 0; + ext->sk_redir = NULL; +} +#endif /* CONFIG_NET_SOCK_MSG */ #endif /* _LINUX_SKMSG_H */ diff --git a/include/net/tcp.h b/include/net/tcp.h index be66571ad122..f7591768525d 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -882,36 +882,11 @@ struct tcp_skb_cb { struct inet6_skb_parm h6; #endif } header; /* For incoming skbs */ - struct { - __u32 flags; - struct sock *sk_redir; - void *data_end; - } bpf; }; }; #define TCP_SKB_CB(__skb) ((struct tcp_skb_cb *)&((__skb)->cb[0])) -static inline void bpf_compute_data_end_sk_skb(struct sk_buff *skb) -{ - TCP_SKB_CB(skb)->bpf.data_end = skb->data + skb_headlen(skb); -} - -static inline bool tcp_skb_bpf_ingress(const struct sk_buff *skb) -{ - return TCP_SKB_CB(skb)->bpf.flags & BPF_F_INGRESS; -} - -static inline struct sock *tcp_skb_bpf_redirect_fetch(struct sk_buff *skb) -{ - return TCP_SKB_CB(skb)->bpf.sk_redir; -} - -static inline void tcp_skb_bpf_redirect_clear(struct sk_buff *skb) -{ - TCP_SKB_CB(skb)->bpf.sk_redir = NULL; -} - extern const struct inet_connection_sock_af_ops ipv4_specific; #if IS_ENABLED(CONFIG_IPV6) diff --git a/net/Kconfig b/net/Kconfig index 0cc0805a8127..1e45bcaa23f1 100644 --- a/net/Kconfig +++ b/net/Kconfig @@ -422,6 +422,7 @@ config SOCK_VALIDATE_XMIT config NET_SOCK_MSG bool + select SKB_EXTENSIONS default n help The NET_SOCK_MSG provides a framework for plain sockets (e.g. 
TCP) or diff --git a/net/core/filter.c b/net/core/filter.c index e15d4741719a..c1a19a663630 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -9532,8 +9532,7 @@ static u32 sk_skb_convert_ctx_access(enum bpf_access_type type, case offsetof(struct __sk_buff, data_end): off = si->off; off -= offsetof(struct __sk_buff, data_end); - off += offsetof(struct sk_buff, cb); - off += offsetof(struct tcp_skb_cb, bpf.data_end); + off += offsetof(struct sk_buff, data_end); *insn++ = BPF_LDX_MEM(BPF_SIZEOF(void *), si->dst_reg, si->src_reg, off); break; diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 145503d3f06b..7695a2b65832 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -60,6 +60,7 @@ #include #include #include +#include #include #include @@ -4259,6 +4260,9 @@ static const u8 skb_ext_type_len[] = { #if IS_ENABLED(CONFIG_MPTCP) [SKB_EXT_MPTCP] = SKB_EXT_CHUNKSIZEOF(struct mptcp_ext), #endif +#if IS_ENABLED(CONFIG_NET_SOCK_MSG) + [SKB_EXT_BPF] = SKB_EXT_CHUNKSIZEOF(struct skb_bpf_ext), +#endif }; static __always_inline unsigned int skb_ext_total_length(void) @@ -4275,6 +4279,9 @@ static __always_inline unsigned int skb_ext_total_length(void) #endif #if IS_ENABLED(CONFIG_MPTCP) skb_ext_type_len[SKB_EXT_MPTCP] + +#endif +#if IS_ENABLED(CONFIG_NET_SOCK_MSG) + skb_ext_type_len[SKB_EXT_BPF] + #endif 0; } diff --git a/net/core/skmsg.c b/net/core/skmsg.c index f72fcb03d25c..2b5b8f05187a 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -525,7 +525,8 @@ static void sk_psock_backlog(struct work_struct *work) len = skb->len; off = 0; start: - ingress = tcp_skb_bpf_ingress(skb); + ingress = skb_bpf_ext_ingress(skb); + skb_ext_del(skb, SKB_EXT_BPF); do { ret = -EIO; if (likely(psock->sk->sk_socket)) @@ -746,8 +747,13 @@ EXPORT_SYMBOL_GPL(sk_psock_msg_verdict); static int sk_psock_bpf_run(struct sk_psock *psock, struct bpf_prog *prog, struct sk_buff *skb) { - bpf_compute_data_end_sk_skb(skb); - return bpf_prog_run_pin_on_cpu(prog, skb); + int ret; + + 
tcp_skb_tsorted_save(skb) { + bpf_compute_data_end_sk_skb(skb); + ret = bpf_prog_run_pin_on_cpu(prog, skb); + } tcp_skb_tsorted_restore(skb); + return ret; } static void sk_psock_skb_redirect(struct sk_buff *skb) @@ -755,7 +761,7 @@ static void sk_psock_skb_redirect(struct sk_buff *skb) struct sk_psock *psock_other; struct sock *sk_other; - sk_other = tcp_skb_bpf_redirect_fetch(skb); + sk_other = skb_bpf_ext_redirect_fetch(skb); /* This error is a buggy BPF program, it returned a redirect * return code, but then didn't set a redirect interface. */ @@ -797,6 +803,9 @@ int sk_psock_tls_strp_read(struct sk_psock *psock, struct sk_buff *skb) struct bpf_prog *prog; int ret = __SK_PASS; + if (!skb_ext_add(skb, SKB_EXT_BPF)) + return __SK_DROP; + rcu_read_lock(); prog = READ_ONCE(psock->progs.skb_verdict); if (likely(prog)) { @@ -805,9 +814,9 @@ int sk_psock_tls_strp_read(struct sk_psock *psock, struct sk_buff *skb) * TLS context. */ skb->sk = psock->sk; - tcp_skb_bpf_redirect_clear(skb); + skb_bpf_ext_redirect_clear(skb); ret = sk_psock_bpf_run(psock, prog, skb); - ret = sk_psock_map_verd(ret, tcp_skb_bpf_redirect_fetch(skb)); + ret = sk_psock_map_verd(ret, skb_bpf_ext_redirect_fetch(skb)); skb->sk = NULL; } sk_psock_tls_verdict_apply(skb, psock->sk, ret); @@ -819,7 +828,6 @@ EXPORT_SYMBOL_GPL(sk_psock_tls_strp_read); static void sk_psock_verdict_apply(struct sk_psock *psock, struct sk_buff *skb, int verdict) { - struct tcp_skb_cb *tcp; struct sock *sk_other; int err = -EIO; @@ -831,9 +839,7 @@ static void sk_psock_verdict_apply(struct sk_psock *psock, goto out_free; } - tcp = TCP_SKB_CB(skb); - tcp->bpf.flags |= BPF_F_INGRESS; - + skb_bpf_ext_set_ingress(skb); /* If the queue is empty then we can submit directly * into the msg queue. If its not empty we have to * queue work otherwise we may get OOO data. 
Otherwise, @@ -873,11 +879,15 @@ static void sk_psock_strp_read(struct strparser *strp, struct sk_buff *skb) goto out; } skb_set_owner_r(skb, sk); + if (!skb_ext_add(skb, SKB_EXT_BPF)) { + kfree_skb(skb); + goto out; + } prog = READ_ONCE(psock->progs.skb_verdict); if (likely(prog)) { - tcp_skb_bpf_redirect_clear(skb); + skb_bpf_ext_redirect_clear(skb); ret = sk_psock_bpf_run(psock, prog, skb); - ret = sk_psock_map_verd(ret, tcp_skb_bpf_redirect_fetch(skb)); + ret = sk_psock_map_verd(ret, skb_bpf_ext_redirect_fetch(skb)); } sk_psock_verdict_apply(psock, skb, ret); out: @@ -949,11 +959,17 @@ static int sk_psock_verdict_recv(read_descriptor_t *desc, struct sk_buff *skb, goto out; } skb_set_owner_r(skb, sk); + if (!skb_ext_add(skb, SKB_EXT_BPF)) { + len = 0; + kfree_skb(skb); + goto out; + } + prog = READ_ONCE(psock->progs.skb_verdict); if (likely(prog)) { - tcp_skb_bpf_redirect_clear(skb); + skb_bpf_ext_redirect_clear(skb); ret = sk_psock_bpf_run(psock, prog, skb); - ret = sk_psock_map_verd(ret, tcp_skb_bpf_redirect_fetch(skb)); + ret = sk_psock_map_verd(ret, skb_bpf_ext_redirect_fetch(skb)); } sk_psock_verdict_apply(psock, skb, ret); out: diff --git a/net/core/sock_map.c b/net/core/sock_map.c index 37ff8e13e4cc..7b70fa290836 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -657,7 +657,7 @@ const struct bpf_func_proto bpf_sock_map_update_proto = { BPF_CALL_4(bpf_sk_redirect_map, struct sk_buff *, skb, struct bpf_map *, map, u32, key, u64, flags) { - struct tcp_skb_cb *tcb = TCP_SKB_CB(skb); + struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF); struct sock *sk; if (unlikely(flags & ~(BPF_F_INGRESS))) @@ -667,8 +667,8 @@ BPF_CALL_4(bpf_sk_redirect_map, struct sk_buff *, skb, if (unlikely(!sk || !sock_map_redirect_allowed(sk))) return SK_DROP; - tcb->bpf.flags = flags; - tcb->bpf.sk_redir = sk; + ext->flags = flags; + ext->sk_redir = sk; return SK_PASS; } @@ -1250,7 +1250,7 @@ const struct bpf_func_proto bpf_sock_hash_update_proto = { 
BPF_CALL_4(bpf_sk_redirect_hash, struct sk_buff *, skb, struct bpf_map *, map, void *, key, u64, flags) { - struct tcp_skb_cb *tcb = TCP_SKB_CB(skb); + struct skb_bpf_ext *ext = skb_ext_find(skb, SKB_EXT_BPF); struct sock *sk; if (unlikely(flags & ~(BPF_F_INGRESS))) @@ -1260,8 +1260,8 @@ BPF_CALL_4(bpf_sk_redirect_hash, struct sk_buff *, skb, if (unlikely(!sk || !sock_map_redirect_allowed(sk))) return SK_DROP; - tcb->bpf.flags = flags; - tcb->bpf.sk_redir = sk; + ext->flags = flags; + ext->sk_redir = sk; return SK_PASS; } From patchwork Wed Feb 3 04:16:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12063375 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 884A4C433DB for ; Wed, 3 Feb 2021 04:18:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5B52A64F67 for ; Wed, 3 Feb 2021 04:18:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232699AbhBCESB (ORCPT ); Tue, 2 Feb 2021 23:18:01 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36582 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232621AbhBCERl (ORCPT ); Tue, 2 Feb 2021 23:17:41 -0500 Received: from mail-oi1-x231.google.com (mail-oi1-x231.google.com [IPv6:2607:f8b0:4864:20::231]) by 
lindbergh.monkeyblade.net (Postfix) with ESMTPS id 791B4C061786; Tue, 2 Feb 2021 20:17:00 -0800 (PST) Received: by mail-oi1-x231.google.com with SMTP id w8so25386679oie.2; Tue, 02 Feb 2021 20:17:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=0Nog2NOlla9fFB2+XCVhJkcq2aHweemJtQThwh2rknc=; b=LJ93Olw3H8vzpCeEh52kogcl6MAUqT3hZJ/5xc66emGywBDmLbJetM5pc59oqC18S1 qSRlt2ak5/ljhnMRUAbFN0I5y5MFOqYj6fJ6MoiaRzkL7iwH4xHIm8OMMhV/fITgViBL rc09iuNI+lfyZLkJy4A9sDpne0cFIdUZpwbLWFG0M0VcW4vktlwUfY7aFUp6Z88s9Kkw J6UeIGqAr2jZanD2rTeQpiTXO9T3MLpbCTs7PFV59XQvm1mAkhIgYi/tsUVDYovsbAzj fqCVZNON98khQXU/Ts2ZC8bUH3hwIBQ4Cci7dfFoFoC7i3ji7pzH6vcOQZuzaH0AH7hp OQhA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=0Nog2NOlla9fFB2+XCVhJkcq2aHweemJtQThwh2rknc=; b=Ih6dDDW60XmZ9gZbGc/Ff86n6GS1RZIX1LIYadIKwatVFVEAiZnrmbue9vYY2d2LRV DXb0RgUk3VB9pGcsO33P3Omi6qqzPAFUkQRrsbsdBKPsLzparAUJy0Y4CjraNEI5kjAU 6WicMsXy2mc0mWwWmHQpAPd8zKznBYnuQnSyIWk7Xe6JBgMFldpeN8aNAnC69GhSDkwT Ss0mnhOBZp403XsC7UK5ao10LfSIzasEI2jzplH7XRO2lHTEJF1040dGijC8uh7eel3/ nDePaIUxTDbrZWd/X8YLCWkjsHH3cEPmwBepVK73v/Abi1tp6KljlDURnktyE4btKh+y 3TYw== X-Gm-Message-State: AOAM533KQqJ5/r5n3kR77uxSPEOnurIRHcHUi4CrvuqVlzUxWHAmV5Qf 8MuDXht7FNdra5VvQLREiu+4S+ZkDenwKA== X-Google-Smtp-Source: ABdhPJxIRxRpag2bZDk3accj3WFyvY97GWd09VPbMdNPqBFzJk0WXmgeREjULu4opOUBOCsRQEtYDg== X-Received: by 2002:aca:c704:: with SMTP id x4mr845814oif.24.1612325819717; Tue, 02 Feb 2021 20:16:59 -0800 (PST) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:90c4:ffea:6079:8a0c]) by smtp.gmail.com with ESMTPSA id s10sm209978ool.35.2021.02.02.20.16.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Feb 2021 20:16:59 -0800 (PST) From: Cong Wang 
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com,
        wangdongdong.6@bytedance.com, jiang.wang@bytedance.com,
        Cong Wang, John Fastabend, Daniel Borkmann, Jakub Sitnicki,
        Lorenz Bauer
Subject: [Patch bpf-next 04/19] sock_map: rename skb_parser and skb_verdict
Date: Tue, 2 Feb 2021 20:16:21 -0800
Message-Id: <20210203041636.38555-5-xiyou.wangcong@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com>
References: <20210203041636.38555-1-xiyou.wangcong@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: bpf@vger.kernel.org
X-Patchwork-Delegate: bpf@iogearbox.net

From: Cong Wang

These two eBPF programs are tied to BPF_SK_SKB_STREAM_PARSER and
BPF_SK_SKB_STREAM_VERDICT. Rename them to reflect the fact that they
are currently used for TCP, and reserve the generic name skb_verdict
for general use.

Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Signed-off-by: Cong Wang
---
 include/linux/skmsg.h                         |  8 +--
 net/core/skmsg.c                              | 14 ++---
 net/core/sock_map.c                           | 60 +++++++++----------
 .../selftests/bpf/prog_tests/sockmap_listen.c |  8 +--
 .../selftests/bpf/progs/test_sockmap_listen.c |  4 +-
 5 files changed, 47 insertions(+), 47 deletions(-)

diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index e212b0d1ba35..218566ac4fa1 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -56,8 +56,8 @@ struct sk_msg {
 
 struct sk_psock_progs {
 	struct bpf_prog *msg_parser;
-	struct bpf_prog *skb_parser;
-	struct bpf_prog *skb_verdict;
+	struct bpf_prog *stream_parser;
+	struct bpf_prog *stream_verdict;
 };
 
 enum sk_psock_state_bits {
@@ -426,8 +426,8 @@ static inline int psock_replace_prog(struct bpf_prog **pprog,
 static inline void psock_progs_drop(struct sk_psock_progs *progs)
 {
 	psock_set_prog(&progs->msg_parser, NULL);
-	psock_set_prog(&progs->skb_parser, NULL);
-	psock_set_prog(&progs->skb_verdict, NULL);
+	psock_set_prog(&progs->stream_parser, NULL);
+
psock_set_prog(&progs->stream_verdict, NULL); } int sk_psock_tls_strp_read(struct sk_psock *psock, struct sk_buff *skb); diff --git a/net/core/skmsg.c b/net/core/skmsg.c index 2b5b8f05187a..51446fe63be5 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -653,7 +653,7 @@ static void sk_psock_destroy_deferred(struct work_struct *gc) /* No sk_callback_lock since already detached. */ /* Parser has been stopped */ - if (psock->progs.skb_parser) + if (psock->progs.stream_parser) strp_done(&psock->strp); cancel_work_sync(&psock->work); @@ -686,9 +686,9 @@ void sk_psock_drop(struct sock *sk, struct sk_psock *psock) write_lock_bh(&sk->sk_callback_lock); sk_psock_restore_proto(sk, psock); rcu_assign_sk_user_data(sk, NULL); - if (psock->progs.skb_parser) + if (psock->progs.stream_parser) sk_psock_stop_strp(sk, psock); - else if (psock->progs.skb_verdict) + else if (psock->progs.stream_verdict) sk_psock_stop_verdict(sk, psock); write_unlock_bh(&sk->sk_callback_lock); sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED); @@ -807,7 +807,7 @@ int sk_psock_tls_strp_read(struct sk_psock *psock, struct sk_buff *skb) return __SK_DROP; rcu_read_lock(); - prog = READ_ONCE(psock->progs.skb_verdict); + prog = READ_ONCE(psock->progs.stream_verdict); if (likely(prog)) { /* We skip full set_owner_r here because if we do a SK_PASS * or SK_DROP we can skip skb memory accounting and use the @@ -883,7 +883,7 @@ static void sk_psock_strp_read(struct strparser *strp, struct sk_buff *skb) kfree_skb(skb); goto out; } - prog = READ_ONCE(psock->progs.skb_verdict); + prog = READ_ONCE(psock->progs.stream_verdict); if (likely(prog)) { skb_bpf_ext_redirect_clear(skb); ret = sk_psock_bpf_run(psock, prog, skb); @@ -906,7 +906,7 @@ static int sk_psock_strp_parse(struct strparser *strp, struct sk_buff *skb) int ret = skb->len; rcu_read_lock(); - prog = READ_ONCE(psock->progs.skb_parser); + prog = READ_ONCE(psock->progs.stream_parser); if (likely(prog)) { skb->sk = psock->sk; ret = sk_psock_bpf_run(psock, 
prog, skb); @@ -965,7 +965,7 @@ static int sk_psock_verdict_recv(read_descriptor_t *desc, struct sk_buff *skb, goto out; } - prog = READ_ONCE(psock->progs.skb_verdict); + prog = READ_ONCE(psock->progs.stream_verdict); if (likely(prog)) { skb_bpf_ext_redirect_clear(skb); ret = sk_psock_bpf_run(psock, prog, skb); diff --git a/net/core/sock_map.c b/net/core/sock_map.c index 7b70fa290836..521663582982 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -148,9 +148,9 @@ static void sock_map_del_link(struct sock *sk, struct bpf_map *map = link->map; struct bpf_stab *stab = container_of(map, struct bpf_stab, map); - if (psock->bpf_running && stab->progs.skb_parser) + if (psock->bpf_running && stab->progs.stream_parser) strp_stop = true; - if (psock->bpf_running && stab->progs.skb_verdict) + if (psock->bpf_running && stab->progs.stream_verdict) verdict_stop = true; list_del(&link->list); sk_psock_free_link(link); @@ -224,23 +224,23 @@ static struct sk_psock *sock_map_psock_get_checked(struct sock *sk) static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs, struct sock *sk) { - struct bpf_prog *msg_parser, *skb_parser, *skb_verdict; + struct bpf_prog *msg_parser, *stream_parser, *stream_verdict; struct sk_psock *psock; int ret; - skb_verdict = READ_ONCE(progs->skb_verdict); - if (skb_verdict) { - skb_verdict = bpf_prog_inc_not_zero(skb_verdict); - if (IS_ERR(skb_verdict)) - return PTR_ERR(skb_verdict); + stream_verdict = READ_ONCE(progs->stream_verdict); + if (stream_verdict) { + stream_verdict = bpf_prog_inc_not_zero(stream_verdict); + if (IS_ERR(stream_verdict)) + return PTR_ERR(stream_verdict); } - skb_parser = READ_ONCE(progs->skb_parser); - if (skb_parser) { - skb_parser = bpf_prog_inc_not_zero(skb_parser); - if (IS_ERR(skb_parser)) { - ret = PTR_ERR(skb_parser); - goto out_put_skb_verdict; + stream_parser = READ_ONCE(progs->stream_parser); + if (stream_parser) { + stream_parser = bpf_prog_inc_not_zero(stream_parser); + if 
(IS_ERR(stream_parser)) { + ret = PTR_ERR(stream_parser); + goto out_put_stream_verdict; } } @@ -249,7 +249,7 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs, msg_parser = bpf_prog_inc_not_zero(msg_parser); if (IS_ERR(msg_parser)) { ret = PTR_ERR(msg_parser); - goto out_put_skb_parser; + goto out_put_stream_parser; } } @@ -261,8 +261,8 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs, if (psock) { if ((msg_parser && READ_ONCE(psock->progs.msg_parser)) || - (skb_parser && READ_ONCE(psock->progs.skb_parser)) || - (skb_verdict && READ_ONCE(psock->progs.skb_verdict))) { + (stream_parser && READ_ONCE(psock->progs.stream_parser)) || + (stream_verdict && READ_ONCE(psock->progs.stream_verdict))) { sk_psock_put(sk, psock); ret = -EBUSY; goto out_progs; @@ -283,15 +283,15 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs, goto out_drop; write_lock_bh(&sk->sk_callback_lock); - if (skb_parser && skb_verdict && !psock->bpf_running) { + if (stream_parser && stream_verdict && !psock->bpf_running) { ret = sk_psock_init_strp(sk, psock); if (ret) goto out_unlock_drop; - psock_set_prog(&psock->progs.skb_verdict, skb_verdict); - psock_set_prog(&psock->progs.skb_parser, skb_parser); + psock_set_prog(&psock->progs.stream_verdict, stream_verdict); + psock_set_prog(&psock->progs.stream_parser, stream_parser); sk_psock_start_strp(sk, psock); - } else if (!skb_parser && skb_verdict && !psock->bpf_running) { - psock_set_prog(&psock->progs.skb_verdict, skb_verdict); + } else if (!stream_parser && stream_verdict && !psock->bpf_running) { + psock_set_prog(&psock->progs.stream_verdict, stream_verdict); sk_psock_start_verdict(sk,psock); } write_unlock_bh(&sk->sk_callback_lock); @@ -303,12 +303,12 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs, out_progs: if (msg_parser) bpf_prog_put(msg_parser); -out_put_skb_parser: - if (skb_parser) - bpf_prog_put(skb_parser); 
-out_put_skb_verdict: - if (skb_verdict) - bpf_prog_put(skb_verdict); +out_put_stream_parser: + if (stream_parser) + bpf_prog_put(stream_parser); +out_put_stream_verdict: + if (stream_verdict) + bpf_prog_put(stream_verdict); return ret; } @@ -1462,10 +1462,10 @@ int sock_map_prog_update(struct bpf_map *map, struct bpf_prog *prog, pprog = &progs->msg_parser; break; case BPF_SK_SKB_STREAM_PARSER: - pprog = &progs->skb_parser; + pprog = &progs->stream_parser; break; case BPF_SK_SKB_STREAM_VERDICT: - pprog = &progs->skb_verdict; + pprog = &progs->stream_verdict; break; default: return -EOPNOTSUPP; diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c index d7d65a700799..c26e6bf05e49 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c @@ -1014,8 +1014,8 @@ static void test_skb_redir_to_connected(struct test_sockmap_listen *skel, struct bpf_map *inner_map, int family, int sotype) { - int verdict = bpf_program__fd(skel->progs.prog_skb_verdict); - int parser = bpf_program__fd(skel->progs.prog_skb_parser); + int verdict = bpf_program__fd(skel->progs.prog_stream_verdict); + int parser = bpf_program__fd(skel->progs.prog_stream_parser); int verdict_map = bpf_map__fd(skel->maps.verdict_map); int sock_map = bpf_map__fd(inner_map); int err; @@ -1125,8 +1125,8 @@ static void test_skb_redir_to_listening(struct test_sockmap_listen *skel, struct bpf_map *inner_map, int family, int sotype) { - int verdict = bpf_program__fd(skel->progs.prog_skb_verdict); - int parser = bpf_program__fd(skel->progs.prog_skb_parser); + int verdict = bpf_program__fd(skel->progs.prog_stream_verdict); + int parser = bpf_program__fd(skel->progs.prog_stream_parser); int verdict_map = bpf_map__fd(skel->maps.verdict_map); int sock_map = bpf_map__fd(inner_map); int err; diff --git a/tools/testing/selftests/bpf/progs/test_sockmap_listen.c 
b/tools/testing/selftests/bpf/progs/test_sockmap_listen.c index a3a366c57ce1..fa221141e9c1 100644 --- a/tools/testing/selftests/bpf/progs/test_sockmap_listen.c +++ b/tools/testing/selftests/bpf/progs/test_sockmap_listen.c @@ -31,13 +31,13 @@ struct { static volatile bool test_sockmap; /* toggled by user-space */ SEC("sk_skb/stream_parser") -int prog_skb_parser(struct __sk_buff *skb) +int prog_stream_parser(struct __sk_buff *skb) { return skb->len; } SEC("sk_skb/stream_verdict") -int prog_skb_verdict(struct __sk_buff *skb) +int prog_stream_verdict(struct __sk_buff *skb) { unsigned int *count; __u32 zero = 0; From patchwork Wed Feb 3 04:16:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12063377 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3EC8CC433DB for ; Wed, 3 Feb 2021 04:19:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id EDDB164F67 for ; Wed, 3 Feb 2021 04:19:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232709AbhBCESy (ORCPT ); Tue, 2 Feb 2021 23:18:54 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36704 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232731AbhBCESR (ORCPT ); Tue, 2 Feb 2021 23:18:17 -0500 Received: from 
mail-oi1-x22b.google.com (mail-oi1-x22b.google.com [IPv6:2607:f8b0:4864:20::22b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DA3B7C061788; Tue, 2 Feb 2021 20:17:01 -0800 (PST) Received: by mail-oi1-x22b.google.com with SMTP id x71so25345614oia.9; Tue, 02 Feb 2021 20:17:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=SP/1stuU44pVBk7OBBQnq1cmGdnBzdf1fs54qnZXI28=; b=LUc4vFqaH026CJlyG/ik+ARGw3xO0txEaARW0fIlhtbGytcVs2aWJ0QxIWwrMMDqfF pbxLHHx61mDNswnOt1+rzimONAYwkYMEE5/SG3Via5jgdpnVUfXNNWg8+6F4WEFQuUZy /WFqWYqIdYeFusQMQfvvPu4u/cGyX3EJVBcKBafttplcVbevt/EgU5nte/MViGe1khgR 5kISf0TD92mztcofBpMQMrpa3mYGbHGQM000FntbJjMe8e+Wnx0WadaeiMhM/YKGEM+B IauOdBBBZSSaCZYmMIGtwaDhFNqFhFkHqwj0y7jeyQl2oRuA4MpTZMvN7vMPXsT6CLUC JMCw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=SP/1stuU44pVBk7OBBQnq1cmGdnBzdf1fs54qnZXI28=; b=t8f99NPauMAq/i3fP5Ll1CPB7cUNR4qxrnZZUbCH/VPYR1XV99iCNPTlWZxke5t1Ra ZDtTqCYARFsxo3ZB6au5d44t8nhFKwbcPouYAT3hSdtdv0QcPFpQxqhA8PNqyAaPMhCs MLlypfgyMtwV2Bly0haHKxtI6Vm6ltpU/u5s7d6kL8DHWwhxEMCM+kGgDUCG5g6kRzd8 tzU+zmmF13t9NDAwcrPm8saL1l/et/bNH6IE2IKMbwaumWLRdvRb0qOd/QE9ui5fmuNZ XaCRwm/PLiiqYEOi5hIOB2iYXRoPPcVSlaoLxEiWpL9Vh6Mvlq6Qdcf3iY90OG0FPQkE iSjg== X-Gm-Message-State: AOAM532zpw5Ua78wZKDA302cIdYvDVXN4y6ietSKQQM0ZH47hmNWpKbR jyf+A1UCJWt3zmOMwLIPH2PoRWBV+BmCVg== X-Google-Smtp-Source: ABdhPJyiCpNN6z8AhRrFCetB7GZ2hLGoIiiShe30NMW5a1OooszLg9txuOvFWG/lZV5l8I8lEDuaHg== X-Received: by 2002:aca:e103:: with SMTP id y3mr783973oig.11.1612325821171; Tue, 02 Feb 2021 20:17:01 -0800 (PST) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:90c4:ffea:6079:8a0c]) by smtp.gmail.com with ESMTPSA id s10sm209978ool.35.2021.02.02.20.16.59 (version=TLS1_3 
        cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Tue, 02 Feb 2021 20:17:00 -0800 (PST)
From: Cong Wang
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com,
        wangdongdong.6@bytedance.com, jiang.wang@bytedance.com,
        Cong Wang, John Fastabend, Daniel Borkmann, Jakub Sitnicki,
        Lorenz Bauer
Subject: [Patch bpf-next 05/19] sock_map: introduce BPF_SK_SKB_VERDICT
Date: Tue, 2 Feb 2021 20:16:22 -0800
Message-Id: <20210203041636.38555-6-xiyou.wangcong@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com>
References: <20210203041636.38555-1-xiyou.wangcong@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: bpf@vger.kernel.org
X-Patchwork-Delegate: bpf@iogearbox.net

From: Cong Wang

I was planning to reuse BPF_SK_SKB_STREAM_VERDICT but its name is
confusing and more importantly it seems kTLS relies on it to deliver
sk_msg too. To avoid messing up kTLS, we can just reuse the stream
verdict code but introduce a new type of eBPF program, skb_verdict.
Users are not allowed to set stream_verdict and skb_verdict at the
same time.

Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Signed-off-by: Cong Wang
---
 include/linux/skmsg.h          |  3 +++
 include/uapi/linux/bpf.h       |  1 +
 kernel/bpf/syscall.c           |  1 +
 net/core/skmsg.c               |  4 +++-
 net/core/sock_map.c            | 23 ++++++++++++++++++++++-
 tools/bpf/bpftool/common.c     |  1 +
 tools/bpf/bpftool/prog.c       |  1 +
 tools/include/uapi/linux/bpf.h |  1 +
 8 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index 218566ac4fa1..cb79b1afa556 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -58,6 +58,7 @@ struct sk_psock_progs {
 	struct bpf_prog *msg_parser;
 	struct bpf_prog *stream_parser;
 	struct bpf_prog *stream_verdict;
+	struct bpf_prog *skb_verdict;
 };
 
 enum sk_psock_state_bits {
@@ -428,6 +429,7 @@ static inline void psock_progs_drop(struct sk_psock_progs *progs)
 	psock_set_prog(&progs->msg_parser, NULL);
 	psock_set_prog(&progs->stream_parser, NULL);
 	psock_set_prog(&progs->stream_verdict, NULL);
+	psock_set_prog(&progs->skb_verdict, NULL);
 }
 
 int sk_psock_tls_strp_read(struct sk_psock *psock, struct sk_buff *skb);
@@ -482,5 +484,6 @@ void skb_bpf_ext_redirect_clear(struct sk_buff *skb)
 	ext->flags = 0;
 	ext->sk_redir = NULL;
 }
+
 #endif /* CONFIG_NET_SOCK_MSG */
 #endif /* _LINUX_SKMSG_H */
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c001766adcbc..c1a412ebfb08 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -247,6 +247,7 @@ enum bpf_attach_type {
 	BPF_XDP_CPUMAP,
 	BPF_SK_LOOKUP,
 	BPF_XDP,
+	BPF_SK_SKB_VERDICT,
 	__MAX_BPF_ATTACH_TYPE
 };
 
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index e5999d86c76e..a56549fc2825 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2936,6 +2936,7 @@ attach_type_to_prog_type(enum bpf_attach_type attach_type)
 		return BPF_PROG_TYPE_SK_MSG;
 	case BPF_SK_SKB_STREAM_PARSER:
 	case BPF_SK_SKB_STREAM_VERDICT:
+	case BPF_SK_SKB_VERDICT:
 		return BPF_PROG_TYPE_SK_SKB;
 	case BPF_LIRC_MODE2:
 		return
BPF_PROG_TYPE_LIRC_MODE2; diff --git a/net/core/skmsg.c b/net/core/skmsg.c index 51446fe63be5..ecbd6f0d49a5 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -688,7 +688,7 @@ void sk_psock_drop(struct sock *sk, struct sk_psock *psock) rcu_assign_sk_user_data(sk, NULL); if (psock->progs.stream_parser) sk_psock_stop_strp(sk, psock); - else if (psock->progs.stream_verdict) + else if (psock->progs.stream_verdict || psock->progs.skb_verdict) sk_psock_stop_verdict(sk, psock); write_unlock_bh(&sk->sk_callback_lock); sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED); @@ -966,6 +966,8 @@ static int sk_psock_verdict_recv(read_descriptor_t *desc, struct sk_buff *skb, } prog = READ_ONCE(psock->progs.stream_verdict); + if (!prog) + prog = READ_ONCE(psock->progs.skb_verdict); if (likely(prog)) { skb_bpf_ext_redirect_clear(skb); ret = sk_psock_bpf_run(psock, prog, skb); diff --git a/net/core/sock_map.c b/net/core/sock_map.c index 521663582982..f827f1ecefcc 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -152,6 +152,8 @@ static void sock_map_del_link(struct sock *sk, strp_stop = true; if (psock->bpf_running && stab->progs.stream_verdict) verdict_stop = true; + if (psock->bpf_running && stab->progs.skb_verdict) + verdict_stop = true; list_del(&link->list); sk_psock_free_link(link); } @@ -224,7 +226,7 @@ static struct sk_psock *sock_map_psock_get_checked(struct sock *sk) static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs, struct sock *sk) { - struct bpf_prog *msg_parser, *stream_parser, *stream_verdict; + struct bpf_prog *msg_parser, *stream_parser, *stream_verdict, *skb_verdict; struct sk_psock *psock; int ret; @@ -253,6 +255,15 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs, } } + skb_verdict = READ_ONCE(progs->skb_verdict); + if (skb_verdict) { + skb_verdict = bpf_prog_inc_not_zero(skb_verdict); + if (IS_ERR(skb_verdict)) { + ret = PTR_ERR(skb_verdict); + goto out_put_msg_parser; + } + } + psock = 
sock_map_psock_get_checked(sk); if (IS_ERR(psock)) { ret = PTR_ERR(psock); @@ -262,6 +273,7 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs, if (psock) { if ((msg_parser && READ_ONCE(psock->progs.msg_parser)) || (stream_parser && READ_ONCE(psock->progs.stream_parser)) || + (skb_verdict && READ_ONCE(psock->progs.skb_verdict)) || (stream_verdict && READ_ONCE(psock->progs.stream_verdict))) { sk_psock_put(sk, psock); ret = -EBUSY; @@ -293,6 +305,9 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs, } else if (!stream_parser && stream_verdict && !psock->bpf_running) { psock_set_prog(&psock->progs.stream_verdict, stream_verdict); sk_psock_start_verdict(sk,psock); + } else if (!stream_verdict && skb_verdict && !psock->bpf_running) { + psock_set_prog(&psock->progs.skb_verdict, skb_verdict); + sk_psock_start_verdict(sk, psock); } write_unlock_bh(&sk->sk_callback_lock); return 0; @@ -301,6 +316,9 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs, out_drop: sk_psock_put(sk, psock); out_progs: + if (skb_verdict) + bpf_prog_put(skb_verdict); +out_put_msg_parser: if (msg_parser) bpf_prog_put(msg_parser); out_put_stream_parser: @@ -1467,6 +1485,9 @@ int sock_map_prog_update(struct bpf_map *map, struct bpf_prog *prog, case BPF_SK_SKB_STREAM_VERDICT: pprog = &progs->stream_verdict; break; + case BPF_SK_SKB_VERDICT: + pprog = &progs->skb_verdict; + break; default: return -EOPNOTSUPP; } diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c index 65303664417e..1828bba19020 100644 --- a/tools/bpf/bpftool/common.c +++ b/tools/bpf/bpftool/common.c @@ -57,6 +57,7 @@ const char * const attach_type_name[__MAX_BPF_ATTACH_TYPE] = { [BPF_SK_SKB_STREAM_PARSER] = "sk_skb_stream_parser", [BPF_SK_SKB_STREAM_VERDICT] = "sk_skb_stream_verdict", + [BPF_SK_SKB_VERDICT] = "sk_skb_verdict", [BPF_SK_MSG_VERDICT] = "sk_msg_verdict", [BPF_LIRC_MODE2] = "lirc_mode2", [BPF_FLOW_DISSECTOR] = "flow_dissector", 
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c index 1fe3ba255bad..a78d8c03b7ea 100644 --- a/tools/bpf/bpftool/prog.c +++ b/tools/bpf/bpftool/prog.c @@ -76,6 +76,7 @@ enum dump_mode { static const char * const attach_type_strings[] = { [BPF_SK_SKB_STREAM_PARSER] = "stream_parser", [BPF_SK_SKB_STREAM_VERDICT] = "stream_verdict", + [BPF_SK_SKB_VERDICT] = "skb_verdict", [BPF_SK_MSG_VERDICT] = "msg_verdict", [BPF_FLOW_DISSECTOR] = "flow_dissector", [__MAX_BPF_ATTACH_TYPE] = NULL, diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index c001766adcbc..c1a412ebfb08 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -247,6 +247,7 @@ enum bpf_attach_type { BPF_XDP_CPUMAP, BPF_SK_LOOKUP, BPF_XDP, + BPF_SK_SKB_VERDICT, __MAX_BPF_ATTACH_TYPE }; From patchwork Wed Feb 3 04:16:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12063379 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.9 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNWANTED_LANGUAGE_BODY, URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 94A9BC433E0 for ; Wed, 3 Feb 2021 04:20:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 43E2064F67 for ; Wed, 3 Feb 2021 04:20:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232965AbhBCETk (ORCPT ); Tue, 2 Feb 2021 23:19:40 -0500 
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36712 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232735AbhBCESR (ORCPT ); Tue, 2 Feb 2021 23:18:17 -0500 Received: from mail-oi1-x22f.google.com (mail-oi1-x22f.google.com [IPv6:2607:f8b0:4864:20::22f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 62C6BC06178A; Tue, 2 Feb 2021 20:17:03 -0800 (PST) Received: by mail-oi1-x22f.google.com with SMTP id k142so10946144oib.7; Tue, 02 Feb 2021 20:17:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=RJxHYpQ1EGYk8KyogI0G2uf/nMD7anQE0l50SEpEKak=; b=vCWrZcs+f2Tg7IwaBrRWc7EMreWyODe+SqExVvslpHMEvJkukhj1iJpR040NFhO9/D xC29FSeak3uIMDxIWb6kIOtTVstGtXCzpX4wZeL2mtYYrM5kqFh/1PY1BHn0wLfPEVVi rEi5BJ+dF9hImfDsOc7xbQUI4wpIoHxATFZMxencs7B/bygOFLeIaQ7fdW6wJbAmMt0n GfY4VoMzmohjqseZnHunFCwtQoUyGtVzsI8oYqsr9O7KvI7+fInaI3mxvYqkdZ6WYKiF wBBl/f2817RevlrixqnKrNrWeLwnPhAcgF9YHwTgS1D28ySW6tllusCnOSR3v/9G/5oA jahw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=RJxHYpQ1EGYk8KyogI0G2uf/nMD7anQE0l50SEpEKak=; b=N/vGvOuJelOxwHfxr0568749xyoLi11z6p7Ht/RTzNRM8Rib/Bvjf6tJlTxDkgMy5L 53dCJeYExpQQfX6exBY5kOnitVHvS5Coc9sUsO20DldaRcGHpf2ffeghxS6SjlBUTb9P XwmkpcPnehUjTgYiMomHz4PdX90IuPvU1rbKvBX2qFHnNclvi9z88d5WomPnr0K+a1L0 yoLfRcqjhUGDVb6ePCs4zhQuNqNt74m5Wpl0alPqS+H+EmXy346fgXdRMwxYXywbME4V EdJZ8JD0SmnLG70hQdcEwhWJx8hK6tBBN/ZS0dyujJq5B4dlU69gA6QjfTT06fgZGZXw cFGQ== X-Gm-Message-State: AOAM532XhB+abz2hrrBrAy6OhiDKnvJYxqWxLyEKNAXr0OTVb/B2mjzz /oyhX2UZAuD7Os1iokmRmbe0ioIRO2EQSA== X-Google-Smtp-Source: ABdhPJzvutRsyfWp4nHmXABvoew4i34BB1wrYYbB12T6TLC10u/goRZCtL5djBTDBUQXUHLHEsLpcQ== X-Received: by 2002:a54:4482:: with SMTP id 
        v2mr799439oiv.121.1612325822627;
        Tue, 02 Feb 2021 20:17:02 -0800 (PST)
Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:90c4:ffea:6079:8a0c])
        by smtp.gmail.com with ESMTPSA id s10sm209978ool.35.2021.02.02.20.17.01
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Tue, 02 Feb 2021 20:17:02 -0800 (PST)
From: Cong Wang
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com,
        wangdongdong.6@bytedance.com, jiang.wang@bytedance.com,
        Cong Wang, John Fastabend, Daniel Borkmann, Jakub Sitnicki,
        Lorenz Bauer
Subject: [Patch bpf-next 06/19] sock: introduce sk_prot->update_proto()
Date: Tue, 2 Feb 2021 20:16:23 -0800
Message-Id: <20210203041636.38555-7-xiyou.wangcong@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com>
References: <20210203041636.38555-1-xiyou.wangcong@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: bpf@vger.kernel.org
X-Patchwork-Delegate: bpf@iogearbox.net

From: Cong Wang

Currently sockmap calls into each protocol to update the struct proto
and replace it. This certainly won't work when the protocol is
implemented as a module, for example, AF_UNIX.

Introduce a new op, sk->sk_prot->update_proto(), so that each protocol
can implement its own way to replace the struct proto.

Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- include/linux/skmsg.h | 18 +++--------------- include/net/sock.h | 3 +++ include/net/tcp.h | 2 +- include/net/udp.h | 2 +- net/core/sock_map.c | 22 +++------------------- net/ipv4/tcp_bpf.c | 20 +++++++++++++++++--- net/ipv4/tcp_ipv4.c | 3 +++ net/ipv4/udp.c | 3 +++ net/ipv4/udp_bpf.c | 14 ++++++++++++-- net/ipv6/tcp_ipv6.c | 3 +++ net/ipv6/udp.c | 3 +++ 11 files changed, 52 insertions(+), 41 deletions(-) diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h index cb79b1afa556..cb94d0f89c08 100644 --- a/include/linux/skmsg.h +++ b/include/linux/skmsg.h @@ -97,6 +97,7 @@ struct sk_psock { void (*saved_close)(struct sock *sk, long timeout); void (*saved_write_space)(struct sock *sk); void (*saved_data_ready)(struct sock *sk); + int (*saved_update_proto)(struct sock *sk, bool restore); struct proto *sk_proto; struct sk_psock_work_state work_state; struct work_struct work; @@ -335,25 +336,12 @@ static inline void sk_psock_cork_free(struct sk_psock *psock) } } -static inline void sk_psock_update_proto(struct sock *sk, - struct sk_psock *psock, - struct proto *ops) -{ - /* Pairs with lockless read in sk_clone_lock() */ - WRITE_ONCE(sk->sk_prot, ops); -} - static inline void sk_psock_restore_proto(struct sock *sk, struct sk_psock *psock) { sk->sk_prot->unhash = psock->saved_unhash; - if (inet_csk_has_ulp(sk)) { - tcp_update_ulp(sk, psock->sk_proto, psock->saved_write_space); - } else { - sk->sk_write_space = psock->saved_write_space; - /* Pairs with lockless read in sk_clone_lock() */ - WRITE_ONCE(sk->sk_prot, psock->sk_proto); - } + if (psock->saved_update_proto) + psock->saved_update_proto(sk, true); } static inline void sk_psock_set_state(struct sk_psock *psock, diff --git a/include/net/sock.h b/include/net/sock.h index 7644ea64a376..e474a9202be8 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -1184,6 +1184,9 @@ struct proto { void (*unhash)(struct 
sock *sk); void (*rehash)(struct sock *sk); int (*get_port)(struct sock *sk, unsigned short snum); +#ifdef CONFIG_BPF_SOCK_MAP + int (*update_proto)(struct sock *sk, bool restore); +#endif /* Keeping track of sockets in use */ #ifdef CONFIG_PROC_FS diff --git a/include/net/tcp.h b/include/net/tcp.h index f7591768525d..c2fff35859b6 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -2183,7 +2183,7 @@ struct sk_msg; struct sk_psock; #ifdef CONFIG_BPF_SOCK_MAP -struct proto *tcp_bpf_get_proto(struct sock *sk, struct sk_psock *psock); +int tcp_bpf_update_proto(struct sock *sk, bool restore); void tcp_bpf_clone(const struct sock *sk, struct sock *newsk); #else static inline void tcp_bpf_clone(const struct sock *sk, struct sock *newsk) diff --git a/include/net/udp.h b/include/net/udp.h index 0ff921e6b866..e3e5dfc8e0f0 100644 --- a/include/net/udp.h +++ b/include/net/udp.h @@ -513,7 +513,7 @@ static inline struct sk_buff *udp_rcv_segment(struct sock *sk, #ifdef CONFIG_BPF_SOCK_MAP struct sk_psock; -struct proto *udp_bpf_get_proto(struct sock *sk, struct sk_psock *psock); +int udp_bpf_update_proto(struct sock *sk, bool restore); #endif /* CONFIG_BPF_SOCK_MAP */ #endif /* _UDP_H */ diff --git a/net/core/sock_map.c b/net/core/sock_map.c index f827f1ecefcc..255067e5c73a 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -181,26 +181,10 @@ static void sock_map_unref(struct sock *sk, void *link_raw) static int sock_map_init_proto(struct sock *sk, struct sk_psock *psock) { - struct proto *prot; - - switch (sk->sk_type) { - case SOCK_STREAM: - prot = tcp_bpf_get_proto(sk, psock); - break; - - case SOCK_DGRAM: - prot = udp_bpf_get_proto(sk, psock); - break; - - default: + if (!sk->sk_prot->update_proto) return -EINVAL; - } - - if (IS_ERR(prot)) - return PTR_ERR(prot); - - sk_psock_update_proto(sk, psock, prot); - return 0; + psock->saved_update_proto = sk->sk_prot->update_proto; + return sk->sk_prot->update_proto(sk, false); } static struct sk_psock 
*sock_map_psock_get_checked(struct sock *sk) diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index 2252f1d90676..16e00802ccba 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -601,19 +601,33 @@ static int tcp_bpf_assert_proto_ops(struct proto *ops) ops->sendpage == tcp_sendpage ? 0 : -ENOTSUPP; } -struct proto *tcp_bpf_get_proto(struct sock *sk, struct sk_psock *psock) +int tcp_bpf_update_proto(struct sock *sk, bool restore) { + struct sk_psock *psock = sk_psock(sk); int family = sk->sk_family == AF_INET6 ? TCP_BPF_IPV6 : TCP_BPF_IPV4; int config = psock->progs.msg_parser ? TCP_BPF_TX : TCP_BPF_BASE; + if (restore) { + if (inet_csk_has_ulp(sk)) { + tcp_update_ulp(sk, psock->sk_proto, psock->saved_write_space); + } else { + sk->sk_write_space = psock->saved_write_space; + /* Pairs with lockless read in sk_clone_lock() */ + WRITE_ONCE(sk->sk_prot, psock->sk_proto); + } + return 0; + } + if (sk->sk_family == AF_INET6) { if (tcp_bpf_assert_proto_ops(psock->sk_proto)) - return ERR_PTR(-EINVAL); + return -EINVAL; tcp_bpf_check_v6_needs_rebuild(psock->sk_proto); } - return &tcp_bpf_prots[family][config]; + /* Pairs with lockless read in sk_clone_lock() */ + WRITE_ONCE(sk->sk_prot, &tcp_bpf_prots[family][config]); + return 0; } /* If a child got cloned from a listening socket that had tcp_bpf diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 62b6fd385a47..d7c30b762cc3 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -2803,6 +2803,9 @@ struct proto tcp_prot = { .hash = inet_hash, .unhash = inet_unhash, .get_port = inet_csk_get_port, +#ifdef CONFIG_BPF_SOCK_MAP + .update_proto = tcp_bpf_update_proto, +#endif .enter_memory_pressure = tcp_enter_memory_pressure, .leave_memory_pressure = tcp_leave_memory_pressure, .stream_memory_free = tcp_stream_memory_free, diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index c67e483fce41..84ab4f2e874a 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -2843,6 +2843,9 @@ struct proto udp_prot = { .unhash 
= udp_lib_unhash, .rehash = udp_v4_rehash, .get_port = udp_v4_get_port, +#ifdef CONFIG_BPF_SOCK_MAP + .update_proto = udp_bpf_update_proto, +#endif .memory_allocated = &udp_memory_allocated, .sysctl_mem = sysctl_udp_mem, .sysctl_wmem_offset = offsetof(struct net, ipv4.sysctl_udp_wmem_min), diff --git a/net/ipv4/udp_bpf.c b/net/ipv4/udp_bpf.c index 7a94791efc1a..595836088e85 100644 --- a/net/ipv4/udp_bpf.c +++ b/net/ipv4/udp_bpf.c @@ -41,12 +41,22 @@ static int __init udp_bpf_v4_build_proto(void) } core_initcall(udp_bpf_v4_build_proto); -struct proto *udp_bpf_get_proto(struct sock *sk, struct sk_psock *psock) +int udp_bpf_update_proto(struct sock *sk, bool restore) { int family = sk->sk_family == AF_INET ? UDP_BPF_IPV4 : UDP_BPF_IPV6; + struct sk_psock *psock = sk_psock(sk); + + if (restore) { + sk->sk_write_space = psock->saved_write_space; + /* Pairs with lockless read in sk_clone_lock() */ + WRITE_ONCE(sk->sk_prot, psock->sk_proto); + return 0; + } if (sk->sk_family == AF_INET6) udp_bpf_check_v6_needs_rebuild(psock->sk_proto); - return &udp_bpf_prots[family]; + /* Pairs with lockless read in sk_clone_lock() */ + WRITE_ONCE(sk->sk_prot, &udp_bpf_prots[family]); + return 0; } diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 8539715ff035..77b11799a3fe 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -2131,6 +2131,9 @@ struct proto tcpv6_prot = { .hash = inet6_hash, .unhash = inet_unhash, .get_port = inet_csk_get_port, +#ifdef CONFIG_BPF_SOCK_MAP + .update_proto = tcp_bpf_update_proto, +#endif .enter_memory_pressure = tcp_enter_memory_pressure, .leave_memory_pressure = tcp_leave_memory_pressure, .stream_memory_free = tcp_stream_memory_free, diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index a02ac875a923..66ebdfc83c95 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -1711,6 +1711,9 @@ struct proto udpv6_prot = { .unhash = udp_lib_unhash, .rehash = udp_v6_rehash, .get_port = udp_v6_get_port, +#ifdef CONFIG_BPF_SOCK_MAP + .update_proto = 
udp_bpf_update_proto, +#endif .memory_allocated = &udp_memory_allocated, .sysctl_mem = sysctl_udp_mem, .sysctl_wmem_offset = offsetof(struct net, ipv4.sysctl_udp_wmem_min), From patchwork Wed Feb 3 04:16:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12063385 X-Patchwork-Delegate: bpf@iogearbox.net
From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 07/19] udp: implement ->sendmsg_locked() Date: Tue, 2 Feb 2021 20:16:24 -0800 Message-Id:
<20210203041636.38555-8-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: <20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang UDP already has udp_sendmsg() which takes lock_sock() inside. We have to build ->sendmsg_locked() on top of it, by adding a new parameter for whether the sock has been locked. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- include/net/udp.h | 1 + net/ipv4/af_inet.c | 1 + net/ipv4/udp.c | 30 +++++++++++++++++++++++------- 3 files changed, 25 insertions(+), 7 deletions(-) diff --git a/include/net/udp.h b/include/net/udp.h index e3e5dfc8e0f0..13f9354dbd3e 100644 --- a/include/net/udp.h +++ b/include/net/udp.h @@ -289,6 +289,7 @@ int udp_get_port(struct sock *sk, unsigned short snum, int udp_err(struct sk_buff *, u32); int udp_abort(struct sock *sk, int err); int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len); +int udp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t len); int udp_push_pending_frames(struct sock *sk); void udp_flush_pending_frames(struct sock *sk); int udp_cmsg_send(struct sock *sk, struct msghdr *msg, u16 *gso_size); diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c index aaa94bea19c3..d184d9379a92 100644 --- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -1071,6 +1071,7 @@ const struct proto_ops inet_dgram_ops = { .setsockopt = sock_common_setsockopt, .getsockopt = sock_common_getsockopt, .sendmsg = inet_sendmsg, + .sendmsg_locked = udp_sendmsg_locked, .recvmsg = inet_recvmsg, .mmap = sock_no_mmap, .sendpage = inet_sendpage, diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 84ab4f2e874a..635e1e8b2968 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1018,7 +1018,7 @@ int udp_cmsg_send(struct sock *sk, struct msghdr *msg, u16 
*gso_size) } EXPORT_SYMBOL_GPL(udp_cmsg_send); -int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) +static int __udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len, bool locked) { struct inet_sock *inet = inet_sk(sk); struct udp_sock *up = udp_sk(sk); @@ -1057,15 +1057,18 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) * There are pending frames. * The socket lock must be held while it's corked. */ - lock_sock(sk); + if (!locked) + lock_sock(sk); if (likely(up->pending)) { if (unlikely(up->pending != AF_INET)) { - release_sock(sk); + if (!locked) + release_sock(sk); return -EINVAL; } goto do_append_data; } - release_sock(sk); + if (!locked) + release_sock(sk); } ulen += sizeof(struct udphdr); @@ -1235,11 +1238,13 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) goto out; } - lock_sock(sk); + if (!locked) + lock_sock(sk); if (unlikely(up->pending)) { /* The socket is already corked while preparing it. */ /* ... which is an evident application bug. 
--ANK */ - release_sock(sk); + if (!locked) + release_sock(sk); net_dbg_ratelimited("socket already corked\n"); err = -EINVAL; @@ -1266,7 +1271,8 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) err = udp_push_pending_frames(sk); else if (unlikely(skb_queue_empty(&sk->sk_write_queue))) up->pending = 0; - release_sock(sk); + if (!locked) + release_sock(sk); out: ip_rt_put(rt); @@ -1296,8 +1302,18 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) err = 0; goto out; } + +int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) +{ + return __udp_sendmsg(sk, msg, len, false); +} EXPORT_SYMBOL(udp_sendmsg); +int udp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t len) +{ + return __udp_sendmsg(sk, msg, len, true); +} + int udp_sendpage(struct sock *sk, struct page *page, int offset, size_t size, int flags) { From patchwork Wed Feb 3 04:16:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12063381 X-Patchwork-Delegate: bpf@iogearbox.net
From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 08/19] udp: implement ->read_sock() for sockmap Date: Tue, 2 Feb 2021 20:16:25 -0800 Message-Id: <20210203041636.38555-9-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: <20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- include/net/udp.h | 2 ++ net/ipv4/af_inet.c | 1 + net/ipv4/udp.c | 34 ++++++++++++++++++++++++++++++++++ 3 files changed, 37 insertions(+) diff --git a/include/net/udp.h b/include/net/udp.h index 13f9354dbd3e..b6b75cabf4e4 100644 --- a/include/net/udp.h +++ b/include/net/udp.h @@ -327,6 +327,8 @@ struct sock *__udp6_lib_lookup(struct net *net, struct sk_buff *skb); struct sock *udp6_lib_lookup_skb(const struct sk_buff *skb, __be16 sport, __be16 dport); +int udp_read_sock(struct sock *sk, read_descriptor_t *desc, + sk_read_actor_t recv_actor); /* UDP uses skb->dev_scratch to cache as much information as possible and avoid * possibly multiple cache miss on dequeue() diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c index d184d9379a92..4a4c6d3d2786 100644 --- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -1072,6 +1072,7 @@ const struct proto_ops inet_dgram_ops =
{ .getsockopt = sock_common_getsockopt, .sendmsg = inet_sendmsg, .sendmsg_locked = udp_sendmsg_locked, + .read_sock = udp_read_sock, .recvmsg = inet_recvmsg, .mmap = sock_no_mmap, .sendpage = inet_sendpage, diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 635e1e8b2968..6dffbcec0b51 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1792,6 +1792,40 @@ struct sk_buff *__skb_recv_udp(struct sock *sk, unsigned int flags, } EXPORT_SYMBOL(__skb_recv_udp); +int udp_read_sock(struct sock *sk, read_descriptor_t *desc, + sk_read_actor_t recv_actor) +{ + struct sk_buff *skb; + int copied = 0, err; + + while (1) { + int offset = 0; + + skb = __skb_recv_udp(sk, 0, 1, &offset, &err); + if (!skb) + break; + if (offset < skb->len) { + int used; + size_t len; + + len = skb->len - offset; + used = recv_actor(desc, skb, offset, len); + if (used <= 0) { + if (!copied) + copied = used; + break; + } else if (used <= len) { + copied += used; + offset += used; + } + } + if (!desc->count) + break; + } + + return copied; +} + /* * This should be easy, if there is something there we * return it, otherwise we block. 
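The control flow of the udp_read_sock() loop in the patch above can be modelled in userspace. This is a rough sketch under stated assumptions, not the kernel implementation: an array of datagrams stands in for the receive queue, and struct model_desc, struct model_dgram, model_read_sock() and model_count_actor() are hypothetical names. It mirrors three properties of the loop: the actor is called once per dequeued datagram, a non-positive return stops the loop (and is reported only if nothing was copied yet), and an exhausted desc->count stops it as well.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for read_descriptor_t: only the remaining budget (assumption). */
struct model_desc {
	size_t count;
};

/* Stand-in for one queued datagram (assumption). */
struct model_dgram {
	const char *data;
	size_t len;
};

typedef int (*model_actor_t)(struct model_desc *desc,
			     const struct model_dgram *d,
			     size_t offset, size_t len);

/* Walk the queue front to back, handing each datagram to the actor. */
static int model_read_sock(const struct model_dgram *queue, size_t n,
			   struct model_desc *desc, model_actor_t actor)
{
	int copied = 0;

	for (size_t i = 0; i < n; i++) {
		size_t offset = 0;

		if (offset < queue[i].len) {
			size_t len = queue[i].len - offset;
			int used = actor(desc, &queue[i], offset, len);

			if (used <= 0) {
				/* Report the error only if nothing was copied. */
				if (!copied)
					copied = used;
				break;
			}
			if ((size_t)used <= len)
				copied += used;
		}
		/* Stop once the caller's budget is used up. */
		if (!desc->count)
			break;
	}
	return copied;
}

/* Example actor: consumes up to the remaining budget and counts bytes. */
static int model_count_actor(struct model_desc *desc,
			     const struct model_dgram *d,
			     size_t offset, size_t len)
{
	size_t take = len < desc->count ? len : desc->count;

	(void)d;
	(void)offset;
	desc->count -= take;
	return (int)take;
}

/* Example actor that always fails, to exercise the error path. */
static int model_fail_actor(struct model_desc *desc,
			    const struct model_dgram *d,
			    size_t offset, size_t len)
{
	(void)desc;
	(void)d;
	(void)offset;
	(void)len;
	return -1;
}
```

As in the patch, each datagram gets a single actor call in this model, so bytes the actor does not consume from a datagram are not offered again.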
From patchwork Wed Feb 3 04:16:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12063383 X-Patchwork-Delegate: bpf@iogearbox.net
From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 09/19] udp: add ->read_sock() and ->sendmsg_locked() to ipv6 Date: Tue, 2 Feb 2021 20:16:26 -0800 Message-Id: <20210203041636.38555-10-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References:
<20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang Similarly, udpv6_sendmsg() takes lock_sock() inside too, we have to build ->sendmsg_locked() on top of it. For ->read_sock(), we can just use udp_read_sock(). Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- include/net/ipv6.h | 1 + net/ipv4/udp.c | 1 + net/ipv6/af_inet6.c | 2 ++ net/ipv6/udp.c | 27 +++++++++++++++++++++------ 4 files changed, 25 insertions(+), 6 deletions(-) diff --git a/include/net/ipv6.h b/include/net/ipv6.h index bd1f396cc9c7..48b6850dae85 100644 --- a/include/net/ipv6.h +++ b/include/net/ipv6.h @@ -1119,6 +1119,7 @@ int inet6_hash_connect(struct inet_timewait_death_row *death_row, int inet6_sendmsg(struct socket *sock, struct msghdr *msg, size_t size); int inet6_recvmsg(struct socket *sock, struct msghdr *msg, size_t size, int flags); +int udpv6_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t len); /* * reassembly.c diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 6dffbcec0b51..3acb1be73131 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1825,6 +1825,7 @@ int udp_read_sock(struct sock *sk, read_descriptor_t *desc, return copied; } +EXPORT_SYMBOL(udp_read_sock); /* * This should be easy, if there is something there we diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c index f091fe9b4da5..63c2d024f572 100644 --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -714,7 +714,9 @@ const struct proto_ops inet6_dgram_ops = { .setsockopt = sock_common_setsockopt, /* ok */ .getsockopt = sock_common_getsockopt, /* ok */ .sendmsg = inet6_sendmsg, /* retpoline's sake */ + .sendmsg_locked = udpv6_sendmsg_locked, .recvmsg = inet6_recvmsg, /* retpoline's sake */ + .read_sock = udp_read_sock, .mmap = sock_no_mmap, .sendpage = sock_no_sendpage, .set_peek_off = sk_set_peek_off, diff --git 
a/net/ipv6/udp.c b/net/ipv6/udp.c index 66ebdfc83c95..c52ea171060d 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -1272,7 +1272,7 @@ static int udp_v6_push_pending_frames(struct sock *sk) return err; } -int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) +static int __udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len, bool locked) { struct ipv6_txoptions opt_space; struct udp_sock *up = udp_sk(sk); @@ -1361,7 +1361,8 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) * There are pending frames. * The socket lock must be held while it's corked. */ - lock_sock(sk); + if (!locked) + lock_sock(sk); if (likely(up->pending)) { if (unlikely(up->pending != AF_INET6)) { release_sock(sk); @@ -1370,7 +1371,8 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) dst = NULL; goto do_append_data; } - release_sock(sk); + if (!locked) + release_sock(sk); } ulen += sizeof(struct udphdr); @@ -1533,11 +1535,13 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) goto out; } - lock_sock(sk); + if (!locked) + lock_sock(sk); if (unlikely(up->pending)) { /* The socket is already corked while preparing it. */ /* ... which is an evident application bug. --ANK */ - release_sock(sk); + if (!locked) + release_sock(sk); net_dbg_ratelimited("udp cork app bug 2\n"); err = -EINVAL; @@ -1562,7 +1566,8 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) if (err > 0) err = np->recverr ? 
net_xmit_errno(err) : 0; - release_sock(sk); + if (!locked) + release_sock(sk); out: dst_release(dst); @@ -1593,6 +1598,16 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) goto out; } +int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) +{ + return __udpv6_sendmsg(sk, msg, len, false); +} + +int udpv6_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t len) +{ + return __udpv6_sendmsg(sk, msg, len, true); +} + void udpv6_destroy_sock(struct sock *sk) { struct udp_sock *up = udp_sk(sk); From patchwork Wed Feb 3 04:16:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12063387 X-Patchwork-Delegate: bpf@iogearbox.net
From: Cong
Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 10/19] af_unix: implement ->sendmsg_locked for dgram socket Date: Tue, 2 Feb 2021 20:16:27 -0800 Message-Id: <20210203041636.38555-11-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: <20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang We already have unix_dgram_sendmsg(), so we can simply build its ->sendmsg_locked() on top of it. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- net/unix/af_unix.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 41c3303c3357..4e1fa4ecbcfb 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -659,6 +659,7 @@ static ssize_t unix_stream_sendpage(struct socket *, struct page *, int offset, static ssize_t unix_stream_splice_read(struct socket *, loff_t *ppos, struct pipe_inode_info *, size_t size, unsigned int flags); +static int __unix_dgram_sendmsg(struct sock*, struct msghdr *, size_t); static int unix_dgram_sendmsg(struct socket *, struct msghdr *, size_t); static int unix_dgram_recvmsg(struct socket *, struct msghdr *, size_t, int); static int unix_dgram_connect(struct socket *, struct sockaddr *, @@ -738,6 +739,7 @@ static const struct proto_ops unix_dgram_ops = { .listen = sock_no_listen, .shutdown = unix_shutdown, .sendmsg = unix_dgram_sendmsg, + .sendmsg_locked = __unix_dgram_sendmsg, .recvmsg = unix_dgram_recvmsg, .mmap = sock_no_mmap, .sendpage = sock_no_sendpage, @@ -1611,10 +1613,10 @@ static void scm_stat_del(struct sock *sk, struct 
sk_buff *skb) * Send AF_UNIX data. */ -static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg, - size_t len) +static int __unix_dgram_sendmsg(struct sock *sk, struct msghdr *msg, + size_t len) { - struct sock *sk = sock->sk; + struct socket *sock = sk->sk_socket; struct net *net = sock_net(sk); struct unix_sock *u = unix_sk(sk); DECLARE_SOCKADDR(struct sockaddr_un *, sunaddr, msg->msg_name); @@ -1814,6 +1816,12 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg, return err; } +static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg, + size_t len) +{ + return __unix_dgram_sendmsg(sock->sk, msg, len); +} + /* We use paged skbs for stream sockets, and limit occupancy to 32768 * bytes, and a minimum of a full page. */ From patchwork Wed Feb 3 04:16:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12063393 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.9 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNWANTED_LANGUAGE_BODY, URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A2435C43219 for ; Wed, 3 Feb 2021 04:20:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6F0B164F91 for ; Wed, 3 Feb 2021 04:20:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233033AbhBCETy (ORCPT ); Tue, 2 Feb 2021 23:19:54 -0500 Received: from lindbergh.monkeyblade.net 
([23.128.96.19]:36756 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231944AbhBCES1 (ORCPT ); Tue, 2 Feb 2021 23:18:27 -0500 Received: from mail-ot1-x336.google.com (mail-ot1-x336.google.com [IPv6:2607:f8b0:4864:20::336]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BBA5EC061797; Tue, 2 Feb 2021 20:17:10 -0800 (PST) Received: by mail-ot1-x336.google.com with SMTP id d7so22161932otf.3; Tue, 02 Feb 2021 20:17:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=FbcI6wt1YBzeWqrusTkpbaZfnU8cH2xe50gLcZjha8Y=; b=bTgpZ8EpJ+U6hBDnEOA1mhM+6MfCw0fSbK9KY39fKP6MA1CytaAePTJjaFfL2nBkU2 +TmPYYU3yeWxWm/U2JfCk2zrDl+JHfIFKisv7oQXxFdALRcLiQ1rqoeqcFUstb06eCb2 N/roDNssfhs33lO730uAnym3KqWZqACvrwJSSWataNBTMEwVc/Ff3szcNQkpxE8ERFdY C31QPsTmdd4IJwKCXn2OjFrI19fUH0j+6LXAvRJ/TAcEtl09ztHoQqfGD6fCZUdg01Oa vNXxZEwGpB4L1WwaXKfuyZu5gBoQJ8h1qzFib3gL+SxXvSVurgwX6vyYuZSk7YFK3TQK 0TFw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=FbcI6wt1YBzeWqrusTkpbaZfnU8cH2xe50gLcZjha8Y=; b=eZGnXsqzuDOArdUdpaAB5zMNBzQa5mkJU9YtoK8T2VPM4VOplJ5hNlRBp/wsZn/hQg QYH9dYN46A1m1wqXx47LMEsHzcXMcflyEExS7bULxUZmteDORUrdIMWAi1sOARsMcbLp 5fhGmFRPjB6YVZXZx+y8LAZEa4jRwVuhRJyH+rV4puuycU3oHuuo2CaGQuFAOKHHbqjR 4Q8NJFHeXdzZ3WZS6bATuXEHukN2U6dxCo0sRZeeA+yYEUA7Rz8NkvM96Ky6G8ItUmTb 3jOQUrtnRpwAOdJAvN9UUf5nDYpMfOZVnCzKc4WFXss/CiNyogRME/eD2tiJlMXjmhCn GqoA== X-Gm-Message-State: AOAM532K5SXXEr/562ywzybQzK+MdwMDWybg7xMmO/bBLy1Lx0X/cz0I 0wYyf1jVxG7I9G4KMyz/DzREyJV3tBBELQ== X-Google-Smtp-Source: ABdhPJxzQ9wsLkDoFo4vhjLHrIA+8tYBqV99ZetqsvSJCXTrcFWw0ItPVRG9mIMxga/h5rlJzLfaIg== X-Received: by 2002:a9d:6c85:: with SMTP id c5mr760176otr.300.1612325830009; Tue, 02 Feb 2021 20:17:10 -0800 
(PST) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:90c4:ffea:6079:8a0c]) by smtp.gmail.com with ESMTPSA id s10sm209978ool.35.2021.02.02.20.17.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Feb 2021 20:17:09 -0800 (PST) From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 11/19] af_unix: implement ->read_sock() for sockmap Date: Tue, 2 Feb 2021 20:16:28 -0800 Message-Id: <20210203041636.38555-12-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: <20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- net/unix/af_unix.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 4e1fa4ecbcfb..9315c4f4c27a 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -662,6 +662,7 @@ static ssize_t unix_stream_splice_read(struct socket *, loff_t *ppos, static int __unix_dgram_sendmsg(struct sock*, struct msghdr *, size_t); static int unix_dgram_sendmsg(struct socket *, struct msghdr *, size_t); static int unix_dgram_recvmsg(struct socket *, struct msghdr *, size_t, int); +int unix_read_sock(struct sock *sk, read_descriptor_t *desc, sk_read_actor_t recv_actor); static int unix_dgram_connect(struct socket *, struct sockaddr *, int, int); static int unix_seqpacket_sendmsg(struct socket *, struct msghdr *, size_t); @@ -739,6 +740,7 @@ static const struct proto_ops unix_dgram_ops = { .listen = sock_no_listen, .shutdown = unix_shutdown, .sendmsg = 
unix_dgram_sendmsg, + .read_sock = unix_read_sock, .sendmsg_locked = __unix_dgram_sendmsg, .recvmsg = unix_dgram_recvmsg, .mmap = sock_no_mmap, @@ -2190,6 +2192,50 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, return err; } +int unix_read_sock(struct sock *sk, read_descriptor_t *desc, + sk_read_actor_t recv_actor) +{ + unsigned int flags = MSG_DONTWAIT; + struct unix_sock *u = unix_sk(sk); + struct sk_buff *skb; + int copied = 0; + + while (1) { + int offset, err; + + mutex_lock(&u->iolock); + skb = __skb_recv_datagram(sk, &sk->sk_receive_queue, flags, + &offset, &err); + if (!skb) { + mutex_unlock(&u->iolock); + break; + } + + if (offset < skb->len) { + int used; + size_t len; + + len = skb->len - offset; + used = recv_actor(desc, skb, offset, len); + if (used <= 0) { + if (!copied) + copied = used; + mutex_unlock(&u->iolock); + break; + } else if (used <= len) { + copied += used; + offset += used; + } + } + mutex_unlock(&u->iolock); + + if (!desc->count) + break; + } + + return copied; +} + /* * Sleep until more data has arrived. But check for races.. 
*/ From patchwork Wed Feb 3 04:16:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12063399 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.9 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNWANTED_LANGUAGE_BODY, URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 722D1C4321A for ; Wed, 3 Feb 2021 04:20:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4A8B364F7E for ; Wed, 3 Feb 2021 04:20:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233056AbhBCET5 (ORCPT ); Tue, 2 Feb 2021 23:19:57 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36760 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232710AbhBCES1 (ORCPT ); Tue, 2 Feb 2021 23:18:27 -0500 Received: from mail-oi1-x234.google.com (mail-oi1-x234.google.com [IPv6:2607:f8b0:4864:20::234]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4C7FBC0617A7; Tue, 2 Feb 2021 20:17:12 -0800 (PST) Received: by mail-oi1-x234.google.com with SMTP id a77so25393520oii.4; Tue, 02 Feb 2021 20:17:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=I8srG1L59T5lSA/sx67czDaMe/6mWiAIpF1LBobm/C8=; b=QUKYtXCfh0UBl/bdGnwunxOtACNvBs9dBnbmcTrJ7hHwRtG1ouO8NLCkYwNFQFZEbV 
f+dA61CT7JCiZ5+vin/uJGUccemujooRl2V4mA+RlpFTD79STZUetOqg7JfEO1chhVtf fpgIdQ2061t+NKr9r4ULHJDlWuFajw8OHMsoBhmXdB4Km7miTfaQ+A6lv8pXZow+dO96 Iy+BrDTXkOasf6fLNqvFBAdxAUqKj8AMiWrkTi0GWHoHATGApXWWrEY9dU1f557FFh34 mBv4uLZDx2MFjexS4w3vj2VyVkJ7sOz9GM3QuIXPyxKgj9Eu8jphlUBtU70ZOOgMZtkt KNVw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=I8srG1L59T5lSA/sx67czDaMe/6mWiAIpF1LBobm/C8=; b=hvLCRIALYAolPRld7L7Mw41a34fb2+pL3AGeqW3jW5H7eEoyYpG7qfCI5JQnAXu0pb /XQiUr2a3Vff6oHfNG55+6LIEa5TdFuloyNwtMFCBwRjkvnkCys2/UpKfs7uORGR7HAX cd7XUCcQ/GWX4N1mPCH4q6tUM+f2C8IqFnZSV8MUrVD/uoQKE+de41gfi6s8bEdvm3/U sQt7KCAhRTHOxS5NEZFSuYBrS2BnwwXMUito0Aijcf6u9+BLQx3MvTXuq0jcF/+HKMm4 M4LYH7zlD6o1w6M736P4U1DPjlS564P78CZ0b8MzdBW5irESXD+MaIJVCRS63PhvB1NC afAw== X-Gm-Message-State: AOAM530chb8QVZeuR23YREAe44yfxC1RyuaBWRtw77io0JamxW6IP14W 6FOGPu4yHyzj3g6aRbDRiBf094I+nKYabg== X-Google-Smtp-Source: ABdhPJzoP42CF4cWCYXDjP5S2b6u//MVgpOumSRK4FCz9fDlVcwcEbpF3oUWOBe/5hgwt29qVBW7+Q== X-Received: by 2002:aca:b683:: with SMTP id g125mr789725oif.47.1612325831586; Tue, 02 Feb 2021 20:17:11 -0800 (PST) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:90c4:ffea:6079:8a0c]) by smtp.gmail.com with ESMTPSA id s10sm209978ool.35.2021.02.02.20.17.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Feb 2021 20:17:10 -0800 (PST) From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 12/19] af_unix: implement ->update_proto() Date: Tue, 2 Feb 2021 20:16:29 -0800 Message-Id: <20210203041636.38555-13-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: 
<20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang unix_proto is special: unlike the INET protos, it does not even have a ->close(). We have to add a dummy one to satisfy sockmap. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- MAINTAINERS | 1 + include/net/af_unix.h | 10 +++++++++ net/unix/Makefile | 1 + net/unix/af_unix.c | 12 ++++++++++- net/unix/unix_bpf.c | 50 +++++++++++++++++++++++++++++++++++++++++++ 5 files changed, 73 insertions(+), 1 deletion(-) create mode 100644 net/unix/unix_bpf.c diff --git a/MAINTAINERS b/MAINTAINERS index 1df56a32d2df..1fa3971c45b0 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -9950,6 +9950,7 @@ F: net/core/skmsg.c F: net/core/sock_map.c F: net/ipv4/tcp_bpf.c F: net/ipv4/udp_bpf.c +F: net/unix/unix_bpf.c LANTIQ / INTEL Ethernet drivers M: Hauke Mehrtens diff --git a/include/net/af_unix.h b/include/net/af_unix.h index f42fdddecd41..fa75f899e88a 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -89,4 +89,14 @@ void unix_sysctl_unregister(struct net *net); static inline int unix_sysctl_register(struct net *net) { return 0; } static inline void unix_sysctl_unregister(struct net *net) {} #endif + +extern struct proto unix_proto; + +#ifdef CONFIG_BPF_SOCK_MAP +int unix_bpf_update_proto(struct sock *sk, bool restore); +void __init unix_bpf_build_proto(void); +#else +static inline void __init unix_bpf_build_proto(void) +{} +#endif #endif diff --git a/net/unix/Makefile b/net/unix/Makefile index 54e58cc4f945..7d2c70c575b6 100644 --- a/net/unix/Makefile +++ b/net/unix/Makefile @@ -7,6 +7,7 @@ obj-$(CONFIG_UNIX) += unix.o unix-y := af_unix.o garbage.o unix-$(CONFIG_SYSCTL) += sysctl_net_unix.o +unix-$(CONFIG_BPF_SOCK_MAP) += unix_bpf.o obj-$(CONFIG_UNIX_DIAG) += unix_diag.o unix_diag-y := diag.o diff --git 
a/net/unix/af_unix.c b/net/unix/af_unix.c index 9315c4f4c27a..4ce12d3c369e 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -773,10 +773,18 @@ static const struct proto_ops unix_seqpacket_ops = { .show_fdinfo = unix_show_fdinfo, }; -static struct proto unix_proto = { +static void unix_close(struct sock *sk, long timeout) +{ +} + +struct proto unix_proto = { .name = "UNIX", .owner = THIS_MODULE, .obj_size = sizeof(struct unix_sock), + .close = unix_close, +#ifdef CONFIG_BPF_SOCK_MAP + .update_proto = unix_bpf_update_proto, +#endif }; static struct sock *unix_create1(struct net *net, struct socket *sock, int kern) @@ -861,6 +869,7 @@ static int unix_release(struct socket *sock) return 0; unix_release_sock(sk, 0); + sk->sk_prot->close(sk, 0); sock->sk = NULL; return 0; @@ -2973,6 +2982,7 @@ static int __init af_unix_init(void) sock_register(&unix_family_ops); register_pernet_subsys(&unix_net_ops); + unix_bpf_build_proto(); out: return rc; } diff --git a/net/unix/unix_bpf.c b/net/unix/unix_bpf.c new file mode 100644 index 000000000000..2e6a26ec4958 --- /dev/null +++ b/net/unix/unix_bpf.c @@ -0,0 +1,50 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2021 Cong Wang */ + +#include +#include +#include + +static struct proto *unix_prot_saved __read_mostly; +static DEFINE_SPINLOCK(unix_prot_lock); +static struct proto unix_bpf_prot; + +static void unix_bpf_rebuild_protos(struct proto *prot, const struct proto *base) +{ + *prot = *base; + prot->close = sock_map_close; +} + +static void unix_bpf_check_needs_rebuild(struct proto *ops) +{ + if (unlikely(ops != smp_load_acquire(&unix_prot_saved))) { + spin_lock_bh(&unix_prot_lock); + if (likely(ops != unix_prot_saved)) { + unix_bpf_rebuild_protos(&unix_bpf_prot, ops); + smp_store_release(&unix_prot_saved, ops); + } + spin_unlock_bh(&unix_prot_lock); + } +} + +int unix_bpf_update_proto(struct sock *sk, bool restore) +{ + struct sk_psock *psock = sk_psock(sk); + + if (restore) { + sk->sk_write_space = 
psock->saved_write_space; + /* Pairs with lockless read in sk_clone_lock() */ + WRITE_ONCE(sk->sk_prot, psock->sk_proto); + return 0; + } + + unix_bpf_check_needs_rebuild(psock->sk_proto); + /* Pairs with lockless read in sk_clone_lock() */ + WRITE_ONCE(sk->sk_prot, &unix_bpf_prot); + return 0; +} + +void __init unix_bpf_build_proto(void) +{ + unix_bpf_rebuild_protos(&unix_bpf_prot, &unix_proto); +} From patchwork Wed Feb 3 04:16:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12063391 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EE9E8C43333 for ; Wed, 3 Feb 2021 04:20:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A410164F72 for ; Wed, 3 Feb 2021 04:20:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233071AbhBCET6 (ORCPT ); Tue, 2 Feb 2021 23:19:58 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36768 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232763AbhBCESc (ORCPT ); Tue, 2 Feb 2021 23:18:32 -0500 Received: from mail-ot1-x32b.google.com (mail-ot1-x32b.google.com [IPv6:2607:f8b0:4864:20::32b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B8FD0C0617A9; Tue, 2 Feb 2021 20:17:13 -0800 (PST) Received: by mail-ot1-x32b.google.com 
with SMTP id t25so12038880otc.5; Tue, 02 Feb 2021 20:17:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=0wUnpV/obp82NreuLykxFGGqjFKMgAiVjOGGr0jGMcU=; b=FaGgsDLSq3mLL5v/I5xuWWu9aBFL39PaQM+K6AwAgWudvaWJ+VlBDlac1u5l/onpmJ ph053zX+eODmxz88QrlBV841g/lBt0yPmMfVLIkP/IELFj6ESgOpZxsn/2992HJQaum0 3+iOF7pL6eQ8BD0XysPsJe9NsgdM0J5lQG5YZKTf4+pDOIyf+kym2OdoK/DRbfL0wh3K 08qgzQjdVZZ5D48e6jbHALTKGIF3SmZD1Qh+Tgypi9gbr1O3hUmMVpWtmxRcCURwQWbV N4lj/wfPfJQ1PppK/6jLrkTobNOAPyWs9TENXO4bhyJfDg/nC/9uDF9R+KiD/GKhMxHo Ij7A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=0wUnpV/obp82NreuLykxFGGqjFKMgAiVjOGGr0jGMcU=; b=POci3EdBG34/2xYlS2hVuZCSu1EN+BHa9WGgIhem0k1e4d3wKVPUYxYy5ejV8KLSGC 9yw4w2Gia5Ve8rlKF6PgLo7OTJk0OzldFMPZUPpiWVa+9UHC+dj7znTwrClo8K8aLGC/ BWe4fdP7NBBr1alib+SDDAqw6I47/jpiDp/nKx60qxCBPqVw2Um802EXH/LQjRSv3UCh Ejyldoecxm/oIaQZTvzO3YIlVzfnvsfKFxL9fX4RntDsdYYJ7DkOttirFkrgbGTeesR0 etg807B19HhD1TmbkCofxi9VEwqnVvPAt8L89DdkMwQzJwmo0iXq7iz2z5302cL0vz66 IDbg== X-Gm-Message-State: AOAM532pFQTEnRmjcX87ZurrEEpsGU5vSUJlPQlCXpxjFfZA+UJB02wH sLRDkgLp6Idk4kd1s+LiFUW8X8fuZnG9xg== X-Google-Smtp-Source: ABdhPJzhbdskf5D/hsrcvveDEXNmR6SeUuWh+YI3Ja+QIejYn7K0rt7S+CsjHPSxLbDWpTqLDWRnMw== X-Received: by 2002:a9d:6852:: with SMTP id c18mr788253oto.166.1612325833043; Tue, 02 Feb 2021 20:17:13 -0800 (PST) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:90c4:ffea:6079:8a0c]) by smtp.gmail.com with ESMTPSA id s10sm209978ool.35.2021.02.02.20.17.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Feb 2021 20:17:12 -0800 (PST) From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, 
Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 13/19] af_unix: set TCP_ESTABLISHED for datagram sockets too Date: Tue, 2 Feb 2021 20:16:30 -0800 Message-Id: <20210203041636.38555-14-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: <20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang Currently only unix stream sockets set TCP_ESTABLISHED; datagram sockets can set it too when they connect to their peer socket. __ip4_datagram_connect(), at least, does the same. This will be used by the next patch to determine whether an AF_UNIX datagram socket can be redirected in sockmap. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- net/unix/af_unix.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 4ce12d3c369e..21c4406f879b 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -1206,6 +1206,8 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr, unix_peer(sk) = other; unix_state_double_unlock(sk, other); } + + sk->sk_state = other->sk_state = TCP_ESTABLISHED; return 0; out_unlock: @@ -1438,12 +1440,10 @@ static int unix_socketpair(struct socket *socka, struct socket *sockb) init_peercred(ska); init_peercred(skb); - if (ska->sk_type != SOCK_DGRAM) { - ska->sk_state = TCP_ESTABLISHED; - skb->sk_state = TCP_ESTABLISHED; - socka->state = SS_CONNECTED; - sockb->state = SS_CONNECTED; - } + ska->sk_state = TCP_ESTABLISHED; + skb->sk_state = TCP_ESTABLISHED; + socka->state = SS_CONNECTED; + sockb->state = SS_CONNECTED; return 0; } From patchwork Wed Feb 3 04:16:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12063389 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id ACCA3C4332D for ; Wed, 3 Feb 2021 04:20:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 61C2164F7C for ; Wed, 3 Feb 2021 04:20:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233062AbhBCET5 (ORCPT ); Tue, 2 Feb 2021 23:19:57 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36770 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232767AbhBCESa (ORCPT ); Tue, 2 Feb 2021 23:18:30 -0500 Received: from mail-ot1-x333.google.com (mail-ot1-x333.google.com [IPv6:2607:f8b0:4864:20::333]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5F178C0617AA; Tue, 2 Feb 2021 20:17:15 -0800 (PST) Received: by mail-ot1-x333.google.com with SMTP id f6so22095833ots.9; Tue, 02 Feb 2021 20:17:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ffOSXHVfVukIRx64nQpYmmUG63thxvE/1brmM+JdKMs=; b=fJUbts7FCGnPppInUfONfJXujv3VIsSNHcxo918F3cp9xP4h7nq2Uav+3Wppjzpk8R HkB0SesEzijeJuNfXyjKIxpp3p/OhtS12CY6xqqlmj2N7txw3sqryBUhBxAWsSvCUogR OVN7gWyW9qfQgM5NKS0YlrKiIb8afyfiRXQ9gXiCbZN/qd4HMNjBk07Ez+PHxDbkqa9t 
FN20E4vf4d7kOmTJLw151lZqtCCv5BD2jePtZ1IuMzc1hUBfzKlE9m8s9Xi2MdmTkUuU XwzQUFJKxFUHFFwtVgywH7djnS2v7N6QaebiZ0KyySE2//cqf3NkQ+dpac9Vv5HEdCZz b1SA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ffOSXHVfVukIRx64nQpYmmUG63thxvE/1brmM+JdKMs=; b=MckenP15hOUHS60V4MqXjlAEbC4dX8mneRjNOdj5XxuZo/zipYT2aouz44uKWU6kmJ LpBVPTcO2WQgBy/18DtQT9jZN5PVICk05qEXBt/wH8tdjcIXg+nAlDhAiLFx0sQKnAwa F5AOAloiubekwL8au3oCADDy3gB+3xmVSXc7tO9GzXeXKiThj0qbtKHzUHMTbRTBXnsg wYvzhlCU/5DJaMP5xdRJhLnZhaIh7l70ZwejJNs2deyQHO2aN23hsPZGrogZx3dfPNYf db5ROO3FpgnkuiG7XkBXz4PTiHpZepN6fzEIPA8u2WUKtdkVaA+suhhoQzIZo/Y3ptVK btsQ== X-Gm-Message-State: AOAM531WlwkITOYCuRcXDXWITfd4PlO3Digj0iYI65nhNPHhmzll8/k+ eSHfbU6xhAB7vYQE7vxPiRRKplpJkZW8OQ== X-Google-Smtp-Source: ABdhPJyVbJM6ZSydt4Uu3bZbe2jRyPOKOlHNwVJf8qqzIFFxqZzaxmGn0y5DarDq8xypZe+Cy9P9NQ== X-Received: by 2002:a9d:71c6:: with SMTP id z6mr807246otj.276.1612325834550; Tue, 02 Feb 2021 20:17:14 -0800 (PST) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:90c4:ffea:6079:8a0c]) by smtp.gmail.com with ESMTPSA id s10sm209978ool.35.2021.02.02.20.17.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Feb 2021 20:17:13 -0800 (PST) From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 14/19] skmsg: extract __tcp_bpf_recvmsg() and tcp_bpf_wait_data() Date: Tue, 2 Feb 2021 20:16:31 -0800 Message-Id: <20210203041636.38555-15-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: <20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org 
X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang Although these two functions are only used by TCP, they are not specific to TCP at all: both operate on skmsg and ingress_msg, so they fit well in net/core/skmsg.c. We will also need them for non-TCP sockets, so rename them, move them to skmsg.c, and export them to modules. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- include/linux/skmsg.h | 4 ++ include/net/tcp.h | 2 - net/core/skmsg.c | 104 +++++++++++++++++++++++++++++++++++++++++ net/ipv4/tcp_bpf.c | 106 +----------------------------------------- net/tls/tls_sw.c | 4 +- 5 files changed, 112 insertions(+), 108 deletions(-) diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h index cb94d0f89c08..0e52fc5521a0 100644 --- a/include/linux/skmsg.h +++ b/include/linux/skmsg.h @@ -125,6 +125,10 @@ int sk_msg_zerocopy_from_iter(struct sock *sk, struct iov_iter *from, struct sk_msg *msg, u32 bytes); int sk_msg_memcopy_from_iter(struct sock *sk, struct iov_iter *from, struct sk_msg *msg, u32 bytes); +int sk_msg_wait_data(struct sock *sk, struct sk_psock *psock, int flags, + long timeo, int *err); +int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg, + int len, int flags); static inline void sk_msg_check_to_free(struct sk_msg *msg, u32 i, u32 bytes) { diff --git a/include/net/tcp.h b/include/net/tcp.h index c2fff35859b6..b314aee5800d 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -2194,8 +2194,6 @@ static inline void tcp_bpf_clone(const struct sock *sk, struct sock *newsk) #ifdef CONFIG_NET_SOCK_MSG int tcp_bpf_sendmsg_redir(struct sock *sk, struct sk_msg *msg, u32 bytes, int flags); -int __tcp_bpf_recvmsg(struct sock *sk, struct sk_psock *psock, - struct msghdr *msg, int len, int flags); #endif /* CONFIG_NET_SOCK_MSG */ #ifdef CONFIG_CGROUP_BPF diff --git a/net/core/skmsg.c b/net/core/skmsg.c index ecbd6f0d49a5..8e3edbdf4c7c 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c 
@@ -399,6 +399,110 @@ int sk_msg_memcopy_from_iter(struct sock *sk, struct iov_iter *from, } EXPORT_SYMBOL_GPL(sk_msg_memcopy_from_iter); +int sk_msg_wait_data(struct sock *sk, struct sk_psock *psock, int flags, + long timeo, int *err) +{ + DEFINE_WAIT_FUNC(wait, woken_wake_function); + int ret = 0; + + if (sk->sk_shutdown & RCV_SHUTDOWN) + return 1; + + if (!timeo) + return ret; + + add_wait_queue(sk_sleep(sk), &wait); + sk_set_bit(SOCKWQ_ASYNC_WAITDATA, sk); + ret = sk_wait_event(sk, &timeo, + !list_empty(&psock->ingress_msg) || + !skb_queue_empty(&sk->sk_receive_queue), &wait); + sk_clear_bit(SOCKWQ_ASYNC_WAITDATA, sk); + remove_wait_queue(sk_sleep(sk), &wait); + return ret; +} +EXPORT_SYMBOL_GPL(sk_msg_wait_data); + +/* Receive sk_msg from psock->ingress_msg to @msg. */ +int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg, + int len, int flags) +{ + struct iov_iter *iter = &msg->msg_iter; + int peek = flags & MSG_PEEK; + struct sk_msg *msg_rx; + int i, copied = 0; + + msg_rx = list_first_entry_or_null(&psock->ingress_msg, + struct sk_msg, list); + + while (copied != len) { + struct scatterlist *sge; + + if (unlikely(!msg_rx)) + break; + + i = msg_rx->sg.start; + do { + struct page *page; + int copy; + + sge = sk_msg_elem(msg_rx, i); + copy = sge->length; + page = sg_page(sge); + if (copied + copy > len) + copy = len - copied; + copy = copy_page_to_iter(page, sge->offset, copy, iter); + if (!copy) + return copied ? copied : -EFAULT; + + copied += copy; + if (likely(!peek)) { + sge->offset += copy; + sge->length -= copy; + if (!msg_rx->skb) + sk_mem_uncharge(sk, copy); + msg_rx->sg.size -= copy; + + if (!sge->length) { + sk_msg_iter_var_next(i); + if (!msg_rx->skb) + put_page(page); + } + } else { + /* Lets not optimize peek case if copy_page_to_iter + * didn't copy the entire length lets just break. 
+ */ + if (copy != sge->length) + return copied; + sk_msg_iter_var_next(i); + } + + if (copied == len) + break; + } while (i != msg_rx->sg.end); + + if (unlikely(peek)) { + if (msg_rx == list_last_entry(&psock->ingress_msg, + struct sk_msg, list)) + break; + msg_rx = list_next_entry(msg_rx, list); + continue; + } + + msg_rx->sg.start = i; + if (!sge->length && msg_rx->sg.start == msg_rx->sg.end) { + list_del(&msg_rx->list); + if (msg_rx->skb) + consume_skb(msg_rx->skb); + kfree(msg_rx); + } + msg_rx = list_first_entry_or_null(&psock->ingress_msg, + struct sk_msg, list); + } + + return copied; +} +EXPORT_SYMBOL_GPL(sk_msg_recvmsg); + static struct sk_msg *sk_psock_create_ingress_msg(struct sock *sk, struct sk_buff *skb) { diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index 16e00802ccba..3c0206a4f0e0 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -10,86 +10,6 @@ #include #include -int __tcp_bpf_recvmsg(struct sock *sk, struct sk_psock *psock, - struct msghdr *msg, int len, int flags) -{ - struct iov_iter *iter = &msg->msg_iter; - int peek = flags & MSG_PEEK; - struct sk_msg *msg_rx; - int i, copied = 0; - - msg_rx = list_first_entry_or_null(&psock->ingress_msg, - struct sk_msg, list); - - while (copied != len) { - struct scatterlist *sge; - - if (unlikely(!msg_rx)) - break; - - i = msg_rx->sg.start; - do { - struct page *page; - int copy; - - sge = sk_msg_elem(msg_rx, i); - copy = sge->length; - page = sg_page(sge); - if (copied + copy > len) - copy = len - copied; - copy = copy_page_to_iter(page, sge->offset, copy, iter); - if (!copy) - return copied ? copied : -EFAULT; - - copied += copy; - if (likely(!peek)) { - sge->offset += copy; - sge->length -= copy; - if (!msg_rx->skb) - sk_mem_uncharge(sk, copy); - msg_rx->sg.size -= copy; - - if (!sge->length) { - sk_msg_iter_var_next(i); - if (!msg_rx->skb) - put_page(page); - } - } else { - /* Lets not optimize peek case if copy_page_to_iter - * didn't copy the entire length lets just break. 
- */ - if (copy != sge->length) - return copied; - sk_msg_iter_var_next(i); - } - - if (copied == len) - break; - } while (i != msg_rx->sg.end); - - if (unlikely(peek)) { - if (msg_rx == list_last_entry(&psock->ingress_msg, - struct sk_msg, list)) - break; - msg_rx = list_next_entry(msg_rx, list); - continue; - } - - msg_rx->sg.start = i; - if (!sge->length && msg_rx->sg.start == msg_rx->sg.end) { - list_del(&msg_rx->list); - if (msg_rx->skb) - consume_skb(msg_rx->skb); - kfree(msg_rx); - } - msg_rx = list_first_entry_or_null(&psock->ingress_msg, - struct sk_msg, list); - } - - return copied; -} -EXPORT_SYMBOL_GPL(__tcp_bpf_recvmsg); - static int bpf_tcp_ingress(struct sock *sk, struct sk_psock *psock, struct sk_msg *msg, u32 apply_bytes, int flags) { @@ -243,28 +163,6 @@ static bool tcp_bpf_stream_read(const struct sock *sk) return !empty; } -static int tcp_bpf_wait_data(struct sock *sk, struct sk_psock *psock, - int flags, long timeo, int *err) -{ - DEFINE_WAIT_FUNC(wait, woken_wake_function); - int ret = 0; - - if (sk->sk_shutdown & RCV_SHUTDOWN) - return 1; - - if (!timeo) - return ret; - - add_wait_queue(sk_sleep(sk), &wait); - sk_set_bit(SOCKWQ_ASYNC_WAITDATA, sk); - ret = sk_wait_event(sk, &timeo, - !list_empty(&psock->ingress_msg) || - !skb_queue_empty(&sk->sk_receive_queue), &wait); - sk_clear_bit(SOCKWQ_ASYNC_WAITDATA, sk); - remove_wait_queue(sk_sleep(sk), &wait); - return ret; -} - static int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int nonblock, int flags, int *addr_len) { @@ -284,13 +182,13 @@ static int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, } lock_sock(sk); msg_bytes_ready: - copied = __tcp_bpf_recvmsg(sk, psock, msg, len, flags); + copied = sk_msg_recvmsg(sk, psock, msg, len, flags); if (!copied) { int data, err = 0; long timeo; timeo = sock_rcvtimeo(sk, nonblock); - data = tcp_bpf_wait_data(sk, psock, flags, timeo, &err); + data = sk_msg_wait_data(sk, psock, flags, timeo, &err); if (data) { if 
(!sk_psock_queue_empty(psock)) goto msg_bytes_ready; diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index 01d933ae5f16..1dcb34dfd56b 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -1789,8 +1789,8 @@ int tls_sw_recvmsg(struct sock *sk, skb = tls_wait_data(sk, psock, flags, timeo, &err); if (!skb) { if (psock) { - int ret = __tcp_bpf_recvmsg(sk, psock, - msg, len, flags); + int ret = sk_msg_recvmsg(sk, psock, msg, len, + flags); if (ret > 0) { decrypted += ret; From patchwork Wed Feb 3 04:16:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12063395 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E3F08C43331 for ; Wed, 3 Feb 2021 04:20:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C3DBE64F68 for ; Wed, 3 Feb 2021 04:20:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233075AbhBCET7 (ORCPT ); Tue, 2 Feb 2021 23:19:59 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36712 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232852AbhBCES4 (ORCPT ); Tue, 2 Feb 2021 23:18:56 -0500 Received: from mail-oo1-xc2c.google.com (mail-oo1-xc2c.google.com [IPv6:2607:f8b0:4864:20::c2c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C1EC8C0617AB; Tue, 
2 Feb 2021 20:17:16 -0800 (PST) Received: by mail-oo1-xc2c.google.com with SMTP id z36so5707810ooi.6; Tue, 02 Feb 2021 20:17:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=qGgIBOZxfhj4RQXaRAJW/z/oxERFBXXb52q9pb05GWg=; b=ktAWrgMwRW/oX/JdYmeAPZuH2HFw7cuWSH4dOAlO7VW8gFmpW+kaA/GT+ySqF7FfI+ P90ifZbOQIdsDr+1cMVlOnFTHRBKzGU0WoB31s+LW3O9Z6jaUK305UZ4xvDi9gMG9gt4 IWgfx8fnOoF9ztmE2oSynaNmzxJWT3r+xVH0RoSIiK/S2MCGU3jkqX8wtj1NStIyIUOC bxvSGW09PiGA0eoB2FV7jw2yKTT3srNMoeKvuTnLtoS8Z1dnfeW5rQPKFRQSJZdqIpQL ylnuwHdHqP2LHfGMzVFIJXromVHNRyp4It9KONBshrPFB0UOwtYqwTCHQj9Nkf3bHy4Y PLqQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=qGgIBOZxfhj4RQXaRAJW/z/oxERFBXXb52q9pb05GWg=; b=UYz1AwaeAwgp9JcoSsYkcNCYfnyOCTQkY317QCWYlieO0ZSelm1H48/wQnR1oisrrV 74jhy5cDTTM/+YKHM6LVvSltyHTg2EMO6PLL8wluSqJ/UEUs9kfxH+w5MAYBQi0Ti08M Z9uTZaU2VS1PLB3AfUTPjKAlpHuRpiG42f8Oo3IygeWzS17OMLduoS/BKzfhqnQ8rJ7D dDLks7kh32zT641UvS4Jr5SJmSq/XPhjF78UZ11uwFugHtZnElPdAA7DLbggcjaMpcNL SPnGQUtkC1/CaMOQE6bCmJq9J69rIGSMTFaQK5R283Ck2OYAEhhdsnxPcEc/GvsRQteB U0BA== X-Gm-Message-State: AOAM533yC4zykTZaY3/hAojVk7y2hrTK48WVriC50nKC4w3JLpB8+Css H7vL+ryQVF7jHGnls6hvRmLuKFrSDsm0YQ== X-Google-Smtp-Source: ABdhPJzau0g7y3LgHF18hGPSqwt1wkCbsH3h2Tu/zVHKYAhvMRvHiRS0Ytu5Dp4X33E+nyRYNNiUKA== X-Received: by 2002:a4a:870c:: with SMTP id z12mr802907ooh.15.1612325836069; Tue, 02 Feb 2021 20:17:16 -0800 (PST) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:90c4:ffea:6079:8a0c]) by smtp.gmail.com with ESMTPSA id s10sm209978ool.35.2021.02.02.20.17.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Feb 2021 20:17:15 -0800 (PST) From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, 
duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 15/19] udp: implement udp_bpf_recvmsg() for sockmap Date: Tue, 2 Feb 2021 20:16:32 -0800 Message-Id: <20210203041636.38555-16-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: <20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang We have to implement udp_bpf_recvmsg() to replace the ->recvmsg() to retrieve skmsg from ingress_msg. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- net/ipv4/udp_bpf.c | 64 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 63 insertions(+), 1 deletion(-) diff --git a/net/ipv4/udp_bpf.c b/net/ipv4/udp_bpf.c index 595836088e85..9a37ba056575 100644 --- a/net/ipv4/udp_bpf.c +++ b/net/ipv4/udp_bpf.c @@ -4,6 +4,68 @@ #include #include #include +#include + +#include "udp_impl.h" + +static struct proto *udpv6_prot_saved __read_mostly; + +static int sk_udp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, + int noblock, int flags, int *addr_len) +{ +#if IS_ENABLED(CONFIG_IPV6) + if (sk->sk_family == AF_INET6) + return udpv6_prot_saved->recvmsg(sk, msg, len, noblock, flags, + addr_len); +#endif + return udp_prot.recvmsg(sk, msg, len, noblock, flags, addr_len); +} + +static int udp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, + int nonblock, int flags, int *addr_len) +{ + struct sk_psock *psock; + int copied, ret; + + if (unlikely(flags & MSG_ERRQUEUE)) + return inet_recv_error(sk, msg, len, addr_len); + + psock = sk_psock_get(sk); + if (unlikely(!psock)) + return sk_udp_recvmsg(sk, msg, len, nonblock, flags, addr_len); + + lock_sock(sk); + if 
(sk_psock_queue_empty(psock)) { + ret = sk_udp_recvmsg(sk, msg, len, nonblock, flags, addr_len); + goto out; + } + +msg_bytes_ready: + copied = sk_msg_recvmsg(sk, psock, msg, len, flags); + if (!copied) { + int data, err = 0; + long timeo; + + timeo = sock_rcvtimeo(sk, nonblock); + data = sk_msg_wait_data(sk, psock, flags, timeo, &err); + if (data) { + if (!sk_psock_queue_empty(psock)) + goto msg_bytes_ready; + ret = sk_udp_recvmsg(sk, msg, len, nonblock, flags, addr_len); + goto out; + } + if (err) { + ret = err; + goto out; + } + copied = -EAGAIN; + } + ret = copied; +out: + release_sock(sk); + sk_psock_put(sk, psock); + return ret; +} enum { UDP_BPF_IPV4, @@ -11,7 +73,6 @@ enum { UDP_BPF_NUM_PROTS, }; -static struct proto *udpv6_prot_saved __read_mostly; static DEFINE_SPINLOCK(udpv6_prot_lock); static struct proto udp_bpf_prots[UDP_BPF_NUM_PROTS]; @@ -20,6 +81,7 @@ static void udp_bpf_rebuild_protos(struct proto *prot, const struct proto *base) *prot = *base; prot->unhash = sock_map_unhash; prot->close = sock_map_close; + prot->recvmsg = udp_bpf_recvmsg; } static void udp_bpf_check_v6_needs_rebuild(struct proto *ops) From patchwork Wed Feb 3 04:16:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12063405 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2C7CCC433E6 for ; Wed, 3 Feb 2021 04:21:48 +0000 (UTC) Received: 
from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F418B64F65 for ; Wed, 3 Feb 2021 04:21:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232735AbhBCEUm (ORCPT ); Tue, 2 Feb 2021 23:20:42 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36954 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232947AbhBCETV (ORCPT ); Tue, 2 Feb 2021 23:19:21 -0500 Received: from mail-oo1-xc2c.google.com (mail-oo1-xc2c.google.com [IPv6:2607:f8b0:4864:20::c2c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 661CEC061351; Tue, 2 Feb 2021 20:17:18 -0800 (PST) Received: by mail-oo1-xc2c.google.com with SMTP id x23so5704323oop.1; Tue, 02 Feb 2021 20:17:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=FCPDcm2U8/mVowiAl26odeeIloTX+VCHLbcc34SXhNg=; b=KygWNLfC1FY5z3olUJ14gMzLoUBOpOOHpXUNc6udgsKN13ZIUd4Zc6YSKzU8aQ1Ykt 4nX0Is7bxVk3/QMseeZQEpVvFNvzpqlNhdD5diHAWaVg6q7xXBfEuaeytbhHIJhry2Qk 0NvbYQtxIAf0j6KSyhLpW8fOgq38YmDNOjVu+q2niaHw+lNhUUEGEiBABvpEM0HsxcSh d73xvEFQGJ+H75E01LJbIZ2Mo+mXIF4TyMITrNe9rBbO8nUwyC7aAdqUtuNfaLifhGkn Zmg9WDG7XN+NlcY8UxWpWyVHjycGr8Xd36auU7DbsosSKzF5vYD4sfQrjh8pCBRwktwX 5U2g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=FCPDcm2U8/mVowiAl26odeeIloTX+VCHLbcc34SXhNg=; b=PiNdyKj/nrY/XR3dzlTuawnTflMGlK95mGhIzWKcxd2nSgB8q8IvIYm7T2VImhT/ue veRqprdQDDUs0pVmyk1jKiUH+6beH3PVQ8Zden44Z/9ZyZlFvJrC3rSdzZEOO9QWzCJo 9/Gz4kPY/fprbIArxhJx4zHTNBLjgvjmDUplGFz3MOS/RG4jGsW+btmEh7k1ima7Cg0c Z5c+fYm24+wgULcPKRHo5Gh80vtrOngQYCD1VUbzaALu8zR1U6LtOy6MjdMolnpynFrS 1vGCTYt/tMRkN/dQgQvyMf+Ih5118QS/kqZqrGET6+OeXEJipPloWCWZV5LWgQ91REYa SfGQ== 
X-Gm-Message-State: AOAM530dZzdAXQKElU+QLmHi35osaAamRT5lpV3TKwnx4Ww0dJV6HkMJ nTeGsCEpMf0xTJgTBLBw/eIqNAQuxoIAlA== X-Google-Smtp-Source: ABdhPJyKa0cfPJhBmE40ohTRIkysvgja2gz1icWBPmYBMWNAqtpuU8mN91+qkg9Yn2IPLSyAe+GpOA== X-Received: by 2002:a4a:d112:: with SMTP id k18mr682277oor.48.1612325837706; Tue, 02 Feb 2021 20:17:17 -0800 (PST) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:90c4:ffea:6079:8a0c]) by smtp.gmail.com with ESMTPSA id s10sm209978ool.35.2021.02.02.20.17.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Feb 2021 20:17:16 -0800 (PST) From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 16/19] af_unix: implement unix_dgram_bpf_recvmsg() Date: Tue, 2 Feb 2021 20:16:33 -0800 Message-Id: <20210203041636.38555-17-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: <20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang We have to implement unix_dgram_bpf_recvmsg() to replace the original ->recvmsg() so that skmsg can be retrieved from ingress_msg. AF_UNIX is again special here because it lacks sk_prot->recvmsg(). I simply add a special case inside unix_dgram_recvmsg() that calls sk->sk_prot->recvmsg() directly when sk_prot has been replaced.
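The special case described above can be pictured with a small userspace model. This is only a sketch, not kernel code: struct proto_model, struct sock_model, base_recvmsg() and bpf_recvmsg() are hypothetical stand-ins invented for illustration. Only the shape of the check follows the patch: compare the protocol pointer against the default one, and strip MSG_DONTWAIT from flags on the replaced path.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>	/* MSG_DONTWAIT, MSG_PEEK */

/* Hypothetical stand-ins for struct proto / struct sock; not kernel types. */
struct sock_model;

struct proto_model {
	int (*recvmsg)(struct sock_model *sk, int nonblock, int flags);
};

struct sock_model {
	const struct proto_model *prot;
	int last_nonblock;	/* recorded by the handlers for inspection */
	int last_flags;
};

enum { PATH_BASE = 1, PATH_BPF = 2 };

/* Models __unix_dgram_recvmsg(): the original receive path. */
static int base_recvmsg(struct sock_model *sk, int nonblock, int flags)
{
	sk->last_nonblock = nonblock;
	sk->last_flags = flags;
	return PATH_BASE;
}

/* Models unix_dgram_bpf_recvmsg(): the sockmap-aware receive path. */
static int bpf_recvmsg(struct sock_model *sk, int nonblock, int flags)
{
	sk->last_nonblock = nonblock;
	sk->last_flags = flags;
	return PATH_BPF;
}

/* The default protocol has no recvmsg hook, as the commit message notes. */
static const struct proto_model default_proto = { .recvmsg = NULL };
static const struct proto_model bpf_proto = { .recvmsg = bpf_recvmsg };

/* Models unix_dgram_recvmsg(): dispatch on whether the proto was replaced. */
static int dgram_recvmsg(struct sock_model *sk, int flags)
{
	if (sk->prot != &default_proto) {
		assert(sk->prot->recvmsg != NULL);
		return sk->prot->recvmsg(sk, flags & MSG_DONTWAIT,
					 flags & ~MSG_DONTWAIT);
	}
	return base_recvmsg(sk, flags & MSG_DONTWAIT, flags);
}
```

Calling dgram_recvmsg() on a model socket whose prot points at bpf_proto takes the replaced path with MSG_DONTWAIT removed from flags, while a socket left on default_proto falls through to the original path with flags untouched.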
Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- include/net/af_unix.h | 3 +++ net/unix/af_unix.c | 21 ++++++++++++++++--- net/unix/unix_bpf.c | 49 +++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 70 insertions(+), 3 deletions(-) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index fa75f899e88a..f6c43667e995 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -82,6 +82,9 @@ static inline struct unix_sock *unix_sk(const struct sock *sk) long unix_inq_len(struct sock *sk); long unix_outq_len(struct sock *sk); +int __unix_dgram_recvmsg(struct sock *sk, struct msghdr *msg, size_t size, + int nonblock, int flags, int *addr_len); + #ifdef CONFIG_SYSCTL int unix_sysctl_register(struct net *net); void unix_sysctl_unregister(struct net *net); diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 21c4406f879b..eebcd6f7ef88 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -2094,11 +2094,11 @@ static void unix_copy_addr(struct msghdr *msg, struct sock *sk) } } -static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, - size_t size, int flags) +int __unix_dgram_recvmsg(struct sock *sk, struct msghdr *msg, size_t size, + int nonblock, int flags, int *addr_len) { struct scm_cookie scm; - struct sock *sk = sock->sk; + struct socket *sock = sk->sk_socket; struct unix_sock *u = unix_sk(sk); struct sk_buff *skb, *last; long timeo; @@ -2201,6 +2201,21 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, return err; } +static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t size, + int flags) +{ + struct sock *sk = sock->sk; + int addr_len = 0; + +#ifdef CONFIG_BPF_SOCK_MAP + if (sk->sk_prot != &unix_proto) + return sk->sk_prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT, + flags & ~MSG_DONTWAIT, &addr_len); +#endif + return __unix_dgram_recvmsg(sk, msg, size, flags & MSG_DONTWAIT, + flags, &addr_len); +} + int 
unix_read_sock(struct sock *sk, read_descriptor_t *desc, sk_read_actor_t recv_actor) { diff --git a/net/unix/unix_bpf.c b/net/unix/unix_bpf.c index 2e6a26ec4958..570261fd18cd 100644 --- a/net/unix/unix_bpf.c +++ b/net/unix/unix_bpf.c @@ -5,6 +5,54 @@ #include #include +static int unix_dgram_bpf_recvmsg(struct sock *sk, struct msghdr *msg, + size_t len, int nonblock, int flags, + int *addr_len) +{ + struct sk_psock *psock; + int copied, ret; + + psock = sk_psock_get(sk); + if (unlikely(!psock)) + return __unix_dgram_recvmsg(sk, msg, len, nonblock, flags, + addr_len); + + lock_sock(sk); + if (!skb_queue_empty(&sk->sk_receive_queue) && + sk_psock_queue_empty(psock)) { + ret = __unix_dgram_recvmsg(sk, msg, len, nonblock, flags, + addr_len); + goto out; + } + +msg_bytes_ready: + copied = sk_msg_recvmsg(sk, psock, msg, len, flags); + if (!copied) { + int data, err = 0; + long timeo; + + timeo = sock_rcvtimeo(sk, nonblock); + data = sk_msg_wait_data(sk, psock, flags, timeo, &err); + if (data) { + if (!sk_psock_queue_empty(psock)) + goto msg_bytes_ready; + ret = __unix_dgram_recvmsg(sk, msg, len, nonblock, + flags, addr_len); + goto out; + } + if (err) { + ret = err; + goto out; + } + copied = -EAGAIN; + } + ret = copied; +out: + release_sock(sk); + sk_psock_put(sk, psock); + return ret; +} + static struct proto *unix_prot_saved __read_mostly; static DEFINE_SPINLOCK(unix_prot_lock); static struct proto unix_bpf_prot; @@ -13,6 +61,7 @@ static void unix_bpf_rebuild_protos(struct proto *prot, const struct proto *base { *prot = *base; prot->close = sock_map_close; + prot->recvmsg = unix_dgram_bpf_recvmsg; } static void unix_bpf_check_needs_rebuild(struct proto *ops) From patchwork Wed Feb 3 04:16:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12063397 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0BF42C43217 for ; Wed, 3 Feb 2021 04:20:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DE2C764F93 for ; Wed, 3 Feb 2021 04:20:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233085AbhBCEUB (ORCPT ); Tue, 2 Feb 2021 23:20:01 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36704 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232861AbhBCETA (ORCPT ); Tue, 2 Feb 2021 23:19:00 -0500 Received: from mail-oo1-xc2c.google.com (mail-oo1-xc2c.google.com [IPv6:2607:f8b0:4864:20::c2c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 19D59C061352; Tue, 2 Feb 2021 20:17:20 -0800 (PST) Received: by mail-oo1-xc2c.google.com with SMTP id q3so5707691oog.4; Tue, 02 Feb 2021 20:17:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=hgdUBSB4MvEvN3e6yvQq8UCSM6sKbUZ7RhvBlW/6NiI=; b=pNcRXnqIiCzgEBvY66O1RauFKbKd7kuoeID08LnOef1yPXN5t59HFrAlyVkOn0A7Ux tkdtMiXHbtwbQWlwuc3I518f0aaLf0K7X71LdaqkDI9fJB20slybzLuZVZ56zXAeS2W+ T9nJgfxaYvGLczTqg2PyhYs0ex7ajkS7zD08BfkzN/rTgXv2R7O3yR0IRGh4sUCvOm7h vmpR+9knw8XrefcEG6w4C7pDaBZjFTCQVzFJjui+QtOAWfLKxsY8FmeKys65LsOohaBB mnJ369W0x9Y7uWOXND/KrX5J24jmrtGsMHZOY1nyugynht/KykJN8ILSD+0kk6L7MkYp ImnA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=hgdUBSB4MvEvN3e6yvQq8UCSM6sKbUZ7RhvBlW/6NiI=; b=k68Cr6L2nJshvIo6jjv8/1uXHUFGOfiHuDrR77cLskqcvnAY5BRX/gJXjhklo/GHu1 dPEDXBz9u4I9NEq/88D4fYzXELSa4/H/kazjlS2fbWTmv+z44x72HAI7d8eVx4xEIEZE HGl0y1FoSzVH9u0JVlgtblH5YDUt6C4GZFYqUtQ6uJG7Z74SVW9JDxH3zY4Dc3t9CAHb lL3ioV/1mcYPaepFdGOk9fxHks4iBeIT46RFfBSGnS5ekjMXO/0GY58A23j3wCPIDh3p jkdUiJovLIhlmSZJEZmTmbnaFNVowfCtZDO6clMg2qFXi2yXxQc85YxpGS1lc0W5vWUk 1B+Q== X-Gm-Message-State: AOAM530ThCqvwRc3+EYOxgGDatSzkn3BIslYfWc+yMzY9qDrsBARLU0i 7mnYqgKkf4DZtxMFM2mXlrVbo1t1RF8zvg== X-Google-Smtp-Source: ABdhPJwA3HhzhwojS+ZjjSbtl9Hwa+f3jzEb8um8+EKP+8q0qA23p+y+6MhVsj7pyGfBpyldHK3STA== X-Received: by 2002:a4a:96b3:: with SMTP id s48mr813426ooi.11.1612325839404; Tue, 02 Feb 2021 20:17:19 -0800 (PST) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:90c4:ffea:6079:8a0c]) by smtp.gmail.com with ESMTPSA id s10sm209978ool.35.2021.02.02.20.17.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Feb 2021 20:17:18 -0800 (PST) From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 17/19] sock_map: update sock type checks Date: Tue, 2 Feb 2021 20:16:34 -0800 Message-Id: <20210203041636.38555-18-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: <20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang Now that both AF_UNIX and UDP support sockmap and redirection, we can safely update the sock type checks for them accordingly.
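As a rough illustration of the updated checks, here is a userspace model of the three predicates this patch touches. It is a sketch under stated assumptions: struct sock_model and its hashed / has_update_proto fields are hypothetical stand-ins for struct sock, sk_hashed() and sk->sk_prot->update_proto, and the is_tcp() / is_udp() bodies are assumed rather than taken from this diff. The MODEL_* values use the same numbers as the kernel's TCP_ESTABLISHED, TCP_CLOSE and TCP_LISTEN.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/socket.h>	/* AF_UNIX, AF_INET, SOCK_STREAM, SOCK_DGRAM */
#include <netinet/in.h>	/* IPPROTO_TCP, IPPROTO_UDP */

/* Renamed to avoid clashing with system headers. */
enum { MODEL_ESTABLISHED = 1, MODEL_CLOSE = 7, MODEL_LISTEN = 10 };

/* Hypothetical stand-in for the struct sock fields the checks read. */
struct sock_model {
	int family, type, protocol, state;
	bool hashed;		/* stands in for sk_hashed(sk) */
	bool has_update_proto;	/* stands in for sk->sk_prot->update_proto */
};

static bool is_tcp(const struct sock_model *sk)
{
	return sk->type == SOCK_STREAM && sk->protocol == IPPROTO_TCP;
}

static bool is_udp(const struct sock_model *sk)
{
	return sk->type == SOCK_DGRAM && sk->protocol == IPPROTO_UDP;
}

static bool is_unix(const struct sock_model *sk)
{
	return sk->type == SOCK_DGRAM && sk->family == AF_UNIX;
}

/* Models sock_map_redirect_allowed() after this patch. */
static bool redirect_allowed(const struct sock_model *sk)
{
	if (is_tcp(sk))
		return sk->state != MODEL_LISTEN;
	return sk->state == MODEL_ESTABLISHED;
}

/* Models sock_map_sk_is_suitable() after this patch. */
static bool is_suitable(const struct sock_model *sk)
{
	return sk->has_update_proto;
}

/* Models sock_map_sk_state_allowed() after this patch. */
static bool state_allowed(const struct sock_model *sk)
{
	if (is_tcp(sk))
		return ((1 << sk->state) &
			((1 << MODEL_ESTABLISHED) | (1 << MODEL_LISTEN))) != 0;
	if (is_udp(sk))
		return sk->hashed;
	if (is_unix(sk))
		return sk->state == MODEL_ESTABLISHED;
	return false;
}

/* Usage: a connected AF_UNIX datagram socket passes all three checks. */
static void model_examples(void)
{
	struct sock_model unix_conn = {
		AF_UNIX, SOCK_DGRAM, 0, MODEL_ESTABLISHED, false, true
	};

	assert(is_suitable(&unix_conn));
	assert(state_allowed(&unix_conn));
	assert(redirect_allowed(&unix_conn));
}
```

In this model, an unconnected AF_UNIX datagram socket (state MODEL_CLOSE) is rejected by state_allowed(), and a listening TCP socket may still be added to a map but is not a valid redirect target.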
Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- net/core/skmsg.c | 3 ++- net/core/sock_map.c | 15 ++++++++++++--- 2 files changed, 14 insertions(+), 4 deletions(-) diff --git a/net/core/skmsg.c b/net/core/skmsg.c index 8e3edbdf4c7c..a502137f7bc2 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -667,7 +667,8 @@ struct sk_psock *sk_psock_init(struct sock *sk, int node) write_lock_bh(&sk->sk_callback_lock); - if (inet_csk_has_ulp(sk)) { + if ((sk->sk_family == AF_INET || sk->sk_family == AF_INET6) && + inet_csk_has_ulp(sk)) { psock = ERR_PTR(-EINVAL); goto out; } diff --git a/net/core/sock_map.c b/net/core/sock_map.c index 255067e5c73a..7e56a3ec7a57 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -544,14 +544,22 @@ static bool sk_is_udp(const struct sock *sk) sk->sk_protocol == IPPROTO_UDP; } +static bool sk_is_unix(const struct sock *sk) +{ + return sk->sk_type == SOCK_DGRAM && sk->sk_family == AF_UNIX; +} + static bool sock_map_redirect_allowed(const struct sock *sk) { - return sk_is_tcp(sk) && sk->sk_state != TCP_LISTEN; + if (sk_is_tcp(sk)) + return sk->sk_state != TCP_LISTEN; + else + return sk->sk_state == TCP_ESTABLISHED; } static bool sock_map_sk_is_suitable(const struct sock *sk) { - return sk_is_tcp(sk) || sk_is_udp(sk); + return !!sk->sk_prot->update_proto; } static bool sock_map_sk_state_allowed(const struct sock *sk) @@ -560,7 +568,8 @@ static bool sock_map_sk_state_allowed(const struct sock *sk) return (1 << sk->sk_state) & (TCPF_ESTABLISHED | TCPF_LISTEN); else if (sk_is_udp(sk)) return sk_hashed(sk); - + else if (sk_is_unix(sk)) + return sk->sk_state == TCP_ESTABLISHED; return false; } From patchwork Wed Feb 3 04:16:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12063403 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 
(2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D6B7AC433E0 for ; Wed, 3 Feb 2021 04:21:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 901DF614A7 for ; Wed, 3 Feb 2021 04:21:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232654AbhBCEUk (ORCPT ); Tue, 2 Feb 2021 23:20:40 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36970 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232955AbhBCET1 (ORCPT ); Tue, 2 Feb 2021 23:19:27 -0500 Received: from mail-ot1-x333.google.com (mail-ot1-x333.google.com [IPv6:2607:f8b0:4864:20::333]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A7D35C061353; Tue, 2 Feb 2021 20:17:21 -0800 (PST) Received: by mail-ot1-x333.google.com with SMTP id i30so22161750ota.6; Tue, 02 Feb 2021 20:17:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=saLuuaeWczof9wJYKKradx1AJv81Ng0Xq/LvVJwFhoo=; b=u+vfaiGTXjWLjGwvgTY0K3aKnG5b13FMXezlUJdteUKVF/H2ksIS9eVRWp/T4P9yMX FiuJoAxiDrmVzbFZnlzbWQ2Wp2M5eN8rqYVK1/T1NzKjRCK6cpZLPMSyKXydgGro0cQr GdqUOeVOW3W3sgjtgFsd24xmRgqIp2MgKfhHDRTRQ9ExiYoG//DpUp5EZHBpnBuKks6P PyF2Ksar5E1DejK8ajCzXr0H2WbAStPTMBmrtkrqm/F4AA0Jt55Kalue+AgO1xqAM4h3 7UQlPJOlv9m5ySx8EJeknRYonvAqvPuuq5ugWdrKr9yvn+O6AjBeL0caJMpv0yk4Om+0 Ly1A== X-Google-DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=saLuuaeWczof9wJYKKradx1AJv81Ng0Xq/LvVJwFhoo=; b=Jkoey850LiB8uQGDoulnzUBfcYx2/Fd1GNYckKkg9iU34Dy2X1FV0Gpeo0FUMBetbF 2bsgeJFsNxP+5nTbDihFUubBwoETJbuAObjn/WP9RH1O6AhRd7lR9ToPsVciW+ZY/c5l ZgfPw50FL3XSHKQNHUPgDn83eM8leU0mFg2WrUV2r4SlWco/CQcjr16nAGj78U43139b QIO+OnALAZrpisfHb/611zHgjq6HLwtJm5pN+Fx4LTAeRQDo9Il7BEJ0KSwLQFO4rxia VOrufgVAWFH9HIaAOavMb+cmI+VNzX8CAb//YgxCF5NydRLi0lkkQq78jUVIecPdkNrZ dzqw== X-Gm-Message-State: AOAM533afZkKjqpd6jeb6pfyuWCzXZ7r7hJJGeM/VxXJ0v5xOS2HCgj+ e45twvHunheKImaZgGCUcKRuDiiO0uZHmw== X-Google-Smtp-Source: ABdhPJzXiZZ5nGmXDboUNeaxgc9lkigW75/iGy/FT/+XSRGxeMohurhXz7PM3x8eX3z3sm3tD2pjAA== X-Received: by 2002:a9d:3ec4:: with SMTP id b62mr756468otc.43.1612325840931; Tue, 02 Feb 2021 20:17:20 -0800 (PST) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:90c4:ffea:6079:8a0c]) by smtp.gmail.com with ESMTPSA id s10sm209978ool.35.2021.02.02.20.17.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Feb 2021 20:17:20 -0800 (PST) From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 18/19] selftests/bpf: add test cases for unix and udp sockmap Date: Tue, 2 Feb 2021 20:16:35 -0800 Message-Id: <20210203041636.38555-19-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: <20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang Add two test cases to ensure that redirection between two AF_UNIX sockets or two UDP sockets works.
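For orientation, the socket topology used by the AF_UNIX test can be reproduced without any BPF involved. The sketch below is not part of the selftest: unix_dgram_baseline() is a hypothetical helper that only shows the baseline behaviour on Linux (SOCK_NONBLOCK is Linux-specific). With no verdict program attached, a byte written to c1 arrives on its own peer p1 and nothing shows up on the other pair. With the skb verdict program attached, the test instead expects the byte on p0 (REDIR_INGRESS) or c0 (REDIR_EGRESS).

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper, not from the selftest: returns the byte read from
 * p1, or -1 on any failure.
 */
static int unix_dgram_baseline(void)
{
	int pair0[2], pair1[2];	/* pair0 = {c0, p0}, pair1 = {c1, p1} */
	char b = 0, other = 0;
	int ret = -1;

	if (socketpair(AF_UNIX, SOCK_DGRAM | SOCK_NONBLOCK, 0, pair0))
		return -1;
	if (socketpair(AF_UNIX, SOCK_DGRAM | SOCK_NONBLOCK, 0, pair1))
		goto close0;

	/* Write one byte on c1, as the test does. */
	if (write(pair1[0], "a", 1) != 1)
		goto close1;

	/* Without redirection the byte is queued on p1. */
	if (read(pair1[1], &b, 1) != 1)
		goto close1;

	/* The unrelated pair must be empty: a non-blocking read fails. */
	if (read(pair0[1], &other, 1) != -1)
		goto close1;

	ret = b;
close1:
	close(pair1[0]);
	close(pair1[1]);
close0:
	close(pair0[0]);
	close(pair0[1]);
	return ret;
}
```

A caller can check the baseline with assert(unix_dgram_baseline() == 'a'); the interesting part of the real test is that attaching the verdict program changes which socket the byte lands on.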
Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- .../selftests/bpf/prog_tests/sockmap_listen.c | 241 ++++++++++++++++++ .../selftests/bpf/progs/test_sockmap_listen.c | 20 ++ 2 files changed, 261 insertions(+) diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c index c26e6bf05e49..8f52302165a6 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c @@ -1441,6 +1441,8 @@ static const char *family_str(sa_family_t family) return "IPv4"; case AF_INET6: return "IPv6"; + case AF_UNIX: + return "Unix"; default: return "unknown"; } @@ -1563,6 +1565,239 @@ static void test_redir(struct test_sockmap_listen *skel, struct bpf_map *map, } } +static void udp_redir_to_connected(int family, int sotype, int sock_mapfd, + int verd_mapfd, enum redir_mode mode) +{ + const char *log_prefix = redir_mode_str(mode); + struct sockaddr_storage addr; + int c0, c1, p0, p1; + unsigned int pass; + socklen_t len; + int err, n; + u64 value; + u32 key; + char b; + + zero_verdict_count(verd_mapfd); + + p0 = socket_loopback(family, sotype | SOCK_NONBLOCK); + if (p0 < 0) + return; + len = sizeof(addr); + err = xgetsockname(p0, sockaddr(&addr), &len); + if (err) + goto close_peer0; + + c0 = xsocket(family, sotype | SOCK_NONBLOCK, 0); + if (c0 < 0) + goto close_peer0; + err = xconnect(c0, sockaddr(&addr), len); + if (err) + goto close_cli0; + err = xgetsockname(c0, sockaddr(&addr), &len); + if (err) + goto close_cli0; + err = xconnect(p0, sockaddr(&addr), len); + if (err) + goto close_cli0; + + p1 = socket_loopback(family, sotype | SOCK_NONBLOCK); + if (p1 < 0) + goto close_cli0; + err = xgetsockname(p1, sockaddr(&addr), &len); + if (err) + goto close_peer1; + + c1 = xsocket(family, sotype | SOCK_NONBLOCK, 0); + if (c1 < 0) + goto close_peer1; + err = xconnect(c1, sockaddr(&addr), len); + if (err) +
goto close_cli1;
+	err = xgetsockname(c1, sockaddr(&addr), &len);
+	if (err)
+		goto close_cli1;
+	err = xconnect(p1, sockaddr(&addr), len);
+	if (err)
+		goto close_cli1;
+
+	key = 0;
+	value = p0;
+	err = xbpf_map_update_elem(sock_mapfd, &key, &value, BPF_NOEXIST);
+	if (err)
+		goto close_cli1;
+
+	key = 1;
+	value = p1;
+	err = xbpf_map_update_elem(sock_mapfd, &key, &value, BPF_NOEXIST);
+	if (err)
+		goto close_cli1;
+
+	n = write(c1, "a", 1);
+	if (n < 0)
+		FAIL_ERRNO("%s: write", log_prefix);
+	if (n == 0)
+		FAIL("%s: incomplete write", log_prefix);
+	if (n < 1)
+		goto close_cli1;
+
+	key = SK_PASS;
+	err = xbpf_map_lookup_elem(verd_mapfd, &key, &pass);
+	if (err)
+		goto close_cli1;
+	if (pass != 1)
+		FAIL("%s: want pass count 1, have %d", log_prefix, pass);
+
+	n = read(mode == REDIR_INGRESS ? p0 : c0, &b, 1);
+	if (n < 0)
+		FAIL_ERRNO("%s: read", log_prefix);
+	if (n == 0)
+		FAIL("%s: incomplete read", log_prefix);
+
+close_cli1:
+	xclose(c1);
+close_peer1:
+	xclose(p1);
+close_cli0:
+	xclose(c0);
+close_peer0:
+	xclose(p0);
+}
+
+static void udp_skb_redir_to_connected(struct test_sockmap_listen *skel,
+				       struct bpf_map *inner_map, int family,
+				       int sotype)
+{
+	int verdict = bpf_program__fd(skel->progs.prog_skb_verdict);
+	int verdict_map = bpf_map__fd(skel->maps.verdict_map);
+	int sock_map = bpf_map__fd(inner_map);
+	int err;
+
+	err = xbpf_prog_attach(verdict, sock_map, BPF_SK_SKB_VERDICT, 0);
+	if (err)
+		return;
+
+	skel->bss->test_ingress = false;
+	udp_redir_to_connected(family, sotype, sock_map, verdict_map,
+			       REDIR_EGRESS);
+	skel->bss->test_ingress = true;
+	udp_redir_to_connected(family, sotype, sock_map, verdict_map,
+			       REDIR_INGRESS);
+
+	xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT);
+}
+
+static void test_udp_redir(struct test_sockmap_listen *skel, struct bpf_map *map,
+			   int family)
+{
+	const char *family_name, *map_name;
+	char s[MAX_TEST_NAME];
+
+	family_name = family_str(family);
+	map_name = map_type_str(map);
+	snprintf(s, sizeof(s), "%s %s %s", map_name, family_name, __func__);
+	if (!test__start_subtest(s))
+		return;
+	udp_skb_redir_to_connected(skel, map, family, SOCK_DGRAM);
+}
+
+static void unix_redir_to_connected(int sotype, int sock_mapfd,
+				    int verd_mapfd, enum redir_mode mode)
+{
+	const char *log_prefix = redir_mode_str(mode);
+	int c0, c1, p0, p1;
+	unsigned int pass;
+	int err, n;
+	int sfd[2];
+	u64 value;
+	u32 key;
+	char b;
+
+	zero_verdict_count(verd_mapfd);
+
+	if (socketpair(AF_UNIX, sotype | SOCK_NONBLOCK, 0, sfd))
+		return;
+	c0 = sfd[0], p0 = sfd[1];
+
+	if (socketpair(AF_UNIX, sotype | SOCK_NONBLOCK, 0, sfd))
+		goto close0;
+	c1 = sfd[0], p1 = sfd[1];
+
+	key = 0;
+	value = p0;
+	err = xbpf_map_update_elem(sock_mapfd, &key, &value, BPF_NOEXIST);
+	if (err)
+		goto close;
+
+	key = 1;
+	value = p1;
+	err = xbpf_map_update_elem(sock_mapfd, &key, &value, BPF_NOEXIST);
+	if (err)
+		goto close;
+
+	n = write(c1, "a", 1);
+	if (n < 0)
+		FAIL_ERRNO("%s: write", log_prefix);
+	if (n == 0)
+		FAIL("%s: incomplete write", log_prefix);
+	if (n < 1)
+		goto close;
+
+	key = SK_PASS;
+	err = xbpf_map_lookup_elem(verd_mapfd, &key, &pass);
+	if (err)
+		goto close;
+	if (pass != 1)
+		FAIL("%s: want pass count 1, have %d", log_prefix, pass);
+
+	n = read(mode == REDIR_INGRESS ? p0 : c0, &b, 1);
+	if (n < 0)
+		FAIL_ERRNO("%s: read", log_prefix);
+	if (n == 0)
+		FAIL("%s: incomplete read", log_prefix);
+
+close:
+	xclose(c1);
+	xclose(p1);
+close0:
+	xclose(c0);
+	xclose(p0);
+}
+
+static void unix_skb_redir_to_connected(struct test_sockmap_listen *skel,
+					struct bpf_map *inner_map, int sotype)
+{
+	int verdict = bpf_program__fd(skel->progs.prog_skb_verdict);
+	int verdict_map = bpf_map__fd(skel->maps.verdict_map);
+	int sock_map = bpf_map__fd(inner_map);
+	int err;
+
+	err = xbpf_prog_attach(verdict, sock_map, BPF_SK_SKB_VERDICT, 0);
+	if (err)
+		return;
+
+	skel->bss->test_ingress = false;
+	unix_redir_to_connected(sotype, sock_map, verdict_map, REDIR_EGRESS);
+	skel->bss->test_ingress = true;
+	unix_redir_to_connected(sotype, sock_map, verdict_map, REDIR_INGRESS);
+
+	xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT);
+}
+
+static void test_unix_redir(struct test_sockmap_listen *skel, struct bpf_map *map,
+			    int sotype)
+{
+	const char *family_name, *map_name;
+	char s[MAX_TEST_NAME];
+
+	family_name = family_str(AF_UNIX);
+	map_name = map_type_str(map);
+	snprintf(s, sizeof(s), "%s %s %s", map_name, family_name, __func__);
+	if (!test__start_subtest(s))
+		return;
+	unix_skb_redir_to_connected(skel, map, sotype);
+}
+
 static void test_reuseport(struct test_sockmap_listen *skel,
 			   struct bpf_map *map, int family, int sotype)
 {
@@ -1626,10 +1861,16 @@ void test_sockmap_listen(void)
 	skel->bss->test_sockmap = true;
 	run_tests(skel, skel->maps.sock_map, AF_INET);
 	run_tests(skel, skel->maps.sock_map, AF_INET6);
+	test_udp_redir(skel, skel->maps.sock_map, AF_INET);
+	test_udp_redir(skel, skel->maps.sock_map, AF_INET6);
+	test_unix_redir(skel, skel->maps.sock_map, SOCK_DGRAM);
 
 	skel->bss->test_sockmap = false;
 	run_tests(skel, skel->maps.sock_hash, AF_INET);
 	run_tests(skel, skel->maps.sock_hash, AF_INET6);
+	test_udp_redir(skel, skel->maps.sock_hash, AF_INET);
+	test_udp_redir(skel, skel->maps.sock_hash, AF_INET6);
+	test_unix_redir(skel, skel->maps.sock_hash, SOCK_DGRAM);
 
 	test_sockmap_listen__destroy(skel);
 }
diff --git a/tools/testing/selftests/bpf/progs/test_sockmap_listen.c b/tools/testing/selftests/bpf/progs/test_sockmap_listen.c
index fa221141e9c1..49537c78e34a 100644
--- a/tools/testing/selftests/bpf/progs/test_sockmap_listen.c
+++ b/tools/testing/selftests/bpf/progs/test_sockmap_listen.c
@@ -29,6 +29,7 @@ struct {
 } verdict_map SEC(".maps");
 
 static volatile bool test_sockmap; /* toggled by user-space */
+static volatile bool test_ingress; /* toggled by user-space */
 
 SEC("sk_skb/stream_parser")
 int prog_stream_parser(struct __sk_buff *skb)
@@ -55,6 +56,25 @@ int prog_stream_verdict(struct __sk_buff *skb)
 	return verdict;
 }
 
+SEC("sk_skb/skb_verdict")
+int prog_skb_verdict(struct __sk_buff *skb)
+{
+	unsigned int *count;
+	__u32 zero = 0;
+	int verdict;
+
+	if (test_sockmap)
+		verdict = bpf_sk_redirect_map(skb, &sock_map, zero, test_ingress ? BPF_F_INGRESS : 0);
+	else
+		verdict = bpf_sk_redirect_hash(skb, &sock_hash, &zero, test_ingress ? BPF_F_INGRESS : 0);
+
+	count = bpf_map_lookup_elem(&verdict_map, &verdict);
+	if (count)
+		(*count)++;
+
+	return verdict;
+}
+
 SEC("sk_msg")
 int prog_msg_verdict(struct sk_msg_md *msg)
 {

From patchwork Wed Feb 3 04:16:36 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Cong Wang
X-Patchwork-Id: 12063401
X-Patchwork-Delegate: bpf@iogearbox.net
From: Cong Wang
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang, John Fastabend, Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer
Subject: [Patch bpf-next 19/19] selftests/bpf: add test case for redirection between udp and unix
Date: Tue, 2 Feb 2021 20:16:36 -0800
Message-Id: <20210203041636.38555-20-xiyou.wangcong@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com>
References: <20210203041636.38555-1-xiyou.wangcong@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: bpf@vger.kernel.org
X-Patchwork-Delegate: bpf@iogearbox.net

From: Cong Wang

Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Signed-off-by: Cong Wang
---
 .../selftests/bpf/prog_tests/sockmap_listen.c | 226 ++++++++++++++++++
 1 file changed, 226 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
index 8f52302165a6..e0c2a0a4f501 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
@@ -1798,6 +1798,228 @@ static void test_unix_redir(struct test_sockmap_listen *skel, struct bpf_map *ma
 	unix_skb_redir_to_connected(skel, map, sotype);
 }
 
+static void udp_unix_redir_to_connected(int family, int sock_mapfd,
+					int verd_mapfd, enum redir_mode mode)
+{
+	const char *log_prefix = redir_mode_str(mode);
+	struct sockaddr_storage addr;
+	int c0, c1, p0, p1;
+	unsigned int pass;
+	socklen_t len;
+	int err, n;
+	int sfd[2];
+	u64 value;
+	u32 key;
+	char b;
+
+	zero_verdict_count(verd_mapfd);
+
+	if (socketpair(AF_UNIX, SOCK_DGRAM | SOCK_NONBLOCK, 0, sfd))
+		return;
+	c0 = sfd[0], p0 = sfd[1];
+
+	p1 = socket_loopback(family, SOCK_DGRAM | SOCK_NONBLOCK);
+	if (p1 < 0)
+		goto close;
+	len = sizeof(addr);
+	err = xgetsockname(p1, sockaddr(&addr), &len);
+	if (err)
+		goto close_peer1;
+
+	c1 = xsocket(family, SOCK_DGRAM | SOCK_NONBLOCK, 0);
+	if (c1 < 0)
+		goto close_peer1;
+	err = xconnect(c1, sockaddr(&addr), len);
+	if (err)
+		goto close_cli1;
+	err = xgetsockname(c1, sockaddr(&addr), &len);
+	if (err)
+		goto close_cli1;
+	err = xconnect(p1, sockaddr(&addr), len);
+	if (err)
+		goto close_cli1;
+
+	key = 0;
+	value = p0;
+	err = xbpf_map_update_elem(sock_mapfd, &key, &value, BPF_NOEXIST);
+	if (err)
+		goto close_cli1;
+
+	key = 1;
+	value = p1;
+	err = xbpf_map_update_elem(sock_mapfd, &key, &value, BPF_NOEXIST);
+	if (err)
+		goto close_cli1;
+
+	n = write(c1, "a", 1);
+	if (n < 0)
+		FAIL_ERRNO("%s: write", log_prefix);
+	if (n == 0)
+		FAIL("%s: incomplete write", log_prefix);
+	if (n < 1)
+		goto close_cli1;
+
+	key = SK_PASS;
+	err = xbpf_map_lookup_elem(verd_mapfd, &key, &pass);
+	if (err)
+		goto close_cli1;
+	if (pass != 1)
+		FAIL("%s: want pass count 1, have %d", log_prefix, pass);
+
+	n = read(mode == REDIR_INGRESS ? p0 : c0, &b, 1);
+	if (n < 0)
+		FAIL_ERRNO("%s: read", log_prefix);
+	if (n == 0)
+		FAIL("%s: incomplete read", log_prefix);
+
+close_cli1:
+	xclose(c1);
+close_peer1:
+	xclose(p1);
+close:
+	xclose(c0);
+	xclose(p0);
+}
+
+static void udp_unix_skb_redir_to_connected(struct test_sockmap_listen *skel,
+					    struct bpf_map *inner_map, int family)
+{
+	int verdict = bpf_program__fd(skel->progs.prog_skb_verdict);
+	int verdict_map = bpf_map__fd(skel->maps.verdict_map);
+	int sock_map = bpf_map__fd(inner_map);
+	int err;
+
+	err = xbpf_prog_attach(verdict, sock_map, BPF_SK_SKB_VERDICT, 0);
+	if (err)
+		return;
+
+	skel->bss->test_ingress = false;
+	udp_unix_redir_to_connected(family, sock_map, verdict_map, REDIR_EGRESS);
+	skel->bss->test_ingress = true;
+	udp_unix_redir_to_connected(family, sock_map, verdict_map, REDIR_INGRESS);
+
+	xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT);
+}
+
+static void unix_udp_redir_to_connected(int family, int sock_mapfd,
+					int verd_mapfd, enum redir_mode mode)
+{
+	const char *log_prefix = redir_mode_str(mode);
+	struct sockaddr_storage addr;
+	int c0, c1, p0, p1;
+	unsigned int pass;
+	socklen_t len;
+	int err, n;
+	int sfd[2];
+	u64 value;
+	u32 key;
+	char b;
+
+	zero_verdict_count(verd_mapfd);
+
+	p0 = socket_loopback(family, SOCK_DGRAM | SOCK_NONBLOCK);
+	if (p0 < 0)
+		return;
+	len = sizeof(addr);
+	err = xgetsockname(p0, sockaddr(&addr), &len);
+	if (err)
+		goto close_peer0;
+
+	c0 = xsocket(family, SOCK_DGRAM | SOCK_NONBLOCK, 0);
+	if (c0 < 0)
+		goto close_peer0;
+	err = xconnect(c0, sockaddr(&addr), len);
+	if (err)
+		goto close_cli0;
+	err = xgetsockname(c0, sockaddr(&addr), &len);
+	if (err)
+		goto close_cli0;
+	err = xconnect(p0, sockaddr(&addr), len);
+	if (err)
+		goto close_cli0;
+
+	if (socketpair(AF_UNIX, SOCK_DGRAM | SOCK_NONBLOCK, 0, sfd))
+		goto close_cli0;
+	c1 = sfd[0], p1 = sfd[1];
+
+	key = 0;
+	value = p0;
+	err = xbpf_map_update_elem(sock_mapfd, &key, &value, BPF_NOEXIST);
+	if (err)
+		goto close;
+
+	key = 1;
+	value = p1;
+	err = xbpf_map_update_elem(sock_mapfd, &key, &value, BPF_NOEXIST);
+	if (err)
+		goto close;
+
+	n = write(c1, "a", 1);
+	if (n < 0)
+		FAIL_ERRNO("%s: write", log_prefix);
+	if (n == 0)
+		FAIL("%s: incomplete write", log_prefix);
+	if (n < 1)
+		goto close;
+
+	key = SK_PASS;
+	err = xbpf_map_lookup_elem(verd_mapfd, &key, &pass);
+	if (err)
+		goto close;
+	if (pass != 1)
+		FAIL("%s: want pass count 1, have %d", log_prefix, pass);
+
+	n = read(mode == REDIR_INGRESS ? p0 : c0, &b, 1);
+	if (n < 0)
+		FAIL_ERRNO("%s: read", log_prefix);
+	if (n == 0)
+		FAIL("%s: incomplete read", log_prefix);
+
+close:
+	xclose(c1);
+	xclose(p1);
+close_cli0:
+	xclose(c0);
+close_peer0:
+	xclose(p0);
+
+}
+
+static void unix_udp_skb_redir_to_connected(struct test_sockmap_listen *skel,
+					    struct bpf_map *inner_map, int family)
+{
+	int verdict = bpf_program__fd(skel->progs.prog_skb_verdict);
+	int verdict_map = bpf_map__fd(skel->maps.verdict_map);
+	int sock_map = bpf_map__fd(inner_map);
+	int err;
+
+	err = xbpf_prog_attach(verdict, sock_map, BPF_SK_SKB_VERDICT, 0);
+	if (err)
+		return;
+
+	skel->bss->test_ingress = false;
+	unix_udp_redir_to_connected(family, sock_map, verdict_map, REDIR_EGRESS);
+	skel->bss->test_ingress = true;
+	unix_udp_redir_to_connected(family, sock_map, verdict_map, REDIR_INGRESS);
+
+	xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT);
+}
+
+static void test_udp_unix_redir(struct test_sockmap_listen *skel, struct bpf_map *map,
+				int family)
+{
+	const char *family_name, *map_name;
+	char s[MAX_TEST_NAME];
+
+	family_name = family_str(family);
+	map_name = map_type_str(map);
+	snprintf(s, sizeof(s), "%s %s %s", map_name, family_name, __func__);
+	if (!test__start_subtest(s))
+		return;
+	udp_unix_skb_redir_to_connected(skel, map, family);
+	unix_udp_skb_redir_to_connected(skel, map, family);
+}
+
 static void test_reuseport(struct test_sockmap_listen *skel,
 			   struct bpf_map *map, int family, int sotype)
 {
@@ -1864,6 +2086,8 @@ void test_sockmap_listen(void)
 	test_udp_redir(skel, skel->maps.sock_map, AF_INET);
 	test_udp_redir(skel, skel->maps.sock_map, AF_INET6);
 	test_unix_redir(skel, skel->maps.sock_map, SOCK_DGRAM);
+	test_udp_unix_redir(skel, skel->maps.sock_map, AF_INET);
+	test_udp_unix_redir(skel, skel->maps.sock_map, AF_INET6);
 
 	skel->bss->test_sockmap = false;
 	run_tests(skel, skel->maps.sock_hash, AF_INET);
@@ -1871,6 +2095,8 @@ void test_sockmap_listen(void)
 	test_udp_redir(skel, skel->maps.sock_hash, AF_INET);
 	test_udp_redir(skel, skel->maps.sock_hash, AF_INET6);
 	test_unix_redir(skel, skel->maps.sock_hash, SOCK_DGRAM);
+	test_udp_unix_redir(skel, skel->maps.sock_hash, AF_INET);
+	test_udp_unix_redir(skel, skel->maps.sock_hash, AF_INET6);
 
 	test_sockmap_listen__destroy(skel);
 }