From patchwork Wed Mar 31 02:32:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12174159 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 58253C433E1 for ; Wed, 31 Mar 2021 02:33:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2BAF6619D7 for ; Wed, 31 Mar 2021 02:33:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233325AbhCaCdK (ORCPT ); Tue, 30 Mar 2021 22:33:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60422 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233394AbhCaCcs (ORCPT ); Tue, 30 Mar 2021 22:32:48 -0400 Received: from mail-ot1-x332.google.com (mail-ot1-x332.google.com [IPv6:2607:f8b0:4864:20::332]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9A74BC061574; Tue, 30 Mar 2021 19:32:48 -0700 (PDT) Received: by mail-ot1-x332.google.com with SMTP id v24-20020a9d69d80000b02901b9aec33371so17565976oto.2; Tue, 30 Mar 2021 19:32:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=azpBQutchzCr3FplNtXl63QhozAq1NlwxEgz2eiaadY=; b=kBf+h/AIoHx4fu2MmfkD8h9J7SyGsjPU91DNETj5SIfIoZuWXzrJZIJ1ADtcznpiKU 
CVBW/3kk96E8J5tLiy+2U9Om7SRIOG9Vkf6OAdu04sG8MsA+FKiATqPxqWyo8+Y+k6E5 hNPcI+mE8zCvu+i+ZpXs+6IDBcpMvo5wj60/rk3GXIvNmVhXvpGh7LsJByVhlunrBBgG d/XDGQ/HVL01L//OAGW5e1S7td7gNa+9Lu1CNUZ/IWUFr6sPQ3+/EMA4/gfzllTN+pzB MaiaUQfr0XaS/CNpals9J0rmlel780PRegvEHdle3FmH7VIeZd1AK/rWVI98e+qquVTP ZJPw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=azpBQutchzCr3FplNtXl63QhozAq1NlwxEgz2eiaadY=; b=ZvciV3JBdZrQZQ3BiEJdc9mlFfoWdJdNuiKJayI//5oETZ2iJozka0l0orKFik2s8j m8vPjTZl8QPAKsNr3wX9AwTRDkM5m0q2or2Af0t95mLbb+TWDZFP30K/1xeYL1MTkQve wkdc8AQqVWSOKtG/hnNMsfqlVCSz8AMtH7MS/nWBNyRf0/vRA4+fGx+bzldma6ZaoBbT xoi4tohqZU1AOsYlaOVcAKMvSaRQLSqGjrgEVStYJ7F8qBXMhsAKGrbqELWjuPM9vA0X +83C9WW3MoHXp/IpGXhcZo2t9RSX4Dr+tZwHEJLujrRUc9Sh6Ewfoa222u1+7kVkIh/l 2Hqg== X-Gm-Message-State: AOAM532rD2+v2r+VZ+JGYb0rOwg78vbjjnNEIktGjeyfkyqFSWMDw7d2 ma5jQIoSsBpdTgRqx14J3JFaa/+2vG4awQ== X-Google-Smtp-Source: ABdhPJzqxSB1fVkP2cdWPSaMqufh6YautgKdaLuKmrhlD5JuCXXZEgKRCQnOZkYk6HZ4UAkQYE4uFA== X-Received: by 2002:a05:6830:343:: with SMTP id h3mr729857ote.201.1617157967944; Tue, 30 Mar 2021 19:32:47 -0700 (PDT) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:a099:767b:2b62:48df]) by smtp.gmail.com with ESMTPSA id 7sm188125ois.20.2021.03.30.19.32.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Mar 2021 19:32:47 -0700 (PDT) From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Lorenz Bauer , Jakub Sitnicki Subject: [Patch bpf-next v8 01/16] skmsg: lock ingress_skb when purging Date: Tue, 30 Mar 2021 19:32:22 -0700 Message-Id: <20210331023237.41094-2-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210331023237.41094-1-xiyou.wangcong@gmail.com> References: 
<20210331023237.41094-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang Currently we purge the ingress_skb queue only when psock refcnt goes down to 0, so locking the queue is not necessary, but in order to be called during ->close, we have to lock it here. Cc: John Fastabend Cc: Daniel Borkmann Cc: Lorenz Bauer Acked-by: Jakub Sitnicki Signed-off-by: Cong Wang Acked-by: John Fastabend --- net/core/skmsg.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/core/skmsg.c b/net/core/skmsg.c index 07f54015238a..bebf84ed4e30 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -634,7 +634,7 @@ static void sk_psock_zap_ingress(struct sk_psock *psock) { struct sk_buff *skb; - while ((skb = __skb_dequeue(&psock->ingress_skb)) != NULL) { + while ((skb = skb_dequeue(&psock->ingress_skb)) != NULL) { skb_bpf_redirect_clear(skb); kfree_skb(skb); } From patchwork Wed Mar 31 02:32:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12174161 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 80D75C433E3 for ; Wed, 31 Mar 2021 02:33:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4DF87619D8 for ; Wed, 31 Mar 2021 02:33:44 +0000 (UTC) Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233417AbhCaCdL (ORCPT ); Tue, 30 Mar 2021 22:33:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60432 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233396AbhCaCcu (ORCPT ); Tue, 30 Mar 2021 22:32:50 -0400 Received: from mail-oi1-x236.google.com (mail-oi1-x236.google.com [IPv6:2607:f8b0:4864:20::236]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4668FC06175F; Tue, 30 Mar 2021 19:32:50 -0700 (PDT) Received: by mail-oi1-x236.google.com with SMTP id a8so18560067oic.11; Tue, 30 Mar 2021 19:32:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=1DrML2FTV498fgxHP9w6wGEKfVHhmZ2PXW+mjKUsjIU=; b=MNJoHn25JN6vB2Uy0I5i2x1MSIlVE0RDlfELR1c3uKIqtPusCtRO/p7n31eg1cK2/0 QRDVRGURQstqgBTnAMn3JJxRLPhVLlJZeGBF5ZaDdz7JxATzjUTYE3fAhijBTKpmMe85 ZanrGVZiKwq+vDZoWYjMZdGrqrMKsmMs3NM+Xaeo/S7zVe68Az1qrEm48HCxfM8+AopY Tu72GxjIDDa6GYegaSi+zz8BM84REBHmjyvKzzw9vTLwlkXHiE7TjwEbuyb76ZfBlPgn 8zf7eSMm+PUlqmSLQ+CITDxerOY1NzeedZmmotT2x+skvOXxFb2aYdhUgjCszcQDn27Q s19A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=1DrML2FTV498fgxHP9w6wGEKfVHhmZ2PXW+mjKUsjIU=; b=ZqIXto/K+Te93ZbkkCDHvcg0PQrRbQCtrMrItKZxbtcXrOdN69kpyDcbjrkaTfHVly zd12sM4Q49tNpFImu6eBVAoJzxBbDIrZ0oIRWUfURmEPG7w8Z1bAaCbAzGPoaB2wkG2r DNJJYC5IJaiiqvVmjC/v1ukuOwsllD+Qbw6G65Y2+/Lum1OcyCCupz3e37N11mHcxwoW gAnQryRWu71ZLSE481yAi51tZfcplB1nd43utH/EyG/lbeaChc47FtwEx4pyBS6bbS9u fOpVqedtxAFyTajRo/l4uDPbt5juE0nWGy4q661YwZzC0XLlkPQW4B0Z3EmB8+dWUm7a oqVg== X-Gm-Message-State: AOAM531iqjXQwowMlUJMGfKI9LeIGlQjhfPLTMfRSOgXSaf+uFevKa3E 7Loz7RRmmne1P11B/qtmwzrbcvUQlalogw== X-Google-Smtp-Source: 
ABdhPJw5uVoP/6VoQM3RqC6pF7zqypNSd1YLuchf/uTqO3DdU8NBNk6u8h2sCfYSn6cw7Cgy8sVLVg== X-Received: by 2002:a05:6808:3dc:: with SMTP id o28mr726296oie.120.1617157969441; Tue, 30 Mar 2021 19:32:49 -0700 (PDT) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:a099:767b:2b62:48df]) by smtp.gmail.com with ESMTPSA id 7sm188125ois.20.2021.03.30.19.32.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Mar 2021 19:32:49 -0700 (PDT) From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , Daniel Borkmann , Lorenz Bauer , Jakub Sitnicki , John Fastabend Subject: [Patch bpf-next v8 02/16] skmsg: introduce a spinlock to protect ingress_msg Date: Tue, 30 Mar 2021 19:32:23 -0700 Message-Id: <20210331023237.41094-3-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210331023237.41094-1-xiyou.wangcong@gmail.com> References: <20210331023237.41094-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang Currently we rely on lock_sock to protect ingress_msg, which is too heavy for this. We can just use a spinlock to protect this list, as we do for the other skb queues. __tcp_bpf_recvmsg() is still special because of peeking, so it still has to use lock_sock.
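For readers outside the kernel tree, the locking pattern described above can be sketched in userspace C. This is only an illustration under stated assumptions: struct msg, struct msg_queue and the msg_queue_*() helpers are hypothetical names, and a pthread mutex stands in for spin_lock_bh() on ingress_lock. It is not the code from this patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_msg: an intrusive list node. */
struct msg {
	struct msg *next;
	int id;
};

/* Hypothetical stand-in for the ingress_msg list plus its own lock. */
struct msg_queue {
	pthread_mutex_t lock;	/* plays the role of ingress_lock */
	struct msg *head;
	struct msg *tail;
};

static void msg_queue_init(struct msg_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	q->head = NULL;
	q->tail = NULL;
}

/* Append under the queue lock; no outer (socket-wide) lock is needed. */
static void msg_queue_add_tail(struct msg_queue *q, struct msg *m)
{
	pthread_mutex_lock(&q->lock);
	m->next = NULL;
	if (q->tail)
		q->tail->next = m;
	else
		q->head = m;
	q->tail = m;
	pthread_mutex_unlock(&q->lock);
}

/* Return the first entry without unlinking it, or NULL if empty. */
static struct msg *msg_queue_peek(struct msg_queue *q)
{
	struct msg *m;

	pthread_mutex_lock(&q->lock);
	m = q->head;
	pthread_mutex_unlock(&q->lock);
	return m;
}

/* Unlink and return the first entry, or NULL if empty. */
static struct msg *msg_queue_dequeue(struct msg_queue *q)
{
	struct msg *m;

	pthread_mutex_lock(&q->lock);
	m = q->head;
	if (m) {
		q->head = m->next;
		if (!q->head)
			q->tail = NULL;
		m->next = NULL;
	}
	pthread_mutex_unlock(&q->lock);
	return m;
}
```

Note that the pointer returned by msg_queue_peek() is no longer protected once the lock is dropped. In this sketch the caller would need some outer serialization before dereferencing it, which mirrors why the peeking path is described above as still needing lock_sock.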
Cc: Daniel Borkmann Cc: Lorenz Bauer Acked-by: Jakub Sitnicki Acked-by: John Fastabend Signed-off-by: Cong Wang --- include/linux/skmsg.h | 46 +++++++++++++++++++++++++++++++++++++++++++ net/core/skmsg.c | 3 +++ net/ipv4/tcp_bpf.c | 18 ++++++----------- 3 files changed, 55 insertions(+), 12 deletions(-) diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h index 6c09d94be2e9..f2d45a73b2b2 100644 --- a/include/linux/skmsg.h +++ b/include/linux/skmsg.h @@ -89,6 +89,7 @@ struct sk_psock { #endif struct sk_buff_head ingress_skb; struct list_head ingress_msg; + spinlock_t ingress_lock; unsigned long state; struct list_head link; spinlock_t link_lock; @@ -284,7 +285,45 @@ static inline struct sk_psock *sk_psock(const struct sock *sk) static inline void sk_psock_queue_msg(struct sk_psock *psock, struct sk_msg *msg) { + spin_lock_bh(&psock->ingress_lock); list_add_tail(&msg->list, &psock->ingress_msg); + spin_unlock_bh(&psock->ingress_lock); +} + +static inline struct sk_msg *sk_psock_dequeue_msg(struct sk_psock *psock) +{ + struct sk_msg *msg; + + spin_lock_bh(&psock->ingress_lock); + msg = list_first_entry_or_null(&psock->ingress_msg, struct sk_msg, list); + if (msg) + list_del(&msg->list); + spin_unlock_bh(&psock->ingress_lock); + return msg; +} + +static inline struct sk_msg *sk_psock_peek_msg(struct sk_psock *psock) +{ + struct sk_msg *msg; + + spin_lock_bh(&psock->ingress_lock); + msg = list_first_entry_or_null(&psock->ingress_msg, struct sk_msg, list); + spin_unlock_bh(&psock->ingress_lock); + return msg; +} + +static inline struct sk_msg *sk_psock_next_msg(struct sk_psock *psock, + struct sk_msg *msg) +{ + struct sk_msg *ret; + + spin_lock_bh(&psock->ingress_lock); + if (list_is_last(&msg->list, &psock->ingress_msg)) + ret = NULL; + else + ret = list_next_entry(msg, list); + spin_unlock_bh(&psock->ingress_lock); + return ret; } static inline bool sk_psock_queue_empty(const struct sk_psock *psock) @@ -292,6 +331,13 @@ static inline bool 
sk_psock_queue_empty(const struct sk_psock *psock) return psock ? list_empty(&psock->ingress_msg) : true; } +static inline void kfree_sk_msg(struct sk_msg *msg) +{ + if (msg->skb) + consume_skb(msg->skb); + kfree(msg); +} + static inline void sk_psock_report_error(struct sk_psock *psock, int err) { struct sock *sk = psock->sk; diff --git a/net/core/skmsg.c b/net/core/skmsg.c index bebf84ed4e30..305dddc51857 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -592,6 +592,7 @@ struct sk_psock *sk_psock_init(struct sock *sk, int node) INIT_WORK(&psock->work, sk_psock_backlog); INIT_LIST_HEAD(&psock->ingress_msg); + spin_lock_init(&psock->ingress_lock); skb_queue_head_init(&psock->ingress_skb); sk_psock_set_state(psock, SK_PSOCK_TX_ENABLED); @@ -638,7 +639,9 @@ static void sk_psock_zap_ingress(struct sk_psock *psock) skb_bpf_redirect_clear(skb); kfree_skb(skb); } + spin_lock_bh(&psock->ingress_lock); __sk_psock_purge_ingress_msg(psock); + spin_unlock_bh(&psock->ingress_lock); } static void sk_psock_link_destroy(struct sk_psock *psock) diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index 17c322b875fd..ae980716d896 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -18,9 +18,7 @@ int __tcp_bpf_recvmsg(struct sock *sk, struct sk_psock *psock, struct sk_msg *msg_rx; int i, copied = 0; - msg_rx = list_first_entry_or_null(&psock->ingress_msg, - struct sk_msg, list); - + msg_rx = sk_psock_peek_msg(psock); while (copied != len) { struct scatterlist *sge; @@ -68,22 +66,18 @@ int __tcp_bpf_recvmsg(struct sock *sk, struct sk_psock *psock, } while (i != msg_rx->sg.end); if (unlikely(peek)) { - if (msg_rx == list_last_entry(&psock->ingress_msg, - struct sk_msg, list)) + msg_rx = sk_psock_next_msg(psock, msg_rx); + if (!msg_rx) break; - msg_rx = list_next_entry(msg_rx, list); continue; } msg_rx->sg.start = i; if (!sge->length && msg_rx->sg.start == msg_rx->sg.end) { - list_del(&msg_rx->list); - if (msg_rx->skb) - consume_skb(msg_rx->skb); - kfree(msg_rx); + msg_rx 
= sk_psock_dequeue_msg(psock); + kfree_sk_msg(msg_rx); } - msg_rx = list_first_entry_or_null(&psock->ingress_msg, - struct sk_msg, list); + msg_rx = sk_psock_peek_msg(psock); } return copied; From patchwork Wed Mar 31 02:32:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12174165 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A6772C433E8 for ; Wed, 31 Mar 2021 02:33:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8D12D619D3 for ; Wed, 31 Mar 2021 02:33:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233386AbhCaCdL (ORCPT ); Tue, 30 Mar 2021 22:33:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60444 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233403AbhCaCcv (ORCPT ); Tue, 30 Mar 2021 22:32:51 -0400 Received: from mail-ot1-x333.google.com (mail-ot1-x333.google.com [IPv6:2607:f8b0:4864:20::333]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 69CA2C061574; Tue, 30 Mar 2021 19:32:51 -0700 (PDT) Received: by mail-ot1-x333.google.com with SMTP id k14-20020a9d7dce0000b02901b866632f29so17606541otn.1; Tue, 30 Mar 2021 19:32:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; 
h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=8Wh0mzicKWG1utRZNIevS3oRZH+mXn5JWIkzgtioLig=; b=H0z0T66CbU6G1Qbor/GIxdSbaX9Bt+5B9VSgkpdr9sjv3u6Hc7T5Ib6yoBOfSKfVy2 HuJAfkl1P5I9z+b+NRkUIBcKzHmhLcOZ7lBXCGt1vhHb25wO81nhP+WstT97sd+y9cTt JTQfZsxfFHebRk9Y82JyafDLxPizdOIIJnc+6316v8MeN1R9Z5ntNYaaN9KwCGL9JCZr XdWGz6EIJv7liJEXdV5tbr2aKpZI3jVXByJnTKwjaVCpIQJJDStPoD14lZAa8zA7KVNJ phKCb72UeIjv+KaHl7Kv9oXoMXETvskOy+A1cDJMIxpnItGbipfuL+V2W3S+cceNklUl 7ksw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=8Wh0mzicKWG1utRZNIevS3oRZH+mXn5JWIkzgtioLig=; b=eIL1/FG7snACLfGxptEhYCz8XrHx9sYFwvPyYJhk6qIXyfWDZ2OYqGj8KEzn8UGQjp /QCilcBh+XRsz7ve8jFL1iZ5MmNqTi2rQh+cXafFHLIuFWPBySnfw9ATOO8QPr0s4TpJ LN+KodpTH81gnj3HUFR+Io6QewGYYkzDimrAzNKrizKC94fktmY6ndqncapHM6O8RvNi bKRWezHvPlVEaXosVhxiJrvbXRBfHlwj4xbNMfxEI7R9MbVtD0mFLgpPt1LKqIKkf8Qt WjVKu2kXglKnUmFl0BORHRoL0fm3orQa4ROnzhDBQKNw2yvWySH4j6SQHASsnu0Imwey /+Ew== X-Gm-Message-State: AOAM530wo7n/isf7fzbuV9UrZFIsSfRc+kgHtuHOcEyaGiKz+msx357k aBAp10p5frYZH5feeRsnmzq1UOu9D0663w== X-Google-Smtp-Source: ABdhPJzmIdkEvbn6PuxGYdemkk2WpgNtEt8XDmIMlmpgEwJzbLi5u6gL7yWmZQ/HbCnjft12zMJWKg== X-Received: by 2002:a9d:6296:: with SMTP id x22mr771545otk.196.1617157970720; Tue, 30 Mar 2021 19:32:50 -0700 (PDT) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:a099:767b:2b62:48df]) by smtp.gmail.com with ESMTPSA id 7sm188125ois.20.2021.03.30.19.32.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Mar 2021 19:32:50 -0700 (PDT) From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next v8 03/16] net: introduce skb_send_sock() for 
sock_map Date: Tue, 30 Mar 2021 19:32:24 -0700 Message-Id: <20210331023237.41094-4-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210331023237.41094-1-xiyou.wangcong@gmail.com> References: <20210331023237.41094-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang We only have skb_send_sock_locked(), which requires callers to use lock_sock(). Introduce a variant, skb_send_sock(), which locks on its own so that callers do not need to lock the socket any more. This will save us from adding a ->sendmsg_locked for each protocol. To reuse the code, pass function pointers to __skb_send_sock() and build skb_send_sock() and skb_send_sock_locked() on top. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang Reviewed-by: Jakub Sitnicki --- include/linux/skbuff.h | 1 + net/core/skbuff.c | 55 ++++++++++++++++++++++++++++++++++++------ 2 files changed, 49 insertions(+), 7 deletions(-) diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index c8def85fcc22..dbf820a50a39 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -3626,6 +3626,7 @@ int skb_splice_bits(struct sk_buff *skb, struct sock *sk, unsigned int offset, unsigned int flags); int skb_send_sock_locked(struct sock *sk, struct sk_buff *skb, int offset, int len); +int skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset, int len); void skb_copy_and_csum_dev(const struct sk_buff *skb, u8 *to); unsigned int skb_zerocopy_headlen(const struct sk_buff *from); int skb_zerocopy(struct sk_buff *to, struct sk_buff *from, diff --git a/net/core/skbuff.c b/net/core/skbuff.c index e8320b5d651a..3ad9e8425ab2 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -2500,9 +2500,32 @@ int skb_splice_bits(struct sk_buff *skb, struct sock *sk, unsigned int offset, } EXPORT_SYMBOL_GPL(skb_splice_bits); -/* Send skb data on a socket. 
Socket must be locked. */ -int skb_send_sock_locked(struct sock *sk, struct sk_buff *skb, int offset, - int len) +static int sendmsg_unlocked(struct sock *sk, struct msghdr *msg, + struct kvec *vec, size_t num, size_t size) +{ + struct socket *sock = sk->sk_socket; + + if (!sock) + return -EINVAL; + return kernel_sendmsg(sock, msg, vec, num, size); +} + +static int sendpage_unlocked(struct sock *sk, struct page *page, int offset, + size_t size, int flags) +{ + struct socket *sock = sk->sk_socket; + + if (!sock) + return -EINVAL; + return kernel_sendpage(sock, page, offset, size, flags); +} + +typedef int (*sendmsg_func)(struct sock *sk, struct msghdr *msg, + struct kvec *vec, size_t num, size_t size); +typedef int (*sendpage_func)(struct sock *sk, struct page *page, int offset, + size_t size, int flags); +static int __skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset, + int len, sendmsg_func sendmsg, sendpage_func sendpage) { unsigned int orig_len = len; struct sk_buff *head = skb; @@ -2522,7 +2545,8 @@ int skb_send_sock_locked(struct sock *sk, struct sk_buff *skb, int offset, memset(&msg, 0, sizeof(msg)); msg.msg_flags = MSG_DONTWAIT; - ret = kernel_sendmsg_locked(sk, &msg, &kv, 1, slen); + ret = INDIRECT_CALL_2(sendmsg, kernel_sendmsg_locked, + sendmsg_unlocked, sk, &msg, &kv, 1, slen); if (ret <= 0) goto error; @@ -2553,9 +2577,11 @@ int skb_send_sock_locked(struct sock *sk, struct sk_buff *skb, int offset, slen = min_t(size_t, len, skb_frag_size(frag) - offset); while (slen) { - ret = kernel_sendpage_locked(sk, skb_frag_page(frag), - skb_frag_off(frag) + offset, - slen, MSG_DONTWAIT); + ret = INDIRECT_CALL_2(sendpage, kernel_sendpage_locked, + sendpage_unlocked, sk, + skb_frag_page(frag), + skb_frag_off(frag) + offset, + slen, MSG_DONTWAIT); if (ret <= 0) goto error; @@ -2587,8 +2613,23 @@ int skb_send_sock_locked(struct sock *sk, struct sk_buff *skb, int offset, error: return orig_len == len ? 
ret : orig_len - len; } + +/* Send skb data on a socket. Socket must be locked. */ +int skb_send_sock_locked(struct sock *sk, struct sk_buff *skb, int offset, + int len) +{ + return __skb_send_sock(sk, skb, offset, len, kernel_sendmsg_locked, + kernel_sendpage_locked); +} EXPORT_SYMBOL_GPL(skb_send_sock_locked); +/* Send skb data on a socket. Socket must be unlocked. */ +int skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset, int len) +{ + return __skb_send_sock(sk, skb, offset, len, sendmsg_unlocked, + sendpage_unlocked); +} + /** * skb_store_bits - store bits from kernel buffer to skb * @skb: destination buffer From patchwork Wed Mar 31 02:32:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12174167 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9A183C433E5 for ; Wed, 31 Mar 2021 02:33:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7C78C619D7 for ; Wed, 31 Mar 2021 02:33:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233383AbhCaCdM (ORCPT ); Tue, 30 Mar 2021 22:33:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60446 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233404AbhCaCcw (ORCPT ); Tue, 30 Mar 2021 22:32:52 -0400 Received: from 
mail-oo1-xc34.google.com (mail-oo1-xc34.google.com [IPv6:2607:f8b0:4864:20::c34]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B4944C061574; Tue, 30 Mar 2021 19:32:52 -0700 (PDT) Received: by mail-oo1-xc34.google.com with SMTP id j20-20020a4ad6d40000b02901b66fe8acd6so4259203oot.7; Tue, 30 Mar 2021 19:32:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ZvA8CFNzThNTsW5v98MoXpze0rMZogre745UZc6yQ9M=; b=Qe3RKQW7aUk8/pZP4tsx3BPI9VOUnq/bfcWfEsG0pQjpyfyX3PuTGl2KK2ZEmZvFW9 NfSZsaxggGLw69r3+xd7cKUdwfzO/sF/NwQVVTKcGBsv9I3QZSJX4e9/EiTTecAdD09E YzVSX5Y8M7TBh2QVUqEoR2nsVfBXxuusPmFs4YwObLjJRK6ZlfbTEUkDKlBqejzheDAo ajH7I0W4WrzI3HMDkndFvxUSbhGjMZD6oMVICkv+z3cHkbuqGCzdAWHffdsBNsQ2h8pO ZnIMM5ZUD1WT6U768PA07zCqbiTQypYn5zBxYFm7cTKZi3VkZgYZEjJ8Sl13IXeH7HlU 3oFg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ZvA8CFNzThNTsW5v98MoXpze0rMZogre745UZc6yQ9M=; b=t2cL0QsoQcWO2Zp9X1H8XOlXw3WfNnEaHlTgP3b1vSktEb1KELt+XqhaorooFu4IuW evQSVciaK8D4CMHyJ4EB5cyuzr6WBPNlquL3XkOI2b5DW4YbWYGU7DPfrGpni4N+yVjq mQBc1dJ4dnhjkaZgXDd8Bb2eIJVDHoTNMVwsjHQkmwIdXbtCIKj/fCQNWUFt3GjhfO6o 7qH5jAZHXU3dSRGVkxXFtYeQxvQEJQOlBX4PiVff72hYDAyNsRH0gmFtzTbpzK7tarRi gsoYzA/JooACi9h0JQQnKG7RVbGypvv/mxu27i/zsz4PMmuOloDeP2eVmwWrpDyfIS4h rB9A== X-Gm-Message-State: AOAM5318R8j9keSQbH4f/jCeg68YAzWDMNj2XP2/PDdvHnL5imu6mVAZ g8FhstQfB94HjQggGWxVqMrFUbi8Ezi5Vg== X-Google-Smtp-Source: ABdhPJyvGV4B4CPWCrLxsJuhVsQNBTfwalnuzK3bGzpC17ipe73C0eD/BIEzLFu0qHzl8tQBJt8BfA== X-Received: by 2002:a4a:7615:: with SMTP id t21mr853554ooc.72.1617157971947; Tue, 30 Mar 2021 19:32:51 -0700 (PDT) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:a099:767b:2b62:48df]) by smtp.gmail.com with ESMTPSA id 
7sm188125ois.20.2021.03.30.19.32.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Mar 2021 19:32:51 -0700 (PDT) From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer , John Fastabend Subject: [Patch bpf-next v8 04/16] skmsg: avoid lock_sock() in sk_psock_backlog() Date: Tue, 30 Mar 2021 19:32:25 -0700 Message-Id: <20210331023237.41094-5-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210331023237.41094-1-xiyou.wangcong@gmail.com> References: <20210331023237.41094-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang We do not have to lock the sock to avoid losing sk_socket; instead we can purge all the ingress queues when we close the socket. Sending or receiving packets after orphaning the socket makes no sense. We do purge these queues when the psock refcnt reaches zero, but here we want to purge them explicitly in sock_map_close(). There are also some nasty race conditions between testing bit SK_PSOCK_TX_ENABLED and queuing/canceling the psock work; we can expand psock->ingress_lock a bit to protect them too. As noticed by John, we still have to lock the psock->work, because the same work item could be running concurrently on different CPUs.
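The race described above (testing an "enabled" bit and then queuing work, while another path clears the bit and purges) can be illustrated with a small userspace sketch. All names here (struct worker, worker_enqueue(), worker_stop()) are hypothetical, and a pthread mutex stands in for psock->ingress_lock. The sketch shows only the idea of doing the test and the enqueue under the same lock that the stop path takes. It does not model work_mutex or the cancel_work_sync() step.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAP 8

/* Hypothetical stand-in for a psock: a flag and a queue sharing one lock. */
struct worker {
	pthread_mutex_t lock;	/* plays the role of psock->ingress_lock */
	bool tx_enabled;	/* plays the role of SK_PSOCK_TX_ENABLED */
	int items[QUEUE_CAP];
	size_t count;
};

static void worker_init(struct worker *w)
{
	pthread_mutex_init(&w->lock, NULL);
	w->tx_enabled = true;
	w->count = 0;
}

/*
 * Test the flag and enqueue in one critical section, so a concurrent
 * worker_stop() cannot slip in between the test and the enqueue.
 * Returns false if the worker is stopped or full; the caller drops the item.
 */
static bool worker_enqueue(struct worker *w, int item)
{
	bool queued = false;

	pthread_mutex_lock(&w->lock);
	if (w->tx_enabled && w->count < QUEUE_CAP) {
		w->items[w->count++] = item;
		queued = true;
	}
	pthread_mutex_unlock(&w->lock);
	return queued;
}

/* Clear the flag and purge the queue under the same lock. */
static void worker_stop(struct worker *w)
{
	pthread_mutex_lock(&w->lock);
	w->tx_enabled = false;
	w->count = 0;
	pthread_mutex_unlock(&w->lock);
}
```

Because both paths take the same lock, once worker_stop() returns in this sketch no later worker_enqueue() can add an item, so the queue stays empty after the purge.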
Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Acked-by: John Fastabend Signed-off-by: Cong Wang --- include/linux/skmsg.h | 2 ++ net/core/skmsg.c | 50 +++++++++++++++++++++++++++++-------------- net/core/sock_map.c | 1 + 3 files changed, 37 insertions(+), 16 deletions(-) diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h index f2d45a73b2b2..7382c4b518d7 100644 --- a/include/linux/skmsg.h +++ b/include/linux/skmsg.h @@ -99,6 +99,7 @@ struct sk_psock { void (*saved_write_space)(struct sock *sk); void (*saved_data_ready)(struct sock *sk); struct proto *sk_proto; + struct mutex work_mutex; struct sk_psock_work_state work_state; struct work_struct work; union { @@ -347,6 +348,7 @@ static inline void sk_psock_report_error(struct sk_psock *psock, int err) } struct sk_psock *sk_psock_init(struct sock *sk, int node); +void sk_psock_stop(struct sk_psock *psock, bool wait); #if IS_ENABLED(CONFIG_BPF_STREAM_PARSER) int sk_psock_init_strp(struct sock *sk, struct sk_psock *psock); diff --git a/net/core/skmsg.c b/net/core/skmsg.c index 305dddc51857..9c25020086a9 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -497,7 +497,7 @@ static int sk_psock_handle_skb(struct sk_psock *psock, struct sk_buff *skb, if (!ingress) { if (!sock_writeable(psock->sk)) return -EAGAIN; - return skb_send_sock_locked(psock->sk, skb, off, len); + return skb_send_sock(psock->sk, skb, off, len); } return sk_psock_skb_ingress(psock, skb); } @@ -511,8 +511,7 @@ static void sk_psock_backlog(struct work_struct *work) u32 len, off; int ret; - /* Lock sock to avoid losing sk_socket during loop. 
*/ - lock_sock(psock->sk); + mutex_lock(&psock->work_mutex); if (state->skb) { skb = state->skb; len = state->len; @@ -529,7 +528,7 @@ static void sk_psock_backlog(struct work_struct *work) skb_bpf_redirect_clear(skb); do { ret = -EIO; - if (likely(psock->sk->sk_socket)) + if (!sock_flag(psock->sk, SOCK_DEAD)) ret = sk_psock_handle_skb(psock, skb, off, len, ingress); if (ret <= 0) { @@ -553,7 +552,7 @@ static void sk_psock_backlog(struct work_struct *work) kfree_skb(skb); } end: - release_sock(psock->sk); + mutex_unlock(&psock->work_mutex); } struct sk_psock *sk_psock_init(struct sock *sk, int node) @@ -591,6 +590,7 @@ struct sk_psock *sk_psock_init(struct sock *sk, int node) spin_lock_init(&psock->link_lock); INIT_WORK(&psock->work, sk_psock_backlog); + mutex_init(&psock->work_mutex); INIT_LIST_HEAD(&psock->ingress_msg); spin_lock_init(&psock->ingress_lock); skb_queue_head_init(&psock->ingress_skb); @@ -631,7 +631,7 @@ static void __sk_psock_purge_ingress_msg(struct sk_psock *psock) } } -static void sk_psock_zap_ingress(struct sk_psock *psock) +static void __sk_psock_zap_ingress(struct sk_psock *psock) { struct sk_buff *skb; @@ -639,9 +639,7 @@ static void sk_psock_zap_ingress(struct sk_psock *psock) skb_bpf_redirect_clear(skb); kfree_skb(skb); } - spin_lock_bh(&psock->ingress_lock); __sk_psock_purge_ingress_msg(psock); - spin_unlock_bh(&psock->ingress_lock); } static void sk_psock_link_destroy(struct sk_psock *psock) @@ -654,6 +652,18 @@ static void sk_psock_link_destroy(struct sk_psock *psock) } } +void sk_psock_stop(struct sk_psock *psock, bool wait) +{ + spin_lock_bh(&psock->ingress_lock); + sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED); + sk_psock_cork_free(psock); + __sk_psock_zap_ingress(psock); + spin_unlock_bh(&psock->ingress_lock); + + if (wait) + cancel_work_sync(&psock->work); +} + static void sk_psock_done_strp(struct sk_psock *psock); static void sk_psock_destroy_deferred(struct work_struct *gc) @@ -665,12 +675,12 @@ static void 
sk_psock_destroy_deferred(struct work_struct *gc)
 	sk_psock_done_strp(psock);
 
 	cancel_work_sync(&psock->work);
+	mutex_destroy(&psock->work_mutex);
 
 	psock_progs_drop(&psock->progs);
 
 	sk_psock_link_destroy(psock);
 	sk_psock_cork_free(psock);
-	sk_psock_zap_ingress(psock);
 
 	if (psock->sk_redir)
 		sock_put(psock->sk_redir);
@@ -688,8 +698,7 @@ static void sk_psock_destroy(struct rcu_head *rcu)
 
 void sk_psock_drop(struct sock *sk, struct sk_psock *psock)
 {
-	sk_psock_cork_free(psock);
-	sk_psock_zap_ingress(psock);
+	sk_psock_stop(psock, false);
 
 	write_lock_bh(&sk->sk_callback_lock);
 	sk_psock_restore_proto(sk, psock);
@@ -699,7 +708,6 @@ void sk_psock_drop(struct sock *sk, struct sk_psock *psock)
 	else if (psock->progs.stream_verdict)
 		sk_psock_stop_verdict(sk, psock);
 	write_unlock_bh(&sk->sk_callback_lock);
-	sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED);
 
 	call_rcu(&psock->rcu, sk_psock_destroy);
 }
@@ -770,14 +778,20 @@ static void sk_psock_skb_redirect(struct sk_buff *skb)
 	 * error that caused the pipe to break. We can't send a packet on
 	 * a socket that is in this state so we drop the skb.
 	 */
-	if (!psock_other || sock_flag(sk_other, SOCK_DEAD) ||
-	    !sk_psock_test_state(psock_other, SK_PSOCK_TX_ENABLED)) {
+	if (!psock_other || sock_flag(sk_other, SOCK_DEAD)) {
+		kfree_skb(skb);
+		return;
+	}
+	spin_lock_bh(&psock_other->ingress_lock);
+	if (!sk_psock_test_state(psock_other, SK_PSOCK_TX_ENABLED)) {
+		spin_unlock_bh(&psock_other->ingress_lock);
 		kfree_skb(skb);
 		return;
 	}
 
 	skb_queue_tail(&psock_other->ingress_skb, skb);
 	schedule_work(&psock_other->work);
+	spin_unlock_bh(&psock_other->ingress_lock);
 }
 
 static void sk_psock_tls_verdict_apply(struct sk_buff *skb, struct sock *sk, int verdict)
@@ -845,8 +859,12 @@ static void sk_psock_verdict_apply(struct sk_psock *psock,
 			err = sk_psock_skb_ingress_self(psock, skb);
 		}
 		if (err < 0) {
-			skb_queue_tail(&psock->ingress_skb, skb);
-			schedule_work(&psock->work);
+			spin_lock_bh(&psock->ingress_lock);
+			if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED)) {
+				skb_queue_tail(&psock->ingress_skb, skb);
+				schedule_work(&psock->work);
+			}
+			spin_unlock_bh(&psock->ingress_lock);
 		}
 		break;
 	case __SK_REDIRECT:
diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index dd53a7771d7e..e564fdeaada1 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -1540,6 +1540,7 @@ void sock_map_close(struct sock *sk, long timeout)
 	saved_close = psock->saved_close;
 	sock_map_remove_links(sk, psock);
 	rcu_read_unlock();
+	sk_psock_stop(psock, true);
 	release_sock(sk);
 	saved_close(sk, timeout);
 }

From patchwork Wed Mar 31 02:32:26 2021
X-Patchwork-Submitter: Cong Wang
X-Patchwork-Id: 12174163
X-Patchwork-Delegate: bpf@iogearbox.net
From: Cong Wang
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com,
 wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang,
 Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer, John Fastabend
Subject: [Patch bpf-next v8 05/16] skmsg: use rcu work for destroying psock
Date: Tue, 30 Mar 2021 19:32:26 -0700
Message-Id: <20210331023237.41094-6-xiyou.wangcong@gmail.com>
In-Reply-To: <20210331023237.41094-1-xiyou.wangcong@gmail.com>
References: <20210331023237.41094-1-xiyou.wangcong@gmail.com>

The RCU callback sk_psock_destroy() only queues work psock->gc,
so we can just switch to rcu work to simplify the code.
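
As a rough user-space illustration of how an rcu work callback finds its owning object, the sketch below walks from the work_struct handed to the callback back to the enclosing structure with two container_of() steps, the same shape as the new sk_psock_destroy(). The types here are simplified stand-ins, and mock_psock and mock_psock_from_work() are hypothetical names that are not part of this series.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified user-space stand-ins; the real kernel types carry more state. */
struct work_struct { int pending; };
struct rcu_work { struct work_struct work; };

/* Hypothetical mock of the psock, reduced to what this sketch needs. */
struct mock_psock {
	int id;
	struct rcu_work rwork;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Same idea as the kernel's to_rcu_work(): work_struct -> enclosing rcu_work. */
static struct rcu_work *to_rcu_work(struct work_struct *work)
{
	return container_of(work, struct rcu_work, work);
}

/* The callback only receives the work_struct and walks back to the psock. */
static struct mock_psock *mock_psock_from_work(struct work_struct *work)
{
	assert(work != NULL);
	return container_of(to_rcu_work(work), struct mock_psock, rwork);
}
```

Passing &p.rwork.work for some struct mock_psock p yields &p again, which is why one embedded rcu_work member can stand in for the old rcu_head/work_struct union.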

Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Acked-by: John Fastabend
Signed-off-by: Cong Wang
---
 include/linux/skmsg.h |  5 +----
 net/core/skmsg.c      | 17 +++++------------
 2 files changed, 6 insertions(+), 16 deletions(-)

diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index 7382c4b518d7..e7aba150539d 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -102,10 +102,7 @@ struct sk_psock {
 	struct mutex			work_mutex;
 	struct sk_psock_work_state	work_state;
 	struct work_struct		work;
-	union {
-		struct rcu_head		rcu;
-		struct work_struct	gc;
-	};
+	struct rcu_work			rwork;
 };
 
 int sk_msg_alloc(struct sock *sk, struct sk_msg *msg, int len,
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 9c25020086a9..d43d43905d2c 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -666,10 +666,10 @@ void sk_psock_stop(struct sk_psock *psock, bool wait)
 
 static void sk_psock_done_strp(struct sk_psock *psock);
 
-static void sk_psock_destroy_deferred(struct work_struct *gc)
+static void sk_psock_destroy(struct work_struct *work)
 {
-	struct sk_psock *psock = container_of(gc, struct sk_psock, gc);
-
+	struct sk_psock *psock = container_of(to_rcu_work(work),
+					      struct sk_psock, rwork);
 	/* No sk_callback_lock since already detached. */
 
 	sk_psock_done_strp(psock);
@@ -688,14 +688,6 @@ static void sk_psock_destroy_deferred(struct work_struct *gc)
 	kfree(psock);
 }
 
-static void sk_psock_destroy(struct rcu_head *rcu)
-{
-	struct sk_psock *psock = container_of(rcu, struct sk_psock, rcu);
-
-	INIT_WORK(&psock->gc, sk_psock_destroy_deferred);
-	schedule_work(&psock->gc);
-}
-
 void sk_psock_drop(struct sock *sk, struct sk_psock *psock)
 {
 	sk_psock_stop(psock, false);
@@ -709,7 +701,8 @@ void sk_psock_drop(struct sock *sk, struct sk_psock *psock)
 		sk_psock_stop_verdict(sk, psock);
 	write_unlock_bh(&sk->sk_callback_lock);
 
-	call_rcu(&psock->rcu, sk_psock_destroy);
+	INIT_RCU_WORK(&psock->rwork, sk_psock_destroy);
+	queue_rcu_work(system_wq, &psock->rwork);
 }
 EXPORT_SYMBOL_GPL(sk_psock_drop);
 

From patchwork Wed Mar 31 02:32:27 2021
X-Patchwork-Submitter: Cong Wang
X-Patchwork-Id: 12174171
X-Patchwork-Delegate: bpf@iogearbox.net
From: Cong Wang
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com,
 wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang,
 Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer, John Fastabend
Subject: [Patch bpf-next v8 06/16] skmsg: use GFP_KERNEL in sk_psock_create_ingress_msg()
Date: Tue, 30 Mar 2021 19:32:27 -0700
Message-Id: <20210331023237.41094-7-xiyou.wangcong@gmail.com>
In-Reply-To: <20210331023237.41094-1-xiyou.wangcong@gmail.com>
References: <20210331023237.41094-1-xiyou.wangcong@gmail.com>

This function is only called in process context.

Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Acked-by: John Fastabend
Signed-off-by: Cong Wang
---
 net/core/skmsg.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index d43d43905d2c..656eceab73bc 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -410,7 +410,7 @@ static struct sk_msg *sk_psock_create_ingress_msg(struct sock *sk,
 	if (!sk_rmem_schedule(sk, skb, skb->truesize))
 		return NULL;
 
-	msg = kzalloc(sizeof(*msg), __GFP_NOWARN | GFP_ATOMIC);
+	msg = kzalloc(sizeof(*msg), __GFP_NOWARN | GFP_KERNEL);
 	if (unlikely(!msg))
 		return NULL;
 

From patchwork Wed Mar 31 02:32:28 2021
X-Patchwork-Submitter: Cong Wang
X-Patchwork-Id: 12174189
X-Patchwork-Delegate: bpf@iogearbox.net
From: Cong Wang
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com,
 wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang,
 Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer, John Fastabend
Subject: [Patch bpf-next v8 07/16] sock_map: simplify sock_map_link() a bit
Date: Tue, 30 Mar 2021 19:32:28 -0700
Message-Id: <20210331023237.41094-8-xiyou.wangcong@gmail.com>
In-Reply-To: <20210331023237.41094-1-xiyou.wangcong@gmail.com>
References: <20210331023237.41094-1-xiyou.wangcong@gmail.com>

sock_map_link() passes down map progs, but it is confusing to see both
map progs and psock progs.
Make the map progs more obvious by retrieving it directly with
sock_map_progs() inside sock_map_link(). Now it is aligned with
sock_map_link_no_progs() too.

Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Cc: John Fastabend
Signed-off-by: Cong Wang
Acked-by: John Fastabend
---
 net/core/sock_map.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index e564fdeaada1..d06face0f16c 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -26,6 +26,7 @@ struct bpf_stab {
 
 static int sock_map_prog_update(struct bpf_map *map, struct bpf_prog *prog,
 				struct bpf_prog *old, u32 which);
+static struct sk_psock_progs *sock_map_progs(struct bpf_map *map);
 
 static struct bpf_map *sock_map_alloc(union bpf_attr *attr)
 {
@@ -224,10 +225,10 @@ static struct sk_psock *sock_map_psock_get_checked(struct sock *sk)
 	return psock;
 }
 
-static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs,
-			 struct sock *sk)
+static int sock_map_link(struct bpf_map *map, struct sock *sk)
 {
 	struct bpf_prog *msg_parser, *stream_parser, *stream_verdict;
+	struct sk_psock_progs *progs = sock_map_progs(map);
 	struct sk_psock *psock;
 	int ret;
 
@@ -492,7 +493,7 @@ static int sock_map_update_common(struct bpf_map *map, u32 idx,
 	 * and sk_write_space callbacks overridden.
 	 */
 	if (sock_map_redirect_allowed(sk))
-		ret = sock_map_link(map, &stab->progs, sk);
+		ret = sock_map_link(map, sk);
 	else
 		ret = sock_map_link_no_progs(map, sk);
 	if (ret < 0)
@@ -1004,7 +1005,7 @@ static int sock_hash_update_common(struct bpf_map *map, void *key,
 	 * and sk_write_space callbacks overridden.
 	 */
 	if (sock_map_redirect_allowed(sk))
-		ret = sock_map_link(map, &htab->progs, sk);
+		ret = sock_map_link(map, sk);
 	else
 		ret = sock_map_link_no_progs(map, sk);
 	if (ret < 0)

From patchwork Wed Mar 31 02:32:29 2021
X-Patchwork-Submitter: Cong Wang
X-Patchwork-Id: 12174169
X-Patchwork-Delegate: bpf@iogearbox.net
From: Cong Wang
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com,
 wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang,
 Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer, John Fastabend
Subject: [Patch bpf-next v8 08/16] sock_map: kill sock_map_link_no_progs()
Date: Tue, 30 Mar 2021 19:32:29 -0700
Message-Id: <20210331023237.41094-9-xiyou.wangcong@gmail.com>
In-Reply-To: <20210331023237.41094-1-xiyou.wangcong@gmail.com>
References: <20210331023237.41094-1-xiyou.wangcong@gmail.com>

Now we can fold sock_map_link_no_progs() into sock_map_link()
and get rid of sock_map_link_no_progs().

Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Cc: John Fastabend
Signed-off-by: Cong Wang
---
 net/core/sock_map.c | 55 +++++++++++++--------------------------------
 1 file changed, 15 insertions(+), 40 deletions(-)

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index d06face0f16c..42d797291d34 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -225,13 +225,24 @@ static struct sk_psock *sock_map_psock_get_checked(struct sock *sk)
 	return psock;
 }
 
+static bool sock_map_redirect_allowed(const struct sock *sk);
+
 static int sock_map_link(struct bpf_map *map, struct sock *sk)
 {
-	struct bpf_prog *msg_parser, *stream_parser, *stream_verdict;
 	struct sk_psock_progs *progs = sock_map_progs(map);
+	struct bpf_prog *stream_verdict = NULL;
+	struct bpf_prog *stream_parser = NULL;
+	struct bpf_prog *msg_parser = NULL;
 	struct sk_psock *psock;
 	int ret;
 
+	/* Only sockets we can redirect into/from in BPF need to hold
+	 * refs to parser/verdict progs and have their sk_data_ready
+	 * and sk_write_space callbacks overridden.
+	 */
+	if (!sock_map_redirect_allowed(sk))
+		goto no_progs;
+
 	stream_verdict = READ_ONCE(progs->stream_verdict);
 	if (stream_verdict) {
 		stream_verdict = bpf_prog_inc_not_zero(stream_verdict);
@@ -257,6 +268,7 @@ static int sock_map_link(struct bpf_map *map, struct sock *sk)
 		}
 	}
 
+no_progs:
 	psock = sock_map_psock_get_checked(sk);
 	if (IS_ERR(psock)) {
 		ret = PTR_ERR(psock);
@@ -316,27 +328,6 @@ static int sock_map_link(struct bpf_map *map, struct sock *sk)
 	return ret;
 }
 
-static int sock_map_link_no_progs(struct bpf_map *map, struct sock *sk)
-{
-	struct sk_psock *psock;
-	int ret;
-
-	psock = sock_map_psock_get_checked(sk);
-	if (IS_ERR(psock))
-		return PTR_ERR(psock);
-
-	if (!psock) {
-		psock = sk_psock_init(sk, map->numa_node);
-		if (IS_ERR(psock))
-			return PTR_ERR(psock);
-	}
-
-	ret = sock_map_init_proto(sk, psock);
-	if (ret < 0)
-		sk_psock_put(sk, psock);
-	return ret;
-}
-
 static void sock_map_free(struct bpf_map *map)
 {
 	struct bpf_stab *stab = container_of(map, struct bpf_stab, map);
@@ -467,8 +458,6 @@ static int sock_map_get_next_key(struct bpf_map *map, void *key, void *next)
 	return 0;
 }
 
-static bool sock_map_redirect_allowed(const struct sock *sk);
-
 static int sock_map_update_common(struct bpf_map *map, u32 idx,
 				  struct sock *sk, u64 flags)
 {
@@ -488,14 +477,7 @@ static int sock_map_update_common(struct bpf_map *map, u32 idx,
 	if (!link)
 		return -ENOMEM;
 
-	/* Only sockets we can redirect into/from in BPF need to hold
-	 * refs to parser/verdict progs and have their sk_data_ready
-	 * and sk_write_space callbacks overridden.
-	 */
-	if (sock_map_redirect_allowed(sk))
-		ret = sock_map_link(map, sk);
-	else
-		ret = sock_map_link_no_progs(map, sk);
+	ret = sock_map_link(map, sk);
 	if (ret < 0)
 		goto out_free;
 
@@ -1000,14 +982,7 @@ static int sock_hash_update_common(struct bpf_map *map, void *key,
 	if (!link)
 		return -ENOMEM;
 
-	/* Only sockets we can redirect into/from in BPF need to hold
-	 * refs to parser/verdict progs and have their sk_data_ready
-	 * and sk_write_space callbacks overridden.
-	 */
-	if (sock_map_redirect_allowed(sk))
-		ret = sock_map_link(map, sk);
-	else
-		ret = sock_map_link_no_progs(map, sk);
+	ret = sock_map_link(map, sk);
 	if (ret < 0)
 		goto out_free;
 

From patchwork Wed Mar 31 02:32:30 2021
X-Patchwork-Submitter: Cong Wang
X-Patchwork-Id: 12174183
X-Patchwork-Delegate: bpf@iogearbox.net
From: Cong Wang
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com,
 wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang,
 John Fastabend, Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer
Subject: [Patch bpf-next v8 09/16] sock_map: introduce BPF_SK_SKB_VERDICT
Date: Tue, 30 Mar 2021 19:32:30 -0700
Message-Id: <20210331023237.41094-10-xiyou.wangcong@gmail.com>
In-Reply-To: <20210331023237.41094-1-xiyou.wangcong@gmail.com>
References: <20210331023237.41094-1-xiyou.wangcong@gmail.com>

Reusing BPF_SK_SKB_STREAM_VERDICT is possible but its name is
confusing and more importantly we still want to distinguish them
from user-space. So we can just reuse the stream verdict code but
introduce a new type of eBPF program, skb_verdict. Users are not
allowed to attach stream_verdict and skb_verdict programs to the
same map.
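
The exclusivity rule can be modelled in user space roughly as follows. This is only a sketch patterned on the sock_map_prog_update() hunk of this patch; mock_prog, mock_progs, mock_attach_type and mock_prog_update() are hypothetical stand-ins, and the real function also handles the other attach types and swaps programs with psock_replace_prog() or psock_set_prog().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types, reduced for illustration. */
struct mock_prog { int id; };

struct mock_progs {
	struct mock_prog *stream_verdict;
	struct mock_prog *skb_verdict;
};

enum mock_attach_type {
	MOCK_STREAM_VERDICT,
	MOCK_SKB_VERDICT,
};

/*
 * Store prog in the slot selected by which, refusing with -EBUSY when the
 * other verdict slot of the same map is already populated.
 */
static int mock_prog_update(struct mock_progs *progs, struct mock_prog *prog,
			    enum mock_attach_type which)
{
	struct mock_prog **pprog;

	assert(progs != NULL);

	switch (which) {
	case MOCK_STREAM_VERDICT:
		if (progs->skb_verdict)
			return -EBUSY;
		pprog = &progs->stream_verdict;
		break;
	case MOCK_SKB_VERDICT:
		if (progs->stream_verdict)
			return -EBUSY;
		pprog = &progs->skb_verdict;
		break;
	default:
		return -EOPNOTSUPP;
	}

	*pprog = prog;
	return 0;
}
```

In this model, attaching a stream_verdict program and then an skb_verdict program to the same mock_progs fails the second call with -EBUSY, and clearing the first slot (passing NULL) makes the second attach succeed.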

Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Signed-off-by: Cong Wang
Acked-by: John Fastabend
---
 include/linux/skmsg.h          |  2 ++
 include/uapi/linux/bpf.h       |  1 +
 kernel/bpf/syscall.c           |  1 +
 net/core/skmsg.c               |  4 +++-
 net/core/sock_map.c            | 28 ++++++++++++++++++++++++++++
 tools/bpf/bpftool/common.c     |  1 +
 tools/bpf/bpftool/prog.c       |  1 +
 tools/include/uapi/linux/bpf.h |  1 +
 8 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index e7aba150539d..c83dbc2d81d9 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -58,6 +58,7 @@ struct sk_psock_progs {
 	struct bpf_prog			*msg_parser;
 	struct bpf_prog			*stream_parser;
 	struct bpf_prog			*stream_verdict;
+	struct bpf_prog			*skb_verdict;
 };
 
 enum sk_psock_state_bits {
@@ -487,6 +488,7 @@ static inline void psock_progs_drop(struct sk_psock_progs *progs)
 	psock_set_prog(&progs->msg_parser, NULL);
 	psock_set_prog(&progs->stream_parser, NULL);
 	psock_set_prog(&progs->stream_verdict, NULL);
+	psock_set_prog(&progs->skb_verdict, NULL);
 }
 
 int sk_psock_tls_strp_read(struct sk_psock *psock, struct sk_buff *skb);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 598716742593..49371eba98ba 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -957,6 +957,7 @@ enum bpf_attach_type {
 	BPF_XDP_CPUMAP,
 	BPF_SK_LOOKUP,
 	BPF_XDP,
+	BPF_SK_SKB_VERDICT,
 	__MAX_BPF_ATTACH_TYPE
 };
 
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 9603de81811a..6428634da57e 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2948,6 +2948,7 @@ attach_type_to_prog_type(enum bpf_attach_type attach_type)
 		return BPF_PROG_TYPE_SK_MSG;
 	case BPF_SK_SKB_STREAM_PARSER:
 	case BPF_SK_SKB_STREAM_VERDICT:
+	case BPF_SK_SKB_VERDICT:
 		return BPF_PROG_TYPE_SK_SKB;
 	case BPF_LIRC_MODE2:
 		return BPF_PROG_TYPE_LIRC_MODE2;
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 656eceab73bc..a045812d7c78 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -697,7 +697,7 @@ void sk_psock_drop(struct sock *sk, struct sk_psock *psock)
 	rcu_assign_sk_user_data(sk, NULL);
 	if (psock->progs.stream_parser)
 		sk_psock_stop_strp(sk, psock);
-	else if (psock->progs.stream_verdict)
+	else if (psock->progs.stream_verdict || psock->progs.skb_verdict)
 		sk_psock_stop_verdict(sk, psock);
 	write_unlock_bh(&sk->sk_callback_lock);
 
@@ -1024,6 +1024,8 @@ static int sk_psock_verdict_recv(read_descriptor_t *desc, struct sk_buff *skb,
 	}
 	skb_set_owner_r(skb, sk);
 	prog = READ_ONCE(psock->progs.stream_verdict);
+	if (!prog)
+		prog = READ_ONCE(psock->progs.skb_verdict);
 	if (likely(prog)) {
 		skb_dst_drop(skb);
 		skb_bpf_redirect_clear(skb);
diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 42d797291d34..c2a0411e08a8 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -156,6 +156,8 @@ static void sock_map_del_link(struct sock *sk,
 				strp_stop = true;
 			if (psock->saved_data_ready && stab->progs.stream_verdict)
 				verdict_stop = true;
+			if (psock->saved_data_ready && stab->progs.skb_verdict)
+				verdict_stop = true;
 			list_del(&link->list);
 			sk_psock_free_link(link);
 		}
@@ -232,6 +234,7 @@ static int sock_map_link(struct bpf_map *map, struct sock *sk)
 	struct sk_psock_progs *progs = sock_map_progs(map);
 	struct bpf_prog *stream_verdict = NULL;
 	struct bpf_prog *stream_parser = NULL;
+	struct bpf_prog *skb_verdict = NULL;
 	struct bpf_prog *msg_parser = NULL;
 	struct sk_psock *psock;
 	int ret;
@@ -268,6 +271,15 @@ static int sock_map_link(struct bpf_map *map, struct sock *sk)
 		}
 	}
 
+	skb_verdict = READ_ONCE(progs->skb_verdict);
+	if (skb_verdict) {
+		skb_verdict = bpf_prog_inc_not_zero(skb_verdict);
+		if (IS_ERR(skb_verdict)) {
+			ret = PTR_ERR(skb_verdict);
+			goto out_put_msg_parser;
+		}
+	}
+
 no_progs:
 	psock = sock_map_psock_get_checked(sk);
 	if (IS_ERR(psock)) {
@@ -278,6 +290,9 @@ static int sock_map_link(struct bpf_map *map, struct sock *sk)
 	if (psock) {
 		if ((msg_parser && READ_ONCE(psock->progs.msg_parser)) ||
 		    (stream_parser && READ_ONCE(psock->progs.stream_parser)) ||
+		    (skb_verdict && READ_ONCE(psock->progs.skb_verdict)) ||
+		    (skb_verdict && READ_ONCE(psock->progs.stream_verdict)) ||
+		    (stream_verdict && READ_ONCE(psock->progs.skb_verdict)) ||
 		    (stream_verdict && READ_ONCE(psock->progs.stream_verdict))) {
 			sk_psock_put(sk, psock);
 			ret = -EBUSY;
@@ -309,6 +324,9 @@ static int sock_map_link(struct bpf_map *map, struct sock *sk)
 	} else if (!stream_parser && stream_verdict && !psock->saved_data_ready) {
 		psock_set_prog(&psock->progs.stream_verdict, stream_verdict);
 		sk_psock_start_verdict(sk,psock);
+	} else if (!stream_verdict && skb_verdict && !psock->saved_data_ready) {
+		psock_set_prog(&psock->progs.skb_verdict, skb_verdict);
+		sk_psock_start_verdict(sk, psock);
 	}
 	write_unlock_bh(&sk->sk_callback_lock);
 	return 0;
@@ -317,6 +335,9 @@ static int sock_map_link(struct bpf_map *map, struct sock *sk)
 out_drop:
 	sk_psock_put(sk, psock);
 out_progs:
+	if (skb_verdict)
+		bpf_prog_put(skb_verdict);
+out_put_msg_parser:
 	if (msg_parser)
 		bpf_prog_put(msg_parser);
 out_put_stream_parser:
@@ -1442,8 +1463,15 @@ static int sock_map_prog_update(struct bpf_map *map, struct bpf_prog *prog,
 		break;
 #endif
 	case BPF_SK_SKB_STREAM_VERDICT:
+		if (progs->skb_verdict)
+			return -EBUSY;
 		pprog = &progs->stream_verdict;
 		break;
+	case BPF_SK_SKB_VERDICT:
+		if (progs->stream_verdict)
+			return -EBUSY;
+		pprog = &progs->skb_verdict;
+		break;
 	default:
 		return -EOPNOTSUPP;
 	}
diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c
index 65303664417e..1828bba19020 100644
--- a/tools/bpf/bpftool/common.c
+++ b/tools/bpf/bpftool/common.c
@@ -57,6 +57,7 @@ const char * const attach_type_name[__MAX_BPF_ATTACH_TYPE] = {
 
 	[BPF_SK_SKB_STREAM_PARSER]	= "sk_skb_stream_parser",
 	[BPF_SK_SKB_STREAM_VERDICT]	= "sk_skb_stream_verdict",
+	[BPF_SK_SKB_VERDICT]		= "sk_skb_verdict",
 	[BPF_SK_MSG_VERDICT]		= "sk_msg_verdict",
 	[BPF_LIRC_MODE2]		= "lirc_mode2",
 	[BPF_FLOW_DISSECTOR]		= "flow_dissector",
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index f2b915b20546..3f067d2d7584 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -76,6 +76,7 @@ enum dump_mode {
 static const char * const attach_type_strings[] = {
 	[BPF_SK_SKB_STREAM_PARSER] = "stream_parser",
 	[BPF_SK_SKB_STREAM_VERDICT] = "stream_verdict",
+	[BPF_SK_SKB_VERDICT] = "skb_verdict",
 	[BPF_SK_MSG_VERDICT] = "msg_verdict",
 	[BPF_FLOW_DISSECTOR] = "flow_dissector",
 	[__MAX_BPF_ATTACH_TYPE] = NULL,
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index ab9f2233607c..69902603012c 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -957,6 +957,7 @@ enum bpf_attach_type {
 	BPF_XDP_CPUMAP,
 	BPF_SK_LOOKUP,
 	BPF_XDP,
+	BPF_SK_SKB_VERDICT,
 	__MAX_BPF_ATTACH_TYPE
 };
 

From patchwork Wed Mar 31 02:32:31 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Cong Wang
X-Patchwork-Id: 12174185
X-Patchwork-Delegate: bpf@iogearbox.net
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C85A2C433F4 for ; Wed, 31 Mar 2021 02:33:45 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B25776187E for ; Wed, 31 Mar 2021 02:33:45 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233439AbhCaCdR (ORCPT ); Tue, 30 Mar 2021 22:33:17 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60488 "EHLO
lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233083AbhCaCdA (ORCPT ); Tue, 30 Mar 2021 22:33:00 -0400 Received: from mail-ot1-x331.google.com (mail-ot1-x331.google.com [IPv6:2607:f8b0:4864:20::331]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 08DD9C061574; Tue, 30 Mar 2021 19:33:00 -0700 (PDT) Received: by mail-ot1-x331.google.com with SMTP id 31-20020a9d00220000b02901b64b9b50b1so17551852ota.9; Tue, 30 Mar 2021 19:33:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=3QAaU2+XlxVxCw9q8naQHreSc9uRGeExXmcUh51X8FE=; b=Cw8rM3wQNYO0QgiOKwC4OVp3pLLTnIwQDG6pP63j28trPh+Ivqep1W1J6B1bak3Oo7 Kzc91z9xj55vMt70RK8GS0KfI9oTTOeYG2FUbn6Ny37hvmcdmHaBUkub/dq5UmpMdHCK 48drL8AxIda29vnHdgombyVhQ7C5Vu7ZFjGIScWsZhj+dwEtWzr51SJYdVByQmn4HRVv lhQ31j/Cu9vBVrgRoKSFfg23k6qBwQGKnccX0a/zYHQ8U4vyJXPu9cMItz2dpxdaMn1l oEuS48BF7zqzyWo/WzNySTXNZVbYOdWozT/PAD5XfJ/BwFcMhmUxgf/JmhIe4596D+JR QpLA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=3QAaU2+XlxVxCw9q8naQHreSc9uRGeExXmcUh51X8FE=; b=rKWpf9c81AmQK4ynGbrbwTJx4EhbxLQbwEFDPNfsEI0KjzP8ETzArZ9WiRjvoN/ACB o/+9rNKTp5Ucx3TF2n09o7y1awhBzMc+vq3k1WiobEPeJyKGE4WF90snkR9gMwuT5zTt r9PpMtAh/xIpm48o1GGrxPdsoS4ttuWtneAVDNTQ1n4cC3ZWf5PiesHjadzxsfYApYbe sq9nUZkd+mLOdthlW/JOd98zQz1bN5rRhzd7sGlCSguh4ZMcHAtzGSSEgW9KtGj0Hp67 FCFI/9uQ/2TgcRO0wECHKh6a/xotgxqjni8Tbl7fmlEVXmhRJow4R3vzFOZCQoG/Pssh r61w== X-Gm-Message-State: AOAM530iB2ColBno0EkOeHwnEtO1Ut9ndkrje4TRJI+urS/+hBNfTsKU 2h+wyjq6jx2V+OCdhmcfTXd9fM5maeqZMw== X-Google-Smtp-Source: ABdhPJyY/dPG0v5hwiLhbA4iO/U8BriLIDLcenZeCCd8HGRf1xhFbPPhXKLq8sdWWtAyadaQ7Xi72g== X-Received: by 2002:a9d:4049:: with SMTP id o9mr797853oti.58.1617157979237; Tue, 30 Mar 2021 19:32:59 
-0700 (PDT)
Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:a099:767b:2b62:48df]) by smtp.gmail.com with ESMTPSA id 7sm188125ois.20.2021.03.30.19.32.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Mar 2021 19:32:58 -0700 (PDT)
From: Cong Wang
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang, John Fastabend, Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer
Subject: [Patch bpf-next v8 10/16] sock: introduce sk->sk_prot->psock_update_sk_prot()
Date: Tue, 30 Mar 2021 19:32:31 -0700
Message-Id: <20210331023237.41094-11-xiyou.wangcong@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210331023237.41094-1-xiyou.wangcong@gmail.com>
References: <20210331023237.41094-1-xiyou.wangcong@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: bpf@vger.kernel.org
X-Patchwork-Delegate: bpf@iogearbox.net

From: Cong Wang

Currently sockmap calls into each protocol to update the struct proto
and replace it. This certainly won't work when the protocol is
implemented as a module, for example, AF_UNIX.

Introduce a new ops sk->sk_prot->psock_update_sk_prot(), so each
protocol can implement its own way to replace the struct proto. This
also helps get rid of symbol dependencies on CONFIG_INET.

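The callback idea described in this commit message can be modelled in plain user-space C. What follows is only a sketch under stated assumptions, not kernel code: `struct toy_proto`, `struct toy_sock`, `toy_update_prot()`, `toy_map_init_proto()` and `toy_map_restore_proto()` are hypothetical names invented for illustration. The only things taken from the patch are the shape of the hook (`int (*)(sk, bool restore)`), the fact that generic code returns -EINVAL when a protocol does not provide it, and the fact that the hook is remembered so it can later be called with `restore == true`.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_sock;

/* Hypothetical stand-in for struct proto: a protocol may provide the hook. */
struct toy_proto {
	const char *name;
	int (*update_prot)(struct toy_sock *sk, bool restore);
};

/* Hypothetical stand-in for struct sock plus the state a psock would save. */
struct toy_sock {
	const struct toy_proto *prot;       /* currently installed ops */
	const struct toy_proto *saved_prot; /* original ops, kept for restore */
	int (*saved_update)(struct toy_sock *sk, bool restore);
};

static int toy_update_prot(struct toy_sock *sk, bool restore);

static const struct toy_proto toy_base = { .name = "base", .update_prot = toy_update_prot };
static const struct toy_proto toy_bpf = { .name = "bpf", .update_prot = toy_update_prot };
static const struct toy_proto toy_unsupported = { .name = "unsupported", .update_prot = NULL };

/* Protocol-specific hook: swap in the BPF-aware ops, or put the old ones back. */
static int toy_update_prot(struct toy_sock *sk, bool restore)
{
	if (restore) {
		sk->prot = sk->saved_prot;
		return 0;
	}
	sk->saved_prot = sk->prot;
	sk->prot = &toy_bpf;
	return 0;
}

/* Generic code: no per-protocol switch, just call the hook if it exists. */
static int toy_map_init_proto(struct toy_sock *sk)
{
	if (!sk->prot->update_prot)
		return -EINVAL;
	sk->saved_update = sk->prot->update_prot;
	return sk->prot->update_prot(sk, false);
}

/* Generic restore path: use the hook remembered at init time. */
static int toy_map_restore_proto(struct toy_sock *sk)
{
	if (!sk->saved_update)
		return -EINVAL;
	return sk->saved_update(sk, true);
}
```

The point of the indirection is that the generic code never names a protocol: a protocol built as a module only has to fill in the function pointer in its own ops table.
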
Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Signed-off-by: Cong Wang
---
 include/linux/skmsg.h | 18 +++---------------
 include/net/sock.h    |  3 +++
 include/net/tcp.h     |  1 +
 include/net/udp.h     |  1 +
 net/core/skmsg.c      |  5 -----
 net/core/sock_map.c   | 24 ++++--------------------
 net/ipv4/tcp_bpf.c    | 24 +++++++++++++++++++++---
 net/ipv4/tcp_ipv4.c   |  3 +++
 net/ipv4/udp.c        |  3 +++
 net/ipv4/udp_bpf.c    | 15 +++++++++++++--
 net/ipv6/tcp_ipv6.c   |  3 +++
 net/ipv6/udp.c        |  3 +++
 12 files changed, 58 insertions(+), 45 deletions(-)

diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index c83dbc2d81d9..5e800ddc2dc6 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -99,6 +99,7 @@ struct sk_psock {
 	void (*saved_close)(struct sock *sk, long timeout);
 	void (*saved_write_space)(struct sock *sk);
 	void (*saved_data_ready)(struct sock *sk);
+	int (*psock_update_sk_prot)(struct sock *sk, bool restore);
 	struct proto *sk_proto;
 	struct mutex work_mutex;
 	struct sk_psock_work_state work_state;
@@ -395,25 +396,12 @@ static inline void sk_psock_cork_free(struct sk_psock *psock)
 	}
 }

-static inline void sk_psock_update_proto(struct sock *sk,
-					 struct sk_psock *psock,
-					 struct proto *ops)
-{
-	/* Pairs with lockless read in sk_clone_lock() */
-	WRITE_ONCE(sk->sk_prot, ops);
-}
-
 static inline void sk_psock_restore_proto(struct sock *sk,
 					  struct sk_psock *psock)
 {
 	sk->sk_prot->unhash = psock->saved_unhash;
-	if (inet_csk_has_ulp(sk)) {
-		tcp_update_ulp(sk, psock->sk_proto, psock->saved_write_space);
-	} else {
-		sk->sk_write_space = psock->saved_write_space;
-		/* Pairs with lockless read in sk_clone_lock() */
-		WRITE_ONCE(sk->sk_prot, psock->sk_proto);
-	}
+	if (psock->psock_update_sk_prot)
+		psock->psock_update_sk_prot(sk, true);
 }

 static inline void sk_psock_set_state(struct sk_psock *psock,
diff --git a/include/net/sock.h b/include/net/sock.h
index 0b6266fd6bf6..8b4155e756c2 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1184,6 +1184,9 @@ struct proto {
 	void (*unhash)(struct sock *sk);
 	void (*rehash)(struct sock *sk);
 	int (*get_port)(struct sock *sk, unsigned short snum);
+#ifdef CONFIG_BPF_SYSCALL
+	int (*psock_update_sk_prot)(struct sock *sk, bool restore);
+#endif

 	/* Keeping track of sockets in use */
 #ifdef CONFIG_PROC_FS
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 075de26f449d..2efa4e5ea23d 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2203,6 +2203,7 @@ struct sk_psock;

 #ifdef CONFIG_BPF_SYSCALL
 struct proto *tcp_bpf_get_proto(struct sock *sk, struct sk_psock *psock);
+int tcp_bpf_update_proto(struct sock *sk, bool restore);
 void tcp_bpf_clone(const struct sock *sk, struct sock *newsk);
 #endif /* CONFIG_BPF_SYSCALL */

diff --git a/include/net/udp.h b/include/net/udp.h
index d4d064c59232..df7cc1edc200 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -518,6 +518,7 @@ static inline struct sk_buff *udp_rcv_segment(struct sock *sk,
 #ifdef CONFIG_BPF_SYSCALL
 struct sk_psock;
 struct proto *udp_bpf_get_proto(struct sock *sk, struct sk_psock *psock);
+int udp_bpf_update_proto(struct sock *sk, bool restore);
 #endif

 #endif /* _UDP_H */
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index a045812d7c78..9fc83f7cc1a0 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -562,11 +562,6 @@ struct sk_psock *sk_psock_init(struct sock *sk, int node)

 	write_lock_bh(&sk->sk_callback_lock);

-	if (inet_csk_has_ulp(sk)) {
-		psock = ERR_PTR(-EINVAL);
-		goto out;
-	}
-
 	if (sk->sk_user_data) {
 		psock = ERR_PTR(-EBUSY);
 		goto out;
diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index c2a0411e08a8..2915c7c8778b 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -185,26 +185,10 @@ static void sock_map_unref(struct sock *sk, void *link_raw)

 static int sock_map_init_proto(struct sock *sk, struct sk_psock *psock)
 {
-	struct proto *prot;
-
-	switch (sk->sk_type) {
-	case SOCK_STREAM:
-		prot = tcp_bpf_get_proto(sk, psock);
-		break;
-
-	case SOCK_DGRAM:
-		prot = udp_bpf_get_proto(sk, psock);
-		break;
-
-	default:
+	if (!sk->sk_prot->psock_update_sk_prot)
 		return -EINVAL;
-	}
-
-	if (IS_ERR(prot))
-		return PTR_ERR(prot);
-
-	sk_psock_update_proto(sk, psock, prot);
-	return 0;
+	psock->psock_update_sk_prot = sk->sk_prot->psock_update_sk_prot;
+	return sk->sk_prot->psock_update_sk_prot(sk, false);
 }

 static struct sk_psock *sock_map_psock_get_checked(struct sock *sk)
@@ -556,7 +540,7 @@ static bool sock_map_redirect_allowed(const struct sock *sk)

 static bool sock_map_sk_is_suitable(const struct sock *sk)
 {
-	return sk_is_tcp(sk) || sk_is_udp(sk);
+	return !!sk->sk_prot->psock_update_sk_prot;
 }

 static bool sock_map_sk_state_allowed(const struct sock *sk)
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index ae980716d896..ac8cfbaeacd2 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -595,20 +595,38 @@ static int tcp_bpf_assert_proto_ops(struct proto *ops)
 	       ops->sendpage == tcp_sendpage ? 0 : -ENOTSUPP;
 }

-struct proto *tcp_bpf_get_proto(struct sock *sk, struct sk_psock *psock)
+int tcp_bpf_update_proto(struct sock *sk, bool restore)
 {
+	struct sk_psock *psock = sk_psock(sk);
 	int family = sk->sk_family == AF_INET6 ? TCP_BPF_IPV6 : TCP_BPF_IPV4;
 	int config = psock->progs.msg_parser ? TCP_BPF_TX : TCP_BPF_BASE;

+	if (restore) {
+		if (inet_csk_has_ulp(sk)) {
+			tcp_update_ulp(sk, psock->sk_proto, psock->saved_write_space);
+		} else {
+			sk->sk_write_space = psock->saved_write_space;
+			/* Pairs with lockless read in sk_clone_lock() */
+			WRITE_ONCE(sk->sk_prot, psock->sk_proto);
+		}
+		return 0;
+	}
+
+	if (inet_csk_has_ulp(sk))
+		return -EINVAL;
+
 	if (sk->sk_family == AF_INET6) {
 		if (tcp_bpf_assert_proto_ops(psock->sk_proto))
-			return ERR_PTR(-EINVAL);
+			return -EINVAL;

 		tcp_bpf_check_v6_needs_rebuild(psock->sk_proto);
 	}

-	return &tcp_bpf_prots[family][config];
+	/* Pairs with lockless read in sk_clone_lock() */
+	WRITE_ONCE(sk->sk_prot, &tcp_bpf_prots[family][config]);
+	return 0;
 }
+EXPORT_SYMBOL_GPL(tcp_bpf_update_proto);

 /* If a child got cloned from a listening socket that had tcp_bpf
  * protocol callbacks installed, we need to restore the callbacks to
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index daad4f99db32..dfc6d1c0e710 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2806,6 +2806,9 @@ struct proto tcp_prot = {
 	.hash = inet_hash,
 	.unhash = inet_unhash,
 	.get_port = inet_csk_get_port,
+#ifdef CONFIG_BPF_SYSCALL
+	.psock_update_sk_prot = tcp_bpf_update_proto,
+#endif
 	.enter_memory_pressure = tcp_enter_memory_pressure,
 	.leave_memory_pressure = tcp_leave_memory_pressure,
 	.stream_memory_free = tcp_stream_memory_free,
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 4a0478b17243..38952aaee3a1 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2849,6 +2849,9 @@ struct proto udp_prot = {
 	.unhash = udp_lib_unhash,
 	.rehash = udp_v4_rehash,
 	.get_port = udp_v4_get_port,
+#ifdef CONFIG_BPF_SYSCALL
+	.psock_update_sk_prot = udp_bpf_update_proto,
+#endif
 	.memory_allocated = &udp_memory_allocated,
 	.sysctl_mem = sysctl_udp_mem,
 	.sysctl_wmem_offset = offsetof(struct net, ipv4.sysctl_udp_wmem_min),
diff --git a/net/ipv4/udp_bpf.c b/net/ipv4/udp_bpf.c
index 7a94791efc1a..6001f93cd3a0 100644
--- a/net/ipv4/udp_bpf.c
+++ b/net/ipv4/udp_bpf.c
@@ -41,12 +41,23 @@ static int __init udp_bpf_v4_build_proto(void)
 }
 core_initcall(udp_bpf_v4_build_proto);

-struct proto *udp_bpf_get_proto(struct sock *sk, struct sk_psock *psock)
+int udp_bpf_update_proto(struct sock *sk, bool restore)
 {
 	int family = sk->sk_family == AF_INET ? UDP_BPF_IPV4 : UDP_BPF_IPV6;
+	struct sk_psock *psock = sk_psock(sk);
+
+	if (restore) {
+		sk->sk_write_space = psock->saved_write_space;
+		/* Pairs with lockless read in sk_clone_lock() */
+		WRITE_ONCE(sk->sk_prot, psock->sk_proto);
+		return 0;
+	}

 	if (sk->sk_family == AF_INET6)
 		udp_bpf_check_v6_needs_rebuild(psock->sk_proto);

-	return &udp_bpf_prots[family];
+	/* Pairs with lockless read in sk_clone_lock() */
+	WRITE_ONCE(sk->sk_prot, &udp_bpf_prots[family]);
+	return 0;
 }
+EXPORT_SYMBOL_GPL(udp_bpf_update_proto);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index d0f007741e8e..bff22d6ef516 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -2139,6 +2139,9 @@ struct proto tcpv6_prot = {
 	.hash = inet6_hash,
 	.unhash = inet_unhash,
 	.get_port = inet_csk_get_port,
+#ifdef CONFIG_BPF_SYSCALL
+	.psock_update_sk_prot = tcp_bpf_update_proto,
+#endif
 	.enter_memory_pressure = tcp_enter_memory_pressure,
 	.leave_memory_pressure = tcp_leave_memory_pressure,
 	.stream_memory_free = tcp_stream_memory_free,
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index d25e5a9252fd..ef2c75bb4771 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1713,6 +1713,9 @@ struct proto udpv6_prot = {
 	.unhash = udp_lib_unhash,
 	.rehash = udp_v6_rehash,
 	.get_port = udp_v6_get_port,
+#ifdef CONFIG_BPF_SYSCALL
+	.psock_update_sk_prot = udp_bpf_update_proto,
+#endif
 	.memory_allocated = &udp_memory_allocated,
 	.sysctl_mem = sysctl_udp_mem,
 	.sysctl_wmem_offset = offsetof(struct net, ipv4.sysctl_udp_wmem_min),

From patchwork Wed Mar 31 02:32:32 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Cong Wang
X-Patchwork-Id:
12174179 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B83BCC433C1 for ; Wed, 31 Mar 2021 02:33:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8EEFF619D7 for ; Wed, 31 Mar 2021 02:33:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233403AbhCaCdS (ORCPT ); Tue, 30 Mar 2021 22:33:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60496 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233413AbhCaCdB (ORCPT ); Tue, 30 Mar 2021 22:33:01 -0400 Received: from mail-oi1-x22c.google.com (mail-oi1-x22c.google.com [IPv6:2607:f8b0:4864:20::22c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 30984C061574; Tue, 30 Mar 2021 19:33:01 -0700 (PDT) Received: by mail-oi1-x22c.google.com with SMTP id x2so18612004oiv.2; Tue, 30 Mar 2021 19:33:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=f7W2njqz7r7ROv+AQD2dgc8rqwZ6i4LcPiSpnknm8TY=; b=rSrg2uf61pnTv5m8hOO5O2Oyt5+1TZltn6OIWVG3zBlm+nqHC5Mv/wqxGoYeTfIdFj dLGJxpgLX1scN6YAoN8qnVQJvqhXARb1UDGf8kLXCNpuED7+GUUN41JVsr8wcRNqzoTx Q55x+LcYLHcjsFfHUWj/2JM36y0+7yoaOxbcZLrW4QFoGxZKbXp9hZtAkWNvoPZtaoh0 XDgqR0hvqfJRB+p8f99LhnWmbiQhQcgaQYTLgS74x4StMjXAMS6ou4XOlwsESCoPAHJP 
+abccmi5ufb/58hLKhQWShT/qa468H9orKgVAhrHmuan52cD1257ChrWI+up7SIBaPbH tZOw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=f7W2njqz7r7ROv+AQD2dgc8rqwZ6i4LcPiSpnknm8TY=; b=cgJW4ZcdR3CdyeXaRJANRfF5S8vYjjDukv/JanW1LfRs4ZQCZJTDCBaimOBSPw+ALy pXecThSS0rwMSTb/ep6Ic4I0/uftbUcuqeq1UoZG6NkshN1lGg0NYit+PB0syDp12IyL pNv3OYdno/YqVz/j1rkDVOF9xNJsLAjuWBnUSGPCs8JtTMVZD7SUswruS4EnTICt7afs 0gd6EnCeRF07bP4BBRqlHldI4ydPZQ58N83OeTkrSgc6Mv9DlDbffnQtyC60A1spSyj1 meNXDoifYWdgWgcBTjpEvb4haoE9H0TUUCd368Mpo/gLtqlYLmcjtpaptFEDWwNDC09X Xhag==
X-Gm-Message-State: AOAM5319pQC+8G1IaLDWrYW5C/bAd0DtgVSUmIfpCTan6eHS3iaNROqi Nyg187ecLMtq3MQMltnjSPnZsRQWnNz8ug==
X-Google-Smtp-Source: ABdhPJwa9fnTqiehNOlKoalrvl8FDYPo+ZGEBUOIcqt33hsmtd12TN/EsrA25ivBPXbyNpBYKlWpBw==
X-Received: by 2002:aca:741:: with SMTP id 62mr669201oih.104.1617157980493; Tue, 30 Mar 2021 19:33:00 -0700 (PDT)
Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:a099:767b:2b62:48df]) by smtp.gmail.com with ESMTPSA id 7sm188125ois.20.2021.03.30.19.32.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Mar 2021 19:33:00 -0700 (PDT)
From: Cong Wang
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang, John Fastabend, Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer
Subject: [Patch bpf-next v8 11/16] udp: implement ->read_sock() for sockmap
Date: Tue, 30 Mar 2021 19:32:32 -0700
Message-Id: <20210331023237.41094-12-xiyou.wangcong@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210331023237.41094-1-xiyou.wangcong@gmail.com>
References: <20210331023237.41094-1-xiyou.wangcong@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: bpf@vger.kernel.org
X-Patchwork-Delegate: bpf@iogearbox.net

From: Cong Wang

This is similar to tcp_read_sock(), except we do not need to worry
about connections, we just need to retrieve skb from UDP receive
queue.

Note, the return value of ->read_sock() is unused in
sk_psock_verdict_data_ready(), and UDP still does not support
splice() due to lack of ->splice_read(), so users can not reach
udp_read_sock() directly.

Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Signed-off-by: Cong Wang
Acked-by: John Fastabend
---
 include/net/udp.h   |  2 ++
 net/ipv4/af_inet.c  |  1 +
 net/ipv4/udp.c      | 29 +++++++++++++++++++++++++++++
 net/ipv6/af_inet6.c |  1 +
 4 files changed, 33 insertions(+)

diff --git a/include/net/udp.h b/include/net/udp.h
index df7cc1edc200..347b62a753c3 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -329,6 +329,8 @@ struct sock *__udp6_lib_lookup(struct net *net,
 			       struct sk_buff *skb);
 struct sock *udp6_lib_lookup_skb(const struct sk_buff *skb,
 				 __be16 sport, __be16 dport);
+int udp_read_sock(struct sock *sk, read_descriptor_t *desc,
+		  sk_read_actor_t recv_actor);

 /* UDP uses skb->dev_scratch to cache as much information as possible and avoid
  * possibly multiple cache miss on dequeue()
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 1355e6c0d567..f17870ee558b 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1070,6 +1070,7 @@ const struct proto_ops inet_dgram_ops = {
 	.setsockopt = sock_common_setsockopt,
 	.getsockopt = sock_common_getsockopt,
 	.sendmsg = inet_sendmsg,
+	.read_sock = udp_read_sock,
 	.recvmsg = inet_recvmsg,
 	.mmap = sock_no_mmap,
 	.sendpage = inet_sendpage,
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 38952aaee3a1..4d02f6839e38 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1782,6 +1782,35 @@ struct sk_buff *__skb_recv_udp(struct sock *sk, unsigned int flags,
 }
 EXPORT_SYMBOL(__skb_recv_udp);

+int udp_read_sock(struct sock *sk, read_descriptor_t *desc,
+		  sk_read_actor_t recv_actor)
+{
+	int copied = 0;
+
+	while (1) {
+		struct sk_buff *skb;
+		int err, used;
+
+		skb = skb_recv_udp(sk, 0, 1, &err);
+		if (!skb)
+			return err;
+		used = recv_actor(desc, skb, 0, skb->len);
+		if (used <= 0) {
+			if (!copied)
+				copied = used;
+			break;
+		} else if (used <= skb->len) {
+			copied += used;
+		}
+
+		if (!desc->count)
+			break;
+	}
+
+	return copied;
+}
+EXPORT_SYMBOL(udp_read_sock);
+
 /*
  * This should be easy, if there is something there we
  * return it, otherwise we block.
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 802f5111805a..71de739b4a9e 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -714,6 +714,7 @@ const struct proto_ops inet6_dgram_ops = {
 	.getsockopt = sock_common_getsockopt, /* ok */
 	.sendmsg = inet6_sendmsg, /* retpoline's sake */
 	.recvmsg = inet6_recvmsg, /* retpoline's sake */
+	.read_sock = udp_read_sock,
 	.mmap = sock_no_mmap,
 	.sendpage = sock_no_sendpage,
 	.set_peek_off = sk_set_peek_off,

From patchwork Wed Mar 31 02:32:33 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Cong Wang
X-Patchwork-Id: 12174173
X-Patchwork-Delegate: bpf@iogearbox.net
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A3BC8C433ED for ; Wed, 31 Mar 2021 02:33:45 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7A90D6187E for ; Wed, 31 Mar 2021 02:33:45 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233448AbhCaCdS (ORCPT ); Tue, 30 Mar 2021 22:33:18 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60498 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233414AbhCaCdC (ORCPT ); Tue, 30 Mar 2021 22:33:02 -0400 Received: from mail-ot1-x334.google.com (mail-ot1-x334.google.com [IPv6:2607:f8b0:4864:20::334]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 72FACC061574; Tue, 30 Mar 2021 19:33:02 -0700 (PDT) Received: by mail-ot1-x334.google.com with SMTP id 31-20020a9d00220000b02901b64b9b50b1so17551925ota.9; Tue, 30 Mar 2021 19:33:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=KLA6Vftyh0UcPve9MLTcr/Qlwo1Hmm6TlBTby3/AhU0=; b=nncVWsyALY/qAVVnTVLmIDmmWq7d9N37BKXAEw9cf9jMFEUH//A8tpQgVgtPzLEMRv 8F8kOBTvRxz4GKTvExU0TyY4ph1qLT1ViH66h3laVThVoOk3HMtV8TsqSw8Do1gcuBqu 4KqBF4S834QFA7Zbbthh3lZW0i0csUj17dTG44u2c51Zgs6hs6Yq4yghlFFUntAkhGjX BysxC2fGue+WHDgeXEh8X5Uz7HYTrNUq6IhALyshiKV21Kn08ChBCtUGTkJUAWsQsSi7 GxamNFsrFpMWThU9izyAYqLACZqDW5YPBQ+9xymaq12Bw/LeY8pZxLmbAe54SV9UhQOB UvYQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=KLA6Vftyh0UcPve9MLTcr/Qlwo1Hmm6TlBTby3/AhU0=; b=B003eKpD1lFMbeZvEXrO/KGziqkjR0fmdU4Go8cvJfZcTqc//xEJ70bztXtvr4HYz7 kqxKmOF2pivHRPzejYSDfSO9X047IadMaIb3g7g0v/nyrD20Yf5TDeAqGmHHMvlp7ctp 3bCJksi8LRXWh5pUgt4Qrscyh8EUQUssuAO54NFoTAjiKuDd6hk523gHirLzDvngyv+s eBUDMKk4TPG+GyM+1WxV8Uexe2n3xszCYLgmKg6J4a3mh2Th7yZOqrSX5HC6f8mV2bXJ LVDgNWLcHL3BU97YgJS/N24UEYrUPwuosp/kDpOgHqM/TA6xnSR/8g65Hto8XtiFNdqP OxNA== X-Gm-Message-State: AOAM531R0WE1YvN9OtymjgLsJ7yroGOllPQYGrqP/NwnRLVa6cHQULR8 Ua2jLB2oF/lNNcUZmpjDYbLuupuRxlEcCA== X-Google-Smtp-Source: ABdhPJzYgZruCowa/vXbDQ34akW/QikuayDx6Kx52AUkE8sQTVZ9k10RgbjOeMTLMkIzfN0DquOr7A== X-Received: by 2002:a05:6830:1b7a:: 
with SMTP id d26mr780789ote.324.1617157981724; Tue, 30 Mar 2021 19:33:01 -0700 (PDT)
Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:a099:767b:2b62:48df]) by smtp.gmail.com with ESMTPSA id 7sm188125ois.20.2021.03.30.19.33.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Mar 2021 19:33:01 -0700 (PDT)
From: Cong Wang
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang, John Fastabend, Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer
Subject: [Patch bpf-next v8 12/16] skmsg: extract __tcp_bpf_recvmsg() and tcp_bpf_wait_data()
Date: Tue, 30 Mar 2021 19:32:33 -0700
Message-Id: <20210331023237.41094-13-xiyou.wangcong@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210331023237.41094-1-xiyou.wangcong@gmail.com>
References: <20210331023237.41094-1-xiyou.wangcong@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: bpf@vger.kernel.org
X-Patchwork-Delegate: bpf@iogearbox.net

From: Cong Wang

Although these two functions are only used by TCP, they are not
specific to TCP at all: both operate on skmsg and ingress_msg, so
they fit in net/core/skmsg.c very well.

We will also need them for non-TCP protocols, so rename them, move
them to skmsg.c, and export them to modules.

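To illustrate why the receive helper is protocol-agnostic, here is a small user-space model of the peek-versus-consume copy loop that the moved function performs over its queue of scatterlist elements. It is a sketch only, under stated assumptions: `struct toy_seg`, `struct toy_queue` and `toy_recv()` are hypothetical names, the queue is a fixed array rather than a list of sk_msg, and memory accounting, page references and error handling are left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_SEGS 8

/* Hypothetical stand-in for one scatterlist element. */
struct toy_seg {
	const char *data;
	size_t offset; /* first unread byte */
	size_t length; /* unread bytes starting at offset */
};

/* Hypothetical stand-in for the ingress queue. */
struct toy_queue {
	struct toy_seg segs[TOY_MAX_SEGS];
	size_t start; /* index of the first segment still holding data */
	size_t end;   /* one past the last valid segment */
};

/*
 * Copy up to len bytes into buf. With peek set the queue is left
 * untouched; otherwise copied bytes are consumed and fully drained
 * segments are skipped on the next call.
 */
static size_t toy_recv(struct toy_queue *q, char *buf, size_t len, bool peek)
{
	size_t copied = 0;
	size_t i = q->start;

	while (copied < len && i != q->end) {
		struct toy_seg *seg = &q->segs[i];
		size_t copy = seg->length;

		if (copy > len - copied)
			copy = len - copied;
		memcpy(buf + copied, seg->data + seg->offset, copy);
		copied += copy;

		if (!peek) {
			seg->offset += copy;
			seg->length -= copy;
			if (!seg->length)
				i++; /* segment drained, move on */
		} else {
			i++; /* look at the next segment, change nothing */
		}
	}

	if (!peek)
		q->start = i;
	return copied;
}
```

Nothing in the loop depends on TCP state; it only walks buffered segments, which is the property the commit message relies on when moving the helpers into common code.
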
Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang Acked-by: John Fastabend --- include/linux/skmsg.h | 4 ++ include/net/tcp.h | 2 - net/core/skmsg.c | 98 +++++++++++++++++++++++++++++++++++++++++ net/ipv4/tcp_bpf.c | 100 +----------------------------------------- net/tls/tls_sw.c | 4 +- 5 files changed, 106 insertions(+), 102 deletions(-) diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h index 5e800ddc2dc6..f78e90a04a69 100644 --- a/include/linux/skmsg.h +++ b/include/linux/skmsg.h @@ -125,6 +125,10 @@ int sk_msg_zerocopy_from_iter(struct sock *sk, struct iov_iter *from, struct sk_msg *msg, u32 bytes); int sk_msg_memcopy_from_iter(struct sock *sk, struct iov_iter *from, struct sk_msg *msg, u32 bytes); +int sk_msg_wait_data(struct sock *sk, struct sk_psock *psock, int flags, + long timeo, int *err); +int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg, + int len, int flags); static inline void sk_msg_check_to_free(struct sk_msg *msg, u32 i, u32 bytes) { diff --git a/include/net/tcp.h b/include/net/tcp.h index 2efa4e5ea23d..31b1696c62ba 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -2209,8 +2209,6 @@ void tcp_bpf_clone(const struct sock *sk, struct sock *newsk); int tcp_bpf_sendmsg_redir(struct sock *sk, struct sk_msg *msg, u32 bytes, int flags); -int __tcp_bpf_recvmsg(struct sock *sk, struct sk_psock *psock, - struct msghdr *msg, int len, int flags); #endif /* CONFIG_NET_SOCK_MSG */ #if !defined(CONFIG_BPF_SYSCALL) || !defined(CONFIG_NET_SOCK_MSG) diff --git a/net/core/skmsg.c b/net/core/skmsg.c index 9fc83f7cc1a0..92a83c02562a 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -399,6 +399,104 @@ int sk_msg_memcopy_from_iter(struct sock *sk, struct iov_iter *from, } EXPORT_SYMBOL_GPL(sk_msg_memcopy_from_iter); +int sk_msg_wait_data(struct sock *sk, struct sk_psock *psock, int flags, + long timeo, int *err) +{ + DEFINE_WAIT_FUNC(wait, woken_wake_function); + int ret 
= 0; + + if (sk->sk_shutdown & RCV_SHUTDOWN) + return 1; + + if (!timeo) + return ret; + + add_wait_queue(sk_sleep(sk), &wait); + sk_set_bit(SOCKWQ_ASYNC_WAITDATA, sk); + ret = sk_wait_event(sk, &timeo, + !list_empty(&psock->ingress_msg) || + !skb_queue_empty(&sk->sk_receive_queue), &wait); + sk_clear_bit(SOCKWQ_ASYNC_WAITDATA, sk); + remove_wait_queue(sk_sleep(sk), &wait); + return ret; +} +EXPORT_SYMBOL_GPL(sk_msg_wait_data); + +/* Receive sk_msg from psock->ingress_msg to @msg. */ +int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg, + int len, int flags) +{ + struct iov_iter *iter = &msg->msg_iter; + int peek = flags & MSG_PEEK; + struct sk_msg *msg_rx; + int i, copied = 0; + + msg_rx = sk_psock_peek_msg(psock); + while (copied != len) { + struct scatterlist *sge; + + if (unlikely(!msg_rx)) + break; + + i = msg_rx->sg.start; + do { + struct page *page; + int copy; + + sge = sk_msg_elem(msg_rx, i); + copy = sge->length; + page = sg_page(sge); + if (copied + copy > len) + copy = len - copied; + copy = copy_page_to_iter(page, sge->offset, copy, iter); + if (!copy) + return copied ? copied : -EFAULT; + + copied += copy; + if (likely(!peek)) { + sge->offset += copy; + sge->length -= copy; + if (!msg_rx->skb) + sk_mem_uncharge(sk, copy); + msg_rx->sg.size -= copy; + + if (!sge->length) { + sk_msg_iter_var_next(i); + if (!msg_rx->skb) + put_page(page); + } + } else { + /* Lets not optimize peek case if copy_page_to_iter + * didn't copy the entire length lets just break. 
+ */ + if (copy != sge->length) + return copied; + sk_msg_iter_var_next(i); + } + + if (copied == len) + break; + } while (i != msg_rx->sg.end); + + if (unlikely(peek)) { + msg_rx = sk_psock_next_msg(psock, msg_rx); + if (!msg_rx) + break; + continue; + } + + msg_rx->sg.start = i; + if (!sge->length && msg_rx->sg.start == msg_rx->sg.end) { + msg_rx = sk_psock_dequeue_msg(psock); + kfree_sk_msg(msg_rx); + } + msg_rx = sk_psock_peek_msg(psock); + } + + return copied; +} +EXPORT_SYMBOL_GPL(sk_msg_recvmsg); + static struct sk_msg *sk_psock_create_ingress_msg(struct sock *sk, struct sk_buff *skb) { diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index ac8cfbaeacd2..3d622a0d0753 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -10,80 +10,6 @@ #include #include -int __tcp_bpf_recvmsg(struct sock *sk, struct sk_psock *psock, - struct msghdr *msg, int len, int flags) -{ - struct iov_iter *iter = &msg->msg_iter; - int peek = flags & MSG_PEEK; - struct sk_msg *msg_rx; - int i, copied = 0; - - msg_rx = sk_psock_peek_msg(psock); - while (copied != len) { - struct scatterlist *sge; - - if (unlikely(!msg_rx)) - break; - - i = msg_rx->sg.start; - do { - struct page *page; - int copy; - - sge = sk_msg_elem(msg_rx, i); - copy = sge->length; - page = sg_page(sge); - if (copied + copy > len) - copy = len - copied; - copy = copy_page_to_iter(page, sge->offset, copy, iter); - if (!copy) - return copied ? copied : -EFAULT; - - copied += copy; - if (likely(!peek)) { - sge->offset += copy; - sge->length -= copy; - if (!msg_rx->skb) - sk_mem_uncharge(sk, copy); - msg_rx->sg.size -= copy; - - if (!sge->length) { - sk_msg_iter_var_next(i); - if (!msg_rx->skb) - put_page(page); - } - } else { - /* Lets not optimize peek case if copy_page_to_iter - * didn't copy the entire length lets just break. 
- */ - if (copy != sge->length) - return copied; - sk_msg_iter_var_next(i); - } - - if (copied == len) - break; - } while (i != msg_rx->sg.end); - - if (unlikely(peek)) { - msg_rx = sk_psock_next_msg(psock, msg_rx); - if (!msg_rx) - break; - continue; - } - - msg_rx->sg.start = i; - if (!sge->length && msg_rx->sg.start == msg_rx->sg.end) { - msg_rx = sk_psock_dequeue_msg(psock); - kfree_sk_msg(msg_rx); - } - msg_rx = sk_psock_peek_msg(psock); - } - - return copied; -} -EXPORT_SYMBOL_GPL(__tcp_bpf_recvmsg); - static int bpf_tcp_ingress(struct sock *sk, struct sk_psock *psock, struct sk_msg *msg, u32 apply_bytes, int flags) { @@ -237,28 +163,6 @@ static bool tcp_bpf_stream_read(const struct sock *sk) return !empty; } -static int tcp_bpf_wait_data(struct sock *sk, struct sk_psock *psock, - int flags, long timeo, int *err) -{ - DEFINE_WAIT_FUNC(wait, woken_wake_function); - int ret = 0; - - if (sk->sk_shutdown & RCV_SHUTDOWN) - return 1; - - if (!timeo) - return ret; - - add_wait_queue(sk_sleep(sk), &wait); - sk_set_bit(SOCKWQ_ASYNC_WAITDATA, sk); - ret = sk_wait_event(sk, &timeo, - !list_empty(&psock->ingress_msg) || - !skb_queue_empty(&sk->sk_receive_queue), &wait); - sk_clear_bit(SOCKWQ_ASYNC_WAITDATA, sk); - remove_wait_queue(sk_sleep(sk), &wait); - return ret; -} - static int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int nonblock, int flags, int *addr_len) { @@ -278,13 +182,13 @@ static int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, } lock_sock(sk); msg_bytes_ready: - copied = __tcp_bpf_recvmsg(sk, psock, msg, len, flags); + copied = sk_msg_recvmsg(sk, psock, msg, len, flags); if (!copied) { int data, err = 0; long timeo; timeo = sock_rcvtimeo(sk, nonblock); - data = tcp_bpf_wait_data(sk, psock, flags, timeo, &err); + data = sk_msg_wait_data(sk, psock, flags, timeo, &err); if (data) { if (!sk_psock_queue_empty(psock)) goto msg_bytes_ready; diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index 01d933ae5f16..1dcb34dfd56b 
100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -1789,8 +1789,8 @@ int tls_sw_recvmsg(struct sock *sk, skb = tls_wait_data(sk, psock, flags, timeo, &err); if (!skb) { if (psock) { - int ret = __tcp_bpf_recvmsg(sk, psock, - msg, len, flags); + int ret = sk_msg_recvmsg(sk, psock, msg, len, + flags); if (ret > 0) { decrypted += ret; From patchwork Wed Mar 31 02:32:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12174177 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0282DC433FC for ; Wed, 31 Mar 2021 02:33:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E3461619DD for ; Wed, 31 Mar 2021 02:33:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233452AbhCaCdS (ORCPT ); Tue, 30 Mar 2021 22:33:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60508 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233416AbhCaCdE (ORCPT ); Tue, 30 Mar 2021 22:33:04 -0400 Received: from mail-oo1-xc2d.google.com (mail-oo1-xc2d.google.com [IPv6:2607:f8b0:4864:20::c2d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A4EFEC061574; Tue, 30 Mar 2021 19:33:03 -0700 (PDT) Received: by mail-oo1-xc2d.google.com with SMTP id n12-20020a4ad12c0000b02901b63e7bc1b4so4271690oor.5; Tue, 30 
Mar 2021 19:33:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ktDD3wFfhTgzuwFfQ6FhKlNOYR4srlqkDAegPxHtcbc=; b=TLA1i1TB+5PExdHvEK4YV1Lw2KIv9u3NBreSQWEb0IgyeSrvVK6BUk++7rIWwmWT76 n5YjBNrq9nRYUKcpv1dByNWKJ+xO3NUpXM5Y8B7zYbzHIIkQobmiYqQD5F9OKu3mt3JN 2+Ag2qFdgmGXG2B+PakyFbla2X3KtTrTKbu76OHbtb57FLZ6OJo5GKUoZpHKVOAZw/Ng xgnqmrBnN8dhqTi+o1FzTINi8ADr4g+zX6Y2zgAsYLVfGRQt/dxcSU0Eu8mWdDszD2jV EWtmpldqKA/j98QVprgvFCqM4A5y+OyaUPI7v5nItJqeMohhM8gA/dyhaBxiZl4n3gdl Sljg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ktDD3wFfhTgzuwFfQ6FhKlNOYR4srlqkDAegPxHtcbc=; b=c4eGm1nykDeCNvMckCcy8QjvksD2rTwxcYoPlrsS7lVQAQGFH1nx8D1AGbEhCAk7Qd aEtJdeBb2LdNc9Y/MCWnZ8fvK5Gn10+62bJ3u1Mv5HU/KshZyaMFdIK8m5neLNKiE3dY 36tj4qOvlN+hEre0+KdztevaF/d9sNPCnO1QocN29epWov+JVi9zVv3El/G3796wAdxC 7eKyjwzfv2sTIF0eMd23Lb2xgOxTfZCfWSim89aLi4U1qenkK9bV8att4FXobZxohGwH v9kcDUpLaDQhOCkuBAEBA3+0sMcIg/D/4Nc4s8GrjEeByi1Xp45nNGlMOJpF0MGXKmIs P3SQ== X-Gm-Message-State: AOAM530zHubRNq9o7oMYU86Ww42omaVg4o/LornRiKqZBzXhDMyF8PUT RQ0Z/FFrpeE3fsp8ZORd5C6dGAi8nOkKTA== X-Google-Smtp-Source: ABdhPJw0jWqUzV1Rjmht3od8xP+zoHTL5J/caUb7B+7sP45WwqoBAkjrEZgjj8zFB2XWpjH5Nf9DAQ== X-Received: by 2002:a4a:e643:: with SMTP id q3mr881901oot.46.1617157982955; Tue, 30 Mar 2021 19:33:02 -0700 (PDT) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:a099:767b:2b62:48df]) by smtp.gmail.com with ESMTPSA id 7sm188125ois.20.2021.03.30.19.33.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Mar 2021 19:33:02 -0700 (PDT) From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , 
Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next v8 13/16] udp: implement udp_bpf_recvmsg() for sockmap Date: Tue, 30 Mar 2021 19:32:34 -0700 Message-Id: <20210331023237.41094-14-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210331023237.41094-1-xiyou.wangcong@gmail.com> References: <20210331023237.41094-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang We have to implement udp_bpf_recvmsg() to replace ->recvmsg() so that skmsg can be retrieved from ingress_msg. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang Acked-by: John Fastabend --- net/ipv4/udp_bpf.c | 64 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 63 insertions(+), 1 deletion(-) diff --git a/net/ipv4/udp_bpf.c b/net/ipv4/udp_bpf.c index 6001f93cd3a0..7d5c4ebf42fe 100644 --- a/net/ipv4/udp_bpf.c +++ b/net/ipv4/udp_bpf.c @@ -4,6 +4,68 @@ #include #include #include +#include + +#include "udp_impl.h" + +static struct proto *udpv6_prot_saved __read_mostly; + +static int sk_udp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, + int noblock, int flags, int *addr_len) +{ +#if IS_ENABLED(CONFIG_IPV6) + if (sk->sk_family == AF_INET6) + return udpv6_prot_saved->recvmsg(sk, msg, len, noblock, flags, + addr_len); +#endif + return udp_prot.recvmsg(sk, msg, len, noblock, flags, addr_len); +} + +static int udp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, + int nonblock, int flags, int *addr_len) +{ + struct sk_psock *psock; + int copied, ret; + + if (unlikely(flags & MSG_ERRQUEUE)) + return inet_recv_error(sk, msg, len, addr_len); + + psock = sk_psock_get(sk); + if (unlikely(!psock)) + return sk_udp_recvmsg(sk, msg, len, nonblock, flags, addr_len); + + lock_sock(sk); + if (sk_psock_queue_empty(psock)) { + ret = sk_udp_recvmsg(sk, msg, len, nonblock, flags, addr_len); + goto out; + } + 
+msg_bytes_ready: + copied = sk_msg_recvmsg(sk, psock, msg, len, flags); + if (!copied) { + int data, err = 0; + long timeo; + + timeo = sock_rcvtimeo(sk, nonblock); + data = sk_msg_wait_data(sk, psock, flags, timeo, &err); + if (data) { + if (!sk_psock_queue_empty(psock)) + goto msg_bytes_ready; + ret = sk_udp_recvmsg(sk, msg, len, nonblock, flags, addr_len); + goto out; + } + if (err) { + ret = err; + goto out; + } + copied = -EAGAIN; + } + ret = copied; +out: + release_sock(sk); + sk_psock_put(sk, psock); + return ret; +} enum { UDP_BPF_IPV4, @@ -11,7 +73,6 @@ enum { UDP_BPF_NUM_PROTS, }; -static struct proto *udpv6_prot_saved __read_mostly; static DEFINE_SPINLOCK(udpv6_prot_lock); static struct proto udp_bpf_prots[UDP_BPF_NUM_PROTS]; @@ -20,6 +81,7 @@ static void udp_bpf_rebuild_protos(struct proto *prot, const struct proto *base) *prot = *base; prot->unhash = sock_map_unhash; prot->close = sock_map_close; + prot->recvmsg = udp_bpf_recvmsg; } static void udp_bpf_check_v6_needs_rebuild(struct proto *ops) From patchwork Wed Mar 31 02:32:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12174175 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EA0A8C433F7 for ; Wed, 31 Mar 2021 02:33:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D295A619E7 for ; Wed, 31 Mar 
2021 02:33:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233459AbhCaCdS (ORCPT ); Tue, 30 Mar 2021 22:33:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60510 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233422AbhCaCdF (ORCPT ); Tue, 30 Mar 2021 22:33:05 -0400 Received: from mail-ot1-x32a.google.com (mail-ot1-x32a.google.com [IPv6:2607:f8b0:4864:20::32a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ECCDEC061574; Tue, 30 Mar 2021 19:33:04 -0700 (PDT) Received: by mail-ot1-x32a.google.com with SMTP id s11-20020a056830124bb029021bb3524ebeso17611090otp.0; Tue, 30 Mar 2021 19:33:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=R7RKRv5Jc3lBQtN71qE8m5rDgXTr5NP4RcQ1X/jJ5Lc=; b=P/aNmwDJ05dBrydNX7hSg9fEL99L/hwLfxRRbDJ727uQZVkFwlXTkvS6yy3L+tfml9 TFV5+7OST/2oWoO7PYPQ8kz9X9Y7PxSYenFeJABl9wsJRa5NS0pL07JFWq1HT8u5HElT MitFJXqqhYmENIv2yp6aF8xqRe1og5qgGPZ60IeYJgI/Yq9vf6TU5T24nDkCV8dcuJTQ JwIAunhcMxeR0OLH76TBHwWmRSfyqEHxVg2vvM/jhuNvuxhepRAWN+Z2h8pSoaKSQkcq zLASnkqrELPnSBwu5I/ADbQ7+dXbsTWievdGmzPGsjaTBLKgFRtZvIuvzx/K93382fC+ 4oNA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=R7RKRv5Jc3lBQtN71qE8m5rDgXTr5NP4RcQ1X/jJ5Lc=; b=XY7+VBDrFSm5ZMj2gb/Q1ufVpVhFtQQGxz7PJBrTl0zCYkvnYaYroEBnog5+cBUE7W qA4uuXDCD51ZyD5oSRdESxDyuqRXSkBLgx0UnreJwL58U28ezwMXEtD5rCmwfDz5W80R pn35rKhT2ewaqRfppCRARA5mqLQEFrw1QET78+WTjvSqCnmr9eQg7WSRTBmZKcm/PN+6 0qzubrG6PDql4J0PK4+ICw5gmAtLycPrgMiDikOMpgc5UaNV7/npLkf82uPeEFRSXO2+ V8NyETFO9PUSfnH19L1i4qfkr5Rq411Xpa+I8SS5keAQKXVgDkX/oXy12cQdg8A177dl pqtQ== X-Gm-Message-State: AOAM5314qgUS/JS8z4nU2FT4KFcVuas8lNw1s2g1H0bPXstQoPt6KggA 
GZ44w/XyVVgSn9VczsHl2siep4Fk9b1xxg== X-Google-Smtp-Source: ABdhPJxCEJnNCG1i7I3PLqFSP8qYjSDaquzE9jlsE2fL72K/h7B9sMJLjRhPji+eGMlnEfzU/AKSaQ== X-Received: by 2002:a9d:62d9:: with SMTP id z25mr801447otk.194.1617157984182; Tue, 30 Mar 2021 19:33:04 -0700 (PDT) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:a099:767b:2b62:48df]) by smtp.gmail.com with ESMTPSA id 7sm188125ois.20.2021.03.30.19.33.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Mar 2021 19:33:03 -0700 (PDT) From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next v8 14/16] sock_map: update sock type checks for UDP Date: Tue, 30 Mar 2021 19:32:35 -0700 Message-Id: <20210331023237.41094-15-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210331023237.41094-1-xiyou.wangcong@gmail.com> References: <20210331023237.41094-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang Now that UDP supports sockmap and redirection, we can safely update the sock type checks for it accordingly. 
Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang Acked-by: John Fastabend --- net/core/sock_map.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/net/core/sock_map.c b/net/core/sock_map.c index 2915c7c8778b..3d190d22b0d8 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -535,7 +535,10 @@ static bool sk_is_udp(const struct sock *sk) static bool sock_map_redirect_allowed(const struct sock *sk) { - return sk_is_tcp(sk) && sk->sk_state != TCP_LISTEN; + if (sk_is_tcp(sk)) + return sk->sk_state != TCP_LISTEN; + else + return sk->sk_state == TCP_ESTABLISHED; } static bool sock_map_sk_is_suitable(const struct sock *sk) From patchwork Wed Mar 31 02:32:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12174187 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D70B3C433F8 for ; Wed, 31 Mar 2021 02:33:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C417F619E5 for ; Wed, 31 Mar 2021 02:33:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233462AbhCaCdT (ORCPT ); Tue, 30 Mar 2021 22:33:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60518 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id 
S233425AbhCaCdG (ORCPT ); Tue, 30 Mar 2021 22:33:06 -0400 Received: from mail-oi1-x230.google.com (mail-oi1-x230.google.com [IPv6:2607:f8b0:4864:20::230]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 39841C061574; Tue, 30 Mar 2021 19:33:06 -0700 (PDT) Received: by mail-oi1-x230.google.com with SMTP id a8so18560566oic.11; Tue, 30 Mar 2021 19:33:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=gOC5unRxfv/K/WT+PzYY09V5EeMjRcHWr4Ygtd7dIvU=; b=P0NRpYe679s9LHxwi9aAsg85TAPMBGn93P8bsxt+OUcCNNPg8TLIhSFf6PBh9mMnoX 5Os0KdXAAXYDn1m1j8k0s1BH9ietm+6M5o4WZiklMwfKES8g/tQqNeXRtTONGLUI1oUk Ls41eFkJyeLgGPsh1PWm1l005sW0IGVqJGQOvrcU6p7UzeZgTtRhWprNPQbxnm7FKuru v2528JaO6pUdLHSasBNN4xfwXWhLuE9wL7hqh7EcvfINTTZgMAcqfAuACyMzqLE+mI0E odgteIXfJ0QQYXPrxsy/plDY+RdCHqq0OMgbnyjZilO0/nPV/emv4Q4uL0EqEGBDzze+ aCjg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=gOC5unRxfv/K/WT+PzYY09V5EeMjRcHWr4Ygtd7dIvU=; b=mgEGrmMfY5NknlcaScBhzj+MJxtZzTc6NDNiTKyYt3YLwyGPgETKkMp6rgzXpJ7RGF UVa5J0Us89D+K0pBgQq01jNiAELEOfe4CPz5ouN16JeKdCfGJFWDIIUo4OHkVDZzfuyL HQB+Pl5o5EIPz/HlZn3HEUce7eBwDnLLDHdCEnCTyGrIrYPNTwIPCI7PJq6RaCmSjfzD QaKdcbX+5pg2FTZwvbSK/wduRqxuD3/6k0WGYJlx6Ds0vhPnI8Cjlvsi8iwKEtqFq1gJ UCuQXjbHwCSuGMc8JpwwniCAjHK/ljzsbkPyyVLdaOugnsqRVNsNi+dqsqa3e4tSZP81 rwPw== X-Gm-Message-State: AOAM530IkH8Ja0KeZz3Wj5+sZYKDH3mZBQD60vzG8FwWZ2g/CBg+y7ay zQeCVFf/oC96aL8wqF6alrd01usm+SO8oA== X-Google-Smtp-Source: ABdhPJyZhkwu/YjdhUYgXc6YeSaCZdoxQE8UZDq5875xpAnmZtuDQSEShfCv/uxehxMMwBhWLLRoTA== X-Received: by 2002:aca:df44:: with SMTP id w65mr678554oig.36.1617157985433; Tue, 30 Mar 2021 19:33:05 -0700 (PDT) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:a099:767b:2b62:48df]) by smtp.gmail.com with 
ESMTPSA id 7sm188125ois.20.2021.03.30.19.33.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Mar 2021 19:33:05 -0700 (PDT) From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next v8 15/16] selftests/bpf: add a test case for udp sockmap Date: Tue, 30 Mar 2021 19:32:36 -0700 Message-Id: <20210331023237.41094-16-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210331023237.41094-1-xiyou.wangcong@gmail.com> References: <20210331023237.41094-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang Add a test case to ensure redirection between two UDP sockets works. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- .../selftests/bpf/prog_tests/sockmap_listen.c | 136 ++++++++++++++++++ .../selftests/bpf/progs/test_sockmap_listen.c | 22 +++ 2 files changed, 158 insertions(+) diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c index c26e6bf05e49..648d9ae898d2 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c @@ -1603,6 +1603,141 @@ static void test_reuseport(struct test_sockmap_listen *skel, } } +static void udp_redir_to_connected(int family, int sotype, int sock_mapfd, + int verd_mapfd, enum redir_mode mode) +{ + const char *log_prefix = redir_mode_str(mode); + struct sockaddr_storage addr; + int c0, c1, p0, p1; + unsigned int pass; + socklen_t len; + int err, n; + u64 value; + u32 key; + char b; + + zero_verdict_count(verd_mapfd); + + p0 = socket_loopback(family, sotype | SOCK_NONBLOCK); + if (p0 < 0) + return; + len = 
sizeof(addr); + err = xgetsockname(p0, sockaddr(&addr), &len); + if (err) + goto close_peer0; + + c0 = xsocket(family, sotype | SOCK_NONBLOCK, 0); + if (c0 < 0) + goto close_peer0; + err = xconnect(c0, sockaddr(&addr), len); + if (err) + goto close_cli0; + err = xgetsockname(c0, sockaddr(&addr), &len); + if (err) + goto close_cli0; + err = xconnect(p0, sockaddr(&addr), len); + if (err) + goto close_cli0; + + p1 = socket_loopback(family, sotype | SOCK_NONBLOCK); + if (p1 < 0) + goto close_cli0; + err = xgetsockname(p1, sockaddr(&addr), &len); + if (err) + goto close_peer1; + + c1 = xsocket(family, sotype | SOCK_NONBLOCK, 0); + if (c1 < 0) + goto close_peer1; + err = xconnect(c1, sockaddr(&addr), len); + if (err) + goto close_cli1; + err = xgetsockname(c1, sockaddr(&addr), &len); + if (err) + goto close_cli1; + err = xconnect(p1, sockaddr(&addr), len); + if (err) + goto close_cli1; + + key = 0; + value = p0; + err = xbpf_map_update_elem(sock_mapfd, &key, &value, BPF_NOEXIST); + if (err) + goto close_cli1; + + key = 1; + value = p1; + err = xbpf_map_update_elem(sock_mapfd, &key, &value, BPF_NOEXIST); + if (err) + goto close_cli1; + + n = write(c1, "a", 1); + if (n < 0) + FAIL_ERRNO("%s: write", log_prefix); + if (n == 0) + FAIL("%s: incomplete write", log_prefix); + if (n < 1) + goto close_cli1; + + key = SK_PASS; + err = xbpf_map_lookup_elem(verd_mapfd, &key, &pass); + if (err) + goto close_cli1; + if (pass != 1) + FAIL("%s: want pass count 1, have %d", log_prefix, pass); + + n = read(mode == REDIR_INGRESS ? 
p0 : c0, &b, 1); + if (n < 0) + FAIL_ERRNO("%s: read", log_prefix); + if (n == 0) + FAIL("%s: incomplete read", log_prefix); + +close_cli1: + xclose(c1); +close_peer1: + xclose(p1); +close_cli0: + xclose(c0); +close_peer0: + xclose(p0); +} + +static void udp_skb_redir_to_connected(struct test_sockmap_listen *skel, + struct bpf_map *inner_map, int family) +{ + int verdict = bpf_program__fd(skel->progs.prog_skb_verdict); + int verdict_map = bpf_map__fd(skel->maps.verdict_map); + int sock_map = bpf_map__fd(inner_map); + int err; + + err = xbpf_prog_attach(verdict, sock_map, BPF_SK_SKB_VERDICT, 0); + if (err) + return; + + skel->bss->test_ingress = false; + udp_redir_to_connected(family, SOCK_DGRAM, sock_map, verdict_map, + REDIR_EGRESS); + skel->bss->test_ingress = true; + udp_redir_to_connected(family, SOCK_DGRAM, sock_map, verdict_map, + REDIR_INGRESS); + + xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT); +} + +static void test_udp_redir(struct test_sockmap_listen *skel, struct bpf_map *map, + int family) +{ + const char *family_name, *map_name; + char s[MAX_TEST_NAME]; + + family_name = family_str(family); + map_name = map_type_str(map); + snprintf(s, sizeof(s), "%s %s %s", map_name, family_name, __func__); + if (!test__start_subtest(s)) + return; + udp_skb_redir_to_connected(skel, map, family); +} + static void run_tests(struct test_sockmap_listen *skel, struct bpf_map *map, int family) { @@ -1611,6 +1746,7 @@ static void run_tests(struct test_sockmap_listen *skel, struct bpf_map *map, test_redir(skel, map, family, SOCK_STREAM); test_reuseport(skel, map, family, SOCK_STREAM); test_reuseport(skel, map, family, SOCK_DGRAM); + test_udp_redir(skel, map, family); } void test_sockmap_listen(void) diff --git a/tools/testing/selftests/bpf/progs/test_sockmap_listen.c b/tools/testing/selftests/bpf/progs/test_sockmap_listen.c index fa221141e9c1..a39eba9f5201 100644 --- a/tools/testing/selftests/bpf/progs/test_sockmap_listen.c +++ 
b/tools/testing/selftests/bpf/progs/test_sockmap_listen.c @@ -29,6 +29,7 @@ struct { } verdict_map SEC(".maps"); static volatile bool test_sockmap; /* toggled by user-space */ +static volatile bool test_ingress; /* toggled by user-space */ SEC("sk_skb/stream_parser") int prog_stream_parser(struct __sk_buff *skb) @@ -55,6 +56,27 @@ int prog_stream_verdict(struct __sk_buff *skb) return verdict; } +SEC("sk_skb/skb_verdict") +int prog_skb_verdict(struct __sk_buff *skb) +{ + unsigned int *count; + __u32 zero = 0; + int verdict; + + if (test_sockmap) + verdict = bpf_sk_redirect_map(skb, &sock_map, zero, + test_ingress ? BPF_F_INGRESS : 0); + else + verdict = bpf_sk_redirect_hash(skb, &sock_hash, &zero, + test_ingress ? BPF_F_INGRESS : 0); + + count = bpf_map_lookup_elem(&verdict_map, &verdict); + if (count) + (*count)++; + + return verdict; +} + SEC("sk_msg") int prog_msg_verdict(struct sk_msg_md *msg) { From patchwork Wed Mar 31 02:32:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 12174181 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B2041C43381 for ; Wed, 31 Mar 2021 02:33:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8FF43619BB for ; Wed, 31 Mar 2021 02:33:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id 
S233468AbhCaCdT (ORCPT ); Tue, 30 Mar 2021 22:33:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60528 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233426AbhCaCdH (ORCPT ); Tue, 30 Mar 2021 22:33:07 -0400 Received: from mail-oi1-x22d.google.com (mail-oi1-x22d.google.com [IPv6:2607:f8b0:4864:20::22d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7A1EBC061574; Tue, 30 Mar 2021 19:33:07 -0700 (PDT) Received: by mail-oi1-x22d.google.com with SMTP id w70so18654556oie.0; Tue, 30 Mar 2021 19:33:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=zJsA5AZqU+uTJz0h7iEhTFBjGQWrPnAhpQ/zHuK8xKc=; b=spkOC5wVCdDJKIO9YZiXnJKge4i/1KPDa//qPOz5QwQMZYpPHARuu3YfKFeZTJ5yHr l/sZXvSQvmAePdiXaDSm19sQOpmeIKKT4A4m9ws1BkuaMM77tiEIk81Wq68w3mFakwNl N2sGb0tblC2scKcKIGJ4z+zbn+8LgJ3Fh6XITjCoa6Pdvukzkmoc5yzcJKrF9PGPp0tG ZLBsVDvGzwQAA/mmrNT/jcRfAAyFNCTMCij2OJCKhW34SUWUjACzWnaM8QBz++JISnof giy3QhZNOiEM4P+XbA07jRIcHV7jhO5hq3+6FHHHXbAUjgxVxxtW6kVrTahUJI+uM/H/ Qfnw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=zJsA5AZqU+uTJz0h7iEhTFBjGQWrPnAhpQ/zHuK8xKc=; b=noIq+6U/ClPwLm9TtV+WSBsnw9LGshRj62Fz1O3lsPOMiHVRx2G0HzSHJRCH8c7ozi gO6iVGbYwB+eTrLSXbVFHR4YN7mKgbVT/0WEIfeE0bEW6kFraiSw+rk9mhdM/KYuMT7u OvVVVTTDAIrCo1EYOW/IaLqIB6qeX6unlF6rn/rofDw/wr13E2wvnEdXddn3glG04GuO SgepGAE1YUl07OUIphyIyLvch7X5+4Udt6Z+s9bj8G3LH1rWU4RwavqvoD/21jxcwIe+ t+g1RYmVip3l62nLa5LsH+lzHGaDyXIopKca+eeTWIpw822IM17xPHtmNYDHKcpz0c9r 6YNw== X-Gm-Message-State: AOAM5316fPS4+optQ78+P0wxywWz3SnEZ6lTC/6BFTamDL2S7BdhdpB1 1fkYmi7OplN/RQwSrYB1iRR+04/o31UVUQ== X-Google-Smtp-Source: ABdhPJw840JvOOhx11ku3mcL6/lrfwH2JNsJAw/RBGSz62yk5Rhz3RDl+YOjeEXh46RUeSwIuntsWg== 
X-Received: by 2002:a05:6808:ab0:: with SMTP id r16mr671778oij.34.1617157986746; Tue, 30 Mar 2021 19:33:06 -0700 (PDT) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:a099:767b:2b62:48df]) by smtp.gmail.com with ESMTPSA id 7sm188125ois.20.2021.03.30.19.33.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Mar 2021 19:33:06 -0700 (PDT) From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next v8 16/16] selftests/bpf: add a test case for loading BPF_SK_SKB_VERDICT Date: Tue, 30 Mar 2021 19:32:37 -0700 Message-Id: <20210331023237.41094-17-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210331023237.41094-1-xiyou.wangcong@gmail.com> References: <20210331023237.41094-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Cong Wang This adds a test case to ensure that BPF_SK_SKB_VERDICT and BPF_SK_SKB_STREAM_VERDICT are never attached to the same map at the same time. 
Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- .../selftests/bpf/prog_tests/sockmap_basic.c | 40 +++++++++++++++++++ .../progs/test_sockmap_skb_verdict_attach.c | 18 +++++++++ 2 files changed, 58 insertions(+) create mode 100644 tools/testing/selftests/bpf/progs/test_sockmap_skb_verdict_attach.c diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c index b8b48cac2ac3..ab77596b64e3 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c @@ -7,6 +7,7 @@ #include "test_skmsg_load_helpers.skel.h" #include "test_sockmap_update.skel.h" #include "test_sockmap_invalid_update.skel.h" +#include "test_sockmap_skb_verdict_attach.skel.h" #include "bpf_iter_sockmap.skel.h" #define TCP_REPAIR 19 /* TCP sock is under repair right now */ @@ -281,6 +282,39 @@ static void test_sockmap_copy(enum bpf_map_type map_type) bpf_iter_sockmap__destroy(skel); } +static void test_sockmap_skb_verdict_attach(enum bpf_attach_type first, + enum bpf_attach_type second) +{ + struct test_sockmap_skb_verdict_attach *skel; + int err, map, verdict; + + skel = test_sockmap_skb_verdict_attach__open_and_load(); + if (CHECK_FAIL(!skel)) { + perror("test_sockmap_skb_verdict_attach__open_and_load"); + return; + } + + verdict = bpf_program__fd(skel->progs.prog_skb_verdict); + map = bpf_map__fd(skel->maps.sock_map); + + err = bpf_prog_attach(verdict, map, first, 0); + if (CHECK_FAIL(err)) { + perror("bpf_prog_attach"); + goto out; + } + + err = bpf_prog_attach(verdict, map, second, 0); + assert(err == -1 && errno == EBUSY); + + err = bpf_prog_detach2(verdict, map, first); + if (CHECK_FAIL(err)) { + perror("bpf_prog_detach2"); + goto out; + } +out: + test_sockmap_skb_verdict_attach__destroy(skel); +} + void test_sockmap_basic(void) { if (test__start_subtest("sockmap create_update_free")) @@ -301,4 +335,10 @@ void 
test_sockmap_basic(void) test_sockmap_copy(BPF_MAP_TYPE_SOCKMAP); if (test__start_subtest("sockhash copy")) test_sockmap_copy(BPF_MAP_TYPE_SOCKHASH); + if (test__start_subtest("sockmap skb_verdict attach")) { + test_sockmap_skb_verdict_attach(BPF_SK_SKB_VERDICT, + BPF_SK_SKB_STREAM_VERDICT); + test_sockmap_skb_verdict_attach(BPF_SK_SKB_STREAM_VERDICT, + BPF_SK_SKB_VERDICT); + } } diff --git a/tools/testing/selftests/bpf/progs/test_sockmap_skb_verdict_attach.c b/tools/testing/selftests/bpf/progs/test_sockmap_skb_verdict_attach.c new file mode 100644 index 000000000000..2d31f66e4f23 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_sockmap_skb_verdict_attach.c @@ -0,0 +1,18 @@ +// SPDX-License-Identifier: GPL-2.0 +#include "vmlinux.h" +#include + +struct { + __uint(type, BPF_MAP_TYPE_SOCKMAP); + __uint(max_entries, 2); + __type(key, __u32); + __type(value, __u64); +} sock_map SEC(".maps"); + +SEC("sk_skb/skb_verdict") +int prog_skb_verdict(struct __sk_buff *skb) +{ + return SK_DROP; +} + +char _license[] SEC("license") = "GPL";