From patchwork Mon Jan 18 14:39:30 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Menglong Dong
X-Patchwork-Id: 12027457
X-Patchwork-Delegate: kuba@kernel.org
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
        aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED,
        DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM,
        HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
        MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT
        autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
        by smtp.lore.kernel.org (Postfix) with ESMTP id D0497C433E0
        for ; Mon, 18 Jan 2021 14:42:13 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
        by mail.kernel.org (Postfix) with ESMTP id 8682122B40
        for ; Mon, 18 Jan 2021 14:42:13 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S2392727AbhAROly (ORCPT );
        Mon, 18 Jan 2021 09:41:54 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56204 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S2393075AbhAROkg (ORCPT );
        Mon, 18 Jan 2021 09:40:36 -0500
Received: from mail-pf1-x442.google.com (mail-pf1-x442.google.com
        [IPv6:2607:f8b0:4864:20::442])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 04B2CC061575;
        Mon, 18 Jan 2021 06:39:56 -0800 (PST)
Received: by mail-pf1-x442.google.com with SMTP id h10so10298024pfo.9;
        Mon, 18 Jan 2021 06:39:56 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20161025;
        h=from:to:cc:subject:date:message-id:in-reply-to:references
         :mime-version:content-transfer-encoding;
        bh=BQpIDmpmqR2mhiXRTWsYQzQxNoPXxseuWN2BDaNXzQc=;
        b=ZhQ+6o4Ht6DtwDqPCRE5w8+NsbM5JnEbbNQoT0Hq200JUEbWvnvL/iK7sh4BPZ7jbh
         Z0qJKs+NUgflx49bja/2NeA4fZYhDaSC6yVyfa3XIb0LLG35ncRMPsyIBxU3ahca8Tgb
         jFjmmvwD8jYNoxD0gOi/sPR7eiWZKJeZZ5MvBIkhynueZ3FYv8NG1T6DvzBxIS70BYp9
         RwqrdVf1ze4lxxIA/B0RrRp/Azg216BM0cWIGEOB0Djlj4aUXSepmdSYZxqdkTL2o8pG
         SLey/ayWv2ULjAcTWYl6UcGzgmdkf9ameNFwM+YVMI3NMQRuv8MQDO1oJTBQLmn9FjwI
         hcaw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
         :references:mime-version:content-transfer-encoding;
        bh=BQpIDmpmqR2mhiXRTWsYQzQxNoPXxseuWN2BDaNXzQc=;
        b=EUqJ2sLFS5YkCZId2cY8IYoFNmfJ29jdM2yIcQRXhLAJVH5VkNYFp4Ng0X0/wvUGNK
         7sFVjuYordmOAOegXCv2oTw2PgnZW11B621BmAbGyVZkPPYJHwfCroPGnDxw7cED+eHb
         JALF2+JxDVMOY19gg2itwIVpfywNQqVYR8ImlXXqvFYnxoJ2b9EiZ0LSI8FFTvYm7c1b
         YfLg5rgob/jZTzDvY2PBocVW7OEJcEkSnzazRylLlJcFONyiil0cFmFP/8ntT8sB48Xu
         JkK/qpfNYxjlwg9FzByfkemrqdgnc9P1YiBTM8FGgMz9tP826d6cj2/3eEidCMaKXynR
         4dGQ==
X-Gm-Message-State: AOAM530xk8muwYKbfQVXWEfVget/u42cUn1WrsBhi78EDukPUXsWlnFM
        u1twP3wpy6Cee9SuiHDcSTE=
X-Google-Smtp-Source: ABdhPJxnyv3ZffDGzg5YpRKik0DeLOfZXZlZ0mNAAQgZ+VsddMNyRm36u5dTkOTCjja7XY3Xfuqchw==
X-Received: by 2002:a62:e704:0:b029:1b9:cb4:7626 with SMTP id
        s4-20020a62e7040000b02901b90cb47626mr637227pfh.52.1610980795610;
        Mon, 18 Jan 2021 06:39:55 -0800 (PST)
Received: from localhost ([178.236.46.205])
        by smtp.gmail.com with ESMTPSA id c14sm15405219pfd.37.2021.01.18.06.39.54
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Mon, 18 Jan 2021 06:39:55 -0800 (PST)
From: menglong8.dong@gmail.com
X-Google-Original-From: dong.menglong@zte.com.cn
To: kuba@kernel.org, christian.brauner@ubuntu.com
Cc: davem@davemloft.net, yoshfuji@linux-ipv6.org,
        dong.menglong@zte.com.cn, daniel@iogearbox.net, gnault@redhat.com,
        ast@kernel.org, nicolas.dichtel@6wind.com, ap420073@gmail.com,
        edumazet@google.com, pabeni@redhat.com, jakub@cloudflare.com,
        bjorn.topel@intel.com, keescook@chromium.org,
        viro@zeniv.linux.org.uk, rdna@fb.com, maheshb@google.com,
        netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH net-next 1/3] net: core: init every ctl_table in netns_core_table
Date: Mon, 18 Jan 2021 22:39:30 +0800
Message-Id: <20210118143932.56069-2-dong.menglong@zte.com.cn>
X-Mailer: git-send-email 2.30.0
In-Reply-To: <20210118143932.56069-1-dong.menglong@zte.com.cn>
References: <20210118143932.56069-1-dong.menglong@zte.com.cn>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org
X-Patchwork-Delegate: kuba@kernel.org

From: Menglong Dong

Currently, netns_core_table has only one element, and it is initialized
directly in sysctl_core_net_init(). To make this more flexible,
initialize every element in a loop, the same way
ipv4_sysctl_init_net() does.

Signed-off-by: Menglong Dong
---
 net/core/sysctl_net_core.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index d86d8d11cfe4..966d976dee84 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -606,15 +606,19 @@ static __net_init int sysctl_core_net_init(struct net *net)
 
 	tbl = netns_core_table;
 	if (!net_eq(net, &init_net)) {
+		int i;
+
 		tbl = kmemdup(tbl, sizeof(netns_core_table), GFP_KERNEL);
 		if (tbl == NULL)
 			goto err_dup;
 
-		tbl[0].data = &net->core.sysctl_somaxconn;
+		/* Update the variables to point into the current struct net */
+		for (i = 0; i < ARRAY_SIZE(netns_core_table) - 1; i++) {
+			tbl[i].data += (void *)net - (void *)&init_net;
 
-		/* Don't export any sysctls to unprivileged users */
-		if (net->user_ns != &init_user_ns) {
-			tbl[0].procname = NULL;
+			/* Don't export any sysctls to unprivileged users */
+			if (net->user_ns != &init_user_ns)
+				tbl[i].procname = NULL;
 		}
 	}
 

From patchwork Mon Jan 18 14:39:31 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Menglong Dong
X-Patchwork-Id: 12027451
X-Patchwork-Delegate: kuba@kernel.org
Return-Path:
X-Spam-Checker-Version:
        SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-13.0 required=3.0 tests=BAYES_00,DKIM_SIGNED,
        DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM,
        HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
        MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNWANTED_LANGUAGE_BODY,
        USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
        by smtp.lore.kernel.org (Postfix) with ESMTP id 12B5DC433DB
        for ; Mon, 18 Jan 2021 14:41:28 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
        by mail.kernel.org (Postfix) with ESMTP id C23BF22472
        for ; Mon, 18 Jan 2021 14:41:27 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S2393139AbhAROlX (ORCPT );
        Mon, 18 Jan 2021 09:41:23 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56226 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S2393079AbhAROkm (ORCPT );
        Mon, 18 Jan 2021 09:40:42 -0500
Received: from mail-pl1-x643.google.com (mail-pl1-x643.google.com
        [IPv6:2607:f8b0:4864:20::643])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DE218C061757;
        Mon, 18 Jan 2021 06:40:00 -0800 (PST)
Received: by mail-pl1-x643.google.com with SMTP id t6so8764675plq.1;
        Mon, 18 Jan 2021 06:40:00 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20161025;
        h=from:to:cc:subject:date:message-id:in-reply-to:references
         :mime-version:content-transfer-encoding;
        bh=+E4QJbI5ohbqhEdy8Ep8gmkZ7hLL7GgbwHKJKePuegA=;
        b=EmOF57qh1P24Ku69Mp8y35f5YGaulMCb6YAtFbrxa3oLsMrkGu9BGwfYmv7f+/7Qx4
         +YweOlHbM9JXaorJQemFn8aU9XJCBuplO2J/OIlEECh/Lmf+roz/SeYaJr6lMRw0Jnrv
         nhGPKPyjRZd/GYX4JNyG+SLq1yCs9oJf5TpL8CwCp4LYFOt1gLSeBrii5n+ibuEi8CSU
         ZK+81eCE6LUChSxYo5ODOzKiOg+lQNc23NVAgUBYhzM4eKUhu13LcvsxoP6AGhXH5NQP
         COZkgshkZt1jJJelYKDK3Fk3HuIOarULURDIbGQ66on4f2I6FBEgBR+5xq6H752UeMGK
         hVwA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
         :references:mime-version:content-transfer-encoding;
        bh=+E4QJbI5ohbqhEdy8Ep8gmkZ7hLL7GgbwHKJKePuegA=;
        b=Vs/i9SNA7W6mjP1zmJMXeHNVPvPNvtxzSGSpg5/J0mS4PQYpibwk9diPVThtvVLH8u
         6oGSc6l0mf1AAkBLAHbw4sOr3fwt/kBQ/3cIROPtb+ZFnBKT95Pptt25qLet1htyfmY0
         5RMBh449Aul0OX/g8mrVexGBZpPnNC2h6WYCnMUEV2PZwXYfgW0tcR5pdevNMVJ32j9d
         JpZEdKOWzGDb7a+zj29ptop9xOehEPmvF7xSF/T15DXduHQ9rEH8RQRNdXD/ns2fcGQl
         XqfRGFodcuTpBNDhoWph5ekUDAhCbIyEnlUp0i2XiaHyCl095K0Dow4wKyV53Jd6QDYK
         6JTg==
X-Gm-Message-State: AOAM533PwpjxqgjbTYqFuz4TR7CiN11Kvi5ZJFHc6XlW9QVgLRNqmCj1
        oXHT+wSliH1VbI0Tu88nA/7nRegsTmw=
X-Google-Smtp-Source: ABdhPJzD+eRjNQx9jEE8HPszk/X0G+/mLUjDE8E8Yo2q9z7zLkPxUXbC55CKu0XUaMyAC0dUuBG8hA==
X-Received: by 2002:a17:90b:1014:: with SMTP id gm20mr21105510pjb.5.1610980800513;
        Mon, 18 Jan 2021 06:40:00 -0800 (PST)
Received: from localhost ([178.236.46.205])
        by smtp.gmail.com with ESMTPSA id f29sm15906756pgm.76.2021.01.18.06.39.59
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Mon, 18 Jan 2021 06:39:59 -0800 (PST)
From: menglong8.dong@gmail.com
X-Google-Original-From: dong.menglong@zte.com.cn
To: kuba@kernel.org, christian.brauner@ubuntu.com
Cc: davem@davemloft.net, yoshfuji@linux-ipv6.org,
        dong.menglong@zte.com.cn, daniel@iogearbox.net, gnault@redhat.com,
        ast@kernel.org, nicolas.dichtel@6wind.com, ap420073@gmail.com,
        edumazet@google.com, pabeni@redhat.com, jakub@cloudflare.com,
        bjorn.topel@intel.com, keescook@chromium.org,
        viro@zeniv.linux.org.uk, rdna@fb.com, maheshb@google.com,
        netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH net-next 2/3] net: core: Namespace-ify sysctl_wmem_default and sysctl_rmem_default
Date: Mon, 18 Jan 2021 22:39:31 +0800
Message-Id: <20210118143932.56069-3-dong.menglong@zte.com.cn>
X-Mailer: git-send-email 2.30.0
In-Reply-To: <20210118143932.56069-1-dong.menglong@zte.com.cn>
References: <20210118143932.56069-1-dong.menglong@zte.com.cn>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org
X-Patchwork-Delegate: kuba@kernel.org

From: Menglong Dong

Currently, sysctl_wmem_default and sysctl_rmem_default are global and
shared by every network namespace. This is inconvenient in some cases,
for example when using Docker and trying to control the default UDP
socket receive buffer for each container.

For that reason, make sysctl_wmem_default and sysctl_rmem_default
per-namespace.

Signed-off-by: Menglong Dong
---
 include/net/netns/core.h   |  2 ++
 include/net/sock.h         |  3 ---
 net/core/net_namespace.c   |  2 ++
 net/core/sock.c            |  6 ++----
 net/core/sysctl_net_core.c | 32 ++++++++++++++++----------------
 net/ipv4/ip_output.c       |  2 +-
 6 files changed, 23 insertions(+), 24 deletions(-)

diff --git a/include/net/netns/core.h b/include/net/netns/core.h
index 36c2d998a43c..317b47df6d08 100644
--- a/include/net/netns/core.h
+++ b/include/net/netns/core.h
@@ -9,6 +9,8 @@ struct netns_core {
 	/* core sysctls */
 	struct ctl_table_header	*sysctl_hdr;
 
+	int	sysctl_wmem_default;
+	int	sysctl_rmem_default;
 	int	sysctl_somaxconn;
 
 #ifdef CONFIG_PROC_FS
diff --git a/include/net/sock.h b/include/net/sock.h
index bdc4323ce53c..b846a6d24459 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2653,9 +2653,6 @@ extern __u32 sysctl_rmem_max;
 extern int sysctl_tstamp_allow_data;
 extern int sysctl_optmem_max;
 
-extern __u32 sysctl_wmem_default;
-extern __u32 sysctl_rmem_default;
-
 DECLARE_STATIC_KEY_FALSE(net_high_order_alloc_disable_key);
 
 static inline int sk_get_wmem0(const struct sock *sk, const struct proto *proto)
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 2ef3b4557f40..eb4ea99131d6 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -374,6 +374,8 @@ static __net_init int setup_net(struct net *net, struct user_namespace *user_ns)
 
 static int __net_init net_defaults_init_net(struct net *net)
 {
+	net->core.sysctl_rmem_default = SK_RMEM_MAX;
+	net->core.sysctl_wmem_default = SK_WMEM_MAX;
 	net->core.sysctl_somaxconn = SOMAXCONN;
 	return 0;
 }
diff --git a/net/core/sock.c b/net/core/sock.c
index bbcd4b97eddd..2421e4ea1915 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -270,8 +270,6 @@ __u32 sysctl_wmem_max __read_mostly = SK_WMEM_MAX;
 EXPORT_SYMBOL(sysctl_wmem_max);
 __u32 sysctl_rmem_max __read_mostly = SK_RMEM_MAX;
 EXPORT_SYMBOL(sysctl_rmem_max);
-__u32 sysctl_wmem_default __read_mostly = SK_WMEM_MAX;
-__u32 sysctl_rmem_default __read_mostly = SK_RMEM_MAX;
 
 /* Maximal space eaten by iovec or ancillary data plus some space */
 int sysctl_optmem_max __read_mostly = sizeof(unsigned long)*(2*UIO_MAXIOV+512);
@@ -2970,8 +2968,8 @@ void sock_init_data(struct socket *sock, struct sock *sk)
 	timer_setup(&sk->sk_timer, NULL, 0);
 
 	sk->sk_allocation	=	GFP_KERNEL;
-	sk->sk_rcvbuf		=	sysctl_rmem_default;
-	sk->sk_sndbuf		=	sysctl_wmem_default;
+	sk->sk_rcvbuf		=	sock_net(sk)->core.sysctl_rmem_default;
+	sk->sk_sndbuf		=	sock_net(sk)->core.sysctl_wmem_default;
 	sk->sk_state		=	TCP_CLOSE;
 	sk_set_socket(sk, sock);
 
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 966d976dee84..5c1c75e42a09 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -326,22 +326,6 @@ static struct ctl_table net_core_table[] = {
 		.proc_handler	= proc_dointvec_minmax,
 		.extra1		= &min_rcvbuf,
 	},
-	{
-		.procname	= "wmem_default",
-		.data		= &sysctl_wmem_default,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= &min_sndbuf,
-	},
-	{
-		.procname	= "rmem_default",
-		.data		= &sysctl_rmem_default,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= &min_rcvbuf,
-	},
 	{
 		.procname	= "dev_weight",
 		.data		= &weight_p,
@@ -584,6 +568,22 @@ static struct ctl_table netns_core_table[] = {
 		.extra1		= SYSCTL_ZERO,
 		.proc_handler	= proc_dointvec_minmax
 	},
+	{
+		.procname	= "wmem_default",
+		.data		= &init_net.core.sysctl_wmem_default,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &min_sndbuf,
+	},
+	{
+		.procname	= "rmem_default",
+		.data		= &init_net.core.sysctl_rmem_default,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &min_rcvbuf,
+	},
 	{ }
 };
 
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 2ed0b01f72f0..0fbdcda6f314 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1709,7 +1709,7 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
 
 	sk->sk_protocol = ip_hdr(skb)->protocol;
 	sk->sk_bound_dev_if = arg->bound_dev_if;
-	sk->sk_sndbuf = sysctl_wmem_default;
+	sk->sk_sndbuf = sock_net(sk)->core.sysctl_wmem_default;
 	ipc.sockc.mark = fl4.flowi4_mark;
 	err = ip_append_data(sk, &fl4, ip_reply_glue_bits, arg->iov->iov_base,
 			     len, 0, &ipc, &rt, MSG_DONTWAIT);

From patchwork Mon Jan 18 14:39:32 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Menglong Dong
X-Patchwork-Id: 12027453
X-Patchwork-Delegate: kuba@kernel.org
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
        aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-12.9 required=3.0 tests=BAYES_00,DKIM_SIGNED,
        DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM,
        HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
        MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNWANTED_LANGUAGE_BODY,
        URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no
        version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
        by smtp.lore.kernel.org (Postfix) with ESMTP id 0E382C433E6
        for ; Mon, 18 Jan 2021 14:41:55 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
        by mail.kernel.org (Postfix) with ESMTP id C697522BF3
        for ; Mon, 18 Jan 2021 14:41:54 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S2404898AbhAROlf (ORCPT );
        Mon, 18 Jan 2021 09:41:35 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56244 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S2393085AbhAROkq (ORCPT );
        Mon, 18 Jan 2021 09:40:46 -0500
Received: from mail-pj1-x1043.google.com (mail-pj1-x1043.google.com
        [IPv6:2607:f8b0:4864:20::1043])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 22E95C0613C1;
        Mon, 18 Jan 2021 06:40:06 -0800 (PST)
Received: by mail-pj1-x1043.google.com with SMTP id md11so9657553pjb.0;
        Mon, 18 Jan 2021 06:40:06 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20161025;
        h=from:to:cc:subject:date:message-id:in-reply-to:references
         :mime-version:content-transfer-encoding;
        bh=lFmA9TnP9NE8OaeHmI0MhXoMypeEzomf+Eb9FD+FhuU=;
        b=pLU/aVXPocVbCZ0riNNXLjJBUe7Zv03l79sxJ6FkYxvDI3Aj074kVQpt8AxRNAHC9E
         Q1p/VVi9DcFV1/MZiHr9iM6oT5Xcnb2yo8LTeVRKJTalf3CJInFfHxKyn8Vbh4H+kjLz
         gNmlfnBqlYi/jM/T+d/rq5lTXx3ZiFDhcR7N1jciBW2b5GgaSc0rYPgAgme3VA0gjnrL
         b2pg1MLIzu3Zjhbtd4oPCHhbHqOzJb6yPePfsrWR3yeFeEcWXBJ2NhpJ8ZWC5EZRrWDe
         utcMfkNfNSW8VjxAgxI9jfVKCjippXZ7wsuTB5ZasfXfg6mSSKdyefEXk/bJ/uPNbijE
         Vc3w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
         :references:mime-version:content-transfer-encoding;
        bh=lFmA9TnP9NE8OaeHmI0MhXoMypeEzomf+Eb9FD+FhuU=;
        b=FKOcS64i5LjRkMIqtWnHOzAe9ybmrReEGLKfNj628mcnZIpGjijKANvtM7QnilkD5p
         j7PevwhJNgUrKh6t5MT6XYw81wZB8zhmETrAF5VPAXqMaulWw+k9SU7m5DERdf8hI9qZ
         x8g5s39DCHUhCg6hVt5t/txsbr9zBGoSqUIfSHGT8c+1Pdb3t+t6803uVvhfmnNPhQ8W
         KnJicQ4VhcATJ9hDEPel+KCxubavhp4STSGwAXMUdNbXsFgfD2wc7AMvFa+LnY3hoClY
         AUlqVkpBpuoI20gYSttU9kydr92SuEeiE/XLtxB9jAecr1/nGp43I4IasRo6FPdJESTU
         Lylw==
X-Gm-Message-State: AOAM5300ufX5LTp8NfeMSIIWjrwvEQomuQCt/frIHitR3UPlVrNj3Blo
        f6BDYzHMHY8Y0RhrYjWa1f0=
X-Google-Smtp-Source: ABdhPJy+buD3uEUWJFMyhCWmvswVpc7p0czFqcV7apnWeiqJoxWb4PuP1M9/49UH4vGGJQA9ICdnBQ==
X-Received: by 2002:a17:902:ed11:b029:de:2f19:4db1 with SMTP id
        b17-20020a170902ed11b02900de2f194db1mr27088125pld.20.1610980805726;
        Mon, 18 Jan 2021 06:40:05 -0800 (PST)
Received: from localhost ([178.236.46.205])
        by smtp.gmail.com with ESMTPSA id b12sm17324051pgr.9.2021.01.18.06.40.04
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Mon, 18 Jan 2021 06:40:05 -0800 (PST)
From: menglong8.dong@gmail.com
X-Google-Original-From: dong.menglong@zte.com.cn
To: kuba@kernel.org, christian.brauner@ubuntu.com
Cc: davem@davemloft.net, yoshfuji@linux-ipv6.org,
        dong.menglong@zte.com.cn, daniel@iogearbox.net, gnault@redhat.com,
        ast@kernel.org, nicolas.dichtel@6wind.com, ap420073@gmail.com,
        edumazet@google.com, pabeni@redhat.com, jakub@cloudflare.com,
        bjorn.topel@intel.com, keescook@chromium.org,
        viro@zeniv.linux.org.uk, rdna@fb.com, maheshb@google.com,
        netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH net-next 3/3] net: core: Namespace-ify sysctl_rmem_max and sysctl_wmem_max
Date: Mon, 18 Jan 2021 22:39:32 +0800
Message-Id: <20210118143932.56069-4-dong.menglong@zte.com.cn>
X-Mailer: git-send-email 2.30.0
In-Reply-To: <20210118143932.56069-1-dong.menglong@zte.com.cn>
References: <20210118143932.56069-1-dong.menglong@zte.com.cn>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org
X-Patchwork-Delegate: kuba@kernel.org

From: Menglong Dong

Currently, sysctl_wmem_max and sysctl_rmem_max are global and shared by
every network namespace. This is inconvenient in some cases, for
example when using Docker and trying to control the default UDP socket
receive buffer for each container.

For that reason, make sysctl_wmem_max and sysctl_rmem_max
per-namespace.

Signed-off-by: Menglong Dong
---
 include/net/netns/core.h        |  2 ++
 include/net/sock.h              |  3 ---
 net/core/filter.c               |  4 ++--
 net/core/net_namespace.c        |  2 ++
 net/core/sock.c                 | 12 ++++--------
 net/core/sysctl_net_core.c      | 32 ++++++++++++++++----------------
 net/ipv4/tcp_output.c           |  2 +-
 net/netfilter/ipvs/ip_vs_sync.c |  4 ++--
 8 files changed, 29 insertions(+), 32 deletions(-)

diff --git a/include/net/netns/core.h b/include/net/netns/core.h
index 317b47df6d08..b4aecac6e8ce 100644
--- a/include/net/netns/core.h
+++ b/include/net/netns/core.h
@@ -11,6 +11,8 @@ struct netns_core {
 
 	int	sysctl_wmem_default;
 	int	sysctl_rmem_default;
+	int	sysctl_wmem_max;
+	int	sysctl_rmem_max;
 	int	sysctl_somaxconn;
 
 #ifdef CONFIG_PROC_FS
diff --git a/include/net/sock.h b/include/net/sock.h
index b846a6d24459..f6b0f2c482ad 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2647,9 +2647,6 @@ void sk_get_meminfo(const struct sock *sk, u32 *meminfo);
 #define SK_WMEM_MAX		(_SK_MEM_OVERHEAD * _SK_MEM_PACKETS)
 #define SK_RMEM_MAX		(_SK_MEM_OVERHEAD * _SK_MEM_PACKETS)
 
-extern __u32 sysctl_wmem_max;
-extern __u32 sysctl_rmem_max;
-
 extern int sysctl_tstamp_allow_data;
 extern int sysctl_optmem_max;
 
diff --git a/net/core/filter.c b/net/core/filter.c
index 255aeee72402..3dca58f6c40c 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -4717,13 +4717,13 @@ static int _bpf_setsockopt(struct sock *sk, int level, int optname,
 		/* Only some socketops are supported */
 		switch (optname) {
 		case SO_RCVBUF:
-			val = min_t(u32, val, sysctl_rmem_max);
+			val = min_t(u32, val, sock_net(sk)->core.sysctl_rmem_max);
 			sk->sk_userlocks |= SOCK_RCVBUF_LOCK;
 			WRITE_ONCE(sk->sk_rcvbuf,
 				   max_t(int, val * 2, SOCK_MIN_RCVBUF));
 			break;
 		case SO_SNDBUF:
-			val = min_t(u32, val, sysctl_wmem_max);
+			val = min_t(u32, val, sock_net(sk)->core.sysctl_wmem_max);
 			sk->sk_userlocks |= SOCK_SNDBUF_LOCK;
 			WRITE_ONCE(sk->sk_sndbuf,
 				   max_t(int, val * 2, SOCK_MIN_SNDBUF));
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index eb4ea99131d6..552e3c5b2a41 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -376,6 +376,8 @@ static int __net_init net_defaults_init_net(struct net *net)
 {
 	net->core.sysctl_rmem_default = SK_RMEM_MAX;
 	net->core.sysctl_wmem_default = SK_WMEM_MAX;
+	net->core.sysctl_rmem_max = SK_RMEM_MAX;
+	net->core.sysctl_wmem_max = SK_WMEM_MAX;
 	net->core.sysctl_somaxconn = SOMAXCONN;
 	return 0;
 }
diff --git a/net/core/sock.c b/net/core/sock.c
index 2421e4ea1915..eb7eaaa840ce 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -265,12 +265,6 @@ static struct lock_class_key af_wlock_keys[AF_MAX];
 static struct lock_class_key af_elock_keys[AF_MAX];
 static struct lock_class_key af_kern_callback_keys[AF_MAX];
 
-/* Run time adjustable parameters. */
-__u32 sysctl_wmem_max __read_mostly = SK_WMEM_MAX;
-EXPORT_SYMBOL(sysctl_wmem_max);
-__u32 sysctl_rmem_max __read_mostly = SK_RMEM_MAX;
-EXPORT_SYMBOL(sysctl_rmem_max);
-
 /* Maximal space eaten by iovec or ancillary data plus some space */
 int sysctl_optmem_max __read_mostly = sizeof(unsigned long)*(2*UIO_MAXIOV+512);
 EXPORT_SYMBOL(sysctl_optmem_max);
@@ -877,7 +871,7 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
 		 * play 'guess the biggest size' games. RCVBUF/SNDBUF
 		 * are treated in BSD as hints
 		 */
-		val = min_t(u32, val, sysctl_wmem_max);
+		val = min_t(u32, val, sock_net(sk)->core.sysctl_wmem_max);
 set_sndbuf:
 		/* Ensure val * 2 fits into an int, to prevent max_t()
 		 * from treating it as a negative value.
@@ -909,7 +903,9 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
 		 * play 'guess the biggest size' games. RCVBUF/SNDBUF
 		 * are treated in BSD as hints
 		 */
-		__sock_set_rcvbuf(sk, min_t(u32, val, sysctl_rmem_max));
+		__sock_set_rcvbuf(sk,
+				  min_t(u32, val,
+					sock_net(sk)->core.sysctl_rmem_max));
 		break;
 
 	case SO_RCVBUFFORCE:
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 5c1c75e42a09..30a8e3a324ec 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -310,22 +310,6 @@ proc_dolongvec_minmax_bpf_restricted(struct ctl_table *table, int write,
 
 static struct ctl_table net_core_table[] = {
 #ifdef CONFIG_NET
-	{
-		.procname	= "wmem_max",
-		.data		= &sysctl_wmem_max,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= &min_sndbuf,
-	},
-	{
-		.procname	= "rmem_max",
-		.data		= &sysctl_rmem_max,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= &min_rcvbuf,
-	},
 	{
 		.procname	= "dev_weight",
 		.data		= &weight_p,
@@ -584,6 +568,22 @@ static struct ctl_table netns_core_table[] = {
 		.proc_handler	= proc_dointvec_minmax,
 		.extra1		= &min_rcvbuf,
 	},
+	{
+		.procname	= "wmem_max",
+		.data		= &init_net.core.sysctl_wmem_max,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &min_sndbuf,
+	},
+	{
+		.procname	= "rmem_max",
+		.data		= &init_net.core.sysctl_rmem_max,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &min_rcvbuf,
+	},
 	{ }
 };
 
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index f322e798a351..8c1b2b0e6211 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -241,7 +241,7 @@ void tcp_select_initial_window(const struct sock *sk, int __space, __u32 mss,
 	if (wscale_ok) {
 		/* Set window scaling on max possible window */
 		space = max_t(u32, space, sock_net(sk)->ipv4.sysctl_tcp_rmem[2]);
-		space = max_t(u32, space, sysctl_rmem_max);
+		space = max_t(u32, space, sock_net(sk)->core.sysctl_rmem_max);
 		space = min_t(u32, space, *window_clamp);
 		*rcv_wscale = clamp_t(int, ilog2(space) - 15,
 				      0, TCP_MAX_WSCALE);
diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index 9d43277b8b4f..2e7e10b76c36 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -1280,12 +1280,12 @@ static void set_sock_size(struct sock *sk, int mode, int val)
 	lock_sock(sk);
 	if (mode) {
 		val = clamp_t(int, val, (SOCK_MIN_SNDBUF + 1) / 2,
-			      sysctl_wmem_max);
+			      sock_net(sk)->core.sysctl_wmem_max);
 		sk->sk_sndbuf = val * 2;
 		sk->sk_userlocks |= SOCK_SNDBUF_LOCK;
 	} else {
 		val = clamp_t(int, val, (SOCK_MIN_RCVBUF + 1) / 2,
-			      sysctl_rmem_max);
+			      sock_net(sk)->core.sysctl_rmem_max);
 		sk->sk_rcvbuf = val * 2;
 		sk->sk_userlocks |= SOCK_RCVBUF_LOCK;
 	}
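
Note on the table-rebasing loop from patch 1/3: the series relies on
every netns_core_table entry pointing into init_net, so that a copy of
the table can be shifted onto another struct net by adding one constant
offset. Below is a minimal userspace sketch of that idea, not the
kernel code. The names fake_net, fake_ctl, init_ns, template_tbl and
dup_table_for are hypothetical stand-ins invented for illustration, and
the sketch computes a per-field offset through char * instead of the
GNU void * arithmetic used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-namespace part of struct net. */
struct fake_net {
	int somaxconn;
	int wmem_default;
	int rmem_default;
};

/* Hypothetical stand-in for struct ctl_table (name + data pointer only). */
struct fake_ctl {
	const char *procname;
	void *data;
};

/* Plays the role of init_net; the values are arbitrary for the sketch. */
static struct fake_net init_ns = { 4096, 212992, 212992 };

/* Template table: every .data points into init_ns; last entry is a sentinel. */
static struct fake_ctl template_tbl[] = {
	{ "somaxconn",    &init_ns.somaxconn },
	{ "wmem_default", &init_ns.wmem_default },
	{ "rmem_default", &init_ns.rmem_default },
	{ NULL, NULL }
};

#define N_ENTRIES (sizeof(template_tbl) / sizeof(template_tbl[0]))

/*
 * Duplicate the template and rebase each .data pointer so that it refers
 * to the same field inside *ns. Returns NULL on allocation failure; the
 * caller frees the result.
 */
static struct fake_ctl *dup_table_for(struct fake_net *ns)
{
	struct fake_ctl *tbl;
	size_t i;

	assert(ns != NULL);

	tbl = malloc(sizeof(template_tbl));
	if (!tbl)
		return NULL;
	memcpy(tbl, template_tbl, sizeof(template_tbl));

	/* Skip the sentinel, as the "- 1" in the patch's loop bound does. */
	for (i = 0; i < N_ENTRIES - 1; i++) {
		/* Offset of the field inside init_ns, re-applied to ns. */
		size_t off = (size_t)((char *)tbl[i].data - (char *)&init_ns);

		tbl[i].data = (char *)ns + off;
	}
	return tbl;
}
```

Under these assumptions, adding an entry to the template needs no
change to the duplication code, which is the flexibility the commit
message of patch 1/3 is after. The template itself is never modified,
so the init_ns entries keep working.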