From patchwork Thu Apr 18 14:57:42 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Aleksandr Mikhalitsyn
X-Patchwork-Id: 13634950
X-Patchwork-Delegate: kuba@kernel.org
From: Alexander Mikhalitsyn
To: horms@verge.net.au
Cc: netdev@vger.kernel.org, lvs-devel@vger.kernel.org,
 netfilter-devel@vger.kernel.org, linux-kernel@vger.kernel.org,
 Alexander Mikhalitsyn, Julian Anastasov, Pablo Neira Ayuso,
 Jozsef Kadlecsik, Florian Westphal
Subject: [PATCH net-next v3 1/2] ipvs: add READ_ONCE barrier for ipvs->sysctl_amemthresh
Date: Thu, 18 Apr 2024 16:57:42 +0200
Message-Id: <20240418145743.248109-1-aleksandr.mikhalitsyn@canonical.com>
X-Mailer: git-send-email 2.34.1
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

Cc: Julian Anastasov
Cc: Simon Horman
Cc: Pablo Neira Ayuso
Cc: Jozsef Kadlecsik
Cc: Florian Westphal
Suggested-by: Julian Anastasov
Signed-off-by: Alexander Mikhalitsyn
Acked-by: Julian Anastasov
---
 net/netfilter/ipvs/ip_vs_ctl.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 143a341bbc0a..32be24f0d4e4 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -94,6 +94,7 @@ static void update_defense_level(struct netns_ipvs *ipvs)
 {
 	struct sysinfo i;
 	int availmem;
+	int amemthresh;
 	int nomem;
 	int to_change = -1;
 
@@ -105,7 +106,8 @@ static void update_defense_level(struct netns_ipvs *ipvs)
 	/* si_swapinfo(&i); */
 	/* availmem = availmem - (i.totalswap - i.freeswap); */
 
-	nomem = (availmem < ipvs->sysctl_amemthresh);
+	amemthresh = max(READ_ONCE(ipvs->sysctl_amemthresh), 0);
+	nomem = (availmem < amemthresh);
 
 	local_bh_disable();
 
@@ -145,9 +147,8 @@ static void update_defense_level(struct netns_ipvs *ipvs)
 		break;
 	case 1:
 		if (nomem) {
-			ipvs->drop_rate = ipvs->drop_counter
-				= ipvs->sysctl_amemthresh /
-				(ipvs->sysctl_amemthresh-availmem);
+			ipvs->drop_counter = amemthresh / (amemthresh - availmem);
+			ipvs->drop_rate = ipvs->drop_counter;
 			ipvs->sysctl_drop_packet = 2;
 		} else {
 			ipvs->drop_rate = 0;
@@ -155,9 +156,8 @@ static void update_defense_level(struct netns_ipvs *ipvs)
 		break;
 	case 2:
 		if (nomem) {
-			ipvs->drop_rate = ipvs->drop_counter
-				= ipvs->sysctl_amemthresh /
-				(ipvs->sysctl_amemthresh-availmem);
+			ipvs->drop_counter = amemthresh / (amemthresh - availmem);
+			ipvs->drop_rate = ipvs->drop_counter;
 		} else {
 			ipvs->drop_rate = 0;
 			ipvs->sysctl_drop_packet = 1;

From patchwork Thu Apr 18 14:57:43 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Aleksandr Mikhalitsyn
X-Patchwork-Id: 13634949
X-Patchwork-Delegate: kuba@kernel.org
From: Alexander Mikhalitsyn
To: horms@verge.net.au
Cc: netdev@vger.kernel.org, lvs-devel@vger.kernel.org,
 netfilter-devel@vger.kernel.org, linux-kernel@vger.kernel.org,
 Alexander Mikhalitsyn, Stéphane Graber, Christian Brauner,
 Julian Anastasov, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal
Subject: [PATCH net-next v3 2/2] ipvs: allow some sysctls in non-init user namespaces
Date: Thu, 18 Apr 2024 16:57:43 +0200
Message-Id: <20240418145743.248109-2-aleksandr.mikhalitsyn@canonical.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240418145743.248109-1-aleksandr.mikhalitsyn@canonical.com>
References: <20240418145743.248109-1-aleksandr.mikhalitsyn@canonical.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

Let's make all IPVS sysctls writable even when the network namespace is
owned by a non-initial user namespace.

Let's make a few sysctls read-only for non-privileged users:
- sync_qlen_max
- sync_sock_size
- run_estimation
- est_cpulist
- est_nice

I'm trying to be conservative with this to avoid introducing any
security issues. Maybe we can allow more sysctls to be writable, but
let's do this on demand, when we see a real use case.

This patch is motivated by a user request in the LXC project [1].
Having this can help with running some Kubernetes [2] or Docker Swarm [3]
workloads inside system containers.

Link: https://github.com/lxc/lxc/issues/4278 [1]
Link: https://github.com/kubernetes/kubernetes/blob/b722d017a34b300a2284b890448e5a605f21d01e/pkg/proxy/ipvs/proxier.go#L103 [2]
Link: https://github.com/moby/libnetwork/blob/3797618f9a38372e8107d8c06f6ae199e1133ae8/osl/namespace_linux.go#L682 [3]

Cc: Stéphane Graber
Cc: Christian Brauner
Cc: Julian Anastasov
Cc: Simon Horman
Cc: Pablo Neira Ayuso
Cc: Jozsef Kadlecsik
Cc: Florian Westphal
Signed-off-by: Alexander Mikhalitsyn
Acked-by: Julian Anastasov
---
 net/netfilter/ipvs/ip_vs_ctl.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 32be24f0d4e4..c3ba71aa2654 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -4270,6 +4270,7 @@ static int __net_init ip_vs_control_net_init_sysctl(struct netns_ipvs *ipvs)
 	struct ctl_table *tbl;
 	int idx, ret;
 	size_t ctl_table_size = ARRAY_SIZE(vs_vars);
+	bool unpriv = net->user_ns != &init_user_ns;
 
 	atomic_set(&ipvs->dropentry, 0);
 	spin_lock_init(&ipvs->dropentry_lock);
@@ -4284,12 +4285,6 @@ static int __net_init ip_vs_control_net_init_sysctl(struct netns_ipvs *ipvs)
 		tbl = kmemdup(vs_vars, sizeof(vs_vars), GFP_KERNEL);
 		if (tbl == NULL)
 			return -ENOMEM;
-
-		/* Don't export sysctls to unprivileged users */
-		if (net->user_ns != &init_user_ns) {
-			tbl[0].procname = NULL;
-			ctl_table_size = 0;
-		}
 	} else
 		tbl = vs_vars;
 	/* Initialize sysctl defaults */
@@ -4315,10 +4310,17 @@ static int __net_init ip_vs_control_net_init_sysctl(struct netns_ipvs *ipvs)
 	ipvs->sysctl_sync_ports = 1;
 	tbl[idx++].data = &ipvs->sysctl_sync_ports;
 	tbl[idx++].data = &ipvs->sysctl_sync_persist_mode;
+
 	ipvs->sysctl_sync_qlen_max = nr_free_buffer_pages() / 32;
+	if (unpriv)
+		tbl[idx].mode = 0444;
 	tbl[idx++].data = &ipvs->sysctl_sync_qlen_max;
+
 	ipvs->sysctl_sync_sock_size = 0;
+	if (unpriv)
+		tbl[idx].mode = 0444;
 	tbl[idx++].data = &ipvs->sysctl_sync_sock_size;
+
 	tbl[idx++].data = &ipvs->sysctl_cache_bypass;
 	tbl[idx++].data = &ipvs->sysctl_expire_nodest_conn;
 	tbl[idx++].data = &ipvs->sysctl_sloppy_tcp;
@@ -4341,15 +4343,22 @@ static int __net_init ip_vs_control_net_init_sysctl(struct netns_ipvs *ipvs)
 	tbl[idx++].data = &ipvs->sysctl_conn_reuse_mode;
 	tbl[idx++].data = &ipvs->sysctl_schedule_icmp;
 	tbl[idx++].data = &ipvs->sysctl_ignore_tunneled;
+
 	ipvs->sysctl_run_estimation = 1;
+	if (unpriv)
+		tbl[idx].mode = 0444;
 	tbl[idx].extra2 = ipvs;
 	tbl[idx++].data = &ipvs->sysctl_run_estimation;
 
 	ipvs->est_cpulist_valid = 0;
+	if (unpriv)
+		tbl[idx].mode = 0444;
 	tbl[idx].extra2 = ipvs;
 	tbl[idx++].data = &ipvs->sysctl_est_cpulist;
 
 	ipvs->sysctl_est_nice = IPVS_EST_NICE;
+	if (unpriv)
+		tbl[idx].mode = 0444;
 	tbl[idx].extra2 = ipvs;
 	tbl[idx++].data = &ipvs->sysctl_est_nice;
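
Note on patch 2/2: the per-namespace table handling can be modelled in
userspace roughly as below. This is only a sketch under stated assumptions,
not kernel code. struct demo_ctl, demo_vars, demo_table_init() and the
"restricted" flag are hypothetical names introduced for illustration; the
patch itself sets tbl[idx].mode = 0444 at each entry's position instead of
using a flag. The sketch shows the idea of duplicating a template table per
namespace and downgrading selected entries to read-only when the owning user
namespace is not the initial one.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the kernel's struct ctl_table. */
struct demo_ctl {
	const char *procname;
	unsigned int mode;	/* permission bits, e.g. 0644 */
	bool restricted;	/* assumption: read-only when unprivileged */
};

/* Template table, playing the role that vs_vars plays in the patch. */
static const struct demo_ctl demo_vars[] = {
	{ "amemthresh",     0644, false },
	{ "sync_qlen_max",  0644, true  },
	{ "sync_sock_size", 0644, true  },
	{ "run_estimation", 0644, true  },
	{ "est_cpulist",    0644, true  },
	{ "est_nice",       0644, true  },
};

#define DEMO_NVARS (sizeof(demo_vars) / sizeof(demo_vars[0]))

/*
 * Duplicate the template for one namespace and downgrade the restricted
 * entries to 0444 when the owner is unprivileged. Returns NULL on
 * allocation failure; the caller frees the result with free().
 */
static struct demo_ctl *demo_table_init(bool unpriv)
{
	struct demo_ctl *tbl = malloc(sizeof(demo_vars));
	size_t i;

	if (!tbl)
		return NULL;
	memcpy(tbl, demo_vars, sizeof(demo_vars));

	for (i = 0; i < DEMO_NVARS; i++) {
		if (unpriv && tbl[i].restricted)
			tbl[i].mode = 0444;
	}
	return tbl;
}
```

Calling demo_table_init(true) gives a copy in which the five listed entries
carry mode 0444 while "amemthresh" keeps 0644; demo_table_init(false) leaves
every mode as in the template. Working on a copy keeps the shared template
unchanged for other namespaces.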