From patchwork Thu Aug 29 14:46:39 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Dumazet X-Patchwork-Id: 13783342 X-Patchwork-Delegate: kuba@kernel.org Received: from mail-yb1-f201.google.com (mail-yb1-f201.google.com [209.85.219.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 856FC1B012C for ; Thu, 29 Aug 2024 14:46:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724942811; cv=none; b=Eu2ijuJCW+M8qJSaQAtBx3xTkjB0JLb03HiRKJVnwC04M/pOqVzA40XlzL6I3vD1aAG9SXyPMUXJCmK9eo0PzM6ZFWlsEBMFwpDMWid0KjXiQdmHBAJ97fkcTYP0GjyunY030vOiyf7Co9zREMVe6R4iZP7YfcbEbGu8rVuzhK8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724942811; c=relaxed/simple; bh=om/hBxVwyEwqd9Wet0BnvxGv0o7q27USn31SauPK2VE=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=pAxeULBDHfieUXwKOwlUJvnddsKVAV2YYwP6YP6bd3K1wknkccI4Q1crvyO8NW9fHEjd7fAqhqlUzU4vgJrfUJMMbsFjLVQpWaoLF0acSP0AKnwjiHiD1MyerJ20vHjPjeMb4BpQVREztfsXhNdz4wFil6GNR22iVGHhka2Wa8I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--edumazet.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=UdtVok33; arc=none smtp.client-ip=209.85.219.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--edumazet.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="UdtVok33" Received: by mail-yb1-f201.google.com with SMTP 
id 3f1490d57ef6-e178e745c49so1051216276.2 for ; Thu, 29 Aug 2024 07:46:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1724942807; x=1725547607; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=GsvFIbWLnIoB9I9D/ZAdpOE/EjanSMs0G2/A0cor4VM=; b=UdtVok330JwnuiohUYF/8l7MekvI68Ncctg1W8cAeQXdgqli5eYq7ZLuc0IcKoJw5C dbfqP9lCbR/hszEhIjZNvOuSdA4hiOo5rOhrJ4y8LhJcn7imLoAax81pSaal8rh2hLnd PzHzRDLcQQmBUQzpprPGlowj3EmBKX/LUJ/+B4/5V/lGPE4Pn+LqDyIq52/nqtdagwn/ VfOQLEX2jQEvr2k7uYotgrSxjQhDYklBnwFwCk4MqTCxv+Z8KrWWd7ZegXHhsQlbXM1a /yID+IA+y6M0GZUAxP6eGHIa2WvimpFUBWkS20E4ACGRcoLN9ClpKCuRKt3Fqd+ol7RD QJMw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1724942807; x=1725547607; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=GsvFIbWLnIoB9I9D/ZAdpOE/EjanSMs0G2/A0cor4VM=; b=hJojpuIRtgSwEDYvBzwuiqGwLfIShxIk7Av3k3VP65wV/gJiTxAXzMNfDoTyiKg1IX oqRzRPAyZu2jtzEVi+kG/pFbFOfN4Wg0v1dwBIxe0A99aI4znFRA8q9SHcqAheMzlSEI KQw783kBrx9CAnxVBxsJixvfeZsRISGG5vwaMc7f1zSyM9Ka3fMzQJw5bBKzy9jkh9Vc oHSdQNw7lgGdReLGifsmDcnptOX7eP8iBxxYjtUPabiTOOR/tN+3e7SdRkSGOiNcCrjM zhCHMNCFsi/iotN4C6ELijp7tLEYx5uQCSbJPT7/QPV9orGCV4jgZbfdFfw0s8Ub52yA kFFg== X-Forwarded-Encrypted: i=1; AJvYcCV5ayUngjF85D6ciBZMIVeNTFObSpycZ6mgFMlymkF6evhImXuVDX7sQfFDGs21dscKsdpXu1Q=@vger.kernel.org X-Gm-Message-State: AOJu0YwfwXuHn1NhRgh9jOFw2lakytZ6zX7YOOaco8rhQbIx5Cs+IGfj I531XaIjYHGiNAW+M839lKEQLXptZg/NNffkJaI0ebBSfV9oMEQyMSz8OmOBB/87bI8P8OLJmQh NMqGjQ3hwFg== X-Google-Smtp-Source: AGHT+IFl6gXILQamF6WR2WBuxYjcrPKRtwnrxbZKfhi0osrgCPWUQApe9y4UJaRReTW8+JouX/spEAhp5V4S2Q== X-Received: from edumazet1.c.googlers.com ([fda3:e722:ac3:cc00:2b:7d90:c0a8:395a]) (user=edumazet job=sendgmr) by 2002:a25:846:0:b0:e11:5e94:17dc with SMTP id 
3f1490d57ef6-e1a5ab75e35mr4392276.5.1724942807371; Thu, 29 Aug 2024 07:46:47 -0700 (PDT) Date: Thu, 29 Aug 2024 14:46:39 +0000 In-Reply-To: <20240829144641.3880376-1-edumazet@google.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20240829144641.3880376-1-edumazet@google.com> X-Mailer: git-send-email 2.46.0.295.g3b9ea8a38a-goog Message-ID: <20240829144641.3880376-2-edumazet@google.com> Subject: [PATCH v2 net-next 1/3] icmp: change the order of rate limits From: Eric Dumazet To: "David S . Miller" , Jakub Kicinski , Paolo Abeni Cc: David Ahern , Willy Tarreau , Keyu Man , Jesper Dangaard Brouer , netdev@vger.kernel.org, eric.dumazet@gmail.com, Eric Dumazet , stable@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org

ICMP messages are ratelimited:

After the blamed commits, the two rate limiters are applied in this order:

1) host wide ratelimit (icmp_global_allow())

2) Per destination ratelimit (inetpeer based)

In order to avoid side-channel attacks, we need to apply the per destination check first.

This patch makes the following change:

1) icmp_global_allow() checks if the host wide limit is reached. But credits are not yet consumed. This is deferred to 3)

2) The per destination limit is checked/updated. This might add a new node in inetpeer tree.

3) icmp_global_consume() consumes tokens if prior operations succeeded.

This means that host wide ratelimit is still effective in keeping inetpeer tree small even under DDoS.

As a bonus, I removed icmp_global.lock as the fast path can use a lock-free operation.
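
The three-step ordering described above can be illustrated with a small user-space sketch. This is not the kernel code: the names global_bucket, peer_table, peer_allow() and may_send() are hypothetical, a fixed-size array stands in for the inetpeer tree, refill of the buckets is left out, and exactly one credit is consumed per message (the patch itself consumes a random 0..2 credits).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NPEERS 16	/* hypothetical stand-in for the inetpeer tree size */

struct global_bucket {
	atomic_int credit;	/* host-wide credits, refill omitted here */
};

struct peer_table {
	uint32_t addr[NPEERS];
	int tokens[NPEERS];
	int used;
};

/* Step 1: only look at the host-wide bucket, do not consume anything. */
static bool global_allow(struct global_bucket *g)
{
	return atomic_load(&g->credit) > 0;
}

/* Step 2: check/update the per-destination limit.
 * This may create a new entry, like a new inetpeer node.
 */
static bool peer_allow(struct peer_table *t, uint32_t daddr, int burst)
{
	int i;

	for (i = 0; i < t->used; i++)
		if (t->addr[i] == daddr)
			break;
	if (i == t->used) {
		if (t->used == NPEERS)
			return false;
		t->addr[i] = daddr;
		t->tokens[i] = burst;
		t->used++;
	}
	if (t->tokens[i] <= 0)
		return false;
	t->tokens[i]--;
	return true;
}

/* Step 3: consume a host-wide credit only once steps 1 and 2 succeeded. */
static void global_consume(struct global_bucket *g)
{
	atomic_fetch_sub(&g->credit, 1);
}

static bool may_send(struct global_bucket *g, struct peer_table *t,
		     uint32_t daddr, int burst)
{
	if (!global_allow(g))
		return false;	/* no peer entry is created */
	if (!peer_allow(t, daddr, burst))
		return false;	/* host-wide credit is left untouched */
	global_consume(g);
	return true;
}
```

In this sketch a message refused by the per-destination limit does not drain the host-wide bucket, and a message refused by the host-wide check never reaches the peer table, which is how the table stays small under load.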
Fixes: c0303efeab73 ("net: reduce cycles spend on ICMP replies that gets rate limited") Fixes: 4cdf507d5452 ("icmp: add a global rate limitation") Reported-by: Keyu Man Signed-off-by: Eric Dumazet Reviewed-by: David Ahern Cc: Jesper Dangaard Brouer Cc: stable@vger.kernel.org --- include/net/ip.h | 2 + net/ipv4/icmp.c | 103 ++++++++++++++++++++++++++--------------------- net/ipv6/icmp.c | 28 ++++++++----- 3 files changed, 76 insertions(+), 57 deletions(-) diff --git a/include/net/ip.h b/include/net/ip.h index c5606cadb1a552f3e282a5e1e721fd47b07432b2..82248813619e3f21e09d52976accbdc76c7668c2 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -795,6 +795,8 @@ static inline void ip_cmsg_recv(struct msghdr *msg, struct sk_buff *skb) } bool icmp_global_allow(void); +void icmp_global_consume(void); + extern int sysctl_icmp_msgs_per_sec; extern int sysctl_icmp_msgs_burst; diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c index b8f56d03fcbb62970a828e20dd9f05fcede2d552..0078e8fb2e86d0552ef85eb5bf5bef947b0f1c3d 100644 --- a/net/ipv4/icmp.c +++ b/net/ipv4/icmp.c @@ -224,57 +224,59 @@ int sysctl_icmp_msgs_per_sec __read_mostly = 1000; int sysctl_icmp_msgs_burst __read_mostly = 50; static struct { - spinlock_t lock; - u32 credit; + atomic_t credit; u32 stamp; -} icmp_global = { - .lock = __SPIN_LOCK_UNLOCKED(icmp_global.lock), -}; +} icmp_global; /** * icmp_global_allow - Are we allowed to send one more ICMP message ? * * Uses a token bucket to limit our ICMP messages to ~sysctl_icmp_msgs_per_sec. * Returns false if we reached the limit and can not send another packet. - * Note: called with BH disabled + * Works in tandem with icmp_global_consume(). */ bool icmp_global_allow(void) { - u32 credit, delta, incr = 0, now = (u32)jiffies; - bool rc = false; + u32 delta, now, oldstamp; + int incr, new, old; - /* Check if token bucket is empty and cannot be refilled - * without taking the spinlock. The READ_ONCE() are paired - * with the following WRITE_ONCE() in this same function. 
+ /* Note: many cpus could find this condition true. + * Then later icmp_global_consume() could consume more credits, + * this is an acceptable race. */ - if (!READ_ONCE(icmp_global.credit)) { - delta = min_t(u32, now - READ_ONCE(icmp_global.stamp), HZ); - if (delta < HZ / 50) - return false; - } + if (atomic_read(&icmp_global.credit) > 0) + return true; - spin_lock(&icmp_global.lock); - delta = min_t(u32, now - icmp_global.stamp, HZ); - if (delta >= HZ / 50) { - incr = READ_ONCE(sysctl_icmp_msgs_per_sec) * delta / HZ; - if (incr) - WRITE_ONCE(icmp_global.stamp, now); - } - credit = min_t(u32, icmp_global.credit + incr, - READ_ONCE(sysctl_icmp_msgs_burst)); - if (credit) { - /* We want to use a credit of one in average, but need to randomize - * it for security reasons. - */ - credit = max_t(int, credit - get_random_u32_below(3), 0); - rc = true; + now = jiffies; + oldstamp = READ_ONCE(icmp_global.stamp); + delta = min_t(u32, now - oldstamp, HZ); + if (delta < HZ / 50) + return false; + + incr = READ_ONCE(sysctl_icmp_msgs_per_sec) * delta / HZ; + if (!incr) + return false; + + if (cmpxchg(&icmp_global.stamp, oldstamp, now) == oldstamp) { + old = atomic_read(&icmp_global.credit); + do { + new = min(old + incr, READ_ONCE(sysctl_icmp_msgs_burst)); + } while (!atomic_try_cmpxchg(&icmp_global.credit, &old, new)); } - WRITE_ONCE(icmp_global.credit, credit); - spin_unlock(&icmp_global.lock); - return rc; + return true; } EXPORT_SYMBOL(icmp_global_allow); +void icmp_global_consume(void) +{ + int credits = get_random_u32_below(3); + + /* Note: this might make icmp_global.credit negative. 
*/ + if (credits) + atomic_sub(credits, &icmp_global.credit); +} +EXPORT_SYMBOL(icmp_global_consume); + static bool icmpv4_mask_allow(struct net *net, int type, int code) { if (type > NR_ICMP_TYPES) @@ -291,14 +293,16 @@ static bool icmpv4_mask_allow(struct net *net, int type, int code) return false; } -static bool icmpv4_global_allow(struct net *net, int type, int code) +static bool icmpv4_global_allow(struct net *net, int type, int code, + bool *apply_ratelimit) { if (icmpv4_mask_allow(net, type, code)) return true; - if (icmp_global_allow()) + if (icmp_global_allow()) { + *apply_ratelimit = true; return true; - + } __ICMP_INC_STATS(net, ICMP_MIB_RATELIMITGLOBAL); return false; } @@ -308,15 +312,16 @@ static bool icmpv4_global_allow(struct net *net, int type, int code) */ static bool icmpv4_xrlim_allow(struct net *net, struct rtable *rt, - struct flowi4 *fl4, int type, int code) + struct flowi4 *fl4, int type, int code, + bool apply_ratelimit) { struct dst_entry *dst = &rt->dst; struct inet_peer *peer; bool rc = true; int vif; - if (icmpv4_mask_allow(net, type, code)) - goto out; + if (!apply_ratelimit) + return true; /* No rate limit on loopback */ if (dst->dev && (dst->dev->flags&IFF_LOOPBACK)) @@ -331,6 +336,8 @@ static bool icmpv4_xrlim_allow(struct net *net, struct rtable *rt, out: if (!rc) __ICMP_INC_STATS(net, ICMP_MIB_RATELIMITHOST); + else + icmp_global_consume(); return rc; } @@ -402,6 +409,7 @@ static void icmp_reply(struct icmp_bxm *icmp_param, struct sk_buff *skb) struct ipcm_cookie ipc; struct rtable *rt = skb_rtable(skb); struct net *net = dev_net(rt->dst.dev); + bool apply_ratelimit = false; struct flowi4 fl4; struct sock *sk; struct inet_sock *inet; @@ -413,11 +421,11 @@ static void icmp_reply(struct icmp_bxm *icmp_param, struct sk_buff *skb) if (ip_options_echo(net, &icmp_param->replyopts.opt.opt, skb)) return; - /* Needed by both icmp_global_allow and icmp_xmit_lock */ + /* Needed by both icmpv4_global_allow and icmp_xmit_lock */ 
local_bh_disable(); - /* global icmp_msgs_per_sec */ - if (!icmpv4_global_allow(net, type, code)) + /* is global icmp_msgs_per_sec exhausted ? */ + if (!icmpv4_global_allow(net, type, code, &apply_ratelimit)) goto out_bh_enable; sk = icmp_xmit_lock(net); @@ -450,7 +458,7 @@ static void icmp_reply(struct icmp_bxm *icmp_param, struct sk_buff *skb) rt = ip_route_output_key(net, &fl4); if (IS_ERR(rt)) goto out_unlock; - if (icmpv4_xrlim_allow(net, rt, &fl4, type, code)) + if (icmpv4_xrlim_allow(net, rt, &fl4, type, code, apply_ratelimit)) icmp_push_reply(sk, icmp_param, &fl4, &ipc, &rt); ip_rt_put(rt); out_unlock: @@ -596,6 +604,7 @@ void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info, int room; struct icmp_bxm icmp_param; struct rtable *rt = skb_rtable(skb_in); + bool apply_ratelimit = false; struct ipcm_cookie ipc; struct flowi4 fl4; __be32 saddr; @@ -677,7 +686,7 @@ void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info, } } - /* Needed by both icmp_global_allow and icmp_xmit_lock */ + /* Needed by both icmpv4_global_allow and icmp_xmit_lock */ local_bh_disable(); /* Check global sysctl_icmp_msgs_per_sec ratelimit, unless @@ -685,7 +694,7 @@ void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info, * loopback, then peer ratelimit still work (in icmpv4_xrlim_allow) */ if (!(skb_in->dev && (skb_in->dev->flags&IFF_LOOPBACK)) && - !icmpv4_global_allow(net, type, code)) + !icmpv4_global_allow(net, type, code, &apply_ratelimit)) goto out_bh_enable; sk = icmp_xmit_lock(net); @@ -744,7 +753,7 @@ void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info, goto out_unlock; /* peer icmp_ratelimit */ - if (!icmpv4_xrlim_allow(net, rt, &fl4, type, code)) + if (!icmpv4_xrlim_allow(net, rt, &fl4, type, code, apply_ratelimit)) goto ende; /* RFC says return as much as we can without exceeding 576 bytes. 
*/ diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c index 7b31674644efc338ec458d92dfe495480825b0fd..46f70e4a835139ef7d8925c49440865355048193 100644 --- a/net/ipv6/icmp.c +++ b/net/ipv6/icmp.c @@ -175,14 +175,16 @@ static bool icmpv6_mask_allow(struct net *net, int type) return false; } -static bool icmpv6_global_allow(struct net *net, int type) +static bool icmpv6_global_allow(struct net *net, int type, + bool *apply_ratelimit) { if (icmpv6_mask_allow(net, type)) return true; - if (icmp_global_allow()) + if (icmp_global_allow()) { + *apply_ratelimit = true; return true; - + } __ICMP_INC_STATS(net, ICMP_MIB_RATELIMITGLOBAL); return false; } @@ -191,13 +193,13 @@ static bool icmpv6_global_allow(struct net *net, int type) * Check the ICMP output rate limit */ static bool icmpv6_xrlim_allow(struct sock *sk, u8 type, - struct flowi6 *fl6) + struct flowi6 *fl6, bool apply_ratelimit) { struct net *net = sock_net(sk); struct dst_entry *dst; bool res = false; - if (icmpv6_mask_allow(net, type)) + if (!apply_ratelimit) return true; /* @@ -228,6 +230,8 @@ static bool icmpv6_xrlim_allow(struct sock *sk, u8 type, if (!res) __ICMP6_INC_STATS(net, ip6_dst_idev(dst), ICMP6_MIB_RATELIMITHOST); + else + icmp_global_consume(); dst_release(dst); return res; } @@ -452,6 +456,7 @@ void icmp6_send(struct sk_buff *skb, u8 type, u8 code, __u32 info, struct net *net; struct ipv6_pinfo *np; const struct in6_addr *saddr = NULL; + bool apply_ratelimit = false; struct dst_entry *dst; struct icmp6hdr tmp_hdr; struct flowi6 fl6; @@ -533,11 +538,12 @@ void icmp6_send(struct sk_buff *skb, u8 type, u8 code, __u32 info, return; } - /* Needed by both icmp_global_allow and icmpv6_xmit_lock */ + /* Needed by both icmpv6_global_allow and icmpv6_xmit_lock */ local_bh_disable(); /* Check global sysctl_icmp_msgs_per_sec ratelimit */ - if (!(skb->dev->flags & IFF_LOOPBACK) && !icmpv6_global_allow(net, type)) + if (!(skb->dev->flags & IFF_LOOPBACK) && + !icmpv6_global_allow(net, type, &apply_ratelimit)) goto 
out_bh_enable; mip6_addr_swap(skb, parm); @@ -575,7 +581,7 @@ void icmp6_send(struct sk_buff *skb, u8 type, u8 code, __u32 info, np = inet6_sk(sk); - if (!icmpv6_xrlim_allow(sk, type, &fl6)) + if (!icmpv6_xrlim_allow(sk, type, &fl6, apply_ratelimit)) goto out; tmp_hdr.icmp6_type = type; @@ -717,6 +723,7 @@ static enum skb_drop_reason icmpv6_echo_reply(struct sk_buff *skb) struct ipv6_pinfo *np; const struct in6_addr *saddr = NULL; struct icmp6hdr *icmph = icmp6_hdr(skb); + bool apply_ratelimit = false; struct icmp6hdr tmp_hdr; struct flowi6 fl6; struct icmpv6_msg msg; @@ -781,8 +788,9 @@ static enum skb_drop_reason icmpv6_echo_reply(struct sk_buff *skb) goto out; /* Check the ratelimit */ - if ((!(skb->dev->flags & IFF_LOOPBACK) && !icmpv6_global_allow(net, ICMPV6_ECHO_REPLY)) || - !icmpv6_xrlim_allow(sk, ICMPV6_ECHO_REPLY, &fl6)) + if ((!(skb->dev->flags & IFF_LOOPBACK) && + !icmpv6_global_allow(net, ICMPV6_ECHO_REPLY, &apply_ratelimit)) || + !icmpv6_xrlim_allow(sk, ICMPV6_ECHO_REPLY, &fl6, apply_ratelimit)) goto out_dst_release; idev = __in6_dev_get(skb->dev); From patchwork Thu Aug 29 14:46:40 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Dumazet X-Patchwork-Id: 13783341 X-Patchwork-Delegate: kuba@kernel.org Received: from mail-yb1-f201.google.com (mail-yb1-f201.google.com [209.85.219.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0F0D518D65D for ; Thu, 29 Aug 2024 14:46:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724942811; cv=none; b=EcTBhvPRgTISw7GUy47xbX3URcjgp2wOUnqMJD7tIbfgp7r4fyMYxEGb1FROYX9EMdgdKuzPPaDeHRKFo43UFZ8E4z27iBOafLmSy56gZ3EhjFESEFMnO05iGWsrapVmo6Vxeelh0Vd+8YY9sSf+CtDPSbqesjdpHQubiIK7fdg= ARC-Message-Signature: i=1; 
a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724942811; c=relaxed/simple; bh=2n/jwsFvVV67e3Q0zx36XoINVHndk8yfFssM6tLiNjw=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=e+pYBsyKBnOBKOUmzKDSJOABC3Y10XHuathhbRWFNIh70MST/ETIJ/9GcUs/iZgq/ft1XHJZ8/Ds3eDfznwZ45bwRyqbtT5MtBCXGfkHMhxo/s1JS6dlMmWhAC+5RUe2r0956fdbYJRsBk64jIdiETUwsx8xuNC0cYE2BaQgP0A= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--edumazet.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=E/rcZ+7/; arc=none smtp.client-ip=209.85.219.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--edumazet.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="E/rcZ+7/" Received: by mail-yb1-f201.google.com with SMTP id 3f1490d57ef6-e02a4de4f4eso1496925276.1 for ; Thu, 29 Aug 2024 07:46:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1724942809; x=1725547609; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=xefPbmqOpYxiYdMyBExipeTwxfiPd4h7e8P4/KZ8u3w=; b=E/rcZ+7/gfUACmThJCEPR5kT8ostNffsm9bbLEaSbdi23IgFMPdvNL8S+4N1czjckE k9CzNFOk1PiRGNT6kcf8cSm+FQ9ghlcHzQv2MCcnZWLE0MU0qTL6fBlGdSzjQT/1vE9O nRrem4qgP4LxVer4g4vCH7e9NkVQxI0Tlu4epGNQGV5SMBeUT0ZV54A8oyFv1KxnP6R0 JskgOHQ8JaqW5w8tKmFaQ4PAFXgDNXBHrYVzHUJ6goKeSswygtuGhu1A7pwgKww4cZ9V n3i55vE1H3GSsJlKb00UQ3zdyZt4Xvuuv/GfSt64n8J19UC5ZQWKImr62xxhW0H4wvuV QhGg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1724942809; x=1725547609; 
h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=xefPbmqOpYxiYdMyBExipeTwxfiPd4h7e8P4/KZ8u3w=; b=uNPDZRAsA7sdau14LkrHCQRvDGHaQvjg6N0IMh+yD9J0NAydoppf9CmmblKZ0AGUsJ lVmNa7NP5Sq35i/K40dxYBGUyP0j3NKYVxS7lPhIPaLRat9VH1EFCDXxW7WSKG0tFWMT D8pZMcAxhfC0NVG5rDgFcofkEEpIAKWm6TQKsQlGQPErpH9dJCm3SCxEMVmo6ChHVMKw Q1sE+Uz2BsNh4qDqCl0R/jx1n8ONXJ5CCYIWdXtMb8Dzk/ICoNlZyG2elM5BPHCaAE1d 6EPcVIRKO/B5OOWbzvCWEfsn4YuTyFV8uJPL2pmqmPsM2E7jAjmpukrRLLh2HNRQBcLw MkPg== X-Forwarded-Encrypted: i=1; AJvYcCXqRhLvpXVTdIeHwwiCK+3lTtYOwXZvdDymnMWpUXQ6jEm8/Rtwx1rWnx7mxQeJHdEEDI1/Mrg=@vger.kernel.org X-Gm-Message-State: AOJu0YycXrPnxgWXPOEGX8smdjmLq4gpqLe6kV6LIWHDWL4QPlGh6LZK 5BHCHPfbizsrB00BscayrTcK5EV+ZfGIHSXtbnZO0jjSjZGmVWS/IXzS+FZ6Cyy8nPvGC+qB8cJ sDGf7CShPlw== X-Google-Smtp-Source: AGHT+IGoFcFbQGagUKGZQwwPsJ685/LVDJzJf2XMnOcAGYH8ngRsn8YJ+A+F0gGpmRv+CHHRKpC3WSefms1XGA== X-Received: from edumazet1.c.googlers.com ([fda3:e722:ac3:cc00:2b:7d90:c0a8:395a]) (user=edumazet job=sendgmr) by 2002:a25:ac53:0:b0:e0e:c9bc:3206 with SMTP id 3f1490d57ef6-e1a5ab76a99mr4610276.5.1724942808861; Thu, 29 Aug 2024 07:46:48 -0700 (PDT) Date: Thu, 29 Aug 2024 14:46:40 +0000 In-Reply-To: <20240829144641.3880376-1-edumazet@google.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20240829144641.3880376-1-edumazet@google.com> X-Mailer: git-send-email 2.46.0.295.g3b9ea8a38a-goog Message-ID: <20240829144641.3880376-3-edumazet@google.com> Subject: [PATCH v2 net-next 2/3] icmp: move icmp_global.credit and icmp_global.stamp to per netns storage From: Eric Dumazet To: "David S . 
Miller" , Jakub Kicinski , Paolo Abeni Cc: David Ahern , Willy Tarreau , Keyu Man , Jesper Dangaard Brouer , netdev@vger.kernel.org, eric.dumazet@gmail.com, Eric Dumazet X-Patchwork-Delegate: kuba@kernel.org Host wide ICMP ratelimiter should be per netns, to provide better isolation. Following patch in this series makes the sysctl per netns. Signed-off-by: Eric Dumazet Reviewed-by: David Ahern --- include/net/ip.h | 4 ++-- include/net/netns/ipv4.h | 3 ++- net/ipv4/icmp.c | 26 +++++++++++--------------- net/ipv6/icmp.c | 4 ++-- 4 files changed, 17 insertions(+), 20 deletions(-) diff --git a/include/net/ip.h b/include/net/ip.h index 82248813619e3f21e09d52976accbdc76c7668c2..d3bca4e83979f681c4931e9ff62db5941a059c11 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -794,8 +794,8 @@ static inline void ip_cmsg_recv(struct msghdr *msg, struct sk_buff *skb) ip_cmsg_recv_offset(msg, skb->sk, skb, 0, 0); } -bool icmp_global_allow(void); -void icmp_global_consume(void); +bool icmp_global_allow(struct net *net); +void icmp_global_consume(struct net *net); extern int sysctl_icmp_msgs_per_sec; extern int sysctl_icmp_msgs_burst; diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h index 5fcd61ada62289253844be9cbe25387dd92385a5..54fe7c079fffb285b7a8a069f3d57f9440a6655a 100644 --- a/include/net/netns/ipv4.h +++ b/include/net/netns/ipv4.h @@ -122,7 +122,8 @@ struct netns_ipv4 { u8 sysctl_icmp_errors_use_inbound_ifaddr; int sysctl_icmp_ratelimit; int sysctl_icmp_ratemask; - + atomic_t icmp_global_credit; + u32 icmp_global_stamp; u32 ip_rt_min_pmtu; int ip_rt_mtu_expires; int ip_rt_min_advmss; diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c index 0078e8fb2e86d0552ef85eb5bf5bef947b0f1c3d..2e1d81dbdbb6fe93ea53398bbe3b3627b35852b0 100644 --- a/net/ipv4/icmp.c +++ b/net/ipv4/icmp.c @@ -223,19 +223,15 @@ static inline void icmp_xmit_unlock(struct sock *sk) int sysctl_icmp_msgs_per_sec __read_mostly = 1000; int sysctl_icmp_msgs_burst __read_mostly = 50; -static struct { - 
atomic_t credit; - u32 stamp; -} icmp_global; - /** * icmp_global_allow - Are we allowed to send one more ICMP message ? + * @net: network namespace * * Uses a token bucket to limit our ICMP messages to ~sysctl_icmp_msgs_per_sec. * Returns false if we reached the limit and can not send another packet. * Works in tandem with icmp_global_consume(). */ -bool icmp_global_allow(void) +bool icmp_global_allow(struct net *net) { u32 delta, now, oldstamp; int incr, new, old; @@ -244,11 +240,11 @@ bool icmp_global_allow(void) * Then later icmp_global_consume() could consume more credits, * this is an acceptable race. */ - if (atomic_read(&icmp_global.credit) > 0) + if (atomic_read(&net->ipv4.icmp_global_credit) > 0) return true; now = jiffies; - oldstamp = READ_ONCE(icmp_global.stamp); + oldstamp = READ_ONCE(net->ipv4.icmp_global_stamp); delta = min_t(u32, now - oldstamp, HZ); if (delta < HZ / 50) return false; @@ -257,23 +253,23 @@ bool icmp_global_allow(void) if (!incr) return false; - if (cmpxchg(&icmp_global.stamp, oldstamp, now) == oldstamp) { - old = atomic_read(&icmp_global.credit); + if (cmpxchg(&net->ipv4.icmp_global_stamp, oldstamp, now) == oldstamp) { + old = atomic_read(&net->ipv4.icmp_global_credit); do { new = min(old + incr, READ_ONCE(sysctl_icmp_msgs_burst)); - } while (!atomic_try_cmpxchg(&icmp_global.credit, &old, new)); + } while (!atomic_try_cmpxchg(&net->ipv4.icmp_global_credit, &old, new)); } return true; } EXPORT_SYMBOL(icmp_global_allow); -void icmp_global_consume(void) +void icmp_global_consume(struct net *net) { int credits = get_random_u32_below(3); /* Note: this might make icmp_global.credit negative. 
*/ if (credits) - atomic_sub(credits, &icmp_global.credit); + atomic_sub(credits, &net->ipv4.icmp_global_credit); } EXPORT_SYMBOL(icmp_global_consume); @@ -299,7 +295,7 @@ static bool icmpv4_global_allow(struct net *net, int type, int code, if (icmpv4_mask_allow(net, type, code)) return true; - if (icmp_global_allow()) { + if (icmp_global_allow(net)) { *apply_ratelimit = true; return true; } @@ -337,7 +333,7 @@ static bool icmpv4_xrlim_allow(struct net *net, struct rtable *rt, if (!rc) __ICMP_INC_STATS(net, ICMP_MIB_RATELIMITHOST); else - icmp_global_consume(); + icmp_global_consume(net); return rc; } diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c index 46f70e4a835139ef7d8925c49440865355048193..071b0bc1179d81b18c340ce415cef21e02a30cd7 100644 --- a/net/ipv6/icmp.c +++ b/net/ipv6/icmp.c @@ -181,7 +181,7 @@ static bool icmpv6_global_allow(struct net *net, int type, if (icmpv6_mask_allow(net, type)) return true; - if (icmp_global_allow()) { + if (icmp_global_allow(net)) { *apply_ratelimit = true; return true; } @@ -231,7 +231,7 @@ static bool icmpv6_xrlim_allow(struct sock *sk, u8 type, __ICMP6_INC_STATS(net, ip6_dst_idev(dst), ICMP6_MIB_RATELIMITHOST); else - icmp_global_consume(); + icmp_global_consume(net); dst_release(dst); return res; } From patchwork Thu Aug 29 14:46:41 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Dumazet X-Patchwork-Id: 13783343 X-Patchwork-Delegate: kuba@kernel.org Received: from mail-yb1-f202.google.com (mail-yb1-f202.google.com [209.85.219.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B43E71B1418 for ; Thu, 29 Aug 2024 14:46:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724942813; cv=none; 
b=q+lYmd3466usWyweoM4VmM62Nb46R0DXjpUkycxw5pWpu/E5sBhBbUVtnuj3VWv8kMbQ5eEYuafbFTLFkhiNN7V8X6lwlnXKORYhuCVDhCzeqoG2bCzGg/Me7ImCgAbQebwz5DHlJKppN1eFOCiItHP+eZLjcUeOyZTRCKEqBB8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724942813; c=relaxed/simple; bh=4OE2aGBgG2nZ7aIpfhhwLjWANwH7jOqUo5bpUyTU3rQ=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=FC9d0b0ngi2wmSaIXOfhn5MAOQRDOste83PP5N9GBOwBh+Cz+RkWVDUFhzpGZsjjRkphWnM7IVR6wtPG+2I+sMIFvK8+hd3Z9PPDRdSMhZjNRe/g8CB2QZMyj3GANK9VpJ72q1Do84bvOXTchxxU0dtRQaMEoSd+/KUTrTvXnXk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--edumazet.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=p7avtr9t; arc=none smtp.client-ip=209.85.219.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--edumazet.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="p7avtr9t" Received: by mail-yb1-f202.google.com with SMTP id 3f1490d57ef6-e1166ecfacdso1383223276.1 for ; Thu, 29 Aug 2024 07:46:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1724942811; x=1725547611; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=ySiNPSOW6cMUAhrPmgMi6C/tYAcCH+NJfoKW7cgN0cA=; b=p7avtr9tWv4zsCOw7ohVVYLVDfCwMMbGreieGlc4T3/C5MD7YOcWszb647mzutylP0 DGRLnSPwk3m4NUOHwPNsagW3bCBpfjLwwpwzHeRwJAabMKPorBxstLiNYqC1Aas6rtDe nLlxXfYhEx0aBQp5Sjte36IUcKPbboageVXBj9XrT+rfohlHunQj3MSel1MhgArBeeCa u2nGXOXI/yduq1yHDfgPzRLcuURmlKirmyz/Agj+IQnmH3Gg56l/Lmdo24qjbJOvMgPw 
0K35AHQD0QPbzVX+7sQGtlmO4vsCcD1O5zYEpJyn8bl1aU99LkwoVhcxumczSncUE0t7 81QA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1724942811; x=1725547611; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ySiNPSOW6cMUAhrPmgMi6C/tYAcCH+NJfoKW7cgN0cA=; b=md3gLYP7rAQOMWpwXaVWN1P5OHpk2I5baPbEOlkgpaPsx9O+cNnkU6OVrQXjTXt2X2 edW4s03vYz9AR1swbYBqPZytPtnAAElx7oMPjxGKNhrBpM9zzoI3vG26Xhl6NrR8IVtP zO/luVGTHjruzcp0N1o2bB0sb3wKe0CfCKCJC+qIQ5AeiLavOcXkM5t2/Uw1uQYO37ch ZEXKjEPZh/Im59RMaRhbloS66w/S4z94nkKf/3Owc3JL4txLNfJoYBPPwo9wvNR0ZZwJ xeuZacLLbR2fXkkSi0ogEeRfhYF01lkytU/HouP0OKU6gqPGHxKEKR41hSYaCEOEnz1+ ZkOA== X-Forwarded-Encrypted: i=1; AJvYcCVeScUZBVdtMdMhvE5/lg58flRqOhSfDoAdE3cKMONksInAJZFqMbiWbrN4nz/qLKA6oBvc+CM=@vger.kernel.org X-Gm-Message-State: AOJu0YzezfOEI9mHrUiFi+WgzMvf8wjAgXfJ3M/4CBRELb7sOLgCGEU2 mcGXri38wO1MNcPrjblirlb9uUTXhGPPcwD5RdG6QXgB9cFSbVSRtvbXknIxOhbdx/qupaQz4v1 fx4Csuwj5Aw== X-Google-Smtp-Source: AGHT+IEZtSb9paxuNqGsZ4wCM7Ec2erg4Qnd3tx7UBrbQC6MRGSNED1oLMVZN/Yk1hYHWOzJTR4m0lXPe7W0gQ== X-Received: from edumazet1.c.googlers.com ([fda3:e722:ac3:cc00:2b:7d90:c0a8:395a]) (user=edumazet job=sendgmr) by 2002:a25:b205:0:b0:e0b:ab63:b9c7 with SMTP id 3f1490d57ef6-e1a5ae0a289mr5002276.7.1724942810396; Thu, 29 Aug 2024 07:46:50 -0700 (PDT) Date: Thu, 29 Aug 2024 14:46:41 +0000 In-Reply-To: <20240829144641.3880376-1-edumazet@google.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20240829144641.3880376-1-edumazet@google.com> X-Mailer: git-send-email 2.46.0.295.g3b9ea8a38a-goog Message-ID: <20240829144641.3880376-4-edumazet@google.com> Subject: [PATCH v2 net-next 3/3] icmp: icmp_msgs_per_sec and icmp_msgs_burst sysctls become per netns From: Eric Dumazet To: "David S . 
Miller" , Jakub Kicinski , Paolo Abeni Cc: David Ahern , Willy Tarreau , Keyu Man , Jesper Dangaard Brouer , netdev@vger.kernel.org, eric.dumazet@gmail.com, Eric Dumazet X-Patchwork-Delegate: kuba@kernel.org Previous patch made ICMP rate limits per netns, it makes sense to allow each netns to change the associated sysctl. Signed-off-by: Eric Dumazet Reviewed-by: David Ahern --- include/net/ip.h | 3 --- include/net/netns/ipv4.h | 2 ++ net/ipv4/icmp.c | 9 ++++----- net/ipv4/sysctl_net_ipv4.c | 32 ++++++++++++++++---------------- 4 files changed, 22 insertions(+), 24 deletions(-) diff --git a/include/net/ip.h b/include/net/ip.h index d3bca4e83979f681c4931e9ff62db5941a059c11..1ee472fa8b373e85907146f9a3f29ecc98e2e55b 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -797,9 +797,6 @@ static inline void ip_cmsg_recv(struct msghdr *msg, struct sk_buff *skb) bool icmp_global_allow(struct net *net); void icmp_global_consume(struct net *net); -extern int sysctl_icmp_msgs_per_sec; -extern int sysctl_icmp_msgs_burst; - #ifdef CONFIG_PROC_FS int ip_misc_proc_init(void); #endif diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h index 54fe7c079fffb285b7a8a069f3d57f9440a6655a..276f622f3516871c438be27bafe61c039445b335 100644 --- a/include/net/netns/ipv4.h +++ b/include/net/netns/ipv4.h @@ -122,6 +122,8 @@ struct netns_ipv4 { u8 sysctl_icmp_errors_use_inbound_ifaddr; int sysctl_icmp_ratelimit; int sysctl_icmp_ratemask; + int sysctl_icmp_msgs_per_sec; + int sysctl_icmp_msgs_burst; atomic_t icmp_global_credit; u32 icmp_global_stamp; u32 ip_rt_min_pmtu; diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c index 2e1d81dbdbb6fe93ea53398bbe3b3627b35852b0..1ed88883e1f2579c875f4e0769789dc2e0c6e15a 100644 --- a/net/ipv4/icmp.c +++ b/net/ipv4/icmp.c @@ -220,9 +220,6 @@ static inline void icmp_xmit_unlock(struct sock *sk) spin_unlock(&sk->sk_lock.slock); } -int sysctl_icmp_msgs_per_sec __read_mostly = 1000; -int sysctl_icmp_msgs_burst __read_mostly = 50; - /** * icmp_global_allow 
- Are we allowed to send one more ICMP message ? * @net: network namespace @@ -249,14 +246,14 @@ bool icmp_global_allow(struct net *net) if (delta < HZ / 50) return false; - incr = READ_ONCE(sysctl_icmp_msgs_per_sec) * delta / HZ; + incr = READ_ONCE(net->ipv4.sysctl_icmp_msgs_per_sec) * delta / HZ; if (!incr) return false; if (cmpxchg(&net->ipv4.icmp_global_stamp, oldstamp, now) == oldstamp) { old = atomic_read(&net->ipv4.icmp_global_credit); do { - new = min(old + incr, READ_ONCE(sysctl_icmp_msgs_burst)); + new = min(old + incr, READ_ONCE(net->ipv4.sysctl_icmp_msgs_burst)); } while (!atomic_try_cmpxchg(&net->ipv4.icmp_global_credit, &old, new)); } return true; @@ -1492,6 +1489,8 @@ static int __net_init icmp_sk_init(struct net *net) net->ipv4.sysctl_icmp_ratelimit = 1 * HZ; net->ipv4.sysctl_icmp_ratemask = 0x1818; net->ipv4.sysctl_icmp_errors_use_inbound_ifaddr = 0; + net->ipv4.sysctl_icmp_msgs_per_sec = 1000; + net->ipv4.sysctl_icmp_msgs_burst = 50; return 0; } diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c index 4af0c234d8d763f430608d60f38eff8a6d9935b4..a79b2a52ce01e6c1a1257ba31c17ac2f51ba19ec 100644 --- a/net/ipv4/sysctl_net_ipv4.c +++ b/net/ipv4/sysctl_net_ipv4.c @@ -600,22 +600,6 @@ static struct ctl_table ipv4_table[] = { .mode = 0444, .proc_handler = proc_tcp_available_ulp, }, - { - .procname = "icmp_msgs_per_sec", - .data = &sysctl_icmp_msgs_per_sec, - .maxlen = sizeof(int), - .mode = 0644, - .proc_handler = proc_dointvec_minmax, - .extra1 = SYSCTL_ZERO, - }, - { - .procname = "icmp_msgs_burst", - .data = &sysctl_icmp_msgs_burst, - .maxlen = sizeof(int), - .mode = 0644, - .proc_handler = proc_dointvec_minmax, - .extra1 = SYSCTL_ZERO, - }, { .procname = "udp_mem", .data = &sysctl_udp_mem, @@ -701,6 +685,22 @@ static struct ctl_table ipv4_net_table[] = { .mode = 0644, .proc_handler = proc_dointvec }, + { + .procname = "icmp_msgs_per_sec", + .data = &init_net.ipv4.sysctl_icmp_msgs_per_sec, + .maxlen = sizeof(int), + .mode = 0644, + 
.proc_handler = proc_dointvec_minmax, + .extra1 = SYSCTL_ZERO, + }, + { + .procname = "icmp_msgs_burst", + .data = &init_net.ipv4.sysctl_icmp_msgs_burst, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec_minmax, + .extra1 = SYSCTL_ZERO, + }, { .procname = "ping_group_range", .data = &init_net.ipv4.ping_group_range.range,