From patchwork Fri Mar 21 04:00:38 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 14024854 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-2101.amazon.com (smtp-fw-2101.amazon.com [72.21.196.25]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 58E7EA944 for ; Fri, 21 Mar 2025 04:02:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=72.21.196.25 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529740; cv=none; b=ZjnVvhEjkiDig8fLQ3TU2WBE5H1HaxQ7KCtg+CRcpsgtMpqRd7QASHvA5rcUYTNnYy1bf1N5sL3AxF0SuXNzA9XTN+Ck97aLKfR9pyxxdbw39a93ZelxsNpUXuP3GMH2ucvUgX4uS8b67bICbnADHy59blJLVHGwOFcqI7v0YiI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529740; c=relaxed/simple; bh=zOmgCjvZBt5ZzBkOZ7fA6PRfE2HLKr2oqn15B9HdFZg=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=rXTcrcGAFuWftLWS3Hi10yOr4IHhA32spYc9qRSPdNRz/zjI7TnTpG6JfNomDXtfIBkWFukhHkKFV9cPMUvxZd/8L2AuD1wcgV4aoNEFb1ucGD1C1F7v/E0ZdrPO6WrUUCFrCPuPlYuOIZjAotYNEB8E3ut1MbIB77mcLGSspN0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=DTrJYJoh; arc=none smtp.client-ip=72.21.196.25 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="DTrJYJoh" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1742529738; x=1774065738; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=2O0b88XU64brEuURodIW/A2w6KldKC/6p6w0yBr5dgw=; b=DTrJYJohvfSKpFCjxV3cwgrvcdRgE4A/IUpAXTsJcx8hMoNu2JQFp2zZ wwv8GGZFwxE4N9B/+OrxSKQNQwzYCf+K/UYoFFttoZcC9oifwm49yO3n1 YIF10IotKMcZkPROz1zHcSqQbaDq2BF4gi7aKW+pF9YSp3HnYdmqZo6Qc 8=; X-IronPort-AV: E=Sophos;i="6.14,263,1736812800"; d="scan'208";a="476936461" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.6]) by smtp-border-fw-2101.iad2.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Mar 2025 04:02:13 +0000 Received: from EX19MTAUWA001.ant.amazon.com [10.0.21.151:28059] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.11.159:2525] with esmtp (Farcaster) id 808b003a-1a51-4439-8c94-875156348794; Fri, 21 Mar 2025 04:02:13 +0000 (UTC) X-Farcaster-Flow-ID: 808b003a-1a51-4439-8c94-875156348794 Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWA001.ant.amazon.com (10.250.64.218) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:02:08 +0000 Received: from 6c7e67bfbae3.amazon.com (10.187.170.63) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:02:05 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , David Ahern , Eric Dumazet , Jakub Kicinski , "Paolo Abeni" CC: Simon Horman , Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v1 net-next 01/13] ipv6: Validate RTA_GATEWAY of RTA_MULTIPATH in rtm_to_fib6_config(). Date: Thu, 20 Mar 2025 21:00:38 -0700 Message-ID: <20250321040131.21057-2-kuniyu@amazon.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250321040131.21057-1-kuniyu@amazon.com> References: <20250321040131.21057-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D042UWB003.ant.amazon.com (10.13.139.135) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org We will perform RTM_NEWROUTE and RTM_DELROUTE under RCU, and then we want to perform some validation out of the RCU scope. When creating / removing an IPv6 route with RTA_MULTIPATH, inet6_rtm_newroute() / inet6_rtm_delroute() validates RTA_GATEWAY in each multipath entry. Let's do that in rtm_to_fib6_config(). Note that now RTM_DELROUTE returns an error for RTA_MULTIPATH with 0 entries, which was accepted but should result in -EINVAL as RTM_ADDROUTE. Signed-off-by: Kuniyuki Iwashima --- net/ipv6/route.c | 82 +++++++++++++++++++++++++----------------------- 1 file changed, 43 insertions(+), 39 deletions(-) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index c3406a0d45bd..6c9c99fb02fe 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -5016,6 +5016,44 @@ static const struct nla_policy rtm_ipv6_policy[RTA_MAX+1] = { [RTA_FLOWLABEL] = { .type = NLA_BE32 }, }; +static int rtm_to_fib6_multipath_config(struct fib6_config *cfg, + struct netlink_ext_ack *extack) +{ + struct rtnexthop *rtnh; + int remaining; + + remaining = cfg->fc_mp_len; + rtnh = (struct rtnexthop *)cfg->fc_mp; + + if (!rtnh_ok(rtnh, remaining)) { + NL_SET_ERR_MSG(extack, "Invalid nexthop configuration - no valid nexthops"); + return -EINVAL; + } + + do { + int attrlen = rtnh_attrlen(rtnh); + + if (attrlen > 0) { + struct nlattr *nla, *attrs; + + attrs = rtnh_attrs(rtnh); + nla = nla_find(attrs, attrlen, RTA_GATEWAY); + if (nla) { + if (nla_len(nla) < sizeof(cfg->fc_gateway)) { + NL_SET_ERR_MSG(extack, + "Invalid IPv6 address in RTA_GATEWAY"); + return -EINVAL; + } + } + } + + rtnh = rtnh_next(rtnh, &remaining); + } while (rtnh_ok(rtnh, remaining)); + + return lwtunnel_valid_encap_type_attr(cfg->fc_mp, cfg->fc_mp_len, + extack, true); +} + static int rtm_to_fib6_config(struct sk_buff *skb, struct nlmsghdr *nlh, struct fib6_config *cfg, struct netlink_ext_ack *extack) @@ -5130,9 +5168,7 @@ static int rtm_to_fib6_config(struct sk_buff *skb, struct nlmsghdr *nlh, cfg->fc_mp = nla_data(tb[RTA_MULTIPATH]); cfg->fc_mp_len = nla_len(tb[RTA_MULTIPATH]); - err = lwtunnel_valid_encap_type_attr(cfg->fc_mp, - cfg->fc_mp_len, - extack, true); + err = rtm_to_fib6_multipath_config(cfg, extack); if (err < 0) goto errout; } @@ -5252,19 +5288,6 @@ static bool ip6_route_mpath_should_notify(const struct fib6_info *rt) return should_notify; } -static int fib6_gw_from_attr(struct in6_addr *gw, struct nlattr *nla, - struct netlink_ext_ack *extack) -{ - if (nla_len(nla) < sizeof(*gw)) { - NL_SET_ERR_MSG(extack, "Invalid IPv6 address in RTA_GATEWAY"); - return -EINVAL; - } - - *gw = nla_get_in6_addr(nla); - - return 0; -} - static int ip6_route_multipath_add(struct fib6_config *cfg, struct netlink_ext_ack *extack) { @@ -5305,18 +5328,11 @@ static int ip6_route_multipath_add(struct fib6_config *cfg, nla = nla_find(attrs, attrlen, RTA_GATEWAY); if (nla) { - err = fib6_gw_from_attr(&r_cfg.fc_gateway, nla, - extack); - if (err) - goto cleanup; - + r_cfg.fc_gateway = nla_get_in6_addr(nla); r_cfg.fc_flags |= RTF_GATEWAY; } - r_cfg.fc_encap = nla_find(attrs, attrlen, RTA_ENCAP); - /* RTA_ENCAP_TYPE length checked in - * lwtunnel_valid_encap_type_attr - */ + r_cfg.fc_encap = nla_find(attrs, attrlen, RTA_ENCAP); nla = nla_find(attrs, attrlen, RTA_ENCAP_TYPE); if (nla) r_cfg.fc_encap_type = nla_get_u16(nla); @@ -5349,12 +5365,6 @@ static int ip6_route_multipath_add(struct fib6_config *cfg, rtnh = rtnh_next(rtnh, &remaining); } - if (list_empty(&rt6_nh_list)) { - NL_SET_ERR_MSG(extack, - "Invalid nexthop configuration - no valid nexthops"); - return -EINVAL; - } - /* for add and replace send one notification with all nexthops. * Skip the notification in fib6_add_rt2node and send one with * the full route when done @@ -5476,21 +5486,15 @@ static int ip6_route_multipath_del(struct fib6_config *cfg, nla = nla_find(attrs, attrlen, RTA_GATEWAY); if (nla) { - err = fib6_gw_from_attr(&r_cfg.fc_gateway, nla, - extack); - if (err) { - last_err = err; - goto next_rtnh; - } - + r_cfg.fc_gateway = nla_get_in6_addr(nla); r_cfg.fc_flags |= RTF_GATEWAY; } } + err = ip6_route_del(&r_cfg, extack); if (err) last_err = err; -next_rtnh: rtnh = rtnh_next(rtnh, &remaining); } From patchwork Fri Mar 21 04:00:39 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 14024855 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-80008.amazon.com (smtp-fw-80008.amazon.com [99.78.197.219]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9EF211ADC93 for ; Fri, 21 Mar 2025 04:02:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=99.78.197.219 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529757; cv=none; b=BZKE0A9ewnBM5mgnBDKFwM2yFnER/VjRGQKzMbYky8IUHPCpeeqGQRrIiYSPhwtjyfC3uweEB/Rq9KmY69mJyEuqU3/tUwch9br+7grlh/LqK3x7CIXEKcSl0zADm8Q3ilbQxqjR4ICEYuSw5MEfOHbOYrSIt3DcabTDGetGPzk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529757; c=relaxed/simple; bh=cRTUSyZYAk+3+JIFs8OUszyPgUftTF6JYUwFks1+XFk=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=oHd165qRVXNapZZCfDBQlbnKUFwcF5dqXmazr9tKdNQ2n8luy+0J45i+lf2D+ewmsm7GKXcITXFkXaNhKwbaqitEtVLcBYReEiJjT8y7XeBqiZnRgyu7aeWVZFd0p6gNfEOJXNFYwl7ZcClwGC9Td8lyvWyF6L5p66yX3p2JTCY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=eiljMtWI; arc=none smtp.client-ip=99.78.197.219 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="eiljMtWI" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1742529755; x=1774065755; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=6yVfca1C4xlVwk36jd4PEpNGlfkB96iV0sOBWQ/2wNQ=; b=eiljMtWI72hg7paYqW3/1wSdOqfEvJmfHiICDbc51In+aUzY+/lV3ZKQ JYXKaRiNuwAh75WHhwVdZXeyNlBnezZyHpmev1A2coXyEk20oWU1cZ97e pKaOzAJjJL8Znxe8bqdIZ1KPM1pJYju2/Sae9pHBKmikoTGD5V5PFtciX I=; X-IronPort-AV: E=Sophos;i="6.14,263,1736812800"; d="scan'208";a="180556352" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.25.36.214]) by smtp-border-fw-80008.pdx80.corp.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Mar 2025 04:02:32 +0000 Received: from EX19MTAUWA002.ant.amazon.com [10.0.21.151:55471] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.43.118:2525] with esmtp (Farcaster) id f7affa5a-6777-40d6-b34e-30a438f97822; Fri, 21 Mar 2025 04:02:32 +0000 (UTC) X-Farcaster-Flow-ID: f7affa5a-6777-40d6-b34e-30a438f97822 Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWA002.ant.amazon.com (10.250.64.202) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:02:32 +0000 Received: from 6c7e67bfbae3.amazon.com (10.187.170.63) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:02:29 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , David Ahern , Eric Dumazet , Jakub Kicinski , "Paolo Abeni" CC: Simon Horman , Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v1 net-next 02/13] ipv6: Get rid of RTNL for SIOCDELRT and RTM_DELROUTE. Date: Thu, 20 Mar 2025 21:00:39 -0700 Message-ID: <20250321040131.21057-3-kuniyu@amazon.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250321040131.21057-1-kuniyu@amazon.com> References: <20250321040131.21057-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D032UWA002.ant.amazon.com (10.13.139.81) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org Basically, removing an IPv6 route does not require RTNL because the IPv6 routing tables are protected by per table lock. Also, ip6_route_del() already relies on RCU and the table lock. inet6_rtm_delroute() calls nexthop_find_by_id() to check if the nexthop specified by RTA_NH_ID exists. nexthop uses rbtree and the top-down walk can be safely performed under RCU. Let's call nexthop_find_by_id() under RCU and get rid of RTNL from inet6_rtm_delroute() and SIOCDELRT. Even if the nexthop is removed after rcu_read_unlock() in inet6_rtm_delroute(), __remove_nexthop_fib() cleans up the routes tied to the nexthop, and ip6_route_del() returns -ESRCH. So the request was at least valid as of nexthop_find_by_id(), and it's just a matter of timing. Note that we need to pass false to lwtunnel_valid_encap_type_attr(). The following patches also use the newroute bool. Signed-off-by: Kuniyuki Iwashima --- net/ipv6/route.c | 36 ++++++++++++++++++++++-------------- 1 file changed, 22 insertions(+), 14 deletions(-) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 6c9c99fb02fe..b737b242079e 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -4482,19 +4482,20 @@ int ipv6_route_ioctl(struct net *net, unsigned int cmd, struct in6_rtmsg *rtmsg) rtmsg_to_fib6_config(net, rtmsg, &cfg); - rtnl_lock(); switch (cmd) { case SIOCADDRT: + rtnl_lock(); /* Only do the default setting of fc_metric in route adding */ if (cfg.fc_metric == 0) cfg.fc_metric = IP6_RT_PRIO_USER; err = ip6_route_add(&cfg, GFP_KERNEL, NULL); + rtnl_unlock(); break; case SIOCDELRT: err = ip6_route_del(&cfg, NULL); break; } - rtnl_unlock(); + return err; } @@ -5017,7 +5018,8 @@ static const struct nla_policy rtm_ipv6_policy[RTA_MAX+1] = { }; static int rtm_to_fib6_multipath_config(struct fib6_config *cfg, - struct netlink_ext_ack *extack) + struct netlink_ext_ack *extack, + bool newroute) { struct rtnexthop *rtnh; int remaining; @@ -5051,15 +5053,16 @@ static int rtm_to_fib6_multipath_config(struct fib6_config *cfg, } while (rtnh_ok(rtnh, remaining)); return lwtunnel_valid_encap_type_attr(cfg->fc_mp, cfg->fc_mp_len, - extack, true); + extack, newroute); } static int rtm_to_fib6_config(struct sk_buff *skb, struct nlmsghdr *nlh, struct fib6_config *cfg, struct netlink_ext_ack *extack) { - struct rtmsg *rtm; + bool newroute = nlh->nlmsg_type == RTM_NEWROUTE; struct nlattr *tb[RTA_MAX+1]; + struct rtmsg *rtm; unsigned int pref; int err; @@ -5168,7 +5171,7 @@ static int rtm_to_fib6_config(struct sk_buff *skb, struct nlmsghdr *nlh, cfg->fc_mp = nla_data(tb[RTA_MULTIPATH]); cfg->fc_mp_len = nla_len(tb[RTA_MULTIPATH]); - err = rtm_to_fib6_multipath_config(cfg, extack); + err = rtm_to_fib6_multipath_config(cfg, extack, newroute); if (err < 0) goto errout; } @@ -5188,7 +5191,7 @@ static int rtm_to_fib6_config(struct sk_buff *skb, struct nlmsghdr *nlh, cfg->fc_encap_type = nla_get_u16(tb[RTA_ENCAP_TYPE]); err = lwtunnel_valid_encap_type(cfg->fc_encap_type, - extack, true); + extack, newroute); if (err < 0) goto errout; } @@ -5511,15 +5514,20 @@ static int inet6_rtm_delroute(struct sk_buff *skb, struct nlmsghdr *nlh, if (err < 0) return err; - if (cfg.fc_nh_id && - !nexthop_find_by_id(sock_net(skb->sk), cfg.fc_nh_id)) { - NL_SET_ERR_MSG(extack, "Nexthop id does not exist"); - return -EINVAL; + if (cfg.fc_nh_id) { + rcu_read_lock(); + err = !nexthop_find_by_id(sock_net(skb->sk), cfg.fc_nh_id); + rcu_read_unlock(); + + if (err) { + NL_SET_ERR_MSG(extack, "Nexthop id does not exist"); + return -EINVAL; + } } - if (cfg.fc_mp) + if (cfg.fc_mp) { return ip6_route_multipath_del(&cfg, extack); - else { + } else { cfg.fc_delete_all_nh = 1; return ip6_route_del(&cfg, extack); } @@ -6731,7 +6739,7 @@ static const struct rtnl_msg_handler ip6_route_rtnl_msg_handlers[] __initconst_o {.owner = THIS_MODULE, .protocol = PF_INET6, .msgtype = RTM_NEWROUTE, .doit = inet6_rtm_newroute}, {.owner = THIS_MODULE, .protocol = PF_INET6, .msgtype = RTM_DELROUTE, - .doit = inet6_rtm_delroute}, + .doit = inet6_rtm_delroute, .flags = RTNL_FLAG_DOIT_UNLOCKED}, {.owner = THIS_MODULE, .protocol = PF_INET6, .msgtype = RTM_GETROUTE, .doit = inet6_rtm_getroute, .flags = RTNL_FLAG_DOIT_UNLOCKED}, }; From patchwork Fri Mar 21 04:00:40 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 14024856 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-2101.amazon.com (smtp-fw-2101.amazon.com [72.21.196.25]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D7CF24120B for ; Fri, 21 Mar 2025 04:03:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=72.21.196.25 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529797; cv=none; b=IQj4yqyOnOL14hoWVOZgTLuj8Iw65JG1pJ5WVojKMzwu3tVm3WBOyzTmC8HbUh6ZdQza3YdUPYHsEvn7BnAGlENZG9NsyH4dgAAKl0pH9n077dZiv4Nd7D6yZBj7u1c5dGEnDWIJF8j40IOCMr60tte2KLLK+yP5qEuI84xt5L0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529797; c=relaxed/simple; bh=RcWCvHpoMEJJHHvX+Oz71g1ZJP3SmzLbpxqOOjg7SoE=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=KTFPbjb2G8duFMxUPrQ0HvDksxud/D9Q5pAUpcFcv7pRxHxcA8qF31AptDI3s7cXSA5QiysP+UbR3r8xuAvQQiGqG33+rXbD370pZXqh8xnd+D7vhMBmL7vMSZmTER9NvVMuHUl7eUXelrbVCB7nrC7zEyP8k3X3dIZyLFo7gJ8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=jgRJn1JO; arc=none smtp.client-ip=72.21.196.25 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="jgRJn1JO" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1742529796; x=1774065796; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=pfWIdj711fkU+5EjnPZQaaBeFCWsqeSyVcvh4Xeuay4=; b=jgRJn1JOKoNKiowC3F+C/8dAw2xLZcYJO1eOIULvyzW9WZ73SggiYhkN Q1bVXXAZLnH4igMiccunTXf/9ao+LP3W1nto1Nn7b2Dz1veTpLHFE9xHC Oqu2Gi7kkSZWTGKnfSTJJd32Kup0heXAR3dn12s/vkPZsKKrfjhSGc94n M=; X-IronPort-AV: E=Sophos;i="6.14,263,1736812800"; d="scan'208";a="476936669" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.6]) by smtp-border-fw-2101.iad2.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Mar 2025 04:03:14 +0000 Received: from EX19MTAUWC002.ant.amazon.com [10.0.38.20:34886] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.1.69:2525] with esmtp (Farcaster) id 42756a47-8755-43a6-91d1-e81407a1a9d9; Fri, 21 Mar 2025 04:02:58 +0000 (UTC) X-Farcaster-Flow-ID: 42756a47-8755-43a6-91d1-e81407a1a9d9 Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWC002.ant.amazon.com (10.250.64.143) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:02:56 +0000 Received: from 6c7e67bfbae3.amazon.com (10.187.170.63) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:02:53 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , David Ahern , Eric Dumazet , Jakub Kicinski , "Paolo Abeni" CC: Simon Horman , Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v1 net-next 03/13] ipv6: Move some validation from ip6_route_info_create() to rtm_to_fib6_config(). Date: Thu, 20 Mar 2025 21:00:40 -0700 Message-ID: <20250321040131.21057-4-kuniyu@amazon.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250321040131.21057-1-kuniyu@amazon.com> References: <20250321040131.21057-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D044UWA003.ant.amazon.com (10.13.139.43) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org ip6_route_info_create() is called from 3 functions: * ip6_route_add() * ip6_route_multipath_add() * addrconf_f6i_alloc() addrconf_f6i_alloc() does not need validation for struct fib6_config in ip6_route_info_create(). ip6_route_multipath_add() calls ip6_route_info_create() for multiple routes with slightly different fib6_config instances, which is copied from the base config passed from userspace. So, we need not validate the same config repeatedly. Let's move such validation into rtm_to_fib6_config(). Signed-off-by: Kuniyuki Iwashima --- net/ipv6/route.c | 79 +++++++++++++++++++++++++----------------------- 1 file changed, 42 insertions(+), 37 deletions(-) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index b737b242079e..baad02c099ff 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -3705,38 +3705,6 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg, int err = -EINVAL; int addr_type; - /* RTF_PCPU is an internal flag; can not be set by userspace */ - if (cfg->fc_flags & RTF_PCPU) { - NL_SET_ERR_MSG(extack, "Userspace can not set RTF_PCPU"); - goto out; - } - - /* RTF_CACHE is an internal flag; can not be set by userspace */ - if (cfg->fc_flags & RTF_CACHE) { - NL_SET_ERR_MSG(extack, "Userspace can not set RTF_CACHE"); - goto out; - } - - if (cfg->fc_type > RTN_MAX) { - NL_SET_ERR_MSG(extack, "Invalid route type"); - goto out; - } - - if (cfg->fc_dst_len > 128) { - NL_SET_ERR_MSG(extack, "Invalid prefix length"); - goto out; - } - if (cfg->fc_src_len > 128) { - NL_SET_ERR_MSG(extack, "Invalid source address length"); - goto out; - } -#ifndef CONFIG_IPV6_SUBTREES - if (cfg->fc_src_len) { - NL_SET_ERR_MSG(extack, - "Specifying source address requires IPV6_SUBTREES to be enabled"); - goto out; - } -#endif if (cfg->fc_nh_id) { nh = nexthop_find_by_id(net, cfg->fc_nh_id); if (!nh) { @@ -3801,11 +3769,6 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg, rt->fib6_src.plen = cfg->fc_src_len; #endif if (nh) { - if (rt->fib6_src.plen) { - NL_SET_ERR_MSG(extack, "Nexthops can not be used with source routing"); - err = -EINVAL; - goto out_free; - } if (!nexthop_get(nh)) { NL_SET_ERR_MSG(extack, "Nexthop has been deleted"); err = -ENOENT; @@ -5205,6 +5168,48 @@ static int rtm_to_fib6_config(struct sk_buff *skb, struct nlmsghdr *nlh, } } + if (newroute) { + /* RTF_PCPU is an internal flag; can not be set by userspace */ + if (cfg->fc_flags & RTF_PCPU) { + NL_SET_ERR_MSG(extack, "Userspace can not set RTF_PCPU"); + goto errout; + } + + /* RTF_CACHE is an internal flag; can not be set by userspace */ + if (cfg->fc_flags & RTF_CACHE) { + NL_SET_ERR_MSG(extack, "Userspace can not set RTF_CACHE"); + goto errout; + } + + if (cfg->fc_type > RTN_MAX) { + NL_SET_ERR_MSG(extack, "Invalid route type"); + goto errout; + } + + if (cfg->fc_dst_len > 128) { + NL_SET_ERR_MSG(extack, "Invalid prefix length"); + goto errout; + } + +#ifdef CONFIG_IPV6_SUBTREES + if (cfg->fc_src_len > 128) { + NL_SET_ERR_MSG(extack, "Invalid source address length"); + goto errout; + } + + if (cfg->fc_nh_id && cfg->fc_src_len) { + NL_SET_ERR_MSG(extack, "Nexthops can not be used with source routing"); + goto errout; + } +#else + if (cfg->fc_src_len) { + NL_SET_ERR_MSG(extack, + "Specifying source address requires IPV6_SUBTREES to be enabled"); + goto errout; + } +#endif + } + err = 0; errout: return err; From patchwork Fri Mar 21 04:00:41 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 14024857 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-52003.amazon.com (smtp-fw-52003.amazon.com [52.119.213.152]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 34EA61E1A05 for ; Fri, 21 Mar 2025 04:03:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=52.119.213.152 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529811; cv=none; b=JhgnyBC6BL35pDcSUAod0V78ZHNU/fOS0RFKFZZwMR481R/m29y8p/q6j9kQJPsdsM5+8X99Ng9mrWAL90KsFtXiUsQv1KSnYxIHv48vndnIAEUvdlvcAbic5Lvq9pNrLyA67ovEcNlm8TRb9ko0fxX35tOif6GdOfk5xYIfAw4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529811; c=relaxed/simple; bh=UXA9CBqplU/HS69ocFV4a6LK2SdwZ7y67vzUFSenG2A=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=hbKwJW8wbccYPJh2FWRg3On4A85L9wcnaUp4CrsZvOytquBmGD5bj7ZEKFM1K1W1/nuSEDW5g8+yjHPo8ml3PSetTZ28uDQ1emgqX+fFsNHWSTrwN7bHJN+dfkyljN6ZbWJE3bvvxCKFEAaWK/+mj8cwrFlG8o7Hqv6czV2LLuQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=EeSwUkjG; arc=none smtp.client-ip=52.119.213.152 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="EeSwUkjG" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1742529808; x=1774065808; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=eLXQJQt1E/OlxCNmwZPh7n5wnCnzwnCe4AJgvdGlAFs=; b=EeSwUkjGaCb63Lbu/LdvkNopMAoEkGEql/gZG520xfvLKognxTSdtFvJ C8hY4bH+Bapo0cMSliSx6TSrV+ckdlGrlGppngJDxVo4CBZIqCCNhoOdU Pi2dV+pWGpFfTdUTPtH7/jDM4x+eMbXv0R/qfEFjTfcNBcOxYjVaK8hTY Y=; X-IronPort-AV: E=Sophos;i="6.14,263,1736812800"; d="scan'208";a="76327025" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.6]) by smtp-border-fw-52003.iad7.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Mar 2025 04:03:23 +0000 Received: from EX19MTAUWB001.ant.amazon.com [10.0.7.35:63606] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.56.45:2525] with esmtp (Farcaster) id 4859e9dc-8db7-437f-8baf-646906bf8eac; Fri, 21 Mar 2025 04:03:21 +0000 (UTC) X-Farcaster-Flow-ID: 4859e9dc-8db7-437f-8baf-646906bf8eac Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWB001.ant.amazon.com (10.250.64.248) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:03:20 +0000 Received: from 6c7e67bfbae3.amazon.com (10.187.170.63) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:03:18 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , David Ahern , Eric Dumazet , Jakub Kicinski , "Paolo Abeni" CC: Simon Horman , Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v1 net-next 04/13] ipv6: Check GATEWAY in rtm_to_fib6_multipath_config(). Date: Thu, 20 Mar 2025 21:00:41 -0700 Message-ID: <20250321040131.21057-5-kuniyu@amazon.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250321040131.21057-1-kuniyu@amazon.com> References: <20250321040131.21057-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D042UWA001.ant.amazon.com (10.13.139.92) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org In ip6_route_multipath_add(), we call rt6_qualify_for_ecmp() for each entry. If it returns false, the request fails. rt6_qualify_for_ecmp() returns false if either of the conditions below is true: 1. f6i->fib6_flags has RTF_ADDRCONF 2. f6i->nh is not NULL 3. f6i->fib6_nh->fib_nh_gw_family is AF_UNSPEC 1 is unnecessary because rtm_to_fib6_config() never sets RTF_ADDRCONF to cfg->fc_flags. 2. is equivalent with cfg->fc_nh_id. 3. can be replaced by checking RTF_GATEWAY in the base and each multipath entry because AF_INET6 is set to f6i->fib6_nh->fib_nh_gw_family only when cfg.fc_is_fdb is true or RTF_GATEWAY is set, but the former is always false. Let's perform the equivalent checks in rtm_to_fib6_multipath_config(). Signed-off-by: Kuniyuki Iwashima --- net/ipv6/route.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index baad02c099ff..b51793ee7a18 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -4996,6 +4996,7 @@ static int rtm_to_fib6_multipath_config(struct fib6_config *cfg, } do { + bool has_gateway = cfg->fc_flags & RTF_GATEWAY; int attrlen = rtnh_attrlen(rtnh); if (attrlen > 0) { @@ -5009,9 +5010,17 @@ static int rtm_to_fib6_multipath_config(struct fib6_config *cfg, "Invalid IPv6 address in RTA_GATEWAY"); return -EINVAL; } + + has_gateway = true; } } + if (newroute && (cfg->fc_nh_id || !has_gateway)) { + NL_SET_ERR_MSG(extack, + "Device only routes can not be added for IPv6 using the multipath API."); + return -EINVAL; + } + rtnh = rtnh_next(rtnh, &remaining); } while (rtnh_ok(rtnh, remaining)); @@ -5353,13 +5362,6 @@ static int ip6_route_multipath_add(struct fib6_config *cfg, rt = NULL; goto cleanup; } - if (!rt6_qualify_for_ecmp(rt)) { - err = -EINVAL; - NL_SET_ERR_MSG(extack, - "Device only routes can not be added for IPv6 using the multipath API."); - fib6_info_release(rt); - goto cleanup; - } rt->fib6_nh->fib_nh_weight = rtnh->rtnh_hops + 1; From patchwork Fri Mar 21 04:00:42 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 14024858 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-6002.amazon.com (smtp-fw-6002.amazon.com [52.95.49.90]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A06F24120B for ; Fri, 21 Mar 2025 04:03:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=52.95.49.90 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529832; cv=none; b=fSRbbDl8OD3TnghE80ZIssj6d+ZHEIzJHf5mLNKoixYRWQrwSMbN5lfI7JW2Neu3OAessHBZv2fMKecKjV0AFbyBp3pwmSEh3rnX1sOhA36xoyzkBsPCYmU8M0s0ezeyH53NBThvPax15k2G5y9gSweDMhYitXiqfuxBWCgUQbE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529832; c=relaxed/simple; bh=8WL2nNFggGT5UZjeMkRz7khz3bjWH5+JppVSgmsFIQU=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=eQ6e+f4fwFlMSQUtYbyxpRi6XCnckhppAChzW6FXEMyYFEmVucFApACKk7rZ0qQwN53rcTDiK/LiMj8NnaQnDwHT+kKW9x0ksldDGXnp+GXXY8PkAc5uOGvtI0GCiiTBXi8nSN2h4om7bj1MPgPsZQltzrRKSpUgChO4/4uzRx4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=PhnpGHDu; arc=none smtp.client-ip=52.95.49.90 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="PhnpGHDu" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1742529831; x=1774065831; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=frwI6ICwtOt071miKLjDgUuf1L9osFhvOZc6jsQwvik=; b=PhnpGHDuyKAlGgjtJqmMA88ETv4P17oibTdQsdZtZm2gEZmma9hsBmfq B9gnMRKUsub3fj/IboC8VUzoa59DsxELl/iQSssVGZwkVxaAjYXmbGQ5t EJWwiyd9HFShGxywWfiKXL7667dsAesIoeYZmjNwRlYGqWDyL3PHn+Tj5 A=; X-IronPort-AV: E=Sophos;i="6.14,263,1736812800"; d="scan'208";a="482285625" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.6]) by smtp-border-fw-6002.iad6.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Mar 2025 04:03:46 +0000 Received: from EX19MTAUWC001.ant.amazon.com [10.0.21.151:9256] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.43.118:2525] with esmtp (Farcaster) id e3733c30-cda6-4b1f-b490-b7659526514d; Fri, 21 Mar 2025 04:03:45 +0000 (UTC) X-Farcaster-Flow-ID: e3733c30-cda6-4b1f-b490-b7659526514d Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWC001.ant.amazon.com (10.250.64.174) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:03:44 +0000 Received: from 6c7e67bfbae3.amazon.com (10.187.170.63) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:03:42 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , David Ahern , Eric Dumazet , Jakub Kicinski , "Paolo Abeni" CC: Simon Horman , Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v1 net-next 05/13] ipv6: Move nexthop_find_by_id() after fib6_info_alloc(). Date: Thu, 20 Mar 2025 21:00:42 -0700 Message-ID: <20250321040131.21057-6-kuniyu@amazon.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250321040131.21057-1-kuniyu@amazon.com> References: <20250321040131.21057-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D035UWA001.ant.amazon.com (10.13.139.101) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org We will get rid of RTNL from RTM_NEWROUTE and SIOCADDRT. Then, we must perform two lookups for nexthop and dev under RCU to guarantee their lifetime. ip6_route_info_create() calls nexthop_find_by_id() first if RTA_NH_ID is specified, and then allocates struct fib6_info. nexthop_find_by_id() must be called under RCU, but we do not want to use GFP_ATOMIC for memory allocation here, which will be likely to fail in ip6_route_multipath_add(). Let's move nexthop_find_by_id() after the memory allocation. Signed-off-by: Kuniyuki Iwashima --- net/ipv6/route.c | 34 ++++++++++++++++++---------------- 1 file changed, 18 insertions(+), 16 deletions(-) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index b51793ee7a18..28d38282a19e 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -3699,24 +3699,11 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg, { struct net *net = cfg->fc_nlinfo.nl_net; struct fib6_info *rt = NULL; - struct nexthop *nh = NULL; struct fib6_table *table; struct fib6_nh *fib6_nh; - int err = -EINVAL; + int err = -ENOBUFS; int addr_type; - if (cfg->fc_nh_id) { - nh = nexthop_find_by_id(net, cfg->fc_nh_id); - if (!nh) { - NL_SET_ERR_MSG(extack, "Nexthop id does not exist"); - goto out; - } - err = fib6_check_nexthop(nh, cfg, extack); - if (err) - goto out; - } - - err = -ENOBUFS; if (cfg->fc_nlinfo.nlh && !(cfg->fc_nlinfo.nlh->nlmsg_flags & NLM_F_CREATE)) { table = fib6_get_table(net, cfg->fc_table); @@ -3732,7 +3719,7 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg, goto out; err = -ENOMEM; - rt = fib6_info_alloc(gfp_flags, !nh); + rt = fib6_info_alloc(gfp_flags, !cfg->fc_nh_id); if (!rt) goto out; @@ -3768,12 +3755,27 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg, ipv6_addr_prefix(&rt->fib6_src.addr, &cfg->fc_src, cfg->fc_src_len); rt->fib6_src.plen = cfg->fc_src_len; #endif - if (nh) { + + if (cfg->fc_nh_id) { + struct nexthop *nh; + + nh = nexthop_find_by_id(net, cfg->fc_nh_id); + if (!nh) { + err = -EINVAL; + NL_SET_ERR_MSG(extack, "Nexthop id does not exist"); + goto out_free; + } + + err = fib6_check_nexthop(nh, cfg, extack); + if (err) + goto out_free; + if (!nexthop_get(nh)) { NL_SET_ERR_MSG(extack, "Nexthop has been deleted"); err = -ENOENT; goto out_free; } + rt->nh = nh; fib6_nh = nexthop_fib6_nh(rt->nh); } else { From patchwork Fri Mar 21 04:00:43 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 14024859 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-9106.amazon.com (smtp-fw-9106.amazon.com [207.171.188.206]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 50B0F2A1CF for ; Fri, 21 Mar 2025 04:04:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=207.171.188.206 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529860; cv=none; b=nVroKJn3uy+iuqcjqaahTf4UbFzAyWyrwaXMypzwPFQ1qqocTwO4/bQPaD67DcZJRv4CAFwubH7P/wZAcpHYLKTQP/5Vn/YV4Q9qMIZibOr2f56nUddX+t0fw4ZZXwkA6eh9nks+T9RrUj1i5nmC2C9w32OvVXrnnOTWkeMl+eo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529860; c=relaxed/simple; bh=R2JQWFJmV6Jry29T494C8lLEdUiJ/3E6HgkoTsnnFdc=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Fy4KR8tI0WHFhklewXXPDADmo4t1PmrFvCPVqtktnuBuA9mlUlpZn0W/mBOM9GC8OAxksz+LgiAT2ewFdoHqYrZXcMN0vknS9DKdwU+Ht3LPshaiD3VAeiIewcsnJo9Jemkc25eOAK54VFgRTSoMElXWNdRScfHcBqG5TAtXkds= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=ZjRJAWAv; arc=none smtp.client-ip=207.171.188.206 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="ZjRJAWAv" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1742529859; x=1774065859; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=7bGTC8FQx814JSfear+LfikTvmHq3luuQDdcA6CG00M=; b=ZjRJAWAvFuhXATBHKQIPUjwkAHgHXzTBL67lj69fqsQKRQQx80rGPLiM LV8fNBZZOqgPRISuJgNh4shywP/6nHiX6JiAgcIY9+Fz81TB5nQqvgPv1 is5joRJQ9MXEq16N2Wkhy28ssZJyaYHQtp+LEj7P9jxCNovZX2niI5HBR o=; X-IronPort-AV: E=Sophos;i="6.14,263,1736812800"; d="scan'208";a="809303903" Received: from pdx4-co-svc-p1-lb2-vlan2.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.25.36.210]) by smtp-border-fw-9106.sea19.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Mar 2025 04:04:12 +0000 Received: from EX19MTAUWA001.ant.amazon.com [10.0.21.151:63979] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.29.36:2525] with esmtp (Farcaster) id 87847561-766b-4e6c-b801-48845bb865ac; Fri, 21 Mar 2025 04:04:12 +0000 (UTC) X-Farcaster-Flow-ID: 87847561-766b-4e6c-b801-48845bb865ac Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWA001.ant.amazon.com (10.250.64.218) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:04:08 +0000 Received: from 6c7e67bfbae3.amazon.com (10.187.170.63) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:04:06 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , David Ahern , Eric Dumazet , Jakub Kicinski , "Paolo Abeni" CC: Simon Horman , Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v1 net-next 06/13] ipv6: Split ip6_route_info_create(). Date: Thu, 20 Mar 2025 21:00:43 -0700 Message-ID: <20250321040131.21057-7-kuniyu@amazon.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250321040131.21057-1-kuniyu@amazon.com> References: <20250321040131.21057-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D041UWB004.ant.amazon.com (10.13.139.143) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org We will get rid of RTNL from RTM_NEWROUTE and SIOCADDRT. Then, we want to allocate everything as possible before entering the RCU section. The RCU section will start in the middle of ip6_route_info_create(), and this is problematic for ip6_route_multipath_add() that calls ip6_route_info_create() multiple times. Let's split ip6_route_info_create() into two parts; one for memory allocation and another for nexthop setup. Signed-off-by: Kuniyuki Iwashima --- net/ipv6/route.c | 95 +++++++++++++++++++++++++++++++----------------- 1 file changed, 62 insertions(+), 33 deletions(-) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 28d38282a19e..6299cfd9f12a 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -3694,15 +3694,13 @@ void fib6_nh_release_dsts(struct fib6_nh *fib6_nh) } static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg, - gfp_t gfp_flags, - struct netlink_ext_ack *extack) + gfp_t gfp_flags, + struct netlink_ext_ack *extack) { struct net *net = cfg->fc_nlinfo.nl_net; - struct fib6_info *rt = NULL; struct fib6_table *table; - struct fib6_nh *fib6_nh; - int err = -ENOBUFS; - int addr_type; + struct fib6_info *rt; + int err; if (cfg->fc_nlinfo.nlh && !(cfg->fc_nlinfo.nlh->nlmsg_flags & NLM_F_CREATE)) { @@ -3714,22 +3712,22 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg, } else { table = fib6_new_table(net, cfg->fc_table); } + if (!table) { + err = -ENOBUFS; + goto err; + } - if (!table) - goto out; - - err = -ENOMEM; rt = fib6_info_alloc(gfp_flags, !cfg->fc_nh_id); - if (!rt) - goto out; + if (!rt) { + err = -ENOMEM; + goto err; + } rt->fib6_metrics = ip_fib_metrics_init(cfg->fc_mx, cfg->fc_mx_len, extack); if (IS_ERR(rt->fib6_metrics)) { err = PTR_ERR(rt->fib6_metrics); - /* Do not leave garbage there. */ - rt->fib6_metrics = (struct dst_metrics *)&dst_default_metrics; - goto out_free; + goto free; } if (cfg->fc_flags & RTF_ADDRCONF) @@ -3737,12 +3735,12 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg, if (cfg->fc_flags & RTF_EXPIRES) fib6_set_expires(rt, jiffies + - clock_t_to_jiffies(cfg->fc_expires)); + clock_t_to_jiffies(cfg->fc_expires)); if (cfg->fc_protocol == RTPROT_UNSPEC) cfg->fc_protocol = RTPROT_BOOT; - rt->fib6_protocol = cfg->fc_protocol; + rt->fib6_protocol = cfg->fc_protocol; rt->fib6_table = table; rt->fib6_metric = cfg->fc_metric; rt->fib6_type = cfg->fc_type ? : RTN_UNICAST; @@ -3755,6 +3753,20 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg, ipv6_addr_prefix(&rt->fib6_src.addr, &cfg->fc_src, cfg->fc_src_len); rt->fib6_src.plen = cfg->fc_src_len; #endif + return rt; +free: + kfree(rt); +err: + return ERR_PTR(err); +} + +static int ip6_route_info_create_nh(struct fib6_info *rt, + struct fib6_config *cfg, + struct netlink_ext_ack *extack) +{ + struct net *net = cfg->fc_nlinfo.nl_net; + struct fib6_nh *fib6_nh; + int err; if (cfg->fc_nh_id) { struct nexthop *nh; @@ -3779,9 +3791,11 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg, rt->nh = nh; fib6_nh = nexthop_fib6_nh(rt->nh); } else { - err = fib6_nh_init(net, rt->fib6_nh, cfg, gfp_flags, extack); + int addr_type; + + err = fib6_nh_init(net, rt->fib6_nh, cfg, GFP_ATOMIC, extack); if (err) - goto out; + goto out_release; fib6_nh = rt->fib6_nh; @@ -3800,21 +3814,20 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg, if (!ipv6_chk_addr(net, &cfg->fc_prefsrc, dev, 0)) { NL_SET_ERR_MSG(extack, "Invalid source address"); err = -EINVAL; - goto out; + goto out_release; } rt->fib6_prefsrc.addr = cfg->fc_prefsrc; rt->fib6_prefsrc.plen = 128; - } else - rt->fib6_prefsrc.plen = 0; + } - return rt; -out: + return 0; +out_release: fib6_info_release(rt); - return ERR_PTR(err); + return err; out_free: ip_fib_metrics_put(rt->fib6_metrics); kfree(rt); - return ERR_PTR(err); + return err; } int ip6_route_add(struct fib6_config *cfg, gfp_t gfp_flags, @@ -3827,6 +3840,10 @@ int ip6_route_add(struct fib6_config *cfg, gfp_t gfp_flags, if (IS_ERR(rt)) return PTR_ERR(rt); + err = ip6_route_info_create_nh(rt, cfg, extack); + if (err) + return err; + err = __ip6_ins_rt(rt, &cfg->fc_nlinfo, extack); fib6_info_release(rt); @@ -4550,6 +4567,7 @@ struct fib6_info *addrconf_f6i_alloc(struct net *net, .fc_ignore_dev_down = true, }; struct fib6_info *f6i; + int err; if (anycast) { cfg.fc_type = RTN_ANYCAST; @@ -4560,14 +4578,19 @@ struct fib6_info *addrconf_f6i_alloc(struct net *net, } f6i = ip6_route_info_create(&cfg, gfp_flags, extack); - if (!IS_ERR(f6i)) { - f6i->dst_nocount = true; + if (IS_ERR(f6i)) + return f6i; - if (!anycast && - (READ_ONCE(net->ipv6.devconf_all->disable_policy) || - READ_ONCE(idev->cnf.disable_policy))) - f6i->dst_nopolicy = true; - } + err = ip6_route_info_create_nh(f6i, &cfg, extack); + if (err) + return ERR_PTR(err); + + f6i->dst_nocount = true; + + if (!anycast && + (READ_ONCE(net->ipv6.devconf_all->disable_policy) || + READ_ONCE(idev->cnf.disable_policy))) + f6i->dst_nopolicy = true; return f6i; } @@ -5365,6 +5388,12 @@ static int ip6_route_multipath_add(struct fib6_config *cfg, goto cleanup; } + err = ip6_route_info_create_nh(rt, &r_cfg, extack); + if (err) { + rt = NULL; + goto cleanup; + } + rt->fib6_nh->fib_nh_weight = rtnh->rtnh_hops + 1; err = ip6_route_info_append(info->nl_net, &rt6_nh_list, From patchwork Fri Mar 21 04:00:44 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 14024860 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-80008.amazon.com (smtp-fw-80008.amazon.com [99.78.197.219]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C16112A1CF for ; Fri, 21 Mar 2025 04:04:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=99.78.197.219 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529877; cv=none; b=h8v3meDuJ6mCynFvwLQTqUT0uCW7INXEaPUGxJcbmN+tTiIo+f6SqkeAgydwkawIOTREz/sc6crTStLM4Ky0n47vthBjTb0N8vffZKWIBGJRvGwbLiamgjcl/SrzQol4OT5P0AOnMzbq4rJJoP9gpW98KDJs++mNMP3FS/E9clk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529877; c=relaxed/simple; bh=xNU97WSD6rBM8gw4ESGlGIGTSxu7eB9eNjsE5CQFPAI=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=s6tDD1XZSRc6uGFWAwksYDC+XyRRIF0kFoLqj1Z15OUl9pC4N8GbimzRgud6DZXq+ktY90rRE0WPK/guB0Ag/5/p8ceUckaCq57gX4uxUWTLlnDvYL7O0KLQyiAr94Md7wAXymyHd0tNbVSNlk9Mz4Vrfrkk6THe9M8AMpsiQTc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=IevE6Hi5; arc=none smtp.client-ip=99.78.197.219 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="IevE6Hi5" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1742529875; x=1774065875; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=76glfmGHtRz+XzKIeLvL8CRAcmfZ5nNKYDKN0AMav8M=; b=IevE6Hi5LvKsQoaL+YM2qaJLlcoMYfGFRNlku8lKrI1mIFK+JBzQazXQ uDDZXbP5poCBI0gIFsipjCMVfv4oiK0jNhreFRpEo0psVqi1lCkJQMSXB TgoyGfs5egeeyXxUttpuYTAGv+iLnHqbOuEmzl+i4jRFXLXl12CUNvomz s=; X-IronPort-AV: E=Sophos;i="6.14,263,1736812800"; d="scan'208";a="180556804" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.25.36.214]) by smtp-border-fw-80008.pdx80.corp.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Mar 2025 04:04:33 +0000 Received: from EX19MTAUWC001.ant.amazon.com [10.0.21.151:46044] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.37.176:2525] with esmtp (Farcaster) id 3bc04f57-206e-4ccf-81cb-8d0fcf8a55ed; Fri, 21 Mar 2025 04:04:33 +0000 (UTC) X-Farcaster-Flow-ID: 3bc04f57-206e-4ccf-81cb-8d0fcf8a55ed Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWC001.ant.amazon.com (10.250.64.174) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:04:32 +0000 Received: from 6c7e67bfbae3.amazon.com (10.187.170.63) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:04:30 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , David Ahern , Eric Dumazet , Jakub Kicinski , "Paolo Abeni" CC: Simon Horman , Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v1 net-next 07/13] ipv6: Preallocate rt->fib6_nh->rt6i_pcpu in ip6_route_info_create(). Date: Thu, 20 Mar 2025 21:00:44 -0700 Message-ID: <20250321040131.21057-8-kuniyu@amazon.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250321040131.21057-1-kuniyu@amazon.com> References: <20250321040131.21057-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D031UWA003.ant.amazon.com (10.13.139.47) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org ip6_route_info_create_nh() will be called under RCU. Then, fib6_nh_init() is also under RCU, but per-cpu memory allocation is very likely to fail with GFP_ATOMIC while bluk-adding IPv6 routes and we will see a bunch of this message in dmesg. percpu: allocation failed, size=8 align=8 atomic=1, atomic alloc failed, no space left percpu: allocation failed, size=8 align=8 atomic=1, atomic alloc failed, no space left Let's preallocate rt->fib6_nh->rt6i_pcpu in ip6_route_info_create(). If something fails before the original memory allocation in fib6_nh_init(), ip6_route_info_create_nh() calls fib6_info_release(), which releases the preallocated per-cpu memory. Signed-off-by: Kuniyuki Iwashima --- net/ipv6/route.c | 25 ++++++++++++++++++++++--- 1 file changed, 22 insertions(+), 3 deletions(-) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 6299cfd9f12a..d50377131506 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -3630,10 +3630,12 @@ int fib6_nh_init(struct net *net, struct fib6_nh *fib6_nh, goto out; pcpu_alloc: - fib6_nh->rt6i_pcpu = alloc_percpu_gfp(struct rt6_info *, gfp_flags); if (!fib6_nh->rt6i_pcpu) { - err = -ENOMEM; - goto out; + fib6_nh->rt6i_pcpu = alloc_percpu_gfp(struct rt6_info *, gfp_flags); + if (!fib6_nh->rt6i_pcpu) { + err = -ENOMEM; + goto out; + } } fib6_nh->fib_nh_dev = dev; @@ -3693,6 +3695,15 @@ void fib6_nh_release_dsts(struct fib6_nh *fib6_nh) } } +static int fib6_nh_prealloc_percpu(struct fib6_nh *fib6_nh, gfp_t gfp_flags) +{ + fib6_nh->rt6i_pcpu = alloc_percpu_gfp(struct rt6_info *, gfp_flags); + if (!fib6_nh->rt6i_pcpu) + return -ENOMEM; + + return 0; +} + static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg, gfp_t gfp_flags, struct netlink_ext_ack *extack) @@ -3730,6 +3741,12 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg, goto free; } + if (!cfg->fc_nh_id) { + err = fib6_nh_prealloc_percpu(&rt->fib6_nh[0], gfp_flags); + if (err) + goto free_metrics; + } + if (cfg->fc_flags & RTF_ADDRCONF) rt->dst_nocount = true; @@ -3754,6 +3771,8 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg, rt->fib6_src.plen = cfg->fc_src_len; #endif return rt; +free_metrics: + ip_fib_metrics_put(rt->fib6_metrics); free: kfree(rt); err: From patchwork Fri Mar 21 04:00:45 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 14024861 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-6001.amazon.com (smtp-fw-6001.amazon.com [52.95.48.154]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3E5231E9B3D for ; Fri, 21 Mar 2025 04:05:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=52.95.48.154 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529904; cv=none; b=urB/I+F6AogVMd5BFvK3VmE9JeSY98BhhP4paZwZMYZAehRyaR4avnMHWnviUlQRFddHGecOG+TGSIEGKZQ9pqIbXlWKLJrxfvDrRvP8Mt79/Ym93Syn6pSCTs19rcGmfRGhrHtx20ilFQQI2IcRuVrv00HhvbbKw8B48Zgi/mw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529904; c=relaxed/simple; bh=UhzxaBOiYHDOkINpRy36jejUgyjbSgvr0f7OLP5wuaE=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=E0xiyi2f0Fo+ZYhJq0CWHZLaVXBcwBvnfNP+R4gzpXZyvohyju42LG3TRe6rVvyixqpPijyr6bib5sYgyCzd5lHIrusocUQxi5TFzTe6qucCGIHiBFG2F96381XXXWpgHjssl+PVvwiICc2qyvIjYqg7g7Wz7Bfiv7wFZDyZ0qE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=qXT3MWDW; arc=none smtp.client-ip=52.95.48.154 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="qXT3MWDW" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1742529903; x=1774065903; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=a5dpJ78s4RWlmPiW3uAXfitaErnG4uvCAZ/+qdEl6pA=; b=qXT3MWDWHYjnjfCLhfE1/fnDIz+PyEdxuoRewasidHd0Z805UsMGlpqi SscB7EHN8T+XwbZ5iiYhnbZDuS/I2exnVCPaLkiA1Tl7kjWaEeZGMsl6E gEPjOshaDlftLiBO61m+odd0flLEN8jMcUT8JJ915Ho79VnC1HrA6fLjB g=; X-IronPort-AV: E=Sophos;i="6.14,263,1736812800"; d="scan'208";a="473021422" Received: from iad12-co-svc-p1-lb1-vlan2.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.2]) by smtp-border-fw-6001.iad6.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Mar 2025 04:04:58 +0000 Received: from EX19MTAUWB001.ant.amazon.com [10.0.7.35:23244] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.20.119:2525] with esmtp (Farcaster) id 2c495aa3-2281-4dc8-ac27-07d172ee5487; Fri, 21 Mar 2025 04:04:58 +0000 (UTC) X-Farcaster-Flow-ID: 2c495aa3-2281-4dc8-ac27-07d172ee5487 Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWB001.ant.amazon.com (10.250.64.248) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:04:56 +0000 Received: from 6c7e67bfbae3.amazon.com (10.187.170.63) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:04:54 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , David Ahern , Eric Dumazet , Jakub Kicinski , "Paolo Abeni" CC: Simon Horman , Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v1 net-next 08/13] ipv6: Preallocate nhc_pcpu_rth_output in ip6_route_info_create(). Date: Thu, 20 Mar 2025 21:00:45 -0700 Message-ID: <20250321040131.21057-9-kuniyu@amazon.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250321040131.21057-1-kuniyu@amazon.com> References: <20250321040131.21057-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D039UWB003.ant.amazon.com (10.13.138.93) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org ip6_route_info_create_nh() will be called under RCU. It calls fib_nh_common_init() and allocates nhc->nhc_pcpu_rth_output. As with the reason for rt->fib6_nh->rt6i_pcpu, we want to avoid GFP_ATOMIC allocation for nhc->nhc_pcpu_rth_output under RCU. Let's preallocate it in ip6_route_info_create(). Signed-off-by: Kuniyuki Iwashima --- net/ipv4/fib_semantics.c | 10 ++++++---- net/ipv6/route.c | 9 +++++++++ 2 files changed, 15 insertions(+), 4 deletions(-) diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c index f68bb9e34c34..5326f1501af0 100644 --- a/net/ipv4/fib_semantics.c +++ b/net/ipv4/fib_semantics.c @@ -617,10 +617,12 @@ int fib_nh_common_init(struct net *net, struct fib_nh_common *nhc, { int err; - nhc->nhc_pcpu_rth_output = alloc_percpu_gfp(struct rtable __rcu *, - gfp_flags); - if (!nhc->nhc_pcpu_rth_output) - return -ENOMEM; + if (!nhc->nhc_pcpu_rth_output) { + nhc->nhc_pcpu_rth_output = alloc_percpu_gfp(struct rtable __rcu *, + gfp_flags); + if (!nhc->nhc_pcpu_rth_output) + return -ENOMEM; + } if (encap) { struct lwtunnel_state *lwtstate; diff --git a/net/ipv6/route.c b/net/ipv6/route.c index d50377131506..e65e2c8b7125 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -3697,10 +3697,19 @@ void fib6_nh_release_dsts(struct fib6_nh *fib6_nh) static int fib6_nh_prealloc_percpu(struct fib6_nh *fib6_nh, gfp_t gfp_flags) { + struct fib_nh_common *nhc = &fib6_nh->nh_common; + fib6_nh->rt6i_pcpu = alloc_percpu_gfp(struct rt6_info *, gfp_flags); if (!fib6_nh->rt6i_pcpu) return -ENOMEM; + nhc->nhc_pcpu_rth_output = alloc_percpu_gfp(struct rtable __rcu *, + gfp_flags); + if (!nhc->nhc_pcpu_rth_output) { + free_percpu(fib6_nh->rt6i_pcpu); + return -ENOMEM; + } + return 0; } From patchwork Fri Mar 21 04:00:46 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 14024862 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-52004.amazon.com (smtp-fw-52004.amazon.com [52.119.213.154]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EC5451E5B7A for ; Fri, 21 Mar 2025 04:05:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=52.119.213.154 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529926; cv=none; b=R7ZkVLz/dFCY1qGKc8iq5MOmO0rJjzec8/LDiv7lqFSfXCvS0CY01Mx2tolf+lANbcjvc08vdd/wTOrKnosURx1sHLG+XCyCxtVm3j8Vn8kRTgP37jqSyHuWqFecZARnGpFCG3t51vH5bXIJ2bgyxRHKQyLm7EuDGNRJC9iPB0A= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529926; c=relaxed/simple; bh=x0p/epXVRJZD1U3hLQvxMzhT0z4+y15167tMKkYGCiw=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=infHIefX82tzt5ZduBKhHBxzrgvyJ0zA/2OGjZ0P7SwEMg0imwT2dK1xSVKiUHyrkABceF1CAa6aNtIJBkoccDYNuQd24eQqMdrdrzHrp7NPVHtUDlPdfKWbnguvktvIZqIVXh4Ab5FYRP8VapIAxmPl1jjJ+k7v0uTlRvQbf3w= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=Vm6249bz; arc=none smtp.client-ip=52.119.213.154 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="Vm6249bz" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1742529925; x=1774065925; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=CpXAOKaMDcZP0iVUtPq97g7AX/600hT0o7wS3lnOizQ=; b=Vm6249bzm7Ms9xFnueP71XbdwJZnETY7EbmdqdrlyClPJgj6qnlq/ghk +Eek+4CGe0MXlvyVlY1QXuNbmZHDDL9PbXpJb4ZeddYojucYXluTbZBjv UYKdAF31LrZK/6i0UOUOGGJw3exe3Huk9K/fg0ku/VUastsoINpvEG5UU o=; X-IronPort-AV: E=Sophos;i="6.14,263,1736812800"; d="scan'208";a="281276861" Received: from iad12-co-svc-p1-lb1-vlan2.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.2]) by smtp-border-fw-52004.iad7.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Mar 2025 04:05:22 +0000 Received: from EX19MTAUWA001.ant.amazon.com [10.0.21.151:12465] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.1.69:2525] with esmtp (Farcaster) id 1500f905-3b35-476b-b726-9bcbaa004f31; Fri, 21 Mar 2025 04:05:21 +0000 (UTC) X-Farcaster-Flow-ID: 1500f905-3b35-476b-b726-9bcbaa004f31 Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWA001.ant.amazon.com (10.250.64.218) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:05:20 +0000 Received: from 6c7e67bfbae3.amazon.com (10.187.170.63) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:05:18 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , David Ahern , Eric Dumazet , Jakub Kicinski , "Paolo Abeni" CC: Simon Horman , Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v1 net-next 09/13] ipv6: Don't pass net to ip6_route_info_append(). Date: Thu, 20 Mar 2025 21:00:46 -0700 Message-ID: <20250321040131.21057-10-kuniyu@amazon.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250321040131.21057-1-kuniyu@amazon.com> References: <20250321040131.21057-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D035UWB001.ant.amazon.com (10.13.138.33) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org net is not used in ip6_route_info_append() after commit 36f19d5b4f99 ("net/ipv6: Remove extra call to ip6_convert_metrics for multipath case"). Let's remove the argument. Signed-off-by: Kuniyuki Iwashima --- net/ipv6/route.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index e65e2c8b7125..26e5a372a9cd 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -5283,8 +5283,7 @@ struct rt6_nh { struct list_head next; }; -static int ip6_route_info_append(struct net *net, - struct list_head *rt6_nh_list, +static int ip6_route_info_append(struct list_head *rt6_nh_list, struct fib6_info *rt, struct fib6_config *r_cfg) { @@ -5424,8 +5423,7 @@ static int ip6_route_multipath_add(struct fib6_config *cfg, rt->fib6_nh->fib_nh_weight = rtnh->rtnh_hops + 1; - err = ip6_route_info_append(info->nl_net, &rt6_nh_list, - rt, &r_cfg); + err = ip6_route_info_append(&rt6_nh_list, rt, &r_cfg); if (err) { fib6_info_release(rt); goto cleanup; From patchwork Fri Mar 21 04:00:47 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 14024863 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-6001.amazon.com (smtp-fw-6001.amazon.com [52.95.48.154]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 328CE4120B for ; Fri, 21 Mar 2025 04:05:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=52.95.48.154 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529950; cv=none; b=J3JeaTpMgjvOFQRX7fOLQMWNlCshPtieGoVQiXPsDJ0Qvja8CvlRGZVShKSmsYntBbNWmQK7lnW9u4Fb7P5pa5q0gZi5WVkfgwQJFdLmi6jsYGjH0mRsUU82R4DnNSrFLZI8HNuiEn77CRh5KWVD23/LMKY2cSK2qI0PUAEvZ/k= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529950; c=relaxed/simple; bh=OKlG7p2fle9d4erYAOmFYisBMcuCs0w1kPmmuGTsaMc=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Ed13MKJ2IgkNY7DdB0lVW+Km4dLoLn9i0BbpdDekwJoWtBgmusJvDB2nLuMGYdkHPTz/2zE3TafUWuML+tOS8Cg/0jDD/NDeRJsJ+hTCf2nSwKt/zpES6uCy7+F6nVu2APAQ8dCISm3D9u+TVI/9hVmtWvm+uXN9+yAAr5aa9iY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=OtQBFqRh; arc=none smtp.client-ip=52.95.48.154 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="OtQBFqRh" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1742529948; x=1774065948; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Z5wB8Mgf6KaJ8r3pGV7I8PNwm0V5wdRAIqCsgJ/AOQ4=; b=OtQBFqRh+JMY6C4ERiLIM5VevvEL7ctQ5Uqbe2XCrWD36iYnrhy8NhGh hxaCnOuGxnZfxGk+a5G5xrFp4dnjPR0fVmodKFRC9b5mTx4zmvk91/pQG bKeVoHqmJNIZH2Fdxye4eAwLGyw+Jdd+wgSFu8y+2bXshgfA/m1WI6wWp M=; X-IronPort-AV: E=Sophos;i="6.14,263,1736812800"; d="scan'208";a="473021658" Received: from iad12-co-svc-p1-lb1-vlan2.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.2]) by smtp-border-fw-6001.iad6.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Mar 2025 04:05:46 +0000 Received: from EX19MTAUWA002.ant.amazon.com [10.0.21.151:38721] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.7.105:2525] with esmtp (Farcaster) id 0df01d63-3928-417d-8022-f56afe0e51b1; Fri, 21 Mar 2025 04:05:45 +0000 (UTC) X-Farcaster-Flow-ID: 0df01d63-3928-417d-8022-f56afe0e51b1 Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWA002.ant.amazon.com (10.250.64.202) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:05:44 +0000 Received: from 6c7e67bfbae3.amazon.com (10.187.170.63) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:05:42 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , David Ahern , Eric Dumazet , Jakub Kicinski , "Paolo Abeni" CC: Simon Horman , Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v1 net-next 10/13] ipv6: Factorise ip6_route_multipath_add(). Date: Thu, 20 Mar 2025 21:00:47 -0700 Message-ID: <20250321040131.21057-11-kuniyu@amazon.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250321040131.21057-1-kuniyu@amazon.com> References: <20250321040131.21057-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D042UWB004.ant.amazon.com (10.13.139.150) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org We will get rid of RTNL from RTM_NEWROUTE and SIOCADDRT. Then, the RCU section will start before ip6_route_info_create_nh() in ip6_route_multipath_add(), but ip6_route_info_create() is called in the same loop and will sleep. Let's split the loop into ip6_route_mpath_info_create() and ip6_route_mpath_info_create_nh(). Note that ip6_route_info_append() is now integrated into ip6_route_mpath_info_create_nh() because we need to call different free functions for nexthops that passed ip6_route_info_create_nh(). In case of failure, the remaining nexthops that ip6_route_info_create_nh() has not been called for will be freed by ip6_route_mpath_info_cleanup(). OTOH, if a nexthop passes ip6_route_info_create_nh(), it will be linked to a local temporary list, which will be spliced back to rt6_nh_list. In case of failure, these nexthops will be released by fib6_info_release() in ip6_route_multipath_add(). Signed-off-by: Kuniyuki Iwashima --- net/ipv6/route.c | 205 ++++++++++++++++++++++++++++++----------------- 1 file changed, 130 insertions(+), 75 deletions(-) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 26e5a372a9cd..a209d8c8ff75 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -5281,29 +5281,131 @@ struct rt6_nh { struct fib6_info *fib6_info; struct fib6_config r_cfg; struct list_head next; + int weight; }; -static int ip6_route_info_append(struct list_head *rt6_nh_list, - struct fib6_info *rt, - struct fib6_config *r_cfg) +static void ip6_route_mpath_info_cleanup(struct list_head *rt6_nh_list) { - struct rt6_nh *nh; - int err = -EEXIST; + struct rt6_nh *nh, *nh_next; - list_for_each_entry(nh, rt6_nh_list, next) { - /* check if fib6_info already exists */ - if (rt6_duplicate_nexthop(nh->fib6_info, rt)) - return err; + list_for_each_entry_safe(nh, nh_next, rt6_nh_list, next) { + struct fib6_info *rt = nh->fib6_info; + + if (rt) { + free_percpu(rt->fib6_nh->nh_common.nhc_pcpu_rth_output); + free_percpu(rt->fib6_nh->rt6i_pcpu); + ip_fib_metrics_put(rt->fib6_metrics); + kfree(rt); + } + + list_del(&nh->next); + kfree(nh); } +} - nh = kzalloc(sizeof(*nh), GFP_KERNEL); - if (!nh) - return -ENOMEM; - nh->fib6_info = rt; - memcpy(&nh->r_cfg, r_cfg, sizeof(*r_cfg)); - list_add_tail(&nh->next, rt6_nh_list); +static int ip6_route_mpath_info_create(struct list_head *rt6_nh_list, + struct fib6_config *cfg, + struct netlink_ext_ack *extack) +{ + struct rtnexthop *rtnh; + int remaining; + int err; + + remaining = cfg->fc_mp_len; + rtnh = (struct rtnexthop *)cfg->fc_mp; + + /* Parse a Multipath Entry and build a list (rt6_nh_list) of + * fib6_info structs per nexthop + */ + while (rtnh_ok(rtnh, remaining)) { + struct fib6_config r_cfg; + struct fib6_info *rt; + struct rt6_nh *nh; + int attrlen; + + nh = kzalloc(sizeof(*nh), GFP_KERNEL); + if (!nh) { + err = -ENOMEM; + goto err; + } + + list_add_tail(&nh->next, rt6_nh_list); + + memcpy(&r_cfg, cfg, sizeof(*cfg)); + if (rtnh->rtnh_ifindex) + r_cfg.fc_ifindex = rtnh->rtnh_ifindex; + + attrlen = rtnh_attrlen(rtnh); + if (attrlen > 0) { + struct nlattr *nla, *attrs = rtnh_attrs(rtnh); + + nla = nla_find(attrs, attrlen, RTA_GATEWAY); + if (nla) { + r_cfg.fc_gateway = nla_get_in6_addr(nla); + r_cfg.fc_flags |= RTF_GATEWAY; + } + + r_cfg.fc_encap = nla_find(attrs, attrlen, RTA_ENCAP); + nla = nla_find(attrs, attrlen, RTA_ENCAP_TYPE); + if (nla) + r_cfg.fc_encap_type = nla_get_u16(nla); + } + + r_cfg.fc_flags |= (rtnh->rtnh_flags & RTNH_F_ONLINK); + + rt = ip6_route_info_create(&r_cfg, GFP_KERNEL, extack); + if (IS_ERR(rt)) { + err = PTR_ERR(rt); + goto err; + } + + nh->fib6_info = rt; + nh->weight = rtnh->rtnh_hops + 1; + memcpy(&nh->r_cfg, &r_cfg, sizeof(r_cfg)); + + rtnh = rtnh_next(rtnh, &remaining); + } return 0; +err: + ip6_route_mpath_info_cleanup(rt6_nh_list); + return err; +} + +static int ip6_route_mpath_info_create_nh(struct list_head *rt6_nh_list, + struct netlink_ext_ack *extack) +{ + struct rt6_nh *nh, *nh_next, *nh_tmp; + LIST_HEAD(tmp); + int err; + + list_for_each_entry_safe(nh, nh_next, rt6_nh_list, next) { + struct fib6_info *rt = nh->fib6_info; + + err = ip6_route_info_create_nh(rt, &nh->r_cfg, extack); + if (err) { + nh->fib6_info = NULL; + goto err; + } + + rt->fib6_nh->fib_nh_weight = nh->weight; + + list_move_tail(&nh->next, &tmp); + + list_for_each_entry(nh_tmp, rt6_nh_list, next) { + /* check if fib6_info already exists */ + if (rt6_duplicate_nexthop(nh_tmp->fib6_info, rt)) { + err = -EEXIST; + goto err; + } + } + } +out: + list_splice(&tmp, rt6_nh_list); + return err; +err: + ip6_route_mpath_info_cleanup(rt6_nh_list); + goto out; } static void ip6_route_mpath_notify(struct fib6_info *rt, @@ -5362,75 +5464,28 @@ static int ip6_route_multipath_add(struct fib6_config *cfg, { struct fib6_info *rt_notif = NULL, *rt_last = NULL; struct nl_info *info = &cfg->fc_nlinfo; - struct fib6_config r_cfg; - struct rtnexthop *rtnh; - struct fib6_info *rt; - struct rt6_nh *err_nh; struct rt6_nh *nh, *nh_safe; + LIST_HEAD(rt6_nh_list); + struct rt6_nh *err_nh; __u16 nlflags; - int remaining; - int attrlen; - int err = 1; int nhn = 0; - int replace = (cfg->fc_nlinfo.nlh && - (cfg->fc_nlinfo.nlh->nlmsg_flags & NLM_F_REPLACE)); - LIST_HEAD(rt6_nh_list); + int replace; + int err; + + replace = (cfg->fc_nlinfo.nlh && + (cfg->fc_nlinfo.nlh->nlmsg_flags & NLM_F_REPLACE)); nlflags = replace ? NLM_F_REPLACE : NLM_F_CREATE; if (info->nlh && info->nlh->nlmsg_flags & NLM_F_APPEND) nlflags |= NLM_F_APPEND; - remaining = cfg->fc_mp_len; - rtnh = (struct rtnexthop *)cfg->fc_mp; - - /* Parse a Multipath Entry and build a list (rt6_nh_list) of - * fib6_info structs per nexthop - */ - while (rtnh_ok(rtnh, remaining)) { - memcpy(&r_cfg, cfg, sizeof(*cfg)); - if (rtnh->rtnh_ifindex) - r_cfg.fc_ifindex = rtnh->rtnh_ifindex; - - attrlen = rtnh_attrlen(rtnh); - if (attrlen > 0) { - struct nlattr *nla, *attrs = rtnh_attrs(rtnh); - - nla = nla_find(attrs, attrlen, RTA_GATEWAY); - if (nla) { - r_cfg.fc_gateway = nla_get_in6_addr(nla); - r_cfg.fc_flags |= RTF_GATEWAY; - } - - r_cfg.fc_encap = nla_find(attrs, attrlen, RTA_ENCAP); - nla = nla_find(attrs, attrlen, RTA_ENCAP_TYPE); - if (nla) - r_cfg.fc_encap_type = nla_get_u16(nla); - } - - r_cfg.fc_flags |= (rtnh->rtnh_flags & RTNH_F_ONLINK); - rt = ip6_route_info_create(&r_cfg, GFP_KERNEL, extack); - if (IS_ERR(rt)) { - err = PTR_ERR(rt); - rt = NULL; - goto cleanup; - } - - err = ip6_route_info_create_nh(rt, &r_cfg, extack); - if (err) { - rt = NULL; - goto cleanup; - } - - rt->fib6_nh->fib_nh_weight = rtnh->rtnh_hops + 1; - - err = ip6_route_info_append(&rt6_nh_list, rt, &r_cfg); - if (err) { - fib6_info_release(rt); - goto cleanup; - } + err = ip6_route_mpath_info_create(&rt6_nh_list, cfg, extack); + if (err) + return err; - rtnh = rtnh_next(rtnh, &remaining); - } + err = ip6_route_mpath_info_create_nh(&rt6_nh_list, extack); + if (err) + goto cleanup; /* for add and replace send one notification with all nexthops. * Skip the notification in fib6_add_rt2node and send one with From patchwork Fri Mar 21 04:00:48 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 14024864 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-9102.amazon.com (smtp-fw-9102.amazon.com [207.171.184.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AFBEC1E5B66 for ; Fri, 21 Mar 2025 04:06:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=207.171.184.29 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529981; cv=none; b=eJLK+J/hYDYLs6AhkIg9amt5K1QailBeCQa0vcH2rOGoNqaeU9hDDDy9hHUDoXnpv0Z4j1WXghC8jOridDf58w2yv4qb1H0JNu2eoh5F0nRKRx7Di65Z59FjSCdxkOVtq9Uq6nbOfiQwaEl2SP3B6dR6fAJkTqrwR4jbIe+P87k= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529981; c=relaxed/simple; bh=fgwx+7iMklHrWb3dDB4HvyWrLV8bhcQrbo12LRwC9pc=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=kqcVKPVzY59FAsmP5PzuKKSwEZLhEK0FIATFUl01jCmG4CeFQVzPA5H6YrzhqW6g5a6C51kbFaeV0shSMDgFZ9tKGxlmiSHixILg0pStPUsT7sHKdnJlybwqPAYWMDJi1gpCzG8FEt3S8IAWnE4ZHM1Ha1f2aw/s/geJvzcgcl0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=ChVKP0qv; arc=none smtp.client-ip=207.171.184.29 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="ChVKP0qv" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1742529980; x=1774065980; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=1iaZXUu9i6eVKCq4SxYBsjMkrOGQqppMpVTGSVt7XIo=; b=ChVKP0qvJR20309qvwdfKbRL6+TvNzObufe1DZ4PLr3DxxNYuuyAWvs1 qkKbJdehVtGv4XHAK2FR+F1lKtAD88wQKZuUXluBLjnkaBxqtSh169afO h/56L/pdziVf6A/GdW9ZrHMFpmaGxdEZYncTr4EcCN8xumWmkOlO+Hbap 0=; X-IronPort-AV: E=Sophos;i="6.14,263,1736812800"; d="scan'208";a="504642786" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.25.36.214]) by smtp-border-fw-9102.sea19.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Mar 2025 04:06:14 +0000 Received: from EX19MTAUWB001.ant.amazon.com [10.0.21.151:13029] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.56.45:2525] with esmtp (Farcaster) id 88e29678-cdef-4f8d-9401-0aa676ac2cda; Fri, 21 Mar 2025 04:06:12 +0000 (UTC) X-Farcaster-Flow-ID: 88e29678-cdef-4f8d-9401-0aa676ac2cda Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWB001.ant.amazon.com (10.250.64.248) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:06:09 +0000 Received: from 6c7e67bfbae3.amazon.com (10.187.170.63) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:06:06 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , David Ahern , Eric Dumazet , Jakub Kicinski , "Paolo Abeni" CC: Simon Horman , Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v1 net-next 11/13] ipv6: Protect fib6_link_table() with spinlock. Date: Thu, 20 Mar 2025 21:00:48 -0700 Message-ID: <20250321040131.21057-12-kuniyu@amazon.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250321040131.21057-1-kuniyu@amazon.com> References: <20250321040131.21057-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D042UWA003.ant.amazon.com (10.13.139.44) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org We will get rid of RTNL from RTM_NEWROUTE and SIOCADDRT. If the request specifies a new table ID, fib6_new_table() is called to create a new routing table. Two concurrent requests could specify the same table ID, so we need a lock to protect net->ipv6.fib_table_hash[h]. Let's add a spinlock to protect the hash bucket linkage. Signed-off-by: Kuniyuki Iwashima --- include/net/netns/ipv6.h | 1 + net/ipv6/ip6_fib.c | 26 +++++++++++++++++++++----- 2 files changed, 22 insertions(+), 5 deletions(-) diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h index 5f2cfd84570a..47dc70d8100a 100644 --- a/include/net/netns/ipv6.h +++ b/include/net/netns/ipv6.h @@ -72,6 +72,7 @@ struct netns_ipv6 { struct rt6_statistics *rt6_stats; struct timer_list ip6_fib_timer; struct hlist_head *fib_table_hash; + spinlock_t fib_table_hash_lock; struct fib6_table *fib6_main_tbl; struct list_head fib6_walkers; rwlock_t fib6_walker_lock; diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index c134ba202c4c..dab091f70f2b 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -249,19 +249,33 @@ static struct fib6_table *fib6_alloc_table(struct net *net, u32 id) struct fib6_table *fib6_new_table(struct net *net, u32 id) { - struct fib6_table *tb; + struct fib6_table *tb, *new_tb; if (id == 0) id = RT6_TABLE_MAIN; + tb = fib6_get_table(net, id); if (tb) return tb; - tb = fib6_alloc_table(net, id); - if (tb) - fib6_link_table(net, tb); + new_tb = fib6_alloc_table(net, id); + if (!new_tb) + return NULL; + + spin_lock_bh(&net->ipv6.fib_table_hash_lock); + + tb = fib6_get_table(net, id); + if (unlikely(tb)) { + spin_unlock_bh(&net->ipv6.fib_table_hash_lock); + kfree(new_tb); + return tb; + } - return tb; + fib6_link_table(net, new_tb); + + spin_unlock_bh(&net->ipv6.fib_table_hash_lock); + + return new_tb; } EXPORT_SYMBOL_GPL(fib6_new_table); @@ -2423,6 +2437,8 @@ static int __net_init fib6_net_init(struct net *net) if (!net->ipv6.fib_table_hash) goto out_rt6_stats; + spin_lock_init(&net->ipv6.fib_table_hash_lock); + net->ipv6.fib6_main_tbl = kzalloc(sizeof(*net->ipv6.fib6_main_tbl), GFP_KERNEL); if (!net->ipv6.fib6_main_tbl) From patchwork Fri Mar 21 04:00:49 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 14024865 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-6001.amazon.com (smtp-fw-6001.amazon.com [52.95.48.154]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E52571E5B66 for ; Fri, 21 Mar 2025 04:06:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=52.95.48.154 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529998; cv=none; b=BgGpv8nwCyPShyiVzCdEGkwz+IpFeRKhnXAGlZY67iYh5wAYnDyKKBTfx6z12F3SOf3cdp7nHx/3mkPNssQom26vCSPFnqIF3EclCpKDVpIrht7Ftc/J/VImeMEhF/SNKohqZQV7DpbCZ62pKhhKRHiDmaVVXLcBEmY67Z0uTqM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742529998; c=relaxed/simple; bh=1AQHgqLA1N5HxMEMcP1cjxM8G4H9CKKQsWdiV6cYGik=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Am4ynXgHOWq7huHTUNHVrmLHkiZfpQE037Az0+rZlNj7lPvzWy81/7ZJIDY2+mI3PE1bLHv9LRWbuhPJET2b4DjF+G9x5jzjVuDAVLBG1MiKM2gP4bwBE3GVQ/aJ8GzI+gK1k/pOb0+2K7HpSaG58aVK+hGYHqU6prZhuh4VIVQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=usgHEWfK; arc=none smtp.client-ip=52.95.48.154 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="usgHEWfK" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1742529997; x=1774065997; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=hmOMvl/PQJMWaP2MLo4yUl8+RyhaQP0nveuQ10tUOxE=; b=usgHEWfK1Gv9cyT7RVgUmhPHkJ3q0mZZv3qNh6UMoE3eldbu1CFdTbYL lAoK9ZOUOv4iQbNTM/CVQ+90gw9VIIYuCNl8rlu0mJpXMBoqLDTJGcWsu QxPx/JH0DAFg7OQjSrxyYTZCuwYqK5fBWBNk/s4yCRjj5rsnk35imlXz8 k=; X-IronPort-AV: E=Sophos;i="6.14,263,1736812800"; d="scan'208";a="473021834" Received: from iad12-co-svc-p1-lb1-vlan2.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.2]) by smtp-border-fw-6001.iad6.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Mar 2025 04:06:35 +0000 Received: from EX19MTAUWA001.ant.amazon.com [10.0.21.151:28202] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.36.231:2525] with esmtp (Farcaster) id 23b37812-20cc-47f3-89e5-37b6d01ba7bc; Fri, 21 Mar 2025 04:06:34 +0000 (UTC) X-Farcaster-Flow-ID: 23b37812-20cc-47f3-89e5-37b6d01ba7bc Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWA001.ant.amazon.com (10.250.64.218) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:06:34 +0000 Received: from 6c7e67bfbae3.amazon.com (10.187.170.63) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:06:31 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , David Ahern , Eric Dumazet , Jakub Kicinski , "Paolo Abeni" CC: Simon Horman , Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v1 net-next 12/13] ipv6: Protect nh->f6i_list with spinlock and flag. Date: Thu, 20 Mar 2025 21:00:49 -0700 Message-ID: <20250321040131.21057-13-kuniyu@amazon.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250321040131.21057-1-kuniyu@amazon.com> References: <20250321040131.21057-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D031UWA002.ant.amazon.com (10.13.139.96) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org We will get rid of RTNL from RTM_NEWROUTE and SIOCADDRT. Then, we may be going to add a route tied to a dying nexthop. The nexthop itself is not freed during the RCU graceful period, but if we link a route after __remove_nexthop_fib() is called for the nexthop, the route will be leaked. To avoid the race between IPv6 route addition under RCU vs nexthop deletion under RTNL, let's add a dead flag and protect it and nh->f6i_list with a spinlock. __remove_nexthop_fib() acquires the nexthop's spinlock and sets false to nh->dead, then calls ip6_del_rt() for the linked route one by one without the spinlock because fib6_purge_rt() acquires it later. While adding an IPv6 route, fib6_add() acquires the nexthop lock and checks the dead flag just before inserting the route. Signed-off-by: Kuniyuki Iwashima --- include/net/nexthop.h | 2 ++ net/ipv4/nexthop.c | 20 +++++++++++++++++--- net/ipv6/ip6_fib.c | 25 ++++++++++++++++++++----- 3 files changed, 39 insertions(+), 8 deletions(-) diff --git a/include/net/nexthop.h b/include/net/nexthop.h index d9fb44e8b321..572e69cda476 100644 --- a/include/net/nexthop.h +++ b/include/net/nexthop.h @@ -152,6 +152,8 @@ struct nexthop { u8 protocol; /* app managing this nh */ u8 nh_flags; bool is_group; + bool dead; + spinlock_t lock; /* protect dead and f6i_list */ refcount_t refcnt; struct rcu_head rcu; diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c index 01df7dd795f0..94eab81bfe54 100644 --- a/net/ipv4/nexthop.c +++ b/net/ipv4/nexthop.c @@ -541,6 +541,7 @@ static struct nexthop *nexthop_alloc(void) INIT_LIST_HEAD(&nh->f6i_list); INIT_LIST_HEAD(&nh->grp_list); INIT_LIST_HEAD(&nh->fdb_list); + spin_lock_init(&nh->lock); } return nh; } @@ -2105,7 +2106,7 @@ static void remove_nexthop_group(struct nexthop *nh, struct nl_info *nlinfo) /* not called for nexthop replace */ static void __remove_nexthop_fib(struct net *net, struct nexthop *nh) { - struct fib6_info *f6i, *tmp; + struct fib6_info *f6i; bool do_flush = false; struct fib_info *fi; @@ -2116,13 +2117,26 @@ static void __remove_nexthop_fib(struct net *net, struct nexthop *nh) if (do_flush) fib_flush(net); - /* ip6_del_rt removes the entry from this list hence the _safe */ - list_for_each_entry_safe(f6i, tmp, &nh->f6i_list, nh_list) { + spin_lock_bh(&nh->lock); + + nh->dead = true; + + while (1) { + f6i = list_first_entry_or_null(&nh->f6i_list, typeof(*f6i), nh_list); + if (!f6i) + break; + + spin_unlock_bh(&nh->lock); + /* __ip6_del_rt does a release, so do a hold here */ fib6_info_hold(f6i); ipv6_stub->ip6_del_rt(net, f6i, !READ_ONCE(net->ipv4.sysctl_nexthop_compat_mode)); + + spin_lock_bh(&nh->lock); } + + spin_unlock_bh(&nh->lock); } static void __remove_nexthop(struct net *net, struct nexthop *nh, diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index dab091f70f2b..a1aab33b2558 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -1048,8 +1048,12 @@ static void fib6_purge_rt(struct fib6_info *rt, struct fib6_node *fn, rt6_flush_exceptions(rt); fib6_drop_pcpu_from(rt, table); - if (rt->nh && !list_empty(&rt->nh_list)) - list_del_init(&rt->nh_list); + if (rt->nh) { + spin_lock(&rt->nh->lock); + if (!list_empty(&rt->nh_list)) + list_del_init(&rt->nh_list); + spin_unlock(&rt->nh->lock); + } if (refcount_read(&rt->fib6_ref) != 1) { /* This route is used as dummy address holder in some split @@ -1499,10 +1503,21 @@ int fib6_add(struct fib6_node *root, struct fib6_info *rt, } #endif - err = fib6_add_rt2node(fn, rt, info, extack); + if (rt->nh) { + spin_lock(&rt->nh->lock); + if (rt->nh->dead) { + NL_SET_ERR_MSG(extack, "Nexthop has been deleted"); + err = -EINVAL; + } else { + err = fib6_add_rt2node(fn, rt, info, extack); + if (!err) + list_add(&rt->nh_list, &rt->nh->f6i_list); + } + spin_unlock(&rt->nh->lock); + } else { + err = fib6_add_rt2node(fn, rt, info, extack); + } if (!err) { - if (rt->nh) - list_add(&rt->nh_list, &rt->nh->f6i_list); __fib6_update_sernum_upto_root(rt, fib6_new_sernum(info->nl_net)); if (rt->fib6_flags & RTF_EXPIRES) From patchwork Fri Mar 21 04:00:50 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 14024866 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-6001.amazon.com (smtp-fw-6001.amazon.com [52.95.48.154]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5CEE52A1CF for ; Fri, 21 Mar 2025 04:07:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=52.95.48.154 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742530030; cv=none; b=dE8bUjvNcgQ4DxIJNJF+HU1P3d0coxdds1vJnBaWjikW6YXzngtk79WCNit1XKCxSCTUxxaW27ecvBwd84JkzZn34603WOfMjE/UWQFyhmFM2P9Ojim4/4jfuB2MDl2hRkVUoe5Hs0Z25Tq36Olu2udtAomGOGcfQxfgZxs69nE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742530030; c=relaxed/simple; bh=c80Lncl+hLIhw8DiMAi9dMXHy73A8PY7QkGhPulvxxs=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=YS+PZC31fzHg0QJSjNU0GA1/THnD07AA6/+O6rCi1srGTDm7WW4Y5D6xqY8Sp6JOlLGX3mziNphKpeavXQp+iZZh7ihynibeDBjTTSg3mF6J8Y9s1jE1miWtLpHhU8qCB6Fk2tTJzVrgQfNU3yFYesnu9tEosGZ3tMKg+BnMyIc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=V0dUzeXb; arc=none smtp.client-ip=52.95.48.154 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="V0dUzeXb" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1742530028; x=1774066028; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ZUrbQ6SxnLKnz1wFAPr1Cl25aZGvfLvmgaMyDgl1ZNY=; b=V0dUzeXb5Rjj/urttQJfN4GcxicSLRoYuuXM6MNSwD0uJXLeswu2dZg3 d1T6ehEjuMwgzrVuiUwdVrU4AbwnmVg/nguSSzsHIVO8SnfAg+NgODaBb 0uu60O2nDBGfuuEJ1HVoPlG7y8xRCyrDaL3ZOUdDnjb1zVQwoYUw8WO0k M=; X-IronPort-AV: E=Sophos;i="6.14,263,1736812800"; d="scan'208";a="473021916" Received: from iad12-co-svc-p1-lb1-vlan2.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.2]) by smtp-border-fw-6001.iad6.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Mar 2025 04:07:06 +0000 Received: from EX19MTAUWC002.ant.amazon.com [10.0.38.20:6088] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.7.105:2525] with esmtp (Farcaster) id c4918cb9-630b-4c41-a144-b6da99908f0e; Fri, 21 Mar 2025 04:07:05 +0000 (UTC) X-Farcaster-Flow-ID: c4918cb9-630b-4c41-a144-b6da99908f0e Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWC002.ant.amazon.com (10.250.64.143) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:06:58 +0000 Received: from 6c7e67bfbae3.amazon.com (10.187.170.63) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1544.14; Fri, 21 Mar 2025 04:06:56 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , David Ahern , Eric Dumazet , Jakub Kicinski , "Paolo Abeni" CC: Simon Horman , Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v1 net-next 13/13] ipv6: Get rid of RTNL for SIOCADDRT and RTM_NEWROUTE. Date: Thu, 20 Mar 2025 21:00:50 -0700 Message-ID: <20250321040131.21057-14-kuniyu@amazon.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250321040131.21057-1-kuniyu@amazon.com> References: <20250321040131.21057-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D046UWB002.ant.amazon.com (10.13.139.181) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org Now we are ready to remove RTNL from SIOCADDRT and RTM_NEWROUTE. The remaining things to do are 1. pass false to lwtunnel_valid_encap_type_attr() 2. use rcu_dereference_rtnl() in fib6_check_nexthop() 3. place rcu_read_lock() before ip6_route_info_create_nh(). Let's complete RTNL-free conversion. When each CPU-X adds 100000 routes on table-X in a batch on c7a.metal-48xl EC2 instance with 192 CPUs, without this series: $ sudo ./route_test.sh ... added 19200000 routes (100000 routes * 192 tables). Time elapsed: 189154 milliseconds. with this series: $ sudo ./route_test.sh ... added 19200000 routes (100000 routes * 192 tables). Time elapsed: 62531 milliseconds. Signed-off-by: Kuniyuki Iwashima --- net/ipv4/nexthop.c | 4 ++-- net/ipv6/route.c | 18 ++++++++++++------ 2 files changed, 14 insertions(+), 8 deletions(-) diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c index 94eab81bfe54..dec1a107aa0b 100644 --- a/net/ipv4/nexthop.c +++ b/net/ipv4/nexthop.c @@ -1543,12 +1543,12 @@ int fib6_check_nexthop(struct nexthop *nh, struct fib6_config *cfg, if (nh->is_group) { struct nh_group *nhg; - nhg = rtnl_dereference(nh->nh_grp); + nhg = rcu_dereference_rtnl(nh->nh_grp); if (nhg->has_v4) goto no_v4_nh; is_fdb_nh = nhg->fdb_nh; } else { - nhi = rtnl_dereference(nh->nh_info); + nhi = rcu_dereference_rtnl(nh->nh_info); if (nhi->family == AF_INET) goto no_v4_nh; is_fdb_nh = nhi->fdb_nh; diff --git a/net/ipv6/route.c b/net/ipv6/route.c index a209d8c8ff75..d1d60415d1aa 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -3868,12 +3868,16 @@ int ip6_route_add(struct fib6_config *cfg, gfp_t gfp_flags, if (IS_ERR(rt)) return PTR_ERR(rt); + rcu_read_lock(); + err = ip6_route_info_create_nh(rt, cfg, extack); if (err) - return err; + goto unlock; err = __ip6_ins_rt(rt, &cfg->fc_nlinfo, extack); fib6_info_release(rt); +unlock: + rcu_read_unlock(); return err; } @@ -4494,12 +4498,10 @@ int ipv6_route_ioctl(struct net *net, unsigned int cmd, struct in6_rtmsg *rtmsg) switch (cmd) { case SIOCADDRT: - rtnl_lock(); /* Only do the default setting of fc_metric in route adding */ if (cfg.fc_metric == 0) cfg.fc_metric = IP6_RT_PRIO_USER; err = ip6_route_add(&cfg, GFP_KERNEL, NULL); - rtnl_unlock(); break; case SIOCDELRT: err = ip6_route_del(&cfg, NULL); @@ -5078,7 +5080,7 @@ static int rtm_to_fib6_multipath_config(struct fib6_config *cfg, } while (rtnh_ok(rtnh, remaining)); return lwtunnel_valid_encap_type_attr(cfg->fc_mp, cfg->fc_mp_len, - extack, newroute); + extack, false); } static int rtm_to_fib6_config(struct sk_buff *skb, struct nlmsghdr *nlh, @@ -5216,7 +5218,7 @@ static int rtm_to_fib6_config(struct sk_buff *skb, struct nlmsghdr *nlh, cfg->fc_encap_type = nla_get_u16(tb[RTA_ENCAP_TYPE]); err = lwtunnel_valid_encap_type(cfg->fc_encap_type, - extack, newroute); + extack, false); if (err < 0) goto errout; } @@ -5483,6 +5485,8 @@ static int ip6_route_multipath_add(struct fib6_config *cfg, if (err) return err; + rcu_read_lock(); + err = ip6_route_mpath_info_create_nh(&rt6_nh_list, extack); if (err) goto cleanup; @@ -5574,6 +5578,8 @@ static int ip6_route_multipath_add(struct fib6_config *cfg, } cleanup: + rcu_read_unlock(); + list_for_each_entry_safe(nh, nh_safe, &rt6_nh_list, next) { fib6_info_release(nh->fib6_info); list_del(&nh->next); @@ -6856,7 +6862,7 @@ static void bpf_iter_unregister(void) static const struct rtnl_msg_handler ip6_route_rtnl_msg_handlers[] __initconst_or_module = { {.owner = THIS_MODULE, .protocol = PF_INET6, .msgtype = RTM_NEWROUTE, - .doit = inet6_rtm_newroute}, + .doit = inet6_rtm_newroute, .flags = RTNL_FLAG_DOIT_UNLOCKED}, {.owner = THIS_MODULE, .protocol = PF_INET6, .msgtype = RTM_DELROUTE, .doit = inet6_rtm_delroute, .flags = RTNL_FLAG_DOIT_UNLOCKED}, {.owner = THIS_MODULE, .protocol = PF_INET6, .msgtype = RTM_GETROUTE,