From patchwork Mon Nov 22 15:15:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nikolay Aleksandrov X-Patchwork-Id: 12632095 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DE976C433EF for ; Mon, 22 Nov 2021 15:15:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239678AbhKVPSb (ORCPT ); Mon, 22 Nov 2021 10:18:31 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56504 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233449AbhKVPS3 (ORCPT ); Mon, 22 Nov 2021 10:18:29 -0500 Received: from mail-ed1-x52f.google.com (mail-ed1-x52f.google.com [IPv6:2a00:1450:4864:20::52f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A6862C061574 for ; Mon, 22 Nov 2021 07:15:22 -0800 (PST) Received: by mail-ed1-x52f.google.com with SMTP id g14so78529489edb.8 for ; Mon, 22 Nov 2021 07:15:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=blackwall-org.20210112.gappssmtp.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=oaQjB7lbCNvQ6j5Qj1cZs1R1magKCcFGqEhPiwQHJDo=; b=RC1H6Emw3LrcCjWT+2H4BhZ4TptQRZljzPFGZwJKYP8sl2ymEbqZybw+u16tBvvRWs sGE+iay9bi0szFCuD2CpJrPi0TvrIXVs0SEzHTIglfnnKZ3Pd6nEzw3N3YzvZmIySuYi F6lntabrAztHGklGlswBjumnpKLsHRULA2r99VenuPvq8cknbtVL2QKUaTJ4DxM8ohD9 mOP6ZZUbp7l3tpex7srmYne06WDZ8kB3v+MhlpdUjDsFA/ROmH4EaT89wl92m32v5ZvN xl36xT8b3YSO/oE5lsvgYnH2NsludeNj3dkTNApQiwOcvxu0cXgJJvjB6Z88elKwjUxo 5J3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to 
:references:mime-version:content-transfer-encoding; bh=oaQjB7lbCNvQ6j5Qj1cZs1R1magKCcFGqEhPiwQHJDo=; b=ni8qRRvylVUxIm0vxCDk4TnniBI70hfl5OC8DOfHUufHuV0ZbZvm79oAmiS/nTPA+3 PRNLHV/Xx3SJg8lhBvldmygA2ClWBMuvqERAb7nfmOpCR1HoFkn9BoGdcGI3BO0f8zo/ K4LoA6jC8l83qlBGiHn/ULZxACBA3YkpYv0QuwHKW5mruRQTrfc8OPvMzKOXrM7buoUL OmNWDF9L+AMJ9yZZVykrUd2o3iJDce763AaClsrZmk3cXQwQUfOQTEy1sp7S9VO42Rqw kJXuLTjrCQdQOr+nryQNHKLJR4xVY0cLVPRaW0gVn1GY3M5ftlgXkBchYxVrBhpvfZry Xn9g== X-Gm-Message-State: AOAM533ryrKJy5NazyyAi7pTLzj9w+ApqICuEupAJsmPe2TdWakTVnT8 gh70zzd6EjU74F/4WZM2m+N9bMjmWuQbFe80 X-Google-Smtp-Source: ABdhPJzARBZokcWqNi0vW8TA77fO5p/k3+FUhHAkR6nrLGCs+B0Gs6FV5aqVgOBCAfz0DT0Be+WXCA== X-Received: by 2002:a17:907:948c:: with SMTP id dm12mr13870217ejc.551.1637594120573; Mon, 22 Nov 2021 07:15:20 -0800 (PST) Received: from debil.vdiclient.nvidia.com (84-238-136-197.ip.btc-net.bg. [84.238.136.197]) by smtp.gmail.com with ESMTPSA id qb21sm3906904ejc.78.2021.11.22.07.15.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 22 Nov 2021 07:15:20 -0800 (PST) From: Nikolay Aleksandrov To: netdev@vger.kernel.org Cc: idosch@idosch.org, davem@davemloft.net, kuba@kernel.org, dsahern@gmail.com, Nikolay Aleksandrov Subject: [PATCH net v2 1/3] net: ipv6: add fib6_nh_release_dsts stub Date: Mon, 22 Nov 2021 17:15:12 +0200 Message-Id: <20211122151514.2813935-2-razor@blackwall.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211122151514.2813935-1-razor@blackwall.org> References: <20211122151514.2813935-1-razor@blackwall.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org From: Nikolay Aleksandrov We need a way to release a fib6_nh's per-cpu dsts when replacing nexthops otherwise we can end up with stale per-cpu dsts which hold net device references, so add a new IPv6 stub called fib6_nh_release_dsts. It must be used after an RCU grace period, so no new dsts can be created through a group's nexthop entry. 
Similar to fib6_nh_release it shouldn't be used if fib6_nh_init has failed so it doesn't need a dummy stub when IPv6 is not enabled. Fixes: 7bf4796dd099 ("nexthops: add support for replace") Signed-off-by: Nikolay Aleksandrov --- v2: no changes include/net/ip6_fib.h | 1 + include/net/ipv6_stubs.h | 1 + net/ipv6/af_inet6.c | 1 + net/ipv6/route.c | 19 +++++++++++++++++++ 4 files changed, 22 insertions(+) diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h index c412dde4d67d..83b8070d1cc9 100644 --- a/include/net/ip6_fib.h +++ b/include/net/ip6_fib.h @@ -485,6 +485,7 @@ int fib6_nh_init(struct net *net, struct fib6_nh *fib6_nh, struct fib6_config *cfg, gfp_t gfp_flags, struct netlink_ext_ack *extack); void fib6_nh_release(struct fib6_nh *fib6_nh); +void fib6_nh_release_dsts(struct fib6_nh *fib6_nh); int call_fib6_entry_notifiers(struct net *net, enum fib_event_type event_type, diff --git a/include/net/ipv6_stubs.h b/include/net/ipv6_stubs.h index afbce90c4480..45e0339be6fa 100644 --- a/include/net/ipv6_stubs.h +++ b/include/net/ipv6_stubs.h @@ -47,6 +47,7 @@ struct ipv6_stub { struct fib6_config *cfg, gfp_t gfp_flags, struct netlink_ext_ack *extack); void (*fib6_nh_release)(struct fib6_nh *fib6_nh); + void (*fib6_nh_release_dsts)(struct fib6_nh *fib6_nh); void (*fib6_update_sernum)(struct net *net, struct fib6_info *rt); int (*ip6_del_rt)(struct net *net, struct fib6_info *rt, bool skip_notify); void (*fib6_rt_update)(struct net *net, struct fib6_info *rt, diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c index 0c4da163535a..dab4a047590b 100644 --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -1026,6 +1026,7 @@ static const struct ipv6_stub ipv6_stub_impl = { .ip6_mtu_from_fib6 = ip6_mtu_from_fib6, .fib6_nh_init = fib6_nh_init, .fib6_nh_release = fib6_nh_release, + .fib6_nh_release_dsts = fib6_nh_release_dsts, .fib6_update_sernum = fib6_update_sernum_stub, .fib6_rt_update = fib6_rt_update, .ip6_del_rt = ip6_del_rt, diff --git a/net/ipv6/route.c 
b/net/ipv6/route.c index 3ae25b8ffbd6..42d60c76d30a 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -3680,6 +3680,25 @@ void fib6_nh_release(struct fib6_nh *fib6_nh) fib_nh_common_release(&fib6_nh->nh_common); } +void fib6_nh_release_dsts(struct fib6_nh *fib6_nh) +{ + int cpu; + + if (!fib6_nh->rt6i_pcpu) + return; + + for_each_possible_cpu(cpu) { + struct rt6_info *pcpu_rt, **ppcpu_rt; + + ppcpu_rt = per_cpu_ptr(fib6_nh->rt6i_pcpu, cpu); + pcpu_rt = xchg(ppcpu_rt, NULL); + if (pcpu_rt) { + dst_dev_put(&pcpu_rt->dst); + dst_release(&pcpu_rt->dst); + } + } +} + static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg, gfp_t gfp_flags, struct netlink_ext_ack *extack) From patchwork Mon Nov 22 15:15:13 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nikolay Aleksandrov X-Patchwork-Id: 12632099 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AC37DC433F5 for ; Mon, 22 Nov 2021 15:15:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239766AbhKVPSd (ORCPT ); Mon, 22 Nov 2021 10:18:33 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56522 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239755AbhKVPSc (ORCPT ); Mon, 22 Nov 2021 10:18:32 -0500 Received: from mail-ed1-x533.google.com (mail-ed1-x533.google.com [IPv6:2a00:1450:4864:20::533]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B8F93C06173E for ; Mon, 22 Nov 2021 07:15:25 -0800 (PST) Received: by mail-ed1-x533.google.com with SMTP id v1so44903544edx.2 for ; Mon, 22 Nov 2021 07:15:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=blackwall-org.20210112.gappssmtp.com; s=20210112; 
h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Pftb0Lsb1bJIdCK96aoAVJ6JA25jDDfs3VUOFrqF600=; b=NkJe4OTWS5s5LRfXZS2bbun5l7burQiFRsnlsan9br72iBJRjWQ9a/hFQYsNzPpCjh jr03+U/3i6xK8WyWBQyoZXq433ZU2e/xLiUVkY8AfMWHpFQffyKVXZcL6SXX+S049+Q7 7K5S2ctpR3m7FSW3u3E7QW5pry1whRCqu8FMgqTBLma65M50+S2HmfiRGJt3dft4r6+r yZniQW5g2Kemdwmt4+GnsGahhCHLgKcSk3gOArVOlpMHIWsBzqn+NoMIpqdy9nWL4iZ+ 4uxYWRQllIvdaWV0VVkjzHZD31cPjQb6k5pNmqFSv3VbmcZp1f/4XBfsZNuVWiFNDgO2 o1Ug== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Pftb0Lsb1bJIdCK96aoAVJ6JA25jDDfs3VUOFrqF600=; b=M9iTsRNgArt3sIKseVGQSGdQIJlP0+mZ/3Dh3eUL1d521OIVsBYa42LB+N9NFQFmKR gOI2oMi1GQ64mzTKBmhm2IlNBrbm9g2YfjQ5k+sMt/mm9Zk4xgiEcFHYO1WYQfe/AI0a ILkIJhGJy0ffZfaJ+fqQaipBOKhscJRe+Tn0DX8hFLASBTv2Ju2GqIOyTDQQSqtfp1QQ YzyJIQaHBBHsA8qPIT3LIC3I3RRV5JaySDwIzZfaBtRp2uUKZkshIzwHmnJ7I6yplC4d c4hhs0MCZ4ez3RxpmhV9u0IcyShj90LLUBT+UM+8RJexJGPnx4JNBYoXeEnmGryJkY3U cOHg== X-Gm-Message-State: AOAM532j6O6oU2vuVEvU8da3QC5b8bgTH7V2mkGG2/mgwcdJ+jjra5yJ y4zHDAEfIaZPykTDG0UhFEXYTSnexoAN9Nm5 X-Google-Smtp-Source: ABdhPJxzRtmw8pRquQAAKgOiy7yYpOsUuTVe4iJ4f0f5MBCAm3jGg9Yu1vQH8y/gU34UIMQfc40O5w== X-Received: by 2002:a17:907:7094:: with SMTP id yj20mr41049042ejb.265.1637594121513; Mon, 22 Nov 2021 07:15:21 -0800 (PST) Received: from debil.vdiclient.nvidia.com (84-238-136-197.ip.btc-net.bg. 
[84.238.136.197]) by smtp.gmail.com with ESMTPSA id qb21sm3906904ejc.78.2021.11.22.07.15.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 22 Nov 2021 07:15:21 -0800 (PST) From: Nikolay Aleksandrov To: netdev@vger.kernel.org Cc: idosch@idosch.org, davem@davemloft.net, kuba@kernel.org, dsahern@gmail.com, Nikolay Aleksandrov Subject: [PATCH net v2 2/3] net: nexthop: release IPv6 per-cpu dsts when replacing a nexthop group Date: Mon, 22 Nov 2021 17:15:13 +0200 Message-Id: <20211122151514.2813935-3-razor@blackwall.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211122151514.2813935-1-razor@blackwall.org> References: <20211122151514.2813935-1-razor@blackwall.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org From: Nikolay Aleksandrov When replacing a nexthop group, we must release the IPv6 per-cpu dsts of the removed nexthop entries after an RCU grace period because they contain references to the nexthop's net device and to the fib6 info. With specific series of events[1] we can reach net device refcount imbalance which is unrecoverable. IPv4 is not affected because dsts don't take a refcount on the route. [1] $ ip nexthop list id 200 via 2002:db8::2 dev bridge.10 scope link onlink id 201 via 2002:db8::3 dev bridge scope link onlink id 203 group 201/200 $ ip -6 route 2001:db8::10 nhid 203 metric 1024 pref medium nexthop via 2002:db8::3 dev bridge weight 1 onlink nexthop via 2002:db8::2 dev bridge.10 weight 1 onlink Create rt6_info through one of the multipath legs, e.g.: $ taskset -a -c 1 ./pkt_inj 24 bridge.10 2001:db8::10 (pkt_inj is just a custom packet generator, nothing special) Then remove that leg from the group by replace (let's assume it is id 200 in this case): $ ip nexthop replace id 203 group 201 Now remove the IPv6 route: $ ip -6 route del 2001:db8::10/128 The route won't be really deleted due to the stale rt6_info holding 1 refcnt in nexthop id 200. 
At this point we have the following reference count dependency:
 (deleted) IPv6 route holds 1 reference over nhid 203
 nh 203 holds 1 ref over id 201
 nh 200 holds 1 ref over the net device and the route due to the stale
 rt6_info

Now to create a circular dependency between nh 200 and the IPv6 route, and
also to get a reference over nh 200, restore nhid 200 in the group:
$ ip nexthop replace id 203 group 201/200

And now we have a permanent circular dependency because nhid 203 holds a
reference over nh 200 and 201, but the route holds a ref over nh 203 and
is deleted.

To trigger the bug just delete the group (nhid 203):
$ ip nexthop del id 203

It won't really be deleted due to the IPv6 route dependency, and now we
have 2 unlinked and deleted objects that reference each other: the group
and the IPv6 route. Since the group drops the reference it holds over its
entries at free time (i.e. its own refcount needs to drop to 0) that will
never happen and we get a permanent ref on them, since one of the entries
holds a reference over the IPv6 route it will also never be released.

At this point the dependencies are:
 (deleted, only unlinked) IPv6 route holds reference over group nh 203
 (deleted, only unlinked) group nh 203 holds reference over nh 201 and 200
 nh 200 holds 1 ref over the net device and the route due to the stale
 rt6_info

This is the last point where it can be fixed by running traffic through
nh 200, and specifically through the same CPU so the rt6_info (dst) will
get released due to the IPv6 genid, that in turn will free the IPv6 route,
which in turn will free the ref count over the group nh 203.

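The reference chain described so far can be replayed in a small user-space
model. This is only a sketch under stated assumptions: struct obj,
obj_hold(), obj_put() and replace_sequence_leaks_dev() are hypothetical
names invented for illustration, not kernel APIs, and each object is
reduced to a counter plus the references it drops when freed. The
release_dsts_on_replace flag stands in for releasing the removed entry's
cached dst at group replace time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of a refcounted object; not a kernel type. */
struct obj {
	int refcnt;
	bool freed;
	struct obj *holds[2];	/* references dropped only when this is freed */
};

static void obj_hold(struct obj *o)
{
	o->refcnt++;
}

static void obj_put(struct obj *o)
{
	size_t i;

	if (--o->refcnt > 0)
		return;

	o->freed = true;
	for (i = 0; i < 2; i++)
		if (o->holds[i])
			obj_put(o->holds[i]);
}

/* Replays the sequence; returns true if the device can never be freed. */
bool replace_sequence_leaks_dev(bool release_dsts_on_replace)
{
	/* every object starts with one reference from its owner or table */
	struct obj dev = { .refcnt = 1 };
	struct obj nh200 = { .refcnt = 1 };
	struct obj nh201 = { .refcnt = 1 };
	struct obj grp203 = { .refcnt = 1 };
	struct obj route = { .refcnt = 1 };
	struct obj dst = { .refcnt = 0 };

	/* group 203 = 201/200, and the route points at the group */
	grp203.holds[0] = &nh201;
	obj_hold(&nh201);
	grp203.holds[1] = &nh200;
	obj_hold(&nh200);
	route.holds[0] = &grp203;
	obj_hold(&grp203);

	/* traffic through nh 200 caches a dst pinning the device and route */
	dst.refcnt = 1;
	dst.holds[0] = &dev;
	obj_hold(&dev);
	dst.holds[1] = &route;
	obj_hold(&route);
	nh200.holds[0] = &dst;

	/* ip nexthop replace id 203 group 201 */
	grp203.holds[1] = NULL;
	if (release_dsts_on_replace) {
		nh200.holds[0] = NULL;
		obj_put(&dst);
	}
	obj_put(&nh200);

	/* ip -6 route del 2001:db8::10/128 (unlink only) */
	obj_put(&route);

	/* ip nexthop replace id 203 group 201/200 */
	grp203.holds[1] = &nh200;
	obj_hold(&nh200);

	/* ip nexthop del id 203, then ip nexthop del id 200 (unlink only) */
	obj_put(&grp203);
	obj_put(&nh200);

	/* the owner finally tries to get rid of the device */
	obj_put(&dev);

	return !dev.freed;
}
```

Calling replace_sequence_leaks_dev(false) from a small main() reports the
leak, while replace_sequence_leaks_dev(true) lets every object in the
chain drop to zero. The model ignores RCU and per-cpu storage entirely; it
only shows why the cycle cannot unwind once the dst is left behind.
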
If nh 200 is deleted at this point, it will never be released due to the
ref from the unlinked group 203, it will only be unlinked:
$ ip nexthop del id 200
$ ip nexthop
$

Now we can never release that stale rt6_info, we have IPv6 route with ref
over group nh 203, group nh 203 with ref over nh 200 and 201, nh 200 with
rt6_info (dst) with ref over the net device and the IPv6 route. All of
these objects are only unlinked, and cannot be released, thus they can't
release their ref counts.

Message from syslogd@dev at Nov 19 14:04:10 ...
 kernel:[73501.828730] unregister_netdevice: waiting for bridge.10 to become free. Usage count = 3

Message from syslogd@dev at Nov 19 14:04:20 ...
 kernel:[73512.068811] unregister_netdevice: waiting for bridge.10 to become free. Usage count = 3

Fixes: 7bf4796dd099 ("nexthops: add support for replace")
Signed-off-by: Nikolay Aleksandrov
---
 v2: added information about why IPv4 is not affected to the commit msg,
     no changes to the patch

 net/ipv4/nexthop.c | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index 9e8100728d46..a69a9e76f99f 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -1899,15 +1899,36 @@ static void remove_nexthop(struct net *net, struct nexthop *nh,
 /* if any FIB entries reference this nexthop, any dst entries
  * need to be regenerated
  */
-static void nh_rt_cache_flush(struct net *net, struct nexthop *nh)
+static void nh_rt_cache_flush(struct net *net, struct nexthop *nh,
+			      struct nexthop *replaced_nh)
 {
 	struct fib6_info *f6i;
+	struct nh_group *nhg;
+	int i;
 
 	if (!list_empty(&nh->fi_list))
 		rt_cache_flush(net);
 
 	list_for_each_entry(f6i, &nh->f6i_list, nh_list)
 		ipv6_stub->fib6_update_sernum(net, f6i);
+
+	/* if an IPv6 group was replaced, we have to release all old
+	 * dsts to make sure all refcounts are released
+	 */
+	if (!replaced_nh->is_group)
+		return;
+
+	/* new dsts must use only the new nexthop group */
+	synchronize_net();
+
+
nhg = rtnl_dereference(replaced_nh->nh_grp); + for (i = 0; i < nhg->num_nh; i++) { + struct nh_grp_entry *nhge = &nhg->nh_entries[i]; + struct nh_info *nhi = rtnl_dereference(nhge->nh->nh_info); + + if (nhi->family == AF_INET6) + ipv6_stub->fib6_nh_release_dsts(&nhi->fib6_nh); + } } static int replace_nexthop_grp(struct net *net, struct nexthop *old, @@ -2247,7 +2268,7 @@ static int replace_nexthop(struct net *net, struct nexthop *old, err = replace_nexthop_single(net, old, new, extack); if (!err) { - nh_rt_cache_flush(net, old); + nh_rt_cache_flush(net, old, new); __remove_nexthop(net, new, NULL); nexthop_put(new); From patchwork Mon Nov 22 15:15:14 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nikolay Aleksandrov X-Patchwork-Id: 12632097 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2EC37C433FE for ; Mon, 22 Nov 2021 15:15:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239761AbhKVPSc (ORCPT ); Mon, 22 Nov 2021 10:18:32 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56514 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233449AbhKVPSb (ORCPT ); Mon, 22 Nov 2021 10:18:31 -0500 Received: from mail-ed1-x536.google.com (mail-ed1-x536.google.com [IPv6:2a00:1450:4864:20::536]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D328CC061574 for ; Mon, 22 Nov 2021 07:15:24 -0800 (PST) Received: by mail-ed1-x536.google.com with SMTP id y13so78303919edd.13 for ; Mon, 22 Nov 2021 07:15:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=blackwall-org.20210112.gappssmtp.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references 
:mime-version:content-transfer-encoding; bh=R2B4Fj4YV/PyfqcipOhaZfSpP1N1LY2SlQEPFvPlrys=; b=rVTIXXJgpzTsF9KM0vhgVgPmWkHSJUGYUkbmSCYQRi5vhn2zzOJCmWQ6/S+7kFHMf2 cLG0yeUvaTSmBVBVo9t9Q5jDlAlDXQ6GxInqyrpDtU8wWF+aK8nsVw1VLKjLt/6fwilB wlB3Uly3xgovYBcfK2W+X3R6s9WO8pYZQBr5E/WIC/+SThK71KCC0ssiyQCOFfMhuiy+ Oaw8V8+Fcd+wtYWZf2z89Bq56oV/w9zO4OC9mX4MwVeLmeQYGQNALXHkw7FkwV/OPDb2 ziQcuIZG15vaM9pKRgA97mx3hLcyIXmXXRM/IJZxNcVwmxoF5HsNAICiD+pH82zF/Nnj 3aaQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=R2B4Fj4YV/PyfqcipOhaZfSpP1N1LY2SlQEPFvPlrys=; b=2UcFVwbii43h4ivzJru53bGt3xvgZbbV6J0hFB7BYIkLR8agYHDyS5W1xwmaaK5OqN vIeMt90JYipaXtVXPj2cv3XXshmpoGTaNHsV0cr/Nh2Fl441lfGzSEm/GfnwgJgPlTpn jnPpZIF6Pu1lrE/FgVLORwvktIxJ32GewgN2U6WospM4T5EfG6ED72BEOz0cvw7/eChH NvYEqTKpZ4EbJSMaWoF7gpuup1SWDdc2U8CB/AUlMumn0Yy0OiBu/vlvud0jNON+rP/0 1O3VvzfWAo23cuqpVQr4iBEuqb8zbKbyARbBc5U+ImGgictdT8Gmize3rPU1lXRt5SmF dCSw== X-Gm-Message-State: AOAM531I/wdNdoD2BOhBMn9NLPWt1HkfpLQ0DWgJiCBIOa3mdsOaqI0O yLRPx5ZaLIlQ19ekGwxd48ALxt6PsICcXlSs X-Google-Smtp-Source: ABdhPJy7lkT9yO6iU/CRWgJbyojk5oTwZWyG6srQZYZGR/UIYplwQkKBg05iDhTeb3ebwiydalvANA== X-Received: by 2002:a17:907:629b:: with SMTP id nd27mr41408983ejc.24.1637594122456; Mon, 22 Nov 2021 07:15:22 -0800 (PST) Received: from debil.vdiclient.nvidia.com (84-238-136-197.ip.btc-net.bg. 
[84.238.136.197]) by smtp.gmail.com with ESMTPSA id qb21sm3906904ejc.78.2021.11.22.07.15.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 22 Nov 2021 07:15:22 -0800 (PST) From: Nikolay Aleksandrov To: netdev@vger.kernel.org Cc: idosch@idosch.org, davem@davemloft.net, kuba@kernel.org, dsahern@gmail.com, Nikolay Aleksandrov Subject: [PATCH net v2 3/3] selftests: net: fib_nexthops: add test for group refcount imbalance bug Date: Mon, 22 Nov 2021 17:15:14 +0200 Message-Id: <20211122151514.2813935-4-razor@blackwall.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211122151514.2813935-1-razor@blackwall.org> References: <20211122151514.2813935-1-razor@blackwall.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org From: Nikolay Aleksandrov The new selftest runs a sequence which causes circular refcount dependency between deleted objects which cannot be released and results in a netdevice refcount imbalance. Signed-off-by: Nikolay Aleksandrov --- v2: check for mausezahn before testing and make a few comments more verbose tools/testing/selftests/net/fib_nexthops.sh | 63 +++++++++++++++++++++ 1 file changed, 63 insertions(+) diff --git a/tools/testing/selftests/net/fib_nexthops.sh b/tools/testing/selftests/net/fib_nexthops.sh index b5a69ad191b0..d444ee6aa3cb 100755 --- a/tools/testing/selftests/net/fib_nexthops.sh +++ b/tools/testing/selftests/net/fib_nexthops.sh @@ -629,6 +629,66 @@ ipv6_fcnal() log_test $? 0 "Nexthops removed on admin down" } +ipv6_grp_refs() +{ + if [ ! 
-x "$(command -v mausezahn)" ]; then
+		echo "SKIP: Could not run test; need mausezahn tool"
+		return
+	fi
+
+	run_cmd "$IP link set dev veth1 up"
+	run_cmd "$IP link add veth1.10 link veth1 up type vlan id 10"
+	run_cmd "$IP link add veth1.20 link veth1 up type vlan id 20"
+	run_cmd "$IP -6 addr add 2001:db8:91::1/64 dev veth1.10"
+	run_cmd "$IP -6 addr add 2001:db8:92::1/64 dev veth1.20"
+	run_cmd "$IP -6 neigh add 2001:db8:91::2 lladdr 00:11:22:33:44:55 dev veth1.10"
+	run_cmd "$IP -6 neigh add 2001:db8:92::2 lladdr 00:11:22:33:44:55 dev veth1.20"
+	run_cmd "$IP nexthop add id 100 via 2001:db8:91::2 dev veth1.10"
+	run_cmd "$IP nexthop add id 101 via 2001:db8:92::2 dev veth1.20"
+	run_cmd "$IP nexthop add id 102 group 100"
+	run_cmd "$IP route add 2001:db8:101::1/128 nhid 102"
+
+	# create per-cpu dsts through nh 100
+	run_cmd "ip netns exec me mausezahn -6 veth1.10 -B 2001:db8:101::1 -A 2001:db8:91::1 -c 5 -t tcp "dp=1-1023, flags=syn" >/dev/null 2>&1"
+
+	# remove nh 100 from the group to delete the route potentially leaving
+	# a stale per-cpu dst which holds a reference to the nexthop's net
+	# device and to the IPv6 route
+	run_cmd "$IP nexthop replace id 102 group 101"
+	run_cmd "$IP route del 2001:db8:101::1/128"
+
+	# add both nexthops to the group so a reference is taken on them
+	run_cmd "$IP nexthop replace id 102 group 100/101"
+
+	# if the bug described in commit "net: nexthop: release IPv6 per-cpu
+	# dsts when replacing a nexthop group" exists at this point we have
+	# an unlinked IPv6 route (but not freed due to stale dst) with a
+	# reference over the group so we delete the group which will again
+	# only unlink it due to the route reference
+	run_cmd "$IP nexthop del id 102"
+
+	# delete the nexthop with stale dst, since we have an unlinked
+	# group with a ref to it and an unlinked IPv6 route with ref to the
+	# group, the nh will only be unlinked and not freed so the stale dst
+	# remains forever and we get a net device refcount imbalance
+	run_cmd "$IP nexthop del id 100"
+
+	# if a reference was lost this command will hang because the net device
+	# cannot be removed
+	timeout -s KILL 5 ip netns exec me ip link del veth1.10 >/dev/null 2>&1
+
+	# we can't cleanup if the command is hung trying to delete the netdev
+	if [ $? -eq 137 ]; then
+		return 1
+	fi
+
+	# cleanup
+	run_cmd "$IP link del veth1.20"
+	run_cmd "$IP nexthop flush"
+
+	return 0
+}
+
 ipv6_grp_fcnal()
 {
 	local rc
@@ -734,6 +794,9 @@ ipv6_grp_fcnal()
 
 	run_cmd "$IP nexthop add id 108 group 31/24"
 	log_test $? 2 "Nexthop group can not have a blackhole and another nexthop"
+
+	ipv6_grp_refs
+	log_test $? 0 "Nexthop group replace refcounts"
 }
 
 ipv6_res_grp_fcnal()
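
A note on the 137 checked in ipv6_grp_refs: it is 128 + 9 (SIGKILL), the
conventional shell status for a command terminated by that signal. The
user-space sketch below shows where the number comes from. The function
names shell_exit_status() and kill_hung_child() are made up for this
illustration and are not part of the selftest; the child that blocks in
pause() merely stands in for a hung "ip link del".

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Map a wait status to the value a shell conventionally reports in $?. */
int shell_exit_status(int wstatus)
{
	if (WIFSIGNALED(wstatus))
		return 128 + WTERMSIG(wstatus);
	if (WIFEXITED(wstatus))
		return WEXITSTATUS(wstatus);
	return -1;
}

/* Fork a child that blocks forever, SIGKILL it, and report its status. */
int kill_hung_child(void)
{
	int wstatus;
	pid_t pid = fork();

	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* child: never returns on its own, like the hung command */
		for (;;)
			pause();
	}

	if (kill(pid, SIGKILL) < 0 || waitpid(pid, &wstatus, 0) < 0)
		return -1;

	return shell_exit_status(wstatus);
}
```

On Linux, where SIGKILL is 9, kill_hung_child() yields 137. Any other
status from the timeout command in the selftest means the device removal
finished within the 5 second limit.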