From patchwork Tue Aug 8 07:52:31 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ido Schimmel
X-Patchwork-Id: 13346184
X-Patchwork-Delegate: kuba@kernel.org
From: Ido Schimmel
Subject: [PATCH net 1/3] nexthop: Fix infinite nexthop dump when using maximum nexthop ID
Date: Tue, 8 Aug 2023 10:52:31 +0300
Message-ID: <20230808075233.3337922-2-idosch@nvidia.com>
In-Reply-To: <20230808075233.3337922-1-idosch@nvidia.com>
References: <20230808075233.3337922-1-idosch@nvidia.com>
X-Mailing-List: netdev@vger.kernel.org

A netlink dump callback can return a positive number to signal that more
information needs to be dumped or zero to signal that the dump is
complete. In the second case, the core netlink code will append the
NLMSG_DONE message to the skb in order to indicate to user space that
the dump is complete.

The nexthop dump callback always returns a positive number if nexthops
were filled in the provided skb, even if the dump is complete. This
means that a dump will span at least two recvmsg() calls as long as
nexthops are present. In the last recvmsg() call the dump callback will
not fill in any nexthops because the previous call indicated that the
dump should restart from the last dumped nexthop ID plus one.

 # ip nexthop add id 1 blackhole
 # strace -e sendto,recvmsg -s 5 ip nexthop
 sendto(3, [[{nlmsg_len=24, nlmsg_type=RTM_GETNEXTHOP, nlmsg_flags=NLM_F_REQUEST|NLM_F_DUMP, nlmsg_seq=1691394315, nlmsg_pid=0}, {nh_family=AF_UNSPEC, nh_scope=RT_SCOPE_UNIVERSE, nh_protocol=RTPROT_UNSPEC, nh_flags=0}], {nlmsg_len=0, nlmsg_type=0 /* NLMSG_??? */, nlmsg_flags=0, nlmsg_seq=0, nlmsg_pid=0}], 152, 0, NULL, 0) = 152
 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 36
 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=[{nlmsg_len=36, nlmsg_type=RTM_NEWNEXTHOP, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691394315, nlmsg_pid=343}, {nh_family=AF_INET, nh_scope=RT_SCOPE_UNIVERSE, nh_protocol=RTPROT_UNSPEC, nh_flags=0}, [[{nla_len=8, nla_type=NHA_ID}, 1], {nla_len=4, nla_type=NHA_BLACKHOLE}]], iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 36
 id 1 blackhole
 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 20
 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=[{nlmsg_len=20, nlmsg_type=NLMSG_DONE, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691394315, nlmsg_pid=343}, 0], iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 20
 +++ exited with 0 +++

This behavior is both inefficient and buggy. If the last nexthop to be
dumped had the maximum ID of 0xffffffff, then the dump will restart from
0 (0xffffffff + 1) and never end:

 # ip nexthop add id $((2**32-1)) blackhole
 # ip nexthop
 id 4294967295 blackhole
 id 4294967295 blackhole
 [...]

Fix by adjusting the dump callback to return zero when the dump is
complete.
After the fix only one recvmsg() call is made and the NLMSG_DONE message
is appended to the RTM_NEWNEXTHOP response:

 # ip nexthop add id $((2**32-1)) blackhole
 # strace -e sendto,recvmsg -s 5 ip nexthop
 sendto(3, [[{nlmsg_len=24, nlmsg_type=RTM_GETNEXTHOP, nlmsg_flags=NLM_F_REQUEST|NLM_F_DUMP, nlmsg_seq=1691394080, nlmsg_pid=0}, {nh_family=AF_UNSPEC, nh_scope=RT_SCOPE_UNIVERSE, nh_protocol=RTPROT_UNSPEC, nh_flags=0}], {nlmsg_len=0, nlmsg_type=0 /* NLMSG_??? */, nlmsg_flags=0, nlmsg_seq=0, nlmsg_pid=0}], 152, 0, NULL, 0) = 152
 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 56
 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=[[{nlmsg_len=36, nlmsg_type=RTM_NEWNEXTHOP, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691394080, nlmsg_pid=342}, {nh_family=AF_INET, nh_scope=RT_SCOPE_UNIVERSE, nh_protocol=RTPROT_UNSPEC, nh_flags=0}, [[{nla_len=8, nla_type=NHA_ID}, 4294967295], {nla_len=4, nla_type=NHA_BLACKHOLE}]], [{nlmsg_len=20, nlmsg_type=NLMSG_DONE, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691394080, nlmsg_pid=342}, 0]], iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 56
 id 4294967295 blackhole
 +++ exited with 0 +++

Note that if the NLMSG_DONE message cannot be appended because of size
limitations, then another recvmsg() will be needed, but the core netlink
code will not invoke the dump callback and simply reply with a
NLMSG_DONE message since it knows that the callback previously returned
zero.

Add a test that fails before the fix:

 # ./fib_nexthops.sh -t basic
 [...]
 TEST: Maximum nexthop ID dump [FAIL]
 [...]

And passes after it:

 # ./fib_nexthops.sh -t basic
 [...]
 TEST: Maximum nexthop ID dump [ OK ]
 [...]

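The wrap-around described above can be reproduced with a small user-space
simulation. This is only a sketch under stated assumptions, not the kernel
code: struct dump_ctx, toy_dump_cb() and toy_dump() are hypothetical names,
the nexthop IDs are assumed to be sorted in ascending order, and every entry
is assumed to fit into a single buffer, so the patched callback is complete
after its first pass.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the state kept between recvmsg() calls. */
struct dump_ctx {
	uint32_t start_id;	/* first nexthop ID to consider on this pass */
};

/*
 * Toy dump callback. "Fills" every ID >= ctx->start_id and records where
 * the next pass should resume.
 * fixed == 0 mimics the old behavior: positive return if anything was filled.
 * fixed != 0 mimics the patched behavior: zero once the walk is complete.
 */
static int toy_dump_cb(struct dump_ctx *ctx, const uint32_t *ids, size_t n,
		       int fixed)
{
	uint32_t last = 0;
	int filled = 0;

	for (size_t i = 0; i < n; i++) {
		if (ids[i] < ctx->start_id)
			continue;
		last = ids[i];
		filled++;
	}
	if (filled)
		ctx->start_id = last + 1u;	/* 0xffffffff + 1 wraps to 0 */

	return fixed ? 0 : filled;
}

/*
 * Toy core loop. Invokes the callback until it returns zero.
 * Returns the number of passes, or -1 if the dump did not terminate
 * within max_passes.
 */
static int toy_dump(const uint32_t *ids, size_t n, int fixed, int max_passes)
{
	struct dump_ctx ctx = { .start_id = 0 };

	for (int pass = 1; pass <= max_passes; pass++)
		if (toy_dump_cb(&ctx, ids, n, fixed) == 0)
			return pass;
	return -1;
}
```

With a single nexthop of ID 1 the old behavior needs two passes and the
patched one needs one. With ID 0xffffffff the old behavior never returns
zero, which the sketch reports as -1.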
Fixes: ab84be7e54fc ("net: Initial nexthop code")
Reported-by: Petr Machata
Closes: https://lore.kernel.org/netdev/87sf91enuf.fsf@nvidia.com/
Signed-off-by: Ido Schimmel
Reviewed-by: Petr Machata
---
 net/ipv4/nexthop.c                          | 6 +-----
 tools/testing/selftests/net/fib_nexthops.sh | 5 +++++
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index f95142e56da0..179e50d8fe07 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -3221,13 +3221,9 @@ static int rtm_dump_nexthop(struct sk_buff *skb, struct netlink_callback *cb)
 				     &rtm_dump_nexthop_cb, &filter);
 	if (err < 0) {
 		if (likely(skb->len))
-			goto out;
-		goto out_err;
+			err = skb->len;
 	}
 
-out:
-	err = skb->len;
-out_err:
 	cb->seq = net->nexthop.seq;
 	nl_dump_check_consistent(cb, nlmsg_hdr(skb));
 	return err;
diff --git a/tools/testing/selftests/net/fib_nexthops.sh b/tools/testing/selftests/net/fib_nexthops.sh
index 0f5e88c8f4ff..10aa059b9f06 100755
--- a/tools/testing/selftests/net/fib_nexthops.sh
+++ b/tools/testing/selftests/net/fib_nexthops.sh
@@ -1981,6 +1981,11 @@ basic()
 
 	run_cmd "$IP link set dev lo up"
 
+	# Dump should not loop endlessly when maximum nexthop ID is configured.
+	run_cmd "$IP nexthop add id $((2**32-1)) blackhole"
+	run_cmd "timeout 5 $IP nexthop"
+	log_test $? 0 "Maximum nexthop ID dump"
+
 	#
 	# groups
 	#

From patchwork Tue Aug 8 07:52:32 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ido Schimmel
X-Patchwork-Id: 13346186
X-Patchwork-Delegate: kuba@kernel.org
From: Ido Schimmel
Subject: [PATCH net 2/3] nexthop: Make nexthop bucket dump more efficient
Date: Tue, 8 Aug 2023 10:52:32 +0300
Message-ID: <20230808075233.3337922-3-idosch@nvidia.com>
In-Reply-To: <20230808075233.3337922-1-idosch@nvidia.com>
References: <20230808075233.3337922-1-idosch@nvidia.com>
X-Mailing-List: netdev@vger.kernel.org

rtm_dump_nexthop_bucket_nh() is used to dump nexthop buckets belonging
to a specific resilient nexthop group. The function returns a positive
return code (the skb length) upon both success and failure.

The above behavior is problematic. When a complete nexthop bucket dump
is requested, the function that walks the different nexthops treats the
non-zero return code as an error. This causes buckets belonging to
different resilient nexthop groups to be dumped using different buffers
even if they can all fit in the same buffer:

 # ip link add name dummy1 up type dummy
 # ip nexthop add id 1 dev dummy1
 # ip nexthop add id 10 group 1 type resilient buckets 1
 # ip nexthop add id 20 group 1 type resilient buckets 1
 # strace -e recvmsg -s 0 ip nexthop bucket
 [...]
 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[...], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 64
 id 10 index 0 idle_time 10.27 nhid 1
 [...]
 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[...], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 64
 id 20 index 0 idle_time 6.44 nhid 1
 [...]

Fix by only returning a non-zero return code when an error occurred and
restarting the dump from the bucket index we failed to fill in.
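
The return-code convention described above can be illustrated with a toy
model that counts how many data buffers a bucket dump needs. It is a sketch
only: struct toy_ctx, toy_fill_group() and toy_count_buffers() are
hypothetical names, a buffer is modelled as a fixed number of bucket slots
(capacity is assumed to be at least 1), and the final NLMSG_DONE buffer is
not counted.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical resume state, loosely modelled on the dump context. */
struct toy_ctx {
	size_t group;	/* index of the group to resume from */
	size_t bucket;	/* bucket index to resume from within that group */
};

/*
 * Fill the buckets of one group into a buffer with *room free slots.
 * Returns -EMSGSIZE if the buffer filled up, after recording the bucket
 * that could not be filled. On success it returns 0, or a positive value
 * when legacy != 0 to mimic the behavior before the fix.
 */
static int toy_fill_group(struct toy_ctx *ctx, size_t nbuckets, size_t *room,
			  int legacy)
{
	for (size_t b = ctx->bucket; b < nbuckets; b++) {
		ctx->bucket = b;	/* restart point if filling fails */
		if (*room == 0)
			return -EMSGSIZE;
		(*room)--;
	}
	ctx->bucket = 0;
	ctx->group++;

	return legacy ? 1 : 0;
}

/* Count the data buffers needed to dump all groups. */
static int toy_count_buffers(const size_t *nbuckets, size_t ngroups,
			     size_t capacity, int legacy)
{
	struct toy_ctx ctx = { 0, 0 };
	int buffers = 0;

	while (ctx.group < ngroups) {
		size_t room = capacity;

		buffers++;
		/* The walker stops at the first non-zero return code. */
		while (ctx.group < ngroups &&
		       toy_fill_group(&ctx, nbuckets[ctx.group], &room,
				      legacy) == 0)
			;
	}
	return buffers;
}
```

For two groups with one bucket each, the legacy mode of the sketch uses two
buffers and the fixed mode uses one.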
This allows buckets belonging to different resilient nexthop groups to be dumped using the same buffer: # ip link add name dummy1 up type dummy # ip nexthop add id 1 dev dummy1 # ip nexthop add id 10 group 1 type resilient buckets 1 # ip nexthop add id 20 group 1 type resilient buckets 1 # strace -e recvmsg -s 0 ip nexthop bucket [...] recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[...], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 128 id 10 index 0 idle_time 30.21 nhid 1 id 20 index 0 idle_time 26.7 nhid 1 [...] While this change is more of a performance improvement change than an actual bug fix, it is a prerequisite for a subsequent patch that does fix a bug. Fixes: 8a1bbabb034d ("nexthop: Add netlink handlers for bucket dump") Signed-off-by: Ido Schimmel Reviewed-by: Petr Machata --- net/ipv4/nexthop.c | 16 +++++----------- 1 file changed, 5 insertions(+), 11 deletions(-) diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c index 179e50d8fe07..f365a4f63899 100644 --- a/net/ipv4/nexthop.c +++ b/net/ipv4/nexthop.c @@ -3363,25 +3363,19 @@ static int rtm_dump_nexthop_bucket_nh(struct sk_buff *skb, dd->filter.res_bucket_nh_id != nhge->nh->id) continue; + dd->ctx->bucket_index = bucket_index; err = nh_fill_res_bucket(skb, nh, bucket, bucket_index, RTM_NEWNEXTHOPBUCKET, portid, cb->nlh->nlmsg_seq, NLM_F_MULTI, cb->extack); - if (err < 0) { - if (likely(skb->len)) - goto out; - goto out_err; - } + if (err) + return err; } dd->ctx->done_nh_idx = dd->ctx->nh.idx + 1; - bucket_index = 0; + dd->ctx->bucket_index = 0; -out: - err = skb->len; -out_err: - dd->ctx->bucket_index = bucket_index; - return err; + return 0; } static int rtm_dump_nexthop_bucket_cb(struct sk_buff *skb, From patchwork Tue Aug 8 07:52:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ido Schimmel X-Patchwork-Id: 13346185 X-Patchwork-Delegate: kuba@kernel.org Received: from 
lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E5C7613AD7 for ; Tue, 8 Aug 2023 15:51:32 +0000 (UTC) Received: from NAM04-MW2-obe.outbound.protection.outlook.com (mail-mw2nam04on2072.outbound.protection.outlook.com [40.107.101.72]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BC91246B2 for ; Tue, 8 Aug 2023 08:51:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=WwRN6QUPeyKuOI4Q3Cbmw8aA5rD/M9OyXP/4xgOWm52ELo3Wh1JE3ZffeVK5j9NoUalX2ydgLalfDrid5StmuFuMI7nvL9fOOBcXhrXi5gqIiUgkyw92NzlfYD+GJREHSdeX/ePOXaBeY0thsv7aVodHSfED2yU4pWWIt2gIR4aEOD46dR/3tFi+1IqmBGTdQKJlF8t5s3/ZlExhPsZUSBDWWXMKNkykZ+UFofkfAX3JgJS9TRTmO53Bt4JyhyC/9Roji/a08j54YB7KMXWOtw5spGMbNaUEhBvM0hH2yIvSZbXnv7t7ieaGsj5OpHRVRiyNTgTAe333NgVukLl3xg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=QsMvnmW9BZ/h6TD5kZziA+Eo14tOrc0UpEhC18OAmOo=; b=H7xD9SNqsz7KSsgMqmGwO0WeaOjjFsSJffe1oivQA12Ec7KYTsQsH44YRZbA5maEHuKkxEo5LCBlwNqLtSv+NMzHF27gPDanN+8cf4RQLqmw7oUoMrtAkhtFMdyg2SqOCa7BDbazyTmpRyfFhbnmLEfF4cIlKxrVOl/8AKSbk230LNcpHifwLuhD7+Z46CKsp6MmASBSB/Q2DeC1IwNb3lYsUG89VXlMR3qFiq4KRYVk2USQhxIMUNaYHZtQYb2HHb125ZT85JUicB/qec5ebUd7SS/gGNAs1v0dLRVw3nB5yBKFbLpSeqBJJFXQJsG978axepuCqIfOktOgU82HDg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.117.160) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=nvidia.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=QsMvnmW9BZ/h6TD5kZziA+Eo14tOrc0UpEhC18OAmOo=; b=qM1//o80YmPga5eE5ysl1UVAM9Pxj42K789+06bOY0BiiYrUQQaCYtsAyKMJyktxFy213KMxre/WjSBghvg1rpyMylkt8Xi+YaOuo81G6yjUDb43pWRVbDXXYWG0rXdoVpaPTmtcc0GiCJy+21MBznsODFTbAZF2jtC28yH21v3pDfm2ihMVOp2c9bY8lcmzPQmpZAS1KT3C/3D7W2WzL3et4v41iC9NNZXn2Vw5QehJU4e79bKMkE/I0o0KOQD0Tg+1kp5ahloqoCsOrX3dnMqRMGVLFU85oLSLjv31l0FVd3sa26VqwgJGUkXxWW1tbefk+kNy0M5uKmL1cNxEbQ== Received: from SA1PR05CA0016.namprd05.prod.outlook.com (2603:10b6:806:2d2::25) by CH3PR12MB8076.namprd12.prod.outlook.com (2603:10b6:610:127::11) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6652.27; Tue, 8 Aug 2023 07:53:16 +0000 Received: from SA2PEPF00001507.namprd04.prod.outlook.com (2603:10b6:806:2d2:cafe::5d) by SA1PR05CA0016.outlook.office365.com (2603:10b6:806:2d2::25) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6678.16 via Frontend Transport; Tue, 8 Aug 2023 07:53:19 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.117.160) smtp.mailfrom=nvidia.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.117.160 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.117.160; helo=mail.nvidia.com; pr=C Received: from mail.nvidia.com (216.228.117.160) by SA2PEPF00001507.mail.protection.outlook.com (10.167.242.39) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6652.19 via Frontend Transport; Tue, 8 Aug 2023 07:53:19 +0000 Received: from rnnvmail201.nvidia.com (10.129.68.8) by mail.nvidia.com (10.129.200.66) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.986.5; Tue, 8 Aug 2023 00:53:10 -0700 Received: from 
dev-r-vrt-155.mtr.labs.mlnx (10.126.230.35) by rnnvmail201.nvidia.com (10.129.68.8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.986.37; Tue, 8 Aug 2023 00:53:07 -0700 From: Ido Schimmel To: CC: , , , , , , Ido Schimmel Subject: [PATCH net 3/3] nexthop: Fix infinite nexthop bucket dump when using maximum nexthop ID Date: Tue, 8 Aug 2023 10:52:33 +0300 Message-ID: <20230808075233.3337922-4-idosch@nvidia.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230808075233.3337922-1-idosch@nvidia.com> References: <20230808075233.3337922-1-idosch@nvidia.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Originating-IP: [10.126.230.35] X-ClientProxiedBy: rnnvmail202.nvidia.com (10.129.68.7) To rnnvmail201.nvidia.com (10.129.68.8) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SA2PEPF00001507:EE_|CH3PR12MB8076:EE_ X-MS-Office365-Filtering-Correlation-Id: f0eed962-3046-437c-87ba-08db97e48864 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
wfp5FoQGlUX/rvVQU3hyY9PeD2DCWisC2X0pEzOXe9sRfGxb0o80s0O3HNlFIFQJXsd0z39BiTCh7dbo+YKlAMxOp9j30UoE73PF3kjtI+yoqrCLtnWHb5/ovHxRKctoXzZuIC0Y62yE6Q7p5PUhAtCY8FDvaLtpLy3iZ6hIqVTHprqw2gei5CCIoWITEir3FJer6ezlF+1qDqpOgpgFs2udwdRYCiOJfnJBSfvzvRbJCXoK9VSKw4Oo1kUptmb1lwg0xZDACsNmJ5N3NpLN9nxwoL+9A3fL8EdHFy7V2cx2nbnAfBnBUamgB+cEMuJmhjHVPY3RrR6thvXihY99EhJNBT9s+TQfFoEu7CfQr0H/Z+YnOkoIaJMcMhWITmZpYB4OWKNMFw4II52DiHdnvr0VUc/lM/9kThG//rIghOiRyDdEn8IXXJzefKjvzzxiR9FiMZ8jb2859Jy+I4nUKaqj5uQBDybGAtrliAHeW/4GFxYKmSRHv5MXFBNKF6mrCgY7RaRwK5gIs8OO22HnzB2e5quvHRzRvYk2PKZJZhMoZpu8iwa40mwgcFDZdS5VC5qzg8E1/I65CvZppljqdnwtD+derbU3HwWpyesV8sciKBBk5RrP3DBNPzOgNVmpMEfw3RWjfSZKXtkETCoXGyTzR4xV2smA0+b5VTG5H7Qv6PcprK2VHj7wdcA8mP9Sb0g/qBiRbRRdwDYDpZTB3uDjUk/Gmo/9IqYcPhfMXCOtC0HePTuaaVq/ipy9H7hoIFlqojtJgb5BN822DJrohH9Jz8SthV5DjOERCICVQ9VgdVJ55J8DOIpCpJ0dzWXf X-Forefront-Antispam-Report: CIP:216.228.117.160;CTRY:US;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:mail.nvidia.com;PTR:dc6edge1.nvidia.com;CAT:NONE;SFS:(13230028)(4636009)(376002)(346002)(396003)(136003)(39860400002)(90011799007)(451199021)(82310400008)(90021799007)(1800799003)(186006)(46966006)(40470700004)(36840700001)(8676002)(8936002)(5660300002)(6916009)(4326008)(41300700001)(426003)(316002)(47076005)(83380400001)(40480700001)(86362001)(40460700003)(36860700001)(2906002)(6666004)(2616005)(26005)(1076003)(107886003)(36756003)(16526019)(336012)(70586007)(70206006)(7636003)(356005)(478600001)(82740400003)(54906003);DIR:OUT;SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 08 Aug 2023 07:53:19.8278 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: f0eed962-3046-437c-87ba-08db97e48864 X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a;Ip=[216.228.117.160];Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: SA2PEPF00001507.namprd04.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: 
Anonymous
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: CH3PR12MB8076
X-Spam-Status: No, score=-1.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,
	DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FORGED_SPF_HELO,
	RCVD_IN_DNSWL_BLOCKED,RCVD_IN_MSPIKE_H2,SPF_HELO_PASS,SPF_NONE,
	URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
X-Patchwork-Delegate: kuba@kernel.org

A netlink dump callback can return a positive number to signal that more
information needs to be dumped or zero to signal that the dump is
complete. In the second case, the core netlink code will append the
NLMSG_DONE message to the skb in order to indicate to user space that
the dump is complete.

The nexthop bucket dump callback always returns a positive number if
nexthop buckets were filled in the provided skb, even if the dump is
complete. This means that a dump will span at least two recvmsg() calls
as long as nexthop buckets are present. In the last recvmsg() call the
dump callback will not fill in any nexthop buckets because the previous
call indicated that the dump should restart from the last dumped nexthop
ID plus one.

 # ip link add name dummy1 up type dummy
 # ip nexthop add id 1 dev dummy1
 # ip nexthop add id 10 group 1 type resilient buckets 2
 # strace -e sendto,recvmsg -s 5 ip nexthop bucket
 sendto(3, [[{nlmsg_len=24, nlmsg_type=RTM_GETNEXTHOPBUCKET, nlmsg_flags=NLM_F_REQUEST|NLM_F_DUMP, nlmsg_seq=1691396980, nlmsg_pid=0}, {family=AF_UNSPEC, data="\x00\x00\x00\x00\x00"...}], {nlmsg_len=0, nlmsg_type=0 /* NLMSG_??? */, nlmsg_flags=0, nlmsg_seq=0, nlmsg_pid=0}], 152, 0, NULL, 0) = 152
 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 128
 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=[[{nlmsg_len=64, nlmsg_type=RTM_NEWNEXTHOPBUCKET, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691396980, nlmsg_pid=347}, {family=AF_UNSPEC, data="\x00\x00\x00\x00\x00"...}], [{nlmsg_len=64, nlmsg_type=RTM_NEWNEXTHOPBUCKET, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691396980, nlmsg_pid=347}, {family=AF_UNSPEC, data="\x00\x00\x00\x00\x00"...}]], iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 128
 id 10 index 0 idle_time 6.66 nhid 1
 id 10 index 1 idle_time 6.66 nhid 1
 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 20
 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=[{nlmsg_len=20, nlmsg_type=NLMSG_DONE, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691396980, nlmsg_pid=347}, 0], iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 20
 +++ exited with 0 +++

This behavior is both inefficient and buggy. If the last nexthop to be
dumped had the maximum ID of 0xffffffff, then the dump will restart from
0 (0xffffffff + 1) and never end:

 # ip link add name dummy1 up type dummy
 # ip nexthop add id 1 dev dummy1
 # ip nexthop add id $((2**32-1)) group 1 type resilient buckets 2
 # ip nexthop bucket
 id 4294967295 index 0 idle_time 5.55 nhid 1
 id 4294967295 index 1 idle_time 5.55 nhid 1
 id 4294967295 index 0 idle_time 5.55 nhid 1
 id 4294967295 index 1 idle_time 5.55 nhid 1
 [...]

Fix by adjusting the dump callback to return zero when the dump is
complete.
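
The restart arithmetic can be tried out in user space with a toy model.
The sketch below is not kernel code and makes simplifying assumptions:
a single resilient group whose buckets always fit in one skb, and the
hypothetical helpers next_start, dump_once and count_calls, which exist
only for this illustration. It counts how many times the callback would
be invoked before it returns zero, with a cap standing in for "never
ends".

```shell
#!/bin/sh
# Toy user-space model of the dump restart logic; none of this is kernel code.
MAX_ID=4294967295	# 0xffffffff, the same value as $((2**32-1)) in the patch

# Next start ID as a 32-bit unsigned value: 0xffffffff + 1 wraps to 0.
next_start() {
	echo $(( ($1 + 1) & 4294967295 ))
}

# One callback invocation for a table holding a single group.
# $1 = start ID, $2 = group ID, $3 = "old" or "fixed".
# Prints "<ret> <next start ID>"; ret > 0 means "call me again".
dump_once() {
	if [ "$2" -ge "$1" ]; then
		# Buckets were filled. The old code then always asked for another call.
		if [ "$3" = old ]; then ret=1; else ret=0; fi
		echo "$ret $(next_start "$2")"
	else
		# Nothing left at or above the start ID: the dump is complete.
		echo "0 $1"
	fi
}

# Number of callback invocations until ret == 0, capped at $3.
# $1 = group ID, $2 = "old" or "fixed", $3 = cap.
count_calls() {
	start=0
	calls=0
	while [ "$calls" -lt "$3" ]; do
		# Append the callback's two output fields as $4 (ret) and $5 (next start).
		set -- "$1" "$2" "$3" $(dump_once "$start" "$1" "$2")
		calls=$(( calls + 1 ))
		start=$5
		[ "$4" -eq 0 ] && break
	done
	echo "$calls"
}

for mode in old fixed; do
	echo "$mode: ID 10 -> $(count_calls 10 "$mode" 5) call(s)," \
	     "max ID -> $(count_calls "$MAX_ID" "$mode" 5) call(s)"
done
```

In this model the old behavior needs two invocations for ID 10 and hits
the cap of five for the maximum ID, because the start ID wraps back to
0 each time. The fixed behavior needs a single invocation in both cases.
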
After the fix only one recvmsg() call is made and the NLMSG_DONE message
is appended to the RTM_NEWNEXTHOPBUCKET responses:

 # ip link add name dummy1 up type dummy
 # ip nexthop add id 1 dev dummy1
 # ip nexthop add id $((2**32-1)) group 1 type resilient buckets 2
 # strace -e sendto,recvmsg -s 5 ip nexthop bucket
 sendto(3, [[{nlmsg_len=24, nlmsg_type=RTM_GETNEXTHOPBUCKET, nlmsg_flags=NLM_F_REQUEST|NLM_F_DUMP, nlmsg_seq=1691396737, nlmsg_pid=0}, {family=AF_UNSPEC, data="\x00\x00\x00\x00\x00"...}], {nlmsg_len=0, nlmsg_type=0 /* NLMSG_??? */, nlmsg_flags=0, nlmsg_seq=0, nlmsg_pid=0}], 152, 0, NULL, 0) = 152
 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 148
 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=[[{nlmsg_len=64, nlmsg_type=RTM_NEWNEXTHOPBUCKET, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691396737, nlmsg_pid=350}, {family=AF_UNSPEC, data="\x00\x00\x00\x00\x00"...}], [{nlmsg_len=64, nlmsg_type=RTM_NEWNEXTHOPBUCKET, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691396737, nlmsg_pid=350}, {family=AF_UNSPEC, data="\x00\x00\x00\x00\x00"...}], [{nlmsg_len=20, nlmsg_type=NLMSG_DONE, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691396737, nlmsg_pid=350}, 0]], iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 148
 id 4294967295 index 0 idle_time 6.61 nhid 1
 id 4294967295 index 1 idle_time 6.61 nhid 1
 +++ exited with 0 +++

Note that if the NLMSG_DONE message cannot be appended because of size
limitations, then another recvmsg() will be needed, but the core netlink
code will not invoke the dump callback and simply reply with a
NLMSG_DONE message since it knows that the callback previously returned
zero.

Add a test that fails before the fix:

 # ./fib_nexthops.sh -t basic_res
 [...]
 TEST: Maximum nexthop ID dump [FAIL]
 [...]

And passes after it:

 # ./fib_nexthops.sh -t basic_res
 [...]
 TEST: Maximum nexthop ID dump [ OK ]
 [...]

Fixes: 8a1bbabb034d ("nexthop: Add netlink handlers for bucket dump")
Signed-off-by: Ido Schimmel
Reviewed-by: Petr Machata
---
 net/ipv4/nexthop.c                          | 6 +-----
 tools/testing/selftests/net/fib_nexthops.sh | 5 +++++
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index f365a4f63899..be5498f5dd31 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -3424,13 +3424,9 @@ static int rtm_dump_nexthop_bucket(struct sk_buff *skb,
 
 	if (err < 0) {
 		if (likely(skb->len))
-			goto out;
-		goto out_err;
+			err = skb->len;
 	}
 
-out:
-	err = skb->len;
-out_err:
 	cb->seq = net->nexthop.seq;
 	nl_dump_check_consistent(cb, nlmsg_hdr(skb));
 	return err;
diff --git a/tools/testing/selftests/net/fib_nexthops.sh b/tools/testing/selftests/net/fib_nexthops.sh
index 10aa059b9f06..df8d90b51867 100755
--- a/tools/testing/selftests/net/fib_nexthops.sh
+++ b/tools/testing/selftests/net/fib_nexthops.sh
@@ -2206,6 +2206,11 @@ basic_res()
 	run_cmd "$IP nexthop bucket list fdb"
 	log_test $? 255 "Dump all nexthop buckets with invalid 'fdb' keyword"
 
+	# Dump should not loop endlessly when maximum nexthop ID is configured.
+	run_cmd "$IP nexthop add id $((2**32-1)) group 1/2 type resilient buckets 4"
+	run_cmd "timeout 5 $IP nexthop bucket"
+	log_test $? 0 "Maximum nexthop ID dump"
+
 	#
 	# resilient nexthop buckets get requests
 	#