
[net-next,1/2] nexthop: Restart nexthop dump based on last dumped nexthop identifier

Message ID 20210416155535.1694714-2-idosch@idosch.org (mailing list archive)
State Accepted
Commit 9e46fb656fdb40baec33a8942743d81a40f30fd3
Delegated to: Netdev Maintainers
Series nexthop: Support large scale nexthop flushing

Checks

Context                         Check    Description
netdev/cover_letter             success  Link
netdev/fixes_present            success  Link
netdev/patch_count              success  Link
netdev/tree_selection           success  Clearly marked for net-next
netdev/subject_prefix           success  Link
netdev/cc_maintainers           warning  2 maintainers not CCed: yoshfuji@linux-ipv6.org dsahern@kernel.org
netdev/source_inline            success  Was 0 now: 0
netdev/verify_signedoff         success  Link
netdev/module_param             success  Was 0 now: 0
netdev/build_32bit              success  Errors and warnings before: 2 this patch: 2
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/verify_fixes             success  Link
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 32 lines checked
netdev/build_allmodconfig_warn  success  Errors and warnings before: 2 this patch: 2
netdev/header_inline            success  Link

Commit Message

Ido Schimmel April 16, 2021, 3:55 p.m. UTC
From: Ido Schimmel <idosch@nvidia.com>

Currently, a multi-part nexthop dump is restarted based on the number of
nexthops that have been dumped so far. This can result in a lot of
nexthops not being dumped when nexthops are simultaneously deleted:

 # ip nexthop | wc -l
 65536
 # ip nexthop flush
 Dump was interrupted and may be inconsistent.
 Flushed 36040 nexthops
 # ip nexthop | wc -l
 29496

Instead, restart the dump based on the nexthop identifier (fixed number)
of the last successfully dumped nexthop:

 # ip nexthop | wc -l
 65536
 # ip nexthop flush
 Dump was interrupted and may be inconsistent.
 Flushed 65536 nexthops
 # ip nexthop | wc -l
 0

Reported-by: Maksym Yaremchuk <maksymy@nvidia.com>
Tested-by: Maksym Yaremchuk <maksymy@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
---
 net/ipv4/nexthop.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

Comments

David Ahern April 18, 2021, 5:06 p.m. UTC | #1
On 4/16/21 8:55 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> Currently, a multi-part nexthop dump is restarted based on the number of
> nexthops that have been dumped so far. This can result in a lot of
> nexthops not being dumped when nexthops are simultaneously deleted:
> 
>  # ip nexthop | wc -l
>  65536
>  # ip nexthop flush
>  Dump was interrupted and may be inconsistent.
>  Flushed 36040 nexthops
>  # ip nexthop | wc -l
>  29496
> 
> Instead, restart the dump based on the nexthop identifier (fixed number)
> of the last successfully dumped nexthop:
> 
>  # ip nexthop | wc -l
>  65536
>  # ip nexthop flush
>  Dump was interrupted and may be inconsistent.
>  Flushed 65536 nexthops
>  # ip nexthop | wc -l
>  0
> 
> Reported-by: Maksym Yaremchuk <maksymy@nvidia.com>
> Tested-by: Maksym Yaremchuk <maksymy@nvidia.com>
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> Reviewed-by: Petr Machata <petrm@nvidia.com>
> ---
>  net/ipv4/nexthop.c | 14 ++++++--------
>  1 file changed, 6 insertions(+), 8 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@kernel.org>

Any reason not to put this in -net with a Fixes tag?

Ido Schimmel April 19, 2021, 5:48 a.m. UTC | #2
On Sun, Apr 18, 2021 at 10:06:41AM -0700, David Ahern wrote:
> On 4/16/21 8:55 AM, Ido Schimmel wrote:
> > From: Ido Schimmel <idosch@nvidia.com>
> > 
> > Currently, a multi-part nexthop dump is restarted based on the number of
> > nexthops that have been dumped so far. This can result in a lot of
> > nexthops not being dumped when nexthops are simultaneously deleted:
> > 
> >  # ip nexthop | wc -l
> >  65536
> >  # ip nexthop flush
> >  Dump was interrupted and may be inconsistent.
> >  Flushed 36040 nexthops
> >  # ip nexthop | wc -l
> >  29496
> > 
> > Instead, restart the dump based on the nexthop identifier (fixed number)
> > of the last successfully dumped nexthop:
> > 
> >  # ip nexthop | wc -l
> >  65536
> >  # ip nexthop flush
> >  Dump was interrupted and may be inconsistent.
> >  Flushed 65536 nexthops
> >  # ip nexthop | wc -l
> >  0
> > 
> > Reported-by: Maksym Yaremchuk <maksymy@nvidia.com>
> > Tested-by: Maksym Yaremchuk <maksymy@nvidia.com>
> > Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> > Reviewed-by: Petr Machata <petrm@nvidia.com>
> > ---
> >  net/ipv4/nexthop.c | 14 ++++++--------
> >  1 file changed, 6 insertions(+), 8 deletions(-)
> > 
> 
> Reviewed-by: David Ahern <dsahern@kernel.org>

Thanks

> 
> Any reason not to put this in -net with a Fixes tag?

I put it in the cover letter:

"Targeting at net-next since this use case never worked, the flow is
pretty obscure and such a large number of nexthops is unlikely to be
used in any real-world scenario."

Patch

diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index 5a2fc8798d20..4075230b14c6 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -3140,26 +3140,24 @@  static int rtm_dump_walk_nexthops(struct sk_buff *skb,
 				  void *data)
 {
 	struct rb_node *node;
-	int idx = 0, s_idx;
+	int s_idx;
 	int err;
 
 	s_idx = ctx->idx;
 	for (node = rb_first(root); node; node = rb_next(node)) {
 		struct nexthop *nh;
 
-		if (idx < s_idx)
-			goto cont;
-
 		nh = rb_entry(node, struct nexthop, rb_node);
-		ctx->idx = idx;
+		if (nh->id < s_idx)
+			continue;
+
+		ctx->idx = nh->id;
 		err = nh_cb(skb, cb, nh, data);
 		if (err)
 			return err;
-cont:
-		idx++;
 	}
 
-	ctx->idx = idx;
+	ctx->idx++;
 	return 0;
 }
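
The difference between the two restart strategies can be reproduced in a
small userspace model. The sketch below is not kernel code: the sorted
array stands in for the rb-tree, and struct nh_table, dump_part(),
flush() and the batch parameter are hypothetical names introduced only
for illustration. It assumes that each dump part carries at most a fixed
number of entries and that the flusher deletes every nexthop it received
before requesting the next part.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NH 1024

/* Sorted array of live nexthop IDs; a stand-in for the kernel rb-tree. */
struct nh_table {
	unsigned int ids[MAX_NH];
	size_t count;
};

/* Populate the table with IDs 1..n. */
static void table_init(struct nh_table *t, size_t n)
{
	assert(n <= MAX_NH);
	for (size_t i = 0; i < n; i++)
		t->ids[i] = (unsigned int)i + 1;
	t->count = n;
}

/* Remove one ID, keeping the array sorted. */
static void table_delete(struct nh_table *t, unsigned int id)
{
	for (size_t i = 0; i < t->count; i++) {
		if (t->ids[i] != id)
			continue;
		for (size_t j = i + 1; j < t->count; j++)
			t->ids[j - 1] = t->ids[j];
		t->count--;
		return;
	}
}

/*
 * Emit one part of a multi-part dump, at most 'batch' entries.
 * by_id == false: the cursor is a position in the walk (old behaviour).
 * by_id == true:  the cursor is a nexthop ID (behaviour after the patch).
 */
static size_t dump_part(const struct nh_table *t, bool by_id,
			unsigned int *cursor, unsigned int *out,
			size_t batch, bool *done)
{
	unsigned int start = *cursor;
	size_t emitted = 0;

	*done = false;
	for (size_t pos = 0; pos < t->count; pos++) {
		unsigned int key = by_id ? t->ids[pos] : (unsigned int)pos;

		if (key < start)
			continue;
		*cursor = key;		/* resume here if this entry does not fit */
		if (emitted == batch)
			return emitted;	/* part is full, more parts follow */
		out[emitted++] = t->ids[pos];
	}

	/* Walk complete: move the cursor past everything seen so far. */
	*cursor = by_id ? *cursor + 1 : (unsigned int)t->count;
	*done = true;
	return emitted;
}

/* Dump part by part and delete every dumped nexthop between parts. */
static size_t flush(struct nh_table *t, bool by_id, size_t batch)
{
	unsigned int part[MAX_NH];
	unsigned int cursor = 0;
	size_t flushed = 0;
	bool done = false;

	assert(batch > 0 && batch <= MAX_NH);
	while (!done) {
		size_t n = dump_part(t, by_id, &cursor, part, batch, &done);

		for (size_t i = 0; i < n; i++)
			table_delete(t, part[i]);
		flushed += n;
	}
	return flushed;
}
```

With 16 nexthops and a batch of 4, flush(&t, false, 4) in this model
removes only 8 entries, because every deletion shifts the surviving
nexthops to lower positions and the positional cursor then skips over
them. flush(&t, true, 4) removes all 16, since a nexthop's ID does not
change when its neighbours are deleted.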