[net-next,1/4] nexthop: Factor out hash threshold fdb nexthop selection

Message ID 20230529201914.69828-2-bpoirier@nvidia.com (mailing list archive)
State Accepted
Commit eedd47a6ec9f683f0b8d931aacca81985be55eec
Series nexthop: Refactor and fix nexthop selection for multipath routes

Commit Message

Benjamin Poirier May 29, 2023, 8:19 p.m. UTC
The loop in nexthop_select_path_hthr() includes code to check for neighbor
validity. Since this does not apply to fdb nexthops, simplify the loop by
moving the fdb nexthop selection to its own function.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
---
 net/ipv4/nexthop.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)
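
For readers unfamiliar with hash threshold selection, the following is a
minimal userspace sketch of the loop that this patch factors out. It is not
kernel code: struct demo_entry, demo_select_fdb() and demo_group are
hypothetical names invented for illustration, and a plain int stands in for
the atomic upper bound that the kernel reads with atomic_read(). The idea is
that each group entry owns the hash values up to its upper bound, so the
first entry whose bound is not below the hash is selected.

```c
#include <stddef.h>

/* Hypothetical userspace stand-in for one group entry; this is NOT the
 * kernel's struct nh_grp_entry.
 */
struct demo_entry {
	int id;          /* illustrative nexthop identifier */
	int upper_bound; /* highest hash value that maps to this entry */
};

/* Return the first entry whose upper bound is not below the hash, or NULL
 * if no entry covers the hash. No neighbor validity check is done, which
 * mirrors the fdb case described in the commit message.
 */
static const struct demo_entry *
demo_select_fdb(const struct demo_entry *entries, size_t num, int hash)
{
	size_t i;

	for (i = 0; i < num; i++) {
		if (hash > entries[i].upper_bound)
			continue;

		return &entries[i];
	}

	return NULL;
}

/* Illustrative group: three entries with ascending upper bounds. */
static const struct demo_entry demo_group[] = {
	{ .id = 1, .upper_bound = 99 },
	{ .id = 2, .upper_bound = 199 },
	{ .id = 3, .upper_bound = 299 },
};
```

With this illustrative group, a hash of 100 selects entry 2, and a hash above
299 falls off the end of the loop. The WARN_ON_ONCE(1) in the patch marks that
fall-through path, which suggests it is not expected to happen with the
bounds the kernel computes.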

Comments

David Ahern May 30, 2023, 2:57 p.m. UTC | #1
On 5/29/23 2:19 PM, Benjamin Poirier wrote:
> diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
> index f95142e56da0..27089dea0ed0 100644
> --- a/net/ipv4/nexthop.c
> +++ b/net/ipv4/nexthop.c
> @@ -1152,11 +1152,31 @@ static bool ipv4_good_nh(const struct fib_nh *nh)
>  	return !!(state & NUD_VALID);
>  }
>  
> +static struct nexthop *nexthop_select_path_fdb(struct nh_group *nhg, int hash)
> +{
> +	int i;
> +
> +	for (i = 0; i < nhg->num_nh; i++) {
> +		struct nh_grp_entry *nhge = &nhg->nh_entries[i];
> +
> +		if (hash > atomic_read(&nhge->hthr.upper_bound))
> +			continue;
> +
> +		return nhge->nh;
> +	}
> +
> +	WARN_ON_ONCE(1);

I do not see how the stack is going to provide useful information; it
should always be vxlan_xmit ... nexthop_select_path_fdb, right?

besides that:
Reviewed-by: David Ahern <dsahern@kernel.org>

Benjamin Poirier July 19, 2023, 1:54 p.m. UTC | #2
On 2023-05-30 08:57 -0600, David Ahern wrote:
> On 5/29/23 2:19 PM, Benjamin Poirier wrote:
> > diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
> > index f95142e56da0..27089dea0ed0 100644
> > --- a/net/ipv4/nexthop.c
> > +++ b/net/ipv4/nexthop.c
> > @@ -1152,11 +1152,31 @@ static bool ipv4_good_nh(const struct fib_nh *nh)
> >  	return !!(state & NUD_VALID);
> >  }
> >  
> > +static struct nexthop *nexthop_select_path_fdb(struct nh_group *nhg, int hash)
> > +{
> > +	int i;
> > +
> > +	for (i = 0; i < nhg->num_nh; i++) {
> > +		struct nh_grp_entry *nhge = &nhg->nh_entries[i];
> > +
> > +		if (hash > atomic_read(&nhge->hthr.upper_bound))
> > +			continue;
> > +
> > +		return nhge->nh;
> > +	}
> > +
> > +	WARN_ON_ONCE(1);
> 
> I do not see how the stack is going to provide useful information; it
> should always be vxlan_xmit ... nexthop_select_path_fdb, right?

Not always, it is also possible to have a resilient nhg with fdb
nexthops. In that case, nexthop_select_path_fdb() is not called. In
practice, I tried such a configuration and it does not work well. I have
prepared a fix that I'll send after the current series has been dealt
with.

Sorry for the long delay before my reply.

Patch

diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index f95142e56da0..27089dea0ed0 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -1152,11 +1152,31 @@  static bool ipv4_good_nh(const struct fib_nh *nh)
 	return !!(state & NUD_VALID);
 }
 
+static struct nexthop *nexthop_select_path_fdb(struct nh_group *nhg, int hash)
+{
+	int i;
+
+	for (i = 0; i < nhg->num_nh; i++) {
+		struct nh_grp_entry *nhge = &nhg->nh_entries[i];
+
+		if (hash > atomic_read(&nhge->hthr.upper_bound))
+			continue;
+
+		return nhge->nh;
+	}
+
+	WARN_ON_ONCE(1);
+	return NULL;
+}
+
 static struct nexthop *nexthop_select_path_hthr(struct nh_group *nhg, int hash)
 {
 	struct nexthop *rc = NULL;
 	int i;
 
+	if (nhg->fdb_nh)
+		return nexthop_select_path_fdb(nhg, hash);
+
 	for (i = 0; i < nhg->num_nh; ++i) {
 		struct nh_grp_entry *nhge = &nhg->nh_entries[i];
 		struct nh_info *nhi;
@@ -1165,8 +1185,6 @@  static struct nexthop *nexthop_select_path_hthr(struct nh_group *nhg, int hash)
 			continue;
 
 		nhi = rcu_dereference(nhge->nh->nh_info);
-		if (nhi->fdb_nh)
-			return nhge->nh;
 
 		/* nexthops always check if it is good and does
 		 * not rely on a sysctl for this behavior