diff mbox series

[1/4] net: arp: introduce arp_evict_nocarrier sysctl parameter

Message ID 20211013222710.4162634-1-prestwoj@gmail.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [1/4] net: arp: introduce arp_evict_nocarrier sysctl parameter | expand

Checks

Context Check Description
netdev/tree_selection success Guessing tree name failed - patch did not apply

Commit Message

James Prestwood Oct. 13, 2021, 10:27 p.m. UTC
This change introduces a new sysctl parameter, arp_evict_nocarrier.
When set (default) the ARP cache will be cleared on a NOCARRIER event.
This new option has been defaulted to '1' which maintains existing
behavior.

Clearing the ARP cache on NOCARRIER is relatively new, introduced by:

commit 859bd2ef1fc1110a8031b967ee656c53a6260a76
Author: David Ahern <dsahern@gmail.com>
Date:   Thu Oct 11 20:33:49 2018 -0700

    net: Evict neighbor entries on carrier down

The reason for this changes is to prevent the ARP cache from being
cleared when a wireless device roams. Specifically for wireless roams
the ARP cache should not be cleared because the underlying network has not
changed. Clearing the ARP cache in this case can introduce significant
delays sending out packets after a roam.

A user reported such a situation here:

https://lore.kernel.org/linux-wireless/CACsRnHWa47zpx3D1oDq9JYnZWniS8yBwW1h0WAVZ6vrbwL_S0w@mail.gmail.com/

After some investigation it was found that the kernel was holding onto
packets until ARP finished which resulted in this 1 second delay. It
was also found that the first ARP who-has was never responded to,
which is actually what caues the delay. This change is more or less
working around this behavior, but again, there is no reason to clear
the cache on a roam anyways.

As for the unanswered who-has, we know the packet made it OTA since
it was seen while monitoring. Why it never received a response is
unknown.

Signed-off-by: James Prestwood <prestwoj@gmail.com>
---
 include/linux/inetdevice.h  | 1 +
 include/uapi/linux/ip.h     | 1 +
 include/uapi/linux/sysctl.h | 1 +
 net/ipv4/arp.c              | 4 +++-
 net/ipv4/devinet.c          | 4 ++++
 5 files changed, 10 insertions(+), 1 deletion(-)

Comments

Jakub Kicinski Oct. 13, 2021, 11:37 p.m. UTC | #1
On Wed, 13 Oct 2021 15:27:07 -0700 James Prestwood wrote:
> This change introduces a new sysctl parameter, arp_evict_nocarrier.
> When set (default) the ARP cache will be cleared on a NOCARRIER event.
> This new option has been defaulted to '1' which maintains existing
> behavior.
> 
> Clearing the ARP cache on NOCARRIER is relatively new, introduced by:
> 
> commit 859bd2ef1fc1110a8031b967ee656c53a6260a76
> Author: David Ahern <dsahern@gmail.com>
> Date:   Thu Oct 11 20:33:49 2018 -0700
> 
>     net: Evict neighbor entries on carrier down
> 
> The reason for this changes is to prevent the ARP cache from being
> cleared when a wireless device roams. Specifically for wireless roams
> the ARP cache should not be cleared because the underlying network has not
> changed. Clearing the ARP cache in this case can introduce significant
> delays sending out packets after a roam.
> 
> A user reported such a situation here:
> 
> https://lore.kernel.org/linux-wireless/CACsRnHWa47zpx3D1oDq9JYnZWniS8yBwW1h0WAVZ6vrbwL_S0w@mail.gmail.com/
> 
> After some investigation it was found that the kernel was holding onto
> packets until ARP finished which resulted in this 1 second delay. It
> was also found that the first ARP who-has was never responded to,
> which is actually what caues the delay. This change is more or less
> working around this behavior, but again, there is no reason to clear
> the cache on a roam anyways.
> 
> As for the unanswered who-has, we know the packet made it OTA since
> it was seen while monitoring. Why it never received a response is
> unknown.
> 
> Signed-off-by: James Prestwood <prestwoj@gmail.com>

Seems sensible at a glance, some quick feedback.

Please make sure you run ./scripts/get_maintainers.pl on the patch
and add appropriate folks to CC.

Please rebase the code on top of this tree:
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/

> diff --git a/include/linux/inetdevice.h b/include/linux/inetdevice.h
> index 53aa0343bf69..63180170fdbd 100644
> --- a/include/linux/inetdevice.h
> +++ b/include/linux/inetdevice.h
> @@ -133,6 +133,7 @@ static inline void ipv4_devconf_setall(struct in_device *in_dev)
>  #define IN_DEV_ARP_ANNOUNCE(in_dev)	IN_DEV_MAXCONF((in_dev), ARP_ANNOUNCE)
>  #define IN_DEV_ARP_IGNORE(in_dev)	IN_DEV_MAXCONF((in_dev), ARP_IGNORE)
>  #define IN_DEV_ARP_NOTIFY(in_dev)	IN_DEV_MAXCONF((in_dev), ARP_NOTIFY)
> +#define IN_DEV_ARP_EVICT_NOCARRIER(in_dev) IN_DEV_CONF_GET((in_dev), ARP_EVICT_NOCARRIER)

IN_DEV_ANDCONF() makes most sense, I'd think.

>  struct in_ifaddr {
>  	struct hlist_node	hash;

> diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
> index 922dd73e5740..50cfe4f37089 100644
> --- a/net/ipv4/arp.c
> +++ b/net/ipv4/arp.c
> @@ -1247,6 +1247,7 @@ static int arp_netdev_event(struct notifier_block *this, unsigned long event,
>  {
>  	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
>  	struct netdev_notifier_change_info *change_info;
> +	struct in_device *in_dev = __in_dev_get_rcu(dev);

Don't we need to hold the RCU lock to call this?

>  	switch (event) {
>  	case NETDEV_CHANGEADDR:
> @@ -1257,7 +1258,8 @@ static int arp_netdev_event(struct notifier_block *this, unsigned long event,
>  		change_info = ptr;
>  		if (change_info->flags_changed & IFF_NOARP)
>  			neigh_changeaddr(&arp_tbl, dev);
> -		if (!netif_carrier_ok(dev))
> +		if (IN_DEV_ARP_EVICT_NOCARRIER(in_dev) &&
> +		    !netif_carrier_ok(dev))
>  			neigh_carrier_down(&arp_tbl, dev);
>  		break;
>  	default:
Jakub Kicinski Oct. 13, 2021, 11:40 p.m. UTC | #2
On Wed, 13 Oct 2021 16:37:14 -0700 Jakub Kicinski wrote:
> Please make sure you run ./scripts/get_maintainers.pl on the patch
> and add appropriate folks to CC.

Ah, and beyond that please make sure to CC wireless folks.
The linux-wireless mailing list, Johannes, Jouni, etc.
The question of the "real root cause" is slightly under-explained.
Sounds like something restores carrier before the device is actually up.
Daniel Borkmann Oct. 14, 2021, 8:30 a.m. UTC | #3
[ Adding few more to Cc ]

On 10/14/21 12:27 AM, James Prestwood wrote:
> This change introduces a new sysctl parameter, arp_evict_nocarrier.
> When set (default) the ARP cache will be cleared on a NOCARRIER event.
> This new option has been defaulted to '1' which maintains existing
> behavior.
> 
> Clearing the ARP cache on NOCARRIER is relatively new, introduced by:
> 
> commit 859bd2ef1fc1110a8031b967ee656c53a6260a76
> Author: David Ahern <dsahern@gmail.com>
> Date:   Thu Oct 11 20:33:49 2018 -0700
> 
>      net: Evict neighbor entries on carrier down
> 
> The reason for this changes is to prevent the ARP cache from being
> cleared when a wireless device roams. Specifically for wireless roams
> the ARP cache should not be cleared because the underlying network has not
> changed. Clearing the ARP cache in this case can introduce significant
> delays sending out packets after a roam.
> 
> A user reported such a situation here:
> 
> https://lore.kernel.org/linux-wireless/CACsRnHWa47zpx3D1oDq9JYnZWniS8yBwW1h0WAVZ6vrbwL_S0w@mail.gmail.com/
> 
> After some investigation it was found that the kernel was holding onto
> packets until ARP finished which resulted in this 1 second delay. It
> was also found that the first ARP who-has was never responded to,
> which is actually what caues the delay. This change is more or less
> working around this behavior, but again, there is no reason to clear
> the cache on a roam anyways.
> 
> As for the unanswered who-has, we know the packet made it OTA since
> it was seen while monitoring. Why it never received a response is
> unknown.

Wouldn't it make more sense to extend neigh_flush_dev() where we skip eviction
of NUD_PERMANENT (see the skip_perm condition)? Either as a per table setting
(tbl->parms) or as a NTF_EXT_* flag for specific neighbors?

> Signed-off-by: James Prestwood <prestwoj@gmail.com>
> ---
>   include/linux/inetdevice.h  | 1 +
>   include/uapi/linux/ip.h     | 1 +
>   include/uapi/linux/sysctl.h | 1 +
>   net/ipv4/arp.c              | 4 +++-
>   net/ipv4/devinet.c          | 4 ++++
>   5 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/inetdevice.h b/include/linux/inetdevice.h
> index 53aa0343bf69..63180170fdbd 100644
> --- a/include/linux/inetdevice.h
> +++ b/include/linux/inetdevice.h
> @@ -133,6 +133,7 @@ static inline void ipv4_devconf_setall(struct in_device *in_dev)
>   #define IN_DEV_ARP_ANNOUNCE(in_dev)	IN_DEV_MAXCONF((in_dev), ARP_ANNOUNCE)
>   #define IN_DEV_ARP_IGNORE(in_dev)	IN_DEV_MAXCONF((in_dev), ARP_IGNORE)
>   #define IN_DEV_ARP_NOTIFY(in_dev)	IN_DEV_MAXCONF((in_dev), ARP_NOTIFY)
> +#define IN_DEV_ARP_EVICT_NOCARRIER(in_dev) IN_DEV_CONF_GET((in_dev), ARP_EVICT_NOCARRIER)
>   
>   struct in_ifaddr {
>   	struct hlist_node	hash;
> diff --git a/include/uapi/linux/ip.h b/include/uapi/linux/ip.h
> index e42d13b55cf3..e00bbb9c47bb 100644
> --- a/include/uapi/linux/ip.h
> +++ b/include/uapi/linux/ip.h
> @@ -169,6 +169,7 @@ enum
>   	IPV4_DEVCONF_DROP_UNICAST_IN_L2_MULTICAST,
>   	IPV4_DEVCONF_DROP_GRATUITOUS_ARP,
>   	IPV4_DEVCONF_BC_FORWARDING,
> +	IPV4_DEVCONF_ARP_EVICT_NOCARRIER,
>   	__IPV4_DEVCONF_MAX
>   };
>   
> diff --git a/include/uapi/linux/sysctl.h b/include/uapi/linux/sysctl.h
> index 1e05d3caa712..6a3b194c50fe 100644
> --- a/include/uapi/linux/sysctl.h
> +++ b/include/uapi/linux/sysctl.h
> @@ -482,6 +482,7 @@ enum
>   	NET_IPV4_CONF_PROMOTE_SECONDARIES=20,
>   	NET_IPV4_CONF_ARP_ACCEPT=21,
>   	NET_IPV4_CONF_ARP_NOTIFY=22,
> +	NET_IPV4_CONF_ARP_EVICT_NOCARRIER=23,
>   };
>   
>   /* /proc/sys/net/ipv4/netfilter */
> diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
> index 922dd73e5740..50cfe4f37089 100644
> --- a/net/ipv4/arp.c
> +++ b/net/ipv4/arp.c
> @@ -1247,6 +1247,7 @@ static int arp_netdev_event(struct notifier_block *this, unsigned long event,
>   {
>   	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
>   	struct netdev_notifier_change_info *change_info;
> +	struct in_device *in_dev = __in_dev_get_rcu(dev);
>   
>   	switch (event) {
>   	case NETDEV_CHANGEADDR:
> @@ -1257,7 +1258,8 @@ static int arp_netdev_event(struct notifier_block *this, unsigned long event,
>   		change_info = ptr;
>   		if (change_info->flags_changed & IFF_NOARP)
>   			neigh_changeaddr(&arp_tbl, dev);
> -		if (!netif_carrier_ok(dev))
> +		if (IN_DEV_ARP_EVICT_NOCARRIER(in_dev) &&
> +		    !netif_carrier_ok(dev))
>   			neigh_carrier_down(&arp_tbl, dev);
>   		break;
>   	default:
> diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
> index 1c6429c353a9..4ff4403749e0 100644
> --- a/net/ipv4/devinet.c
> +++ b/net/ipv4/devinet.c
> @@ -75,6 +75,7 @@ static struct ipv4_devconf ipv4_devconf = {
>   		[IPV4_DEVCONF_SHARED_MEDIA - 1] = 1,
>   		[IPV4_DEVCONF_IGMPV2_UNSOLICITED_REPORT_INTERVAL - 1] = 10000 /*ms*/,
>   		[IPV4_DEVCONF_IGMPV3_UNSOLICITED_REPORT_INTERVAL - 1] =  1000 /*ms*/,
> +		[IPV4_DEVCONF_ARP_EVICT_NOCARRIER - 1] = 1,
>   	},
>   };
>   
> @@ -87,6 +88,7 @@ static struct ipv4_devconf ipv4_devconf_dflt = {
>   		[IPV4_DEVCONF_ACCEPT_SOURCE_ROUTE - 1] = 1,
>   		[IPV4_DEVCONF_IGMPV2_UNSOLICITED_REPORT_INTERVAL - 1] = 10000 /*ms*/,
>   		[IPV4_DEVCONF_IGMPV3_UNSOLICITED_REPORT_INTERVAL - 1] =  1000 /*ms*/,
> +		[IPV4_DEVCONF_ARP_EVICT_NOCARRIER - 1] = 1,
>   	},
>   };
>   
> @@ -2527,6 +2529,8 @@ static struct devinet_sysctl_table {
>   		DEVINET_SYSCTL_RW_ENTRY(ARP_IGNORE, "arp_ignore"),
>   		DEVINET_SYSCTL_RW_ENTRY(ARP_ACCEPT, "arp_accept"),
>   		DEVINET_SYSCTL_RW_ENTRY(ARP_NOTIFY, "arp_notify"),
> +		DEVINET_SYSCTL_RW_ENTRY(ARP_EVICT_NOCARRIER,
> +					"arp_evict_nocarrier"),
>   		DEVINET_SYSCTL_RW_ENTRY(PROXY_ARP_PVLAN, "proxy_arp_pvlan"),
>   		DEVINET_SYSCTL_RW_ENTRY(FORCE_IGMP_VERSION,
>   					"force_igmp_version"),
>
James Prestwood Oct. 14, 2021, 4:20 p.m. UTC | #4
Hi Daniel,

On Thu, 2021-10-14 at 10:30 +0200, Daniel Borkmann wrote:
> [ Adding few more to Cc ]
> 
> On 10/14/21 12:27 AM, James Prestwood wrote:
> > This change introduces a new sysctl parameter, arp_evict_nocarrier.
> > When set (default) the ARP cache will be cleared on a NOCARRIER
> > event.
> > This new option has been defaulted to '1' which maintains existing
> > behavior.
> > 
> > Clearing the ARP cache on NOCARRIER is relatively new, introduced
> > by:
> > 
> > commit 859bd2ef1fc1110a8031b967ee656c53a6260a76
> > Author: David Ahern <dsahern@gmail.com>
> > Date:   Thu Oct 11 20:33:49 2018 -0700
> > 
> >      net: Evict neighbor entries on carrier down
> > 
> > The reason for this changes is to prevent the ARP cache from being
> > cleared when a wireless device roams. Specifically for wireless
> > roams
> > the ARP cache should not be cleared because the underlying network
> > has not
> > changed. Clearing the ARP cache in this case can introduce
> > significant
> > delays sending out packets after a roam.
> > 
> > A user reported such a situation here:
> > 
> > https://lore.kernel.org/linux-wireless/CACsRnHWa47zpx3D1oDq9JYnZWniS8yBwW1h0WAVZ6vrbwL_S0w@mail.gmail.com/
> > 
> > After some investigation it was found that the kernel was holding
> > onto
> > packets until ARP finished which resulted in this 1 second delay.
> > It
> > was also found that the first ARP who-has was never responded to,
> > which is actually what caues the delay. This change is more or less
> > working around this behavior, but again, there is no reason to
> > clear
> > the cache on a roam anyways.
> > 
> > As for the unanswered who-has, we know the packet made it OTA since
> > it was seen while monitoring. Why it never received a response is
> > unknown.
> 
> Wouldn't it make more sense to extend neigh_flush_dev() where we skip
> eviction
> of NUD_PERMANENT (see the skip_perm condition)? Either as a per table
> setting
> (tbl->parms) or as a NTF_EXT_* flag for specific neighbors?
> 

This is all new code to me, but from what I can tell NUD_PERMANENT is a
per-neighbor thing, which a wireless supplicant shouldn't be expected
to manage. It also seems like a pain to mark all neighbors as permanent
prior to a roam, then undo it all after a roam.

As for table settings I'll need to look into those, and if/how you
expose them to userspace? I modeled both these (arp/ndisc) options
after the other arp_*/ndisc_* sysctl parameters, as they seemed to fit
in there.

Thanks,
James

>
James Prestwood Oct. 14, 2021, 4:29 p.m. UTC | #5
Hi Jakub,

On Wed, 2021-10-13 at 16:37 -0700, Jakub Kicinski wrote:
> On Wed, 13 Oct 2021 15:27:07 -0700 James Prestwood wrote:
> > This change introduces a new sysctl parameter, arp_evict_nocarrier.
> > When set (default) the ARP cache will be cleared on a NOCARRIER
> > event.
> > This new option has been defaulted to '1' which maintains existing
> > behavior.
> > 
> > Clearing the ARP cache on NOCARRIER is relatively new, introduced
> > by:
> > 
> > commit 859bd2ef1fc1110a8031b967ee656c53a6260a76
> > Author: David Ahern <dsahern@gmail.com>
> > Date:   Thu Oct 11 20:33:49 2018 -0700
> > 
> >     net: Evict neighbor entries on carrier down
> > 
> > The reason for this changes is to prevent the ARP cache from being
> > cleared when a wireless device roams. Specifically for wireless
> > roams
> > the ARP cache should not be cleared because the underlying network
> > has not
> > changed. Clearing the ARP cache in this case can introduce
> > significant
> > delays sending out packets after a roam.
> > 
> > A user reported such a situation here:
> > 
> > https://lore.kernel.org/linux-wireless/CACsRnHWa47zpx3D1oDq9JYnZWniS8yBwW1h0WAVZ6vrbwL_S0w@mail.gmail.com/
> > 
> > After some investigation it was found that the kernel was holding
> > onto
> > packets until ARP finished which resulted in this 1 second delay.
> > It
> > was also found that the first ARP who-has was never responded to,
> > which is actually what caues the delay. This change is more or less
> > working around this behavior, but again, there is no reason to
> > clear
> > the cache on a roam anyways.
> > 
> > As for the unanswered who-has, we know the packet made it OTA since
> > it was seen while monitoring. Why it never received a response is
> > unknown.
> > 
> > Signed-off-by: James Prestwood <prestwoj@gmail.com>
> 
> Seems sensible at a glance, some quick feedback.
> 
> Please make sure you run ./scripts/get_maintainers.pl on the patch
> and add appropriate folks to CC.
> 
> Please rebase the code on top of this tree:
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/
> 
> > diff --git a/include/linux/inetdevice.h
> > b/include/linux/inetdevice.h
> > index 53aa0343bf69..63180170fdbd 100644
> > --- a/include/linux/inetdevice.h
> > +++ b/include/linux/inetdevice.h
> > @@ -133,6 +133,7 @@ static inline void ipv4_devconf_setall(struct
> > in_device *in_dev)
> >  #define IN_DEV_ARP_ANNOUNCE(in_dev)    IN_DEV_MAXCONF((in_dev),
> > ARP_ANNOUNCE)
> >  #define IN_DEV_ARP_IGNORE(in_dev)      IN_DEV_MAXCONF((in_dev),
> > ARP_IGNORE)
> >  #define IN_DEV_ARP_NOTIFY(in_dev)      IN_DEV_MAXCONF((in_dev),
> > ARP_NOTIFY)
> > +#define IN_DEV_ARP_EVICT_NOCARRIER(in_dev)
> > IN_DEV_CONF_GET((in_dev), ARP_EVICT_NOCARRIER)
> 
> IN_DEV_ANDCONF() makes most sense, I'd think.

So given we want '1' as the default as well as the ability to toggle
this option per-netdev I thought this was more appropriate. One caviat
is this would not work for setting ALL netdev's, but I don't think this
is a valid use case; or at least I can't imagine why you'd want to ever
do this.

This is a whole new area to me though, so if I'm understanding these
macros wrong please educate me :)

> 
> >  struct in_ifaddr {
> >         struct hlist_node       hash;
> 
> > diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
> > index 922dd73e5740..50cfe4f37089 100644
> > --- a/net/ipv4/arp.c
> > +++ b/net/ipv4/arp.c
> > @@ -1247,6 +1247,7 @@ static int arp_netdev_event(struct
> > notifier_block *this, unsigned long event,
> >  {
> >         struct net_device *dev = netdev_notifier_info_to_dev(ptr);
> >         struct netdev_notifier_change_info *change_info;
> > +       struct in_device *in_dev = __in_dev_get_rcu(dev);
> 
> Don't we need to hold the RCU lock to call this?

I was wondering that as well. It seems to be used in some places and
not others. Maybe the caller locked prior in places where there was no
lock, I'll look further into it.

> 
> >         switch (event) {
> >         case NETDEV_CHANGEADDR:
> > @@ -1257,7 +1258,8 @@ static int arp_netdev_event(struct
> > notifier_block *this, unsigned long event,
> >                 change_info = ptr;
> >                 if (change_info->flags_changed & IFF_NOARP)
> >                         neigh_changeaddr(&arp_tbl, dev);
> > -               if (!netif_carrier_ok(dev))
> > +               if (IN_DEV_ARP_EVICT_NOCARRIER(in_dev) &&
> > +                   !netif_carrier_ok(dev))
> >                         neigh_carrier_down(&arp_tbl, dev);
> >                 break;
> >         default:
James Prestwood Oct. 14, 2021, 4:32 p.m. UTC | #6
On Wed, 2021-10-13 at 16:40 -0700, Jakub Kicinski wrote:
> On Wed, 13 Oct 2021 16:37:14 -0700 Jakub Kicinski wrote:
> > Please make sure you run ./scripts/get_maintainers.pl on the patch
> > and add appropriate folks to CC.
> 
> Ah, and beyond that please make sure to CC wireless folks.
> The linux-wireless mailing list, Johannes, Jouni, etc.
> The question of the "real root cause" is slightly under-explained.
> Sounds like something restores carrier before the device is actually
> up.

Sure. As for the real root cause, are you talking about the AP side? As
far as the station is concerened the initial packets are making it out
after a roam. So possibly the AP is restoring carrier before it should,
and packets are getting dropped.

In v2 I'll include a cover letter with more details about the test
setup, and what was observed.
Jakub Kicinski Oct. 14, 2021, 4:43 p.m. UTC | #7
On Thu, 14 Oct 2021 09:32:31 -0700 James Prestwood wrote:
> On Wed, 2021-10-13 at 16:40 -0700, Jakub Kicinski wrote:
> > On Wed, 13 Oct 2021 16:37:14 -0700 Jakub Kicinski wrote:  
> > > Please make sure you run ./scripts/get_maintainers.pl on the patch
> > > and add appropriate folks to CC.  
> > 
> > Ah, and beyond that please make sure to CC wireless folks.
> > The linux-wireless mailing list, Johannes, Jouni, etc.
> > The question of the "real root cause" is slightly under-explained.
> > Sounds like something restores carrier before the device is actually
> > up.  
> 
> Sure. As for the real root cause, are you talking about the AP side? As
> far as the station is concerened the initial packets are making it out
> after a roam. So possibly the AP is restoring carrier before it should,
> and packets are getting dropped.

Yeah, that's what I meant. If you're saying that station sends the
packet out correctly then indeed nothing we can do but the workaround.

> In v2 I'll include a cover letter with more details about the test
> setup, and what was observed.

Thanks!
Jakub Kicinski Oct. 14, 2021, 4:59 p.m. UTC | #8
On Thu, 14 Oct 2021 09:29:05 -0700 James Prestwood wrote:
> > > diff --git a/include/linux/inetdevice.h
> > > b/include/linux/inetdevice.h
> > > index 53aa0343bf69..63180170fdbd 100644
> > > --- a/include/linux/inetdevice.h
> > > +++ b/include/linux/inetdevice.h
> > > @@ -133,6 +133,7 @@ static inline void ipv4_devconf_setall(struct
> > > in_device *in_dev)
> > >  #define IN_DEV_ARP_ANNOUNCE(in_dev)    IN_DEV_MAXCONF((in_dev),
> > > ARP_ANNOUNCE)
> > >  #define IN_DEV_ARP_IGNORE(in_dev)      IN_DEV_MAXCONF((in_dev),
> > > ARP_IGNORE)
> > >  #define IN_DEV_ARP_NOTIFY(in_dev)      IN_DEV_MAXCONF((in_dev),
> > > ARP_NOTIFY)
> > > +#define IN_DEV_ARP_EVICT_NOCARRIER(in_dev)
> > > IN_DEV_CONF_GET((in_dev), ARP_EVICT_NOCARRIER)  
> > 
> > IN_DEV_ANDCONF() makes most sense, I'd think.  
> 
> So given we want '1' as the default as well as the ability to toggle
> this option per-netdev I thought this was more appropriate. One caviat
> is this would not work for setting ALL netdev's, but I don't think this
> is a valid use case; or at least I can't imagine why you'd want to ever
> do this.
> 
> This is a whole new area to me though, so if I'm understanding these
> macros wrong please educate me :)

Yeah, TBH I'm not sure what the best practice is here. I know we
weren't very consistent in the past. Having a knob for "all" which 
is 100% ignored seems like the worst option. Let me CC Dave Ahern,
please make sure you CC him on v2 as well.
David Ahern Oct. 14, 2021, 6:32 p.m. UTC | #9
On 10/14/21 10:59 AM, Jakub Kicinski wrote:
> On Thu, 14 Oct 2021 09:29:05 -0700 James Prestwood wrote:
>>>> diff --git a/include/linux/inetdevice.h
>>>> b/include/linux/inetdevice.h
>>>> index 53aa0343bf69..63180170fdbd 100644
>>>> --- a/include/linux/inetdevice.h
>>>> +++ b/include/linux/inetdevice.h
>>>> @@ -133,6 +133,7 @@ static inline void ipv4_devconf_setall(struct
>>>> in_device *in_dev)
>>>>  #define IN_DEV_ARP_ANNOUNCE(in_dev)    IN_DEV_MAXCONF((in_dev),
>>>> ARP_ANNOUNCE)
>>>>  #define IN_DEV_ARP_IGNORE(in_dev)      IN_DEV_MAXCONF((in_dev),
>>>> ARP_IGNORE)
>>>>  #define IN_DEV_ARP_NOTIFY(in_dev)      IN_DEV_MAXCONF((in_dev),
>>>> ARP_NOTIFY)
>>>> +#define IN_DEV_ARP_EVICT_NOCARRIER(in_dev)
>>>> IN_DEV_CONF_GET((in_dev), ARP_EVICT_NOCARRIER)  
>>>
>>> IN_DEV_ANDCONF() makes most sense, I'd think.  
>>
>> So given we want '1' as the default as well as the ability to toggle
>> this option per-netdev I thought this was more appropriate. One caviat
>> is this would not work for setting ALL netdev's, but I don't think this
>> is a valid use case; or at least I can't imagine why you'd want to ever
>> do this.
>>
>> This is a whole new area to me though, so if I'm understanding these
>> macros wrong please educate me :)
> 
> Yeah, TBH I'm not sure what the best practice is here. I know we
> weren't very consistent in the past. Having a knob for "all" which 
> is 100% ignored seems like the worst option. Let me CC Dave Ahern,
> please make sure you CC him on v2 as well.
> 

agreed. It's a matter of consistency in the use of the 'all' variant. I
would say support it even if it you do not believe it will be changed.
David Ahern Oct. 14, 2021, 6:34 p.m. UTC | #10
On 10/13/21 4:27 PM, James Prestwood wrote:
> This change introduces a new sysctl parameter, arp_evict_nocarrier.
> When set (default) the ARP cache will be cleared on a NOCARRIER event.
> This new option has been defaulted to '1' which maintains existing
> behavior.
> 
> Clearing the ARP cache on NOCARRIER is relatively new, introduced by:
> 
> commit 859bd2ef1fc1110a8031b967ee656c53a6260a76
> Author: David Ahern <dsahern@gmail.com>
> Date:   Thu Oct 11 20:33:49 2018 -0700
> 
>     net: Evict neighbor entries on carrier down
> 
> The reason for this changes is to prevent the ARP cache from being
> cleared when a wireless device roams. Specifically for wireless roams
> the ARP cache should not be cleared because the underlying network has not
> changed. Clearing the ARP cache in this case can introduce significant
> delays sending out packets after a roam.

how do you know if the existing ARP / ND entries are valid when roaming?
James Prestwood Oct. 14, 2021, 6:52 p.m. UTC | #11
Hi David,

On Thu, 2021-10-14 at 12:34 -0600, David Ahern wrote:
> On 10/13/21 4:27 PM, James Prestwood wrote:
> > This change introduces a new sysctl parameter, arp_evict_nocarrier.
> > When set (default) the ARP cache will be cleared on a NOCARRIER
> > event.
> > This new option has been defaulted to '1' which maintains existing
> > behavior.
> > 
> > Clearing the ARP cache on NOCARRIER is relatively new, introduced
> > by:
> > 
> > commit 859bd2ef1fc1110a8031b967ee656c53a6260a76
> > Author: David Ahern <dsahern@gmail.com>
> > Date:   Thu Oct 11 20:33:49 2018 -0700
> > 
> >     net: Evict neighbor entries on carrier down
> > 
> > The reason for this changes is to prevent the ARP cache from being
> > cleared when a wireless device roams. Specifically for wireless
> > roams
> > the ARP cache should not be cleared because the underlying network
> > has not
> > changed. Clearing the ARP cache in this case can introduce
> > significant
> > delays sending out packets after a roam.
> 
> how do you know if the existing ARP / ND entries are valid when
> roaming?
> 

I guess there is no way of really knowing, but since your roaming in
the same network I think we can assume the entries are just as valid
prior to the roam as after it right? If there are invalid entries, they
were likely invalid prior to the roam too.

I obviously cant speak to all network configurations, but I think if
your network infrastructure is set up to roam you into a different
subnet you're doing something wrong. Or I can't think of a reason to do
this, it would defeat the purpose of quickly roaming between BSS's
since you would then have to do DHCP aftewards.

Thanks,
James
diff mbox series

Patch

diff --git a/include/linux/inetdevice.h b/include/linux/inetdevice.h
index 53aa0343bf69..63180170fdbd 100644
--- a/include/linux/inetdevice.h
+++ b/include/linux/inetdevice.h
@@ -133,6 +133,7 @@  static inline void ipv4_devconf_setall(struct in_device *in_dev)
 #define IN_DEV_ARP_ANNOUNCE(in_dev)	IN_DEV_MAXCONF((in_dev), ARP_ANNOUNCE)
 #define IN_DEV_ARP_IGNORE(in_dev)	IN_DEV_MAXCONF((in_dev), ARP_IGNORE)
 #define IN_DEV_ARP_NOTIFY(in_dev)	IN_DEV_MAXCONF((in_dev), ARP_NOTIFY)
+#define IN_DEV_ARP_EVICT_NOCARRIER(in_dev) IN_DEV_CONF_GET((in_dev), ARP_EVICT_NOCARRIER)
 
 struct in_ifaddr {
 	struct hlist_node	hash;
diff --git a/include/uapi/linux/ip.h b/include/uapi/linux/ip.h
index e42d13b55cf3..e00bbb9c47bb 100644
--- a/include/uapi/linux/ip.h
+++ b/include/uapi/linux/ip.h
@@ -169,6 +169,7 @@  enum
 	IPV4_DEVCONF_DROP_UNICAST_IN_L2_MULTICAST,
 	IPV4_DEVCONF_DROP_GRATUITOUS_ARP,
 	IPV4_DEVCONF_BC_FORWARDING,
+	IPV4_DEVCONF_ARP_EVICT_NOCARRIER,
 	__IPV4_DEVCONF_MAX
 };
 
diff --git a/include/uapi/linux/sysctl.h b/include/uapi/linux/sysctl.h
index 1e05d3caa712..6a3b194c50fe 100644
--- a/include/uapi/linux/sysctl.h
+++ b/include/uapi/linux/sysctl.h
@@ -482,6 +482,7 @@  enum
 	NET_IPV4_CONF_PROMOTE_SECONDARIES=20,
 	NET_IPV4_CONF_ARP_ACCEPT=21,
 	NET_IPV4_CONF_ARP_NOTIFY=22,
+	NET_IPV4_CONF_ARP_EVICT_NOCARRIER=23,
 };
 
 /* /proc/sys/net/ipv4/netfilter */
diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
index 922dd73e5740..50cfe4f37089 100644
--- a/net/ipv4/arp.c
+++ b/net/ipv4/arp.c
@@ -1247,6 +1247,7 @@  static int arp_netdev_event(struct notifier_block *this, unsigned long event,
 {
 	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
 	struct netdev_notifier_change_info *change_info;
+	struct in_device *in_dev = __in_dev_get_rcu(dev);
 
 	switch (event) {
 	case NETDEV_CHANGEADDR:
@@ -1257,7 +1258,8 @@  static int arp_netdev_event(struct notifier_block *this, unsigned long event,
 		change_info = ptr;
 		if (change_info->flags_changed & IFF_NOARP)
 			neigh_changeaddr(&arp_tbl, dev);
-		if (!netif_carrier_ok(dev))
+		if (IN_DEV_ARP_EVICT_NOCARRIER(in_dev) &&
+		    !netif_carrier_ok(dev))
 			neigh_carrier_down(&arp_tbl, dev);
 		break;
 	default:
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 1c6429c353a9..4ff4403749e0 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -75,6 +75,7 @@  static struct ipv4_devconf ipv4_devconf = {
 		[IPV4_DEVCONF_SHARED_MEDIA - 1] = 1,
 		[IPV4_DEVCONF_IGMPV2_UNSOLICITED_REPORT_INTERVAL - 1] = 10000 /*ms*/,
 		[IPV4_DEVCONF_IGMPV3_UNSOLICITED_REPORT_INTERVAL - 1] =  1000 /*ms*/,
+		[IPV4_DEVCONF_ARP_EVICT_NOCARRIER - 1] = 1,
 	},
 };
 
@@ -87,6 +88,7 @@  static struct ipv4_devconf ipv4_devconf_dflt = {
 		[IPV4_DEVCONF_ACCEPT_SOURCE_ROUTE - 1] = 1,
 		[IPV4_DEVCONF_IGMPV2_UNSOLICITED_REPORT_INTERVAL - 1] = 10000 /*ms*/,
 		[IPV4_DEVCONF_IGMPV3_UNSOLICITED_REPORT_INTERVAL - 1] =  1000 /*ms*/,
+		[IPV4_DEVCONF_ARP_EVICT_NOCARRIER - 1] = 1,
 	},
 };
 
@@ -2527,6 +2529,8 @@  static struct devinet_sysctl_table {
 		DEVINET_SYSCTL_RW_ENTRY(ARP_IGNORE, "arp_ignore"),
 		DEVINET_SYSCTL_RW_ENTRY(ARP_ACCEPT, "arp_accept"),
 		DEVINET_SYSCTL_RW_ENTRY(ARP_NOTIFY, "arp_notify"),
+		DEVINET_SYSCTL_RW_ENTRY(ARP_EVICT_NOCARRIER,
+					"arp_evict_nocarrier"),
 		DEVINET_SYSCTL_RW_ENTRY(PROXY_ARP_PVLAN, "proxy_arp_pvlan"),
 		DEVINET_SYSCTL_RW_ENTRY(FORCE_IGMP_VERSION,
 					"force_igmp_version"),