
neighbour: guarantee the localhost connections be established successfully even the ARP table is full

Message ID 20240311122401.6549-1-lizheng043@gmail.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Series: neighbour: guarantee the localhost connections be established successfully even the ARP table is full

Checks

Context                         Check    Description
netdev/series_format            warning  Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection           success  Guessed tree name to be net-next
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 944 this patch: 944
netdev/build_tools              success  No tools touched, skip
netdev/cc_maintainers           fail     4 maintainers not CCed: pabeni@redhat.com edumazet@google.com kuba@kernel.org dsahern@kernel.org
netdev/build_clang              success  Errors and warnings before: 957 this patch: 957
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  success  Errors and warnings before: 960 this patch: 960
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 11 lines checked
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0
netdev/contest                  warning  net-next-2024-03-11--21-00 (tests: 883)

Commit Message

Zheng Li March 11, 2024, 12:24 p.m. UTC
From: Zheng Li <James.Z.Li@Dell.com>

Inter-process communication over localhost should succeed even when the ARP table is full.
Many processes on a server communicate over localhost, for example the command-line interface (CLI),
and all CLI commands should still run successfully when the ARP table is full.
Currently, CLI commands time out when the ARP table is full.
Set the exempt_from_gc parameter to true for the LOOPBACK net device so that
the localhost neigh entry is kept in the ARP table and is not removed by gc.

Steps to reproduce:
On a server with "gc_thresh3 = 1024", ping the server from more than 1024 IPv4 addresses,
then run "ssh localhost" on the console interface; the command times out.

Signed-off-by: Zheng Li <James.Z.Li@Dell.com>
---
 net/core/neighbour.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
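
The failure mode described in the commit message can be illustrated with a toy user-space model. This is only a sketch, not kernel code: `toy_table`, `toy_neigh_create`, and `TOY_GC_THRESH3` are hypothetical names, and the threshold is shrunk to 4 where the report used 1024. It assumes, as a simplification, that an entry created with exempt_from_gc set bypasses the gc_thresh3 check, and that forced GC reclaims nothing once the table is full.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tiny limit for illustration; the report used gc_thresh3 = 1024. */
#define TOY_GC_THRESH3 4

/* Toy stand-in for a neighbour table; not the kernel's struct neigh_table. */
struct toy_table {
    size_t gc_entries;     /* entries counted against the threshold */
    size_t exempt_entries; /* entries created with exempt_from_gc set */
};

/*
 * Try to create one entry. Returns true on success, false on
 * "table overflow". Assumption: when the table is full, forced GC
 * finds nothing to reclaim, so non-exempt creation simply fails.
 */
static bool toy_neigh_create(struct toy_table *tbl, bool exempt_from_gc)
{
    assert(tbl != NULL);

    if (exempt_from_gc) {
        /* Exempt entries skip the threshold check entirely. */
        tbl->exempt_entries++;
        return true;
    }

    if (tbl->gc_entries >= TOY_GC_THRESH3)
        return false; /* models the timeout seen by "ssh localhost" */

    tbl->gc_entries++;
    return true;
}
```

In this model, once TOY_GC_THRESH3 ordinary entries exist, a further `toy_neigh_create(&tbl, false)` fails while `toy_neigh_create(&tbl, true)` still succeeds, which is the behaviour the patch aims for on the loopback device.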

Comments

Ratheesh Kannoth March 11, 2024, 1:51 p.m. UTC | #1
On 2024-03-11 at 17:54:01, Zheng Li (lizheng043@gmail.com) wrote:
>
> Inter-process communication over localhost should succeed even when the ARP table is full.
> Many processes on a server communicate over localhost, for example the command-line interface (CLI),
> and all CLI commands should still run successfully when the ARP table is full.
> Currently, CLI commands time out when the ARP table is full.
> Set the exempt_from_gc parameter to true for the LOOPBACK net device so that
> the localhost neigh entry is kept in the ARP table and is not removed by gc.
>
> Steps to reproduce:
> On a server with "gc_thresh3 = 1024", ping the server from more than 1024 IPv4 addresses,
> then run "ssh localhost" on the console interface; the command times out.
This does not look correct to me. Why does gc have to behave differently for loopback devices?
Why wouldn't a higher gc_thresh3 value (fine-tuned to your use case) solve the issue?
Can't you add the localhost ARP entry statically and avoid the gc issue?

>
> Signed-off-by: Zheng Li <James.Z.Li@Dell.com>
> ---
>  net/core/neighbour.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> index 552719c3bbc3..d96dee3d4af6 100644
> --- a/net/core/neighbour.c
> +++ b/net/core/neighbour.c
> @@ -734,7 +734,10 @@ ___neigh_create(struct neigh_table *tbl, const void *pkey,
>  struct neighbour *__neigh_create(struct neigh_table *tbl, const void *pkey,
>  				 struct net_device *dev, bool want_ref)
>  {
> -	return ___neigh_create(tbl, pkey, dev, 0, false, want_ref);
> +	if (dev->flags & IFF_LOOPBACK)
> +		return ___neigh_create(tbl, pkey, dev, 0, true, want_ref);
> +	else
> +		return ___neigh_create(tbl, pkey, dev, 0, false, want_ref);
>  }
>  EXPORT_SYMBOL(__neigh_create);
>
> --
> 2.17.1
>
Zheng Li March 18, 2024, 8:39 a.m. UTC | #2
The loopback neigh is special in the neighbour system: it is used by
all local communication, and its state is NUD_NOARP.
With any gc_thresh3 setting, the ARP table can still become full.
Manually increasing gc_thresh3 resolves the issue each time it occurs,
but we would like the Linux kernel to handle this automatically for
all local communication whenever the ARP table is full, rather than
relying on manual intervention as a workaround.
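
For reference, the gc_thresh3 limit discussed here is exposed through procfs. The following is a minimal sketch of reading it from user space; `read_sysctl_long` is a hypothetical helper name, and the path assumes the IPv4 default neighbour settings on a typical Linux system.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed location of the IPv4 default gc_thresh3 setting. */
#define GC_THRESH3_PATH "/proc/sys/net/ipv4/neigh/default/gc_thresh3"

/*
 * Read a single integer from a sysctl-style file.
 * Returns the value, or -1 if the file cannot be opened or parsed
 * (thresholds are non-negative, so -1 is safe as an error marker).
 */
static long read_sysctl_long(const char *path)
{
    long value = -1;
    FILE *f;

    assert(path != NULL);

    f = fopen(path, "r");
    if (!f)
        return -1;

    if (fscanf(f, "%ld", &value) != 1)
        value = -1;

    fclose(f);
    return value;
}
```

Calling `read_sysctl_long(GC_THRESH3_PATH)` on the server from the reproduction steps would be expected to return 1024; comparing that value with the number of entries in the table shows how close the table is to overflowing.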


Ratheesh Kannoth March 18, 2024, 3:36 p.m. UTC | #3
> From: James Lee <lizheng043@gmail.com>
> Sent: Monday, March 18, 2024 2:09 PM
> To: Ratheesh Kannoth <rkannoth@marvell.com>
> Cc: linux-kernel@vger.kernel.org; netdev@vger.kernel.org;
> nhorman@tuxdriver.com; davem@davemloft.net; jmorris@namei.org;
> James.Z.Li@dell.com
> Subject: [EXTERNAL] Re: [PATCH] neighbour: guarantee the localhost
> connections be established successfully even the ARP table is full
> 
> The loopback neigh is special in the neighbour system: it is used by all
> local communication, and its state is NUD_NOARP.
> With any gc_thresh3 setting, the ARP table can still become full. Manually
> increasing gc_thresh3 resolves the issue each time it occurs, but we would
> like the Linux kernel to handle this automatically for all local communication
> whenever the ARP table is full, rather than relying on manual intervention as
> a workaround.

The issue is that these are dynamic entries that cannot be removed by gc, and no
threshold applies to them. I feel this could be exploited.
Zheng Li March 19, 2024, 9:42 a.m. UTC | #4
It's not an issue. The loopback device can only be created by the kernel
itself, and loopback neigh entries can also only be created by the kernel:
one for IPv4 and one for IPv6. The number of loopback neigh entries
cannot exceed 2.

Ratheesh Kannoth March 22, 2024, 3:37 a.m. UTC | #5
> From: James Lee <lizheng043@gmail.com>
> Sent: Tuesday, March 19, 2024 3:13 PM
> To: Ratheesh Kannoth <rkannoth@marvell.com>
> Cc: linux-kernel@vger.kernel.org; netdev@vger.kernel.org;
> nhorman@tuxdriver.com; davem@davemloft.net; jmorris@namei.org;
> James.Z.Li@dell.com
> Subject: Re: [EXTERNAL] Re: [PATCH] neighbour: guarantee the localhost
> connections be established successfully even the ARP table is full
> 
> It's not an issue. The loopback device can only be created by the kernel itself,
> and loopback neigh entries can also only be created by the kernel: one for IPv4
> and one for IPv6. The number of loopback neigh entries cannot exceed 2.
ACK. It still feels like a hack to me. Please post a new patch version and let the maintainers make the call.
Zheng Li March 28, 2024, 8:41 a.m. UTC | #6
It's not an issue; why do I need to "post a new patch version"?

Ratheesh Kannoth March 28, 2024, 8:55 a.m. UTC | #7
> From: James Lee <lizheng043@gmail.com>
> Sent: Thursday, March 28, 2024 2:11 PM
> To: Ratheesh Kannoth <rkannoth@marvell.com>
> Cc: linux-kernel@vger.kernel.org; netdev@vger.kernel.org;
> nhorman@tuxdriver.com; davem@davemloft.net; jmorris@namei.org;
> James.Z.Li@dell.com; Simon Horman <horms@kernel.org>
> Subject: Re: [EXTERNAL] Re: [PATCH] neighbour: guarantee the localhost
> connections be established successfully even the ARP table is full
> 

> It's not an issue,
Please don't top-post.

> why need "post a new patch version"?
AFAIK, only patches listed at https://patchwork.kernel.org/project/netdevbpf/list/ are actively considered for merging.
I could be wrong.


Patch

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 552719c3bbc3..d96dee3d4af6 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -734,7 +734,10 @@  ___neigh_create(struct neigh_table *tbl, const void *pkey,
 struct neighbour *__neigh_create(struct neigh_table *tbl, const void *pkey,
 				 struct net_device *dev, bool want_ref)
 {
-	return ___neigh_create(tbl, pkey, dev, 0, false, want_ref);
+	if (dev->flags & IFF_LOOPBACK)
+		return ___neigh_create(tbl, pkey, dev, 0, true, want_ref);
+	else
+		return ___neigh_create(tbl, pkey, dev, 0, false, want_ref);
 }
 EXPORT_SYMBOL(__neigh_create);
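
The two branches in the patch differ only in the exempt_from_gc argument, so the condition could be folded into that argument. The sketch below shows the idea with user-space stand-ins, since the kernel functions cannot be compiled outside the tree: `struct mock_net_device`, `mock_neigh_create_inner`, `mock_neigh_create`, and `MOCK_IFF_LOOPBACK` are all hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for IFF_LOOPBACK; the value here is only illustrative. */
#define MOCK_IFF_LOOPBACK 0x8u

/* Stand-in for struct net_device, reduced to the one field used here. */
struct mock_net_device {
    unsigned int flags;
};

/*
 * Stand-in for ___neigh_create(): it only reports which
 * exempt_from_gc value it was called with.
 */
static bool mock_neigh_create_inner(const struct mock_net_device *dev,
                                    bool exempt_from_gc)
{
    assert(dev != NULL);
    return exempt_from_gc;
}

/*
 * Stand-in for __neigh_create(): a single call whose exempt_from_gc
 * argument is derived from the device flags, replacing the if/else.
 */
static bool mock_neigh_create(const struct mock_net_device *dev)
{
    return mock_neigh_create_inner(dev,
                                   (dev->flags & MOCK_IFF_LOOPBACK) != 0);
}
```

In kernel terms, the equivalent would be a single `___neigh_create()` call with `dev->flags & IFF_LOOPBACK` converted to a bool for the fifth argument; whether that form is preferable is a style question for the maintainers.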