ipv6: Honor route mtu if it is within limit of dev mtu

Message ID 1614011555-21951-1-git-send-email-kapandey@codeaurora.org (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Series ipv6: Honor route mtu if it is within limit of dev mtu

Checks

Context                         Check    Description
netdev/cover_letter             success  Link
netdev/fixes_present            success  Link
netdev/patch_count              success  Link
netdev/tree_selection           success  Guessed tree name to be net-next
netdev/subject_prefix           warning  Target tree name not specified in the subject
netdev/cc_maintainers           success  CCed 5 of 5 maintainers
netdev/source_inline            success  Was 0 now: 0
netdev/verify_signedoff         success  Link
netdev/module_param             success  Was 0 now: 0
netdev/build_32bit              success  Errors and warnings before: 1 this patch: 1
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/verify_fixes             success  Link
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 9 lines checked
netdev/build_allmodconfig_warn  success  Errors and warnings before: 1 this patch: 1
netdev/header_inline            success  Link
netdev/stable                   success  Stable not CCed

Commit Message

Kaustubh Pandey Feb. 22, 2021, 4:32 p.m. UTC
When netdevice MTU is increased via sysfs, NETDEV_CHANGEMTU is raised.

addrconf_notify -> rt6_mtu_change -> rt6_mtu_change_route ->
fib6_nh_mtu_change

While handling the NETDEV_CHANGEMTU notification we reach a condition
where, if the route mtu is less than the dev mtu and the route mtu equals
the ipv6_devconf mtu, the route mtu gets updated.

Because of this, v6 traffic ends up using a different MTU than the one
configured earlier. This commit fixes this by removing the comparison
with ipv6_devconf and updating the route mtu only when it is greater
than or equal to the incoming dev mtu.

This can be easily reproduced with the script below:
pre-condition:
device up(mtu = 1500) and route mtu for both v4 and v6 is 1500

test-script:
ip route change 192.168.0.0/24 dev eth0 src 192.168.0.1 mtu 1400
ip -6 route change 2001::/64 dev eth0 metric 256 mtu 1400
echo 1400 > /sys/class/net/eth0/mtu
ip route change 192.168.0.0/24 dev eth0 src 192.168.0.1 mtu 1500
echo 1500 > /sys/class/net/eth0/mtu

Signed-off-by: Kaustubh Pandey <kapandey@codeaurora.org>
---
 net/ipv6/route.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Jakub Kicinski Feb. 24, 2021, 6:47 p.m. UTC | #1
On Mon, 22 Feb 2021 22:02:35 +0530 Kaustubh Pandey wrote:
> When netdevice MTU is increased via sysfs, NETDEV_CHANGEMTU is raised.
> 
> addrconf_notify -> rt6_mtu_change -> rt6_mtu_change_route ->
> fib6_nh_mtu_change
> 
> As part of handling NETDEV_CHANGEMTU notification we land up on a
> condition where if route mtu is less than dev mtu and route mtu equals
> ipv6_devconf mtu, route mtu gets updated.
> 
> Due to this v6 traffic end up using wrong MTU then configured earlier.
> This commit fixes this by removing comparison with ipv6_devconf
> and updating route mtu only when it is greater than incoming dev mtu.
> 
> This can be easily reproduced with below script:
> pre-condition:
> device up(mtu = 1500) and route mtu for both v4 and v6 is 1500
> 
> test-script:
> ip route change 192.168.0.0/24 dev eth0 src 192.168.0.1 mtu 1400
> ip -6 route change 2001::/64 dev eth0 metric 256 mtu 1400
> echo 1400 > /sys/class/net/eth0/mtu
> ip route change 192.168.0.0/24 dev eth0 src 192.168.0.1 mtu 1500
> echo 1500 > /sys/class/net/eth0/mtu
> 
> Signed-off-by: Kaustubh Pandey <kapandey@codeaurora.org>
> ---
>  net/ipv6/route.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index 1536f49..653b6c7 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -4813,8 +4813,7 @@ static int fib6_nh_mtu_change(struct fib6_nh *nh, void *_arg)
>  		struct inet6_dev *idev = __in6_dev_get(arg->dev);
>  		u32 mtu = f6i->fib6_pmtu;
>  
> -		if (mtu >= arg->mtu ||
> -		    (mtu < arg->mtu && mtu == idev->cnf.mtu6))
> +		if (mtu >= arg->mtu)
>  			fib6_metric_set(f6i, RTAX_MTU, arg->mtu);
>  
>  		spin_lock_bh(&rt6_exception_lock);

David, Hideaki - any thoughts on this one? Can we change this long
standing behavior?

David Ahern Feb. 24, 2021, 8:28 p.m. UTC | #2
On 2/22/21 9:32 AM, Kaustubh Pandey wrote:
> When netdevice MTU is increased via sysfs, NETDEV_CHANGEMTU is raised.
> 
> addrconf_notify -> rt6_mtu_change -> rt6_mtu_change_route ->
> fib6_nh_mtu_change
> 
> As part of handling NETDEV_CHANGEMTU notification we land up on a
> condition where if route mtu is less than dev mtu and route mtu equals
> ipv6_devconf mtu, route mtu gets updated.
> 
> Due to this v6 traffic end up using wrong MTU then configured earlier.
> This commit fixes this by removing comparison with ipv6_devconf
> and updating route mtu only when it is greater than incoming dev mtu.
> 
> This can be easily reproduced with below script:
> pre-condition:
> device up(mtu = 1500) and route mtu for both v4 and v6 is 1500
> 
> test-script:
> ip route change 192.168.0.0/24 dev eth0 src 192.168.0.1 mtu 1400
> ip -6 route change 2001::/64 dev eth0 metric 256 mtu 1400
> echo 1400 > /sys/class/net/eth0/mtu
> ip route change 192.168.0.0/24 dev eth0 src 192.168.0.1 mtu 1500
> echo 1500 > /sys/class/net/eth0/mtu
> 
> Signed-off-by: Kaustubh Pandey <kapandey@codeaurora.org>
> ---
>  net/ipv6/route.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index 1536f49..653b6c7 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -4813,8 +4813,7 @@ static int fib6_nh_mtu_change(struct fib6_nh *nh, void *_arg)
>  		struct inet6_dev *idev = __in6_dev_get(arg->dev);
>  		u32 mtu = f6i->fib6_pmtu;
>  
> -		if (mtu >= arg->mtu ||
> -		    (mtu < arg->mtu && mtu == idev->cnf.mtu6))
> +		if (mtu >= arg->mtu)
>  			fib6_metric_set(f6i, RTAX_MTU, arg->mtu);
>  
>  		spin_lock_bh(&rt6_exception_lock);
> 

The existing logic mirrors what is done for exceptions, see
rt6_mtu_change_route_allowed and commit e9fa1495d738.

It seems right to me to drop the mtu == idev->cnf.mtu6 comparison in
which case the exceptions should do the same.

Added author of e9fa1495d738 in case I am overlooking something.

Test case should be added to tools/testing/selftests/net/pmtu.sh, and
did you run that script with the proposed change?
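
As a rough illustration of the exception-side check discussed in this reply, here is a small userspace C model. It is a sketch only, not kernel code: the function names and the plain integer parameters are hypothetical, and it assumes the existing exception check allows an update either when the cached MTU is not below the new device MTU or when it equals the per-device IPv6 MTU (idev->cnf.mtu6), mirroring the condition in the hunk quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the existing exception check: decreases are
 * always allowed, and an increase is allowed when the cached MTU equals
 * the per-device IPv6 MTU (assumed to mirror idev->cnf.mtu6).
 */
bool exception_update_allowed_current(uint32_t exception_mtu,
				      uint32_t devconf_mtu6,
				      uint32_t new_dev_mtu)
{
	if (exception_mtu >= new_dev_mtu)
		return true;
	if (exception_mtu == devconf_mtu6)
		return true;
	return false;
}

/*
 * Hypothetical model of the check with the cnf.mtu6 comparison dropped:
 * only a cached MTU at or above the new device MTU is rewritten.
 */
bool exception_update_allowed_proposed(uint32_t exception_mtu,
				       uint32_t new_dev_mtu)
{
	return exception_mtu >= new_dev_mtu;
}
```

In this model, a cached MTU of 1400 with a per-device MTU of 1400 is rewritten on a device MTU increase to 1500 under the current check, and is left alone under the proposed one. Whether that is the desired behavior for exceptions is exactly the open question in this thread.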

Patch

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 1536f49..653b6c7 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -4813,8 +4813,7 @@  static int fib6_nh_mtu_change(struct fib6_nh *nh, void *_arg)
 		struct inet6_dev *idev = __in6_dev_get(arg->dev);
 		u32 mtu = f6i->fib6_pmtu;
 
-		if (mtu >= arg->mtu ||
-		    (mtu < arg->mtu && mtu == idev->cnf.mtu6))
+		if (mtu >= arg->mtu)
 			fib6_metric_set(f6i, RTAX_MTU, arg->mtu);
 
 		spin_lock_bh(&rt6_exception_lock);
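
The behavioral difference introduced by the hunk above can be modeled in userspace by replaying the IPv6 steps of the test script from the commit message. This is a minimal sketch under stated assumptions, not kernel code: struct mtu_model, dev_mtu_change and replay_script are hypothetical names, the two predicates are transcribed from the removed and added lines of the diff, and the model assumes that the per-device IPv6 MTU (idev->cnf.mtu6) is set to the new device MTU once the routes have been processed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the state the hunk reads. */
struct mtu_model {
	uint32_t route_mtu;    /* models f6i->fib6_pmtu */
	uint32_t devconf_mtu6; /* models idev->cnf.mtu6 */
};

/* Condition on the lines removed by the patch. */
bool should_update_old(const struct mtu_model *m, uint32_t new_dev_mtu)
{
	return m->route_mtu >= new_dev_mtu ||
	       (m->route_mtu < new_dev_mtu &&
		m->route_mtu == m->devconf_mtu6);
}

/* Condition on the line added by the patch. */
bool should_update_new(const struct mtu_model *m, uint32_t new_dev_mtu)
{
	return m->route_mtu >= new_dev_mtu;
}

typedef bool (*update_fn)(const struct mtu_model *, uint32_t);

/* Models a single NETDEV_CHANGEMTU event for one route. */
void dev_mtu_change(struct mtu_model *m, uint32_t new_dev_mtu,
		    update_fn should_update)
{
	/* Simplification: the model only covers MTUs >= 1280 (IPv6 minimum). */
	assert(new_dev_mtu >= 1280);

	if (should_update(m, new_dev_mtu))
		m->route_mtu = new_dev_mtu;

	/* Assumption: cnf.mtu6 follows the device MTU after the route update. */
	m->devconf_mtu6 = new_dev_mtu;
}

/* Replays the IPv6 steps of the test script; returns the final route MTU. */
uint32_t replay_script(update_fn should_update)
{
	/* Pre-condition: device MTU and route MTU are both 1500. */
	struct mtu_model m = { .route_mtu = 1500, .devconf_mtu6 = 1500 };

	m.route_mtu = 1400;                      /* ip -6 route change ... mtu 1400 */
	dev_mtu_change(&m, 1400, should_update); /* echo 1400 > /sys/class/net/eth0/mtu */
	dev_mtu_change(&m, 1500, should_update); /* echo 1500 > /sys/class/net/eth0/mtu */

	return m.route_mtu;
}
```

Calling replay_script(should_update_old) returns 1500 in this model, because the route MTU of 1400 matches the per-device MTU at the time of the second change and is therefore raised. Calling replay_script(should_update_new) returns 1400, so the explicitly configured route MTU survives the device MTU increase.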