diff mbox series

[v2,net-next,1/1] Allow user to set metric on default route learned via Router Advertisement.

Message ID 20210115080203.8889-1-pchaudhary@linkedin.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series [v2,net-next,1/1] Allow user to set metric on default route learned via Router Advertisement. | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net-next
netdev/subject_prefix success Link
netdev/cc_maintainers warning 9 maintainers not CCed: petr.vorel@gmail.com edumazet@google.com alex.aring@gmail.com linmiaohe@huawei.com dalias@libc.org rdunlap@infradead.org haolee.swjtu@gmail.com akpm@linux-foundation.org willemb@google.com
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit fail Errors and warnings before: 24029 this patch: 23712
netdev/kdoc success Errors and warnings before: 38 this patch: 38
netdev/verify_fixes success Link
netdev/checkpatch fail CHECK: Unnecessary parentheses around 'lifetime == 0' CHECK: Unnecessary parentheses around 'rt->fib6_metric != defrtr_usr_metric' ERROR: spaces required around that '=' (ctx:VxV) WARNING: From:/Signed-off-by: email address mismatch: 'From: Praveen Chaudhary <praveen5582@gmail.com>' != 'Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>' WARNING: line length of 81 exceeds 80 columns
netdev/build_allmodconfig_warn fail Errors and warnings before: 23465 this patch: 23096
netdev/header_inline success Link
netdev/stable success Stable not CCed

Commit Message

Praveen Chaudhary Jan. 15, 2021, 8:02 a.m. UTC
For IPv4, default route is learned via DHCPv4 and user is allowed to change
metric using config etc/network/interfaces. But for IPv6, default route can
be learned via RA, for which, currently a fixed metric value 1024 is used.

Ideally, user should be able to configure metric on default route for IPv6
similar to IPv4. This fix adds sysctl for the same.

Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>

Changes in v1.
---
1.) Correct the call to rt6_add_dflt_router.
---

Changes in v2.
[Refer: lkml.org/lkml/2021/1/14/1400]
---
1.) Replace accept_ra_defrtr_metric to ra_defrtr_metric.
2.) Change Type to __u32 instead of __s32.
3.) Change description in Documentation/networking/ip-sysctl.rst.
4.) Use proc_douintvec instead of proc_dointvec.
5.) Code style in ndisc_router_discovery().
6.) Change Type to u32 instead of unsigned int.
---

Logs:
----------------------------------------------------------------
For IPv4:
----------------------------------------------------------------

Config in etc/network/interfaces
----------------------------------------------------------------
```
auto eth0
iface eth0 inet dhcp
    metric 4261413864
```

IPv4 Kernel Route Table:
----------------------------------------------------------------
```
$ ip route list
default via 172.21.47.1 dev eth0 metric 4261413864
```

FRR Table, if a static route is configured.
[In real scenario, it is useful to prefer BGP learned default route over DHCPv4 default route.]
----------------------------------------------------------------
```
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, P - PIM, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       > - selected route, * - FIB route

S>* 0.0.0.0/0 [20/0] is directly connected, eth0, 00:00:03
K   0.0.0.0/0 [254/1000] via 172.21.47.1, eth0, 6d08h51m
```
----------------------------------------------------------------

***i.e. User can prefer Default Router learned via Routing Protocol in IPv4.***
***Similar behavior is not possible for IPv6, without this fix.***

----------------------------------------------------------------
After fix [for IPv6]:
----------------------------------------------------------------
```
sudo sysctl -w net.ipv6.conf.eth0.net.ipv6.conf.eth0.ra_defrtr_metric=1996489705
```

IP monitor: [When IPv6 RA is received]
----------------------------------------------------------------
```
default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489705  pref high
```

Kernel IPv6 routing table
----------------------------------------------------------------
```
$ ip -6 route list
default via fe80::be16:65ff:feb3:ce8e dev eth0 proto ra metric 1996489705 expires 21sec hoplimit 64 pref high
```

FRR Table, if a static route is configured.
[In real scenario, it is useful to prefer BGP learned default route over IPv6 RA default route.]
----------------------------------------------------------------
```
Codes: K - kernel route, C - connected, S - static, R - RIPng,
       O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
       v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       > - selected route, * - FIB route

S>* ::/0 [20/0] is directly connected, eth0, 00:00:06
K   ::/0 [119/1001] via fe80::xx16:xxxx:feb3:ce8e, eth0, 6d07h43m
```

If the metric is changed later, the effect will be seen only when next IPv6
RA is received, because the default route must be fully controlled by RA msg.
Below metric is changed from 1996489705 to 1996489704.
----------------------------------------------------------------
```
$ sudo sysctl -w net.ipv6.conf.eth0.ra_defrtr_metric=1996489704
net.ipv6.conf.eth0.ra_defrtr_metric = 1996489704
```

IP monitor:
[On next IPv6 RA msg, Kernel deletes prev route and installs new route with updated metric]
----------------------------------------------------------------
```
Deleted default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489705  expires 3sec hoplimit 64 pref high
default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489704  pref high
```
---
 Documentation/networking/ip-sysctl.rst | 12 ++++++++++++
 include/linux/ipv6.h                   |  1 +
 include/net/ip6_route.h                |  3 ++-
 include/uapi/linux/ipv6.h              |  1 +
 include/uapi/linux/sysctl.h            |  1 +
 net/ipv6/addrconf.c                    | 10 ++++++++++
 net/ipv6/ndisc.c                       | 14 ++++++++++----
 net/ipv6/route.c                       |  5 +++--
 8 files changed, 40 insertions(+), 7 deletions(-)


base-commit: 139711f033f636cc78b6aaf7363252241b9698ef

Comments

kernel test robot Jan. 15, 2021, 4:20 p.m. UTC | #1
Hi Praveen,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on 139711f033f636cc78b6aaf7363252241b9698ef]

url:    https://github.com/0day-ci/linux/commits/Praveen-Chaudhary/Allow-user-to-set-metric-on-default-route-learned-via-Router-Advertisement/20210115-160758
base:    139711f033f636cc78b6aaf7363252241b9698ef
config: nds32-randconfig-r015-20210115 (attached as .config)
compiler: nds32le-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/35f232fe80f8b50430aee1c6e534cba119c88de8
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Praveen-Chaudhary/Allow-user-to-set-metric-on-default-route-learned-via-Router-Advertisement/20210115-160758
        git checkout 35f232fe80f8b50430aee1c6e534cba119c88de8
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=nds32 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   net/ipv6/ndisc.c: In function 'ndisc_router_discovery':
>> net/ipv6/ndisc.c:1308:35: error: 'struct ipv6_devconf' has no member named 'accept_ra_defrtr_metric'; did you mean 'accept_ra_defrtr'?
    1308 |  defrtr_usr_metric = in6_dev->cnf.accept_ra_defrtr_metric;
         |                                   ^~~~~~~~~~~~~~~~~~~~~~~
         |                                   accept_ra_defrtr

Kconfig warnings: (for reference only)
   WARNING: unmet direct dependencies detected for FRAME_POINTER
   Depends on DEBUG_KERNEL && (M68K || UML || SUPERH) || ARCH_WANT_FRAME_POINTERS
   Selected by
   - LATENCYTOP && DEBUG_KERNEL && STACKTRACE_SUPPORT && PROC_FS && !MIPS && !PPC && !S390 && !MICROBLAZE && !ARM && !ARC && !X86


vim +1308 net/ipv6/ndisc.c

  1241	
  1242		if (in6_dev->if_flags & IF_RS_SENT) {
  1243			/*
  1244			 *	flag that an RA was received after an RS was sent
  1245			 *	out on this interface.
  1246			 */
  1247			in6_dev->if_flags |= IF_RA_RCVD;
  1248		}
  1249	
  1250		/*
  1251		 * Remember the managed/otherconf flags from most recently
  1252		 * received RA message (RFC 2462) -- yoshfuji
  1253		 */
  1254		old_if_flags = in6_dev->if_flags;
  1255		in6_dev->if_flags = (in6_dev->if_flags & ~(IF_RA_MANAGED |
  1256					IF_RA_OTHERCONF)) |
  1257					(ra_msg->icmph.icmp6_addrconf_managed ?
  1258						IF_RA_MANAGED : 0) |
  1259					(ra_msg->icmph.icmp6_addrconf_other ?
  1260						IF_RA_OTHERCONF : 0);
  1261	
  1262		if (old_if_flags != in6_dev->if_flags)
  1263			send_ifinfo_notify = true;
  1264	
  1265		if (!in6_dev->cnf.accept_ra_defrtr) {
  1266			ND_PRINTK(2, info,
  1267				  "RA: %s, defrtr is false for dev: %s\n",
  1268				  __func__, skb->dev->name);
  1269			goto skip_defrtr;
  1270		}
  1271	
  1272		/* Do not accept RA with source-addr found on local machine unless
  1273		 * accept_ra_from_local is set to true.
  1274		 */
  1275		net = dev_net(in6_dev->dev);
  1276		if (!in6_dev->cnf.accept_ra_from_local &&
  1277		    ipv6_chk_addr(net, &ipv6_hdr(skb)->saddr, in6_dev->dev, 0)) {
  1278			ND_PRINTK(2, info,
  1279				  "RA from local address detected on dev: %s: default router ignored\n",
  1280				  skb->dev->name);
  1281			goto skip_defrtr;
  1282		}
  1283	
  1284		lifetime = ntohs(ra_msg->icmph.icmp6_rt_lifetime);
  1285	
  1286	#ifdef CONFIG_IPV6_ROUTER_PREF
  1287		pref = ra_msg->icmph.icmp6_router_pref;
  1288		/* 10b is handled as if it were 00b (medium) */
  1289		if (pref == ICMPV6_ROUTER_PREF_INVALID ||
  1290		    !in6_dev->cnf.accept_ra_rtr_pref)
  1291			pref = ICMPV6_ROUTER_PREF_MEDIUM;
  1292	#endif
  1293		/* routes added from RAs do not use nexthop objects */
  1294		rt = rt6_get_dflt_router(net, &ipv6_hdr(skb)->saddr, skb->dev);
  1295		if (rt) {
  1296			neigh = ip6_neigh_lookup(&rt->fib6_nh->fib_nh_gw6,
  1297						 rt->fib6_nh->fib_nh_dev, NULL,
  1298						  &ipv6_hdr(skb)->saddr);
  1299			if (!neigh) {
  1300				ND_PRINTK(0, err,
  1301					  "RA: %s got default router without neighbour\n",
  1302					  __func__);
  1303				fib6_info_release(rt);
  1304				return;
  1305			}
  1306		}
  1307		/* Set default route metric if specified by user */
> 1308		defrtr_usr_metric = in6_dev->cnf.accept_ra_defrtr_metric;
  1309		if (defrtr_usr_metric == 0)
  1310			defrtr_usr_metric = IP6_RT_PRIO_USER;
  1311		/* delete the route if lifetime is 0 or if metric needs change */
  1312		if (rt && ((lifetime == 0) || (rt->fib6_metric != defrtr_usr_metric)))  {
  1313			ip6_del_rt(net, rt, false);
  1314			rt = NULL;
  1315		}
  1316	
  1317		ND_PRINTK(3, info, "RA: rt: %p  lifetime: %d, metric: %d, for dev: %s\n",
  1318			  rt, lifetime, defrtr_usr_metric, skb->dev->name);
  1319		if (!rt && lifetime) {
  1320			ND_PRINTK(3, info, "RA: adding default router\n");
  1321	
  1322			rt = rt6_add_dflt_router(net, &ipv6_hdr(skb)->saddr,
  1323						 skb->dev, pref, defrtr_usr_metric);
  1324			if (!rt) {
  1325				ND_PRINTK(0, err,
  1326					  "RA: %s failed to add default route\n",
  1327					  __func__);
  1328				return;
  1329			}
  1330	
  1331			neigh = ip6_neigh_lookup(&rt->fib6_nh->fib_nh_gw6,
  1332						 rt->fib6_nh->fib_nh_dev, NULL,
  1333						  &ipv6_hdr(skb)->saddr);
  1334			if (!neigh) {
  1335				ND_PRINTK(0, err,
  1336					  "RA: %s got default router without neighbour\n",
  1337					  __func__);
  1338				fib6_info_release(rt);
  1339				return;
  1340			}
  1341			neigh->flags |= NTF_ROUTER;
  1342		} else if (rt) {
  1343			rt->fib6_flags = (rt->fib6_flags & ~RTF_PREF_MASK) | RTF_PREF(pref);
  1344		}
  1345	
  1346		if (rt)
  1347			fib6_set_expires(rt, jiffies + (HZ * lifetime));
  1348		if (in6_dev->cnf.accept_ra_min_hop_limit < 256 &&
  1349		    ra_msg->icmph.icmp6_hop_limit) {
  1350			if (in6_dev->cnf.accept_ra_min_hop_limit <= ra_msg->icmph.icmp6_hop_limit) {
  1351				in6_dev->cnf.hop_limit = ra_msg->icmph.icmp6_hop_limit;
  1352				fib6_metric_set(rt, RTAX_HOPLIMIT,
  1353						ra_msg->icmph.icmp6_hop_limit);
  1354			} else {
  1355				ND_PRINTK(2, warn, "RA: Got route advertisement with lower hop_limit than minimum\n");
  1356			}
  1357		}
  1358	
  1359	skip_defrtr:
  1360	
  1361		/*
  1362		 *	Update Reachable Time and Retrans Timer
  1363		 */
  1364	
  1365		if (in6_dev->nd_parms) {
  1366			unsigned long rtime = ntohl(ra_msg->retrans_timer);
  1367	
  1368			if (rtime && rtime/1000 < MAX_SCHEDULE_TIMEOUT/HZ) {
  1369				rtime = (rtime*HZ)/1000;
  1370				if (rtime < HZ/100)
  1371					rtime = HZ/100;
  1372				NEIGH_VAR_SET(in6_dev->nd_parms, RETRANS_TIME, rtime);
  1373				in6_dev->tstamp = jiffies;
  1374				send_ifinfo_notify = true;
  1375			}
  1376	
  1377			rtime = ntohl(ra_msg->reachable_time);
  1378			if (rtime && rtime/1000 < MAX_SCHEDULE_TIMEOUT/(3*HZ)) {
  1379				rtime = (rtime*HZ)/1000;
  1380	
  1381				if (rtime < HZ/10)
  1382					rtime = HZ/10;
  1383	
  1384				if (rtime != NEIGH_VAR(in6_dev->nd_parms, BASE_REACHABLE_TIME)) {
  1385					NEIGH_VAR_SET(in6_dev->nd_parms,
  1386						      BASE_REACHABLE_TIME, rtime);
  1387					NEIGH_VAR_SET(in6_dev->nd_parms,
  1388						      GC_STALETIME, 3 * rtime);
  1389					in6_dev->nd_parms->reachable_time = neigh_rand_reach_time(rtime);
  1390					in6_dev->tstamp = jiffies;
  1391					send_ifinfo_notify = true;
  1392				}
  1393			}
  1394		}
  1395	
  1396		/*
  1397		 *	Send a notify if RA changed managed/otherconf flags or timer settings
  1398		 */
  1399		if (send_ifinfo_notify)
  1400			inet6_ifinfo_notify(RTM_NEWLINK, in6_dev);
  1401	
  1402	skip_linkparms:
  1403	
  1404		/*
  1405		 *	Process options.
  1406		 */
  1407	
  1408		if (!neigh)
  1409			neigh = __neigh_lookup(&nd_tbl, &ipv6_hdr(skb)->saddr,
  1410					       skb->dev, 1);
  1411		if (neigh) {
  1412			u8 *lladdr = NULL;
  1413			if (ndopts.nd_opts_src_lladdr) {
  1414				lladdr = ndisc_opt_addr_data(ndopts.nd_opts_src_lladdr,
  1415							     skb->dev);
  1416				if (!lladdr) {
  1417					ND_PRINTK(2, warn,
  1418						  "RA: invalid link-layer address length\n");
  1419					goto out;
  1420				}
  1421			}
  1422			ndisc_update(skb->dev, neigh, lladdr, NUD_STALE,
  1423				     NEIGH_UPDATE_F_WEAK_OVERRIDE|
  1424				     NEIGH_UPDATE_F_OVERRIDE|
  1425				     NEIGH_UPDATE_F_OVERRIDE_ISROUTER|
  1426				     NEIGH_UPDATE_F_ISROUTER,
  1427				     NDISC_ROUTER_ADVERTISEMENT, &ndopts);
  1428		}
  1429	
  1430		if (!ipv6_accept_ra(in6_dev)) {
  1431			ND_PRINTK(2, info,
  1432				  "RA: %s, accept_ra is false for dev: %s\n",
  1433				  __func__, skb->dev->name);
  1434			goto out;
  1435		}
  1436	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
David Ahern Jan. 16, 2021, 5:13 p.m. UTC | #2
On 1/15/21 1:02 AM, Praveen Chaudhary wrote:
> For IPv4, default route is learned via DHCPv4 and user is allowed to change
> metric using config etc/network/interfaces. But for IPv6, default route can
> be learned via RA, for which, currently a fixed metric value 1024 is used.
> 
> Ideally, user should be able to configure metric on default route for IPv6
> similar to IPv4. This fix adds sysctl for the same.
> 
> Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
> Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
> 
> Changes in v1.
> ---

your trying to be too fancy in the log messages; everything after this
first '---' is dropped. Just Remove all of the '---' lines and '```' tags.

> 1.) Correct the call to rt6_add_dflt_router.
> ---
> 
> Changes in v2.
> [Refer: lkml.org/lkml/2021/1/14/1400]
> ---
> 1.) Replace accept_ra_defrtr_metric to ra_defrtr_metric.
> 2.) Change Type to __u32 instead of __s32.
> 3.) Change description in Documentation/networking/ip-sysctl.rst.
> 4.) Use proc_douintvec instead of proc_dointvec.
> 5.) Code style in ndisc_router_discovery().
> 6.) Change Type to u32 instead of unsigned int.
> ---
> 
> Logs:
> ----------------------------------------------------------------
> For IPv4:
> ----------------------------------------------------------------
> 
> Config in etc/network/interfaces
> ----------------------------------------------------------------
> ```
> auto eth0
> iface eth0 inet dhcp
>     metric 4261413864

how does that work for IPv4? Is the metric passed to the dhclient and it
inserts the route with the given metric or is a dhclient script used to
replace the route after insert?


> diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
> index dd2b12a32b73..c4b8d4b8d213 100644
> --- a/Documentation/networking/ip-sysctl.rst
> +++ b/Documentation/networking/ip-sysctl.rst
> @@ -1871,6 +1871,18 @@ accept_ra_defrtr - BOOLEAN
>  		- enabled if accept_ra is enabled.
>  		- disabled if accept_ra is disabled.
>  
> +ra_defrtr_metric - INTEGER
> +	Route metric for default route learned in Router Advertisement. This value
> +	will be assigned as metric for the default route learned via IPv6 Router
> +	Advertisement. Takes affect only if accept_ra_defrtr' is enabled.

stray ' after accept_ra_defrtr

> +
> +	Possible values are:
> +		0:
> +			default value will be used for route metric
> +			i.e. IP6_RT_PRIO_USER 1024.
> +		1 to 0xFFFFFFFF:
> +			current value will be used for route metric.
> +
>  accept_ra_from_local - BOOLEAN
>  	Accept RA with source-address that is found on local machine
>  	if the RA is otherwise proper and able to be accepted.



> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index eff2cacd5209..b13d3213e58f 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -205,6 +205,7 @@ static struct ipv6_devconf ipv6_devconf __read_mostly = {
>  	.max_desync_factor	= MAX_DESYNC_FACTOR,
>  	.max_addresses		= IPV6_MAX_ADDRESSES,
>  	.accept_ra_defrtr	= 1,
> +	.ra_defrtr_metric = 0,

make the the '=' align column wise with the existing entries; seems like
your new line is missing a tab

>  	.accept_ra_from_local	= 0,
>  	.accept_ra_min_hop_limit= 1,
>  	.accept_ra_pinfo	= 1,
> @@ -260,6 +261,7 @@ static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
>  	.max_desync_factor	= MAX_DESYNC_FACTOR,
>  	.max_addresses		= IPV6_MAX_ADDRESSES,
>  	.accept_ra_defrtr	= 1,
> +	.ra_defrtr_metric = 0,

same here

>  	.accept_ra_from_local	= 0,
>  	.accept_ra_min_hop_limit= 1,
>  	.accept_ra_pinfo	= 1,
> @@ -5475,6 +5477,7 @@ static inline void ipv6_store_devconf(struct ipv6_devconf *cnf,
>  	array[DEVCONF_MAX_DESYNC_FACTOR] = cnf->max_desync_factor;
>  	array[DEVCONF_MAX_ADDRESSES] = cnf->max_addresses;
>  	array[DEVCONF_ACCEPT_RA_DEFRTR] = cnf->accept_ra_defrtr;
> +	array[DEVCONF_RA_DEFRTR_METRIC] = cnf->ra_defrtr_metric;
>  	array[DEVCONF_ACCEPT_RA_MIN_HOP_LIMIT] = cnf->accept_ra_min_hop_limit;
>  	array[DEVCONF_ACCEPT_RA_PINFO] = cnf->accept_ra_pinfo;
>  #ifdef CONFIG_IPV6_ROUTER_PREF
> @@ -6667,6 +6670,13 @@ static const struct ctl_table addrconf_sysctl[] = {
>  		.mode		= 0644,
>  		.proc_handler	= proc_dointvec,
>  	},
> +	{
> +		.procname	= "ra_defrtr_metric",
> +		.data		= &ipv6_devconf.ra_defrtr_metric,
> +		.maxlen		= sizeof(u32),
> +		.mode		= 0644,
> +		.proc_handler	= proc_douintvec,
> +	},
>  	{
>  		.procname	= "accept_ra_min_hop_limit",
>  		.data		= &ipv6_devconf.accept_ra_min_hop_limit,
> diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
> index 76717478f173..2bffed49f5c0 100644
> --- a/net/ipv6/ndisc.c
> +++ b/net/ipv6/ndisc.c
> @@ -1173,6 +1173,7 @@ static void ndisc_router_discovery(struct sk_buff *skb)
>  	struct neighbour *neigh = NULL;
>  	struct inet6_dev *in6_dev;
>  	struct fib6_info *rt = NULL;
> +	u32 defrtr_usr_metric;
>  	struct net *net;
>  	int lifetime;
>  	struct ndisc_options ndopts;
> @@ -1303,18 +1304,23 @@ static void ndisc_router_discovery(struct sk_buff *skb)
>  			return;
>  		}
>  	}
> -	if (rt && lifetime == 0) {
> +	/* Set default route metric if specified by user */
> +	defrtr_usr_metric = in6_dev->cnf.accept_ra_defrtr_metric;

this tells me you did not compile this version of the patch since the
'accept_' has been dropped. Always compile and test every patch before
sending.
Praveen Chaudhary Jan. 19, 2021, 10:17 p.m. UTC | #3
> On Jan 16, 2021, at 9:13 AM, David Ahern <dsahern@gmail.com> wrote:
> 
> On 1/15/21 1:02 AM, Praveen Chaudhary wrote:
>> For IPv4, default route is learned via DHCPv4 and user is allowed to change
>> metric using config etc/network/interfaces. But for IPv6, default route can
>> be learned via RA, for which, currently a fixed metric value 1024 is used.
>> 
>> Ideally, user should be able to configure metric on default route for IPv6
>> similar to IPv4. This fix adds sysctl for the same.
>> 
>> Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
>> Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
>> 
>> Changes in v1.
>> ---
> 
> your trying to be too fancy in the log messages; everything after this
> first '---' is dropped. Just Remove all of the '---' lines and '```' tags.
> 

Removed all ‘—‘ and ‘```’ in v3.

>> 1.) Correct the call to rt6_add_dflt_router.
>> ---
>> 
>> Changes in v2.
>> [Refer: lkml.org/lkml/2021/1/14/1400]
>> ---
>> 1.) Replace accept_ra_defrtr_metric to ra_defrtr_metric.
>> 2.) Change Type to __u32 instead of __s32.
>> 3.) Change description in Documentation/networking/ip-sysctl.rst.
>> 4.) Use proc_douintvec instead of proc_dointvec.
>> 5.) Code style in ndisc_router_discovery().
>> 6.) Change Type to u32 instead of unsigned int.
>> ---
>> 
>> Logs:
>> ----------------------------------------------------------------
>> For IPv4:
>> ----------------------------------------------------------------
>> 
>> Config in etc/network/interfaces
>> ----------------------------------------------------------------
>> ```
>> auto eth0
>> iface eth0 inet dhcp
>>    metric 4261413864
> 
> how does that work for IPv4? Is the metric passed to the dhclient and it
> inserts the route with the given metric or is a dhclient script used to
> replace the route after insert?
> 
> 

Yes, DHCP client picks config under “iface eth0 inet dhcp” line and if metric is configured, then it adds the metric for all added routes.


>> diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
>> index dd2b12a32b73..c4b8d4b8d213 100644
>> --- a/Documentation/networking/ip-sysctl.rst
>> +++ b/Documentation/networking/ip-sysctl.rst
>> @@ -1871,6 +1871,18 @@ accept_ra_defrtr - BOOLEAN
>> 		- enabled if accept_ra is enabled.
>> 		- disabled if accept_ra is disabled.
>> 
>> +ra_defrtr_metric - INTEGER
>> +	Route metric for default route learned in Router Advertisement. This value
>> +	will be assigned as metric for the default route learned via IPv6 Router
>> +	Advertisement. Takes affect only if accept_ra_defrtr' is enabled.
> 
> stray ' after accept_ra_defrtr
> 

Removed.

>> +
>> +	Possible values are:
>> +		0:
>> +			default value will be used for route metric
>> +			i.e. IP6_RT_PRIO_USER 1024.
>> +		1 to 0xFFFFFFFF:
>> +			current value will be used for route metric.
>> +
>> accept_ra_from_local - BOOLEAN
>> 	Accept RA with source-address that is found on local machine
>> 	if the RA is otherwise proper and able to be accepted.
> 
> 
> 
>> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
>> index eff2cacd5209..b13d3213e58f 100644
>> --- a/net/ipv6/addrconf.c
>> +++ b/net/ipv6/addrconf.c
>> @@ -205,6 +205,7 @@ static struct ipv6_devconf ipv6_devconf __read_mostly = {
>> 	.max_desync_factor	= MAX_DESYNC_FACTOR,
>> 	.max_addresses		= IPV6_MAX_ADDRESSES,
>> 	.accept_ra_defrtr	= 1,
>> +	.ra_defrtr_metric = 0,
> 
> make the the '=' align column wise with the existing entries; seems like
> your new line is missing a tab

Fixed.

> 
>> 	.accept_ra_from_local	= 0,
>> 	.accept_ra_min_hop_limit= 1,
>> 	.accept_ra_pinfo	= 1,
>> @@ -260,6 +261,7 @@ static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
>> 	.max_desync_factor	= MAX_DESYNC_FACTOR,
>> 	.max_addresses		= IPV6_MAX_ADDRESSES,
>> 	.accept_ra_defrtr	= 1,
>> +	.ra_defrtr_metric = 0,
> 
> same here

Fixed.

> 
>> 	.accept_ra_from_local	= 0,
>> 	.accept_ra_min_hop_limit= 1,
>> 	.accept_ra_pinfo	= 1,
>> @@ -5475,6 +5477,7 @@ static inline void ipv6_store_devconf(struct ipv6_devconf *cnf,
>> 	array[DEVCONF_MAX_DESYNC_FACTOR] = cnf->max_desync_factor;
>> 	array[DEVCONF_MAX_ADDRESSES] = cnf->max_addresses;
>> 	array[DEVCONF_ACCEPT_RA_DEFRTR] = cnf->accept_ra_defrtr;
>> +	array[DEVCONF_RA_DEFRTR_METRIC] = cnf->ra_defrtr_metric;
>> 	array[DEVCONF_ACCEPT_RA_MIN_HOP_LIMIT] = cnf->accept_ra_min_hop_limit;
>> 	array[DEVCONF_ACCEPT_RA_PINFO] = cnf->accept_ra_pinfo;
>> #ifdef CONFIG_IPV6_ROUTER_PREF
>> @@ -6667,6 +6670,13 @@ static const struct ctl_table addrconf_sysctl[] = {
>> 		.mode		= 0644,
>> 		.proc_handler	= proc_dointvec,
>> 	},
>> +	{
>> +		.procname	= "ra_defrtr_metric",
>> +		.data		= &ipv6_devconf.ra_defrtr_metric,
>> +		.maxlen		= sizeof(u32),
>> +		.mode		= 0644,
>> +		.proc_handler	= proc_douintvec,
>> +	},
>> 	{
>> 		.procname	= "accept_ra_min_hop_limit",
>> 		.data		= &ipv6_devconf.accept_ra_min_hop_limit,
>> diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
>> index 76717478f173..2bffed49f5c0 100644
>> --- a/net/ipv6/ndisc.c
>> +++ b/net/ipv6/ndisc.c
>> @@ -1173,6 +1173,7 @@ static void ndisc_router_discovery(struct sk_buff *skb)
>> 	struct neighbour *neigh = NULL;
>> 	struct inet6_dev *in6_dev;
>> 	struct fib6_info *rt = NULL;
>> +	u32 defrtr_usr_metric;
>> 	struct net *net;
>> 	int lifetime;
>> 	struct ndisc_options ndopts;
>> @@ -1303,18 +1304,23 @@ static void ndisc_router_discovery(struct sk_buff *skb)
>> 			return;
>> 		}
>> 	}
>> -	if (rt && lifetime == 0) {
>> +	/* Set default route metric if specified by user */
>> +	defrtr_usr_metric = in6_dev->cnf.accept_ra_defrtr_metric;
> 
> this tells me you did not compile this version of the patch since the
> 'accept_' has been dropped. Always compile and test every patch before
> sending.

Yeah one patch was pushed bit early. Sorry about that. I will take care of this now onwards.

I have respined the Patch (v3) after addressing your review comments. Build is done in our pipeline.
Build logs: https://dev.azure.com/sonicswitch/build/_build/results?buildId=1669&view=logs&j=011e1ec8-6569-5e69-4f06-baf193d1351e

Thanks a lot again for spending time for this Review,
This feature will help SONiC OS [and others Linux flavors] for better IPv6 support, so thanks again.


Praveen 
https://github.com/praveen-li
David Ahern Jan. 20, 2021, 4:22 a.m. UTC | #4
On 1/19/21 3:17 PM, praveen chaudhary wrote:
>>> ----------------------------------------------------------------
>>> For IPv4:
>>> ----------------------------------------------------------------
>>>
>>> Config in etc/network/interfaces
>>> ----------------------------------------------------------------
>>> ```
>>> auto eth0
>>> iface eth0 inet dhcp
>>>    metric 4261413864
>>
>> how does that work for IPv4? Is the metric passed to the dhclient and it
>> inserts the route with the given metric or is a dhclient script used to
>> replace the route after insert?
>>
>>
> 
> Yes, DHCP client picks config under “iface eth0 inet dhcp” line and if metric is configured, then it adds the metric for all added routes.

As I recall ifupdown{2} forks dhclient as a process to handle dhcp
config, and I believe there is a script that handles adding the default
route with metric. Meaning ... it is not comparable to an RA.

> 
> Thanks a lot again for spending time for this Review,
> This feature will help SONiC OS [and others Linux flavors] for better IPv6 support, so thanks again.

I think SONiC is an abomination, so that is definitely not the
motivation for my reviews. :-)
Praveen Chaudhary Jan. 20, 2021, 6:51 a.m. UTC | #5
> On Jan 19, 2021, at 8:22 PM, David Ahern <dsahern@gmail.com> wrote:
> 
> On 1/19/21 3:17 PM, praveen chaudhary wrote:
>>>> ----------------------------------------------------------------
>>>> For IPv4:
>>>> ----------------------------------------------------------------
>>>> 
>>>> Config in etc/network/interfaces
>>>> ----------------------------------------------------------------
>>>> ```
>>>> auto eth0
>>>> iface eth0 inet dhcp
>>>>   metric 4261413864
>>> 
>>> how does that work for IPv4? Is the metric passed to the dhclient and it
>>> inserts the route with the given metric or is a dhclient script used to
>>> replace the route after insert?
>>> 
>>> 
>> 
>> Yes, DHCP client picks config under “iface eth0 inet dhcp” line and if metric is configured, then it adds the metric for all added routes.
> 
> As I recall ifupdown{2} forks dhclient as a process to handle dhcp
> config, and I believe there is a script that handles adding the default
> route with metric. Meaning ... it is not comparable to an RA.
> 

I hope, we both will agree that a fixed metric value on default route learned via RA 
restricts Network Administrators today. And such issues hinder the deployment
of IPv6 only networks. So if we agree that in future we may need to allow  a
configurable value for metric then this fix makes good sense.
BTW, kindly let me know if there is a better way to configure this metric. I think,
sysctl is the only way.


>> 
>> Thanks a lot again for spending time for this Review,
>> This feature will help SONiC OS [and others Linux flavors] for better IPv6 support, so thanks again.
> 
> I think SONiC is an abomination, so that is definitely not the
> motivation for my reviews. :-)
> 

Trying to make things better day by day. That is the only solace for Software Engineers :-). 

I really appreciate for your time to review this patch. Cheers.
diff mbox series

Patch

diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index dd2b12a32b73..c4b8d4b8d213 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -1871,6 +1871,18 @@  accept_ra_defrtr - BOOLEAN
 		- enabled if accept_ra is enabled.
 		- disabled if accept_ra is disabled.
 
+ra_defrtr_metric - INTEGER
+	Route metric for default route learned in Router Advertisement. This value
+	will be assigned as metric for the default route learned via IPv6 Router
+	Advertisement. Takes affect only if accept_ra_defrtr' is enabled.
+
+	Possible values are:
+		0:
+			default value will be used for route metric
+			i.e. IP6_RT_PRIO_USER 1024.
+		1 to 0xFFFFFFFF:
+			current value will be used for route metric.
+
 accept_ra_from_local - BOOLEAN
 	Accept RA with source-address that is found on local machine
 	if the RA is otherwise proper and able to be accepted.
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index dda61d150a13..9d1f29f0c512 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -31,6 +31,7 @@  struct ipv6_devconf {
 	__s32		max_desync_factor;
 	__s32		max_addresses;
 	__s32		accept_ra_defrtr;
+	__u32		ra_defrtr_metric;
 	__s32		accept_ra_min_hop_limit;
 	__s32		accept_ra_pinfo;
 	__s32		ignore_routes_with_linkdown;
diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h
index 2a5277758379..f51a118bfce8 100644
--- a/include/net/ip6_route.h
+++ b/include/net/ip6_route.h
@@ -174,7 +174,8 @@  struct fib6_info *rt6_get_dflt_router(struct net *net,
 				     struct net_device *dev);
 struct fib6_info *rt6_add_dflt_router(struct net *net,
 				     const struct in6_addr *gwaddr,
-				     struct net_device *dev, unsigned int pref);
+				     struct net_device *dev, unsigned int pref,
+				     u32 defrtr_usr_metric);
 
 void rt6_purge_dflt_routers(struct net *net);
 
diff --git a/include/uapi/linux/ipv6.h b/include/uapi/linux/ipv6.h
index 13e8751bf24a..70603775fe91 100644
--- a/include/uapi/linux/ipv6.h
+++ b/include/uapi/linux/ipv6.h
@@ -189,6 +189,7 @@  enum {
 	DEVCONF_ACCEPT_RA_RT_INFO_MIN_PLEN,
 	DEVCONF_NDISC_TCLASS,
 	DEVCONF_RPL_SEG_ENABLED,
+	DEVCONF_RA_DEFRTR_METRIC,
 	DEVCONF_MAX
 };
 
diff --git a/include/uapi/linux/sysctl.h b/include/uapi/linux/sysctl.h
index 458179df9b27..1e05d3caa712 100644
--- a/include/uapi/linux/sysctl.h
+++ b/include/uapi/linux/sysctl.h
@@ -571,6 +571,7 @@  enum {
 	NET_IPV6_ACCEPT_SOURCE_ROUTE=25,
 	NET_IPV6_ACCEPT_RA_FROM_LOCAL=26,
 	NET_IPV6_ACCEPT_RA_RT_INFO_MIN_PLEN=27,
+	NET_IPV6_RA_DEFRTR_METRIC=28,
 	__NET_IPV6_MAX
 };
 
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index eff2cacd5209..b13d3213e58f 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -205,6 +205,7 @@  static struct ipv6_devconf ipv6_devconf __read_mostly = {
 	.max_desync_factor	= MAX_DESYNC_FACTOR,
 	.max_addresses		= IPV6_MAX_ADDRESSES,
 	.accept_ra_defrtr	= 1,
+	.ra_defrtr_metric = 0,
 	.accept_ra_from_local	= 0,
 	.accept_ra_min_hop_limit= 1,
 	.accept_ra_pinfo	= 1,
@@ -260,6 +261,7 @@  static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
 	.max_desync_factor	= MAX_DESYNC_FACTOR,
 	.max_addresses		= IPV6_MAX_ADDRESSES,
 	.accept_ra_defrtr	= 1,
+	.ra_defrtr_metric = 0,
 	.accept_ra_from_local	= 0,
 	.accept_ra_min_hop_limit= 1,
 	.accept_ra_pinfo	= 1,
@@ -5475,6 +5477,7 @@  static inline void ipv6_store_devconf(struct ipv6_devconf *cnf,
 	array[DEVCONF_MAX_DESYNC_FACTOR] = cnf->max_desync_factor;
 	array[DEVCONF_MAX_ADDRESSES] = cnf->max_addresses;
 	array[DEVCONF_ACCEPT_RA_DEFRTR] = cnf->accept_ra_defrtr;
+	array[DEVCONF_RA_DEFRTR_METRIC] = cnf->ra_defrtr_metric;
 	array[DEVCONF_ACCEPT_RA_MIN_HOP_LIMIT] = cnf->accept_ra_min_hop_limit;
 	array[DEVCONF_ACCEPT_RA_PINFO] = cnf->accept_ra_pinfo;
 #ifdef CONFIG_IPV6_ROUTER_PREF
@@ -6667,6 +6670,13 @@  static const struct ctl_table addrconf_sysctl[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec,
 	},
+	{
+		.procname	= "ra_defrtr_metric",
+		.data		= &ipv6_devconf.ra_defrtr_metric,
+		.maxlen		= sizeof(u32),
+		.mode		= 0644,
+		.proc_handler	= proc_douintvec,
+	},
 	{
 		.procname	= "accept_ra_min_hop_limit",
 		.data		= &ipv6_devconf.accept_ra_min_hop_limit,
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 76717478f173..2bffed49f5c0 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1173,6 +1173,7 @@  static void ndisc_router_discovery(struct sk_buff *skb)
 	struct neighbour *neigh = NULL;
 	struct inet6_dev *in6_dev;
 	struct fib6_info *rt = NULL;
+	u32 defrtr_usr_metric;
 	struct net *net;
 	int lifetime;
 	struct ndisc_options ndopts;
@@ -1303,18 +1304,23 @@  static void ndisc_router_discovery(struct sk_buff *skb)
 			return;
 		}
 	}
-	if (rt && lifetime == 0) {
+	/* Set default route metric if specified by user */
+	defrtr_usr_metric = in6_dev->cnf.accept_ra_defrtr_metric;
+	if (defrtr_usr_metric == 0)
+		defrtr_usr_metric = IP6_RT_PRIO_USER;
+	/* delete the route if lifetime is 0 or if metric needs change */
+	if (rt && ((lifetime == 0) || (rt->fib6_metric != defrtr_usr_metric)))  {
 		ip6_del_rt(net, rt, false);
 		rt = NULL;
 	}
 
-	ND_PRINTK(3, info, "RA: rt: %p  lifetime: %d, for dev: %s\n",
-		  rt, lifetime, skb->dev->name);
+	ND_PRINTK(3, info, "RA: rt: %p  lifetime: %d, metric: %d, for dev: %s\n",
+		  rt, lifetime, defrtr_usr_metric, skb->dev->name);
 	if (!rt && lifetime) {
 		ND_PRINTK(3, info, "RA: adding default router\n");
 
 		rt = rt6_add_dflt_router(net, &ipv6_hdr(skb)->saddr,
-					 skb->dev, pref);
+					 skb->dev, pref, defrtr_usr_metric);
 		if (!rt) {
 			ND_PRINTK(0, err,
 				  "RA: %s failed to add default route\n",
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 188e114b29b4..64fe5b51b0c2 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -4252,11 +4252,12 @@  struct fib6_info *rt6_get_dflt_router(struct net *net,
 struct fib6_info *rt6_add_dflt_router(struct net *net,
 				     const struct in6_addr *gwaddr,
 				     struct net_device *dev,
-				     unsigned int pref)
+				     unsigned int pref,
+				     u32 defrtr_usr_metric)
 {
 	struct fib6_config cfg = {
 		.fc_table	= l3mdev_fib_table(dev) ? : RT6_TABLE_DFLT,
-		.fc_metric	= IP6_RT_PRIO_USER,
+		.fc_metric	= defrtr_usr_metric ? : IP6_RT_PRIO_USER,
 		.fc_ifindex	= dev->ifindex,
 		.fc_flags	= RTF_GATEWAY | RTF_ADDRCONF | RTF_DEFAULT |
 				  RTF_UP | RTF_EXPIRES | RTF_PREF(pref),