Message ID | 77658f2ff9f9de796ae2386f60b2a372882befa6.1616608328.git.andreas.a.roeseler@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | add support for RFC 8335 PROBE | expand |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net-next |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | success | CCed 5 of 5 maintainers |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 3 this patch: 3 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | warning | WARNING: 'recieved' may be misspelled - perhaps 'received'? WARNING: line length of 101 exceeds 80 columns WARNING: line length of 83 exceeds 80 columns WARNING: line length of 89 exceeds 80 columns |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 3 this patch: 3 |
netdev/header_inline | success | Link |
On 3/24/21 7:18 PM, Andreas Roeseler wrote: > Modify the icmp_rcv function to check PROBE messages and call icmp_echo > if a PROBE request is detected. > ... > @@ -1340,6 +1440,7 @@ static int __net_init icmp_sk_init(struct net *net) > > /* Control parameters for ECHO replies. */ > net->ipv4.sysctl_icmp_echo_ignore_all = 0; > + net->ipv4.sysctl_icmp_echo_enable_probe = 0; > net->ipv4.sysctl_icmp_echo_ignore_broadcasts = 1; > > /* Control parameter - ignore bogus broadcast responses? */ > Where is sysctl_icmp_echo_enable_probe defined ? Please include a cover letter, also add proper documentation for any new sysctl in Documentation/networking/ip-sysctl.rst
On Wed, 2021-03-24 at 20:47 +0100, Eric Dumazet wrote: > > > On 3/24/21 7:18 PM, Andreas Roeseler wrote: > > Modify the icmp_rcv function to check PROBE messages and call > > icmp_echo > > if a PROBE request is detected. > > > > > ... > > > @@ -1340,6 +1440,7 @@ static int __net_init icmp_sk_init(struct net > > *net) > > > > /* Control parameters for ECHO replies. */ > > net->ipv4.sysctl_icmp_echo_ignore_all = 0; > > + net->ipv4.sysctl_icmp_echo_enable_probe = 0; > > net->ipv4.sysctl_icmp_echo_ignore_broadcasts = 1; > > > > /* Control parameter - ignore bogus broadcast responses? */ > > > > Where is sysctl_icmp_echo_enable_probe defined ? It is defined in patch 3 of this patchset. > > Please include a cover letter, also add proper documentation for any > new sysctl > in Documentation/networking/ip-sysctl.rst > Will do.
On Wed, Mar 24, 2021 at 2:20 PM Andreas Roeseler <andreas.a.roeseler@gmail.com> wrote: > > Modify the icmp_rcv function to check PROBE messages and call icmp_echo > if a PROBE request is detected. > > Modify the existing icmp_echo function to respond ot both ping and PROBE > requests. > > This was tested using a custom modification to the iputils package and > wireshark. It supports IPV4 probing by name, ifindex, and probing by > both IPV4 and IPV6 addresses. It currently does not support responding > to probes off the proxy node (see RFC 8335 Section 2). > > The modification to the iputils package is still in development and can > be found here: https://github.com/Juniper-Clinic-2020/iputils.git. It > supports full sending functionality of PROBE requests, but currently > does not parse the response messages, which is why Wireshark is required > to verify the sent and recieved PROBE messages. The modification adds > the ``-e'' flag to the command which allows the user to specify the > interface identifier to query the probed host. An example usage would be > <./ping -4 -e 1 [destination]> to send a PROBE request of ifindex 1 to the > destination node. > > Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com> > static bool icmp_echo(struct sk_buff *skb) > { > + struct icmp_ext_hdr *ext_hdr, _ext_hdr; > + struct icmp_ext_echo_iio *iio, _iio; > + struct icmp_bxm icmp_param; > + struct net_device *dev; > + char buff[IFNAMSIZ]; > struct net *net; > + u16 ident_len; > + u8 status; > > net = dev_net(skb_dst(skb)->dev); > - if (!net->ipv4.sysctl_icmp_echo_ignore_all) { > - struct icmp_bxm icmp_param; > + /* should there be an ICMP stat for ignored echos? */ > + if (net->ipv4.sysctl_icmp_echo_ignore_all) > + return true; > + > + icmp_param.data.icmph = *icmp_hdr(skb); > + icmp_param.skb = skb; > + icmp_param.offset = 0; > + icmp_param.data_len = skb->len; > + icmp_param.head_len = sizeof(struct icmphdr); > > - icmp_param.data.icmph = *icmp_hdr(skb); > + if (icmp_param.data.icmph.type == ICMP_ECHO) { > icmp_param.data.icmph.type = ICMP_ECHOREPLY; > - icmp_param.skb = skb; > - icmp_param.offset = 0; > - icmp_param.data_len = skb->len; > - icmp_param.head_len = sizeof(struct icmphdr); > - icmp_reply(&icmp_param, skb); > + goto send_reply; > } > - /* should there be an ICMP stat for ignored echos? */ > - return true; > + if (!net->ipv4.sysctl_icmp_echo_enable_probe) > + return true; > + /* We currently only support probing interfaces on the proxy node > + * Check to ensure L-bit is set > + */ > + if (!(ntohs(icmp_param.data.icmph.un.echo.sequence) & 1)) > + return true; > + /* Clear status bits in reply message */ > + icmp_param.data.icmph.un.echo.sequence &= htons(0xFF00); > + icmp_param.data.icmph.type = ICMP_EXT_ECHOREPLY; > + ext_hdr = skb_header_pointer(skb, 0, sizeof(_ext_hdr), &_ext_hdr); > + iio = skb_header_pointer(skb, sizeof(_ext_hdr), sizeof(_iio), &_iio); This requires that the packet holds a full sizeof(_iio) that is capable of storing an ipv6 address regardless of the class_type. That is not required by the spec, I assume. If not requiring that, do have to do bounds checking for each individual case, e.g., that an ifname fits in the packet if that may be shorter than IFNAMSIZ. > + if (!ext_hdr || !iio) > + goto send_mal_query; > + if (ntohs(iio->extobj_hdr.length) <= sizeof(iio->extobj_hdr)) > + goto send_mal_query; > + ident_len = ntohs(iio->extobj_hdr.length) - sizeof(iio->extobj_hdr); As asked in v3: this can have negative overflow? > + status = 0; > + dev = NULL; > + switch (iio->extobj_hdr.class_type) { > + case EXT_ECHO_CTYPE_NAME: > + if (ident_len >= IFNAMSIZ) > + goto send_mal_query; > + memset(buff, 0, sizeof(buff)); > + memcpy(buff, &iio->ident.name, ident_len); > + dev = dev_get_by_name(net, buff); > + break; > + case EXT_ECHO_CTYPE_INDEX: > + if (ident_len != sizeof(iio->ident.ifindex)) > + goto send_mal_query; > + dev = dev_get_by_index(net, ntohl(iio->ident.ifindex)); > + break; > + case EXT_ECHO_CTYPE_ADDR: > + if (ident_len != sizeof(iio->ident.addr.ctype3_hdr) + > + iio->ident.addr.ctype3_hdr.addrlen) > + goto send_mal_query; > + switch (ntohs(iio->ident.addr.ctype3_hdr.afi)) { > + case ICMP_AFI_IP: > + if (ident_len != sizeof(iio->ident.addr.ctype3_hdr) + > + sizeof(struct in_addr)) > + goto send_mal_query; > + dev = ip_dev_find(net, iio->ident.addr.ip_addr.ipv4_addr.s_addr); > + break; > +#if IS_ENABLED(CONFIG_IPV6) > + case ICMP_AFI_IP6: > + if (ident_len != sizeof(iio->ident.addr.ctype3_hdr) + > + sizeof(struct in6_addr)) > + goto send_mal_query; > + rcu_read_lock(); > + dev = ipv6_stub->ipv6_dev_find(net, &iio->ident.addr.ip_addr.ipv6_addr, dev); > + if (dev) > + dev_hold(dev); > + rcu_read_unlock(); This rcu read-size critical is not needed, because the entire receive path is wrapped in such a section. See, for instance, netif_receive_skb_core. Either that, or the __in_dev_get_rcu and rcu_deference below would require a similar critical section.
On Sun, 2021-03-28 at 13:00 -0400, Willem de Bruijn wrote: > On Wed, Mar 24, 2021 at 2:20 PM Andreas Roeseler > <andreas.a.roeseler@gmail.com> wrote: > > > > > + if (!ext_hdr || !iio) > > + goto send_mal_query; > > + if (ntohs(iio->extobj_hdr.length) <= sizeof(iio- > > >extobj_hdr)) > > + goto send_mal_query; > > + ident_len = ntohs(iio->extobj_hdr.length) - sizeof(iio- > > >extobj_hdr); > > As asked in v3: this can have negative overflow? The line above checks that iio->extobj_hdr.length is greater than the size of iio->extobj_hdr.
On Mon, Mar 29, 2021 at 2:34 PM Andreas Roeseler <andreas.a.roeseler@gmail.com> wrote: > > On Sun, 2021-03-28 at 13:00 -0400, Willem de Bruijn wrote: > > On Wed, Mar 24, 2021 at 2:20 PM Andreas Roeseler > > <andreas.a.roeseler@gmail.com> wrote: > > > > > > > > + if (!ext_hdr || !iio) > > > + goto send_mal_query; > > > + if (ntohs(iio->extobj_hdr.length) <= sizeof(iio- > > > >extobj_hdr)) > > > + goto send_mal_query; > > > + ident_len = ntohs(iio->extobj_hdr.length) - sizeof(iio- > > > >extobj_hdr); > > > > As asked in v3: this can have negative overflow? > > The line above checks that iio->extobj_hdr.length is greater than the > size of iio->extobj_hdr. Completely missed that, clearly. Thanks, that's great, then.
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c index 616e2dc1c8fa..2acf7d3a66cb 100644 --- a/net/ipv4/icmp.c +++ b/net/ipv4/icmp.c @@ -971,7 +971,7 @@ static bool icmp_redirect(struct sk_buff *skb) } /* - * Handle ICMP_ECHO ("ping") requests. + * Handle ICMP_ECHO ("ping") and ICMP_EXT_ECHO ("PROBE") requests. * * RFC 1122: 3.2.2.6 MUST have an echo server that answers ICMP echo * requests. @@ -979,27 +979,118 @@ static bool icmp_redirect(struct sk_buff *skb) * included in the reply. * RFC 1812: 4.3.3.6 SHOULD have a config option for silently ignoring * echo requests, MUST have default=NOT. + * RFC 8335: 8 MUST have a config option to enable/disable ICMP + * Extended Echo Functionality, MUST be disabled by default * See also WRT handling of options once they are done and working. */ static bool icmp_echo(struct sk_buff *skb) { + struct icmp_ext_hdr *ext_hdr, _ext_hdr; + struct icmp_ext_echo_iio *iio, _iio; + struct icmp_bxm icmp_param; + struct net_device *dev; + char buff[IFNAMSIZ]; struct net *net; + u16 ident_len; + u8 status; net = dev_net(skb_dst(skb)->dev); - if (!net->ipv4.sysctl_icmp_echo_ignore_all) { - struct icmp_bxm icmp_param; + /* should there be an ICMP stat for ignored echos? */ + if (net->ipv4.sysctl_icmp_echo_ignore_all) + return true; + + icmp_param.data.icmph = *icmp_hdr(skb); + icmp_param.skb = skb; + icmp_param.offset = 0; + icmp_param.data_len = skb->len; + icmp_param.head_len = sizeof(struct icmphdr); - icmp_param.data.icmph = *icmp_hdr(skb); + if (icmp_param.data.icmph.type == ICMP_ECHO) { icmp_param.data.icmph.type = ICMP_ECHOREPLY; - icmp_param.skb = skb; - icmp_param.offset = 0; - icmp_param.data_len = skb->len; - icmp_param.head_len = sizeof(struct icmphdr); - icmp_reply(&icmp_param, skb); + goto send_reply; } - /* should there be an ICMP stat for ignored echos? */ - return true; + if (!net->ipv4.sysctl_icmp_echo_enable_probe) + return true; + /* We currently only support probing interfaces on the proxy node + * Check to ensure L-bit is set + */ + if (!(ntohs(icmp_param.data.icmph.un.echo.sequence) & 1)) + return true; + /* Clear status bits in reply message */ + icmp_param.data.icmph.un.echo.sequence &= htons(0xFF00); + icmp_param.data.icmph.type = ICMP_EXT_ECHOREPLY; + ext_hdr = skb_header_pointer(skb, 0, sizeof(_ext_hdr), &_ext_hdr); + iio = skb_header_pointer(skb, sizeof(_ext_hdr), sizeof(_iio), &_iio); + if (!ext_hdr || !iio) + goto send_mal_query; + if (ntohs(iio->extobj_hdr.length) <= sizeof(iio->extobj_hdr)) + goto send_mal_query; + ident_len = ntohs(iio->extobj_hdr.length) - sizeof(iio->extobj_hdr); + status = 0; + dev = NULL; + switch (iio->extobj_hdr.class_type) { + case EXT_ECHO_CTYPE_NAME: + if (ident_len >= IFNAMSIZ) + goto send_mal_query; + memset(buff, 0, sizeof(buff)); + memcpy(buff, &iio->ident.name, ident_len); + dev = dev_get_by_name(net, buff); + break; + case EXT_ECHO_CTYPE_INDEX: + if (ident_len != sizeof(iio->ident.ifindex)) + goto send_mal_query; + dev = dev_get_by_index(net, ntohl(iio->ident.ifindex)); + break; + case EXT_ECHO_CTYPE_ADDR: + if (ident_len != sizeof(iio->ident.addr.ctype3_hdr) + + iio->ident.addr.ctype3_hdr.addrlen) + goto send_mal_query; + switch (ntohs(iio->ident.addr.ctype3_hdr.afi)) { + case ICMP_AFI_IP: + if (ident_len != sizeof(iio->ident.addr.ctype3_hdr) + + sizeof(struct in_addr)) + goto send_mal_query; + dev = ip_dev_find(net, iio->ident.addr.ip_addr.ipv4_addr.s_addr); + break; +#if IS_ENABLED(CONFIG_IPV6) + case ICMP_AFI_IP6: + if (ident_len != sizeof(iio->ident.addr.ctype3_hdr) + + sizeof(struct in6_addr)) + goto send_mal_query; + rcu_read_lock(); + dev = ipv6_stub->ipv6_dev_find(net, &iio->ident.addr.ip_addr.ipv6_addr, dev); + if (dev) + dev_hold(dev); + rcu_read_unlock(); + break; +#endif + default: + goto send_mal_query; + } + break; + default: + goto send_mal_query; + } + if (!dev) { + icmp_param.data.icmph.code = ICMP_EXT_NO_IF; + goto send_reply; + } + /* Fill bits in reply message */ + if (dev->flags & IFF_UP) + status |= EXT_ECHOREPLY_ACTIVE; + if (__in_dev_get_rcu(dev) && __in_dev_get_rcu(dev)->ifa_list) + status |= EXT_ECHOREPLY_IPV4; + if (!list_empty(&rcu_dereference(dev->ip6_ptr)->addr_list)) + status |= EXT_ECHOREPLY_IPV6; + dev_put(dev); + icmp_param.data.icmph.un.echo.sequence |= htons(status); +send_reply: + icmp_reply(&icmp_param, skb); + return true; +send_mal_query: + icmp_param.data.icmph.code = ICMP_EXT_MAL_QUERY; + goto send_reply; } /* @@ -1088,6 +1179,16 @@ int icmp_rcv(struct sk_buff *skb) icmph = icmp_hdr(skb); ICMPMSGIN_INC_STATS(net, icmph->type); + + /* Check for ICMP Extended Echo (PROBE) messages */ + if (icmph->type == ICMP_EXT_ECHO) { + /* We can't use icmp_pointers[].handler() because it is an array of + * size NR_ICMP_TYPES + 1 (19 elements) and PROBE has code 42. + */ + success = icmp_echo(skb); + goto success_check; + } + /* * 18 is the highest 'known' ICMP type. Anything else is a mystery * @@ -1097,7 +1198,6 @@ int icmp_rcv(struct sk_buff *skb) if (icmph->type > NR_ICMP_TYPES) goto error; - /* * Parse the ICMP message */ @@ -1123,7 +1223,7 @@ int icmp_rcv(struct sk_buff *skb) } success = icmp_pointers[icmph->type].handler(skb); - +success_check: if (success) { consume_skb(skb); return NET_RX_SUCCESS; @@ -1340,6 +1440,7 @@ static int __net_init icmp_sk_init(struct net *net) /* Control parameters for ECHO replies. */ net->ipv4.sysctl_icmp_echo_ignore_all = 0; + net->ipv4.sysctl_icmp_echo_enable_probe = 0; net->ipv4.sysctl_icmp_echo_ignore_broadcasts = 1; /* Control parameter - ignore bogus broadcast responses? */
Modify the icmp_rcv function to check PROBE messages and call icmp_echo if a PROBE request is detected. Modify the existing icmp_echo function to respond ot both ping and PROBE requests. This was tested using a custom modification to the iputils package and wireshark. It supports IPV4 probing by name, ifindex, and probing by both IPV4 and IPV6 addresses. It currently does not support responding to probes off the proxy node (see RFC 8335 Section 2). The modification to the iputils package is still in development and can be found here: https://github.com/Juniper-Clinic-2020/iputils.git. It supports full sending functionality of PROBE requests, but currently does not parse the response messages, which is why Wireshark is required to verify the sent and recieved PROBE messages. The modification adds the ``-e'' flag to the command which allows the user to specify the interface identifier to query the probed host. An example usage would be <./ping -4 -e 1 [destination]> to send a PROBE request of ifindex 1 to the destination node. Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com> --- Changes: v1 -> v2: - Reorder variable declarations to follow coding style - Switch to functions such as dev_get_by_name and ip_dev_find to lookup net devices v2 -> v3: Suggested by Willem de Bruijn <willemdebruijn.kernel@gmail.com> - Add verification of incoming messages before looking up netdev Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> - Include net/addrconf.h library for ipv6_dev_find v3 -> v4: Suggested by Willem de Bruijn <willemdebruijn.kernel@gmail.com> - Use skb_header_pointer to verify fields in incoming message - Add check to ensure that extobj_hdr.length is valid - Check to ensure object payload is padded with ASCII NULL characters when probing by name, as specified by RFC 8335 - Statically allocate buff using IFNAMSIZ - Add rcu blocking around ipv6_dev_find - Use __in_dev_get_rcu to access IPV4 addresses of identified net_device - Remove check for ICMPV6 PROBE types v4 -> v5: - Statically allocate buff to size IFNAMSIZ on declaration - Remove goto probe in favor of single branch - Remove strict check for incoming PROBE requests padding to nearest 32-bit boundary Reported-by: kernel test robot <lkp@intel.com> - Use rcu_dereference when accessing i6_ptr in net_device --- net/ipv4/icmp.c | 127 +++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 114 insertions(+), 13 deletions(-)