diff mbox

vti6: Add pmtu handling to vti6_xmit.

Message ID 20160217070805.GA316@gauss.secunet.com (mailing list archive)
State Not Applicable
Delegated to: Herbert Xu
Headers show

Commit Message

Steffen Klassert Feb. 17, 2016, 7:08 a.m. UTC
On Wed, Feb 10, 2016 at 01:50:20AM +0000, Mark McKinstry wrote:
> >
> > So this version is slightly modified to cover the IPv4 case in addition to
> > the IPv6 case.  With this patch I was able to run netperf over either an
> > IPv4 or IPv6 address routed over the ip6_vti tunnel.
> We have the same issue. When we do a local ping to a remote device over
> a v4 vti tunnel and an intermediate device has a low mtu, pmtu
> discovery reduces the route's pmtu, and ping fails because it does not
> handle the local error message generated by xfrm4_tunnel_check_size().
> Your patch fixes our issue for v6 vti tunnels, but the issue still
> exists for v4 tunnels. Is there any particular reason this patch was
> not delivered for v4 tunnels too - i.e. in vti_xmit()?

I don't remember why we fixed it just for ipv6, we probably need
a similar patch for ipv4.

Does the patch below help (compile tested only)?

Subject: [PATCH] vti: Add pmtu handling to vti_xmit.

We currently rely on the PMTU discovery of xfrm.
However if a packet is localy sent, the PMTU mechanism
of xfrm tries to to local socket notification what
might not work for applications like ping that don't
check for this. So add pmtu handling to vti_xmit to
report MTU changes immediately.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv4/ip_vti.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

Comments

Mark McKinstry Feb. 18, 2016, 1:40 a.m. UTC | #1
On 17/02/16 20:08, Steffen Klassert wrote:
> On Wed, Feb 10, 2016 at 01:50:20AM +0000, Mark McKinstry wrote:
>>> So this version is slightly modified to cover the IPv4 case in addition to
>>> the IPv6 case.  With this patch I was able to run netperf over either an
>>> IPv4 or IPv6 address routed over the ip6_vti tunnel.
>> We have the same issue. When we do a local ping to a remote device over
>> a v4 vti tunnel and an intermediate device has a low mtu, pmtu
>> discovery reduces the route's pmtu, and ping fails because it does not
>> handle the local error message generated by xfrm4_tunnel_check_size().
>> Your patch fixes our issue for v6 vti tunnels, but the issue still
>> exists for v4 tunnels. Is there any particular reason this patch was
>> not delivered for v4 tunnels too - i.e. in vti_xmit()?
> I don't remember why we fixed it just for ipv6, we probably need
> a similar patch for ipv4.
>
> Does the patch below help (compile tested only)?
>
> Subject: [PATCH] vti: Add pmtu handling to vti_xmit.
>
> We currently rely on the PMTU discovery of xfrm.
> However if a packet is localy sent, the PMTU mechanism
> of xfrm tries to to local socket notification what
> might not work for applications like ping that don't
> check for this. So add pmtu handling to vti_xmit to
> report MTU changes immediately.
>
> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
> ---
>   net/ipv4/ip_vti.c | 13 +++++++++++++
>   1 file changed, 13 insertions(+)
>
> diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
> index 5cf10b7..6862305 100644
> --- a/net/ipv4/ip_vti.c
> +++ b/net/ipv4/ip_vti.c
> @@ -156,6 +156,7 @@ static netdev_tx_t vti_xmit(struct sk_buff *skb, struct net_device *dev,
>   	struct dst_entry *dst = skb_dst(skb);
>   	struct net_device *tdev;	/* Device to other host */
>   	int err;
> +	int mtu;
>   
>   	if (!dst) {
>   		dev->stats.tx_carrier_errors++;
> @@ -196,6 +197,18 @@ static netdev_tx_t vti_xmit(struct sk_buff *skb, struct net_device *dev,
>   	skb_dst_set(skb, dst);
>   	skb->dev = skb_dst(skb)->dev;
>   
> +	mtu = dst_mtu(dst);
> +	if (!skb->ignore_df && skb->len > mtu) {
> +		skb_dst(skb)->ops->update_pmtu(dst, NULL, skb, mtu);
> +		if (skb->protocol == htons(ETH_P_IP))
> +			icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
> +				  htonl(mtu));
> +		else
> +			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
> +
> +		return -EMSGSIZE;
> +	}
> +
>   	err = dst_output(tunnel->net, skb->sk, skb);
>   	if (net_xmit_eval(err) == 0)
>   		err = skb->len;
This patch fixes our issue, thanks. In our scenario the tunnel path MTU 
now gets updated so that subsequent large packets sent over the tunnel 
get fragmented correctly.--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steffen Klassert Feb. 18, 2016, 12:19 p.m. UTC | #2
On Thu, Feb 18, 2016 at 01:40:00AM +0000, Mark McKinstry wrote:
> This patch fixes our issue, thanks. In our scenario the tunnel path MTU 
> now gets updated so that subsequent large packets sent over the tunnel 
> get fragmented correctly.

I've applied this patch to the ipsec tree now.
Thanks for testing!
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mark McKinstry Feb. 24, 2016, 9:37 p.m. UTC | #3
On 19/02/16 01:19, Steffen Klassert wrote:
> On Thu, Feb 18, 2016 at 01:40:00AM +0000, Mark McKinstry wrote:
>> This patch fixes our issue, thanks. In our scenario the tunnel path MTU
>> now gets updated so that subsequent large packets sent over the tunnel
>> get fragmented correctly.
> I've applied this patch to the ipsec tree now.
> Thanks for testing!
I spoke too soon. Upon further testing with this patch we have found it 
causes
a skt buffer leak. This is problematic for us and can cause memory 
exhaustion in
one of our test scenarios that has an IPv4 IPsec tunnel over a PPP link. 
Also
the patch's -EMSGSIZE return value appears to be invalid because vti_xmit()
should be returning a type netdev_tx_t (NETDEV_TX_OK etc). It looks to 
me that
this patch should really be doing a goto tx_error rather than doing an early
return with -EMSGSIZE. This would result in the skt buffer being freed,
NETDEV_TX_OK being returned (thus indicating vti_xmit() "took care of 
packet"),
and the tx_errors counter being incremented (which seems like a reasonable
thing to do).

I think the original IPv6 patch probably has the same issues, and could be
causing a DOS attack vulnerability in recent Linux releases. If this patch's
code gets hit for every received packet then the box's memory will soon be
exhausted - e.g. a rogue device sends a stream of largish pkts through a box
with a vti interface, and ignores every ICMPV6_PKT_TOOBIG pkt sent back 
to it.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steffen Klassert Feb. 25, 2016, 11:21 a.m. UTC | #4
On Wed, Feb 24, 2016 at 09:37:39PM +0000, Mark McKinstry wrote:
> On 19/02/16 01:19, Steffen Klassert wrote:
> > On Thu, Feb 18, 2016 at 01:40:00AM +0000, Mark McKinstry wrote:
> >> This patch fixes our issue, thanks. In our scenario the tunnel path MTU
> >> now gets updated so that subsequent large packets sent over the tunnel
> >> get fragmented correctly.
> > I've applied this patch to the ipsec tree now.
> > Thanks for testing!
> I spoke too soon. Upon further testing with this patch we have found it 
> causes
> a skt buffer leak. This is problematic for us and can cause memory 
> exhaustion in
> one of our test scenarios that has an IPv4 IPsec tunnel over a PPP link. 
> Also
> the patch's -EMSGSIZE return value appears to be invalid because vti_xmit()
> should be returning a type netdev_tx_t (NETDEV_TX_OK etc). It looks to 
> me that
> this patch should really be doing a goto tx_error rather than doing an early
> return with -EMSGSIZE. This would result in the skt buffer being freed,
> NETDEV_TX_OK being returned (thus indicating vti_xmit() "took care of 
> packet"),
> and the tx_errors counter being incremented (which seems like a reasonable
> thing to do).

Yes, you are right here. 

> 
> I think the original IPv6 patch probably has the same issues, and could be
> causing a DOS attack vulnerability in recent Linux releases.

We need to fix both, ipv4 and ipv6. I'll care for it,
thanks for the report.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index 5cf10b7..6862305 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -156,6 +156,7 @@  static netdev_tx_t vti_xmit(struct sk_buff *skb, struct net_device *dev,
 	struct dst_entry *dst = skb_dst(skb);
 	struct net_device *tdev;	/* Device to other host */
 	int err;
+	int mtu;
 
 	if (!dst) {
 		dev->stats.tx_carrier_errors++;
@@ -196,6 +197,18 @@  static netdev_tx_t vti_xmit(struct sk_buff *skb, struct net_device *dev,
 	skb_dst_set(skb, dst);
 	skb->dev = skb_dst(skb)->dev;
 
+	mtu = dst_mtu(dst);
+	if (!skb->ignore_df && skb->len > mtu) {
+		skb_dst(skb)->ops->update_pmtu(dst, NULL, skb, mtu);
+		if (skb->protocol == htons(ETH_P_IP))
+			icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
+				  htonl(mtu));
+		else
+			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
+
+		return -EMSGSIZE;
+	}
+
 	err = dst_output(tunnel->net, skb->sk, skb);
 	if (net_xmit_eval(err) == 0)
 		err = skb->len;