
xfrm6: Do not use xfrm_local_error for path MTU issues in tunnels

Message ID 20150527173823.1415.96248.stgit@ahduyck-vm-fedora22 (mailing list archive)
State Not Applicable
Delegated to: Herbert Xu

Commit Message

Alexander Duyck May 27, 2015, 5:40 p.m. UTC
This change makes it so that we use icmpv6_send to report PMTU issues back
into tunnels in the case that the resulting packet is larger than the MTU
of the outgoing interface.  Previously xfrm_local_error was being used in
this case; however, this was resulting in no changes, I suspect because
the tunnel itself was being kept out of the loop.

This patch fixes PMTU problems seen on ip6_vti tunnels and is based on the
behavior seen if the socket was orphaned.  Instead of requiring the socket
to be orphaned this patch simply defaults to using icmpv6_send in the case
that the frame came through a tunnel.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
---
 net/ipv6/xfrm6_output.c |   18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Herbert Xu May 28, 2015, 4:49 a.m. UTC | #1
On Wed, May 27, 2015 at 10:40:32AM -0700, Alexander Duyck wrote:
> This change makes it so that we use icmpv6_send to report PMTU issues back
> into tunnels in the case that the resulting packet is larger than the MTU
> of the outgoing interface.  Previously xfrm_local_error was being used in
> this case, however this was resulting in no changes, I suspect due to the
> fact that the tunnel itself was being kept out of the loop.
> 
> This patch fixes PMTU problems seen on ip6_vti tunnels and is based on the
> behavior seen if the socket was orphaned.  Instead of requiring the socket
> to be orphaned this patch simply defaults to using icmpv6_send in the case
> that the frame came through a tunnel.
> 
> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>

Does this still work with normal tunnel mode and identical inner
and outer addresses? I recall we used to have a bug where in that
situation the kernel would interpret the ICMP message as a reduction
in outer MTU and thus resulting in a loop where the MTU keeps
getting smaller.

Cheers,
Steffen Klassert May 28, 2015, 4:56 a.m. UTC | #2
On Thu, May 28, 2015 at 12:49:19PM +0800, Herbert Xu wrote:
> On Wed, May 27, 2015 at 10:40:32AM -0700, Alexander Duyck wrote:
> > This change makes it so that we use icmpv6_send to report PMTU issues back
> > into tunnels in the case that the resulting packet is larger than the MTU
> > of the outgoing interface.  Previously xfrm_local_error was being used in
> > this case, however this was resulting in no changes, I suspect due to the
> > fact that the tunnel itself was being kept out of the loop.
> > 
> > This patch fixes PMTU problems seen on ip6_vti tunnels and is based on the
> > behavior seen if the socket was orphaned.  Instead of requiring the socket
> > to be orphaned this patch simply defaults to using icmpv6_send in the case
> > that the frame came through a tunnel.
> > 
> > Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
> 
> Does this still work with normal tunnel mode and identical inner
> and outer addresses? I recall we used to have a bug where in that
> situation the kernel would interpret the ICMP message as a reduction
> in outer MTU and thus resulting in a loop where the MTU keeps
> getting smaller.

Right, I think this reintroduces a bug that I fixed some years ago with
commit dd767856a36e ("xfrm6: Don't call icmpv6_send on local error")
Steffen Klassert May 28, 2015, 5:36 a.m. UTC | #3
On Wed, May 27, 2015 at 10:40:32AM -0700, Alexander Duyck wrote:
> This change makes it so that we use icmpv6_send to report PMTU issues back
> into tunnels in the case that the resulting packet is larger than the MTU
> of the outgoing interface.  Previously xfrm_local_error was being used in
> this case, however this was resulting in no changes, I suspect due to the
> fact that the tunnel itself was being kept out of the loop.
> 
> This patch fixes PMTU problems seen on ip6_vti tunnels and is based on the
> behavior seen if the socket was orphaned.  Instead of requiring the socket
> to be orphaned this patch simply defaults to using icmpv6_send in the case
> that the frame came through a tunnel.

We can use icmpv6_send() just in the case that the packet
was already transmitted by a tunnel device, otherwise we
get the bug back that I mentioned in my other mail.

Not sure if we have something to know that the packet
traversed a tunnel device. That's what I asked in the
thread 'Looking for a lost patch'.

Alexander Duyck May 28, 2015, 7:18 a.m. UTC | #4
On 05/27/2015 10:36 PM, Steffen Klassert wrote:
> On Wed, May 27, 2015 at 10:40:32AM -0700, Alexander Duyck wrote:
>> This change makes it so that we use icmpv6_send to report PMTU issues back
>> into tunnels in the case that the resulting packet is larger than the MTU
>> of the outgoing interface.  Previously xfrm_local_error was being used in
>> this case, however this was resulting in no changes, I suspect due to the
>> fact that the tunnel itself was being kept out of the loop.
>>
>> This patch fixes PMTU problems seen on ip6_vti tunnels and is based on the
>> behavior seen if the socket was orphaned.  Instead of requiring the socket
>> to be orphaned this patch simply defaults to using icmpv6_send in the case
>> that the frame came through a tunnel.
> We can use icmpv6_send() just in the case that the packet
> was already transmitted by a tunnel device, otherwise we
> get the bug back that I mentioned in my other mail.
>
> Not sure if we have something to know that the packet
> traversed a tunnel device. That's what I asked in the
> thread 'Looking for a lost patch'.

Okay, I will try to do some more digging.  From what I can tell right now
it looks like my ping attempts are getting hung up on the
xfrm_local_error in __xfrm6_output.  I wonder if we couldn't somehow
make use of the skb->cb to store a pointer to the tunnel that could be
checked to determine if we are going through a VTI or not.

- Alex
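
The skb->cb idea floated above could be sketched roughly as follows. This is a userspace model under stated assumptions, not kernel code and not something proposed in the thread: struct skb_model, struct vti_cb_model, skb_mark_tunnel() and skb_came_through_tunnel() are hypothetical names invented for illustration. The only detail taken from the kernel is that sk_buff carries a 48-byte cb scratch area.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKB_CB_SIZE 48 /* size of the cb[] scratch area in struct sk_buff */

/* Hypothetical userspace stand-in for struct sk_buff; only cb[] is modeled. */
struct skb_model {
	char cb[SKB_CB_SIZE];
};

/* Hypothetical layout a tunnel driver might store in cb[]. */
struct vti_cb_model {
	const void *tunnel; /* NULL means "did not come through a tunnel" */
};

_Static_assert(sizeof(struct vti_cb_model) <= SKB_CB_SIZE,
	       "marker must fit in the control buffer");

/* Would be called by the tunnel's transmit path before handing off to xfrm. */
static void skb_mark_tunnel(struct skb_model *skb, const void *tunnel)
{
	struct vti_cb_model m = { .tunnel = tunnel };

	memcpy(skb->cb, &m, sizeof(m));
}

/* Would be checked in the output path to choose the PMTU report method. */
static bool skb_came_through_tunnel(const struct skb_model *skb)
{
	struct vti_cb_model m;

	memcpy(&m, skb->cb, sizeof(m));
	return m.tunnel != NULL;
}
```

One caveat with any scheme like this: the control buffer is private to whichever layer currently owns the skb, so a marker written by the tunnel driver is only trustworthy if no layer in between reuses those bytes.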

Patch

diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c
index 09c76a7b474d..6f9b514d0e38 100644
--- a/net/ipv6/xfrm6_output.c
+++ b/net/ipv6/xfrm6_output.c
@@ -72,6 +72,7 @@  static int xfrm6_tunnel_check_size(struct sk_buff *skb)
 {
 	int mtu, ret = 0;
 	struct dst_entry *dst = skb_dst(skb);
+	struct xfrm_state *x = dst->xfrm;
 
 	mtu = dst_mtu(dst);
 	if (mtu < IPV6_MIN_MTU)
@@ -82,7 +83,7 @@  static int xfrm6_tunnel_check_size(struct sk_buff *skb)
 
 		if (xfrm6_local_dontfrag(skb))
 			xfrm6_local_rxpmtu(skb, mtu);
-		else if (skb->sk)
+		else if (skb->sk && x->props.mode != XFRM_MODE_TUNNEL)
 			xfrm_local_error(skb, mtu);
 		else
 			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
@@ -149,11 +150,16 @@  static int __xfrm6_output(struct sock *sk, struct sk_buff *skb)
 	else
 		mtu = dst_mtu(skb_dst(skb));
 
-	if (skb->len > mtu && xfrm6_local_dontfrag(skb)) {
-		xfrm6_local_rxpmtu(skb, mtu);
-		return -EMSGSIZE;
-	} else if (!skb->ignore_df && skb->len > mtu && skb->sk) {
-		xfrm_local_error(skb, mtu);
+	if (!skb->ignore_df && skb->len > mtu) {
+		skb->dev = dst->dev;
+
+		if (xfrm6_local_dontfrag(skb))
+			xfrm6_local_rxpmtu(skb, mtu);
+		else if (skb->sk && x->props.mode != XFRM_MODE_TUNNEL)
+			xfrm_local_error(skb, mtu);
+		else
+			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
+
 		return -EMSGSIZE;
 	}
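
For readers following the review, the branch order that the second hunk gives __xfrm6_output can be modeled in plain userspace C. This is only a sketch: struct pkt_model, enum pmtu_action and pmtu_action_for() are hypothetical names that stand in for the skb and xfrm state fields the patch consults, and nothing here calls into the kernel.

```c
#include <stdbool.h>

/* Which reporting path the patched check would take (hypothetical enum). */
enum pmtu_action {
	PMTU_NONE,          /* packet fits, or skb->ignore_df is set */
	PMTU_RXPMTU,        /* xfrm6_local_rxpmtu(skb, mtu) */
	PMTU_LOCAL_ERROR,   /* xfrm_local_error(skb, mtu) */
	PMTU_ICMPV6_TOOBIG, /* icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu) */
};

/* Hypothetical stand-in for the fields the patch reads. */
struct pkt_model {
	unsigned int len;    /* skb->len */
	bool ignore_df;      /* skb->ignore_df */
	bool local_dontfrag; /* result of xfrm6_local_dontfrag(skb) */
	bool has_sk;         /* skb->sk != NULL */
	bool tunnel_mode;    /* x->props.mode == XFRM_MODE_TUNNEL */
};

/* Mirrors the if/else-if/else chain added to __xfrm6_output. */
static enum pmtu_action pmtu_action_for(const struct pkt_model *p,
					unsigned int mtu)
{
	if (p->ignore_df || p->len <= mtu)
		return PMTU_NONE;
	if (p->local_dontfrag)
		return PMTU_RXPMTU;
	if (p->has_sk && !p->tunnel_mode)
		return PMTU_LOCAL_ERROR;
	return PMTU_ICMPV6_TOOBIG;
}
```

In this model, a tunnel-mode packet with an attached socket lands on PMTU_ICMPV6_TOOBIG, which is the behavior change the reviewers are concerned about for ordinary tunnel mode with identical inner and outer addresses.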