diff mbox series

[net,2/3] udp: check encap socket in __udp_lib_err

Message ID 20210712005554.26948-3-vfedorenko@novek.ru (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series Fix PMTU for ESP-in-UDP encapsulation

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net
netdev/subject_prefix success Link
netdev/cc_maintainers fail 1 blamed authors not CCed: marcelo.leitner@gmail.com; 2 maintainers not CCed: yoshfuji@linux-ipv6.org marcelo.leitner@gmail.com
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 7 this patch: 7
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch warning WARNING: line length of 96 exceeds 80 columns
netdev/build_allmodconfig_warn success Errors and warnings before: 7 this patch: 7
netdev/header_inline success Link

Commit Message

Vadim Fedorenko July 12, 2021, 12:55 a.m. UTC
Commit d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err")
added checks for encapsulated sockets, but it broke cases where the
encapsulation has no encap_err_lookup implementation, i.e. ESP in
UDP encapsulation. Fix it by calling encap_err_lookup only if the socket
implements this method; otherwise treat it as a valid socket.

Fixes: d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err")
Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
---
 net/ipv4/udp.c | 24 +++++++++++++++++++++++-
 net/ipv6/udp.c | 22 ++++++++++++++++++++++
 2 files changed, 45 insertions(+), 1 deletion(-)

Comments

Willem de Bruijn July 12, 2021, 7:59 a.m. UTC | #1
On Mon, Jul 12, 2021 at 2:56 AM Vadim Fedorenko <vfedorenko@novek.ru> wrote:
>
> Commit d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err")
> added checks for encapsulated sockets but it broke cases when there is
> no implementation of encap_err_lookup for encapsulation, i.e. ESP in
> UDP encapsulation. Fix it by calling encap_err_lookup only if socket
> implements this method otherwise treat it as legal socket.
>
> Fixes: d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err")
> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
> ---
>  net/ipv4/udp.c | 24 +++++++++++++++++++++++-
>  net/ipv6/udp.c | 22 ++++++++++++++++++++++
>  2 files changed, 45 insertions(+), 1 deletion(-)

This duplicates __udp4_lib_err_encap and __udp6_lib_err_encap.

Can we avoid open-coding that logic multiple times?
Paolo Abeni July 12, 2021, 9:07 a.m. UTC | #2
Hello,

On Mon, 2021-07-12 at 03:55 +0300, Vadim Fedorenko wrote:
> Commit d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err")
> added checks for encapsulated sockets but it broke cases when there is
> no implementation of encap_err_lookup for encapsulation, i.e. ESP in
> UDP encapsulation. Fix it by calling encap_err_lookup only if socket
> implements this method otherwise treat it as legal socket.
> 
> Fixes: d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err")
> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
> ---
>  net/ipv4/udp.c | 24 +++++++++++++++++++++++-
>  net/ipv6/udp.c | 22 ++++++++++++++++++++++
>  2 files changed, 45 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index e5cb7fedfbcd..4980e0f19990 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -707,7 +707,29 @@ int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
>  	sk = __udp4_lib_lookup(net, iph->daddr, uh->dest,
>  			       iph->saddr, uh->source, skb->dev->ifindex,
>  			       inet_sdif(skb), udptable, NULL);
> -	if (!sk || udp_sk(sk)->encap_enabled) {
> +	if (sk && udp_sk(sk)->encap_enabled) {
> +		int (*lookup)(struct sock *sk, struct sk_buff *skb);
> +
> +		lookup = READ_ONCE(udp_sk(sk)->encap_err_lookup);
> +		if (lookup) {
> +			int network_offset, transport_offset;
> +
> +			network_offset = skb_network_offset(skb);
> +			transport_offset = skb_transport_offset(skb);
> +
> +			/* Network header needs to point to the outer IPv4 header inside ICMP */
> +			skb_reset_network_header(skb);
> +
> +			/* Transport header needs to point to the UDP header */
> +			skb_set_transport_header(skb, iph->ihl << 2);
> +			if (lookup(sk, skb))
> +				sk = NULL;
> +			skb_set_transport_header(skb, transport_offset);
> +			skb_set_network_header(skb, network_offset);
> +		}
> +	}
> +
> +	if (!sk) {
>  		/* No socket for error: try tunnels before discarding */
>  		sk = ERR_PTR(-ENOENT);
>  		if (static_branch_unlikely(&udp_encap_needed_key)) {
> diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> index 798916d2e722..ed49a8589d9f 100644
> --- a/net/ipv6/udp.c
> +++ b/net/ipv6/udp.c
> @@ -558,6 +558,28 @@ int __udp6_lib_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
>  
>  	sk = __udp6_lib_lookup(net, daddr, uh->dest, saddr, uh->source,
>  			       inet6_iif(skb), inet6_sdif(skb), udptable, NULL);
> +	if (sk && udp_sk(sk)->encap_enabled) {
> +		int (*lookup)(struct sock *sk, struct sk_buff *skb);
> +
> +		lookup = READ_ONCE(udp_sk(sk)->encap_err_lookup);
> +		if (lookup) {
> +			int network_offset, transport_offset;
> +
> +			network_offset = skb_network_offset(skb);
> +			transport_offset = skb_transport_offset(skb);
> +
> +			/* Network header needs to point to the outer IPv6 header inside ICMP */
> +			skb_reset_network_header(skb);
> +
> +			/* Transport header needs to point to the UDP header */
> +			skb_set_transport_header(skb, offset);
> +			if (lookup(sk, skb))
> +				sk = NULL;
> +			skb_set_transport_header(skb, transport_offset);
> +			skb_set_network_header(skb, network_offset);
> +		}
> +	}

I can't follow this code. I guess that before d26796ae5894,
__udp6_lib_err() used to invoke ICMP processing on the ESP in UDP
socket, and after d26796ae5894 'sk' was cleared
by __udp4_lib_err_encap(), is that correct?

After this patch, the above chunk will not clear 'sk' for packets
targeting ESP in UDP sockets, but AFAICS we will still enter the
following conditional, preserving the current behavior - no ICMP
processing. 

Can you please clarify?

Why can't you use something like the following instead?

---
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index c0f9f3260051..96a3b640e4da 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -707,7 +707,7 @@ int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
        sk = __udp4_lib_lookup(net, iph->daddr, uh->dest,
                               iph->saddr, uh->source, skb->dev->ifindex,
                               inet_sdif(skb), udptable, NULL);
-       if (!sk || udp_sk(sk)->encap_type) {
+       if (!sk || READ_ONCE(udp_sk(sk)->encap_err_lookup)) {
                /* No socket for error: try tunnels before discarding */
                sk = ERR_PTR(-ENOENT);
                if (static_branch_unlikely(&udp_encap_needed_key)) {

---

Thanks!

/P
Vadim Fedorenko July 12, 2021, 12:09 p.m. UTC | #3
On 12.07.2021 08:59, Willem de Bruijn wrote:
> On Mon, Jul 12, 2021 at 2:56 AM Vadim Fedorenko <vfedorenko@novek.ru> wrote:
>>
>> Commit d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err")
>> added checks for encapsulated sockets but it broke cases when there is
>> no implementation of encap_err_lookup for encapsulation, i.e. ESP in
>> UDP encapsulation. Fix it by calling encap_err_lookup only if socket
>> implements this method otherwise treat it as legal socket.
>>
>> Fixes: d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err")
>> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
>> ---
>>   net/ipv4/udp.c | 24 +++++++++++++++++++++++-
>>   net/ipv6/udp.c | 22 ++++++++++++++++++++++
>>   2 files changed, 45 insertions(+), 1 deletion(-)
> 
> This duplicates __udp4_lib_err_encap and __udp6_lib_err_encap.
> 
> Can we avoid open-coding that logic multiple times?
> 
Yes, sure. I was thinking the same but wanted to get feedback
on the approach itself. I will try to factor the duplicated parts
out into helpers in the next round.
Vadim Fedorenko July 12, 2021, 12:45 p.m. UTC | #4
On 12.07.2021 10:07, Paolo Abeni wrote:
> Hello,
> 
> On Mon, 2021-07-12 at 03:55 +0300, Vadim Fedorenko wrote:
>> Commit d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err")
>> added checks for encapsulated sockets but it broke cases when there is
>> no implementation of encap_err_lookup for encapsulation, i.e. ESP in
>> UDP encapsulation. Fix it by calling encap_err_lookup only if socket
>> implements this method otherwise treat it as legal socket.
>>
>> Fixes: d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err")
>> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
>> ---
>>   net/ipv4/udp.c | 24 +++++++++++++++++++++++-
>>   net/ipv6/udp.c | 22 ++++++++++++++++++++++
>>   2 files changed, 45 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
>> index e5cb7fedfbcd..4980e0f19990 100644
>> --- a/net/ipv4/udp.c
>> +++ b/net/ipv4/udp.c
>> @@ -707,7 +707,29 @@ int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
>>   	sk = __udp4_lib_lookup(net, iph->daddr, uh->dest,
>>   			       iph->saddr, uh->source, skb->dev->ifindex,
>>   			       inet_sdif(skb), udptable, NULL);
>> -	if (!sk || udp_sk(sk)->encap_enabled) {
>> +	if (sk && udp_sk(sk)->encap_enabled) {
>> +		int (*lookup)(struct sock *sk, struct sk_buff *skb);
>> +
>> +		lookup = READ_ONCE(udp_sk(sk)->encap_err_lookup);
>> +		if (lookup) {
>> +			int network_offset, transport_offset;
>> +
>> +			network_offset = skb_network_offset(skb);
>> +			transport_offset = skb_transport_offset(skb);
>> +
>> +			/* Network header needs to point to the outer IPv4 header inside ICMP */
>> +			skb_reset_network_header(skb);
>> +
>> +			/* Transport header needs to point to the UDP header */
>> +			skb_set_transport_header(skb, iph->ihl << 2);
>> +			if (lookup(sk, skb))
>> +				sk = NULL;
>> +			skb_set_transport_header(skb, transport_offset);
>> +			skb_set_network_header(skb, network_offset);
>> +		}
>> +	}
>> +
>> +	if (!sk) {
>>   		/* No socket for error: try tunnels before discarding */
>>   		sk = ERR_PTR(-ENOENT);
>>   		if (static_branch_unlikely(&udp_encap_needed_key)) {
>> diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
>> index 798916d2e722..ed49a8589d9f 100644
>> --- a/net/ipv6/udp.c
>> +++ b/net/ipv6/udp.c
>> @@ -558,6 +558,28 @@ int __udp6_lib_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
>>   
>>   	sk = __udp6_lib_lookup(net, daddr, uh->dest, saddr, uh->source,
>>   			       inet6_iif(skb), inet6_sdif(skb), udptable, NULL);
>> +	if (sk && udp_sk(sk)->encap_enabled) {
>> +		int (*lookup)(struct sock *sk, struct sk_buff *skb);
>> +
>> +		lookup = READ_ONCE(udp_sk(sk)->encap_err_lookup);
>> +		if (lookup) {
>> +			int network_offset, transport_offset;
>> +
>> +			network_offset = skb_network_offset(skb);
>> +			transport_offset = skb_transport_offset(skb);
>> +
>> +			/* Network header needs to point to the outer IPv6 header inside ICMP */
>> +			skb_reset_network_header(skb);
>> +
>> +			/* Transport header needs to point to the UDP header */
>> +			skb_set_transport_header(skb, offset);
>> +			if (lookup(sk, skb))
>> +				sk = NULL;
>> +			skb_set_transport_header(skb, transport_offset);
>> +			skb_set_network_header(skb, network_offset);
>> +		}
>> +	}
> 
> I can't follow this code. I guess that before d26796ae5894,
> __udp6_lib_err() used to invoke ICMP processing on the ESP in UDP
> socket, and after d26796ae5894 'sk' was cleared
> by __udp4_lib_err_encap(), is that correct?

Actually it was cleared just before __udp4_lib_err_encap(), and after
that we completely lose the information about the socket found by
__udp4_lib_lookup(), because __udp4_lib_err_encap() uses a different
combination of ports (source and destination ports are exchanged) and
could find a different socket.
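
A minimal userspace sketch of the point above, not kernel code: it
assumes sockets are matched on local port only, and port_sock,
port_udphdr, lookup_direct and lookup_swapped are hypothetical names
made up for illustration. It shows how a lookup with exchanged ports
can land on a different socket than the first lookup did.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified socket: matched on local port only. */
struct port_sock {
	const char *name;
	uint16_t local_port;
};

/* Ports of the UDP header quoted in the ICMP error (the packet we sent). */
struct port_udphdr {
	uint16_t source;
	uint16_t dest;
};

static const struct port_sock *port_lookup(const struct port_sock *tbl,
					   size_t n, uint16_t local_port)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tbl[i].local_port == local_port)
			return &tbl[i];
	return NULL;
}

/* Models the first lookup: our local port is the quoted source port. */
static const struct port_sock *lookup_direct(const struct port_sock *tbl,
					     size_t n,
					     const struct port_udphdr *uh)
{
	return port_lookup(tbl, n, uh->source);
}

/* Models the encap lookup: source and destination ports are exchanged. */
static const struct port_sock *lookup_swapped(const struct port_sock *tbl,
					      size_t n,
					      const struct port_udphdr *uh)
{
	return port_lookup(tbl, n, uh->dest);
}

/* Usage: two local sockets, and a quoted header touching both ports. */
void demo_swapped_lookup(void)
{
	const struct port_sock tbl[] = {
		{ "esp-in-udp", 4500 },
		{ "tunnel", 4789 },
	};
	const struct port_udphdr uh = { .source = 4500, .dest = 4789 };
	const struct port_sock *a = lookup_direct(tbl, 2, &uh);
	const struct port_sock *b = lookup_swapped(tbl, 2, &uh);

	/* Both lookups succeed, but they return different sockets. */
	assert(a && b && a != b);
}
```

In this model the socket found first is simply forgotten once the
swapped lookup runs, which is the loss of information described above.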

> 
> After this patch, the above chunk will not clear 'sk' for packets
> targeting ESP in UDP sockets, but AFAICS we will still enter the
> following conditional, preserving the current behavior - no ICMP
> processing.

We will not enter the following conditional in the ESP in UDP case because
there is no longer a check for encap_type or encap_enabled. We enter it
only when there is no UDP socket, as it was before d26796ae5894. But we
still have to check whether the socket found by __udp4_lib_lookup() is the
correct one for the received ICMP packet; that's why I added the code
around encap_err_lookup.

I may be missing something, but d26796ae5894 doesn't actually explain
which particular situation should be avoided by this additional check,
and no tests were added to reproduce the problem. If you can
explain it a bit more, it would greatly help me improve the fix.

Thanks
> 
> Can you please clarify?
> 
> Why can't you use something alike the following instead?
> 
> ---
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index c0f9f3260051..96a3b640e4da 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -707,7 +707,7 @@ int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
>          sk = __udp4_lib_lookup(net, iph->daddr, uh->dest,
>                                 iph->saddr, uh->source, skb->dev->ifindex,
>                                 inet_sdif(skb), udptable, NULL);
> -       if (!sk || udp_sk(sk)->encap_type) {
> +       if (!sk || READ_ONCE(udp_sk(sk)->encap_err_lookup)) {
>                  /* No socket for error: try tunnels before discarding */
>                  sk = ERR_PTR(-ENOENT);
>                  if (static_branch_unlikely(&udp_encap_needed_key)) {
> 
> ---
> 
> Thanks!
> 
> /P
>
Paolo Abeni July 12, 2021, 1:37 p.m. UTC | #5
On Mon, 2021-07-12 at 13:45 +0100, Vadim Fedorenko wrote:
> 
> > After this patch, the above chunk will not clear 'sk' for packets
> > targeting ESP in UDP sockets, but AFAICS we will still enter the
> > following conditional, preserving the current behavior - no ICMP
> > processing.
> 
> We will not enter following conditional for ESP in UDP case because
> there is no more check for encap_type or encap_enabled. 

I see. You have a bug in the ipv6 code-path. With your patch applied:

---
	sk = __udp6_lib_lookup(net, daddr, uh->dest, saddr, uh->source,
			       inet6_iif(skb), inet6_sdif(skb), udptable, NULL);
	if (sk && udp_sk(sk)->encap_enabled) {
		//...
	}

	if (!sk || udp_sk(sk)->encap_enabled) {
		// can still enter here...
---	
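
A minimal userspace sketch of the control flow above, not kernel code:
toy_sock and takes_tunnel_branch are hypothetical names, and only the
two fields relevant here are modelled. It shows why a socket with
encap_enabled set but no encap_err_lookup is still diverted when the
second condition re-checks encap_enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two struct udp_sock fields involved. */
struct toy_sock {
	bool encap_enabled;
	/* Non-zero return means "not mine"; NULL models ESP in UDP. */
	int (*encap_err_lookup)(struct toy_sock *sk);
};

/*
 * Returns true if the "no socket for error: try tunnels" branch is taken.
 * recheck_encap_enabled selects the second condition: true models the
 * IPv6 hunk as posted, false models the IPv4 hunk.
 */
static bool takes_tunnel_branch(struct toy_sock *sk, bool recheck_encap_enabled)
{
	if (sk && sk->encap_enabled) {
		int (*lookup)(struct toy_sock *sk) = sk->encap_err_lookup;

		if (lookup && lookup(sk))
			sk = NULL;
	}

	if (recheck_encap_enabled)
		return !sk || sk->encap_enabled;
	return !sk;
}

/* Usage: an ESP-in-UDP-like socket, encap enabled but no lookup hook. */
void demo_ipv6_recheck(void)
{
	struct toy_sock esp = { .encap_enabled = true, .encap_err_lookup = NULL };

	assert(takes_tunnel_branch(&esp, true));   /* IPv6: still diverted */
	assert(!takes_tunnel_branch(&esp, false)); /* IPv4: socket is kept */
}
```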

> I maybe missing something but d26796ae5894 doesn't actually explain
> which particular situation should be avoided by this additional check
> and no tests were added to simply reproduce the problem. If you can
> explain it a bit more it would greatly help me to improve the fix.

Xin knows better, but AFAICS it used to cover the situation you
explicitly test in patch 3/3 - an incoming packet with
src-port == dst-port == tunnel port - e.g. for vxlan tunnels.

> > Why can't you use something alike the following instead?
> > 
> > ---
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > index c0f9f3260051..96a3b640e4da 100644
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
> > @@ -707,7 +707,7 @@ int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
> >          sk = __udp4_lib_lookup(net, iph->daddr, uh->dest,
> >                                 iph->saddr, uh->source, skb->dev->ifindex,
> >                                 inet_sdif(skb), udptable, NULL);
> > -       if (!sk || udp_sk(sk)->encap_type) {
> > +       if (!sk || READ_ONCE(udp_sk(sk)->encap_err_lookup)) {
> >                  /* No socket for error: try tunnels before discarding */
> >                  sk = ERR_PTR(-ENOENT);
> >                  if (static_branch_unlikely(&udp_encap_needed_key)) {
> > 
> > ---

Could you please have a look at the above ?

Thanks!

/P
Vadim Fedorenko July 12, 2021, 2:05 p.m. UTC | #6
On 12.07.2021 14:37, Paolo Abeni wrote:
> On Mon, 2021-07-12 at 13:45 +0100, Vadim Fedorenko wrote:
>>
>>> After this patch, the above chunk will not clear 'sk' for packets
>>> targeting ESP in UDP sockets, but AFAICS we will still enter the
>>> following conditional, preserving the current behavior - no ICMP
>>> processing.
>>
>> We will not enter following conditional for ESP in UDP case because
>> there is no more check for encap_type or encap_enabled.
> 
> I see. You have a bug in the ipv6 code-path. With your patch applied:
> 
> ---
>   	sk = __udp6_lib_lookup(net, daddr, uh->dest, saddr, uh->source,
>                                 inet6_iif(skb), inet6_sdif(skb), udptable, NULL);
>          if (sk && udp_sk(sk)->encap_enabled) {
> 		//...
>          }
> 
>          if (!sk || udp_sk(sk)->encap_enabled) {
> 	// can still enter here...
> ---	
> 

Oh, my bad, thanks for catching this!

>> I maybe missing something but d26796ae5894 doesn't actually explain
>> which particular situation should be avoided by this additional check
>> and no tests were added to simply reproduce the problem. If you can
>> explain it a bit more it would greatly help me to improve the fix.
> 
> Xin knows better, but AFAICS it used to cover the situation you
> explicitly tests in patch 3/3 - incoming packet with src-port == dst-
> port == tunnel port - for e.g. vxlan tunnels.
>

Ok, so my assumption was like yours, that's good.

>>> Why can't you use something alike the following instead?
>>>
>>> ---
>>> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
>>> index c0f9f3260051..96a3b640e4da 100644
>>> --- a/net/ipv4/udp.c
>>> +++ b/net/ipv4/udp.c
>>> @@ -707,7 +707,7 @@ int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
>>>           sk = __udp4_lib_lookup(net, iph->daddr, uh->dest,
>>>                                  iph->saddr, uh->source, skb->dev->ifindex,
>>>                                  inet_sdif(skb), udptable, NULL);
>>> -       if (!sk || udp_sk(sk)->encap_type) {
>>> +       if (!sk || READ_ONCE(udp_sk(sk)->encap_err_lookup)) {
>>>                   /* No socket for error: try tunnels before discarding */
>>>                   sk = ERR_PTR(-ENOENT);
>>>                   if (static_branch_unlikely(&udp_encap_needed_key)) {
>>>
>>> ---
> 
> Could you please have a look at the above ?
> 
Sure. The main problem I see here is that udp4_lib_lookup in udp_lib_err_encap
could return a different socket because of the different source and destination
ports, and in this case we will never check the correctness of the originally
found socket, i.e. encap_err_lookup will never be called and the ICMP
notification will never be applied to that socket even if it passes the checks.
My point is that it's simpler to explicitly check the socket that was found than
to rely on the result of udp4_lib_lookup with different inputs, and to leave the
case of no socket as it was before d26796ae5894.

If that's OK, I will unify the check code as Willem suggested and send v2.
Paolo Abeni July 12, 2021, 2:09 p.m. UTC | #7
On Mon, 2021-07-12 at 15:05 +0100, Vadim Fedorenko wrote:
> On 12.07.2021 14:37, Paolo Abeni wrote:
> > On Mon, 2021-07-12 at 13:45 +0100, Vadim Fedorenko wrote:
> > > > After this patch, the above chunk will not clear 'sk' for packets
> > > > targeting ESP in UDP sockets, but AFAICS we will still enter the
> > > > following conditional, preserving the current behavior - no ICMP
> > > > processing.
> > > 
> > > We will not enter following conditional for ESP in UDP case because
> > > there is no more check for encap_type or encap_enabled.
> > 
> > I see. You have a bug in the ipv6 code-path. With your patch applied:
> > 
> > ---
> >   	sk = __udp6_lib_lookup(net, daddr, uh->dest, saddr, uh->source,
> >                                 inet6_iif(skb), inet6_sdif(skb), udptable, NULL);
> >          if (sk && udp_sk(sk)->encap_enabled) {
> > 		//...
> >          }
> > 
> >          if (!sk || udp_sk(sk)->encap_enabled) {
> > 	// can still enter here...
> > ---	
> > 
> 
> Oh, my bad, thanks for catching this!
> 
> > > I maybe missing something but d26796ae5894 doesn't actually explain
> > > which particular situation should be avoided by this additional check
> > > and no tests were added to simply reproduce the problem. If you can
> > > explain it a bit more it would greatly help me to improve the fix.
> > 
> > Xin knows better, but AFAICS it used to cover the situation you
> > explicitly tests in patch 3/3 - incoming packet with src-port == dst-
> > port == tunnel port - for e.g. vxlan tunnels.
> > 
> 
> Ok, so my assumption was like yours, that's good.
> 
> > > > Why can't you use something alike the following instead?
> > > > 
> > > > ---
> > > > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > > > index c0f9f3260051..96a3b640e4da 100644
> > > > --- a/net/ipv4/udp.c
> > > > +++ b/net/ipv4/udp.c
> > > > @@ -707,7 +707,7 @@ int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
> > > >           sk = __udp4_lib_lookup(net, iph->daddr, uh->dest,
> > > >                                  iph->saddr, uh->source, skb->dev->ifindex,
> > > >                                  inet_sdif(skb), udptable, NULL);
> > > > -       if (!sk || udp_sk(sk)->encap_type) {
> > > > +       if (!sk || READ_ONCE(udp_sk(sk)->encap_err_lookup)) {
> > > >                   /* No socket for error: try tunnels before discarding */
> > > >                   sk = ERR_PTR(-ENOENT);
> > > >                   if (static_branch_unlikely(&udp_encap_needed_key)) {
> > > > 
> > > > ---
> > 
> > Could you please have a look at the above ?
> > 
> Sure. The main problem I see here is that udp4_lib_lookup in udp_lib_err_encap
> could return different socket because of different source and destination port
> and in this case we will never check for correctness of originally found socket,
> i.e. encap_err_lookup will never be called and the ICMP notification will never
> be applied to that socket even if it passes checks.
> My point is that it's simplier to explicitly check socket that was found than
> rely on the result of udp4_lib_lookup with different inputs and leave the case
> of no socket as it was before d26796ae5894.
> 
> If it's ok, I will unify the code for check as Willem suggested and resend v2.

If the final code is small enough, please go ahead with that.

Thanks!

Paolo
Xin Long July 16, 2021, 5:50 p.m. UTC | #8
On Mon, Jul 12, 2021 at 9:37 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Mon, 2021-07-12 at 13:45 +0100, Vadim Fedorenko wrote:
> >
> > > After this patch, the above chunk will not clear 'sk' for packets
> > > targeting ESP in UDP sockets, but AFAICS we will still enter the
> > > following conditional, preserving the current behavior - no ICMP
> > > processing.
> >
> > We will not enter following conditional for ESP in UDP case because
> > there is no more check for encap_type or encap_enabled.
>
> I see. You have a bug in the ipv6 code-path. With your patch applied:
>
> ---
>         sk = __udp6_lib_lookup(net, daddr, uh->dest, saddr, uh->source,
>                                inet6_iif(skb), inet6_sdif(skb), udptable, NULL);
>         if (sk && udp_sk(sk)->encap_enabled) {
>                 //...
>         }
>
>         if (!sk || udp_sk(sk)->encap_enabled) {
>         // can still enter here...
> ---
>
> > I maybe missing something but d26796ae5894 doesn't actually explain
> > which particular situation should be avoided by this additional check
> > and no tests were added to simply reproduce the problem. If you can
> > explain it a bit more it would greatly help me to improve the fix.
>
> Xin knows better, but AFAICS it used to cover the situation you
> explicitly tests in patch 3/3 - incoming packet with src-port == dst-
> port == tunnel port - for e.g. vxlan tunnels.
Thanks Paolo, and sorry for the late reply.

Right, __udp4/6_lib_err_encap() was introduced to process ICMP error
packets for UDP tunnels. But it only works when no socket is found
with the src + dst port; when src port == dst port, a socket might
be found (if the bind addr is ANY) and the code will be called.
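
A minimal userspace sketch of the src port == dst port case, not kernel
code: bound_sock, bound_lookup and reaches_tunnel_path are hypothetical
names, and the model assumes the pre-d26796ae5894 rule that the tunnel
path is tried only when the first lookup finds no socket.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ADDR_ANY 0u

/* Hypothetical simplified socket: bind address plus local port. */
struct bound_sock {
	uint32_t bind_addr;	/* TOY_ADDR_ANY matches every local address */
	uint16_t local_port;
};

static const struct bound_sock *bound_lookup(const struct bound_sock *tbl,
					     size_t n, uint32_t local_addr,
					     uint16_t local_port)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (tbl[i].local_port != local_port)
			continue;
		if (tbl[i].bind_addr == TOY_ADDR_ANY ||
		    tbl[i].bind_addr == local_addr)
			return &tbl[i];
	}
	return NULL;
}

/* True if the first lookup finds nothing, so the tunnel path is tried. */
static bool reaches_tunnel_path(const struct bound_sock *tbl, size_t n,
				uint32_t local_addr, uint16_t src_port)
{
	return bound_lookup(tbl, n, local_addr, src_port) == NULL;
}

/* Usage: one tunnel socket bound to ANY on port 4789. */
void demo_src_eq_dst(void)
{
	const struct bound_sock tbl[] = { { TOY_ADDR_ANY, 4789 } };
	const uint32_t our_addr = 0xc0000201u;	/* 192.0.2.1 */

	/* Ephemeral source port: no socket found, tunnel path is tried. */
	assert(reaches_tunnel_path(tbl, 1, our_addr, 50000));
	/* src port == tunnel port: the tunnel socket itself is found. */
	assert(!reaches_tunnel_path(tbl, 1, our_addr, 4789));
}
```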



>
> > > Why can't you use something alike the following instead?
> > >
> > > ---
> > > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > > index c0f9f3260051..96a3b640e4da 100644
> > > --- a/net/ipv4/udp.c
> > > +++ b/net/ipv4/udp.c
> > > @@ -707,7 +707,7 @@ int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
> > >          sk = __udp4_lib_lookup(net, iph->daddr, uh->dest,
> > >                                 iph->saddr, uh->source, skb->dev->ifindex,
> > >                                 inet_sdif(skb), udptable, NULL);
> > > -       if (!sk || udp_sk(sk)->encap_type) {
> > > +       if (!sk || READ_ONCE(udp_sk(sk)->encap_err_lookup)) {
> > >                  /* No socket for error: try tunnels before discarding */
> > >                  sk = ERR_PTR(-ENOENT);
> > >                  if (static_branch_unlikely(&udp_encap_needed_key)) {
> > >
> > > ---
>
> Could you please have a look at the above ?
If not all UDP tunnels want to do further validation of ICMP error packets,
this looks good to me.

>
> Thanks!
>
> /P
>

Patch

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index e5cb7fedfbcd..4980e0f19990 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -707,7 +707,29 @@  int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
 	sk = __udp4_lib_lookup(net, iph->daddr, uh->dest,
 			       iph->saddr, uh->source, skb->dev->ifindex,
 			       inet_sdif(skb), udptable, NULL);
-	if (!sk || udp_sk(sk)->encap_enabled) {
+	if (sk && udp_sk(sk)->encap_enabled) {
+		int (*lookup)(struct sock *sk, struct sk_buff *skb);
+
+		lookup = READ_ONCE(udp_sk(sk)->encap_err_lookup);
+		if (lookup) {
+			int network_offset, transport_offset;
+
+			network_offset = skb_network_offset(skb);
+			transport_offset = skb_transport_offset(skb);
+
+			/* Network header needs to point to the outer IPv4 header inside ICMP */
+			skb_reset_network_header(skb);
+
+			/* Transport header needs to point to the UDP header */
+			skb_set_transport_header(skb, iph->ihl << 2);
+			if (lookup(sk, skb))
+				sk = NULL;
+			skb_set_transport_header(skb, transport_offset);
+			skb_set_network_header(skb, network_offset);
+		}
+	}
+
+	if (!sk) {
 		/* No socket for error: try tunnels before discarding */
 		sk = ERR_PTR(-ENOENT);
 		if (static_branch_unlikely(&udp_encap_needed_key)) {
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 798916d2e722..ed49a8589d9f 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -558,6 +558,28 @@  int __udp6_lib_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 
 	sk = __udp6_lib_lookup(net, daddr, uh->dest, saddr, uh->source,
 			       inet6_iif(skb), inet6_sdif(skb), udptable, NULL);
+	if (sk && udp_sk(sk)->encap_enabled) {
+		int (*lookup)(struct sock *sk, struct sk_buff *skb);
+
+		lookup = READ_ONCE(udp_sk(sk)->encap_err_lookup);
+		if (lookup) {
+			int network_offset, transport_offset;
+
+			network_offset = skb_network_offset(skb);
+			transport_offset = skb_transport_offset(skb);
+
+			/* Network header needs to point to the outer IPv6 header inside ICMP */
+			skb_reset_network_header(skb);
+
+			/* Transport header needs to point to the UDP header */
+			skb_set_transport_header(skb, offset);
+			if (lookup(sk, skb))
+				sk = NULL;
+			skb_set_transport_header(skb, transport_offset);
+			skb_set_network_header(skb, network_offset);
+		}
+	}
+
 	if (!sk || udp_sk(sk)->encap_enabled) {
 		/* No socket for error: try tunnels before discarding */
 		sk = ERR_PTR(-ENOENT);