diff mbox series

ipv6: fix NULL pointer dereference in ip6_output()

Message ID 20211206163447.991402-1-andrea.righi@canonical.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Series ipv6: fix NULL pointer dereference in ip6_output()

Checks

Context Check Description
netdev/tree_selection success Guessed tree name to be net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix warning Target tree name not specified in the subject
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers fail 1 blamed authors not CCed: ssuryaextr@gmail.com; 1 maintainers not CCed: ssuryaextr@gmail.com
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 9 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Andrea Righi Dec. 6, 2021, 4:34 p.m. UTC
It is possible to trigger a NULL pointer dereference by running the srv6
net kselftest (tools/testing/selftests/net/srv6_end_dt46_l3vpn_test.sh):

[  249.051216] BUG: kernel NULL pointer dereference, address: 0000000000000378
[  249.052331] #PF: supervisor read access in kernel mode
[  249.053137] #PF: error_code(0x0000) - not-present page
[  249.053960] PGD 0 P4D 0
[  249.054376] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  249.055083] CPU: 1 PID: 21 Comm: ksoftirqd/1 Tainted: G            E     5.16.0-rc4 #2
[  249.056328] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
[  249.057632] RIP: 0010:ip6_forward+0x53c/0xab0
[  249.058354] Code: 49 c7 44 24 20 00 00 00 00 48 83 e0 fe 48 8b 40 30 48 3d 70 b2 b5 81 0f 85 b5 04 00 00 e8 7c f2 ff ff 41 89 c5 e9 17 01 00 00 <44> 8b 93 78 03 00 00 45 85 d2 0f 85 92 fb ff ff 49 8b 54 24 10 48
[  249.061274] RSP: 0018:ffffc900000cbb30 EFLAGS: 00010246
[  249.062042] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8881051d3400
[  249.063141] RDX: ffff888104bda000 RSI: 00000000000002c0 RDI: 0000000000000000
[  249.064264] RBP: ffffc900000cbbc8 R08: 0000000000000000 R09: 0000000000000000
[  249.065376] R10: 0000000000000040 R11: 0000000000000000 R12: ffff888103409800
[  249.066498] R13: ffff8881051d3410 R14: ffff888102725280 R15: ffff888103525000
[  249.067619] FS:  0000000000000000(0000) GS:ffff88813bc80000(0000) knlGS:0000000000000000
[  249.068881] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  249.069777] CR2: 0000000000000378 CR3: 0000000104980000 CR4: 0000000000750ee0
[  249.070907] PKRU: 55555554
[  249.071337] Call Trace:
[  249.071730]  <TASK>
[  249.072070]  ? debug_smp_processor_id+0x17/0x20
[  249.072807]  seg6_input_core+0x2bb/0x2d0
[  249.073436]  ? _raw_spin_unlock_irqrestore+0x29/0x40
[  249.074225]  seg6_input+0x3b/0x130
[  249.074768]  lwtunnel_input+0x5e/0xa0
[  249.075357]  ip_rcv+0x17b/0x190
[  249.075867]  ? update_load_avg+0x82/0x600
[  249.076514]  __netif_receive_skb_one_core+0x86/0xa0
[  249.077231]  __netif_receive_skb+0x15/0x60
[  249.077843]  process_backlog+0x97/0x160
[  249.078389]  __napi_poll+0x31/0x170
[  249.078912]  net_rx_action+0x229/0x270
[  249.079506]  __do_softirq+0xef/0x2ed
[  249.080085]  run_ksoftirqd+0x37/0x50
[  249.080663]  smpboot_thread_fn+0x193/0x230
[  249.081312]  kthread+0x17a/0x1a0
[  249.081847]  ? smpboot_register_percpu_thread+0xe0/0xe0
[  249.082677]  ? set_kthread_struct+0x50/0x50
[  249.083340]  ret_from_fork+0x22/0x30
[  249.083926]  </TASK>
[  249.090295] ---[ end trace 1998d7ba5965a365 ]---

It looks like commit 0857d6f8c759 ("ipv6: When forwarding count rx stats
on the orig netdev") tries to determine the right netdev to account the
rx stats, but in this particular case it's failing and the netdev is
NULL.
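
The fault address can be cross-checked against the "Code:" line of the
trace. The sketch below is a small userspace helper, not kernel code: it
decodes the seven bytes starting at the faulting RIP (the byte marked <44>
and the six after it) as a register-plus-disp32 load. The function names
are hypothetical and only this one encoding form is handled.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes at the faulting RIP, copied from the "Code:" line of the oops. */
static const uint8_t oops_insn[] = { 0x44, 0x8b, 0x93, 0x78, 0x03, 0x00, 0x00 };

/*
 * Decode "REX 8b /r" (MOV r32, r/m32) with ModRM mod=10 (base + disp32).
 * On success returns 0 and stores the base register number (3 == RBX)
 * and the 32-bit displacement. Any other encoding is rejected with -1.
 */
static int decode_disp32_load(const uint8_t *p, size_t len,
                              uint32_t *disp, unsigned *base_reg)
{
    if (len < 7)
        return -1;
    if ((p[0] & 0xf0) != 0x40)      /* REX prefix expected */
        return -1;
    if (p[1] != 0x8b)               /* MOV r32, r/m32 */
        return -1;
    if ((p[2] >> 6) != 2)           /* mod must be 10: disp32 follows */
        return -1;
    if ((p[2] & 7) == 4)            /* rm == 100 means SIB byte: not handled */
        return -1;

    /* REX.B (bit 0) extends the rm field to select r8..r15. */
    *base_reg = (unsigned)(p[2] & 7) | ((unsigned)(p[0] & 1) << 3);

    /* The displacement is stored little-endian. */
    *disp = (uint32_t)p[3] | ((uint32_t)p[4] << 8) |
            ((uint32_t)p[5] << 16) | ((uint32_t)p[6] << 24);
    return 0;
}

/* Effective address of the load: base register value plus sign-extended disp. */
static uint64_t fault_address(uint64_t base_value, uint32_t disp)
{
    return base_value + (uint64_t)(int64_t)(int32_t)disp;
}
```

Calling decode_disp32_load(oops_insn, sizeof(oops_insn), &disp, &base)
yields base register 3 (RBX) and displacement 0x378. With RBX equal to 0
in the register dump, fault_address(0, disp) gives 0x378, which matches
both the reported address and CR2: a field read at offset 0x378 from a
NULL pointer.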

Fall back to the previous method of determining the netdev interface (via
skb->dev) to account the rx stats when the orig netdev can't be
determined.

Fixes: 0857d6f8c759 ("ipv6: When forwarding count rx stats on the orig netdev")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
---
 net/ipv6/ip6_output.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

David Ahern Dec. 7, 2021, 3:51 p.m. UTC | #1
[ cc a few SR6 folks ]

On 12/6/21 9:34 AM, Andrea Righi wrote:
> It is possible to trigger a NULL pointer dereference by running the srv6
> net kselftest (tools/testing/selftests/net/srv6_end_dt46_l3vpn_test.sh):
> 
> [  249.051216] BUG: kernel NULL pointer dereference, address: 0000000000000378
> [  249.052331] #PF: supervisor read access in kernel mode
> [  249.053137] #PF: error_code(0x0000) - not-present page
> [  249.053960] PGD 0 P4D 0
> [  249.054376] Oops: 0000 [#1] PREEMPT SMP NOPTI
> [  249.055083] CPU: 1 PID: 21 Comm: ksoftirqd/1 Tainted: G            E     5.16.0-rc4 #2
> [  249.056328] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
> [  249.057632] RIP: 0010:ip6_forward+0x53c/0xab0
> [  249.058354] Code: 49 c7 44 24 20 00 00 00 00 48 83 e0 fe 48 8b 40 30 48 3d 70 b2 b5 81 0f 85 b5 04 00 00 e8 7c f2 ff ff 41 89 c5 e9 17 01 00 00 <44> 8b 93 78 03 00 00 45 85 d2 0f 85 92 fb ff ff 49 8b 54 24 10 48
> [  249.061274] RSP: 0018:ffffc900000cbb30 EFLAGS: 00010246
> [  249.062042] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8881051d3400
> [  249.063141] RDX: ffff888104bda000 RSI: 00000000000002c0 RDI: 0000000000000000
> [  249.064264] RBP: ffffc900000cbbc8 R08: 0000000000000000 R09: 0000000000000000
> [  249.065376] R10: 0000000000000040 R11: 0000000000000000 R12: ffff888103409800
> [  249.066498] R13: ffff8881051d3410 R14: ffff888102725280 R15: ffff888103525000
> [  249.067619] FS:  0000000000000000(0000) GS:ffff88813bc80000(0000) knlGS:0000000000000000
> [  249.068881] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  249.069777] CR2: 0000000000000378 CR3: 0000000104980000 CR4: 0000000000750ee0
> [  249.070907] PKRU: 55555554
> [  249.071337] Call Trace:
> [  249.071730]  <TASK>
> [  249.072070]  ? debug_smp_processor_id+0x17/0x20
> [  249.072807]  seg6_input_core+0x2bb/0x2d0
> [  249.073436]  ? _raw_spin_unlock_irqrestore+0x29/0x40
> [  249.074225]  seg6_input+0x3b/0x130
> [  249.074768]  lwtunnel_input+0x5e/0xa0
> [  249.075357]  ip_rcv+0x17b/0x190
> [  249.075867]  ? update_load_avg+0x82/0x600
> [  249.076514]  __netif_receive_skb_one_core+0x86/0xa0
> [  249.077231]  __netif_receive_skb+0x15/0x60
> [  249.077843]  process_backlog+0x97/0x160
> [  249.078389]  __napi_poll+0x31/0x170
> [  249.078912]  net_rx_action+0x229/0x270
> [  249.079506]  __do_softirq+0xef/0x2ed
> [  249.080085]  run_ksoftirqd+0x37/0x50
> [  249.080663]  smpboot_thread_fn+0x193/0x230
> [  249.081312]  kthread+0x17a/0x1a0
> [  249.081847]  ? smpboot_register_percpu_thread+0xe0/0xe0
> [  249.082677]  ? set_kthread_struct+0x50/0x50
> [  249.083340]  ret_from_fork+0x22/0x30
> [  249.083926]  </TASK>
> [  249.090295] ---[ end trace 1998d7ba5965a365 ]---
> 
> It looks like commit 0857d6f8c759 ("ipv6: When forwarding count rx stats
> on the orig netdev") tries to determine the right netdev to account the
> rx stats, but in this particular case it's failing and the netdev is
> NULL.
> 
> Fallback to the previous method of determining the netdev interface (via
> skb->dev) to account the rx stats when the orig netdev can't be
> determined.
> 
> Fixes: 0857d6f8c759 ("ipv6: When forwarding count rx stats on the orig netdev")
> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
> ---
>  net/ipv6/ip6_output.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index ff4e83e2a506..7ca4719ff34c 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -472,6 +472,9 @@ int ip6_forward(struct sk_buff *skb)
>  	u32 mtu;
>  
>  	idev = __in6_dev_get_safely(dev_get_by_index_rcu(net, IP6CB(skb)->iif));
> +	if (unlikely(!idev))
> +		idev = __in6_dev_get_safely(skb->dev);
> +

We need to understand why iif is not set - or set to an invalid value.


>  	if (net->ipv6.devconf_all->forwarding == 0)
>  		goto error;
>  
>
Andrea Mayer Dec. 8, 2021, 12:21 a.m. UTC | #2
Hi David,
Thank you for reporting it and thanks also to Andrea Righi for catching that
issue.

Please see my answer below.

On Tue, 7 Dec 2021 08:51:13 -0700
David Ahern <dsahern@gmail.com> wrote:

> [ cc a few SR6 folks ]
> 
> On 12/6/21 9:34 AM, Andrea Righi wrote:
> > It is possible to trigger a NULL pointer dereference by running the srv6
> > net kselftest (tools/testing/selftests/net/srv6_end_dt46_l3vpn_test.sh):
> > 
> > [  249.051216] BUG: kernel NULL pointer dereference, address: 0000000000000378
> > [  249.052331] #PF: supervisor read access in kernel mode
> > [  249.053137] #PF: error_code(0x0000) - not-present page
> > [  249.053960] PGD 0 P4D 0
> > [  249.054376] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > [  249.055083] CPU: 1 PID: 21 Comm: ksoftirqd/1 Tainted: G            E     5.16.0-rc4 #2
> > [  249.056328] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
> > [  249.057632] RIP: 0010:ip6_forward+0x53c/0xab0
> > [  249.058354] Code: 49 c7 44 24 20 00 00 00 00 48 83 e0 fe 48 8b 40 30 48 3d 70 b2 b5 81 0f 85 b5 04 00 00 e8 7c f2 ff ff 41 89 c5 e9 17 01 00 00 <44> 8b 93 78 03 00 00 45 85 d2 0f 85 92 fb ff ff 49 8b 54 24 10 48
> > [  249.061274] RSP: 0018:ffffc900000cbb30 EFLAGS: 00010246
> > [  249.062042] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8881051d3400
> > [  249.063141] RDX: ffff888104bda000 RSI: 00000000000002c0 RDI: 0000000000000000
> > [  249.064264] RBP: ffffc900000cbbc8 R08: 0000000000000000 R09: 0000000000000000
> > [  249.065376] R10: 0000000000000040 R11: 0000000000000000 R12: ffff888103409800
> > [  249.066498] R13: ffff8881051d3410 R14: ffff888102725280 R15: ffff888103525000
> > [  249.067619] FS:  0000000000000000(0000) GS:ffff88813bc80000(0000) knlGS:0000000000000000
> > [  249.068881] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  249.069777] CR2: 0000000000000378 CR3: 0000000104980000 CR4: 0000000000750ee0
> > [  249.070907] PKRU: 55555554
> > [  249.071337] Call Trace:
> > [  249.071730]  <TASK>
> > [  249.072070]  ? debug_smp_processor_id+0x17/0x20
> > [  249.072807]  seg6_input_core+0x2bb/0x2d0
> > [  249.073436]  ? _raw_spin_unlock_irqrestore+0x29/0x40
> > [  249.074225]  seg6_input+0x3b/0x130
> > [  249.074768]  lwtunnel_input+0x5e/0xa0
> > [  249.075357]  ip_rcv+0x17b/0x190
> > [  249.075867]  ? update_load_avg+0x82/0x600
> > [  249.076514]  __netif_receive_skb_one_core+0x86/0xa0
> > [  249.077231]  __netif_receive_skb+0x15/0x60
> > [  249.077843]  process_backlog+0x97/0x160
> > [  249.078389]  __napi_poll+0x31/0x170
> > [  249.078912]  net_rx_action+0x229/0x270
> > [  249.079506]  __do_softirq+0xef/0x2ed
> > [  249.080085]  run_ksoftirqd+0x37/0x50
> > [  249.080663]  smpboot_thread_fn+0x193/0x230
> > [  249.081312]  kthread+0x17a/0x1a0
> > [  249.081847]  ? smpboot_register_percpu_thread+0xe0/0xe0
> > [  249.082677]  ? set_kthread_struct+0x50/0x50
> > [  249.083340]  ret_from_fork+0x22/0x30
> > [  249.083926]  </TASK>
> > [  249.090295] ---[ end trace 1998d7ba5965a365 ]---
> > 
> > It looks like commit 0857d6f8c759 ("ipv6: When forwarding count rx stats
> > on the orig netdev") tries to determine the right netdev to account the
> > rx stats, but in this particular case it's failing and the netdev is
> > NULL.
> > 
> > Fallback to the previous method of determining the netdev interface (via
> > skb->dev) to account the rx stats when the orig netdev can't be
> > determined.
> > 
> > Fixes: 0857d6f8c759 ("ipv6: When forwarding count rx stats on the orig netdev")
> > Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
> > ---
> >  net/ipv6/ip6_output.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> > index ff4e83e2a506..7ca4719ff34c 100644
> > --- a/net/ipv6/ip6_output.c
> > +++ b/net/ipv6/ip6_output.c
> > @@ -472,6 +472,9 @@ int ip6_forward(struct sk_buff *skb)
> >  	u32 mtu;
> >  
> >  	idev = __in6_dev_get_safely(dev_get_by_index_rcu(net, IP6CB(skb)->iif));
> > +	if (unlikely(!idev))
> > +		idev = __in6_dev_get_safely(skb->dev);
> > +
> 
> We need to understand why iif is not set - or set to an invalid value.
> 

When an IPv4 packet is received, ip_rcv_core(...) stores the receiving
interface index in the IPv4 socket control block (v5.16-rc4,
net/ipv4/ip_input.c line 510):
    IPCB(skb)->iif = skb->skb_iif;

If that IPv4 packet is meant to be encapsulated in an outer IPv6+SRH header,
seg6_do_srh_encap(...) performs the required encapsulation.
In this case, seg6_do_srh_encap(...) clears the IPv6 socket control
block (v5.16-rc4 net/ipv6/seg6_iptunnel.c line 163):
    memset(IP6CB(skb), 0, sizeof(*IP6CB(skb)));

The memset(...) was introduced in commit ef489749aae5 ("ipv6: sr: clear
IP6CB(skb) on SRH ip4ip6 encapsulation") a long time ago (2019-01-29).

Since the IPv6 socket control block and the IPv4 socket control block share the
same memory area (skb->cb), the receiving interface index info is lost
(IP6CB(skb)->iif is set to zero).

As a side effect, that condition triggers a NULL pointer dereference if patch
0857d6f8c759 ("ipv6: When forwarding count rx stats on the orig netdev") is
applied.

To fix that, I can send a patch where we set the IP6CB(skb)->iif to the
index of the receiving interface, i.e.:

int seg6_do_srh_encap(struct sk_buff *skb, struct ipv6_sr_hdr *osrh, int proto)
         [...]
         ip6_flow_hdr(hdr, 0, flowlabel);
         hdr->hop_limit = ip6_dst_hoplimit(skb_dst(skb));

         memset(IP6CB(skb), 0, sizeof(*IP6CB(skb)));
+        IP6CB(skb)->iif = skb->skb_iif;
         [...]

What do you think?

Andrea
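
The analysis above can be reproduced in userspace with a simplified model.
The following is a minimal sketch, not kernel code: the fake_* structures
and functions are hypothetical stand-ins for struct sk_buff, IPCB()/IP6CB(),
ip_rcv_core() and seg6_do_srh_encap(), and the field layout is illustrative
only. It shows how zeroing the shared control block loses the interface
index, and how the proposed one-line assignment restores it.

```c
#include <string.h>

/* Hypothetical, heavily simplified per-protocol views of the control block. */
struct fake_inet_skb_parm  { int iif; int flags; };            /* IPv4 view */
struct fake_inet6_skb_parm { int iif; int flags; int nhoff; }; /* IPv6 view */

struct fake_skb {
    int skb_iif;                        /* ifindex the packet arrived on */
    union {                             /* models skb->cb: one area, many views */
        struct fake_inet_skb_parm  v4;
        struct fake_inet6_skb_parm v6;
        char raw[48];
    } cb;
};

/* Models the IPv4 receive path: IPCB(skb)->iif = skb->skb_iif; */
static void fake_ip_rcv_core(struct fake_skb *skb)
{
    skb->cb.v4.iif = skb->skb_iif;
}

/*
 * Models the SRH encapsulation step. The memset wipes the IPv6 view, and
 * because both views overlap, the iif stored by the IPv4 path is lost.
 * With restore_iif set, the proposed fix re-populates it from skb_iif.
 */
static void fake_srh_encap(struct fake_skb *skb, int restore_iif)
{
    memset(&skb->cb.v6, 0, sizeof(skb->cb.v6));
    if (restore_iif)
        skb->cb.v6.iif = skb->skb_iif;
}
```

For a packet with skb_iif set to 7, fake_ip_rcv_core() followed by
fake_srh_encap(skb, 0) leaves cb.v6.iif at 0, which is the state that a
later lookup by interface index cannot resolve. Calling
fake_srh_encap(skb, 1) instead leaves cb.v6.iif at 7.
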
David Ahern Dec. 8, 2021, 12:50 a.m. UTC | #3
On 12/7/21 5:21 PM, Andrea Mayer wrote:
> 
> When an IPv4 packet is received, the ip_rcv_core(...) sets the receiving
> interface index into the IPv4 socket control block (v5.16-rc4,
> net/ipv4/ip_input.c line 510):
>     IPCB(skb)->iif = skb->skb_iif;
> 
> If that IPv4 packet is meant to be encapsulated in an outer IPv6+SRH header,
> the seg6_do_srh_encap(...) performs the required encapsulation. 
> In this case, the seg6_do_srh_encap function clears the IPv6 socket control
> block (v5.16-rc4 net/ipv6/seg6_iptunnel.c line 163):
>     memset(IP6CB(skb), 0, sizeof(*IP6CB(skb)));
> 
> The memset(...) was introduced in commit ef489749aae5 ("ipv6: sr: clear
> IP6CB(skb) on SRH ip4ip6 encapsulation") a long time ago (2019-01-29).
> 
> Since the IPv6 socket control block and the IPv4 socket control block share the
> same memory area (skb->cb), the receiving interface index info is lost
> (IP6CB(skb)->iif is set to zero).
> 
> As a side effect, that condition triggers a NULL pointer dereference if patch
> 0857d6f8c759 ("ipv6: When forwarding count rx stats on the orig netdev") is
> applied.
> 
> To fix that, I can send a patch where we set the IP6CB(skb)->iif to the
> index of the receiving interface, i.e.:
> 
> int seg6_do_srh_encap(struct sk_buff *skb, struct ipv6_sr_hdr *osrh, int proto)
>          [...]
>          ip6_flow_hdr(hdr, 0, flowlabel);
>          hdr->hop_limit = ip6_dst_hoplimit(skb_dst(skb));
> 
>          memset(IP6CB(skb), 0, sizeof(*IP6CB(skb)));
> +        IP6CB(skb)->iif = skb->skb_iif;
>          [...]
> 
> What do you think?
> 

I like that approach over the need for a fallback in core ipv6 code.
Make sure the above analysis is in the commit message. Thanks for the
quick response.
Florian Westphal Dec. 8, 2021, 10:51 a.m. UTC | #4
David Ahern <dsahern@gmail.com> wrote:
> On 12/7/21 5:21 PM, Andrea Mayer wrote:
> > +        IP6CB(skb)->iif = skb->skb_iif;
> >          [...]
> > 
> > What do you think?
> > 
> 
> I like that approach over the need for a fall back in core ipv6 code.

What if the device is removed after ->iif assignment and before dev lookup?
David Ahern Dec. 8, 2021, 3:46 p.m. UTC | #5
On 12/8/21 3:51 AM, Florian Westphal wrote:
> David Ahern <dsahern@gmail.com> wrote:
>> On 12/7/21 5:21 PM, Andrea Mayer wrote:
>>> +        IP6CB(skb)->iif = skb->skb_iif;
>>>          [...]
>>>
>>> What do you think?
>>>
>>
>> I like that approach over the need for a fall back in core ipv6 code.
> 
> What if the device is removed after ->iif assignment and before dev lookup?
> 

Good point. SR6 should make sure the iif is not cleared, and the
fallback to skb->dev is still needed in case the device is deleted.
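
The conclusion reached here (keep iif valid, and still fall back when the
lookup fails) can be sketched in userspace as follows. This is an
illustration under stated assumptions, not the kernel implementation: the
fake_* types and helpers are hypothetical, with fake_dev_by_index()
standing in for dev_get_by_index_rcu() and fake_idev_of() for a
NULL-tolerant helper in the spirit of __in6_dev_get_safely().

```c
#include <stddef.h>

struct fake_idev   { int id; };                             /* per-device IPv6 state */
struct fake_netdev { int ifindex; struct fake_idev *idev; };

/* Look a device up by index; a removed device is modelled as a NULL slot. */
static struct fake_netdev *fake_dev_by_index(struct fake_netdev **table,
                                             size_t n, int ifindex)
{
    for (size_t i = 0; i < n; i++)
        if (table[i] && table[i]->ifindex == ifindex)
            return table[i];
    return NULL;
}

/* NULL-tolerant accessor: returns NULL instead of dereferencing a NULL dev. */
static struct fake_idev *fake_idev_of(const struct fake_netdev *dev)
{
    return dev ? dev->idev : NULL;
}

/*
 * Prefer the device recorded as the original input interface; if that
 * index is zero, stale, or the device is gone, fall back to skb_dev.
 * The result may still be NULL, so callers must check it before use.
 */
static struct fake_idev *fake_pick_idev(struct fake_netdev **table, size_t n,
                                        int iif, const struct fake_netdev *skb_dev)
{
    struct fake_idev *idev = fake_idev_of(fake_dev_by_index(table, n, iif));

    if (!idev)
        idev = fake_idev_of(skb_dev);
    return idev;
}
```

With a table holding one device at ifindex 3, a lookup for iif 3 returns
that device's state, while a lookup for iif 0 (the cleared control block)
or for iif 3 after the slot has been set to NULL (the removal case raised
above) returns the state of skb_dev instead.
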
Stephen Suryaputra Dec. 8, 2021, 4:34 p.m. UTC | #6
On Wed, Dec 08, 2021 at 08:46:37AM -0700, David Ahern wrote:
> On 12/8/21 3:51 AM, Florian Westphal wrote:
> > David Ahern <dsahern@gmail.com> wrote:
> >> On 12/7/21 5:21 PM, Andrea Mayer wrote:
> >>> +        IP6CB(skb)->iif = skb->skb_iif;
> >>>          [...]
> >>>
> >>> What do you think?
> >>>
> >>
> >> I like that approach over the need for a fall back in core ipv6 code.
> > 
> > What if the device is removed after ->iif assignment and before dev lookup?
> > 
> 
> good point. SR6 should make sure the iif is not cleared, and the
> fallback to the skb->dev is still needed in case of delete.

Thanks for the explanation. I was thinking that ->iif can safely be
assumed to be valid. Florian's point that the device can be removed is a
good one. My bad for not adding the check, and thanks for fixing it.

Patch

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index ff4e83e2a506..7ca4719ff34c 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -472,6 +472,9 @@  int ip6_forward(struct sk_buff *skb)
 	u32 mtu;
 
 	idev = __in6_dev_get_safely(dev_get_by_index_rcu(net, IP6CB(skb)->iif));
+	if (unlikely(!idev))
+		idev = __in6_dev_get_safely(skb->dev);
+
 	if (net->ipv6.devconf_all->forwarding == 0)
 		goto error;