
[net,1/2] openvswitch: fix stack OOB read while fragmenting IPv4 packets

Message ID 94839fa9e7995afa6139b4f65c12ac15c1a8dc2f.1618844973.git.dcaratti@redhat.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series fix stack OOB read while fragmenting IPv4 packets

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net
netdev/subject_prefix success Link
netdev/cc_maintainers fail 1 blamed authors not CCed: sbrivio@redhat.com; 2 maintainers not CCed: sbrivio@redhat.com dev@openvswitch.org
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 21 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/header_inline success Link

Commit Message

Davide Caratti April 19, 2021, 3:23 p.m. UTC
running openvswitch on kernels built with KASAN, it's possible to see the
following splat while testing fragmentation of IPv4 packets:

 BUG: KASAN: stack-out-of-bounds in ip_do_fragment+0x1b03/0x1f60
 Read of size 1 at addr ffff888112fc713c by task handler2/1367

 CPU: 0 PID: 1367 Comm: handler2 Not tainted 5.12.0-rc6+ #418
 Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
 Call Trace:
  dump_stack+0x92/0xc1
  print_address_description.constprop.7+0x1a/0x150
  kasan_report.cold.13+0x7f/0x111
  ip_do_fragment+0x1b03/0x1f60
  ovs_fragment+0x5bf/0x840 [openvswitch]
  do_execute_actions+0x1bd5/0x2400 [openvswitch]
  ovs_execute_actions+0xc8/0x3d0 [openvswitch]
  ovs_packet_cmd_execute+0xa39/0x1150 [openvswitch]
  genl_family_rcv_msg_doit.isra.15+0x227/0x2d0
  genl_rcv_msg+0x287/0x490
  netlink_rcv_skb+0x120/0x380
  genl_rcv+0x24/0x40
  netlink_unicast+0x439/0x630
  netlink_sendmsg+0x719/0xbf0
  sock_sendmsg+0xe2/0x110
  ____sys_sendmsg+0x5ba/0x890
  ___sys_sendmsg+0xe9/0x160
  __sys_sendmsg+0xd3/0x170
  do_syscall_64+0x33/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f957079db07
 Code: c3 66 90 41 54 41 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 eb ec ff ff 44 89 e2 48 89 ee 89 df 41 89 c0 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 24 ed ff ff 48
 RSP: 002b:00007f956ce35a50 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
 RAX: ffffffffffffffda RBX: 0000000000000019 RCX: 00007f957079db07
 RDX: 0000000000000000 RSI: 00007f956ce35ae0 RDI: 0000000000000019
 RBP: 00007f956ce35ae0 R08: 0000000000000000 R09: 00007f9558006730
 R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
 R13: 00007f956ce37308 R14: 00007f956ce35f80 R15: 00007f956ce35ae0

 The buggy address belongs to the page:
 page:00000000af2a1d93 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x112fc7
 flags: 0x17ffffc0000000()
 raw: 0017ffffc0000000 0000000000000000 dead000000000122 0000000000000000
 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
 page dumped because: kasan: bad access detected

 addr ffff888112fc713c is located in stack of task handler2/1367 at offset 180 in frame:
  ovs_fragment+0x0/0x840 [openvswitch]

 this frame has 2 objects:
  [32, 144) 'ovs_dst'
  [192, 424) 'ovs_rt'

 Memory state around the buggy address:
  ffff888112fc7000: f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  ffff888112fc7080: 00 f1 f1 f1 f1 00 00 00 00 00 00 00 00 00 00 00
 >ffff888112fc7100: 00 00 00 f2 f2 f2 f2 f2 f2 00 00 00 00 00 00 00
                                         ^
  ffff888112fc7180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  ffff888112fc7200: 00 00 00 00 00 00 f2 f2 f2 00 00 00 00 00 00 00

for IPv4 packets, ovs_fragment() uses a temporary struct dst_entry. Then,
in the following call graph:

  ip_do_fragment()
    ip_skb_dst_mtu()
      ip_dst_mtu_maybe_forward()
        ip_mtu_locked()

the pointer to struct dst_entry is used as a pointer to struct rtable: this
turns the access to struct members like rt_mtu_locked into an OOB read in
the stack. Fix this by changing the temporary variable used for IPv4 packets
in ovs_fragment(), similarly to what is done for IPv6 a few lines below.

Fixes: d52e5a7e7ca4 ("ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmtu")
Cc: <stable@vger.kernel.org>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 net/openvswitch/actions.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
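
The out-of-bounds read described above can be illustrated in userspace
without touching kernel code. The sketch below uses hypothetical,
heavily simplified stand-ins (mock_dst_entry, mock_rtable) whose layout
is NOT the kernel's; only the "dst is the first member, rt_mtu_locked
comes after it" relationship is taken from the commit message. It checks
whether a read of rt_mtu_locked through a cast pointer would stay inside
an object of a given size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dst_entry (layout is illustrative). */
struct mock_dst_entry {
	void *dev;
	unsigned long expires;
};

/* Hypothetical stand-in for struct rtable: the dst is embedded first and
 * rt_mtu_locked lives somewhere after it. */
struct mock_rtable {
	struct mock_dst_entry dst;
	int rt_genid;
	unsigned int rt_flags;
	unsigned char rt_mtu_locked;
};

/* The field read by the cast pointer lies beyond the embedded dst. */
static_assert(offsetof(struct mock_rtable, rt_mtu_locked) >=
	      sizeof(struct mock_dst_entry),
	      "rt_mtu_locked must be located after the embedded dst");

/* Return 1 if reading rt_mtu_locked through a pointer cast to
 * struct mock_rtable stays within an object of obj_size bytes,
 * 0 if the read would go past the end of that object. */
static int mtu_locked_read_in_bounds(size_t obj_size)
{
	size_t off = offsetof(struct mock_rtable, rt_mtu_locked);

	return off + sizeof(unsigned char) <= obj_size;
}
```

Calling mtu_locked_read_in_bounds(sizeof(struct mock_dst_entry)) models
the pre-patch situation (a bare dst on the stack), while passing
sizeof(struct mock_rtable) models the patched one.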

Comments

Eelco Chaudron April 21, 2021, 9:27 a.m. UTC | #1
On 19 Apr 2021, at 17:23, Davide Caratti wrote:

> running openvswitch on kernels built with KASAN, it's possible to see the
> following splat while testing fragmentation of IPv4 packets:

<SNIP>

> for IPv4 packets, ovs_fragment() uses a temporary struct dst_entry. Then,
> in the following call graph:
>
>   ip_do_fragment()
>     ip_skb_dst_mtu()
>       ip_dst_mtu_maybe_forward()
>         ip_mtu_locked()
>
> the pointer to struct dst_entry is used as pointer to struct rtable: this
> turns the access to struct members like rt_mtu_locked into an OOB read in
> the stack. Fix this changing the temporary variable used for IPv4 packets
> in ovs_fragment(), similarly to what is done for IPv6 few lines below.
>
> Fixes: d52e5a7e7ca4 ("ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmt")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Davide Caratti <dcaratti@redhat.com>

The fix looks good to me. However, isn't the real root cause
ip_mtu_locked(), which casts struct dst_entry to struct rtable (without
even using container_of())?

I do not know the details of this area of the code, so maybe it's fine
to always assume that a dst_entry is part of a struct rtable; I see
other core functions, such as ipv4_neigh_lookup() and
ipv4_confirm_neigh(), do the same.


Acked-by: Eelco Chaudron <echaudro@redhat.com>

Davide Caratti April 21, 2021, 3:05 p.m. UTC | #2
hello Eelco, thanks for looking at this!

On Wed, 2021-04-21 at 11:27 +0200, Eelco Chaudron wrote:
> 
> On 19 Apr 2021, at 17:23, Davide Caratti wrote:
> 
> > running openvswitch on kernels built with KASAN, it's possible to see the
> > following splat while testing fragmentation of IPv4 packets:
> 
> <SNIP>
> 
> > for IPv4 packets, ovs_fragment() uses a temporary struct dst_entry. Then,
> > in the following call graph:
> > 
> >   ip_do_fragment()
> >     ip_skb_dst_mtu()
> >       ip_dst_mtu_maybe_forward()
> >         ip_mtu_locked()
> > 
> > the pointer to struct dst_entry is used as pointer to struct rtable: this
> > turns the access to struct members like rt_mtu_locked into an OOB read in
> > the stack. Fix this changing the temporary variable used for IPv4 packets
> > in ovs_fragment(), similarly to what is done for IPv6 few lines below.
> > 
> > Fixes: d52e5a7e7ca4 ("ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmt")
> > Cc: <stable@vger.kernel.org>
> > Signed-off-by: Davide Caratti <dcaratti@redhat.com>
> 
> The fix looks good to me, however isn’t the real root cause 
> ip_mtu_locked() who casts struct dst_entry to struct rtable (not even 
> using container_of())?

good point, that's my understanding (and the reason for that 'Fixes:'
tag). Probably openvswitch was doing this on purpose, and it was "just
working" until commit d52e5a7e7ca4.

But in the current state, I find it much easier to just fix the IPv4 part
so that it behaves like the other "users" of ip_do_fragment(), as
ovs_fragment() already does when the packet is IPv6 (or as the
br_netfilter core does, see [1]).

By the way, apparently ip_do_fragment() already assumes that a struct
rtable is available for the skb [2]. So, the fix in ovs_fragment() looks
safer to me. WDYT?

Eelco Chaudron April 22, 2021, 9:17 a.m. UTC | #3
On 21 Apr 2021, at 17:05, Davide Caratti wrote:

> hello Eelco, thanks for looking at this!
>
> On Wed, 2021-04-21 at 11:27 +0200, Eelco Chaudron wrote:
>>
>> On 19 Apr 2021, at 17:23, Davide Caratti wrote:
>>
>>> running openvswitch on kernels built with KASAN, it's possible to see the
>>> following splat while testing fragmentation of IPv4 packets:
>>
>> <SNIP>
>>
>>> for IPv4 packets, ovs_fragment() uses a temporary struct dst_entry. Then,
>>> in the following call graph:
>>>
>>>   ip_do_fragment()
>>>     ip_skb_dst_mtu()
>>>       ip_dst_mtu_maybe_forward()
>>>         ip_mtu_locked()
>>>
>>> the pointer to struct dst_entry is used as pointer to struct rtable: this
>>> turns the access to struct members like rt_mtu_locked into an OOB read in
>>> the stack. Fix this changing the temporary variable used for IPv4 packets
>>> in ovs_fragment(), similarly to what is done for IPv6 few lines below.
>>>
>>> Fixes: d52e5a7e7ca4 ("ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmt")
>>> Cc: <stable@vger.kernel.org>
>>> Signed-off-by: Davide Caratti <dcaratti@redhat.com>
>>
>> The fix looks good to me, however isn’t the real root cause
>> ip_mtu_locked() who casts struct dst_entry to struct rtable (not even
>> using container_of())?
>
> good point, that's my understanding (and the reason for that 'Fixes:'
> tag). Probably openvswitch was doing this on purpose, and it was "just
> working" until commit d52e5a7e7ca4.
>
> But at the current state, I see much easier to just fix the IPv4 part to
> have the same behavior as other "users" of ip_do_fragment(), like it
> happens for ovs_fragment() when the packet is IPv6 (or br_netfilter
> core, see [1]).
>
> By the way, apparently ip_do_fragment() already assumes that a struct
> rtable is available for the skb [2]. So, the fix in ovs_fragment() looks
> safer to me. WDYT?

The assumption that a dst_entry is always embedded in an rtable already
seems deeply ingrained (see skb_rtable()), so I agree this patch is the
best solution.

So again, Acked-by: Eelco Chaudron <echaudro@redhat.com>

Patch

diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 92a0b67b2728..77d924ab8cdb 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -827,17 +827,17 @@  static void ovs_fragment(struct net *net, struct vport *vport,
 	}
 
 	if (key->eth.type == htons(ETH_P_IP)) {
-		struct dst_entry ovs_dst;
+		struct rtable ovs_rt = { 0 };
 		unsigned long orig_dst;
 
 		prepare_frag(vport, skb, orig_network_offset,
 			     ovs_key_mac_proto(key));
-		dst_init(&ovs_dst, &ovs_dst_ops, NULL, 1,
+		dst_init(&ovs_rt.dst, &ovs_dst_ops, NULL, 1,
 			 DST_OBSOLETE_NONE, DST_NOCOUNT);
-		ovs_dst.dev = vport->dev;
+		ovs_rt.dst.dev = vport->dev;
 
 		orig_dst = skb->_skb_refdst;
-		skb_dst_set_noref(skb, &ovs_dst);
+		skb_dst_set_noref(skb, &ovs_rt.dst);
 		IPCB(skb)->frag_max_size = mru;
 
 		ip_do_fragment(net, skb->sk, skb, ovs_vport_output);
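
The pattern the patch adopts (allocate the full, zero-initialized
containing struct and hand out a pointer to its embedded dst) can be
sketched in userspace as follows. All names here (mock_container_of,
mock_dst_entry, mock_rtable, mock_mtu_locked, fragment_with_full_route)
are hypothetical illustrations and not kernel APIs; the struct layouts
are simplified assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace analogue of the kernel's container_of(). */
#define mock_container_of(ptr, type, member) \
	((type *)((const char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins; only the embedding relationship matters. */
struct mock_dst_entry {
	void *dev;
};

struct mock_rtable {
	struct mock_dst_entry dst;	/* embedded first, as discussed in the thread */
	unsigned char rt_mtu_locked;
};

/* With dst as the first member, a plain cast and container_of agree. */
static_assert(offsetof(struct mock_rtable, dst) == 0,
	      "dst must be the first member for the cast to be equivalent");

/* Analogue of a helper that only receives the dst pointer but needs a
 * field of the containing route. */
static int mock_mtu_locked(const struct mock_dst_entry *dst)
{
	const struct mock_rtable *rt =
		mock_container_of(dst, const struct mock_rtable, dst);

	return rt->rt_mtu_locked;
}

/* Analogue of the patched IPv4 branch: the whole route struct lives on
 * the stack and is zeroed, so the helper reads a defined value. */
static int fragment_with_full_route(void *dev)
{
	struct mock_rtable rt = { 0 };

	rt.dst.dev = dev;
	return mock_mtu_locked(&rt.dst);
}
```

Because the containing struct is zero-initialized, the helper sees
rt_mtu_locked as 0 rather than whatever happens to sit next to a bare
dst on the stack.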