Message ID | 1432039021-29666-1-git-send-email-michal.kazior@tieto.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Johannes Berg |
Headers | show |
On Tue, 2015-05-19 at 14:37 +0200, Michal Kazior wrote: > This isn't a revert of f8cdddb8d61d ("cfg80211: > check iface combinations only when iface is > running") as far as functionality is considred > because b6a550156bc ("cfg80211/mac80211: move more > combination checks to mac80211") moved the logic > somewhere else. > > It was possible for mac80211 to be coerced into an > unexpected flow causing sdata union to become > corrupted. Station pointer was put into > sdata->u.vlan.sta memory location while it was > really master AP's sdata->u.ap.next_beacon. This > led to station entry being later freed as CSA > beacon before __sta_info_flush() in > ieee80211_stop_ap() and a subsequent invalid > pointer dereference crash. > > The problem was observed with the following test > steps: > > 1. prepare 2 devices > 2. start hostapd AP with wds_sta=1 > 3. connect client with 4addr > 4. disconnect > 5. swap roles & connect > 6. disconnect > [ During AP (which was a client first) > teardown kernel would crash. ] That doesn't really explain how it crashes? > +++ b/net/wireless/util.c > @@ -944,7 +944,7 @@ int cfg80211_change_iface(struct cfg80211_registered_device *rdev, > ntype == NL80211_IFTYPE_P2P_CLIENT)) > return -EBUSY; > > - if (ntype != otype && netif_running(dev)) { > + if (ntype != otype) { > dev->ieee80211_ptr->use_4addr = false; > dev->ieee80211_ptr->mesh_id_up_len = 0; > wdev_lock(dev->ieee80211_ptr); I don't think that makes much sense - the code within this block really only makes sense when the interface *is* running, like disconnecting etc. Doing that when it's *not* would be entirely unexpected to the drivers, no? johannes -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2015-05-20 at 15:17 +0200, Johannes Berg wrote: > > - if (ntype != otype && netif_running(dev)) { > > + if (ntype != otype) { > > dev->ieee80211_ptr->use_4addr = false; > > dev->ieee80211_ptr->mesh_id_up_len = 0; > > wdev_lock(dev->ieee80211_ptr); > > I don't think that makes much sense - the code within this block really > only makes sense when the interface *is* running, like disconnecting > etc. Doing that when it's *not* would be entirely unexpected to the > drivers, no? The real problem here might be the assignment to use_4addr *before* we've actually disconnected or anything, perhaps that should be moved? Similarly, the mesh_id_up_len should probably be moved into the mesh point switch case... johannes -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 20 May 2015 at 15:19, Johannes Berg <johannes@sipsolutions.net> wrote: > On Wed, 2015-05-20 at 15:17 +0200, Johannes Berg wrote: > >> > - if (ntype != otype && netif_running(dev)) { >> > + if (ntype != otype) { >> > dev->ieee80211_ptr->use_4addr = false; >> > dev->ieee80211_ptr->mesh_id_up_len = 0; >> > wdev_lock(dev->ieee80211_ptr); >> >> I don't think that makes much sense - the code within this block really >> only makes sense when the interface *is* running, like disconnecting >> etc. Doing that when it's *not* would be entirely unexpected to the >> drivers, no? The netif_running() was originally introduced in f8cdddb8d61d which did in for entirely different purpose. Hence stripping netif_running() shouldn't be a problem for drivers, can it? The patch was also made kind of obsolete by b6a550156bc. Perhaps there are some other behavioral changes that I'm unaware of in stuff that gets called upon entering this condition down the stack..? Maybe it's just safer to move use_4addr/mesh_id_up_len clearing into a separate `oftype != ntype` condition. What do you think? > The real problem here might be the assignment to use_4addr *before* > we've actually disconnected or anything, perhaps that should be moved? > > Similarly, the mesh_id_up_len should probably be moved into the mesh > point switch case... The problem isn't use_4addr clearing ordering per se. The problem is it wasn't cleared at all on interface changes. The minimal subset of commands that would crash the system was: # host A and host B have just booted; no wpa_s/hostapd running; all vifs are down host A> iw wlan0 set type station host A> iw wlan0 set 4addr on host A> printf 'interface=wlan0\nssid=4addrcrash\nchannel=1\nwds_sta=1' > /tmp/hconf host A> hostapd -B /tmp/conf host B> iw wlan0 set 4addr on host B> ifconfig wlan0 up host B> iw wlan0 connect -w hostAssid host A> pkill hostapd # host A crashes: [ 127.928192] BUG: unable to handle kernel NULL pointer dereference at 00000000000006c8 [ 127.929014] IP: [<ffffffff816f4f32>] __sta_info_flush+0xac/0x158 ... [ 127.934578] [<ffffffff8170789e>] ieee80211_stop_ap+0x139/0x26c [ 127.934578] [<ffffffff8100498f>] ? dump_trace+0x279/0x28a [ 127.934578] [<ffffffff816dc661>] __cfg80211_stop_ap+0x84/0x191 [ 127.934578] [<ffffffff816dc7ad>] cfg80211_stop_ap+0x3f/0x58 [ 127.934578] [<ffffffff816c5ad6>] nl80211_stop_ap+0x1b/0x1d [ 127.934578] [<ffffffff815e53f8>] genl_family_rcv_msg+0x259/0x2b5 ; gdb vmlinux -ex 'l * __sta_info_flush+0xac' -ex quit --quiet Reading symbols from vmlinux...done. 0xffffffff816f4f32 is in __sta_info_flush (/devel/src/linux/net/mac80211/sta_info.c:1033). 1028 WARN_ON(vlans && !sdata->bss); 1029 1030 mutex_lock(&local->sta_mtx); 1031 list_for_each_entry_safe(sta, tmp, &local->sta_list, list) { 1032 if (sdata == sta->sdata || 1033 (vlans && sdata->bss == sta->sdata->bss)) { 1034 if (!WARN_ON(__sta_info_destroy_part1(sta))) 1035 list_add(&sta->free_list, &free_list); 1036 ret++; 1037 } This is because: (cfg.c, ieee80211_change_station) 1397 if (params->vlan->ieee80211_ptr->use_4addr) { 1398 if (vlansdata->u.vlan.sta) { 1399 err = -EBUSY; 1400 goto out_err; 1401 } 1402 1403 rcu_assign_pointer(vlansdata->u.vlan.sta, sta); 1404 new_4addr = true; 1405 } 1406 @1397 would eval to true even though vlansdata is IFTYPE_AP. u.vlan.sta has identical memory offset as u.ap.next_beacon. During ieee80211_stop_ap() it would: (cfg.c, ieee80211_stop_ap) 913 kfree(sdata->u.ap.next_beacon); 914 sdata->u.ap.next_beacon = NULL; ... 929 __sta_info_flush(sdata, true); Effectively sta_info was freed at @914 and then __sta_info_flush() tried to use it (after it was freed). Do you want me to put the above into the commit log? Should I put a copy into the mac80211 patch as well? Micha? -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2015-05-21 at 09:44 +0200, Michal Kazior wrote: > On 20 May 2015 at 15:19, Johannes Berg <johannes@sipsolutions.net> wrote: > > On Wed, 2015-05-20 at 15:17 +0200, Johannes Berg wrote: > > > >> > - if (ntype != otype && netif_running(dev)) { > >> > + if (ntype != otype) { > >> > dev->ieee80211_ptr->use_4addr = false; > >> > dev->ieee80211_ptr->mesh_id_up_len = 0; > >> > wdev_lock(dev->ieee80211_ptr); > >> > >> I don't think that makes much sense - the code within this block really > >> only makes sense when the interface *is* running, like disconnecting > >> etc. Doing that when it's *not* would be entirely unexpected to the > >> drivers, no? > > The netif_running() was originally introduced in f8cdddb8d61d which > did in for entirely different purpose. Hence stripping netif_running() > shouldn't be a problem for drivers, can it? The patch was also made > kind of obsolete by b6a550156bc. Perhaps there are some other > behavioral changes that I'm unaware of in stuff that gets called upon > entering this condition down the stack..? Perhaps it doesn't matter actually, since all the functions that are called here will double-check things before calling the driver? > Maybe it's just safer to move use_4addr/mesh_id_up_len clearing into a > separate `oftype != ntype` condition. What do you think? Or maybe we should just look at the functions we call in more detail :) > > The real problem here might be the assignment to use_4addr *before* > > we've actually disconnected or anything, perhaps that should be moved? > > > > Similarly, the mesh_id_up_len should probably be moved into the mesh > > point switch case... > > The problem isn't use_4addr clearing ordering per se. The problem is > it wasn't cleared at all on interface changes. Ah. > Do you want me to put the above into the commit log? Should I put a > copy into the mac80211 patch as well? I think just noting (more explicitly) that it was *missing* a clearing in these cases would have been helpful. johannes -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/wireless/util.c b/net/wireless/util.c index 70051ab52f4f..7e4e3fffe7ce 100644 --- a/net/wireless/util.c +++ b/net/wireless/util.c @@ -944,7 +944,7 @@ int cfg80211_change_iface(struct cfg80211_registered_device *rdev, ntype == NL80211_IFTYPE_P2P_CLIENT)) return -EBUSY; - if (ntype != otype && netif_running(dev)) { + if (ntype != otype) { dev->ieee80211_ptr->use_4addr = false; dev->ieee80211_ptr->mesh_id_up_len = 0; wdev_lock(dev->ieee80211_ptr);
This isn't a revert of f8cdddb8d61d ("cfg80211: check iface combinations only when iface is running") as far as functionality is considred because b6a550156bc ("cfg80211/mac80211: move more combination checks to mac80211") moved the logic somewhere else. It was possible for mac80211 to be coerced into an unexpected flow causing sdata union to become corrupted. Station pointer was put into sdata->u.vlan.sta memory location while it was really master AP's sdata->u.ap.next_beacon. This led to station entry being later freed as CSA beacon before __sta_info_flush() in ieee80211_stop_ap() and a subsequent invalid pointer dereference crash. The problem was observed with the following test steps: 1. prepare 2 devices 2. start hostapd AP with wds_sta=1 3. connect client with 4addr 4. disconnect 5. swap roles & connect 6. disconnect [ During AP (which was a client first) teardown kernel would crash. ] Fixes: f8cdddb8d61d ("cfg80211: check iface combinations only when iface is running") Signed-off-by: Michal Kazior <michal.kazior@tieto.com> --- net/wireless/util.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)