From patchwork Wed Feb 25 09:35:05 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sven Eckelmann
X-Patchwork-Id: 5879091
X-Patchwork-Delegate: johannes@sipsolutions.net
From: Sven Eckelmann
To: Felix Fietkau
Cc: simon@open-mesh.com, linux-wireless@vger.kernel.org,
 johannes@sipsolutions.net, marek@open-mesh.com, Antonio Quartulli
Subject: Re: [PATCH v6 2/3] mac80211/minstrel_ht: use the new rate control API
Date: Wed, 25 Feb 2015 10:35:05 +0100
Message-ID: <8006741.C7YlhOg3U7@bentobox>
User-Agent: KMail/4.14.2 (Linux/3.16.0-4-amd64; KDE/4.14.2; x86_64; ; )
In-Reply-To: <2670025.E9NWYu3f4D@bentobox>
References: <1366640083-1054-1-git-send-email-nbd@openwrt.org>
 <1366640083-1054-2-git-send-email-nbd@openwrt.org>
 <2670025.E9NWYu3f4D@bentobox>
Sender: linux-wireless-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-wireless@vger.kernel.org

Hi Felix,

On Friday 20 February 2015 15:12:10 Sven Eckelmann wrote:
> >  static void
> >
> > @@ -846,6 +857,8 @@ minstrel_ht_update_caps(void *priv, struct
> > ieee80211_supported_band *sband,
> >
> >  	msp->is_ht = true;
> >  	memset(mi, 0, sizeof(*mi));
> >
> > +
> > +	mi->sta = sta;
> >
> >  	mi->stats_update = jiffies;
>
> minstrel_ht_update_caps can be called on init and on different other changes
> (rate_control_rate_update).
>
> Which lock protects mi from following scenario?
>
> context 1: memset(mi, 0, sizeof(*mi)); // mi->sta is now NULL
> context 2: minstrel_ht_update_rates -> rate_control_set_rates(mp->hw,
>            mi->sta, rates)
> context 2: rate_control_set_rates dereferences
>            pubsta->rates (mi->sta + 0x48) -> Kernel Oops
> context 1: mi->sta = sta
>
> The first context is from one of the many rate_control_rate_update in
> mac80211 and the second context is from ieee80211_tx_status.
>
> The question came up when discovering the OpenWrt bug report
> https://dev.openwrt.org/ticket/18388 (minstrel_ht_update_caps
> the thing most likely behind minstrel_remove_sta_debugfs+0xe8c/0x1674 - at
> least EPC is pointing inside this function for a build from this revision)

I have someone here who says that he can reproduce this problem with a
current mac80211 from OpenWrt in ~40 min in a mesh setup with a lot of
multicast. I gave him the test patch below to check whether the problem
could be related to the scenario explained earlier.

He reported back that the mesh nodes have now been running fine for 7 hours.
The patch is also being tested in another network, which has now been running
for 1 1/2 days and was not able to run stably for more than 20 hours at most
before the patch was applied.

These numbers are no definitive proof, but they at least suggest that there
could be a connection. Maybe you already had some concept for protecting
against this problem and have not fully implemented it. It would be nice to
hear back from you.

Kind regards,
	Sven

---
--- a/net/mac80211/rc80211_minstrel_ht.c
+++ b/net/mac80211/rc80211_minstrel_ht.c
@@ -1126,7 +1126,8 @@ minstrel_ht_update_caps(void *priv, stru
 	use_vht = 0;
 
 	msp->is_ht = true;
-	memset(mi, 0, sizeof(*mi));
+	/* don't reset the first entry of mi which is the sta pointer */
+	memset(((u8 *)mi) + sizeof(mi->sta), 0, sizeof(*mi) - sizeof(mi->sta));
 
 	mi->sta = sta;
 	mi->stats_update = jiffies;
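
For anyone who wants to try the idea of the test patch outside the kernel,
here is a minimal userspace sketch of the same "zero everything except the
leading pointer" reset. The names demo_sta, demo_sta_info and
demo_reset_keep_sta are made up for illustration and are not mac80211
structures or functions. Unlike the test patch, the sketch takes the start
of the zeroed region from offsetof() of the second member instead of
sizeof(mi->sta), which should also cover padding after the pointer. It only
shows that the pointer is never overwritten with NULL; the other fields are
still reset without any lock, so it is not meant as a replacement for
proper synchronization.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the real mac80211 types. */
struct demo_sta {
	int id;
};

struct demo_sta_info {
	struct demo_sta *sta;		/* read by a second context */
	unsigned long stats_update;
	unsigned int sample_count;
};

/* The partial reset only works while sta stays the first member. */
static_assert(offsetof(struct demo_sta_info, sta) == 0,
	      "sta must be the first member");

/* Zero every member after sta and leave the sta pointer untouched. */
static void demo_reset_keep_sta(struct demo_sta_info *mi)
{
	const size_t skip = offsetof(struct demo_sta_info, stats_update);

	memset((unsigned char *)mi + skip, 0, sizeof(*mi) - skip);
}
```

Calling demo_reset_keep_sta() on a filled-in demo_sta_info should leave sta
pointing at the same object and set the remaining members to zero.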