From patchwork Fri Feb 2 07:50:26 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ping-Ke Shih
X-Patchwork-Id: 10196339
X-Patchwork-Delegate: kvalo@adurom.com
From: Pkshih
To: James
Cameron, Larry Finger
CC: "linux-wireless@vger.kernel.org"
Subject: RE: rtl8821ae keep alive not set, connection lost
Thread-Topic: rtl8821ae keep alive not set, connection lost
Thread-Index: AQHTLBP4puW3WLhaQ0Ohrg3rJYHzFaOOjVMAgADeWgCAAipYYA==
Date: Fri, 2 Feb 2018 07:50:26 +0000
Message-ID: <5B2DA6FDDF928F4E855344EE0A5C39D13BE7A25E@RTITMBSV07.realtek.com.tw>
References: <20170912220916.GB32211@us.netrek.org>
 <20180201062202.GH917@us.netrek.org>
In-Reply-To: <20180201062202.GH917@us.netrek.org>
Accept-Language: en-US, zh-TW
Content-Language: zh-TW
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-originating-ip: [172.21.69.107]
MIME-Version: 1.0
Sender: linux-wireless-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-wireless@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

> -----Original Message-----
> From: linux-wireless-owner@vger.kernel.org [mailto:linux-wireless-owner@vger.kernel.org] On Behalf
> Of James Cameron
> Sent: Thursday, February 01, 2018 2:22 PM
> To: Larry Finger
> Cc: linux-wireless@vger.kernel.org; Pkshih
> Subject: Re: rtl8821ae keep alive not set, connection lost
>
> On Wed, Jan 31, 2018 at 11:06:12AM -0600, Larry Finger wrote:
> > On 09/12/2017 05:09 PM, James Cameron wrote:
> > >Summary: 40b368af4b75 ("rtlwifi: Fix alignment issues") breaks
> > >rtl8821ae keep alive, causing "Connection to AP lost" and deauth,
> > >but why?
> > >
> > >Wireless connection is lost after a few seconds or minutes, on
> > >every OLPC NL3 laptop with rtl8821ae, with any stable kernel after
> > >4.10.1, and any kernel with 40b368af4b75.
> > >
> > >dmesg contains
> > >
> > > wlp2s0: Connection to AP 2c:b0:5d:a6:86:eb lost
> > >
> > >iw event shows
> > >
> > > wlp2s0: del station 2c:b0:5d:a6:86:eb
> > > wlp2s0 (phy #0): deauth 74:c6:3b:09:b5:0d -> 2c:b0:5d:a6:86:eb reason 4: Disassociated due to inactivity
> > > wlp2s0 (phy #0): disconnected (local request)
> > >
> > >Workaround is to bounce the link, then reconnect;
> > >
> > > ip link set wlp2s0 down
> > > ip link set wlp2s0 up
> > > iw dev wlp2s0 connect qz
> > >
> > >A nearby monitor host captures a deauthentication packet sent by
> > >the device.
> > >
> > >Bisection showed cause is 40b368af4b75 ("rtlwifi: Fix alignment
> > >issues") which changes the width of DBI register read.
> > >
> > >On the face of it, 40b368af4b75 looks correct, especially compared
> > >against same function in rtl8723be.
> > >
> > >I've no idea why reverting fixes the problem. I'm hoping someone
> > >here might speculate and suggest ways to test.
> > >
> > >As keep alive is set through this path, my guess is that keep alive
> > >is not being set in the device. Or perhaps reading 16-bits
> > >perturbs another register. Is there a way to test?
> > >
> > >http://dev.laptop.org/~quozl/z/1drtGD.txt dmesg of 4.13
> > >
> > >http://dev.laptop.org/~quozl/z/1drt7c.txt dmesg with 4.13 and
> > >revert of 40b368af4b75
> >
> > James,
> >
> > I'm afraid we are needing to revisit this problem again. Changing
> > that 8-bit read to a 16-bit version causes an unaligned memory
> > reference in AARCH64, thus we will need to re-revert. To prevent
> > problems on systems such as yours, PK plans to turn off ASPM
> > capability and backdoor in certain platforms that will be listed in
> > a quirks table. Please report the output of 'dmidecode -t system'
> > for you affected system(s).
>
> Thanks for letting me know.
>
> We made three production runs, and I'm waiting to get a hold of the
> dmidecode for two of them.
> This may take some weeks; we have to find
> stock and ship it, or we have to ask our contract manufacturer (CM) if
> they have kept data or units.
>
> I've dmidecode for one production run.
>
> http://dev.laptop.org/~quozl/z/1eh7JF.txt (my unit nl3-e)
>
> I've dmidecode for prototypes, but they have clearly been programmed
> badly. We did not ask our CM for Windows compatibility, so they may
> have had no step to verify the data. We also went through several
> iterations to get serial numbers assigned, so the data I have does not
> have good provenance.
>
> http://dev.laptop.org/~quozl/z/1eh7EE.txt (my unit nl3-c)
> http://dev.laptop.org/~quozl/z/1eh7EV.txt (my unit nl3-d)
> http://dev.laptop.org/~quozl/z/1eh7He.txt (my unit nl3-a)
> http://dev.laptop.org/~quozl/z/1eh8DR.txt (my unit nl3-b)
>
> > We hope you will be able to test any proposed patches.
>
> Yes, can do.
>
> I've just tested v4.15.
>
> However, I'm concerned about your plan to use quirks;
>
> 1. turning off ASPM may decrease run time on battery, which if it is
> significant, across several thousand laptops will yield generator fuel
> or solar budget failure; can the power impact be quantified?
>
> 2. why not keep ASPM enabled, and use 8-bit when quirked, or on
> x86_64, or when not AARCH64?
>
> 3. why not find the underlying problem; PK is in the same company as
> the device firmware engineers, so it should be possible for them to
> find out why 16-bit access causes the device firmware to hang? We
> drew a blank trying to reach firmware engineers through our CM and
> module maker; perhaps we were not large or noisy enough.
>
> 4. it's not just me; there are others who have reported similar
> problems, so won't re-reverting affect them? They haven't engaged in
> the process as thoroughly, and may not be in the quirks table. You
> also reproduced the problem with different hardware.
>

Hi James,

In my experiment, an unaligned word access may return wrong values that
differ from the values read by byte access.
Actually, it can simply be verified by using 'lspci' to check the PCI
configuration space.

DBI read 0x70f:
_rtl8821ae_dbi_read:1127 r8 0x34f = 0x0017
_rtl8821ae_dbi_read:1131 r8 0x350 = 0x000c
_rtl8821ae_dbi_read:1136 r16 0x350 = 0xffff

DBI read 0x719:
_rtl8821ae_dbi_read:1127 r8 0x34d = 0x0000
_rtl8821ae_dbi_read:1131 r8 0x34e = 0x0002
_rtl8821ae_dbi_read:1136 r16 0x34e = 0x0200

Since the wrong value originally read from 0x70f is 0xff, I think a
larger L1 latency, 0x70f[5:3], may be helpful. Please try the patch
below. If it works, the quirk table won't be necessary.

PK

diff --git a/rtl8821ae/hw.c b/rtl8821ae/hw.c
index 7d43ba002..e53af06ed 100644
--- a/rtl8821ae/hw.c
+++ b/rtl8821ae/hw.c
@@ -1123,7 +1123,8 @@ static u8 _rtl8821ae_dbi_read(struct rtl_priv *rtlpriv, u16 addr)
 	}
 	if (0 == tmp) {
 		read_addr = REG_DBI_RDATA + addr % 4;
-		ret = rtl_read_word(rtlpriv, read_addr);
+
+		ret = rtl_read_byte(rtlpriv, read_addr);
 	}
 	return ret;
 }
@@ -1165,7 +1166,7 @@ static void _rtl8821ae_enable_aspm_back_door(struct ieee80211_hw *hw)
 	}
 
 	tmp = _rtl8821ae_dbi_read(rtlpriv, 0x70f);
-	_rtl8821ae_dbi_write(rtlpriv, 0x70f, tmp | BIT(7));
+	_rtl8821ae_dbi_write(rtlpriv, 0x70f, tmp | BIT(7) | 0x38);
 
 	tmp = _rtl8821ae_dbi_read(rtlpriv, 0x719);
 	_rtl8821ae_dbi_write(rtlpriv, 0x719, tmp | BIT(3) | BIT(4));
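
For readers following the thread, the arithmetic behind the log and the
patch can be sketched in plain C. This is only an illustration, not
driver code. The helper names are hypothetical, and the sketch assumes
that each r16 log line covers the same two bytes as the two r8 lines
before it, combined low byte first (little-endian).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compose a 16-bit value from two byte reads,
 * low byte first (little-endian). */
static uint16_t le16_from_bytes(uint8_t lo, uint8_t hi)
{
	return (uint16_t)(lo | ((uint16_t)hi << 8));
}

/* Hypothetical helper: what a function returning u8 keeps of a word. */
static uint8_t truncate_to_u8(uint16_t word)
{
	return (uint8_t)(word & 0xff);
}

/* Value the patched code writes to 0x70f: BIT(7) plus bits [5:3] (0x38). */
static uint8_t dbi_0x70f_value(uint8_t tmp)
{
	return (uint8_t)(tmp | (1u << 7) | 0x38);
}

/* Check the sketch against the values quoted in the log. */
static void dbi_sketch_selfcheck(void)
{
	/* 0x719: byte reads 0x00 and 0x02 agree with the word read 0x0200. */
	assert(le16_from_bytes(0x00, 0x02) == 0x0200);
	/* 0x70f: byte reads 0x17 and 0x0c do not compose to the logged 0xffff. */
	assert(le16_from_bytes(0x17, 0x0c) != 0xffff);
	/* Truncating the bad word yields the 0xff mentioned in the mail. */
	assert(truncate_to_u8(0xffff) == 0xff);
}
```

With the byte value 0x17 from the log, dbi_0x70f_value() gives 0xbf, so
bits [5:3] end up set, as they would be if 0xff were written back.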