Message ID | 20190318190557.21599-1-timschumi@gmx.de (mailing list archive) |
---|---|
State | Accepted |
Commit | 2f90c7e5d09437a4d8d5546feaae9f1cf48cfbe1 |
Delegated to: | Kalle Valo |
Headers | show |
Series | ath9k: Check for errors when reading SREV register | expand |
On 18/03/2019, Tim Schumacher <timschumi@gmx.de> wrote: > Right now, if an error is encountered during the SREV register > read (i.e. an EIO in ath9k_regread()), that error code gets > passed all the way to __ath9k_hw_init(), where it is visible > during the "Chip rev not supported" message. > > ath9k_htc 1-1.4:1.0: ath9k_htc: HTC initialized with 33 credits > ath: phy2: Mac Chip Rev 0x0f.3 is not supported by this driver > ath: phy2: Unable to initialize hardware; initialization status: -95 > ath: phy2: Unable to initialize hardware; initialization status: -95 > ath9k_htc: Failed to initialize the device > > Check for -EIO explicitly in ath9k_hw_read_revisions() and return > a boolean based on the success of the operation. Check for that in > __ath9k_hw_init() and abort with a more debugging-friendly message > if reading the revisions wasn't successful. > you are still passing it all the way from ath9k_hw_init > ath9k_htc 1-1.4:1.0: ath9k_htc: HTC initialized with 33 credits > ath: phy2: Failed to read SREV register > ath: phy2: Could not read hardware revision > ath: phy2: Unable to initialize hardware; initialization status: -95 > ath: phy2: Unable to initialize hardware; initialization status: -95 > ath9k_htc: Failed to initialize the device > > This helps when debugging by directly showing the first point of > failure and it could prevent possible errors if a 0x0f.3 revision > is ever supported. > I don't think this is required. Mac Chip Rev 0x0f.3 at least prints what the problem is instead of generic SREV read failure. Either wrong define in driver or address overlap caused by large/bad firmware in ath9k_htc case.
On 18.03.19 23:35, Tom Psyborg wrote: > On 18/03/2019, Tim Schumacher <timschumi@gmx.de> wrote: >> Right now, if an error is encountered during the SREV register >> read (i.e. an EIO in ath9k_regread()), that error code gets >> passed all the way to __ath9k_hw_init(), where it is visible >> during the "Chip rev not supported" message. >> >> ath9k_htc 1-1.4:1.0: ath9k_htc: HTC initialized with 33 credits >> ath: phy2: Mac Chip Rev 0x0f.3 is not supported by this driver >> ath: phy2: Unable to initialize hardware; initialization status: -95 >> ath: phy2: Unable to initialize hardware; initialization status: -95 >> ath9k_htc: Failed to initialize the device >> >> Check for -EIO explicitly in ath9k_hw_read_revisions() and return >> a boolean based on the success of the operation. Check for that in >> __ath9k_hw_init() and abort with a more debugging-friendly message >> if reading the revisions wasn't successful. >> > > you are still passing it all the way from ath9k_hw_init > I meant the actual error code (i.e. -EIO) there, which gets checked for right after the REG_READ() call in ath9k_hw_read_revisions(). If it's present, it immediately prints the "SREV register" message and returns false instead of true. >> ath9k_htc 1-1.4:1.0: ath9k_htc: HTC initialized with 33 credits >> ath: phy2: Failed to read SREV register >> ath: phy2: Could not read hardware revision >> ath: phy2: Unable to initialize hardware; initialization status: -95 >> ath: phy2: Unable to initialize hardware; initialization status: -95 >> ath9k_htc: Failed to initialize the device >> >> This helps when debugging by directly showing the first point of >> failure and it could prevent possible errors if a 0x0f.3 revision >> is ever supported. >> > > I don't think this is required. Mac Chip Rev 0x0f.3 at least prints > what the problem is instead of generic SREV read failure. The point of this patch is that the "Mac Chip Rev" (in this case) is a false message and that it actually _doesn't_ show what the problem is. There is no Rev that is 0x0f.3. It just falsely takes the -EIO from ath9k_regread() as a valid return value (since it is never caught earlier) and tries to extract a revision from it, which results in 0x0f.3. The case in where the revision succeeded to read, but it simply isn't supported by the driver, is untouched and it still prints the original message. > Either wrong > define in driver or address overlap caused by large/bad firmware in > ath9k_htc case. > I don't know why it fails to read the SREV register in my specific case (I tracked it down to a WMI command timeout, which seems to only happen on a Raspberry Pi 3), but having the SREV error message (which points to the actual issue) instead of the false-positive "Rev not supported" message would have saved me quite some time that I spent with debugging the issue, searching for the source of the wrong value.
On 19/03/2019, Tim Schumacher <timschumi@gmx.de> wrote: > > The case in where the revision succeeded to read, but it simply > isn't supported by the driver, is untouched and it still prints > the original message. > In that case this change is fine. > I don't know why it fails to read the SREV register in my specific > case (I tracked it down to a WMI command timeout, which seems to > only happen on a Raspberry Pi 3), but having the SREV error message > (which points to the actual issue) instead of the false-positive > "Rev not supported" message would have saved me quite some time that > I spent with debugging the issue, searching for the source of the > wrong value. > Did you try some other device than RPI3 ? I've noticed that on router and laptop while doing some htc fw builds.
On 20.03.19 00:38, Tom Psyborg wrote: > On 19/03/2019, Tim Schumacher <timschumi@gmx.de> wrote: >> >> The case in where the revision succeeded to read, but it simply >> isn't supported by the driver, is untouched and it still prints >> the original message. >> > > In that case this change is fine. > >> I don't know why it fails to read the SREV register in my specific >> case (I tracked it down to a WMI command timeout, which seems to >> only happen on a Raspberry Pi 3), but having the SREV error message >> (which points to the actual issue) instead of the false-positive >> "Rev not supported" message would have saved me quite some time that >> I spent with debugging the issue, searching for the source of the >> wrong value. >> > > Did you try some other device than RPI3 ? I've noticed that on router > and laptop while doing some htc fw builds. > I don't have that many other devices to test, but everything worked fine on both of my PCs and it was fine on the RPi3 that was running a raspbian-based kernel instead of mainline.
Tim Schumacher <timschumi@gmx.de> writes: > Right now, if an error is encountered during the SREV register > read (i.e. an EIO in ath9k_regread()), that error code gets > passed all the way to __ath9k_hw_init(), where it is visible > during the "Chip rev not supported" message. > > ath9k_htc 1-1.4:1.0: ath9k_htc: HTC initialized with 33 credits > ath: phy2: Mac Chip Rev 0x0f.3 is not supported by this driver > ath: phy2: Unable to initialize hardware; initialization status: -95 > ath: phy2: Unable to initialize hardware; initialization status: -95 > ath9k_htc: Failed to initialize the device > > Check for -EIO explicitly in ath9k_hw_read_revisions() and return > a boolean based on the success of the operation. Check for that in > __ath9k_hw_init() and abort with a more debugging-friendly message > if reading the revisions wasn't successful. > > ath9k_htc 1-1.4:1.0: ath9k_htc: HTC initialized with 33 credits > ath: phy2: Failed to read SREV register > ath: phy2: Could not read hardware revision > ath: phy2: Unable to initialize hardware; initialization status: -95 > ath: phy2: Unable to initialize hardware; initialization status: -95 > ath9k_htc: Failed to initialize the device > > This helps when debugging by directly showing the first point of > failure and it could prevent possible errors if a 0x0f.3 revision > is ever supported. > > Signed-off-by: Tim Schumacher <timschumi@gmx.de> [...] > - val = REG_READ(ah, AR_SREV) & AR_SREV_ID; > + srev = REG_READ(ah, AR_SREV); > + > + if (srev == -EIO) { > + ath_err(ath9k_hw_common(ah), > + "Failed to read SREV register"); > + return false; > + } I really don't like how the error handling is implemented in REG_READ(). If the register has value 0xfffffffb (= -EIO ==-5) won't this interpret that as an error?
On 21.03.19 11:02, Kalle Valo wrote: > Tim Schumacher <timschumi@gmx.de> writes: > >> Right now, if an error is encountered during the SREV register >> read (i.e. an EIO in ath9k_regread()), that error code gets >> passed all the way to __ath9k_hw_init(), where it is visible >> during the "Chip rev not supported" message. >> >> ath9k_htc 1-1.4:1.0: ath9k_htc: HTC initialized with 33 credits >> ath: phy2: Mac Chip Rev 0x0f.3 is not supported by this driver >> ath: phy2: Unable to initialize hardware; initialization status: -95 >> ath: phy2: Unable to initialize hardware; initialization status: -95 >> ath9k_htc: Failed to initialize the device >> >> Check for -EIO explicitly in ath9k_hw_read_revisions() and return >> a boolean based on the success of the operation. Check for that in >> __ath9k_hw_init() and abort with a more debugging-friendly message >> if reading the revisions wasn't successful. >> >> ath9k_htc 1-1.4:1.0: ath9k_htc: HTC initialized with 33 credits >> ath: phy2: Failed to read SREV register >> ath: phy2: Could not read hardware revision >> ath: phy2: Unable to initialize hardware; initialization status: -95 >> ath: phy2: Unable to initialize hardware; initialization status: -95 >> ath9k_htc: Failed to initialize the device >> >> This helps when debugging by directly showing the first point of >> failure and it could prevent possible errors if a 0x0f.3 revision >> is ever supported. >> >> Signed-off-by: Tim Schumacher <timschumi@gmx.de> > > [...] > >> - val = REG_READ(ah, AR_SREV) & AR_SREV_ID; >> + srev = REG_READ(ah, AR_SREV); >> + >> + if (srev == -EIO) { >> + ath_err(ath9k_hw_common(ah), >> + "Failed to read SREV register"); >> + return false; >> + } > > I really don't like how the error handling is implemented in REG_READ(). > If the register has value 0xfffffffb (= -EIO ==-5) won't this interpret > that as an error? > If the register had that value, it would indeed report an error. However (at least if I read the current code and the data sheet correctly), to make use of the higher 24 bits of the register, the "small"/old version of the SREV_ID (i.e. the rightmost 8 bit) need to be set to 0xFF. Therefore, a register read of 0xfffffffb should never happen in this register. But the error handling is indeed a bit weird. Making the return value a pure status indicator and saving the value from the register by passing a reference would probably be the best solution to fixing this up.
Tim Schumacher <timschumi@gmx.de> writes: > On 21.03.19 11:02, Kalle Valo wrote: >> Tim Schumacher <timschumi@gmx.de> writes: >> >>> - val = REG_READ(ah, AR_SREV) & AR_SREV_ID; >>> + srev = REG_READ(ah, AR_SREV); >>> + >>> + if (srev == -EIO) { >>> + ath_err(ath9k_hw_common(ah), >>> + "Failed to read SREV register"); >>> + return false; >>> + } >> >> I really don't like how the error handling is implemented in REG_READ(). >> If the register has value 0xfffffffb (= -EIO ==-5) won't this interpret >> that as an error? >> > > If the register had that value, it would indeed report an error. However > (at least if I read the current code and the data sheet correctly), to make > use of the higher 24 bits of the register, the "small"/old version of the > SREV_ID (i.e. the rightmost 8 bit) need to be set to 0xFF. Therefore, a > register read of 0xfffffffb should never happen in this register. Good, thanks for checking. > But the error handling is indeed a bit weird. Making the return value a pure > status indicator and saving the value from the register by passing a > reference would probably be the best solution to fixing this up. Yeah, that would be so much better. But that can fixed in another patch, no need to do that here.
Tim Schumacher <timschumi@gmx.de> wrote: > Right now, if an error is encountered during the SREV register > read (i.e. an EIO in ath9k_regread()), that error code gets > passed all the way to __ath9k_hw_init(), where it is visible > during the "Chip rev not supported" message. > > ath9k_htc 1-1.4:1.0: ath9k_htc: HTC initialized with 33 credits > ath: phy2: Mac Chip Rev 0x0f.3 is not supported by this driver > ath: phy2: Unable to initialize hardware; initialization status: -95 > ath: phy2: Unable to initialize hardware; initialization status: -95 > ath9k_htc: Failed to initialize the device > > Check for -EIO explicitly in ath9k_hw_read_revisions() and return > a boolean based on the success of the operation. Check for that in > __ath9k_hw_init() and abort with a more debugging-friendly message > if reading the revisions wasn't successful. > > ath9k_htc 1-1.4:1.0: ath9k_htc: HTC initialized with 33 credits > ath: phy2: Failed to read SREV register > ath: phy2: Could not read hardware revision > ath: phy2: Unable to initialize hardware; initialization status: -95 > ath: phy2: Unable to initialize hardware; initialization status: -95 > ath9k_htc: Failed to initialize the device > > This helps when debugging by directly showing the first point of > failure and it could prevent possible errors if a 0x0f.3 revision > is ever supported. > > Signed-off-by: Tim Schumacher <timschumi@gmx.de> > Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Patch applied to ath-next branch of ath.git, thanks. 2f90c7e5d094 ath9k: Check for errors when reading SREV register
diff --git a/drivers/net/wireless/ath/ath9k/hw.c b/drivers/net/wireless/ath/ath9k/hw.c index 8581d917635a..b6773d613f0c 100644 --- a/drivers/net/wireless/ath/ath9k/hw.c +++ b/drivers/net/wireless/ath/ath9k/hw.c @@ -252,8 +252,9 @@ void ath9k_hw_get_channel_centers(struct ath_hw *ah, /* Chip Revisions */ /******************/ -static void ath9k_hw_read_revisions(struct ath_hw *ah) +static bool ath9k_hw_read_revisions(struct ath_hw *ah) { + u32 srev; u32 val; if (ah->get_mac_revision) @@ -269,25 +270,33 @@ static void ath9k_hw_read_revisions(struct ath_hw *ah) val = REG_READ(ah, AR_SREV); ah->hw_version.macRev = MS(val, AR_SREV_REVISION2); } - return; + return true; case AR9300_DEVID_AR9340: ah->hw_version.macVersion = AR_SREV_VERSION_9340; - return; + return true; case AR9300_DEVID_QCA955X: ah->hw_version.macVersion = AR_SREV_VERSION_9550; - return; + return true; case AR9300_DEVID_AR953X: ah->hw_version.macVersion = AR_SREV_VERSION_9531; - return; + return true; case AR9300_DEVID_QCA956X: ah->hw_version.macVersion = AR_SREV_VERSION_9561; - return; + return true; } - val = REG_READ(ah, AR_SREV) & AR_SREV_ID; + srev = REG_READ(ah, AR_SREV); + + if (srev == -EIO) { + ath_err(ath9k_hw_common(ah), + "Failed to read SREV register"); + return false; + } + + val = srev & AR_SREV_ID; if (val == 0xFF) { - val = REG_READ(ah, AR_SREV); + val = srev; ah->hw_version.macVersion = (val & AR_SREV_VERSION2) >> AR_SREV_TYPE2_S; ah->hw_version.macRev = MS(val, AR_SREV_REVISION2); @@ -306,6 +315,8 @@ static void ath9k_hw_read_revisions(struct ath_hw *ah) if (ah->hw_version.macVersion == AR_SREV_VERSION_5416_PCIE) ah->is_pciexpress = true; } + + return true; } /************************************/ @@ -559,7 +570,10 @@ static int __ath9k_hw_init(struct ath_hw *ah) struct ath_common *common = ath9k_hw_common(ah); int r = 0; - ath9k_hw_read_revisions(ah); + if (!ath9k_hw_read_revisions(ah)) { + ath_err(common, "Could not read hardware revisions"); + return -EOPNOTSUPP; + } switch (ah->hw_version.macVersion) { case AR_SREV_VERSION_5416_PCI:
Right now, if an error is encountered during the SREV register read (i.e. an EIO in ath9k_regread()), that error code gets passed all the way to __ath9k_hw_init(), where it is visible during the "Chip rev not supported" message. ath9k_htc 1-1.4:1.0: ath9k_htc: HTC initialized with 33 credits ath: phy2: Mac Chip Rev 0x0f.3 is not supported by this driver ath: phy2: Unable to initialize hardware; initialization status: -95 ath: phy2: Unable to initialize hardware; initialization status: -95 ath9k_htc: Failed to initialize the device Check for -EIO explicitly in ath9k_hw_read_revisions() and return a boolean based on the success of the operation. Check for that in __ath9k_hw_init() and abort with a more debugging-friendly message if reading the revisions wasn't successful. ath9k_htc 1-1.4:1.0: ath9k_htc: HTC initialized with 33 credits ath: phy2: Failed to read SREV register ath: phy2: Could not read hardware revision ath: phy2: Unable to initialize hardware; initialization status: -95 ath: phy2: Unable to initialize hardware; initialization status: -95 ath9k_htc: Failed to initialize the device This helps when debugging by directly showing the first point of failure and it could prevent possible errors if a 0x0f.3 revision is ever supported. Signed-off-by: Tim Schumacher <timschumi@gmx.de> --- drivers/net/wireless/ath/ath9k/hw.c | 32 +++++++++++++++++++++-------- 1 file changed, 23 insertions(+), 9 deletions(-)