Message ID | 1553281120-22139-1-git-send-email-pozega.tomislav@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Kalle Valo |
Headers | show |
Series | ath10k: reset chip after supported check | expand |
On Friday, March 22, 2019 7:58:40 PM CET Tomislav Požega wrote: > When chip reset is done before the chip is checked if supported > there will be crash. Previous behaviour caused bootloops on > Archer C7 v1 units, this patch allows clean device boot without > excluding ath10k driver. > You need Fixes: 1a7fecb766c8 ("ath10k: reset chip before reading chip_id in probe") too > Signed-off-by: Tomislav Požega <pozega.tomislav@gmail.com> > --- > drivers/net/wireless/ath/ath10k/pci.c | 12 ++++++------ > 1 files changed, 6 insertions(+), 6 deletions(-) > > diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c > index e24403c..ec681da 100644 > --- a/drivers/net/wireless/ath/ath10k/pci.c > +++ b/drivers/net/wireless/ath/ath10k/pci.c > @@ -3619,12 +3619,6 @@ static int ath10k_pci_probe(struct pci_dev *pdev, > goto err_deinit_irq; > } > > - ret = ath10k_pci_chip_reset(ar); > - if (ret) { > - ath10k_err(ar, "failed to reset chip: %d\n", ret); > - goto err_free_irq; > - } > - > bus_params.dev_type = ATH10K_DEV_TYPE_LL; > bus_params.link_can_suspend = true; > bus_params.chip_id = ath10k_pci_soc_read32(ar, SOC_CHIP_ID_ADDRESS); > @@ -3639,6 +3633,12 @@ static int ath10k_pci_probe(struct pci_dev *pdev, > goto err_free_irq; > } > > + ret = ath10k_pci_chip_reset(ar); > + if (ret) { > + ath10k_err(ar, "failed to reset chip: %d\n", ret); > + goto err_free_irq; > + } > + > ret = ath10k_core_register(ar, &bus_params); > if (ret) { > ath10k_err(ar, "failed to register driver core: %d\n", ret); >
+ Michał On 3/22/2019 8:25 PM, Christian Lamparter wrote: > On Friday, March 22, 2019 7:58:40 PM CET Tomislav Požega wrote: >> When chip reset is done before the chip is checked if supported >> there will be crash. Previous behaviour caused bootloops on >> Archer C7 v1 units, this patch allows clean device boot without >> excluding ath10k driver. >> > You need > > Fixes: 1a7fecb766c8 ("ath10k: reset chip before reading chip_id in probe") > > too Looking at the commit subject makes me suspicious whether this is a proper fix. >> Signed-off-by: Tomislav Požega <pozega.tomislav@gmail.com> >> --- >> drivers/net/wireless/ath/ath10k/pci.c | 12 ++++++------ >> 1 files changed, 6 insertions(+), 6 deletions(-) >> >> diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c >> index e24403c..ec681da 100644 >> --- a/drivers/net/wireless/ath/ath10k/pci.c >> +++ b/drivers/net/wireless/ath/ath10k/pci.c >> @@ -3619,12 +3619,6 @@ static int ath10k_pci_probe(struct pci_dev *pdev, >> goto err_deinit_irq; >> } >> >> - ret = ath10k_pci_chip_reset(ar); >> - if (ret) { >> - ath10k_err(ar, "failed to reset chip: %d\n", ret); >> - goto err_free_irq; >> - } >> - >> bus_params.dev_type = ATH10K_DEV_TYPE_LL; >> bus_params.link_can_suspend = true; >> bus_params.chip_id = ath10k_pci_soc_read32(ar, SOC_CHIP_ID_ADDRESS); It seems to me the chip reset was done explicitly *before* reading the chipid for a reason. """ ath10k: reset chip before reading chip_id in probe There are some very rare cases with some hardware configuration that the device doesn't init quickly enough in which case reading chip_id yielded 0. This caused driver to subsequently fail to setup the device. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com> """ Might be the ath10k_pci_chip_reset() function needs to be modified to work properly for Archer C7 v1 units. Regards, Arend
* resending with corrected email address from Kalle -------------------------------------------------------------------- + Michał On 3/22/2019 8:25 PM, Christian Lamparter wrote: > On Friday, March 22, 2019 7:58:40 PM CET Tomislav Požega wrote: >> When chip reset is done before the chip is checked if supported >> there will be crash. Previous behaviour caused bootloops on >> Archer C7 v1 units, this patch allows clean device boot without >> excluding ath10k driver. >> > You need > > Fixes: 1a7fecb766c8 ("ath10k: reset chip before reading chip_id in probe") > > too Looking at the commit subject makes me suspicious whether this is a proper fix. >> Signed-off-by: Tomislav Požega <pozega.tomislav@gmail.com> >> --- >> drivers/net/wireless/ath/ath10k/pci.c | 12 ++++++------ >> 1 files changed, 6 insertions(+), 6 deletions(-) >> >> diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c >> index e24403c..ec681da 100644 >> --- a/drivers/net/wireless/ath/ath10k/pci.c >> +++ b/drivers/net/wireless/ath/ath10k/pci.c >> @@ -3619,12 +3619,6 @@ static int ath10k_pci_probe(struct pci_dev *pdev, >> goto err_deinit_irq; >> } >> >> - ret = ath10k_pci_chip_reset(ar); >> - if (ret) { >> - ath10k_err(ar, "failed to reset chip: %d\n", ret); >> - goto err_free_irq; >> - } >> - >> bus_params.dev_type = ATH10K_DEV_TYPE_LL; >> bus_params.link_can_suspend = true; >> bus_params.chip_id = ath10k_pci_soc_read32(ar, SOC_CHIP_ID_ADDRESS); It seems to me the chip reset was done explicitly *before* reading the chipid for a reason. """ ath10k: reset chip before reading chip_id in probe There are some very rare cases with some hardware configuration that the device doesn't init quickly enough in which case reading chip_id yielded 0. This caused driver to subsequently fail to setup the device. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com> """ Might be the ath10k_pci_chip_reset() function needs to be modified to work properly for Archer C7 v1 units. Regards, Arend
On Sat, 23 Mar 2019 at 08:20, Arend Van Spriel <arend.vanspriel@broadcom.com> wrote: > > * resending with corrected email address from Kalle > -------------------------------------------------------------------- > + Michał Thanks! > On 3/22/2019 8:25 PM, Christian Lamparter wrote: > > On Friday, March 22, 2019 7:58:40 PM CET Tomislav Požega wrote: > >> When chip reset is done before the chip is checked if supported > >> there will be crash. Previous behaviour caused bootloops on > >> Archer C7 v1 units, this patch allows clean device boot without > >> excluding ath10k driver. Can you elaborate more a bit? What kind of crashes are you seeing? What does the bootloop look like? Do you have uart connected to diagnose? Didn't C7 v1 have the old QCA9880 hw v1 which isn't really supported by ath10k? I recall the v1 chip was really buggy and required hammering registers sometimes to get things working. [...] > Looking at the commit subject makes me suspicious whether this is a > proper fix. [...] > It seems to me the chip reset was done explicitly *before* reading the > chipid for a reason. > > """ > ath10k: reset chip before reading chip_id in probe > There are some very rare cases with some hardware > configuration that the device doesn't init quickly > enough in which case reading chip_id yielded 0. > This caused driver to subsequently fail to setup > the device. > > Signed-off-by: Michal Kazior <michal.kazior@tieto.com> > Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com> > """ Good catch, Arend! > Might be the ath10k_pci_chip_reset() function needs to be modified to > work properly for Archer C7 v1 units. When driver boots up hardware is in an unknown state. It could've been power cycled or it could've remained powered via pcie aux while host cpu went through a warm reset (eg. watchdog). For what it's worth it could still be beaconing if that is what it was doing before host cpu reset. Therefore the device can be in a messy state and it should be reset before talking to it or else you risk reading garbage. I don't recall details about this particular commit but given the timeline of commits at the time this change suggests this was related to supporting qca61x4. In other words your change risks breaking some qca61x4 chips. Michał
On 3/25/19 5:14 AM, Michał Kazior wrote: > On Sat, 23 Mar 2019 at 08:20, Arend Van Spriel > <arend.vanspriel@broadcom.com> wrote: >> >> * resending with corrected email address from Kalle >> -------------------------------------------------------------------- >> + Michał > > Thanks! > > >> On 3/22/2019 8:25 PM, Christian Lamparter wrote: >> > On Friday, March 22, 2019 7:58:40 PM CET Tomislav Požega wrote: >> >> When chip reset is done before the chip is checked if supported >> >> there will be crash. Previous behaviour caused bootloops on >> >> Archer C7 v1 units, this patch allows clean device boot without >> >> excluding ath10k driver. > > Can you elaborate more a bit? What kind of crashes are you seeing? > What does the bootloop look like? Do you have uart connected to > diagnose? > > Didn't C7 v1 have the old QCA9880 hw v1 which isn't really supported > by ath10k? I recall the v1 chip was really buggy and required > hammering registers sometimes to get things working. The crash is related to the v1 chip. Is there a good way to detect that this is the chip in question and only apply this work-around for the problem chip? Thanks, Ben
On Mon, 25 Mar 2019 at 16:55, Ben Greear <greearb@candelatech.com> wrote: > On 3/25/19 5:14 AM, Michał Kazior wrote: > > On Sat, 23 Mar 2019 at 08:20, Arend Van Spriel > > <arend.vanspriel@broadcom.com> wrote: > >> > >> * resending with corrected email address from Kalle > >> -------------------------------------------------------------------- > >> + Michał > > > > Thanks! > > > > > >> On 3/22/2019 8:25 PM, Christian Lamparter wrote: > >> > On Friday, March 22, 2019 7:58:40 PM CET Tomislav Požega wrote: > >> >> When chip reset is done before the chip is checked if supported > >> >> there will be crash. Previous behaviour caused bootloops on > >> >> Archer C7 v1 units, this patch allows clean device boot without > >> >> excluding ath10k driver. > > > > Can you elaborate more a bit? What kind of crashes are you seeing? > > What does the bootloop look like? Do you have uart connected to > > diagnose? > > > > Didn't C7 v1 have the old QCA9880 hw v1 which isn't really supported > > by ath10k? I recall the v1 chip was really buggy and required > > hammering registers sometimes to get things working. > > The crash is related to the v1 chip. Is there a good way to detect > that this is the chip in question and only apply this work-around > for the problem chip? I don't know of any good way to do that. You could consider device-tree but that would be no different from having a module blacklist in the C7v1 build recipe, or to not build the module at all. That is unless you actually want to make v1 chip work with ath10k at which point there's more fun to be had before it can actually work. Michał
On 3/25/19 1:08 PM, Michał Kazior wrote: > On Mon, 25 Mar 2019 at 16:55, Ben Greear <greearb@candelatech.com> wrote: >> On 3/25/19 5:14 AM, Michał Kazior wrote: >>> On Sat, 23 Mar 2019 at 08:20, Arend Van Spriel >>> <arend.vanspriel@broadcom.com> wrote: >>>> >>>> * resending with corrected email address from Kalle >>>> -------------------------------------------------------------------- >>>> + Michał >>> >>> Thanks! >>> >>> >>>> On 3/22/2019 8:25 PM, Christian Lamparter wrote: >>>> > On Friday, March 22, 2019 7:58:40 PM CET Tomislav Požega wrote: >>>> >> When chip reset is done before the chip is checked if supported >>>> >> there will be crash. Previous behaviour caused bootloops on >>>> >> Archer C7 v1 units, this patch allows clean device boot without >>>> >> excluding ath10k driver. >>> >>> Can you elaborate more a bit? What kind of crashes are you seeing? >>> What does the bootloop look like? Do you have uart connected to >>> diagnose? >>> >>> Didn't C7 v1 have the old QCA9880 hw v1 which isn't really supported >>> by ath10k? I recall the v1 chip was really buggy and required >>> hammering registers sometimes to get things working. >> >> The crash is related to the v1 chip. Is there a good way to detect >> that this is the chip in question and only apply this work-around >> for the problem chip? > > I don't know of any good way to do that. > > You could consider device-tree but that would be no different from > having a module blacklist in the C7v1 build recipe, or to not build > the module at all. That is unless you actually want to make v1 chip > work with ath10k at which point there's more fun to be had before it > can actually work. I remember v1, and I have no interest in trying to make it work :) If we could blacklist certain pci slots in the ath10k driver, I guess that would work? I think the goal is to not use the v1 chip, but allow users to add a v2 NIC to the platform, so driver still needs to load. Thanks, Ben
On 03/25/2019 02:34 PM, Michał Kazior wrote: > On Mon, 25 Mar 2019 at 21:23, Ben Greear <greearb@candelatech.com> wrote: >> >> On 3/25/19 1:08 PM, Michał Kazior wrote: >>> On Mon, 25 Mar 2019 at 16:55, Ben Greear <greearb@candelatech.com> wrote: >>>> On 3/25/19 5:14 AM, Michał Kazior wrote: >>>>> On Sat, 23 Mar 2019 at 08:20, Arend Van Spriel >>>>> <arend.vanspriel@broadcom.com> wrote: >>>>>> >>>>>> * resending with corrected email address from Kalle >>>>>> -------------------------------------------------------------------- >>>>>> + Michał >>>>> >>>>> Thanks! >>>>> >>>>> >>>>>> On 3/22/2019 8:25 PM, Christian Lamparter wrote: >>>>>> > On Friday, March 22, 2019 7:58:40 PM CET Tomislav Požega wrote: >>>>>> >> When chip reset is done before the chip is checked if supported >>>>>> >> there will be crash. Previous behaviour caused bootloops on >>>>>> >> Archer C7 v1 units, this patch allows clean device boot without >>>>>> >> excluding ath10k driver. >>>>> >>>>> Can you elaborate more a bit? What kind of crashes are you seeing? >>>>> What does the bootloop look like? Do you have uart connected to >>>>> diagnose? >>>>> >>>>> Didn't C7 v1 have the old QCA9880 hw v1 which isn't really supported >>>>> by ath10k? I recall the v1 chip was really buggy and required >>>>> hammering registers sometimes to get things working. >>>> >>>> The crash is related to the v1 chip. Is there a good way to detect >>>> that this is the chip in question and only apply this work-around >>>> for the problem chip? >>> >>> I don't know of any good way to do that. >>> >>> You could consider device-tree but that would be no different from >>> having a module blacklist in the C7v1 build recipe, or to not build >>> the module at all. That is unless you actually want to make v1 chip >>> work with ath10k at which point there's more fun to be had before it >>> can actually work. >> >> I remember v1, and I have no interest in trying to make it work :) >> >> If we could blacklist certain pci slots in the ath10k driver, I guess >> that would work? >> >> I think the goal is to not use the v1 chip, but allow users to add a >> v2 NIC to the platform, so driver still needs to load. > > That makes sense, but I don't see how blacklisting pci slots would > help someone putting v2 nic into C7v1 mobo? Won't the slot be the same > regardless what nic is put? I'm not sure about that...maybe let OpenWRT boot by default assuming the slot is blacklisted, and then disable the blacklist if known to be a v2 NIC. If your patch below works, that looks a lot better though. Hopefully someone with that v1 board can test it. Thanks, Ben > > The best thing I can come up with is something like this: > > --- a/drivers/net/wireless/ath/ath10k/pci.c > +++ b/drivers/net/wireless/ath/ath10k/pci.c > @@ -3629,6 +3629,19 @@ static int ath10k_pci_probe(struct pci_dev *pdev, > goto err_deinit_irq; > } > > + if (hw_rev == ATH10K_HW_QCA988X) { > + /* v1 can crash the system on chip_reset() > + * so all we can do is keep our fingers > + * crossed v2 never reports 0 without a > + * chip_reset() > + */ > + if (ath10k_pci_soc_read32(ar, SOC_CHIP_ID_ADDRESS) == 0) { > + ath10k_err(ar, "qca9880 v1 is chip not supported"); > + ret = -ENOTSUP; > + goto err_free_irq; > + } > + } > + > ret = ath10k_pci_chip_reset(ar); > if (ret) { > ath10k_err(ar, "failed to reset chip: %d\n", ret); > > I didn't test it. Someone needs to compile and test and make sure v2 > doesn't regress when fw hangs and cold & warm host cpu resets are > mixed in. > > > Michał >
On Mon, 25 Mar 2019 at 21:23, Ben Greear <greearb@candelatech.com> wrote: > > On 3/25/19 1:08 PM, Michał Kazior wrote: > > On Mon, 25 Mar 2019 at 16:55, Ben Greear <greearb@candelatech.com> wrote: > >> On 3/25/19 5:14 AM, Michał Kazior wrote: > >>> On Sat, 23 Mar 2019 at 08:20, Arend Van Spriel > >>> <arend.vanspriel@broadcom.com> wrote: > >>>> > >>>> * resending with corrected email address from Kalle > >>>> -------------------------------------------------------------------- > >>>> + Michał > >>> > >>> Thanks! > >>> > >>> > >>>> On 3/22/2019 8:25 PM, Christian Lamparter wrote: > >>>> > On Friday, March 22, 2019 7:58:40 PM CET Tomislav Požega wrote: > >>>> >> When chip reset is done before the chip is checked if supported > >>>> >> there will be crash. Previous behaviour caused bootloops on > >>>> >> Archer C7 v1 units, this patch allows clean device boot without > >>>> >> excluding ath10k driver. > >>> > >>> Can you elaborate more a bit? What kind of crashes are you seeing? > >>> What does the bootloop look like? Do you have uart connected to > >>> diagnose? > >>> > >>> Didn't C7 v1 have the old QCA9880 hw v1 which isn't really supported > >>> by ath10k? I recall the v1 chip was really buggy and required > >>> hammering registers sometimes to get things working. > >> > >> The crash is related to the v1 chip. Is there a good way to detect > >> that this is the chip in question and only apply this work-around > >> for the problem chip? > > > > I don't know of any good way to do that. > > > > You could consider device-tree but that would be no different from > > having a module blacklist in the C7v1 build recipe, or to not build > > the module at all. That is unless you actually want to make v1 chip > > work with ath10k at which point there's more fun to be had before it > > can actually work. > > I remember v1, and I have no interest in trying to make it work :) > > If we could blacklist certain pci slots in the ath10k driver, I guess > that would work? > > I think the goal is to not use the v1 chip, but allow users to add a > v2 NIC to the platform, so driver still needs to load. That makes sense, but I don't see how blacklisting pci slots would help someone putting v2 nic into C7v1 mobo? Won't the slot be the same regardless what nic is put? The best thing I can come up with is something like this: --- a/drivers/net/wireless/ath/ath10k/pci.c +++ b/drivers/net/wireless/ath/ath10k/pci.c @@ -3629,6 +3629,19 @@ static int ath10k_pci_probe(struct pci_dev *pdev, goto err_deinit_irq; } + if (hw_rev == ATH10K_HW_QCA988X) { + /* v1 can crash the system on chip_reset() + * so all we can do is keep our fingers + * crossed v2 never reports 0 without a + * chip_reset() + */ + if (ath10k_pci_soc_read32(ar, SOC_CHIP_ID_ADDRESS) == 0) { + ath10k_err(ar, "qca9880 v1 is chip not supported"); + ret = -ENOTSUP; + goto err_free_irq; + } + } + ret = ath10k_pci_chip_reset(ar); if (ret) { ath10k_err(ar, "failed to reset chip: %d\n", ret); I didn't test it. Someone needs to compile and test and make sure v2 doesn't regress when fw hangs and cold & warm host cpu resets are mixed in. Michał
These replies never reached my inbox. Anyway scraped conversation off the mailing list archive and CCing relevant people. >> >>> Can you elaborate more a bit? What kind of crashes are you seeing? >> >>> What does the bootloop look like? Do you have uart connected to >> >>> diagnose? >> >>> >> >>> Didn't C7 v1 have the old QCA9880 hw v1 which isn't really supported >> >>> by ath10k? I recall the v1 chip was really buggy and required >> >>> hammering registers sometimes to get things working. >> >> >> >> The crash is related to the v1 chip. Is there a good way to detect >> >> that this is the chip in question and only apply this work-around >> >> for the problem chip? yes there is. but the provided solution should be enough. which device really depend on reset to probe sueccesfully? that device need other fixes! >> > >> > I don't know of any good way to do that. >> > >> > You could consider device-tree but that would be no different from >> > having a module blacklist in the C7v1 build recipe, or to not build >> > the module at all. That is unless you actually want to make v1 chip >> > work with ath10k at which point there's more fun to be had before it >> > can actually work. this is also intention. can it work with v2.0 fw? >> >> I remember v1, and I have no interest in trying to make it work :) >> >> If we could blacklist certain pci slots in the ath10k driver, I guess >> that would work? >> >> I think the goal is to not use the v1 chip, but allow users to add a >> v2 NIC to the platform, so driver still needs to load. > >That makes sense, but I don't see how blacklisting pci slots would >help someone putting v2 nic into C7v1 mobo? Won't the slot be the same >regardless what nic is put? > >The best thing I can come up with is something like this: > >--- a/drivers/net/wireless/ath/ath10k/pci.c >+++ b/drivers/net/wireless/ath/ath10k/pci.c >@@ -3629,6 +3629,19 @@ static int ath10k_pci_probe(struct pci_dev *pdev, > goto err_deinit_irq; > } > >+ if (hw_rev == ATH10K_HW_QCA988X) { >+ /* v1 can crash the system on chip_reset() >+ * so all we can do is keep our fingers >+ * crossed v2 never reports 0 without a >+ * chip_reset() >+ */ >+ if (ath10k_pci_soc_read32(ar, SOC_CHIP_ID_ADDRESS) == 0) { >+ ath10k_err(ar, "qca9880 v1 is chip not supported"); >+ ret = -ENOTSUP; >+ goto err_free_irq; >+ } >+ } >+ > ret = ath10k_pci_chip_reset(ar); > if (ret) { > ath10k_err(ar, "failed to reset chip: %d\n", ret); > >I didn't test it. Someone needs to compile and test and make sure v2 >doesn't regress when fw hangs and cold & warm host cpu resets are >mixed in. > > >Michał this doesn't even fix the bootloop with v1
diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c index e24403c..ec681da 100644 --- a/drivers/net/wireless/ath/ath10k/pci.c +++ b/drivers/net/wireless/ath/ath10k/pci.c @@ -3619,12 +3619,6 @@ static int ath10k_pci_probe(struct pci_dev *pdev, goto err_deinit_irq; } - ret = ath10k_pci_chip_reset(ar); - if (ret) { - ath10k_err(ar, "failed to reset chip: %d\n", ret); - goto err_free_irq; - } - bus_params.dev_type = ATH10K_DEV_TYPE_LL; bus_params.link_can_suspend = true; bus_params.chip_id = ath10k_pci_soc_read32(ar, SOC_CHIP_ID_ADDRESS); @@ -3639,6 +3633,12 @@ static int ath10k_pci_probe(struct pci_dev *pdev, goto err_free_irq; } + ret = ath10k_pci_chip_reset(ar); + if (ret) { + ath10k_err(ar, "failed to reset chip: %d\n", ret); + goto err_free_irq; + } + ret = ath10k_core_register(ar, &bus_params); if (ret) { ath10k_err(ar, "failed to register driver core: %d\n", ret);
When chip reset is done before the chip is checked if supported there will be crash. Previous behaviour caused bootloops on Archer C7 v1 units, this patch allows clean device boot without excluding ath10k driver. Signed-off-by: Tomislav Požega <pozega.tomislav@gmail.com> --- drivers/net/wireless/ath/ath10k/pci.c | 12 ++++++------ 1 files changed, 6 insertions(+), 6 deletions(-)