
ath10k: Add interrupt summary based CE processing

Message ID 1593193967-29897-1-git-send-email-pillair@codeaurora.org (mailing list archive)
State Accepted
Commit b92aba35d39d10d8a6bdf2495172fd490c598b4a
Delegated to: Kalle Valo

Commit Message

Rakesh Pillai June 26, 2020, 5:52 p.m. UTC
Currently the NAPI processing loops through all
the copy engines and processes a particular copy
engine if the copy completion is set for that copy
engine. The host driver is not supposed to access
any copy engine register after clearing the interrupt
status register.

This might result in a kernel crash like the one below:
[ 1159.220143] Call trace:
[ 1159.220170]  ath10k_snoc_read32+0x20/0x40 [ath10k_snoc]
[ 1159.220193]  ath10k_ce_per_engine_service_any+0x78/0x130 [ath10k_core]
[ 1159.220203]  ath10k_snoc_napi_poll+0x38/0x8c [ath10k_snoc]
[ 1159.220270]  net_rx_action+0x100/0x3b0
[ 1159.220312]  __do_softirq+0x164/0x30c
[ 1159.220345]  run_ksoftirqd+0x2c/0x64
[ 1159.220380]  smpboot_thread_fn+0x1b0/0x288
[ 1159.220405]  kthread+0x11c/0x12c
[ 1159.220423]  ret_from_fork+0x10/0x18

To avoid such a scenario, we generate an interrupt
summary by reading the copy completion for all the
copy engines before actually processing any of them.
This will avoid reading the interrupt status register
for any CE after the interrupt status is cleared.

Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.3.1-01040-QCAHLSWMTPLZ-1

Signed-off-by: Rakesh Pillai <pillair@codeaurora.org>
---
 drivers/net/wireless/ath/ath10k/ce.c | 63 ++++++++++++++++++++++--------------
 drivers/net/wireless/ath/ath10k/ce.h |  5 +--
 2 files changed, 42 insertions(+), 26 deletions(-)
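
The two-pass idea described in the commit message can be sketched in plain C as below. This is a minimal simulation under stated assumptions, not the code from ce.c: the register array `sim_status`, the constants `CE_COUNT` and `CC_MASK`, and the helpers `read_status`, `build_irq_summary` and `service_from_summary` are all hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define CE_COUNT 12      /* assumption: number of copy engines */
#define CC_MASK  0x1u    /* assumption: "copy complete" status bit */

static uint32_t sim_status[CE_COUNT]; /* stand-in for per-CE status registers */
static int status_reads;              /* counts simulated register reads */
static int processed[CE_COUNT];       /* how often each CE was serviced */

/* Simulated register read; a real driver would do an MMIO read here. */
static uint32_t read_status(int id)
{
	assert(id >= 0 && id < CE_COUNT);
	status_reads++;
	return sim_status[id];
}

/* Pass 1: read each CE's status exactly once and fold it into a bitmap. */
static uint32_t build_irq_summary(void)
{
	uint32_t summary = 0;

	for (int id = 0; id < CE_COUNT; id++)
		if (read_status(id) & CC_MASK)
			summary |= 1u << id;
	return summary;
}

/* Pass 2: service only the CEs flagged in the summary. The status is
 * cleared with a write; no status register is read in this pass. */
static int service_from_summary(uint32_t summary)
{
	int serviced = 0;

	for (int id = 0; id < CE_COUNT; id++) {
		if (!(summary & (1u << id)))
			continue;
		sim_status[id] &= ~CC_MASK;
		processed[id]++;
		serviced++;
	}
	return serviced;
}
```

Calling `build_irq_summary()` once and handing the result to `service_from_summary()` services every engine that had a completion pending when the snapshot was taken. A completion that shows up afterwards is simply left for the next poll, so the second pass never needs to read a status register.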

Comments

Douglas Anderson June 26, 2020, 9:37 p.m. UTC | #1
Hi,

On Fri, Jun 26, 2020 at 10:53 AM Rakesh Pillai <pillair@codeaurora.org> wrote:
>
> Currently the NAPI processing loops through all
> the copy engines and processes a particular copy
> engine if the copy completion is set for that copy
> engine. The host driver is not supposed to access
> any copy engine register after clearing the interrupt
> status register.
>
> This might result in a kernel crash like the one below:
> [ 1159.220143] Call trace:
> [ 1159.220170]  ath10k_snoc_read32+0x20/0x40 [ath10k_snoc]
> [ 1159.220193]  ath10k_ce_per_engine_service_any+0x78/0x130 [ath10k_core]
> [ 1159.220203]  ath10k_snoc_napi_poll+0x38/0x8c [ath10k_snoc]
> [ 1159.220270]  net_rx_action+0x100/0x3b0
> [ 1159.220312]  __do_softirq+0x164/0x30c
> [ 1159.220345]  run_ksoftirqd+0x2c/0x64
> [ 1159.220380]  smpboot_thread_fn+0x1b0/0x288
> [ 1159.220405]  kthread+0x11c/0x12c
> [ 1159.220423]  ret_from_fork+0x10/0x18
>
> To avoid such a scenario, we generate an interrupt
> summary by reading the copy completion for all the
> copy engine before actually processing any of them.
> This will avoid reading the interrupt status register
> for any CE after the interrupt status is cleared.
>
> Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.3.1-01040-QCAHLSWMTPLZ-1
>
> Signed-off-by: Rakesh Pillai <pillair@codeaurora.org>
> ---
>  drivers/net/wireless/ath/ath10k/ce.c | 63 ++++++++++++++++++++++--------------
>  drivers/net/wireless/ath/ath10k/ce.h |  5 +--
>  2 files changed, 42 insertions(+), 26 deletions(-)

I'm not an expert on this driver, but your change seems sane to me.

Reviewed-by: Douglas Anderson <dianders@chromium.org>

With your patch I can no longer find a place to put in a magic delay
and reproduce the crash, thus:

Tested-by: Douglas Anderson <dianders@chromium.org>


If it matters, my WiFi firmware reports this:

WLAN.HL.3.2.2-00490-QCAHLSWMTPL-1

...and it should also be WCN3990.


-Doug
Douglas Anderson June 26, 2020, 9:49 p.m. UTC | #2
Hi,

On Fri, Jun 26, 2020 at 2:37 PM Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Fri, Jun 26, 2020 at 10:53 AM Rakesh Pillai <pillair@codeaurora.org> wrote:
> >
> > Currently the NAPI processing loops through all
> > the copy engines and processes a particular copy
> > engine if the copy completion is set for that copy
> > engine. The host driver is not supposed to access
> > any copy engine register after clearing the interrupt
> > status register.
> >
> > This might result in a kernel crash like the one below:
> > [ 1159.220143] Call trace:
> > [ 1159.220170]  ath10k_snoc_read32+0x20/0x40 [ath10k_snoc]
> > [ 1159.220193]  ath10k_ce_per_engine_service_any+0x78/0x130 [ath10k_core]
> > [ 1159.220203]  ath10k_snoc_napi_poll+0x38/0x8c [ath10k_snoc]
> > [ 1159.220270]  net_rx_action+0x100/0x3b0
> > [ 1159.220312]  __do_softirq+0x164/0x30c
> > [ 1159.220345]  run_ksoftirqd+0x2c/0x64
> > [ 1159.220380]  smpboot_thread_fn+0x1b0/0x288
> > [ 1159.220405]  kthread+0x11c/0x12c
> > [ 1159.220423]  ret_from_fork+0x10/0x18
> >
> > To avoid such a scenario, we generate an interrupt
> > summary by reading the copy completion for all the
> > copy engine before actually processing any of them.
> > This will avoid reading the interrupt status register
> > for any CE after the interrupt status is cleared.
> >
> > Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.3.1-01040-QCAHLSWMTPLZ-1
> >
> > Signed-off-by: Rakesh Pillai <pillair@codeaurora.org>
> > ---
> >  drivers/net/wireless/ath/ath10k/ce.c | 63 ++++++++++++++++++++++--------------
> >  drivers/net/wireless/ath/ath10k/ce.h |  5 +--
> >  2 files changed, 42 insertions(+), 26 deletions(-)
>
> I'm not an expert on this driver, but your change seems sane to me.
>
> Reviewed-by: Douglas Anderson <dianders@chromium.org>
>
> With your patch I can no longer find a place to put in a magic delay
> and reproduce the crash, thus:
>
> Tested-by: Douglas Anderson <dianders@chromium.org>
>
>
> If it matters, my WiFi firmware reports this:
>
> WLAN.HL.3.2.2-00490-QCAHLSWMTPL-1
>
> ...and it should also be WCN3990.

I should also note that, while I'm not terribly familiar with Kalle's
workflow, I would have expected to see him in the "To:" list.  I've
added him, but it's possible he'll need you to repost the patch with
him in the "To:" list.

-Doug
Brian Norris June 26, 2020, 9:52 p.m. UTC | #3
On Fri, Jun 26, 2020 at 2:49 PM Doug Anderson <dianders@chromium.org> wrote:
> I should also note that, while I'm not terribly familiar with Kalle's
> workflow, I would have expected to see him in the "To:" list.  I've
> added him, but it's possible he'll need you to repost the patch with
> him in the "To:" list.

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches#who_to_address
https://wireless.wiki.kernel.org/en/users/drivers/ath10k/submittingpatches

Patchwork is his patch queue, so I don't think you need to address him directly.

Brian
Kalle Valo July 16, 2020, 6:38 a.m. UTC | #4
Brian Norris <briannorris@chromium.org> writes:

> On Fri, Jun 26, 2020 at 2:49 PM Doug Anderson <dianders@chromium.org> wrote:
>> I should also note that, while I'm not terribly familiar with Kalle's
>> workflow, I would have expected to see him in the "To:" list.  I've
>> added him, but it's possible he'll need you to repost the patch with
>> him in the "To:" list.
>
> https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches#who_to_address
> https://wireless.wiki.kernel.org/en/users/drivers/ath10k/submittingpatches
>
> Patchwork is his patch queue, so I don't think you need to address him directly.

Yup, I take all patches from patchwork so no need to Cc me.
Peter Oh July 21, 2020, 12:33 a.m. UTC | #5
I'm getting this panic on an IPQ4019 system after cherry-picking this
single patch on top of a working system.

[   14.226184] ath10k_ahb a000000.wifi: failed to receive initialized event from target: 80000000
[   14.326406] !#%&PageFault P<find_dr+0x28/0x64> L<devres_remove+0x38/0x70> F<005> [00000008]
[   14.326447] Unable to handle kernel NULL pointer dereference at virtual address 00000008
[   14.333569] pgd = 80cac000
[   14.341892] [00000008] *pgd=00000000
[   14.347804] !#%&Abort P<find_dr+0x28/0x64> L<devres_remove+0x38/0x70> F<005> FILE<PageFault>
[   14.348067] Internal error: PageFault: 5 [#1] PREEMPT SMP ARM
[   14.356568] Modules linked in: ath10k_pci(+) ecm shortcut_fe_drv shortcut_fe ath10k_core ath mac80211 cfg80211 compat
[   14.372537] CPU: 3 PID: 301 Comm: systemd-modules Not tainted 4.4.60-yocto-standard-eero #1
[   14.372805] Hardware name: Qualcomm (Flattened Device Tree)
[   14.380961] task: 9b492300 ti: 9d3f0000 task.ti: 9d3f0000
[   14.386516] PC is at find_dr+0x28/0x64
[   14.392069] LR is at devres_remove+0x38/0x70
[   14.395720] pc : [<804aa498>]    lr : [<804aa564>]    psr: 00010193
[   14.395720] sp : 9d3f7cc8  ip : 00000000  fp : 7f18b380
[   14.400155] r10: 9d995610  r9 : 9b73db64  r8 : 9b740b00
[   14.411343] r7 : 80430990  r6 : 8043097c  r5 : 9b73da10  r4 : 00000000
[   14.416554] r3 : 9b740b00  r2 : 8043097c  r1 : 80430990  r0 : 9b73da10
[   14.423153] Flags: nzcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   14.429663] Control: 10c5387d  Table: 80cac06a  DAC: 00000055
[   14.436865] Process systemd-modules (pid: 301, stack limit = 0x9d3f0210)
[   14.442683] Stack: (0x9d3f7cc8 to 0x9d3f8000)
[   14.449455] 7cc0:                   9b73da10 a0010113 80430990 8043097c 9b740b00 9b73db60
[   14.453716] 7ce0: 9d995610 804aa564 9b740b00 9dbd1f20 9d995600 9d995600 ffffff92 00000000
[   14.461876] 7d00: 9d995610 804aac4c 9b740b00 80430a94 9dbd5f20 7f28c584 9dbd5f20 7f28cf28
[   14.470035] 7d20: 7f28f3b2 9dbd1f20 00000001 9cae5960 9d9a16e0 00000000 00000000 00000000
[   14.478196] 7d40: 8162d420 7f28c9d4 9d995610 7f28fb94 8162d42c 815a59e0 8162d420 00000003
[   14.486355] 7d60: 9d3f7f54 804a96cc 9d995610 00000000 7f28fb94 804a7df8 7f28fb94 9d995610
[   14.494516] 7d80: 9d995610 7f28fb94 9d995644 815d8400 00000000 0000001c 9cae60c8 804a7f9c
[   14.502675] 7da0: 00000000 7f28fb94 804a7f50 804a62c0 9d81fc5c 9d9f6634 7f28fb94 9b739a00
[   14.510834] 7dc0: 815a5908 804a7344 7f28f3b2 7f28f3b3 7f28fb94 7f293000 00000000 815b6d48
[   14.518995] 7de0: 815b6d48 804a87a4 9cae61c0 7f293000 00000000 7f28d2e8 9cae61c0 7f29300c
[   14.527153] 7e00: 9cae61c0 80213468 0018bce1 00000001 8040003f 802b19d0 9e34ea98 00000000
[   14.535315] 7e20: 9e34eaa8 9e34ea98 8040003e 9e34e8a0 9e34e8a0 9bae8100 9d3f0000 9d801e40
[   14.543474] 7e40: 00004eb1 9e34e8a0 9bae8080 9d3f0000 81621200 7f28fc00 00000001 81599848
[   14.551632] 7e60: 7f28fc00 00000001 9cae6180 7f28fc48 00000001 80287704 7f28fc00 9e34e8a0
[   14.559793] 7e80: 7f28fc00 00000001 9cae60c0 80289298 7f28fc0c 00007fff 7f28fc00 80286a58
[   14.567952] 7ea0: 00000000 815998c4 a13e1138 7f293100 a13dba30 8088bd0c 7f28fdc4 76cea5cc
[   14.576112] 7ec0: 9d3f7f54 802861d4 00000000 00000000 00000000 00000000 00000000 00000000
[   14.584269] 7ee0: 6e72656b 00006c65 00000000 00000000 00000000 00000000 00000000 00000000
[   14.592430] 7f00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 81599848
[   14.600590] 7f20: 00000000 00000000 76cea5cc 00000008 0000017b 80209de4 9d3f0000 00000000
[   14.608751] 7f40: 00000000 80289594 9d3f0000 00000000 7e85fbbc a13d4000 0000d1b0 a13e0b48
[   14.616911] 7f60: a13e09ac a13dd590 00007e08 00008ca8 00000000 00000000 00000000 0000313c
[   14.625070] 7f80: 00000026 00000027 0000001d 00000000 00000016 00000000 00000000 00000000
[   14.633230] 7fa0: 5654ca48 80209c40 00000000 00000000 00000008 76cea5cc 00000000 00000000
[   14.641390] 7fc0: 00000000 00000000 5654ca48 0000017b 00000000 00000001 76ea754f 00000000
[   14.649549] 7fe0: 7e85fbc0 7e85fbb0 76ce281c 76c6b830 600f0010 00000008 00000000 00000000
[   14.657717] [<804aa498>] (find_dr) from [<804aa564>] (devres_remove+0x38/0x70)
[   14.665868] [<804aa564>] (devres_remove) from [<804aac4c>] (devres_destroy+0x8/0x24)
[   14.672989] [<804aac4c>] (devres_destroy) from [<80430a94>] (devm_iounmap+0x18/0x44)
[   14.680927] [<80430a94>] (devm_iounmap) from [<7f28c584>] (ath10k_ahb_resource_deinit+0x20/0x74 [ath10k_pci])
[   14.688671] [<7f28c584>] (ath10k_ahb_resource_deinit [ath10k_pci]) from [<7f28cf28>] (ath10k_ahb_probe+0x554/0x6f4 [ath10k_pci])
[   14.698454] [<7f28cf28>] (ath10k_ahb_probe [ath10k_pci]) from [<804a96cc>] (platform_drv_probe+0x50/0x9c)
[   14.710061] [<804a96cc>] (platform_drv_probe) from [<804a7df8>] (driver_probe_device+0x2ac/0x404)
[   14.719520] [<804a7df8>] (driver_probe_device) from [<804a7f9c>] (__driver_attach+0x4c/0x8c)
[   14.728374] [<804a7f9c>] (__driver_attach) from [<804a62c0>] (bus_for_each_dev+0x7c/0x8c)
[   14.736880] [<804a62c0>] (bus_for_each_dev) from [<804a7344>] (bus_add_driver+0x1b4/0x234)
[   14.744952] [<804a7344>] (bus_add_driver) from [<804a87a4>] (driver_register+0xa0/0xe0)
[   14.753136] [<804a87a4>] (driver_register) from [<7f28d2e8>] (ath10k_ahb_init+0x10/0x38 [ath10k_pci])
[   14.761061] [<7f28d2e8>] (ath10k_ahb_init [ath10k_pci]) from [<7f29300c>] (__init_backport+0xc/0x100 [ath10k_pci])
[   14.770418] [<7f29300c>] (__init_backport [ath10k_pci]) from [<80213468>] (do_one_initcall+0x1c4/0x20c)
[   14.780633] [<80213468>] (do_one_initcall) from [<80287704>] (do_init_module+0x54/0x1ac)
[   14.789916] [<80287704>] (do_init_module) from [<80289298>] (load_module+0x19e0/0x1b04)
[   14.798249] [<80289298>] (load_module) from [<80289594>] (SyS_finit_module+0x8c/0x9c)
[   14.805975] [<80289594>] (SyS_finit_module) from [<80209c40>] (ret_fast_syscall+0x0/0x34)
[   14.813959] Code: e1a08003 e1540009 03a04000 0a00000c (e5943008)
[   14.822108] ---[ end trace f4da008c1c165fb3 ]---
[   14.830623] Kernel panic - not syncing: Fatal exception
[   14.832876] CPU1: stopping
[   14.837820] CPU: 1 PID: 343 Comm: rngd Tainted: G      D 4.4.60-yocto-standard-eero #1
[   14.840601] Hardware name: Qualcomm (Flattened Device Tree)
[   14.849210] [<8021ed7c>] (unwind_backtrace) from [<8021b730>] (show_stack+0x10/0x14)
[   14.854672] [<8021b730>] (show_stack) from [<8041b8dc>] (dump_stack+0x7c/0x98)
[   14.862658] [<8041b8dc>] (dump_stack) from [<8021dfc8>] (handle_IPI+0xdc/0x164)
[   14.869688] [<8021dfc8>] (handle_IPI) from [<802093e8>] (gic_handle_irq+0x80/0x8c)
[   14.876893] [<802093e8>] (gic_handle_irq) from [<8020a844>] (__irq_usr+0x44/0x60)
[   14.884524] Exception stack(0x9beaffb0 to 0x9beafff8)
[   14.892076] ffa0:                                     0c27987c 40016b9f 0c27987d 0000001f
[   14.897118] ffc0: 00000001 763fedbc 565a344c 54b3de80 54b3e50c 00000001 7e817c00 763fed0c
[   14.905277] ffe0: fffffffe 763fecc0 00000018 76ec3e18 30010010 ffffffff
[   14.913430] CPU2: stopping
[   14.919849] CPU: 2 PID: 344 Comm: rngd Tainted: G      D 4.4.60-yocto-standard-eero #1
[   14.922631] Hardware name: Qualcomm (Flattened Device Tree)
[   14.931236] [<8021ed7c>] (unwind_backtrace) from [<8021b730>] (show_stack+0x10/0x14)
[   14.936703] [<8021b730>] (show_stack) from [<8041b8dc>] (dump_stack+0x7c/0x98)
[   14.944688] [<8041b8dc>] (dump_stack) from [<8021dfc8>] (handle_IPI+0xdc/0x164)
[   14.951719] [<8021dfc8>] (handle_IPI) from [<802093e8>] (gic_handle_irq+0x80/0x8c)
[   14.958924] [<802093e8>] (gic_handle_irq) from [<8020a844>] (__irq_usr+0x44/0x60)
[   14.966554] Exception stack(0x9beb7fb0 to 0x9beb7ff8)
[   14.974107] 7fa0:                                     fffffff7 00000017 a6000000 0000014a
[   14.979150] 7fc0: 00000002 759fedbc 565a3470 54b3de80 54b3e50c 00000001 7e817c00 759fed0c
[   14.987307] 7fe0: 00000009 759fecc0 00000018 76ec3cdc 80010010 ffffffff
[   14.995461] CPU0: stopping
[   15.001882] CPU: 0 PID: 341 Comm: rngd Tainted: G      D 4.4.60-yocto-standard-eero #1
[   15.004663] Hardware name: Qualcomm (Flattened Device Tree)
[   15.013267] [<8021ed7c>] (unwind_backtrace) from [<8021b730>] (show_stack+0x10/0x14)
[   15.018734] [<8021b730>] (show_stack) from [<8041b8dc>] (dump_stack+0x7c/0x98)
[   15.026719] [<8041b8dc>] (dump_stack) from [<8021dfc8>] (handle_IPI+0xdc/0x164)
[   15.033752] [<8021dfc8>] (handle_IPI) from [<802093e8>] (gic_handle_irq+0x80/0x8c)
[   15.040955] [<802093e8>] (gic_handle_irq) from [<8020a844>] (__irq_usr+0x44/0x60)
[   15.048585] Exception stack(0x9be87fb0 to 0x9be87ff8)
[   15.056139] 7fa0:                                     00000000 00000000 a7e3391d 6a11d866
[   15.061181] 7fc0: 00000000 76d15dbc 565a3428 54b3de80 54b3e50c 00000001 7e817c00 76d15d0c
[   15.069339] 7fe0: ffffffeb 76d15cc0 00000018 76ec3d60 80010010 ffffffff
[   15.091080] Rebooting in 5 seconds..


On Wed, Jul 15, 2020 at 11:39 PM Kalle Valo <kvalo@codeaurora.org> wrote:
>
> Brian Norris <briannorris@chromium.org> writes:
>
> > On Fri, Jun 26, 2020 at 2:49 PM Doug Anderson <dianders@chromium.org> wrote:
> >> I should also note that, while I'm not terribly familiar with Kalle's
> >> workflow, I would have expected to see him in the "To:" list.  I've
> >> added him, but it's possible he'll need you to repost the patch with
> >> him in the "To:" list.
> >
> > https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches#who_to_address
> > https://wireless.wiki.kernel.org/en/users/drivers/ath10k/submittingpatches
> >
> > Patchwork is his patch queue, so I don't think you need to address him directly.
>
> Yup, I take all patches from patchwork so no need to Cc me.
>
> --
> https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
Douglas Anderson July 21, 2020, 12:40 a.m. UTC | #6
Hi,

On Mon, Jul 20, 2020 at 5:33 PM Peter Oh <peter.oh@eero.com> wrote:
>
> I'm getting this panic on IPQ4019 system after cherry-picked this
> single patch on top of working system.
>
> [   14.226184] ath10k_ahb a000000.wifi: failed to receive initialized
> event from target: 80000000

A bit of a shot in the dark, but any chance you could try this atop it
and see if it helps?

https://lore.kernel.org/r/20200709082024.v2.1.I4d2f85ffa06f38532631e864a3125691ef5ffe06@changeid/

...most of this patch is gutted by that one.

-Doug
Peter Oh July 21, 2020, 12:43 a.m. UTC | #7
I've run 3 units: one of them hits the problem every time, while the
other 2 barely hit it at all.

On Mon, Jul 20, 2020 at 5:33 PM Peter Oh <peter.oh@eero.com> wrote:
>
> I'm getting this panic on IPQ4019 system after cherry-picked this
> single patch on top of working system.
>
> [   14.226184] ath10k_ahb a000000.wifi: failed to receive initialized
> event from target: 80000000
> [...]
Peter Oh July 21, 2020, 12:56 a.m. UTC | #8
Since IPQ4019 doesn't support a per-CE interrupt summary, I doubt
this change is correct:
+       ath10k_ce_engine_int_status_clear(ar, ctrl_addr,
+                                         wm_regs->cc_mask | wm_regs->wm_mask);


On Mon, Jul 20, 2020 at 5:53 PM Peter Oh <peter.oh@eero.com> wrote:
>
> At first I applied these 3 patches:
> ath10k: Add interrupt summary based CE processing
>     https://patchwork.kernel.org/patch/11628299/
> ath10k: Keep track of which interrupts fired, don't poll them
>     https://patchwork.kernel.org/patch/11654631/
> ath10k: Get rid of "per_ce_irq" hw param
>     https://patchwork.kernel.org/patch/11654633/
> and saw the crash happen. I then reverted the top 2 and kept only the
> very first one, but the crash still happens.
>
>
>
> On Mon, Jul 20, 2020 at 5:43 PM Peter Oh <peter.oh@eero.com> wrote:
>>
>> I've run 3 units and one of them happens the problem always while the
>> other 2 are barely happening.
>>
>> On Mon, Jul 20, 2020 at 5:33 PM Peter Oh <peter.oh@eero.com> wrote:
>> >
>> > I'm getting this panic on IPQ4019 system after cherry-picked this
>> > single patch on top of working system.
>> >
>> > [   14.226184] ath10k_ahb a000000.wifi: failed to receive initialized
>> > event from target: 80000000
>> > [   14.326406] !#%&PageFault P<find_dr+0x28/0x64>
>> > L<devres_remove+0x38/0x70> F<005> [00000008]
>> > [   14.326447] Unable to handle kernel NULL pointer dereference at
>> > virtual address 00000008
>> > [   14.333569] pgd = 80cac000
>> > [   14.341892] [00000008] *pgd=00000000
>> > [   14.347804] !#%&Abort P<find_dr+0x28/0x64>
>> > L<devres_remove+0x38/0x70> F<005> FILE<PageFault>
>> > [   14.348067] Internal error: PageFault: 5 [#1] PREEMPT SMP ARM
>> > [   14.356568] Modules linked in: ath10k_pci(+) ecm shortcut_fe_drv
>> > shortcut_fe ath10k_core ath mac80211 cfg80211 compat
>> > [   14.372537] CPU: 3 PID: 301 Comm: systemd-modules Not tainted
>> > 4.4.60-yocto-standard-eero #1
>> > [   14.372805] Hardware name: Qualcomm (Flattened Device Tree)
>> > [   14.380961] task: 9b492300 ti: 9d3f0000 task.ti: 9d3f0000
>> > [   14.386516] PC is at find_dr+0x28/0x64
>> > [   14.392069] LR is at devres_remove+0x38/0x70
>> > [   14.395720] pc : [<804aa498>]    lr : [<804aa564>]    psr: 00010193
>> > [   14.395720] sp : 9d3f7cc8  ip : 00000000  fp : 7f18b380
>> > [   14.400155] r10: 9d995610  r9 : 9b73db64  r8 : 9b740b00
>> > [   14.411343] r7 : 80430990  r6 : 8043097c  r5 : 9b73da10  r4 : 00000000
>> > [   14.416554] r3 : 9b740b00  r2 : 8043097c  r1 : 80430990  r0 : 9b73da10
>> > [   14.423153] Flags: nzcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM
>> > Segment user
>> > [   14.429663] Control: 10c5387d  Table: 80cac06a  DAC: 00000055
>> > [   14.436865] Process systemd-modules (pid: 301, stack limit = 0x9d3f0210)
>> > [   14.442683] Stack: (0x9d3f7cc8 to 0x9d3f8000)
>> > [   14.449455] 7cc0:                   9b73da10 a0010113 80430990
>> > 8043097c 9b740b00 9b73db60
>> > [   14.453716] 7ce0: 9d995610 804aa564 9b740b00 9dbd1f20 9d995600
>> > 9d995600 ffffff92 00000000
>> > [   14.461876] 7d00: 9d995610 804aac4c 9b740b00 80430a94 9dbd5f20
>> > 7f28c584 9dbd5f20 7f28cf28
>> > [   14.470035] 7d20: 7f28f3b2 9dbd1f20 00000001 9cae5960 9d9a16e0
>> > 00000000 00000000 00000000
>> > [   14.478196] 7d40: 8162d420 7f28c9d4 9d995610 7f28fb94 8162d42c
>> > 815a59e0 8162d420 00000003
>> > [   14.486355] 7d60: 9d3f7f54 804a96cc 9d995610 00000000 7f28fb94
>> > 804a7df8 7f28fb94 9d995610
>> > [   14.494516] 7d80: 9d995610 7f28fb94 9d995644 815d8400 00000000
>> > 0000001c 9cae60c8 804a7f9c
>> > [   14.502675] 7da0: 00000000 7f28fb94 804a7f50 804a62c0 9d81fc5c
>> > 9d9f6634 7f28fb94 9b739a00
>> > [   14.510834] 7dc0: 815a5908 804a7344 7f28f3b2 7f28f3b3 7f28fb94
>> > 7f293000 00000000 815b6d48
>> > [   14.518995] 7de0: 815b6d48 804a87a4 9cae61c0 7f293000 00000000
>> > 7f28d2e8 9cae61c0 7f29300c
>> > [   14.527153] 7e00: 9cae61c0 80213468 0018bce1 00000001 8040003f
>> > 802b19d0 9e34ea98 00000000
>> > [   14.535315] 7e20: 9e34eaa8 9e34ea98 8040003e 9e34e8a0 9e34e8a0
>> > 9bae8100 9d3f0000 9d801e40
>> > [   14.543474] 7e40: 00004eb1 9e34e8a0 9bae8080 9d3f0000 81621200
>> > 7f28fc00 00000001 81599848
>> > [   14.551632] 7e60: 7f28fc00 00000001 9cae6180 7f28fc48 00000001
>> > 80287704 7f28fc00 9e34e8a0
>> > [   14.559793] 7e80: 7f28fc00 00000001 9cae60c0 80289298 7f28fc0c
>> > 00007fff 7f28fc00 80286a58
>> > [   14.567952] 7ea0: 00000000 815998c4 a13e1138 7f293100 a13dba30
>> > 8088bd0c 7f28fdc4 76cea5cc
>> > [   14.576112] 7ec0: 9d3f7f54 802861d4 00000000 00000000 00000000
>> > 00000000 00000000 00000000
>> > [   14.584269] 7ee0: 6e72656b 00006c65 00000000 00000000 00000000
>> > 00000000 00000000 00000000
>> > [   14.592430] 7f00: 00000000 00000000 00000000 00000000 00000000
>> > 00000000 00000000 81599848
>> > [   14.600590] 7f20: 00000000 00000000 76cea5cc 00000008 0000017b
>> > 80209de4 9d3f0000 00000000
>> > [   14.608751] 7f40: 00000000 80289594 9d3f0000 00000000 7e85fbbc
>> > a13d4000 0000d1b0 a13e0b48
>> > [   14.616911] 7f60: a13e09ac a13dd590 00007e08 00008ca8 00000000
>> > 00000000 00000000 0000313c
>> > [   14.625070] 7f80: 00000026 00000027 0000001d 00000000 00000016
>> > 00000000 00000000 00000000
>> > [   14.633230] 7fa0: 5654ca48 80209c40 00000000 00000000 00000008
>> > 76cea5cc 00000000 00000000
>> > [   14.641390] 7fc0: 00000000 00000000 5654ca48 0000017b 00000000
>> > 00000001 76ea754f 00000000
>> > [   14.649549] 7fe0: 7e85fbc0 7e85fbb0 76ce281c 76c6b830 600f0010
>> > 00000008 00000000 00000000
>> > [   14.657717] [<804aa498>] (find_dr) from [<804aa564>]
>> > (devres_remove+0x38/0x70)
>> > [   14.665868] [<804aa564>] (devres_remove) from [<804aac4c>]
>> > (devres_destroy+0x8/0x24)
>> > [   14.672989] [<804aac4c>] (devres_destroy) from [<80430a94>]
>> > (devm_iounmap+0x18/0x44)
>> > [   14.680927] [<80430a94>] (devm_iounmap) from [<7f28c584>]
>> > (ath10k_ahb_resource_deinit+0x20/0x74 [ath10k_pci])
>> > [   14.688671] [<7f28c584>] (ath10k_ahb_resource_deinit [ath10k_pci])
>> > from [<7f28cf28>] (ath10k_ahb_probe+0x554/0x6f4 [ath10k_pci])
>> > [   14.698454] [<7f28cf28>] (ath10k_ahb_probe [ath10k_pci]) from
>> > [<804a96cc>] (platform_drv_probe+0x50/0x9c)
>> > [   14.710061] [<804a96cc>] (platform_drv_probe) from [<804a7df8>]
>> > (driver_probe_device+0x2ac/0x404)
>> > [   14.719520] [<804a7df8>] (driver_probe_device) from [<804a7f9c>]
>> > (__driver_attach+0x4c/0x8c)
>> > [   14.728374] [<804a7f9c>] (__driver_attach) from [<804a62c0>]
>> > (bus_for_each_dev+0x7c/0x8c)
>> > [   14.736880] [<804a62c0>] (bus_for_each_dev) from [<804a7344>]
>> > (bus_add_driver+0x1b4/0x234)
>> > [   14.744952] [<804a7344>] (bus_add_driver) from [<804a87a4>]
>> > (driver_register+0xa0/0xe0)
>> > [   14.753136] [<804a87a4>] (driver_register) from [<7f28d2e8>]
>> > (ath10k_ahb_init+0x10/0x38 [ath10k_pci])
>> > [   14.761061] [<7f28d2e8>] (ath10k_ahb_init [ath10k_pci]) from
>> > [<7f29300c>] (__init_backport+0xc/0x100 [ath10k_pci])
>> > [   14.770418] [<7f29300c>] (__init_backport [ath10k_pci]) from
>> > [<80213468>] (do_one_initcall+0x1c4/0x20c)
>> > [   14.780633] [<80213468>] (do_one_initcall) from [<80287704>]
>> > (do_init_module+0x54/0x1ac)
>> > [   14.789916] [<80287704>] (do_init_module) from [<80289298>]
>> > (load_module+0x19e0/0x1b04)
>> > [   14.798249] [<80289298>] (load_module) from [<80289594>]
>> > (SyS_finit_module+0x8c/0x9c)
>> > [   14.805975] [<80289594>] (SyS_finit_module) from [<80209c40>]
>> > (ret_fast_syscall+0x0/0x34)
>> > [   14.813959] Code: e1a08003 e1540009 03a04000 0a00000c (e5943008)
>> > [   14.822108] ---[ end trace f4da008c1c165fb3 ]---
>> > [   14.830623] Kernel panic - not syncing: Fatal exception
>> > [   14.832876] CPU1: stopping
>> > [   14.837820] CPU: 1 PID: 343 Comm: rngd Tainted: G      D
>> > 4.4.60-yocto-standard-eero #1
>> > [   14.840601] Hardware name: Qualcomm (Flattened Device Tree)
>> > [   14.849210] [<8021ed7c>] (unwind_backtrace) from [<8021b730>]
>> > (show_stack+0x10/0x14)
>> > [   14.854672] [<8021b730>] (show_stack) from [<8041b8dc>]
>> > (dump_stack+0x7c/0x98)
>> > [   14.862658] [<8041b8dc>] (dump_stack) from [<8021dfc8>]
>> > (handle_IPI+0xdc/0x164)
>> > [   14.869688] [<8021dfc8>] (handle_IPI) from [<802093e8>]
>> > (gic_handle_irq+0x80/0x8c)
>> > [   14.876893] [<802093e8>] (gic_handle_irq) from [<8020a844>]
>> > (__irq_usr+0x44/0x60)
>> > [   14.884524] Exception stack(0x9beaffb0 to 0x9beafff8)
>> > [   14.892076] ffa0:                                     0c27987c
>> > 40016b9f 0c27987d 0000001f
>> > [   14.897118] ffc0: 00000001 763fedbc 565a344c 54b3de80 54b3e50c
>> > 00000001 7e817c00 763fed0c
>> > [   14.905277] ffe0: fffffffe 763fecc0 00000018 76ec3e18 30010010 ffffffff
>> > [   14.913430] CPU2: stopping
>> > [   14.919849] CPU: 2 PID: 344 Comm: rngd Tainted: G      D
>> > 4.4.60-yocto-standard-eero #1
>> > [   14.922631] Hardware name: Qualcomm (Flattened Device Tree)
>> > [   14.931236] [<8021ed7c>] (unwind_backtrace) from [<8021b730>]
>> > (show_stack+0x10/0x14)
>> > [   14.936703] [<8021b730>] (show_stack) from [<8041b8dc>]
>> > (dump_stack+0x7c/0x98)
>> > [   14.944688] [<8041b8dc>] (dump_stack) from [<8021dfc8>]
>> > (handle_IPI+0xdc/0x164)
>> > [   14.951719] [<8021dfc8>] (handle_IPI) from [<802093e8>]
>> > (gic_handle_irq+0x80/0x8c)
>> > [   14.958924] [<802093e8>] (gic_handle_irq) from [<8020a844>]
>> > (__irq_usr+0x44/0x60)
>> > [   14.966554] Exception stack(0x9beb7fb0 to 0x9beb7ff8)
>> > [   14.974107] 7fa0:                                     fffffff7
>> > 00000017 a6000000 0000014a
>> > [   14.979150] 7fc0: 00000002 759fedbc 565a3470 54b3de80 54b3e50c
>> > 00000001 7e817c00 759fed0c
>> > [   14.987307] 7fe0: 00000009 759fecc0 00000018 76ec3cdc 80010010 ffffffff
>> > [   14.995461] CPU0: stopping
>> > [   15.001882] CPU: 0 PID: 341 Comm: rngd Tainted: G      D
>> > 4.4.60-yocto-standard-eero #1
>> > [   15.004663] Hardware name: Qualcomm (Flattened Device Tree)
>> > [   15.013267] [<8021ed7c>] (unwind_backtrace) from [<8021b730>]
>> > (show_stack+0x10/0x14)
>> > [   15.018734] [<8021b730>] (show_stack) from [<8041b8dc>]
>> > (dump_stack+0x7c/0x98)
>> > [   15.026719] [<8041b8dc>] (dump_stack) from [<8021dfc8>]
>> > (handle_IPI+0xdc/0x164)
>> > [   15.033752] [<8021dfc8>] (handle_IPI) from [<802093e8>]
>> > (gic_handle_irq+0x80/0x8c)
>> > [   15.040955] [<802093e8>] (gic_handle_irq) from [<8020a844>]
>> > (__irq_usr+0x44/0x60)
>> > [   15.048585] Exception stack(0x9be87fb0 to 0x9be87ff8)
>> > [   15.056139] 7fa0:                                     00000000
>> > 00000000 a7e3391d 6a11d866
>> > [   15.061181] 7fc0: 00000000 76d15dbc 565a3428 54b3de80 54b3e50c
>> > 00000001 7e817c00 76d15d0c
>> > [   15.069339] 7fe0: ffffffeb 76d15cc0 00000018 76ec3d60 80010010 ffffffff
>> > [   15.091080] Rebooting in 5 seconds..
>> >
>> >
>> > On Wed, Jul 15, 2020 at 11:39 PM Kalle Valo <kvalo@codeaurora.org> wrote:
>> > >
>> > > Brian Norris <briannorris@chromium.org> writes:
>> > >
>> > > > On Fri, Jun 26, 2020 at 2:49 PM Doug Anderson <dianders@chromium.org> wrote:
>> > > >> I should also note that, while I'm not terribly familiar with Kalle's
>> > > >> workflow, I would have expected to see him in the "To:" list.  I've
>> > > >> added him, but it's possible he'll need you to repost the patch with
>> > > >> him in the "To:" list.
>> > > >
>> > > > https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches#who_to_address
>> > > > https://wireless.wiki.kernel.org/en/users/drivers/ath10k/submittingpatches
>> > > >
>> > > > Patchwork is his patch queue, so I don't think you need to address him directly.
>> > >
>> > > Yup, I take all patches from patchwork so no need to Cc me.
>> > >
>> > > --
>> > > https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
Peter Oh July 21, 2020, 12:58 a.m. UTC | #9
My previous email wasn't sent out.

At first I applied these 3 patches:
ath10k: Add interrupt summary based CE processing
    https://patchwork.kernel.org/patch/11628299/
ath10k: Keep track of which interrupts fired, don't poll them
    https://patchwork.kernel.org/patch/11654631/
ath10k: Get rid of "per_ce_irq" hw param
    https://patchwork.kernel.org/patch/11654633/
and saw the crash happen. I then reverted the top 2 and kept only the
first one, but the crash still happens.

On Mon, Jul 20, 2020 at 5:56 PM Peter Oh <peter.oh@eero.com> wrote:
>
> Since IPQ4019 doesn't support per CE based interrupt summary, I doubt
> if this change is correct.
> +       ath10k_ce_engine_int_status_clear(ar, ctrl_addr,
> +                                         wm_regs->cc_mask | wm_regs->wm_mask);
>
>
> On Mon, Jul 20, 2020 at 5:53 PM Peter Oh <peter.oh@eero.com> wrote:
> >
> > At first I gave these 3 patches.
> > ath10k: Add interrupt summary based CE processing
> >     https://patchwork.kernel.org/patch/11628299/
> > ath10k: Keep track of which interrupts fired, don't poll them
> >     https://patchwork.kernel.org/patch/11654631/
> > ath10k: Get rid of "per_ce_irq" hw param
> >     https://patchwork.kernel.org/patch/11654633/
> > and saw the crash happen and then reverted the top 2 and used the very first one, but it is still happening.
> >
> >
> >
> > On Mon, Jul 20, 2020 at 5:43 PM Peter Oh <peter.oh@eero.com> wrote:
> >>
> >> I've run 3 units and one of them happens the problem always while the
> >> other 2 are barely happening.
> >>
> >> On Mon, Jul 20, 2020 at 5:33 PM Peter Oh <peter.oh@eero.com> wrote:
> >> >
> >> > I'm getting this panic on IPQ4019 system after cherry-picked this
> >> > single patch on top of working system.
> >> >
Peter Oh July 21, 2020, 1:32 a.m. UTC | #10
I take back what I said.
The problem is not caused by this patch, but by other patches.
I had 2 extra patches applied before the 3 patches, so my system looks like this:

backports from ath.git 5.6-rc1 + Linux kernel 4.4 (similar to OpenWrt)
On top of that working system, I cherry-picked these 5 patches:

#1.
ath10k: Avoid override CE5 configuration for QCA99X0 chipsets
ath.git commit 521fc37be3d879561ca5ab42d64719cf94116af0
#2.
ath10k: Fix NULL pointer dereference in AHB device probe
wireless-drivers.git commit 1cfd3426ef989b83fa6176490a38777057e57f6c
#3.
ath10k: Add interrupt summary based CE processing
https://patchwork.kernel.org/patch/11628299/
#4.
ath10k: Keep track of which interrupts fired, don't poll them
https://patchwork.kernel.org/patch/11654631/
#5.
ath10k: Get rid of "per_ce_irq" hw param
https://patchwork.kernel.org/patch/11654633/

The error "[  14.226184] ath10k_ahb a000000.wifi: failed to receive
initialized event from target: 80000000" is caused by #1 and #2,
since it still happens after I reverted #3~#5.
Once I reverted everything except #1, I got a different crash:

[   11.179595] !#%&PageFault P<__ath10k_ce_rx_post_buf+0x14/0x98
[ath10k_core]> L<0x4bc00> F<005> [0000000c]
[   11.179643] Unable to handle kernel NULL pointer dereference at
virtual address 0000000c
[   11.439207] [<7f15a69c>] (__ath10k_ce_rx_post_buf [ath10k_core])
from [<7f15a874>] (ath10k_ce_rx_post_buf+0x3c/0x50 [ath10k_core])
[   11.447204] [<7f15a874>] (ath10k_ce_rx_post_buf [ath10k_core]) from
[<7f2889a4>] (ath10k_pci_diag_read_mem+0x104/0x2a8 [ath10k_pci])
[   11.458706] [<7f2889a4>] (ath10k_pci_diag_read_mem [ath10k_pci])
from [<7f288b68>] (ath10k_pci_diag_read32+0x1c/0x2c [ath10k_pci])
[   11.470767] [<7f288b68>] (ath10k_pci_diag_read32 [ath10k_pci]) from
[<7f28abe8>] (ath10k_pci_init_config+0x2c/0x290 [ath10k_pci])
[   11.482314] [<7f28abe8>] (ath10k_pci_init_config [ath10k_pci]) from
[<7f28d160>] (ath10k_ahb_hif_power_up+0x7c/0xe8 [ath10k_pci])
[   11.494153] [<7f28d160>] (ath10k_ahb_hif_power_up [ath10k_pci])
from [<7f135348>] (ath10k_core_register_work+0x84/0x8f8 [ath10k_core])
[   11.505766] [<7f135348>] (ath10k_core_register_work [ath10k_core])
from [<8023b614>] (process_one_work+0x1c0/0x2f8)
[   11.517594] [<8023b614>] (process_one_work) from [<8023c650>]
(worker_thread+0x280/0x3c0)
[   11.527919] [<8023c650>] (worker_thread) from [<802408f8>]
(kthread+0xd8/0xe8)
[   11.536247] [<802408f8>] (kthread) from [<80209ce8>]
(ret_from_fork+0x14/0x2c)

Once I finally reverted #1 as well, my system went back to working.
So I suspect that #1 and #2 either have bugs or require an
up-to-date ath.git branch.
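
The revert experiments above can be written down as a small table and checked mechanically. The following is only a minimal sketch: the patch numbers are the #1..#5 labels from the list above, the outcome strings are shorthand invented here for the two failures reported in this mail, and the helper name is hypothetical. It intersects the patch sets of all runs that show the same symptom, which gives the patches common to every occurrence of that symptom.

```python
# Each entry: (set of cherry-picked patches still applied, observed outcome).
# Patch numbers match the #1..#5 list in this mail; outcome labels are
# shorthand chosen for this sketch, not strings taken from the kernel log.
OBSERVATIONS = [
    (frozenset({1, 2, 3, 4, 5}), "init-event failure"),
    (frozenset({1, 2}), "init-event failure"),        # after reverting #3~#5
    (frozenset({1}), "rx_post_buf NULL deref"),       # after reverting all except #1
    (frozenset(), "ok"),                              # after reverting #1 as well
]


def common_patches_by_symptom(observations):
    """Map each failure symptom to the patches present in every run showing it."""
    result = {}
    for applied, outcome in observations:
        if outcome == "ok":
            continue  # working runs do not implicate any patch
        if outcome in result:
            result[outcome] &= applied  # keep only patches common to all such runs
        else:
            result[outcome] = set(applied)
    return result


if __name__ == "__main__":
    for symptom, patches in common_patches_by_symptom(OBSERVATIONS).items():
        print(symptom, "->", sorted(patches))
```

With these four runs, the first symptom is common to #1 and #2 and the second to #1 alone, while #3~#5 are absent from a failing run. This narrows the search but does not prove which patch is at fault; a missing dependency in the backport tree would produce the same table.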

On Mon, Jul 20, 2020 at 5:58 PM Peter Oh <peter.oh@eero.com> wrote:
>
> My previous email wasn't sent out.
>
> At first I gave these 3 patches.
> ath10k: Add interrupt summary based CE processing
>     https://patchwork.kernel.org/patch/11628299/
> ath10k: Keep track of which interrupts fired, don't poll them
>     https://patchwork.kernel.org/patch/11654631/
> ath10k: Get rid of "per_ce_irq" hw param
>     https://patchwork.kernel.org/patch/11654633/
> and saw the crash happen and then reverted the top 2 and used the very
> first one, but it is still happening.
>
> On Mon, Jul 20, 2020 at 5:56 PM Peter Oh <peter.oh@eero.com> wrote:
> >
> > Since IPQ4019 doesn't support per CE based interrupt summary, I doubt
> > if this change is correct.
> > +       ath10k_ce_engine_int_status_clear(ar, ctrl_addr,
> > +                                         wm_regs->cc_mask | wm_regs->wm_mask);
> >
> >
> > On Mon, Jul 20, 2020 at 5:53 PM Peter Oh <peter.oh@eero.com> wrote:
> > >
> > > At first I gave these 3 patches.
> > > ath10k: Add interrupt summary based CE processing
> > >     https://patchwork.kernel.org/patch/11628299/
> > > ath10k: Keep track of which interrupts fired, don't poll them
> > >     https://patchwork.kernel.org/patch/11654631/
> > > ath10k: Get rid of "per_ce_irq" hw param
> > >     https://patchwork.kernel.org/patch/11654633/
> > > and saw the crash happen and then reverted the top 2 and used the very first one, but it is still happening.
> > >
> > >
> > >
> > > On Mon, Jul 20, 2020 at 5:43 PM Peter Oh <peter.oh@eero.com> wrote:
> > >>
> > >> I've run 3 units and one of them happens the problem always while the
> > >> other 2 are barely happening.
> > >>
> > >> On Mon, Jul 20, 2020 at 5:33 PM Peter Oh <peter.oh@eero.com> wrote:
> > >> >
> > >> > I'm getting this panic on IPQ4019 system after cherry-picked this
> > >> > single patch on top of working system.
> > >> >
> > >> > (handle_IPI+0xdc/0x164)
> > >> > [   14.951719] [<8021dfc8>] (handle_IPI) from [<802093e8>]
> > >> > (gic_handle_irq+0x80/0x8c)
> > >> > [   14.958924] [<802093e8>] (gic_handle_irq) from [<8020a844>]
> > >> > (__irq_usr+0x44/0x60)
> > >> > [   14.966554] Exception stack(0x9beb7fb0 to 0x9beb7ff8)
> > >> > [   14.974107] 7fa0:                                     fffffff7
> > >> > 00000017 a6000000 0000014a
> > >> > [   14.979150] 7fc0: 00000002 759fedbc 565a3470 54b3de80 54b3e50c
> > >> > 00000001 7e817c00 759fed0c
> > >> > [   14.987307] 7fe0: 00000009 759fecc0 00000018 76ec3cdc 80010010 ffffffff
> > >> > [   14.995461] CPU0: stopping
> > >> > [   15.001882] CPU: 0 PID: 341 Comm: rngd Tainted: G      D
> > >> > 4.4.60-yocto-standard-eero #1
> > >> > [   15.004663] Hardware name: Qualcomm (Flattened Device Tree)
> > >> > [   15.013267] [<8021ed7c>] (unwind_backtrace) from [<8021b730>]
> > >> > (show_stack+0x10/0x14)
> > >> > [   15.018734] [<8021b730>] (show_stack) from [<8041b8dc>]
> > >> > (dump_stack+0x7c/0x98)
> > >> > [   15.026719] [<8041b8dc>] (dump_stack) from [<8021dfc8>]
> > >> > (handle_IPI+0xdc/0x164)
> > >> > [   15.033752] [<8021dfc8>] (handle_IPI) from [<802093e8>]
> > >> > (gic_handle_irq+0x80/0x8c)
> > >> > [   15.040955] [<802093e8>] (gic_handle_irq) from [<8020a844>]
> > >> > (__irq_usr+0x44/0x60)
> > >> > [   15.048585] Exception stack(0x9be87fb0 to 0x9be87ff8)
> > >> > [   15.056139] 7fa0:                                     00000000
> > >> > 00000000 a7e3391d 6a11d866
> > >> > [   15.061181] 7fc0: 00000000 76d15dbc 565a3428 54b3de80 54b3e50c
> > >> > 00000001 7e817c00 76d15d0c
> > >> > [   15.069339] 7fe0: ffffffeb 76d15cc0 00000018 76ec3d60 80010010 ffffffff
> > >> > [   15.091080] Rebooting in 5 seconds..
> > >> >
> > >> >
> > >> > On Wed, Jul 15, 2020 at 11:39 PM Kalle Valo <kvalo@codeaurora.org> wrote:
> > >> > >
> > >> > > Brian Norris <briannorris@chromium.org> writes:
> > >> > >
> > >> > > > On Fri, Jun 26, 2020 at 2:49 PM Doug Anderson <dianders@chromium.org> wrote:
> > >> > > >> I should also note that, while I'm not terribly familiar with Kalle's
> > >> > > >> workflow, I would have expected to see him in the "To:" list.  I've
> > >> > > >> added him, but it's possible he'll need you to repost the patch with
> > >> > > >> him in the "To:" list.
> > >> > > >
> > >> > > > https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches#who_to_address
> > >> > > > https://wireless.wiki.kernel.org/en/users/drivers/ath10k/submittingpatches
> > >> > > >
> > >> > > > Patchwork is his patch queue, so I don't think you need to address him directly.
> > >> > >
> > >> > > Yup, I take all patches from patchwork so no need to Cc me.
> > >> > >
> > >> > > --
> > >> > > https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
Rakesh Pillai July 21, 2020, 11:24 a.m. UTC | #11
> -----Original Message-----
> From: Peter Oh <peter.oh@eero.com>
> Sent: Tuesday, July 21, 2020 7:03 AM
> To: Kalle Valo <kvalo@codeaurora.org>
> Cc: Brian Norris <briannorris@chromium.org>; Doug Anderson
> <dianders@chromium.org>; linux-wireless <linux-
> wireless@vger.kernel.org>; Rakesh Pillai <pillair@codeaurora.org>; ath10k
> <ath10k@lists.infradead.org>; LKML <linux-kernel@vger.kernel.org>
> Subject: Re: [PATCH] ath10k: Add interrupt summary based CE processing
> 
> I'll take my word back.
> It's not this patch problem, but by others.
> I have 2 extra patches before the 3 patches so my system looks like
> 
> backports from ath.git 5.6-rc1 + linux kernel 4.4 (similar to OpenWrt)
> On top of the working system, I cherry-picked these 5.
> 
> #1.
> ath10k: Avoid override CE5 configuration for QCA99X0 chipsets
> ath.git commit 521fc37be3d879561ca5ab42d64719cf94116af0
> #2.
> ath10k: Fix NULL pointer dereference in AHB device probe
> wireless-drivers.git commit 1cfd3426ef989b83fa6176490a38777057e57f6c
> #3.
> ath10k: Add interrupt summary based CE processing
> https://patchwork.kernel.org/patch/11628299/

Hi Peter,
This patch is applicable only for snoc target WCN3990, since there is a check for per_ce_irq.
For PCI targets, per_ce_irq is false, and hence follows a different path.

Thanks,
Rakesh Pillai.



> #4.
> ath10k: Keep track of which interrupts fired, don't poll them
> https://patchwork.kernel.org/patch/11654631/
> #5.
> ath10k: Get rid of "per_ce_irq" hw param
> https://patchwork.kernel.org/patch/11654633/
> 
> The error "[  14.226184] ath10k_ahb a000000.wifi: failed to receive
> initialized event from target: 80000000" is because of #1 and #2,
> since this happens even after I reverted #3~#5.
> Once I reverted all except #1 I got another crash.
> 
> [   11.179595] !#%&PageFault P<__ath10k_ce_rx_post_buf+0x14/0x98
> [ath10k_core]> L<0x4bc00> F<005> [0000000c]
> [   11.179643] Unable to handle kernel NULL pointer dereference at
> virtual address 0000000c
> [   11.439207] [<7f15a69c>] (__ath10k_ce_rx_post_buf [ath10k_core])
> from [<7f15a874>] (ath10k_ce_rx_post_buf+0x3c/0x50 [ath10k_core])
> [   11.447204] [<7f15a874>] (ath10k_ce_rx_post_buf [ath10k_core]) from
> [<7f2889a4>] (ath10k_pci_diag_read_mem+0x104/0x2a8 [ath10k_pci])
> [   11.458706] [<7f2889a4>] (ath10k_pci_diag_read_mem [ath10k_pci])
> from [<7f288b68>] (ath10k_pci_diag_read32+0x1c/0x2c [ath10k_pci])
> [   11.470767] [<7f288b68>] (ath10k_pci_diag_read32 [ath10k_pci]) from
> [<7f28abe8>] (ath10k_pci_init_config+0x2c/0x290 [ath10k_pci])
> [   11.482314] [<7f28abe8>] (ath10k_pci_init_config [ath10k_pci]) from
> [<7f28d160>] (ath10k_ahb_hif_power_up+0x7c/0xe8 [ath10k_pci])
> [   11.494153] [<7f28d160>] (ath10k_ahb_hif_power_up [ath10k_pci])
> from [<7f135348>] (ath10k_core_register_work+0x84/0x8f8 [ath10k_core])
> [   11.505766] [<7f135348>] (ath10k_core_register_work [ath10k_core])
> from [<8023b614>] (process_one_work+0x1c0/0x2f8)
> [   11.517594] [<8023b614>] (process_one_work) from [<8023c650>]
> (worker_thread+0x280/0x3c0)
> [   11.527919] [<8023c650>] (worker_thread) from [<802408f8>]
> (kthread+0xd8/0xe8)
> [   11.536247] [<802408f8>] (kthread) from [<80209ce8>]
> (ret_from_fork+0x14/0x2c)
> 
> When I revert #1 eventually, my system is back to working.
> So I'm blaming the #1 and #2 could have potential bugs or require
> ath.git branch up-to-date.
> 
Douglas Anderson July 21, 2020, 3:37 p.m. UTC | #12
Hi,

On Mon, Jul 20, 2020 at 6:32 PM Peter Oh <peter.oh@eero.com> wrote:
>
> I'll take my word back.
> It's not this patch problem, but by others.
> I have 2 extra patches before the 3 patches so my system looks like
>
> backports from ath.git 5.6-rc1 + linux kernel 4.4 (similar to OpenWrt)
> On top of the working system, I cherry-picked these 5.
>
> #1.
> ath10k: Avoid override CE5 configuration for QCA99X0 chipsets
> ath.git commit 521fc37be3d879561ca5ab42d64719cf94116af0
> #2.
> ath10k: Fix NULL pointer dereference in AHB device probe
> wireless-drivers.git commit 1cfd3426ef989b83fa6176490a38777057e57f6c
> #3.
> ath10k: Add interrupt summary based CE processing
> https://patchwork.kernel.org/patch/11628299/
> #4.
> ath10k: Keep track of which interrupts fired, don't poll them
> https://patchwork.kernel.org/patch/11654631/
> #5.
> ath10k: Get rid of "per_ce_irq" hw param
> https://patchwork.kernel.org/patch/11654633/
>
> The error "[  14.226184] ath10k_ahb a000000.wifi: failed to receive
> initialized event from target: 80000000" is because of #1 and #2,
> since this happens even after I reverted #3~#5.
> Once I reverted all except #1 I got another crash.
>
> [   11.179595] !#%&PageFault P<__ath10k_ce_rx_post_buf+0x14/0x98
> [ath10k_core]> L<0x4bc00> F<005> [0000000c]
> [   11.179643] Unable to handle kernel NULL pointer dereference at
> virtual address 0000000c
> [   11.439207] [<7f15a69c>] (__ath10k_ce_rx_post_buf [ath10k_core])
> from [<7f15a874>] (ath10k_ce_rx_post_buf+0x3c/0x50 [ath10k_core])
> [   11.447204] [<7f15a874>] (ath10k_ce_rx_post_buf [ath10k_core]) from
> [<7f2889a4>] (ath10k_pci_diag_read_mem+0x104/0x2a8 [ath10k_pci])
> [   11.458706] [<7f2889a4>] (ath10k_pci_diag_read_mem [ath10k_pci])
> from [<7f288b68>] (ath10k_pci_diag_read32+0x1c/0x2c [ath10k_pci])
> [   11.470767] [<7f288b68>] (ath10k_pci_diag_read32 [ath10k_pci]) from
> [<7f28abe8>] (ath10k_pci_init_config+0x2c/0x290 [ath10k_pci])
> [   11.482314] [<7f28abe8>] (ath10k_pci_init_config [ath10k_pci]) from
> [<7f28d160>] (ath10k_ahb_hif_power_up+0x7c/0xe8 [ath10k_pci])
> [   11.494153] [<7f28d160>] (ath10k_ahb_hif_power_up [ath10k_pci])
> from [<7f135348>] (ath10k_core_register_work+0x84/0x8f8 [ath10k_core])
> [   11.505766] [<7f135348>] (ath10k_core_register_work [ath10k_core])
> from [<8023b614>] (process_one_work+0x1c0/0x2f8)
> [   11.517594] [<8023b614>] (process_one_work) from [<8023c650>]
> (worker_thread+0x280/0x3c0)
> [   11.527919] [<8023c650>] (worker_thread) from [<802408f8>]
> (kthread+0xd8/0xe8)
> [   11.536247] [<802408f8>] (kthread) from [<80209ce8>]
> (ret_from_fork+0x14/0x2c)
>
> When I revert #1 eventually, my system is back to working.
> So I'm blaming the #1 and #2 could have potential bugs or require
> ath.git branch up-to-date.

You caught me just as I was signing off yesterday evening, but just to
confirm that you are now fairly certain that none of the 3 patches I
was involved with[*] are related to your problems.  If that's wrong
and there's an action I need to take on the patches then let me know!
:-)

[*] The three patches I was involved with:

ath10k: Add interrupt summary based CE processing
https://patchwork.kernel.org/patch/11628299/

ath10k: Keep track of which interrupts fired, don't poll them
https://patchwork.kernel.org/patch/11654631/

ath10k: Get rid of "per_ce_irq" hw param
https://patchwork.kernel.org/patch/11654633/

-Doug
Kalle Valo Aug. 26, 2020, 2:43 p.m. UTC | #13
(Guys, PLEASE edit your quotes. These long emails make my use of patchwork
horrible.)

"Rakesh Pillai" <pillair@codeaurora.org> writes:

>> -----Original Message-----
>> From: Peter Oh <peter.oh@eero.com>
>> Sent: Tuesday, July 21, 2020 7:03 AM
>> To: Kalle Valo <kvalo@codeaurora.org>
>> Cc: Brian Norris <briannorris@chromium.org>; Doug Anderson
>> <dianders@chromium.org>; linux-wireless <linux-
>> wireless@vger.kernel.org>; Rakesh Pillai <pillair@codeaurora.org>; ath10k
>> <ath10k@lists.infradead.org>; LKML <linux-kernel@vger.kernel.org>
>> Subject: Re: [PATCH] ath10k: Add interrupt summary based CE processing
>> 
>> I'll take my word back.
>> It's not this patch problem, but by others.
>> I have 2 extra patches before the 3 patches so my system looks like
>> 
>> backports from ath.git 5.6-rc1 + linux kernel 4.4 (similar to OpenWrt)
>> On top of the working system, I cherry-picked these 5.
>> 
>> #1.
>> ath10k: Avoid override CE5 configuration for QCA99X0 chipsets
>> ath.git commit 521fc37be3d879561ca5ab42d64719cf94116af0
>> #2.
>> ath10k: Fix NULL pointer dereference in AHB device probe
>> wireless-drivers.git commit 1cfd3426ef989b83fa6176490a38777057e57f6c
>> #3.
>> ath10k: Add interrupt summary based CE processing
>> https://patchwork.kernel.org/patch/11628299/
>
> This patch is applicable only for snoc target WCN3990, since there is
> a check for per_ce_irq. For PCI targets, per_ce_irq is false, and
> hence follows a different path.

This information should be in the commit log. But I have a patch in the
pending branch which removes per_ce_irq:

[v2,2/2] ath10k: Get rid of "per_ce_irq" hw param

https://patchwork.kernel.org/patch/11654621/

So how will multiple hardware support work then?

In theory I like the patch but there's no information in the patch if
this works or breaks other hardware, especially QCA9884 or QCA6174 PCI
devices. I really need some kind of assurance that this works with all
ath10k devices, not just WCN3990 which you are working on.

I have written about this in the wiki:

https://wireless.wiki.kernel.org/en/users/drivers/ath10k/submittingpatches#hardware_families
Douglas Anderson Aug. 26, 2020, 2:54 p.m. UTC | #14
Hi,

On Wed, Aug 26, 2020 at 7:44 AM Kalle Valo <kvalo@codeaurora.org> wrote:
>
> (Guys, PLEASE edit your quotes. These long emails make my use of patchwork
> horrible.)
>
> "Rakesh Pillai" <pillair@codeaurora.org> writes:
>
> >> -----Original Message-----
> >> From: Peter Oh <peter.oh@eero.com>
> >> Sent: Tuesday, July 21, 2020 7:03 AM
> >> To: Kalle Valo <kvalo@codeaurora.org>
> >> Cc: Brian Norris <briannorris@chromium.org>; Doug Anderson
> >> <dianders@chromium.org>; linux-wireless <linux-
> >> wireless@vger.kernel.org>; Rakesh Pillai <pillair@codeaurora.org>; ath10k
> >> <ath10k@lists.infradead.org>; LKML <linux-kernel@vger.kernel.org>
> >> Subject: Re: [PATCH] ath10k: Add interrupt summary based CE processing
> >>
> >> I'll take my word back.
> >> It's not this patch problem, but by others.
> >> I have 2 extra patches before the 3 patches so my system looks like
> >>
> >> backports from ath.git 5.6-rc1 + linux kernel 4.4 (similar to OpenWrt)
> >> On top of the working system, I cherry-picked these 5.
> >>
> >> #1.
> >> ath10k: Avoid override CE5 configuration for QCA99X0 chipsets
> >> ath.git commit 521fc37be3d879561ca5ab42d64719cf94116af0
> >> #2.
> >> ath10k: Fix NULL pointer dereference in AHB device probe
> >> wireless-drivers.git commit 1cfd3426ef989b83fa6176490a38777057e57f6c
> >> #3.
> >> ath10k: Add interrupt summary based CE processing
> >> https://patchwork.kernel.org/patch/11628299/
> >
> > This patch is applicable only for snoc target WCN3990, since there is
> > a check for per_ce_irq. For PCI targets, per_ce_irq is false, and
> > hence follows a different path.
>
> This information should be in the commit log. But I have a patch in the
> pending branch which removes per_ce_irq:
>
> [v2,2/2] ath10k: Get rid of "per_ce_irq" hw param
>
> https://patchwork.kernel.org/patch/11654621/
>
> So how will multiple hardware support work then?

In theory my patches and Rakesh's patches could be squashed.  His
patch made things marginally better but still didn't really address
the root cause.  When addressing the root cause, I ended up deleting
most of the code that he introduced.  I think it'd be fine to just
apply all 3 patches (Rakesh's plus my 2) and the end result will be
good, but if desired either Rakesh or I could post a squashed series.


> In theory I like the patch but there's no information in the patch if
> this works or breaks other hardware, especially QCA9884 or QCA6174 PCI
> devices. I really need some kind of assurance that this works with all
> ath10k devices, not just WCN3990 which you are working on.
>
> I have written about this in the wiki:
>
> https://wireless.wiki.kernel.org/en/users/drivers/ath10k/submittingpatches#hardware_families

The end result is that it should only affect "wcn3990-wifi", which is
what this was tested on.  Specifically you can see that "snoc.c" only
includes one compatible string: "qcom,wcn3990-wifi".  You can also see
that the only place that has "per_ce_irq" set to true is
"WCN3990_HW_1_0_DEV_VERSION".



-Doug
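
The gating described above can be pictured with a small stand-alone sketch.
It is only an illustration under stated assumptions: the struct, the table,
the enum and pick_ce_service_path() are hypothetical names invented here and
are not the driver's real data structures. The only facts taken from the
thread are that "per_ce_irq" is true for WCN3990 and false for PCI targets
such as QCA6174.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for per-chip hardware parameters. */
struct hw_params_sketch {
	const char *name;
	bool per_ce_irq;	/* true only for WCN3990, per the thread */
};

static const struct hw_params_sketch hw_table[] = {
	{ "QCA6174", false },	/* PCI target */
	{ "WCN3990", true },	/* SNOC target */
};

enum ce_service_path {
	CE_PATH_LEGACY,		/* path taken when per_ce_irq is false */
	CE_PATH_SUMMARY,	/* interrupt summary based path */
	CE_PATH_UNKNOWN,	/* chip not present in the sketch table */
};

/* Pick the CE service path for a chip name listed in hw_table. */
static enum ce_service_path pick_ce_service_path(const char *chip)
{
	size_t i;

	assert(chip != NULL);

	for (i = 0; i < sizeof(hw_table) / sizeof(hw_table[0]); i++) {
		if (strcmp(hw_table[i].name, chip) == 0)
			return hw_table[i].per_ce_irq ? CE_PATH_SUMMARY
						      : CE_PATH_LEGACY;
	}

	return CE_PATH_UNKNOWN;
}
```

With this table, only the WCN3990 entry selects the summary based path, which
matches the claim that other ath10k hardware should be unaffected.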
Kalle Valo Sept. 1, 2020, 11:59 a.m. UTC | #15
Rakesh Pillai <pillair@codeaurora.org> wrote:

> Currently the NAPI processing loops through all
> the copy engines and processes a particular copy
> engine if the copy completion is set for that copy
> engine. The host driver is not supposed to access
> any copy engine register after clearing the interrupt
> status register.
> 
> This might result in kernel crash like the one below
> [ 1159.220143] Call trace:
> [ 1159.220170]  ath10k_snoc_read32+0x20/0x40 [ath10k_snoc]
> [ 1159.220193]  ath10k_ce_per_engine_service_any+0x78/0x130 [ath10k_core]
> [ 1159.220203]  ath10k_snoc_napi_poll+0x38/0x8c [ath10k_snoc]
> [ 1159.220270]  net_rx_action+0x100/0x3b0
> [ 1159.220312]  __do_softirq+0x164/0x30c
> [ 1159.220345]  run_ksoftirqd+0x2c/0x64
> [ 1159.220380]  smpboot_thread_fn+0x1b0/0x288
> [ 1159.220405]  kthread+0x11c/0x12c
> [ 1159.220423]  ret_from_fork+0x10/0x18
> 
> To avoid such a scenario, we generate an interrupt
> summary by reading the copy completion for all the
> copy engines before actually processing any of them.
> This will avoid reading the interrupt status register
> for any CE after the interrupt status is cleared.
> 
> Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.3.1-01040-QCAHLSWMTPLZ-1
> 
> Signed-off-by: Rakesh Pillai <pillair@codeaurora.org>
> Reviewed-by: Douglas Anderson <dianders@chromium.org>
> Tested-by: Douglas Anderson <dianders@chromium.org>
> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>

Patch applied to ath-next branch of ath.git, thanks.

b92aba35d39d ath10k: Add interrupt summary based CE processing
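
A minimal sketch of why the summary is taken up front, written as a
stand-alone toy model rather than driver code. Every name prefixed with
sim_ (and SIM_CE_COUNT) is hypothetical. The model assumes, following the
NOTE comment in the patch, that the hardware goes to sleep once the last
copy engine interrupt is cleared, and it counts any later status read as
a fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SIM_CE_COUNT 4	/* hypothetical; not the driver's CE_COUNT */

/* Toy model of the hardware: one "copy complete" flag per engine. */
struct sim_hw {
	bool cc_pending[SIM_CE_COUNT];
	bool asleep;		/* set once the last pending flag is cleared */
	unsigned int faults;	/* status reads attempted while asleep */
};

/* Models a status register read; a read while asleep counts as a fault. */
static bool sim_status_check(struct sim_hw *hw, unsigned int id)
{
	assert(id < SIM_CE_COUNT);
	if (hw->asleep) {
		hw->faults++;
		return false;
	}
	return hw->cc_pending[id];
}

/* Models a status clear; clearing the last pending flag puts hw to sleep. */
static void sim_status_clear(struct sim_hw *hw, unsigned int id)
{
	unsigned int i;

	assert(id < SIM_CE_COUNT);
	hw->cc_pending[id] = false;
	for (i = 0; i < SIM_CE_COUNT; i++)
		if (hw->cc_pending[i])
			return;
	hw->asleep = true;
}

/* Old pattern: read each engine's status inside the processing loop. */
static unsigned int sim_service_interleaved(struct sim_hw *hw)
{
	unsigned int id, handled = 0;

	for (id = 0; id < SIM_CE_COUNT; id++) {
		if (sim_status_check(hw, id)) {
			sim_status_clear(hw, id);
			handled++;
		}
	}
	return handled;
}

/* Patched pattern, step 1: snapshot every status bit before any clear. */
static uint32_t sim_gen_summary(struct sim_hw *hw)
{
	uint32_t summary = 0;
	unsigned int id;

	for (id = 0; id < SIM_CE_COUNT; id++)
		if (sim_status_check(hw, id))
			summary |= (uint32_t)1 << id;
	return summary;
}

/* Patched pattern, step 2: clear and handle using only the snapshot. */
static unsigned int sim_service_summary(struct sim_hw *hw)
{
	uint32_t summary = sim_gen_summary(hw);
	unsigned int id, handled = 0;

	for (id = 0; id < SIM_CE_COUNT; id++) {
		if (!(summary & ((uint32_t)1 << id)))
			continue;
		sim_status_clear(hw, id);
		handled++;
	}
	return handled;
}
```

In this model, with engines 0 and 1 pending out of four, the interleaved
loop reads the status of engines 2 and 3 after the hardware has gone to
sleep, while the summary-based loop finishes all of its reads before the
first clear.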

Patch

diff --git a/drivers/net/wireless/ath/ath10k/ce.c b/drivers/net/wireless/ath/ath10k/ce.c
index ffdd4b9..1e16f26 100644
--- a/drivers/net/wireless/ath/ath10k/ce.c
+++ b/drivers/net/wireless/ath/ath10k/ce.c
@@ -481,15 +481,38 @@  static inline void ath10k_ce_engine_int_status_clear(struct ath10k *ar,
 	ath10k_ce_write32(ar, ce_ctrl_addr + wm_regs->addr, mask);
 }
 
-static inline bool ath10k_ce_engine_int_status_check(struct ath10k *ar,
-						     u32 ce_ctrl_addr,
-						     unsigned int mask)
+static bool ath10k_ce_engine_int_status_check(struct ath10k *ar, u32 ce_ctrl_addr,
+					      unsigned int mask)
 {
 	struct ath10k_hw_ce_host_wm_regs *wm_regs = ar->hw_ce_regs->wm_regs;
 
 	return ath10k_ce_read32(ar, ce_ctrl_addr + wm_regs->addr) & mask;
 }
 
+u32 ath10k_ce_gen_interrupt_summary(struct ath10k *ar)
+{
+	struct ath10k_hw_ce_host_wm_regs *wm_regs = ar->hw_ce_regs->wm_regs;
+	struct ath10k_ce_pipe *ce_state;
+	struct ath10k_ce *ce;
+	u32 irq_summary = 0;
+	u32 ctrl_addr;
+	u32 ce_id;
+
+	ce = ath10k_ce_priv(ar);
+
+	for (ce_id = 0; ce_id < CE_COUNT; ce_id++) {
+		ce_state = &ce->ce_states[ce_id];
+		ctrl_addr = ce_state->ctrl_addr;
+		if (ath10k_ce_engine_int_status_check(ar, ctrl_addr,
+						      wm_regs->cc_mask)) {
+			irq_summary |= BIT(ce_id);
+		}
+	}
+
+	return irq_summary;
+}
+EXPORT_SYMBOL(ath10k_ce_gen_interrupt_summary);
+
 /*
  * Guts of ath10k_ce_send.
  * The caller takes responsibility for any needed locking.
@@ -1308,32 +1331,24 @@  void ath10k_ce_per_engine_service(struct ath10k *ar, unsigned int ce_id)
 	struct ath10k_hw_ce_host_wm_regs *wm_regs = ar->hw_ce_regs->wm_regs;
 	u32 ctrl_addr = ce_state->ctrl_addr;
 
-	spin_lock_bh(&ce->ce_lock);
-
-	if (ath10k_ce_engine_int_status_check(ar, ctrl_addr,
-					      wm_regs->cc_mask)) {
-		/* Clear before handling */
-		ath10k_ce_engine_int_status_clear(ar, ctrl_addr,
-						  wm_regs->cc_mask);
-
-		spin_unlock_bh(&ce->ce_lock);
-
-		if (ce_state->recv_cb)
-			ce_state->recv_cb(ce_state);
-
-		if (ce_state->send_cb)
-			ce_state->send_cb(ce_state);
-
-		spin_lock_bh(&ce->ce_lock);
-	}
-
 	/*
+	 * Clear before handling
+	 *
 	 * Misc CE interrupts are not being handled, but still need
 	 * to be cleared.
+	 *
+	 * NOTE: When the last copy engine interrupt is cleared the
+	 * hardware will go to sleep.  Once this happens any access to
+	 * the CE registers can cause a hardware fault.
 	 */
-	ath10k_ce_engine_int_status_clear(ar, ctrl_addr, wm_regs->wm_mask);
+	ath10k_ce_engine_int_status_clear(ar, ctrl_addr,
+					  wm_regs->cc_mask | wm_regs->wm_mask);
 
-	spin_unlock_bh(&ce->ce_lock);
+	if (ce_state->recv_cb)
+		ce_state->recv_cb(ce_state);
+
+	if (ce_state->send_cb)
+		ce_state->send_cb(ce_state);
 }
 EXPORT_SYMBOL(ath10k_ce_per_engine_service);
 
diff --git a/drivers/net/wireless/ath/ath10k/ce.h b/drivers/net/wireless/ath/ath10k/ce.h
index 75df79d..a440aaf 100644
--- a/drivers/net/wireless/ath/ath10k/ce.h
+++ b/drivers/net/wireless/ath/ath10k/ce.h
@@ -259,6 +259,8 @@  int ath10k_ce_disable_interrupts(struct ath10k *ar);
 void ath10k_ce_enable_interrupts(struct ath10k *ar);
 void ath10k_ce_dump_registers(struct ath10k *ar,
 			      struct ath10k_fw_crash_data *crash_data);
+
+u32 ath10k_ce_gen_interrupt_summary(struct ath10k *ar);
 void ath10k_ce_alloc_rri(struct ath10k *ar);
 void ath10k_ce_free_rri(struct ath10k *ar);
 
@@ -369,7 +371,6 @@  static inline u32 ath10k_ce_base_address(struct ath10k *ar, unsigned int ce_id)
 	(((x) & CE_WRAPPER_INTERRUPT_SUMMARY_HOST_MSI_MASK) >> \
 		CE_WRAPPER_INTERRUPT_SUMMARY_HOST_MSI_LSB)
 #define CE_WRAPPER_INTERRUPT_SUMMARY_ADDRESS			0x0000
-#define CE_INTERRUPT_SUMMARY		(GENMASK(CE_COUNT_MAX - 1, 0))
 
 static inline u32 ath10k_ce_interrupt_summary(struct ath10k *ar)
 {
@@ -380,7 +381,7 @@  static inline u32 ath10k_ce_interrupt_summary(struct ath10k *ar)
 			ce->bus_ops->read32((ar), CE_WRAPPER_BASE_ADDRESS +
 			CE_WRAPPER_INTERRUPT_SUMMARY_ADDRESS));
 	else
-		return CE_INTERRUPT_SUMMARY;
+		return ath10k_ce_gen_interrupt_summary(ar);
 }
 
 /* Host software's Copy Engine configuration. */