Message ID | 20230104021445.47484-1-quic_bqiang@quicinc.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Series | bus: mhi: host: Change the log levels for SYS_ERR event | expand |
Why was this not sent to the MHI mailing list? On Tue, Jan 3, 2023 at 7:19 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote: > > Currently no log printed when SYS_ERR happens, this makes > debug quite hard, so change log level to make it noisy. You are going to need to explain this more. There are two drivers in the upstream kernel that are MHI clients - pci_generic and ath11k. I'm assuming that you care about ath11k because you included that mail list. In ath11k_mhi_op_status_cb() I see a warning message printed when the syserr callback is triggered. I see something similar in pci_generic. Looks like a log is printed when SYS_ERR happens in all possible scenarios, so I don't understand the point of this change. Particularly given that dev_dbg messages can be trivially enabled. -Jeff
On 1/4/2023 10:41 AM, Jeffrey Hugo wrote: > Why was this not sent to the MHI mailing list? I don't know the MHI mailing list address, could tell me that? > > On Tue, Jan 3, 2023 at 7:19 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote: >> Currently no log printed when SYS_ERR happens, this makes >> debug quite hard, so change log level to make it noisy. > You are going to need to explain this more. > There are two drivers in the upstream kernel that are MHI clients - > pci_generic and ath11k. > I'm assuming that you care about ath11k because you included that mail list. Yes, I am talking about ath11k. > In ath11k_mhi_op_status_cb() I see a warning message printed when the > syserr callback is triggered. > I see something similar in pci_generic. > > Looks like a log is printed when SYS_ERR happens in all possible > scenarios, so I don't understand the point of this change. > Particularly given that dev_dbg messages can be trivially enabled. > > -Jeff Well, this is not true in some cases. For example, we have met cases where WLAN HW/firmware is not working well, and only send a SYS_ERR event to MHI driver, however this event is not sent to ath11k host becuase of mhi_pm_sys_err_handler(), so we got no log at all.
On Tue, Jan 3, 2023 at 7:57 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote: > > > On 1/4/2023 10:41 AM, Jeffrey Hugo wrote: > > Why was this not sent to the MHI mailing list? > I don't know the MHI mailing list address, could tell me that? The relevant entry from MAINTAINERS - MHI BUS M: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> L: mhi@lists.linux.dev L: linux-arm-msm@vger.kernel.org S: Maintained T: git git://git.kernel.org/pub/scm/linux/kernel/git/mani/mhi.git F: Documentation/ABI/stable/sysfs-bus-mhi F: Documentation/mhi/ F: drivers/bus/mhi/ F: include/linux/mhi.h > > > > On Tue, Jan 3, 2023 at 7:19 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote: > >> Currently no log printed when SYS_ERR happens, this makes > >> debug quite hard, so change log level to make it noisy. > > You are going to need to explain this more. > > There are two drivers in the upstream kernel that are MHI clients - > > pci_generic and ath11k. > > I'm assuming that you care about ath11k because you included that mail list. > Yes, I am talking about ath11k. > > In ath11k_mhi_op_status_cb() I see a warning message printed when the > > syserr callback is triggered. > > I see something similar in pci_generic. > > > > Looks like a log is printed when SYS_ERR happens in all possible > > scenarios, so I don't understand the point of this change. > > Particularly given that dev_dbg messages can be trivially enabled. > > > > -Jeff > > Well, this is not true in some cases. For example, we have met cases where > > WLAN HW/firmware is not working well, and only send a SYS_ERR event to MHI > > driver, however this event is not sent to ath11k host becuase of > mhi_pm_sys_err_handler(), > > so we got no log at all. > With the 6.1 kernel? mhi_pm_sys_err_handler() queues the st_worker. mhi_pm_st_worker() , which is the st_worker function, calls mhi_pm_sys_error_transition() in the DEV_ST_TRANSITION_SYS_ERR case (we are processing a SYS_ERR). Pretty much the first thing mhi_pm_sys_err_transition() does is this - /* We must notify MHI control driver so it can clean up first */ mhi_cntrl->status_cb(mhi_cntrl, MHI_CB_SYS_ERROR); Which calls the ath11k driver ath11k_mhi_op_status_cb() I mentioned earlier. -Jeff
On 1/4/2023 11:11 AM, Jeffrey Hugo wrote: > On Tue, Jan 3, 2023 at 7:57 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote: >> >> On 1/4/2023 10:41 AM, Jeffrey Hugo wrote: >>> Why was this not sent to the MHI mailing list? >> I don't know the MHI mailing list address, could tell me that? > The relevant entry from MAINTAINERS - > > MHI BUS > M: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> > L: mhi@lists.linux.dev > L: linux-arm-msm@vger.kernel.org > S: Maintained > T: git git://git.kernel.org/pub/scm/linux/kernel/git/mani/mhi.git > F: Documentation/ABI/stable/sysfs-bus-mhi > F: Documentation/mhi/ > F: drivers/bus/mhi/ > F: include/linux/mhi.h > >>> On Tue, Jan 3, 2023 at 7:19 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote: >>>> Currently no log printed when SYS_ERR happens, this makes >>>> debug quite hard, so change log level to make it noisy. >>> You are going to need to explain this more. >>> There are two drivers in the upstream kernel that are MHI clients - >>> pci_generic and ath11k. >>> I'm assuming that you care about ath11k because you included that mail list. >> Yes, I am talking about ath11k. >>> In ath11k_mhi_op_status_cb() I see a warning message printed when the >>> syserr callback is triggered. >>> I see something similar in pci_generic. >>> >>> Looks like a log is printed when SYS_ERR happens in all possible >>> scenarios, so I don't understand the point of this change. >>> Particularly given that dev_dbg messages can be trivially enabled. >>> >>> -Jeff >> Well, this is not true in some cases. For example, we have met cases where >> >> WLAN HW/firmware is not working well, and only send a SYS_ERR event to MHI >> >> driver, however this event is not sent to ath11k host becuase of >> mhi_pm_sys_err_handler(), >> >> so we got no log at all. >> > With the 6.1 kernel? > > mhi_pm_sys_err_handler() queues the st_worker. > > mhi_pm_st_worker() , which is the st_worker function, calls > mhi_pm_sys_error_transition() in the DEV_ST_TRANSITION_SYS_ERR case > (we are processing a SYS_ERR). > > Pretty much the first thing mhi_pm_sys_err_transition() does is this - > > /* We must notify MHI control driver so it can clean up first */ > mhi_cntrl->status_cb(mhi_cntrl, MHI_CB_SYS_ERROR); > > Which calls the ath11k driver ath11k_mhi_op_status_cb() I mentioned earlier. > > -Jeff No, mhi_pm_sys_err_handler() will NOT queue the st_worker because ath11k host supports RDDM, so the SYS_ERR event will be skipped. See below log: kernel: [ 165.393720] mhi mhi0: local ee: MISSION MODE state: M0 device ee: RAMDUMP DOWNLOAD MODE state: M0 kernel: [ 165.401820] mhi mhi0: State change event to state: SYS ERROR kernel: [ 165.401824] mhi mhi0: System error detected kernel: [ 165.401827] mhi mhi0: Controller supports RDDM, skip SYS_ERROR
So your firmware is glitching, but it isn't kicking you to the RDDM EE? RDDM EE triggers an entirely different code path than the syserr process. If that assumption is correct, then I'm not entirely sure why that check exists since the current code would do the syserr processing only if it's not a RDDM event. There are other reasons the FW might trigger syserr, which would not be a fatal error or rddm, and that should trigger both the controller's processing as well as the MHI core. If I were to guess, I would say that Hemanth and Bhaumik had that in there because they were concerned about the RDDM processing triggering multiple times. I don't see how RDDM processing can trigger without the RDDM EE, which seems to make that concern moot. Sadly, I can no longer ask them to confirm. Have you experimented with removing that check? That seems like a valid fix for your system. What you propose is bypassing the dynamic debug mechanism, which doesn't seem justified in this case from what I've seen so far. -Jeff On Tue, Jan 3, 2023 at 8:17 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote: > > > On 1/4/2023 11:11 AM, Jeffrey Hugo wrote: > > On Tue, Jan 3, 2023 at 7:57 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote: > >> > >> On 1/4/2023 10:41 AM, Jeffrey Hugo wrote: > >>> Why was this not sent to the MHI mailing list? > >> I don't know the MHI mailing list address, could tell me that? > > The relevant entry from MAINTAINERS - > > > > MHI BUS > > M: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> > > L: mhi@lists.linux.dev > > L: linux-arm-msm@vger.kernel.org > > S: Maintained > > T: git git://git.kernel.org/pub/scm/linux/kernel/git/mani/mhi.git > > F: Documentation/ABI/stable/sysfs-bus-mhi > > F: Documentation/mhi/ > > F: drivers/bus/mhi/ > > F: include/linux/mhi.h > > > >>> On Tue, Jan 3, 2023 at 7:19 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote: > >>>> Currently no log printed when SYS_ERR happens, this makes > >>>> debug quite hard, so change log level to make it noisy. > >>> You are going to need to explain this more. > >>> There are two drivers in the upstream kernel that are MHI clients - > >>> pci_generic and ath11k. > >>> I'm assuming that you care about ath11k because you included that mail list. > >> Yes, I am talking about ath11k. > >>> In ath11k_mhi_op_status_cb() I see a warning message printed when the > >>> syserr callback is triggered. > >>> I see something similar in pci_generic. > >>> > >>> Looks like a log is printed when SYS_ERR happens in all possible > >>> scenarios, so I don't understand the point of this change. > >>> Particularly given that dev_dbg messages can be trivially enabled. > >>> > >>> -Jeff > >> Well, this is not true in some cases. For example, we have met cases where > >> > >> WLAN HW/firmware is not working well, and only send a SYS_ERR event to MHI > >> > >> driver, however this event is not sent to ath11k host becuase of > >> mhi_pm_sys_err_handler(), > >> > >> so we got no log at all. > >> > > With the 6.1 kernel? > > > > mhi_pm_sys_err_handler() queues the st_worker. > > > > mhi_pm_st_worker() , which is the st_worker function, calls > > mhi_pm_sys_error_transition() in the DEV_ST_TRANSITION_SYS_ERR case > > (we are processing a SYS_ERR). > > > > Pretty much the first thing mhi_pm_sys_err_transition() does is this - > > > > /* We must notify MHI control driver so it can clean up first */ > > mhi_cntrl->status_cb(mhi_cntrl, MHI_CB_SYS_ERROR); > > > > Which calls the ath11k driver ath11k_mhi_op_status_cb() I mentioned earlier. > > > > -Jeff > > > No, mhi_pm_sys_err_handler() will NOT queue the st_worker because ath11k > host supports RDDM, so the SYS_ERR event will be skipped. See below log: > > kernel: [ 165.393720] mhi mhi0: local ee: MISSION MODE state: M0 device > ee: RAMDUMP DOWNLOAD MODE state: M0 > kernel: [ 165.401820] mhi mhi0: State change event to state: SYS ERROR > kernel: [ 165.401824] mhi mhi0: System error detected > kernel: [ 165.401827] mhi mhi0: Controller supports RDDM, skip SYS_ERROR >
Jeffrey Hugo <jeffrey.l.hugo@gmail.com> writes: > Why was this not sent to the MHI mailing list? > > On Tue, Jan 3, 2023 at 7:19 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote: >> >> Currently no log printed when SYS_ERR happens, this makes >> debug quite hard, so change log level to make it noisy. > > You are going to need to explain this more. > There are two drivers in the upstream kernel that are MHI clients - > pci_generic and ath11k. > I'm assuming that you care about ath11k because you included that mail list. > In ath11k_mhi_op_status_cb() I see a warning message printed when the > syserr callback is triggered. > I see something similar in pci_generic. > > Looks like a log is printed when SYS_ERR happens in all possible > scenarios, so I don't understand the point of this change. > Particularly given that dev_dbg messages can be trivially enabled. Also the error messages are not very informative, especially if there are three identical messages it's hard to track down which code path is triggering them. If these are changed to error messages, I would prefer to improve them as well.
diff --git a/drivers/bus/mhi/host/boot.c b/drivers/bus/mhi/host/boot.c index 1c69feee1703..d8bceabcee63 100644 --- a/drivers/bus/mhi/host/boot.c +++ b/drivers/bus/mhi/host/boot.c @@ -102,7 +102,7 @@ static int __mhi_download_rddm_in_panic(struct mhi_controller *mhi_cntrl) goto error_exit_rddm; if (ee != MHI_EE_RDDM) { - dev_dbg(dev, "Trigger device into RDDM mode using SYS ERR\n"); + dev_info(dev, "Trigger device into RDDM mode using SYS ERR\n"); mhi_set_mhi_state(mhi_cntrl, MHI_STATE_SYS_ERR); dev_dbg(dev, "Waiting for device to enter RDDM\n"); diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c index df0fbfee7b78..54c948ed7f47 100644 --- a/drivers/bus/mhi/host/main.c +++ b/drivers/bus/mhi/host/main.c @@ -497,7 +497,7 @@ irqreturn_t mhi_intvec_threaded_handler(int irq_number, void *priv) TO_MHI_EXEC_STR(ee), mhi_state_str(state)); if (state == MHI_STATE_SYS_ERR) { - dev_dbg(dev, "System error detected\n"); + dev_err(dev, "System error detected\n"); pm_state = mhi_tryset_pm_state(mhi_cntrl, MHI_PM_SYS_ERR_DETECT); } @@ -871,7 +871,7 @@ int mhi_process_ctrl_ev_ring(struct mhi_controller *mhi_cntrl, { enum mhi_pm_state pm_state; - dev_dbg(dev, "System error detected\n"); + dev_err(dev, "System error detected\n"); write_lock_irq(&mhi_cntrl->pm_lock); pm_state = mhi_tryset_pm_state(mhi_cntrl, MHI_PM_SYS_ERR_DETECT); @@ -1085,7 +1085,7 @@ void mhi_ctrl_ev_task(unsigned long data) write_lock_irq(&mhi_cntrl->pm_lock); state = mhi_get_mhi_state(mhi_cntrl); if (state == MHI_STATE_SYS_ERR) { - dev_dbg(dev, "System error detected\n"); + dev_err(dev, "System error detected\n"); pm_state = mhi_tryset_pm_state(mhi_cntrl, MHI_PM_SYS_ERR_DETECT); } diff --git a/drivers/bus/mhi/host/pm.c b/drivers/bus/mhi/host/pm.c index 083459028a4b..570fdd4db442 100644 --- a/drivers/bus/mhi/host/pm.c +++ b/drivers/bus/mhi/host/pm.c @@ -1223,7 +1223,7 @@ int mhi_force_rddm_mode(struct mhi_controller *mhi_cntrl) if (mhi_cntrl->ee == MHI_EE_RDDM) return 0; - dev_dbg(dev, "Triggering SYS_ERR to force RDDM state\n"); + dev_info(dev, "Triggering SYS_ERR to force RDDM state\n"); mhi_set_mhi_state(mhi_cntrl, MHI_STATE_SYS_ERR); /* Wait for RDDM event */
Currently no log printed when SYS_ERR happens, this makes debug quite hard, so change log level to make it noisy. Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com> --- drivers/bus/mhi/host/boot.c | 2 +- drivers/bus/mhi/host/main.c | 6 +++--- drivers/bus/mhi/host/pm.c | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-)