[v11,0/9] Fix up and simplify error recovery mechanism

Message ID 1596975355-39813-1-git-send-email-cang@codeaurora.org (mailing list archive)

Message

Can Guo Aug. 9, 2020, 12:15 p.m. UTC
The changes have been tested with error injections of multiple error types (and
arbitrary mixtures of them) at runtime, e.g. hibern8 enter/exit errors, power
mode change errors, and fatal/non-fatal errors from IRQ context. During the
tests, errors were injected randomly across all contexts, e.g. clk scaling,
clk gate/ungate, runtime suspend/resume and IRQ.

There are a few more fixes that resolve other minor problems on top of the main
change, such as LINERESET handling and races between the error handler and system
suspend/resume/shutdown. They will be pushed after this series is taken,
because this series is already large.

Change since v10:
- Incorporated Markus Elfring's comments

Change since v9:
- Fixed compilation warning from option [-Werror=implicit-fallthrough=] in patch "scsi: ufs: Fix a racing problem btw error handler and runtime PM ops"
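
For readers unfamiliar with the warning fixed in v9, here is a minimal sketch of the kind of code that triggers -Werror=implicit-fallthrough and how an explicit annotation silences it. The enum, the demo_fallthrough macro and demo_cleanup_steps() are hypothetical names made up for illustration; they are not taken from ufshcd.c, and the actual fix in the patch may look different. In-kernel code would normally use the kernel's own fallthrough pseudo-keyword rather than a local macro.

```c
/* Hypothetical states, for illustration only; not taken from ufshcd.c. */
enum demo_pm_state { DEMO_ACTIVE, DEMO_SUSPENDING, DEMO_SUSPENDED };

/* Use the statement attribute where the compiler supports it. */
#if defined(__has_attribute)
# if __has_attribute(fallthrough)
#  define demo_fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef demo_fallthrough
# define demo_fallthrough do {} while (0) /* fallthrough */
#endif

/* Returns how many undo steps are needed from the given state. */
static int demo_cleanup_steps(enum demo_pm_state state)
{
	int steps = 0;

	switch (state) {
	case DEMO_SUSPENDED:
		steps++;		/* undo the completed suspend */
		demo_fallthrough;	/* deliberate: continue into next case */
	case DEMO_SUSPENDING:
		steps++;		/* undo the partial suspend work */
		break;
	case DEMO_ACTIVE:
		break;
	}
	return steps;
}
```

Without the demo_fallthrough statement, a compiler run with -Wimplicit-fallthrough would flag the transition from DEMO_SUSPENDED into DEMO_SUSPENDING, and -Werror would turn that into a build failure.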

Change since v8:
- Added one more fix to ufshcd_abort as requested by Stanley Chu

Change since v7:
- Incorporated Asutosh's comments
- Refined patch "scsi: ufs: Recover hba runtime PM error in error handler"

Change since v6:
- Replaced change "scsi: ufs-qcom: Fix schedule while atomic error in ufs_qcom_dump_dbg_regs" with "scsi: ufs-qcom: Remove testbus dump in ufs_qcom_dump_dbg_regs"

Change since v5:
- Dropped change "scsi: ufs: Fix imbalanced scsi_block_reqs_cnt caused by ufshcd_hold()" as it is not closely related to this series
- Refined function ufshcd_err_handling_prepare() in change "scsi: ufs: Recover hba runtime PM error in error handler"

Change since v4:
- Split the original change "ufs: ufs-qcom: Fix a few BUGs in func ufs_qcom_dump_dbg_regs()" into two smaller changes

Change since v3:
- Split the original change "scsi: ufs: Fix up and simplify error recovery mechanism" into 5 changes

Change since v2:
- Incorporated Bart's comment on change "scsi: ufs: Add checks before setting clk-gating states"
- Revised the commit msg of change "scsi: ufs: Fix up and simplify error recovery mechanism"

Change since v1:
- Fixed a compilation error when CONFIG_PM is disabled (=n)
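
The changelog does not say how the CONFIG_PM=n build error was fixed. As a hedged illustration only, one common kernel pattern for this class of error is to provide a no-op stub when the option is off, so callers compile in both configurations. All names below (demo_hba, demo_runtime_resume, demo_recover) are hypothetical and do not come from the series.

```c
/* #define CONFIG_PM 1 */	/* normally set by Kconfig; left unset here */

/* Hypothetical host state, for illustration only. */
struct demo_hba {
	int pm_error;
};

#ifdef CONFIG_PM
/* Real implementation: clears the recorded runtime PM error. */
static int demo_runtime_resume(struct demo_hba *hba)
{
	hba->pm_error = 0;
	return 0;
}
#else
/* Stub for CONFIG_PM=n: keeps callers building, does nothing. */
static inline int demo_runtime_resume(struct demo_hba *hba)
{
	(void)hba;
	return 0;
}
#endif

/* The caller needs no #ifdef of its own. */
static int demo_recover(struct demo_hba *hba)
{
	return demo_runtime_resume(hba);
}
```

The point of the stub is that demo_recover() stays free of preprocessor conditionals; only the definition site knows whether runtime PM is built in.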

Can Guo (9):
  scsi: ufs: Add checks before setting clk-gating states
  ufs: ufs-qcom: Fix race conditions caused by func
    ufs_qcom_testbus_config
  scsi: ufs-qcom: Remove testbus dump in ufs_qcom_dump_dbg_regs
  scsi: ufs: Add some debug infos to ufshcd_print_host_state
  scsi: ufs: Fix concurrency of error handler and other error recovery
    paths
  scsi: ufs: Recover hba runtime PM error in error handler
  scsi: ufs: Move dumps in IRQ handler to error handler
  scsi: ufs: Fix a racing problem btw error handler and runtime PM ops
  scsi: ufs: Properly release resources if a task is aborted
    successfully

 drivers/scsi/ufs/ufs-qcom.c  |  37 ----
 drivers/scsi/ufs/ufs-sysfs.c |   1 +
 drivers/scsi/ufs/ufshcd.c    | 509 +++++++++++++++++++++++++++----------------
 drivers/scsi/ufs/ufshcd.h    |  14 ++
 4 files changed, 339 insertions(+), 222 deletions(-)

Comments

Martin K. Petersen Aug. 13, 2020, 2:26 a.m. UTC | #1
Can,

> The changes have been tested with error injections of multiple error
> types (and all kinds of mixture of them) during runtime, e.g. hibern8
> enter/ exit error, power mode change error and fatal/non-fatal error
> from IRQ context. During the test, error injections happen randomly
> across all contexts, e.g. clk scaling, clk gate/ungate, runtime
> suspend/resume and IRQ.

Applied to my staging tree. You'll get a formal merge message once 5.10
opens.

Thanks!

Can Guo Aug. 14, 2020, 6:32 a.m. UTC | #2
Hi Martin,

On 2020-08-13 10:26, Martin K. Petersen wrote:
> Can,
> 
>> The changes have been tested with error injections of multiple error
>> types (and all kinds of mixture of them) during runtime, e.g. hibern8
>> enter/ exit error, power mode change error and fatal/non-fatal error
>> from IRQ context. During the test, error injections happen randomly
>> across all contexts, e.g. clk scaling, clk gate/ungate, runtime
>> suspend/resume and IRQ.
> 
> Applied to my staging tree. You'll get a formal merge message once 5.10
> opens.
> 
> Thanks!

Thank you! I will push the error recovery enhancement changes after 5.10
opens.

Regards,

Can Guo

Martin K. Petersen Aug. 18, 2020, 3:11 a.m. UTC | #3
On Sun, 9 Aug 2020 05:15:46 -0700, Can Guo wrote:

> The changes have been tested with error injections of multiple error types (and
> all kinds of mixture of them) during runtime, e.g. hibern8 enter/ exit error,
> power mode change error and fatal/non-fatal error from IRQ context. During the
> test, error injections happen randomly across all contexts, e.g. clk scaling,
> clk gate/ungate, runtime suspend/resume and IRQ.
> 
> There are a few more fixes to resolve other minor problems based on the main
> change, such as LINERESET handling and racing btw error handler and system
> suspend/resume/shutdown, but they will be pushed after this series is taken,
> due to there are already too many lines in these changes.
> 
> [...]

Applied to 5.10/scsi-queue, thanks!

[1/9] scsi: ufs: Add checks before setting clk-gating states
      https://git.kernel.org/mkp/scsi/c/2dec9475a402
[2/9] scsi: ufs: ufs-qcom: Fix race conditions caused by ufs_qcom_testbus_config()
      https://git.kernel.org/mkp/scsi/c/89dd87acd40a
[3/9] scsi: ufs-qcom: Remove testbus dump in ufs_qcom_dump_dbg_regs
      https://git.kernel.org/mkp/scsi/c/423cc66b5152
[4/9] scsi: ufs: Add some debug information to ufshcd_print_host_state()
      https://git.kernel.org/mkp/scsi/c/3f8af6044713
[5/9] scsi: ufs: Fix concurrency of error handler and other error recovery paths
      https://git.kernel.org/mkp/scsi/c/4db7a2360597
[6/9] scsi: ufs: Recover HBA runtime PM error in error handler
      https://git.kernel.org/mkp/scsi/c/c72e79c0ad2b
[7/9] scsi: ufs: Move dumps in IRQ handler to error handler
      https://git.kernel.org/mkp/scsi/c/c3be8d1ee1bf
[8/9] scsi: ufs: Fix a race condition between error handler and runtime PM ops
      https://git.kernel.org/mkp/scsi/c/5586dd8ea250
[9/9] scsi: ufs: Properly release resources if a task is aborted successfully
      https://git.kernel.org/mkp/scsi/c/35afe60929ab