Message ID | 20240923080344.19084-3-peter.wang@mediatek.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | fix abort defect | expand |
On 9/23/24 1:03 AM, peter.wang@mediatek.com wrote: > When the error handler successfully aborts a MCQ request, > it only releases the command and does not notify the SCSI layer. > This may cause another abort after 30 seconds timeout. > This patch notifies the SCSI layer to requeue the request. > > Additionally, ignore the OCS: ABORTED CQ slot after MCQ mode > SQ cleanup. This makes the behavior of MCQ mode consistent with > that of legacy SDB mode. > > Also, print logs for OCS: ABORTED and OCS_INVALID_COMMAND_STATUS > for debugging purposes. Although I like the approach of this patch, two comments below. > diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c > index a6f818cdef0e..b5c7bc50a27e 100644 > --- a/drivers/ufs/core/ufshcd.c > +++ b/drivers/ufs/core/ufshcd.c > @@ -5405,9 +5405,15 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp, > break; > case OCS_ABORTED: > result |= DID_ABORT << 16; > + dev_warn(hba->dev, > + "OCS aborted from controller = %x for tag %d\n", > + ocs, lrbp->task_tag); > break; Including the OCS status in this message seems redundant to me. > case OCS_INVALID_COMMAND_STATUS: > result |= DID_REQUEUE << 16; > + dev_warn(hba->dev, > + "OCS invaild from controller = %x for tag %d\n", > + ocs, lrbp->task_tag); Also here, including the OCS status in this message seems redundant to me. Please change "invaild" into "invalid". > @@ -5526,6 +5532,18 @@ void ufshcd_compl_one_cqe(struct ufs_hba *hba, int task_tag, > ufshcd_update_monitor(hba, lrbp); > ufshcd_add_command_trace(hba, task_tag, UFS_CMD_COMP); > cmd->result = ufshcd_transfer_rsp_status(hba, lrbp, cqe); > + > + /* > + * Ignore MCQ OCS: ABORTED posted by the host controller. > + * This makes the behavior of MCQ mode consistent with that > + * of legacy SDB mode. > + */ > + if (hba->mcq_enabled) { > + ocs = ufshcd_get_tr_ocs(lrbp, cqe); > + if (ocs == OCS_ABORTED) > + return; > + } Why only ignore the OCS_ABORTED status in MCQ mode? Is my understanding correct that MediaTek controllers can also report this status in legacy mode? Thanks, Bart.
On Mon, 2024-09-23 at 11:19 -0700, Bart Van Assche wrote: > > External email : Please do not click links or open attachments until > you have verified the sender or the content. > On 9/23/24 1:03 AM, peter.wang@mediatek.com wrote: > > When the error handler successfully aborts a MCQ request, > > it only releases the command and does not notify the SCSI layer. > > This may cause another abort after 30 seconds timeout. > > This patch notifies the SCSI layer to requeue the request. > > > > Additionally, ignore the OCS: ABORTED CQ slot after MCQ mode > > SQ cleanup. This makes the behavior of MCQ mode consistent with > > that of legacy SDB mode. > > > > Also, print logs for OCS: ABORTED and OCS_INVALID_COMMAND_STATUS > > for debugging purposes. > > Although I like the approach of this patch, two comments below. > > > diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c > > index a6f818cdef0e..b5c7bc50a27e 100644 > > --- a/drivers/ufs/core/ufshcd.c > > +++ b/drivers/ufs/core/ufshcd.c > > @@ -5405,9 +5405,15 @@ ufshcd_transfer_rsp_status(struct ufs_hba > *hba, struct ufshcd_lrb *lrbp, > > break; > > case OCS_ABORTED: > > result |= DID_ABORT << 16; > > +dev_warn(hba->dev, > > +"OCS aborted from controller = %x for tag %d\n", > > +ocs, lrbp->task_tag); > > break; > > Including the OCS status in this message seems redundant to me. > > > case OCS_INVALID_COMMAND_STATUS: > > result |= DID_REQUEUE << 16; > > +dev_warn(hba->dev, > > +"OCS invaild from controller = %x for tag %d\n", > > +ocs, lrbp->task_tag); > > Also here, including the OCS status in this message seems redundant > to me. > > Please change "invaild" into "invalid". Hi Bart, Okay, I will revise the content printed here. > > > @@ -5526,6 +5532,18 @@ void ufshcd_compl_one_cqe(struct ufs_hba > *hba, int task_tag, > > ufshcd_update_monitor(hba, lrbp); > > ufshcd_add_command_trace(hba, task_tag, UFS_CMD_COMP); > > cmd->result = ufshcd_transfer_rsp_status(hba, lrbp, cqe); > > + > > +/* > > + * Ignore MCQ OCS: ABORTED posted by the host controller. > > + * This makes the behavior of MCQ mode consistent with that > > + * of legacy SDB mode. > > + */ > > +if (hba->mcq_enabled) { > > +ocs = ufshcd_get_tr_ocs(lrbp, cqe); > > +if (ocs == OCS_ABORTED) > > +return; > > +} > > Why only ignore the OCS_ABORTED status in MCQ mode? Is my > understanding > correct that MediaTek controllers can also report this status in > legacy > mode? > > Thanks, Because according to the specification, only MCQ will have OCS: ABORTED. MediaTek, even with a quirk handling SDB OCS ABORTED, cannot simply ignore it here and not deal with it. Otherwise, it would need to release directly like MCQ ufshcd_abort_one. Thanks Peter > > Bart.
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c index a6f818cdef0e..b5c7bc50a27e 100644 --- a/drivers/ufs/core/ufshcd.c +++ b/drivers/ufs/core/ufshcd.c @@ -5405,9 +5405,15 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp, break; case OCS_ABORTED: result |= DID_ABORT << 16; + dev_warn(hba->dev, + "OCS aborted from controller = %x for tag %d\n", + ocs, lrbp->task_tag); break; case OCS_INVALID_COMMAND_STATUS: result |= DID_REQUEUE << 16; + dev_warn(hba->dev, + "OCS invaild from controller = %x for tag %d\n", + ocs, lrbp->task_tag); break; case OCS_INVALID_CMD_TABLE_ATTR: case OCS_INVALID_PRDT_ATTR: @@ -5526,6 +5532,18 @@ void ufshcd_compl_one_cqe(struct ufs_hba *hba, int task_tag, ufshcd_update_monitor(hba, lrbp); ufshcd_add_command_trace(hba, task_tag, UFS_CMD_COMP); cmd->result = ufshcd_transfer_rsp_status(hba, lrbp, cqe); + + /* + * Ignore MCQ OCS: ABORTED posted by the host controller. + * This makes the behavior of MCQ mode consistent with that + * of legacy SDB mode. + */ + if (hba->mcq_enabled) { + ocs = ufshcd_get_tr_ocs(lrbp, cqe); + if (ocs == OCS_ABORTED) + return; + } + ufshcd_release_scsi_cmd(hba, lrbp); /* Do not touch lrbp after scsi done */ scsi_done(cmd); @@ -6486,8 +6504,11 @@ static bool ufshcd_abort_one(struct request *rq, void *priv) if (!hwq) return 0; spin_lock_irqsave(&hwq->cq_lock, flags); - if (ufshcd_cmd_inflight(lrbp->cmd)) + if (ufshcd_cmd_inflight(lrbp->cmd)) { + set_host_byte(lrbp->cmd, DID_REQUEUE); ufshcd_release_scsi_cmd(hba, lrbp); + scsi_done(lrbp->cmd); + } spin_unlock_irqrestore(&hwq->cq_lock, flags); }