Message ID | 20170728140359.15424-1-thenzl@redhat.com (mailing list archive) |
---|---|
State | Accepted, archived |
Tomas,

> the eh reset function returns success when fw_outstanding equals zero,
> that means that the counter shouldn't be decremented
> when the driver still owns the command

Kashyap? Sumit?

>-----Original Message-----
>From: Tomas Henzl [mailto:thenzl@redhat.com]
>Sent: Friday, July 28, 2017 7:34 PM
>To: linux-scsi@vger.kernel.org
>Cc: sumit.saxena@broadcom.com; kashyap.desai@broadcom.com
>Subject: [PATCH] megaraid_sas: move command counter to correct place
>
>the eh reset function returns success when fw_outstanding equals zero, that
>means that the counter shouldn't be decremented when the driver still owns
>the command
>
>
>Signed-off-by: Tomas Henzl <thenzl@redhat.com>
>---
> drivers/scsi/megaraid/megaraid_sas_fusion.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>index f990ab4d45..c615aadb2b 100644
>--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>@@ -3046,7 +3046,6 @@ complete_cmd_fusion(struct megasas_instance *instance, u32 MSIxIndex)
> 		}
> 		//Fall thru and complete IO
> 		case MEGASAS_MPI2_FUNCTION_LD_IO_REQUEST: /* LD-IO Path */
>-			atomic_dec(&instance->fw_outstanding);
> 			if (cmd_fusion->r1_alt_dev_handle == MR_DEVHANDLE_INVALID) {
> 				map_cmd_status(fusion, scmd_local, status,
> 					       extStatus, le32_to_cpu(data_length),
>@@ -3060,6 +3059,7 @@ complete_cmd_fusion(struct megasas_instance *instance, u32 MSIxIndex)
> 				scmd_local->scsi_done(scmd_local);
> 			} else	/* Optimal VD - R1 FP command completion. */
> 				megasas_complete_r1_command(instance, cmd_fusion);
>+			atomic_dec(&instance->fw_outstanding);
> 			break;
> 		case MEGASAS_MPI2_FUNCTION_PASSTHRU_IO_REQUEST: /*MFI command */
> 			cmd_mfi = instance->cmd_list[cmd_fusion->sync_cmd_idx];

Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>

>--
>2.9.4
Tomas,

> the eh reset function returns success when fw_outstanding equals zero,
> that means that the counter shouldn't be decremented when the driver
> still owns the command

Applied to 4.13/scsi-fixes. Thank you!

>-----Original Message-----
>From: Martin K. Petersen [mailto:martin.petersen@oracle.com]
>Sent: Monday, August 07, 2017 11:07 PM
>To: Tomas Henzl
>Cc: linux-scsi@vger.kernel.org; sumit.saxena@broadcom.com;
>kashyap.desai@broadcom.com
>Subject: Re: [PATCH] megaraid_sas: move command counter to correct place
>
>
>Tomas,
>
>> the eh reset function returns success when fw_outstanding equals zero,
>> that means that the counter shouldn't be decremented when the driver
>> still owns the command
>
>Applied to 4.13/scsi-fixes. Thank you!

Just realized that this patch may cause performance regression.
With this patch below scenario may occur-

-Consider outstanding IOs reaches to controller's Queue depth.
-Driver frees up command and complete it back to SML.
-Since host_busy is decremented, SML can issue one new IO to driver.
-By the time, if "fw_outstanding" is not decremented by driver, then
driver will return newly submitted IO back to SML with return status
SCSI_MLQUEUE_HOST_BUSY because of below code
in megaraid_sas driver's IO submission path-

	if (atomic_inc_return(&instance->fw_outstanding) >
			instance->host->can_queue) {
		atomic_dec(&instance->fw_outstanding);
		return SCSI_MLQUEUE_HOST_BUSY;
	}

This situation will be more evident when RAID1 fastpath IOs are running as
in that case driver will be issuing two IOs to firmware for single IO
issued from SML.
Above situation can be avoided, if this patch is removed.

Regarding Tomas' concern, there should not be any problem as driver calls
"synchronize_irq" before checking "fw_outstanding". Once fw_outstanding is
decremented and driver frees up command, then only driver will be checking
"fw_outstanding" equals to zero or not so all this will always fall in a
sequence and will not cause the problem stated by Tomas.

I am sorry for confusion and would request to revert this patch.

Thanks,
Sumit

>
>--
>Martin K. Petersen	Oracle Linux Engineering
On Tue, 2017-08-08 at 13:07 +0530, Sumit Saxena wrote:
> > -----Original Message-----
> > From: Martin K. Petersen [mailto:martin.petersen@oracle.com]
> > Sent: Monday, August 07, 2017 11:07 PM
> > To: Tomas Henzl
> > Cc: linux-scsi@vger.kernel.org; sumit.saxena@broadcom.com;
> > kashyap.desai@broadcom.com
> > Subject: Re: [PATCH] megaraid_sas: move command counter to correct
> > place
> >
> >
> > Tomas,
> >
> > > the eh reset function returns success when fw_outstanding equals
> > > zero,
> > > that means that the counter shouldn't be decremented when the
> > > driver
> > > still owns the command
> >
> > Applied to 4.13/scsi-fixes. Thank you!
>
> Just realized that this patch may cause performance regression.
> With this patch below scenario may occur-
>
> -Consider outstanding IOs reaches to controller's Queue depth.
> -Driver frees up command and complete it back to SML.
> -Since host_busy is decremented, SML can issue one new IO to driver.
> -By the time, if "fw_outstanding" is not decremented by driver, then
> driver will return newly submitted IO back to SML with return status
> SCSI_MLQUEUE_HOST_BUSY because of below code
> in megaraid_sas driver's IO submission path-
>
> 	if (atomic_inc_return(&instance->fw_outstanding) >
> 			instance->host->can_queue) {
> 		atomic_dec(&instance->fw_outstanding);
> 		return SCSI_MLQUEUE_HOST_BUSY;
> 	}
>
> This situation will be more evident when RAID1 fastpath IOs are
> running as in that case driver will be issuing two IOs to firmware
> for single IO issued from SML. Above situation can be avoided, if
> this patch is removed.
>
> Regarding Tomas' concern, there should not be any problem as driver
> calls "synchronize_irq" before checking "fw_outstanding". Once
> fw_outstanding is decremented and driver frees up command, then only
> driver will be checking "fw_outstanding" equals to zero or not so all
> this will always fall in a sequence and will not cause the problem
> stated by Tomas.
>
> I am sorry for confusion and would request to revert this patch.

OK, I've taken it out of the fixes tree.

Martin: this means my fixes branch got rebased; can you rebase your
fixes branch on top of mine before we all get a beating from Stephen
Rothwell because of a mismerge in linux-next due to the tree
differences?

Thanks,

James
James,

> Martin: this means my fixes branch got rebased; can you rebase your
> fixes branch on top of mine before we all get a beating from Stephen
> Rothwell because of a mismerge in linux-next due to the tree
> differences?

Done.
On 8.8.2017 09:37, Sumit Saxena wrote:
>> -----Original Message-----
>> From: Martin K. Petersen [mailto:martin.petersen@oracle.com]
>> Sent: Monday, August 07, 2017 11:07 PM
>> To: Tomas Henzl
>> Cc: linux-scsi@vger.kernel.org; sumit.saxena@broadcom.com;
>> kashyap.desai@broadcom.com
>> Subject: Re: [PATCH] megaraid_sas: move command counter to correct place
>>
>>
>> Tomas,
>>
>>> the eh reset function returns success when fw_outstanding equals zero,
>>> that means that the counter shouldn't be decremented when the driver
>>> still owns the command
>> Applied to 4.13/scsi-fixes. Thank you!
> Just realized that this patch may cause performance regression.
> With this patch below scenario may occur-
>
> -Consider outstanding IOs reaches to controller's Queue depth.
> -Driver frees up command and complete it back to SML.
> -Since host_busy is decremented, SML can issue one new IO to driver.
> -By the time, if "fw_outstanding" is not decremented by driver, then
> driver will return newly submitted IO back to SML with return status
> SCSI_MLQUEUE_HOST_BUSY because of below code
> in megaraid_sas driver's IO submission path-
>
> 	if (atomic_inc_return(&instance->fw_outstanding) >
> 			instance->host->can_queue) {
> 		atomic_dec(&instance->fw_outstanding);
> 		return SCSI_MLQUEUE_HOST_BUSY;
> 	}
>
> This situation will be more evident when RAID1 fastpath IOs are running as
> in that case driver will be issuing two IOs to firmware for single IO
> issued from SML.
> Above situation can be avoided, if this patch is removed.
>
> Regarding Tomas' concern, there should not be any problem as driver calls
> "synchronize_irq" before checking "fw_outstanding". Once fw_outstanding is
> decremented and
> driver frees up command, then only driver will be checking
> "fw_outstanding" equals to zero or not so all this will always fall in a
> sequence and will not
> cause the problem stated by Tomas.

I haven't expected this to fix a real issue in latest upstream code,
just wanted to follow the correct ordering.
If it creates a performance issue, reverting the patch is fine for me.

>
> I am sorry for confusion and would request to revert this patch.
>
> Thanks,
> Sumit
>
>> --
>> Martin K. Petersen	Oracle Linux Engineering
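The queue-depth scenario Sumit describes in this thread can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not driver code: `model_instance`, `model_submit`, `model_counter_dec` and `resubmit_accepted` are hypothetical names, C11 `<stdatomic.h>` stands in for the kernel's `atomic_t` helpers, the queue depth of 2 is arbitrary, and the mid-layer resubmission is simulated as a direct call at the point where `scsi_done()` would have returned the command.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver instance (not struct megasas_instance). */
struct model_instance {
	atomic_int fw_outstanding;	/* commands counted as outstanding */
	int can_queue;			/* controller queue depth */
};

/*
 * Mirrors the submission check quoted in the thread. Returns false where
 * the driver would return SCSI_MLQUEUE_HOST_BUSY.
 */
static bool model_submit(struct model_instance *inst)
{
	/* fetch_add returns the old value; +1 matches atomic_inc_return() */
	if (atomic_fetch_add(&inst->fw_outstanding, 1) + 1 > inst->can_queue) {
		atomic_fetch_sub(&inst->fw_outstanding, 1);
		return false;
	}
	return true;
}

static void model_counter_dec(struct model_instance *inst)
{
	int prev = atomic_fetch_sub(&inst->fw_outstanding, 1);

	assert(prev > 0);	/* the counter must never go negative */
}

/*
 * Completes one command while the queue is full and reports whether a new
 * IO, issued right after the completion callback, is accepted.
 */
static bool resubmit_accepted(bool dec_before_done)
{
	struct model_instance inst = { .can_queue = 2 };
	bool accepted;

	atomic_init(&inst.fw_outstanding, 0);
	model_submit(&inst);
	model_submit(&inst);		/* queue is now at full depth */

	if (dec_before_done)
		model_counter_dec(&inst);	/* ordering before the patch */

	/* scsi_done() would run here; the mid-layer may submit again */
	accepted = model_submit(&inst);

	if (!dec_before_done)
		model_counter_dec(&inst);	/* ordering with the patch */

	return accepted;
}
```

In this model `resubmit_accepted(true)` accepts the new IO, while `resubmit_accepted(false)` rejects it, which corresponds to the `SCSI_MLQUEUE_HOST_BUSY` return Sumit is concerned about. How often the real driver would hit that window is not something the model can show.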
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index f990ab4d45..c615aadb2b 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -3046,7 +3046,6 @@ complete_cmd_fusion(struct megasas_instance *instance, u32 MSIxIndex)
 		}
 		//Fall thru and complete IO
 		case MEGASAS_MPI2_FUNCTION_LD_IO_REQUEST: /* LD-IO Path */
-			atomic_dec(&instance->fw_outstanding);
 			if (cmd_fusion->r1_alt_dev_handle == MR_DEVHANDLE_INVALID) {
 				map_cmd_status(fusion, scmd_local, status,
 					       extStatus, le32_to_cpu(data_length),
@@ -3060,6 +3059,7 @@ complete_cmd_fusion(struct megasas_instance *instance, u32 MSIxIndex)
 				scmd_local->scsi_done(scmd_local);
 			} else	/* Optimal VD - R1 FP command completion. */
 				megasas_complete_r1_command(instance, cmd_fusion);
+			atomic_dec(&instance->fw_outstanding);
 			break;
 		case MEGASAS_MPI2_FUNCTION_PASSTHRU_IO_REQUEST: /*MFI command */
 			cmd_mfi = instance->cmd_list[cmd_fusion->sync_cmd_idx];
the eh reset function returns success when fw_outstanding equals zero,
that means that the counter shouldn't be decremented when the driver
still owns the command

Signed-off-by: Tomas Henzl <thenzl@redhat.com>
---
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
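The ordering concern in the commit message can be sketched as a single-threaded model. Everything here is an assumption made for illustration: `model_state`, `counter_reports_idle` and `idle_seen_while_owned` are hypothetical names, C11 `<stdatomic.h>` replaces the kernel atomics, and the "observation point" simply samples the counter in the middle of the completion path. Per Sumit's reply in this thread, the real driver calls `synchronize_irq` before checking `fw_outstanding`, so the sketch shows the ordering argument only and does not demonstrate a reachable bug.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum dec_order { DEC_BEFORE_COMPLETION, DEC_AFTER_COMPLETION };

/* Hypothetical model of one outstanding command. */
struct model_state {
	atomic_int fw_outstanding;
	bool driver_owns_cmd;	/* true until completion work has finished */
};

/* What a check that looks only at the counter would conclude. */
static bool counter_reports_idle(struct model_state *s)
{
	return atomic_load(&s->fw_outstanding) == 0;
}

/*
 * Walks the completion path for a single command and returns true if the
 * counter reads zero while the driver still owns the command.
 */
static bool idle_seen_while_owned(enum dec_order order)
{
	struct model_state s = { .driver_owns_cmd = true };
	bool seen;

	atomic_init(&s.fw_outstanding, 1);

	if (order == DEC_BEFORE_COMPLETION)
		atomic_fetch_sub(&s.fw_outstanding, 1);

	/* observation point: completion work is still in progress */
	seen = counter_reports_idle(&s) && s.driver_owns_cmd;

	s.driver_owns_cmd = false;	/* completion work finished */

	if (order == DEC_AFTER_COMPLETION)
		atomic_fetch_sub(&s.fw_outstanding, 1);

	assert(atomic_load(&s.fw_outstanding) == 0);
	return seen;
}
```

With `DEC_BEFORE_COMPLETION` the model observes a zero counter while the command is still owned, and with `DEC_AFTER_COMPLETION` it does not, which is the ordering the patch aimed for.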