diff mbox series

[V4,12/12] scsi: replace sdev->device_busy with sbitmap

Message ID 20201116090737.50989-13-ming.lei@redhat.com (mailing list archive)
State New, archived
Headers show
Series blk-mq/scsi: tracking device queue depth via sbitmap | expand

Commit Message

Ming Lei Nov. 16, 2020, 9:07 a.m. UTC
scsi requires one global atomic variable to track queue depth for each LUN/
request queue, meantime blk-mq tracks queue depth for each hctx. This SCSI's
requirement can't be implemented in blk-mq easily, cause it is a bigger &
harder problem to spread the device or request queue's depth among all hw
queues.

The current approach by using atomic variable can't scale well when there
is lots of CPU cores and the disk is very fast and IO are submitted to this
device concurrently. It has been observed that IOPS is affected a lot by
tracking queue depth via sdev->device_busy in IO path.

So replace the atomic variable sdev->device_busy with sbitmap for
tracking scsi device queue depth.

It is observed that IOPS is improved ~30% by this patchset in the
following test:

1) test machine(32 logical CPU cores)
	Thread(s) per core:  2
	Core(s) per socket:  8
	Socket(s):           2
	NUMA node(s):        2
	Model name:          Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz

2) setup scsi_debug:
modprobe scsi_debug virtual_gb=128 max_luns=1 submit_queues=32 delay=0 max_queue=256

3) fio script:
fio --rw=randread --size=128G --direct=1 --ioengine=libaio --iodepth=2048 \
	--numjobs=32 --bs=4k --group_reporting=1 --group_reporting=1 --runtime=60 \
	--loops=10000 --name=job1 --filename=/dev/sdN

[1] https://lore.kernel.org/linux-block/20200119071432.18558-6-ming.lei@redhat.com/

Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/scsi.c        |  4 +++-
 drivers/scsi/scsi_lib.c    | 33 +++++++++++++++------------------
 drivers/scsi/scsi_priv.h   |  3 +++
 drivers/scsi/scsi_scan.c   | 23 +++++++++++++++++++++--
 drivers/scsi/scsi_sysfs.c  |  2 ++
 include/scsi/scsi_device.h |  5 +++--
 6 files changed, 47 insertions(+), 23 deletions(-)

Comments

kernel test robot Nov. 16, 2020, 11:49 a.m. UTC | #1
Hi Ming,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on block/for-next]
[also build test ERROR on mkp-scsi/for-next scsi/for-next v5.10-rc4 next-20201116]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
config: powerpc64-randconfig-r026-20201116 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c044709b8fbea2a9a375e4173a6bd735f6866c0c)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install powerpc64 cross compiling tool for clang build
        # apt-get install binutils-powerpc64-linux-gnu
        # https://github.com/0day-ci/linux/commit/cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
        git checkout cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no member named 'device_busy' in 'struct scsi_device'
           sdev_busy = atomic_read(&scmd->device->device_busy);
                                    ~~~~~~~~~~~~  ^
   1 error generated.

vim +365 drivers/scsi/megaraid/megaraid_sas_fusion.c

4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  353  
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  354  static inline void
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  355  megasas_get_msix_index(struct megasas_instance *instance,
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  356  		       struct scsi_cmnd *scmd,
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  357  		       struct megasas_cmd_fusion *cmd,
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  358  		       u8 data_arms)
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  359  {
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  360  	int sdev_busy;
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  361  
103fbf8e4020845 Kashyap Desai 2020-08-19  362  	/* TBD - if sml remove device_busy in future, driver
103fbf8e4020845 Kashyap Desai 2020-08-19  363  	 * should track counter in internal structure.
103fbf8e4020845 Kashyap Desai 2020-08-19  364  	 */
103fbf8e4020845 Kashyap Desai 2020-08-19 @365  	sdev_busy = atomic_read(&scmd->device->device_busy);
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  366  
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  367  	if (instance->perf_mode == MR_BALANCED_PERF_MODE &&
103fbf8e4020845 Kashyap Desai 2020-08-19  368  	    sdev_busy > (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH)) {
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  369  		cmd->request_desc->SCSIIO.MSIxIndex =
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  370  			mega_mod64((atomic64_add_return(1, &instance->high_iops_outstanding) /
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  371  					MR_HIGH_IOPS_BATCH_COUNT), instance->low_latency_index_start);
103fbf8e4020845 Kashyap Desai 2020-08-19  372  	} else if (instance->msix_load_balance) {
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  373  		cmd->request_desc->SCSIIO.MSIxIndex =
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  374  			(mega_mod64(atomic64_add_return(1, &instance->total_io_count),
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  375  				instance->msix_vectors));
103fbf8e4020845 Kashyap Desai 2020-08-19  376  	} else if (instance->host->nr_hw_queues > 1) {
103fbf8e4020845 Kashyap Desai 2020-08-19  377  		u32 tag = blk_mq_unique_tag(scmd->request);
103fbf8e4020845 Kashyap Desai 2020-08-19  378  
103fbf8e4020845 Kashyap Desai 2020-08-19  379  		cmd->request_desc->SCSIIO.MSIxIndex = blk_mq_unique_tag_to_hwq(tag) +
103fbf8e4020845 Kashyap Desai 2020-08-19  380  			instance->low_latency_index_start;
103fbf8e4020845 Kashyap Desai 2020-08-19  381  	} else {
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  382  		cmd->request_desc->SCSIIO.MSIxIndex =
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  383  			instance->reply_map[raw_smp_processor_id()];
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  384  	}
103fbf8e4020845 Kashyap Desai 2020-08-19  385  }
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  386  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
Ming Lei Nov. 18, 2020, 2:35 a.m. UTC | #2
Hello Kashyap & Sumanesh,

On Mon, Nov 16, 2020 at 07:49:31PM +0800, kernel test robot wrote:
> Hi Ming,
> 
> Thank you for the patch! Yet something to improve:
> 
> [auto build test ERROR on block/for-next]
> [also build test ERROR on mkp-scsi/for-next scsi/for-next v5.10-rc4 next-20201116]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
> 
> url:    https://github.com/0day-ci/linux/commits/Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
> config: powerpc64-randconfig-r026-20201116 (attached as .config)
> compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c044709b8fbea2a9a375e4173a6bd735f6866c0c)
> reproduce (this is a W=1 build):
>         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # install powerpc64 cross compiling tool for clang build
>         # apt-get install binutils-powerpc64-linux-gnu
>         # https://github.com/0day-ci/linux/commit/cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
>         git remote add linux-review https://github.com/0day-ci/linux
>         git fetch --no-tags linux-review Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
>         git checkout cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
>         # save the attached .config to linux build tree
>         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64 
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
> 
> All errors (new ones prefixed by >>):
> 
> >> drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no member named 'device_busy' in 'struct scsi_device'
>            sdev_busy = atomic_read(&scmd->device->device_busy);

This new reference to sdev->device_busy is added by recent shared host
tag patch, and according to the comment, you may have planed to convert into
one megaraid internal counter.

        /* TBD - if sml remove device_busy in future, driver
         * should track counter in internal structure.
         */

So can you post one patch? And I am happy to fold it into this series.

Thanks,
Ming
Hannes Reinecke Nov. 18, 2020, 7:15 a.m. UTC | #3
Hey Ming,

On 11/18/20 3:35 AM, Ming Lei wrote:
> Hello Kashyap & Sumanesh,
> 
> On Mon, Nov 16, 2020 at 07:49:31PM +0800, kernel test robot wrote:
>> Hi Ming,
>>
>> Thank you for the patch! Yet something to improve:
>>
>> [auto build test ERROR on block/for-next]
>> [also build test ERROR on mkp-scsi/for-next scsi/for-next v5.10-rc4 next-20201116]
>> [If your patch is applied to the wrong git tree, kindly drop us a note.
>> And when submitting patch, we suggest to use '--base' as documented in
>> https://git-scm.com/docs/git-format-patch]
>>
>> url:    https://github.com/0day-ci/linux/commits/Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
>> base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
>> config: powerpc64-randconfig-r026-20201116 (attached as .config)
>> compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c044709b8fbea2a9a375e4173a6bd735f6866c0c)
>> reproduce (this is a W=1 build):
>>          wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>>          chmod +x ~/bin/make.cross
>>          # install powerpc64 cross compiling tool for clang build
>>          # apt-get install binutils-powerpc64-linux-gnu
>>          # https://github.com/0day-ci/linux/commit/cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
>>          git remote add linux-review https://github.com/0day-ci/linux
>>          git fetch --no-tags linux-review Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
>>          git checkout cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
>>          # save the attached .config to linux build tree
>>          COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64
>>
>> If you fix the issue, kindly add following tag as appropriate
>> Reported-by: kernel test robot <lkp@intel.com>
>>
>> All errors (new ones prefixed by >>):
>>
>>>> drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no member named 'device_busy' in 'struct scsi_device'
>>             sdev_busy = atomic_read(&scmd->device->device_busy);
> 
> This new reference to sdev->device_busy is added by recent shared host
> tag patch, and according to the comment, you may have planed to convert into
> one megaraid internal counter.
> 
>          /* TBD - if sml remove device_busy in future, driver
>           * should track counter in internal structure.
>           */
> 
> So can you post one patch? And I am happy to fold it into this series.
> 
Seeing that we already have the accessor 'scsi_device_busy()' it's 
probably easier to just use that and not fiddle with driver internals.
See the attached patch.

Cheers,

Hannes
Ming Lei Nov. 18, 2020, 7:44 a.m. UTC | #4
On Wed, Nov 18, 2020 at 08:15:47AM +0100, Hannes Reinecke wrote:
> Hey Ming,
> 
> On 11/18/20 3:35 AM, Ming Lei wrote:
> > Hello Kashyap & Sumanesh,
> > 
> > On Mon, Nov 16, 2020 at 07:49:31PM +0800, kernel test robot wrote:
> > > Hi Ming,
> > > 
> > > Thank you for the patch! Yet something to improve:
> > > 
> > > [auto build test ERROR on block/for-next]
> > > [also build test ERROR on mkp-scsi/for-next scsi/for-next v5.10-rc4 next-20201116]
> > > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > > And when submitting patch, we suggest to use '--base' as documented in
> > > https://git-scm.com/docs/git-format-patch]
> > > 
> > > url:    https://github.com/0day-ci/linux/commits/Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
> > > base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
> > > config: powerpc64-randconfig-r026-20201116 (attached as .config)
> > > compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c044709b8fbea2a9a375e4173a6bd735f6866c0c)
> > > reproduce (this is a W=1 build):
> > >          wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> > >          chmod +x ~/bin/make.cross
> > >          # install powerpc64 cross compiling tool for clang build
> > >          # apt-get install binutils-powerpc64-linux-gnu
> > >          # https://github.com/0day-ci/linux/commit/cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
> > >          git remote add linux-review https://github.com/0day-ci/linux
> > >          git fetch --no-tags linux-review Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
> > >          git checkout cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
> > >          # save the attached .config to linux build tree
> > >          COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64
> > > 
> > > If you fix the issue, kindly add following tag as appropriate
> > > Reported-by: kernel test robot <lkp@intel.com>
> > > 
> > > All errors (new ones prefixed by >>):
> > > 
> > > > > drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no member named 'device_busy' in 'struct scsi_device'
> > >             sdev_busy = atomic_read(&scmd->device->device_busy);
> > 
> > This new reference to sdev->device_busy is added by recent shared host
> > tag patch, and according to the comment, you may have planed to convert into
> > one megaraid internal counter.
> > 
> >          /* TBD - if sml remove device_busy in future, driver
> >           * should track counter in internal structure.
> >           */
> > 
> > So can you post one patch? And I am happy to fold it into this series.
> > 
> Seeing that we already have the accessor 'scsi_device_busy()' it's probably
> easier to just use that and not fiddle with driver internals.
> See the attached patch.
> 
> Cheers,
> 
> Hannes
> -- 
> Dr. Hannes Reinecke                Kernel Storage Architect
> hare@suse.de                              +49 911 74053 688
> SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
> HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

> From d8fa5e61187dbe851b8da9c65a5df5ec5809f8ea Mon Sep 17 00:00:00 2001
> From: Hannes Reinecke <hare@suse.de>
> Date: Wed, 18 Nov 2020 08:08:41 +0100
> Subject: [PATCH] megaraid_sas: use scsi_device_busy() instead of direct access
>  to atomic counter
> 
> It's always a bad style to access structure internals, especially if
> there is an accessor for it. So convert to use scsi_device_busy()
> intead of accessing the atomic counter directly.
> 
> Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>  drivers/scsi/megaraid/megaraid_sas_fusion.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> index fd607287608e..272ff123bc6b 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> @@ -362,7 +362,7 @@ megasas_get_msix_index(struct megasas_instance *instance,
>  	/* TBD - if sml remove device_busy in future, driver
>  	 * should track counter in internal structure.
>  	 */
> -	sdev_busy = atomic_read(&scmd->device->device_busy);
> +	sdev_busy = scsi_device_busy(scmd->device);

megasas_get_msix_index() is called in .queuecommand() path,
scsi_device_busy() might take more cycles since it has to iterate over
each sbitmap words, especially when the sbitmap depth is high.

I'd suggest Kashyap/Sumanesh to check if there is better way to
deal with it. If not, scsi_device_busy() should be fine.


Thanks,
Ming
Hannes Reinecke Nov. 18, 2020, 9:15 a.m. UTC | #5
On 11/18/20 8:44 AM, Ming Lei wrote:
> On Wed, Nov 18, 2020 at 08:15:47AM +0100, Hannes Reinecke wrote:
>> Hey Ming,
>>
>> On 11/18/20 3:35 AM, Ming Lei wrote:
>>> Hello Kashyap & Sumanesh,
>>>
>>> On Mon, Nov 16, 2020 at 07:49:31PM +0800, kernel test robot wrote:
>>>> Hi Ming,
>>>>
>>>> Thank you for the patch! Yet something to improve:
>>>>
>>>> [auto build test ERROR on block/for-next]
>>>> [also build test ERROR on mkp-scsi/for-next scsi/for-next v5.10-rc4 next-20201116]
>>>> [If your patch is applied to the wrong git tree, kindly drop us a note.
>>>> And when submitting patch, we suggest to use '--base' as documented in
>>>> https://git-scm.com/docs/git-format-patch]
>>>>
>>>> url:    https://github.com/0day-ci/linux/commits/Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
>>>> base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
>>>> config: powerpc64-randconfig-r026-20201116 (attached as .config)
>>>> compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c044709b8fbea2a9a375e4173a6bd735f6866c0c)
>>>> reproduce (this is a W=1 build):
>>>>           wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>>>>           chmod +x ~/bin/make.cross
>>>>           # install powerpc64 cross compiling tool for clang build
>>>>           # apt-get install binutils-powerpc64-linux-gnu
>>>>           # https://github.com/0day-ci/linux/commit/cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
>>>>           git remote add linux-review https://github.com/0day-ci/linux
>>>>           git fetch --no-tags linux-review Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
>>>>           git checkout cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
>>>>           # save the attached .config to linux build tree
>>>>           COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64
>>>>
>>>> If you fix the issue, kindly add following tag as appropriate
>>>> Reported-by: kernel test robot <lkp@intel.com>
>>>>
>>>> All errors (new ones prefixed by >>):
>>>>
>>>>>> drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no member named 'device_busy' in 'struct scsi_device'
>>>>              sdev_busy = atomic_read(&scmd->device->device_busy);
>>>
>>> This new reference to sdev->device_busy is added by recent shared host
>>> tag patch, and according to the comment, you may have planed to convert into
>>> one megaraid internal counter.
>>>
>>>           /* TBD - if sml remove device_busy in future, driver
>>>            * should track counter in internal structure.
>>>            */
>>>
>>> So can you post one patch? And I am happy to fold it into this series.
>>>
>> Seeing that we already have the accessor 'scsi_device_busy()' it's probably
>> easier to just use that and not fiddle with driver internals.
>> See the attached patch.
>>
>> Cheers,
>>
>> Hannes
>> -- 
>> Dr. Hannes Reinecke                Kernel Storage Architect
>> hare@suse.de                              +49 911 74053 688
>> SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
>> HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
> 
>>  From d8fa5e61187dbe851b8da9c65a5df5ec5809f8ea Mon Sep 17 00:00:00 2001
>> From: Hannes Reinecke <hare@suse.de>
>> Date: Wed, 18 Nov 2020 08:08:41 +0100
>> Subject: [PATCH] megaraid_sas: use scsi_device_busy() instead of direct access
>>   to atomic counter
>>
>> It's always a bad style to access structure internals, especially if
>> there is an accessor for it. So convert to use scsi_device_busy()
>> intead of accessing the atomic counter directly.
>>
>> Cc: Kashyap Desai <kashyap.desai@broadcom.com>
>> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>   drivers/scsi/megaraid/megaraid_sas_fusion.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> index fd607287608e..272ff123bc6b 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> @@ -362,7 +362,7 @@ megasas_get_msix_index(struct megasas_instance *instance,
>>   	/* TBD - if sml remove device_busy in future, driver
>>   	 * should track counter in internal structure.
>>   	 */
>> -	sdev_busy = atomic_read(&scmd->device->device_busy);
>> +	sdev_busy = scsi_device_busy(scmd->device);
> 
> megasas_get_msix_index() is called in .queuecommand() path,
> scsi_device_busy() might take more cycles since it has to iterate over
> each sbitmap words, especially when the sbitmap depth is high.
> 
> I'd suggest Kashyap/Sumanesh to check if there is better way to
> deal with it. If not, scsi_device_busy() should be fine.
> 
I guess this whole codepath will become obsolete if and when support for 
polling queues / io_uring will be implemented for megaraid_sas.
This whole section deals with spreading the load over several hardware 
queues once the dedicated one is at risk of being congested.
But this is only required if someone want to reach high IOPS; so if we 
have poll/io_uring support there won't be a need for this anymore.
Or that's the theory, at least :-)

But the patch should be good enough for now to get your patchset in.

Cheers,

Hannes
Kashyap Desai Nov. 19, 2020, 6:20 a.m. UTC | #6
> >
> > If you fix the issue, kindly add following tag as appropriate
> > Reported-by: kernel test robot <lkp@intel.com>
> >
> > All errors (new ones prefixed by >>):
> >
> > >> drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no
member
> named 'device_busy' in 'struct scsi_device'
> >            sdev_busy = atomic_read(&scmd->device->device_busy);
>
> This new reference to sdev->device_busy is added by recent shared host
tag
> patch, and according to the comment, you may have planed to convert into
> one megaraid internal counter.
>
>         /* TBD - if sml remove device_busy in future, driver
>          * should track counter in internal structure.
>          */
>
> So can you post one patch? And I am happy to fold it into this series.

Ming - Please find the patch for megaraid_sas driver -
I have used helper inline function just for inter-operability with older
kernel to support in our out of box driver.
This way it will be easy for us to replace helper function as per kernel
version check.

Subject: [PATCH] megaraid_sas: replace sdev_busy with local counter

---
 drivers/scsi/megaraid/megaraid_sas.h        |  2 ++
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 34 ++++++++++++++++++---
 2 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas.h
b/drivers/scsi/megaraid/megaraid_sas.h
index 0f808d63580e..0c6a56b24c6e 100644
--- a/drivers/scsi/megaraid/megaraid_sas.h
+++ b/drivers/scsi/megaraid/megaraid_sas.h
@@ -2019,10 +2019,12 @@ union megasas_frame {
  * struct MR_PRIV_DEVICE - sdev private hostdata
  * @is_tm_capable: firmware managed tm_capable flag
  * @tm_busy: TM request is in progress
+ * @sdev_priv_busy: pending command per sdev
  */
 struct MR_PRIV_DEVICE {
        bool is_tm_capable;
        bool tm_busy;
+       atomic_t sdev_priv_busy;
        atomic_t r1_ldio_hint;
        u8 interface_type;
        u8 task_abort_tmo;
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index fd607287608e..e813ea0ad8b7 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -220,6 +220,32 @@ megasas_clear_intr_fusion(struct megasas_instance
*instance)
        return 1;
 }

+static inline void
+megasas_sdev_busy_inc(struct scsi_cmnd *scmd)
+{
+       struct MR_PRIV_DEVICE *mr_device_priv_data;
+
+       mr_device_priv_data = scmd->device->hostdata;
+       atomic_inc(&mr_device_priv_data->sdev_priv_busy);
+}
+static inline void
+megasas_sdev_busy_dec(struct scsi_cmnd *scmd)
+{
+       struct MR_PRIV_DEVICE *mr_device_priv_data;
+
+       mr_device_priv_data = scmd->device->hostdata;
+       atomic_dec(&mr_device_priv_data->sdev_priv_busy);
+}
+static inline int
+megasas_sdev_busy_read(struct scsi_cmnd *scmd)
+{
+       struct MR_PRIV_DEVICE *mr_device_priv_data;
+
+       mr_device_priv_data = scmd->device->hostdata;
+       return atomic_read(&mr_device_priv_data->sdev_priv_busy);
+}
+
+
 /**
  * megasas_get_cmd_fusion -    Get a command from the free pool
  * @instance:          Adapter soft state
@@ -359,10 +385,7 @@ megasas_get_msix_index(struct megasas_instance
*instance,
 {
        int sdev_busy;

-       /* TBD - if sml remove device_busy in future, driver
-        * should track counter in internal structure.
-        */
-       sdev_busy = atomic_read(&scmd->device->device_busy);
+       sdev_busy = megasas_sdev_busy_read(scmd);

        if (instance->perf_mode == MR_BALANCED_PERF_MODE &&
            sdev_busy > (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH)) {
@@ -3390,6 +3413,7 @@ megasas_build_and_issue_cmd_fusion(struct
megasas_instance *instance,
         * Issue the command to the FW
         */

+       megasas_sdev_busy_inc(scmd);
        megasas_fire_cmd_fusion(instance, req_desc);

        if (r1_cmd)
@@ -3450,6 +3474,7 @@ megasas_complete_r1_command(struct megasas_instance
*instance,
                scmd_local->SCp.ptr = NULL;
                megasas_return_cmd_fusion(instance, cmd);
                scsi_dma_unmap(scmd_local);
+               megasas_sdev_busy_dec(scmd_local);
                scmd_local->scsi_done(scmd_local);
        }
 }
@@ -3550,6 +3575,7 @@ complete_cmd_fusion(struct megasas_instance
*instance, u32 MSIxIndex,
                                scmd_local->SCp.ptr = NULL;
                                megasas_return_cmd_fusion(instance,
cmd_fusion);
                                scsi_dma_unmap(scmd_local);
+                               megasas_sdev_busy_dec(scmd_local);
                                scmd_local->scsi_done(scmd_local);
                        } else  /* Optimal VD - R1 FP command completion.
*/
                                megasas_complete_r1_command(instance,
cmd_fusion);
--
2.18.1

>
> Thanks,
> Ming
Kashyap Desai Nov. 19, 2020, 6:29 a.m. UTC | #7
> > From: Hannes Reinecke <hare@suse.de>
> > Date: Wed, 18 Nov 2020 08:08:41 +0100
> > Subject: [PATCH] megaraid_sas: use scsi_device_busy() instead of
> > direct access  to atomic counter
> >
> > It's always a bad style to access structure internals, especially if
> > there is an accessor for it. So convert to use scsi_device_busy()
> > intead of accessing the atomic counter directly.
> >
> > Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> > Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> > Signed-off-by: Hannes Reinecke <hare@suse.de>
> > ---
> >  drivers/scsi/megaraid/megaraid_sas_fusion.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > index fd607287608e..272ff123bc6b 100644
> > --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > @@ -362,7 +362,7 @@ megasas_get_msix_index(struct megasas_instance
> *instance,
> >  	/* TBD - if sml remove device_busy in future, driver
> >  	 * should track counter in internal structure.
> >  	 */
> > -	sdev_busy = atomic_read(&scmd->device->device_busy);
> > +	sdev_busy = scsi_device_busy(scmd->device);
>
> megasas_get_msix_index() is called in .queuecommand() path,
> scsi_device_busy() might take more cycles since it has to iterate over
each
> sbitmap words, especially when the sbitmap depth is high.
>
> I'd suggest Kashyap/Sumanesh to check if there is better way to deal
with it. If
> not, scsi_device_busy() should be fine.

Scsi_device_busy() add significant amount of overhead which will be
visible in terms of reduced IOPS and high CPU cycle. I tested it earlier
and noticed regression in performance.
I posted megaraid_sas driver patch which will use internal per sdev
outstanding similar to legacy sdev_busy counter.

Kashyap
>
>
> Thanks,
> Ming
Ming Lei Nov. 19, 2020, 6:30 a.m. UTC | #8
On Wed, Nov 18, 2020 at 10:15:19AM +0100, Hannes Reinecke wrote:
> On 11/18/20 8:44 AM, Ming Lei wrote:
> > On Wed, Nov 18, 2020 at 08:15:47AM +0100, Hannes Reinecke wrote:
> > > Hey Ming,
> > > 
> > > On 11/18/20 3:35 AM, Ming Lei wrote:
> > > > Hello Kashyap & Sumanesh,
> > > > 
> > > > On Mon, Nov 16, 2020 at 07:49:31PM +0800, kernel test robot wrote:
> > > > > Hi Ming,
> > > > > 
> > > > > Thank you for the patch! Yet something to improve:
> > > > > 
> > > > > [auto build test ERROR on block/for-next]
> > > > > [also build test ERROR on mkp-scsi/for-next scsi/for-next v5.10-rc4 next-20201116]
> > > > > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > > > > And when submitting patch, we suggest to use '--base' as documented in
> > > > > https://git-scm.com/docs/git-format-patch]
> > > > > 
> > > > > url:    https://github.com/0day-ci/linux/commits/Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
> > > > > base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
> > > > > config: powerpc64-randconfig-r026-20201116 (attached as .config)
> > > > > compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c044709b8fbea2a9a375e4173a6bd735f6866c0c)
> > > > > reproduce (this is a W=1 build):
> > > > >           wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> > > > >           chmod +x ~/bin/make.cross
> > > > >           # install powerpc64 cross compiling tool for clang build
> > > > >           # apt-get install binutils-powerpc64-linux-gnu
> > > > >           # https://github.com/0day-ci/linux/commit/cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
> > > > >           git remote add linux-review https://github.com/0day-ci/linux
> > > > >           git fetch --no-tags linux-review Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
> > > > >           git checkout cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
> > > > >           # save the attached .config to linux build tree
> > > > >           COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64
> > > > > 
> > > > > If you fix the issue, kindly add following tag as appropriate
> > > > > Reported-by: kernel test robot <lkp@intel.com>
> > > > > 
> > > > > All errors (new ones prefixed by >>):
> > > > > 
> > > > > > > drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no member named 'device_busy' in 'struct scsi_device'
> > > > >              sdev_busy = atomic_read(&scmd->device->device_busy);
> > > > 
> > > > This new reference to sdev->device_busy is added by recent shared host
> > > > tag patch, and according to the comment, you may have planed to convert into
> > > > one megaraid internal counter.
> > > > 
> > > >           /* TBD - if sml remove device_busy in future, driver
> > > >            * should track counter in internal structure.
> > > >            */
> > > > 
> > > > So can you post one patch? And I am happy to fold it into this series.
> > > > 
> > > Seeing that we already have the accessor 'scsi_device_busy()' it's probably
> > > easier to just use that and not fiddle with driver internals.
> > > See the attached patch.
> > > 
> > > Cheers,
> > > 
> > > Hannes
> > > -- 
> > > Dr. Hannes Reinecke                Kernel Storage Architect
> > > hare@suse.de                              +49 911 74053 688
> > > SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
> > > HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
> > 
> > >  From d8fa5e61187dbe851b8da9c65a5df5ec5809f8ea Mon Sep 17 00:00:00 2001
> > > From: Hannes Reinecke <hare@suse.de>
> > > Date: Wed, 18 Nov 2020 08:08:41 +0100
> > > Subject: [PATCH] megaraid_sas: use scsi_device_busy() instead of direct access
> > >   to atomic counter
> > > 
> > > It's always a bad style to access structure internals, especially if
> > > there is an accessor for it. So convert to use scsi_device_busy()
> > > intead of accessing the atomic counter directly.
> > > 
> > > Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> > > Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> > > Signed-off-by: Hannes Reinecke <hare@suse.de>
> > > ---
> > >   drivers/scsi/megaraid/megaraid_sas_fusion.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > index fd607287608e..272ff123bc6b 100644
> > > --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > @@ -362,7 +362,7 @@ megasas_get_msix_index(struct megasas_instance *instance,
> > >   	/* TBD - if sml remove device_busy in future, driver
> > >   	 * should track counter in internal structure.
> > >   	 */
> > > -	sdev_busy = atomic_read(&scmd->device->device_busy);
> > > +	sdev_busy = scsi_device_busy(scmd->device);
> > 
> > megasas_get_msix_index() is called in .queuecommand() path,
> > scsi_device_busy() might take more cycles since it has to iterate over
> > each sbitmap words, especially when the sbitmap depth is high.
> > 
> > I'd suggest Kashyap/Sumanesh to check if there is better way to
> > deal with it. If not, scsi_device_busy() should be fine.
> > 
> I guess this whole codepath will become obsolete if and when support for
> polling queues / io_uring will be implemented for megaraid_sas.

Not sure if it is related with iopoll which requires host tags. I understand
the code path for selecting msi index should be replaced with the following
simply if host tags is enabled:

	if (instance->host->nr_hw_queues > 1) {
                u32 tag = blk_mq_unique_tag(scmd->request);

                cmd->request_desc->SCSIIO.MSIxIndex = blk_mq_unique_tag_to_hwq(tag) +
                        instance->low_latency_index_start;
	} else {
		if (instance->perf_mode == MR_BALANCED_PERF_MODE)
			...
		else if (instance->msix_load_balance)
			...
		else
			cmd->request_desc->SCSIIO.MSIxIndex =
				instance->reply_map[raw_smp_processor_id()];
	}

Otherwise there might be risk to trigger soft lockup in case of host tags.

sdev->device_busy is only required for MR_BALANCED_PERF_MODE, so your
patch can be adjusted to read the counter only for MR_BALANCED_PERF_MODE.

> This whole section deals with spreading the load over several hardware
> queues once the dedicated one is at risk of being congested.
> But this is only required if someone want to reach high IOPS; so if we have
> poll/io_uring support there won't be a need for this anymore.

I understand poll is for low latency usage with extra cost of CPU utilization, and
iopoll can't replace irq based IO. But I may misunderstood your point.




Thanks, 
Ming
Ming Lei Nov. 19, 2020, 6:34 a.m. UTC | #9
On Thu, Nov 19, 2020 at 11:50:39AM +0530, Kashyap Desai wrote:
> > >
> > > If you fix the issue, kindly add following tag as appropriate
> > > Reported-by: kernel test robot <lkp@intel.com>
> > >
> > > All errors (new ones prefixed by >>):
> > >
> > > >> drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no
> member
> > named 'device_busy' in 'struct scsi_device'
> > >            sdev_busy = atomic_read(&scmd->device->device_busy);
> >
> > This new reference to sdev->device_busy is added by recent shared host
> tag
> > patch, and according to the comment, you may have planed to convert into
> > one megaraid internal counter.
> >
> >         /* TBD - if sml remove device_busy in future, driver
> >          * should track counter in internal structure.
> >          */
> >
> > So can you post one patch? And I am happy to fold it into this series.
> 
> Ming - Please find the patch for megaraid_sas driver -
> I have used helper inline function just for inter-operability with older
> kernel to support in our out of box driver.
> This way it will be easy for us to replace helper function as per kernel
> version check.
> 
> Subject: [PATCH] megaraid_sas: replace sdev_busy with local counter
> 
> ---
>  drivers/scsi/megaraid/megaraid_sas.h        |  2 ++
>  drivers/scsi/megaraid/megaraid_sas_fusion.c | 34 ++++++++++++++++++---
>  2 files changed, 32 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
> b/drivers/scsi/megaraid/megaraid_sas.h
> index 0f808d63580e..0c6a56b24c6e 100644
> --- a/drivers/scsi/megaraid/megaraid_sas.h
> +++ b/drivers/scsi/megaraid/megaraid_sas.h
> @@ -2019,10 +2019,12 @@ union megasas_frame {
>   * struct MR_PRIV_DEVICE - sdev private hostdata
>   * @is_tm_capable: firmware managed tm_capable flag
>   * @tm_busy: TM request is in progress
> + * @sdev_priv_busy: pending command per sdev
>   */
>  struct MR_PRIV_DEVICE {
>         bool is_tm_capable;
>         bool tm_busy;
> +       atomic_t sdev_priv_busy;
>         atomic_t r1_ldio_hint;
>         u8 interface_type;
>         u8 task_abort_tmo;
> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> index fd607287608e..e813ea0ad8b7 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> @@ -220,6 +220,32 @@ megasas_clear_intr_fusion(struct megasas_instance
> *instance)
>         return 1;
>  }
> 
> +static inline void
> +megasas_sdev_busy_inc(struct scsi_cmnd *scmd)
> +{
> +       struct MR_PRIV_DEVICE *mr_device_priv_data;
> +
> +       mr_device_priv_data = scmd->device->hostdata;
> +       atomic_inc(&mr_device_priv_data->sdev_priv_busy);
> +}
> +static inline void
> +megasas_sdev_busy_dec(struct scsi_cmnd *scmd)
> +{
> +       struct MR_PRIV_DEVICE *mr_device_priv_data;
> +
> +       mr_device_priv_data = scmd->device->hostdata;
> +       atomic_dec(&mr_device_priv_data->sdev_priv_busy);
> +}
> +static inline int
> +megasas_sdev_busy_read(struct scsi_cmnd *scmd)
> +{
> +       struct MR_PRIV_DEVICE *mr_device_priv_data;
> +
> +       mr_device_priv_data = scmd->device->hostdata;
> +       return atomic_read(&mr_device_priv_data->sdev_priv_busy);
> +}
> +
> +
>  /**
>   * megasas_get_cmd_fusion -    Get a command from the free pool
>   * @instance:          Adapter soft state
> @@ -359,10 +385,7 @@ megasas_get_msix_index(struct megasas_instance
> *instance,
>  {
>         int sdev_busy;
> 
> -       /* TBD - if sml remove device_busy in future, driver
> -        * should track counter in internal structure.
> -        */
> -       sdev_busy = atomic_read(&scmd->device->device_busy);
> +       sdev_busy = megasas_sdev_busy_read(scmd);

The above is only used for MR_BALANCED_PERF_MODE, so maybe you can
skip inc/dec/read the counter for other perf mode.



Thanks, 
Ming
Kashyap Desai Nov. 19, 2020, 7:09 a.m. UTC | #10
> >  /**
> >   * megasas_get_cmd_fusion -    Get a command from the free pool
> >   * @instance:          Adapter soft state
> > @@ -359,10 +385,7 @@ megasas_get_msix_index(struct megasas_instance
> > *instance,  {
> >         int sdev_busy;
> >
> > -       /* TBD - if sml remove device_busy in future, driver
> > -        * should track counter in internal structure.
> > -        */
> > -       sdev_busy = atomic_read(&scmd->device->device_busy);
> > +       sdev_busy = megasas_sdev_busy_read(scmd);
>
> The above is only used for MR_BALANCED_PERF_MODE, so maybe you can
> skip inc/dec/read the counter for other perf mode.

Agree. I have created V2.

Subject: [PATCH] megaraid_sas: v2 replace sdev_busy with local counter

use local tracking of per sdev outstanding command since sdev_busy in
SML is improved for peformance reason using sbitmap (earlier it was
atomic variable).

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
---
 drivers/scsi/megaraid/megaraid_sas.h        |  2 +
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 51 +++++++++++++++++----
 2 files changed, 45 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas.h
b/drivers/scsi/megaraid/megaraid_sas.h
index 0f808d63580e..0c6a56b24c6e 100644
--- a/drivers/scsi/megaraid/megaraid_sas.h
+++ b/drivers/scsi/megaraid/megaraid_sas.h
@@ -2019,10 +2019,12 @@ union megasas_frame {
  * struct MR_PRIV_DEVICE - sdev private hostdata
  * @is_tm_capable: firmware managed tm_capable flag
  * @tm_busy: TM request is in progress
+ * @sdev_priv_busy: pending command per sdev
  */
 struct MR_PRIV_DEVICE {
        bool is_tm_capable;
        bool tm_busy;
+       atomic_t sdev_priv_busy;
        atomic_t r1_ldio_hint;
        u8 interface_type;
        u8 task_abort_tmo;
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index fd607287608e..c630404cbb2d 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -220,6 +220,44 @@ megasas_clear_intr_fusion(struct megasas_instance
*instance)
        return 1;
 }

+static inline void
+megasas_sdev_busy_inc(struct megasas_instance *instance,
+                     struct scsi_cmnd *scmd)
+{
+       struct MR_PRIV_DEVICE *mr_device_priv_data;
+
+       if (instance->perf_mode != MR_BALANCED_PERF_MODE)
+               return;
+
+       mr_device_priv_data = scmd->device->hostdata;
+       atomic_inc(&mr_device_priv_data->sdev_priv_busy);
+}
+static inline void
+megasas_sdev_busy_dec(struct megasas_instance *instance,
+                     struct scsi_cmnd *scmd)
+{
+       struct MR_PRIV_DEVICE *mr_device_priv_data;
+
+       if (instance->perf_mode != MR_BALANCED_PERF_MODE)
+               return;
+
+       mr_device_priv_data = scmd->device->hostdata;
+       atomic_dec(&mr_device_priv_data->sdev_priv_busy);
+}
+static inline int
+megasas_sdev_busy_read(struct megasas_instance *instance,
+                      struct scsi_cmnd *scmd)
+{
+       struct MR_PRIV_DEVICE *mr_device_priv_data;
+
+       if (instance->perf_mode != MR_BALANCED_PERF_MODE)
+               return 0;
+
+       mr_device_priv_data = scmd->device->hostdata;
+       return atomic_read(&mr_device_priv_data->sdev_priv_busy);
+}
+
+
 /**
  * megasas_get_cmd_fusion -    Get a command from the free pool
  * @instance:          Adapter soft state
@@ -357,15 +395,9 @@ megasas_get_msix_index(struct megasas_instance
*instance,
                       struct megasas_cmd_fusion *cmd,
                       u8 data_arms)
 {
-       int sdev_busy;
-
-       /* TBD - if sml remove device_busy in future, driver
-        * should track counter in internal structure.
-        */
-       sdev_busy = atomic_read(&scmd->device->device_busy);
-
        if (instance->perf_mode == MR_BALANCED_PERF_MODE &&
-           sdev_busy > (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH)) {
+           (megasas_sdev_busy_read(instance, scmd) >
+           (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH))) {
                cmd->request_desc->SCSIIO.MSIxIndex =
                        mega_mod64((atomic64_add_return(1,
&instance->high_iops_outstanding) /
                                        MR_HIGH_IOPS_BATCH_COUNT),
instance->low_latency_index_start);
@@ -3390,6 +3422,7 @@ megasas_build_and_issue_cmd_fusion(struct
megasas_instance *instance,
         * Issue the command to the FW
         */

+       megasas_sdev_busy_inc(instance, scmd);
        megasas_fire_cmd_fusion(instance, req_desc);

        if (r1_cmd)
@@ -3450,6 +3483,7 @@ megasas_complete_r1_command(struct megasas_instance
*instance,
                scmd_local->SCp.ptr = NULL;
                megasas_return_cmd_fusion(instance, cmd);
                scsi_dma_unmap(scmd_local);
+               megasas_sdev_busy_dec(instance, scmd_local);
                scmd_local->scsi_done(scmd_local);
        }
 }
@@ -3550,6 +3584,7 @@ complete_cmd_fusion(struct megasas_instance
*instance, u32 MSIxIndex,
                                scmd_local->SCp.ptr = NULL;
                                megasas_return_cmd_fusion(instance,
cmd_fusion);
                                scsi_dma_unmap(scmd_local);
+                               megasas_sdev_busy_dec(instance,
scmd_local);
                                scmd_local->scsi_done(scmd_local);
                        } else  /* Optimal VD - R1 FP command completion.
*/
                                megasas_complete_r1_command(instance,
cmd_fusion);
--
>
>
>
> Thanks,
> Ming
diff mbox series

Patch

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index a28d48c850cf..e9e2f0e15ac8 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -218,7 +218,7 @@  void scsi_finish_command(struct scsi_cmnd *cmd)
 /*
  * 1024 is big enough for saturating the fast scsi LUN now
  */
-static int scsi_device_max_queue_depth(struct scsi_device *sdev)
+int scsi_device_max_queue_depth(struct scsi_device *sdev)
 {
 	return max_t(int, sdev->host->can_queue, 1024);
 }
@@ -242,6 +242,8 @@  int scsi_change_queue_depth(struct scsi_device *sdev, int depth)
 	if (sdev->request_queue)
 		blk_set_queue_depth(sdev->request_queue, depth);
 
+	sbitmap_resize(&sdev->budget_map, sdev->queue_depth);
+
 	return sdev->queue_depth;
 }
 EXPORT_SYMBOL(scsi_change_queue_depth);
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 4709418c47e0..f5b675d02929 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -327,7 +327,7 @@  void scsi_device_unbusy(struct scsi_device *sdev, struct scsi_cmnd *cmd)
 	if (starget->can_queue > 0)
 		atomic_dec(&starget->target_busy);
 
-	atomic_dec(&sdev->device_busy);
+	sbitmap_put(&sdev->budget_map, cmd->budget_token);
 	cmd->budget_token = -1;
 }
 
@@ -1250,19 +1250,17 @@  scsi_device_state_check(struct scsi_device *sdev, struct request *req)
 }
 
 /*
- * scsi_dev_queue_ready: if we can send requests to sdev, return 1 else
- * return 0.
- *
- * Called with the queue_lock held.
+ * scsi_dev_queue_ready: if we can send requests to sdev, assign one token
+ * and return the token else return -1.
  */
 static inline int scsi_dev_queue_ready(struct request_queue *q,
 				  struct scsi_device *sdev)
 {
-	unsigned int busy;
+	int token;
 
-	busy = atomic_inc_return(&sdev->device_busy) - 1;
+	token = sbitmap_get(&sdev->budget_map);
 	if (atomic_read(&sdev->device_blocked)) {
-		if (busy)
+		if (token >= 0)
 			goto out_dec;
 
 		/*
@@ -1274,13 +1272,11 @@  static inline int scsi_dev_queue_ready(struct request_queue *q,
 				   "unblocking device at zero depth\n"));
 	}
 
-	if (busy >= sdev->queue_depth)
-		goto out_dec;
-
-	return 1;
+	return token;
 out_dec:
-	atomic_dec(&sdev->device_busy);
-	return 0;
+	if (token >= 0)
+		sbitmap_put(&sdev->budget_map, token);
+	return -1;
 }
 
 /*
@@ -1605,15 +1601,16 @@  static void scsi_mq_put_budget(struct request_queue *q, int budget_token)
 {
 	struct scsi_device *sdev = q->queuedata;
 
-	atomic_dec(&sdev->device_busy);
+	sbitmap_put(&sdev->budget_map, budget_token);
 }
 
 static int scsi_mq_get_budget(struct request_queue *q)
 {
 	struct scsi_device *sdev = q->queuedata;
+	int token = scsi_dev_queue_ready(q, sdev);
 
-	if (scsi_dev_queue_ready(q, sdev))
-		return 0;
+	if (token >= 0)
+		return token;
 
 	atomic_inc(&sdev->restarts);
 
@@ -1723,7 +1720,7 @@  static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
 		break;
 	case BLK_STS_RESOURCE:
 	case BLK_STS_ZONE_RESOURCE:
-		if (atomic_read(&sdev->device_busy) ||
+		if (sbitmap_any_bit_set(&sdev->budget_map) ||
 		    scsi_device_blocked(sdev))
 			ret = BLK_STS_DEV_RESOURCE;
 		break;
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index 180636d54982..30b35002d2f8 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -5,6 +5,7 @@ 
 #include <linux/device.h>
 #include <linux/async.h>
 #include <scsi/scsi_device.h>
+#include <linux/sbitmap.h>
 
 struct request_queue;
 struct request;
@@ -182,6 +183,8 @@  static inline void scsi_dh_add_device(struct scsi_device *sdev) { }
 static inline void scsi_dh_release_device(struct scsi_device *sdev) { }
 #endif
 
+extern int scsi_device_max_queue_depth(struct scsi_device *sdev);
+
 /* 
  * internal scsi timeout functions: for use by mid-layer and transport
  * classes.
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 9af50e6f94c4..9f1b7f3c650a 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -215,6 +215,7 @@  static void scsi_unlock_floptical(struct scsi_device *sdev,
 static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
 					   u64 lun, void *hostdata)
 {
+	unsigned int depth;
 	struct scsi_device *sdev;
 	int display_failure_msg = 1, ret;
 	struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
@@ -276,8 +277,25 @@  static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
 	WARN_ON_ONCE(!blk_get_queue(sdev->request_queue));
 	sdev->request_queue->queuedata = sdev;
 
-	scsi_change_queue_depth(sdev, sdev->host->cmd_per_lun ?
-					sdev->host->cmd_per_lun : 1);
+	depth = sdev->host->cmd_per_lun ?: 1;
+
+	/*
+	 * Use .can_queue as budget map's depth because we have to
+	 * support adjusting queue depth from sysfs. Meantime use
+	 * default device queue depth to figure out sbitmap shift
+	 * since we use this queue depth most of times.
+	 */
+	if (sbitmap_init_node(&sdev->budget_map,
+				scsi_device_max_queue_depth(sdev),
+				sbitmap_calculate_shift(depth),
+				GFP_KERNEL, sdev->request_queue->node,
+				false, true)) {
+		put_device(&starget->dev);
+		kfree(sdev);
+		goto out;
+	}
+
+	scsi_change_queue_depth(sdev, depth);
 
 	scsi_sysfs_device_initialize(sdev);
 
@@ -979,6 +997,7 @@  static int scsi_add_lun(struct scsi_device *sdev, unsigned char *inq_result,
 		scsi_attach_vpd(sdev);
 
 	sdev->max_queue_depth = sdev->queue_depth;
+	WARN_ON_ONCE(sdev->max_queue_depth > sdev->budget_map.depth);
 	sdev->sdev_bflags = *bflags;
 
 	/*
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index ef5c79037bde..79c36cfdb84b 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -477,6 +477,8 @@  static void scsi_device_dev_release_usercontext(struct work_struct *work)
 	/* NULL queue means the device can't be used */
 	sdev->request_queue = NULL;
 
+	sbitmap_free(&sdev->budget_map);
+
 	mutex_lock(&sdev->inquiry_mutex);
 	vpd_pg0 = rcu_replace_pointer(sdev->vpd_pg0, vpd_pg0,
 				       lockdep_is_held(&sdev->inquiry_mutex));
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index dd0b9f690a26..05c7c320ef32 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -8,6 +8,7 @@ 
 #include <linux/blkdev.h>
 #include <scsi/scsi.h>
 #include <linux/atomic.h>
+#include <linux/sbitmap.h>
 
 struct device;
 struct request_queue;
@@ -106,7 +107,7 @@  struct scsi_device {
 	struct list_head    siblings;   /* list of all devices on this host */
 	struct list_head    same_target_siblings; /* just the devices sharing same target id */
 
-	atomic_t device_busy;		/* commands actually active on LLDD */
+	struct sbitmap budget_map;
 	atomic_t device_blocked;	/* Device returned QUEUE_FULL. */
 
 	atomic_t restarts;
@@ -592,7 +593,7 @@  static inline int scsi_device_supports_vpd(struct scsi_device *sdev)
 
 static inline int scsi_device_busy(struct scsi_device *sdev)
 {
-	return atomic_read(&sdev->device_busy);
+	return sbitmap_weight(&sdev->budget_map);
 }
 
 #define MODULE_ALIAS_SCSI_DEVICE(type) \