scsi: target/tcm_loop: update upper limit of LUN
diff mbox series

Message ID 20190805062313.343221-1-naohiro.aota@wdc.com
State Changes Requested
Headers show
Series
  • scsi: target/tcm_loop: update upper limit of LUN
Related show

Commit Message

Naohiro Aota Aug. 5, 2019, 6:23 a.m. UTC
targetcli-fb (or its library: rtslib-fb) allows us to create LUN up to
65535. On the other hand, the kernel driver is limiting max_lun to 0.

This limitation causes an actual problem when you delete a loopback device
(using /sys/class/scsi_device/${lun}/device/delete) and rescan it (using
/sys/class/scsi_host/host${h}/scan). You can delete the device, but cannot
rescan it because its LUN is larger than the max_lun and so the scan
request results in -EINVAL error in scsi_scan_host_selected().

This commit fix the upper limit to be as same as rtslib-fb allows.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 drivers/target/loopback/tcm_loop.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Mike Christie Aug. 5, 2019, 4:33 p.m. UTC | #1
On 08/05/2019 01:23 AM, Naohiro Aota wrote:
> targetcli-fb (or its library: rtslib-fb) allows us to create LUN up to
> 65535. On the other hand, the kernel driver is limiting max_lun to 0.
> 
> This limitation causes an actual problem when you delete a loopback device
> (using /sys/class/scsi_device/${lun}/device/delete) and rescan it (using
> /sys/class/scsi_host/host${h}/scan). You can delete the device, but cannot
> rescan it because its LUN is larger than the max_lun and so the scan
> request results in -EINVAL error in scsi_scan_host_selected().

How are you kicking off this rescan?

Just to make sure I understood you, does the initial LU have LUN 0, you
delete that, then are you creating another LU with a LUN value that is
not 0?

Is it rtslib that is giving the new LU a LUN that is not 0?

> 
> This commit fix the upper limit to be as same as rtslib-fb allows.
> 
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> ---
>  drivers/target/loopback/tcm_loop.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/target/loopback/tcm_loop.c b/drivers/target/loopback/tcm_loop.c
> index 3305b47fdf53..3db541ad727d 100644
> --- a/drivers/target/loopback/tcm_loop.c
> +++ b/drivers/target/loopback/tcm_loop.c
> @@ -336,10 +336,10 @@ static int tcm_loop_driver_probe(struct device *dev)
>  	 */
>  	*((struct tcm_loop_hba **)sh->hostdata) = tl_hba;
>  	/*
> -	 * Setup single ID, Channel and LUN for now..
> +	 * Setup single ID, and Channel for now..
>  	 */
>  	sh->max_id = 2;
> -	sh->max_lun = 0;
> +	sh->max_lun = 65536;
>  	sh->max_channel = 0;
>  	sh->max_cmd_len = SCSI_MAX_VARLEN_CDB_SIZE;
>  
>
Naohiro Aota Aug. 6, 2019, 2:45 a.m. UTC | #2
On Mon, Aug 05, 2019 at 11:33:26AM -0500, Mike Christie wrote:
>On 08/05/2019 01:23 AM, Naohiro Aota wrote:
>> targetcli-fb (or its library: rtslib-fb) allows us to create LUN up to
>> 65535. On the other hand, the kernel driver is limiting max_lun to 0.
>>
>> This limitation causes an actual problem when you delete a loopback device
>> (using /sys/class/scsi_device/${lun}/device/delete) and rescan it (using
>> /sys/class/scsi_host/host${h}/scan). You can delete the device, but cannot
>> rescan it because its LUN is larger than the max_lun and so the scan
>> request results in -EINVAL error in scsi_scan_host_selected().
>
>How are you kicking off this rescan?
>
>Just to make sure I understood you, does the initial LU have LUN 0, you
>delete that, then are you creating another LU with a LUN value that is
>not 0?

Not exactly. I'm working on a case multiple device is added at once to
one loopback scsi host. You can create two or more device using
"targetcli" command and they may have their LUN larger than 0. For
example,

$ sudo targetcli
/backstores/fileio> cd /loopback
/loopback> create
Created target naa.5001405218077d66.
/loopback> exit
$ sudo truncate -s 1048576 /mnt/nvme/foo{1,2,3}
$ sudo targetcli /backstores/fileio create name=foo1 file_or_dev=/mnt/nvme/foo1
Created fileio foo1 with size 1048576
$ sudo targetcli /loopback/naa.5001405218077d66/luns create /backstores/fileio/foo1
Created LUN 0.
(Do the same above for foo2 and foo3)

Then, you'll see each of them has LUN 0, 1, 2 assigned: (rtslib scans used LUN and assign free one)

$ lsscsi
...
[7:0:1:0]    disk    LIO-ORG  foo1             4.0   /dev/sdd
[7:0:1:1]    disk    LIO-ORG  foo2             4.0   /dev/sde
[7:0:1:2]    disk    LIO-ORG  foo3             4.0   /dev/sdf

Now, you can delete one of these device:

$ echo 1 > /sys/class/scsi_device/7\:0\:1\:2/device/delete
$ lsscsi
...
[7:0:1:0]    disk    LIO-ORG  foo1             4.0   /dev/sdd
[7:0:1:1]    disk    LIO-ORG  foo2             4.0   /dev/sde

But, you cannot recover it by the scanning:

$ echo "0 1 2" > /sys/class/scsi_host/host7/scan 
-bash: echo: write error: Invalid argument

This command is failing with -EINVAL because "LUN (= 2) >= max_lun (= 0)".

and, even WILDCARD scan cannot recover it.

$ echo "0 1 -" > /sys/class/scsi_host/host7/scan
$ lsscsi
...
[7:0:1:0]    disk    LIO-ORG  foo1             4.0   /dev/sdd 
[7:0:1:1]    disk    LIO-ORG  foo2             4.0   /dev/sde 

Actually, you cannot even rescan LUN 0, at least with a specific scan:

$ echo 1 > /sys/class/scsi_device/7\:0\:1\:0/device/delete 
$ echo "0 1 0" > /sys/class/scsi_host/host7/scan 
-bash: echo: write error: Invalid argument
$ lsscsi 
...
[7:0:1:1]    disk    LIO-ORG  foo2             4.0   /dev/sde 

Though, it can be revived using wildcard scan:

$ echo "0 1 -" > /sys/class/scsi_host/host7/scan
$ lsscsi
...
[7:0:1:0]    disk    LIO-ORG  foo1             4.0   /dev/sdd 
[7:0:1:1]    disk    LIO-ORG  foo2             4.0   /dev/sde 

>Is it rtslib that is giving the new LU a LUN that is not 0?

Yes. As I said above, it use the first free one.

>>
>> This commit fix the upper limit to be as same as rtslib-fb allows.
>>
>> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
>> ---
>>  drivers/target/loopback/tcm_loop.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/target/loopback/tcm_loop.c b/drivers/target/loopback/tcm_loop.c
>> index 3305b47fdf53..3db541ad727d 100644
>> --- a/drivers/target/loopback/tcm_loop.c
>> +++ b/drivers/target/loopback/tcm_loop.c
>> @@ -336,10 +336,10 @@ static int tcm_loop_driver_probe(struct device *dev)
>>  	 */
>>  	*((struct tcm_loop_hba **)sh->hostdata) = tl_hba;
>>  	/*
>> -	 * Setup single ID, Channel and LUN for now..
>> +	 * Setup single ID, and Channel for now..
>>  	 */
>>  	sh->max_id = 2;
>> -	sh->max_lun = 0;
>> +	sh->max_lun = 65536;
>>  	sh->max_channel = 0;
>>  	sh->max_cmd_len = SCSI_MAX_VARLEN_CDB_SIZE;
>>
>>
>
Mike Christie Aug. 6, 2019, 5:42 p.m. UTC | #3
On 08/05/2019 09:45 PM, Naohiro Aota wrote:
> On Mon, Aug 05, 2019 at 11:33:26AM -0500, Mike Christie wrote:
>> On 08/05/2019 01:23 AM, Naohiro Aota wrote:
>>> targetcli-fb (or its library: rtslib-fb) allows us to create LUN up to
>>> 65535. On the other hand, the kernel driver is limiting max_lun to 0.
>>>
>>> This limitation causes an actual problem when you delete a loopback
>>> device
>>> (using /sys/class/scsi_device/${lun}/device/delete) and rescan it (using
>>> /sys/class/scsi_host/host${h}/scan). You can delete the device, but
>>> cannot
>>> rescan it because its LUN is larger than the max_lun and so the scan
>>> request results in -EINVAL error in scsi_scan_host_selected().
>>
>> How are you kicking off this rescan?
>>
>> Just to make sure I understood you, does the initial LU have LUN 0, you
>> delete that, then are you creating another LU with a LUN value that is
>> not 0?
> 
> Not exactly. I'm working on a case multiple device is added at once to
> one loopback scsi host. You can create two or more device using
> "targetcli" command and they may have their LUN larger than 0. For
> example,
> 
> $ sudo targetcli
> /backstores/fileio> cd /loopback
> /loopback> create
> Created target naa.5001405218077d66.
> /loopback> exit
> $ sudo truncate -s 1048576 /mnt/nvme/foo{1,2,3}
> $ sudo targetcli /backstores/fileio create name=foo1
> file_or_dev=/mnt/nvme/foo1
> Created fileio foo1 with size 1048576
> $ sudo targetcli /loopback/naa.5001405218077d66/luns create
> /backstores/fileio/foo1
> Created LUN 0.
> (Do the same above for foo2 and foo3)
> 
> Then, you'll see each of them has LUN 0, 1, 2 assigned: (rtslib scans
> used LUN and assign free one)
> 
> $ lsscsi
> ...
> [7:0:1:0]    disk    LIO-ORG  foo1             4.0   /dev/sdd
> [7:0:1:1]    disk    LIO-ORG  foo2             4.0   /dev/sde
> [7:0:1:2]    disk    LIO-ORG  foo3             4.0   /dev/sdf
> 
> Now, you can delete one of these device:
> 
> $ echo 1 > /sys/class/scsi_device/7\:0\:1\:2/device/delete
> $ lsscsi
> ...
> [7:0:1:0]    disk    LIO-ORG  foo1             4.0   /dev/sdd
> [7:0:1:1]    disk    LIO-ORG  foo2             4.0   /dev/sde
> 
> But, you cannot recover it by the scanning:
> 

Why are you using the scsi sysfs interface instead of the target
configfs interface?

I know the comment for max_lun says it wants to support 1 LUN, but the
code like in tcm_loop_port_link seems to support multiple LUNs, so your
patch looks like it could be ok. I would just set max_luns to the kernel
(scsi-ml/lio) limit and not some userspace value.

I think the only problem you might have with your patch is that if you
delete the device via the scsi sysfs interface you will not be able to
unmap the LUN from LIO until you add it back due to tcm_loop_port_unlink
failing to look up the device and being able to decrement the tpg refcount.
Naohiro Aota Aug. 8, 2019, 8:42 a.m. UTC | #4
On Tue, Aug 06, 2019 at 12:42:20PM -0500, Mike Christie wrote:
>On 08/05/2019 09:45 PM, Naohiro Aota wrote:
>> On Mon, Aug 05, 2019 at 11:33:26AM -0500, Mike Christie wrote:
>>> On 08/05/2019 01:23 AM, Naohiro Aota wrote:
>>>> targetcli-fb (or its library: rtslib-fb) allows us to create LUN up to
>>>> 65535. On the other hand, the kernel driver is limiting max_lun to 0.
>>>>
>>>> This limitation causes an actual problem when you delete a loopback
>>>> device
>>>> (using /sys/class/scsi_device/${lun}/device/delete) and rescan it (using
>>>> /sys/class/scsi_host/host${h}/scan). You can delete the device, but
>>>> cannot
>>>> rescan it because its LUN is larger than the max_lun and so the scan
>>>> request results in -EINVAL error in scsi_scan_host_selected().
>>>
>>> How are you kicking off this rescan?
>>>
>>> Just to make sure I understood you, does the initial LU have LUN 0, you
>>> delete that, then are you creating another LU with a LUN value that is
>>> not 0?
>>
>> Not exactly. I'm working on a case multiple device is added at once to
>> one loopback scsi host. You can create two or more device using
>> "targetcli" command and they may have their LUN larger than 0. For
>> example,
>>
>> $ sudo targetcli
>> /backstores/fileio> cd /loopback
>> /loopback> create
>> Created target naa.5001405218077d66.
>> /loopback> exit
>> $ sudo truncate -s 1048576 /mnt/nvme/foo{1,2,3}
>> $ sudo targetcli /backstores/fileio create name=foo1
>> file_or_dev=/mnt/nvme/foo1
>> Created fileio foo1 with size 1048576
>> $ sudo targetcli /loopback/naa.5001405218077d66/luns create
>> /backstores/fileio/foo1
>> Created LUN 0.
>> (Do the same above for foo2 and foo3)
>>
>> Then, you'll see each of them has LUN 0, 1, 2 assigned: (rtslib scans
>> used LUN and assign free one)
>>
>> $ lsscsi
>> ...
>> [7:0:1:0]    disk    LIO-ORG  foo1             4.0   /dev/sdd
>> [7:0:1:1]    disk    LIO-ORG  foo2             4.0   /dev/sde
>> [7:0:1:2]    disk    LIO-ORG  foo3             4.0   /dev/sdf
>>
>> Now, you can delete one of these device:
>>
>> $ echo 1 > /sys/class/scsi_device/7\:0\:1\:2/device/delete
>> $ lsscsi
>> ...
>> [7:0:1:0]    disk    LIO-ORG  foo1             4.0   /dev/sdd
>> [7:0:1:1]    disk    LIO-ORG  foo2             4.0   /dev/sde
>>
>> But, you cannot recover it by the scanning:
>>
>
>Why are you using the scsi sysfs interface instead of the target
>configfs interface?

Xfstests btrfs/003 uses the SCSI sysfs interface to emulate missing block device. We can use any SCSI devices to run the test case, so we cannot use target configfs here.

Even the end result of missing "/dev/sd?" is the same, they are two distinct interfaces. So, we need to fix the broken result of the SCSI sysfs interface anyway?

>I know the comment for max_lun says it wants to support 1 LUN, but the
>code like in tcm_loop_port_link seems to support multiple LUNs, so your
>patch looks like it could be ok. I would just set max_luns to the kernel
>(scsi-ml/lio) limit and not some userspace value.

Hm, taking look at the code (target_fabric_make_lun), there is no upper limit check there. So, set max_lun = U64_MAX right?
I once considered that but when I create LUN larger than 65535, "targetcli ls" died complaining "LUN must be 0 to 65535". So I used 65536 here.

Or, should we use max_lun = U64_MAX and fix userland side? They need to be the same, anyway, I believe...

>I think the only problem you might have with your patch is that if you
>delete the device via the scsi sysfs interface you will not be able to
>unmap the LUN from LIO until you add it back due to tcm_loop_port_unlink
>failing to look up the device and being able to decrement the tpg refcount.

Ah, I didn't notice that point. I'll address that and send a new version.

Regards,
Mike Christie Aug. 8, 2019, 6:38 p.m. UTC | #5
On 08/08/2019 03:42 AM, Naohiro Aota wrote:
> On Tue, Aug 06, 2019 at 12:42:20PM -0500, Mike Christie wrote:
>> On 08/05/2019 09:45 PM, Naohiro Aota wrote:
>>> On Mon, Aug 05, 2019 at 11:33:26AM -0500, Mike Christie wrote:
>>>> On 08/05/2019 01:23 AM, Naohiro Aota wrote:
>>>>> targetcli-fb (or its library: rtslib-fb) allows us to create LUN up to
>>>>> 65535. On the other hand, the kernel driver is limiting max_lun to 0.
>>>>>
>>>>> This limitation causes an actual problem when you delete a loopback
>>>>> device
>>>>> (using /sys/class/scsi_device/${lun}/device/delete) and rescan it
>>>>> (using
>>>>> /sys/class/scsi_host/host${h}/scan). You can delete the device, but
>>>>> cannot
>>>>> rescan it because its LUN is larger than the max_lun and so the scan
>>>>> request results in -EINVAL error in scsi_scan_host_selected().
>>>>
>>>> How are you kicking off this rescan?
>>>>
>>>> Just to make sure I understood you, does the initial LU have LUN 0, you
>>>> delete that, then are you creating another LU with a LUN value that is
>>>> not 0?
>>>
>>> Not exactly. I'm working on a case multiple device is added at once to
>>> one loopback scsi host. You can create two or more device using
>>> "targetcli" command and they may have their LUN larger than 0. For
>>> example,
>>>
>>> $ sudo targetcli
>>> /backstores/fileio> cd /loopback
>>> /loopback> create
>>> Created target naa.5001405218077d66.
>>> /loopback> exit
>>> $ sudo truncate -s 1048576 /mnt/nvme/foo{1,2,3}
>>> $ sudo targetcli /backstores/fileio create name=foo1
>>> file_or_dev=/mnt/nvme/foo1
>>> Created fileio foo1 with size 1048576
>>> $ sudo targetcli /loopback/naa.5001405218077d66/luns create
>>> /backstores/fileio/foo1
>>> Created LUN 0.
>>> (Do the same above for foo2 and foo3)
>>>
>>> Then, you'll see each of them has LUN 0, 1, 2 assigned: (rtslib scans
>>> used LUN and assign free one)
>>>
>>> $ lsscsi
>>> ...
>>> [7:0:1:0]    disk    LIO-ORG  foo1             4.0   /dev/sdd
>>> [7:0:1:1]    disk    LIO-ORG  foo2             4.0   /dev/sde
>>> [7:0:1:2]    disk    LIO-ORG  foo3             4.0   /dev/sdf
>>>
>>> Now, you can delete one of these device:
>>>
>>> $ echo 1 > /sys/class/scsi_device/7\:0\:1\:2/device/delete
>>> $ lsscsi
>>> ...
>>> [7:0:1:0]    disk    LIO-ORG  foo1             4.0   /dev/sdd
>>> [7:0:1:1]    disk    LIO-ORG  foo2             4.0   /dev/sde
>>>
>>> But, you cannot recover it by the scanning:
>>>
>>
>> Why are you using the scsi sysfs interface instead of the target
>> configfs interface?
> 
> Xfstests btrfs/003 uses the SCSI sysfs interface to emulate missing
> block device. We can use any SCSI devices to run the test case, so we
> cannot use target configfs here.
> 
> Even the end result of missing "/dev/sd?" is the same, they are two
> distinct interfaces. So, we need to fix the broken result of the SCSI
> sysfs interface anyway?

I agree.

> 
>> I know the comment for max_lun says it wants to support 1 LUN, but the
>> code like in tcm_loop_port_link seems to support multiple LUNs, so your
>> patch looks like it could be ok. I would just set max_luns to the kernel
>> (scsi-ml/lio) limit and not some userspace value.
> 
> Hm, taking look at the code (target_fabric_make_lun), there is no upper
> limit check there. So, set max_lun = U64_MAX right?
> I once considered that but when I create LUN larger than 65535,
> "targetcli ls" died complaining "LUN must be 0 to 65535". So I used
> 65536 here.
> 
> Or, should we use max_lun = U64_MAX and fix userland side? They need to
> be the same, anyway, I believe...

I do not think the kernel should use the limit from one userspace
application. What if you are using a different app or what happens when
someone updates targetcli. You do not want to have to update the kernel
each time. I think the driver should report what it is limited to.

The targetcli limit might be due to how it allocates the LUN,
probes/scans configfs, etc. I think in some places you could end up doing

for i <  U64_MAX
    do some inefficient configfs probe/scan

so for targetcli a value less than U64_MAX makes sense so some commands
do not take minutes. Other apps might work differently though.

Patch
diff mbox series

diff --git a/drivers/target/loopback/tcm_loop.c b/drivers/target/loopback/tcm_loop.c
index 3305b47fdf53..3db541ad727d 100644
--- a/drivers/target/loopback/tcm_loop.c
+++ b/drivers/target/loopback/tcm_loop.c
@@ -336,10 +336,10 @@  static int tcm_loop_driver_probe(struct device *dev)
 	 */
 	*((struct tcm_loop_hba **)sh->hostdata) = tl_hba;
 	/*
-	 * Setup single ID, Channel and LUN for now..
+	 * Setup single ID, and Channel for now..
 	 */
 	sh->max_id = 2;
-	sh->max_lun = 0;
+	sh->max_lun = 65536;
 	sh->max_channel = 0;
 	sh->max_cmd_len = SCSI_MAX_VARLEN_CDB_SIZE;