[1/1] sd: do not let LBPME bit stop the VPDs speak

Message ID	56df97e0.d08d420a.58e8f.3162@mx.google.com (mailing list archive)
State	Rejected, archived
Headers	show Return-Path: <linux-scsi-owner@kernel.org> Message-ID: <56df97e0.d08d420a.58e8f.3162@mx.google.com> From: tom.ty89@gmail.com To: linux-scsi@vger.kernel.org Cc: Tom Yan <tom.ty89@gmail.com> Subject: [PATCH 1/1] sd: do not let LBPME bit stop the VPDs speak Date: Wed, 9 Mar 2016 11:26:16 +0800 Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk

Tom Yan March 9, 2016, 3:26 a.m. UTC

From: Tom Yan <tom.ty89@gmail.com>

Some devices have details of their support on unmapping on the
Block Limits and/or Logical Block Provisioning VPDs while they
do not set the LBPME bit to 1. Though this is required by the
SCSI standards, the VPDs are giving even more concrete details
about the support, so they should be used even when the bit
is set to 0.

Signed-off-by: Tom Yan <tom.ty89@gmail.com>

Tom Yan March 9, 2016, 3:30 a.m. UTC | #1

Example device, JMicron JMS567 UAS SATA bridge:

[tom@localhost ~]$ sudo sg_readcap -16 /dev/sdc
Read Capacity results:
   Protection: prot_en=0, p_type=0, p_i_exponent=0
   Logical block provisioning: lbpme=0, lbprz=0
   Last logical block address=468862127 (0x1bf244af), Number of
logical blocks=468862128
   Logical block length=512 bytes
   Logical blocks per physical block exponent=3 [so physical block
length=4096 bytes]
   Lowest aligned logical block address=0
Hence:
   Device size: 240057409536 bytes, 228936.6 MiB, 240.06 GB

[tom@localhost ~]$ sudo sg_vpd -a /dev/sdc
Supported VPD pages VPD page:
  Supported VPD pages [sv]
  Unit serial number [sn]
  Device identification [di]
  Block limits (SBC) [bl]
  Logical block provisioning (SBC) [lbpv]
Unit serial number VPD page:
  Unit serial number: 1504380500000432
Device Identification VPD page:
  Addressed logical unit:
    designator type: NAA,  code set: Binary
      0x3015043805000004
    designator type: T10 vendor identification,  code set: ASCII
      vendor id: JMicron
      vendor specific: Generic         1504380500000432
Block limits VPD page (SBC):
  Write same non-zero (WSNZ): 0
  Maximum compare and write length: 0 blocks
  Optimal transfer length granularity: 8 blocks
  Maximum transfer length: 65535 blocks
  Optimal transfer length: 65535 blocks
  Maximum prefetch length: 65535 blocks
  Maximum unmap LBA count: 65472
  Maximum unmap block descriptor count: 1
  Optimal unmap granularity: 1
  Unmap granularity alignment valid: 0
  Unmap granularity alignment: 0
  Maximum write same length: 0x0 blocks
  Maximum atomic transfer length: 0
  Atomic alignment: 0
  Atomic transfer length granularity: 0
  Maximum atomic transfer length with atomic boundary: 0
  Maximum atomic boundary size: 0
Logical block provisioning VPD page (SBC):
  Unmap command supported (LBPU): 1
  Write same (16) with unmap bit supported (LBWS): 0
  Write same (10) with unmap bit supported (LBWS10): 0
  Logical block provisioning read zeros (LBPRZ): 0
  Anchored LBAs supported (ANC_SUP): 0
  Threshold exponent: 1
  Descriptor present (DP): 0
  Minimum percentage: 0
  Provisioning type: 0
  Threshold percentage: 0

On 9 March 2016 at 11:26,  <tom.ty89@gmail.com> wrote:
> From: Tom Yan <tom.ty89@gmail.com>
>
> Some devices have details of their support on unmapping on the
> Block Limits and/or Logical Block Provisioning VPDs while they
> do not set the LBPME bit to 1. Though this is required by the
> SCSI standards, the VPDs are giving even more concrete details
> about the support, so they should be used even when the bit
> is set to 0.
>
> Signed-off-by: Tom Yan <tom.ty89@gmail.com>
>
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index d749da7..a0d7c73 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -2670,9 +2670,6 @@ static void sd_read_block_limits(struct scsi_disk *sdkp)
>
>                 sdkp->max_ws_blocks = (u32)get_unaligned_be64(&buffer[36]);
>
> -               if (!sdkp->lbpme)
> -                       goto out;
> -
>                 lba_count = get_unaligned_be32(&buffer[20]);
>                 desc_count = get_unaligned_be32(&buffer[24]);
>
> @@ -2747,9 +2744,6 @@ static void sd_read_block_provisioning(struct scsi_disk *sdkp)
>         unsigned char *buffer;
>         const int vpd_len = 8;
>
> -       if (sdkp->lbpme == 0)
> -               return;
> -
>         buffer = kmalloc(vpd_len, GFP_KERNEL);
>
>         if (!buffer || scsi_get_vpd_page(sdkp->device, 0xb2, buffer, vpd_len))
> --
> 2.7.2
>
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Tom Yan March 9, 2016, 3:38 a.m. UTC | #2

Btw, why SD_MAX_WS16_BLOCKS (which is used for both SD_LBP_WS16 and
SD_LBP_UNMAP) is set to 0x7fffff (23-bit) instead of 0xffffffff
(32-bit, 4-byte)?

Tried to read the commit message but no explanation about that is
available there:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/scsi/sd.h?id=5db44863b6ebbb400c5e61d56ebe8f21ef48b1bd

And not sure if they are related but the later added
SD_MAX_XFER_BLOCKS is set to 0xffffffff:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/scsi/sd.h?id=bcdb247c6b6a1f3e72b9b787b73f47dd509d17ec

On 9 March 2016 at 11:30, Tom Yan <tom.ty89@gmail.com> wrote:
> Example device, JMicron JMS567 UAS SATA bridge:
>
> [tom@localhost ~]$ sudo sg_readcap -16 /dev/sdc
> Read Capacity results:
>    Protection: prot_en=0, p_type=0, p_i_exponent=0
>    Logical block provisioning: lbpme=0, lbprz=0
>    Last logical block address=468862127 (0x1bf244af), Number of
> logical blocks=468862128
>    Logical block length=512 bytes
>    Logical blocks per physical block exponent=3 [so physical block
> length=4096 bytes]
>    Lowest aligned logical block address=0
> Hence:
>    Device size: 240057409536 bytes, 228936.6 MiB, 240.06 GB
>
> [tom@localhost ~]$ sudo sg_vpd -a /dev/sdc
> Supported VPD pages VPD page:
>   Supported VPD pages [sv]
>   Unit serial number [sn]
>   Device identification [di]
>   Block limits (SBC) [bl]
>   Logical block provisioning (SBC) [lbpv]
> Unit serial number VPD page:
>   Unit serial number: 1504380500000432
> Device Identification VPD page:
>   Addressed logical unit:
>     designator type: NAA,  code set: Binary
>       0x3015043805000004
>     designator type: T10 vendor identification,  code set: ASCII
>       vendor id: JMicron
>       vendor specific: Generic         1504380500000432
> Block limits VPD page (SBC):
>   Write same non-zero (WSNZ): 0
>   Maximum compare and write length: 0 blocks
>   Optimal transfer length granularity: 8 blocks
>   Maximum transfer length: 65535 blocks
>   Optimal transfer length: 65535 blocks
>   Maximum prefetch length: 65535 blocks
>   Maximum unmap LBA count: 65472
>   Maximum unmap block descriptor count: 1
>   Optimal unmap granularity: 1
>   Unmap granularity alignment valid: 0
>   Unmap granularity alignment: 0
>   Maximum write same length: 0x0 blocks
>   Maximum atomic transfer length: 0
>   Atomic alignment: 0
>   Atomic transfer length granularity: 0
>   Maximum atomic transfer length with atomic boundary: 0
>   Maximum atomic boundary size: 0
> Logical block provisioning VPD page (SBC):
>   Unmap command supported (LBPU): 1
>   Write same (16) with unmap bit supported (LBWS): 0
>   Write same (10) with unmap bit supported (LBWS10): 0
>   Logical block provisioning read zeros (LBPRZ): 0
>   Anchored LBAs supported (ANC_SUP): 0
>   Threshold exponent: 1
>   Descriptor present (DP): 0
>   Minimum percentage: 0
>   Provisioning type: 0
>   Threshold percentage: 0
>
> On 9 March 2016 at 11:26,  <tom.ty89@gmail.com> wrote:
>> From: Tom Yan <tom.ty89@gmail.com>
>>
>> Some devices have details of their support on unmapping on the
>> Block Limits and/or Logical Block Provisioning VPDs while they
>> do not set the LBPME bit to 1. Though this is required by the
>> SCSI standards, the VPDs are giving even more concrete details
>> about the support, so they should be used even when the bit
>> is set to 0.
>>
>> Signed-off-by: Tom Yan <tom.ty89@gmail.com>
>>
>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>> index d749da7..a0d7c73 100644
>> --- a/drivers/scsi/sd.c
>> +++ b/drivers/scsi/sd.c
>> @@ -2670,9 +2670,6 @@ static void sd_read_block_limits(struct scsi_disk *sdkp)
>>
>>                 sdkp->max_ws_blocks = (u32)get_unaligned_be64(&buffer[36]);
>>
>> -               if (!sdkp->lbpme)
>> -                       goto out;
>> -
>>                 lba_count = get_unaligned_be32(&buffer[20]);
>>                 desc_count = get_unaligned_be32(&buffer[24]);
>>
>> @@ -2747,9 +2744,6 @@ static void sd_read_block_provisioning(struct scsi_disk *sdkp)
>>         unsigned char *buffer;
>>         const int vpd_len = 8;
>>
>> -       if (sdkp->lbpme == 0)
>> -               return;
>> -
>>         buffer = kmalloc(vpd_len, GFP_KERNEL);
>>
>>         if (!buffer || scsi_get_vpd_page(sdkp->device, 0xb2, buffer, vpd_len))
>> --
>> 2.7.2
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Martin K. Petersen March 10, 2016, 2:24 a.m. UTC | #3

>>>>> "Tom" == tom ty89 <tom.ty89@gmail.com> writes:

Tom,

Tom> Some devices have details of their support on unmapping on the
Tom> Block Limits and/or Logical Block Provisioning VPDs while they do
Tom> not set the LBPME bit to 1. Though this is required by the SCSI
Tom> standards, the VPDs are giving even more concrete details about the
Tom> support, so they should be used even when the bit is set to 0.

I am not going to enable an already brittle feature if a device can't
report the right thing.

Does the bridge report LBPME=1 if you plug in an SSD?

Martin K. Petersen March 10, 2016, 2:32 a.m. UTC | #4

>>>>> "Tom" == Tom Yan <tom.ty89@gmail.com> writes:

Tom> Btw, why SD_MAX_WS16_BLOCKS (which is used for both SD_LBP_WS16 and
Tom> SD_LBP_UNMAP) is set to 0x7fffff (23-bit) instead of 0xffffffff
Tom> (32-bit, 4-byte)?

This limit represents the largest block range we can describe using a
single bio.

Tom Yan March 10, 2016, 8:15 a.m. UTC | #5

The outputs were made with an SSD that supports TRIM plugged in:

[tom@localhost ~]$ sudo smartctl --identify=wb /dev/sdc | grep -i trim
  69     14          1   Deterministic data after trim supported
  69      5          1   Trimmed LBA range(s) returning zeroed data supported
 169      0          1   Trim bit in DATA SET MANAGEMENT command supported

[tom@localhost ~]$ sudo smartctl --identify=wb /dev/sdc | grep -i 'ds m'
 105      -     0x0008   Max blocks of LBA Range Entries per DS MANAGEMENT cmd

Both `blkdiscard` with a patched kernel and `sg_unmap` appeared to
have gone well (checked with `hexdump`).

I just couldn't think of a case that the LBPME bit would actually
indicates that the VPDs should not be used. It's merely bad
implementation of Read Capacity (16), which doesn't practically stops
the device from supporting unmap (even if a similar device is has
broken unmap support, it doesn't mean that the vendor set the LBPME
bit to 0 because they want to tell you it's broken, it's just they
have broken unmap support IN ADDITION to a bad Read Capacity (16)
implementation).

Also, doesn't the kernel ASSUME that the device support Write Same
(16) with Unmap bit set when the LBPME bit is 1 even if it has no
Block Limits and Logical Block Provisioning VPDs? Isn't that even more
lenient than what I am proposing?

On 10 March 2016 at 10:24, Martin K. Petersen
<martin.petersen@oracle.com> wrote:
>>>>>> "Tom" == tom ty89 <tom.ty89@gmail.com> writes:
>
> Tom,
>
> Tom> Some devices have details of their support on unmapping on the
> Tom> Block Limits and/or Logical Block Provisioning VPDs while they do
> Tom> not set the LBPME bit to 1. Though this is required by the SCSI
> Tom> standards, the VPDs are giving even more concrete details about the
> Tom> support, so they should be used even when the bit is set to 0.
>
> I am not going to enable an already brittle feature if a device can't
> report the right thing.
>
> Does the bridge report LBPME=1 if you plug in an SSD?
>
> --
> Martin K. Petersen      Oracle Linux Engineering
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Tom Yan March 10, 2016, 8:15 a.m. UTC | #6

Hmm, is it originated from:

(2^32-1) - (2^32-1)%512 = 4294966784 bytes = 8388607 (512-byte) blocks
= 0x7fffff

for the byte count (of a bio?) is limited by a 32-bit representation
(2^32-1) as well just like the block count in the two scsi commands?

What about SD_MAX_XFER_BLOCKS then?

On 10 March 2016 at 10:32, Martin K. Petersen
<martin.petersen@oracle.com> wrote:
>>>>>> "Tom" == Tom Yan <tom.ty89@gmail.com> writes:
>
> Tom> Btw, why SD_MAX_WS16_BLOCKS (which is used for both SD_LBP_WS16 and
> Tom> SD_LBP_UNMAP) is set to 0x7fffff (23-bit) instead of 0xffffffff
> Tom> (32-bit, 4-byte)?
>
> This limit represents the largest block range we can describe using a
> single bio.
>
> --
> Martin K. Petersen      Oracle Linux Engineering
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Tom Yan March 10, 2016, 10:33 a.m. UTC | #7

On 10 March 2016 at 12:36, Tom Yan <tom.ty89@gmail.com> wrote:
>
> Both `blkdiscard` with a patched kernel and `sg_unmap` appeared to
> have gone well (checked with `hexdump`).
>
Actually, `blkdiscard` didn't quite actually go well. It seems last
time it just ended quickly with only error on kernel messages, so I
thought it worked.

But apparently it's NOT because of the device itself but has something
to do with the kernel. It will only get into trouble if I set a "step"
(`-p`) larger than "discard_max_bytes" (and/or
"discard_max_hw_bytes"?).

For example, if I use an UNPATCHED kernel and use `echo -n unmap` to
change the "provisioning_mode" sysfs file (this way,
"discard_max_bytes" and "discard_max_hw_bytes" are set to 4294966784),
these work:

[tom@localhost ~]$ time sudo blkdiscard -v -p 33521664 /dev/sdc
/dev/sdc: Discarded 7542374400 bytes from the offset 0
/dev/sdc: Discarded 38415826944 bytes from the offset 7542374400
/dev/sdc: Discarded 37711872000 bytes from the offset 45958201344
/dev/sdc: Discarded 37242568704 bytes from the offset 83670073344
/dev/sdc: Discarded 37577785344 bytes from the offset 120912642048
/dev/sdc: Discarded 37544263680 bytes from the offset 158490427392
/dev/sdc: Discarded 38348783616 bytes from the offset 196034691072
/dev/sdc: Discarded 5673934848 bytes from the offset 234383474688

real    0m6.522s
user    0m0.007s
sys    0m0.097s

[tom@localhost ~]$ time sudo blkdiscard -v -p 4294966784 /dev/sdc
/dev/sdc: Discarded 12884900352 bytes from the offset 0
/dev/sdc: Discarded 55834568192 bytes from the offset 12884900352
/dev/sdc: Discarded 55834568192 bytes from the offset 68719468544
/dev/sdc: Discarded 55834568192 bytes from the offset 124554036736
/dev/sdc: Discarded 55834568192 bytes from the offset 180388604928
/dev/sdc: Discarded 3834236416 bytes from the offset 236223173120

real    0m4.335s
user    0m0.010s
sys    0m0.007s

I even did three trials each, all completed without error:

Mar 10 16:54:46 localhost sudo[435]:      tom : TTY=pts/2 ;
PWD=/home/tom ; USER=root ; COMMAND=/usr/bin/bash -c echo -n unmap >
/sys/devices/pci0000:00/0000:00:14.0/usb4/4-6/4-6:1.0/host7/target7:0:0/7:0:0:0/scsi_disk/7:0:0:0/provisioning_mode
Mar 10 16:54:46 localhost sudo[435]: pam_unix(sudo:session): session
opened for user root by (uid=0)
Mar 10 16:54:46 localhost sudo[435]: pam_unix(sudo:session): session
closed for user root
Mar 10 16:56:03 localhost sudo[459]:      tom : TTY=pts/1 ;
PWD=/home/tom ; USER=root ; COMMAND=/usr/bin/blkdiscard -v -p 33521664
/dev/sdc
Mar 10 16:56:03 localhost sudo[459]: pam_unix(sudo:session): session
opened for user root by (uid=0)
Mar 10 16:56:09 localhost sudo[459]: pam_unix(sudo:session): session
closed for user root
Mar 10 16:56:11 localhost sudo[463]:      tom : TTY=pts/1 ;
PWD=/home/tom ; USER=root ; COMMAND=/usr/bin/blkdiscard -v -p 33521664
/dev/sdc
Mar 10 16:56:11 localhost sudo[463]: pam_unix(sudo:session): session
opened for user root by (uid=0)
Mar 10 16:56:18 localhost sudo[463]: pam_unix(sudo:session): session
closed for user root
Mar 10 16:56:21 localhost sudo[467]:      tom : TTY=pts/1 ;
PWD=/home/tom ; USER=root ; COMMAND=/usr/bin/blkdiscard -v -p 33521664
/dev/sdc
Mar 10 16:56:21 localhost sudo[467]: pam_unix(sudo:session): session
opened for user root by (uid=0)
Mar 10 16:56:28 localhost sudo[467]: pam_unix(sudo:session): session
closed for user root
Mar 10 16:56:36 localhost sudo[471]:      tom : TTY=pts/1 ;
PWD=/home/tom ; USER=root ; COMMAND=/usr/bin/blkdiscard -v -p
4294966784 /dev/sdc
Mar 10 16:56:36 localhost sudo[471]: pam_unix(sudo:session): session
opened for user root by (uid=0)
Mar 10 16:56:40 localhost sudo[471]: pam_unix(sudo:session): session
closed for user root
Mar 10 16:56:42 localhost sudo[475]:      tom : TTY=pts/1 ;
PWD=/home/tom ; USER=root ; COMMAND=/usr/bin/blkdiscard -v -p
4294966784 /dev/sdc
Mar 10 16:56:42 localhost sudo[475]: pam_unix(sudo:session): session
opened for user root by (uid=0)
Mar 10 16:56:46 localhost sudo[475]: pam_unix(sudo:session): session
closed for user root
Mar 10 16:56:48 localhost sudo[479]:      tom : TTY=pts/1 ;
PWD=/home/tom ; USER=root ; COMMAND=/usr/bin/blkdiscard -v -p
4294966784 /dev/sdc
Mar 10 16:56:48 localhost sudo[479]: pam_unix(sudo:session): session
opened for user root by (uid=0)
Mar 10 16:56:53 localhost sudo[479]: pam_unix(sudo:session): session
closed for user root

However, if I use a patched kernel, the kernel will read the VPD of my
bridge set the files accordingly to 33521664 instead, then only
`blkdiscard -p 33521664` works but not `blkdiscard -p 4294966784`
anymore. It will (usually) be stalled and with errors repeated in
kernel message:

...
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#7
uas_eh_abort_handler 0 uas-tag 8 inflight: CMD OUT
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#7 CDB:
opcode=0x42 42 00 00 00 00 00 00 00 18 00
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#6
uas_eh_abort_handler 0 uas-tag 7 inflight: CMD OUT
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#6 CDB:
opcode=0x42 42 00 00 00 00 00 00 00 18 00
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#5
uas_eh_abort_handler 0 uas-tag 6 inflight: CMD OUT
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#5 CDB:
opcode=0x42 42 00 00 00 00 00 00 00 18 00
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#4
uas_eh_abort_handler 0 uas-tag 5 inflight: CMD OUT
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#4 CDB:
opcode=0x42 42 00 00 00 00 00 00 00 18 00
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#3
uas_eh_abort_handler 0 uas-tag 4 inflight: CMD OUT
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#3 CDB:
opcode=0x42 42 00 00 00 00 00 00 00 18 00
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#2
uas_eh_abort_handler 0 uas-tag 3 inflight: CMD OUT
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#2 CDB:
opcode=0x42 42 00 00 00 00 00 00 00 18 00
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#1
uas_eh_abort_handler 0 uas-tag 2 inflight: CMD OUT
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#1 CDB:
opcode=0x42 42 00 00 00 00 00 00 00 18 00
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#0
uas_eh_abort_handler 0 uas-tag 1 inflight: CMD OUT
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#0 CDB:
opcode=0x42 42 00 00 00 00 00 00 00 18 00
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#13
uas_eh_abort_handler 0 uas-tag 14 inflight: CMD OUT
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#13 CDB:
opcode=0x42 42 00 00 00 00 00 00 00 18 00
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#12
uas_eh_abort_handler 0 uas-tag 13 inflight: CMD OUT
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#12 CDB:
opcode=0x42 42 00 00 00 00 00 00 00 18 00
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#11
uas_eh_abort_handler 0 uas-tag 12 inflight: CMD OUT
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#11 CDB:
opcode=0x42 42 00 00 00 00 00 00 00 18 00
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#10
uas_eh_abort_handler 0 uas-tag 11 inflight: CMD OUT
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#10 CDB:
opcode=0x42 42 00 00 00 00 00 00 00 18 00
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#9
uas_eh_abort_handler 0 uas-tag 10 inflight: CMD OUT
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#9 CDB:
opcode=0x42 42 00 00 00 00 00 00 00 18 00
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#8
uas_eh_abort_handler 0 uas-tag 9 inflight: CMD OUT
Mar 10 18:23:25 localhost kernel: sd 7:0:0:0: [sdc] tag#8 CDB:
opcode=0x42 42 00 00 00 00 00 00 00 18 00
Mar 10 18:23:25 localhost kernel: scsi host7: uas_eh_bus_reset_handler start
Mar 10 18:23:25 localhost kernel: usb 4-6: reset SuperSpeed USB device
number 3 using xhci_hcd
Mar 10 18:23:25 localhost kernel: scsi host7: uas_eh_bus_reset_handler success
Mar 10 18:23:25 localhost kernel: xhci_hcd 0000:00:14.0: ERROR Unknown
event condition 34, HC probably busted
...

Same thing happens (with either a patched or unpatched kernel) if I do
not set `-p` for `blkdiscard` at all, since it basically means `-p
468862128 (total LBA of my drive) * 512` (see man page of
`blkdiscard).

However this works like a charm:

[tom@localhost ~]$ sudo sg_unmap -vvvv -l 0 -n 468862128 /dev/sdc
found bsg_major=250
open /dev/sdc with flags=0x802
    unmap cdb: 42 00 00 00 00 00 00 00 18 00
    unmap parameter list:
00 16 00 10 00 00 00 00  00 00 00 00 00 00 00 00
1b f2 44 b0 00 00 00 00
      duration=4290 ms

Again, three trials:

Mar 10 16:50:45 localhost sudo[429]:      tom : TTY=pts/1 ;
PWD=/home/tom ; USER=root ; COMMAND=/usr/bin/sg_unmap -vvvv -l 0 -n
468862128 /dev/sdc
Mar 10 16:50:45 localhost sudo[429]: pam_unix(sudo:session): session
opened for user root by (uid=0)
Mar 10 16:50:49 localhost sudo[429]: pam_unix(sudo:session): session
closed for user root
Mar 10 16:50:51 localhost sudo[433]:      tom : TTY=pts/1 ;
PWD=/home/tom ; USER=root ; COMMAND=/usr/bin/sg_unmap -vvvv -l 0 -n
468862128 /dev/sdc
Mar 10 16:50:51 localhost sudo[433]: pam_unix(sudo:session): session
opened for user root by (uid=0)
Mar 10 16:50:55 localhost sudo[433]: pam_unix(sudo:session): session
closed for user root
Mar 10 16:50:58 localhost sudo[437]:      tom : TTY=pts/1 ;
PWD=/home/tom ; USER=root ; COMMAND=/usr/bin/sg_unmap -vvvv -l 0 -n
468862128 /dev/sdc
Mar 10 16:50:58 localhost sudo[437]: pam_unix(sudo:session): session
opened for user root by (uid=0)
Mar 10 16:51:02 localhost sudo[437]: pam_unix(sudo:session): session
closed for user root
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Martin K. Petersen March 11, 2016, 2:02 a.m. UTC | #8

>>>>> "Tom" == Tom Yan <tom.ty89@gmail.com> writes:

Tom,

Tom> I just couldn't think of a case that the LBPME bit would actually
Tom> indicates that the VPDs should not be used. It's merely bad
Tom> implementation of Read Capacity (16), which doesn't practically
Tom> stops the device from supporting unmap (even if a similar device is
Tom> has broken unmap support, it doesn't mean that the vendor set the
Tom> LBPME bit to 0 because they want to tell you it's broken, it's just
Tom> they have broken unmap support IN ADDITION to a bad Read Capacity
Tom> (16) implementation).

We don't skip the entire VPD page if LBPME=1, just the parts that are
related to the logical block provisioning.

The reason we read those values in the first place is so that we can set
up discard. If the device signals that it does not support discard we
have had no reason to read them or parse them.

Since there are a plethora of USB-SATA devices out there that get this
incredibly wrong (including discarding blocks *outside* of the specified
block range), I am not going to blindly enable this feature.

If you want to tinker with your own setup that's fine. And you are
clearly capable of doing so.

I, however, have to be extremely cautious not to enable something that
might inadvertently cause data corruption for users out there. That's
all I care about.

If the device manufacturer had intended to support discards then they
would presumably have set the LBPME flag per the spec. And if they
intended to support it but messed up setting the single bit flag that
enables the feature, then I would not trust their implementation.

Tom> Also, doesn't the kernel ASSUME that the device support Write Same
Tom> (16) with Unmap bit set when the LBPME bit is 1 even if it has no
Tom> Block Limits and Logical Block Provisioning VPDs?

Yes, because there are many devices out there whose logical block
provisioning support predates the LBP VPD being added to the spec.

Tom> Isn't that even more lenient than what I am proposing?

If we were to entertain enabling discard on your device it would be by
way of quirking it. I am not going to change the heuristics to
accommodate a device that gets trivial things wrong.

Martin K. Petersen March 11, 2016, 2:08 a.m. UTC | #9

>>>>> "Tom" == Tom Yan <tom.ty89@gmail.com> writes:

Tom> Hmm, is it originated from: (2^32-1) - (2^32-1)%512 = 4294966784
Tom> bytes = 8388607 (512-byte) blocks = 0x7fffff

Yes.

Tom> What about SD_MAX_XFER_BLOCKS then?

It is merely there to provide a theoretical upper max for the 16 and
32-byte READ/WRITE commands. The block layer would never submit a
request that big given all the constraints we have in place (controller
segment/scatterlist/max_sectors, etc.).

The TRIM/WRITE SAME/UNMAP commands are different in that their payload
size differs from the size of the block range they act upon. So we need
to artificially constrain their limits.

Tom Yan March 11, 2016, 3:26 a.m. UTC | #10

On 10 March 2016 at 18:33, Tom Yan <tom.ty89@gmail.com> wrote:
>
> But apparently it's NOT because of the device itself but has something
> to do with the kernel. It will only get into trouble if I set a "step"
> (`-p`) larger than "discard_max_bytes" (and/or
> "discard_max_hw_bytes"?).
>

I just tested with an unpatched kernel and change "discard_max_bytes"
according to the VPD manually (while "discard_max_hw_bytes" is set to
4294966784):

/sys/block/sdc/queue/discard_granularity:4096
/sys/block/sdc/queue/discard_max_bytes:33521664
/sys/block/sdc/queue/discard_max_hw_bytes:4294966784
/sys/block/sdc/queue/discard_zeroes_data:0

And the behaviour become exactly like a patched kernel:

[tom@localhost ~]$ sudo bash -c 'echo -n unmap >
/sys/devices/pci0000:00/0000:00:14.0/usb3/3-6/3-6:1.0/host7/target7:0:0/7:0:0:0/scsi_disk/7:0:0:0/provisioning_mode'
[tom@localhost ~]$ sudo bash -c 'echo 33521664 >
/sys/block/sdc/queue/discard_max_bytes'
[tom@localhost ~]$ time sudo blkdiscard -v -p 33521664 /dev/sdc
/dev/sdc: Discarded 6838419456 bytes from the offset 0
/dev/sdc: Discarded 38382305280 bytes from the offset 6838419456
/dev/sdc: Discarded 37644828672 bytes from the offset 45220724736
/dev/sdc: Discarded 37544263680 bytes from the offset 82865553408
/dev/sdc: Discarded 37074960384 bytes from the offset 120409817088
/dev/sdc: Discarded 37812436992 bytes from the offset 157484777472
/dev/sdc: Discarded 37376655360 bytes from the offset 195297214464
/dev/sdc: Discarded 7383539712 bytes from the offset 232673869824

real    0m6.561s
user    0m0.017s
sys    0m0.110s
[tom@localhost ~]$ time sudo blkdiscard -v -p 33521664 /dev/sdc
/dev/sdc: Discarded 14280228864 bytes from the offset 0
/dev/sdc: Discarded 37711872000 bytes from the offset 14280228864
/dev/sdc: Discarded 37812436992 bytes from the offset 51992100864
/dev/sdc: Discarded 37544263680 bytes from the offset 89804537856
/dev/sdc: Discarded 37142003712 bytes from the offset 127348801536
/dev/sdc: Discarded 37343133696 bytes from the offset 164490805248
/dev/sdc: Discarded 37510742016 bytes from the offset 201833938944
/dev/sdc: Discarded 712728576 bytes from the offset 239344680960

real    0m6.412s
user    0m0.017s
sys    0m0.117s
[tom@localhost ~]$ time sudo blkdiscard -v -p 33521664 /dev/sdc
/dev/sdc: Discarded 8816197632 bytes from the offset 0
/dev/sdc: Discarded 38583435264 bytes from the offset 8816197632
/dev/sdc: Discarded 38214696960 bytes from the offset 47399632896
/dev/sdc: Discarded 37410177024 bytes from the offset 85614329856
/dev/sdc: Discarded 37510742016 bytes from the offset 123024506880
/dev/sdc: Discarded 37577785344 bytes from the offset 160535248896
/dev/sdc: Discarded 37544263680 bytes from the offset 198113034240
/dev/sdc: Discarded 4400111616 bytes from the offset 235657297920

real    0m6.371s
user    0m0.023s
sys    0m0.093s
[tom@localhost ~]$ time sudo blkdiscard -v -p 4294966784 /dev/sdc
/dev/sdc: Discarded 0 bytes from the offset 0
blkdiscard: /dev/sdc: BLKDISCARD ioctl failed: Input/output error

real    4m7.586s
user    0m0.010s
sys    0m0.003s

Attached is a full journal (kernel message).

[1/1] sd: do not let LBPME bit stop the VPDs speak

Commit Message

Comments

Patch