Message ID | 20250114070704.2169064-1-russ@har.mn (mailing list archive) |
---|---|
State | Changes Requested |
Series | Increase drivetemp scsi command timeout to 10s. |
On Mon, Jan 13, 2025 at 11:07:04PM -0800, Russell Harmon wrote:
> There's at least one drive (MaxDigitalData OOS14000G) such that if it
> receives a large amount of I/O while entering an idle power state will
> first exit idle before responding, including causing SMART temperature
> requests to be delayed.
>
> This causes the drivetemp request to exceed its timeout of 1 second.
>
> Example:
>
> Normal operation
>
> $ time cat /sys/class/hwmon/hwmon9/temp1_input
> 28000
> cat temp1_input 0.00s user 0.00s system 7% cpu 0.023 total
> $ dd if=/dev/sdep of=/dev/null bs=1M iflag=direct # Generate background load
> $ ./openSeaChest_PowerControl -d /dev/sdep --transitionPower idle_a
> $ time cat /sys/class/hwmon/hwmon9/temp1_input
> 0
> cat temp1_input 0.00s user 0.00s system 0% cpu 3.154 total
> $ dmesg -t
> sd 11:0:1:0: attempting task abort!scmd(0x00000000ef8da38c), outstanding for 2098 ms & timeout 1000 ms
> sd 11:0:1:0: [sdep] tag#4639 CDB: ATA command pass through(16) 85 08 0e 00 d5 00 01 00 e0 00 4f 00 c2 00 b0 00
> scsi target11:0:1: handle(0x0009), sas_address(0x4433221105000000), phy(5)
> scsi target11:0:1: enclosure logical id(0x500062b202d7ea80), slot(6)
> scsi target11:0:1: enclosure level(0x0000), connector name( )
> sd 11:0:1:0: task abort: SUCCESS scmd(0x00000000ef8da38c)
> sd 11:0:1:0: Power-on or device reset occurred
> $ time cat /sys/class/hwmon/hwmon9/temp1_input
> 28000
> cat /sys/class/hwmon/hwmon9/temp1_input 0.00s user 0.00s system 48% cpu 0.005 total
>

Please rebase on top of v6.13-rc7 and resend.
When doing so, please drop test results (or send after "---").
Also, the subject should start with "hwmon: (drivetemp) ..."

Thanks,
Guenter

> Signed-off-by: Russell Harmon <russ@har.mn>
> ---
>  drivers/hwmon/drivetemp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/hwmon/drivetemp.c b/drivers/hwmon/drivetemp.c
> index 6bdd21aa005a..9e465636f591 100644
> --- a/drivers/hwmon/drivetemp.c
> +++ b/drivers/hwmon/drivetemp.c
> @@ -193,7 +193,7 @@ static int drivetemp_scsi_command(struct drivetemp_data *st,
>  	scsi_cmd[14] = ata_command;
>
>  	return scsi_execute_cmd(st->sdev, scsi_cmd, op, st->smartdata,
> -				ATA_SECT_SIZE, HZ, 5, NULL);
> +				ATA_SECT_SIZE, 10 * HZ, 5, NULL);
>  }
>
>  static int drivetemp_ata_command(struct drivetemp_data *st, u8 feature,
On Tue, Jan 14, 2025 at 3:40 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On Mon, Jan 13, 2025 at 11:07:04PM -0800, Russell Harmon wrote:
> > There's at least one drive (MaxDigitalData OOS14000G) such that if it
> > receives a large amount of I/O while entering an idle power state will
> > first exit idle before responding, including causing SMART temperature
> > requests to be delayed.
> >
> > This causes the drivetemp request to exceed its timeout of 1 second.
> >
> > Example:
> >
> > Normal operation
> >
> > $ time cat /sys/class/hwmon/hwmon9/temp1_input
> > 28000
> > cat temp1_input 0.00s user 0.00s system 7% cpu 0.023 total
> > $ dd if=/dev/sdep of=/dev/null bs=1M iflag=direct # Generate background load
> > $ ./openSeaChest_PowerControl -d /dev/sdep --transitionPower idle_a
> > $ time cat /sys/class/hwmon/hwmon9/temp1_input
> > 0
> > cat temp1_input 0.00s user 0.00s system 0% cpu 3.154 total
> > $ dmesg -t
> > sd 11:0:1:0: attempting task abort!scmd(0x00000000ef8da38c), outstanding for 2098 ms & timeout 1000 ms
> > sd 11:0:1:0: [sdep] tag#4639 CDB: ATA command pass through(16) 85 08 0e 00 d5 00 01 00 e0 00 4f 00 c2 00 b0 00
> > scsi target11:0:1: handle(0x0009), sas_address(0x4433221105000000), phy(5)
> > scsi target11:0:1: enclosure logical id(0x500062b202d7ea80), slot(6)
> > scsi target11:0:1: enclosure level(0x0000), connector name( )
> > sd 11:0:1:0: task abort: SUCCESS scmd(0x00000000ef8da38c)
> > sd 11:0:1:0: Power-on or device reset occurred
> > $ time cat /sys/class/hwmon/hwmon9/temp1_input
> > 28000
> > cat /sys/class/hwmon/hwmon9/temp1_input 0.00s user 0.00s system 48% cpu 0.005 total
> >
>
> Please rebase on top of v6.13-rc7 and resend.
> When doing so, please drop test results (or send after "---").
> Also, the subject should start with "hwmon: (drivetemp) ..."

Sent. Thanks!
diff --git a/drivers/hwmon/drivetemp.c b/drivers/hwmon/drivetemp.c
index 6bdd21aa005a..9e465636f591 100644
--- a/drivers/hwmon/drivetemp.c
+++ b/drivers/hwmon/drivetemp.c
@@ -193,7 +193,7 @@ static int drivetemp_scsi_command(struct drivetemp_data *st,
 	scsi_cmd[14] = ata_command;
 
 	return scsi_execute_cmd(st->sdev, scsi_cmd, op, st->smartdata,
-				ATA_SECT_SIZE, HZ, 5, NULL);
+				ATA_SECT_SIZE, 10 * HZ, 5, NULL);
 }
 
 static int drivetemp_ata_command(struct drivetemp_data *st, u8 feature,
There's at least one drive (MaxDigitalData OOS14000G) that, if it
receives a large amount of I/O while entering an idle power state, will
first exit idle before responding, which also delays SMART temperature
requests.

This causes the drivetemp request to exceed its timeout of 1 second.

Example:

Normal operation

$ time cat /sys/class/hwmon/hwmon9/temp1_input
28000
cat temp1_input 0.00s user 0.00s system 7% cpu 0.023 total
$ dd if=/dev/sdep of=/dev/null bs=1M iflag=direct # Generate background load
$ ./openSeaChest_PowerControl -d /dev/sdep --transitionPower idle_a
$ time cat /sys/class/hwmon/hwmon9/temp1_input
0
cat temp1_input 0.00s user 0.00s system 0% cpu 3.154 total
$ dmesg -t
sd 11:0:1:0: attempting task abort!scmd(0x00000000ef8da38c), outstanding for 2098 ms & timeout 1000 ms
sd 11:0:1:0: [sdep] tag#4639 CDB: ATA command pass through(16) 85 08 0e 00 d5 00 01 00 e0 00 4f 00 c2 00 b0 00
scsi target11:0:1: handle(0x0009), sas_address(0x4433221105000000), phy(5)
scsi target11:0:1: enclosure logical id(0x500062b202d7ea80), slot(6)
scsi target11:0:1: enclosure level(0x0000), connector name( )
sd 11:0:1:0: task abort: SUCCESS scmd(0x00000000ef8da38c)
sd 11:0:1:0: Power-on or device reset occurred
$ time cat /sys/class/hwmon/hwmon9/temp1_input
28000
cat /sys/class/hwmon/hwmon9/temp1_input 0.00s user 0.00s system 48% cpu 0.005 total

Signed-off-by: Russell Harmon <russ@har.mn>
---
 drivers/hwmon/drivetemp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)