
[1/6] libata: Do not retry commands with valid autosense

Message ID 1438347728-106434-2-git-send-email-hare@suse.de (mailing list archive)
State New, archived

Commit Message

Hannes Reinecke July 31, 2015, 1:02 p.m. UTC
If a failed command has a valid autosense there is no need to
retry it on the ATA level; at best we're incurring the same
error again. So rather not retry it here, but leave it to
the SCSI layer to decide if a retry is in order.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/ata/libata-eh.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Tejun Heo Aug. 2, 2015, 3:44 p.m. UTC | #1
Hello,

On Fri, Jul 31, 2015 at 03:02:03PM +0200, Hannes Reinecke wrote:
> If a failed command has a valid autosense there is no need to
> retry it on the ATA level; at best we're incurring the same
> error again. So rather not retry it here, but leave it to
> the SCSI layer to decide if a retry is in order.

Hmmm... I don't know.  So, we change how we handle errors completely
depending on how the device reports it?  Doesn't seem like a
particularly good idea to me.

Thanks.
Hannes Reinecke Aug. 3, 2015, 7:31 a.m. UTC | #2
On 08/02/2015 05:44 PM, Tejun Heo wrote:
> Hello,
> 
> On Fri, Jul 31, 2015 at 03:02:03PM +0200, Hannes Reinecke wrote:
>> If a failed command has a valid autosense there is no need to
>> retry it on the ATA level; at best we're incurring the same
>> error again. So rather not retry it here, but leave it to
>> the SCSI layer to decide if a retry is in order.
> 
> Hmmm... I don't know.  So, we change how we handle errors completely
> depending on how the device reports it?  Doesn't seem like a
> particularly good idea to me.
> 
The whole point of the autosense feature is that you do _not_
have to fall back to the original trial-and-error libata EH,
but know exactly what the problem is. Plus any retry will be giving
us (in most cases) exactly the same sense code.
_And_ the SCSI layer is actually able to understand the sense code,
allowing it to make a better judgment on what to do with that error.

So any retry in the libata layer will only slow things down,
leading to the same results eventually.

Cheers,

Hannes
Tejun Heo Aug. 3, 2015, 3:04 p.m. UTC | #3
Hello,

On Mon, Aug 03, 2015 at 09:31:57AM +0200, Hannes Reinecke wrote:
> The whole point of the autosense feature is that you do _not_
> have to fall back to the original trial-and-error libata EH,
> but know exactly what the problem is. Plus any retry will be giving
> us (in most cases) exactly the same sense code.

Can you please give some examples?  As lacking as ATA error reporting
is, it still can tell whether retry is necessary or not in most cases.

> _And_ the SCSI layer is actually able to understand the sense code,
> allowing it to make a better judgment on what to do with that error.
>
> So any retry in the libata layer will only slow things down,
> leading to the same results eventually.

Have you tested actual error handling?  I doubt this would work as you
expect it to.  libata EH takes over the entire error handling and when
it determines that the command has failed and retrying won't do any
good, it tells SCSI EH to not retry either.

Ugh... so this is from NCQ autosense thing.  Now ATA devices report
sense data too which trumps AC_ERR_DEV so libata EH decides to retry
even when the device indicates unrecoverable error.  Urgh... we
shouldn't be taking completely different error handling paths because
a device chooses to report error conditions slightly differently.
Please map them so that they behave in a consistent manner.  I'm gonna
plug autosense for now.

Thanks.
Tejun Heo Aug. 3, 2015, 3:18 p.m. UTC | #4
Adding a bit.

On Mon, Aug 03, 2015 at 11:04:28AM -0400, Tejun Heo wrote:
> Ugh... so this is from NCQ autosense thing.  Now ATA devices report
> sense data too which trumps AC_ERR_DEV so libata EH decides to retry
> even when the device indicates unrecoverable error.  Urgh... we
> shouldn't be taking completely different error handling paths because
> a device chooses to report error conditions slightly differently.
> Please map them so that they behave in a consistent manner.  I'm gonna
> plug autosense for now.

Also, is there anything substantial we gain from NCQ autosense?  Why
do we want this in the first place?  ATA error reporting is
rudimentary but it more or less works and I'm not sure whether we'd
want to overhaul its basic behaviors at this stage.

Thanks.
James Bottomley Aug. 3, 2015, 3:42 p.m. UTC | #5
On Mon, 2015-08-03 at 11:18 -0400, Tejun Heo wrote:
> Adding a bit.
> 
> On Mon, Aug 03, 2015 at 11:04:28AM -0400, Tejun Heo wrote:
> > Ugh... so this is from NCQ autosense thing.  Now ATA devices report
> > sense data too which trumps AC_ERR_DEV so libata EH decides to retry
> > even when the device indicates unrecoverable error.  Urgh... we
> > shouldn't be taking completely different error handling paths because
> > a device chooses to report error conditions slightly differently.
> > Please map them so that they behave in a consistent manner.  I'm gonna
> > plug autosense for now.
> 
> Also, is there anything substantial we gain from NCQ autosense?  Why
> do we want this in the first place?  ATA error reporting is
> rudimentary but it more or less works and I'm not sure whether we'd
> want to overhaul its basic behaviors at this stage.

I'd think it would be the same reason as all modern transports: it's
faster and allows processing of sense data in-band.  Under the old
regime, the device is effectively frozen until you collect the data.
Under autosense, the data is collected as part of the in-band command
processing, so it doesn't stall the device.

Modern drives (and protocols) are moving towards being somewhat more
chatty with sense data.  It doesn't just signal an error, mostly it's
just reporting about drive characteristics or other advisory stuff.
This means that if you handle it the old way, you'll get more drive
stalls and a corresponding reduction in throughput.

James



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tejun Heo Aug. 3, 2015, 3:55 p.m. UTC | #6
Hello, James.

On Mon, Aug 03, 2015 at 08:42:43AM -0700, James Bottomley wrote:
> I'd think it would be the same reason as all modern transports: it's
> faster and allows processing of sense data in-band.  Under the old
> regime, the device is effectively frozen until you collect the data.
> Under autosense, the data is collected as part of the in-band command
> processing, so it doesn't stall the device.
> 
> Modern drives (and protocols) are moving towards being somewhat more
> chatty with sense data.  It doesn't just signal an error, mostly it's
> just reporting about drive characteristics or other advisory stuff.
> This means that if you handle it the old way, you'll get more drive
> stalls and a corresponding reduction in throughput.

The problem is not the "auto" part but the "sense" part, I guess.  ATA
devices (the harddisks) never reported sense data and instead had
more rudimentary error bits and for newer devices NCQ log pages, so
libata EH decodes that error information and takes appropriate
actions for the indicated error condition.

Hannes's patchset makes ATA devices mostly bypass libata EH when sense
data is present.  For, say, unrecoverable read errors, it'd be
possible to make this scheme work (broken currently tho); however,
libata and SCSI aren't that closely tied and there currently is no way
for SCSI to tell libata that, e.g., link error was detected on the
device side, so libata will fail to take link recovery actions on
those cases.

This *can* be made to work in a couple different ways but what's
implemented now is pretty broken and making it work properly in any
other way than integrating sense decoding into libata EH would require
major restructuring of the whole thing which I'm not sure would be
worthwhile at this point.

Thanks.
James Bottomley Aug. 3, 2015, 4:44 p.m. UTC | #7
On Mon, 2015-08-03 at 11:55 -0400, Tejun Heo wrote:
> Hello, James.
> 
> On Mon, Aug 03, 2015 at 08:42:43AM -0700, James Bottomley wrote:
> > I'd think it would be the same reason as all modern transports: it's
> > faster and allows processing of sense data in-band.  Under the old
> > regime, the device is effectively frozen until you collect the data.
> > Under autosense, the data is collected as part of the in-band command
> > processing, so it doesn't stall the device.
> > 
> > Modern drives (and protocols) are moving towards being somewhat more
> > chatty with sense data.  It doesn't just signal an error, mostly it's
> > just reporting about drive characteristics or other advisory stuff.
> > This means that if you handle it the old way, you'll get more drive
> > stalls and a corresponding reduction in throughput.
> 
> The problem is not the "auto" part but the "sense" part, I guess.  ATA
> devices (the harddisks) never reported sense data and instead had
> more rudimentary error bits and for newer devices NCQ log pages, so
> libata EH decodes that error information and takes appropriate
> actions for the indicated error condition.
> 
> Hannes's patchset makes ATA devices mostly bypass libata EH when sense
> data is present.  For, say, unrecoverable read errors, it'd be
> possible to make this scheme work (broken currently tho); however,
> libata and SCSI aren't that closely tied and there currently is no way
> for SCSI to tell libata that, e.g., link error was detected on the
> device side, so libata will fail to take link recovery actions on
> those cases.
> 
> This *can* be made to work in a couple different ways but what's
> implemented now is pretty broken and making it work properly in any
> other way than integrating sense decoding into libata EH would require
> major restructuring of the whole thing which I'm not sure would be
> worthwhile at this point.

I'm not arguing that *this* patch is the best way to do it.  You asked
*why* autosense and that's what I answered.  I think there's time to
work out the implementation details to get them to be correct and well
structured.

James


Hannes Reinecke Aug. 3, 2015, 4:47 p.m. UTC | #8
On 08/03/2015 05:55 PM, Tejun Heo wrote:
> Hello, James.
> 
> On Mon, Aug 03, 2015 at 08:42:43AM -0700, James Bottomley wrote:
>> I'd think it would be the same reason as all modern transports: it's
>> faster and allows processing of sense data in-band.  Under the old
>> regime, the device is effectively frozen until you collect the data.
>> Under autosense, the data is collected as part of the in-band command
>> processing, so it doesn't stall the device.
>>
>> Modern drives (and protocols) are moving towards being somewhat more
>> chatty with sense data.  It doesn't just signal an error, mostly it's
>> just reporting about drive characteristics or other advisory stuff.
>> This means that if you handle it the old way, you'll get more drive
>> stalls and a corresponding reduction in throughput.
> 
> The problem is not the "auto" part but the "sense" part, I guess.  ATA
> devices (the harddisks) never reported sense data and instead had
> more rudimentary error bits and for newer devices NCQ log pages, so
> libata EH decodes that error information and takes appropriate
> actions for the indicated error condition.
> 
> Hannes's patchset makes ATA devices mostly bypass libata EH when sense
> data is present.  For, say, unrecoverable read errors, it'd be
> possible to make this scheme work (broken currently tho); however,
> libata and SCSI aren't that closely tied and there currently is no way
> for SCSI to tell libata that, e.g., link error was detected on the
> device side, so libata will fail to take link recovery actions on
> those cases.
> 
> This *can* be made to work in a couple different ways but what's
> implemented now is pretty broken and making it work properly in any
> other way than integrating sense decoding into libata EH would require
> major restructuring of the whole thing which I'm not sure would be
> worthwhile at this point.
> 
At the moment NCQ autosense is mostly used to provide the host with more
details for a failed I/O. The typical case here is (no small surprise)
ZAC disks, which use autosense to inform the host about
a malformed I/O.
It is _not_ being used as a replacement for existing error behaviour
(i.e. link errors are not being signalled with that; how could they
if there is no link?); in fact, during testing I've seen both autosense
I/O failures and normal I/O failures for which autosense is
not set, and the normal error handling kicks in.

It's not that I've disabled the original error handler completely;
it's only bypassed for I/O failures where a sense code is provided.
And the drive surely knows which error occurred, so we'd be daft not to
be using that.

So I think disabling autosense completely is a bit extreme...

Cheers,

Hannes
Tejun Heo Aug. 3, 2015, 4:50 p.m. UTC | #9
Hello, James.

On Mon, Aug 03, 2015 at 09:44:06AM -0700, James Bottomley wrote:
> I'm not arguing that *this* patch is the best way to do it.  You asked
> *why* autosense and that's what I answered.  I think there's time to

Heh, that was mostly me being confused.  I was thinking NCQ autosense
was the only way ATA devices would report sense data.

> work out the implementation details to get them to be correct and well
> structured.

Yeah, definitely.

Thanks.
Tejun Heo Aug. 3, 2015, 5:01 p.m. UTC | #10
Hello, Hannes.

On Mon, Aug 03, 2015 at 06:47:46PM +0200, Hannes Reinecke wrote:
> At the moment NCQ autosense is mostly used to provide the host with more
> details for a failed I/O. The typical case here is (no small surprise)
> ZAC disks, which use autosense to inform the host about
> a malformed I/O.
>
> It is _not_ being used as a replacement for existing error behaviour
> (i.e. link errors are not being signalled with that; how could they
> if there is no link?); in fact, during testing I've seen both autosense

Hmmm?  Devices can report link error via TF bits and you're bypassing
TF analysis completely if sense data is present.

> I/O failures and normal I/O failures for which autosense is
> not set, and the normal error handling kicks in.
> 
> It's not that I've disabled the original error handler completely;
> it's only bypassed for I/O failures where a sense code is provided.
> And the drive surely knows which error occurred, so we'd be daft not to
> be using that.

The patches are altering EH actions in a very subtle way depending on
*how* an error is reported, not *what* is reported, which is a pretty
silly thing to do.  It makes things a lot more confusing to follow and
predict.  I really don't think this is an acceptable behavior.

> So I think disabling autosense completely is a bit extreme...

Please restructure the feature so that it doesn't interfere with the
usual EH behavior.  e.g. leave the EH actions alone unless explicitly
necessary but report detailed error information upwards.  If the extra
error information can be helpful in determining what EH actions to
take, factoring in that information can be helpful too but I'm not too
convinced that'd make a huge difference.

Also, please consider that ATA_QCFLAG_SENSE_VALID handling assumes
that the reporting device is an ATAPI device and the command in
question is not a regular IO one.  That's why EH ignores AC_ERR_DEV or
AC_ERR_OTHER if ATA_QCFLAG_SENSE_VALID is set.  This doesn't work out
for the new ATA usage at all.

For now, I've reverted the changes as this is actively detrimental.

Thanks.
Hannes Reinecke Aug. 3, 2015, 6:21 p.m. UTC | #11
On 08/03/2015 07:01 PM, Tejun Heo wrote:
> Hello, Hannes.
> 
> On Mon, Aug 03, 2015 at 06:47:46PM +0200, Hannes Reinecke wrote:
>> At the moment NCQ autosense is mostly used to provide the host with more
>> details for a failed I/O. The typical case here is (no small surprise)
>> ZAC disks, which use autosense to inform the host about
>> a malformed I/O.
>>
>> It is _not_ being used as a replacement for existing error behaviour
>> (i.e. link errors are not being signalled with that; how could they
>> if there is no link?); in fact, during testing I've seen both autosense
> 
> Hmmm?  Devices can report link error via TF bits and you're bypassing
> TF analysis completely if sense data is present.
> 
But sense data will never be present for a link error ... and I'm not
disabling that.

>> I/O failures and normal I/O failures for which autosense is
>> not set, and the normal error handling kicks in.
>>
>> It's not that I've disabled the original error handler completely;
>> it's only bypassed for I/O failures where a sense code is provided.
>> And the drive surely knows which error occurred, so we'd be daft not to
>> be using that.
> 
> The patches are altering EH actions in a very subtle way depending on
> *how* an error is reported, not *what* is reported, which is a pretty
> silly thing to do.  It makes things a lot more confusing to follow and
> predict.  I really don't think this is an acceptable behavior.
> 
But if we were judging on _what_ is being reported, we would have to know
which sense code the drive will be reporting, in effect carrying
a massive table of possible sense codes and mapping them to actions.
Do you really want to go that way?

>> So I think disabling autosense completely is a bit extreme...
> 
> Please restructure the feature so that it doesn't interfere with the
> usual EH behavior.  e.g. leave the EH actions alone unless explicitly
> necessary but report detailed error information upwards.  If the extra
> error information can be helpful in determining what EH actions to
> take, factoring in that information can be helpful too but I'm not too
> convinced that'd make a huge difference.
> 
It does if we can avoid the retry on the libata layer.

> Also, please consider that ATA_QCFLAG_SENSE_VALID handling assumes
> that the reporting device is an ATAPI device and the command in
> question is not a regular IO one.  That's why EH ignores AC_ERR_DEV or
> AC_ERR_OTHER if ATA_QCFLAG_SENSE_VALID is set.  This doesn't work out
> for the new ATA usage at all.
> 
I've just mapped onto ATA_QCFLAG_SENSE_VALID as I've found this to be
reasonably close to what I've been needing.
If that's the wrong choice, of course I can modify this.

> For now, I've reverted the changes as this is actively detrimental.
> 
Oh. So you've tested this on a device with autosense enabled?
I haven't seen any negative effects with this patchset, autosense
enabled or not.

Cheers,

Hannes
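
Editorial sketch, not part of the thread: one way to read the "map them so
that they behave in a consistent manner" request without carrying a table of
every sense code is to key only on the 4-bit sense key and translate it into
an error class that the existing EH already understands. The code below is a
standalone illustration under that assumption. The map_* names, the MAP_ERR_*
bit values and the particular key-to-class choices are all hypothetical; only
the numeric sense key values come from the SCSI specification.

```c
#include <assert.h>

/* Sense key values as defined by the SCSI Primary Commands spec. */
enum map_sense_key {
	MAP_SK_NO_SENSE        = 0x0,
	MAP_SK_MEDIUM_ERROR    = 0x3,
	MAP_SK_HARDWARE_ERROR  = 0x4,
	MAP_SK_ILLEGAL_REQUEST = 0x5,
	MAP_SK_ABORTED_COMMAND = 0xb,
};

/* Hypothetical error classes standing in for libata's AC_ERR_* bits.
 * The bit positions are placeholders, not the kernel's definitions. */
enum {
	MAP_ERR_NONE    = 0,
	MAP_ERR_DEV     = 1 << 0,
	MAP_ERR_MEDIA   = 1 << 1,
	MAP_ERR_INVALID = 1 << 2,
};

/* Translate a sense key into an error class. This mapping is an
 * assumption made for illustration; unknown keys return MAP_ERR_NONE
 * so the caller can fall back to its usual classification. */
static unsigned int map_sense_key_to_err(unsigned int sense_key)
{
	switch (sense_key & 0xf) {
	case MAP_SK_MEDIUM_ERROR:
		return MAP_ERR_MEDIA;
	case MAP_SK_ILLEGAL_REQUEST:
		return MAP_ERR_INVALID;
	case MAP_SK_HARDWARE_ERROR:
	case MAP_SK_ABORTED_COMMAND:
		return MAP_ERR_DEV;
	default:
		return MAP_ERR_NONE;
	}
}

/* Small self-check showing the intended use. */
static void map_sense_key_demo(void)
{
	assert(map_sense_key_to_err(MAP_SK_MEDIUM_ERROR) == MAP_ERR_MEDIA);
	assert(map_sense_key_to_err(MAP_SK_ILLEGAL_REQUEST) == MAP_ERR_INVALID);
	assert(map_sense_key_to_err(MAP_SK_NO_SENSE) == MAP_ERR_NONE);
}
```

With a translation step like this, the retry decision would depend on the
error class alone, regardless of whether the class came from TF bits or from
sense data. Whether a sense-key-only mapping is fine-grained enough is
exactly the open question in the exchange above.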

Patch

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 7465031..1b4e9d1 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -2218,6 +2218,8 @@  static inline int ata_eh_worth_retry(struct ata_queued_cmd *qc)
 		return 1;	/* otherwise retry anything from fs stack */
 	if (qc->err_mask & AC_ERR_INVALID)
 		return 0;	/* don't retry these */
+	if (qc->flags & ATA_QCFLAG_SENSE_VALID)
+		return 0;	/* Autosense, no need to retry here */
 	return qc->err_mask != AC_ERR_DEV;  /* retry if not dev error */
 }
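
Editorial sketch, not part of the patch: the effect of the two added lines can
be modelled outside the kernel. The code below reproduces only the checks
visible in the hunk above (the earlier checks in ata_eh_worth_retry() are
omitted), and uses a hypothetical struct sk_qc with placeholder bit values
rather than the kernel's struct ata_queued_cmd and its real constants.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values for illustration only; not the kernel's. */
enum {
	SK_AC_ERR_DEV     = 1 << 0,
	SK_AC_ERR_INVALID = 1 << 1,
	SK_AC_ERR_TIMEOUT = 1 << 2,
};
enum { SK_QCFLAG_SENSE_VALID = 1 << 0 };

/* Hypothetical stand-in for struct ata_queued_cmd: only the two
 * fields that the hunk tests. */
struct sk_qc {
	unsigned int err_mask;
	unsigned int flags;
};

/* Tail of the retry decision as shown in the hunk. with_patch selects
 * whether the added ATA_QCFLAG_SENSE_VALID check is applied. */
static bool sk_worth_retry(const struct sk_qc *qc, bool with_patch)
{
	if (qc->err_mask & SK_AC_ERR_INVALID)
		return false;	/* don't retry these */
	if (with_patch && (qc->flags & SK_QCFLAG_SENSE_VALID))
		return false;	/* sense data present: leave retry to SCSI */
	return qc->err_mask != SK_AC_ERR_DEV;	/* retry if not dev error */
}

/* Self-check: same error class, different outcome once sense is valid. */
static void sk_worth_retry_demo(void)
{
	struct sk_qc timeout_sense = { SK_AC_ERR_TIMEOUT, SK_QCFLAG_SENSE_VALID };
	struct sk_qc timeout_plain = { SK_AC_ERR_TIMEOUT, 0 };
	struct sk_qc dev_plain     = { SK_AC_ERR_DEV, 0 };

	assert(sk_worth_retry(&timeout_sense, false));	/* unpatched: retry */
	assert(!sk_worth_retry(&timeout_sense, true));	/* patched: no retry */
	assert(sk_worth_retry(&timeout_plain, true));	/* no sense: unchanged */
	assert(!sk_worth_retry(&dev_plain, true));	/* dev error: unchanged */
}
```

The first two assertions show the behaviour debated in the comments: for an
identical err_mask, the outcome flips purely on whether the device attached
sense data.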