diff mbox

[REGRESSION] 28676d869bbb (scsi: sg: check for valid direction before starting the request) breaks mtx tape library control

Message ID 20170719083654.GE4151@linux-x5ow.site (mailing list archive)
State Changes Requested, archived
Headers show

Commit Message

Johannes Thumshirn July 19, 2017, 8:36 a.m. UTC
On Wed, Jul 19, 2017 at 03:13:34AM -0500, Jason L Tibbitts III wrote:
> [   46.304530] sg_is_valid_dxfer: dxfer_direction: -2, dxfer_len: 0

Ahh now I see the -2 (SG_DXFER_TO_DEV) is the crucial point here. It is 0 in
your case.

This would "fix" it but I'm not generally sure it is _the_ solution:


Doug, what are the rules for SG_DXFER_TO_{FROM_}DEV? Apparently we can't
be sure len > 0, can we rely on dxferp being present?

Thanks,
	Johannes

Comments

Johannes Thumshirn July 21, 2017, 4:43 p.m. UTC | #1
Doug,

On Wed, Jul 19, 2017 at 10:36:54AM +0200, Johannes Thumshirn wrote:
> On Wed, Jul 19, 2017 at 03:13:34AM -0500, Jason L Tibbitts III wrote:
> > [   46.304530] sg_is_valid_dxfer: dxfer_direction: -2, dxfer_len: 0
> 
> Ahh now I see the -2 (SG_DXFER_TO_DEV) is the crucial point here. It is 0 in
> your case.
> 
> This would "fix" it but I'm not generally sure it is _the_ solution:
> 
> diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
> index 1e82d4128a84..b421ec81d775 100644
> --- a/drivers/scsi/sg.c
> +++ b/drivers/scsi/sg.c
> @@ -764,7 +764,7 @@ static bool sg_is_valid_dxfer(sg_io_hdr_t *hp)
>  		return true;
>  	case SG_DXFER_TO_DEV:
>  	case SG_DXFER_TO_FROM_DEV:
> -		if (!hp->dxferp || hp->dxfer_len == 0)
> +		if (!hp->dxferp)
>  			return false;
>  		return true;
>  	case SG_DXFER_UNKNOWN:
> 
> Doug, what are the rules for SG_DXFER_TO_{FROM_}DEV? Apparently we can't
> be sure len > 0, can we rely on dxferp being present?

Any comments on this?

Jason, can you try the above? If it works and Doug doesn't respond, I'm
inclined yo submit this band aid.

Thanks,
	Johannes
Jason L Tibbitts III July 21, 2017, 7:23 p.m. UTC | #2
>>>>> "JT" == Johannes Thumshirn <jthumshirn@suse.de> writes:

JT> Jason, can you try the above? If it works and Doug doesn't respond,
JT> I'm inclined yo submit this band aid.

Unfortunately it doesn't appear to work for me.  Maybe I'm building the
wrong thing, though.  I checked out 4.12, cherry picked
68c59fcea1f2c6a54c62aa896cc623c1b5bc9b47 and then applied your one liner
on top of that.  There appears to be no change in behavior:

[root@backup2 ~]# mtx -f /dev/sg7 next 0
Unloading drive 0 into Storage Element 47...mtx: Request Sense: Long
Report=yes
mtx: Request Sense: Valid Residual=no
mtx: Request Sense: Error Code=0 (Unknown?!)
mtx: Request Sense: Sense Key=No Sense
mtx: Request Sense: FileMark=no
mtx: Request Sense: EOM=no
mtx: Request Sense: ILI=no
mtx: Request Sense: Additional Sense Code = 00
mtx: Request Sense: Additional Sense Qualifier = 00
mtx: Request Sense: BPV=no
mtx: Request Sense: Error in CDB=no
mtx: Request Sense: SKSV=no
MOVE MEDIUM from Element Address 1 to 1047 Failed

I can also apply the debugging patch and try again if that would give
you more useful information.

 - J<
Johannes Thumshirn July 25, 2017, 7:18 a.m. UTC | #3
On Fri, Jul 21, 2017 at 02:23:16PM -0500, Jason L Tibbitts III wrote:
> I can also apply the debugging patch and try again if that would give
> you more useful information.

Yes please (on top of the snippet I've sent you last).

Thanks a lot,
	Johannes
Douglas Gilbert July 25, 2017, 7:09 p.m. UTC | #4
On 2017-07-19 04:36 AM, Johannes Thumshirn wrote:
> On Wed, Jul 19, 2017 at 03:13:34AM -0500, Jason L Tibbitts III wrote:
>> [   46.304530] sg_is_valid_dxfer: dxfer_direction: -2, dxfer_len: 0
> 
> Ahh now I see the -2 (SG_DXFER_TO_DEV) is the crucial point here. It is 0 in
> your case.
> 
> This would "fix" it but I'm not generally sure it is _the_ solution:
> 
> diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
> index 1e82d4128a84..b421ec81d775 100644
> --- a/drivers/scsi/sg.c
> +++ b/drivers/scsi/sg.c
> @@ -764,7 +764,7 @@ static bool sg_is_valid_dxfer(sg_io_hdr_t *hp)
>   		return true;
>   	case SG_DXFER_TO_DEV:
>   	case SG_DXFER_TO_FROM_DEV:
> -		if (!hp->dxferp || hp->dxfer_len == 0)
> +		if (!hp->dxferp)
>   			return false;
>   		return true;
>   	case SG_DXFER_UNKNOWN:
> 
> Doug, what are the rules for SG_DXFER_TO_{FROM_}DEV? Apparently we can't
> be sure len > 0, can we rely on dxferp being present?

_TO_ is toward the device (i.e. WRITE) and what T10 call "data-out".
_FROM_ is from the device (e.g. INQUIRY) and what T10 call "data-in".

_TO_FROM_ is basically _FROM_ (and has nothing to do with bidi).
_TO_FROM_ is very old and is meant to try and detect "short reads"
by prefilling the indirect buffer (by reading from dxferp) before
it is overwritten by the _FROM_ (and then writing to dxferp). It
is from the time when not all LLDs or HBAs provided an indication
of a "short read". Today users have the 'sg_io_hdr_t::resid' for
that purpose. Whether the sg driver in lk 4.12 still does that
I haven't checked.

The only limit that should be placed on dxfer_len is something like
<= 2**28 (256M) in my opinion. Big enough that the kernel would
reject it and small enough to catch negative values placed in an
unsigned.


The overall problem is that sg_is_valid_dxfer() introduced in lk 4.12
is doing more sanity checks than were done before. My policy was
to ignore ("don't care") combinations of dxfer_direction, dxferp and
dxfer_len that were harmless. Anyway those three variables are
incomplete since the SCSI command and the device also dictate the
length of the data-in transfer.

Doug Gilbert
Johannes Thumshirn July 26, 2017, 7:22 a.m. UTC | #5
On Tue, Jul 25, 2017 at 03:09:00PM -0400, Douglas Gilbert wrote:
> _TO_ is toward the device (i.e. WRITE) and what T10 call "data-out".
> _FROM_ is from the device (e.g. INQUIRY) and what T10 call "data-in".
> 
> _TO_FROM_ is basically _FROM_ (and has nothing to do with bidi).
> _TO_FROM_ is very old and is meant to try and detect "short reads"
> by prefilling the indirect buffer (by reading from dxferp) before
> it is overwritten by the _FROM_ (and then writing to dxferp). It
> is from the time when not all LLDs or HBAs provided an indication
> of a "short read". Today users have the 'sg_io_hdr_t::resid' for
> that purpose. Whether the sg driver in lk 4.12 still does that
> I haven't checked.
> 
> The only limit that should be placed on dxfer_len is something like
> <= 2**28 (256M) in my opinion. Big enough that the kernel would
> reject it and small enough to catch negative values placed in an
> unsigned.

OK. I'll give it a shot, thanks.

> 
> 
> The overall problem is that sg_is_valid_dxfer() introduced in lk 4.12
> is doing more sanity checks than were done before. My policy was
> to ignore ("don't care") combinations of dxfer_direction, dxferp and
> dxfer_len that were harmless. Anyway those three variables are
> incomplete since the SCSI command and the device also dictate the
> length of the data-in transfer.

The problem with these don't care combinations was, that user-space could then
easily crash the kernel. This is the reason I introduced sg_is_valid_dxfer().
It's sole purpuse was to avoid more CVEs, but unfortunately it turned into
quite some regressions.

Thanks,
	Johannes
diff mbox

Patch

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 1e82d4128a84..b421ec81d775 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -764,7 +764,7 @@  static bool sg_is_valid_dxfer(sg_io_hdr_t *hp)
 		return true;
 	case SG_DXFER_TO_DEV:
 	case SG_DXFER_TO_FROM_DEV:
-		if (!hp->dxferp || hp->dxfer_len == 0)
+		if (!hp->dxferp)
 			return false;
 		return true;
 	case SG_DXFER_UNKNOWN: