4.15.14 crash with iscsi target and dvd

Message ID 595a10cfb387e6b2ab4d2053b84fed9b3da9e079.camel@wdc.com (mailing list archive)
State New, archived

Commit Message

Bart Van Assche April 3, 2018, 5:03 p.m. UTC
On Sun, 2018-04-01 at 14:27 -0400, Wakko Warner wrote:
> Wakko Warner wrote:
> > Wakko Warner wrote:
> > > I tested 4.14.32 last night with the same oops.  4.9.91 works fine.
> > > From the initiator, if I do cat /dev/sr1 > /dev/null it works.  If I mount
> > > /dev/sr1 and then do find -type f | xargs cat > /dev/null the target
> > > crashes.  I'm using the builtin iscsi target with pscsi.  I can burn from
> > > the initiator with out problems.  I'll test other kernels between 4.9 and
> > > 4.14.
> > 
> > So I've tested 4.x.y where x one of 10 11 12 14 15 and y is the latest patch
> > (except for 4.15 which was 1 behind)
> > Each of these kernels crash within seconds or immediate of doing find -type
> > f | xargs cat > /dev/null from the initiator.
> 
> I tried 4.10.0.  It doesn't completely lockup the system, but the device
> that was used hangs.  So from the initiator, it's /dev/sr1 and from the
> target it's /dev/sr0.  Attempting to read /dev/sr0 after the oops causes the
> process to hang in D state.

Hello Wakko,

Thank you for narrowing this down further. I think that you encountered
a regression either in the block layer core or in the SCSI core. Unfortunately
the number of changes between kernel versions v4.9 and v4.10 in these two
subsystems is huge. I see two possible ways forward:
- Either you perform a bisect to identify the patch that introduced this
  regression. However, I'm not sure whether you are familiar with the bisect
  process.
- Or you identify the command that triggers this crash such that others
  can reproduce this issue without needing access to your setup.

How about reproducing this crash with the below patch applied on top of
kernel v4.15.x? The additional output sent by this patch to the system log
should allow us to reproduce this issue by submitting the same SCSI command
with sg_raw.
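
For illustration only, replaying a logged command with sg_raw could look
roughly like the sketch below. The READ(10) opcode, the LBA, and the 2048-byte
block size are assumptions picked as an example; the command that actually
triggers the crash has not been identified yet. read10_cdb is a hypothetical
helper and not part of sg3_utils.

```shell
# Hypothetical helper: print a READ(10) CDB as hex bytes for a given
# logical block address and block count (both decimal).
# Layout: opcode 0x28, flags, 4-byte LBA, group number, 2-byte length, control.
read10_cdb() {
    lba=$1
    count=$2
    printf '28 00 %02x %02x %02x %02x 00 %02x %02x 00\n' \
        $(( (lba >> 24) & 255 )) $(( (lba >> 16) & 255 )) \
        $(( (lba >> 8) & 255 )) $(( lba & 255 )) \
        $(( (count >> 8) & 255 )) $(( count & 255 ))
}

# Example (not executed here; needs sg3_utils and a real device):
#   sg_raw -r 2048 /dev/sr0 $(read10_cdb 16 1)
```

The -r option tells sg_raw how many bytes to read back, so it has to match
the block count times the device's block size.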

Thanks,

Bart.


Subject: [PATCH] Report commands with no physical segments in the system log

---
 drivers/scsi/scsi_lib.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Wakko Warner April 5, 2018, 12:26 a.m. UTC | #1
Bart Van Assche wrote:
> On Sun, 2018-04-01 at 14:27 -0400, Wakko Warner wrote:
> > Wakko Warner wrote:
> > > Wakko Warner wrote:
> > > > I tested 4.14.32 last night with the same oops.  4.9.91 works fine.
> > > > From the initiator, if I do cat /dev/sr1 > /dev/null it works.  If I mount
> > > > /dev/sr1 and then do find -type f | xargs cat > /dev/null the target
> > > > crashes.  I'm using the builtin iscsi target with pscsi.  I can burn from
> > > > the initiator with out problems.  I'll test other kernels between 4.9 and
> > > > 4.14.
> > > 
> > > So I've tested 4.x.y where x one of 10 11 12 14 15 and y is the latest patch
> > > (except for 4.15 which was 1 behind)
> > > Each of these kernels crash within seconds or immediate of doing find -type
> > > f | xargs cat > /dev/null from the initiator.
> > 
> > I tried 4.10.0.  It doesn't completely lockup the system, but the device
> > that was used hangs.  So from the initiator, it's /dev/sr1 and from the
> > target it's /dev/sr0.  Attempting to read /dev/sr0 after the oops causes the
> > process to hang in D state.
> 
> Hello Wakko,
> 
> Thank you for having narrowed down this further. I think that you encountered
> a regression either in the block layer core or in the SCSI core. Unfortunately
> the number of changes between kernel versions v4.9 and v4.10 in these two
> subsystems is huge. I see two possible ways forward:
> - Either that you perform a bisect to identify the patch that introduced this
>   regression. However, I'm not sure whether you are familiar with the bisect
>   process.
> - Or that you identify the command that triggers this crash such that others
>   can reproduce this issue without needing access to your setup.
> 
> How about reproducing this crash with the below patch applied on top of
> kernel v4.15.x? The additional output sent by this patch to the system log
> should allow us to reproduce this issue by submitting the same SCSI command
> with sg_raw.

Sorry for not getting back in touch.  My internet was down.  I haven't tried
the patch yet.  I'll try to get to that tomorrow.  The system with the issue
is busy and I can't reboot it right now.
Wakko Warner April 6, 2018, 1:46 a.m. UTC | #2
Bart Van Assche wrote:
> On Sun, 2018-04-01 at 14:27 -0400, Wakko Warner wrote:
> > Wakko Warner wrote:
> > > Wakko Warner wrote:
> > > > I tested 4.14.32 last night with the same oops.  4.9.91 works fine.
> > > > From the initiator, if I do cat /dev/sr1 > /dev/null it works.  If I mount
> > > > /dev/sr1 and then do find -type f | xargs cat > /dev/null the target
> > > > crashes.  I'm using the builtin iscsi target with pscsi.  I can burn from
> > > > the initiator with out problems.  I'll test other kernels between 4.9 and
> > > > 4.14.
> > > 
> > > So I've tested 4.x.y where x one of 10 11 12 14 15 and y is the latest patch
> > > (except for 4.15 which was 1 behind)
> > > Each of these kernels crash within seconds or immediate of doing find -type
> > > f | xargs cat > /dev/null from the initiator.
> > 
> > I tried 4.10.0.  It doesn't completely lockup the system, but the device
> > that was used hangs.  So from the initiator, it's /dev/sr1 and from the
> > target it's /dev/sr0.  Attempting to read /dev/sr0 after the oops causes the
> > process to hang in D state.
> 
> Hello Wakko,
> 
> Thank you for having narrowed down this further. I think that you encountered
> a regression either in the block layer core or in the SCSI core. Unfortunately
> the number of changes between kernel versions v4.9 and v4.10 in these two
> subsystems is huge. I see two possible ways forward:
> - Either that you perform a bisect to identify the patch that introduced this
>   regression. However, I'm not sure whether you are familiar with the bisect
>   process.
> - Or that you identify the command that triggers this crash such that others
>   can reproduce this issue without needing access to your setup.
> 
> How about reproducing this crash with the below patch applied on top of
> kernel v4.15.x? The additional output sent by this patch to the system log
> should allow us to reproduce this issue by submitting the same SCSI command
> with sg_raw.

Ok, so I tried this, but scsi_print_command doesn't print anything.  I added
a check for !rq and the same thing that blk_rq_nr_phys_segments does in an
if statement above this thinking it might have crashed during WARN_ON_ONCE.
It still didn't print anything.  My printk shows this:
[  36.263193] sr 3:0:0:0: cmd->request->nr_phys_segments is 0

I also had scsi_print_command in the same if block which again didn't print
anything.  Is there some debug option I need to turn on to make it print?  I
tried looking through the code for this and following some of the function
calls but didn't see any config options.

> Subject: [PATCH] Report commands with no physical segments in the system log
> 
> ---
>  drivers/scsi/scsi_lib.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 6b6a6705f6e5..74a39db57d49 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1093,8 +1093,10 @@ int scsi_init_io(struct scsi_cmnd *cmd)
>  	bool is_mq = (rq->mq_ctx != NULL);
>  	int error = BLKPREP_KILL;
>  
> -	if (WARN_ON_ONCE(!blk_rq_nr_phys_segments(rq)))
> +	if (WARN_ON_ONCE(!blk_rq_nr_phys_segments(rq))) {
> +		scsi_print_command(cmd);
>  		goto err_exit;
> +	}
>  
>  	error = scsi_init_sgtable(rq, &cmd->sdb);
>  	if (error)
Wakko Warner April 6, 2018, 2:06 a.m. UTC | #3
Wakko Warner wrote:
> Bart Van Assche wrote:
> > On Sun, 2018-04-01 at 14:27 -0400, Wakko Warner wrote:
> > > Wakko Warner wrote:
> > > > Wakko Warner wrote:
> > > > > I tested 4.14.32 last night with the same oops.  4.9.91 works fine.
> > > > > From the initiator, if I do cat /dev/sr1 > /dev/null it works.  If I mount
> > > > > /dev/sr1 and then do find -type f | xargs cat > /dev/null the target
> > > > > crashes.  I'm using the builtin iscsi target with pscsi.  I can burn from
> > > > > the initiator with out problems.  I'll test other kernels between 4.9 and
> > > > > 4.14.
> > > > 
> > > > So I've tested 4.x.y where x one of 10 11 12 14 15 and y is the latest patch
> > > > (except for 4.15 which was 1 behind)
> > > > Each of these kernels crash within seconds or immediate of doing find -type
> > > > f | xargs cat > /dev/null from the initiator.
> > > 
> > > I tried 4.10.0.  It doesn't completely lockup the system, but the device
> > > that was used hangs.  So from the initiator, it's /dev/sr1 and from the
> > > target it's /dev/sr0.  Attempting to read /dev/sr0 after the oops causes the
> > > process to hang in D state.
> > 
> > Hello Wakko,
> > 
> > Thank you for having narrowed down this further. I think that you encountered
> > a regression either in the block layer core or in the SCSI core. Unfortunately
> > the number of changes between kernel versions v4.9 and v4.10 in these two
> > subsystems is huge. I see two possible ways forward:
> > - Either that you perform a bisect to identify the patch that introduced this
> >   regression. However, I'm not sure whether you are familiar with the bisect
> >   process.
> > - Or that you identify the command that triggers this crash such that others
> >   can reproduce this issue without needing access to your setup.
> > 
> > How about reproducing this crash with the below patch applied on top of
> > kernel v4.15.x? The additional output sent by this patch to the system log
> > should allow us to reproduce this issue by submitting the same SCSI command
> > with sg_raw.
> 
> Ok, so I tried this, but scsi_print_command doesn't print anything.  I added
> a check for !rq and the same thing that blk_rq_nr_phys_segments does in an
> if statement above this thinking it might have crashed during WARN_ON_ONCE.
> It still didn't print anything.  My printk shows this:
> [  36.263193] sr 3:0:0:0: cmd->request->nr_phys_segments is 0
> 
> I also had scsi_print_command in the same if block which again didn't print
> anything.  Is there some debug option I need to turn on to make it print?  I
> tried looking through the code for this and following some of the function
> calls but didn't see any config options.

I know now why scsi_print_command isn't doing anything.  cmd->cmnd is null.
I added a dev_printk in scsi_print_command where the 2 if statements return.
Logs:
[  29.866415] sr 3:0:0:0: cmd->cmnd is NULL

> > Subject: [PATCH] Report commands with no physical segments in the system log
> > 
> > ---
> >  drivers/scsi/scsi_lib.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > index 6b6a6705f6e5..74a39db57d49 100644
> > --- a/drivers/scsi/scsi_lib.c
> > +++ b/drivers/scsi/scsi_lib.c
> > @@ -1093,8 +1093,10 @@ int scsi_init_io(struct scsi_cmnd *cmd)
> >  	bool is_mq = (rq->mq_ctx != NULL);
> >  	int error = BLKPREP_KILL;
> >  
> > -	if (WARN_ON_ONCE(!blk_rq_nr_phys_segments(rq)))
> > +	if (WARN_ON_ONCE(!blk_rq_nr_phys_segments(rq))) {
> > +		scsi_print_command(cmd);
> >  		goto err_exit;
> > +	}
> >  
> >  	error = scsi_init_sgtable(rq, &cmd->sdb);
> >  	if (error)
> -- 
>  Microsoft has beaten Volkswagen's world record.  Volkswagen only created 22
>  million bugs.
Bart Van Assche April 6, 2018, 2:20 a.m. UTC | #4
On Thu, 2018-04-05 at 22:06 -0400, Wakko Warner wrote:
> I know now why scsi_print_command isn't doing anything.  cmd->cmnd is null.
> I added a dev_printk in scsi_print_command where the 2 if statements return.
> Logs:
> [  29.866415] sr 3:0:0:0: cmd->cmnd is NULL

That's something that should never happen. As one can see in
scsi_setup_scsi_cmnd() and scsi_setup_fs_cmnd(), both functions initialize
that pointer. Since I have not yet been able to reproduce what you reported
myself, would it be possible for you to bisect this issue? You will need
to follow something like the following procedure (see also
https://git-scm.com/docs/git-bisect):

git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
cd linux
git bisect start
git bisect bad v4.10
git bisect good v4.9

and then build the kernel, install it, boot the kernel and test it.
Depending on the result, run either git bisect bad or git bisect good and
keep going until git bisect comes to a conclusion. This can take an hour or
more.
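
The good/bad loop can be tried out safely on a throwaway repository first.
The sketch below is only a demo under assumptions: the repository, the
"state" file, and the commit messages are made up, and the test step is
automated with git bisect run, which is not practical for a kernel that has
to be rebooted between steps.

```shell
#!/bin/sh
# Demo of the bisect loop on a throwaway repository (not the kernel tree).
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email demo@example.invalid
git config user.name demo

# Ten commits; from commit 7 onward the file "state" says "bad".
i=1
while [ "$i" -le 10 ]; do
    if [ "$i" -ge 7 ]; then echo bad > state; else echo good > state; fi
    echo "$i" > counter
    git add state counter
    git commit -q -m "commit $i"
    i=$((i + 1))
done

good=$(git rev-list --max-parents=0 HEAD)   # first commit, known good
git bisect start HEAD "$good" >/dev/null
# The test command exits 0 for good and 1 for bad; for a kernel this step
# is the manual build, install, boot, and test cycle instead.
git bisect run grep -q good state >/dev/null
first_bad=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad: $first_bad"
```

In the demo the regression is planted in the seventh commit, which is the
one the bisect ends on. With a linear history of N commits, roughly log2(N)
build-and-test rounds are needed.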

Bart.
Wakko Warner April 6, 2018, 11:42 p.m. UTC | #5
Bart Van Assche wrote:
> On Thu, 2018-04-05 at 22:06 -0400, Wakko Warner wrote:
> > I know now why scsi_print_command isn't doing anything.  cmd->cmnd is null.
> > I added a dev_printk in scsi_print_command where the 2 if statements return.
> > Logs:
> > [  29.866415] sr 3:0:0:0: cmd->cmnd is NULL
> 
> That's something that should never happen. As one can see in
> scsi_setup_scsi_cmnd() and scsi_setup_fs_cmnd() both functions initialize
> that pointer. Since I have not yet been able to reproduce myself what you
> reported, would it be possible for you to bisect this issue? You will need
> to follow something like the following procedure (see also
> https://git-scm.com/docs/git-bisect):

I don't know how relevant it is, but this machine boots over nfs and exports its
dvd drives over iscsi with the target modules.  My scsi_target.lio is at the
end.  I removed the iqn name.  The options are default except for a few.
I tabbed over the non-default options.
eth0 is the nfs/localnet nic and eth1 is the nic that iscsi goes over.
eth0 is onboard pci 8086:1502 (subsystem 1028:05d3)
eth1 is pci 8086:107d (subsystem 8086:1084)
Both use the e1000e driver

The initiator is running 4.4.107.
On the initiator, /dev/sr1 is the target's /dev/sr0.  Therefore:
cat /dev/sr1 > /dev/null seems to work.
mount /dev/sr1 /cdrom works
find /cdrom -type f | xargs cat > /dev/null immediately crashes the target.
Burning to /dev/sr1 seems to work.

I have another nic that uses igb instead, I'll see if that makes a
difference.

> git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> git bisect start
> git bisect bad v4.10
> git bisect good v4.9
> 
> and then build the kernel, install it, boot the kernel and test it.
> Depending on the result, run either git bisect bad or git bisect good and
> keep going until git bisect comes to a conclusion. This can take an hour or
> more.

I'll try this.

Here's my scsi_target.lio:
storage pscsi {
	disk dvd0 {
	path /dev/sr0 
attribute {
emulate_3pc yes 
emulate_caw yes 
emulate_dpo no 
emulate_fua_read no 
emulate_model_alias no 
emulate_rest_reord no 
emulate_tas yes 
emulate_tpu no 
emulate_tpws no 
emulate_ua_intlck_ctrl no 
emulate_write_cache no 
enforce_pr_isids yes 
fabric_max_sectors 8192 
is_nonrot yes 
max_unmap_block_desc_count 0 
max_unmap_lba_count 0 
max_write_same_len 65535 
queue_depth 128 
unmap_granularity 0 
unmap_granularity_alignment 0 
}
}
	disk dvd1 {
	path /dev/sr1 
attribute {
emulate_3pc yes 
emulate_caw yes 
emulate_dpo no 
emulate_fua_read no 
emulate_model_alias no 
emulate_rest_reord no 
emulate_tas yes 
emulate_tpu no 
emulate_tpws no 
emulate_ua_intlck_ctrl no 
emulate_write_cache no 
enforce_pr_isids yes 
fabric_max_sectors 8192 
is_nonrot yes 
max_unmap_block_desc_count 0 
max_unmap_lba_count 0 
max_write_same_len 65535 
queue_depth 128 
unmap_granularity 0 
unmap_granularity_alignment 0 
}
}
	disk dvd2 {
	path /dev/sr2 
attribute {
emulate_3pc yes 
emulate_caw yes 
emulate_dpo no 
emulate_fua_read no 
emulate_model_alias no 
emulate_rest_reord no 
emulate_tas yes 
emulate_tpu no 
emulate_tpws no 
emulate_ua_intlck_ctrl no 
emulate_write_cache no 
enforce_pr_isids yes 
fabric_max_sectors 8192 
is_nonrot yes 
max_unmap_block_desc_count 0 
max_unmap_lba_count 0 
max_write_same_len 65535 
queue_depth 128 
unmap_granularity 0 
unmap_granularity_alignment 0 
}
}
}
fabric iscsi {
discovery_auth {
enable no 
mutual_password "" 
mutual_userid "" 
password "" 
userid "" 
}
	target iqn.<myiqn>:dvd tpgt 1 {
enable yes 
attribute {
	authentication no 
cache_dynamic_acls yes 
default_cmdsn_depth 64 
default_erl 0 
demo_mode_discovery yes 
	demo_mode_write_protect no 
fabric_prot_type 0 
	generate_node_acls yes 
login_timeout 15 
netif_timeout 2 
prod_mode_write_protect no 
t10_pi 0 
tpg_enabled_sendtargets 1 
}
auth {
password "" 
password_mutual "" 
userid "" 
userid_mutual "" 
}
parameter {
AuthMethod "CHAP,None" 
DataDigest "CRC32C,None" 
DataPDUInOrder yes 
DataSequenceInOrder yes 
DefaultTime2Retain 20 
DefaultTime2Wait 2 
ErrorRecoveryLevel no 
FirstBurstLength 65536 
HeaderDigest "CRC32C,None" 
IFMarkInt Reject 
IFMarker no 
ImmediateData yes 
InitialR2T yes 
MaxBurstLength 262144 
MaxConnections 1 
MaxOutstandingR2T 1 
MaxRecvDataSegmentLength 8192 
MaxXmitDataSegmentLength 262144 
OFMarkInt Reject 
OFMarker no 
TargetAlias "LIO Target" 
}
	lun 0 backend pscsi:dvd0 
	lun 1 backend pscsi:dvd1 
	lun 2 backend pscsi:dvd2 
	portal 0.0.0.0:3260 
}
}
Wakko Warner April 7, 2018, 1:03 a.m. UTC | #6
Bart Van Assche wrote:
> On Thu, 2018-04-05 at 22:06 -0400, Wakko Warner wrote:
> > I know now why scsi_print_command isn't doing anything.  cmd->cmnd is null.
> > I added a dev_printk in scsi_print_command where the 2 if statements return.
> > Logs:
> > [  29.866415] sr 3:0:0:0: cmd->cmnd is NULL
> 
> That's something that should never happen. As one can see in
> scsi_setup_scsi_cmnd() and scsi_setup_fs_cmnd() both functions initialize
> that pointer. Since I have not yet been able to reproduce myself what you
> reported, would it be possible for you to bisect this issue? You will need
> to follow something like the following procedure (see also
> https://git-scm.com/docs/git-bisect):
> 
> git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> git bisect start
> git bisect bad v4.10
> git bisect good v4.9
> 
> and then build the kernel, install it, boot the kernel and test it.
> Depending on the result, run either git bisect bad or git bisect good and
> keep going until git bisect comes to a conclusion. This can take an hour or
> more.

I have 1 question.  Should make clean be done between tests?  My box
compiles the whole kernel in 2 minutes.
Bart Van Assche April 7, 2018, 2:06 a.m. UTC | #7
On Fri, 2018-04-06 at 21:03 -0400, Wakko Warner wrote:
> Bart Van Assche wrote:
> > On Thu, 2018-04-05 at 22:06 -0400, Wakko Warner wrote:
> > > I know now why scsi_print_command isn't doing anything.  cmd->cmnd is null.
> > > I added a dev_printk in scsi_print_command where the 2 if statements return.
> > > Logs:
> > > [  29.866415] sr 3:0:0:0: cmd->cmnd is NULL
> > 
> > That's something that should never happen. As one can see in
> > scsi_setup_scsi_cmnd() and scsi_setup_fs_cmnd() both functions initialize
> > that pointer. Since I have not yet been able to reproduce myself what you
> > reported, would it be possible for you to bisect this issue? You will need
> > to follow something like the following procedure (see also
> > https://git-scm.com/docs/git-bisect):
> > 
> > git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> > git bisect start
> > git bisect bad v4.10
> > git bisect good v4.9
> > 
> > and then build the kernel, install it, boot the kernel and test it.
> > Depending on the result, run either git bisect bad or git bisect good and
> > keep going until git bisect comes to a conclusion. This can take an hour or
> > more.
> 
> I have 1 question.  Should make clean be done between tests?  My box
> compiles the whole kernel in 2 minutes.

If you trust that the build system will figure out all dependencies then
running make clean is not necessary. Personally I always run make clean
during a bisect before rebuilding the kernel because if a header file has
changed in e.g. the block layer a huge number of files has to be rebuilt
anyway.

Bart.
Wakko Warner April 7, 2018, 4:53 p.m. UTC | #8
Bart Van Assche wrote:
> On Thu, 2018-04-05 at 22:06 -0400, Wakko Warner wrote:
> > I know now why scsi_print_command isn't doing anything.  cmd->cmnd is null.
> > I added a dev_printk in scsi_print_command where the 2 if statements return.
> > Logs:
> > [  29.866415] sr 3:0:0:0: cmd->cmnd is NULL
> 
> That's something that should never happen. As one can see in
> scsi_setup_scsi_cmnd() and scsi_setup_fs_cmnd() both functions initialize
> that pointer. Since I have not yet been able to reproduce myself what you
> reported, would it be possible for you to bisect this issue? You will need
> to follow something like the following procedure (see also
> https://git-scm.com/docs/git-bisect):

After doing 3 successful compiles with good/bad, I got this error and was
not able to compile any more kernels:
  CC      scripts/mod/devicetable-offsets.s
scripts/mod/empty.c:1:0: error: code model kernel does not support PIC mode
 /* empty file to figure out endianness / word size */
 
scripts/mod/devicetable-offsets.c:1:0: error: code model kernel does not support PIC mode
 #include <linux/kbuild.h>
 
scripts/Makefile.build:153: recipe for target 'scripts/mod/devicetable-offsets.s' failed

I don't think it found the bad commit.
Wakko Warner April 7, 2018, 5:08 p.m. UTC | #9
Wakko Warner wrote:
> Bart Van Assche wrote:
> > On Thu, 2018-04-05 at 22:06 -0400, Wakko Warner wrote:
> > > I know now why scsi_print_command isn't doing anything.  cmd->cmnd is null.
> > > I added a dev_printk in scsi_print_command where the 2 if statements return.
> > > Logs:
> > > [  29.866415] sr 3:0:0:0: cmd->cmnd is NULL
> > 
> > That's something that should never happen. As one can see in
> > scsi_setup_scsi_cmnd() and scsi_setup_fs_cmnd() both functions initialize
> > that pointer. Since I have not yet been able to reproduce myself what you
> > reported, would it be possible for you to bisect this issue? You will need
> > to follow something like the following procedure (see also
> > https://git-scm.com/docs/git-bisect):
> 
> After doing 3 successful compiles with good/bad, I got this error and was
> not able to compile any more kernels:
>   CC      scripts/mod/devicetable-offsets.s
> scripts/mod/empty.c:1:0: error: code model kernel does not support PIC mode
>  /* empty file to figure out endianness / word size */
>  
> scripts/mod/devicetable-offsets.c:1:0: error: code model kernel does not support PIC mode
>  #include <linux/kbuild.h>
>  
> scripts/Makefile.build:153: recipe for target 'scripts/mod/devicetable-offsets.s' failed
> 
> I don't think it found the bad commit.

I forgot to mention my gcc version.
gcc (Debian 6.2.1-7) 6.2.1 20161215
Bart Van Assche April 7, 2018, 5:09 p.m. UTC | #10
On Sat, 2018-04-07 at 12:53 -0400, Wakko Warner wrote:
> Bart Van Assche wrote:
> > On Thu, 2018-04-05 at 22:06 -0400, Wakko Warner wrote:
> > > I know now why scsi_print_command isn't doing anything.  cmd->cmnd is null.
> > > I added a dev_printk in scsi_print_command where the 2 if statements return.
> > > Logs:
> > > [  29.866415] sr 3:0:0:0: cmd->cmnd is NULL
> > 
> > That's something that should never happen. As one can see in
> > scsi_setup_scsi_cmnd() and scsi_setup_fs_cmnd() both functions initialize
> > that pointer. Since I have not yet been able to reproduce myself what you
> > reported, would it be possible for you to bisect this issue? You will need
> > to follow something like the following procedure (see also
> > https://git-scm.com/docs/git-bisect):
> 
> After doing 3 successful compiles with good/bad, I got this error and was
> not able to compile any more kernels:
>   CC      scripts/mod/devicetable-offsets.s
> scripts/mod/empty.c:1:0: error: code model kernel does not support PIC mode
>  /* empty file to figure out endianness / word size */
>
> scripts/mod/devicetable-offsets.c:1:0: error: code model kernel does not support PIC mode
>  #include <linux/kbuild.h>
>
> scripts/Makefile.build:153: recipe for target 'scripts/mod/devicetable-offsets.s' failed
> 
> I don't think it found the bad commit.

Have you tried to modify the kernel Makefile as indicated in the following
e-mail? This should make the kernel build:

https://lists.ubuntu.com/archives/kernel-team/2016-May/077178.html
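
The "code model kernel does not support PIC mode" error typically shows up
when the compiler builds position-independent executables by default. As a
rough check, and only a sketch, the configure line printed by gcc -v can be
inspected. gcc_pie_default is a hypothetical helper name; it reads the
gcc -v text on stdin so it can be tried without a compiler installed.

```shell
# Hypothetical helper: read `gcc -v` output on stdin and report whether
# the compiler was configured with --enable-default-pie.
gcc_pie_default() {
    if grep -q -- '--enable-default-pie'; then
        echo pie-default
    else
        echo no-pie-default
    fi
}

# Example (not executed here; needs gcc installed):
#   gcc -v 2>&1 | gcc_pie_default
```

If the helper reports pie-default, older kernel trees visited during the
bisect will likely need the Makefile change from the e-mail above before
they build.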

Thanks,

Bart.
Patch

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 6b6a6705f6e5..74a39db57d49 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1093,8 +1093,10 @@ int scsi_init_io(struct scsi_cmnd *cmd)
 	bool is_mq = (rq->mq_ctx != NULL);
 	int error = BLKPREP_KILL;
 
-	if (WARN_ON_ONCE(!blk_rq_nr_phys_segments(rq)))
+	if (WARN_ON_ONCE(!blk_rq_nr_phys_segments(rq))) {
+		scsi_print_command(cmd);
 		goto err_exit;
+	}
 
 	error = scsi_init_sgtable(rq, &cmd->sdb);
 	if (error)