diff mbox

[V2] scsi_debugfs: fix crash in scsi_show_rq()

Message ID 20171107152155.19949-1-ming.lei@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ming Lei Nov. 7, 2017, 3:21 p.m. UTC
cmd->cmnd can be allocated/freed dynamically in case of T10_PI_TYPE2_PROTECTION,
so we can't access it in scsi_show_rq() if 'SCpnt->cmnd != scsi_req(rq)->cmd',
because this request can be freed any time.

This patch trys to fix the following kernel crash when dumping request via
block's debugfs interface:

[  252.962045] BUG: unable to handle kernel NULL pointer dereference at           (null)
[  252.963007] IP: scsi_format_opcode_name+0x1a/0x1c0
[  252.963007] PGD 25e75a067 P4D 25e75a067 PUD 25e75b067 PMD 0
[  252.963007] Oops: 0000 [#1] PREEMPT SMP
[  252.963007] Dumping ftrace buffer:
[  252.963007]    (ftrace buffer empty)
[  252.963007] Modules linked in: scsi_debug ebtable_filter ebtables ip6table_filter ip6_tables xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c bridge stp llc iptable_filter fuse ip_tables sd_mod sg mptsas mptscsih mptbase crc32c_intel ahci libahci nvme serio_raw scsi_transport_sas libata lpc_ich nvme_core virtio_scsi binfmt_misc dm_mod iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi null_blk configs
[  252.963007] CPU: 1 PID: 1881 Comm: cat Not tainted 4.14.0-rc2.blk_mq_io_hang+ #516
[  252.963007] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.9.3-1.fc25 04/01/2014
[  252.963007] task: ffff88025e6f6000 task.stack: ffffc90001bd0000
[  252.963007] RIP: 0010:scsi_format_opcode_name+0x1a/0x1c0
[  252.963007] RSP: 0018:ffffc90001bd3c50 EFLAGS: 00010286
[  252.963007] RAX: 00000000ffff4843 RBX: 0000000000000050 RCX: 0000000000000000
[  252.963007] RDX: 0000000000000000 RSI: 0000000000000050 RDI: ffffc90001bd3cd8
[  252.963007] RBP: ffffc90001bd3c88 R08: 0000000000001000 R09: 0000000000000000
[  252.963007] R10: ffff880275134000 R11: ffff88027513406c R12: 0000000000000050
[  252.963007] R13: ffffc90001bd3cd8 R14: 0000000000000000 R15: 0000000000000000
[  252.963007] FS:  00007f4d11762700(0000) GS:ffff88027fc40000(0000) knlGS:0000000000000000
[  252.963007] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  252.963007] CR2: 0000000000000000 CR3: 000000025e789003 CR4: 00000000003606e0
[  252.963007] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  252.963007] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  252.963007] Call Trace:
[  252.963007]  __scsi_format_command+0x27/0xc0
[  252.963007]  scsi_show_rq+0x5c/0xc0
[  252.963007]  ? seq_printf+0x4e/0x70
[  252.963007]  ? blk_flags_show+0x5b/0xf0
[  252.963007]  __blk_mq_debugfs_rq_show+0x116/0x130
[  252.963007]  blk_mq_debugfs_rq_show+0xe/0x10
[  252.963007]  seq_read+0xfe/0x3b0
[  252.963007]  ? __handle_mm_fault+0x631/0x1150
[  252.963007]  full_proxy_read+0x54/0x90
[  252.963007]  __vfs_read+0x37/0x160
[  252.963007]  ? security_file_permission+0x9b/0xc0
[  252.963007]  vfs_read+0x96/0x130
[  252.963007]  SyS_read+0x55/0xc0
[  252.963007]  entry_SYSCALL_64_fastpath+0x1a/0xa5
[  252.963007] RIP: 0033:0x7f4d1127e9b0
[  252.963007] RSP: 002b:00007ffd27082568 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[  252.963007] RAX: ffffffffffffffda RBX: 00007f4d1154bb20 RCX: 00007f4d1127e9b0
[  252.963007] RDX: 0000000000020000 RSI: 00007f4d115a7000 RDI: 0000000000000003
[  252.963007] RBP: 0000000000021010 R08: ffffffffffffffff R09: 0000000000000000
[  252.963007] R10: 000000000000037b R11: 0000000000000246 R12: 0000000000022000
[  252.963007] R13: 00007f4d1154bb78 R14: 0000000000001000 R15: 0000000000020000
[  252.963007] Code: c6 e8 1b ca 24 00 eb 8c e8 74 2c ae ff 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 41 54 53 49 89 fd 49 89 f4 48 83 ec 18 <44> 0f b6 32 48 c7 45 c8 00 00 00 00 65 48 8b 04 25 28 00 00 00
[  252.963007] RIP: scsi_format_opcode_name+0x1a/0x1c0 RSP: ffffc90001bd3c50
[  252.963007] CR2: 0000000000000000
[  252.963007] ---[ end trace 83c5bddfbaa6573c ]---
[  252.963007] Kernel panic - not syncing: Fatal exception
[  252.963007] Dumping ftrace buffer:
[  252.963007]    (ftrace buffer empty)
[  252.963007] Kernel Offset: disabled
[  252.963007] ---[ end Kernel panic - not syncing: Fatal exception

Fixes: 0eebd005dd07(scsi: Implement blk_mq_ops.show_rq())
Cc: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
V2:
	fix a typo found by John Garry.

 drivers/scsi/scsi_debugfs.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Comments

Bart Van Assche Nov. 7, 2017, 4:13 p.m. UTC | #1
On Tue, 2017-11-07 at 23:21 +0800, Ming Lei wrote:
> cmd->cmnd can be allocated/freed dynamically in case of T10_PI_TYPE2_PROTECTION,

> so we can't access it in scsi_show_rq() if 'SCpnt->cmnd != scsi_req(rq)->cmd',

> because this request can be freed any time.


That description is inaccurate. It is not allowed to free SCpnt->cmnd while
a command is being executed.

> diff --git a/drivers/scsi/scsi_debugfs.c b/drivers/scsi/scsi_debugfs.c

> index 5e9755008aed..7a50878446b4 100644

> --- a/drivers/scsi/scsi_debugfs.c

> +++ b/drivers/scsi/scsi_debugfs.c

> @@ -9,7 +9,10 @@ void scsi_show_rq(struct seq_file *m, struct request *rq)

>  	int msecs = jiffies_to_msecs(jiffies - cmd->jiffies_at_alloc);

>  	char buf[80];

>  

> -	__scsi_format_command(buf, sizeof(buf), cmd->cmnd, cmd->cmd_len);

> +	if (cmd->cmnd == scsi_req(rq)->cmd)

> +		__scsi_format_command(buf, sizeof(buf), cmd->cmnd, cmd->cmd_len);

> +	else

> +		strcpy(buf, "unknown");

>  	seq_printf(m, ", .cmd=%s, .retries=%d, allocated %d.%03d s ago", buf,

>  		   cmd->retries, msecs / 1000, msecs % 1000);

>  }


This change introduces a new bug, namely that "unknown" will appear in the
debugfs output if (cmd->cmnd != scsi_req(rq)->cmd). Have you considered to use
the test (cmd->cmnd != NULL) instead?

Additionally, if you really want to avoid that this code crashes if NULL is
assigned to cmd->cmnd you will have to use READ_ONCE(cmd->cmnd).

Bart.
Ming Lei Nov. 8, 2017, 12:59 a.m. UTC | #2
On Tue, Nov 07, 2017 at 04:13:48PM +0000, Bart Van Assche wrote:
> On Tue, 2017-11-07 at 23:21 +0800, Ming Lei wrote:
> > cmd->cmnd can be allocated/freed dynamically in case of T10_PI_TYPE2_PROTECTION,
> > so we can't access it in scsi_show_rq() if 'SCpnt->cmnd != scsi_req(rq)->cmd',
> > because this request can be freed any time.
> 
> That description is inaccurate. It is not allowed to free SCpnt->cmnd while
> a command is being executed.

But can you prevent this request being freed when reading this debugfs
file?

> 
> > diff --git a/drivers/scsi/scsi_debugfs.c b/drivers/scsi/scsi_debugfs.c
> > index 5e9755008aed..7a50878446b4 100644
> > --- a/drivers/scsi/scsi_debugfs.c
> > +++ b/drivers/scsi/scsi_debugfs.c
> > @@ -9,7 +9,10 @@ void scsi_show_rq(struct seq_file *m, struct request *rq)
> >  	int msecs = jiffies_to_msecs(jiffies - cmd->jiffies_at_alloc);
> >  	char buf[80];
> >  
> > -	__scsi_format_command(buf, sizeof(buf), cmd->cmnd, cmd->cmd_len);
> > +	if (cmd->cmnd == scsi_req(rq)->cmd)
> > +		__scsi_format_command(buf, sizeof(buf), cmd->cmnd, cmd->cmd_len);
> > +	else
> > +		strcpy(buf, "unknown");
> >  	seq_printf(m, ", .cmd=%s, .retries=%d, allocated %d.%03d s ago", buf,
> >  		   cmd->retries, msecs / 1000, msecs % 1000);
> >  }
> 
> This change introduces a new bug, namely that "unknown" will appear in the
> debugfs output if (cmd->cmnd != scsi_req(rq)->cmd). Have you considered to use

Because there isn't reliable way to get the command safely, and I don't
think it is a new bug.

> the test (cmd->cmnd != NULL) instead?

No, that is worse, because you may cause use-after-free if you read a
freed buffer.
Bart Van Assche Nov. 8, 2017, 1:06 a.m. UTC | #3
On Wed, 2017-11-08 at 08:59 +0800, Ming Lei wrote:
> On Tue, Nov 07, 2017 at 04:13:48PM +0000, Bart Van Assche wrote:

> > On Tue, 2017-11-07 at 23:21 +0800, Ming Lei wrote:

> > > diff --git a/drivers/scsi/scsi_debugfs.c b/drivers/scsi/scsi_debugfs.c

> > > index 5e9755008aed..7a50878446b4 100644

> > > --- a/drivers/scsi/scsi_debugfs.c

> > > +++ b/drivers/scsi/scsi_debugfs.c

> > > @@ -9,7 +9,10 @@ void scsi_show_rq(struct seq_file *m, struct request *rq)

> > >  	int msecs = jiffies_to_msecs(jiffies - cmd->jiffies_at_alloc);

> > >  	char buf[80];

> > >  

> > > -	__scsi_format_command(buf, sizeof(buf), cmd->cmnd, cmd->cmd_len);

> > > +	if (cmd->cmnd == scsi_req(rq)->cmd)

> > > +		__scsi_format_command(buf, sizeof(buf), cmd->cmnd, cmd->cmd_len);

> > > +	else

> > > +		strcpy(buf, "unknown");

> > >  	seq_printf(m, ", .cmd=%s, .retries=%d, allocated %d.%03d s ago", buf,

> > >  		   cmd->retries, msecs / 1000, msecs % 1000);

> > >  }

> > 

> > This change introduces a new bug, namely that "unknown" will appear in the

> > debugfs output if (cmd->cmnd != scsi_req(rq)->cmd). Have you considered to use

> 

> Because there isn't reliable way to get the command safely, and I don't

> think it is a new bug.

> 

> > the test (cmd->cmnd != NULL) instead?

> 

> No, that is worse, because you may cause use-after-free if you read a

> freed buffer.


It seems like your operating mode is still to contradict all feedback you get
instead of trying to address the feedback you get?

Anyway, this is a debugging facility so I'm not convinced that avoiding a
(very sporadic) use-after-free in this code is better than omitting very
useful output.

Bart.
Ming Lei Nov. 8, 2017, 1:15 a.m. UTC | #4
On Wed, Nov 08, 2017 at 01:06:44AM +0000, Bart Van Assche wrote:
> On Wed, 2017-11-08 at 08:59 +0800, Ming Lei wrote:
> > On Tue, Nov 07, 2017 at 04:13:48PM +0000, Bart Van Assche wrote:
> > > On Tue, 2017-11-07 at 23:21 +0800, Ming Lei wrote:
> > > > diff --git a/drivers/scsi/scsi_debugfs.c b/drivers/scsi/scsi_debugfs.c
> > > > index 5e9755008aed..7a50878446b4 100644
> > > > --- a/drivers/scsi/scsi_debugfs.c
> > > > +++ b/drivers/scsi/scsi_debugfs.c
> > > > @@ -9,7 +9,10 @@ void scsi_show_rq(struct seq_file *m, struct request *rq)
> > > >  	int msecs = jiffies_to_msecs(jiffies - cmd->jiffies_at_alloc);
> > > >  	char buf[80];
> > > >  
> > > > -	__scsi_format_command(buf, sizeof(buf), cmd->cmnd, cmd->cmd_len);
> > > > +	if (cmd->cmnd == scsi_req(rq)->cmd)
> > > > +		__scsi_format_command(buf, sizeof(buf), cmd->cmnd, cmd->cmd_len);
> > > > +	else
> > > > +		strcpy(buf, "unknown");
> > > >  	seq_printf(m, ", .cmd=%s, .retries=%d, allocated %d.%03d s ago", buf,
> > > >  		   cmd->retries, msecs / 1000, msecs % 1000);
> > > >  }
> > > 
> > > This change introduces a new bug, namely that "unknown" will appear in the
> > > debugfs output if (cmd->cmnd != scsi_req(rq)->cmd). Have you considered to use
> > 
> > Because there isn't reliable way to get the command safely, and I don't
> > think it is a new bug.
> > 
> > > the test (cmd->cmnd != NULL) instead?
> > 
> > No, that is worse, because you may cause use-after-free if you read a
> > freed buffer.
> 
> It seems like your operating mode is still to contradict all feedback you get
> instead of trying to address the feedback you get?
> 
> Anyway, this is a debugging facility so I'm not convinced that avoiding a
> (very sporadic) use-after-free in this code is better than omitting very
> useful output.

OK, if no one objects the use-after-free, because this way may trigger
warning from some utility, such as KASAN.
James Bottomley Nov. 8, 2017, 3:04 a.m. UTC | #5
On Wed, 2017-11-08 at 09:15 +0800, Ming Lei wrote:
> On Wed, Nov 08, 2017 at 01:06:44AM +0000, Bart Van Assche wrote:
> > 
> > On Wed, 2017-11-08 at 08:59 +0800, Ming Lei wrote:
> > > 
> > > On Tue, Nov 07, 2017 at 04:13:48PM +0000, Bart Van Assche wrote:
> > > > 
> > > > On Tue, 2017-11-07 at 23:21 +0800, Ming Lei wrote:
> > > > > 
> > > > > diff --git a/drivers/scsi/scsi_debugfs.c
> > > > > b/drivers/scsi/scsi_debugfs.c
> > > > > index 5e9755008aed..7a50878446b4 100644
> > > > > --- a/drivers/scsi/scsi_debugfs.c
> > > > > +++ b/drivers/scsi/scsi_debugfs.c
> > > > > @@ -9,7 +9,10 @@ void scsi_show_rq(struct seq_file *m, struct
> > > > > request *rq)
> > > > >  	int msecs = jiffies_to_msecs(jiffies - cmd-
> > > > > >jiffies_at_alloc);
> > > > >  	char buf[80];
> > > > >  
> > > > > -	__scsi_format_command(buf, sizeof(buf), cmd->cmnd,
> > > > > cmd->cmd_len);
> > > > > +	if (cmd->cmnd == scsi_req(rq)->cmd)
> > > > > +		__scsi_format_command(buf, sizeof(buf), cmd-
> > > > > >cmnd, cmd->cmd_len);
> > > > > +	else
> > > > > +		strcpy(buf, "unknown");
> > > > >  	seq_printf(m, ", .cmd=%s, .retries=%d, allocated
> > > > > %d.%03d s ago", buf,
> > > > >  		   cmd->retries, msecs / 1000, msecs %
> > > > > 1000);
> > > > >  }
> > > > 
> > > > This change introduces a new bug, namely that "unknown" will
> > > > appear in the debugfs output if (cmd->cmnd != scsi_req(rq)-
> > > > >cmd). Have you considered to use
> > > 
> > > Because there isn't reliable way to get the command safely, and I
> > > don't think it is a new bug.
> > > 
> > > > 
> > > > the test (cmd->cmnd != NULL) instead?
> > > 
> > > No, that is worse, because you may cause use-after-free if you
> > > read a freed buffer.
> > 
> > It seems like your operating mode is still to contradict all
> > feedback you get instead of trying to address the feedback you get?
> > 
> > Anyway, this is a debugging facility so I'm not convinced that
> > avoiding a (very sporadic) use-after-free in this code is better
> > than omitting very useful output.
> 
> OK, if no one objects the use-after-free, because this way may
> trigger warning from some utility, such as KASAN.

Good grief, this list is supposed to be for technical discussions not
kindergarten playground squabbles; could you both please act your age?

The patch as proposed would lose us all information about PI commands,
hence the objection.  For the concern about use after free on the NULL
check, what about modifying sd_uninit_command() to NULL SCpnt->cmnd
before it calls mempool_free()?  That will likely eliminate the race
window for printing the command.

James
Ming Lei Nov. 8, 2017, 3:37 a.m. UTC | #6
Hi James,

On Tue, Nov 07, 2017 at 07:04:23PM -0800, James Bottomley wrote:
> On Wed, 2017-11-08 at 09:15 +0800, Ming Lei wrote:
> > On Wed, Nov 08, 2017 at 01:06:44AM +0000, Bart Van Assche wrote:
> > > 
> > > On Wed, 2017-11-08 at 08:59 +0800, Ming Lei wrote:
> > > > 
> > > > On Tue, Nov 07, 2017 at 04:13:48PM +0000, Bart Van Assche wrote:
> > > > > 
> > > > > On Tue, 2017-11-07 at 23:21 +0800, Ming Lei wrote:
> > > > > > 
> > > > > > diff --git a/drivers/scsi/scsi_debugfs.c
> > > > > > b/drivers/scsi/scsi_debugfs.c
> > > > > > index 5e9755008aed..7a50878446b4 100644
> > > > > > --- a/drivers/scsi/scsi_debugfs.c
> > > > > > +++ b/drivers/scsi/scsi_debugfs.c
> > > > > > @@ -9,7 +9,10 @@ void scsi_show_rq(struct seq_file *m, struct
> > > > > > request *rq)
> > > > > >  	int msecs = jiffies_to_msecs(jiffies - cmd-
> > > > > > >jiffies_at_alloc);
> > > > > >  	char buf[80];
> > > > > >  
> > > > > > -	__scsi_format_command(buf, sizeof(buf), cmd->cmnd,
> > > > > > cmd->cmd_len);
> > > > > > +	if (cmd->cmnd == scsi_req(rq)->cmd)
> > > > > > +		__scsi_format_command(buf, sizeof(buf), cmd-
> > > > > > >cmnd, cmd->cmd_len);
> > > > > > +	else
> > > > > > +		strcpy(buf, "unknown");
> > > > > >  	seq_printf(m, ", .cmd=%s, .retries=%d, allocated
> > > > > > %d.%03d s ago", buf,
> > > > > >  		   cmd->retries, msecs / 1000, msecs %
> > > > > > 1000);
> > > > > >  }
> > > > > 
> > > > > This change introduces a new bug, namely that "unknown" will
> > > > > appear in the debugfs output if (cmd->cmnd != scsi_req(rq)-
> > > > > >cmd). Have you considered to use
> > > > 
> > > > Because there isn't reliable way to get the command safely, and I
> > > > don't think it is a new bug.
> > > > 
> > > > > 
> > > > > the test (cmd->cmnd != NULL) instead?
> > > > 
> > > > No, that is worse, because you may cause use-after-free if you
> > > > read a freed buffer.
> > > 
> > > It seems like your operating mode is still to contradict all
> > > feedback you get instead of trying to address the feedback you get?
> > > 
> > > Anyway, this is a debugging facility so I'm not convinced that
> > > avoiding a (very sporadic) use-after-free in this code is better
> > > than omitting very useful output.
> > 
> > OK, if no one objects the use-after-free, because this way may
> > trigger warning from some utility, such as KASAN.
> 
> Good grief, this list is supposed to be for technical discussions not
> kindergarten playground squabbles; could you both please act your age?

In my reply, I mentioned I don't object to check NULL any more, and only
provide one extra input about possible KASAN's complaint, which may annoy
people.

> 
> The patch as proposed would lose us all information about PI commands,
> hence the objection.  For the concern about use after free on the NULL
> check, what about modifying sd_uninit_command() to NULL SCpnt->cmnd
> before it calls mempool_free()?  That will likely eliminate the race
> window for printing the command.

That may decrease the window, but can't eliminate it because of write/read
can be reordered.

But looks better, will cook a patch soon, together with using READ_ONCE() and
WRITE_ONCE().

Thanks,
Ming
diff mbox

Patch

diff --git a/drivers/scsi/scsi_debugfs.c b/drivers/scsi/scsi_debugfs.c
index 5e9755008aed..7a50878446b4 100644
--- a/drivers/scsi/scsi_debugfs.c
+++ b/drivers/scsi/scsi_debugfs.c
@@ -9,7 +9,10 @@  void scsi_show_rq(struct seq_file *m, struct request *rq)
 	int msecs = jiffies_to_msecs(jiffies - cmd->jiffies_at_alloc);
 	char buf[80];
 
-	__scsi_format_command(buf, sizeof(buf), cmd->cmnd, cmd->cmd_len);
+	if (cmd->cmnd == scsi_req(rq)->cmd)
+		__scsi_format_command(buf, sizeof(buf), cmd->cmnd, cmd->cmd_len);
+	else
+		strcpy(buf, "unknown");
 	seq_printf(m, ", .cmd=%s, .retries=%d, allocated %d.%03d s ago", buf,
 		   cmd->retries, msecs / 1000, msecs % 1000);
 }