diff mbox

[1/2] scsi-mq: Only show the CDB if available

Message ID 20171205003809.1705-2-bart.vanassche@wdc.com (mailing list archive)
State Superseded
Headers show

Commit Message

Bart Van Assche Dec. 5, 2017, 12:38 a.m. UTC
Since the next patch will make it possible that scsi_show_rq() gets
called before the CDB pointer is changed into a non-NULL value,
only show the CDB if the CDB pointer is not NULL. Additionally,
show the request timeout and SCSI command flags. This patch also
fixes a bug that was reported by Ming Lei. See also Ming Lei,
scsi_debugfs: fix crash in scsi_show_rq(), linux-scsi, 7 November
2017 (https://marc.info/?l=linux-block&m=151006655317188).

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
---
 drivers/scsi/scsi_debugfs.c | 47 ++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 42 insertions(+), 5 deletions(-)

Comments

Ming Lei Dec. 5, 2017, 1:15 a.m. UTC | #1
On Mon, Dec 04, 2017 at 04:38:08PM -0800, Bart Van Assche wrote:
> Since the next patch will make it possible that scsi_show_rq() gets
> called before the CDB pointer is changed into a non-NULL value,
> only show the CDB if the CDB pointer is not NULL. Additionally,
> show the request timeout and SCSI command flags. This patch also
> fixes a bug that was reported by Ming Lei. See also Ming Lei,
> scsi_debugfs: fix crash in scsi_show_rq(), linux-scsi, 7 November
> 2017 (https://marc.info/?l=linux-block&m=151006655317188).

Please cook a patch for fixing the crash issue only, since we need
to backport the fix to stable kernel.

> 
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
> Cc: Martin K. Petersen <martin.petersen@oracle.com>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Hannes Reinecke <hare@suse.com>
> Cc: Johannes Thumshirn <jthumshirn@suse.de>

Please Cc: <stable@vger.kernel.org>

> ---
>  drivers/scsi/scsi_debugfs.c | 47 ++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 42 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_debugfs.c b/drivers/scsi/scsi_debugfs.c
> index 01f08c03f2c1..37ed6bb8e6ec 100644
> --- a/drivers/scsi/scsi_debugfs.c
> +++ b/drivers/scsi/scsi_debugfs.c
> @@ -4,13 +4,50 @@
>  #include <scsi/scsi_dbg.h>
>  #include "scsi_debugfs.h"
>  
> +#define SCSI_CMD_FLAG_NAME(name) [ilog2(SCMD_##name)] = #name
> +static const char *const scsi_cmd_flags[] = {
> +	SCSI_CMD_FLAG_NAME(TAGGED),
> +	SCSI_CMD_FLAG_NAME(UNCHECKED_ISA_DMA),
> +	SCSI_CMD_FLAG_NAME(ZONE_WRITE_LOCK),
> +	SCSI_CMD_FLAG_NAME(INITIALIZED),
> +};
> +#undef SCSI_CMD_FLAG_NAME
> +
> +static int scsi_flags_show(struct seq_file *m, const unsigned long flags,
> +			   const char *const *flag_name, int flag_name_count)
> +{
> +	bool sep = false;
> +	int i;
> +
> +	for (i = 0; i < sizeof(flags) * BITS_PER_BYTE; i++) {
> +		if (!(flags & BIT(i)))
> +			continue;
> +		if (sep)
> +			seq_puts(m, "|");
> +		sep = true;
> +		if (i < flag_name_count && flag_name[i])
> +			seq_puts(m, flag_name[i]);
> +		else
> +			seq_printf(m, "%d", i);
> +	}
> +	return 0;
> +}
> +
>  void scsi_show_rq(struct seq_file *m, struct request *rq)
>  {
>  	struct scsi_cmnd *cmd = container_of(scsi_req(rq), typeof(*cmd), req);
> -	int msecs = jiffies_to_msecs(jiffies - cmd->jiffies_at_alloc);
> -	char buf[80];
> +	int alloc_ms = jiffies_to_msecs(jiffies - cmd->jiffies_at_alloc);
> +	int timeout_ms = jiffies_to_msecs(rq->timeout);
> +	const u8 *const cdb = READ_ONCE(cmd->cmnd);
> +	char buf[80] = "(?)";
>  
> -	__scsi_format_command(buf, sizeof(buf), cmd->cmnd, cmd->cmd_len);
> -	seq_printf(m, ", .cmd=%s, .retries=%d, allocated %d.%03d s ago", buf,
> -		   cmd->retries, msecs / 1000, msecs % 1000);
> +	if (cdb)
> +		__scsi_format_command(buf, sizeof(buf), cdb, cmd->cmd_len);
> +	seq_printf(m, ", .cmd=%s, .retries=%d, .result = %#x, .flags=", buf,
> +		   cmd->retries, cmd->result);
> +	scsi_flags_show(m, cmd->flags, scsi_cmd_flags,
> +			ARRAY_SIZE(scsi_cmd_flags));
> +	seq_printf(m, ", .timeout=%d.%03d, allocated %d.%03d s ago",
> +		   timeout_ms / 1000, timeout_ms % 1000,
> +		   alloc_ms / 1000, alloc_ms % 1000);
>  }
> -- 
> 2.15.0
>
Bart Van Assche Dec. 5, 2017, 1:59 a.m. UTC | #2
On Tue, 2017-12-05 at 09:15 +0800, Ming Lei wrote:
> On Mon, Dec 04, 2017 at 04:38:08PM -0800, Bart Van Assche wrote:

> > Since the next patch will make it possible that scsi_show_rq() gets

> > called before the CDB pointer is changed into a non-NULL value,

> > only show the CDB if the CDB pointer is not NULL. Additionally,

> > show the request timeout and SCSI command flags. This patch also

> > fixes a bug that was reported by Ming Lei. See also Ming Lei,

> > scsi_debugfs: fix crash in scsi_show_rq(), linux-scsi, 7 November

> > 2017 (https://marc.info/?l=linux-block&m=151006655317188).

> 

> Please cook a patch for fixing the crash issue only, since we need

> to backport the fix to stable kernel.


The code that is touched by this patch is only used for kernel debugging.
I will do this if others agree with your opinion.

Bart.
Martin K. Petersen Dec. 5, 2017, 3:42 a.m. UTC | #3
Hi Ming,

> Please cook a patch for fixing the crash issue only, since we need
> to backport the fix to stable kernel.

I thought you were going to submit a V5 that addressed James' concerns?
Ming Lei Dec. 5, 2017, 4 a.m. UTC | #4
On Mon, Dec 04, 2017 at 10:42:28PM -0500, Martin K. Petersen wrote:
> 
> Hi Ming,
> 
> > Please cook a patch for fixing the crash issue only, since we need
> > to backport the fix to stable kernel.
> 
> I thought you were going to submit a V5 that addressed James' concerns?
> 
> -- 
> Martin K. Petersen	Oracle Linux Engineering

Hi Martin,

I replied in the following link for James's concerns:

	https://marc.info/?l=linux-block&m=151074751321108&w=2

The fact is that use-after-free can't avoided at all, no matter if
we set the cmnd to NULL before calling free, that means we have to
handle use-after-free well in scsi_show_rq(), so we don't need to
touch the free code.

So V4 is well enough for merge, IMO.


Thanks,
Ming
Ming Lei Dec. 5, 2017, 5 a.m. UTC | #5
On Tue, Dec 05, 2017 at 01:59:51AM +0000, Bart Van Assche wrote:
> On Tue, 2017-12-05 at 09:15 +0800, Ming Lei wrote:
> > On Mon, Dec 04, 2017 at 04:38:08PM -0800, Bart Van Assche wrote:
> > > Since the next patch will make it possible that scsi_show_rq() gets
> > > called before the CDB pointer is changed into a non-NULL value,
> > > only show the CDB if the CDB pointer is not NULL. Additionally,
> > > show the request timeout and SCSI command flags. This patch also
> > > fixes a bug that was reported by Ming Lei. See also Ming Lei,
> > > scsi_debugfs: fix crash in scsi_show_rq(), linux-scsi, 7 November
> > > 2017 (https://marc.info/?l=linux-block&m=151006655317188).
> > 
> > Please cook a patch for fixing the crash issue only, since we need
> > to backport the fix to stable kernel.
> 
> The code that is touched by this patch is only used for kernel debugging.
> I will do this if others agree with your opinion.

No, do not mix two different things in one patch, especially the fix part
need to be backported to stable.

The fix part should aim at V4.15, and the other part can be a V4.16
stuff.
Bart Van Assche Dec. 5, 2017, 4:22 p.m. UTC | #6
On Tue, 2017-12-05 at 13:00 +0800, Ming Lei wrote:
> No, do not mix two different things in one patch, especially the fix part

> need to be backported to stable.

> 

> The fix part should aim at V4.15, and the other part can be a V4.16

> stuff.


Does this mean that you do not plan to post a v5 of your patch and that you
want me to rework this patch series? I can do that.

Bart.
Ming Lei Dec. 5, 2017, 4:38 p.m. UTC | #7
On Tue, Dec 05, 2017 at 04:22:33PM +0000, Bart Van Assche wrote:
> On Tue, 2017-12-05 at 13:00 +0800, Ming Lei wrote:
> > No, do not mix two different things in one patch, especially the fix part
> > need to be backported to stable.
> > 
> > The fix part should aim at V4.15, and the other part can be a V4.16
> > stuff.
> 
> Does this mean that you do not plan to post a v5 of your patch and that you
> want me to rework this patch series? I can do that.

I believe V4 has been OK for merge, actually the only concern from James
is that 'set the cmnd to NULL *before* calling free so we narrow the race
window.', but that isn't required as I explained, even though you don't do
that in this patch too.

	https://marc.info/?t=151030464300003&r=1&w=2

I will work on V5 if Martin/James thinks it is needed.
James Bottomley Dec. 5, 2017, 4:43 p.m. UTC | #8
On Wed, 2017-12-06 at 00:38 +0800, Ming Lei wrote:
> On Tue, Dec 05, 2017 at 04:22:33PM +0000, Bart Van Assche wrote:
> > 
> > On Tue, 2017-12-05 at 13:00 +0800, Ming Lei wrote:
> > > 
> > > No, do not mix two different things in one patch, especially the
> > > fix part need to be backported to stable.
> > > 
> > > The fix part should aim at V4.15, and the other part can be a
> > > V4.16 stuff.
> > 
> > Does this mean that you do not plan to post a v5 of your patch and
> > that you want me to rework this patch series? I can do that.
> 
> I believe V4 has been OK for merge, actually the only concern from
> James is that 'set the cmnd to NULL *before* calling free so we
> narrow the race window.', but that isn't required as I explained,
> even though you don't do that in this patch too.
> 
> 	https://marc.info/?t=151030464300003&r=1&w=2
> 
> I will work on V5 if Martin/James thinks it is needed.

I don't buy that it isn't needed.  The point (and the pattern) is for a
destructive action set the signal *before* you execute the action not
after.  The reason should be obvious: if you set it after you invite a
race where the check says OK but the object has gone.  Even if the race
is highly unlikely, the pattern point still holds.

James
Bart Van Assche Dec. 5, 2017, 5:51 p.m. UTC | #9
On Wed, 2017-12-06 at 00:38 +0800, Ming Lei wrote:
> On Tue, Dec 05, 2017 at 04:22:33PM +0000, Bart Van Assche wrote:

> > On Tue, 2017-12-05 at 13:00 +0800, Ming Lei wrote:

> > > No, do not mix two different things in one patch, especially the fix part

> > > need to be backported to stable.

> > > 

> > > The fix part should aim at V4.15, and the other part can be a V4.16

> > > stuff.

> > 

> > Does this mean that you do not plan to post a v5 of your patch and that you

> > want me to rework this patch series? I can do that.

> 

> I believe V4 has been OK for merge, actually the only concern from James

> is that 'set the cmnd to NULL *before* calling free so we narrow the race

> window.', but that isn't required as I explained, even though you don't do

> that in this patch too.


The next version of this patch series will include the sd driver change requested
by James.

Bart.
Ming Lei Dec. 6, 2017, 1:16 a.m. UTC | #10
On Tue, Dec 05, 2017 at 08:43:49AM -0800, James Bottomley wrote:
> On Wed, 2017-12-06 at 00:38 +0800, Ming Lei wrote:
> > On Tue, Dec 05, 2017 at 04:22:33PM +0000, Bart Van Assche wrote:
> > > 
> > > On Tue, 2017-12-05 at 13:00 +0800, Ming Lei wrote:
> > > > 
> > > > No, do not mix two different things in one patch, especially the
> > > > fix part need to be backported to stable.
> > > > 
> > > > The fix part should aim at V4.15, and the other part can be a
> > > > V4.16 stuff.
> > > 
> > > Does this mean that you do not plan to post a v5 of your patch and
> > > that you want me to rework this patch series? I can do that.
> > 
> > I believe V4 has been OK for merge, actually the only concern from
> > James is that 'set the cmnd to NULL *before* calling free so we
> > narrow the race window.', but that isn't required as I explained,
> > even though you don't do that in this patch too.
> > 
> > 	https://marc.info/?t=151030464300003&r=1&w=2
> > 
> > I will work on V5 if Martin/James thinks it is needed.
> 
> I don't buy that it isn't needed.  The point (and the pattern) is for a
> destructive action set the signal *before* you execute the action not
> after.  The reason should be obvious: if you set it after you invite a
> race where the check says OK but the object has gone.  Even if the race

Even you do that, the race is still highly likely there:

1) mempool_free() can be much quicker than scsi_show_rq() because it is
a local free, and scsi_show_rq() can be run from remote CPU wrt. the
allocated 'cmd->cmnd', and access to remote NUMA node should be slower
than mempool_free(), so use-after-free is triggered.

2) any preemption or local IRQ in scsi_show_rq() can make it touch
a freed buffer, and sd_uninit_command() is run from irq context.

3) no any barrier is applied, so the actual write can be reordered
in sd_uninit_command()

So setting the cmd->cmnd as NULL before mempool_free() can't avoid
the use-after-free, scsi_show_rq() has to survive that, then
do we really need to add the unnecessary change in sd_uninit_command()?

Not mention the change will make the debug info disappear too early, is
that what we need?

> is highly unlikely, the pattern point still holds.
diff mbox

Patch

diff --git a/drivers/scsi/scsi_debugfs.c b/drivers/scsi/scsi_debugfs.c
index 01f08c03f2c1..37ed6bb8e6ec 100644
--- a/drivers/scsi/scsi_debugfs.c
+++ b/drivers/scsi/scsi_debugfs.c
@@ -4,13 +4,50 @@ 
 #include <scsi/scsi_dbg.h>
 #include "scsi_debugfs.h"
 
+#define SCSI_CMD_FLAG_NAME(name) [ilog2(SCMD_##name)] = #name
+static const char *const scsi_cmd_flags[] = {
+	SCSI_CMD_FLAG_NAME(TAGGED),
+	SCSI_CMD_FLAG_NAME(UNCHECKED_ISA_DMA),
+	SCSI_CMD_FLAG_NAME(ZONE_WRITE_LOCK),
+	SCSI_CMD_FLAG_NAME(INITIALIZED),
+};
+#undef SCSI_CMD_FLAG_NAME
+
+static int scsi_flags_show(struct seq_file *m, const unsigned long flags,
+			   const char *const *flag_name, int flag_name_count)
+{
+	bool sep = false;
+	int i;
+
+	for (i = 0; i < sizeof(flags) * BITS_PER_BYTE; i++) {
+		if (!(flags & BIT(i)))
+			continue;
+		if (sep)
+			seq_puts(m, "|");
+		sep = true;
+		if (i < flag_name_count && flag_name[i])
+			seq_puts(m, flag_name[i]);
+		else
+			seq_printf(m, "%d", i);
+	}
+	return 0;
+}
+
 void scsi_show_rq(struct seq_file *m, struct request *rq)
 {
 	struct scsi_cmnd *cmd = container_of(scsi_req(rq), typeof(*cmd), req);
-	int msecs = jiffies_to_msecs(jiffies - cmd->jiffies_at_alloc);
-	char buf[80];
+	int alloc_ms = jiffies_to_msecs(jiffies - cmd->jiffies_at_alloc);
+	int timeout_ms = jiffies_to_msecs(rq->timeout);
+	const u8 *const cdb = READ_ONCE(cmd->cmnd);
+	char buf[80] = "(?)";
 
-	__scsi_format_command(buf, sizeof(buf), cmd->cmnd, cmd->cmd_len);
-	seq_printf(m, ", .cmd=%s, .retries=%d, allocated %d.%03d s ago", buf,
-		   cmd->retries, msecs / 1000, msecs % 1000);
+	if (cdb)
+		__scsi_format_command(buf, sizeof(buf), cdb, cmd->cmd_len);
+	seq_printf(m, ", .cmd=%s, .retries=%d, .result = %#x, .flags=", buf,
+		   cmd->retries, cmd->result);
+	scsi_flags_show(m, cmd->flags, scsi_cmd_flags,
+			ARRAY_SIZE(scsi_cmd_flags));
+	seq_printf(m, ", .timeout=%d.%03d, allocated %d.%03d s ago",
+		   timeout_ms / 1000, timeout_ms % 1000,
+		   alloc_ms / 1000, alloc_ms % 1000);
 }