[4.10,panic,regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

Message ID 20161223100014.GA29467@lst.de (mailing list archive)
State New, archived

Commit Message

Christoph Hellwig Dec. 23, 2016, 10 a.m. UTC
On Thu, Dec 22, 2016 at 04:03:56PM -0800, Chris Leech wrote:
> Of course, looks like I've screwed up my bisect run on this so I'm still
> taking a look.  It triggers for me with 'hdparm -B /dev/vda' but may
> also depend on kernel configuration.
> 
> I started with the fedora rawhide config with a lot of debug on and
> trimmed it down with a localmodconfig run in the VM to speed up rebuilds.

I think the configuration dependency is CONFIG_HAVE_ARCH_VMAP_STACK.
I've just reproduced the issue with it, and the backtrace points to
__virtblk_add_req when setting up the sense buffer.   And it turns out
that blk_execute_rq tries to do I/O to the on-stack sense buffer.
At least SCSI always has a kmalloced sense buffer, so I guess we'll
need something similar for virtio_blk for now.  For 4.11 I plan to
rework how BLOCK_PC commands work entirely, so hopefully we can make
the sense buffer handling a lot less wasteful.

---
From 0a77bc424ed907c1e99b4756bb498370b498183a Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Fri, 23 Dec 2016 10:57:06 +0100
Subject: virtio_blk: avoid DMA to stack for the sense buffer

Most users of BLOCK_PC requests allocate the sense buffer on the stack,
so to avoid DMA to the stack copy them to a field in the heap allocated
virtblk_req structure.  Without that any attempt at SCSI passthrough I/O,
including the SG_IO ioctl from userspace will crash the kernel.  Note that
this includes running tools like hdparm even when the host does not have
SCSI passthrough enabled.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/virtio_blk.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Linus Torvalds Dec. 23, 2016, 7:42 p.m. UTC | #1
On Fri, Dec 23, 2016 at 2:00 AM, Christoph Hellwig <hch@lst.de> wrote:
>
> From: Christoph Hellwig <hch@lst.de>
> Date: Fri, 23 Dec 2016 10:57:06 +0100
> Subject: virtio_blk: avoid DMA to stack for the sense buffer
>
> Most users of BLOCK_PC requests allocate the sense buffer on the stack,
> so to avoid DMA to the stack copy them to a field in the heap allocated
> virtblk_req structure.  Without that any attempt at SCSI passthrough I/O,
> including the SG_IO ioctl from userspace will crash the kernel.  Note that
> this includes running tools like hdparm even when the host does not have
> SCSI passthrough enabled.

Ugh. This patch is nasty.

I think we should just fix blk_execute_rq() instead.

But from a quick look, we also have at least sg_scsi_ioctl() and
sg_io() doing the same thing.

And the SG_IO thing in bsg_ioctl(). And spi_execute() in scsi_transport_spi.c

And resp_requests() in scsi_debug.c.

So I guess ugly it may need to be, and the rule is that the sense
buffer really can be on the stack and you can't DMA to/from it.
Comments from others?

                Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jens Axboe Dec. 24, 2016, 2:45 a.m. UTC | #2
On 12/23/2016 12:42 PM, Linus Torvalds wrote:
> On Fri, Dec 23, 2016 at 2:00 AM, Christoph Hellwig <hch@lst.de> wrote:
>>
>> From: Christoph Hellwig <hch@lst.de>
>> Date: Fri, 23 Dec 2016 10:57:06 +0100
>> Subject: virtio_blk: avoid DMA to stack for the sense buffer
>>
>> Most users of BLOCK_PC requests allocate the sense buffer on the stack,
>> so to avoid DMA to the stack copy them to a field in the heap allocated
>> virtblk_req structure.  Without that any attempt at SCSI passthrough I/O,
>> including the SG_IO ioctl from userspace will crash the kernel.  Note that
>> this includes running tools like hdparm even when the host does not have
>> SCSI passthrough enabled.
> 
> Ugh. This patch is nasty.
> 
> I think we should just fix blk_execute_rq() instead.
> 
> But from a quick look, we also have at least sg_scsi_ioctl() and
> sg_io() doing the same thing.
> 
> And the SG_IO thing in bsg_ioctl(). And spi_execute() in scsi_transport_spi.c
> 
> And resp_requests() in scsi_debug.c.

It's not that it's technically hard to fix up, it's more that it's a
pain in the ass to have to do it. For instance, for blk_execute_rq(), we
either should enforce that the caller allocates it dynamically and then
frees it, or we need a nasty hack where the caller needs to know he has to
free it. Pretty obvious what I would prefer there.

And yes, there would be a good chunk of other places where this would
need to be fixed up...

> So I guess ugly it may need to be, and the rule is that the sense
> buffer really can be on the stack and you can't DMA to/from it.
> Comments from others?

I'm just wondering why this is being hit now, we have a 4.9 release with
this issue and nobody reported it (that I saw)... Which is pretty sad.

If no one beats me to it, I'll try and get a patch done on Sunday. We're
in the midst of the holidays here.
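
A minimal userspace sketch of the first option described above, where the caller allocates the sense buffer dynamically and frees it afterwards. This is only an illustration: `model_request`, `model_execute_rq()` and `model_passthrough()` are hypothetical stand-ins rather than kernel APIs, and the buffer size of 96 is assumed to mirror SCSI_SENSE_BUFFERSIZE.

```c
#include <assert.h>
#include <stdlib.h>

#define SENSE_BUFFERSIZE 96	/* assumed to mirror SCSI_SENSE_BUFFERSIZE */

/* Hypothetical stand-in for the sense-related fields of struct request. */
struct model_request {
	unsigned char *sense;		/* caller-provided sense buffer */
	unsigned int sense_len;		/* bytes of sense data written */
};

/* Hypothetical stand-in for blk_execute_rq(): the "device" writes sense data. */
static int model_execute_rq(struct model_request *rq)
{
	assert(rq && rq->sense);
	rq->sense[0] = 0x70;	/* fixed-format sense response code */
	rq->sense_len = 1;
	return 0;
}

/* The caller owns the buffer: allocate on the heap, execute, inspect, free. */
static int model_passthrough(unsigned char *first_sense_byte)
{
	struct model_request rq;
	int ret;

	rq.sense = calloc(1, SENSE_BUFFERSIZE);
	if (!rq.sense)
		return -1;
	rq.sense_len = 0;

	ret = model_execute_rq(&rq);
	if (ret == 0 && rq.sense_len > 0)
		*first_sense_byte = rq.sense[0];

	free(rq.sense);		/* allocated and freed in the same function */
	return ret;
}
```

The point of the sketch is only the ownership rule: the buffer never lives on the caller's stack, and the function that allocates it is also the one that frees it.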
Christoph Hellwig Dec. 24, 2016, 9:49 a.m. UTC | #3
On Fri, Dec 23, 2016 at 07:45:45PM -0700, Jens Axboe wrote:
> It's not that it's technically hard to fix up, it's more that it's a
> pain in the ass to have to do it. For instance, for blk_execute_rq(), we
> either should enforce that the caller allocates it dynamically and then
> frees it, or we need a nasty hack where the caller needs to know he has to
> free it. Pretty obvious what I would prefer there.
> 
> And yes, there would be a good chunk of other places where this would
> need to be fixed up...

My planned rework for the BLOCK_PC code (split all fields for them out
of struct request and move them into a separate, driver-allocated structure)
would fix this up as a side-effect.  I really wanted to get it into 4.10,
but I didn't manage to fix it up.  I'll try to get it into 4.11 early.
Christoph Hellwig Dec. 24, 2016, 10:07 a.m. UTC | #4
On Fri, Dec 23, 2016 at 11:42:45AM -0800, Linus Torvalds wrote:
> Ugh. This patch is nasty.

It's the same thing SCSI has done for ages - except that it uses a separate
kmalloc for the sense buffer.

> I think we should just fix blk_execute_rq() instead.

As you found out below it's not just blk_execute_rq, it's the whole
architecture of the BLOCK_PC code, which expects a caller provided
sense buffer.  But with the way blk-mq allocates request structures
we can actually fix it, but I first need to extend the way it allows
drivers to allocate private data to the old request code.  I've
actually already implemented that for SCSI a long time ago, and have
started to lift it to the block layer.

Once that is done the callers won't need a sense buffer at all, and
can just look at the driver provided one.  Which currently is missing
in virtio-blk, so we'd need something similar to the above patch
anyway.
Hannes Reinecke Dec. 24, 2016, 1:17 p.m. UTC | #5
On 12/24/2016 11:07 AM, Christoph Hellwig wrote:
> On Fri, Dec 23, 2016 at 11:42:45AM -0800, Linus Torvalds wrote:
>> Ugh. This patch is nasty.
>
> It's the same thing SCSI has done for ages - except that it uses a separate
> kmalloc for the sense buffer.
>
>> I think we should just fix blk_execute_rq() instead.
>
> As you found out below it's not just blk_execute_rq, it's the whole
> architecture of the BLOCK_PC code, which expects a caller provided
> sense buffer.  But with the way blk-mq allocates request structures
> we can actually fix it, but I first need to extend the way it allows
> drivers to allocate private data to the old request code.  I've
> actually already implemented that for SCSI a long time ago, and have
> started to lift it to the block layer.
>
Would be cool to have a generic sense buffer.
I always found it slightly odd, pretending that 'struct request' is 
protocol-agnostic and refusing to add a sense data pointer, but at the 
same time having a field 'sense_len' (which gives the length of what 
exactly?).

Christoph, do you have a pointer to your patchset?
Not that I'll be able to do any meaningful work until next year, but 
having a look would be nice. Just to get a feeling where you want to 
head to; I might be able to work on this start of January.

Cheers,

Hannes
Christoph Hellwig Dec. 24, 2016, 1:19 p.m. UTC | #6
On Sat, Dec 24, 2016 at 02:17:26PM +0100, Hannes Reinecke wrote:
> Christoph, do you have a pointer to your patchset?
> Not that I'll be able to do any meaningful work until next year, but having 
> a look would be nice. Just to get a feeling where you want to head to; I 
> might be able to work on this start of January.

I'll push out a branch once it's reviewable and not my current unbisectable
mess, should be soon.
Christoph Hellwig Jan. 4, 2017, 2:07 p.m. UTC | #7
On Sat, Dec 24, 2016 at 02:17:26PM +0100, Hannes Reinecke wrote:
> Christoph, do you have a pointer to your patchset?

Here is a pointer to the current one after splitting it into properly
bisectable chunks.  Besides proper changelogs the biggest item left
is fixing up dm-mpath to not allocate its own request structures.

http://git.infradead.org/users/hch/block.git/shortlog/refs/heads/block-pc-refactor
Patch

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 5545a67..3c3b8f6 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -56,6 +56,7 @@  struct virtblk_req {
 	struct virtio_blk_outhdr out_hdr;
 	struct virtio_scsi_inhdr in_hdr;
 	u8 status;
+	u8 sense[SCSI_SENSE_BUFFERSIZE];
 	struct scatterlist sg[];
 };
 
@@ -102,7 +103,8 @@  static int __virtblk_add_req(struct virtqueue *vq,
 	}
 
 	if (type == cpu_to_virtio32(vq->vdev, VIRTIO_BLK_T_SCSI_CMD)) {
-		sg_init_one(&sense, vbr->req->sense, SCSI_SENSE_BUFFERSIZE);
+		memcpy(vbr->sense, vbr->req->sense, SCSI_SENSE_BUFFERSIZE);
+		sg_init_one(&sense, vbr->sense, SCSI_SENSE_BUFFERSIZE);
 		sgs[num_out + num_in++] = &sense;
 		sg_init_one(&inhdr, &vbr->in_hdr, sizeof(vbr->in_hdr));
 		sgs[num_out + num_in++] = &inhdr;
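
The bounce-buffer idea in the hunk above can be modelled in plain userspace C. This is a rough sketch under stated assumptions, not the driver code: `model_request`, `model_vbr`, `model_add_req()` and `model_complete()` are hypothetical names, the size of 96 is assumed to mirror SCSI_SENSE_BUFFERSIZE, and the copy-back on completion is an assumption that does not appear in the hunk shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SENSE_BUFFERSIZE 96	/* assumed to mirror SCSI_SENSE_BUFFERSIZE */

/* Hypothetical stand-in for struct request: sense points at caller memory,
 * which for most BLOCK_PC users lives on the caller's stack. */
struct model_request {
	unsigned char *sense;
};

/* Hypothetical stand-in for the heap-allocated struct virtblk_req. */
struct model_vbr {
	struct model_request *req;
	unsigned char sense[SENSE_BUFFERSIZE];	/* driver-owned bounce buffer */
};

/* Mirrors the patched step: copy the caller's buffer into the driver-owned
 * field and return that field as the memory the device would be given. */
static unsigned char *model_add_req(struct model_vbr *vbr)
{
	assert(vbr && vbr->req && vbr->req->sense);
	memcpy(vbr->sense, vbr->req->sense, SENSE_BUFFERSIZE);
	return vbr->sense;
}

/* Assumed completion step (not part of the hunk above): copy the sense data
 * the device wrote back to the caller's buffer. */
static void model_complete(struct model_vbr *vbr)
{
	memcpy(vbr->req->sense, vbr->sense, SENSE_BUFFERSIZE);
}
```

In this model the device only ever sees `vbr->sense`, so the caller's on-stack buffer is never a DMA target; the caller only sees the result if something like `model_complete()` runs afterwards.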