Message ID | 20170112181834.GA8808@lst.de (mailing list archive) |
---|---|
State | New, archived |
> From: Christoph Hellwig [mailto:hch@lst.de]
> Sent: Friday, January 13, 2017 02:19
> To: Dexuan Cui <decui@microsoft.com>
> Cc: linux-block@vger.kernel.org; KY Srinivasan <kys@microsoft.com>; Chris
> Valean (Cloudbase Solutions SRL) <v-chvale@microsoft.com>
> Subject: Re: [Regression] fstrim hangs on Hyper-V: caused by "block: improve
> handling of the magic discard payload"
>
> Next try: (I've also dropped most of the Cc list)
>
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index c35b6de..2f358f7 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1018,7 +1018,10 @@ static int scsi_init_sgtable(struct request *req,
> struct scsi_data_buffer *sdb)
>  	count = blk_rq_map_sg(req->q, req, sdb->table.sgl);
>  	BUG_ON(count > sdb->table.nents);
>  	sdb->table.nents = count;
> -	sdb->length = blk_rq_bytes(req);
> +	if (req->rq_flags & RQF_SPECIAL_PAYLOAD)
> +		sdb->length = req->special_vec.bv_len;
> +	else
> +		sdb->length = blk_rq_bytes(req);
>  	return BLKPREP_OK;
>  }

Hi Christoph,
The patch works like a charm!
fstrim can work now.
Chris may help to do more test.

FWIW:
If (req->rq_flags & RQF_SPECIAL_PAYLOAD) is true,
req->special_vec.bv_len is always 24 in my test.

Thanks really a lot for your quick patch! :-)

Can the patch make it into v4.10?
IMO It's a really important fix.

Thanks,
-- Dexuan
--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
> From: Dexuan Cui
> Sent: Friday, January 13, 2017 11:05
> To: 'Christoph Hellwig' <hch@lst.de>
> Cc: linux-block@vger.kernel.org; KY Srinivasan <kys@microsoft.com>; Chris
> Valean (Cloudbase Solutions SRL) <v-chvale@microsoft.com>
> Subject: RE: [Regression] fstrim hangs on Hyper-V: caused by "block: improve
> handling of the magic discard payload"
>
> > From: Christoph Hellwig [mailto:hch@lst.de]
> > Sent: Friday, January 13, 2017 02:19
> > To: Dexuan Cui <decui@microsoft.com>
> > Cc: linux-block@vger.kernel.org; KY Srinivasan <kys@microsoft.com>; Chris
> > Valean (Cloudbase Solutions SRL) <v-chvale@microsoft.com>
> > Subject: Re: [Regression] fstrim hangs on Hyper-V: caused by "block: improve
> > handling of the magic discard payload"
> >
> > Next try: (I've also dropped most of the Cc list)
> >
> > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > index c35b6de..2f358f7 100644
> > --- a/drivers/scsi/scsi_lib.c
> > +++ b/drivers/scsi/scsi_lib.c
> > @@ -1018,7 +1018,10 @@ static int scsi_init_sgtable(struct request *req,
> > struct scsi_data_buffer *sdb)
> >  	count = blk_rq_map_sg(req->q, req, sdb->table.sgl);
> >  	BUG_ON(count > sdb->table.nents);
> >  	sdb->table.nents = count;
> > -	sdb->length = blk_rq_bytes(req);
> > +	if (req->rq_flags & RQF_SPECIAL_PAYLOAD)
> > +		sdb->length = req->special_vec.bv_len;
> > +	else
> > +		sdb->length = blk_rq_bytes(req);
> >  	return BLKPREP_OK;
> >  }
>
> Hi Christoph,
> The patch works like a charm!
> fstrim can work now.
> Chris may help to do more test.
>
> FWIW:
> If (req->rq_flags & RQF_SPECIAL_PAYLOAD) is true,
> req->special_vec.bv_len is always 24 in my test.
>
> Thanks really a lot for your quick patch! :-)
>
> Can the patch make it into v4.10?
> IMO It's a really important fix.
>
> Thanks,
> -- Dexuan

FYI: I did more tests and the patch worked just great!

BTW, fstrim/mkfs are not the only affected tools: I put a WARN_ON before
the new line and found python too (see the below calltrace).

IMO this means not only SCSI Unmap command is affected, but
some other SCSI commands can be affected too?
And it looks the bare metal can be affected too?

Thanks,
-- Dexuan

//Dexuan: in this case: req->special_vec.bv_len is 24 and
// blk_rq_bytes(req) is 4096.
[ 17.862939] CPU: 2 PID: 1430 Comm: python3 Tainted: G W 4.10.0-rc3+ #1
[ 17.862940] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006 05/23/2012
[ 17.862941] Call Trace:
[ 17.862947] dump_stack+0x63/0x90
[ 17.862952] __warn+0xcb/0xf0
[ 17.862954] warn_slowpath_fmt+0x5f/0x80
[ 17.862955] scsi_init_sgtable+0x92/0xc0
[ 17.862956] scsi_init_io+0x4f/0x1e0
[ 17.862959] sd_init_command+0x55b/0xdb0
[ 17.862963] ? scsi_host_alloc_command+0x44/0xc0
[ 17.862965] scsi_setup_cmnd+0xf0/0x150
[ 17.862966] scsi_prep_fn+0xef/0x170
[ 17.862968] blk_peek_request+0x180/0x2b0
[ 17.862970] scsi_request_fn+0x3e/0x620
[ 17.862973] ? elv_rb_add+0x61/0x70
[ 17.862977] ? deadline_add_request+0x36/0x80
[ 17.862978] __blk_run_queue+0x33/0x40
[ 17.862979] blk_queue_bio+0x3c8/0x3e0
[ 17.862980] generic_make_request+0xf2/0x1d0
[ 17.862981] submit_bio+0x73/0x150
[ 17.862985] submit_bh_wbc+0x14c/0x180
[ 17.862987] ll_rw_block+0x78/0xb0
[ 17.862988] __block_write_begin_int+0x4d6/0x5c0
[ 17.863002] ? ext4_inode_attach_jinode.part.67+0xb0/0xb0
[ 17.863004] ? ext4_da_write_begin+0x122/0x400
[ 17.863006] __block_write_begin+0x11/0x20
[ 17.863007] ext4_da_write_begin+0x178/0x400
[ 17.863012] generic_perform_write+0xc9/0x1c0
[ 17.863015] ? file_update_time+0xc8/0x110
[ 17.863017] __generic_file_write_iter+0x1a6/0x1f0
[ 17.863020] ext4_file_write_iter+0x89/0x370
[ 17.863023] ? _copy_to_user+0x2e/0x40
[ 17.863026] ? cp_new_stat+0x153/0x180
[ 17.863030] __vfs_write+0xe3/0x160
[ 17.863031] vfs_write+0xb8/0x1b0
[ 17.863032] SyS_write+0x55/0xc0
[ 17.863036] entry_SYSCALL_64_fastpath+0x1e/0xad
[ 17.863037] RIP: 0033:0x7f27314bf4bd
On Fri, Jan 13, 2017 at 06:16:02AM +0000, Dexuan Cui wrote:
> IMO this means not only SCSI Unmap command is affected, but
> some other SCSI commands can be affected too?
> And it looks the bare metal can be affected too?

This affects all drivers looking at the sdb.length field for the total
I/O length - many drivers don't need it but just the SGL, including both
that I tested this change on - one being virtualized and one bare metal.

It also only affects commands where the data transfer length is different
from the length of the written blocks, so only affects WRITE SAME and
UNMAP commands, used for discard or zeroing.

I'll submit a cleaned up version with a proper block layer helper today.

Thanks for reporting and debugging this issue!
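As an illustration of how the data transfer length can differ from the length of the affected blocks: an UNMAP command transfers only a short parameter list describing the range, and the 24 bytes reported earlier in the thread match one 8-byte header plus one 16-byte block descriptor. The following is a minimal userspace sketch of that layout, not kernel code; the function name `build_unmap_payload` and the constant names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sizes of the UNMAP parameter list parts (names are illustrative). */
#define UNMAP_HEADER_LEN      8u
#define UNMAP_DESCRIPTOR_LEN 16u

/* Store integers in big-endian byte order, as SCSI parameter data requires. */
static void put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)v;
}

static void put_be32(uint8_t *p, uint32_t v)
{
	for (int i = 0; i < 4; i++)
		p[i] = (uint8_t)(v >> (24 - 8 * i));
}

static void put_be64(uint8_t *p, uint64_t v)
{
	for (int i = 0; i < 8; i++)
		p[i] = (uint8_t)(v >> (56 - 8 * i));
}

/*
 * Hypothetical helper: fill buf with an UNMAP parameter list holding a
 * single block descriptor. Returns the number of bytes written, or 0 if
 * buf is too small.
 */
static size_t build_unmap_payload(uint8_t *buf, size_t buf_len,
				  uint64_t lba, uint32_t nr_blocks)
{
	const size_t len = UNMAP_HEADER_LEN + UNMAP_DESCRIPTOR_LEN;

	assert(buf != NULL);
	if (buf_len < len)
		return 0;

	memset(buf, 0, len);
	put_be16(&buf[0], (uint16_t)(len - 2));  /* bytes following this field */
	put_be16(&buf[2], UNMAP_DESCRIPTOR_LEN); /* total descriptor bytes */
	put_be64(&buf[8], lba);                  /* first block to unmap */
	put_be32(&buf[16], nr_blocks);           /* number of blocks to unmap */
	return len;
}
```

Calling `build_unmap_payload(buf, sizeof(buf), 4096, 2048)` produces a 24-byte list, while with 512-byte blocks the range it describes covers 2048 * 512 = 1048576 bytes. A driver that reads the total I/O length rather than the payload length would therefore see a much larger number than what is actually transferred.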
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index c35b6de..2f358f7 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1018,7 +1018,10 @@ static int scsi_init_sgtable(struct request *req, struct scsi_data_buffer *sdb)
 	count = blk_rq_map_sg(req->q, req, sdb->table.sgl);
 	BUG_ON(count > sdb->table.nents);
 	sdb->table.nents = count;
-	sdb->length = blk_rq_bytes(req);
+	if (req->rq_flags & RQF_SPECIAL_PAYLOAD)
+		sdb->length = req->special_vec.bv_len;
+	else
+		sdb->length = blk_rq_bytes(req);
 	return BLKPREP_OK;
 }
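
The length selection in this patch could be factored into a single helper, which is roughly what a "proper block layer helper" might look like. The sketch below is a userspace model under stated assumptions, not the kernel implementation: the `*_model` structs are simplified stand-ins for the kernel types, `data_len` stands in for whatever `blk_rq_bytes()` would return, the flag value is arbitrary, and the names `rq_payload_bytes` and `init_sdb_length` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag value; the real kernel definition is not modelled here. */
#define RQF_SPECIAL_PAYLOAD (1u << 0)

/* Simplified stand-ins carrying only the fields the patch touches. */
struct bio_vec_model {
	unsigned int bv_len;      /* bytes in the special payload */
};

struct request_model {
	unsigned int rq_flags;
	unsigned int data_len;    /* stands in for blk_rq_bytes(req) */
	struct bio_vec_model special_vec;
};

struct sdb_model {
	unsigned int length;      /* bytes actually transferred to the device */
};

/*
 * Hypothetical helper: return the number of bytes transferred for a
 * request. For special-payload requests (discard, zeroing) that is the
 * payload size, not the size of the block range being operated on.
 */
static unsigned int rq_payload_bytes(const struct request_model *rq)
{
	assert(rq != NULL);
	if (rq->rq_flags & RQF_SPECIAL_PAYLOAD)
		return rq->special_vec.bv_len;
	return rq->data_len;
}

/* Mirrors the assignment the patch changes in scsi_init_sgtable(). */
static void init_sdb_length(struct sdb_model *sdb,
			    const struct request_model *rq)
{
	sdb->length = rq_payload_bytes(rq);
}
```

With the values reported in this thread, a request flagged `RQF_SPECIAL_PAYLOAD` with a 24-byte payload and a 4096-byte range yields a length of 24, while an unflagged 4096-byte request yields 4096. Keeping the choice in one place would let other callers that need the transfer length avoid repeating the flag check.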