Message ID | 20190617122000.22181-3-hch@lst.de (mailing list archive) |
---|---|
State | Not Applicable |
Series | [1/8] scsi: add a host / host template field for the virt boundary |
On 6/17/19 5:19 AM, Christoph Hellwig wrote:
> We need to limit the devices max_sectors to what the DMA mapping
> implementation can support.  If not we risk running out of swiotlb
> buffers easily.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  drivers/scsi/scsi_lib.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index d333bb6b1c59..f233bfd84cd7 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1768,6 +1768,8 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
>  		blk_queue_max_integrity_segments(q, shost->sg_prot_tablesize);
>  	}
>
> +	shost->max_sectors = min_t(unsigned int, shost->max_sectors,
> +			dma_max_mapping_size(dev) << SECTOR_SHIFT);
>  	blk_queue_max_hw_sectors(q, shost->max_sectors);
>  	if (shost->unchecked_isa_dma)
>  		blk_queue_bounce_limit(q, BLK_BOUNCE_ISA);

Does dma_max_mapping_size() return a value in bytes? Is
shost->max_sectors a number of sectors? If so, are you sure that "<<
SECTOR_SHIFT" is the proper conversion? Shouldn't that be ">>
SECTOR_SHIFT" instead?

Additionally, how about adding a comment above dma_max_mapping_size()
that documents the unit of the returned number?

Thanks,

Bart.

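The unit question raised above can be checked with a small userspace sketch. It assumes what the question implies: that dma_max_mapping_size() reports bytes and that max_sectors counts 512-byte sectors (a SECTOR_SHIFT of 9). The helpers bytes_to_sectors(), clamp_max_sectors(), clamp_max_sectors_left_shift() and self_check(), and the 256 KiB limit, are hypothetical names and values chosen for illustration; none of them are kernel APIs.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SHIFT 9 /* assumed: 512-byte sectors */

/* Hypothetical helper: convert a byte count to whole sectors, rounding down
 * and saturating so that an "unlimited" SIZE_MAX cannot wrap around. */
static unsigned int bytes_to_sectors(size_t bytes)
{
	size_t sectors = bytes >> SECTOR_SHIFT;

	return sectors > UINT_MAX ? UINT_MAX : (unsigned int)sectors;
}

/* Hypothetical helper: clamp a sector limit to a DMA limit given in bytes. */
static unsigned int clamp_max_sectors(unsigned int max_sectors,
				      size_t dma_max_bytes)
{
	unsigned int dma_sectors = bytes_to_sectors(dma_max_bytes);

	return max_sectors < dma_sectors ? max_sectors : dma_sectors;
}

/* The shift direction used in the posted hunk, kept only for comparison. */
static unsigned int clamp_max_sectors_left_shift(unsigned int max_sectors,
						 size_t dma_max_bytes)
{
	unsigned int shifted = (unsigned int)(dma_max_bytes << SECTOR_SHIFT);

	return max_sectors < shifted ? max_sectors : shifted;
}

/* Small self-check contrasting the two shift directions. */
static void self_check(void)
{
	/* 256 KiB is 512 sectors, so a 1024-sector limit must shrink. */
	assert(clamp_max_sectors(1024, 256 * 1024) == 512);
	/* Shifting left inflates the byte count, so nothing is clamped. */
	assert(clamp_max_sectors_left_shift(1024, 256 * 1024) == 1024);
}
```

With a right shift the 256 KiB byte limit becomes 512 sectors and the host limit is reduced; with a left shift the value grows instead, so the clamp never takes effect for realistic limits.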
On Tue, Jun 18, 2019 at 4:57 AM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On 6/17/19 5:19 AM, Christoph Hellwig wrote:
> > We need to limit the devices max_sectors to what the DMA mapping
> > implementation can support.  If not we risk running out of swiotlb
> > buffers easily.
> >
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > ---
> >  drivers/scsi/scsi_lib.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > index d333bb6b1c59..f233bfd84cd7 100644
> > --- a/drivers/scsi/scsi_lib.c
> > +++ b/drivers/scsi/scsi_lib.c
> > @@ -1768,6 +1768,8 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
> >  		blk_queue_max_integrity_segments(q, shost->sg_prot_tablesize);
> >  	}
> >
> > +	shost->max_sectors = min_t(unsigned int, shost->max_sectors,
> > +			dma_max_mapping_size(dev) << SECTOR_SHIFT);
> >  	blk_queue_max_hw_sectors(q, shost->max_sectors);
> >  	if (shost->unchecked_isa_dma)
> >  		blk_queue_bounce_limit(q, BLK_BOUNCE_ISA);
>
> Does dma_max_mapping_size() return a value in bytes? Is
> shost->max_sectors a number of sectors? If so, are you sure that "<<
> SECTOR_SHIFT" is the proper conversion? Shouldn't that be ">>
> SECTOR_SHIFT" instead?

Now the patch has been committed, '<< SECTOR_SHIFT' needs to be fixed.

Also the following kernel oops is triggered on qemu, and looks
device->dma_mask is NULL.

[    5.826483] scsi host0: Virtio SCSI HBA
[    5.829302] st: Version 20160209, fixed bufsize 32768, s/g segs 256
[    5.831042] SCSI Media Changer driver v0.25
[    5.832491] ==================================================================
[    5.833332] BUG: KASAN: null-ptr-deref in dma_direct_max_mapping_size+0x30/0x94
[    5.833332] Read of size 8 at addr 0000000000000000 by task kworker/u17:0/7
[    5.835506] nvme nvme0: pci function 0000:00:07.0
[    5.833332]
[    5.833332] CPU: 2 PID: 7 Comm: kworker/u17:0 Not tainted 5.3.0-rc1 #1328
[    5.836999] ahci 0000:00:1f.2: version 3.0
[    5.833332] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS ?-20180724_192412-buildhw-07.phx4
[    5.833332] Workqueue: events_unbound async_run_entry_fn
[    5.833332] Call Trace:
[    5.833332]  dump_stack+0x6f/0x9d
[    5.833332]  ? dma_direct_max_mapping_size+0x30/0x94
[    5.833332]  __kasan_report+0x161/0x189
[    5.833332]  ? dma_direct_max_mapping_size+0x30/0x94
[    5.833332]  kasan_report+0xe/0x12
[    5.833332]  dma_direct_max_mapping_size+0x30/0x94
[    5.833332]  __scsi_init_queue+0xd8/0x1f3
[    5.833332]  scsi_mq_alloc_queue+0x62/0x89
[    5.833332]  scsi_alloc_sdev+0x38c/0x479
[    5.833332]  scsi_probe_and_add_lun+0x22d/0x1093
[    5.833332]  ? kobject_set_name_vargs+0xa4/0xb2
[    5.833332]  ? mutex_lock+0x88/0xc4
[    5.833332]  ? scsi_free_host_dev+0x4a/0x4a
[    5.833332]  ? _raw_spin_lock_irqsave+0x8c/0xde
[    5.833332]  ? _raw_write_unlock_irqrestore+0x23/0x23
[    5.833332]  ? ata_tdev_match+0x22/0x45
[    5.833332]  ? attribute_container_add_device+0x160/0x17e
[    5.833332]  ? rpm_resume+0x26a/0x7c0
[    5.833332]  ? kobject_get+0x12/0x43
[    5.833332]  ? rpm_put_suppliers+0x7e/0x7e
[    5.833332]  ? _raw_spin_lock_irqsave+0x8c/0xde
[    5.833332]  ? _raw_write_unlock_irqrestore+0x23/0x23
[    5.833332]  ? scsi_target_destroy+0x135/0x135
[    5.833332]  __scsi_scan_target+0x14b/0x6aa
[    5.833332]  ? pvclock_clocksource_read+0xc0/0x14e
[    5.833332]  ? scsi_add_device+0x20/0x20
[    5.833332]  ? rpm_resume+0x1ae/0x7c0
[    5.833332]  ? rpm_put_suppliers+0x7e/0x7e
[    5.833332]  ? _raw_spin_lock_irqsave+0x8c/0xde
[    5.833332]  ? _raw_write_unlock_irqrestore+0x23/0x23
[    5.833332]  ? pick_next_task_fair+0x976/0xa3d
[    5.833332]  ? mutex_lock+0x88/0xc4
[    5.833332]  scsi_scan_channel+0x76/0x9e
[    5.833332]  scsi_scan_host_selected+0x131/0x176
[    5.833332]  ? scsi_scan_host+0x241/0x241
[    5.833332]  do_scan_async+0x27/0x219
[    5.833332]  ? scsi_scan_host+0x241/0x241
[    5.833332]  async_run_entry_fn+0xdc/0x23d
[    5.833332]  process_one_work+0x327/0x539
[    5.833332]  worker_thread+0x330/0x492
[    5.833332]  ? rescuer_thread+0x41f/0x41f
[    5.833332]  kthread+0x1c6/0x1d5
[    5.833332]  ? kthread_park+0xd3/0xd3
[    5.833332]  ret_from_fork+0x1f/0x30
[    5.833332] ==================================================================

Thanks,
Ming Lei

On Sun, Jul 21, 2019 at 11:01 PM Ming Lei <tom.leiming@gmail.com> wrote:
>
> On Tue, Jun 18, 2019 at 4:57 AM Bart Van Assche <bvanassche@acm.org> wrote:
> >
> > On 6/17/19 5:19 AM, Christoph Hellwig wrote:
> > > We need to limit the devices max_sectors to what the DMA mapping
> > > implementation can support.  If not we risk running out of swiotlb
> > > buffers easily.
> > >
> > > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > > ---
> > >  drivers/scsi/scsi_lib.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > > index d333bb6b1c59..f233bfd84cd7 100644
> > > --- a/drivers/scsi/scsi_lib.c
> > > +++ b/drivers/scsi/scsi_lib.c
> > > @@ -1768,6 +1768,8 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
> > >  		blk_queue_max_integrity_segments(q, shost->sg_prot_tablesize);
> > >  	}
> > >
> > > +	shost->max_sectors = min_t(unsigned int, shost->max_sectors,
> > > +			dma_max_mapping_size(dev) << SECTOR_SHIFT);
> > >  	blk_queue_max_hw_sectors(q, shost->max_sectors);
> > >  	if (shost->unchecked_isa_dma)
> > >  		blk_queue_bounce_limit(q, BLK_BOUNCE_ISA);
> >
> > Does dma_max_mapping_size() return a value in bytes? Is
> > shost->max_sectors a number of sectors? If so, are you sure that "<<
> > SECTOR_SHIFT" is the proper conversion? Shouldn't that be ">>
> > SECTOR_SHIFT" instead?
>
> Now the patch has been committed, '<< SECTOR_SHIFT' needs to be fixed.
>
> Also the following kernel oops is triggered on qemu, and looks
> device->dma_mask is NULL.
>
> Ming Lei

FYI: we also see the panic with a Linux kernel 5.2.0-next-20190719 running on Hyper-V:

[    7.429053] RIP: 0010:dma_direct_max_mapping_size+0x26/0x80
[    7.429053] Code: 0f b6 c0 c3 0f 1f 44 00 00 55 48 89 e5 41 54 53 48 89 fb e8 4c 14 00 00 84 c0 74 45 48 8b 83 28 02 00 00 4c 8b a3 38 02 00 00 <48> 8b 00 48 85 c0 74 0c 4d 85 e4 74 36 49 39 c4 4c 0f 47 e0 48 89
[    7.429053] RSP: 0018:ffffc1d5005efbc0 EFLAGS: 00010202
[    7.429053] RAX: 0000000000000000 RBX: ffff9cf86d24c428 RCX: 0000000000000000
[    7.429053] RDX: ffff9cf86d12dd00 RSI: 0000000000000200 RDI: ffff9cf86d24c428
[    7.429053] RBP: ffffc1d5005efbd0 R08: ffff9cf86fcaf0e0 R09: ffff9cf86e0072c0
[    7.429053] R10: ffffc1d5005efa70 R11: 00000000000301a0 R12: 0000000000000000
[    7.429053] R13: ffff9cf86d24c428 R14: 0000000000000400 R15: ffff9cf825cff000
[    7.429053] FS:  0000000000000000(0000) GS:ffff9cf86fc80000(0000) knlGS:0000000000000000
[    7.429053] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    7.429053] CR2: 0000000000000000 CR3: 00000003c700a001 CR4: 00000000003606e0
[    7.456569] NET: Registered protocol family 17
[    7.429053] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    7.469803] Key type dns_resolver registered
[    7.429053] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    7.429053] Call Trace:
[    7.429053]  dma_max_mapping_size+0x39/0x50
[    7.429053]  __scsi_init_queue+0x7f/0x140
[    7.429053]  scsi_mq_alloc_queue+0x38/0x60
[    7.429053]  scsi_alloc_sdev+0x1da/0x2b0
[    7.429053]  scsi_probe_and_add_lun+0x471/0xe60
[    7.429053]  __scsi_scan_target+0xfc/0x610
[    7.429053]  scsi_scan_channel+0x66/0xa0
[    7.429053]  scsi_scan_host_selected+0xf3/0x160
[    7.429053]  do_scsi_scan_host+0x93/0xa0
[    7.429053]  do_scan_async+0x1c/0x190
[    7.429053]  async_run_entry_fn+0x3c/0x150
[    7.429053]  process_one_work+0x1f7/0x3f0
[    7.429053]  worker_thread+0x34/0x400
[    7.429053]  kthread+0x121/0x140
[    7.429053]  ret_from_fork+0x35/0x40
[    7.429053] Modules linked in:
[    7.429053] CR2: 0000000000000000
[    7.766122] BUG: kernel NULL pointer dereference, address: 0000000000000000

Thanks,
-- Dexuan

On 2019/07/22 15:01, Ming Lei wrote:
> On Tue, Jun 18, 2019 at 4:57 AM Bart Van Assche <bvanassche@acm.org> wrote:
>>
>> On 6/17/19 5:19 AM, Christoph Hellwig wrote:
>>> We need to limit the devices max_sectors to what the DMA mapping
>>> implementation can support.  If not we risk running out of swiotlb
>>> buffers easily.
>>>
>>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>>> ---
>>>  drivers/scsi/scsi_lib.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>>> index d333bb6b1c59..f233bfd84cd7 100644
>>> --- a/drivers/scsi/scsi_lib.c
>>> +++ b/drivers/scsi/scsi_lib.c
>>> @@ -1768,6 +1768,8 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
>>>  		blk_queue_max_integrity_segments(q, shost->sg_prot_tablesize);
>>>  	}
>>>
>>> +	shost->max_sectors = min_t(unsigned int, shost->max_sectors,
>>> +			dma_max_mapping_size(dev) << SECTOR_SHIFT);
>>>  	blk_queue_max_hw_sectors(q, shost->max_sectors);
>>>  	if (shost->unchecked_isa_dma)
>>>  		blk_queue_bounce_limit(q, BLK_BOUNCE_ISA);
>>
>> Does dma_max_mapping_size() return a value in bytes? Is
>> shost->max_sectors a number of sectors? If so, are you sure that "<<
>> SECTOR_SHIFT" is the proper conversion? Shouldn't that be ">>
>> SECTOR_SHIFT" instead?
>
> Now the patch has been committed, '<< SECTOR_SHIFT' needs to be fixed.
>
> Also the following kernel oops is triggered on qemu, and looks
> device->dma_mask is NULL.

Just hit the exact same problem using tcmu-runner (ZBC file handler) on bare
metal (no QEMU). dev->dma_mask is NULL. No problem with real disks though.

>
> [    5.826483] scsi host0: Virtio SCSI HBA
> [    5.829302] st: Version 20160209, fixed bufsize 32768, s/g segs 256
> [    5.831042] SCSI Media Changer driver v0.25
> [    5.832491] ==================================================================
> [    5.833332] BUG: KASAN: null-ptr-deref in
> dma_direct_max_mapping_size+0x30/0x94
> [    5.833332] Read of size 8 at addr 0000000000000000 by task kworker/u17:0/7
> [    5.835506] nvme nvme0: pci function 0000:00:07.0
> [    5.833332]
> [    5.833332] CPU: 2 PID: 7 Comm: kworker/u17:0 Not tainted 5.3.0-rc1 #1328
> [    5.836999] ahci 0000:00:1f.2: version 3.0
> [    5.833332] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
> BIOS ?-20180724_192412-buildhw-07.phx4
> [    5.833332] Workqueue: events_unbound async_run_entry_fn
> [    5.833332] Call Trace:
> [    5.833332]  dump_stack+0x6f/0x9d
> [    5.833332]  ? dma_direct_max_mapping_size+0x30/0x94
> [    5.833332]  __kasan_report+0x161/0x189
> [    5.833332]  ? dma_direct_max_mapping_size+0x30/0x94
> [    5.833332]  kasan_report+0xe/0x12
> [    5.833332]  dma_direct_max_mapping_size+0x30/0x94
> [    5.833332]  __scsi_init_queue+0xd8/0x1f3
> [    5.833332]  scsi_mq_alloc_queue+0x62/0x89
> [    5.833332]  scsi_alloc_sdev+0x38c/0x479
> [    5.833332]  scsi_probe_and_add_lun+0x22d/0x1093
> [    5.833332]  ? kobject_set_name_vargs+0xa4/0xb2
> [    5.833332]  ? mutex_lock+0x88/0xc4
> [    5.833332]  ? scsi_free_host_dev+0x4a/0x4a
> [    5.833332]  ? _raw_spin_lock_irqsave+0x8c/0xde
> [    5.833332]  ? _raw_write_unlock_irqrestore+0x23/0x23
> [    5.833332]  ? ata_tdev_match+0x22/0x45
> [    5.833332]  ? attribute_container_add_device+0x160/0x17e
> [    5.833332]  ? rpm_resume+0x26a/0x7c0
> [    5.833332]  ? kobject_get+0x12/0x43
> [    5.833332]  ? rpm_put_suppliers+0x7e/0x7e
> [    5.833332]  ? _raw_spin_lock_irqsave+0x8c/0xde
> [    5.833332]  ? _raw_write_unlock_irqrestore+0x23/0x23
> [    5.833332]  ? scsi_target_destroy+0x135/0x135
> [    5.833332]  __scsi_scan_target+0x14b/0x6aa
> [    5.833332]  ? pvclock_clocksource_read+0xc0/0x14e
> [    5.833332]  ? scsi_add_device+0x20/0x20
> [    5.833332]  ? rpm_resume+0x1ae/0x7c0
> [    5.833332]  ? rpm_put_suppliers+0x7e/0x7e
> [    5.833332]  ? _raw_spin_lock_irqsave+0x8c/0xde
> [    5.833332]  ? _raw_write_unlock_irqrestore+0x23/0x23
> [    5.833332]  ? pick_next_task_fair+0x976/0xa3d
> [    5.833332]  ? mutex_lock+0x88/0xc4
> [    5.833332]  scsi_scan_channel+0x76/0x9e
> [    5.833332]  scsi_scan_host_selected+0x131/0x176
> [    5.833332]  ? scsi_scan_host+0x241/0x241
> [    5.833332]  do_scan_async+0x27/0x219
> [    5.833332]  ? scsi_scan_host+0x241/0x241
> [    5.833332]  async_run_entry_fn+0xdc/0x23d
> [    5.833332]  process_one_work+0x327/0x539
> [    5.833332]  worker_thread+0x330/0x492
> [    5.833332]  ? rescuer_thread+0x41f/0x41f
> [    5.833332]  kthread+0x1c6/0x1d5
> [    5.833332]  ? kthread_park+0xd3/0xd3
> [    5.833332]  ret_from_fork+0x1f/0x30
> [    5.833332] ==================================================================
>
>
>
> Thanks,
> Ming Lei
>

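The reports in this thread all point at a NULL dma_mask being dereferenced on the way through dma_direct_max_mapping_size(). One possible way to avoid that is to skip the clamp entirely when the device has no DMA mask. The userspace sketch below models only that idea: struct mock_device, mock_dma_max_mapping_size() and limit_max_sectors() are hypothetical stand-ins rather than kernel APIs, the 256 KiB cap is an arbitrary illustrative value, and whether leaving the limit untouched is the right policy for such hosts is an assumption that the thread does not settle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SHIFT 9 /* assumed: 512-byte sectors */

/* Simplified stand-in for the kernel's struct device; only the field that
 * matters for this discussion is modelled. */
struct mock_device {
	uint64_t *dma_mask; /* NULL for a device that has no DMA mask set */
};

/* Stand-in for the DMA limit query. Like the path in the traces, it reads
 * through dma_mask and would fault if that pointer were NULL. The policy
 * (a narrow mask implies a 256 KiB cap) is purely illustrative. */
static size_t mock_dma_max_mapping_size(const struct mock_device *dev)
{
	uint64_t mask = *dev->dma_mask;

	return mask <= UINT32_MAX ? (size_t)256 * 1024 : SIZE_MAX;
}

/* Clamp max_sectors to the DMA limit, but only if a DMA mask exists. */
static unsigned int limit_max_sectors(const struct mock_device *dev,
				      unsigned int max_sectors)
{
	size_t dma_sectors;

	if (!dev->dma_mask)
		return max_sectors; /* nothing to query, keep the limit */

	/* Bytes to sectors: shift right, not left. */
	dma_sectors = mock_dma_max_mapping_size(dev) >> SECTOR_SHIFT;
	return dma_sectors < max_sectors ? (unsigned int)dma_sectors
					 : max_sectors;
}

/* Small self-check: a device without a mask is left alone. */
static void self_check(void)
{
	struct mock_device no_mask = { .dma_mask = NULL };

	assert(limit_max_sectors(&no_mask, 1024) == 1024);
}
```

In this model a device with a NULL mask keeps its original limit, a device with a 32-bit mask is clamped to 512 sectors, and a device with a full 64-bit mask is left unchanged.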
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index d333bb6b1c59..f233bfd84cd7 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1768,6 +1768,8 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
 		blk_queue_max_integrity_segments(q, shost->sg_prot_tablesize);
 	}
 
+	shost->max_sectors = min_t(unsigned int, shost->max_sectors,
+			dma_max_mapping_size(dev) << SECTOR_SHIFT);
 	blk_queue_max_hw_sectors(q, shost->max_sectors);
 	if (shost->unchecked_isa_dma)
 		blk_queue_bounce_limit(q, BLK_BOUNCE_ISA);
We need to limit the devices max_sectors to what the DMA mapping
implementation can support.  If not we risk running out of swiotlb
buffers easily.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/scsi_lib.c | 2 ++
 1 file changed, 2 insertions(+)