kernel BUG at block/bio.c:1787! while initializing scsi_debug on ppc64 host

Message ID CACVXFVOPgONuXjGhBnFVxuRVr+=LFe36QR-VTGVvTkEK+V0JLw@mail.gmail.com (mailing list archive)
State New, archived

Commit Message

Ming Lei Dec. 15, 2015, 12:06 p.m. UTC
On Tue, Dec 15, 2015 at 7:20 PM, Eryu Guan <guaneryu@gmail.com> wrote:
> On Fri, Dec 11, 2015 at 07:53:40PM +0800, Eryu Guan wrote:
>> Hi,
>>
>> I saw this kernel BUG_ON on 4.4-rc4 kernel, and this can be reproduced
>> easily on ppc64 host by:
>
> This is still reproducible with 4.4-rc5 kernel.

Could you capture the debug log after applying the attached patch and
reproducing the issue?

Thanks,

>
> Thanks,
> Eryu
>
>>
>> modprobe scsi_debug sector_size=512 physblk_exp=3 dev_size_mb=256
>>
>> And I bisected to this commit
>>
>>       commit ca369d51b3e1649be4a72addd6d6a168cfb3f537
>>       Author: Martin K. Petersen <martin.petersen@oracle.com>
>>       Date:   Fri Nov 13 16:46:48 2015 -0500
>>
>>           block/sd: Fix device-imposed transfer length limits
>>
>> I confirmed by reverting this commit on top of 4.4-rc4 kernel and test
>> passed.
>>
>> Thanks,
>> Eryu
>>
>> P.S. dmesg log
>> [  817.477557] scsi_debug:sdebug_driver_probe: host protection
>> [  817.477571] scsi host1: scsi_debug, version 1.85 [20141022], dev_size_mb=256, opts=0x0
>> [  817.478202] scsi 1:0:0:0: Direct-Access     Linux    scsi_debug       0184 PQ: 0 ANSI: 6
>> [  817.478733] sd 1:0:0:0: Attached scsi generic sg1 type 0
>> [  817.496144] sd 1:0:0:0: [sdb] 524288 512-byte logical blocks: (268 MB/256 MiB)
>> [  817.496155] sd 1:0:0:0: [sdb] 4096-byte physical blocks
>> [  817.506142] sd 1:0:0:0: [sdb] Write Protect is off
>> [  817.526134] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA
>> [  817.646163] ------------[ cut here ]------------
>> [  817.646168] kernel BUG at block/bio.c:1787!
>> [  817.646172] Oops: Exception in kernel mode, sig: 5 [#1]
>> [  817.646174] SMP NR_CPUS=2048 NUMA pSeries
>> [  817.646178] Modules linked in: scsi_debug(E) nfsv3(E) rpcsec_gss_krb5(E) nfsv4(E) dns_resolver(E) nfs(E) fscache(E) dm_mod(E) loop(E) sg(E) pseries_rng(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) sunrpc(E) grace(E) ip_tables(E) xfs(E) libcrc32c(E) sd_mod(E) ibmvscsi(E) ibmveth(E) scsi_transport_srp(E)
>> [  817.646205] CPU: 6 PID: 166 Comm: kworker/u321:1 Tainted: G            E   4.4.0-rc4 #1
>> [  817.646211] Workqueue: events_unbound .async_run_entry_fn
>> [  817.646215] task: c00000000a0c0000 ti: c00000000a180000 task.ti: c00000000a180000
>> [  817.646218] NIP: c0000000003b1d54 LR: c0000000003c4780 CTR: c0000000003be420
>> [  817.646222] REGS: c00000000a1826c0 TRAP: 0700   Tainted: G            E    (4.4.0-rc4)
>> [  817.646225] MSR: 8000000100029032 <SF,EE,ME,IR,DR,RI>  CR: 24732728  XER: 00000000
>> [  817.646233] CFAR: c0000000003c477c SOFTE: 1
>> GPR00: c0000000003c4780 c00000000a182940 c000000001325e00 c00000016cebcf00
>> GPR04: 0000000000000000 0000000002400000 c00000013c5f4d80 0000000000000040
>> GPR08: f000000000436ac0 0000000000000001 0000000000000000 ffffffffffffffff
>> GPR12: 0000000024732722 c00000000e743900 0000000000000000 f000000000436ac0
>> GPR16: c0000000f9e3eee0 c00000010dab0000 0000000000000001 0000000000000000
>> GPR20: 0000000000000000 0000000000000080 0000000000000000 c00000016cebcf00
>> GPR24: c0000000ff9b5a20 c00000000a182bb8 c00000016cebcf88 0000000000000000
>> GPR28: 0000000000000000 c00000016cebcf00 0000000000000000 0000000000010000
>> [  817.646273] NIP [c0000000003b1d54] .bio_split+0x34/0x110
>> [  817.646277] LR [c0000000003c4780] .blk_queue_split+0x3b0/0x560
>> [  817.646280] Call Trace:
>> [  817.646282] [c00000000a182940] [c00000000a1829d0] 0xc00000000a1829d0 (unreliable)
>> [  817.646287] [c00000000a1829d0] [c0000000003c4780] .blk_queue_split+0x3b0/0x560
>> [  817.646291] [c00000000a182ae0] [c0000000003be460] .blk_queue_bio+0x40/0x430
>> [  817.646295] [c00000000a182b80] [c0000000003bc0f0] .generic_make_request+0x150/0x210
>> [  817.646299] [c00000000a182c30] [c0000000003bc26c] .submit_bio+0xbc/0x1c0
>> [  817.646304] [c00000000a182cf0] [c0000000002cb64c] .submit_bh_wbc+0x19c/0x200
>> [  817.646308] [c00000000a182d90] [c0000000002cbb10] .block_read_full_page+0x310/0x410
>> [  817.646312] [c00000000a183290] [c0000000002cf11c] .blkdev_readpage+0x1c/0x30
>> [  817.646316] [c00000000a183300] [c0000000001e51a0] .do_read_cache_page+0xc0/0x290
>> [  817.646321] [c00000000a1833c0] [c0000000003d59f8] .read_dev_sector+0x38/0xb0
>> [  817.646325] [c00000000a183440] [c0000000003d977c] .read_lba+0xcc/0x1f0
>> [  817.646329] [c00000000a1834f0] [c0000000003da3b8] .efi_partition+0x118/0x780
>> [  817.646333] [c00000000a183670] [c0000000003d6fcc] .check_partition+0x14c/0x2e0
>> [  817.646337] [c00000000a183700] [c0000000003d6260] .rescan_partitions+0xd0/0x380
>> [  817.646341] [c00000000a1837e0] [c0000000002d0b88] .__blkdev_get+0x3d8/0x530
>> [  817.646345] [c00000000a1838a0] [c0000000002d0f10] .blkdev_get+0x230/0x4a0
>> [  817.646348] [c00000000a1839a0] [c0000000003d3288] .add_disk+0x468/0x4f0
>> [  817.646353] [c00000000a183a60] [d000000002026450] .sd_probe_async+0xf0/0x230 [sd_mod]
>> [  817.646357] [c00000000a183af0] [c0000000000d23a8] .async_run_entry_fn+0x98/0x200
>> [  817.646362] [c00000000a183ba0] [c0000000000c6d74] .process_one_work+0x1a4/0x490
>> [  817.646366] [c00000000a183c40] [c0000000000c71dc] .worker_thread+0x17c/0x5a0
>> [  817.646369] [c00000000a183d30] [c0000000000ce704] .kthread+0x104/0x130
>> [  817.646374] [c00000000a183e30] [c000000000009534] .ret_from_kernel_thread+0x58/0xa4
>> [  817.646377] Instruction dump:
>> [  817.646379] 3924ffff 7d292378 fba1ffe8 55290ffe fbc1fff0 fb81ffe0 fbe1fff8 7c9e2378
>> [  817.646386] 7c7d1b78 f8010010 7d2907b4 f821ff71 <0b090000> 81230028 789c0020 5529ba7e
>> [  817.646394] ---[ end trace 0c08ee96e8610127 ]---
>> [  817.647718]
>> [  819.647756] Kernel panic - not syncing: Fatal exception
>> [  819.656776] Rebooting in 10 seconds..
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Eryu Guan Dec. 15, 2015, 1:06 p.m. UTC | #1
On Tue, Dec 15, 2015 at 08:06:47PM +0800, Ming Lei wrote:
> On Tue, Dec 15, 2015 at 7:20 PM, Eryu Guan <guaneryu@gmail.com> wrote:
> > On Fri, Dec 11, 2015 at 07:53:40PM +0800, Eryu Guan wrote:
> >> Hi,
> >>
> >> I saw this kernel BUG_ON on 4.4-rc4 kernel, and this can be reproduced
> >> easily on ppc64 host by:
> >
> > This is still reproducible with 4.4-rc5 kernel.
> 
> Could you capture the debug log after applying the attached patch and
> reproducing the issue?

Thanks for looking into this! dmesg shows:

[  686.217682] bio_split: sectors 0, bio_sectors 128, bi_rw 0

Thanks,
Eryu

P.S. full call trace

[  686.065692] scsi_debug:sdebug_driver_probe: host protection
[  686.065710] scsi host1: scsi_debug, version 1.85 [20141022], dev_size_mb=256, opts=0x0
[  686.065981] scsi 1:0:0:0: Direct-Access     Linux    scsi_debug       0184 PQ: 0 ANSI: 6
[  686.066873] sd 1:0:0:0: Attached scsi generic sg1 type 0
[  686.077683] sd 1:0:0:0: [sdb] 524288 512-byte logical blocks: (268 MB/256 MiB)
[  686.077694] sd 1:0:0:0: [sdb] 4096-byte physical blocks
[  686.087670] sd 1:0:0:0: [sdb] Write Protect is off
[  686.107671] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA
[  686.217682] bio_split: sectors 0, bio_sectors 128, bi_rw 0
[  686.217695] ------------[ cut here ]------------
[  686.217698] kernel BUG at block/bio.c:1793!
[  686.217702] Oops: Exception in kernel mode, sig: 5 [#1]
[  686.217704] SMP NR_CPUS=2048 NUMA pSeries
[  686.217707] Modules linked in: scsi_debug sg pseries_rng nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod ibmvscsi ibmveth scsi_transport_srp
[  686.217727] CPU: 8 PID: 9515 Comm: kworker/u32:0 Not tainted 4.4.0-rc5+ #33
[  686.217733] Workqueue: events_unbound async_run_entry_fn
[  686.217737] task: c0000005edb23cc0 ti: c0000005f016c000 task.ti: c0000005f016c000
[  686.217740] NIP: c0000000003c45c4 LR: c0000000003c46b8 CTR: 00000000013abb8c
[  686.217743] REGS: c0000005f016ea20 TRAP: 0700   Not tainted  (4.4.0-rc5+)
[  686.217746] MSR: 8000000100029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22bb2322  XER: 0000000f
[  686.217756] CFAR: c0000000003c46cc SOFTE: 1 
GPR00: c0000000003c46b8 c0000005f016eca0 c000000001068300 000000000000002e 
GPR04: c0000005ffd09c50 c0000005ffd1b4a0 0000000000010000 0000000000000000 
GPR08: 0000000000000001 c000000000bab284 00000005ff160000 0000000000000130 
GPR12: 0000000000003f30 c00000000e7e4c00 0000000000000000 f0000000015d0e40 
GPR16: c0000005f3c3b7a0 c000000574390000 0000000000000001 0000000000000000 
GPR20: 0000000000000000 0000000000000080 0000000000000000 c0000005f5093200 
GPR24: c0000005edb0efa0 c0000005f016ee60 c0000005f5093288 0000000000000000 
GPR28: 0000000002400000 c0000005f5093200 0000000000000000 c0000005efd67600 
[  686.217797] NIP [c0000000003c45c4] bio_split+0x54/0x160
[  686.217800] LR [c0000000003c46b8] bio_split+0x148/0x160
[  686.217803] Call Trace:
[  686.217805] [c0000005f016eca0] [c0000000003c46b8] bio_split+0x148/0x160 (unreliable)
[  686.217810] [c0000005f016ed30] [c0000000003d75e0] blk_queue_split+0x3c0/0x570
[  686.217814] [c0000005f016ee30] [c0000000003d10a8] blk_queue_bio+0x48/0x440
[  686.217818] [c0000005f016ee90] [c0000000003cec9c] generic_make_request+0x15c/0x220
[  686.217822] [c0000005f016eef0] [c0000000003cee24] submit_bio+0xc4/0x1d0
[  686.217826] [c0000005f016efa0] [c0000000002db204] submit_bh_wbc+0x1a4/0x200
[  686.217830] [c0000005f016eff0] [c0000000002db6f0] block_read_full_page+0x320/0x420
[  686.217835] [c0000005f016f4a0] [c0000000002dedb4] blkdev_readpage+0x24/0x40
[  686.217839] [c0000005f016f4c0] [c0000000001f06fc] do_read_cache_page+0xbc/0x290
[  686.217844] [c0000005f016f530] [c0000000003e8e00] read_dev_sector+0x40/0xc0
[  686.217848] [c0000005f016f560] [c0000000003ec6bc] read_lba+0xdc/0x200
[  686.217851] [c0000005f016f5c0] [c0000000003ece4c] find_valid_gpt+0xec/0x740
[  686.217855] [c0000005f016f6a0] [c0000000003ed894] efi_partition+0x3f4/0x450
[  686.217859] [c0000005f016f820] [c0000000003ea428] check_partition+0x158/0x2f0
[  686.217863] [c0000005f016f8a0] [c0000000003e9694] rescan_partitions+0xd4/0x390
[  686.217867] [c0000005f016f970] [c0000000002e0938] __blkdev_get+0x3a8/0x4d0
[  686.217871] [c0000005f016f9e0] [c0000000002e0c90] blkdev_get+0x230/0x4a0
[  686.217875] [c0000005f016fa90] [c0000000003e65b8] add_disk+0x478/0x500
[  686.217880] [c0000005f016fb40] [d000000003fa66a8] sd_probe_async+0xf8/0x240 [sd_mod]
[  686.217884] [c0000005f016fbc0] [c0000000000d7db8] async_run_entry_fn+0x98/0x1f0
[  686.217888] [c0000005f016fc50] [c0000000000cc1a0] process_one_work+0x190/0x470
[  686.217892] [c0000005f016fce0] [c0000000000cc5fc] worker_thread+0x17c/0x5a0
[  686.217896] [c0000005f016fd80] [c0000000000d3da8] kthread+0x108/0x130
[  686.217901] [c0000005f016fe30] [c000000000009538] ret_from_kernel_thread+0x5c/0xa4
[  686.217904] Instruction dump:
[  686.217906] 7cdf3378 7c9e2378 7c7d1b78 f8010010 7cbc2b78 f821ff71 80c30028 40dd00e8
[  686.217912] 54caba7e 39000000 7f8a2040 40dd00d8 <0b080000> 54c9ba7e 7bdb0020 7f89d840
[  686.217921] ---[ end trace 80d38b6aaec5b2ff ]---

Patch

diff --git a/block/bio.c b/block/bio.c
index dbabd48..8d23a99 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1784,6 +1784,12 @@  struct bio *bio_split(struct bio *bio, int sectors,
 {
 	struct bio *split = NULL;
 
+	if (sectors <= 0 || (sectors >= bio_sectors(bio))) {
+		printk("%s: sectors %d, bio_sectors %u, bi_rw %x\n",
+				__func__, sectors, bio_sectors(bio),
+				bio->bi_rw);
+	}
+
 	BUG_ON(sectors <= 0);
 	BUG_ON(sectors >= bio_sectors(bio));
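
For readers following the thread, the two preconditions that the debug patch reports can be modelled outside the kernel. The following is only a minimal user-space sketch: `check_split` and the `split_check` enum are hypothetical names invented for illustration, not kernel APIs, and the sketch assumes nothing about why blk_queue_split() passed these values. It shows which of the two BUG_ON() conditions matches the reported line "bio_split: sectors 0, bio_sectors 128, bi_rw 0".

```c
/* Hypothetical user-space model of the two BUG_ON() preconditions at the
 * top of bio_split(). None of these names exist in the kernel. */
enum split_check {
	SPLIT_OK,          /* 0 < sectors < bio_sectors: split is allowed */
	SPLIT_NONPOSITIVE, /* would trip BUG_ON(sectors <= 0) */
	SPLIT_TOO_LARGE    /* would trip BUG_ON(sectors >= bio_sectors(bio)) */
};

static enum split_check check_split(int sectors, unsigned int bio_sectors)
{
	/* Same order as in the patched function: the sign check comes first. */
	if (sectors <= 0)
		return SPLIT_NONPOSITIVE;

	/* sectors is known to be positive here, so the cast is safe. */
	if ((unsigned int)sectors >= bio_sectors)
		return SPLIT_TOO_LARGE;

	return SPLIT_OK;
}
```

With the values from the captured log, `check_split(0, 128)` returns `SPLIT_NONPOSITIVE`: the requested split size was zero sectors, so the first assertion (`sectors <= 0`) is the one that fired, not the upper-bound check.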