Message ID | 5783EA91.30402@oracle.com (mailing list archive) |
---|---|
State | New, archived |
On 07/11/2016 08:50 PM, Vegard Nossum wrote:
> On 07/11/2016 04:51 AM, Theodore Ts'o wrote:
>> On Thu, Jul 07, 2016 at 10:10:40PM +0200, Vegard Nossum wrote:
>>>
>>> I ran into a second problem (this time it was num_clusters_in_group()
>>> returning a bogus value) with the same symptoms (random memory
>>> corruptions), the new attached patch fixes both problems by checking the
>>> values at mount time.
>>
>> Can you give me a dumpe2fs -h of a file system that is causing
>> num_clusters_in_group() to be bogus?
>>
>> I want to make sure I'm checking the correct base values, instead of
>> doing a brute force loop over all of the block groups and calling
>> ext4_num_clusters_in_group() and ext4_num_base_meta_clusters() for all
>> block groups.
>>
>> Thanks!!
>
> It's sbi->s_es->s_reserved_gdt_blocks:

Durrr, no, it's not, I just realised you asked about
num_clusters_in_group() and not num_base_meta_clusters().

So I did the same thing for that and I tracked it down to
s_blocks_count_{lo,hi} both being 0, causing num_clusters_in_group() to
effectively return 0 - ext4_group_first_block_no(sb, block_group).

But dumpe2fs shows block count to be 16384, so I was a bit puzzled. I
set a watchpoint on s_blocks_count_lo and indeed it's being corrupted:

    Hardware watchpoint 2: ((struct ext4_super_block *) 0x61e2c400)->s_blocks_count_lo

    Old value = 16384
    New value = 0
    0x00000000602d9d59 in memset ()
    (gdb) bt
    #0  0x00000000602d9d59 in memset ()
    #1  0x000000006010e944 in ext4_init_block_bitmap (...) at fs/ext4/balloc.c:215
    #2  ext4_read_block_bitmap_nowait (...) at fs/ext4/balloc.c:455

Curiously enough, that's this memset() in the same function:

    memset(bh->b_data, 0, sb->s_blocksize);

Checking with some debug printks, it indeed seems like bh->b_data points
to the struct ext4_super_block (!):

    &EXT4_SB(sb)->s_es->s_blocks_count_lo = 0000000063a3c404
    bh->b_data = 0000000063a3c400
    bh->b_size = 400

Well, you can disregard my patch for sure. I'm not sure how the bitmap
we're supposed to initialise ends up pointing to the ext4_super_block
though.


Vegard
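To make the failure mode above concrete, here is a minimal userspace sketch of the two effects described: a memset() over a buffer that aliases the superblock, and the unsigned wrap-around in the last-group block count once s_blocks_count_lo reads as 0. Everything named model_* is hypothetical and heavily simplified. The struct keeps only three fields (only the offsets of the first two match the addresses printed above), the blocks-per-group value is passed in as if cached in memory, and a cluster ratio of 1 (no bigalloc) is assumed. This is not the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for struct ext4_super_block. */
struct model_super_block {
	uint32_t s_inodes_count;     /* offset 0 */
	uint32_t s_blocks_count_lo;  /* offset 4, cf. ...c404 vs. ...c400 above */
	uint32_t s_first_data_block; /* real offset differs; simplified here */
};

/* The watched field sits 4 bytes into the structure in this model. */
_Static_assert(offsetof(struct model_super_block, s_blocks_count_lo) == 4,
	       "s_blocks_count_lo is expected at offset 4");

/*
 * Model of the last-group case: total block count minus the first block
 * of the group, truncated to 32 bits. blocks_per_group is assumed to be
 * cached elsewhere, so it survives the superblock being zeroed.
 */
static uint32_t model_clusters_in_last_group(const struct model_super_block *es,
					     uint32_t group,
					     uint32_t blocks_per_group)
{
	assert(blocks_per_group != 0);
	uint64_t first = (uint64_t)group * blocks_per_group +
			 es->s_first_data_block;
	return (uint32_t)((uint64_t)es->s_blocks_count_lo - first);
}

/* Model of the bitmap initialisation: it clears whatever b_data points at. */
static void model_init_bitmap(void *b_data, size_t blocksize)
{
	memset(b_data, 0, blocksize);
}
```

With an assumed geometry of 16384 blocks, 8192 blocks per group and first data block 1, the model gives 8191 for group 1. After model_init_bitmap() is pointed at the superblock itself, the same call wraps around to 4294959104, which is the kind of bogus value that then drives out-of-bounds bitmap writes.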
diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
index 3020fd7..87655c6 100644
--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -208,6 +208,8 @@ static int ext4_init_block_bitmap(struct super_block *sb,
 	memset(bh->b_data, 0, sb->s_blocksize);
 	bit_max = ext4_num_base_meta_clusters(sb, block_group);
+	printk(KERN_ERR "kernel BUG: %llu > %llu\n", bit_max, sb->s_blocksize * 8);
+	BUG_ON(bit_max > sb->s_blocksize * 8);
 	for (bit = 0; bit < bit_max; bit++)
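
The check in the debug patch above can be sketched in userspace as follows. This is only an illustration under assumptions: model_bit_max_fits is a hypothetical name, bit_max is taken to be an unsigned int and the block size an unsigned long (the message does not state either type), and the function returns an error instead of calling BUG_ON(). If those types are right, %u and %lu would be the matching format specifiers rather than %llu.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Returns true when a bitmap of `blocksize` bytes has room for
 * `bit_max` bits, i.e. when the loop that sets bits 0..bit_max-1
 * would stay inside the buffer.
 */
static bool model_bit_max_fits(unsigned int bit_max, unsigned long blocksize)
{
	/* Guard the multiplication below against overflow. */
	assert(blocksize <= ULONG_MAX / 8);

	unsigned long bits = blocksize * 8;

	if (bit_max > bits) {
		fprintf(stderr, "bit_max %u exceeds bitmap capacity %lu\n",
			bit_max, bits);
		return false;
	}
	return true;
}
```

For a 1024-byte block the capacity is 8192 bits, so a bit_max of 8192 passes and 8193 is rejected. A wrapped-around value such as the one produced by a zeroed block count would be rejected the same way.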