bio linked list corruption.

Message ID	203e0319-bc9b-245c-e162-709267540d22@fb.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-btrfs-owner@kernel.org> Subject: Re: bio linked list corruption. To: Linus Torvalds <torvalds@linux-foundation.org> References: <20161026002752.qvrm6yxqb54fiqnd@codemonkey.org.uk> <CA+55aFyv0k9e6bBqGm-LL3CUwimS4+rSu341P7SOV5ezYrrW_g@mail.gmail.com> <CA+55aFydo5oxo+ihUfGffMK6cAZnbZ8WyEiF7xgEx1Bh_aOMww@mail.gmail.com> <20161026163018.wx57yy554576s6e2@codemonkey.org.uk> <CA+55aFzsrRenu3NsDYr3qqO=fC9FUxkLc3M2wzeLBpN7ghwvwA@mail.gmail.com> <20161026184201.6ofblkd3j5uxystq@codemonkey.org.uk> <CA+55aFwD9McVapb0svQrrvP1k6iSkqz5ENNGXY6b+Yo-k7wOsg@mail.gmail.com> <488f9edc-6a1c-2c68-0d33-d3aa32ece9a4@fb.com> <ac6cd46c-d609-ef1f-019f-204248c26c12@fb.com> <CA+55aFwY5jNcUAJzDWERP2r9iZHEKzHSwMi5AJCJiaVd3Z0E-g@mail.gmail.com> <20161026224025.mou27kki4bslftli@codemonkey.org.uk> <CA+55aFyyP6VHOv6KbhVdMMtNfJLftbju_Ui2GNxXt0GgkKm0Jg@mail.gmail.com> <CA+55aFyiJGk-A_cJH41Ec8Xj0Zz9M3EU-igJ9bgusj=nm28tFQ@mail.gmail.com> <2bdc068d-afd5-7a78-f334-26970c91aaca@fb.com> <CA+55aFxqRCEVxocpZVDMqinP=EsmgzcnvWpOT6x8NMmcBpWb2Q@mail.gmail.com> CC: Dave Jones <davej@codemonkey.org.uk>, Chris Mason <clm@fb.com>, "Andy Lutomirski" <luto@amacapital.net>, Andy Lutomirski <luto@kernel.org>, Al Viro <viro@zeniv.linux.org.uk>, Josef Bacik <jbacik@fb.com>, David Sterba <dsterba@suse.com>, linux-btrfs <linux-btrfs@vger.kernel.org>, Linux Kernel <linux-kernel@vger.kernel.org>, Dave Chinner <david@fromorbit.com> From: Jens Axboe <axboe@fb.com> Message-ID: <203e0319-bc9b-245c-e162-709267540d22@fb.com> Date: Wed, 26 Oct 2016 17:20:01 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.3.0 MIME-Version: 1.0 In-Reply-To: <CA+55aFxqRCEVxocpZVDMqinP=EsmgzcnvWpOT6x8NMmcBpWb2Q@mail.gmail.com> Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 7bit Received-SPF: None (protection.outlook.com: fb.com does not designate permitted sender hosts) X-Microsoft-Exchange-Diagnostics: =?utf-8?B?MTtETTVQUjE1TUIxMTk1OzIzOlBDcFc3SHE5Ynh1Y2NCSmVUY1RoOStvWHdD?= =?utf-8?B?Zmx3YUxab3I1ZWJkSE9jQllLOU55RHBWYmFBQ2JwTk05K3lTajBSMGhLcTNI?= =?utf-8?B?akFxQUxDQlRtL2FhU0R2ZzROM29uMWNvNW12MXNaOFhNU082QWZFODI4MVJG?= =?utf-8?B?eDFDUXdUby8yaFZTZkxuUlhDVkc5dG9nekl5Qk5MSWZBNGNsTndCQVRwck9j?= =?utf-8?B?YXlhaG82QnlLM3AwMDByazFId3FQK2ZZa2huc1BzYjF2cHNJVUtyU3dSeUEy?= =?utf-8?B?MExLclk0cUxMMzV4aU9EOUprdzRKQ1krakVNeHIwdzRoN2pwWlNVaFhmL3RX?= =?utf-8?B?RWg4N2ZoZjFKTGd0bWpTUzQ0cnBEVUJZZ1pIV3dUSEt3OHhNWW4xeTFSS3ps?= =?utf-8?B?L2IxZERkVlpDZHNHSkJFTFcxMWlpYnljN0x3VGlubEU0S1JTbUpVSjRvZEg4?= =?utf-8?B?dXRZY2w3RWlMcHdtaWN1MWlyblFNVmxXL1IzYzRqZFB2N0o0MC9FZnZ6WUxs?= =?utf-8?B?TzNyQnJHYUZxU3JLd09BMTBXUVl3aHIrdFlVNE40ZXpkalFQQllYRlJZaEZK?= =?utf-8?B?NHVLVzg3cU9rN1JZREJFckwzRXZ3MjVZM0RoWUFiTXVUZ1Z3R1E4dVl6b29H?= =?utf-8?B?N3RPU2RLWTR3bk9qYlA4MWRMNnBLamxBSkR4OFpQUDkvUTRIQXYrVndxRUhh?= =?utf-8?B?WTg3SkxxbU5hTFhkWW94ZlovVU1WQWtSbHNjTG90b1FsY1l6dCtHQXl6djZR?= =?utf-8?B?aW04R1kyTzg1eTBCeW5DNnM5VDAwZlBEQWc1YktNZ2VyUng5d0dGVDErbW1Z?= =?utf-8?B?Qm5WQ2FEaEZwUFlzMTdtM1lVeEdIQlVaemJ6dnlNYXdvYklZQjNOWDV4NXk0?= =?utf-8?B?KzByR1oxemo5M1IvV0t4T0xIK3MyZjhEL21wMlRFbEpzOTNtaGZPbDVTeGhX?= =?utf-8?B?Yk1yd2szU0lnaXp2N3N4dE53VXBMN25hOTdWVCtOcWhyY0ZENnYrR2QrSTgx?= =?utf-8?B?NUI2NU50b3pUNFQzNCtCNmdEL2k3VG85WFY4R1cyVDJrc1FyaXFWWUpkRlk0?= =?utf-8?B?TTFIS1ZxWlVycnNWQnIrMWZPNStoK0IydkRtV3RyRHNxTEFEN2Fpb2FnSW1S?= =?utf-8?B?azJFVFlmbXhNK3lVTmxEYXIwckFiT3h3R040WkxaTVFVM3NhcytFWFRzc1Yx?= =?utf-8?B?NGpKVTZKaHNtTExGQmhpdEpDQ2Y1dk84VUhldnZFdnUwUG5XaGRDT2owdVlU?= =?utf-8?B?a20rc2U4MVRrdmF4NTZMdmEydTdSVzNNeVM5M203aThkeVlmYkhGbDBtY3pY?= =?utf-8?B?eWUzczdZY1VTRGxCWVp5QS9TNkxSeDZ5S210aG5vT3Y3S3k5U0YyYjZoazZL?= =?utf-8?B?VnUxYmJLc0V2VDRrNGJabzgwQm8vSlFHbVBPa0VDWDgrTG9ZSFJtQlovZ29o?= =?utf-8?B?WGloU2QxNms3dW5TbHRrd094M1JLb2p5eitFRys3K1N3MWNSMmhpaUI1MG5w?= =?utf-8?B?SXJjOE80Q0VmVElLOWpKd3BkUGp6VWtEVWFidFd2S1l0WUc5UWQ0VUNpSHRp?= =?utf-8?B?ZkdNUUl6MURuVnJ0TytiYUkweE9Xa01hd3VVQTNWTStWSE9xMDJyTTFFcXZa?= =?utf-8?B?MnJxRUVCV1BQVVFrWnV3RGllTkp0U1UxT2JZYmd2OVBvSVlJSzkxNlk0UG03?= =?utf-8?B?RWVpTFpSMEtWZWNIUzRhUlp5ajRRbUF2YjN6Q3NFc0NDK3JvbFp5d25IRi9l?= =?utf-8?B?MUVRVUk3T05nS1c5SHpGNGRMblBEbkc2R21mMU05OUVick40Mm5ZbnRoTXlG?= =?utf-8?B?ZlFyRWVtUEhGYkRqTTNLVkpmb3FrT1Q1U2FSZFFoZDY3R0lOdmFuTFJtWDNX?= =?utf-8?B?eUxXYkdJL3lJczJQVlRReFIyTnMzS1Yxek0xYzJOY1dONnBCdjQrTWdJQ1pY?= =?utf-8?B?a1JEMFNXU2V0VnVTTkVyNHFpTkpud0o2dVdienNSZ2NydlY5dllUZ2ltZ1pK?= =?utf-8?B?OFhQQ1JJbGpQSTdhWUluOENwQXpBdlZQMFhodz09?= X-Microsoft-Exchange-Diagnostics: 1; DM5PR15MB1195; 6:+jc0RsY/mZN6K4u3P363PNyMU2PlRNfzeg8KcMMald/H0WcrDqsCJoTqq6bCYQEj1cFcJm9GEmd2G1WyLOmD5ZH2RgD1TMSG1DgQV8KNWAMJpASUFO/hxBrfM7r6+vxFAdlggHJvlBGCqiRbwxUrSePOZ+5AHo55EJVIK5ithfg1wDVohdjpTtSNxISquXcVt7NZl3Hq3FbNAUSpWhxBoxQzkxh9OyF+I9pTCmqL7FJfOtxTG9ppstj5s3/ZRGVc74qH7CcClv8NaU3Gv/l8/WFVuI1y+MyPE0Rg0Sjc3xSnfVn9RxLPld6gZdgRMo/D; 5:JThUgVg8DZrm+mqT65i0sZSn3zXuuAeC8aS7Jd9tnXT1nXzlpV1votQX2BlRftCvqTVeXe0kSHTEC8y09Lok08+A6Mj1/WCaJdXjfT9lEHMcnlMQhTq/cyCadU3ubxJxhFwoIRXn8KePcowGIO3Jhw==; 24:A59Ydjjegh3i0q/TzLVqM1jJuS2JhHV5h0SWisxcg70QijIW8gO/sElUUSISHvRLHpvy5ld5K4vz0HJJeePu/a6jHl9RTGAMCmsCghqQyLE=; 7:J6U50eAPGy4bJsrWVVOwa8HOF2NCSQhw/zmmVAHl+EtZTi7ynppYV2D8X/JMXZi2QpaPaduit6OK9geLoQ2bPh6w9JFZIaJRs9ZOAnFKNP/h/SEUxPC6SYmA2Fv7+miWbpE8OWKQZYEjkkFQldy8cu1Eixsf6oKvZtzIkj4VNgQOoc7/PajUBJqb8rC+gqTP7LQHREtGdLDN9/rqFssaSpLhXDFC/gA5ZmPmtSxNIOEQ6icAE3OjcnKZvAOEOokHEOeZgp++yWoWNa8eh1z0Lt2MzuA0wZ+3HSUfD3UHZWl46vhhea8SMM63bMMP5c/xE644+EFPouqkFPFHaV19kT/gPt7k PvbZhROK9Y0eY/4= SpamDiagnosticOutput: 1:99 SpamDiagnosticMetadata: NSPM X-Microsoft-Exchange-Diagnostics: 1; DM5PR15MB1195; 20:5VMj1scdg+GcYFTqIxdSdbnO4gpC+EWyB8P01+d0GREeih3+zsIEUmicGc6qTPh4Fetxoz/BiLZN9NT0zEapWtB05Lwomgml74kRuA4hMqg1Fz7JQMgceaxT7VcrQPZae0RKUsorSAyMLF9IU1bb3bqhs51rZz969c3a/AfkzMQ= X-MS-Exchange-CrossTenant-OriginalArrivalTime: 26 Oct 2016 23:20:09.1346 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-Transport-CrossTenantHeadersStamped: DM5PR15MB1195 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk

Jens Axboe Oct. 26, 2016, 11:20 p.m. UTC

On 10/26/2016 05:08 PM, Linus Torvalds wrote:
> On Wed, Oct 26, 2016 at 4:03 PM, Jens Axboe <axboe@fb.com> wrote:
>>
>> Actually, I think I see what might trigger it. You are on nvme, iirc,
>> and that has a deep queue.
>
> Yes. I have long since moved on from slow disks, so all my systems are
> not just flash, but m.2 nvme ssd's.
>
> So at least that could explain why Dave sees it at bootup but I don't.

Yep, you'd never sleep during normal boot. The original patch had a
problem, though... This one should be better. Too many 'data's, we'll
still need to assign ctx/hctx, we should just use the current ones, not
the original ones.

Chris Mason Oct. 26, 2016, 11:38 p.m. UTC | #1

On Wed, Oct 26, 2016 at 05:20:01PM -0600, Jens Axboe wrote:
>On 10/26/2016 05:08 PM, Linus Torvalds wrote:
>>On Wed, Oct 26, 2016 at 4:03 PM, Jens Axboe <axboe@fb.com> wrote:
>>>
>>>Actually, I think I see what might trigger it. You are on nvme, iirc,
>>>and that has a deep queue.
>>
>>Yes. I have long since moved on from slow disks, so all my systems are
>>not just flash, but m.2 nvme ssd's.
>>
>>So at least that could explain why Dave sees it at bootup but I don't.
>
>Yep, you'd never sleep during normal boot. The original patch had a
>problem, though... This one should be better. Too many 'data's, we'll
>still need to assign ctx/hctx, we should just use the current ones, not
>the original ones.
>
>diff --git a/block/blk-mq.c b/block/blk-mq.c
>index ddc2eed64771..e56fec187ba6 100644
>--- a/block/blk-mq.c
>+++ b/block/blk-mq.c
>@@ -1217,9 +1224,9 @@ static struct request *blk_mq_map_request(struct 
>request_queue *q,
> 	blk_mq_set_alloc_data(&alloc_data, q, 0, ctx, hctx);
> 	rq = __blk_mq_alloc_request(&alloc_data, op, op_flags);
>
>-	hctx->queued++;
>-	data->hctx = hctx;
>-	data->ctx = ctx;
>+	data->hctx = alloc_data.hctx;
>+	data->ctx = alloc_data.ctx;
>+	data->hctx->queued++;
> 	return rq;
> }

This made it through an entire dbench 2048 run on btrfs.  My script has 
it running in a loop, but this is farther than I've gotten before.  
Looking great so far.

-chris

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Dave Jones Oct. 26, 2016, 11:47 p.m. UTC | #2

On Wed, Oct 26, 2016 at 07:38:08PM -0400, Chris Mason wrote:
 
 > >-	hctx->queued++;
 > >-	data->hctx = hctx;
 > >-	data->ctx = ctx;
 > >+	data->hctx = alloc_data.hctx;
 > >+	data->ctx = alloc_data.ctx;
 > >+	data->hctx->queued++;
 > > 	return rq;
 > > }
 > 
 > This made it through an entire dbench 2048 run on btrfs.  My script has 
 > it running in a loop, but this is farther than I've gotten before.  
 > Looking great so far.

Fixed the splat during boot for me too.
Now the fun part, let's see if it fixed the 'weird shit' that Trinity
was stumbling on.

	Dave
 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Jens Axboe Oct. 27, 2016, midnight UTC | #3

On 10/26/2016 05:47 PM, Dave Jones wrote:
> On Wed, Oct 26, 2016 at 07:38:08PM -0400, Chris Mason wrote:
>
>  > >-	hctx->queued++;
>  > >-	data->hctx = hctx;
>  > >-	data->ctx = ctx;
>  > >+	data->hctx = alloc_data.hctx;
>  > >+	data->ctx = alloc_data.ctx;
>  > >+	data->hctx->queued++;
>  > > 	return rq;
>  > > }
>  >
>  > This made it through an entire dbench 2048 run on btrfs.  My script has
>  > it running in a loop, but this is farther than I've gotten before.
>  > Looking great so far.
>
> Fixed the splat during boot for me too.
> Now the fun part, let's see if it fixed the 'weird shit' that Trinity
> was stumbling on.

Let's let the testing simmer overnight, then I'll turn this into a real
patch tomorrow and get it submitted.

Chris Mason Oct. 27, 2016, 1:33 p.m. UTC | #4

On 10/26/2016 08:00 PM, Jens Axboe wrote:
> On 10/26/2016 05:47 PM, Dave Jones wrote:
>> On Wed, Oct 26, 2016 at 07:38:08PM -0400, Chris Mason wrote:
>>
>>  > >-    hctx->queued++;
>>  > >-    data->hctx = hctx;
>>  > >-    data->ctx = ctx;
>>  > >+    data->hctx = alloc_data.hctx;
>>  > >+    data->ctx = alloc_data.ctx;
>>  > >+    data->hctx->queued++;
>>  > >     return rq;
>>  > > }
>>  >
>>  > This made it through an entire dbench 2048 run on btrfs.  My script
>> has
>>  > it running in a loop, but this is farther than I've gotten before.
>>  > Looking great so far.
>>
>> Fixed the splat during boot for me too.
>> Now the fun part, let's see if it fixed the 'weird shit' that Trinity
>> was stumbling on.
> 
> Let's let the testing simmer overnight, then I'll turn this into a real
> patch tomorrow and get it submitted.
> 

I ran all night on both btrfs and xfs.  XFS came out clean, but btrfs hit
the WARN_ON below.  I hit it a few times with Jens' patch, always the
same warning.  It's pretty obviously a btrfs bug, we're not cleaning up
this list properly during fsync.  I tried a v1 of a btrfs fix overnight,
but I see where it was incomplete now and will re-run.

For the blk-mq bug, I think we got it!

Tested-by: always-blaming-jens-from-now-on <clm@fb.com>

WARNING: CPU: 5 PID: 16163 at lib/list_debug.c:62 __list_del_entry+0x86/0xd0
list_del corruption. next->prev should be ffff8801196d3be0, but was ffff88010fc63308
Modules linked in: crc32c_intel aesni_intel aes_x86_64 glue_helper i2c_piix4 lrw i2c_core gf128mul ablk_helper virtio_net serio_raw button pcspkr floppy cryptd sch_fq_codel autofs4 virtio_blk
CPU: 5 PID: 16163 Comm: dbench Not tainted 4.9.0-rc2-00041-g811d54d-dirty #322
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.0-1.fc24 04/01/2014
 ffff8801196d3a68 ffffffff814fde3f ffffffff8151c356 ffff8801196d3ac8
 ffff8801196d3ac8 0000000000000000 ffff8801196d3ab8 ffffffff810648cf
 dead000000000100 0000003e813bfc4a ffff8801196d3b98 ffff880122b5c800
Call Trace:
 [<ffffffff814fde3f>] dump_stack+0x53/0x74
 [<ffffffff8151c356>] ? __list_del_entry+0x86/0xd0
 [<ffffffff810648cf>] __warn+0xff/0x120
 [<ffffffff810649a9>] warn_slowpath_fmt+0x49/0x50
 [<ffffffff8151c356>] __list_del_entry+0x86/0xd0
 [<ffffffff8143618d>] btrfs_sync_log+0x75d/0xbd0
 [<ffffffff8143cfa7>] ? btrfs_log_inode_parent+0x547/0xbb0
 [<ffffffff819ad01b>] ? _raw_spin_lock+0x1b/0x40
 [<ffffffff8108d1a3>] ? __might_sleep+0x53/0xa0
 [<ffffffff812095c5>] ? dput+0x65/0x280
 [<ffffffff8143d717>] ? btrfs_log_dentry_safe+0x77/0x90
 [<ffffffff81405b04>] btrfs_sync_file+0x424/0x490
 [<ffffffff8107535a>] ? SYSC_kill+0xba/0x1d0
 [<ffffffff811f2348>] ? __sb_end_write+0x58/0x80
 [<ffffffff812258ac>] vfs_fsync_range+0x4c/0xb0
 [<ffffffff81002501>] ? syscall_trace_enter+0x201/0x2e0
 [<ffffffff8122592c>] vfs_fsync+0x1c/0x20
 [<ffffffff8122596d>] do_fsync+0x3d/0x70
 [<ffffffff810029cb>] ? syscall_slow_exit_work+0xfb/0x100
 [<ffffffff812259d0>] SyS_fsync+0x10/0x20
 [<ffffffff81002b65>] do_syscall_64+0x55/0xd0
 [<ffffffff810026d7>] ? prepare_exit_to_usermode+0x37/0x40
 [<ffffffff819ad286>] entry_SYSCALL64_slow_path+0x25/0x25
---[ end trace c93288442a6424aa ]---

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Dave Jones Oct. 31, 2016, 6:55 p.m. UTC | #5

On Wed, Oct 26, 2016 at 07:47:51PM -0400, Dave Jones wrote:
 > On Wed, Oct 26, 2016 at 07:38:08PM -0400, Chris Mason wrote:
 >  
 >  > >-	hctx->queued++;
 >  > >-	data->hctx = hctx;
 >  > >-	data->ctx = ctx;
 >  > >+	data->hctx = alloc_data.hctx;
 >  > >+	data->ctx = alloc_data.ctx;
 >  > >+	data->hctx->queued++;
 >  > > 	return rq;
 >  > > }
 >  > 
 >  > This made it through an entire dbench 2048 run on btrfs.  My script has 
 >  > it running in a loop, but this is farther than I've gotten before.  
 >  > Looking great so far.
 > 
 > Fixed the splat during boot for me too.
 > Now the fun part, let's see if it fixed the 'weird shit' that Trinity
 > was stumbling on.

It took a while, but.. bad news.


BUG: Bad page state in process kworker/u8:12  pfn:4e0e39
page:ffffea0013838e40 count:0 mapcount:0 mapping:ffff8804a20310e0 index:0x100c
flags: 0x400000000000000c(referenced|uptodate)
page dumped because: non-NULL mapping
CPU: 3 PID: 1586 Comm: kworker/u8:12 Not tainted 4.9.0-rc3-think+ #1 
Workqueue: writeback wb_workfn
  (flush-btrfs-2)
 ffffc90000777828
 ffffffff8130d04c
 ffffea0013838e40
 ffffffff819ff654
 ffffc90000777850
 ffffffff81150e5f
 0000000000000000
 ffffea0013838e40
 400000000000000c
 ffffc90000777860
 ffffffff81150f1a
 ffffc900007778a8
Call Trace:
 [<ffffffff8130d04c>] dump_stack+0x4f/0x73
 [<ffffffff81150e5f>] bad_page+0xbf/0x120
 [<ffffffff81150f1a>] free_pages_check_bad+0x5a/0x70
 [<ffffffff81153818>] free_hot_cold_page+0x248/0x290
 [<ffffffff81153b1b>] free_hot_cold_page_list+0x2b/0x50
 [<ffffffff8115c52d>] release_pages+0x2bd/0x350
 [<ffffffff8115da62>] __pagevec_release+0x22/0x30
 [<ffffffffa00a0d4e>] extent_write_cache_pages.isra.48.constprop.63+0x32e/0x400 [btrfs]
 [<ffffffffa00a1199>] extent_writepages+0x49/0x60 [btrfs]
 [<ffffffffa0081840>] ? btrfs_releasepage+0x40/0x40 [btrfs]
 [<ffffffffa007e993>] btrfs_writepages+0x23/0x30 [btrfs]
 [<ffffffff8115ac4c>] do_writepages+0x1c/0x30
 [<ffffffff811f69f3>] __writeback_single_inode+0x33/0x180
 [<ffffffff811f71e8>] writeback_sb_inodes+0x2a8/0x5b0
 [<ffffffff811f757d>] __writeback_inodes_wb+0x8d/0xc0
 [<ffffffff811f7833>] wb_writeback+0x1e3/0x1f0
 [<ffffffff811f7d72>] wb_workfn+0xd2/0x280
 [<ffffffff810908d5>] process_one_work+0x1d5/0x490
 [<ffffffff81090875>] ? process_one_work+0x175/0x490
 [<ffffffff81090bd9>] worker_thread+0x49/0x490
 [<ffffffff81090b90>] ? process_one_work+0x490/0x490
 [<ffffffff81095d4e>] kthread+0xee/0x110
 [<ffffffff81095c60>] ? kthread_park+0x60/0x60
 [<ffffffff81790b12>] ret_from_fork+0x22/0x30



--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Linus Torvalds Oct. 31, 2016, 7:35 p.m. UTC | #6

On Mon, Oct 31, 2016 at 11:55 AM, Dave Jones <davej@codemonkey.org.uk> wrote:
>
> BUG: Bad page state in process kworker/u8:12  pfn:4e0e39
> page:ffffea0013838e40 count:0 mapcount:0 mapping:ffff8804a20310e0 index:0x100c
> flags: 0x400000000000000c(referenced|uptodate)
> page dumped because: non-NULL mapping

Hmm. So this seems to be btrfs-specific, right?

I searched for all your "non-NULL mapping" cases, and they all seem to
have basically the same call trace, with some work thread doing
writeback and going through btrfs_writepages().

Sounds like it's a race with either fallocate hole-punching or
truncate. I'm not seeing it, but I suspect it's btrfs, since DaveJ
clearly ran other filesystems too but I am not seeing this backtrace
for anything else.

 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Chris Mason Oct. 31, 2016, 7:44 p.m. UTC | #7

On Mon, Oct 31, 2016 at 12:35:16PM -0700, Linus Torvalds wrote:
>On Mon, Oct 31, 2016 at 11:55 AM, Dave Jones <davej@codemonkey.org.uk> wrote:
>>
>> BUG: Bad page state in process kworker/u8:12  pfn:4e0e39
>> page:ffffea0013838e40 count:0 mapcount:0 mapping:ffff8804a20310e0 index:0x100c
>> flags: 0x400000000000000c(referenced|uptodate)
>> page dumped because: non-NULL mapping
>
>Hmm. So this seems to be btrfs-specific, right?
>
>I searched for all your "non-NULL mapping" cases, and they all seem to
>have basically the same call trace, with some work thread doing
>writeback and going through btrfs_writepages().
>
>Sounds like it's a race with either fallocate hole-punching or
>truncate. I'm not seeing it, but I suspect it's btrfs, since DaveJ
>clearly ran other filesystems too but I am not seeing this backtrace
>for anything else.

Agreed, I think this is a separate bug, almost certainly btrfs specific.  
I'll work with Dave on a better reproducer.

-chris
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Dave Jones Nov. 6, 2016, 4:55 p.m. UTC | #8

<subject changed, hopefully we're done with bio corruption for now>

On Mon, Oct 31, 2016 at 01:44:55PM -0600, Chris Mason wrote:
 > On Mon, Oct 31, 2016 at 12:35:16PM -0700, Linus Torvalds wrote:
 > >On Mon, Oct 31, 2016 at 11:55 AM, Dave Jones <davej@codemonkey.org.uk> wrote:
 > >>
 > >> BUG: Bad page state in process kworker/u8:12  pfn:4e0e39
 > >> page:ffffea0013838e40 count:0 mapcount:0 mapping:ffff8804a20310e0 index:0x100c
 > >> flags: 0x400000000000000c(referenced|uptodate)
 > >> page dumped because: non-NULL mapping
 > >
 > >Hmm. So this seems to be btrfs-specific, right?
 > >
 > >I searched for all your "non-NULL mapping" cases, and they all seem to
 > >have basically the same call trace, with some work thread doing
 > >writeback and going through btrfs_writepages().
 > >
 > >Sounds like it's a race with either fallocate hole-punching or
 > >truncate. I'm not seeing it, but I suspect it's btrfs, since DaveJ
 > >clearly ran other filesystems too but I am not seeing this backtrace
 > >for anything else.
 > 
 > Agreed, I think this is a separate bug, almost certainly btrfs specific.  
 > I'll work with Dave on a better reproducer.

Still refining my 'capture ftrace when trinity detects taint' feature,
but in the meantime, here's a variant I don't think we've seen before:

general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
CPU: 3 PID: 1913 Comm: trinity-c51 Not tainted 4.9.0-rc3-think+ #3 
task: ffff880503350040 task.stack: ffffc90000240000
RIP: 0010:[<ffffffffa007c2f6>] 
  [<ffffffffa007c2f6>] write_ctree_super+0x96/0xb30 [btrfs]
RSP: 0018:ffffc90000243c90  EFLAGS: 00010286
RAX: dae05adadadad000 RBX: 0000000000000000 RCX: 0000000000000002
RDX: ffff8804fdfcc000 RSI: ffff8804edcee313 RDI: ffff8804edcee1c3
RBP: ffffc90000243d00 R08: 0000000000000003 R09: ffff880000000000
R10: 0000000000000001 R11: 0000000000000100 R12: ffff88045151c548
R13: 0000000000000000 R14: ffff8804ee5122a8 R15: ffff8804572267e8
FS:  00007f25c3e0eb40(0000) GS:ffff880507e00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f25c1560d44 CR3: 0000000454e20000 CR4: 00000000001406e0
DR0: 00007fee93506000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
Stack:
 0000000000000001
 ffff88050227b3f8
 ffff8804fff01b28
 00000001810b7f35
 ffffffffa007c265
 0000000000000001
 ffffc90000243cb8
 000000002c9a8645
 ffff8804fff01b28
 ffff8804fff01b28
 ffff88045151c548
 0000000000000000
Call Trace:
 [<ffffffffa007c265>] ? write_ctree_super+0x5/0xb30 [btrfs]
 [<ffffffffa00d2956>] btrfs_sync_log+0x886/0xa60 [btrfs]
 [<ffffffffa009f4f9>] btrfs_sync_file+0x479/0x4d0 [btrfs]
 [<ffffffff812789ab>] vfs_fsync_range+0x4b/0xb0
 [<ffffffff81260755>] ? __fget_light+0x5/0x60
 [<ffffffff81278a6d>] do_fsync+0x3d/0x70
 [<ffffffff81278a35>] ? do_fsync+0x5/0x70
 [<ffffffff81278d20>] SyS_fsync+0x10/0x20
 [<ffffffff81002d81>] do_syscall_64+0x61/0x170
 [<ffffffff81894a8b>] entry_SYSCALL64_slow_path+0x25/0x25
Code: c7 48 8b 42 30 4c 8b 08 48 b8 00 00 00 00 00 16 00 00 49 03 81 a0 01 00 00 49 b9 00 00 00 00 00 88 ff ff 48 c1 f8 06 48 c1 e0 0c <4a> 8b 44 08 50 48 39 46 08 0f 84 8d 08 00 00 49 63 c0 48 8d 0c 
RIP 
  [<ffffffffa007c2f6>] write_ctree_super+0x96/0xb30 [btrfs]
 RSP <ffffc90000243c90>

All code
========
   0:	c7                   	(bad)  
   1:	48 8b 42 30          	mov    0x30(%rdx),%rax
   5:	4c 8b 08             	mov    (%rax),%r9
   8:	48 b8 00 00 00 00 00 	movabs $0x160000000000,%rax
   f:	16 00 00 
  12:	49 03 81 a0 01 00 00 	add    0x1a0(%r9),%rax
  19:	49 b9 00 00 00 00 00 	movabs $0xffff880000000000,%r9
  20:	88 ff ff 
  23:	48 c1 f8 06          	sar    $0x6,%rax
  27:	48 c1 e0 0c          	shl    $0xc,%rax
  2b:*	4a 8b 44 08 50       	mov    0x50(%rax,%r9,1),%rax		<-- trapping instruction
  30:	48 39 46 08          	cmp    %rax,0x8(%rsi)
  34:	0f 84 8d 08 00 00    	je     0x8c7
  3a:	49 63 c0             	movslq %r8d,%rax
  3d:	48                   	rex.W
  3e:	8d                   	.byte 0x8d
  3f:	0c                   	.byte 0xc

Code starting with the faulting instruction
===========================================
   0:	4a 8b 44 08 50       	mov    0x50(%rax,%r9,1),%rax
   5:	48 39 46 08          	cmp    %rax,0x8(%rsi)
   9:	0f 84 8d 08 00 00    	je     0x89c
   f:	49 63 c0             	movslq %r8d,%rax
  12:	48                   	rex.W
  13:	8d                   	.byte 0x8d
  14:	0c                   	.byte 0xc


According to objdump -S, it looks like this is an inlined copy of backup_super_roots

        root_backup = info->super_for_commit->super_roots + last_backup;
    2706:       48 8d b8 2b 0b 00 00    lea    0xb2b(%rax),%rdi
    270d:       48 63 c1                movslq %ecx,%rax
    2710:       48 8d 34 80             lea    (%rax,%rax,4),%rsi
    2714:       48 8d 04 b0             lea    (%rax,%rsi,4),%rax
    2718:       48 8d 34 c7             lea    (%rdi,%rax,8),%rsi
            btrfs_header_generation(info->tree_root->node))
    271c:       48 8b 42 30             mov    0x30(%rdx),%rax
    2720:       4c 8b 08                mov    (%rax),%r9
    2723:       48 b8 00 00 00 00 00    movabs $0x160000000000,%rax
    272a:       16 00 00 
    272d:       49 03 81 a0 01 00 00    add    0x1a0(%r9),%rax
        if (btrfs_backup_tree_root_gen(root_backup) ==
    2734:       49 b9 00 00 00 00 00    movabs $0xffff880000000000,%r9
    273b:       88 ff ff 
    273e:       48 c1 f8 06             sar    $0x6,%rax
    2742:       48 c1 e0 0c             shl    $0xc,%rax
    2746:       4a 8b 44 08 50          mov    0x50(%rax,%r9,1),%rax      <-- trapping instruction
    274b:       48 39 46 08             cmp    %rax,0x8(%rsi)
    274f:       0f 84 8d 08 00 00       je     2fe2 <write_ctree_super+0x932>

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Dave Jones Nov. 8, 2016, 2:59 p.m. UTC | #9

On Sun, Nov 06, 2016 at 11:55:39AM -0500, Dave Jones wrote:
 > <subject changed, hopefully we're done with bio corruption for now>
 > 
 > On Mon, Oct 31, 2016 at 01:44:55PM -0600, Chris Mason wrote:
 >  > On Mon, Oct 31, 2016 at 12:35:16PM -0700, Linus Torvalds wrote:
 >  > >On Mon, Oct 31, 2016 at 11:55 AM, Dave Jones <davej@codemonkey.org.uk> wrote:
 >  > >>
 >  > >> BUG: Bad page state in process kworker/u8:12  pfn:4e0e39
 >  > >> page:ffffea0013838e40 count:0 mapcount:0 mapping:ffff8804a20310e0 index:0x100c
 >  > >> flags: 0x400000000000000c(referenced|uptodate)
 >  > >> page dumped because: non-NULL mapping
 >  > >
 >  > >Hmm. So this seems to be btrfs-specific, right?
 >  > >
 >  > >I searched for all your "non-NULL mapping" cases, and they all seem to
 >  > >have basically the same call trace, with some work thread doing
 >  > >writeback and going through btrfs_writepages().
 >  > >
 >  > >Sounds like it's a race with either fallocate hole-punching or
 >  > >truncate. I'm not seeing it, but I suspect it's btrfs, since DaveJ
 >  > >clearly ran other filesystems too but I am not seeing this backtrace
 >  > >for anything else.
 >  > 
 >  > Agreed, I think this is a separate bug, almost certainly btrfs specific.  
 >  > I'll work with Dave on a better reproducer.
 > 
 > Still refining my 'capture ftrace when trinity detects taint' feature,
 > but in the meantime, here's a variant I don't think we've seen before:

And another new one:

kernel BUG at fs/btrfs/ctree.c:3172!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
CPU: 0 PID: 22702 Comm: trinity-c40 Not tainted 4.9.0-rc4-think+ #1 
task: ffff8804ffde37c0 task.stack: ffffc90002188000
RIP: 0010:[<ffffffffa00576b9>] 
  [<ffffffffa00576b9>] btrfs_set_item_key_safe+0x179/0x190 [btrfs]
RSP: 0000:ffffc9000218b8a8  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8804fddcf348 RCX: 0000000000001000
RDX: 0000000000000000 RSI: ffffc9000218b9ce RDI: ffffc9000218b8c7
RBP: ffffc9000218b908 R08: 0000000000004000 R09: ffffc9000218b8c8
R10: 0000000000000000 R11: 0000000000000001 R12: ffffc9000218b8b6
R13: ffffc9000218b9ce R14: 0000000000000001 R15: ffff880480684a88
FS:  00007f7c7f998b40(0000) GS:ffff880507800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000044f15f000 CR4: 00000000001406f0
DR0: 00007f4ce439d000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
Stack:
 ffff880501430000 d305ffffa00a2245 006c000000000002 0500000000000010
 6c000000000002d3 0000000000001000 000000006427eebb ffff880480684a88
 0000000000000000 ffff8804fddcf348 0000000000002000 0000000000000000
Call Trace:
 [<ffffffffa009cff0>] __btrfs_drop_extents+0xb00/0xe30 [btrfs]
 [<ffffffff8116c80c>] ? function_trace_call+0x13c/0x190
 [<ffffffffa009c4f5>] ? __btrfs_drop_extents+0x5/0xe30 [btrfs]
 [<ffffffff810e2b00>] ? do_raw_write_lock+0xb0/0xc0
 [<ffffffffa00cd43d>] btrfs_log_changed_extents+0x35d/0x630 [btrfs]
 [<ffffffffa00a6a74>] ? release_extent_buffer+0xa4/0x110 [btrfs]
 [<ffffffffa00cd0e5>] ? btrfs_log_changed_extents+0x5/0x630 [btrfs]
 [<ffffffffa00d1085>] btrfs_log_inode+0xb05/0x11d0 [btrfs]
 [<ffffffff8116536c>] ? trace_function+0x6c/0x80
 [<ffffffffa00d0580>] ? log_directory_changes+0xc0/0xc0 [btrfs]
 [<ffffffffa00d1a20>] ? btrfs_log_inode_parent+0x240/0x940 [btrfs]
 [<ffffffff8116c80c>] ? function_trace_call+0x13c/0x190
 [<ffffffffa00d1a20>] btrfs_log_inode_parent+0x240/0x940 [btrfs]
 [<ffffffffa00d17e5>] ? btrfs_log_inode_parent+0x5/0x940 [btrfs]
 [<ffffffff81259131>] ? dget_parent+0x71/0x150
 [<ffffffffa00d3102>] btrfs_log_dentry_safe+0x62/0x80 [btrfs]
 [<ffffffffa009f404>] btrfs_sync_file+0x344/0x4d0 [btrfs]
 [<ffffffff81278a1b>] vfs_fsync_range+0x4b/0xb0
 [<ffffffff812607c5>] ? __fget_light+0x5/0x60
 [<ffffffff81278add>] do_fsync+0x3d/0x70
 [<ffffffff81278aa5>] ? do_fsync+0x5/0x70
 [<ffffffff81278db3>] SyS_fdatasync+0x13/0x20
 [<ffffffff81002d81>] do_syscall_64+0x61/0x170
 [<ffffffff81894ccb>] entry_SYSCALL64_slow_path+0x25/0x25
Code: 48 8b 45 b7 48 8d 7d bf 4c 89 ee 48 89 45 c8 0f b6 45 b6 88 45 c7 48 8b 45 ae 48 89 45 bf e8 af f2 ff ff 85 c0 0f 8f 43 ff ff ff <0f> 0b 0f 0b e8 ee f3 02 e1 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 



Unfortunatly, because this was a BUG_ON, it locked up the box so it
didn't save any additional debug info. Tempted to see if making BUG_ON
a no-op will at least let it live long enough to save the ftrace buffer.

Given this seems to be mutating every time I see something go wrong,
I'm wondering if this is fallout from memory corruption again.

	Dave

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Chris Mason Nov. 8, 2016, 3:08 p.m. UTC | #10

On 11/08/2016 09:59 AM, Dave Jones wrote:
> On Sun, Nov 06, 2016 at 11:55:39AM -0500, Dave Jones wrote:
>  > <subject changed, hopefully we're done with bio corruption for now>
>  >
>  > On Mon, Oct 31, 2016 at 01:44:55PM -0600, Chris Mason wrote:
>  >  > On Mon, Oct 31, 2016 at 12:35:16PM -0700, Linus Torvalds wrote:
>  >  > >On Mon, Oct 31, 2016 at 11:55 AM, Dave Jones <davej@codemonkey.org.uk> wrote:
>  >  > >>
>  >  > >> BUG: Bad page state in process kworker/u8:12  pfn:4e0e39
>  >  > >> page:ffffea0013838e40 count:0 mapcount:0 mapping:ffff8804a20310e0 index:0x100c
>  >  > >> flags: 0x400000000000000c(referenced|uptodate)
>  >  > >> page dumped because: non-NULL mapping
>  >  > >
>  >  > >Hmm. So this seems to be btrfs-specific, right?
>  >  > >
>  >  > >I searched for all your "non-NULL mapping" cases, and they all seem to
>  >  > >have basically the same call trace, with some work thread doing
>  >  > >writeback and going through btrfs_writepages().
>  >  > >
>  >  > >Sounds like it's a race with either fallocate hole-punching or
>  >  > >truncate. I'm not seeing it, but I suspect it's btrfs, since DaveJ
>  >  > >clearly ran other filesystems too but I am not seeing this backtrace
>  >  > >for anything else.
>  >  >
>  >  > Agreed, I think this is a separate bug, almost certainly btrfs specific.
>  >  > I'll work with Dave on a better reproducer.
>  >
>  > Still refining my 'capture ftrace when trinity detects taint' feature,
>  > but in the meantime, here's a variant I don't think we've seen before:
>
> And another new one:
>
> kernel BUG at fs/btrfs/ctree.c:3172!
> invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> CPU: 0 PID: 22702 Comm: trinity-c40 Not tainted 4.9.0-rc4-think+ #1
> task: ffff8804ffde37c0 task.stack: ffffc90002188000
> RIP: 0010:[<ffffffffa00576b9>]
>   [<ffffffffa00576b9>] btrfs_set_item_key_safe+0x179/0x190 [btrfs]
> RSP: 0000:ffffc9000218b8a8  EFLAGS: 00010246
> RAX: 0000000000000000 RBX: ffff8804fddcf348 RCX: 0000000000001000
> RDX: 0000000000000000 RSI: ffffc9000218b9ce RDI: ffffc9000218b8c7
> RBP: ffffc9000218b908 R08: 0000000000004000 R09: ffffc9000218b8c8
> R10: 0000000000000000 R11: 0000000000000001 R12: ffffc9000218b8b6
> R13: ffffc9000218b9ce R14: 0000000000000001 R15: ffff880480684a88
> FS:  00007f7c7f998b40(0000) GS:ffff880507800000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 000000044f15f000 CR4: 00000000001406f0
> DR0: 00007f4ce439d000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
> Stack:
>  ffff880501430000 d305ffffa00a2245 006c000000000002 0500000000000010
>  6c000000000002d3 0000000000001000 000000006427eebb ffff880480684a88
>  0000000000000000 ffff8804fddcf348 0000000000002000 0000000000000000
> Call Trace:
>  [<ffffffffa009cff0>] __btrfs_drop_extents+0xb00/0xe30 [btrfs]

We've been hunting this one for at least two years.  It's the white 
whale of btrfs bugs.  Josef has a semi-reliable reproducer now, but I 
think it's not the same as the pagevec based problems you reported earlier.

-chris
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Dave Jones Nov. 10, 2016, 2:35 p.m. UTC | #11

On Tue, Nov 08, 2016 at 10:08:04AM -0500, Chris Mason wrote:
 
 > > And another new one:
 > >
 > > kernel BUG at fs/btrfs/ctree.c:3172!
 > > 
 > > Call Trace:
 > >  [<ffffffffa009cff0>] __btrfs_drop_extents+0xb00/0xe30 [btrfs]
 > 
 > We've been hunting this one for at least two years.  It's the white 
 > whale of btrfs bugs.  Josef has a semi-reliable reproducer now, but I 
 > think it's not the same as the pagevec based problems you reported earlier.

Great, now for whatever reason, I'm hitting this over and over.

Even better, after the last time I hit it, it reboot and this happened during boot..

BTRFS info (device sda6): disk space caching is enabled
BTRFS info (device sda6): has skinny extents
BTRFS info (device sda3): disk space caching is enabled
------------[ cut here ]------------
WARNING: CPU: 1 PID: 443 at fs/btrfs/file.c:546 btrfs_drop_extent_cache+0x411/0x420 [btrfs]
CPU: 1 PID: 443 Comm: mount Not tainted 4.9.0-rc4-think+ #1 
 ffffc90000c4b468  ffffffff813b66bc  0000000000000000  0000000000000000
 ffffc90000c4b4a8  ffffffff81086d2b  0000022200c4b488  000000000002f265
 40c8dded1afd6000  ffff8804ff5cddc8  ffff8804ef26f2b8  40c8dded1afd5000
Call Trace:
 [<ffffffff813b66bc>] dump_stack+0x4f/0x73
 [<ffffffff81086d2b>] __warn+0xcb/0xf0
 [<ffffffff81086e5d>] warn_slowpath_null+0x1d/0x20
 [<ffffffffa009c0f1>] btrfs_drop_extent_cache+0x411/0x420 [btrfs]
 [<ffffffff81215923>] ? alloc_debug_processing+0x73/0x1b0
 [<ffffffffa009c93f>] __btrfs_drop_extents+0x44f/0xe30 [btrfs]
 [<ffffffffa005426a>] ? btrfs_alloc_path+0x1a/0x20 [btrfs]
 [<ffffffffa005426a>] ? btrfs_alloc_path+0x1a/0x20 [btrfs]
 [<ffffffff8121842a>] ? kmem_cache_alloc+0x2aa/0x330
 [<ffffffffa005426a>] ? btrfs_alloc_path+0x1a/0x20 [btrfs]
 [<ffffffffa009e399>] btrfs_drop_extents+0x79/0xa0 [btrfs]
 [<ffffffffa00ce4d1>] replay_one_extent+0x1e1/0x710 [btrfs]
 [<ffffffffa00cec6d>] replay_one_buffer+0x26d/0x7e0 [btrfs]
 [<ffffffff8121732c>] ? ___slab_alloc.constprop.83+0x27c/0x5c0
 [<ffffffffa005426a>] ? btrfs_alloc_path+0x1a/0x20 [btrfs]
 [<ffffffff813d5b87>] ? debug_smp_processor_id+0x17/0x20
 [<ffffffffa00ca3db>] walk_up_log_tree+0xeb/0x240 [btrfs]
 [<ffffffffa00ca5d6>] walk_log_tree+0xa6/0x1d0 [btrfs]
 [<ffffffffa00d32fc>] btrfs_recover_log_trees+0x1dc/0x460 [btrfs]
 [<ffffffffa00cea00>] ? replay_one_extent+0x710/0x710 [btrfs]
 [<ffffffffa0081f65>] open_ctree+0x2575/0x2670 [btrfs]
 [<ffffffffa005144b>] btrfs_mount+0xd0b/0xe10 [btrfs]
 [<ffffffff811da804>] ? pcpu_alloc+0x2d4/0x660
 [<ffffffff810dce41>] ? lockdep_init_map+0x61/0x200
 [<ffffffff810d39eb>] ? __init_waitqueue_head+0x3b/0x50
 [<ffffffff81243794>] mount_fs+0x14/0xa0
 [<ffffffff8126305b>] vfs_kern_mount+0x6b/0x150
 [<ffffffffa0050a08>] btrfs_mount+0x2c8/0xe10 [btrfs]
 [<ffffffff811da804>] ? pcpu_alloc+0x2d4/0x660
 [<ffffffff810dce41>] ? lockdep_init_map+0x61/0x200
 [<ffffffff810dce41>] ? lockdep_init_map+0x61/0x200
 [<ffffffff810d39eb>] ? __init_waitqueue_head+0x3b/0x50
 [<ffffffff81243794>] mount_fs+0x14/0xa0
 [<ffffffff8126305b>] vfs_kern_mount+0x6b/0x150
 [<ffffffff81265bd2>] do_mount+0x1c2/0xda0
 [<ffffffff811d41c0>] ? memdup_user+0x60/0x90
 [<ffffffff81266ac3>] SyS_mount+0x83/0xd0
 [<ffffffff81002d81>] do_syscall_64+0x61/0x170
 [<ffffffff81894ccb>] entry_SYSCALL64_slow_path+0x25/0x25
---[ end trace d3fa03bb9c115bbe ]---
BTRFS: error (device sda3) in btrfs_replay_log:2491: errno=-17 Object already exists (Failed to recover log tree)
BTRFS error (device sda3): cleaner transaction attach returned -30
BTRFS error (device sda3): open_ctree failed


Guess I'll hit it with btrfsck and hope for the best..

	Dave
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Chris Mason Nov. 10, 2016, 3:27 p.m. UTC | #12

On 11/10/2016 09:35 AM, Dave Jones wrote:
> On Tue, Nov 08, 2016 at 10:08:04AM -0500, Chris Mason wrote:
>
>  > > And another new one:
>  > >
>  > > kernel BUG at fs/btrfs/ctree.c:3172!
>  > >
>  > > Call Trace:
>  > >  [<ffffffffa009cff0>] __btrfs_drop_extents+0xb00/0xe30 [btrfs]
>  >
>  > We've been hunting this one for at least two years.  It's the white
>  > whale of btrfs bugs.  Josef has a semi-reliable reproducer now, but I
>  > think it's not the same as the pagevec based problems you reported earlier.
>
> Great, now for whatever reason, I'm hitting this over and over.
>
> Even better, after the last time I hit it, it reboot and this happened during boot..
>
> BTRFS info (device sda6): disk space caching is enabled
> BTRFS info (device sda6): has skinny extents
> BTRFS info (device sda3): disk space caching is enabled
> ------------[ cut here ]------------
> WARNING: CPU: 1 PID: 443 at fs/btrfs/file.c:546 btrfs_drop_extent_cache+0x411/0x420 [btrfs]
> CPU: 1 PID: 443 Comm: mount Not tainted 4.9.0-rc4-think+ #1
>  ffffc90000c4b468  ffffffff813b66bc  0000000000000000  0000000000000000
>  ffffc90000c4b4a8  ffffffff81086d2b  0000022200c4b488  000000000002f265
>  40c8dded1afd6000  ffff8804ff5cddc8  ffff8804ef26f2b8  40c8dded1afd5000
> Call Trace:
>  [<ffffffff813b66bc>] dump_stack+0x4f/0x73
>  [<ffffffff81086d2b>] __warn+0xcb/0xf0
>  [<ffffffff81086e5d>] warn_slowpath_null+0x1d/0x20
>  [<ffffffffa009c0f1>] btrfs_drop_extent_cache+0x411/0x420 [btrfs]
>  [<ffffffff81215923>] ? alloc_debug_processing+0x73/0x1b0
>  [<ffffffffa009c93f>] __btrfs_drop_extents+0x44f/0xe30 [btrfs]
>  [<ffffffffa005426a>] ? btrfs_alloc_path+0x1a/0x20 [btrfs]
>  [<ffffffffa005426a>] ? btrfs_alloc_path+0x1a/0x20 [btrfs]
>  [<ffffffff8121842a>] ? kmem_cache_alloc+0x2aa/0x330
>  [<ffffffffa005426a>] ? btrfs_alloc_path+0x1a/0x20 [btrfs]
>  [<ffffffffa009e399>] btrfs_drop_extents+0x79/0xa0 [btrfs]
>  [<ffffffffa00ce4d1>] replay_one_extent+0x1e1/0x710 [btrfs]
>  [<ffffffffa00cec6d>] replay_one_buffer+0x26d/0x7e0 [btrfs]
>  [<ffffffff8121732c>] ? ___slab_alloc.constprop.83+0x27c/0x5c0
>  [<ffffffffa005426a>] ? btrfs_alloc_path+0x1a/0x20 [btrfs]
>  [<ffffffff813d5b87>] ? debug_smp_processor_id+0x17/0x20
>  [<ffffffffa00ca3db>] walk_up_log_tree+0xeb/0x240 [btrfs]
>  [<ffffffffa00ca5d6>] walk_log_tree+0xa6/0x1d0 [btrfs]
>  [<ffffffffa00d32fc>] btrfs_recover_log_trees+0x1dc/0x460 [btrfs]
>  [<ffffffffa00cea00>] ? replay_one_extent+0x710/0x710 [btrfs]
>  [<ffffffffa0081f65>] open_ctree+0x2575/0x2670 [btrfs]
>  [<ffffffffa005144b>] btrfs_mount+0xd0b/0xe10 [btrfs]
>  [<ffffffff811da804>] ? pcpu_alloc+0x2d4/0x660
>  [<ffffffff810dce41>] ? lockdep_init_map+0x61/0x200
>  [<ffffffff810d39eb>] ? __init_waitqueue_head+0x3b/0x50
>  [<ffffffff81243794>] mount_fs+0x14/0xa0
>  [<ffffffff8126305b>] vfs_kern_mount+0x6b/0x150
>  [<ffffffffa0050a08>] btrfs_mount+0x2c8/0xe10 [btrfs]
>  [<ffffffff811da804>] ? pcpu_alloc+0x2d4/0x660
>  [<ffffffff810dce41>] ? lockdep_init_map+0x61/0x200
>  [<ffffffff810dce41>] ? lockdep_init_map+0x61/0x200
>  [<ffffffff810d39eb>] ? __init_waitqueue_head+0x3b/0x50
>  [<ffffffff81243794>] mount_fs+0x14/0xa0
>  [<ffffffff8126305b>] vfs_kern_mount+0x6b/0x150
>  [<ffffffff81265bd2>] do_mount+0x1c2/0xda0
>  [<ffffffff811d41c0>] ? memdup_user+0x60/0x90
>  [<ffffffff81266ac3>] SyS_mount+0x83/0xd0
>  [<ffffffff81002d81>] do_syscall_64+0x61/0x170
>  [<ffffffff81894ccb>] entry_SYSCALL64_slow_path+0x25/0x25
> ---[ end trace d3fa03bb9c115bbe ]---
> BTRFS: error (device sda3) in btrfs_replay_log:2491: errno=-17 Object already exists (Failed to recover log tree)
> BTRFS error (device sda3): cleaner transaction attach returned -30
> BTRFS error (device sda3): open_ctree failed
>
>
> Guess I'll hit it with btrfsck and hope for the best..

You can zero the log if you need to.   Josef has a ton of tracing around 
this right now, so I'm hoping we nail it down very soon.

-chris

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Dave Jones Nov. 23, 2016, 7:34 p.m. UTC | #13

On Mon, Oct 31, 2016 at 01:44:55PM -0600, Chris Mason wrote:
 > On Mon, Oct 31, 2016 at 12:35:16PM -0700, Linus Torvalds wrote:
 > >On Mon, Oct 31, 2016 at 11:55 AM, Dave Jones <davej@codemonkey.org.uk> wrote:
 > >>
 > >> BUG: Bad page state in process kworker/u8:12  pfn:4e0e39
 > >> page:ffffea0013838e40 count:0 mapcount:0 mapping:ffff8804a20310e0 index:0x100c
 > >> flags: 0x400000000000000c(referenced|uptodate)
 > >> page dumped because: non-NULL mapping
 > >
 > >Hmm. So this seems to be btrfs-specific, right?
 > >
 > >I searched for all your "non-NULL mapping" cases, and they all seem to
 > >have basically the same call trace, with some work thread doing
 > >writeback and going through btrfs_writepages().
 > >
 > >Sounds like it's a race with either fallocate hole-punching or
 > >truncate. I'm not seeing it, but I suspect it's btrfs, since DaveJ
 > >clearly ran other filesystems too but I am not seeing this backtrace
 > >for anything else.
 > 
 > Agreed, I think this is a separate bug, almost certainly btrfs specific.  
 > I'll work with Dave on a better reproducer.

Ok, let's try this..
Here's the usual dump, from the current tree:


[  317.689216] BUG: Bad page state in process kworker/u8:8  pfn:4d8fd4
[  317.702235] page:ffffea001363f500 count:0 mapcount:0 mapping:ffff8804eef8e0e0 index:0x29
[  317.718068] flags: 0x400000000000000c(referenced|uptodate)
[  317.731086] page dumped because: non-NULL mapping
[  317.744118] CPU: 3 PID: 1558 Comm: kworker/u8:8 Not tainted 4.9.0-rc6-think+ #1 
[  317.770404] Workqueue: writeback wb_workfn
[  317.783490]  (flush-btrfs-3)
[  317.783581]  ffffc90000ff3798
[  317.796651]  ffffffff813b69bc
[  317.796742]  ffffea001363f500
[  317.796798]  ffffffff81c474ee
[  317.796854]  ffffc90000ff37c0
[  317.809910]  ffffffff811b30c4
[  317.810001]  0000000000000000
[  317.810055]  ffffea001363f500
[  317.810112]  0000000000000003
[  317.823230]  ffffc90000ff37d0
[  317.823320]  ffffffff811b318f
[  317.823376]  ffffc90000ff3818
[  317.823431] Call Trace:
[  317.836645]  [<ffffffff813b69bc>] dump_stack+0x4f/0x73
[  317.850063]  [<ffffffff811b30c4>] bad_page+0xc4/0x130
[  317.863426]  [<ffffffff811b318f>] free_pages_check_bad+0x5f/0x70
[  317.876805]  [<ffffffff811b61f5>] free_hot_cold_page+0x305/0x3b0
[  317.890132]  [<ffffffff811b660e>] free_hot_cold_page_list+0x7e/0x170
[  317.903410]  [<ffffffff811c08b7>] release_pages+0x2e7/0x3a0
[  317.916600]  [<ffffffff811c2067>] __pagevec_release+0x27/0x40
[  317.929817]  [<ffffffffa00b0a55>] extent_write_cache_pages.isra.44.constprop.59+0x355/0x430 [btrfs]
[  317.943073]  [<ffffffffa00b0fcd>] extent_writepages+0x5d/0x90 [btrfs]
[  317.956033]  [<ffffffffa008ea60>] ? btrfs_releasepage+0x40/0x40 [btrfs]
[  317.968902]  [<ffffffffa008b6a8>] btrfs_writepages+0x28/0x30 [btrfs]
[  317.981744]  [<ffffffff811be6d1>] do_writepages+0x21/0x30
[  317.994524]  [<ffffffff812727cf>] __writeback_single_inode+0x7f/0x6e0
[  318.007333]  [<ffffffff81894cc1>] ? _raw_spin_unlock+0x31/0x50
[  318.020090]  [<ffffffff8127356d>] writeback_sb_inodes+0x31d/0x6c0
[  318.032844]  [<ffffffff812739a2>] __writeback_inodes_wb+0x92/0xc0
[  318.045623]  [<ffffffff81273e97>] wb_writeback+0x3f7/0x530
[  318.058304]  [<ffffffff81274a38>] wb_workfn+0x148/0x620
[  318.070949]  [<ffffffff810a4b54>] ? process_one_work+0x194/0x690
[  318.083549]  [<ffffffff812748f5>] ? wb_workfn+0x5/0x620
[  318.096160]  [<ffffffff810a4bf2>] process_one_work+0x232/0x690
[  318.108722]  [<ffffffff810a4b54>] ? process_one_work+0x194/0x690
[  318.121225]  [<ffffffff810a509e>] worker_thread+0x4e/0x490
[  318.133680]  [<ffffffff810a5050>] ? process_one_work+0x690/0x690
[  318.146126]  [<ffffffff810a5050>] ? process_one_work+0x690/0x690
[  318.158507]  [<ffffffff810aa7e2>] kthread+0x102/0x120
[  318.170880]  [<ffffffff810aa6e0>] ? kthread_park+0x60/0x60
[  318.183254]  [<ffffffff81895822>] ret_from_fork+0x22/0x30


This time, I managed to get my ftrace magic working, so we have this
trace from just before this happened. Does this shed any light ?

https://codemonkey.org.uk/junk/trace.txt

(note that some of the spinlock/rcu calls will be missing here. I chose
to add them to ftrace's exclude list because they dominate the trace
buffer so much, especially in debug builds)

	Dave

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Dave Jones Nov. 23, 2016, 7:58 p.m. UTC | #14

On Wed, Nov 23, 2016 at 02:34:19PM -0500, Dave Jones wrote:

 > [  317.689216] BUG: Bad page state in process kworker/u8:8  pfn:4d8fd4
 > trace from just before this happened. Does this shed any light ?
 > 
 > https://codemonkey.org.uk/junk/trace.txt

crap, I just noticed the timestamps in the trace come from quite a bit
later.  I'll tweak the code to do the taint checking/ftrace stop after
every syscall, that should narrow the window some more.

Getting closer..

	Dave
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Dave Jones Dec. 1, 2016, 3:32 p.m. UTC | #15

On Wed, Nov 23, 2016 at 02:58:45PM -0500, Dave Jones wrote:
 > On Wed, Nov 23, 2016 at 02:34:19PM -0500, Dave Jones wrote:
 > 
 >  > [  317.689216] BUG: Bad page state in process kworker/u8:8  pfn:4d8fd4
 >  > trace from just before this happened. Does this shed any light ?
 >  > 
 >  > https://codemonkey.org.uk/junk/trace.txt
 > 
 > crap, I just noticed the timestamps in the trace come from quite a bit
 > later.  I'll tweak the code to do the taint checking/ftrace stop after
 > every syscall, that should narrow the window some more.
 > 
 > Getting closer..

Ok, this is getting more like it.
http://codemonkey.org.uk/junk/btrfs-destroy-inode-outstanding-extents.txt

Also same bug, different run, but a different traceview http://codemonkey.org.uk/junk/btrfs-destroy-inode-outstanding-extents-function-graph.txt

(function-graph screws up the RIP for some reason, 'return_to_handler'
 should actually be btrfs_destroy_inode)


Anyways, I've got some code that works pretty well for dumping the
ftrace buffer now when things go awry.  I just need to run it enough
times that I hit that bad page state instead of this, or a lockdep bug first.

	Dave

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Dave Jones Dec. 3, 2016, 4:48 p.m. UTC | #16

On Thu, Dec 01, 2016 at 10:32:09AM -0500, Dave Jones wrote:
 > http://codemonkey.org.uk/junk/btrfs-destroy-inode-outstanding-extents.txt
 > 
 > Also same bug, different run, but a different traceview http://codemonkey.org.uk/junk/btrfs-destroy-inode-outstanding-extents-function-graph.txt
 > 
 > (function-graph screws up the RIP for some reason, 'return_to_handler'
 >  should actually be btrfs_destroy_inode)

Chris pointed me at a pending patch that took care of this warning.

 > Anyways, I've got some code that works pretty well for dumping the
 > ftrace buffer now when things go awry.  I just need to run it enough
 > times that I hit that bad page state instead of this, or a lockdep bug first.

Which allowed me to run long enough to get this trace..
http://codemonkey.org.uk/junk/bad-page-state.txt

Does this shed any light ?
The interesting process here seems to be kworker/u8:17, and the trace
captures some of what that was doing before that bad page was hit.

I've got another run going now, so I'll compare that trace when it
happens to see if it matches my current theory that it's something to
do with that btrfs_scrubparity_helper. I've seen that show up in stack
traces a few times while chasing this.

	Dave

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Vegard Nossum Dec. 4, 2016, 11:04 p.m. UTC | #17

On 23 November 2016 at 20:58, Dave Jones <davej@codemonkey.org.uk> wrote:
> On Wed, Nov 23, 2016 at 02:34:19PM -0500, Dave Jones wrote:
>
>  > [  317.689216] BUG: Bad page state in process kworker/u8:8  pfn:4d8fd4
>  > trace from just before this happened. Does this shed any light ?
>  >
>  > https://codemonkey.org.uk/junk/trace.txt
>
> crap, I just noticed the timestamps in the trace come from quite a bit
> later.  I'll tweak the code to do the taint checking/ftrace stop after
> every syscall, that should narrow the window some more.

FWIW I hit this as well:

BUG: unable to handle kernel paging request at ffffffff81ff08b7
IP: [<ffffffff8135f2ea>] __lock_acquire.isra.32+0xda/0x1a30
PGD 461e067 PUD 461f063
PMD 1e001e1
Oops: 0003 [#1] PREEMPT SMP KASAN
Dumping ftrace buffer:
  (ftrace buffer empty)
CPU: 0 PID: 21744 Comm: trinity-c56 Tainted: G    B           4.9.0-rc7+ #217
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
task: ffff8801ee924080 task.stack: ffff8801bab88000
RIP: 0010:[<ffffffff8135f2ea>]  [<ffffffff8135f2ea>]
__lock_acquire.isra.32+0xda/0x1a30
RSP: 0018:ffff8801bab8f730  EFLAGS: 00010082
RAX: ffffffff81ff071f RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000003 RSI: 0000000000000000 RDI: ffffffff85ae7d00
RBP: ffff8801bab8f7b0 R08: 0000000000000001 R09: 0000000000000000
R10: ffff8801e727fc40 R11: fffffbfff0b54ced R12: ffffffff85ae7d00
R13: ffffffff84912920 R14: ffff8801ee924080 R15: 0000000000000000
FS:  00007f37ee653700(0000) GS:ffff8801f6a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffff81ff08b7 CR3: 00000001daa70000 CR4: 00000000000006f0
DR0: 00007f37ee465000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
Stack:
ffff8801ee9247d0 0000000000000000 0000000100000000 ffff8801ee924080
ffff8801f6a201c0 ffff8801f6a201c0 0000000000000000 0000000000000001
ffff880100000000 ffff880100000000 ffff8801e727fc40 ffff8801ee924080
Call Trace:
[<ffffffff81361751>] lock_acquire+0x141/0x2b0
[<ffffffff813530c0>] ? finish_wait+0xb0/0x180
[<ffffffff83c95b29>] _raw_spin_lock_irqsave+0x49/0x60
[<ffffffff813530c0>] ? finish_wait+0xb0/0x180
[<ffffffff813530c0>] finish_wait+0xb0/0x180
[<ffffffff81576227>] shmem_fault+0x4c7/0x6b0
[<ffffffff83a9b7cd>] ? p9_client_rpc+0x13d/0xf40
[<ffffffff81575d60>] ? shmem_getpage_gfp+0x1c90/0x1c90
[<ffffffff81fbe777>] ? radix_tree_next_chunk+0x4f7/0x840
[<ffffffff81352150>] ? wake_atomic_t_function+0x210/0x210
[<ffffffff815ad316>] __do_fault+0x206/0x410
[<ffffffff815ad110>] ? do_page_mkwrite+0x320/0x320
[<ffffffff815b9bcf>] handle_mm_fault+0x1cef/0x2a60
[<ffffffff815b8012>] ? handle_mm_fault+0x132/0x2a60
[<ffffffff815b7ee0>] ? __pmd_alloc+0x370/0x370
[<ffffffff81692e7e>] ? inode_add_bytes+0x10e/0x160
[<ffffffff8162de11>] ? memset+0x31/0x40
[<ffffffff815cba10>] ? find_vma+0x30/0x150
[<ffffffff812373a2>] __do_page_fault+0x452/0x9f0
[<ffffffff81ff071f>] ? iov_iter_init+0xaf/0x1d0
[<ffffffff81237bf5>] trace_do_page_fault+0x1e5/0x3a0
[<ffffffff8122a007>] do_async_page_fault+0x27/0xa0
[<ffffffff83c97618>] async_page_fault+0x28/0x30
[<ffffffff82059341>] ? strnlen_user+0x91/0x1a0
[<ffffffff8205931e>] ? strnlen_user+0x6e/0x1a0
[<ffffffff8157e038>] strndup_user+0x28/0xb0
[<ffffffff81d83c17>] SyS_add_key+0xc7/0x370
[<ffffffff81d83b50>] ? key_get_type_from_user.constprop.6+0xd0/0xd0
[<ffffffff815143ea>] ? __context_tracking_exit.part.4+0x3a/0x1e0
[<ffffffff81d83b50>] ? key_get_type_from_user.constprop.6+0xd0/0xd0
[<ffffffff8100524f>] do_syscall_64+0x1af/0x4d0
[<ffffffff83c96534>] entry_SYSCALL64_slow_path+0x25/0x25
Code: 89 4d b8 44 89 45 c0 89 4d c8 4c 89 55 d0 e8 ee c3 ff ff 48 85
c0 4c 8b 55 d0 8b 4d c8 44 8b 45 c0 4c 8b 4d b8 0f 84 c6 01 00 00 <3e>
ff 80 98 01 00 00 49 8d be 48 07 00 00 48 ba 00 00 00 00 00
RIP  [<ffffffff8135f2ea>] __lock_acquire.isra.32+0xda/0x1a30

I didn't read all the emails in this thread, the crash site looks
identical to one of the earlier traces although the caller may be
different.

I think you can rule out btrfs in any case, probably block layer as
well, since it looks like this comes from shmem.


Vegard
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Vegard Nossum Dec. 5, 2016, 11:10 a.m. UTC | #18

On 5 December 2016 at 00:04, Vegard Nossum <vegard.nossum@gmail.com> wrote:
> FWIW I hit this as well:
>
> BUG: unable to handle kernel paging request at ffffffff81ff08b7
> IP: [<ffffffff8135f2ea>] __lock_acquire.isra.32+0xda/0x1a30
> CPU: 0 PID: 21744 Comm: trinity-c56 Tainted: G    B           4.9.0-rc7+ #217
[...]

> I think you can rule out btrfs in any case, probably block layer as
> well, since it looks like this comes from shmem.

I should rather say that the VM runs on a 9p root filesystem and it
doesn't use/mount any block devices or disk-based filesystems.

I have all the trinity logs for the crash if that's useful. I tried a
couple of runs with just the (at the time) in-progress syscalls but it
didn't turn up anything interesting. Otherwise it seems like a lot of
data to go through by hand.

The crashing child seems to have just started, though, if that's relevant:

[child56:21744] [0]
open(filename="/sys/block/loop2/power/runtime_active_time",
flags=0x777b01, mode=666) = -1 (Not a directory)
[child56:21744] [1] [32BIT] sched_getattr(pid=1, param=0x7f37ec26c000,
size=3415) = -1 (Invalid argument)
[child56:21744] [2] [32BIT]
access(filename="/proc/2/task/2/net/stat/arp_cache", mode=2000) = -1
(Invalid argument)
[child56:21744] [3] getegid() = 0xfffe
[child56:21744] [4]
swapoff(path="/proc/721/task/721/net/dev_snmp6/tunl0") = -1 (Operation
not permitted)
[child56:21744] [5] timerfd_create(clockid=0x0, flags=0x0) = 439
[child56:21744] [6] pkey_mprotect(start=0x7f37ee656000, len=0,
prot=0x1000008, key=0x600000000000000) = 0
[child56:21744] [7] msync(start=0x7f37ee657000, len=0, flags=0x6) = 0
[child56:21744] [8] flock(fd=168, cmd=0xffffffc191f30b0c) = -1
(Invalid argument)
[child56:21744] [9] add_key(_type=0x437a15,
_description=0x7f37ec06c000, _payload=0x7f37ec06c000, plen=0,
ringid=0xfffffffffffffff8)

The other logfiles end thusly:

==> trinity-child0.log <==
[child0:21593] [311] faccessat(dfd=246,
filename="/proc/983/task/983/net/protocols", mode=2000) = -1 (Invalid
argument)
[child0:21593] [312] renameat(olddfd=246,
oldname="/proc/13/task/13/attr/sockcreate", newdfd=377,
newname="/proc/16/task/16/net/stat/rt_cache") = -1 (Permission denied)
[child0:21593] [313] [32BIT] readv(fd=289, vec=0x2e1a3d0, vlen=215) = 0

==> trinity-child100.log <==
[child100:21536] [439] setgid(gid=0x2a000200) = -1 (Operation not permitted)
[child100:21536] [440] waitid(which=175, upid=21587, infop=0x4,
options=3542, ru=0x7f37ec76c000) = -1 (Invalid argument)
[child100:21536] [441]
getxattr(pathname="/proc/980/task/980/net/ptype", name=0x7f37ee466000,
value=0x7f37ec26c000, size=49) = -1 (Operation not supported)

==> trinity-child101.log <==
[child101:21537] [55] getcwd(buf=0x7f37ee466000, size=4096) = 39
[child101:21537] [56] [32BIT] munlock(addr=0x7f37ee658000, len=0) = 0
[child101:21537] [57] semctl(semid=0xbd851e2b40e7df,
semnum=0x1b1b1b1b1b, cmd=0x2000000000, arg=0xcacacacaca) = -1 (Invalid
argument)

==> trinity-child102.log <==
[child102:21542] [11] readahead(fd=353, offset=2, count=249) = -1
(Invalid argument)
[child102:21542] [12] add_key(_type=0x43793f,
_description=0x7f37ec46c000, _payload=0x7f37ee658000, plen=32,
ringid=0xfffffffffffffffa) = -1 (Invalid argument)
[child102:21542] [13] time(tloc=0x7f37ee466000) = 0x584474e0

==> trinity-child103.log <==
[child103:21543] [45] dup(fildes=183) = 512
[child103:21543] [46] rt_sigpending(set=0x7f37ec86c000, sigsetsize=32)
= -1 (Invalid argument)
[child103:21543] [47] newstat(filename="/proc/587/task/587/gid_map",
statbuf=0x7f37ee466000) = 0

==> trinity-child104.log <==
[child104:21546] [49] getdents(fd=162, dirent=0x0, count=127) = -1
(Not a directory)
[child104:21546] [50] [32BIT] clock_adjtime(which_clock=0, utx=0x4) =
-1 (Bad address)
[child104:21546] [51] setsid() = 0x542a

==> trinity-child105.log <==
[child105:21547] [523] epoll_wait(epfd=244, events=0x8, maxevents=246,
timeout=-1) = -1 (Invalid argument)
[child105:21547] [524] dup2(oldfd=244, newfd=244) = 244
[child105:21547] [525] acct(name=0x7f37ec26c000) = -1 (Operation not permitted)

==> trinity-child106.log <==
[child106:19910] [136] getegid() = 0xfffe
[child106:19910] [137] munmap(addr=0x7f37ee65a000, len=4096) = 0
[child106:19910] [138] clock_nanosleep(which_clock=0x1, flags=0x1,
rqtp=0x7f37ec06c000, rmtp=0x7f37ee466000)
==> trinity-child107.log <==
[child107:21224] [994] copy_file_range(fd_in=373, off_in=0x2400e210,
fd_out=373, off_out=8, len=8, flags=0x0) = -1 (Bad file descriptor)
[child107:21224] [995] kcmp(pid1=1, pid2=21453, type=0x5,
idx1=0x787878787878, idx2=0xffffff6060606060) = -1 (Operation not
permitted)
[child107:21224] [996] [32BIT] readv(fd=365, vec=0x2e27e10, vlen=36) = 0

==> trinity-child108.log <==
[child108:21226] [759] recvfrom(fd=219, ubuf=0x7f37ec26c000, size=8,
flags=0x0, addr=0x2e1ed80, addr_len=110) = -1 (Bad file descriptor)
[child108:21226] [760] shmat(shmid=-4097, shmaddr=0x7f37ee465000,
shmflg=-195) = -1 (Invalid argument)
[child108:21226] [761] [32BIT] seccomp(op=0x0, flags=0x0, uargs=0x0)
==> trinity-child109.log <==
[child109:21552] [241] fremovexattr(fd=133, name=0x4) = -1 (Bad address)
[child109:21552] [242] sched_setattr(pid=0, uattr=0x1) = -1 (Invalid argument)
[child109:21552] [243] msgget(key=0x71717171717171,
msgflg=0xec00000000) = 0x2d805a

==> trinity-child10.log <==
[child10:21619] [120] creat(pathname="/proc/6/attr/fscreate", mode=0) = 521
[child10:21619] [121] [32BIT] sethostname(name=0x7f37ec06c000,
len=4096) = -1 (Operation not permitted)
[child10:21619] [122] add_key(_type=0x43793f,
_description=0x7f37ec66c000, _payload=0x7f37ec06c000, plen=8,
ringid=0xfffffffffffffffe) = -1 (Invalid argument)

==> trinity-child110.log <==
[child110:21554] [289] accept(fd=366, upeer_sockaddr=0x2e09f10,
upeer_addrlen=12) = -1 (Socket operation on non-socket)
[child110:21554] [290] unlinkat(dfd=147,
pathname="/proc/16/task/16/net/atm/br2684", flag=-62) = -1 (Invalid
argument)
[child110:21554] [291] [32BIT] seccomp(op=0x0, flags=0x0, uargs=0x0)
==> trinity-child111.log <==
[child111:21556] [102] getsockname(fd=292, usockaddr=0x2e10190,
usockaddr_len=110) = -1 (Socket operation on non-socket)
[child111:21556] [103] munmap(addr=0x7f37ee65a000, len=4096) = 0
[child111:21556] [104] personality(personality=1) = 0

==> trinity-child112.log <==
[child112:21241] [848] setpgid(pid=21456, pgid=9) = -1 (No such process)
[child112:21241] [849] splice(fd_in=182, off_in=0x7f37ec76c000,
fd_out=182, off_out=0x7f37ec76c008, len=4, flags=0x7) = -1 (Bad file
descriptor)
[child112:21241] [850] setgid(gid=0) = -1 (Operation not permitted)

==> trinity-child113.log <==
[child113:21560] [453] memfd_create(uname=0x7f37ee657000, flag=0x1) = 530
[child113:21560] [454] pwritev2(fd=342, vec=0x2e1e000, vlen=133,
pos_l=0xffffff1d, pos_h=-4, flags=0x0) = -1 (Illegal seek)
[child113:21560] [455] readahead(fd=338, offset=0xe000, count=4096) =
-1 (Invalid argument)

==> trinity-child114.log <==
[child114:21562] [27] [32BIT] getgroups16(gidsetsize=0xf12ec06dec,
grouplist=0x7f37ee465000) = -1 (Bad address)
[child114:21562] [28] keyctl(cmd=0x14, arg2=0xcecececece, arg3=8192,
arg4=0x7cc57bd0d8cf577, arg5=0x898989898989) = -1 (Invalid argument)
[child114:21562] [29] readlink(path="/sys/block/loop1/mq/0/pending",
buf=0x7f37ee467000, bufsiz=0x1565a3aa) = -1 (Invalid argument)

==> trinity-child115.log <==
[child115:21563] [392] eventfd(count=8) = 523
[child115:21563] [393] getegid() = 0xfffe
[child115:21563] [394] open(filename="/proc/587/stat", flags=0x795ac0,
mode=666) = -1 (Invalid argument)

==> trinity-child116.log <==
[child116:21567] [80] mbind(start=0x7f37ee65a000, len=0, mode=0x3,
nmask=0x0, maxnode=0x8000, flags=0x5) = -1 (Operation not permitted)
[child116:21567] [81] flock(fd=357, cmd=0xffffff0fe2220862) = 0
[child116:21567] [82] clock_adjtime(which_clock=15, utx=0x8) = -1
(Invalid argument)

==> trinity-child117.log <==
[child117:21570] [333] pipe(fildes=0x7f37ee466000) = 0
[child117:21570] [334] get_robust_list(pid=21439, head_ptr=0x1,
len_ptr=0x7f37ebc18000) = -1 (Operation not permitted)
[child117:21570] [335] execve(name="/proc/550/net/rt_cache",
argv=0x2e230d0, envp=0x2ba92a0)
==> trinity-child118.log <==
[child118:21572] [455] fstatat64(dfd=244,
filename="/proc/1297/task/1297/fdinfo/110", statbuf=0x8, flag=97) = -1
(Invalid argument)
[child118:21572] [456] flistxattr(fd=244, list=0x7f37ec26c000, size=8) = 0
[child118:21572] [457] ioprio_set(which=0x5d5d5d5d5d5d,
who=0x222734221582, ioprio=0xffffff5b5b5b5b5b) = -1 (Invalid argument)

==> trinity-child119.log <==
[child119:21574] [400] timerfd_create(clockid=0x0, flags=0x0) = 522
[child119:21574] [401] execveat(fd=338, name="/proc/1024/net/can-bcm",
argv=0x2bda9f0, envp=0x2e21130, flags=0x100) = 522
[child119:21574] [402] execveat(fd=244,
name="/sys/devices/platform/serial8250/tty/ttyS3/power/autosuspend_delay_ms",
argv=0x2e09b30, envp=0x2e1d0f0, flags=0x1000)
==> trinity-child11.log <==
[child11:21621] [352]
symlink(oldname="/sys/devices/virtual/sound/timer/uevent",
newname="/proc/549/task/549/net/rpc/auth.unix.gid") = -1 (File exists)
[child11:21621] [353] [32BIT] setgroups16(gidsetsize=0xffffff18,
grouplist=0x8) = -1 (Operation not permitted)
[child11:21621] [354] semtimedop(semid=0x100000de63c0d6e9, tsops=0x4,
nsops=0, timeout=0x8)
==> trinity-child120.log <==
[child120:21578] [141] execveat(fd=147,
name="/proc/2/task/2/net/llc/socket", argv=0x2e0a250, envp=0x2e10190,
flags=0x1000) = -1 (Permission denied)
[child120:21578] [142] readlinkat(dfd=330,
pathname="/sys/devices/virtual/misc/ocfs2_control/power/runtime_active_kids",
buf=0x7f37ee656000, bufsiz=4) = -1 (Invalid argument)
[child120:21578] [143] fremovexattr(fd=330, name=0x7f37ec26c000) = -1
(Permission denied)

==> trinity-child121.log <==
[child121:21263] [586] pread64(fd=237, buf=0x7f37ec76c000, count=3746,
pos=75) = 0
[child121:21263] [587] inotify_add_watch(fd=237,
pathname="/proc/974/task/974/net/atm/pvc", mask=0x5) = -1 (Invalid
argument)
[child121:21263] [588]
removexattr(pathname="/proc/974/task/974/net/sco",
name=0x7f37ec26c000) = -1 (Permission denied)

==> trinity-child122.log <==
[child122:21580] [57] getrusage(who=0x3a2a97385da52b25, ru=0x0) = -1
(Invalid argument)
[child122:21580] [58] getdents(fd=373, dirent=0x1, count=0x6cfbd4f4) =
-1 (Bad file descriptor)
[child122:21580] [59] msgsnd(msqid=-9, msgp=0x8, msgsz=59, msgflg=0x0)
= -1 (Bad address)

==> trinity-child123.log <==
[child123:21582] [289] recvmsg(fd=244, msg=0x7f37ec26c000,
flags=0x5f058) = -1 (Socket operation on non-socket)
[child123:21582] [290] poll(ufds=0x2d25880, nfds=7, timeout_msecs=0) = 5
[child123:21582] [291] getppid() = 1297

==> trinity-child124.log <==
[child124:20960] [493] uname(name=0x7f37ee466000) = 0
[child124:20960] [494] sysfs(option=0x1, arg1=-3,
arg2=0x5982b49806138008) = -1 (Bad address)
[child124:20960] [495]
execve(name="/proc/sys/net/ipv4/neigh/sit0/base_reachable_time_ms",
argv=0x2e122d0, envp=0x2bc95a0)
==> trinity-child125.log <==
[child125:21586] [15] setgroups(gidsetsize=12, grouplist=0x0) = -1
(Operation not permitted)
[child125:21586] [16] [32BIT] oldumount(name=0x7f37ec46c000) = -1
(Operation not permitted)
[child125:21586] [17] utimes(filename="/proc/5/net/anycast6",
utimes=0x8) = -1 (Bad address)

==> trinity-child126.log <==
[child126:21587] [29] shmdt(shmaddr=0x7f37ec66c000) = -1 (Invalid argument)
[child126:21587] [30]
setxattr(pathname="/sys/devices/pci0000:00/0000:00:03.0/net/eth0/power/runtime_active_kids",
name=0x4, value=0x7f37ec46c000, size=0xfffd, flags=0x1) = -1 (Bad
address)
[child126:21587] [31] chroot(filename="/proc/1240/net/hci") = -1 (Not
a directory)

==> trinity-child127.log <==
[child127:21588] [290] clock_gettime(which_clock=6156, tp=0x0) = -1
(Invalid argument)
[child127:21588] [291] statfs(pathname="/proc/1252/net/ipv6_route",
buf=0x7f37ee659000) = 0
[child127:21588] [292] msync(start=0x7f37ee658000, len=0, flags=0x6) = 0

==> trinity-child12.log <==
[child12:21625] [54] getsid(pid=21655) = 0
[child12:21625] [55] mlock2(start=0x7f37ec06c000, len=0x136000) = -1
(Cannot allocate memory)
[child12:21625] [56] dup3(oldfd=180, newfd=180, flags=0x80000) = -1
(Invalid argument)

==> trinity-child13.log <==
[child13:21627] [32] add_key(_type=0x437a15,
_description=0x7f37ee466001, _payload=0x7f37ee466000, plen=0x40000,
ringid=0xfffffffffffffffe) = -1 (Invalid argument)
[child13:21627] [33] mknodat(dfd=243,
filename="/proc/553/task/553/attr/fscreate", mode=0xf63d64273f4d42,
dev=-10) = -1 (Operation not permitted)
[child13:21627] [34] mq_timedreceive(mqdes=328,
u_msg_ptr=0x7f37ee466000, msg_len=0xd1d1d1, u_msg_prio=0x7f37ec06c000,
u_abs_timeout=0x7f37ee466000) = -1 (Invalid argument)

==> trinity-child14.log <==
[child14:21633] [194] msgsnd(msqid=0x4e251795f0, msgp=0x0, msgsz=8,
msgflg=0x0) = -1 (Bad address)
[child14:21633] [195] ppoll(ufds=0x2c4b530, nfds=7, tsp=0x2e09fc0,
sigmask=0x7f37ee465000, sigsetsize=128) = -1 (Invalid argument)
[child14:21633] [196] faccessat(dfd=251,
filename="/proc/15/task/15/net/stat/nf_conntrack", mode=40) = -1
(Invalid argument)

==> trinity-child15.log <==
[child15:21634] [210] timer_gettime(timer_id=1, setting=0x4) = -1
(Invalid argument)
[child15:21634] [211] lgetxattr(pathname="/proc/551/setgroups",
name=0x4, value=0xc, size=1) = -1 (Bad address)
[child15:21634] [212] splice(fd_in=265, off_in=0x7f37ec26c000,
fd_out=265, off_out=0x7f37ec26c008, len=1, flags=0x4) = -1 (Bad file
descriptor)

==> trinity-child16.log <==
[child16:21640] [307]
sched_get_priority_max(policy=0x7fffffff08000000) = -1 (Invalid
argument)
[child16:21640] [308] getcpu(cpup=0x7f37ee465000,
nodep=0x7f37ee465004, unused=0x7f37ee465008) = 0
[child16:21640] [309] sendfile(out_fd=367, in_fd=367, offset=0x0,
count=0) = -1 (Bad file descriptor)

==> trinity-child17.log <==
[child17:21642] [227]
llistxattr(pathname="/proc/16/task/16/net/route", list=0x8, size=4) =
0
[child17:21642] [228] msync(start=0x7f37ec66c000, len=16384, flags=0x6) = 0
[child17:21642] [229] [32BIT]
execve(name="/proc/1024/net/atm/devices", argv=0x2c32e20,
envp=0x2dfb5a0) = 0

==> trinity-child18.log <==
[child18:21644] [55] execve(name="/proc/971/task/971/net/icmp",
argv=0x2d25880, envp=0x2b996a0) = -1 (Permission denied)
[child18:21644] [56] utime(filename="/proc/974/net/raw",
times=0x7f37ee465000) = -1 (Operation not permitted)
[child18:21644] [57] umask(mask=0xfffffff7) = 18

==> trinity-child19.log <==
[child19:21651] [208] [32BIT] move_pages(pid=21604, nr_pages=145,
pages=0x2e1a3d0, nodes=0x2e1b3e0, status=0x2e1b630, flags=0x2) = -1
(Bad address)
[child19:21651] [209] timer_getoverrun(timer_id=-2) = -1 (Invalid argument)
[child19:21651] [210] rename(oldname="/proc/14/cmdline",
newname="/proc/1041/task/1041/net/can/rcvlist_all") = -1 (No such file
or directory)

==> trinity-child1.log <==
[child1:19992] [1361]
lsetxattr(pathname="/proc/1297/task/1297/fdinfo/96", name=0x1,
value=0x7f37ee466000, size=0, flags=0x0) = -1 (Bad address)
[child1:19992] [1362] setgid(gid=0xf0f0f0f0f) = -1 (Operation not permitted)
[child1:19992] [1363] clock_nanosleep(which_clock=0x1, flags=0x0,
rqtp=0x7f37ee659000, rmtp=0x7f37ee465000)
==> trinity-child20.log <==
[child20:21027] [1852] sendto(fd=244, buff=0x2df7410, len=1,
flags=0x800436c8, addr=0x2df7460, addr_len=16) = -1 (Socket operation
on non-socket)
[child20:21027] [1853] setresgid(rgid=0, egid=0x4848, sgid=0xdddddddd)
= -1 (Operation not permitted)
[child20:21027] [1854] poll(ufds=0x2e1fb80, nfds=8, timeout_msecs=0) = 6

==> trinity-child21.log <==
[child21:21330] [663]
statfs(pathname="/sys/block/md0/queue/scheduler", buf=0x7f37ec66c000)
= 0
[child21:21330] [664] sched_setaffinity(pid=1, len=1028,
user_mask_ptr=0x7f37ec46c000) = -1 (Operation not permitted)
[child21:21330] [665] [32BIT] seccomp(op=0x0, flags=0x0, uargs=0x0)
==> trinity-child22.log <==
[child22:21652] [61] pkey_alloc(flags=0, init_val=0x0) = -1 (No space
left on device)
[child22:21652] [62] fsync(fd=330) = 0
[child22:21652] [63] [32BIT] seccomp(op=0x0, flags=0x0, uargs=0x0)
==> trinity-child23.log <==
[child23:21655] [237] sync_file_range(fd=244, offset=191, nbytes=1,
flags=0xffffffffffffff9) = -1 (Invalid argument)
[child23:21655] [238] io_submit(ctx_id=4088, nr=1,
iocbpp=0x7f37ee658000) = -1 (Invalid argument)
[child23:21655] [239] flistxattr(fd=367, list=0x7f37ee656000, size=8)
==> trinity-child24.log <==
[child24:21657] [86] [32BIT] personality(personality=0x3fde3e4912ec) =
0x82604841
[child24:21657] [87]
mkdir(pathname="/proc/924/task/924/net/atalk/interface", mode=0) = -1
(File exists)
[child24:21657] [88] [32BIT] fchownat(dfd=381,
filename="/sys/module/nf_conntrack/parameters/nf_conntrack_helper",
user=0x1804802345, group=0xffffffff, flag=-3) = -1 (Invalid argument)

==> trinity-child25.log <==
[child25:21658] [157] timerfd_settime(ufd=201, flags=0x80800,
utmr=0x1, otmr=0x9) = -1 (Bad address)
[child25:21658] [158] rt_tgsigqueueinfo(tgid=0xc0da20044837,
pid=21330, sig=7099, uinfo=0x7f37ee466000) = -1 (Bad address)
[child25:21658] [159] pkey_mprotect(start=0x7f37ec76c000, len=0x79000,
prot=0x0, key=0x5120a54c676961) = -1 (Invalid argument)

==> trinity-child26.log <==
[child26:21665] [20] fdatasync(fd=217) = -1 (Invalid argument)
[child26:21665] [21] fsetxattr(fd=137, name=0x4, value=0x7f37ec26c000,
size=-9, flags=0x2) = -1 (Bad address)
[child26:21665] [22] open_by_handle_at(mountdirfd=137,
handle=0x7f37ee465000, flags=0x5b33c0) = -1 (Operation not permitted)

==> trinity-child27.log <==
[child27:21668] [190] preadv(fd=381, vec=0x2e1b3e0, vlen=233,
pos_l=0x710702af1dfa, pos_h=0xc976c19d4f) = 0
[child27:21668] [191] [32BIT] mlock(addr=0x7f37ec76c000, len=0xce000)
= -1 (Cannot allocate memory)
[child27:21668] [192] [32BIT] getsockopt(fd=381,
level=0x54100210202640, optname=0x9a9a9a9a9a9a, optval=0x7f37ee466000,
optlen=8) = -1 (Socket operation on non-socket)

==> trinity-child28.log <==
[child28:1813] [202] sched_setscheduler(pid=0, policy=0x5,
param=0x7f37ee465000) = 0
[child28:1813] [203] setresuid(ruid=0x80020293a4aa482, euid=247,
suid=0xffffffff81000002) = -1 (Operation not permitted)
[child28:1813] [204] epoll_ctl(epfd=332, op=0x3, fd=332,
event=0x2e0a150) [child28:1813] [204]
newstat(filename="/proc/9/task/9/net/atalk", statbuf=0x8)
[child28:1813] [204] newstat(filename="/proc/9/task/9/net/atalk",
statbuf=0x8) [child28:1813] [204]
statfs(pathname="/proc/989/task/989/attr/prev", buf=0x7f37ee466000)
[child28:1813] [204] newstat(filename="/proc/9/task/9/net/atalk",
statbuf=0x8) [child28:1813] [204]
statfs(pathname="/proc/989/task/989/attr/prev", buf=0x7f37ee466000)
[child28:1813] [204] timerfd_create(clockid=0x1, flags=0x800)
[child28:1813] [204] newstat(filename="/proc/9/task/9/net/atalk",
statbuf=0x8) [child28:1813] [204]
statfs(pathname="/proc/989/task/989/attr/prev", buf=0x7f37ee466000)
[child28:1813] [204] timerfd_create(clockid=0x1, flags=0x800)
[child28:1813] [204] process_vm_readv(pid=1, lvec=0x2e21d30,
liovcnt=48, rvec=0x2e22040, riovcnt=23, flags=0x0)
==> trinity-child29.log <==
= 0
[child29:1329] [68] [32BIT] getitimer(which=0xfffffffb,
value=0x7f37ee465000) [child29:1329] [68]
setpriority(which=0x8000000000, who=4, niceval=-2) [child29:1329] [68]
rt_sigprocmask(how=-9, set=0x0, oset=0x7f37ee466000, sigsetsize=128) =
-1 (Invalid argument)
[child29:1329] [69] shmat(shmid=0xfffffff9, shmaddr=0x1, shmflg=0x55d10db3)
==> trinity-child2.log <==
[child2:21596] [87] accept4(fd=189, upeer_sockaddr=0x2bda970,
upeer_addrlen=110, flags=0x80800) = -1 (Bad file descriptor)
[child2:21596] [88]
access(filename="/proc/1134/task/1134/net/llc/socket", mode=2060) = -1
(Invalid argument)
[child2:21596] [89] sendfile(out_fd=189, in_fd=172,
offset=0x7f37ee465000, count=4096) = -1 (Bad file descriptor)

==> trinity-child30.log <==
[child30:9344] [334] sendmsg(fd=0, msg=0x2c4b530, flags=0x5ba83) = -1
(Socket operation on non-socket)
[child30:9344] [335] signalfd(ufd=382, user_mask=0x7f37ee65a000,
sizemask=4) = -1 (Invalid argument)
[child30:9344] [336] listen(fd=206, backlog=0x4649119809)
[child30:9344] [336] io_getevents(ctx_id=0x372b076d67, min_nr=0xfff8,
nr=4096, events=0x7f37ec86c000, timeout=0x7f37ee65a000)
==> trinity-child31.log <==
[child31:2333] [416] munlockall() = 0
[child31:2333] [417] fgetxattr(fd=189, name=0x0, value=0x7f37ec76c000,
size=252) = -1 (Bad file descriptor)
[child31:2333] [418] fgetxattr(fd=308, name=0x4, value=0x7f37ec26c000,
size=8) [child31:2333] [418] clock_adjtime(which_clock=15, utx=0x1)
[child31:2333] [418] clock_adjtime(which_clock=15, utx=0x1)
[child31:2333] [418] timer_gettime(timer_id=0xf9f9f9f9f9f9,
setting=0x1) [child31:2333] [418] socket(family=32, type=0x80800,
protocol=10) [child31:2333] [418] socket(family=32, type=0x80800,
protocol=10) [child31:2333] [418] dup2(oldfd=371, newfd=363)
[child31:2333] [418] socket(family=32, type=0x80800, protocol=10)
[child31:2333] [418] dup2(oldfd=371, newfd=363) [child31:2333] [418]
unlink(pathname="/proc/1209/task/1209/net/atm/pvc") [child31:2333]
[418] openat(dfd=213, filename="/proc/971/task/971/net/atm",
flags=0x141a00, mode=712) [child31:2333] [418]
timer_gettime(timer_id=0x900040400186, setting=0x4) [child31:2333]
[418] getxattr(pathname="/proc/1023/net/stat/ndisc_cache",
name=0x7f37ec86c000, value=0x7f37ec86c001, size=259) [child31:2333]
[418] getxattr(pathname="/proc/1023/net/stat/ndisc_cache",
name=0x7f37ec86c000, value=0x7f37ec86c001, size=259) [child31:2333]
[418] open(filename="/sys/module/l2tp_ip", flags=0x80400, mode=211)
[child31:2333] [418] umask(mask=0xa10000004d483090) [child31:2333]
[418] umask(mask=0xa10000004d483090) [child31:2333] [418]
setgroups(gidsetsize=0, grouplist=0x4) [child31:2333] [418]
getgroups(gidsetsize=0xd4d4d4d4d4d4, grouplist=0x8) [child31:2333]
[418] getgroups(gidsetsize=0xd4d4d4d4d4d4, grouplist=0x8)
[child31:2333] [418] io_cancel(ctx_id=0xffffffff00000004,
iocb=0x7f37ee465000, result=0x4fc0000000) [child31:2333] [418]
getgroups(gidsetsize=0xd4d4d4d4d4d4, grouplist=0x8) [child31:2333]
[418] io_cancel(ctx_id=0xffffffff00000004, iocb=0x7f37ee465000,
result=0x4fc0000000) [child31:2333] [418] ioperm(from=221,
num=0x1786c7a03485c3c0, turn_on=176) [child31:2333] [418]
fadvise64(fd=271, offset=0x2afb1db, len=449, advice=0x5)
[child31:2333] [418] getpriority(which=0x1, who=2622) [child31:2333]
[418] signalfd(ufd=285, user_mask=0x7f37ee466000, sizemask=0)
[child31:2333] [418] setresuid(ruid=40, euid=0xffff, suid=-7)
==> trinity-child32.log <==
[child32:21672] [199] getresuid(ruid=0x7f37ec26c000,
euid=0x7f37ee65a000, suid=0x7f37ee465000) = 0
[child32:21672] [200] getrlimit(resource=0xa, rlim=0x4) = -1 (Bad address)
[child32:21672] [201] [32BIT] signalfd(ufd=367,
user_mask=0x7f37ec46c000, sizemask=32)
==> trinity-child33.log <==
[child33:21677] [231] mq_getsetattr(mqdes=244,
u_mqstat=0x7f37ee658000, u_omqstat=0x7f37ee658008) = -1 (Invalid
argument)
[child33:21677] [232] gettid() = 0x54ad
[child33:21677] [233] timer_create(which_clock=161,
timer_event_spec=0x7f37ec86c000, create_timer_id=0x7f37ec86c008)
==> trinity-child34.log <==
[child34:21679] [186] sched_rr_get_interval(pid=0, interval=0x7f37ee466000) = 0
[child34:21679] [187] [32BIT] epoll_pwait(epfd=244,
events=0x7f37ee657000, maxevents=0x40300000a, timeout=562) = -1
(Invalid argument)
[child34:21679] [188] getpriority(which=0x0, who=1)
==> trinity-child35.log <==
[child35:14725] [92] getrandom(buf=0x7f37ee656000, count=0, flags=0x3)
= -1 (Invalid argument)
[child35:14725] [93] setgid(gid=-4) = -1 (Operation not permitted)
[child35:14725] [94] execveat(fd=210, name="/proc/942/net/hidp",
argv=0x2bda970, envp=0x2e10190, flags=0x0) = -1 (Operation not
permitted)

==> trinity-child36.log <==
[child36:21681] [167] sethostname(name=0x7f37ec76c000, len=227) = -1
(Operation not permitted)
[child36:21681] [168] munlock(addr=0x7f37ec86c000, len=0x48000) = 0
[child36:21681] [169] execve(name="/proc/980/task/980/net/udplite",
argv=0x2e0a1a0, envp=0x2e10190)
==> trinity-child37.log <==
[child37:21682] [33] setgid(gid=0x6c6c6c6c) = -1 (Operation not permitted)
[child37:21682] [34] syslog(type=0x3, buf=0x7f37ee656000, len=0) = 0
[child37:21682] [35] pwrite64(fd=335, buf=0x2e204b0, count=1220,
pos=56) = -1 (Bad file descriptor)

==> trinity-child38.log <==
[child38:21689] [107] setpriority(which=0xafafafafafafafaf,
who=0xffffffffffff0000, niceval=0x400000000) = -1 (Invalid argument)
[child38:21689] [108] newstat(filename="/proc/983/cmdline",
statbuf=0x7f37ec66c000) = 0
[child38:21689] [109] mq_getsetattr(mqdes=367, u_mqstat=0x0, u_omqstat=0x0)
==> trinity-child39.log <==
[child39:20422] [556] adjtimex(txc_p=0x7f37ee466000) = -1 (Operation
not permitted)
[child39:20422] [557] ioperm(from=0, num=0xd000, turn_on=55) = -1
(Operation not permitted)
[child39:20422] [558] clock_nanosleep(which_clock=0x2, flags=0x0,
rqtp=0x7f37ec66c000, rmtp=0x7f37ec66c000)
==> trinity-child3.log <==
[child3:21597] [146] prlimit64(pid=1, resource=12, new_rlim=0x4,
old_rlim=0xc) = -1 (Bad address)
[child3:21597] [147] mq_unlink(u_name=0x7f37ec46c000) = -1 (No such
file or directory)
[child3:21597] [148] [32BIT] rt_sigprocmask(how=0x1616161616161617,
set=0x7f37ec76c000, oset=0x7f37ee465000, sigsetsize=128) = -1 (Invalid
argument)

==> trinity-child40.log <==
[child40:21692] [101] getgid() = 0xfffe
[child40:21692] [102]
lsetxattr(pathname="/proc/14/task/14/net/udplite6",
name=0x7f37ee657000, value=0x7f37ee465000, size=4096, flags=0x0) = -1
(Permission denied)
[child40:21692] [103] pwritev2(fd=367, vec=0x2e1a3d0, vlen=92,
pos_l=15, pos_h=0xffa4cdcdd0ee0601, flags=0x5)
==> trinity-child41.log <==
[child41:3589] [16] sched_get_priority_max(policy=0x34000000000000c3)
[child41:3589] [16] timerfd_settime(ufd=308, flags=0x80800,
utmr=0x7f37ec66c000, otmr=0x7f37ee466000) [child41:3589] [16]
timerfd_settime(ufd=326, flags=0x0, utmr=0x7f37ee465000,
otmr=0x7f37ec86c000) = -1 (File exists)
[child41:3589] [16] sched_get_priority_max(policy=0x34000000000000c3)
[child41:3589] [16] timerfd_settime(ufd=308, flags=0x80800,
utmr=0x7f37ec66c000, otmr=0x7f37ee466000) [child41:3589] [16]
timerfd_settime(ufd=326, flags=0x0, utmr=0x7f37ee465000,
otmr=0x7f37ec86c000) [child41:3589] [16]
getresuid(ruid=0x7f37ee467000, euid=0x7f37ee467000,
suid=0x7f37ec06c000) [child41:3589] [16] getrusage(who=0x703a9e50fc,
ru=0x8) [child41:3589] [16] munlockall() = 0
[child41:3589] [17]
chmod(filename="/proc/sys/net/ipv6/route/gc_min_interval", mode=4)
==> trinity-child42.log <==
[child42:21696] [165] getrusage(who=-5, ru=0x0) = -1 (Invalid argument)
[child42:21696] [166] fremovexattr(fd=168, name=0x7f37ee465000) = -1
(Numerical result out of range)
[child42:21696] [167] madvise(start=0x7f37ec06c000, len_in=0xdb000,
advice=0x9) = 0

==> trinity-child43.log <==
[child43:21700] [106] readv(fd=244, vec=0x2e1a3d0, vlen=246) = 0
[child43:21700] [107]
swapoff(path="/sys/devices/virtual/tty/tty26/power/runtime_usage") =
-1 (Operation not permitted)
[child43:21700] [108] linkat(olddfd=244,
oldname="/sys/devices/virtual/workqueue/writeback/nice", newdfd=244,
newname="/proc/1014/net/nfsfs/volumes", flags=0x0) = -1 (File exists)

==> trinity-child44.log <==
[child44:21703] [94] mkdir(pathname="/sys/module/pstore/parameters",
mode=0) = -1 (File exists)
[child44:21703] [95] sendfile(out_fd=367, in_fd=137, offset=0x1,
count=8) = -1 (Bad address)
[child44:21703] [96] [32BIT] sync_file_range(fd=367,
offset=0x707070707, nbytes=0x8000, flags=0xffffefff)
==> trinity-child45.log <==
[child45:21704] [1] fdatasync(fd=381) = -1 (Invalid argument)
[child45:21704] [2] munmap(addr=0x7f37ee656000, len=4096) = 0
[child45:21704] [3] pipe2(fildes=0x7f37ec46c000, flags=0x4800) = 0

==> trinity-child46.log <==
[child46:21705] [82]
rmdir(pathname="/sys/devices/tracepoint/power/runtime_active_kids") =
-1 (Permission denied)
[child46:21705] [83] chroot(filename="/proc/15/task/15/net/tcp") = -1
(Not a directory)
[child46:21705] [84] [32BIT] ioprio_set(which=-2, who=2048,
ioprio=0x646464646464) = -1 (Invalid argument)

==> trinity-child47.log <==
[child47:21710] [127] setrlimit(resource=0x7f37ee465000, rlim=16384) =
-1 (Bad address)
[child47:21710] [128] setns(fd=244, nstype=0x0) = -1 (Invalid argument)
[child47:21710] [129] pkey_alloc(flags=0, init_val=0x2)
==> trinity-child48.log <==
[child48:21717] [39]
access(filename="/proc/10/task/10/net/ip_vs_stats", mode=700) = -1
(Invalid argument)
[child48:21717] [40] [32BIT] sigprocmask(how=233, set=0x7f37ec46c000,
oset=-9) = -1 (Bad address)
[child48:21717] [41] [32BIT] seccomp(op=0x0, flags=0x0, uargs=0x0)
==> trinity-child49.log <==
[child49:21718] [81] sendto(fd=168, buff=0x2e0a540, len=1,
flags=0x41110, addr=0x2ba92a0, addr_len=28) = -1 (Socket operation on
non-socket)
[child49:21718] [82] clock_gettime(which_clock=0x430d4f678edd, tp=0x8)
= -1 (Invalid argument)
[child49:21718] [83] removexattr(pathname="/proc/13/attr/prev",
name=0x7f37ec06c000)
==> trinity-child4.log <==
[child4:21602] [155] getgroups(gidsetsize=0, grouplist=0x1) = 1
[child4:21602] [156] sendmsg(fd=0, msg=0x2c22c50, flags=0x40010883) =
-1 (Socket operation on non-socket)
[child4:21602] [157] fchown(fd=330, user=0x20240098000041,
group=0x20000) = -1 (Operation not permitted)

==> trinity-child50.log <==
[child50:21719] [41] shmctl(shmid=0x4c4c4c4c4c4c4c, cmd=0x2, buf=0x0)
= -1 (Invalid argument)
[child50:21719] [42] timer_delete(timer_id=0x80000000) = -1 (Invalid argument)
[child50:21719] [43] madvise(start=0x7f37ec86c000, len_in=0x20000, advice=0x2)
==> trinity-child51.log <==
[child51:21723] [91] msgctl(msqid=0x9f054e8b21, cmd=0x3,
buf=0x7f37ee466000) = 1763
[child51:21723] [92] time(tloc=0x7f37ee466000) = 0x584474e5
[child51:21723] [93] timer_gettime(timer_id=1, setting=0x7f37ee466000)
==> trinity-child52.log <==
[child52:21728] [70] epoll_ctl(epfd=244, op=0x3, fd=244,
event=0x2e0a2f0) = -1 (Operation not permitted)
[child52:21728] [71] [32BIT] statfs(pathname="/proc/548/net/atm/vc",
buf=0x7f37ec46c000) = -1 (Bad address)
[child52:21728] [72] rt_sigaction(sig=52, act=0x7f37ee465000,
oact=0x7f37ec06c000, sigsetsize=128) = -1 (Invalid argument)

==> trinity-child53.log <==
[child53:21734] [64] lseek(fd=244, offset=0, whence=0x1) = 33
[child53:21734] [65] preadv(fd=244, vec=0x2e10190, vlen=17,
pos_l=0x2714319808dd0822, pos_h=128) = 0
[child53:21734] [66] epoll_create1(flags=0x0) = 516

==> trinity-child54.log <==
[child54:21738] [50] ioprio_get(which=0, who=10023) = -1 (Invalid argument)
[child54:21738] [51] [32BIT] setregid(rgid=0x1000000000000,
egid=0xffff) = -1 (Operation not permitted)
[child54:21738] [52]
execve(name="/proc/1240/net/rpc/auth.rpcsec.context", argv=0x2e10190,
envp=0x2c6bd50)
==> trinity-child55.log <==
[child55:21741] [44] recvmsg(fd=381, msg=0x1, flags=0x20000000) = -1
(Socket operation on non-socket)
[child55:21741] [45] [32BIT] setgroups(gidsetsize=0xf9f9f9f9f9f9f9,
grouplist=0x0) = -1 (Operation not permitted)
[child55:21741] [46] execveat(fd=168, name="/proc/irq/4/smp_affinity",
argv=0x2e10190, envp=0x2e00900, flags=0x100)
==> trinity-child56.log <==
[child56:21744] [7] msync(start=0x7f37ee657000, len=0, flags=0x6) = 0
[child56:21744] [8] flock(fd=168, cmd=0xffffffc191f30b0c) = -1
(Invalid argument)
[child56:21744] [9] add_key(_type=0x437a15,
_description=0x7f37ec06c000, _payload=0x7f37ec06c000, plen=0,
ringid=0xfffffffffffffff8)
==> trinity-child57.log <==
[child57:21746] [12] [32BIT] mq_open(u_name=0x7f37ec86c000,
oflag=8192, mode=2000, u_attr=0x7f37ec86c004) = -1 (Bad address)
[child57:21746] [13] getpgid(pid=21226) = 0x52ea
[child57:21746] [14] execveat(fd=168, name="/proc/1297/task/1297",
argv=0x2bda970, envp=0x2ba92a0, flags=0x0)
==> trinity-child58.log <==
[child58:21749] [33] eventfd2(count=4, flags=0x0) = 441
[child58:21749] [34] mknodat(dfd=244,
filename="/sys/devices/pci0000:00/0000:00:03.0/net/eth0/queues",
mode=0x17104124, dev=0xffffffff0000567f) = -1 (Operation not
permitted)
[child58:21749] [35] clock_getres(which_clock=158, tp=0x4) = -1
(Invalid argument)

==> trinity-child59.log <==
[child59:21754] [21] setfsgid(gid=0x40000000000000) = 0xfffe
[child59:21754] [22] getpgrp() = 0
[child59:21754] [23] ppoll(ufds=0x2e0a5d0, nfds=0, tsp=0x2e0a590,
sigmask=0x0, sigsetsize=128)
==> trinity-child5.log <==
[child5:21604] [287] seccomp(op=0x1, flags=0x0, uargs=0x1) = 0
[child5:21604] [288] rt_sigpending(set=0x7f37ee466000,
sigsetsize=0x2323232323232323) = -1 (Invalid argument)
[child5:21604] [289] getuid() = 0xfffe

==> trinity-child60.log <==
[child60:21758] [0] getsid(pid=1) = 0
[child60:21758] [1] io_cancel(ctx_id=255, iocb=0x7f37ec46c000,
result=0x7f37ec46c000) = -1 (Invalid argument)
[child60:21758] [2] [32BIT] recvmmsg(fd=367, mmsg=0x1, vlen=4,
flags=0x0, timeout=0x7f37ec46c000)
==> trinity-child61.log <==
[child61:21760] [5] open(filename="/proc/1240/net/nfsfs",
flags=0x331702, mode=666) = 444
[child61:21760] [6] recvmmsg(fd=244, mmsg=0x7f37ec66c000, vlen=1,
flags=0x0, timeout=0x7f37ee466000) = -1 (Socket operation on
non-socket)
[child61:21760] [7] swapon(path=".//proc/3/task/3/net/ip_vs_conn/",
swap_flags=0x0)
==> trinity-child62.log <==
[child62:21428] [668] memfd_create(uname=0x7f37ee465000, flag=0x1) = 546
[child62:21428] [669]
newstat(filename="/sys/devices/virtual/net/sit0/statistics/rx_dropped",
statbuf=0x8) = -1 (Bad address)
[child62:21428] [670] [32BIT] sigpending(set=0x7f37ee467000) = -1 (Bad address)

==> trinity-child63.log <==
[child63:21431] [151] getgid() = 0xfffe
[child63:21431] [152] set_mempolicy(mode=0xffffffff, nmask=0x0,
maxnode=0xbbbbbbbbbbbbbbbb) = -1 (Invalid argument)
[child63:21762] [0] ioperm(from=0xb5b5b5b5b5b5b5b6, num=514, turn_on=135)
==> trinity-child64.log <==
[child64:21432] [754] utime(filename="/proc/552/net/can/rcvlist_eff",
times=0x4) = -1 (Bad address)
[child64:21432] [755] fcntl(fd=244, cmd=0x406, arg=244) = 540
[child64:21432] [756] mlock2(start=0x7f37ec06c000, len=0x9d000) = -1
(Cannot allocate memory)

==> trinity-child65.log <==
[child65:21435] [140] timer_create(which_clock=0x10000000,
timer_event_spec=0x1, create_timer_id=0x7f37ec46c000) = -1 (Invalid
argument)
[child65:21435] [141] getpeername(fd=152, usockaddr=0x2e0a1a0,
usockaddr_len=16) = -1 (Socket operation on non-socket)
[child65:21435] [142] ioperm(from=0x54ee6fb85625, num=1, turn_on=128)
= -1 (Invalid argument)

==> trinity-child66.log <==
[child66:21438] [310] msgctl(msqid=0x12121212121212, cmd=0x1,
buf=0x7f37ee465000) = -1 (Invalid argument)
[child66:21438] [311] set_robust_list(head=0x7f37ee658000, len=24) = 0
[child66:21438] [312] msgget(key=0x10000000,
msgflg=0x80000000000000d1) = -1 (Permission denied)

==> trinity-child67.log <==
[child67:21439] [184] pkey_alloc(flags=0, init_val=0x0) = -1 (Invalid argument)
[child67:21439] [185] getpgrp() = 0
[child67:21439] [186] [32BIT] getpgrp() = 0

==> trinity-child68.log <==
[child68:21443] [90] shmdt(shmaddr=0x7f37ee465000) = 0
[child68:21443] [91] remap_file_pages(start=0x7f37ec86c000, size=4096,
prot=0, pgoff=0, flags=0x10000) = 0
[child68:21443] [92] [32BIT] get_mempolicy(policy=0x7f37ee467000,
nmask=0x7f37ee467001, maxnode=147, addr=0x7f37ee466000, flags=0x0) =
-1 (Invalid argument)

==> trinity-child69.log <==
[child69:21444] [283] signalfd(ufd=139, user_mask=0x1, sizemask=8) =
-1 (Invalid argument)
[child69:21444] [284]
utimes(filename="/sys/devices/virtual/tty/tty50/power/runtime_status",
utimes=0x7f37ee465000) = -1 (Invalid argument)
[child69:21444] [285] fremovexattr(fd=139, name=0x7f37ee465000) = -1
(Permission denied)

==> trinity-child6.log <==
[child6:21605] [7] msync(start=0x7f37ee656000, len=0, flags=0x1) = 0
[child6:21605] [8] mq_timedreceive(mqdes=324,
u_msg_ptr=0x7f37ec06c000, msg_len=0x41da, u_msg_prio=0x7f37ee656000,
u_abs_timeout=0x7f37ee657000) = -1 (Bad file descriptor)
[child6:21605] [9] [32BIT] sethostname(name=0x7f37ec76c000,
len=0x85b19) = -1 (Operation not permitted)

==> trinity-child70.log <==
[child70:21447] [654] inotify_init1(flags=0x80000) = 547
[child70:21447] [655] setns(fd=244, nstype=0x0) = -1 (Invalid argument)
[child70:21447] [656] getdents(fd=367, dirent=0x4, count=0)
==> trinity-child71.log <==
[child71:21449] [86] fremovexattr(fd=344, name=0x1) = -1 (Bad address)
[child71:21449] [87] fadvise64(fd=344, offset=0, len=1, advice=0x2) = 0
[child71:21449] [88] [32BIT] fstat64(fd=285, statbuf=0x7f37ee467000) =
-1 (Bad address)

==> trinity-child72.log <==
[child72:21138] [88] [32BIT] pread64(fd=210, buf=0x7f37ec06c000,
count=1361, pos=0) = -1 (Invalid argument)
[child72:21138] [89] clock_settime(which_clock=512, tp=0x7f37ec06c000)
= -1 (Invalid argument)
[child72:21138] [90] mbind(start=0x7f37ec86c000, len=0xf5000,
mode=0x2, nmask=0x1, maxnode=0x8000, flags=0x6) = -1 (Bad address)

==> trinity-child73.log <==
[child73:21141] [706] setrlimit(resource=0x7f37ec26c000,
rlim=0xfffffffd) = -1 (Bad address)
[child73:21141] [707] pwrite64(fd=351, buf=0x2e091a0, count=1,
pos=0xfffd) = -1 (Bad file descriptor)
[child73:21141] [708] time(tloc=0x7f37ee65a000) = 0x584474dd

==> trinity-child74.log <==
[child74:21456] [429] setfsgid(gid=-5) = 0xfffe
[child74:21456] [430] sethostname(name=0x7f37ec26c000, len=8) = -1
(Operation not permitted)
[child74:21456] [431] open(filename="/sys/block/loop2/mq/0/cpu_list",
flags=0x333401, mode=666) = -1 (Not a directory)

==> trinity-child75.log <==
[child75:21458] [578] accept(fd=316, upeer_sockaddr=0x2e240e0,
upeer_addrlen=3729) = -1 (Socket operation on non-socket)
[child75:21458] [579] setxattr(pathname="/sys/block/loop5",
name=0x7f37ec06c000, value=0x7f37ec86c000, size=8, flags=0x1) = -1
(Permission denied)
[child75:21458] [580] sched_setscheduler(pid=0, policy=0x3, param=0x0)
= -1 (Invalid argument)

==> trinity-child76.log <==
[child76:21461] [546] gettimeofday(tv=0x7f37ee465000,
tz=0x7f37ee465004) = -1 (Bad address)
[child76:21461] [547] [32BIT] rt_sigaction(sig=32, act=0x7f37ec46c000,
oact=0x0, sigsetsize=128) = -1 (Invalid argument)
[child76:21461] [548] clock_nanosleep(which_clock=0x2, flags=0x1,
rqtp=0x7f37ee656000, rmtp=0x7f37ec06c000)
==> trinity-child77.log <==
[child77:21465] [73]
swapoff(path="/proc/865/task/865/net/rpc/auth.rpcsec.context") = -1
(Operation not permitted)
[child77:21465] [74] sched_getscheduler(pid=21253) = 0
[child77:21465] [75] [32BIT]
rt_sigtimedwait(uthese=0xba501c1fc77f1f57, uinfo=0x0,
uts=0x7f37ee466000, sigsetsize=0x7f37ee466008) = -1 (Invalid argument)

==> trinity-child78.log <==
[child78:21469] [216] mknod(filename="/proc/1023/net/ip_vs_conn",
mode=1007, dev=0x1b9008e90080) = -1 (No such file or directory)
[child78:21469] [217] geteuid() = 0xfffe
[child78:21469] [218] setsockopt(fd=169, level=0xa49a302c26ea,
optname=51, optval=0x2e1afd0, optlen=0x5bf09058529076) = -1 (Bad file
descriptor)

==> trinity-child79.log <==
[child79:21475] [696] setresuid(ruid=-2, euid=43,
suid=0x7272727272727272) = -1 (Operation not permitted)
[child79:21475] [697] msync(start=0x7f37ee659000, len=0, flags=0x6) = 0
[child79:21475] [698] io_submit(ctx_id=118, nr=8, iocbpp=0x8) = -1
(Invalid argument)

==> trinity-child7.log <==
[child7:21609] [33] prctl(option=22, arg2=2, arg3=0, arg4=0, arg5=0) =
-1 (Bad address)
[child7:21609] [34] userfaultfd(flags=0x800) = 511
[child7:21609] [35] getuid() = 0xfffe

==> trinity-child80.log <==
[child80:21478] [101] settimeofday(tv=0x7f37ec26c000,
tz=0x7f37ec46c000) = -1 (Operation not permitted)
[child80:21478] [102] getdents(fd=330, dirent=0x0, count=56) = -1 (Not
a directory)
[child80:21478] [103] epoll_ctl(epfd=330, op=0x2, fd=330,
event=0x2e0a390) = -1 (Invalid argument)

==> trinity-child81.log <==
[child81:21482] [33] [32BIT] move_pages(pid=0, nr_pages=304,
pages=0x2e19fc0, nodes=0x2e1afd0, status=0x2e1b4a0, flags=0x0) = -1
(Bad address)
[child81:21482] [34] getpriority(which=0x1, who=21008) = 1
[child81:21482] [35] sethostname(name=0x7f37ee659000, len=0x6aeaf512)
= -1 (Operation not permitted)

==> trinity-child82.log <==
[child82:21163] [963] keyctl(cmd=0x2, arg2=0xff00000, arg3=0xc928d000,
arg4=0x400000, arg5=1) = -1 (Invalid argument)
[child82:21163] [964] [32BIT] uselib(library=0x7f37ec46c000) = -1 (Bad address)
[child82:21163] [965] mq_getsetattr(mqdes=226, u_mqstat=0x1,
u_omqstat=0x2) = -1 (Bad address)

==> trinity-child83.log <==
[child83:21486] [531] getrlimit(resource=0xb, rlim=0x7f37ee465000) =
-1 (Bad address)
[child83:21486] [532] gettid() = 0x53ee
[child83:21486] [533] renameat(olddfd=244,
oldname="/proc/1252/net/atalk/socket", newdfd=267,
newname=".//proc/721/net/irda/ircomm") = -1 (Not a directory)

==> trinity-child84.log <==
[child84:21488] [238] mkdir(pathname="/proc/1161/task/1161/mounts",
mode=222) = -1 (File exists)
[child84:21488] [239] capget(header=0x67002320,
dataptr=0xfffaffb7effdfffe) = -1 (Bad address)
[child84:21488] [240]
rmdir(pathname="/sys/devices/virtual/misc/ocfs2_control/power/runtime_active_time")
= -1 (Permission denied)

==> trinity-child85.log <==
[child85:21489] [151] settimeofday(tv=0x7f37ec76c000,
tz=0x7f37ec76c008) = -1 (Invalid argument)
[child85:21489] [152] fstatat64(dfd=365,
filename=".//proc/551/net/ip_vs_conn", statbuf=0x7f37ec26c000,
flag=-2) = -1 (Invalid argument)
[child85:21489] [153] read(fd=365, buf=0x7f37ee466000, count=4096) = 0

==> trinity-child86.log <==
[child86:21496] [39] sched_getaffinity(pid=21349, len=217,
user_mask_ptr=0x4) = -1 (Invalid argument)
[child86:21496] [40] semtimedop(semid=149, tsops=0x7f37ec86c000,
nsops=-1, timeout=0x7f37ec86c000) = -1 (Argument list too long)
[child86:21496] [41] get_robust_list(pid=0, head_ptr=0x7f37ee465000,
len_ptr=0x7f37ec66c000) = 0

==> trinity-child87.log <==
[child87:21500] [517] select(n=0, inp=0x2e12fa0, outp=0x2e13030,
exp=0x2e130c0, tvp=0x2e09740) = 0
[child87:21500] [518] readlink(path="/proc/14/comm", buf=0x1,
bufsiz=223) = -1 (Invalid argument)
[child87:21500] [519] setgid(gid=0) = -1 (Operation not permitted)

==> trinity-child88.log <==
[child88:21502] [523] [32BIT] socketcall(call=1, args=0x2e1b390) = -1
(Address family not supported by protocol)
[child88:21502] [524]
newlstat(filename="/proc/722/task/722/net/rpc/nfs",
statbuf=0x7f37ec76c000) = 0
[child88:21502] [525] mq_unlink(u_name=0x7f37ec06c000)
==> trinity-child89.log <==
[child89:21507] [132] [32BIT] getresuid(ruid=0x7f37ec06c000,
euid=0x7f37ec06c001, suid=0x7f37ec06c005) = -1 (Bad address)
[child89:21507] [133] uname(name=0x7f37ec26c000) = 0
[child89:21507] [134] shutdown(fd=213, how=0x0) = -1 (Bad file descriptor)

==> trinity-child8.log <==
[child8:21610] [211] mlock2(start=0x7f37ec26c000, len=0x95000) = -1
(Cannot allocate memory)
[child8:21610] [212]
removexattr(pathname="/sys/module/zswap/parameters/compressor",
name=0x7f37ec46c000) = -1 (Permission denied)
[child8:21610] [213] socket(family=25, type=2048, protocol=5) = -1
(Address family not supported by protocol)

==> trinity-child90.log <==
[child90:21508] [36] [32BIT]
llistxattr(pathname="/proc/1045/net/protocols", list=0x0, size=-6) = 0
[child90:21508] [37] pipe(fildes=0x7f37ee467000) = 0
[child90:21508] [38] ftruncate(fd=183, length=0xffffffffca1f2000) = -1
(Invalid argument)

==> trinity-child91.log <==
[child91:21514] [28] unlink(pathname="/proc/924/net/netstat") = -1
(Permission denied)
[child91:21514] [29] pivot_root(new_root=0x7f37ee465000,
put_old=0x7f37ec06c000) = -1 (Operation not permitted)
[child91:21514] [30] sched_setscheduler(pid=21410, policy=0x3,
param=0x4) = -1 (Bad address)

==> trinity-child92.log <==
[child92:21517] [444] [32BIT]
chroot(filename="/proc/980/task/980/net/dev_snmp6/sit0") = -1 (Not a
directory)
[child92:21517] [445] syslog(type=0x1, buf=0x7f37ec26c000,
len=0x48000) = -1 (Operation not permitted)
[child92:21517] [446] getrandom(buf=0x7f37ec76c000, count=0x2a000,
flags=0x1) = -1 (Invalid argument)

==> trinity-child93.log <==
[child93:21518] [477] waitid(which=0xffd9, upid=1,
infop=0x7f37ec06c000, options=0x100000, ru=0x7f37ec06c000) = -1
(Invalid argument)
[child93:21518] [478] fchmod(fd=168, mode=0) = -1 (Operation not permitted)
[child93:21518] [479] clock_nanosleep(which_clock=0x2, flags=0x1,
rqtp=0x7f37ec76c000, rmtp=0x7f37ee466000)
==> trinity-child94.log <==
[child94:20875] [1515] access(filename="/proc/567/net/atm/pvc",
mode=5000) = -1 (Invalid argument)
[child94:20875] [1516] seccomp(op=0x1, flags=0x1, uargs=0x4) = -1
(Invalid argument)
[child94:20875] [1517] [32BIT] getuid() = 0xfffe

==> trinity-child95.log <==
[child95:21520] [94] userfaultfd(flags=0x0) = 516
[child95:21520] [95] eventfd(count=153) = 517
[child95:21520] [96] shmat(shmid=0x4000000, shmaddr=0x7f37ee466000,
shmflg=0x3000000000400000) = -1 (Invalid argument)

==> trinity-child96.log <==
[child96:21522] [225] pwritev2(fd=324, vec=0x2e19fc0, vlen=70,
pos_l=0x2727272727, pos_h=255, flags=0x2) = -1 (Illegal seek)
[child96:21522] [226] setfsuid(uid=-8) = 0xfffe
[child96:21522] [227] getdents(fd=324, dirent=0x7f37ec06c000,
count=4096) = -1 (Not a directory)

==> trinity-child97.log <==
[child97:21526] [119] [32BIT]
statfs64(pathname="/proc/545/task/545/net/can/rcvlist_all",
sz=0x8500000000000088) = -1 (Invalid argument)
[child97:21526] [120] pipe2(fildes=0x7f37ee465000, flags=0x0) = 0
[child97:21526] [121]
utimes(filename="/sys/devices/virtual/tty/ircomm4", utimes=0x0) = -1
(Permission denied)

==> trinity-child98.log <==
[child98:21529] [163] setresgid(rgid=20, egid=256,
sgid=0xffffffffffff5252) = -1 (Operation not permitted)
[child98:21529] [164]
llistxattr(pathname="/proc/986/net/dev_snmp6/ip_vti0", list=0x0,
size=4) = 0
[child98:21529] [165] getcpu(cpup=0x7f37ee659000,
nodep=0x7f37ee659008, unused=0x7f37ee659009) = 0

==> trinity-child99.log <==
[child99:21532] [223] sigaltstack(uss=0x7f37ec86c000,
uoss=0x7f37ec86c008, regs=0x7f37ee465000) = -1 (Invalid argument)
[child99:21532] [224] vmsplice(fd=320, iov=0x2e1afd0, nr_segs=1016,
flags=0xb) = -1 (Bad file descriptor)
[child99:21532] [225] [32BIT] setfsgid16(gid=0x2209024001252024) = 0xfffe

==> trinity-child9.log <==
[child9:21616] [294] getuid() = 0xfffe
[child9:21616] [295] fstatfs(fd=244, buf=0x8) = -1 (Bad address)
[child9:21616] [296] mlock2(start=0x7f37ec76c000, len=0xe000) = 0


Vegard
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Vegard Nossum Dec. 5, 2016, 5:09 p.m. UTC | #19

On 5 December 2016 at 12:10, Vegard Nossum <vegard.nossum@gmail.com> wrote:
> On 5 December 2016 at 00:04, Vegard Nossum <vegard.nossum@gmail.com> wrote:
>> FWIW I hit this as well:
>>
>> BUG: unable to handle kernel paging request at ffffffff81ff08b7
>> IP: [<ffffffff8135f2ea>] __lock_acquire.isra.32+0xda/0x1a30
>> CPU: 0 PID: 21744 Comm: trinity-c56 Tainted: G    B           4.9.0-rc7+ #217
> [...]
>
>> I think you can rule out btrfs in any case, probably block layer as
>> well, since it looks like this comes from shmem.
>
> I should rather say that the VM runs on a 9p root filesystem and it
> doesn't use/mount any block devices or disk-based filesystems.
>
> I have all the trinity logs for the crash if that's useful. I tried a
> couple of runs with just the (at the time) in-progress syscalls but it
> didn't turn up anything interesting. Otherwise it seems like a lot of
> data to go through by hand.

I've hit this another 7 times in the past ~3 hours.

Three times the address being dereferenced has pointed to
iov_iter_init+0xaf (even across a kernel rebuild), three times it has
pointed to put_prev_entity+0x55, once to 0x800000008, and twice to
0x292. The fact that it would hit even one of those more than once
across runs is pretty suspicious to me, although the ones that point
to iov_iter_init and put_prev_entity point to "random" instructions in
the sense that they are neither entry points nor return addresses.

shmem_fault() was always on the stack, but it came from different
syscalls: add_key(), newuname(), pipe2(), newstat(), fstat(),
clock_settime(), mount(), etc.

I also got this warning which is related:

------------[ cut here ]------------
WARNING: CPU: 9 PID: 25045 at lib/list_debug.c:59 __list_del_entry+0x14f/0x1d0
list_del corruption. prev->next should be ffff88014bdc79e8, but was
ffff88014bfbfc60
Kernel panic - not syncing: panic_on_warn set ...

CPU: 9 PID: 25045 Comm: trinity-c22 Not tainted 4.9.0-rc7+ #219
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014
ffff88014bdc7700 ffffffff81fb0861 ffffffff83e74b60 ffff88014bdc77d8
ffffffff84006c00 ffffffff847103e0 ffff88014bdc77c8 ffffffff81515244
0000000041b58ab3 ffffffff844e21c2 ffffffff81515061 ffffffff00000054
Call Trace:
[<ffffffff81fb0861>] dump_stack+0x83/0xb2
[<ffffffff81515244>] panic+0x1e3/0x3ad
[<ffffffff81515061>] ? percpu_up_read_preempt_enable.constprop.45+0xcb/0xcb
[<ffffffff82016f7f>] ? __list_del_entry+0x14f/0x1d0
[<ffffffff812708bf>] __warn+0x1bf/0x1e0
[<ffffffff8135f2d2>] ? __lock_acquire.isra.32+0xc2/0x1a30
[<ffffffff8127098c>] warn_slowpath_fmt+0xac/0xd0
[<ffffffff812708e0>] ? __warn+0x1e0/0x1e0
[<ffffffff813530c0>] ? finish_wait+0xb0/0x180
[<ffffffff82016f7f>] __list_del_entry+0x14f/0x1d0
[<ffffffff813530c0>] ? finish_wait+0xb0/0x180
[<ffffffff813530cb>] finish_wait+0xbb/0x180
[<ffffffff81576227>] shmem_fault+0x4c7/0x6b0
[<ffffffff81574743>] ? shmem_getpage_gfp+0x673/0x1c90
[<ffffffff81575d60>] ? shmem_getpage_gfp+0x1c90/0x1c90
[<ffffffff81352150>] ? wake_atomic_t_function+0x210/0x210
[<ffffffff815ad316>] __do_fault+0x206/0x410
[<ffffffff815ad110>] ? do_page_mkwrite+0x320/0x320
[<ffffffff815b80ac>] ? handle_mm_fault+0x1cc/0x2a60
[<ffffffff815b8fd7>] handle_mm_fault+0x10f7/0x2a60
[<ffffffff815b8012>] ? handle_mm_fault+0x132/0x2a60
[<ffffffff81310a7f>] ? thread_group_cputime+0x49f/0x6e0
[<ffffffff815b7ee0>] ? __pmd_alloc+0x370/0x370
[<ffffffff81310a9c>] ? thread_group_cputime+0x4bc/0x6e0
[<ffffffff81310d2d>] ? thread_group_cputime_adjusted+0x6d/0xe0
[<ffffffff81237170>] ? __do_page_fault+0x220/0x9f0
[<ffffffff815cba10>] ? find_vma+0x30/0x150
[<ffffffff812373a2>] __do_page_fault+0x452/0x9f0
[<ffffffff81237bf5>] trace_do_page_fault+0x1e5/0x3a0
[<ffffffff8122a007>] do_async_page_fault+0x27/0xa0
[<ffffffff83c97618>] async_page_fault+0x28/0x30
[<ffffffff81fdec7c>] ? copy_user_generic_string+0x2c/0x40
[<ffffffff812b0303>] ? SyS_times+0x93/0x110
[<ffffffff812b0270>] ? do_sys_times+0x2b0/0x2b0
[<ffffffff812b0270>] ? do_sys_times+0x2b0/0x2b0
[<ffffffff8100524f>] do_syscall_64+0x1af/0x4d0
[<ffffffff83c96534>] entry_SYSCALL64_slow_path+0x25/0x25
------------[ cut here ]------------

The warning shows that it made it past the list_empty_careful() check
in finish_wait() but then bugs out on the &wait->task_list
dereference.

Anything stick out?


Vegard
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Andy Lutomirski Dec. 5, 2016, 6:11 p.m. UTC | #20

On Sun, Dec 4, 2016 at 3:04 PM, Vegard Nossum <vegard.nossum@gmail.com> wrote:
> On 23 November 2016 at 20:58, Dave Jones <davej@codemonkey.org.uk> wrote:
>> On Wed, Nov 23, 2016 at 02:34:19PM -0500, Dave Jones wrote:
>>
>>  > [  317.689216] BUG: Bad page state in process kworker/u8:8  pfn:4d8fd4
>>  > trace from just before this happened. Does this shed any light ?
>>  >
>>  > https://codemonkey.org.uk/junk/trace.txt
>>
>> crap, I just noticed the timestamps in the trace come from quite a bit
>> later.  I'll tweak the code to do the taint checking/ftrace stop after
>> every syscall, that should narrow the window some more.
>
> FWIW I hit this as well:
>
> BUG: unable to handle kernel paging request at ffffffff81ff08b7

We really ought to improve this message.  If nothing else, it should
say whether it was a read, a write, or an instruction fetch.

> IP: [<ffffffff8135f2ea>] __lock_acquire.isra.32+0xda/0x1a30
> PGD 461e067 PUD 461f063
> PMD 1e001e1

Too lazy to manually decode this right now, but I don't think it matters.

> Oops: 0003 [#1] PREEMPT SMP KASAN

Note this is SMP, but that just means CONFIG_SMP=y.  Vegard, how many
CPUs does your kernel think you have?

> RIP: 0010:[<ffffffff8135f2ea>]  [<ffffffff8135f2ea>]
> __lock_acquire.isra.32+0xda/0x1a30
> RSP: 0018:ffff8801bab8f730  EFLAGS: 00010082
> RAX: ffffffff81ff071f RBX: 0000000000000000 RCX: 0000000000000000

RAX points to kernel text.

> Code: 89 4d b8 44 89 45 c0 89 4d c8 4c 89 55 d0 e8 ee c3 ff ff 48 85
> c0 4c 8b 55 d0 8b 4d c8 44 8b 45 c0 4c 8b 4d b8 0f 84 c6 01 00 00 <3e>
> ff 80 98 01 00 00 49 8d be 48 07 00 00 48 ba 00 00 00 00 00

  2b:    3e ff 80 98 01 00 00     incl   %ds:*0x198(%rax)        <--
trapping instruction

That's very strange.  I think this is:

atomic_inc((atomic_t *)&class->ops);

but my kernel contains:

    3cb4:       f0 ff 80 98 01 00 00    lock incl 0x198(%rax)

So your kernel has been smp-alternatived.  That 3e comes from
alternatives_smp_unlock.  If you're running on SMP with UP
alternatives, things will break.

What's your kernel command line?  Can we have your entire kernel log from boot?

Adding Borislav, since he's the guru for this code.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Linus Torvalds Dec. 5, 2016, 6:25 p.m. UTC | #21

On Mon, Dec 5, 2016 at 10:11 AM, Andy Lutomirski <luto@kernel.org> wrote:
>
> So your kernel has been smp-alternatived.  That 3e comes from
> alternatives_smp_unlock.  If you're running on SMP with UP
> alternatives, things will break.

I'm assuming he's just running in a VM with a single CPU.

The problem that I pointed out with assuming wake_up_all() actually
removes all wait queue entries does not depend on SMP. The race is
much more fundamental and long-lived.

               Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Vegard Nossum Dec. 5, 2016, 6:26 p.m. UTC | #22

On 5 December 2016 at 19:11, Andy Lutomirski <luto@kernel.org> wrote:
> On Sun, Dec 4, 2016 at 3:04 PM, Vegard Nossum <vegard.nossum@gmail.com> wrote:
>> On 23 November 2016 at 20:58, Dave Jones <davej@codemonkey.org.uk> wrote:
>>> On Wed, Nov 23, 2016 at 02:34:19PM -0500, Dave Jones wrote:
>>>
>>>  > [  317.689216] BUG: Bad page state in process kworker/u8:8  pfn:4d8fd4
>>>  > trace from just before this happened. Does this shed any light ?
>>>  >
>>>  > https://codemonkey.org.uk/junk/trace.txt
>>>
>>> crap, I just noticed the timestamps in the trace come from quite a bit
>>> later.  I'll tweak the code to do the taint checking/ftrace stop after
>>> every syscall, that should narrow the window some more.
>>
>> FWIW I hit this as well:
>>
>> BUG: unable to handle kernel paging request at ffffffff81ff08b7
>
> We really ought to improve this message.  If nothing else, it should
> say whether it was a read, a write, or an instruction fetch.
>
>> IP: [<ffffffff8135f2ea>] __lock_acquire.isra.32+0xda/0x1a30
>> PGD 461e067 PUD 461f063
>> PMD 1e001e1
>
> Too lazy to manually decode this right now, but I don't think it matters.
>
>> Oops: 0003 [#1] PREEMPT SMP KASAN
>
> Note this is SMP, but that just means CONFIG_SMP=y.  Vegard, how many
> CPUs does your kernel think you have?

My first crash was running on a 1-CPU guest (not intentionally, but
because of a badly configured qemu). I'm honestly surprised it
triggered at all with 1 CPU, but I guess it shows that it's not a true
concurrency issue at least!

>
>> RIP: 0010:[<ffffffff8135f2ea>]  [<ffffffff8135f2ea>]
>> __lock_acquire.isra.32+0xda/0x1a30
>> RSP: 0018:ffff8801bab8f730  EFLAGS: 00010082
>> RAX: ffffffff81ff071f RBX: 0000000000000000 RCX: 0000000000000000
>
> RAX points to kernel text.

Yes, it's somewhere in the middle of iov_iter_init() -- other crashes
also had put_prev_entity(), a null pointer, and some garbage values I
couldn't identify.

>
>> Code: 89 4d b8 44 89 45 c0 89 4d c8 4c 89 55 d0 e8 ee c3 ff ff 48 85
>> c0 4c 8b 55 d0 8b 4d c8 44 8b 45 c0 4c 8b 4d b8 0f 84 c6 01 00 00 <3e>
>> ff 80 98 01 00 00 49 8d be 48 07 00 00 48 ba 00 00 00 00 00
>
>   2b:    3e ff 80 98 01 00 00     incl   %ds:*0x198(%rax)        <--
> trapping instruction
>
> That's very strange.  I think this is:
>
> atomic_inc((atomic_t *)&class->ops);
>
> but my kernel contains:
>
>     3cb4:       f0 ff 80 98 01 00 00    lock incl 0x198(%rax)
>
> So your kernel has been smp-alternatived.  That 3e comes from
> alternatives_smp_unlock.  If you're running on SMP with UP
> alternatives, things will break.

Yes, indeed. It was running on 1 CPU by mistake and still triggered the bug.

The crashes started really pouring in once I got my qemu fixed. Just
to reassure you, here's another crash which shows it's using the
correct instruction on an actual multicore guest:

BUG: unable to handle kernel paging request at 00000003030001de
IP: [<ffffffff8135f2ea>] __lock_acquire.isra.32+0xda/0x1a30
PGD 183fd2067 PUD 0

Oops: 0002 [#1] PREEMPT SMP KASAN
Dumping ftrace buffer:
  (ftrace buffer empty)
CPU: 23 PID: 9584 Comm: trinity-c104 Not tainted 4.9.0-rc7+ #219
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014
task: ffff880189f68000 task.stack: ffff88017fe50000
RIP: 0010:[<ffffffff8135f2ea>]  [<ffffffff8135f2ea>]
__lock_acquire.isra.32+0xda/0x1a30
RSP: 0018:ffff88017fe575e0  EFLAGS: 00010002
RAX: 0000000303000046 RBX: 0000000000000000 RCX: 0000000000000000
[...]
Code: 89 4d b8 44 89 45 c0 89 4d c8 4c 89 55 d0 e8 ee c3 ff ff 48 85
c0 4c 8b 55 d0 8b 4d c8 44 8b 45 c0 4c 8b 4d b8 0f 84 c6 01 00 00 <f0>
ff 80 98 01 00 00 49 8d be 48 07 00 00 48 ba 00 00 00 00 00
RIP  [<ffffffff8135f2ea>] __lock_acquire.isra.32+0xda/0x1a30
RSP <ffff88017fe575e0>
CR2: 00000003030001de
---[ end trace 2846425104eb6141 ]---
Kernel panic - not syncing: Fatal exception
------------[ cut here ]------------

  2b:*  f0 ff 80 98 01 00 00    lock incl 0x198(%rax)           <--
trapping instruction

> What's your kernel command line?  Can we have your entire kernel log from boot?

Just in case you still want this, I've attached the boot log for the
"true SMP" guest above.


Vegard

Dave Jones Dec. 7, 2016, 4:15 p.m. UTC | #23

On Sat, Dec 03, 2016 at 11:48:33AM -0500, Dave Jones wrote:

 > The interesting process here seems to be kworker/u8:17, and the trace
 > captures some of what that was doing before that bad page was hit.

I'm travelling next week, so I'm trying to braindump the stuff I've
found so far and summarise so I can pick it back up later if no-one else
figures it out first.

I've hit the bad page map spew with enough regularity that I've now got a handful of
good traces.

http://codemonkey.org.uk/junk/btrfs/bad-page-state1.txt
http://codemonkey.org.uk/junk/btrfs/bad-page-state2.txt
http://codemonkey.org.uk/junk/btrfs/bad-page-state3.txt
http://codemonkey.org.uk/junk/btrfs/bad-page-state4.txt
http://codemonkey.org.uk/junk/btrfs/bad-page-state5.txt
http://codemonkey.org.uk/junk/btrfs/bad-page-state6.txt

It smells to me like a race between truncate and the writeback
workqueue. The variety of traces here seem to show both sides
of the race, sometimes it's kworker, sometimes a trinity child process.

bad-page-state3.txt onwards have some bonus trace_printk's from
btrfs_setsize as I was curious what sizes we were passing down to
truncate. The only patterns I see are going from very large to very
small sizes. Perhaps that causes truncate to generate so much
writeback that it makes the race apparent ?

Other stuff I keep hitting:

Start transaction spew:
http://codemonkey.org.uk/junk/btrfs/start_transaction.txt
That's the WARN_ON(h->use_count > 2);
I hit this with enough regularity that I had to comment it out.
It's not clear to me whether this is related at all.

Lockdep spew:
http://codemonkey.org.uk/junk/btrfs/register_lock_class1.txt
http://codemonkey.org.uk/junk/btrfs/register_lock_class2.txt
This stuff has been around for a while (4.6ish iirc)

Sometimes the fs got into a screwed up state that needed btrfscking.
http://codemonkey.org.uk/junk/btrfs/replay-log-fail.txt

	Dave
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Steven Rostedt Dec. 9, 2016, 9:12 p.m. UTC | #24

On Thu, Dec 01, 2016 at 10:32:09AM -0500, Dave Jones wrote:
> 
> (function-graph screws up the RIP for some reason, 'return_to_handler'
>  should actually be btrfs_destroy_inode)

That's because function_graph hijacks the return address and replaces it with
return_to_handler. The back trace has code to show what that handler is
suppose to be. But I should see why the WARNING shows it instead, and fix
that too.

Thanks,

-- Steve

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

bio linked list corruption.

Commit Message

Comments

Patch