diff mbox

bio linked list corruption.

Message ID CA+55aFwY5jNcUAJzDWERP2r9iZHEKzHSwMi5AJCJiaVd3Z0E-g@mail.gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Linus Torvalds Oct. 26, 2016, 10:21 p.m. UTC
On Wed, Oct 26, 2016 at 2:52 PM, Chris Mason <clm@fb.com> wrote:
>
> This one is special because CONFIG_VMAP_STACK is not set.  Btrfs triggers in < 10 minutes.
> I've done 30 minutes each with XFS and Ext4 without luck.

Ok, see the email I wrote that crossed yours - if it's really some
list corruption on ctx->rq_list due to some locking problem, I really
would expect CONFIG_VMAP_STACK to be entirely irrelevant, except
perhaps from a timing standpoint.

> WARNING: CPU: 6 PID: 4481 at lib/list_debug.c:33 __list_add+0xbe/0xd0
> list_add corruption. prev->next should be next (ffffe8ffffd80b08), but was ffff88012b65fb88. (prev=ffff880128c8d500).
> Modules linked in: crc32c_intel aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper i2c_piix4 cryptd i2c_core virtio_net serio_raw floppy button pcspkr sch_fq_codel autofs4 virtio_blk
> CPU: 6 PID: 4481 Comm: dbench Not tainted 4.9.0-rc2-15419-g811d54d #319
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.0-1.fc24 04/01/2014
>  ffff880104eff868 ffffffff814fde0f ffffffff8151c46e ffff880104eff8c8
>  ffff880104eff8c8 0000000000000000 ffff880104eff8b8 ffffffff810648cf
>  ffff880128cab2c0 000000213fc57c68 ffff8801384e8928 ffff880128cab180
> Call Trace:
>  [<ffffffff814fde0f>] dump_stack+0x53/0x74
>  [<ffffffff8151c46e>] ? __list_add+0xbe/0xd0
>  [<ffffffff810648cf>] __warn+0xff/0x120
>  [<ffffffff810649a9>] warn_slowpath_fmt+0x49/0x50
>  [<ffffffff8151c46e>] __list_add+0xbe/0xd0
>  [<ffffffff814dec38>] blk_sq_make_request+0x388/0x580
>  [<ffffffff814d5444>] generic_make_request+0x104/0x200

Well, it's very consistent, I have to say. So I really don't think
this is random corruption.

Could you try the attached patch? It adds a couple of sanity tests:

 - a number of tests to verify that 'rq->queuelist' isn't already on
some queue when it is added to a queue

 - one test to verify that rq->mq_ctx is the same ctx that we have locked.

I may be completely full of shit, and this patch may be pure garbage
or "obviously will never trigger", but humor me.

          Linus
block/blk-mq.c | 9 +++++++++
 1 file changed, 9 insertions(+)

Comments

Dave Jones Oct. 26, 2016, 10:40 p.m. UTC | #1
On Wed, Oct 26, 2016 at 03:21:53PM -0700, Linus Torvalds wrote:

 > Could you try the attached patch? It adds a couple of sanity tests:
 > 
 >  - a number of tests to verify that 'rq->queuelist' isn't already on
 > some queue when it is added to a queue
 > 
 >  - one test to verify that rq->mq_ctx is the same ctx that we have locked.
 > 
 > I may be completely full of shit, and this patch may be pure garbage
 > or "obviously will never trigger", but humor me.
 
I gave it a shot too for shits & giggles.
This falls out during boot.

[    9.244030] EXT4-fs (sda4): mounted filesystem with ordered data mode. Opts: (null)
[    9.271391] ------------[ cut here ]------------
[    9.278420] WARNING: CPU: 0 PID: 1 at block/blk-mq.c:1181 blk_sq_make_request+0x465/0x4a0
[    9.285613] CPU: 0 PID: 1 Comm: init Not tainted 4.9.0-rc2-think+ #4
[    9.300106]  ffffc90000013848
[    9.307353]  ffffffff8130d27c
[    9.307420]  0000000000000000
[    9.307456]  0000000000000000
[    9.307494]  ffffc90000013888
[    9.314780]  ffffffff81077a41
[    9.314847]  0000049d00013898
[    9.314884]  0000000000000001
[    9.314922]  0000000000000002
[    9.322213]  ffffe8ffffc06600
[    9.322280]  ffffe8ffff606600
[    9.322317]  ffff8805015c2e80
[    9.322355] Call Trace:
[    9.329689]  [<ffffffff8130d27c>] dump_stack+0x4f/0x73
[    9.337208]  [<ffffffff81077a41>] __warn+0xc1/0xe0
[    9.344730]  [<ffffffff81077b18>] warn_slowpath_null+0x18/0x20
[    9.352282]  [<ffffffff812f7975>] blk_sq_make_request+0x465/0x4a0
[    9.359816]  [<ffffffff812eb44a>] ? generic_make_request+0xca/0x210
[    9.367367]  [<ffffffff812eb457>] generic_make_request+0xd7/0x210
[    9.374931]  [<ffffffff812eb5f9>] submit_bio+0x69/0x120
[    9.382488]  [<ffffffff812004ef>] ? submit_bh_wbc+0x16f/0x1e0
[    9.390083]  [<ffffffff812004ef>] submit_bh_wbc+0x16f/0x1e0
[    9.397673]  [<ffffffff81200bfe>] submit_bh+0xe/0x10
[    9.405205]  [<ffffffff81255eac>] __ext4_get_inode_loc+0x1ac/0x3d0
[    9.412795]  [<ffffffff81258f4b>] ext4_iget+0x6b/0xb70
[    9.420366]  [<ffffffff811d6018>] ? lookup_slow+0xf8/0x1f0
[    9.427947]  [<ffffffff81259a7a>] ext4_iget_normal+0x2a/0x30
[    9.435541]  [<ffffffff81264734>] ext4_lookup+0x114/0x230

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Linus Torvalds Oct. 26, 2016, 10:51 p.m. UTC | #2
On Wed, Oct 26, 2016 at 3:40 PM, Dave Jones <davej@codemonkey.org.uk> wrote:
>
> I gave it a shot too for shits & giggles.
> This falls out during boot.
>
> [    9.278420] WARNING: CPU: 0 PID: 1 at block/blk-mq.c:1181 blk_sq_make_request+0x465/0x4a0

Hmm. That's the

    WARN_ON_ONCE(rq->mq_ctx != ctx);

that I added to blk_mq_merge_queue_io(), and I really think that
warning is valid, and the fact that it triggers shows that something
is wrong with locking.

We just did a

                spin_lock(&ctx->lock);

and that lock is *supposed* to protect the __blk_mq_insert_request(),
but that uses rq->mq_ctx.

So if rq->mq_ctx != ctx, then we're locking the wrong context.

Jens - please explain to me why I'm wrong.

Or maybe I actually might have found the problem? In which case please
send me a patch that fixes it ;)

Dave: it might be a good idea to split that "WARN_ON_ONCE()" in
blk_mq_merge_queue_io() into two, since right now it can trigger both
for the

                blk_mq_bio_to_request(rq, bio);

path _and_ for the

                if (!blk_mq_attempt_merge(q, ctx, bio)) {
                        blk_mq_bio_to_request(rq, bio);
                        goto insert_rq;

path. If you split it into two: one before that "insert_rq:" label,
and one before the "goto insert_rq" thing, then we could see if it is
just one of the blk_mq_merge_queue_io() cases (or both) that is
broken..

                  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jens Axboe Oct. 26, 2016, 10:55 p.m. UTC | #3
On 10/26/2016 04:51 PM, Linus Torvalds wrote:
> On Wed, Oct 26, 2016 at 3:40 PM, Dave Jones <davej@codemonkey.org.uk> wrote:
>>
>> I gave it a shot too for shits & giggles.
>> This falls out during boot.
>>
>> [    9.278420] WARNING: CPU: 0 PID: 1 at block/blk-mq.c:1181 blk_sq_make_request+0x465/0x4a0
>
> Hmm. That's the
>
>     WARN_ON_ONCE(rq->mq_ctx != ctx);
>
> that I added to blk_mq_merge_queue_io(), and I really think that
> warning is valid, and the fact that it triggers shows that something
> is wrong with locking.
>
> We just did a
>
>                 spin_lock(&ctx->lock);
>
> and that lock is *supposed* to protect the __blk_mq_insert_request(),
> but that uses rq->mq_ctx.
>
> So if rq->mq_ctx != ctx, then we're locking the wrong context.
>
> Jens - please explain to me why I'm wrong.
>
> Or maybe I actually might have found the problem? In which case please
> send me a patch that fixes it ;)

I think you're pretty close, the two should not be different and I don't
immediately see how. I'll run some testing here, should be easier with
this knowledge.

> Dave: it might be a good idea to split that "WARN_ON_ONCE()" in
> blk_mq_merge_queue_io() into two, since right now it can trigger both
> for the
>
>                 blk_mq_bio_to_request(rq, bio);
>
> path _and_ for the
>
>                 if (!blk_mq_attempt_merge(q, ctx, bio)) {
>                         blk_mq_bio_to_request(rq, bio);
>                         goto insert_rq;

And just in case I can't trigger, would be interesting to add a call to
blk_rq_dump_flags() as well, in case this is some special request.
Linus Torvalds Oct. 26, 2016, 10:58 p.m. UTC | #4
On Wed, Oct 26, 2016 at 3:51 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Dave: it might be a good idea to split that "WARN_ON_ONCE()" in
> blk_mq_merge_queue_io() into two

I did that myself too, since Dave sees this during boot.

But I'm not getting the warning ;(

Dave gets it with ext4, and thats' what I have too, so I'm not sure
what the required trigger would be.

              Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dave Jones Oct. 26, 2016, 11:01 p.m. UTC | #5
On Wed, Oct 26, 2016 at 03:51:01PM -0700, Linus Torvalds wrote:
 > Dave: it might be a good idea to split that "WARN_ON_ONCE()" in
 > blk_mq_merge_queue_io() into two, since right now it can trigger both
 > for the
 > 
 >                 blk_mq_bio_to_request(rq, bio);
 > 
 > path _and_ for the
 > 
 >                 if (!blk_mq_attempt_merge(q, ctx, bio)) {
 >                         blk_mq_bio_to_request(rq, bio);
 >                         goto insert_rq;
 > 
 > path. If you split it into two: one before that "insert_rq:" label,
 > and one before the "goto insert_rq" thing, then we could see if it is
 > just one of the blk_mq_merge_queue_io() cases (or both) that is
 > broken..

It's the latter of the two.

[   12.302392] WARNING: CPU: 3 PID: 272 at block/blk-mq.c:1191
blk_sq_make_request+0x320/0x4d0

	Dave


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/block/blk-mq.c b/block/blk-mq.c
index ddc2eed64771..4f575de7fdd0 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -521,6 +521,8 @@  void blk_mq_add_to_requeue_list(struct request *rq, bool at_head)
 	 */
 	BUG_ON(rq->cmd_flags & REQ_SOFTBARRIER);
 
+WARN_ON_ONCE(!list_empty(&rq->queuelist));
+
 	spin_lock_irqsave(&q->requeue_lock, flags);
 	if (at_head) {
 		rq->cmd_flags |= REQ_SOFTBARRIER;
@@ -838,6 +840,7 @@  static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
 			queued++;
 			break;
 		case BLK_MQ_RQ_QUEUE_BUSY:
+WARN_ON_ONCE(!list_empty(&rq->queuelist));
 			list_add(&rq->queuelist, &rq_list);
 			__blk_mq_requeue_request(rq);
 			break;
@@ -1034,6 +1037,8 @@  static inline void __blk_mq_insert_req_list(struct blk_mq_hw_ctx *hctx,
 
 	trace_block_rq_insert(hctx->queue, rq);
 
+WARN_ON_ONCE(!list_empty(&rq->queuelist));
+
 	if (at_head)
 		list_add(&rq->queuelist, &ctx->rq_list);
 	else
@@ -1137,6 +1142,7 @@  void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 			depth = 0;
 		}
 
+WARN_ON_ONCE(!list_empty(&rq->queuelist));
 		depth++;
 		list_add_tail(&rq->queuelist, &ctx_list);
 	}
@@ -1172,6 +1178,7 @@  static inline bool blk_mq_merge_queue_io(struct blk_mq_hw_ctx *hctx,
 		blk_mq_bio_to_request(rq, bio);
 		spin_lock(&ctx->lock);
 insert_rq:
+WARN_ON_ONCE(rq->mq_ctx != ctx);
 		__blk_mq_insert_request(hctx, rq, false);
 		spin_unlock(&ctx->lock);
 		return false;
@@ -1326,6 +1333,7 @@  static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 				old_rq = same_queue_rq;
 				list_del_init(&old_rq->queuelist);
 			}
+WARN_ON_ONCE(!list_empty(&rq->queuelist));
 			list_add_tail(&rq->queuelist, &plug->mq_list);
 		} else /* is_sync */
 			old_rq = rq;
@@ -1412,6 +1420,7 @@  static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
 			trace_block_plug(q);
 		}
 
+WARN_ON_ONCE(!list_empty(&rq->queuelist));
 		list_add_tail(&rq->queuelist, &plug->mq_list);
 		return cookie;
 	}