From patchwork Sat Nov 21 02:26:45 2015
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 7673271
Subject: Re: [PATCH] Btrfs: fix a bug of sleeping in atomic context
From: Jens Axboe
CC: Chris Mason
Date: Fri, 20 Nov 2015 19:26:45 -0700
Message-ID: <564FD665.90603@fb.com>
In-Reply-To: <20151120230829.GB8096@localhost.localdomain>
References: <1447984177-26795-1-git-send-email-bo.li.liu@oracle.com>
 <20151120131358.GC9887@ret.masoncoding.com>
 <564F9103.7060301@fb.com>
 <20151120230829.GB8096@localhost.localdomain>
X-Mailing-List: linux-btrfs@vger.kernel.org

On 11/20/2015 04:08 PM, Liu Bo wrote:
> On Fri, Nov 20, 2015 at 02:30:43PM -0700, Jens Axboe wrote:
>> On 11/20/2015 06:13 AM, Chris Mason wrote:
>>> On Thu, Nov 19, 2015 at 05:49:37PM -0800, Liu Bo wrote:
>>>> while xfstesting, this bug[1] is spotted by both btrfs/061 and btrfs/063,
>>>> so those sub-stripe writes are gathered into the plug callback list and
>>>> hopefully we can have full stripe writes.
>>>>
>>>> However, while processing these plugged callbacks, it's within an atomic
>>>> context which is provided by blk_sq_make_request() because of a get_cpu()
>>>> in blk_mq_get_ctx().
>>>>
>>>> This changes it to always use btrfs_rmw_helper to complete the pending writes.
>>>>
>>>
>>> Thanks Liu, but MD raid has the same troubles, we're not atomic in our
>>> unplugs.
>>>
>>> Jens?
>>
>> Yeah, blk-mq does have preemption disabled when it flushes, for the single
>> queue setup. That's a bug. Attached is an untested patch that should fix it,
>> can you try it?
>>
>
> Although it ran into a warning once in 50 tries, that was not the atomic
> warning but another racy issue:
>
> WARNING: CPU: 2 PID: 8531 at fs/btrfs/ctree.c:1162 __btrfs_cow_block+0x431/0x610 [btrfs]()
>
> So overall the patch is good.
>
>> I'll rework this to be a proper patch, not convinced we want to add the new
>> request before flush, that might destroy merging opportunities. I'll unify
>> the mq/sq parts.
>
> That's true, xfstests didn't notice any performance difference but that
> cannot prove anything.
>
> I'll test the new patch when you send it out.
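To make the quoted problem concrete, here's a minimal, hypothetical sketch
(not the btrfs or block layer code, the names are made up) of the pattern
that trips the warning: blk_mq_get_ctx() pins the CPU with get_cpu(), so
everything up to the matching blk_mq_put_ctx()/put_cpu() runs with
preemption disabled, and a plug callback that can sleep (as the btrfs
raid56 unplug work can) is invalid in that window.

#include <linux/kernel.h>
#include <linux/mutex.h>
#include <linux/smp.h>

static DEFINE_MUTEX(demo_lock);

/*
 * Illustration only: get_cpu() disables preemption, just as
 * blk_mq_get_ctx() does inside blk_sq_make_request().  Any sleeping
 * call made before the matching put_cpu() -- here a mutex_lock(), in
 * btrfs' case the raid56 unplug work -- triggers "BUG: sleeping
 * function called from invalid context" (with CONFIG_DEBUG_ATOMIC_SLEEP).
 */
static void demo_sleep_while_atomic(void)
{
	int cpu = get_cpu();		/* preemption disabled from here on */

	pr_info("pinned on cpu %d\n", cpu);
	mutex_lock(&demo_lock);		/* may sleep: not allowed here */
	mutex_unlock(&demo_lock);

	put_cpu();			/* preemption enabled again */
}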
Try this one, that should retain the plug issue characteristics we care
about as well.

Tested-by: Liu Bo

diff --git a/block/blk-core.c b/block/blk-core.c
index 5131993b23a1..4237facaafa5 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -3192,7 +3192,7 @@ static void queue_unplugged(struct request_queue *q, unsigned int depth,
 	spin_unlock(q->queue_lock);
 }
 
-static void flush_plug_callbacks(struct blk_plug *plug, bool from_schedule)
+void flush_plug_callbacks(struct blk_plug *plug, bool from_schedule)
 {
 	LIST_HEAD(callbacks);
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 3ae09de62f19..e92f52462222 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1065,7 +1065,8 @@ static int plug_ctx_cmp(void *priv, struct list_head *a, struct list_head *b)
 		  blk_rq_pos(rqa) < blk_rq_pos(rqb)));
 }
 
-void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
+static void __blk_mq_flush_plug_list(struct blk_plug *plug, bool from_sched,
+				     bool skip_last)
 {
 	struct blk_mq_ctx *this_ctx;
 	struct request_queue *this_q;
@@ -1084,13 +1085,15 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 
 	while (!list_empty(&list)) {
 		rq = list_entry_rq(list.next);
+		if (skip_last && list_is_last(&rq->queuelist, &list))
+			break;
 		list_del_init(&rq->queuelist);
 		BUG_ON(!rq->q);
 		if (rq->mq_ctx != this_ctx) {
 			if (this_ctx) {
 				blk_mq_insert_requests(this_q, this_ctx,
 							&ctx_list, depth,
-							from_schedule);
+							from_sched);
 			}
 
 			this_ctx = rq->mq_ctx;
@@ -1108,8 +1111,14 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 	 */
 	if (this_ctx) {
 		blk_mq_insert_requests(this_q, this_ctx, &ctx_list, depth,
-				       from_schedule);
+				       from_sched);
 	}
+
+}
+
+void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
+{
+	__blk_mq_flush_plug_list(plug, from_schedule, false);
 }
 
 static void blk_mq_bio_to_request(struct request *rq, struct bio *bio)
@@ -1291,15 +1300,16 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 		blk_mq_bio_to_request(rq, bio);
 
 		/*
-		 * we do limited pluging. If bio can be merged, do merge.
+		 * We do limited pluging. If the bio can be merged, do that.
 		 * Otherwise the existing request in the plug list will be
 		 * issued. So the plug list will have one request at most
 		 */
 		if (plug) {
 			/*
 			 * The plug list might get flushed before this. If that
-			 * happens, same_queue_rq is invalid and plug list is empty
-			 **/
+			 * happens, same_queue_rq is invalid and plug list is
+			 * empty
+			 */
 			if (same_queue_rq && !list_empty(&plug->mq_list)) {
 				old_rq = same_queue_rq;
 				list_del_init(&old_rq->queuelist);
@@ -1380,12 +1390,24 @@ static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
 		blk_mq_bio_to_request(rq, bio);
 		if (!request_count)
 			trace_block_plug(q);
-		else if (request_count >= BLK_MAX_REQUEST_COUNT) {
-			blk_flush_plug_list(plug, false);
-			trace_block_plug(q);
-		}
+
 		list_add_tail(&rq->queuelist, &plug->mq_list);
 		blk_mq_put_ctx(data.ctx);
+
+		/*
+		 * We unplug manually here, flushing both callbacks and
+		 * potentially queued up IO - except the one we just added.
+		 * That one did not merge with existing requests, so could
+		 * be a candidate for new incoming merges. Tell
+		 * __blk_mq_flush_plug_list() to skip issuing the last
+		 * request in the list, which is the 'rq' from above.
+		 */
+		if (request_count >= BLK_MAX_REQUEST_COUNT) {
+			flush_plug_callbacks(plug, false);
+			__blk_mq_flush_plug_list(plug, false, true);
+			trace_block_plug(q);
+		}
+
 		return cookie;
 	}
 
diff --git a/block/blk.h b/block/blk.h
index c43926d3d74d..3e0ae1562b85 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -107,6 +107,7 @@ bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
 			    unsigned int *request_count,
 			    struct request **same_queue_rq);
 unsigned int blk_plug_queued_count(struct request_queue *q);
+void flush_plug_callbacks(struct blk_plug *plug, bool from_schedule);
 
 void blk_account_io_start(struct request *req, bool new_io);
 void blk_account_io_completion(struct request *req, unsigned int bytes);
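For reference, the consumer side that flush_plug_callbacks() ends up
invoking looks roughly like the sketch below - btrfs' raid56 code (and MD)
register an unplug handler through blk_check_plugged() and gather work on
it while a plug is active. The struct and function names here are made up
for illustration; only blk_check_plugged(), struct blk_plug_cb and the
callback signature are the real API. With the change above the handler
runs after blk_mq_put_ctx(), so it is free to sleep again.

#include <linux/blkdev.h>
#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/slab.h>

/* Hypothetical per-plug container, modelled on btrfs_plug_cb. */
struct demo_plug_cb {
	struct blk_plug_cb cb;		/* must stay the first member */
	struct list_head stripe_list;	/* writes gathered for full stripes */
};

/* Runs when the plug is flushed, i.e. from flush_plug_callbacks(). */
static void demo_unplug(struct blk_plug_cb *cb, bool from_schedule)
{
	struct demo_plug_cb *plug = container_of(cb, struct demo_plug_cb, cb);
	struct list_head *pos, *next;

	list_for_each_safe(pos, next, &plug->stripe_list) {
		list_del_init(pos);
		/*
		 * Hand each gathered write to the real submission path.
		 * Sleeping (mutexes, GFP_NOFS allocations) is fine here
		 * again, since we are now called outside get_cpu()/put_cpu().
		 */
	}

	/* The callback owns the allocation blk_check_plugged() made. */
	kfree(plug);
}

/* Called on the write path while a blk_plug may be active. */
static bool demo_plug_write(struct list_head *stripe_entry)
{
	struct blk_plug_cb *cb;
	struct demo_plug_cb *plug;

	cb = blk_check_plugged(demo_unplug, NULL, sizeof(*plug));
	if (!cb)
		return false;	/* no plug active, caller submits directly */

	plug = container_of(cb, struct demo_plug_cb, cb);
	if (!plug->stripe_list.next)	/* first use: cb was just kzalloc'ed */
		INIT_LIST_HEAD(&plug->stripe_list);

	list_add_tail(stripe_entry, &plug->stripe_list);
	return true;
}

The other half of the change - passing skip_last == true from
blk_sq_make_request() - simply keeps the request that was added right
before the flush on ->mq_list, so it can still pick up merges from bios
that arrive next, which is the merging opportunity discussed above.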