From patchwork Fri Oct 13 18:10:29 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shaohua Li X-Patchwork-Id: 10005615 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 6225A602B3 for ; Fri, 13 Oct 2017 18:10:49 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5F24B29131 for ; Fri, 13 Oct 2017 18:10:49 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5404E2912D; Fri, 13 Oct 2017 18:10:49 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B341829135 for ; Fri, 13 Oct 2017 18:10:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751901AbdJMSKs (ORCPT ); Fri, 13 Oct 2017 14:10:48 -0400 Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:45658 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751835AbdJMSKr (ORCPT ); Fri, 13 Oct 2017 14:10:47 -0400 Received: from pps.filterd (m0109333.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id v9DIA7DE011559 for ; Fri, 13 Oct 2017 11:10:47 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : mime-version : content-type; s=facebook; bh=t1xff0Kt42hTBonSZV1rv+87dtU+9QzP8F7ozyUAw48=; b=SOP8kIm0hhvrwJ7TdS3cJgjPrendH2LLheiw8aKc8zea7eI/B2ailoGb+LFICl1n+wIC T3pgw+tR6OE/x/CEbXLDVikYi9ahSQIsMCBw0pb60fZKuGJXruUfbMdjuR/kE/hRZKqa KLOAQgs7hL+8m/5OQumoh5VaBZ6ALQDXR30= Received: from mail.thefacebook.com ([199.201.64.23]) by mx0a-00082601.pphosted.com with ESMTP id 2dk0y5gd9p-4 (version=TLSv1 cipher=ECDHE-RSA-AES256-SHA bits=256 verify=NOT) for ; Fri, 13 Oct 2017 11:10:47 -0700 Received: from mx-out.facebook.com (192.168.52.123) by PRN-CHUB08.TheFacebook.com (192.168.16.18) with Microsoft SMTP Server id 14.3.319.2; Fri, 13 Oct 2017 11:10:46 -0700 Received: by devbig638.prn2.facebook.com (Postfix, from userid 11222) id 87C1742401EB; Fri, 13 Oct 2017 11:10:29 -0700 (PDT) Smtp-Origin-Hostprefix: devbig From: Shaohua Li Smtp-Origin-Hostname: devbig638.prn2.facebook.com To: CC: , , Tejun Heo , Vivek Goyal Smtp-Origin-Cluster: prn2c22 Subject: [PATCH] block-throttle: avoid double charge Date: Fri, 13 Oct 2017 11:10:29 -0700 Message-ID: X-Mailer: git-send-email 2.9.5 X-FB-Internal: Safe MIME-Version: 1.0 X-Proofpoint-Spam-Reason: safe X-FB-Internal: Safe X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2017-10-13_07:, , signatures=0 Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP If a bio is throttled and splitted after throttling, the bio could be resubmited and enters the throttling again. This will cause part of the bio is charged multiple times. If the cgroup has an IO limit, the double charge will significantly harm the performance. The bio split becomes quite common after arbitrary bio size change. To fix this, we record the disk info a bio is throttled against. If a bio is throttled and issued, we record the info. We copy the info to cloned bio, so cloned bio (including splitted bio) will not be throttled again. Stacked block device driver will change cloned bio's bi_disk, if a bio's bi_disk is changed, the recorded throttle disk info is invalid, we should throttle again. That's the reason why we can't use a single bit to indicate if a cloned bio should be throttled. We only record gendisk here, if a cloned bio is remapped to other disk, it's very unlikely only partno is changed. Some sort of this patch probably should go into stable since v4.2 Cc: Tejun Heo Cc: Vivek Goyal Signed-off-by: Shaohua Li --- block/bio.c | 3 +++ block/blk-throttle.c | 15 ++++++++++++--- include/linux/blk_types.h | 4 ++++ 3 files changed, 19 insertions(+), 3 deletions(-) diff --git a/block/bio.c b/block/bio.c index 8338304..dce8314 100644 --- a/block/bio.c +++ b/block/bio.c @@ -597,6 +597,9 @@ void __bio_clone_fast(struct bio *bio, struct bio *bio_src) * so we don't set nor calculate new physical/hw segment counts here */ bio->bi_disk = bio_src->bi_disk; +#ifdef CONFIG_BLK_DEV_THROTTLING + bio->bi_throttled_disk = bio_src->bi_throttled_disk; +#endif bio_set_flag(bio, BIO_CLONED); bio->bi_opf = bio_src->bi_opf; bio->bi_write_hint = bio_src->bi_write_hint; diff --git a/block/blk-throttle.c b/block/blk-throttle.c index ee6d7b0..155549a 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -2130,9 +2130,15 @@ bool blk_throtl_bio(struct request_queue *q, struct blkcg_gq *blkg, WARN_ON_ONCE(!rcu_read_lock_held()); - /* see throtl_charge_bio() */ - if (bio_flagged(bio, BIO_THROTTLED) || !tg->has_rules[rw]) + /* + * see throtl_charge_bio() for BIO_THROTTLED. If a bio is throttled + * against a disk but remapped to other disk, we should throttle it + * again + */ + if (bio_flagged(bio, BIO_THROTTLED) || !tg->has_rules[rw] || + (bio->bi_throttled_disk && bio->bi_throttled_disk == bio->bi_disk)) goto out; + bio->bi_throttled_disk = NULL; spin_lock_irq(q->queue_lock); @@ -2227,8 +2233,11 @@ bool blk_throtl_bio(struct request_queue *q, struct blkcg_gq *blkg, * don't want bios to leave with the flag set. Clear the flag if * being issued. */ - if (!throttled) + if (!throttled) { bio_clear_flag(bio, BIO_THROTTLED); + /* if the bio is cloned, we don't throttle it again */ + bio->bi_throttled_disk = bio->bi_disk; + } #ifdef CONFIG_BLK_DEV_THROTTLING_LOW if (throttled || !td->track_bio_latency) diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index 3385c89..2507566 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -89,6 +89,10 @@ struct bio { void *bi_cg_private; struct blk_issue_stat bi_issue_stat; #endif +#ifdef CONFIG_BLK_DEV_THROTTLING + /* record which disk the bio is throttled against */ + struct gendisk *bi_throttled_disk; +#endif #endif union { #if defined(CONFIG_BLK_DEV_INTEGRITY)