From patchwork Wed Feb 22 14:41:43 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ilya Dryomov X-Patchwork-Id: 9586947 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 17BBE6020B for ; Wed, 22 Feb 2017 14:41:48 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E3C692858A for ; Wed, 22 Feb 2017 14:41:47 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D8A7F285EE; Wed, 22 Feb 2017 14:41:47 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.5 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, RCVD_IN_DNSWL_HI, RCVD_IN_SORBS_SPAM autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 902B62858A for ; Wed, 22 Feb 2017 14:41:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932889AbdBVOlq (ORCPT ); Wed, 22 Feb 2017 09:41:46 -0500 Received: from mail-ua0-f181.google.com ([209.85.217.181]:33940 "EHLO mail-ua0-f181.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932726AbdBVOlp (ORCPT ); Wed, 22 Feb 2017 09:41:45 -0500 Received: by mail-ua0-f181.google.com with SMTP id c32so2981080uac.1; Wed, 22 Feb 2017 06:41:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=K/BJLQ/zSKYEyFezToN67xdimPMwqZ6sS5UV1ypbT0Y=; b=lMVKjFyJHLwoPBtB7Qh9pRllICwAujWbcA+5CHg1Q+RBSSppxDiYP0VON/RpAdl6jv kmNvU9TYnUCnkPTNtSyY2/+GRWZKCuYzzrHbyQhK3fleoKvTlgyEZxlshNz8IooFNgKh FNMbIjIc/p2613jO9UlAGX8hZAu1a8T6RWEmAVUewwgKLvRT9P2N2Ymup55bZmkWw9n7 XE+gp6KjWScYXr5kdQiEoTEO0s/53df6QZPWVqiOKn5S2EhaLaHc09tFkEj3PqujvTFU wlyKn8C7xzjkvPvqhvxSpNaQ1QNW+FX+JToxuvCk0D1DMHQfJaXAfHiNgERlPg1eOLTy k2tA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=K/BJLQ/zSKYEyFezToN67xdimPMwqZ6sS5UV1ypbT0Y=; b=RvWyST3lwakWcXR+7qxubCbWPe4EbmDAMHZJj7oC62vCvCtBj/0FuXjnGtlW1Hw+X0 wuvG8BKHU2X2u+cCy58vKxp+ebSFx8R5stsPwW5r5C1KjE0exuEWzsY+KLmNsN2surhW 7scBJ85RzjcggIouMNbRPbxyPAoy+qEGLky15MbESKKH/PvAPDHFXPwlTmS+f9hKBgT4 ndQkJZkZWZ2afkiu7GrvJtIMpQfj2ToGSxGc6GsqdUgIXjjkPGa5TD30AoZIEG1Pvliz X8vNdbf8XjHax03vRbBFMLLB9x7a4GlfujPtCtAyc09wP0RBn1hfyaweVDKhzQ2f+zw+ e4kg== X-Gm-Message-State: AMke39mWsvPqHFTbiC4subxDSbDyPrdrUf4ThE6Cf/89XcF3AF/vb9b0MJyl/dU9yihOT4VCy7hHK+aEwYn5Vw== X-Received: by 10.159.36.165 with SMTP id 34mr17317748uar.48.1487774503803; Wed, 22 Feb 2017 06:41:43 -0800 (PST) MIME-Version: 1.0 Received: by 10.176.76.194 with HTTP; Wed, 22 Feb 2017 06:41:43 -0800 (PST) In-Reply-To: References: From: Ilya Dryomov Date: Wed, 22 Feb 2017 15:41:43 +0100 Message-ID: Subject: Re: blk_integrity_revalidate() clears BDI_CAP_STABLE_WRITES To: "Martin K. Petersen" Cc: Ceph Development , linux-block@vger.kernel.org, Dan Williams , Christoph Hellwig Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On Wed, Feb 22, 2017 at 5:41 AM, Martin K. Petersen wrote: >>>>>> "Ilya" == Ilya Dryomov writes: > > Ilya, > > Ilya> could you please explain blk_integrity_revalidate() and > Ilya> its GENHD_FL_UP check in particular? We have the queue, > Ilya> bi->profile can't be NULL after blk_integrity_register(), and > Ilya> since the latter "must" be used for registering the profile with > Ilya> the block layer, wouldn't the following be sufficient for > Ilya> blk_integrity users? > > IIrc, the FL_UP check fixed a registration problem in the nvme driver. Did it have something to do with a NULL disk->queue? > > The rationale behind revalidate was that we need to handle devices which > lose the integrity capability at runtime (i.e. a integrity-enabled DM > device is extended with a non-cable drive forcing the feature to be > turned off). The clearing of the integrity profile is more important in > that case than zapping the stable pages flag. But that was the original > reason for not just ORing BDI_CAP_STABLE_WRITES. > > I don't have a huge problem with keeping stable pages on if a device > suddenly stops being integrity capable. However, I'd like to understand > your use case a bit better. Well, blk_integrity_revalidate() doesn't clear the profile, it just clears the stable pages flag. Whoever calls blk_integrity_unregister() to clear the profile can also clear the stable pages flag -- why not let blk_integrity_unregister() clear the flag like I suggested? > > Ilya> The alternative seems to be to set up a bogus > Ilya> blk_integrity_profile (nop_profile won't do -- this one would have > Ilya> to be truly bogus w/ NULL *_fn) under BLK_DEV_INTEGRITY ifdefs and > Ilya> hope that nothing breaks. > > Can you point me to the relevant code on your end? This code doesn't exist yet -- it's just something I thought I'd have to do if my patch or something along those lines isn't accepted. There is the nop_profile with stub *_fn callbacks, which is used by nvme and nvdimm to make bio_integrity_enabled() return true, so that per-interval buffers are allocated in bio_integrity_prep(). "I want some space for per-interval metadata, but no integrity checksums" is use case there. In the rbd case, we don't want that. We just want to set the stable pages flag and make sure it's not reset by blk_integrity code. So, if blk_integrity_revalidate() isn't fixed, rbd will need to set up a bogus profile with NULL *_fn callbacks to make bio_integrity_enabled() return false. This is obviously fragile, because blk_get_integrity() will no longer return NULL... We are setting BDI_CAP_STABLE_WRITES in rbd_init_disk(), see commit bae818ee1577 ("rbd: require stable pages if message data CRCs are enabled"). The CRCs are generated in write_partial_message_data() and verified in read_partial_msg_data(). I'm attaching the patch I had in mind. Thanks, Ilya From 24a0cf8bc437a65b3986e9ab25cf2f05e6bbd5d0 Mon Sep 17 00:00:00 2001 From: Ilya Dryomov Date: Mon, 20 Feb 2017 18:50:50 +0100 Subject: [PATCH] block: get rid of blk_integrity_revalidate() Commit 25520d55cdb6 ("block: Inline blk_integrity in struct gendisk") introduced blk_integrity_revalidate(), which clears the stable pages bit if no blk_integrity profile is registered. It's called from revalidate_disk() and rescan_partitions(), making it impossible to set BDI_CAP_STABLE_WRITES for drivers that support partitions and don't use blk_integrity. One such driver is rbd, where the ceph messenger is responsible for generating/verifying CRCs. Since blk_integrity_{un,}register() "must" be used for (un)registering the integrity profile with the block layer, move BDI_CAP_STABLE_WRITES setting there. This way drivers that call blk_integrity_register() and use integrity infrastructure won't interfere with drivers that don't but still want stable pages. Fixes: 25520d55cdb6 ("block: Inline blk_integrity in struct gendisk") Signed-off-by: Ilya Dryomov --- block/blk-integrity.c | 19 ++----------------- block/partition-generic.c | 1 - fs/block_dev.c | 1 - include/linux/genhd.h | 2 -- 4 files changed, 2 insertions(+), 21 deletions(-) diff --git a/block/blk-integrity.c b/block/blk-integrity.c index d69c5c79f98e..319f2e4f4a8b 100644 --- a/block/blk-integrity.c +++ b/block/blk-integrity.c @@ -417,7 +417,7 @@ void blk_integrity_register(struct gendisk *disk, struct blk_integrity *template bi->tuple_size = template->tuple_size; bi->tag_size = template->tag_size; - blk_integrity_revalidate(disk); + disk->queue->backing_dev_info.capabilities |= BDI_CAP_STABLE_WRITES; } EXPORT_SYMBOL(blk_integrity_register); @@ -430,26 +430,11 @@ EXPORT_SYMBOL(blk_integrity_register); */ void blk_integrity_unregister(struct gendisk *disk) { - blk_integrity_revalidate(disk); + disk->queue->backing_dev_info.capabilities &= ~BDI_CAP_STABLE_WRITES; memset(&disk->queue->integrity, 0, sizeof(struct blk_integrity)); } EXPORT_SYMBOL(blk_integrity_unregister); -void blk_integrity_revalidate(struct gendisk *disk) -{ - struct blk_integrity *bi = &disk->queue->integrity; - - if (!(disk->flags & GENHD_FL_UP)) - return; - - if (bi->profile) - disk->queue->backing_dev_info.capabilities |= - BDI_CAP_STABLE_WRITES; - else - disk->queue->backing_dev_info.capabilities &= - ~BDI_CAP_STABLE_WRITES; -} - void blk_integrity_add(struct gendisk *disk) { if (kobject_init_and_add(&disk->integrity_kobj, &integrity_ktype, diff --git a/block/partition-generic.c b/block/partition-generic.c index 7afb9907821f..0171a2faad68 100644 --- a/block/partition-generic.c +++ b/block/partition-generic.c @@ -497,7 +497,6 @@ int rescan_partitions(struct gendisk *disk, struct block_device *bdev) if (disk->fops->revalidate_disk) disk->fops->revalidate_disk(disk); - blk_integrity_revalidate(disk); check_disk_size_change(disk, bdev); bdev->bd_invalidated = 0; if (!get_capacity(disk) || !(state = check_partition(disk, bdev))) diff --git a/fs/block_dev.c b/fs/block_dev.c index 3c47614a4b32..b94e2a4974a1 100644 --- a/fs/block_dev.c +++ b/fs/block_dev.c @@ -1422,7 +1422,6 @@ int revalidate_disk(struct gendisk *disk) if (disk->fops->revalidate_disk) ret = disk->fops->revalidate_disk(disk); - blk_integrity_revalidate(disk); bdev = bdget_disk(disk, 0); if (!bdev) return ret; diff --git a/include/linux/genhd.h b/include/linux/genhd.h index 76f39754e7b0..76d6a1cd4153 100644 --- a/include/linux/genhd.h +++ b/include/linux/genhd.h @@ -722,11 +722,9 @@ static inline void part_nr_sects_write(struct hd_struct *part, sector_t size) #if defined(CONFIG_BLK_DEV_INTEGRITY) extern void blk_integrity_add(struct gendisk *); extern void blk_integrity_del(struct gendisk *); -extern void blk_integrity_revalidate(struct gendisk *); #else /* CONFIG_BLK_DEV_INTEGRITY */ static inline void blk_integrity_add(struct gendisk *disk) { } static inline void blk_integrity_del(struct gendisk *disk) { } -static inline void blk_integrity_revalidate(struct gendisk *disk) { } #endif /* CONFIG_BLK_DEV_INTEGRITY */ #else /* CONFIG_BLOCK */ -- 2.4.3