diff mbox

blk_integrity_revalidate() clears BDI_CAP_STABLE_WRITES

Message ID CAOi1vP8M91kiF-FnT=EoJgTxfstMEsC-nhLY5KtRkrKROjWPWA@mail.gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ilya Dryomov Feb. 20, 2017, 6:45 p.m. UTC
Hello,

rbd requires stable pages if ceph message data CRCs are enabled.  To
that end we set BDI_CAP_STABLE_WRITES in rbd_init_disk(), but since
commit 25520d55cdb6 ("block: Inline blk_integrity in struct gendisk",
went into 4.4) this bit is almost immediately cleared:

add_disk
  register_disk
    blkdev_get
      rescan_partitions
        blk_integrity_revalidate

void blk_integrity_revalidate(struct gendisk *disk)
{
        struct blk_integrity *bi = &disk->queue->integrity;

        if (!(disk->flags & GENHD_FL_UP))
                return;

        if (bi->profile)
                disk->queue->backing_dev_info.capabilities |=
                        BDI_CAP_STABLE_WRITES;
        else
                disk->queue->backing_dev_info.capabilities &=
                        ~BDI_CAP_STABLE_WRITES;
}

ceph messenger is responsible for generating/verifying CRCs, so we
don't call blk_integrity_register() -- bi->profile is NULL.

Martin, could you please explain blk_integrity_revalidate() and its
GENHD_FL_UP check in particular?  We have the queue, bi->profile can't
be NULL after blk_integrity_register(), and since the latter "must" be
used for registering the profile with the block layer, wouldn't the
following be sufficient for blk_integrity users?


@@ -430,26 +430,11 @@ EXPORT_SYMBOL(blk_integrity_register);
  */
 void blk_integrity_unregister(struct gendisk *disk)
 {
-       blk_integrity_revalidate(disk);
+       disk->queue->backing_dev_info.capabilities &= ~BDI_CAP_STABLE_WRITES;
        memset(&disk->queue->integrity, 0, sizeof(struct blk_integrity));
 }
 EXPORT_SYMBOL(blk_integrity_unregister);

blk_integrity_revalidate() can then go away -- I can send the full
patch later.

The alternative seems to be to set up a bogus blk_integrity_profile
(nop_profile won't do -- this one would have to be truly bogus w/ NULL
->*_fn) under BLK_DEV_INTEGRITY ifdefs and hope that nothing breaks.
That can't be the right thing to do here...

Thanks,

                Ilya

Comments

Martin K. Petersen Feb. 22, 2017, 4:41 a.m. UTC | #1
>>>>> "Ilya" == Ilya Dryomov <idryomov@gmail.com> writes:

Ilya,

Ilya> could you please explain blk_integrity_revalidate() and
Ilya> its GENHD_FL_UP check in particular?  We have the queue,
Ilya> bi->profile can't be NULL after blk_integrity_register(), and
Ilya> since the latter "must" be used for registering the profile with
Ilya> the block layer, wouldn't the following be sufficient for
Ilya> blk_integrity users?

IIrc, the FL_UP check fixed a registration problem in the nvme driver.

The rationale behind revalidate was that we need to handle devices which
lose the integrity capability at runtime (i.e. a integrity-enabled DM
device is extended with a non-cable drive forcing the feature to be
turned off). The clearing of the integrity profile is more important in
that case than zapping the stable pages flag. But that was the original
reason for not just ORing BDI_CAP_STABLE_WRITES.

I don't have a huge problem with keeping stable pages on if a device
suddenly stops being integrity capable. However, I'd like to understand
your use case a bit better.

Ilya> The alternative seems to be to set up a bogus
Ilya> blk_integrity_profile (nop_profile won't do -- this one would have
Ilya> to be truly bogus w/ NULL *_fn) under BLK_DEV_INTEGRITY ifdefs and
Ilya> hope that nothing breaks.

Can you point me to the relevant code on your end?

Thanks,
Martin
diff mbox

Patch

diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index d69c5c79f98e..319f2e4f4a8b 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -417,7 +417,7 @@  void blk_integrity_register(struct gendisk *disk,
struct blk_integrity *template
        bi->tuple_size = template->tuple_size;
        bi->tag_size = template->tag_size;

-       blk_integrity_revalidate(disk);
+       disk->queue->backing_dev_info.capabilities |= BDI_CAP_STABLE_WRITES;
 }
 EXPORT_SYMBOL(blk_integrity_register);