diff mbox

[1/6] dm cache: cache shrinking support

Message ID 1384190448-3861-2-git-send-email-snitzer@redhat.com (mailing list archive)
State Accepted, archived
Delegated to: Mike Snitzer
Headers show

Commit Message

Mike Snitzer Nov. 11, 2013, 5:20 p.m. UTC
From: Joe Thornber <ejt@redhat.com>

Allow a cache to shrink if the blocks being removed from the cache are
not dirty.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm-cache-metadata.c |   66 ++++++++++++++++++++++++++++++++++++++++
 drivers/md/dm-cache-target.c   |   63 ++++++++++++++++++++++++++++++++-----
 2 files changed, 120 insertions(+), 9 deletions(-)

Comments

Alasdair G Kergon Nov. 11, 2013, 9:19 p.m. UTC | #1
On Mon, Nov 11, 2013 at 12:20:43PM -0500, Mike Snitzer wrote:
> From: Joe Thornber <ejt@redhat.com>
> 
> Allow a cache to shrink if the blocks being removed from the cache are
> not dirty.
 
Please would you document the intended procedure here?

> +			DMERR("unable to shrink cache due to dirty blocks");

This error is highly undesirable: part of a device has been removed
while it is still needed!

How does someone go about reducing the size of the cache avoiding this
error?  (Analogy: to reduce size of a filesystem you run a filesystem
resize process before reducing the size of the block device.)

Alasdair

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Mike Snitzer Nov. 11, 2013, 9:32 p.m. UTC | #2
On Mon, Nov 11 2013 at  4:19pm -0500,
Alasdair G Kergon <agk@redhat.com> wrote:

> On Mon, Nov 11, 2013 at 12:20:43PM -0500, Mike Snitzer wrote:
> > From: Joe Thornber <ejt@redhat.com>
> > 
> > Allow a cache to shrink if the blocks being removed from the cache are
> > not dirty.
>  
> Please would you document the intended procedure here?

I'm not rebasing this patch (and in turn all commits that follow) to
tweak this header.  We can add information to cache.txt as a follow-on
commit.
 
> > +			DMERR("unable to shrink cache due to dirty blocks");
> 
> This error is highly undesirable: part of a device has been removed
> while it is still needed!

Huh?  The device still had dirty blocks, so the cache wasn't resized.

> How does someone go about reducing the size of the cache avoiding this
> error?  (Analogy: to reduce size of a filesystem you run a filesystem
> resize process before reducing the size of the block device.)

Use the cleaner policy to purge the cache first.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Alasdair G Kergon Nov. 11, 2013, 9:48 p.m. UTC | #3
On Mon, Nov 11, 2013 at 04:32:33PM -0500, Mike Snitzer wrote:
> On Mon, Nov 11 2013 at  4:19pm -0500,
> Alasdair G Kergon <agk@redhat.com> wrote:
  
> > > +			DMERR("unable to shrink cache due to dirty blocks");
> > This error is highly undesirable: part of a device has been removed
> > while it is still needed!
> Huh?  The device still had dirty blocks, so the cache wasn't resized.

The device no longer has those dirty blocks: you already removed them 
(otherwise you wouldn't have reached this point).
 
> Use the cleaner policy to purge the cache first.
 
But how do you do that without also cleaning the part of the cache that
you're keeping?  (For example, why should it be necessary to stop using
your preferred cache policy for a while and clean the entire cache just in
order to reduce the size of the cache slightly?)

If the current situation is temporary then can we indicate what a better
solution is likely to look like and whether it is likely still to use this
proposed shrinker interface or whether it could replace it with something else?

Alasdair

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Mike Snitzer Nov. 11, 2013, 9:50 p.m. UTC | #4
On Mon, Nov 11 2013 at  4:32pm -0500,
Mike Snitzer <snitzer@redhat.com> wrote:

> On Mon, Nov 11 2013 at  4:19pm -0500,
> Alasdair G Kergon <agk@redhat.com> wrote:
> 
> > On Mon, Nov 11, 2013 at 12:20:43PM -0500, Mike Snitzer wrote:
> > > From: Joe Thornber <ejt@redhat.com>
> > > 
> > > Allow a cache to shrink if the blocks being removed from the cache are
> > > not dirty.
> >  
> > Please would you document the intended procedure here?
> 
> I'm not rebasing this patch (and in turn all commits that follow) to
> tweak this header.  We can add information to cache.txt as a follow-on
> commit.
>  
> > > +			DMERR("unable to shrink cache due to dirty blocks");
> > 
> > This error is highly undesirable: part of a device has been removed
> > while it is still needed!
> 
> Huh?  The device still had dirty blocks, so the cache wasn't resized.

Maybe you were saying: the trigger to resize is based on the size of the
fast device having been reduced.. so getting that error _after_ the user
already resized the fast device isn't really useful.

Valid point... but given DM gives users a tremendous amount of room to
hang itself this is par for the course no?  Making this more
bullet-proof could have userspace detect that the device is being used
as in a dm-cache device... and then running userspace utils to analyze
whether the device still has dirty blocks.

Kernel is merely acting as it was told.. and yeah the user could do the
wrong thing.  So much more documentation is warranted for sure.

Mike

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Alasdair G Kergon Nov. 11, 2013, 10:09 p.m. UTC | #5
On Mon, Nov 11, 2013 at 04:50:09PM -0500, Mike Snitzer wrote:
> Valid point... but given DM gives users a tremendous amount of room to
> hang itself this is par for the course no?  Making this more
> bullet-proof could have userspace detect that the device is being used
> as in a dm-cache device... and then running userspace utils to analyze
> whether the device still has dirty blocks.
 
Indeed - I am asking what the intended process for userspace to follow
is, as I can't see any particularly efficient one at the moment (for a
general case with many dirty blocks) without first making further kernel
changes - and some ways to do this better might involve using a
different kernel interface instead of the one included in this patch.

In other words, I'm asking for some details of the future direction of
the cache shrinking feature so we can see that the interface in this
patch is genuinely useful to userspace and will be built upon as the
userspace shrinking features are developed.  Otherwise I think it could
be premature to include it if it turns out userspace needs to do things
a different way with a different interface.

Alasdair

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Alasdair G Kergon Nov. 11, 2013, 10:15 p.m. UTC | #6
Could you make the case that this interface is only intended for use
when using the cache in a mode that doesn't have dirty blocks?

And that the mechanism for shrinking a cache with dirty blocks has not
yet been developed but might use a quite different interface?

Alasdair

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Alasdair G Kergon Nov. 11, 2013, 11:03 p.m. UTC | #7
> DMERR("unable to shrink cache due to dirty blocks");

Also, what is the process to recover from this error, given that you have
already lost the end of the device and you cannot now resume the device
without a preresume failure?

Do the tools support retrieval of the dirty blocks from the portion of the
device that you do still have?

Alasdair

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Mike Snitzer Nov. 11, 2013, 11:06 p.m. UTC | #8
On Mon, Nov 11, 2013 at 5:15 PM, Alasdair G Kergon <agk@redhat.com> wrote:
> Could you make the case that this interface is only intended for use
> when using the cache in a mode that doesn't have dirty blocks?

This really isn't even an "interface".  All this patch enables a user
to do is shrink the cache device is they have taken care to not have
dirty blocks (be it "cleaner" policy or passthrough or writethrough
operation modes).

> And that the mechanism for shrinking a cache with dirty blocks has not
> yet been developed but might use a quite different interface?

We can add an interface to instruct the cache to clean all dirty
blocks >= a particular size [1].  Whereby enabling even dirty/hot
caches to remain generally useful while allowing to shrink the fast
device.

But no I'm not prepared to predict what that'll look like, and no that
doesn't invalidate the basic support that this patch is providing.

We'll improve the docs but there is no reason to withhold this patch
waiting for more add-ons.  By the time lvm2 has support for dm-cache
this very likely won't be a concern anyway (e.g. we'll have the
ability to do [1] above).

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Mike Snitzer Nov. 11, 2013, 11:08 p.m. UTC | #9
On Mon, Nov 11, 2013 at 6:03 PM, Alasdair G Kergon <agk@redhat.com> wrote:
>> DMERR("unable to shrink cache due to dirty blocks");
>
> Also, what is the process to recover from this error, given that you have
> already lost the end of the device and you cannot now resume the device
> without a preresume failure?
>
> Do the tools support retrieval of the dirty blocks from the portion of the
> device that you do still have?

Whatever the user used to resize the fast device would need to be
reversed.  The metadata hasn't changed, and provided the user can
actually get the original data back (possible with linear volume
anyway) then they can easily recover.  More sophisticated volumes may
not make this possible....

It hurts when one shoots one's self in the foot.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Alasdair G Kergon Nov. 11, 2013, 11:17 p.m. UTC | #10
On Mon, Nov 11, 2013 at 06:06:26PM -0500, Mike Snitzer wrote:
> This really isn't even an "interface".  

It is an extension to the existing userspace-kernel interface: a size
reduction of an underlying device is detected on reload and triggers the
removal of metadata relating to the removed portion of the device.

Another interface might for example require a message to be sent to set
a threshold above which the device is 'cleaned' until there is no
metadata pointing beyond that point.  A subsequent reload using the
smaller underlying device would not then need to trigger any metadata
removal: it would already have been removed.

Alasdair

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Mike Snitzer Nov. 11, 2013, 11:34 p.m. UTC | #11
On Mon, Nov 11, 2013 at 6:17 PM, Alasdair G Kergon <agk@redhat.com> wrote:
> On Mon, Nov 11, 2013 at 06:06:26PM -0500, Mike Snitzer wrote:
>> This really isn't even an "interface".
>
> It is an extension to the existing userspace-kernel interface: a size
> reduction of an underlying device is detected on reload and triggers the
> removal of metadata relating to the removed portion of the device.
>
> Another interface might for example require a message to be sent to set
> a threshold above which the device is 'cleaned' until there is no
> metadata pointing beyond that point.  A subsequent reload using the
> smaller underlying device would not then need to trigger any metadata
> removal: it would already have been removed.

There is no reason to make them mutually exclusive.  Or to _only_ have
the later behavior you suggested where metadata is automatically
resized once clean below the specified threshold.

Even without this patch the user is already screwed if they resize
away a dirty portion of the fast device while it is actively in use by
dm-cache.

relative to your concerns: All this patch changes is detecting the
shrink when still dirty and tells the user they have shot themselves
in the foot.  So if anything this patch offers an improvement.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
diff mbox

Patch

diff --git a/drivers/md/dm-cache-metadata.c b/drivers/md/dm-cache-metadata.c
index 2262b4e..062b83e 100644
--- a/drivers/md/dm-cache-metadata.c
+++ b/drivers/md/dm-cache-metadata.c
@@ -667,19 +667,85 @@  void dm_cache_metadata_close(struct dm_cache_metadata *cmd)
 	kfree(cmd);
 }
 
+/*
+ * Checks that the given cache block is either unmapped or clean.
+ */
+static int block_unmapped_or_clean(struct dm_cache_metadata *cmd, dm_cblock_t b,
+				   bool *result)
+{
+	int r;
+	__le64 value;
+	dm_oblock_t ob;
+	unsigned flags;
+
+	r = dm_array_get_value(&cmd->info, cmd->root, from_cblock(b), &value);
+	if (r) {
+		DMERR("block_unmapped_or_clean failed");
+		return r;
+	}
+
+	unpack_value(value, &ob, &flags);
+	*result = !((flags & M_VALID) && (flags & M_DIRTY));
+
+	return 0;
+}
+
+static int blocks_are_unmapped_or_clean(struct dm_cache_metadata *cmd,
+					dm_cblock_t begin, dm_cblock_t end,
+					bool *result)
+{
+	int r;
+	*result = true;
+
+	while (begin != end) {
+		r = block_unmapped_or_clean(cmd, begin, result);
+		if (r)
+			return r;
+
+		if (!*result) {
+			DMERR("cache block %llu is dirty",
+			      (unsigned long long) from_cblock(begin));
+			return 0;
+		}
+
+		begin = to_cblock(from_cblock(begin) + 1);
+	}
+
+	return 0;
+}
+
 int dm_cache_resize(struct dm_cache_metadata *cmd, dm_cblock_t new_cache_size)
 {
 	int r;
+	bool clean;
 	__le64 null_mapping = pack_value(0, 0);
 
 	down_write(&cmd->root_lock);
 	__dm_bless_for_disk(&null_mapping);
+
+	if (from_cblock(new_cache_size) < from_cblock(cmd->cache_blocks)) {
+		r = blocks_are_unmapped_or_clean(cmd, new_cache_size, cmd->cache_blocks, &clean);
+		if (r) {
+			__dm_unbless_for_disk(&null_mapping);
+			goto out;
+		}
+
+		if (!clean) {
+			DMERR("unable to shrink cache due to dirty blocks");
+			r = -EINVAL;
+			__dm_unbless_for_disk(&null_mapping);
+			goto out;
+		}
+	}
+
 	r = dm_array_resize(&cmd->info, cmd->root, from_cblock(cmd->cache_blocks),
 			    from_cblock(new_cache_size),
 			    &null_mapping, &cmd->root);
 	if (!r)
 		cmd->cache_blocks = new_cache_size;
 	cmd->changed = true;
+
+out:
 	up_write(&cmd->root_lock);
 
 	return r;
diff --git a/drivers/md/dm-cache-target.c b/drivers/md/dm-cache-target.c
index 655994f..183dfc9 100644
--- a/drivers/md/dm-cache-target.c
+++ b/drivers/md/dm-cache-target.c
@@ -2502,26 +2502,71 @@  static int load_discard(void *context, sector_t discard_block_size,
 	return 0;
 }
 
+static dm_cblock_t get_cache_dev_size(struct cache *cache)
+{
+	sector_t size = get_dev_size(cache->cache_dev);
+	(void) sector_div(size, cache->sectors_per_block);
+	return to_cblock(size);
+}
+
+static bool can_resize(struct cache *cache, dm_cblock_t new_size)
+{
+	if (from_cblock(new_size) > from_cblock(cache->cache_size))
+		return true;
+
+	/*
+	 * We can't drop a dirty block when shrinking the cache.
+	 */
+	while (from_cblock(new_size) < from_cblock(cache->cache_size)) {
+		new_size = to_cblock(from_cblock(new_size) + 1);
+		if (is_dirty(cache, new_size)) {
+			DMERR("unable to shrink cache; cache block %llu is dirty",
+			      (unsigned long long) from_cblock(new_size));
+			return false;
+		}
+	}
+
+	return true;
+}
+
+static int resize_cache_dev(struct cache *cache, dm_cblock_t new_size)
+{
+	int r;
+
+	r = dm_cache_resize(cache->cmd, cache->cache_size);
+	if (r) {
+		DMERR("could not resize cache metadata");
+		return r;
+	}
+
+	cache->cache_size = new_size;
+
+	return 0;
+}
+
 static int cache_preresume(struct dm_target *ti)
 {
 	int r = 0;
 	struct cache *cache = ti->private;
-	sector_t actual_cache_size = get_dev_size(cache->cache_dev);
-	(void) sector_div(actual_cache_size, cache->sectors_per_block);
+	dm_cblock_t csize = get_cache_dev_size(cache);
 
 	/*
 	 * Check to see if the cache has resized.
 	 */
-	if (from_cblock(cache->cache_size) != actual_cache_size || !cache->sized) {
-		cache->cache_size = to_cblock(actual_cache_size);
-
-		r = dm_cache_resize(cache->cmd, cache->cache_size);
-		if (r) {
-			DMERR("could not resize cache metadata");
+	if (!cache->sized) {
+		r = resize_cache_dev(cache, csize);
+		if (r)
 			return r;
-		}
 
 		cache->sized = true;
+
+	} else if (csize != cache->cache_size) {
+		if (!can_resize(cache, csize))
+			return -EINVAL;
+
+		r = resize_cache_dev(cache, csize);
+		if (r)
+			return r;
 	}
 
 	if (!cache->loaded_mappings) {