Message ID | 1384190448-3861-2-git-send-email-snitzer@redhat.com (mailing list archive) |
---|---|
State | Accepted, archived |
Delegated to: | Mike Snitzer |
Headers | show |
On Mon, Nov 11, 2013 at 12:20:43PM -0500, Mike Snitzer wrote: > From: Joe Thornber <ejt@redhat.com> > > Allow a cache to shrink if the blocks being removed from the cache are > not dirty. Please would you document the intended procedure here? > + DMERR("unable to shrink cache due to dirty blocks"); This error is highly undesirable: part of a device has been removed while it is still needed! How does someone go about reducing the size of the cache avoiding this error? (Analogy: to reduce size of a filesystem you run a filesystem resize process before reducing the size of the block device.) Alasdair -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Mon, Nov 11 2013 at 4:19pm -0500, Alasdair G Kergon <agk@redhat.com> wrote: > On Mon, Nov 11, 2013 at 12:20:43PM -0500, Mike Snitzer wrote: > > From: Joe Thornber <ejt@redhat.com> > > > > Allow a cache to shrink if the blocks being removed from the cache are > > not dirty. > > Please would you document the intended procedure here? I'm not rebasing this patch (and in turn all commits that follow) to tweak this header. We can add information to cache.txt as a follow-on commit. > > + DMERR("unable to shrink cache due to dirty blocks"); > > This error is highly undesirable: part of a device has been removed > while it is still needed! Huh? The device still had dirty blocks, so the cache wasn't resized. > How does someone go about reducing the size of the cache avoiding this > error? (Analogy: to reduce size of a filesystem you run a filesystem > resize process before reducing the size of the block device.) Use the cleaner policy to purge the cache first. -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Mon, Nov 11, 2013 at 04:32:33PM -0500, Mike Snitzer wrote: > On Mon, Nov 11 2013 at 4:19pm -0500, > Alasdair G Kergon <agk@redhat.com> wrote: > > > + DMERR("unable to shrink cache due to dirty blocks"); > > This error is highly undesirable: part of a device has been removed > > while it is still needed! > Huh? The device still had dirty blocks, so the cache wasn't resized. The device no longer has those dirty blocks: you already removed them (otherwise you wouldn't have reached this point). > Use the cleaner policy to purge the cache first. But how do you do that without also cleaning the part of the cache that you're keeping? (For example, why should it be necessary to stop using your preferred cache policy for a while and clean the entire cache just in order to reduce the size of the cache slightly?) If the current situation is temporary then can we indicate what a better solution is likely to look like and whether it is likely still to use this proposed shrinker interface or whether it could replace it with something else? Alasdair -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Mon, Nov 11 2013 at 4:32pm -0500, Mike Snitzer <snitzer@redhat.com> wrote: > On Mon, Nov 11 2013 at 4:19pm -0500, > Alasdair G Kergon <agk@redhat.com> wrote: > > > On Mon, Nov 11, 2013 at 12:20:43PM -0500, Mike Snitzer wrote: > > > From: Joe Thornber <ejt@redhat.com> > > > > > > Allow a cache to shrink if the blocks being removed from the cache are > > > not dirty. > > > > Please would you document the intended procedure here? > > I'm not rebasing this patch (and in turn all commits that follow) to > tweak this header. We can add information to cache.txt as a follow-on > commit. > > > > + DMERR("unable to shrink cache due to dirty blocks"); > > > > This error is highly undesirable: part of a device has been removed > > while it is still needed! > > Huh? The device still had dirty blocks, so the cache wasn't resized. Maybe you were saying: the trigger to resize is based on the size of the fast device having been reduced.. so getting that error _after_ the user already resized the fast device isn't really useful. Valid point... but given DM gives users a tremendous amount of room to hang itself this is par for the course no? Making this more bullet-proof could have userspace detect that the device is being used as in a dm-cache device... and then running userspace utils to analyze whether the device still has dirty blocks. Kernel is merely acting as it was told.. and yeah the user could do the wrong thing. So much more documentation is warranted for sure. Mike -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Mon, Nov 11, 2013 at 04:50:09PM -0500, Mike Snitzer wrote: > Valid point... but given DM gives users a tremendous amount of room to > hang itself this is par for the course no? Making this more > bullet-proof could have userspace detect that the device is being used > as in a dm-cache device... and then running userspace utils to analyze > whether the device still has dirty blocks. Indeed - I am asking what the intended process for userspace to follow is, as I can't see any particularly efficient one at the moment (for a general case with many dirty blocks) without first making further kernel changes - and some ways to do this better might involve using a different kernel interface instead of the one included in this patch. In other words, I'm asking for some details of the future direction of the cache shrinking feature so we can see that the interface in this patch is genuinely useful to userspace and will be built upon as the userspace shrinking features are developed. Otherwise I think it could be premature to include it if it turns out userspace needs to do things a different way with a different interface. Alasdair -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
Could you make the case that this interface is only intended for use when using the cache in a mode that doesn't have dirty blocks? And that the mechanism for shrinking a cache with dirty blocks has not yet been developed but might use a quite different interface? Alasdair -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
> DMERR("unable to shrink cache due to dirty blocks");
Also, what is the process to recover from this error, given that you have
already lost the end of the device and you cannot now resume the device
without a preresume failure?
Do the tools support retrieval of the dirty blocks from the portion of the
device that you do still have?
Alasdair
--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
On Mon, Nov 11, 2013 at 5:15 PM, Alasdair G Kergon <agk@redhat.com> wrote: > Could you make the case that this interface is only intended for use > when using the cache in a mode that doesn't have dirty blocks? This really isn't even an "interface". All this patch enables a user to do is shrink the cache device is they have taken care to not have dirty blocks (be it "cleaner" policy or passthrough or writethrough operation modes). > And that the mechanism for shrinking a cache with dirty blocks has not > yet been developed but might use a quite different interface? We can add an interface to instruct the cache to clean all dirty blocks >= a particular size [1]. Whereby enabling even dirty/hot caches to remain generally useful while allowing to shrink the fast device. But no I'm not prepared to predict what that'll look like, and no that doesn't invalidate the basic support that this patch is providing. We'll improve the docs but there is no reason to withhold this patch waiting for more add-ons. By the time lvm2 has support for dm-cache this very likely won't be a concern anyway (e.g. we'll have the ability to do [1] above). -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Mon, Nov 11, 2013 at 6:03 PM, Alasdair G Kergon <agk@redhat.com> wrote: >> DMERR("unable to shrink cache due to dirty blocks"); > > Also, what is the process to recover from this error, given that you have > already lost the end of the device and you cannot now resume the device > without a preresume failure? > > Do the tools support retrieval of the dirty blocks from the portion of the > device that you do still have? Whatever the user used to resize the fast device would need to be reversed. The metadata hasn't changed, and provided the user can actually get the original data back (possible with linear volume anyway) then they can easily recover. More sophisticated volumes may not make this possible.... It hurts when one shoots one's self in the foot. -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Mon, Nov 11, 2013 at 06:06:26PM -0500, Mike Snitzer wrote:
> This really isn't even an "interface".
It is an extension to the existing userspace-kernel interface: a size
reduction of an underlying device is detected on reload and triggers the
removal of metadata relating to the removed portion of the device.
Another interface might for example require a message to be sent to set
a threshold above which the device is 'cleaned' until there is no
metadata pointing beyond that point. A subsequent reload using the
smaller underlying device would not then need to trigger any metadata
removal: it would already have been removed.
Alasdair
--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
On Mon, Nov 11, 2013 at 6:17 PM, Alasdair G Kergon <agk@redhat.com> wrote: > On Mon, Nov 11, 2013 at 06:06:26PM -0500, Mike Snitzer wrote: >> This really isn't even an "interface". > > It is an extension to the existing userspace-kernel interface: a size > reduction of an underlying device is detected on reload and triggers the > removal of metadata relating to the removed portion of the device. > > Another interface might for example require a message to be sent to set > a threshold above which the device is 'cleaned' until there is no > metadata pointing beyond that point. A subsequent reload using the > smaller underlying device would not then need to trigger any metadata > removal: it would already have been removed. There is no reason to make them mutually exclusive. Or to _only_ have the later behavior you suggested where metadata is automatically resized once clean below the specified threshold. Even without this patch the user is already screwed if they resize away a dirty portion of the fast device while it is actively in use by dm-cache. relative to your concerns: All this patch changes is detecting the shrink when still dirty and tells the user they have shot themselves in the foot. So if anything this patch offers an improvement. -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
diff --git a/drivers/md/dm-cache-metadata.c b/drivers/md/dm-cache-metadata.c index 2262b4e..062b83e 100644 --- a/drivers/md/dm-cache-metadata.c +++ b/drivers/md/dm-cache-metadata.c @@ -667,19 +667,85 @@ void dm_cache_metadata_close(struct dm_cache_metadata *cmd) kfree(cmd); } +/* + * Checks that the given cache block is either unmapped or clean. + */ +static int block_unmapped_or_clean(struct dm_cache_metadata *cmd, dm_cblock_t b, + bool *result) +{ + int r; + __le64 value; + dm_oblock_t ob; + unsigned flags; + + r = dm_array_get_value(&cmd->info, cmd->root, from_cblock(b), &value); + if (r) { + DMERR("block_unmapped_or_clean failed"); + return r; + } + + unpack_value(value, &ob, &flags); + *result = !((flags & M_VALID) && (flags & M_DIRTY)); + + return 0; +} + +static int blocks_are_unmapped_or_clean(struct dm_cache_metadata *cmd, + dm_cblock_t begin, dm_cblock_t end, + bool *result) +{ + int r; + *result = true; + + while (begin != end) { + r = block_unmapped_or_clean(cmd, begin, result); + if (r) + return r; + + if (!*result) { + DMERR("cache block %llu is dirty", + (unsigned long long) from_cblock(begin)); + return 0; + } + + begin = to_cblock(from_cblock(begin) + 1); + } + + return 0; +} + int dm_cache_resize(struct dm_cache_metadata *cmd, dm_cblock_t new_cache_size) { int r; + bool clean; __le64 null_mapping = pack_value(0, 0); down_write(&cmd->root_lock); __dm_bless_for_disk(&null_mapping); + + if (from_cblock(new_cache_size) < from_cblock(cmd->cache_blocks)) { + r = blocks_are_unmapped_or_clean(cmd, new_cache_size, cmd->cache_blocks, &clean); + if (r) { + __dm_unbless_for_disk(&null_mapping); + goto out; + } + + if (!clean) { + DMERR("unable to shrink cache due to dirty blocks"); + r = -EINVAL; + __dm_unbless_for_disk(&null_mapping); + goto out; + } + } + r = dm_array_resize(&cmd->info, cmd->root, from_cblock(cmd->cache_blocks), from_cblock(new_cache_size), &null_mapping, &cmd->root); if (!r) cmd->cache_blocks = new_cache_size; cmd->changed = true; + +out: up_write(&cmd->root_lock); return r; diff --git a/drivers/md/dm-cache-target.c b/drivers/md/dm-cache-target.c index 655994f..183dfc9 100644 --- a/drivers/md/dm-cache-target.c +++ b/drivers/md/dm-cache-target.c @@ -2502,26 +2502,71 @@ static int load_discard(void *context, sector_t discard_block_size, return 0; } +static dm_cblock_t get_cache_dev_size(struct cache *cache) +{ + sector_t size = get_dev_size(cache->cache_dev); + (void) sector_div(size, cache->sectors_per_block); + return to_cblock(size); +} + +static bool can_resize(struct cache *cache, dm_cblock_t new_size) +{ + if (from_cblock(new_size) > from_cblock(cache->cache_size)) + return true; + + /* + * We can't drop a dirty block when shrinking the cache. + */ + while (from_cblock(new_size) < from_cblock(cache->cache_size)) { + new_size = to_cblock(from_cblock(new_size) + 1); + if (is_dirty(cache, new_size)) { + DMERR("unable to shrink cache; cache block %llu is dirty", + (unsigned long long) from_cblock(new_size)); + return false; + } + } + + return true; +} + +static int resize_cache_dev(struct cache *cache, dm_cblock_t new_size) +{ + int r; + + r = dm_cache_resize(cache->cmd, cache->cache_size); + if (r) { + DMERR("could not resize cache metadata"); + return r; + } + + cache->cache_size = new_size; + + return 0; +} + static int cache_preresume(struct dm_target *ti) { int r = 0; struct cache *cache = ti->private; - sector_t actual_cache_size = get_dev_size(cache->cache_dev); - (void) sector_div(actual_cache_size, cache->sectors_per_block); + dm_cblock_t csize = get_cache_dev_size(cache); /* * Check to see if the cache has resized. */ - if (from_cblock(cache->cache_size) != actual_cache_size || !cache->sized) { - cache->cache_size = to_cblock(actual_cache_size); - - r = dm_cache_resize(cache->cmd, cache->cache_size); - if (r) { - DMERR("could not resize cache metadata"); + if (!cache->sized) { + r = resize_cache_dev(cache, csize); + if (r) return r; - } cache->sized = true; + + } else if (csize != cache->cache_size) { + if (!can_resize(cache, csize)) + return -EINVAL; + + r = resize_cache_dev(cache, csize); + if (r) + return r; } if (!cache->loaded_mappings) {