[6/6] dm cache: add cache block invalidation support

Message ID 1384190448-3861-7-git-send-email-snitzer@redhat.com (mailing list archive)
State Accepted, archived
Delegated to: Mike Snitzer
Headers show

Commit Message

Mike Snitzer Nov. 11, 2013, 5:20 p.m. UTC
From: Joe Thornber <ejt@redhat.com>

Cache block invalidation is removing an entry from the cache without
writing it back.  Cache blocks can be invalidated via the
'invalidate_cblocks' message, which takes an arbitrary number of cblock
ranges:
   invalidate_cblocks [<cblock>|<cblock begin>-<cblock end>]*

E.g.
   dmsetup message my_cache 0 invalidate_cblocks 2345 3456-4567 5678-6789
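
For reference, the same message can be sent programmatically through
libdevmapper (which is what dmsetup uses underneath).  A minimal
sketch, with error handling kept terse and the device name purely an
example:

	#include <stdio.h>
	#include <libdevmapper.h>

	/* Send an invalidate_cblocks message to a cache device. */
	static int invalidate_cblocks(const char *dev, const char *ranges)
	{
		struct dm_task *dmt;
		char msg[256];
		int r = 0;

		snprintf(msg, sizeof(msg), "invalidate_cblocks %s", ranges);

		dmt = dm_task_create(DM_DEVICE_TARGET_MSG);
		if (!dmt)
			return 0;

		/* libdevmapper convention: these return 0 on failure */
		r = dm_task_set_name(dmt, dev) &&
		    dm_task_set_sector(dmt, 0) &&
		    dm_task_set_message(dmt, msg) &&
		    dm_task_run(dmt);

		dm_task_destroy(dmt);
		return r;
	}

invalidate_cblocks("my_cache", "2345 3456-4567 5678-6789") then mirrors
the dmsetup call above.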

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 Documentation/device-mapper/cache.txt |   12 ++-
 drivers/md/dm-cache-target.c          |  225 ++++++++++++++++++++++++++++++++-
 2 files changed, 233 insertions(+), 4 deletions(-)

Comments

Alasdair G Kergon Nov. 12, 2013, 12:46 a.m. UTC | #1
On Mon, Nov 11, 2013 at 12:20:48PM -0500, Mike Snitzer wrote:
> From: Joe Thornber <ejt@redhat.com>
 
> Cache block invalidation is removing an entry from the cache without
> writing it back.  Cache blocks can be invalidated via the
> 'invalidate_cblocks' message, which takes an arbitrary number of cblock
> ranges:
>    invalidate_cblocks [<cblock>|<cblock begin>-<cblock end>]*
 
If we're expecting to need to invalidate a large number of ranges, should we
switch this over to hexadecimal like we did for dm-switch?

Should we consider changing this to <cblock>+<length>, as the dm
interfaces tend to work with a start offset and a length?

Or is the userspace interface unlikely to be the bottleneck in this
code, so it doesn't really matter whether we try to optimise that part?
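
For concreteness, a start+length form would only change the parsing
side; a hypothetical sketch (not something this patch implements),
reusing the patch's cblock_range and to_cblock():

	/* Parse "<cblock>+<length>", e.g. "2345+100" -- hypothetical form. */
	static int parse_cblock_extent(const char *str, struct cblock_range *result)
	{
		uint64_t b, len;
		char dummy;

		if (sscanf(str, "%llu+%llu%c", &b, &len, &dummy) == 2) {
			result->begin = to_cblock(b);
			result->end = to_cblock(b + len);	/* one past the end */
			return 0;
		}

		return -EINVAL;
	}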

> +++ b/Documentation/device-mapper/cache.txt

>  The test suite can be found here:
> -https://github.com/jthornber/thinp-test-suite
> +https://github.com/jthornber/device-mapper-test-suite

It's really not good if URLs in existing documentation become invalid
like this.  Could an obvious pointer be placed at the old location to
direct people to the new location?

Can we document whether or not all the ranges will have been invalidated
before the message returns?

> +	if (!passthrough_mode(&cache->features)) {
> +		DMERR("cache has to be in passthrough mode for invalidation");

Can we document this too?

> @@ -2841,6 +3054,12 @@ static int cache_message(struct dm_target *ti, unsigned argc, char **argv)
> +	if (!strcmp(argv[0], "invalidate_cblocks"))

strcasecmp please!
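
i.e. presumably just:

	if (!strcasecmp(argv[0], "invalidate_cblocks"))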

Alasdair

Mike Snitzer Nov. 12, 2013, 5:35 p.m. UTC | #2
On Mon, Nov 11 2013 at  7:46pm -0500,
Alasdair G Kergon <agk@redhat.com> wrote:

> On Mon, Nov 11, 2013 at 12:20:48PM -0500, Mike Snitzer wrote:
> > From: Joe Thornber <ejt@redhat.com>
>  
> > Cache block invalidation is removing an entry from the cache without
> > writing it back.  Cache blocks can be invalidated via the
> > 'invalidate_cblocks' message, which takes an arbitrary number of cblock
> > ranges:
> >    invalidate_cblocks [<cblock>|<cblock begin>-<cblock end>]*
>  
> If we're expecting to need to invalidate a large number of ranges, should we
> switch this over to hexadecimal like we did for dm-switch?
> 
> Should we consider changing this to <cblock>+<length>, as the dm
> interfaces tend to work with a start offset and a length?
> 
> Or is the userspace interface unlikely to be the bottleneck in this
> code, so it doesn't really matter whether we try to optimise that part?

TBD.  The plan is to introduce an "invalidate_cblocks_hex" variant if needed.
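
Such a variant would presumably differ only in the scan format used by
parse_cblock_range(), along the lines of (sketch only):

	r = sscanf(str, "%llx-%llx%c", &b, &e, &dummy);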

> > +++ b/Documentation/device-mapper/cache.txt
> 
> >  The test suite can be found here:
> > -https://github.com/jthornber/thinp-test-suite
> > +https://github.com/jthornber/device-mapper-test-suite
> 
> It's really not good if URLs in existing documentation become invalid
> like this.  Could an obvious pointer be placed at the old location to
> direct people to the new location?
> 
> Can we document whether or not all the ranges will have been invalidated
> before the message returns?
> 
> > +	if (!passthrough_mode(&cache->features)) {
> > +		DMERR("cache has to be in passthrough mode for invalidation");
> 
> Can we document this too?
> 
> > @@ -2841,6 +3054,12 @@ static int cache_message(struct dm_target *ti, unsigned argc, char **argv)
> > +	if (!strcmp(argv[0], "invalidate_cblocks"))
> 
> strcasecmp please!

Pushed the following commit to address your comments in this patchset;
I can rebase this commit if you have any further suggestions:
http://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next&id=e67df7acf867b654c6eca7d08ee0452e9c5e4534

Mike Snitzer Nov. 12, 2013, 7:26 p.m. UTC | #3
On Tue, Nov 12 2013 at 12:35pm -0500,
Mike Snitzer <snitzer@redhat.com> wrote:
 
> Pushed the following commit to address your comments in this patchset;
> I can rebase this commit if you have any further suggestions:
> http://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next&id=e67df7acf867b654c6eca7d08ee0452e9c5e4534

Rebased with a minor cache.txt revision here:
http://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next&id=7b6b2bc98c0303b7f043ad5b35906f833e56308d

Patch

diff --git a/Documentation/device-mapper/cache.txt b/Documentation/device-mapper/cache.txt
index ff6639f..fc9d2df 100644
--- a/Documentation/device-mapper/cache.txt
+++ b/Documentation/device-mapper/cache.txt
@@ -244,12 +244,22 @@  The message format is:
 E.g.
    dmsetup message my_cache 0 sequential_threshold 1024
 
+
+Invalidation is removing an entry from the cache without writing it
+back.  Cache blocks can be invalidated via the invalidate_cblocks
+message, which takes an arbitrary number of cblock ranges.
+
+   invalidate_cblocks [<cblock>|<cblock begin>-<cblock end>]*
+
+E.g.
+   dmsetup message my_cache 0 invalidate_cblocks 2345 3456-4567 5678-6789
+
 Examples
 ========
 
 The test suite can be found here:
 
-https://github.com/jthornber/thinp-test-suite
+https://github.com/jthornber/device-mapper-test-suite
 
 dmsetup create my_cache --table '0 41943040 cache /dev/mapper/metadata \
 	/dev/mapper/ssd /dev/mapper/origin 512 1 writeback default 0'
diff --git a/drivers/md/dm-cache-target.c b/drivers/md/dm-cache-target.c
index 8c02177..41e664b 100644
--- a/drivers/md/dm-cache-target.c
+++ b/drivers/md/dm-cache-target.c
@@ -150,6 +150,25 @@  struct cache_stats {
 	atomic_t discard_count;
 };
 
+/*
+ * Defines a range of cblocks: begin to (end - 1) are in the range.  end is
+ * the one-past-the-end value.
+ */
+struct cblock_range {
+	dm_cblock_t begin;
+	dm_cblock_t end;
+};
+
+struct invalidation_request {
+	struct list_head list;
+	struct cblock_range *cblocks;
+
+	atomic_t complete;
+	int err;
+
+	wait_queue_head_t result_wait;
+};
+
 struct cache {
 	struct dm_target *ti;
 	struct dm_target_callbacks callbacks;
@@ -241,6 +260,7 @@  struct cache {
 
 	bool need_tick_bio:1;
 	bool sized:1;
+	bool invalidate:1;
 	bool commit_requested:1;
 	bool loaded_mappings:1;
 	bool loaded_discards:1;
@@ -251,6 +271,12 @@  struct cache {
 	struct cache_features features;
 
 	struct cache_stats stats;
+
+	/*
+	 * Invalidation fields.
+	 */
+	spinlock_t invalidation_lock;
+	struct list_head invalidation_requests;
 };
 
 struct per_bio_data {
@@ -283,6 +309,7 @@  struct dm_cache_migration {
 	bool demote:1;
 	bool promote:1;
 	bool requeue_holder:1;
+	bool invalidate:1;
 
 	struct dm_bio_prison_cell *old_ocell;
 	struct dm_bio_prison_cell *new_ocell;
@@ -904,8 +931,11 @@  static void migration_success_post_commit(struct dm_cache_migration *mg)
 			list_add_tail(&mg->list, &cache->quiesced_migrations);
 			spin_unlock_irqrestore(&cache->lock, flags);
 
-		} else
+		} else {
+			if (mg->invalidate)
+				policy_remove_mapping(cache->policy, mg->old_oblock);
 			cleanup_migration(mg);
+		}
 
 	} else {
 		if (mg->requeue_holder)
@@ -1115,6 +1145,7 @@  static void promote(struct cache *cache, struct prealloc *structs,
 	mg->demote = false;
 	mg->promote = true;
 	mg->requeue_holder = true;
+	mg->invalidate = false;
 	mg->cache = cache;
 	mg->new_oblock = oblock;
 	mg->cblock = cblock;
@@ -1137,6 +1168,7 @@  static void writeback(struct cache *cache, struct prealloc *structs,
 	mg->demote = false;
 	mg->promote = false;
 	mg->requeue_holder = true;
+	mg->invalidate = false;
 	mg->cache = cache;
 	mg->old_oblock = oblock;
 	mg->cblock = cblock;
@@ -1161,6 +1193,7 @@  static void demote_then_promote(struct cache *cache, struct prealloc *structs,
 	mg->demote = true;
 	mg->promote = true;
 	mg->requeue_holder = true;
+	mg->invalidate = false;
 	mg->cache = cache;
 	mg->old_oblock = old_oblock;
 	mg->new_oblock = new_oblock;
@@ -1188,6 +1221,7 @@  static void invalidate(struct cache *cache, struct prealloc *structs,
 	mg->demote = true;
 	mg->promote = false;
 	mg->requeue_holder = true;
+	mg->invalidate = true;
 	mg->cache = cache;
 	mg->old_oblock = oblock;
 	mg->cblock = cblock;
@@ -1525,6 +1559,58 @@  static void writeback_some_dirty_blocks(struct cache *cache)
 }
 
 /*----------------------------------------------------------------
+ * Invalidations.
+ * Dropping something from the cache *without* writing back.
+ *--------------------------------------------------------------*/
+
+static void process_invalidation_request(struct cache *cache, struct invalidation_request *req)
+{
+	int r = 0;
+	uint64_t begin = from_cblock(req->cblocks->begin);
+	uint64_t end = from_cblock(req->cblocks->end);
+
+	while (begin != end) {
+		r = policy_remove_cblock(cache->policy, to_cblock(begin));
+		if (!r) {
+			r = dm_cache_remove_mapping(cache->cmd, to_cblock(begin));
+			if (r)
+				break;
+
+		} else if (r == -ENODATA) {
+			/* harmless, already unmapped */
+			r = 0;
+
+		} else {
+			DMERR("policy_remove_cblock failed");
+			break;
+		}
+
+		begin++;
+	}
+
+	cache->commit_requested = true;
+
+	req->err = r;
+	atomic_set(&req->complete, 1);
+
+	wake_up(&req->result_wait);
+}
+
+static void process_invalidation_requests(struct cache *cache)
+{
+	struct list_head list;
+	struct invalidation_request *req, *tmp;
+
+	INIT_LIST_HEAD(&list);
+	spin_lock(&cache->invalidation_lock);
+	list_splice_init(&cache->invalidation_requests, &list);
+	spin_unlock(&cache->invalidation_lock);
+
+	list_for_each_entry_safe (req, tmp, &list, list)
+		process_invalidation_request(cache, req);
+}
+
+/*----------------------------------------------------------------
  * Main worker loop
  *--------------------------------------------------------------*/
 static bool is_quiescing(struct cache *cache)
@@ -1593,7 +1679,8 @@  static int more_work(struct cache *cache)
 			!bio_list_empty(&cache->deferred_writethrough_bios) ||
 			!list_empty(&cache->quiesced_migrations) ||
 			!list_empty(&cache->completed_migrations) ||
-			!list_empty(&cache->need_commit_migrations);
+			!list_empty(&cache->need_commit_migrations) ||
+			cache->invalidate;
 }
 
 static void do_worker(struct work_struct *ws)
@@ -1605,6 +1692,7 @@  static void do_worker(struct work_struct *ws)
 			writeback_some_dirty_blocks(cache);
 			process_deferred_writethrough_bios(cache);
 			process_deferred_bios(cache);
+			process_invalidation_requests(cache);
 		}
 
 		process_migrations(cache, &cache->quiesced_migrations, issue_copy);
@@ -2271,6 +2359,7 @@  static int cache_create(struct cache_args *ca, struct cache **result)
 
 	cache->need_tick_bio = true;
 	cache->sized = false;
+	cache->invalidate = false;
 	cache->commit_requested = false;
 	cache->loaded_mappings = false;
 	cache->loaded_discards = false;
@@ -2284,6 +2373,9 @@  static int cache_create(struct cache_args *ca, struct cache **result)
 	atomic_set(&cache->stats.commit_count, 0);
 	atomic_set(&cache->stats.discard_count, 0);
 
+	spin_lock_init(&cache->invalidation_lock);
+	INIT_LIST_HEAD(&cache->invalidation_requests);
+
 	*result = cache;
 	return 0;
 
@@ -2833,7 +2925,128 @@  err:
 }
 
 /*
- * Supports <key> <value>.
+ * A cache block range can take two forms:
+ *
+ * i) A single cblock, eg. '3456'
+ * ii) A begin and end cblock with a dash between, eg. 123-234
+ */
+static int parse_cblock_range(struct cache *cache, const char *str,
+			      struct cblock_range *result)
+{
+	char dummy;
+	uint64_t b, e;
+	int r;
+
+	/*
+	 * Try and parse form (ii) first.
+	 */
+	r = sscanf(str, "%llu-%llu%c", &b, &e, &dummy);
+	if (r < 0)
+		return r;
+
+	if (r == 2) {
+		result->begin = to_cblock(b);
+		result->end = to_cblock(e);
+		return 0;
+	}
+
+	/*
+	 * That didn't work, try form (i).
+	 */
+	r = sscanf(str, "%llu%c", &b, &dummy);
+	if (r < 0)
+		return r;
+
+	if (r == 1) {
+		result->begin = to_cblock(b);
+		result->end = to_cblock(from_cblock(result->begin) + 1u);
+		return 0;
+	}
+
+	DMERR("invalid cblock range '%s'", str);
+	return -EINVAL;
+}
+
+static int validate_cblock_range(struct cache *cache, struct cblock_range *range)
+{
+	uint64_t b = from_cblock(range->begin);
+	uint64_t e = from_cblock(range->end);
+	uint64_t n = from_cblock(cache->cache_size);
+
+	if (b >= n) {
+		DMERR("begin cblock out of range: %llu >= %llu", b, n);
+		return -EINVAL;
+	}
+
+	if (e > n) {
+		DMERR("end cblock out of range: %llu > %llu", e, n);
+		return -EINVAL;
+	}
+
+	if (b >= e) {
+		DMERR("invalid cblock range: %llu >= %llu", b, e);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int request_invalidation(struct cache *cache, struct cblock_range *range)
+{
+	struct invalidation_request req;
+
+	INIT_LIST_HEAD(&req.list);
+	req.cblocks = range;
+	atomic_set(&req.complete, 0);
+	req.err = 0;
+	init_waitqueue_head(&req.result_wait);
+
+	spin_lock(&cache->invalidation_lock);
+	list_add(&req.list, &cache->invalidation_requests);
+	spin_unlock(&cache->invalidation_lock);
+	wake_worker(cache);
+
+	wait_event(req.result_wait, atomic_read(&req.complete));
+	return req.err;
+}
+
+static int process_invalidate_cblocks_message(struct cache *cache, unsigned count,
+					      const char **cblock_ranges)
+{
+	int r = 0;
+	unsigned i;
+	struct cblock_range range;
+
+	if (!passthrough_mode(&cache->features)) {
+		DMERR("cache has to be in passthrough mode for invalidation");
+		return -EPERM;
+	}
+
+	for (i = 0; i < count; i++) {
+		r = parse_cblock_range(cache, cblock_ranges[i], &range);
+		if (r)
+			break;
+
+		r = validate_cblock_range(cache, &range);
+		if (r)
+			break;
+
+		/*
+		 * Pass begin and end origin blocks to the worker and wake it.
+		 */
+		r = request_invalidation(cache, &range);
+		if (r)
+			break;
+	}
+
+	return r;
+}
+
+/*
+ * Supports
+ *	"<key> <value>"
+ * and
+ *     "invalidate_cblocks [(<begin>)|(<begin>-<end>)]*
  *
  * The key migration_threshold is supported by the cache target core.
  */
@@ -2841,6 +3054,12 @@  static int cache_message(struct dm_target *ti, unsigned argc, char **argv)
 {
 	struct cache *cache = ti->private;
 
+	if (!argc)
+		return -EINVAL;
+
+	if (!strcmp(argv[0], "invalidate_cblocks"))
+		return process_invalidate_cblocks_message(cache, argc - 1, (const char **) argv + 1);
+
 	if (argc != 2)
 		return -EINVAL;