Message ID | 20171020052404.13762-2-ross.zwisler@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Oct 19, 2017 at 11:24:04PM -0600, Ross Zwisler wrote: > Now that we have the ability log filesystem writes using a flat buffer, add > support for DAX. Unfortunately we can't easily track data that has been > written via mmap() now that the dax_flush() abstraction was removed by this > commit: > > commit c3ca015fab6d ("dax: remove the pmem_dax_ops->flush abstraction") > > Otherwise we could just treat each flush as a big write, and store the data > that is being synced to media. It may be worthwhile to add the dax_flush() > entry point back, just as a notifier so we can do this logging. > > The motivation for this support is the need for an xfstest that can test > the new MAP_SYNC DAX flag. By logging the filesystem activity with > dm-log-writes we can show that the MAP_SYNC page faults are writing out > their metadata as they happen, instead of requiring an explicit > msync/fsync. > > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> > --- Ok this is just my ignorance of how DAX works shining through, but do we need a new flag to indicate this is DAX data? You are logging it like it's just normal data going to a certain sector, is that good enough? If it is then hooray this looks fine to me, I'm just slightly confused. Thanks, Josef
On Mon, Oct 23, 2017 at 01:34:09PM -0400, Josef Bacik wrote: > On Thu, Oct 19, 2017 at 11:24:04PM -0600, Ross Zwisler wrote: > > Now that we have the ability log filesystem writes using a flat buffer, add > > support for DAX. Unfortunately we can't easily track data that has been > > written via mmap() now that the dax_flush() abstraction was removed by this > > commit: > > > > commit c3ca015fab6d ("dax: remove the pmem_dax_ops->flush abstraction") > > > > Otherwise we could just treat each flush as a big write, and store the data > > that is being synced to media. It may be worthwhile to add the dax_flush() > > entry point back, just as a notifier so we can do this logging. > > > > The motivation for this support is the need for an xfstest that can test > > the new MAP_SYNC DAX flag. By logging the filesystem activity with > > dm-log-writes we can show that the MAP_SYNC page faults are writing out > > their metadata as they happen, instead of requiring an explicit > > msync/fsync. > > > > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> > > --- > > Ok this is just my ignorance of how DAX works shining through, but do we need a > new flag to indicate this is DAX data? You are logging it like it's just normal > data going to a certain sector, is that good enough? If it is then hooray this > looks fine to me, I'm just slightly confused. Thanks, > > Josef I don't think we need a special flag to specify that it's DAX. Really it's just the same as a normal filesystem write, except that we actually do the work of writing the data via the FS DAX iomap code instead of bubbling it all the way down to the block driver.
On Fri, Oct 20 2017 at 1:24am -0400, Ross Zwisler <ross.zwisler@linux.intel.com> wrote: > Now that we have the ability log filesystem writes using a flat buffer, add > support for DAX. Unfortunately we can't easily track data that has been > written via mmap() now that the dax_flush() abstraction was removed by this > commit: > > commit c3ca015fab6d ("dax: remove the pmem_dax_ops->flush abstraction") > > Otherwise we could just treat each flush as a big write, and store the data > that is being synced to media. It may be worthwhile to add the dax_flush() > entry point back, just as a notifier so we can do this logging. > > The motivation for this support is the need for an xfstest that can test > the new MAP_SYNC DAX flag. By logging the filesystem activity with > dm-log-writes we can show that the MAP_SYNC page faults are writing out > their metadata as they happen, instead of requiring an explicit > msync/fsync. > > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> I've picked this up, please see: https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-4.15/dm&id=ae613bbb0144e84cb3c0ebfa9f4fd4d1507c2f0e I tweaked the header and tweaked a couple whitespace nits. Also switched version bump from 1.0.1 to 1.1.0. Thanks, Mike
On Tue, Oct 24, 2017 at 03:22:23PM -0400, Mike Snitzer wrote: > On Fri, Oct 20 2017 at 1:24am -0400, > Ross Zwisler <ross.zwisler@linux.intel.com> wrote: > > > Now that we have the ability log filesystem writes using a flat buffer, add > > support for DAX. Unfortunately we can't easily track data that has been > > written via mmap() now that the dax_flush() abstraction was removed by this > > commit: > > > > commit c3ca015fab6d ("dax: remove the pmem_dax_ops->flush abstraction") > > > > Otherwise we could just treat each flush as a big write, and store the data > > that is being synced to media. It may be worthwhile to add the dax_flush() > > entry point back, just as a notifier so we can do this logging. > > > > The motivation for this support is the need for an xfstest that can test > > the new MAP_SYNC DAX flag. By logging the filesystem activity with > > dm-log-writes we can show that the MAP_SYNC page faults are writing out > > their metadata as they happen, instead of requiring an explicit > > msync/fsync. > > > > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> > > I've picked this up, please see: > https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-4.15/dm&id=ae613bbb0144e84cb3c0ebfa9f4fd4d1507c2f0e > > I tweaked the header and tweaked a couple whitespace nits. Also > switched version bump from 1.0.1 to 1.1.0. > > Thanks, > Mike Sure, your tweaks look fine. Thanks!
diff --git a/drivers/md/dm-log-writes.c b/drivers/md/dm-log-writes.c index c65f9d1..6a8d352 100644 --- a/drivers/md/dm-log-writes.c +++ b/drivers/md/dm-log-writes.c @@ -10,9 +10,11 @@ #include <linux/init.h> #include <linux/blkdev.h> #include <linux/bio.h> +#include <linux/dax.h> #include <linux/slab.h> #include <linux/kthread.h> #include <linux/freezer.h> +#include <linux/uio.h> #define DM_MSG_PREFIX "log-writes" @@ -609,6 +611,50 @@ static int log_mark(struct log_writes_c *lc, char *data) return 0; } +static int log_dax(struct log_writes_c *lc, sector_t sector, size_t bytes, + struct iov_iter *i) +{ + struct pending_block *block; + + if (!bytes) + return 0; + + block = kzalloc(sizeof(struct pending_block), GFP_KERNEL); + if (!block) { + DMERR("Error allocating dax pending block"); + return -ENOMEM; + } + + block->data = kzalloc(bytes, GFP_KERNEL); + if (!block->data) { + DMERR("Error allocating dax data space"); + kfree(block); + return -ENOMEM; + } + + /* write data provided via the iterator */ + if (!copy_from_iter(block->data, bytes, i)) { + DMERR("Error copying dax data"); + kfree(block->data); + kfree(block); + return -EIO; + } + + /* rewind the iterator so that the block driver can use it */ + iov_iter_revert(i, bytes); + + block->datalen = bytes; + block->sector = bio_to_dev_sectors(lc, sector); + block->nr_sectors = ALIGN(bytes, lc->sectorsize) >> lc->sectorshift; + + atomic_inc(&lc->pending_blocks); + spin_lock_irq(&lc->blocks_lock); + list_add_tail(&block->list, &lc->unflushed_blocks); + spin_unlock_irq(&lc->blocks_lock); + wake_up_process(lc->log_kthread); + return 0; +} + static void log_writes_dtr(struct dm_target *ti) { struct log_writes_c *lc = ti->private; @@ -874,9 +920,49 @@ static void log_writes_io_hints(struct dm_target *ti, struct queue_limits *limit limits->io_min = limits->physical_block_size; } +static long log_writes_dax_direct_access(struct dm_target *ti, pgoff_t pgoff, + long nr_pages, void **kaddr, pfn_t *pfn) +{ + struct log_writes_c *lc = ti->private; + struct block_device *bdev = lc->dev->bdev; + struct dax_device *dax_dev = lc->dev->dax_dev; + sector_t sector = pgoff * PAGE_SECTORS; + int ret; + + ret = bdev_dax_pgoff(bdev, sector, nr_pages * PAGE_SIZE, &pgoff); + if (ret) + return ret; + return dax_direct_access(dax_dev, pgoff, nr_pages, kaddr, pfn); +} + +static size_t log_writes_dax_copy_from_iter(struct dm_target *ti, + pgoff_t pgoff, void *addr, size_t bytes, struct iov_iter *i) +{ + struct log_writes_c *lc = ti->private; + struct block_device *bdev = lc->dev->bdev; + struct dax_device *dax_dev = lc->dev->dax_dev; + sector_t sector = pgoff * PAGE_SECTORS; + int err; + + if (bdev_dax_pgoff(bdev, sector, ALIGN(bytes, PAGE_SIZE), &pgoff)) + return 0; + + /* Don't bother doing anything if logging has been disabled */ + if (!lc->logging_enabled) + goto dax_copy; + + err = log_dax(lc, sector, bytes, i); + if (err) { + DMWARN("Error %d logging DAX write", err); + return 0; + } +dax_copy: + return dax_copy_from_iter(dax_dev, pgoff, addr, bytes, i); +} + static struct target_type log_writes_target = { .name = "log-writes", - .version = {1, 0, 0}, + .version = {1, 0, 1}, .module = THIS_MODULE, .ctr = log_writes_ctr, .dtr = log_writes_dtr, @@ -887,6 +973,8 @@ static struct target_type log_writes_target = { .message = log_writes_message, .iterate_devices = log_writes_iterate_devices, .io_hints = log_writes_io_hints, + .direct_access = log_writes_dax_direct_access, + .dax_copy_from_iter = log_writes_dax_copy_from_iter, }; static int __init dm_log_writes_init(void)
Now that we have the ability log filesystem writes using a flat buffer, add support for DAX. Unfortunately we can't easily track data that has been written via mmap() now that the dax_flush() abstraction was removed by this commit: commit c3ca015fab6d ("dax: remove the pmem_dax_ops->flush abstraction") Otherwise we could just treat each flush as a big write, and store the data that is being synced to media. It may be worthwhile to add the dax_flush() entry point back, just as a notifier so we can do this logging. The motivation for this support is the need for an xfstest that can test the new MAP_SYNC DAX flag. By logging the filesystem activity with dm-log-writes we can show that the MAP_SYNC page faults are writing out their metadata as they happen, instead of requiring an explicit msync/fsync. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> --- Here's a link to Jan's latest MAP_SYNC set, which can be used for the fstest: https://www.spinics.net/lists/linux-xfs/msg11852.html MAP_SYNC is not needed for basic DAX+dm-log-writes functionality. --- drivers/md/dm-log-writes.c | 90 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 89 insertions(+), 1 deletion(-)