diff mbox

[v5] DAX: enable iostat for read/write

Message ID 20170112183848.23159-1-toshi.kani@hpe.com (mailing list archive)
State New, archived
Headers show

Commit Message

Kani, Toshi Jan. 12, 2017, 6:38 p.m. UTC
DAX IO path does not support iostat, but its metadata IO path does.
Therefore, iostat shows metadata IO statistics only, which has been
confusing to users.

Add iostat support to the DAX read/write path.

Note, iostat still does not support the DAX mmap path as it allows
user applications to access directly.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Joe Perches <joe@perches.com>
---
v5:
 - Add a flag in case 'start' is 0 after 'jiffies' rolls over.
   (Dan Williams)
 - Fix a signed/unsigned conversion. (Joe Perches)
---
 fs/dax.c |   15 +++++++++++++++
 1 file changed, 15 insertions(+)

Comments

Joe Perches Jan. 12, 2017, 6:02 p.m. UTC | #1
On Thu, 2017-01-12 at 11:38 -0700, Toshi Kani wrote:
> DAX IO path does not support iostat, but its metadata IO path does.
> Therefore, iostat shows metadata IO statistics only, which has been
> confusing to users.
[]
> diff --git a/fs/dax.c b/fs/dax.c
[]
> @@ -1058,12 +1058,24 @@ dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
>  {
>  	struct address_space *mapping = iocb->ki_filp->f_mapping;
>  	struct inode *inode = mapping->host;
> +	struct gendisk *disk = inode->i_sb->s_bdev->bd_disk;
>  	loff_t pos = iocb->ki_pos, ret = 0, done = 0;
>  	unsigned flags = 0;
> +	unsigned long start = 0;
> +	int do_acct = blk_queue_io_stat(disk->queue);
>  
>  	if (iov_iter_rw(iter) == WRITE)
>  		flags |= IOMAP_WRITE;
>  
> +	if (do_acct) {
> +		sector_t sec = iov_iter_count(iter) >> 9;
> +
> +		start = jiffies;
> +		generic_start_io_acct(iov_iter_rw(iter),
> +				      min_t(unsigned long, 1, sec),

I believe I mislead you with a thinko.

Your original code was
	(!sec) ? 1 : sec
and I suggested incorrectly using min_t

It should of course be max_t.  Sorry.

Also, as sec is now sector_t (u64), perhaps this
unsigned long cast is incorrect.
Kani, Toshi Jan. 12, 2017, 6:26 p.m. UTC | #2
On Thu, 2017-01-12 at 10:02 -0800, Joe Perches wrote:
> On Thu, 2017-01-12 at 11:38 -0700, Toshi Kani wrote:

> > DAX IO path does not support iostat, but its metadata IO path does.

> > Therefore, iostat shows metadata IO statistics only, which has been

> > confusing to users.

> 

> []

> > diff --git a/fs/dax.c b/fs/dax.c

> 

> []

> > @@ -1058,12 +1058,24 @@ dax_iomap_rw(struct kiocb *iocb, struct

> > iov_iter *iter,

> >  {

> >  	struct address_space *mapping = iocb->ki_filp->f_mapping;

> >  	struct inode *inode = mapping->host;

> > +	struct gendisk *disk = inode->i_sb->s_bdev->bd_disk;

> >  	loff_t pos = iocb->ki_pos, ret = 0, done = 0;

> >  	unsigned flags = 0;

> > +	unsigned long start = 0;

> > +	int do_acct = blk_queue_io_stat(disk->queue);

> >  

> >  	if (iov_iter_rw(iter) == WRITE)

> >  		flags |= IOMAP_WRITE;

> >  

> > +	if (do_acct) {

> > +		sector_t sec = iov_iter_count(iter) >> 9;

> > +

> > +		start = jiffies;

> > +		generic_start_io_acct(iov_iter_rw(iter),

> > +				      min_t(unsigned long, 1,

> > sec),

> 

> I believe I mislead you with a thinko.

> 

> Your original code was

> 	(!sec) ? 1 : sec

> and I suggested incorrectly using min_t

> 

> It should of course be max_t.  Sorry.


My bad. I should have caught it.

> Also, as sec is now sector_t (u64), perhaps this

> unsigned long cast is incorrect.


I see. Since iov_iter_count() returns a size_t value, I will use
'size_t' for 'sec' as you originally suggested. 

Thanks,
-Toshi
Christoph Hellwig Jan. 25, 2017, 11:19 a.m. UTC | #3
On Thu, Jan 12, 2017 at 11:38:48AM -0700, Toshi Kani wrote:
> DAX IO path does not support iostat, but its metadata IO path does.
> Therefore, iostat shows metadata IO statistics only, which has been
> confusing to users.
> 
> Add iostat support to the DAX read/write path.
> 
> Note, iostat still does not support the DAX mmap path as it allows
> user applications to access directly.

NAK.  DAX I/O should not be accounted for block device statistics.
Jeff Moyer Jan. 25, 2017, 3:15 p.m. UTC | #4
Christoph Hellwig <hch@infradead.org> writes:

> On Thu, Jan 12, 2017 at 11:38:48AM -0700, Toshi Kani wrote:
>> DAX IO path does not support iostat, but its metadata IO path does.
>> Therefore, iostat shows metadata IO statistics only, which has been
>> confusing to users.
>> 
>> Add iostat support to the DAX read/write path.
>> 
>> Note, iostat still does not support the DAX mmap path as it allows
>> user applications to access directly.
>
> NAK.  DAX I/O should not be accounted for block device statistics.

Agreed, this is a layering violation.

Cheers,
Jeff
Kani, Toshi Jan. 25, 2017, 3:55 p.m. UTC | #5
On Wed, 2017-01-25 at 10:15 -0500, Jeff Moyer wrote:
> Christoph Hellwig <hch@infradead.org> writes:

> 

> > On Thu, Jan 12, 2017 at 11:38:48AM -0700, Toshi Kani wrote:

> > > DAX IO path does not support iostat, but its metadata IO path

> > > does. Therefore, iostat shows metadata IO statistics only, which

> > > has been confusing to users.

> > > 

> > > Add iostat support to the DAX read/write path.

> > > 

> > > Note, iostat still does not support the DAX mmap path as it

> > > allows user applications to access directly.

> > 

> > NAK.  DAX I/O should not be accounted for block device statistics.

> 

> Agreed, this is a layering violation.


I will check to see if it can fit on top of Dan's patch-set.

Thanks,
-Toshi
diff mbox

Patch

diff --git a/fs/dax.c b/fs/dax.c
index 5c74f60..a3e406a 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1058,12 +1058,24 @@  dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
 {
 	struct address_space *mapping = iocb->ki_filp->f_mapping;
 	struct inode *inode = mapping->host;
+	struct gendisk *disk = inode->i_sb->s_bdev->bd_disk;
 	loff_t pos = iocb->ki_pos, ret = 0, done = 0;
 	unsigned flags = 0;
+	unsigned long start = 0;
+	int do_acct = blk_queue_io_stat(disk->queue);
 
 	if (iov_iter_rw(iter) == WRITE)
 		flags |= IOMAP_WRITE;
 
+	if (do_acct) {
+		sector_t sec = iov_iter_count(iter) >> 9;
+
+		start = jiffies;
+		generic_start_io_acct(iov_iter_rw(iter),
+				      min_t(unsigned long, 1, sec),
+				      &disk->part0);
+	}
+
 	while (iov_iter_count(iter)) {
 		ret = iomap_apply(inode, pos, iov_iter_count(iter), flags, ops,
 				iter, dax_iomap_actor);
@@ -1073,6 +1085,9 @@  dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
 		done += ret;
 	}
 
+	if (do_acct)
+		generic_end_io_acct(iov_iter_rw(iter), &disk->part0, start);
+
 	iocb->ki_pos += done;
 	return done ? done : ret;
 }