[v3] xfs: test per-inode DAX flag by IO

Message ID 1484878888-11483-1-git-send-email-xzhou@redhat.com (mailing list archive)
State New, archived

Commit Message

Murphy Zhou Jan. 20, 2017, 2:21 a.m. UTC
In a DAX mountpoint, do IO between files with and
without the per-inode DAX flag. We do mmap and O_DIRECT
read/write IO in this case. Then test again on the
same device without the dax mount option.

Add helper _require_scratch_dax to make sure we can
test the DAX feature on SCRATCH_DEV.

Add mmap dio test programme to test read/write
between a mmap area of one file and another file
directly, with different sizes.

Signed-off-by: Xiong Zhou <xzhou@redhat.com>
---
v3:
 close fds in C test programme for clean up.

 .gitignore        |   1 +
 common/rc         |  14 +++++++
 src/Makefile      |   2 +-
 src/t_mmap_dio.c  |  89 +++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/138     | 110 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/138.out |   2 +
 tests/xfs/group   |   1 +
 7 files changed, 218 insertions(+), 1 deletion(-)
 create mode 100644 src/t_mmap_dio.c
 create mode 100755 tests/xfs/138
 create mode 100644 tests/xfs/138.out

Comments

Murphy Zhou Jan. 20, 2017, 6:15 a.m. UTC | #1
common/rc         : require that SCRATCH_DEV supports DAX
src/t_mmap_dio.c  : introduce mmap and O_DIRECT rw between files
tests/generic/405 : IO between DAX/non-DAX mountpoints
tests/xfs/138     : IO between DAX/non-DAX xfs files (per-inode flag)

v2:
  Merge helper function changes into the first patch;
  Rewrite _require_dax, check options to be sure;
  Print msg in t_mmap_dio.c to show which test is going wrong;
  Empty mount options and check after mount to ensure we
  won't mount with the wrong options;
  Remove unnecessary leading underscore and _fail;
  Use xfs_io instead of dd;
  Other minor fixes.

v3:
 close fds in C test programme for clean up.

v4:
 Test both buffered and O_DIRECT IO;
 Fix arg numbers in C test programme;
 Fix fs options check after mount.
 Cc Jeff Moyer since this test is based on his code.
 (Sorry for the late cc!)

Test status:
  Both cases not run on normal block device;
  Both cases PASS on ramdisk based pmem devices;
  DIO in both cases FAIL on brd based ramdisk with:
  DIO in both cases FAIL on nvdimm devices with:
    +write(Bad address) len 1024 dio dax to nondax
    +write(Bad address) len 4096 dio dax to nondax
    +write(Bad address) len 16777216 dio dax to nondax
    +write(Bad address) len 67108864 dio dax to nondax

  I've reported this to nvdimm list.
  https://lists.01.org/pipermail/linux-nvdimm/2017-January/008600.html

Xiong Zhou (2):
  xfs: test per-inode DAX flag by IO
  generic: test mmap io through DAX and non-DAX

 .gitignore            |   1 +
 common/rc             |  13 ++++++
 src/Makefile          |   2 +-
 src/t_mmap_dio.c      | 105 ++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/405     | 119 ++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/405.out |   2 +
 tests/generic/group   |   1 +
 tests/xfs/138         | 116 ++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/138.out     |   2 +
 tests/xfs/group       |   1 +
 10 files changed, 361 insertions(+), 1 deletion(-)
 create mode 100644 src/t_mmap_dio.c
 create mode 100755 tests/generic/405
 create mode 100644 tests/generic/405.out
 create mode 100755 tests/xfs/138
 create mode 100644 tests/xfs/138.out
Ross Zwisler Jan. 24, 2017, 10:28 p.m. UTC | #2
I just wanted to let you know that I'm testing with these new xfstests right
now, and so far I've been unable to successfully get any PMD faults.  I'm
looking into why that is right now, and should hopefully have some changes so
we can do both PTE and PMD testing with this set.

Also, it looks like the test number "generic/405" was already used in
xfstests/master by this commit:

66768bc generic/405: test mkfs against thin provision device

So this may need to be generic/406. :)
--
To unsubscribe from this list: send the line "unsubscribe fstests" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Murphy Zhou Feb. 3, 2017, 5:57 a.m. UTC | #3
On Tue, Jan 24, 2017 at 03:28:55PM -0700, Ross Zwisler wrote:
> I just wanted to let you know that I'm testing with these new xfstests right
> now, and so far I've been unable to successfully get any PMD faults.  I'm
> looking into why that is right now, and should hopefully have some changes so
> we can do both PTE and PMD testing with this set.

Thank you very much for looking into this!

Adding a printk in dax_iomap_pmd_fault() in fs/dax.c shows that
both test cases call this function, and that __radix_tree_insert()
in lib/radix-tree.c is called with order > 0.  I must have missed something.

> 
> Also, it looks like the test number "generic/405" was already used in
> xfstests/master by this commit:
> 
> 66768bc generic/405: test mkfs against thin provision device
> 
> So this may need to be generic/406. :)

Yeah, I can update this, or Eryu can handle it while applying.

Thanks,
Xiong
Eryu Guan Feb. 3, 2017, 6:29 a.m. UTC | #4
On Fri, Feb 03, 2017 at 01:57:17PM +0800, Xiong Zhou wrote:

> > I just wanted to let you know that I'm testing with these new xfstests right
> > now, and so far I've been unable to successfully get any PMD faults.  I'm
> > looking into why that is right now, and should hopefully have some changes so
> > we can do both PTE and PMD testing with this set.
> 
> Thank you very much for looking into this!
> 
> Adding a printk msg in dax_iomap_pmd_fault in fs/dax.c shows that
> these 2 cases called this function, so do __radix_tree_insert
> in lib/radix-tree.c with order > 0.  I must have missed something..
> 
> > 
> > Also, it looks like the test number "generic/405" was already used in
> > xfstests/master by this commit:
> > 
> > 66768bc generic/405: test mkfs against thin provision device
> > 
> > So this may need to be generic/406. :)
> 
> Ya, i can update this or Eryu can handle it while applying.

I'll renumber after the patch has been reviewed; no need to update the patch
only to change the seq number. (Better to use the latest tree when writing a
new case, though.)

Thanks,
Eryu
Ross Zwisler Feb. 3, 2017, 4:57 p.m. UTC | #5
On Fri, Feb 03, 2017 at 01:57:17PM +0800, Xiong Zhou wrote:
> On Tue, Jan 24, 2017 at 03:28:55PM -0700, Ross Zwisler wrote:
> > I just wanted to let you know that I'm testing with these new xfstests right
> > now, and so far I've been unable to successfully get any PMD faults.  I'm
> > looking into why that is right now, and should hopefully have some changes so
> > we can do both PTE and PMD testing with this set.
> 
> Thank you very much for looking into this!
> 
> Adding a printk msg in dax_iomap_pmd_fault in fs/dax.c shows that
> these 2 cases called this function, so do __radix_tree_insert
> in lib/radix-tree.c with order > 0.  I must have missed something..

Ah, yea, the flow is a little confusing.  When we first try to do a PMD fault
we insert a 2MiB "empty" entry into the radix tree that basically just allows
us to lock an entire 2MiB range.  This happens in dax_iomap_pmd_fault() via 
grab_mapping_entry().  This is most likely what you're seeing with your debug.

Then, with that empty 2MiB entry in place we try and actually service the
fault and insert a real mapping to a 2MiB huge page.  There are still cases
when this can fall back to 4k pages, and one of them is if the block
allocation we are given by the filesystem isn't 2MiB aligned.  That is the
alignment check against PG_PMD_COLOUR in dax_pmd_insert_mapping(), and that's
what we were hitting.  The way to get around this is to tell XFS that we would
like 2MiB aligned and sized block allocations via the following mkfs options:

export MKFS_OPTIONS="-d su=2m,sw=1"

We also need to fallocate our storage space so that we get 2 MiB allocations
instead of 4k allocations.
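
To make the alignment requirement concrete, here is a minimal shell sketch of
the arithmetic involved. It assumes a 2 MiB PMD size (x86-64 with 4 KiB base
pages); PMD_SIZE, is_pmd_aligned and round_up_pmd are hypothetical names used
for illustration only, not part of xfstests or the kernel.

```shell
#!/bin/sh
# Assumption: the PMD size is 2 MiB (x86-64 with 4 KiB base pages).
PMD_SIZE=$((2 * 1024 * 1024))

# Hypothetical helper: succeed if byte offset $1 is a multiple of PMD_SIZE.
is_pmd_aligned()
{
	[ $(($1 % PMD_SIZE)) -eq 0 ]
}

# Hypothetical helper: round byte count $1 up to the next PMD_SIZE multiple.
round_up_pmd()
{
	echo $(( ($1 + PMD_SIZE - 1) / PMD_SIZE * PMD_SIZE ))
}

# Report alignment for a few sample offsets.
for off in 0 4096 2097152 16777216; do
	if is_pmd_aligned $off; then
		echo "$off: 2 MiB aligned"
	else
		echo "$off: not 2 MiB aligned"
	fi
done
```

Both of the larger IO sizes used by the test, 16777216 and 67108864, are
multiples of 2 MiB.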

I've been working on patches that do all of this - I'll try and send them out
today.

This has taken a little longer than I would have liked because when debugging
this issue I found an issue with DAX + DIO in the kernel.  So, your test has
already found an important bug in the kernel before it was even committed to
xfstests!  :)

BTW, if we fallocate our files, is there additional value in writing data into
the files before we start testing as you do via these lines?

$XFS_IO_PROG -f -c "pwrite -W -b $psize 0 $tsize" \
        $SCRATCH_MNT/tf_s >> $seqres.full 2>&1
$XFS_IO_PROG -f -c "pwrite -W -b $psize 0 $tsize" \
        $SCRATCH_MNT/tf_d >> $seqres.full 2>&1

This puts a known pattern into the files and means that reads are handled from
media instead of from hole pages, but we never verify the data pattern and it
slows down the test quite a bit.  What do you think?
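
As a rough illustration of the preallocation idea, the sketch below reserves
space for a file without writing a data pattern into it. It assumes the
util-linux fallocate(1) utility and a filesystem that supports preallocation;
prealloc_file is a hypothetical helper, and the temporary path and 4 MiB size
are placeholders rather than the values the test would use.

```shell
#!/bin/sh
# Hypothetical helper: reserve $2 bytes for file $1 without writing data.
# Assumption: fallocate(1) from util-linux is available. Inside xfstests
# this would presumably be done with the falloc command of xfs_io instead.
prealloc_file()
{
	fallocate -l "$2" "$1" 2>/dev/null
}

tf=$(mktemp)
if prealloc_file "$tf" $((4 * 1024 * 1024)); then
	echo "preallocated $(wc -c < "$tf") bytes"
else
	echo "preallocation not supported on this filesystem"
fi
rm -f "$tf"
```

Because no data is written, setup time should not grow with the file size the
way a full pwrite pass does.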
Murphy Zhou Feb. 4, 2017, 10:14 a.m. UTC | #6
On Fri, Feb 03, 2017 at 09:57:10AM -0700, Ross Zwisler wrote:
> On Fri, Feb 03, 2017 at 01:57:17PM +0800, Xiong Zhou wrote:
> > On Tue, Jan 24, 2017 at 03:28:55PM -0700, Ross Zwisler wrote:
> > > I just wanted to let you know that I'm testing with these new xfstests right
> > > now, and so far I've been unable to successfully get any PMD faults.  I'm
> > > looking into why that is right now, and should hopefully have some changes so
> > > we can do both PTE and PMD testing with this set.
> > 
> > Thank you very much for looking into this!
> > 
> > Adding a printk msg in dax_iomap_pmd_fault in fs/dax.c shows that
> > these 2 cases called this function, so do __radix_tree_insert
> > in lib/radix-tree.c with order > 0.  I must have missed something..
> 
> Ah, yea, the flow is a little confusing.  When we first try to do a PMD fault
> we insert a 2MiB "empty" entry into the radix tree that basically just allows
> us to lock an entire 2MiB range.  This happens in dax_iomap_pmd_fault() via 
> grab_mapping_entry().  This is most likely what you're seeing with your debug.
> 
> Then, with that empty 2MiB entry in place we try and actually service the
> fault and insert a real mapping to a 2MiB huge page.  There are still cases
> when this can fall back to 4k pages, and one of them is if the block
> allocation we are given by the filesystem isn't 2MiB aligned.  That is the
> alignment check against PG_PMD_COLOUR in dax_pmd_insert_mapping(), and that's
> what we were hitting.  The way to get around this is to tell XFS that we would
> like 2MiB aligned and sized block allocations via the following mkfs options:
> 
> export MKFS_OPTIONS="-d su=2m,sw=1"
> 
> We also need to fallocate our storage space so that we get 2 MiB allocations
> instead of 4k allocations.

Aha, I forgot to check the return status of the fault handler. Thanks very much
for the detailed explanation and instructions. :)

> 
> I've been working on patches that do all of this - I'll try and send them out
> today.
> 
> This has taken a little longer than I would have liked because when debugging
> this issue I found an issue with DAX + DIO in the kernel.  So, your test has
> already found an important bug in the kernel before it was even committed to
> xfstests!  :)

Good to know. :)

> 
> BTW, if we fallocate our files, is there additional value in writing data into
> the files before we start testing as you do via these lines?
> 
> $XFS_IO_PROG -f -c "pwrite -W -b $psize 0 $tsize" \
>         $SCRATCH_MNT/tf_s >> $seqres.full 2>&1
> $XFS_IO_PROG -f -c "pwrite -W -b $psize 0 $tsize" \
>         $SCRATCH_MNT/tf_d >> $seqres.full 2>&1
> 
> This puts a known pattern into the files and means that reads are handled from
> media instead of from hole pages, but we never verify the data pattern and it
> slows down the test quite a bit.  What do you think?

falloc is better for this job. I'll send the next version after more testing.

Thanks for reviewing!

--
Xiong
Patch

diff --git a/.gitignore b/.gitignore
index 7dcea14..48a40a0 100644
--- a/.gitignore
+++ b/.gitignore
@@ -129,6 +129,7 @@ 
 /src/cloner
 /src/renameat2
 /src/t_rename_overwrite
+/src/t_mmap_dio
 
 # dmapi/ binaries
 /dmapi/src/common/cmd/read_invis
diff --git a/common/rc b/common/rc
index 892c46e..3706620 100644
--- a/common/rc
+++ b/common/rc
@@ -2632,6 +2632,20 @@  _require_scratch_shutdown()
 	_scratch_unmount
 }
 
+# Does dax mount option work on this dev/fs?
+_require_scratch_dax()
+{
+	_require_scratch
+	_scratch_mkfs > /dev/null 2>&1
+	_scratch_mount -o dax
+	# Check options to be sure. XFS ignores dax option
+	# and goes on if dev underneath does not support dax.
+	_fs_options $SCRATCH_DEV | grep -w "dax" > /dev/null 2>&1
+	[ $? -ne 0 ] && \
+		_notrun "$SCRATCH_DEV $FSTYP does not support -o dax"
+	_scratch_unmount
+}
+
 # Does norecovery support by this fs?
 _require_norecovery()
 {
diff --git a/src/Makefile b/src/Makefile
index 94d74aa..eb5a56c 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -12,7 +12,7 @@  TARGETS = dirstress fill fill2 getpagesize holes lstat64 \
 	godown resvtest writemod makeextents itrash rename \
 	multi_open_unlink dmiperf unwritten_sync genhashnames t_holes \
 	t_mmap_writev t_truncate_cmtime dirhash_collide t_rename_overwrite \
-	holetest t_truncate_self
+	holetest t_truncate_self t_mmap_dio
 
 LINUX_TARGETS = xfsctl bstat t_mtab getdevicesize preallo_rw_pattern_reader \
 	preallo_rw_pattern_writer ftrunc trunc fs_perms testx looptest \
diff --git a/src/t_mmap_dio.c b/src/t_mmap_dio.c
new file mode 100644
index 0000000..abec442
--- /dev/null
+++ b/src/t_mmap_dio.c
@@ -0,0 +1,89 @@ 
+/*
+ * This programme was originally written by
+ *     Jeff Moyer <jmoyer@redhat.com>
+ */
+#define _GNU_SOURCE 1
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <fcntl.h>
+#include <sys/mman.h>
+#include <libaio.h>
+#include <errno.h>
+#include <sys/time.h>
+
+void usage(char *prog)
+{
+	fprintf(stderr,
+		"usage: %s <src file> <dest file> <size> <msg>\n",
+		prog);
+	exit(1);
+}
+
+void err_exit(char *op, unsigned long len, char *s)
+{
+	fprintf(stderr, "%s(%s) len %lu %s\n",
+		op, strerror(errno), len, s);
+	exit(1);
+}
+
+int main(int argc, char **argv)
+{
+	int fd, fd2, ret;
+	char *map;
+	unsigned long len;
+
+	if (argc < 5)
+		usage(basename(argv[0]));
+
+	len = strtoul(argv[3], NULL, 10);
+	if (errno == ERANGE)
+		err_exit("strtoul", 0, argv[4]);
+
+	/* Open source file and mmap*/
+	fd = open(argv[1], O_RDWR, 0644);
+	if (fd < 0)
+		err_exit("open s", len, argv[4]);
+
+	map = (char *)mmap(NULL, len,
+		PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
+	if (map == MAP_FAILED)
+		err_exit("mmap", len, argv[4]);
+
+	/* Open dest file with O_DIRECT */
+	fd2 = open(argv[2], O_RDWR|O_DIRECT, 0644);
+	if (fd2 < 0)
+		err_exit("open d", len, argv[4]);
+
+	/* First, test storing to dest file from source mapping */
+	ret = write(fd2, map, len);
+	if (ret != len)
+		err_exit("write", len, argv[4]);
+
+	ret = (int)lseek(fd2, 0, SEEK_SET);
+	if (ret == -1)
+		err_exit("lseek", len, argv[4]);
+
+	/* Next, test reading from dest file into source mapping */
+	ret = read(fd2, map, len);
+	if (ret != len)
+		err_exit("read", len, argv[4]);
+	ret = msync(map, len, MS_SYNC);
+	if (ret < 0)
+		err_exit("msync", len, argv[4]);
+
+	ret = munmap(map, len);
+	if (ret < 0)
+		err_exit("munmap", len, argv[4]);
+
+	ret = close(fd);
+	if (ret < 0)
+		err_exit("close fd", len, argv[4]);
+
+	ret = close(fd2);
+	if (ret < 0)
+		err_exit("close fd2", len, argv[4]);
+
+	exit(0);
+}
diff --git a/tests/xfs/138 b/tests/xfs/138
new file mode 100755
index 0000000..9822441
--- /dev/null
+++ b/tests/xfs/138
@@ -0,0 +1,110 @@ 
+#! /bin/bash
+# FS QA Test 138
+#
+# Test per-inode DAX flag by mmap direct IO.
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2017 Red Hat Inc.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	cd /
+	rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# remove previous $seqres.full before test
+rm -f $seqres.full
+
+_supported_fs xfs
+_supported_os Linux
+_require_scratch_dax
+_require_test_program "feature"
+_require_test_program "t_mmap_dio"
+_require_xfs_io_command "chattr" "+/-x"
+
+# $1 mmap read/write size
+t_dax_flag_mmap_dio()
+{
+	# both dax
+	$XFS_IO_PROG -c "chattr +x" $SCRATCH_MNT/tf_s
+	$XFS_IO_PROG -c "chattr +x" $SCRATCH_MNT/tf_d
+	src/t_mmap_dio $SCRATCH_MNT/tf_{s,d} $1 "both dax"
+
+	# from non dax to dax
+	$XFS_IO_PROG -c "chattr -x" $SCRATCH_MNT/tf_s
+	src/t_mmap_dio $SCRATCH_MNT/tf_{s,d} $1 "nondax to dax"
+
+	# from dax to non dax
+	$XFS_IO_PROG -c "chattr +x" $SCRATCH_MNT/tf_s
+	$XFS_IO_PROG -c "chattr -x" $SCRATCH_MNT/tf_d
+	src/t_mmap_dio $SCRATCH_MNT/tf_{s,d} $1 "dax to nondax"
+
+	# both non dax
+	$XFS_IO_PROG -c "chattr -x" $SCRATCH_MNT/tf_s
+	src/t_mmap_dio $SCRATCH_MNT/tf_{s,d} $1 "both nondax"
+}
+
+do_tests()
+{
+	# less than page size
+	t_dax_flag_mmap_dio 1024
+	# page size
+	t_dax_flag_mmap_dio `src/feature -s`
+	# bigger sizes, for PMD faults
+	t_dax_flag_mmap_dio 16777216
+	t_dax_flag_mmap_dio 67108864
+}
+
+_scratch_mkfs > /dev/null 2>&1
+
+# mount with dax option
+_scratch_mount "-o dax"
+
+psize=`src/feature -s`
+tsize=$((1024 * 1024 * 1024))
+
+$XFS_IO_PROG -f -c "pwrite -W -b $psize 0 $tsize" \
+	$SCRATCH_MNT/tf_s >> $seqres.full 2>&1
+$XFS_IO_PROG -f -c "pwrite -W -b $psize 0 $tsize" \
+	$SCRATCH_MNT/tf_d >> $seqres.full 2>&1
+
+do_tests
+_scratch_unmount
+
+# mount again without dax option
+export MOUNT_OPTIONS=""
+_scratch_mount
+do_tests
+
+# success, all done
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/xfs/138.out b/tests/xfs/138.out
new file mode 100644
index 0000000..614ba1a
--- /dev/null
+++ b/tests/xfs/138.out
@@ -0,0 +1,2 @@ 
+QA output created by 138
+Silence is golden
diff --git a/tests/xfs/group b/tests/xfs/group
index 3c5884c..4b406c0 100644
--- a/tests/xfs/group
+++ b/tests/xfs/group
@@ -135,6 +135,7 @@ 
 135 auto logprint quick v2log
 136 attr2
 137 auto metadata v2log
+138 auto attr quick
 139 auto quick clone
 140 auto clone
 141 auto log metadata