diff mbox

[v2,3/7] dm: fix test for DAX device support

Message ID 20180529195106.14268-4-ross.zwisler@linux.intel.com (mailing list archive)
State Not Applicable
Headers show

Commit Message

Ross Zwisler May 29, 2018, 7:51 p.m. UTC
Currently device_supports_dax() just checks to see if the QUEUE_FLAG_DAX
flag is set on the device's request queue to decide whether or not the
device supports filesystem DAX.  This is insufficient because there are
devices like PMEM namespaces in raw mode which have QUEUE_FLAG_DAX set but
which don't actually support DAX.

This means that you could create a dm-linear device, for example, where the
first part of the dm-linear device was a PMEM namespace in fsdax mode and
the second part was a PMEM namespace in raw mode.  Both DM and the
filesystem you put on that dm-linear device would think the whole device
supports DAX, which would lead to bad behavior once your raw PMEM namespace
part using DAX needed struct page for something.

Fix this by using bdev_dax_supported() like filesystems do at mount time.
This checks for raw mode and also performs other tests like checking to
make sure the dax_direct_access() path works.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Fixes: commit 545ed20e6df6 ("dm: add infrastructure for DAX support")
---
 drivers/md/dm-table.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

Comments

Mike Snitzer June 1, 2018, 8:19 p.m. UTC | #1
On Tue, May 29 2018 at  3:51P -0400,
Ross Zwisler <ross.zwisler@linux.intel.com> wrote:

> Currently device_supports_dax() just checks to see if the QUEUE_FLAG_DAX
> flag is set on the device's request queue to decide whether or not the
> device supports filesystem DAX.  This is insufficient because there are
> devices like PMEM namespaces in raw mode which have QUEUE_FLAG_DAX set but
> which don't actually support DAX.

Isn't that a PMEM bug then?

What is the point of setting QUEUE_FLAG_DAX if it cannot be trusted?
 
> This means that you could create a dm-linear device, for example, where the
> first part of the dm-linear device was a PMEM namespace in fsdax mode and
> the second part was a PMEM namespace in raw mode.  Both DM and the
> filesystem you put on that dm-linear device would think the whole device
> supports DAX, which would lead to bad behavior once your raw PMEM namespace
> part using DAX needed struct page for something.

The PMEM namespace in raw mode shouldn't be setting QUEUE_FLAG_DAX, if
it didn't then the stacked-up linear DM wouldn't 

> Fix this by using bdev_dax_supported() like filesystems do at mount time.
> This checks for raw mode and also performs other tests like checking to
> make sure the dax_direct_access() path works.

Sorry "This" does those things where?

> 
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> Fixes: commit 545ed20e6df6 ("dm: add infrastructure for DAX support")
> ---
>  drivers/md/dm-table.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
> index 0589a4da12bb..5bb994b012ca 100644
> --- a/drivers/md/dm-table.c
> +++ b/drivers/md/dm-table.c
> @@ -885,9 +885,7 @@ EXPORT_SYMBOL_GPL(dm_table_set_type);
>  static int device_supports_dax(struct dm_target *ti, struct dm_dev *dev,
>  			       sector_t start, sector_t len, void *data)
>  {
> -	struct request_queue *q = bdev_get_queue(dev->bdev);
> -
> -	return q && blk_queue_dax(q);
> +	return bdev_dax_supported(dev->bdev, PAGE_SIZE);
>  }
>  
>  static bool dm_table_supports_dax(struct dm_table *t)
> -- 
> 2.14.3
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mike Snitzer June 1, 2018, 8:46 p.m. UTC | #2
On Fri, Jun 01 2018 at  4:19P -0400,
Mike Snitzer <snitzer@redhat.com> wrote:

> On Tue, May 29 2018 at  3:51P -0400,
> Ross Zwisler <ross.zwisler@linux.intel.com> wrote:
> 
> > Currently device_supports_dax() just checks to see if the QUEUE_FLAG_DAX
> > flag is set on the device's request queue to decide whether or not the
> > device supports filesystem DAX.  This is insufficient because there are
> > devices like PMEM namespaces in raw mode which have QUEUE_FLAG_DAX set but
> > which don't actually support DAX.
> 
> Isn't that a PMEM bug then?
> 
> What is the point of setting QUEUE_FLAG_DAX if it cannot be trusted?
>  
> > This means that you could create a dm-linear device, for example, where the
> > first part of the dm-linear device was a PMEM namespace in fsdax mode and
> > the second part was a PMEM namespace in raw mode.  Both DM and the
> > filesystem you put on that dm-linear device would think the whole device
> > supports DAX, which would lead to bad behavior once your raw PMEM namespace
> > part using DAX needed struct page for something.
> 
> The PMEM namespace in raw mode shouldn't be setting QUEUE_FLAG_DAX, if
> it didn't then the stacked-up linear DM wouldn't 
> 
> > Fix this by using bdev_dax_supported() like filesystems do at mount time.
> > This checks for raw mode and also performs other tests like checking to
> > make sure the dax_direct_access() path works.
> 
> Sorry "This" does those things where?

I see you meant bdev_dax_supported() does these additional checks.

My previous question stands though.  Why is QUEUE_FLAG_DAX getting set
if the device hasn't already passed these checks?  Shouldn't setting
QUEUE_FLAG_DAX on request_queue depend on bdev_dax_supported() passing?

But looking at the drivers that do set QUEUE_FLAG_DAX: they
don't have the bdev readily available.  Anyway, just strikes me as
bizarre that a driver can set QUEUE_FLAG_DAX without having to have
ensured bdev_dax_supported() passes (even if not programatically, but
that the developer has verified the hooks, et al exist).

But I'll give up on this line of questioning..

My dilemma now is: how do I take these changes without first rebasing
linux-dm.git ontop of Darrick's xfs tree?

I probably should've reviewed faster and been the one to take the entire
set (with appropriate acks obviously).
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ross Zwisler June 1, 2018, 9:11 p.m. UTC | #3
On Fri, Jun 01, 2018 at 04:46:04PM -0400, Mike Snitzer wrote:
> On Fri, Jun 01 2018 at  4:19P -0400,
> Mike Snitzer <snitzer@redhat.com> wrote:
> 
> > On Tue, May 29 2018 at  3:51P -0400,
> > Ross Zwisler <ross.zwisler@linux.intel.com> wrote:
> > 
> > > Currently device_supports_dax() just checks to see if the QUEUE_FLAG_DAX
> > > flag is set on the device's request queue to decide whether or not the
> > > device supports filesystem DAX.  This is insufficient because there are
> > > devices like PMEM namespaces in raw mode which have QUEUE_FLAG_DAX set but
> > > which don't actually support DAX.
> > 
> > Isn't that a PMEM bug then?
> > 
> > What is the point of setting QUEUE_FLAG_DAX if it cannot be trusted?

Yup, that seems like a bug.

> > > This means that you could create a dm-linear device, for example, where the
> > > first part of the dm-linear device was a PMEM namespace in fsdax mode and
> > > the second part was a PMEM namespace in raw mode.  Both DM and the
> > > filesystem you put on that dm-linear device would think the whole device
> > > supports DAX, which would lead to bad behavior once your raw PMEM namespace
> > > part using DAX needed struct page for something.
> > 
> > The PMEM namespace in raw mode shouldn't be setting QUEUE_FLAG_DAX, if
> > it didn't then the stacked-up linear DM wouldn't 
> > 
> > > Fix this by using bdev_dax_supported() like filesystems do at mount time.
> > > This checks for raw mode and also performs other tests like checking to
> > > make sure the dax_direct_access() path works.
> > 
> > Sorry "This" does those things where?
> 
> I see you meant bdev_dax_supported() does these additional checks.
> 
> My previous question stands though.  Why is QUEUE_FLAG_DAX getting set
> if the device hasn't already passed these checks?  Shouldn't setting
> QUEUE_FLAG_DAX on request_queue depend on bdev_dax_supported() passing?
> 
> But looking at the drivers that do set QUEUE_FLAG_DAX: they
> don't have the bdev readily available.  Anyway, just strikes me as
> bizarre that a driver can set QUEUE_FLAG_DAX without having to have
> ensured bdev_dax_supported() passes (even if not programatically, but
> that the developer has verified the hooks, et al exist).

I agree that it's a bug that QUEUE_FLAG_DAX is set for raw mode namespaces,
thanks for pointing that out.  I will fix that, but I still think that DM
should be using bdev_dax_supported() like the filesystems do.  Even beyond the
raw mode / fsdax mode thing, this does additional checks that are useful. So,
I think these DM patches are still correct.

> But I'll give up on this line of questioning..
> 
> My dilemma now is: how do I take these changes without first rebasing
> linux-dm.git ontop of Darrick's xfs tree?
> 
> I probably should've reviewed faster and been the one to take the entire
> set (with appropriate acks obviously).
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dan Williams June 1, 2018, 9:16 p.m. UTC | #4
On Fri, Jun 1, 2018 at 1:46 PM, Mike Snitzer <snitzer@redhat.com> wrote:
> On Fri, Jun 01 2018 at  4:19P -0400,
> Mike Snitzer <snitzer@redhat.com> wrote:
>
>> On Tue, May 29 2018 at  3:51P -0400,
>> Ross Zwisler <ross.zwisler@linux.intel.com> wrote:
>>
>> > Currently device_supports_dax() just checks to see if the QUEUE_FLAG_DAX
>> > flag is set on the device's request queue to decide whether or not the
>> > device supports filesystem DAX.  This is insufficient because there are
>> > devices like PMEM namespaces in raw mode which have QUEUE_FLAG_DAX set but
>> > which don't actually support DAX.
>>
>> Isn't that a PMEM bug then?
>>
>> What is the point of setting QUEUE_FLAG_DAX if it cannot be trusted?
>>
>> > This means that you could create a dm-linear device, for example, where the
>> > first part of the dm-linear device was a PMEM namespace in fsdax mode and
>> > the second part was a PMEM namespace in raw mode.  Both DM and the
>> > filesystem you put on that dm-linear device would think the whole device
>> > supports DAX, which would lead to bad behavior once your raw PMEM namespace
>> > part using DAX needed struct page for something.
>>
>> The PMEM namespace in raw mode shouldn't be setting QUEUE_FLAG_DAX, if
>> it didn't then the stacked-up linear DM wouldn't
>>
>> > Fix this by using bdev_dax_supported() like filesystems do at mount time.
>> > This checks for raw mode and also performs other tests like checking to
>> > make sure the dax_direct_access() path works.
>>
>> Sorry "This" does those things where?
>
> I see you meant bdev_dax_supported() does these additional checks.
>
> My previous question stands though.  Why is QUEUE_FLAG_DAX getting set
> if the device hasn't already passed these checks?  Shouldn't setting
> QUEUE_FLAG_DAX on request_queue depend on bdev_dax_supported() passing?
>
> But looking at the drivers that do set QUEUE_FLAG_DAX: they
> don't have the bdev readily available.  Anyway, just strikes me as
> bizarre that a driver can set QUEUE_FLAG_DAX without having to have
> ensured bdev_dax_supported() passes (even if not programatically, but
> that the developer has verified the hooks, et al exist).
>
> But I'll give up on this line of questioning..
>
> My dilemma now is: how do I take these changes without first rebasing
> linux-dm.git ontop of Darrick's xfs tree?
>
> I probably should've reviewed faster and been the one to take the entire
> set (with appropriate acks obviously).

I'd say just make a topic branch on top of xfs and merge it... with
Darrick's permission of course. It might also mean you should wait for
the xfs pull request to go to Linus first during the window.
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 0589a4da12bb..5bb994b012ca 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -885,9 +885,7 @@  EXPORT_SYMBOL_GPL(dm_table_set_type);
 static int device_supports_dax(struct dm_target *ti, struct dm_dev *dev,
 			       sector_t start, sector_t len, void *data)
 {
-	struct request_queue *q = bdev_get_queue(dev->bdev);
-
-	return q && blk_queue_dax(q);
+	return bdev_dax_supported(dev->bdev, PAGE_SIZE);
 }
 
 static bool dm_table_supports_dax(struct dm_table *t)