Message ID | 20180529195106.14268-4-ross.zwisler@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, May 29 2018 at 3:51P -0400, Ross Zwisler <ross.zwisler@linux.intel.com> wrote: > Currently device_supports_dax() just checks to see if the QUEUE_FLAG_DAX > flag is set on the device's request queue to decide whether or not the > device supports filesystem DAX. This is insufficient because there are > devices like PMEM namespaces in raw mode which have QUEUE_FLAG_DAX set but > which don't actually support DAX. Isn't that a PMEM bug then? What is the point of setting QUEUE_FLAG_DAX if it cannot be trusted? > This means that you could create a dm-linear device, for example, where the > first part of the dm-linear device was a PMEM namespace in fsdax mode and > the second part was a PMEM namespace in raw mode. Both DM and the > filesystem you put on that dm-linear device would think the whole device > supports DAX, which would lead to bad behavior once your raw PMEM namespace > part using DAX needed struct page for something. The PMEM namespace in raw mode shouldn't be setting QUEUE_FLAG_DAX, if it didn't then the stacked-up linear DM wouldn't > Fix this by using bdev_dax_supported() like filesystems do at mount time. > This checks for raw mode and also performs other tests like checking to > make sure the dax_direct_access() path works. Sorry "This" does those things where? > > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> > Fixes: commit 545ed20e6df6 ("dm: add infrastructure for DAX support") > --- > drivers/md/dm-table.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c > index 0589a4da12bb..5bb994b012ca 100644 > --- a/drivers/md/dm-table.c > +++ b/drivers/md/dm-table.c > @@ -885,9 +885,7 @@ EXPORT_SYMBOL_GPL(dm_table_set_type); > static int device_supports_dax(struct dm_target *ti, struct dm_dev *dev, > sector_t start, sector_t len, void *data) > { > - struct request_queue *q = bdev_get_queue(dev->bdev); > - > - return q && blk_queue_dax(q); > + return bdev_dax_supported(dev->bdev, PAGE_SIZE); > } > > static bool dm_table_supports_dax(struct dm_table *t) > -- > 2.14.3 >
On Fri, Jun 01 2018 at 4:19P -0400, Mike Snitzer <snitzer@redhat.com> wrote: > On Tue, May 29 2018 at 3:51P -0400, > Ross Zwisler <ross.zwisler@linux.intel.com> wrote: > > > Currently device_supports_dax() just checks to see if the QUEUE_FLAG_DAX > > flag is set on the device's request queue to decide whether or not the > > device supports filesystem DAX. This is insufficient because there are > > devices like PMEM namespaces in raw mode which have QUEUE_FLAG_DAX set but > > which don't actually support DAX. > > Isn't that a PMEM bug then? > > What is the point of setting QUEUE_FLAG_DAX if it cannot be trusted? > > > This means that you could create a dm-linear device, for example, where the > > first part of the dm-linear device was a PMEM namespace in fsdax mode and > > the second part was a PMEM namespace in raw mode. Both DM and the > > filesystem you put on that dm-linear device would think the whole device > > supports DAX, which would lead to bad behavior once your raw PMEM namespace > > part using DAX needed struct page for something. > > The PMEM namespace in raw mode shouldn't be setting QUEUE_FLAG_DAX, if > it didn't then the stacked-up linear DM wouldn't > > > Fix this by using bdev_dax_supported() like filesystems do at mount time. > > This checks for raw mode and also performs other tests like checking to > > make sure the dax_direct_access() path works. > > Sorry "This" does those things where? I see you meant bdev_dax_supported() does these additional checks. My previous question stands though. Why is QUEUE_FLAG_DAX getting set if the device hasn't already passed these checks? Shouldn't setting QUEUE_FLAG_DAX on request_queue depend on bdev_dax_supported() passing? But looking at the drivers that do set QUEUE_FLAG_DAX: they don't have the bdev readily available. Anyway, just strikes me as bizarre that a driver can set QUEUE_FLAG_DAX without having to have ensured bdev_dax_supported() passes (even if not programatically, but that the developer has verified the hooks, et al exist). But I'll give up on this line of questioning.. My dilemma now is: how do I take these changes without first rebasing linux-dm.git ontop of Darrick's xfs tree? I probably should've reviewed faster and been the one to take the entire set (with appropriate acks obviously).
On Fri, Jun 01, 2018 at 04:46:04PM -0400, Mike Snitzer wrote: > On Fri, Jun 01 2018 at 4:19P -0400, > Mike Snitzer <snitzer@redhat.com> wrote: > > > On Tue, May 29 2018 at 3:51P -0400, > > Ross Zwisler <ross.zwisler@linux.intel.com> wrote: > > > > > Currently device_supports_dax() just checks to see if the QUEUE_FLAG_DAX > > > flag is set on the device's request queue to decide whether or not the > > > device supports filesystem DAX. This is insufficient because there are > > > devices like PMEM namespaces in raw mode which have QUEUE_FLAG_DAX set but > > > which don't actually support DAX. > > > > Isn't that a PMEM bug then? > > > > What is the point of setting QUEUE_FLAG_DAX if it cannot be trusted? Yup, that seems like a bug. > > > This means that you could create a dm-linear device, for example, where the > > > first part of the dm-linear device was a PMEM namespace in fsdax mode and > > > the second part was a PMEM namespace in raw mode. Both DM and the > > > filesystem you put on that dm-linear device would think the whole device > > > supports DAX, which would lead to bad behavior once your raw PMEM namespace > > > part using DAX needed struct page for something. > > > > The PMEM namespace in raw mode shouldn't be setting QUEUE_FLAG_DAX, if > > it didn't then the stacked-up linear DM wouldn't > > > > > Fix this by using bdev_dax_supported() like filesystems do at mount time. > > > This checks for raw mode and also performs other tests like checking to > > > make sure the dax_direct_access() path works. > > > > Sorry "This" does those things where? > > I see you meant bdev_dax_supported() does these additional checks. > > My previous question stands though. Why is QUEUE_FLAG_DAX getting set > if the device hasn't already passed these checks? Shouldn't setting > QUEUE_FLAG_DAX on request_queue depend on bdev_dax_supported() passing? > > But looking at the drivers that do set QUEUE_FLAG_DAX: they > don't have the bdev readily available. Anyway, just strikes me as > bizarre that a driver can set QUEUE_FLAG_DAX without having to have > ensured bdev_dax_supported() passes (even if not programatically, but > that the developer has verified the hooks, et al exist). I agree that it's a bug that QUEUE_FLAG_DAX is set for raw mode namespaces, thanks for pointing that out. I will fix that, but I still think that DM should be using bdev_dax_supported() like the filesystems do. Even beyond the raw mode / fsdax mode thing, this does additional checks that are useful. So, I think these DM patches are still correct. > But I'll give up on this line of questioning.. > > My dilemma now is: how do I take these changes without first rebasing > linux-dm.git ontop of Darrick's xfs tree? > > I probably should've reviewed faster and been the one to take the entire > set (with appropriate acks obviously).
On Fri, Jun 1, 2018 at 1:46 PM, Mike Snitzer <snitzer@redhat.com> wrote: > On Fri, Jun 01 2018 at 4:19P -0400, > Mike Snitzer <snitzer@redhat.com> wrote: > >> On Tue, May 29 2018 at 3:51P -0400, >> Ross Zwisler <ross.zwisler@linux.intel.com> wrote: >> >> > Currently device_supports_dax() just checks to see if the QUEUE_FLAG_DAX >> > flag is set on the device's request queue to decide whether or not the >> > device supports filesystem DAX. This is insufficient because there are >> > devices like PMEM namespaces in raw mode which have QUEUE_FLAG_DAX set but >> > which don't actually support DAX. >> >> Isn't that a PMEM bug then? >> >> What is the point of setting QUEUE_FLAG_DAX if it cannot be trusted? >> >> > This means that you could create a dm-linear device, for example, where the >> > first part of the dm-linear device was a PMEM namespace in fsdax mode and >> > the second part was a PMEM namespace in raw mode. Both DM and the >> > filesystem you put on that dm-linear device would think the whole device >> > supports DAX, which would lead to bad behavior once your raw PMEM namespace >> > part using DAX needed struct page for something. >> >> The PMEM namespace in raw mode shouldn't be setting QUEUE_FLAG_DAX, if >> it didn't then the stacked-up linear DM wouldn't >> >> > Fix this by using bdev_dax_supported() like filesystems do at mount time. >> > This checks for raw mode and also performs other tests like checking to >> > make sure the dax_direct_access() path works. >> >> Sorry "This" does those things where? > > I see you meant bdev_dax_supported() does these additional checks. > > My previous question stands though. Why is QUEUE_FLAG_DAX getting set > if the device hasn't already passed these checks? Shouldn't setting > QUEUE_FLAG_DAX on request_queue depend on bdev_dax_supported() passing? > > But looking at the drivers that do set QUEUE_FLAG_DAX: they > don't have the bdev readily available. Anyway, just strikes me as > bizarre that a driver can set QUEUE_FLAG_DAX without having to have > ensured bdev_dax_supported() passes (even if not programatically, but > that the developer has verified the hooks, et al exist). > > But I'll give up on this line of questioning.. > > My dilemma now is: how do I take these changes without first rebasing > linux-dm.git ontop of Darrick's xfs tree? > > I probably should've reviewed faster and been the one to take the entire > set (with appropriate acks obviously). I'd say just make a topic branch on top of xfs and merge it... with Darrick's permission of course. It might also mean you should wait for the xfs pull request to go to Linus first during the window.
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c index 0589a4da12bb..5bb994b012ca 100644 --- a/drivers/md/dm-table.c +++ b/drivers/md/dm-table.c @@ -885,9 +885,7 @@ EXPORT_SYMBOL_GPL(dm_table_set_type); static int device_supports_dax(struct dm_target *ti, struct dm_dev *dev, sector_t start, sector_t len, void *data) { - struct request_queue *q = bdev_get_queue(dev->bdev); - - return q && blk_queue_dax(q); + return bdev_dax_supported(dev->bdev, PAGE_SIZE); } static bool dm_table_supports_dax(struct dm_table *t)
Currently device_supports_dax() just checks to see if the QUEUE_FLAG_DAX flag is set on the device's request queue to decide whether or not the device supports filesystem DAX. This is insufficient because there are devices like PMEM namespaces in raw mode which have QUEUE_FLAG_DAX set but which don't actually support DAX. This means that you could create a dm-linear device, for example, where the first part of the dm-linear device was a PMEM namespace in fsdax mode and the second part was a PMEM namespace in raw mode. Both DM and the filesystem you put on that dm-linear device would think the whole device supports DAX, which would lead to bad behavior once your raw PMEM namespace part using DAX needed struct page for something. Fix this by using bdev_dax_supported() like filesystems do at mount time. This checks for raw mode and also performs other tests like checking to make sure the dax_direct_access() path works. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Fixes: commit 545ed20e6df6 ("dm: add infrastructure for DAX support") --- drivers/md/dm-table.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)