Message ID | 1452103263-1592-2-git-send-email-ross.zwisler@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wed, Jan 6, 2016 at 10:00 AM, Ross Zwisler <ross.zwisler@linux.intel.com> wrote: > __dax_dbg() currently assumes that bh->b_bdev is non-NULL, passing it into > bdevname() where is is dereferenced. This assumption isn't always true - > when called for reads of holes, ext4_dax_mmap_get_block() returns a buffer > head where bh->b_bdev is never set. I hit this BUG while testing the DAX > PMD fault path. > > Instead, verify that we have a valid bh->b_bdev, else just say "unknown" > for the block device. > > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> > Cc: Dan Williams <dan.j.williams@intel.com> > --- > fs/dax.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/fs/dax.c b/fs/dax.c > index 7af8797..03cc4a3 100644 > --- a/fs/dax.c > +++ b/fs/dax.c > @@ -563,7 +563,12 @@ static void __dax_dbg(struct buffer_head *bh, unsigned long address, > { > if (bh) { > char bname[BDEVNAME_SIZE]; > - bdevname(bh->b_bdev, bname); > + > + if (bh->b_bdev) > + bdevname(bh->b_bdev, bname); > + else > + snprintf(bname, BDEVNAME_SIZE, "unknown"); > + > pr_debug("%s: %s addr: %lx dev %s state %lx start %lld " > "length %zd fallback: %s\n", fn, current->comm, > address, bname, bh->b_state, (u64)bh->b_blocknr, I'm assuming there's no danger of a such a buffer_head ever being used for the bdev parameter to dax_map_atomic()? Shouldn't we also/instead go fix ext4 to not send partially filled buffer_heads? -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed 06-01-16 11:14:09, Dan Williams wrote: > On Wed, Jan 6, 2016 at 10:00 AM, Ross Zwisler > <ross.zwisler@linux.intel.com> wrote: > > __dax_dbg() currently assumes that bh->b_bdev is non-NULL, passing it into > > bdevname() where is is dereferenced. This assumption isn't always true - > > when called for reads of holes, ext4_dax_mmap_get_block() returns a buffer > > head where bh->b_bdev is never set. I hit this BUG while testing the DAX > > PMD fault path. > > > > Instead, verify that we have a valid bh->b_bdev, else just say "unknown" > > for the block device. > > > > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> > > Cc: Dan Williams <dan.j.williams@intel.com> > > --- > > fs/dax.c | 7 ++++++- > > 1 file changed, 6 insertions(+), 1 deletion(-) > > > > diff --git a/fs/dax.c b/fs/dax.c > > index 7af8797..03cc4a3 100644 > > --- a/fs/dax.c > > +++ b/fs/dax.c > > @@ -563,7 +563,12 @@ static void __dax_dbg(struct buffer_head *bh, unsigned long address, > > { > > if (bh) { > > char bname[BDEVNAME_SIZE]; > > - bdevname(bh->b_bdev, bname); > > + > > + if (bh->b_bdev) > > + bdevname(bh->b_bdev, bname); > > + else > > + snprintf(bname, BDEVNAME_SIZE, "unknown"); > > + > > pr_debug("%s: %s addr: %lx dev %s state %lx start %lld " > > "length %zd fallback: %s\n", fn, current->comm, > > address, bname, bh->b_state, (u64)bh->b_blocknr, > > I'm assuming there's no danger of a such a buffer_head ever being used > for the bdev parameter to dax_map_atomic()? Shouldn't we also/instead > go fix ext4 to not send partially filled buffer_heads? No. The real problem is a long-standing abuse of struct buffer_head to be used for passing block mapping information (it's on my todo list to remove that at least from DAX code and use cleaner block mapping interface but first I want basic DAX functionality to settle down to avoid unnecessary conflicts). Filesystem is not supposed to touch bh->b_bdev. If you need that filled in, set it yourself in before passing bh to the block mapping function. Honza
On Thu, Jan 7, 2016 at 1:34 AM, Jan Kara <jack@suse.cz> wrote: > On Wed 06-01-16 11:14:09, Dan Williams wrote: >> On Wed, Jan 6, 2016 at 10:00 AM, Ross Zwisler >> <ross.zwisler@linux.intel.com> wrote: >> > __dax_dbg() currently assumes that bh->b_bdev is non-NULL, passing it into >> > bdevname() where is is dereferenced. This assumption isn't always true - >> > when called for reads of holes, ext4_dax_mmap_get_block() returns a buffer >> > head where bh->b_bdev is never set. I hit this BUG while testing the DAX >> > PMD fault path. >> > >> > Instead, verify that we have a valid bh->b_bdev, else just say "unknown" >> > for the block device. >> > >> > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> >> > Cc: Dan Williams <dan.j.williams@intel.com> >> > --- >> > fs/dax.c | 7 ++++++- >> > 1 file changed, 6 insertions(+), 1 deletion(-) >> > >> > diff --git a/fs/dax.c b/fs/dax.c >> > index 7af8797..03cc4a3 100644 >> > --- a/fs/dax.c >> > +++ b/fs/dax.c >> > @@ -563,7 +563,12 @@ static void __dax_dbg(struct buffer_head *bh, unsigned long address, >> > { >> > if (bh) { >> > char bname[BDEVNAME_SIZE]; >> > - bdevname(bh->b_bdev, bname); >> > + >> > + if (bh->b_bdev) >> > + bdevname(bh->b_bdev, bname); >> > + else >> > + snprintf(bname, BDEVNAME_SIZE, "unknown"); >> > + >> > pr_debug("%s: %s addr: %lx dev %s state %lx start %lld " >> > "length %zd fallback: %s\n", fn, current->comm, >> > address, bname, bh->b_state, (u64)bh->b_blocknr, >> >> I'm assuming there's no danger of a such a buffer_head ever being used >> for the bdev parameter to dax_map_atomic()? Shouldn't we also/instead >> go fix ext4 to not send partially filled buffer_heads? > > No. The real problem is a long-standing abuse of struct buffer_head to be > used for passing block mapping information (it's on my todo list to remove > that at least from DAX code and use cleaner block mapping interface but > first I want basic DAX functionality to settle down to avoid unnecessary > conflicts). Filesystem is not supposed to touch bh->b_bdev. If you need > that filled in, set it yourself in before passing bh to the block mapping > function. > Ok, makes sense. Ross, can you fix this instead by unconditionally looking up the bdev rather that saying "unknown". The bdev should always be retrievable. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jan 07, 2016 at 07:17:22AM -0800, Dan Williams wrote: > On Thu, Jan 7, 2016 at 1:34 AM, Jan Kara <jack@suse.cz> wrote: > > On Wed 06-01-16 11:14:09, Dan Williams wrote: > >> On Wed, Jan 6, 2016 at 10:00 AM, Ross Zwisler > >> <ross.zwisler@linux.intel.com> wrote: > >> > __dax_dbg() currently assumes that bh->b_bdev is non-NULL, passing it into > >> > bdevname() where is is dereferenced. This assumption isn't always true - > >> > when called for reads of holes, ext4_dax_mmap_get_block() returns a buffer > >> > head where bh->b_bdev is never set. I hit this BUG while testing the DAX > >> > PMD fault path. > >> > > >> > Instead, verify that we have a valid bh->b_bdev, else just say "unknown" > >> > for the block device. > >> > > >> > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> > >> > Cc: Dan Williams <dan.j.williams@intel.com> > >> > --- > >> > fs/dax.c | 7 ++++++- > >> > 1 file changed, 6 insertions(+), 1 deletion(-) > >> > > >> > diff --git a/fs/dax.c b/fs/dax.c > >> > index 7af8797..03cc4a3 100644 > >> > --- a/fs/dax.c > >> > +++ b/fs/dax.c > >> > @@ -563,7 +563,12 @@ static void __dax_dbg(struct buffer_head *bh, unsigned long address, > >> > { > >> > if (bh) { > >> > char bname[BDEVNAME_SIZE]; > >> > - bdevname(bh->b_bdev, bname); > >> > + > >> > + if (bh->b_bdev) > >> > + bdevname(bh->b_bdev, bname); > >> > + else > >> > + snprintf(bname, BDEVNAME_SIZE, "unknown"); > >> > + > >> > pr_debug("%s: %s addr: %lx dev %s state %lx start %lld " > >> > "length %zd fallback: %s\n", fn, current->comm, > >> > address, bname, bh->b_state, (u64)bh->b_blocknr, > >> > >> I'm assuming there's no danger of a such a buffer_head ever being used > >> for the bdev parameter to dax_map_atomic()? Shouldn't we also/instead > >> go fix ext4 to not send partially filled buffer_heads? > > > > No. The real problem is a long-standing abuse of struct buffer_head to be > > used for passing block mapping information (it's on my todo list to remove > > that at least from DAX code and use cleaner block mapping interface but > > first I want basic DAX functionality to settle down to avoid unnecessary > > conflicts). Filesystem is not supposed to touch bh->b_bdev. If you need > > that filled in, set it yourself in before passing bh to the block mapping > > function. > > > > Ok, makes sense. > > Ross, can you fix this instead by unconditionally looking up the bdev > rather that saying "unknown". The bdev should always be retrievable. Sure, will do. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jan 07, 2016 at 10:34:02AM +0100, Jan Kara wrote: > On Wed 06-01-16 11:14:09, Dan Williams wrote: > > On Wed, Jan 6, 2016 at 10:00 AM, Ross Zwisler > > <ross.zwisler@linux.intel.com> wrote: > > > __dax_dbg() currently assumes that bh->b_bdev is non-NULL, passing it into > > > bdevname() where is is dereferenced. This assumption isn't always true - > > > when called for reads of holes, ext4_dax_mmap_get_block() returns a buffer > > > head where bh->b_bdev is never set. I hit this BUG while testing the DAX > > > PMD fault path. > > > > > > Instead, verify that we have a valid bh->b_bdev, else just say "unknown" > > > for the block device. > > > > > > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> > > > Cc: Dan Williams <dan.j.williams@intel.com> > > > --- > > > fs/dax.c | 7 ++++++- > > > 1 file changed, 6 insertions(+), 1 deletion(-) > > > > > > diff --git a/fs/dax.c b/fs/dax.c > > > index 7af8797..03cc4a3 100644 > > > --- a/fs/dax.c > > > +++ b/fs/dax.c > > > @@ -563,7 +563,12 @@ static void __dax_dbg(struct buffer_head *bh, unsigned long address, > > > { > > > if (bh) { > > > char bname[BDEVNAME_SIZE]; > > > - bdevname(bh->b_bdev, bname); > > > + > > > + if (bh->b_bdev) > > > + bdevname(bh->b_bdev, bname); > > > + else > > > + snprintf(bname, BDEVNAME_SIZE, "unknown"); > > > + > > > pr_debug("%s: %s addr: %lx dev %s state %lx start %lld " > > > "length %zd fallback: %s\n", fn, current->comm, > > > address, bname, bh->b_state, (u64)bh->b_blocknr, > > > > I'm assuming there's no danger of a such a buffer_head ever being used > > for the bdev parameter to dax_map_atomic()? Shouldn't we also/instead > > go fix ext4 to not send partially filled buffer_heads? > > No. The real problem is a long-standing abuse of struct buffer_head to be > used for passing block mapping information (it's on my todo list to remove > that at least from DAX code and use cleaner block mapping interface but > first I want basic DAX functionality to settle down to avoid unnecessary > conflicts). Filesystem is not supposed to touch bh->b_bdev. That has not been true for a long, long time. e.g. XFS always rewrites bh->b_bdev in get_blocks because the file may not reside on the primary block device of the filesystem. i.e.: /* * If this is a realtime file, data may be on a different device. * to that pointed to from the buffer_head b_bdev currently. */ bh_result->b_bdev = xfs_find_bdev_for_inode(inode); > If you need > that filled in, set it yourself in before passing bh to the block mapping > function. That may be true, but we cannot assume that the bdev coming back out of get_block is the same one that was passed in. Cheers, Dave.
On Fri, Jan 08, 2016 at 10:10:00AM +1100, Dave Chinner wrote: > On Thu, Jan 07, 2016 at 10:34:02AM +0100, Jan Kara wrote: > > On Wed 06-01-16 11:14:09, Dan Williams wrote: > > > On Wed, Jan 6, 2016 at 10:00 AM, Ross Zwisler > > > <ross.zwisler@linux.intel.com> wrote: > > > > __dax_dbg() currently assumes that bh->b_bdev is non-NULL, passing it into > > > > bdevname() where is is dereferenced. This assumption isn't always true - > > > > when called for reads of holes, ext4_dax_mmap_get_block() returns a buffer > > > > head where bh->b_bdev is never set. I hit this BUG while testing the DAX > > > > PMD fault path. > > > > > > > > Instead, verify that we have a valid bh->b_bdev, else just say "unknown" > > > > for the block device. > > > > > > > > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> > > > > Cc: Dan Williams <dan.j.williams@intel.com> > > > > --- > > > > fs/dax.c | 7 ++++++- > > > > 1 file changed, 6 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/fs/dax.c b/fs/dax.c > > > > index 7af8797..03cc4a3 100644 > > > > --- a/fs/dax.c > > > > +++ b/fs/dax.c > > > > @@ -563,7 +563,12 @@ static void __dax_dbg(struct buffer_head *bh, unsigned long address, > > > > { > > > > if (bh) { > > > > char bname[BDEVNAME_SIZE]; > > > > - bdevname(bh->b_bdev, bname); > > > > + > > > > + if (bh->b_bdev) > > > > + bdevname(bh->b_bdev, bname); > > > > + else > > > > + snprintf(bname, BDEVNAME_SIZE, "unknown"); > > > > + > > > > pr_debug("%s: %s addr: %lx dev %s state %lx start %lld " > > > > "length %zd fallback: %s\n", fn, current->comm, > > > > address, bname, bh->b_state, (u64)bh->b_blocknr, > > > > > > I'm assuming there's no danger of a such a buffer_head ever being used > > > for the bdev parameter to dax_map_atomic()? Shouldn't we also/instead > > > go fix ext4 to not send partially filled buffer_heads? > > > > No. The real problem is a long-standing abuse of struct buffer_head to be > > used for passing block mapping information (it's on my todo list to remove > > that at least from DAX code and use cleaner block mapping interface but > > first I want basic DAX functionality to settle down to avoid unnecessary > > conflicts). Filesystem is not supposed to touch bh->b_bdev. > > That has not been true for a long, long time. e.g. XFS always > rewrites bh->b_bdev in get_blocks because the file may not reside on > the primary block device of the filesystem. i.e.: > > /* > * If this is a realtime file, data may be on a different device. > * to that pointed to from the buffer_head b_bdev currently. > */ > bh_result->b_bdev = xfs_find_bdev_for_inode(inode); > > > If you need > > that filled in, set it yourself in before passing bh to the block mapping > > function. > > That may be true, but we cannot assume that the bdev coming back > out of get_block is the same one that was passed in. For our use case I think this is fine - we just need the bdev to be filled in so that we can print reasonable error messages. If the filesystem updates bh->b_bdev during get_blocks(), we are fine with that. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/dax.c b/fs/dax.c index 7af8797..03cc4a3 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -563,7 +563,12 @@ static void __dax_dbg(struct buffer_head *bh, unsigned long address, { if (bh) { char bname[BDEVNAME_SIZE]; - bdevname(bh->b_bdev, bname); + + if (bh->b_bdev) + bdevname(bh->b_bdev, bname); + else + snprintf(bname, BDEVNAME_SIZE, "unknown"); + pr_debug("%s: %s addr: %lx dev %s state %lx start %lld " "length %zd fallback: %s\n", fn, current->comm, address, bname, bh->b_state, (u64)bh->b_blocknr,
__dax_dbg() currently assumes that bh->b_bdev is non-NULL, passing it into bdevname() where is is dereferenced. This assumption isn't always true - when called for reads of holes, ext4_dax_mmap_get_block() returns a buffer head where bh->b_bdev is never set. I hit this BUG while testing the DAX PMD fault path. Instead, verify that we have a valid bh->b_bdev, else just say "unknown" for the block device. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> --- fs/dax.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)