diff mbox

[v7,1/9] dax: fix NULL pointer dereference in __dax_dbg()

Message ID 1452103263-1592-2-git-send-email-ross.zwisler@linux.intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ross Zwisler Jan. 6, 2016, 6 p.m. UTC
__dax_dbg() currently assumes that bh->b_bdev is non-NULL, passing it into
bdevname() where is is dereferenced.  This assumption isn't always true -
when called for reads of holes, ext4_dax_mmap_get_block() returns a buffer
head where bh->b_bdev is never set.  I hit this BUG while testing the DAX
PMD fault path.

Instead, verify that we have a valid bh->b_bdev, else just say "unknown"
for the block device.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
---
 fs/dax.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Dan Williams Jan. 6, 2016, 7:14 p.m. UTC | #1
On Wed, Jan 6, 2016 at 10:00 AM, Ross Zwisler
<ross.zwisler@linux.intel.com> wrote:
> __dax_dbg() currently assumes that bh->b_bdev is non-NULL, passing it into
> bdevname() where is is dereferenced.  This assumption isn't always true -
> when called for reads of holes, ext4_dax_mmap_get_block() returns a buffer
> head where bh->b_bdev is never set.  I hit this BUG while testing the DAX
> PMD fault path.
>
> Instead, verify that we have a valid bh->b_bdev, else just say "unknown"
> for the block device.
>
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> ---
>  fs/dax.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 7af8797..03cc4a3 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -563,7 +563,12 @@ static void __dax_dbg(struct buffer_head *bh, unsigned long address,
>  {
>         if (bh) {
>                 char bname[BDEVNAME_SIZE];
> -               bdevname(bh->b_bdev, bname);
> +
> +               if (bh->b_bdev)
> +                       bdevname(bh->b_bdev, bname);
> +               else
> +                       snprintf(bname, BDEVNAME_SIZE, "unknown");
> +
>                 pr_debug("%s: %s addr: %lx dev %s state %lx start %lld "
>                         "length %zd fallback: %s\n", fn, current->comm,
>                         address, bname, bh->b_state, (u64)bh->b_blocknr,

I'm assuming there's no danger of a such a buffer_head ever being used
for the bdev parameter to dax_map_atomic()?  Shouldn't we also/instead
go fix ext4 to not send partially filled buffer_heads?
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kara Jan. 7, 2016, 9:34 a.m. UTC | #2
On Wed 06-01-16 11:14:09, Dan Williams wrote:
> On Wed, Jan 6, 2016 at 10:00 AM, Ross Zwisler
> <ross.zwisler@linux.intel.com> wrote:
> > __dax_dbg() currently assumes that bh->b_bdev is non-NULL, passing it into
> > bdevname() where is is dereferenced.  This assumption isn't always true -
> > when called for reads of holes, ext4_dax_mmap_get_block() returns a buffer
> > head where bh->b_bdev is never set.  I hit this BUG while testing the DAX
> > PMD fault path.
> >
> > Instead, verify that we have a valid bh->b_bdev, else just say "unknown"
> > for the block device.
> >
> > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > ---
> >  fs/dax.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/dax.c b/fs/dax.c
> > index 7af8797..03cc4a3 100644
> > --- a/fs/dax.c
> > +++ b/fs/dax.c
> > @@ -563,7 +563,12 @@ static void __dax_dbg(struct buffer_head *bh, unsigned long address,
> >  {
> >         if (bh) {
> >                 char bname[BDEVNAME_SIZE];
> > -               bdevname(bh->b_bdev, bname);
> > +
> > +               if (bh->b_bdev)
> > +                       bdevname(bh->b_bdev, bname);
> > +               else
> > +                       snprintf(bname, BDEVNAME_SIZE, "unknown");
> > +
> >                 pr_debug("%s: %s addr: %lx dev %s state %lx start %lld "
> >                         "length %zd fallback: %s\n", fn, current->comm,
> >                         address, bname, bh->b_state, (u64)bh->b_blocknr,
> 
> I'm assuming there's no danger of a such a buffer_head ever being used
> for the bdev parameter to dax_map_atomic()?  Shouldn't we also/instead
> go fix ext4 to not send partially filled buffer_heads?

No. The real problem is a long-standing abuse of struct buffer_head to be
used for passing block mapping information (it's on my todo list to remove
that at least from DAX code and use cleaner block mapping interface but
first I want basic DAX functionality to settle down to avoid unnecessary
conflicts). Filesystem is not supposed to touch bh->b_bdev. If you need
that filled in, set it yourself in before passing bh to the block mapping
function.

								Honza
Dan Williams Jan. 7, 2016, 3:17 p.m. UTC | #3
On Thu, Jan 7, 2016 at 1:34 AM, Jan Kara <jack@suse.cz> wrote:
> On Wed 06-01-16 11:14:09, Dan Williams wrote:
>> On Wed, Jan 6, 2016 at 10:00 AM, Ross Zwisler
>> <ross.zwisler@linux.intel.com> wrote:
>> > __dax_dbg() currently assumes that bh->b_bdev is non-NULL, passing it into
>> > bdevname() where is is dereferenced.  This assumption isn't always true -
>> > when called for reads of holes, ext4_dax_mmap_get_block() returns a buffer
>> > head where bh->b_bdev is never set.  I hit this BUG while testing the DAX
>> > PMD fault path.
>> >
>> > Instead, verify that we have a valid bh->b_bdev, else just say "unknown"
>> > for the block device.
>> >
>> > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
>> > Cc: Dan Williams <dan.j.williams@intel.com>
>> > ---
>> >  fs/dax.c | 7 ++++++-
>> >  1 file changed, 6 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/fs/dax.c b/fs/dax.c
>> > index 7af8797..03cc4a3 100644
>> > --- a/fs/dax.c
>> > +++ b/fs/dax.c
>> > @@ -563,7 +563,12 @@ static void __dax_dbg(struct buffer_head *bh, unsigned long address,
>> >  {
>> >         if (bh) {
>> >                 char bname[BDEVNAME_SIZE];
>> > -               bdevname(bh->b_bdev, bname);
>> > +
>> > +               if (bh->b_bdev)
>> > +                       bdevname(bh->b_bdev, bname);
>> > +               else
>> > +                       snprintf(bname, BDEVNAME_SIZE, "unknown");
>> > +
>> >                 pr_debug("%s: %s addr: %lx dev %s state %lx start %lld "
>> >                         "length %zd fallback: %s\n", fn, current->comm,
>> >                         address, bname, bh->b_state, (u64)bh->b_blocknr,
>>
>> I'm assuming there's no danger of a such a buffer_head ever being used
>> for the bdev parameter to dax_map_atomic()?  Shouldn't we also/instead
>> go fix ext4 to not send partially filled buffer_heads?
>
> No. The real problem is a long-standing abuse of struct buffer_head to be
> used for passing block mapping information (it's on my todo list to remove
> that at least from DAX code and use cleaner block mapping interface but
> first I want basic DAX functionality to settle down to avoid unnecessary
> conflicts). Filesystem is not supposed to touch bh->b_bdev. If you need
> that filled in, set it yourself in before passing bh to the block mapping
> function.
>

Ok, makes sense.

Ross, can you fix this instead by unconditionally looking up the bdev
rather that saying "unknown".  The bdev should always be retrievable.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ross Zwisler Jan. 7, 2016, 10:16 p.m. UTC | #4
On Thu, Jan 07, 2016 at 07:17:22AM -0800, Dan Williams wrote:
> On Thu, Jan 7, 2016 at 1:34 AM, Jan Kara <jack@suse.cz> wrote:
> > On Wed 06-01-16 11:14:09, Dan Williams wrote:
> >> On Wed, Jan 6, 2016 at 10:00 AM, Ross Zwisler
> >> <ross.zwisler@linux.intel.com> wrote:
> >> > __dax_dbg() currently assumes that bh->b_bdev is non-NULL, passing it into
> >> > bdevname() where is is dereferenced.  This assumption isn't always true -
> >> > when called for reads of holes, ext4_dax_mmap_get_block() returns a buffer
> >> > head where bh->b_bdev is never set.  I hit this BUG while testing the DAX
> >> > PMD fault path.
> >> >
> >> > Instead, verify that we have a valid bh->b_bdev, else just say "unknown"
> >> > for the block device.
> >> >
> >> > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> >> > Cc: Dan Williams <dan.j.williams@intel.com>
> >> > ---
> >> >  fs/dax.c | 7 ++++++-
> >> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >> >
> >> > diff --git a/fs/dax.c b/fs/dax.c
> >> > index 7af8797..03cc4a3 100644
> >> > --- a/fs/dax.c
> >> > +++ b/fs/dax.c
> >> > @@ -563,7 +563,12 @@ static void __dax_dbg(struct buffer_head *bh, unsigned long address,
> >> >  {
> >> >         if (bh) {
> >> >                 char bname[BDEVNAME_SIZE];
> >> > -               bdevname(bh->b_bdev, bname);
> >> > +
> >> > +               if (bh->b_bdev)
> >> > +                       bdevname(bh->b_bdev, bname);
> >> > +               else
> >> > +                       snprintf(bname, BDEVNAME_SIZE, "unknown");
> >> > +
> >> >                 pr_debug("%s: %s addr: %lx dev %s state %lx start %lld "
> >> >                         "length %zd fallback: %s\n", fn, current->comm,
> >> >                         address, bname, bh->b_state, (u64)bh->b_blocknr,
> >>
> >> I'm assuming there's no danger of a such a buffer_head ever being used
> >> for the bdev parameter to dax_map_atomic()?  Shouldn't we also/instead
> >> go fix ext4 to not send partially filled buffer_heads?
> >
> > No. The real problem is a long-standing abuse of struct buffer_head to be
> > used for passing block mapping information (it's on my todo list to remove
> > that at least from DAX code and use cleaner block mapping interface but
> > first I want basic DAX functionality to settle down to avoid unnecessary
> > conflicts). Filesystem is not supposed to touch bh->b_bdev. If you need
> > that filled in, set it yourself in before passing bh to the block mapping
> > function.
> >
> 
> Ok, makes sense.
> 
> Ross, can you fix this instead by unconditionally looking up the bdev
> rather that saying "unknown".  The bdev should always be retrievable.

Sure, will do.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dave Chinner Jan. 7, 2016, 11:10 p.m. UTC | #5
On Thu, Jan 07, 2016 at 10:34:02AM +0100, Jan Kara wrote:
> On Wed 06-01-16 11:14:09, Dan Williams wrote:
> > On Wed, Jan 6, 2016 at 10:00 AM, Ross Zwisler
> > <ross.zwisler@linux.intel.com> wrote:
> > > __dax_dbg() currently assumes that bh->b_bdev is non-NULL, passing it into
> > > bdevname() where is is dereferenced.  This assumption isn't always true -
> > > when called for reads of holes, ext4_dax_mmap_get_block() returns a buffer
> > > head where bh->b_bdev is never set.  I hit this BUG while testing the DAX
> > > PMD fault path.
> > >
> > > Instead, verify that we have a valid bh->b_bdev, else just say "unknown"
> > > for the block device.
> > >
> > > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > ---
> > >  fs/dax.c | 7 ++++++-
> > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/dax.c b/fs/dax.c
> > > index 7af8797..03cc4a3 100644
> > > --- a/fs/dax.c
> > > +++ b/fs/dax.c
> > > @@ -563,7 +563,12 @@ static void __dax_dbg(struct buffer_head *bh, unsigned long address,
> > >  {
> > >         if (bh) {
> > >                 char bname[BDEVNAME_SIZE];
> > > -               bdevname(bh->b_bdev, bname);
> > > +
> > > +               if (bh->b_bdev)
> > > +                       bdevname(bh->b_bdev, bname);
> > > +               else
> > > +                       snprintf(bname, BDEVNAME_SIZE, "unknown");
> > > +
> > >                 pr_debug("%s: %s addr: %lx dev %s state %lx start %lld "
> > >                         "length %zd fallback: %s\n", fn, current->comm,
> > >                         address, bname, bh->b_state, (u64)bh->b_blocknr,
> > 
> > I'm assuming there's no danger of a such a buffer_head ever being used
> > for the bdev parameter to dax_map_atomic()?  Shouldn't we also/instead
> > go fix ext4 to not send partially filled buffer_heads?
> 
> No. The real problem is a long-standing abuse of struct buffer_head to be
> used for passing block mapping information (it's on my todo list to remove
> that at least from DAX code and use cleaner block mapping interface but
> first I want basic DAX functionality to settle down to avoid unnecessary
> conflicts). Filesystem is not supposed to touch bh->b_bdev.

That has not been true for a long, long time. e.g. XFS always
rewrites bh->b_bdev in get_blocks because the file may not reside on
the primary block device of the filesystem. i.e.:

        /*
         * If this is a realtime file, data may be on a different device.
         * to that pointed to from the buffer_head b_bdev currently.
         */
        bh_result->b_bdev = xfs_find_bdev_for_inode(inode);

> If you need
> that filled in, set it yourself in before passing bh to the block mapping
> function.

That may be true, but we cannot assume that the bdev coming back
out of get_block is the same one that was passed in.

Cheers,

Dave.
Ross Zwisler Jan. 7, 2016, 11:39 p.m. UTC | #6
On Fri, Jan 08, 2016 at 10:10:00AM +1100, Dave Chinner wrote:
> On Thu, Jan 07, 2016 at 10:34:02AM +0100, Jan Kara wrote:
> > On Wed 06-01-16 11:14:09, Dan Williams wrote:
> > > On Wed, Jan 6, 2016 at 10:00 AM, Ross Zwisler
> > > <ross.zwisler@linux.intel.com> wrote:
> > > > __dax_dbg() currently assumes that bh->b_bdev is non-NULL, passing it into
> > > > bdevname() where is is dereferenced.  This assumption isn't always true -
> > > > when called for reads of holes, ext4_dax_mmap_get_block() returns a buffer
> > > > head where bh->b_bdev is never set.  I hit this BUG while testing the DAX
> > > > PMD fault path.
> > > >
> > > > Instead, verify that we have a valid bh->b_bdev, else just say "unknown"
> > > > for the block device.
> > > >
> > > > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> > > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > > ---
> > > >  fs/dax.c | 7 ++++++-
> > > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/fs/dax.c b/fs/dax.c
> > > > index 7af8797..03cc4a3 100644
> > > > --- a/fs/dax.c
> > > > +++ b/fs/dax.c
> > > > @@ -563,7 +563,12 @@ static void __dax_dbg(struct buffer_head *bh, unsigned long address,
> > > >  {
> > > >         if (bh) {
> > > >                 char bname[BDEVNAME_SIZE];
> > > > -               bdevname(bh->b_bdev, bname);
> > > > +
> > > > +               if (bh->b_bdev)
> > > > +                       bdevname(bh->b_bdev, bname);
> > > > +               else
> > > > +                       snprintf(bname, BDEVNAME_SIZE, "unknown");
> > > > +
> > > >                 pr_debug("%s: %s addr: %lx dev %s state %lx start %lld "
> > > >                         "length %zd fallback: %s\n", fn, current->comm,
> > > >                         address, bname, bh->b_state, (u64)bh->b_blocknr,
> > > 
> > > I'm assuming there's no danger of a such a buffer_head ever being used
> > > for the bdev parameter to dax_map_atomic()?  Shouldn't we also/instead
> > > go fix ext4 to not send partially filled buffer_heads?
> > 
> > No. The real problem is a long-standing abuse of struct buffer_head to be
> > used for passing block mapping information (it's on my todo list to remove
> > that at least from DAX code and use cleaner block mapping interface but
> > first I want basic DAX functionality to settle down to avoid unnecessary
> > conflicts). Filesystem is not supposed to touch bh->b_bdev.
> 
> That has not been true for a long, long time. e.g. XFS always
> rewrites bh->b_bdev in get_blocks because the file may not reside on
> the primary block device of the filesystem. i.e.:
> 
>         /*
>          * If this is a realtime file, data may be on a different device.
>          * to that pointed to from the buffer_head b_bdev currently.
>          */
>         bh_result->b_bdev = xfs_find_bdev_for_inode(inode);
> 
> > If you need
> > that filled in, set it yourself in before passing bh to the block mapping
> > function.
> 
> That may be true, but we cannot assume that the bdev coming back
> out of get_block is the same one that was passed in.

For our use case I think this is fine - we just need the bdev to be filled in
so that we can print reasonable error messages.  If the filesystem updates
bh->b_bdev during get_blocks(), we are fine with that.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/dax.c b/fs/dax.c
index 7af8797..03cc4a3 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -563,7 +563,12 @@  static void __dax_dbg(struct buffer_head *bh, unsigned long address,
 {
 	if (bh) {
 		char bname[BDEVNAME_SIZE];
-		bdevname(bh->b_bdev, bname);
+
+		if (bh->b_bdev)
+			bdevname(bh->b_bdev, bname);
+		else
+			snprintf(bname, BDEVNAME_SIZE, "unknown");
+
 		pr_debug("%s: %s addr: %lx dev %s state %lx start %lld "
 			"length %zd fallback: %s\n", fn, current->comm,
 			address, bname, bh->b_state, (u64)bh->b_blocknr,