
[2/2] ext4, dax: set ext4_dax_aops for dax files

Message ID 20180911154246.6844-3-toshi.kani@hpe.com (mailing list archive)
State New, archived
Series fix sync to flush processor cache for ext4 DAX files

Commit Message

Kani, Toshi Sept. 11, 2018, 3:42 p.m. UTC
A sync syscall on an existing DAX file needs to flush the processor cache,
but it currently does not.  This is because the address_space_operations of
existing DAX files is set to 'ext4_da_aops' instead of 'ext4_dax_aops',
since the S_DAX flag is set after ext4_set_aops() in the open path.

  New file
  --------
  lookup_open
    ext4_create
      __ext4_new_inode
        ext4_set_inode_flags   // Set S_DAX flag
      ext4_set_aops            // Set aops to ext4_dax_aops

  Existing file
  -------------
  lookup_open
    ext4_lookup
      ext4_iget
        ext4_set_aops          // Set aops to ext4_da_aops
        ext4_set_inode_flags   // Set S_DAX flag

Change ext4_iget() to call ext4_set_inode_flags() before ext4_set_aops().

Fixes: 5f0663bb4a64f588f0a2dd6d1be68d40f9af0086
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
---
 fs/ext4/inode.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
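
The ordering problem described in the commit message can be illustrated with a small user-space model. This is only a sketch, not kernel code: every model_* name, the MODEL_S_DAX bit and the dax_wanted field are hypothetical stand-ins, and the real ext4_set_inode_flags() and ext4_set_aops() do considerably more. It shows only that the aops choice depends on whether S_DAX is already visible when the choice is made.

```c
/* User-space model only. All model_* names are hypothetical stand-ins. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_S_DAX 0x1u	/* stand-in for the kernel's S_DAX bit */

struct model_aops {
	const char *name;
};

static const struct model_aops model_da_aops  = { "ext4_da_aops" };
static const struct model_aops model_dax_aops = { "ext4_dax_aops" };

struct model_inode {
	unsigned int i_flags;		/* in-memory flags, initially clear */
	bool dax_wanted;		/* assumption: DAX is enabled for this file */
	const struct model_aops *a_ops;
};

/* Models ext4_set_inode_flags(): this is where S_DAX becomes visible. */
static void model_set_inode_flags(struct model_inode *inode)
{
	if (inode->dax_wanted)
		inode->i_flags |= MODEL_S_DAX;
}

/* Models ext4_set_aops(): picks aops from the flags visible right now. */
static void model_set_aops(struct model_inode *inode)
{
	inode->a_ops = (inode->i_flags & MODEL_S_DAX) ?
			&model_dax_aops : &model_da_aops;
}

/* Existing-file path as described above: aops chosen before S_DAX is set. */
const struct model_aops *model_iget_before(bool dax_wanted)
{
	struct model_inode inode = { 0, dax_wanted, NULL };

	model_set_aops(&inode);
	model_set_inode_flags(&inode);
	return inode.a_ops;
}

/* Same path with the two calls swapped, as the patch proposes. */
const struct model_aops *model_iget_after(bool dax_wanted)
{
	struct model_inode inode = { 0, dax_wanted, NULL };

	model_set_inode_flags(&inode);
	model_set_aops(&inode);
	return inode.a_ops;
}

/* Self-check a caller could run from main(). */
void model_selfcheck(void)
{
	assert(model_iget_before(true) == &model_da_aops);	/* wrong aops */
	assert(model_iget_after(true) == &model_dax_aops);	/* intended aops */
	assert(model_iget_after(false) == &model_da_aops);	/* non-DAX unchanged */
}
```

In this model, non-DAX inodes get the same result in either order, which is consistent with the problem only showing up for existing DAX files.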

Comments

Dan Williams Sept. 11, 2018, 6:15 p.m. UTC | #1
On Tue, Sep 11, 2018 at 8:42 AM, Toshi Kani <toshi.kani@hpe.com> wrote:
> Sync syscall to an existing DAX file needs to flush processor cache,
> but it does not currently.  This is because 'ext4_da_aops' is set to
> address_space_operations of existing DAX files, instead of 'ext4_dax_aops',
> since S_DAX flag is set after ext4_set_aops() in the open path.
>
>   New file
>   --------
>   lookup_open
>     ext4_create
>       __ext4_new_inode
>         ext4_set_inode_flags   // Set S_DAX flag
>       ext4_set_aops            // Set aops to ext4_dax_aops
>
>   Existing file
>   -------------
>   lookup_open
>     ext4_lookup
>       ext4_iget
>         ext4_set_aops          // Set aops to ext4_da_aops
>         ext4_set_inode_flags   // Set S_DAX flag
>
> Change ext4_iget() to call ext4_set_inode_flags() before ext4_set_aops().
>
> Fixes: 5f0663bb4a64f588f0a2dd6d1be68d40f9af0086

Same format nit:

Fixes: 5f0663bb4a64 ("ext4, dax: introduce ext4_dax_aops")
Cc: <stable@vger.kernel.org>


> Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: "Theodore Ts'o" <tytso@mit.edu>
> Cc: Andreas Dilger <adilger.kernel@dilger.ca>
> ---
>  fs/ext4/inode.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 775cd9b4af55..93cbbb859c40 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4998,6 +4998,8 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
>         if (ret)
>                 goto bad_inode;
>
> +       ext4_set_inode_flags(inode);
> +

Hmm, does this have unintended behavior changes?

I notice that there are some checks for flags "IS_APPEND(inode) ||
IS_IMMUTABLE(inode)" *before* the call to ext4_set_inode_flags(). I
didn't look too much deeper at whether those checks are bogus, but it
would seem safer to do something like this for a lower risk fix.

Thoughts?

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d0dd585add6a..1e9ab445c777 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4999,7 +4999,6 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
        if (S_ISREG(inode->i_mode)) {
                inode->i_op = &ext4_file_inode_operations;
                inode->i_fop = &ext4_file_operations;
-               ext4_set_aops(inode);
        } else if (S_ISDIR(inode->i_mode)) {
                inode->i_op = &ext4_dir_inode_operations;
                inode->i_fop = &ext4_dir_operations;
@@ -5042,6 +5041,12 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
        }
        brelse(iloc.bh);
        ext4_set_inode_flags(inode);
+       /*
+        * Now that we have determined whether DAX is enabled, set the
+        * proper address space operations
+        */
+       if (S_ISREG(inode->i_mode))
+               ext4_set_aops(inode);

        unlock_new_inode(inode);
        return inode;
Kani, Toshi Sept. 11, 2018, 6:41 p.m. UTC | #2
On Tue, 2018-09-11 at 11:15 -0700, Dan Williams wrote:
> On Tue, Sep 11, 2018 at 8:42 AM, Toshi Kani <toshi.kani@hpe.com> wrote:
> > Fixes: 5f0663bb4a64f588f0a2dd6d1be68d40f9af0086
> 
> Same format nit:
> 
> Fixes: 5f0663bb4a64 ("ext4, dax: introduce ext4_dax_aops")
> Cc: <stable@vger.kernel.org>

Will do.

> 
> Hmm, does this have unintended behavior changes?
> 
> I notice that there are some checks for flags "IS_APPEND(inode) ||
> IS_IMMUTABLE(inode)" *before* the call to ext4_set_inode_flags(). I
> didn't look too much deeper at whether those checks are bogus, but it
> would seem safer to do something like this for a lower risk fix.
> 
> Thoughts?

Good catch!  Agreed.

Thanks!
-Toshi
Jan Kara Sept. 12, 2018, 9:31 a.m. UTC | #3
On Tue 11-09-18 11:15:18, Dan Williams wrote:
> Hmm, does this have unintended behavior changes?
> 
> I notice that there are some checks for flags "IS_APPEND(inode) ||
> IS_IMMUTABLE(inode)" *before* the call to ext4_set_inode_flags(). I
> didn't look too much deeper at whether those checks are bogus, but it
> would seem safer to do something like this for a lower risk fix.
> 
> Thoughts?

Well, safer but it would leave the landmine around for others to hit.
Toshi, please move the ext4_set_inode_flags() call to be just after the
assignment:

	ei->i_flags = le32_to_cpu(raw_inode->i_flags);

in ext4_iget(). That way people won't introduce checks for i_flags that can
never hit... And yes, it also fixes bugs other than the DAX issue (mostly in
sanity checks AFAICS).

								Honza
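
The "checks that can never hit" point can be sketched in user space as follows. This is a simplified model under stated assumptions, not the actual ext4_iget() logic: the sketch_* names and both flag bits are hypothetical, and the single check stands in for whatever IS_APPEND()/IS_IMMUTABLE()-style tests sit between the i_flags assignment and the flag translation.

```c
/* User-space model only. All sketch_* names and bit values are hypothetical. */
#include <stdbool.h>

#define SKETCH_DISK_APPEND 0x1u	/* stand-in for an on-disk append flag */
#define SKETCH_S_APPEND    0x1u	/* stand-in for the in-memory S_APPEND bit */

struct sketch_inode {
	unsigned int disk_flags;	/* models ei->i_flags */
	unsigned int vfs_flags;		/* models inode->i_flags */
};

/* Models ext4_set_inode_flags(): translate on-disk flags to in-memory ones. */
static void sketch_set_inode_flags(struct sketch_inode *inode)
{
	if (inode->disk_flags & SKETCH_DISK_APPEND)
		inode->vfs_flags |= SKETCH_S_APPEND;
}

/* Models an IS_APPEND(inode)-style test, which reads the in-memory flags. */
static bool sketch_is_append(const struct sketch_inode *inode)
{
	return (inode->vfs_flags & SKETCH_S_APPEND) != 0;
}

/*
 * Returns true if the sanity check rejected the inode.
 * translate_early == true models calling the translation right after the
 * on-disk flags are read; false models calling it at the very end.
 */
bool sketch_iget_rejects(unsigned int raw_flags, bool translate_early)
{
	struct sketch_inode inode = { 0, 0 };
	bool rejected;

	inode.disk_flags = raw_flags;	/* models ei->i_flags = le32_to_cpu(...) */
	if (translate_early)
		sketch_set_inode_flags(&inode);

	rejected = sketch_is_append(&inode);	/* the sanity check */

	if (!translate_early)
		sketch_set_inode_flags(&inode);
	return rejected;
}
```

With the late translation, sketch_iget_rejects(SKETCH_DISK_APPEND, false) returns false because the in-memory bit is still clear when the check runs; with the early translation the same input is rejected. That is the behavior change being weighed in this thread.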

> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index d0dd585add6a..1e9ab445c777 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4999,7 +4999,6 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
>         if (S_ISREG(inode->i_mode)) {
>                 inode->i_op = &ext4_file_inode_operations;
>                 inode->i_fop = &ext4_file_operations;
> -               ext4_set_aops(inode);
>         } else if (S_ISDIR(inode->i_mode)) {
>                 inode->i_op = &ext4_dir_inode_operations;
>                 inode->i_fop = &ext4_dir_operations;
> @@ -5042,6 +5041,12 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
>         }
>         brelse(iloc.bh);
>         ext4_set_inode_flags(inode);
> +       /*
> +        * Now that we have determined whether DAX is enabled, set the
> +        * proper address spaces operations
> +        */
> +       if (S_ISREG(inode->i_mode))
> +               ext4_set_aops(inode);
> 
>         unlock_new_inode(inode);
>         return inode;
Kani, Toshi Sept. 12, 2018, 4:08 p.m. UTC | #4
On Wed, 2018-09-12 at 11:31 +0200, Jan Kara wrote:
> On Tue 11-09-18 11:15:18, Dan Williams wrote:
> > Hmm, does this have unintended behavior changes?
> > 
> > I notice that there are some checks for flags "IS_APPEND(inode) ||
> > IS_IMMUTABLE(inode)" *before* the call to ext4_set_inode_flags(). I
> > didn't look too much deeper at whether those checks are bogus, but it
> > would seem safer to do something like this for a lower risk fix.
> > 
> > Thoughts?
> 
> Well, safer but it would leave the landmine around for others to hit.
> Toshi, please move the ext4_set_inode_flags() call to be just after the
> assignment:
> 
> 	ei->i_flags = le32_to_cpu(raw_inode->i_flags);
> 
> in ext4_iget(). That way people won't introduce checks for i_flags that can
> never hit... And yes, it fixes also other bugs (mostly in sanity checks
> AFAICS) than the DAX issue.

Sure.  Assuming you think the implicit change Dan pointed out is not a
problem, yes, I will go with this cleaner approach.

Thanks!
-Toshi
Dan Williams Sept. 12, 2018, 4:41 p.m. UTC | #5
On Wed, Sep 12, 2018 at 9:08 AM, Kani, Toshi <toshi.kani@hpe.com> wrote:
> On Wed, 2018-09-12 at 11:31 +0200, Jan Kara wrote:
>> On Tue 11-09-18 11:15:18, Dan Williams wrote:
>> > Hmm, does this have unintended behavior changes?
>> >
>> > I notice that there are some checks for flags "IS_APPEND(inode) ||
>> > IS_IMMUTABLE(inode)" *before* the call to ext4_set_inode_flags(). I
>> > didn't look too much deeper at whether those checks are bogus, but it
>> > would seem safer to do something like this for a lower risk fix.
>> >
>> > Thoughts?
>>
>> Well, safer but it would leave the landmine around for others to hit.
>> Toshi, please move the ext4_set_inode_flags() call to be just after the
>> assignment:
>>
>> ei->i_flags = le32_to_cpu(raw_inode->i_flags);
>>
>> in ext4_iget(). That way people won't introduce checks for i_flags that can
>> never hit... And yes, it fixes also other bugs (mostly in sanity checks
>> AFAICS) than the DAX issue.
>
> Sure.  Assuming you think the implicit change Dan pointed out is not a
> problem, yes, I will go with this cleaner approach.
>

Yes, Jan's proposal looks best to me.

Patch

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 775cd9b4af55..93cbbb859c40 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4998,6 +4998,8 @@  struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
 	if (ret)
 		goto bad_inode;
 
+	ext4_set_inode_flags(inode);
+
 	if (S_ISREG(inode->i_mode)) {
 		inode->i_op = &ext4_file_inode_operations;
 		inode->i_fop = &ext4_file_operations;
@@ -5043,7 +5045,6 @@  struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
 		goto bad_inode;
 	}
 	brelse(iloc.bh);
-	ext4_set_inode_flags(inode);
 
 	unlock_new_inode(inode);
 	return inode;