diff mbox series

cifs: fix allocation size on newly created files

Message ID CAH2r5mv0kMa__3-KvRRE20OZ3R=cnOFVbrAzVyRA0zpXsbaBiw@mail.gmail.com (mailing list archive)
State New, archived
Headers show
Series cifs: fix allocation size on newly created files | expand

Commit Message

Steve French March 19, 2021, 5:25 a.m. UTC
Applications that create and extend and write to a file do not
expect to see 0 allocation size.  When file is extended,
set its allocation size to a plausible value until we have a
chance to query the server for it.  When the file is cached
this will prevent showing an impossible number of allocated
blocks (like 0).  This fixes e.g. xfstests 614 which does

    1) create a file and set its size to 64K
    2) mmap write 64K to the file
    3) stat -c %b for the file (to query the number of allocated blocks)

It was failing because we returned 0 blocks.  Even though we would
return the correct cached file size, we returned an impossible
allocation size.

Signed-off-by: Steve French <stfrench@microsoft.com>
CC: <stable@vger.kernel.org>
---
 fs/cifs/inode.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

      inode->i_mapping && inode->i_mapping->nrpages != 0) {
  rc = filemap_fdatawait(inode->i_mapping);
@@ -2585,6 +2585,14 @@ cifs_set_file_size(struct inode *inode, struct
iattr *attrs,
  if (rc == 0) {
  cifsInode->server_eof = attrs->ia_size;
  cifs_setsize(inode, attrs->ia_size);
+ /*
+ * i_blocks is not related to (i_size / i_blksize),
+ * but instead 512 byte (2**9) size is required for
+ * calculating num blocks. Until we can query the
+ * server for actual allocation size, this is best estimate
+ * we have for the blocks allocated for this file
+ */
+ inode->i_blocks = (512 - 1 + attrs->ia_size) >> 9;

  /*
  * The man page of truncate says if the size changed,
@@ -2912,7 +2920,7 @@ cifs_setattr_nounix(struct dentry *direntry,
struct iattr *attrs)
  sys_utimes in which case we ought to fail the call back to
  the user when the server rejects the call */
  if ((rc) && (attrs->ia_valid &
- (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
+     (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
  rc = 0;
  }

Comments

Aurélien Aptel March 19, 2021, 4:35 p.m. UTC | #1
I gave it a try and it seems correct, alloc size is zero without the
patch, despite size being >0.

The estimate seems reasonable, it will get updated when cache timeout
runs out.

Server is supposed to return newly allocated blocks on Close Response
but I see it's zero on Windows Server despite file size being >0 (maybe
server bug).

In any case change looks ok, you can

Reviewed-by: Aurelien Aptel <aaptel@suse.com>

But little nit picking on comments and useless final hunk:

Steve French <smfrench@gmail.com> writes:
> + /*
> + * i_blocks is not related to (i_size / i_blksize),
> + * but instead 512 byte (2**9) size is required for
> + * calculating num blocks. Until we can query the
> + * server for actual allocation size, this is best estimate
> + * we have for the blocks allocated for this file
> + */
> + inode->i_blocks = (512 - 1 + attrs->ia_size) >> 9;

I would put in the comment: number of 512bytes blocks rounded up, much
easier to read.

>
>   /*
>   * The man page of truncate says if the size changed,
> @@ -2912,7 +2920,7 @@ cifs_setattr_nounix(struct dentry *direntry,
> struct iattr *attrs)
>   sys_utimes in which case we ought to fail the call back to
>   the user when the server rejects the call */
>   if ((rc) && (attrs->ia_valid &
> - (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
> +     (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
>   rc = 0;
>   }

You should remove that hunk, it's not doing anything

Cheers,
Steve French March 19, 2021, 4:52 p.m. UTC | #2
patch updated - thx

On Fri, Mar 19, 2021 at 11:35 AM Aurélien Aptel <aaptel@suse.com> wrote:
>
>
> I gave it a try and it seems correct, alloc size is zero without the
> patch, despite size being >0.
>
> The estimate seems reasonable, it will get updated when cache timeout
> runs out.
>
> Server is supposed to return newly allocated blocks on Close Response
> but I see it's zero on Windows Server despite file size being >0 (maybe
> server bug).
>
> In any case change looks ok, you can
>
> Reviewed-by: Aurelien Aptel <aaptel@suse.com>
>
> But little nit picking on comments and useless final hunk:
>
> Steve French <smfrench@gmail.com> writes:
> > + /*
> > + * i_blocks is not related to (i_size / i_blksize),
> > + * but instead 512 byte (2**9) size is required for
> > + * calculating num blocks. Until we can query the
> > + * server for actual allocation size, this is best estimate
> > + * we have for the blocks allocated for this file
> > + */
> > + inode->i_blocks = (512 - 1 + attrs->ia_size) >> 9;
>
> I would put in the comment: number of 512bytes blocks rounded up, much
> easier to read.
>
> >
> >   /*
> >   * The man page of truncate says if the size changed,
> > @@ -2912,7 +2920,7 @@ cifs_setattr_nounix(struct dentry *direntry,
> > struct iattr *attrs)
> >   sys_utimes in which case we ought to fail the call back to
> >   the user when the server rejects the call */
> >   if ((rc) && (attrs->ia_valid &
> > - (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
> > +     (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
> >   rc = 0;
> >   }
>
> You should remove that hunk, it's not doing anything
>
> Cheers,
> --
> Aurélien Aptel / SUSE Labs Samba Team
> GPG: 1839 CB5F 9F5B FB9B AA97  8C99 03C8 A49B 521B D5D3
> SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, DE
> GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 247165 (AG München)
>
Tom Talpey March 19, 2021, 5:38 p.m. UTC | #3
On 3/19/2021 1:25 AM, Steve French wrote:
> Applications that create and extend and write to a file do not
> expect to see 0 allocation size.  When file is extended,
> set its allocation size to a plausible value until we have a
> chance to query the server for it.  When the file is cached
> this will prevent showing an impossible number of allocated
> blocks (like 0).  This fixes e.g. xfstests 614 which does
> 
>      1) create a file and set its size to 64K
>      2) mmap write 64K to the file
>      3) stat -c %b for the file (to query the number of allocated blocks)
> 
> It was failing because we returned 0 blocks.  Even though we would
> return the correct cached file size, we returned an impossible
> allocation size.
> 
> Signed-off-by: Steve French <stfrench@microsoft.com>
> CC: <stable@vger.kernel.org>
> ---
>   fs/cifs/inode.c | 12 ++++++++++--
>   1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
> index 7c61bc9573c0..17a2c87b811c 100644
> --- a/fs/cifs/inode.c
> +++ b/fs/cifs/inode.c
> @@ -2395,7 +2395,7 @@ int cifs_getattr(struct user_namespace
> *mnt_userns, const struct path *path,
>    * We need to be sure that all dirty pages are written and the server
>    * has actual ctime, mtime and file length.
>    */
> - if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE)) &&
> + if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE |
> STATX_BLOCKS)) &&
>        !CIFS_CACHE_READ(CIFS_I(inode)) &&
>        inode->i_mapping && inode->i_mapping->nrpages != 0) {
>    rc = filemap_fdatawait(inode->i_mapping);
> @@ -2585,6 +2585,14 @@ cifs_set_file_size(struct inode *inode, struct
> iattr *attrs,
>    if (rc == 0) {
>    cifsInode->server_eof = attrs->ia_size;
>    cifs_setsize(inode, attrs->ia_size);
> + /*
> + * i_blocks is not related to (i_size / i_blksize),
> + * but instead 512 byte (2**9) size is required for
> + * calculating num blocks. Until we can query the
> + * server for actual allocation size, this is best estimate
> + * we have for the blocks allocated for this file
> + */
> + inode->i_blocks = (512 - 1 + attrs->ia_size) >> 9;

I don't think 512 is a very robust choice, no server uses anything
so small any more. MS-FSA requires the allocation quantum to be the
volume cluster size. Is that value available locally?

Tom.

>    /*
>    * The man page of truncate says if the size changed,
> @@ -2912,7 +2920,7 @@ cifs_setattr_nounix(struct dentry *direntry,
> struct iattr *attrs)
>    sys_utimes in which case we ought to fail the call back to
>    the user when the server rejects the call */
>    if ((rc) && (attrs->ia_valid &
> - (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
> +     (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
>    rc = 0;
>    }
>
Steve French March 19, 2021, 5:42 p.m. UTC | #4
We report the block size properly (typically much larger) - but the
kernel API returns allocation size in 512 byte units no matter what the
block size is.   Number of blocks returned for the kernel API
     inode->i_blocks
is unrelated to the block size (simply allocation_size/512 rounded up by 1).

On Fri, Mar 19, 2021 at 12:38 PM Tom Talpey <tom@talpey.com> wrote:
>
> On 3/19/2021 1:25 AM, Steve French wrote:
> > Applications that create and extend and write to a file do not
> > expect to see 0 allocation size.  When file is extended,
> > set its allocation size to a plausible value until we have a
> > chance to query the server for it.  When the file is cached
> > this will prevent showing an impossible number of allocated
> > blocks (like 0).  This fixes e.g. xfstests 614 which does
> >
> >      1) create a file and set its size to 64K
> >      2) mmap write 64K to the file
> >      3) stat -c %b for the file (to query the number of allocated blocks)
> >
> > It was failing because we returned 0 blocks.  Even though we would
> > return the correct cached file size, we returned an impossible
> > allocation size.
> >
> > Signed-off-by: Steve French <stfrench@microsoft.com>
> > CC: <stable@vger.kernel.org>
> > ---
> >   fs/cifs/inode.c | 12 ++++++++++--
> >   1 file changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
> > index 7c61bc9573c0..17a2c87b811c 100644
> > --- a/fs/cifs/inode.c
> > +++ b/fs/cifs/inode.c
> > @@ -2395,7 +2395,7 @@ int cifs_getattr(struct user_namespace
> > *mnt_userns, const struct path *path,
> >    * We need to be sure that all dirty pages are written and the server
> >    * has actual ctime, mtime and file length.
> >    */
> > - if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE)) &&
> > + if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE |
> > STATX_BLOCKS)) &&
> >        !CIFS_CACHE_READ(CIFS_I(inode)) &&
> >        inode->i_mapping && inode->i_mapping->nrpages != 0) {
> >    rc = filemap_fdatawait(inode->i_mapping);
> > @@ -2585,6 +2585,14 @@ cifs_set_file_size(struct inode *inode, struct
> > iattr *attrs,
> >    if (rc == 0) {
> >    cifsInode->server_eof = attrs->ia_size;
> >    cifs_setsize(inode, attrs->ia_size);
> > + /*
> > + * i_blocks is not related to (i_size / i_blksize),
> > + * but instead 512 byte (2**9) size is required for
> > + * calculating num blocks. Until we can query the
> > + * server for actual allocation size, this is best estimate
> > + * we have for the blocks allocated for this file
> > + */
> > + inode->i_blocks = (512 - 1 + attrs->ia_size) >> 9;
>
> I don't think 512 is a very robust choice, no server uses anything
> so small any more. MS-FSA requires the allocation quantum to be the
> volume cluster size. Is that value available locally?
>
> Tom.
>
> >    /*
> >    * The man page of truncate says if the size changed,
> > @@ -2912,7 +2920,7 @@ cifs_setattr_nounix(struct dentry *direntry,
> > struct iattr *attrs)
> >    sys_utimes in which case we ought to fail the call back to
> >    the user when the server rejects the call */
> >    if ((rc) && (attrs->ia_valid &
> > - (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
> > +     (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
> >    rc = 0;
> >    }
> >
Steve French March 19, 2021, 5:46 p.m. UTC | #5
e.g. stat reports much larger than 512 byte block size over SMB3

# stat /mnt2/foo
  File: /mnt2/foo
  Size: 65536      Blocks: 128        IO Block: 1048576 regular file
Device: 34h/52d Inode: 88946092640651991  Links: 1

and local file systems do the same ie "blocks" is unrelated to block size
the fs reports.  Here is an example to XFS locally

# stat Makefile
  File: Makefile
  Size: 66247      Blocks: 136        IO Block: 4096   regular file
Device: 10302h/66306d Inode: 1076242180  Links: 1

On Fri, Mar 19, 2021 at 12:42 PM Steve French <smfrench@gmail.com> wrote:
>
> We report the block size properly (typically much larger) - but the
> kernel API returns allocation size in 512 byte units no matter what the
> block size is.   Number of blocks returned for the kernel API
>      inode->i_blocks
> is unrelated to the block size (simply allocation_size/512 rounded up by 1).
>
> On Fri, Mar 19, 2021 at 12:38 PM Tom Talpey <tom@talpey.com> wrote:
> >
> > On 3/19/2021 1:25 AM, Steve French wrote:
> > > Applications that create and extend and write to a file do not
> > > expect to see 0 allocation size.  When file is extended,
> > > set its allocation size to a plausible value until we have a
> > > chance to query the server for it.  When the file is cached
> > > this will prevent showing an impossible number of allocated
> > > blocks (like 0).  This fixes e.g. xfstests 614 which does
> > >
> > >      1) create a file and set its size to 64K
> > >      2) mmap write 64K to the file
> > >      3) stat -c %b for the file (to query the number of allocated blocks)
> > >
> > > It was failing because we returned 0 blocks.  Even though we would
> > > return the correct cached file size, we returned an impossible
> > > allocation size.
> > >
> > > Signed-off-by: Steve French <stfrench@microsoft.com>
> > > CC: <stable@vger.kernel.org>
> > > ---
> > >   fs/cifs/inode.c | 12 ++++++++++--
> > >   1 file changed, 10 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
> > > index 7c61bc9573c0..17a2c87b811c 100644
> > > --- a/fs/cifs/inode.c
> > > +++ b/fs/cifs/inode.c
> > > @@ -2395,7 +2395,7 @@ int cifs_getattr(struct user_namespace
> > > *mnt_userns, const struct path *path,
> > >    * We need to be sure that all dirty pages are written and the server
> > >    * has actual ctime, mtime and file length.
> > >    */
> > > - if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE)) &&
> > > + if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE |
> > > STATX_BLOCKS)) &&
> > >        !CIFS_CACHE_READ(CIFS_I(inode)) &&
> > >        inode->i_mapping && inode->i_mapping->nrpages != 0) {
> > >    rc = filemap_fdatawait(inode->i_mapping);
> > > @@ -2585,6 +2585,14 @@ cifs_set_file_size(struct inode *inode, struct
> > > iattr *attrs,
> > >    if (rc == 0) {
> > >    cifsInode->server_eof = attrs->ia_size;
> > >    cifs_setsize(inode, attrs->ia_size);
> > > + /*
> > > + * i_blocks is not related to (i_size / i_blksize),
> > > + * but instead 512 byte (2**9) size is required for
> > > + * calculating num blocks. Until we can query the
> > > + * server for actual allocation size, this is best estimate
> > > + * we have for the blocks allocated for this file
> > > + */
> > > + inode->i_blocks = (512 - 1 + attrs->ia_size) >> 9;
> >
> > I don't think 512 is a very robust choice, no server uses anything
> > so small any more. MS-FSA requires the allocation quantum to be the
> > volume cluster size. Is that value available locally?
> >
> > Tom.
> >
> > >    /*
> > >    * The man page of truncate says if the size changed,
> > > @@ -2912,7 +2920,7 @@ cifs_setattr_nounix(struct dentry *direntry,
> > > struct iattr *attrs)
> > >    sys_utimes in which case we ought to fail the call back to
> > >    the user when the server rejects the call */
> > >    if ((rc) && (attrs->ia_valid &
> > > - (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
> > > +     (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
> > >    rc = 0;
> > >    }
> > >
>
>
>
> --
> Thanks,
>
> Steve
Tom Talpey March 19, 2021, 5:52 p.m. UTC | #6
But it's not the block size here, it's the cluster size. Block
size is the per-io hunk, allocation size is the number of blocks
lined up to receive it.

Perhaps the safest number is the file size itself, unrounded.

On 3/19/2021 1:46 PM, Steve French wrote:
> e.g. stat reports much larger than 512 byte block size over SMB3
> 
> # stat /mnt2/foo
>    File: /mnt2/foo
>    Size: 65536      Blocks: 128        IO Block: 1048576 regular file
> Device: 34h/52d Inode: 88946092640651991  Links: 1
> 
> and local file systems do the same ie "blocks" is unrelated to block size
> the fs reports.  Here is an example to XFS locally
> 
> # stat Makefile
>    File: Makefile
>    Size: 66247      Blocks: 136        IO Block: 4096   regular file
> Device: 10302h/66306d Inode: 1076242180  Links: 1
> 
> On Fri, Mar 19, 2021 at 12:42 PM Steve French <smfrench@gmail.com> wrote:
>>
>> We report the block size properly (typically much larger) - but the
>> kernel API returns allocation size in 512 byte units no matter what the
>> block size is.   Number of blocks returned for the kernel API
>>       inode->i_blocks
>> is unrelated to the block size (simply allocation_size/512 rounded up by 1).
>>
>> On Fri, Mar 19, 2021 at 12:38 PM Tom Talpey <tom@talpey.com> wrote:
>>>
>>> On 3/19/2021 1:25 AM, Steve French wrote:
>>>> Applications that create and extend and write to a file do not
>>>> expect to see 0 allocation size.  When file is extended,
>>>> set its allocation size to a plausible value until we have a
>>>> chance to query the server for it.  When the file is cached
>>>> this will prevent showing an impossible number of allocated
>>>> blocks (like 0).  This fixes e.g. xfstests 614 which does
>>>>
>>>>       1) create a file and set its size to 64K
>>>>       2) mmap write 64K to the file
>>>>       3) stat -c %b for the file (to query the number of allocated blocks)
>>>>
>>>> It was failing because we returned 0 blocks.  Even though we would
>>>> return the correct cached file size, we returned an impossible
>>>> allocation size.
>>>>
>>>> Signed-off-by: Steve French <stfrench@microsoft.com>
>>>> CC: <stable@vger.kernel.org>
>>>> ---
>>>>    fs/cifs/inode.c | 12 ++++++++++--
>>>>    1 file changed, 10 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
>>>> index 7c61bc9573c0..17a2c87b811c 100644
>>>> --- a/fs/cifs/inode.c
>>>> +++ b/fs/cifs/inode.c
>>>> @@ -2395,7 +2395,7 @@ int cifs_getattr(struct user_namespace
>>>> *mnt_userns, const struct path *path,
>>>>     * We need to be sure that all dirty pages are written and the server
>>>>     * has actual ctime, mtime and file length.
>>>>     */
>>>> - if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE)) &&
>>>> + if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE |
>>>> STATX_BLOCKS)) &&
>>>>         !CIFS_CACHE_READ(CIFS_I(inode)) &&
>>>>         inode->i_mapping && inode->i_mapping->nrpages != 0) {
>>>>     rc = filemap_fdatawait(inode->i_mapping);
>>>> @@ -2585,6 +2585,14 @@ cifs_set_file_size(struct inode *inode, struct
>>>> iattr *attrs,
>>>>     if (rc == 0) {
>>>>     cifsInode->server_eof = attrs->ia_size;
>>>>     cifs_setsize(inode, attrs->ia_size);
>>>> + /*
>>>> + * i_blocks is not related to (i_size / i_blksize),
>>>> + * but instead 512 byte (2**9) size is required for
>>>> + * calculating num blocks. Until we can query the
>>>> + * server for actual allocation size, this is best estimate
>>>> + * we have for the blocks allocated for this file
>>>> + */
>>>> + inode->i_blocks = (512 - 1 + attrs->ia_size) >> 9;
>>>
>>> I don't think 512 is a very robust choice, no server uses anything
>>> so small any more. MS-FSA requires the allocation quantum to be the
>>> volume cluster size. Is that value available locally?
>>>
>>> Tom.
>>>
>>>>     /*
>>>>     * The man page of truncate says if the size changed,
>>>> @@ -2912,7 +2920,7 @@ cifs_setattr_nounix(struct dentry *direntry,
>>>> struct iattr *attrs)
>>>>     sys_utimes in which case we ought to fail the call back to
>>>>     the user when the server rejects the call */
>>>>     if ((rc) && (attrs->ia_valid &
>>>> - (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
>>>> +     (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
>>>>     rc = 0;
>>>>     }
>>>>
>>
>>
>>
>> --
>> Thanks,
>>
>> Steve
> 
> 
>
Steve French March 19, 2021, 6:08 p.m. UTC | #7
Yes - the Linux terminology is confusing.  Quoting from Ubuntu docs
e.g. about stat output

           "IO Block" in stat's output is the preferred number of
           bytes that the file system uses for reading and writing files...
           "Blocks", on the other hand, counts how many 512-bytes blocks
           are allocated on disk for the file.

So for NFS and SMB3 mounts they return 1MB for "IO Block" size.

statfs on the other hand shows what the server thinks the block size
is (often 4K) but
that is a different field.

And of course number of "blocks" in stat output is meant to return
allocation size
(in 512 byte units for historical reasons, even if most file systems
don't use 512
byte blocks anymore)


On Fri, Mar 19, 2021 at 12:52 PM Tom Talpey <tom@talpey.com> wrote:
>
> But it's not the block size here, it's the cluster size. Block
> size is the per-io hunk, allocation size is the number of blocks
> lined up to receive it.
>
> Perhaps the safest number is the file size itself, unrounded.
>
> On 3/19/2021 1:46 PM, Steve French wrote:
> > e.g. stat reports much larger than 512 byte block size over SMB3
> >
> > # stat /mnt2/foo
> >    File: /mnt2/foo
> >    Size: 65536      Blocks: 128        IO Block: 1048576 regular file
> > Device: 34h/52d Inode: 88946092640651991  Links: 1
> >
> > and local file systems do the same ie "blocks" is unrelated to block size
> > the fs reports.  Here is an example to XFS locally
> >
> > # stat Makefile
> >    File: Makefile
> >    Size: 66247      Blocks: 136        IO Block: 4096   regular file
> > Device: 10302h/66306d Inode: 1076242180  Links: 1
> >
> > On Fri, Mar 19, 2021 at 12:42 PM Steve French <smfrench@gmail.com> wrote:
> >>
> >> We report the block size properly (typically much larger) - but the
> >> kernel API returns allocation size in 512 byte units no matter what the
> >> block size is.   Number of blocks returned for the kernel API
> >>       inode->i_blocks
> >> is unrelated to the block size (simply allocation_size/512 rounded up by 1).
> >>
> >> On Fri, Mar 19, 2021 at 12:38 PM Tom Talpey <tom@talpey.com> wrote:
> >>>
> >>> On 3/19/2021 1:25 AM, Steve French wrote:
> >>>> Applications that create and extend and write to a file do not
> >>>> expect to see 0 allocation size.  When file is extended,
> >>>> set its allocation size to a plausible value until we have a
> >>>> chance to query the server for it.  When the file is cached
> >>>> this will prevent showing an impossible number of allocated
> >>>> blocks (like 0).  This fixes e.g. xfstests 614 which does
> >>>>
> >>>>       1) create a file and set its size to 64K
> >>>>       2) mmap write 64K to the file
> >>>>       3) stat -c %b for the file (to query the number of allocated blocks)
> >>>>
> >>>> It was failing because we returned 0 blocks.  Even though we would
> >>>> return the correct cached file size, we returned an impossible
> >>>> allocation size.
> >>>>
> >>>> Signed-off-by: Steve French <stfrench@microsoft.com>
> >>>> CC: <stable@vger.kernel.org>
> >>>> ---
> >>>>    fs/cifs/inode.c | 12 ++++++++++--
> >>>>    1 file changed, 10 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
> >>>> index 7c61bc9573c0..17a2c87b811c 100644
> >>>> --- a/fs/cifs/inode.c
> >>>> +++ b/fs/cifs/inode.c
> >>>> @@ -2395,7 +2395,7 @@ int cifs_getattr(struct user_namespace
> >>>> *mnt_userns, const struct path *path,
> >>>>     * We need to be sure that all dirty pages are written and the server
> >>>>     * has actual ctime, mtime and file length.
> >>>>     */
> >>>> - if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE)) &&
> >>>> + if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE |
> >>>> STATX_BLOCKS)) &&
> >>>>         !CIFS_CACHE_READ(CIFS_I(inode)) &&
> >>>>         inode->i_mapping && inode->i_mapping->nrpages != 0) {
> >>>>     rc = filemap_fdatawait(inode->i_mapping);
> >>>> @@ -2585,6 +2585,14 @@ cifs_set_file_size(struct inode *inode, struct
> >>>> iattr *attrs,
> >>>>     if (rc == 0) {
> >>>>     cifsInode->server_eof = attrs->ia_size;
> >>>>     cifs_setsize(inode, attrs->ia_size);
> >>>> + /*
> >>>> + * i_blocks is not related to (i_size / i_blksize),
> >>>> + * but instead 512 byte (2**9) size is required for
> >>>> + * calculating num blocks. Until we can query the
> >>>> + * server for actual allocation size, this is best estimate
> >>>> + * we have for the blocks allocated for this file
> >>>> + */
> >>>> + inode->i_blocks = (512 - 1 + attrs->ia_size) >> 9;
> >>>
> >>> I don't think 512 is a very robust choice, no server uses anything
> >>> so small any more. MS-FSA requires the allocation quantum to be the
> >>> volume cluster size. Is that value available locally?
> >>>
> >>> Tom.
> >>>
> >>>>     /*
> >>>>     * The man page of truncate says if the size changed,
> >>>> @@ -2912,7 +2920,7 @@ cifs_setattr_nounix(struct dentry *direntry,
> >>>> struct iattr *attrs)
> >>>>     sys_utimes in which case we ought to fail the call back to
> >>>>     the user when the server rejects the call */
> >>>>     if ((rc) && (attrs->ia_valid &
> >>>> - (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
> >>>> +     (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
> >>>>     rc = 0;
> >>>>     }
> >>>>
> >>
> >>
> >>
> >> --
> >> Thanks,
> >>
> >> Steve
> >
> >
> >



--
Thanks,

Steve
Tom Talpey March 19, 2021, 6:26 p.m. UTC | #8
Hrm. I am still uneasy about making up a number. It could break
an application. And the issue isn't even in the client!

Did you ping Neal, or contact dochelp about the Windows server
behavior? I'd be happy to but I don't have any context, including
which filesystem is doing this.

On 3/19/2021 2:08 PM, Steve French wrote:
> Yes - the Linux terminology is confusing.  Quoting from Ubuntu docs
> e.g. about stat output
> 
>             "IO Block" in stat's output is the preferred number of
>             bytes that the file system uses for reading and writing files...
>             "Blocks", on the other hand, counts how many 512-bytes blocks
>             are allocated on disk for the file.
> 
> So for NFS and SMB3 mounts they return 1MB for "IO Block" size.
> 
> statfs on the other hand shows what the server thinks the block size
> is (often 4K) but
> that is a different field.
> 
> And of course number of "blocks" in stat output is meant to return
> allocation size
> (in 512 byte units for historical reasons, even if most file systems
> don't use 512
> byte blocks anymore)
> 
> 
> On Fri, Mar 19, 2021 at 12:52 PM Tom Talpey <tom@talpey.com> wrote:
>>
>> But it's not the block size here, it's the cluster size. Block
>> size is the per-io hunk, allocation size is the number of blocks
>> lined up to receive it.
>>
>> Perhaps the safest number is the file size itself, unrounded.
>>
>> On 3/19/2021 1:46 PM, Steve French wrote:
>>> e.g. stat reports much larger than 512 byte block size over SMB3
>>>
>>> # stat /mnt2/foo
>>>     File: /mnt2/foo
>>>     Size: 65536      Blocks: 128        IO Block: 1048576 regular file
>>> Device: 34h/52d Inode: 88946092640651991  Links: 1
>>>
>>> and local file systems do the same ie "blocks" is unrelated to block size
>>> the fs reports.  Here is an example to XFS locally
>>>
>>> # stat Makefile
>>>     File: Makefile
>>>     Size: 66247      Blocks: 136        IO Block: 4096   regular file
>>> Device: 10302h/66306d Inode: 1076242180  Links: 1
>>>
>>> On Fri, Mar 19, 2021 at 12:42 PM Steve French <smfrench@gmail.com> wrote:
>>>>
>>>> We report the block size properly (typically much larger) - but the
>>>> kernel API returns allocation size in 512 byte units no matter what the
>>>> block size is.   Number of blocks returned for the kernel API
>>>>        inode->i_blocks
>>>> is unrelated to the block size (simply allocation_size/512 rounded up by 1).
>>>>
>>>> On Fri, Mar 19, 2021 at 12:38 PM Tom Talpey <tom@talpey.com> wrote:
>>>>>
>>>>> On 3/19/2021 1:25 AM, Steve French wrote:
>>>>>> Applications that create and extend and write to a file do not
>>>>>> expect to see 0 allocation size.  When file is extended,
>>>>>> set its allocation size to a plausible value until we have a
>>>>>> chance to query the server for it.  When the file is cached
>>>>>> this will prevent showing an impossible number of allocated
>>>>>> blocks (like 0).  This fixes e.g. xfstests 614 which does
>>>>>>
>>>>>>        1) create a file and set its size to 64K
>>>>>>        2) mmap write 64K to the file
>>>>>>        3) stat -c %b for the file (to query the number of allocated blocks)
>>>>>>
>>>>>> It was failing because we returned 0 blocks.  Even though we would
>>>>>> return the correct cached file size, we returned an impossible
>>>>>> allocation size.
>>>>>>
>>>>>> Signed-off-by: Steve French <stfrench@microsoft.com>
>>>>>> CC: <stable@vger.kernel.org>
>>>>>> ---
>>>>>>     fs/cifs/inode.c | 12 ++++++++++--
>>>>>>     1 file changed, 10 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
>>>>>> index 7c61bc9573c0..17a2c87b811c 100644
>>>>>> --- a/fs/cifs/inode.c
>>>>>> +++ b/fs/cifs/inode.c
>>>>>> @@ -2395,7 +2395,7 @@ int cifs_getattr(struct user_namespace
>>>>>> *mnt_userns, const struct path *path,
>>>>>>      * We need to be sure that all dirty pages are written and the server
>>>>>>      * has actual ctime, mtime and file length.
>>>>>>      */
>>>>>> - if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE)) &&
>>>>>> + if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE |
>>>>>> STATX_BLOCKS)) &&
>>>>>>          !CIFS_CACHE_READ(CIFS_I(inode)) &&
>>>>>>          inode->i_mapping && inode->i_mapping->nrpages != 0) {
>>>>>>      rc = filemap_fdatawait(inode->i_mapping);
>>>>>> @@ -2585,6 +2585,14 @@ cifs_set_file_size(struct inode *inode, struct
>>>>>> iattr *attrs,
>>>>>>      if (rc == 0) {
>>>>>>      cifsInode->server_eof = attrs->ia_size;
>>>>>>      cifs_setsize(inode, attrs->ia_size);
>>>>>> + /*
>>>>>> + * i_blocks is not related to (i_size / i_blksize),
>>>>>> + * but instead 512 byte (2**9) size is required for
>>>>>> + * calculating num blocks. Until we can query the
>>>>>> + * server for actual allocation size, this is best estimate
>>>>>> + * we have for the blocks allocated for this file
>>>>>> + */
>>>>>> + inode->i_blocks = (512 - 1 + attrs->ia_size) >> 9;
>>>>>
>>>>> I don't think 512 is a very robust choice, no server uses anything
>>>>> so small any more. MS-FSA requires the allocation quantum to be the
>>>>> volume cluster size. Is that value available locally?
>>>>>
>>>>> Tom.
>>>>>
>>>>>>      /*
>>>>>>      * The man page of truncate says if the size changed,
>>>>>> @@ -2912,7 +2920,7 @@ cifs_setattr_nounix(struct dentry *direntry,
>>>>>> struct iattr *attrs)
>>>>>>      sys_utimes in which case we ought to fail the call back to
>>>>>>      the user when the server rejects the call */
>>>>>>      if ((rc) && (attrs->ia_valid &
>>>>>> - (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
>>>>>> +     (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
>>>>>>      rc = 0;
>>>>>>      }
>>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Thanks,
>>>>
>>>> Steve
>>>
>>>
>>>
> 
> 
> 
> --
> Thanks,
> 
> Steve
>
Steve French March 19, 2021, 6:48 p.m. UTC | #9
Well ... it is a little complicated to query it on close in the
current cifs.ko compounding code and allocation size can change on the
server so returning a 'plausible' allocation size rather than an
impossible one is progress.

On Fri, Mar 19, 2021 at 1:26 PM Tom Talpey <tom@talpey.com> wrote:
>
> Hrm. I am still uneasy about making up a number. It could break
> an application. And the issue isn't even in the client!
>
> Did you ping Neal, or contact dochelp about the Windows server
> behavior? I'd be happy to but I don't have any context, including
> which filesystem is doing this.
>
> On 3/19/2021 2:08 PM, Steve French wrote:
> > Yes - the Linux terminology is confusing.  Quoting from Ubuntu docs
> > e.g. about stat output
> >
> >             "IO Block" in stat's output is the preferred number of
> >             bytes that the file system uses for reading and writing files...
> >             "Blocks", on the other hand, counts how many 512-bytes blocks
> >             are allocated on disk for the file.
> >
> > So for NFS and SMB3 mounts they return 1MB for "IO Block" size.
> >
> > statfs on the other hand shows what the server thinks the block size
> > is (often 4K) but
> > that is a different field.
> >
> > And of course number of "blocks" in stat output is meant to return
> > allocation size
> > (in 512 byte units for historical reasons, even if most file systems
> > don't use 512
> > byte blocks anymore)
> >
> >
> > On Fri, Mar 19, 2021 at 12:52 PM Tom Talpey <tom@talpey.com> wrote:
> >>
> >> But it's not the block size here, it's the cluster size. Block
> >> size is the per-io hunk, allocation size is the number of blocks
> >> lined up to receive it.
> >>
> >> Perhaps the safest number is the file size itself, unrounded.
> >>
> >> On 3/19/2021 1:46 PM, Steve French wrote:
> >>> e.g. stat reports much larger than 512 byte block size over SMB3
> >>>
> >>> # stat /mnt2/foo
> >>>     File: /mnt2/foo
> >>>     Size: 65536      Blocks: 128        IO Block: 1048576 regular file
> >>> Device: 34h/52d Inode: 88946092640651991  Links: 1
> >>>
> >>> and local file systems do the same ie "blocks" is unrelated to block size
> >>> the fs reports.  Here is an example to XFS locally
> >>>
> >>> # stat Makefile
> >>>     File: Makefile
> >>>     Size: 66247      Blocks: 136        IO Block: 4096   regular file
> >>> Device: 10302h/66306d Inode: 1076242180  Links: 1
> >>>
> >>> On Fri, Mar 19, 2021 at 12:42 PM Steve French <smfrench@gmail.com> wrote:
> >>>>
> >>>> We report the block size properly (typically much larger) - but the
> >>>> kernel API returns allocation size in 512 byte units no matter what the
> >>>> block size is.   Number of blocks returned for the kernel API
> >>>>        inode->i_blocks
> >>>> is unrelated to the block size (simply allocation_size/512 rounded up by 1).
> >>>>
> >>>> On Fri, Mar 19, 2021 at 12:38 PM Tom Talpey <tom@talpey.com> wrote:
> >>>>>
> >>>>> On 3/19/2021 1:25 AM, Steve French wrote:
> >>>>>> Applications that create and extend and write to a file do not
> >>>>>> expect to see 0 allocation size.  When file is extended,
> >>>>>> set its allocation size to a plausible value until we have a
> >>>>>> chance to query the server for it.  When the file is cached
> >>>>>> this will prevent showing an impossible number of allocated
> >>>>>> blocks (like 0).  This fixes e.g. xfstests 614 which does
> >>>>>>
> >>>>>>        1) create a file and set its size to 64K
> >>>>>>        2) mmap write 64K to the file
> >>>>>>        3) stat -c %b for the file (to query the number of allocated blocks)
> >>>>>>
> >>>>>> It was failing because we returned 0 blocks.  Even though we would
> >>>>>> return the correct cached file size, we returned an impossible
> >>>>>> allocation size.
> >>>>>>
> >>>>>> Signed-off-by: Steve French <stfrench@microsoft.com>
> >>>>>> CC: <stable@vger.kernel.org>
> >>>>>> ---
> >>>>>>     fs/cifs/inode.c | 12 ++++++++++--
> >>>>>>     1 file changed, 10 insertions(+), 2 deletions(-)
> >>>>>>
> >>>>>> diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
> >>>>>> index 7c61bc9573c0..17a2c87b811c 100644
> >>>>>> --- a/fs/cifs/inode.c
> >>>>>> +++ b/fs/cifs/inode.c
> >>>>>> @@ -2395,7 +2395,7 @@ int cifs_getattr(struct user_namespace
> >>>>>> *mnt_userns, const struct path *path,
> >>>>>>      * We need to be sure that all dirty pages are written and the server
> >>>>>>      * has actual ctime, mtime and file length.
> >>>>>>      */
> >>>>>> - if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE)) &&
> >>>>>> + if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE |
> >>>>>> STATX_BLOCKS)) &&
> >>>>>>          !CIFS_CACHE_READ(CIFS_I(inode)) &&
> >>>>>>          inode->i_mapping && inode->i_mapping->nrpages != 0) {
> >>>>>>      rc = filemap_fdatawait(inode->i_mapping);
> >>>>>> @@ -2585,6 +2585,14 @@ cifs_set_file_size(struct inode *inode, struct
> >>>>>> iattr *attrs,
> >>>>>>      if (rc == 0) {
> >>>>>>      cifsInode->server_eof = attrs->ia_size;
> >>>>>>      cifs_setsize(inode, attrs->ia_size);
> >>>>>> + /*
> >>>>>> + * i_blocks is not related to (i_size / i_blksize),
> >>>>>> + * but instead 512 byte (2**9) size is required for
> >>>>>> + * calculating num blocks. Until we can query the
> >>>>>> + * server for actual allocation size, this is best estimate
> >>>>>> + * we have for the blocks allocated for this file
> >>>>>> + */
> >>>>>> + inode->i_blocks = (512 - 1 + attrs->ia_size) >> 9;
> >>>>>
> >>>>> I don't think 512 is a very robust choice, no server uses anything
> >>>>> so small any more. MS-FSA requires the allocation quantum to be the
> >>>>> volume cluster size. Is that value available locally?
> >>>>>
> >>>>> Tom.
> >>>>>
> >>>>>>      /*
> >>>>>>      * The man page of truncate says if the size changed,
> >>>>>> @@ -2912,7 +2920,7 @@ cifs_setattr_nounix(struct dentry *direntry,
> >>>>>> struct iattr *attrs)
> >>>>>>      sys_utimes in which case we ought to fail the call back to
> >>>>>>      the user when the server rejects the call */
> >>>>>>      if ((rc) && (attrs->ia_valid &
> >>>>>> - (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
> >>>>>> +     (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
> >>>>>>      rc = 0;
> >>>>>>      }
> >>>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Thanks,
> >>>>
> >>>> Steve
> >>>
> >>>
> >>>
> >
> >
> >
> > --
> > Thanks,
> >
> > Steve
> >
Steve French March 19, 2021, 6:48 p.m. UTC | #10
(ie even if there is a Windows bug on close with delayed updates to
allocation size)

On Fri, Mar 19, 2021 at 1:48 PM Steve French <smfrench@gmail.com> wrote:
>
> Well ... it is a little complicated to query it on close in the
> current cifs.ko compounding code and allocation size can change on the
> server so returning a 'plausible' allocation size rather than an
> impossible one is progress.
>
> On Fri, Mar 19, 2021 at 1:26 PM Tom Talpey <tom@talpey.com> wrote:
> >
> > Hrm. I am still uneasy about making up a number. It could break
> > an application. And the issue isn't even in the client!
> >
> > Did you ping Neal, or contact dochelp about the Windows server
> > behavior? I'd be happy to but I don't have any context, including
> > which filesystem is doing this.
> >
> > On 3/19/2021 2:08 PM, Steve French wrote:
> > > Yes - the Linux terminology is confusing.  Quoting from Ubuntu docs
> > > e.g. about stat output
> > >
> > >             "IO Block" in stat's output is the preferred number of
> > >             bytes that the file system uses for reading and writing files...
> > >             "Blocks", on the other hand, counts how many 512-bytes blocks
> > >             are allocated on disk for the file.
> > >
> > > So for NFS and SMB3 mounts they return 1MB for "IO Block" size.
> > >
> > > statfs on the other hand shows what the server thinks the block size
> > > is (often 4K) but
> > > that is a different field.
> > >
> > > And of course number of "blocks" in stat output is meant to return
> > > allocation size
> > > (in 512 byte units for historical reasons, even if most file systems
> > > don't use 512
> > > byte blocks anymore)
> > >
> > >
> > > On Fri, Mar 19, 2021 at 12:52 PM Tom Talpey <tom@talpey.com> wrote:
> > >>
> > >> But it's not the block size here, it's the cluster size. Block
> > >> size is the per-io hunk, allocation size is the number of blocks
> > >> lined up to receive it.
> > >>
> > >> Perhaps the safest number is the file size itself, unrounded.
> > >>
> > >> On 3/19/2021 1:46 PM, Steve French wrote:
> > >>> e.g. stat reports much larger than 512 byte block size over SMB3
> > >>>
> > >>> # stat /mnt2/foo
> > >>>     File: /mnt2/foo
> > >>>     Size: 65536      Blocks: 128        IO Block: 1048576 regular file
> > >>> Device: 34h/52d Inode: 88946092640651991  Links: 1
> > >>>
> > >>> and local file systems do the same ie "blocks" is unrelated to block size
> > >>> the fs reports.  Here is an example to XFS locally
> > >>>
> > >>> # stat Makefile
> > >>>     File: Makefile
> > >>>     Size: 66247      Blocks: 136        IO Block: 4096   regular file
> > >>> Device: 10302h/66306d Inode: 1076242180  Links: 1
> > >>>
> > >>> On Fri, Mar 19, 2021 at 12:42 PM Steve French <smfrench@gmail.com> wrote:
> > >>>>
> > >>>> We report the block size properly (typically much larger) - but the
> > >>>> kernel API returns allocation size in 512 byte units no matter what the
> > >>>> block size is.   Number of blocks returned for the kernel API
> > >>>>        inode->i_blocks
> > >>>> is unrelated to the block size (simply allocation_size/512 rounded up by 1).
> > >>>>
> > >>>> On Fri, Mar 19, 2021 at 12:38 PM Tom Talpey <tom@talpey.com> wrote:
> > >>>>>
> > >>>>> On 3/19/2021 1:25 AM, Steve French wrote:
> > >>>>>> Applications that create and extend and write to a file do not
> > >>>>>> expect to see 0 allocation size.  When file is extended,
> > >>>>>> set its allocation size to a plausible value until we have a
> > >>>>>> chance to query the server for it.  When the file is cached
> > >>>>>> this will prevent showing an impossible number of allocated
> > >>>>>> blocks (like 0).  This fixes e.g. xfstests 614 which does
> > >>>>>>
> > >>>>>>        1) create a file and set its size to 64K
> > >>>>>>        2) mmap write 64K to the file
> > >>>>>>        3) stat -c %b for the file (to query the number of allocated blocks)
> > >>>>>>
> > >>>>>> It was failing because we returned 0 blocks.  Even though we would
> > >>>>>> return the correct cached file size, we returned an impossible
> > >>>>>> allocation size.
> > >>>>>>
> > >>>>>> Signed-off-by: Steve French <stfrench@microsoft.com>
> > >>>>>> CC: <stable@vger.kernel.org>
> > >>>>>> ---
> > >>>>>>     fs/cifs/inode.c | 12 ++++++++++--
> > >>>>>>     1 file changed, 10 insertions(+), 2 deletions(-)
> > >>>>>>
> > >>>>>> diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
> > >>>>>> index 7c61bc9573c0..17a2c87b811c 100644
> > >>>>>> --- a/fs/cifs/inode.c
> > >>>>>> +++ b/fs/cifs/inode.c
> > >>>>>> @@ -2395,7 +2395,7 @@ int cifs_getattr(struct user_namespace
> > >>>>>> *mnt_userns, const struct path *path,
> > >>>>>>      * We need to be sure that all dirty pages are written and the server
> > >>>>>>      * has actual ctime, mtime and file length.
> > >>>>>>      */
> > >>>>>> - if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE)) &&
> > >>>>>> + if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE |
> > >>>>>> STATX_BLOCKS)) &&
> > >>>>>>          !CIFS_CACHE_READ(CIFS_I(inode)) &&
> > >>>>>>          inode->i_mapping && inode->i_mapping->nrpages != 0) {
> > >>>>>>      rc = filemap_fdatawait(inode->i_mapping);
> > >>>>>> @@ -2585,6 +2585,14 @@ cifs_set_file_size(struct inode *inode, struct
> > >>>>>> iattr *attrs,
> > >>>>>>      if (rc == 0) {
> > >>>>>>      cifsInode->server_eof = attrs->ia_size;
> > >>>>>>      cifs_setsize(inode, attrs->ia_size);
> > >>>>>> + /*
> > >>>>>> + * i_blocks is not related to (i_size / i_blksize),
> > >>>>>> + * but instead 512 byte (2**9) size is required for
> > >>>>>> + * calculating num blocks. Until we can query the
> > >>>>>> + * server for actual allocation size, this is best estimate
> > >>>>>> + * we have for the blocks allocated for this file
> > >>>>>> + */
> > >>>>>> + inode->i_blocks = (512 - 1 + attrs->ia_size) >> 9;
> > >>>>>
> > >>>>> I don't think 512 is a very robust choice, no server uses anything
> > >>>>> so small any more. MS-FSA requires the allocation quantum to be the
> > >>>>> volume cluster size. Is that value available locally?
> > >>>>>
> > >>>>> Tom.
> > >>>>>
> > >>>>>>      /*
> > >>>>>>      * The man page of truncate says if the size changed,
> > >>>>>> @@ -2912,7 +2920,7 @@ cifs_setattr_nounix(struct dentry *direntry,
> > >>>>>> struct iattr *attrs)
> > >>>>>>      sys_utimes in which case we ought to fail the call back to
> > >>>>>>      the user when the server rejects the call */
> > >>>>>>      if ((rc) && (attrs->ia_valid &
> > >>>>>> - (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
> > >>>>>> +     (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
> > >>>>>>      rc = 0;
> > >>>>>>      }
> > >>>>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>> Thanks,
> > >>>>
> > >>>> Steve
> > >>>
> > >>>
> > >>>
> > >
> > >
> > >
> > > --
> > > Thanks,
> > >
> > > Steve
> > >
>
>
>
> --
> Thanks,
>
> Steve
diff mbox series

Patch

From 2ea7f44e54c7d3c4e8236e82950efb6aedda2486 Mon Sep 17 00:00:00 2001
From: Steve French <stfrench@microsoft.com>
Date: Fri, 19 Mar 2021 00:05:48 -0500
Subject: [PATCH] cifs: fix allocation size on newly created files

Applications that create and extend and write to a file do not
expect to see 0 allocation size.  When file is extended,
set its allocation size to a plausible value until we have a
chance to query the server for it.  When the file is cached
this will prevent showing an impossible number of allocated
blocks (like 0).  This fixes e.g. xfstests 614 which does

    1) create a file and set its size to 64K
    2) mmap write 64K to the file
    3) stat -c %b for the file (to query the number of allocated blocks)

It was failing because we returned 0 blocks.  Even though we would
return the correct cached file size, we returned an impossible
allocation size.

Signed-off-by: Steve French <stfrench@microsoft.com>
CC: <stable@vger.kernel.org>
---
 fs/cifs/inode.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
index 7c61bc9573c0..17a2c87b811c 100644
--- a/fs/cifs/inode.c
+++ b/fs/cifs/inode.c
@@ -2395,7 +2395,7 @@  int cifs_getattr(struct user_namespace *mnt_userns, const struct path *path,
 	 * We need to be sure that all dirty pages are written and the server
 	 * has actual ctime, mtime and file length.
 	 */
-	if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE)) &&
+	if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE | STATX_BLOCKS)) &&
 	    !CIFS_CACHE_READ(CIFS_I(inode)) &&
 	    inode->i_mapping && inode->i_mapping->nrpages != 0) {
 		rc = filemap_fdatawait(inode->i_mapping);
@@ -2585,6 +2585,14 @@  cifs_set_file_size(struct inode *inode, struct iattr *attrs,
 	if (rc == 0) {
 		cifsInode->server_eof = attrs->ia_size;
 		cifs_setsize(inode, attrs->ia_size);
+		/*
+		 * i_blocks is not related to (i_size / i_blksize),
+		 * but instead 512 byte (2**9) size is required for
+		 * calculating num blocks. Until we can query the
+		 * server for actual allocation size, this is best estimate
+		 * we have for the blocks allocated for this file
+		 */
+		inode->i_blocks = (512 - 1 + attrs->ia_size) >> 9;
 
 		/*
 		 * The man page of truncate says if the size changed,
@@ -2912,7 +2920,7 @@  cifs_setattr_nounix(struct dentry *direntry, struct iattr *attrs)
 		sys_utimes in which case we ought to fail the call back to
 		the user when the server rejects the call */
 		if ((rc) && (attrs->ia_valid &
-				(ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
+			    (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
 			rc = 0;
 	}
 
-- 
2.27.0