diff mbox

[f2fs-dev] Dwrite with non-aligned offset and size

Message ID 556C481C.1050400@huawei.com (mailing list archive)
State New, archived
Headers show

Commit Message

He YunLei June 1, 2015, 11:55 a.m. UTC
Hi Jaegeuk,

We run ltp testcase with f2fs and obtain a TFAIL in diotest4, the result in detail is
as fallow:

dio04

<<<test_start>>>
tag=dio04 stime=1432278894
cmdline="diotest4"
contacts=""
analysis=exit
<<<test_output>>>
diotest4    1  TPASS  :  Negative Offset
diotest4    2  TPASS  :  removed
diotest4    3  TFAIL  :  diotest4.c:129: write allows odd count.returns 1: Success
diotest4    4  TFAIL  :  diotest4.c:183: Odd count of read and write
diotest4    5  TPASS  :  Read beyond the file size
......

the result of ext4 with same environment:

dio04

<<<test_start>>>
tag=dio04 stime=1432259643
cmdline="diotest4"
contacts=""
analysis=exit
<<<test_output>>>
diotest4    1  TPASS  :  Negative Offset
diotest4    2  TPASS  :  removed
diotest4    3  TPASS  :  Odd count of read and write
diotest4    4  TPASS  :  Read beyond the file size
......

Does f2fs allow dwrite with non-aligned offset and size? I check the code and found
dwrite with non-aligned offset and size will turn into buffered write. Whether it will
have some impact on user layer applications?

I wrote a patch, not well tested, how do you think of it?

I wish you and other developers in this list could help me in a correct way.

Thanks,
He

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Jaegeuk Kim June 1, 2015, 11:01 p.m. UTC | #1
On Mon, Jun 01, 2015 at 07:55:08PM +0800, He YunLei wrote:
> Hi Jaegeuk,
> 
> We run ltp testcase with f2fs and obtain a TFAIL in diotest4, the result in detail is
> as fallow:
> 
> dio04
> 
> <<<test_start>>>
> tag=dio04 stime=1432278894
> cmdline="diotest4"
> contacts=""
> analysis=exit
> <<<test_output>>>
> diotest4    1  TPASS  :  Negative Offset
> diotest4    2  TPASS  :  removed
> diotest4    3  TFAIL  :  diotest4.c:129: write allows odd count.returns 1: Success
> diotest4    4  TFAIL  :  diotest4.c:183: Odd count of read and write
> diotest4    5  TPASS  :  Read beyond the file size
> ......
> 
> the result of ext4 with same environment:
> 
> dio04
> 
> <<<test_start>>>
> tag=dio04 stime=1432259643
> cmdline="diotest4"
> contacts=""
> analysis=exit
> <<<test_output>>>
> diotest4    1  TPASS  :  Negative Offset
> diotest4    2  TPASS  :  removed
> diotest4    3  TPASS  :  Odd count of read and write
> diotest4    4  TPASS  :  Read beyond the file size
> ......
> 
> Does f2fs allow dwrite with non-aligned offset and size? I check the code and found
> dwrite with non-aligned offset and size will turn into buffered write. Whether it will
> have some impact on user layer applications?

It's not a big deal to return -EINVAL.
When I take a look at other filesystem behaviors, it seems there is no restriction.

> 
> I wrote a patch, not well tested, how do you think of it?

Returning the error number would be good to me.
Could you write and sumbit a complete one?

Thanks,

> 
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index 9bedfa8..ba5d94c 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -2010,8 +2010,9 @@ static ssize_t f2fs_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
>         if (f2fs_encrypted_inode(inode) && S_ISREG(inode->i_mode))
>                 return 0;
> 
> -       if (check_direct_IO(inode, iter, offset))
> -               return 0;
> +       err = check_direct_IO(inode, iter, offset)
> +       if (err)
> +               return -EINVAL;
> 
>         trace_f2fs_direct_IO_enter(inode, offset, count, iov_iter_rw(iter));
> 
> I wish you and other developers in this list could help me in a correct way.
> 
> Thanks,
> He
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
He YunLei June 2, 2015, 4:21 a.m. UTC | #2
On 2015/6/2 7:01, Jaegeuk Kim wrote:
> On Mon, Jun 01, 2015 at 07:55:08PM +0800, He YunLei wrote:
>> Hi Jaegeuk,
>>
>> We run ltp testcase with f2fs and obtain a TFAIL in diotest4, the result in detail is
>> as fallow:
>>
>> dio04
>>
>> <<<test_start>>>
>> tag=dio04 stime=1432278894
>> cmdline="diotest4"
>> contacts=""
>> analysis=exit
>> <<<test_output>>>
>> diotest4    1  TPASS  :  Negative Offset
>> diotest4    2  TPASS  :  removed
>> diotest4    3  TFAIL  :  diotest4.c:129: write allows odd count.returns 1: Success
>> diotest4    4  TFAIL  :  diotest4.c:183: Odd count of read and write
>> diotest4    5  TPASS  :  Read beyond the file size
>> ......
>>
>> the result of ext4 with same environment:
>>
>> dio04
>>
>> <<<test_start>>>
>> tag=dio04 stime=1432259643
>> cmdline="diotest4"
>> contacts=""
>> analysis=exit
>> <<<test_output>>>
>> diotest4    1  TPASS  :  Negative Offset
>> diotest4    2  TPASS  :  removed
>> diotest4    3  TPASS  :  Odd count of read and write
>> diotest4    4  TPASS  :  Read beyond the file size
>> ......
>>
>> Does f2fs allow dwrite with non-aligned offset and size? I check the code and found
>> dwrite with non-aligned offset and size will turn into buffered write. Whether it will
>> have some impact on user layer applications?
>
> It's not a big deal to return -EINVAL.
> When I take a look at other filesystem behaviors, it seems there is no restriction.
>

Ext4 do a check in the function do_blockdev_direct_IO:

          if (align & blocksize_mask) {
              if (bdev)
                  blkbits = blksize_bits(bdev_logical_block_size(bdev));
              blocksize_mask = (1 << blkbits) - 1;
              if (align & blocksize_mask)
                  goto out;
          }

It will return -EINVAL if the alignment is not satisfied.

In f2fs, it do the check by check_direct_IO() before blockdev_direct_IO().
The difference between the two methods is whether turn dwrite with non-aligned
offset and size into buffered write. I am not very clear which one is better!

Thanks,
He

>>
>> I wrote a patch, not well tested, how do you think of it?
>
> Returning the error number would be good to me.
> Could you write and sumbit a complete one?
>
> Thanks,
>
>>
>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>> index 9bedfa8..ba5d94c 100644
>> --- a/fs/f2fs/data.c
>> +++ b/fs/f2fs/data.c
>> @@ -2010,8 +2010,9 @@ static ssize_t f2fs_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
>>          if (f2fs_encrypted_inode(inode) && S_ISREG(inode->i_mode))
>>                  return 0;
>>
>> -       if (check_direct_IO(inode, iter, offset))
>> -               return 0;
>> +       err = check_direct_IO(inode, iter, offset)
>> +       if (err)
>> +               return -EINVAL;
>>
>>          trace_f2fs_direct_IO_enter(inode, offset, count, iov_iter_rw(iter));
>>
>> I wish you and other developers in this list could help me in a correct way.
>>
>> Thanks,
>> He
>
> .
>

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
?? July 3, 2015, 8:02 a.m. UTC | #3
Hi Yunlei,

Sorry for the long delay.

> -----Original Message-----
> From: He YunLei [mailto:heyunlei@huawei.com]
> Sent: Tuesday, June 02, 2015 12:21 PM
> To: Jaegeuk Kim
> Cc: linux-fsdevel@vger.kernel.org; linux-f2fs-devel@lists.sourceforge.net
> Subject: Re: [f2fs-dev] Dwrite with non-aligned offset and size
> 
> On 2015/6/2 7:01, Jaegeuk Kim wrote:
> > On Mon, Jun 01, 2015 at 07:55:08PM +0800, He YunLei wrote:
> >> Hi Jaegeuk,
> >>
> >> We run ltp testcase with f2fs and obtain a TFAIL in diotest4, the result in detail is
> >> as fallow:
> >>
> >> dio04
> >>
> >> <<<test_start>>>
> >> tag=dio04 stime=1432278894
> >> cmdline="diotest4"
> >> contacts=""
> >> analysis=exit
> >> <<<test_output>>>
> >> diotest4    1  TPASS  :  Negative Offset
> >> diotest4    2  TPASS  :  removed
> >> diotest4    3  TFAIL  :  diotest4.c:129: write allows odd count.returns 1: Success
> >> diotest4    4  TFAIL  :  diotest4.c:183: Odd count of read and write
> >> diotest4    5  TPASS  :  Read beyond the file size
> >> ......
> >>
> >> the result of ext4 with same environment:
> >>
> >> dio04
> >>
> >> <<<test_start>>>
> >> tag=dio04 stime=1432259643
> >> cmdline="diotest4"
> >> contacts=""
> >> analysis=exit
> >> <<<test_output>>>
> >> diotest4    1  TPASS  :  Negative Offset
> >> diotest4    2  TPASS  :  removed
> >> diotest4    3  TPASS  :  Odd count of read and write
> >> diotest4    4  TPASS  :  Read beyond the file size
> >> ......
> >>
> >> Does f2fs allow dwrite with non-aligned offset and size? I check the code and found
> >> dwrite with non-aligned offset and size will turn into buffered write. Whether it will
> >> have some impact on user layer applications?
> >
> > It's not a big deal to return -EINVAL.
> > When I take a look at other filesystem behaviors, it seems there is no restriction.
> >
> 
> Ext4 do a check in the function do_blockdev_direct_IO:
> 
>           if (align & blocksize_mask) {
>               if (bdev)
>                   blkbits = blksize_bits(bdev_logical_block_size(bdev));
>               blocksize_mask = (1 << blkbits) - 1;
>               if (align & blocksize_mask)
>                   goto out;
>           }
> 
> It will return -EINVAL if the alignment is not satisfied.

I think we can get hint from the error case description in write(2) manual:

"EINVAL  fd is attached to an object which is unsuitable for writing; or the file
was  opened with the O_DIRECT flag, and either the address specified in buf, the
value specified in count, or the current file offset is not suitably aligned."

So if the alignment is not satisfied, we should return '-EINVAL' instead of
letting user fall back to buffered write.

Do you have time to make and send us a patch for fixing this issue?

Thanks,

> 
> In f2fs, it do the check by check_direct_IO() before blockdev_direct_IO().
> The difference between the two methods is whether turn dwrite with non-aligned
> offset and size into buffered write. I am not very clear which one is better!
> 
> Thanks,
> He
> 
> >>
> >> I wrote a patch, not well tested, how do you think of it?
> >
> > Returning the error number would be good to me.
> > Could you write and sumbit a complete one?
> >
> > Thanks,
> >
> >>
> >> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> >> index 9bedfa8..ba5d94c 100644
> >> --- a/fs/f2fs/data.c
> >> +++ b/fs/f2fs/data.c
> >> @@ -2010,8 +2010,9 @@ static ssize_t f2fs_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
> >>          if (f2fs_encrypted_inode(inode) && S_ISREG(inode->i_mode))
> >>                  return 0;
> >>
> >> -       if (check_direct_IO(inode, iter, offset))
> >> -               return 0;
> >> +       err = check_direct_IO(inode, iter, offset)
> >> +       if (err)
> >> +               return -EINVAL;
> >>
> >>          trace_f2fs_direct_IO_enter(inode, offset, count, iov_iter_rw(iter));
> >>
> >> I wish you and other developers in this list could help me in a correct way.
> >>
> >> Thanks,
> >> He
> >
> > .
> >
> 
> 
> ------------------------------------------------------------------------------
> _______________________________________________
> Linux-f2fs-devel mailing list
> Linux-f2fs-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
He YunLei July 16, 2015, 2:57 a.m. UTC | #4
On 2015/7/3 16:02, Chao Yu wrote:
> Hi Yunlei,
>
> Sorry for the long delay.
>
>> -----Original Message-----
>> From: He YunLei [mailto:heyunlei@huawei.com]
>> Sent: Tuesday, June 02, 2015 12:21 PM
>> To: Jaegeuk Kim
>> Cc: linux-fsdevel@vger.kernel.org; linux-f2fs-devel@lists.sourceforge.net
>> Subject: Re: [f2fs-dev] Dwrite with non-aligned offset and size
>>
>> On 2015/6/2 7:01, Jaegeuk Kim wrote:
>>> On Mon, Jun 01, 2015 at 07:55:08PM +0800, He YunLei wrote:
>>>> Hi Jaegeuk,
>>>>
>>>> We run ltp testcase with f2fs and obtain a TFAIL in diotest4, the result in detail is
>>>> as fallow:
>>>>
>>>> dio04
>>>>
>>>> <<<test_start>>>
>>>> tag=dio04 stime=1432278894
>>>> cmdline="diotest4"
>>>> contacts=""
>>>> analysis=exit
>>>> <<<test_output>>>
>>>> diotest4    1  TPASS  :  Negative Offset
>>>> diotest4    2  TPASS  :  removed
>>>> diotest4    3  TFAIL  :  diotest4.c:129: write allows odd count.returns 1: Success
>>>> diotest4    4  TFAIL  :  diotest4.c:183: Odd count of read and write
>>>> diotest4    5  TPASS  :  Read beyond the file size
>>>> ......
>>>>
>>>> the result of ext4 with same environment:
>>>>
>>>> dio04
>>>>
>>>> <<<test_start>>>
>>>> tag=dio04 stime=1432259643
>>>> cmdline="diotest4"
>>>> contacts=""
>>>> analysis=exit
>>>> <<<test_output>>>
>>>> diotest4    1  TPASS  :  Negative Offset
>>>> diotest4    2  TPASS  :  removed
>>>> diotest4    3  TPASS  :  Odd count of read and write
>>>> diotest4    4  TPASS  :  Read beyond the file size
>>>> ......
>>>>
>>>> Does f2fs allow dwrite with non-aligned offset and size? I check the code and found
>>>> dwrite with non-aligned offset and size will turn into buffered write. Whether it will
>>>> have some impact on user layer applications?
>>>
>>> It's not a big deal to return -EINVAL.
>>> When I take a look at other filesystem behaviors, it seems there is no restriction.
>>>
>>
>> Ext4 do a check in the function do_blockdev_direct_IO:
>>
>>            if (align & blocksize_mask) {
>>                if (bdev)
>>                    blkbits = blksize_bits(bdev_logical_block_size(bdev));
>>                blocksize_mask = (1 << blkbits) - 1;
>>                if (align & blocksize_mask)
>>                    goto out;
>>            }
>>
>> It will return -EINVAL if the alignment is not satisfied.
>
> I think we can get hint from the error case description in write(2) manual:
>
> "EINVAL  fd is attached to an object which is unsuitable for writing; or the file
> was  opened with the O_DIRECT flag, and either the address specified in buf, the
> value specified in count, or the current file offset is not suitably aligned."
>
> So if the alignment is not satisfied, we should return '-EINVAL' instead of
> letting user fall back to buffered write.
>
> Do you have time to make and send us a patch for fixing this issue?
>
> Thanks,
>
In man page of open(2)
	
The O_DIRECT flag may impose alignment restrictions on the length and
address of user-space buffers and the file offset of I/Os.  In Linux
alignment restrictions vary by filesystem and kernel version and
might be absent entirely.  However there is currently no
filesystem-independent interface for an application to discover these
restrictions for a given file or filesystem.

So now I don't know which one is better, falling back to buffered write or not?
>>
>> In f2fs, it do the check by check_direct_IO() before blockdev_direct_IO().
>> The difference between the two methods is whether turn dwrite with non-aligned
>> offset and size into buffered write. I am not very clear which one is better!
>>
>> Thanks,
>> He
>>
>>>>
>>>> I wrote a patch, not well tested, how do you think of it?
>>>
>>> Returning the error number would be good to me.
>>> Could you write and sumbit a complete one?
>>>
>>> Thanks,
>>>
>>>>
>>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>>> index 9bedfa8..ba5d94c 100644
>>>> --- a/fs/f2fs/data.c
>>>> +++ b/fs/f2fs/data.c
>>>> @@ -2010,8 +2010,9 @@ static ssize_t f2fs_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
>>>>           if (f2fs_encrypted_inode(inode) && S_ISREG(inode->i_mode))
>>>>                   return 0;
>>>>
>>>> -       if (check_direct_IO(inode, iter, offset))
>>>> -               return 0;
>>>> +       err = check_direct_IO(inode, iter, offset)
>>>> +       if (err)
>>>> +               return -EINVAL;
>>>>
>>>>           trace_f2fs_direct_IO_enter(inode, offset, count, iov_iter_rw(iter));
>>>>
>>>> I wish you and other developers in this list could help me in a correct way.
>>>>
>>>> Thanks,
>>>> He
>>>
>>> .
>>>
>>
>>
>> ------------------------------------------------------------------------------
>> _______________________________________________
>> Linux-f2fs-devel mailing list
>> Linux-f2fs-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>
>
> .
>

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
?? July 16, 2015, 10:13 a.m. UTC | #5
> -----Original Message-----
> From: He YunLei [mailto:heyunlei@huawei.com]
> Sent: Thursday, July 16, 2015 10:58 AM
> To: Chao Yu
> Cc: 'Jaegeuk Kim'; linux-fsdevel@vger.kernel.org; linux-f2fs-devel@lists.sourceforge.net
> Subject: Re: [f2fs-dev] Dwrite with non-aligned offset and size
> 
> On 2015/7/3 16:02, Chao Yu wrote:
> > Hi Yunlei,
> >
> > Sorry for the long delay.
> >
> >> -----Original Message-----
> >> From: He YunLei [mailto:heyunlei@huawei.com]
> >> Sent: Tuesday, June 02, 2015 12:21 PM
> >> To: Jaegeuk Kim
> >> Cc: linux-fsdevel@vger.kernel.org; linux-f2fs-devel@lists.sourceforge.net
> >> Subject: Re: [f2fs-dev] Dwrite with non-aligned offset and size
> >>
> >> On 2015/6/2 7:01, Jaegeuk Kim wrote:
> >>> On Mon, Jun 01, 2015 at 07:55:08PM +0800, He YunLei wrote:
> >>>> Hi Jaegeuk,
> >>>>
> >>>> We run ltp testcase with f2fs and obtain a TFAIL in diotest4, the result in detail is
> >>>> as fallow:
> >>>>
> >>>> dio04
> >>>>
> >>>> <<<test_start>>>
> >>>> tag=dio04 stime=1432278894
> >>>> cmdline="diotest4"
> >>>> contacts=""
> >>>> analysis=exit
> >>>> <<<test_output>>>
> >>>> diotest4    1  TPASS  :  Negative Offset
> >>>> diotest4    2  TPASS  :  removed
> >>>> diotest4    3  TFAIL  :  diotest4.c:129: write allows odd count.returns 1: Success
> >>>> diotest4    4  TFAIL  :  diotest4.c:183: Odd count of read and write
> >>>> diotest4    5  TPASS  :  Read beyond the file size
> >>>> ......
> >>>>
> >>>> the result of ext4 with same environment:
> >>>>
> >>>> dio04
> >>>>
> >>>> <<<test_start>>>
> >>>> tag=dio04 stime=1432259643
> >>>> cmdline="diotest4"
> >>>> contacts=""
> >>>> analysis=exit
> >>>> <<<test_output>>>
> >>>> diotest4    1  TPASS  :  Negative Offset
> >>>> diotest4    2  TPASS  :  removed
> >>>> diotest4    3  TPASS  :  Odd count of read and write
> >>>> diotest4    4  TPASS  :  Read beyond the file size
> >>>> ......
> >>>>
> >>>> Does f2fs allow dwrite with non-aligned offset and size? I check the code and found
> >>>> dwrite with non-aligned offset and size will turn into buffered write. Whether it will
> >>>> have some impact on user layer applications?
> >>>
> >>> It's not a big deal to return -EINVAL.
> >>> When I take a look at other filesystem behaviors, it seems there is no restriction.
> >>>
> >>
> >> Ext4 do a check in the function do_blockdev_direct_IO:
> >>
> >>            if (align & blocksize_mask) {
> >>                if (bdev)
> >>                    blkbits = blksize_bits(bdev_logical_block_size(bdev));
> >>                blocksize_mask = (1 << blkbits) - 1;
> >>                if (align & blocksize_mask)
> >>                    goto out;
> >>            }
> >>
> >> It will return -EINVAL if the alignment is not satisfied.
> >
> > I think we can get hint from the error case description in write(2) manual:
> >
> > "EINVAL  fd is attached to an object which is unsuitable for writing; or the file
> > was  opened with the O_DIRECT flag, and either the address specified in buf, the
> > value specified in count, or the current file offset is not suitably aligned."
> >
> > So if the alignment is not satisfied, we should return '-EINVAL' instead of
> > letting user fall back to buffered write.
> >
> > Do you have time to make and send us a patch for fixing this issue?
> >
> > Thanks,
> >
> In man page of open(2)
> 
> The O_DIRECT flag may impose alignment restrictions on the length and
> address of user-space buffers and the file offset of I/Os.  In Linux
> alignment restrictions vary by filesystem and kernel version and
> might be absent entirely.  However there is currently no
> filesystem-independent interface for an application to discover these
> restrictions for a given file or filesystem.

Right, there is no such interface here, and I don't think we need this,
since developers already have the manual for guiding them to use the
interfaces of filesystem correctly. Manuals should describe the restrictions
of interfaces which is exported by VFS, if some filesystems did not follow
theses restricting rules, we should add related description in manual.

> 
> So now I don't know which one is better, falling back to buffered write or not?

IMO, DIO is used for special purpose, if the target if not be achieved because of
wrong parameters, why not feedback to our user as write(2) manual descripted?

Any special reason to fallback to buffered write in f2fs?

Thanks,

> >>
> >> In f2fs, it do the check by check_direct_IO() before blockdev_direct_IO().
> >> The difference between the two methods is whether turn dwrite with non-aligned
> >> offset and size into buffered write. I am not very clear which one is better!
> >>
> >> Thanks,
> >> He
> >>
> >>>>
> >>>> I wrote a patch, not well tested, how do you think of it?
> >>>
> >>> Returning the error number would be good to me.
> >>> Could you write and sumbit a complete one?
> >>>
> >>> Thanks,
> >>>
> >>>>
> >>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> >>>> index 9bedfa8..ba5d94c 100644
> >>>> --- a/fs/f2fs/data.c
> >>>> +++ b/fs/f2fs/data.c
> >>>> @@ -2010,8 +2010,9 @@ static ssize_t f2fs_direct_IO(struct kiocb *iocb, struct iov_iter
> *iter,
> >>>>           if (f2fs_encrypted_inode(inode) && S_ISREG(inode->i_mode))
> >>>>                   return 0;
> >>>>
> >>>> -       if (check_direct_IO(inode, iter, offset))
> >>>> -               return 0;
> >>>> +       err = check_direct_IO(inode, iter, offset)
> >>>> +       if (err)
> >>>> +               return -EINVAL;
> >>>>
> >>>>           trace_f2fs_direct_IO_enter(inode, offset, count, iov_iter_rw(iter));
> >>>>
> >>>> I wish you and other developers in this list could help me in a correct way.
> >>>>
> >>>> Thanks,
> >>>> He
> >>>
> >>> .
> >>>
> >>
> >>
> >> ------------------------------------------------------------------------------
> >> _______________________________________________
> >> Linux-f2fs-devel mailing list
> >> Linux-f2fs-devel@lists.sourceforge.net
> >> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> >
> >
> > .
> >

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jaegeuk Kim July 17, 2015, 2:16 a.m. UTC | #6
Hi,

On Thu, Jul 16, 2015 at 06:13:40PM +0800, Chao Yu wrote:
> > -----Original Message-----
> > From: He YunLei [mailto:heyunlei@huawei.com]
> > Sent: Thursday, July 16, 2015 10:58 AM
> > To: Chao Yu
> > Cc: 'Jaegeuk Kim'; linux-fsdevel@vger.kernel.org; linux-f2fs-devel@lists.sourceforge.net
> > Subject: Re: [f2fs-dev] Dwrite with non-aligned offset and size
> > 
> > On 2015/7/3 16:02, Chao Yu wrote:
> > > Hi Yunlei,
> > >
> > > Sorry for the long delay.
> > >
> > >> -----Original Message-----
> > >> From: He YunLei [mailto:heyunlei@huawei.com]
> > >> Sent: Tuesday, June 02, 2015 12:21 PM
> > >> To: Jaegeuk Kim
> > >> Cc: linux-fsdevel@vger.kernel.org; linux-f2fs-devel@lists.sourceforge.net
> > >> Subject: Re: [f2fs-dev] Dwrite with non-aligned offset and size
> > >>
> > >> On 2015/6/2 7:01, Jaegeuk Kim wrote:
> > >>> On Mon, Jun 01, 2015 at 07:55:08PM +0800, He YunLei wrote:
> > >>>> Hi Jaegeuk,
> > >>>>
> > >>>> We run ltp testcase with f2fs and obtain a TFAIL in diotest4, the result in detail is
> > >>>> as fallow:
> > >>>>
> > >>>> dio04
> > >>>>
> > >>>> <<<test_start>>>
> > >>>> tag=dio04 stime=1432278894
> > >>>> cmdline="diotest4"
> > >>>> contacts=""
> > >>>> analysis=exit
> > >>>> <<<test_output>>>
> > >>>> diotest4    1  TPASS  :  Negative Offset
> > >>>> diotest4    2  TPASS  :  removed
> > >>>> diotest4    3  TFAIL  :  diotest4.c:129: write allows odd count.returns 1: Success
> > >>>> diotest4    4  TFAIL  :  diotest4.c:183: Odd count of read and write
> > >>>> diotest4    5  TPASS  :  Read beyond the file size
> > >>>> ......
> > >>>>
> > >>>> the result of ext4 with same environment:
> > >>>>
> > >>>> dio04
> > >>>>
> > >>>> <<<test_start>>>
> > >>>> tag=dio04 stime=1432259643
> > >>>> cmdline="diotest4"
> > >>>> contacts=""
> > >>>> analysis=exit
> > >>>> <<<test_output>>>
> > >>>> diotest4    1  TPASS  :  Negative Offset
> > >>>> diotest4    2  TPASS  :  removed
> > >>>> diotest4    3  TPASS  :  Odd count of read and write
> > >>>> diotest4    4  TPASS  :  Read beyond the file size
> > >>>> ......
> > >>>>
> > >>>> Does f2fs allow dwrite with non-aligned offset and size? I check the code and found
> > >>>> dwrite with non-aligned offset and size will turn into buffered write. Whether it will
> > >>>> have some impact on user layer applications?
> > >>>
> > >>> It's not a big deal to return -EINVAL.
> > >>> When I take a look at other filesystem behaviors, it seems there is no restriction.
> > >>>
> > >>
> > >> Ext4 do a check in the function do_blockdev_direct_IO:
> > >>
> > >>            if (align & blocksize_mask) {
> > >>                if (bdev)
> > >>                    blkbits = blksize_bits(bdev_logical_block_size(bdev));
> > >>                blocksize_mask = (1 << blkbits) - 1;
> > >>                if (align & blocksize_mask)
> > >>                    goto out;
> > >>            }
> > >>
> > >> It will return -EINVAL if the alignment is not satisfied.
> > >
> > > I think we can get hint from the error case description in write(2) manual:
> > >
> > > "EINVAL  fd is attached to an object which is unsuitable for writing; or the file
> > > was  opened with the O_DIRECT flag, and either the address specified in buf, the
> > > value specified in count, or the current file offset is not suitably aligned."
> > >
> > > So if the alignment is not satisfied, we should return '-EINVAL' instead of
> > > letting user fall back to buffered write.
> > >
> > > Do you have time to make and send us a patch for fixing this issue?
> > >
> > > Thanks,
> > >
> > In man page of open(2)
> > 
> > The O_DIRECT flag may impose alignment restrictions on the length and
> > address of user-space buffers and the file offset of I/Os.  In Linux
> > alignment restrictions vary by filesystem and kernel version and
> > might be absent entirely.  However there is currently no
> > filesystem-independent interface for an application to discover these
> > restrictions for a given file or filesystem.
> 
> Right, there is no such interface here, and I don't think we need this,
> since developers already have the manual for guiding them to use the
> interfaces of filesystem correctly. Manuals should describe the restrictions
> of interfaces which is exported by VFS, if some filesystems did not follow
> theses restricting rules, we should add related description in manual.
> 
> > 
> > So now I don't know which one is better, falling back to buffered write or not?
> 
> IMO, DIO is used for special purpose, if the target if not be achieved because of
> wrong parameters, why not feedback to our user as write(2) manual descripted?
> 
> Any special reason to fallback to buffered write in f2fs?

I think it'd be better to return -EINVAL, when unaligned offset was given, since
what users try to request dios means they want to avoid the filesystem layer
and get the maximum storage performance.
So, we need to alarm them to guide correct settings.

The only case that we need to fall back to buffered write would be some specific
conditions like inline_data and encryption.

Thanks,

> 
> Thanks,
> 
> > >>
> > >> In f2fs, it do the check by check_direct_IO() before blockdev_direct_IO().
> > >> The difference between the two methods is whether turn dwrite with non-aligned
> > >> offset and size into buffered write. I am not very clear which one is better!
> > >>
> > >> Thanks,
> > >> He
> > >>
> > >>>>
> > >>>> I wrote a patch, not well tested, how do you think of it?
> > >>>
> > >>> Returning the error number would be good to me.
> > >>> Could you write and sumbit a complete one?
> > >>>
> > >>> Thanks,
> > >>>
> > >>>>
> > >>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> > >>>> index 9bedfa8..ba5d94c 100644
> > >>>> --- a/fs/f2fs/data.c
> > >>>> +++ b/fs/f2fs/data.c
> > >>>> @@ -2010,8 +2010,9 @@ static ssize_t f2fs_direct_IO(struct kiocb *iocb, struct iov_iter
> > *iter,
> > >>>>           if (f2fs_encrypted_inode(inode) && S_ISREG(inode->i_mode))
> > >>>>                   return 0;
> > >>>>
> > >>>> -       if (check_direct_IO(inode, iter, offset))
> > >>>> -               return 0;
> > >>>> +       err = check_direct_IO(inode, iter, offset)
> > >>>> +       if (err)
> > >>>> +               return -EINVAL;
> > >>>>
> > >>>>           trace_f2fs_direct_IO_enter(inode, offset, count, iov_iter_rw(iter));
> > >>>>
> > >>>> I wish you and other developers in this list could help me in a correct way.
> > >>>>
> > >>>> Thanks,
> > >>>> He
> > >>>
> > >>> .
> > >>>
> > >>
> > >>
> > >> ------------------------------------------------------------------------------
> > >> _______________________________________________
> > >> Linux-f2fs-devel mailing list
> > >> Linux-f2fs-devel@lists.sourceforge.net
> > >> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> > >
> > >
> > > .
> > >
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 9bedfa8..ba5d94c 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2010,8 +2010,9 @@  static ssize_t f2fs_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
         if (f2fs_encrypted_inode(inode) && S_ISREG(inode->i_mode))
                 return 0;

-       if (check_direct_IO(inode, iter, offset))
-               return 0;
+       err = check_direct_IO(inode, iter, offset)
+       if (err)
+               return -EINVAL;

         trace_f2fs_direct_IO_enter(inode, offset, count, iov_iter_rw(iter));