[RFC,12/18] iomap: don't increase i_size if it's not a write operation

Message ID 20231123125121.4064694-13-yi.zhang@huaweicloud.com (mailing list archive)
State New
Series ext4: use iomap for regular file's buffered IO path and enable large folio

Commit Message

Zhang Yi Nov. 23, 2023, 12:51 p.m. UTC
From: Zhang Yi <yi.zhang@huawei.com>

Increasing i_size in iomap_zero_range() does not look necessary; the
caller should handle it. In particular, when truncating a partial block,
we should not increase i_size beyond the new EOF here. This doesn't
affect xfs and gfs2 now because they set the new file size after zeroing
out, so a brief increase in i_size doesn't matter. But it will affect
ext4 because it sets the file size before truncating, so avoid
increasing i_size if it's not a write path.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/iomap/buffered-io.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
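
To make the truncate case in the commit message concrete, here is a small
user-space model of the size update at the end of iomap_write_end(). It is
only a sketch: struct model_inode, model_write_end(), model_truncate() and
the MODEL_* flags are hypothetical stand-ins invented for illustration, not
kernel definitions, and the model assumes the filesystem zeroes from the new
EOF up to the end of that block after it has already set the new size.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for this sketch only; not the kernel definitions. */
#define MODEL_WRITE 0x1u	/* plays the role of IOMAP_WRITE */
#define MODEL_ZERO  0x2u	/* plays the role of a zeroing operation */

struct model_inode {
	long long i_size;
};

/*
 * Models only the i_size update done at the end of a buffered operation.
 * With gated == true the update happens only for write operations,
 * which mirrors the condition added by the patch.
 */
static void model_write_end(struct model_inode *inode, unsigned int flags,
			    long long pos, long long ret, bool gated)
{
	long long old_size = inode->i_size;

	if (gated && !(flags & MODEL_WRITE))
		return;
	if (pos + ret > old_size)
		inode->i_size = pos + ret;
}

/*
 * Assumed ordering: the new size is set first, then the tail of the
 * partial block [new_eof, block_end) is zeroed. Returns the resulting
 * i_size.
 */
static long long model_truncate(long long new_eof, long long block_size,
				bool gated)
{
	struct model_inode inode = { .i_size = new_eof };
	long long block_end =
		(new_eof + block_size - 1) / block_size * block_size;

	model_write_end(&inode, MODEL_ZERO, new_eof, block_end - new_eof,
			gated);
	return inode.i_size;
}
```

In this model, model_truncate(5000, 4096, false) returns 8192, so the size
ends up past the requested EOF, while model_truncate(5000, 4096, true)
returns 5000.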

Comments

Christoph Hellwig Nov. 23, 2023, 3:34 p.m. UTC | #1
On Thu, Nov 23, 2023 at 08:51:14PM +0800, Zhang Yi wrote:
> index fd4d43bafd1b..3b9ba390dd1b 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -852,13 +852,13 @@ static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
>  	 * cache.  It's up to the file system to write the updated size to disk,
>  	 * preferably after I/O completion so that no stale data is exposed.
>  	 */
> -	if (pos + ret > old_size) {
> +	if ((iter->flags & IOMAP_WRITE) && pos + ret > old_size) {
>  		i_size_write(iter->inode, pos + ret);
>  		iter->iomap.flags |= IOMAP_F_SIZE_CHANGED;
>  	}
>  	__iomap_put_folio(iter, pos, ret, folio);
>  
> -	if (old_size < pos)
> +	if ((iter->flags & IOMAP_WRITE) && old_size < pos)
>  		pagecache_isize_extended(iter->inode, old_size, pos);
>  	if (ret < len)
>  		iomap_write_failed(iter->inode, pos + ret, len - ret);

I agree with your rationale, but I hate how this code ends up
looking.  In many ways iomap_write_end seems like the wrong
place to update the inode size anyway.  I've not done a deep
analysis, but I think there shouldn't really be any major blocker
to only setting IOMAP_F_SIZE_CHANGED in iomap_write_end, and then
move updating i_size and calling pagecache_isize_extended to
iomap_write_iter.
Zhang Yi Nov. 24, 2023, 1:41 a.m. UTC | #2
On 2023/11/23 23:34, Christoph Hellwig wrote:
> On Thu, Nov 23, 2023 at 08:51:14PM +0800, Zhang Yi wrote:
>> index fd4d43bafd1b..3b9ba390dd1b 100644
>> --- a/fs/iomap/buffered-io.c
>> +++ b/fs/iomap/buffered-io.c
>> @@ -852,13 +852,13 @@ static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
>>  	 * cache.  It's up to the file system to write the updated size to disk,
>>  	 * preferably after I/O completion so that no stale data is exposed.
>>  	 */
>> -	if (pos + ret > old_size) {
>> +	if ((iter->flags & IOMAP_WRITE) && pos + ret > old_size) {
>>  		i_size_write(iter->inode, pos + ret);
>>  		iter->iomap.flags |= IOMAP_F_SIZE_CHANGED;
>>  	}
>>  	__iomap_put_folio(iter, pos, ret, folio);
>>  
>> -	if (old_size < pos)
>> +	if ((iter->flags & IOMAP_WRITE) && old_size < pos)
>>  		pagecache_isize_extended(iter->inode, old_size, pos);
>>  	if (ret < len)
>>  		iomap_write_failed(iter->inode, pos + ret, len - ret);
> 
> I agree with your rationale, but I hate how this code ends up
> looking.  In many ways iomap_write_end seems like the wrong
> place to update the inode size anyway.  I've not done a deep
> analysis, but I think there shouldn't really be any major blocker
> to only setting IOMAP_F_SIZE_CHANGED in iomap_write_end, and then
> move updating i_size and calling pagecache_isize_extended to
> iomap_write_iter.
> 

Yeah, makes sense. It looks fine at first glance; I will check
whether there are any side effects.

Thanks,
Yi.
Zhang Yi Nov. 30, 2023, 12:26 p.m. UTC | #3
On 2023/11/23 23:34, Christoph Hellwig wrote:
> On Thu, Nov 23, 2023 at 08:51:14PM +0800, Zhang Yi wrote:
>> index fd4d43bafd1b..3b9ba390dd1b 100644
>> --- a/fs/iomap/buffered-io.c
>> +++ b/fs/iomap/buffered-io.c
>> @@ -852,13 +852,13 @@ static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
>>  	 * cache.  It's up to the file system to write the updated size to disk,
>>  	 * preferably after I/O completion so that no stale data is exposed.
>>  	 */
>> -	if (pos + ret > old_size) {
>> +	if ((iter->flags & IOMAP_WRITE) && pos + ret > old_size) {
>>  		i_size_write(iter->inode, pos + ret);
>>  		iter->iomap.flags |= IOMAP_F_SIZE_CHANGED;
>>  	}
>>  	__iomap_put_folio(iter, pos, ret, folio);
>>  
>> -	if (old_size < pos)
>> +	if ((iter->flags & IOMAP_WRITE) && old_size < pos)
>>  		pagecache_isize_extended(iter->inode, old_size, pos);
>>  	if (ret < len)
>>  		iomap_write_failed(iter->inode, pos + ret, len - ret);
> 
> I agree with your rationale, but I hate how this code ends up
> looking.  In many ways iomap_write_end seems like the wrong
> place to update the inode size anyway.  I've not done a deep
> analysis, but I think there shouldn't really be any major blocker
> to only setting IOMAP_F_SIZE_CHANGED in iomap_write_end, and then
> move updating i_size and calling pagecache_isize_extended to
> iomap_write_iter.
> 

Thinking about it in depth, I think we cannot move updating i_size
to iomap_write_iter() because we have to do this under the folio lock.
Otherwise, once we unlock the folio, the writeback process could start
writing back and call folio_zero_segment() to zero out the valid
data beyond the not-yet-updated i_size. It would only work if we moved
__iomap_put_folio() out together with it, but I suppose that's not a
good way.

Thanks,
Yi.
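
The ordering problem described above can be illustrated with a
single-threaded user-space model. This is only a sketch under stated
assumptions: struct model_file, model_writeback() and
model_extending_write() are hypothetical names, the "folio" is a 16-byte
buffer starting at file offset 0, and writeback is reduced to zeroing
everything in the buffer past i_size. It does not reproduce real kernel
locking; the two branches just replay the two possible orderings.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_FOLIO_SIZE 16

/* Hypothetical model of a file with one cached folio at offset 0. */
struct model_file {
	long long i_size;
	unsigned char folio[MODEL_FOLIO_SIZE];
};

/* Models writeback zeroing the part of the folio that lies past i_size. */
static void model_writeback(struct model_file *f)
{
	if (f->i_size < MODEL_FOLIO_SIZE)
		memset(f->folio + f->i_size, 0,
		       (size_t)(MODEL_FOLIO_SIZE - f->i_size));
}

/*
 * Replays an extending write of 8 bytes at offset 4 into a 4-byte file.
 * Returns true if the newly written bytes are still intact afterwards.
 */
static bool model_extending_write(bool size_updated_under_lock)
{
	struct model_file f = { .i_size = 4 };
	const long long pos = 4, len = 8;

	memset(f.folio, 'a', 4);			/* existing data */
	memset(f.folio + pos, 'b', (size_t)len);	/* copy done under the lock */

	if (size_updated_under_lock) {
		f.i_size = pos + len;	/* size updated before unlock */
		model_writeback(&f);	/* writeback sees the new size */
	} else {
		model_writeback(&f);	/* runs after unlock, sees old size */
		f.i_size = pos + len;	/* size updated too late */
	}

	for (long long i = pos; i < pos + len; i++)
		if (f.folio[i] != 'b')
			return false;
	return true;
}
```

Here model_extending_write(true) returns true, and
model_extending_write(false) returns false because the new bytes were
zeroed before the size caught up.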

Patch

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index fd4d43bafd1b..3b9ba390dd1b 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -852,13 +852,13 @@  static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
 	 * cache.  It's up to the file system to write the updated size to disk,
 	 * preferably after I/O completion so that no stale data is exposed.
 	 */
-	if (pos + ret > old_size) {
+	if ((iter->flags & IOMAP_WRITE) && pos + ret > old_size) {
 		i_size_write(iter->inode, pos + ret);
 		iter->iomap.flags |= IOMAP_F_SIZE_CHANGED;
 	}
 	__iomap_put_folio(iter, pos, ret, folio);
 
-	if (old_size < pos)
+	if ((iter->flags & IOMAP_WRITE) && old_size < pos)
 		pagecache_isize_extended(iter->inode, old_size, pos);
 	if (ret < len)
 		iomap_write_failed(iter->inode, pos + ret, len - ret);