diff mbox series

[02/42] btrfs: introduce write_one_subpage_eb() function

Message ID 20210415050448.267306-3-wqu@suse.com (mailing list archive)
State New, archived
Headers show
Series btrfs: add full read-write support for subpage | expand

Commit Message

Qu Wenruo April 15, 2021, 5:04 a.m. UTC
The new function, write_one_subpage_eb(), as a subroutine for subpage
metadata write, will handle the extent buffer bio submission.

The major differences between the new write_one_subpage_eb() and
write_one_eb() is:
- No page locking
  When entering write_one_subpage_eb() the page is no longer locked.
  We only lock the page for its status update, and unlock immediately.
  Now we completely rely on extent io tree locking.

- Extra bitmap update along with page status update
  Now page dirty and writeback is controlled by
  btrfs_subpage::dirty_bitmap and btrfs_subpage::writeback_bitmap.
  They both follow the schema that any sector is dirty/writeback, then
  the full page get dirty/writeback.

- When to update the nr_written number
  Now we take a short cut, if we have cleared the last dirty bit of the
  page, we update nr_written.
  This is not completely perfect, but should emulate the old behavior
  good enough.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/extent_io.c | 55 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

Comments

Josef Bacik April 15, 2021, 7:03 p.m. UTC | #1
On 4/15/21 1:04 AM, Qu Wenruo wrote:
> The new function, write_one_subpage_eb(), as a subroutine for subpage
> metadata write, will handle the extent buffer bio submission.
> 
> The major differences between the new write_one_subpage_eb() and
> write_one_eb() is:
> - No page locking
>    When entering write_one_subpage_eb() the page is no longer locked.
>    We only lock the page for its status update, and unlock immediately.
>    Now we completely rely on extent io tree locking.
> 
> - Extra bitmap update along with page status update
>    Now page dirty and writeback is controlled by
>    btrfs_subpage::dirty_bitmap and btrfs_subpage::writeback_bitmap.
>    They both follow the schema that any sector is dirty/writeback, then
>    the full page get dirty/writeback.
> 
> - When to update the nr_written number
>    Now we take a short cut, if we have cleared the last dirty bit of the
>    page, we update nr_written.
>    This is not completely perfect, but should emulate the old behavior
>    good enough.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>   fs/btrfs/extent_io.c | 55 ++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 55 insertions(+)
> 
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 21a14b1cb065..f32163a465ec 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -4196,6 +4196,58 @@ static void end_bio_extent_buffer_writepage(struct bio *bio)
>   	bio_put(bio);
>   }
>   
> +/*
> + * Unlike the work in write_one_eb(), we rely completely on extent locking.
> + * Page locking is only utizlied at minimal to keep the VM code happy.
> + *
> + * Caller should still call write_one_eb() other than this function directly.
> + * As write_one_eb() has extra prepration before submitting the extent buffer.
> + */
> +static int write_one_subpage_eb(struct extent_buffer *eb,
> +				struct writeback_control *wbc,
> +				struct extent_page_data *epd)
> +{
> +	struct btrfs_fs_info *fs_info = eb->fs_info;
> +	struct page *page = eb->pages[0];
> +	unsigned int write_flags = wbc_to_write_flags(wbc) | REQ_META;
> +	bool no_dirty_ebs = false;
> +	int ret;
> +
> +	/* clear_page_dirty_for_io() in subpage helper need page locked. */
> +	lock_page(page);
> +	btrfs_subpage_set_writeback(fs_info, page, eb->start, eb->len);
> +
> +	/* If we're the last dirty bit to update nr_written */
> +	no_dirty_ebs = btrfs_subpage_clear_and_test_dirty(fs_info, page,
> +							  eb->start, eb->len);
> +	if (no_dirty_ebs)
> +		clear_page_dirty_for_io(page);
> +
> +	ret = submit_extent_page(REQ_OP_WRITE | write_flags, wbc, page,
> +			eb->start, eb->len, eb->start - page_offset(page),
> +			&epd->bio, end_bio_extent_buffer_writepage, 0, 0, 0,
> +			false);
> +	if (ret) {
> +		btrfs_subpage_clear_writeback(fs_info, page, eb->start,
> +					      eb->len);
> +		set_btree_ioerr(page, eb);
> +		unlock_page(page);
> +
> +		if (atomic_dec_and_test(&eb->io_pages))
> +			end_extent_buffer_writeback(eb);
> +		return -EIO;
> +	}
> +	unlock_page(page);
> +	/*
> +	 * Submission finishes without problem, if no range of the page is
> +	 * dirty anymore, we have submitted a page.
> +	 * Update the nr_written in wbc.
> +	 */
> +	if (no_dirty_ebs)
> +		update_nr_written(wbc, 1);
> +	return ret;
> +}
> +
>   static noinline_for_stack int write_one_eb(struct extent_buffer *eb,
>   			struct writeback_control *wbc,
>   			struct extent_page_data *epd)
> @@ -4227,6 +4279,9 @@ static noinline_for_stack int write_one_eb(struct extent_buffer *eb,
>   		memzero_extent_buffer(eb, start, end - start);
>   	}
>   
> +	if (eb->fs_info->sectorsize < PAGE_SIZE)
> +		return write_one_subpage_eb(eb, wbc, epd);
> +

Same comment here, again you're calling write_one_eb() which expects to do the 
eb thing, but then later have an entirely different path for the subpage stuff, 
and thus could just call your write_one_subpage_eb() helper from there instead 
of stuffing it into write_one_eb().

Also, I generally don't care about ordering of patches as long as they make 
sense generally.

However in this case if you were to bisect to just this patch you would be 
completely screwed, as the normal write path would just fail to write the other 
eb's on the page.  You really need to have the patches that do the 
write_cache_pages part done first, and then have this patch.

Or alternatively, leave the order as it is, and simply don't wire the helper up 
until you implement the subpage writepages further down.  That may be better, 
you won't have to re-order anything and you can maintain these smaller chunks 
for review, which may not be possible if you re-order them.  Thanks,

Josef
Qu Wenruo April 15, 2021, 11:25 p.m. UTC | #2
On 2021/4/16 上午3:03, Josef Bacik wrote:
> On 4/15/21 1:04 AM, Qu Wenruo wrote:
>> The new function, write_one_subpage_eb(), as a subroutine for subpage
>> metadata write, will handle the extent buffer bio submission.
>>
>> The major differences between the new write_one_subpage_eb() and
>> write_one_eb() is:
>> - No page locking
>>    When entering write_one_subpage_eb() the page is no longer locked.
>>    We only lock the page for its status update, and unlock immediately.
>>    Now we completely rely on extent io tree locking.
>>
>> - Extra bitmap update along with page status update
>>    Now page dirty and writeback is controlled by
>>    btrfs_subpage::dirty_bitmap and btrfs_subpage::writeback_bitmap.
>>    They both follow the schema that any sector is dirty/writeback, then
>>    the full page get dirty/writeback.
>>
>> - When to update the nr_written number
>>    Now we take a short cut, if we have cleared the last dirty bit of the
>>    page, we update nr_written.
>>    This is not completely perfect, but should emulate the old behavior
>>    good enough.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>   fs/btrfs/extent_io.c | 55 ++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 55 insertions(+)
>>
>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>> index 21a14b1cb065..f32163a465ec 100644
>> --- a/fs/btrfs/extent_io.c
>> +++ b/fs/btrfs/extent_io.c
>> @@ -4196,6 +4196,58 @@ static void
>> end_bio_extent_buffer_writepage(struct bio *bio)
>>       bio_put(bio);
>>   }
>> +/*
>> + * Unlike the work in write_one_eb(), we rely completely on extent
>> locking.
>> + * Page locking is only utizlied at minimal to keep the VM code happy.
>> + *
>> + * Caller should still call write_one_eb() other than this function
>> directly.
>> + * As write_one_eb() has extra prepration before submitting the
>> extent buffer.
>> + */
>> +static int write_one_subpage_eb(struct extent_buffer *eb,
>> +                struct writeback_control *wbc,
>> +                struct extent_page_data *epd)
>> +{
>> +    struct btrfs_fs_info *fs_info = eb->fs_info;
>> +    struct page *page = eb->pages[0];
>> +    unsigned int write_flags = wbc_to_write_flags(wbc) | REQ_META;
>> +    bool no_dirty_ebs = false;
>> +    int ret;
>> +
>> +    /* clear_page_dirty_for_io() in subpage helper need page locked. */
>> +    lock_page(page);
>> +    btrfs_subpage_set_writeback(fs_info, page, eb->start, eb->len);
>> +
>> +    /* If we're the last dirty bit to update nr_written */
>> +    no_dirty_ebs = btrfs_subpage_clear_and_test_dirty(fs_info, page,
>> +                              eb->start, eb->len);
>> +    if (no_dirty_ebs)
>> +        clear_page_dirty_for_io(page);
>> +
>> +    ret = submit_extent_page(REQ_OP_WRITE | write_flags, wbc, page,
>> +            eb->start, eb->len, eb->start - page_offset(page),
>> +            &epd->bio, end_bio_extent_buffer_writepage, 0, 0, 0,
>> +            false);
>> +    if (ret) {
>> +        btrfs_subpage_clear_writeback(fs_info, page, eb->start,
>> +                          eb->len);
>> +        set_btree_ioerr(page, eb);
>> +        unlock_page(page);
>> +
>> +        if (atomic_dec_and_test(&eb->io_pages))
>> +            end_extent_buffer_writeback(eb);
>> +        return -EIO;
>> +    }
>> +    unlock_page(page);
>> +    /*
>> +     * Submission finishes without problem, if no range of the page is
>> +     * dirty anymore, we have submitted a page.
>> +     * Update the nr_written in wbc.
>> +     */
>> +    if (no_dirty_ebs)
>> +        update_nr_written(wbc, 1);
>> +    return ret;
>> +}
>> +
>>   static noinline_for_stack int write_one_eb(struct extent_buffer *eb,
>>               struct writeback_control *wbc,
>>               struct extent_page_data *epd)
>> @@ -4227,6 +4279,9 @@ static noinline_for_stack int
>> write_one_eb(struct extent_buffer *eb,
>>           memzero_extent_buffer(eb, start, end - start);
>>       }
>> +    if (eb->fs_info->sectorsize < PAGE_SIZE)
>> +        return write_one_subpage_eb(eb, wbc, epd);
>> +
>
> Same comment here, again you're calling write_one_eb() which expects to
> do the eb thing, but then later have an entirely different path for the
> subpage stuff, and thus could just call your write_one_subpage_eb()
> helper from there instead of stuffing it into write_one_eb().

But there are some common code before calling the subpage routine.

I don't think it's a good idea to have duplicated code between subpage
and regular routine.

>
> Also, I generally don't care about ordering of patches as long as they
> make sense generally.
>
> However in this case if you were to bisect to just this patch you would
> be completely screwed, as the normal write path would just fail to write
> the other eb's on the page.  You really need to have the patches that do
> the write_cache_pages part done first, and then have this patch.

No way one can bisect to this patch.
Without the last patch to enable subpage write, bisect will never point
to this one.

And how could it be possible to implement data write before metadata?
Without metadata write ability, data write won't even be possible.

But without data write ability, metadata write can still be possible,
just doing basic touch/inode creation or even inline extent creation.

So I'm afraid metadata write patches must be before data write patches.

Thanks,
Qu

>
> Or alternatively, leave the order as it is, and simply don't wire the
> helper up until you implement the subpage writepages further down.  That
> may be better, you won't have to re-order anything and you can maintain
> these smaller chunks for review, which may not be possible if you
> re-order them.  Thanks,
>
> Josef
Josef Bacik April 16, 2021, 1:26 p.m. UTC | #3
On 4/15/21 7:25 PM, Qu Wenruo wrote:
> 
> 
> On 2021/4/16 上午3:03, Josef Bacik wrote:
>> On 4/15/21 1:04 AM, Qu Wenruo wrote:
>>> The new function, write_one_subpage_eb(), as a subroutine for subpage
>>> metadata write, will handle the extent buffer bio submission.
>>>
>>> The major differences between the new write_one_subpage_eb() and
>>> write_one_eb() is:
>>> - No page locking
>>>    When entering write_one_subpage_eb() the page is no longerlocked.
>>>    We only lock the page for its status update, and unlock immediately.
>>>    Now we completely rely on extent io tree locking.
>>>
>>> - Extra bitmap update along with page status update
>>>    Now page dirty and writeback is controlled by
>>>    btrfs_subpage::dirty_bitmap and btrfs_subpage::writeback_bitmap.
>>>    They both follow the schema that any sector is dirty/writeback, then
>>>    the full page get dirty/writeback.
>>>
>>> - When to update the nr_written number
>>>    Now we take a short cut, if we have cleared the last dirtybit of the
>>>    page, we update nr_written.
>>>    This is not completely perfect, but should emulate the oldbehavior
>>>    good enough.
>>>
>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>> ---
>>>   fs/btrfs/extent_io.c | 55 ++++++++++++++++++++++++++++++++++++++++++++
>>>   1 file changed, 55 insertions(+)
>>>
>>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>>> index 21a14b1cb065..f32163a465ec 100644
>>> --- a/fs/btrfs/extent_io.c
>>> +++ b/fs/btrfs/extent_io.c
>>> @@ -4196,6 +4196,58 @@ static void
>>> end_bio_extent_buffer_writepage(struct bio *bio)
>>>       bio_put(bio);
>>>   }
>>> +/*
>>> + * Unlike the work in write_one_eb(), we rely completely on extent
>>> locking.
>>> + * Page locking is only utizlied at minimal to keep the VM code happy.
>>> + *
>>> + * Caller should still call write_one_eb() other than this function
>>> directly.
>>> + * As write_one_eb() has extra prepration before submitting the
>>> extent buffer.
>>> + */
>>> +static int write_one_subpage_eb(struct extent_buffer *eb,
>>> +                struct writeback_control *wbc,
>>> +                struct extent_page_data *epd)
>>> +{
>>> +    struct btrfs_fs_info *fs_info = eb->fs_info;
>>> +    struct page *page = eb->pages[0];
>>> +    unsigned int write_flags = wbc_to_write_flags(wbc) | REQ_META;
>>> +    bool no_dirty_ebs = false;
>>> +    int ret;
>>> +
>>> +    /* clear_page_dirty_for_io() in subpage helper needpage locked. */
>>> +    lock_page(page);
>>> +    btrfs_subpage_set_writeback(fs_info, page, eb->start, eb->len);
>>> +
>>> +    /* If we're the last dirty bit to update nr_written*/
>>> +    no_dirty_ebs = btrfs_subpage_clear_and_test_dirty(fs_info, page,
>>> +                              eb->start, eb->len);
>>> +    if (no_dirty_ebs)
>>> +        clear_page_dirty_for_io(page);
>>> +
>>> +    ret = submit_extent_page(REQ_OP_WRITE | write_flags, wbc, page,
>>> +            eb->start, eb->len, eb->start - page_offset(page),
>>> +            &epd->bio, end_bio_extent_buffer_writepage, 0, 0, 0,
>>> +            false);
>>> +    if (ret) {
>>> +        btrfs_subpage_clear_writeback(fs_info, page, eb->start,
>>> +                          eb->len);
>>> +        set_btree_ioerr(page, eb);
>>> +        unlock_page(page);
>>> +
>>> +        if (atomic_dec_and_test(&eb->io_pages))
>>> +            end_extent_buffer_writeback(eb);
>>> +        return -EIO;
>>> +    }
>>> +    unlock_page(page);
>>> +    /*
>>> +     * Submission finishes without problem, if no range of the page is
>>> +     * dirty anymore, we have submitted a page.
>>> +     * Update the nr_written in wbc.
>>> +     */
>>> +    if (no_dirty_ebs)
>>> +        update_nr_written(wbc, 1);
>>> +    return ret;
>>> +}
>>> +
>>>   static noinline_for_stack int write_one_eb(struct extent_buffer *eb,
>>>               struct writeback_control *wbc,
>>>               struct extent_page_data *epd)
>>> @@ -4227,6 +4279,9 @@ static noinline_for_stack int
>>> write_one_eb(struct extent_buffer *eb,
>>>           memzero_extent_buffer(eb, start, end - start);
>>>       }
>>> +    if (eb->fs_info->sectorsize < PAGE_SIZE)
>>> +        return write_one_subpage_eb(eb, wbc, epd);
>>> +
>>
>> Same comment here, again you're calling write_one_eb() which expects to
>> do the eb thing, but then later have an entirely different path for the
>> subpage stuff, and thus could just call your write_one_subpage_eb()
>> helper from there instead of stuffing it into write_one_eb().
> 
> But there are some common code before calling the subpage routine.
> 
> I don't think it's a good idea to have duplicated code between subpage
> and regular routine.
> 

Ah I missed the part at the top for zero'ing out the buffer.  In that case turn 
that into a helper function and then keep the paths separate.  Thanks,

Josef
Thiago Jung Bauermann April 18, 2021, 7:45 p.m. UTC | #4
Em quinta-feira, 15 de abril de 2021, às 20:25:32 -03, Qu Wenruo escreveu:
> On 2021/4/16 上午3:03, Josef Bacik wrote:
> > Also, I generally don't care about ordering of patches as long as they
> > make sense generally.
> > 
> > However in this case if you were to bisect to just this patch you would
> > be completely screwed, as the normal write path would just fail to write
> > the other eb's on the page.  You really need to have the patches that do
> > the write_cache_pages part done first, and then have this patch.
> 
> No way one can bisect to this patch.
> Without the last patch to enable subpage write, bisect will never point
> to this one.

Maybe I don't fully understand how bisect selects the next commit to test,
but isn't it possible to randomly land at this patch while bisecting an
issue that isn't even related to btrfs?

Thanks and regards,
Thiago Jung Bauermann
diff mbox series

Patch

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 21a14b1cb065..f32163a465ec 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4196,6 +4196,58 @@  static void end_bio_extent_buffer_writepage(struct bio *bio)
 	bio_put(bio);
 }
 
+/*
+ * Unlike the work in write_one_eb(), we rely completely on extent locking.
+ * Page locking is only utizlied at minimal to keep the VM code happy.
+ *
+ * Caller should still call write_one_eb() other than this function directly.
+ * As write_one_eb() has extra prepration before submitting the extent buffer.
+ */
+static int write_one_subpage_eb(struct extent_buffer *eb,
+				struct writeback_control *wbc,
+				struct extent_page_data *epd)
+{
+	struct btrfs_fs_info *fs_info = eb->fs_info;
+	struct page *page = eb->pages[0];
+	unsigned int write_flags = wbc_to_write_flags(wbc) | REQ_META;
+	bool no_dirty_ebs = false;
+	int ret;
+
+	/* clear_page_dirty_for_io() in subpage helper need page locked. */
+	lock_page(page);
+	btrfs_subpage_set_writeback(fs_info, page, eb->start, eb->len);
+
+	/* If we're the last dirty bit to update nr_written */
+	no_dirty_ebs = btrfs_subpage_clear_and_test_dirty(fs_info, page,
+							  eb->start, eb->len);
+	if (no_dirty_ebs)
+		clear_page_dirty_for_io(page);
+
+	ret = submit_extent_page(REQ_OP_WRITE | write_flags, wbc, page,
+			eb->start, eb->len, eb->start - page_offset(page),
+			&epd->bio, end_bio_extent_buffer_writepage, 0, 0, 0,
+			false);
+	if (ret) {
+		btrfs_subpage_clear_writeback(fs_info, page, eb->start,
+					      eb->len);
+		set_btree_ioerr(page, eb);
+		unlock_page(page);
+
+		if (atomic_dec_and_test(&eb->io_pages))
+			end_extent_buffer_writeback(eb);
+		return -EIO;
+	}
+	unlock_page(page);
+	/*
+	 * Submission finishes without problem, if no range of the page is
+	 * dirty anymore, we have submitted a page.
+	 * Update the nr_written in wbc.
+	 */
+	if (no_dirty_ebs)
+		update_nr_written(wbc, 1);
+	return ret;
+}
+
 static noinline_for_stack int write_one_eb(struct extent_buffer *eb,
 			struct writeback_control *wbc,
 			struct extent_page_data *epd)
@@ -4227,6 +4279,9 @@  static noinline_for_stack int write_one_eb(struct extent_buffer *eb,
 		memzero_extent_buffer(eb, start, end - start);
 	}
 
+	if (eb->fs_info->sectorsize < PAGE_SIZE)
+		return write_one_subpage_eb(eb, wbc, epd);
+
 	for (i = 0; i < num_pages; i++) {
 		struct page *p = eb->pages[i];