[2/9] block-copy: add missing coroutine_fn annotations

Message ID	20221103134206.4041928-3-eesposit@redhat.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org> From: Emanuele Giuseppe Esposito <eesposit@redhat.com> To: qemu-block@nongnu.org Cc: Kevin Wolf <kwolf@redhat.com>, Hanna Reitz <hreitz@redhat.com>, John Snow <jsnow@redhat.com>, Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>, Eric Blake <eblake@redhat.com>, Fam Zheng <fam@euphon.net>, qemu-devel@nongnu.org, Emanuele Giuseppe Esposito <eesposit@redhat.com> Subject: [PATCH 2/9] block-copy: add missing coroutine_fn annotations Date: Thu, 3 Nov 2022 09:41:59 -0400 Message-Id: <20221103134206.4041928-3-eesposit@redhat.com> In-Reply-To: <20221103134206.4041928-1-eesposit@redhat.com> References: <20221103134206.4041928-1-eesposit@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=170.10.129.124; envelope-from=eesposit@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -30 X-Spam_score: -3.1 X-Spam_bar: --- X-Spam_report: (-3.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.047, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action Precedence: list Sender: "Qemu-devel" <qemu-devel-bounces@nongnu.org> Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Series	Still more coroutine and various fixes in block layer \| expand [0/9] Still more coroutine and various fixes in block layer [1/9] block: call bdrv_co_drain_begin in a coroutine [2/9] block-copy: add missing coroutine_fn annotations [3/9] nbd/server.c: add missing coroutine_fn annotations [4/9] block-backend: replace bdrv__above with blk__above [5/9] block: distinguish between bdrv_create running in coroutine and not [6/9] block/vmdk: add missing coroutine_fn annotations [7/9] block: bdrv_create_file is a coroutine_fn [8/9] block: bdrv_create is never called in non-coroutine context [9/9] block/dirty-bitmap: remove unnecessary qemu_in_coroutine() case

Emanuele Giuseppe Esposito Nov. 3, 2022, 1:41 p.m. UTC

block_copy_reset_unallocated and block_copy_is_cluster_allocated are
only called by backup_run, a corotuine_fn itself.

Same applies to block_copy_block_status, called by
block_copy_dirty_clusters.

Therefore mark them as coroutine too.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/block-copy.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

Paolo Bonzini Nov. 3, 2022, 4:56 p.m. UTC | #1

On 11/3/22 14:41, Emanuele Giuseppe Esposito wrote:
> block_copy_reset_unallocated and block_copy_is_cluster_allocated are
> only called by backup_run, a corotuine_fn itself.
> 
> Same applies to block_copy_block_status, called by
> block_copy_dirty_clusters.
> 
> Therefore mark them as coroutine too.
> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>

They don't need to be coroutine_fn.  coroutine_fn is needed if you call 
another coroutine_fn, but not if you _are only called_ by coroutine_fn. 
There is nothing in these functions that needs them to be called from a 
coroutine.

The only exception is qemu_coroutine_yield(), which is the only leaf 
coroutine_fn.

Paolo

> ---
>   block/block-copy.c | 15 +++++++++------
>   1 file changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index bb947afdda..f33ab1d0b6 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -577,8 +577,9 @@ static coroutine_fn int block_copy_task_entry(AioTask *task)
>       return ret;
>   }
>   
> -static int block_copy_block_status(BlockCopyState *s, int64_t offset,
> -                                   int64_t bytes, int64_t *pnum)
> +static coroutine_fn int block_copy_block_status(BlockCopyState *s,
> +                                                int64_t offset,
> +                                                int64_t bytes, int64_t *pnum)
>   {
>       int64_t num;
>       BlockDriverState *base;
> @@ -613,8 +614,9 @@ static int block_copy_block_status(BlockCopyState *s, int64_t offset,
>    * Check if the cluster starting at offset is allocated or not.
>    * return via pnum the number of contiguous clusters sharing this allocation.
>    */
> -static int block_copy_is_cluster_allocated(BlockCopyState *s, int64_t offset,
> -                                           int64_t *pnum)
> +static int coroutine_fn block_copy_is_cluster_allocated(BlockCopyState *s,
> +                                                        int64_t offset,
> +                                                        int64_t *pnum)
>   {
>       BlockDriverState *bs = s->source->bs;
>       int64_t count, total_count = 0;
> @@ -669,8 +671,9 @@ void block_copy_reset(BlockCopyState *s, int64_t offset, int64_t bytes)
>    * @return 0 when the cluster at @offset was unallocated,
>    *         1 otherwise, and -ret on error.
>    */
> -int64_t block_copy_reset_unallocated(BlockCopyState *s,
> -                                     int64_t offset, int64_t *count)
> +int64_t coroutine_fn block_copy_reset_unallocated(BlockCopyState *s,
> +                                                  int64_t offset,
> +                                                  int64_t *count)
>   {
>       int ret;
>       int64_t clusters, bytes;

Kevin Wolf Nov. 3, 2022, 6:06 p.m. UTC | #2

Am 03.11.2022 um 17:56 hat Paolo Bonzini geschrieben:
> On 11/3/22 14:41, Emanuele Giuseppe Esposito wrote:
> > block_copy_reset_unallocated and block_copy_is_cluster_allocated are
> > only called by backup_run, a corotuine_fn itself.

s/corotuine_fn/coroutine_fn/

> > 
> > Same applies to block_copy_block_status, called by
> > block_copy_dirty_clusters.
> > 
> > Therefore mark them as coroutine too.
> > 
> > Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> 
> They don't need to be coroutine_fn.  coroutine_fn is needed if you call
> another coroutine_fn, but not if you _are only called_ by coroutine_fn.
> There is nothing in these functions that needs them to be called from a
> coroutine.
> 
> The only exception is qemu_coroutine_yield(), which is the only leaf
> coroutine_fn.

I think it can make sense to have coroutine_fn as a documentation for
things that are only ever called in a coroutine even if they could
theoretically also work outside of coroutine context.

Otherwise, when we want to introduce a coroutine_fn call somewhere, it's
not only less obvious that it's even possible to do, but we'll have to
add potentially many additional coroutine_fn annotations in the whole
call chain in an otherwise unrelated patch.

Kevin

Paolo Bonzini Nov. 3, 2022, 6:36 p.m. UTC | #3

On 11/3/22 19:06, Kevin Wolf wrote:
> I think it can make sense to have coroutine_fn as a documentation for
> things that are only ever called in a coroutine even if they could
> theoretically also work outside of coroutine context.
> 
> Otherwise, when we want to introduce a coroutine_fn call somewhere, it's
> not only less obvious that it's even possible to do, but we'll have to
> add potentially many additional coroutine_fn annotations in the whole
> call chain in an otherwise unrelated patch.

This is true.  On the other hand, coroutine_fn also means "this is 
allowed to suspend", which may have implications on the need for locking 
in the caller.  So you need to judge case-by-case.

If there are good reasons to add the note, you could add an assertion 
that you are qemu_in_coroutine(), which notes that you are in a 
coroutine but you don't suspend.  In this case however I don't think 
it's likely that there will be a coroutine_fn call added later.

I guess I tend to err on the side of "it's good that it's not obvious 
that you can call a coroutine_fn", but I also need to correct myself as 
qemu_coroutine_yield() is not the only leaf; there is also 
qemu_co_queue_next() and qemu_co_queue_restart_all(), which are 
coroutine_fn because they rely on the queuing behavior of 
aio_co_enter().  In this case I violated my own rule because it is 
always a bug to call these functions outside coroutine context.

Paolo

Emanuele Giuseppe Esposito Nov. 4, 2022, 7:35 a.m. UTC | #4

Am 03/11/2022 um 19:36 schrieb Paolo Bonzini:
> On 11/3/22 19:06, Kevin Wolf wrote:
>> I think it can make sense to have coroutine_fn as a documentation for
>> things that are only ever called in a coroutine even if they could
>> theoretically also work outside of coroutine context.
>>
>> Otherwise, when we want to introduce a coroutine_fn call somewhere, it's
>> not only less obvious that it's even possible to do, but we'll have to
>> add potentially many additional coroutine_fn annotations in the whole
>> call chain in an otherwise unrelated patch.
> 
> This is true.  On the other hand, coroutine_fn also means "this is
> allowed to suspend", which may have implications on the need for locking
> in the caller.  So you need to judge case-by-case.
> 
> If there are good reasons to add the note, you could add an assertion
> that you are qemu_in_coroutine(), which notes that you are in a
> coroutine but you don't suspend.  In this case however I don't think
> it's likely that there will be a coroutine_fn call added later.
> 
> I guess I tend to err on the side of "it's good that it's not obvious
> that you can call a coroutine_fn", but I also need to correct myself as
> qemu_coroutine_yield() is not the only leaf; there is also
> qemu_co_queue_next() and qemu_co_queue_restart_all(), which are
> coroutine_fn because they rely on the queuing behavior of
> aio_co_enter().  In this case I violated my own rule because it is
> always a bug to call these functions outside coroutine context.
> 

But isn't it a bug also not to mark a function _only_ called by
coroutine_fn? My point is that if this function is an implementation of
a BlockDriver callback marked as coroutine_fn (like in patch 6 with
vmdk), then it would make sense.

This is actually the point of this serie (which I might not have
explained well in the cover letter), every function marked here is
eventually called by/calling a BlockDriver callback marked as coroutine_fn.

Currently we have something like this:
BlockDriver {
    void coroutine_fn (*bdrv_A)(void) = implA;
}

void coroutine_fn implA() {
    funcB();
    funcC();
}

void funcB() {}; <--- missing coroutine_fn?
void funcC() {}; <--- missing coroutine_fn?

In addition, as I understand draining is not allowed in coroutines. If a
function/callback only running in coroutines is not marked as
coroutine_fn, then it will be less obvious to notice that draining is
not allowed there.

Emanuele

Paolo Bonzini Nov. 4, 2022, 8:44 a.m. UTC | #5

On 11/4/22 08:35, Emanuele Giuseppe Esposito wrote:
> But isn't it a bug also not to mark a function _only_ called by
> coroutine_fn? My point is that if this function is an implementation of
> a BlockDriver callback marked as coroutine_fn (like in patch 6 with
> vmdk), then it would make sense.

If a function implements a coroutine_fn callback but does not suspend, 
then it makes sense to mark it coroutine_fn.

In general it's not a bug.  In most cases it would only be a coincidence 
that the function is called from a coroutine_fn.  For example consider 
bdrv_round_to_clusters().  Marking it coroutine_fn signals that it may 
suspend now (it doesn't) or in the future.  However it's only doing some 
math based on the result of bdrv_get_info(), so it is extremely unlikely 
that this will happen.

In this case... oh wait.  block_copy_is_cluster_allocated is calling 
bdrv_is_allocated, and block_copy_reset_unallocated calls 
block_copy_is_cluster_allocated.  bdrv_is_allocated is a mixed 
coroutine/non-coroutine function, and in this case it is useful to 
document that bdrv_is_allocated will suspend.  The patch is correct, 
only the commit message is wrong.

Likewise for blockstatus_to_extents in patch 3, where the commit message 
does mention bdrv_* functions.  As I mentioned in my quick review of 
patch 3, this can also snowball into a series of its own to clean up all 
callees of bdrv_co_common_block_status_above, similar to what Alberto 
did for read/write functions back in June, so that they are properly 
marked as coroutine_fn.  If you want to do it, don't do it by hand 
though, you can use his static analyzer.  It's slow but it's faster than 
doing it by hand.

> This is actually the point of this serie (which I might not have
> explained well in the cover letter), every function marked here is
> eventually called by/calling a BlockDriver callback marked as coroutine_fn.

Again I don't think this is useful in general, but your three patches 
(2/3/6) did catch cases that wants to be coroutine_fn.  So my objection 
is dropped with just a better commit message.

> Currently we have something like this:
> BlockDriver {
>      void coroutine_fn (*bdrv_A)(void) = implA;
> }
> 
> void coroutine_fn implA() {
>      funcB();
>      funcC();
> }
> 
> void funcB() {}; <--- missing coroutine_fn?
> void funcC() {}; <--- missing coroutine_fn?
> 
> In addition, as I understand draining is not allowed in coroutines.

... except we have bdrv_co_yield_to_drain() to allow that, sort of. :/

> If a function/callback only running in coroutines is not marked as
> coroutine_fn, then it will be less obvious to notice that draining is
> not allowed there.

I think it has to be judged case by base.  Your patches prove that, in 
most cases, you have coroutine_fn for things that ultimately do some 
kind of I/O or query.  In general the interesting path to explore is 
"coroutine_fn calls (indirectly) non-coroutine_fn calls (indirectly) 
generated_co_wrapper".  The vrc tool could be extended to help finding 
them, with commands like

     label coroutine_fn bdrv_co_read
     label coroutine_fn bdrv_co_write
     ...
     label generated_co_wrapper bdrv_read
     label generated_co_wrapper bdrv_write
     paths coroutine_fn !coroutine_fn generated_co_wrapper

Paolo

Emanuele Giuseppe Esposito Nov. 4, 2022, 9:20 a.m. UTC | #6

Am 04/11/2022 um 09:44 schrieb Paolo Bonzini:
> On 11/4/22 08:35, Emanuele Giuseppe Esposito wrote:
>> But isn't it a bug also not to mark a function _only_ called by
>> coroutine_fn? My point is that if this function is an implementation of
>> a BlockDriver callback marked as coroutine_fn (like in patch 6 with
>> vmdk), then it would make sense.
> 
> If a function implements a coroutine_fn callback but does not suspend,
> then it makes sense to mark it coroutine_fn.
> 
> In general it's not a bug.  In most cases it would only be a coincidence
> that the function is called from a coroutine_fn.  For example consider
> bdrv_round_to_clusters().  Marking it coroutine_fn signals that it may
> suspend now (it doesn't) or in the future.  However it's only doing some
> math based on the result of bdrv_get_info(), so it is extremely unlikely
> that this will happen.
> 
> In this case... oh wait.  block_copy_is_cluster_allocated is calling
> bdrv_is_allocated, and block_copy_reset_unallocated calls
> block_copy_is_cluster_allocated.  bdrv_is_allocated is a mixed
> coroutine/non-coroutine function, and in this case it is useful to
> document that bdrv_is_allocated will suspend.  The patch is correct,
> only the commit message is wrong.

[...]

>> This is actually the point of this serie (which I might not have
>> explained well in the cover letter), every function marked here is
>> eventually called by/calling a BlockDriver callback marked as
>> coroutine_fn.
> 
> Again I don't think this is useful in general, but your three patches
> (2/3/6) did catch cases that wants to be coroutine_fn.  So my objection
> is dropped with just a better commit message.

I see your point about "in general it's not a bug".
At this point I just want to make sure that we agree that it's correct
to add coroutine_fn because of "the function calls a g_c_w that
suspends" *&&* "all the callers are coroutine_fn". If the callers
weren't coroutine_fn then g_c_w would just create a new coroutine and
poll, and I don't think that would be part of your definition of "can
suspend".

Thank you,
Emanuele

> 
>> Currently we have something like this:
>> BlockDriver {
>>      void coroutine_fn (*bdrv_A)(void) = implA;
>> }
>>
>> void coroutine_fn implA() {
>>      funcB();
>>      funcC();
>> }
>>
>> void funcB() {}; <--- missing coroutine_fn?
>> void funcC() {}; <--- missing coroutine_fn?
>>
>> In addition, as I understand draining is not allowed in coroutines.
> 
> ... except we have bdrv_co_yield_to_drain() to allow that, sort of. :/
> 
>> If a function/callback only running in coroutines is not marked as
>> coroutine_fn, then it will be less obvious to notice that draining is
>> not allowed there.
> 
> I think it has to be judged case by base.  Your patches prove that, in
> most cases, you have coroutine_fn for things that ultimately do some
> kind of I/O or query.  In general the interesting path to explore is
> "coroutine_fn calls (indirectly) non-coroutine_fn calls (indirectly)
> generated_co_wrapper".  The vrc tool could be extended to help finding
> them, with commands like
> 
>     label coroutine_fn bdrv_co_read
>     label coroutine_fn bdrv_co_write
>     ...
>     label generated_co_wrapper bdrv_read
>     label generated_co_wrapper bdrv_write
>     paths coroutine_fn !coroutine_fn generated_co_wrapper
> 
> Paolo
>

Kevin Wolf Nov. 4, 2022, 1:12 p.m. UTC | #7

Am 04.11.2022 um 09:44 hat Paolo Bonzini geschrieben:
> On 11/4/22 08:35, Emanuele Giuseppe Esposito wrote:
> > But isn't it a bug also not to mark a function _only_ called by
> > coroutine_fn? My point is that if this function is an implementation of
> > a BlockDriver callback marked as coroutine_fn (like in patch 6 with
> > vmdk), then it would make sense.
> 
> If a function implements a coroutine_fn callback but does not suspend, then
> it makes sense to mark it coroutine_fn.
> 
> In general it's not a bug.  In most cases it would only be a coincidence
> that the function is called from a coroutine_fn.  For example consider
> bdrv_round_to_clusters().  Marking it coroutine_fn signals that it may
> suspend now (it doesn't) or in the future.  However it's only doing some
> math based on the result of bdrv_get_info(), so it is extremely unlikely
> that this will happen.
> 
> In this case... oh wait.  block_copy_is_cluster_allocated is calling
> bdrv_is_allocated, and block_copy_reset_unallocated calls
> block_copy_is_cluster_allocated.  bdrv_is_allocated is a mixed
> coroutine/non-coroutine function, and in this case it is useful to document
> that bdrv_is_allocated will suspend.  The patch is correct, only the commit
> message is wrong.

Ah, right, now I remember that this is what I had discussed with
Emanuele. I knew that there was a reason for it...

What we probably should really do is a bdrv_co_is_allocated() that
doesn't go through the mixed function, but directly calls
bdrv_co_common_block_status_above(). bdrv_is_allocated() is only a
smaller wrapper anyway, so it wouldn't duplicate much code.

I seem to remember that Emanuele had a few more bdrv_is_allocated()
calls that actually took the coroutine path and could use the same new
function.

Kevin

Paolo Bonzini Nov. 4, 2022, 1:16 p.m. UTC | #8

On 11/4/22 10:20, Emanuele Giuseppe Esposito wrote:
> At this point I just want to make sure that we agree that it's correct
> to add coroutine_fn because of "the function calls a g_c_w that
> suspends" *&&* "all the callers are coroutine_fn". If the callers
> weren't coroutine_fn then g_c_w would just create a new coroutine and
> poll, and I don't think that would be part of your definition of "can
> suspend".

Yes, we agree on that.  The confusion was just on the commit message.

The even-better fix would be to also call coroutine_fn from 
coroutine_fn, instead of calling mixed coroutine/non-coroutine functions 
such as g_c_w functions.  However, adding coroutine_fn *is* correct.

Paolo

[2/9] block-copy: add missing coroutine_fn annotations

Commit Message

Comments

Patch