diff mbox series

[2/9] block-copy: add missing coroutine_fn annotations

Message ID 20221103134206.4041928-3-eesposit@redhat.com (mailing list archive)
State New, archived
Headers show
Series Still more coroutine and various fixes in block layer | expand

Commit Message

Emanuele Giuseppe Esposito Nov. 3, 2022, 1:41 p.m. UTC
block_copy_reset_unallocated and block_copy_is_cluster_allocated are
only called by backup_run, a corotuine_fn itself.

Same applies to block_copy_block_status, called by
block_copy_dirty_clusters.

Therefore mark them as coroutine too.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/block-copy.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

Comments

Paolo Bonzini Nov. 3, 2022, 4:56 p.m. UTC | #1
On 11/3/22 14:41, Emanuele Giuseppe Esposito wrote:
> block_copy_reset_unallocated and block_copy_is_cluster_allocated are
> only called by backup_run, a corotuine_fn itself.
> 
> Same applies to block_copy_block_status, called by
> block_copy_dirty_clusters.
> 
> Therefore mark them as coroutine too.
> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>

They don't need to be coroutine_fn.  coroutine_fn is needed if you call 
another coroutine_fn, but not if you _are only called_ by coroutine_fn. 
There is nothing in these functions that needs them to be called from a 
coroutine.

The only exception is qemu_coroutine_yield(), which is the only leaf 
coroutine_fn.

Paolo

> ---
>   block/block-copy.c | 15 +++++++++------
>   1 file changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index bb947afdda..f33ab1d0b6 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -577,8 +577,9 @@ static coroutine_fn int block_copy_task_entry(AioTask *task)
>       return ret;
>   }
>   
> -static int block_copy_block_status(BlockCopyState *s, int64_t offset,
> -                                   int64_t bytes, int64_t *pnum)
> +static coroutine_fn int block_copy_block_status(BlockCopyState *s,
> +                                                int64_t offset,
> +                                                int64_t bytes, int64_t *pnum)
>   {
>       int64_t num;
>       BlockDriverState *base;
> @@ -613,8 +614,9 @@ static int block_copy_block_status(BlockCopyState *s, int64_t offset,
>    * Check if the cluster starting at offset is allocated or not.
>    * return via pnum the number of contiguous clusters sharing this allocation.
>    */
> -static int block_copy_is_cluster_allocated(BlockCopyState *s, int64_t offset,
> -                                           int64_t *pnum)
> +static int coroutine_fn block_copy_is_cluster_allocated(BlockCopyState *s,
> +                                                        int64_t offset,
> +                                                        int64_t *pnum)
>   {
>       BlockDriverState *bs = s->source->bs;
>       int64_t count, total_count = 0;
> @@ -669,8 +671,9 @@ void block_copy_reset(BlockCopyState *s, int64_t offset, int64_t bytes)
>    * @return 0 when the cluster at @offset was unallocated,
>    *         1 otherwise, and -ret on error.
>    */
> -int64_t block_copy_reset_unallocated(BlockCopyState *s,
> -                                     int64_t offset, int64_t *count)
> +int64_t coroutine_fn block_copy_reset_unallocated(BlockCopyState *s,
> +                                                  int64_t offset,
> +                                                  int64_t *count)
>   {
>       int ret;
>       int64_t clusters, bytes;
Kevin Wolf Nov. 3, 2022, 6:06 p.m. UTC | #2
Am 03.11.2022 um 17:56 hat Paolo Bonzini geschrieben:
> On 11/3/22 14:41, Emanuele Giuseppe Esposito wrote:
> > block_copy_reset_unallocated and block_copy_is_cluster_allocated are
> > only called by backup_run, a corotuine_fn itself.

s/corotuine_fn/coroutine_fn/

> > 
> > Same applies to block_copy_block_status, called by
> > block_copy_dirty_clusters.
> > 
> > Therefore mark them as coroutine too.
> > 
> > Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> 
> They don't need to be coroutine_fn.  coroutine_fn is needed if you call
> another coroutine_fn, but not if you _are only called_ by coroutine_fn.
> There is nothing in these functions that needs them to be called from a
> coroutine.
> 
> The only exception is qemu_coroutine_yield(), which is the only leaf
> coroutine_fn.

I think it can make sense to have coroutine_fn as a documentation for
things that are only ever called in a coroutine even if they could
theoretically also work outside of coroutine context.

Otherwise, when we want to introduce a coroutine_fn call somewhere, it's
not only less obvious that it's even possible to do, but we'll have to
add potentially many additional coroutine_fn annotations in the whole
call chain in an otherwise unrelated patch.

Kevin
Paolo Bonzini Nov. 3, 2022, 6:36 p.m. UTC | #3
On 11/3/22 19:06, Kevin Wolf wrote:
> I think it can make sense to have coroutine_fn as a documentation for
> things that are only ever called in a coroutine even if they could
> theoretically also work outside of coroutine context.
> 
> Otherwise, when we want to introduce a coroutine_fn call somewhere, it's
> not only less obvious that it's even possible to do, but we'll have to
> add potentially many additional coroutine_fn annotations in the whole
> call chain in an otherwise unrelated patch.

This is true.  On the other hand, coroutine_fn also means "this is 
allowed to suspend", which may have implications on the need for locking 
in the caller.  So you need to judge case-by-case.

If there are good reasons to add the note, you could add an assertion 
that you are qemu_in_coroutine(), which notes that you are in a 
coroutine but you don't suspend.  In this case however I don't think 
it's likely that there will be a coroutine_fn call added later.

I guess I tend to err on the side of "it's good that it's not obvious 
that you can call a coroutine_fn", but I also need to correct myself as 
qemu_coroutine_yield() is not the only leaf; there is also 
qemu_co_queue_next() and qemu_co_queue_restart_all(), which are 
coroutine_fn because they rely on the queuing behavior of 
aio_co_enter().  In this case I violated my own rule because it is 
always a bug to call these functions outside coroutine context.

Paolo
Emanuele Giuseppe Esposito Nov. 4, 2022, 7:35 a.m. UTC | #4
Am 03/11/2022 um 19:36 schrieb Paolo Bonzini:
> On 11/3/22 19:06, Kevin Wolf wrote:
>> I think it can make sense to have coroutine_fn as a documentation for
>> things that are only ever called in a coroutine even if they could
>> theoretically also work outside of coroutine context.
>>
>> Otherwise, when we want to introduce a coroutine_fn call somewhere, it's
>> not only less obvious that it's even possible to do, but we'll have to
>> add potentially many additional coroutine_fn annotations in the whole
>> call chain in an otherwise unrelated patch.
> 
> This is true.  On the other hand, coroutine_fn also means "this is
> allowed to suspend", which may have implications on the need for locking
> in the caller.  So you need to judge case-by-case.
> 
> If there are good reasons to add the note, you could add an assertion
> that you are qemu_in_coroutine(), which notes that you are in a
> coroutine but you don't suspend.  In this case however I don't think
> it's likely that there will be a coroutine_fn call added later.
> 
> I guess I tend to err on the side of "it's good that it's not obvious
> that you can call a coroutine_fn", but I also need to correct myself as
> qemu_coroutine_yield() is not the only leaf; there is also
> qemu_co_queue_next() and qemu_co_queue_restart_all(), which are
> coroutine_fn because they rely on the queuing behavior of
> aio_co_enter().  In this case I violated my own rule because it is
> always a bug to call these functions outside coroutine context.
> 

But isn't it a bug also not to mark a function _only_ called by
coroutine_fn? My point is that if this function is an implementation of
a BlockDriver callback marked as coroutine_fn (like in patch 6 with
vmdk), then it would make sense.

This is actually the point of this serie (which I might not have
explained well in the cover letter), every function marked here is
eventually called by/calling a BlockDriver callback marked as coroutine_fn.

Currently we have something like this:
BlockDriver {
    void coroutine_fn (*bdrv_A)(void) = implA;
}

void coroutine_fn implA() {
    funcB();
    funcC();
}

void funcB() {}; <--- missing coroutine_fn?
void funcC() {}; <--- missing coroutine_fn?

In addition, as I understand draining is not allowed in coroutines. If a
function/callback only running in coroutines is not marked as
coroutine_fn, then it will be less obvious to notice that draining is
not allowed there.

Emanuele
Paolo Bonzini Nov. 4, 2022, 8:44 a.m. UTC | #5
On 11/4/22 08:35, Emanuele Giuseppe Esposito wrote:
> But isn't it a bug also not to mark a function _only_ called by
> coroutine_fn? My point is that if this function is an implementation of
> a BlockDriver callback marked as coroutine_fn (like in patch 6 with
> vmdk), then it would make sense.

If a function implements a coroutine_fn callback but does not suspend, 
then it makes sense to mark it coroutine_fn.

In general it's not a bug.  In most cases it would only be a coincidence 
that the function is called from a coroutine_fn.  For example consider 
bdrv_round_to_clusters().  Marking it coroutine_fn signals that it may 
suspend now (it doesn't) or in the future.  However it's only doing some 
math based on the result of bdrv_get_info(), so it is extremely unlikely 
that this will happen.

In this case... oh wait.  block_copy_is_cluster_allocated is calling 
bdrv_is_allocated, and block_copy_reset_unallocated calls 
block_copy_is_cluster_allocated.  bdrv_is_allocated is a mixed 
coroutine/non-coroutine function, and in this case it is useful to 
document that bdrv_is_allocated will suspend.  The patch is correct, 
only the commit message is wrong.

Likewise for blockstatus_to_extents in patch 3, where the commit message 
does mention bdrv_* functions.  As I mentioned in my quick review of 
patch 3, this can also snowball into a series of its own to clean up all 
callees of bdrv_co_common_block_status_above, similar to what Alberto 
did for read/write functions back in June, so that they are properly 
marked as coroutine_fn.  If you want to do it, don't do it by hand 
though, you can use his static analyzer.  It's slow but it's faster than 
doing it by hand.

> This is actually the point of this serie (which I might not have
> explained well in the cover letter), every function marked here is
> eventually called by/calling a BlockDriver callback marked as coroutine_fn.

Again I don't think this is useful in general, but your three patches 
(2/3/6) did catch cases that wants to be coroutine_fn.  So my objection 
is dropped with just a better commit message.

> Currently we have something like this:
> BlockDriver {
>      void coroutine_fn (*bdrv_A)(void) = implA;
> }
> 
> void coroutine_fn implA() {
>      funcB();
>      funcC();
> }
> 
> void funcB() {}; <--- missing coroutine_fn?
> void funcC() {}; <--- missing coroutine_fn?
> 
> In addition, as I understand draining is not allowed in coroutines.

... except we have bdrv_co_yield_to_drain() to allow that, sort of. :/

> If a function/callback only running in coroutines is not marked as
> coroutine_fn, then it will be less obvious to notice that draining is
> not allowed there.

I think it has to be judged case by base.  Your patches prove that, in 
most cases, you have coroutine_fn for things that ultimately do some 
kind of I/O or query.  In general the interesting path to explore is 
"coroutine_fn calls (indirectly) non-coroutine_fn calls (indirectly) 
generated_co_wrapper".  The vrc tool could be extended to help finding 
them, with commands like

     label coroutine_fn bdrv_co_read
     label coroutine_fn bdrv_co_write
     ...
     label generated_co_wrapper bdrv_read
     label generated_co_wrapper bdrv_write
     paths coroutine_fn !coroutine_fn generated_co_wrapper

Paolo
Emanuele Giuseppe Esposito Nov. 4, 2022, 9:20 a.m. UTC | #6
Am 04/11/2022 um 09:44 schrieb Paolo Bonzini:
> On 11/4/22 08:35, Emanuele Giuseppe Esposito wrote:
>> But isn't it a bug also not to mark a function _only_ called by
>> coroutine_fn? My point is that if this function is an implementation of
>> a BlockDriver callback marked as coroutine_fn (like in patch 6 with
>> vmdk), then it would make sense.
> 
> If a function implements a coroutine_fn callback but does not suspend,
> then it makes sense to mark it coroutine_fn.
> 
> In general it's not a bug.  In most cases it would only be a coincidence
> that the function is called from a coroutine_fn.  For example consider
> bdrv_round_to_clusters().  Marking it coroutine_fn signals that it may
> suspend now (it doesn't) or in the future.  However it's only doing some
> math based on the result of bdrv_get_info(), so it is extremely unlikely
> that this will happen.
> 
> In this case... oh wait.  block_copy_is_cluster_allocated is calling
> bdrv_is_allocated, and block_copy_reset_unallocated calls
> block_copy_is_cluster_allocated.  bdrv_is_allocated is a mixed
> coroutine/non-coroutine function, and in this case it is useful to
> document that bdrv_is_allocated will suspend.  The patch is correct,
> only the commit message is wrong.

[...]

>> This is actually the point of this serie (which I might not have
>> explained well in the cover letter), every function marked here is
>> eventually called by/calling a BlockDriver callback marked as
>> coroutine_fn.
> 
> Again I don't think this is useful in general, but your three patches
> (2/3/6) did catch cases that wants to be coroutine_fn.  So my objection
> is dropped with just a better commit message.

I see your point about "in general it's not a bug".
At this point I just want to make sure that we agree that it's correct
to add coroutine_fn because of "the function calls a g_c_w that
suspends" *&&* "all the callers are coroutine_fn". If the callers
weren't coroutine_fn then g_c_w would just create a new coroutine and
poll, and I don't think that would be part of your definition of "can
suspend".

Thank you,
Emanuele

> 
>> Currently we have something like this:
>> BlockDriver {
>>      void coroutine_fn (*bdrv_A)(void) = implA;
>> }
>>
>> void coroutine_fn implA() {
>>      funcB();
>>      funcC();
>> }
>>
>> void funcB() {}; <--- missing coroutine_fn?
>> void funcC() {}; <--- missing coroutine_fn?
>>
>> In addition, as I understand draining is not allowed in coroutines.
> 
> ... except we have bdrv_co_yield_to_drain() to allow that, sort of. :/
> 
>> If a function/callback only running in coroutines is not marked as
>> coroutine_fn, then it will be less obvious to notice that draining is
>> not allowed there.
> 
> I think it has to be judged case by base.  Your patches prove that, in
> most cases, you have coroutine_fn for things that ultimately do some
> kind of I/O or query.  In general the interesting path to explore is
> "coroutine_fn calls (indirectly) non-coroutine_fn calls (indirectly)
> generated_co_wrapper".  The vrc tool could be extended to help finding
> them, with commands like
> 
>     label coroutine_fn bdrv_co_read
>     label coroutine_fn bdrv_co_write
>     ...
>     label generated_co_wrapper bdrv_read
>     label generated_co_wrapper bdrv_write
>     paths coroutine_fn !coroutine_fn generated_co_wrapper
> 
> Paolo
>
Kevin Wolf Nov. 4, 2022, 1:12 p.m. UTC | #7
Am 04.11.2022 um 09:44 hat Paolo Bonzini geschrieben:
> On 11/4/22 08:35, Emanuele Giuseppe Esposito wrote:
> > But isn't it a bug also not to mark a function _only_ called by
> > coroutine_fn? My point is that if this function is an implementation of
> > a BlockDriver callback marked as coroutine_fn (like in patch 6 with
> > vmdk), then it would make sense.
> 
> If a function implements a coroutine_fn callback but does not suspend, then
> it makes sense to mark it coroutine_fn.
> 
> In general it's not a bug.  In most cases it would only be a coincidence
> that the function is called from a coroutine_fn.  For example consider
> bdrv_round_to_clusters().  Marking it coroutine_fn signals that it may
> suspend now (it doesn't) or in the future.  However it's only doing some
> math based on the result of bdrv_get_info(), so it is extremely unlikely
> that this will happen.
> 
> In this case... oh wait.  block_copy_is_cluster_allocated is calling
> bdrv_is_allocated, and block_copy_reset_unallocated calls
> block_copy_is_cluster_allocated.  bdrv_is_allocated is a mixed
> coroutine/non-coroutine function, and in this case it is useful to document
> that bdrv_is_allocated will suspend.  The patch is correct, only the commit
> message is wrong.

Ah, right, now I remember that this is what I had discussed with
Emanuele. I knew that there was a reason for it...

What we probably should really do is a bdrv_co_is_allocated() that
doesn't go through the mixed function, but directly calls
bdrv_co_common_block_status_above(). bdrv_is_allocated() is only a
smaller wrapper anyway, so it wouldn't duplicate much code.

I seem to remember that Emanuele had a few more bdrv_is_allocated()
calls that actually took the coroutine path and could use the same new
function.

Kevin
Paolo Bonzini Nov. 4, 2022, 1:16 p.m. UTC | #8
On 11/4/22 10:20, Emanuele Giuseppe Esposito wrote:
> At this point I just want to make sure that we agree that it's correct
> to add coroutine_fn because of "the function calls a g_c_w that
> suspends" *&&* "all the callers are coroutine_fn". If the callers
> weren't coroutine_fn then g_c_w would just create a new coroutine and
> poll, and I don't think that would be part of your definition of "can
> suspend".

Yes, we agree on that.  The confusion was just on the commit message.

The even-better fix would be to also call coroutine_fn from 
coroutine_fn, instead of calling mixed coroutine/non-coroutine functions 
such as g_c_w functions.  However, adding coroutine_fn *is* correct.

Paolo
diff mbox series

Patch

diff --git a/block/block-copy.c b/block/block-copy.c
index bb947afdda..f33ab1d0b6 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -577,8 +577,9 @@  static coroutine_fn int block_copy_task_entry(AioTask *task)
     return ret;
 }
 
-static int block_copy_block_status(BlockCopyState *s, int64_t offset,
-                                   int64_t bytes, int64_t *pnum)
+static coroutine_fn int block_copy_block_status(BlockCopyState *s,
+                                                int64_t offset,
+                                                int64_t bytes, int64_t *pnum)
 {
     int64_t num;
     BlockDriverState *base;
@@ -613,8 +614,9 @@  static int block_copy_block_status(BlockCopyState *s, int64_t offset,
  * Check if the cluster starting at offset is allocated or not.
  * return via pnum the number of contiguous clusters sharing this allocation.
  */
-static int block_copy_is_cluster_allocated(BlockCopyState *s, int64_t offset,
-                                           int64_t *pnum)
+static int coroutine_fn block_copy_is_cluster_allocated(BlockCopyState *s,
+                                                        int64_t offset,
+                                                        int64_t *pnum)
 {
     BlockDriverState *bs = s->source->bs;
     int64_t count, total_count = 0;
@@ -669,8 +671,9 @@  void block_copy_reset(BlockCopyState *s, int64_t offset, int64_t bytes)
  * @return 0 when the cluster at @offset was unallocated,
  *         1 otherwise, and -ret on error.
  */
-int64_t block_copy_reset_unallocated(BlockCopyState *s,
-                                     int64_t offset, int64_t *count)
+int64_t coroutine_fn block_copy_reset_unallocated(BlockCopyState *s,
+                                                  int64_t offset,
+                                                  int64_t *count)
 {
     int ret;
     int64_t clusters, bytes;