Message ID | 20221103134206.4041928-3-eesposit@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Still more coroutine and various fixes in block layer | expand |
On 11/3/22 14:41, Emanuele Giuseppe Esposito wrote: > block_copy_reset_unallocated and block_copy_is_cluster_allocated are > only called by backup_run, a corotuine_fn itself. > > Same applies to block_copy_block_status, called by > block_copy_dirty_clusters. > > Therefore mark them as coroutine too. > > Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> They don't need to be coroutine_fn. coroutine_fn is needed if you call another coroutine_fn, but not if you _are only called_ by coroutine_fn. There is nothing in these functions that needs them to be called from a coroutine. The only exception is qemu_coroutine_yield(), which is the only leaf coroutine_fn. Paolo > --- > block/block-copy.c | 15 +++++++++------ > 1 file changed, 9 insertions(+), 6 deletions(-) > > diff --git a/block/block-copy.c b/block/block-copy.c > index bb947afdda..f33ab1d0b6 100644 > --- a/block/block-copy.c > +++ b/block/block-copy.c > @@ -577,8 +577,9 @@ static coroutine_fn int block_copy_task_entry(AioTask *task) > return ret; > } > > -static int block_copy_block_status(BlockCopyState *s, int64_t offset, > - int64_t bytes, int64_t *pnum) > +static coroutine_fn int block_copy_block_status(BlockCopyState *s, > + int64_t offset, > + int64_t bytes, int64_t *pnum) > { > int64_t num; > BlockDriverState *base; > @@ -613,8 +614,9 @@ static int block_copy_block_status(BlockCopyState *s, int64_t offset, > * Check if the cluster starting at offset is allocated or not. > * return via pnum the number of contiguous clusters sharing this allocation. > */ > -static int block_copy_is_cluster_allocated(BlockCopyState *s, int64_t offset, > - int64_t *pnum) > +static int coroutine_fn block_copy_is_cluster_allocated(BlockCopyState *s, > + int64_t offset, > + int64_t *pnum) > { > BlockDriverState *bs = s->source->bs; > int64_t count, total_count = 0; > @@ -669,8 +671,9 @@ void block_copy_reset(BlockCopyState *s, int64_t offset, int64_t bytes) > * @return 0 when the cluster at @offset was unallocated, > * 1 otherwise, and -ret on error. > */ > -int64_t block_copy_reset_unallocated(BlockCopyState *s, > - int64_t offset, int64_t *count) > +int64_t coroutine_fn block_copy_reset_unallocated(BlockCopyState *s, > + int64_t offset, > + int64_t *count) > { > int ret; > int64_t clusters, bytes;
Am 03.11.2022 um 17:56 hat Paolo Bonzini geschrieben: > On 11/3/22 14:41, Emanuele Giuseppe Esposito wrote: > > block_copy_reset_unallocated and block_copy_is_cluster_allocated are > > only called by backup_run, a corotuine_fn itself. s/corotuine_fn/coroutine_fn/ > > > > Same applies to block_copy_block_status, called by > > block_copy_dirty_clusters. > > > > Therefore mark them as coroutine too. > > > > Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> > > They don't need to be coroutine_fn. coroutine_fn is needed if you call > another coroutine_fn, but not if you _are only called_ by coroutine_fn. > There is nothing in these functions that needs them to be called from a > coroutine. > > The only exception is qemu_coroutine_yield(), which is the only leaf > coroutine_fn. I think it can make sense to have coroutine_fn as a documentation for things that are only ever called in a coroutine even if they could theoretically also work outside of coroutine context. Otherwise, when we want to introduce a coroutine_fn call somewhere, it's not only less obvious that it's even possible to do, but we'll have to add potentially many additional coroutine_fn annotations in the whole call chain in an otherwise unrelated patch. Kevin
On 11/3/22 19:06, Kevin Wolf wrote: > I think it can make sense to have coroutine_fn as a documentation for > things that are only ever called in a coroutine even if they could > theoretically also work outside of coroutine context. > > Otherwise, when we want to introduce a coroutine_fn call somewhere, it's > not only less obvious that it's even possible to do, but we'll have to > add potentially many additional coroutine_fn annotations in the whole > call chain in an otherwise unrelated patch. This is true. On the other hand, coroutine_fn also means "this is allowed to suspend", which may have implications on the need for locking in the caller. So you need to judge case-by-case. If there are good reasons to add the note, you could add an assertion that you are qemu_in_coroutine(), which notes that you are in a coroutine but you don't suspend. In this case however I don't think it's likely that there will be a coroutine_fn call added later. I guess I tend to err on the side of "it's good that it's not obvious that you can call a coroutine_fn", but I also need to correct myself as qemu_coroutine_yield() is not the only leaf; there is also qemu_co_queue_next() and qemu_co_queue_restart_all(), which are coroutine_fn because they rely on the queuing behavior of aio_co_enter(). In this case I violated my own rule because it is always a bug to call these functions outside coroutine context. Paolo
Am 03/11/2022 um 19:36 schrieb Paolo Bonzini: > On 11/3/22 19:06, Kevin Wolf wrote: >> I think it can make sense to have coroutine_fn as a documentation for >> things that are only ever called in a coroutine even if they could >> theoretically also work outside of coroutine context. >> >> Otherwise, when we want to introduce a coroutine_fn call somewhere, it's >> not only less obvious that it's even possible to do, but we'll have to >> add potentially many additional coroutine_fn annotations in the whole >> call chain in an otherwise unrelated patch. > > This is true. On the other hand, coroutine_fn also means "this is > allowed to suspend", which may have implications on the need for locking > in the caller. So you need to judge case-by-case. > > If there are good reasons to add the note, you could add an assertion > that you are qemu_in_coroutine(), which notes that you are in a > coroutine but you don't suspend. In this case however I don't think > it's likely that there will be a coroutine_fn call added later. > > I guess I tend to err on the side of "it's good that it's not obvious > that you can call a coroutine_fn", but I also need to correct myself as > qemu_coroutine_yield() is not the only leaf; there is also > qemu_co_queue_next() and qemu_co_queue_restart_all(), which are > coroutine_fn because they rely on the queuing behavior of > aio_co_enter(). In this case I violated my own rule because it is > always a bug to call these functions outside coroutine context. > But isn't it a bug also not to mark a function _only_ called by coroutine_fn? My point is that if this function is an implementation of a BlockDriver callback marked as coroutine_fn (like in patch 6 with vmdk), then it would make sense. This is actually the point of this serie (which I might not have explained well in the cover letter), every function marked here is eventually called by/calling a BlockDriver callback marked as coroutine_fn. Currently we have something like this: BlockDriver { void coroutine_fn (*bdrv_A)(void) = implA; } void coroutine_fn implA() { funcB(); funcC(); } void funcB() {}; <--- missing coroutine_fn? void funcC() {}; <--- missing coroutine_fn? In addition, as I understand draining is not allowed in coroutines. If a function/callback only running in coroutines is not marked as coroutine_fn, then it will be less obvious to notice that draining is not allowed there. Emanuele
On 11/4/22 08:35, Emanuele Giuseppe Esposito wrote: > But isn't it a bug also not to mark a function _only_ called by > coroutine_fn? My point is that if this function is an implementation of > a BlockDriver callback marked as coroutine_fn (like in patch 6 with > vmdk), then it would make sense. If a function implements a coroutine_fn callback but does not suspend, then it makes sense to mark it coroutine_fn. In general it's not a bug. In most cases it would only be a coincidence that the function is called from a coroutine_fn. For example consider bdrv_round_to_clusters(). Marking it coroutine_fn signals that it may suspend now (it doesn't) or in the future. However it's only doing some math based on the result of bdrv_get_info(), so it is extremely unlikely that this will happen. In this case... oh wait. block_copy_is_cluster_allocated is calling bdrv_is_allocated, and block_copy_reset_unallocated calls block_copy_is_cluster_allocated. bdrv_is_allocated is a mixed coroutine/non-coroutine function, and in this case it is useful to document that bdrv_is_allocated will suspend. The patch is correct, only the commit message is wrong. Likewise for blockstatus_to_extents in patch 3, where the commit message does mention bdrv_* functions. As I mentioned in my quick review of patch 3, this can also snowball into a series of its own to clean up all callees of bdrv_co_common_block_status_above, similar to what Alberto did for read/write functions back in June, so that they are properly marked as coroutine_fn. If you want to do it, don't do it by hand though, you can use his static analyzer. It's slow but it's faster than doing it by hand. > This is actually the point of this serie (which I might not have > explained well in the cover letter), every function marked here is > eventually called by/calling a BlockDriver callback marked as coroutine_fn. Again I don't think this is useful in general, but your three patches (2/3/6) did catch cases that wants to be coroutine_fn. So my objection is dropped with just a better commit message. > Currently we have something like this: > BlockDriver { > void coroutine_fn (*bdrv_A)(void) = implA; > } > > void coroutine_fn implA() { > funcB(); > funcC(); > } > > void funcB() {}; <--- missing coroutine_fn? > void funcC() {}; <--- missing coroutine_fn? > > In addition, as I understand draining is not allowed in coroutines. ... except we have bdrv_co_yield_to_drain() to allow that, sort of. :/ > If a function/callback only running in coroutines is not marked as > coroutine_fn, then it will be less obvious to notice that draining is > not allowed there. I think it has to be judged case by base. Your patches prove that, in most cases, you have coroutine_fn for things that ultimately do some kind of I/O or query. In general the interesting path to explore is "coroutine_fn calls (indirectly) non-coroutine_fn calls (indirectly) generated_co_wrapper". The vrc tool could be extended to help finding them, with commands like label coroutine_fn bdrv_co_read label coroutine_fn bdrv_co_write ... label generated_co_wrapper bdrv_read label generated_co_wrapper bdrv_write paths coroutine_fn !coroutine_fn generated_co_wrapper Paolo
Am 04/11/2022 um 09:44 schrieb Paolo Bonzini: > On 11/4/22 08:35, Emanuele Giuseppe Esposito wrote: >> But isn't it a bug also not to mark a function _only_ called by >> coroutine_fn? My point is that if this function is an implementation of >> a BlockDriver callback marked as coroutine_fn (like in patch 6 with >> vmdk), then it would make sense. > > If a function implements a coroutine_fn callback but does not suspend, > then it makes sense to mark it coroutine_fn. > > In general it's not a bug. In most cases it would only be a coincidence > that the function is called from a coroutine_fn. For example consider > bdrv_round_to_clusters(). Marking it coroutine_fn signals that it may > suspend now (it doesn't) or in the future. However it's only doing some > math based on the result of bdrv_get_info(), so it is extremely unlikely > that this will happen. > > In this case... oh wait. block_copy_is_cluster_allocated is calling > bdrv_is_allocated, and block_copy_reset_unallocated calls > block_copy_is_cluster_allocated. bdrv_is_allocated is a mixed > coroutine/non-coroutine function, and in this case it is useful to > document that bdrv_is_allocated will suspend. The patch is correct, > only the commit message is wrong. [...] >> This is actually the point of this serie (which I might not have >> explained well in the cover letter), every function marked here is >> eventually called by/calling a BlockDriver callback marked as >> coroutine_fn. > > Again I don't think this is useful in general, but your three patches > (2/3/6) did catch cases that wants to be coroutine_fn. So my objection > is dropped with just a better commit message. I see your point about "in general it's not a bug". At this point I just want to make sure that we agree that it's correct to add coroutine_fn because of "the function calls a g_c_w that suspends" *&&* "all the callers are coroutine_fn". If the callers weren't coroutine_fn then g_c_w would just create a new coroutine and poll, and I don't think that would be part of your definition of "can suspend". Thank you, Emanuele > >> Currently we have something like this: >> BlockDriver { >> void coroutine_fn (*bdrv_A)(void) = implA; >> } >> >> void coroutine_fn implA() { >> funcB(); >> funcC(); >> } >> >> void funcB() {}; <--- missing coroutine_fn? >> void funcC() {}; <--- missing coroutine_fn? >> >> In addition, as I understand draining is not allowed in coroutines. > > ... except we have bdrv_co_yield_to_drain() to allow that, sort of. :/ > >> If a function/callback only running in coroutines is not marked as >> coroutine_fn, then it will be less obvious to notice that draining is >> not allowed there. > > I think it has to be judged case by base. Your patches prove that, in > most cases, you have coroutine_fn for things that ultimately do some > kind of I/O or query. In general the interesting path to explore is > "coroutine_fn calls (indirectly) non-coroutine_fn calls (indirectly) > generated_co_wrapper". The vrc tool could be extended to help finding > them, with commands like > > label coroutine_fn bdrv_co_read > label coroutine_fn bdrv_co_write > ... > label generated_co_wrapper bdrv_read > label generated_co_wrapper bdrv_write > paths coroutine_fn !coroutine_fn generated_co_wrapper > > Paolo >
Am 04.11.2022 um 09:44 hat Paolo Bonzini geschrieben: > On 11/4/22 08:35, Emanuele Giuseppe Esposito wrote: > > But isn't it a bug also not to mark a function _only_ called by > > coroutine_fn? My point is that if this function is an implementation of > > a BlockDriver callback marked as coroutine_fn (like in patch 6 with > > vmdk), then it would make sense. > > If a function implements a coroutine_fn callback but does not suspend, then > it makes sense to mark it coroutine_fn. > > In general it's not a bug. In most cases it would only be a coincidence > that the function is called from a coroutine_fn. For example consider > bdrv_round_to_clusters(). Marking it coroutine_fn signals that it may > suspend now (it doesn't) or in the future. However it's only doing some > math based on the result of bdrv_get_info(), so it is extremely unlikely > that this will happen. > > In this case... oh wait. block_copy_is_cluster_allocated is calling > bdrv_is_allocated, and block_copy_reset_unallocated calls > block_copy_is_cluster_allocated. bdrv_is_allocated is a mixed > coroutine/non-coroutine function, and in this case it is useful to document > that bdrv_is_allocated will suspend. The patch is correct, only the commit > message is wrong. Ah, right, now I remember that this is what I had discussed with Emanuele. I knew that there was a reason for it... What we probably should really do is a bdrv_co_is_allocated() that doesn't go through the mixed function, but directly calls bdrv_co_common_block_status_above(). bdrv_is_allocated() is only a smaller wrapper anyway, so it wouldn't duplicate much code. I seem to remember that Emanuele had a few more bdrv_is_allocated() calls that actually took the coroutine path and could use the same new function. Kevin
On 11/4/22 10:20, Emanuele Giuseppe Esposito wrote: > At this point I just want to make sure that we agree that it's correct > to add coroutine_fn because of "the function calls a g_c_w that > suspends" *&&* "all the callers are coroutine_fn". If the callers > weren't coroutine_fn then g_c_w would just create a new coroutine and > poll, and I don't think that would be part of your definition of "can > suspend". Yes, we agree on that. The confusion was just on the commit message. The even-better fix would be to also call coroutine_fn from coroutine_fn, instead of calling mixed coroutine/non-coroutine functions such as g_c_w functions. However, adding coroutine_fn *is* correct. Paolo
diff --git a/block/block-copy.c b/block/block-copy.c index bb947afdda..f33ab1d0b6 100644 --- a/block/block-copy.c +++ b/block/block-copy.c @@ -577,8 +577,9 @@ static coroutine_fn int block_copy_task_entry(AioTask *task) return ret; } -static int block_copy_block_status(BlockCopyState *s, int64_t offset, - int64_t bytes, int64_t *pnum) +static coroutine_fn int block_copy_block_status(BlockCopyState *s, + int64_t offset, + int64_t bytes, int64_t *pnum) { int64_t num; BlockDriverState *base; @@ -613,8 +614,9 @@ static int block_copy_block_status(BlockCopyState *s, int64_t offset, * Check if the cluster starting at offset is allocated or not. * return via pnum the number of contiguous clusters sharing this allocation. */ -static int block_copy_is_cluster_allocated(BlockCopyState *s, int64_t offset, - int64_t *pnum) +static int coroutine_fn block_copy_is_cluster_allocated(BlockCopyState *s, + int64_t offset, + int64_t *pnum) { BlockDriverState *bs = s->source->bs; int64_t count, total_count = 0; @@ -669,8 +671,9 @@ void block_copy_reset(BlockCopyState *s, int64_t offset, int64_t bytes) * @return 0 when the cluster at @offset was unallocated, * 1 otherwise, and -ret on error. */ -int64_t block_copy_reset_unallocated(BlockCopyState *s, - int64_t offset, int64_t *count) +int64_t coroutine_fn block_copy_reset_unallocated(BlockCopyState *s, + int64_t offset, + int64_t *count) { int ret; int64_t clusters, bytes;
block_copy_reset_unallocated and block_copy_is_cluster_allocated are only called by backup_run, a corotuine_fn itself. Same applies to block_copy_block_status, called by block_copy_dirty_clusters. Therefore mark them as coroutine too. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> --- block/block-copy.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-)