[v4,6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState

Message ID	20210614073350.17048-7-eesposit@redhat.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=L9Eg=LI=nongnu.org=qemu-devel-bounces+qemu-devel=archiver.kernel.org@kernel.org> DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 44DAA606A5 From: Emanuele Giuseppe Esposito <eesposit@redhat.com> To: qemu-block@nongnu.org Subject: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState Date: Mon, 14 Jun 2021 09:33:50 +0200 Message-Id: <20210614073350.17048-7-eesposit@redhat.com> In-Reply-To: <20210614073350.17048-1-eesposit@redhat.com> References: <20210614073350.17048-1-eesposit@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset="US-ASCII" Received-SPF: pass client-ip=170.10.133.124; envelope-from=eesposit@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -29 X-Spam_score: -3.0 X-Spam_bar: --- X-Spam_report: (-3.0 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.199, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action Precedence: list Cc: Kevin Wolf <kwolf@redhat.com>, Emanuele Giuseppe Esposito <eesposit@redhat.com>, Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>, qemu-devel@nongnu.org, Max Reitz <mreitz@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, John Snow <jsnow@redhat.com> Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" <qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org>
Series	block-copy: protect block-copy internal structures \| expand [v4,0/6] block-copy: protect block-copy internal structures [v4,1/6] block-copy: small refactor in block_copy_task_entry and block_copy_common [v4,2/6] block-copy: streamline choice of copy_range vs. read/write [v4,3/6] block-copy: improve comments of BlockCopyTask and BlockCopyState types and functions [v4,4/6] block-copy: move progress_set_remaining in block_copy_task_end [v4,5/6] block-copy: add a CoMutex [v4,6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState

Emanuele Giuseppe Esposito June 14, 2021, 7:33 a.m. UTC

By adding acquire/release pairs, we ensure that .ret and .error_is_read
fields are written by block_copy_dirty_clusters before .finished is true.

The atomic here are necessary because the fields are concurrently modified
also outside coroutines.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/block-copy.c | 33 ++++++++++++++++++---------------
 1 file changed, 18 insertions(+), 15 deletions(-)

Vladimir Sementsov-Ogievskiy June 19, 2021, 8:06 p.m. UTC | #1

14.06.2021 10:33, Emanuele Giuseppe Esposito wrote:
> By adding acquire/release pairs, we ensure that .ret and .error_is_read
> fields are written by block_copy_dirty_clusters before .finished is true.

And that they are read by API user after .finished is true.

> 
> The atomic here are necessary because the fields are concurrently modified
> also outside coroutines.

To be honest, finished is modified only in coroutine. And read outside.

> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>   block/block-copy.c | 33 ++++++++++++++++++---------------
>   1 file changed, 18 insertions(+), 15 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 6416929abd..5348e1f61b 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -53,14 +53,14 @@ typedef struct BlockCopyCallState {
>       Coroutine *co;
>   
>       /* State */
> -    bool finished;
> +    bool finished; /* atomic */

So, logic around finished:

Thread of block_copy does:
0. finished is false
1. tasks set ret and error_is_read
2. qatomic_store_release finished -> true
3. after that point ret and error_is_read are not modified

Other threads can:

- qatomic_read finished, just to check are we finished or not

- if finished, can read ret and error_is_read safely. If you not sure that block-copy finished, use qatomic_load_acquire() of finished first, to be sure that you read ret and error_is_read AFTER finished read and checked to be true.

>       QemuCoSleep sleep; /* TODO: protect API with a lock */
>   
>       /* To reference all call states from BlockCopyState */
>       QLIST_ENTRY(BlockCopyCallState) list;
>   
>       /* OUT parameters */
> -    bool cancelled;
> +    bool cancelled; /* atomic */

Logic around cancelled is simpler:

- false at start

- qatomic_read is allowed from any thread

- qatomic_write to true is allowed from any thread

- never write to false

Note that cancelling and finishing are racy. User can cancel block-copy that's already finished. We probably may improve change it, but I'm not sure that it worth doing. Still, maybe leave some comment in API documentation.

>       /* Fields protected by lock in BlockCopyState */
>       bool error_is_read;
>       int ret;
> @@ -650,7 +650,8 @@ block_copy_dirty_clusters(BlockCopyCallState *call_state)
>       assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
>       assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
>   
> -    while (bytes && aio_task_pool_status(aio) == 0 && !call_state->cancelled) {
> +    while (bytes && aio_task_pool_status(aio) == 0 &&
> +           !qatomic_read(&call_state->cancelled)) {
>           BlockCopyTask *task;
>           int64_t status_bytes;
>   
> @@ -761,7 +762,7 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
>       do {
>           ret = block_copy_dirty_clusters(call_state);
>   
> -        if (ret == 0 && !call_state->cancelled) {
> +        if (ret == 0 && !qatomic_read(&call_state->cancelled)) {
>               WITH_QEMU_LOCK_GUARD(&s->lock) {
>                   /*
>                    * Check that there is no task we still need to
> @@ -792,9 +793,9 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
>            * 2. We have waited for some intersecting block-copy request
>            *    It may have failed and produced new dirty bits.
>            */
> -    } while (ret > 0 && !call_state->cancelled);
> +    } while (ret > 0 && !qatomic_read(&call_state->cancelled));
>   
> -    call_state->finished = true;
> +    qatomic_store_release(&call_state->finished, true);

so, all writes to ret and error_is_read are finished to this point.

>   
>       if (call_state->cb) {
>           call_state->cb(call_state->cb_opaque);
> @@ -857,35 +858,37 @@ void block_copy_call_free(BlockCopyCallState *call_state)
>           return;
>       }
>   
> -    assert(call_state->finished);
> +    assert(qatomic_load_acquire(&call_state->finished));

Here we don't need load_aquire, as we don't read other fields. qatomic_read is enough.

>       g_free(call_state);
>   }
>   
>   bool block_copy_call_finished(BlockCopyCallState *call_state)
>   {
> -    return call_state->finished;
> +    return qatomic_load_acquire(&call_state->finished);

and here

>   }
>   
>   bool block_copy_call_succeeded(BlockCopyCallState *call_state)
>   {
> -    return call_state->finished && !call_state->cancelled &&
> -        call_state->ret == 0;
> +    return qatomic_load_acquire(&call_state->finished) &&
> +           !qatomic_read(&call_state->cancelled) &&
> +           call_state->ret == 0;

Here qatomic_load_acquire() is reasonable: it protects read of ->ret

>   }
>   
>   bool block_copy_call_failed(BlockCopyCallState *call_state)
>   {
> -    return call_state->finished && !call_state->cancelled &&
> -        call_state->ret < 0;
> +    return qatomic_load_acquire(&call_state->finished) &&
> +           !qatomic_read(&call_state->cancelled) &&
> +           call_state->ret < 0;

Here reasonable.

>   }
>   
>   bool block_copy_call_cancelled(BlockCopyCallState *call_state)
>   {
> -    return call_state->cancelled;
> +    return qatomic_read(&call_state->cancelled);
>   }
>   
>   int block_copy_call_status(BlockCopyCallState *call_state, bool *error_is_read)
>   {
> -    assert(call_state->finished);
> +    assert(qatomic_load_acquire(&call_state->finished));

Hmm. Here qatomic_load_acquire protects nothing (assertion will crash if not yet finished anyway). So, caller is double sure that block-copy is finished.

Also it's misleading: if we think that it do some protection, we are doing wrong thing: assertions may be simply compiled out, we can't rely on statements inside assert() to be executed.

So, let's use simple qatomic_read here too.

>       if (error_is_read) {
>           *error_is_read = call_state->error_is_read;
>       }
> @@ -894,7 +897,7 @@ int block_copy_call_status(BlockCopyCallState *call_state, bool *error_is_read)
>   
>   void block_copy_call_cancel(BlockCopyCallState *call_state)
>   {
> -    call_state->cancelled = true;
> +    qatomic_set(&call_state->cancelled, true);
>       block_copy_kick(call_state);
>   }
>   
> 

Uhh :)

Ok, that looks close too. Or in other words, I feel that I have good enough understanding of all the thread-safe logic that you have implemented :)

Emanuele Giuseppe Esposito June 21, 2021, 9:30 a.m. UTC | #2

On 19/06/2021 22:06, Vladimir Sementsov-Ogievskiy wrote:
> 14.06.2021 10:33, Emanuele Giuseppe Esposito wrote:
>> By adding acquire/release pairs, we ensure that .ret and .error_is_read
>> fields are written by block_copy_dirty_clusters before .finished is true.
> 
> And that they are read by API user after .finished is true.
> 
>>
>> The atomic here are necessary because the fields are concurrently 
>> modified
>> also outside coroutines.
> 
> To be honest, finished is modified only in coroutine. And read outside.
> 
>>
>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>> ---
>>   block/block-copy.c | 33 ++++++++++++++++++---------------
>>   1 file changed, 18 insertions(+), 15 deletions(-)
>>
>> diff --git a/block/block-copy.c b/block/block-copy.c
>> index 6416929abd..5348e1f61b 100644
>> --- a/block/block-copy.c
>> +++ b/block/block-copy.c
>> @@ -53,14 +53,14 @@ typedef struct BlockCopyCallState {
>>       Coroutine *co;
>>       /* State */
>> -    bool finished;
>> +    bool finished; /* atomic */
> 
> So, logic around finished:
> 
> Thread of block_copy does:
> 0. finished is false
> 1. tasks set ret and error_is_read
> 2. qatomic_store_release finished -> true
> 3. after that point ret and error_is_read are not modified
> 
> Other threads can:
> 
> - qatomic_read finished, just to check are we finished or not
> 
> - if finished, can read ret and error_is_read safely. If you not sure 
> that block-copy finished, use qatomic_load_acquire() of finished first, 
> to be sure that you read ret and error_is_read AFTER finished read and 
> checked to be true.
> 
>>       QemuCoSleep sleep; /* TODO: protect API with a lock */
>>       /* To reference all call states from BlockCopyState */
>>       QLIST_ENTRY(BlockCopyCallState) list;
>>       /* OUT parameters */
>> -    bool cancelled;
>> +    bool cancelled; /* atomic */
> 
> Logic around cancelled is simpler:
> 
> - false at start
> 
> - qatomic_read is allowed from any thread
> 
> - qatomic_write to true is allowed from any thread
> 
> - never write to false
> 
> Note that cancelling and finishing are racy. User can cancel block-copy 
> that's already finished. We probably may improve change it, but I'm not 
> sure that it worth doing. Still, maybe leave some comment in API 
> documentation.
> 
>>       /* Fields protected by lock in BlockCopyState */
>>       bool error_is_read;
>>       int ret;
>> @@ -650,7 +650,8 @@ block_copy_dirty_clusters(BlockCopyCallState 
>> *call_state)
>>       assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
>>       assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
>> -    while (bytes && aio_task_pool_status(aio) == 0 && 
>> !call_state->cancelled) {
>> +    while (bytes && aio_task_pool_status(aio) == 0 &&
>> +           !qatomic_read(&call_state->cancelled)) {
>>           BlockCopyTask *task;
>>           int64_t status_bytes;
>> @@ -761,7 +762,7 @@ static int coroutine_fn 
>> block_copy_common(BlockCopyCallState *call_state)
>>       do {
>>           ret = block_copy_dirty_clusters(call_state);
>> -        if (ret == 0 && !call_state->cancelled) {
>> +        if (ret == 0 && !qatomic_read(&call_state->cancelled)) {
>>               WITH_QEMU_LOCK_GUARD(&s->lock) {
>>                   /*
>>                    * Check that there is no task we still need to
>> @@ -792,9 +793,9 @@ static int coroutine_fn 
>> block_copy_common(BlockCopyCallState *call_state)
>>            * 2. We have waited for some intersecting block-copy request
>>            *    It may have failed and produced new dirty bits.
>>            */
>> -    } while (ret > 0 && !call_state->cancelled);
>> +    } while (ret > 0 && !qatomic_read(&call_state->cancelled));
>> -    call_state->finished = true;
>> +    qatomic_store_release(&call_state->finished, true);
> 
> so, all writes to ret and error_is_read are finished to this point.
> 
>>       if (call_state->cb) {
>>           call_state->cb(call_state->cb_opaque);
>> @@ -857,35 +858,37 @@ void block_copy_call_free(BlockCopyCallState 
>> *call_state)
>>           return;
>>       }
>> -    assert(call_state->finished);
>> +    assert(qatomic_load_acquire(&call_state->finished));
> 
> Here we don't need load_aquire, as we don't read other fields. 
> qatomic_read is enough.

So what you say makes sense, the only thing that I wonder is: wouldn't 
it be better to have the acquire without assertion (or assert 
afterwards), just to be sure that we delete when finished is true?

[...]

> 
>>   }
>>   bool block_copy_call_cancelled(BlockCopyCallState *call_state)
>>   {
>> -    return call_state->cancelled;
>> +    return qatomic_read(&call_state->cancelled);
>>   }
>>   int block_copy_call_status(BlockCopyCallState *call_state, bool 
>> *error_is_read)
>>   {
>> -    assert(call_state->finished);
>> +    assert(qatomic_load_acquire(&call_state->finished));
> 
> Hmm. Here qatomic_load_acquire protects nothing (assertion will crash if 
> not yet finished anyway). So, caller is double sure that block-copy is 
> finished.
> 
> Also it's misleading: if we think that it do some protection, we are 
> doing wrong thing: assertions may be simply compiled out, we can't rely 
> on statements inside assert() to be executed.
> 
> So, let's use simple qatomic_read here too.

Same applies here.

> 
>>       if (error_is_read) {
>>           *error_is_read = call_state->error_is_read;
>>       }
>> @@ -894,7 +897,7 @@ int block_copy_call_status(BlockCopyCallState 
>> *call_state, bool *error_is_read)
>>   void block_copy_call_cancel(BlockCopyCallState *call_state)
>>   {
>> -    call_state->cancelled = true;
>> +    qatomic_set(&call_state->cancelled, true);
>>       block_copy_kick(call_state);
>>   }
>>
> 
> Uhh :)
> 
> Ok, that looks close too. Or in other words, I feel that I have good 
> enough understanding of all the thread-safe logic that you have 
> implemented :)

Good! :)

Emanuele

Paolo Bonzini June 22, 2021, 8:15 a.m. UTC | #3

On 19/06/21 22:06, Vladimir Sementsov-Ogievskiy wrote:
>>
>> -    assert(call_state->finished);
>> +    assert(qatomic_load_acquire(&call_state->finished));
> 
> Hmm. Here qatomic_load_acquire protects nothing (assertion will crash if 
> not yet finished anyway). So, caller is double sure that block-copy is 
> finished.

It does.  If it returns true, you still want the load of finished to 
happen before the reads that follow.

Otherwise I agree with your remarks.

Paolo

> Also it's misleading: if we think that it do some protection, we are 
> doing wrong thing: assertions may be simply compiled out, we can't rely 
> on statements inside assert() to be executed.
> 
> So, let's use simple qatomic_read here too.
> 
>>       if (error_is_read) {
>>           *error_is_read = call_state->error_is_read;
>>       }

Vladimir Sementsov-Ogievskiy June 22, 2021, 9:36 a.m. UTC | #4

22.06.2021 11:15, Paolo Bonzini wrote:
> On 19/06/21 22:06, Vladimir Sementsov-Ogievskiy wrote:
>>>
>>> -    assert(call_state->finished);
>>> +    assert(qatomic_load_acquire(&call_state->finished));
>>
>> Hmm. Here qatomic_load_acquire protects nothing (assertion will crash if not yet finished anyway). So, caller is double sure that block-copy is finished.
> 
> It does.  If it returns true, you still want the load of finished to happen before the reads that follow.
> 

Hmm.. The worst case if we use just qatomic_read is that assertion will not crash when it actually should. That doesn't break the logic. But that's not good anyway.

OK, I agree, let's keep it.

> Otherwise I agree with your remarks.
> 
> Paolo
> 
>> Also it's misleading: if we think that it do some protection, we are doing wrong thing: assertions may be simply compiled out, we can't rely on statements inside assert() to be executed.
>>
>> So, let's use simple qatomic_read here too.
>>
>>>       if (error_is_read) {
>>>           *error_is_read = call_state->error_is_read;
>>>       } 
>

Vladimir Sementsov-Ogievskiy June 22, 2021, 9:56 a.m. UTC | #5

21.06.2021 12:30, Emanuele Giuseppe Esposito wrote:
> 
> 
> On 19/06/2021 22:06, Vladimir Sementsov-Ogievskiy wrote:
>> 14.06.2021 10:33, Emanuele Giuseppe Esposito wrote:
>>> By adding acquire/release pairs, we ensure that .ret and .error_is_read
>>> fields are written by block_copy_dirty_clusters before .finished is true.
>>
>> And that they are read by API user after .finished is true.
>>
>>>
>>> The atomic here are necessary because the fields are concurrently modified
>>> also outside coroutines.
>>
>> To be honest, finished is modified only in coroutine. And read outside.
>>
>>>
>>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>>> ---
>>>   block/block-copy.c | 33 ++++++++++++++++++---------------
>>>   1 file changed, 18 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/block/block-copy.c b/block/block-copy.c
>>> index 6416929abd..5348e1f61b 100644
>>> --- a/block/block-copy.c
>>> +++ b/block/block-copy.c
>>> @@ -53,14 +53,14 @@ typedef struct BlockCopyCallState {
>>>       Coroutine *co;
>>>       /* State */
>>> -    bool finished;
>>> +    bool finished; /* atomic */
>>
>> So, logic around finished:
>>
>> Thread of block_copy does:
>> 0. finished is false
>> 1. tasks set ret and error_is_read
>> 2. qatomic_store_release finished -> true
>> 3. after that point ret and error_is_read are not modified
>>
>> Other threads can:
>>
>> - qatomic_read finished, just to check are we finished or not
>>
>> - if finished, can read ret and error_is_read safely. If you not sure that block-copy finished, use qatomic_load_acquire() of finished first, to be sure that you read ret and error_is_read AFTER finished read and checked to be true.
>>
>>>       QemuCoSleep sleep; /* TODO: protect API with a lock */
>>>       /* To reference all call states from BlockCopyState */
>>>       QLIST_ENTRY(BlockCopyCallState) list;
>>>       /* OUT parameters */
>>> -    bool cancelled;
>>> +    bool cancelled; /* atomic */
>>
>> Logic around cancelled is simpler:
>>
>> - false at start
>>
>> - qatomic_read is allowed from any thread
>>
>> - qatomic_write to true is allowed from any thread
>>
>> - never write to false
>>
>> Note that cancelling and finishing are racy. User can cancel block-copy that's already finished. We probably may improve change it, but I'm not sure that it worth doing. Still, maybe leave some comment in API documentation.
>>
>>>       /* Fields protected by lock in BlockCopyState */
>>>       bool error_is_read;
>>>       int ret;
>>> @@ -650,7 +650,8 @@ block_copy_dirty_clusters(BlockCopyCallState *call_state)
>>>       assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
>>>       assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
>>> -    while (bytes && aio_task_pool_status(aio) == 0 && !call_state->cancelled) {
>>> +    while (bytes && aio_task_pool_status(aio) == 0 &&
>>> +           !qatomic_read(&call_state->cancelled)) {
>>>           BlockCopyTask *task;
>>>           int64_t status_bytes;
>>> @@ -761,7 +762,7 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
>>>       do {
>>>           ret = block_copy_dirty_clusters(call_state);
>>> -        if (ret == 0 && !call_state->cancelled) {
>>> +        if (ret == 0 && !qatomic_read(&call_state->cancelled)) {
>>>               WITH_QEMU_LOCK_GUARD(&s->lock) {
>>>                   /*
>>>                    * Check that there is no task we still need to
>>> @@ -792,9 +793,9 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
>>>            * 2. We have waited for some intersecting block-copy request
>>>            *    It may have failed and produced new dirty bits.
>>>            */
>>> -    } while (ret > 0 && !call_state->cancelled);
>>> +    } while (ret > 0 && !qatomic_read(&call_state->cancelled));
>>> -    call_state->finished = true;
>>> +    qatomic_store_release(&call_state->finished, true);
>>
>> so, all writes to ret and error_is_read are finished to this point.
>>
>>>       if (call_state->cb) {
>>>           call_state->cb(call_state->cb_opaque);
>>> @@ -857,35 +858,37 @@ void block_copy_call_free(BlockCopyCallState *call_state)
>>>           return;
>>>       }
>>> -    assert(call_state->finished);
>>> +    assert(qatomic_load_acquire(&call_state->finished));
>>
>> Here we don't need load_aquire, as we don't read other fields. qatomic_read is enough.
> 
> So what you say makes sense, the only thing that I wonder is: wouldn't it be better to have the acquire without assertion (or assert afterwards), just to be sure that we delete when finished is true?
> 

Hmm. I think neither compiler nor processor should reorder read structure field and free() call on the whole structure :)

And anyway for block_copy_call_free() caller is responsible for the structure not being used by other thread.

> 
>>
>>>   }
>>>   bool block_copy_call_cancelled(BlockCopyCallState *call_state)
>>>   {
>>> -    return call_state->cancelled;
>>> +    return qatomic_read(&call_state->cancelled);
>>>   }
>>>   int block_copy_call_status(BlockCopyCallState *call_state, bool *error_is_read)
>>>   {
>>> -    assert(call_state->finished);
>>> +    assert(qatomic_load_acquire(&call_state->finished));
>>
>> Hmm. Here qatomic_load_acquire protects nothing (assertion will crash if not yet finished anyway). So, caller is double sure that block-copy is finished.
>>
>> Also it's misleading: if we think that it do some protection, we are doing wrong thing: assertions may be simply compiled out, we can't rely on statements inside assert() to be executed.
>>
>> So, let's use simple qatomic_read here too.
> 
> Same applies here.

Here I agree with Paolo, assertion works better as written..

So we can just keep it as is.

> 
>>
>>>       if (error_is_read) {
>>>           *error_is_read = call_state->error_is_read;
>>>       }
>>> @@ -894,7 +897,7 @@ int block_copy_call_status(BlockCopyCallState *call_state, bool *error_is_read)
>>>   void block_copy_call_cancel(BlockCopyCallState *call_state)
>>>   {
>>> -    call_state->cancelled = true;
>>> +    qatomic_set(&call_state->cancelled, true);
>>>       block_copy_kick(call_state);
>>>   }
>>>
>>
>> Uhh :)
>>
>> Ok, that looks close too. Or in other words, I feel that I have good enough understanding of all the thread-safe logic that you have implemented :)
> 
> Good! :)
> 
> Emanuele
>

Paolo Bonzini June 22, 2021, 10:20 a.m. UTC | #6

On 22/06/21 11:36, Vladimir Sementsov-Ogievskiy wrote:
>> It does.  If it returns true, you still want the load of finished to 
>> happen before the reads that follow.
> 
> Hmm.. The worst case if we use just qatomic_read is that assertion will 
> not crash when it actually should. That doesn't break the logic. But 
> that's not good anyway.
> 
> OK, I agree, let's keep it.

You can also have a finished job, but get a stale value for 
error_is_read or ret.  The issue is not in getting the stale value per 
se, but in block_copy_call_status's caller not expecting it.

(I understand you agree, but I guess it can be interesting to learn 
about this too).

Paolo

Vladimir Sementsov-Ogievskiy June 22, 2021, 10:39 a.m. UTC | #7

22.06.2021 13:20, Paolo Bonzini wrote:
> On 22/06/21 11:36, Vladimir Sementsov-Ogievskiy wrote:
>>> It does.  If it returns true, you still want the load of finished to happen before the reads that follow.
>>
>> Hmm.. The worst case if we use just qatomic_read is that assertion will not crash when it actually should. That doesn't break the logic. But that's not good anyway.
>>
>> OK, I agree, let's keep it.
> 
> You can also have a finished job, but get a stale value for error_is_read or ret.  The issue is not in getting the stale value per se, but in block_copy_call_status's caller not expecting it.
> 
> (I understand you agree, but I guess it can be interesting to learn about this too).
> 

Hmm. So, do you mean that we can read ret and error_is_read ONLY after explicitly doing load_acquire(finished) and checking that it's true?

That means that we must do it not in assertion (to not be compiled out):

bool finished = load_acquire()

assert(finished);

... read reat and error_is_read ...

Emanuele Giuseppe Esposito June 22, 2021, 8:57 p.m. UTC | #8

On 22/06/2021 12:39, Vladimir Sementsov-Ogievskiy wrote:
> 22.06.2021 13:20, Paolo Bonzini wrote:
>> On 22/06/21 11:36, Vladimir Sementsov-Ogievskiy wrote:
>>>> It does.  If it returns true, you still want the load of finished to 
>>>> happen before the reads that follow.
>>>
>>> Hmm.. The worst case if we use just qatomic_read is that assertion 
>>> will not crash when it actually should. That doesn't break the logic. 
>>> But that's not good anyway.
>>>
>>> OK, I agree, let's keep it.
>>
>> You can also have a finished job, but get a stale value for 
>> error_is_read or ret.  The issue is not in getting the stale value per 
>> se, but in block_copy_call_status's caller not expecting it.
>>
>> (I understand you agree, but I guess it can be interesting to learn 
>> about this too).
>>
> 
> Hmm. So, do you mean that we can read ret and error_is_read ONLY after 
> explicitly doing load_acquire(finished) and checking that it's true?
> 
> That means that we must do it not in assertion (to not be compiled out):
> 
> bool finished = load_acquire()
> 
> assert(finished);
> 
> ... read reat and error_is_read ...
> 
> 

If I understand correctly, this was what I was trying to say before: 
maybe it's better that we make sure that @finished is set before reading 
@ret and @error_is_read. And because assert can be disabled, we can do 
like you wrote above.

Anyways, let's wait Paolo's answer for this. Once this is ready, I will 
send v5.

Emanuele

Paolo Bonzini June 23, 2021, 10:06 a.m. UTC | #9

On 22/06/21 12:39, Vladimir Sementsov-Ogievskiy wrote:
> 22.06.2021 13:20, Paolo Bonzini wrote:
>> On 22/06/21 11:36, Vladimir Sementsov-Ogievskiy wrote:
>>> OK, I agree, let's keep it.
>>
>> You can also have a finished job, but get a stale value for 
>> error_is_read or ret.  The issue is not in getting the stale value per 
>> se, but in block_copy_call_status's caller not expecting it.
> 
> Hmm. So, do you mean that we can read ret and error_is_read ONLY after 
> explicitly doing load_acquire(finished) and checking that it's true?
> 
> That means that we must do it not in assertion (to not be compiled out):
> 
> bool finished = load_acquire()
> 
> assert(finished);
> 
> ... read reat and error_is_read ...

In reality you must have synchronized in some other way; that outside 
synchronization outside block-copy.c is what guarantees that the 
assertion does not fail.  The simplest cases are:

- a mutex: "unlock" counts as release, "lock" counts as acquire;

- a bottom half: "schedule" counts as release, running counts as acquire.

Therefore, removing the assertion would work just fine because the 
synchronization would be like this:

    write ret/error_is_read
    write finished
    trigger bottom half or something -->    bottom half runs
                                            read ret/error_is_read

So there is no need at all to read ->finished, much less to load it 
outside the assertion.  At the same time there are two problems with 
"assert(qatomic_read(&call_state->finished))".  Note that they are not 
logic issues, they are maintenance issues.

First, if *there is a bug elsewhere* and the above synchronization 
doesn't happen, it may manifest sometimes as an assertion failure and 
sometimes as a memory reordering.  This is what I was talking about in 
the previous message, and it is probably the last thing that you want 
when debugging; since we're adding asserts defensively, we might as well 
do it well.

Second, if somebody later carelessly changes the function to

   if (qatomic_read(&call_state->finished)) {
       ...
   } else {
       error_setg(...);
   }

that would be broken.  Using qatomic_load_acquire makes the code more 
future-proof against a change like the one above.

Paolo

[v4,6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState

Commit Message

Comments

Patch