diff mbox series

[v5,22/31] block_int-common.h: assertion in the callers of BlockDriver function pointers

Message ID 20211124064418.3120601-23-eesposit@redhat.com (mailing list archive)
State New, archived
Headers show
Series block layer: split block APIs in global state and I/O | expand

Commit Message

Emanuele Giuseppe Esposito Nov. 24, 2021, 6:44 a.m. UTC
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block.c        | 18 ++++++++++++++++++
 block/create.c | 10 ++++++++++
 2 files changed, 28 insertions(+)

Comments

Hanna Czenczek Dec. 16, 2021, 6:43 p.m. UTC | #1
On 24.11.21 07:44, Emanuele Giuseppe Esposito wrote:
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>   block.c        | 18 ++++++++++++++++++
>   block/create.c | 10 ++++++++++
>   2 files changed, 28 insertions(+)

[...]

> diff --git a/block/create.c b/block/create.c
> index 89812669df..0167118579 100644
> --- a/block/create.c
> +++ b/block/create.c
> @@ -42,6 +42,16 @@ static int coroutine_fn blockdev_create_run(Job *job, Error **errp)
>       BlockdevCreateJob *s = container_of(job, BlockdevCreateJob, common);
>       int ret;
>   
> +    /*
> +     * Currently there is nothing preventing this
> +     * function from being called in an iothread context.
> +     * However, since it will crash anyways because of the
> +     * aiocontext lock not taken, we might as well make it
> +     * crash with a more meaningful error, by checking that
> +     * we are in the main loop
> +     */
> +    assert(qemu_in_main_thread());

Mostly agreed.  This function is always run in the main loop right now, 
so this assertion will never fail.

But that’s the “mostly”: Adding this assertion won’t give a more 
meaningful error, because the problem still remains that block drivers 
do not error out when encountering (or correctly handle) BDSs in 
non-main contexts, and so it remains a “qemu_mutex_unlock_impl: 
Operation not permitted”.

Not trying to say that that’s your problem.  It’s pre-existing, and this 
assertion is good.  Just wanting to clarify something about the comment 
that seemed unclear to me (in that I found it implied that the 
qemu_mutex_unlock_impl error would be replaced by this assertion failing).

Hanna

> +
>       job_progress_set_remaining(&s->common, 1);
>       ret = s->drv->bdrv_co_create(s->opts, errp);
>       job_progress_update(&s->common, 1);
Emanuele Giuseppe Esposito Dec. 17, 2021, 3:53 p.m. UTC | #2
On 16/12/2021 19:43, Hanna Reitz wrote:
> On 24.11.21 07:44, Emanuele Giuseppe Esposito wrote:
>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
>> ---
>>   block.c        | 18 ++++++++++++++++++
>>   block/create.c | 10 ++++++++++
>>   2 files changed, 28 insertions(+)
> 
> [...]
> 
>> diff --git a/block/create.c b/block/create.c
>> index 89812669df..0167118579 100644
>> --- a/block/create.c
>> +++ b/block/create.c
>> @@ -42,6 +42,16 @@ static int coroutine_fn blockdev_create_run(Job 
>> *job, Error **errp)
>>       BlockdevCreateJob *s = container_of(job, BlockdevCreateJob, 
>> common);
>>       int ret;
>> +    /*
>> +     * Currently there is nothing preventing this
>> +     * function from being called in an iothread context.
>> +     * However, since it will crash anyways because of the
>> +     * aiocontext lock not taken, we might as well make it
>> +     * crash with a more meaningful error, by checking that
>> +     * we are in the main loop
>> +     */
>> +    assert(qemu_in_main_thread());
> 
> Mostly agreed.  This function is always run in the main loop right now, 
> so this assertion will never fail.
> 
> But that’s the “mostly”: Adding this assertion won’t give a more 
> meaningful error, because the problem still remains that block drivers 
> do not error out when encountering (or correctly handle) BDSs in 
> non-main contexts, and so it remains a “qemu_mutex_unlock_impl: 
> Operation not permitted”.
> 
> Not trying to say that that’s your problem.  It’s pre-existing, and this 
> assertion is good.  Just wanting to clarify something about the comment 
> that seemed unclear to me (in that I found it implied that the 
> qemu_mutex_unlock_impl error would be replaced by this assertion failing).
> 

You are right. Trying your example given in v4:

$ touch /tmp/iothread-create-test.qcow2
$ ./qemu-system-x86_64 -object iothread,id=iothr0 -qmp stdio <<EOF
{"execute": "qmp_capabilities"}
{"execute":"blockdev-add","arguments":{"node-name":"proto","driver":"file","filename":"/tmp/iothread-create-test.qcow2"}}
{"execute":"x-blockdev-set-iothread","arguments":{"node-name":"proto","iothread":"iothr0"}}
{"execute":"blockdev-create","arguments":{"job-id":"create","options":{"driver":"qcow2","file":"proto","size":0}}}
EOF

I still get "qemu_mutex_unlock_impl: Operation not permitted"

I will remove the comment above the assertion, makes no sense.

Or should I replace it with a TODO/FIXME explaining the above? Something 
like:

/*
  * TODO: it is currently possible to run a blockdev-create job in an
  * I/O thread, for example by doing:
  * [ command line above ]
  * This should not be allowed.
  */

What do you think?

Emanuele


> 
>> +
>>       job_progress_set_remaining(&s->common, 1);
>>       ret = s->drv->bdrv_co_create(s->opts, errp);
>>       job_progress_update(&s->common, 1);
>
Hanna Czenczek Dec. 17, 2021, 4 p.m. UTC | #3
On 17.12.21 16:53, Emanuele Giuseppe Esposito wrote:
>
>
> On 16/12/2021 19:43, Hanna Reitz wrote:
>> On 24.11.21 07:44, Emanuele Giuseppe Esposito wrote:
>>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>>> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
>>> ---
>>>   block.c        | 18 ++++++++++++++++++
>>>   block/create.c | 10 ++++++++++
>>>   2 files changed, 28 insertions(+)
>>
>> [...]
>>
>>> diff --git a/block/create.c b/block/create.c
>>> index 89812669df..0167118579 100644
>>> --- a/block/create.c
>>> +++ b/block/create.c
>>> @@ -42,6 +42,16 @@ static int coroutine_fn blockdev_create_run(Job 
>>> *job, Error **errp)
>>>       BlockdevCreateJob *s = container_of(job, BlockdevCreateJob, 
>>> common);
>>>       int ret;
>>> +    /*
>>> +     * Currently there is nothing preventing this
>>> +     * function from being called in an iothread context.
>>> +     * However, since it will crash anyways because of the
>>> +     * aiocontext lock not taken, we might as well make it
>>> +     * crash with a more meaningful error, by checking that
>>> +     * we are in the main loop
>>> +     */
>>> +    assert(qemu_in_main_thread());
>>
>> Mostly agreed.  This function is always run in the main loop right 
>> now, so this assertion will never fail.
>>
>> But that’s the “mostly”: Adding this assertion won’t give a more 
>> meaningful error, because the problem still remains that block 
>> drivers do not error out when encountering (or correctly handle) BDSs 
>> in non-main contexts, and so it remains a “qemu_mutex_unlock_impl: 
>> Operation not permitted”.
>>
>> Not trying to say that that’s your problem.  It’s pre-existing, and 
>> this assertion is good.  Just wanting to clarify something about the 
>> comment that seemed unclear to me (in that I found it implied that 
>> the qemu_mutex_unlock_impl error would be replaced by this assertion 
>> failing).
>>
>
> You are right. Trying your example given in v4:
>
> $ touch /tmp/iothread-create-test.qcow2
> $ ./qemu-system-x86_64 -object iothread,id=iothr0 -qmp stdio <<EOF
> {"execute": "qmp_capabilities"}
> {"execute":"blockdev-add","arguments":{"node-name":"proto","driver":"file","filename":"/tmp/iothread-create-test.qcow2"}} 
>
> {"execute":"x-blockdev-set-iothread","arguments":{"node-name":"proto","iothread":"iothr0"}} 
>
> {"execute":"blockdev-create","arguments":{"job-id":"create","options":{"driver":"qcow2","file":"proto","size":0}}} 
>
> EOF
>
> I still get "qemu_mutex_unlock_impl: Operation not permitted"
>
> I will remove the comment above the assertion, makes no sense.
>
> Or should I replace it with a TODO/FIXME explaining the above? 
> Something like:
>
> /*
>  * TODO: it is currently possible to run a blockdev-create job in an
>  * I/O thread, for example by doing:
>  * [ command line above ]
>  * This should not be allowed.
>  */

Just removing it makes the most sense to me.  We already have a TODO 
comment to that effect (block/create.c:86).

(And to be precise, it is *not* possible to run a blockdev-create job in 
an I/O thread, it’s always run in the main thread.  It is possible to 
run a blockdev-create job involving nodes in I/O threads, and it’s fine 
that that’s possible, but in such cases the block drivers’ 
.bdrv_co_create() implementations should at least error out with a 
benign error, which they don’t.)

Hanna
diff mbox series

Patch

diff --git a/block.c b/block.c
index b77ab0a104..180884b8c0 100644
--- a/block.c
+++ b/block.c
@@ -526,6 +526,7 @@  static void coroutine_fn bdrv_create_co_entry(void *opaque)
 
     CreateCo *cco = opaque;
     assert(cco->drv);
+    assert(qemu_in_main_thread());
 
     ret = cco->drv->bdrv_co_create_opts(cco->drv,
                                         cco->filename, cco->opts, &local_err);
@@ -1090,6 +1091,7 @@  int refresh_total_sectors(BlockDriverState *bs, int64_t hint)
 static void bdrv_join_options(BlockDriverState *bs, QDict *options,
                               QDict *old_options)
 {
+    assert(qemu_in_main_thread());
     if (bs->drv && bs->drv->bdrv_join_options) {
         bs->drv->bdrv_join_options(options, old_options);
     } else {
@@ -1598,6 +1600,7 @@  static int bdrv_open_driver(BlockDriverState *bs, BlockDriver *drv,
 {
     Error *local_err = NULL;
     int i, ret;
+    assert(qemu_in_main_thread());
 
     bdrv_assign_node_name(bs, node_name, &local_err);
     if (local_err) {
@@ -1989,6 +1992,8 @@  static int bdrv_fill_options(QDict **options, const char *filename,
     BlockDriver *drv = NULL;
     Error *local_err = NULL;
 
+    assert(qemu_in_main_thread());
+
     /*
      * Caution: while qdict_get_try_str() is fine, getting non-string
      * types would require more care.  When @options come from
@@ -2182,6 +2187,7 @@  static void bdrv_child_perm(BlockDriverState *bs, BlockDriverState *child_bs,
                             uint64_t *nperm, uint64_t *nshared)
 {
     assert(bs->drv && bs->drv->bdrv_child_perm);
+    assert(qemu_in_main_thread());
     bs->drv->bdrv_child_perm(bs, c, role, reopen_queue,
                              parent_perm, parent_shared,
                              nperm, nshared);
@@ -2267,6 +2273,7 @@  static void bdrv_drv_set_perm_commit(void *opaque)
 {
     BlockDriverState *bs = opaque;
     uint64_t cumulative_perms, cumulative_shared_perms;
+    assert(qemu_in_main_thread());
 
     if (bs->drv->bdrv_set_perm) {
         bdrv_get_cumulative_perm(bs, &cumulative_perms,
@@ -2278,6 +2285,7 @@  static void bdrv_drv_set_perm_commit(void *opaque)
 static void bdrv_drv_set_perm_abort(void *opaque)
 {
     BlockDriverState *bs = opaque;
+    assert(qemu_in_main_thread());
 
     if (bs->drv->bdrv_abort_perm_update) {
         bs->drv->bdrv_abort_perm_update(bs);
@@ -2293,6 +2301,7 @@  static int bdrv_drv_set_perm(BlockDriverState *bs, uint64_t perm,
                              uint64_t shared_perm, Transaction *tran,
                              Error **errp)
 {
+    assert(qemu_in_main_thread());
     if (!bs->drv) {
         return 0;
     }
@@ -4357,6 +4366,7 @@  int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
 
     assert(qemu_get_current_aio_context() == qemu_get_aio_context());
     assert(bs_queue != NULL);
+    assert(qemu_in_main_thread());
 
     QTAILQ_FOREACH(bs_entry, bs_queue, entry) {
         ctx = bdrv_get_aio_context(bs_entry->state.bs);
@@ -4622,6 +4632,7 @@  static int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
 
     assert(reopen_state != NULL);
     assert(reopen_state->bs->drv != NULL);
+    assert(qemu_in_main_thread());
     drv = reopen_state->bs->drv;
 
     /* This function and each driver's bdrv_reopen_prepare() remove
@@ -4832,6 +4843,7 @@  static void bdrv_reopen_commit(BDRVReopenState *reopen_state)
     bs = reopen_state->bs;
     drv = bs->drv;
     assert(drv != NULL);
+    assert(qemu_in_main_thread());
 
     /* If there are any driver level actions to take */
     if (drv->bdrv_reopen_commit) {
@@ -4873,6 +4885,7 @@  static void bdrv_reopen_abort(BDRVReopenState *reopen_state)
     assert(reopen_state != NULL);
     drv = reopen_state->bs->drv;
     assert(drv != NULL);
+    assert(qemu_in_main_thread());
 
     if (drv->bdrv_reopen_abort) {
         drv->bdrv_reopen_abort(reopen_state);
@@ -4886,6 +4899,7 @@  static void bdrv_close(BlockDriverState *bs)
     BdrvChild *child, *next;
 
     assert(!bs->refcnt);
+    assert(qemu_in_main_thread());
 
     bdrv_drained_begin(bs); /* complete I/O */
     bdrv_flush(bs);
@@ -6671,6 +6685,8 @@  static int bdrv_inactivate_recurse(BlockDriverState *bs)
     int ret;
     uint64_t cumulative_perms, cumulative_shared_perms;
 
+    assert(qemu_in_main_thread());
+
     if (!bs->drv) {
         return -ENOMEDIUM;
     }
@@ -7180,6 +7196,7 @@  static void bdrv_detach_aio_context(BlockDriverState *bs)
     BdrvAioNotifier *baf, *baf_tmp;
 
     assert(!bs->walking_aio_notifiers);
+    assert(qemu_in_main_thread());
     bs->walking_aio_notifiers = true;
     QLIST_FOREACH_SAFE(baf, &bs->aio_notifiers, list, baf_tmp) {
         if (baf->deleted) {
@@ -7207,6 +7224,7 @@  static void bdrv_attach_aio_context(BlockDriverState *bs,
                                     AioContext *new_context)
 {
     BdrvAioNotifier *ban, *ban_tmp;
+    assert(qemu_in_main_thread());
 
     if (bs->quiesce_counter) {
         aio_disable_external(new_context);
diff --git a/block/create.c b/block/create.c
index 89812669df..0167118579 100644
--- a/block/create.c
+++ b/block/create.c
@@ -42,6 +42,16 @@  static int coroutine_fn blockdev_create_run(Job *job, Error **errp)
     BlockdevCreateJob *s = container_of(job, BlockdevCreateJob, common);
     int ret;
 
+    /*
+     * Currently there is nothing preventing this
+     * function from being called in an iothread context.
+     * However, since it will crash anyways because of the
+     * aiocontext lock not taken, we might as well make it
+     * crash with a more meaningful error, by checking that
+     * we are in the main loop
+     */
+    assert(qemu_in_main_thread());
+
     job_progress_set_remaining(&s->common, 1);
     ret = s->drv->bdrv_co_create(s->opts, errp);
     job_progress_update(&s->common, 1);