Message ID | 20210409160406.1800272-3-rvkagan@yandex-team.ru (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | block/nbd: assorted bugfixes | expand |
09.04.2021 19:04, Roman Kagan wrote: > Simplify lifetime management of BDRVNBDState->connection_thread by > delaying the possible cleanup of it until the BDRVNBDState itself goes > away. > > This also fixes possible use-after-free in nbd_co_establish_connection > when it races with nbd_co_establish_connection_cancel. > > Signed-off-by: Roman Kagan<rvkagan@yandex-team.ru> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
10.04.2021 11:06, Vladimir Sementsov-Ogievskiy wrote: > 09.04.2021 19:04, Roman Kagan wrote: >> Simplify lifetime management of BDRVNBDState->connection_thread by >> delaying the possible cleanup of it until the BDRVNBDState itself goes >> away. >> >> This also fixes possible use-after-free in nbd_co_establish_connection >> when it races with nbd_co_establish_connection_cancel. >> >> Signed-off-by: Roman Kagan<rvkagan@yandex-team.ru> > > Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> > Ha stop, it crashes iotest 51, as nbd_clear_bdrvstate is called also from nbd_process_options. And this shows that we also do wrong thing when simply return from two ifs pre-patch (and one after-patch). Yes, after successful nbd_process options we should call nbd_clear_bdrvstate() on failure path.
10.04.2021 11:38, Vladimir Sementsov-Ogievskiy wrote: > 10.04.2021 11:06, Vladimir Sementsov-Ogievskiy wrote: >> 09.04.2021 19:04, Roman Kagan wrote: >>> Simplify lifetime management of BDRVNBDState->connection_thread by >>> delaying the possible cleanup of it until the BDRVNBDState itself goes >>> away. >>> >>> This also fixes possible use-after-free in nbd_co_establish_connection >>> when it races with nbd_co_establish_connection_cancel. >>> >>> Signed-off-by: Roman Kagan<rvkagan@yandex-team.ru> >> >> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> >> > > Ha stop, it crashes iotest 51, as nbd_clear_bdrvstate is called also from nbd_process_options. > > And this shows that we also do wrong thing when simply return from two ifs pre-patch (and one after-patch). Yes, after successful nbd_process options we should call nbd_clear_bdrvstate() on failure path. > And also it shows that patch is more difficult than it seems. I still think that for 6.0 we should take a simple use-after-free-only fix, and don't care about anything else.
On Sat, Apr 10, 2021 at 12:56:34PM +0300, Vladimir Sementsov-Ogievskiy wrote: > 10.04.2021 11:38, Vladimir Sementsov-Ogievskiy wrote: > > 10.04.2021 11:06, Vladimir Sementsov-Ogievskiy wrote: > > > 09.04.2021 19:04, Roman Kagan wrote: > > > > Simplify lifetime management of BDRVNBDState->connection_thread by > > > > delaying the possible cleanup of it until the BDRVNBDState itself goes > > > > away. > > > > > > > > This also fixes possible use-after-free in nbd_co_establish_connection > > > > when it races with nbd_co_establish_connection_cancel. > > > > > > > > Signed-off-by: Roman Kagan<rvkagan@yandex-team.ru> > > > > > > Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> > > > > > > > Ha stop, it crashes iotest 51, as nbd_clear_bdrvstate is called also from nbd_process_options. > > > > And this shows that we also do wrong thing when simply return from two ifs pre-patch (and one after-patch). Yes, after successful nbd_process options we should call nbd_clear_bdrvstate() on failure path. The test didn't crash at me somehow but you're right there's bug here. > And also it shows that patch is more difficult than it seems. I still think that for 6.0 we should take a simple use-after-free-only fix, and don't care about anything else. I agree it turned out more complex than apporpriate for 6.0. Let's proceed with the one you've posted. Thanks, Roman.
diff --git a/block/nbd.c b/block/nbd.c index d86df3afcb..6e3879c0c5 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -144,16 +144,31 @@ typedef struct BDRVNBDState { NBDConnectThread *connect_thread; } BDRVNBDState; +static void nbd_free_connect_thread(NBDConnectThread *thr); static int nbd_establish_connection(BlockDriverState *bs, SocketAddress *saddr, Error **errp); static int nbd_co_establish_connection(BlockDriverState *bs, Error **errp); -static void nbd_co_establish_connection_cancel(BlockDriverState *bs, - bool detach); +static void nbd_co_establish_connection_cancel(BlockDriverState *bs); static int nbd_client_handshake(BlockDriverState *bs, Error **errp); static void nbd_yank(void *opaque); static void nbd_clear_bdrvstate(BDRVNBDState *s) { + NBDConnectThread *thr = s->connect_thread; + bool thr_running; + + qemu_mutex_lock(&thr->mutex); + thr_running = thr->state == CONNECT_THREAD_RUNNING; + if (thr_running) { + thr->state = CONNECT_THREAD_RUNNING_DETACHED; + } + qemu_mutex_unlock(&thr->mutex); + + /* the runaway thread will clean it up itself */ + if (!thr_running) { + nbd_free_connect_thread(thr); + } + object_unref(OBJECT(s->tlscreds)); qapi_free_SocketAddress(s->saddr); s->saddr = NULL; @@ -293,7 +308,7 @@ static void coroutine_fn nbd_client_co_drain_begin(BlockDriverState *bs) qemu_co_sleep_wake(s->connection_co_sleep_ns_state); } - nbd_co_establish_connection_cancel(bs, false); + nbd_co_establish_connection_cancel(bs); reconnect_delay_timer_del(s); @@ -333,7 +348,7 @@ static void nbd_teardown_connection(BlockDriverState *bs) if (s->connection_co_sleep_ns_state) { qemu_co_sleep_wake(s->connection_co_sleep_ns_state); } - nbd_co_establish_connection_cancel(bs, true); + nbd_co_establish_connection_cancel(bs); } if (qemu_in_coroutine()) { s->teardown_co = qemu_coroutine_self(); @@ -534,18 +549,12 @@ nbd_co_establish_connection(BlockDriverState *bs, Error **errp) * nbd_co_establish_connection_cancel * Cancel nbd_co_establish_connection asynchronously: it will finish soon, to * allow drained section to begin. - * - * If detach is true, also cleanup the state (or if thread is running, move it - * to CONNECT_THREAD_RUNNING_DETACHED state). s->connect_thread becomes NULL if - * detach is true. */ -static void nbd_co_establish_connection_cancel(BlockDriverState *bs, - bool detach) +static void nbd_co_establish_connection_cancel(BlockDriverState *bs) { BDRVNBDState *s = bs->opaque; NBDConnectThread *thr = s->connect_thread; bool wake = false; - bool do_free = false; qemu_mutex_lock(&thr->mutex); @@ -556,21 +565,10 @@ static void nbd_co_establish_connection_cancel(BlockDriverState *bs, s->wait_connect = false; wake = true; } - if (detach) { - thr->state = CONNECT_THREAD_RUNNING_DETACHED; - s->connect_thread = NULL; - } - } else if (detach) { - do_free = true; } qemu_mutex_unlock(&thr->mutex); - if (do_free) { - nbd_free_connect_thread(thr); - s->connect_thread = NULL; - } - if (wake) { aio_co_wake(s->connection_co); } @@ -2294,31 +2292,33 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags, return -EEXIST; } + nbd_init_connect_thread(s); + /* * establish TCP connection, return error if it fails * TODO: Configurable retry-until-timeout behaviour. */ if (nbd_establish_connection(bs, s->saddr, errp) < 0) { - yank_unregister_instance(BLOCKDEV_YANK_INSTANCE(bs->node_name)); - return -ECONNREFUSED; + ret = -ECONNREFUSED; + goto fail; } ret = nbd_client_handshake(bs, errp); if (ret < 0) { - yank_unregister_instance(BLOCKDEV_YANK_INSTANCE(bs->node_name)); - nbd_clear_bdrvstate(s); - return ret; + goto fail; } /* successfully connected */ s->state = NBD_CLIENT_CONNECTED; - nbd_init_connect_thread(s); - s->connection_co = qemu_coroutine_create(nbd_connection_entry, s); bdrv_inc_in_flight(bs); aio_co_schedule(bdrv_get_aio_context(bs), s->connection_co); return 0; +fail: + yank_unregister_instance(BLOCKDEV_YANK_INSTANCE(bs->node_name)); + nbd_clear_bdrvstate(s); + return ret; } static int nbd_co_flush(BlockDriverState *bs)
Simplify lifetime management of BDRVNBDState->connection_thread by delaying the possible cleanup of it until the BDRVNBDState itself goes away. This also fixes possible use-after-free in nbd_co_establish_connection when it races with nbd_co_establish_connection_cancel. Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> --- This is a more future-proof version of the fix for use-after-free but less intrusive than Vladimir's series so that it can go in 6.0. block/nbd.c | 58 ++++++++++++++++++++++++++--------------------------- 1 file changed, 29 insertions(+), 29 deletions(-)