Message ID | 20211124140951.439684-1-vsementsov@virtuozzo.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC,for-6.2] block/nbd: forbid incompatible change of server options on reconnect | expand |
On Wed, Nov 24, 2021 at 03:09:51PM +0100, Vladimir Sementsov-Ogievskiy wrote: > Reconnect feature was never prepared to handle server options changed > on reconnect. Let's be stricter and check what exactly is changed. If > server capabilities just got richer don't worry. Otherwise fail and > drop the established connection. > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> > --- > > Hi all! The patch is probably good for 6.2. It's an RFC because I didn't > test it yet) But I want to early send, so that my proposed design be > available for discussion. We're cutting it awfully close. My justification for including it in -rc3 (if we like it) is that it is a lot easier to audit that we reject server downgrades than it is to audit whether we have a CVE because of a server downgrade across a reconnect. But it is not a new regression to 6.2, so slipping it to 7.0 (if we don't feel comfortable with the current iteration of the patch) is okay on that front. > > > include/block/nbd.h | 9 +++++ > nbd/client-connection.c | 86 +++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 95 insertions(+) > > diff --git a/include/block/nbd.h b/include/block/nbd.h > index 78d101b774..3d379b5539 100644 > --- a/include/block/nbd.h > +++ b/include/block/nbd.h > @@ -157,6 +157,10 @@ enum { > #define NBD_FLAG_SEND_RESIZE (1 << NBD_FLAG_SEND_RESIZE_BIT) > #define NBD_FLAG_SEND_CACHE (1 << NBD_FLAG_SEND_CACHE_BIT) > #define NBD_FLAG_SEND_FAST_ZERO (1 << NBD_FLAG_SEND_FAST_ZERO_BIT) > +/* > + * If you add any new NBD_FLAG_ flag, check that logic in > + * nbd_is_new_info_compatible() is still good about handling flags. > + */ > > /* New-style handshake (global) flags, sent from server to client, and > control what will happen during handshake phase. */ > @@ -305,6 +309,11 @@ struct NBDExportInfo { > > uint32_t context_id; > > + /* > + * WARNING! when add any new field to the structure, don't forget to check adding > + * and updated nbd_is_new_info_compatible() function. update the > + */ Odd that one comment has WARNING! and the other does not. > + > /* Set by server results during nbd_receive_export_list() */ > char *description; > int n_contexts; > diff --git a/nbd/client-connection.c b/nbd/client-connection.c > index 695f855754..2d66993632 100644 > --- a/nbd/client-connection.c > +++ b/nbd/client-connection.c > @@ -37,6 +37,10 @@ struct NBDClientConnection { > bool do_negotiation; > bool do_retry; > > + /* Used only by connection thread, no need in locking the mutex */ s/no need in locking the mutex/does not need mutex protection/ > + bool has_prev_info; > + NBDExportInfo prev_info; > + > QemuMutex mutex; > > /* > @@ -160,6 +164,67 @@ static int nbd_connect(QIOChannelSocket *sioc, SocketAddress *addr, > return 0; > } > > +static bool nbd_is_new_info_compatible(NBDExportInfo *old, NBDExportInfo *new, > + Error **errp) > +{ > + uint32_t dropped_flags; > + > + if (old->structured_reply && !new->structured_reply) { > + error_setg(errp, "Server options degrade after reconnect: " degraded > + "structured_reply is not supported anymore"); > + return false; > + } > + > + if (old->base_allocation && !new->base_allocation) { > + error_setg(errp, "Server options degrade after reconnect: " degraded > + "base_allocation is not supported anymore"); > + return false; > + } Do we also need to insist that the context id value be identical, or can our code gracefully deal with it being different? We don't ever send the context id, so even if we retry a CMD_BLOCK_STATUS, our real risk is whether we will reject the new server's reply because it used a different id than we were expecting. > + > + if (old->size != new->size) { > + error_setg(errp, "NBD export size changed after reconnect"); > + return false; > + } > + > + /* > + * No worry if rotational status changed. But other flags are feature flags, > + * they should not degrade. > + */ > + dropped_flags = (old->flags & ~new->flags) & ~NBD_FLAG_ROTATIONAL; > + if (dropped_flags) { > + error_setg(errp, "Server options degrade after reconnect: flags 0x%" degraded > + PRIx32 " are not reported anymore", dropped_flags); > + return false; > + } > + > + if (new->min_block > old->min_block) { > + error_setg(errp, "Server requires more strict min_block after " > + "reconnect: %" PRIu32 " instead of %" PRIu32, > + new->min_block, old->min_block); > + return false; > + } Good... > + if (new->min_block && (old->min_block % new->min_block)) { > + error_setg(errp, "Server requires new min_block %" PRIu32 > + " after reconnect, incompatible with old one %" PRIu32, > + new->min_block, old->min_block); > + return false; > + } ...but why is this one necessary? Since min_block has to be a power of 2, and you just proved that new->min_block <= old->min_block above, this condition will always be false. > + > + if (new->max_block < old->max_block) { > + error_setg(errp, "Server requires more strict max_block after " > + "reconnect: %" PRIu32 " instead of %" PRIu32, > + new->max_block, old->max_block); > + return false; > + } > + > + if (old->context_id != new->context_id) { > + error_setg(errp, "Meta context id changed after reconnect"); > + return false; > + } Oh, this answers my question above. We should put this near where we check above. And this check should only be performed if base_allocation was supported in the old server (if the old server lacks it and the new server supports it, the ids may be differ, but that's an upgrade, not a downgrade, and we don't care). > + > + return true; > +} > + > static void *connect_thread_func(void *opaque) > { > NBDClientConnection *conn = opaque; > @@ -183,6 +248,27 @@ static void *connect_thread_func(void *opaque) > conn->do_negotiation ? &conn->updated_info : NULL, > conn->tlscreds, &conn->ioc, &conn->err); > > + if (ret == 0) { > + if (conn->has_prev_info && > + !nbd_is_new_info_compatible(&conn->prev_info, > + &conn->updated_info, &conn->err)) > + { > + NBDRequest request = { .type = NBD_CMD_DISC }; > + QIOChannel *ioc = conn->ioc ?: QIO_CHANNEL(conn->sioc); > + > + nbd_send_request(ioc, &request); > + qio_channel_close(ioc, NULL); > + > + object_unref(OBJECT(conn->ioc)); > + conn->ioc = NULL; > + > + ret = -EINVAL; > + } else { > + conn->prev_info = conn->updated_info; > + conn->has_prev_info = true; > + } > + } > + > /* > * conn->updated_info will finally be returned to the user. Clear the > * pointers to our internally allocated strings, which are IN parameters > -- > 2.31.1 > Looks like it is on the right track.
On Wed, Nov 24, 2021 at 03:09:51PM +0100, Vladimir Sementsov-Ogievskiy wrote: > Reconnect feature was never prepared to handle server options changed > on reconnect. Let's be stricter and check what exactly is changed. If > server capabilities just got richer don't worry. Otherwise fail and > drop the established connection. > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> > --- > + /* > + * No worry if rotational status changed. But other flags are feature flags, > + * they should not degrade. > + */ > + dropped_flags = (old->flags & ~new->flags) & ~NBD_FLAG_ROTATIONAL; > + if (dropped_flags) { > + error_setg(errp, "Server options degrade after reconnect: flags 0x%" > + PRIx32 " are not reported anymore", dropped_flags); > + return false; > + } Your logic is good for most flags, but somewhat wrong for NBD_FLAG_READ_ONLY_BIT. For cases where we are only using the block device read-only, we don't care about changes of that bit, in either direction. But for cases where we want to use the block device read-write, the bit changing from clear in the old to set in the new server is an incompatible change that your logic failed to flag.
29.11.2021 20:34, Eric Blake wrote: > On Wed, Nov 24, 2021 at 03:09:51PM +0100, Vladimir Sementsov-Ogievskiy wrote: >> Reconnect feature was never prepared to handle server options changed >> on reconnect. Let's be stricter and check what exactly is changed. If >> server capabilities just got richer don't worry. Otherwise fail and >> drop the established connection. >> >> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> >> --- >> >> Hi all! The patch is probably good for 6.2. It's an RFC because I didn't >> test it yet) But I want to early send, so that my proposed design be >> available for discussion. > > We're cutting it awfully close. My justification for including it in > -rc3 (if we like it) is that it is a lot easier to audit that we > reject server downgrades than it is to audit whether we have a CVE > because of a server downgrade across a reconnect. But it is not a new > regression to 6.2, so slipping it to 7.0 (if we don't feel comfortable > with the current iteration of the patch) is okay on that front. > >> >> >> include/block/nbd.h | 9 +++++ >> nbd/client-connection.c | 86 +++++++++++++++++++++++++++++++++++++++++ >> 2 files changed, 95 insertions(+) >> >> diff --git a/include/block/nbd.h b/include/block/nbd.h >> index 78d101b774..3d379b5539 100644 >> --- a/include/block/nbd.h >> +++ b/include/block/nbd.h >> @@ -157,6 +157,10 @@ enum { >> #define NBD_FLAG_SEND_RESIZE (1 << NBD_FLAG_SEND_RESIZE_BIT) >> #define NBD_FLAG_SEND_CACHE (1 << NBD_FLAG_SEND_CACHE_BIT) >> #define NBD_FLAG_SEND_FAST_ZERO (1 << NBD_FLAG_SEND_FAST_ZERO_BIT) >> +/* >> + * If you add any new NBD_FLAG_ flag, check that logic in >> + * nbd_is_new_info_compatible() is still good about handling flags. >> + */ >> >> /* New-style handshake (global) flags, sent from server to client, and >> control what will happen during handshake phase. */ >> @@ -305,6 +309,11 @@ struct NBDExportInfo { >> >> uint32_t context_id; >> >> + /* >> + * WARNING! when add any new field to the structure, don't forget to check > > adding > >> + * and updated nbd_is_new_info_compatible() function. > > update the > >> + */ > > Odd that one comment has WARNING! and the other does not. > >> + >> /* Set by server results during nbd_receive_export_list() */ >> char *description; >> int n_contexts; >> diff --git a/nbd/client-connection.c b/nbd/client-connection.c >> index 695f855754..2d66993632 100644 >> --- a/nbd/client-connection.c >> +++ b/nbd/client-connection.c >> @@ -37,6 +37,10 @@ struct NBDClientConnection { >> bool do_negotiation; >> bool do_retry; >> >> + /* Used only by connection thread, no need in locking the mutex */ > > s/no need in locking the mutex/does not need mutex protection/ > >> + bool has_prev_info; >> + NBDExportInfo prev_info; >> + >> QemuMutex mutex; >> >> /* >> @@ -160,6 +164,67 @@ static int nbd_connect(QIOChannelSocket *sioc, SocketAddress *addr, >> return 0; >> } >> >> +static bool nbd_is_new_info_compatible(NBDExportInfo *old, NBDExportInfo *new, >> + Error **errp) >> +{ >> + uint32_t dropped_flags; >> + >> + if (old->structured_reply && !new->structured_reply) { >> + error_setg(errp, "Server options degrade after reconnect: " > > degraded > >> + "structured_reply is not supported anymore"); >> + return false; >> + } >> + >> + if (old->base_allocation && !new->base_allocation) { >> + error_setg(errp, "Server options degrade after reconnect: " > > degraded > >> + "base_allocation is not supported anymore"); >> + return false; >> + } > > Do we also need to insist that the context id value be identical, or > can our code gracefully deal with it being different? We don't ever > send the context id, so even if we retry a CMD_BLOCK_STATUS, our real > risk is whether we will reject the new server's reply because it used > a different id than we were expecting. > >> + >> + if (old->size != new->size) { >> + error_setg(errp, "NBD export size changed after reconnect"); >> + return false; >> + } >> + >> + /* >> + * No worry if rotational status changed. But other flags are feature flags, >> + * they should not degrade. >> + */ >> + dropped_flags = (old->flags & ~new->flags) & ~NBD_FLAG_ROTATIONAL; >> + if (dropped_flags) { >> + error_setg(errp, "Server options degrade after reconnect: flags 0x%" > > degraded > >> + PRIx32 " are not reported anymore", dropped_flags); >> + return false; >> + } >> + >> + if (new->min_block > old->min_block) { >> + error_setg(errp, "Server requires more strict min_block after " >> + "reconnect: %" PRIu32 " instead of %" PRIu32, >> + new->min_block, old->min_block); >> + return false; >> + } > > Good... > >> + if (new->min_block && (old->min_block % new->min_block)) { >> + error_setg(errp, "Server requires new min_block %" PRIu32 >> + " after reconnect, incompatible with old one %" PRIu32, >> + new->min_block, old->min_block); >> + return false; >> + } > > ...but why is this one necessary? Since min_block has to be a power > of 2, and you just proved that new->min_block <= old->min_block above, > this condition will always be false. Ah yes, if they all are power of two, than that's an extra check. > >> + >> + if (new->max_block < old->max_block) { >> + error_setg(errp, "Server requires more strict max_block after " >> + "reconnect: %" PRIu32 " instead of %" PRIu32, >> + new->max_block, old->max_block); >> + return false; >> + } >> + >> + if (old->context_id != new->context_id) { >> + error_setg(errp, "Meta context id changed after reconnect"); >> + return false; >> + } > > Oh, this answers my question above. We should put this near where we > check above. And this check should only be performed if > base_allocation was supported in the old server (if the old server > lacks it and the new server supports it, the ids may be differ, but > that's an upgrade, not a downgrade, and we don't care). OK. I think, I tried to follow ordering of fields in the structure, but logic is more important of course. > >> + >> + return true; >> +} >> + >> static void *connect_thread_func(void *opaque) >> { >> NBDClientConnection *conn = opaque; >> @@ -183,6 +248,27 @@ static void *connect_thread_func(void *opaque) >> conn->do_negotiation ? &conn->updated_info : NULL, >> conn->tlscreds, &conn->ioc, &conn->err); >> >> + if (ret == 0) { >> + if (conn->has_prev_info && >> + !nbd_is_new_info_compatible(&conn->prev_info, >> + &conn->updated_info, &conn->err)) >> + { >> + NBDRequest request = { .type = NBD_CMD_DISC }; >> + QIOChannel *ioc = conn->ioc ?: QIO_CHANNEL(conn->sioc); >> + >> + nbd_send_request(ioc, &request); >> + qio_channel_close(ioc, NULL); >> + >> + object_unref(OBJECT(conn->ioc)); >> + conn->ioc = NULL; >> + >> + ret = -EINVAL; >> + } else { >> + conn->prev_info = conn->updated_info; >> + conn->has_prev_info = true; >> + } >> + } >> + >> /* >> * conn->updated_info will finally be returned to the user. Clear the >> * pointers to our internally allocated strings, which are IN parameters >> -- >> 2.31.1 >> > > Looks like it is on the right track. >
29.11.2021 22:16, Eric Blake wrote: > On Wed, Nov 24, 2021 at 03:09:51PM +0100, Vladimir Sementsov-Ogievskiy wrote: >> Reconnect feature was never prepared to handle server options changed >> on reconnect. Let's be stricter and check what exactly is changed. If >> server capabilities just got richer don't worry. Otherwise fail and >> drop the established connection. >> >> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> >> --- >> + /* >> + * No worry if rotational status changed. But other flags are feature flags, >> + * they should not degrade. >> + */ >> + dropped_flags = (old->flags & ~new->flags) & ~NBD_FLAG_ROTATIONAL; >> + if (dropped_flags) { >> + error_setg(errp, "Server options degrade after reconnect: flags 0x%" >> + PRIx32 " are not reported anymore", dropped_flags); >> + return false; >> + } > > Your logic is good for most flags, but somewhat wrong for > NBD_FLAG_READ_ONLY_BIT. For cases where we are only using the block > device read-only, we don't care about changes of that bit, in either > direction. But for cases where we want to use the block device > read-write, the bit changing from clear in the old to set in the new > server is an incompatible change that your logic failed to flag. > Oh right! Will fix it and resend soon.
diff --git a/include/block/nbd.h b/include/block/nbd.h index 78d101b774..3d379b5539 100644 --- a/include/block/nbd.h +++ b/include/block/nbd.h @@ -157,6 +157,10 @@ enum { #define NBD_FLAG_SEND_RESIZE (1 << NBD_FLAG_SEND_RESIZE_BIT) #define NBD_FLAG_SEND_CACHE (1 << NBD_FLAG_SEND_CACHE_BIT) #define NBD_FLAG_SEND_FAST_ZERO (1 << NBD_FLAG_SEND_FAST_ZERO_BIT) +/* + * If you add any new NBD_FLAG_ flag, check that logic in + * nbd_is_new_info_compatible() is still good about handling flags. + */ /* New-style handshake (global) flags, sent from server to client, and control what will happen during handshake phase. */ @@ -305,6 +309,11 @@ struct NBDExportInfo { uint32_t context_id; + /* + * WARNING! when add any new field to the structure, don't forget to check + * and updated nbd_is_new_info_compatible() function. + */ + /* Set by server results during nbd_receive_export_list() */ char *description; int n_contexts; diff --git a/nbd/client-connection.c b/nbd/client-connection.c index 695f855754..2d66993632 100644 --- a/nbd/client-connection.c +++ b/nbd/client-connection.c @@ -37,6 +37,10 @@ struct NBDClientConnection { bool do_negotiation; bool do_retry; + /* Used only by connection thread, no need in locking the mutex */ + bool has_prev_info; + NBDExportInfo prev_info; + QemuMutex mutex; /* @@ -160,6 +164,67 @@ static int nbd_connect(QIOChannelSocket *sioc, SocketAddress *addr, return 0; } +static bool nbd_is_new_info_compatible(NBDExportInfo *old, NBDExportInfo *new, + Error **errp) +{ + uint32_t dropped_flags; + + if (old->structured_reply && !new->structured_reply) { + error_setg(errp, "Server options degrade after reconnect: " + "structured_reply is not supported anymore"); + return false; + } + + if (old->base_allocation && !new->base_allocation) { + error_setg(errp, "Server options degrade after reconnect: " + "base_allocation is not supported anymore"); + return false; + } + + if (old->size != new->size) { + error_setg(errp, "NBD export size changed after reconnect"); + return false; + } + + /* + * No worry if rotational status changed. But other flags are feature flags, + * they should not degrade. + */ + dropped_flags = (old->flags & ~new->flags) & ~NBD_FLAG_ROTATIONAL; + if (dropped_flags) { + error_setg(errp, "Server options degrade after reconnect: flags 0x%" + PRIx32 " are not reported anymore", dropped_flags); + return false; + } + + if (new->min_block > old->min_block) { + error_setg(errp, "Server requires more strict min_block after " + "reconnect: %" PRIu32 " instead of %" PRIu32, + new->min_block, old->min_block); + return false; + } + if (new->min_block && (old->min_block % new->min_block)) { + error_setg(errp, "Server requires new min_block %" PRIu32 + " after reconnect, incompatible with old one %" PRIu32, + new->min_block, old->min_block); + return false; + } + + if (new->max_block < old->max_block) { + error_setg(errp, "Server requires more strict max_block after " + "reconnect: %" PRIu32 " instead of %" PRIu32, + new->max_block, old->max_block); + return false; + } + + if (old->context_id != new->context_id) { + error_setg(errp, "Meta context id changed after reconnect"); + return false; + } + + return true; +} + static void *connect_thread_func(void *opaque) { NBDClientConnection *conn = opaque; @@ -183,6 +248,27 @@ static void *connect_thread_func(void *opaque) conn->do_negotiation ? &conn->updated_info : NULL, conn->tlscreds, &conn->ioc, &conn->err); + if (ret == 0) { + if (conn->has_prev_info && + !nbd_is_new_info_compatible(&conn->prev_info, + &conn->updated_info, &conn->err)) + { + NBDRequest request = { .type = NBD_CMD_DISC }; + QIOChannel *ioc = conn->ioc ?: QIO_CHANNEL(conn->sioc); + + nbd_send_request(ioc, &request); + qio_channel_close(ioc, NULL); + + object_unref(OBJECT(conn->ioc)); + conn->ioc = NULL; + + ret = -EINVAL; + } else { + conn->prev_info = conn->updated_info; + conn->has_prev_info = true; + } + } + /* * conn->updated_info will finally be returned to the user. Clear the * pointers to our internally allocated strings, which are IN parameters
Reconnect feature was never prepared to handle server options changed on reconnect. Let's be stricter and check what exactly is changed. If server capabilities just got richer don't worry. Otherwise fail and drop the established connection. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> --- Hi all! The patch is probably good for 6.2. It's an RFC because I didn't test it yet) But I want to early send, so that my proposed design be available for discussion. include/block/nbd.h | 9 +++++ nbd/client-connection.c | 86 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 95 insertions(+)