Message ID | 20211111153354.18807-1-rvkagan@yandex-team.ru (mailing list archive) |
---|---|
Headers | show |
Series | vhost: stick to -errno error return convention | expand |
On Thu, Nov 11, 2021 at 06:33:44PM +0300, Roman Kagan wrote: > Error propagation between the generic vhost code and the specific backends is > not quite consistent: some places follow "return -1 and set errno" convention, > while others assume "return negated errno". Furthermore, not enough care is > taken not to clobber errno. > > As a result, on certain code paths the errno resulting from a failure may get > overridden by another function call, and then that zero errno inidicating > success is propagated up the stack, leading to failures being lost. In > particular, we've seen errors in the communication with a vhost-user-blk slave > not trigger an immediate connection drop and reconnection, leaving it in a > broken state. > > Rework error propagation to always return negated errno on errors and > correctly pass it up the stack. Looks like something we want post release. I'll tag it but pls ping me after the release to help make sure it's not lost. > Roman Kagan (10): > vhost-user-blk: reconnect on any error during realize > chardev/char-socket: tcp_chr_recv: don't clobber errno > chardev/char-socket: tcp_chr_sync_read: don't clobber errno > chardev/char-fe: don't allow EAGAIN from blocking read > vhost-backend: avoid overflow on memslots_limit > vhost-backend: stick to -errno error return convention > vhost-vdpa: stick to -errno error return convention > vhost-user: stick to -errno error return convention > vhost: stick to -errno error return convention > vhost-user-blk: propagate error return from generic vhost > > chardev/char-fe.c | 7 +- > chardev/char-socket.c | 17 +- > hw/block/vhost-user-blk.c | 4 +- > hw/virtio/vhost-backend.c | 4 +- > hw/virtio/vhost-user.c | 401 +++++++++++++++++++++----------------- > hw/virtio/vhost-vdpa.c | 37 ++-- > hw/virtio/vhost.c | 98 +++++----- > 7 files changed, 307 insertions(+), 261 deletions(-) > > -- > 2.33.1 >
On Thu, Nov 11, 2021 at 03:14:56PM -0500, Michael S. Tsirkin wrote: > On Thu, Nov 11, 2021 at 06:33:44PM +0300, Roman Kagan wrote: > > Error propagation between the generic vhost code and the specific backends is > > not quite consistent: some places follow "return -1 and set errno" convention, > > while others assume "return negated errno". Furthermore, not enough care is > > taken not to clobber errno. > > > > As a result, on certain code paths the errno resulting from a failure may get > > overridden by another function call, and then that zero errno inidicating > > success is propagated up the stack, leading to failures being lost. In > > particular, we've seen errors in the communication with a vhost-user-blk slave > > not trigger an immediate connection drop and reconnection, leaving it in a > > broken state. > > > > Rework error propagation to always return negated errno on errors and > > correctly pass it up the stack. > > Looks like something we want post release. I'll tag it > but pls ping me after the release to help make sure > it's not lost. It doesn't introduce new features so I guess it might qualify for rc0, but the churn is somewhat too big indeed. OK I'll reiterate once 6.2 is out; meanwhile if anyone has spare cycles to review it, it'll be much appreciated. Thanks, Roman. > > > > Roman Kagan (10): > > vhost-user-blk: reconnect on any error during realize > > chardev/char-socket: tcp_chr_recv: don't clobber errno > > chardev/char-socket: tcp_chr_sync_read: don't clobber errno > > chardev/char-fe: don't allow EAGAIN from blocking read > > vhost-backend: avoid overflow on memslots_limit > > vhost-backend: stick to -errno error return convention > > vhost-vdpa: stick to -errno error return convention > > vhost-user: stick to -errno error return convention > > vhost: stick to -errno error return convention > > vhost-user-blk: propagate error return from generic vhost > > > > chardev/char-fe.c | 7 +- > > chardev/char-socket.c | 17 +- > > hw/block/vhost-user-blk.c | 4 +- > > hw/virtio/vhost-backend.c | 4 +- > > hw/virtio/vhost-user.c | 401 +++++++++++++++++++++----------------- > > hw/virtio/vhost-vdpa.c | 37 ++-- > > hw/virtio/vhost.c | 98 +++++----- > > 7 files changed, 307 insertions(+), 261 deletions(-) > > > > -- > > 2.33.1 > > >
On Thu, Nov 11, 2021 at 06:33:44PM +0300, Roman Kagan wrote: > Error propagation between the generic vhost code and the specific backends is > not quite consistent: some places follow "return -1 and set errno" convention, > while others assume "return negated errno". Furthermore, not enough care is > taken not to clobber errno. > > As a result, on certain code paths the errno resulting from a failure may get > overridden by another function call, and then that zero errno inidicating > success is propagated up the stack, leading to failures being lost. In > particular, we've seen errors in the communication with a vhost-user-blk slave > not trigger an immediate connection drop and reconnection, leaving it in a > broken state. > > Rework error propagation to always return negated errno on errors and > correctly pass it up the stack. Hi Roman, if there are bugfixes here I'll be happy to take them right now. The wholesale rework seems inappropriate for 6.2, I'll be happy to tag it for after 6.2. Pls ping me aftre release to help make sure it's not lost. > Roman Kagan (10): > vhost-user-blk: reconnect on any error during realize > chardev/char-socket: tcp_chr_recv: don't clobber errno > chardev/char-socket: tcp_chr_sync_read: don't clobber errno > chardev/char-fe: don't allow EAGAIN from blocking read > vhost-backend: avoid overflow on memslots_limit > vhost-backend: stick to -errno error return convention > vhost-vdpa: stick to -errno error return convention > vhost-user: stick to -errno error return convention > vhost: stick to -errno error return convention > vhost-user-blk: propagate error return from generic vhost > > chardev/char-fe.c | 7 +- > chardev/char-socket.c | 17 +- > hw/block/vhost-user-blk.c | 4 +- > hw/virtio/vhost-backend.c | 4 +- > hw/virtio/vhost-user.c | 401 +++++++++++++++++++++----------------- > hw/virtio/vhost-vdpa.c | 37 ++-- > hw/virtio/vhost.c | 98 +++++----- > 7 files changed, 307 insertions(+), 261 deletions(-) > > -- > 2.33.1 >
On Sun, Nov 28, 2021 at 04:47:20PM -0500, Michael S. Tsirkin wrote: > On Thu, Nov 11, 2021 at 06:33:44PM +0300, Roman Kagan wrote: > > Error propagation between the generic vhost code and the specific backends is > > not quite consistent: some places follow "return -1 and set errno" convention, > > while others assume "return negated errno". Furthermore, not enough care is > > taken not to clobber errno. > > > > As a result, on certain code paths the errno resulting from a failure may get > > overridden by another function call, and then that zero errno inidicating > > success is propagated up the stack, leading to failures being lost. In > > particular, we've seen errors in the communication with a vhost-user-blk slave > > not trigger an immediate connection drop and reconnection, leaving it in a > > broken state. > > > > Rework error propagation to always return negated errno on errors and > > correctly pass it up the stack. > > Hi Roman, > if there are bugfixes here I'll be happy to take them right now. > The wholesale rework seems inappropriate for 6.2, I'll be > happy to tag it for after 6.2. Pls ping me aftre release to help > make sure it's not lost. All these patches are bugfixes in one way or another. That said, none of the problems being addressed are recent regressions. OTOH the patches introduce non-zero churn and change behavior on some error paths, so I'd suggest to postpone the whole series till after 6.2 is out. Thanks, Roman.
On Thu, Nov 11, 2021 at 06:33:44PM +0300, Roman Kagan wrote: > Error propagation between the generic vhost code and the specific backends is > not quite consistent: some places follow "return -1 and set errno" convention, > while others assume "return negated errno". Furthermore, not enough care is > taken not to clobber errno. > > As a result, on certain code paths the errno resulting from a failure may get > overridden by another function call, and then that zero errno inidicating > success is propagated up the stack, leading to failures being lost. In > particular, we've seen errors in the communication with a vhost-user-blk slave > not trigger an immediate connection drop and reconnection, leaving it in a > broken state. > > Rework error propagation to always return negated errno on errors and > correctly pass it up the stack. > > Roman Kagan (10): > vhost-user-blk: reconnect on any error during realize > chardev/char-socket: tcp_chr_recv: don't clobber errno > chardev/char-socket: tcp_chr_sync_read: don't clobber errno > chardev/char-fe: don't allow EAGAIN from blocking read So I dropped this one. If you are so inclined, pls work on this separately. > vhost-backend: avoid overflow on memslots_limit > vhost-backend: stick to -errno error return convention > vhost-vdpa: stick to -errno error return convention > vhost-user: stick to -errno error return convention > vhost: stick to -errno error return convention > vhost-user-blk: propagate error return from generic vhost > > chardev/char-fe.c | 7 +- > chardev/char-socket.c | 17 +- > hw/block/vhost-user-blk.c | 4 +- > hw/virtio/vhost-backend.c | 4 +- > hw/virtio/vhost-user.c | 401 +++++++++++++++++++++----------------- > hw/virtio/vhost-vdpa.c | 37 ++-- > hw/virtio/vhost.c | 98 +++++----- > 7 files changed, 307 insertions(+), 261 deletions(-) > > -- > 2.33.1 >