mbox series

[net,v2,0/6] vsock: Transport reassignment and error handling issues

Message ID 20250121-vsock-transport-vs-autobind-v2-0-aad6069a4e8c@rbox.co (mailing list archive)
Headers show
Series vsock: Transport reassignment and error handling issues | expand

Message

Michal Luczaj Jan. 21, 2025, 2:44 p.m. UTC
Series deals with two issues:
- socket reference count imbalance due to an unforgiving transport release
  (triggered by transport reassignment);
- unintentional API feature, a failing connect() making the socket
  impossible to use for any subsequent connect() attempts.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
Changes in v2:
- Introduce vsock_connect_fd(), simplify the tests, stick to SOCK_STREAM,
  collect Reviewed-by (Stefano)
- Link to v1: https://lore.kernel.org/r/20250117-vsock-transport-vs-autobind-v1-0-c802c803762d@rbox.co

---
Michal Luczaj (6):
      vsock: Keep the binding until socket destruction
      vsock: Allow retrying on connect() failure
      vsock/test: Introduce vsock_bind()
      vsock/test: Introduce vsock_connect_fd()
      vsock/test: Add test for UAF due to socket unbinding
      vsock/test: Add test for connect() retries

 net/vmw_vsock/af_vsock.c         |  13 ++++-
 tools/testing/vsock/util.c       |  87 +++++++++++-----------------
 tools/testing/vsock/util.h       |   2 +
 tools/testing/vsock/vsock_test.c | 122 ++++++++++++++++++++++++++++++++++-----
 4 files changed, 152 insertions(+), 72 deletions(-)
---
base-commit: d640627663bfe7d8963c7615316d7d4ef60f3b0b
change-id: 20250116-vsock-transport-vs-autobind-2da49f1d5a0a

Best regards,

Comments

Stefano Garzarella Jan. 22, 2025, 11:45 a.m. UTC | #1
On Tue, Jan 21, 2025 at 03:44:01PM +0100, Michal Luczaj wrote:
>Series deals with two issues:
>- socket reference count imbalance due to an unforgiving transport release
>  (triggered by transport reassignment);
>- unintentional API feature, a failing connect() making the socket
>  impossible to use for any subsequent connect() attempts.
>
>Signed-off-by: Michal Luczaj <mhal@rbox.co>
>---
>Changes in v2:
>- Introduce vsock_connect_fd(), simplify the tests, stick to SOCK_STREAM,
>  collect Reviewed-by (Stefano)
>- Link to v1: https://lore.kernel.org/r/20250117-vsock-transport-vs-autobind-v1-0-c802c803762d@rbox.co

Thanks for sorting out my comments, I've reviewed it all and got it
running, it seems to be going well!

Stefano

>
>---
>Michal Luczaj (6):
>      vsock: Keep the binding until socket destruction
>      vsock: Allow retrying on connect() failure
>      vsock/test: Introduce vsock_bind()
>      vsock/test: Introduce vsock_connect_fd()
>      vsock/test: Add test for UAF due to socket unbinding
>      vsock/test: Add test for connect() retries
>
> net/vmw_vsock/af_vsock.c         |  13 ++++-
> tools/testing/vsock/util.c       |  87 +++++++++++-----------------
> tools/testing/vsock/util.h       |   2 +
> tools/testing/vsock/vsock_test.c | 122 ++++++++++++++++++++++++++++++++++-----
> 4 files changed, 152 insertions(+), 72 deletions(-)
>---
>base-commit: d640627663bfe7d8963c7615316d7d4ef60f3b0b
>change-id: 20250116-vsock-transport-vs-autobind-2da49f1d5a0a
>
>Best regards,
>-- 
>Michal Luczaj <mhal@rbox.co>
>
Michal Luczaj Jan. 22, 2025, 2:16 p.m. UTC | #2
On 1/22/25 12:45, Stefano Garzarella wrote:
> On Tue, Jan 21, 2025 at 03:44:01PM +0100, Michal Luczaj wrote:
>> Series deals with two issues:
>> - socket reference count imbalance due to an unforgiving transport release
>>  (triggered by transport reassignment);
>> - unintentional API feature, a failing connect() making the socket
>>  impossible to use for any subsequent connect() attempts.
>>
>> Signed-off-by: Michal Luczaj <mhal@rbox.co>
>> ---
>> Changes in v2:
>> - Introduce vsock_connect_fd(), simplify the tests, stick to SOCK_STREAM,
>>  collect Reviewed-by (Stefano)
>> - Link to v1: https://lore.kernel.org/r/20250117-vsock-transport-vs-autobind-v1-0-c802c803762d@rbox.co
> 
> Thanks for sorting out my comments, I've reviewed it all and got it
> running, it seems to be going well!

Great! I was worried that I might have oversimplified the UAF selftest
(won't trigger the splat if second transport == NULL), so please let me
know if it starts acting strangely (quietly passes the test on an unpatched
system), and for what combination of enabled transports.

Thanks,
Michal
Stefano Garzarella Jan. 22, 2025, 3:47 p.m. UTC | #3
On Wed, 22 Jan 2025 at 15:16, Michal Luczaj <mhal@rbox.co> wrote:
>
> On 1/22/25 12:45, Stefano Garzarella wrote:
> > On Tue, Jan 21, 2025 at 03:44:01PM +0100, Michal Luczaj wrote:
> >> Series deals with two issues:
> >> - socket reference count imbalance due to an unforgiving transport release
> >>  (triggered by transport reassignment);
> >> - unintentional API feature, a failing connect() making the socket
> >>  impossible to use for any subsequent connect() attempts.
> >>
> >> Signed-off-by: Michal Luczaj <mhal@rbox.co>
> >> ---
> >> Changes in v2:
> >> - Introduce vsock_connect_fd(), simplify the tests, stick to SOCK_STREAM,
> >>  collect Reviewed-by (Stefano)
> >> - Link to v1: https://lore.kernel.org/r/20250117-vsock-transport-vs-autobind-v1-0-c802c803762d@rbox.co
> >
> > Thanks for sorting out my comments, I've reviewed it all and got it
> > running, it seems to be going well!
>
> Great! I was worried that I might have oversimplified the UAF selftest
> (won't trigger the splat if second transport == NULL), so please let me
> know if it starts acting strangely (quietly passes the test on an unpatched
> system), and for what combination of enabled transports.

Yeah, I was worrying the same and thinking if it's better to add more
connect also with LOOPBACK and a CID > 2 to be sure we test all the
scenarios, but we can do later, for now let's have this series merged
to fix the real issue.

I tested without the fixes (first 2 patches) and I can see the
use-after-free reports only on the "host" where I have both loopback
and H2G loaded, but this should be fine.

Thanks for your work!
Stefano
Michal Luczaj Jan. 22, 2025, 8:10 p.m. UTC | #4
On 1/22/25 16:47, Stefano Garzarella wrote:
> On Wed, 22 Jan 2025 at 15:16, Michal Luczaj <mhal@rbox.co> wrote:
>>
>> On 1/22/25 12:45, Stefano Garzarella wrote:
>>> On Tue, Jan 21, 2025 at 03:44:01PM +0100, Michal Luczaj wrote:
>>>> Series deals with two issues:
>>>> - socket reference count imbalance due to an unforgiving transport release
>>>>  (triggered by transport reassignment);
>>>> - unintentional API feature, a failing connect() making the socket
>>>>  impossible to use for any subsequent connect() attempts.
>>>>
>>>> Signed-off-by: Michal Luczaj <mhal@rbox.co>
>>>> ---
>>>> Changes in v2:
>>>> - Introduce vsock_connect_fd(), simplify the tests, stick to SOCK_STREAM,
>>>>  collect Reviewed-by (Stefano)
>>>> - Link to v1: https://lore.kernel.org/r/20250117-vsock-transport-vs-autobind-v1-0-c802c803762d@rbox.co
>>>
>>> Thanks for sorting out my comments, I've reviewed it all and got it
>>> running, it seems to be going well!
>>
>> Great! I was worried that I might have oversimplified the UAF selftest
>> (won't trigger the splat if second transport == NULL), so please let me
>> know if it starts acting strangely (quietly passes the test on an unpatched
>> system), and for what combination of enabled transports.
> 
> Yeah, I was worrying the same and thinking if it's better to add more
> connect also with LOOPBACK and a CID > 2 to be sure we test all the
> scenarios, but we can do later, for now let's have this series merged
> to fix the real issue.

Sure, I'll take care of this CID galore later on.

> I tested without the fixes (first 2 patches) and I can see the
> use-after-free reports only on the "host" where I have both loopback
> and H2G loaded, but this should be fine.

Argh, sorry. FWIW, re-adding a bind() after the second connect should
increase the coverage.