
[3/4] vsock/virtio: fix flush of works during the .remove()

Message ID 20190528105623.27983-4-sgarzare@redhat.com (mailing list archive)
State New, archived
Series vsock/virtio: several fixes in the .probe() and .remove()

Commit Message

Stefano Garzarella May 28, 2019, 10:56 a.m. UTC
We flush all pending works before calling vdev->config->reset(vdev),
but other works can still be queued up until vdev->config->del_vqs(vdev),
so add another flush after it to avoid a use after free.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
 net/vmw_vsock/virtio_transport.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)
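
For reference, with this patch the teardown in virtio_vsock_remove() ends up
ordered roughly as follows (sketch of the flow, details in the full patch
below):

    virtio_vsock_remove()
    {
        mutex_lock(&the_virtio_vsock_mutex);
        the_virtio_vsock = NULL;

        /* reset connected sockets, stop workers (tx/rx/event *_run = false) */

        virtio_vsock_flush_works(vsock);    /* first flush, before reset */

        vdev->config->reset(vdev);

        /* free queued buffers and packets */

        vdev->config->del_vqs(vdev);

        virtio_vsock_flush_works(vsock);    /* works queued meanwhile: flush
                                             * again before freeing vsock */

        kfree(vsock);
        mutex_unlock(&the_virtio_vsock_mutex);
    }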

Comments

Jason Wang May 29, 2019, 3:22 a.m. UTC | #1
On 2019/5/28 6:56 PM, Stefano Garzarella wrote:
> We flush all pending works before to call vdev->config->reset(vdev),
> but other works can be queued before the vdev->config->del_vqs(vdev),
> so we add another flush after it, to avoid use after free.
>
> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> ---
>   net/vmw_vsock/virtio_transport.c | 23 +++++++++++++++++------
>   1 file changed, 17 insertions(+), 6 deletions(-)
>
> diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> index e694df10ab61..ad093ce96693 100644
> --- a/net/vmw_vsock/virtio_transport.c
> +++ b/net/vmw_vsock/virtio_transport.c
> @@ -660,6 +660,15 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
>   	return ret;
>   }
>   
> +static void virtio_vsock_flush_works(struct virtio_vsock *vsock)
> +{
> +	flush_work(&vsock->loopback_work);
> +	flush_work(&vsock->rx_work);
> +	flush_work(&vsock->tx_work);
> +	flush_work(&vsock->event_work);
> +	flush_work(&vsock->send_pkt_work);
> +}
> +
>   static void virtio_vsock_remove(struct virtio_device *vdev)
>   {
>   	struct virtio_vsock *vsock = vdev->priv;
> @@ -668,12 +677,6 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
>   	mutex_lock(&the_virtio_vsock_mutex);
>   	the_virtio_vsock = NULL;
>   
> -	flush_work(&vsock->loopback_work);
> -	flush_work(&vsock->rx_work);
> -	flush_work(&vsock->tx_work);
> -	flush_work(&vsock->event_work);
> -	flush_work(&vsock->send_pkt_work);
> -
>   	/* Reset all connected sockets when the device disappear */
>   	vsock_for_each_connected_socket(virtio_vsock_reset_sock);
>   
> @@ -690,6 +693,9 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
>   	vsock->event_run = false;
>   	mutex_unlock(&vsock->event_lock);
>   
> +	/* Flush all pending works */
> +	virtio_vsock_flush_works(vsock);
> +
>   	/* Flush all device writes and interrupts, device will not use any
>   	 * more buffers.
>   	 */
> @@ -726,6 +732,11 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
>   	/* Delete virtqueues and flush outstanding callbacks if any */
>   	vdev->config->del_vqs(vdev);
>   
> +	/* Other works can be queued before 'config->del_vqs()', so we flush
> +	 * all works before to free the vsock object to avoid use after free.
> +	 */
> +	virtio_vsock_flush_works(vsock);


Some questions after a quick glance:

1) It looks to me that the work could be queued from the path of
vsock_transport_cancel_pkt(). Is that synchronized here?

2) If we decide to flush after del_vqs(), are tx_run/rx_run/event_run
still needed? It looks to me we've already covered that, except that we
need to flush rx_work at the end, since send_pkt_work can requeue rx_work.

Thanks


> +
>   	kfree(vsock);
>   	mutex_unlock(&the_virtio_vsock_mutex);
>   }
Stefano Garzarella May 29, 2019, 10:58 a.m. UTC | #2
On Wed, May 29, 2019 at 11:22:40AM +0800, Jason Wang wrote:
> 
> On 2019/5/28 下午6:56, Stefano Garzarella wrote:
> > We flush all pending works before to call vdev->config->reset(vdev),
> > but other works can be queued before the vdev->config->del_vqs(vdev),
> > so we add another flush after it, to avoid use after free.
> > 
> > Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> > ---
> >   net/vmw_vsock/virtio_transport.c | 23 +++++++++++++++++------
> >   1 file changed, 17 insertions(+), 6 deletions(-)
> > 
> > diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> > index e694df10ab61..ad093ce96693 100644
> > --- a/net/vmw_vsock/virtio_transport.c
> > +++ b/net/vmw_vsock/virtio_transport.c
> > @@ -660,6 +660,15 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
> >   	return ret;
> >   }
> > +static void virtio_vsock_flush_works(struct virtio_vsock *vsock)
> > +{
> > +	flush_work(&vsock->loopback_work);
> > +	flush_work(&vsock->rx_work);
> > +	flush_work(&vsock->tx_work);
> > +	flush_work(&vsock->event_work);
> > +	flush_work(&vsock->send_pkt_work);
> > +}
> > +
> >   static void virtio_vsock_remove(struct virtio_device *vdev)
> >   {
> >   	struct virtio_vsock *vsock = vdev->priv;
> > @@ -668,12 +677,6 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
> >   	mutex_lock(&the_virtio_vsock_mutex);
> >   	the_virtio_vsock = NULL;
> > -	flush_work(&vsock->loopback_work);
> > -	flush_work(&vsock->rx_work);
> > -	flush_work(&vsock->tx_work);
> > -	flush_work(&vsock->event_work);
> > -	flush_work(&vsock->send_pkt_work);
> > -
> >   	/* Reset all connected sockets when the device disappear */
> >   	vsock_for_each_connected_socket(virtio_vsock_reset_sock);
> > @@ -690,6 +693,9 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
> >   	vsock->event_run = false;
> >   	mutex_unlock(&vsock->event_lock);
> > +	/* Flush all pending works */
> > +	virtio_vsock_flush_works(vsock);
> > +
> >   	/* Flush all device writes and interrupts, device will not use any
> >   	 * more buffers.
> >   	 */
> > @@ -726,6 +732,11 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
> >   	/* Delete virtqueues and flush outstanding callbacks if any */
> >   	vdev->config->del_vqs(vdev);
> > +	/* Other works can be queued before 'config->del_vqs()', so we flush
> > +	 * all works before to free the vsock object to avoid use after free.
> > +	 */
> > +	virtio_vsock_flush_works(vsock);
> 
> 
> Some questions after a quick glance:
> 
> 1) It looks to me that the work could be queued from the path of
> vsock_transport_cancel_pkt() . Is that synchronized here?
>

Both virtio_transport_send_pkt() and vsock_transport_cancel_pkt() can
queue work from the upper layer (socket).

Setting the_virtio_vsock to NULL should synchronize, but after a careful
look a rare issue could happen:
we set the_virtio_vsock to NULL at the start of .remove() and we free the
object it points to at the end of .remove(), so
virtio_transport_send_pkt() or vsock_transport_cancel_pkt() may still be
running, accessing the object that we have freed.

Should I use something like RCU to prevent this issue?

    virtio_transport_send_pkt() and vsock_transport_cancel_pkt()
    {
        rcu_read_lock();
        vsock = rcu_dereference(the_virtio_vsock);
        ...
        rcu_read_unlock();
    }

    virtio_vsock_remove()
    {
        rcu_assign_pointer(the_virtio_vsock, NULL);
        synchronize_rcu();

        ...

        free(vsock);
    }
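
Concretely, in the send path it would become something like this (just a
sketch of the idea, not tested; the important part is the NULL check after
rcu_dereference()):

    static int
    virtio_transport_send_pkt(struct virtio_vsock_pkt *pkt)
    {
        struct virtio_vsock *vsock;
        int len = pkt->len;

        rcu_read_lock();
        vsock = rcu_dereference(the_virtio_vsock);
        if (!vsock) {
            /* the device has been removed */
            virtio_transport_free_pkt(pkt);
            len = -ENODEV;
            goto out_rcu;
        }

        /* ... queue the packet and schedule send_pkt_work as today ... */

    out_rcu:
        rcu_read_unlock();
        return len;
    }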

Could there be a better approach?


> 2) If we decide to flush after dev_vqs(), is tx_run/rx_run/event_run still
> needed? It looks to me we've already done except that we need flush rx_work
> in the end since send_pkt_work can requeue rx_work.

The main reason for tx_run/rx_run/event_run is to prevent a worker
function from running while we are calling config->reset().

E.g. if an interrupt comes between virtio_vsock_flush_works() and
config->reset(), it can queue new works that can access the device while
we are in config->reset().
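
To be clearer, with the run flags (added earlier in this series) each worker
checks the flag under the same lock that .remove() takes before clearing it,
so it simply bails out instead of touching the virtqueues. Roughly
(simplified sketch):

    static void virtio_transport_tx_work(struct work_struct *work)
    {
        struct virtio_vsock *vsock =
            container_of(work, struct virtio_vsock, tx_work);

        mutex_lock(&vsock->tx_lock);
        if (!vsock->tx_run)
            goto out;    /* .remove() already stopped us */

        /* ... reclaim used buffers from vsock->vqs[VSOCK_VQ_TX] ... */
    out:
        mutex_unlock(&vsock->tx_lock);
    }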

IMHO they are still needed.

What do you think?


Thanks for your questions,
Stefano
Jason Wang May 30, 2019, 9:46 a.m. UTC | #3
On 2019/5/29 6:58 PM, Stefano Garzarella wrote:
> On Wed, May 29, 2019 at 11:22:40AM +0800, Jason Wang wrote:
>> On 2019/5/28 下午6:56, Stefano Garzarella wrote:
>>> We flush all pending works before to call vdev->config->reset(vdev),
>>> but other works can be queued before the vdev->config->del_vqs(vdev),
>>> so we add another flush after it, to avoid use after free.
>>>
>>> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
>>> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
>>> ---
>>>    net/vmw_vsock/virtio_transport.c | 23 +++++++++++++++++------
>>>    1 file changed, 17 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>>> index e694df10ab61..ad093ce96693 100644
>>> --- a/net/vmw_vsock/virtio_transport.c
>>> +++ b/net/vmw_vsock/virtio_transport.c
>>> @@ -660,6 +660,15 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
>>>    	return ret;
>>>    }
>>> +static void virtio_vsock_flush_works(struct virtio_vsock *vsock)
>>> +{
>>> +	flush_work(&vsock->loopback_work);
>>> +	flush_work(&vsock->rx_work);
>>> +	flush_work(&vsock->tx_work);
>>> +	flush_work(&vsock->event_work);
>>> +	flush_work(&vsock->send_pkt_work);
>>> +}
>>> +
>>>    static void virtio_vsock_remove(struct virtio_device *vdev)
>>>    {
>>>    	struct virtio_vsock *vsock = vdev->priv;
>>> @@ -668,12 +677,6 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
>>>    	mutex_lock(&the_virtio_vsock_mutex);
>>>    	the_virtio_vsock = NULL;
>>> -	flush_work(&vsock->loopback_work);
>>> -	flush_work(&vsock->rx_work);
>>> -	flush_work(&vsock->tx_work);
>>> -	flush_work(&vsock->event_work);
>>> -	flush_work(&vsock->send_pkt_work);
>>> -
>>>    	/* Reset all connected sockets when the device disappear */
>>>    	vsock_for_each_connected_socket(virtio_vsock_reset_sock);
>>> @@ -690,6 +693,9 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
>>>    	vsock->event_run = false;
>>>    	mutex_unlock(&vsock->event_lock);
>>> +	/* Flush all pending works */
>>> +	virtio_vsock_flush_works(vsock);
>>> +
>>>    	/* Flush all device writes and interrupts, device will not use any
>>>    	 * more buffers.
>>>    	 */
>>> @@ -726,6 +732,11 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
>>>    	/* Delete virtqueues and flush outstanding callbacks if any */
>>>    	vdev->config->del_vqs(vdev);
>>> +	/* Other works can be queued before 'config->del_vqs()', so we flush
>>> +	 * all works before to free the vsock object to avoid use after free.
>>> +	 */
>>> +	virtio_vsock_flush_works(vsock);
>>
>> Some questions after a quick glance:
>>
>> 1) It looks to me that the work could be queued from the path of
>> vsock_transport_cancel_pkt() . Is that synchronized here?
>>
> Both virtio_transport_send_pkt() and vsock_transport_cancel_pkt() can
> queue work from the upper layer (socket).
>
> Setting the_virtio_vsock to NULL, should synchronize, but after a careful look
> a rare issue could happen:
> we are setting the_virtio_vsock to NULL at the start of .remove() and we
> are freeing the object pointed by it at the end of .remove(), so
> virtio_transport_send_pkt() or vsock_transport_cancel_pkt() may still be
> running, accessing the object that we are freed.


Yes, that's my point.


>
> Should I use something like RCU to prevent this issue?
>
>      virtio_transport_send_pkt() and vsock_transport_cancel_pkt()
>      {
>          rcu_read_lock();
>          vsock = rcu_dereference(the_virtio_vsock_mutex);


RCU is probably a way to go. (Like what vhost_transport_send_pkt() did).


>          ...
>          rcu_read_unlock();
>      }
>
>      virtio_vsock_remove()
>      {
>          rcu_assign_pointer(the_virtio_vsock_mutex, NULL);
>          synchronize_rcu();
>
>          ...
>
>          free(vsock);
>      }
>
> Could there be a better approach?
>
>
>> 2) If we decide to flush after dev_vqs(), is tx_run/rx_run/event_run still
>> needed? It looks to me we've already done except that we need flush rx_work
>> in the end since send_pkt_work can requeue rx_work.
> The main reason of tx_run/rx_run/event_run is to prevent that a worker
> function is running while we are calling config->reset().
>
> E.g. if an interrupt comes between virtio_vsock_flush_works() and
> config->reset(), it can queue new works that can access the device while
> we are in config->reset().
>
> IMHO they are still needed.
>
> What do you think?


I mean, could we simply flush once after reset, without the tx_run/rx_run
tricks?

reset();

virtio_vsock_flush_work();

virtio_vsock_free_buf();


Thanks


>
>
> Thanks for your questions,
> Stefano
Stefano Garzarella May 30, 2019, 10:10 a.m. UTC | #4
On Thu, May 30, 2019 at 05:46:18PM +0800, Jason Wang wrote:
> 
> On 2019/5/29 下午6:58, Stefano Garzarella wrote:
> > On Wed, May 29, 2019 at 11:22:40AM +0800, Jason Wang wrote:
> > > On 2019/5/28 下午6:56, Stefano Garzarella wrote:
> > > > We flush all pending works before to call vdev->config->reset(vdev),
> > > > but other works can be queued before the vdev->config->del_vqs(vdev),
> > > > so we add another flush after it, to avoid use after free.
> > > > 
> > > > Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> > > > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> > > > ---
> > > >    net/vmw_vsock/virtio_transport.c | 23 +++++++++++++++++------
> > > >    1 file changed, 17 insertions(+), 6 deletions(-)
> > > > 
> > > > diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> > > > index e694df10ab61..ad093ce96693 100644
> > > > --- a/net/vmw_vsock/virtio_transport.c
> > > > +++ b/net/vmw_vsock/virtio_transport.c
> > > > @@ -660,6 +660,15 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
> > > >    	return ret;
> > > >    }
> > > > +static void virtio_vsock_flush_works(struct virtio_vsock *vsock)
> > > > +{
> > > > +	flush_work(&vsock->loopback_work);
> > > > +	flush_work(&vsock->rx_work);
> > > > +	flush_work(&vsock->tx_work);
> > > > +	flush_work(&vsock->event_work);
> > > > +	flush_work(&vsock->send_pkt_work);
> > > > +}
> > > > +
> > > >    static void virtio_vsock_remove(struct virtio_device *vdev)
> > > >    {
> > > >    	struct virtio_vsock *vsock = vdev->priv;
> > > > @@ -668,12 +677,6 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
> > > >    	mutex_lock(&the_virtio_vsock_mutex);
> > > >    	the_virtio_vsock = NULL;
> > > > -	flush_work(&vsock->loopback_work);
> > > > -	flush_work(&vsock->rx_work);
> > > > -	flush_work(&vsock->tx_work);
> > > > -	flush_work(&vsock->event_work);
> > > > -	flush_work(&vsock->send_pkt_work);
> > > > -
> > > >    	/* Reset all connected sockets when the device disappear */
> > > >    	vsock_for_each_connected_socket(virtio_vsock_reset_sock);
> > > > @@ -690,6 +693,9 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
> > > >    	vsock->event_run = false;
> > > >    	mutex_unlock(&vsock->event_lock);
> > > > +	/* Flush all pending works */
> > > > +	virtio_vsock_flush_works(vsock);
> > > > +
> > > >    	/* Flush all device writes and interrupts, device will not use any
> > > >    	 * more buffers.
> > > >    	 */
> > > > @@ -726,6 +732,11 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
> > > >    	/* Delete virtqueues and flush outstanding callbacks if any */
> > > >    	vdev->config->del_vqs(vdev);
> > > > +	/* Other works can be queued before 'config->del_vqs()', so we flush
> > > > +	 * all works before to free the vsock object to avoid use after free.
> > > > +	 */
> > > > +	virtio_vsock_flush_works(vsock);
> > > 
> > > Some questions after a quick glance:
> > > 
> > > 1) It looks to me that the work could be queued from the path of
> > > vsock_transport_cancel_pkt() . Is that synchronized here?
> > > 
> > Both virtio_transport_send_pkt() and vsock_transport_cancel_pkt() can
> > queue work from the upper layer (socket).
> > 
> > Setting the_virtio_vsock to NULL, should synchronize, but after a careful look
> > a rare issue could happen:
> > we are setting the_virtio_vsock to NULL at the start of .remove() and we
> > are freeing the object pointed by it at the end of .remove(), so
> > virtio_transport_send_pkt() or vsock_transport_cancel_pkt() may still be
> > running, accessing the object that we are freed.
> 
> 
> Yes, that's my point.
> 
> 
> > 
> > Should I use something like RCU to prevent this issue?
> > 
> >      virtio_transport_send_pkt() and vsock_transport_cancel_pkt()
> >      {
> >          rcu_read_lock();
> >          vsock = rcu_dereference(the_virtio_vsock_mutex);
> 
> 
> RCU is probably a way to go. (Like what vhost_transport_send_pkt() did).
> 

Okay, I'm going this way.

> 
> >          ...
> >          rcu_read_unlock();
> >      }
> > 
> >      virtio_vsock_remove()
> >      {
> >          rcu_assign_pointer(the_virtio_vsock_mutex, NULL);
> >          synchronize_rcu();
> > 
> >          ...
> > 
> >          free(vsock);
> >      }
> > 
> > Could there be a better approach?
> > 
> > 
> > > 2) If we decide to flush after dev_vqs(), is tx_run/rx_run/event_run still
> > > needed? It looks to me we've already done except that we need flush rx_work
> > > in the end since send_pkt_work can requeue rx_work.
> > The main reason of tx_run/rx_run/event_run is to prevent that a worker
> > function is running while we are calling config->reset().
> > 
> > E.g. if an interrupt comes between virtio_vsock_flush_works() and
> > config->reset(), it can queue new works that can access the device while
> > we are in config->reset().
> > 
> > IMHO they are still needed.
> > 
> > What do you think?
> 
> 
> I mean could we simply do flush after reset once and without tx_rx/rx_run
> tricks?
> 
> rest();
> 
> virtio_vsock_flush_work();
> 
> virtio_vsock_free_buf();

My only doubt is:
is it safe to call config->reset() while a worker function could still be
accessing the device?

I had this doubt after reading Michael's advice [1] and looking at
virtnet_remove(), where these lines appear before the config->reset():

	/* Make sure no work handler is accessing the device. */
	flush_work(&vi->config_work);

Thanks,
Stefano

[1] https://lore.kernel.org/netdev/20190521055650-mutt-send-email-mst@kernel.org
Jason Wang May 30, 2019, 11:59 a.m. UTC | #5
On 2019/5/30 6:10 PM, Stefano Garzarella wrote:
> On Thu, May 30, 2019 at 05:46:18PM +0800, Jason Wang wrote:
>> On 2019/5/29 下午6:58, Stefano Garzarella wrote:
>>> On Wed, May 29, 2019 at 11:22:40AM +0800, Jason Wang wrote:
>>>> On 2019/5/28 下午6:56, Stefano Garzarella wrote:
>>>>> We flush all pending works before to call vdev->config->reset(vdev),
>>>>> but other works can be queued before the vdev->config->del_vqs(vdev),
>>>>> so we add another flush after it, to avoid use after free.
>>>>>
>>>>> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
>>>>> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
>>>>> ---
>>>>>     net/vmw_vsock/virtio_transport.c | 23 +++++++++++++++++------
>>>>>     1 file changed, 17 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>>>>> index e694df10ab61..ad093ce96693 100644
>>>>> --- a/net/vmw_vsock/virtio_transport.c
>>>>> +++ b/net/vmw_vsock/virtio_transport.c
>>>>> @@ -660,6 +660,15 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
>>>>>     	return ret;
>>>>>     }
>>>>> +static void virtio_vsock_flush_works(struct virtio_vsock *vsock)
>>>>> +{
>>>>> +	flush_work(&vsock->loopback_work);
>>>>> +	flush_work(&vsock->rx_work);
>>>>> +	flush_work(&vsock->tx_work);
>>>>> +	flush_work(&vsock->event_work);
>>>>> +	flush_work(&vsock->send_pkt_work);
>>>>> +}
>>>>> +
>>>>>     static void virtio_vsock_remove(struct virtio_device *vdev)
>>>>>     {
>>>>>     	struct virtio_vsock *vsock = vdev->priv;
>>>>> @@ -668,12 +677,6 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
>>>>>     	mutex_lock(&the_virtio_vsock_mutex);
>>>>>     	the_virtio_vsock = NULL;
>>>>> -	flush_work(&vsock->loopback_work);
>>>>> -	flush_work(&vsock->rx_work);
>>>>> -	flush_work(&vsock->tx_work);
>>>>> -	flush_work(&vsock->event_work);
>>>>> -	flush_work(&vsock->send_pkt_work);
>>>>> -
>>>>>     	/* Reset all connected sockets when the device disappear */
>>>>>     	vsock_for_each_connected_socket(virtio_vsock_reset_sock);
>>>>> @@ -690,6 +693,9 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
>>>>>     	vsock->event_run = false;
>>>>>     	mutex_unlock(&vsock->event_lock);
>>>>> +	/* Flush all pending works */
>>>>> +	virtio_vsock_flush_works(vsock);
>>>>> +
>>>>>     	/* Flush all device writes and interrupts, device will not use any
>>>>>     	 * more buffers.
>>>>>     	 */
>>>>> @@ -726,6 +732,11 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
>>>>>     	/* Delete virtqueues and flush outstanding callbacks if any */
>>>>>     	vdev->config->del_vqs(vdev);
>>>>> +	/* Other works can be queued before 'config->del_vqs()', so we flush
>>>>> +	 * all works before to free the vsock object to avoid use after free.
>>>>> +	 */
>>>>> +	virtio_vsock_flush_works(vsock);
>>>> Some questions after a quick glance:
>>>>
>>>> 1) It looks to me that the work could be queued from the path of
>>>> vsock_transport_cancel_pkt() . Is that synchronized here?
>>>>
>>> Both virtio_transport_send_pkt() and vsock_transport_cancel_pkt() can
>>> queue work from the upper layer (socket).
>>>
>>> Setting the_virtio_vsock to NULL, should synchronize, but after a careful look
>>> a rare issue could happen:
>>> we are setting the_virtio_vsock to NULL at the start of .remove() and we
>>> are freeing the object pointed by it at the end of .remove(), so
>>> virtio_transport_send_pkt() or vsock_transport_cancel_pkt() may still be
>>> running, accessing the object that we are freed.
>>
>> Yes, that's my point.
>>
>>
>>> Should I use something like RCU to prevent this issue?
>>>
>>>       virtio_transport_send_pkt() and vsock_transport_cancel_pkt()
>>>       {
>>>           rcu_read_lock();
>>>           vsock = rcu_dereference(the_virtio_vsock_mutex);
>>
>> RCU is probably a way to go. (Like what vhost_transport_send_pkt() did).
>>
> Okay, I'm going this way.
>
>>>           ...
>>>           rcu_read_unlock();
>>>       }
>>>
>>>       virtio_vsock_remove()
>>>       {
>>>           rcu_assign_pointer(the_virtio_vsock_mutex, NULL);
>>>           synchronize_rcu();
>>>
>>>           ...
>>>
>>>           free(vsock);
>>>       }
>>>
>>> Could there be a better approach?
>>>
>>>
>>>> 2) If we decide to flush after dev_vqs(), is tx_run/rx_run/event_run still
>>>> needed? It looks to me we've already done except that we need flush rx_work
>>>> in the end since send_pkt_work can requeue rx_work.
>>> The main reason of tx_run/rx_run/event_run is to prevent that a worker
>>> function is running while we are calling config->reset().
>>>
>>> E.g. if an interrupt comes between virtio_vsock_flush_works() and
>>> config->reset(), it can queue new works that can access the device while
>>> we are in config->reset().
>>>
>>> IMHO they are still needed.
>>>
>>> What do you think?
>>
>> I mean could we simply do flush after reset once and without tx_rx/rx_run
>> tricks?
>>
>> rest();
>>
>> virtio_vsock_flush_work();
>>
>> virtio_vsock_free_buf();
> My only doubt is:
> is it safe to call config->reset() while a worker function could access
> the device?
>
> I had this doubt reading the Michael's advice[1] and looking at
> virtnet_remove() where there are these lines before the config->reset():
>
> 	/* Make sure no work handler is accessing the device. */
> 	flush_work(&vi->config_work);
>
> Thanks,
> Stefano
>
> [1] https://lore.kernel.org/netdev/20190521055650-mutt-send-email-mst@kernel.org


Good point. Then I agree with you. But if we could use RCU to detect the
detach of the device from the socket for these as well, it would be even
better.

Thanks
Stefano Garzarella May 31, 2019, 8:18 a.m. UTC | #6
On Thu, May 30, 2019 at 07:59:14PM +0800, Jason Wang wrote:
> 
> On 2019/5/30 下午6:10, Stefano Garzarella wrote:
> > On Thu, May 30, 2019 at 05:46:18PM +0800, Jason Wang wrote:
> > > On 2019/5/29 下午6:58, Stefano Garzarella wrote:
> > > > On Wed, May 29, 2019 at 11:22:40AM +0800, Jason Wang wrote:
> > > > > On 2019/5/28 下午6:56, Stefano Garzarella wrote:
> > > > > > @@ -690,6 +693,9 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
> > > > > >     	vsock->event_run = false;
> > > > > >     	mutex_unlock(&vsock->event_lock);
> > > > > > +	/* Flush all pending works */
> > > > > > +	virtio_vsock_flush_works(vsock);
> > > > > > +
> > > > > >     	/* Flush all device writes and interrupts, device will not use any
> > > > > >     	 * more buffers.
> > > > > >     	 */
> > > > > > @@ -726,6 +732,11 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
> > > > > >     	/* Delete virtqueues and flush outstanding callbacks if any */
> > > > > >     	vdev->config->del_vqs(vdev);
> > > > > > +	/* Other works can be queued before 'config->del_vqs()', so we flush
> > > > > > +	 * all works before to free the vsock object to avoid use after free.
> > > > > > +	 */
> > > > > > +	virtio_vsock_flush_works(vsock);
> > > > > Some questions after a quick glance:
> > > > > 
> > > > > 1) It looks to me that the work could be queued from the path of
> > > > > vsock_transport_cancel_pkt() . Is that synchronized here?
> > > > > 
> > > > Both virtio_transport_send_pkt() and vsock_transport_cancel_pkt() can
> > > > queue work from the upper layer (socket).
> > > > 
> > > > Setting the_virtio_vsock to NULL, should synchronize, but after a careful look
> > > > a rare issue could happen:
> > > > we are setting the_virtio_vsock to NULL at the start of .remove() and we
> > > > are freeing the object pointed by it at the end of .remove(), so
> > > > virtio_transport_send_pkt() or vsock_transport_cancel_pkt() may still be
> > > > running, accessing the object that we are freed.
> > > 
> > > Yes, that's my point.
> > > 
> > > 
> > > > Should I use something like RCU to prevent this issue?
> > > > 
> > > >       virtio_transport_send_pkt() and vsock_transport_cancel_pkt()
> > > >       {
> > > >           rcu_read_lock();
> > > >           vsock = rcu_dereference(the_virtio_vsock_mutex);
> > > 
> > > RCU is probably a way to go. (Like what vhost_transport_send_pkt() did).
> > > 
> > Okay, I'm going this way.
> > 
> > > >           ...
> > > >           rcu_read_unlock();
> > > >       }
> > > > 
> > > >       virtio_vsock_remove()
> > > >       {
> > > >           rcu_assign_pointer(the_virtio_vsock_mutex, NULL);
> > > >           synchronize_rcu();
> > > > 
> > > >           ...
> > > > 
> > > >           free(vsock);
> > > >       }
> > > > 
> > > > Could there be a better approach?
> > > > 
> > > > 
> > > > > 2) If we decide to flush after dev_vqs(), is tx_run/rx_run/event_run still
> > > > > needed? It looks to me we've already done except that we need flush rx_work
> > > > > in the end since send_pkt_work can requeue rx_work.
> > > > The main reason of tx_run/rx_run/event_run is to prevent that a worker
> > > > function is running while we are calling config->reset().
> > > > 
> > > > E.g. if an interrupt comes between virtio_vsock_flush_works() and
> > > > config->reset(), it can queue new works that can access the device while
> > > > we are in config->reset().
> > > > 
> > > > IMHO they are still needed.
> > > > 
> > > > What do you think?
> > > 
> > > I mean could we simply do flush after reset once and without tx_rx/rx_run
> > > tricks?
> > > 
> > > rest();
> > > 
> > > virtio_vsock_flush_work();
> > > 
> > > virtio_vsock_free_buf();
> > My only doubt is:
> > is it safe to call config->reset() while a worker function could access
> > the device?
> > 
> > I had this doubt reading the Michael's advice[1] and looking at
> > virtnet_remove() where there are these lines before the config->reset():
> > 
> > 	/* Make sure no work handler is accessing the device. */
> > 	flush_work(&vi->config_work);
> > 
> > Thanks,
> > Stefano
> > 
> > [1] https://lore.kernel.org/netdev/20190521055650-mutt-send-email-mst@kernel.org
> 
> 
> Good point. Then I agree with you. But if we can use the RCU to detect the
> detach of device from socket for these, it would be even better.
> 

What about checking 'the_virtio_vsock' in the worker functions in an RCU
critical section?
That way, I could remove rx_run/tx_run/event_run.

Do you think it's cleaner?

Thank you very much,
Stefano
Jason Wang May 31, 2019, 9:56 a.m. UTC | #7
On 2019/5/31 4:18 PM, Stefano Garzarella wrote:
> On Thu, May 30, 2019 at 07:59:14PM +0800, Jason Wang wrote:
>> On 2019/5/30 下午6:10, Stefano Garzarella wrote:
>>> On Thu, May 30, 2019 at 05:46:18PM +0800, Jason Wang wrote:
>>>> On 2019/5/29 下午6:58, Stefano Garzarella wrote:
>>>>> On Wed, May 29, 2019 at 11:22:40AM +0800, Jason Wang wrote:
>>>>>> On 2019/5/28 下午6:56, Stefano Garzarella wrote:
>>>>>>> @@ -690,6 +693,9 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
>>>>>>>      	vsock->event_run = false;
>>>>>>>      	mutex_unlock(&vsock->event_lock);
>>>>>>> +	/* Flush all pending works */
>>>>>>> +	virtio_vsock_flush_works(vsock);
>>>>>>> +
>>>>>>>      	/* Flush all device writes and interrupts, device will not use any
>>>>>>>      	 * more buffers.
>>>>>>>      	 */
>>>>>>> @@ -726,6 +732,11 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
>>>>>>>      	/* Delete virtqueues and flush outstanding callbacks if any */
>>>>>>>      	vdev->config->del_vqs(vdev);
>>>>>>> +	/* Other works can be queued before 'config->del_vqs()', so we flush
>>>>>>> +	 * all works before to free the vsock object to avoid use after free.
>>>>>>> +	 */
>>>>>>> +	virtio_vsock_flush_works(vsock);
>>>>>> Some questions after a quick glance:
>>>>>>
>>>>>> 1) It looks to me that the work could be queued from the path of
>>>>>> vsock_transport_cancel_pkt() . Is that synchronized here?
>>>>>>
>>>>> Both virtio_transport_send_pkt() and vsock_transport_cancel_pkt() can
>>>>> queue work from the upper layer (socket).
>>>>>
>>>>> Setting the_virtio_vsock to NULL, should synchronize, but after a careful look
>>>>> a rare issue could happen:
>>>>> we are setting the_virtio_vsock to NULL at the start of .remove() and we
>>>>> are freeing the object pointed by it at the end of .remove(), so
>>>>> virtio_transport_send_pkt() or vsock_transport_cancel_pkt() may still be
>>>>> running, accessing the object that we are freed.
>>>> Yes, that's my point.
>>>>
>>>>
>>>>> Should I use something like RCU to prevent this issue?
>>>>>
>>>>>        virtio_transport_send_pkt() and vsock_transport_cancel_pkt()
>>>>>        {
>>>>>            rcu_read_lock();
>>>>>            vsock = rcu_dereference(the_virtio_vsock_mutex);
>>>> RCU is probably a way to go. (Like what vhost_transport_send_pkt() did).
>>>>
>>> Okay, I'm going this way.
>>>
>>>>>            ...
>>>>>            rcu_read_unlock();
>>>>>        }
>>>>>
>>>>>        virtio_vsock_remove()
>>>>>        {
>>>>>            rcu_assign_pointer(the_virtio_vsock_mutex, NULL);
>>>>>            synchronize_rcu();
>>>>>
>>>>>            ...
>>>>>
>>>>>            free(vsock);
>>>>>        }
>>>>>
>>>>> Could there be a better approach?
>>>>>
>>>>>
>>>>>> 2) If we decide to flush after dev_vqs(), is tx_run/rx_run/event_run still
>>>>>> needed? It looks to me we've already done except that we need flush rx_work
>>>>>> in the end since send_pkt_work can requeue rx_work.
>>>>> The main reason of tx_run/rx_run/event_run is to prevent that a worker
>>>>> function is running while we are calling config->reset().
>>>>>
>>>>> E.g. if an interrupt comes between virtio_vsock_flush_works() and
>>>>> config->reset(), it can queue new works that can access the device while
>>>>> we are in config->reset().
>>>>>
>>>>> IMHO they are still needed.
>>>>>
>>>>> What do you think?
>>>> I mean could we simply do flush after reset once and without tx_rx/rx_run
>>>> tricks?
>>>>
>>>> rest();
>>>>
>>>> virtio_vsock_flush_work();
>>>>
>>>> virtio_vsock_free_buf();
>>> My only doubt is:
>>> is it safe to call config->reset() while a worker function could access
>>> the device?
>>>
>>> I had this doubt reading the Michael's advice[1] and looking at
>>> virtnet_remove() where there are these lines before the config->reset():
>>>
>>> 	/* Make sure no work handler is accessing the device. */
>>> 	flush_work(&vi->config_work);
>>>
>>> Thanks,
>>> Stefano
>>>
>>> [1] https://lore.kernel.org/netdev/20190521055650-mutt-send-email-mst@kernel.org
>>
>> Good point. Then I agree with you. But if we can use the RCU to detect the
>> detach of device from socket for these, it would be even better.
>>
> What about checking 'the_virtio_vsock' in the worker functions in a RCU
> critical section?
> In this way, I can remove the rx_run/tx_run/event_run.
>
> Do you think it's cleaner?


Yes, I think so.

Thanks


>
> Thank you very much,
> Stefano
Stefano Garzarella June 6, 2019, 8:11 a.m. UTC | #8
On Fri, May 31, 2019 at 05:56:39PM +0800, Jason Wang wrote:
> On 2019/5/31 下午4:18, Stefano Garzarella wrote:
> > On Thu, May 30, 2019 at 07:59:14PM +0800, Jason Wang wrote:
> > > On 2019/5/30 下午6:10, Stefano Garzarella wrote:
> > > > On Thu, May 30, 2019 at 05:46:18PM +0800, Jason Wang wrote:
> > > > > On 2019/5/29 下午6:58, Stefano Garzarella wrote:
> > > > > > On Wed, May 29, 2019 at 11:22:40AM +0800, Jason Wang wrote:
> > > > > > > On 2019/5/28 下午6:56, Stefano Garzarella wrote:
> > > > > > > > @@ -690,6 +693,9 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
> > > > > > > >      	vsock->event_run = false;
> > > > > > > >      	mutex_unlock(&vsock->event_lock);
> > > > > > > > +	/* Flush all pending works */
> > > > > > > > +	virtio_vsock_flush_works(vsock);
> > > > > > > > +
> > > > > > > >      	/* Flush all device writes and interrupts, device will not use any
> > > > > > > >      	 * more buffers.
> > > > > > > >      	 */
> > > > > > > > @@ -726,6 +732,11 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
> > > > > > > >      	/* Delete virtqueues and flush outstanding callbacks if any */
> > > > > > > >      	vdev->config->del_vqs(vdev);
> > > > > > > > +	/* Other works can be queued before 'config->del_vqs()', so we flush
> > > > > > > > +	 * all works before to free the vsock object to avoid use after free.
> > > > > > > > +	 */
> > > > > > > > +	virtio_vsock_flush_works(vsock);
> > > > > > > Some questions after a quick glance:
> > > > > > > 
> > > > > > > 1) It looks to me that the work could be queued from the path of
> > > > > > > vsock_transport_cancel_pkt() . Is that synchronized here?
> > > > > > > 
> > > > > > Both virtio_transport_send_pkt() and vsock_transport_cancel_pkt() can
> > > > > > queue work from the upper layer (socket).
> > > > > > 
> > > > > > Setting the_virtio_vsock to NULL, should synchronize, but after a careful look
> > > > > > a rare issue could happen:
> > > > > > we are setting the_virtio_vsock to NULL at the start of .remove() and we
> > > > > > are freeing the object pointed by it at the end of .remove(), so
> > > > > > virtio_transport_send_pkt() or vsock_transport_cancel_pkt() may still be
> > > > > > running, accessing the object that we are freed.
> > > > > Yes, that's my point.
> > > > > 
> > > > > 
> > > > > > Should I use something like RCU to prevent this issue?
> > > > > > 
> > > > > >        virtio_transport_send_pkt() and vsock_transport_cancel_pkt()
> > > > > >        {
> > > > > >            rcu_read_lock();
> > > > > >            vsock = rcu_dereference(the_virtio_vsock_mutex);
> > > > > RCU is probably a way to go. (Like what vhost_transport_send_pkt() did).
> > > > > 
> > > > Okay, I'm going this way.
> > > > 
> > > > > >            ...
> > > > > >            rcu_read_unlock();
> > > > > >        }
> > > > > > 
> > > > > >        virtio_vsock_remove()
> > > > > >        {
> > > > > >            rcu_assign_pointer(the_virtio_vsock_mutex, NULL);
> > > > > >            synchronize_rcu();
> > > > > > 
> > > > > >            ...
> > > > > > 
> > > > > >            free(vsock);
> > > > > >        }
> > > > > > 
> > > > > > Could there be a better approach?
> > > > > > 
> > > > > > 
> > > > > > > 2) If we decide to flush after dev_vqs(), is tx_run/rx_run/event_run still
> > > > > > > needed? It looks to me we've already done except that we need flush rx_work
> > > > > > > in the end since send_pkt_work can requeue rx_work.
> > > > > > The main reason of tx_run/rx_run/event_run is to prevent that a worker
> > > > > > function is running while we are calling config->reset().
> > > > > > 
> > > > > > E.g. if an interrupt comes between virtio_vsock_flush_works() and
> > > > > > config->reset(), it can queue new works that can access the device while
> > > > > > we are in config->reset().
> > > > > > 
> > > > > > IMHO they are still needed.
> > > > > > 
> > > > > > What do you think?
> > > > > I mean could we simply do flush after reset once and without tx_rx/rx_run
> > > > > tricks?
> > > > > 
> > > > > rest();
> > > > > 
> > > > > virtio_vsock_flush_work();
> > > > > 
> > > > > virtio_vsock_free_buf();
> > > > My only doubt is:
> > > > is it safe to call config->reset() while a worker function could access
> > > > the device?
> > > > 
> > > > I had this doubt reading the Michael's advice[1] and looking at
> > > > virtnet_remove() where there are these lines before the config->reset():
> > > > 
> > > > 	/* Make sure no work handler is accessing the device. */
> > > > 	flush_work(&vi->config_work);
> > > > 
> > > > Thanks,
> > > > Stefano
> > > > 
> > > > [1] https://lore.kernel.org/netdev/20190521055650-mutt-send-email-mst@kernel.org
> > > 
> > > Good point. Then I agree with you. But if we can use the RCU to detect the
> > > detach of device from socket for these, it would be even better.
> > > 
> > What about checking 'the_virtio_vsock' in the worker functions in a RCU
> > critical section?
> > In this way, I can remove the rx_run/tx_run/event_run.
> > 
> > Do you think it's cleaner?
> 
> 
> Yes, I think so.
> 

Hi Jason,
while I was trying to use RCU for the workers as well, I discovered that it
cannot be used if we can sleep (the workers take mutexes, allocate memory,
etc.).
There is SRCU, but I think the rx_run/tx_run/event_run flags are cleaner.
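
(To be explicit about the problem: an rcu_read_lock() critical section must
not sleep, and the workers do, so something like the following would not be
allowed:)

    rcu_read_lock();
    vsock = rcu_dereference(the_virtio_vsock);
    ...
    mutex_lock(&vsock->rx_lock);    /* may sleep: not allowed under rcu_read_lock() */
    ...
    rcu_read_unlock();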

So, if you agree, I'd send a v2 using RCU only for
virtio_transport_send_pkt() and vsock_transport_cancel_pkt(), and leave
this patch as is, to be sure that no one is accessing the device while we
call config->reset().

Thanks,
Stefano
Jason Wang June 13, 2019, 8:57 a.m. UTC | #9
On 2019/6/6 4:11 PM, Stefano Garzarella wrote:
> On Fri, May 31, 2019 at 05:56:39PM +0800, Jason Wang wrote:
>> On 2019/5/31 下午4:18, Stefano Garzarella wrote:
>>> On Thu, May 30, 2019 at 07:59:14PM +0800, Jason Wang wrote:
>>>> On 2019/5/30 下午6:10, Stefano Garzarella wrote:
>>>>> On Thu, May 30, 2019 at 05:46:18PM +0800, Jason Wang wrote:
>>>>>> On 2019/5/29 下午6:58, Stefano Garzarella wrote:
>>>>>>> On Wed, May 29, 2019 at 11:22:40AM +0800, Jason Wang wrote:
>>>>>>>> On 2019/5/28 下午6:56, Stefano Garzarella wrote:
>>>>>>>>> @@ -690,6 +693,9 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
>>>>>>>>>       	vsock->event_run = false;
>>>>>>>>>       	mutex_unlock(&vsock->event_lock);
>>>>>>>>> +	/* Flush all pending works */
>>>>>>>>> +	virtio_vsock_flush_works(vsock);
>>>>>>>>> +
>>>>>>>>>       	/* Flush all device writes and interrupts, device will not use any
>>>>>>>>>       	 * more buffers.
>>>>>>>>>       	 */
>>>>>>>>> @@ -726,6 +732,11 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
>>>>>>>>>       	/* Delete virtqueues and flush outstanding callbacks if any */
>>>>>>>>>       	vdev->config->del_vqs(vdev);
>>>>>>>>> +	/* Other works can be queued before 'config->del_vqs()', so we flush
>>>>>>>>> +	 * all works before to free the vsock object to avoid use after free.
>>>>>>>>> +	 */
>>>>>>>>> +	virtio_vsock_flush_works(vsock);
>>>>>>>> Some questions after a quick glance:
>>>>>>>>
>>>>>>>> 1) It looks to me that the work could be queued from the path of
>>>>>>>> vsock_transport_cancel_pkt() . Is that synchronized here?
>>>>>>>>
>>>>>>> Both virtio_transport_send_pkt() and vsock_transport_cancel_pkt() can
>>>>>>> queue work from the upper layer (socket).
>>>>>>>
>>>>>>> Setting the_virtio_vsock to NULL, should synchronize, but after a careful look
>>>>>>> a rare issue could happen:
>>>>>>> we are setting the_virtio_vsock to NULL at the start of .remove() and we
>>>>>>> are freeing the object pointed by it at the end of .remove(), so
>>>>>>> virtio_transport_send_pkt() or vsock_transport_cancel_pkt() may still be
>>>>>>> running, accessing the object that we are freed.
>>>>>> Yes, that's my point.
>>>>>>
>>>>>>
>>>>>>> Should I use something like RCU to prevent this issue?
>>>>>>>
>>>>>>>         virtio_transport_send_pkt() and vsock_transport_cancel_pkt()
>>>>>>>         {
>>>>>>>             rcu_read_lock();
>>>>>>>             vsock = rcu_dereference(the_virtio_vsock_mutex);
>>>>>> RCU is probably a way to go. (Like what vhost_transport_send_pkt() did).
>>>>>>
>>>>> Okay, I'm going this way.
>>>>>
>>>>>>>             ...
>>>>>>>             rcu_read_unlock();
>>>>>>>         }
>>>>>>>
>>>>>>>         virtio_vsock_remove()
>>>>>>>         {
>>>>>>>             rcu_assign_pointer(the_virtio_vsock_mutex, NULL);
>>>>>>>             synchronize_rcu();
>>>>>>>
>>>>>>>             ...
>>>>>>>
>>>>>>>             free(vsock);
>>>>>>>         }
>>>>>>>
>>>>>>> Could there be a better approach?
>>>>>>>
>>>>>>>
>>>>>>>> 2) If we decide to flush after dev_vqs(), is tx_run/rx_run/event_run still
>>>>>>>> needed? It looks to me we've already done except that we need flush rx_work
>>>>>>>> in the end since send_pkt_work can requeue rx_work.
>>>>>>> The main reason of tx_run/rx_run/event_run is to prevent that a worker
>>>>>>> function is running while we are calling config->reset().
>>>>>>>
>>>>>>> E.g. if an interrupt comes between virtio_vsock_flush_works() and
>>>>>>> config->reset(), it can queue new works that can access the device while
>>>>>>> we are in config->reset().
>>>>>>>
>>>>>>> IMHO they are still needed.
>>>>>>>
>>>>>>> What do you think?
>>>>>> I mean could we simply do flush after reset once and without tx_rx/rx_run
>>>>>> tricks?
>>>>>>
>>>>>> rest();
>>>>>>
>>>>>> virtio_vsock_flush_work();
>>>>>>
>>>>>> virtio_vsock_free_buf();
>>>>> My only doubt is:
>>>>> is it safe to call config->reset() while a worker function could access
>>>>> the device?
>>>>>
>>>>> I had this doubt reading the Michael's advice[1] and looking at
>>>>> virtnet_remove() where there are these lines before the config->reset():
>>>>>
>>>>> 	/* Make sure no work handler is accessing the device. */
>>>>> 	flush_work(&vi->config_work);
>>>>>
>>>>> Thanks,
>>>>> Stefano
>>>>>
>>>>> [1] https://lore.kernel.org/netdev/20190521055650-mutt-send-email-mst@kernel.org
>>>> Good point. Then I agree with you. But if we can use the RCU to detect the
>>>> detach of device from socket for these, it would be even better.
>>>>
>>> What about checking 'the_virtio_vsock' in the worker functions in a RCU
>>> critical section?
>>> In this way, I can remove the rx_run/tx_run/event_run.
>>>
>>> Do you think it's cleaner?
>>
>> Yes, I think so.
>>
> Hi Jason,
> while I was trying to use RCU also for workers, I discovered that it can
> not be used if we can sleep. (Workers have mutex, memory allocation, etc.).
> There is SRCU, but I think the rx_run/tx_run/event_run is cleaner.
>
> So, if you agree I'd send a v2 using RCU only for the
> virtio_transport_send_pkt() or vsock_transport_cancel_pkt(), and leave
> this patch as is to be sure that no one is accessing the device while we
> call config->reset().
>
> Thanks,
> Stefano


If it works, I don't object to using that, considering it was suggested by
Michael. You can go this way and let's see.

Personally I would like something cleaner, e.g. RCU + some kind of
reference count (kref?).
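
Something along these lines, just as a rough idea (the 'ref' field and
virtio_vsock_release() are hypothetical here):

    /* user side (send path, workers, ...) */
    rcu_read_lock();
    vsock = rcu_dereference(the_virtio_vsock);
    if (vsock && !kref_get_unless_zero(&vsock->ref))
        vsock = NULL;
    rcu_read_unlock();

    if (vsock) {
        /* safe to use vsock here, sleeping allowed */
        ...
        kref_put(&vsock->ref, virtio_vsock_release);
    }

    /* .remove() side */
    rcu_assign_pointer(the_virtio_vsock, NULL);
    synchronize_rcu();
    ...
    kref_put(&vsock->ref, virtio_vsock_release);    /* drop the initial reference */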

Thanks
Stefano Garzarella June 27, 2019, 10:26 a.m. UTC | #10
On Thu, Jun 13, 2019 at 04:57:15PM +0800, Jason Wang wrote:
> 
> On 2019/6/6 下午4:11, Stefano Garzarella wrote:
> > On Fri, May 31, 2019 at 05:56:39PM +0800, Jason Wang wrote:
> > > On 2019/5/31 下午4:18, Stefano Garzarella wrote:
> > > > On Thu, May 30, 2019 at 07:59:14PM +0800, Jason Wang wrote:
> > > > > On 2019/5/30 下午6:10, Stefano Garzarella wrote:
> > > > > > On Thu, May 30, 2019 at 05:46:18PM +0800, Jason Wang wrote:
> > > > > > > On 2019/5/29 下午6:58, Stefano Garzarella wrote:
> > > > > > > > On Wed, May 29, 2019 at 11:22:40AM +0800, Jason Wang wrote:
> > > > > > > > > On 2019/5/28 下午6:56, Stefano Garzarella wrote:
> > > > > > > > > > @@ -690,6 +693,9 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
> > > > > > > > > >       	vsock->event_run = false;
> > > > > > > > > >       	mutex_unlock(&vsock->event_lock);
> > > > > > > > > > +	/* Flush all pending works */
> > > > > > > > > > +	virtio_vsock_flush_works(vsock);
> > > > > > > > > > +
> > > > > > > > > >       	/* Flush all device writes and interrupts, device will not use any
> > > > > > > > > >       	 * more buffers.
> > > > > > > > > >       	 */
> > > > > > > > > > @@ -726,6 +732,11 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
> > > > > > > > > >       	/* Delete virtqueues and flush outstanding callbacks if any */
> > > > > > > > > >       	vdev->config->del_vqs(vdev);
> > > > > > > > > > +	/* Other works can be queued before 'config->del_vqs()', so we flush
> > > > > > > > > > +	 * all works before to free the vsock object to avoid use after free.
> > > > > > > > > > +	 */
> > > > > > > > > > +	virtio_vsock_flush_works(vsock);
> > > > > > > > > Some questions after a quick glance:
> > > > > > > > > 
> > > > > > > > > 1) It looks to me that the work could be queued from the path of
> > > > > > > > > vsock_transport_cancel_pkt() . Is that synchronized here?
> > > > > > > > > 
> > > > > > > > Both virtio_transport_send_pkt() and vsock_transport_cancel_pkt() can
> > > > > > > > queue work from the upper layer (socket).
> > > > > > > > 
> > > > > > > > Setting the_virtio_vsock to NULL, should synchronize, but after a careful look
> > > > > > > > a rare issue could happen:
> > > > > > > > we are setting the_virtio_vsock to NULL at the start of .remove() and we
> > > > > > > > are freeing the object pointed by it at the end of .remove(), so
> > > > > > > > virtio_transport_send_pkt() or vsock_transport_cancel_pkt() may still be
> > > > > > > > running, accessing the object that we are freed.
> > > > > > > Yes, that's my point.
> > > > > > > 
> > > > > > > 
> > > > > > > > Should I use something like RCU to prevent this issue?
> > > > > > > > 
> > > > > > > >         virtio_transport_send_pkt() and vsock_transport_cancel_pkt()
> > > > > > > >         {
> > > > > > > >             rcu_read_lock();
> > > > > > > >             vsock = rcu_dereference(the_virtio_vsock_mutex);
> > > > > > > RCU is probably a way to go. (Like what vhost_transport_send_pkt() did).
> > > > > > > 
> > > > > > Okay, I'm going this way.
> > > > > > 
> > > > > > > >             ...
> > > > > > > >             rcu_read_unlock();
> > > > > > > >         }
> > > > > > > > 
> > > > > > > >         virtio_vsock_remove()
> > > > > > > >         {
> > > > > > > >             rcu_assign_pointer(the_virtio_vsock_mutex, NULL);
> > > > > > > >             synchronize_rcu();
> > > > > > > > 
> > > > > > > >             ...
> > > > > > > > 
> > > > > > > >             free(vsock);
> > > > > > > >         }
> > > > > > > > 
> > > > > > > > Could there be a better approach?
> > > > > > > > 
> > > > > > > > 
> > > > > > > > > 2) If we decide to flush after dev_vqs(), is tx_run/rx_run/event_run still
> > > > > > > > > needed? It looks to me we've already done except that we need flush rx_work
> > > > > > > > > in the end since send_pkt_work can requeue rx_work.
> > > > > > > > The main reason of tx_run/rx_run/event_run is to prevent that a worker
> > > > > > > > function is running while we are calling config->reset().
> > > > > > > > 
> > > > > > > > E.g. if an interrupt comes between virtio_vsock_flush_works() and
> > > > > > > > config->reset(), it can queue new works that can access the device while
> > > > > > > > we are in config->reset().
> > > > > > > > 
> > > > > > > > IMHO they are still needed.
> > > > > > > > 
> > > > > > > > What do you think?
> > > > > > > I mean could we simply do flush after reset once and without tx_rx/rx_run
> > > > > > > tricks?
> > > > > > > 
> > > > > > > rest();
> > > > > > > 
> > > > > > > virtio_vsock_flush_work();
> > > > > > > 
> > > > > > > virtio_vsock_free_buf();
> > > > > > My only doubt is:
> > > > > > is it safe to call config->reset() while a worker function could access
> > > > > > the device?
> > > > > > 
> > > > > > I had this doubt reading the Michael's advice[1] and looking at
> > > > > > virtnet_remove() where there are these lines before the config->reset():
> > > > > > 
> > > > > > 	/* Make sure no work handler is accessing the device. */
> > > > > > 	flush_work(&vi->config_work);
> > > > > > 
> > > > > > Thanks,
> > > > > > Stefano
> > > > > > 
> > > > > > [1] https://lore.kernel.org/netdev/20190521055650-mutt-send-email-mst@kernel.org
> > > > > Good point. Then I agree with you. But if we can use the RCU to detect the
> > > > > detach of device from socket for these, it would be even better.
> > > > > 
> > > > What about checking 'the_virtio_vsock' in the worker functions in a RCU
> > > > critical section?
> > > > In this way, I can remove the rx_run/tx_run/event_run.
> > > > 
> > > > Do you think it's cleaner?
> > > 
> > > Yes, I think so.
> > > 
> > Hi Jason,
> > while I was trying to use RCU also for workers, I discovered that it can
> > not be used if we can sleep. (Workers have mutex, memory allocation, etc.).
> > There is SRCU, but I think the rx_run/tx_run/event_run is cleaner.
> > 
> > So, if you agree I'd send a v2 using RCU only for the
> > virtio_transport_send_pkt() or vsock_transport_cancel_pkt(), and leave
> > this patch as is to be sure that no one is accessing the device while we
> > call config->reset().
> > 
> > Thanks,
> > Stefano
> 
> 
> If it work, I don't object to use that consider it was suggested by Michael.
> You can go this way and let's see.

Okay, I'll try it and see if it works.

> 
> Personally I would like something more cleaner. E.g RCU + some kind of
> reference count (kref?).

I'll try to check whether kref can help us get a cleaner solution in this
case.

Thanks for your comments,
Stefano

Patch

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index e694df10ab61..ad093ce96693 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -660,6 +660,15 @@  static int virtio_vsock_probe(struct virtio_device *vdev)
 	return ret;
 }
 
+static void virtio_vsock_flush_works(struct virtio_vsock *vsock)
+{
+	flush_work(&vsock->loopback_work);
+	flush_work(&vsock->rx_work);
+	flush_work(&vsock->tx_work);
+	flush_work(&vsock->event_work);
+	flush_work(&vsock->send_pkt_work);
+}
+
 static void virtio_vsock_remove(struct virtio_device *vdev)
 {
 	struct virtio_vsock *vsock = vdev->priv;
@@ -668,12 +677,6 @@  static void virtio_vsock_remove(struct virtio_device *vdev)
 	mutex_lock(&the_virtio_vsock_mutex);
 	the_virtio_vsock = NULL;
 
-	flush_work(&vsock->loopback_work);
-	flush_work(&vsock->rx_work);
-	flush_work(&vsock->tx_work);
-	flush_work(&vsock->event_work);
-	flush_work(&vsock->send_pkt_work);
-
 	/* Reset all connected sockets when the device disappear */
 	vsock_for_each_connected_socket(virtio_vsock_reset_sock);
 
@@ -690,6 +693,9 @@  static void virtio_vsock_remove(struct virtio_device *vdev)
 	vsock->event_run = false;
 	mutex_unlock(&vsock->event_lock);
 
+	/* Flush all pending works */
+	virtio_vsock_flush_works(vsock);
+
 	/* Flush all device writes and interrupts, device will not use any
 	 * more buffers.
 	 */
@@ -726,6 +732,11 @@  static void virtio_vsock_remove(struct virtio_device *vdev)
 	/* Delete virtqueues and flush outstanding callbacks if any */
 	vdev->config->del_vqs(vdev);
 
+	/* Other works can be queued before 'config->del_vqs()', so we flush
+	 * all works before to free the vsock object to avoid use after free.
+	 */
+	virtio_vsock_flush_works(vsock);
+
 	kfree(vsock);
 	mutex_unlock(&the_virtio_vsock_mutex);
 }