vhost: fix initialization for vq->is_le

Message ID 20170130100936.17065-1-pasic@linux.vnet.ibm.com (mailing list archive)
State New, archived

Commit Message

Halil Pasic Jan. 30, 2017, 10:09 a.m. UTC
Currently, under certain circumstances vhost_init_is_le does just a part
of the initialization job, and depends on vhost_reset_is_le being called
too. For this reason vhost_vq_init_access used to call vhost_reset_is_le
when vq->private_data is NULL. This is not only counterintuitive, but
also a real problem because it breaks vhost_net. The bug was introduced to
vhost_net with commit 2751c9882b94 ("vhost: cross-endian support for
legacy devices"). The symptom is corruption of the vq's used.idx field
(virtio) after VHOST_NET_SET_BACKEND was issued as a part of the vhost
shutdown on a vq with pending descriptors.

Let us make sure the outcome of vhost_init_is_le never depends on the state
it is actually supposed to initialize, and fix vhost_net by removing the
reset from vhost_vq_init_access.

With the above, there is no reason for vhost_reset_is_le to do just half
of the job. Let us make vhost_reset_is_le reinitialize is_le.

Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reported-by: Michael A. Tebolt <miket@us.ibm.com>
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Fixes: commit 2751c9882b94 ("vhost: cross-endian support for legacy devices")
---

The bug was already discussed here: 
http://www.spinics.net/lists/kvm/msg144365.html
This is a follow up patch.

---
 drivers/vhost/vhost.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)
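
To illustrate the symptom described above, here is a minimal userspace sketch, not kernel code. It assumes two hypothetical helpers: `model_store16()`, which stands in for an endian-aware 16-bit store into ring memory selected by an `is_le` flag, and `model_guest_load16_le()`, which models a virtio 1.0 guest that always reads ring fields as little-endian. Under these assumptions, a store made with `is_le` wrongly set to false is read back by the guest with its bytes swapped.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for an endian-aware 16-bit store: write val into
 * two bytes of ring memory in the byte order selected by is_le.
 */
static void model_store16(uint8_t out[2], uint16_t val, bool is_le)
{
	if (is_le) {
		out[0] = (uint8_t)(val & 0xff);   /* low byte first */
		out[1] = (uint8_t)(val >> 8);
	} else {
		out[0] = (uint8_t)(val >> 8);     /* high byte first */
		out[1] = (uint8_t)(val & 0xff);
	}
}

/* Hypothetical model of a virtio 1.0 guest: it always reads little-endian. */
static uint16_t model_guest_load16_le(const uint8_t in[2])
{
	return (uint16_t)(in[0] | (in[1] << 8));
}
```

In this model, storing an index of 1 with `is_le == false` makes the little-endian reader see 256 instead of 1, which is the kind of used.idx corruption the commit message describes.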

Comments

Greg Kurz Jan. 30, 2017, 7:06 p.m. UTC | #1
On Mon, 30 Jan 2017 11:09:36 +0100
Halil Pasic <pasic@linux.vnet.ibm.com> wrote:

> Currently, under certain circumstances vhost_init_is_le does just a part
> of the initialization job, and depends on vhost_reset_is_le being called
> too. For this reason vhost_vq_init_access used to call vhost_reset_is_le
> when vq->private_data is NULL. This is not only counterintuitive, but
> also a real problem because it breaks vhost_net. The bug was introduced to
> vhost_net with commit 2751c9882b94 ("vhost: cross-endian support for
> legacy devices"). The symptom is corruption of the vq's used.idx field
> (virtio) after VHOST_NET_SET_BACKEND was issued as a part of the vhost
> shutdown on a vq with pending descriptors.
> 
> Let us make sure the outcome of vhost_init_is_le never depends on the state
> it is actually supposed to initialize, and fix vhost_net by removing the
> reset from vhost_vq_init_access.
> 
> With the above, there is no reason for vhost_reset_is_le to do just half
> of the job. Let us make vhost_reset_is_le reinitialize is_le.
> 
> Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
> Reported-by: Michael A. Tebolt <miket@us.ibm.com>
> Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Fixes: commit 2751c9882b94 ("vhost: cross-endian support for legacy devices")
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

> 
> The bug was already discussed here: 
> http://www.spinics.net/lists/kvm/msg144365.html
> This is a follow up patch.
> 
> ---
>  drivers/vhost/vhost.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index d643260..8f99fe0 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -130,14 +130,14 @@ static long vhost_get_vring_endian(struct vhost_virtqueue *vq, u32 idx,
>  
>  static void vhost_init_is_le(struct vhost_virtqueue *vq)
>  {
> -	if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
> -		vq->is_le = true;
> +	vq->is_le = vhost_has_feature(vq, VIRTIO_F_VERSION_1)
> +		|| virtio_legacy_is_little_endian();
>  }
>  #endif /* CONFIG_VHOST_CROSS_ENDIAN_LEGACY */
>  
>  static void vhost_reset_is_le(struct vhost_virtqueue *vq)
>  {
> -	vq->is_le = virtio_legacy_is_little_endian();
> +	vhost_init_is_le(vq);
>  }
>  
>  struct vhost_flush_struct {
> @@ -1714,10 +1714,8 @@ int vhost_vq_init_access(struct vhost_virtqueue *vq)
>  	int r;
>  	bool is_le = vq->is_le;
>  
> -	if (!vq->private_data) {
> -		vhost_reset_is_le(vq);
> +	if (!vq->private_data)
>  		return 0;
> -	}
>  
>  	vhost_init_is_le(vq);
>
Halil Pasic Jan. 31, 2017, 3:56 p.m. UTC | #2
On 01/30/2017 08:06 PM, Greg Kurz wrote:
>> Currently, under certain circumstances vhost_init_is_le does just a part
>> of the initialization job, and depends on vhost_reset_is_le being called
>> too. For this reason vhost_vq_init_access used to call vhost_reset_is_le
>> when vq->private_data is NULL. This is not only counterintuitive, but
>> also a real problem because it breaks vhost_net. The bug was introduced to
>> vhost_net with commit 2751c9882b94 ("vhost: cross-endian support for
>> legacy devices"). The symptom is corruption of the vq's used.idx field
>> (virtio) after VHOST_NET_SET_BACKEND was issued as a part of the vhost
>> shutdown on a vq with pending descriptors.
>>
>> Let us make sure the outcome of vhost_init_is_le never depends on the state
>> it is actually supposed to initialize, and fix vhost_net by removing the
>> reset from vhost_vq_init_access.
>>
>> With the above, there is no reason for vhost_reset_is_le to do just half
>> of the job. Let us make vhost_reset_is_le reinitialize is_le.
>>
>> Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
>> Reported-by: Michael A. Tebolt <miket@us.ibm.com>
>> Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Fixes: commit 2751c9882b94 ("vhost: cross-endian support for legacy devices")
>> ---
> Reviewed-by: Greg Kurz <groug@kaod.org>
> 

Thanks! 

We have some tests on s390x (that is BE) running, but I won't be able to
test the change with cross endian and legacy. 

What do you think, should I/we RFT or are we fine without?

Regards,
Halil
Michael S. Tsirkin Jan. 31, 2017, 6:28 p.m. UTC | #3
On Tue, Jan 31, 2017 at 04:56:13PM +0100, Halil Pasic wrote:
> 
> 
> On 01/30/2017 08:06 PM, Greg Kurz wrote:
> >> Currently, under certain circumstances vhost_init_is_le does just a part
> >> of the initialization job, and depends on vhost_reset_is_le being called
> >> too. For this reason vhost_vq_init_access used to call vhost_reset_is_le
> >> when vq->private_data is NULL. This is not only counterintuitive, but
> >> also a real problem because it breaks vhost_net. The bug was introduced to
> >> vhost_net with commit 2751c9882b94 ("vhost: cross-endian support for
> >> legacy devices"). The symptom is corruption of the vq's used.idx field
> >> (virtio) after VHOST_NET_SET_BACKEND was issued as a part of the vhost
> >> shutdown on a vq with pending descriptors.
> >>
> >> Let us make sure the outcome of vhost_init_is_le never depends on the state
> >> it is actually supposed to initialize, and fix vhost_net by removing the
> >> reset from vhost_vq_init_access.
> >>
> >> With the above, there is no reason for vhost_reset_is_le to do just half
> >> of the job. Let us make vhost_reset_is_le reinitialize is_le.
> >>
> >> Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
> >> Reported-by: Michael A. Tebolt <miket@us.ibm.com>
> >> Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> >> Fixes: commit 2751c9882b94 ("vhost: cross-endian support for legacy devices")
> >> ---
> > Reviewed-by: Greg Kurz <groug@kaod.org>
> > 
> 
> Thanks! 
> 
> We have some tests on s390x (that is BE) running, but I won't be able to
> test the change with cross endian and legacy. 
> 
> What do you think, should I/we RFT or are we fine without?
> 
> Regards,
> Halil

More testing can't hurt. I can merge this meanwhile.
Greg Kurz Feb. 1, 2017, 2:19 p.m. UTC | #4
On Tue, 31 Jan 2017 16:56:13 +0100
Halil Pasic <pasic@linux.vnet.ibm.com> wrote:

> On 01/30/2017 08:06 PM, Greg Kurz wrote:
> >> Currently, under certain circumstances vhost_init_is_le does just a part
> >> of the initialization job, and depends on vhost_reset_is_le being called
> >> too. For this reason vhost_vq_init_access used to call vhost_reset_is_le
> >> when vq->private_data is NULL. This is not only counterintuitive, but
> >> also a real problem because it breaks vhost_net. The bug was introduced to
> >> vhost_net with commit 2751c9882b94 ("vhost: cross-endian support for
> >> legacy devices"). The symptom is corruption of the vq's used.idx field
> >> (virtio) after VHOST_NET_SET_BACKEND was issued as a part of the vhost
> >> shutdown on a vq with pending descriptors.
> >>
> >> Let us make sure the outcome of vhost_init_is_le never depends on the state
> >> it is actually supposed to initialize, and fix vhost_net by removing the
> >> reset from vhost_vq_init_access.
> >>
> >> With the above, there is no reason for vhost_reset_is_le to do just half
> >> of the job. Let us make vhost_reset_is_le reinitialize is_le.
> >>
> >> Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
> >> Reported-by: Michael A. Tebolt <miket@us.ibm.com>
> >> Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> >> Fixes: commit 2751c9882b94 ("vhost: cross-endian support for legacy devices")
> >> ---  
> > Reviewed-by: Greg Kurz <groug@kaod.org>
> >   
> 
> Thanks! 
> 
> We have some tests on s390x (that is BE) running, but I won't be able to
> test the change with cross endian and legacy. 
> 

I'll try to find some time to run such tests on ppc.

Cheers.

--
Greg

> What do you think, should I/we RFT or are we fine without?
> 
> Regards,
> Halil
>
Halil Pasic Feb. 1, 2017, 2:29 p.m. UTC | #5
On 01/31/2017 07:28 PM, Michael S. Tsirkin wrote:
> On Tue, Jan 31, 2017 at 04:56:13PM +0100, Halil Pasic wrote:
>>
>>
>> On 01/30/2017 08:06 PM, Greg Kurz wrote:
>>>> Currently, under certain circumstances vhost_init_is_le does just a part
>>>> of the initialization job, and depends on vhost_reset_is_le being called
>>>> too. For this reason vhost_vq_init_access used to call vhost_reset_is_le
>>>> when vq->private_data is NULL. This is not only counterintuitive, but
>>>> also a real problem because it breaks vhost_net. The bug was introduced to
>>>> vhost_net with commit 2751c9882b94 ("vhost: cross-endian support for
>>>> legacy devices"). The symptom is corruption of the vq's used.idx field
>>>> (virtio) after VHOST_NET_SET_BACKEND was issued as a part of the vhost
>>>> shutdown on a vq with pending descriptors.
>>>>
>>>> Let us make sure the outcome of vhost_init_is_le never depends on the state
>>>> it is actually supposed to initialize, and fix vhost_net by removing the
>>>> reset from vhost_vq_init_access.
>>>>
>>>> With the above, there is no reason for vhost_reset_is_le to do just half
>>>> of the job. Let us make vhost_reset_is_le reinitialize is_le.
>>>>
>>>> Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
>>>> Reported-by: Michael A. Tebolt <miket@us.ibm.com>
>>>> Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>>> Fixes: commit 2751c9882b94 ("vhost: cross-endian support for legacy devices")
>>>> ---
>>> Reviewed-by: Greg Kurz <groug@kaod.org>
>>>
>>
>> Thanks! 
>>
>> We have some tests on s390x (that is BE) running, but I won't be able to
>> test the change with cross endian and legacy. 
>>
>> What do you think, should I/we RFT or are we fine without?
>>
>> Regards,
>> Halil
> 
> More testing can't hurt. I can merge this meanwhile.
> 

I received word from our test team. No problems were discovered with
a mix of legacy and virtio 1 guests on s390x (the bug was reliably
reproducible without this patch with the same setup).
Could you please add:

Tested-by: Michael A. Tebolt <miket@us.ibm.com>

Regards,
Halil
Jason Wang Feb. 4, 2017, 2:27 a.m. UTC | #6
On 2017-01-30 18:09, Halil Pasic wrote:
> Currently, under certain circumstances vhost_init_is_le does just a part
> of the initialization job, and depends on vhost_reset_is_le being called
> too. For this reason vhost_vq_init_access used to call vhost_reset_is_le
> when vq->private_data is NULL. This is not only counterintuitive, but
> also a real problem because it breaks vhost_net. The bug was introduced to
> vhost_net with commit 2751c9882b94 ("vhost: cross-endian support for
> legacy devices"). The symptom is corruption of the vq's used.idx field
> (virtio) after VHOST_NET_SET_BACKEND was issued as a part of the vhost
> shutdown on a vq with pending descriptors.
>
> Let us make sure the outcome of vhost_init_is_le never depends on the state
> it is actually supposed to initialize, and fix vhost_net by removing the
> reset from vhost_vq_init_access.
>
> With the above, there is no reason for vhost_reset_is_le to do just half
> of the job. Let us make vhost_reset_is_le reinitialize is_le.
>
> Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
> Reported-by: Michael A. Tebolt <miket@us.ibm.com>
> Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Fixes: commit 2751c9882b94 ("vhost: cross-endian support for legacy devices")
> ---
>
> The bug was already discussed here:
> http://www.spinics.net/lists/kvm/msg144365.html
> This is a follow up patch.
>
> ---
>   drivers/vhost/vhost.c | 10 ++++------
>   1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index d643260..8f99fe0 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -130,14 +130,14 @@ static long vhost_get_vring_endian(struct vhost_virtqueue *vq, u32 idx,
>   
>   static void vhost_init_is_le(struct vhost_virtqueue *vq)
>   {
> -	if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
> -		vq->is_le = true;
> +	vq->is_le = vhost_has_feature(vq, VIRTIO_F_VERSION_1)
> +		|| virtio_legacy_is_little_endian();
>   }
>   #endif /* CONFIG_VHOST_CROSS_ENDIAN_LEGACY */
>   
>   static void vhost_reset_is_le(struct vhost_virtqueue *vq)
>   {
> -	vq->is_le = virtio_legacy_is_little_endian();
> +	vhost_init_is_le(vq);
>   }
>   
>   struct vhost_flush_struct {
> @@ -1714,10 +1714,8 @@ int vhost_vq_init_access(struct vhost_virtqueue *vq)
>   	int r;
>   	bool is_le = vq->is_le;
>   
> -	if (!vq->private_data) {
> -		vhost_reset_is_le(vq);
> +	if (!vq->private_data)
>   		return 0;
> -	}
>   
>   	vhost_init_is_le(vq);
>   

Acked-by: Jason Wang <jasowang@redhat.com>

We can probably drop vhost_reset_is_le() and just use
vhost_init_is_le() instead.

Thanks
Patch

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index d643260..8f99fe0 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -130,14 +130,14 @@  static long vhost_get_vring_endian(struct vhost_virtqueue *vq, u32 idx,
 
 static void vhost_init_is_le(struct vhost_virtqueue *vq)
 {
-	if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
-		vq->is_le = true;
+	vq->is_le = vhost_has_feature(vq, VIRTIO_F_VERSION_1)
+		|| virtio_legacy_is_little_endian();
 }
 #endif /* CONFIG_VHOST_CROSS_ENDIAN_LEGACY */
 
 static void vhost_reset_is_le(struct vhost_virtqueue *vq)
 {
-	vq->is_le = virtio_legacy_is_little_endian();
+	vhost_init_is_le(vq);
 }
 
 struct vhost_flush_struct {
@@ -1714,10 +1714,8 @@  int vhost_vq_init_access(struct vhost_virtqueue *vq)
 	int r;
 	bool is_le = vq->is_le;
 
-	if (!vq->private_data) {
-		vhost_reset_is_le(vq);
+	if (!vq->private_data)
 		return 0;
-	}
 
 	vhost_init_is_le(vq);
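
The behavioural difference made by the patch can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the kernel implementation: `struct model_vq` and every `old_*`/`new_*` function are hypothetical names, the `version_1` field stands in for `vhost_has_feature(vq, VIRTIO_F_VERSION_1)`, and `host_is_le` stands in for `virtio_legacy_is_little_endian()`. Only the build without CONFIG_VHOST_CROSS_ENDIAN_LEGACY is modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified model; NOT the kernel's struct vhost_virtqueue. */
struct model_vq {
	bool version_1;     /* models vhost_has_feature(vq, VIRTIO_F_VERSION_1) */
	bool host_is_le;    /* models virtio_legacy_is_little_endian() */
	bool is_le;         /* the state being initialized */
	void *private_data; /* NULL once the backend has been removed */
};

/* Pre-patch: init only handles the VERSION_1 case and relies on reset. */
static void old_init_is_le(struct model_vq *vq)
{
	if (vq->version_1)
		vq->is_le = true;
}

static void old_reset_is_le(struct model_vq *vq)
{
	vq->is_le = vq->host_is_le;
}

static int old_vq_init_access(struct model_vq *vq)
{
	if (!vq->private_data) {
		old_reset_is_le(vq); /* clobbers is_le for a VERSION_1 queue */
		return 0;
	}
	old_init_is_le(vq);
	return 0;
}

/* Post-patch: init computes the full value from its inputs alone. */
static void new_init_is_le(struct model_vq *vq)
{
	vq->is_le = vq->version_1 || vq->host_is_le;
}

static int new_vq_init_access(struct model_vq *vq)
{
	if (!vq->private_data)
		return 0; /* is_le is left untouched */
	new_init_is_le(vq);
	return 0;
}
```

In this model, take a VERSION_1 queue on a big-endian host (`host_is_le == false`). Calling `old_vq_init_access()` after `private_data` has been set to NULL flips `is_le` to false, while `new_vq_init_access()` leaves it true.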