
[RFC,2/2] virtio/s390: fix race in ccw_io_helper()

Message ID 20180912140202.12292-3-pasic@linux.ibm.com (mailing list archive)
State New, archived
Headers show
Series virtio/s390: fix some races in virtio-ccw

Commit Message

Halil Pasic Sept. 12, 2018, 2:02 p.m. UTC
While ccw_io_helper() seems to be intended to be exclusive, in the sense
that it is supposed to facilitate I/O for at most one thread at any given
time, there is actually nothing ensuring that threads won't pile up at
vcdev->wait_q. If they do, all threads get woken up and may mistake the
status that belongs to some other request for their own. This can lead to
bugs. For an example, see:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1788432

This normally does not cause problems, as these are usually infrequent
operations that happen in a well-defined sequence and normally do not
fail. But occasionally sysfs attributes are directly dependent
on pieces of virtio config and trigger a get on each read. This gives us
at least one way to trigger races.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reported-by: Colin Ian King <colin.king@canonical.com>
---

This is a big hammer -- a mutex at the virtio_ccw device level would more
than suffice. But I don't think it hurts, and maybe there is a better way,
e.g. one using some common ccw/cio mechanism to address this. That's why
this is an RFC.
---
 drivers/s390/virtio/virtio_ccw.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Cornelia Huck Sept. 18, 2018, 6:45 p.m. UTC | #1
On Wed, 12 Sep 2018 16:02:02 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> While ccw_io_helper() seems to be intended to be exclusive, in the sense
> that it is supposed to facilitate I/O for at most one thread at any given
> time, there is actually nothing ensuring that threads won't pile up at
> vcdev->wait_q. If they do, all threads get woken up and may mistake the
> status that belongs to some other request for their own. This can lead to
> bugs. For an example, see:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1788432
> 
> This normally does not cause problems, as these are usually infrequent
> operations that happen in a well-defined sequence and normally do not
> fail. But occasionally sysfs attributes are directly dependent
> on pieces of virtio config and trigger a get on each read. This gives us
> at least one way to trigger races.

Yes, the idea behind ccw_io_helper() was to provide a simple way to use
the inherently asynchronous channel I/O operations in a synchronous
way, as that's what the virtio callbacks expect. I did not consider
multiple callbacks for a device running at the same time; but if the
interface allows that, we obviously need to be able to handle it.

Has this only been observed for the config get/set commands? (The
read-before-write thing?)

> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> Reported-by: Colin Ian King <colin.king@canonical.com>
> ---
> 
> This is a big hammer -- mutex on virtio_ccw device level would more than
> suffice. But I don't think it hurts, and maybe there is a better way e.g.
> one using some common ccw/cio mechanisms to address this. That's why this
> is an RFC.

I'm for using more delicate tools, if possible :)

We basically have two options:
- Have a way to queue I/O operations and then handle them in sequence.
  Creates complexity, and is likely overkill. (We already have a kind
  of serialization because we re-submit the channel program until the
  hypervisor accepts it; the problem comes from the wait queue usage.)
- Add serialization around the submit/wait procedure (as you did), but
  with a per-device mutex. That looks like the easiest solution.

> ---
>  drivers/s390/virtio/virtio_ccw.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index a5e8530a3391..36252f344660 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -289,6 +289,8 @@ static int doing_io(struct virtio_ccw_device *vcdev, __u32 flag)
>  	return ret;
>  }
>  
> +DEFINE_MUTEX(vcio_mtx);
> +
>  static int ccw_io_helper(struct virtio_ccw_device *vcdev,
>  			 struct ccw1 *ccw, __u32 intparm)
>  {
> @@ -296,6 +298,7 @@ static int ccw_io_helper(struct virtio_ccw_device *vcdev,
>  	unsigned long flags;
>  	int flag = intparm & VIRTIO_CCW_INTPARM_MASK;
>  
> +	mutex_lock(&vcio_mtx);
>  	do {
>  		spin_lock_irqsave(get_ccwdev_lock(vcdev->cdev), flags);
>  		ret = ccw_device_start(vcdev->cdev, ccw, intparm, 0, 0);
> @@ -308,7 +311,9 @@ static int ccw_io_helper(struct virtio_ccw_device *vcdev,
>  		cpu_relax();
>  	} while (ret == -EBUSY);

We probably still want to keep this while loop to be on the safe side
(unsolicited status from the hypervisor, for example.)

>  	wait_event(vcdev->wait_q, doing_io(vcdev, flag) == 0);
> -	return ret ? ret : vcdev->err;
> +	ret = ret ? ret : vcdev->err;
> +	mutex_unlock(&vcio_mtx);
> +	return ret;
>  }
>  
>  static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
Halil Pasic Sept. 19, 2018, 1:17 p.m. UTC | #2
On 09/18/2018 08:45 PM, Cornelia Huck wrote:
> On Wed, 12 Sep 2018 16:02:02 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
>> While ccw_io_helper() seems to be intended to be exclusive, in the sense
>> that it is supposed to facilitate I/O for at most one thread at any given
>> time, there is actually nothing ensuring that threads won't pile up at
>> vcdev->wait_q. If they do, all threads get woken up and may mistake the
>> status that belongs to some other request for their own. This can lead to
>> bugs. For an example, see:
>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1788432
>>
>> This normally does not cause problems, as these are usually infrequent
>> operations that happen in a well-defined sequence and normally do not
>> fail. But occasionally sysfs attributes are directly dependent
>> on pieces of virtio config and trigger a get on each read. This gives us
>> at least one way to trigger races.
> 
> Yes, the idea behind ccw_io_helper() was to provide a simple way to use
> the inherently asynchronous channel I/O operations in a synchronous
> way, as that's what the virtio callbacks expect. I did not consider
> multiple callbacks for a device running at the same time; but if the
> interface allows that, we obviously need to be able to handle it.
> 
> Has this only been observed for the config get/set commands? (The
> read-before-write thing?)
> 
>>
>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
>> Reported-by: Colin Ian King <colin.king@canonical.com>
>> ---
>>
>> This is a big hammer -- mutex on virtio_ccw device level would more than
>> suffice. But I don't think it hurts, and maybe there is a better way e.g.
>> one using some common ccw/cio mechanisms to address this. That's why this
>> is an RFC.
> 
> I'm for using more delicate tools, if possible :)
> 
> We basically have two options:
> - Have a way to queue I/O operations and then handle them in sequence.
>   Creates complexity, and is likely overkill. (We already have a kind
>   of serialization because we re-submit the channel program until the
>   hypervisor accepts it; the problem comes from the wait queue usage.)

I secretly hoped we already had something like this somewhere. Getting
some kind of requests processed and wanting to know whether each of them
worked seemed like a fairly common need. I agree, implementing this just
for virtio-ccw would be overkill.

> - Add serialization around the submit/wait procedure (as you did), but
>   with a per-device mutex. That looks like the easiest solution.
> 

Yep, I'm for doing something like this first. We can think about doing
something more elaborate later. I will send a non-RFC with an extra
per-device mutex. Unless you object.

Another thought that crossed my head was making the transport ops
mutually exclusive on each virtio-ccw device -- see our conversation on
get/set config. I don't think it would make a big difference, since the
ccw stuff is already mutually exclusive, so we only have parallelism for
the preparation and for the post-processing of the results of the ccw I/O.

Regards,
Halil

>> ---
>>  drivers/s390/virtio/virtio_ccw.c | 7 ++++++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
>> index a5e8530a3391..36252f344660 100644
>> --- a/drivers/s390/virtio/virtio_ccw.c
>> +++ b/drivers/s390/virtio/virtio_ccw.c
>> @@ -289,6 +289,8 @@ static int doing_io(struct virtio_ccw_device *vcdev, __u32 flag)
>>  	return ret;
>>  }
>>  
>> +DEFINE_MUTEX(vcio_mtx);
>> +
>>  static int ccw_io_helper(struct virtio_ccw_device *vcdev,
>>  			 struct ccw1 *ccw, __u32 intparm)
>>  {
>> @@ -296,6 +298,7 @@ static int ccw_io_helper(struct virtio_ccw_device *vcdev,
>>  	unsigned long flags;
>>  	int flag = intparm & VIRTIO_CCW_INTPARM_MASK;
>>  
>> +	mutex_lock(&vcio_mtx);
>>  	do {
>>  		spin_lock_irqsave(get_ccwdev_lock(vcdev->cdev), flags);
>>  		ret = ccw_device_start(vcdev->cdev, ccw, intparm, 0, 0);
>> @@ -308,7 +311,9 @@ static int ccw_io_helper(struct virtio_ccw_device *vcdev,
>>  		cpu_relax();
>>  	} while (ret == -EBUSY);
> 
> We probably still want to keep this while loop to be on the safe side
> (unsolicited status from the hypervisor, for example.)
> 

Nod.

>>  	wait_event(vcdev->wait_q, doing_io(vcdev, flag) == 0);
>> -	return ret ? ret : vcdev->err;
>> +	ret = ret ? ret : vcdev->err;
>> +	mutex_unlock(&vcio_mtx);
>> +	return ret;
>>  }
>>  
>>  static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
>
Cornelia Huck Sept. 19, 2018, 2:07 p.m. UTC | #3
On Wed, 19 Sep 2018 15:17:28 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On 09/18/2018 08:45 PM, Cornelia Huck wrote:

> > We basically have two options:
> > - Have a way to queue I/O operations and then handle them in sequence.
> >   Creates complexity, and is likely overkill. (We already have a kind
> >   of serialization because we re-submit the channel program until the
> >   hypervisor accepts it; the problem comes from the wait queue usage.)  
> 
> I secretly hoped we already had something like this somewhere. Getting
> some kind of requests processed and wanting to know whether each of them
> worked seemed like a fairly common need. I agree, implementing this just
> for virtio-ccw would be overkill.

I've encountered that pattern so far mostly for driver-internal I/O
(setting some stuff up via channel commands etc.) Other usages (like
e.g. the dasd driver processing block layer requests) are asynchronous,
and the common I/O layer uses a full-fledged fsm. Much of the trouble
comes from implementing a synchronous interface via asynchronous
commands, and I'd really like to keep that as simple as possible
(especially as this is not the hot path).

> 
> > - Add serialization around the submit/wait procedure (as you did), but
> >   with a per-device mutex. That looks like the easiest solution.
> >   
> 
> Yep, I'm for doing something like this first. We can think about doing
> something more elaborate later. I will send a non-RFC with an extra
> per-device mutex. Unless you object.

No, that sounds good to me.

> 
> Another thought that crossed my head was making the transport ops
> mutually exclusive on each virtio-ccw device -- see our conversation on
> get/set config. I don't think it would make a big difference, since the
> ccw stuff is already mutually exclusive, so we only have parallelism for
> the preparation and for the post-processing of the results of the ccw I/O.

Do you spot any other places where we may need to care about concurrent
processing (like for the ->config area in the previous patch)?
Halil Pasic Sept. 19, 2018, 4:56 p.m. UTC | #4
On 09/19/2018 04:07 PM, Cornelia Huck wrote:
>> Yep, I'm for doing something like this first. We can think about doing
>> something more elaborate later. I will send a non-RFC with an extra
>> per-device mutex. Unless you object.
> No, that sounds good to me.
> 

OK.

>> Another thought that crossed my head was making the transport ops
>> mutually exclusive on each virtio-ccw device -- see our conversation on
>> get/set config. I don't think it would make a big difference, since the
>> ccw stuff is already mutually exclusive, so we only have parallelism for
>> the preparation and for the post-processing of the results of the ccw I/O.
> Do you spot any other places where we may need to care about concurrent
> processing (like for the ->config area in the previous patch)?
> 

It is hard to tell, because:
* Synchronization external to the transport could make things work
  out just fine.
* virtio_config_ops does not document these requirements, if any.
* So it's up to the devices to use the stuff without shooting
  themselves in the foot.
* virtio-pci does not seem to do more to avoid such problems than
  we do.

Back then, when learning virtio-ccw, I did ask myself such questions,
and based on virtio-pci I concluded it looks similar and should be
good enough.

But I guess you know all this, and that's why you said "any other
places where we *may* need to care". Some of the places where I'm not
sure we don't need to care are:
* status get/set, and
* find_vqs, del_vqs

But if I had been sure that these are problematic, I would have
pointed that out.

Status is one byte, and that could save the day. But I can't recall
if the architecture guarantees that we won't observe anything funny.

Regarding find_vqs, I think of some external-ish events that could
make us tear down before build-up (find_vqs) has finished. The funny
thing is we do protect the list with a lock, but the lock is taken only
while dealing with a single element, so I don't think it would save us.
I hope synchronization external to virtio-ccw takes care of this,
but I can't tell for sure. I'm only sure that I don't fully understand
the interactions. I stated that in the cover letter ;).

Regards,
Halil
Cornelia Huck Sept. 20, 2018, 10:15 a.m. UTC | #5
On Wed, 19 Sep 2018 18:56:45 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On 09/19/2018 04:07 PM, Cornelia Huck wrote:

> > Do you spot any other places where we may need to care about concurrent
> > processing (like for the ->config area in the previous patch)?
> >   
> 
> It is hard to tell, because:
> * Synchronization external to the transport could make things work
>   out just fine.
> * virtio_config_ops does not document these requirements, if any.
> * So it's up to the devices to use the stuff without shooting
>   themselves in the foot.
> * virtio-pci does not seem to do more to avoid such problems than
>   we do.
> 
> Back then, when learning virtio-ccw, I did ask myself such questions,
> and based on virtio-pci I concluded it looks similar and should be
> good enough.

Yep, I agree. If there's nothing obvious, I think we should just leave
it as it is now.

Patch

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index a5e8530a3391..36252f344660 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -289,6 +289,8 @@ static int doing_io(struct virtio_ccw_device *vcdev, __u32 flag)
 	return ret;
 }
 
+DEFINE_MUTEX(vcio_mtx);
+
 static int ccw_io_helper(struct virtio_ccw_device *vcdev,
 			 struct ccw1 *ccw, __u32 intparm)
 {
@@ -296,6 +298,7 @@ static int ccw_io_helper(struct virtio_ccw_device *vcdev,
 	unsigned long flags;
 	int flag = intparm & VIRTIO_CCW_INTPARM_MASK;
 
+	mutex_lock(&vcio_mtx);
 	do {
 		spin_lock_irqsave(get_ccwdev_lock(vcdev->cdev), flags);
 		ret = ccw_device_start(vcdev->cdev, ccw, intparm, 0, 0);
@@ -308,7 +311,9 @@ static int ccw_io_helper(struct virtio_ccw_device *vcdev,
 		cpu_relax();
 	} while (ret == -EBUSY);
 	wait_event(vcdev->wait_q, doing_io(vcdev, flag) == 0);
-	return ret ? ret : vcdev->err;
+	ret = ret ? ret : vcdev->err;
+	mutex_unlock(&vcio_mtx);
+	return ret;
 }
 
 static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,