diff mbox

new barrier type for paravirt (was Re: [PATCH] virtio_ring: use smp_store_mb)

Message ID 20151220105146-mutt-send-email-mst@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Michael S. Tsirkin Dec. 20, 2015, 9:25 a.m. UTC
On Thu, Dec 17, 2015 at 03:39:10PM +0100, Peter Zijlstra wrote:
> On Thu, Dec 17, 2015 at 04:33:44PM +0200, Michael S. Tsirkin wrote:
> > On Thu, Dec 17, 2015 at 02:57:26PM +0100, Peter Zijlstra wrote:
> > > 
> > > You could of course go fix that instead of mutilating things into
> > > sort-of functional state.
> > 
> > Yes, we'd just need to touch all architectures, all for
> > the sake of UP which almost no one uses.
> > Basically, we need APIs that explicitly are
> > for talking to another kernel on a different CPU on
> > the same SMP system, and implemented identically
> > between CONFIG_SMP and !CONFIG_SMP on all architectures.
> > 
> > Do you think this is something of general usefulness,
> > outside virtio?
> 
> I'm not aware of any other case, but if there are more parts of virt
> that need this then I see no problem adding it.

When I wrote this, I assumed there are no other users, and I'm still not
sure there are other users at the moment. Do you see a problem then?

> That is, virt in general is the only use-case that I can think of,
> because this really is an artifact of interfacing with an SMP host while
> running an UP kernel.

Or another guest kernel on an SMP host.

> But I'm really not familiar with virt, so I do not know if there's more
> sites outside of virtio that could use this.
> Touching all archs is a tad tedious, but its fairly straight forward.

So I looked and I was only able to find other another possible user in Xen.

Cc Xen folks.

I noticed that drivers/xen/xenbus/xenbus_comms.c uses
full memory barriers to communicate with the other side.
For example:

                /* Must write data /after/ reading the consumer index.  * */
                mb();

                memcpy(dst, data, avail);
                data += avail;
                len -= avail;
        
                /* Other side must not see new producer until data is * there. */
                wmb();
                intf->req_prod += avail;
                
                /* Implies mb(): other side will see the updated producer. */
                notify_remote_via_evtchn(xen_store_evtchn);

To me, it looks like for guests compiled with CONFIG_SMP, smp_wmb and smp_mb
would be sufficient, so mb() and wmb() here are only needed if
a non-SMP guest runs on an SMP host.

Is my analysis correct?

So what I'm suggesting is something like the below patch,
except instead of using virtio directly, a new set of barriers
that behaves identically for SMP and non-SMP guests will be introduced.

And of course the weak barriers flag is not needed for Xen -
that's a virtio only thing.

For example:

smp_pv_wmb()
smp_pv_rmb()
smp_pv_mb()

I'd like to get confirmation from Xen folks before posting
this patchset.

Comments/suggestions?

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

--

compile-tested only.

Comments

Andrew Cooper Dec. 20, 2015, 5:07 p.m. UTC | #1
On 20/12/15 09:25, Michael S. Tsirkin wrote:
> On Thu, Dec 17, 2015 at 03:39:10PM +0100, Peter Zijlstra wrote:
>> On Thu, Dec 17, 2015 at 04:33:44PM +0200, Michael S. Tsirkin wrote:
>>> On Thu, Dec 17, 2015 at 02:57:26PM +0100, Peter Zijlstra wrote:
>>>> You could of course go fix that instead of mutilating things into
>>>> sort-of functional state.
>>> Yes, we'd just need to touch all architectures, all for
>>> the sake of UP which almost no one uses.
>>> Basically, we need APIs that explicitly are
>>> for talking to another kernel on a different CPU on
>>> the same SMP system, and implemented identically
>>> between CONFIG_SMP and !CONFIG_SMP on all architectures.
>>>
>>> Do you think this is something of general usefulness,
>>> outside virtio?
>> I'm not aware of any other case, but if there are more parts of virt
>> that need this then I see no problem adding it.
> When I wrote this, I assumed there are no other users, and I'm still not
> sure there are other users at the moment. Do you see a problem then?
>
>> That is, virt in general is the only use-case that I can think of,
>> because this really is an artifact of interfacing with an SMP host while
>> running an UP kernel.
> Or another guest kernel on an SMP host.
>
>> But I'm really not familiar with virt, so I do not know if there's more
>> sites outside of virtio that could use this.
>> Touching all archs is a tad tedious, but its fairly straight forward.
> So I looked and I was only able to find other another possible user in Xen.
>
> Cc Xen folks.
>
> I noticed that drivers/xen/xenbus/xenbus_comms.c uses
> full memory barriers to communicate with the other side.
> For example:
>
>                 /* Must write data /after/ reading the consumer index.  * */
>                 mb();
>
>                 memcpy(dst, data, avail);
>                 data += avail;
>                 len -= avail;
>         
>                 /* Other side must not see new producer until data is * there. */
>                 wmb();
>                 intf->req_prod += avail;
>                 
>                 /* Implies mb(): other side will see the updated producer. */
>                 notify_remote_via_evtchn(xen_store_evtchn);
>
> To me, it looks like for guests compiled with CONFIG_SMP, smp_wmb and smp_mb
> would be sufficient, so mb() and wmb() here are only needed if
> a non-SMP guest runs on an SMP host.
>
> Is my analysis correct?

Correct.  The reason full barriers are used is so non-SMP guests still
function correctly.  It is certainly inefficient.

>
> So what I'm suggesting is something like the below patch,
> except instead of using virtio directly, a new set of barriers
> that behaves identically for SMP and non-SMP guests will be introduced.
>
> And of course the weak barriers flag is not needed for Xen -
> that's a virtio only thing.
>
> For example:
>
> smp_pv_wmb()
> smp_pv_rmb()
> smp_pv_mb()
>
> I'd like to get confirmation from Xen folks before posting
> this patchset.
>
> Comments/suggestions?

Very much +1 for fixing this.

Those names would be fine, but they do add yet another set of options in
an already-complicated area.

An alternative might be to have the regular smp_{w,r,}mb() not revert
back to nops if CONFIG_PARAVIRT, or perhaps if pvops have detected a
non-native environment.  (I don't know how feasible this suggestion is,
however.)

~Andrew
Peter Zijlstra Dec. 20, 2015, 7:59 p.m. UTC | #2
On Sun, Dec 20, 2015 at 05:07:19PM +0000, Andrew Cooper wrote:
> 
> Very much +1 for fixing this.
> 
> Those names would be fine, but they do add yet another set of options in
> an already-complicated area.
> 
> An alternative might be to have the regular smp_{w,r,}mb() not revert
> back to nops if CONFIG_PARAVIRT, or perhaps if pvops have detected a
> non-native environment.  (I don't know how feasible this suggestion is,
> however.)

So a regular SMP kernel emits the LOCK prefix and will patch it out with
a DS prefix (iirc) when it finds but a single CPU. So for those you
could easily do this.

However an UP kernel will not emit the LOCK and do no patching.

So if you're willing to make CONFIG_PARAVIRT depend on CONFIG_SMP or
similar, this is doable.

I don't see people going to allow emitting the LOCK prefix (and growing
the kernel text size) for UP kernels.
Michael S. Tsirkin Dec. 21, 2015, 7:10 a.m. UTC | #3
On Sun, Dec 20, 2015 at 08:59:44PM +0100, Peter Zijlstra wrote:
> On Sun, Dec 20, 2015 at 05:07:19PM +0000, Andrew Cooper wrote:
> > 
> > Very much +1 for fixing this.
> > 
> > Those names would be fine, but they do add yet another set of options in
> > an already-complicated area.
> > 
> > An alternative might be to have the regular smp_{w,r,}mb() not revert
> > back to nops if CONFIG_PARAVIRT, or perhaps if pvops have detected a
> > non-native environment.  (I don't know how feasible this suggestion is,
> > however.)
> 
> So a regular SMP kernel emits the LOCK prefix and will patch it out with
> a DS prefix (iirc) when it finds but a single CPU. So for those you
> could easily do this.
> 
> However an UP kernel will not emit the LOCK and do no patching.
> 
> So if you're willing to make CONFIG_PARAVIRT depend on CONFIG_SMP or
> similar, this is doable.

One of the uses for virtio is to allow testing an existing kernel on
kvm just by loading a module, and this will break this usecase.

> I don't see people going to allow emitting the LOCK prefix (and growing
> the kernel text size) for UP kernels.

Thinking about this more, maybe __smp_*mb is a better set of names.

The nice thing about it is that we can then have generic code
that does basically

#ifdef CONFIG_SMP
#define smp_mb() __smp_mb()
#else
#define smp_mb() barrier()
#endif 

and reuse this on all architectures.

So instead of a maintainance burden, we are actually
removing code duplication.
David Vrabel Dec. 21, 2015, 10:47 a.m. UTC | #4
On 20/12/15 09:25, Michael S. Tsirkin wrote:
> 
> I noticed that drivers/xen/xenbus/xenbus_comms.c uses
> full memory barriers to communicate with the other side.
> For example:
> 
>                 /* Must write data /after/ reading the consumer index.  * */
>                 mb();
> 
>                 memcpy(dst, data, avail);
>                 data += avail;
>                 len -= avail;
>         
>                 /* Other side must not see new producer until data is * there. */
>                 wmb();
>                 intf->req_prod += avail;
>                 
>                 /* Implies mb(): other side will see the updated producer. */
>                 notify_remote_via_evtchn(xen_store_evtchn);
> 
> To me, it looks like for guests compiled with CONFIG_SMP, smp_wmb and smp_mb
> would be sufficient, so mb() and wmb() here are only needed if
> a non-SMP guest runs on an SMP host.
> 
> Is my analysis correct?

For x86, yes.

For arm/arm64 I think so, but would prefer one of the Xen arm
maintainers to confirm.  In particular, whether inner-shareable barriers
are sufficient for memory shared with the hypervisor.

> So what I'm suggesting is something like the below patch,
> except instead of using virtio directly, a new set of barriers
> that behaves identically for SMP and non-SMP guests will be introduced.
> 
> And of course the weak barriers flag is not needed for Xen -
> that's a virtio only thing.
> 
> For example:
> 
> smp_pv_wmb()
> smp_pv_rmb()
> smp_pv_mb()

The smp_ prefix doesn't make a lot of sense to me here since these
barriers are going to be the same whether the kernel is SMP or not.

David
Michael S. Tsirkin Dec. 21, 2015, 11:52 a.m. UTC | #5
On Mon, Dec 21, 2015 at 10:47:49AM +0000, David Vrabel wrote:
> On 20/12/15 09:25, Michael S. Tsirkin wrote:
> > 
> > I noticed that drivers/xen/xenbus/xenbus_comms.c uses
> > full memory barriers to communicate with the other side.
> > For example:
> > 
> >                 /* Must write data /after/ reading the consumer index.  * */
> >                 mb();
> > 
> >                 memcpy(dst, data, avail);
> >                 data += avail;
> >                 len -= avail;
> >         
> >                 /* Other side must not see new producer until data is * there. */
> >                 wmb();
> >                 intf->req_prod += avail;
> >                 
> >                 /* Implies mb(): other side will see the updated producer. */
> >                 notify_remote_via_evtchn(xen_store_evtchn);
> > 
> > To me, it looks like for guests compiled with CONFIG_SMP, smp_wmb and smp_mb
> > would be sufficient, so mb() and wmb() here are only needed if
> > a non-SMP guest runs on an SMP host.
> > 
> > Is my analysis correct?
> 
> For x86, yes.
> 
> For arm/arm64 I think so, but would prefer one of the Xen arm
> maintainers to confirm.  In particular, whether inner-shareable barriers
> are sufficient for memory shared with the hypervisor.
> 
> > So what I'm suggesting is something like the below patch,
> > except instead of using virtio directly, a new set of barriers
> > that behaves identically for SMP and non-SMP guests will be introduced.
> > 
> > And of course the weak barriers flag is not needed for Xen -
> > that's a virtio only thing.
> > 
> > For example:
> > 
> > smp_pv_wmb()
> > smp_pv_rmb()
> > smp_pv_mb()
> 
> The smp_ prefix doesn't make a lot of sense to me here since these
> barriers are going to be the same whether the kernel is SMP or not.
> 
> David

Guest kernel - yes. But it's only needed because you
are running on an SMP host.
Stefano Stabellini Dec. 21, 2015, 2:50 p.m. UTC | #6
On Mon, 21 Dec 2015, David Vrabel wrote:
> On 20/12/15 09:25, Michael S. Tsirkin wrote:
> > 
> > I noticed that drivers/xen/xenbus/xenbus_comms.c uses
> > full memory barriers to communicate with the other side.
> > For example:
> > 
> >                 /* Must write data /after/ reading the consumer index.  * */
> >                 mb();
> > 
> >                 memcpy(dst, data, avail);
> >                 data += avail;
> >                 len -= avail;
> >         
> >                 /* Other side must not see new producer until data is * there. */
> >                 wmb();
> >                 intf->req_prod += avail;
> >                 
> >                 /* Implies mb(): other side will see the updated producer. */
> >                 notify_remote_via_evtchn(xen_store_evtchn);
> > 
> > To me, it looks like for guests compiled with CONFIG_SMP, smp_wmb and smp_mb
> > would be sufficient, so mb() and wmb() here are only needed if
> > a non-SMP guest runs on an SMP host.
> > 
> > Is my analysis correct?
> 
> For x86, yes.
> 
> For arm/arm64 I think so, but would prefer one of the Xen arm
> maintainers to confirm.  In particular, whether inner-shareable barriers
> are sufficient for memory shared with the hypervisor.

inner-shareable barriers are certainly OK. In this case there would be
also a switch from dsb to dmb barriers, which I also think should be OK.

What about all the mb() and wmb() in RING_PUSH_REQUESTS and
RING_PUSH_RESPONSES in include/xen/interface/io/ring.h ?


> > So what I'm suggesting is something like the below patch,
> > except instead of using virtio directly, a new set of barriers
> > that behaves identically for SMP and non-SMP guests will be introduced.
> > 
> > And of course the weak barriers flag is not needed for Xen -
> > that's a virtio only thing.
> > 
> > For example:
> > 
> > smp_pv_wmb()
> > smp_pv_rmb()
> > smp_pv_mb()
> 
> The smp_ prefix doesn't make a lot of sense to me here since these
> barriers are going to be the same whether the kernel is SMP or not.
diff mbox

Patch

diff --git a/drivers/xen/xenbus/xenbus_comms.c b/drivers/xen/xenbus/xenbus_comms.c
index fdb0f33..a28f049 100644
--- a/drivers/xen/xenbus/xenbus_comms.c
+++ b/drivers/xen/xenbus/xenbus_comms.c
@@ -36,6 +36,7 @@ 
 #include <linux/interrupt.h>
 #include <linux/sched.h>
 #include <linux/err.h>
+#include <linux/virtio_ring.h>
 #include <xen/xenbus.h>
 #include <asm/xen/hypervisor.h>
 #include <xen/events.h>
@@ -123,14 +124,14 @@  int xb_write(const void *data, unsigned len)
 			avail = len;
 
 		/* Must write data /after/ reading the consumer index. */
-		mb();
+		virtio_mb(true);
 
 		memcpy(dst, data, avail);
 		data += avail;
 		len -= avail;
 
 		/* Other side must not see new producer until data is there. */
-		wmb();
+		virtio_wmb(true);
 		intf->req_prod += avail;
 
 		/* Implies mb(): other side will see the updated producer. */
@@ -180,14 +181,14 @@  int xb_read(void *data, unsigned len)
 			avail = len;
 
 		/* Must read data /after/ reading the producer index. */
-		rmb();
+		virtio_rmb(true);
 
 		memcpy(data, src, avail);
 		data += avail;
 		len -= avail;
 
 		/* Other side must not see free space until we've copied out */
-		mb();
+		virtio_mb(true);
 		intf->rsp_cons += avail;
 
 		pr_debug("Finished read of %i bytes (%i to go)\n", avail, len);