diff mbox

[v3,05/11] qemu-thread: add simple test-and-set spinlock

Message ID 1461107270-19234-6-git-send-email-cota@braap.org (mailing list archive)
State New, archived
Headers show

Commit Message

Emilio Cota April 19, 2016, 11:07 p.m. UTC
From: Guillaume Delbergue <guillaume.delbergue@greensocs.com>

Signed-off-by: Guillaume Delbergue <guillaume.delbergue@greensocs.com>
[Rewritten. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Emilio's additions: call cpu_relax() while spinning; optimize for
 uncontended locks by acquiring the lock with xchg+test instead of
 test+xchg+test.]
Signed-off-by: Emilio G. Cota <cota@braap.org>
---
 include/qemu/thread.h | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

Comments

Richard Henderson April 20, 2016, 3:18 p.m. UTC | #1
On 04/19/2016 04:07 PM, Emilio G. Cota wrote:
> From: Guillaume Delbergue <guillaume.delbergue@greensocs.com>
>
> Signed-off-by: Guillaume Delbergue <guillaume.delbergue@greensocs.com>
> [Rewritten. - Paolo]
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> [Emilio's additions: call cpu_relax() while spinning; optimize for
>   uncontended locks by acquiring the lock with xchg+test instead of
>   test+xchg+test.]
> Signed-off-by: Emilio G. Cota <cota@braap.org>
> ---

It probably doesn't matter for any real hosts, but do note that there are 
compiler primitives for test-and-set that (can be) simpler for a cpu to 
implement than xchg.  This likely affects only ancient hosts like sparcv7,
or tiny hosts like SH.

We don't have to change anything here, but it does seem more natural to use a 
test-and-set primitive.

> +static inline int qemu_spin_trylock(QemuSpin *spin)
> +{
> +    if (atomic_read(&spin->value) || atomic_xchg(&spin->value, true)) {
> +        return -EBUSY;

I think there's no point in the extra read here.


r~
Emilio Cota April 20, 2016, 5:17 p.m. UTC | #2
On Wed, Apr 20, 2016 at 08:18:56 -0700, Richard Henderson wrote:
> On 04/19/2016 04:07 PM, Emilio G. Cota wrote:
> >From: Guillaume Delbergue <guillaume.delbergue@greensocs.com>
> >
> >Signed-off-by: Guillaume Delbergue <guillaume.delbergue@greensocs.com>
> >[Rewritten. - Paolo]
> >Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> >[Emilio's additions: call cpu_relax() while spinning; optimize for
> >  uncontended locks by acquiring the lock with xchg+test instead of
> >  test+xchg+test.]
> >Signed-off-by: Emilio G. Cota <cota@braap.org>
> >---
> 
> It probably doesn't matter for any real hosts, but do note that there are
> compiler primitives for test-and-set that (can be) simpler for a cpu to
> implement than xchg.  This likely affects only ancient hosts like sparcv7,
> or tiny hosts like SH.
> 
> We don't have to change anything here, but it does seem more natural to use
> a test-and-set primitive.

I've tried to find a GCC intrinsic for test-and-set, and I've only found
lock_test_and_set, which is what we use for atomic_xchg (except on ppc)
because it really is an atomic exchange:
 "This builtin, as described by Intel, is not a traditional test-and-set
  operation, but rather an atomic exchange operation."
 https://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html

> >+static inline int qemu_spin_trylock(QemuSpin *spin)
> >+{
> >+    if (atomic_read(&spin->value) || atomic_xchg(&spin->value, true)) {
> >+        return -EBUSY;
> 
> I think there's no point in the extra read here.

Yep that's a remnant of the original "TATAS" implementation. Will
send a revised patch in a few minutes.

Thanks,

		Emilio
Richard Henderson April 20, 2016, 5:55 p.m. UTC | #3
On 04/20/2016 10:17 AM, Emilio G. Cota wrote:
> I've tried to find a GCC intrinsic for test-and-set, and I've only found
> lock_test_and_set, which is what we use for atomic_xchg (except on ppc)
> because it really is an atomic exchange:
>   "This builtin, as described by Intel, is not a traditional test-and-set
>    operation, but rather an atomic exchange operation."
>   https://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html

Please read the entire documentation, not just the first sentence.


r~
Emilio Cota April 20, 2016, 6:11 p.m. UTC | #4
On Wed, Apr 20, 2016 at 10:55:45 -0700, Richard Henderson wrote:
> On 04/20/2016 10:17 AM, Emilio G. Cota wrote:
> >I've tried to find a GCC intrinsic for test-and-set, and I've only found
> >lock_test_and_set, which is what we use for atomic_xchg (except on ppc)
> >because it really is an atomic exchange:
> >  "This builtin, as described by Intel, is not a traditional test-and-set
> >   operation, but rather an atomic exchange operation."
> >  https://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html
> 
> Please read the entire documentation, not just the first sentence.

I did read it and I'm aware of the limitations of using xchg.

My comment was related to this:

> [...] do note that there are compiler primitives for test-and-set that
> (can be) simpler for a cpu to implement than xchg.

What compiler (I assume gcc) primitives are these? I couldn't find them.

Thanks,

		Emilio
Richard Henderson April 20, 2016, 7:39 p.m. UTC | #5
On 04/20/2016 11:11 AM, Emilio G. Cota wrote:
> On Wed, Apr 20, 2016 at 10:55:45 -0700, Richard Henderson wrote:
>> On 04/20/2016 10:17 AM, Emilio G. Cota wrote:
>>> I've tried to find a GCC intrinsic for test-and-set, and I've only found
>>> lock_test_and_set, which is what we use for atomic_xchg (except on ppc)
>>> because it really is an atomic exchange:
>>>   "This builtin, as described by Intel, is not a traditional test-and-set
>>>    operation, but rather an atomic exchange operation."
>>>   https://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html
>>
>> Please read the entire documentation, not just the first sentence.
>
> I did read it and I'm aware of the limitations of using xchg.
>
> My comment was related to this:
>
>> [...] do note that there are compiler primitives for test-and-set that
>> (can be) simpler for a cpu to implement than xchg.
>
> What compiler (I assume gcc) primitives are these? I couldn't find them.

__sync_lock_test_and_set and __atomic_test_and_set.

Both expand to ldstub on sparcv7, tas on coldfire, tas.b on sh.
None of these are xchg operations.

I had forgotten that there wasn't a __sync_exchange builtin, so 
__sync_lock_test_and_set plays double-duty as both xchg and test-and-set.

But when __atomic builtins are available, __atomic_exchange does not fall back; 
you must use __atomic_test_and_set for less capable hosts.

But of course none of this is really relevant for normal hosts.


r~
diff mbox

Patch

diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index bdae6df..e2af57c 100644
--- a/include/qemu/thread.h
+++ b/include/qemu/thread.h
@@ -1,6 +1,9 @@ 
 #ifndef __QEMU_THREAD_H
 #define __QEMU_THREAD_H 1
 
+#include <errno.h>
+#include "qemu/processor.h"
+#include "qemu/atomic.h"
 
 typedef struct QemuMutex QemuMutex;
 typedef struct QemuCond QemuCond;
@@ -60,4 +63,35 @@  struct Notifier;
 void qemu_thread_atexit_add(struct Notifier *notifier);
 void qemu_thread_atexit_remove(struct Notifier *notifier);
 
+typedef struct QemuSpin {
+    int value;
+} QemuSpin;
+
+static inline void qemu_spin_init(QemuSpin *spin)
+{
+    spin->value = 0;
+}
+
+static inline void qemu_spin_lock(QemuSpin *spin)
+{
+    while (atomic_xchg(&spin->value, true)) {
+        while (atomic_read(&spin->value)) {
+            cpu_relax();
+        }
+    }
+}
+
+static inline int qemu_spin_trylock(QemuSpin *spin)
+{
+    if (atomic_read(&spin->value) || atomic_xchg(&spin->value, true)) {
+        return -EBUSY;
+    }
+    return 0;
+}
+
+static inline void qemu_spin_unlock(QemuSpin *spin)
+{
+    atomic_mb_set(&spin->value, 0);
+}
+
 #endif