diff mbox

[02/10] compiler.h: add QEMU_CACHELINE + QEMU_ALIGN() + QEMU_CACHELINE_ALIGNED

Message ID 1459834253-8291-3-git-send-email-cota@braap.org (mailing list archive)
State New, archived
Headers show

Commit Message

Emilio Cota April 5, 2016, 5:30 a.m. UTC
I'm assuming windows compilers don't support this attribute.

Signed-off-by: Emilio G. Cota <cota@braap.org>
---
 include/qemu/compiler.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Peter Maydell April 5, 2016, 7:57 a.m. UTC | #1
On 5 April 2016 at 06:30, Emilio G. Cota <cota@braap.org> wrote:
> I'm assuming windows compilers don't support this attribute.

Our windows compiler is gcc, so I would expect it to support
attribute aligned, and a quick grep through the sources shows
we already use it in several places.

> +#define QEMU_CACHELINE (64)

Why 64? Does anything bad happen if the host's cache line
size turns out to be greater than the value here ?

thanks
-- PMM
Paolo Bonzini April 5, 2016, 8:49 a.m. UTC | #2
On 05/04/2016 07:30, Emilio G. Cota wrote:
> I'm assuming windows compilers don't support this attribute.

They actually do. :)

Paolo

> Signed-off-by: Emilio G. Cota <cota@braap.org>
> ---
>  include/qemu/compiler.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
> index 8f1cc7b..fb946f1 100644
> --- a/include/qemu/compiler.h
> +++ b/include/qemu/compiler.h
> @@ -41,6 +41,16 @@
>  # define QEMU_PACKED __attribute__((packed))
>  #endif
>  
> +#define QEMU_CACHELINE (64)
> +
> +#if defined(_WIN32)
> +#define QEMU_ALIGN(B)
> +#else
> +#define QEMU_ALIGN(B) __attribute__((aligned(B)))
> +#endif
> +
> +#define QEMU_CACHELINE_ALIGNED QEMU_ALIGN(QEMU_CACHELINE)
> +
>  #ifndef glue
>  #define xglue(x, y) x ## y
>  #define glue(x, y) xglue(x, y)
>
Lluís Vilanova April 5, 2016, 12:57 p.m. UTC | #3
Emilio G Cota writes:

> I'm assuming windows compilers don't support this attribute.
> Signed-off-by: Emilio G. Cota <cota@braap.org>
> ---
>  include/qemu/compiler.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)

> diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
> index 8f1cc7b..fb946f1 100644
> --- a/include/qemu/compiler.h
> +++ b/include/qemu/compiler.h
> @@ -41,6 +41,16 @@
>  # define QEMU_PACKED __attribute__((packed))
>  #endif
 
> +#define QEMU_CACHELINE (64)
[...]

You should make this a value taken from configure. In linux you could use
something like:

  getconf LEVEL1_DCACHE_LINESIZE

Or an equivalent program using sysconf(3) if getconf is not present.


Cheers,
  Lluis
Peter Maydell April 5, 2016, 12:58 p.m. UTC | #4
On 5 April 2016 at 13:57, Lluís Vilanova <vilanova@ac.upc.edu> wrote:
> Emilio G Cota writes:
>
>> I'm assuming windows compilers don't support this attribute.
>> Signed-off-by: Emilio G. Cota <cota@braap.org>
>> ---
>>  include/qemu/compiler.h | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>
>> diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
>> index 8f1cc7b..fb946f1 100644
>> --- a/include/qemu/compiler.h
>> +++ b/include/qemu/compiler.h
>> @@ -41,6 +41,16 @@
>>  # define QEMU_PACKED __attribute__((packed))
>>  #endif
>
>> +#define QEMU_CACHELINE (64)
> [...]
>
> You should make this a value taken from configure. In linux you could use
> something like:
>
>   getconf LEVEL1_DCACHE_LINESIZE
>
> Or an equivalent program using sysconf(3) if getconf is not present.

The cache line size at compile time is not necessarily the cache
line size at runtime...

thanks
-- PMM
Paolo Bonzini April 5, 2016, 3:29 p.m. UTC | #5
On 05/04/2016 14:58, Peter Maydell wrote:
>> >
>> > You should make this a value taken from configure. In linux you could use
>> > something like:
>> >
>> >   getconf LEVEL1_DCACHE_LINESIZE
>> >
>> > Or an equivalent program using sysconf(3) if getconf is not present.
> The cache line size at compile time is not necessarily the cache
> line size at runtime...

But the size of data structures cannot change at run-time.  A bigger
worry is cross-compilation.

Linux has these defaults:

	Alpha	32 or 64
	ARM	configurable, 64 for ARMv7
	AArch64	128
	PPC     128
	s390	256
	x86	configurable, default 64

which should be easy to copy in QEMU too.

Paolo
Lluís Vilanova April 5, 2016, 4:23 p.m. UTC | #6
Peter Maydell writes:

> On 5 April 2016 at 13:57, Lluís Vilanova <vilanova@ac.upc.edu> wrote:
>> Emilio G Cota writes:
>> 
>>> I'm assuming windows compilers don't support this attribute.
>>> Signed-off-by: Emilio G. Cota <cota@braap.org>
>>> ---
>>> include/qemu/compiler.h | 10 ++++++++++
>>> 1 file changed, 10 insertions(+)
>> 
>>> diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
>>> index 8f1cc7b..fb946f1 100644
>>> --- a/include/qemu/compiler.h
>>> +++ b/include/qemu/compiler.h
>>> @@ -41,6 +41,16 @@
>>> # define QEMU_PACKED __attribute__((packed))
>>> #endif
>> 
>>> +#define QEMU_CACHELINE (64)
>> [...]
>> 
>> You should make this a value taken from configure. In linux you could use
>> something like:
>> 
>> getconf LEVEL1_DCACHE_LINESIZE
>> 
>> Or an equivalent program using sysconf(3) if getconf is not present.

> The cache line size at compile time is not necessarily the cache
> line size at runtime...

True, but exposing it as a configure flag allows tunning for the target arch
during cross-compilation. It would be bettwe to use the compiler to tell you
about the target cache line size, but I'm not aware of a way to do it.

Got it!

  gcc -march=native --help=params -v 2>&1 | grep "param l1-cache-line-size" | sed -e 's/.* --param l1-cache-line-size=\([0-9]\+\) .*/\1/'


Cheers,
  Lluis
Richard Henderson April 5, 2016, 4:31 p.m. UTC | #7
On 04/05/2016 09:23 AM, Lluís Vilanova wrote:
> Got it!
>
>    gcc -march=native --help=params -v 2>&1 | grep "param l1-cache-line-size" | sed -e 's/.* --param l1-cache-line-size=\([0-9]\+\) .*/\1/'

That will only work on x86, where we can be pretty sure it's 64 anyway.


r~
Peter Maydell April 5, 2016, 4:56 p.m. UTC | #8
On 5 April 2016 at 17:31, Richard Henderson <rth@twiddle.net> wrote:
> On 04/05/2016 09:23 AM, Lluís Vilanova wrote:
>>
>> Got it!
>>
>>    gcc -march=native --help=params -v 2>&1 | grep "param
>> l1-cache-line-size" | sed -e 's/.* --param l1-cache-line-size=\([0-9]\+\)
>> .*/\1/'
>
>
> That will only work on x86, where we can be pretty sure it's 64 anyway.

Doesn't work on OSX x86 either, since the gcc-to-clang wrapper
doesn't support --help=params :-)

thanks
-- PMM
Peter Maydell April 5, 2016, 6:01 p.m. UTC | #9
On 5 April 2016 at 18:24, Emilio G. Cota <cota@braap.org> wrote:
> On Tue, Apr 05, 2016 at 08:57:45 +0100, Peter Maydell wrote:
>> On 5 April 2016 at 06:30, Emilio G. Cota <cota@braap.org> wrote:
>> > +#define QEMU_CACHELINE (64)
>>
>> Why 64? Does anything bad happen if the host's cache line
>> size turns out to be greater than the value here ?
>
> Defining a number lower than the host's cache line could result
> in some false sharing. No big deal for most workloads.
>
> Defining a number greater than the host's cache might result in
> wasted memory, which we really should avoid.
>
> I used 64 because it is common on x86, which I assume is what most
> developers develop on.
>
> On Tue, Apr 05, 2016 at 17:29:05 +0200, Paolo Bonzini wrote:
>> But the size of data structures cannot change at run-time.  A bigger
>> worry is cross-compilation.
>>
>> Linux has these defaults:
>>
>>       Alpha   32 or 64
>>       ARM     configurable, 64 for ARMv7
>>       AArch64 128
>>       PPC     128
>>       s390    256
>>       x86     configurable, default 64
>>
>> which should be easy to copy in QEMU too.
>
> So how about this:
> we add these defaults, and also add an optional --configure
> parameter to override said defaults.

I think this definitely doesn't merit a configure parameter.

> Otherwise I'd just stick to 64.

If this is basically just a hashtable semi-tunable parameter knob,
then I don't mind if we use 64 everywhere (or some slightly more
architecture-aware default). But I would prefer that we not make
it a global define QEMU_CACHELINE if we can't actually guarantee
that it's the size of the cacheline (which we can't), because I
think that will be confusing and invite future misuse.

Unless we have another use case in the tree at the moment for
a number which is "probably the cacheline size, but might be
smaller or larger if you're unlucky", in which case we just want
a better name :-)

thanks
-- PMM
Lluís Vilanova April 5, 2016, 7:02 p.m. UTC | #10
Peter Maydell writes:

> On 5 April 2016 at 17:31, Richard Henderson <rth@twiddle.net> wrote:
>> On 04/05/2016 09:23 AM, Lluís Vilanova wrote:
>>> 
>>> Got it!
>>> 
>>> gcc -march=native --help=params -v 2>&1 | grep "param
>>> l1-cache-line-size" | sed -e 's/.* --param l1-cache-line-size=\([0-9]\+\)
>>> .*/\1/'
>> 
>> 
>> That will only work on x86, where we can be pretty sure it's 64 anyway.

Oh, you mean l1-cache-line-size is an x86-only parameter in GCC?


> Doesn't work on OSX x86 either, since the gcc-to-clang wrapper
> doesn't support --help=params :-)

Well, my suggestion of adding a configure flag still applies :)


Cheers,
  Lluis
Emilio Cota April 5, 2016, 7:13 p.m. UTC | #11
On Tue, Apr 05, 2016 at 19:01:07 +0100, Peter Maydell wrote:
> On 5 April 2016 at 18:24, Emilio G. Cota <cota@braap.org> wrote:
> > So how about this:
> > we add these defaults, and also add an optional --configure
> > parameter to override said defaults.
> 
> I think this definitely doesn't merit a configure parameter.

Agreed :)

> > Otherwise I'd just stick to 64.
> 
> If this is basically just a hashtable semi-tunable parameter knob,
> then I don't mind if we use 64 everywhere (or some slightly more
> architecture-aware default). But I would prefer that we not make
> it a global define QEMU_CACHELINE if we can't actually guarantee
> that it's the size of the cacheline (which we can't), because I
> think that will be confusing and invite future misuse.
> 
> Unless we have another use case in the tree at the moment for
> a number which is "probably the cacheline size, but might be
> smaller or larger if you're unlucky", in which case we just want
> a better name :-)

Ok, so for now I'll only leave the ALIGN() macro, and then use
ALIGN(64) in the hash table. We can always reconsider adding a
more proper definition if this gets widespread use.

Thanks,

		Emilio
Richard Henderson April 5, 2016, 7:15 p.m. UTC | #12
On 04/05/2016 12:02 PM, Lluís Vilanova wrote:
> Peter Maydell writes:
> 
>> On 5 April 2016 at 17:31, Richard Henderson <rth@twiddle.net> wrote:
>>> On 04/05/2016 09:23 AM, Lluís Vilanova wrote:
>>>>
>>>> Got it!
>>>>
>>>> gcc -march=native --help=params -v 2>&1 | grep "param
>>>> l1-cache-line-size" | sed -e 's/.* --param l1-cache-line-size=\([0-9]\+\)
>>>> .*/\1/'
>>>
>>>
>>> That will only work on x86, where we can be pretty sure it's 64 anyway.
> 
> Oh, you mean l1-cache-line-size is an x86-only parameter in GCC?

No, using -march=native to detect the host line size.


r~
Lluís Vilanova April 5, 2016, 8:09 p.m. UTC | #13
Richard Henderson writes:

> On 04/05/2016 12:02 PM, Lluís Vilanova wrote:
>> Peter Maydell writes:
>> 
>>> On 5 April 2016 at 17:31, Richard Henderson <rth@twiddle.net> wrote:
>>>> On 04/05/2016 09:23 AM, Lluís Vilanova wrote:
>>>>> 
>>>>> Got it!
>>>>> 
>>>>> gcc -march=native --help=params -v 2>&1 | grep "param
>>>>> l1-cache-line-size" | sed -e 's/.* --param l1-cache-line-size=\([0-9]\+\)
>>>>> .*/\1/'
>>>> 
>>>> 
>>>> That will only work on x86, where we can be pretty sure it's 64 anyway.
>> 
>> Oh, you mean l1-cache-line-size is an x86-only parameter in GCC?

> No, using -march=native to detect the host line size.

Ah. That's just an example. For cross-compilation you would use a different
march argument (or none to use the default target sub-arch) and get the
parameter for the target processor. This should already be known by configure as
part of the arguments to select the cross-compiler and target architecture
(e.g., CC).


Lluis
Paolo Bonzini April 6, 2016, 11:44 a.m. UTC | #14
On 05/04/2016 22:09, Lluís Vilanova wrote:
> Ah. That's just an example. For cross-compilation you would use a different
> march argument (or none to use the default target sub-arch) and get the
> parameter for the target processor. This should already be known by configure as
> part of the arguments to select the cross-compiler and target architecture
> (e.g., CC).

I would just use 64, with special cases for PPC and s390... even for
AArch64 128 seems a little wasteful.

Paolo
Laurent Desnogues April 6, 2016, 12:02 p.m. UTC | #15
On Wed, Apr 6, 2016 at 1:44 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 05/04/2016 22:09, Lluís Vilanova wrote:
>> Ah. That's just an example. For cross-compilation you would use a different
>> march argument (or none to use the default target sub-arch) and get the
>> parameter for the target processor. This should already be known by configure as
>> part of the arguments to select the cross-compiler and target architecture
>> (e.g., CC).
>
> I would just use 64, with special cases for PPC and s390... even for
> AArch64 128 seems a little wasteful.

As far as I know, the only AArch64 CPU to have 128-byte cache line is
Cavium ThunderX (it seems to also be the case for Cavium MIPS-based
CPU).


Laurent
diff mbox

Patch

diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
index 8f1cc7b..fb946f1 100644
--- a/include/qemu/compiler.h
+++ b/include/qemu/compiler.h
@@ -41,6 +41,16 @@ 
 # define QEMU_PACKED __attribute__((packed))
 #endif
 
+#define QEMU_CACHELINE (64)
+
+#if defined(_WIN32)
+#define QEMU_ALIGN(B)
+#else
+#define QEMU_ALIGN(B) __attribute__((aligned(B)))
+#endif
+
+#define QEMU_CACHELINE_ALIGNED QEMU_ALIGN(QEMU_CACHELINE)
+
 #ifndef glue
 #define xglue(x, y) x ## y
 #define glue(x, y) xglue(x, y)